The human side of machine learning

What is the data telling us?

Data science extracts knowledge and insights from your data. Machine learning uses past data to predict future events. Artificial intelligence is no longer science fiction. Smart companies are already using these technologies to develop new products and services. Effective decision makers are already relying on these methods to make the most informed choices possible. How can DataJenius bring your project or organization to the next level?


We can find and collect the data relevant to you, or organize existing data within your organization. Whether you are dealing with hundreds of records or trillions, we have the experience to help.


We turn piles of data into actionable insights. Custom visualizations, statistical analysis, and advanced model training are just some of the tools we use to dig deeper into your data.


From basic regression models to advanced deep learning methods, our team builds enterprise-level applications that take full advantage of these insights as quickly as possible.


From top-level executives to hands-on programmers, we are here to help your organization better understand these technologies and join the artificial intelligence revolution.

DataJenius Articles

We want you to better understand data science and machine learning by reading about how our team has applied these technologies to a wide variety of subjects. Our goal is to dig a little deeper and uncover insights that other analysts miss.

A Random Walk Down the S&P 500

When it comes to investing, there is certainly no shortage of opinions on the Internet. Despite the promises of countless "gurus", no one is 100% sure when the next crash is coming. We won't know for sure until we're already halfway through it. You can't control the outcome. All you can do is improve your odds.

The Mathphobics Guide to Linear and Logistic Regression

Starting with some algebra you probably first encountered somewhere around age 12, we're going to follow it all the way through some basic statistics, a dash of calculus, and even some linear algebra. We are going to connect the dots between all these goofy math formulas and the code snippets we find online.

Recommending True Love with Non-Negative Matrix Factorization

In this article we are going to take a closer look at the logic behind recommendation systems. Specifically, we’re going to talk about Non-Negative Matrix Factorization, PCA, NLP, data dimensionality, and the scalability of our solution in production. We’re also going to be talking about sex and dating, so hopefully that will hold your attention.

Solving N-Queens with Genetic Algorithms

Genetic algorithms take a very different approach to artificial intelligence. Rather than deriving answers from data, these algorithms mimic the rules of evolution. The solutions they find do not result from a rigorous mathematical formula, but from the laws of nature and a whole lot of trial and error.
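As a rough illustration of the trial-and-error idea described above, here is a minimal Python sketch of a genetic algorithm for N-Queens. It is not the implementation from the article: the function names, the permutation encoding, the order crossover, and every parameter (population size, mutation rate, seed) are assumptions chosen to keep the example short.

```python
import random


def conflicts(board):
    """Count pairs of queens that attack each other diagonally.

    board[i] is the row of the queen in column i. Every board used here is a
    permutation, so no two queens ever share a row or a column.
    """
    n = len(board)
    return sum(
        1
        for a in range(n)
        for b in range(a + 1, n)
        if abs(board[a] - board[b]) == b - a
    )


def crossover(mom, dad, rng):
    """Order crossover: keep a slice of mom, fill the rest in dad's order."""
    n = len(mom)
    lo, hi = sorted(rng.sample(range(n + 1), 2))
    kept = mom[lo:hi]
    kept_set = set(kept)
    rest = [row for row in dad if row not in kept_set]
    return rest[:lo] + kept + rest[lo:]


def mutate(board, rng, rate):
    """With probability `rate`, swap the rows of two randomly chosen columns."""
    board = board[:]
    if rng.random() < rate:
        i, j = rng.sample(range(len(board)), 2)
        board[i], board[j] = board[j], board[i]
    return board


def evolve(n=8, pop_size=100, generations=500, mutation_rate=0.3, seed=0):
    """Return the best board found; it may still have conflicts."""
    rng = random.Random(seed)
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=conflicts)
        if conflicts(population[0]) == 0:
            break  # a valid placement has been found
        parents = population[: pop_size // 2]  # the fitter half survives
        children = [
            mutate(crossover(*rng.sample(parents, 2), rng), rng, mutation_rate)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return min(population, key=conflicts)


best = evolve()
print(best, "conflicts:", conflicts(best))
```

Nothing in the loop knows how to solve the puzzle directly. Boards with fewer conflicts simply survive and recombine, which is the evolutionary trial and error the teaser describes. A run is not guaranteed to reach zero conflicts within the generation limit.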

Using Decision Trees to Identify White Nationalists

Imagine it is the year 1995. A terrorist who identifies as a "White Nationalist" has just killed 168 people and injured 680 more. The President proposes using machine learning to identify other White Nationalists, so that we can prevent further attacks. Do you agree or disagree with this proposal?

School Shootings in America and the Challenge of Biased Data

No matter where you stand on this issue, the purpose of this article is not to change your mind, but rather, to investigate the subject as objectively as possible. What is the data telling us? More importantly: where is that data coming from? What biases does it contain? What has been included? What has been omitted? Who has made these decisions?

The Cryptocurrency Crash of January 2018

On September 3, 1929, the Dow Jones Industrial Average swelled to a record high of 381.17. By November 13, 1929, it had fallen to 198.69, a loss of approximately 47.87% in 71 days. In other words, the crypto market just fell further, faster, than the Wall Street Crash of 1929, which is often credited with ushering in the Great Depression.
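The 1929 figures quoted above can be checked with a few lines of Python. This is only an arithmetic sketch using the dates and closing values stated in the text. The variable names are illustrative, and no cryptocurrency prices are included because the teaser does not give any.

```python
from datetime import date

# Values taken directly from the paragraph above.
peak_date, peak_close = date(1929, 9, 3), 381.17
trough_date, trough_close = date(1929, 11, 13), 198.69

# Percentage decline from peak to trough, and calendar days elapsed.
decline_pct = (peak_close - trough_close) / peak_close * 100
elapsed_days = (trough_date - peak_date).days

print(f"{decline_pct:.2f}% decline over {elapsed_days} days")
```

The same two-line calculation applied to any peak and trough pair gives a like-for-like comparison of how far and how fast a market fell.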

Categorizing Subreddits with Latent Dirichlet Allocation (LDA)

Latent Dirichlet allocation (LDA) is an unsupervised machine learning algorithm that attempts to automatically divide a corpus of documents into groups of logical topics. The applications of this algorithm are numerous. Properly applied, LDA can identify the various topics discussed in any single document, categorize individual documents into their most dominant topic, or suggest documents that are intuitively related.

What do our clients say?

Our clients include garage-based startups, Fortune 500 companies, and non-profit organizations. Do you have a challenging project, some interesting data, or an idea for a new product or service that utilizes machine learning technologies? If so, we want to hear about it.

Erik Ellis

Network Engineer, Intel

"The team behind DataJenius includes some of the hardest working people in this industry. Mastering these rapidly evolving technologies requires an enormous amount of study and dedication."

Justin Meehan

Program Director, Head Instructor, Orion Fencing

"Working with Josh at DataJenius proved to be invaluable. I found him to be tireless in his efforts, patient and willing to explain sophisticated systems repeatedly [...] exuberantly inspired."

Mel Martin

Former President, Brevard Indian River Lagoon Coalition

"We had someone who had the tech know-how, analytical insight, creative innovation, and dedicated follow-through to make amazing things happen for our organization."

Colin Delia

Founder, Fat Headlines

"The first three things that come to mind when I think about Josh are his intelligence, his passion and his work ethic. [...] There is a non-zero chance that Josh is secretly a machine himself."

Meet the team

Our team is made up of a diverse group of individuals who all share a burning passion for data. Do you have a challenging project, some interesting data, or an idea for a new product or service that utilizes machine learning technologies? If so, we want to hear about it.

Chris Kamm

Data Management

CK has over 20 years of IT experience, with more than half of that time spent working at the world's largest options and futures exchange (CME Group, NYC). He is responsible for the backend operations of DataJenius, managing the massive data sets required by our projects.

Josh Pause

Machine Learning

Josh has over 15 years of experience developing data-driven software. He has more than 50 active technical certifications relating to data science and machine learning, and currently studies statistics at Cal State. He is trying to build a machine that can think.

Data Jenius is headquartered in Alameda, CA. We serve clients in Silicon Valley, the greater Bay Area, and beyond.

All website contents are © 2018 Data Jenius