Data Science & Tech Blog

Machine Learning Model Interpretation

By Nan Shi on Thu, Aug 15, 2019

For a model-driven company, or one catching up with the industry's rapid adoption of AI, machine learning model interpretation has become a key factor in deciding whether to promote models into the business. This is not an easy task; imagine trying to explain a mathematical theory to your parents. Yet business owners should always be curious about these models, and some questions naturally arise:

Tags: interpretability, modeling

Matching for Non-Random Studies

By Shane Pederson on Thu, Jul 25, 2019

Experimental designs such as A/B testing are a cornerstone of statistical practice. By randomly assigning treatments to subjects, we can test the effect of a treatment versus a control (as in a clinical trial for a proposed new drug) or determine which of several web page layouts for a promotional offer receives the largest response. Designed, controlled experiments are a common feature of much of scientific and business research. The internet is a natural platform from which to launch tests on almost any topic, and the principles of randomization are easily understood.

Tags: A/B testing, Causal inference

Distances and Data Science

By Sam Shideler on Thu, Jun 27, 2019

We’re all aware of what ‘distance’ means in real-life scenarios, and how our notion of what ‘distance’ means can change with context. If we’re talking about the distance from the ODG office to one of our favorite lunch spots, we probably mean the distance we walk when traversing Chicago’s grid of city blocks from the office to the restaurant, not the ‘direct line’ distance on the Earth’s surface. Similarly, if we’re talking about the distance from Chicago to St. Louis, the distance we’re talking about probably depends on whether we’re driving (taking the shortest path via available roads) or flying (which is usually much closer to the ‘straight line’ path on the Earth’s surface than driving is). In this post, we explore a more rigorous construction of distance and its applications to data science.
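To make the contrast concrete, here is a minimal Python sketch of the three notions of distance mentioned above: city-block (Manhattan) distance, straight-line (Euclidean) distance, and great-circle distance on a sphere via the haversine formula. The function names, the grid coordinates for the office and lunch spot, and the approximate latitude/longitude values for Chicago and St. Louis are all illustrative assumptions, not taken from the post. The great-circle calculation also assumes a spherical Earth with a mean radius of 6371 km, so it is only an approximation.

```python
import math


def manhattan(p, q):
    """City-block distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def euclidean(p, q):
    """Straight-line distance in flat space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two (lat, lon) points given in degrees.

    Assumes a spherical Earth with the given mean radius.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


# Hypothetical grid positions, measured in city blocks.
office = (0, 0)
lunch_spot = (3, 4)

print(manhattan(office, lunch_spot))   # 7 blocks walked along the grid
print(euclidean(office, lunch_spot))   # 5.0 blocks in a direct line

# Approximate coordinates (assumed for illustration).
chicago = (41.8781, -87.6298)
st_louis = (38.6270, -90.1994)
flight_km = haversine_km(*chicago, *st_louis)
print(round(flight_km))
```

The same pair of points yields different answers depending on which function is used, which is the sense in which the meaning of distance changes with context.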

Tags: python, metric learning