Data Science & Tech Blog

Building a Recommender System

By Matthew Mahowald on Thu, Apr 25, 2019

Recommender systems are one of the most prominent examples of machine learning in the wild today. They determine what shows up in your Facebook news feed, the order in which products appear on Amazon, and which videos are suggested in your Netflix queue, among countless other examples. But what are recommender systems, and how do they work? This post is the first in a series exploring some common techniques for building recommender systems as well as their implementation.
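
As a preview of where the series is headed, here is a minimal, hedged sketch of one classic technique, item-based collaborative filtering with cosine similarity. The toy ratings matrix and the helper functions below are illustrative assumptions, not the method the series necessarily builds.

```python
import numpy as np

# Toy user-item ratings matrix (rows = users, columns = items); zeros mean "not rated".
ratings = np.array([
    [5.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
    [0.0, 1.0, 5.0, 4.0],
])

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two item rating vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

# Pairwise similarity between item columns.
n_items = ratings.shape[1]
item_sim = np.array([
    [cosine_similarity(ratings[:, i], ratings[:, j]) for j in range(n_items)]
    for i in range(n_items)
])

def score(user: int, item: int) -> float:
    """Predict a rating as a similarity-weighted average of the user's other ratings."""
    rated = ratings[user] > 0
    weights = item_sim[item, rated]
    if weights.sum() == 0:
        return 0.0
    return float(weights @ ratings[user, rated] / weights.sum())

# Rank user 1's unrated items by predicted score.
user = 1
unrated = [i for i in range(n_items) if ratings[user, i] == 0]
print(sorted(unrated, key=lambda i: score(user, i), reverse=True))
```

The design choice here (comparing item columns rather than user rows) is just one of several approaches the series goes on to discuss.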

Tags: recommender

Tropical Geometry and Neural Networks

By Sam Shideler on Thu, Apr 11, 2019

Algebraic geometry is not a subject that often arises in conversations around data science and machine learning. However, recent work in tropical geometry (a subfield of algebraic geometry) suggests that this subject might be able to give some insight into the types of functions representable by neural networks, as well as some upper bounds on the complexity of functions representable by neural nets of fixed width and depth.
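
To make the link slightly more concrete, here is a sketch of the standard connection (not a summary of the post itself): in the tropical, or max-plus, semiring, "addition" is the maximum and "multiplication" is ordinary addition, so a ReLU unit with integer weights is already a tropical polynomial in two monomials.

```latex
% Tropical (max-plus) semiring operations:
\[
  x \oplus y := \max(x, y), \qquad x \odot y := x + y .
\]
% A ReLU unit with integer weights w and bias b is a tropical polynomial
% in two monomials, since the tropical power satisfies x^{\odot w} = w x:
\[
  \max\bigl(0,\; w^{\top} x + b\bigr)
  \;=\; 0 \,\oplus\, \bigl(b \odot x_1^{\odot w_1} \odot \cdots \odot x_n^{\odot w_n}\bigr).
\]
```

Compositions of such units remain piecewise linear, which is roughly the bridge that lets tropical geometry bound the complexity of functions computed by networks of fixed width and depth.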

Tags: algebraic geometry

When (not) to Lemmatize or Remove Stop Words in Text Preprocessing

By Alex Schumacher on Thu, Mar 21, 2019

Natural language text is messy. It’s full of disfluencies (‘ums’ and ‘uhs’), spelling mistakes, and unexpected foreign text, among other problems. What’s worse, even when all of that mess is cleaned up, natural language text has structural aspects that are not ideal for many applications. Two of those challenges, inconsistency of form and contentless material, are addressed by two common practices: lemmatization and stop word removal. These practices are effective countermeasures to their respective problems, but they are often applied as a matter of course when in fact the decision to use them is application- and problem-specific. In this blog, I’ll be discussing lemmatization and stop word removal, why they’re done, when to use them, and when not to.
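
For readers who haven’t used these steps before, here is a minimal sketch of both practices using NLTK; the example sentence and the expected outputs in the comments are illustrative, and your NLTK version may require slightly different corpus downloads.

```python
import re

import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

# One-time corpus downloads (they are skipped if already present).
nltk.download("stopwords")
nltk.download("wordnet")

text = "The cats were sitting quietly on the mats."

# Simple tokenization: lowercase and keep alphabetic tokens only.
tokens = re.findall(r"[a-z]+", text.lower())

# Lemmatization collapses inflected forms to a base form ("cats" -> "cat").
# Note: without POS tags, WordNetLemmatizer treats every token as a noun,
# so verbs like "sitting" are left unchanged -- one reason this step is
# not a free win for every application.
lemmatizer = WordNetLemmatizer()
lemmas = [lemmatizer.lemmatize(tok) for tok in tokens]

# Stop word removal drops high-frequency, low-content words ("the", "on", "were").
stop_words = set(stopwords.words("english"))
content_only = [tok for tok in lemmas if tok not in stop_words]

print(lemmas)        # ['the', 'cat', 'were', 'sitting', 'quietly', 'on', 'the', 'mat']
print(content_only)  # ['cat', 'sitting', 'quietly', 'mat']
```

Whether either transformation helps depends on the downstream task, which is exactly the question the post takes up.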

Tags: NLP, AI, robots