by Ben | Apr 18, 2016 | Trinalysis |

Customer Lifetime Value (CLV) is an estimate of the total net profit attributed to a single customer. It’s an important metric to understand because it helps businesses determine how much is too much to spend on advertising to acquire a single customer....
by Ben | Apr 2, 2016 | Python |

This guide helps bridge the gap between understanding what regular expressions are and understanding how to use them in Python. If you’re brand new to regular expressions, I highly recommend checking out RegexOne. For this guide, we’ll use...
by Ben | Apr 2, 2016 | R |

This guide helps bridge the gap between understanding what regular expressions are and understanding how to use them in R. If you’re brand new to regular expressions, I highly recommend checking out RegexOne. Hadley Wickham’s stringr package makes...
by Ben | Aug 30, 2015 | Business Intelligence, Trinalysis |

Here’s a practical guide for calculating customer retention and churn from transaction data. First of all, customer retention represents a rate at which customers remain customers before churning, or ending their relationship with a business. Customer churn is...
by Ben | Aug 24, 2015 | Machine Learning |

Logistic regression is a generalized linear model most commonly used for classifying binary data. Its output is a continuous range of values between 0 and 1 (commonly representing the probability of some event occurring), and its input can be a multitude...
by Ben | Aug 31, 2014 | Decision Trees, Machine Learning, R |

The rpart package in R provides a powerful framework for growing classification and regression trees. To see how it works, let’s get started with a minimal example. First let’s define a problem. There’s a common scam amongst motorists where a person...
by Ben | Jul 26, 2014 | R |

Rolling joins are commonly used for analyzing data involving time. A simple example – suppose you have a table of product sales and a table of commercials. You might want to associate each product sale with the most recent commercial that aired prior to the...
by Ben | Jul 25, 2014 | R |

The data.table package in R provides fast methods for handling large tables of data with very simple syntax. The following is an introduction to the basic join operations available using the data.table package. Suppose you have two data.tables – a table of...
by Ben | Jul 21, 2014 | R |

A factor variable (commonly called a categorical variable outside of R) is a variable that takes on a limited set of values. For example, days of the week {Sunday, Monday, etc.} or the set of colors {Red, Blue, Green} should be a factor. By contrast, a vector of...
by Ben | Jul 9, 2014 | Actuarial Science |

When I was in high school, I knew I wanted to pursue a career involving math. I did an internship working for some mechanical engineers at an oil platform consultant company, but I never witnessed my mentors do more than basic geometry or algebra. That’s when I...