Insights from a Data Scientist: Predicting the Customer Journey to Boost ROAS - CAKE

Our Data Science department at CAKE has been busy. Recent accomplishments are highlighted on our tech blog, and a press release announced our related patent application, which uses Hidden Markov Models to boost return on ad spend (ROAS).

Digital marketers want deeper insights into the customer journey so they can better understand their customers and ultimately maximize ROAS. There are several ways to get them. One common method is the Sankey diagram, which shows paths to purchase through different combinations of channel types. The marketer can study the diagram for patterns, or delve into time-consuming spreadsheets of summary statistics and look for the most predominant patterns there.

However, humans don’t necessarily dissect and analyze numbers as well as computers do. That is why we have developed a novel approach that uses the Hidden Markov Model to make predictions and provide machine-driven recommendations that aid in interpreting marketing data.

Why a Hidden Markov Model, you might ask? The model has a long history of use for extracting meaningful patterns from data sequences. We apply it to sequences of events made up of various types of customer interactions and extract the most meaningful patterns from them. At the end of the day, the Hidden Markov Model is nothing more than that: a compact, descriptive way of summarizing the probabilities of the events that occurred when your customers converted and/or bought your product.
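For readers who like to see the math, here is the standard textbook form of a Hidden Markov Model. It is a general sketch, not CAKE's exact formulation. We assume the hidden state s_t stands for a customer's (unobserved) funnel stage at interaction t and the observation o_t for the interaction type we actually record; the symbols π, A and B are the usual textbook names and are our choice for illustration.

```latex
% Model parameters: initial stage probabilities, stage-to-stage
% transitions, and per-stage interaction (emission) probabilities.
\lambda = (\pi, A, B), \qquad
\pi_i = P(s_1 = i), \quad
a_{ij} = P(s_{t+1} = j \mid s_t = i), \quad
b_j(k) = P(o_t = k \mid s_t = j)

% Joint probability of one journey (stages and interactions):
P(o_{1:T}, s_{1:T} \mid \lambda)
  = \pi_{s_1}\, b_{s_1}(o_1) \prod_{t=2}^{T} a_{s_{t-1} s_t}\, b_{s_t}(o_t)

% Likelihood of the observed interactions alone, summing out the stages:
P(o_{1:T} \mid \lambda) = \sum_{s_{1:T}} P(o_{1:T}, s_{1:T} \mid \lambda)
```

The three small tables π, A and B are the "compact summary": every observed journey is explained by the same handful of probabilities.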

The real magic happens when CAKE uses the model to predict your customer interactions. We have demonstrated that our model finds nuances in the customer journey and produces a multinomial distribution over the possible next customer interactions, given a particular customer’s history of interactions. From here we can derive many interesting data points, including:

  1. We can observe how likely the customer is to click on an advertisement from each channel or media type, based on their history. The most likely next channels for all active users can then be aggregated to dynamically optimize campaign spend.
  2. We can find out how likely the user is to convert compared to similar users who have already converted. Looking at this data historically, CAKE believes it should provide a measure of the campaign's current effectiveness for users who have not yet converted.
  3. We can estimate the number of active users, their positions in the funnel, and how likely each position is to lead to a conversion. In other words, based on the model, how many clicks will it take to move each prospect to the converted state?
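To make the first item above concrete, here is a minimal Python sketch of how a next-interaction distribution could be computed from a customer's history and then aggregated across active users. It uses the textbook forward recursion for a Hidden Markov Model and is not CAKE's implementation. The stage names, interaction names, probabilities and user histories are all hypothetical values chosen for illustration.

```python
# Hypothetical two-stage funnel with three interaction types.
# Every number below is illustrative, not fitted to real data.
STAGES = ["browsing", "considering"]
EVENTS = ["search_click", "social_click", "email_click"]
INITIAL = [0.8, 0.2]                    # P(stage of the first interaction)
TRANSITION = [[0.7, 0.3],               # P(next stage | current stage)
              [0.1, 0.9]]
EMISSION = [[0.5, 0.4, 0.1],            # P(interaction | stage)
            [0.2, 0.2, 0.6]]


def next_stage_prior(history):
    """P(stage of the NEXT interaction | history), via the forward recursion."""
    n = len(STAGES)
    prior = list(INITIAL)
    for event in history:
        k = EVENTS.index(event)
        # Update the stage belief with the interaction we observed...
        weighted = [prior[j] * EMISSION[j][k] for j in range(n)]
        total = sum(weighted)
        belief = [w / total for w in weighted]
        # ...then move one step forward through the funnel.
        prior = [sum(belief[i] * TRANSITION[i][j] for i in range(n))
                 for j in range(n)]
    return prior


def next_interaction_distribution(history):
    """Multinomial distribution over the next interaction, given the history."""
    prior = next_stage_prior(history)
    return {event: sum(prior[j] * EMISSION[j][k] for j in range(len(STAGES)))
            for k, event in enumerate(EVENTS)}


def aggregate_next_interactions(histories):
    """Expected number of next interactions per type across all users."""
    totals = dict.fromkeys(EVENTS, 0.0)
    for history in histories.values():
        for event, p in next_interaction_distribution(history).items():
            totals[event] += p
    return totals


# Hypothetical active users and their interaction histories so far.
active_users = {
    "user_a": [],
    "user_b": ["search_click"],
    "user_c": ["email_click", "email_click"],
}
totals = aggregate_next_interactions(active_users)
shares = {event: count / sum(totals.values()) for event, count in totals.items()}

if __name__ == "__main__":
    for event in EVENTS:
        print(f"{event}: expected clicks {totals[event]:.2f}, share {shares[event]:.1%}")
```

The shares are one simple signal a marketer could use to rebalance spend. A real system would also weigh cost per click and the conversion likelihood described in the second item.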


Behind the scenes, our Data Science team evaluates several variants of this model and finds the one that best predicts your data, so you don’t have to optimize manually. Key model parameters include the number of stages in your marketing funnel, which is related to the number of clicks a customer typically makes before converting on your product. We can also optimize the nature of the model: namely the number of terms in the “interaction” component (should we consider two, three or four different types of channels interacting?) and whether that “interaction” component should be ordered.
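The selection loop itself can be sketched in a few lines of Python. To keep the example short, we swap in a much simpler stand-in for the Hidden Markov Model: a smoothed count model that predicts the next channel from the previous channels. This is an assumption made for illustration, not CAKE's method. What the sketch does show is the idea of trying two, three or four interacting terms, with and without order, and keeping the variant that best predicts held-out journeys. The journeys and channel names are hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import product


def context_key(history, terms, ordered):
    """Summarise the previous (terms - 1) channels as a lookup key."""
    context = tuple(history[-(terms - 1):])
    return context if ordered else tuple(sorted(context))


def fit(journeys, terms, ordered):
    """Count which channel follows each context in the training journeys."""
    counts = defaultdict(Counter)
    for journey in journeys:
        for i, channel in enumerate(journey):
            counts[context_key(journey[:i], terms, ordered)][channel] += 1
    return counts


def log_likelihood(counts, journeys, terms, ordered, channels):
    """Held-out log-likelihood with add-one smoothing."""
    total = 0.0
    for journey in journeys:
        for i, channel in enumerate(journey):
            seen = counts.get(context_key(journey[:i], terms, ordered), Counter())
            p = (seen[channel] + 1) / (sum(seen.values()) + len(channels))
            total += math.log(p)
    return total


def select_variant(train, held_out, channels):
    """Score every (terms, ordered) variant and return the best one."""
    scores = {}
    for terms, ordered in product((2, 3, 4), (True, False)):
        counts = fit(train, terms, ordered)
        scores[(terms, ordered)] = log_likelihood(
            counts, held_out, terms, ordered, channels)
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical journeys; "convert" marks the end of a converting path.
CHANNELS = ["search", "social", "email", "convert"]
train = [
    ["search", "social", "email", "convert"],
    ["social", "search", "email"],
    ["search", "social", "email", "convert"],
    ["email", "search", "social"],
    ["search", "email", "convert"],
]
held_out = [
    ["search", "social", "email", "convert"],
    ["social", "search", "email"],
]
best, scores = select_variant(train, held_out, CHANNELS)

if __name__ == "__main__":
    for (terms, ordered), score in sorted(scores.items()):
        print(f"terms={terms} ordered={ordered}: {score:.3f}")
    print("best variant:", best)
```

Scoring on journeys the model has not seen is what keeps the larger variants honest: a four-term ordered model only wins if the extra detail genuinely improves prediction.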

But wait, there’s more. We’ve only just begun and are already moving forward with further innovations. A forthcoming product release will provide outputs from multiple models, including Chi-squared tests, Linear Regression-based Attribution Tests, additional Statistical Tests, and new Visualizations that show the importance of order between touch points. Using these approaches together gives us several complementary viewpoints into your marketing data. The result is a guide to what’s working and to whether factors such as touch-point order matter, along with the statistical information to back it up.
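As one illustration of how a Chi-squared test might check whether order matters, the Python sketch below compares conversion counts for two orderings of the same pair of touch points. The counts are hypothetical, the test is the plain Pearson Chi-squared test on a 2x2 table without continuity correction, and nothing here is taken from the forthcoming release.

```python
import math


def chi_squared_2x2(table):
    """Pearson Chi-squared statistic and p-value for a 2x2 table (1 d.o.f.)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    # With one degree of freedom the upper-tail probability has a closed form.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows are touch-point orders, columns are outcomes.
#                      converted, did not convert
observed = [
    [30, 70],   # search then social
    [50, 50],   # social then search
]
statistic, p_value = chi_squared_2x2(observed)

if __name__ == "__main__":
    print(f"chi-squared = {statistic:.3f}, p = {p_value:.4f}")
```

A small p-value suggests the conversion rate differs between the two orderings in this sample. It does not by itself show that the order caused the difference.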