The retail industry. In the era of Data Science, it’s all about the data and tailor-made targeting. If you want to brag about it, you would say: I’ve got the unique, omni-channel, 360-something solution that can perfectly model customer behaviour and even go to Mars. What really happens is that you feel lost. There are plenty of different models, lots of useful but noisy data, and challenges regarding resources, competition, expenditures… How do you handle them all? Well, that’s not the topic of this post, but it will certainly be one in future posts.
Modeling customer behaviour is a tricky task. Customers’ habits and preferences may change over time, and you always get stuck between earning more money and retaining a given customer base. And you have to keep in mind that you’re in a competitive market, so if your product is pure shit, too expensive, or poorly communicated – you’re doomed. Thus, it’s not only about the customer, but about the total business process, involving several departments – and everything has to be coherent and harmonized in order to achieve the best results. Pretty ambitious, right? 😀
Whenever you have ambitious plans, incremental realization is a good way to go. “Think big, act small”, they say. That is the approach we take at Things Solver: we always divide a problem into smaller problems. Modeling customer behaviour is a complex task, and it should be handled carefully, from different (let’s say 360, just to keep up 😀 ) angles and through several phases.
To frame it: if you set a goal, for example to improve customer experience, you know where to start from. A customer is ready to pay for something if that something meets their expectations and fulfills their needs. How do you come up with that something? Data science is there to help.
There are several key components that you need to analyze:
- characteristics (customer information and demographics),
- activity level (purchase frequency, recency and trends),
- habits (regarding market basket, time, spending,… ),
- preferences (regarding manufacturers, stores, materials,…).
This is a very demanding and comprehensive analysis. It includes various modules, from lead generation, scoring and segmentation, through customer lifetime value estimation, propensity modeling and survival analysis, to market basket analysis and recommender systems. If you know how to consolidate their outputs, you may say you’ve found the holy grail of tailor-made targeting. Since each of these modules is a separate area of research and development, there is enough material to dedicate a whole blog post to each one of them.
What I will talk about is a small fragment of a puzzle, a very attractive problem in customer behaviour modeling, that deals with the customer’s propensity and activity level. I will talk about two different approaches that are often misinterpreted as independent fields of analysis. Those are: propensity to purchase and survival analysis.
Propensity to purchase
What machine learning often relies on is that there are hidden patterns in behaviour that can be identified and turned into insight. Identifying those patterns in customer behaviour can be pretty valuable for campaign management and focused targeting. The probability that a customer will make a purchase in some future period of time is known as the customer’s propensity to purchase.
Why is this important? Well, each customer has some tendency towards purchasing. If we can measure the probability that a customer will make a purchase, we can shape our campaign accordingly. Targeting customers who would certainly come anyway only generates costs. Not targeting the sleepy customers results in attrition and even higher costs.
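As a toy illustration, propensity to purchase can be framed as a binary classification problem. The sketch below (all data, labels and feature choices are synthetic assumptions, not a real pipeline) trains a logistic regression on RFM-style features and reads off a purchase probability:

```python
# Minimal propensity-to-purchase sketch: logistic regression on synthetic
# RFM-style features (recency in days, purchase frequency, money spent).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 500
X = np.column_stack([
    rng.integers(1, 365, n),    # recency: days since last purchase
    rng.integers(1, 50, n),     # frequency: number of purchases
    rng.uniform(10, 1000, n),   # monetary: total amount spent
]).astype(float)
# Synthetic label: recent and frequent customers tend to buy again
y = ((X[:, 0] < 90) & (X[:, 1] > 10)).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)
# Propensity for a customer who bought 30 days ago, 20 times, 500 in total
propensity = model.predict_proba([[30.0, 20.0, 500.0]])[0, 1]
print(f"propensity to purchase: {propensity:.2f}")
```

In practice you would use real transaction history, a richer feature set and a proper train/test split; the point here is only the framing: features in, purchase probability out.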
Propensity to purchase analysis includes analyzing customer, transactional and internal data. This is important because we want to understand all the circumstances and components that affect a customer’s decision to come to the store (or visit a site). It is important to analyze customer data, when available, because teenagers and married couples may have different patterns of behaviour. If available, it is also important to track behaviour online and join it with offline behaviour to get a complete picture.
On the other hand, transactions are a treasury of information. They can show what the habits and preferences are, without a single word from a customer. Internal data can provide additional information about availability, supplies, actions and discounts.
In this analysis, there are plenty of features that should be included. These features should reflect customer habits and activity level. It is important to analyze frequency, recency and the amount of money spent, but also to include interpurchase time and purchase trend. In the retail industry, there are also seasonalities to account for, like holidays and seasonal sales.
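For illustration, here is how some of these features (recency, frequency, money spent, interpurchase time) could be derived from a raw transaction log with pandas; the column names, dates and amounts are assumptions made up for the example:

```python
import pandas as pd

# Toy transaction log: one row per purchase
transactions = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2],
    "date": pd.to_datetime(["2023-01-05", "2023-02-10", "2023-03-15",
                            "2023-01-20", "2023-03-01"]),
    "amount": [50.0, 70.0, 30.0, 200.0, 150.0],
})
snapshot = pd.Timestamp("2023-04-01")  # "today" for the analysis

features = transactions.groupby("customer_id").agg(
    recency_days=("date", lambda d: (snapshot - d.max()).days),
    frequency=("date", "count"),
    monetary=("amount", "sum"),
    interpurchase_days=("date",
                        lambda d: d.sort_values().diff().dt.days.mean()),
)
print(features)
```

Seasonality and trend features would come on top of this, for example by bucketing purchases per month or flagging holiday periods.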
At Things Solver, we are working closely on this comprehensive analysis, trying to solve all the challenges encountered along the way. And there are lots of them. Some are technical, like how (and whether) you can identify a customer (this is important if you want to gather any demographic information), while some are analytical, like whether you should analyze all customers at once, how much history to include, etc.
Modeling a customer’s propensity to make a purchase is a very demanding task, but it can be pretty helpful. Still, there are some open questions, like: when is the purchase going to happen? Or, if this customer is not going to make a purchase in the observed period, will they buy in some future period? And that is where survival analysis comes onto the stage.
Survival analysis, like many other approaches, finds its roots in medical research and biological studies, and it represents time-to-event modeling. You get the idea, right? Survival analysis deals with estimating the time period from an action to a given event. Magnificent!
Although pretty intuitive and sharp, survival analysis is a complex task. And it is more powerful than you can imagine. The main goal of this analysis is to estimate the time to a given event, and to quantitatively explain how this time depends on various properties of the treatment, customers and other variables. What is the event? Well, in our case it’s a purchase. What is the treatment? The promotion we’re targeting the customer with.
Why is this analysis so adorable and powerful? Because it solves some of the main drawbacks we encountered in propensity to purchase modeling. First, propensity to purchase, as said, only gives the probability of making a purchase, but we also want to know when it will happen. Second, in propensity to purchase modeling, there are lots of unknown or missing outcomes (the customer hasn’t made another purchase by the time we observe the data, which does not mean they won’t do it in the future), which can be a problem when dealing with classification tasks. The records (customers) whose outcome we don’t know are called censored records, and survival analysis deals with them successfully.
At the core of this analysis are two functions: the survival function and the hazard function.
The survival function is defined as the probability that an individual “survives” from the time origin (the time of some trigger event) to time t. The value of the survival function at time point t corresponds to the fraction of customers who have not yet experienced the event at that point. While the survival function focuses on the probability of the event not happening (thus, the survival time), the hazard function describes the “risk” of the event, which is more convenient for our case.
The hazard function is defined as the probability of the event occurring in an infinitesimally small time period between t and t+dt, given that the individual has survived up until time t. In other words, it’s the instantaneous rate at which the event happens at a particular point in time. If this left you confused, I strongly recommend consulting Google; it will give you plenty of thorough explanations of these two terms.
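In symbols (the standard textbook formulation), with T denoting the time from the trigger event to the purchase and f(t) its probability density, the two functions are:

```latex
S(t) = P(T > t), \qquad
h(t) = \lim_{dt \to 0} \frac{P(t \le T < t + dt \mid T \ge t)}{dt} = \frac{f(t)}{S(t)}
```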
Another advantage of this analysis is that it can model the time to an event for the different groups we want to target. Some advanced techniques and extended parametric and nonparametric models can estimate the time to an event given a set of features, like demographic and behavioural properties or targeting information. And it can also be focused on an individual. If you want to learn more, take a look at this Python library: https://lifelines.readthedocs.io/en/latest/index.html.
My co-worker Strahinja always stresses that the key to finding the best solution is to try a hybrid approach. Once we had a lecture at university, and a big heading was written on the presentation slide: “1+1=3”. It was inspired by teamwork and joining forces. That is what we want to achieve. Let’s combine the best of both approaches and create a higher value (pretty similar to the Master Algorithm story, right? 🙂 ).
How can these approaches help us in campaign management optimization? If we have the probability that a customer will purchase (propensity to purchase), and if we have the estimated time to that event (survival analysis), we can easily plan the timing and frequency of targeting. We don’t want to swamp our customers with thousands of promotions they are not interested in, or that arrive in a period when they simply have no habit of purchasing. Other modules like segmentation, recommender systems and CLV can help us in tailor-made campaign creation.
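As a toy sketch of that combination (thresholds and field names are made up for illustration): take each customer’s propensity score and estimated days to the next purchase, and contact only those who are both likely to buy and due to buy within the planning horizon:

```python
# Combine propensity scores with estimated time-to-purchase to decide
# whom to contact in the next campaign wave.
customers = [
    {"id": 1, "propensity": 0.85, "days_to_purchase": 12},
    {"id": 2, "propensity": 0.20, "days_to_purchase": 90},
    {"id": 3, "propensity": 0.65, "days_to_purchase": 5},
]

def schedule_campaign(customers, min_propensity=0.5, horizon_days=14):
    """Customers worth contacting within the planning horizon."""
    return [
        c["id"] for c in customers
        if c["propensity"] >= min_propensity
        and c["days_to_purchase"] <= horizon_days
    ]

print(schedule_campaign(customers))
```

A real scheduler would also weigh contact costs, expected margin and fatigue rules, but the core idea is this join of the two model outputs.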
Note that campaign management can be successfully optimized only if these modules are comprehensively developed. In a series of blog posts, my colleagues and I will try to explain these modules in more detail, so stay tuned! Cheers! 🙂