Combating Model Drift: Impact of Covid-19 on Predictive Modeling

COVID-19 has been called the “kryptonite” of AI, breaking brittle models with outlier data that becomes the new normal. Scientist Wim Naude made this point about medical forecasting, where a mass of new and unprecedented data hampers the use of AI, but the implication holds for consumer behavior too.

The consumer has been in a state of constant flux since the early weeks of the COVID-19 outbreak. Shoppers moved from panic buying of essentials to a tentative return to non-essential spending, while attitudes towards sanitizing, social engagements and other aspects of life kept fluctuating. The new customer profile is an ephemeral one.

In this context, most Artificial Intelligence (AI) and Machine Learning (ML) solutions designed for profiling customer behaviors and predicting future events are no longer viable as built. For instance, when retail store footfall and shopping patterns changed, many predictive models used for segmentation or forecasting based on in-store shopping started to fail, disrupting demand planning and supply chains.

Fast and Furious Model Drift

Predictive analytics uncovers historical trends and uses them to predict what will happen in the future. Since recent behaviors have changed drastically and continue to be volatile, static AI/ML solutions are turning out to be either inaccurate or misleading.

This is an extreme case of “model drift”, one of the core problems at the heart of data science. Model drift can have a long-term impact on a business’s ability to make smart, data-driven decisions. It can be broadly classified into concept drift and data drift:

  • Concept drift: Concept drift occurs when the relationship between the input data and the outcome being predicted changes, so the same inputs no longer lead to the same result. Today’s concept drift in predictive modeling stems from changes in consumer behavior and economic activity brought on by social distancing, lockdowns and other responses to the pandemic.
  • Data drift: Data drift, on the other hand, occurs when the statistical properties of the input data change. Today’s data drift stems from changes in buying trends.
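
As a rough illustration of how data drift can be measured, the sketch below computes a Population Stability Index (PSI) between a baseline sample and a recent sample of one input feature. It is a minimal example and not a description of any specific production method: the function name `psi`, the basket-value figures and the bin count are all assumptions made for illustration, and the data is synthetic. A commonly quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but these cut-offs are conventions and should be tuned per feature.

```python
import math
import random


def psi(baseline, current, n_bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples."""
    # Bin edges come from baseline quantiles, so each bin holds
    # roughly the same share of the baseline sample.
    ordered = sorted(baseline)
    edges = [ordered[int(len(ordered) * i / n_bins)] for i in range(1, n_bins)]

    def shares(sample):
        counts = [0] * n_bins
        for x in sample:
            # Bin index = number of edges at or below x (0 .. n_bins - 1).
            counts[sum(1 for e in edges if x >= e)] += 1
        # eps keeps the logarithm defined when a bin is empty.
        return [max(c / len(sample), eps) for c in counts]

    p, q = shares(baseline), shares(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Synthetic example (hypothetical numbers): average basket value
# before and after a shift in buying patterns.
random.seed(0)
before = [random.gauss(100, 15) for _ in range(5000)]
after = [random.gauss(130, 15) for _ in range(5000)]

print("PSI, baseline vs itself:", psi(before, before))
print("PSI, baseline vs shifted:", psi(before, after))
```

PSI only looks at the inputs, so it addresses data drift. Concept drift usually has to be caught by comparing predictions with actual outcomes once those become available.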

These drifts are not only impacting models designed to forecast demand and sales in retail and e-commerce but also those designed to predict fraud and evaluate customer risk profiles, which in turn affects pricing across industries.

As a result, data science teams can no longer rely on historical data alone to train and deploy models in real-world scenarios. Data scientists need to be more agile, adaptive and proactive to keep deployed models responsive, making sure they provide the value they were built to provide.

How to combat this drift?

Data and analytics can help stabilize businesses, lay foundations of new processes and predict what’s next. Therefore, it is important to have short and long-term data-driven plans in place as quickly as possible to make informed decisions.

There is a need to focus on the following to combat the drift:

  • Monitor the data for significant changes and alert solution owners, so they can act before data shifts degrade model outcomes
  • Monitor input data and decide how it should be adjusted when models are retrained
  • Make AI/ML solutions more agile and flexible: leverage MLOps automation to detect, understand and reduce the impact of concept drift on production models, and enable process automation
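
The monitoring and alerting steps above can be sketched in a few lines. The example below is a simplified illustration, not a description of any particular MLOps product: the class name `ErrorDriftMonitor`, the window size and the tolerance factor are hypothetical choices. It keeps a rolling window of absolute prediction errors and signals an alert once the recent average error rises well above the error measured when the model was validated.

```python
from collections import deque
from statistics import mean


class ErrorDriftMonitor:
    """Flags possible drift when recent error rises well above a baseline."""

    def __init__(self, baseline_mae, window=30, tolerance=1.5):
        self.baseline_mae = baseline_mae    # mean absolute error at validation time
        self.tolerance = tolerance          # alert when recent MAE > tolerance * baseline
        self.errors = deque(maxlen=window)  # holds only the most recent errors

    def update(self, predicted, actual):
        """Record one prediction/outcome pair; return True if an alert should fire."""
        self.errors.append(abs(predicted - actual))
        if len(self.errors) < self.errors.maxlen:
            return False  # not enough observations to judge yet
        return mean(self.errors) > self.tolerance * self.baseline_mae


# Hypothetical daily demand forecasts: stable at first, then demand jumps.
monitor = ErrorDriftMonitor(baseline_mae=5.0, window=5, tolerance=1.5)
observations = [(100, 104)] * 5 + [(100, 130)] * 3
for day, (predicted, actual) in enumerate(observations, start=1):
    if monitor.update(predicted, actual):
        print(f"Day {day}: recent error exceeds tolerance, review or retrain the model")
```

In practice the alert would feed a notification or an automated retraining job rather than a print statement, and the baseline itself would be refreshed after each retrain.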

Course5’s Multi-Pronged Approach

To address the challenges of concept drift, we at Course5 propose a multi-pronged strategy that accounts not only for retail footfall data but also for data on online customer behavior (e.g., Visits, Page Views, Cart Additions, Abandonment Rate, Checkouts, Avg. Session Duration, Internal Searches, Visit Duration). This approach has been extremely useful in recommending the right products to customers.

We also use a Hybrid Modelling approach that involves second-generation machine learning models such as Random Forest and Gradient Boosting to capture model drift effectively.

Suman Hiremath

Suman is a Senior Data Scientist with 10+ years of experience in statistical analysis, predictive modelling and machine learning across various industries such as Technology,...
