A Complete Guide to Monitoring Machine Learning Models: Part 1
Artificial intelligence | 18 Jan 2023 |   14 min

AI and ML are growing in popularity, so business owners are building more use cases and putting machine learning models into production. Businesses often adopt the technology in their processes without first focusing on its impact on profit.

This approach works best when the goal is to introduce the latest technologies into products. After a lot of research and collaboration with different stakeholders, data scientists develop a product and take the ML models live.

Creating such products requires a thorough understanding of:

  • the underlying business,
  • data gathering,
  • data storage,
  • what to predict or classify,
  • data cleaning,
  • data transformation,
  • applying different ML models,
  • hyperparameter tuning,
  • validating results,
  • deployment

After deployment comes a very important part: ML Ops. ML Ops is a set of practices that aims to deploy and maintain ML models in production reliably and efficiently.

In this blog, we are going to dive into the model monitoring part of ML Ops, along with model retraining and incremental learning. Various tools can perform these tasks in real-time production, and each uses different methods to do so. We are going to look at some common statistical approaches for monitoring and understand the need for retraining.

Model monitoring is the process of regularly evaluating the performance of a machine learning model in production. This is important because machine learning models are often deployed in real-world environments where the data they encounter may differ from the data they were trained on. This can lead to degradation of performance over time.

Model monitoring allows organizations to detect and address this degradation in performance before it becomes a major problem.

There are several key components to effective model monitoring:

Metrics: The first step in model monitoring is to define a set of metrics that will be used to check the performance of the model. These metrics should be chosen carefully, as they will be used to determine whether the model is performing as expected or if there are any issues that need to be addressed. Some common metrics for evaluating the performance of machine learning models include accuracy, precision, recall, and F1 score.
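
As a minimal sketch of how the four metrics named above relate to each other, the snippet below computes them for a binary classifier from lists of true and predicted labels. The function name `classification_metrics` and the sample labels are illustrative assumptions, not part of any specific monitoring tool.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary classifier."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    # Count the cells of the confusion matrix.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels collected from production (1 = positive class).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Which metric matters most depends on the use case; for example, recall is usually the one to watch when missing a positive case is costly.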

Data collection: To check the performance of a machine learning model, data must be collected on how the model is performing in production. This can be done through a variety of methods, such as logging predictions made by the model or using a monitoring tool to track the model’s performance over time.
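
One simple way to log predictions could look like the sketch below, which writes one JSON record per prediction. The record fields (`timestamp`, `model_version`, `features`, `prediction`) and the helper names are assumptions made for illustration; an in-memory `io.StringIO` stands in for whatever file or log sink a real deployment would use.

```python
import io
import json
from datetime import datetime, timezone


def log_prediction(stream, features, prediction, model_version):
    """Append one prediction as a JSON line to a writable text stream."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "features": features,
        "prediction": prediction,
    }
    stream.write(json.dumps(record) + "\n")
    return record


def read_log(stream):
    """Read back every logged record for later analysis."""
    stream.seek(0)
    return [json.loads(line) for line in stream if line.strip()]


log = io.StringIO()  # stand-in for a file or a log sink
log_prediction(log, {"size_sqft": 1200, "age_years": 8}, 245000.0, "v1")
log_prediction(log, {"size_sqft": 900, "age_years": 30}, 150000.0, "v1")
records = read_log(log)
print(len(records), "predictions logged")
```

Storing the input features next to each prediction makes it possible to study both model performance and changes in the incoming data later on.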

Data analysis: Once data has been collected on the model’s performance, it must be analysed to determine if there are any issues that need to be addressed. This can be done manually, or with automated tools that can alert an organization if certain thresholds are exceeded.
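
The automated check described above can be sketched as a comparison of the latest metric values against agreed limits. Here the limits are minimum acceptable values, and both the metric values and the limits are made-up numbers for illustration; in practice they would come from the business requirements.

```python
def check_thresholds(metrics, minimums):
    """Return alert messages for metrics that are missing or too low."""
    alerts = []
    for name, minimum in minimums.items():
        value = metrics.get(name)
        if value is None:
            alerts.append(f"{name}: no value reported")
        elif value < minimum:
            alerts.append(
                f"{name}: {value:.2f} is below the minimum {minimum:.2f}"
            )
    return alerts


# Hypothetical values for one monitoring window.
weekly_metrics = {"accuracy": 0.91, "recall": 0.72}
minimums = {"accuracy": 0.90, "recall": 0.80, "precision": 0.85}
alerts = check_thresholds(weekly_metrics, minimums)
for alert in alerts:
    print("ALERT:", alert)
```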

There are two main reasons why a model's performance drops and it stops generating real value:

  1. Data Drift
  2. Concept Drift

What is Data Drift?

Data drift is a phenomenon where the data received in production (or the test data) no longer mimics the data that was used to train the ML model. When I say 'mimic', I simply mean that the data received in production should have statistical properties equivalent to those of the training data.
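
To make 'equivalent statistical properties' concrete, here is a rough sketch that compares the mean and standard deviation of one numeric feature between training and production. The function name, the sample values, and the idea of expressing the mean shift in units of the training standard deviation are illustrative assumptions, not a standard recipe.

```python
from statistics import mean, stdev


def summary_shift(train_values, prod_values):
    """Compare basic statistics of one feature across two samples."""
    train_mean, train_std = mean(train_values), stdev(train_values)
    prod_mean, prod_std = mean(prod_values), stdev(prod_values)
    if train_std == 0:
        raise ValueError("training values have zero variance")
    return {
        # How far the production mean moved, in training standard deviations.
        "mean_shift_in_std": abs(prod_mean - train_mean) / train_std,
        # A ratio far from 1 means the spread of the feature has changed.
        "std_ratio": prod_std / train_std,
    }


# Hypothetical values of a single feature.
train_values = [50, 52, 48, 51, 49, 50, 53, 47]
prod_values = [60, 62, 58, 61, 59, 60, 63, 57]
shift = summary_shift(train_values, prod_values)
print(shift)
```

A large mean shift or a spread ratio far from 1 is only a hint that the data has drifted; it does not by itself prove that model performance has dropped.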

Why does it occur?

1. Changes in the underlying system being modelled: For example, a model trained to predict traffic patterns on a highway may see a change in the data if new construction projects affect the flow and make people deviate from their routes.

2. Changes in the data collection process: For example, an ML model is trained on data collected using a certain type of sensor and is then used to make predictions on data collected using a different type of sensor.

3. Changes in the environment in which the data is collected: For example, an ML model trained to predict weather may see a change in the data if there are changes in the climate or in the geographical location being modelled.

4. Changes in the data distribution: Suppose we have trained a model to predict the health score of a population aged above 60. If this model is then used to predict the health score of people below the age of 25, the change in population affects the model's performance.

Types of Data Drift

There are two main types of data drift:

Covariate Shift: The distribution of the input features changes, but the relationship between the inputs and the output variable remains the same. As a simple example, if a model is trained to predict the price of a house based on its location, size and age, covariate shift may occur if the distribution of house sizes or ages changes over time while the overall demand for houses remains the same.
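
One possible way to check a single input feature for covariate shift is the two-sample Kolmogorov-Smirnov (KS) statistic, which measures the largest gap between the empirical distributions of two samples. The sketch below is a plain-Python illustration under stated assumptions: the house sizes are made up, the helper names are hypothetical, and the critical value uses the usual large-sample approximation, which is only rough for small samples.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest gap between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    # The maximum gap is always reached at one of the observed values.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def ks_critical_value(n, m, alpha=0.05):
    """Approximate large-sample critical value for the two-sample KS test."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    return c_alpha * math.sqrt((n + m) / (n * m))


# Hypothetical house sizes (sq ft) at training time and in production.
train_sizes = list(range(800, 1800, 50))
prod_sizes = [size + 600 for size in train_sizes]

d = ks_statistic(train_sizes, prod_sizes)
critical = ks_critical_value(len(train_sizes), len(prod_sizes))
drift_detected = d > critical
print(f"KS statistic = {d:.2f}, critical value = {critical:.2f}")
```

When the statistic exceeds the critical value, the two samples are unlikely to come from the same distribution, which is a signal to investigate that feature further.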

Prior Probability Shift: The distribution of the target variable changes. For example, during Covid times, some people's incomes were not affected, but they decided not to pay their EMIs and took advantage of government schemes, perhaps to save money in case conditions worsened. Here the input variable (income) has not changed, but the output (EMI payment) has.
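
A change in the target distribution can be checked, once production labels are available, by comparing class proportions between training and production. The sketch below uses the total variation distance (half the sum of absolute differences in class proportions) as one possible measure; the label names and counts are invented to mirror the EMI example.

```python
from collections import Counter


def label_distribution(labels):
    """Share of each class in a list of labels."""
    counts = Counter(labels)
    return {label: count / len(labels) for label, count in counts.items()}


def total_variation_distance(train_labels, prod_labels):
    """0 means identical class proportions, 1 means no overlap at all."""
    p = label_distribution(train_labels)
    q = label_distribution(prod_labels)
    return 0.5 * sum(
        abs(p.get(label, 0.0) - q.get(label, 0.0)) for label in set(p) | set(q)
    )


# Hypothetical EMI outcomes before and during the disruption.
train_labels = ["paid"] * 90 + ["missed"] * 10
prod_labels = ["paid"] * 70 + ["missed"] * 30
tvd = total_variation_distance(train_labels, prod_labels)
print(f"Total variation distance = {tvd:.2f}")
```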

What is Concept Drift?

As the name suggests, the very concept or underlying situation on which the model was trained changes significantly, so the relationship between the input features and the target that the model learned no longer holds. In simple words, it often shows up as a combination of covariate shift and prior probability shift.

Imagine you have a machine learning model that is trained to predict the sales of a particular product based on various features such as the product’s price, the store it is sold in, and the season. The model is trained on sales data from the past few years and performs well on the test set.

However, over time, both the input features and the output variable change. For example, the price of the product may increase, the store may start selling the product in a different location, the season may change, and the overall demand for the product may vary. These changes in both the input features and the output variable (sales of the product) would be considered concept drift.

How to overcome drifts:

There are several strategies that you can use to overcome data drift:

  • Monitor your data: Regularly monitor the statistical properties of your data and look for signs of drift. This can help you identify when drift is occurring and take corrective action before it affects the performance of your model.
  • Use adaptive models: Some machine learning models, such as online learning algorithms, are designed to adapt to changes in the data distribution over time. These models can be trained on a continuous stream of data and automatically update their internal parameters as the data distribution changes.
  • Use data augmentation: Data augmentation involves generating additional synthetic data that is similar to your existing data, but with slightly different characteristics. This can help your model learn to generalize better and be more robust to data drift.
  • Retrain your model: If you detect significant data drift, you may need to retrain your model on a new, updated data set. This can help ensure that your model is able to accurately make predictions on the current data distribution.
  • Use domain knowledge: If you have domain knowledge about the data you are working with, you can use this knowledge to anticipate and mitigate data drift. For example, if you know that certain factors are likely to change over time (e.g., demographics and economic indicators), you can incorporate this information into your model to make it more robust to drift.
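
To illustrate the adaptive-model strategy from the list above, here is a toy online learner: a one-feature linear model that updates its parameters with a small gradient step after every observation. The class name, the learning rate, and the synthetic data streams are all assumptions for illustration; a real system would typically rely on an established online learning library.

```python
class OnlineLinearModel:
    """One-feature linear model updated one observation at a time."""

    def __init__(self, learning_rate=0.1):
        self.weight = 0.0
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.weight * x + self.bias

    def update(self, x, y):
        # One stochastic gradient step on the squared error.
        error = self.predict(x) - y
        self.weight -= self.learning_rate * error * x
        self.bias -= self.learning_rate * error


model = OnlineLinearModel()

# Before the drift: the stream follows y = 2x.
for _ in range(200):
    for i in range(11):
        x = i / 10
        model.update(x, 2.0 * x)
weight_before = model.weight

# After the drift: the relationship becomes y = 3x + 1.
for _ in range(200):
    for i in range(11):
        x = i / 10
        model.update(x, 3.0 * x + 1.0)

print(f"weight before drift: {weight_before:.2f}")
print(f"after drift: weight={model.weight:.2f}, bias={model.bias:.2f}")
```

Because the model keeps learning from every new observation, it follows the new relationship without a separate retraining step. The trade-off is that it can also follow noise or bad data just as readily, so monitoring is still needed.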

 

Well, this blog has been an introduction to model monitoring, the types of drift with examples, and why drift is a serious concern. We've also looked at retraining. In my next blog, we will look at some statistical approaches for detecting drift in data as well as methods to overcome such situations.

Send us an email at Nitor Infotech with your comments or if you’d like to discover details about our AI & ML capabilities.

Vasishtha Ingale

Software Engineer

Vasishtha is a Software Engineer at Nitor Infotech. He has a keen interest in assimilating statistical approaches for Data Science. He is passionate about Python programming. He has worked mainly in domains like NLP, Data Analysis, Machine Learning modelling, Model Explainability, and Object Detection and has hands-on experience in libraries like Scikit Learn and MLlib. He likes to explore Machine Learning modelling. Experienced in handling data related to BFSI, healthcare and investment portfolios, he has a deep understanding of core concepts behind any open-source conversational AI platform and likes to derive compelling use cases for different industries. Apart from work, he is interested in human psychology, history as well as different ethnic cultures and their lifestyles.

   
