Hi,
How would you advise monitoring the performance of a model in production over time, given that it could be subject to data drift? Is it, for example, possible to see confidence levels degrade over time, or are there better ways to do this?
Thanks!
Hi @Nasnl,
Data drift is a common issue for all machine learning models in production and is closely connected to concept drift. It occurs when the live data diverges from the data the model was trained on, and it is one of the most frequent reasons why model performance degrades over time.
Addressing the problem:
Since data drift happens when the incoming data no longer matches the data the model was trained on, there are a couple of ways to measure and address this:
- Monitor the distributions of the model's input features in production and compare them against the training data, for example with statistical tests such as Kolmogorov–Smirnov or with drift metrics such as the Population Stability Index (PSI).
- Monitor the model's actual performance metrics whenever ground-truth labels become available, even if they only arrive with a delay.
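As a rough sketch of how input-distribution drift could be measured, here is a small NumPy implementation of the Population Stability Index on a single feature. The function name, the synthetic data, and the thresholds in the comments are my own illustration (the thresholds are conventional rules of thumb), not something prescribed in this thread:

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """PSI between a training-time feature sample and live production data.

    Rule-of-thumb interpretation: < 0.1 no meaningful drift,
    0.1-0.25 moderate drift, > 0.25 significant drift.
    """
    # Bin edges are derived from the training (expected) distribution
    edges = np.histogram_bin_edges(expected, bins=bins)
    expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    actual_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Clip to avoid division by zero and log(0) in empty bins
    expected_pct = np.clip(expected_pct, 1e-6, None)
    actual_pct = np.clip(actual_pct, 1e-6, None)
    return float(np.sum((actual_pct - expected_pct)
                        * np.log(actual_pct / expected_pct)))

rng = np.random.default_rng(0)
train_feature = rng.normal(0.0, 1.0, 10_000)
live_same = rng.normal(0.0, 1.0, 10_000)     # same distribution, no drift
live_shifted = rng.normal(0.8, 1.0, 10_000)  # mean shift, i.e. drift

psi_same = population_stability_index(train_feature, live_same)
psi_shifted = population_stability_index(train_feature, live_shifted)
```

In practice you would compute this per feature on a schedule (say, daily) and alert when the value crosses your chosen threshold.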
You could also monitor a model's confidence, but there are some drawbacks. Machine learning models are commonly described as overconfident in their predictions, so raw confidence scores are not a reliable way to measure data drift on their own. If you mitigate this, for example with model calibration, then confidence can definitely serve as an additional signal, but it should not be your primary metric.
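To make the calibration point concrete, here is a hedged NumPy sketch of the Expected Calibration Error (ECE), one common way to check how trustworthy a model's confidence scores are before leaning on them for monitoring. The helper function and the synthetic data are my own illustration, not part of the thread:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between stated confidence and observed accuracy."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight bin by its share of samples
    return float(ece)

rng = np.random.default_rng(1)
conf = rng.uniform(0.5, 1.0, 5_000)
# Well calibrated: probability of being correct equals the stated confidence
correct_good = rng.random(5_000) < conf
# Overconfident: true accuracy is 20 points below the stated confidence
correct_bad = rng.random(5_000) < np.clip(conf - 0.2, 0.0, 1.0)

ece_good = expected_calibration_error(conf, correct_good)
ece_bad = expected_calibration_error(conf, correct_bad)
```

A low ECE suggests confidence can be used as a secondary monitoring signal; a high one means the scores should be recalibrated first.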
Thanks @markussagen! Your feedback added some new ideas to our approach. Really helpful!