Amazon SageMaker Model Monitor
Amazon SageMaker Model Monitor helps you maintain high-quality machine learning (ML) models by automatically detecting inaccurate predictions from models deployed in production and alerting you to them.
The accuracy of ML models can deteriorate over time, a phenomenon known as model drift. Many factors can cause model drift, such as changes in model features. Accuracy can also be affected by concept drift, the difference between the data used to train a model and the data it receives during inference.
Amazon SageMaker Model Monitor helps you maintain high-quality ML models by detecting model and concept drift in real time and sending you alerts so you can take immediate action. It detects drift by monitoring the quality of the model based on independent and dependent variables. Independent variables (also known as features) are the inputs to an ML model, and dependent variables are its outputs. For example, for an ML model that predicts bank loan approval, the independent variables could be the applicant's age, income, and credit history, and the dependent variable would be the actual result of the loan application. SageMaker Model Monitor also continuously monitors model performance characteristics such as accuracy, which measures the number of correct predictions compared to the total number of predictions, so you can take action to address anomalies.
Additionally, SageMaker Model Monitor is integrated with Amazon SageMaker Clarify to help you identify potential bias in your ML models with model bias detection.
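As an illustration of the accuracy metric described above (correct predictions divided by total predictions), here is a minimal Python sketch. The function name and the loan data are hypothetical and are not part of the SageMaker API.

```python
def accuracy(predictions, actuals):
    """Return the fraction of predictions that match the actual outcomes."""
    if len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must have the same length")
    if not predictions:
        raise ValueError("at least one prediction is required")
    correct = sum(1 for p, a in zip(predictions, actuals) if p == a)
    return correct / len(predictions)


# Hypothetical loan decisions: 1 = approved, 0 = declined.
predicted = [1, 0, 1, 1, 0, 1, 0, 0]
actual = [1, 0, 0, 1, 0, 1, 1, 0]

loan_accuracy = accuracy(predicted, actual)
print(f"accuracy = {loan_accuracy:.2f}")
```

A sustained drop in this ratio over time is one signal that a model may be drifting.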
Data collection and monitoring
With Amazon SageMaker Model Monitor, you can select the data you would like to monitor and analyze without writing any code. SageMaker Model Monitor lets you select data from a menu of options, such as prediction output, and captures metadata such as timestamp, model name, and endpoint so you can analyze model predictions based on that metadata. For high-volume real-time predictions, you can specify the sampling rate of data capture as a percentage of overall traffic. The captured data is stored in your own Amazon S3 bucket. You can also encrypt this data, configure fine-grained security, define data retention policies, and implement access control mechanisms for secure access.
Amazon SageMaker Model Monitor offers built-in analysis in the form of statistical rules to detect drift in data and model quality. You can also write custom rules and specify thresholds for each rule. The rules can then be used to analyze model performance. SageMaker Model Monitor runs the rules on the collected data, detects anomalies, and records rule violations.
All metrics emitted by Amazon SageMaker Model Monitor can be collected and viewed in Amazon SageMaker Studio, so you can visually analyze your model performance without writing additional code. In addition to visualizing your metrics, you can run ad hoc analysis in a SageMaker notebook instance to understand your models better.
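The idea of capturing a percentage of traffic together with metadata can be sketched in plain Python. This is only a conceptual illustration, not how SageMaker implements data capture. The function `capture_sample`, the record fields, and the model and endpoint names are all assumptions made for the example.

```python
import random
from datetime import datetime, timezone


def capture_sample(requests, sampling_percentage, model_name, endpoint_name, rng=None):
    """Keep roughly `sampling_percentage` percent of (input, prediction) pairs."""
    if not 0 <= sampling_percentage <= 100:
        raise ValueError("sampling_percentage must be between 0 and 100")
    if rng is None:
        rng = random.Random()
    captured = []
    for features, prediction in requests:
        # rng.random() is uniform on [0, 1), so 100 keeps everything and 0 keeps nothing.
        if rng.random() * 100 < sampling_percentage:
            captured.append({
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "model_name": model_name,
                "endpoint": endpoint_name,
                "input": features,
                "prediction": prediction,
            })
    return captured


# Hypothetical traffic: 10,000 requests, of which about 20% are captured.
traffic = [({"income": 40_000 + i}, i % 2) for i in range(10_000)]
sample = capture_sample(traffic, 20, "loan-model", "loan-endpoint", rng=random.Random(7))
print(f"captured {len(sample)} of {len(traffic)} requests")
```

Sampling keeps storage and analysis costs bounded while still giving a representative view of high-volume traffic.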
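To make the rule-and-threshold idea concrete, the following sketch runs two simple statistical rules over collected values and records any violations. The `Rule` class, the two statistics, and the thresholds are hypothetical stand-ins chosen for illustration. They are not the built-in rules that SageMaker Model Monitor ships with.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, List, Optional, Sequence

Values = Sequence[Optional[float]]


@dataclass
class Rule:
    name: str
    statistic: Callable[[Values], float]
    threshold: float  # a violation is recorded when the statistic exceeds this


def missing_fraction(values: Values) -> float:
    """Fraction of collected values that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def mean_shift_from(baseline_mean: float) -> Callable[[Values], float]:
    """Build a statistic: absolute distance of the observed mean from a baseline mean."""
    def statistic(values: Values) -> float:
        present = [v for v in values if v is not None]
        return abs(mean(present) - baseline_mean)
    return statistic


def run_rules(rules: Sequence[Rule], values: Values) -> List[dict]:
    """Evaluate every rule and return a record for each one that is violated."""
    violations = []
    for rule in rules:
        observed = rule.statistic(values)
        if observed > rule.threshold:
            violations.append(
                {"rule": rule.name, "observed": observed, "threshold": rule.threshold}
            )
    return violations


collected = [70.0, 71.5, None, 69.0, None, 90.5]
rules = [
    Rule("missing_fraction", missing_fraction, threshold=0.10),
    Rule("mean_shift", mean_shift_from(70.0), threshold=10.0),
]
violations = run_rules(rules, collected)
for violation in violations:
    print(violation)
```

Here only the missing-value rule is violated: two of six values are missing, while the observed mean stays within the allowed distance of the baseline.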
Ongoing model prediction
Amazon SageMaker Model Monitor allows you to ingest data from your ML application in order to compute model performance. The data is stored in Amazon S3 and secured through access control, encryption, and data retention policies.
You can monitor your ML models by scheduling monitoring jobs through Amazon SageMaker Model Monitor. You can automatically kick off monitoring jobs to analyze model predictions during a given time period. You can also have multiple schedules on a SageMaker endpoint.
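Conceptually, a scheduled monitoring job analyzes the predictions that fall inside one time period. The sketch below splits a time range into hourly windows and selects the records for each window. The helper names and the record layout are assumptions for illustration only and do not reflect the SageMaker scheduling API.

```python
from datetime import datetime, timedelta


def hourly_windows(start, end):
    """Yield consecutive (window_start, window_end) pairs covering [start, end)."""
    current = start
    while current < end:
        window_end = min(current + timedelta(hours=1), end)
        yield current, window_end
        current = window_end


def records_in_window(records, window_start, window_end):
    """Return the records whose timestamp falls in [window_start, window_end)."""
    return [r for r in records if window_start <= r["timestamp"] < window_end]


# Hypothetical captured predictions.
records = [
    {"timestamp": datetime(2024, 1, 1, 9, 15), "prediction": 1},
    {"timestamp": datetime(2024, 1, 1, 9, 45), "prediction": 0},
    {"timestamp": datetime(2024, 1, 1, 10, 30), "prediction": 1},
]

counts = [
    len(records_in_window(records, window_start, window_end))
    for window_start, window_end in hourly_windows(
        datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 11, 0)
    )
]
print(counts)
```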
Integration with Amazon SageMaker Clarify
Amazon SageMaker Model Monitor is integrated with Amazon SageMaker Clarify to improve visibility into potential bias. Although your initial data or model may not have been biased, changes in the world may cause bias to develop over time in a model that has already been trained. For example, a substantial change in home buyer demographics could cause a home loan application model to become biased if certain populations were not present in the original training data. Integration with SageMaker Clarify enables you to configure alerting systems such as Amazon CloudWatch to notify you if your model begins to develop bias.
Reports and alerts
The reports generated by monitoring jobs can be saved in Amazon S3 for further analysis. Amazon SageMaker Model Monitor emits metrics to Amazon CloudWatch, where you can consume notifications to trigger alarms or corrective actions such as retraining the model or auditing data. The metrics include information such as which rules were violated and when. SageMaker Model Monitor also integrates with other visualization tools, including TensorBoard, Amazon QuickSight, and Tableau.
Outliers or anomalies
Use Amazon SageMaker Model Monitor to detect when predictions are outside the expected range or on the edge of what is expected, such as at a minimum or maximum value. For example, if you expect the temperature to be between 65°F and 75°F, a result of 50°F is out of bounds. This out-of-bounds result is flagged as an anomaly.
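The temperature example can be expressed as a simple range check. This is a minimal sketch under the assumption that the expected range is known in advance. The function name and labels are hypothetical and are not taken from SageMaker.

```python
def classify_prediction(value, minimum, maximum):
    """Label a prediction relative to the expected [minimum, maximum] range."""
    if value < minimum or value > maximum:
        return "out_of_range"
    if value == minimum or value == maximum:
        return "edge"
    return "ok"


# Expected temperature range from the example: 65°F to 75°F.
readings = [68.0, 72.5, 50.0, 75.0, 65.0]
labels = [classify_prediction(r, 65.0, 75.0) for r in readings]
anomalies = [r for r, label in zip(readings, labels) if label == "out_of_range"]

print(labels)
print(anomalies)
```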
Use Amazon SageMaker Model Monitor to detect when predictions become skewed because of real-world conditions such as inaccurate sensor readings caused by aging sensors. Amazon SageMaker Model Monitor detects data skew by comparing real-world data to a baseline dataset such as a training dataset or an evaluation dataset.
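One simple way to compare live data against a baseline is to measure how far the live mean has moved, in units of the baseline's standard deviation. The sketch below uses that statistic as a stand-in. It is an assumption for illustration, not the specific method SageMaker Model Monitor uses, and the threshold of 2.0 and the sensor values are hypothetical.

```python
from statistics import mean, stdev


def standardized_mean_shift(baseline, live):
    """Distance between the live and baseline means, in baseline standard deviations."""
    spread = stdev(baseline)
    if spread == 0:
        raise ValueError("baseline has no variation; the shift is undefined")
    return abs(mean(live) - mean(baseline)) / spread


def is_skewed(baseline, live, threshold=2.0):
    """Flag skew when the live mean has moved more than `threshold` deviations."""
    return standardized_mean_shift(baseline, live) > threshold


# Hypothetical sensor readings: a training baseline and two live batches.
baseline = [20.0, 21.0, 19.0, 20.5, 19.5]
stable_batch = [20.2, 19.8, 20.1, 19.9, 20.0]
drifting_batch = [23.0, 24.0, 22.5, 23.5, 24.5]  # e.g. an aging sensor reading high

print(is_skewed(baseline, stable_batch))
print(is_skewed(baseline, drifting_batch))
```

A production check would typically compare full distributions per feature rather than a single mean, but the structure is the same: compute a distance from the baseline and compare it to a threshold.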
New data is often introduced in the real world, so you want to be able to adjust your model to take the new features into account. For example, an autonomous driving model needs to be updated so that autonomous vehicles can detect new objects on the road. Amazon SageMaker Model Monitor detects new observations so you can keep your models up to date.
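For categorical data, detecting new observations can be as simple as checking live values against the set seen in the baseline. The following sketch assumes that setup. The function name and the object labels are hypothetical examples.

```python
from collections import Counter


def find_new_observations(baseline_values, live_values):
    """Count live values that never appeared in the baseline dataset."""
    known = set(baseline_values)
    return Counter(v for v in live_values if v not in known)


# Hypothetical object labels seen during training and in production.
baseline_objects = ["car", "pedestrian", "bicycle", "car", "pedestrian"]
live_objects = ["car", "scooter", "pedestrian", "scooter", "delivery_robot"]

new_objects = find_new_observations(baseline_objects, live_objects)
print(new_objects)
```

Values that show up repeatedly in this count are candidates for inclusion when the model is next retrained.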