Posted On: Nov 23, 2020

Amazon Managed Streaming for Apache Kafka (Amazon MSK) now offers consumer lag metrics by default for new Amazon MSK clusters, making it easier to track whether your applications are consuming the latest data available in your Apache Kafka cluster. Consumer lag metrics quantify the difference between the latest data written to your Kafka cluster and the data read by your applications. Monitoring consumer lag lets application developers identify, and set alarms on, slow or stuck consumers that are not keeping up with the latest data in an Apache Kafka topic, so that they can take remedial action such as scaling or rebooting those consumers.
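As a rough illustration of what a consumer lag figure represents, the sketch below computes per-partition lag as the latest written offset minus the consumer group's committed offset. It is not how Amazon MSK computes its metrics internally. The function names, the sample offsets, and the choice to treat a partition with no committed offset as starting from 0 are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# A (topic, partition) pair, e.g. ("orders", 0). Hypothetical naming for this sketch.
TopicPartition = Tuple[str, int]


def offset_lag(
    log_end_offsets: Dict[TopicPartition, int],
    committed_offsets: Dict[TopicPartition, int],
) -> Dict[TopicPartition, int]:
    """Return per-partition lag: latest written offset minus committed offset."""
    lag = {}
    for tp, end_offset in log_end_offsets.items():
        # Assumption: a partition with no committed offset is read from offset 0.
        committed = committed_offsets.get(tp, 0)
        # Clamp at zero in case the two snapshots were taken at different times.
        lag[tp] = max(end_offset - committed, 0)
    return lag


def find_lagging(lag: Dict[TopicPartition, int], threshold: int) -> List[TopicPartition]:
    """Return the partitions whose lag exceeds the threshold, in sorted order."""
    return sorted(tp for tp, value in lag.items() if value > threshold)


# Illustrative offsets, not real data.
end = {("orders", 0): 1500, ("orders", 1): 900}
committed = {("orders", 0): 1450, ("orders", 1): 100}

lag = offset_lag(end, committed)
slow = find_lagging(lag, threshold=500)
```

Here partition 1 of the hypothetical `orders` topic is 800 records behind, so it is the one a lag alarm would flag as a slow or stuck consumer.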

Consumer lag metrics can be monitored using your preferred Amazon MSK monitoring solution: Amazon CloudWatch or Open Monitoring for Prometheus. Additionally, Amazon MSK now includes ten more broker-level and two topic-level metrics in its default monitoring tier for all clusters, which lets you monitor key cluster metrics such as bytes-in and bytes-out for free.
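For readers who alarm on consumer lag through Amazon CloudWatch, the following sketch builds the parameters for a lag alarm as a plain dictionary. The helper name `build_lag_alarm`, the alarm naming scheme, the period, and the evaluation window are hypothetical. The metric name `MaxOffsetLag`, the `AWS/Kafka` namespace, and the dimension names are assumptions based on the Amazon MSK metrics documentation and should be verified against your cluster before use.

```python
from typing import Any, Dict


def build_lag_alarm(
    cluster_name: str, consumer_group: str, topic: str, max_lag: int
) -> Dict[str, Any]:
    """Build keyword arguments for a CloudWatch alarm on consumer lag."""
    return {
        # Hypothetical naming scheme for the alarm.
        "AlarmName": f"{cluster_name}-{consumer_group}-{topic}-lag",
        # Assumed namespace, metric, and dimension names; verify for your cluster.
        "Namespace": "AWS/Kafka",
        "MetricName": "MaxOffsetLag",
        "Dimensions": [
            {"Name": "Cluster Name", "Value": cluster_name},
            {"Name": "Consumer Group", "Value": consumer_group},
            {"Name": "Topic", "Value": topic},
        ],
        "Statistic": "Maximum",
        # Illustrative window: lag above the threshold for 5 consecutive minutes.
        "Period": 60,
        "EvaluationPeriods": 5,
        "Threshold": float(max_lag),
        "ComparisonOperator": "GreaterThanThreshold",
    }


# Example values are placeholders, not real resource names.
params = build_lag_alarm("demo-cluster", "billing-app", "orders", max_lag=1000)
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`. Building it separately keeps the sketch runnable without an Amazon Web Services account.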

Consumer lag metrics are available for all new Amazon MSK clusters in all Amazon Web Services regions where the service is currently available, including the Amazon Web Services China (Beijing) region, operated by Sinnet, and the Amazon Web Services China (Ningxia) region, operated by NWCD. For existing clusters, these metrics will be enabled the next time the cluster is updated with the latest operating system patch.