Posted On: Oct 19, 2023

Amazon Kinesis Data Firehose now integrates with Amazon MSK to offer a fully managed solution that simplifies the processing and delivery of streaming data from Amazon MSK Apache Kafka clusters into Amazon S3 data lakes. With just a few clicks, Amazon MSK customers can continuously load data from their chosen Apache Kafka clusters into an Amazon S3 bucket, eliminating the need to develop or run their own connector applications.

Amazon MSK is a fully managed service for Apache Kafka that makes it easy for you to build and run applications that use Apache Kafka as a data store. Kinesis Data Firehose is a fully managed service that continuously captures, transforms, and delivers streaming data to data lakes, data stores, and analytics services. Kinesis Data Firehose automatically scales to match the throughput of your Amazon MSK data and requires no ongoing administration. Kinesis Data Firehose also offers easy-to-use features such as JSON to Parquet/ORC format conversion and batch aggregation to optimize the S3 file size. These features simplify analytics and processing workflows on the delivered data.

To get started, you need an Amazon Web Services account. Once you have an account, you can create a delivery stream in the Amazon Kinesis Console. To learn more, see the Amazon Kinesis Data Firehose developer guide. Amazon MSK to Amazon S3 delivery using Amazon Kinesis Data Firehose is available in the Amazon Web Services China (Beijing) Region, operated by Sinnet, and the Amazon Web Services China (Ningxia) Region, operated by NWCD, as well as in all commercial Amazon Web Services Regions where Kinesis Data Firehose is available.
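
For customers who prefer to script the setup rather than use the console, the following is a minimal sketch of how such a delivery stream might be created with the Python SDK (boto3). It assumes the Firehose `create_delivery_stream` call with the `MSKAsSource` stream type. The helper names, the default buffering values, and every ARN passed in are hypothetical placeholders, and the field names should be checked against the developer guide before use.

```python
def build_msk_to_s3_request(stream_name, msk_cluster_arn, topic_name,
                            source_role_arn, bucket_arn, delivery_role_arn,
                            buffer_mb=64, buffer_seconds=300):
    """Build the request for a delivery stream that reads one MSK topic
    and writes to S3. All argument values are supplied by the caller;
    the buffering defaults are illustrative assumptions, not recommendations."""
    return {
        "DeliveryStreamName": stream_name,
        "DeliveryStreamType": "MSKAsSource",
        "MSKSourceConfiguration": {
            "MSKClusterARN": msk_cluster_arn,
            "TopicName": topic_name,
            "AuthenticationConfiguration": {
                # Role that Firehose assumes to read from the cluster.
                "RoleARN": source_role_arn,
                # Assumes a cluster reachable over private connectivity.
                "Connectivity": "PRIVATE",
            },
        },
        "ExtendedS3DestinationConfiguration": {
            # Role that Firehose assumes to write to the bucket.
            "RoleARN": delivery_role_arn,
            "BucketARN": bucket_arn,
            # Batch aggregation: deliver when either threshold is reached.
            "BufferingHints": {
                "SizeInMBs": buffer_mb,
                "IntervalInSeconds": buffer_seconds,
            },
        },
    }


def create_stream(request):
    """Send the request to Firehose. Requires boto3 and valid credentials."""
    import boto3  # imported here so the request can be built without the SDK

    firehose = boto3.client("firehose")
    return firehose.create_delivery_stream(**request)
```

Keeping the request builder separate from the API call lets the configuration be reviewed or unit-tested without credentials; `create_stream(build_msk_to_s3_request(...))` would then submit it.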