Amazon Managed Streaming for Apache Kafka

Amazon MSK FAQs

General

Amazon MSK is a new Amazon Web Services streaming data service that manages Apache Kafka infrastructure and operations, making it easy for developers and DevOps managers to run Apache Kafka applications and Kafka Connect connectors on Amazon Web Services, without the need to become experts in operating Apache Kafka clusters. Amazon MSK is an ideal place to run existing or new Apache Kafka applications in Amazon Web Services. Amazon MSK operates and maintains Apache Kafka clusters, provides enterprise-grade security features out of the box, and has built-in Amazon Web Services integrations that accelerate development of streaming data applications.

To get started, you can migrate existing Apache Kafka workloads and Kafka Connect connectors into Amazon MSK, or with a few clicks, you can build new ones from scratch in minutes. There are no data transfer charges for in-cluster traffic, and no commitments or upfront payments required. You only pay for the resources that you use.

Apache Kafka is an open-source, high-performance, fault-tolerant, and scalable platform for building real-time streaming data pipelines and applications. Apache Kafka is a streaming data store that decouples the applications that produce streaming data (producers) into its data store from the applications that consume streaming data (consumers) from its data store. Organizations use Apache Kafka as a data source for applications that continuously analyze and react to streaming data.

Streaming data is a continuous stream of small records or events (a record or event is typically a few kilobytes) generated by thousands of machines, devices, websites, and applications. Streaming data includes a wide variety of data such as log files generated by customers using your mobile or web applications, ecommerce purchases, in-game player activity, information from social networks, financial trading floors, geospatial services, and telemetry from connected devices or instrumentation in data centers. Streaming data services like Amazon MSK and Amazon Kinesis Data Streams make it easy for you to continuously collect, process, and deliver streaming data.

Apache Kafka has three key capabilities:

  • Apache Kafka stores streaming data in a fault-tolerant way as a continuous series of records and preserves the order in which the records were produced.

  • Apache Kafka acts as a buffer between data producers and data consumers. Apache Kafka allows many data producers (e.g. websites, IoT devices, Amazon EC2 instances) to continuously publish streaming data and categorize this data using Apache Kafka topics. Multiple data consumers (e.g. machine learning applications, Lambda functions) read from these topics at their own rate, similar to a message queue or enterprise messaging system.

  • Data consumers process data from Apache Kafka topics on a first-in-first-out basis, preserving the order data was produced.

Apache Kafka stores records in topics. Data producers write records to topics and consumers read records from topics. Each record in Apache Kafka consists of a key, a value, and a timestamp. Apache Kafka partitions topics and replicates these partitions across multiple nodes called brokers. Apache Kafka runs as a cluster on one or more brokers, and brokers can be located in multiple Amazon Web Services availability zones to create a highly available cluster. Apache Kafka relies on Apache ZooKeeper to coordinate cluster tasks and can maintain state for resources interacting with an Apache Kafka cluster.
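
For example, the console clients that ship with Apache Kafka demonstrate this producer/consumer decoupling. A minimal sketch, where ConnectionString and TopicName are placeholders for a broker endpoint and an existing topic:

# Publish records from stdin to a topic (producer side)
bin/kafka-console-producer.sh --bootstrap-server ConnectionString:9092 --topic TopicName

# Read the topic from the beginning (consumer side)
bin/kafka-console-consumer.sh --bootstrap-server ConnectionString:9092 --topic TopicName --from-beginning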

Apache Kafka is used to support real-time applications that transform, deliver, and react to streaming data, and for building real-time streaming data pipelines that reliably get data between multiple systems or applications.

Amazon MSK makes it easy to get started and run open-source versions of Apache Kafka in Amazon Web Services with high availability and security while providing integration with Amazon Web Services services without the operational overhead of running an Apache Kafka cluster. Amazon MSK allows you to use and configure open-source versions of Apache Kafka while the service manages the setup, provisioning, Amazon Web Services integrations, and ongoing maintenance of Apache Kafka clusters.

With a few clicks in the console, you can provision an Amazon MSK cluster. From there, Amazon MSK replaces unhealthy brokers, automatically replicates data for high availability, manages Apache ZooKeeper nodes, automatically deploys hardware patches as needed, manages the integrations with Amazon Web Services services, makes important metrics visible through the console, and supports Apache Kafka version upgrades so you can take advantage of improvements to the open-source version of Apache Kafka.

For supported Kafka versions, see the Amazon MSK documentation.

Yes, all data plane and admin APIs are natively supported by Amazon MSK.

Yes.

Yes, Apache Kafka clients can use the Amazon Glue Schema Registry, a serverless feature of Amazon Glue, at no additional charge. Visit the Schema Registry user documentation to get started and to learn more.

Amazon MSK supports Graviton3-based M7g instances, from “large” through “16xlarge” sizes, to run all Kafka workloads. Graviton instances come with the same availability and durability benefits of MSK, at up to 24% lower cost compared to corresponding M5 instances. Graviton instances also provide up to 29% higher throughput per instance compared to MSK’s M5 instances, which enables customers to run MSK clusters with fewer brokers or smaller instance sizes.

MSK Connect

Kafka Connect, an open source component of Apache Kafka, is a framework for connecting Apache Kafka with external systems such as databases, key-value stores, search indexes, and file systems.

A connector integrates external systems including Amazon Web Services services with Apache Kafka by continuously copying streaming data from a data source into an Apache Kafka topic, or continuously copying data from an Apache Kafka topic into a data sink. A connector may perform lightweight logic such as transformation, format conversion, or filtering data before the data is delivered to a destination. Source connectors pull data from a data source and push this data into Apache Kafka, while sink connectors pull data from Apache Kafka and push this data into a data sink.
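
For a feel of what a connector configuration looks like, here is a minimal sketch of a sink connector using the FileStreamSinkConnector that ships with Apache Kafka; the name, topic, and file path are hypothetical:

# A minimal sink connector: copies records from a Kafka topic into a local file
name=my-file-sink
connector.class=org.apache.kafka.connect.file.FileStreamSinkConnector
tasks.max=1
topics=TopicName
file=/tmp/sink-output.txt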

MSK Connect runs any connector that implements the Kafka Connect interfaces. There are many sources of connectors, including our partners, open-source projects like Debezium, and commercial connector vendors like Confluent and Lenses.io.

MSK Connect works with Amazon MSK clusters as well as other Apache Kafka-compatible clusters, including self-managed clusters running on Amazon EC2 or in non-Amazon Web Services environments, subject to Amazon VPC connectivity.

MSK Replicator

Amazon MSK Replicator is a feature of Amazon MSK that helps customers reliably replicate data across MSK clusters in different Amazon Web Services Regions (cross-Region replication) or within the same Amazon Web Services Region (same-Region replication), without writing code or managing infrastructure. You can use cross-Region replication to build highly available and fault-tolerant multi-Region streaming applications for increased resiliency. You can also use cross-Region replication to provide lower latency access to consumers in different geographic regions. You can use same-Region replication to distribute data from one cluster to many clusters for sharing data with your partners and teams. You can also use same-Region replication to aggregate data from multiple clusters into one for analytics.

To set up replication between a pair of source and target MSK clusters, you need to create a Replicator in the destination Region. To create a Replicator, you specify details that include the Amazon Resource Name (ARN) of the source and target MSK clusters and an IAM role that MSK Replicator can use to access the clusters. You will need to create the target MSK cluster if it does not already exist.
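
For illustration, here is a minimal sketch of creating a Replicator with the Amazon CLI; all ARNs, subnets, and security groups are hypothetical placeholders, and the request shape follows the CreateReplicator API as we understand it (check the current API reference before use):

# Run in the destination Region; replicate all topics and consumer groups
aws kafka create-replicator \
  --replicator-name "my-replicator" \
  --service-execution-role-arn "arn:aws:iam::123456789012:role/msk-replicator-role" \
  --kafka-clusters '[{"AmazonMskCluster":{"MskClusterArn":"arn:aws:kafka:us-east-1:123456789012:cluster/source/abc"},"VpcConfig":{"SubnetIds":["subnet-1a","subnet-1b"],"SecurityGroupIds":["sg-1"]}},{"AmazonMskCluster":{"MskClusterArn":"arn:aws:kafka:us-west-2:123456789012:cluster/target/def"},"VpcConfig":{"SubnetIds":["subnet-2a","subnet-2b"],"SecurityGroupIds":["sg-2"]}}]' \
  --replication-info-list '[{"SourceKafkaClusterArn":"arn:aws:kafka:us-east-1:123456789012:cluster/source/abc","TargetKafkaClusterArn":"arn:aws:kafka:us-west-2:123456789012:cluster/target/def","TargetCompressionType":"NONE","TopicReplication":{"TopicsToReplicate":[".*"]},"ConsumerGroupReplication":{"ConsumerGroupsToReplicate":[".*"]}}]'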

MSK Replicator supports replication across MSK clusters only; both Provisioned and Serverless MSK cluster types are supported. You can also use MSK Replicator to move from Provisioned to Serverless clusters or the other way around. Other Kafka clusters are not supported.

Yes, you can specify which topics you want to replicate using allow and deny lists while creating the Replicator. 

Yes, MSK Replicator automatically replicates the necessary Kafka metadata, such as topic configuration, ACLs, and consumer group offsets so that consuming applications can resume processing seamlessly after failover. You can choose to turn off one or more of these settings if you only want to replicate the data. You can also specify which consumer groups you want to replicate using allow or deny lists while creating the Replicator.

No, MSK Replicator automatically deploys, provisions, and scales the underlying replication infrastructure to support changes in your ingress throughput.

No, MSK Replicator only supports replication across MSK clusters in the same Amazon Web Services account.

You can use CloudWatch in the destination Region to view metrics for ReplicationLatency, MessageLag, and ReplicatorThroughput at a topic and aggregate level for each Replicator at no additional charge. Metrics are visible under ReplicatorName in the Amazon Web Services/Kafka namespace. You can also check the ReplicatorFailure, AuthError, and ThrottleTime metrics to see if your Replicator is running into any issues.
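
For example, a sketch of pulling ReplicationLatency for a Replicator with the Amazon CLI; the replicator name, dimension name, and time window are placeholders based on the metric documentation as we understand it:

# The literal CloudWatch namespace string is AWS/Kafka
aws cloudwatch get-metric-statistics \
  --namespace "AWS/Kafka" \
  --metric-name ReplicationLatency \
  --dimensions Name=ReplicatorName,Value=my-replicator \
  --statistics Average \
  --period 300 \
  --start-time 2024-01-01T00:00:00Z \
  --end-time 2024-01-01T01:00:00Z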

You can use MSK Replicator to set up active-active or active-passive cluster topologies to increase resiliency of your Kafka application across Regions. In an active-active setup, both MSK clusters are actively serving reads and writes. Comparatively, in an active-passive setup, only one MSK cluster at a time is actively serving streaming data while the other cluster is on standby.

Yes. By creating a different Replicator for each source and target cluster pair, you can replicate data from one cluster to multiple clusters or replicate data from many clusters to one.

MSK Replicator uses IAM access control to connect to your source and target clusters. You need to turn on IAM access control for your source and target MSK clusters to create a Replicator. You can continue to use other authentication methods, including SASL/SCRAM and mTLS, at the same time for your clients, since Amazon MSK supports multiple authentication methods simultaneously.

MSK Replicator replicates data asynchronously. Replication latency varies based on many factors, including the network distance between the Regions of your MSK clusters, your source and target clusters’ throughput capacity, and the number of partitions on your source and target clusters.

Yes, MSK Replicator allows you to keep topic names identical on your source and destination clusters. For multi-Region availability, you can use MSK Replicator to create an active-active setup with identical topic names (Keep the same topic names in the console) or with prefixed topic names (Add prefix to topic names in the console). With identical topic name replication, MSK Replicator replicates your topics under the same names as the corresponding source topics, so both MSK clusters have identical topic names and can actively serve reads and writes, and you avoid reconfiguring your clients. In this setup, you pay additional data processing and data transfer charges for each Replicator, because each Replicator must process twice the usual amount of data: once for replication and again to prevent infinite replication loops. You can track the total amount of data processed by each Replicator using the ReplicatorBytesInPerSec metric, which includes both the data replicated to the target cluster and the data filtered out by MSK Replicator to prevent records being copied back to the topic they originated from. See the Monitor replication documentation for more information on this metric.

Yes. By default, when you create a new Replicator, it starts replicating data from the tip of the stream (latest offset) on the source cluster. Alternatively, if you want to replicate existing data, you can configure a new Replicator to start replicating data from the earliest offset in the source cluster topic partitions.

Since MSK Replicator acts as a consumer for your source cluster, it is possible that replication causes other consumers to be throttled on your source cluster. This depends on how much read capacity you have on your source cluster and the throughput of the data you are replicating. We recommend that you provision identical capacity for your source and target clusters and account for the replication throughput when calculating how much capacity you need. You can also set Kafka quotas for the Replicator on your source and target clusters to control how much capacity the Replicator can use.

Yes, you can specify your choice of compression codec while creating the Replicator amongst None, GZIP, Snappy, LZ4, and ZSTD.

Data production and consumption

Yes, Amazon MSK supports the native Apache Kafka producer and consumer APIs. Your application code does not need to change when clients begin to work with clusters within Amazon MSK.

Yes, you can use any component that leverages the Apache Kafka producer and consumer APIs, and the Apache Kafka Admin Client. Tools that upload .jar files into Apache Kafka clusters are currently not compatible with Amazon MSK, including Confluent Control Center, Confluent Auto Data Balancer, and Uber uReplicator.

Standard brokers

Standard brokers for MSK Provisioned offer high flexibility in configuring your cluster’s performance. You can choose from a wide range of cluster configurations to achieve the availability, durability, throughput, and latency characteristics required for your applications. You can also provision storage capacity and increase it as needed. Amazon MSK handles the hardware maintenance of Standard brokers and attached storage resources, automatically repairing hardware issues that arise.

Express brokers

Express brokers for MSK Provisioned make Apache Kafka simpler to manage, more cost-effective to run at scale, and more elastic with the low latency you expect. Brokers include pay-as-you-go storage that scales automatically and requires no sizing, provisioning, or proactive monitoring. Depending on the instance size selected, each broker node can provide up to 3x more throughput per broker, scale up to 20x faster, and recover 90% quicker compared to standard Apache Kafka brokers. Express brokers come pre-configured with Amazon MSK’s best practice defaults and enforce client throughput quotas to minimize resource contention between clients and Kafka’s background operations.

  • No storage management: Express brokers eliminate the need to provision or manage any storage resources. You get elastic, virtually unlimited, pay-as-you-go, and fully managed storage. For high throughput use cases, you do not need to reason about the interactions between compute instances and storage volumes and the associated throughput bottlenecks. These capabilities simplify cluster management and eliminate storage management operational overhead.
  • Faster scaling: Express brokers allow you to scale your cluster and move partitions faster than on Standard brokers. This capability is crucial when you need to scale out your cluster to handle upcoming load spikes or scale in your cluster to reduce cost. See the sections on expanding your cluster, removing brokers, reassigning partitions, and setting up LinkedIn’s Cruise Control for rebalancing for more details on scaling your cluster.
  • Higher throughput: Express brokers offer up to 3x more throughput per broker than Standard brokers. For example, you can safely write data at up to 500 MBps with each m7g.16xlarge sized Express broker compared to 153.8 MBps on the equivalent Standard broker (both numbers assume sufficient bandwidth allocation towards background operations, such as replication and rebalancing).
  • Configured for high resilience: Express brokers automatically offer various best practices pre-configured to improve your cluster’s resilience. These include guardrails on critical Apache Kafka configurations, throughput quotas, and capacity reservation for background operations and unplanned repairs. These capabilities make it safer and easier to run large scale Apache Kafka applications. See the sections on Express broker configurations and Amazon MSK Express broker quota for more details.
  • No maintenance windows: There are no maintenance windows for Express brokers. Amazon MSK automatically updates your cluster hardware on an ongoing basis. See Amazon MSK Express brokers for more details.

Express brokers provide more throughput per broker, so you can create clusters with fewer brokers for the same workload. Additionally, once your cluster is up and running, you can monitor the use of your cluster resources and right-size capacity faster than with Standard brokers. You can, therefore, provision resources that are fit for the capacity you need and scale faster to meet any changes in demand.

Clusters with Express brokers work with Apache Kafka APIs and tools that use the standard Apache Kafka client.

Express brokers come preconfigured with Amazon MSK best practice defaults that optimize for availability and durability. You may customize some of these configurations to further fine-tune the performance of your clusters. Read more about Express broker configurations in the Amazon MSK Developer Guide.

Just as for Standard brokers, Amazon MSK integrates with Amazon Key Management Service (Amazon KMS) to offer transparent server-side encryption for the storage in Express brokers. When you create an MSK cluster with Express brokers, you can specify the Amazon KMS key that you want Amazon MSK to use to encrypt your data at rest. If you don't specify a KMS key, Amazon MSK creates an Amazon Web Services managed key for you and uses it on your behalf. Amazon MSK also uses TLS to encrypt data in transit for Express brokers, as it does for Standard brokers.

Most MSK Provisioned features and capabilities that work on Standard brokers also work with clusters that use Express brokers. Some differences include storage management, instance type availability, and supported versions. The table comparing Standard and Express brokers under MSK Provisioned highlights some key similarities and differences.

Yes, you can migrate the data in your Kafka cluster to a cluster composed of Express brokers using MirrorMaker 2 or Amazon MSK Replicator, which copies both the data and the metadata of your cluster to a new cluster. You can learn more in the Migrate to an Amazon MSK Cluster and MSK Replicator sections of the Amazon MSK Developer Guide.

Express brokers increase your price performance, provide higher resiliency, and lower operational overhead, making them the ideal choice for all Apache Kafka workloads on MSK Provisioned. However, you can choose Standard brokers if you want to control more of your brokers’ configurations and settings. With Standard brokers, you can customize a wider set of Kafka configurations, including replication factor, size of log files, and leader election policies, which gives you more flexibility over your cluster settings.

Migrating to Amazon MSK

Yes, you can use third-party tools or open-source tools like MirrorMaker, which ships with Apache Kafka, to replicate data from existing clusters into an Amazon MSK cluster.

Version upgrades

Yes, Amazon MSK supports fully managed in-place Apache Kafka version upgrades. To learn more about upgrading your Apache Kafka version and high availability best practices, see the version upgrades documentation.

Clusters

You can create your first cluster with a few clicks in the Amazon Web Services Management Console or using the Amazon SDKs. First, in the Amazon MSK console, select an Amazon Web Services Region to create an Amazon MSK cluster in. Choose a name for your cluster, the VPC you want to run the cluster in, a data replication strategy for the cluster, and the subnets for each AZ. Next, pick a broker instance type and quantity of brokers per AZ, and choose Create.
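
The same can be scripted with the Amazon CLI. A minimal sketch, with hypothetical name, version, subnet, and security group values:

# One broker per subnet/AZ; number-of-broker-nodes must be a multiple of the subnet count
aws kafka create-cluster \
  --cluster-name "my-cluster" \
  --kafka-version "3.6.0" \
  --number-of-broker-nodes 3 \
  --broker-node-group-info '{"InstanceType":"kafka.m7g.large","ClientSubnets":["subnet-1a","subnet-1b","subnet-1c"],"SecurityGroups":["sg-1"]}'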

Each cluster contains broker instances, provisioned storage, and Apache ZooKeeper nodes.

For provisioned clusters, you can choose Amazon EC2 T3.small instances or instances within the Amazon EC2 M7g and M5 instance families. For serverless clusters, brokers are completely abstracted.

No, not at this time.

No, each broker you provision includes boot volume storage managed by the Amazon MSK service.

Some resources, like elastic network interfaces (ENIs), will show up in your Amazon EC2 account. Other Amazon MSK resources will not show up in your EC2 account as these are managed by the Amazon MSK service.

You need to provision broker instances and broker storage with every cluster you create. You do not provision Apache ZooKeeper nodes as these resources are included at no additional charge with each cluster you create.

Unless otherwise specified, Amazon MSK uses the same defaults specified by the open-source version of Apache Kafka. The default settings are documented here.

No, Amazon MSK enforces the best practice of balancing broker quantities across AZs within a cluster.

Amazon MSK uses Apache Kafka’s leader-follower replication to replicate data between brokers. Amazon MSK makes it easy to deploy clusters with multi-AZ replication and gives you the option to use a custom replication strategy by topic. By default, with each of the replication options, leader and follower brokers will be deployed and isolated using the replication strategy specified. For example, if you select a three-AZ broker replication strategy with one broker per AZ, Amazon MSK will create a cluster of three brokers (one broker in each of three AZs in a Region), and by default (unless you choose to override the topic replication factor) the topic replication factor will also be three.

Yes, Amazon MSK allows you to create custom configurations and apply them to new and existing clusters. For more information on custom configurations, see the configuration documentation.
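
For illustration, a sketch of creating a custom configuration with the Amazon CLI; the property values in configuration.txt are illustrative:

# configuration.txt holds standard Apache Kafka server properties, for example:
#   auto.create.topics.enable=true
#   log.retention.hours=168
aws kafka create-configuration \
  --name "my-config" \
  --kafka-versions "3.6.0" \
  --server-properties fileb://configuration.txt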

The configuration properties that you can customize are documented here.

Amazon MSK uses Apache Kafka’s default configuration unless otherwise specified here.

Topics

Once your Apache Kafka cluster has been created, you can create topics using the Apache Kafka APIs. All topic and partition level actions and configurations are performed using Apache Kafka APIs. The following command is an example of creating a topic using Apache Kafka APIs:

bin/kafka-topics.sh --create --bootstrap-server ConnectionString:9092 --replication-factor 3 --partitions 1 --topic TopicName

Networking

Yes, Amazon MSK always runs within an Amazon VPC managed by the Amazon MSK service. Amazon MSK resources are made available to the Amazon VPC, subnets, and security group you select when the cluster is set up. IP addresses from your VPC are attached to your Amazon MSK resources through elastic network interfaces (ENIs), and all network traffic stays within the Amazon network and is not accessible to the internet by default.

The brokers in your cluster will be made accessible to clients in your VPC through ENIs appearing in your account. The Security Groups on the ENIs will dictate the source and type of ingress and egress traffic allowed on your brokers.

Yes, Amazon MSK offers an option to securely connect to the brokers of Amazon MSK clusters running Apache Kafka 2.6.0 or later versions over the internet. By enabling public access, authorized clients external to a private Amazon Virtual Private Cloud (VPC) can stream encrypted data in and out of specific Amazon MSK clusters. You can enable public access for MSK clusters after a cluster has been created at no additional cost, but standard Amazon data transfer costs for cluster ingress and egress apply. To learn more about turning on public access, see the public access documentation.

By default, the only way data can be produced and consumed from an Amazon MSK cluster is over a private connection between your clients in your VPC and the Amazon MSK cluster. However, if you turn on public access for your Amazon MSK cluster and connect to your MSK cluster using the public bootstrap-brokers string, the connection, though authenticated, authorized and encrypted, is no longer considered private. We recommend that you configure the cluster's security groups to have inbound TCP rules that allow public access from your trusted IP address and make these rules as restrictive as possible if you turn on public access.
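
For example, a sketch of such a restrictive inbound rule with the Amazon CLI; the security group ID and client IP are placeholders, and port 9198 is assumed to be the public IAM-auth listener per the Amazon MSK port documentation:

# Allow one trusted address to reach the brokers' public IAM-auth listener
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 9198 \
  --cidr 203.0.113.10/32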

Connecting to the VPC

The easiest way is to turn on public connectivity over the internet to the brokers of MSK clusters running Apache Kafka 2.6.0 or later versions. For security reasons, you can't turn on public access while creating an MSK cluster. However, you can update an existing cluster to make it publicly accessible. You can also create a new cluster and then update it to make it publicly accessible. To learn more about turning on public access, see the public access documentation.

You can connect to your MSK cluster from any VPC or Amazon Web Services account different from your MSK cluster’s by turning on multi-VPC private connectivity for MSK clusters running Apache Kafka versions 2.7.1 or later. You can only turn on private connectivity after cluster creation for any of the supported authentication schemes (IAM authentication, SASL/SCRAM, and mTLS authentication). Amazon MSK uses Amazon PrivateLink technology to enable private connectivity, and you should configure your clients to connect to the cluster using Amazon PrivateLink endpoints. To learn more about setting up private connectivity, see the Access from within Amazon Web Services documentation.
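
After private connectivity is turned on, a sketch of retrieving the PrivateLink bootstrap broker string with the Amazon CLI; the cluster ARN is a placeholder, and the exact response field depends on the authentication scheme you enabled:

aws kafka get-bootstrap-brokers \
  --cluster-arn "arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/abc"
# Use the VpcConnectivity variant of the bootstrap string from the response,
# for example BootstrapBrokerStringVpcConnectivitySaslIam for IAM clients.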

Encryption

Yes, Amazon MSK uses Amazon EBS server-side encryption and Amazon KMS keys to encrypt storage volumes.

Yes, by default new clusters have encryption in-transit enabled via TLS for inter-broker communication. You can opt-out of using encryption in-transit when a cluster is created.

Yes, by default in-transit encryption is set to TLS only for clusters created from the CLI or Amazon Web Services Console. Additional configuration is required for clients to communicate with clusters using TLS encryption. You can change the default encryption setting by selecting the TLS/plaintext or plaintext settings. Read More: MSK Encryption

Yes, Amazon MSK clusters running Apache Kafka version 2.5.1 or greater support TLS in-transit encryption between Kafka brokers and ZooKeeper nodes.

You can change client-to-broker encryption settings on your clusters from the console or through the update-security API. Please note that broker-to-broker encryption settings cannot be changed on existing clusters.
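
A sketch of changing client-to-broker encryption through the update-security API with the Amazon CLI; the cluster ARN and current-version value (obtained from describe-cluster) are placeholders:

# ClientBroker accepts TLS, TLS_PLAINTEXT, or PLAINTEXT
aws kafka update-security \
  --cluster-arn "arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/abc" \
  --current-version "K3AEGXETSR30VB" \
  --encryption-info '{"EncryptionInTransit":{"ClientBroker":"TLS"}}'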

Access Management

Amazon MSK offers three options for controlling authentication (AuthN) and authorization (AuthZ): 1) IAM Access Control for both AuthN and AuthZ (recommended), 2) TLS certificate authentication for AuthN and access control lists for AuthZ, and 3) SASL/SCRAM for AuthN and access control lists for AuthZ. Amazon MSK recommends using IAM Access Control: it’s the easiest to use and, because it defaults to least-privilege access, the most secure option.
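
For reference, a client using IAM Access Control needs the aws-msk-iam-auth library on its classpath and client properties along these lines (a minimal sketch based on that library’s documented settings):

# client.properties for IAM Access Control
security.protocol=SASL_SSL
sasl.mechanism=AWS_MSK_IAM
sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler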

If you are using IAM Access Control, Amazon MSK uses the policies you write and its own authorizer to authorize actions. If you are using TLS certificate authentication or SASL/SCRAM, Apache Kafka uses access control lists (ACLs) for authorization. To enable ACLs you must enable client authentication using either TLS certificates or SASL/SCRAM.

If you are using IAM Access Control, Amazon MSK will authenticate and authorize for you without any additional setup. If you are using TLS authentication, you can use the Dname of a client’s TLS certificate as the principal of the ACL to authorize client requests. If you are using SASL/SCRAM, you can use the username as the principal of the ACL to authorize client requests.
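
For example, a sketch of granting a TLS-authenticated client read access to a topic with Apache Kafka’s ACL tool; the certificate Dname and topic are placeholders, and client.properties holds your client’s TLS settings:

# Allow the client identified by its certificate Dname to read TopicName
bin/kafka-acls.sh --bootstrap-server ConnectionString:9094 \
  --command-config client.properties \
  --add --allow-principal "User:CN=myclient.example.com" \
  --operation Read --topic TopicName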

You can control service API actions using Amazon IAM.

No, however a feature that would allow you to update your authentication settings is coming soon.

No, IAM Access Control is only available for Amazon MSK clusters.

You can enable or disable authentication modes for your clusters from the console or through the update-security API. When using the API, the authentication modes that are explicitly declared will be modified accordingly, while those that are omitted will be maintained as is. For example, if your cluster uses mTLS for authentication and you enable IAM Access Control by calling the update-security API, both mTLS and IAM Access Control will be enabled on your cluster.

Yes, you can add multiple authentication modes to your cluster, both during creation and updates. The brokers within the cluster have dedicated ports for each authentication mode, and your clients that connect to Kafka through these ports must have the corresponding authentication mode enabled.

Yes, you can disable an authentication mode. To ensure that your clients do not lose connectivity with the brokers, do not disable any existing authentication modes until all the clients have been updated to use other available authentication modes.

Yes, you can track the number of open connections by authentication mode using the ClientConnectionCount metric published to Amazon CloudWatch metrics.

You can attach a cluster policy to your Amazon MSK cluster to give your cross-account Kafka clients permissions to set up private connectivity to your Amazon MSK cluster. When using IAM client authentication, you can also use the cluster policy to granularly define the Kafka data plane permissions for the connecting client. To learn more about cluster policies, see the cluster policy documentation.
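
A sketch of attaching a cluster policy with the Amazon CLI; the ARNs and action list in policy.json are illustrative, so check the cluster policy documentation for the supported actions:

# policy.json grants a cross-account principal permission to connect privately, e.g.:
#   {"Version":"2012-10-17","Statement":[{"Effect":"Allow",
#     "Principal":{"AWS":"arn:aws:iam::210987654321:root"},
#     "Action":["kafka:CreateVpcConnection","kafka:GetBootstrapBrokers","kafka:DescribeClusterV2"],
#     "Resource":"arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/*"}]}
aws kafka put-cluster-policy \
  --cluster-arn "arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/abc" \
  --policy file://policy.json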

Monitoring, metrics, logging, tagging

You can monitor the performance of your clusters using the Amazon MSK console, Amazon CloudWatch console, or you can access JMX and host metrics using Open Monitoring with Prometheus, an open source monitoring solution.

The cost of monitoring your cluster using Amazon CloudWatch is dependent on the monitoring level and the size of your Apache Kafka cluster. Amazon CloudWatch charges per metric per month and includes a free tier; see Amazon CloudWatch pricing for more information. For details on the number of metrics exposed for each monitoring level, see Amazon MSK monitoring documentation.

Tools that are designed to read from Prometheus exporters are compatible with Open Monitoring, such as Datadog, Lenses, New Relic, Sumo Logic, or a Prometheus server. For details on Open Monitoring, see the Amazon MSK Open Monitoring documentation.

You can use any client-side monitoring supported by the Apache Kafka version you are using.

Yes, you can tag Amazon MSK clusters from the Amazon CLI or Console.

Topic-level consumer lag metrics are available as part of the default set of metrics that Amazon MSK publishes to Amazon CloudWatch for all clusters. No additional setup is required to get these metrics. To get partition-level metrics (partition dimension), you can enable enhanced monitoring (PER_TOPIC_PER_PARTITION) on your cluster. Alternatively, you can enable Open Monitoring on your cluster and use a Prometheus server to capture partition-level metrics from the brokers in your cluster. Consumer lag metrics are available at port 11001, just like other Kafka metrics.
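
A sketch of enabling partition-level enhanced monitoring with the Amazon CLI; the cluster ARN and current-version value are placeholders:

aws kafka update-monitoring \
  --cluster-arn "arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/abc" \
  --current-version "K3AEGXETSR30VB" \
  --enhanced-monitoring PER_TOPIC_PER_PARTITION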

Topic level metrics are included in the default set of Amazon MSK metrics, which are free of charge. Partition level metrics are charged as per Amazon CloudWatch pricing.

You can enable broker log delivery for new and existing Amazon MSK clusters. You can deliver broker logs to Amazon CloudWatch Logs, Amazon S3, and Kinesis Data Firehose. Kinesis Data Firehose supports Amazon Elasticsearch Service among other destinations. To learn how to enable this feature, see the Amazon MSK Logging Documentation. To learn about pricing refer to CloudWatch Logs and Kinesis Data Firehose pricing pages.

Amazon MSK provides INFO level logs for all brokers within a cluster.

You can request Apache ZooKeeper logs through a support ticket.

Yes, if you use IAM Access Control, the use of Apache Kafka resource APIs is logged to Amazon CloudTrail. 

Apache ZooKeeper

From https://zookeeper.apache.org/: “Apache ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these kinds of services are used in some form or another by distributed applications,” including Apache Kafka.

Yes, Amazon MSK uses Apache ZooKeeper and manages Apache ZooKeeper within each cluster as a part of the Amazon MSK service. Apache ZooKeeper nodes are included with each cluster at no additional cost.

Your clients can interact with Apache ZooKeeper through an Apache ZooKeeper endpoint provided by the service. This endpoint is provided in the Amazon Web Services management console or using the DescribeCluster API.

Integrations

Amazon MSK integrates with a number of Amazon Web Services services covered throughout these FAQs, including Amazon VPC, Amazon IAM, Amazon KMS, Amazon CloudWatch, Amazon CloudTrail, Amazon Glue Schema Registry, and Amazon PrivateLink.

Scaling

You can scale up storage in your provisioned clusters using the Amazon Web Services Management Console or the Amazon CLI. You can also use tiered storage to store virtually unlimited data on your cluster without having to add brokers for storage. In serverless clusters, storage is scaled seamlessly based on your usage.

You can create an auto scaling policy for storage using the Amazon Web Services Management Console or by creating an Application Auto Scaling policy using the Amazon CLI or APIs.
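
A sketch of the CLI route, using Application Auto Scaling’s kafka namespace and broker-storage scalable dimension; the capacities (in GiB) and target utilization are illustrative:

# Register the cluster's broker storage as a scalable target
aws application-autoscaling register-scalable-target \
  --service-namespace kafka \
  --resource-id "arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/abc" \
  --scalable-dimension "kafka:broker-storage:VolumeSize" \
  --min-capacity 1000 --max-capacity 4000

# Scale out when storage utilization crosses the target value
aws application-autoscaling put-scaling-policy \
  --policy-name "msk-storage-policy" \
  --service-namespace kafka \
  --resource-id "arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/abc" \
  --scalable-dimension "kafka:broker-storage:VolumeSize" \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{"TargetValue":60.0,"PredefinedMetricSpecification":{"PredefinedMetricType":"KafkaBrokerStorageUtilization"}}'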

Yes. You can choose to scale to a smaller or larger broker type on your Amazon MSK clusters.

You can use Cruise Control for automatically rebalancing partitions to manage I/O heat. See the Cruise Control documentation for more information. Alternatively, you can use the kafka-reassign-partitions.sh tool that ships with Apache Kafka to reassign partitions across brokers.
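
For example, a sketch of a manual reassignment; the JSON moves one hypothetical partition onto brokers 2, 3, and 4:

# reassignment.json:
#   {"version":1,"partitions":[{"topic":"TopicName","partition":0,"replicas":[2,3,4]}]}
bin/kafka-reassign-partitions.sh --bootstrap-server ConnectionString:9092 \
  --reassignment-json-file reassignment.json --execute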

Apache Kafka stores data in files called log segments. As each segment is completed, based on the size configured at the cluster or topic level, it is copied to the low-cost storage tier. Data is held in performance-optimized storage for a specified retention time, or size, and then deleted. There is a separate time and size limit setting for the low-cost storage, which will be longer than that of the primary storage tier. If clients request data from segments stored in the low-cost tier, the broker will read the data from it and serve the data in the same way as if it were being served from the primary storage.
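
As a sketch of how this is configured, tiered storage is controlled with standard Apache Kafka topic-level settings; the retention values below are illustrative (roughly one hour in primary storage and seven days in total), and the property names follow the tiered storage documentation as we understand it:

# remote.storage.enable turns on the low-cost tier for the topic;
# local.retention.ms bounds the primary tier, retention.ms bounds total retention
bin/kafka-topics.sh --create --bootstrap-server ConnectionString:9092 \
  --replication-factor 3 --partitions 1 --topic TopicName \
  --config remote.storage.enable=true \
  --config local.retention.ms=3600000 \
  --config retention.ms=604800000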

Pricing and availability

Pricing is per broker-hour and per GB-month of storage provisioned. Amazon Web Services data transfer rates apply for data transfer in and out of Amazon MSK. For more information, visit our pricing page.

No, all in-cluster data transfer is included with the service at no additional charge.

You will pay standard Amazon Web Services data transfer charges for data transferred in and out of an Amazon MSK cluster. You will not be charged for data transfer within the cluster in a region, including data transfer between brokers and data transfer between brokers and Apache ZooKeeper nodes.

Service Level Agreement

Our Amazon MSK SLA guarantees a Monthly Uptime Percentage of at least 99.9% for Amazon MSK (including MSK Connect).

You are eligible for an SLA credit for Amazon MSK under the Amazon MSK SLA if Multi-AZ deployments on Amazon MSK have a Monthly Uptime Percentage of less than 99.9% during any monthly billing cycle.

For full details on all of the terms and conditions of the SLA, please see the Service Level Agreements for Amazon Web Services page.

Get started with Amazon MSK

Calculate your costs
Visit the Amazon MSK pricing page.

Review the getting-started guide
Learn how to set up your Apache Kafka cluster on Amazon MSK in this step-by-step guide.

Run your Apache Kafka cluster
Start running your Apache Kafka cluster on Amazon MSK. Log in to the Amazon MSK console.