Amazon Managed Streaming for Apache Kafka Developer Guide
Table of Contents
Welcome
    What is Amazon MSK?
Setting up
    Sign up for Amazon
    Download libraries and tools
Getting started
    Step 1: Create a cluster
    Step 2: Create a client machine
    Step 3: Create a topic
    Step 4: Produce and consume data
    Step 5: View metrics
    Step 6: Delete the resources
How it works
    Creating a cluster
        Broker types
        Creating a cluster using the Amazon Web Services Management Console
        Creating a cluster using the Amazon CLI
        Creating a cluster with a custom MSK configuration using the Amazon CLI
        Creating a cluster using the API
    Deleting a cluster
        Deleting a cluster using the Amazon Web Services Management Console
        Deleting a cluster using the Amazon CLI
        Deleting a cluster using the API
    Getting the Apache ZooKeeper connection string
        Getting the Apache ZooKeeper connection string using the Amazon Web Services Management Console
        Getting the Apache ZooKeeper connection string using the Amazon CLI
        Getting the Apache ZooKeeper connection string using the API
    Getting the bootstrap brokers
        Getting the bootstrap brokers using the Amazon Web Services Management Console
        Getting the bootstrap brokers using the Amazon CLI
        Getting the bootstrap brokers using the API
    Listing clusters
        Listing clusters using the Amazon Web Services Management Console
        Listing clusters using the Amazon CLI
        Listing clusters using the API
    Provisioning storage throughput
        Throughput bottlenecks
        Measuring storage throughput
        Configuration update
        Provisioning storage throughput using the Amazon Web Services Management Console
        Provisioning storage throughput using the Amazon CLI
        Provisioning storage throughput using the API
    Scaling up broker storage
        Automatic scaling
        Manual scaling
    Updating the broker type
        Updating the broker type using the Amazon Web Services Management Console
        Updating the broker type using the Amazon CLI
        Updating the broker type using the API
    Updating the configuration of a cluster
        Updating the configuration of a cluster using the Amazon CLI
        Updating the configuration of a cluster using the API
    Expanding a cluster
        Expanding a cluster using the Amazon Web Services Management Console
        Expanding a cluster using the Amazon CLI
        Expanding a cluster using the API
    Updating security
        Updating a cluster's security settings using the Amazon Web Services Management Console
        Updating a cluster's security settings using the Amazon CLI
        Updating a cluster's security settings using the API
    Rebooting a broker for a cluster
        Rebooting a broker using the Amazon Web Services Management Console
        Rebooting a broker using the Amazon CLI
        Rebooting a broker using the API
    Tagging a cluster
        Tag basics
        Tracking costs using tagging
        Tag restrictions
        Tagging resources using the Amazon MSK API
Configuration
    Custom configurations
        Dynamic configuration
        Topic-level configuration
        States
    Default configuration
    Configuration operations
        Create configuration
        To update an MSK configuration
        To delete an MSK configuration
        To describe an MSK configuration
        To describe an MSK configuration revision
        To list all MSK configurations in your account for the current Region
MSK Serverless
    Getting started tutorial
        Step 1: Create a cluster
        Step 2: Create an IAM role
        Step 3: Create a client machine
        Step 4: Create a topic
        Step 5: Produce and consume data
        Step 6: Delete resources
    Configuration
    Monitoring
Cluster states
Security
    Data protection
        Encryption
        How do I get started with encryption?
    Authentication and authorization for Amazon MSK APIs
        How Amazon MSK works with IAM
        Identity-based policy examples
        Service-linked roles
        Amazon managed policies
        Troubleshooting
    Authentication and authorization for Apache Kafka APIs
        IAM access control
        Mutual TLS authentication
        SASL/SCRAM authentication
        Apache Kafka ACLs
    Changing security groups
    Controlling access to Apache ZooKeeper
What is Amazon MSK?
• Create an Amazon MSK cluster by following the Getting started using Amazon MSK (p. 5) tutorial.
• Dive deeper into the functionality of Amazon MSK in Amazon MSK: How it works (p. 10).
• Run Apache Kafka without having to manage and scale cluster capacity with MSK Serverless (p. 47).
For highlights, product details, and pricing, see the service page for Amazon MSK.
• Broker nodes — When creating an Amazon MSK cluster, you specify how many broker nodes you want
Amazon MSK to create in each Availability Zone. In the example cluster shown in this diagram, there's
one broker per Availability Zone. Each Availability Zone has its own virtual private cloud (VPC) subnet.
• ZooKeeper nodes — Amazon MSK also creates the Apache ZooKeeper nodes for you. Apache
ZooKeeper is an open-source server that enables highly reliable distributed coordination.
• Producers, consumers, and topic creators — Amazon MSK lets you use Apache Kafka data-plane
operations to create topics and to produce and consume data.
• Cluster operations — You can use the Amazon Web Services Management Console, the Amazon
Command Line Interface (Amazon CLI), or the APIs in the SDK to perform control-plane operations. For
example, you can create or delete an Amazon MSK cluster, list all the clusters in an account, view the
properties of a cluster, and update the number and type of brokers in a cluster.
Amazon MSK detects and automatically recovers from the most common failure scenarios for clusters so
that your producer and consumer applications can continue their write and read operations with minimal
impact. When Amazon MSK detects a broker failure, it mitigates the failure or replaces the unhealthy
or unreachable broker with a new one. In addition, where possible, it reuses the storage from the older
broker to reduce the data that Apache Kafka needs to replicate. Your availability impact is limited to the
time required for Amazon MSK to complete the detection and recovery. After a recovery, your producer
and consumer apps can continue to communicate with the same broker IP addresses that they used
before the failure.
Setting up
Tasks
• Sign up for Amazon (p. 4)
• Download libraries and tools (p. 4)
Sign up for Amazon
If you have an Amazon account already, skip to the next task. If you don't have an Amazon account, use
the following procedure to create one.
1. Open https://portal.amazonaws.cn/billing/signup.
2. Follow the online instructions.
Part of the sign-up procedure involves receiving a phone call and entering a verification code on the
phone keypad.
Download libraries and tools
• The Amazon Command Line Interface (Amazon CLI) supports Amazon MSK. The Amazon CLI enables
you to control multiple Amazon Web Services from the command line and automate them through
scripts. Upgrade your Amazon CLI to the latest version to ensure that it has support for the Amazon
MSK features that are documented in this user guide. For detailed instructions on how to upgrade the
Amazon CLI, see Installing the Amazon Command Line Interface. After you install the Amazon CLI, you
must configure it. For information on how to configure the Amazon CLI, see aws configure.
• The Amazon Managed Streaming for Kafka API Reference documents the API operations that Amazon
MSK supports.
• The Amazon Web Services SDKs for Go, Java, JavaScript, .NET, Node.js, PHP, Python, and Ruby include
Amazon MSK support and samples.
Getting started using Amazon MSK
Topics
• Step 1: Create an Amazon MSK cluster (p. 5)
• Step 2: Create a client machine (p. 5)
• Step 3: Create a topic (p. 6)
• Step 4: Produce and consume data (p. 7)
• Step 5: Use Amazon CloudWatch to view Amazon MSK metrics (p. 8)
• Step 6: Delete the Amazon resources created for this tutorial (p. 8)
Step 1: Create an Amazon MSK cluster
To create an Amazon MSK cluster using the Amazon Web Services Management Console
1. Sign in to the Amazon Web Services Management Console, and open the Amazon MSK console at
https://console.amazonaws.cn/msk/home?region=us-east-1#/home/.
2. Choose Create cluster.
3. For Creation method, leave the Quick create option selected. The Quick create option lets you
create a cluster with default settings.
4. For Cluster name, enter a descriptive name for your cluster. For example, MSKTutorialCluster.
5. For General cluster properties, choose Provisioned as the Cluster type.
6. From the table under All cluster settings, copy the values of the following settings and save them
because you need them later in this tutorial:
• VPC
• Subnets
• Security groups associated with VPC
7. Choose Create cluster.
8. Check the cluster Status on the Cluster summary page. The status changes from Creating to Active
as Amazon MSK provisions the cluster. When the status is Active, you can connect to the cluster. For
more information about cluster status, see Cluster states (p. 55).
Next Step
Step 2: Create a client machine (p. 5)
Step 2: Create a client machine
In this step, you create a client machine in the VPC that is associated with the MSK cluster so that the
client can easily connect to the cluster.
Next Step
Step 3: Create a topic (p. 6)
Step 3: Create a topic
Run the following command on the client machine to download the Apache Kafka distribution.
wget https://archive.apache.org/dist/kafka/2.6.2/kafka_2.12-2.6.2.tgz
6
Amazon Managed Streaming for
Apache Kafka Developer Guide
Step 4: Produce and consume data
Note
If you want to use a mirror site other than the one used in this command, you can choose a
different one on the Apache website.
6. Run the following command in the directory where you downloaded the TAR file in the previous
step.
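A sketch of the extraction and topic-creation commands, assuming the kafka_2.12-2.6.2.tgz archive
downloaded above and the ZookeeperConnectString value that you saved earlier (the replication
factor and partition count shown here are illustrative, not values this tutorial prescribes):

# Extract the Apache Kafka distribution downloaded in the previous step.
tar -xzf kafka_2.12-2.6.2.tgz

# Create the tutorial topic. A replication factor of 3 matches a cluster
# whose brokers span three Availability Zones.
kafka_2.12-2.6.2/bin/kafka-topics.sh --create \
    --zookeeper ZookeeperConnectString \
    --replication-factor 3 \
    --partitions 1 \
    --topic MSKTutorialTopic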
If the command succeeds, you see the following message: Created topic MSKTutorialTopic.
Next Step
Step 4: Produce and consume data (p. 7)
Step 4: Produce and consume data
1. Go to the bin folder of the Apache Kafka installation on the client machine, and create a text file
named client.properties with the following contents.
security.protocol=PLAINTEXT
2. Run the following command, replacing BootstrapServerString with the plaintext connection
string that you saved earlier. This command creates a console producer.
<path-to-your-kafka-installation>/bin/kafka-console-producer.sh --broker-list BootstrapServerString --producer.config client.properties --topic MSKTutorialTopic
3. Enter any message that you want, and press Enter. Repeat this step two or three times. Every
time you enter a line and press Enter, that line is sent to your Apache Kafka cluster as a separate
message.
4. Keep the connection to the client machine open, and then open a second, separate connection to
that machine in a new window.
5. In the following command, replace BootstrapServerString with the plaintext connection string
that you saved earlier. Then, to create a console consumer, run the following command with your
second connection to the client machine.
<path-to-your-kafka-installation>/bin/kafka-console-consumer.sh --bootstrap-server BootstrapServerString --consumer.config client.properties --topic MSKTutorialTopic --from-beginning
You start seeing the messages you entered earlier when you used the console producer command.
6. Enter more messages in the producer window, and watch them appear in the consumer window.
Next Step
Step 6: Delete the Amazon resources created for this tutorial (p. 8)
Step 6: Delete the Amazon resources created for this tutorial
To delete the resources using the Amazon Web Services Management Console
5. Choose the instance that you created for your client machine, for example, MSKTutorialClient.
6. Choose Instance state, then choose Terminate instance.
Amazon MSK: How it works
Topics
• Creating an Amazon MSK cluster (p. 10)
• Deleting an Amazon MSK cluster (p. 13)
• Getting the Apache ZooKeeper connection string for an Amazon MSK cluster (p. 14)
• Getting the bootstrap brokers for an Amazon MSK cluster (p. 16)
• Listing Amazon MSK clusters (p. 17)
• Provisioning storage throughput (p. 17)
• Scaling up broker storage (p. 20)
• Updating the broker type (p. 22)
• Updating the configuration of an Amazon MSK cluster (p. 24)
• Expanding an Amazon MSK cluster (p. 26)
• Updating a cluster's security settings (p. 28)
• Rebooting a broker for an Amazon MSK cluster (p. 30)
• Tagging an Amazon MSK cluster (p. 31)
Creating an Amazon MSK cluster
Before you can create an Amazon MSK cluster, you need to have an Amazon Virtual Private Cloud (VPC)
and set up subnets within that VPC.
You need two subnets in two different Availability Zones in the US West (N. California) Region. In all
other Regions where Amazon MSK is available, you can specify either two or three subnets. Your subnets
must all be in different Availability Zones. When you create a cluster, Amazon MSK distributes the broker
nodes evenly over the subnets that you specify.
Broker types
When you create an Amazon MSK cluster, you specify the type of brokers that you want it to have.
Amazon MSK supports the following broker types:
• kafka.t3.small
• kafka.m5.large, kafka.m5.xlarge, kafka.m5.2xlarge, kafka.m5.4xlarge, kafka.m5.8xlarge,
kafka.m5.12xlarge, kafka.m5.16xlarge, kafka.m5.24xlarge
M5 brokers have higher baseline throughput performance than T3 brokers and are recommended for
production workloads. M5 brokers can also have more partitions per broker than T3 brokers. Use M5
brokers if you are running larger production-grade workloads or require a greater number of partitions.
To learn more about M5 instance types, see Amazon EC2 M5 Instances.
T3 brokers have the ability to use CPU credits to temporarily burst performance. Use T3 brokers for
low-cost development, if you are testing small to medium streaming workloads, or if you have low-
throughput streaming workloads that experience temporary spikes in throughput. We recommend
that you run a proof-of-concept test to determine if T3 brokers are sufficient for production or critical
workloads. To learn more about T3 instance types, see Amazon EC2 T3 Instances.
For more information on how to choose broker types, see Best practices (p. 138).
Creating a cluster using the Amazon Web Services Management Console
17. Check the cluster Status on the Cluster summary page. The status changes from Creating to Active
as Amazon MSK provisions the cluster. When the status is Active, you can connect to the cluster. For
more information about cluster status, see Cluster states (p. 55).
Creating a cluster using the Amazon CLI
1. Save the following JSON to a file named brokernodegroupinfo.json, replacing the subnet and
security group placeholders with IDs from your account.
{
"InstanceType": "kafka.m5.large",
"ClientSubnets": [
"Subnet-1-ID",
"Subnet-2-ID"
],
"SecurityGroups": [
"Security-Group-ID"
]
}
Important
Specify exactly two subnets if you are using one of the following Regions: South America
(São Paulo), Canada (Central), and US West (N. California). For other Regions where Amazon
MSK is available, you can specify either two or three subnets. The subnets that you specify
must be in distinct Availability Zones. When you create a cluster, Amazon MSK distributes
the broker nodes evenly across the subnets that you specify.
2. Run the following Amazon CLI command in the directory where you saved the
brokernodegroupinfo.json file, replacing "Your-Cluster-Name" with a name of your
choice. For "Monitoring-Level", you can specify one of the following three values: DEFAULT,
PER_BROKER, or PER_TOPIC_PER_BROKER. For information about these three different levels of
monitoring, see Monitoring (p. 107). The enhanced-monitoring parameter is optional. If you don't
specify it in the create-cluster command, you get the DEFAULT level of monitoring.
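A sketch of the create-cluster call described above; the Apache Kafka version and broker count
shown here are assumptions, not values this guide prescribes.

aws kafka create-cluster \
    --cluster-name "Your-Cluster-Name" \
    --broker-node-group-info file://brokernodegroupinfo.json \
    --kafka-version "2.6.2" \
    --number-of-broker-nodes 3 \
    --enhanced-monitoring "Monitoring-Level"

The output looks like the following JSON example.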
{
"ClusterArn": "...",
"ClusterName": "AWSKafkaTutorialCluster",
"State": "CREATING"
}
Note
The create-cluster command might return an error stating that one or more subnets
belong to unsupported Availability Zones. When this happens, the error indicates which
Availability Zones are unsupported. Create subnets that don't use the unsupported
Availability Zones and try the create-cluster command again.
3. Save the value of the ClusterArn key because you need it to perform other actions on your cluster.
4. Run the following command to check your cluster STATE. The STATE value changes from CREATING
to ACTIVE as Amazon MSK provisions the cluster. When the state is ACTIVE, you can connect to the
cluster. For more information about cluster status, see Cluster states (p. 55).
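A sketch of that check, assuming ClusterArn holds the ARN that you saved in the previous step:

# The State field in the ClusterInfo section of the output moves from
# CREATING to ACTIVE as Amazon MSK provisions the cluster.
aws kafka describe-cluster --cluster-arn "ClusterArn"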
Creating a cluster with a custom MSK configuration using the Amazon CLI
For information about custom MSK configurations and how to create them, see Configuration (p. 34).
1. Save the following JSON to a file, replacing configuration-arn with the ARN of the configuration
that you want to use to create the cluster.
{
"Arn": configuration-arn,
"Revision": 1
}
2. Run the create-cluster command and use the configuration-info option to point to the
JSON file you saved in the previous step. The following is an example.
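A sketch of such a call, assuming the JSON above was saved as configuration-info.json and
reusing the brokernodegroupinfo.json file from the basic procedure (the version and broker count
are placeholders):

aws kafka create-cluster \
    --cluster-name "CustomConfigExampleCluster" \
    --broker-node-group-info file://brokernodegroupinfo.json \
    --kafka-version "2.6.2" \
    --number-of-broker-nodes 3 \
    --configuration-info file://configuration-info.json

The output looks like the following JSON example.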
{
"ClusterArn": "arn:aws:kafka:us-east-1:123456789012:cluster/
CustomConfigExampleCluster/abcd1234-abcd-dcba-4321-a1b2abcd9f9f-2",
"ClusterName": "CustomConfigExampleCluster",
"State": "CREATING"
}
Getting the Apache ZooKeeper connection string using the Amazon CLI
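Run the describe-cluster command against your cluster; a minimal sketch, with ClusterArn
standing in for your cluster's ARN:

aws kafka describe-cluster --cluster-arn "ClusterArn"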
The output of this describe-cluster command looks like the following JSON example.
{
"ClusterInfo": {
"BrokerNodeGroupInfo": {
"BrokerAZDistribution": "DEFAULT",
"ClientSubnets": [
"subnet-0123456789abcdef0",
"subnet-2468013579abcdef1",
"subnet-1357902468abcdef2"
],
"InstanceType": "kafka.m5.large",
"StorageInfo": {
"EbsStorageInfo": {
"VolumeSize": 1000
}
}
},
"ClusterArn": "arn:aws:kafka:us-east-1:111122223333:cluster/
testcluster/12345678-abcd-4567-2345-abcdef123456-2",
"ClusterName": "testcluster",
"CreationTime": "2018-12-02T17:38:36.75Z",
"CurrentBrokerSoftwareInfo": {
"KafkaVersion": "2.2.1"
},
"CurrentVersion": "K13V1IB3VIYZZH",
"EncryptionInfo": {
"EncryptionAtRest": {
"DataVolumeKMSKeyId": "arn:aws:kms:us-east-1:555555555555:key/12345678-
abcd-2345-ef01-abcdef123456"
}
},
"EnhancedMonitoring": "DEFAULT",
"NumberOfBrokerNodes": 3,
"State": "ACTIVE",
"ZookeeperConnectString": "10.0.1.101:2018,10.0.2.101:2018,10.0.3.101:2018"
}
}
The previous JSON example shows the ZookeeperConnectString key in the output of the
describe-cluster command. Copy the value corresponding to this key and save it for when you
need to create a topic on your cluster.
Important
Your Amazon MSK cluster must be in the ACTIVE state for you to be able to obtain the
Apache ZooKeeper connection string. When a cluster is still in the CREATING state, the
output of the describe-cluster command doesn't include ZookeeperConnectString.
If this is the case, wait a few minutes and then run the describe-cluster again after
your cluster reaches the ACTIVE state.
Getting the bootstrap brokers
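From the Amazon CLI, the bootstrap brokers come from the get-bootstrap-brokers operation; a
sketch, with ClusterArn standing in for your cluster's ARN:

aws kafka get-bootstrap-brokers --cluster-arn "ClusterArn"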
For an MSK cluster that uses the section called “IAM access control” (p. 73), the output of this
command looks like the following JSON example.
{
"BootstrapBrokerStringSaslIam": "b-1.myTestCluster.123z8u.c2.kafka.us-
west-1.amazonaws.com:9098,b-2.myTestCluster.123z8u.c2.kafka.us-west-1.amazonaws.com:9098"
}
The following example shows the bootstrap brokers for a cluster that has public access
turned on. Use the BootstrapBrokerStringPublicSaslIam for public access, and the
BootstrapBrokerStringSaslIam string for access from within Amazon.
{
"BootstrapBrokerStringPublicSaslIam": "b-2-public.myTestCluster.v4ni96.c2.kafka-
beta.us-east-1.amazonaws.com:9198,b-1-public.myTestCluster.v4ni96.c2.kafka-
beta.us-east-1.amazonaws.com:9198,b-3-public.myTestCluster.v4ni96.c2.kafka-beta.us-
east-1.amazonaws.com:9198",
"BootstrapBrokerStringSaslIam": "b-2.myTestCluster.v4ni96.c2.kafka-
beta.us-east-1.amazonaws.com:9098,b-1.myTestCluster.v4ni96.c2.kafka-
beta.us-east-1.amazonaws.com:9098,b-3.myTestCluster.v4ni96.c2.kafka-beta.us-
east-1.amazonaws.com:9098"
}
The bootstrap brokers string should contain three brokers from across the Availability Zones in which
your MSK cluster is deployed (unless only two brokers are available).
Provisioning storage throughput
You can specify the provisioned throughput rate in MiB per second for clusters whose brokers are of type
kafka.m5.4xlarge or larger and if the storage volume is 10 GiB or greater. It is possible to specify
provisioned throughput during cluster creation. You can also enable or disable provisioned throughput
for a cluster that is in the ACTIVE state.
Throughput bottlenecks
There are multiple causes of bottlenecks in broker throughput: volume throughput, EC2-EBS network
throughput, and EC2 egress throughput. You can enable provisioned storage throughput to adjust
volume throughput. However, broker throughput limitations can be caused by EC2-EBS network
throughput and EC2 egress throughput.
EC2 egress throughput is impacted by the number of consumer groups and the number of consumers
per consumer group. Also, both EC2-EBS network throughput and EC2 egress throughput are higher for
larger broker types, as shown in the following table.
Broker type	EC2-EBS network throughput (MB/s)
kafka.m5.4xlarge	593.75
kafka.m5.8xlarge	850
kafka.m5.12xlarge	1187.5
kafka.m5.16xlarge	1700
kafka.m5.24xlarge	2375
Measuring storage throughput
For information about the VolumeReadBytes and VolumeWriteBytes metrics, see the section called
“PER_BROKER Level monitoring” (p. 112).
Configuration update
You can update your Amazon MSK configuration either before or after you turn on provisioned
throughput. However, you won't see the desired throughput until you perform both actions: update the
num.replica.fetchers configuration parameter and turn on provisioned throughput.
In the default Amazon MSK configuration, num.replica.fetchers has a value of 2. To update your
num.replica.fetchers, you can use the suggested values from the following table. These values are
for guidance purposes. We recommend that you adjust these values based on your use case.
Broker type	num.replica.fetchers value
kafka.m5.4xlarge	4
kafka.m5.8xlarge	8
kafka.m5.12xlarge	14
kafka.m5.16xlarge	16
kafka.m5.24xlarge	16
Your updated configuration may not take effect for up to 24 hours, and may take longer when a source
volume is not fully utilized. However, transitional volume performance at least equals the performance
of source storage volumes during the migration period. A fully-utilized 1 TiB volume typically takes
about six hours to migrate to an updated configuration.
Provisioning storage throughput using the Amazon CLI
1. Copy the following JSON and paste it into a file. Replace the subnet IDs and security group ID
placeholders with values from your account. Name the file cluster-creation.json and save it.
{
"Provisioned": {
"BrokerNodeGroupInfo":{
"InstanceType":"kafka.m5.4xlarge",
"ClientSubnets":[
"Subnet-1-ID",
"Subnet-2-ID"
],
"SecurityGroups":[
"Security-Group-ID"
],
"StorageInfo": {
"EbsStorageInfo": {
"VolumeSize": 10,
"ProvisionedThroughput": {
"Enabled": true,
"VolumeThroughput": 250
}
}
}
},
"EncryptionInfo": {
"EncryptionInTransit": {
"InCluster": false,
"ClientBroker": "PLAINTEXT"
}
},
"KafkaVersion":"2.2.1",
"NumberOfBrokerNodes": 2
},
"ClusterName": "provisioned-throughput-example"
}
2. Run the following Amazon CLI command from the directory where you saved the JSON file in the
previous step.
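The JSON above matches the request shape of the create-cluster-v2 operation, so the command
presumably takes the following form:

aws kafka create-cluster-v2 --cli-input-json file://cluster-creation.json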
Scaling up broker storage
Topics
• Automatic scaling (p. 20)
• Manual scaling (p. 22)
Automatic scaling
To automatically expand your cluster's storage in response to increased usage, you can configure an
Application Auto Scaling policy for Amazon MSK. In an auto-scaling policy, you set the target disk
utilization and the maximum scaling capacity.
Before you use automatic scaling for Amazon MSK, you should consider the following:
• Important
A storage scaling action can occur only once every six hours.
We recommend that you start with a right-sized storage volume for your storage demands.
For guidance on right-sizing your cluster, see Right-size your cluster: Number of brokers per
cluster (p. 138).
• Amazon MSK does not reduce cluster storage in response to reduced usage. Amazon MSK does not
support decreasing the size of storage volumes. If you need to reduce the size of your cluster storage,
you must migrate your existing cluster to a cluster with smaller storage. For information about
migrating a cluster, see Migration (p. 104).
• Amazon MSK does not support automatic scaling in the Asia Pacific (Osaka) and Africa (Cape Town)
Regions.
• When you associate an auto-scaling policy with your cluster, Amazon EC2 Auto Scaling automatically
creates an Amazon CloudWatch alarm for target tracking. If you delete a cluster with an auto-scaling
policy, this CloudWatch alarm persists. To delete the CloudWatch alarm, you should remove an auto-
scaling policy from a cluster before you delete the cluster. To learn more about target tracking, see
Target tracking scaling policies for Amazon EC2 Auto Scaling in the Amazon EC2 Auto Scaling User
Guide.
• Storage Utilization Target: The storage utilization threshold that Amazon MSK uses to trigger an
auto-scaling operation. You can set the utilization target between 10% and 80% of the current storage
capacity. We recommend that you set the Storage Utilization Target between 50% and 60%.
• Maximum Storage Capacity: The maximum scaling limit that Amazon MSK can set for your broker
storage. You can set the maximum storage capacity up to 16 TiB per broker. For more information, see
Amazon MSK quota (p. 123).
When Amazon MSK detects that your Maximum Disk Utilization metric is equal to or greater than
the Storage Utilization Target setting, it increases your storage capacity by an amount equal to
the larger of two numbers: 10 GiB or 10% of current storage. For example, if you have 1000 GiB, that
amount is 100 GiB. The service checks your storage utilization every minute. Further scaling operations
continue to increase storage by an amount equal to the larger of two numbers: 10 GiB or 10% of current
storage.
When you save and enable the new policy, the policy becomes active for the cluster. Amazon MSK then
expands the cluster's storage when the storage utilization target is reached.
Manual scaling
To increase storage, wait for the cluster to be in the ACTIVE state. Storage scaling has a cool-down
period of at least six hours between events. Even though the operation makes additional storage
available right away, the service performs optimizations on your cluster that can take up to 24 hours or
more. The duration of these optimizations is proportional to your storage size.
The Target-Volume-in-GiB parameter represents the amount of storage that you want each broker
to have. It is only possible to update the storage for all the brokers. You can't specify individual brokers
for which to update storage. The value you specify for Target-Volume-in-GiB must be a whole
number that is greater than 100 GiB. The storage per broker after the update operation can't exceed
16384 GiB.
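A sketch of the corresponding CLI call; the --target-broker-ebs-volume-info structure shown
here, with "ALL" selecting every broker and an illustrative volume size, is an assumption made to show
the shape of the request.

aws kafka update-broker-storage \
    --cluster-arn "ClusterArn" \
    --current-version "Current-Cluster-Version" \
    --target-broker-ebs-volume-info '{"KafkaBrokerNodeId": "ALL", "VolumeSizeGB": 1100}'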
Updating the broker type
You can scale your MSK cluster on demand by changing the size or family of your brokers without
interrupting your cluster I/O. Amazon MSK uses the same broker type for all the brokers in a given
cluster. This
section describes how to update the broker type for your MSK cluster. The broker-type update happens
in a rolling fashion while the cluster is up and running. This means that Amazon MSK takes down one
broker at a time to perform the broker-type update. For information about how to make a cluster highly
available during a broker-type update, see the section called “Build highly available clusters” (p. 139).
To further reduce any potential impact on productivity, you can perform the broker-type update during a
period of low traffic.
During a broker-type update, you can continue to produce and consume data. However, you must wait
until the update is done before you can reboot brokers or invoke any of the update operations listed
under Amazon MSK operations.
If you want to update your cluster to a smaller broker type, we recommend that you try the update on a
test cluster first to see how it affects your scenario.
Important
You can't update a cluster to a smaller broker type if the number of partitions per broker
exceeds the maximum number specified in the section called “ Right-size your cluster: Number
of partitions per broker” (p. 138).
Updating the broker type using the Amazon CLI
1. Run the update-broker-type command, replacing ClusterArn with the ARN of your cluster.
Replace Current-Cluster-Version with the current version of the cluster and TargetType with
the new type that you want the brokers to be. To learn more about broker types, see the section
called “Broker types” (p. 10).
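A sketch of the update-broker-type call with those placeholders:

aws kafka update-broker-type \
    --cluster-arn "ClusterArn" \
    --current-version "Current-Cluster-Version" \
    --target-instance-type "TargetType"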
The output of this command looks like the following JSON example.
{
"ClusterArn": "arn:aws:kafka:us-east-1:0123456789012:cluster/exampleName/abcd1234-0123-abcd-5678-1234abcd-1",
"ClusterOperationArn": "arn:aws:kafka:us-east-1:012345678012:cluster-operation/exampleClusterName/abcdefab-1234-abcd-5678-cdef0123ab01-2/0123abcd-abcd-4f7f-1234-9876543210ef"
}
2. To get the result of the update-broker-type operation, run the following command, replacing
ClusterOperationArn with the ARN that you obtained in the output of the update-broker-type
command.
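A sketch of that call; the same describe-cluster-operation invocation applies wherever this
chapter asks you to check the status of a ClusterOperationArn.

aws kafka describe-cluster-operation --cluster-operation-arn "ClusterOperationArn"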
The output of this describe-cluster-operation command looks like the following JSON
example.
{
"ClusterOperationInfo": {
"ClientRequestId": "982168a3-939f-11e9-8a62-538df00285db",
"ClusterArn": "arn:aws:kafka:us-east-1:0123456789012:cluster/exampleName/
abcd1234-0123-abcd-5678-1234abcd-1",
"CreationTime": "2021-01-09T02:24:22.198000+00:00",
"OperationArn": "arn:aws:kafka:us-east-1:012345678012:cluster-operation/
exampleClusterName/abcdefab-1234-abcd-5678-cdef0123ab01-2/0123abcd-
abcd-4f7f-1234-9876543210ef",
"OperationState": "UPDATE_COMPLETE",
"OperationType": "UPDATE_BROKER_TYPE",
"SourceClusterInfo": {
"InstanceType": "t3.small"
},
"TargetClusterInfo": {
"InstanceType": "m5.large"
}
}
}
If OperationState has the value UPDATE_IN_PROGRESS, wait a while, then run the
describe-cluster-operation command again.
Updating the configuration of an Amazon MSK cluster
For information about MSK configuration, including how to create a custom configuration, which
properties you can update, and what happens when you update the configuration of an existing cluster,
see Configuration (p. 34).
Updating the configuration of a cluster using the Amazon CLI
1. Copy the following JSON and save it to a file named configuration-info.json. Replace
ConfigurationArn with the ARN of the configuration that you want to use.
Replace Configuration-Revision with the revision of the configuration that you want to use.
Configuration revisions are integers (whole numbers) that start at 1. This integer mustn't be in
quotes in the following JSON.
{
"Arn": ConfigurationArn,
"Revision": Configuration-Revision
}
2. Run the following command, replacing ClusterArn with the ARN that you obtained when you
created your cluster. If you don't have the ARN for your cluster, you can find it by listing all clusters.
For more information, see the section called “Listing clusters” (p. 17).
Replace Path-to-Config-Info-File with the path to your configuration info file. If you named
the file that you created in the previous step configuration-info.json and saved it in the
current directory, then Path-to-Config-Info-File is configuration-info.json.
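A sketch of the update-cluster-configuration call; the --current-version value (the cluster
version string returned by describe-cluster) is assumed here because the operation requires it.

aws kafka update-cluster-configuration \
    --cluster-arn "ClusterArn" \
    --configuration-info file://Path-to-Config-Info-File \
    --current-version "Current-Cluster-Version"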
The output of this update-cluster-configuration command looks like the following JSON
example.
{
"ClusterArn": "arn:aws:kafka:us-east-1:012345678012:cluster/exampleClusterName/abcdefab-1234-abcd-5678-cdef0123ab01-2",
"ClusterOperationArn": "arn:aws:kafka:us-east-1:012345678012:cluster-operation/exampleClusterName/abcdefab-1234-abcd-5678-cdef0123ab01-2/0123abcd-abcd-4f7f-1234-9876543210ef"
}
3. To get the result of the update-cluster-configuration operation, run the following command,
replacing ClusterOperationArn with the ARN that you obtained in the output of the
update-cluster-configuration command.
The output of this describe-cluster-operation command looks like the following JSON
example.
{
"ClusterOperationInfo": {
"ClientRequestId": "982168a3-939f-11e9-8a62-538df00285db",
"ClusterArn": "arn:aws:kafka:us-east-1:012345678012:cluster/exampleClusterName/
abcdefab-1234-abcd-5678-cdef0123ab01-2",
"CreationTime": "2019-06-20T21:08:57.735Z",
"OperationArn": "arn:aws:kafka:us-east-1:012345678012:cluster-
operation/exampleClusterName/abcdefab-1234-abcd-5678-cdef0123ab01-2/0123abcd-
abcd-4f7f-1234-9876543210ef",
"OperationState": "UPDATE_COMPLETE",
"OperationType": "UPDATE_CLUSTER_CONFIGURATION",
"SourceClusterInfo": {},
"TargetClusterInfo": {
"ConfigurationInfo": {
"Arn": "arn:aws:kafka:us-east-1:123456789012:configuration/
ExampleConfigurationName/abcdabcd-abcd-1234-abcd-abcd123e8e8e-1",
"Revision": 1
}
}
}
}
Expanding a cluster
Use this operation when you want to increase the number of brokers in your MSK cluster.
For information about how to rebalance partitions after you add brokers to a cluster, see the section
called “Reassign partitions” (p. 141).
Expanding a cluster using the Amazon Web Services Management Console
4. Enter the number of brokers that you want the cluster to have per Availability Zone and then choose
Save changes.
Expanding a cluster using the Amazon CLI
1. Run the following command, replacing ClusterArn with the ARN that you obtained when you
created your cluster.
The Target-Number-of-Brokers parameter represents the total number of broker nodes that
you want the cluster to have when this operation completes successfully. The value you specify for
Target-Number-of-Brokers must be a whole number that is greater than the current number of
brokers in the cluster. It must also be a multiple of the number of Availability Zones.
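A sketch of the update-broker-count call, assuming Current-Cluster-Version is your cluster's
current version string:

aws kafka update-broker-count \
    --cluster-arn "ClusterArn" \
    --current-version "Current-Cluster-Version" \
    --target-number-of-broker-nodes Target-Number-of-Brokers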
The output of this update-broker-count operation looks like the following JSON.
"ClusterArn": "arn:aws:kafka:us-east-1:012345678012:cluster/exampleClusterName/
abcdefab-1234-abcd-5678-cdef0123ab01-2",
"ClusterOperationArn": "arn:aws:kafka:us-east-1:012345678012:cluster-
operation/exampleClusterName/abcdefab-1234-abcd-5678-cdef0123ab01-2/0123abcd-
abcd-4f7f-1234-9876543210ef"
}
2. To get the result of the update-broker-count operation, run the following command, replacing
ClusterOperationArn with the ARN that you obtained in the output of the update-broker-count
command.
The output of this describe-cluster-operation command looks like the following JSON
example.
{
"ClusterOperationInfo": {
"ClientRequestId": "c0b7af47-8591-45b5-9c0c-909a1a2c99ea",
"ClusterArn": "arn:aws:kafka:us-east-1:012345678012:cluster/exampleClusterName/
abcdefab-1234-abcd-5678-cdef0123ab01-2",
"CreationTime": "2019-09-25T23:48:04.794Z",
"OperationArn": "arn:aws:kafka:us-east-1:012345678012:cluster-
operation/exampleClusterName/abcdefab-1234-abcd-5678-cdef0123ab01-2/0123abcd-
abcd-4f7f-1234-9876543210ef",
"OperationState": "UPDATE_COMPLETE",
"OperationType": "INCREASE_BROKER_COUNT",
"SourceClusterInfo": {
"NumberOfBrokerNodes": 9
},
"TargetClusterInfo": {
"NumberOfBrokerNodes": 12
}
}
}
Updating security
The cluster must be in the ACTIVE state for you to update its security settings.
If you turn on authentication using IAM, SASL, or TLS, you must also turn on encryption between clients
and brokers.
For more information about security settings, see Security (p. 57).
Updating a cluster's security settings using the Amazon Web Services Management Console
4. Choose the authentication and encryption settings that you want for the cluster, then choose Save
changes.
{"EncryptionInTransit":{"ClientBroker": "TLS"}}
2. Create a JSON file that contains the authentication settings that you want the cluster to have. The
following is an example.
{"Sasl":{"Scram":{"Enabled":true}}}
The output of this update-security operation looks like the following JSON.
"ClusterArn": "arn:aws:kafka:us-east-1:012345678012:cluster/exampleClusterName/
abcdefab-1234-abcd-5678-cdef0123ab01-2",
"ClusterOperationArn": "arn:aws:kafka:us-east-1:012345678012:cluster-
operation/exampleClusterName/abcdefab-1234-abcd-5678-cdef0123ab01-2/0123abcd-
abcd-4f7f-1234-9876543210ef"
}
4. To see the status of the update-security operation, run the following command, replacing
ClusterOperationArn with the ARN that you obtained in the output of the update-security
command.
The output of this describe-cluster-operation command looks like the following JSON
example.
{
"ClusterOperationInfo": {
"ClientRequestId": "c0b7af47-8591-45b5-9c0c-909a1a2c99ea",
"ClusterArn": "arn:aws:kafka:us-east-1:012345678012:cluster/exampleClusterName/
abcdefab-1234-abcd-5678-cdef0123ab01-2",
"CreationTime": "2021-09-17T02:35:47.753000+00:00",
"OperationArn": "arn:aws:kafka:us-east-1:012345678012:cluster-
operation/exampleClusterName/abcdefab-1234-abcd-5678-cdef0123ab01-2/0123abcd-
abcd-4f7f-1234-9876543210ef",
"OperationState": "PENDING",
"OperationType": "UPDATE_SECURITY",
"SourceClusterInfo": {},
"TargetClusterInfo": {}
}
}
If OperationState has the value PENDING or UPDATE_IN_PROGRESS, wait a while, then run the
describe-cluster-operation command again.
Rebooting a broker for a cluster
The Amazon MSK service may reboot the brokers for your MSK cluster during system maintenance,
such as patching or version upgrades. Rebooting a broker manually lets you test the resilience of your
Apache Kafka clients and determine how they respond to system maintenance.
Rebooting a broker using the Amazon CLI
1. Run the following command, replacing ClusterArn with the ARN of your cluster and BrokerId
with the ID of the broker that you want to reboot.
If you don't have the ARN for your cluster, you can find it by listing all clusters. For more
information, see the section called “Listing clusters” (p. 17).
If you don't have the broker IDs for your cluster, you can find them by listing the broker nodes. For
more information, see list-nodes.
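A sketch of the reboot-broker call with those placeholders:

aws kafka reboot-broker \
    --cluster-arn "ClusterArn" \
    --broker-ids "BrokerId"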
The output of this reboot-broker operation looks like the following JSON.
"ClusterArn": "arn:aws:kafka:us-east-1:012345678012:cluster/exampleClusterName/
abcdefab-1234-abcd-5678-cdef0123ab01-2",
"ClusterOperationArn": "arn:aws:kafka:us-east-1:012345678012:cluster-
operation/exampleClusterName/abcdefab-1234-abcd-5678-cdef0123ab01-2/0123abcd-
abcd-4f7f-1234-9876543210ef"
}
2. To get the result of the reboot-broker operation, run the following command, replacing
ClusterOperationArn with the ARN that you obtained in the output of the reboot-broker
command.
The output of this describe-cluster-operation command looks like the following JSON
example.
{
"ClusterOperationInfo": {
"ClientRequestId": "c0b7af47-8591-45b5-9c0c-909a1a2c99ea",
"ClusterArn": "arn:aws:kafka:us-east-1:012345678012:cluster/exampleClusterName/
abcdefab-1234-abcd-5678-cdef0123ab01-2",
"CreationTime": "2019-09-25T23:48:04.794Z",
"OperationArn": "arn:aws:kafka:us-east-1:012345678012:cluster-
operation/exampleClusterName/abcdefab-1234-abcd-5678-cdef0123ab01-2/0123abcd-
abcd-4f7f-1234-9876543210ef",
"OperationState": "REBOOT_IN_PROGRESS",
"OperationType": "REBOOT_NODE",
"SourceClusterInfo": {},
"TargetClusterInfo": {}
}
}
Tagging an Amazon MSK cluster
Topics
• Tag basics (p. 32)
• Tracking costs using tagging (p. 32)
• Tag restrictions (p. 32)
• Tagging resources using the Amazon MSK API (p. 33)
Tag basics
You can use the Amazon MSK API to complete the following tasks:
• Add tags to an existing resource
• List the tags for a resource
• Remove tags from a resource
You can use tags to categorize your Amazon MSK resources. For example, you can categorize your
Amazon MSK clusters by purpose, owner, or environment. Because you define the key and value for
each tag, you can create a custom set of categories to meet your specific needs. For example, you might
define a set of tags that help you track clusters by owner and associated application.
Tag restrictions
The following restrictions apply to tags in Amazon MSK.
Basic restrictions
• Each tag key must be unique. If you add a tag with a key that's already in use, your new tag overwrites
the existing key-value pair.
• You can't start a tag key with aws: because this prefix is reserved for use by Amazon. Amazon creates
tags that begin with this prefix on your behalf, but you can't edit or delete them.
• Tag keys must be between 1 and 128 Unicode characters in length.
• Tag keys must consist of the following characters: Unicode letters, digits, white space, and the
following special characters: _ . / = + - @.
• Tag values can be blank. Otherwise, they must consist of the following characters: Unicode letters,
digits, white space, and any of the following special characters: _ . / = + - @.
Tagging resources using the Amazon MSK API
You can use the following operations to add, list, or remove tags for an Amazon MSK resource:
• ListTagsForResource
• TagResource
• UntagResource
Amazon MSK configuration
Topics
• Custom MSK configurations (p. 34)
• The default Amazon MSK configuration (p. 40)
• Amazon MSK configuration operations (p. 42)
Custom MSK configurations
Amazon MSK enables you to create a custom MSK configuration where you set configuration properties
explicitly. Properties that you don't set explicitly get the values they have in the default Amazon MSK
configuration (p. 40).
(Table of configurable property names, descriptions, and value ranges omitted.)
To learn how you can create a custom MSK configuration, list all configurations, or describe them, see
the section called “Configuration operations” (p. 42). To create an MSK cluster using a custom MSK
configuration or to update a cluster with a new custom configuration, see How it works (p. 10).
When you update your existing MSK cluster with a custom MSK configuration, Amazon MSK does rolling
restarts when necessary, using best practices to minimize customer downtime. For example, after
Amazon MSK restarts each broker, it tries to let the broker catch up on data that the broker might have
missed during the configuration update before it moves to the next broker.
Dynamic configuration
In addition to the configuration properties that Amazon MSK provides, you can dynamically set cluster-
and broker-level configuration properties that don't require a broker restart. You can dynamically set
configuration properties that aren't marked as read-only in the table under Broker Configs in the Apache
Kafka documentation. For information about dynamic configuration and example commands, see
Updating Broker Configs in the Apache Kafka documentation.
Note
You can set the advertised.listeners property, but not the listeners property.
Topic-level configuration
You can use Apache Kafka commands to set or modify topic-level configuration properties for new and
existing topics. For more information about topic-level configuration properties and examples on how to
set them, see Topic-Level Configs in the Apache Kafka documentation.
Configuration states
Amazon MSK configurations can be in the following states. To perform an operation on a configuration,
the configuration must be in the ACTIVE or DELETE_FAILED state:
• ACTIVE
• DELETING
• DELETE_FAILED
The default Amazon MSK configuration
When you create an MSK cluster without specifying a custom MSK configuration, Amazon MSK creates
and uses a default configuration with the values shown in the following table.

Name	Description	Default value
default.replication.factor	Default replication factor for automatically created topics.	3 for 3-AZ clusters, 2 for 2-AZ clusters

(Other rows of the default configuration table are omitted.)
Configuration operations
For information about how to specify custom configuration values, see the section called “Custom
configurations” (p. 34).
To create an MSK configuration
1. Create a file where you specify the configuration properties that you want to set and the values
that you want to assign to them. The following are the contents of an example configuration file.
auto.create.topics.enable = true
zookeeper.connection.timeout.ms = 1000
log.roll.ms = 604800000
2. Run the following Amazon CLI command, replacing config-file-path with the path to the file
where you saved your configuration in the previous step.
Note
The name that you choose for your configuration must match the following regex: "^[0-9A-
Za-z][0-9A-Za-z-]{0,}$".
{
"Arn": "arn:aws:kafka:us-east-1:123456789012:configuration/SomeTest/abcdabcd-1234-
abcd-1234-abcd123e8e8e-1",
"CreationTime": "2019-05-21T19:37:40.626Z",
"LatestRevision": {
"CreationTime": "2019-05-21T19:37:40.626Z",
"Description": "Example configuration description.",
"Revision": 1
},
"Name": "ExampleConfigurationName"
}
3. The previous command returns an Amazon Resource Name (ARN) for the newly created
configuration. Save this ARN because you need it to refer to this configuration in other commands.
If you lose your configuration ARN, you can find it again by listing all the configurations in your
account.
To update an MSK configuration
1. Create a file where you specify the configuration properties that you want to update, together with
the values that you want to assign to them. The following are the contents of an example
configuration file.
auto.create.topics.enable = true
zookeeper.connection.timeout.ms = 1000
min.insync.replicas = 2
2. Run the following Amazon CLI command, replacing config-file-path with the path to the file
where you saved your configuration in the previous step.
Replace configuration-arn with the ARN you obtained when you created the configuration.
If you didn't save the ARN when you created the configuration, you can use the
list-configurations command to list all configurations in your account, and find the configuration
that you want in the list that appears in the response. The ARN of the configuration also appears in that
list.
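A sketch of the update-configuration call; the description shown mirrors the example response
below and is an assumption.

aws kafka update-configuration \
    --arn configuration-arn \
    --description "Example configuration revision description." \
    --server-properties fileb://config-file-path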
{
"Arn": "arn:aws:kafka:us-east-1:123456789012:configuration/SomeTest/abcdabcd-1234-
abcd-1234-abcd123e8e8e-1",
"LatestRevision": {
"CreationTime": "2020-08-27T19:37:40.626Z",
"Description": "Example configuration revision description.",
"Revision": 2
}
}
To delete an MSK configuration
1. To run this example, replace configuration-arn with the ARN that you obtained when you created the configuration. If you didn't save the ARN when you created the configuration, you can use the list-configurations command to list all the configurations in your account, and find the configuration that you want in the list that appears in the response. The ARN of the configuration also appears in that list.
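A sketch of the delete command:
aws kafka delete-configuration --arn configuration-arn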
{
"arn": " arn:aws:kafka:us-east-1:123456789012:configuration/SomeTest/abcdabcd-1234-
abcd-1234-abcd123e8e8e-1",
"state": "DELETING"
}
To describe an MSK configuration
To run this example, replace configuration-arn with the ARN that you obtained when you created the configuration. If you didn't save the ARN when you created the configuration, you can use the list-configurations command to list all the configurations in your account, and find the configuration that you want in the list that appears in the response. The ARN of the configuration also appears in that list.
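A sketch of the describe command:
aws kafka describe-configuration --arn configuration-arn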
{
"Arn": "arn:aws:kafka:us-east-1:123456789012:configuration/SomeTest/abcdabcd-
abcd-1234-abcd-abcd123e8e8e-1",
"CreationTime": "2019-05-21T00:54:23.591Z",
"Description": "Example configuration description.",
"KafkaVersions": [
"1.1.1"
],
"LatestRevision": {
"CreationTime": "2019-05-21T00:54:23.591Z",
"Description": "Example configuration description.",
"Revision": 1
},
"Name": "SomeTest"
}
To describe an MSK configuration revision
• Run the following command, replacing configuration-arn with the ARN that you obtained when you created the configuration. If you didn't save the ARN when you created the configuration, you can use the list-configurations command to list all the configurations in your account, and find the configuration that you want in the list that appears in the response. The ARN of the configuration also appears in that list.
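A sketch of the command; the revision number 1 matches the example output below.
aws kafka describe-configuration-revision --arn configuration-arn --revision 1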
{
"Arn": "arn:aws:kafka:us-east-1:123456789012:configuration/SomeTest/abcdabcd-
abcd-1234-abcd-abcd123e8e8e-1",
"CreationTime": "2019-05-21T00:54:23.591Z",
"Description": "Example configuration description.",
"Revision": 1,
"ServerProperties":
"YXV0by5jcmVhdGUudG9waWNzLmVuYWJsZSA9IHRydWUKCgp6b29rZWVwZXIuY29ubmVjdGlvbi50aW1lb3V0Lm1zID0gMTAwM
}
The value of ServerProperties is encoded using base64. If you use a base64 decoder (for
example, https://www.base64decode.org/) to manually decode it, you get the contents of the
original configuration file that you used to create the custom configuration. In this case, you get the
following:
auto.create.topics.enable = true
zookeeper.connection.timeout.ms = 1000
log.roll.ms = 604800000
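To list all MSK configurations in your account for the current Region
Run the list-configurations command, as in the following sketch.
aws kafka list-configurations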
{
"Configurations": [
{
"Arn": "arn:aws:kafka:us-east-1:123456789012:configuration/SomeTest/
abcdabcd-abcd-1234-abcd-abcd123e8e8e-1",
"CreationTime": "2019-05-21T00:54:23.591Z",
"Description": "Example configuration description.",
"KafkaVersions": [
"1.1.1"
],
"LatestRevision": {
"CreationTime": "2019-05-21T00:54:23.591Z",
"Description": "Example configuration description.",
"Revision": 1
},
"Name": "SomeTest"
},
{
"Arn": "arn:aws:kafka:us-east-1:123456789012:configuration/SomeTest/
abcdabcd-1234-abcd-1234-abcd123e8e8e-1",
"CreationTime": "2019-05-03T23:08:29.446Z",
"Description": "Example configuration description.",
"KafkaVersions": [
"1.1.1"
],
"LatestRevision": {
"CreationTime": "2019-05-03T23:08:29.446Z",
"Description": "Example configuration description.",
"Revision": 1
},
"Name": "ExampleConfigurationName"
}
]
}
MSK Serverless
Note
MSK Serverless is available in the US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Stockholm), and Europe (Ireland) Regions.
MSK Serverless is a cluster type for Amazon MSK that makes it possible for you to run Apache Kafka
without having to manage and scale cluster capacity. It automatically provisions and scales capacity
while managing the partitions in your topic, so you can stream data without thinking about right-sizing
or scaling clusters. MSK Serverless offers a throughput-based pricing model, so you pay only for what
you use. Consider using a serverless cluster if your applications need on-demand streaming capacity that
scales up and down automatically.
MSK Serverless is fully compatible with Apache Kafka, so you can use any compatible client applications to produce and consume data. It also integrates with other Amazon Web Services.
MSK Serverless requires IAM access control for all clusters. For more information, see the section called
“IAM access control” (p. 73).
For information about the service quotas that apply to MSK Serverless, see the section called “Quota for serverless clusters” (p. 123).
To help you get started with serverless clusters, and to learn more about configuration and monitoring
options for serverless clusters, see the following.
Topics
• Getting started using MSK Serverless clusters (p. 47)
• Configuration for serverless clusters (p. 53)
• Monitoring serverless clusters (p. 53)
Getting started using MSK Serverless clusters
Step 1: Create a cluster
1. Sign in to the Amazon Web Services Management Console, and open the Amazon MSK console at
https://console.aws.amazon.com/msk/home.
2. Choose Create cluster.
3. For Creation method, leave the Quick create option selected. The Quick create option lets you
create a serverless cluster with default settings.
4. For Cluster name, enter a descriptive name, such as msk-serverless-tutorial-cluster.
5. For General cluster properties, choose Serverless as the Cluster type. Use the default values for the
remaining General cluster properties.
6. Note the table under All cluster settings. This table lists the default values for important settings
such as networking and availability, and indicates whether you can change each setting after you
create the cluster. To change a setting before you create the cluster, you should choose the Custom
create option under Creation method.
Note
You can connect clients from up to five different VPCs with MSK Serverless clusters. To help
client applications switch over to another Availability Zone in the event of an outage, you
must specify at least two subnets in each VPC.
7. Choose Create cluster.
After you create the cluster, save its connection information as follows:
1. In the Cluster summary section, choose View client information. This button remains grayed out
until Amazon MSK finishes creating the cluster. You might need to wait a few minutes until the
button becomes active so you can use it.
2. Copy the string under the label Endpoint. This is your bootstrap server string.
3. Choose the Properties tab.
4. Under the Networking settings section, copy the IDs of the subnets and the security group and save
them because you need this information later to create a client machine.
5. Choose any of the subnets. This opens the Amazon VPC Console. Find the ID of the Amazon VPC
that is associated with the subnet. Save this Amazon VPC ID for later use.
Next Step
Step 2: Create an IAM role
To create an IAM policy that makes it possible to create topics and write to them
Replace region with the code of the Amazon Web Services Region where you created your cluster.
Replace Account-ID with your account ID. Replace msk-serverless-tutorial-cluster with
the name of your serverless cluster.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"kafka-cluster:Connect",
"kafka-cluster:AlterCluster",
"kafka-cluster:DescribeCluster"
],
"Resource": [
"arn:aws:kafka:region:Account-ID:cluster/msk-serverless-tutorial-
cluster/*"
]
},
{
"Effect": "Allow",
"Action": [
"kafka-cluster:*Topic*",
"kafka-cluster:WriteData",
"kafka-cluster:ReadData"
],
"Resource": [
"arn:aws:kafka:region:Account-ID:topic/msk-serverless-tutorial-cluster/
*"
]
},
{
"Effect": "Allow",
"Action": [
"kafka-cluster:AlterGroup",
"kafka-cluster:DescribeGroup"
],
"Resource": [
"arn:aws:kafka:region:Account-ID:group/msk-serverless-tutorial-cluster/
*"
]
}
]
}
For instructions on how to write secure policies, see the section called “IAM access
control” (p. 73).
49
Amazon Managed Streaming for
Apache Kafka Developer Guide
Next Step
Step 3: Create a client machine
13. In the left navigation pane, choose Instances. Then choose the check box in the row that represents
your newly created Amazon EC2 instance. From this point forward, we call this instance the client
machine.
14. Choose Connect and follow the instructions to connect to the client machine.
2. To get the Apache Kafka tools that we need to create topics and send data, run the following
commands:
wget https://archive.apache.org/dist/kafka/2.8.1/kafka_2.12-2.8.1.tgz
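Then extract the archive that you just downloaded:
tar -xzf kafka_2.12-2.8.1.tgz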
3. Go to the kafka_2.12-2.8.1/libs directory, then run the following command to download the
Amazon MSK IAM JAR file. The Amazon MSK IAM JAR makes it possible for the client machine to
access the cluster.
wget https://github.com/aws/aws-msk-iam-auth/releases/download/v1.1.1/aws-msk-iam-
auth-1.1.1-all.jar
4. Go to the kafka_2.12-2.8.1/bin directory. Copy the following property settings and paste them
into a new file. Name the file client.properties and save it.
security.protocol=SASL_SSL
sasl.mechanism=AWS_MSK_IAM
sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler
Next Step
Step 4: Create a topic
1. In the following export command, replace my-endpoint with the bootstrap-server string that you saved after you created the cluster. Then, go to the kafka_2.12-2.8.1/bin directory on the client machine and run the export command.
export BS=my-endpoint
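2. Run a command like the following sketch to create a topic. The topic name msk-serverless-tutorial and the partition count are assumed examples; you can choose different values.
<path-to-your-kafka-installation>/bin/kafka-topics.sh --bootstrap-server $BS --command-config client.properties --create --topic msk-serverless-tutorial --partitions 6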
Next Step
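Step 5: Produce and consume data
1. Run a command like the following sketch to create a console producer; $BS is the variable that you exported earlier, and the topic name msk-serverless-tutorial is the assumed example from the previous step.
<path-to-your-kafka-installation>/bin/kafka-console-producer.sh --broker-list $BS --producer.config client.properties --topic msk-serverless-tutorial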
2. Enter any message that you want, and press Enter. Repeat this step two or three times. Every time
you enter a line and press Enter, that line is sent to your cluster as a separate message.
3. Keep the connection to the client machine open, and then open a second, separate connection to
that machine in a new window.
4. Use your second connection to the client machine to create a console consumer with the following
command. Replace my-endpoint with the bootstrap server string that you saved after you created
the cluster.
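The consumer command takes the general shape of the following sketch; the topic name msk-serverless-tutorial is the assumed example from the earlier steps.
<path-to-your-kafka-installation>/bin/kafka-console-consumer.sh --bootstrap-server my-endpoint --consumer.config client.properties --topic msk-serverless-tutorial --from-beginning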
You start seeing the messages you entered earlier when you used the console producer command.
5. Enter more messages in the producer window, and watch them appear in the consumer window.
Next Step
Configuration for serverless clusters
message.timestamp.difference.max.ms: default value long.max; editable: yes
You can also use Apache Kafka commands to set or modify topic-level configuration properties for new
or existing topics. For more information about topic-level configuration properties and examples of how
to set them, see Topic-Level Configs in the official Apache Kafka documentation.
Monitoring serverless clusters
Amazon MSK publishes PerSec metrics to CloudWatch at a frequency of once per minute. This
means that the 'SUM' statistic for a one-minute period accurately represents per-second data for
PerSec metrics. To collect per-second data for a period of longer than one minute, use the following
CloudWatch math expression: m1 * 60/PERIOD(m1).
BytesInPerSec (visible after a producer writes to a topic; dimensions: Cluster Name, Topic): The number of bytes per second received from clients. This metric is available for each broker and also for each topic.
BytesOutPerSec (visible after a consumer group consumes from a topic; dimensions: Cluster Name, Topic): The number of bytes per second sent to clients. This metric is available for each broker and also for each topic.
FetchMessageConversionsPerSec (visible after a consumer group consumes from a topic; dimensions: Cluster Name, Topic): The number of fetch message conversions per second for the broker.
MaxOffsetLag (visible after a consumer group consumes from a topic; dimensions: Cluster Name, Consumer Group, Topic): The maximum offset lag across all partitions in a topic.
MessagesInPerSec (visible after a producer writes to a topic; dimensions: Cluster Name, Topic): The number of incoming messages per second for the broker.
ProduceMessageConversionsPerSec (visible after a producer writes to a topic; dimensions: Cluster Name, Topic): The number of produce message conversions per second for the broker.
SumOffsetLag (visible after a consumer group consumes from a topic; dimensions: Cluster Name, Consumer Group, Topic): The aggregated offset lag for all the partitions in a topic.
To view serverless cluster metrics
1. Sign in to the Amazon Web Services Management Console and open the CloudWatch console at https://console.amazonaws.cn/cloudwatch/.
2. In the navigation pane, under Metrics, choose All metrics.
3. In the metrics search box, enter the term kafka.
4. Choose AWS/Kafka / Cluster Name, Topic or AWS/Kafka / Cluster Name, Consumer Group, Topic
to see different metrics.
Cluster states
The following table shows the possible states of a cluster and describes what they mean. It also describes
what actions you can and cannot perform when a cluster is in one of these states. To find out the state
of a cluster, you can visit the Amazon Web Services Management Console. You can also use the describe-
cluster-v2 command or the DescribeClusterV2 operation to describe the cluster. The description of a
cluster includes its state.
Security
Security is a shared responsibility between Amazon and you. The shared responsibility model describes
this as security of the cloud and security in the cloud:
• Security of the cloud – Amazon is responsible for protecting the infrastructure that runs Amazon
services in the Amazon Cloud. Amazon also provides you with services that you can use securely.
Third-party auditors regularly test and verify the effectiveness of our security as part of the Amazon
Compliance Programs. To learn about the compliance programs that apply to Amazon Managed
Streaming for Apache Kafka, see Amazon Web Services in Scope by Compliance Program.
• Security in the cloud – Your responsibility is determined by the Amazon service that you use. You are
also responsible for other factors including the sensitivity of your data, your company's requirements,
and applicable laws and regulations.
This documentation helps you understand how to apply the shared responsibility model when using
Amazon MSK. The following topics show you how to configure Amazon MSK to meet your security and
compliance objectives. You also learn how to use other Amazon Web Services that help you to monitor
and secure your Amazon MSK resources.
Topics
• Data protection in Amazon Managed Streaming for Apache Kafka (p. 57)
• Authentication and authorization for Amazon MSK APIs (p. 61)
• Authentication and authorization for Apache Kafka APIs (p. 73)
• Changing an Amazon MSK cluster's security group (p. 89)
• Controlling access to Apache ZooKeeper (p. 90)
• Logging (p. 92)
• Compliance validation for Amazon Managed Streaming for Apache Kafka (p. 97)
• Resilience in Amazon Managed Streaming for Apache Kafka (p. 97)
• Infrastructure security in Amazon Managed Streaming for Apache Kafka (p. 97)
Data protection in Amazon Managed Streaming for Apache Kafka
For data protection purposes, we recommend that you protect Amazon Web Services account credentials
and set up individual user accounts with Amazon Identity and Access Management (IAM). That way each
user is given only the permissions necessary to fulfill their job duties. We also recommend that you secure your data further, for example by using SSL/TLS to communicate with Amazon resources and by setting up API and user activity logging with Amazon CloudTrail.
We strongly recommend that you never put confidential or sensitive information, such as your
customers' email addresses, into tags or free-form fields such as a Name field. This includes when
you work with Amazon MSK or other Amazon services using the console, API, Amazon CLI, or Amazon
SDKs. Any data that you enter into tags or free-form fields used for names may be used for billing or
diagnostic logs. If you provide a URL to an external server, we strongly recommend that you do not
include credentials information in the URL to validate your request to that server.
Topics
• Amazon MSK encryption (p. 58)
• How do I get started with encryption? (p. 59)
Amazon MSK encryption
Encryption at rest
Amazon MSK integrates with Amazon Key Management Service (KMS) to offer transparent server-side
encryption. Amazon MSK always encrypts your data at rest. When you create an MSK cluster, you can
specify the Amazon KMS key that you want Amazon MSK to use to encrypt your data at rest. If you don't
specify a KMS key, Amazon MSK creates an Amazon managed key for you and uses it on your behalf.
For more information about KMS keys, see Amazon KMS keys in the Amazon Key Management Service
Developer Guide.
Encryption in transit
Amazon MSK uses TLS 1.2. By default, it encrypts data in transit between the brokers of your MSK
cluster. You can override this default at the time you create the cluster.
For communication between clients and brokers, you must specify one of the following three settings:
• TLS. Only allow TLS encrypted data. This is the default setting.
• TLS_PLAINTEXT. Allow both TLS encrypted and plaintext data.
• PLAINTEXT. Only allow plaintext data.
How do I get started with encryption?
Amazon MSK brokers use public Amazon Certificate Manager certificates. Therefore, any truststore that
trusts Amazon Trust Services also trusts the certificates of Amazon MSK brokers.
While we highly recommend enabling in-transit encryption, it can add additional CPU overhead and
a few milliseconds of latency. Most use cases aren't sensitive to these differences, however, and the
magnitude of impact depends on the configuration of your cluster, clients, and usage profile.
The following example shows the encryption settings JSON that you can pass to the create-cluster command:
{
"EncryptionAtRest": {
"DataVolumeKMSKeyId": "arn:aws:kms:us-east-1:123456789012:key/abcdabcd-1234-
abcd-1234-abcd123e8e8e"
},
"EncryptionInTransit": {
"InCluster": true,
"ClientBroker": "TLS"
}
}
For DataVolumeKMSKeyId, you can specify a customer managed key or the Amazon managed key for
MSK in your account (alias/aws/kafka). If you don't specify EncryptionAtRest, Amazon MSK still
encrypts your data at rest under the Amazon managed key. To determine which key your cluster is using,
send a GET request or invoke the DescribeCluster API operation.
For EncryptionInTransit, the default value of InCluster is true, but you can set it to false if you
don't want Amazon MSK to encrypt your data as it passes between brokers.
To specify the encryption mode for data in transit between clients and brokers, set ClientBroker to
one of three values: TLS, TLS_PLAINTEXT, or PLAINTEXT.
1. Save the contents of the previous example in a file and give the file any name that you want. For
example, call it encryption-settings.json.
2. Run the create-cluster command and use the encryption-info option to point to the file
where you saved your configuration JSON. The following is an example.
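A sketch of such a command follows; the broker-node settings file, Apache Kafka version, and broker count are placeholders for your own values, and the cluster name matches the example output below.
aws kafka create-cluster --cluster-name "ExampleClusterName" --broker-node-group-info file://brokernodegroupinfo.json --encryption-info file://encryption-settings.json --kafka-version "2.8.1" --number-of-broker-nodes 3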
{
"ClusterArn": "arn:aws:kafka:us-east-1:123456789012:cluster/SecondTLSTest/
abcdabcd-1234-abcd-1234-abcd123e8e8e",
"ClusterName": "ExampleClusterName",
"State": "CREATING"
}
1. Create a client machine following the guidance in the section called “Step 2: Create a client
machine” (p. 5).
2. Install Apache Kafka on the client machine.
3. Run the following command on a machine that has the Amazon CLI installed, replacing clusterARN
with the ARN of your cluster (a cluster created with ClientBroker set to TLS like the example in
the previous procedure).
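A sketch of the command:
aws kafka describe-cluster --cluster-arn clusterARN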
In the result, look for the value of ZookeeperConnectString and save it because you need it in
the next step.
4. Run the following command on your client machine to create a topic. Replace
ZookeeperConnectString with the value you obtained for ZookeeperConnectString in the
previous step.
<path-to-your-kafka-installation>/bin/kafka-topics.sh --create --
zookeeper ZookeeperConnectString --replication-factor 3 --partitions 1 --topic
TLSTestTopic
5. In this example we use the JVM truststore to talk to the MSK cluster. To do this, first create a folder
named /tmp on the client machine. Then, go to the bin folder of the Apache Kafka installation, and
run the following command. (Your JVM path might be different.)
cp /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.201.b09-0.amzn2.x86_64/jre/lib/security/
cacerts /tmp/kafka.client.truststore.jks
6. While still in the bin folder of the Apache Kafka installation on the client machine, create a text file
named client.properties with the following contents.
security.protocol=SSL
ssl.truststore.location=/tmp/kafka.client.truststore.jks
7. Run the following command on a machine that has the Amazon CLI installed, replacing clusterARN
with the ARN of your cluster.
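A sketch of the command:
aws kafka get-bootstrap-brokers --cluster-arn clusterARN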
A successful result looks like the following. Save this result because you need it for the next step.
{
"BootstrapBrokerStringTls": "a-1.example.g7oein.c2.kafka.us-
east-1.amazonaws.com:0123,a-3.example.g7oein.c2.kafka.us-
east-1.amazonaws.com:0123,a-2.example.g7oein.c2.kafka.us-east-1.amazonaws.com:0123"
}
8. Run the following command to create a console producer on your client machine. Replace
BootstrapBrokerStringTls with the value you obtained in the previous step. Leave this
producer command running.
<path-to-your-kafka-installation>/bin/kafka-console-producer.sh --broker-
list BootstrapBrokerStringTls --producer.config client.properties --topic TLSTestTopic
9. Open a new command window and connect to the same client machine. Then, run the following
command to create a console consumer.
<path-to-your-kafka-installation>/bin/kafka-console-consumer.sh --bootstrap-
server BootstrapBrokerStringTls --consumer.config client.properties --topic
TLSTestTopic
10. In the producer window, type a text message followed by a return, and look for the same message in
the consumer window. Amazon MSK encrypted this message in transit.
For more information about configuring Apache Kafka clients to work with encrypted data, see
Configuring Kafka Clients.
Authentication and authorization for Amazon MSK APIs
This page describes how you can use IAM to control who can perform Amazon MSK operations on your
cluster. For information on how to control who can perform Apache Kafka operations on your cluster, see
the section called “Authentication and authorization for Apache Kafka APIs” (p. 73).
Topics
• How Amazon MSK works with IAM (p. 61)
• Amazon MSK identity-based policy examples (p. 64)
• Using service-linked roles for Amazon MSK (p. 67)
• Amazon managed policies for Amazon MSK (p. 68)
• Troubleshooting Amazon MSK identity and access (p. 72)
How Amazon MSK works with IAM
Topics
• Amazon MSK identity-based policies (p. 61)
• Amazon MSK resource-based policies (p. 64)
• Amazon managed policies (p. 64)
• Authorization based on Amazon MSK tags (p. 64)
• Amazon MSK IAM roles (p. 64)
Amazon MSK supports specific actions, resources, and condition keys. To learn about all of the elements that you use in a JSON policy, see IAM JSON Policy Elements Reference in the IAM User Guide.
Actions
Administrators can use Amazon JSON policies to specify who has access to what. That is, which principal
can perform actions on what resources, and under what conditions.
The Action element of a JSON policy describes the actions that you can use to allow or deny access
in a policy. Policy actions usually have the same name as the associated Amazon API operation. There
are some exceptions, such as permission-only actions that don't have a matching API operation. There
are also some operations that require multiple actions in a policy. These additional actions are called
dependent actions.
Policy actions in Amazon MSK use the following prefix before the action: kafka:. For example, to
grant someone permission to describe an MSK cluster with the Amazon MSK DescribeCluster API
operation, you include the kafka:DescribeCluster action in their policy. Policy statements must
include either an Action or NotAction element. Amazon MSK defines its own set of actions that
describe tasks that you can perform with this service.
To specify multiple actions in a single statement, separate them with commas as follows:
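"Action": [
      "kafka:action1",
      "kafka:action2"
]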
You can specify multiple actions using wildcards (*). For example, to specify all actions that begin with
the word Describe, include the following action:
"Action": "kafka:Describe*"
To see a list of Amazon MSK actions, see Actions, resources, and condition keys for Amazon Managed
Streaming for Apache Kafka in the IAM User Guide.
Resources
Administrators can use Amazon JSON policies to specify who has access to what. That is, which principal
can perform actions on what resources, and under what conditions.
The Resource JSON policy element specifies the object or objects to which the action applies.
Statements must include either a Resource or a NotResource element. As a best practice, specify
a resource using its Amazon Resource Name (ARN). You can do this for actions that support a specific
resource type, known as resource-level permissions.
For actions that don't support resource-level permissions, such as listing operations, use a wildcard (*) to
indicate that the statement applies to all resources.
"Resource": "*"
The Amazon MSK cluster resource has the following ARN format:
arn:${Partition}:kafka:${Region}:${Account}:cluster/${ClusterName}/${UUID}
For more information about the format of ARNs, see Amazon Resource Names (ARNs) and Amazon
Service Namespaces.
For example, to specify the CustomerMessages instance in your statement, use the following ARN:
"Resource": "arn:aws:kafka:us-east-1:123456789012:cluster/CustomerMessages/abcd1234-abcd-
dcba-4321-a1b2abcd9f9f-2"
To specify all instances that belong to a specific account, use the wildcard (*):
"Resource": "arn:aws:kafka:us-east-1:123456789012:cluster/*"
Some Amazon MSK actions, such as those for creating resources, cannot be performed on a specific
resource. In those cases, you must use the wildcard (*).
"Resource": "*"
To specify multiple resources in a single statement, separate the ARNs with commas.
To see a list of Amazon MSK resource types and their ARNs, see Resources Defined by Amazon Managed
Streaming for Apache Kafka in the IAM User Guide. To learn with which actions you can specify the ARN
of each resource, see Actions Defined by Amazon Managed Streaming for Apache Kafka.
Condition keys
Administrators can use Amazon JSON policies to specify who has access to what. That is, which principal
can perform actions on what resources, and under what conditions.
The Condition element (or Condition block) lets you specify conditions in which a statement is in
effect. The Condition element is optional. You can create conditional expressions that use condition
operators, such as equals or less than, to match the condition in the policy with values in the request.
If you specify multiple Condition elements in a statement, or multiple keys in a single Condition
element, Amazon evaluates them using a logical AND operation. If you specify multiple values for a single
condition key, Amazon evaluates the condition using a logical OR operation. All of the conditions must be
met before the statement's permissions are granted.
You can also use placeholder variables when you specify conditions. For example, you can grant an IAM
user permission to access a resource only if it is tagged with their IAM user name. For more information,
see IAM policy elements: variables and tags in the IAM User Guide.
Amazon supports global condition keys and service-specific condition keys. To see all Amazon global condition keys, see Amazon global condition context keys in the IAM User Guide. Amazon MSK defines its own set of condition keys and also supports using some global condition keys.
To see a list of Amazon MSK condition keys, see Condition Keys for Amazon Managed Streaming for
Apache Kafka in the IAM User Guide. To learn with which actions and resources you can use a condition
key, see Actions Defined by Amazon Managed Streaming for Apache Kafka.
Examples
To view examples of Amazon MSK identity-based policies, see Amazon MSK identity-based policy
examples (p. 64).
To view an example identity-based policy for limiting access to a cluster based on the tags on that
cluster, see Accessing Amazon MSK clusters based on tags (p. 66).
Service-linked roles
Service-linked roles allow Amazon Web Services to access resources in other services to complete an
action on your behalf. Service-linked roles appear in your IAM account and are owned by the service. An
IAM administrator can view but not edit the permissions for service-linked roles.
Amazon MSK supports service-linked roles. For details about creating or managing Amazon MSK service-linked roles, see the section called “Service-linked roles” (p. 67).
Amazon MSK identity-based policy examples
To learn how to create an IAM identity-based policy using these example JSON policy documents, see
Creating Policies on the JSON Tab in the IAM User Guide.
Topics
• Policy best practices (p. 65)
• Allow users to view their own permissions (p. 65)
• Accessing one Amazon MSK cluster (p. 66)
• Accessing Amazon MSK clusters based on tags (p. 66)
Policy best practices
• Get started with Amazon managed policies and move toward least-privilege permissions – To get
started granting permissions to your users and workloads, use the Amazon managed policies that grant
permissions for many common use cases. They are available in your Amazon Web Services account.
We recommend that you reduce permissions further by defining Amazon customer managed policies
that are specific to your use cases. For more information, see Amazon managed policies or Amazon
managed policies for job functions in the IAM User Guide.
• Apply least-privilege permissions – When you set permissions with IAM policies, grant only the
permissions required to perform a task. You do this by defining the actions that can be taken on
specific resources under specific conditions, also known as least-privilege permissions. For more
information about using IAM to apply permissions, see Policies and permissions in IAM in the IAM User
Guide.
• Use conditions in IAM policies to further restrict access – You can add a condition to your policies
to limit access to actions and resources. For example, you can write a policy condition to specify that
all requests must be sent using SSL. You can also use conditions to grant access to service actions if
they are used through a specific Amazon Web Service, such as Amazon CloudFormation. For more
information, see IAM JSON policy elements: Condition in the IAM User Guide.
• Use IAM Access Analyzer to validate your IAM policies to ensure secure and functional permissions
– IAM Access Analyzer validates new and existing policies so that the policies adhere to the IAM
policy language (JSON) and IAM best practices. IAM Access Analyzer provides more than 100 policy
checks and actionable recommendations to help you author secure and functional policies. For more
information, see IAM Access Analyzer policy validation in the IAM User Guide.
• Require multi-factor authentication (MFA) – If you have a scenario that requires IAM users or root
users in your account, turn on MFA for additional security. To require MFA when API operations are
called, add MFA conditions to your policies. For more information, see Configuring MFA-protected API
access in the IAM User Guide.
For more information about best practices in IAM, see Security best practices in IAM in the IAM User
Guide.
Allow users to view their own permissions
This example shows how you might create a policy that allows IAM users to view the inline and managed policies that are attached to their user identity. This policy includes permissions to complete this action on the console or programmatically using the Amazon CLI or Amazon API.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "ViewOwnUserInfo",
"Effect": "Allow",
"Action": [
"iam:GetUserPolicy",
"iam:ListGroupsForUser",
"iam:ListAttachedUserPolicies",
"iam:ListUserPolicies",
"iam:GetUser"
],
"Resource": ["arn:aws-cn:iam::*:user/${aws:username}"]
},
{
65
Amazon Managed Streaming for
Apache Kafka Developer Guide
Identity-based policy examples
"Sid": "NavigateInConsole",
"Effect": "Allow",
"Action": [
"iam:GetGroupPolicy",
"iam:GetPolicyVersion",
"iam:GetPolicy",
"iam:ListAttachedGroupPolicies",
"iam:ListGroupPolicies",
"iam:ListPolicyVersions",
"iam:ListPolicies",
"iam:ListUsers"
],
"Resource": "*"
}
]
}
Accessing one Amazon MSK cluster
In this example, you want to grant a user in your Amazon Web Services account access to one of your clusters, purchaseQueriesCluster. The policy allows the user to describe the cluster, get and list information about it, and update it.
{
"Version":"2012-10-17",
"Statement":[
{
"Sid":"UpdateCluster",
"Effect":"Allow",
"Action":[
"kafka:Describe*",
"kafka:Get*",
"kafka:List*",
"kafka:Update*"
],
"Resource":"arn:aws:kafka:us-east-1:012345678012:cluster/purchaseQueriesCluster/
abcdefab-1234-abcd-5678-cdef0123ab01-2"
}
]
}
Accessing Amazon MSK clusters based on tags
The following policy allows a user to perform actions on a cluster only if that cluster has the tag Owner with the value of the user's user name.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AccessClusterIfOwner",
"Effect": "Allow",
"Action": [
"kafka:Describe*",
"kafka:Get*",
"kafka:List*",
"kafka:Update*",
"kafka:Delete*"
],
"Resource": "arn:aws:kafka:us-east-1:012345678012:cluster/*",
"Condition": {
"StringEquals": {
"aws:ResourceTag/Owner": "${aws:username}"
}
}
}
]
}
You can attach this policy to the IAM users in your account. If a user named richard-roe attempts to
update an MSK cluster, the cluster must be tagged Owner=richard-roe or owner=richard-roe.
Otherwise, he is denied access. The condition tag key Owner matches both Owner and owner because
condition key names are not case-sensitive. For more information, see IAM JSON Policy Elements:
Condition in the IAM User Guide.
Using service-linked roles for Amazon MSK
A service-linked role makes setting up Amazon MSK easier because you do not have to manually add the
necessary permissions. Amazon MSK defines the permissions of its service-linked roles. Unless defined
otherwise, only Amazon MSK can assume its roles. The defined permissions include the trust policy and
the permissions policy, and that permissions policy cannot be attached to any other IAM entity.
For information about other services that support service-linked roles, see Amazon Web Services That
Work with IAM, and look for the services that have Yes in the Service-Linked Role column. Choose a Yes
with a link to view the service-linked role documentation for that service.
Topics
• Service-linked role permissions for Amazon MSK (p. 67)
• Creating a service-linked role for Amazon MSK (p. 68)
• Editing a service-linked role for Amazon MSK (p. 68)
• Supported Regions for Amazon MSK service-linked roles (p. 68)
Service-linked role permissions for Amazon MSK
The AWSServiceRoleForKafka service-linked role trusts the following services to assume the role:
• kafka.amazonaws.com
The role permissions policy allows Amazon MSK to complete the following actions on the specified
resources:
• Action: ec2:CreateNetworkInterface on *
• Action: ec2:DescribeNetworkInterfaces on *
• Action: ec2:CreateNetworkInterfacePermission on *
• Action: ec2:AttachNetworkInterface on *
• Action: ec2:DeleteNetworkInterface on *
• Action: ec2:DetachNetworkInterface on *
• Action: acm-pca:GetCertificateAuthorityCertificate on *
• Action: secretsmanager:ListSecrets on *
• Action: secretsmanager:GetResourcePolicy on secrets with the prefix AmazonMSK_ that you
create for Amazon MSK
• Action: secretsmanager:PutResourcePolicy on secrets with the prefix AmazonMSK_ that you
create for Amazon MSK
• Action: secretsmanager:DeleteResourcePolicy on secrets with the prefix AmazonMSK_ that you
create for Amazon MSK
• Action: secretsmanager:DescribeSecret on secrets with the prefix AmazonMSK_ that you create
for Amazon MSK
You must configure permissions to allow an IAM entity (such as a user, group, or role) to create, edit, or
delete a service-linked role. For more information, see Service-Linked Role Permissions in the IAM User
Guide.
Creating a service-linked role for Amazon MSK
If you delete this service-linked role, and then need to create it again, you can use the same process to
recreate the role in your account. When you create an Amazon MSK cluster, Amazon MSK creates the
service-linked role for you again.
Amazon managed policies for Amazon MSK
Amazon Web Services maintains and updates Amazon managed policies. You can't change the permissions
in Amazon managed policies. Services occasionally add additional permissions to an Amazon managed
policy to support new features. This type of update affects all identities (users, groups, and roles) where
the policy is attached. Services are most likely to update an Amazon managed policy when a new feature
is launched or when new operations become available. Services do not remove permissions from an
Amazon managed policy, so policy updates won't break your existing permissions.
Additionally, Amazon supports managed policies for job functions that span multiple services. For
example, the ViewOnlyAccess Amazon managed policy provides read-only access to many Amazon
Web Services and resources. When a service launches a new feature, Amazon adds read-only permissions
for new operations and resources. For a list and descriptions of job function policies, see Amazon
managed policies for job functions in the IAM User Guide.
Amazon managed policy: AmazonMSKFullAccess
This policy grants administrative permissions that allow full access to all Amazon MSK actions, along with the supporting permissions shown in the following policy document.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"kafka:*",
"ec2:DescribeSubnets",
"ec2:DescribeVpcs",
"ec2:DescribeSecurityGroups",
"ec2:DescribeRouteTables",
"ec2:DescribeVpcEndpoints",
"ec2:DescribeVpcAttribute",
"kms:DescribeKey",
"kms:CreateGrant",
"logs:CreateLogDelivery",
"logs:GetLogDelivery",
"logs:UpdateLogDelivery",
"logs:DeleteLogDelivery",
"logs:ListLogDeliveries",
"logs:PutResourcePolicy",
"logs:DescribeResourcePolicies",
"logs:DescribeLogGroups",
"S3:GetBucketPolicy",
"firehose:TagDeliveryStream"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"ec2:CreateVpcEndpoint"
],
"Resource": [
"arn:*:ec2:*:*:vpc/*",
"arn:*:ec2:*:*:subnet/*",
"arn:*:ec2:*:*:security-group/*"
]
},
{
"Effect": "Allow",
"Action": [
"ec2:CreateVpcEndpoint"
],
"Resource": [
"arn:*:ec2:*:*:vpc-endpoint/*"
],
"Condition": {
"StringEquals": {
"aws:RequestTag/AWSMSKManaged": "true"
},
"StringLike": {
"aws:RequestTag/ClusterArn": "*"
}
}
},
{
"Effect": "Allow",
"Action": [
"ec2:CreateTags"
],
"Resource": "arn:*:ec2:*:*:vpc-endpoint/*",
"Condition": {
"StringEquals": {
"ec2:CreateAction": "CreateVpcEndpoint"
}
}
},
{
"Effect": "Allow",
"Action": [
"ec2:DeleteVpcEndpoints"
],
"Resource": "arn:*:ec2:*:*:vpc-endpoint/*",
"Condition": {
"StringEquals": {
"ec2:ResourceTag/AWSMSKManaged": "true"
},
"StringLike": {
"ec2:ResourceTag/ClusterArn": "*"
}
}
},
{
"Effect": "Allow",
"Action": "iam:CreateServiceLinkedRole",
"Resource": "arn:aws:iam::*:role/aws-service-role/kafka.amazonaws.com/
AWSServiceRoleForKafka*",
"Condition": {
"StringLike": {
"iam:AWSServiceName": "kafka.amazonaws.com"
}
}
},
{
"Effect": "Allow",
"Action": [
"iam:AttachRolePolicy",
"iam:PutRolePolicy"
],
"Resource": "arn:aws:iam::*:role/aws-service-role/kafka.amazonaws.com/
AWSServiceRoleForKafka*"
},
{
"Effect": "Allow",
"Action": "iam:CreateServiceLinkedRole",
"Resource": "arn:aws:iam::*:role/aws-service-role/
delivery.logs.amazonaws.com/AWSServiceRoleForLogDelivery*",
"Condition": {
"StringLike": {
"iam:AWSServiceName": "delivery.logs.amazonaws.com"
}
}
}
]
}
Amazon managed policy: AmazonMSKReadOnlyAccess
This policy grants read-only permissions that allow users to view information about Amazon MSK resources:
• The Amazon MSK permissions allow you to list Amazon MSK resources, describe them, and get
information about them.
• The Amazon EC2 permissions are used to describe the Amazon VPC, subnets, security groups, and ENIs
that are associated with a cluster.
• The Amazon KMS permission is used to describe the key that is associated with the cluster.
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"kafka:Describe*",
"kafka:List*",
"kafka:Get*",
"ec2:DescribeNetworkInterfaces",
"ec2:DescribeSecurityGroups",
"ec2:DescribeSubnets",
"ec2:DescribeVpcs",
"kms:DescribeKey"
],
"Effect": "Allow",
"Resource": "*"
}
]
}
Amazon managed policy: KafkaServiceRolePolicy
You can't attach KafkaServiceRolePolicy to your IAM entities. This policy is attached to a service-linked role that allows Amazon MSK to perform actions on your behalf.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ec2:CreateNetworkInterface",
"ec2:DescribeNetworkInterfaces",
"ec2:CreateNetworkInterfacePermission",
"ec2:AttachNetworkInterface",
"ec2:DeleteNetworkInterface",
"ec2:DetachNetworkInterface",
"acm-pca:GetCertificateAuthorityCertificate",
"secretsmanager:ListSecrets"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"secretsmanager:GetResourcePolicy",
"secretsmanager:PutResourcePolicy",
"secretsmanager:DeleteResourcePolicy",
"secretsmanager:DescribeSecret"
],
"Resource": "*",
"Condition": {
"ArnLike": {
"secretsmanager:SecretId":
"arn:*:secretsmanager:*:*:secret:AmazonMSK_*"
}
}
}
]
}
Amazon MSK updates to Amazon managed policies
The following list describes updates to Amazon managed policies for Amazon MSK:
• AmazonMSKFullAccess (p. 69), update to an existing policy: Amazon MSK added new Amazon EC2 permissions to make it possible to connect to a cluster. (November 30, 2021)
• AmazonMSKFullAccess (p. 69), update to an existing policy: Amazon MSK added a new permission to allow it to describe Amazon EC2 route tables. (November 19, 2021)
• Amazon MSK started tracking changes: Amazon MSK started tracking changes for its Amazon managed policies. (November 19, 2021)
Troubleshooting Amazon MSK identity and access
Topics
• I am not authorized to perform an action in Amazon MSK (p. 73)
I am not authorized to perform an action in Amazon MSK
The following example error occurs when the mateojackson IAM user tries to use the console to delete
a cluster but does not have kafka:DeleteCluster permissions.
In this case, Mateo asks his administrator to update his policies to allow him to access the
purchaseQueriesCluster resource using the kafka:DeleteCluster action.
Authentication and authorization for Apache Kafka APIs
For information on how to control who can perform Amazon MSK operations on your cluster, see the
section called “Authentication and authorization for Amazon MSK APIs” (p. 61).
Topics
• IAM access control (p. 73)
• Mutual TLS authentication (p. 81)
• Username and password authentication with Amazon Secrets Manager (p. 85)
• Apache Kafka ACLs (p. 88)
IAM access control
IAM access control for Amazon MSK enables you to handle both authentication and authorization for your cluster. This eliminates the need to use one mechanism for authentication and another for authorization.
Amazon MSK logs access events so you can audit them. For more information, see the section called
“CloudTrail events” (p. 94).
To make IAM access control possible, Amazon MSK makes minor modifications to Apache Kafka source
code. These modifications won't cause a noticeable difference in your Apache Kafka experience.
Important
IAM access control doesn't apply to Apache ZooKeeper nodes. For information about how
you can control access to those nodes, see the section called “Controlling access to Apache
ZooKeeper” (p. 90).
Important
The allow.everyone.if.no.acl.found Apache Kafka setting has no effect if your cluster
uses IAM access control.
73
Amazon Managed Streaming for
Apache Kafka Developer Guide
IAM access control
Important
You can invoke Apache Kafka ACL APIs for an MSK cluster that uses IAM access control. However,
Apache Kafka ACLs stored in Apache ZooKeeper have no effect on authorization for IAM roles.
You must use IAM policies to control access for IAM roles.
• the section called “Create a cluster that uses IAM access control” (p. 74)
• the section called “Configure clients for IAM access control” (p. 74)
• the section called “Create authorization policies” (p. 75)
• the section called “Get the bootstrap brokers for IAM access control” (p. 76)
Create a cluster that uses IAM access control
Use the Amazon Web Services Management Console to create a cluster that uses IAM access
control
Use the API or the Amazon CLI to create a cluster that uses IAM access control
• To create a cluster with IAM access control enabled, use the CreateCluster API or the create-cluster CLI command, and pass the following JSON for the ClientAuthentication parameter: "ClientAuthentication": { "Sasl": { "Iam": { "Enabled": true } } }.
Configure clients for IAM access control
1. Add the following settings to the client's client.properties file. Replace <PATH_TO_TRUST_STORE_FILE> with the fully qualified path to the trust store file on the client.
ssl.truststore.location=<PATH_TO_TRUST_STORE_FILE>
security.protocol=SASL_SSL
sasl.mechanism=AWS_MSK_IAM
sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler
To use a named profile that you created for Amazon credentials, include awsProfileName="your
profile name"; in your client configuration file. For information about named profiles, see
Named profiles in the Amazon CLI documentation.
2. Download the latest stable aws-msk-iam-auth JAR file, and place it in the class path. If you use
Maven, add the following dependency, adjusting the version number as needed:
<dependency>
<groupId>software.amazon.msk</groupId>
<artifactId>aws-msk-iam-auth</artifactId>
<version>1.0.0</version>
</dependency>
The Amazon MSK client plugin is open-sourced under the Apache 2.0 license.
Create authorization policies
Attach an authorization policy to the IAM role that corresponds to the client. For information about how to create an IAM policy, see Creating IAM policies.
The following is an example authorization policy for a cluster named MyTestCluster. To understand the
semantics of the Action and Resource elements, see the section called “Semantics of actions and
resources” (p. 76).
Important
Changes that you make to an IAM policy are reflected in the IAM APIs and the Amazon CLI
immediately. However, it can take noticeable time for the policy change to take effect. In most
cases, policy changes take effect in less than a minute. Network conditions may sometimes
increase the delay.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"kafka-cluster:Connect",
"kafka-cluster:AlterCluster",
"kafka-cluster:DescribeCluster"
],
"Resource": [
"arn:aws:kafka:us-east-1:0123456789012:cluster/MyTestCluster/abcd1234-0123-
abcd-5678-1234abcd-1"
]
},
{
"Effect": "Allow",
"Action": [
"kafka-cluster:*Topic*",
"kafka-cluster:WriteData",
"kafka-cluster:ReadData"
],
"Resource": [
"arn:aws:kafka:us-east-1:0123456789012:topic/MyTestCluster/*"
]
},
{
"Effect": "Allow",
"Action": [
"kafka-cluster:AlterGroup",
"kafka-cluster:DescribeGroup"
],
"Resource": [
"arn:aws:kafka:us-east-1:0123456789012:group/MyTestCluster/*"
]
}
]
}
To learn how to create a policy with action elements that correspond to common Apache Kafka use
cases, like producing and consuming data, see the section called “Common use cases” (p. 80).
Semantics of actions and resources
Actions
The following table lists the actions that you can include in an authorization policy when you use IAM access control for Amazon MSK. When you include an action from the Action column of the table in your authorization policy, you must also include the corresponding actions from the Required actions column.
You can use the asterisk (*) wildcard any number of times in an action after the colon. For example, kafka-cluster:*Topic* stands for all actions that contain the string Topic.
Resources
The following table shows the four types of resources that you can use in an authorization policy
when you use IAM access control for Amazon MSK. You can get the cluster Amazon Resource Name
(ARN) from the Amazon Web Services Management Console or by using the DescribeCluster API or the
describe-cluster Amazon CLI command. You can then use the cluster ARN to construct topic, group, and
transaction ID ARNs. To specify a resource in an authorization policy, use that resource's ARN.
Cluster: arn:aws:kafka:region:account-id:cluster/cluster-name/cluster-uuid
Topic: arn:aws:kafka:region:account-id:topic/cluster-name/cluster-uuid/topic-name
Group: arn:aws:kafka:region:account-id:group/cluster-name/cluster-uuid/group-name
Transaction ID: arn:aws:kafka:region:account-id:transactional-id/cluster-name/cluster-uuid/transactional-id
You can use the asterisk (*) wildcard any number of times anywhere in the part of the ARN that comes after :cluster/, :topic/, :group/, and :transactional-id/. For example, arn:aws:kafka:us-east-1:0123456789012:topic/MyTestCluster/* refers to all the topics in the cluster named MyTestCluster, regardless of the cluster's UUID.
For information about all the actions that are part of IAM access control for Amazon MSK, see the section
called “Semantics of actions and resources” (p. 76).
Note
Actions are denied by default. You must explicitly allow every action that you want to authorize
the client to perform.
Common use cases
The following list shows common use cases and the actions that a client needs in order to perform them. In addition to the actions listed for each use case, every client also needs kafka-cluster:Connect on the cluster.
• Admin: kafka-cluster:*
• Create a topic: kafka-cluster:CreateTopic
• Produce data: kafka-cluster:DescribeTopic, kafka-cluster:WriteData
• Consume data: kafka-cluster:DescribeTopic, kafka-cluster:DescribeGroup, kafka-cluster:AlterGroup, kafka-cluster:ReadData
• Produce data idempotently: kafka-cluster:DescribeTopic, kafka-cluster:WriteData, kafka-cluster:WriteDataIdempotently
• Produce data transactionally: kafka-cluster:DescribeTopic, kafka-cluster:WriteData, kafka-cluster:DescribeTransactionalId, kafka-cluster:AlterTransactionalId
• Describe the dynamic configuration of a cluster: kafka-cluster:DescribeClusterDynamicConfiguration
• Update the dynamic configuration of a cluster: kafka-cluster:DescribeClusterDynamicConfiguration, kafka-cluster:AlterClusterDynamicConfiguration
• Describe the dynamic configuration of a topic: kafka-cluster:DescribeTopicDynamicConfiguration
• Update the dynamic configuration of a topic: kafka-cluster:DescribeTopicDynamicConfiguration, kafka-cluster:AlterTopicDynamicConfiguration
• Alter a topic: kafka-cluster:DescribeTopic, kafka-cluster:AlterTopic
Mutual TLS authentication
The private CA can be either in the same Amazon Web Services account as your cluster, or in a different account. For information about private CAs, see Creating and Managing a Private CA.
Note
TLS authentication is not currently available in the Beijing and Ningxia Regions.
Amazon MSK doesn't support certificate revocation lists (CRLs). To control access to your cluster topics
or block compromised certificates, use Apache Kafka ACLs and Amazon security groups. For information
about using Apache Kafka ACLs, see the section called “Apache Kafka ACLs” (p. 88).
To create a cluster that supports client authentication
1. Create a file named clientauthinfo.json with the following contents. Replace Private-CA-
ARN with the ARN of your PCA.
{
"Tls": {
"CertificateAuthorityArnList": ["Private-CA-ARN"]
}
}
3. Client authentication requires that encryption in transit also be enabled between clients and brokers. Create a file named encryptioninfo.json with contents like the following example. Replace KMS-Key-ARN with the ARN of your KMS key.
{
"EncryptionAtRest": {
"DataVolumeKMSKeyId": "KMS-Key-ARN"
},
"EncryptionInTransit": {
"InCluster": true,
"ClientBroker": "TLS"
}
}
For more information about encryption, see the section called “Encryption” (p. 58).
4. On a machine where you have the Amazon CLI installed, run the following command to create a
cluster with authentication and in-transit encryption enabled. Save the cluster ARN provided in the
response.
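A sketch of such a command; the cluster name, the brokernodegroupinfo.json file, the Apache Kafka version, and the broker count are placeholders for your own values.
aws kafka create-cluster --cluster-name "AuthenticationTest" --broker-node-group-info file://brokernodegroupinfo.json --encryption-info file://encryptioninfo.json --client-authentication file://clientauthinfo.json --kafka-version "2.8.1" --number-of-broker-nodes 3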
To set up a client that uses authentication
4. Run the following command to use the JVM trust store to create your client trust store. (Your JVM path might be different.)
cp /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.201.b09-0.amzn2.x86_64/jre/lib/security/cacerts kafka.client.truststore.jks
5. On your client machine, run the following command to create a private key for your client. Replace
Distinguished-Name, Example-Alias, Your-Store-Pass, and Your-Key-Pass with strings of
your choice.
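A sketch of the keytool invocation, using the placeholders named above; the 300-day validity and the pkcs12 store type are assumptions that you can adjust.
keytool -genkey -keystore kafka.client.keystore.jks -validity 300 -storepass Your-Store-Pass -keypass Your-Key-Pass -dname "CN=Distinguished-Name" -alias Example-Alias -storetype pkcs12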
6. On your client machine, run the following command to create a certificate request with the private
key you created in the previous step.
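A sketch of the certificate-request command, reusing the same placeholders:
keytool -keystore kafka.client.keystore.jks -certreq -file client-cert-sign-request -alias Example-Alias -storepass Your-Store-Pass -keypass Your-Key-Pass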
7. Open the client-cert-sign-request file and ensure that it starts with -----BEGIN
CERTIFICATE REQUEST----- and ends with -----END CERTIFICATE REQUEST-----. If it
starts with -----BEGIN NEW CERTIFICATE REQUEST-----, delete the word NEW (and the single
space that follows it) from the beginning and the end of the file.
8. On a machine where you have the Amazon CLI installed, run the following command to sign your
certificate request. Replace Private-CA-ARN with the ARN of your PCA. You can change the
validity value if you want. Here we use 300 as an example.
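A sketch of the signing command; the SHA256WITHRSA signing algorithm is an assumed example.
aws acm-pca issue-certificate --certificate-authority-arn Private-CA-ARN --csr fileb://client-cert-sign-request --signing-algorithm "SHA256WITHRSA" --validity Value=300,Type="DAYS"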
Mutual TLS authentication
9. Run the following command to get the certificate that ACM signed for you. Replace Certificate-
ARN with the ARN you obtained from the response to the previous command.
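A sketch of the command:
aws acm-pca get-certificate --certificate-authority-arn Private-CA-ARN --certificate-arn Certificate-ARN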
10. From the JSON result of running the previous command, copy the strings associated with
Certificate and CertificateChain. Paste these two strings in a new file named signed-
certificate-from-acm. Paste the string associated with Certificate first, followed by the string
associated with CertificateChain. Replace the \n characters with new lines. The following is the
structure of the file after you paste the certificate and certificate chain in it.
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
11. Run the following command on the client machine to add this certificate to your keystore so you can
present it when you talk to the MSK brokers.
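A sketch of the import command, reusing the earlier placeholders:
keytool -keystore kafka.client.keystore.jks -import -file signed-certificate-from-acm -alias Example-Alias -storepass Your-Store-Pass -keypass Your-Key-Pass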
12. Create a file named client.properties with the following contents. Adjust the truststore and
keystore locations to the paths where you saved kafka.client.truststore.jks.
security.protocol=SSL
ssl.truststore.location=/tmp/kafka_2.12-2.2.1/kafka.client.truststore.jks
ssl.keystore.location=/tmp/kafka_2.12-2.2.1/kafka.client.keystore.jks
ssl.keystore.password=Your-Store-Pass
ssl.key.password=Your-Key-Pass
To produce and consume messages on an authenticated cluster
2. Run the following command to start a console producer. The file named client.properties is
the one you created in the previous procedure.
<path-to-your-kafka-installation>/bin/kafka-console-producer.sh --broker-
list BootstrapBroker-String --topic ExampleTopic --producer.config client.properties
3. In a new command window on your client machine, run the following command to start a console
consumer.
<path-to-your-kafka-installation>/bin/kafka-console-consumer.sh --bootstrap-
server BootstrapBroker-String --topic ExampleTopic --consumer.config client.properties
4. Type messages in the producer window and watch them appear in the consumer window.
Username and password authentication with Amazon Secrets Manager
How it works
Username and password authentication for Amazon MSK uses SASL/SCRAM (Simple Authentication and Security Layer / Salted Challenge Response Authentication Mechanism). To set up username and password authentication for a cluster, you create a Secret resource in Amazon Secrets Manager, and associate user names and passwords with that secret.
SASL/SCRAM is defined in RFC 5802. SCRAM uses secured hashing algorithms, and does not transmit
plaintext passwords between client and server.
Note
When you set up SASL/SCRAM authentication for your cluster, Amazon MSK turns on TLS
encryption for all traffic between clients and brokers.
Note the following requirements when creating a secret for an Amazon MSK cluster:
• Choose Other type of secrets (e.g. API key) for the secret type.
• Your secret name must begin with the prefix AmazonMSK_.
• You must either use an existing custom Amazon KMS key or create a new custom Amazon KMS key for
your secret. Secrets Manager uses the default Amazon KMS key for a secret by default.
Important
A secret created with the default Amazon KMS key cannot be used with an Amazon MSK
cluster.
• Your user and password data must be in the following format to enter key-value pairs using the
Plaintext option.
{
"username": "alice",
"password": "alice-secret"
}
85
Amazon Managed Streaming for
Apache Kafka Developer Guide
SASL/SCRAM authentication
• Record the ARN (Amazon Resource Name) value for your secret.
Important
You can't associate a Secrets Manager secret with a cluster that exceeds the limits described in the section called “Right-size your cluster: Number of partitions per broker” (p. 138).
• If you use the Amazon CLI to create the secret, specify a key ID or ARN for the kms-key-id parameter.
Don't specify an alias.
• To associate the secret with your cluster, use either the Amazon MSK console or the BatchAssociateScramSecret operation.
Important
When you associate a secret with a cluster, Amazon MSK attaches a resource policy to the
secret that allows your cluster to access and read the secret values that you defined. You
should not modify this resource policy. Doing so can prevent your cluster from accessing your
secret.
The following example JSON input for the BatchAssociateScramSecret operation associates a
secret with a cluster:
{
"clusterArn" : "arn:aws:kafka:us-west-2:0123456789019:cluster/SalesCluster/abcd1234-
abcd-cafe-abab-9876543210ab-4",
"secretArnList": [
"arn:aws:secretsmanager:us-west-2:0123456789019:secret:AmazonMSK_MyClusterSecret"
]
}
1. Retrieve your cluster details with the following command. Replace ClusterArn with the Amazon
Resource Name (ARN) of your cluster:
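A sketch of the command:
aws kafka describe-cluster --cluster-arn ClusterArn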
From the JSON result of the command, save the value associated with the string named
ZookeeperConnectString.
2. To create an example topic, run the following command on your client machine. Replace
ZookeeperConnectString with the string you recorded in the previous step.
<path-to-your-kafka-installation>/bin/kafka-topics.sh --create --
zookeeper ZookeeperConnectString --replication-factor 3 --partitions 1 --
topic ExampleTopicName
3. On your client machine, create a JAAS configuration file that contains the user credentials stored
in your secret. For example, for the user alice, create a file called users_jaas.conf with the
following content.
KafkaClient {
org.apache.kafka.common.security.scram.ScramLoginModule required
username="alice"
password="alice-secret";
};
4. Use the following command to export your JAAS config file as a KAFKA_OPTS environment
parameter.
export KAFKA_OPTS=-Djava.security.auth.login.config=<path-to-jaas-file>/users_jaas.conf
Copy the default JDK trust store to use as your Kafka client trust store:
cp /usr/lib/jvm/JDKFolder/jre/lib/security/cacerts /tmp/kafka.client.truststore.jks
7. In the bin directory of your Apache Kafka installation, create a client properties file called
client_sasl.properties with the following contents. This file defines the SASL mechanism and
protocol.
security.protocol=SASL_SSL
sasl.mechanism=SCRAM-SHA-512
ssl.truststore.location=<path-to-keystore-file>/kafka.client.truststore.jks
8. Retrieve your bootstrap brokers string with the following command. Replace ClusterArn with the
Amazon Resource Name (ARN) of your cluster:
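Presumably this is the get-bootstrap-brokers call, along the lines of the following sketch:
aws kafka get-bootstrap-brokers --cluster-arn ClusterArn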
From the JSON result of the command, save the value associated with the string named
BootstrapBrokerStringSaslScram.
9. To produce to the example topic that you created, run the following command on your client
machine. Replace BootstrapBrokerStringSaslScram with the value that you retrieved in the
previous step.
<path-to-your-kafka-installation>/bin/kafka-console-producer.sh --broker-
list BootstrapBrokerStringSaslScram --topic ExampleTopicName --producer.config
client_sasl.properties
10. To consume from the topic you created, run the following command on your client machine. Replace
BootstrapBrokerStringSaslScram with the value that you obtained previously.
<path-to-your-kafka-installation>/bin/kafka-console-consumer.sh --bootstrap-
server BootstrapBrokerStringSaslScram --topic ExampleTopicName --from-beginning --
consumer.config client_sasl.properties
Revoking user access: To revoke a user's credentials to access a cluster, we recommend that you
first remove or enforce an ACL on the cluster, and then disassociate the secret.
For information about using an ACL with Amazon MSK, see Apache Kafka ACLs (p. 88).
We recommend that you restrict access to your zookeeper nodes to prevent users from modifying ACLs.
For more information, see Controlling access to Apache ZooKeeper (p. 90).
Limitations
Note the following limitations when using SCRAM secrets:
Apache Kafka ACLs
Apache Kafka ACLs have the format "Principal P is [Allowed/Denied] Operation O From Host H on any
Resource R matching ResourcePattern RP". If RP doesn't match a specific resource R, then R has no
associated ACLs, and therefore no one other than super users is allowed to access R. To change this
Apache Kafka behavior, you set the property allow.everyone.if.no.acl.found to true. Amazon
MSK sets it to true by default. This means that with Amazon MSK clusters, if you don't explicitly set
ACLs on a resource, all principals can access this resource. If you enable ACLs on a resource, only the
authorized principals can access it. If you want to restrict access to a topic and authorize a client using
TLS mutual authentication, add ACLs using the Apache Kafka authorizer CLI. For more information about
adding, removing, and listing ACLs, see Kafka Authorization Command Line Interface.
In addition to the client, you also need to grant all your brokers access to your topics so that the brokers
can replicate messages from the primary partition. If the brokers don't have access to a topic, replication
for the topic fails.
1. Add your brokers to the ACL table to allow them to read from all topics that have ACLs in place. To
grant your brokers read access to a topic, run the following command on a client machine that can
communicate with the MSK cluster.
<path-to-your-kafka-installation>/bin/kafka-acls.sh --authorizer-properties
zookeeper.connect=ZooKeeper-Connection-String --add --allow-principal
"User:CN=Distinguished-Name" --operation Read --group=* --topic Topic-Name
2. To grant read access to a topic, run the following command on your client machine. If you use
mutual TLS authentication, use the same Distinguished-Name you used when you created the
private key.
<path-to-your-kafka-installation>/bin/kafka-acls.sh --authorizer-properties
zookeeper.connect=ZooKeeper-Connection-String --add --allow-principal
"User:CN=Distinguished-Name" --operation Read --group=* --topic Topic-Name
To remove read access, you can run the same command, replacing --add with --remove.
3. To grant write access to a topic, run the following command on your client machine. If you use
mutual TLS authentication, use the same Distinguished-Name you used when you created the
private key.
<path-to-your-kafka-installation>/bin/kafka-acls.sh --authorizer-properties
zookeeper.connect=ZooKeeper-Connection-String --add --allow-principal
"User:CN=Distinguished-Name" --operation Write --topic Topic-Name
To remove write access, you can run the same command, replacing --add with --remove.
Changing security groups
1. Use the ListNodes API or the list-nodes command in the Amazon CLI to get a list of the brokers in
your cluster. The results of this operation include the IDs of the elastic network interfaces (ENIs) that
are associated with the brokers.
2. Sign in to the Amazon Web Services Management Console and open the Amazon EC2 console at
https://console.amazonaws.cn/ec2/.
3. Using the dropdown list near the top-right corner of the screen, select the Region in which the
cluster is deployed.
4. In the left pane, under Network & Security, choose Network Interfaces.
5. Select the first ENI that you obtained in the first step. Choose the Actions menu at the top of the
screen, then choose Change Security Groups. Assign the new security group to this ENI. Repeat this
step for each of the ENIs that you obtained in the first step.
Note
Changes that you make to a cluster's security group using the Amazon EC2 console aren't
reflected in the MSK console under Network settings.
6. Configure the new security group's rules to ensure that your clients have access to the brokers. For
information about setting security group rules, see Adding, Removing, and Updating Rules in the
Amazon VPC user guide.
Important
If you change the security group that is associated with the brokers of a cluster, and then
add new brokers to that cluster, Amazon MSK associates the new brokers with the original
security group that was associated with the cluster when the cluster was created. However,
for a cluster to work correctly, all of its brokers must be associated with the same security
group. Therefore, if you add new brokers after changing the security group, you must follow the
previous procedure again and update the ENIs of the new brokers.
8. When you select a network interface that corresponds to an Apache ZooKeeper node, choose the
Actions menu at the top of the page, then choose Change Security Groups. Assign a new security
group to this network interface. For information about creating security groups, see Creating a
Security Group in the Amazon VPC documentation.
9. Repeat the previous step to assign the same new security group to all the network interfaces that
are associated with the Apache ZooKeeper nodes of your cluster.
10. You can now choose who has access to this new security group. For information about setting
security group rules, see Adding, Removing, and Updating Rules in the Amazon VPC documentation.
Using TLS security with Apache ZooKeeper
• Clusters must use Apache Kafka version 2.5.1 or later to use TLS security with Apache ZooKeeper.
• Enable TLS security when you create or configure your cluster. Clusters created with Apache
Kafka version 2.5.1 or later with TLS enabled automatically use TLS security with Apache
ZooKeeper endpoints. For information about setting up TLS security, see How do I get started with
encryption? (p. 59).
• Retrieve the TLS Apache ZooKeeper endpoints using the DescribeCluster operation.
• Create an Apache ZooKeeper configuration file for use with the kafka-configs.sh and kafka-
acls.sh tools, or with the ZooKeeper shell. With each tool, you use the --zk-tls-config-file
parameter to specify your Apache ZooKeeper config (a usage sketch follows the KAFKA_OPTS example below).
zookeeper.ssl.client.enable=true
zookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty
zookeeper.ssl.keystore.location=kafka.jks
zookeeper.ssl.keystore.password=test1234
zookeeper.ssl.truststore.location=truststore.jks
zookeeper.ssl.truststore.password=test1234
• For other commands (such as kafka-topics), you must use the KAFKA_OPTS environment variable
to configure Apache ZooKeeper parameters. The following example shows how to configure the
KAFKA_OPTS environment variable to pass Apache ZooKeeper parameters into other commands:
export KAFKA_OPTS="
-Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty
-Dzookeeper.client.secure=true
-Dzookeeper.ssl.trustStore.location=/home/ec2-user/kafka.client.truststore.jks
-Dzookeeper.ssl.trustStore.password=changeit"
After you configure the KAFKA_OPTS environment variable, you can use CLI commands normally. The
following example creates an Apache Kafka topic using the Apache ZooKeeper configuration from the
KAFKA_OPTS environment variable:
<path-to-your-kafka-installation>/bin/kafka-topics.sh --create --
zookeeper ZooKeeperTLSConnectString --replication-factor 3 --partitions 1 --topic
AWSKafkaTutorialTopic
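For the tools that accept the --zk-tls-config-file parameter directly, a usage sketch looks like the following; it assumes the configuration shown earlier is saved as zookeeper-tls.properties, and the exact flags may vary by Apache Kafka version:
<path-to-your-kafka-installation>/bin/kafka-configs.sh --zookeeper ZooKeeperTLSConnectString \
    --zk-tls-config-file zookeeper-tls.properties \
    --entity-type topics --entity-name ExampleTopicName --describe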
Note
The names of the parameters you use in your Apache ZooKeeper configuration file and those
you use in your KAFKA_OPTS environment variable are not consistent. Pay attention to which
names you use with which parameters in your configuration file and KAFKA_OPTS environment
variable.
For more information about accessing your Apache ZooKeeper nodes with TLS, see KIP-515: Enable ZK
client to use the new TLS supported authentication.
Logging
You can deliver Apache Kafka broker logs to one or more of the following destination types: Amazon
CloudWatch Logs, Amazon S3, Amazon Kinesis Data Firehose. You can also log Amazon MSK API calls
with Amazon CloudTrail.
Broker logs
Broker logs enable you to troubleshoot your Apache Kafka applications and to analyze their
communications with your MSK cluster. You can configure your new or existing MSK cluster to deliver
INFO-level broker logs to one or more of the following types of destination resources: a CloudWatch log
group, an S3 bucket, a Kinesis Data Firehose delivery stream. Through Kinesis Data Firehose you can then
deliver the log data from your delivery stream to OpenSearch Service. You must create a destination
resource before you configure your cluster to deliver broker logs to it. Amazon MSK doesn't create these
destination resources for you if they don't already exist. For information about these three types of
destination resources and how to create them, see the documentation for each service.
Note
Amazon MSK does not support delivering broker logs to Kinesis Data Firehose in the Asia Pacific
(Osaka) Region.
Required permissions
To configure a destination for Amazon MSK broker logs, the IAM identity that you use for
Amazon MSK actions must have the permissions described in Amazon managed policy:
AmazonMSKFullAccess (p. 69).
To stream broker logs to an S3 bucket, you also need the s3:PutBucketPolicy permission. For
information about S3 bucket policies, see How Do I Add an S3 Bucket Policy? in the Amazon S3 Console
User Guide. For information about IAM policies in general, see Access Management in the IAM User
Guide.
{
    "Sid": "Allow Amazon MSK to use the key.",
    "Effect": "Allow",
    "Principal": {
        "Service": [
            "delivery.logs.amazonaws.com"
        ]
    },
    "Action": [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*",
        "kms:DescribeKey"
    ],
    "Resource": "*"
}
For an existing cluster, choose the cluster from your list of clusters, then choose the Properties tab.
Scroll down to the Log delivery section and then choose its Edit button. You can specify the destinations
to which you want Amazon MSK to deliver your broker logs.
{
"BrokerLogs": {
"S3": {
"Bucket": "ExampleBucketName",
"Prefix": "ExamplePrefix",
"Enabled": true
},
"Firehose": {
"DeliveryStream": "ExampleDeliveryStreamName",
"Enabled": true
},
"CloudWatchLogs": {
"Enabled": true,
"LogGroup": "ExampleLogGroupName"
}
}
}
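One way to supply this structure from the command line is the update-monitoring command; a sketch, assuming the JSON above is saved as broker-logs.json:
aws kafka update-monitoring \
    --cluster-arn ClusterArn \
    --current-version Current-Cluster-Version \
    --logging-info file://broker-logs.json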
If you use CloudWatch Logs as a log destination and you dynamically enable DEBUG or TRACE
level logging, Amazon MSK may continuously deliver a sample of logs. This can significantly
impact broker performance and should only be used when the INFO log level is not verbose
enough to determine the root cause of an issue.
CloudTrail events
Amazon MSK is integrated with Amazon CloudTrail, a service that provides a record of actions taken by
a user, role, or an Amazon service in Amazon MSK. CloudTrail captures API calls for Amazon MSK as
events. The calls captured include calls from the Amazon MSK console and code calls to the Amazon
MSK API operations. It also captures Apache Kafka actions such as creating and altering topics and groups.
If you create a trail, you can enable continuous delivery of CloudTrail events to an Amazon S3 bucket,
including events for Amazon MSK. If you don't configure a trail, you can still view the most recent
events in the CloudTrail console in Event history. Using the information collected by CloudTrail, you can
determine the request that was made to Amazon MSK or the Apache Kafka action, the IP address from
which the request was made, who made the request, when it was made, and additional details.
To learn more about CloudTrail, including how to configure and enable it, see the Amazon CloudTrail
User Guide.
For an ongoing record of events in your Amazon Web Services account, including events for Amazon
MSK, create a trail. A trail enables CloudTrail to deliver log files to an Amazon S3 bucket. By default,
when you create a trail in the console, the trail applies to all Regions. The trail logs events from all
Regions in the Amazon partition and delivers the log files to the Amazon S3 bucket that you specify.
Additionally, you can configure other Amazon services to further analyze and act upon the event data
collected in CloudTrail logs.
Amazon MSK logs all Amazon MSK operations as events in CloudTrail log files. In addition, it logs the
following Apache Kafka actions.
• kafka-cluster:DescribeClusterDynamicConfiguration
• kafka-cluster:AlterClusterDynamicConfiguration
• kafka-cluster:CreateTopic
• kafka-cluster:DescribeTopicDynamicConfiguration
• kafka-cluster:AlterTopic
• kafka-cluster:AlterTopicDynamicConfiguration
• kafka-cluster:DeleteTopic
Every event or log entry contains information about who generated the request. The identity
information helps you determine the following:
• Whether the request was made with root or Amazon Identity and Access Management (IAM) user
credentials.
• Whether the request was made with temporary security credentials for a role or federated user.
• Whether the request was made by another Amazon service.
The following example shows CloudTrail log entries that demonstrate the DescribeCluster and
DeleteCluster Amazon MSK actions.
{
"Records": [
{
"eventVersion": "1.05",
"userIdentity": {
"type": "IAMUser",
"principalId": "ABCDEF0123456789ABCDE",
"arn": "arn:aws:iam::012345678901:user/Joe",
"accountId": "012345678901",
"accessKeyId": "AIDACKCEVSQ6C2EXAMPLE",
"userName": "Joe"
},
"eventTime": "2018-12-12T02:29:24Z",
"eventSource": "kafka.amazonaws.com",
"eventName": "DescribeCluster",
"awsRegion": "us-east-1",
"sourceIPAddress": "192.0.2.0",
"userAgent": "aws-cli/1.14.67 Python/3.6.0 Windows/10 botocore/1.9.20",
"requestParameters": {
"clusterArn": "arn%3Aaws%3Akafka%3Aus-east-1%3A012345678901%3Acluster
%2Fexamplecluster%2F01234567-abcd-0123-abcd-abcd0123efa-2"
},
"responseElements": null,
"requestID": "bd83f636-fdb5-abcd-0123-157e2fbf2bde",
"eventID": "60052aba-0123-4511-bcde-3e18dbd42aa4",
"readOnly": true,
"eventType": "AwsApiCall",
"recipientAccountId": "012345678901"
},
{
"eventVersion": "1.05",
"userIdentity": {
"type": "IAMUser",
"principalId": "ABCDEF0123456789ABCDE",
"arn": "arn:aws:iam::012345678901:user/Joe",
"accountId": "012345678901",
"accessKeyId": "AIDACKCEVSQ6C2EXAMPLE",
"userName": "Joe"
},
"eventTime": "2018-12-12T02:29:40Z",
"eventSource": "kafka.amazonaws.com",
"eventName": "DeleteCluster",
"awsRegion": "us-east-1",
"sourceIPAddress": "192.0.2.0",
"userAgent": "aws-cli/1.14.67 Python/3.6.0 Windows/10 botocore/1.9.20",
"requestParameters": {
"clusterArn": "arn%3Aaws%3Akafka%3Aus-east-1%3A012345678901%3Acluster
%2Fexamplecluster%2F01234567-abcd-0123-abcd-abcd0123efa-2"
},
"responseElements": {
"clusterArn": "arn:aws:kafka:us-east-1:012345678901:cluster/
examplecluster/01234567-abcd-0123-abcd-abcd0123efa-2",
"state": "DELETING"
},
"requestID": "c6bfb3f7-abcd-0123-afa5-293519897703",
"eventID": "8a7f1fcf-0123-abcd-9bdb-1ebf0663a75c",
"readOnly": false,
"eventType": "AwsApiCall",
"recipientAccountId": "012345678901"
}
]
}
The following example shows a CloudTrail log entry that demonstrates the kafka-
cluster:CreateTopic action.
{
"eventVersion": "1.08",
"userIdentity": {
"type": "IAMUser",
"principalId": "ABCDEFGH1IJKLMN2P34Q5",
"arn": "arn:aws:iam::111122223333:user/Admin",
"accountId": "111122223333",
"accessKeyId": "CDEFAB1C2UUUUU3AB4TT",
"userName": "Admin"
},
"eventTime": "2021-03-01T12:51:19Z",
"eventSource": "kafka-cluster.amazonaws.com",
"eventName": "CreateTopic",
"awsRegion": "us-east-1",
"sourceIPAddress": "198.51.100.0/24",
"userAgent": "aws-msk-iam-auth/unknown-version/aws-internal/3 aws-sdk-java/1.11.970
Linux/4.14.214-160.339.amzn2.x86_64 OpenJDK_64-Bit_Server_VM/25.272-b10 java/1.8.0_272
scala/2.12.8 vendor/Red_Hat,_Inc.",
"requestParameters": {
"kafkaAPI": "CreateTopics",
"resourceARN": "arn:aws:kafka:us-east-1:111122223333:topic/IamAuthCluster/3ebafd8e-
dae9-440d-85db-4ef52679674d-1/Topic9"
},
"responseElements": null,
"requestID": "e7c5e49f-6aac-4c9a-a1d1-c2c46599f5e4",
"eventID": "be1f93fd-4f14-4634-ab02-b5a79cb833d2",
"readOnly": false,
"eventType": "AwsApiCall",
"managementEvent": true,
"eventCategory": "Management",
"recipientAccountId": "111122223333"
}
Compliance validation
For a list of Amazon services in scope of specific compliance programs, see Amazon Web Services in
Scope by Compliance Program. For general information, see Amazon Compliance Programs.
You can download third-party audit reports using Amazon Artifact. For more information, see
Downloading Reports in Amazon Artifact.
Your compliance responsibility when using Amazon MSK is determined by the sensitivity of your data,
your company's compliance objectives, and applicable laws and regulations. Amazon provides the
following resources to help with compliance:
• Security and Compliance Quick Start Guides – These deployment guides discuss architectural
considerations and provide steps for deploying security- and compliance-focused baseline
environments on Amazon.
• Architecting for HIPAA Security and Compliance Whitepaper – This whitepaper describes how
companies can use Amazon to create HIPAA-compliant applications.
• Amazon Compliance Resources – This collection of workbooks and guides might apply to your industry
and location.
• Evaluating Resources with Rules in the Amazon Config Developer Guide – The Amazon Config service
assesses how well your resource configurations comply with internal practices, industry guidelines, and
regulations.
• Amazon Security Hub – This Amazon service provides a comprehensive view of your security state
within Amazon that helps you check your compliance with security industry standards and best
practices.
For more information about Amazon Regions and Availability Zones, see Amazon Global Infrastructure.
Infrastructure security
You use Amazon published API calls to access Amazon MSK through the network. Clients must support
Transport Layer Security (TLS) 1.0 or later. We recommend TLS 1.2 or later. Clients must also support
cipher suites with perfect forward secrecy (PFS) such as Ephemeral Diffie-Hellman (DHE) or Elliptic Curve
Ephemeral Diffie-Hellman (ECDHE). Most modern systems such as Java 7 and later support these modes.
Additionally, requests must be signed by using an access key ID and a secret access key that is associated
with an IAM principal. Or you can use the Amazon Security Token Service (Amazon STS) to generate
temporary security credentials to sign requests.
To connect to your MSK cluster from a client that's outside the cluster's VPC, see the following topics:
Topics
• Public access (p. 99)
• Access from within Amazon but outside cluster's VPC (p. 101)
• Port information (p. 103)
Public access
Amazon MSK gives you the option to turn on public access to the brokers of MSK clusters running
Apache Kafka 2.6.0 or later versions. For security reasons, you can't turn on public access while creating
an MSK cluster. However, you can update an existing cluster to make it publicly accessible. You can also
create a new cluster and then update it to make it publicly accessible.
You can turn on public access to an MSK cluster at no additional cost, but standard Amazon data transfer
costs apply for data transfer in and out of the cluster. For information about pricing, see Amazon EC2
On-Demand Pricing.
To turn on public access to a cluster, first ensure that the cluster meets all of the following conditions:
• The subnets that are associated with the cluster must be public. This means that the subnets must
have an associated route table with an internet gateway attached. For information about how to
create and attach an internet gateway, see Internet gateways in the Amazon VPC user guide.
• Unauthenticated access control must be off and at least one of the following access-control methods
must be on: SASL/IAM, SASL/SCRAM, mTLS. For information about how to update the access-control
method of a cluster, see the section called “Updating security” (p. 28).
• Encryption within the cluster must be turned on. The on setting is the default when creating a cluster.
It's not possible to turn on encryption within the cluster for a cluster that was created with it turned
off. It is therefore not possible to turn on public access for a cluster that was created with encryption
within the cluster turned off.
• Plaintext traffic between brokers and clients must be off. For information about how to turn it off if it's
on, see the section called “Updating security” (p. 28).
• If you are using the SASL/SCRAM or mTLS access-control methods, you must set Apache Kafka
ACLs for your cluster. After you set the Apache Kafka ACLs for your cluster, update the cluster's
configuration to set the property allow.everyone.if.no.acl.found to false for the cluster. For
information about how to update the configuration of a cluster, see the section called “Configuration
operations” (p. 42). If you are using IAM access control and want to apply authorization policies or
update your authorization policies, see the section called “IAM access control” (p. 73). For information
about Apache Kafka ACLs, see the section called “Apache Kafka ACLs” (p. 88).
After you ensure that an MSK cluster meets the conditions listed above, you can use the Amazon Web
Services Management Console, the Amazon CLI, or the Amazon MSK API to turn on public access.
After you turn on public access to a cluster, you can get a public bootstrap-brokers string for it. For
information about getting the bootstrap brokers for a cluster, see the section called “Getting the
bootstrap brokers” (p. 16).
Important
In addition to turning on public access, ensure that the cluster's security groups have inbound
TCP rules that allow public access from your IP address. We recommend that you make these
rules as restrictive as possible. For information about security groups and inbound rules, see
Security groups for your VPC in the Amazon VPC User Guide. For port numbers, see the section
called “Port information” (p. 103). For instructions on how to change a cluster's security group,
see the section called “Changing security groups” (p. 89).
Note
If you use the following instructions to turn on public access and then still cannot access
the cluster, see the section called “Unable to access cluster that has public access turned
on” (p. 135).
1. Sign in to the Amazon Web Services Management Console, and open the Amazon MSK console at
https://console.amazonaws.cn/msk/home?region=us-east-1#/home/.
2. In the list of clusters, choose the cluster for which you want to turn on public access.
3. Choose the Properties tab, then find the Network settings section.
4. Choose Edit public access.
1. Run the following Amazon CLI command, replacing ClusterArn and Current-Cluster-Version
with the ARN and current version of the cluster. To find the current version of the cluster, use the
DescribeCluster operation or the describe-cluster Amazon CLI command. An example version is
KTVPDKIKX0DER.
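A sketch of such a command (the connectivity type shown turns public access on):
aws kafka update-connectivity \
    --cluster-arn ClusterArn \
    --current-version Current-Cluster-Version \
    --connectivity-info '{"PublicAccess": {"Type": "SERVICE_PROVIDED_EIPS"}}'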
The output of this update-connectivity command looks like the following JSON example.
{
"ClusterArn": "arn:aws:kafka:us-east-1:012345678012:cluster/exampleClusterName/
abcdefab-1234-abcd-5678-cdef0123ab01-2",
"ClusterOperationArn": "arn:aws:kafka:us-east-1:012345678012:cluster-
operation/exampleClusterName/abcdefab-1234-abcd-5678-cdef0123ab01-2/0123abcd-
abcd-4f7f-1234-9876543210ef"
}
Note
To turn off public access, use a similar Amazon CLI command, but with the following
connectivity info instead:
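Presumably the connectivity info for turning public access off looks like this, consistent with the DISABLED state shown in the output below:
'{"PublicAccess": {"Type": "DISABLED"}}'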
2. To get the result of the update-connectivity operation, run the following command,
replacing ClusterOperationArn with the ARN that you obtained in the output of the update-
connectivity command.
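For example (a sketch):
aws kafka describe-cluster-operation --cluster-operation-arn ClusterOperationArn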
The output of this describe-cluster-operation command looks like the following JSON
example.
{
"ClusterOperationInfo": {
"ClientRequestId": "982168a3-939f-11e9-8a62-538df00285db",
"ClusterArn": "arn:aws:kafka:us-east-1:012345678012:cluster/exampleClusterName/
abcdefab-1234-abcd-5678-cdef0123ab01-2",
"CreationTime": "2019-06-20T21:08:57.735Z",
"OperationArn": "arn:aws:kafka:us-east-1:012345678012:cluster-
operation/exampleClusterName/abcdefab-1234-abcd-5678-cdef0123ab01-2/0123abcd-
abcd-4f7f-1234-9876543210ef",
"OperationState": "UPDATE_COMPLETE",
"OperationType": "UPDATE_CONNECTIVITY",
"SourceClusterInfo": {
"ConnectivityInfo": {
"PublicAccess": {
"Type": "DISABLED"
}
}
},
"TargetClusterInfo": {
"ConnectivityInfo": {
"PublicAccess": {
"Type": "SERVICE_PROVIDED_EIPS"
}
}
}
}
}
If OperationState has the value UPDATE_IN_PROGRESS, wait a while, then run the describe-
cluster-operation command again.
• To use the API to turn public access to a cluster on or off, see UpdateConnectivity.
Note
For security reasons, Amazon MSK doesn't allow public access to Apache ZooKeeper nodes. For
information about how to control access to the Apache ZooKeeper nodes of your MSK cluster
from within Amazon, see the section called “Controlling access to Apache ZooKeeper” (p. 90).
VPN connections
You can connect your MSK cluster's VPC to remote networks and users using the VPN connectivity
options described in the following topic: VPN Connections.
REST proxies
You can install a REST proxy on an instance running within your cluster's Amazon VPC. REST proxies
enable your producers and consumers to communicate with the cluster through HTTP API requests.
EC2-Classic
Use the following procedure to connect to your cluster from an EC2-Classic instance.
1. Follow the guidance described in ClassicLink to connect your EC2-Classic instance to your cluster's
VPC.
2. Find and copy the private IP associated with your EC2-Classic instance.
3. Using the Amazon CLI, run the following command, replacing ClusterArn with the Amazon
Resource Name (ARN) for your MSK cluster.
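Presumably this is the describe-cluster call, along the lines of the following sketch:
aws kafka describe-cluster --cluster-arn ClusterArn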
4. In the output of the describe-cluster command, look for SecurityGroups and save the ID of
the security group for your MSK cluster.
5. Open the Amazon VPC console at https://console.amazonaws.cn/vpc/.
6. In the left pane, choose Security Groups.
7. Choose the security group whose ID you saved after you ran the describe-cluster command.
Select the box at the beginning of the row corresponding to this security group.
8. In the lower half of the page, choose Inbound Rules.
9. Choose Edit rules, then choose Add Rule.
10. For the Type field, choose All traffic in the drop-down list.
11. Leave the Source set to Custom and enter the private IP of your EC2-Classic instance, followed
immediately by /32 with no intervening spaces.
12. Choose Save rules.
Port information
The following list provides the numbers of the ports that Amazon MSK uses to communicate with client
machines.
Migrating your Apache Kafka cluster to Amazon MSK
An outline of the steps to follow when using MirrorMaker to migrate to an MSK cluster
1. Create the topics in the destination MSK cluster. You can't use MirrorMaker for this step because
it doesn't automatically re-create the topics that you want to migrate with the right replication
level. You can create the topics in Amazon MSK with the same replication factors and numbers of
partitions that they had in CLUSTER_ONPREM, or you can create them with different replication
factors and numbers of partitions.
2. Start MirrorMaker from an instance that has read access to CLUSTER_ONPREM and write access to
CLUSTER_AWSMSK.
3. Run the following command to mirror all topics:
<path-to-your-kafka-installation>/bin/kafka-mirror-maker.sh --consumer.config
config/mirrormaker-consumer.properties --producer.config config/mirrormaker-
producer.properties --whitelist '.*'
5. Check the progress of mirroring by inspecting the lag between the last offset for each topic and the
current offset from which MirrorMaker is consuming.
Remember that MirrorMaker is simply using a consumer and a producer. So, you can check the lag
using the kafka-consumer-groups.sh tool. To find the consumer group name, look inside the
mirrormaker-consumer.properties file for the group.id, and use its value. If there is no such
key in the file, you can create it. For example, set group.id=mirrormaker-consumer-group.
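A sketch of such a lag check, assuming the group.id value shown above:
<path-to-your-kafka-installation>/bin/kafka-consumer-groups.sh --bootstrap-server BootstrapBroker-String \
    --describe --group mirrormaker-consumer-group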
6. After MirrorMaker finishes mirroring all topics, stop all producers and consumers, and then
stop MirrorMaker. Then redirect the producers and consumers to the CLUSTER_AWSMSK cluster
by changing their producer and consumer bootstrap brokers values. Restart all producers and
consumers on CLUSTER_AWSMSK.
• Run MirrorMaker on the destination cluster. This way, if a network problem happens, the messages are
still available in the source cluster. If you run MirrorMaker on the source cluster and events are buffered
in the producer and there is a network issue, events might be lost.
• If encryption is required in transit, run MirrorMaker in the source cluster.
• For consumers, set enable.auto.commit=false.
• For producers, set:
• max.in.flight.requests.per.connection=1
• retries=Int.Max_Value
• acks=all
• max.block.ms=Long.Max_Value
• For a high producer throughput:
• Buffer messages and fill message batches — tune buffer.memory, batch.size, linger.ms
• Tune socket buffers — receive.buffer.bytes, send.buffer.bytes
• To avoid data loss, turn off auto commit at the source, so that MirrorMaker can control the commits,
which it typically does after it receives the ack from the destination cluster. If the producer has acks=all
and the destination cluster has min.insync.replicas set to more than 1, the messages are persisted on
more than one broker at the destination before the MirrorMaker consumer commits the offset at the
source.
• If order is important, you can set retries to 0. Alternatively, for a production environment, set max
inflight connections to 1 to ensure that the batches sent out are not committed out of order if a batch
fails in the middle. This way, each batch sent is retried until the next batch is sent out. If max.block.ms
is not set to the maximum value, and if the producer buffer is full, there can be data loss (depending
on some of the other settings). This can block and back-pressure the consumer.
• For high throughput
• Increase buffer.memory.
Monitoring an Amazon MSK cluster
You can also monitor your MSK cluster with Prometheus, an open-source monitoring application.
For information about Prometheus, see Overview in the Prometheus documentation. To learn
how to monitor your cluster with Prometheus, see the section called “Open monitoring with
Prometheus” (p. 117).
Topics
• Amazon MSK metrics for monitoring with CloudWatch (p. 107)
• Viewing Amazon MSK metrics using CloudWatch (p. 116)
• Consumer-lag monitoring (p. 116)
• Open monitoring with Prometheus (p. 117)
DEFAULT-level metrics are free. Pricing for other metrics is described in the Amazon CloudWatch pricing page.
DEFAULT Level monitoring
BurstBalance (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The remaining balance of input-output burst credits for EBS volumes in the cluster. Use it to investigate latency or decreased throughput.
BytesInPerSec (visible: after you create a topic; dimensions: Cluster Name, Broker ID, Topic)
  The number of bytes per second received from clients. This metric is available per broker and also per topic.
BytesOutPerSec (visible: after you create a topic; dimensions: Cluster Name, Broker ID, Topic)
  The number of bytes per second sent to clients. This metric is available per broker and also per topic.
ConnectionCount (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The number of active authenticated, unauthenticated, and inter-broker connections.
CPUCreditBalance (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  This metric can help you monitor CPU credit balance on the brokers. If your CPU usage is sustained above the baseline level of 20% utilization, you can run out of the CPU credit balance, which can have a negative impact on cluster performance. You can take steps to reduce CPU load. For example, you can reduce the number of client requests or update the broker type to an M5 broker type.
CpuIdle (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The percentage of CPU idle time.
CpuIoWait (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The percentage of CPU idle time during a pending disk operation.
CpuSystem (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The percentage of CPU in kernel space.
CpuUser (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The percentage of CPU in user space.
GlobalTopicCount (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name)
  Total number of topics across all brokers in the cluster.
LeaderCount (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The total number of leaders of partitions per broker, not including replicas.
MaxOffsetLag (visible: after a consumer group consumes from a topic; dimensions: Consumer Group, Topic)
  The maximum offset lag across all partitions in a topic.
MemoryBuffered (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The size in bytes of buffered memory for the broker.
MemoryCached (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The size in bytes of cached memory for the broker.
MemoryFree (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The size in bytes of memory that is free and available for the broker.
HeapMemoryAfterGC (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The percentage of total heap memory in use after garbage collection.
MemoryUsed (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The size in bytes of memory that is in use for the broker.
MessagesInPerSec (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The number of incoming messages per second for the broker.
NetworkRxDropped (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The number of dropped receive packages.
NetworkRxErrors (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The number of network receive errors for the broker.
NetworkRxPackets (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The number of packets received by the broker.
NetworkTxDropped (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The number of dropped transmit packages.
NetworkTxErrors (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The number of network transmit errors for the broker.
NetworkTxPackets (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The number of packets transmitted by the broker.
PartitionCount (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The total number of topic partitions per broker, including replicas.
RequestBytesMean (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The mean number of request bytes for the broker.
RequestTime (visible: after request throttling is applied; dimensions: Cluster Name, Broker ID)
  The average time in milliseconds spent in broker network and I/O threads to process requests.
RootDiskUsed (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The percentage of the root disk used by the broker.
SumOffsetLag (visible: after a consumer group consumes from a topic; dimensions: Consumer Group, Topic)
  The aggregated offset lag for all the partitions in a topic.
SwapFree (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The size in bytes of swap memory that is available for the broker.
SwapUsed (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  The size in bytes of swap memory that is in use for the broker.
TrafficShaping (visible: after the cluster gets to the ACTIVE state; dimensions: Cluster Name, Broker ID)
  High-level metrics indicating the number of packets shaped (dropped or queued) due to exceeding network allocations. Finer detail is available with PER_BROKER metrics.
PER_BROKER Level monitoring
Additional metrics that are available starting at the PER_BROKER monitoring level:
BwInAllowanceExceeded (visible: after the cluster gets to the ACTIVE state)
  The number of packets shaped because the inbound aggregate bandwidth exceeded the maximum for the broker.
BwOutAllowanceExceeded (visible: after the cluster gets to the ACTIVE state)
  The number of packets shaped because the outbound aggregate bandwidth exceeded the maximum for the broker.
ConnTrackAllowanceExceeded (visible: after the cluster gets to the ACTIVE state)
  The number of packets shaped because the connection tracking exceeded the maximum for the broker. Connection tracking is related to security groups that track each connection established to ensure that return packets are delivered as expected.
ConnectionCloseRate (visible: after the cluster gets to the ACTIVE state)
  The number of connections closed per second per listener. This number is aggregated per listener and filtered for the client listeners.
CpuCreditUsage (visible: after the cluster gets to the ACTIVE state)
  This metric can help you monitor CPU credit usage on the instances. If your CPU usage is sustained above the baseline level of 20%, you can run out of the CPU credit balance, which can have a negative impact on cluster performance. You can monitor and alarm on this metric to take corrective actions.
FetchConsumerLocalTimeMsMean (visible: after there's a producer/consumer)
  The mean time in milliseconds that the consumer request is processed at the leader.
FetchConsumerRequestQueueTimeMsMean (visible: after there's a producer/consumer)
  The mean time in milliseconds that the consumer request waits in the request queue.
FetchConsumerResponseQueueTimeMsMean (visible: after there's a producer/consumer)
  The mean time in milliseconds that the consumer request waits in the response queue.
FetchConsumerResponseSendTimeMsMean (visible: after there's a producer/consumer)
  The mean time in milliseconds for the consumer to send a response.
FetchConsumerTotalTimeMsMean (visible: after there's a producer/consumer)
  The mean total time in milliseconds that consumers spend on fetching data from the broker.
FetchFollowerLocalTimeMsMean (visible: after there's a producer/consumer)
  The mean time in milliseconds that the follower request is processed at the leader.
FetchFollowerRequestQueueTimeMsMean (visible: after there's a producer/consumer)
  The mean time in milliseconds that the follower request waits in the request queue.
FetchFollowerResponseQueueTimeMsMean (visible: after there's a producer/consumer)
  The mean time in milliseconds that the follower request waits in the response queue.
FetchFollowerResponseSendTimeMsMean (visible: after there's a producer/consumer)
  The mean time in milliseconds for the follower to send a response.
FetchFollowerTotalTimeMsMean (visible: after there's a producer/consumer)
  The mean total time in milliseconds that followers spend on fetching data from the broker.
PpsAllowanceExceeded (visible: after the cluster gets to the ACTIVE state)
  The number of packets shaped because the bidirectional PPS exceeded the maximum for the broker.
ProduceLocalTimeMsMean (visible: after the cluster gets to the ACTIVE state)
  The mean time in milliseconds that the request is processed at the leader.
ProduceTotalTimeMsMean (visible: after the cluster gets to the ACTIVE state)
  The mean produce time in milliseconds.
ReplicationBytesInPerSec (visible: after you create a topic)
  The number of bytes per second received from other brokers.
ReplicationBytesOutPerSec (visible: after you create a topic)
  The number of bytes per second sent to other brokers.
TcpConnections (visible: after the cluster gets to the ACTIVE state)
  Shows the number of incoming and outgoing TCP segments with the SYN flag set.
VolumeQueueLength (visible: after the cluster gets to the ACTIVE state)
  The number of read and write operation requests waiting to be completed in a specified time period.
VolumeReadBytes (visible: after the cluster gets to the ACTIVE state)
  The number of bytes read in a specified time period.
VolumeTotalReadTime (visible: after the cluster gets to the ACTIVE state)
  The total number of seconds spent by all read operations that completed in a specified time period.
VolumeTotalWriteTime (visible: after the cluster gets to the ACTIVE state)
  The total number of seconds spent by all write operations that completed in a specified time period.
PER_TOPIC_PER_BROKER Level monitoring
MessagesInPerSec (visible: after you create a topic)
  The number of messages received per second.
Only the DEFAULT-level metrics are free. The metrics in this table have the following dimensions: Consumer Group, Topic, Partition.
EstimatedTimeLag (visible: after a consumer group consumes from a topic)
  Time estimate (in seconds) to drain the partition offset lag.
Viewing Amazon MSK metrics using CloudWatch
Sign in to the Amazon Web Services Management Console and open the CloudWatch console at https://console.amazonaws.cn/cloudwatch/.
Consumer-lag monitoring
Monitoring consumer lag allows you to identify slow or stuck consumers that aren't keeping up with
the latest data available in a topic. When necessary, you can then take remedial actions, such as scaling
or rebooting those consumers. To monitor consumer lag, you can use Amazon CloudWatch or open
monitoring with Prometheus.
Consumer lag metrics quantify the difference between the latest data written to your topics and the data
read by your applications. Amazon MSK provides the following consumer-lag metrics, which you can get
through Amazon CloudWatch or through open monitoring with Prometheus: EstimatedMaxTimeLag,
EstimatedTimeLag, MaxOffsetLag, OffsetLag, and SumOffsetLag. For information about these
metrics, see the section called “Amazon MSK metrics for monitoring with CloudWatch” (p. 107).
Amazon MSK supports consumer lag metrics for clusters with Apache Kafka 2.2.1 or a later version.
Note
To turn on consumer-lag monitoring for a cluster that was created before November 23, 2020,
ensure that the cluster is running Apache Kafka 2.2.1 or a later version, then create a support
case.
Open monitoring with Prometheus
1. Sign in to the Amazon Web Services Management Console, and open the Amazon MSK console at
https://console.amazonaws.cn/msk/home?region=us-east-1#/home/.
2. In the Monitoring section, select the check box next to Enable open monitoring with Prometheus.
3. Provide the required information in all the sections of the page, and review all the available options.
4. Choose Create cluster.
• Invoke the create-cluster command and specify its open-monitoring option. Enable the
JmxExporter, the NodeExporter, or both. If you specify open-monitoring, the two exporters
can't be disabled at the same time.
• Invoke the CreateCluster operation and specify OpenMonitoring. Enable the jmxExporter, the
nodeExporter, or both. If you specify OpenMonitoring, the two exporters can't be disabled at the
same time.
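For the CLI path, a sketch of the open-monitoring option with both exporters enabled (the other required create-cluster parameters are omitted here):
aws kafka create-cluster \
    ...other required parameters... \
    --open-monitoring '{"Prometheus": {"JmxExporter": {"EnabledInBroker": true}, "NodeExporter": {"EnabledInBroker": true}}}'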
1. Sign in to the Amazon Web Services Management Console, and open the Amazon MSK console at
https://console.amazonaws.cn/msk/home?region=us-east-1#/home/.
2. Choose the name of the cluster that you want to update. This takes you to a page that contains
details for the cluster.
3. On the Properties tab, scroll down to find the Monitoring section.
4. Choose Edit.
5. Select the check box next to Enable open monitoring with Prometheus.
6. Choose Save changes.
• Invoke the update-monitoring command and specify its open-monitoring option. Enable the
JmxExporter, the NodeExporter, or both. If you specify open-monitoring, the two exporters
can't be disabled at the same time.
• Invoke the UpdateMonitoring operation and specify OpenMonitoring. Enable the jmxExporter,
the nodeExporter, or both. If you specify OpenMonitoring, the two exporters can't be disabled at
the same time.
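A sketch of the CLI variant for an existing cluster:
aws kafka update-monitoring \
    --cluster-arn ClusterArn \
    --current-version Current-Cluster-Version \
    --open-monitoring '{"Prometheus": {"JmxExporter": {"EnabledInBroker": true}, "NodeExporter": {"EnabledInBroker": true}}}'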
Setting up a Prometheus host on an Amazon EC2 instance
# file: prometheus.yml
# my global config
global:
  scrape_interval: 60s
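A scrape configuration that reads the broker targets from the targets.json file created in the next step would look roughly like this (a sketch, assuming Prometheus file-based service discovery):
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'brokers'
    file_sd_configs:
      - files:
          - 'targets.json'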
Create a file named targets.json in the same directory, replacing the broker_dns placeholders with
the DNS names of all of the brokers that you obtained in the previous step. Amazon MSK uses port
11001 for the JMX Exporter and port 11002 for the Node Exporter.
[
{
"labels": {
"job": "jmx"
},
"targets": [
"broker_dns_1:11001",
"broker_dns_2:11001",
.
.
.
"broker_dns_N:11001"
]
},
{
"labels": {
"job": "node"
},
"targets": [
"broker_dns_1:11002",
"broker_dns_2:11002",
.
.
.
"broker_dns_N:11002"
]
}
]
6. To start the Prometheus server on your Amazon EC2 instance, run the following command
in the directory where you extracted the Prometheus files and saved prometheus.yml and
targets.json.
./prometheus
7. Find the IPv4 public IP address of the Amazon EC2 instance where you ran Prometheus in the
previous step. You need this public IP address in the following step.
8. To access the Prometheus web UI, open a browser that can access your Amazon EC2 instance, and go
to Prometheus-Instance-Public-IP:9090, where Prometheus-Instance-Public-IP is the
public IP address you got in the previous step.
Prometheus metrics
All metrics emitted by Apache Kafka to JMX are accessible using open monitoring with Prometheus. For
information about Apache Kafka metrics, see Monitoring in the Apache Kafka documentation. Along
with Apache Kafka metrics, consumer-lag metrics are also available at port 11001 under the JMX MBean
name kafka.consumer.group:type=ConsumerLagMetrics. You can also use the Prometheus Node
Exporter to get CPU and disk metrics for your brokers at port 11002.
Storing Prometheus metrics in Amazon Managed Service for Prometheus
Amazon Managed Service for Prometheus handles the ingestion, storage, querying, and alerting of
your metrics. It also integrates with Amazon security services to give you fast and secure access to
your data. You can use the open-source PromQL query language to query your metrics and alert on them.
For more information, see Getting started with Amazon Managed Service for Prometheus.
Using LinkedIn's Cruise Control for Apache Kafka with Amazon MSK
1. Create an Amazon EC2 instance in the same Amazon VPC as the Amazon MSK cluster.
2. Install Prometheus on the Amazon EC2 instance that you created in the previous step. Note the
private IP and the port. The default port number is 9090. For information on how to configure
Prometheus to aggregate metrics for your cluster, see the section called “Open monitoring with
Prometheus” (p. 117).
3. Download Cruise Control on the Amazon EC2 instance. (Alternatively, you can use a separate
Amazon EC2 instance for Cruise Control if you prefer.) For a cluster that has Apache Kafka version
2.4.*, use the latest 2.4.* Cruise Control release. If your cluster has an Apache Kafka version that is
older than 2.4.*, use the latest 2.0.* Cruise Control release.
4. Decompress the Cruise Control file, then go to the decompressed folder.
5. Run the following command to install git.
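On Amazon Linux, presumably:
sudo yum -y install git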
6. Run the following command to initialize the local repo. Replace Your-Cruise-Control-Folder
with the name of your current folder (the folder that you obtained when you decompressed the
Cruise Control download).
git init && git add . && git commit -m "Init local repo." && git tag -a Your-Cruise-
Control-Folder -m "Init local version."
ssl.truststore.location=/home/ec2-user/kafka.client.truststore.jks
# Change the capacity config file and specify its path; details below
capacity.config.file=config/capacityCores.json
2. Edit the config/capacityCores.json file to specify the right disk size and CPU cores and
network in/out limits. You can use the DescribeCluster API operation (or its CLI equivalent) to obtain
the disk size. For CPU cores and network in/out limits, see Amazon EC2 Instance Types.
{
"brokerCapacities": [
{
"brokerId": "-1",
"capacity": {
"DISK": "10000",
"CPU": {
"num.cores": "2"
},
"NW_IN": "5000000",
"NW_OUT": "5000000"
},
"doc": "This is the default capacity. Capacity unit used for disk is in MB, cpu
is in number of cores, network throughput is in KB."
}
]
}
3. You can optionally install the Cruise Control UI. To download it, go to Setting Up Cruise Control
Frontend.
4. Run the following command to start Cruise Control. Consider using a tool like screen or tmux to
keep a long-running session open.
<path-to-your-kafka-installation>/bin/kafka-cruise-control-start.sh config/
cruisecontrol.properties 9091
5. Use the Cruise Control APIs or the UI to make sure that Cruise Control has the cluster load data and
that it's making rebalancing suggestions. It might take several minutes to get a valid window of
metrics.
Amazon MSK quota
To handle retries on failed connections, you can set the reconnect.backoff.ms configuration
parameter on the client side. For example, if you want a client to retry connections after 1 second, set
reconnect.backoff.ms to 1000. For more information, see reconnect.backoff.ms in the Apache
Kafka documentation.
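For example, a client.properties fragment (a sketch):
# retry failed connections after 1 second
reconnect.backoff.ms=1000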
• Up to 100 configurations per account. To request a higher quota, create a support case.
• A maximum of 50 revisions per configuration.
• To update the configuration or the Apache Kafka version of an MSK cluster, first ensure the number
of partitions per broker is under the limits described in the section called “ Right-size your cluster:
Number of partitions per broker” (p. 138).
Supported Apache Kafka versions
Topics
• Supported Apache Kafka versions (p. 126)
• Updating the Apache Kafka version (p. 129)
Topics
• Apache Kafka version 3.2.0 (p. 126)
• Apache Kafka version 3.1.1 (p. 126)
• Apache Kafka version 2.8.1 (p. 127)
• Apache Kafka version 2.8.0 (p. 127)
• Apache Kafka version 2.7.2 (p. 127)
• Apache Kafka version 2.7.1 (p. 127)
• Apache Kafka version 2.6.3 (p. 127)
• Apache Kafka version 2.6.2 [recommended] (p. 127)
• Apache Kafka version 2.7.0 (p. 127)
• Apache Kafka version 2.6.1 (p. 127)
• Apache Kafka version 2.6.0 (p. 127)
• Apache Kafka version 2.5.1 (p. 127)
• Amazon MSK bug-fix version 2.4.1.1 (p. 128)
• Apache Kafka version 2.4.1 (use 2.4.1.1 instead) (p. 128)
• Apache Kafka version 2.3.1 (p. 129)
• Apache Kafka version 2.2.1 (p. 129)
• Apache Kafka version 1.1.1 (for existing clusters only) (p. 129)
Apache Kafka version 2.5.1
The output of the DescribeCluster operation includes the ZookeeperConnectStringTls node, which
lists the TLS zookeeper endpoints.
The following example shows the ZookeeperConnectStringTls node of the response for the
DescribeCluster operation:
"ZookeeperConnectStringTls": "z-3.awskafkatutorialc.abcd123.c3.kafka.us-
east-1.amazonaws.com:2182,z-2.awskafkatutorialc.abcd123.c3.kafka.us-
east-1.amazonaws.com:2182,z-1.awskafkatutorialc.abcd123.c3.kafka.us-
east-1.amazonaws.com:2182"
For information about using TLS encryption with zookeeper, see Using TLS security with Apache
ZooKeeper (p. 91).
For more information about Apache Kafka version 2.5.1, see its release notes on the Apache Kafka
downloads site.
Amazon MSK bug-fix version 2.4.1.1
We recommend that you use MSK bug-fix version 2.4.1.1 for new Amazon MSK clusters if you prefer
to use Apache Kafka 2.4.1. You can update existing clusters running Apache Kafka version 2.4.1 to this
version to incorporate this fix. For information about upgrading an existing cluster, see Updating the
Apache Kafka version (p. 129).
To work around this issue without upgrading the cluster to version 2.4.1.1, see the Consumer group
stuck in PreparingRebalance state (p. 132) section of the Troubleshooting your Amazon MSK
cluster (p. 132) guide.
Apache Kafka version 2.4.1 (use 2.4.1.1 instead)
KIP-392 is one of the key Kafka Improvement Proposals that are included in the 2.4.1 release of
Apache Kafka. This improvement allows consumers to fetch from the closest replica. To use this
feature, set client.rack in the consumer properties to the ID of the consumer's Availability Zone.
An example AZ ID is use1-az1. Amazon MSK sets broker.rack to the IDs of the Availability
Zones of the brokers. You must also set the replica.selector.class configuration property to
org.apache.kafka.common.replica.RackAwareReplicaSelector, which is an implementation
of rack awareness provided by Apache Kafka.
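As an illustrative sketch (the AZ ID is the example value above; where each property goes depends on your setup), the client-side and server-side settings look like the following:

# Consumer properties (client side); use1-az1 is an example AZ ID
client.rack=use1-az1

# MSK cluster configuration (server side)
replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector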
When you use this version of Apache Kafka, the metrics in the PER_TOPIC_PER_BROKER monitoring
level appear only after their values become nonzero for the first time. For more information about this,
see the section called “PER_TOPIC_PER_BROKER Level monitoring” (p. 115).
For information about how to find Availability Zone IDs, see AZ IDs for Your Resource in the Amazon
Resource Access Manager user guide.
For information about setting configuration properties, see Configuration (p. 34).
For more information about KIP-392, see Allow Consumers to Fetch from Closest Replica in the
Confluence pages.
For more information about Apache Kafka version 2.4.1, see its release notes on the Apache Kafka
downloads site.
Updating the Apache Kafka version
For information about how to make a cluster highly available during an update, see the section called
“Build highly available clusters” (p. 139).
Important
You can't update the Apache Kafka version for an MSK cluster that exceeds the limits described
in the section called “Right-size your cluster: Number of partitions per broker” (p. 138).
Updating the Apache Kafka version using the Amazon CLI
1. Run the following command, replacing ClusterArn with the Amazon Resource Name (ARN) that
you obtained when you created your cluster. If you don't have the ARN for your cluster, you can find
it by listing all clusters. For more information, see the section called “Listing clusters” (p. 17).
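A typical form of this command is the following sketch; it assumes the get-compatible-kafka-versions Amazon CLI command, which corresponds to the GetCompatibleKafkaVersions operation mentioned later in this section.

aws kafka get-compatible-kafka-versions --cluster-arn ClusterArn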
The output of this command includes a list of the Apache Kafka versions to which you can update
the cluster. It looks like the following example.
{
"CompatibleKafkaVersions": [
{
"SourceVersion": "2.2.1",
"TargetVersions": [
"2.3.1",
"2.4.1",
"2.4.1.1",
"2.5.1"
]
}
]
}
2. Run the following command, replacing ClusterArn with the Amazon Resource Name (ARN) that
you obtained when you created your cluster. If you don't have the ARN for your cluster, you can find
it by listing all clusters. For more information, see the section called “Listing clusters” (p. 17).
Replace Current-Cluster-Version with the current version of the cluster. For TargetVersion
you can specify any of the target versions from the output of the previous command.
Important
Cluster versions aren't simple integers. To find the current version of the cluster, use the
DescribeCluster operation or the describe-cluster Amazon CLI command. An example
version is KTVPDKIKX0DER.
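A typical form of this command is the following sketch; it assumes the update-cluster-kafka-version Amazon CLI command, which corresponds to the UpdateClusterKafkaVersion operation.

aws kafka update-cluster-kafka-version --cluster-arn ClusterArn \
    --current-version Current-Cluster-Version \
    --target-kafka-version TargetVersion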
The output of the previous command looks like the following JSON.
"ClusterArn": "arn:aws:kafka:us-east-1:012345678012:cluster/exampleClusterName/
abcdefab-1234-abcd-5678-cdef0123ab01-2",
"ClusterOperationArn": "arn:aws:kafka:us-east-1:012345678012:cluster-
operation/exampleClusterName/abcdefab-1234-abcd-5678-cdef0123ab01-2/0123abcd-
abcd-4f7f-1234-9876543210ef"
}
3. To get the result of the update-cluster-kafka-version operation, run the following command,
replacing ClusterOperationArn with the ARN that you obtained in the output of the update-
cluster-kafka-version command.
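A sketch of this command, which corresponds to the DescribeClusterOperation API operation:

aws kafka describe-cluster-operation --cluster-operation-arn ClusterOperationArn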
The output of this describe-cluster-operation command looks like the following JSON
example.
{
    "ClusterOperationInfo": {
        "ClientRequestId": "62cd41d2-1206-4ebf-85a8-dbb2ba0fe259",
        "ClusterArn": "arn:aws:kafka:us-east-1:012345678012:cluster/exampleClusterName/abcdefab-1234-abcd-5678-cdef0123ab01-2",
        "CreationTime": "2021-03-11T20:34:59.648000+00:00",
        "OperationArn": "arn:aws:kafka:us-east-1:012345678012:cluster-operation/exampleClusterName/abcdefab-1234-abcd-5678-cdef0123ab01-2/0123abcd-abcd-4f7f-1234-9876543210ef",
        "OperationState": "UPDATE_IN_PROGRESS",
        "OperationSteps": [
            {
                "StepInfo": {
                    "StepStatus": "IN_PROGRESS"
                },
                "StepName": "INITIALIZE_UPDATE"
            },
            {
                "StepInfo": {
                    "StepStatus": "PENDING"
                },
                "StepName": "UPDATE_APACHE_KAFKA_BINARIES"
            },
            {
                "StepInfo": {
                    "StepStatus": "PENDING"
                },
                "StepName": "FINALIZE_UPDATE"
            }
        ],
        "OperationType": "UPDATE_CLUSTER_KAFKA_VERSION",
        "SourceClusterInfo": {
            "KafkaVersion": "2.4.1"
        },
        "TargetClusterInfo": {
            "KafkaVersion": "2.6.1"
        }
    }
}
If OperationState has the value UPDATE_IN_PROGRESS, wait a while, then run the
describe-cluster-operation command again. When the operation is complete, the value of
OperationState becomes UPDATE_COMPLETE. Because the time required for Amazon MSK to
complete the operation varies, you might need to check repeatedly until the operation is complete.
Updating the Apache Kafka version using the API
1. Invoke the GetCompatibleKafkaVersions operation to get a list of the Apache Kafka versions to
which you can update the cluster.
2. Invoke the UpdateClusterKafkaVersion operation to update the cluster to one of the compatible
Apache Kafka versions.
Troubleshooting your Amazon MSK cluster
Topics
• Consumer group stuck in PreparingRebalance state (p. 132)
• Error delivering broker logs to Amazon CloudWatch Logs (p. 133)
• No default security group (p. 133)
• Cluster appears stuck in the CREATING state (p. 134)
• Cluster state goes from CREATING to FAILED (p. 134)
• Cluster state is ACTIVE but producers cannot send data or consumers cannot receive data (p. 134)
• Amazon CLI doesn't recognize Amazon MSK (p. 134)
• Partitions go offline or replicas are out of sync (p. 134)
• Disk space is running low (p. 134)
• Memory running low (p. 134)
• Producer gets NotLeaderForPartitionException (p. 135)
• Under-replicated partitions (URP) greater than zero (p. 135)
• Cluster has topics called __amazon_msk_canary and __amazon_msk_canary_state (p. 135)
• Partition replication fails (p. 135)
• Unable to access cluster that has public access turned on (p. 135)
• Unable to access cluster from within Amazon: Networking issues (p. 136)
• Failed authentication: Too many connects (p. 137)
• MSK Serverless: Cluster creation fails (p. 137)
Consumer group stuck in PreparingRebalance state
To resolve this issue, we recommend that you upgrade your cluster to Amazon MSK bug-fix version
2.4.1.1 (p. 128), which contains a fix for this issue. For information about updating an existing cluster to
Amazon MSK bug-fix version 2.4.1.1, see Updating the Apache Kafka version (p. 129).
The workarounds for solving this issue without upgrading the cluster to Amazon MSK bug-fix version
2.4.1.1 are to either set the Kafka clients to use the Static membership protocol (p. 132), or to Identify
and reboot (p. 133) the coordinating broker node of the stuck consumer group.
Static membership protocol
1. Set the group.instance.id property of your Kafka consumer configuration to a static string
that identifies the consumer in the group.
2. Ensure that other instances of the configuration are updated to use the static string.
3. Deploy the changes to your Kafka consumers.
Using Static Membership Protocol is more effective if the session timeout in the client configuration
is set to a duration that allows the consumer to recover without prematurely triggering a consumer
group rebalance. For example, if your consumer application can tolerate 5 minutes of unavailability, a
reasonable value for the session timeout would be 4 minutes instead of the default value of 10 seconds.
Note
Using Static Membership Protocol only reduces the probability of encountering this issue. You
may still encounter this issue even when using Static Membership Protocol.
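As a minimal sketch of the preceding steps (the instance ID shown is a hypothetical example, and the session timeout follows the 4-minute guidance above), the consumer configuration might contain:

# Unique static ID for this consumer instance in the group
group.instance.id=example-consumer-01
# Session timeout sized to the downtime your application tolerates (4 minutes here)
session.timeout.ms=240000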
{"Sid":"AWSLogDeliveryWrite","Effect":"Allow","Principal":
{"Service":"delivery.logs.amazonaws.com"},"Action":
["logs:CreateLogStream","logs:PutLogEvents"],"Resource":["*"]}
If you try to append the JSON above to an existing policy but get an error that says you've reached the
maximum length for the policy you picked, try to append the JSON to another one of your Amazon
CloudWatch Logs policies. After you append the JSON to an existing policy, try once again to set up
broker-log delivery to Amazon CloudWatch Logs.
Cluster state is ACTIVE but producers cannot send data or consumers cannot receive data
• If your producers and consumers have access to the cluster but still experience problems producing
and consuming data, the cause might be KAFKA-7697, which affects Apache Kafka version 2.1.0 and
can lead to a deadlock in one or more brokers. Consider migrating to Apache Kafka 2.2.1, which is not
affected by this bug. For information about how to migrate, see Migration (p. 104).
Under-replicated partitions (URP) greater than zero
• If UnderReplicatedPartitions is spiky, the issue might be that the cluster isn't provisioned at the
right size to handle incoming and outgoing traffic. See Best practices (p. 138).
• If UnderReplicatedPartitions is consistently greater than 0 including during low-traffic periods,
the issue might be that you've set restrictive ACLs that don't grant topic access to brokers. To replicate
partitions, brokers must be authorized to both READ and DESCRIBE topics. DESCRIBE is granted by
default with the READ authorization. For information about setting ACLs, see Authorization and ACLs
in the Apache Kafka documentation.
Unable to access cluster that has public access turned on
1. Ensure that the cluster's security group's inbound rules allow your IP address and the cluster's port.
For a list of cluster port numbers, see the section called “Port information” (p. 103). Also ensure
that the security group's outbound rules allow outbound communications. For more information
about security groups and their inbound and outbound rules, see Security groups for your VPC in the
Amazon VPC User Guide. A CLI sketch for adding such an inbound rule appears after this list.
2. Make sure that your IP address and the cluster's port are allowed in the inbound rules of the cluster's
VPC network ACL. Unlike security groups, network ACLs are stateless. This means that you must
configure both inbound and outbound rules. In the outbound rules, allow all traffic (port range:
0-65535) to your IP address. For more information, see Add and delete rules in the Amazon VPC
User Guide.
3. Make sure that you are using the public-access bootstrap-brokers string to access the cluster. An
MSK cluster that has public access turned on has two different bootstrap-brokers strings, one for
public access, and one for access from within Amazon. For more information, see the section called
“Getting the bootstrap brokers using the Amazon Web Services Management Console” (p. 16).
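As a sketch, adding an inbound rule with the Amazon CLI might look like the following. The security group ID, port, and CIDR range are placeholder examples; take the actual port numbers from the section called “Port information” (p. 103).

aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
    --protocol tcp --port 9194 --cidr 203.0.113.0/24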
Unable to access cluster from within Amazon: Networking issues
1. Use any of the methods described in the section called “Getting the bootstrap brokers” (p. 16) to get
the addresses of the bootstrap brokers.
2. In the following command, replace bootstrap-broker with one of the broker addresses that you
obtained in the previous step. Replace port-number with 9094 if the cluster is set up to use TLS
authentication. If the cluster doesn't use TLS authentication, replace port-number with 9092. Run
the command from the client machine.
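A simple reachability check of this kind is the following sketch; any TCP client works.

telnet bootstrap-broker port-number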
If the client machine is able to access the brokers and the Apache ZooKeeper nodes, this means there
are no connectivity issues. In this case, run the following command to check whether your Apache Kafka
client is set up correctly. To get bootstrap-brokers, use any of the methods described in the section
called “Getting the bootstrap brokers” (p. 16). Replace topic with the name of your topic.
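A minimal sketch of such a check uses the console producer that ships with Apache Kafka. Newer clients accept --bootstrap-server (older ones use --broker-list), and a TLS setup additionally needs --producer.config with your client properties. After the command starts, type a line and press Enter to send a test message.

bin/kafka-console-producer.sh --bootstrap-server bootstrap-brokers --topic topic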
If the previous command succeeds, this means that your client is set up correctly. If you're still unable to
produce and consume from an application, debug the problem at the application level.
If the client machine is unable to access the brokers and the Apache ZooKeeper nodes, see the following
subsections for guidance that is based on your client-machine setup.
Amazon EC2 client and MSK cluster in different VPCs
For information about VPC peering, see Working with VPC Peering Connections.
On-premises client
In the case of an on-premises client that is set up to connect to the MSK cluster using Amazon VPN,
ensure the following:
• The VPN connection status is UP. For information about how to check the VPN connection status, see
How do I check the current status of my VPN tunnel?.
• The route table of the cluster's VPC contains the route for an on-premises CIDR whose target has the
format Virtual private gateway (vgw-xxxxxxxx).
• The MSK cluster's security group allows traffic on port 2181, port 9092 (if your cluster accepts
plaintext traffic), and port 9094 (if your cluster accepts TLS-encrypted traffic).
For more Amazon VPN troubleshooting guidance, see Troubleshooting Client VPN.
If the previous troubleshooting guidance doesn't resolve the issue, ensure that no firewall is blocking
network traffic. For further debugging, use tools like tcpdump and Wireshark to analyze traffic and to
make sure that it is reaching the MSK cluster.
Failed authentication: Too many connects
To learn more about the rate limits for new connections per broker, see the Amazon MSK quota (p. 123)
page.
MSK Serverless: Cluster creation fails
For a complete list of permissions required to perform all Amazon MSK actions, see Amazon managed
policy: AmazonMSKFullAccess (p. 69).
Best practices
This topic outlines some best practices to follow when using Amazon MSK.
Right-size your cluster: Number of partitions per broker
Broker type | Maximum number of partitions per broker
kafka.t3.small | 300
kafka.m5.2xlarge | 2000
If the number of partitions per broker exceeds the maximum value specified in the previous table, you
cannot perform any of the following operations on the cluster:
• Update the cluster configuration
• Update the Apache Kafka version
For guidance on choosing the number of partitions, see Apache Kafka Supports 200K Partitions
Per Cluster. We also recommend that you perform your own testing to determine the right type for
your brokers. For more information about the different broker types, see the section called “Broker
types” (p. 10).
Estimates provided by the MSK sizing spreadsheet are conservative and provide a starting point for a new
cluster. Cluster performance, size, and costs depend on your use case, and we recommend that you
verify them with actual testing.
To understand how the underlying infrastructure affects Apache Kafka performance, see Best practices
for right-sizing your Apache Kafka clusters to optimize performance and cost in the Amazon Big Data
Blog. The blog post provides information about how to size your clusters to meet your throughput,
availability, and latency requirements. It also provides answers to questions such as when you should
scale up versus scale out, and guidance on how to continuously verify the size of your production
clusters.
Monitor CPU usage
You can use Amazon CloudWatch metric math to create a composite metric that is CPU User + CPU
System. Set an alarm that gets triggered when the composite metric reaches an average CPU utilization
of 60%. When this alarm is triggered, scale the cluster using one of the following options (a CLI sketch
of such an alarm appears after these options):
• Option 1 (recommended): Update your broker type to the next larger type. For example, if the current
type is kafka.m5.large, update the cluster to use kafka.m5.xlarge. Keep in mind that when
you update the broker type in the cluster, Amazon MSK takes brokers offline in a rolling fashion
and temporarily reassigns partition leadership to other brokers. A size update typically takes 10-15
minutes per broker.
• Option 2: If there are topics with all messages ingested from producers that use round-robin writes (in
other words, messages aren't keyed and ordering isn't important to consumers), expand your cluster by
adding brokers. Also add partitions to existing topics with the highest throughput. Next, use kafka-
topics.sh --describe to ensure that newly added partitions are assigned to the new brokers. The
main benefit of this option compared to the previous one is that you can manage resources and costs
more granularly. Additionally, you can use this option if CPU load significantly exceeds 60% because
this form of scaling doesn't typically result in increased load on existing brokers.
• Option 3: Expand your cluster by adding brokers, then reassign existing partitions by using the
partition reassignment tool named kafka-reassign-partitions.sh. However, if you use this
option, the cluster will need to spend resources to replicate data from broker to broker after partitions
are reassigned. Compared to the two previous options, this can significantly increase the load on the
cluster at first. As a result, Amazon MSK doesn't recommend using this option when CPU utilization
is above 70% because replication causes additional CPU load and network traffic. Amazon MSK only
recommends using this option if the two previous options aren't feasible.
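The following is a minimal sketch of such a composite-metric alarm using the Amazon CLI. The cluster name, broker ID, and alarm name are placeholders; CpuUser and CpuSystem are the per-broker CPU metrics in the AWS/Kafka namespace.

aws cloudwatch put-metric-alarm --alarm-name example-msk-cpu-over-60 \
    --comparison-operator GreaterThanOrEqualToThreshold \
    --evaluation-periods 3 --threshold 60 \
    --metrics '[
      {"Id":"user","MetricStat":{"Metric":{"Namespace":"AWS/Kafka","MetricName":"CpuUser",
        "Dimensions":[{"Name":"Cluster Name","Value":"exampleClusterName"},{"Name":"Broker ID","Value":"1"}]},
        "Period":300,"Stat":"Average"},"ReturnData":false},
      {"Id":"system","MetricStat":{"Metric":{"Namespace":"AWS/Kafka","MetricName":"CpuSystem",
        "Dimensions":[{"Name":"Cluster Name","Value":"exampleClusterName"},{"Name":"Broker ID","Value":"1"}]},
        "Period":300,"Stat":"Average"},"ReturnData":false},
      {"Id":"total","Expression":"user + system","Label":"CPU User + CPU System","ReturnData":true}
    ]'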
Other recommendations:
• Monitor total CPU utilization per broker as a proxy for load distribution. If brokers have consistently
uneven CPU utilization it might be a sign that load isn't evenly distributed within the cluster. Amazon
MSK recommends using Cruise Control to continuously manage load distribution via partition
assignment.
• Monitor produce and consume latency. Produce and consume latency can increase linearly with CPU
utilization.
Monitor disk space
To avoid running out of disk space for messages, create a CloudWatch alarm that watches the
KafkaDataLogsDiskUsed metric. When the value of this metric reaches or exceeds 85%, take one or
more of the following actions:
• Use the section called “Automatic scaling” (p. 20). You can also manually increase broker storage as
described in the section called “Manual scaling” (p. 22).
• Reduce the message retention period or log size. For information on how to do that, see the section
called “Adjust data retention parameters” (p. 140).
• Delete unused topics.
For information on how to set up and use alarms, see Using Amazon CloudWatch Alarms. For a full list of
Amazon MSK metrics, see Monitoring a cluster (p. 107).
Adjust data retention parameters
To specify a retention policy at the cluster level, set one or more of the following
parameters: log.retention.hours, log.retention.minutes, log.retention.ms, or
log.retention.bytes. For more information, see the section called “Custom configurations” (p. 34).
• To specify a retention time period per topic, use the following command.
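A sketch of such a command; the topic name and retention period are placeholders, and older clients use --zookeeper instead of --bootstrap-server.

bin/kafka-configs.sh --bootstrap-server bootstrap-brokers --alter \
    --entity-type topics --entity-name ExampleTopicName \
    --add-config retention.ms=172800000

This example sets the retention period to 172800000 ms (48 hours).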
• To specify a retention log size per topic, use the following command.
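A sketch of the size-based variant, with placeholder values again:

bin/kafka-configs.sh --bootstrap-server bootstrap-brokers --alter \
    --entity-type topics --entity-name ExampleTopicName \
    --add-config retention.bytes=1073741824

This example caps each partition's log at 1073741824 bytes (1 GiB).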
The retention parameters that you specify at the topic level take precedence over cluster-level
parameters.
Monitor Apache Kafka memory
To determine how much memory Apache Kafka uses, you can monitor the HeapMemoryAfterGC metric.
HeapMemoryAfterGC is the percentage of total heap memory that is in use after garbage collection. We
recommend that you create a CloudWatch alarm that takes action when HeapMemoryAfterGC increases
above 60%.
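A minimal sketch of such an alarm with the Amazon CLI; the cluster name, broker ID, and alarm name are placeholders.

aws cloudwatch put-metric-alarm --alarm-name example-msk-heap-after-gc \
    --namespace AWS/Kafka --metric-name HeapMemoryAfterGC \
    --dimensions 'Name=Cluster Name,Value=exampleClusterName' 'Name=Broker ID,Value=1' \
    --statistic Average --period 300 --evaluation-periods 3 \
    --threshold 60 --comparison-operator GreaterThanOrEqualToThreshold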
The steps that you can take to decrease memory usage vary. They depend on the way that you
configure Apache Kafka. For example, if you use transactional message delivery, you can decrease the
transactional.id.expiration.ms value in your Apache Kafka configuration from 604800000 ms
to 86400000 ms (from 7 days to 1 day). This decreases the memory footprint of each transaction.
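In a custom MSK configuration, for example, this is a single property line:

transactional.id.expiration.ms=86400000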
Reassign partitions
To move partitions to different brokers on the same cluster, you can use the partition reassignment
tool named kafka-reassign-partitions.sh. For example, after you add new brokers to expand
a cluster, you can rebalance that cluster by reassigning partitions to the new brokers. For information
about how to add brokers to a cluster, see the section called “Expanding a cluster” (p. 26). For
information about the partition reassignment tool, see Expanding your cluster in the Apache Kafka
documentation.
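A sketch of the typical generate-then-execute flow follows. The file names and broker IDs are placeholders; newer Kafka versions accept --bootstrap-server, while older ones use --zookeeper.

# Generate a candidate reassignment plan for the topics listed in topics.json,
# spreading them across brokers 4, 5, and 6
bin/kafka-reassign-partitions.sh --bootstrap-server bootstrap-brokers \
    --topics-to-move-json-file topics.json --broker-list "4,5,6" --generate

# Apply the plan that you saved to reassignment.json, then verify it
bin/kafka-reassign-partitions.sh --bootstrap-server bootstrap-brokers \
    --reassignment-json-file reassignment.json --execute
bin/kafka-reassign-partitions.sh --bootstrap-server bootstrap-brokers \
    --reassignment-json-file reassignment.json --verify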
Amazon glossary
For the latest Amazon terminology, see the Amazon glossary in the Amazon General Reference.