AWS Big Data Blog

Developer guidance on how to do local testing with Amazon MSK Serverless

Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed service that makes it easy to build and run Kafka clusters on Amazon Web Services (AWS). When working with Amazon MSK, developers are interested in accessing the service locally. This allows developers to test their application with a Kafka cluster that has the same configuration as production and provides an identical infrastructure to the actual environment without needing to run Kafka locally.

An Amazon MSK Serverless private DNS endpoint is only accessible from Amazon Virtual Private Cloud (Amazon VPC) connections that have been configured to connect. It isn’t directly resolvable from your local development environment. One option is to use AWS Direct Connect or AWS VPN to be able to Connect to Amazon MSK Serverless from your on-premises network. However, building such a solution may incur cost and complexity, and it needs to be set up by a platform team.

This post presents a practical approach to accessing your Amazon MSK environment for development purposes through a bastion host using a Secure Shell (SSH) tunnel (a commonly used secure connection method). Whether you’re working with Amazon MSK Serverless, where public access is unavailable, or with provisioned MSK clusters that are intentionally kept private, this post guides you through the steps to establish a secure connection and seamlessly integrate your local development environment with your MSK resources.

Solution overview

The solution allows you to directly connect to the Amazon MSK Serverless service from your local development environment without using Direct Connect or a VPN. The service is accessed with the bootstrap server DNS endpoint boot-<<xxxxxx>>.c<<x>>.kafka-serverless.<<region-name>>.amazonaws.com on port 9098, then routed through an SSH tunnel to a bastion host, which connects to the MSK Serverless cluster. In the next step, let’s explore how to set up this connection.

The flow of the solution is as follows:

  1. The Kafka client sends a request to connect to the bootstrap server
  2. The DNS query for your MSK Serverless endpoint is routed to a locally configured DNS server
  3. The locally configured DNS server routes the DNS query to localhost.
  4. The SSH tunnel forwards all the traffic on port 9098 from the localhost to the MSK Serverless server through the Amazon Elastic Compute Cloud (Amazon EC2) bastion host.

The following image shows the architecture diagram.

Architecture Diagram for accessing Serverless MSK from local

Prerequisites

Before deploying the solution, you need to have the following resources deployed in your account:

  1. An MSK Serverless cluster configured with AWS Identity and Access Management (IAM) authentication.
  2. A bastion host instance with network access to the MSK Serverless cluster and SSH public key authentication.
  3. AWS CLI configured with an IAM user and able to read and create topics on Amazon MSK. Use the IAM policy from Step 2: Create an IAM role in the Getting started using MSK Serverless clusters
  4. For Windows users, install Linux on Windows with Windows Subsystem for Linux 2 (WSL 2) using Ubuntu 24.04. For guidance, refer to How to install Linux on Windows with WSL.

This guide assumes an MSK Serverless deployment in us-east-1, but it can be used in every AWS Region where MSK Serverless is available. Furthermore, we are using OS X as operating system. In the following steps replace msk-endpoint-url with your MSK Serverless endpoint URL with IAM authentication. The MSK endpoint URL has a format like boot-<<xxxxxx>>.c<<x>>.kafka-serverless.<<region-name>>.amazonaws.com.

Solution walkthrough

To access your Amazon MSK environment for development purposes, use the following walkthrough.

Configure local DNS server OSX

Install Dnsmasq as a local DNS server and configure the resolver to resolve the Amazon MSK. The solution uses Dnsmasq because it can compare DNS requests against a database of patterns and use these to determine the correct response. This functionality can match any request that ends in kafka-serverless.us-east-1.amazonaws.com and send 127.0.0.1 in response. Follow these steps to install Dnsmasq:

  1. Update brew and install Dnsmasq using brew
    brew up
    brew install dnsmasq
  2. Start the Dnsmasq service
    sudo brew services start dnsmasq
  3. Reroute all traffic for Serverless MSK (kafka-serverless.us-east-1.amazonaws.com) to 127.0.0.1
    echo address=/kafka-serverless.us-east-1.amazonaws.com/127.0.0.1 >> $(brew --prefix)/etc/dnsmasq.conf
  4. Reload Dnsmasq configuration and clear cache
    sudo launchctl unload /Library/LaunchDaemons/homebrew.mxcl.dnsmasq.plist
    sudo launchctl load /Library/LaunchDaemons/homebrew.mxcl.dnsmasq.plist
    dscacheutil -flushcache

Configure OS X resolver

Now that you have a working DNS server, you can configure your operating system to use it. Configure the server to send only .kafka-serverless.us-east-1.amazonaws.com queries to Dnsmasq. Most operating systems that are similar to UNIX have a configuration file called /etc/resolv.conf that controls the way DNS queries are performed, including the default server to use for DNS queries. Use the following steps to configure the OS X resolver:

  1. OS X also allows you to configure additional resolvers by creating configuration files in the /etc/resolver/ This directory probably won’t exist on your system, so your first step should be to create it:
    sudo mkdir -p /etc/resolver
  2. Create a new file with the same name as your new top-level domain (kafka-serverless.us-east-1.amazonaws.com) in the /etc/resolver/ directory and add 127.0.0.1 as a nameserver to it by entering the following command.
    sudo tee /etc/resolver/kafka-serverless.us-east-1.amazonaws.com >/dev/null <<EOF
    nameserver 127.0.0.1
    EOF

Configure local DNS server Windows

In Windows Subsystem for Linux, first install Dnsmasq, then configure the resolver to resolve the Amazon MSK and finally add localhost as the first nameserver.

  1. Update apt and install Dnsmasq using apt. Install the telnet utility for later tests:
    sudo apt update
    sudo apt install dnsmasq
    sudo apt install telnet
  2. Reroute all traffic for Serverless MSK (kafka-serverless.us-east-1.amazonaws.com) to 127.0.0.1.
    echo "address=/kafka-serverless.us-east-1.amazonaws.com/127.0.0.1" | sudo tee -a /etc/dnsmasq.conf
  3. Reload Dnsmasq configuration and clear cache.
    sudo /etc/init.d/dnsmasq restart
  4. Open /etc/resolv.conf and add the following code in the first line.
    nameserver 127.0.0.1

    The output should look like the following code.

    #Some comments
    nameserver 127.0.0.1
    nameserver <<your_nameservers>>
    ..

Create SSH tunnel

The next step is to create the SSH tunnel, which will allow any connections made to localhost:9098 on your local machine to be forwarded over the SSH tunnel to the target Kafka broker. Use the following steps to create the SSH tunnel:

  1. Replace bastion-host-dns-endpoint with the public DNS endpoint of the bastion host, which comes in the style of <<xyz>>.compute-1.amazonaws.com, and replace ec2-key-pair.pem with the key pair of the bastion host. Then create the SSH tunnel by entering the following command.
    ssh -i "~/<<ec2-key-pair.pem>>" ec2-user@<<bastion-host-dns-endpoint>> -L 127.0.0.1:9098:<<msk-endpoint-url>>:9098
  2. Leave the SSH tunnel running and open a new terminal window.
  3. Test the connection to the Amazon MSK server by entering the following command.
    telnet <<msk-endpoint-url>> 9098

    The output should look like the following example.

    Trying 127.0.0.1...
    Connected to boot-<<xxxxxxxx>>.c<<x>>.kafka-serverless.us-east-1.amazonaws.com.
    Escape character is '^]'.

Testing

Now configure the Kafka client to use IAM Authentication and then test the setup. You find the latest Kafka installation at the Apache Kafka Download site. Then unzip and copy the content of the Dafka folder into ~/kafka.

  1. Download the IAM authentication and unpack it
    cd ~/kafka/libs
    wget https://github.com/aws/aws-msk-iam-auth/releases/download/v2.2.0/aws-msk-iam-auth-2.2.0-all.jar
    cd ~
  2. Configure Kafka properties to use IAM as the authentication mechanism
    cat <<EOF > ~/kafka/config/client-config.properties
    
    # Sets up TLS for encryption and SASL for authN.
    
    security.protocol = SASL_SSL
    
    # Identifies the SASL mechanism to use.
    
    sasl.mechanism = AWS_MSK_IAM
    
    # Binds SASL client implementation.
    
    sasl.jaas.config = software.amazon.msk.auth.iam.IAMLoginModule required;
    
    
    # Encapsulates constructing a SigV4 signature based on extracted credentials.
    
    # The SASL client bound by "sasl.jaas.config" invokes this class.
    
    sasl.client.callback.handler.class = software.amazon.msk.auth.iam.IAMClientCallbackHandler
    
    EOF
  3. Enter the following command in ~/kafka/bin to create an example topic. Make sure that the SSH tunnel created in the previous section is still open and running.
    ./kafka-topics.sh --bootstrap-server <<msk-endpoint-url>>:9098 --command-config ~/kafka/config/client-config.properties --create --topic ExampleTopic --partitions 10 --replication-factor 3 --config retention.ms=3600000

Cleanup

To remove the solution, complete the following steps for Mac users:

  1. Delete the file /etc/resolver/kafka-serverless.us-east-1.amazonaws.com
  2. Delete the entry address=/kafka-serverless.us-east-1.amazonaws.com/127.0.0.1 in the file $(brew --prefix)/etc/dnsmasq.conf
  3. Stop the Dnsmasq service sudo brew services stop dnsmasq
  4. Remove the Dnsmasq service sudo brew uninstall dnsmasq

To remove the solution, complete the following steps for WSL users:

  1. Delete the file /etc/dnsmasq.conf
  2. Delete the entry nameserver 127.0.0.1 in the file /etc/resolv.conf
  3. Remove the Dnsmasq service sudo apt remove dnsmasq
  4. Remove the telnet utility sudo apt remove telnet

Conclusion

In this post, I presented you with guidance on how developers can connect to Amazon MSK Serverless from local environments. The connection is done using an Amazon MSK endpoint through an SSH tunnel and a bastion host. This enables developers to experiment and test locally, without needing to setup a separate Kafka cluster.


About the Author

Simon Peyer is a Solutions Architect at Amazon Web Services (AWS) based in Switzerland. He is a practical doer and passionate about connecting technology and people using AWS Cloud services. A special focus for him is data streaming and automations. Besides work, Simon enjoys his family, the outdoors, and hiking in the mountains.