Skip to content

Commit

Permalink
Merge pull request #52 from rmoff/kafka-ngrok
Browse files Browse the repository at this point in the history
Kafka ngrok article
  • Loading branch information
rmoff authored Nov 1, 2023
2 parents b4dd85d + 1c005ec commit c2eaa13
Show file tree
Hide file tree
Showing 15 changed files with 323 additions and 2 deletions.
58 changes: 58 additions & 0 deletions content/code/docker-compose-ngrok-kafka.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
---
version: '2'
services:
ngrok:
image: ngrok/ngrok:latest
container_name: ngrok
# Sign up for an ngrok account at https://dashboard.ngrok.com/signup
# Get your auth-token from https://dashboard.ngrok.com/get-started/your-authtoken
# and either put it directly in the file here, or write it to a .env file in
# the same folder as this Docker Compose file in the form
# NGROK_AUTH_TOKEN=your_token_value
command: tcp broker:9092 --log stdout --authtoken $NGROK_AUTH_TOKEN
ports:
- 4040:4040 # Web dashboard for ngrok

zookeeper:
image: confluentinc/cp-zookeeper:7.5.1
container_name: zookeeper
environment:
ZOOKEEPER_CLIENT_PORT: 2181
ZOOKEEPER_TICK_TIME: 2000

broker:
image: confluentinc/cp-kafka:7.5.1
container_name: broker
depends_on:
- zookeeper
- ngrok
environment:
KAFKA_BROKER_ID: 1
KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
KAFKA_LISTENERS: DOCKER://broker:29092, NGROK://broker:9092
KAFKA_ADVERTISED_LISTENERS: DOCKER://broker:29092
KAFKA_INTER_BROKER_LISTENER_NAME: DOCKER
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: DOCKER:PLAINTEXT,NGROK:PLAINTEXT
KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
entrypoint:
- /bin/sh
- -c
- |
echo "Waiting for ngrok tunnel to be created"
while : ; do
curl_status=$$(curl -s -o /dev/null -w %{http_code} http://ngrok:4040/api/tunnels/command_line)
echo -e $$(date) "\tTunnels API HTTP state: " $$curl_status " (waiting for 200)"
if [ $$curl_status -eq 200 ] ; then
break
fi
sleep 5
done
echo "ngrok tunnel is up"
NGROK_LISTENER=$(curl -s http://ngrok:4040/api/tunnels/command_line | grep -Po '"public_url":.*?[^\\]",' | cut -d':' -f2- | tr -d ',"' | sed 's/tcp:\/\//NGROK:\/\//g')
echo $$NGROK_LISTENER
export KAFKA_ADVERTISED_LISTENERS="$$KAFKA_ADVERTISED_LISTENERS, $$NGROK_LISTENER"
echo "KAFKA_ADVERTISED_LISTENERS is set to " $$KAFKA_ADVERTISED_LISTENERS
/etc/confluent/docker/run
262 changes: 262 additions & 0 deletions content/post/kafka-ngrok.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,262 @@
---
draft: false
title: 'Using Apache Kafka with ngrok'
date: "2023-11-01T10:07:58Z"
image: "/images/2023/11/h_IMG_5046.webp"
thumbnail: "/images/2023/11/ngrok02.webp"
credit: "https://twitter.com/rmoff/"
categories:
- ngrok
- Apache Kafka
---

Sometimes you might want to access Apache Kafka that's running on your local machine from another device not on the same network. I'm not sure I can think of a production use-case, but there are a dozen examples for sandbox, demo, and playground environments.

In this post we'll see how you can use [ngrok](https://ngrok.com/) to, in their words, `Put localhost on the internet`. And specifically, your local Kafka broker on the internet.

<!--more-->

## Overview

Why? In my case, I wanted to expose my local Kafka as a source (and target) to [Decodable](https://decodable.co/) so that I can process streams of data with Apache Flink through the managed service that Decodable provides.

![Overview of Kafka solution](/images/2023/11/ngrok01.webp)

The example I'm going to show has ngrok and the Kafka broker running in a Docker Compose environment:

![Overview of Kafka/ngrok solution](/images/2023/11/ngrok02.webp)

### ngrok

ngrok has a free tier to use which works just fine for this, but you will need to [create a free account](https://dashboard.ngrok.com/signup) to get your [auth token](https://dashboard.ngrok.com/get-started/your-authtoken). In this article I'm assuming you've exported your auth token to the environment variable `NGROK_AUTH_TOKEN`.

## Kafka and Listeners and Advertised Listeners…oh my

In theory, using ngrok is straightforward. You configure a _tunnel_ which routes a publicly-accessible host/port to one accessed from the ngrok agent running locally. Here we're telling it to route the public tunnel to a machine called `broker` on port `9092`:

```bash
ngrok tcp broker:9092 --authtoken $NGROK_AUTH_TOKEN
```

From this you get the tunnel url details:

```
t=2023-11-01T10:57:41+0000 lvl=info msg="started tunnel" obj=tunnels name=command_line addr=//broker:9092 url=tcp://6.tcp.eu.ngrok.io:13075
```

The tunnel URL is `6.tcp.eu.ngrok.io:13075`. This is accessible from anywhere on the interwebz—locally or remote. Let's try accessing our Kafka broker from another machine. I'm using my favourite tool, `kcat`. We'll start by interogating the broker (`-b`) for metadata (`-L`):

```bash
$ kcat -L -b 6.tcp.eu.ngrok.io:13075

Metadata for all topics (from broker -1: 6.tcp.eu.ngrok.io:13075/bootstrap):
1 brokers:
broker 1 at localhost:9092 (controller)
0 topics:
```

Look at that! It worked! 👏

Or did it…

```bash
$ echo "hello world" | kcat -b 6.tcp.eu.ngrok.io:13075 -P -t test_topic

%3|1698837177.570|FAIL|rdkafka#producer-1| [thrd:localhost:9092/1]: localhost:9092/1: Connect to ipv4#127.0.0.1:9092 failed: Connection refused (after 0ms in state CONNECT)
% ERROR: Local: Broker transport failure: localhost:9092/1: Connect to ipv4#127.0.0.1:9092 failed: Connection refused (after 0ms in state CONNECT)
%3|1698837177.804|FAIL|rdkafka#producer-1| [thrd:localhost:9092/1]: localhost:9092/1: Connect to ipv6#[::1]:9092 failed: Connection refused (after 0ms in state CONNECT)
% ERROR: Local: Broker transport failure: localhost:9092/1: Connect to ipv6#[::1]:9092 failed: Connection refused (after 0ms in state CONNECT)
[…]
```

oh no! 😖

So here's what's happening. Kafka is a distributed system; it's only in sandbox/demo environments that you'd ever be running just a single broker. For this reason, you have a _bootstrap_ address for one or more servers in the cluster. When you _initially_ connect it's to the bootstrap server (`6.tcp.eu.ngrok.io:13075`).

![The initial bootstrap connection between Kafka broker and client](/images/2023/11/ngrok03a.webp)

The server returns the **`advertised.listener`** for each of the brokers in the cluster as the address at which each of them can be found for subsequent connections.

![The advertised listeners exchange between Kafka broker and client](/images/2023/11/ngrok03b.webp)

After the initial bootstrap connection, the client uses the address that was returned to connect to when producing or consuming records.

![The Kafka client using advertised.listener to find which the broker's connection address for produce/consume](/images/2023/11/ngrok04.webp)

If we introduce ngrok into the mix it looks like this:

1. The Kafka client connects via the public ngrok tunnel address to the broker for bootstrap connection

![Kafka client connects via ngrok to the broker for bootstrap](/images/2023/11/ngrok05.webp)

2. The Kafka broker returns a list of available brokers, with their `advertised.listener` address (`localhost:9092`)

![The Kafka broker returns a list of available brokers, with their `advertised.listener` address](/images/2023/11/ngrok06.webp)

3. The Kafka client tries to connect to the address given—`localhost:9092`—to produce/consume data. Since there's no Kafka broker running local to the Kafka client (i.e. at `localhost:9092`) the connection fails.

![The Kafka client tries to connect to the address given—`localhost:9092`—to produce/consume data. Since there's no Kafka broker running local to the Kafka client (i.e. at `localhost:9092`) the connection fails.](/images/2023/11/ngrok07.webp)

_(If you want to get into this in more detail about this please see my previous article about [working with advertised.listeners](/2018/08/02/kafka-listeners-explained/))_

## Making it work

We need to get the ngrok tunnel URL and configure that as the Kafka broker's advertised.listener:

![Correct configuration of Kafka and ngrok](/images/2023/11/ngrok08.webp)

Let's look at what's actually involved in doing this.

I'm using Docker to run Kafka, so the configuration I'm going to discuss is through the environment variables passed to the image.

The first thing is to configure the **listeners**. This defines where the broker binds to for listening to inbound connections. I'm using two listeners; one for regular traffic between Docker containers, and the other for ngrok traffic. Listeners can be arbitrarily named, so I'm using nice clear labels here. The listener's label servers as a prefix to the host and port.

* `DOCKER`
* `NGROK`

The only—but crucial—difference is the port on which they listen. **The port on which a connection receives determines the listener used, and therefore the advertised listener that's served in response.**

```
KAFKA_LISTENERS: DOCKER://broker:29092, NGROK://broker:9092
```

Now we configure the **advertised listeners**:

```
KAFKA_ADVERTISED_LISTENERS: DOCKER://broker:29092, NGROK://6.tcp.eu.ngrok.io:13075
```

For completeness, we need two more listener configuration items:

```
KAFKA_INTER_BROKER_LISTENER_NAME: DOCKER
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: DOCKER:PLAINTEXT,NGROK:PLAINTEXT
```

## Automating it

Now, this is all very well. But there's a complication. ngrok uses a random host and port *each time the tunnel is created*. This means that to configure the Kafka broker with the correct advertised listener we need to know the tunnel *first*. That also makes building a static configuration for our Kafka broker tricky—for example, in Docker Compose.

So what we need to do is:

1. Launch ngrok and create tunnel
2. Determine the tunnel URL and add it to the advertised listener configuration for Kafka
3. Launch the Kafka broker

What prompted this whole exercise was wanting to build a standalone artefact that could be used to easily spin up Kafka locally to connect to Decodable in the cloud. Manually hacking around config is fine as a one-off, but I wanted something repeatable and as automated as possible.

## Wrangling around Docker Compose

This is what I ended up doing in Docker Compose. I'd love to hear your feedback if there's a smarter or more idiomatic way in which to accomplish the same :)

First, run ngrok and create the tunnel. This is hardcoded to direct traffic to the Kafka container at `broker:9092`.

```yaml
ngrok:
image: ngrok/ngrok:latest
container_name: ngrok
command: tcp broker:9092 --log stdout --authtoken $NGROK_AUTH_TOKEN
ports:
- 4040:4040
```
ngrok has a REST API, which we can query to get the tunnel URL (`public_url`):

```json
$ curl http://localhost:4040/api/tunnels/command_line | jq '.'
{
"name": "command_line",
"ID": "10c495e397bd65f42f3d4ebbd1bb74f5",
"uri": "/api/tunnels/command_line",
"public_url": "tcp://0.tcp.eu.ngrok.io:16761",
"proto": "tcp",
"config": {
"addr": "broker:9092",
"inspect": false
},
```

Now we are ready to look at the Kafka broker. The chaining is defined with the Docker Compose's `depends_on`:

```yaml
depends_on:
- zookeeper
- ngrok
```

What I've done is define the _constant_ listener variables in the Docker Compose service entry for the broker, whilst leaving `KAFKA_ADVERTISED_LISTENERS` with a single entry and nothing for ngrok yet:

```yaml
KAFKA_LISTENERS: DOCKER://broker:29092, NGROK://broker:9092
KAFKA_ADVERTISED_LISTENERS: DOCKER://broker:29092
KAFKA_INTER_BROKER_LISTENER_NAME: DOCKER
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: DOCKER:PLAINTEXT,NGROK:PLAINTEXT
```

I've then overriden the `entrypoint` of the container. First, it will wait for the tunnel to be created:

```bash
echo "Waiting for ngrok tunnel to be created"
while : ; do
curl_status=$(curl -s -o /dev/null -w %{http_code} http://ngrok:4040/api/tunnels/command_line)
echo -e $(date) "\tTunnels API HTTP state: " $curl_status " (waiting for 200)"
if [ $curl_status -eq 200 ] ; then
break
fi
sleep 5
done
echo "ngrok tunnel is up"
```

Then—in the absence of `jq` on the `confluentinc/cp-kafka` image—I use some fairly nasty shell tool code (which will probably break if the JSON structure from the ngrok API changes) to add the tunnel's URL to the advertised listeners environment variable:

```bash
NGROK_LISTENER=$(curl -s http://ngrok:4040/api/tunnels/command_line | grep -Po '"public_url":.*?[^\\]",' | cut -d':' -f2- | tr -d ',"' | sed 's/tcp:\/\//NGROK:\/\//g')
export KAFKA_ADVERTISED_LISTENERS="$KAFKA_ADVERTISED_LISTENERS, $NGROK_LISTENER"
echo "KAFKA_ADVERTISED_LISTENERS is set to " $KAFKA_ADVERTISED_LISTENERS
```

And then, finally, we launch the Kafka broker (using the original value of the Docker image's `entrypoint`):

```bash
/etc/confluent/docker/run
```

## ngrok with Kafka on Docker Compose, in action

_You can find the Docker Compose file [here](/code/docker-compose-ngrok-kafka.yml)_

Launch the Docker Compose stack:

```bash
$ docker compose up
```

Get the ngrok tunnel URL:

```bash
$ curl -s localhost:4040/api/tunnels | jq -r '.tunnels[0].public_url'
tcp://2.tcp.eu.ngrok.io:16738
```

(you can also use the Web UI at http://localhost:4040 to get this info)

Now from anywhere on the interwebz, use your local Kafka broker based on the URL returned from ngrok (minus the `tcp://` prefix):

```bash
$ echo "hello world" | kcat -b 2.tcp.eu.ngrok.io:16738 -P -t test_topic
$
$ kcat -b 2.tcp.eu.ngrok.io:16738 -C -t test_topic
hello world
% Reached end of topic test_topic [0] at offset 1
```

If you go and have a look at your Docker Compose log you'll see information about network traffic flowing through the tunnel:

```
ngrok | t=2023-11-01T12:17:59+0000 lvl=info msg="join connections" obj=join id=8db744a20159 l=192.168.117.4:9092 r=82.20.253.111:34870
```

So there we have it—ngrok and Kafka, nicely automated in a standalone Docker Compose file 😎
File renamed without changes
Binary file added static/images/2023/11/h_IMG_5046.webp
Binary file not shown.
Binary file added static/images/2023/11/ngrok01.webp
Binary file not shown.
Binary file added static/images/2023/11/ngrok02.webp
Binary file not shown.
Binary file added static/images/2023/11/ngrok03a.webp
Binary file not shown.
Binary file added static/images/2023/11/ngrok03b.webp
Binary file not shown.
Binary file added static/images/2023/11/ngrok04.webp
Binary file not shown.
Binary file added static/images/2023/11/ngrok05.webp
Binary file not shown.
Binary file added static/images/2023/11/ngrok06.webp
Binary file not shown.
Binary file added static/images/2023/11/ngrok07.webp
Binary file not shown.
Binary file added static/images/2023/11/ngrok08.webp
Binary file not shown.
2 changes: 1 addition & 1 deletion themes/story/layouts/_default/single.html
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@

{{ define "main" }}
<article class="center bg-white br-3 pv1 ph4 lh-copy f5 nested-links mw8">
{{ .Content | replaceRE "(<h[2-9] id=\"([^\"]+)\".+)(</h[2-9]+>)" "${1}&nbsp;<a class=\"headline-hash\" href=\"#${2}\">#</a> ${3}" | safeHTML }}
{{ .Content | replaceRE "(<h[2-9] id=\"([^\"]+)\".+)(</h[2-9]+>)" "${1}&nbsp;<a class=\"headline-hash\" href=\"#${2}\">🔗</a> ${3}" | safeHTML }}
</article>
{{ end }}

Expand Down
3 changes: 2 additions & 1 deletion themes/story/static/css/story.css
Original file line number Diff line number Diff line change
Expand Up @@ -417,4 +417,5 @@ img[src~="3dbook"] {
.hljs-link {
text-decoration: underline;
}
a.headline-hash {opacity: 40%; text-decoration: none;}
a.headline-hash {opacity: 20%; text-decoration: none; font-size:80%; transition: opacity 0.5s;}
a.headline-hash:hover {opacity: 100%; font-size:100%;}

0 comments on commit c2eaa13

Please sign in to comment.