Skip to content

Commit

Permalink
Kafka ngrok article
Browse files Browse the repository at this point in the history
  • Loading branch information
rmoff committed Nov 1, 2023
1 parent b4dd85d commit d705804
Show file tree
Hide file tree
Showing 12 changed files with 314 additions and 0 deletions.
58 changes: 58 additions & 0 deletions content/code/docker-compose-ngrok-kafka.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
---
version: '2'
services:
ngrok:
image: ngrok/ngrok:latest
container_name: ngrok
# Sign up for an ngrok account at https://dashboard.ngrok.com/signup
# Get your auth-token from https://dashboard.ngrok.com/get-started/your-authtoken
# and either put it directly in the file here, or write it to a .env file in
# the same folder as this Docker Compose file in the form
# NGROK_AUTH_TOKEN=your_token_value
command: tcp broker:9092 --log stdout --authtoken $NGROK_AUTH_TOKEN
ports:
- 4040:4040 # Web dashboard for ngrok

zookeeper:
image: confluentinc/cp-zookeeper:7.5.1
container_name: zookeeper
environment:
ZOOKEEPER_CLIENT_PORT: 2181
ZOOKEEPER_TICK_TIME: 2000

broker:
image: confluentinc/cp-kafka:7.5.1
container_name: broker
depends_on:
- zookeeper
- ngrok
environment:
KAFKA_BROKER_ID: 1
KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
KAFKA_LISTENERS: DOCKER://broker:29092, NGROK://broker:9092
KAFKA_ADVERTISED_LISTENERS: DOCKER://broker:29092
KAFKA_INTER_BROKER_LISTENER_NAME: DOCKER
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: DOCKER:PLAINTEXT,NGROK:PLAINTEXT
KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
entrypoint:
- /bin/sh
- -c
- |
echo "Waiting for ngrok tunnel to be created"
while : ; do
curl_status=$$(curl -s -o /dev/null -w %{http_code} http://ngrok:4040/api/tunnels/command_line)
echo -e $$(date) "\tTunnels API HTTP state: " $$curl_status " (waiting for 200)"
if [ $$curl_status -eq 200 ] ; then
break
fi
sleep 5
done
echo "ngrok tunnel is up"
NGROK_LISTENER=$(curl -s http://ngrok:4040/api/tunnels/command_line | grep -Po '"public_url":.*?[^\\]",' | cut -d':' -f2- | tr -d ',"' | sed 's/tcp:\/\//NGROK:\/\//g')
echo $$NGROK_LISTENER
export KAFKA_ADVERTISED_LISTENERS="$$KAFKA_ADVERTISED_LISTENERS, $$NGROK_LISTENER"
echo "KAFKA_ADVERTISED_LISTENERS is set to " $$KAFKA_ADVERTISED_LISTENERS
/etc/confluent/docker/run
256 changes: 256 additions & 0 deletions content/post/kafka-ngrok.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,256 @@
---
draft: false
title: 'Using Apache Kafka with ngrok'
date: "2023-11-01T10:07:58Z"
image: "/images/2023/11/h_IMG_5046.webp"
thumbnail: "/images/2023/11/ngrok02.webp"
credit: "https://twitter.com/rmoff/"
categories:
- ngrok
- Apache Kafka
---

Sometimes you might want to access Apache Kafka that's running on your local machine from another device not on the same network. I'm not sure I can think of a production use-case, but there are a dozen examples for sandbox, demo, and playground environments.

In this post we'll see how you can use [ngrok](https://ngrok.com/) to, in their words, `Put localhost on the internet`. And specifically, your local Kafka broker on the internet.

<!--more-->

## Overview

Why? In my case, I wanted to expose my local Kafka as a source to [Decodable](https://decodable.co/) so that I can ingest streams of data for processing with Apache Flink through the managed service.

![Overview of Kafka solution](/images/2023/11/ngrok01.webp)

The example I'm going to show has ngrok and the Kafka broker running in a Docker Compose environment:

![Overview of Kafka/ngrok solution](/images/2023/11/ngrok02.webp)

_ngrok has a free tier to use which works jsut fine for this, but you will need to [create a free account](https://dashboard.ngrok.com/signup) to get your [auth token](https://dashboard.ngrok.com/get-started/your-authtoken). In this article I'm assuming you've exported your auth token to the environment variable `NGROK_AUTH_TOKEN`_.

## Kafka and Listeners and Advertised Listeners…oh my

In theory, using ngrok is easy. You configure a _tunnel_ which routes a publicly-accessible host/port to one accessed from the ngrok agent running locally.

```bash
ngrok tcp broker:9092 --authtoken $NGROK_AUTH_TOKEN
```

From this you get the tunnel url details:

```
t=2023-11-01T10:57:41+0000 lvl=info msg="started tunnel" obj=tunnels name=command_line addr=//broker:9092 url=tcp://6.tcp.eu.ngrok.io:13075
```

The tunnel URL is `6.tcp.eu.ngrok.io:13075`. This is accessible from anywhere on the interwebz—locally or remote. Let's try accessing our Kafka broker from another machine. I'm using my favourite tool, `kcat`. We'll start by interogating the broker (`-b`) for metadata (`-L`):

```bash
$ kcat -L -b 6.tcp.eu.ngrok.io:13075

Metadata for all topics (from broker -1: 6.tcp.eu.ngrok.io:13075/bootstrap):
1 brokers:
broker 1 at localhost:9092 (controller)
0 topics:
```

Look at that! It worked! 👏

Or did it…

```bash
$ echo "hello world" | kcat -b 6.tcp.eu.ngrok.io:13075 -P -t test_topic

%3|1698837177.570|FAIL|rdkafka#producer-1| [thrd:localhost:9092/1]: localhost:9092/1: Connect to ipv4#127.0.0.1:9092 failed: Connection refused (after 0ms in state CONNECT)
% ERROR: Local: Broker transport failure: localhost:9092/1: Connect to ipv4#127.0.0.1:9092 failed: Connection refused (after 0ms in state CONNECT)
%3|1698837177.804|FAIL|rdkafka#producer-1| [thrd:localhost:9092/1]: localhost:9092/1: Connect to ipv6#[::1]:9092 failed: Connection refused (after 0ms in state CONNECT)
% ERROR: Local: Broker transport failure: localhost:9092/1: Connect to ipv6#[::1]:9092 failed: Connection refused (after 0ms in state CONNECT)
[…]
```

oh no! 😖

So here's what's happening. The **`advertised.listener`** is the address at which the broker tells the client that it is to be found.

![The intitial bootstrap/advertised listeners exchange between Kafka broker and client](/images/2023/11/ngrok03.webp)

After the initial bootstrap connection its the address which the client will then connect to when producing or consuming records.

![The intitial bootstrap/advertised listeners exchange between Kafka broker and client](/images/2023/11/ngrok04.webp)

If we introduce ngrok into the mix it looks like this:

1. The Kafka client connects via the public ngrok tunnel address to the broker for bootstrap connection

![Kafka client connects via ngrok to the broker for bootstrap](/images/2023/11/ngrok05.webp)

2. The Kafka broker returns a list of available brokers, with their `advertised.listener` address (`localhost:9092`)

![The Kafka broker returns a list of available brokers, with their `advertised.listener` address](/images/2023/11/ngrok06.webp)

3. The Kafka client tries to connect to the address given—`localhost:9092`—to produce/consume data. Since there's no Kafka broker running local to the Kafka client (i.e. at `localhost:9092`) the connection fails.

![The Kafka client tries to connect to the address given—`localhost:9092`—to produce/consume data. Since there's no Kafka broker running local to the Kafka client (i.e. at `localhost:9092`) the connection fails.](/images/2023/11/ngrok07.webp)

_(If you want to get into this in more detail about this please see my previous article about [working with advertised.listeners](/2018/08/02/kafka-listeners-explained/))_

## Making it work

We need to get the ngrok tunnel URL and configure that as the Kafka broker's advertised.listener:

![Correct configuration of Kafka and ngrok](/images/2023/11/ngrok08.webp)

Let's look at what's actually involved in doing this.

I'm using Docker to run Kafka, so the configuration I'm going to discuss is through the environment variables passed to the image.

The first thing is to configure the **listeners**. This defines where the broker binds to for listening to inbound connections. I'm using two listeners; one for regular traffic between Docker containers, and the other for ngrok traffic. Listeners can be arbitrarily named, so I'm using nice clear labels here. The listener's label servers as a prefix to the host and port.

* `DOCKER`
* `NGROK`

The only—but crucial—difference is the port on which they listen. **The port on which a connection receives determines the listener used, and therefore the advertised listener that's served in response.**

```
KAFKA_LISTENERS: DOCKER://broker:29092, NGROK://broker:9092
```

Now we configure the **advertised listeners**:

```
KAFKA_ADVERTISED_LISTENERS: DOCKER://broker:29092, NGROK://6.tcp.eu.ngrok.io:13075
```

For completeness, we need two more listener configuration items:

```
KAFKA_INTER_BROKER_LISTENER_NAME: DOCKER
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: DOCKER:PLAINTEXT,NGROK:PLAINTEXT
```

## Automating it

Now, this is all very well. But there's a complication. ngrok uses a random host and port *each time the tunnel is created*. This means that to configure the Kafka broker with the correct advertised listener we need to know the tunnel *first*. That also makes building a static configuration for our Kafka broker tricky—for example, in Docker Compose.

So what we need to do is:

1. Launch ngrok and create tunnel
2. Determine the tunnel URL and add it to the advertised listener configuration for Kafka
3. Launch the Kafka broker

What prompted this whole exercise was wanting to build a standalone artefact that could be used to easily spin up Kafka locally to connect to Decodable in the cloud. Manually hacking around config is fine as a one-off, but I wanted something repeatable and as automated as possible.

## Wrangling around Docker Compose

This is what I ended up doing in Docker Compose. I'd love to hear your feedback if there's a smarter or more idiomatic way in which to accomplish the same :)

First, run ngrok and create the tunnel. This is hardcoded to direct traffic to the Kafka container at `broker:9092`.

```yaml
ngrok:
image: ngrok/ngrok:latest
container_name: ngrok
command: tcp broker:9092 --log stdout --authtoken $NGROK_AUTH_TOKEN
ports:
- 4040:4040
```
ngrok has a REST API, which we can query to get the tunnel URL (`public_url`):

```json
$ curl http://localhost:4040/api/tunnels/command_line | jq '.'
{
"name": "command_line",
"ID": "10c495e397bd65f42f3d4ebbd1bb74f5",
"uri": "/api/tunnels/command_line",
"public_url": "tcp://0.tcp.eu.ngrok.io:16761",
"proto": "tcp",
"config": {
"addr": "broker:9092",
"inspect": false
},
```

Now we are ready to look at the Kafka broker. The chaining is defined with the Docker Compose's `depends_on`:

```yaml
depends_on:
- zookeeper
- ngrok
```

What I've done is define the _constant_ listener variables in the Docker Compose service entry for the broker, whilst leaving `KAFKA_ADVERTISED_LISTENERS` with a single entry and nothing for ngrok yet:

```yaml
KAFKA_LISTENERS: DOCKER://broker:29092, NGROK://broker:9092
KAFKA_ADVERTISED_LISTENERS: DOCKER://broker:29092
KAFKA_INTER_BROKER_LISTENER_NAME: DOCKER
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: DOCKER:PLAINTEXT,NGROK:PLAINTEXT
```

I've then overriden the `entrypoint` of the container. First, it will wait for the tunnel to be created:

```bash
echo "Waiting for ngrok tunnel to be created"
while : ; do
curl_status=$$(curl -s -o /dev/null -w %{http_code} http://ngrok:4040/api/tunnels/command_line)
echo -e $$(date) "\tTunnels API HTTP state: " $$curl_status " (waiting for 200)"
if [ $$curl_status -eq 200 ] ; then
break
fi
sleep 5
done
echo "ngrok tunnel is up"
```

Then—in the absence of `jq` on the `confluentinc/cp-kafka` image—I use some fairly nasty shell tool code (which will probably break if the JSON structure from the ngrok API changes) to add the tunnel's URL to the advertised listeners environment variable:

```bash
NGROK_LISTENER=$(curl -s http://ngrok:4040/api/tunnels/command_line | grep -Po '"public_url":.*?[^\\]",' | cut -d':' -f2- | tr -d ',"' | sed 's/tcp:\/\//NGROK:\/\//g')
export KAFKA_ADVERTISED_LISTENERS="$$KAFKA_ADVERTISED_LISTENERS, $$NGROK_LISTENER"
echo "KAFKA_ADVERTISED_LISTENERS is set to " $$KAFKA_ADVERTISED_LISTENERS
```

And then, finally, we launch the Kafka broker (using the original value of the Docker image's `entrypoint`):

```bash
/etc/confluent/docker/run
```

## ngrok with Kafka on Docker Compose, in action

_You can find the Docker Compose file [here](/code/docker-compose-ngrok-kafka.yml)_

Launch the Docker Compose stack:

```bash
$ docker compose up
```

Get the ngrok tunnel URL:

```bash
$ curl -s localhost:4040/api/tunnels | jq -r '.tunnels[0].public_url'
tcp://2.tcp.eu.ngrok.io:16738
```

(you can also use the Web UI at http://localhost:4040 to get this info)

Now from anywhere on the interwebz, use your local Kafka broker based on the URL returned from ngrok (minus the `tcp://` prefix):

```bash
$ echo "hello world" | kcat -b 2.tcp.eu.ngrok.io:16738 -P -t test_topic
$
$ kcat -b 2.tcp.eu.ngrok.io:16738 -C -t test_topic
hello world
% Reached end of topic test_topic [0] at offset 1
```

If you go and have a look at your Docker Compose log you'll see information about network traffic flowing through the tunnel:

```
ngrok | t=2023-11-01T12:17:59+0000 lvl=info msg="join connections" obj=join id=8db744a20159 l=192.168.117.4:9092 r=82.20.253.111:34870
```

So there we have it—ngrok and Kafka, nicely automated in a standalone Docker Compose file 😎
File renamed without changes
Binary file added static/images/2023/11/h_IMG_5046.webp
Binary file not shown.
Binary file added static/images/2023/11/ngrok01.webp
Binary file not shown.
Binary file added static/images/2023/11/ngrok02.webp
Binary file not shown.
Binary file added static/images/2023/11/ngrok03.webp
Binary file not shown.
Binary file added static/images/2023/11/ngrok04.webp
Binary file not shown.
Binary file added static/images/2023/11/ngrok05.webp
Binary file not shown.
Binary file added static/images/2023/11/ngrok06.webp
Binary file not shown.
Binary file added static/images/2023/11/ngrok07.webp
Binary file not shown.
Binary file added static/images/2023/11/ngrok08.webp
Binary file not shown.

0 comments on commit d705804

Please sign in to comment.