Mirroring Kafka Clusters
In this use case, we set up two independent Kafka brokers locally, and use Brooklin to mirror data between them.
- Source: Kafka
- Destination: Kafka
- Connector: KafkaMirrorMakerConnector
- Transport Provider: KafkaTransportProvider
- Download the latest Kafka tarball and untar it.
tar -xzf kafka_2.12-2.2.0.tgz
cd kafka_2.12-2.2.0
- Create two different server.properties files for the two different Kafka servers.
cp config/server.properties config/server-src.properties
cp config/server.properties config/server-dest.properties
- Edit these two config files to specify different values for the log.dirs, zookeeper.connect, and listeners config properties. You can do this manually or use the commands below.
sed -ie 's/\/tmp\/kafka-logs/\/tmp\/kafka-logs\/src/; s/localhost:2181/localhost:2181\/src/' config/server-src.properties
echo listeners=PLAINTEXT://:9092 >> config/server-src.properties
sed -ie 's/\/tmp\/kafka-logs/\/tmp\/kafka-logs\/dest/; s/localhost:2181/localhost:2181\/dest/' config/server-dest.properties
echo listeners=PLAINTEXT://:9093 >> config/server-dest.properties
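Before starting the brokers, it can help to confirm that the two files really differ in the three properties edited above. The following is a minimal sketch; `show_broker_props` is a hypothetical helper name and is not part of Kafka.

```shell
# Hypothetical helper: print the uncommented values of the three properties
# that must differ between the two brokers, in sorted order.
show_broker_props() {
  grep -E '^(log\.dirs|zookeeper\.connect|listeners)=' "$1" | sort
}
```

Running `show_broker_props config/server-src.properties` and `show_broker_props config/server-dest.properties` should print different values for each of the three properties.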
- Start a ZooKeeper server
bin/zookeeper-server-start.sh config/zookeeper.properties > /dev/null &
- Start two Kafka servers (we'll call them source and destination)
bin/kafka-server-start.sh config/server-src.properties > /dev/null &
bin/kafka-server-start.sh config/server-dest.properties > /dev/null &
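Because the servers are started in the background, the next commands can run before the brokers are listening. A rough sketch of a wait loop follows; `wait_for_port` is a hypothetical helper, it relies on bash's `/dev/tcp` feature, and an open port is only an approximation of a broker being ready.

```shell
# Hypothetical helper (requires bash for /dev/tcp): poll until host:port accepts
# TCP connections, giving up after the given number of one-second attempts.
wait_for_port() {
  host=$1 port=$2 attempts=${3:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # The subshell opens and immediately closes a connection; it fails if nothing listens.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For this walkthrough, `wait_for_port localhost 9092 && wait_for_port localhost 9093` would block until both listeners configured earlier accept connections, or fail after about 30 seconds each.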
- Create three topics in the source Kafka server
bin/kafka-topics.sh --topic first-topic --bootstrap-server localhost:9092 --create --replication-factor 1 --partitions 1
bin/kafka-topics.sh --topic second-topic --bootstrap-server localhost:9092 --create --replication-factor 1 --partitions 1
bin/kafka-topics.sh --topic third-topic --bootstrap-server localhost:9092 --create --replication-factor 1 --partitions 1
- Populate the topics you created with some data
# We use the LICENSE and NOTICE files packaged in the Kafka tarball
cat LICENSE | bin/kafka-console-producer.sh --topic first-topic --broker-list localhost:9092
cat NOTICE | bin/kafka-console-producer.sh --topic second-topic --broker-list localhost:9092
cat NOTICE | bin/kafka-console-producer.sh --topic third-topic --broker-list localhost:9092
- Download the latest tarball (tgz) from Brooklin releases
- Untar the Brooklin tarball
tar -xzf brooklin-1.0.0.tgz
cd brooklin-1.0.0
- Run Brooklin
bin/brooklin-server-start.sh config/server.properties > /dev/null 2>&1 &
- Create a datastream to mirror only the first two Kafka topics you created, first-topic and second-topic, from the source to the destination Kafka server.
Notice how we use a regex (-s option in the command below) to select the topics we are interested in. The pattern we specify intentionally excludes third-topic.
bin/brooklin-rest-client.sh -o CREATE -u http://localhost:32311/ -n first-mirroring-stream -s "kafka://localhost:9092/^(first|second)-topic" -c kafkaConnector -t kafkaTransportProvider -m '{"owner":"test-user","system.reuseExistingDestination":"false"}'
Here are the options we used to create this datastream:
-o CREATE                                         The operation is datastream creation
-u http://localhost:32311/                        Datastream Management Service URI
-n first-mirroring-stream                         Datastream name
-s kafka://localhost:9092/^(first|second)-topic   Datastream source URI
-c kafkaConnector                                 Connector name ("kafkaConnector" refers to KafkaMirrorMakerConnector)
-t kafkaTransportProvider                         Transport provider name ("kafkaTransportProvider" refers to KafkaTransportProvider)
-m '{"owner":"test-user", "system.reuseExistingDestination": "false"}'   Datastream metadata
- For the datastream source (-s) option in this example, the URI must start with kafka:// or kafkassl://.
- For the datastream metadata (-m) option
  - Specifying an owner is mandatory
  - Setting system.reuseExistingDestination to false keeps Brooklin from reusing an existing Kafka topic (if any) in the destination Kafka server
- Check the KafkaMirrorMakerConnector wiki page to learn more about its various configuration options.
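To see which topic names the pattern in the -s URI selects, the same expression can be tried against the three topic names from this walkthrough. This sketch uses `grep -E` as a stand-in; it assumes Brooklin evaluates the pattern as a Java regular expression, and for a pattern this simple the two syntaxes should agree. It makes no claim about how Brooklin treats names that merely start with a match.

```shell
# Topic names created earlier in this walkthrough.
topics='first-topic
second-topic
third-topic'

# The regex portion of the datastream source URI (the part after host:port/).
pattern='^(first|second)-topic'

# Keep only the names the pattern selects, then print them.
selected=$(printf '%s\n' "$topics" | grep -E "$pattern")
printf '%s\n' "$selected"
```

The output lists first-topic and second-topic; third-topic is filtered out because neither alternative in the group matches it.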
- Verify the datastream creation by requesting all datastream metadata from Brooklin.
bin/brooklin-rest-client.sh -o READALL -u http://localhost:32311/
Notice the connectionString values under source and destination.
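If the metadata output is long, the connectionString values can be filtered out of it. The sketch below assumes the output contains JSON-style `"connectionString": "..."` pairs, which is not confirmed here; `connection_strings` is a hypothetical helper name.

```shell
# Hypothetical helper: extract every "connectionString" value from JSON-like
# text read on stdin and print one value per line.
connection_strings() {
  grep -o '"connectionString" *: *"[^"]*"' \
    | sed 's/^"connectionString" *: *"//; s/"$//'
}
```

It could be used as `bin/brooklin-rest-client.sh -o READALL -u http://localhost:32311/ | connection_strings`.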
- Additionally, you can view more information about the different Datastreams and DatastreamTasks by querying the health monitoring REST endpoint of the Datastream Management Service.
curl -s "http://localhost:32311/health"
- Verify that only first-topic and second-topic were created in the destination Kafka server by running:
bin/kafka-topics.sh --bootstrap-server localhost:9093 --list
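The check on the topic list can also be scripted. This is a small sketch with a hypothetical helper, `verify_mirrored_topics`, that reads topic names on stdin, one per line. It ignores any other topics in the list, such as internal ones the broker may create.

```shell
# Hypothetical helper: succeed only if first-topic and second-topic are present
# and third-topic is absent in the topic list read from stdin.
verify_mirrored_topics() {
  topics=$(cat)
  for t in first-topic second-topic; do
    if ! printf '%s\n' "$topics" | grep -qxF "$t"; then
      echo "missing: $t"
      return 1
    fi
  done
  if printf '%s\n' "$topics" | grep -qxF 'third-topic'; then
    echo "unexpected: third-topic"
    return 1
  fi
  echo "ok"
}
```

For example: `bin/kafka-topics.sh --bootstrap-server localhost:9093 --list | verify_mirrored_topics`.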
- Verify the created topics have the right contents by running:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9093 --from-beginning --topic first-topic
bin/kafka-console-consumer.sh --bootstrap-server localhost:9093 --from-beginning --topic second-topic
- Use the Kafka console consumer to read from the Kafka topic, first-topic, that Brooklin created in the destination server.
bin/kafka-console-consumer.sh --bootstrap-server localhost:9093 --from-beginning --topic first-topic
- Open another terminal window, and launch the Kafka console producer, configuring it to write to first-topic in the source server.
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic first-topic
- Start typing text in the Kafka producer terminal. Press Enter, then watch the Kafka consumer terminal you launched in step 1. The message should appear there once it has been mirrored to the destination server.
- You can stop mirroring temporarily by pausing the datastream
bin/brooklin-rest-client.sh -o PAUSE -u http://localhost:32311/ -n first-mirroring-stream
- Similarly, you can re-enable mirroring by resuming the datastream
bin/brooklin-rest-client.sh -o RESUME -u http://localhost:32311/ -n first-mirroring-stream
When you are done, run the following commands to stop all running apps.
# Replace <brooklin-dir> and <kafka-dir> with Brooklin and Kafka directories, respectively
<brooklin-dir>/bin/brooklin-server-stop.sh
<kafka-dir>/bin/kafka-server-stop.sh
<kafka-dir>/bin/zookeeper-server-stop.sh