From 6c0d3425077abcbb13acd01d754e200389d5b8ef Mon Sep 17 00:00:00 2001
From: Rajkamal Natarajan <2908985+rnatarajan@users.noreply.github.com>
Date: Thu, 5 Aug 2021 22:09:09 -0500
Subject: [PATCH] Update faq.asciidoc

Add documentation for MilliSecondsBehindSource growing exponentially.
---
 documentation/faq.asciidoc | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/documentation/faq.asciidoc b/documentation/faq.asciidoc
index 64095b7c7fe..0e310320cb5 100644
--- a/documentation/faq.asciidoc
+++ b/documentation/faq.asciidoc
@@ -351,3 +351,21 @@ To solve the issue the configuration option `producer.max.request.size` must be
 If the global change is not desirable then the connector can override the default setting using configuration option `producer.override.max.request.size` set to a larger value.
 In the latter case it is also necessary to configure `connector.client.config.override.policy=ALL` option in Kafka Connect worker config file `connect-distributed.properties`.
 For Debezium `connect` Docker image the environment variable `CONNECT_CONNECTOR_CLIENT_CONFIG_OVERRIDE_POLICY` can be used to configure the option.
+
+== Why is MilliSecondsBehindSource growing exponentially?
+
+When the Debezium connector is streaming changes from the binlog, it may be unable to keep up with the rate at which CDC events are generated in the upstream database.
+When this happens, the streaming metric `MilliSecondsBehindSource` increases exponentially.
+
+To diagnose the issue, measure the round-trip time of a packet from the machine on which Kafka Connect is running to the database host.
+
+```
+ping -c 10
+```
+
+If the round-trip time is several milliseconds or more (for example, 20 or 30 milliseconds rather than 0.1 or 0.5 milliseconds), then the latency between Kafka Connect and the upstream database is high.
+
+For a stream that generates few CDC events, Kafka Connect can keep up even with a high round-trip time.
+However, for a stream that generates a high volume of CDC data, Kafka Connect cannot keep up, and hence `MilliSecondsBehindSource` grows exponentially.
+
+Move Kafka Connect with the Debezium connector to a host from which the database can be reached with a round-trip time of less than a millisecond.
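
As a supplement to the `ping` check in the FAQ entry above, the following is a minimal sketch of an alternative for hosts where ICMP is blocked: it times TCP connects from the Kafka Connect machine to the database port. It is not part of Debezium. The function names, the 1 ms default threshold (taken from the FAQ text), and the host and port in the usage comment are assumptions for illustration. A TCP connect takes roughly one network round trip plus local overhead, and name resolution if a hostname is given, so the numbers are an approximation of what `ping` reports.

```python
import socket
import statistics
import time
from typing import List


def tcp_connect_rtt_ms(host: str, port: int, samples: int = 10,
                       timeout: float = 2.0) -> List[float]:
    """Return the time in milliseconds taken by each of `samples` TCP connects."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # create_connection returns once the TCP handshake has completed.
        with socket.create_connection((host, port), timeout=timeout):
            elapsed_ms = (time.perf_counter() - start) * 1000.0
        timings.append(elapsed_ms)
    return timings


def is_close_enough(timings_ms: List[float], threshold_ms: float = 1.0) -> bool:
    """True when the median connect time is below the threshold (assumed 1 ms)."""
    return statistics.median(timings_ms) < threshold_ms


# Hypothetical usage; replace the host and port with your own database endpoint:
#   timings = tcp_connect_rtt_ms("db.example.internal", 3306)
#   print(statistics.median(timings), is_close_enough(timings))
```

The median is used rather than the mean so that a single slow connect does not dominate the result.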