kafka best practices (#203)
akyriako authored Feb 6, 2025
1 parent bb6319d commit e2c7ff2
Showing 17 changed files with 647 additions and 2 deletions.
@@ -0,0 +1,133 @@
---
id: configuring-message-accumulation-monitoring
title: Configuring Message Accumulation Monitoring
tags: [kafka, dms, cloudeye, smn]
---

# Configuring Message Accumulation Monitoring

Unprocessed messages accumulate when the client consumes messages more slowly than the server sends them. If accumulated messages cannot be consumed in time, you can configure alarm rules so that you are notified when the number of accumulated messages in a consumer group exceeds a threshold.

## Configuring Simple Message Notification

### Creating a Topic

- Log in to the *Open Telekom Cloud Console* -> *Simple Message Notification* -> *Topics* and click *Create Topic*.
- Give it a name, e.g. **test_topic**, and click *OK*.

### Creating a Subscription

- Go to the *Open Telekom Cloud Console* -> *Simple Message Notification* -> *Subscriptions* and click *Add Subscription*.

<center>
![](/img/docs/best-practices/application-services/distributed-message-service/Screenshot_from_2025-02-05_07-24-03.png "Click to enlarge")
**Figure 1** Add a Subscription to the Topic
</center>

- Click *Select Topic* and choose the topic we just created.
- Set **Protocol** to `Email`.
- Add an **Endpoint** (an email address you have direct access to) and click *OK*.

Back in the *Subscriptions* console, click *Request Confirmation*:

<center>
![](/img/docs/best-practices/application-services/distributed-message-service/Screenshot_from_2025-02-05_07-38-07.png "Click to enlarge")
**Figure 2** Request Subscription Confirmation
</center>

and click *Confirm Subscription* in the "SMN-Confirming Your Subscription" email:

<center>
![](/img/docs/best-practices/application-services/distributed-message-service/Screenshot_from_2025-02-05_07-41-04.png "Click to enlarge")
**Figure 3** Respond to the Subscription Confirmation Request Email
</center>

## Configuring Cloud Eye

- Log in to the *Open Telekom Cloud Console* -> *Cloud Eye* -> *Distributed Message Service* -> *Kafka Platinum*.
- Find your Kafka instance in the list and click *Create Alarm Rule*.

<center>
![](/img/docs/best-practices/application-services/distributed-message-service/Screenshot_from_2025-02-05_07-20-50.png "Click to enlarge")
**Figure 4** Create a Cloud Eye Alarm Rule
</center>

### Creating an Alarm Rule

- Set **Template** to `DMS Kafka Instance Alarm Template`.
- Set **Notification Object** to the name of the SMN topic we created in the previous step (in this case **test_topic**) and click *Create*.

<center>
![](/img/docs/best-practices/application-services/distributed-message-service/Screenshot_from_2025-02-05_07-21-30.png "Click to enlarge")
**Figure 5** Configure the Cloud Eye Alarm Rule
</center>

## Running a Simulation

We are going to rely on a small Golang package that creates a DMS Kafka topic, a message producer, and a group of message consumers in order to simulate accumulated unprocessed messages in our Kafka instance.

:::info
You can find more information about the implementation details of this package in the Best Practice: [Optimizing Consumer Polling](optimizing-consumer-polling.md).
:::

### Cloning the repository

Clone the repository to your local development machine:

```shell
git clone https://github.com/opentelekomcloud-blueprints/kafka-optimizing-consumer-pulling.git
```

### Configuring the Parameters

For this lab, **we want to handicap the consumers so we can simulate an accumulation of unprocessed messages**. To do that, open **main.go** in the editor of your choice and make the following changes:

1. Set `messages` to `1000` -> Produces a significant number of messages, making the accumulation easy to observe.

```go
var (
	consumers  = 5
	partitions = 3
	messages   = 1000 // produce 1,000 messages so that the accumulation is easy to observe
	logLevel   = slog.LevelInfo
	cleanExit  = true
)
```
2. Raise `MaxWaitTime` to `1000` milliseconds -> For more details on how `MaxWaitTime` affects the consumers, consult [Optimizing Consumer Polling-Use Long Polling](optimizing-consumer-polling.md#use-long-polling).

```go
func newConsumer(ctx context.Context, consumerId int, wg *sync.WaitGroup) {
	defer wg.Done()

	config := sarama.NewConfig()
	config.Consumer.Group.Rebalance.GroupStrategies = []sarama.BalanceStrategy{sarama.NewBalanceStrategyRoundRobin()}
	config.Consumer.Offsets.Initial = sarama.OffsetOldest
	// Long polling: the broker may hold each fetch request for up to 1 s before responding.
	config.Consumer.MaxWaitTime = 1000 * time.Millisecond
```
3. Start the simulation and go to *Open Telekom Cloud Console* -> *Distributed Message Service* -> *Monitoring* -> *Monitoring Details*.
You will notice that the consumers start lagging behind and unprocessed messages begin to accumulate:
<center>
![](/img/docs/best-practices/application-services/distributed-message-service/Screenshot_from_2025-02-05_07-27-43.png "Click to enlarge")
**Figure 6** Kafka Instance Monitoring Details
</center>
Soon enough, once the threshold set in the alarm rule is exceeded, you will receive an email from SMN informing you that Cloud Eye
has triggered a Major alarm for the Kafka instance:
<center>
![](/img/docs/best-practices/application-services/distributed-message-service/Screenshot_from_2025-02-05_07-22-58.png "Click to enlarge")
**Figure 7** Major Alarm is triggered
</center>
4. Let the simulation continue. As time passes and the consumers catch up, you will be informed by a second email that Cloud Eye has resolved the situation and the Major alarm has been cleared:
<center>
![](/img/docs/best-practices/application-services/distributed-message-service/Screenshot_from_2025-02-05_07-26-08.png "Click to enlarge")
**Figure 8** Major Alarm is over
</center>
:::important
You can set up alerts for a variety of Kafka metrics beyond Accumulated Messages. In addition to targeting the Kafka instance itself, you have the option to focus on individual brokers, specific topics, or distinct consumer groups. This flexibility allows for more granular monitoring and quicker responses to potential issues within your DMS/Kafka environment.
:::
@@ -0,0 +1,53 @@
---
id: improving-kafka-message-processing-efficiency
title: Improving Kafka Message Processing Efficiency
tags: [kafka, dms]
---

# Improving Kafka Message Processing Efficiency

During message sending and consumption, Distributed Message Service (DMS) for Kafka, producers, and consumers collaborate to ensure service reliability. In addition, the efficiency and accuracy of message sending and consumption improve when developers make proper use of DMS for Kafka topics.

Best practices for message producers and consumers are as follows:

## Acknowledging Message Production and Consumption

### Message Production

The producer decides whether to re-send a message based on the DMS for Kafka response.

The producer waits for the sending result or asynchronous callback function to determine if the message is successfully sent. If an exception occurs when sending the message, the producer will not receive a success response and must decide whether to re-send the message. If a success response is received, it indicates that the message has been stored in DMS for Kafka.
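
A minimal sketch of this acknowledgment loop with the Sarama Go client (the same library used in the monitoring lab above). The broker address, topic name, and retry settings are placeholder assumptions, and the import path assumes the IBM fork of Sarama:

```go
package main

import (
	"log"
	"time"

	"github.com/IBM/sarama"
)

func main() {
	config := sarama.NewConfig()
	config.Producer.RequiredAcks = sarama.WaitForAll // wait for the broker-side acknowledgment
	config.Producer.Retry.Max = 3                    // client-side retries on transient errors
	config.Producer.Retry.Backoff = 500 * time.Millisecond
	config.Producer.Return.Successes = true // required by the SyncProducer

	producer, err := sarama.NewSyncProducer([]string{"broker-1:9092"}, config)
	if err != nil {
		log.Fatalf("creating producer: %v", err)
	}
	defer producer.Close()

	msg := &sarama.ProducerMessage{Topic: "orders", Value: sarama.StringEncoder("payload")}

	// SendMessage blocks until DMS for Kafka responds; only a nil error means
	// the message has been stored on the server.
	partition, offset, err := producer.SendMessage(msg)
	if err != nil {
		// No success response: decide here whether to re-send the message,
		// park it in a retry queue, or surface the failure.
		log.Printf("send not acknowledged, re-send required: %v", err)
		return
	}
	log.Printf("message stored at partition %d, offset %d", partition, offset)
}
```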

### Message Consumption

The consumer acknowledges successful message consumption.

The produced messages are sequentially stored in DMS for Kafka. During consumption, messages stored in DMS for Kafka are obtained in sequence. Consumers obtain messages, consume them, and record the status (successful or failed). The status is then submitted to DMS for Kafka.

During this process, the message consumption status may not be successfully submitted due to an exception. In this case, the corresponding messages will be re-obtained by the consumer in the next message consumption request.
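
A minimal sketch of this consumption-and-acknowledgment loop with Sarama; the broker address, consumer group name, topic, and `process` function are placeholder assumptions. Offsets are marked only after a message has been processed successfully, so a message whose status is not submitted is re-obtained on a later consumption request:

```go
package main

import (
	"context"
	"log"

	"github.com/IBM/sarama"
)

// process stands in for the application's real message handling.
func process(value []byte) error {
	log.Printf("consumed: %s", value)
	return nil
}

// handler implements sarama.ConsumerGroupHandler. An offset is marked only
// after the message has been processed successfully.
type handler struct{}

func (handler) Setup(sarama.ConsumerGroupSession) error   { return nil }
func (handler) Cleanup(sarama.ConsumerGroupSession) error { return nil }

func (handler) ConsumeClaim(session sarama.ConsumerGroupSession, claim sarama.ConsumerGroupClaim) error {
	for msg := range claim.Messages() {
		if err := process(msg.Value); err != nil {
			// No acknowledgment is submitted: this message (and the ones after
			// it in the partition) will be delivered again later.
			return err
		}
		// Record the consumption status as successful for this offset.
		session.MarkMessage(msg, "")
	}
	return nil
}

func main() {
	config := sarama.NewConfig()
	config.Consumer.Offsets.Initial = sarama.OffsetOldest

	group, err := sarama.NewConsumerGroup([]string{"broker-1:9092"}, "my-consumer-group", config)
	if err != nil {
		log.Fatalf("creating consumer group: %v", err)
	}
	defer group.Close()

	for {
		if err := group.Consume(context.Background(), []string{"orders"}, handler{}); err != nil {
			log.Printf("consume error: %v", err)
		}
	}
}
```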

## Idempotent Transferring of Message Production and Consumption

To guarantee lossless messaging, DMS for Kafka implements a series of reliability measures. For example, messages are stored synchronously so that they survive a system or server restart or power failure, and the ACK mechanism is used to handle exceptions that occur during message transmission.

To cope with extreme conditions such as network exceptions, design message transfer to be idempotent, in addition to acknowledging message production and consumption:

* If message sending cannot be acknowledged, the producer needs to re-send the message.
* After obtaining a message that has already been processed, the consumer needs to notify DMS for Kafka that consumption was successful while ensuring that the message is not processed again (see the sketch below).
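
The following sketch illustrates one way to cover both points with Sarama: enabling Kafka's idempotent producer so that re-sent messages do not create duplicates on the broker, and de-duplicating on a business key on the consumer side. The Kafka version constant, the key scheme, and the in-memory `seen` set are placeholder assumptions; a production system would typically persist the de-duplication state:

```go
package idempotence

import (
	"github.com/IBM/sarama"
)

// newIdempotentProducerConfig sketches the settings Sarama requires for
// Kafka's idempotent producer, so that re-sending after a missing
// acknowledgment does not duplicate messages on the broker.
func newIdempotentProducerConfig() *sarama.Config {
	config := sarama.NewConfig()
	config.Version = sarama.V2_3_0_0                 // idempotence needs Kafka 0.11 or later
	config.Producer.Idempotent = true
	config.Producer.RequiredAcks = sarama.WaitForAll // all in-sync replicas must acknowledge
	config.Producer.Retry.Max = 5                    // retries are now safe to repeat
	config.Net.MaxOpenRequests = 1                   // required by Sarama when Idempotent is true
	config.Producer.Return.Successes = true
	return config
}

// dedupConsumer sketches consumer-side idempotence: a redelivered message is
// acknowledged but not processed a second time. It assumes the producer sets
// a unique business key on every message.
type dedupConsumer struct {
	seen map[string]bool
}

func newDedupConsumer() *dedupConsumer {
	return &dedupConsumer{seen: make(map[string]bool)}
}

func (c *dedupConsumer) handle(msg *sarama.ConsumerMessage, process func([]byte) error) error {
	id := string(msg.Key)
	if c.seen[id] {
		return nil // duplicate delivery: skip processing, still acknowledge
	}
	if err := process(msg.Value); err != nil {
		return err
	}
	c.seen[id] = true
	return nil
}
```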

## Producing and Consuming Messages in Batches

It is recommended that messages be sent and consumed in batches to improve efficiency.

**Figure 1** Messages being produced and consumed in batches
![](/img/docs/best-practices/application-services/distributed-message-service/en-us_image_0000001691529441.png "Click to enlarge")

**Figure 2** Messages being produced and consumed one by one
![](/img/docs/best-practices/application-services/distributed-message-service/en-us_image_0000001643370172.png "Click to enlarge")

When consuming messages in batches, consumers need to process and acknowledge messages in the sequence of receiving messages. Therefore, when a message in the batch fails to be consumed, the consumer does not need to consume the remaining messages, and can directly submit consumption acknowledgment of the successfully consumed messages.
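
A minimal Sarama sketch of producer-side batching; the flush thresholds are illustrative assumptions, not recommended values:

```go
package batching

import (
	"time"

	"github.com/IBM/sarama"
)

// newBatchingProducerConfig sketches producer-side batching: messages are
// buffered and flushed to DMS for Kafka in batches instead of one by one.
func newBatchingProducerConfig() *sarama.Config {
	config := sarama.NewConfig()
	config.Producer.Flush.Messages = 100                      // flush once 100 messages are buffered...
	config.Producer.Flush.Bytes = 64 * 1024                   // ...or 64 KiB are buffered...
	config.Producer.Flush.Frequency = 100 * time.Millisecond  // ...or 100 ms have passed
	config.Producer.Return.Successes = true
	return config
}

// sendBatch sketches handing a whole slice of messages to the client in one
// call instead of calling SendMessage once per message.
func sendBatch(producer sarama.SyncProducer, topic string, payloads [][]byte) error {
	batch := make([]*sarama.ProducerMessage, 0, len(payloads))
	for _, p := range payloads {
		batch = append(batch, &sarama.ProducerMessage{Topic: topic, Value: sarama.ByteEncoder(p)})
	}
	return producer.SendMessages(batch)
}
```

On the consumer side, marking the offset of the last successfully processed message acknowledges the whole contiguous prefix before it, which is what allows submitting acknowledgment only for the successfully consumed part of a batch.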

## Using Consumer Groups to Facilitate O&M

You can use DMS for Kafka as a message management system. Reading message content from topics is helpful for fault locating and service debugging.

When problems occur during message production or consumption, you can create separate consumer groups to locate and analyze them, or to debug integration with other services. Because each consumer group tracks its own offsets, a new consumer group can consume and analyze the messages while other services continue to process messages in the topics.
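
A minimal sketch of such a throw-away debugging consumer with Sarama; the group name `debug-analysis`, broker addresses, and topic are placeholder assumptions:

```go
package debugging

import (
	"context"
	"log"

	"github.com/IBM/sarama"
)

// inspectTopic reads a topic with a dedicated consumer group. Because Kafka
// tracks offsets per consumer group, this re-reads the topic from the earliest
// retained message without touching the offsets of the groups used by
// production services.
func inspectTopic(brokers []string, topic string) error {
	config := sarama.NewConfig()
	config.Consumer.Offsets.Initial = sarama.OffsetOldest

	group, err := sarama.NewConsumerGroup(brokers, "debug-analysis", config)
	if err != nil {
		return err
	}
	defer group.Close()

	return group.Consume(context.Background(), []string{topic}, printHandler{})
}

// printHandler logs every message for fault locating and service debugging.
type printHandler struct{}

func (printHandler) Setup(sarama.ConsumerGroupSession) error   { return nil }
func (printHandler) Cleanup(sarama.ConsumerGroupSession) error { return nil }
func (printHandler) ConsumeClaim(s sarama.ConsumerGroupSession, c sarama.ConsumerGroupClaim) error {
	for msg := range c.Messages() {
		log.Printf("partition=%d offset=%d key=%s value=%s", msg.Partition, msg.Offset, msg.Key, msg.Value)
		s.MarkMessage(msg, "")
	}
	return nil
}
```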
@@ -152,7 +152,9 @@ then produces the consumed data to the target cluster. For more
information about MirrorMaker, see [Mirroring data between
clusters](https://kafka.apache.org/documentation/?spm=a2c4g.11186623.0.0.c82870aav6G9no#basic_ops_mirror_maker).

<center>
![**Figure 1** How MirrorMaker works](/img/docs/best-practices/application-services/distributed-message-service/en-us_image_0000001348167557.png)
</center>

:::warning Restrictions
- The IP addresses and port numbers of the nodes in the source cluster
