Consumer group starvation and partition allocation #56
For a topic with 15 partitions, to get the best latency I currently run 3 deployments, each starting 5 threads that run 1 synchronous consumer each (Kafka-lib), all in the same consumer group. So 15 consumers are distributed equally across the 3 deployments. Now with parallel-consumer: let's say I want HA, so I only run 3 deployments of parallel-consumer. Will concurrency be balanced across the consumer group? How do you ensure there is no starvation between my 3 deployments (how does parallel-consumer decide how many consumers should run in each instance)? And how should I manage the maxConcurrency setting? Thank you
Replies: 1 comment
That’s a great question! Thanks for asking and thanks for your interest in the project.
The parallel consumer (PC) only uses a single consumer instance. So, if you deploy three PCs on, let’s say, three nodes, you will have three consumers running in the group, and your 15 partitions will be spread evenly as normal, five per consumer.
The concurrency you get, though, is determined by the number of PCs, the ordering type you choose, and the max concurrency you choose.
So, if you want 15 partitions to process in parallel in partition order, with three instances, you would set max concurrency to at least 5. That would give you three consumers, each running 5 threads to process messages (but still only three Kafka consumer clients inside). (This is assuming you’re using the core module and not the vertx module.) If in this case you set the max concurrency higher than five, let’s say 20, then five threads per parallel consumer will be doing work while 15 threads sit idle. But that’s okay.
However, if you don’t need partition ordering, then you can choose either key or unordered, in which case you could set the max concurrency to maybe 100 or 1000. Then you might get 300 or 3000 records processing concurrently across all three instances, depending on whether you pick unordered or key ordering (and, if key, depending on your key distribution across the records in the current buffers).
Cheers! I hope you don’t mind, I converted your issue into a discussion. Let me know if there’s any other way I can help you. :)
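To make the arithmetic in the reply concrete, here is a minimal sketch in plain Java. It does not call the parallel-consumer library; the class `ConcurrencyEstimate` and its methods are hypothetical names introduced only for illustration, and it assumes the partitions are spread evenly across instances. It gives an upper bound on busy threads, not a measurement: with key ordering the real figure also depends on how keys are distributed in the buffered records.

```java
public class ConcurrencyEstimate {

    // Partitions assigned to one instance, assuming an even spread (ceiling division).
    static int partitionsPerInstance(int partitions, int instances) {
        return (partitions + instances - 1) / instances;
    }

    // Upper bound on threads doing work inside one parallel consumer instance.
    // Partition ordering: at most one record per assigned partition is in flight,
    // so the bound is min(maxConcurrency, assigned partitions).
    // Key or unordered: the bound is maxConcurrency itself.
    static int busyThreadsPerInstance(int partitions, int instances,
                                      int maxConcurrency, boolean partitionOrdered) {
        if (!partitionOrdered) {
            return maxConcurrency;
        }
        return Math.min(maxConcurrency, partitionsPerInstance(partitions, instances));
    }

    public static void main(String[] args) {
        int partitions = 15;
        int instances = 3;

        // Partition ordering, maxConcurrency = 5: every thread has work.
        int busy5 = busyThreadsPerInstance(partitions, instances, 5, true);
        System.out.println("partition order, max 5:  busy per instance = " + busy5
                + ", total = " + busy5 * instances);

        // Partition ordering, maxConcurrency = 20: the extra threads stay idle.
        int busy20 = busyThreadsPerInstance(partitions, instances, 20, true);
        System.out.println("partition order, max 20: busy per instance = " + busy20
                + ", idle per instance = " + (20 - busy20));

        // Unordered, maxConcurrency = 100: bound is maxConcurrency on every instance.
        int busy100 = busyThreadsPerInstance(partitions, instances, 100, false);
        System.out.println("unordered, max 100:      total upper bound = "
                + busy100 * instances);
    }
}
```

With the numbers from the question (15 partitions, 3 instances), this reproduces the figures above: 5 busy threads per instance under partition ordering whether max concurrency is 5 or 20, and up to 300 concurrent records with max concurrency 100 when partition ordering is not required.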