
Document when zio-kafka is faster than raw java kafka #1403

Open · wants to merge 1 commit into master
Conversation

erikvanoosten (Collaborator)

No description provided.

On 2024-11-23, we estimated this overhead to be 1.2 ms per 1k records
(comparing [benchmarks](https://zio.github.io/zio-kafka/dev/bench/) `throughput` and `kafkaClients`, using the standard
GitHub Action runners (4 cores), and with the Kafka broker in the same JVM). This means that for this particular
combination, when processing needs more than 1.2 ms per 1k records, a zio-kafka based program will have **higher throughput**.
Collaborator

when processing needs more than 1.2 ms per 1k records, a zio-kafka based program will have higher throughput

I'm not sure we can conclude that. The zio-kafka part will still be 1.2 ms slower per 1k records than fetching it via the plain kafka client, and on top of that you get the extra processing time which will slow the system down. I don't see how the processing time cancels the overhead.
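One way to make the two readings in this thread concrete is a toy arithmetic model (illustrative only: the fetch and processing times below are invented assumptions, not benchmark data; only the 1.2 ms overhead comes from the estimate above). If fetching and processing run strictly sequentially, the overhead is always added on top; if zio-kafka prefetches the next batch concurrently with processing, per-batch time is governed by the slower pipeline stage, so the overhead can be hidden once processing exceeds it.

```scala
object OverheadModel {
  def main(args: Array[String]): Unit = {
    val fetchMs    = 1.0 // hypothetical plain-client fetch time per 1k records
    val overheadMs = 1.2 // zio-kafka overhead per 1k records (2024-11-23 estimate)
    val processMs  = 3.0 // hypothetical user processing time per 1k records

    // Reading 1: strictly sequential stages; the overhead never cancels.
    val plainSequential = fetchMs + processMs              // 4.0 ms per 1k records
    val zioSequential   = fetchMs + overheadMs + processMs // 5.2 ms per 1k records

    // Reading 2: zio-kafka prefetches concurrently with processing, so the
    // per-batch time is the slower of the two pipeline stages.
    val zioPipelined = math.max(fetchMs + overheadMs, processMs) // 3.0 ms per 1k records

    println(s"plain sequential: $plainSequential ms, zio pipelined: $zioPipelined ms")
  }
}
```

Under the pipelined reading, once processing per 1k records exceeds fetch time plus the 1.2 ms overhead, the overhead no longer shows up in end-to-end throughput; under the sequential reading it never cancels, which is exactly the disagreement here.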

Collaborator (Author)

In a week or so I'll have time to write a longer article that will hopefully explain my intuition on this 😄

Collaborator

Be sure to take the new throughput results into account in your new calculation :)

@svroonland (Collaborator)

In ZioKafkaConsumerBenchmark.throughput:

Consumer
  .plainStream(Subscription.topics(topic1), Serde.byteArray, Serde.byteArray)
  .tap { _ =>
    counter.updateAndGet(_ + 1).flatMap(count => Consumer.stopConsumption.when(count == recordCount))
  }

Do you think it would be more fair to replace the tap with a tapChunk? In the manual kafka equivalent benchmark, we also process records by the batch instead of per record.
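For reference, a per-chunk variant of the benchmark body might look roughly like the sketch below. This is only an illustration: it uses `ZStream#chunks` to surface the underlying chunks and act once per batch, since the exact operator chosen for the change lives in the follow-up PR; `counter`, `recordCount`, and `topic1` are the benchmark's own definitions.

```scala
Consumer
  .plainStream(Subscription.topics(topic1), Serde.byteArray, Serde.byteArray)
  .chunks // expose the underlying chunks so we can act once per batch
  .tap { chunk =>
    // One counter update per chunk instead of one per record.
    counter
      .updateAndGet(_ + chunk.size)
      .flatMap(count => Consumer.stopConsumption.when(count >= recordCount))
  }
```

Note the stop condition becomes `count >= recordCount` rather than `==`, since a chunk-sized increment can step past the target count.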

@erikvanoosten (Collaborator, Author)

Do you think it would be more fair to replace the tap with a tapChunk?

Yes, I agree. Will you make a PR?

@svroonland (Collaborator)

See #1409
