Document when zio-kafka is faster than raw java kafka #1403

Open
wants to merge 1 commit into
base: master
16 changes: 15 additions & 1 deletion docs/index.md
@@ -4,7 +4,9 @@ title: "Getting Started with ZIO Kafka"
sidebar_label: "Getting Started"
---

[ZIO Kafka](https://github.com/zio/zio-kafka) is a Kafka client for ZIO. It provides a purely functional, streams-based interface to the Kafka client and integrates effortlessly with ZIO and ZIO Streams.
[ZIO Kafka](https://github.com/zio/zio-kafka) is a Kafka client for ZIO. It provides a purely functional, streams-based interface to the Kafka
client and integrates effortlessly with ZIO and ZIO Streams. Often zio-kafka programs have a _higher_ throughput than
programs that use the Java Kafka client directly (see section [Performance](#performance) below).

@PROJECT_BADGES@ [![Scala Steward badge](https://img.shields.io/badge/Scala_Steward-helping-blue.svg?style=flat&logo=)](https://scala-steward.org)

@@ -135,3 +137,15 @@ Want to see your company here? [Submit a PR](https://github.com/zio/zio-kafka/ed
* [KelkooGroup](https://www.kelkoogroup.com)
* [Rocker](https://rocker.com)

## Performance

By default, zio-kafka programs process partitions in parallel. The default java Kafka client does not provide parallel
processing. Of course, there is some overhead in buffering records and distributing them to the fibers that need them.
On 2024-11-23, we estimated this overhead to be 1.2 ms per 1k records
(comparing [benchmarks](https://zio.github.io/zio-kafka/dev/bench/) `throughput` and `kafkaClients`, using the standard
GitHub Action runners (4 cores), and with the Kafka broker in the same JVM). This means that for this particular
combination, when processing needs more than 1.2 ms per 1k records, a zio-kafka based program will have **higher
throughput** than a program based on a java Kafka client.

If you do not care for the convenient ZStream based API that zio-kafka brings, and latency is of absolute importance,
using the java based Kafka client directly is still the better choice.

Review comments on the sentence ending in "higher throughput":

Collaborator:

> when processing needs more than 1.2 ms per 1k records, a zio-kafka based program will have higher throughput

I'm not sure we can conclude that. The zio-kafka part will still be 1.2 ms slower per 1k records than fetching it via the plain kafka client, and on top of that you get the extra processing time which will slow the system down. I don't see how the processing time cancels the overhead.

Collaborator Author:

In a week or so I'll have time to write a longer article that will hopefully explain my intuition on this 😄

Collaborator:

Be sure to take the new throughput results into account in your new calculation :)
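
The break-even claim in the Performance section can be made concrete with a back-of-the-envelope model. The sketch below is an illustration under stated assumptions, not something measured or confirmed in this PR: it assumes the plain Java client processes records strictly one after another, that zio-kafka spreads records evenly over `parallelism` partitions processed concurrently with enough cores available, and that the only extra cost is the 1.2 ms per 1k records quoted above. Fetch time is taken to be identical in both cases and is left out. All names (`ThroughputModel`, `sequentialMs`, `parallelMs`, `breakEvenMs`) are hypothetical.

```scala
// A toy cost model, in milliseconds per 1k records. Every formula here is an
// assumption made for illustration; none of it comes from zio-kafka itself.
object ThroughputModel {

  /** Buffering/distribution overhead per 1k records, as quoted in the text. */
  val overheadMs: Double = 1.2

  /** Assumed cost when all records are processed sequentially. */
  def sequentialMs(processingMs: Double): Double = processingMs

  /** Assumed cost when processing is split evenly over `parallelism` partitions. */
  def parallelMs(processingMs: Double, parallelism: Int): Double =
    overheadMs + processingMs / parallelism

  /**
   * Processing time per 1k records above which the parallel variant is faster.
   * Solves overheadMs + p / n < p for p. With no parallelism there is no
   * break-even point, so the result is None.
   */
  def breakEvenMs(parallelism: Int): Option[Double] =
    if (parallelism <= 1) None
    else Some(overheadMs * parallelism / (parallelism - 1))

  def main(args: Array[String]): Unit =
    for (n <- List(1, 2, 4, 8))
      println(s"parallelism=$n breakEvenMs=${breakEvenMs(n)}")
}
```

Under these assumptions the overhead is only recovered through parallelism: with a single partition it never is, with two partitions processing must take more than about 2.4 ms per 1k records, and the 1.2 ms threshold from the text is approached only as the number of partitions processed in parallel grows.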