Data within Kafka is stored durably, in order, and can be read deterministically.
In addition, the data can be distributed within the system to provide additional protections against failures, as well as significant opportunities for scaling performance.
The unit of data within Kafka is the message.
A message can carry an optional piece of metadata, which is referred to as a key.
The message and the key are byte arrays and have no specific meaning to Kafka.
The offset, a continually increasing integer, is another piece of metadata that Kafka adds to each message as it is produced; each message in a given partition has a unique offset.
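The relationship between messages, keys, and offsets can be illustrated with a toy append-only log. This is only a sketch under stated assumptions: `ToyLog`, `Message`, `append`, and `read_from` are hypothetical names invented for this example, the log lives in memory, and nothing here reflects Kafka's actual storage format or client API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Message:
    offset: int           # assigned by the log on append, never by the caller
    key: Optional[bytes]  # optional metadata; opaque bytes to the log
    value: bytes          # payload; opaque bytes to the log


class ToyLog:
    """Hypothetical in-memory, append-only log (not Kafka's implementation)."""

    def __init__(self) -> None:
        self._messages: List[Message] = []

    def append(self, value: bytes, key: Optional[bytes] = None) -> int:
        # The next offset is always one past the last, so offsets only increase.
        offset = len(self._messages)
        self._messages.append(Message(offset, key, value))
        return offset

    def read_from(self, offset: int) -> List[Message]:
        if offset < 0:
            raise ValueError("offset must be non-negative")
        # Reading does not remove anything, so repeated reads from the
        # same offset return the same messages in the same order.
        return self._messages[offset:]


log = ToyLog()
first = log.append(b"page_view", key=b"user-42")
second = log.append(b"click")  # the key is optional
```

Because the log only stores bytes and assigns offsets itself, a reader that starts from a given offset always sees the same sequence, which is the sense in which reads are deterministic.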
A single Kafka server is called a broker.
Kafka brokers are designed to operate as part of a cluster.
Within a cluster of brokers, one broker will also function as the cluster controller.
A partition is owned by a single broker in the cluster, and that broker is called the leader of the partition.
A replicated partition is assigned to additional brokers, called followers of the partition.
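The leader and follower roles described above can be pictured as a small data structure. The sketch below is an illustration only: `PartitionAssignment` and `assign_partitions` are hypothetical names, and the round-robin placement is an assumption made for the example, not Kafka's actual assignment algorithm.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PartitionAssignment:
    partition: int
    leader: int           # id of the single broker that owns the partition
    followers: List[int]  # ids of brokers holding the additional copies


def assign_partitions(
    broker_ids: List[int], num_partitions: int, replication_factor: int
) -> List[PartitionAssignment]:
    """Place replicas round-robin (assumed scheme, for illustration only)."""
    n = len(broker_ids)
    if not 1 <= replication_factor <= n:
        raise ValueError("replication_factor must be between 1 and the broker count")
    assignments = []
    for p in range(num_partitions):
        # Pick consecutive brokers, wrapping around the end of the list.
        replicas = [broker_ids[(p + i) % n] for i in range(replication_factor)]
        # The first replica acts as leader; the rest are followers.
        assignments.append(PartitionAssignment(p, replicas[0], replicas[1:]))
    return assignments


layout = assign_partitions([1, 2, 3], num_partitions=3, replication_factor=2)
```

With three brokers and a replication factor of 2, every partition in this toy layout has exactly one leader and one follower on a different broker, so a copy of each partition survives the loss of any single broker.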
Reasons to choose Kafka:
- Support for multiple producers
- Support for multiple consumers
- Disk-based retention
- Scalability
- High performance

Common use cases:
- Activity tracking
- Messaging
- Metrics and logging
- Commit log
- Stream processing
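Two of the points listed above, support for multiple consumers and retention, can be sketched with a toy log in which reading does not delete anything. The names `RetainedLog` and `poll` are hypothetical, and tracking each consumer's position inside the log object is a simplification made for this example; the sketch also keeps messages indefinitely, whereas real retention is bounded by configurable rules.

```python
from typing import Dict, List


class RetainedLog:
    """Hypothetical log that keeps messages after they have been read."""

    def __init__(self) -> None:
        self._messages: List[bytes] = []
        # Consumer name -> next position to read (simplified bookkeeping).
        self._positions: Dict[str, int] = {}

    def append(self, value: bytes) -> None:
        self._messages.append(value)

    def poll(self, consumer: str) -> List[bytes]:
        # Each consumer advances its own position; others are unaffected.
        start = self._positions.get(consumer, 0)
        batch = self._messages[start:]
        self._positions[consumer] = len(self._messages)
        return batch


log = RetainedLog()
log.append(b"a")
log.append(b"b")
dashboard_batch = log.poll("dashboard")  # reads both messages
log.append(b"c")
audit_batch = log.poll("audit")          # a later consumer still sees everything
dashboard_next = log.poll("dashboard")   # only the message it has not yet read
```

Because messages are retained rather than handed off to a single reader, the two consumers in this sketch read the same stream without interfering with each other.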