---
sidebar_position: 2
---

# Crowd Monitoring & Player Tracking Project Plan: Apache Kafka

## Introduction
As a member of the Crowd Monitoring & Player Tracking team, my primary task is to develop a system for handling data logistics using a document-based database. The focus of my work is on ensuring that the data generated by our monitoring and tracking systems is efficiently and reliably processed, stored, and made available for analysis and visualization.

## Specific Focus on Kafka Data Streaming Pipeline
I have chosen to focus on the Kafka data streaming pipeline as a crucial component of our data logistics system. Kafka is well-suited for our needs due to its ability to handle high-throughput, real-time data streams with low latency, which is essential for monitoring and tracking applications where timely data processing is critical.

## Why Kafka?
Kafka was chosen for several reasons:

- **Scalability**: Kafka's distributed architecture allows it to scale horizontally, which is vital as the volume of data from player tracking and crowd monitoring can be substantial.
- **Reliability**: Kafka replicates data across brokers and persists it to disk, so records acknowledged by the cluster are not lost, which is important for maintaining the integrity of our tracking data.
- **Real-time Processing**: Kafka's capability to process data in real-time is a perfect fit for our system's requirement to monitor crowd movement and player tracking as events unfold.

## Key Components of Kafka

- **Producers**: Entities that publish data to Kafka topics. They push records (data) into Kafka without concern for how the data is processed downstream.
- **Consumers**: Entities that read records from Kafka topics. They can be independent processes or applications that subscribe to specific topics to process data.
- **Topics**: Categories or feed names to which records are published. Kafka topics are partitioned to allow for parallelism and scalability.
- **Brokers**: Kafka brokers are servers that store and serve data. A Kafka cluster consists of multiple brokers, providing fault tolerance and distributed storage (the example after this list shows how a topic's partitions map onto brokers).
- **Zookeeper**: Used by Kafka to manage and coordinate the brokers. It handles leader election for partitions and maintains a list of all brokers in the cluster.
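
Once a broker is running locally and a topic exists (both are covered in the installation steps below), a quick way to see how topics, partitions, and brokers relate is to describe a topic. Depending on how Kafka was installed, the tool is named `kafka-topics`, `kafka-topics.sh`, or `kafka-topics.bat`:

```bash
# Describe a topic: for each partition this prints the leader broker ID,
# the full replica set, and the in-sync replicas (ISR).
kafka-topics --describe --topic test-topic --bootstrap-server localhost:9092
```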

## Installing Apache Kafka

### On macOS

To get started with Kafka on a macOS system, you'll need to install both Kafka and its dependency, Zookeeper. Here's a step-by-step guide:

#### Prerequisites
- **Homebrew**: Ensure that Homebrew is installed on your Mac. Homebrew is a popular package manager for macOS that simplifies the installation of software.
To install Homebrew, open Terminal and enter:
```bash
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
```
- **Java**: Kafka requires Java to run.
Install it using Homebrew:
```bash
brew install openjdk@11
```
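
Homebrew installs `openjdk@11` as a keg-only formula, so it may not be on your `PATH` automatically. A minimal sketch for adding it and verifying the install, assuming an Intel Mac where Homebrew lives under `/usr/local` (on Apple Silicon the prefix is `/opt/homebrew`):

```bash
# Put the keg-only JDK on the PATH (adjust the prefix for your Homebrew installation),
# then confirm Java runs.
echo 'export PATH="/usr/local/opt/openjdk@11/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc
java -version
```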

#### Step-by-Step Installation

1. **Install Kafka and Zookeeper**
Install Kafka using Homebrew; the package also provides the bundled Zookeeper scripts used in the next step:
```bash
brew install kafka
```
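
After the install finishes, you can confirm the Kafka command-line tools are available:

```bash
# Prints the installed Kafka version; no broker needs to be running for this.
kafka-topics --version
```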

2. **Start Zookeeper**
Kafka uses Zookeeper to manage its brokers. Start Zookeeper with the following command (on Apple Silicon Macs, Homebrew's prefix is `/opt/homebrew` rather than `/usr/local`, so adjust the path accordingly):
```bash
zookeeper-server-start /usr/local/etc/kafka/zookeeper.properties
```

3. **Start Kafka Server**
Once Zookeeper is running, start the Kafka broker:
```bash
kafka-server-start /usr/local/etc/kafka/server.properties
```

4. **Create a Topic**
To create a Kafka topic, use the following command:
```bash
kafka-topics --create --topic test-topic --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
```
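
To confirm the topic was created, you can list the topics known to the broker:

```bash
# test-topic should appear in the output.
kafka-topics --list --bootstrap-server localhost:9092
```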

5. **Send and Receive Messages**
In one terminal window, start a producer and send messages to the Kafka topic (each line you type is published as a record):
```bash
kafka-console-producer --topic test-topic --bootstrap-server localhost:9092
```
In a second terminal window, consume the messages from the topic:
```bash
kafka-console-consumer --topic test-topic --from-beginning --bootstrap-server localhost:9092
```

### On Windows

To install Kafka on a Windows system, follow these steps:

#### Prerequisites

- **Java**: Ensure that Java is installed on your machine. You can download and install it from the [Oracle JDK website](https://www.oracle.com/java/technologies/javase-downloads.html).
- **Download Kafka**: Go to the [Apache Kafka download page](https://kafka.apache.org/downloads) and download the latest binary release (a `.tgz` archive; the same archive works on any operating system).

#### Step-by-Step Installation

1. **Extract Kafka**
Extract the downloaded Kafka archive to your desired directory (e.g., `C:\kafka`).
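
For example, recent Windows 10 and 11 releases include `tar`, so the archive can be extracted from a Command Prompt; the archive name below is only an example, so adjust it to the version you downloaded:

```bash
# Extract the downloaded archive and move the resulting folder to C:\kafka
# (replace the file name with the version you actually downloaded).
tar -xzf kafka_2.13-3.7.0.tgz
move kafka_2.13-3.7.0 C:\kafka
```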

2. **Configure Environment Variables**
Add the Kafka `bin` directory (e.g., `C:\kafka\bin\windows`) to your system's `PATH` environment variable.

3. **Start Zookeeper**
Kafka uses Zookeeper to manage its brokers. Start Zookeeper with the following command in a new Command Prompt:
```bash
zookeeper-server-start.bat C:\kafka\config\zookeeper.properties
```

4. **Start Kafka Server**
Once Zookeeper is running, start the Kafka broker in another Command Prompt:
```bash
kafka-server-start.bat C:\kafka\config\server.properties
```

5. **Create a Topic**
To create a Kafka topic, use the following command:
```bash
kafka-topics.bat --create --topic test-topic --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
```

6. **Send and Receive Messages**
Start sending messages to the Kafka topic using a producer:
```bash
kafka-console-producer.bat --topic test-topic --bootstrap-server localhost:9092
```
To consume messages from the topic, use:
```bash
kafka-console-consumer.bat --topic test-topic --from-beginning --bootstrap-server localhost:9092
```

### Using Docker

To run Kafka using Docker, follow these steps:

#### Prerequisites

- **Docker**: Ensure Docker is installed on your system. You can download Docker from the [Docker website](https://www.docker.com/products/docker-desktop).

#### Step-by-Step Installation

1. **Create a Docker Network**
Create a new Docker network for Kafka and Zookeeper:
```bash
docker network create kafka-network
```

2. **Start Zookeeper Container**
Run a Zookeeper container:
```bash
docker run -d --name zookeeper --network kafka-network -e ZOOKEEPER_CLIENT_PORT=2181 confluentinc/cp-zookeeper:latest
```

3. **Start Kafka Container**
Run a Kafka container:
```bash
docker run -d --name kafka --network kafka-network -e KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181 -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092 -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 confluentinc/cp-kafka:latest
```
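
The command above only makes the broker reachable from inside the container, which is enough for the `docker exec` steps below. If applications on the host machine should also be able to connect to `localhost:9092`, one option is to publish the broker port when starting the container, for example:

```bash
# Same as above, but with port 9092 published so host applications can connect.
docker run -d --name kafka --network kafka-network -p 9092:9092 \
  -e KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181 \
  -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092 \
  -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
  confluentinc/cp-kafka:latest
```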

4. **Create a Topic**
To create a Kafka topic, use the following command:
```bash
docker exec -it kafka kafka-topics --create --topic test-topic --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
```

5. **Send and Receive Messages**
Send messages to the Kafka topic using a producer:
```bash
docker exec -it kafka kafka-console-producer --topic test-topic --bootstrap-server localhost:9092
```
To consume messages from the topic, use:
```bash
docker exec -it kafka kafka-console-consumer --topic test-topic --from-beginning --bootstrap-server localhost:9092
```

### On Linux

To install Kafka on a Linux system, follow these steps:

#### Prerequisites

- **Java**: Kafka requires Java to run. You can install it using your package manager. For example, on Ubuntu or Debian:
```bash
sudo apt update
sudo apt install openjdk-11-jdk
```
- **Download Kafka**: Go to the [Apache Kafka download page](https://kafka.apache.org/downloads) and download the latest binary release (a `.tgz` archive).

#### Step-by-Step Installation

1. **Extract Kafka**
Extract the downloaded Kafka archive to your desired directory (e.g., `/opt/kafka`).
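
For example, assuming the archive was downloaded into the current directory (the file name below is only an example; use the version you actually downloaded):

```bash
# Extract the Kafka archive and move it into place at /opt/kafka
# (replace the file name with the version you downloaded).
tar -xzf kafka_2.13-3.7.0.tgz
sudo mv kafka_2.13-3.7.0 /opt/kafka
```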

2. **Start Zookeeper**
Kafka uses Zookeeper to manage its brokers. Start Zookeeper with the following command:
```bash
/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
```

3. **Start Kafka Server**
Once Zookeeper is running, start the Kafka broker:
```bash
/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
```

4. **Create a Topic**
To create a Kafka topic, use the following command:
```bash
/opt/kafka/bin/kafka-topics.sh --create --topic test-topic --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
```

5. **Send and Receive Messages**
Start sending messages to the Kafka topic using a producer:
```bash
/opt/kafka/bin/kafka-console-producer.sh --topic test-topic --bootstrap-server localhost:9092
```
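
To consume messages from the topic, use:
```bash
/opt/kafka/bin/kafka-console-consumer.sh --topic test-topic --from-beginning --bootstrap-server localhost:9092
```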
