
AutoMQ on Ceph: Managed Serverless AutoScaling Kafka with 10x Cost Efficiency


Introduction

Ceph[1] is an open-source distributed storage system that provides object, block, and file storage. It originated from Sage Weil's doctoral research, begun in 2003, and was released under the LGPL 2.1 license in 2006. Ceph's block device integrates with the Linux kernel and KVM, and Ceph ships by default in many GNU/Linux distributions[2]. Its uniqueness lies in offering object, block, and file system storage from a single platform, catering to a wide range of storage needs.

AutoMQ's innovative shared storage architecture requires both a low-latency block device and cost-effective object storage. Ceph is well suited to this because it exposes both POSIX/block and S3-compatible access protocols. Even in a private data center, you can therefore deploy an AutoMQ cluster on Ceph to obtain a stream system that is fully compatible with Kafka while offering better cost efficiency, extreme elasticity, and single-digit millisecond latency. This article guides you through deploying an AutoMQ cluster atop Ceph in your private data center.

Prerequisites

  • A fully operational Ceph environment can be configured by following the official documentation.

  • Consult the official documentation to set up Ceph's S3-compatible component, RGW.

  • To deploy the AutoMQ cluster, prepare five hosts. It is advisable to use Linux amd64 hosts with 2 CPU cores and 16 GB of memory, each equipped with two virtual storage volumes. The configuration is as follows:

    | Role       | IP          | Node ID | System Volume | Data Volume |
    |------------|-------------|---------|---------------|-------------|
    | CONTROLLER | 192.168.0.1 | 0       | EBS 20GB      | EBS 20GB    |
    | CONTROLLER | 192.168.0.2 | 1       | EBS 20GB      | EBS 20GB    |
    | CONTROLLER | 192.168.0.3 | 2       | EBS 20GB      | EBS 20GB    |
    | BROKER     | 192.168.0.4 | 3       | EBS 20GB      | EBS 20GB    |
    | BROKER     | 192.168.0.5 | 4       | EBS 20GB      | EBS 20GB    |

    Tips:

    • Ensure that these machines are in the same subnet and can communicate with each other (a quick sanity check is shown below).

    • In non-production environments, you can deploy just one Controller, which by default also serves as a Broker.
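
    A simple way to verify connectivity (assuming ICMP is allowed between the hosts) is to ping every node in the list:

    for ip in 192.168.0.1 192.168.0.2 192.168.0.3 192.168.0.4 192.168.0.5; do
        ping -c 1 -W 1 "$ip" > /dev/null && echo "$ip reachable" || echo "$ip unreachable"
    done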

  • Download the latest official binary package from the AutoMQ GitHub Releases page to install AutoMQ.

  • Create a Bucket for Ceph

    • Set environment variables to configure the Access Key and Secret Key required by the AWS CLI.
    export AWS_ACCESS_KEY_ID=X1J0E1EC3KZMQUZCVHED
    export AWS_SECRET_ACCESS_KEY=Hihmu8nIDN1F7wshByig0dwQ235a0WAeUvAEiWSD
    
    • Create an S3 bucket using the AWS CLI.
    aws s3api create-bucket --bucket automq-data --endpoint=http://127.0.0.1:80
    aws s3api create-bucket --bucket automq-ops --endpoint=http://127.0.0.1:80
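
    • Optionally, verify that both buckets exist; head-bucket exits non-zero if a bucket is missing or inaccessible.
    aws s3api head-bucket --bucket automq-data --endpoint=http://127.0.0.1:80
    aws s3api head-bucket --bucket automq-ops --endpoint=http://127.0.0.1:80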
    
  • Create a user for Ceph

radosgw-admin user create --uid="automq" --display-name="automq"

By default, the created user has the full set of permissions AutoMQ requires. To grant minimal permissions, refer to the Ceph official documentation for custom settings. After executing the command above, the result looks like this:

{
    "user_id": "automq",
    "display_name": "automq",
    "email": "",
    "suspended": 0,
    "max_buckets": 1000,
    "subusers": [],
    "keys": [
        {
            "user": "automq",
            "access_key": "X1J0E1EC3KZMQUZCVHED",
            "secret_key": "Hihmu8nIDN1F7wshByig0dwQ235a0WAeUvAEiWSD"
        }
    ],
    "swift_keys": [],
    "caps": [],
    "op_mask": "read, write, delete",
    "default_placement": "",
    "default_storage_class": "",
    "placement_tags": [],
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "user_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "temp_url_keys": [],
    "type": "rgw",
    "mfa_ids": []
}
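
If you need to look up the generated keys again later, radosgw-admin can print the same user record for the uid created above:

radosgw-admin user info --uid="automq"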

Install and start the AutoMQ cluster

Configure S3URL

Step 1: Generate S3 URL

AutoMQ provides the `automq-kafka-admin.sh` tool for effortless startup. Supplying an S3 URL that contains the required endpoint and authentication information lets you launch AutoMQ with a single command, with no need to manually generate a cluster ID or format storage.

bin/automq-kafka-admin.sh generate-s3-url \
--s3-access-key=xxx \
--s3-secret-key=yyy \
--s3-region=cn-northwest-1 \
--s3-endpoint=s3.cn-northwest-1.amazonaws.com.cn \
--s3-data-bucket=automq-data \
--s3-ops-bucket=automq-ops

When deploying on Ceph, use the configuration below to generate the appropriate S3 URL.

| Parameter Name   | Default Value in This Example            | Description |
|------------------|------------------------------------------|-------------|
| --s3-access-key  | X1J0E1EC3KZMQUZCVHED                     | Generated when the Ceph user was created; replace it with your own value |
| --s3-secret-key  | Hihmu8nIDN1F7wshByig0dwQ235a0WAeUvAEiWSD | Generated when the Ceph user was created; replace it with your own value |
| --s3-region      | us-west-2                                | This parameter has no effect with Ceph; it can be set to any value, such as us-west-2 |
| --s3-endpoint    | http://127.0.0.1:80                      | The service address of Ceph's S3-compatible component, RGW. If there are multiple RGW instances, it is recommended to aggregate them behind a load balancer as a single IP address |
| --s3-data-bucket | automq-data                              | - |
| --s3-ops-bucket  | automq-ops                               | - |

Output results

After executing this command, the process will automatically proceed in the following stages:

  1. Probing core S3 features with the provided accessKey and secretKey to verify compatibility between AutoMQ and the storage backend.

  2. Generating an s3url based on the identity and endpoint information.

  3. Printing an example command for launching AutoMQ based on the s3url. In the command, replace --controller-list and --broker-list with the actual CONTROLLER and BROKER addresses for your deployment.

An example of the output is as follows:

############  Ping s3 ########################

[ OK ] Write s3 object
[ OK ] Read s3 object
[ OK ] Delete s3 object
[ OK ] Write s3 object
[ OK ] Upload s3 multipart object
[ OK ] Read s3 multipart object
[ OK ] Delete s3 object
############  String of s3url ################

Your s3url is:

s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=xxx&s3-secret-key=yyy&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA


############  Usage of s3url  ################
To start AutoMQ, generate the start commandline using s3url.
bin/automq-kafka-admin.sh generate-start-command \
--s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" \
--controller-list="192.168.0.1:9093;192.168.0.2:9093;192.168.0.3:9093"  \
--broker-list="192.168.0.4:9092;192.168.0.5:9092"

TIPS: Please replace the controller-list and broker-list with your actual IP addresses.

Step 2: Generate a list of startup commands

Replace --controller-list and --broker-list in the previously generated command with your host information, namely the IP addresses of the 3 CONTROLLERs and 2 BROKERs prepared earlier, using the default ports 9093 and 9092 respectively.

bin/automq-kafka-admin.sh generate-start-command \
--s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" \
--controller-list="192.168.0.1:9093;192.168.0.2:9093;192.168.0.3:9093"  \
--broker-list="192.168.0.4:9092;192.168.0.5:9092"

Parameter Explanation

| Parameter Name         | Mandatory | Description |
|------------------------|-----------|-------------|
| --s3-url               | Yes       | Generated by the bin/automq-kafka-admin.sh generate-s3-url command line tool; includes authentication, cluster ID, and other information |
| --controller-list      | Yes       | At least one address is required, used as the IP and port list of the CONTROLLER hosts. Format: IP1:PORT1;IP2:PORT2;IP3:PORT3 |
| --broker-list          | Yes       | At least one address is required, used as the IP and port list of the BROKER hosts. Format: IP1:PORT1;IP2:PORT2;IP3:PORT3 |
| --controller-only-mode | No        | Whether a CONTROLLER node takes on only the CONTROLLER role. Defaults to false, meaning each deployed CONTROLLER node also acts as a BROKER |
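
For example, to dedicate the CONTROLLER nodes exclusively to the controller role, append the flag when generating the start commands (a sketch; the --controller-only-mode=true syntax is assumed, and the s3-url is the one generated in Step 1):

bin/automq-kafka-admin.sh generate-start-command \
--s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" \
--controller-list="192.168.0.1:9093;192.168.0.2:9093;192.168.0.3:9093" \
--broker-list="192.168.0.4:9092;192.168.0.5:9092" \
--controller-only-mode=true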

Output results

After executing the command, a command list is generated for launching AutoMQ.


############  Start Commandline ##############
To start an AutoMQ Kafka server, please navigate to the directory where your AutoMQ tgz file is located and run the following command.

Before running the command, make sure that Java 17 is installed on your host. You can verify the Java version by executing 'java -version'.

bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker,controller --override node.id=0 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.1:9092,CONTROLLER://192.168.0.1:9093 --override advertised.listeners=PLAINTEXT://192.168.0.1:9092

bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker,controller --override node.id=1 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.2:9092,CONTROLLER://192.168.0.2:9093 --override advertised.listeners=PLAINTEXT://192.168.0.2:9092

bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker,controller --override node.id=2 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.3:9092,CONTROLLER://192.168.0.3:9093 --override advertised.listeners=PLAINTEXT://192.168.0.3:9092

bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker --override node.id=3 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.4:9092 --override advertised.listeners=PLAINTEXT://192.168.0.4:9092

bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker --override node.id=4 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.5:9092 --override advertised.listeners=PLAINTEXT://192.168.0.5:9092


TIPS: Start controllers first and then the brokers.

The node.id is automatically generated starting from 0 by default.

Step 3: Start AutoMQ

To start the cluster, execute the commands from the previous step in sequence on the designated CONTROLLER and BROKER hosts. For example, to start the first CONTROLLER process on 192.168.0.1, run the first command from the generated startup command list.

bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker,controller --override node.id=0 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.1:9092,CONTROLLER://192.168.0.1:9093 --override advertised.listeners=PLAINTEXT://192.168.0.1:9092

Parameter Explanation

When using the startup command, unspecified parameters fall back to Apache Kafka's default configuration; parameters newly added by AutoMQ fall back to AutoMQ's defaults. To override a default, append additional --override key=value parameters to the end of the command.

| Parameter Name                 | Mandatory | Description |
|--------------------------------|-----------|-------------|
| s3-url                         | Yes       | Generated by the bin/automq-kafka-admin.sh command line tool; includes authentication, cluster ID, and other information |
| process.roles                  | Yes       | The options are broker and controller. A host that serves as both CONTROLLER and BROKER is configured as broker,controller |
| node.id                        | Yes       | An integer that uniquely identifies a BROKER or CONTROLLER within the Kafka cluster |
| controller.quorum.voters       | Yes       | The hosts participating in the KRaft election, with node ID, IP, and port for each, e.g. 0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 |
| listeners                      | Yes       | The IP and port the server listens on |
| advertised.listeners           | Yes       | The access address the BROKER advertises to clients |
| log.dirs                       | No        | The directory that stores KRaft and BROKER metadata |
| s3.wal.path                    | No        | In production, it is recommended to place AutoMQ WAL data on a newly mounted raw device in a standalone volume; AutoMQ can write to raw devices directly, which reduces latency and improves performance. Make sure this path points to the correct device |
| autobalancer.controller.enable | No        | Defaults to false, meaning traffic rebalancing is disabled. When enabled, AutoMQ's auto balancer component automatically migrates partitions to keep overall traffic balanced |

Tips: If you need to enable continuous traffic rebalancing or run Example: Self-Balancing When Cluster Nodes Change, it is recommended to explicitly specify --override autobalancer.controller.enable=true when starting the Controllers.
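
For instance, enabling self-balancing on the first CONTROLLER amounts to appending one extra override to its start command from Step 3 (a sketch; all other overrides are unchanged, and the s3-url is abbreviated here):

bin/kafka-server-start.sh --s3-url="s3://..." \
    --override process.roles=broker,controller --override node.id=0 \
    --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 \
    --override listeners=PLAINTEXT://192.168.0.1:9092,CONTROLLER://192.168.0.1:9093 \
    --override advertised.listeners=PLAINTEXT://192.168.0.1:9092 \
    --override autobalancer.controller.enable=true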

Running in the background

To run the application in the background, append the following to your command:

command > /dev/null 2>&1 &
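
If you also want the process to survive the shell session and keep a log for troubleshooting, a common variant (the log file name is illustrative) is:

nohup command > automq-server.log 2>&1 &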

Prepare raw device data volumes

AutoMQ leverages raw devices as data volumes for the write-ahead log (WAL) to boost write efficiency.

  1. Following Ceph's official documentation (https://docs.ceph.com), configure a raw device on each Linux host; a sketch using RBD is shown after this list.

  2. Set the raw device path to /dev/vdb.
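
As a sketch, one way to provision such a raw device from Ceph is to create an RBD image and map it on the AutoMQ host (the image name is illustrative, and the mapped device typically appears as /dev/rbd0 rather than /dev/vdb):

# Create a 20 GiB image (--size is in MiB) in the default 'rbd' pool
rbd create automq-wal --size 20480
# Map the image on this host; the command prints the resulting device path, e.g. /dev/rbd0
sudo rbd map automq-wal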

Data volume path

Use the Linux lsblk command to inspect local data volumes; unpartitioned block devices serve as data volumes. In the output below, vdb is the unpartitioned raw block device.

vda    253:0    0   20G  0 disk
├─vda1 253:1    0    2M  0 part
├─vda2 253:2    0  200M  0 part /boot/efi
└─vda3 253:3    0 19.8G  0 part /
vdb    253:16   0   20G  0 disk

By default, AutoMQ stores metadata and WAL data in the /tmp directory. Note, however, that if /tmp is mounted on tmpfs, it is unsuitable for production environments.
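
You can check which filesystem backs /tmp with findmnt; if the FSTYPE column reports tmpfs, relocate the directories as described below:

findmnt -T /tmp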

For production or formal testing environments, modify the configuration as follows: point log.dirs at a persistent metadata directory and s3.wal.path at the raw device of the data disk.

bin/kafka-server-start.sh ...\
--override  s3.telemetry.metrics.exporter.type=prometheus \
--override  s3.metrics.exporter.prom.host=0.0.0.0 \
--override  s3.metrics.exporter.prom.port=9090 \
--override  log.dirs=/root/kraft-logs \
--override  s3.wal.path=/dev/vdb \
> /dev/null 2>&1 &

Tips: /dev/vdb is the raw device path we prepared via Ceph.

With this configuration, you have successfully deployed an AutoMQ cluster on Ceph: a cost-effective, low-latency, Kafka-compatible cluster with elasticity in seconds. For a deeper dive into AutoMQ's capabilities such as instant partition reassignment and self-balancing, refer to the official examples.

References

[1] Ceph: https://ceph.io/en/

[2] What is Ceph: https://ubuntu.com/ceph/what-is-ceph
