
Benchmarking Cassandra and other NoSQL databases with YCSB

Pekka Enberg edited this page May 13, 2014 · 12 revisions

YCSB is a popular benchmarking tool for NoSQL databases. It has ready-made adapters for several of them, including Cassandra, MongoDB, Redis and others.

YCSB comes with six out-of-the-box workloads, each testing a different common use case:

  • Workload A: Update heavy workload. This workload has a 50/50 mix of reads and writes. Application example: a session store recording recent actions.
  • Workload B: Read mostly workload. This workload has a 95/5 read/write mix. Application example: photo tagging; adding a tag is an update, but most operations are to read tags.
  • Workload C: Read only. This workload is 100% read. Application example: user profile cache, where profiles are constructed elsewhere (e.g., Hadoop).
  • Workload D: Read latest workload. In this workload, new records are inserted, and the most recently inserted records are the most popular. Application example: user status updates; people want to read the latest.
  • Workload E: Short ranges. In this workload, short ranges of records are queried, instead of individual records. Application example: threaded conversations, where each scan is for the posts in a given thread (assumed to be clustered by thread id).
  • Workload F: Read-modify-write
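The read/write mix of each workload is set in its properties file under workloads/. As a minimal sketch for inspecting it, the helper below prints the operation-mix lines of one workload file. The function name `workload_mix` is made up for illustration, and the pattern assumes the usual CoreWorkload property names (`readproportion`, `updateproportion`, and so on).

```shell
#!/bin/sh
# Hypothetical helper: print the operation-mix settings of a YCSB workload file.
# Assumes the mix is expressed with "<operation>proportion=<value>" properties.
workload_mix() {
    grep -E '^(read|update|insert|scan|readmodifywrite)proportion=' "$1"
}

# Usage: workload_mix workloads/workloada
```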

The following are instructions for using YCSB to test Cassandra on OSv.

Prerequisites

You will need both cassandra-cli and YCSB installed to test Cassandra.

cassandra-cli

sudo yum install cassandra12

YCSB

install

Test Bed

Running Cassandra

Run Cassandra on OSv:

capstan run cloudius/osv-cassandra -n bridge

Configure Cassandra

YCSB does not provide a way to set up the prerequisites of each NoSQL database. For Cassandra, the following needs to be executed:

cassandra-cli -h 192.168.122.89

(Use the IP address of your Cassandra server; 192.168.122.89 is the default IP for KVM with a network bridge.)

create keyspace usertable;
use usertable;
create column family data;
exit;
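The same statements can also be run non-interactively, which helps when the setup has to be repeated for every fresh Cassandra instance. The sketch below assumes that your cassandra-cli version accepts a statement file through the `-f` option. The function names, the `CASSANDRA_HOST` variable and the file name `ycsb_schema.txt` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper names; only the statements themselves come from the steps above.
CASSANDRA_HOST="${CASSANDRA_HOST:-192.168.122.89}"

# Write the keyspace and column family statements that YCSB expects to a file.
write_schema() {
    cat > "$1" <<'EOF'
create keyspace usertable;
use usertable;
create column family data;
EOF
}

# Feed the statement file to cassandra-cli (assumes the -f option is available).
apply_schema() {
    cassandra-cli -h "$CASSANDRA_HOST" -f "$1"
}

# Usage: write_schema ycsb_schema.txt && apply_schema ycsb_schema.txt
```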

Benchmark

Each benchmark has two phases:

  1. upload the test data to DB
  2. execute transactions

Loading test data

./bin/ycsb load cassandra-10 -p hosts="192.168.122.89" -P workloads/workloada > workloada_load_res.txt

Running the benchmark

./bin/ycsb run cassandra-10 -threads 10 -p hosts="192.168.122.89" -P workloads/workloada > workloada_run_res.txt
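Since the load and run phases always go together, they can be wrapped in a small script. This is only a sketch: the `HOSTS` and `THREADS` variables and the function names `ycsb_cmd` and `benchmark` are illustrative, while the command lines are the two shown above.

```shell
#!/bin/sh
# Hypothetical wrapper: variable and function names are illustrative.
HOSTS="${HOSTS:-192.168.122.89}"
THREADS="${THREADS:-10}"

# Print the YCSB command line for one phase ("load" or "run") of a workload.
ycsb_cmd() {
    phase=$1
    workload=$2
    if [ "$phase" = "run" ]; then
        echo "./bin/ycsb run cassandra-10 -threads $THREADS -p hosts=$HOSTS -P workloads/$workload"
    else
        echo "./bin/ycsb load cassandra-10 -p hosts=$HOSTS -P workloads/$workload"
    fi
}

# Execute both phases in order, saving the output of each phase to its own file.
# The command string is split on whitespace, so HOSTS must not contain spaces.
benchmark() {
    workload=$1
    for phase in load run; do
        $(ycsb_cmd "$phase" "$workload") > "${workload}_${phase}_res.txt" || return 1
    done
}
```

Calling `benchmark workloada` from the YCSB directory writes workloada_load_res.txt and workloada_run_res.txt, the same file names used by the commands above.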

The results will be available at workloada_run_res.txt:

$ head workloada_run_res.txt
YCSB Client 0.1
Command line: -db com.yahoo.ycsb.db.CassandraClient10 -p hosts=192.168.122.89 -P workloads/workloadc -s -threads 10 -target 100 -t
[OVERALL], RunTime(ms), 10139.0
[OVERALL], Throughput(ops/sec), 98.62905611993293
[READ], Operations, 1000
[READ], AverageLatency(us), 1003.36
[READ], MinLatency(us), 412
[READ], MaxLatency(us), 25990
[READ], 95thPercentileLatency(ms), 1
[READ], 99thPercentileLatency(ms), 3
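Each result line has the form `[SECTION], Metric, value`, so single numbers are easy to pull out for comparing runs. The following is a minimal sketch using awk; the function name `ycsb_metric` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one metric from a YCSB result file.
# Arguments: result file, section name without brackets, metric name.
ycsb_metric() {
    file=$1
    section=$2
    metric=$3
    # Fields are separated by ", "; match the section and metric exactly.
    awk -F', ' -v s="[$section]" -v m="$metric" '$1 == s && $2 == m { print $3 }' "$file"
}

# Usage: ycsb_metric workloada_run_res.txt OVERALL 'Throughput(ops/sec)'
```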

