Update documentation on new functions
pkolaczk committed Aug 12, 2024
1 parent 8ac4339 commit 5154216
Showing 1 changed file with 63 additions and 44 deletions.
107 changes: 63 additions & 44 deletions README.md
@@ -11,22 +11,23 @@

### Performance

Unlike
[NoSQLBench](https://github.com/nosqlbench/nosqlbench),
[Cassandra Stress](https://cassandra.apache.org/doc/4.0/cassandra/tools/cassandra_stress.html)
and [tlp-stress](https://thelastpickle.com/tlp-stress/),
Latte has been written in Rust and uses the native Cassandra driver from Scylla.
It features a fully asynchronous, thread-per-core execution engine,
capable of running thousands of requests per second from a single thread.

Latte has the following unique performance characteristics:

* Great scalability on multi-core machines.
* About 10x better CPU efficiency than NoSQLBench.
This means you can test large clusters with a small number of clients.
* About 50x-100x lower memory footprint than Java-based tools.
* Very low impact on operating system resources – low number of syscalls, context switches and page faults.
* No client code warmup needed. The client code works with maximum performance from the first benchmark cycle.
Even runs as short as 30 seconds give accurate results.
* No GC pauses or HotSpot recompilation in the middle of the test. You want to measure hiccups of the server,
  not the benchmarking tool.

@@ -35,8 +36,8 @@ different workloads.

### Flexibility

Other benchmarking tools often use configuration files to specify workload recipes.
Although that makes it easy to define simple workloads, it quickly becomes cumbersome when you want
to script more realistic scenarios that issue multiple
queries or need to generate data in different ways than the ones directly built into the tool.

@@ -51,7 +52,8 @@ anything you wish. There are variables, conditional statements, loops, pattern m
user-defined data structures, objects, enums, constants, macros and many more.

## Features

* Compatible with Apache Cassandra 3.x, 4.x, DataStax Enterprise 6.x and ScyllaDB
* Custom workloads with a powerful scripting engine
* Asynchronous queries
* Prepared queries
@@ -76,16 +78,20 @@ Latte is still early stage software under intensive development.
* Backwards compatibility may be broken frequently.

## Installation

### From deb package

```shell
dpkg -i latte-<version>.deb
```

### From source

1. [Install Rust toolchain](https://rustup.rs/)
2. Run `cargo install latte-cli`

## Usage

Start a Cassandra cluster somewhere (can be a local node). Then run:

```shell
@@ -125,9 +131,10 @@ The following script would benchmark querying the `system.local` table:

```rust
pub async fn run(ctx, i) {
    ctx.execute("SELECT cluster_name FROM system.local LIMIT 1").await
}
```

Instance functions on `ctx` are asynchronous, so you should call `await` on them.

The workload script can provide more than one function for running the benchmark.
@@ -142,10 +149,10 @@ The `schema` function is executed by running `latte schema` command.

```rust
pub async fn schema(ctx) {
    ctx.execute("CREATE KEYSPACE IF NOT EXISTS test \
                 WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }").await?;
    ctx.execute("DROP TABLE IF EXISTS test.test").await?;
    ctx.execute("CREATE TABLE test.test(id bigint PRIMARY KEY, data varchar)").await?;
}
```

@@ -160,42 +167,43 @@ const INSERT = "my_insert";
const SELECT = "my_select";

pub async fn prepare(ctx) {
    ctx.prepare(INSERT, "INSERT INTO test.test(id, data) VALUES (?, ?)").await?;
    ctx.prepare(SELECT, "SELECT * FROM test.test WHERE id = ?").await?;
}

pub async fn run(ctx, i) {
    ctx.execute_prepared(SELECT, [i]).await
}
```
Query parameters can be bound and passed by names as well:
```rust
const INSERT = "my_insert";

pub async fn prepare(ctx) {
    ctx.prepare(INSERT, "INSERT INTO test.test(id, data) VALUES (:id, :data)").await?;
}

pub async fn run(ctx, i) {
    ctx.execute_prepared(INSERT, #{id: 5, data: "foo"}).await
}
```
### Populating the database

Read queries are more interesting when they return non-empty result sets.

To load data into tables with `latte load`, set the number of load cycles on the context object
and define the `load` function:

```rust
pub async fn prepare(ctx) {
    ctx.load_cycle_count = 1000000;
}

pub async fn load(ctx, i) {
    ctx.execute_prepared(INSERT, [i, "Lorem ipsum dolor sit amet"]).await
}
```
@@ -204,7 +212,7 @@ dataset regardless of the data that were present in the database before:
```rust
pub async fn erase(ctx) {
    ctx.execute("TRUNCATE TABLE test.test").await
}
```
@@ -222,45 +230,54 @@ are pure, i.e. invoking them multiple times with the same parameters yields alwa
- `latte::blob(i, len)` – generates a random binary blob of length `len`
- `latte::normal(i, mean, std_dev)` – generates a floating point number from a normal distribution
- `latte::uniform(i, min, max)` – generates a floating point number from a uniform distribution
- `latte::text(i, length)` – generates a random string
- `latte::vector(length, function)` – generates a vector of given length with a function
that takes an integer element index and generates an element
- `latte::join(vector, separator)` – joins a collection of strings using a separator
- `x.clamp(min, max)` – restricts the range of an integer or a float value to given range
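
The purity of these generators can be illustrated outside Latte. The following is a minimal sketch in plain Rust (not a Rune workload script, and not Latte's actual implementation): `mix` and this version of `uniform` are hypothetical stand-ins that derive a value from the cycle number `i` alone, so the same `i` always yields the same value.

```rust
/// SplitMix64 finalizer, a widely used 64-bit mixing function.
/// It is chosen here for illustration only; the hash Latte uses may differ.
fn mix(x: u64) -> u64 {
    let mut z = x.wrapping_add(0x9E37_79B9_7F4A_7C15);
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Hypothetical stand-in for a pure uniform generator: maps the cycle
/// number `i` to a float between `min` and `max`. There is no hidden
/// state, so the result depends only on the arguments.
fn uniform(i: i64, min: f64, max: f64) -> f64 {
    // Keep the top 53 bits so the fraction in [0, 1) is exactly representable.
    let fraction = (mix(i as u64) >> 11) as f64 / (1u64 << 53) as f64;
    min + fraction * (max - min)
}
```

Calling `uniform(42, 0.0, 100.0)` twice returns the same number. Presumably this is the property that lets a workload regenerate in `run` the same values it wrote in `load`.
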
#### Type conversions
Rune uses 64-bit representation for integers and floats.
Since version 0.28 Rune numbers are automatically converted to proper target query parameter type,
therefore you don't need to do explicit conversions. E.g. you can pass an integer as a parameter
of Cassandra type `smallint`. If the number is too big to fit into the range allowed by the target
type, a runtime error will be signalled.
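
As an illustration of such a range check, here is a minimal sketch in plain Rust (not a Rune script; `to_smallint` is a hypothetical helper, not part of Latte) that narrows a 64-bit integer to the 16-bit range of a `smallint` column and fails when the value does not fit:

```rust
use std::convert::TryFrom; // already in the prelude on the 2021 edition; needed on older ones

/// Hypothetical helper: converts a 64-bit script integer to the 16-bit
/// range of a Cassandra `smallint`, returning an error instead of
/// silently truncating when the value is out of range.
fn to_smallint(value: i64) -> Result<i16, String> {
    i16::try_from(value).map_err(|_| format!("{value} does not fit in smallint"))
}
```

With this helper, `to_smallint(1234)` succeeds, while `to_smallint(40_000)` returns an error because `smallint` only covers -32768 to 32767.
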

The following methods are available:
- `x.to_integer()` – converts a float to an integer
- `x.to_float()` – converts an integer to a float
- `x.to_string()` – converts a float or integer to a string
- `x.clamp(min, max)` – restricts the range of an integer or a float value to given range

You can also convert between floats and integers by calling `to_integer` or `to_float` instance functions.
- `x as i64` – converts any number to an integer
- `x as f64` – converts any number to a float
- `x.parse::<i64>()` – parses a string as an integer
- `x.parse::<f64>()` – parses a string as a float
- `x.to_string()` – converts a float or integer to a string
#### Text resources
Text data can be loaded from files or resources with functions in the `fs` module:
- `fs::read_to_string(file_path)` – returns file contents as a string
- `fs::read_lines(file_path)` – reads file lines into a vector of strings
- `fs::read_words(file_path)` – reads file words (split by non-alphabetic characters) into a vector of strings
- `fs::read_resource_to_string(resource_name)` – returns builtin resource contents as a string
- `fs::read_resource_lines(resource_name)` – returns builtin resource lines as a vector of strings
- `fs::read_resource_words(resource_name)` – returns builtin resource words as a vector of strings

The resources are embedded in the program binary. You can find them under the `resources` folder in the
source tree.

To reduce the cost of memory allocation, it is best to load resources in the `prepare` function only once
and store them in the `data` field of the context for future use in `load` and `run`:

```rust
pub async fn prepare(ctx) {
    ctx.data.last_names = fs::read_lines("lastnames.txt")?;
    // ... prepare queries
}

pub async fn run(ctx, i) {
    let random_last_name = latte::hash_select(i, ctx.data.last_names);
    // ... use random_last_name in queries
}
```
@@ -273,16 +290,18 @@ Use `latte::param!(param_name, default_value)` macro to initialize script consta
const ROW_COUNT = latte::param!("row_count", 1000000);

pub async fn prepare(ctx) {
    ctx.load_cycle_count = ROW_COUNT;
}
```
Then you can set the parameter by using `-P`:
```
latte run <workload> -P row_count=200
```
### Mixing workloads
It is possible to run more than one workload function at the same time.
You can specify multiple functions with `-f` / `--function` and optionally give
each function a weight that determines how frequently it is called.