benchmarking: Update docs for workloads (#124)
* benchmarking: Update docs for workloads

* benchmarking: Fix inconsistencies in docs
ohsayan authored Sep 14, 2024
1 parent d84bb3c commit 61955a8
Showing 1 changed file (docs/load-testing.md) with 39 additions and 28 deletions.

The ability to simulate different workloads is currently being worked on. This means that the default workload that the benchmark tool runs may vary across releases. Check the release notes to see whether the benchmark workload has changed.
:::

## Benchmark workload

The workload that the engine currently uses (as of v0.8.1) is the following:
- A model `bench.bench` is created, with a primary key of type `string` and a column of type `uint8`
- Multiple clients are created (simulating "application servers")
- Queries are run against **unique rows**. This means that, unlike `redis-benchmark`, **every query group touches a different row, as it generally would in the real world**
- The following queries (collectively a "query group") are run for each unique row (see the sketch after this list):
- The row is first created with an `INSERT`
- All columns of the row are returned with a `SELECT`
- The integer column is incremented with an `UPDATE`
- The row is finally removed with a `DELETE`
- **By default, 1,000,000 rows are created and manipulated**
- **The time taken for each row to be sent, read back and decoded into a usable form is counted towards the total time** (that is, the time to parse responses into actual language structures such as maps and lists), once again unlike many other benchmark tools
- In total, 4,000,000 queries are run (by default)
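
A minimal sketch of how such a default run could be started is shown below; the flags are the ones documented later on this page, and the explicit `--rowcount` simply restates the default for clarity.

```sh
# Each unique row goes through one query group, in order: INSERT → SELECT → UPDATE → DELETE.
# With 1,000,000 unique rows, that comes to 4,000,000 queries in total.
sky-bench --password <root password> --rowcount 1000000
```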

:::caution
The benchmark tool will create a space `bench` with a model `bench`, and will completely remove the space and all associated data once the benchmark is complete. **Do not use this space for your own data!**
:::
## Using the benchmark tool

You will need to select a workload from the [section on workloads below](#benchmark-workloads). As workloads need `root` access to the database to create and remove spaces and tables, you will also need to provide the `root` account password.

You can run a workload like this, passing the password as an argument:

```sh
sky-bench --workload <workload_name> --password <root password>
```

You can also use the `SKYDB_PASSWORD` environment variable if you do not want to pass `--password`. You can tune the number of threads, connections, rows created and so on to simulate an environment that matches your production setting:

- `--connections <count>`: Set the number of client connections (defaults to `8 * number of logical CPUs`)
- `--threads <count>`: Set the total number of threads to use (defaults to the logical CPU count)
- `--rowcount <count>`: Set the number of unique rows to run workload sequences on
- `--keysize <size>`: Set the number of bytes to use for the primary key

See the help menu using `sky-bench --help` for additional configuration options.
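
For example, a run that sets the password through the environment and tunes the knobs above might look like the following sketch (all values are illustrative, not recommendations):

```sh
# Illustrative values only; tune these to approximate your own production-like setup.
export SKYDB_PASSWORD='<root password>'
sky-bench --workload uniform_std_v1 \
  --connections 64 \
  --threads 8 \
  --rowcount 1000000 \
  --keysize 7
```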

## Benchmark workloads

Workloads are used to emulate various usage scenarios, for example by varying the read/write proportions or the distribution of keys. While we intend to add more workloads down the line, the default workload is currently `uniform_std_v1`.

### `uniform_std_v1`

This workload executes a uniform proportion of operations (hence "uniform") against unique rows. It does the following:

- Creates a space `db`
- Creates a model `db.db` with the following definition: `create model db.db(k: binary, v: uint64)`
- Now:
- 1,000,000 unique rows are inserted using `INSERT` (in parallel)
- 1,000,000 of the unique rows that were created in the previous step are modified using `UPDATE`
- 1,000,000 of the unique rows that were created and modified earlier are fetched using `SELECT`
- 1,000,000 of the unique rows that were created are individually removed using `DELETE`
- Hence, a total of 4,000,000 queries are run

Now that you know how the workload operates, here is how to run it:

```sh
sky-bench --workload 'uniform_std_v1'
```

**Note**: Remember to provide the `root` password using `--password <root password>`; you can omit this argument if you have already set the `SKYDB_PASSWORD` environment variable. The benchmark engine will then run the full workload described above, executing all 4,000,000 queries in real time. Good luck and enjoy the results!
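
If you would like a quick smoke test before committing to the full run, you can shrink the workload using the flags documented earlier; a sketch with purely illustrative values:

```sh
# Quick trial only: 10,000 unique rows means 40,000 queries (4 per row).
# Results from such a small run are not representative of a full benchmark.
sky-bench --workload 'uniform_std_v1' --rowcount 10000 --password <root password>
```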

:::tip
You can tune the number of threads, connections and rows to simulate an environment close to your production setting. Now go ahead and run your own benchmarks and see the performance of Skytable for yourself. We know you'll love it 🚀
:::
