Nightly Benchmark Summary SVG Workflow #153

Merged: 38 commits, Aug 3, 2023
Commits:
a84e6dd
Test embedding simple svg file
stanbrub Jun 28, 2023
99a69b9
Merge branch 'deephaven:main' into embed-benchmark-summary-readme
stanbrub Jun 28, 2023
47f3de6
Merge branch 'deephaven:main' into embed-benchmark-summary-readme
stanbrub Jun 28, 2023
0428cd1
try html in svg foreign object
stanbrub Jun 29, 2023
f7c9124
try again
stanbrub Jun 29, 2023
9e5cc47
try again
stanbrub Jun 29, 2023
f98e45f
Update README.md
stanbrub Jul 1, 2023
51b7b9b
Update README.md
stanbrub Jul 2, 2023
f70d224
Update README.md
stanbrub Jul 3, 2023
5d19dcc
Added SVG summary template generation. Wrote up Public Summary and De…
stanbrub Jul 7, 2023
6f65d9b
Fixed some links
stanbrub Jul 7, 2023
4ebd1e1
formatting
stanbrub Jul 7, 2023
34876ca
Merge branch 'deephaven:main' into embed-benchmark-summary-readme
stanbrub Jul 10, 2023
12e8274
Merge branch 'deephaven:main' into embed-benchmark-summary-readme
stanbrub Jul 11, 2023
0093d5d
Update PublicSummaryWorkflow.md
stanbrub Jul 11, 2023
0593ab7
Merge branch 'deephaven:main' into embed-benchmark-summary-readme
stanbrub Jul 17, 2023
36a7a9a
More svg font/color/sizing work and more template vars
stanbrub Jul 18, 2023
4029d8e
Rearrange benchmarks
stanbrub Jul 18, 2023
4086fea
Better header and footer names
stanbrub Jul 19, 2023
7a58fef
Ugh...line endings
stanbrub Jul 19, 2023
0cf8107
Fixed regular experssion in test
stanbrub Jul 19, 2023
2317166
Reworked summary for two columns; static and ticking
stanbrub Jul 20, 2023
c6caa7c
Added demo script code
stanbrub Jul 21, 2023
1eae078
Added single backup of SVG Summary on Upload
stanbrub Jul 24, 2023
2790fb6
Formatting
stanbrub Jul 25, 2023
d5a6b9f
Formatting
stanbrub Jul 25, 2023
c1f2e42
Added nightly summary markdown
stanbrub Jul 25, 2023
0fb29a7
Changed Demo link
stanbrub Jul 25, 2023
b71a615
Add benchmark demo markdown
stanbrub Jul 26, 2023
32aa4fd
Updated docker compose file for Getting Started
stanbrub Jul 26, 2023
df25972
Added more details to the BenchmarkDemo markdown
stanbrub Jul 26, 2023
52d11fa
Reworked Summary SVG based on feedback from Don and Ryan
stanbrub Jul 28, 2023
87e9527
Update README.md
stanbrub Jul 28, 2023
a19b033
Added Svg Summary javadocs
stanbrub Jul 28, 2023
da84937
Merge branch 'embed-benchmark-summary-readme' of https://github.com/s…
stanbrub Jul 28, 2023
be698fe
Update README.md
stanbrub Jul 29, 2023
dea1ce3
Update NightlySummary.md
stanbrub Jul 29, 2023
6a65234
Delete deephaven-logo.svg
stanbrub Aug 3, 2023
10 changes: 10 additions & 0 deletions .github/workflows/remote-benchmarks.yml
@@ -57,6 +57,16 @@ jobs:
with:
credentials_json: ${{secrets.BENCHMARK_GCLOUD}}

- name: Set up Cloud SDK
uses: google-github-actions/setup-gcloud@v1

- name: Backup Existing Benchmark Summary SVG
run: |
SUMMARY_PREFIX=gs://deephaven-benchmark/${RUN_TYPE}/benchmark-summary
if gsutil stat ${SUMMARY_PREFIX}.svg &>/dev/null; then
gsutil mv ${SUMMARY_PREFIX}.svg ${SUMMARY_PREFIX}.prev.svg &>/dev/null
fi

- name: Upload Benchmark Results to GCloud
uses: google-github-actions/upload-cloud-storage@v1
with:
3 changes: 2 additions & 1 deletion README.md
@@ -1,6 +1,7 @@
# Deephaven Benchmark

![Operation Rate Change Tracking By Release](docs/BenchmarkChangeTable.png)
[Summary of Latest Successful Nightly Benchmarks](docs/NightlySummary.md)
![Operation Rate Change Tracking By Release](https://storage.googleapis.com/deephaven-benchmark/nightly/benchmark-summary.svg?)

The Benchmark framework provides support for gathering performance measurements and statistics for operations on tabular data. It uses the JUnit
framework as a runner and works from popular IDEs or from the command line. It is geared towards scale testing interfaces capable of ingesting
51 changes: 51 additions & 0 deletions docs/BenchmarkDemo.md
@@ -0,0 +1,51 @@
# Nightly and Release Benchmark Tables

Deephaven benchmarks run every night on the same (or equivalent) server. The script below
generates tables using the latest benchmark data.

Run the script and you will see tables that show the raw results and metrics,
differences between benchmark runs, and differences in metrics before and after
each operation has run.

Explore further by creating your own notebook, copying the script into it, and
running it. Then try using the resulting tables to
[generate your own](https://deephaven.io/core/docs/reference/cheat-sheets/cheat-sheet/),
visualize table data
with [ChartBuilder](https://deephaven.io/core/docs/how-to-guides/user-interface/chart-builder/),
or experiment with [the scripted UI](https://deephaven.io/core/docs/how-to-guides/plotting/category/).

```python
import os
from urllib.request import urlopen

root = 'file:///data' if os.path.exists('/data/deephaven-benchmark') else 'https://storage.googleapis.com'
with urlopen(root + '/deephaven-benchmark/benchmark_tables.dh.py') as r:
benchmark_storage_uri_arg = root + '/deephaven-benchmark'
benchmark_category_arg = 'release' # release | nightly
benchmark_max_runs_arg = 5 # Latest X runs to include
exec(r.read().decode(), globals(), locals())
```
The script works with the benchmark data stored on this demo system but also works in any
[Deephaven Community Core](https://deephaven.io/core/docs/) instance that has
internet connectivity. Copy the script, plus any additions you have made, into another
Deephaven notebook and run it just as you did here.

## How Does This Python Snippet Work?

The following is a line-by-line walkthrough of what the above script is doing:
1. Import the [_urllib.request_](https://docs.python.org/3/library/urllib.request.html) package (plus _os_ for the path check) and set up _urlopen_ for use
2. Blank
3. Detect the parent location of the benchmark data; local Deephaven data directory or GCloud data directory
4. Open the *benchmark_tables* script from the discovered parent location
5. Tell the *benchmark_tables* script where the benchmark data is
6. Tell the *benchmark_tables* script what set of data to process
7. Tell the *benchmark_tables* script how many benchmark runs to include
8. Execute the *benchmark_tables* script to generate the tables
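As a rough illustration of step 8, _exec_ runs the downloaded code in a namespace that already holds the *_arg* variables, which is how the *benchmark_tables* script sees the settings. The one-line script string below is a stand-in for illustration, not the real *benchmark_tables* code:

```python
# Sketch: exec() runs downloaded code in a namespace that already contains
# the *_arg settings, so the script can read them like ordinary globals.
# The one-line script below is a stand-in, not the real benchmark_tables code.
script = "summary = f'{benchmark_category_arg}: last {benchmark_max_runs_arg} runs'"

ns = {'benchmark_category_arg': 'release', 'benchmark_max_runs_arg': 5}
exec(script, ns)
print(ns['summary'])  # release: last 5 runs
```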

Script Arguments:
1. *benchmark_storage_uri_arg*:
- Where to load benchmark data from (don't change if you want to use Deephaven data storage)
2. *benchmark_category_arg*:
   - _release_ for benchmarks collected on specific [Deephaven releases](https://github.com/deephaven/deephaven-core/releases)
- _nightly_ for benchmarks collected every night
3. *benchmark_max_runs_arg*:
- The number of benchmark runs to include, starting from latest
4 changes: 2 additions & 2 deletions docs/GettingStarted.md
Original file line number Diff line number Diff line change
@@ -34,13 +34,13 @@ services:
volumes:
- ./data:/data
environment:
- START_OPTS=-Xmx24g
- "START_OPTS=-Xmx24g -DAuthHandlers=io.deephaven.auth.AnonymousAuthenticationHandler"

redpanda:
command:
- redpanda
- start
- --smp 2
- --smp 2
- --memory 2G
- --reserve-memory 0M
- --overprovisioned
27 changes: 27 additions & 0 deletions docs/NightlySummary.md
@@ -0,0 +1,27 @@
# Nightly Benchmark Summary

![Operation Rate Change Tracking By Release](https://storage.googleapis.com/deephaven-benchmark/nightly/benchmark-summary.svg?)

## Summary Table Organization

- Common operations are shown first, followed by less common operations
- Benchmarks are taken for each operation twice: Static and Ticking
  - [Static](https://deephaven.io/core/docs/how-to-guides/data-import-export/parquet-flat): Parquet data is read into memory and made
available to the operation as a whole
  - [Ticking](https://deephaven.io/core/docs/conceptual/deephaven-overview/): Data is released incrementally each cycle
- The Benchmark Date shows the day the benchmarks were collected, which is the latest successful run

## Basic Benchmark Methodology

- Run on equivalent hardware every night
- Load test data into memory before each run to normalize results by excluding disk and network I/O
- Scale the data (row count) to target an operation run time of around 10 seconds
  - This is not always possible given the speed of some operations versus benchmark hardware constraints
- Always report a consistent unit for the result like rows processed per second
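The consistent result unit mentioned above amounts to a simple normalization; a minimal sketch with illustrative numbers:

```python
def rows_per_sec(row_count: int, duration_secs: float) -> float:
    """Normalize a benchmark result to rows processed per second."""
    return row_count / duration_secs

# e.g. 10 million rows processed in the ~10 second target window
print(rows_per_sec(10_000_000, 10.0))  # 1000000.0
```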

## Digging Deeper into the Demo

A [Benchmark Demo](https://controller.try-dh.demo.community.deephaven.io/get_ide) is provided on the Deephaven Code Studio
demo cluster. Among the other demo notebooks is a Benchmark notebook (TBD) that provides scripts to generate comparative
benchmark and metric tables using a copy of the latest benchmark results. Users can then experiment with Deephaven
query operations and charting to visualize the data.
109 changes: 109 additions & 0 deletions docs/PublicSummaryWorkflow.md
@@ -0,0 +1,109 @@
# Public Benchmark Summary Workflow (and Demo)

## Background
Benchmarks for Deephaven are run nightly on a bare metal server. Results for successful runs are collected in a read-only
GCloud bucket that is publicly available through the REST API at <https://storage.googleapis.com/deephaven-benchmark>.

The easiest way to access and use the Benchmark data is to run a [Python snippet](PublishedResults.md) in a
[Deephaven Community Core](https://deephaven.io/community/) (DHC) installation. This requires access to GCloud for that instance.

## Benchmark Summary
As of this writing, nearly 600 benchmarks are produced nightly for DHC. That's great for developers, but users may want
a more concise way to gauge DHC query operations.

A summary can be added to the README.md for the [Benchmark Project](https://github.com/stanbrub/benchmark) to give a
simple overview of common query operations. An example is provided in the following forked
[README](https://github.com/stanbrub/benchmark/tree/embed-benchmark-summary-readme).

### Workflow
The table embedded in the example README can be generated along with the nightly GitHub run, deposited in the
GCloud bucket on success, and referenced in the README from there. The table is an SVG file embedded using a Markdown
image reference. Though SVG hyperlinks are not honored in GitHub markdown, clicking on the image pops up the SVG document,
where remote links then work.
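In the README diff in this PR, the embed is an ordinary Markdown link plus an image reference:

```markdown
[Summary of Latest Successful Nightly Benchmarks](docs/NightlySummary.md)

![Operation Rate Change Tracking By Release](https://storage.googleapis.com/deephaven-benchmark/nightly/benchmark-summary.svg?)
```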

Pros:
- Two ways to get a Benchmark summary with a link; [README](https://github.com/stanbrub/benchmark/tree/embed-benchmark-summary-readme)
or [GCloud](https://storage.googleapis.com/deephaven-benchmark/benchmark-summary.svg)
- Eliminates the need to check new tables into the Benchmark project
- Simple to generate from a template at the end of the Benchmark run and upload to GCloud using a GitHub workflow

Cons:
- SVG in Markdown is limited (no JS scripts; some CSS events don't work)
- The battle of caches (GCloud, GitHub, browser) can delay visibility of changes
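One common mitigation for the cache battle, hinted at by the trailing `?` on the README's image URL, is to vary the query string so intermediate caches treat each request as a new URL. A sketch (the `cache_bust` parameter name is hypothetical; any unused parameter works):

```python
import time

# Varying the query string makes intermediate caches (GCloud, GitHub's image
# proxy, the browser) treat each request as a distinct URL.
# "cache_bust" is an illustrative name, not a parameter the bucket requires.
base = "https://storage.googleapis.com/deephaven-benchmark/benchmark-summary.svg"
fresh_url = f"{base}?cache_bust={int(time.time())}"
print(fresh_url)
```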

## Digging Deeper Demo
Even though there is a [Python snippet](PublishedResults.md) that creates
some tables from the data in the GCloud bucket, users who want to explore DHC benchmarks may be reluctant to download
and install DHC just to look at benchmarks. Navigating from the Benchmark Summary to a Demo DHC worker provides a way
to explore DHC operation performance without the effort of installation and potential troubleshooting.

Having a Demo that uses real Deephaven data rather than mocked up data shows confidence in the product. It also allows
a level of scrutiny that could make benchmark tests and performance better.

### Existing Demo Server Workflow (Using Python Snippet)
At [Deephaven IO](https://deephaven.io/) there is a "Try Demo" button that points the user to a live DHC installation
that has some pre-defined notebooks and data. A Benchmark folder seems to fit well here. Adding the
[Python snippet](PublishedResults.md) to a notebook can provide access to any of the Benchmark data in the cloud.

Pros:
- Use an existing and maintained Demo cluster
- Using the Python snippet, changes to data or tables do not require a change to the Demo
- No extra storage required for data, since it's in the cloud

Cons:
- Demo servers do not allow internet access from DHC queries
- Running the Python snippet is slower than running from local data

### Existing Demo Server Workflow (Local Scripts and Data)
Like the previous workflow, this one uses the existing Demo Servers. However, both data and scripts/notebooks would be
copied onto (or checked into) the Demo Servers nightly. Everything would be run locally with no downloads.

Pros:
- Use an existing and maintained Demo cluster
- No worry of internet abuse from DHC, since access is turned off
- Faster query runs, since data is local

Cons:
- Data and scripts must either be checked in nightly or copied nightly to keep up to date
- Data duplicated in GCloud and on the Demo server
- Maintain two versions of the Benchmark tables query; one for the cloud and one for local

### New Demo Server Workflow
Provide a new Demo cluster exclusively for Benchmarks that is configured to work with the GCloud bucket.
Provide login access for users to run Benchmark notebooks.

Pros:
- Customizable for the purpose of Benchmarks
- Access control to avoid internet abuse

Cons:
- Another cluster to maintain
- User access maintenance

## Ideal Path
After discussions, the ideal path is the "Existing Demo Server Workflow (Using Python Snippet)". This
provides the lowest maintenance/cost approach. However, poking a hole in the firewall for GCloud bucket
HTTP storage access is complicated by the fact that the Demo cluster installation uses Kubernetes.

## Workable Path
Since the Demo clusters run on Kubernetes and behind the Kubernetes firewall, the pushback on the Ideal Path
is the difficulty of properly allowing HTTP access to a public GCloud bucket. (The "why" is beyond this
document's scope.) Instead, the possibility is to mount a GCP drive and copy/sync data nightly from GCloud
after a successful build.

### From Chip
- A GCP drive is created.
- Your benchmark stuff gets the data to the GCP drive using some mechanism such as gsutil rsync or by mounting the drive into the machine running the tests and writing directly.
- The demo system will create a Kube persistent volume that refers to the GCP drive.
- When a worker is spun up, it mounts the Kube persistent volume so that it can be seen locally on a path such as /data/benchmark
- A script or jupyter file can then access the data via the mounted /data/benchmark path.
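The persistent-volume piece of those steps might look roughly like this (hypothetical names and sizes; the real demo cluster configuration may differ):

```yaml
# Hypothetical PersistentVolume backed by a GCP persistent disk that the
# benchmark workflow syncs data onto; workers mount it read-only.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: benchmark-data-pv
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadOnlyMany
  gcePersistentDisk:
    pdName: benchmark-data   # name of the GCP disk (assumed)
    fsType: ext4
```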

Resources:
- https://devopscube.com/persistent-volume-google-kubernetes-engine/
9 changes: 5 additions & 4 deletions docs/PublishedResults.md
@@ -10,14 +10,15 @@ in an instance of the Deephaven Engine.
````
import os
from urllib.request import urlopen

script_uri = 'https://storage.googleapis.com/deephaven-benchmark/benchmark_tables.dh.py'
with urlopen(script_uri) as r:
root = 'file:///data' if os.path.exists('/data/deephaven-benchmark') else 'https://storage.googleapis.com'
with urlopen(root + '/deephaven-benchmark/benchmark_tables.dh.py') as r:
benchmark_storage_uri_arg = root + '/deephaven-benchmark'
benchmark_category_arg = 'release' # release | nightly
benchmark_max_runs_arg = 10 # Latest X runs to include
benchmark_max_runs_arg = 2 # Latest X runs to include
exec(r.read().decode(), globals(), locals())
````

This will download the available benchmarks for the given benchmark category (release or nightly), merge test runs together, and generate some
This will process the available benchmarks for the given benchmark category (release or nightly), merge test runs together, and generate some
useful Deephaven tables that can be used to explore the benchmarks.

Requirements:
7 changes: 6 additions & 1 deletion src/main/java/io/deephaven/benchmark/run/BenchmarkMain.java
@@ -1,6 +1,8 @@
/* Copyright (c) 2022-2023 Deephaven Data Labs and Patent Pending */
package io.deephaven.benchmark.run;

import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.junit.platform.console.ConsoleLauncher;
import io.deephaven.benchmark.api.Bench;
@@ -31,7 +33,10 @@ static int main1(String[] args) {
setSystemProperties();
int exitCode = ConsoleLauncher.execute(System.out, System.err, args).getExitCode();
if (exitCode == 0) {
new ResultSummary(Paths.get(Bench.rootOutputDir)).summarize();
Path outputDir = Paths.get(Bench.rootOutputDir);
URL csv = new ResultSummary(outputDir).summarize();
URL svgTemplate = BenchmarkMain.class.getResource("profile/benchmark-summary.template.svg");
new SvgSummary(csv, svgTemplate, outputDir.resolve("benchmark-summary.svg")).summarize();
}
return exitCode;
}
5 changes: 4 additions & 1 deletion src/main/java/io/deephaven/benchmark/run/ResultSummary.java
@@ -3,6 +3,7 @@

import static java.nio.file.StandardOpenOption.*;
import java.io.BufferedWriter;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
@@ -22,9 +23,10 @@ class ResultSummary {
this.summaryFile = getSummaryFile(rootDir, "benchmark-summary-results.csv");
}

void summarize() {
URL summarize() {
if (!Files.exists(rootDir)) {
System.out.println("Skipping summary because of missing output directory: " + rootDir);
return null;
}
try (BufferedWriter out = Files.newBufferedWriter(summaryFile, CREATE, WRITE, TRUNCATE_EXISTING)) {
boolean isHeaderWritten = false;
@@ -40,6 +42,7 @@ else if (i == 0)
writeSummaryLine(runId, lines.get(i), out);
} ;
}
return summaryFile.toUri().toURL();
} catch (Exception ex) {
throw new RuntimeException("Failed to write summary results: " + summaryFile, ex);
}