Commit

Build static HTML page using the latest benchmark results (#467)
This uses a GitHub action to download benchmark data from S3. It then
uses a static generator to create charts per benchmark / Postgres
version. These results are then bundled up using Jekyll, which publishes
the README, available at:

https://xataio.github.io/pgroll/

The benchmarks themselves can be seen at
https://xataio.github.io/pgroll/benchmarks.html (and are also linked
from the README).

To make this work without requiring users of pgroll as a library to
import the charting library, we created a new module in `/dev`. We had
to duplicate the definitions of the benchmark result structs because
they can't be imported from the main `pgroll` module until a new version
of that module is published. Once it has been published, the duplicated
structs can be replaced with an import.

Part of #408
ryanslade authored Nov 20, 2024
1 parent 8469bad commit 860f5a9
Showing 10 changed files with 440 additions and 40 deletions.
67 changes: 67 additions & 0 deletions .github/workflows/publish_benchmarks.yaml
@@ -0,0 +1,67 @@
name: Publish Benchmark

on:
  workflow_run:
    workflows: [ "Benchmark" ]
    branches: [ main ]
    types:
      - completed
  workflow_dispatch:

permissions:
  id-token: write # For getting AWS permissions
  contents: write
  packages: read
  pages: write

# Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued.
# However, do NOT cancel in-progress runs as we want to allow these production deployments to complete.
concurrency:
  group: "pages"
  cancel-in-progress: false

jobs:
  publish:
    name: Publish benchmarks
    runs-on: ubuntu-latest
    steps:
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: arn:aws:iam::493985724844:role/pgroll-benchmark-results-access
          aws-region: us-east-1
          mask-aws-account-id: 'no'

      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Go
        uses: actions/setup-go@v5
        with:
          go-version-file: 'dev/go.mod'

      - name: Setup Pages
        uses: actions/configure-pages@v5

      - name: Download results and build html
        working-directory: ./dev
        run: |
          aws s3 cp s3://pgroll-benchmark-results/benchmark-results.json $HOME/benchmark-results.json
          go run benchmark-results/build.go $HOME/benchmark-results.json /home/runner/work/pgroll/pgroll/benchmarks.html

      # This will pick up the benchmarks.html file generated in the previous step and will also
      # publish the README at index.html
      - name: Build with Jekyll
        uses: actions/jekyll-build-pages@v1
        with:
          source: ./
          destination: ./static

      - name: Upload artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: ./static

      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
11 changes: 11 additions & 0 deletions README.md
@@ -49,6 +49,7 @@ When no more client applications are using the old schema version, the migration
- [Installation](#installation)
- [Usage](#usage)
- [Documentation](#documentation)
- [Benchmarks](#benchmarks)
- [Contributing](#contributing)
- [License](#license)
- [Support](#support)
@@ -164,6 +165,16 @@ pgroll --postgres-url postgres://user:password@host:port/dbname rollback

For more advanced usage, a tutorial, and detailed options refer to the full [Documentation](docs/README.md).

## Benchmarks

Some performance benchmarks are run on each commit to `main` in order to track performance over time. Each benchmark is run against Postgres 14.8, 15.3, 16.4, 17.0 and "latest". Each line on the chart represents the number of rows the benchmark was run against, currently 10k, 100k and 300k rows.

* Backfill: Rows/s to backfill a text column with the value `placeholder`. We use our default batching strategy of 10k rows per batch with no backoff.
* WriteAmplification/NoTrigger: Baseline rows/s when writing data to a table without a `pgroll` trigger.
* WriteAmplification/WithTrigger: Rows/s when writing data to a table when a `pgroll` trigger has been set up.

They can be seen [here](https://xataio.github.io/pgroll/benchmarks.html).

## Contributing

We welcome contributions from the community! If you'd like to contribute to `pgroll`, please follow these guidelines:
255 changes: 255 additions & 0 deletions dev/benchmark-results/build.go
@@ -0,0 +1,255 @@
// SPDX-License-Identifier: Apache-2.0

package main

import (
	"bufio"
	"cmp"
	"encoding/json"
	"fmt"
	"log"
	"maps"
	"os"
	"slices"
	"sort"
	"strings"

	"github.com/go-echarts/go-echarts/v2/charts"
	"github.com/go-echarts/go-echarts/v2/components"
	"github.com/go-echarts/go-echarts/v2/opts"
	"github.com/spf13/cobra"
)

var rootCmd = &cobra.Command{
	Use:          "build <inputfile> <outputfile>",
	SilenceUsage: true,
	Args:         cobra.ExactArgs(2),
	RunE: func(cmd *cobra.Command, args []string) error {
		err := buildCharts(args[0], args[1])
		if err != nil {
			cmd.PrintErr(err)
			os.Exit(1)
		}
		return nil
	},
}

// This will generate line charts displaying benchmark results over time. Each set of charts will
// apply to a single version of Postgres.
func main() {
	if err := rootCmd.Execute(); err != nil {
		os.Exit(1)
	}
}

func buildCharts(inputFile, outputFile string) error {
	log.Println("Loading data")
	reports, err := loadData(inputFile)
	if err != nil {
		log.Fatalf("Loading data: %v", err)
	}
	log.Printf("Loaded %d reports", len(reports))

	log.Println("Generating charts")
	allCharts := generateCharts(reports)

	page := components.NewPage()
	page.SetPageTitle("pgroll benchmark results")
	page.SetLayout("flex")

	for _, c := range allCharts {
		page.AddCharts(c)
	}

	f, err := os.Create(outputFile)
	if err != nil {
		log.Fatalf("Creating output file: %v", err)
	}
	defer func() {
		if err := f.Close(); err != nil {
			log.Fatalf("Closing output file: %v", err)
		}
	}()

	if err := page.Render(f); err != nil {
		log.Fatalf("Rendering: %s", err)
	}
	log.Printf("Charts generated at %s", outputFile)

	return nil
}

type dataKey struct {
	postgresVersion string
	benchmarkName   string
	rowCount        int
	sha             string
}

type chartKey struct {
	postgresVersion string
	benchmarkName   string
}

// generateCharts will generate charts grouped by postgres version and benchmark with series for each
// rowCount
func generateCharts(reports []BenchmarkReports) []*charts.Line {
	// Time data for each sha so we can order them later
	timeOrder := make(map[string]int64) // shortSHA -> timestamp

	// rows/s grouped by dataKey
	groupedData := make(map[dataKey]float64)

	// set of possible row counts
	rowCounts := make(map[int]struct{})

	for _, group := range reports {
		short := shortSHA(group.GitSHA)
		timeOrder[short] = group.Timestamp
		for _, report := range group.Reports {
			key := dataKey{
				postgresVersion: group.PostgresVersion,
				benchmarkName:   trimName(report.Name),
				sha:             short,
				rowCount:        report.RowCount,
			}
			groupedData[key] = report.Result
			rowCounts[report.RowCount] = struct{}{}
		}
	}

	// Now we have the data grouped in a way that makes it easy for us to create each chart

	// Create x-axis for each chart
	xs := make(map[chartKey][]string)
	for d := range groupedData {
		ck := chartKey{postgresVersion: d.postgresVersion, benchmarkName: d.benchmarkName}
		x := xs[ck]
		x = append(x, d.sha)
		xs[ck] = x
	}
	// Sort and deduplicate xs in time order
	for key, x := range xs {
		// Dedupe
		slices.Sort(x)
		x = slices.Compact(x)
		// Sort by time
		slices.SortFunc(x, func(a, b string) int {
			return cmp.Compare(timeOrder[a], timeOrder[b])
		})
		xs[key] = x
	}

	allCharts := make([]*charts.Line, 0, len(xs))

	for ck, xValues := range xs {
		chart := charts.NewLine()
		chart.SetGlobalOptions(
			charts.WithTitleOpts(opts.Title{
				Title: fmt.Sprintf("%s (%s)", ck.benchmarkName, ck.postgresVersion),
			}),
			charts.WithAnimation(false))
		chart.SetXAxis(xValues)

		series := make(map[int][]float64) // rowCount -> rows/s

		// Add series per rowCount
		for _, x := range xValues {
			for rc := range rowCounts {
				dk := dataKey{
					postgresVersion: ck.postgresVersion,
					benchmarkName:   ck.benchmarkName,
					rowCount:        rc,
					sha:             x,
				}
				value, ok := groupedData[dk]
				if !ok {
					continue
				}

				series[rc] = append(series[rc], value)
			}
		}

		// Make sure row counts are consistently sorted
		sortedRowCounts := slices.Collect(maps.Keys(rowCounts))
		slices.Sort(sortedRowCounts)

		for _, rowCount := range sortedRowCounts {
			s := series[rowCount]

			name := fmt.Sprintf("%d", rowCount)
			data := make([]opts.LineData, len(s))
			for i := range s {
				data[i] = opts.LineData{
					Value: s[i],
				}
			}
			chart.AddSeries(name, data)
		}

		allCharts = append(allCharts, chart)
	}

	sort.Slice(allCharts, func(i, j int) bool {
		return allCharts[i].Title.Title < allCharts[j].Title.Title
	})

	return allCharts
}

func loadData(filename string) (allReports []BenchmarkReports, err error) {
	f, err := os.Open(filename)
	if err != nil {
		return nil, fmt.Errorf("opening file: %w", err)
	}
	// Only propagate the close error if nothing else failed, so a real
	// read/parse error isn't overwritten by a nil (or close) error.
	defer func() {
		if cerr := f.Close(); cerr != nil && err == nil {
			err = cerr
		}
	}()

	scanner := bufio.NewScanner(f)

	// Each line represents a collection of results from a single commit
	for scanner.Scan() {
		var reports BenchmarkReports
		line := scanner.Text()
		if err := json.Unmarshal([]byte(line), &reports); err != nil {
			return nil, fmt.Errorf("unmarshalling reports: %w", err)
		}
		allReports = append(allReports, reports)
	}
	if err := scanner.Err(); err != nil {
		return nil, fmt.Errorf("scanning input: %w", err)
	}

	return allReports, err
}

// trimName strips the "Benchmark" prefix and the trailing "/<rowcount>" group
// (benchmarks are run once per row count) from a benchmark name, e.g.
// "BenchmarkBackfill/10000" -> "Backfill".
func trimName(name string) string {
	return strings.TrimPrefix(name[:strings.LastIndex(name, "/")], "Benchmark")
}

// shortSHA returns the first 7 characters of a full git SHA.
func shortSHA(sha string) string {
	return sha[:7]
}

type BenchmarkReports struct {
	GitSHA          string
	PostgresVersion string
	Timestamp       int64
	Reports         []BenchmarkReport
}

func (r *BenchmarkReports) AddReport(report BenchmarkReport) {
	r.Reports = append(r.Reports, report)
}

type BenchmarkReport struct {
	Name     string
	RowCount int
	Unit     string
	Result   float64
}
20 changes: 20 additions & 0 deletions dev/benchmark-results/build_test.go
@@ -0,0 +1,20 @@
// SPDX-License-Identifier: Apache-2.0

package main

import (
	"testing"

	"github.com/stretchr/testify/assert"
)

// TestBuildChartsRegression is a simple regression test
func TestBuildChartsRegression(t *testing.T) {
	reports, err := loadData("testdata/benchmark-results.json")
	assert.NoError(t, err)
	assert.Len(t, reports, 5)

	generated := generateCharts(reports)
	// 5 versions * 3 benchmarks
	assert.Len(t, generated, 15)
}
5 changes: 5 additions & 0 deletions dev/benchmark-results/testdata/benchmark-results.json
@@ -0,0 +1,5 @@
{"GitSHA":"9c7821fb64e345e3d7b18aaefe010bb5ffb8dddc","PostgresVersion":"14.8","Timestamp":1731919355,"Reports":[{"Name":"BenchmarkBackfill/10000","RowCount":10000,"Unit":"rows/s","Result":53318.89969391059},{"Name":"BenchmarkBackfill/100000","RowCount":100000,"Unit":"rows/s","Result":20771.251554603292},{"Name":"BenchmarkBackfill/300000","RowCount":300000,"Unit":"rows/s","Result":9355.33788169637},{"Name":"BenchmarkWriteAmplification/NoTrigger/10000","RowCount":10000,"Unit":"rows/s","Result":349447.42577457114},{"Name":"BenchmarkWriteAmplification/NoTrigger/100000","RowCount":100000,"Unit":"rows/s","Result":347618.8045838434},{"Name":"BenchmarkWriteAmplification/NoTrigger/300000","RowCount":300000,"Unit":"rows/s","Result":332593.7044319948},{"Name":"BenchmarkWriteAmplification/WithTrigger/10000","RowCount":10000,"Unit":"rows/s","Result":77669.81724959961},{"Name":"BenchmarkWriteAmplification/WithTrigger/100000","RowCount":100000,"Unit":"rows/s","Result":76891.54517908255},{"Name":"BenchmarkWriteAmplification/WithTrigger/300000","RowCount":300000,"Unit":"rows/s","Result":79160.3927250836}]}
{"GitSHA":"9c7821fb64e345e3d7b18aaefe010bb5ffb8dddc","PostgresVersion":"15.3","Timestamp":1731919352,"Reports":[{"Name":"BenchmarkBackfill/10000","RowCount":10000,"Unit":"rows/s","Result":50762.54737224374},{"Name":"BenchmarkBackfill/100000","RowCount":100000,"Unit":"rows/s","Result":24525.43224231854},{"Name":"BenchmarkBackfill/300000","RowCount":300000,"Unit":"rows/s","Result":9976.91680500212},{"Name":"BenchmarkWriteAmplification/NoTrigger/10000","RowCount":10000,"Unit":"rows/s","Result":335627.07589541783},{"Name":"BenchmarkWriteAmplification/NoTrigger/100000","RowCount":100000,"Unit":"rows/s","Result":322546.1202407683},{"Name":"BenchmarkWriteAmplification/NoTrigger/300000","RowCount":300000,"Unit":"rows/s","Result":304408.6182085832},{"Name":"BenchmarkWriteAmplification/WithTrigger/10000","RowCount":10000,"Unit":"rows/s","Result":73833.35000736488},{"Name":"BenchmarkWriteAmplification/WithTrigger/100000","RowCount":100000,"Unit":"rows/s","Result":73518.12084089564},{"Name":"BenchmarkWriteAmplification/WithTrigger/300000","RowCount":300000,"Unit":"rows/s","Result":76426.83785680518}]}
{"GitSHA":"9c7821fb64e345e3d7b18aaefe010bb5ffb8dddc","PostgresVersion":"16.4","Timestamp":1731919355,"Reports":[{"Name":"BenchmarkBackfill/10000","RowCount":10000,"Unit":"rows/s","Result":52847.614990834794},{"Name":"BenchmarkBackfill/100000","RowCount":100000,"Unit":"rows/s","Result":26198.329408561167},{"Name":"BenchmarkBackfill/300000","RowCount":300000,"Unit":"rows/s","Result":10284.203835176779},{"Name":"BenchmarkWriteAmplification/NoTrigger/10000","RowCount":10000,"Unit":"rows/s","Result":344511.30107311136},{"Name":"BenchmarkWriteAmplification/NoTrigger/100000","RowCount":100000,"Unit":"rows/s","Result":316209.5160869647},{"Name":"BenchmarkWriteAmplification/NoTrigger/300000","RowCount":300000,"Unit":"rows/s","Result":306740.91346836684},{"Name":"BenchmarkWriteAmplification/WithTrigger/10000","RowCount":10000,"Unit":"rows/s","Result":74980.02550865455},{"Name":"BenchmarkWriteAmplification/WithTrigger/100000","RowCount":100000,"Unit":"rows/s","Result":74194.89546004521},{"Name":"BenchmarkWriteAmplification/WithTrigger/300000","RowCount":300000,"Unit":"rows/s","Result":77262.72489187124}]}
{"GitSHA":"9c7821fb64e345e3d7b18aaefe010bb5ffb8dddc","PostgresVersion":"17.0","Timestamp":1731919352,"Reports":[{"Name":"BenchmarkBackfill/10000","RowCount":10000,"Unit":"rows/s","Result":55157.04163971648},{"Name":"BenchmarkBackfill/100000","RowCount":100000,"Unit":"rows/s","Result":28394.329608649885},{"Name":"BenchmarkBackfill/300000","RowCount":300000,"Unit":"rows/s","Result":11032.50778410557},{"Name":"BenchmarkWriteAmplification/NoTrigger/10000","RowCount":10000,"Unit":"rows/s","Result":346680.9785819451},{"Name":"BenchmarkWriteAmplification/NoTrigger/100000","RowCount":100000,"Unit":"rows/s","Result":326936.73759742273},{"Name":"BenchmarkWriteAmplification/NoTrigger/300000","RowCount":300000,"Unit":"rows/s","Result":311980.23507872946},{"Name":"BenchmarkWriteAmplification/WithTrigger/10000","RowCount":10000,"Unit":"rows/s","Result":77797.43208881274},{"Name":"BenchmarkWriteAmplification/WithTrigger/100000","RowCount":100000,"Unit":"rows/s","Result":76809.3620215389},{"Name":"BenchmarkWriteAmplification/WithTrigger/300000","RowCount":300000,"Unit":"rows/s","Result":78379.07380977394}]}
{"GitSHA":"9c7821fb64e345e3d7b18aaefe010bb5ffb8dddc","PostgresVersion":"latest","Timestamp":1731919355,"Reports":[{"Name":"BenchmarkBackfill/10000","RowCount":10000,"Unit":"rows/s","Result":56163.69069404208},{"Name":"BenchmarkBackfill/100000","RowCount":100000,"Unit":"rows/s","Result":27459.310046792947},{"Name":"BenchmarkBackfill/300000","RowCount":300000,"Unit":"rows/s","Result":10614.308750535665},{"Name":"BenchmarkWriteAmplification/NoTrigger/10000","RowCount":10000,"Unit":"rows/s","Result":299868.1929357951},{"Name":"BenchmarkWriteAmplification/NoTrigger/100000","RowCount":100000,"Unit":"rows/s","Result":325655.21487294463},{"Name":"BenchmarkWriteAmplification/NoTrigger/300000","RowCount":300000,"Unit":"rows/s","Result":308360.38580711134},{"Name":"BenchmarkWriteAmplification/WithTrigger/10000","RowCount":10000,"Unit":"rows/s","Result":79731.98507822279},{"Name":"BenchmarkWriteAmplification/WithTrigger/100000","RowCount":100000,"Unit":"rows/s","Result":79380.11112312211},{"Name":"BenchmarkWriteAmplification/WithTrigger/300000","RowCount":300000,"Unit":"rows/s","Result":81637.17582740792}]}
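Each line of the testdata file above is a self-contained JSON document (newline-delimited JSON), which is why `loadData` reads it with a line scanner rather than a single decode. A minimal, self-contained sketch of decoding one such line (struct shapes mirror those in `build.go`; the sample line is abbreviated to a single report):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Struct shapes mirror BenchmarkReport/BenchmarkReports in dev/benchmark-results/build.go.
type BenchmarkReport struct {
	Name     string
	RowCount int
	Unit     string
	Result   float64
}

type BenchmarkReports struct {
	GitSHA          string
	PostgresVersion string
	Timestamp       int64
	Reports         []BenchmarkReport
}

// parseReports decodes a single NDJSON line into a BenchmarkReports value.
func parseReports(line string) (BenchmarkReports, error) {
	var r BenchmarkReports
	err := json.Unmarshal([]byte(line), &r)
	return r, err
}

func main() {
	// Abbreviated sample line (one report only).
	line := `{"GitSHA":"9c7821fb","PostgresVersion":"17.0","Timestamp":1731919352,"Reports":[{"Name":"BenchmarkBackfill/10000","RowCount":10000,"Unit":"rows/s","Result":55157.04}]}`
	r, err := parseReports(line)
	if err != nil {
		panic(err)
	}
	fmt.Println(r.PostgresVersion, len(r.Reports), r.Reports[0].Unit)
}
```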
5 changes: 5 additions & 0 deletions dev/doc.go
@@ -0,0 +1,5 @@
// SPDX-License-Identifier: Apache-2.0

// Package dev contains code that is only intended for internal usage and allows us to use dependencies that
// we don't want to expose to users of pgroll as a library.
package dev