Tiny Weights #14402

vmg · 2023-10-31T13:37:52Z

Description

This is a new optimization pattern for the execution engine. The idea is to speed up comparison operations by embedding "tiny weights" inside sqltypes.Value during execution. A tiny weight is a 4-byte compressed form of the full weight string for the value (see: evalengine.TinyWeighter for a detailed description and implementation). Since we actually have 4 spare bytes inside sqltypes.Value, we can inject the weight there without increasing the allocation cost for our in-memory rows, and any further comparison operators will automatically make use of them. This makes e.g. sorting wildly more efficient, because most comparisons during the sort can be performed by comparing two uint32 integers, instead of doing a full collation-aware comparison.
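To illustrate the "4 spare bytes" point, here is a minimal sketch of how struct padding can absorb an extra uint32 on a 64-bit target. The two struct types are hypothetical stand-ins chosen for illustration; they are not the actual definition of sqltypes.Value.

```go
package main

import (
	"fmt"
	"unsafe"
)

// plainValue is a hypothetical stand-in for a value without a tiny weight.
type plainValue struct {
	typ int32  // 4 bytes, followed by 4 bytes of padding on 64-bit targets
	val []byte // slice header (pointer, length, capacity), 8-byte aligned
}

// weightedValue is a hypothetical stand-in with the tiny weight embedded.
type weightedValue struct {
	typ        int32
	tinyWeight uint32 // occupies the bytes that were padding in plainValue
	val        []byte
}

// sizes reports the in-memory size of both stand-in structs.
func sizes() (plain, weighted uintptr) {
	return unsafe.Sizeof(plainValue{}), unsafe.Sizeof(weightedValue{})
}

func main() {
	plain, weighted := sizes()
	// On 64-bit targets such as amd64 both sizes are 32 bytes.
	fmt.Println(plain, weighted)
}
```

On a 32-bit target the slice header is only 4-byte aligned, so there is no padding to reuse and the second struct would be larger.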

Of course, two tiny weight strings can collide (as they're essentially a lossy form of the weight string), but this is perfectly safe because we always fall back to a full comparison of the two values whenever their tiny weight strings are identical.
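The compare-then-fall-back scheme described above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the evalengine implementation: the `value` type, `newValue`, `compare`, and `sortStrings` are hypothetical names, and the tiny weight is taken to be the first 4 bytes of the raw string (as if under a binary collation) rather than a prefix of a real collation weight string.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"sort"
)

// value is a hypothetical stand-in for a row value with a cached tiny weight.
type value struct {
	tiny uint32
	raw  []byte
}

// newValue caches the first 4 bytes of s as a big-endian uint32, zero-padded.
// Assumption: binary collation, so the raw bytes act as the weight string.
func newValue(s string) value {
	var prefix [4]byte
	copy(prefix[:], s)
	return value{tiny: binary.BigEndian.Uint32(prefix[:]), raw: []byte(s)}
}

// compare decides the order from the tiny weights when they differ and
// falls back to a full comparison only when they are identical.
func compare(a, b value) int {
	if a.tiny < b.tiny {
		return -1
	}
	if a.tiny > b.tiny {
		return 1
	}
	return bytes.Compare(a.raw, b.raw) // collision: full comparison
}

// sortStrings sorts the input using the tiny-weight-first comparison.
func sortStrings(in []string) []string {
	vals := make([]value, len(in))
	for i, s := range in {
		vals[i] = newValue(s)
	}
	sort.Slice(vals, func(i, j int) bool { return compare(vals[i], vals[j]) < 0 })
	out := make([]string, len(vals))
	for i, v := range vals {
		out[i] = string(v.raw)
	}
	return out
}

func main() {
	fmt.Println(sortStrings([]string{"1235", "12340002", "b", "12340001", "a"}))
}
```

Only the pair "12340001" and "12340002" shares a tiny weight here, so it is the only comparison that needs the full fallback.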

The arewefastyet benchmark results are not wildly impressive because OLTP is actually a pathological case for this example. The strings that OLTP uses in sort queries in the benchmark are all numerical strings, so their alphabet is very reduced (10 possible characters), making the 4 byte string collide quite often. The improvement on these OLTP Distinct-sorted queries is just 15%.

goos: linux
goarch: amd64
pkg: vitess.io/vitess/go/vt/vtgate/endtoend
cpu: 13th Gen Intel(R) Core(TM) i9-13900K
                       │ baseline.txt │          sha-2e689b2faf.txt          │
                       │    sec/op    │    sec/op     vs base                │
OLTP/DistinctRanges-16    565.4µ ± 6%   479.3µ ± 31%  -15.22% (p=0.000 n=10)

With a different benchmark whose string columns contain arbitrary UTF-8 data, the improvement gets all the way to ~40%, because all the comparisons during sorting are performed with the tiny weight strings.
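A rough way to see why digit-only strings collide so much more often: with 4 bytes per tiny weight, a 10-character alphabet allows only 10^4 distinct prefixes, while arbitrary bytes allow 256^4. The sketch below assumes, for illustration only, that the tiny weight is the raw 4-byte prefix of the string; `prefix`, `prefixSpace`, and `collidingPairs` are hypothetical helpers, and the sample keys are made up rather than taken from the benchmark.

```go
package main

import "fmt"

// prefix returns the zero-padded first 4 bytes of s, standing in for a
// tiny weight (assumption: raw bytes are used as the weight string).
func prefix(s string) [4]byte {
	var p [4]byte
	copy(p[:], s)
	return p
}

// prefixSpace returns how many distinct 4-symbol prefixes an alphabet allows.
func prefixSpace(alphabetSize uint64) uint64 {
	return alphabetSize * alphabetSize * alphabetSize * alphabetSize
}

// collidingPairs counts pairs of distinct keys that share a 4-byte prefix,
// meaning pairs that would need the full comparison fallback.
func collidingPairs(keys []string) int {
	n := 0
	for i := range keys {
		for j := i + 1; j < len(keys); j++ {
			if keys[i] != keys[j] && prefix(keys[i]) == prefix(keys[j]) {
				n++
			}
		}
	}
	return n
}

func main() {
	fmt.Println(prefixSpace(10), prefixSpace(256))

	digits := []string{"1234567", "1234890", "1299999"}
	mixed := []string{"apple", "Zürich", "日本語"}
	fmt.Println(collidingPairs(digits), collidingPairs(mixed))
}
```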

The global improvement in arewefast is pretty good, particularly for latency:

https://benchmark.vitess.io/compare?ltag=369b6a1e55aecd98c3cf6d4366cfbcee0477c474&rtag=946eb31e74187866a4a7414ca5df1954435681da

Again I wouldn't pay much attention to OLTP here because it's not representative of real-world data (:cry:), but the speedup for real queries that include ORDER BY or DISTINCT will be significant.

cc @dbussink @systay

Related Issue(s)

Checklist

"Backport to:" labels have been added if this change should be back-ported
Tests were added or are not required
Did the new or modified tests pass consistently locally and on the CI
Documentation was added or is not required

Deployment Notes

vitess-bot · 2023-10-31T13:37:55Z

vitess-bot · 2023-10-31T13:38:23Z

Hello! 👋

This Pull Request is now handled by arewefastyet. The current HEAD and future commits will be benchmarked.

You can find the performance comparison on the arewefastyet website.

Signed-off-by: Vicent Marti <[email protected]>

go/hack/runtime.go

systay · 2023-11-07T08:01:51Z

go/vt/vtgate/engine/ordered_aggregate.go

-		cmp, err := evalengine.NullsafeCompare(currentKey[gb.KeyCol], nextRow[gb.KeyCol], gb.Type.Coll)
+		v1 := currentKey[gb.KeyCol]
+		v2 := nextRow[gb.KeyCol]
+		if v1.TinyWeightCmp(v2) != 0 {


Don't we want to do this in evalengine.NullsafeCompare instead?

I opted to wire up the comparison at the relevant callsites because there are a lot of places that don't use tiny weights right now. I think it'll be sensible to move it into NullsafeCompare once we wire up tiny weight generation in more paths.

vmg · 2023-11-07T13:02:45Z

This was actually huge in arewefast.

vitess-bot bot added the NeedsDescriptionUpdate, NeedsIssue, and NeedsWebsiteDocsUpdate labels Oct 31, 2023

github-actions bot added this to the v19.0.0 milestone Oct 31, 2023

vmg force-pushed the vmg/tiny-weights branch from 91cdf55 to 171966f November 2, 2023 10:47

vmg added 13 commits November 6, 2023 09:12

wip: tiny weight strings

055de3b

Signed-off-by: Vicent Marti <[email protected]>

evalengine: fix tiny weight generation

89bc9a1

Signed-off-by: Vicent Marti <[email protected]>

engine: add missing tiny weights

7493109

Signed-off-by: Vicent Marti <[email protected]>

sqltypes: keep track of tiny weights with a flag

72ff7ca

Signed-off-by: Vicent Marti <[email protected]>

evalengine: move the comparison API

e1def36

Signed-off-by: Vicent Marti <[email protected]>

evalengine: move SortResult

3e85b07

Signed-off-by: Vicent Marti <[email protected]>

sizegen: update

12ef8f9

Signed-off-by: Vicent Marti <[email protected]>

evalengine: implement memory sorter

bc0b412

Signed-off-by: Vicent Marti <[email protected]>

evalengine: implement merger

02758ca

Signed-off-by: Vicent Marti <[email protected]>

sizegen: update

6ee4582

Signed-off-by: Vicent Marti <[email protected]>

engine: fix tests

63377a9

Signed-off-by: Vicent Marti <[email protected]>

evalengine: integer weights

af65e3a

Signed-off-by: Vicent Marti <[email protected]>

evalengine: remove slow tiny weights

946eb31

Signed-off-by: Vicent Marti <[email protected]>

vmg force-pushed the vmg/tiny-weights branch from 2280e1f to 946eb31 November 6, 2023 08:12

evalengine: tiny weight documentation

2e689b2

Signed-off-by: Vicent Marti <[email protected]>

dbussink reviewed Nov 6, 2023

go/hack/runtime.go (resolved)

vmg marked this pull request as ready for review November 6, 2023 16:43

vmg requested review from deepthi, mattlord, rohit-nayak-ps, GuptaManan100, shlomi-noach, harshit-gangal, systay, frouioui and arthurschreiber as code owners November 6, 2023 16:43

systay reviewed Nov 7, 2023


systay approved these changes Nov 7, 2023


dbussink approved these changes Nov 7, 2023


vmg merged commit a15ef42 into vitessio:main Nov 7, 2023
115 checks passed

vmg deleted the vmg/tiny-weights branch November 7, 2023 10:41
