FSSTCompressor #664
Conversation
Ran the compress_taxi benchmark, got ~80% slower. I am a bit surprised that the biggest culprit seems to be creating new counters in the FSST training loop. That doesn't even scale w.r.t. the size of the input array, it's just a flat 2MB allocation. The zeroing of the vector seems to be the biggest problem. I think we can avoid that with a second bitmap, let me try that out.
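Roughly the shape of that bitmap trick, as a sketch (illustrative names only, not the actual fsst crate internals): keep a small bitmap of which counter slots have been touched, so a reset only clears the bitmap instead of memsetting the whole multi-megabyte counts vector.

```rust
/// Minimal sketch of a counter table that avoids a full memset on reset.
/// All names here are illustrative, not the actual fsst crate internals.
struct Counters {
    counts: Vec<u32>,  // one slot per symbol pair; never zeroed eagerly
    touched: Vec<u64>, // bitmap: bit i set => counts[i] is valid
}

impl Counters {
    fn new(len: usize) -> Self {
        Self {
            counts: vec![0; len],
            touched: vec![0; (len + 63) / 64],
        }
    }

    /// Increment slot `i`, lazily re-initializing it if it hasn't been
    /// touched since the last `clear`.
    fn increment(&mut self, i: usize) {
        let (word, bit) = (i / 64, 1u64 << (i % 64));
        if self.touched[word] & bit == 0 {
            self.touched[word] |= bit;
            self.counts[i] = 0;
        }
        self.counts[i] += 1;
    }

    fn get(&self, i: usize) -> u32 {
        let (word, bit) = (i / 64, 1u64 << (i % 64));
        if self.touched[word] & bit != 0 { self.counts[i] } else { 0 }
    }

    /// Resetting now only clears the (much smaller) bitmap instead of the
    /// full counts vector.
    fn clear(&mut self) {
        self.touched.fill(0);
    }
}
```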
Alright, using the change in spiraldb/fsst#21 helped a lot. New benchmark result:
Which is about 10ms or ~11% slower than running without FSST.
And I think we can go even lower; ideally we'd just use the trained compressor over the samples to compress the full array.
Just bear in mind that the samples can be very small compared to the data, e.g. 1024 elements. I would say just retrain it.
Ok I've done a few things today
Ok, I added a new benchmark now which just compresses the comments column in-memory via Vortex, and I'm seeing it take ~500ms, which is roughly 2-3x longer than just doing the compression without Vortex. I think the root of the performance difference is the chunking. Here's a comparison between running FSST over the comments column chunked as per our TPC loading infra (nchunks=192) and the canonicalized version of the comments array, which is not chunked:

So somewhere I guess there's some fixed-size overhead in FSST training (probably a combo of allocations and double tight loops over 0...511) that, when you try and run FSST hundreds of times, starts to add up and can skew your results. I'm curious how DuckDB and other folks deal with FSST + chunking; it seems like we might want to treat it as a special thing that can do its own sampling + have a shared symbol table across chunks.
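One way to share a single symbol table across chunks, as a sketch only: it assumes a train/compress API shaped roughly like `Compressor::train(&[&[u8]])` and `Compressor::compress(&[u8]) -> Vec<u8>`, which may not match the fsst crate exactly, and the per-chunk sample size of 16 is arbitrary.

```rust
// Sketch: train one FSST symbol table over samples drawn from every chunk,
// then reuse it to compress each chunk. The fsst API shape shown here
// (Compressor::train / compress) is assumed, not verbatim from the crate.
use fsst::Compressor;

fn compress_chunked(chunks: &[Vec<Vec<u8>>]) -> Vec<Vec<Vec<u8>>> {
    // Pull a small, fixed-size sample from each chunk so training cost stays
    // flat regardless of how many chunks there are.
    let samples: Vec<&[u8]> = chunks
        .iter()
        .flat_map(|chunk| chunk.iter().take(16).map(|s| s.as_slice()))
        .collect();

    // Train exactly once, instead of once per chunk.
    let compressor = Compressor::train(&samples);

    // Compress every chunk with the shared symbol table.
    chunks
        .iter()
        .map(|chunk| chunk.iter().map(|s| compressor.compress(s)).collect())
        .collect()
}
```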
I'm currently blocking this on some work in spiraldb/fsst#24.
encodings/fsst/src/array.rs (Outdated)
// so we transmute to kill the lifetime complaints.
// This is fine because the returned `Decompressor`'s lifetime is tied to the lifetime
// of these same arrays.
let symbol_lengths = unsafe { std::mem::transmute::<&[u8], &[u8]>(symbol_lengths) };
Curious for a sanity check here, or if there's another way I should be doing this. It feels a bit wrong, but I think it is currently the best way to do the thing I want...
nvm, this is wrong: if we actually canonicalize, this pointer is invalid.
Ok, this should be fixed now: instead of returning a decompressor, this constructs one on-the-fly and passes it to a provided function.
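Roughly the shape of that fix (type and method names here are illustrative, not the exact ones in encodings/fsst/src/array.rs): rather than handing out a `Decompressor` that borrows from temporaries, the array builds one inside the call and hands it to a caller-supplied closure, so the borrow can never outlive the backing buffers and no lifetime transmute is needed.

```rust
// Sketch of the callback-style accessor. Types and signatures are
// illustrative; the real fsst Decompressor constructor may differ.
use fsst::{Decompressor, Symbol};

struct FSSTArray {
    symbols: Vec<Symbol>,
    symbol_lengths: Vec<u8>,
    // ... codes, validity, etc.
}

impl FSSTArray {
    /// Build a `Decompressor` borrowing from this array and pass it to `f`.
    /// Because the decompressor never escapes the closure, no `unsafe`
    /// lifetime-transmute is needed.
    fn with_decompressor<F, R>(&self, f: F) -> R
    where
        F: FnOnce(Decompressor<'_>) -> R,
    {
        let decompressor = Decompressor::new(&self.symbols, &self.symbol_lengths);
        f(decompressor)
    }
}
```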
Adds a `metadata` field on `CompressionTree` to allow reuse between the sampling and compressing stages. For example, we can save the ALP exponents so we don't have to calculate them twice. This is very important for FSST so that we save the overhead of training the table twice.

Benchmarked against the `lineitem` table's `l_comment` column with scalefactor=1, which is just over 6 million rows. By default this is loaded as a ChunkedArray with 733 partitions. Compressing with FSST enabled takes 1.6s. Compressing the canonicalized array takes ~550ms. We should be able to speed this up by at least ~2x, see FSSTCompressor #664 (comment), and we can potentially do even better. We probably want to be able to FSST compress a ChunkedArray directly so that we avoid the overhead of training/compressing each chunk from scratch.