Use our own Discard for testing #35

nirs · 2024-10-19T20:55:09Z

It turns out that io.Discard implements ReadFrom using a small buffer (8192 bytes), which confuses our benchmarks. We use copyBuffer with a 1 MiB buffer, but io.Discard uses its own 8 KiB buffer and issues a huge number of tiny reads. These tiny reads are extremely slow when reading compressed clusters, since we have to read and decompress the same cluster multiple times.

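With this change, qcow2 zlib performance is about 4 times better (BenchmarkRead50p/qcow2_zlib goes from 77.43 MB/s to 298.37 MB/s, and BenchmarkRead100p/qcow2_zlib from 39.28 MB/s to 154.97 MB/s). It is still slow, but it matches the real performance better.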
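To illustrate the effect, here is a minimal sketch, not the code from this PR. The names `discard`, `recordingReader`, and `maxReadSize` are hypothetical and exist only for this example. It assumes the copy goes through `io.CopyBuffer`, which hands the copy over to the destination's `ReadFrom` method when the destination implements `io.ReaderFrom`, and only otherwise uses the buffer supplied by the caller.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// discard is a hypothetical writer that drops everything written to it.
// Unlike io.Discard it does not implement io.ReaderFrom, so io.CopyBuffer
// falls back to the buffer supplied by the caller.
type discard struct{}

func (discard) Write(p []byte) (int, error) { return len(p), nil }

// recordingReader wraps a reader and records the largest buffer length
// passed to Read. The wrapped reader is a named field (not embedded), so
// the wrapper exposes only Read and no WriterTo shortcut.
type recordingReader struct {
	r       io.Reader
	maxRead int
}

func (r *recordingReader) Read(p []byte) (int, error) {
	if len(p) > r.maxRead {
		r.maxRead = len(p)
	}
	return r.r.Read(p)
}

// maxReadSize copies 4*bufSize bytes into dst using a bufSize buffer and
// returns the largest read size requested from the source.
func maxReadSize(dst io.Writer, bufSize int) int {
	src := &recordingReader{r: strings.NewReader(strings.Repeat("x", 4*bufSize))}
	buf := make([]byte, bufSize)
	if _, err := io.CopyBuffer(dst, src, buf); err != nil {
		panic(err)
	}
	return src.maxRead
}

func main() {
	const bufSize = 1 << 20 // 1 MiB, as in the benchmarks
	fmt.Println("io.Discard read size:    ", maxReadSize(io.Discard, bufSize))
	fmt.Println("custom discard read size:", maxReadSize(discard{}, bufSize))
}
```

With the custom writer the source sees reads of the full 1 MiB buffer. With io.Discard the reads should be limited to its internal buffer size, which is an implementation detail of the standard library and may differ between Go versions.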

Before:

% go test -bench Read
BenchmarkRead0p/qcow2-12          14      78238414 ns/op     3430.99 MB/s      1051160 B/op        39 allocs/op
BenchmarkRead0p/qcow2_zlib-12     14      78577923 ns/op     3416.17 MB/s      1051733 B/op        39 allocs/op
BenchmarkRead50p/qcow2-12         21      54889353 ns/op     4890.48 MB/s      1183231 B/op        45 allocs/op
BenchmarkRead50p/qcow2_zlib-12     1    3466799292 ns/op       77.43 MB/s    736076536 B/op    178764 allocs/op
BenchmarkRead100p/qcow2-12        38      30562127 ns/op     8783.27 MB/s      1182901 B/op        45 allocs/op
BenchmarkRead100p/qcow2_zlib-12    1    6834526167 ns/op       39.28 MB/s   1471530256 B/op    357570 allocs/op

After:

% go test -bench Read
BenchmarkRead0p/qcow2-12          14      77515735 ns/op     3462.98 MB/s      1050518 B/op        39 allocs/op
BenchmarkRead0p/qcow2_zlib-12     14      77823402 ns/op     3449.29 MB/s      1050504 B/op        39 allocs/op
BenchmarkRead50p/qcow2-12         24      48812158 ns/op     5499.36 MB/s      1181856 B/op        45 allocs/op
BenchmarkRead50p/qcow2_zlib-12     2     899659187 ns/op      298.37 MB/s    184996316 B/op     43247 allocs/op
BenchmarkRead100p/qcow2-12        61      19306020 ns/op    13904.24 MB/s      1181854 B/op        45 allocs/op
BenchmarkRead100p/qcow2_zlib-12    1    1732168542 ns/op      154.97 MB/s    368850952 B/op     86460 allocs/op

Signed-off-by: Nir Soffer <[email protected]>

AkihiroSuda

Thanks

nirs force-pushed the fix-discard branch from d9b2206 to 5512113 Compare October 19, 2024 20:56

nirs mentioned this pull request Oct 19, 2024

Optimize zero reads #34

Merged

AkihiroSuda approved these changes Oct 20, 2024

AkihiroSuda merged commit b119fa3 into lima-vm:master Oct 20, 2024
2 checks passed

nirs deleted the fix-discard branch November 18, 2024 00:16
