Fix checksumming in the presence of large, mostly-zero buffers. #3355
base: master
Conversation
Tools like asan allocate large regions of memory (multi-TB) that are generally sparsely populated via demand paging. Currently, when checksumming a process that has such regions, we attempt to read the entire multi-TB region into a std::vector. This obviously does not work. Switch our checksum method to crc32c for performance, and further improve performance by:

1. Reading the pagemap to detect pages that were allocated but are currently still zero, waiting to be demand-paged in. These can be fast-forwarded using a precomputed crc operator.
2. Capping the amount of memory to be read at once at a reasonable max buffer size to avoid thrashing.

The crc32c implementation here is copied from Julia, which itself is cobbled together from various places around the web. It's not the prettiest, but keeping it aligned with the Julia version will make it easier to port any future hardware acceleration improvements over, if necessary.
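A rough sketch of how these two ideas fit together (this is not the code in this PR: zlib's crc32()/crc32_combine() stand in for the crc32c operator described above, and process_vm_readv() stands in for rr's own tracee-memory access):

```cpp
#include <zlib.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <cstdint>
#include <cstdio>
#include <vector>

static const size_t kPageSize = 4096;

// CRC of one all-zero page, computed once and reused to fast-forward the
// running checksum over pages we never read.
static uLong zero_page_crc() {
  static const std::vector<uint8_t> zeros(kPageSize, 0);
  static const uLong crc = crc32(crc32(0L, Z_NULL, 0), zeros.data(), kPageSize);
  return crc;
}

// A pagemap entry with neither bit 63 (present in RAM) nor bit 62 (swapped)
// set describes a mapped page that has never been faulted in, i.e. it still
// reads as all zeros.
static bool is_demand_zero(uint64_t pagemap_entry) {
  const uint64_t kPresent = 1ULL << 63;
  const uint64_t kSwapped = 1ULL << 62;
  return (pagemap_entry & (kPresent | kSwapped)) == 0;
}

// Checksum npages of the tracee's memory starting at `start`, skipping the
// read for demand-zero pages.
uint32_t checksum_region(pid_t pid, uintptr_t start, size_t npages) {
  char path[64];
  snprintf(path, sizeof(path), "/proc/%d/pagemap", pid);
  FILE* f = fopen(path, "rb");
  if (!f) return 0;
  // pagemap holds one 8-byte entry per virtual page, indexed by page number.
  fseek(f, static_cast<long>((start / kPageSize) * sizeof(uint64_t)), SEEK_SET);

  uLong crc = crc32(0L, Z_NULL, 0);
  std::vector<uint8_t> page(kPageSize);
  for (size_t i = 0; i < npages; ++i) {
    uint64_t entry = 0;
    if (fread(&entry, sizeof(entry), 1, f) != 1) break;
    if (is_demand_zero(entry)) {
      // Fast-forward: derive CRC(A || zero_page) from CRC(A) and the
      // precomputed CRC of a zero page, without touching the tracee.
      crc = crc32_combine(crc, zero_page_crc(), kPageSize);
    } else {
      struct iovec local = { page.data(), kPageSize };
      struct iovec remote = { reinterpret_cast<void*>(start + i * kPageSize),
                              kPageSize };
      if (process_vm_readv(pid, &local, 1, &remote, 1, 0) !=
          static_cast<ssize_t>(kPageSize)) {
        break;
      }
      crc = crc32(crc, page.data(), kPageSize);
    }
  }
  fclose(f);
  return static_cast<uint32_t>(crc);
}
```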
How much of a performance improvement is this fancy crc32 implementation?
You can see the benchmark results I did a while ago when I implemented the ARM versions here: JuliaLang/julia#22385. There isn't any benchmark result from the x86 version from when it was originally added, AFAICT: JuliaLang/julia#18297
/* Table-driven software version as a fall-back. This is about 15 times slower
Actually, the comment suggests 15x faster for the hardware-accelerated version, which is similar to what I've seen on some older ARM cores.
I think the pagemap optimization makes sense because the speedup is nearly infinite in those pathological cases with lots of unreserved memory. I'm less convinced about the benefits of accelerating the CRC implementation for data that we've had to read from the tracee. Wouldn't it be good enough to use the crc32() function from zlib? We could replace the existing crc code in util.cc with that too.

Personally, I almost never use the memory checksumming. Maybe you and Keno use it a lot?
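For what it's worth, zlib's crc32() is already incremental, so it composes naturally with a capped read buffer. A minimal sketch of the chunked usage (the chunk arrays here are just placeholders for however the tracee reads are delivered):

```cpp
#include <zlib.h>
#include <cstddef>
#include <cstdint>

// Feed data that arrives in bounded chunks into one running CRC, so no
// single large buffer is ever materialized.
uint32_t crc32_of_chunks(const uint8_t* const* chunks, const size_t* lens,
                         size_t nchunks) {
  uLong crc = crc32(0L, Z_NULL, 0);  // zlib's initial CRC value
  for (size_t i = 0; i < nchunks; ++i) {
    crc = crc32(crc, chunks[i], static_cast<uInt>(lens[i]));
  }
  return static_cast<uint32_t>(crc);
}
```

Restarting crc32() from the previous return value yields the same result as checksumming the concatenated data in one call, so the cap size only affects memory use, not the checksum.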
Friendly ping on this.
As the zlib crc32 implementation has had ARM optimizations since 2019, I think that's good to use and preferable over (another) third-party implementation.