inflate: make use of enable-dfa-jump-thread #257

Merged
merged 1 commit into from
Dec 4, 2024

folkertdev
Collaborator

Refactor so that LLVM's `enable-dfa-jump-thread` pass has an effect. The numbers are really good for the small chunk sizes.

[Image: chart (4)]

We're now on par with zlib-ng for a chunk size of 4, and doing very well overall.

[Image: chart (3)]

It really is a massive jump for chunk sizes 4 and 5 (20% and 12% respectively), and then it matters less and less for bigger chunk sizes.

NOTE: these benchmarks are run with `-Cllvm-args=-enable-dfa-jump-thread`. This commit does not enable that flag in any way; for now it has to be enabled manually via rustflags.
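For readers unfamiliar with the pattern, here is a minimal sketch of the kind of `loop { match state { ... } }` state machine that this pass targets. It is not the code from `zlib-rs/src/inflate.rs`: the `State` enum, the toy input format, and the `run` function are all hypothetical. As I understand it, the pass can replace a `state = ...` assignment followed by a trip back through the `match` with a direct jump to the next arm whenever the next state is known at compile time. The build line in the comment is one way to pass the flag quoted above.

```rust
// Hypothetical states for illustration; the real inflate state machine has many more.
// One way to build with the flag from this PR:
//   RUSTFLAGS="-Cllvm-args=-enable-dfa-jump-thread" cargo bench
#[derive(Clone, Copy, Debug, PartialEq)]
enum State {
    Header,
    Body,
    Trailer,
    Done,
}

/// Parses a toy format `[0x01, body bytes..., 0x00, checksum]` and returns
/// the number of body bytes, or `None` if the input is malformed or truncated.
fn run(input: &[u8]) -> Option<usize> {
    let mut state = State::Header;
    let mut pos = 0usize;
    let mut body_len = 0usize;

    loop {
        match state {
            State::Header => {
                if *input.get(pos)? != 0x01 {
                    return None;
                }
                pos += 1;
                // The next state is a compile-time constant here, which is
                // what lets a jump-threading pass skip the `match` dispatch.
                state = State::Body;
            }
            State::Body => {
                let byte = *input.get(pos)?;
                pos += 1;
                if byte == 0x00 {
                    state = State::Trailer;
                } else {
                    body_len += 1;
                }
            }
            State::Trailer => {
                // The checksum byte must be present; its value is ignored here.
                input.get(pos)?;
                state = State::Done;
            }
            State::Done => return Some(body_len),
        }
    }
}
```

Calling `run(&[1, 5, 6, 0, 9])` walks Header, Body (twice), Trailer, Done and returns `Some(2)`; a truncated input such as `&[1, 5]` returns `None`.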

folkertdev requested a review from bjorn3 December 3, 2024 15:42

codecov bot commented Dec 3, 2024

Codecov Report

Attention: Patch coverage is 90.03984% with 25 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| zlib-rs/src/inflate.rs | 90.03% | 25 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
|---|---|
| zlib-rs/src/inflate.rs | 91.23% <90.03%> (-3.92%) ⬇️ |

... and 1 file with indirect coverage changes

// not the entirety of `dispatch`. We get a massive boost from that pass.
//
// It unfortunately does duplicate the code for some of the states; deduplicating it by having
// more of the states call this function is slower.
Collaborator

Is it possible to remove the duplication by using macros for the content of the states that are currently duplicated? It would make rust-analyzer work less well on that code though.

Collaborator Author

I don't think it is, because of the labels that we break/continue to. Also, it's fine because hopefully we won't touch this much ever again. But it's not ideal, for sure.

Collaborator

You can pass the labels as macro arguments, right?

Collaborator Author

can you? what would be the fragment specifier of the label? is it an identifier somehow? tt might work but often requires brackets
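For reference, `macro_rules!` has a `lifetime` fragment specifier, and a loop label such as `'outer` is accepted by it, so a label can be passed without extra brackets. The sketch below only demonstrates that mechanism; the macros `stop_if!` and `skip_if!` and the function `sum_until_zero` are hypothetical and unrelated to the actual states in `inflate.rs`.

```rust
/// Hypothetical helper: breaks out of the loop named by `$label` when `$cond` holds.
macro_rules! stop_if {
    ($label:lifetime, $cond:expr) => {
        if $cond {
            break $label;
        }
    };
}

/// Hypothetical helper: continues the loop named by `$label` when `$cond` holds.
macro_rules! skip_if {
    ($label:lifetime, $cond:expr) => {
        if $cond {
            continue $label;
        }
    };
}

/// Sums values row by row. A zero stops everything; a value above 100
/// abandons the rest of the current row.
fn sum_until_zero(rows: &[&[u32]]) -> u32 {
    let mut total = 0;
    'rows: for row in rows {
        for &value in row.iter() {
            // The label is defined at the call site and handed to the macro.
            stop_if!('rows, value == 0);
            skip_if!('rows, value > 100);
            total += value;
        }
    }
    total
}
```

Because the label is written at the call site and only forwarded through the macro, macro hygiene does not get in the way. Whether this is worth it for the duplicated states is a separate question, since the locals loaded to the stack would also have to be passed in.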

Collaborator Author

also in our function we load the reader and writer to the stack. you could parameterize the macro on that too but, idk, is that worth it?

Collaborator

At the very least please add a comment to every copy of every state that is duplicated indicating that they should be kept in sync.

bjorn3 commented Dec 4, 2024

Does this regress performance when not enabling the LLVM flag or is it perf neutral?

folkertdev
Collaborator Author

it's a win even without the flag; loading the values to the stack in this restricted case is advantageous. I get a ~10% increase at chunk size 4 (versus ~20% with the flag)
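A minimal sketch of what "loading the values to the stack" can look like, assuming a decoder that holds its reader behind `&mut self`. The `Reader` and `Decoder` types and the `consume` method are hypothetical and much simpler than the real inflate state; the point is only that the hot loop works on local copies and writes them back once at the end.

```rust
// Hypothetical types for illustration; not the zlib-rs definitions.
struct Reader<'a> {
    data: &'a [u8],
    pos: usize,
}

struct Decoder<'a> {
    reader: Reader<'a>,
    checksum: u32,
}

impl Decoder<'_> {
    /// Consumes up to `n` bytes, folding them into the checksum.
    /// Returns how many bytes were actually consumed.
    fn consume(&mut self, n: usize) -> usize {
        // Copy the fields into locals so the loop below does not have to
        // go through `self` on every iteration.
        let data = self.reader.data;
        let mut pos = self.reader.pos;
        let mut checksum = self.checksum;

        let start = pos;
        let end = data.len().min(pos.saturating_add(n));
        while pos < end {
            checksum = checksum.wrapping_mul(31).wrapping_add(u32::from(data[pos]));
            pos += 1;
        }

        // Write the locals back once, after the hot loop.
        self.reader.pos = pos;
        self.checksum = checksum;
        pos - start
    }
}
```

The trade-off is that every exit path has to remember the write-back, which is part of why this is easier to do in one restricted function than across a whole dispatch loop.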

bjorn3 merged commit 7961e8e into main Dec 4, 2024
20 checks passed
bjorn3 deleted the llvm-dfa branch December 4, 2024 09:39