-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added multithreading to zstd compression #689
Conversation
…age version. Added num_cpus package to get core count.
Whoops I think I put the wrong pull number in the |
fixed pull number
This needs more testing, in my opinion. It works, but the speedup isn't as significant as it probably could be. Here is a result of a quick test using a
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good PR! Thanks for taking the time to contribute, I added a small requested change here.
I see, to be honest I'm not 100% sure why our version is slower. I'll let you make the call tho, if you want to investigate further or you think we can merge this as it is. If you decide to investigate, I highly highly recommend looking into But it's a valuable improvement regardless! So feel incentivized and not pressured to do so 😅 . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops can you also add the ² from ✓² in README to tell that compression is done in parallel? :v
https://github.com/ouch-org/ouch?tab=readme-ov-file#supported-formats
Thank you, I am happy that this change is appreciated and I'm glad to contribute. I will make the changes that you requested in the coming hours. As for investigating why the speedup isn't on par with the zstd tool, I do want to take a closer look at it but no promises on finding a solution. I enjoy Rust as a hobby and consider myself to be at an intermediate level, but I'm not super confident in my abilities if I'm being honest. I've heard of flamegraph and have wanted to try it out so this gives me a good excuse to learn it. I will work on this some today and tomorrow but if I cannot find anything then I'm fine with merging it as is (after the changes mentioned above, of course). |
I'm not at my Linux machine right now, so I can't run I did some research and found this issue on the repo for the Benchmark Notes
Results
In short, multi-threaded |
Here is the output of I can somewhat understand how to parse that information but I do not know where to begin when it comes to using that to optimize further. At least not at the moment, but when I have more free time I will possibly investigate further. I'm fine with merging it as is. As a side note, here is more terminal output that
Another side note, it seems to run better on Linux compared to Windows (same hardware), but that isn't too surprising to me. The gap is still there between
|
Oh thank you so much for all the investigation and effort you put into this! I highly appreciate it.
That'd make sense! I didn't know that, and yeah we use the streaming API to chain encoders and decoders. It also makes sense that it adds overhead since we make lots of small calls instead of just one.
Yeah if you run To compare both tools you'd need to run With that, by comparing the size of the bars you could get a rough comparison of where the extra time is spent. However, on second thought, I don't advise you to investigate further, flamegraph sampling with multithreading can be unpredictable, and we don't know if we are calling the exact same functions, and you'd likely find out at the end that it's indeed a problem in the Streaming API. Well, you can do it if you're still curious 😄 but you've done a lot for this PR, ty! |
Hello, this is my first PR here and my Rust is a bit rusty, so forgive me if this is a foolish change or if PRs are unwelcome. I created issue #688 and decided to take a stab at implementing it.