Reading Multiple Blocks #41
Comments
This seems like it's likely a bug in this library, but unfortunately I wouldn't know where to start. Do you have sample files that I can help poke around at?
Ok, new clue. When preparing the example file, I realized that I had compressed my files with pbzip2 (parallel bzip2). If I compress things with regular bzip2, everything works fine and bzip2-rs yields the entire file. I wonder if pbzip2 is using a different blocking scheme or some unusual options? The pbzip2 files decompress as expected with regular bunzip2. This is probably kind of an edge case, so I would understand if it doesn't have high priority, but here is a sample file compressed with pbzip2: https://send.firefox.com/download/51320afe37/#ECAJud4iGH-7UMjJ7M8Pjw
Ah an interesting observation! I wonder if this has to do with a format that the bunzip2 tool specifically allows? One thing we ran into with |
Yup, looks like it! Just grepping through the file, I'm seeing multiple BZ headers (which, appropriately for today, is BZh9 + 0x314159265359). If I change the logic in BzDecoder::read to only stop on EOF and keep going after StreamEnd, it works like a charm.
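For anyone who wants to check a file the same way, here is a minimal sketch of that header scan using only the standard library. The function name `find_stream_starts` is made up for illustration and is not part of bzip2-rs. It assumes each stream starts on a byte boundary with `BZh` plus a level digit, immediately followed by the block magic 0x314159265359. A stream with no blocks would not match, and a chance match inside compressed data is possible though very unlikely.

```rust
/// Returns the byte offsets where a bzip2 stream header appears to start.
/// Illustrative helper only; this is not part of the bzip2-rs API.
fn find_stream_starts(data: &[u8]) -> Vec<usize> {
    // Magic number that opens every compressed block (the digits of pi).
    const BLOCK_MAGIC: [u8; 6] = [0x31, 0x41, 0x59, 0x26, 0x53, 0x59];

    data.windows(10)
        .enumerate()
        .filter(|(_, w)| {
            w.starts_with(b"BZh")                  // stream signature
                && (b'1'..=b'9').contains(&w[3])   // block-size level digit
                && w[4..] == BLOCK_MAGIC[..]       // first block follows directly
        })
        .map(|(offset, _)| offset)
        .collect()
}
```

If this returns more than one offset for a file, the file is most likely a concatenation of several bzip2 streams, which would match what pbzip2 seems to produce here.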
Spoke too soon there. You need to call bzlib restart too. Unfortunately, this results in a large memory leak for me, and I know almost nothing about Rust memory management, especially with FFI. Here's what I have so far: iamlemec@c0c501d
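The control flow described above (stop only on EOF, and restart the decoder whenever a stream ends) can be sketched roughly as follows. Everything here is hypothetical: `Status`, `StreamDecoder`, `decode_all`, and `ToyDecoder` are invented for illustration and are not bzip2-rs or bzlib APIs. The toy format, where each stream is one length byte followed by that many payload bytes, only stands in for real bzip2 streams.

```rust
/// Result of one decode step (hypothetical; loosely modelled on a
/// "more data expected" versus "stream finished" distinction).
enum Status {
    Ok,
    StreamEnd,
}

/// Hypothetical single-stream decoder interface.
trait StreamDecoder {
    /// Decodes from `input` into `out`; returns bytes consumed and a status.
    fn decode(&mut self, input: &[u8], out: &mut Vec<u8>) -> (usize, Status);
    /// Resets the decoder so it can accept a fresh stream.
    fn restart(&mut self);
}

/// Keeps decoding until the input is exhausted, restarting after each stream.
fn decode_all<D: StreamDecoder>(dec: &mut D, mut input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    while !input.is_empty() {
        let (used, status) = dec.decode(input, &mut out);
        input = &input[used..];
        match status {
            // End of one stream is not end of file: reset and carry on.
            Status::StreamEnd => dec.restart(),
            // No progress and no stream end: bail out rather than spin.
            Status::Ok if used == 0 => break,
            Status::Ok => {}
        }
    }
    out
}

/// Toy decoder: a stream is a length byte followed by that many raw bytes.
struct ToyDecoder {
    /// Payload bytes still expected; `None` means no header read yet.
    remaining: Option<usize>,
}

impl StreamDecoder for ToyDecoder {
    fn decode(&mut self, input: &[u8], out: &mut Vec<u8>) -> (usize, Status) {
        let mut used = 0;
        let mut left = match self.remaining {
            Some(n) => n,
            None => {
                if input.is_empty() {
                    return (0, Status::Ok);
                }
                used = 1; // consume the length byte
                input[0] as usize
            }
        };
        let take = left.min(input.len() - used);
        out.extend_from_slice(&input[used..used + take]);
        used += take;
        left -= take;
        self.remaining = Some(left);
        if left == 0 {
            // Stays "finished" until restart() is called.
            (used, Status::StreamEnd)
        } else {
            (used, Status::Ok)
        }
    }

    fn restart(&mut self) {
        self.remaining = None;
    }
}
```

With this shape, returning at the first `StreamEnd` yields only the first stream's payload, which is the symptom reported in this issue. Looping until the input is empty yields the payload of every stream.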
@iamlemec oh I think you can solve that by replacing |
Works great now. Thanks!
@iamlemec were you thinking of sending a PR akin to |
Definitely. I'll check out |
I'm encountering a slight issue when using the library: I'm only getting the first block of bz2 files. In my case, the block size is 900k, and when reading off BzDecoder, I get a total of 900k then EOF (size zero reads thereafter).

This happens on both the crates.io release and master here on GitHub. Am I confused here? Would appreciate any suggestions. Thanks!