Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decoding with LZ77 shared dictionary fails on a case that works correctly with the c implementation #36

Open
garretrieger opened this issue Mar 13, 2025 · 0 comments

Comments

@garretrieger
Copy link

Here's a unit test that can reproduce:

#[test]
fn test_failing_patch() {
  let patch: &[u8] = &[
    27, 103, 0, 96, 47, 14, 120, 211, 142, 228, 22, 15, 167, 193, 55, 28, 228, 226, 254, 54, 10,
    36, 226, 192, 19, 76, 50, 8, 169, 92, 9, 197, 47, 12, 211, 114, 34, 175, 18, 241, 122, 134,
    170, 32, 189, 4, 112, 153, 119, 12, 237, 23, 120, 130, 2,
  ];

  let dict: Vec<u8> = vec![
    2, 0, 0, 0, 0, 213, 195, 31, 121, 231, 225, 250, 238, 34, 174, 158, 246, 208, 145, 187, 92, 2,
    0, 0, 4, 0, 0, 0, 46, 0, 0, 0, 0, 0, 11, 123, 105, 100, 125, 46, 105, 102, 116, 95, 116, 107,
    20, 0, 0, 52, 40, 103, 221, 215, 223, 255, 95, 54, 15, 13, 85, 53, 206, 115, 249, 165, 159,
    159, 16, 29, 37, 17, 114, 1, 163, 2, 16, 33, 51, 4, 32, 0, 226, 29, 19, 88, 254, 195, 129, 23,
    25, 22, 8, 19, 21, 41, 130, 136, 51, 8, 67, 209, 52, 204, 204, 70, 199, 130, 252, 47, 16, 40,
    186, 251, 62, 63, 19, 236, 147, 240, 211, 215, 59,
  ];

  let mut input_buffer: [u8; 4096] = [0; 4096];
  let mut output_buffer: [u8; 4096] = [0; 4096];

  let mut cursor = Cursor::new(patch);
  let mut output: Vec<u8> = vec![];

  let res = super::BrotliDecompressCustomDict(
    &mut cursor,
    &mut output,
    &mut input_buffer,
    &mut output_buffer,
    dict,
  );

  assert!(res.is_ok(), "Unexpected error {:?}", res);
  assert_eq!(
    output,
    vec![
      0x02, 0x00, 0x00, 0x00, 0x00, 0x8c, 0x16, 0xa6, 0x25, 0x18, 0xf8, 0x68, 0x63, 0x4e, 0xe4,
      0x09, 0x2b, 0xa1, 0xe2, 0x4b, 0xba, 0x02, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00, 0x2e, 0x00,
      0x00, 0x00, 0x00, 0x00, 0x0b, 0x7b, 0x69, 0x64, 0x7d, 0x2e, 0x69, 0x66, 0x74, 0x5f, 0x74,
      0x6b, 0x14, 0x00, 0x00, 0x38, 0x1d, 0x25, 0x11, 0x72, 0x01, 0xa3, 0x02, 0x10, 0x21, 0x33,
      0x04, 0x20, 0x00, 0xe2, 0x1d, 0x13, 0x58, 0xfe, 0xc3, 0x81, 0x17, 0x19, 0x16, 0x08, 0x13,
      0x15, 0x29, 0x82, 0x88, 0x33, 0x08, 0x43, 0xd1, 0x34, 0xcc, 0xcc, 0x46, 0xc7, 0x82, 0xfc,
      0x2f, 0x10, 0x28, 0xba, 0xfb, 0x3e, 0x3f, 0x13, 0xec, 0x93, 0xf0, 0xd3, 0xd7, 0x3b,
    ]
  );
}

I turned on logging for both the c implementation and this one and found that they diverged in behaviour on the first insert/copy:

c brotli:

...
[ProcessCommandsInternal] pos = 0 insert = 25 copy = 24
p1 = 0, p2 = 0
[ProcessCommandsInternal] context = 0
...

rust brotli:

...
"[ProcessCommandsInternal] pos = %d insert = %d copy = %d distance = %d\n" 0 25 24 -1
"p1" = 59
"p2" = 215
"context" = 21
...

After some more digging I believe the source of the issue is that the ring buffer is initialized with the last two bytes set to 0 (here in c brotli and here in rust brotli) but in the rust brotli implementation the end of the ring buffer is immediately overwritten with the shared dictionary. As a result the p1 and p2 lookups get the value of the last two bytes of the shared dictionary instead of 0 as in the c implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant