-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Non-standard escaping of zero bytes #2
Comments
perhaps this is something that needs to be fixed. |
Perhaps @zimmerle knows more? He introduced the change from \u0000 to \u0000 after all. :-) |
Hi @dstelter, Giving the example: '\u0000', the '' is not a escaping character, rather it is part of the unicode representation of '\0'. In ModSecurity we use the YAJL library to load the JSON into memory. The YAJL library considers the content of a JSON to be a NULL-terminated string. This means that whenever it face a '\0' it considers (somewhat expected) the end of the string, thus, not reading the entire data. The escape character was placed to avoid the partial loading of a content. Once the content is load into memory the double slash (\u0000) is replaced by a single slash, therefore, representing again a NULL byte. I opened a ticket on GitHub for yajl project (I don't remember the issue number) to ask for an api where I will be able to determinate the actually number of bytes to be read, disregard of the NULL byte. The developers, gave me an convince explanation that JSON should not transport such stuff as NULL byte. Maybe to use an transport encoding on top of it (base64?), but i didn't like the idea of loosing the readability, so I ended up adopting this creative way of represent the null byte: \u0000. Is that a problem to you? Do you have a better idea? The library that you are using to load those JSON files is able to handle the \u0000 right? |
Thanks for the clarification! I see your point now. It is certainly not a big deal to "unfix" the \u0000 to \u0000, yet still somewhat confusing for test consumers. After all \u0000 is the canonical representation of null bytes. Imo, such implementation quirks should be kept out of the test data and rather be dealt with by some loading shim. |
Hi @dstelter, I agree with you, i would prefer to have the \u0000 as \u0000 not \u0000. Your help will be more than appreciate. The code for the test utility is available here: The specific part where the \u0000 is replaced, is here: We don't want to add another dependency (e.g. another JSON library), and I would prefer not to replace the YAJL, as it is used in different parts of the code, unless it is really necessary. Notice that the tests repository is a gitsub repo, inside the libmodsecurity branch. Thanks!! |
I'm wondering why zero bytes are now double-escaped, which essentially breaks naive JSON parsing (e.g. using cjson).
Could you shed some light on the reasoning? What's wrong with \u0000?
The text was updated successfully, but these errors were encountered: