Update f1_c_gen.py #50

Merged
1 commit merged on Jun 28, 2024

Conversation

20urc3 (Contributor) commented Jun 28, 2024

Fix #46

This commit addresses a UnicodeEncodeError that occurred when attempting to
serialize TreeNode objects containing Unicode characters outside the Latin-1
range (0-255). The specific error was triggered by the character '\u2421'.

Changes:

  1. Modified TreeNode.to_bytes() method:

    • Replaced Latin-1 encoding with UTF-8 for broader Unicode support.
    • Updated val_len to store the byte length of the UTF-8 encoded string
      instead of the character count.
  2. Updated TreeNode.from_bytes() method:

    • Changed decoding from Latin-1 to UTF-8 to match the new encoding.

These modifications allow the serialization and deserialization of TreeNode
objects containing any valid Unicode character, resolving the
UnicodeEncodeError while maintaining compatibility with the existing byte
structure.
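The round trip described above can be sketched as follows. This is a minimal illustration, not the code from f1_c_gen.py: the `Node` class, its single `val` field, and the 4-byte little-endian length prefix are assumptions made for the example, and the real TreeNode layout may differ. The point it shows is that the length prefix must count encoded bytes, not characters.

```python
import struct
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for TreeNode; the real field layout may differ."""

    val: str

    def to_bytes(self) -> bytes:
        encoded = self.val.encode("utf-8")
        # The prefix stores the number of encoded bytes, not len(self.val),
        # because one character may occupy several bytes in UTF-8.
        return struct.pack("<I", len(encoded)) + encoded

    @classmethod
    def from_bytes(cls, data: bytes) -> "Node":
        (val_len,) = struct.unpack_from("<I", data, 0)
        # Decode with the same codec that to_bytes() used.
        return cls(data[4:4 + val_len].decode("utf-8"))


node = Node("\u2421;")
blob = node.to_bytes()
restored = Node.from_bytes(blob)
```

If the prefix held the character count instead (2 here rather than 4), the reader would slice the payload in the middle of a multi-byte sequence and the decode would fail.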

Note: Non-ASCII characters take 2 to 4 bytes each in UTF-8, compared with
1 byte in Latin-1, so serialized data may grow slightly. In exchange, all
Unicode characters in the grammar are handled correctly.
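To make the size trade-off concrete, the short sketch below measures the UTF-8 byte length of a few sample characters. The sample list and the helper name `utf8_len` are chosen for illustration only and are not taken from the PR.

```python
def utf8_len(text: str) -> int:
    """Return the number of bytes text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


# ASCII, a Latin-1 character, the character from the error, and an emoji.
samples = ["a", "\u00e9", "\u2421", "\U0001F600"]
sizes = {ch: utf8_len(ch) for ch in samples}

# Only the first two samples are representable in Latin-1 at all,
# where each takes exactly one byte.
latin1_size = len("\u00e9".encode("latin-1"))
```

ASCII input is unaffected; only characters above code point 127 grow.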

20urc3 (Contributor, Author) commented Jun 28, 2024

This allows javascript.json to compile out of the box using: make GRAMMAR_FILE=grammars/javascript.json. Until now this was broken and failed with:

UnicodeEncodeError: 'latin-1' codec can't encode character '\u2421' in position 0: ordinal not in range(256)
make: *** [GNUmakefile:102: src/f1_c_fuzz.c] Error 1
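The encoding failure can be reproduced outside the build with a few lines of Python. The helper `try_encode` is a hypothetical name used only for this sketch; it relies on nothing beyond the built-in `str.encode`.

```python
def try_encode(text: str, codec: str):
    """Return the encoded bytes, or the exception if the codec rejects text."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError as exc:
        return exc


# U+2421 lies outside the 0-255 range that Latin-1 can represent.
latin1_result = try_encode("\u2421", "latin-1")
utf8_result = try_encode("\u2421", "utf-8")
```

Here `latin1_result` is a `UnicodeEncodeError`, while `utf8_result` is the three-byte UTF-8 sequence for the character.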

vanhauser-thc (Member) commented

thank you!

@vanhauser-thc vanhauser-thc merged commit 05d8f53 into AFLplusplus:stable Jun 28, 2024
2 checks passed
Successfully merging this pull request may close these issues.

UnicodeEncodeError
2 participants