c-utf8: Always call strlen() on strings of unknown length #7
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was provision in the validation loop for continuing to validate until a nul byte was hit, but unfortunately that often results in buffer overflow reads when used with heap allocated strings.
This is because the SIMD code reads 4 bytes at a time. If the string’s terminal nul byte is in bytes 0–3, this results in a 3–1 byte overread. This is detected and warned about by valgrind and asan, even though it’s harmless because:
Unfortunately, there’s no easy way to tell valgrind or asan about this. We could use the valgrind client API to tell it that the overread is harmless, but using the API is a significant runtime overhead. Asan doesn’t allow any warnings to be suppressed apart from telling it to not instrument the entire function, which would mean that basically all the code in c-utf8 would be uninstrumented. So we’d never be told about other, true positive, errors, if they exist.
So, the least bad option seems to be to explicitly calculate the string length with
strlen()
before validating, if the length is not specified. Many callers into c-utf8 will already specify the length, in which case this commit is a no-op.See also: https://gitlab.gnome.org/GNOME/glib/-/issues/3493
Signed-off-by: Philip Withnall [email protected]
Fixes: #6