Be more careful about file reading #4476
Merged
If SCons reads a file to interpret the contents, codecs are a concern. The `File` node class has a `get_text_contents()` method which makes a best effort at decoding bytes data, but other places don't get their file contents via that method. They should do their own careful decoding, but instead they just read the file as text and hope it's okay.

Move the decode-bytes portion out of `File.get_text_contents()` to `SCons.Util.to_Text()` so that everyone who needs it can call it. Add a couple of additional known BOM codes (after consulting Python's `codecs` module).

Note that while `get_text_contents` acts on nodes, the new (moved) routine `to_Text` acts on passed bytes, so it can be used in a non-Node context as well. For example, the Java tool initializer reads a file and tries to decode it, and can get it wrong (see #3569); this change provides it some help.

Fixes #3569
Fixes #4462
No docs impact: documented behavior does not change, just makes SCons less error-prone in certain situations.
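
For readers unfamiliar with the technique, here is a minimal sketch of BOM-aware, best-effort decoding of raw bytes. It is an illustration only and is not the code in this PR: the function name `decode_bytes_sketch`, the BOM table, and the fallback order (UTF-8, then Latin-1) are assumptions made for the example. Only the BOM constants come from Python's `codecs` module.

```python
import codecs

# Hypothetical BOM table for this sketch. Order matters: the UTF-32 LE BOM
# (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE), so the UTF-32 entries
# are checked before the UTF-16 ones.
_BOM_TABLE = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_bytes_sketch(data: bytes) -> str:
    """Best-effort decode of bytes to str (illustrative, not SCons code)."""
    # If a known BOM is present, strip it and decode with the matching codec.
    for bom, encoding in _BOM_TABLE:
        if data.startswith(bom):
            return data[len(bom):].decode(encoding, errors="backslashreplace")
    # No BOM: assume UTF-8 first, since it is the most common encoding.
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Assumed fallback: Latin-1 maps every byte to a character,
        # so this decode cannot fail.
        return data.decode("latin-1")
```

A caller outside a Node context would read the file in binary mode and pass the bytes along, for example `decode_bytes_sketch(open(path, "rb").read())`, where `path` is whatever file the tool needs to inspect, rather than opening the file in text mode and relying on the platform's default encoding.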
Contributor Checklist:
`CHANGES.txt` (and read the `README.rst`)