Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Search acTL only among chunk types #4

Merged
merged 9 commits into from
Aug 12, 2024

Conversation

madtisa
Copy link

@madtisa madtisa commented Aug 12, 2024

Byte sequence acTL can be encountered in other parts of PNG, that are not chunk type (e.g. in chunk data or crc), so we should check only chunk types to avoid false positive detection.

madtisa and others added 2 commits August 12, 2024 12:12
Byte sequence `acTL` can be encountered in other parts of PNG,
that are not chunk type (e.g. in chunk data or crc),
so we should check only chunk types to avoid false positive detection.
@vHeemstra
Copy link
Owner

vHeemstra commented Aug 12, 2024

Thanks for the valid improvement.

Before merging this and pushing a new release, do you have any ideas how we could early bail here, in order to prevent 'parsing' the entire file in special cases*.

I tried figuring out the maximum length of content before an IDAT chunk has to occur, but possible content seems too divers to have a definitive number here I think.

* In case a PNG file has a valid signature, but erroneous content (by accident or on purpose).

* Added additional tests for input validation handling and data structure constraints
* Removed test related to string matching
@madtisa
Copy link
Author

madtisa commented Aug 12, 2024

Thanks for the valid improvement.

Before merging this and pushing a new release, do you have any ideas how we could early bail here, in order to prevent 'parsing' the entire file in special cases*.

I tried figuring out the maximum length of content before an IDAT chunk has to occur, but possible content seems too divers to have a definitive number here I think.

  • In case a PNG file has a valid signature, but erroneous content (by accident or on purpose).

I don't think it would be possible to prevent simulating worst case scenario even if we fully validate all png chunks. Judging by spec png file can consist of sequence of small chunks (our worst-case scenario, performance-wise) and still be valid. Either way, even in worst case complexity won't exceed O(n) (where n - file size).
On average, jumping chunks should be quite fast. And even if file got corrupt (unintentionally), there is a high chance that random 4 bytes of chunk length would be big enough to throw us over the end of the file.

As for improvement to the search, we can include optional PLTE chunk as additional stopping condition (saw it from the link you've provided) UPD: never mind, it doesn't seem to be the case according to this: https://www.w3.org/TR/png/#acTL-chunk

@madtisa
Copy link
Author

madtisa commented Aug 12, 2024

Here is some useful info on fast-fail when file is corrupted: https://www.w3.org/TR/png/#13Error-checking

  • if 5-th (zero-based) bit of chunk type is 0 (critical chunk type) check it in known chunks list and return false is it's missing
  • check that all chunk type symbols are in range 41-5A and 61-7A (hex)

that shouldn't impact performance much

@madtisa
Copy link
Author

madtisa commented Aug 12, 2024

Included commit with chunk type validation as a proof of concept: 60087bc
You can revert it if you wish

Check chunk types symbol range and whether it's known critical chunk
@vHeemstra
Copy link
Owner

vHeemstra commented Aug 12, 2024

Again, thanks for your work on this.

I was looking for the most optimal way to fail-fast if we don't find the acTL chunk.
In the rare case where there is no IDAT chunk present (invalid (A)PNG), our current script would scan the entire file (in chunks). I figured this worst-case might be something to consider. But seeing there is no quick way to abort checking, without moving into validator territory , I think we will leave it for now.

We could, of course, add an extra exported function (or include an optional toggle argument) to let users opt-in for full validation of the file. But that would better suit a new feature branch.

For now, I will revert the chunk type validation and create a new branch for that (see here). If we want, we can work on it there.

@vHeemstra vHeemstra merged commit 32654e8 into vHeemstra:main Aug 12, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants