-
Notifications
You must be signed in to change notification settings - Fork 503
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bugfix: Handle seekhead for cues on mkv files. #2268
base: main
Are you sure you want to change the base?
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
This file was crafted by hand in a hex editor by modifying `sample.mkv`: 1. Adding two SeekHead elements at the end, each with a single Seek entry, with the first pointing at the Cues element and the second pointing at the Tags element. 2. Update the original SeekHead element so the Seek entries that previously pointed to the Cues and Tags element now point out the two new SeekHead elements we added. The resulting file is successfully parsed by `mkvinfo` and playable with VLC. The dump files were generated with all changes in MatroskaExtractor reverted (which is why they're unseekable). Bringing the MatroskaExtractor changes back and running the test results in a larger than expected diff in the dump file. Specifically all the samples are dropped.
I played around a bit with this, and added a new test file with three SeekHead elements (f5a6ed0). When I run the test with this new file without the In general I think this PR is an incomplete implementation of the full flexibility that the matroska spec supports with recursive SeekHead elements. |
This fixes the seek once behavior by adding a list of pending seekheads that we can visist, if we do not find cues.
This fixes the seek once behavior by adding a list of pending seekheads that we can visist, if we do not find cues. |
Thanks for the follow-up fixes. I've re-run
And the diff is (i.e. the 'passing' run results in unseekable output, while the 'failing' run is seekable):
The pattern is that the tests pass with
The same failure pattern occurs in the
Please could you take a look? |
This makes no sense at all, but the bug is fixed by removing this anti recursion check. I have no idea how and why. Is endMasterElement called twice with the exact same values? Can you please detail what really happends when an IO exception is thrown with an unknown length, does it restart from the beginning? By removing this all tests return if (visitedSeekHeads.add(seekEntryPosition)) {
pendingSeekHeads.add(seekEntryPosition);
} -> pendingSeekHeads.add(seekEntryPosition); |
Looks like this is the case, just calling clear will fix it, if we know it has reset. What function is called when the stream resets? We can put it in startMasterElement ID_SEGMENT, but that could be abused for infinite recursion. visitedSeekHeads.clear();
pendingSeekHeads.clear(); |
The
|
Ye, I read that, but are you sure it does not reset from 0 when unknown length=true? Because that is the only way this makes sense. I need to somehow know when that occurs to reset all the variables keeping track of what is visited, otherwise we cant revisit them due to infinite recursion. |
I checked now, the docs are wrong. It definitely restarts if an IOException is thrown and unknown length = true. So the documentation should be something like:
As just logging System.out.println("VISITED ID_SEGMENT: ("+contentPosition+") current count = "+ visistedCount++);
|
It looks like the 'restart from the beginning' behaviour is happening when media/libraries/test_utils/src/main/java/androidx/media3/test/utils/ExtractorAsserts.java Lines 408 to 410 in c858abd
[1] media/libraries/test_utils/src/main/java/androidx/media3/test/utils/ExtractorAsserts.java Lines 320 to 321 in c858abd
|
Well then the docs should be updated. However I still need to know when this happens, as I cant differentiate this behavior from a a malicious recursive mkv. Between resets I have to reset the variables to keep track of current seekheads, otherwise it is impossible to properly parse seekheads. How would you suggest I implement this clearing? |
With some logging added locally [1], in the
[1]
And then later, when we're no longer in the
Perhaps in |
Please check last commit |
// | ||
// Note that we also need to check that we do not jump before or to the segment we are on | ||
// as we do not want to clear our visitedSeekHeads | ||
if (visitedSeekHeads.add(seekEntryPosition) && seekEntryPosition > segmentContentPosition) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seekEntryPosition > segmentContentPosition
is needed to not seek above the segment to avoid clearing this visistedlist, however should this also be added to cues, or do we trust cues to be valid?
Closes #1143
This fixes the issue with non seekable mkv files, by checking the SEEK_HEAD, as media3 does not currently use SEEK_HEAD at all. A test was also added showing a non seekable mkv file, this was generated by
mkvpropedit sample.mkv --chapters large_chapters.xml
.All the credit for fire-light42