-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Using select_streams argument traverses file twice #60
Comments
I agree that duplicating chunk reading is not ideal, but at that time it was the quickest solution, because I could use functions from an alternative XDF reader I'd written. Also, these functions were mainly intended for querying the XDF file ( If you want to put in the effort to de-duplicate chunk reading code paths this would be great, provided that there are no regressions in functionality (i.e. keeping the two functions I've mentioned before). |
@cbrnr - I'm doing a little cleanup of pyxdf, deduplicating code, eliminating the double-traversal when using Am I right in assuming that In my deduplication effort, I'm reusing your |
I'll answer in #74. |
Currently, when one uses the
select_streams
argument, it will first traverse the file's chunks to find/parse the headers, and sets some flags used subsequently for a check during theload_xdf
main loop, which traverses and parses the file again*. It does that using an alternative code path that mirrors the chunk traversal and header parsing of the main loop, but currently skips, among others, some error checking in case of file corruptions etc.Side note: I do like the idea of breaking
load_xdf
up into smaller subroutines, and I like the generator approach of_read_chunks
, i.e., there may be a possibility to refactorload_xdf
in this overall style, especially if the logic were to get more complex over the years. However, as it stands, the current main loop is still pretty simple and easy to follow (esp when one reads it alongside the spec), and could continue to serve as an approachable reference implementation for future language ports for a couple more years. So I'm not ready to pick a side at this point, also considering the effort that a full & clean refactor would come down to.For now, maybe a way to reconcile the code duplication (which I hope is temporary) and double-traversal of the file could be to add a self.skip (as in, skip processing chunks of this stream) bool in the
StreamData
constructor, and we could move the matching logic there or into a helper method/function. This way, it would run the first time a header is encountered, and then whenever one sees a chunk of that stream, one can, roughly, do anif streams[StreamId].skip: continue
near the place where it currently does that check. We could then earmark the remainder of that alternative code path (parse_chunks, _read_chunks, ...) for future consideration when or if we take on a refactor ofload_xdf
in this general style (maybe with a git tag).I'd be willing to implement the suggested change (using
StreamData.skip
) this week if there's no objection. I think this may also get us a closer to a future simple and fast load-only-headers option forload_xdf
(some time soon I'm hoping to have a separate discussion on that).(*): The double-traversal may not sound like much, it'd be relatively more costly on a network file system.
The text was updated successfully, but these errors were encountered: