-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sketch: do not hide metadata processing in sequence compression function #3241
Conversation
Thanks, this is good, I see what you mean now, the structure is clear at the expense of more layering/intermediation. I have no objections to merging this in if it works and the tests of #3232 are adapted to work with this but I don't have the bandwidth to do this myself right now. There might be a slight perf hit to doing it this way here as opposed to the original due to extra copying but probably negligible. |
Ah this is actually super easy to unit test now, easier than before - one extra advantage of this new organization |
@fengelniederhammer I've adapted the unit tests and made a function name more precise (it also filters out extra fields that are not in the schema but in the metadata) - how does this look to you? Happy for you to merge this in. See: 1d90b9e |
I added a commit to use Hamcrest matchers, because they provide better assertion errors than |
According to the docs,
|
Very nice, thanks! I see, makes sense that it's a CreateOnWrite (cow) under the hood or something like that. |
Summary
#3232 (comment)
It's missing tests and probably some thought over how to embed this into the rest of the code (e.g. the
CompressionService
is still used separately), but this sketches how we could separate the concerns (metadata postprocessing vs. sequence compression). Separation of concerns is also my main motivation for this.Screenshot
PR Checklist