Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sketch: do not hide metadata processing in sequence compression function #3241

Merged
merged 3 commits into from
Nov 19, 2024

Conversation

fengelniederhammer
Copy link
Contributor

Summary

#3232 (comment)

It's missing tests and probably some thought over how to embed this into the rest of the code (e.g. the CompressionService is still used separately), but this sketches how we could separate the concerns (metadata postprocessing vs. sequence compression). Separation of concerns is also my main motivation for this.

Screenshot

PR Checklist

  • All necessary documentation has been adapted.
  • The implemented feature is covered by an appropriate test.

@corneliusroemer
Copy link
Contributor

corneliusroemer commented Nov 19, 2024

Thanks, this is good, I see what you mean now, the structure is clear at the expense of more layering/intermediation.

I have no objections to merging this in if it works and the tests of #3232 are adapted to work with this but I don't have the bandwidth to do this myself right now.

There might be a slight perf hit to doing it this way here as opposed to the original due to extra copying but probably negligible.

@corneliusroemer
Copy link
Contributor

Ah this is actually super easy to unit test now, easier than before - one extra advantage of this new organization

@corneliusroemer
Copy link
Contributor

corneliusroemer commented Nov 19, 2024

@fengelniederhammer I've adapted the unit tests and made a function name more precise (it also filters out extra fields that are not in the schema but in the metadata) - how does this look to you? Happy for you to merge this in.

See: 1d90b9e

@fengelniederhammer
Copy link
Contributor Author

fengelniederhammer commented Nov 19, 2024

I added a commit to use Hamcrest matchers, because they provide better assertion errors than assertTrue. Looks good to me 👍

@fengelniederhammer
Copy link
Contributor Author

There might be a slight perf hit to doing it this way here as opposed to the original due to extra copying but probably negligible.

According to the docs, copy does not copy in the sense of "copying memory", so I think there is actually no performance impact. (The method name is misleading if you're used to Rust or C++)

Use the copy() function to copy an object, allowing you to alter some of its properties while keeping the rest unchanged.

https://kotlinlang.org/docs/data-classes.html#copying

@corneliusroemer
Copy link
Contributor

Very nice, thanks! I see, makes sense that it's a CreateOnWrite (cow) under the hood or something like that.

@corneliusroemer corneliusroemer merged commit 906a5cd into enforce-schema Nov 19, 2024
13 checks passed
@corneliusroemer corneliusroemer deleted the processedDataPostprocessor branch November 19, 2024 15:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants