sketch: do not hide metadata processing in sequence compression function #3241

fengelniederhammer · 2024-11-19T14:30:12Z

Summary

This still lacks tests and probably needs more thought on how to integrate it with the rest of the code (e.g. the CompressionService is still used separately), but it sketches how we could separate the two concerns: metadata postprocessing and sequence compression. Separation of concerns is also my main motivation for this change.

Screenshot

PR Checklist

All necessary documentation has been adapted.
The implemented feature is covered by an appropriate test.

corneliusroemer · 2024-11-19T14:37:17Z

Thanks, this is good. I see what you mean now: the structure is clearer, at the expense of more layering/intermediation.

I have no objections to merging this in if it works and the tests of #3232 are adapted to work with it, but I don't have the bandwidth to do this myself right now.

There might be a slight perf hit to doing it this way here as opposed to the original due to extra copying but probably negligible.

corneliusroemer · 2024-11-19T14:42:41Z

Ah this is actually super easy to unit test now, easier than before - one extra advantage of this new organization

corneliusroemer · 2024-11-19T15:27:42Z

@fengelniederhammer I've adapted the unit tests and made a function name more precise (it also filters out extra fields that are not in the schema but in the metadata) - how does this look to you? Happy for you to merge this in.

See: 1d90b9e
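For readers without the diff at hand, the behaviour described above (dropping metadata fields that are not in the schema) could look roughly like the following minimal sketch. The function name `alignMetadataToSchema`, its signature, and the field names are hypothetical and not taken from the PR. Filling missing schema fields with null is an assumption based on the title of #3232.

```kotlin
// Hypothetical helper for illustration; name and signature are assumptions, not the PR's code.
fun alignMetadataToSchema(
    metadata: Map<String, String?>,
    schemaFields: List<String>,
): Map<String, String?> =
    // Iterating over the schema (not the metadata) drops extra fields
    // and yields null for schema fields that were not stored.
    schemaFields.associateWith { field -> metadata[field] }

fun main() {
    val schemaFields = listOf("country", "date")
    val stored = mapOf("country" to "Switzerland", "obsoleteField" to "x")

    println(alignMetadataToSchema(stored, schemaFields))
    // prints {country=Switzerland, date=null}
}
```

Because such a function is pure (map in, map out), it can be unit tested without touching the compression code.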

fengelniederhammer · 2024-11-19T15:44:44Z

I added a commit to use Hamcrest matchers, because they provide better assertion errors than assertTrue. Looks good to me 👍

fengelniederhammer · 2024-11-19T15:47:45Z

> There might be a slight perf hit to doing it this way here as opposed to the original due to extra copying but probably negligible.

According to the docs, copy() does not copy in the sense of deep-copying memory. As far as I can tell it is a shallow copy: it creates a new instance and reuses the references to the unchanged property values, so I think there is practically no performance impact. (The method name is misleading if you're used to Rust or C++.)

> Use the copy() function to copy an object, allowing you to alter some of its properties while keeping the rest unchanged.

https://kotlinlang.org/docs/data-classes.html#copying
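A minimal sketch of what this means in practice: the `ProcessedData` type below is a hypothetical stand-in, not the class from this PR. It shows that `copy()` allocates a new instance but passes the untouched properties along by reference.

```kotlin
// Hypothetical type for illustration only; not taken from the PR.
data class ProcessedData(
    val metadata: Map<String, String?>,
    val sequences: Map<String, String>,
)

fun main() {
    val original = ProcessedData(
        metadata = mapOf("country" to "Switzerland"),
        sequences = mapOf("main" to "ACGT"),
    )

    // copy() creates a new ProcessedData and replaces only `metadata`.
    val updated = original.copy(metadata = mapOf("country" to "Germany"))

    // The two instances are distinct objects...
    println(updated === original)                      // prints false
    // ...but the untouched property still refers to the very same map.
    println(updated.sequences === original.sequences)  // prints true
}
```

So the cost per call is one small object allocation, independent of how large the sequence data is.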

corneliusroemer · 2024-11-19T15:50:32Z

Very nice, thanks! I see, makes sense that it's copy-on-write (CoW) under the hood or something like that.

sketch: do not hide metadata processing in sequence compression function

448b0eb

fengelniederhammer requested a review from corneliusroemer November 19, 2024 14:30

fengelniederhammer mentioned this pull request Nov 19, 2024

feat(backend): Don't store null fields of processed data for perf, emit exactly current schema in get-released-data - imputing with null #3232

Merged

2 tasks

Make test work again, mention filtering out extra fields

1d90b9e

use hamcrest assertThat for better error messages

e7bab9e

corneliusroemer approved these changes Nov 19, 2024

corneliusroemer merged commit 906a5cd into enforce-schema Nov 19, 2024
13 checks passed

corneliusroemer deleted the processedDataPostprocessor branch November 19, 2024 15:50
