Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Make it possible to read only a subset of available collections in ROOTFrameReader #504

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

tmadlener
Copy link
Collaborator

@tmadlener tmadlener commented Oct 11, 2023

BEGINRELEASENOTES

  • Add a collsToRead argument to readEntry and readNextEntry for the ROOTFrameReader to allow limiting the collections that are actually read.

ENDRELEASENOTES

This is an (early) proposal for introducing this functionality which would in the end address #499

  • Interface design: Do we want to have the possibility to change this frame-by-frame?
    • Alternatively would have to introduce a configuration method that takes a category name and a list of collections.
  • Should we only pass on the limited collection id table?
    • Interplay with Frame and assumptions there
  • Add tests and checks to avoid breaking things when collection names are passed that are not even available from the data
    • The way the check is implemented only names that exist in the frame will be checked. Non-existent names in collsToRead will be silently ignored.
  • Same capabilities for RNTuple reader
  • Documentation (once the rest has settled)
  • Ensure that files that have been read with a limited set of collections and are then written again, can be read without issues.
  • Needs Require C++20 and update to C++20 #698 as it also starts using std::ranges.
  • Needs Make sure RNTupleReader builds with ROOT > 6.34 #719 to pass on dev3 stacks

@tmadlener tmadlener linked an issue Oct 11, 2023 that may be closed by this pull request
@tmadlener
Copy link
Collaborator Author

Revived this as this is likely to become an issue for moving EDM4hep forwards otherwise. I have settled on making it possible to pass the collections that should be made available on a frame-by-frame basis as that is the most flexible and is also rather simple to implement.

I still have to make sure that the idTable that is available from the constructed Frames is properly cleaned up and also add some tests that passing in unavailable collection names doesn't break the whole thing.

@tmadlener tmadlener force-pushed the select-colls-root branch 4 times, most recently from 1123457 to 7cc113d Compare January 8, 2025 16:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow to limit the collections that are read
1 participant