Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update TrackClusterMergeSplitter to output track-cluster associations (PFA0) #1699

Open
wants to merge 26 commits into
base: main
Choose a base branch
from

Conversation

ruse-traveler
Copy link
Contributor

@ruse-traveler ruse-traveler commented Jan 9, 2025

Briefly, what does this PR introduce?

This PR updates the TrackClusterMergeSplitter algorithm to output both edm4eic::TrackClusterMatch and MC paritcle-cluster associations. In this process, it reaps what was sown by originally writing the algorithm to operate on protoclusters rather than clusters: the algorithm will now ingest fully formed cluster and update relevant quantities.

What kind of change does this PR introduce?

Please check if this PR fulfills the following:

  • Tests for the changes have been added
  • Documentation has been added / updated
  • Changes have been communicated to collaborators

Does this PR introduce breaking changes? What changes might users need to make to their code?

No.

Does this PR change default behavior?

Yes. Track-cluster and MC particle-cluster associations will now be produced by the algorithm.

@ruse-traveler ruse-traveler marked this pull request as ready for review February 4, 2025 22:53
Comment on lines +528 to +535
// --------------------------------------------------------------------------
//! Calculate cluster shape parameters
// --------------------------------------------------------------------------
/*! Calculation originally written by Chao Peng, Dhevan Gangadharan,
* and Sebouh Paul. Code is copied from the `CalorimeterClusterRecoCoG`
* algorithm.
*/
void TrackClusterMergeSplitter::calculate_shape_parameters(edm4eic::MutableCluster& clust) const {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this factorizes so well, we might as well move it to a separate factory.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This is already a big PR. If you decide to go along with my suggestion, you could refactor existing code separately, and then rebase this to re-use that new facility)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be very much in favor of that! Keeps it nice and modular 😉

But how would this work in the data model? Would we have one collection without shape parameters filled in (likely not saved by default) and then one with them filled in (saved by default)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Started work on this in #1734 !

Comment on lines 63 to 64
using MatrixF = std::vector<std::vector<float>>;
using VecMatrix = std::vector<MatrixF>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like, in some places you try to use maps, and here you give up and just rely on array indices? Is it true that every hit has a weight related to every cluster and every projection?

Copy link
Contributor Author

@ruse-traveler ruse-traveler Feb 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I do think we need a weight that's related to every cluster and every projection pointing to the merged cluster:

  • the former is because the merged cluster will (naturally) be composed of all of the hits of the clusters being merged,
  • and the latter is because we'll need to know the sum of all projections' momenta for the weight, so that summing over the split clusters should give the merged cluster energy before splitting.

(Unless I'm misunderstanding how the splitting should work!)

All that being siad, I probably could rework it to use maps instead of indices. I was thinking of this map of weights (hit_weight = hit_weight(cluster, projection)) as a "matrix", and this was an easy way of implementing a dynamic 2D matrix of floats 🤷

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I do think we need a weight that's related to every cluster and every projection pointing to the merged cluster:

I think you're right. Looks like this only involves hits from the cluster group.

All that being siad, I probably could rework it to use maps instead of indices. I was thinking of this map of weights (hit_weight = hit_weight(cluster, projection)) as a "matrix", and this was an easy way of implementing a dynamic 2D matrix of floats 🤷

Ideally we write the code to maximize clarity. Having a consistent use of maps could help with that, if there is a neat way to use those. Using maps also provides basic sanity checks to the code.

If you have a preference to use tensors here, then just remember std::vector<std::vector<std::vector< has triple dereferences/allocations and spoils data locality, so it won't win that much in terms of performance. We already use eigen library, that could be better for your case, perhaps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see! I'll give it some thought and let you know which route (map or eigen) I go with!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After mulling it over, I think it a natural way might be to actually use a mix of maps and eigen: for each projection, we know how many clusters and hits we're going to need so we could easily define a map of projections onto MatrixXd! Let me give it a shot...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I stand corrected: it was much easier to just use std::map<edm4eic::CalorimeterHit, double>. Just pushed some changes that rewrite the splitting calculation to use that (it's a little cleaner now).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants