Distribution of detector blocks across MPI processes #334
base: master
Conversation
Excellent work, @anand-avinash. Thanks a lot! We can merge this once the documentation is updated and tests are added. Thank you for duly annotating all the types.
Great work!
Thanks for having addressed the comments!
I believe this PR is ready to be merged, but there are a few minor issues that need discussion. The implementation of attribute-based detector blocks works well in the full simulation pipeline, but I ran into a few errors elsewhere, some of which I tried to resolve and am documenting here. The first error I encountered was:
which I solved simply by adding the code now at `litebird_sim/litebird_sim/observations.py`, lines 209 to 212 (commit a941fd0).
The next error I encountered was
It appears because of the code at `litebird_sim/litebird_sim/observations.py`, lines 677 to 682 (commit a941fd0).
The next error I encountered was
It can be solved by replacing line 222 of
The next one I encountered while calling the destriper:
This one seems a little non-trivial to solve. In the end, we may face errors elsewhere as well that I have not tested. This brings me to the question: should we worry about fixing this issue, or should we leave it to the user to supply a well-balanced set of detectors?
The simplest solution would probably be to check in
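Purely as an illustration of the kind of check being discussed here (this is not code from the PR; the function name, its signature, and the warning behaviour are my own assumptions), a balance check on the detector blocks could look roughly like this:

```python
import logging
from typing import Any, Dict, List, Tuple


def _warn_if_blocks_unbalanced(detector_blocks: Dict[Tuple, List[Any]]) -> None:
    # Hypothetical helper: emit a warning when the detector blocks have
    # different sizes, since the largest block dictates the workload of the
    # MPI processes that own it.
    sizes = sorted(len(dets) for dets in detector_blocks.values())
    if sizes and sizes[0] != sizes[-1]:
        logging.warning(
            "Detector blocks are unbalanced (block sizes range from %d to %d); "
            "some MPI processes will handle more data than others",
            sizes[0],
            sizes[-1],
        )
```

Whether an imbalance should only produce a warning or be treated as an error is part of the open question above.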
This pull request is related to the discussion #325.

Here I have added a new method `_make_detector_blocks()` to the `Observation` class. It takes a list of detectors and divides them into groups as specified by the `det_blocks_attributes` parameter passed to the `Observation` class. The groups of detectors are saved in the `detector_blocks` variable of the `Observation` class. For instance, with `det_blocks_attributes = ["wafer", "pixel"]`, the detector groups are made in such a way that each group contains the detectors that share the same wafer and pixel attributes. Once the detectors are divided into groups, the number of detector blocks is set, and the number of time blocks is computed accordingly (`n_blocks_time = comm.size // n_blocks_det`). Given the structure of the code base, this scheme was easier to implement than I had expected, and it does not break any other functionality. A sketch of the grouping logic is given below.

On the other hand, this implementation can lead to load imbalance across the MPI processes. Consider a simulation with 2 MPI processes and 4 detectors: if the detectors end up in two groups, with 3 and 1 elements respectively, one MPI process will have to work with three times more data than the other.
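The following is only an illustrative sketch of the grouping described above, not the actual code of `_make_detector_blocks()`; the standalone function, the attribute access on the detector objects, and the variable names are assumptions:

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


def make_detector_blocks(
    detectors: List[Any], det_blocks_attributes: List[str]
) -> Dict[Tuple, List[Any]]:
    """Group `detectors` by the tuple of the requested attributes."""
    blocks: Dict[Tuple, List[Any]] = defaultdict(list)
    for det in detectors:
        # e.g. with ["wafer", "pixel"], detectors sharing the same wafer
        # and pixel end up in the same block
        key = tuple(getattr(det, attr) for attr in det_blocks_attributes)
        blocks[key].append(det)
    return dict(blocks)


# The number of blocks then fixes how the MPI communicator is split:
#   n_blocks_det = len(blocks)
#   n_blocks_time = comm.size // n_blocks_det
```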
Another caveat is related to the `det_idx` member of the `Observation` class. Previously, it corresponded to the global list of detectors passed to the `Observation` class; with the introduction of detector blocks this is no longer the case. As a simple solution, I have introduced a member variable `detectors_global`, which is the global list of detectors sorted according to the detector groups. Now `det_idx` corresponds to the indices of the detectors in `detectors_global`. But then we end up with three variables storing the same data in different ways: `detectors` (the list of detectors used to initialize the `Observation` class), `Observation.detector_blocks`, and `Observation.detectors_global`. A small example of how they relate is given below.

After adding documentation, this pull request should become ready to be merged. But please let me know if you have other solutions.
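To make the relation between these three variables concrete, here is an illustrative example; the detector names, the dictionary layout of `detector_blocks`, and the way `det_idx` is derived are my own assumptions based on the description above, not the actual internals of the `Observation` class:

```python
# Detectors as passed by the user, grouped with det_blocks_attributes = ["wafer"]
detectors = ["det_A1", "det_B1", "det_A2", "det_B2"]

detector_blocks = {
    ("wafer_A",): ["det_A1", "det_A2"],
    ("wafer_B",): ["det_B1", "det_B2"],
}

# detectors_global: the same detectors, re-sorted so that each block is contiguous
detectors_global = [det for block in detector_blocks.values() for det in block]
# -> ["det_A1", "det_A2", "det_B1", "det_B2"]

# det_idx on the MPI processes holding the wafer_B block refers to positions in
# detectors_global (here [2, 3]), not to positions in the original `detectors` list
det_idx = [detectors_global.index(det) for det in detector_blocks[("wafer_B",)]]
```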