For multi-tile I/O, allow concurrent reads/writes of the different tiles #343
Comments
From the Core team: Option 1 would rework how PIO is initialized in ESMF entirely so that it is initialized once. This feels like the right, cleaner approach, but the effort is huge. Bill's inclination is to go with the quicker-and-dirtier Option 2: keep the PIO initialization where it currently happens, but create a separate PIO instance for each tile.
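As a rough illustration of what "a separate PIO instance for each tile" could look like, here is a minimal, hypothetical C sketch using the ParallelIO C API: one I/O system per tile is created on the existing communicator, with the I/O tasks for each instance staggered so no two instances share an I/O task. The names and parameter choices are assumptions for illustration only, not ESMF's actual code path.

```c
/* Hypothetical sketch of "Option 2": keep PIO initialization where it is,
 * but create a separate PIO I/O system for each of the 6 tiles. The I/O
 * tasks of each instance are staggered (base = tile, stride = 6) so that
 * no two instances share an I/O task. Illustrative only. */
#include <mpi.h>
#include <pio.h>

#define NTILES 6

int main(int argc, char **argv)
{
    int nprocs, iosysid[NTILES];

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Assumes nprocs >= NTILES so each instance gets at least one I/O task. */
    int niotasks = nprocs / NTILES;

    for (int tile = 0; tile < NTILES; tile++) {
        /* One I/O system per tile; its I/O tasks are ranks tile, tile+6, ... */
        PIOc_Init_Intracomm(MPI_COMM_WORLD, niotasks, /*stride=*/NTILES,
                            /*base=*/tile, PIO_REARR_SUBSET, &iosysid[tile]);
    }

    /* ... each tile's data would then be written through its own iosysid ... */

    for (int tile = 0; tile < NTILES; tile++)
        PIOc_finalize(iosysid[tile]);

    MPI_Finalize();
    return 0;
}
```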
Jim responded that this probably should not be implemented at the PIO level. UFS team response: I am also experimenting with extending our own write_netcdf subroutines, which we currently use for writing bundles, to also support cubed-sphere grids. Gerhard helped me some time ago with an explanation of how to use FieldGather with multi-tile fields/grids. I just need to find out whether we can use the write grid component with a number of tasks per group that is not divisible by 6. If that works, that is also a possible alternative. From Jun, 1/22/2025: we want to encourage others to move to the correct solution (and not to the experimental code that we do not have the capacity to maintain).
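For context on the "not divisible by 6" question, the following is a hypothetical illustration of one way a write group whose task count does not divide evenly by the 6 tiles could still cover all tiles, using a simple block distribution. It says nothing about what the ESMF write grid component actually supports; the task count and mapping formula are assumptions chosen for the example.

```c
/* Hypothetical: map 8 write tasks (not divisible by 6) onto 6 tiles with a
 * block distribution; some tiles simply get one more task than others.
 * This does not describe the ESMF write grid component's actual behavior. */
#include <stdio.h>

int main(void)
{
    const int ntiles = 6;
    const int ntasks = 8;   /* example write-group size not divisible by 6 */

    for (int task = 0; task < ntasks; task++) {
        int tile = (task * ntiles) / ntasks;   /* block distribution */
        printf("write task %d -> tile %d\n", task, tile);
    }
    return 0;
}
```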
One issue that may need some thought here is how to set up each PIO instance so that it has a subset of the PEs dedicated to a given tile. I'm not sure this is essential, but it may be needed to actually get the performance boost we're after: if (for example) the PIO instance for tile 1 used PEs that overlap with the PEs handling the data for other tiles, we might not get the concurrency we're looking for (though I'm not sure about this). We should at least make sure this gives us the concurrency we want in simple/standard cases; it would probably be okay (at least for an initial implementation) if we don't get ideal concurrency in unusual cases, such as having multiple DEs per PET, if those cases are harder to support.
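A minimal sketch of the "disjoint subset of PEs per tile" idea, assuming plain MPI plus the ParallelIO C API: the world communicator is split into one group per tile, and a separate PIO I/O system is initialized on each group, so the per-tile writes can proceed concurrently. This is an illustration of the concept under those assumptions, not ESMF's implementation.

```c
/* Illustrative only: split the PEs into one disjoint group per cubed-sphere
 * tile, then initialize a separate PIO I/O system on each group so the six
 * tile writes can proceed concurrently. Not ESMF's actual code. */
#include <mpi.h>
#include <pio.h>

int main(int argc, char **argv)
{
    const int ntiles = 6;          /* cubed-sphere tiles */
    int rank, nprocs, iosysid;
    MPI_Comm tile_comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Assign each PE to exactly one tile (block distribution), so the
     * resulting per-tile communicators are disjoint. */
    int tile = (rank * ntiles) / nprocs;
    MPI_Comm_split(MPI_COMM_WORLD, tile, rank, &tile_comm);

    /* One PIO I/O system per tile, using only that tile's PEs.
     * Here every PE in the tile group acts as an I/O task (stride 1, base 0). */
    int tile_nprocs;
    MPI_Comm_size(tile_comm, &tile_nprocs);
    PIOc_Init_Intracomm(tile_comm, tile_nprocs, /*stride=*/1, /*base=*/0,
                        PIO_REARR_SUBSET, &iosysid);

    /* ... create a file per tile on iosysid and write that tile's data;
     * because the groups are disjoint, the writes can overlap in time ... */

    PIOc_finalize(iosysid);
    MPI_Comm_free(&tile_comm);
    MPI_Finalize();
    return 0;
}
```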
There are performance issues with multi-tile I/O since the tiles are written sequentially (see recent notes in https://github.com/esmf-org/esmf-support/issues/489).
We'd like to investigate writing the tiles concurrently, either by having multiple PIO instances or by changing something else about the PIO setup to allow concurrency, or at least to get performance similar to what this concurrency would provide.
See notes on this in the 2025-01-15 Core Team Meeting Notes.