This is to document a known issue with the Slingshot 11 network. Legion runs fail with the following error when launched on only 1 node:
*** FATAL ERROR (proc 0): in gasnetc_ofi_init() at .../gasnet_ofi.c:946: fi_domain failed: -38(Function not implemented)
I have been told that this is an issue with the SLURM integration, and therefore is not something that Legion/GASNet are in a position to directly address.
In the meantime, I'm aware of two workarounds:
- Use 2 or more nodes
- Run with `srun --network=single_node_vni`
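
As a rough sketch of how the second workaround might be applied only when it is needed, a launch script could add the flag for single-node jobs and omit it otherwise. This is an illustration under assumptions, not something confirmed by OLCF or the GASNet developers: `./my_legion_app` is a hypothetical application binary, `srun_network_opts` is a helper name made up for this example, and the script only prints the command (a dry run) rather than executing it.

```shell
#!/bin/sh
# Sketch: add the single-node workaround flag only when the job uses 1 node.
# ./my_legion_app is a hypothetical binary; srun_network_opts is an
# illustrative helper, not part of Slurm, Legion, or GASNet.

# Print the extra srun option needed for the given node count (if any).
srun_network_opts() {
    nodes="$1"
    if [ "$nodes" -eq 1 ]; then
        # Workaround from this issue for single-node runs
        echo "--network=single_node_vni"
    fi
}

# SLURM_JOB_NUM_NODES is set by Slurm inside an allocation; assume 1 otherwise.
NODES="${SLURM_JOB_NUM_NODES:-1}"

# Dry run: print the command instead of executing it.
echo srun -N "$NODES" $(srun_network_opts "$NODES") ./my_legion_app
```

Removing the leading `echo` on the last line would launch the job for real. With 2 or more nodes the helper prints nothing, so the command falls back to a plain `srun`.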
I will update this issue when the workarounds are no longer required.
Edit: I understand that the issue is related to SLURM settings at OLCF, not necessarily to Slingshot 11 per se.
elliottslaughter changed the title from "gasnetc_ofi_init failure on Slingshot 11 networks with 1 node" to "gasnetc_ofi_init failure on Frontier/Crusher with 1 node" on Aug 24, 2023