NA OFI: inadvertent activation of psm3 instead of psm2 #485

Open
carns opened this issue Jul 12, 2021 · 1 comment


carns commented Jul 12, 2021

Describe the bug

After upgrading libfabric (built from the external repo at https://github.com/mochi-hpc/mochi-spack-packages/tree/main/packages/libfabric) from 1.11.1 to 1.13.0, psm2 performance on the LCRC Bebop system dropped by an order of magnitude. With 1.13.0, performance returns to normal if libfabric is compiled with an explicit --disable-psm3 configure argument, so the libfabric package linked above now hardcodes --disable-psm3. At the moment we are not running mochi code in any native psm3 environments.
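For readers unfamiliar with the workaround, here is a rough sketch of what hardcoding the flag amounts to. It is not the actual code from the package linked above: `configure_args` and its `known_fabrics` default are hypothetical names chosen for illustration, and only the `--enable-<provider>` / `--disable-<provider>` flag style comes from libfabric's configure script.

```python
def configure_args(enabled_fabrics,
                   known_fabrics=("psm", "psm2", "psm3", "sockets", "tcp", "verbs")):
    """Hypothetical helper: build libfabric ./configure flags.

    enabled_fabrics: set of provider names the user asked for.
    known_fabrics:   illustrative list of providers, not exhaustive.
    """
    args = []
    for fabric in known_fabrics:
        if fabric == "psm3":
            # psm3 is handled unconditionally below, whatever the user asked for.
            continue
        state = "enable" if fabric in enabled_fabrics else "disable"
        args.append(f"--{state}-{fabric}")

    # Workaround described in this issue: never build the psm3 provider.
    args.append("--disable-psm3")
    return args
```

With this sketch, `configure_args({"psm2"})` yields `--enable-psm2` and `--disable-psm3`, and asking for psm3 explicitly still produces `--disable-psm3`.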

To Reproduce

Reproduce with the margo-p2p-bw and margo-p2p-latency tests in https://github.com/mochi-hpc-experiments/mochi-tests/blob/main/perf-regression/bebop/margo-regression.sbatch.

The package list looks like this:

Expected behavior

I didn't expect that enabling the psm3 provider would affect psm2 performance. Are we activating psm3 by accident, and if so, is there a way to prevent that?
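One runtime option that may be worth testing is libfabric's `FI_PROVIDER` environment variable, which filters the providers libfabric will consider (a leading `^` negates the filter). The sketch below only prepares an environment for launching a benchmark. `env_for_provider` is a hypothetical helper name, and it is an untested assumption that this restores psm2 performance on Bebop or that psm3 is in fact the provider being selected.

```python
import os


def env_for_provider(provider, base_env=None):
    """Return a copy of the environment with FI_PROVIDER set.

    provider: a libfabric provider filter, e.g. "psm2" to allow only psm2,
              or "^psm3" to exclude psm3.
    base_env: optional mapping to copy instead of os.environ.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["FI_PROVIDER"] = provider
    return env
```

A benchmark could then be launched with something like `subprocess.run(cmd, env=env_for_provider("^psm3"))`, where `cmd` stands for whatever command line the sbatch script already uses.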


carns commented Jul 12, 2021

@marcvef heads up

@soumagne soumagne added this to the future milestone Oct 8, 2021