Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise number of threads used by GASNet by default + add warning if exceeded #26448

Merged
merged 5 commits into from
Jan 17, 2025

Conversation

bradcray
Copy link
Member

@bradcray bradcray commented Dec 21, 2024

[co-developed by @bonachea]

As noted in #26379, we've recently had a user on a high-core-count system get surprised when we silently dialed the maximum number of tasks down to 255 rather than using the number of cores on that system. This PR addresses that issue in two ways:

  • by inserting a warning if/when this happens so that the user can be aware that it is happening. This warning is specialized to tell how to raise the limit for GASNet users (the only case that imposes a limit at present, we think).
  • by raising the number of threads that GASNet is configured with to 65535 as suggested by @bonachea in GASNet's max threads per process silently limiting Chapel's maxTaskPar #26379 (comment), which will result in 1024 being used as the default limit unless overridden with GASNET_MAX_THREADS

In testing this, I also updated the chpl_comm_getMaxThreads() call to use the soft limit from GASNet on the number of threads rather than the hard, configuration-time limit that we were using previously. Since these two values have traditionally been the same (by default at least, assuming the user didn't override either), this has not been an issue, but now that we're configuring to one value and relying on GASNet's default for the other, it becomes more iimportant. Without this change, the implication is that if the user had used more than 1024 threads, they could exceed the size of the arrays that GASNet had allocated and bad things would happen.

While here, I also noticed that in our Makefiles we seemed to be passing the list of GASNet configuration options to it twice, introduced in 2014 in d765982. I don't see any indication that that's anything more intentional than a typo, so here, I've removed one of the occurrences, leaving the one at the end of the line there, thinking it'd support overriding other, earlier options. After the fact, Dan pointed out that he'd made the same change in #25414 — a PR that seems to have fallen between the cracks at some point. I also adopted the FORCE is a phony target rule from there to permit it to be closed and opened #26548 to note that we should do the same in other Makefiles.

Resolves #26379
Resolves #25414

Copy link
Contributor

@bonachea bonachea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes!

Made a few minor observations, in case they're helpful.

runtime/src/tasks/qthreads/tasks-qthreads.c Show resolved Hide resolved
third-party/gasnet/Makefile Show resolved Hide resolved
Borrowed from @bonachea's PR in chapel-lang#25414 which contained this along
with the other Makefile change I made in this PR.

---
Signed-off-by: Brad Chamberlain <[email protected]>
@bradcray bradcray marked this pull request as ready for review January 15, 2025 22:06
runtime/src/tasks/qthreads/tasks-qthreads.c Outdated Show resolved Hide resolved
runtime/src/tasks/qthreads/tasks-qthreads.c Show resolved Hide resolved
third-party/gasnet/Makefile Show resolved Hide resolved
third-party/gasnet/Makefile Show resolved Hide resolved
runtime/src/tasks/qthreads/tasks-qthreads.c Outdated Show resolved Hide resolved
This does two things:

* specializes the error message I just added to only print the
  GASNet-specific information if CHPL_COMM is GASNet once Jade
  reminded me of how easy it was

* fixes an issue that I noticed in testing where the value being
  returned was the hard limit on the number of threads that GASNet is
  configured to use rather than the soft limit that it's using in
  practice.  With Dan's help, I updated this to the API that we
  should've been using all along and put in some bounds-checking code
  that is unlikely to ever matter (b/c we configure the cap at well
  within an int(32)'s value and b/c it's hard to imagine a user
  setting it higher than max(int(32)).

---
Signed-off-by: Brad Chamberlain <[email protected]>
Copy link
Contributor

@bonachea bonachea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted one minor potential defect, otherwise LGTM!

runtime/src/comm/gasnet/comm-gasnet-ex.c Show resolved Hide resolved
This test formerly tried to exceed our thread limits by setting the
number of threads to 300, but that limit is now below the GASNet
default soft limit of 1024, so here, I bumped the number of threads up
to 2000 to exceed that limit again, and keep it testing the same
things.

---
Signed-off-by: Brad Chamberlain <[email protected]>
Copy link
Contributor

@bonachea bonachea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@bradcray bradcray merged commit 87d7e99 into chapel-lang:main Jan 17, 2025
8 checks passed
@bradcray bradcray deleted the raise-gasnet-threads-and-warn branch January 17, 2025 00:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GASNet's max threads per process silently limiting Chapel's maxTaskPar
3 participants