num_contexts patch #1122

Merged
merged 6 commits into from
May 1, 2024
4 changes: 2 additions & 2 deletions .gitmodules
@@ -1,4 +1,4 @@
[submodule "modules/tests-sos"]
    path = modules/tests-sos
    url = ../../openshmem-org/tests-sos.git
    branch = main
    url = ../../ronawho/tests-sos.git
    branch = fix_team_get_config_test
2 changes: 1 addition & 1 deletion modules/tests-sos
30 changes: 26 additions & 4 deletions src/shmem_team.c
@@ -334,11 +334,33 @@ int shmem_internal_team_split_strided(shmem_internal_team_t *parent_team, int PE
    myteam->start = global_PE_start;
    myteam->stride = PE_stride;
    myteam->size = PE_size;
    if (config) {
        myteam->config = *config;
        myteam->config_mask = config_mask;

    if (config_mask == 0) {
        if (config != NULL) {
            RAISE_WARN_MSG("%s %s\n", "team_split_strided operation encountered an unexpected",
                           "non-NULL config structure passed with a config_mask of 0.");
        }
Comment on lines +339 to +342
Collaborator

@davidozog - @dalcinl and I disagree. 🙂 We started getting warnings in shmem4py. We checked the specification and we do not see it saying config should be NULL if config_mask == 0. In fact, we are not even sure if a NULL pointer is allowed.

Collaborator

> In fact, we are not even sure if a NULL pointer is allowed.

Here, Marcin is talking about the standard: there is no explicit wording saying that if the config_mask is 0, then config may be NULL. Or did we miss something? What if some implementation decides to add a check assert(config != NULL) irrespective of the value of config_mask? Would such behavior contradict the 1.5 standard?

Member Author

@davidozog davidozog Jun 24, 2024

@mrogowski @dalcinl I agree there is a "blind spot" in this section of the standard right now (v1.5), and I have a note to try to fix it for the upcoming new version (v1.6). I would prefer to add the following statement to the v1.6 standard (do you think it's sufficient?):

If \VAR{config} is a null pointer, then \VAR{config_mask} must be 0, otherwise
the behavior is undefined.

Another option is to simply prohibit a null config pointer, but my hunch is that's a bit more restrictive than what was intended for v1.5, but I'm open to it!

We also might want something like this for improved clarity:

If \VAR{config_mask} is 0, then `shmem_team_get_config` performs no operation.

So if config_mask is 0 and config is non-null, then that's perfectly fine. Given the state of OpenSHMEM v1.5, we opted to include an SOS warning in this special case, but I think we could also move it to "DEBUG" output or simply remove it altogether - I'd prefer to remove it myself, especially if the statements above are added to OpenSHMEM v1.6.

Any preferences from @wrrobin and @stewartl318?
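
The rule proposed in this comment can be sketched as a small check. This is only an illustration of the suggested v1.6 wording, not SOS code and not text from the standard: `demo_team_config_t`, `DEMO_TEAM_NUM_CONTEXTS`, and `demo_check_config` are hypothetical stand-ins, and reporting an error for a null `config` with a non-zero mask is an assumption made for the sketch (the proposed wording leaves that case undefined).

```c
#include <stddef.h>

/* Hypothetical stand-ins for shmem_team_config_t and SHMEM_TEAM_NUM_CONTEXTS. */
typedef struct { int num_contexts; } demo_team_config_t;
#define DEMO_TEAM_NUM_CONTEXTS (1L << 0)

enum demo_check {
    DEMO_OK_NOOP,          /* mask is 0: config is ignored, NULL or not */
    DEMO_OK_USE_CONFIG,    /* mask selects num_contexts and config is readable */
    DEMO_ERR_NULL_CONFIG,  /* mask is non-zero but config is NULL */
    DEMO_ERR_BAD_MASK      /* mask contains bits this sketch does not know */
};

/* Classify a (config, config_mask) pair under the wording proposed above. */
static enum demo_check demo_check_config(const demo_team_config_t *config,
                                         long config_mask)
{
    if (config_mask == 0)
        return DEMO_OK_NOOP;
    if (config_mask != DEMO_TEAM_NUM_CONTEXTS)
        return DEMO_ERR_BAD_MASK;
    if (config == NULL)
        return DEMO_ERR_NULL_CONFIG;  /* assumption: report instead of UB */
    return DEMO_OK_USE_CONFIG;
}
```

Under this reading, a non-null `config` with a mask of 0 is accepted silently, which matches the "perfectly fine" case described in the comment.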

Collaborator

I agree with your previous suggestions with a small addition for maximum clarity:

 If \VAR{config_mask} is 0, then `shmem_team_get_config` performs no action and \VAR{config} may be `NULL`.

> but I think we could also move it to "DEBUG" output or simply remove it altogether

I guess that would be good enough. Extra points if you guys ever allow for these warnings as an opt-in via some environment variable.

> Another option is to simply prohibit a null config pointer, but my hunch is that's a bit more restrictive than what was intended for v1.5, but I'm open to it!

Indeed, there is little point in such restriction. Moreover, it is kind of a backward incompatible change.

Collaborator

I think the internet robustness principle states it well: be conservative in what you do, be liberal in what you accept from others.

I think there is no real value in any sort of debug or error message when mask is 0 but config is non-null.

Member Author

@dalcinl @stewartl318 - Thanks for the input!

> Extra points if you guys ever allow for these warnings as an opt-in via some environment variable.

This sounds like what I meant by "DEBUG" message above. The DEBUG_MSG macro in SOS will only print to stderr if the SHMEM_DEBUG environment variable is set.

> I think there is no real value in any sort of debug or error message when mask is 0 but config is non-null.

I tend to agree... but maybe the debug message is a good compromise since the spec is pretty under-defined for this special case. Let's move the discussion to PR #1138, which proposes changing this to a debug message.
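
For readers unfamiliar with the opt-in idea discussed here, a minimal sketch of an environment-gated message follows. It is not the SOS `DEBUG_MSG` macro: `demo_debug_enabled` and `demo_note_ignored_config` are hypothetical helpers, and treating any non-empty `SHMEM_DEBUG` value as "set" is an assumption about how the variable might be interpreted.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical helper: 1 if SHMEM_DEBUG is set to a non-empty value.
 * How SOS actually parses the variable is not shown in this thread. */
static int demo_debug_enabled(void)
{
    const char *value = getenv("SHMEM_DEBUG");
    return value != NULL && value[0] != '\0';
}

/* Print a note to stderr only when debugging is enabled and the caller
 * passed a non-NULL config together with a config_mask of 0.
 * Returns 1 if a message was printed, 0 otherwise. */
static int demo_note_ignored_config(const void *config, long config_mask)
{
    if (config_mask == 0 && config != NULL && demo_debug_enabled()) {
        fprintf(stderr, "debug: non-NULL config ignored because config_mask is 0\n");
        return 1;
    }
    return 0;
}
```

With this shape the default run stays silent, and a user who wants the diagnostic sets the variable before launching the program.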

        shmem_team_config_t defaults;
        myteam->config_mask = 0;
        myteam->contexts_len = 0;
        defaults.num_contexts = 0;
        memcpy(&myteam->config, &defaults, sizeof(shmem_team_config_t));
    } else {
        if (config_mask != SHMEM_TEAM_NUM_CONTEXTS) {
            RAISE_WARN_MSG("Invalid team_split_strided config_mask (%ld)\n", config_mask);
            return -1;
        } else {
            shmem_internal_assertp(config->num_contexts >= 0);
            myteam->config = *config;
            myteam->config_mask = config_mask;
            myteam->contexts_len = config->num_contexts;
            myteam->contexts = malloc(config->num_contexts * sizeof(shmem_transport_ctx_t*));
            for (int i = 0; i < config->num_contexts; i++) {
                myteam->contexts[i] = NULL;
            }
        }
    }
    myteam->contexts_len = 0;

    myteam->psync_idx = -1;

    shmem_internal_op_to_all(psync_pool_avail_reduced,
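
The context-slot setup in the hunk above (record the requested count, allocate an array of pointers, set every slot to NULL) can be sketched in isolation as follows. This is a standalone illustration with hypothetical names (`demo_ctx_t`, `demo_team_t`, `demo_team_alloc_contexts`), not the SOS data structures; the allocation-failure check and the explicit handling of a zero count are additions made for the sketch and are not part of the patch.

```c
#include <stdlib.h>

/* Opaque hypothetical stand-in for shmem_transport_ctx_t. */
typedef struct demo_ctx demo_ctx_t;

/* Hypothetical, reduced team structure holding only the context slots. */
typedef struct {
    demo_ctx_t **contexts;
    size_t contexts_len;
} demo_team_t;

/* Reserve num_contexts empty slots on the team.
 * Returns 0 on success, -1 on a negative count or allocation failure. */
static int demo_team_alloc_contexts(demo_team_t *team, int num_contexts)
{
    team->contexts = NULL;
    team->contexts_len = 0;

    if (num_contexts < 0)
        return -1;
    if (num_contexts == 0)
        return 0;  /* nothing to allocate */

    team->contexts = malloc((size_t)num_contexts * sizeof *team->contexts);
    if (team->contexts == NULL)
        return -1;

    for (int i = 0; i < num_contexts; i++)
        team->contexts[i] = NULL;  /* slots start out unused */

    team->contexts_len = (size_t)num_contexts;
    return 0;
}
```

The caller owns the array and would release it with `free(team->contexts)` when the team is destroyed.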
3 changes: 3 additions & 0 deletions src/teams_c.c4
@@ -91,6 +91,9 @@ shmem_team_get_config(shmem_team_t team, long config_mask, shmem_team_config_t *
            return -1;
        }
        memcpy(config, &myteam->config, sizeof(shmem_team_config_t));
    } else if (config != NULL) {
        RAISE_WARN_MSG("%s %s\n", "shmem_team_get_config encountered an unexpected",
                       "non-NULL config structure passed with a config_mask of 0.");
    }

    return 0;