Let jobs retweak easyconfigs themselves by passing down --try-* options #4669

Merged
merged 8 commits into from
Feb 22, 2025


bartoldeman
Contributor

This can be accomplished by having `tweak()` optionally also return a dict which maps each tweaked easyconfig to its original version. The job can then run `eb ... <original_easyconfig.eb> --try-*`, and that original easyconfig will be retweaked in the job itself.

If the easyconfig passed to the job is not tweaked, then `--try-*` is *not* passed down (so, with `--robot`, some jobs will have `--try-*` and some won't).

This removes the requirement of a shared tmpdir with `--job --try-*`.

Fixes #1355
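
To illustrate the idea described above, here is a minimal sketch of how a per-job command line could be chosen from such a mapping. It is not the code from this PR: the function name `job_command`, the shape of the mapping, and the example paths are assumptions made purely for illustration.

```python
def job_command(ec_path, tweaked_to_original, try_opts, base_opts=()):
    """Build an illustrative `eb` command line for a single job.

    tweaked_to_original: dict mapping the path of a tweaked easyconfig to the
    path of the original it was derived from (hypothetical structure).
    try_opts: the --try-* options given on the submitting command line.
    """
    original = tweaked_to_original.get(ec_path)
    if original is None:
        # Easyconfig was not tweaked: build it as is, without --try-* options.
        return ["eb", ec_path, *base_opts]
    # Easyconfig was tweaked: pass the original plus --try-* options,
    # so the job retweaks it itself instead of reading a shared tmpdir.
    return ["eb", original, *base_opts, *try_opts]


# Hypothetical paths, for illustration only.
mapping = {"/tmp/eb-tweaked/HPL-2.3-foss-2024a.eb": "HPL-2.3-foss-2023a.eb"}
try_opts = ["--try-toolchain=foss,2024a"]

tweaked_cmd = job_command("/tmp/eb-tweaked/HPL-2.3-foss-2024a.eb", mapping, try_opts)
untweaked_cmd = job_command("some-dependency.eb", mapping, try_opts)
print(tweaked_cmd)
print(untweaked_cmd)
```

With `--robot`, a dependency that was not tweaked falls into the first branch, which is why only some jobs end up carrying `--try-*`.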

@bartoldeman bartoldeman marked this pull request as draft October 4, 2024 16:40
@bartoldeman bartoldeman added this to the 5.0 milestone Oct 4, 2024
@bartoldeman
Contributor Author

Putting to draft because this needs a proper test combining --job --robot --try-*.

@boegel
Member

boegel commented Oct 9, 2024

@bartoldeman Seems like test was implemented, so this shouldn't be a draft PR anymore?

@bartoldeman bartoldeman marked this pull request as ready for review October 24, 2024 15:11
@bartoldeman
Contributor Author

Just wanted to test locally using `eb HPL-2.3-foss-2023a.eb --try-toolchain=foss,2024a --job`. This works fine.

Note that a shared temporary directory is still needed with `--job --from-pr`. You can, however, use a shared temporary directory in the submitting EasyBuild session and NOT on the worker node by setting `$TMPDIR` instead of using `--tmpdir` or (equivalently) `$EASYBUILD_TMPDIR`, as `$TMPDIR` is not passed down to the job. A shared TMPDIR on a worker node can cause quite a performance degradation, since even GCC's temporary `.s` (asm) files will be stored there, causing a lot of expensive networked IOPS.
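
As a rough sketch of the `$TMPDIR` approach described above, a submission wrapper could prepare the environment as follows. The helper names `submit_side_env` and `submit_from_pr`, the shared path, and the choice to drop `$EASYBUILD_TMPDIR` are assumptions for illustration, not part of EasyBuild.

```python
import os
import subprocess


def submit_side_env(shared_tmpdir, environ=None):
    """Return an environment for the submitting `eb` process only.

    Sets $TMPDIR to a shared location and drops $EASYBUILD_TMPDIR, so that
    (per the comment above) the tmpdir setting is not passed down to jobs.
    """
    env = dict(os.environ if environ is None else environ)
    env["TMPDIR"] = shared_tmpdir
    env.pop("EASYBUILD_TMPDIR", None)
    return env


def submit_from_pr(pr_number, shared_tmpdir):
    """Submit jobs for a PR; defined here but not called in this sketch."""
    cmd = ["eb", "--from-pr", str(pr_number), "--job"]
    subprocess.run(cmd, env=submit_side_env(shared_tmpdir), check=True)


# Example with an explicit (hypothetical) starting environment.
example_env = submit_side_env(
    "/shared/tmp", {"HOME": "/home/user", "EASYBUILD_TMPDIR": "/shared/eb-tmp"}
)
print(example_env)
```

The worker nodes then fall back to their own default temporary directory, which is the point of not using `--tmpdir`.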

@bartoldeman
Contributor Author

> @bartoldeman Seems like test was implemented, so this shouldn't be a draft PR anymore?

@boegel undrafted now

@boegel boegel requested a review from Micket January 8, 2025 15:50
@Micket
Contributor

Micket commented Feb 9, 2025

Sorry for my delay on this, I was really busy with the deployment of a new cluster, and then I was really sick for the past week. Feel free to steal this from me; I only intended to test it out on the cluster to see that it worked as intended.

@boegel boegel changed the title Let jobs retweak easyconfigs themselves Let jobs retweak easyconfigs themselves by passing down --try-* options Feb 22, 2025
Member

@boegel boegel left a comment

reviewed & tested, lgtm

@boegel boegel merged commit e9fcef8 into easybuilders:5.0.x Feb 22, 2025
39 checks passed