Description
There are a number of new (and seasoned) contributors who encounter failures in CI that cannot be reproduced on their local machine. The causes are numerous (software stack versions, MPI implementations, number of threads, data settings, compilation settings, etc.).
While this problem is not unique to OpenMC, it can make tracking these issues down painful all the same. This PR adds the `tmate` action to our CI. This action provides an `ssh` command to the CI machine, where issues with installation and testing can be reproduced and debugged in the same environment.
One downside is that failure notifications won't come in as quickly because `tmate` keeps the runner active to give the user a chance to log into the machine. This also means higher GHA usage per job on failures, but IMO we'd see less usage overall by avoiding commits to PRs that are guessing at how to fix an issue. We can also set the `tmate` session timeout to something reasonable like 15 minutes (the default is 45 I believe).
There are other strategies for enabling `tmate` on a job as well, such as only generating a connection if `workflow_dispatch` is `true` or running in detached mode, where a connection is created by default and remains open at the end of the action.
Alternatives
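For reference, the failure-triggered setup described above might look roughly like the sketch below. It assumes the action in question is `mxschmitt/action-tmate`; the workflow name, job name, and `./run_tests.sh` script are placeholders and not taken from the actual OpenMC workflow.

```yaml
# Hypothetical workflow excerpt; names and the test script are illustrative only.
name: CI
on:
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install and test
        run: ./run_tests.sh  # placeholder for the project's real install/test steps

      # Only start a tmate session when an earlier step in this job has failed.
      # The ssh command to connect is printed in the step's log output.
      - name: Debug with tmate on failure
        if: ${{ failure() }}
        uses: mxschmitt/action-tmate@v3  # assumed action; pin to whatever the PR uses
        timeout-minutes: 15              # cap how long the runner is held open
```

For the alternative strategies, the `if:` condition could instead check `github.event_name == 'workflow_dispatch'` together with a manually supplied input, and detached mode is (as far as I know) enabled with `with: detached: true` on the same step.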
Compatibility
N/A