improve slurm complete/error email to describe where the pipeline output is #148

Open
kelly-sovacool opened this issue Aug 5, 2024 · 4 comments
Labels
enhancement New feature or request RENEE RepoName

Comments

@kelly-sovacool
Member

can we customize the email body so users know which pipeline run failed?

also would be cool to run grep -v completed on the jobby log and include that in the email
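
A minimal sketch of the grep -v completed idea, assuming the jobby log is a plain-text file with one job per line and the word "completed" on jobs that finished. The function name `non_completed_jobs` and the log layout are assumptions for illustration, not jobby's documented format; the match is made case-insensitive in case the state is written as COMPLETED.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every log line that does NOT mention "completed".
# Assumes one job per line; the real jobby log layout may differ.
non_completed_jobs() {
    local log_file="$1"
    # grep exits 1 when no line is selected, which is not an error here;
    # any other non-zero status (e.g. missing file) is still reported.
    grep -v -i -- 'completed' "$log_file" || [ $? -eq 1 ]
}
```

The output could then be appended to whatever text is piped into the mail command, so the email lists only the jobs that need attention.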

@kopardev kopardev added the RENEE RepoName label Aug 5, 2024
@kopardev kopardev added the enhancement New feature or request label Nov 4, 2024
@kopardev
Member

kopardev commented Nov 4, 2024

something like this

#SBATCH --mail-type=END,FAIL      # Notifications for job completion and failure
#SBATCH --mail-user=[email protected]

# Custom job body message function
send_custom_email() {
    echo -e "Job Name: $SLURM_JOB_NAME\nJob ID: $SLURM_JOB_ID\nJob Status: $1\nNode List: $SLURM_JOB_NODELIST\nTime Sent: $(date)" \
    | mail -s "SLURM Job ${SLURM_JOB_ID} - ${1}" [email protected]
}

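# Run the pipeline command on the line directly above this one; otherwise $?
# reflects the function definition and is always 0.
# Capture the job exit status
EXIT_STATUS=$?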

# Send a custom email based on job success or failure
if [ $EXIT_STATUS -eq 0 ]; then
    send_custom_email "COMPLETED"
else
    send_custom_email "FAILED"
fi
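
To cover the part of the request about saying where the pipeline output is, the body could be built by a separate function that takes the output directory as an argument. This is a sketch under assumptions: `build_email_body` is a hypothetical name, and falling back to `SLURM_SUBMIT_DIR` (then the current directory) is only a guess at a sensible default, not how RENEE locates its output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compose an email body that says which run ended
# and where its output lives. Nothing is sent here; pipe the result to mail.
build_email_body() {
    local status="$1"
    # Assumed default: the directory the job was submitted from.
    local outdir="${2:-${SLURM_SUBMIT_DIR:-$PWD}}"
    printf 'Job Name: %s\n' "${SLURM_JOB_NAME:-unknown}"
    printf 'Job ID: %s\n' "${SLURM_JOB_ID:-unknown}"
    printf 'Job Status: %s\n' "$status"
    printf 'Output Directory: %s\n' "$outdir"
}
```

Keeping the body separate from the sending step means the text can be checked on a login node without mailing anything.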

@kelly-sovacool
Member Author

solution from stack overflow: https://stackoverflow.com/a/73293621/5787827

@kelly-sovacool
Member Author

contact biowulf admins in case they have suggestions

@kelly-sovacool
Member Author

response from Wolfgang Resch:

The email format sent by slurm is a cluster-wide configuration and we
use the default. So unfortunately you cannot customize that email
yourself. You could modify your shell script to send email directly.
Something like


#!/bin/bash
#SBATCH --mail-type=BEGIN


### IMPORTANT
# need to use -S sendwait for mail(x). otherwise mail(x) will do
# a double fork and return before mail is sent. then, if the batch
# script exits too soon, the background sendmail process will get reaped
# before the mail can get sent.

secs_to_human(){
    printf "%02i:%02i:%02i" $(( ${1} / 3600 )) $(( (${1} / 60) % 60 )) $(( ${1} % 60 ))
}

send_mail() {
    local _status _rt _subj
    _status="${1:-OK}"
    _rt="$(secs_to_human $(($(date +%s) - ${start})))"
    _subj="Slurm Job_id=${SLURM_JOB_ID} Name=${SLURM_JOB_NAME} ${_status}, runtime ${_rt}"
    cat <<__EOF__ | mail -S sendwait -s "${_subj}" "${USER}@hpc.nih.gov"
my wonderful $SLURM_JOB_NAME job ${_status}. I can put all kinds of stuff
into the body of this email. It's awesome. Hip hip hooray. Or maybe I'll just
put in the logfile. Lorem ipsum etc.
__EOF__
}

start=$(date +%s)

if (
    set -e -o pipefail
    ### put your task here or call your actual batch script
    sleep 1m
)
then
    send_mail finished
    exit
fi
send_mail failed
exit 1


Depending on what you want to achieve you might want to trap some
signals in the parent script and / or subshell. You could make that
into a wrapper so that you wouldn't have to re-write your script.
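
The wrapper idea might look roughly like the sketch below. Everything named here is an assumption for illustration: `run_with_notify` and the `NOTIFY` hook are hypothetical, the default hook only prints, and which signals actually reach the batch shell depends on how the job is cancelled and on cluster settings.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run any command, then report how it ended.
# NOTIFY is an assumed hook; the default only prints. Point it at a
# function that calls mail to get real notifications.
NOTIFY="${NOTIFY:-echo}"

run_with_notify() {
    local rc=0 pid=""
    # On SIGTERM: stop the child, report, and exit with 128+15.
    trap 'kill -TERM "$pid" 2>/dev/null; "$NOTIFY" "terminated"; exit 143' TERM
    # Run in the background and wait, so the trap fires promptly
    # instead of only after the command has finished.
    "$@" &
    pid=$!
    wait "$pid" || rc=$?
    trap - TERM
    if [ "$rc" -eq 0 ]; then
        "$NOTIFY" "finished"
    else
        "$NOTIFY" "failed (exit $rc)"
    fi
    return "$rc"
}
```

An existing batch script would then be called as `run_with_notify ./my_pipeline.sh` (script name hypothetical) without being rewritten. Note that the command runs in the background, so it should not rely on reading standard input.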
