Update playbooks to download SLURM RPM files to /tmp always #268

Open
andriy-safe-ai opened this issue Mar 24, 2024 · 0 comments
andriy-safe-ai commented Mar 24, 2024

The playbook that downloads the SLURM RPM files hardcodes the download destination as /data/slurm_rpms.

```yaml
- hosts: all
  become: true
  vars:
    slurm_version: "23.02.1-1"
    slurm_all_packages:
      - "slurm-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-devel-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-contribs-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-perlapi-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-torque-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-openlava-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-slurmctld-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-slurmdbd-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-pam_slurm-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-libpmi-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
      - "slurm-slurmd-{{slurm_version}}.el{{ansible_distribution_major_version}}.x86_64.rpm"
  tasks:
    - name: Download slurm .rpms
      get_url:
        url: "https://objectstorage.eu-frankfurt-1.oraclecloud.com/p/VnkLhYXOSNVilVa9d24Riz1fz4Ul-KTXeK4HCKoyqv0ghW3gry3Xz8CZqloqphLw/n/hpc/b/source/o/slurm/{{ item }}"
        dest: "/data/slurm_rpms"
      with_items: "{{slurm_all_packages}}"
      delegate_to: 127.0.0.1
      run_once: true
    - name: manually install all of the .rpms together (fails separately)
      shell: yum install -y /data/slurm_rpms/{{slurm_all_packages[0]}} \
        /data/slurm_rpms/{{slurm_all_packages[1]}} \
        /data/slurm_rpms/{{slurm_all_packages[2]}} \
        /data/slurm_rpms/{{slurm_all_packages[3]}} \
        /data/slurm_rpms/{{slurm_all_packages[4]}} \
        /data/slurm_rpms/{{slurm_all_packages[5]}} \
        /data/slurm_rpms/{{slurm_all_packages[6]}} \
        /data/slurm_rpms/{{slurm_all_packages[7]}} \
        /data/slurm_rpms/{{slurm_all_packages[8]}} \
        /data/slurm_rpms/{{slurm_all_packages[9]}} \
        /data/slurm_rpms/{{slurm_all_packages[10]}}
      # Needed in case you wish to rerun this playbook otherwise it'll error.
      ignore_errors: true
```

This is a problem because the playbooks that read the SLURM RPM files look for them in a path that depends on how the cluster is configured. That path is derived from variables in /etc/ansible/hosts. For example, the slurm role takes a download_path variable as the path to the RPM files, and its value depends on whether create_fss or cluster_nfs is set to true. When neither is set, the slurm role looks in /tmp, which can never work because the download path is hardcoded to /data/slurm_rpms.

```yaml
- hosts: bastion,slurm_backup,compute,login
  gather_facts: true
  vars:
    destroy: false
    initial: true
    download_path: "{{ nfs_target_path if create_fss | bool else ( cluster_nfs_path if cluster_nfs|bool else '/tmp')  }}"
    enroot_top_path: "{{ nvme_path }}/enroot/"
  vars_files:
    - "/opt/oci-hpc/conf/queues.conf"
  tasks:
    - include_role:
        name: slurm
      when: slurm|default(true)|bool
```
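
To see which directory the slurm role will actually read from on a given cluster, the same expression can be evaluated in a throwaway play. This is only a sketch: the two NFS paths below are hypothetical placeholder values, and on a real cluster all four variables would come from /etc/ansible/hosts rather than being set inline.

```yaml
- hosts: localhost
  gather_facts: false
  vars:
    # Placeholder values for illustration only; real values come from the inventory.
    create_fss: false
    cluster_nfs: false
    nfs_target_path: "/mnt/fss"        # hypothetical path
    cluster_nfs_path: "/nfs/cluster"   # hypothetical path
    # Same expression as in the playbook above.
    download_path: "{{ nfs_target_path if create_fss | bool else (cluster_nfs_path if cluster_nfs | bool else '/tmp') }}"
  tasks:
    - name: Show where the slurm role would look for the RPMs
      debug:
        var: download_path
```

With both flags false the expression falls through to /tmp. Setting create_fss to true selects nfs_target_path, and setting only cluster_nfs to true selects cluster_nfs_path. None of these three results is /data/slurm_rpms.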

One way to solve this would be to set the default download path for the SLURM RPMs to /tmp and have all playbooks look for the RPMs in /tmp.
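
A rough sketch of what that change might look like in the download playbook, not the final patch. The variable name slurm_rpm_dir is hypothetical (it does not exist in the current playbooks), and the package list is shortened here for readability; the real list would stay as it is today.

```yaml
- hosts: all
  become: true
  vars:
    slurm_version: "23.02.1-1"
    # Hypothetical variable: a single place to set the RPM directory.
    slurm_rpm_dir: "/tmp"
    slurm_all_packages:
      - "slurm-{{ slurm_version }}.el{{ ansible_distribution_major_version }}.x86_64.rpm"
      - "slurm-devel-{{ slurm_version }}.el{{ ansible_distribution_major_version }}.x86_64.rpm"
      # remaining packages unchanged from the current playbook
  tasks:
    - name: Download slurm .rpms
      get_url:
        url: "https://objectstorage.eu-frankfurt-1.oraclecloud.com/p/VnkLhYXOSNVilVa9d24Riz1fz4Ul-KTXeK4HCKoyqv0ghW3gry3Xz8CZqloqphLw/n/hpc/b/source/o/slurm/{{ item }}"
        dest: "{{ slurm_rpm_dir }}"
      with_items: "{{ slurm_all_packages }}"
      delegate_to: 127.0.0.1
      run_once: true
    - name: manually install all of the .rpms together (fails separately)
      # Prefix every package name with the directory and install them in one yum call.
      shell: "yum install -y {{ slurm_all_packages | map('regex_replace', '^', slurm_rpm_dir ~ '/') | join(' ') }}"
      # Needed in case you wish to rerun this playbook otherwise it'll error.
      ignore_errors: true
```

The playbook that includes the slurm role would then need download_path to resolve to the same directory, for example by setting it to "/tmp" instead of deriving it from create_fss and cluster_nfs. Building the install command from the list also removes the need to index each of the eleven packages by hand.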

andriy-safe-ai self-assigned this Mar 24, 2024
andriy-safe-ai linked a pull request Mar 25, 2024 that will close this issue