Move deployment tests to CI-run and re-enable integration tests #151

Open
wants to merge 44 commits into
base: main
Changes from 34 commits
7271d15
Use cirun for testing
aktech Feb 27, 2024
819c219
trigger run for cirun runner
aktech Feb 27, 2024
01fb47b
Re-enable deployment run
viniciusdc Mar 4, 2024
24b9efd
restrict trigger criteria for linting dirs
viniciusdc Mar 4, 2024
3e97e35
include python setup
viniciusdc Mar 4, 2024
50b998d
add concurrency group and fix pip calls
viniciusdc Mar 4, 2024
cf1604f
disable concurrent group and restrict trigger dirs
viniciusdc Mar 4, 2024
d897096
add install vagrant step
viniciusdc Mar 4, 2024
8ee1e44
tmp replace ci runner for github hosted for fast test response
viniciusdc Mar 4, 2024
b6792b1
libvirt installation and validation
viniciusdc Mar 4, 2024
4b1dc81
libvirt installation and validation
viniciusdc Mar 4, 2024
cd313ed
libvirt installation and validation
viniciusdc Mar 4, 2024
e068359
include libvirt plugin setup
viniciusdc Mar 4, 2024
338e059
fix bug with deps
viniciusdc Mar 4, 2024
cbdc942
re-enable cirun runner
viniciusdc Mar 4, 2024
7061288
Update kvm-test.yaml
viniciusdc Mar 5, 2024
6f8df3e
reorganize GitHub Actions workflow for Vagrant (KVM) tests
viniciusdc Mar 6, 2024
3d757bf
fix: rename host_vars file from hpc01-test.yaml to localhost.yaml for…
viniciusdc Mar 6, 2024
fe8836e
Refactor local testing
viniciusdc Mar 7, 2024
dcabe9b
update CI workflow to run on ubuntu-latest and improve network info e…
viniciusdc Mar 7, 2024
3b81ba5
fix: update script to use 'ip address show' for broader compatibility…
viniciusdc Mar 7, 2024
d6b802a
fix: 'become: true' was mistakenly removed
viniciusdc Mar 7, 2024
7921c6b
chore: fix identation and update timeout value for nfs wait_for task …
viniciusdc Mar 7, 2024
104cbb7
chore: remove redundant 'cmd' attribute and update notification messa…
viniciusdc Mar 7, 2024
534eccc
fix: incorrect placements of timeouts
viniciusdc Mar 7, 2024
8d5c6d3
chore: re-enable ci-run for KVM Test job
viniciusdc Mar 7, 2024
4ad5007
fix: update CI job to address /var/lib/dpkg/lock-frontend issue and k…
viniciusdc Mar 7, 2024
cb28bc0
chore: simplify killing process holding lock by using 'killall' command
viniciusdc Mar 7, 2024
d1fe0df
chore: update file permissions and paths for Slurm configuration file…
viniciusdc Mar 7, 2024
01d4a23
chore: re-enable CI to run on ubuntu-latest
viniciusdc Mar 7, 2024
e94366c
test new config
viniciusdc Mar 13, 2024
4c640d7
ensure systemd task is executed by root
viniciusdc Mar 13, 2024
058b519
update group_vars LDAP server url
viniciusdc Mar 15, 2024
2dd4317
increase MySQL packages install time
viniciusdc Mar 15, 2024
4581309
Increase instance size
viniciusdc Mar 21, 2024
de458a8
Split partial/full dpeloyment ansible runs
viniciusdc Mar 28, 2024
acaf5a6
Update AMI us-east-1-Jellyfish-22.04-amd64
viniciusdc Mar 28, 2024
f9e3cb4
Ubuntu Server 22.04 LTS (HVM) x86
viniciusdc Mar 28, 2024
afcd1b5
Update AMI instance Ubuntu Server 20.04
viniciusdc Apr 15, 2024
91eebfc
double timeout for postgresql packages install
viniciusdc Apr 15, 2024
d9dda48
move long demanding installs to pre-install
viniciusdc Apr 15, 2024
93b8d1a
addess postgresql install
viniciusdc Apr 15, 2024
660f24f
add debug session
viniciusdc Apr 16, 2024
36b0e8e
test disk usage
viniciusdc Apr 16, 2024
15 changes: 15 additions & 0 deletions .cirun.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
runners:
  - name: cirun-aws-runner
    # Cloud Provider: AWS
    cloud: aws
    # Instance type has 8 vCPUs, 32 GiB memory, up to 5 Gbps network performance
    instance_type: t3a.2xlarge
You can try t3.2xlarge if the number of cores is the bottleneck

viniciusdc marked this conversation as resolved.
    machine_image: ami-0a388df278199ff52
    # Region: Oregon
    region: us-west-2
    # Use Spot Instances for cost savings
    preemptible:
      - true
      - false
    labels:
      - cirun-runner

Check failure on line 15 in .cirun.yml

View workflow job for this annotation

GitHub Actions / Ansible Lint

yaml[new-line-at-end-of-file]

No new line character at the end of file

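
For reference, one way to clear the `yaml[new-line-at-end-of-file]` failure locally is to append a newline only when the last byte of the file is not one already. This is a minimal sketch and not part of the PR: `ensure_trailing_newline` and the throwaway demo file are hypothetical names used for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): append a newline to a file
# only when its last byte is not already a newline.
ensure_trailing_newline() {
    local file=$1
    # Command substitution strips a trailing newline, so the result is
    # non-empty only when the last byte is something else.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
}

# Demo on a throwaway file that mimics the failing case (no final newline)
demo_file=$(mktemp)
printf 'labels:\n  - cirun-runner' > "$demo_file"
ensure_trailing_newline "$demo_file"
```

The helper is idempotent, so running it a second time on the same file leaves the file unchanged.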
17 changes: 17 additions & 0 deletions .github/scripts/extract_network_info.sh
@@ -0,0 +1,17 @@
#!/bin/bash

IFS=$'\n' # Split output into lines based on newline

# List IPv4 addresses in brief format (one interface per line) and keep the ones that are UP.
# The awk field positions below depend on the brief (`-br`) layout.
readarray -t lines <<< "$(ip -br -4 address show | grep UP)"
for line in "${lines[@]}"; do
    # Pick the first Ethernet-style interface (eth*, ens*, enp*)
    if [[ $line =~ (eth[0-9]|ens[0-9]+|enp[0-9].*) ]]; then
        INTERFACE=$(echo "$line" | awk '{print $1}')
        IP_RANGE=$(echo "$line" | awk '{print $3}')
        break
    fi
done

# Write variables into network_info.txt
echo "Interface: $INTERFACE" > network_info.txt
echo "IP Range: $IP_RANGE" >> network_info.txt
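
To make the parsing easier to review, here is a small sketch that runs the same idea against canned input instead of the live `ip` command. The sample output, the interface names, and the `first_up_interface` function are assumptions for illustration. The regex is also slightly stricter than the one in the script: it is anchored to the first column and checks the state column directly.

```shell
#!/bin/bash
# Hypothetical sample of `ip -br -4 address show` output (not taken from the PR)
sample_output='lo               UNKNOWN        127.0.0.1/8
ens5             UP             10.0.12.34/20
docker0          DOWN           172.17.0.1/16'

# Print "<interface> <address/prefix>" for the first UP Ethernet-style interface
# read from stdin; return 1 when no line matches.
first_up_interface() {
    local line
    while IFS= read -r line; do
        if [[ $line =~ ^(eth[0-9]+|ens[0-9]+|enp[0-9][^[:space:]]*)[[:space:]]+UP[[:space:]] ]]; then
            awk '{print $1, $3}' <<< "$line"
            return 0
        fi
    done
    return 1
}

first_up_interface <<< "$sample_output"
```

Reading from stdin keeps the function testable without a network stack; in CI it could be fed with `ip -br -4 address show`.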
29 changes: 29 additions & 0 deletions .github/scripts/gen_inventory.sh
@@ -0,0 +1,29 @@
#!/bin/bash

# Check if the correct number of arguments was provided
if [ $# -ne 2 ]; then
    echo "Usage: $0 <hostname> <output_path>"
    exit 1
fi

# Get the hostname from the first argument
HOSTNAME=$1

# Get the output path from the second argument
OUTPUT_PATH=$2

# Ensure the directory exists (quoted so paths with spaces survive)
mkdir -p "$(dirname "$OUTPUT_PATH")"

# Create the inventory.ini file at the specified output path with dynamic content
cat <<EOF > "$OUTPUT_PATH"
${HOSTNAME} connection=local ansible_ssh_host=127.0.0.1

[hpc_master]
${HOSTNAME}

[hpc_worker]
${HOSTNAME}
EOF

echo "inventory.ini file has been created at $OUTPUT_PATH."
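
As a quick illustration of what the script produces, the sketch below mirrors it as a function and writes an inventory for a made-up host. `gen_inventory`, `demo-node`, and the temporary directory are hypothetical; the real workflow passes `$(hostname -s)` and `deploy/inventory.ini`.

```shell
#!/bin/bash
# Sketch that mirrors gen_inventory.sh as a function, for illustration only
gen_inventory() {
    local host=$1 output_path=$2
    mkdir -p "$(dirname "$output_path")"
    cat <<EOF > "$output_path"
${host} connection=local ansible_ssh_host=127.0.0.1

[hpc_master]
${host}

[hpc_worker]
${host}
EOF
}

# "demo-node" and the temporary directory are hypothetical values
demo_dir=$(mktemp -d)
gen_inventory demo-node "$demo_dir/deploy/inventory.ini"
cat "$demo_dir/deploy/inventory.ini"
```

The same host appears in both `hpc_master` and `hpc_worker`, which is what lets a single CI machine play both roles.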
151 changes: 111 additions & 40 deletions .github/workflows/kvm-test.yaml
@@ -1,41 +1,112 @@
---
name: Vagrant (KVM) Tests

on:
  pull_request:
  push:
    branches:
      - main

jobs:
  # https://github.com/jonashackt/vagrant-github-actions
  test-kvm:
    name: KVM Test
    runs-on: macos-latest
    steps:
      - uses: actions/checkout@v2

      - name: Cache Vagrant boxes
        uses: actions/cache@v2
        with:
          path: ~/.vagrant.d/boxes
          key: ${{ runner.os }}-vagrant-${{ hashFiles('Vagrantfile') }}
          restore-keys: |
            ${{ runner.os }}-vagrant-

      - name: Install test dependencies.
        run: sudo pip3 install ansible

      - name: Install Ansible Dependencies
        working-directory: tests/ubuntu2004-singlenode
        run: |
          ansible-galaxy collection install community.general
          ansible-galaxy collection install ansible.posix

      - name: Show Vagrant version
        run: vagrant --version

      # Disabled until we fix it
      # - name: Run vagrant up
      #   working-directory: tests/ubuntu2004-singlenode
      #   run: vagrant up
name: Vagrant (KVM) Tests

Check failure on line 2 in .github/workflows/kvm-test.yaml

View workflow job for this annotation

GitHub Actions / Ansible Lint

yaml[indentation]

Wrong indentation: expected 0 but found 2


on:
  pull_request:
  push:
    branches:
      - main

jobs:
  test-kvm:
    name: KVM Test
    # Disable ci-run until the /var/lib/dpkg/lock-frontend issue is addressed
    runs-on: "cirun-runner--${{ github.run_id }}"
    # runs-on: "ubuntu-latest"
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
          cache: "pip"

      - name: Install dependencies
        run: |
          pip install ansible

      - name: Install Ansible dependencies
        run: |
          ansible-galaxy collection install -r requirements.yaml

      - name: Create deploy folder and move inventory files
        run: |
          mkdir deploy
          cp -r inventory.template/* deploy/

          chmod +x .github/scripts/gen_inventory.sh
          ./.github/scripts/gen_inventory.sh "$(hostname -s)" deploy/inventory.ini

      - name: Check network adapter
        run: |
          ip a

      - name: Check hosts
        run: |
          cat /etc/hosts

      - name: Extract Network Information
        run: |
          chmod +x .github/scripts/extract_network_info.sh
          ./.github/scripts/extract_network_info.sh
          echo "adapter_name=$(cat network_info.txt | head -1 | awk '{print $2}')" >> "$GITHUB_ENV"
          echo "ip_range=$(cat network_info.txt | awk 'NR > 1 && $3 {print $3}')" >> "$GITHUB_ENV"

      - name: Update group vars
        run: |
          cp deploy/group_vars/all.yaml deploy/group_vars/all.yaml.bak

          echo "Updating group vars for firewall and internal network"
          echo "firewall_internal_ip_range: $ip_range" >> deploy/group_vars/all.yaml
          echo "internal_interface: $adapter_name" >> deploy/group_vars/all.yaml
          echo "SlurmConfigFileDIr: /etc/slurm" >> deploy/group_vars/all.yaml

          echo "Replace hpc01-test with $(hostname -s) in group_vars/hpc_worker.yaml file"
          sed -i "s/hpc01-test/$(hostname -s)/g" deploy/group_vars/hpc_worker.yaml

          echo "Replace LDAP server URI with $(hostname -s) in group_vars/all.yaml file"
          sed -i "s|ldap://hpc01-test:389|ldap://$(hostname -s):389|g" deploy/group_vars/all.yaml

          diff deploy/group_vars/all.yaml.bak deploy/group_vars/all.yaml || true

      - name: Disable unattended-upgrades
        run: |
          # Ensure all commands are non-interactive by setting DEBIAN_FRONTEND to noninteractive
          export DEBIAN_FRONTEND=noninteractive

          # Check if unattended-upgrades service is active and stop it if it is
          if systemctl is-active --quiet unattended-upgrades; then
            echo "Stopping unattended-upgrades service..."
            sudo systemctl stop unattended-upgrades
          else
            echo "unattended-upgrades service is not active. Skipping stop command."
          fi

          # Kill any process still holding the dpkg frontend lock, without manual confirmation
          echo "Checking and killing running APT processes if necessary..."
          sudo lsof /var/lib/dpkg/lock-frontend | awk '{print $2}' | tail -n +2 | while read -r PID; do
            if [ -n "$PID" ]; then
              echo "Killing PID $PID"
              sudo kill -9 "$PID"
            fi
          done

          # Configure any packages that are in an unclean state non-interactively
          echo "Configuring any packages in an unclean state..."
          sudo dpkg --configure -a

          # Remove unattended-upgrades to avoid automatic background updates during script execution
          echo "Disabling unattended upgrades..."
          sudo apt-get remove --purge unattended-upgrades -y || true

      # - name: Run ssh session
      #   uses: mxschmitt/action-tmate@v3
      #   with:
      #     detached: true

      - name: Run ansible playbook
        run: |
          cd deploy
          ansible-playbook ../playbook.yaml -i inventory.ini --connection=local -v
        env:
          ANSIBLE_FORCE_COLOR: True
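
The "Extract Network Information" and "Update group vars" steps can be dry-run outside CI. The sketch below does that on throwaway files, assuming the two-line `network_info.txt` layout written by `extract_network_info.sh`. The file contents, the `ldap_uri` key, and `demo-node` are hypothetical stand-ins and are not taken from the repository.

```shell
#!/bin/bash
# All file contents below are hypothetical stand-ins for the real CI files
work_dir=$(mktemp -d)
printf 'Interface: ens5\nIP Range: 10.0.12.34/20\n' > "$work_dir/network_info.txt"
printf 'ldap_uri: ldap://hpc01-test:389\n' > "$work_dir/all.yaml"

# Same field positions the workflow relies on:
# line 1 is "Interface: <name>", line 2 is "IP Range: <cidr>"
adapter_name=$(awk 'NR == 1 {print $2}' "$work_dir/network_info.txt")
ip_range=$(awk 'NR > 1 && $3 {print $3}' "$work_dir/network_info.txt")

# "demo-node" stands in for $(hostname -s)
node_name=demo-node
{
    echo "firewall_internal_ip_range: $ip_range"
    echo "internal_interface: $adapter_name"
} >> "$work_dir/all.yaml"

# -i.bak keeps a backup copy and works with both GNU and BSD sed
sed -i.bak "s|ldap://hpc01-test:389|ldap://${node_name}:389|g" "$work_dir/all.yaml"

cat "$work_dir/all.yaml"
```

Using `|` as the sed delimiter avoids escaping the slashes in the LDAP URI, as the workflow step already does.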
10 changes: 10 additions & 0 deletions .github/workflows/lint.yaml
@@ -3,7 +3,17 @@ name: Ansible Lint

on:
  push:
    paths:
      - 'roles/**'
      - 'tasks/**'
      - '.github/workflows/lint.yaml'

  pull_request:
    paths:
      - 'roles/**'
      - 'tasks/**'
      - '.github/workflows/lint.yaml'

jobs:
  build:
    name: Ansible Lint
1 change: 1 addition & 0 deletions roles/apt_packages/tasks/main.yml
@@ -1,6 +1,7 @@
---
- name: Ensure apt packages are installed
  become: true
  timeout: 300
  ansible.builtin.apt:
    name: "{{ installed_packages }}"
    state: latest
1 change: 1 addition & 0 deletions roles/backups/tasks/backup.yaml
@@ -1,6 +1,7 @@
---
- name: Ensure restic installed
  become: true
  timeout: 300
  ansible.builtin.apt:
    name: restic
    state: latest
1 change: 0 additions & 1 deletion roles/cifs/handlers/main.yaml
@@ -5,6 +5,5 @@
name: "{{ item }}"
enabled: "yes"
state: restarted
cmd: ""
with_items:
- smbd
1 change: 1 addition & 0 deletions roles/cifs/tasks/client.yaml
@@ -1,6 +1,7 @@
---
- name: Install cifs
  become: true
  timeout: 300
  ansible.builtin.apt:
    state: latest
    cache_valid_time: 3600
3 changes: 2 additions & 1 deletion roles/cifs/tasks/server.yaml
@@ -1,6 +1,7 @@
---
- name: Install samba
  become: true
  timeout: 300
  ansible.builtin.apt:
    state: latest
    cache_valid_time: 3600
@@ -22,4 +23,4 @@
    owner: root
    group: root
    mode: "0644"
  notify: restart services samba
  notify: Restart services samba
1 change: 0 additions & 1 deletion roles/dask_gateway/handlers/main.yaml
@@ -5,6 +5,5 @@
name: "{{ item }}"
enabled: "yes"
state: restarted
cmd: ""
with_items:
- dask-gateway
4 changes: 2 additions & 2 deletions roles/dask_gateway/tasks/dask_gateway.yaml
@@ -49,7 +49,7 @@
    owner: dask
    group: dask
    mode: "0644"
  notify: restart services dask-gateway
  notify: Restart services dask-gateway

- name: Copy the dask-gateway systemd service file
  become: true
@@ -77,7 +77,7 @@
    owner: root
    group: root
    mode: "0644"
  notify: restart services dask-gateway
  notify: Restart services dask-gateway

- name: Ensure dask-gateway is enabled on boot
  become: true
1 change: 1 addition & 0 deletions roles/grafana/tasks/grafana.yaml
@@ -12,6 +12,7 @@

- name: Install grafana
  become: true
  timeout: 300
  ansible.builtin.apt:
    name: grafana{{ grafana_version }}
    state: "{% if grafana_version %}present{% else %}latest{% endif %}"
3 changes: 0 additions & 3 deletions roles/jupyterhub/handlers/main.yaml
@@ -5,7 +5,6 @@
name: "{{ item }}"
enabled: "yes"
state: restarted
cmd: ""
with_items:
- jupyterhub

@@ -15,7 +14,6 @@
name: "{{ item }}"
enabled: "yes"
state: restarted
cmd: ""
with_items:
- jupyterhub-proxy

@@ -25,6 +23,5 @@
name: "{{ item }}"
enabled: "yes"
state: restarted
cmd: ""
with_items:
- jupyterhub-ssh
4 changes: 3 additions & 1 deletion roles/keycloak/tasks/keycloak.yaml
@@ -1,6 +1,7 @@
---
- name: Install openjdk and python requirements
  become: true
  timeout: 300
  ansible.builtin.apt:
    state: latest
    cache_valid_time: 3600
@@ -64,7 +65,8 @@

- name: Ensure Keycloak admin user exists
  become: true
  ansible.builtin.command: /opt/keycloak-{{ keycloak_version }}/bin/add-user-keycloak.sh -r master -u "{{ keycloak_admin_username }}" -p "{{ keycloak_admin_password
  ansible.builtin.command:
    /opt/keycloak-{{ keycloak_version }}/bin/add-user-keycloak.sh -r master -u "{{ keycloak_admin_username }}" -p "{{ keycloak_admin_password
    }}"
  args:
    creates: /opt/keycloak-{{ keycloak_version }}/standalone/configuration/keycloak-add-user.json
9 changes: 9 additions & 0 deletions roles/mysql/defaults/main.yml
@@ -1,5 +1,7 @@
---
mysql_enabled: false
mysql_config_file: /etc/mysql/my.cnf

mysql_databases:
  - slurm
  - conda-store
@@ -14,3 +16,10 @@ mysql_users:
  - username: conda-store
    password: eIbmUditL4RbQm0YPeLozRme
    privileges: "*.*:ALL"

# Define a custom list of packages to install
mysql_packages:
  - mysql-server
  - mysql-common

mysql_python_package: python3-mysqldb
1 change: 0 additions & 1 deletion roles/mysql/handlers/main.yaml
@@ -5,6 +5,5 @@
name: "{{ item }}"
enabled: "yes"
state: restarted
cmd: ""
with_items:
- mysql