Create Ansible Script For Deploying A Clone of the External Data to A Node #100

Open
wants to merge 14 commits into
base: main
16 changes: 16 additions & 0 deletions Linux/external-data-mirror/ansible/external-data-mirror.yml
@@ -0,0 +1,16 @@
- name: Deploy a docker container serving a mirror of the mantid external data store.
  hosts: all

  roles:
    - role: dannixon.system.interactive_users
      tags: "setup"
    - role: geerlingguy.docker
      become: yes
      tags: "setup"
    - role: mirror-data
      become: yes
      tags: "mirror"
    - role: server
      become: yes
      tags: "server"

75 changes: 75 additions & 0 deletions Linux/external-data-mirror/ansible/readme.md
@@ -0,0 +1,75 @@
# External Data Mirror Deployment

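These playbooks deploy additional mirrors of the Mantid external data store to reduce the load
on the main server.

Developers deploying these playbooks require SSH access to `mantidproject.org` as `root`, because
the `mirror` stage adds the new VM's public key to the main server's `authorized_keys`.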

## Setup

### VM Provisioning
- Create a new Linux virtual machine (VM) (on OpenStack at STFC) with an SSH key that
  you have on your system.
- Add the `HTTP` security group to the VM to allow traffic into the node on port `80`.

### Ansible Setup
- Create an `ansible` conda environment:

```sh
mamba create -n ansible ansible
mamba activate ansible
```

- Install the third-party Ansible roles from `ansible-galaxy`:

```sh
ansible-galaxy install -r requirements.yml
```

- Create an `inventory.txt` file in the format shown below, where:

  - `VM_IP_ADDRESS`: IP address of the node you just provisioned.
  - `IP_TO_COPY_FROM`: IP address or domain (usually `mantidproject.org`) that holds
    the external data you want to copy.
  - `DIR_NAME`: The directory under `/srv/` on the main server that holds the
    rest of the path to the external data. On `mantidproject.org` this directory is
    named after the main server's IP address.

```ini
[all]
<VM_IP_ADDRESS> main_server_hostname=<IP_TO_COPY_FROM> main_data_srv_dir=<DIR_NAME>
```
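
As a minimal sketch, the inventory can also be generated from shell variables rather than typed by hand. The values `192.0.2.10` and `203.0.113.5` below are placeholders taken from the documentation address ranges, not real hosts or directory names.

```sh
#!/bin/sh
# Placeholder values (assumptions): replace them with your own.
VM_IP_ADDRESS="192.0.2.10"
IP_TO_COPY_FROM="mantidproject.org"
DIR_NAME="203.0.113.5"

# Write a one-host inventory in the format described above.
printf '[all]\n%s main_server_hostname=%s main_data_srv_dir=%s\n' \
    "$VM_IP_ADDRESS" "$IP_TO_COPY_FROM" "$DIR_NAME" > inventory.txt

# Show the result for a quick visual check.
cat inventory.txt
```

Running `ansible-inventory -i inventory.txt --list` afterwards shows how Ansible parses the file, which is a quick way to catch a misspelt variable name before deploying.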

## Deployment

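- Deploy the playbook to every machine listed in the inventory. The `-K` flag prompts for the
  privilege escalation (sudo) password on the VM, which the roles that use `become` need.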

```sh
ansible-playbook -i inventory.txt external-data-mirror.yml -u <YOUR_VM_USERNAME> -K
```

- The playbook has three tags, one for each part of the deployment:

  - `setup`: Sets up the VM with SSH keys for the whole DevOps team and a Docker
    installation.
  - `mirror`: Creates a copy of the data on the new VM and sets up a crontab job to keep
    it in sync with any new data added on the main server.
  - `server`: Spins up a Docker container to host the server with a mounted volume
    containing the copied data.
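
A single part of the deployment can be re-run by passing `--tags` to `ansible-playbook`. The wrapper below is only a sketch: the function name `run_stage`, the `DRY_RUN` variable and the username `alice` are hypothetical and not part of this repository.

```sh
#!/bin/sh
# Hypothetical helper: run one tagged stage of the playbook.
# Usage: run_stage <tag> <vm-username>
run_stage() {
    # When DRY_RUN is set and non-empty, print the command instead of executing it.
    ${DRY_RUN:+echo} ansible-playbook -i inventory.txt external-data-mirror.yml \
        -u "$2" -K --tags "$1"
}

# Example: show the command that would re-run only the mirror stage.
DRY_RUN=1
run_stage mirror alice
```

Unsetting `DRY_RUN` makes the same call execute the playbook for real.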

The new server can now be accessed at: `http://<VM_IP_ADDRESS>/external-data/MD5/<TEST_FILE_HASH>`
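
One way to check this from the command line is to request a known file and look at the HTTP status code. The sketch below assumes `curl` is installed; the helper names `mirror_url` and `check_mirror`, the address `192.0.2.10` and the hash are all made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: build the URL of a file on a mirror.
# Usage: mirror_url <host> <md5-hash>
mirror_url() {
    printf 'http://%s/external-data/MD5/%s\n' "$1" "$2"
}

# Hypothetical helper: print only the HTTP status code returned for that file.
check_mirror() {
    curl -s -o /dev/null -w '%{http_code}\n' "$(mirror_url "$1" "$2")"
}

# Example with placeholder values; call check_mirror with the same
# arguments once the VM is deployed.
mirror_url 192.0.2.10 0123456789abcdef0123456789abcdef
```

A status of `200` from `check_mirror` indicates that nginx is serving the file; `404` suggests the hash is not present in the mirrored directory.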

### Configuring the Load Balancer (STFC Cloud)

For the Mantid build process to reach your new mirror, the VM must be added to the
load balancer pool.


- Navigate to `Network` &rarr; `Load Balancers` &rarr; `External Data LB` &rarr; `Pools` &rarr;
`HTTP Pool` &rarr; `Members` &rarr; `Add/Remove Members`.
- Add your new VM to this pool and set the port to `80` (HTTP).
- Your new node should now be accessible via the floating IP address of the load balancer:

```text
http://<LOAD_BALANCER_IP>/external-data/MD5/<TEST_FILE_HASH>
```
3 changes: 3 additions & 0 deletions Linux/external-data-mirror/ansible/requirements.yml
@@ -0,0 +1,3 @@
---
roles:
  - src: geerlingguy.docker
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
- name: Generate key pair if it does not exist
  community.crypto.openssh_keypair:
    force: no  # Don't regenerate existing keys.
    path: ~/.ssh/id_rsa

- name: Read public key into tmp to copy over.
  fetch:
    src: ~/.ssh/id_rsa.pub
    dest: /tmp/{{ ansible_hostname }}-id_rsa.pub
    flat: yes

- name: Add public key to main server's authorized keys
  ansible.posix.authorized_key:
    user: root
    # Build the path by concatenation rather than nesting "{{ }}" inside an expression.
    key: "{{ lookup('file', '/tmp/' ~ ansible_hostname ~ '-id_rsa.pub') }}"
  remote_user: root
  delegate_to: "{{ main_server_hostname }}"

- name: Touch the known_hosts file if it's missing
  file:
    path: ~/.ssh/known_hosts
    state: touch
    mode: 0644

- name: Check if known_hosts contains existing server fingerprint
  command: ssh-keygen -F {{ main_server_hostname }}
  register: key_exists
  failed_when: key_exists.stderr != ''
  changed_when: False

- name: Scan for existing remote ssh fingerprint
  command: ssh-keyscan -T5 {{ main_server_hostname }}
  register: keyscan
  failed_when: keyscan.rc != 0 or keyscan.stdout == ''
  changed_when: False
  when: key_exists.rc == 1

- name: Copy ssh-key to local known_hosts
  lineinfile:
    name: ~/.ssh/known_hosts
    create: yes
    line: "{{ item }}"
  when: key_exists.rc == 1
  with_items: "{{ keyscan.stdout_lines|default([]) }}"
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
- name: Create a directory to hold the mirror of the external data.
  ansible.builtin.file:
    path: /external-data/MD5/
    state: directory
    mode: '0755'

- name: Check if machine has SSH access to the main data store.
  ansible.builtin.command: ssh -o BatchMode=True root@{{ main_server_hostname }} 'echo success'
  register: connected
  ignore_errors: True

- name: Exchange SSH keys with linode so we can access the data.
  import_tasks: exchange-keys.yml
  when: connected.stdout != "success"

- name: Mirror the external data from the main server in a volume (this may take a while).
  ansible.builtin.command: "rsync -az --perms -o -g {{ main_server_hostname }}:/srv/{{ main_data_srv_dir }}/ftp/external-data/MD5/ /external-data/MD5/"

- name: Copy the data update script onto the mirror machine.
  ansible.builtin.copy:
    src: ./update-external-data.sh
    dest: /external-data/update-external-data.sh
    mode: '0755'

- name: Create a crontab job that runs periodically to keep the data up to date.
  ansible.builtin.cron:
    name: Update external data
    minute: "*/5"
    job: /external-data/update-external-data.sh {{ main_server_hostname }} {{ main_data_srv_dir }} >> /external-data/update-log.txt 2>&1
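
Because the initial mirror can take a long time, it may be worth previewing the transfer first. The sketch below reuses the rsync options from the task above and adds `--dry-run` and `--itemize-changes`, so nothing is copied. The helper names `remote_md5_dir` and `preview_sync` are hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: build the rsync source path used by the mirror task.
# Usage: remote_md5_dir <main_server_hostname> <main_data_srv_dir>
remote_md5_dir() {
    printf '%s:/srv/%s/ftp/external-data/MD5/\n' "$1" "$2"
}

# Hypothetical helper: list what would be transferred without copying anything.
preview_sync() {
    rsync -az --perms -o -g --dry-run --itemize-changes \
        "$(remote_md5_dir "$1" "$2")" /external-data/MD5/
}
```

Calling `preview_sync` on the mirror with the same two values used in `inventory.txt` prints one line per file that a real run would create or update.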
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
#!/bin/bash

SERVER_IP=${1}
FTP_SRV_DIR=${2}

# Non-empty if any rsync process is currently running on this machine.
RSYNC_PROCESS_IDS=$(pidof rsync)

# Prefix the log line with the current time; the explicit -1 means "now".
printf "%(%H:%M:%S)T " -1

if [ -z "${RSYNC_PROCESS_IDS}" ]; then
    echo "running rsync..."
    rsync -az --perms -o -g "${SERVER_IP}:/srv/${FTP_SRV_DIR}/ftp/external-data/MD5/" /external-data/MD5/
else
    echo "rsync is already running. Skipping this time..."
fi
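
The `pidof rsync` check also matches unrelated rsync processes, and two cron runs could in principle both pass the check before either starts rsync. One alternative, sketched here on the assumption that `flock` from util-linux is available on the mirror, is to guard the sync with a lock file. The `run_exclusive` name and the lock path are hypothetical, and this is not what the script above does.

```sh
#!/bin/bash
# Assumed lock location; any path writable by the cron user would do.
LOCK_FILE="${LOCK_FILE:-/tmp/update-external-data.lock}"

# Run the given command only if nobody else currently holds the lock.
run_exclusive() {
    (
        # -n: fail immediately instead of waiting for the lock to be released.
        if ! flock -n 9; then
            echo "a sync is already running. Skipping this time..."
            exit 0
        fi
        "$@"
    ) 9>"$LOCK_FILE"
}

# Example: the real script would pass its rsync command here instead of echo.
run_exclusive echo "lock acquired, sync would run now"
```

The lock is tied to file descriptor 9 and is released automatically when the subshell exits, so a crashed sync cannot leave a stale lock behind.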
11 changes: 11 additions & 0 deletions Linux/external-data-mirror/ansible/roles/server/tasks/main.yml
@@ -0,0 +1,11 @@
- name: Spin up the nginx docker container to serve the downloaded testing data.
  community.docker.docker_container:
    name: "nginx-external-data"
    image: "nginx:stable"
    state: "started"
    detach: True
    restart_policy: "always"
    network_mode: "host"
    # Note: Docker discards published ports in host network mode, so nginx
    # listens on port 80 of the host directly.
    ports:
      - "80:80"
    volumes:
      - "/external-data/:/usr/share/nginx/html/external-data/:ro"