Add support to Ubuntu 20.04 #813

Merged
merged 6 commits into from
Jun 27, 2022

rafamanzo
Contributor

I did my best to create atomic commits with messages full of details.

This should close issue #811

I developed this using Vagrant, changing the image to focal64. Then I ran:

vagrant destroy -f && vagrant up && ansible-playbook --limit vagrant playbooks/setup.yml && ansible-playbook --limit vagrant playbooks/provision.yml && ansible-playbook --limit vagrant playbooks/deploy.yml

Once all these commands ran without errors, I checked in the browser that http://localhost:8080 loaded the OFN homepage without errors.

With everything working on 20.04, I switched the Vagrantfile back to xenial and then bionic, repeating the test described above for each, to check that my changes had no unwanted side effects on them. Everything looked good in my tests.

Final notes:

  • the community should consider EOLing Ubuntu 16 soon, which would make it possible to remove many of the conditionals in the infrastructure automation we have here
  • this PR is a step towards getting "Upgrade Python to v3" (#765) done
  • this PR should replace "Ubuntu 20" (#659)

@mkllnk
Member

mkllnk commented Jun 22, 2022

the community should consider EOLing Ubuntu 16 soon

I agree and just checked current usage:

$ ansible all-staging -u openfoodnetwork -a 'lsb_release -a'
staging.openfoodnetwork.org.au | CHANGED | rc=0 >>
Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.6 LTS
Release:	16.04
Codename:	xenialNo LSB modules are available.
staging.openfoodnetwork.org.uk | CHANGED | rc=0 >>
Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.3 LTS
Release:	16.04
Codename:	xenialNo LSB modules are available.
staging.coopcircuits.fr | CHANGED | rc=0 >>
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.5 LTS
Release:	18.04
Codename:	bionicNo LSB modules are available.

$ ansible all-prod -u openfoodnetwork -a 'lsb_release -a'
openfoodnetwork.org.au | CHANGED | rc=0 >>
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.4 LTS
Release:	18.04
Codename:	bionicNo LSB modules are available.
openfoodnetwork.net | CHANGED | rc=0 >>
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.5 LTS
Release:	18.04
Codename:	bionicNo LSB modules are available.
openfoodnetwork.ca | CHANGED | rc=0 >>
Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.6 LTS
Release:	16.04
Codename:	xenialNo LSB modules are available.
openfoodnetwork.hu | CHANGED | rc=0 >>
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.6 LTS
Release:	18.04
Codename:	bionicNo LSB modules are available.
app.katuma.org | CHANGED | rc=0 >>
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.4 LTS
Release:	18.04
Codename:	bionicNo LSB modules are available.
openfoodnetwork.be | CHANGED | rc=0 >>
Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.5 LTS
Release:	16.04
Codename:	xenialNo LSB modules are available.
openfoodnetwork.org.uk | CHANGED | rc=0 >>
Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.5 LTS
Release:	16.04
Codename:	xenialNo LSB modules are available.
openfoodnetwork.ie | CHANGED | rc=0 >>
Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.6 LTS
Release:	16.04
Codename:	xenialNo LSB modules are available.
openfoodnetwork.de | CHANGED | rc=0 >>
Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.6 LTS
Release:	16.04
Codename:	xenialNo LSB modules are available.
coopcircuits.fr | CHANGED | rc=0 >>
Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.5 LTS
Release:	16.04
Codename:	xenialNo LSB modules are available.

Servers to upgrade:

  • staging.openfoodnetwork.org.au
  • staging.openfoodnetwork.org.uk
  • openfoodnetwork.ca
  • openfoodnetwork.org.uk
  • openfoodnetwork.ie
  • openfoodnetwork.de
  • coopcircuits.fr

That's a bit of work because we don't have a zero-downtime transition playbook at the moment. I doubt that we'll get to it in the next few months but I'll put it on the list to plan for it. I guess that 2026 is a hard deadline.

Member

@mkllnk mkllnk left a comment

Thank you! This is great! Your long commit messages are excellent and help me a lot to understand what you tried and why you came to these solutions. 🏅

Just one idea below of what to try.

Comment on lines 118 to 124
- name: Fix Ruby # noqa 301
  command: bash -lc "rbenv uninstall -f {{ ruby_version }} && rbenv install {{ ruby_version }}"
  become: yes
  become_user: "{{ app_user }}"
  when: ansible_distribution_major_version >= '20'
Member

Hm, this would happen each time we run the provisioning and it takes a while. Did you try to increase the memory of the virtual machine?

Or maybe we can skip the task if assets have compiled successfully? That would be a weird hack here, too...

Contributor Author

Thanks for the suggestion!

I did try increasing memory and CPU just in case. Unfortunately the error persisted. My best guess for the cause is still something within the Galaxy role that takes care of installing Ruby.

I think checking whether asset precompilation works before running this would be hard and confusing, especially because the precompile runs in a different playbook (deploy.yml).

I understand that this should run every time a new Ruby gets installed. What do you think about setting a hidden file with {{ ruby_version }} and only running the fix task if the task that sets this file reports a change? This would prevent the fix from running when it is not necessary.

Member

What do you think about setting a hidden file

Yes, it's also a hack but we are hacking anyway. 😉 It's a good idea. If you can easily put it in the ruby installation path then it gets removed automatically when removing a ruby version. It's like marking an installation as "patched". But if that's too difficult then a simple file in the home directory would do, too.
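
For illustration, the "mark an installation as patched" idea could look roughly like the task below. This is a minimal sketch and not necessarily what was committed: the `ruby_patched_marker` variable, the `.patched` file name, and the assumption that rbenv lives in the app user's home directory are all hypothetical. It uses the `creates` argument of the `command` module rather than a separate file-writing task.

```yaml
# Sketch only: the marker path and the `ruby_patched_marker` variable are
# assumptions, and rbenv is assumed to live in the app user's home directory.
- name: Fix Ruby once per installed version
  command: >-
    bash -lc "rbenv uninstall -f {{ ruby_version }} &&
    rbenv install {{ ruby_version }} &&
    touch {{ ruby_patched_marker }}"
  args:
    # The command module skips the task when this file already exists.
    creates: "{{ ruby_patched_marker }}"
  become: yes
  become_user: "{{ app_user }}"
  when: ansible_distribution_major_version is version('20', '>=')
  vars:
    ruby_patched_marker: "/home/{{ app_user }}/.rbenv/versions/{{ ruby_version }}/.patched"
```

Because the marker sits inside the version's own directory, removing that Ruby version also removes the marker, so a later reinstall would trigger the fix again.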

Contributor Author

Done in commit 415c8ff

Matt-Yorkley and others added 6 commits June 27, 2022 09:31
Starting with Ubuntu 20, Python 2 packages, notably python-psycopg2
(required by geerlingguy.postgresql), are no longer part of the distro,
so we have to work with their Python 3 versions.

My first attempt to get this done was by setting the
`ansible_python_interpreter` to something like:

```
ansible_python_interpreter: "/usr/bin/python{% if ansible_distribution_major_version < '20' %}2.7{% else %}3{% endif%}"
```

But it didn't work because, at the time `ansible_python_interpreter`
needs to be evaluated, `ansible_distribution_major_version` is still
undefined.

This is why I'm going for this more creative solution.

In the future, when the project maintainers decide to end support for
Ubuntu 16, this solution should not be necessary and we can just work
with Python 3 in Ubuntu 18 and 20.

This package (`python-psycopg2`) no longer exists in Ubuntu 20 and was
causing the following error:

```
TASK [common : install base packages] *********************************************************************************************************************************************
fatal: [local_vagrant]: FAILED! => {"changed": false, "msg": "No package matching 'python-psycopg2' is available"}
Tuesday 14 June 2022  14:31:23 -0300 (0:00:00.361)       0:00:12.428 **********
```
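
For illustration only, and not the change made in this commit: if the package were still needed on every release, one way to express that would be to pick the package name from the distribution version. This is a hedged sketch; the task name is hypothetical, and it assumes Ansible runs under Python 2 on Ubuntu 16 and 18 and under Python 3 on Ubuntu 20.

```yaml
# Sketch only: picks the psycopg2 package that matches the distro's Python.
- name: install psycopg2 for the available Python
  apt:
    name: "{{ 'python3-psycopg2' if ansible_distribution_major_version is version('20', '>=') else 'python-psycopg2' }}"
    state: present
  become: yes
```
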
This version adds support for Ubuntu 20 while still supporting 16 and 18.

Fixes the following error present when running in Ubuntu 20:

```
TASK [geerlingguy.postgresql : Include OS-specific variables (Debian).] ***********************************************************************************************************
fatal: [local_vagrant]: FAILED! => {"ansible_facts": {}, "ansible_included_var_files": [], "changed": false, "message": "Could not find or access 'Ubuntu-20.yml'\nSearched in:\n\t/home/manzo/workspace/Camar
inhaManzo/ofn-install/community/geerlingguy.postgresql/vars/Ubuntu-20.yml\n\t/home/manzo/workspace/CamarinhaManzo/ofn-install/community/geerlingguy.postgresql/Ubuntu-20.yml\n\t/home/manzo/workspace/Camarinh
aManzo/ofn-install/roles/dbserver/vars/Ubuntu-20.yml\n\t/home/manzo/workspace/CamarinhaManzo/ofn-install/roles/dbserver/Ubuntu-20.yml\n\t/home/manzo/workspace/CamarinhaManzo/ofn-install/community/geerlinggu
y.postgresql/tasks/vars/Ubuntu-20.yml\n\t/home/manzo/workspace/CamarinhaManzo/ofn-install/community/geerlingguy.postgresql/tasks/Ubuntu-20.yml\n\t/home/manzo/workspace/CamarinhaManzo/ofn-install/playbooks/v
ars/Ubuntu-20.yml\n\t/home/manzo/workspace/CamarinhaManzo/ofn-install/playbooks/Ubuntu-20.yml on the Ansible Controller.\nIf you are using a module and expect the file to exist on the remote, see the remote
_src option"}
Tuesday 14 June 2022  14:51:16 -0300 (0:00:00.015)       0:08:08.065 **********
```
In Ubuntu 20, asset precompilation failed with this error:

```
TASK [deploy : precompile assets] *************************************************************************************************************************************************
fatal: [local_vagrant]: FAILED! => {"changed": true, "cmd": ["bash", "-lc", "bundle exec rake assets:precompile RAILS_ENV=development"], "delta": "0:00:05.815405", "end": "2022-06
-15 11:15:39.701298", "msg": "non-zero return code", "rc": -6, "start": "2022-06-15 11:15:33.885893", "stderr": "I, [2022-06-15T11:15:39.534899 #62413]  INFO -- : Writing /home/op
enfoodnetwork/apps/openfoodnetwork/releases-old/2022-06-15-111445/public/assets/iehack-725caee6a64e094fd6f6faeb3b7d8456ba36bfca88b2b2d46fd02661da15f6d7.js\nI, [2022-06-15T11:15:39
.535309 #62413]  INFO -- : Writing /home/openfoodnetwork/apps/openfoodnetwork/releases-old/2022-06-15-111445/public/assets/iehack-725caee6a64e094fd6f6faeb3b7d8456ba36bfca88b2b2d46
fd02661da15f6d7.js.gz\nfree(): invalid pointer", "stderr_lines": ["I, [2022-06-15T11:15:39.534899 #62413]  INFO -- : Writing /home/openfoodnetwork/apps/openfoodnetwork/releases-ol
d/2022-06-15-111445/public/assets/iehack-725caee6a64e094fd6f6faeb3b7d8456ba36bfca88b2b2d46fd02661da15f6d7.js", "I, [2022-06-15T11:15:39.535309 #62413]  INFO -- : Writing /home/ope
nfoodnetwork/apps/openfoodnetwork/releases-old/2022-06-15-111445/public/assets/iehack-725caee6a64e094fd6f6faeb3b7d8456ba36bfca88b2b2d46fd02661da15f6d7.js.gz", "free(): invalid poi
nter"], "stdout": "yarn install v1.22.4\n[1/4] Resolving packages...\nsuccess Already up-to-date.\nDone in 0.66s.", "stdout_lines": ["yarn install v1.22.4", "[1/4] Resolving packa
ges...", "success Already up-to-date.", "Done in 0.66s."]}
Wednesday 15 June 2022  08:15:39 -0300 (0:00:05.946)       0:00:55.217 ********
```

The important part is: `free(): invalid pointer`.

This is a generic memory error whose actual cause I could not find, not
even with valgrind. This led me to trial and error. First I
investigated whether removing some gems with native extensions, such as
`pg` and `json`, from the Gemfile had any effect, but the error persisted
after every attempt. After that I removed and reinstalled all gems, and
the error still persisted.

Then I tried removing and reinstalling the Ruby in use, which fixed the
problem! But I must admit this is an ugly solution.

Now knowing that the source of the error is within the Ruby
installation, I tried upgrading the `zzet.rbenv` Ansible Galaxy role,
but had no success. I also tried removing jemalloc from the
installation, and the error persisted.

Finally, I moved the uninstall/reinstall so that it runs immediately
after calling `zzet.rbenv` and, to my surprise, it still fixed the
problem! From this I can only conclude there is some unknown issue in
this role that is beyond the scope of OFN.

With this long description of what I tried and what failed, I hope you
understand that I did my best to find a better solution, but had no
success in my search. Uninstalling and reinstalling the same thing was
the only solution that worked.

Why it is a problem only in Ubuntu 20 and not 18 and 16 is another
mystery.

In the future, if someone accepts the quest of removing this ugly fix,
the first thing I suggest checking is whether there is a version of
`zzet.rbenv` newer than 3.6.0 and, if there is, whether it fixes the
error described here.

The fix only needs to run once per Ruby version. Running it every time
the provision playbook ran did not cause any errors, but it took a while
and delayed deployments.
Member

@mkllnk mkllnk left a comment

Great!

I actually have to provision a new server this week. I'll try this.

@mkllnk mkllnk merged commit c6d69a9 into openfoodfoundation:master Jun 27, 2022
@mkllnk
Member

mkllnk commented Jun 29, 2022

Which Ansible version do you use?

I'm running into some trouble with Ansible 2.9.6 and Ubuntu 20 because the systemd output changed. I found the fixing pull request but I can't find a release with it:

@mkllnk
Member

mkllnk commented Jun 29, 2022

Ah, don't worry. I updated to Ansible 2.9.16 and it works. 👍
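
A guard like the following could make such a version mismatch fail early. It is a sketch under assumptions: the task name and message are hypothetical, and 2.9.16 is only the version reported to work above, not a confirmed minimum.

```yaml
# Sketch only: 2.9.16 is the version reported to work with Ubuntu 20 in this
# thread; the true minimum version is not stated anywhere in the discussion.
- name: check the Ansible version on the controller
  assert:
    that:
      - ansible_version.full is version('2.9.16', '>=')
    fail_msg: "Ansible {{ ansible_version.full }} may mis-parse systemd output on Ubuntu 20; try 2.9.16 or newer."
  run_once: yes
```
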

@rafamanzo
Contributor Author

Nice it wasn't something harder to fix :)

I did a test install on Digital Ocean today. There was an issue that I did not handle here because I was running my tests on a Vagrant box.

It seems the recommended way to install certbot on Ubuntu 20.04 has changed, but upgrading the Galaxy role to the latest version should make it work. You may want to cherry-pick this: f2fd353

@mkllnk
Member

mkllnk commented Jun 30, 2022

Ah, thanks! You may want to review #815.
