Issues with Ceph Ansible stable-8.0 on Ubuntu 22.04/24.04 and Impact on OpenStack Ansible (OSA) Integration #7496

Comments
Hi @tiagonix, thanks for your feedback.

For context, the fate of ceph-ansible has been uncertain for several releases. It was supposed to be deprecated and left unmaintained after Pacific; that was then postponed until after Quincy. In the end, we (@clwluvw and I) decided to keep maintaining it (and to make it very clear, we do so only in our free time), so stable-8.0 has finally received some engineering effort.

As for the following statement: I'm not sure I totally understand what you are trying to bring out here. Could you elaborate a bit more on this?
Hi @guits, firstly I want to extend my gratitude for your detailed response. It's greatly appreciated.

I'm eager to contribute to the testing of the stable-8.0 branch. My whole lab is driven by a single OpenStack Heat template:

openstack stack create -t ceph-basic-stack-ubuntu-5.yaml ceph-lab-5

I'm more than willing to share this template if it's of interest.

To maintain focus and objectivity, I propose we prioritize making the TASK [ceph-mon : Waiting for the monitor(s) to form the quorum...] work:

...
fatal: [ceph-mon-1]: FAILED! => changed=false

Overcoming this issue could significantly advance integration efforts with OpenStack Ansible and potentially make it work on Ubuntu 24.04 as well.

Regarding the OpenStack components within Ceph Ansible, discussions with the OSA team suggest a consensus for moving the OpenStack-specific functionality (e.g., pool creation) to OSA playbooks, which seems a reasonable direction.

On the topic of containerized deployments, I've delved into the discussion in the Ceph list thread https://lists.ceph.io/hyperkitty/list/[email protected]/thread/TTTYKRVWJOR7LOQ3UCQAZQR32R7YADVY/ and found it quite enlightening. Our organization is heavily reliant on package-based (non-containerized) deployments.

Again, thank you for your engagement and openness to community feedback. I look forward to contributing (mostly with testing) to the project.

Cheers!
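One way to see what the monitors are actually doing while that task times out is to query each mon's admin socket on the mon hosts. Below is a rough ad-hoc playbook sketch, not part of ceph-ansible itself; it assumes the usual mons inventory group and mon ids equal to the short hostname, so adjust to your layout:

---
# debug-mon-quorum.yml -- rough sketch for inspecting monitor state while
# "Waiting for the monitor(s) to form the quorum..." keeps failing.
- hosts: mons
  become: true
  gather_facts: true
  tasks:
    - name: Check whether the ceph-mon unit is running
      ansible.builtin.command: systemctl is-active ceph-mon@{{ ansible_hostname }}
      register: mon_unit
      changed_when: false
      failed_when: false

    - name: Ask the monitor admin socket for its current state
      ansible.builtin.command: ceph daemon mon.{{ ansible_hostname }} mon_status
      register: mon_state
      changed_when: false
      failed_when: false

    - name: Show what each monitor reports
      ansible.builtin.debug:
        msg:
          - "unit: {{ mon_unit.stdout | default('unknown') }}"
          - "{{ mon_state.stdout | default(mon_state.stderr) }}"

If all monitors report the unit as active but mon_status shows them stuck probing or electing, the usual suspects are the monitor_address/public_network settings or firewalling between the mon hosts.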
I tested the cutting edge of the stable-8.0 branch.
Thank you for taking the time to test it! Let me show you exactly what I'm doing.

BTW, have you also enabled Ubuntu's UCA Bobcat in your test before running Ceph Ansible (add-apt-repository -y cloud-archive:bobcat)?

To make it clear, I have to share the two OpenStack Heat templates I'm using: one for "Deployment 2" (Ubuntu 20.04 + Ceph Pacific), which works (ceph-basic-stack-ubuntu-2.yaml), and one for "Deployment 5" (Ubuntu 22.04 + Ceph Reef), which fails (ceph-basic-stack-ubuntu-5.yaml).

Here are the differences between "Deployment 2" and "Deployment 5":

$ diff -Nru ceph-basic-stack-ubuntu-2.yaml ceph-basic-stack-ubuntu-5.yaml
--- ceph-basic-stack-ubuntu-2.yaml 2024-03-19 16:57:04.925881995 +0100
+++ ceph-basic-stack-ubuntu-5.yaml 2024-03-19 15:18:49.480372431 +0100
@@ -66,8 +66,8 @@
os_image_1:
type: string
label: 'Ubuntu Server - 64-bit'
- description: 'Ubuntu - Focal Fossa - LTS'
- default: 'ubuntu-20.04.1-20201201'
+ description: 'Ubuntu - Jammy Jellyfish - LTS'
+ default: 'ubuntu-22.04-20230110'
# Flavors for Ceph
flavor_ceph_generic:
@@ -529,7 +529,8 @@
packages:
- zram-config
- net-tools
- - ansible
+ - ansible-core
+ - python3-pip
- python3-six
- python3-netaddr
@@ -664,10 +665,11 @@
owner: root
content: |
---
+ yes_i_know: True
cluster: ceph
ntp_daemon_type: chronyd
ceph_origin: distro
- ceph_stable_release: pacific
+ ceph_stable_release: reef
generate_fsid: false
docker: false
containerized_deployment: false
@@ -919,6 +921,8 @@
owner: root
content: |
#!/bin/bash
+ # Python resolvelib issue: https://bugs.launchpad.net/ubuntu/+source/ansible/+bug/1995249
+ pip install resolvelib==0.5.4
pushd /home/manager/ceph-ansible
ansible-galaxy install -r requirements.yml
cp site.yml.sample site.yml
@@ -950,7 +954,7 @@
ssh-keygen -b 2048 -t rsa -f .ssh/id_rsa -q -C 'manager@cephao-1-ceph-ansible-1' -N ''
- git clone -b stable-6.0 https://github.com/ceph/ceph-ansible.git
+ git clone -b stable-8.0 https://github.com/ceph/ceph-ansible.git
mkdir /home/manager/ansible
@@ -965,8 +969,8 @@
runcmd:
- [ sh, -c, "/usr/local/sbin/bootstrap-instance.sh" ]
- [ sh, -c, "/usr/local/bin/netcat-tarpipe-send-ssh-pubkey.sh" ]
- - [ sh, -c, "add-apt-repository ppa:ansible/ansible-2.10"]
- - [ sh, -c, "add-apt-repository -y cloud-archive:wallaby"]
+ - [ sh, -c, "add-apt-repository -y cloud-archive:bobcat"]
+ - [ sh, -c, "add-apt-repository -y ppa:ansible/ansible"]
- [ sh, -c, "apt update ; apt upgrade -y ; apt autoremove -y" ]
- [ sh, -c, "snap refresh" ]
- [ sh, -c, "reboot" ]
@@ -1073,7 +1077,7 @@
runcmd:
- [ sh, -c, "/usr/local/sbin/bootstrap-instance.sh" ]
- [ sh, -c, "/usr/local/sbin/netcat-tarpipe-ssh-pubkey.sh" ]
- - [ sh, -c, "add-apt-repository -y cloud-archive:wallaby"]
+ - [ sh, -c, "add-apt-repository -y cloud-archive:bobcat"]
- [ sh, -c, "apt update ; apt upgrade -y ; apt autoremove -y" ]
- [ sh, -c, "snap refresh" ]
- [ sh, -c, "reboot" ]
@@ -1156,7 +1160,7 @@
- swapon /swap.img
- echo '/swap.img none swap sw 0 0' >> /etc/fstab
- [ sh, -c, "/usr/local/sbin/netcat-tarpipe-ssh-pubkey.sh" ]
- - [ sh, -c, "add-apt-repository -y cloud-archive:wallaby"]
+ - [ sh, -c, "add-apt-repository -y cloud-archive:bobcat"]
- [ sh, -c, "apt update ; apt upgrade -y ; apt autoremove -y" ]
- [ sh, -c, "snap refresh" ]
- [ sh, -c, "reboot" ]
@@ -1232,7 +1236,7 @@
- swapon /swap.img
- echo '/swap.img none swap sw 0 0' >> /etc/fstab
- [ sh, -c, "/usr/local/sbin/netcat-tarpipe-ssh-pubkey.sh" ]
- - [ sh, -c, "add-apt-repository -y cloud-archive:wallaby"]
+ - [ sh, -c, "add-apt-repository -y cloud-archive:bobcat"]
- [ sh, -c, "apt update ; apt upgrade -y ; apt autoremove -y" ]
- [ sh, -c, "snap refresh" ]
- [ sh, -c, "reboot" ]
@@ -1307,7 +1311,7 @@
- swapon /swap.img
- echo '/swap.img none swap sw 0 0' >> /etc/fstab
- [ sh, -c, "/usr/local/sbin/netcat-tarpipe-ssh-pubkey.sh" ]
- - [ sh, -c, "add-apt-repository -y cloud-archive:wallaby"]
+ - [ sh, -c, "add-apt-repository -y cloud-archive:bobcat"]
- [ sh, -c, "apt update ; apt upgrade -y ; apt autoremove -y" ]
- [ sh, -c, "snap refresh" ]
- [ sh, -c, "reboot" ]
@@ -1480,7 +1484,7 @@
runcmd:
- [ sh, -c, "/usr/local/sbin/bootstrap-instance.sh" ]
- [ sh, -c, "/usr/local/sbin/netcat-tarpipe-ssh-pubkey.sh" ]
- - [ sh, -c, "add-apt-repository -y cloud-archive:wallaby"]
+ - [ sh, -c, "add-apt-repository -y cloud-archive:bobcat"]
- [ sh, -c, "apt update ; apt upgrade -y ; apt autoremove -y" ]
- [ sh, -c, "snap refresh" ]
- [ sh, -c, "reboot" ]
@@ -1642,7 +1646,7 @@
runcmd:
- [ sh, -c, "/usr/local/sbin/bootstrap-instance.sh" ]
- [ sh, -c, "/usr/local/sbin/netcat-tarpipe-ssh-pubkey.sh" ]
- - [ sh, -c, "add-apt-repository -y cloud-archive:wallaby"]
+ - [ sh, -c, "add-apt-repository -y cloud-archive:bobcat"]
- [ sh, -c, "apt update ; apt upgrade -y ; apt autoremove -y" ]
- [ sh, -c, "snap refresh" ]
- [ sh, -c, "reboot" ]
@@ -1804,7 +1808,7 @@
runcmd:
- [ sh, -c, "/usr/local/sbin/bootstrap-instance.sh" ]
- [ sh, -c, "/usr/local/sbin/netcat-tarpipe-ssh-pubkey.sh" ]
- - [ sh, -c, "add-apt-repository -y cloud-archive:wallaby"]
+ - [ sh, -c, "add-apt-repository -y cloud-archive:bobcat"]
- [ sh, -c, "apt update ; apt upgrade -y ; apt autoremove -y" ]
- [ sh, -c, "snap refresh" ]
- [ sh, -c, "reboot" ]
@@ -1902,7 +1906,7 @@
runcmd:
- [ sh, -c, "/usr/local/sbin/bootstrap-instance.sh" ]
- [ sh, -c, "/usr/local/sbin/netcat-tarpipe-ssh-pubkey.sh" ]
- - [ sh, -c, "add-apt-repository -y cloud-archive:wallaby"]
+ - [ sh, -c, "add-apt-repository -y cloud-archive:bobcat"]
- [ sh, -c, "apt update ; apt upgrade -y ; apt autoremove -y" ]
- [ sh, -c, "snap refresh" ]
- [ sh, -c, "reboot" ] So, "Deployment 2" works as expected, which is based on Those two OpenStack Heat templates contain everything I'm using to deploy Ceph on top of vanilla Ubuntu 20.04 and 22.04 Cloud Images! Including a complete Ansible Inventory (around line 555) and Ceph Ansible variables (around line 665). The Bash script The Ansible error log I posted in my first message! Please let me know if there are other logs you think are worth collecting. I can run those Heat templates over and over, no problem. |
@tiagonix By any chance, are you available on ceph-storage.slack.com?
Let me run our Deployment 2! Check this out, it's kinda cool: it's called with

$ openstack stack create -t ceph-basic-stack-ubuntu-2.yaml cephao-2
+---------------------+---------------------------------------------------------------------+
| Field | Value |
+---------------------+---------------------------------------------------------------------+
| id | 3437c0a1-91ca-4df6-ba9d-6cb6cdc6b847 |
| stack_name | cephao-2 |
| description | |
| | HOT template to create standard setup for a Ceph Cluster, with |
| | Security Groups and Floating IPs. |
| | |
| | Total of 8 Instances for a basic Ceph Cluster. |
| | * 1 Ubuntu as Ceph Ansible |
| | * 3 Ubuntu as Ceph Control Plane, for MONs, MGRs, Dashboard and etc |
| | * 3 Ubuntu as Ceph OSDs |
| | * 1 Ubuntu as Ceph Client (RBD) |
| | |
| | Network Diagram - ASCII |
| | |
| | Control Network (with Internet access via router_i_1) |
| | |
| | ---------|ctrl_subnet|--------------------------------- |
| | | | | | | | | |
| | | | | | | | | |
| | --- --- --- --- --- --- | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | ----|CEPH ANSIBLE|--| |
| | |-|---|-|ceph_pub_subnet|-|--|-|---| | |
| | | | | | | | | | | | | | |---|CEPH GRAFANA|--| |
| | |C| |C| |C| |C| |C| |C| | |& PROMETHEUS| | |
| | |E| |E| |E| |E| |E| |E| | | |
| | |P| |P| |P| |P| |P| |P| ----|CEPH CLIENT|---- |
| | |H| |H| |H| |H| |H| |H| |
| | | | | | | | | | | | | | |
| | |C| |C| |C| |O| |O| |O| |
| | |P| |P| |P| |S| |S| |S| |
| | | | | | | | |D| |D| |D| |
| | --- --- --- --- --- --- |
| | | | | |
| | |ceph_pri_subnet| |
| | |
| creation_time | 2024-03-19T15:51:55Z |
| updated_time | None |
| stack_status | CREATE_IN_PROGRESS |
| stack_status_reason | Stack CREATE started |
+---------------------+---------------------------------------------------------------------+

After about 10 minutes, Ceph Ansible did its thing! Look:

$ openstack server list
+--------------------------------------+-------------------------+--------+----------------------------------------------------------------------------------------------------------------------+-------------------------+-----------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+-------------------------+--------+----------------------------------------------------------------------------------------------------------------------+-------------------------+-----------+
| 35abd0bc-51fd-48a3-83da-5647186cb065 | cephao-2-ceph-osd-2 | ACTIVE | cephao-2-ceph-cluster=10.192.1.22; cephao-2-ceph-public=10.192.0.22; cephao-2-control=10.232.243.112, 192.168.192.22 | ubuntu-20.04.1-20201201 | m1.medium |
| 44c87e44-f354-41f8-9f82-e8df6904c22d | cephao-2-ceph-cp-2 | ACTIVE | cephao-2-ceph-public=10.192.0.12; cephao-2-control=10.232.242.241, 192.168.192.12 | ubuntu-20.04.1-20201201 | m1.small |
| 59004ee9-100a-40d7-accb-f014ad6e89cd | cephao-2-ceph-osd-3 | ACTIVE | cephao-2-ceph-cluster=10.192.1.23; cephao-2-ceph-public=10.192.0.23; cephao-2-control=10.232.242.202, 192.168.192.23 | ubuntu-20.04.1-20201201 | m1.medium |
| 78941aac-54ba-4649-b03e-2aad8f478727 | cephao-2-ceph-cp-1 | ACTIVE | cephao-2-ceph-public=10.192.0.11; cephao-2-control=10.232.244.142, 192.168.192.11 | ubuntu-20.04.1-20201201 | m1.small |
| 28e8611c-f26e-45f5-853b-7b1df61d52e9 | cephao-2-ceph-client-1 | ACTIVE | cephao-2-ceph-public=10.192.0.5; cephao-2-control=10.232.241.31, 192.168.192.5 | ubuntu-20.04.1-20201201 | m1.small |
| a2face4f-4e23-4886-85f7-797dda2f748f | cephao-2-ceph-cp-3 | ACTIVE | cephao-2-ceph-public=10.192.0.13; cephao-2-control=10.232.244.150, 192.168.192.13 | ubuntu-20.04.1-20201201 | m1.small |
| cb7ecb6e-2155-4a9d-821a-8e479b2f5a60 | cephao-2-ceph-osd-1 | ACTIVE | cephao-2-ceph-cluster=10.192.1.21; cephao-2-ceph-public=10.192.0.21; cephao-2-control=10.232.242.73, 192.168.192.21 | ubuntu-20.04.1-20201201 | m1.medium |
| 3ad74d3f-6c37-4e1a-8915-79e7583f207d | cephao-2-ceph-dash-1 | ACTIVE | cephao-2-ceph-public=10.192.0.10; cephao-2-control=10.232.244.92, 192.168.192.10 | ubuntu-20.04.1-20201201 | m1.small |
| de717ee1-677e-4ff8-9783-d2166b275436 | cephao-2-ceph-ansible-1 | ACTIVE | cephao-2-ceph-public=10.192.0.4; cephao-2-control=10.232.242.17, 192.168.192.4 | ubuntu-20.04.1-20201201 | m1.small |
+--------------------------------------+-------------------------+--------+----------------------------------------------------------------------------------------------------------------------+-------------------------+-----------+

$ ssh [email protected]
manager@cephao-2-ceph-cp-1:~$ sudo ceph status
cluster:
id: a46d52d3-5a8b-4609-9a1f-e22e06f710f9
health: HEALTH_WARN
mons are allowing insecure global_id reclaim
Degraded data redundancy: 7/688 objects degraded (1.017%), 7 pgs degraded, 256 pgs undersized
services:
mon: 3 daemons, quorum cephao-2-ceph-cp-1,cephao-2-ceph-cp-2,cephao-2-ceph-cp-3 (age 9m)
mgr: cephao-2-ceph-cp-1(active, since 111s), standbys: cephao-2-ceph-cp-2, cephao-2-ceph-cp-3
mds: 1/1 daemons up, 2 standby
osd: 18 osds: 18 up (since 6m), 18 in (since 6m)
rgw: 3 daemons active (3 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 24 pools, 737 pgs
objects: 227 objects, 9.9 KiB
usage: 5.2 GiB used, 54 TiB / 54 TiB avail
pgs: 7/688 objects degraded (1.017%)
481 active+clean
249 active+undersized
7 active+undersized+degraded
manager@cephao-2-ceph-cp-1:~$ sudo ceph osd df tree
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME
-1 54.00000 - 54 TiB 5.2 GiB 40 MiB 0 B 5.1 GiB 54 TiB 0.01 1.00 - root default
-5 18.00000 - 18 TiB 1.7 GiB 13 MiB 0 B 1.7 GiB 18 TiB 0.01 1.00 - host cephao-2-ceph-osd-1
2 ssd 3.00000 1.00000 3.0 TiB 292 MiB 2.2 MiB 0 B 290 MiB 3.0 TiB 0.01 0.99 116 up osd.2
3 ssd 3.00000 1.00000 3.0 TiB 296 MiB 2.2 MiB 0 B 294 MiB 3.0 TiB 0.01 1.01 131 up osd.3
6 ssd 3.00000 1.00000 3.0 TiB 296 MiB 2.3 MiB 0 B 294 MiB 3.0 TiB 0.01 1.01 127 up osd.6
10 ssd 3.00000 1.00000 3.0 TiB 296 MiB 2.2 MiB 0 B 294 MiB 3.0 TiB 0.01 1.01 120 up osd.10
13 ssd 3.00000 1.00000 3.0 TiB 292 MiB 2.2 MiB 0 B 290 MiB 3.0 TiB 0.01 0.99 135 up osd.13
15 ssd 3.00000 1.00000 3.0 TiB 292 MiB 2.2 MiB 0 B 290 MiB 3.0 TiB 0.01 0.99 108 up osd.15
-3 18.00000 - 18 TiB 1.7 GiB 13 MiB 0 B 1.7 GiB 18 TiB 0.01 1.00 - host cephao-2-ceph-osd-2
0 ssd 3.00000 1.00000 3.0 TiB 296 MiB 2.2 MiB 0 B 294 MiB 3.0 TiB 0.01 1.01 123 up osd.0
4 ssd 3.00000 1.00000 3.0 TiB 292 MiB 2.2 MiB 0 B 290 MiB 3.0 TiB 0.01 0.99 122 up osd.4
7 ssd 3.00000 1.00000 3.0 TiB 292 MiB 2.2 MiB 0 B 290 MiB 3.0 TiB 0.01 0.99 96 up osd.7
9 ssd 3.00000 1.00000 3.0 TiB 296 MiB 2.3 MiB 0 B 294 MiB 3.0 TiB 0.01 1.01 127 up osd.9
12 ssd 3.00000 1.00000 3.0 TiB 296 MiB 2.2 MiB 0 B 294 MiB 3.0 TiB 0.01 1.01 132 up osd.12
16 ssd 3.00000 1.00000 3.0 TiB 296 MiB 2.3 MiB 0 B 294 MiB 3.0 TiB 0.01 1.01 137 up osd.16
-7 18.00000 - 18 TiB 1.7 GiB 13 MiB 0 B 1.7 GiB 18 TiB 0.01 1.00 - host cephao-2-ceph-osd-3
1 ssd 3.00000 1.00000 3.0 TiB 292 MiB 2.2 MiB 0 B 290 MiB 3.0 TiB 0.01 0.99 114 up osd.1
5 ssd 3.00000 1.00000 3.0 TiB 292 MiB 2.2 MiB 0 B 290 MiB 3.0 TiB 0.01 0.99 114 up osd.5
8 ssd 3.00000 1.00000 3.0 TiB 296 MiB 2.2 MiB 0 B 294 MiB 3.0 TiB 0.01 1.01 128 up osd.8
11 ssd 3.00000 1.00000 3.0 TiB 292 MiB 2.2 MiB 0 B 290 MiB 3.0 TiB 0.01 0.99 131 up osd.11
14 ssd 3.00000 1.00000 3.0 TiB 292 MiB 2.2 MiB 0 B 290 MiB 3.0 TiB 0.01 0.99 110 up osd.14
17 ssd 3.00000 1.00000 3.0 TiB 296 MiB 2.2 MiB 0 B 294 MiB 3.0 TiB 0.01 1.01 140 up osd.17
TOTAL 54 TiB 5.2 GiB 40 MiB 0 B 5.1 GiB 54 TiB 0.01
MIN/MAX VAR: 0.99/1.01 STDDEV: 0

Great! Ceph is up and running!
Hey, @guits! I don't have access to Ceph's Slack space.
If you share an email, I'll send you an invitation.
So, for some reason, my Deployments 3 and 4 are misbehaving as well; I'll look at those separately. My focus now is on Ceph Reef on Ubuntu 22.04 with UCA Bobcat (which brings Ceph Reef to 22.04), and Ceph Ansible stable-8.0.
HEY! It's working now! LOL

I just tried my "Deployment 5" with the OpenStack Heat template I just shared, and it worked! The current Ceph Ansible stable-8.0 branch deployed Ceph Reef on Ubuntu 22.04. Check this out, here are the final lines of the Ceph Ansible run:

TASK [set ceph crash install 'Complete'] ***************************************
ok: [cephao-1-ceph-cp-1]
PLAY [mons] ********************************************************************
TASK [get ceph status from the first monitor] **********************************
ok: [cephao-1-ceph-cp-1]
TASK [show ceph status for cluster ceph] ***************************************
ok: [cephao-1-ceph-cp-1] =>
msg:
- ' cluster:'
- ' id: a46d52d3-5a8b-4609-9a1f-e22e06f710f9'
- ' health: HEALTH_WARN'
- ' mons are allowing insecure global_id reclaim'
- ' Degraded data redundancy: 2/668 objects degraded (0.299%), 2 pgs degraded, 96 pgs undersized'
- ' '
- ' services:'
- ' mon: 3 daemons, quorum cephao-5-ceph-cp-1,cephao-5-ceph-cp-2,cephao-5-ceph-cp-3 (age 6m)'
- ' mgr: cephao-5-ceph-cp-2(active, since 3s), standbys: cephao-5-ceph-cp-3, cephao-5-ceph-cp-1'
- ' mds: 1/1 daemons up, 2 standby'
- ' osd: 18 osds: 18 up (since 3m), 18 in (since 4m)'
- ' rgw: 3 daemons active (3 hosts, 1 zones)'
- ' '
- ' data:'
- ' volumes: 1/1 healthy'
- ' pools: 12 pools, 337 pgs'
- ' objects: 222 objects, 588 KiB'
- ' usage: 513 MiB used, 54 TiB / 54 TiB avail'
- ' pgs: 2/668 objects degraded (0.299%)'
- ' 241 active+clean'
- ' 94 active+undersized'
- ' 2 active+undersized+degraded'
- ' '
PLAY RECAP *********************************************************************
cephao-1-ceph-client-1 : ok=79 changed=12 unreachable=0 failed=0 skipped=205 rescued=0 ignored=0
cephao-1-ceph-cp-1 : ok=526 changed=66 unreachable=0 failed=0 skipped=532 rescued=0 ignored=0
cephao-1-ceph-cp-2 : ok=370 changed=48 unreachable=0 failed=0 skipped=465 rescued=0 ignored=0
cephao-1-ceph-cp-3 : ok=382 changed=51 unreachable=0 failed=0 skipped=465 rescued=0 ignored=0
cephao-1-ceph-dash-1 : ok=58 changed=25 unreachable=0 failed=0 skipped=30 rescued=0 ignored=0
cephao-1-ceph-osd-1 : ok=160 changed=25 unreachable=0 failed=0 skipped=259 rescued=0 ignored=0
cephao-1-ceph-osd-2 : ok=147 changed=24 unreachable=0 failed=0 skipped=242 rescued=0 ignored=0
cephao-1-ceph-osd-3 : ok=149 changed=25 unreachable=0 failed=0 skipped=240 rescued=0 ignored=0
localhost : ok=1 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
INSTALLER STATUS ***************************************************************
Install Ceph Monitor : Complete (0:02:03)
Install Ceph Manager : Complete (0:00:16)
Install Ceph OSD : Complete (0:01:05)
Install Ceph MDS : Complete (0:00:20)
Install Ceph RGW : Complete (0:00:15)
Install Ceph Client : Complete (0:00:30)
Install Ceph RGW LoadBalancer : Complete (0:00:16)
Install Ceph Dashboard : Complete (0:00:26)
Install Ceph Grafana : Complete (0:00:21)
Install Ceph Node Exporter : Complete (0:01:29)
Install Ceph Crash : Complete (0:00:07)
real 9m44.665s
user 1m44.966s
sys 0m52.544s
YAY!!! Here's how I'm doing it:

$ openstack stack create -t ceph-basic-stack-ubuntu-5.yaml cephao-5
+---------------------+---------------------------------------------------------------------+
| Field | Value |
+---------------------+---------------------------------------------------------------------+
| id | b846688a-a2d5-4963-adc9-992bd9cd8cbd |
| stack_name | cephao-5 |
| description | |
| | HOT template to create standard setup for a Ceph Cluster, with |
| | Security Groups and Floating IPs. |
| | |
| | Total of 8 Instances for a basic Ceph Cluster. |
| | * 1 Ubuntu as Ceph Ansible |
| | * 3 Ubuntu as Ceph Control Plane, for MONs, MGRs, Dashboard and etc |
| | * 3 Ubuntu as Ceph OSDs |
| | * 1 Ubuntu as Ceph Client (RBD) |
| | |
| | Network Diagram - ASCII |
| | |
| | Control Network (with Internet access via router_i_1) |
| | |
| | ---------|ctrl_subnet|--------------------------------- |
| | | | | | | | | |
| | | | | | | | | |
| | --- --- --- --- --- --- | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | ----|CEPH ANSIBLE|--| |
| | |-|---|-|ceph_pub_subnet|-|--|-|---| | |
| | | | | | | | | | | | | | |---|CEPH GRAFANA|--| |
| | |C| |C| |C| |C| |C| |C| | |& PROMETHEUS| | |
| | |E| |E| |E| |E| |E| |E| | | |
| | |P| |P| |P| |P| |P| |P| ----|CEPH CLIENT|---- |
| | |H| |H| |H| |H| |H| |H| |
| | | | | | | | | | | | | | |
| | |C| |C| |C| |O| |O| |O| |
| | |P| |P| |P| |S| |S| |S| |
| | | | | | | | |D| |D| |D| |
| | --- --- --- --- --- --- |
| | | | | |
| | |ceph_pri_subnet| |
| | |
| creation_time | 2024-03-19T16:35:53Z |
| updated_time | None |
| stack_status | CREATE_IN_PROGRESS |
| stack_status_reason | Stack CREATE started |
+---------------------+---------------------------------------------------------------------+

After about 10 minutes, Ceph Ansible finally worked! Look:

$ openstack server list
+--------------------------------------+-------------------------+--------+----------------------------------------------------------------------------------------------------------------------+-------------------------+-----------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+-------------------------+--------+----------------------------------------------------------------------------------------------------------------------+-------------------------+-----------+
| b3668daf-704c-4613-b976-51e64a10ce1f | cephao-5-ceph-cp-1 | ACTIVE | cephao-5-ceph-public=10.192.0.11; cephao-5-control=10.232.244.42, 192.168.192.11 | ubuntu-22.04-20230110 | m1.small |
| f31ce00f-d72f-423a-8802-fa4c5dd609c8 | cephao-5-ceph-osd-3 | ACTIVE | cephao-5-ceph-cluster=10.192.1.23; cephao-5-ceph-public=10.192.0.23; cephao-5-control=10.232.242.64, 192.168.192.23 | ubuntu-22.04-20230110 | m1.medium |
| f3b3aef7-cd4f-4f0e-a15d-ed1903a33b14 | cephao-5-ceph-osd-1 | ACTIVE | cephao-5-ceph-cluster=10.192.1.21; cephao-5-ceph-public=10.192.0.21; cephao-5-control=10.232.244.87, 192.168.192.21 | ubuntu-22.04-20230110 | m1.medium |
| 382c2d04-59db-436b-9f82-973199dbc6d5 | cephao-5-ceph-cp-3 | ACTIVE | cephao-5-ceph-public=10.192.0.13; cephao-5-control=10.232.241.244, 192.168.192.13 | ubuntu-22.04-20230110 | m1.small |
| 78d089ad-e6ea-42af-8b6b-5c01724d7fe3 | cephao-5-ceph-client-1 | ACTIVE | cephao-5-ceph-public=10.192.0.5; cephao-5-control=10.232.242.113, 192.168.192.5 | ubuntu-22.04-20230110 | m1.small |
| d6df57d2-5adb-4644-8f7a-e82bd0558219 | cephao-5-ceph-cp-2 | ACTIVE | cephao-5-ceph-public=10.192.0.12; cephao-5-control=10.232.244.182, 192.168.192.12 | ubuntu-22.04-20230110 | m1.small |
| 241866c6-364d-4dfb-9b94-e828c32f020c | cephao-5-ceph-osd-2 | ACTIVE | cephao-5-ceph-cluster=10.192.1.22; cephao-5-ceph-public=10.192.0.22; cephao-5-control=10.232.244.105, 192.168.192.22 | ubuntu-22.04-20230110 | m1.medium |
| 82b90435-3e7c-41b3-8643-d3a01bae7583 | cephao-5-ceph-ansible-1 | ACTIVE | cephao-5-ceph-public=10.192.0.4; cephao-5-control=10.232.243.103, 192.168.192.4 | ubuntu-22.04-20230110 | m1.small |
| c0d897e9-8c8d-48c6-ad3e-abf1115d0321 | cephao-5-ceph-dash-1 | ACTIVE | cephao-5-ceph-public=10.192.0.10; cephao-5-control=10.232.242.248, 192.168.192.10 | ubuntu-22.04-20230110 | m1.small |
+--------------------------------------+-------------------------+--------+----------------------------------------------------------------------------------------------------------------------+-------------------------+-----------+

$ ssh [email protected]
manager@cephao-5-ceph-cp-1:~$ sudo ceph status
cluster:
id: a46d52d3-5a8b-4609-9a1f-e22e06f710f9
health: HEALTH_WARN
mons are allowing insecure global_id reclaim
Degraded data redundancy: 2/668 objects degraded (0.299%), 2 pgs degraded, 96 pgs undersized
services:
mon: 3 daemons, quorum cephao-5-ceph-cp-1,cephao-5-ceph-cp-2,cephao-5-ceph-cp-3 (age 11m)
mgr: cephao-5-ceph-cp-2(active, since 4m), standbys: cephao-5-ceph-cp-3, cephao-5-ceph-cp-1
mds: 1/1 daemons up, 2 standby
osd: 18 osds: 18 up (since 7m), 18 in (since 8m)
rgw: 3 daemons active (3 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 12 pools, 337 pgs
objects: 222 objects, 588 KiB
usage: 513 MiB used, 54 TiB / 54 TiB avail
pgs: 2/668 objects degraded (0.299%)
241 active+clean
94 active+undersized
2 active+undersized+degraded

$ sudo ceph osd df tree
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME
-1 54.00000 - 54 TiB 513 MiB 23 MiB 0 B 490 MiB 54 TiB 0 1.00 - root default
-7 18.00000 - 18 TiB 170 MiB 7.8 MiB 0 B 162 MiB 18 TiB 0 0.99 - host cephao-5-ceph-osd-1
2 ssd 3.00000 1.00000 3.0 TiB 27 MiB 1.2 MiB 0 B 26 MiB 3.0 TiB 0 0.96 0 up osd.2
5 ssd 3.00000 1.00000 3.0 TiB 28 MiB 1.7 MiB 0 B 26 MiB 3.0 TiB 0 0.98 65 up osd.5
8 ssd 3.00000 1.00000 3.0 TiB 32 MiB 1.3 MiB 0 B 30 MiB 3.0 TiB 0.00 1.11 64 up osd.8
11 ssd 3.00000 1.00000 3.0 TiB 27 MiB 1.2 MiB 0 B 26 MiB 3.0 TiB 0 0.96 32 up osd.11
14 ssd 3.00000 1.00000 3.0 TiB 28 MiB 1.2 MiB 0 B 26 MiB 3.0 TiB 0 0.97 144 up osd.14
17 ssd 3.00000 1.00000 3.0 TiB 28 MiB 1.2 MiB 0 B 26 MiB 3.0 TiB 0 0.97 32 up osd.17
-3 18.00000 - 18 TiB 174 MiB 7.8 MiB 0 B 166 MiB 18 TiB 0 1.02 - host cephao-5-ceph-osd-2
0 ssd 3.00000 1.00000 3.0 TiB 27 MiB 1.2 MiB 0 B 26 MiB 3.0 TiB 0 0.96 32 up osd.0
3 ssd 3.00000 1.00000 3.0 TiB 28 MiB 1.2 MiB 0 B 26 MiB 3.0 TiB 0 0.97 112 up osd.3
6 ssd 3.00000 1.00000 3.0 TiB 27 MiB 1.2 MiB 0 B 26 MiB 3.0 TiB 0 0.96 96 up osd.6
9 ssd 3.00000 1.00000 3.0 TiB 32 MiB 1.7 MiB 0 B 30 MiB 3.0 TiB 0.00 1.12 33 up osd.9
12 ssd 3.00000 1.00000 3.0 TiB 27 MiB 1.2 MiB 0 B 26 MiB 3.0 TiB 0 0.96 32 up osd.12
15 ssd 3.00000 1.00000 3.0 TiB 32 MiB 1.3 MiB 0 B 30 MiB 3.0 TiB 0.00 1.11 32 up osd.15
-5 18.00000 - 18 TiB 170 MiB 7.8 MiB 0 B 162 MiB 18 TiB 0 0.99 - host cephao-5-ceph-osd-3
1 ssd 3.00000 1.00000 3.0 TiB 27 MiB 1.2 MiB 0 B 26 MiB 3.0 TiB 0 0.96 32 up osd.1
4 ssd 3.00000 1.00000 3.0 TiB 28 MiB 1.2 MiB 0 B 26 MiB 3.0 TiB 0 0.97 64 up osd.4
7 ssd 3.00000 1.00000 3.0 TiB 32 MiB 1.3 MiB 0 B 30 MiB 3.0 TiB 0.00 1.11 96 up osd.7
10 ssd 3.00000 1.00000 3.0 TiB 28 MiB 1.2 MiB 0 B 26 MiB 3.0 TiB 0 0.97 48 up osd.10
13 ssd 3.00000 1.00000 3.0 TiB 28 MiB 1.7 MiB 0 B 26 MiB 3.0 TiB 0 0.98 1 up osd.13
16 ssd 3.00000 1.00000 3.0 TiB 27 MiB 1.2 MiB 0 B 26 MiB 3.0 TiB 0 0.96 96 up osd.16
TOTAL 54 TiB 513 MiB 23 MiB 0 B 490 MiB 54 TiB 0
MIN/MAX VAR: 0.96/1.12 STDDEV: 0

manager@cephao-5-ceph-cp-1:~$ lsb_release -ra
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.4 LTS
Release: 22.04
Codename: jammy

manager@cephao-5-ceph-cp-1:~$ dpkg -l | grep ceph-common
ii ceph-common 18.2.0-0ubuntu3~cloud0 amd64 common utilities to mount and interact with a ceph storage cluster
ii python3-ceph-common 18.2.0-0ubuntu3~cloud0 all Python 3 utility libraries for Ceph

AWESOME!!! Ceph Reef is up and running on Ubuntu 22.04! Let me test on Ubuntu 24.04 next.
I can confirm that Ceph Ansible stable-8.0 also deploys Ceph Reef on Ubuntu 24.04. Check it out:

root@cephao-6-ceph-cp-1:~# lsb_release -ra
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu Noble Numbat (development branch)
Release: 24.04
Codename: noble

root@cephao-6-ceph-cp-1:~# dpkg -l | grep ceph-common
ii ceph-common 18.2.0-0ubuntu7 amd64 common utilities to mount and interact with a ceph storage cluster
ii python3-ceph-common 18.2.0-0ubuntu7 all Python 3 utility libraries for Ceph

root@cephao-6-ceph-cp-1:~# ceph status
cluster:
id: a46d52d3-5a8b-4609-9a1f-e22e06f710f9
health: HEALTH_WARN
mons are allowing insecure global_id reclaim
Degraded data redundancy: 2/662 objects degraded (0.302%), 2 pgs degraded, 70 pgs undersized
3 mgr modules have recently crashed
services:
mon: 3 daemons, quorum cephao-6-ceph-cp-1,cephao-6-ceph-cp-2,cephao-6-ceph-cp-3 (age 106m)
mgr: cephao-6-ceph-cp-3(active, since 104m), standbys: cephao-6-ceph-cp-2, cephao-6-ceph-cp-1
mds: 1/1 daemons up, 2 standby
osd: 18 osds: 18 up (since 103m), 18 in (since 103m); 26 remapped pgs
rgw: 3 daemons active (3 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 12 pools, 337 pgs
objects: 220 objects, 586 KiB
usage: 525 MiB used, 54 TiB / 54 TiB avail
pgs: 2/662 objects degraded (0.302%)
241 active+clean
68 active+undersized
22 active+clean+remapped
3 active+clean+remapped+scrubbing
2 active+undersized+degraded
1 active+clean+remapped+scrubbing+deep
progress:
Global Recovery Event (102m)
[======================......] (remaining: 26m)

Awesome!

It's worth mentioning a couple of gotchas with the variables. If you want a "Docker-free Ceph deployment", there are additional variables to set as well (edit: that should actually be the first thing to try, lol); a rough sketch of the group_vars involved is below.

It seems that this issue can be closed after all! It was easier than I thought... Let's see how it'll play with OpenStack Ansible! lol Good opportunity to share ideas and Heat templates! :-D
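Everything except dashboard_enabled in the sketch below comes from the Heat template diff earlier in this thread; dashboard_enabled: false is only an assumption about how to keep the containerized grafana/prometheus daemons out of a Docker-free setup, not something confirmed above.

---
# group_vars/all.yml -- sketch based on the Heat template diff above
yes_i_know: true                 # as set in the template diff for stable-8.0
cluster: ceph
ntp_daemon_type: chronyd
ceph_origin: distro              # use Ubuntu / UCA Bobcat packages
ceph_stable_release: reef
containerized_deployment: false  # package-based daemons, no containers
docker: false
# Assumption: disabling the dashboard also keeps the containerized
# grafana/prometheus daemons away, so the node never needs Docker at all.
dashboard_enabled: false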
This enforces docker.io and docker respectively for `container_package_name` and `container_service_name` by default for Ubuntu distribution. Fixes: #7496 Signed-off-by: Guillaume Abrioux <[email protected]>
Glad to read that.

As far as I remember, container_package_name and container_service_name need to carry the right values for your distribution. The dashboard uses grafana / prometheus / etc. as containerized daemons, whether or not your Ceph deployment itself is containerized.

In fact, we probably want to update the defaults for Ubuntu. See #7523
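Until that change is available in a released tag, the same values can presumably be pinned from group_vars; a minimal sketch, using the values from the commit message referenced above (docker.io as the Ubuntu package, docker as the systemd service):

---
# group_vars/all.yml -- sketch; mirrors what the referenced fix enforces
# by default on Ubuntu
container_package_name: docker.io
container_service_name: docker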
This enforces docker.io and docker respectively for `container_package_name` and `container_service_name` by default for Ubuntu distribution. Fixes: #7496 Signed-off-by: Guillaume Abrioux <[email protected]> (cherry picked from commit ef5a09d)
Greetings to all,
As an avid supporter and long-term user of Ceph Ansible, I've had the privilege of deploying Ceph in numerous corporate environments over the years, alongside educating many students on its deployment and maintenance. Our organization extensively utilizes Ubuntu and its Ubuntu Cloud Archive, appreciating the seamless upgrade paths it offers. This allows for the deployment of an LTS distribution and incremental Ceph releases atop the same LTS until the next LTS release. Transitioning between LTS releases while maintaining the same Ceph version is a breeze, fully backed by Canonical's support, leveraging its Debian heritage. Here’s an overview for clarity:
We manage several Ceph Ansible pipelines, facilitating individual Ceph deployments in a CI/CD-like pipeline, including Ubuntu and Ceph upgrades when necessary. However, we're currently facing challenges with the stable-8.0 branch of Ceph Ansible, particularly with its compatibility with Ubuntu 22.04 (with Ubuntu Cloud Archive Bobcat enabled for Ceph Reef) and Ubuntu 24.04 (which ships Ceph Reef by default). It appears that stable-8.0 exhibits significant issues, impacting not only our operations but also the broader OpenStack Ansible (OSA) community. The OSA project is constrained to the stable-7.0 branch and Ceph Quincy, despite Reef being accessible in Ubuntu repositories.

References:
Ceph Pipelines Configuration
For enabling Ubuntu's UCA repositories, we opt for manual configuration over predefined variables such as ceph_repository: uca and its related settings: we enable the UCA pocket by hand (add-apt-repository) and do not set the ceph-ansible repository variables at all. This approach allows Ceph Ansible to utilize the distro origin (ceph_origin: distro), automatically leveraging UCA when appropriate and simplifying the process without additional variables/logic.

Deployment Scenarios
Deployment 1
- Ceph Ansible branch: stable-5.0
- Ansible from the Ubuntu archive (apt install ansible)
- Configured Ceph Ansible Variables
- Ceph Ansible Run: Works!

Deployment 2
- UCA enabled with: add-apt-repository cloud-archive:wallaby
- Ceph Ansible branch: stable-6.0
- Ansible from the PPA (add-apt-repository ppa:ansible/ansible-2.10)
- Configured Ceph Ansible Variables
- Ceph Ansible Run: Works!

Deployment 3
- UCA enabled with: add-apt-repository cloud-archive:yoga
- Ceph Ansible branch: stable-7.0
- Ansible from the PPA (add-apt-repository ppa:ansible/ansible)
- Configured Ceph Ansible Variables
- Ceph Ansible Run: Works!

Deployment 4
- Ceph Ansible branch: stable-7.0
- Ansible from the Ubuntu archive (apt install ansible-core)
- Configured Ceph Ansible Variables
- Ceph Ansible Run: Works!
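To make the "manual UCA + distro origin" approach above concrete, here is a rough cloud-init sketch; it only mirrors the runcmd lines and variables already shown in the Heat templates elsewhere in this thread, with Bobcat as an example release:

#cloud-config
# Sketch: enable an Ubuntu Cloud Archive pocket by hand and let ceph-ansible
# pick up whatever Ceph version the distro repositories then provide.
runcmd:
  - [ sh, -c, "add-apt-repository -y cloud-archive:bobcat" ]
  - [ sh, -c, "apt update ; apt upgrade -y" ]

# Corresponding ceph-ansible group_vars entries (kept deliberately minimal):
#   ceph_origin: distro
#   ceph_stable_release: reef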
IMPORTANT NOTES

Details about a working Ceph Ansible deployment: I can deploy Ceph with Ceph Ansible up to and including stable-7.0, and I have a good ceph.conf example from those deployments. So far, so good: ceph status always works.

Deployment 5 (the failing one)
- UCA enabled with: add-apt-repository cloud-archive:bobcat
- Ceph Ansible branch: stable-8.0
- Ansible from the Ubuntu archive (apt install ansible-core)
- Same approach as the stable-7.0 deployments, but with ceph_stable_release: reef, targeting Ubuntu 22.04 + Bobcat
- Configured Ceph Ansible Variables
- Ceph Ansible Run: FAILED!!! Error: the TASK [ceph-mon : Waiting for the monitor(s) to form the quorum...] failure quoted earlier in this thread
When I went to the Ceph MON to check its configuration, many lines were missing from ceph.conf!

- stable-8.0 is likewise not working for a Reef deployment on Ubuntu 24.04.
- A possible fallback is staying on stable-7.0 (untested), though not ideal for long-term use.

OpenStack Ansible and Ceph Ansible Integration
The synergy between OpenStack Ansible (OSA) and Ceph Ansible is a cornerstone for efficient infrastructure deployment, enabling seamless OpenStack and Ceph installations with a unified Ansible Inventory. This integration simplifies processes, allowing users to deploy Ceph within OSA environments through streamlined commands.
For a traditional Ceph deployment, the process involves:
Conversely, deploying Ceph via OSA is more integrated:
cd /opt/openstack-ansible/playbooks/
openstack-ansible ceph-install.yml
This approach not only simplifies the deployment process but also ensures a more cohesive infrastructure setup with OSA.
However, challenges arise with the stable-8.0 branch of Ceph Ansible, particularly its incompatibility with Ubuntu 22.04 (Bobcat/Reef) and Ubuntu 24.04 (Reef by default), and its adverse impact on OSA integration. Notably, the removal of ceph_conf_overrides disrupts this integration, a change widely discussed within the community for its negative implications.

OSA's preference for LXC containers, and its avoidance of Docker/Prometheus, further complicates the situation. The push towards cephadm, which mandates Docker, conflicts with OSA's (and many users') deployment strategies, particularly within LXC or LXD environments.

A significant point of contention is the amendment made in Ceph Ansible stable-8.0, documented here: 14b4abf#diff-a57eaa1f236c68f6acc319d4f9710af8b513741d044e8dd4ddf544c1c7d09cefL144. This change, among others (9c467e4), has sparked discussions about the need for stable-8.0 to better align with user needs and the operational realities of OSA deployments, urging a reconsideration or adaptation of these changes to facilitate smoother integration and functionality.
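For readers who have not used it, ceph_conf_overrides is the group_vars dictionary that ceph-ansible merges, section by section, into the rendered ceph.conf; the keys and values below are purely illustrative placeholders, not recommendations:

---
# group_vars/all.yml (illustrative placeholders only)
ceph_conf_overrides:
  global:
    osd_pool_default_size: 3
    osd_pool_default_min_size: 2
  mon:
    mon_allow_pool_delete: false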
Conclusion and Call to Action

The modifications in Ceph Ansible stable-8.0 do not fully account for its longstanding use cases, particularly concerning its integration with the OSA community. These changes, including the deprecation of ceph_conf_overrides and a forced shift towards containerized deployments, disrupt established workflows and compatibility.

I strongly advocate for the Ceph community to consider reverting the disruptive changes in the stable-8.0 branch, restoring feature parity with stable-7.0, and providing clear guidance and support for transitioning OpenStack Ansible and other dependents to newer versions. It's crucial to maintain Ceph Ansible's utility and accessibility for both individual and organizational users.