ceph-config: fix calculation of num_osds
#7502
Conversation
The number of OSDs defined by the `lvm_volumes` variable is added to `num_osds` in task `Count number of osds for lvm scenario`. Therefore these devices must not be counted in task `Set_fact num_osds (add existing osds)`.

There are currently three problems with the existing approach:

1. Bluestore DB and WAL devices are counted as OSDs.
2. `lvm_volumes` supports a second notation to directly specify logical volumes instead of devices when the `data_vg` key exists. This scenario is not yet accounted for.
3. The `difference` filter used to remove devices from `lvm_volumes` returns a list of **unique** elements, thus not accounting for multiple OSDs on a single device.

The first problem is solved by filtering the list of logical volumes for devices used as `type` `block`. For the second and third problems, lists are created from `lvm_volumes` containing either paths to devices or logical volume devices. For the second problem, the output of `ceph-volume` is simply filtered for `lv_path`s appearing in the list of logical volume devices described above. To solve the third problem, the remaining OSDs in the output are compiled into a list of their used devices, which is then filtered for devices appearing in the list of devices from `lvm_volumes`.

Fixes: ceph#7435

Signed-off-by: Jan Horstmann <[email protected]>
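The counting logic can be summarized with a small Python sketch. This is illustrative only — the role itself implements it with Jinja2 filters in `roles/ceph-config/tasks/main.yml` — and the sample data, the simplified shape of the `ceph-volume lvm list --format json` output, and the `/dev/<vg>/<lv>` path format are assumptions made for the example:

```python
# Illustrative sketch of the num_osds calculation described above.
lvm_volumes = [
    {"data": "data-lv1", "data_vg": "test_group"},   # LV notation (data_vg present)
    {"data": "/dev/sdb"},                            # device notation
    {"data": "/dev/sdb"},                            # second OSD on the same device
]

# Simplified stand-in for `ceph-volume lvm list --format json` output.
ceph_volume_list = {
    "0": [
        {"type": "block", "lv_path": "/dev/test_group/data-lv1", "devices": ["/dev/sda"]},
        {"type": "db", "lv_path": "/dev/journals/journal1", "devices": ["/dev/sdd"]},
    ],
    "1": [{"type": "block", "lv_path": "/dev/ceph-x/osd-block-1", "devices": ["/dev/sdb"]}],
    "2": [{"type": "block", "lv_path": "/dev/ceph-y/osd-block-2", "devices": ["/dev/sdb"]}],
}

# "Count number of osds for lvm scenario": every lvm_volumes entry is one OSD.
num_osds = len(lvm_volumes)

# Devices and logical volumes that are already counted via lvm_volumes.
lvm_volumes_devices = [v["data"] for v in lvm_volumes if "data_vg" not in v]
lvm_volumes_lvs = ["/dev/{}/{}".format(v["data_vg"], v["data"])
                   for v in lvm_volumes if "data_vg" in v]

# "Set_fact num_osds (add existing osds)":
#   problem 1 - only LVs of type 'block' are OSDs (DB/WAL LVs are skipped),
#   problem 2 - LVs passed directly via data_vg are excluded by their lv_path,
#   problem 3 - devices are matched one OSD at a time, so two OSDs sharing one
#               device listed twice in lvm_volumes are both excluded.
for lvs in ceph_volume_list.values():
    for lv in lvs:
        if lv["type"] != "block":
            continue
        if lv["lv_path"] in lvm_volumes_lvs:
            continue
        match = next((d for d in lv["devices"] if d in lvm_volumes_devices), None)
        if match is not None:
            lvm_volumes_devices.remove(match)  # consume one entry per matching OSD
            continue
        num_osds += 1  # an existing OSD not described by lvm_volumes

print(num_osds)  # 3: every existing OSD is already covered by lvm_volumes
```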
jenkins test centos-container-external_clients
jenkins test centos-container-rbdmirror
@clwluvw that's probably the kind of task which deserves to be converted into a module
@janhorstmann I might be missing something, but with the current implementation:

inventory:

```
[osds]
osd0 lvm_volumes="[{'data': 'data-lv1', 'data_vg': 'test_group'},{'data': 'data-lv2', 'data_vg': 'test_group', 'db': 'journal1', 'db_vg': 'journals'}]"
osd1 lvm_volumes="[{'data': 'data-lv1', 'data_vg': 'test_group'},{'data': 'data-lv2', 'data_vg': 'test_group'}]" dmcrypt=true
osd2 lvm_volumes="[{'data': 'data-lv1', 'data_vg': 'test_group'},{'data': 'data-lv2', 'data_vg': 'test_group', 'db': 'journal1', 'db_vg': 'journals'}]"
osd3 lvm_volumes="[{'data': '/dev/sda'}, {'data': '/dev/sdb'}, {'data': '/dev/sdc'}]"
```

I see this in the log:

```
TASK [ceph-config : Set_fact num_osds (add existing osds)] *********************
task path: /home/guillaume/workspaces/ceph-ansible/7502/roles/ceph-config/tasks/main.yml:93
Wednesday 20 March 2024 16:52:31 +0100 (0:00:00.641)       0:01:59.236 *******
ok: [osd0] => changed=false
  ansible_facts:
    num_osds: '2'
ok: [osd1] => changed=false
  ansible_facts:
    num_osds: '2'
ok: [osd2] => changed=false
  ansible_facts:
    num_osds: '2'
ok: [osd3] => changed=false
  ansible_facts:
    num_osds: '3'
```

Is there anything wrong here?
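For reference, the counts in that log match what one would expect from the inventory above, since each `lvm_volumes` entry describes exactly one OSD and the `db`/`db_vg` keys only add a DB device. A tiny, illustrative recomputation (not part of the role, hosts and values copied from the inventory and log above):

```python
# Expected num_osds per host, derived only from the inventory shown above.
lvm_volumes_per_host = {
    "osd0": [{"data": "data-lv1", "data_vg": "test_group"},
             {"data": "data-lv2", "data_vg": "test_group", "db": "journal1", "db_vg": "journals"}],
    "osd1": [{"data": "data-lv1", "data_vg": "test_group"},
             {"data": "data-lv2", "data_vg": "test_group"}],
    "osd2": [{"data": "data-lv1", "data_vg": "test_group"},
             {"data": "data-lv2", "data_vg": "test_group", "db": "journal1", "db_vg": "journals"}],
    "osd3": [{"data": "/dev/sda"}, {"data": "/dev/sdb"}, {"data": "/dev/sdc"}],
}

for host, volumes in lvm_volumes_per_host.items():
    print(host, len(volumes))  # osd0 2, osd1 2, osd2 2, osd3 3 -- matches the log
```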
Thanks for taking the time to look into this, @guits. The output shown is correct. Is this by chance from a first run of
no, this is from a re-run after a fresh install.
Before I dive deeper into this, could you please confirm that the output is actually from the current implementation? I noticed the number
Could this be a run with the version containing the fix? In that case I would hope that it is correct ;) If that number is unrelated, could you show the output of
I cloned the repo at a new path and named it with the id of your PR, but it was indeed with the branch
I'm gonna do more tests and double-check I didn't miss a detail.
Thank you for bearing with me here. I did not exactly reproduce your test environment, but set up a single instance with four volumes on four volume groups on four devices
matching this config:
Using branch
So on the second run, in addition to the count of items in
If we start to bring
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.
I am still interested in landing this. Let me know if there is anything I can do to move this along.
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.
@guits did you have time to look into this yet?
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.
This pull request has been automatically closed due to inactivity. Please re-open if these changes are still required.