
feat: crmsh workflow and SUSE support #186

Merged
merged 6 commits on Feb 9, 2024

Conversation

marcelmamula
Contributor

Enhancement: This PR adds support for the crmsh shell and covers most of its functionality, with a few exceptions.
List of changes:

  1. Added a crmsh workflow, reusing existing pcs steps where possible and creating the rest from scratch, with a Jinja2 template for each crmsh command.
  2. Adjusted main.yml to make loading shell-specific files easier, by removing pcs-specific names and simplifying.
  3. Renamed a few pcs shell files to go along with point 2.
  4. Changed "Create and push CIB" to run_once: true, because running CIB configuration steps on multiple nodes can produce corrupted and misaligned CIB files. This happens every time with crmsh, and it would be best if the pcs workflow were also cautious about it.
  5. Added corosync.j2 for creating the corosync configuration.
  6. The crmsh steps put the cluster in maintenance mode to freeze the configuration, avoiding versioning errors caused by cluster resource status changes.
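Points 4 and 6 can be sketched roughly as follows. This is a minimal illustration only, with hypothetical task names and a hypothetical `__cib_file` variable, not the role's actual tasks:

```yaml
# Sketch: freeze the cluster, then build and push the CIB from one node only.
- name: Put cluster in maintenance mode (sketch)
  ansible.builtin.command:
    cmd: crm configure property maintenance-mode=true
  run_once: true  # avoid concurrent CIB edits corrupting the configuration
  changed_when: true

- name: Push CIB from a single node (sketch)
  ansible.builtin.command:
    cmd: crm configure load replace {{ __cib_file }}  # __cib_file is hypothetical
  run_once: true
  changed_when: true

- name: Leave maintenance mode (sketch)
  ansible.builtin.command:
    cmd: crm configure property maintenance-mode=false
  run_once: true
  changed_when: true
```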

The following functionality was not part of the current scope:

  • qnet/qdevice
  • resource bundles

Reason: Support SUSE systems and allow wider use, including by community.sap_install, which is being enabled for SUSE as well.

Result: Testing was completed successfully on RHEL 9.2 and SLES for SAP 15 SP5 on AWS. It included:

  • Running on fresh servers
  • Destroying clusters
  • Re-executing and checking idempotency of the results
  • Consuming the role via community.sap_install
  • Local ansible-lint checks throughout development

Issue Tracker Tickets (Jira or BZ if any): #175


codecov bot commented Feb 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (2450454) 0.00% compared to head (8eab942) 50.54%.
Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##           main     #186       +/-   ##
=========================================
+ Coverage      0   50.54%   +50.54%     
=========================================
  Files         0        1        +1     
  Lines         0       91       +91     
  Branches      0       12       +12     
=========================================
+ Hits          0       46       +46     
- Misses        0       45       +45     
Flag Coverage Δ
sanity 50.54% <ø> (?)

Flags with carried forward coverage won't be shown.


@marcelmamula changed the title from "crmsh workflow and SUSE support" to "feat: crmsh workflow and SUSE support" on Feb 8, 2024
Member

@tomjelinek tomjelinek left a comment


I'm getting a lot of errors from the linters in CI. Please fix these:

yamllint:

./vars/Suse.yml
  19:21     warning  too few spaces before comment  (comments)
  23:81     error    line too long (82 > 80 characters)  (line-length)
./tasks/main.yml
  95:81     error    line too long (87 > 80 characters)  (line-length)
./tasks/shell_crmsh/crm-cib-constraint-location.yml
  7:81      error    line too long (99 > 80 characters)  (line-length)
  8:81      error    line too long (104 > 80 characters)  (line-length)
  13:81     error    line too long (93 > 80 characters)  (line-length)
  18:81     error    line too long (85 > 80 characters)  (line-length)
  21:81     error    line too long (95 > 80 characters)  (line-length)
  29:81     error    line too long (94 > 80 characters)  (line-length)
./tasks/shell_crmsh/create-and-push-cib.yml
  237:81    error    line too long (81 > 80 characters)  (line-length)
./tasks/shell_crmsh/crm-cib-constraint-ticket.yml
  12:81     error    line too long (93 > 80 characters)  (line-length)
  17:81     error    line too long (85 > 80 characters)  (line-length)
  20:81     error    line too long (95 > 80 characters)  (line-length)
  28:81     error    line too long (96 > 80 characters)  (line-length)
./tasks/shell_crmsh/crm-cib-constraint-set.yml
  3:88      warning  too few spaces before comment  (comments)
  3:81      error    line too long (108 > 80 characters)  (line-length)
  12:81     error    line too long (93 > 80 characters)  (line-length)
  17:81     error    line too long (85 > 80 characters)  (line-length)
  20:81     error    line too long (95 > 80 characters)  (line-length)
  25:89     warning  too few spaces before comment  (comments)
  25:81     error    line too long (109 > 80 characters)  (line-length)
  33:81     error    line too long (81 > 80 characters)  (line-length)
  39:81     error    line too long (82 > 80 characters)  (line-length)
./tasks/shell_crmsh/crm-cib-resource-primitive.yml
  11:81     error    line too long (83 > 80 characters)  (line-length)
  20:28     warning  too few spaces before comment  (comments)
./tasks/shell_crmsh/crm-cib-stonith-level.yml
  12:81     error    line too long (88 > 80 characters)  (line-length)
./tasks/shell_crmsh/crm-cluster-properties.yml
  6:81      error    line too long (110 > 80 characters)  (line-length)
./tasks/shell_crmsh/crm-cib-constraint-order.yml
  12:81     error    line too long (93 > 80 characters)  (line-length)
  17:81     error    line too long (85 > 80 characters)  (line-length)
  20:81     error    line too long (95 > 80 characters)  (line-length)
  28:81     error    line too long (99 > 80 characters)  (line-length)
  29:81     error    line too long (95 > 80 characters)  (line-length)
  30:81     error    line too long (93 > 80 characters)  (line-length)
./tasks/shell_crmsh/sbd.yml
  15:81     error    line too long (82 > 80 characters)  (line-length)
  16:81     error    line too long (92 > 80 characters)  (line-length)
  17:81     error    line too long (81 > 80 characters)  (line-length)
  18:81     error    line too long (93 > 80 characters)  (line-length)
./tasks/shell_crmsh/crm-cib-resource-group.yml
  6:81      error    line too long (84 > 80 characters)  (line-length)
  11:81     error    line too long (83 > 80 characters)  (line-length)
  14:81     error    line too long (86 > 80 characters)  (line-length)
./tasks/shell_crmsh/cluster-start-and-reload.yml
  64:81     error    line too long (85 > 80 characters)  (line-length)
  98:81     error    line too long (85 > 80 characters)  (line-length)
  103:81    error    line too long (86 > 80 characters)  (line-length)
./tasks/shell_crmsh/crm-cib-resource-clone.yml
  3:81      error    line too long (113 > 80 characters)  (line-length)
  6:81      error    line too long (106 > 80 characters)  (line-length)
  11:81     error    line too long (91 > 80 characters)  (line-length)
  16:81     error    line too long (83 > 80 characters)  (line-length)
  19:81     error    line too long (93 > 80 characters)  (line-length)
  27:81     error    line too long (89 > 80 characters)  (line-length)
./tasks/shell_crmsh/crm-cib-constraint-colocation.yml
  3:81      error    line too long (81 > 80 characters)  (line-length)
  12:81     error    line too long (93 > 80 characters)  (line-length)
  17:81     error    line too long (85 > 80 characters)  (line-length)
  20:81     error    line too long (95 > 80 characters)  (line-length)
  28:81     error    line too long (96 > 80 characters)  (line-length)
  34:81     error    line too long (97 > 80 characters)  (line-length)

ansible-lint:

yaml[line-length]: Line too long (87 > 80 characters)
tasks/main.yml:95
yaml[line-length]: Line too long (85 > 80 characters)
tasks/shell_crmsh/cluster-start-and-reload.yml:64
yaml[line-length]: Line too long (85 > 80 characters)
tasks/shell_crmsh/cluster-start-and-reload.yml:98
yaml[line-length]: Line too long (86 > 80 characters)
tasks/shell_crmsh/cluster-start-and-reload.yml:103
yaml[line-length]: Line too long (81 > 80 characters)
tasks/shell_crmsh/create-and-push-cib.yml:237
yaml[line-length]: Line too long (81 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-colocation.yml:3
yaml[line-length]: Line too long (93 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-colocation.yml:12
yaml[line-length]: Line too long (85 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-colocation.yml:17
yaml[line-length]: Line too long (95 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-colocation.yml:20
yaml[line-length]: Line too long (96 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-colocation.yml:28
yaml[line-length]: Line too long (97 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-colocation.yml:34
yaml[line-length]: Line too long (99 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-location.yml:7
yaml[line-length]: Line too long (104 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-location.yml:8
yaml[line-length]: Line too long (93 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-location.yml:13
yaml[line-length]: Line too long (85 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-location.yml:18
yaml[line-length]: Line too long (95 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-location.yml:21
yaml[line-length]: Line too long (94 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-location.yml:29
yaml[line-length]: Line too long (93 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-order.yml:12
yaml[line-length]: Line too long (85 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-order.yml:17
yaml[line-length]: Line too long (95 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-order.yml:20
yaml[line-length]: Line too long (99 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-order.yml:28
yaml[line-length]: Line too long (95 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-order.yml:29
yaml[line-length]: Line too long (93 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-order.yml:30
yaml[line-length]: Line too long (108 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-set.yml:3
yaml[comments]: Too few spaces before comment
tasks/shell_crmsh/crm-cib-constraint-set.yml:3
yaml[line-length]: Line too long (93 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-set.yml:12
yaml[line-length]: Line too long (85 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-set.yml:17
yaml[line-length]: Line too long (95 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-set.yml:20
yaml[line-length]: Line too long (109 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-set.yml:25
yaml[comments]: Too few spaces before comment
tasks/shell_crmsh/crm-cib-constraint-set.yml:25
yaml[line-length]: Line too long (81 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-set.yml:33
yaml[line-length]: Line too long (82 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-set.yml:39
yaml[line-length]: Line too long (93 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-ticket.yml:12
yaml[line-length]: Line too long (85 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-ticket.yml:17
yaml[line-length]: Line too long (95 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-ticket.yml:20
yaml[line-length]: Line too long (96 > 80 characters)
tasks/shell_crmsh/crm-cib-constraint-ticket.yml:28
yaml[line-length]: Line too long (113 > 80 characters)
tasks/shell_crmsh/crm-cib-resource-clone.yml:3
yaml[line-length]: Line too long (106 > 80 characters)
tasks/shell_crmsh/crm-cib-resource-clone.yml:6
yaml[line-length]: Line too long (91 > 80 characters)
tasks/shell_crmsh/crm-cib-resource-clone.yml:11
yaml[line-length]: Line too long (83 > 80 characters)
tasks/shell_crmsh/crm-cib-resource-clone.yml:16
yaml[line-length]: Line too long (93 > 80 characters)
tasks/shell_crmsh/crm-cib-resource-clone.yml:19
yaml[line-length]: Line too long (89 > 80 characters)
tasks/shell_crmsh/crm-cib-resource-clone.yml:27
yaml[line-length]: Line too long (84 > 80 characters)
tasks/shell_crmsh/crm-cib-resource-group.yml:6
yaml[line-length]: Line too long (83 > 80 characters)
tasks/shell_crmsh/crm-cib-resource-group.yml:11
yaml[line-length]: Line too long (86 > 80 characters)
tasks/shell_crmsh/crm-cib-resource-group.yml:14
yaml[line-length]: Line too long (83 > 80 characters)
tasks/shell_crmsh/crm-cib-resource-primitive.yml:11
yaml[comments]: Too few spaces before comment
tasks/shell_crmsh/crm-cib-resource-primitive.yml:20
yaml[line-length]: Line too long (88 > 80 characters)
tasks/shell_crmsh/crm-cib-stonith-level.yml:12
yaml[line-length]: Line too long (110 > 80 characters)
tasks/shell_crmsh/crm-cluster-properties.yml:6
yaml[line-length]: Line too long (82 > 80 characters)
tasks/shell_crmsh/sbd.yml:15
yaml[line-length]: Line too long (92 > 80 characters)
tasks/shell_crmsh/sbd.yml:16
yaml[line-length]: Line too long (81 > 80 characters)
tasks/shell_crmsh/sbd.yml:17
yaml[line-length]: Line too long (93 > 80 characters)
tasks/shell_crmsh/sbd.yml:18
yaml[comments]: Too few spaces before comment
vars/Suse.yml:19
yaml[line-length]: Line too long (82 > 80 characters)
vars/Suse.yml:23

@richm
Contributor

richm commented Feb 8, 2024

If you need help with line wrapping, see https://linux-system-roles.github.io/documentation/howto/working-with-ansible-jinja-code-and-filters.html "How to solve some common ansible-lint issues"
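For illustration, one common way to satisfy yamllint's 80-character limit is a folded block scalar (`>-`), which joins the wrapped lines with spaces at runtime. This is a hypothetical task with made-up variable names, not one of the files flagged above:

```yaml
- name: Configure location constraint (hypothetical example)
  ansible.builtin.command:
    cmd: >-
      crm configure location {{ constraint_id }}
      {{ constraint_resource }} {{ constraint_score }}:
      {{ constraint_node }}
  changed_when: true
```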

@marcelmamula
Contributor Author

> If you need help with line wrapping, see https://linux-system-roles.github.io/documentation/howto/working-with-ansible-jinja-code-and-filters.html "How to solve some common ansible-lint issues"

My ansible-lint is not showing line-length warnings, but I am fixing them now based on Tomas's list.

@richm
Contributor

richm commented Feb 8, 2024

> If you need help with line wrapping, see https://linux-system-roles.github.io/documentation/howto/working-with-ansible-jinja-code-and-filters.html "How to solve some common ansible-lint issues"

> My ansible lint is not showing line length warnings, but I am updating them now based on Tomas's list.

How are you running ansible-lint?

@marcelmamula
Contributor Author

> If you need help with line wrapping, see https://linux-system-roles.github.io/documentation/howto/working-with-ansible-jinja-code-and-filters.html "How to solve some common ansible-lint issues"

> My ansible lint is not showing line length warnings, but I am updating them now based on Tomas's list.

> How are you running ansible-lint?

On a Linux bastion machine, with a simple config that does not skip any line-length checks:
skip_list:

  • no-tabs
  • run-once[task]
  • risky-shell-pipe
  • ignore-errors
  • role-name[path]
  • fqcn[keyword]
  • name[template]

@richm
Contributor

richm commented Feb 8, 2024

This repo CI uses the .ansible-lint in the ha_cluster root directory. https://github.com/linux-system-roles/ha_cluster/blob/main/.ansible-lint

skip_list:
  - fqcn-builtins
  - var-naming[no-role-prefix]
  - sanity[cannot-ignore]  # wokeignore:rule=sanity

The reason why we don't see line length errors in the github action CI is because the latest ansible-lint github action only works on collections - so we have to convert the role to collection format before running ansible-lint - https://github.com/linux-system-roles/ha_cluster/blob/main/.github/workflows/ansible-lint.yml#L37 - when we convert to collection format, we cannot preserve line wrapping (due to ruamel roundtrip), so we disable line-length checking.

@tomjelinek
Member

> This repo CI uses the .ansible-lint in the ha_cluster root directory. https://github.com/linux-system-roles/ha_cluster/blob/main/.ansible-lint
>
> skip_list:
>   - fqcn-builtins
>   - var-naming[no-role-prefix]
>   - sanity[cannot-ignore]  # wokeignore:rule=sanity
>
> The reason why we don't see line length errors in the github action CI is because the latest ansible-lint github action only works on collections - so we have to convert the role to collection format before running ansible-lint - https://github.com/linux-system-roles/ha_cluster/blob/main/.github/workflows/ansible-lint.yml#L37 - when we convert to collection format, we cannot preserve line wrapping (due to ruamel roundtrip), so we disable line-length checking.

@richm Does that mean we don't care about line length anymore? Should I disable those checks in my tests? I still run ansible-lint and yamllint on the original source files.

@richm
Contributor

richm commented Feb 8, 2024

> This repo CI uses the .ansible-lint in the ha_cluster root directory. https://github.com/linux-system-roles/ha_cluster/blob/main/.ansible-lint
>
> skip_list:
>   - fqcn-builtins
>   - var-naming[no-role-prefix]
>   - sanity[cannot-ignore]  # wokeignore:rule=sanity
>
> The reason why we don't see line length errors in the github action CI is because the latest ansible-lint github action only works on collections - so we have to convert the role to collection format before running ansible-lint - https://github.com/linux-system-roles/ha_cluster/blob/main/.github/workflows/ansible-lint.yml#L37 - when we convert to collection format, we cannot preserve line wrapping (due to ruamel roundtrip), so we disable line-length checking.

> @richm Does that mean we don't care about line length anymore?

I think we should care about line length - not sure if Galaxy role gating checks. I'm considering putting in a separate github action check to run ansible-lint against the role.

> Should I disable those checks in my tests?

No, I think we should still care about line length.

> I still run ansible-lint and yamllint on the original source files.

ansible-lint now runs yamllint, so you shouldn't need a separate check (and yes, ansible-lint will use .yamllint.yml if you have one).

@richm
Contributor

richm commented Feb 8, 2024

[citest]
Just want to make sure there are no regressions in the RedHat os_family pcs tests.

+
(['fence-virt'] if ansible_architecture == 'x86_64' else [])
}}"
ha_cluster_fence_agent_packages: []
Contributor


Suggested change:
- ha_cluster_fence_agent_packages: []
+ ha_cluster_fence_agent_packages: "{{ __ha_cluster_fence_agent_packages }}"

Contributor Author


I don't think that is the correct place, because it would expose a private variable to potential user input.

I added a conditional into main.yml yesterday to do the switch, depending on whether users added anything to it.

+
      ha_cluster_fence_agent_packages
        if ha_cluster_fence_agent_packages | length > 0
        else __ha_cluster_fence_agent_packages }}"

Contributor


> I don't think that is correct place, because it would expose private variable to potential user inputs.

I'm not sure what you mean. I believe my proposal encapsulates the desired behavior - putting this in defaults/main.yml:

ha_cluster_fence_agent_packages: "{{ __ha_cluster_fence_agent_packages }}"

This allows users to provide their own list of packages for ha_cluster_fence_agent_packages (which is the desired behavior - @tomjelinek please correct me if I'm wrong), and if the user does not specify ha_cluster_fence_agent_packages, it will be set to the default value __ha_cluster_fence_agent_packages. And, since __ha_cluster_fence_agent_packages is defined for different values depending on the platform/version, we get the correct value for ha_cluster_fence_agent_packages for all platforms/versions.
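The pattern described here can be sketched as follows. The file contents are abbreviated and partly hypothetical (the real vars files define more variables and conditional package lists):

```yaml
# vars/RedHat.yml (loaded via include_vars, per platform; sketch only)
__ha_cluster_fence_agent_packages:
  - fence-agents-all

# defaults/main.yml (lowest precedence, so any user-supplied value wins)
ha_cluster_fence_agent_packages: "{{ __ha_cluster_fence_agent_packages }}"
```

Because role defaults are evaluated lazily, the Jinja2 reference in defaults/main.yml resolves to whichever platform-specific value include_vars loaded.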

> I have added conditional into main yesterday to do switch, depending if users added something to it.

> +
>       ha_cluster_fence_agent_packages
>         if ha_cluster_fence_agent_packages | length > 0
>         else __ha_cluster_fence_agent_packages }}"

I think this is not the "Ansible way" to do this. I believe the correct way to do this is the way I outlined this above.

Contributor Author


@richm I have changed it in the latest commit, after a conversation with @tomjelinek.

Contributor

@richm richm Feb 9, 2024


Looks like the default variable was renamed to __ha_cluster_fence_agent_packages_default, so this should be

ha_cluster_fence_agent_packages: "{{ __ha_cluster_fence_agent_packages_default }}"

Contributor Author


@richm Changes are completed and pushed.

Contributor


@richm Can this be marked as resolved?

@marcelmamula
Contributor Author

I can see the workflow tests failed on firewall and qdevice; are those CI environments in good shape? The PR does not touch firewall or qdevice.

@tomjelinek
Member

> I can see Workflow test failed on firewall and qdevice, are those CI environments in good shape? PR does not touch firewall or qdevices.

Tests are failing with the same errors on my systems as well. This needs to be investigated.

@marcelmamula
Contributor Author

> I can see Workflow test failed on firewall and qdevice, are those CI environments in good shape? PR does not touch firewall or qdevices.

> Tests are failing with the same errors on my systems as well. This needs to be investigated.

I tried running this PR and the existing collection (without the PR) on CentOS 9, but it fails in a different place, which looks like a wrong order of steps in pcs, so I cannot re-test this with my current setup. Let me know if you identify something, since it seems you have more variables defined to overcome the error below.

The issue I see on my side comes from pcs attempting a property change (empty value) on a nonexistent CIB:
fatal: [centos0]: FAILED! => {"changed": true, "cmd": ["pcs", "--force", "-f", "/var/lib/pacemaker/cib/cib.xml", "--", "property", "set", "stonith-watchdog-timeout=0"], "delta": "0:00:00.992301", "end": "2024-02-09 09:22:47.144951", "msg": "non-zero return code", "rc": 1, "start": "2024-02-09 09:22:46.152650", "stderr": "Error: unable to get cib", "stderr_lines": ["Error: unable to get cib"], "stdout": "", "stdout_lines": []}

@tomjelinek
Member

tomjelinek commented Feb 9, 2024

> I can see Workflow test failed on firewall and qdevice, are those CI environments in good shape? PR does not touch firewall or qdevices.

The firewall failures are caused by not enabling port 1229/tcp. This port is required for fence-virt. This got broken by changes related to ha_cluster_fence_agent_packages variable. Check tasks/firewall.yml - ha_cluster_fence_agent_packages is empty there, thus fence-virt port is not enabled. The ha_cluster_fence_agent_packages variable gets populated by __ha_cluster_fence_agent_packages in tasks/main.yml after that. That's the root cause.

@marcelmamula
Contributor Author

> I can see Workflow test failed on firewall and qdevice, are those CI environments in good shape? PR does not touch firewall or qdevices.

> The firewall failures are caused by not enabling port 1229/tcp. This port is required for fence-virt. This got broken by changes related to ha_cluster_fence_agent_packages variable. Check tasks/firewall.yml - ha_cluster_fence_agent_packages is empty there, thus fence-virt port is not enabled. The ha_cluster_fence_agent_packages variable gets populated by __ha_cluster_fence_agent_packages in tasks/main.yml after that. That's the root cause.

I see. I tried to add it to include_vars/os_family yesterday, but it resulted in critical nesting, so I moved it into the "Install cluster packages" task in main.yml as

+
          ha_cluster_fence_agent_packages
            if ha_cluster_fence_agent_packages | length > 0
            else __ha_cluster_fence_agent_packages }}"

But there are other places that use it before that point - firewall and selinux. Where would be the best place to do this assignment, if include_vars cannot be used with the same syntax?

The nesting came from this in include_vars:

ha_cluster_fence_agent_packages: "{{ ha_cluster_fence_agent_packages if
 ha_cluster_fence_agent_packages | length > 0
 else __ha_cluster_fence_agent_packages }}"

@marcelmamula
Contributor Author

> I can see Workflow test failed on firewall and qdevice, are those CI environments in good shape? PR does not touch firewall or qdevices.
>
> The firewall failures are caused by not enabling port 1229/tcp. This port is required for fence-virt. This got broken by changes related to ha_cluster_fence_agent_packages variable. Check tasks/firewall.yml - ha_cluster_fence_agent_packages is empty there, thus fence-virt port is not enabled. The ha_cluster_fence_agent_packages variable gets populated by __ha_cluster_fence_agent_packages in tasks/main.yml after that. That's the root cause.
>
> I see, I have tried to add it to include_vars/os_family yesterday but it resulted in critical nesting, so I moved it to main.yml into "Install cluster packages" as
>
> +
>           ha_cluster_fence_agent_packages
>             if ha_cluster_fence_agent_packages | length > 0
>             else __ha_cluster_fence_agent_packages }}"
>
> But there are other places that use it before that - firewall and selinux. Where would be best place to do this assignment then if include_vars cannot be used with same syntax?
>
> Nesting came from this in include_vars
>
> ha_cluster_fence_agent_packages: "{{ ha_cluster_fence_agent_packages if
>  ha_cluster_fence_agent_packages | length > 0
>  else __ha_cluster_fence_agent_packages }}"

@tomjelinek We can simplify it with a pre-step in main.yml, which avoids nesting and still leaves user input as an option:

- name: Set ha_cluster_fence_agent_packages fact
  ansible.builtin.set_fact:
    ha_cluster_fence_agent_packages: "{{ __ha_cluster_fence_agent_packages }}"
  when: ha_cluster_fence_agent_packages | length == 0

There is also a second approach: always add the os_family fence agents on top of the user input with | unique, which would be more manageable.

@marcelmamula
Contributor Author

@tomjelinek I have updated the code for the 2nd approach and it works well, combining both lists and then running unique on top.
Let me know if you are fine with this approach and I will commit it for CI testing.

selinux:

(
            'fence-virt' in ha_cluster_fence_agent_packages
            or
            'fence-virt' in __ha_cluster_fence_agent_packages
            or
            'fence-virt' in ha_cluster_extra_packages
            or
            'fence-agents-all' in ha_cluster_fence_agent_packages
            or
            'fence-agents-all' in __ha_cluster_fence_agent_packages
            or
            'fence-agents-all' in ha_cluster_extra_packages
          )

firewall:

    __use_fence_fw_port: "{{ __arch == 'x86_64' and
      ('fence-virt' in ha_cluster_fence_agent_packages
       or 'fence-virt' in __ha_cluster_fence_agent_packages
       or 'fence-virt' in ha_cluster_extra_packages
       or 'fence-agents-all' in ha_cluster_fence_agent_packages
       or 'fence-agents-all' in __ha_cluster_fence_agent_packages
       or 'fence-agents-all' in ha_cluster_extra_packages) }}"

main:

    - name: Install cluster packages
      package:
        name: "{{(
          __ha_cluster_fullstack_node_packages
          +
          __ha_cluster_qdevice_in_use |
            ternary(__ha_cluster_qdevice_node_packages, [])
          +
          ha_cluster_sbd_enabled | ternary(__ha_cluster_sbd_packages, [])
          +
          ha_cluster_fence_agent_packages
          +
          __ha_cluster_fence_agent_packages ) | unique }}"
        state: present
        use: "{{ (__ha_cluster_is_ostree | d(false)) |
                 ternary('ansible.posix.rhel_rpm_ostree', omit) }}"

@tomjelinek
Member

tomjelinek commented Feb 9, 2024

> @tomjelinek I have updated code for 2nd approach and it works well by combining both and then running unique on top.

@marcelmamula That is not going to work. Originally, users could redefine ha_cluster_fence_agent_packages value to install only those fence-agents they were interested in. If you combine ha_cluster_fence_agent_packages and __ha_cluster_fence_agent_packages together, then all fence-agents will be installed every time, as __ha_cluster_fence_agent_packages contains fence-agents-all.

@tomjelinek
Member

> I can see Workflow test failed on firewall and qdevice, are those CI environments in good shape? PR does not touch firewall or qdevices.

Qdevice tests are failing due to corosync-qdevice package not being installed by the role. This is caused by changes in "Install cluster packages" task in main.yml. I explained this in a comment above.

@tomjelinek
Member

@marcelmamula re: ha_cluster_fence_agent_packages

I would try to resolve this by setting fact __ha_cluster_fence_agent_packages in check-and-prepare-role-variables.yml depending on ha_cluster_fence_agent_packages and then using __ha_cluster_fence_agent_packages instead of ha_cluster_fence_agent_packages throughout the role.

@marcelmamula
Contributor Author

> @marcelmamula re: ha_cluster_fence_agent_packages
>
> I would try to resolve this by setting fact __ha_cluster_fence_agent_packages in check-and-prepare-role-variables.yml depending on ha_cluster_fence_agent_packages and then using __ha_cluster_fence_agent_packages instead of ha_cluster_fence_agent_packages throughout the role.

@tomjelinek I have retested using ternary: ha_cluster_fence_agent_packages: "{{ ha_cluster_fence_agent_packages | length == 0 | ternary(__ha_cluster_fence_agent_packages, ha_cluster_fence_agent_packages) }}", and it ends with endless nesting, like an if/else.

It might be simplest to do a set_fact before the firewall/selinux block:

- name: Set ha_cluster_fence_agent_packages fact
  ansible.builtin.set_fact:
    ha_cluster_fence_agent_packages: "{{ __ha_cluster_fence_agent_packages }}"
  when: ha_cluster_fence_agent_packages | length == 0

@tomjelinek
Member

> I have retested using ternary ha_cluster_fence_agent_packages: "{{ ha_cluster_fence_agent_packages | length == 0 | ternary(__ha_cluster_fence_agent_packages, ha_cluster_fence_agent_packages) }}" and it ends with endless nesting like if/else.
>
> It might be simplest to do set_facts before firewall/selinux block:
>
> - name: Set ha_cluster_fence_agent_packages fact
>   ansible.builtin.set_fact:
>     ha_cluster_fence_agent_packages: "{{ __ha_cluster_fence_agent_packages }}"
>   when: ha_cluster_fence_agent_packages | length == 0

This would work, but it modifies user-defined ha_cluster_fence_agent_packages variable, and the modification will be in effect even after the role finishes. Users won't expect this to happen and it may cause issues. I would rather set a "private" variable. If there is endless nesting, perhaps we need a new variable name? __ha_cluster_fence_agent_packages_final: "{{ ha_cluster_fence_agent_packages | length == 0 | ternary(__ha_cluster_fence_agent_packages_default, ha_cluster_fence_agent_packages) }}"

Also, I would put this task into check-and-prepare-role-variables.yml

@marcelmamula
Contributor Author

@tomjelinek Thank you for the suggestion. I have implemented it and retested on RHEL, CentOS and SLES for SAP, and it works correctly. A new commit was pushed for retesting.

@sean-freeman
Contributor

sean-freeman commented Feb 9, 2024

[citest]
Triggers tests

@richm
Contributor

richm commented Feb 9, 2024

> > I have retested using ternary ha_cluster_fence_agent_packages: "{{ ha_cluster_fence_agent_packages | length == 0 | ternary(__ha_cluster_fence_agent_packages, ha_cluster_fence_agent_packages) }}" and it ends with endless nesting like if/else.
> >
> > It might be simplest to do set_facts before firewall/selinux block:
> >
> >     - name: Set ha_cluster_fence_agent_packages fact
> >       ansible.builtin.set_fact:
> >         ha_cluster_fence_agent_packages: "{{ __ha_cluster_fence_agent_packages }}"
> >       when: ha_cluster_fence_agent_packages | length == 0
>
> This would work, but it modifies user-defined ha_cluster_fence_agent_packages variable, and the modification will be in effect even after the role finishes. Users won't expect this to happen and it may cause issues. I would rather set a "private" variable. If there is endless nesting, perhaps we need a new variable name? __ha_cluster_fence_agent_packages_final: "{{ ha_cluster_fence_agent_packages | length == 0 | ternary(__ha_cluster_fence_agent_packages_default, ha_cluster_fence_agent_packages) }}"
>
> Also, I would put this task into check-and-prepare-role-variables.yml

I think my proposal #186 (comment) will solve this problem.

__ha_cluster_fence_agent_packages_final:
  "{{ (ha_cluster_fence_agent_packages | length == 0) |
      ternary(__ha_cluster_fence_agent_packages_default,
              ha_cluster_fence_agent_packages) }}"
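A side note on why the parentheses around the comparison matter: in Jinja2 the filter pipe `|` binds tighter than `==`, so without parentheses the ternary is applied to the literal 0 rather than to the comparison result. A minimal stand-alone sketch (variable names invented; the `ternary` stand-in mimics Ansible's filter, which plain Jinja2 does not ship):

```python
# Demonstrates Jinja2 operator precedence with a stand-in for Ansible's
# `ternary` filter; not code from the role itself.
from jinja2 import Environment

def ternary(value, true_val, false_val):
    # Mimics Ansible's ternary filter: pick based on truthiness of `value`
    return true_val if value else false_val

env = Environment()
env.filters["ternary"] = ternary

ctx = {"user_pkgs": [], "default_pkgs": ["fence-agents-all"]}

# Without parentheses, `|` binds tighter than `==`, so this parses as
# (user_pkgs | length) == (0 | ternary(default_pkgs, user_pkgs))
# and renders a boolean instead of a package list.
broken = env.from_string(
    "{{ user_pkgs | length == 0 | ternary(default_pkgs, user_pkgs) }}"
).render(ctx)

# With parentheses, the comparison result feeds ternary as intended.
fixed = env.from_string(
    "{{ (user_pkgs | length == 0) | ternary(default_pkgs, user_pkgs) }}"
).render(ctx)

print(broken)  # False  (a boolean, not a list)
print(fixed)   # ['fence-agents-all']
```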
Contributor

This is exactly what my proposal does, without having to resort to set_fact and defining yet another variable.

Member

Oh, yes. Somehow, I missed your comment about this. @marcelmamula, can you try implementing @richm's proposal regarding this?

Contributor Author

defaults/main.yml does not contain any __private variables. Do you want to expose them there now?
The latest commit gives users the option to define fence agents themselves and then decides based on that, without having to go and replace this assignment in defaults.

Contributor

> defaults/main.yml does not contain any __private variables. Do you want to expose them there now?

Yes. This is quite a common idiom in system roles.

> Latest commit has option for users to define fence agents and then decide based on that decision, without having to go and replace this assignment in defaults.

Not sure what you mean. With my proposal:

  • if the user defines ha_cluster_fence_agent_packages then the role will use that definition
  • if the user does not define ha_cluster_fence_agent_packages then ha_cluster_fence_agent_packages will be set to the default value __ha_cluster_fence_agent_packages_default and the default value will be used
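The two bullets above can be sketched as follows (file names and the fence-agents-all default value are assumptions for illustration, not the exact merged code):

```yaml
# defaults/main.yml -- user-overridable; defaults to the "private" value.
# If the user sets this variable anywhere with higher precedence than role
# defaults, the template below is never rendered and their value wins.
ha_cluster_fence_agent_packages: "{{ __ha_cluster_fence_agent_packages_default }}"

# vars/main.yml (or a platform-specific vars file) -- assumed location of
# the default, which users are not expected to override
__ha_cluster_fence_agent_packages_default:
  - fence-agents-all
```

This leans on Ansible variable precedence (role defaults lose to inventory and play vars), which is why no set_fact task or extra `_final` variable is needed.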

Contributor

Also note - in Ansible there is no way to "protect" variables - we use the convention that any variable that starts with __ha_cluster_... should not be set by users, and the role is "unsupported" if such variables are set by users.

Contributor Author

@richm Both solutions work, but I will update it so we can move past this.

Contributor Author

@marcelmamula marcelmamula Feb 9, 2024

[citest]
Changes are pushed

Contributor

@richm can this be marked as resolved?

Contributor

@sean-freeman sean-freeman left a comment

lgtm 👍

@richm
Contributor

richm commented Feb 9, 2024

[citest] is only allowed by role maintainers or linux-system-roles org admins, for security purposes

Contributor

@richm richm left a comment

thanks for being so patient! hopefully the next PR won't require 134 gh pr comments...

@richm richm merged commit b8de4f5 into linux-system-roles:main Feb 9, 2024
33 checks passed