feat: crmsh 4.6.0 support and stonith-enabled workflow update #232

Merged (3 commits) on Oct 16, 2024

marcelmamula
Contributor

marcelmamula commented Oct 14, 2024

Enhancement:

main

  • Package installation state changed from present to latest so that available package updates are installed.
  • Images do not ship with the latest maintenance patches, so it is better to update packages before building the cluster.
  • @richm @tomjelinek yum should support state latest, from what I checked.
  • Added a new variable, ha_cluster_use_latest_packages, to allow updating packages, as discussed below.

crmsh

  • Changed maintenance tasks to force mode (expect modules were commented out)
  • Add task to disable stonith-enabled property before proceeding with other crm configure steps
  • Move cluster property task to end of CIB build so stonith can be enabled at end
  • Enhance cluster property task to
    • Append stonith-enabled=true if it is missing.
    • Show warning if it was purposefully executed with stonith-enabled=false, which is not recommended.
  • Execute crm_verify to validate shadow CIB before patching. Fail module is used to check return code.
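
The property handling described in the list above (append stonith-enabled=true when it is missing, warn when it was set to false on purpose) can be sketched in shell. This is only an illustration of the logic, not the role's Ansible implementation; the function name finalize_cluster_properties and the name=value argument format are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from the role.
# Input: cluster properties as name=value arguments.
# Output: the final property list on stdout, one per line; warnings on stderr.
finalize_cluster_properties() {
  local prop found=0
  for prop in "$@"; do
    case "$prop" in
      stonith-enabled=false)
        found=1
        echo "WARNING: stonith-enabled=false was set explicitly; this is not recommended." >&2
        ;;
      stonith-enabled=*)
        found=1
        ;;
    esac
    printf '%s\n' "$prop"
  done
  # Append the default only when the caller did not set the property at all.
  if [ "$found" -eq 0 ]; then
    printf '%s\n' "stonith-enabled=true"
  fi
}
```

Calling finalize_cluster_properties maintenance-mode=true prints both maintenance-mode=true and stonith-enabled=true, while an explicit stonith-enabled=false is passed through unchanged with a warning.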

Reason:

crmsh 4.6.0 in SUSE Linux Enterprise Server for SAP Applications 15 SP6 no longer ignores errors caused by the ha_cluster workflow.

  • Cluster setup with crmsh is expected to go through crm init, which sets the property stonith-enabled=false, allowing all configuration steps, including maintenance mode, to run.
  • The ha_cluster workflow does not use crm init, so this step was missing. On SP6 this results in an RC 78 error for every crm configure command except crm configure property stonith-enabled=false:
ae1ascs:~ # crm configure property maintenance-mode=true
ERROR: error: Resource start-up disabled since no STONITH resources have been defined
ERROR: error: Either configure some or disable STONITH with the stonith-enabled option
ERROR: error: NOTE: Clusters with shared data need STONITH to ensure data integrity
ERROR: crm_verify: Errors found during check: config not valid
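
The error above is an ordering problem, so the fix can be pictured as a fixed command sequence: disable STONITH first, run the other crm configure steps, and enable STONITH at the end. The following is a rough sketch of that order only; the role drives these steps from Ansible tasks, and the run and configure_cluster helpers and the DRY_RUN switch are assumptions for illustration. The real workflow also defines STONITH resources and leaves maintenance mode, which is omitted here.

```shell
#!/usr/bin/env bash
# Illustrative ordering only, not the role's implementation.
# DRY_RUN=1 (the default in this sketch) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

configure_cluster() {
  # 1. Disable STONITH first, mirroring what crm init would have done.
  run crm configure property stonith-enabled=false || return 1
  # 2. Other configuration steps are now accepted by crmsh 4.6.0.
  run crm configure property maintenance-mode=true || return 1
  #    (resources, constraints and STONITH devices would be configured here)
  # 3. Enable STONITH again as the last step of the CIB build.
  run crm configure property stonith-enabled=true || return 1
}
```

With DRY_RUN left at 1, configure_cluster only prints the three commands in order, which makes the sequence easy to review before running it on a cluster node.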

Result:

Successfully tested building of SAP HANA and SAP Netweaver clusters on AWS and Google Cloud for OS:

  • SUSE Linux Enterprise Server for SAP Applications 15 SP5
  • SUSE Linux Enterprise Server for SAP Applications 15 SP6

@marcelmamula
Contributor Author

@tomjelinek @richm Can you tell me why the tests failed, besides latest in the package module?

I can move latest to check-and-prepare-role-variables.yml task file and use zypper instead, if you want to keep package module using present. But I would assume you would also want to have latest packages before execution.

@richm
Contributor

richm commented Oct 14, 2024

@tomjelinek @richm Can you tell me why the tests failed, besides latest in the package module?

I can move latest to check-and-prepare-role-variables.yml task file and use zypper instead, if you want to keep package module using present. But I would assume you would also want to have latest packages before execution.

https://ansible.readthedocs.io/projects/lint/rules/package-latest/?h=package

@tomjelinek
Member

@tomjelinek @richm Can you tell me why the tests failed, besides latest in the package module?

Python unit tests fail because pcs-main no longer builds on ubuntu-22.04. This is being addressed in #231

Ansible test seems to be failing due to Warning: : Collection fedora.linux_system_roles does not support Ansible. I'm not sure what's going on there.

@richm
Contributor

richm commented Oct 15, 2024

@tomjelinek @richm Can you tell me why the tests failed, besides latest in the package module?

Python unit tests fail because pcs-main no longer builds on ubuntu-22.04. This is being addressed in #231

Ansible test seems to be failing due to Warning: : Collection fedora.linux_system_roles does not support Ansible. I'm not sure what's going on there.

It is because of linux-system-roles/auto-maintenance#353 - just ignore it for now

@richm
Contributor

richm commented Oct 15, 2024

[citest]

@spetrosi spetrosi merged commit e75d464 into linux-system-roles:main Oct 16, 2024
17 of 26 checks passed
@marcelmamula marcelmamula deleted the sp6 branch December 4, 2024 12:12