Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OCPBUGS-44849: Provide fallback Xattr method where not supported in kernel. #7029

Merged
merged 1 commit into from
Dec 4, 2024

Conversation

paul-maidment
Copy link
Contributor

@paul-maidment paul-maidment commented Nov 26, 2024

Some customers are expereincing breaking issues when their OS does not support xattr, most notably NFS + Xattr
This causes installations not to proceed and needs to be addressed urgently.

This change provides a means of 'falling back' to a filesystem based implementation in the case that xattr is unavailable in the kernel.

This is achieved by the creation of an XattrClient interface and three implementations of this client

OSxattrClient - which wraps the kernel native xattr functions and is responsible for determining if xattr is available or not.
FilesystemBasedXattrClient - Which uses a directory parallel to the root to maintain metadata keys as individual files - one file per key

CompositeXattrClient - Which encapsulates the other two clients and dependent on the availability of xattr in the kernel will call the right client as applicable.

If we need to disable the fallback for any reason then the correct way to do this is to create a custom config map for assisted service.

apiVersion: v1
kind: ConfigMap
metadata:
  name: custom-config-map
  namespace: multicluster-engine
data:
  ENABLE_XATTR_FALLBACK: "false"

This should then be added as an annotation to the AgentServiceConfig

kind: AgentServiceConfig
metadata:
  annotations:
    unsupported.agent-install.openshift.io/assisted-service-configmap: custom-config-map

Once you have updated the AgentServiceConfig, redeploy the service

oc rollout restart deployment -n multicluster-engine assisted-service

After the service redeploys, you should see an entry in the assisted service log to indicate that the setting has been picked up

oc logs -f assisted-service-66dfcbf689-95v5f -c asisted-service | grep -i FALLBACK
time="2024-12-03T09:46:36Z" level=info msg="Options.EnableXattrFallback:false" func=main.main file="/assisted-service/cmd/main.go:332"

@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Nov 26, 2024
@openshift-ci-robot
Copy link

@paul-maidment: This pull request references Jira Issue OCPBUGS-44849, which is invalid:

  • expected the bug to target the "4.19.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

Some customers are expereincing breaking issues when their OS does not support xattr, most notably NFS + Xattr This causes installations not to proceed and needs to be addressed urgently.

As an intermim measure, it has been decided that we will skip manifest metadata operations and warn the user if we find that xattr is unsupported for the file path.

List all the issues related to this PR

  • New Feature
  • Enhancement
  • Bug fix
  • Tests
  • Documentation
  • CI/CD

What environments does this code impact?

  • Automation (CI, tools, etc)
  • Cloud
  • Operator Managed Deployments
  • None

How was this code tested?

  • assisted-test-infra environment
  • dev-scripts environment
  • Reviewer's test appreciated
  • Waiting for CI to do a full test run
  • Manual (Elaborate on how it was tested)
  • No tests needed

Checklist

  • Title and description added to both, commit and PR.
  • Relevant issues have been associated (see CONTRIBUTING guide)
  • This change does not require a documentation update (docstring, docs, README, etc)
  • Does this change include unit-tests (note that code changes require unit-tests)

Reviewers Checklist

  • Are the title and description (in both PR and commit) meaningful and clear?
  • Is there a bug required (and linked) for this change?
  • Should this PR be backported?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Nov 26, 2024
Copy link

openshift-ci bot commented Nov 26, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: paul-maidment

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 26, 2024
Copy link

codecov bot commented Nov 26, 2024

Codecov Report

Attention: Patch coverage is 66.48045% with 60 lines in your changes missing coverage. Please review.

Project coverage is 68.08%. Comparing base (973249d) to head (ebb71ba).
Report is 6 commits behind head on master.

Files with missing lines Patch % Lines
pkg/s3wrapper/filesystem.go 0.00% 20 Missing ⚠️
pkg/s3wrapper/os_xattr_client.go 67.39% 9 Missing and 6 partials ⚠️
pkg/s3wrapper/filesystem_xattr_client.go 78.33% 8 Missing and 5 partials ⚠️
pkg/s3wrapper/composite_xattr_client.go 75.51% 7 Missing and 5 partials ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7029      +/-   ##
==========================================
- Coverage   68.21%   68.08%   -0.14%     
==========================================
  Files         279      284       +5     
  Lines       39307    39686     +379     
==========================================
+ Hits        26813    27019     +206     
- Misses      10060    10246     +186     
+ Partials     2434     2421      -13     
Files with missing lines Coverage Δ
pkg/s3wrapper/xattr_client.go 100.00% <100.00%> (ø)
pkg/s3wrapper/composite_xattr_client.go 75.51% <75.51%> (ø)
pkg/s3wrapper/filesystem_xattr_client.go 78.33% <78.33%> (ø)
pkg/s3wrapper/os_xattr_client.go 67.39% <67.39%> (ø)
pkg/s3wrapper/filesystem.go 0.00% <0.00%> (-34.64%) ⬇️

... and 7 files with indirect coverage changes

@paul-maidment paul-maidment marked this pull request as draft November 26, 2024 16:17
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 26, 2024
@openshift-ci openshift-ci bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 27, 2024
@paul-maidment paul-maidment marked this pull request as ready for review November 27, 2024 13:08
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 27, 2024
@openshift-ci openshift-ci bot requested review from danmanor and omertuc November 27, 2024 13:08
@paul-maidment paul-maidment force-pushed the OCPBUGS-44849 branch 6 times, most recently from 1b9a77c to abd01d5 Compare November 27, 2024 13:59
@paul-maidment paul-maidment marked this pull request as draft November 27, 2024 18:01
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 27, 2024
@jhernand
Copy link
Contributor

jhernand commented Dec 3, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 3, 2024
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Dec 3, 2024
@jhernand
Copy link
Contributor

jhernand commented Dec 3, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 3, 2024
@paul-maidment
Copy link
Contributor Author

/retest

@paul-maidment
Copy link
Contributor Author

/test edge-subsystem-kubeapi-aws

1 similar comment
@paul-maidment
Copy link
Contributor Author

/test edge-subsystem-kubeapi-aws

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Dec 4, 2024
…ernel.

Some customers are expereincing breaking issues when their OS does not support xattr, most notably NFS + Xattr
This causes installations not to proceed and needs to be addressed urgently.

This change provides a means of 'falling back' to a filesystem based implementation in the case that xattr is unavailable in the kernel.

This is achieved by the creation of an XattrClient interface and three implementations of this client

OSxattrClient - which wraps the kernel native xattr functions and is responsible for determining if xattr is available or not.
FilesystemBasedXattrClient - Which uses a directory parallel to the root to maintain metadata keys as individual files - one file per key

CompositeXattrClient - Which encapsulates the other two clients and dependent on the availability of xattr in the kernel will call the right client as applicable.

If we need to disable the fallback for any reason then the correct way to do this is to create a custom config map for assisted service.
```
apiVersion: v1
kind: ConfigMap
metadata:
  name: custom-config-map
  namespace: multicluster-engine
data:
  ENABLE_XATTR_FALLBACK: "false"
```

This should then be added as an annotation to the AgentServiceConfig
```
kind: AgentServiceConfig
metadata:
  annotations:
    unsupported.agent-install.openshift.io/assisted-service-configmap: custom-config-map

```

Once you have updated the AgentServiceConfig, redeploy the service
```
oc rollout restart deployment -n multicluster-engine assisted-service
```

After the service redeploys, you should see an entry in the assisted service log to indicate that the setting has been picked up

```
oc logs -f assisted-service-66dfcbf689-95v5f -c asisted-service | grep -i FALLBACK
time="2024-12-03T09:46:36Z" level=info msg="Options.EnableXattrFallback:false" func=main.main file="/assisted-service/cmd/main.go:332"
```
@jhernand
Copy link
Contributor

jhernand commented Dec 4, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 4, 2024
@paul-maidment
Copy link
Contributor Author

/retest

@paul-maidment
Copy link
Contributor Author

/jira refresh

@openshift-ci-robot
Copy link

@paul-maidment: This pull request references Jira Issue OCPBUGS-44849, which is invalid:

  • expected the bug to target the "4.19.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@paul-maidment
Copy link
Contributor Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Dec 4, 2024
@openshift-ci-robot
Copy link

@paul-maidment: This pull request references Jira Issue OCPBUGS-44849, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.19.0) matches configured target version for branch (4.19.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira ([email protected]), skipping review request.

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-merge-bot openshift-merge-bot bot merged commit 29d802e into openshift:master Dec 4, 2024
14 of 17 checks passed
@openshift-ci-robot
Copy link

@paul-maidment: Jira Issue OCPBUGS-44849: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-44849 has been moved to the MODIFIED state.

In response to this:

Some customers are expereincing breaking issues when their OS does not support xattr, most notably NFS + Xattr
This causes installations not to proceed and needs to be addressed urgently.

This change provides a means of 'falling back' to a filesystem based implementation in the case that xattr is unavailable in the kernel.

This is achieved by the creation of an XattrClient interface and three implementations of this client

OSxattrClient - which wraps the kernel native xattr functions and is responsible for determining if xattr is available or not.
FilesystemBasedXattrClient - Which uses a directory parallel to the root to maintain metadata keys as individual files - one file per key

CompositeXattrClient - Which encapsulates the other two clients and dependent on the availability of xattr in the kernel will call the right client as applicable.

If we need to disable the fallback for any reason then the correct way to do this is to create a custom config map for assisted service.

apiVersion: v1
kind: ConfigMap
metadata:
 name: custom-config-map
 namespace: multicluster-engine
data:
 ENABLE_XATTR_FALLBACK: "false"

This should then be added as an annotation to the AgentServiceConfig

kind: AgentServiceConfig
metadata:
 annotations:
   unsupported.agent-install.openshift.io/assisted-service-configmap: custom-config-map

Once you have updated the AgentServiceConfig, redeploy the service

oc rollout restart deployment -n multicluster-engine assisted-service

After the service redeploys, you should see an entry in the assisted service log to indicate that the setting has been picked up

oc logs -f assisted-service-66dfcbf689-95v5f -c asisted-service | grep -i FALLBACK
time="2024-12-03T09:46:36Z" level=info msg="Options.EnableXattrFallback:false" func=main.main file="/assisted-service/cmd/main.go:332"

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@paul-maidment
Copy link
Contributor Author

/cherry-pick release-ocm-2.12

@paul-maidment
Copy link
Contributor Author

/cherry-pick release-4.18

@openshift-cherrypick-robot

@paul-maidment: new pull request created: #7098

In response to this:

/cherry-pick release-ocm-2.12

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-cherrypick-robot

@paul-maidment: new pull request created: #7099

In response to this:

/cherry-pick release-4.18

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants