[mount] Addition of "checkAndRepairXfsFilesystem" inadvertently prevents XFS self-recovery via mounting #141

Closed
nktpro opened this issue Feb 28, 2020 · 11 comments · Fixed by #150 or kubernetes/kubernetes#89444
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now.

Comments

@nktpro

nktpro commented Feb 28, 2020

PR #126 added an extra step to run xfs_repair before mounting an XFS file system. However, instead of helping to automatically correct FS issues caused by prior unclean shutdowns, it actually prevented automatic recovery from happening, which made the corresponding volume completely unavailable and subsequently required manual intervention.

The sequence of events is as follows:

  1. A node loss / unclean shutdown occurs.
  2. A stateful pod is restarted on another healthy node; its volume is re-attached to the new node.
  3. xfs_repair is run against the volume. The relevant logs look like this (shown in the context of rook-ceph, but the same should apply to any other user of the mounter):
Filesystem corruption was detected for /dev/rbd1, running xfs_repair to repair
ID: 29 Req-ID: 0001-0009-rook-ceph-0000000000000001-9adb43bf-4e25-11ea-aa19-2ecc193be507 failed to mount device path (/dev/rbd1) to staging path (/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-423b5a86-7c03-43c4-a7e9-4921934016de/globalmount/0001-0009-rook-ceph-0000000000000001-9adb43bf-4e25-11ea-aa19-2ecc193be507) for volume (0001-0009-rook-ceph-0000000000000001-9adb43bf-4e25-11ea-aa19-2ecc193be507) error 'xfs_repair' found errors on device /dev/rbd1 but could not correct them: Phase 1 - find and verify superblock...

Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
  4. The volume is then prevented from being mounted, and manual intervention is required.
  5. All that's needed then is to manually mount the volume, which replays the XFS log automatically, then unmount it and restart the corresponding pod.

Note that the mount in step 5 is what always happened automatically prior to this change: the volume was simply mounted without any attempt to perform an FS check / xfs_repair. It can correct itself simply by being mounted, as per XFS design.

The recommended fix is to only attempt to run xfs_repair if mounting actually fails, as the last resort. There shouldn't be any need to xfs_repair prior to a mount failure.
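
A minimal sketch of the mount-first ordering recommended above, written in Python for brevity rather than the mounter's Go. The function name `mount_with_repair_fallback` and the injectable `runner` callable are hypothetical, introduced only for illustration; this is not the kubernetes/utils implementation, and the retry-once policy is an assumption.

```python
import subprocess
from typing import Callable, List

# A runner executes a command and returns its exit status (hypothetical type).
Runner = Callable[[List[str]], int]


def run(cmd: List[str]) -> int:
    """Default runner: execute the command and return its exit status."""
    return subprocess.run(cmd).returncode


def mount_with_repair_fallback(device: str, target: str, runner: Runner = run) -> bool:
    """Try to mount first; fall back to xfs_repair only if the mount fails."""
    mount_cmd = ["mount", "-t", "xfs", device, target]

    # A plain mount lets XFS replay a dirty log by itself.
    if runner(mount_cmd) == 0:
        return True

    # Last resort: the mount failed, so attempt a repair and retry once.
    if runner(["xfs_repair", device]) != 0:
        return False
    return runner(mount_cmd) == 0
```

Passing a fake `runner` makes the ordering checkable without a real block device, which may also help with scripting a regression test for this behavior.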

Alternatively, don't bail out if an error occurs when running xfs_repair. Let the mount attempt happen anyway. It'll then either fix itself, or fail mounting with another error.
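
The alternative could look roughly like the following Python sketch. The names `repair_then_mount_anyway`, `runner`, and `warn` are hypothetical and exist only for illustration; the point is that a non-zero xfs_repair exit status is recorded but does not stop the mount attempt.

```python
import subprocess
from typing import Callable, List

# A runner executes a command and returns its exit status (hypothetical type).
Runner = Callable[[List[str]], int]


def run(cmd: List[str]) -> int:
    """Default runner: execute the command and return its exit status."""
    return subprocess.run(cmd).returncode


def repair_then_mount_anyway(
    device: str,
    target: str,
    runner: Runner = run,
    warn: Callable[[str], None] = print,
) -> bool:
    """Run xfs_repair, but attempt the mount even if the repair reports an error."""
    rc = runner(["xfs_repair", device])
    if rc != 0:
        # Do not bail out here: the mount below may still replay the log.
        warn(f"xfs_repair exited with status {rc} on {device}; attempting mount anyway")

    # The mount either succeeds (possibly after log replay) or fails with its own error.
    return runner(["mount", "-t", "xfs", device, target]) == 0
```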

Relevant issue from rook-ceph repo: rook/rook#4914

CC'ing @27149chen

@nktpro nktpro changed the title Addition of "checkAndRepairXfsFilesystem" caused high availability regression / production outage [mount] Addition of "checkAndRepairXfsFilesystem" caused high availability regression / production outage Feb 28, 2020
@nktpro nktpro changed the title [mount] Addition of "checkAndRepairXfsFilesystem" caused high availability regression / production outage [mount] Addition of "checkAndRepairXfsFilesystem" inadvertently prevents XFS self-recovery via mounting Feb 28, 2020
@27149chen
Member

27149chen commented Feb 28, 2020

@nktpro, you actually encountered a "dirty log" issue on an XFS device. It is handled in PR #132, but that is still under review. You are welcome to add your comments there.
Btw, your recommendation is not safe: even if the filesystem can be mounted, that does not mean it is healthy. There may be other issues, and those can be repaired by xfs_repair.
You can see that for both XFS (xfs_repair) and other filesystems (fsck) the process is the same: check and repair first, then mount. It is safer, but more expensive. See issue #137, which discusses the possibility of mounting first and then repairing.

@27149chen
Member

/assign

@nktpro
Author

nktpro commented Feb 28, 2020

Thanks @27149chen, either #132 or #137 would address this. I believe we need to raise more awareness of the seriousness of this regression, since it's a ticking time bomb in production for any recent k8s cluster with XFS PVs. High availability is critically compromised.

Dirty logs happen all the time during node loss / unclean shutdowns. Instead of the volumes simply being mounted and fixing themselves by auto-replaying logs, as they used to, they are now stuck waiting for human operators to mount and unmount them manually.

@gnufied @dims could you help accelerate the review of @27149chen's PR, given the level of impact this has? A rollback of the checkAndRepairXfsFilesystem addition would also address this while we figure out a more robust FS checking process prior to mounting, one that doesn't inadvertently make things worse than not having it at all.

@27149chen
Member

27149chen commented Feb 29, 2020

@nktpro I think we can't revert the previous PR, because there is another issue that xfs_repair can fix.
Btw, would it be possible for you to try the version in my branch to see if it resolves your problem?

@nktpro
Author

nktpro commented Feb 29, 2020

@27149chen Testing your branch would be rather complicated. Specifically in my case, this is a transitive dependency of rook-ceph. It'd first require custom building the downstream project (https://github.com/ceph/ceph-csi), followed by substituting the docker image for the csi-rbdplugin component in rook-ceph and deploying it in a test K8s cluster.

That's a bit tedious but it's all doable. However, to actually verify the fix I'd also need to force an XFS volume into an inconsistent state with a dirty log, maybe by intentionally triggering a kernel panic on a node while lots of writes are coming in? Do you have any suggestion for a better way to induce that deterministically?

Also that brings up a good conversation on the need to automate an integration / e2e test suite for this.

@27149chen
Member

@nktpro how did you encounter this issue before? You said it happened all the time during node loss / unclean shutdowns, so I was thinking it would be easy to reproduce. Sorry, I don't know how to deterministically cause a dirty log. But according to the documentation and your manual attempt, mounting and unmounting is the right way to fix it. What do you think?

@nktpro
Author

nktpro commented Feb 29, 2020

@27149chen We hit it in production. We never had to intervene manually until this change made it into the ceph-csi 2.0 release, which in turn affects rook-ceph; everyone who upgraded to the latest rook-ceph is susceptible to it. Usually we would only need to resort to xfs_repair manually if mounting actually failed with corruption errors.

> You said it happened all the time during node loss / unclean shutdowns.

Yes, so that's one known way to test this (hard-resetting a node, triggering a kernel panic, pulling the power cord in the middle of database writes, etc.). However, it's non-deterministic, manual, and hard to automate as a scripted test suite that would prevent similar regressions in the future.

Anyway I'll report back the result if we have some bandwidth to test your fix. In the meantime, we are forced to either downgrade to a version right before this change, or switch to Ext4 since this only affects XFS.

@gnufied
Member

gnufied commented Feb 29, 2020

Thanks for opening this issue. I still think that doing filesystem repairs should be opt-in rather than the default. I am not an XFS expert, but if there is a chance that #132 could worsen the problem rather than allowing an admin to fix it manually, we should be careful. It should also be noted that because mount operations are retried, the filesystem repair will be retried in a continuous loop as well.
cc @jsafrane

@27149chen
Member

27149chen commented Mar 1, 2020

@gnufied, we have been doing filesystem repairs (by fsck) by default all along. I agree that it is expensive, as I mentioned in issue #137, and it would be great if we could find a way to reduce unnecessary repairs. But until then, I think we should try to repair the filesystem by default. Regarding xfs_repair, I think replaying the dirty log by default (by mounting and immediately unmounting the filesystem) won't worsen the problem, because it is the officially recommended way, and if we don't do that, the subsequent operations will still try to mount this unhealthy disk.
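
The replay-then-repair flow described here could be sketched as follows in Python. This is only an illustration, not the code in PR #132: the function name `repair_with_log_replay`, the `runner` callable, and the `scratch_dir` parameter are hypothetical, and detecting a dirty log by matching a phrase from the xfs_repair message quoted in this issue is an assumption.

```python
import subprocess
import tempfile
from typing import Callable, List, Optional, Tuple

# A runner executes a command and returns (exit status, combined output).
Runner = Callable[[List[str]], Tuple[int, str]]

# Assumption: phrase taken from the xfs_repair error text quoted in this issue.
DIRTY_LOG_HINT = "Mount the filesystem to replay the log"


def run(cmd: List[str]) -> Tuple[int, str]:
    """Default runner: capture stdout and stderr together."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)
    return proc.returncode, proc.stdout


def repair_with_log_replay(
    device: str, runner: Runner = run, scratch_dir: Optional[str] = None
) -> bool:
    """Run xfs_repair; on a dirty log, mount and unmount to replay it, then retry."""
    rc, out = runner(["xfs_repair", device])
    if rc == 0:
        return True
    if DIRTY_LOG_HINT not in out:
        return False  # some other failure; leave it to the operator

    # Replay the log by mounting on a scratch directory and unmounting again.
    mountpoint = scratch_dir or tempfile.mkdtemp(prefix="xfs-replay-")
    if runner(["mount", "-t", "xfs", device, mountpoint])[0] != 0:
        return False
    if runner(["umount", mountpoint])[0] != 0:
        return False

    # The log is clean now, so xfs_repair can proceed.
    return runner(["xfs_repair", device])[0] == 0
```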

@saad-ali
Member

saad-ali commented Mar 24, 2020

To unblock the k8s 1.18.0 release, the recommendation from SIG Storage is to roll back PR #126, which caused this issue (it was fixing a non-critical corner case). We are working with the k8s release team to get the go-ahead for that.

Once 1.18.0 is cut, we can revisit this.

@alejandrox1

For tracking on the release side
/priority critical-urgent
/kind bug

and more importantly: thank you to everyone who has been working on this!

@k8s-ci-robot k8s-ci-robot added priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. kind/bug Categorizes issue or PR as related to a bug. labels Mar 24, 2020
gnufied added a commit to gnufied/kubernetes that referenced this issue Mar 24, 2020
This fixes bug with xfs mount failing because of xfs_repair
being called. Fixes kubernetes/utils#141
saad-ali pushed a commit to saad-ali/kubernetes that referenced this issue Mar 25, 2020
This fixes bug with xfs mount failing because of xfs_repair
being called. Fixes kubernetes/utils#141
Madhu-1 added a commit to Madhu-1/ceph-csi that referenced this issue Apr 9, 2020
This PR updates the kubernetes utils packages
we are using. we had hit an issue in xfs_repair
as this is fixed in recent kubernetes utils
we are updating it for the same reason

more info at kubernetes/utils#141

Signed-off-by: Madhu Rajanna <[email protected]>
humblec added a commit to ceph/ceph-csi that referenced this issue Apr 9, 2020
we are using. we had hit an issue in xfs_repair
as this is fixed in recent kubernetes utils
we are updating it for the same reason

more info at kubernetes/utils#141

fixes #859
updates rook/rook#4914

Signed-off-by: Humble Chirammal <[email protected]>
humblec added a commit to humblec/ceph-csi that referenced this issue Apr 9, 2020
NOTE:

This PR also updates the kubernetes utils packages
we are using. we had hit an issue in xfs_repair
as this is fixed in recent kubernetes utils
we are updating it for the same reason

more info at kubernetes/utils#141

fixes ceph#859
updates rook/rook#4914

Signed-off-by: Humble Chirammal <[email protected]>