Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Centralize VolumeGroupSnapshot members resource creation in the snapshot-controller #1171

Merged
merged 1 commit into from
Nov 5, 2024

Conversation

leonardoce
Copy link
Contributor

@leonardoce leonardoce commented Oct 24, 2024

What type of PR is this?

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespaces from that line:

/kind api-change
/kind bug
/kind cleanup
/kind design
/kind documentation
/kind failing-test
/kind feature
/kind flake

What this PR does / why we need it:

Previously, the csi-snapshotter was responsible for creating individual
VolumeSnapshots and VolumeSnapshotContents for each member when
dynamically provisioning a VolumeGroupSnapshot.

This commit relocates this logic to the snapshot-controller, bringing
all resource creation processes under a single authority.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Move the logic of creating individual VolumeSnapshot and VolumeSnapshotContent resources for dynamically created VolumeGroupSnapshot from csi-snapshotter sidecar to snapshot-controller.

@k8s-ci-robot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. labels Oct 24, 2024
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 24, 2024
@xing-yang
Copy link
Collaborator

Hi @leonardoce, can you please split this into 2 separate PRs, one depending on the other? Thanks!

@leonardoce
Copy link
Contributor Author

Hi! Yes. This PR is composed by 2 commits, with the first being the dependent PR.

Do you want me to remove it? Without that it won't compile.

@xing-yang
Copy link
Collaborator

@leonardoce Never mind! You are right. They are already separate PRs.

@xing-yang
Copy link
Collaborator

Can you add a release note?

@xing-yang
Copy link
Collaborator

/assign @jsafrane

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Oct 25, 2024
@leonardoce leonardoce force-pushed the move-logic branch 2 times, most recently from aaba439 to e0b5871 Compare October 28, 2024 07:44
@leonardoce leonardoce marked this pull request as ready for review October 28, 2024 07:44
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 28, 2024
@leonardoce
Copy link
Contributor Author

Since #1169 is merged, I rebased over the latest master and marked it as ready for review.

@xing-yang
Copy link
Collaborator

Can you reword the release notes as follows?
Move the logic of creating individual VolumeSnapshot and VolumeSnapshotContent resources for dynamically created VolumeGroupSnapshot from csi-snapshotter sidecar to snapshot-controller.

@leonardoce
Copy link
Contributor Author

I did it @xing-yang. Thank you!

groupSnapshotContent.Name, err)
}

pvs, err := ctrl.client.CoreV1().PersistentVolumes().List(context.TODO(), metav1.ListOptions{})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All API server calls are expensive. List is a very expensive one. The controller should have a PV lister + indexer on spec.csi.driver + "^" + spec.csi.volumeHandle for fast lookups.
And use the lister to get PVs in the other parts of the controller.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with you. In this PR, I'm just moving the existing logic. Should we address it here, in your opinion?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsafrane Can this be addressed in a followup PR? This way it will be easier to review.
I added this issue in the doc: https://docs.google.com/document/d/1jjliw0tFhCPnT9ixlYpj61k5u8_-awf1qSCWrkunshQ/edit?tab=t.0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the informer and the indexer in a separate commit, so we can decide whether to include it in this PR or in a separate one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Xing asked me to put that on a separate PR. I'm on it.

pkg/common-controller/groupsnapshot_controller_helper.go Outdated Show resolved Hide resolved
pkg/common-controller/groupsnapshot_controller_helper.go Outdated Show resolved Hide resolved
pkg/common-controller/groupsnapshot_controller_helper.go Outdated Show resolved Hide resolved
pkg/common-controller/groupsnapshot_controller_helper.go Outdated Show resolved Hide resolved

pv := utils.GetPersistentVolumeFromHandle(pvs, groupSnapshotContent.Spec.Driver, volumeHandle)
if pv == nil {
klog.Errorf(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting case. A snapshot exists, but its PV has been deleted. How useful is to have a VolumeSnapshot / VolumeSnapshotContent for it, if the user cannot possibly guess from which PVC the snapshot is?

My gut feeling would be to record also the source PVC + PV name in VolumeSnapshotHandlePairList to be able to reconstruct the VolumeSnapshotContent / VolumeSnapshot with a link to the PV / PVC

This should be discussed in the KEP to keep track of the decision.

Copy link
Collaborator

@xing-yang xing-yang Nov 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We retrieve PV here to populate groupSnapshotContent.Status.PVVolumeSnapshotContentList and set volumeSnapshotContent.Spec.SourceVolumeMode.

  • The SourceVolumeMode will have an impact on applications that rely on changing volume mode when creating PVC from VolumeSnapshot. Without it being set, it falls back to the default behavior (for backward compatibility) which is not to check it.
  • Regarding PVVolumeSnapshotContentList, we have an action item to see if we can get rid of it after making all other changes.

I'm not sure if it is worth it to add source PVC and PV name in VolumeSnapshotHandlePairList. That sounds like the wrong place to store this info. We can discuss more about it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me open an issue and we'll get back to this one later.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue opened: #1183

pkg/common-controller/groupsnapshot_controller_helper.go Outdated Show resolved Hide resolved
pkg/common-controller/groupsnapshot_controller_helper.go Outdated Show resolved Hide resolved
pkg/common-controller/groupsnapshot_controller_helper.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 2, 2024
@leonardoce leonardoce force-pushed the move-logic branch 5 times, most recently from 91cff49 to 6208785 Compare November 4, 2024 16:46
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Nov 4, 2024
…hot-controller

Previously, the csi-snapshotter was responsible for creating individual
VolumeSnapshots and VolumeSnapshotContents for each member when
dynamically provisioning a VolumeGroupSnapshot.

This commit relocates this logic to the snapshot-controller, bringing
all resource creation processes under a single authority.
@leonardoce
Copy link
Contributor Author

I rebased it on the latest master and now I'm filing a separate PR for the added informer and lister.

@jsafrane
Copy link
Contributor

jsafrane commented Nov 5, 2024

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 5, 2024
@jsafrane
Copy link
Contributor

jsafrane commented Nov 5, 2024

The amount of issues and subsequent PRs worries me a bit, but I can see it's moving towards a better controller.

@jsafrane
Copy link
Contributor

jsafrane commented Nov 5, 2024

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jsafrane, leonardoce

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 5, 2024
@k8s-ci-robot k8s-ci-robot merged commit be73f8a into kubernetes-csi:master Nov 5, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants