Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rbd: support QoS based on capacity for rbd volume #5016

Merged
merged 1 commit into from
Feb 17, 2025

Conversation

YiteGu
Copy link
Member

@YiteGu YiteGu commented Dec 12, 2024

  1. QoS provides settings for rbd volume read/write iops and read/write bandwidth.
  2. All QoS parameters are placed in the SC, send QoS parameters from SC to Cephcsi through PVC create request.
  3. We need provide QoS parameters in the SC as below:
    • BaseReadIops
    • BaseWriteIops
    • BaseReadBytesPerSecond
    • BaseWriteBytesPerSecond
    • ReadIopsPerGB
    • WriteIopsPerGB
    • ReadBpsPerGB
    • WriteBpsPerGB
    • BaseVolSizeBytes
      There are 4 base qos parameters among them, when users apply for a volume capacity equal to or less than BaseVolSizebytes, use base qos limit. For the portion of capacity exceeding BaseVolSizebytes, QoS will be increased in steps set per GB. If the step size parameter per GB is not provided, only base QoS limit will be used and not associated with capacity size.
  4. If PVC has resize request, adjust the QoS limit according to the QoS parameters after resizing.

Describe what this PR does

Provide some context for the reviewer

Is there anything that requires special attention

Do you have any questions?

Is the change backward compatible?

Are there concerns around backward compatibility?

Provide any external context for the change, if any.

For example:

  • Kubernetes links that explain why the change is required
  • CSI spec related changes/catch-up that necessitates this patch
  • golang related practices that necessitates this change

Related issues

Mention any github issues relevant to this PR. Adding below line
will help to auto close the issue once the PR is merged.

Fixes: #issue_number

Future concerns

List items that are not part of the PR and do not impact it's
functionality, but are work items that can be taken up subsequently.

Checklist:

  • Commit Message Formatting: Commit titles and messages follow
    guidelines in the developer
    guide
    .
  • Reviewed the developer guide on Submitting a Pull
    Request
  • Pending release
    notes

    updated with breaking and/or notable changes for the next major release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

Show available bot commands

These commands are normally not required, but in case of issues, leave any of
the following bot commands in an otherwise empty comment in this PR:

  • /retest ci/centos/<job-name>: retest the <job-name> after unrelated
    failure (please report the failure too!)

@YiteGu YiteGu force-pushed the rbd-qos-support branch 2 times, most recently from 8a47010 to c07a41a Compare December 12, 2024 10:20
@YiteGu YiteGu changed the title cephcsi: support QoS based on capacity for rbd volume rbd: support QoS based on capacity for rbd volume Dec 12, 2024
@mergify mergify bot added the component/rbd Issues related to RBD label Dec 12, 2024
@YiteGu YiteGu force-pushed the rbd-qos-support branch 3 times, most recently from 63ea87d to f60f7c3 Compare December 12, 2024 12:06
Copy link
Member

@nixpanic nixpanic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! This is really a nice feature to have. A couple of comments that need to be addressed to improve the maintainability and usability.

  • move all qos related functions, structs, constants to a new qos.go file
  • all new functions that receive a *rbdVolume as argument should become a function of the rbdVolume struct (see example for the SetQOS function)
  • new parameters for the StorageClass need to be added to the documentation
  • documentation should explain that the QoS settings are currently only used in combination with the NBD mounter
  • ideally add unit tests for functions that do not interact with a Ceph cluster
  • ideally add some form or e2e testing, maybe set QoS parameters in the StorageClass, and verify that those are set on an RBD-image?

Thanks again, this is a great start!

@nixpanic nixpanic added the enhancement New feature or request label Dec 13, 2024
@YiteGu YiteGu force-pushed the rbd-qos-support branch 10 times, most recently from 2b895ad to 31ce867 Compare December 18, 2024 08:29
@YiteGu
Copy link
Member Author

YiteGu commented Dec 18, 2024

/test ci/centos/mini-e2e/k8s-1.31

@nixpanic
Copy link
Member

Looks quite complete, many thanks for adding the unit and e2e tests!

@nixpanic
Copy link
Member

/test ci/centos/mini-e2e/k8s-1.31

@nixpanic
Copy link
Member

/test ci/centos/mini-e2e/k8s-1.31

not sure why that did not work, but the CI is running now (takes ~2 hours to complete): https://jenkins-ceph-csi.apps.ocp.cloud.ci.centos.org/job/mini-e2e_k8s-1.31/283/display/redirect

@YiteGu
Copy link
Member Author

YiteGu commented Dec 18, 2024

Looks quite complete, many thanks for adding the unit and e2e tests!

Write it briefly to verify that e2e testing is available. More scenes need to be added. :-)

@YiteGu YiteGu force-pushed the rbd-qos-support branch 2 times, most recently from 4d90640 to 0c09729 Compare December 19, 2024 11:05
@YiteGu
Copy link
Member Author

YiteGu commented Feb 13, 2025

change: @Madhu-1 @nixpanic

  1. Consider the two cases create pvc from snapshot or create pvc from pvc, the clone pvc size maybe greater than parent. So, we should adjust qos after expand pvc. On the create pvc from snapshot case, apply qos way as same as normal create. On the create pvc from pvc case, adjust qos after expand.
  2. we use RequestedVolSize to calculate qos values.
  3. add e2e test what the pvc create size greater than parent.

@YiteGu
Copy link
Member Author

YiteGu commented Feb 13, 2025

/test ci/centos/mini-e2e/k8s-1.31

@YiteGu YiteGu requested review from Madhu-1 and nixpanic February 14, 2025 01:54
nixpanic
nixpanic previously approved these changes Feb 17, 2025
Copy link
Member

@nixpanic nixpanic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Looks good to me.

@Madhu-1 , can you take a look again too?

@mergify mergify bot dismissed nixpanic’s stale review February 17, 2025 09:58

Pull request has been modified.

@nixpanic
Copy link
Member

@Mergifyio rebase

1. QoS provides settings for rbd volume read/write iops
   and read/write bandwidth.
2. All QoS parameters are placed in the SC,
   send QoS parameters from SC to Cephcsi through PVC create request.
3. We need provide QoS parameters in the SC as below:
   - BaseReadIops
   - BaseWriteIops
   - BaseReadBytesPerSecond
   - BaseWriteBytesPerSecond
   - ReadIopsPerGB
   - WriteIopsPerGB
   - ReadBpsPerGB
   - WriteBpsPerGB
   - BaseVolSizeBytes
   There are 4 base qos parameters among them, when users apply for
   a volume capacity equal to or less than BaseVolSizebytes, use base
   qos limit. For the portion of capacity exceeding BaseVolSizebytes,
   QoS will be increased in steps set per GB. If the step size parameter
   per GB is not provided, only base QoS limit will be used and not associated
   with capacity size.
4. If PVC has resize request, adjust the QoS limit
   according to the QoS parameters after resizing.

Signed-off-by: Yite Gu <[email protected]>
Copy link
Contributor

mergify bot commented Feb 17, 2025

rebase

✅ Branch has been successfully rebased

@nixpanic
Copy link
Member

@Mergifyio queue

Copy link
Contributor

mergify bot commented Feb 17, 2025

queue

✅ The pull request has been merged automatically

The pull request has been merged automatically at 7595e20

@mergify mergify bot added the ok-to-test Label to trigger E2E tests label Feb 17, 2025
@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.32

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.30

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/upgrade-tests-cephfs

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.32

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.30

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/upgrade-tests-rbd

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e/k8s-1.32

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.31

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e/k8s-1.30

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.31

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e/k8s-1.31

@ceph-csi-bot ceph-csi-bot removed the ok-to-test Label to trigger E2E tests label Feb 17, 2025
@mergify mergify bot merged commit 7595e20 into ceph:devel Feb 17, 2025
36 of 37 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/rbd Issues related to RBD enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants