full_storage_utilization_test: Storage utilization at 90% cluster size #9018

Open
wants to merge 1 commit into master

Lakshmipathi

Populate data until disk usage exceeds 90%, then perform database and topology-change operations on the cluster.

Description

This PR covers the basic part of the 90% storage utilization work: it populates data until the cluster reaches over 90% disk usage and then performs a scale-out operation. The goal of this task is to run the cluster at 90% disk utilization, which will help users utilize their instances fully. To achieve this goal, we need features such as concurrent topology changes, tablets, and migrations.
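
A minimal sketch of the populate-until-threshold idea described above; it is not the code in this PR. The names `populate_until_full`, `write_batch`, `get_utilization`, and `disk_utilization` are hypothetical. In the real test the write step would presumably be a stress workload and the utilization would be read from the cluster nodes, whereas this sketch only reads the local filesystem through `shutil.disk_usage`.

```python
import shutil
from typing import Callable

# Assumed threshold, taken from the 90% figure in the PR title.
TARGET_UTILIZATION = 0.90


def disk_utilization(path: str = "/") -> float:
    """Return the used fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def populate_until_full(
    write_batch: Callable[[], None],
    get_utilization: Callable[[], float],
    target: float = TARGET_UTILIZATION,
    max_batches: int = 1000,
) -> int:
    """Call `write_batch` until `get_utilization` reports at least `target`.

    Returns the number of batches written. Raises RuntimeError if the
    target is not reached within `max_batches`, so a stalled workload
    does not loop forever.
    """
    batches = 0
    while get_utilization() < target:
        if batches >= max_batches:
            raise RuntimeError(
                f"utilization still below {target:.0%} after {batches} batches"
            )
        write_batch()
        batches += 1
    return batches
```

Passing the write step and the utilization reading in as callables keeps the loop independent of any particular workload tool, so it can be exercised with a fake in a unit test. The scale-out step would follow once the function returns.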

Keeping this as a draft so that it acts as a common base between the dev and QA teams and helps improve this PR further.

Testing

  • [ ]

PR pre-checks (self review)

  • I added the relevant backport labels
  • I didn't leave commented-out/debugging code

Reminders

  • Add New configuration option and document them (in sdcm/sct_config.py)
  • Add unit tests to cover my changes (under unit-test/ folder)
  • Update the Readme/doc folder relevant to this change (if needed)

@Lakshmipathi Lakshmipathi requested review from lukepio and paszkow and removed request for lukepio October 21, 2024 11:15
@Lakshmipathi Lakshmipathi force-pushed the wip/full_storage_utilization branch 2 times, most recently from 4a24b29 to b327724 Compare October 23, 2024 17:50
@Lakshmipathi Lakshmipathi force-pushed the wip/full_storage_utilization branch 9 times, most recently from c9eb32f to 0bb3e8e Compare November 3, 2024 10:10
@Lakshmipathi
Author

Recent log run can be found here: https://jenkins.scylladb.com/view/staging/job/scylla-staging/job/LakshmipathiGanapathi/job/byo-longevity-test/185/consoleText (ignore error from send_email step from the job)

@Lakshmipathi Lakshmipathi force-pushed the wip/full_storage_utilization branch 2 times, most recently from f58a6c4 to b23d8bd Compare November 3, 2024 15:26
@Lakshmipathi Lakshmipathi force-pushed the wip/full_storage_utilization branch 3 times, most recently from ad5d8b0 to 8b0fc7a Compare November 4, 2024 08:19
@Lakshmipathi Lakshmipathi force-pushed the wip/full_storage_utilization branch 2 times, most recently from b492597 to 79396b9 Compare November 4, 2024 13:16
@Lakshmipathi Lakshmipathi force-pushed the wip/full_storage_utilization branch 4 times, most recently from 55d6d6d to 8fe1a4b Compare November 12, 2024 12:34
@Lakshmipathi Lakshmipathi force-pushed the wip/full_storage_utilization branch 2 times, most recently from 15a5693 to d4efbc6 Compare November 13, 2024 09:32
@Lakshmipathi Lakshmipathi force-pushed the wip/full_storage_utilization branch 2 times, most recently from 76983ef to a376e97 Compare November 15, 2024 03:02
@pehala
Contributor

pehala commented Nov 20, 2024

@Lakshmipathi Could we finish this PR and add other cases in followups? Is it ready for review?

@Lakshmipathi Lakshmipathi force-pushed the wip/full_storage_utilization branch from a376e97 to 9d5bac1 Compare November 21, 2024 02:20
@Lakshmipathi Lakshmipathi marked this pull request as ready for review November 21, 2024 02:22
@Lakshmipathi
Author

@Lakshmipathi Could we finish this PR and add other cases in followups? Is it ready for review?

Yes, I was moving some code into utils and testing it. It is now pushed and ready for review.

@Lakshmipathi
Author

Any thoughts on how to view pre-commit failure messages? https://jenkins.scylladb.com/job/sct-github-PRs-scan/job/scylla-cluster-tests/job/PR-9018/workflow-stage/ gives me permission denied

@pehala
Contributor

pehala commented Nov 21, 2024

Any thoughts on how to view pre-commit failure messages? https://jenkins.scylladb.com/job/sct-github-PRs-scan/job/scylla-cluster-tests/job/PR-9018/workflow-stage/ gives me permission denied

I can see the logs from the job https://jenkins.scylladb.com/job/sct-github-PRs-scan/job/scylla-cluster-tests/job/PR-9018/49/console but all phases seem to be green? @fruch could you please advise on what's wrong?

@fruch
Contributor

fruch commented Nov 21, 2024

Any thoughts on how to view pre-commit failure messages? https://jenkins.scylladb.com/job/sct-github-PRs-scan/job/scylla-cluster-tests/job/PR-9018/workflow-stage/ gives me permission denied

I can see the logs from the job https://jenkins.scylladb.com/job/sct-github-PRs-scan/job/scylla-cluster-tests/job/PR-9018/49/console but all phases seemed to be green? @fruch could you please advise on whats wrong?

As listed in the GitHub Actions checks, the pre-commit phase is failing:
https://jenkins.scylladb.com/job/sct-github-PRs-scan/job/scylla-cluster-tests/job/PR-9018/49/execution/node/23/log/

The permission denied error is a bug in the Jenkins workflow-stage page.
Go back to the root of that job:
https://jenkins.scylladb.com/job/sct-github-PRs-scan/job/scylla-cluster-tests/job/PR-9018/
and follow the specific runs from there.

Regardless, I would recommend installing the pre-commit hook correctly
and running it again locally after a rebase/amend, since otherwise it might miss the changed files.

@Lakshmipathi Lakshmipathi force-pushed the wip/full_storage_utilization branch 3 times, most recently from 3ff576c to 2b5fd03 Compare November 22, 2024 11:12
@Lakshmipathi
Author

OK, the pre-commit check passes now.

@Lakshmipathi Lakshmipathi force-pushed the wip/full_storage_utilization branch 3 times, most recently from 8afe736 to c181d37 Compare November 26, 2024 11:12
Contributor

@roydahan roydahan left a comment

This entire PR, and probably all follow-up PRs, could be just YAML and pipeline files.

I doubt there is even a need to add new code.

Contributor

Most of this file just duplicates functions we already have in performance_regression.py and could be integrated directly there.

Not only does that avoid duplication, it also benefits from all the automation, such as the decorators, that we already have for them.

Thanks @roydahan for pointing this out! @Lakshmipathi could you reuse the existing code where possible? And do not worry about the code you have to delete. I am sure writing it was a great learning experience, and you will benefit from it for a long time.

Author

Sure, I will check performance_regression.py and reuse its functions wherever possible. Less code is better :)

…er size

Populate data until it reaches over 90% disk storage and perform
db and cluster options.

Signed-off-by: Lakshmipathi <[email protected]>
@Lakshmipathi Lakshmipathi force-pushed the wip/full_storage_utilization branch from c181d37 to 2282636 Compare November 29, 2024 07:19
Labels
area/elastic cloud Issues related to the elastic cloud project

8 participants