Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Standardize handling of storage and execution time quotas #1969

Merged
merged 13 commits into from
Jul 25, 2024

Conversation

tw4l
Copy link
Member

@tw4l tw4l commented Jul 24, 2024

Fixes #1968

Changes:

  • stopped_quota_reached and skipped_quota_reached migrated to new values that indicate which quota was reached
  • Before crawls are run, the operator checks if storage or exec mins quotas are reached and if so fails the crawl with the appropriate state of skipped_storage_quota_reached or skipped_time_quota_reached
  • While crawls are running, the operator checks if the exec mins quota is reached or if the size of all running crawls will mean the storage quota is reached once uploaded; if so, the crawl is stopped gracefully and given stopped_storage_quota_needed or stopped_time_quota_reached state as appropriate
  • Adds new nightly tests for enforcing storage quota

To run the nightly tests, build the local backend and then run:

  • python -m pytest backend/test_nightly/test_storage_quota.py
  • python -m pytest backend/test_nightly/test_execution_minutes_quota.py

tw4l added 11 commits July 24, 2024 11:20
- Check if both quotas are over before starting a crawl and during
the crawl, stop crawl gracefully if over
- Migrate existing stopped_quota_reached and skipped_quota_reached
crawl states to indicate which quota they relate to
- Use states that reflect both the action (stopped, skipped) and
which quota was over, checking storage quota first
@tw4l tw4l requested review from ikreymer and SuaYoo July 24, 2024 20:10
Copy link
Member

@SuaYoo SuaYoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Frontend looks good!

Copy link
Member

@ikreymer ikreymer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! The quota lookup itself could use some optimization, will open a separate PR for that

Copy link
Member

@ikreymer ikreymer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, we also need to add a migration for lastCrawlState on the workflows.

ikreymer added a commit that referenced this pull request Jul 25, 2024
- instead of looking up storage and exec min quotas from oid, and loading an org each time, load org
once and then check quotas on the org object
- many times the org was already available, and was looked up again
- storage and exec quota checks become sync
- rename can_run_crawl() to more generic can_write_data(), optionally also checks exec minutes
- follow up to #1969
@tw4l tw4l requested a review from ikreymer July 25, 2024 19:08
@ikreymer ikreymer merged commit d38abbc into main Jul 25, 2024
7 checks passed
@ikreymer ikreymer deleted the issue-1968-quota-exceeded-standardization branch July 25, 2024 19:49
ikreymer added a commit that referenced this pull request Jul 25, 2024
- instead of looking up storage and exec min quotas from oid, and loading an org each time, load org
once and then check quotas on the org object
- many times the org was already available, and was looked up again
- storage and exec quota checks become sync
- rename can_run_crawl() to more generic can_write_data(), optionally also checks exec minutes
- follow up to #1969
ikreymer added a commit that referenced this pull request Jul 25, 2024
- instead of looking up storage and exec min quotas from oid, and
loading an org each time, load org once and then check quotas on the org
object - many times the org was already available, and was looked up
again
- storage and exec quota checks become sync
- rename can_run_crawl() to more generic can_write_data(), optionally
also checks exec minutes
- typing: get_org_by_id() always returns org, or throws, adjust methods
accordingly (don't check for none, catch exception)
- typing: fix typo in BaseOperator, catch type errors in operator
'org_ops'
- operator quota check: use up-to-date 'status.size' for current job,
ignore current job in all jobs list to avoid double-counting
- follow up to #1969
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enforce storage quota and execution minutes quota exceeded in the same way
3 participants