Standardize handling of storage and execution time quotas #1969

tw4l · 2024-07-24T20:10:25Z

Fixes #1968

Changes:

- stopped_quota_reached and skipped_quota_reached are migrated to new values that indicate which quota was reached
- Before a crawl is run, the operator checks whether the storage or execution minutes quota has been reached, and if so fails the crawl with the appropriate state: skipped_storage_quota_reached or skipped_time_quota_reached
- While crawls are running, the operator checks whether the execution minutes quota has been reached, or whether the combined size of all running crawls would reach the storage quota once uploaded; if so, the crawl is stopped gracefully and given the stopped_storage_quota_reached or stopped_time_quota_reached state as appropriate
- Adds new nightly tests for enforcing the storage quota
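
The state selection described in the changes above can be sketched roughly as follows. This is an illustration only, not the operator's actual code: `OrgUsage`, `pick_quota_state`, and all field names are hypothetical, and treating a quota of 0 as "no quota" and using `>=` for "reached" are assumptions. Only the four state names come from this PR. Storage is checked before execution time, as the commit message states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrgUsage:
    """Hypothetical snapshot of an org's usage and quotas (0 means no quota; an assumption)."""

    bytes_stored: int
    storage_quota: int
    exec_seconds_used: int
    exec_minutes_quota: int


def pick_quota_state(
    org: OrgUsage, running: bool, pending_bytes: int = 0
) -> Optional[str]:
    """Return the quota-related crawl state, or None if no quota has been reached.

    pending_bytes is the combined size of running crawls not yet uploaded.
    """
    # "stopped" applies to crawls already running, "skipped" to crawls not yet started
    action = "stopped" if running else "skipped"

    # Storage quota is checked first
    if org.storage_quota and org.bytes_stored + pending_bytes >= org.storage_quota:
        return f"{action}_storage_quota_reached"

    # Execution time quota is stored in minutes, usage in seconds (assumed units)
    if org.exec_minutes_quota and org.exec_seconds_used >= org.exec_minutes_quota * 60:
        return f"{action}_time_quota_reached"

    return None
```

For a crawl that has not started, `pick_quota_state(org, running=False)` yields one of the `skipped_...` states; for a running crawl, passing `running=True` together with the pending upload size yields one of the `stopped_...` states.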

To run the nightly tests, build the local backend and then run:

python -m pytest backend/test_nightly/test_storage_quota.py
python -m pytest backend/test_nightly/test_execution_minutes_quota.py

- Check if both quotas are over before starting a crawl and during the crawl, stop crawl gracefully if over
- Migrate existing stopped_quota_reached and skipped_quota_reached crawl states to indicate which quota they relate to
- Use states that reflect both the action (stopped, skipped) and which quota was over, checking storage quota first
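
As a rough illustration of the state migration, the sketch below rewrites legacy states in plain Python dictionaries standing in for database documents. The mapping itself is an assumption: the text here does not say which new state each legacy state becomes. `LEGACY_STATE_MAP`, `migrate_states`, and the `state` field name are hypothetical; `lastCrawlState` is the workflow field raised in the review discussion.

```python
# Assumed mapping: the PR text does not specify which new state replaces each legacy one.
LEGACY_STATE_MAP = {
    "skipped_quota_reached": "skipped_storage_quota_reached",
    "stopped_quota_reached": "stopped_time_quota_reached",
}


def migrate_states(docs, fields=("state", "lastCrawlState")):
    """Rewrite legacy quota states in place and return how many documents changed."""
    changed = 0
    for doc in docs:
        touched = False
        for field in fields:
            # Documents without the field, or with a current state, are left alone
            new_state = LEGACY_STATE_MAP.get(doc.get(field))
            if new_state:
                doc[field] = new_state
                touched = True
        if touched:
            changed += 1
    return changed
```

A real migration would run the equivalent update against the crawl and workflow collections rather than in-memory dictionaries.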

SuaYoo

Frontend looks good!

ikreymer

Looks good! The quota lookup itself could use some optimization, will open a separate PR for that

ikreymer

Ah, we also need to add a migration for lastCrawlState on the workflows.

- instead of looking up storage and exec min quotas from oid, and loading an org each time, load org once and then check quotas on the org object
- many times the org was already available, and was looked up again
- storage and exec quota checks become sync
- rename can_run_crawl() to more generic can_write_data(), optionally also checks exec minutes
- typing: get_org_by_id() always returns org, or throws, adjust methods accordingly (don't check for none, catch exception)
- typing: fix typo in BaseOperator, catch type errors in operator 'org_ops'
- operator quota check: use up-to-date 'status.size' for current job, ignore current job in all jobs list to avoid double-counting
- follow up to #1969
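
The double-counting point in the operator quota check can be illustrated with a small sketch. The function name, parameters, and the shape of the running-jobs mapping are hypothetical; the idea shown is only what the commit message describes: take the current job's size from its up-to-date status and skip that job when summing the other running jobs.

```python
from typing import Dict


def projected_storage(
    org_bytes_stored: int,
    running_sizes: Dict[str, int],
    current_id: str,
    current_size: int,
) -> int:
    """Estimate org storage once all running crawls have been uploaded.

    running_sizes maps crawl id to its last recorded size and may hold a stale
    entry for the current crawl; current_size is the fresh value from its status.
    """
    # Sum every running crawl except the current one to avoid counting it twice
    others = sum(
        size for crawl_id, size in running_sizes.items() if crawl_id != current_id
    )
    return org_bytes_stored + others + current_size
```

With 100 bytes already stored, running crawls `{"a": 10, "b": 20}`, and a fresh size of 15 for crawl `"a"`, the projection is 100 + 20 + 15 = 135 rather than 145.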

tw4l added 11 commits July 24, 2024 11:20

Update crawl statuses and types in frontend

211fc3d

Add back missing awaits

816734d

Rename to ...time_quota_reached

0176537

Add nightly tests for storage quota

4c3f28c

Fixups

43fb351

Fix size constant label

aa0b401

Stop crawl gracefully if crawl size will go over storage quota

284d7e7

Remove fickle nightly test

854fbc0

Check all running crawl sizes, not just current crawl

8bf2192

Fetch org first

a0d20ff

tw4l requested review from ikreymer and SuaYoo July 24, 2024 20:10

SuaYoo approved these changes Jul 25, 2024

ikreymer approved these changes Jul 25, 2024

ikreymer requested changes Jul 25, 2024

ikreymer mentioned this pull request Jul 25, 2024

optimize org quota lookups #1973

Merged

tw4l added 2 commits July 25, 2024 15:06

Migrate lastCrawlState

157e0c4

Improve docstring

d09d4b2

tw4l requested a review from ikreymer July 25, 2024 19:08

ikreymer approved these changes Jul 25, 2024

ikreymer merged commit d38abbc into main Jul 25, 2024
7 checks passed

ikreymer deleted the issue-1968-quota-exceeded-standardization branch July 25, 2024 19:49
