Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor driver timeouts #2061

Merged
merged 25 commits into from
Nov 21, 2023
Merged

Refactor driver timeouts #2061

merged 25 commits into from
Nov 21, 2023

Conversation

sunce86
Copy link
Contributor

@sunce86 sunce86 commented Nov 16, 2023

Description

Tackles checkboxes 1 and 2 from #2041 (comment)

timeouts

Changes

This is my attempt to make timeouts more explicit and clear.
First, we split the deadline given to driver into 3 pieces:

  1. Solving time given to solvers
  2. Competition time, used to validate, merge, simulate, score and rank solutions.
  3. Http delay, to cover potential slow propagating of http response back to autopilot (http_delay)

The time is split in a following way: competition time and http delay are values read from the configuration (so hardcoded), while the solving time is whatever is left after deducting those two from the deadline given to driver.

It's important to note that:

  1. Http delay is reduced from driver deadline at the very beginning of the solve/quote handling, because it's a non-domain information. Core domain logic doesn't care about network delays.
  2. Competition time is reduced from deadline in the domain, because how we split given time to solving and competition is actually domain related. For some solvers we want 19s for solving and 1s for competition, while for some others like Baseline, we might want 10s for solving and 10s for competition.

Default values are set to something like:
/solve: 20s given to driver, http delay is 0.5s, competition is 4.5s, solving time 15s
/quote: 3s given to driver, http delay is 0.5s, competition time 1s, solving time 1.5s

Release notes: check the default timeouts and if they need to be adjusted.

Copy link
Contributor

@MartinquaXD MartinquaXD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is not the longterm solution we should pursue but if you think this gets us closer to deploying on prod mainnet we can try it out.

crates/driver/src/infra/config/file/load.rs Outdated Show resolved Hide resolved
crates/e2e/src/setup/services.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@fleupold fleupold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is quite a bit of non-trivial logic around different parts of the whole driver stack. I wonder if the logic could be smaller/more contained if we

  1. Keep the timing math local to the Deadline and expose the different final deadlines we are interested in (for solvers and the merging)
  2. Not differentiate for now between quote/solve

I'd also like for percentages to be the same format we have them for most of our other configuration parameters.

crates/driver/src/domain/time.rs Outdated Show resolved Hide resolved
crates/driver/src/infra/config/file/mod.rs Outdated Show resolved Hide resolved
crates/driver/src/infra/config/file/mod.rs Outdated Show resolved Hide resolved
crates/driver/src/infra/solver/mod.rs Outdated Show resolved Hide resolved
crates/driver/src/util/percent.rs Outdated Show resolved Hide resolved
crates/e2e/src/setup/services.rs Outdated Show resolved Hide resolved
crates/driver/src/tests/setup/solver.rs Outdated Show resolved Hide resolved
@sunce86 sunce86 requested a review from fleupold November 20, 2023 15:20
Copy link
Contributor

@fleupold fleupold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this approach much better. Just some nits around naming and use of conversion traits, but nothing blocking.

crates/driver/src/domain/competition/solution/mod.rs Outdated Show resolved Hide resolved
crates/driver/src/domain/time.rs Outdated Show resolved Hide resolved
crates/driver/src/domain/time.rs Outdated Show resolved Hide resolved
crates/driver/src/domain/time.rs Show resolved Hide resolved
crates/driver/src/util/percent.rs Show resolved Hide resolved
crates/driver/src/tests/setup/mod.rs Outdated Show resolved Hide resolved
@fleupold fleupold linked an issue Nov 20, 2023 that may be closed by this pull request
@sunce86 sunce86 merged commit 82b01c1 into main Nov 21, 2023
8 checks passed
@sunce86 sunce86 deleted the network-delay branch November 21, 2023 20:51
@github-actions github-actions bot locked and limited conversation to collaborators Nov 21, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: Colocated solvers timing out on prod mainnet
4 participants