Make driver restarts on kubernetes have zero downtime #1331

Open
fleupold opened this issue Aug 17, 2020 · 1 comment
Labels
[Driver] tasks that relate to the driver subsystem enhancement New feature or request

Comments

@fleupold
Contributor

We occasionally run into issues submitting solutions (e.g. here). These likely come from unfortunately timed restarts where we stop the old pods halfway through the processing of a batch and thus restart the new ones with little time remaining.

Moreover, the new instances might have some significant setup time if e.g. the event database has to be resynced from scratch.

We should therefore work towards only shutting down the previous deployment once the new deployment is ready and ideally only at the beginning (e.g. first 30 seconds) of a batch.
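A rough sketch of that shutdown condition, just to make the timing rule concrete. The function names, the 300-second batch length, and the assumption that batches start at multiples of the batch length in Unix time are all illustrative and not taken from the driver code; only the 30-second window comes from the comment above.

```python
# Hypothetical sketch: decide whether the old deployment may be stopped.
BATCH_DURATION_S = 300  # assumption: batch length in seconds
SAFE_WINDOW_S = 30      # "first 30 seconds" of a batch, as suggested above


def seconds_into_batch(now_s: float, batch_duration_s: int = BATCH_DURATION_S) -> float:
    """Time elapsed since the current batch started.

    Assumes batches are aligned to multiples of batch_duration_s in Unix time.
    """
    return now_s % batch_duration_s


def may_shut_down_old(
    new_ready: bool,
    now_s: float,
    batch_duration_s: int = BATCH_DURATION_S,
    safe_window_s: int = SAFE_WINDOW_S,
) -> bool:
    """Allow shutdown only if the new deployment is ready AND we are
    still within the safe window at the start of a batch."""
    return new_ready and seconds_into_batch(now_s, batch_duration_s) < safe_window_s
```

A deployer could poll something like `may_shut_down_old(ready, time.time())` and only stop the old pods once it returns `True`; where `ready` comes from (e.g. a ready route) is left open here.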

At the same time, we have to make sure we are not actually running the same pod twice on the same auction, as this could lead to issues with our solver license and to nonce issues when trying to use the same PK for solution submission.
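One way to picture the "not twice on the same auction" constraint is a claim per batch that only the first instance can obtain. The sketch below is purely illustrative: `AuctionClaims`, `try_claim`, and the pod names are made up, and the in-memory dict stands in for whatever shared store with an atomic set-if-absent a real deployment would need.

```python
from typing import Dict


class AuctionClaims:
    """Hypothetical in-memory registry of which instance owns which batch."""

    def __init__(self) -> None:
        self._owner_by_batch: Dict[int, str] = {}

    def try_claim(self, batch_id: int, instance: str) -> bool:
        """Return True if `instance` owns `batch_id` after this call.

        The first claimant wins; later claims by other instances are rejected,
        while repeated claims by the owner succeed.
        """
        owner = self._owner_by_batch.setdefault(batch_id, instance)
        return owner == instance
```

With this, an old and a new pod overlapping during a rollover would both call `try_claim` before working on a batch, and only the one that gets `True` would go on to solve and submit.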

@fleupold fleupold added enhancement New feature or request [Driver] tasks that relate to the driver subsystem labels Aug 17, 2020
@fleupold
Contributor Author

Small Update on the progress:

@giacomolicari has been working on making the price-estimation service roll over smoothly (since the ready route is already implemented there). We now have an init container that downloads the latest orderbook from an S3 bucket, and a sidecar container that re-uploads it to S3 on every change.

He is currently working on changing the auto-deployer to use a smooth restart instead of forced delete + restart.

Once #1373 is merged we can also have the same concept for the solver containers. There, we might not have to upload any data since one host (price estimator) should be fine.
