Autoscaling inference endpoints #412
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
cc @albertvillanova this could interest you btw :)
Thanks for the ping.
Just a preliminary comment while trying to understand the logic:
- The PR does not allow the user to create an endpoint with an arbitrary name. The user can only:
- Either create an endpoint without being able to choose the endpoint name
- Or reuse an existing (arbitrarily named) endpoint
Yep! That's the idea, since I wanted to streamline the process as much as possible: either the user already has an endpoint, or they want one created and don't care about the name
Would you want to be able to select a name in all cases?
I do not have a strong opinion. I just wanted to make sure I understand all the possible use cases and whether we should allow all of them:
The simplest cases:
At the moment we only cover the simple cases with this PR: the user passes an endpoint name or a model name, we check whether it is an already existing endpoint or whether we need to spin one up, and if we do spin one up, we delete it afterwards.
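For readers following the thread, the flow described in this comment can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the PR's actual code: `FakeEndpointService` and `endpoint_for` are hypothetical names, and the in-memory dict only stands in for a real endpoint provider.

```python
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class FakeEndpointService:
    """Hypothetical in-memory stand-in for an endpoint provider."""

    endpoints: dict = field(default_factory=dict)  # endpoint name -> model name

    def exists(self, name: str) -> bool:
        return name in self.endpoints

    def create(self, model: str) -> str:
        # The user does not choose the name; one is generated automatically.
        name = f"{model.split('/')[-1].lower()}-{uuid.uuid4().hex[:6]}"
        self.endpoints[name] = model
        return name

    def delete(self, name: str) -> None:
        self.endpoints.pop(name, None)


@contextmanager
def endpoint_for(service: FakeEndpointService, name_or_model: str):
    """Reuse an existing endpoint, or spin one up and delete it afterwards."""
    created = not service.exists(name_or_model)
    name = service.create(name_or_model) if created else name_or_model
    try:
        yield name
    finally:
        # Only endpoints created here are cleaned up; reused ones are left alone.
        if created:
            service.delete(name)
```

Used as `with endpoint_for(service, "my-endpoint") as name: ...`, a pre-existing endpoint survives the block, while an endpoint created from a model name is removed on exit, even if the body raises.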
My general suggestion would be:
But some aspects, if considered useful, could be addressed in other PRs. The only inconvenience I see is:
Very fair point, I'll revert this specific change
Good! Only a small nit on the config file
This PR: