update docs for v0.10.0 #205
Merged
kserve-oss-bot
merged 10 commits into
kserve:main
from
alexagriffith:alexagriffith/update_dataplane_docs
Jan 21, 2023
4646125
clarify runtimes.txt usage in custom model
alexagriffith 03c08f1
updating docs to reflect v0.10
alexagriffith 654a39e
update wording ab model version
alexagriffith 3ee4078
add model ready
alexagriffith 175aaaa
update serving runtime table
alexagriffith 705924d
add mlfow to runtime table
alexagriffith dc48af1
update dataplane md
alexagriffith 53113aa
update v1
alexagriffith 0f00deb
fix links
alexagriffith 9346498
resolving pr comments
alexagriffith
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
# Data Plane | ||
The InferenceService Data Plane architecture consists of a static graph of components which coordinate requests for a single model. Advanced features such as Ensembling, A/B testing, and Multi-Arm-Bandits should compose InferenceServices together. | ||
|
||
## Introduction | ||
KServe's data plane protocol introduces an inference API that is independent of any specific ML/DL framework and model server. This allows for quick iterations and consistency across Inference Services and supports both easy-to-use and high-performance use cases. | ||
|
||
By implementing this protocol, both inference clients and servers will increase their utility and | ||
portability by operating seamlessly on platforms that have standardized around this API. KServe's inference protocol is endorsed by NVIDIA | ||
Triton Inference Server, TensorFlow Serving, and TorchServe. | ||
|
||
![Data Plane](../../images/dataplane.jpg) | ||
<br> Note: Protocol V2 uses /infer instead of :predict | ||
|
||
### Concepts | ||
**Component**: Each endpoint is composed of multiple components: "predictor", "explainer", and "transformer". The only required component is the predictor, which is the core of the system. As KServe evolves, we plan to increase the number of supported components to enable use cases like Outlier Detection. | ||
|
||
**Predictor**: The predictor is the workhorse of the InferenceService. It is simply a model and a model server that makes it available at a network endpoint. | ||
|
||
**Explainer**: The explainer enables an optional alternate data plane that provides model explanations in addition to predictions. Users may define their own explanation container, which is configured with relevant environment variables such as the prediction endpoint. For common use cases, KServe provides out-of-the-box explainers like Alibi. | ||
|
||
**Transformer**: The transformer enables users to define pre- and post-processing steps that run before and after the prediction and explanation workflows. Like the explainer, it is configured with relevant environment variables. For common use cases, KServe provides out-of-the-box transformers like Feast. | ||
|
||
|
||
## Data Plane V1 & V2 | ||
|
||
KServe supports two versions of its data plane, V1 and V2. The V1 protocol offers a standard prediction workflow with HTTP/REST. The V2 protocol addresses several issues found with V1, including performance and generality across a large number of model frameworks and servers. Protocol V2 also expands the capabilities of V1 by adding gRPC APIs. | ||
|
||
### Main changes | ||
|
||
* V2 does not currently support the explain endpoint | ||
* V2 added Server Readiness/Liveness/Metadata endpoints | ||
* V2 endpoint paths contain `/` instead of `:` | ||
* V2 renamed `:predict` endpoint to `/infer` | ||
* V2 allows for model versions in the request path (optional) | ||
|
||
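The path changes listed above can be sketched as a small translation helper. This is only an illustration, not part of KServe: the function name `v1_to_v2_path` is hypothetical, the leading `/` on the V2 paths is an assumption (the V2 table writes them without one), and mapping the V1 "Model Ready" path to the V2 `/ready` path is an assumed correspondence based on the two API tables.

```python
import re

# Matches /v1/models/<model_name> with an optional :predict or :explain verb.
_V1_PATTERN = re.compile(
    r"^/v1/models/(?P<name>[^/:]+)(?::(?P<verb>predict|explain))?$"
)


def v1_to_v2_path(v1_path: str) -> str:
    """Translate a V1 model path into its V2 counterpart (hypothetical helper)."""
    match = _V1_PATTERN.match(v1_path)
    if match is None:
        raise ValueError(f"not a V1 model path: {v1_path!r}")

    name = match.group("name")
    verb = match.group("verb")

    if verb == "explain":
        # V2 does not currently support the explain endpoint.
        raise ValueError("V2 has no equivalent of the V1 :explain endpoint")
    if verb == "predict":
        # V2 renamed :predict to /infer and uses "/" instead of ":".
        return f"/v2/models/{name}/infer"
    # No verb: the V1 "Model Ready" path; assumed to map to V2 model readiness.
    return f"/v2/models/{name}/ready"
```

For example, `v1_to_v2_path("/v1/models/sklearn-iris:predict")` yields `/v2/models/sklearn-iris/infer`, while an `:explain` path raises `ValueError`.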
|
||
### V1 APIs | ||
|
||
| API | Verb | Path | | ||
| ------------- | ------------- | ------------- | | ||
| List Models | GET | /v1/models | | ||
| Model Ready | GET | /v1/models/\<model_name\> | | ||
| Predict | POST | /v1/models/\<model_name\>:predict | | ||
| Explain | POST | /v1/models/\<model_name\>:explain | | ||
|
||
### V2 APIs | ||
|
||
| API | Verb | Path | | ||
| ------------- | ------------- | ------------- | | ||
| Inference | POST | v2/models/\<model_name\>[/versions/\<model_version\>]/infer | | ||
| Model Metadata | GET | v2/models/\<model_name\>[/versions/\<model_version\>] | | ||
| Server Readiness | GET | v2/health/ready | | ||
| Server Liveness | GET | v2/health/live | | ||
| Server Metadata | GET | v2 | | ||
| Model Readiness | GET | v2/models/\<model_name\>[/versions/\<model_version\>]/ready | ||
|
||
** path contents in `[]` are optional | ||
|
||
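As a rough sketch of how the optional `[/versions/<model_version>]` segment in the V2 table composes, the helper below builds the model-scoped paths. The function name `v2_model_path` and its parameters are hypothetical, and the leading `/` is an assumption, since the table lists the paths without one.

```python
from typing import Optional


def v2_model_path(
    model_name: str,
    model_version: Optional[str] = None,
    action: str = "",
) -> str:
    """Build a V2 model path; `action` is "", "infer", or "ready" (hypothetical helper)."""
    if action not in ("", "infer", "ready"):
        raise ValueError(f"unsupported action: {action!r}")

    path = f"/v2/models/{model_name}"
    # The version segment is optional and is only added when a version is given.
    if model_version is not None:
        path += f"/versions/{model_version}"
    # An empty action yields the Model Metadata path.
    if action:
        path += f"/{action}"
    return path
```

Calling `v2_model_path("mnist", action="infer")` gives the unversioned inference path, and passing `model_version="1"` inserts `/versions/1` before the action.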
Please see [V1 Protocol](./v1_protocol.md) and [V2 Protocol](./v2_protocol.md) documentation for more information. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
# Data Plane (V1) | ||
KServe's V1 protocol offers a standardized prediction workflow across all model frameworks. This protocol version is still supported, but it is recommended that users migrate to the [V2 protocol](./v2_protocol.md) for better performance and standardization among serving runtimes. However, if a use case requires a more flexible schema than protocol V2 provides, the V1 protocol is still an option. | ||
|
||
| API | Verb | Path | Request Payload | Response Payload | | ||
| ------------- | ------------- | ------------- | ------------- | ------------- | | ||
| List Models | GET | /v1/models | | {"models": \[\<model_name\>\]} | | ||
| Model Ready | GET | /v1/models/\<model_name\> | | {"name": \<model_name\>,"ready": $bool} | ||
| Predict | POST | /v1/models/\<model_name\>:predict | {"instances": []} ** | {"predictions": []} | | ||
| Explain | POST | /v1/models/\<model_name\>:explain | {"instances": []} ** | {"predictions": [], "explanations": []} | ||
|
||
** = payload is optional | ||
|
||
Note: The response payload in V1 protocol is not strictly enforced. A custom server can define and return its own response payload. We encourage using the KServe-defined response payload for consistency. | ||
|
||
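Because the V1 response payload is not strictly enforced, a client may want to read it defensively. The following is a minimal sketch using only the Python standard library; the helper names `build_predict_body` and `extract_predictions` are hypothetical, and falling back to the raw payload when no `"predictions"` key is present is one possible design choice, not documented KServe behavior.

```python
import json
from typing import Any, List


def build_predict_body(instances: List[Any]) -> str:
    """Serialize a V1 predict request body of the form {"instances": [...]}."""
    return json.dumps({"instances": instances})


def extract_predictions(response_text: str) -> Any:
    """Return the "predictions" field if present, else the whole decoded payload.

    A custom server can return its own payload shape, so the caller should be
    prepared to handle a value that is not a list of predictions.
    """
    payload = json.loads(response_text)
    if isinstance(payload, dict) and "predictions" in payload:
        return payload["predictions"]
    # Non-standard response from a custom server: hand it back unchanged.
    return payload
```

The body returned by `build_predict_body` would be sent with a POST to the `:predict` path from the table above; how the request is sent (HTTP client, host, headers) is left out of this sketch.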
|
||
## API Definitions | ||
|
||
| API | Definition | | ||
| --- | --- | | ||
| Predict | The "predict" API performs inference on a model. The response is the prediction result. All InferenceServices speak the [TensorFlow V1 HTTP API](https://www.tensorflow.org/tfx/serving/api_rest#predict_api). | ||
| Explain | The "explain" API is an optional component that provides model explanations in addition to predictions. The standardized explainer interface is identical to the TensorFlow V1 HTTP API with the addition of an ":explain" verb. | ||
| Model Ready | The “model ready” health API indicates if a specific model is ready for inferencing. If the model(s) is downloaded and ready to serve requests, the model ready endpoint returns the list of accessible <model_name>(s). | | ||
| List Models | The "models" API exposes a list of models in the model registry. | | ||
|
||
<!-- TODO: ## Examples --> |
it seems like saying 0.9 and prior is kind of not useful right? means every version. and it doesn't change with 0.10. so that we dont have to update this with each version update, we can just say
kserve creates
until that changes.
unless im reading this wrong. the next sentence is that "Now kserve provides an option..." but kserve is at v0.9 so im confused a little by what this means.
should it maybe say "Starting with v0.9, Kserve provides an option...."
I agree with above change. And replacing "Now" with "Starting with v0.9" makes it clearer too when the option to disable was provided.
im not 100% sure this started with v0.9, @yuzisun can you confirm?