Added more deployments to the docs
Aleksandr Movchan committed Jul 19, 2024
1 parent 44caffd commit 539c6cb
Showing 4 changed files with 61 additions and 16 deletions.
9 changes: 9 additions & 0 deletions aana/deployments/__init__.py
@@ -6,11 +6,20 @@
LLMBatchOutput,
LLMOutput,
)
from aana.deployments.hf_text_generation_deployment import (
HfTextGenerationConfig,
HfTextGenerationDeployment,
)
from aana.deployments.vllm_deployment import VLLMConfig, VLLMDeployment

__all__ = [
"AanaDeploymentHandle",
"BaseDeployment",
"BaseTextGenerationDeployment",
"HfTextGenerationConfig",
"HfTextGenerationDeployment",
"VLLMConfig",
"VLLMDeployment",
"ChatOutput",
"LLMBatchOutput",
"LLMOutput",
27 changes: 16 additions & 11 deletions aana/deployments/vllm_deployment.py
@@ -32,17 +32,22 @@ class VLLMConfig(BaseModel):
"""The configuration of the vLLM deployment.
Attributes:
model (str): the model name
dtype (str): the data type (optional, default: "auto")
quantization (str): the quantization method (optional, default: None)
gpu_memory_reserved (float): the GPU memory reserved for the model in mb
default_sampling_params (SamplingParams): the default sampling parameters.
max_model_len (int): the maximum generated text length in tokens (optional, default: None)
chat_template (str): the name of the chat template, if not provided, the chat template from the model will be used
but some models may not have a chat template (optional, default: None)
enforce_eager: whether to enforce eager execution (optional, default: False)
engine_args: extra engine arguments (optional, default: {})
model: The model name.
dtype: The data type.
Defaults to "auto".
quantization: The quantization method.
Defaults to None.
gpu_memory_reserved: The GPU memory reserved for the model in MB.
default_sampling_params: The default sampling parameters.
max_model_len: The maximum generated text length in tokens.
Defaults to None.
chat_template: The name of the chat template. If not provided, the chat template
from the model will be used. Some models may not have a chat template.
Defaults to None.
enforce_eager: Whether to enforce eager execution.
Defaults to False.
engine_args: Extra engine arguments.
Defaults to {}.
"""

model: str
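
The attribute list in the new docstring can be sketched as a small config class. This is a minimal illustration only, not the actual `VLLMConfig`: it uses a standard-library dataclass in place of the pydantic `BaseModel` shown in the diff so that it runs without dependencies. The class name `VLLMConfigSketch`, the plain `dict` standing in for `SamplingParams`, the `"temperature"` key, the model name, and the field order are all assumptions made for the example. The types and defaults follow the docstring.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class VLLMConfigSketch:
    """Hypothetical stand-in mirroring the attributes documented for VLLMConfig."""

    # Attributes with no documented default are treated as required here.
    model: str
    gpu_memory_reserved: float  # GPU memory reserved for the model, in MB
    default_sampling_params: dict[str, Any]  # placeholder for SamplingParams (assumption)

    # Attributes with documented defaults.
    dtype: str = "auto"
    quantization: Optional[str] = None
    max_model_len: Optional[int] = None
    chat_template: Optional[str] = None
    enforce_eager: bool = False
    # A mutable default needs a factory so instances do not share one dict.
    engine_args: dict[str, Any] = field(default_factory=dict)


config = VLLMConfigSketch(
    model="example/model-name",  # hypothetical model name
    gpu_memory_reserved=10000.0,
    default_sampling_params={"temperature": 0.0},  # illustrative key only
)
```

Only the three attributes without a documented default have to be supplied; the rest fall back to the values listed in the docstring.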
31 changes: 30 additions & 1 deletion docs/reference/aana_deployments.md
@@ -1,3 +1,32 @@
# Aana Deployments

::: aana.deployments
## AanaDeploymentHandle

AanaDeploymentHandle is a class that allows you to interact with Aana deployments.

::: aana.deployments.AanaDeploymentHandle

## Base classes for deployments

At the moment, there are two base classes that you can use to create your own deployments:
- BaseDeployment: This is the base class for all Aana deployments.
- BaseTextGenerationDeployment: This is the base class for all text generation deployments (LLM deployments).

::: aana.deployments.BaseDeployment
::: aana.deployments.BaseTextGenerationDeployment

## Text generation deployments

### Hugging Face Text Generation Deployment

Hugging Face Text Generation Deployment allows you to use the Hugging Face transformers library to serve LLMs.

::: aana.deployments.HfTextGenerationConfig
::: aana.deployments.HfTextGenerationDeployment

### VLLM Deployment

VLLM Deployment allows you to use the vLLM library to serve LLMs.

::: aana.deployments.VLLMConfig
::: aana.deployments.VLLMDeployment
10 changes: 6 additions & 4 deletions mkdocs.yml
@@ -25,13 +25,13 @@ theme:
toggle:
icon: material/toggle-switch-off-outline
name: Switch to dark mode
primary: teal
primary: 5c21d1
accent: purple
- scheme: slate
toggle:
icon: material/toggle-switch
name: Switch to light mode
primary: teal
primary: 5c21d1
accent: lime

# Extensions
@@ -52,18 +52,19 @@ markdown_extensions:
emoji_index: !!python/name:materialx.emoji.twemoji
emoji_generator: !!python/name:materialx.emoji.to_svg
- toc:
toc_depth: 3
toc_depth: 2

# Plugins
plugins:
mkdocstrings:
handlers:
python:
options:
show_object_full_path: false
show_root_heading: true
show_if_no_docstring: true
show_root_toc_entry: false
inherited_members: true
inherited_members: true
members_order: source
separate_signature: true
unwrap_annotated: true
@@ -73,6 +74,7 @@ plugins:
signature_crossrefs: true
show_symbol_type_heading: true
show_symbol_type_toc: true
show_labels: false
# Customization
extra:
social: