
[octoai] add chat models
pkelaita committed Jul 25, 2024
1 parent d530a70 commit df27c53
Showing 7 changed files with 194 additions and 41 deletions.
46 changes: 25 additions & 21 deletions README.md
@@ -1,12 +1,12 @@
# L2M2: A Simple Python LLM Manager 💬👍

[![Tests](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml/badge.svg?timestamp=1721868974)](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml) [![codecov](https://codecov.io/github/pkelaita/l2m2/graph/badge.svg?token=UWIB0L9PR8)](https://codecov.io/github/pkelaita/l2m2) [![PyPI version](https://badge.fury.io/py/l2m2.svg?timestamp=1721868974)](https://badge.fury.io/py/l2m2)
[![Tests](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml/badge.svg?timestamp=1721884488)](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml) [![codecov](https://codecov.io/github/pkelaita/l2m2/graph/badge.svg?token=UWIB0L9PR8)](https://codecov.io/github/pkelaita/l2m2) [![PyPI version](https://badge.fury.io/py/l2m2.svg?timestamp=1721884488)](https://badge.fury.io/py/l2m2)

**L2M2** ("LLM Manager" → "LLMM" → "L2M2") is a tiny, very simple LLM manager for Python that exposes many models through a unified API. This is useful for evaluations, demos, and production applications that need to be easily model-agnostic.

### Features

- <!--start-count-->17<!--end-count--> supported models (see below) – regularly updated and with more on the way
- <!--start-count-->21<!--end-count--> supported models (see below) – regularly updated and with more on the way
- Session chat memory – even across multiple models
- JSON mode
- Prompt loading tools
@@ -23,25 +23,29 @@ L2M2 currently supports the following models:

<!--start-model-table-->

| Model Name | Provider(s) | Model Version(s) |
| ------------------- | -------------------------------------------------------------------- | ------------------------------------------------------------------- |
| `gpt-4o` | [OpenAI](https://openai.com/product) | `gpt-4o-2024-05-13` |
| `gpt-4o-mini` | [OpenAI](https://openai.com/product) | `gpt-4o-mini-2024-07-18` |
| `gpt-4-turbo` | [OpenAI](https://openai.com/product) | `gpt-4-turbo-2024-04-09` |
| `gpt-3.5-turbo` | [OpenAI](https://openai.com/product) | `gpt-3.5-turbo-0125` |
| `gemini-1.5-pro` | [Google](https://ai.google.dev/) | `gemini-1.5-pro` |
| `gemini-1.0-pro` | [Google](https://ai.google.dev/) | `gemini-1.0-pro` |
| `claude-3.5-sonnet` | [Anthropic](https://www.anthropic.com/api) | `claude-3-5-sonnet-20240620` |
| `claude-3-opus` | [Anthropic](https://www.anthropic.com/api) | `claude-3-opus-20240229` |
| `claude-3-sonnet` | [Anthropic](https://www.anthropic.com/api) | `claude-3-sonnet-20240229` |
| `claude-3-haiku` | [Anthropic](https://www.anthropic.com/api) | `claude-3-haiku-20240307` |
| `command-r` | [Cohere](https://docs.cohere.com/) | `command-r` |
| `command-r-plus` | [Cohere](https://docs.cohere.com/) | `command-r-plus` |
| `mixtral-8x7b` | [Groq](https://wow.groq.com/) | `mixtral-8x7b-32768` |
| `gemma-7b` | [Groq](https://wow.groq.com/) | `gemma-7b-it` |
| `llama3-8b` | [Groq](https://wow.groq.com/), [Replicate](https://replicate.com/) | `llama3-8b-8192`, `meta/meta-llama-3-8b-instruct` |
| `llama3-70b` | [Groq](https://wow.groq.com/), [Replicate](https://replicate.com/) | `llama3-70b-8192`, `meta/meta-llama-3-70b-instruct` |
| `llama3.1-405b` | [Replicate](https://replicate.com/), [OctoAI](https://octoai.cloud/) | `meta/meta-llama-3.1-405b-instruct`, `meta-llama-3.1-405b-instruct` |
| Model Name | Provider(s) | Model Version(s) |
| ------------------- | --------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- |
| `gpt-4o` | [OpenAI](https://openai.com/product) | `gpt-4o-2024-05-13` |
| `gpt-4o-mini` | [OpenAI](https://openai.com/product) | `gpt-4o-mini-2024-07-18` |
| `gpt-4-turbo` | [OpenAI](https://openai.com/product) | `gpt-4-turbo-2024-04-09` |
| `gpt-3.5-turbo` | [OpenAI](https://openai.com/product) | `gpt-3.5-turbo-0125` |
| `gemini-1.5-pro` | [Google](https://ai.google.dev/) | `gemini-1.5-pro` |
| `gemini-1.0-pro` | [Google](https://ai.google.dev/) | `gemini-1.0-pro` |
| `claude-3.5-sonnet` | [Anthropic](https://www.anthropic.com/api) | `claude-3-5-sonnet-20240620` |
| `claude-3-opus` | [Anthropic](https://www.anthropic.com/api) | `claude-3-opus-20240229` |
| `claude-3-sonnet` | [Anthropic](https://www.anthropic.com/api) | `claude-3-sonnet-20240229` |
| `claude-3-haiku` | [Anthropic](https://www.anthropic.com/api) | `claude-3-haiku-20240307` |
| `command-r` | [Cohere](https://docs.cohere.com/) | `command-r` |
| `command-r-plus` | [Cohere](https://docs.cohere.com/) | `command-r-plus` |
| `mistral-7b` | [OctoAI](https://octoai.cloud/) | `mistral-7b-instruct` |
| `mixtral-8x7b` | [Groq](https://wow.groq.com/), [OctoAI](https://octoai.cloud/) | `mixtral-8x7b-32768`, `mixtral-8x7b-instruct` |
| `mixtral-8x22b` | [OctoAI](https://octoai.cloud/) | `mixtral-8x22b-instruct` |
| `gemma-7b` | [Groq](https://wow.groq.com/) | `gemma-7b-it` |
| `llama3-8b` | [Groq](https://wow.groq.com/), [Replicate](https://replicate.com/) | `llama3-8b-8192`, `meta/meta-llama-3-8b-instruct` |
| `llama3-70b` | [Groq](https://wow.groq.com/), [Replicate](https://replicate.com/), [OctoAI](https://octoai.cloud/) | `llama3-70b-8192`, `meta/meta-llama-3-70b-instruct`, `meta-llama-3-70b-instruct` |
| `llama3.1-8b` | [OctoAI](https://octoai.cloud/) | `meta-llama-3.1-8b-instruct` |
| `llama3.1-70b` | [OctoAI](https://octoai.cloud/) | `meta-llama-3.1-70b-instruct` |
| `llama3.1-405b` | [Replicate](https://replicate.com/), [OctoAI](https://octoai.cloud/) | `meta/meta-llama-3.1-405b-instruct`, `meta-llama-3.1-405b-instruct` |

<!--end-model-table-->
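To summarize the table's OctoAI changes, the sketch below collects the models that gain OctoAI support in this commit as a plain mapping from L2M2 model name to OctoAI model ID. The names are taken directly from the table above; the dict itself is illustrative and is not l2m2's internal representation.

```python
# Models with OctoAI support after this commit, mapped to the provider's
# model ID (names copied from the README table; structure is illustrative).
new_octoai_models = {
    "mistral-7b": "mistral-7b-instruct",
    "mixtral-8x7b": "mixtral-8x7b-instruct",
    "mixtral-8x22b": "mixtral-8x22b-instruct",
    "llama3-70b": "meta-llama-3-70b-instruct",
    "llama3.1-8b": "meta-llama-3.1-8b-instruct",
    "llama3.1-70b": "meta-llama-3.1-70b-instruct",
    "llama3.1-405b": "meta-llama-3.1-405b-instruct",
}
print(len(new_octoai_models))  # → 7
```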

2 changes: 1 addition & 1 deletion l2m2/_internal/http.py
@@ -39,10 +39,10 @@ async def _handle_replicate_201(
async def llm_post(
client: httpx.AsyncClient,
provider: str,
model_id: str,
api_key: str,
data: Dict[str, Any],
timeout: Optional[int],
model_id: Optional[str] = None,
) -> Any:
endpoint = PROVIDER_INFO[provider]["endpoint"]
if API_KEY in endpoint:
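The hunk above promotes `model_id` from a trailing optional keyword argument to a required positional parameter of `llm_post`, placed before `api_key`. A plausible motivation (this is a hedged sketch, not l2m2's actual code): some provider endpoints embed the model ID in the URL template, so the caller must always supply it. `PROVIDER_INFO`, the placeholder names, and the endpoint URLs below are illustrative stand-ins.

```python
# Illustrative stand-in for l2m2's provider table: some endpoints embed
# the model ID (and API key) directly in the URL, so model_id must always
# be provided. The URLs here are fake placeholders.
PROVIDER_INFO = {
    "google": {
        "endpoint": "https://example.invalid/models/{model_id}:generate?key={api_key}"
    },
    "openai": {"endpoint": "https://example.invalid/v1/chat/completions"},
}

def build_endpoint(provider: str, model_id: str, api_key: str) -> str:
    # Fill any placeholders present in the endpoint template; extra
    # keyword arguments are ignored for templates without placeholders.
    endpoint = PROVIDER_INFO[provider]["endpoint"]
    return endpoint.format(model_id=model_id, api_key=api_key)

print(build_endpoint("google", "gemini-1.5-pro", "sk-test"))
```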
20 changes: 16 additions & 4 deletions l2m2/client/base_llm_client.py
@@ -22,6 +22,7 @@
get_extra_message,
run_json_strats_out,
)
from l2m2.exceptions import LLMOperationError
from l2m2._internal.http import llm_post


@@ -501,6 +502,7 @@ async def _call_openai(
result = await llm_post(
client=self.httpx_client,
provider="openai",
model_id=model_id,
api_key=self.api_keys["openai"],
data={"model": model_id, "messages": messages, **params},
timeout=timeout,
@@ -532,6 +534,7 @@ async def _call_anthropic(
result = await llm_post(
client=self.httpx_client,
provider="anthropic",
model_id=model_id,
api_key=self.api_keys["anthropic"],
data={"model": model_id, "messages": messages, **params},
timeout=timeout,
@@ -564,6 +567,7 @@ async def _call_cohere(
result = await llm_post(
client=self.httpx_client,
provider="cohere",
model_id=model_id,
api_key=self.api_keys["cohere"],
data={"model": model_id, "message": prompt, **params},
timeout=timeout,
@@ -595,6 +599,7 @@ async def _call_groq(
result = await llm_post(
client=self.httpx_client,
provider="groq",
model_id=model_id,
api_key=self.api_keys["groq"],
data={"model": model_id, "messages": messages, **params},
timeout=timeout,
@@ -633,10 +638,10 @@ async def _call_google(
result = await llm_post(
client=self.httpx_client,
provider="google",
model_id=model_id,
api_key=self.api_keys["google"],
data=data,
timeout=timeout,
model_id=model_id,
)
result = result["candidates"][0]

@@ -657,12 +662,12 @@ async def _call_replicate(
json_mode_strategy: JsonModeStrategy,
) -> str:
if isinstance(self.memory, ChatMemory):
raise ValueError(
raise LLMOperationError(
"Chat memory is not supported with Replicate."
+ " Try using Groq, or using ExternalMemory instead."
)
if json_mode_strategy.strategy_name == StrategyName.PREPEND:
raise ValueError(
raise LLMOperationError(
"JsonModeStrategy.prepend() is not supported with Replicate."
+ " Try using Groq, or using JsonModeStrategy.strip() instead."
)
@@ -673,10 +678,10 @@
result = await llm_post(
client=self.httpx_client,
provider="replicate",
model_id=model_id,
api_key=self.api_keys["replicate"],
data={"input": {"prompt": prompt, **params}},
timeout=timeout,
model_id=model_id,
)
return "".join(result["output"])

@@ -690,6 +695,12 @@ async def _call_octoai(
json_mode: bool,
json_mode_strategy: JsonModeStrategy,
) -> str:
if isinstance(self.memory, ChatMemory) and model_id == "mixtral-8x22b-instruct":
raise LLMOperationError(
"Chat memory is not supported with mixtral-8x22b via OctoAI. Try using"
+ " ExternalMemory instead, or ChatMemory with a different model/provider."
)

messages = []
if system_prompt is not None:
messages.append({"role": "system", "content": system_prompt})
@@ -705,6 +716,7 @@
result = await llm_post(
client=self.httpx_client,
provider="octoai",
model_id=model_id,
api_key=self.api_keys["octoai"],
data={"model": model_id, "messages": messages, **params},
timeout=timeout,
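The new `_call_octoai` guard above fails fast with a descriptive `LLMOperationError` when an unsupported feature/model combination is requested, instead of letting the provider call fail opaquely. The sketch below demonstrates the same guard-clause pattern in isolation: the exception class mirrors `l2m2.exceptions`, while `ChatMemory` and `check_octoai_support` are minimal stand-ins, not l2m2's actual implementations.

```python
# Guard-clause sketch mirroring the new check in _call_octoai: reject an
# unsupported feature/model pairing up front with a descriptive error.
class LLMOperationError(Exception):
    """Raised when a model does not support a particular feature or mode."""

class ChatMemory:
    """Minimal stand-in for l2m2's session chat memory."""

def check_octoai_support(memory: object, model_id: str) -> None:
    # mixtral-8x22b via OctoAI does not support chat memory in this commit.
    if isinstance(memory, ChatMemory) and model_id == "mixtral-8x22b-instruct":
        raise LLMOperationError(
            "Chat memory is not supported with mixtral-8x22b via OctoAI."
        )

try:
    check_octoai_support(ChatMemory(), "mixtral-8x22b-instruct")
except LLMOperationError as e:
    print(f"rejected: {e}")
```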
6 changes: 6 additions & 0 deletions l2m2/exceptions.py
@@ -8,3 +8,9 @@ class LLMRateLimitError(Exception):
"""Raised when a request to an LLM provider API is rate limited."""

pass


class LLMOperationError(Exception):
"""Raised when a model does not support a particular feature or mode."""

pass
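Separating `LLMOperationError` from `LLMRateLimitError` lets callers treat the two failures differently: rate limits are transient and retryable, while an unsupported feature/model combination is not. The sketch below illustrates that consumer-side distinction; the exception classes mirror `l2m2.exceptions`, and `call_model` is a hypothetical stand-in.

```python
# Sketch: retry on rate limits, but surface operation errors immediately,
# since retrying an unsupported feature/model combination cannot succeed.
class LLMRateLimitError(Exception):
    """Raised when a request to an LLM provider API is rate limited."""

class LLMOperationError(Exception):
    """Raised when a model does not support a particular feature or mode."""

def call_model(attempt: int) -> str:
    # Hypothetical stand-in: rate-limited twice, then succeeds.
    if attempt < 2:
        raise LLMRateLimitError("429 Too Many Requests")
    return "ok"

result = None
for attempt in range(3):
    try:
        result = call_model(attempt)
        break
    except LLMRateLimitError:
        continue  # transient: retry
    except LLMOperationError:
        raise  # unsupported feature: retrying won't help
print(result)  # → ok
```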
96 changes: 94 additions & 2 deletions l2m2/model_info.py
@@ -310,6 +310,22 @@ class ModelEntry(TypedDict):
"extras": {},
},
},
"mistral-7b": {
"octoai": {
"model_id": "mistral-7b-instruct",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
"max": 2.0,
},
"max_tokens": {
"default": PROVIDER_DEFAULT,
"max": INF,
},
},
"extras": {},
},
},
"mixtral-8x7b": {
"groq": {
"model_id": "mixtral-8x7b-32768",
@@ -325,6 +341,36 @@ class ModelEntry(TypedDict):
},
"extras": {},
},
"octoai": {
"model_id": "mixtral-8x7b-instruct",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
"max": 2.0,
},
"max_tokens": {
"default": PROVIDER_DEFAULT,
"max": INF,
},
},
"extras": {},
},
},
"mixtral-8x22b": {
"octoai": {
"model_id": "mixtral-8x22b-instruct",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
"max": 2.0,
},
"max_tokens": {
"default": PROVIDER_DEFAULT,
"max": INF,
},
},
"extras": {},
},
},
"gemma-7b": {
"groq": {
@@ -348,7 +394,7 @@ class ModelEntry(TypedDict):
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
"max": 2,
"max": 2.0,
},
"max_tokens": {
"default": PROVIDER_DEFAULT,
@@ -379,7 +425,7 @@ class ModelEntry(TypedDict):
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
"max": 2,
"max": 2.0,
},
"max_tokens": {
"default": PROVIDER_DEFAULT,
@@ -403,6 +449,52 @@ class ModelEntry(TypedDict):
},
"extras": {},
},
"octoai": {
"model_id": "meta-llama-3-70b-instruct",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
"max": 2.0,
},
"max_tokens": {
"default": PROVIDER_DEFAULT,
"max": INF,
},
},
"extras": {},
},
},
"llama3.1-8b": {
"octoai": {
"model_id": "meta-llama-3.1-8b-instruct",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
"max": 2.0,
},
"max_tokens": {
"default": PROVIDER_DEFAULT,
"max": INF,
},
},
"extras": {},
},
},
"llama3.1-70b": {
"octoai": {
"model_id": "meta-llama-3.1-70b-instruct",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
"max": 2.0,
},
"max_tokens": {
"default": PROVIDER_DEFAULT,
"max": INF,
},
},
"extras": {},
},
},
"llama3.1-405b": {
"replicate": {
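Each new `model_info.py` entry declares per-provider parameter bounds (`temperature` up to 2.0, unbounded `max_tokens`), which can drive request validation before a call is made. The sketch below shows one way such an entry could be consumed; the entry copies the `llama3.1-8b`/`octoai` shape from the diff, while `PROVIDER_DEFAULT`, `INF`, and `validate_param` are illustrative stand-ins rather than l2m2's actual validation code.

```python
# Sentinel stand-ins for the constants referenced in model_info.py.
PROVIDER_DEFAULT = object()
INF = float("inf")

# Entry shape copied from the llama3.1-8b/octoai addition in this commit.
ENTRY = {
    "octoai": {
        "model_id": "meta-llama-3.1-8b-instruct",
        "params": {
            "temperature": {"default": PROVIDER_DEFAULT, "max": 2.0},
            "max_tokens": {"default": PROVIDER_DEFAULT, "max": INF},
        },
        "extras": {},
    },
}

def validate_param(provider: str, name: str, value: float) -> float:
    # Reject values above the per-provider declared maximum.
    spec = ENTRY[provider]["params"][name]
    if value > spec["max"]:
        raise ValueError(f"{name}={value} exceeds max {spec['max']}")
    return value

print(validate_param("octoai", "temperature", 1.5))  # → 1.5
```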