Model Updates (#17)
pkelaita authored Dec 17, 2024
2 parents ed7d77a + 4e5652b commit 9087fa8
Showing 9 changed files with 270 additions and 116 deletions.
21 changes: 20 additions & 1 deletion CHANGELOG.md
@@ -1,9 +1,28 @@
# Changelog

_Current version: 0.0.38_
_Current version: 0.0.39_

[PyPi link](https://pypi.org/project/l2m2/)

### 0.0.39 - December 17, 2024

> [!CAUTION]
> This release has breaking changes! Please read the changelog carefully.

#### Added

- Support for [Llama 3.3 70b](https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_3/) via [Groq](https://console.groq.com/docs/models) and [Cerebras](https://inference-docs.cerebras.ai/introduction).
- Support for OpenAI's [o1 series](https://openai.com/o1/): `o1`, `o1-preview`, and `o1-mini`.
- The `extra_params` parameter to `call` and `call_custom`.

> [!NOTE]
> At the time of this release, you must be on OpenAI's [usage tier](https://platform.openai.com/docs/guides/rate-limits) 5 to use `o1` and tier 1+ to use `o1-preview` and `o1-mini`.

#### Removed

- `gemma-7b` has been removed as it has been [deprecated](https://console.groq.com/docs/models) by Groq.
- `llama-3.1-70b` has been removed as it has been deprecated by both [Groq](https://console.groq.com/docs/models) and [Cerebras](https://inference-docs.cerebras.ai/introduction).

### v0.0.38 - December 12, 2024

> [!CAUTION]
104 changes: 67 additions & 37 deletions README.md
@@ -1,14 +1,14 @@
# L2M2: A Simple Python LLM Manager 💬👍

[![Tests](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml/badge.svg?timestamp=1734052060)](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml) [![codecov](https://codecov.io/github/pkelaita/l2m2/graph/badge.svg?token=UWIB0L9PR8)](https://codecov.io/github/pkelaita/l2m2) [![PyPI version](https://badge.fury.io/py/l2m2.svg?timestamp=1734052060)](https://badge.fury.io/py/l2m2)
[![Tests](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml/badge.svg?timestamp=1734477464)](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml) [![codecov](https://codecov.io/github/pkelaita/l2m2/graph/badge.svg?token=UWIB0L9PR8)](https://codecov.io/github/pkelaita/l2m2) [![PyPI version](https://badge.fury.io/py/l2m2.svg?timestamp=1734477464)](https://badge.fury.io/py/l2m2)

**L2M2** ("LLM Manager" → "LLMM" → "L2M2") is a tiny and very simple LLM manager for Python that exposes lots of models through a unified API. This is useful for evaluation, demos, production applications etc. that need to easily be model-agnostic.

![](assets/l2m2_demo.gif)
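
To make the unified-API claim concrete, here is a minimal usage sketch. It assumes that `LLMClient` is exported from `l2m2.client` and that the relevant provider key (e.g., `OPENAI_API_KEY`) is set in the environment; both details live in the collapsed portion of this README, so treat them as assumptions here:

```python
# Minimal sketch: assumes l2m2 is installed and OPENAI_API_KEY is set.
from l2m2.client import LLMClient

client = LLMClient()

# The same call shape works for any model in the table below, so swapping
# providers is just a matter of changing the model name.
response = client.call(
    model="gpt-4o-mini",
    prompt="In one sentence, what is a model-agnostic LLM client?",
)
print(response)
```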

### Features

- <!--start-count-->29<!--end-count--> supported models (see below) – regularly updated and with more on the way.
- <!--start-count-->31<!--end-count--> supported models (see below) – regularly updated and with more on the way.
- Session chat memory – even across multiple models or with concurrent memory streams.
- JSON mode
- Prompt loading tools
@@ -25,37 +25,39 @@ L2M2 currently supports the following models:

<!--start-model-table-->

| Model Name | Provider(s) | Model Version(s) |
| --------------------- | ------------------------------------------------------------------ | --------------------------------------------------- |
| `gpt-4o` | [OpenAI](https://openai.com/product) | `gpt-4o-2024-11-20` |
| `gpt-4o-mini` | [OpenAI](https://openai.com/product) | `gpt-4o-mini-2024-07-18` |
| `gpt-4-turbo` | [OpenAI](https://openai.com/product) | `gpt-4-turbo-2024-04-09` |
| `gpt-3.5-turbo` | [OpenAI](https://openai.com/product) | `gpt-3.5-turbo-0125` |
| `gemini-2.0-flash` | [Google](https://ai.google.dev/) | `gemini-2.0-flash-exp` |
| `gemini-1.5-flash` | [Google](https://ai.google.dev/) | `gemini-1.5-flash` |
| `gemini-1.5-flash-8b` | [Google](https://ai.google.dev/) | `gemini-1.5-flash-8b` |
| `gemini-1.5-pro` | [Google](https://ai.google.dev/) | `gemini-1.5-pro` |
| `claude-3.5-sonnet` | [Anthropic](https://www.anthropic.com/api) | `claude-3-5-sonnet-latest` |
| `claude-3.5-haiku` | [Anthropic](https://www.anthropic.com/api) | `claude-3-5-haiku-latest` |
| `claude-3-opus` | [Anthropic](https://www.anthropic.com/api) | `claude-3-opus-20240229` |
| `claude-3-sonnet` | [Anthropic](https://www.anthropic.com/api) | `claude-3-sonnet-20240229` |
| `claude-3-haiku` | [Anthropic](https://www.anthropic.com/api) | `claude-3-haiku-20240307` |
| `command-r` | [Cohere](https://docs.cohere.com/) | `command-r` |
| `command-r-plus` | [Cohere](https://docs.cohere.com/) | `command-r-plus` |
| `mistral-large` | [Mistral](https://mistral.ai/) | `mistral-large-latest` |
| `ministral-3b` | [Mistral](https://mistral.ai/) | `ministral-3b-latest` |
| `ministral-8b` | [Mistral](https://mistral.ai/) | `ministral-8b-latest` |
| `mistral-small` | [Mistral](https://mistral.ai/) | `mistral-small-latest` |
| `mixtral-8x7b` | [Groq](https://wow.groq.com/) | `mixtral-8x7b-32768` |
| `gemma-7b` | [Groq](https://wow.groq.com/) | `gemma-7b-it` |
| `gemma-2-9b` | [Groq](https://wow.groq.com/) | `gemma2-9b-it` |
| `llama-3-8b` | [Groq](https://wow.groq.com/), [Replicate](https://replicate.com/) | `llama3-8b-8192`, `meta/meta-llama-3-8b-instruct` |
| `llama-3-70b` | [Groq](https://wow.groq.com/), [Replicate](https://replicate.com/) | `llama3-70b-8192`, `meta/meta-llama-3-70b-instruct` |
| `llama-3.1-8b` | [Groq](https://wow.groq.com/), [Cerebras](https://cerebras.ai/) | `llama-3.1-8b-instant`, `llama3.1-8b` |
| `llama-3.1-70b` | [Groq](https://wow.groq.com/), [Cerebras](https://cerebras.ai/) | `llama-3.1-70b-versatile`, `llama3.1-70b` |
| `llama-3.1-405b` | [Replicate](https://replicate.com/) | `meta/meta-llama-3.1-405b-instruct` |
| `llama-3.2-1b` | [Groq](https://wow.groq.com/) | `llama-3.2-1b-preview` |
| `llama-3.2-3b` | [Groq](https://wow.groq.com/) | `llama-3.2-3b-preview` |
| Model Name | Provider(s) | Model Version(s) |
| --------------------- | ----------------------------------------------------------------------------- | --------------------------------------------------- |
| `gpt-4o` | [OpenAI](https://openai.com/api/) | `gpt-4o-2024-11-20` |
| `gpt-4o-mini` | [OpenAI](https://openai.com/api/) | `gpt-4o-mini-2024-07-18` |
| `o1` | [OpenAI](https://openai.com/api/) | `o1` |
| `o1-preview` | [OpenAI](https://openai.com/api/) | `o1-preview` |
| `o1-mini` | [OpenAI](https://openai.com/api/) | `o1-mini` |
| `gpt-4-turbo` | [OpenAI](https://openai.com/api/) | `gpt-4-turbo-2024-04-09` |
| `gpt-3.5-turbo` | [OpenAI](https://openai.com/api/) | `gpt-3.5-turbo-0125` |
| `gemini-2.0-flash` | [Google](https://ai.google.dev/) | `gemini-2.0-flash-exp` |
| `gemini-1.5-flash` | [Google](https://ai.google.dev/) | `gemini-1.5-flash` |
| `gemini-1.5-flash-8b` | [Google](https://ai.google.dev/) | `gemini-1.5-flash-8b` |
| `gemini-1.5-pro` | [Google](https://ai.google.dev/) | `gemini-1.5-pro` |
| `claude-3.5-sonnet` | [Anthropic](https://www.anthropic.com/api) | `claude-3-5-sonnet-latest` |
| `claude-3.5-haiku` | [Anthropic](https://www.anthropic.com/api) | `claude-3-5-haiku-latest` |
| `claude-3-opus` | [Anthropic](https://www.anthropic.com/api) | `claude-3-opus-20240229` |
| `claude-3-sonnet` | [Anthropic](https://www.anthropic.com/api) | `claude-3-sonnet-20240229` |
| `claude-3-haiku` | [Anthropic](https://www.anthropic.com/api) | `claude-3-haiku-20240307` |
| `command-r` | [Cohere](https://docs.cohere.com/) | `command-r` |
| `command-r-plus` | [Cohere](https://docs.cohere.com/) | `command-r-plus` |
| `mistral-large` | [Mistral](https://docs.mistral.ai/deployment/laplateforme/overview/) | `mistral-large-latest` |
| `ministral-3b` | [Mistral](https://docs.mistral.ai/deployment/laplateforme/overview/) | `ministral-3b-latest` |
| `ministral-8b` | [Mistral](https://docs.mistral.ai/deployment/laplateforme/overview/) | `ministral-8b-latest` |
| `mistral-small` | [Mistral](https://docs.mistral.ai/deployment/laplateforme/overview/) | `mistral-small-latest` |
| `mixtral-8x7b` | [Groq](https://wow.groq.com/) | `mixtral-8x7b-32768` |
| `gemma-2-9b` | [Groq](https://wow.groq.com/) | `gemma2-9b-it` |
| `llama-3-8b` | [Groq](https://wow.groq.com/), [Replicate](https://replicate.com/) | `llama3-8b-8192`, `meta/meta-llama-3-8b-instruct` |
| `llama-3-70b` | [Groq](https://wow.groq.com/), [Replicate](https://replicate.com/) | `llama3-70b-8192`, `meta/meta-llama-3-70b-instruct` |
| `llama-3.1-8b` | [Groq](https://wow.groq.com/), [Cerebras](https://inference-docs.cerebras.ai) | `llama-3.1-8b-instant`, `llama3.1-8b` |
| `llama-3.1-405b` | [Replicate](https://replicate.com/) | `meta/meta-llama-3.1-405b-instruct` |
| `llama-3.2-1b` | [Groq](https://wow.groq.com/) | `llama-3.2-1b-preview` |
| `llama-3.2-3b` | [Groq](https://wow.groq.com/) | `llama-3.2-3b-preview` |
| `llama-3.3-70b` | [Groq](https://wow.groq.com/), [Cerebras](https://inference-docs.cerebras.ai) | `llama-3.3-70b-versatile`, `llama3.3-70b` |

<!--end-model-table-->

@@ -73,6 +75,7 @@ L2M2 currently supports the following models:
- **Tools**
- [JSON Mode](#tools-json-mode)
- [Prompt Loader](#tools-prompt-loader)
- [Other Capabilities](#other-capabilities)
- [Planned Features](#planned-features)
- [Contributing](#contributing)
- [Contact](#contact)
@@ -149,9 +152,7 @@ response = client.call(
)
```

If you'd like to call a language model from one of the supported providers that isn't officially supported by L2M2 (for example, older models such as `gpt-4-0125-preview`), you can similarly `call_custom` with the additional required parameter `provider`, and pass in the model name expected by the provider's API. Unlike `call`, `call_custom` doesn't guarantee correctness or well-defined behavior.

### Example
#### Example

```python
# example.py
@@ -649,10 +650,39 @@ print(prompt)
Your name is Pierce and you are a software engineer.
```
## Other Capabilities

#### Call Custom

If you'd like to call a language model from one of the supported providers that isn't officially supported by L2M2 (for example, an older model such as `gpt-4-0125-preview`), you can use `call_custom`, which takes the additional required parameter `provider` and expects the model name used by the provider's API. Unlike `call`, `call_custom` doesn't guarantee correctness or well-defined behavior.

```python
response = client.call_custom(
provider="<provider name>",
model_id="<model id for given provider>",
prompt="<prompt>",
...
)
```
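
For instance, here is a hedged sketch of calling the older model named above; the provider string `"openai"` is an assumption about the identifier L2M2 expects:

```python
# Hedged example: an older OpenAI model outside L2M2's official list.
# "openai" as the provider identifier is an assumption.
response = client.call_custom(
    provider="openai",
    model_id="gpt-4-0125-preview",  # model name as OpenAI's API expects it
    prompt="Reply with a one-line greeting.",
)
print(response)
```
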
#### Extra Parameters

You can pass extra parameters through to the provider's API (for example, [reasoning_effort](https://platform.openai.com/docs/api-reference/chat/create#chat-create-reasoning_effort) on OpenAI's o1 series) via the `extra_params` argument to `call` or `call_custom`. Extra parameters are given as a dictionary of key-value pairs, where the values are of type `str`, `int`, or `float`. As with `call_custom`, using `extra_params` does not guarantee correctness or well-defined behavior, so refer to the provider's documentation for correct usage.

```python
response = client.call(
model="<model name>",
prompt="<prompt>",
extra_params={"foo": "bar", "baz": 123},
...
)
```
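
As a concrete but hedged illustration, this sketch forwards the `reasoning_effort` parameter mentioned above to `o1`; whether the key and value are accepted is decided by OpenAI's API, not by L2M2:

```python
# Hedged example: forwards an OpenAI-specific parameter via extra_params.
# L2M2 merges these keys into the request body as-is.
response = client.call(
    model="o1",
    prompt="Prove that the sum of two even integers is even.",
    extra_params={"reasoning_effort": "high"},
)
print(response)
```
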
## Planned Features
- Support for structured outputs where available (Just OpenAI as far as I know)
- Support for OSS and self-hosted (Hugging Face, Ollama, Gpt4all, etc.)
- Support for batch APIs where available (OpenAI, Anthropic, etc.)
- Support for OSS and self-hosted (Hugging Face, Gpt4all, etc.)
- Basic (i.e., customizable & non-opinionated) agent & multi-agent system features
- Tools for common application workflows: RAG, prompt management, search, etc.
- Support for streaming responses
2 changes: 1 addition & 1 deletion l2m2/__init__.py
@@ -1 +1 @@
__version__ = "0.0.38"
__version__ = "0.0.39"
7 changes: 6 additions & 1 deletion l2m2/_internal/http.py
@@ -1,4 +1,4 @@
from typing import Optional, Dict, Any
from typing import Optional, Dict, Any, Union
import httpx

from l2m2.exceptions import LLMTimeoutError, LLMRateLimitError
@@ -43,12 +43,17 @@ async def llm_post(
api_key: str,
data: Dict[str, Any],
timeout: Optional[int],
extra_params: Optional[Dict[str, Union[str, int, float]]],
) -> Any:
endpoint = PROVIDER_INFO[provider]["endpoint"]
if API_KEY in endpoint:
endpoint = endpoint.replace(API_KEY, api_key)
if MODEL_ID in endpoint and model_id is not None:
endpoint = endpoint.replace(MODEL_ID, model_id)

if extra_params:
data.update(extra_params)

try:
response = await client.post(
endpoint,
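
One consequence of the `data.update(extra_params)` call above is that user-supplied keys win: if an extra parameter shares a name with a key already in the request body, it silently overwrites it. Below is a standalone sketch of that merge behavior; the `data` dict is a made-up payload, not L2M2's actual request body:

```python
# Illustrative only: dict.update gives extra_params the last word.
data = {"model": "o1", "temperature": 1.0}
extra_params = {"temperature": 0.2, "reasoning_effort": "high"}
data.update(extra_params)
print(data)  # {'model': 'o1', 'temperature': 0.2, 'reasoning_effort': 'high'}
```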