Gemini Updates #16

Merged 2 commits on Dec 13, 2024
CHANGELOG.md (12 changes: 10 additions & 2 deletions)
@@ -1,14 +1,22 @@
# Changelog

-_Current version: 0.0.37_
+_Current version: 0.0.38_

[PyPi link](https://pypi.org/project/l2m2/)

-### In Development
+### v0.0.38 - December 12, 2024
+
+> [!CAUTION]
+> This release has breaking changes! Please read the changelog carefully.
+
+#### Added
+
+- Support for [Python 3.13](https://www.python.org/downloads/release/python-3130/).
+- Support for Google's [Gemini 2.0 Flash](https://ai.google.dev/gemini-api/docs/models/gemini#gemini-2.0-flash), [Gemini 1.5 Flash](https://ai.google.dev/gemini-api/docs/models/gemini#gemini-1.5-flash), and [Gemini 1.5 Flash 8B](https://ai.google.dev/gemini-api/docs/models/gemini#gemini-1.5-flash-8b) models.
+
+#### Removed
+
+- Gemini 1.0 Pro is no longer supported, as it is [deprecated](https://ai.google.dev/gemini-api/docs/models/gemini#gemini-1.0-pro) by Google. **This is a breaking change!!!** Calls to Gemini 1.0 Pro will fail.

### 0.0.37 - December 9, 2024

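Migration note for the breaking change above: since the only API-visible change is the removed model name, switching to a still-supported Gemini model should be a one-line fix. A minimal sketch, assuming the `LLMClient` interface from the project README and a `GOOGLE_API_KEY` in the environment (both are assumptions here, not shown in this diff):

```python
from l2m2.client import LLMClient

client = LLMClient()  # assumes provider keys (e.g. GOOGLE_API_KEY) are set in the environment

# Before v0.0.38 (now fails, since gemini-1.0-pro was removed):
# response = client.call(model="gemini-1.0-pro", prompt="Hello!")

# After v0.0.38, use one of the supported Gemini models instead:
response = client.call(model="gemini-1.5-flash", prompt="Hello!")
print(response)
```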
README.md (67 changes: 36 additions & 31 deletions)
@@ -1,14 +1,14 @@
# L2M2: A Simple Python LLM Manager 💬👍

-[![Tests](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml/badge.svg?timestamp=1733808328)](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml) [![codecov](https://codecov.io/github/pkelaita/l2m2/graph/badge.svg?token=UWIB0L9PR8)](https://codecov.io/github/pkelaita/l2m2) [![PyPI version](https://badge.fury.io/py/l2m2.svg?timestamp=1733808328)](https://badge.fury.io/py/l2m2)
+[![Tests](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml/badge.svg?timestamp=1734052060)](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml) [![codecov](https://codecov.io/github/pkelaita/l2m2/graph/badge.svg?token=UWIB0L9PR8)](https://codecov.io/github/pkelaita/l2m2) [![PyPI version](https://badge.fury.io/py/l2m2.svg?timestamp=1734052060)](https://badge.fury.io/py/l2m2)

**L2M2** ("LLM Manager" → "LLMM" → "L2M2") is a tiny and very simple LLM manager for Python that exposes lots of models through a unified API. This is useful for evaluations, demos, and production applications that need to be easily model-agnostic.

![](assets/l2m2_demo.gif)

### Features

-- <!--start-count-->27<!--end-count--> supported models (see below) – regularly updated and with more on the way.
+- <!--start-count-->29<!--end-count--> supported models (see below) – regularly updated and with more on the way.
 - Session chat memory – even across multiple models or with concurrent memory streams.
 - JSON mode
 - Prompt loading tools
@@ -25,35 +25,37 @@ L2M2 currently supports the following models:

<!--start-model-table-->

-| Model Name          | Provider(s)                                                        | Model Version(s)                                     |
-| ------------------- | ------------------------------------------------------------------ | ---------------------------------------------------- |
-| `gpt-4o`            | [OpenAI](https://openai.com/product)                               | `gpt-4o-2024-11-20`                                  |
-| `gpt-4o-mini`       | [OpenAI](https://openai.com/product)                               | `gpt-4o-mini-2024-07-18`                             |
-| `gpt-4-turbo`       | [OpenAI](https://openai.com/product)                               | `gpt-4-turbo-2024-04-09`                             |
-| `gpt-3.5-turbo`     | [OpenAI](https://openai.com/product)                               | `gpt-3.5-turbo-0125`                                 |
-| `gemini-1.5-pro`    | [Google](https://ai.google.dev/)                                   | `gemini-1.5-pro`                                     |
-| `gemini-1.0-pro`    | [Google](https://ai.google.dev/)                                   | `gemini-1.0-pro`                                     |
-| `claude-3.5-sonnet` | [Anthropic](https://www.anthropic.com/api)                         | `claude-3-5-sonnet-latest`                           |
-| `claude-3.5-haiku`  | [Anthropic](https://www.anthropic.com/api)                         | `claude-3-5-haiku-latest`                            |
-| `claude-3-opus`     | [Anthropic](https://www.anthropic.com/api)                         | `claude-3-opus-20240229`                             |
-| `claude-3-sonnet`   | [Anthropic](https://www.anthropic.com/api)                         | `claude-3-sonnet-20240229`                           |
-| `claude-3-haiku`    | [Anthropic](https://www.anthropic.com/api)                         | `claude-3-haiku-20240307`                            |
-| `command-r`         | [Cohere](https://docs.cohere.com/)                                 | `command-r`                                          |
-| `command-r-plus`    | [Cohere](https://docs.cohere.com/)                                 | `command-r-plus`                                     |
-| `mistral-large`     | [Mistral](https://mistral.ai/)                                     | `mistral-large-latest`                               |
-| `ministral-3b`      | [Mistral](https://mistral.ai/)                                     | `ministral-3b-latest`                                |
-| `ministral-8b`      | [Mistral](https://mistral.ai/)                                     | `ministral-8b-latest`                                |
-| `mistral-small`     | [Mistral](https://mistral.ai/)                                     | `mistral-small-latest`                               |
-| `mixtral-8x7b`      | [Groq](https://wow.groq.com/)                                      | `mixtral-8x7b-32768`                                 |
-| `gemma-7b`          | [Groq](https://wow.groq.com/)                                      | `gemma-7b-it`                                        |
-| `gemma-2-9b`        | [Groq](https://wow.groq.com/)                                      | `gemma2-9b-it`                                       |
-| `llama-3-8b`        | [Groq](https://wow.groq.com/), [Replicate](https://replicate.com/) | `llama3-8b-8192`, `meta/meta-llama-3-8b-instruct`    |
-| `llama-3-70b`       | [Groq](https://wow.groq.com/), [Replicate](https://replicate.com/) | `llama3-70b-8192`, `meta/meta-llama-3-70b-instruct`  |
-| `llama-3.1-8b`      | [Groq](https://wow.groq.com/), [Cerebras](https://cerebras.ai/)    | `llama-3.1-8b-instant`, `llama3.1-8b`                |
-| `llama-3.1-70b`     | [Groq](https://wow.groq.com/), [Cerebras](https://cerebras.ai/)    | `llama-3.1-70b-versatile`, `llama3.1-70b`            |
-| `llama-3.1-405b`    | [Replicate](https://replicate.com/)                                | `meta/meta-llama-3.1-405b-instruct`                  |
-| `llama-3.2-1b`      | [Groq](https://wow.groq.com/)                                      | `llama-3.2-1b-preview`                               |
-| `llama-3.2-3b`      | [Groq](https://wow.groq.com/)                                      | `llama-3.2-3b-preview`                               |
+| Model Name            | Provider(s)                                                        | Model Version(s)                                     |
+| --------------------- | ------------------------------------------------------------------ | ---------------------------------------------------- |
+| `gpt-4o`              | [OpenAI](https://openai.com/product)                               | `gpt-4o-2024-11-20`                                  |
+| `gpt-4o-mini`         | [OpenAI](https://openai.com/product)                               | `gpt-4o-mini-2024-07-18`                             |
+| `gpt-4-turbo`         | [OpenAI](https://openai.com/product)                               | `gpt-4-turbo-2024-04-09`                             |
+| `gpt-3.5-turbo`       | [OpenAI](https://openai.com/product)                               | `gpt-3.5-turbo-0125`                                 |
+| `gemini-2.0-flash`    | [Google](https://ai.google.dev/)                                   | `gemini-2.0-flash-exp`                               |
+| `gemini-1.5-flash`    | [Google](https://ai.google.dev/)                                   | `gemini-1.5-flash`                                   |
+| `gemini-1.5-flash-8b` | [Google](https://ai.google.dev/)                                   | `gemini-1.5-flash-8b`                                |
+| `gemini-1.5-pro`      | [Google](https://ai.google.dev/)                                   | `gemini-1.5-pro`                                     |
+| `claude-3.5-sonnet`   | [Anthropic](https://www.anthropic.com/api)                         | `claude-3-5-sonnet-latest`                           |
+| `claude-3.5-haiku`    | [Anthropic](https://www.anthropic.com/api)                         | `claude-3-5-haiku-latest`                            |
+| `claude-3-opus`       | [Anthropic](https://www.anthropic.com/api)                         | `claude-3-opus-20240229`                             |
+| `claude-3-sonnet`     | [Anthropic](https://www.anthropic.com/api)                         | `claude-3-sonnet-20240229`                           |
+| `claude-3-haiku`      | [Anthropic](https://www.anthropic.com/api)                         | `claude-3-haiku-20240307`                            |
+| `command-r`           | [Cohere](https://docs.cohere.com/)                                 | `command-r`                                          |
+| `command-r-plus`      | [Cohere](https://docs.cohere.com/)                                 | `command-r-plus`                                     |
+| `mistral-large`       | [Mistral](https://mistral.ai/)                                     | `mistral-large-latest`                               |
+| `ministral-3b`        | [Mistral](https://mistral.ai/)                                     | `ministral-3b-latest`                                |
+| `ministral-8b`        | [Mistral](https://mistral.ai/)                                     | `ministral-8b-latest`                                |
+| `mistral-small`       | [Mistral](https://mistral.ai/)                                     | `mistral-small-latest`                               |
+| `mixtral-8x7b`        | [Groq](https://wow.groq.com/)                                      | `mixtral-8x7b-32768`                                 |
+| `gemma-7b`            | [Groq](https://wow.groq.com/)                                      | `gemma-7b-it`                                        |
+| `gemma-2-9b`          | [Groq](https://wow.groq.com/)                                      | `gemma2-9b-it`                                       |
+| `llama-3-8b`          | [Groq](https://wow.groq.com/), [Replicate](https://replicate.com/) | `llama3-8b-8192`, `meta/meta-llama-3-8b-instruct`    |
+| `llama-3-70b`         | [Groq](https://wow.groq.com/), [Replicate](https://replicate.com/) | `llama3-70b-8192`, `meta/meta-llama-3-70b-instruct`  |
+| `llama-3.1-8b`        | [Groq](https://wow.groq.com/), [Cerebras](https://cerebras.ai/)    | `llama-3.1-8b-instant`, `llama3.1-8b`                |
+| `llama-3.1-70b`       | [Groq](https://wow.groq.com/), [Cerebras](https://cerebras.ai/)    | `llama-3.1-70b-versatile`, `llama3.1-70b`            |
+| `llama-3.1-405b`      | [Replicate](https://replicate.com/)                                | `meta/meta-llama-3.1-405b-instruct`                  |
+| `llama-3.2-1b`        | [Groq](https://wow.groq.com/)                                      | `llama-3.2-1b-preview`                               |
+| `llama-3.2-3b`        | [Groq](https://wow.groq.com/)                                      | `llama-3.2-3b-preview`                               |

<!--end-model-table-->
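In the table above, the Model Name column is the identifier passed to L2M2, while Model Version(s) is what gets sent to the provider (e.g., `gemini-2.0-flash` maps to Google's `gemini-2.0-flash-exp`). A hedged sketch of calling one of the newly added models, using the parameter bounds that `l2m2/model_info.py` declares for Gemini; the exact `call` signature is assumed from the README:

```python
from l2m2.client import LLMClient

client = LLMClient()

response = client.call(
    model="gemini-2.0-flash",  # resolved to Google's "gemini-2.0-flash-exp"
    prompt="Summarize what L2M2 does in one sentence.",
    temperature=0.7,  # Gemini entries in model_info.py cap this at 2.0
    max_tokens=256,   # forwarded to Google as max_output_tokens, capped at 8192
)
print(response)
```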

@@ -514,6 +516,9 @@ The following models natively support JSON mode via the given provider:
 - `gpt-4o-mini` (via OpenAI)
 - `gpt-4-turbo` (via OpenAI)
 - `gpt-3.5-turbo` (via OpenAI)
+- `gemini-2.0-flash` (via Google)
+- `gemini-1.5-flash` (via Google)
+- `gemini-1.5-flash-8b` (via Google)
 - `gemini-1.5-pro` (via Google)
 - `mistral-large` (via Mistral)
 - `ministral-3b` (via Mistral)
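For the newly listed Gemini models, JSON mode works through Google's native `response_mime_type: application/json` extra (see the `json_mode_arg` entries in `l2m2/model_info.py` below). A hedged usage sketch, assuming the `json_mode` flag documented in the README:

```python
import json

from l2m2.client import LLMClient

client = LLMClient()

response = client.call(
    model="gemini-1.5-flash",
    prompt='Respond with a JSON object with keys "city" and "country" for Paris.',
    json_mode=True,  # for Google models, sets response_mime_type to application/json
)
print(json.loads(response))  # e.g. {"city": "Paris", "country": "France"}
```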
l2m2/__init__.py (2 changes: 1 addition & 1 deletion)
@@ -1 +1 @@
__version__ = "0.0.37"
__version__ = "0.0.38"
l2m2/client/base_llm_client.py (6 changes: 1 addition & 5 deletions)
@@ -521,11 +521,7 @@ async def _call_google(
     data: Dict[str, Any] = {}

     if system_prompt is not None:
-        # Earlier models don't support system prompts, so prepend it to the prompt
-        if model_id not in ["gemini-1.5-pro"]:
-            prompt = f"{system_prompt}\n{prompt}"
-        else:
-            data["system_instruction"] = {"parts": {"text": system_prompt}}
+        data["system_instruction"] = {"parts": {"text": system_prompt}}

     messages: List[Dict[str, Any]] = []
     if isinstance(memory, ChatMemory):
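The change above drops the old workaround of prepending the system prompt for models without system-prompt support; all currently supported Gemini models accept the API's native `system_instruction` field, so `_call_google` now always uses it. A rough sketch of the request body this produces (the `system_instruction` shape is taken from the diff; the `contents` field follows Google's generateContent format and is an assumption about the surrounding code):

```python
# Illustrative payload only; not the full _call_google implementation.
data = {
    # Always set now, regardless of model (previously skipped for pre-1.5 models):
    "system_instruction": {"parts": {"text": "You are a helpful assistant."}},
    # Conversation turns in Google's generateContent format:
    "contents": [
        {"role": "user", "parts": [{"text": "Hello!"}]},
    ],
}
```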
l2m2/model_info.py (46 changes: 41 additions & 5 deletions)
@@ -187,9 +187,9 @@ class ModelEntry(TypedDict):
"extras": {"json_mode_arg": {"response_format": {"type": "json_object"}}},
},
},
"gemini-1.5-pro": {
"gemini-2.0-flash": {
"google": {
"model_id": "gemini-1.5-pro",
"model_id": "gemini-2.0-flash-exp",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
@@ -205,9 +205,9 @@ class ModelEntry(TypedDict):
"extras": {"json_mode_arg": {"response_mime_type": "application/json"}},
},
},
"gemini-1.0-pro": {
"gemini-1.5-flash": {
"google": {
"model_id": "gemini-1.0-pro",
"model_id": "gemini-1.5-flash",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
@@ -220,7 +220,43 @@ class ModelEntry(TypedDict):
"max": 8192,
},
},
"extras": {},
"extras": {"json_mode_arg": {"response_mime_type": "application/json"}},
},
},
"gemini-1.5-flash-8b": {
"google": {
"model_id": "gemini-1.5-flash-8b",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
"max": 2.0,
},
"max_tokens": {
"custom_key": "max_output_tokens",
"default": PROVIDER_DEFAULT,
# https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#gemini-models
"max": 8192,
},
},
"extras": {"json_mode_arg": {"response_mime_type": "application/json"}},
},
},
"gemini-1.5-pro": {
"google": {
"model_id": "gemini-1.5-pro",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
"max": 2.0,
},
"max_tokens": {
"custom_key": "max_output_tokens",
"default": PROVIDER_DEFAULT,
# https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#gemini-models
"max": 8192,
},
},
"extras": {"json_mode_arg": {"response_mime_type": "application/json"}},
},
},
"claude-3.5-sonnet": {
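Each entry above follows the `ModelEntry` shape: `params` declares tunable arguments (with `custom_key` renaming a parameter for the provider, e.g. `max_tokens` becomes `max_output_tokens`), and `extras["json_mode_arg"]` holds the provider-specific JSON-mode flag. A hypothetical illustration of how such an entry could drive request construction; the `build_request_args` helper is invented for this sketch and is not part of l2m2:

```python
# Hypothetical helper, for illustration only; not l2m2's actual internals.
entry = {
    "model_id": "gemini-1.5-flash-8b",
    "params": {
        "temperature": {"max": 2.0},
        "max_tokens": {"custom_key": "max_output_tokens", "max": 8192},
    },
    "extras": {"json_mode_arg": {"response_mime_type": "application/json"}},
}

def build_request_args(entry, temperature=None, max_tokens=None, json_mode=False):
    args = {}
    if temperature is not None:
        # Clamp to the per-model maximum declared in the entry.
        args["temperature"] = min(temperature, entry["params"]["temperature"]["max"])
    if max_tokens is not None:
        # Rename via custom_key if the provider uses a different argument name.
        key = entry["params"]["max_tokens"].get("custom_key", "max_tokens")
        args[key] = min(max_tokens, entry["params"]["max_tokens"]["max"])
    if json_mode:
        # Merge the provider-specific JSON-mode flag into the request.
        args.update(entry["extras"]["json_mode_arg"])
    return args

print(build_request_args(entry, temperature=0.5, max_tokens=512, json_mode=True))
# {'temperature': 0.5, 'max_output_tokens': 512, 'response_mime_type': 'application/json'}
```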
tests/l2m2/client/test_base_llm_client.py (10 changes: 0 additions & 10 deletions)
@@ -267,16 +267,6 @@ async def test_call_google_1_5(mock_get_extra_message, mock_llm_post, llm_client)
await _generic_test_call(llm_client, "google", "gemini-1.5-pro")


@pytest.mark.asyncio
@patch(LLM_POST_PATH)
@patch(GET_EXTRA_MESSAGE_PATH)
async def test_call_google_1_0(mock_get_extra_message, mock_llm_post, llm_client):
mock_get_extra_message.return_value = "extra message"
mock_return_value = {"candidates": [{"content": {"parts": [{"text": "response"}]}}]}
mock_llm_post.return_value = mock_return_value
await _generic_test_call(llm_client, "google", "gemini-1.0-pro")


@pytest.mark.asyncio
@patch(LLM_POST_PATH)
@patch(GET_EXTRA_MESSAGE_PATH)
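If coverage for the new models is wanted, a replacement test can mirror the removed one with only the model name swapped. A sketch following the exact pattern of the deleted `test_call_google_1_0` (the fixtures and patch-path constants are assumed to exist in this file, as shown above):

```python
@pytest.mark.asyncio
@patch(LLM_POST_PATH)
@patch(GET_EXTRA_MESSAGE_PATH)
async def test_call_google_2_0_flash(mock_get_extra_message, mock_llm_post, llm_client):
    mock_get_extra_message.return_value = "extra message"
    mock_return_value = {"candidates": [{"content": {"parts": [{"text": "response"}]}}]}
    mock_llm_post.return_value = mock_return_value
    await _generic_test_call(llm_client, "google", "gemini-2.0-flash")
```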