Current version: 0.0.40
> [!CAUTION]
> This release has breaking changes! Please read the changelog carefully.
- The `call_custom` method has been removed from `LLMClient` and `AsyncLLMClient` due to lack of use and unnecessary complexity. This is a breaking change!!! If you need to call a model that is not officially supported by L2M2, please open an issue on the GitHub repo.
> [!CAUTION]
> This release has breaking changes! Please read the changelog carefully.
- Support for Llama 3.3 70B via Groq and Cerebras.
- Support for OpenAI's o1 series: `o1`, `o1-preview`, and `o1-mini`.
- The `extra_params` parameter to `call` and `call_custom` (see the sketch after the note below).
> [!NOTE]
> At the time of this release, you must be on OpenAI's usage tier 5 to use `o1` and tier 1+ to use `o1-preview` and `o1-mini`.
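To illustrate the new `extra_params` parameter, here is a minimal sketch. It assumes `extra_params` forwards arbitrary key-value pairs to the underlying provider request, as the name suggests; `max_completion_tokens` is the real OpenAI parameter the o1 series uses in place of `max_tokens`, and the import path follows the project README.

```python
from l2m2.client import LLMClient

client = LLMClient()

# A hedged sketch: extra_params is assumed to pass provider-specific
# fields through to the underlying API request unchanged.
response = client.call(
    model="o1-mini",
    prompt="Write a haiku about changelogs.",
    extra_params={"max_completion_tokens": 1024},
)
print(response)
```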
- `gemma-7b` has been removed, as it has been deprecated by Groq.
- `llama-3.1-70b` has been removed, as it has been deprecated by both Groq and Cerebras.
> [!CAUTION]
> This release has breaking changes! Please read the changelog carefully.
- Support for Python 3.13.
- Support for Google's Gemini 2.0 Flash, Gemini 1.5 Flash, and Gemini 1.5 Flash 8B models.
- Gemini 1.0 Pro is no longer supported, as it is deprecated by Google. This is a breaking change!!! Calls to Gemini 1.0 Pro will fail.
> [!CAUTION]
> This release has significant breaking changes! Please read the changelog carefully.
- Support for Anthropic's `claude-3.5-haiku`.
- Support for provider Cerebras, offering `llama-3.1-8b` and `llama-3.1-70b`.
- Support for Mistral's `mistral-small`, `ministral-8b`, and `ministral-3b` models via La Plateforme.
- `mistral-large-2` has been renamed to `mistral-large`, to keep up with Mistral's naming scheme. This is a breaking change!!! Calls to `mistral-large-2` will fail.
- `mixtral-8x22b`, `mixtral-8x7b`, and `mistral-7b` are no longer available from provider Mistral, as they have been deprecated. This is a breaking change!!! Calls to `mixtral-8x7b` and `mistral-7b` will fail, and calls to `mixtral-8x22b` via provider Mistral will fail.
> [!NOTE]
> The model `mixtral-8x22b` is still available via Groq.
- Updated `gpt-4o` version from `gpt-4o-2024-08-06` to `gpt-4o-2024-11-20` (Announcement).
- Support for Anthropic's updated Claude 3.5 Sonnet released today.
  - `claude-3.5-sonnet` now points to version `claude-3-5-sonnet-latest`.
> [!CAUTION]
> This release has breaking changes! Please read the changelog carefully.
- New supported models `gemma-2-9b`, `llama-3.2-1b`, and `llama-3.2-3b` via Groq.
- In order to be more consistent with l2m2's naming scheme, the following model IDs have been updated:
  - `llama3-8b` → `llama-3-8b`
  - `llama3-70b` → `llama-3-70b`
  - `llama3.1-8b` → `llama-3.1-8b`
  - `llama3.1-70b` → `llama-3.1-70b`
  - `llama3.1-405b` → `llama-3.1-405b`
- This is a breaking change!!! Calls using the old `model_id`s (`llama3-8b`, etc.) will fail.
- Provider `octoai` has been removed as they have been acquired and are shutting down their cloud platform. This is a breaking change!!! Calls using the `octoai` provider will fail.
  - All previously OctoAI-supported models (`mixtral-8x22b`, `mixtral-8x7b`, `mistral-7b`, `llama-3-70b`, `llama-3.1-8b`, `llama-3.1-70b`, and `llama-3.1-405b`) are still available via Mistral, Groq, and/or Replicate.
- Updated `gpt-4o` version from `gpt-4o-2024-05-13` to `gpt-4o-2024-08-06`.
- Mistral provider support via La Plateforme.
- Mistral Large 2 model availability from Mistral.
- Mistral 7B, Mixtral 8x7B, and Mixtral 8x22B model availability from Mistral, in addition to existing providers.
- 0.0.30 and 0.0.31 are skipped due to a packaging error and a model key typo.
> [!CAUTION]
> This release has breaking changes! Please read the changelog carefully.
- `alt_memory` and `bypass_memory` have been added as parameters to `call` and `call_custom` in `LLMClient` and `AsyncLLMClient`. These parameters allow you to specify alternative memory streams to use for the call, or to bypass memory entirely.
- Previously, the `LLMClient` and `AsyncLLMClient` constructors took `memory_type`, `memory_window_size`, and `memory_loading_type` as arguments. Now, they just take `memory` as an argument, while `window_size` and `loading_type` can be set on the memory object itself. These changes make the memory API far more consistent and easy to use, especially with the additions of `alt_memory` and `bypass_memory`.
- The `MemoryType` enum has been removed. This is a breaking change!!! Instances of `client = LLMClient(memory_type=MemoryType.CHAT)` should be replaced with `client = LLMClient(memory=ChatMemory())`, and so on. A sketch of the updated API follows this list.
- Providers can now be activated by default via the following environment variables:
  - `OPENAI_API_KEY` for OpenAI
  - `ANTHROPIC_API_KEY` for Anthropic
  - `CO_API_KEY` for Cohere
  - `GOOGLE_API_KEY` for Google
  - `GROQ_API_KEY` for Groq
  - `REPLICATE_API_TOKEN` for Replicate
  - `OCTOAI_TOKEN` for OctoAI
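Putting the new memory API and environment-variable activation together, here is a minimal sketch. The import paths follow the project README, and `window_size` as a `ChatMemory` constructor argument is inferred from the note above.

```python
import os

from l2m2.client import LLMClient
from l2m2.memory import ChatMemory

# Normally set in your shell; shown inline here for illustration only.
os.environ["OPENAI_API_KEY"] = "sk-..."

# New-style construction: configuration lives on the memory object,
# not on the client constructor.
client = LLMClient(memory=ChatMemory(window_size=40))

# Use a separate memory stream for a single call...
scratch = ChatMemory()
response = client.call(model="gpt-4o", prompt="Hello!", alt_memory=scratch)

# ...or skip memory entirely for a one-off call.
response = client.call(model="gpt-4o", prompt="What's 2 + 2?", bypass_memory=True)
```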
- OctoAI provider support.
- Llama 3.1 availability, in sizes 8B (via OctoAI), 70B (via OctoAI), and 405B (via both OctoAI and Replicate).
- Mistral 7B and Mixtral 8x22B via OctoAI.
- `LLMOperationError` exception, raised when a feature or mode is not supported by a particular model.
- Rate limit errors would sometimes give the model id as `None` in the error message. This has been fixed.
- GPT-4o-mini availability.
- Custom exception `LLMRateLimitError`, raised when an LLM call returns a 429 status code.
- The ability to specify a custom timeout for LLM calls by passing a `timeout` argument to `call` or `call_custom` (defaults to 10 seconds).
- A custom exception `LLMTimeoutError`, which is raised when an LLM call times out, along with a more helpful message than httpx's default timeout error.
  - Calls to Anthropic with large context windows were sometimes timing out, prompting this change.
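A minimal sketch of handling both custom exceptions described above; the `l2m2.exceptions` module path is an assumption, and the 30-second timeout is arbitrary:

```python
from l2m2.client import LLMClient
from l2m2.exceptions import LLMRateLimitError, LLMTimeoutError  # assumed module path

client = LLMClient()

try:
    # timeout is in seconds and defaults to 10.
    response = client.call(
        model="claude-3.5-sonnet",
        prompt="Summarize this very long document...",
        timeout=30,
    )
except LLMTimeoutError:
    # Raised when the call exceeds the timeout.
    response = "The call timed out."
except LLMRateLimitError:
    # Raised when the provider returns a 429 status code.
    response = "Rate limited; try again later."
```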
- Major bug where l2m2 would cause environments without `typing_extensions` installed to crash due to it not being listed as an external dependency. This has been fixed by adding `typing_extensions` as an external dependency.
  - This bug wasn't caught because integration tests were not running in a clean environment (i.e., `typing_extensions` was already installed from one of the dev dependencies). To prevent this from happening again, I made `make itest` uninstall all Python dependencies before running.
- In 0.0.21, async calls were blocking due to the use of `requests`. 0.0.22 replaces `requests` with `httpx` to allow for fully asynchronous behavior.
- `AsyncLLMClient` should now be instantiated with a context manager (`async with AsyncLLMClient() as client:`) to ensure proper cleanup of the `httpx` client.
- In `AsyncLLMClient`, `call_async` and `call_custom_async` have been renamed to `call` and `call_custom` respectively, with asynchronous behavior.
- `call_concurrent` and `call_custom_concurrent` have been removed due to unnecessary complexity and lack of use.
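A minimal sketch of the post-0.0.22 async usage (the model name is illustrative):

```python
import asyncio

from l2m2.client import AsyncLLMClient

async def main() -> None:
    # The context manager ensures the underlying httpx client is
    # properly closed when the block exits.
    async with AsyncLLMClient() as client:
        # call (formerly call_async) is now a coroutine.
        response = await client.call(model="gpt-4o", prompt="Hello!")
        print(response)

asyncio.run(main())
```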
- This changelog (finally – oops)
- Support for Anthropic's Claude 3.5 Sonnet released today
- L2M2 is now fully HTTP based with no external dependencies, taking the total recursive dependency count from ~60 to 0 and massively simplifying the unit test suite.
- Non-native JSON mode strategy now defaults to prepend for Anthropic models and strip for all others.
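For context, a hedged sketch of using non-native JSON mode with these defaults; the `json_mode` flag name is an assumption based on this entry, and with the defaults above no strategy needs to be specified explicitly:

```python
from l2m2.client import LLMClient

client = LLMClient()

# json_mode (an assumed flag name) asks the model for JSON output.
# Per the entry above, Anthropic models default to the prepend
# strategy and all others default to strip, so no strategy is
# passed explicitly here.
response = client.call(
    model="claude-3.5-sonnet",
    prompt="Return a JSON object with keys 'name' and 'version'.",
    json_mode=True,
)
print(response)
```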