Initial commit of Google Gemini LLM service. #150

kwindla · 2024-05-17T04:05:24Z

Gemini text input works. We translate from OpenAILLMContext format on the fly in the GoogleLLMService implementation.

This commit also implements image input (vision) in both the GoogleLLMService and in the OpenAILLMService. Image input is a hack and needs to be revisited. OpenAI expects images to be uploaded as base64-encoded JPEGs. Google does not require the base64 encoding. Other than for images, we use the OpenAI format as our standard, but base64-encoding the images and then unencoding them in the GoogleLLMService feels wasteful.

aconchillo · 2024-05-17T04:49:02Z

macos-py3.10-requirements.txt

 #
-# This file is autogenerated by pip-compile with Python 3.10
+# This file is autogenerated by pip-compile with Python 3.11


nit: I try to generate this with python 3.10 just to make sure it works there. on macos i do:

brew install [email protected] python3.10 -m venv venv ... ... pip-compile --all-extras pyproject.toml mv requirements.txt macos-py3.10-requirements.txt

aconchillo · 2024-05-17T04:50:01Z

src/pipecat/services/openai.py

+                del message["mime_type"]
+
+        # messages_for_log = json.dumps(messages)
+        # logger.debug(f"Generating chat: {messages_for_log}")


nit: remove or re-add?

chadbailey59

LGTM 👍

Gemini text input works. We translate from OpenAILLMContext format on the fly in the GoogleLLMService implementation. This commit also implements image input (vision) in both the GoogleLLMService and in the OpenAILLMService. Image input is a hack and needs to be revisited. OpenAI expects images to be uploaded as base64-encoded JPEGs. Google does not require the base64 encoding. Other than for images, we use the OpenAI format as our standard, but base64-encoding the images and then unencoding them in the GoogleLLMService feels wasteful.

kwindla requested review from chadbailey59, aconchillo and jptaylor May 17, 2024 04:05

kwindla mentioned this pull request May 17, 2024

Implement Google Gemini LLM service #145

Closed

aconchillo reviewed May 17, 2024

View reviewed changes

aconchillo approved these changes May 17, 2024

View reviewed changes

chadbailey59 approved these changes May 17, 2024

View reviewed changes

kwindla added 2 commits May 19, 2024 10:35

generate macos-py3.10-requirements.txt with Python 3.10

d83f0aa

kwindla force-pushed the khk-gemini branch from bf62e4e to d83f0aa Compare May 19, 2024 17:54

kwindla added 5 commits May 19, 2024 11:08

add back in debug log line in openai.py

cf597a2

add google and deepgram to README.md

e5ddaf1

oops, fix openai.py

e507686

fix up openai vision and gemini implementation

6637795

add to CHANGELOG.md

7ffb10d

aconchillo marked this pull request as ready for review May 20, 2024 02:24

aconchillo merged commit bf036be into main May 20, 2024
2 of 3 checks passed

aconchillo deleted the khk-gemini branch October 23, 2024 20:57

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Initial commit of Google Gemini LLM service. #150

Initial commit of Google Gemini LLM service. #150

kwindla commented May 17, 2024

aconchillo May 17, 2024 •

edited

Loading

kwindla May 19, 2024

aconchillo May 17, 2024

kwindla May 19, 2024

chadbailey59 left a comment

Initial commit of Google Gemini LLM service. #150

Initial commit of Google Gemini LLM service. #150

Conversation

kwindla commented May 17, 2024

aconchillo May 17, 2024 • edited Loading

Choose a reason for hiding this comment

kwindla May 19, 2024

Choose a reason for hiding this comment

aconchillo May 17, 2024

Choose a reason for hiding this comment

kwindla May 19, 2024

Choose a reason for hiding this comment

chadbailey59 left a comment

Choose a reason for hiding this comment

aconchillo May 17, 2024 •

edited

Loading