Title: Multimodal chat
Tags: [multimodal]
LLaVa (Large Language and Vision Assistant) is a multimodal conversational AI that can understand and generate text and images. This Chainlit example demonstrates how to integrate LLaVa into a Chainlit application, allowing users to interact with the AI through a chat interface that supports both text and image inputs.
- Multimodal Interaction: Users can converse with the AI using text and images.
- Customizable Conversation Styles: Different separator styles for conversation formatting.
- Image Processing: Supports various image processing modes like padding, cropping, and resizing.
- Asynchronous Request Handling: Communicates with the LLaVa backend asynchronously for efficient performance.
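The image-processing modes listed above (padding, cropping, resizing) can be sketched with Pillow. The `process_image` helper and its mode names here are illustrative assumptions, not the example's actual API:

```python
from PIL import Image

def process_image(image: Image.Image, mode: str = "pad", size: int = 336) -> Image.Image:
    """Illustrative pre-processing: pad to square, center-crop, or plain resize."""
    w, h = image.size
    if mode == "pad":
        # Pad the shorter side so the image becomes square, then resize.
        side = max(w, h)
        canvas = Image.new("RGB", (side, side), (0, 0, 0))
        canvas.paste(image, ((side - w) // 2, (side - h) // 2))
        return canvas.resize((size, size))
    if mode == "crop":
        # Center-crop the longer side to a square, then resize.
        side = min(w, h)
        left, top = (w - side) // 2, (h - side) // 2
        return image.crop((left, top, left + side, top + side)).resize((size, size))
    # Default: plain resize, ignoring aspect ratio.
    return image.resize((size, size))

square = process_image(Image.new("RGB", (640, 480), "white"), mode="pad")
print(square.size)  # (336, 336)
```

Padding preserves the whole image at the cost of black borders, while cropping keeps the center at native aspect ratio; which trade-off is better depends on the content of the image.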
- Set up LLaVa: Deploy LLaVa and obtain a `CONTROLLER_URL` by following the instructions in this YouTube video.
- Environment Variable: Ensure the `CONTROLLER_URL` is set in your environment variables.
- Install Dependencies: Install the required Python packages, including `chainlit`, `aiohttp`, and `Pillow` (imported as `PIL`).
- Run the App: Start the Chainlit app by running `app.py`.
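Concretely, the setup steps above look something like the following; the controller URL is a placeholder for wherever you deployed LLaVa:

```shell
# Install the dependencies (Pillow provides the PIL module)
pip install chainlit aiohttp Pillow

# Point the app at your LLaVa deployment (placeholder URL)
export CONTROLLER_URL="http://localhost:10000"

# Start the Chainlit app
chainlit run app.py
```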
- `Conversation`: A dataclass that keeps the full conversation history, including system messages, roles, and separator styles.
- `request`: An asynchronous function that sends the user's input to the LLaVa backend and streams the response back to the user.
- `start`: An event handler that initializes the chat settings when a new chat session starts.
- `setup_agent`: Updates the chat settings when the user changes them.
- `main`: The main event handler for incoming messages. It processes images, appends messages to the conversation, and sends requests to the LLaVa backend.
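A minimal sketch of what the `Conversation` dataclass might look like. The field names and the `SINGLE`/`TWO` separator styles are assumptions modeled on the original LLaVa Gradio code, not the exact implementation:

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class SeparatorStyle(Enum):
    SINGLE = auto()  # one separator after every message
    TWO = auto()     # alternate between two separators

@dataclass
class Conversation:
    system: str
    roles: tuple = ("USER", "ASSISTANT")
    messages: list = field(default_factory=list)
    sep_style: SeparatorStyle = SeparatorStyle.SINGLE
    sep: str = "\n"
    sep2: str = "</s>"

    def append_message(self, role: str, text: str) -> None:
        self.messages.append((role, text))

    def get_prompt(self) -> str:
        # Serialize the history into the flat prompt string the backend expects.
        if self.sep_style == SeparatorStyle.TWO:
            seps = [self.sep, self.sep2]
        else:
            seps = [self.sep, self.sep]
        out = self.system + seps[0]
        for i, (role, text) in enumerate(self.messages):
            out += f"{role}: {text}{seps[i % 2]}"
        return out

conv = Conversation(system="You are a helpful assistant.")
conv.append_message("USER", "Describe this image.")
print(conv.get_prompt())
```

Keeping the history in a dataclass like this lets each chat session carry its own state, with `get_prompt` responsible for flattening it into whatever format the model was trained on.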
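The asynchronous `request` flow can be approximated as below. The `/worker_generate_stream` endpoint, the null-delimited JSON chunk format, and the payload fields are assumptions about the LLaVa worker API, so check them against your deployment; a tiny in-process `aiohttp` server stands in for the real backend so the sketch is runnable:

```python
import asyncio
import json

import aiohttp
from aiohttp import web

async def fake_worker(request: web.Request) -> web.StreamResponse:
    # Stands in for the LLaVa worker: streams null-delimited JSON chunks.
    resp = web.StreamResponse()
    await resp.prepare(request)
    for token in ["Hello", " world"]:
        await resp.write(json.dumps({"text": token}).encode() + b"\0")
    return resp

async def stream_reply(url: str, prompt: str) -> str:
    """Rough analogue of the example's `request`: POST the prompt, collect streamed tokens."""
    out = []
    async with aiohttp.ClientSession() as session:
        async with session.post(url, json={"prompt": prompt}) as resp:
            async for chunk in resp.content.iter_any():
                for part in chunk.split(b"\0"):
                    if part:
                        # In the real app each token would go to cl.Message.stream_token.
                        out.append(json.loads(part)["text"])
    return "".join(out)

async def main() -> str:
    app = web.Application()
    app.router.add_post("/worker_generate_stream", fake_worker)
    runner = web.AppRunner(app)
    await runner.setup()
    site = web.TCPSite(runner, "127.0.0.1", 8899)
    await site.start()
    try:
        return await stream_reply(
            "http://127.0.0.1:8899/worker_generate_stream", "hi"
        )
    finally:
        await runner.cleanup()

result = asyncio.run(main())
print(result)
```

In the actual app, each decoded token would be forwarded to the Chainlit UI as it arrives rather than collected into a list, which is what makes the response feel streamed to the user.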
Full credits to the LLaVa team (GitHub repo). This example is heavily inspired by the original logic written for the Gradio app. The goal of this example is to provide the community with a working example of LLaVa with Chainlit.