
OpenAI Assistants Agent #4131

Merged

Conversation

lspinheiro
Collaborator

Why are these changes needed?

Related issue number

Checks

@lspinheiro
Collaborator Author

@ekzhu @jackgerrits, this is a very early draft. I have some questions before proceeding further.

  1. What should we do about the model client? The chat completion client abstraction doesn't seem to fit: it makes assumptions about how messages are handled in the interface, while the Assistants API handles them very differently through threads (I have spent a lot of time trying to adapt it without success). I'm also not sure whether we can define a general interface for agent-like APIs. Should I create a specific one in autogen_ext to abstract away the OpenAI SDK? I'm not sure what the value would be, but it also feels like I'm adding an implementation without a proper standard/abstraction.

  2. How do we want to handle file search, especially ingestion? That also seems like something we don't have a strong abstraction for. I'm not sure whether it should fit into how we will integrate RAG. I also don't know whether file ingestion should be part of the agent interaction API through on_messages, or a separate method used during agent setup before running the chat.

  3. My next step is to map tool calling onto the autogen core framework. It looks like the Azure OpenAI API integrates with Logic Apps to actually call functions as tools. Should this be future functionality in autogen_ext?

@ekzhu
Collaborator

ekzhu commented Nov 11, 2024

Thanks. I think we can follow the design in the Core cookbook for the OpenAI assistant agent: https://microsoft.github.io/autogen/dev/user-guide/core-user-guide/cookbook/openai-assistant-agent.html. The API should be simple, without introducing additional abstractions on our side.

class OpenAIAssistantAgent:
  name: str
  description: str
  client: openai.AsyncClient
  assistant_id: str
  thread_id: str
  tools: List[Tool] | None = None
  code_interpreter: ... | None = None  # configuration class from OpenAI client
  file_search: ... | None = None  # configuration class from OpenAI client

We don't need to introduce additional abstractions because the OpenAI Assistants API is specific to the OpenAI and Azure OpenAI services -- we should stick with the official clients they provide. Furthermore, we shouldn't expect the agent to be the only interface to assistant features such as file search, as the application may also perform other functions such as file upload and thread management.

  1. What should we do about the model client? The chat completion client abstraction doesn't seem to fit: it makes assumptions about how messages are handled in the interface, while the Assistants API handles them very differently through threads (I have spent a lot of time trying to adapt it without success). I'm also not sure whether we can define a general interface for agent-like APIs. Should I create a specific one in autogen_ext to abstract away the OpenAI SDK? I'm not sure what the value would be, but it also feels like I'm adding an implementation without a proper standard/abstraction.

Use the official openai client and do not introduce new abstractions on our side besides the new agent class.
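
For illustration, a rough sketch of what using the official client directly could look like, assuming the openai 1.x async SDK; the model name and message content are placeholders, and create_and_poll is a convenience helper in recent SDK releases:

import asyncio

from openai import AsyncOpenAI  # AsyncAzureOpenAI works the same way


async def main() -> None:
    client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

    # The application owns the assistant and thread lifecycle.
    assistant = await client.beta.assistants.create(
        model="gpt-4o-mini",
        name="assistant",
        instructions="Help the user with their task.",
    )
    thread = await client.beta.threads.create()

    await client.beta.threads.messages.create(
        thread_id=thread.id, role="user", content="Hello!"
    )
    run = await client.beta.threads.runs.create_and_poll(
        thread_id=thread.id, assistant_id=assistant.id
    )
    if run.status == "completed":
        messages = await client.beta.threads.messages.list(thread_id=thread.id)
        # The most recent message comes first; print its text block.
        print(messages.data[0].content[0].text.value)


asyncio.run(main())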

  2. How do we want to handle file search, especially ingestion? That also seems like something we don't have a strong abstraction for. I'm not sure whether it should fit into how we will integrate RAG. I also don't know whether file ingestion should be part of the agent interaction API through on_messages, or a separate method used during agent setup before running the chat.

This should mostly be done using the official OpenAI client in the user's application. We can potentially add new assistant tools that use the client.
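
For illustration only, a rough sketch of that ingestion path with the official client, assuming an openai 1.x SDK where vector stores live under client.beta.vector_stores (the store name is a placeholder):

from openai import AsyncOpenAI


async def ingest_for_file_search(client: AsyncOpenAI, path: str) -> str:
    # Upload the raw file so assistants can reference it.
    with open(path, "rb") as f:
        uploaded = await client.files.create(file=f, purpose="assistants")
    # Create a vector store and attach the uploaded file to it.
    store = await client.beta.vector_stores.create(name="docs")
    await client.beta.vector_stores.files.create(
        vector_store_id=store.id, file_id=uploaded.id
    )
    # The store id is then referenced by the assistant's file_search tool, e.g.
    #   tool_resources={"file_search": {"vector_store_ids": [store.id]}}
    return store.id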

  3. My next step is to map tool calling onto the autogen core framework. It looks like the Azure OpenAI API integrates with Logic Apps to actually call functions as tools. Should this be future functionality in autogen_ext?

We should make sure we can use our Tool class tools in this new agent.
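
As a sketch of what that mapping could look like -- assuming the core Tool exposes a schema with name, description, and JSON-schema parameters, as BaseTool does; to_assistant_tool is a hypothetical helper, not part of this PR:

from typing import Any, Dict

from autogen_core.components.tools import Tool


def to_assistant_tool(tool: Tool) -> Dict[str, Any]:
    # Adapt an autogen core Tool to the function-tool format accepted by
    # client.beta.assistants.create(...).
    schema = tool.schema  # name / description / JSON-schema parameters
    return {
        "type": "function",
        "function": {
            "name": schema["name"],
            "description": schema.get("description", ""),
            "parameters": schema.get("parameters", {"type": "object", "properties": {}}),
        },
    }

When a run requires action, the agent would then execute the matching Tool and return the results through client.beta.threads.runs.submit_tool_outputs.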

Overall, the goal is to bring OpenAI assistant agents into our ecosystem, not to build a new wrapper around the Assistants API.

@ekzhu ekzhu linked an issue Nov 16, 2024 that may be closed by this pull request
@lspinheiro lspinheiro marked this pull request as ready for review November 17, 2024 04:19
@lspinheiro
Collaborator Author

@ekzhu, this is ready for review. I tested it with the script below. I haven't added unit tests, as mocking the Assistants API would take a lot of effort and we don't have other use cases.

import asyncio
from enum import Enum
from typing import List, Optional
from autogen_agentchat.agents import OpenAIAssistantAgent
from autogen_core.components.tools._base import BaseTool
from openai import AsyncAzureOpenAI
from autogen_agentchat.messages import TextMessage
from autogen_core.base import CancellationToken
from pydantic import BaseModel


class QuestionType(str, Enum):
    MULTIPLE_CHOICE = "MULTIPLE_CHOICE"
    FREE_RESPONSE = "FREE_RESPONSE"

class Question(BaseModel):
    question_text: str
    question_type: QuestionType
    choices: Optional[List[str]] = None

class DisplayQuizArgs(BaseModel):
    title: str
    questions: List[Question]

# Step 2: Create the Tool class by subclassing BaseTool

class DisplayQuizTool(BaseTool[DisplayQuizArgs, List[str]]):
    def __init__(self):
        super().__init__(
            args_type=DisplayQuizArgs,
            return_type=List[str],
            name="display_quiz",
            description=(
                "Displays a quiz to the student and returns the student's responses. "
                "A single quiz can have multiple questions."
            ),
        )

    # Step 3: Implement the run method

    async def run(self, args: DisplayQuizArgs, cancellation_token: CancellationToken) -> List[str]:
        # Simulate displaying the quiz and collecting responses
        responses = []
        for q in args.questions:
            if q.question_type == QuestionType.MULTIPLE_CHOICE:
                # Simulate a response for multiple-choice questions
                response = q.choices[0] if q.choices else ""
            elif q.question_type == QuestionType.FREE_RESPONSE:
                # Simulate a response for free-response questions
                response = "Sample free response"
            else:
                response = ""
            responses.append(response)
        return responses


def create_agent(client: AsyncAzureOpenAI) -> OpenAIAssistantAgent:
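    # Mix of built-in assistant tools (code_interpreter, file_search) and a
    # custom function tool wrapped from the BaseTool implementation above.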
    tools = [
        {"type": "code_interpreter"},
        {"type": "file_search"},
        {"type": "tool", "tool": DisplayQuizTool()},
    ]

    return OpenAIAssistantAgent(
        name="assistant",
        instructions="Help the user with their task.",
        model="gpt-4o-mini",
        description="OpenAI Assistant Agent",
        client=client,
        tools=tools,
    )



async def test_file_retrieval(agent: OpenAIAssistantAgent, cancellation_token: CancellationToken):
    file_path = r".\data\SampleBooks\jungle_book.txt"
    await agent.on_upload_for_file_search(file_path, cancellation_token)
    
    message = TextMessage(source="user", content="What is the first sentence of the jungle scout book?")
    response = await agent.on_messages([message], cancellation_token)
    print("File Retrieval Test Response:", response.chat_message.content)
    
    await agent.delete_uploaded_files(cancellation_token)
    await agent.delete_vector_store(cancellation_token)


async def test_code_interpreter(agent: OpenAIAssistantAgent, cancellation_token: CancellationToken):
    message = TextMessage(source="user", content="I need to solve the equation `3x + 11 = 14`. Can you help me?")
    response = await agent.on_messages([message], cancellation_token)
    print("Code Interpreter Test Response:", response.chat_message.content)


async def test_quiz_creation(agent: OpenAIAssistantAgent, cancellation_token: CancellationToken):
    message = TextMessage(source="user", content="Create a short quiz about basic math with one multiple choice question and one free response question.")
    response = await agent.on_messages([message], cancellation_token)
    print("Quiz Creation Test Response:", response.chat_message.content)


async def main():
    client = AsyncAzureOpenAI(
        azure_endpoint="https://{your-api-endpoint}.openai.azure.com",
        api_version="2024-08-01-preview",
        api_key="your-api-key"
    )
    
    cancellation_token = CancellationToken()

    # Test file retrieval
    agent = create_agent(client)
    await test_file_retrieval(agent, cancellation_token)
    await agent.delete_assistant(cancellation_token)

    # Wait 1 minute before next test
    await asyncio.sleep(60)

    # Test code interpreter
    agent = create_agent(client)
    await test_code_interpreter(agent, cancellation_token) 
    await agent.delete_assistant(cancellation_token)

    # Wait 1 minute before next test
    await asyncio.sleep(60)

    # Test quiz creation
    agent = create_agent(client)
    await test_quiz_creation(agent, cancellation_token)
    await agent.delete_assistant(cancellation_token)


if __name__ == "__main__":
    asyncio.run(main())

Here is the output of the tool call inspection:

-> run = await cancellation_token.link_future(
(Pdb) tool_outputs
[FunctionExecutionResult(content="['4', 'Sample free response']", call_id='call_mx2T4F0niQvJ0FiGBqsMGZXl'), FunctionExecutionResult(content="['4', 'Sample free response']", call_id='call_WW4D3OtQteig5BwH5FOZhdgL')]
(Pdb) c
Quiz Creation Test Response: I have created a short quiz about basic math. Here it is:

### Basic Math Quiz

1. **What is 15 divided by 3?**
   - A) 4
   - B) 5
   - C) 6
   - D) 7

2. **What is the square root of 64?**
   - (Free Response)

@lspinheiro lspinheiro changed the title WIP - OpenAI Assistants Agent OpenAI Assistants Agent Nov 18, 2024
@lspinheiro lspinheiro merged commit df32d5e into microsoft:main Nov 18, 2024
41 checks passed
@lspinheiro lspinheiro deleted the lpinheiro/feat/add-openai-assistants-agent branch November 18, 2024 23:56