Python: Add support for streaming OpenAI assistants (#9055)
### Motivation and Context

OpenAI assistants were introduced several versions ago; however, streaming support for
OpenAI Assistants v2 remained open work, as did streaming support for assistants in
Agent Group Chat scenarios.


### Description

This PR introduces:
- Support for streaming with (Azure) OpenAI assistants
- Support for Agent Group Chat scenarios, which can now use a streaming invoke in
addition to the non-streaming invoke
- Unit tests that raise coverage of the newly added code to near 100%
- A few more OpenAI assistant agent samples
- Closes #7267
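Throughout the new samples, streamed chunks are printed as they arrive, with the role prefixed only on the first chunk. A minimal, self-contained sketch of that pattern (a stub async generator stands in for `agent.invoke_stream`, which in real use requires an `OpenAIAssistantAgent` and a thread):

```python
# Minimal sketch of the chunk-printing pattern the new streaming samples use.
# `fake_invoke_stream` is a hypothetical stand-in for `agent.invoke_stream(thread_id=...)`.
import asyncio
from dataclasses import dataclass


@dataclass
class Chunk:
    role: str
    content: str


async def fake_invoke_stream():
    # Stand-in for the agent: yields partial message chunks.
    for piece in ["Why did the ", "chicken cross ", "the road?"]:
        yield Chunk(role="assistant", content=piece)


async def stream_response() -> str:
    collected: list[str] = []
    first_chunk = True
    async for chunk in fake_invoke_stream():
        if first_chunk:
            # Print the role prefix once, before the first chunk.
            print(f"# {chunk.role}: ", end="", flush=True)
            first_chunk = False
        print(chunk.content, end="", flush=True)
        collected.append(chunk.content)
    print()
    return "".join(collected)


if __name__ == "__main__":
    asyncio.run(stream_response())
```

The same shape appears in each sample below, with the stub replaced by the real `invoke_stream` call.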


### Contribution Checklist


- [X] The code builds clean without any errors or warnings
- [X] The PR follows the [SK Contribution
Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
and the [pre-submission formatting
script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts)
raises no violations
- [X] All unit tests pass, and I have added new tests where possible
- [X] I didn't break anyone 😄
moonbox3 authored Oct 2, 2024
1 parent ca17571 commit f08cf3c
Showing 33 changed files with 1,791 additions and 56 deletions.
2 changes: 1 addition & 1 deletion docs/PLANNERS.md
@@ -2,4 +2,4 @@

This document has been moved to the Semantic Kernel Documentation site. You can find it by navigating to the [Automatically orchestrate AI with planner](https://learn.microsoft.com/en-us/semantic-kernel/ai-orchestration/planner) page.

To make an update on the page, file a PR on the [docs repo.](https://github.com/MicrosoftDocs/semantic-kernel-docs/blob/main/semantic-kernel/ai-orchestration/planner.md)
To make an update on the page, file a PR on the [docs repo.](https://github.com/MicrosoftDocs/semantic-kernel-docs/blob/main/semantic-kernel/concepts/planning.md)
6 changes: 6 additions & 0 deletions python/samples/concepts/README.md
@@ -41,6 +41,12 @@ In Semantic Kernel for Python, we leverage Pydantic Settings to manage configura
3. **Direct Constructor Input:**
- As an alternative to environment variables and `.env` files, you can pass the required settings directly through the constructor of the AI Connector or Memory Connector.

## Microsoft Entra Token Authentication

To authenticate to your Azure resources with a Microsoft Entra authentication token, the `AzureChatCompletion` AI service connector now supports this as a built-in feature. If you do not provide an API key (through an environment variable, a `.env` file, or the constructor), and you also do not provide a custom `AsyncAzureOpenAI` client, an `ad_token`, or an `ad_token_provider`, the `AzureChatCompletion` connector will attempt to retrieve a token using [`DefaultAzureCredential`](https://learn.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential?view=azure-python).

To successfully retrieve and use the Entra auth token, you need the `Cognitive Services OpenAI Contributor` role assigned on your Azure OpenAI resource. By default, the `https://cognitiveservices.azure.com` token endpoint is used. You can override this endpoint by setting the `AZURE_OPENAI_TOKEN_ENDPOINT` environment variable (or the corresponding `.env` entry), or by passing a new value to the `AzureChatCompletion` constructor as part of `AzureOpenAISettings`.
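Token-based auth can also be wired up explicitly rather than relying on the `DefaultAzureCredential` fallback. The sketch below assumes the `azure-identity` package is installed; the deployment name and endpoint are placeholders:

```python
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion

# Wrap the credential in a callable that fetches tokens for the default scope.
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

chat_service = AzureChatCompletion(
    service_id="chat",
    deployment_name="my-deployment",                  # placeholder
    endpoint="https://my-resource.openai.azure.com",  # placeholder
    ad_token_provider=token_provider,
)
```

Because an `ad_token_provider` is supplied, the connector skips the API-key and `DefaultAzureCredential` paths entirely.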

## Best Practices

- **.env File Placement:** We highly recommend placing the `.env` file in the `semantic-kernel/python` root directory. This is a common practice when developing in the Semantic Kernel repository.
37 changes: 37 additions & 0 deletions python/samples/concepts/agents/README.md
@@ -0,0 +1,37 @@
# Semantic Kernel: Agent concept examples

This project contains a step-by-step guide to get started with _Semantic Kernel Agents_ in Python.

#### PyPI:
- For the use of Chat Completion agents, the minimum allowed Semantic Kernel PyPI version is 1.3.0.
- For the use of OpenAI Assistant agents, the minimum allowed Semantic Kernel PyPI version is 1.4.0.
- For the use of Agent Group Chat, the minimum allowed Semantic Kernel PyPI version is 1.6.0.
- For the use of Streaming OpenAI Assistant agents, the minimum allowed Semantic Kernel PyPI version is 1.11.0.

#### Source

- [Semantic Kernel Agent Framework](../../../semantic_kernel/agents/)

## Examples

The concept agents examples are grouped by prefix:

Prefix|Description
---|---
assistant|How to use agents based on the [OpenAI Assistant API](https://platform.openai.com/docs/assistants).
chat_completion|How to use Semantic Kernel Chat Completion agents.
mixed_chat|How to combine different agent types.
complex_chat|**Coming Soon**

*Note: As we strive for parity with .NET, more getting_started_with_agent samples will be added. The current steps and names may be revised to further align with our .NET counterpart.*

## Configuring the Kernel

As with the other Semantic Kernel Python concept samples, you must configure the secrets
and keys used by the kernel. See the "Configuring the Kernel" [guide](../README.md#configuring-the-kernel) for
more information.

## Running Concept Samples

Concept samples can be run in an IDE or via the command line. After setting up the required API key or token authentication
for your AI connector, the samples run without any extra command-line arguments.
34 changes: 26 additions & 8 deletions python/samples/concepts/agents/assistant_agent_chart_maker.py
@@ -4,6 +4,7 @@
from semantic_kernel.agents.open_ai import AzureAssistantAgent, OpenAIAssistantAgent
from semantic_kernel.contents.chat_message_content import ChatMessageContent
from semantic_kernel.contents.file_reference_content import FileReferenceContent
from semantic_kernel.contents.streaming_file_reference_content import StreamingFileReferenceContent
from semantic_kernel.contents.utils.author_role import AuthorRole
from semantic_kernel.kernel import Kernel

@@ -19,6 +20,8 @@
# Note: you may toggle this to switch between AzureOpenAI and OpenAI
use_azure_openai = True

streaming = True


# A helper method to invoke the agent with the user input
async def invoke_agent(agent: OpenAIAssistantAgent, thread_id: str, input: str) -> None:
@@ -27,14 +30,29 @@ async def invoke_agent(agent: OpenAIAssistantAgent, thread_id: str, input: str)

    print(f"# {AuthorRole.USER}: '{input}'")

    async for message in agent.invoke(thread_id=thread_id):
        if message.content:
            print(f"# {message.role}: {message.content}")

        if len(message.items) > 0:
            for item in message.items:
                if isinstance(item, FileReferenceContent):
                    print(f"\n`{message.role}` => {item.file_id}")
    if streaming:
        first_chunk = True
        async for message in agent.invoke_stream(thread_id=thread_id):
            if message.content:
                if first_chunk:
                    print(f"# {message.role}: ", end="", flush=True)
                    first_chunk = False
                print(message.content, end="", flush=True)

            if len(message.items) > 0:
                for item in message.items:
                    if isinstance(item, StreamingFileReferenceContent):
                        print(f"\n# {message.role} => {item.file_id}")
        print()
    else:
        async for message in agent.invoke(thread_id=thread_id):
            if message.content:
                print(f"# {message.role}: {message.content}")

            if len(message.items) > 0:
                for item in message.items:
                    if isinstance(item, FileReferenceContent):
                        print(f"\n`{message.role}` => {item.file_id}")


async def main():
@@ -17,7 +17,7 @@
AGENT_INSTRUCTIONS = "You are a funny comedian who loves telling G-rated jokes."

# Note: you may toggle this to switch between AzureOpenAI and OpenAI
use_azure_openai = False
use_azure_openai = True


# A helper method to invoke the agent with the user input
110 changes: 110 additions & 0 deletions python/samples/concepts/agents/assistant_agent_streaming.py
@@ -0,0 +1,110 @@
# Copyright (c) Microsoft. All rights reserved.
import asyncio
from typing import Annotated

from semantic_kernel.agents.open_ai import AzureAssistantAgent, OpenAIAssistantAgent
from semantic_kernel.contents.chat_message_content import ChatMessageContent
from semantic_kernel.contents.utils.author_role import AuthorRole
from semantic_kernel.functions.kernel_function_decorator import kernel_function
from semantic_kernel.kernel import Kernel

#####################################################################
# The following sample demonstrates how to create an OpenAI #
# assistant using either Azure OpenAI or OpenAI. OpenAI Assistants #
# allow for function calling, the use of file search and a #
# code interpreter. Assistant Threads are used to manage the #
# conversation state, similar to a Semantic Kernel Chat History. #
# This sample also demonstrates the Assistants Streaming #
# capability and how to manage an Assistants chat history. #
#####################################################################

HOST_NAME = "Host"
HOST_INSTRUCTIONS = "Answer questions about the menu."

# Note: you may toggle this to switch between AzureOpenAI and OpenAI
use_azure_openai = True


# Define a sample plugin for the sample
class MenuPlugin:
    """A sample Menu Plugin used for the concept sample."""

    @kernel_function(description="Provides a list of specials from the menu.")
    def get_specials(self) -> Annotated[str, "Returns the specials from the menu."]:
        return """
        Special Soup: Clam Chowder
        Special Salad: Cobb Salad
        Special Drink: Chai Tea
        """

    @kernel_function(description="Provides the price of the requested menu item.")
    def get_item_price(
        self, menu_item: Annotated[str, "The name of the menu item."]
    ) -> Annotated[str, "Returns the price of the menu item."]:
        return "$9.99"


# A helper method to invoke the agent with the user input
async def invoke_agent(
    agent: OpenAIAssistantAgent, thread_id: str, input: str, history: list[ChatMessageContent]
) -> None:
    """Invoke the agent with the user input."""
    message = ChatMessageContent(role=AuthorRole.USER, content=input)
    await agent.add_chat_message(thread_id=thread_id, message=message)

    # Add the user message to the history
    history.append(message)

    print(f"# {AuthorRole.USER}: '{input}'")

    first_chunk = True
    async for content in agent.invoke_stream(thread_id=thread_id, messages=history):
        if content.role != AuthorRole.TOOL:
            if first_chunk:
                print(f"# {content.role}: ", end="", flush=True)
                first_chunk = False
            print(content.content, end="", flush=True)
    print()


async def main():
    # Create the instance of the Kernel
    kernel = Kernel()

    # Add the sample plugin to the kernel
    kernel.add_plugin(plugin=MenuPlugin(), plugin_name="menu")

    # Create the OpenAI Assistant Agent
    service_id = "agent"
    if use_azure_openai:
        agent = await AzureAssistantAgent.create(
            kernel=kernel, service_id=service_id, name=HOST_NAME, instructions=HOST_INSTRUCTIONS
        )
    else:
        agent = await OpenAIAssistantAgent.create(
            kernel=kernel, service_id=service_id, name=HOST_NAME, instructions=HOST_INSTRUCTIONS
        )

    thread_id = await agent.create_thread()

    history: list[ChatMessageContent] = []

    try:
        await invoke_agent(agent, thread_id=thread_id, input="Hello", history=history)
        await invoke_agent(agent, thread_id=thread_id, input="What is the special soup?", history=history)
        await invoke_agent(agent, thread_id=thread_id, input="What is the special drink?", history=history)
        await invoke_agent(agent, thread_id=thread_id, input="Thank you", history=history)
    finally:
        await agent.delete_thread(thread_id)
        await agent.delete()

    # You may then view the conversation history
    print("========= Conversation History =========")
    for content in history:
        if content.role != AuthorRole.TOOL:
            print(f"# {content.role}: {content.content}")
    print("========= End of Conversation History =========")


if __name__ == "__main__":
    asyncio.run(main())
95 changes: 95 additions & 0 deletions python/samples/concepts/agents/mixed_chat_streaming.py
@@ -0,0 +1,95 @@
# Copyright (c) Microsoft. All rights reserved.

import asyncio

from semantic_kernel.agents import AgentGroupChat, ChatCompletionAgent
from semantic_kernel.agents.open_ai import OpenAIAssistantAgent
from semantic_kernel.agents.strategies.termination.termination_strategy import TerminationStrategy
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion
from semantic_kernel.contents.chat_message_content import ChatMessageContent
from semantic_kernel.contents.utils.author_role import AuthorRole
from semantic_kernel.kernel import Kernel

#####################################################################
# The following sample demonstrates how to create an OpenAI #
# assistant using either Azure OpenAI or OpenAI, a chat completion #
# agent and have them participate in a group chat to work towards #
# the user's requirement. #
#####################################################################


class ApprovalTerminationStrategy(TerminationStrategy):
    """A strategy for determining when an agent should terminate."""

    async def should_agent_terminate(self, agent, history):
        """Check if the agent should terminate."""
        return "approved" in history[-1].content.lower()


REVIEWER_NAME = "ArtDirector"
REVIEWER_INSTRUCTIONS = """
You are an art director who has opinions about copywriting born of a love for David Ogilvy.
The goal is to determine if the given copy is acceptable to print.
If so, state that it is approved. Only include the word "approved" if it is so.
If not, provide insight on how to refine suggested copy without example.
"""

COPYWRITER_NAME = "CopyWriter"
COPYWRITER_INSTRUCTIONS = """
You are a copywriter with ten years of experience and are known for brevity and a dry humor.
The goal is to refine and decide on the single best copy as an expert in the field.
Only provide a single proposal per response.
You're laser focused on the goal at hand.
Don't waste time with chit chat.
Consider suggestions when refining an idea.
"""


def _create_kernel_with_chat_completion(service_id: str) -> Kernel:
    kernel = Kernel()
    kernel.add_service(AzureChatCompletion(service_id=service_id))
    return kernel


async def main():
    try:
        agent_reviewer = ChatCompletionAgent(
            service_id="artdirector",
            kernel=_create_kernel_with_chat_completion("artdirector"),
            name=REVIEWER_NAME,
            instructions=REVIEWER_INSTRUCTIONS,
        )

        agent_writer = await OpenAIAssistantAgent.create(
            service_id="copywriter",
            kernel=Kernel(),
            name=COPYWRITER_NAME,
            instructions=COPYWRITER_INSTRUCTIONS,
        )

        chat = AgentGroupChat(
            agents=[agent_writer, agent_reviewer],
            termination_strategy=ApprovalTerminationStrategy(agents=[agent_reviewer], maximum_iterations=10),
        )

        input = "a slogan for a new line of electric cars."

        await chat.add_chat_message(ChatMessageContent(role=AuthorRole.USER, content=input))
        print(f"# {AuthorRole.USER}: '{input}'")

        last_agent = None
        async for message in chat.invoke_stream():
            if message.content is not None:
                if last_agent != message.name:
                    print(f"\n# {message.name}: ", end="", flush=True)
                    last_agent = message.name
                print(f"{message.content}", end="", flush=True)

        print()
        print(f"# IS COMPLETE: {chat.is_complete}")
    finally:
        await agent_writer.delete()


if __name__ == "__main__":
    asyncio.run(main())
4 changes: 4 additions & 0 deletions python/samples/getting_started/CONFIGURING_THE_KERNEL.md
@@ -61,3 +61,7 @@ chat_completion = AzureChatCompletion(service_id="test", env_file_path=env_file_

- Manually configure the `api_key` or required parameters on either the `OpenAIChatCompletion` or `AzureChatCompletion` constructor with keyword arguments.
- This requires the user to manage their own keys/secrets as they aren't relying on the underlying environment variables or `.env` file.
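As a sketch of this constructor-argument approach (all key, deployment, and endpoint values below are placeholders; manage real secrets securely):

```python
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion, OpenAIChatCompletion

# OpenAI: pass the key and model id directly instead of relying on env vars.
openai_chat = OpenAIChatCompletion(
    service_id="oai_chat",
    api_key="sk-placeholder",  # placeholder
    ai_model_id="gpt-4o",      # placeholder
)

# Azure OpenAI: pass the key, deployment, and endpoint directly.
azure_chat = AzureChatCompletion(
    service_id="az_chat",
    api_key="azure-key-placeholder",                  # placeholder
    deployment_name="my-deployment",                  # placeholder
    endpoint="https://my-resource.openai.azure.com",  # placeholder
)
```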

### 4. Microsoft Entra Authentication

To learn how to use a Microsoft Entra Authentication token to authenticate to your Azure OpenAI resource, please navigate to the following [guide](../concepts/README.md#microsoft-entra-token-authentication).
1 change: 1 addition & 0 deletions python/samples/getting_started_with_agents/README.md
@@ -6,6 +6,7 @@ This project contains a step by step guide to get started with _Semantic Kernel
- For the use of Chat Completion agents, the minimum allowed Semantic Kernel PyPI version is 1.3.0.
- For the use of OpenAI Assistant agents, the minimum allowed Semantic Kernel PyPI version is 1.4.0.
- For the use of Agent Group Chat, the minimum allowed Semantic Kernel PyPI version is 1.6.0.
- For the use of Streaming OpenAI Assistant agents, the minimum allowed Semantic Kernel PyPI version is 1.11.0.

#### Source

@@ -35,16 +35,29 @@ def create_message_with_image_reference(input: str, file_id: str) -> ChatMessage
)


streaming = False


# A helper method to invoke the agent with the user input
async def invoke_agent(agent: OpenAIAssistantAgent, thread_id: str, message: ChatMessageContent) -> None:
    """Invoke the agent with the user input."""
    await agent.add_chat_message(thread_id=thread_id, message=message)

    print(f"# {AuthorRole.USER}: '{message.items[0].text}'")

    async for content in agent.invoke(thread_id=thread_id):
        if content.role != AuthorRole.TOOL:
            print(f"# {content.role}: {content.content}")
    if streaming:
        first_chunk = True
        async for content in agent.invoke_stream(thread_id=thread_id):
            if content.role != AuthorRole.TOOL:
                if first_chunk:
                    print(f"# {content.role}: ", end="", flush=True)
                    first_chunk = False
                print(content.content, end="", flush=True)
        print()
    else:
        async for content in agent.invoke(thread_id=thread_id):
            if content.role != AuthorRole.TOOL:
                print(f"# {content.role}: {content.content}")


async def main():