
Function calling #175

Merged 21 commits into main on May 30, 2024

Conversation

@chadbailey59 (Contributor) commented May 23, 2024

No description provided.

@chadbailey59 chadbailey59 force-pushed the cb/function-calling branch from 3418f19 to 729aca3 Compare May 24, 2024 17:56
@chadbailey59 (Contributor, Author):

So, function calling is... weird.

Without function calling, a chatbot pipeline is pretty straightforward:

  • The user says something
  • Pipecat appends whatever the user said to a messages list and sends that list to the LLM
  • The LLM generates an assistant response
  • Pipecat generates TTS from that assistant response and plays that audio through the transport
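The simple flow above is really just message-list bookkeeping. Here's a standalone sketch with the LLM and TTS calls stubbed out; none of these names are real Pipecat API, the real framework wires these steps together as pipeline processors:

```python
# Standalone model of the no-tools chatbot flow. `fake_llm` and `fake_tts`
# are stand-ins for the real LLM and TTS services.

def fake_llm(messages):
    """Stand-in for an LLM completion call."""
    return {"role": "assistant", "content": f"You said: {messages[-1]['content']}"}

def fake_tts(text):
    """Stand-in for a TTS service; returns 'audio' for the transport."""
    return f"<audio for: {text}>"

def handle_user_turn(messages, user_text):
    # 1. Append whatever the user said to the messages list
    messages.append({"role": "user", "content": user_text})
    # 2. Send the list to the LLM and get an assistant response
    assistant = fake_llm(messages)
    messages.append(assistant)
    # 3. Generate TTS from the assistant response and "play" it
    return fake_tts(assistant["content"])

messages = [{"role": "system", "content": "You are a helpful bot."}]
audio = handle_user_turn(messages, "hello")
```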

With function calling, the flow is different:

  • The user says something
  • Pipecat appends whatever the user said to a messages list (along with some possible "tools" it may choose to use) and sends that list to the LLM
  • The LLM generates an assistant response that may be text, or it may be a "tool call", i.e. the LLM decides to use one of the available "tools"
    • If it's text:
      • Pipecat generates TTS from that assistant response and plays that audio through the transport
    • If it's a tool call:
      • Pipecat needs to append the assistant message with the tool call, including its params, to the message list
      • Pipecat then needs to call the requested function with the provided params (e.g. "check_weather" with params {location: 'san francisco'})
      • Pipecat then needs to append the results from that function call to the messages list in the API's dedicated tool-result message format
      • Pipecat then needs to re-prompt the LLM with the new messages list to generate an answer to the user's question
      • Finally, Pipecat generates TTS from that second assistant response and plays that audio through the transport

Right now, that entire second branch is implemented in the 15-function-calling example as a FunctionCaller class that pushes a context frame back up the pipeline for the re-prompting. We should probably be handling all this inside the framework itself, but that starts to touch on how much context management we should be doing on behalf of the user.
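The tool-call branch can be sketched standalone like this, using OpenAI-style message shapes; `check_weather`, the ids, and `handle_tool_call` are illustrative stand-ins, not the actual FunctionCaller implementation:

```python
# Standalone model of the tool-call branch: append the tool call, run the
# function, append the result in the API's tool-result message format.
import json

def check_weather(location):
    return f"Sunny in {location}"

TOOLS = {"check_weather": check_weather}

def handle_tool_call(messages, tool_call):
    # Append the assistant message containing the tool call (with params)
    messages.append({"role": "assistant", "tool_calls": [tool_call]})
    # Call the requested function with the provided params
    fn = TOOLS[tool_call["function"]["name"]]
    args = json.loads(tool_call["function"]["arguments"])
    result = fn(**args)
    # Append the result in the tool-result format the API expects
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": result,
    })
    # Re-prompting the LLM with `messages` would happen here

messages = []
call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "check_weather",
                 "arguments": json.dumps({"location": "san francisco"})},
}
handle_tool_call(messages, call)
```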

llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
tma_out # Santa Cat spoken responses
Contributor:

Were all the comments removed? Also, we should use WakeCheckFilter instead.

]

})
self._context.add_message(tool_call)
Contributor:
I believe this can be added internally when we push LLMFunctionCallFrame. There's nothing the user should have to do here.

Contributor (Author):

I'm hesitant to do this on behalf of the user, because as a general rule we don't mess with the context in the framework. In the patient-intake example, I'm kind of misusing function calling and not actually inserting function call results into the context, for example.

Contributor:

I see. Well, in this case it's not the result, it's just the function call itself. Maybe for this one we could add an argument to the LLM service, something like include_function_calling_in_context (defaulting to True). Asking the user to handle all this is probably too much?
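The suggestion above could look something like this sketch; `include_function_calling_in_context` comes from the comment, but `LLMServiceSketch`, `push_function_call`, and the list-based context are all made up for illustration:

```python
# Hypothetical sketch: the LLM service records the tool call in the context
# only when the flag is on, so users can opt out of automatic context edits.
class LLMServiceSketch:
    def __init__(self, context, include_function_calling_in_context=True):
        self._context = context
        self._include = include_function_calling_in_context

    def push_function_call(self, tool_call):
        # Record the call in the context when the flag is on; an
        # LLMFunctionCallFrame would be pushed downstream either way.
        if self._include:
            self._context.append(tool_call)

ctx_on, ctx_off = [], []
LLMServiceSketch(ctx_on).push_function_call({"name": "check_weather"})
LLMServiceSketch(ctx_off, include_function_calling_in_context=False).push_function_call({"name": "check_weather"})
```

The default of True matches the comment's suggestion: most users get the expected behavior, and cases like the patient-intake example can opt out.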

@chadbailey59 (Contributor, Author):

@aconchillo I think I've addressed the concerns and the function calling code is ready to merge. But I'm still concerned I may have inadvertently undone some of your changes through various merges and rebases.

@chadbailey59 chadbailey59 requested a review from aconchillo May 28, 2024 17:07
LLMFullResponseEndFrame,
LLMFullResponseStartFrame,
LLMFunctionStartFrame,
Contributor:

Remove LLMFunctionStartFrame and CallFrame

@chadbailey59 (Contributor, Author):

OK, I think I've gotten the new function calling approach where it needs to be. Let's get this one merged, and I can remove the function call frame types in a follow-up PR. @aconchillo

@chadbailey59 chadbailey59 requested a review from aconchillo May 30, 2024 14:47
self._context.add_message(
{"role": "system", "content": "Finally, ask the user the reason for their doctor visit today. Once they answer, call the list_visit_reasons function."})
await llm.process_frame(OpenAILLMContextFrame(self._context), FrameDirection.DOWNSTREAM)
pass
Contributor:

Should all these functions return None?

Contributor:

They do implicitly, but it may be better to be explicit about it.
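The point is small but worth illustrating; `Ctx` here is a stand-in for the real context object, not Pipecat's class:

```python
# Implicit vs. explicit None returns in handlers like the one quoted above.
class Ctx:
    def __init__(self):
        self.messages = []
    def add_message(self, m):
        self.messages.append(m)

def implicit_style(context):
    context.add_message({"role": "system", "content": "..."})
    # falls off the end, so Python returns None implicitly

def explicit_style(context):
    context.add_message({"role": "system", "content": "..."})
    return None  # explicit: this handler intentionally produces no value
```

Both behave identically; the explicit `return None` just documents that the handler is side-effect-only.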

@chadbailey59 chadbailey59 force-pushed the cb/function-calling branch from f09a9ba to ed70c71 Compare May 30, 2024 16:48
@chadbailey59 chadbailey59 merged commit 4c3d19c into main May 30, 2024
2 of 3 checks passed
@chadbailey59 chadbailey59 deleted the cb/function-calling branch May 30, 2024 17:25
@aconchillo aconchillo mentioned this pull request Jun 6, 2024