
Chat with message history (draft) #225

Closed

Conversation

@leila-messallem (Contributor) commented Dec 6, 2024

Description

Suggestion for adding chat functionality with message history.

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Documentation update
  • Project configuration change

Complexity

Complexity: Medium

How Has This Been Tested?

  • Unit tests
  • E2E tests
  • Manual tests

Checklist

The following requirements should have been met (depending on the changes in the branch):

  • Documentation has been updated
  • Unit tests have been updated
  • E2E tests have been updated
  • Examples have been updated
  • New files have copyright header
  • CLA (https://neo4j.com/developer/cla/) has been signed
  • CHANGELOG.md updated if appropriate

@@ -18,17 +18,3 @@ repos:
language: system
types: [ python ]
stages: [ commit, push ]
- id: mypy
@leila-messallem (Contributor, author) commented Dec 6, 2024

Sorry about this, but I couldn't be bothered with mypy just for opening a draft.

Contributor

It's a good point, actually; maybe we should disable this check for draft PRs.

@@ -134,7 +136,10 @@ def search(
)
logger.debug(f"RAG: retriever_result={retriever_result}")
logger.debug(f"RAG: prompt={prompt}")
answer = self.llm.invoke(prompt)
if chat_history is not None:
@leila-messallem (Contributor, author)

I know this part isn't great. The idea I want to demonstrate is to have two different flows: one for chat with message history, and one for a single invocation.
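To make that idea concrete, here is a minimal sketch of what the branch in search could look like; the chat method name is only this draft's proposal, not a settled API:

if chat_history is not None:
    # chat flow: prior turns plus the RAG-augmented prompt go to the LLM
    answer = self.llm.chat(input=prompt, chat_history=chat_history)
else:
    # single-invocation flow: behaviour unchanged
    answer = self.llm.invoke(prompt)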

) -> Iterable[ChatCompletionMessageParam]:
messages = [{"role": "system", "content": self.system_instruction}]
for i, message in enumerate(chat_history):
if i % 2 == 0:
@leila-messallem (Contributor, author)

This assumes that the chat history is a list of strings, alternating between user and assistant messages. It's obviously fragile and should probably be changed to some kind of UserMessage and AssistantMessage objects, or something like that.

Contributor

Maybe we can use dicts with "role" and "content" keys (it would work for OpenAI; I'm not sure about the other providers).

@leila-messallem (Contributor, author)

I can look into it.
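To spell out what is being discussed, a small illustrative sketch of the alternating-strings assumption expressed as role/content dicts; the function name is made up for this example and is not part of the draft:

def history_to_messages(chat_history: list[str]) -> list[dict[str, str]]:
    messages = []
    for i, message in enumerate(chat_history):
        # even indices assumed to be user turns, odd indices assistant turns
        role = "user" if i % 2 == 0 else "assistant"
        messages.append({"role": role, "content": message})
    return messages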

super().__init__(model_name, model_params)
self.model = GenerativeModel(model_name=model_name, **kwargs)
super().__init__(model_name, model_params, system_instruction)
self.model = GenerativeModel(
@leila-messallem (Contributor, author)

On Vertex AI, system_instruction is set on the GenerativeModel object, so it is separated from the question prompt.

Contributor

This is quite bad news, actually: it means we can only use an LLM instance for one specific use case? For instance, with RAG using a Text2Cypher retriever, we couldn't use the same LLM for both retrieval and generation? (That is, if we want to use the system prompt, which is not what we do at the moment.)

@leila-messallem (Contributor, author)

I guess we don't always have to use the system_instruction, although it is the recommended way.
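For context on this thread, a minimal sketch of how Vertex AI attaches the system instruction to the model object rather than to the prompt; the model name and instruction text are only examples:

from vertexai.generative_models import GenerativeModel

# On Vertex AI the system instruction is bound to the model instance,
# not sent with each prompt.
model = GenerativeModel(
    model_name="gemini-1.5-flash",
    system_instruction=["Answer using only the provided context."],
)
response = model.generate_content("What does the context say about X?")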

@leila-messallem (Contributor, author)

Something that is currently missing is to include the chat history (or at least some of it) in the embeddings calculation.
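One possible direction, sketched only to illustrate the gap (the helper name and signature are hypothetical, not part of this draft): fold a few recent turns into the text that gets embedded for retrieval.

def build_retrieval_query(query_text: str, chat_history: list[str] | None, last_n: int = 4) -> str:
    # Hypothetical helper: prepend the last few turns so the embedding
    # reflects the conversational context, not just the latest question.
    recent = chat_history[-last_n:] if chat_history else []
    return "\n".join(recent + [query_text])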

@@ -52,6 +54,21 @@ def invoke(self, input: str) -> LLMResponse:
LLMGenerationError: If anything goes wrong.
"""

@abstractmethod
def chat(self, input: str, chat_history: list[str]) -> LLMResponse:
Contributor

I feel like this could be done in one method, with an optional chat_history. The difference would be handled in a get_messages method, which would do something like this:

def get_messages(self, input, chat_history) -> list:
    messages = []
    if self.system_instruction:
        messages.append({"role": "system", "content": self.system_instruction})
    if chat_history:
        # chat history entries are appended as-is (format discussed above)
        messages.extend(chat_history)
    messages.append({"role": "user", "content": input})
    return messages

This is an advantage if most of the LLM providers share the same format for messages, in which case we can implement this method only once in the interface.

@leila-messallem (Contributor, author)

Ok, that works.
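As a follow-up, a hedged sketch of how a single invoke with optional history could delegate to such a get_messages helper; the OpenAI-style client call is shown purely for illustration and is not the final interface:

def invoke(self, input: str, chat_history: list | None = None) -> LLMResponse:
    messages = self.get_messages(input, chat_history)
    # OpenAI-style call shown as an example; other providers would
    # adapt the message formatting in their own subclass.
    response = self.client.chat.completions.create(
        model=self.model_name,
        messages=messages,
    )
    return LLMResponse(content=response.choices[0].message.content)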

@leila-messallem (Contributor, author)

#234
