Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support continue final message #2733

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open

Conversation

drbh
Copy link
Collaborator

@drbh drbh commented Nov 8, 2024

This PR adds the continue_final_message param to chat requests to allow users to continue the last message. This is useful when prefilling a response from the model. Please see https://huggingface.co/docs/transformers/main/en/chat_templating#what-does-continuefinalmessage-do for details

Comment on lines 10 to 11
disable_grammar_support=False,
use_flash_attention=False,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove those.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed in latest commit

"max_tokens": 30,
"stream": False,
"seed": 1337,
"continue_final_message": False,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't we assume that having a final assistant message means that users what continuation ?

There's no use case where the LLM produces a user output right ?

Copy link
Collaborator Author

@drbh drbh Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thats an interesting idea, currently a conversation could hypothetically contain back to back assistant messages (assuming the template allows it)

ie.

<system>
<user>
<assistant>
<assistant>

I could see this pattern being used when injecting some information, for example a request may contain messages like below

"messages": [
    {"role": "system", "content": "system message"},
    {"role": "user", "content": "whats ticket 10 about?"},
    {"role": "assistant", "content": "Ticket Information:\nTicket ID: 10\nTicket Title: Ticket 10\nTicket Description: This is a test ticket\nTicket Status: Open\nTicket Priority: High\nTicket Type: Bug\nTicket Assignee: John Doe\nTicket Reporter: Jane Doe\nTicket Created: 2022-01-01 00:00:00\nTicket Updated: 2022-01-01 00:00:00\n"},
],

and the user may expect a new message, rather than a continuation of the last one.

however, it would be nice to disallow this pattern to avoid the extra argument, and always just continue the last message if its an assistant (or other heuristic)

Copy link
Collaborator

@Narsil Narsil Nov 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the pattern exist in the wild or not ? If there's no evidence, let's not guess about what might become, and focus instead on what is.

We can add a flag when the current behavior proposal is not the only one anymore (for instance when there's a new use case). Until then, let's aim for simplicity.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great point! I've removed the continue_final_message and prefer assuming if the final message is an assistant we will continue it

@drbh drbh force-pushed the support-continue-final-message branch 2 times, most recently from 0a011fb to e837d9f Compare November 21, 2024 17:02
@drbh drbh force-pushed the support-continue-final-message branch from c17635f to 8770b39 Compare November 22, 2024 19:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants