Support continue final message #2733
base: main
Conversation
disable_grammar_support=False,
use_flash_attention=False,
Remove those.
Removed in the latest commit.
"max_tokens": 30,
"stream": False,
"seed": 1337,
"continue_final_message": False,
Why can't we assume that a final assistant message means the user wants a continuation?
There's no use case where the LLM produces a user output, right?
That's an interesting idea. Currently, a conversation could hypothetically contain back-to-back assistant messages (assuming the template allows it), i.e.
<system>
<user>
<assistant>
<assistant>
I could see this pattern being used when injecting some information. For example, a request may contain messages like:
"messages": [
{"role": "system", "content": "system message"},
{"role": "user", "content": "what's ticket 10 about?"},
{"role": "assistant", "content": "Ticket Information:\nTicket ID: 10\nTicket Title: Ticket 10\nTicket Description: This is a test ticket\nTicket Status: Open\nTicket Priority: High\nTicket Type: Bug\nTicket Assignee: John Doe\nTicket Reporter: Jane Doe\nTicket Created: 2022-01-01 00:00:00\nTicket Updated: 2022-01-01 00:00:00\n"},
],
and the user may expect a new message rather than a continuation of the last one.
However, it would be nice to disallow this pattern to avoid the extra argument, and always continue the last message if it's from the assistant (or use some other heuristic).
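That heuristic can be sketched in a few lines. This is a hypothetical illustration of the "continue when the final message is from the assistant" rule under discussion, not the actual server code:

```python
# Sketch of the proposed heuristic (not the actual implementation): generation
# continues the last message whenever that message is from the assistant,
# instead of starting a new turn.

def should_continue_final_message(messages):
    """Return True when generation should extend the final assistant message."""
    return bool(messages) and messages[-1]["role"] == "assistant"

messages = [
    {"role": "system", "content": "system message"},
    {"role": "user", "content": "what's ticket 10 about?"},
    {"role": "assistant", "content": "Ticket Information:\n"},
]
print(should_continue_final_message(messages))  # True
```

With this rule, the back-to-back-assistant pattern above would always be treated as a continuation, which is exactly the trade-off being discussed.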
Does the pattern exist in the wild or not? If there's no evidence, let's not guess about what might be, and focus instead on what is.
We can add a flag when the current behavior proposal is no longer the only one (for instance, when there's a new use case). Until then, let's aim for simplicity.
Great point! I've removed continue_final_message and instead assume that if the final message is from the assistant, we will continue it.
This PR adds the continue_final_message param to chat requests, allowing users to continue the last message. This is useful when prefilling a response from the model. See https://huggingface.co/docs/transformers/main/en/chat_templating#what-does-continuefinalmessage-do for details.