Fix GEMMA_2, LLAMA_3, and PHI_3 MessagesFormatter #73
Conversation
Gemma needs another \n to avoid slow processing of follow-up prompts, see Maximilian-Winter#54
Add another \n to speed up follow-up prompts
<|end_of_turn|> does not exist
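For context, a minimal sketch of what the Gemma fix looks like, assuming the GEMMA_2 formatter uses the same PromptMarkers structure as the llama_3 markers quoted later in this thread (the import path and the exact marker strings here are assumptions based on the Gemma chat template, not this PR's diff):

```python
# Sketch, not the PR diff. Import path is an assumption.
from llama_cpp_agent.messages_formatter import PromptMarkers, Roles

# Gemma's correct end token is <end_of_turn> (not <|end_of_turn|>), and a
# trailing \n is appended so the rebuilt prompt matches what the model
# actually saw in the previous turn.
gemma_2_prompt_markers = {
    Roles.user: PromptMarkers("""<start_of_turn>user\n""", """<end_of_turn>\n"""),
    Roles.assistant: PromptMarkers("""<start_of_turn>model\n""", """<end_of_turn>\n"""),
}
```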
Great patch. I would check all of them to see if any is missing a typo or newline. Thank you
It seems we keep missing some newlines in there. Do you want to add the Llama 3 fix in this PR as well?
llama_3_prompt_markers = {
Roles.system: PromptMarkers("""<|start_header_id|>system<|end_header_id|>\n\n""", """<|eot_id|>"""),
Roles.user: PromptMarkers("""<|start_header_id|>user<|end_header_id|>\n\n""", """<|eot_id|>"""),
Roles.assistant: PromptMarkers("""<|start_header_id|>assistant<|end_header_id|>\n\n""", """<|eot_id|>"""),
Roles.tool: PromptMarkers("""<|start_header_id|>function_calling_results<|end_header_id|>\n""", """<|eot_id|>"""),
}
I don't know if another \n is needed for the tool role there as well.
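To make the caching argument concrete, here is a toy illustration (not llama-cpp-agent or llama.cpp code): the runtime reuses the evaluated KV cache for the longest exactly matching prefix of the new prompt, so one missing \n right after the assistant header forces everything from that point on, including the whole previous answer, to be reprocessed.

```python
# Toy stand-in for KV-cache prefix reuse: characters instead of tokens.
def shared_prefix_len(old_prompt: str, new_prompt: str) -> int:
    n = 0
    for a, b in zip(old_prompt, new_prompt):
        if a != b:
            break
        n += 1
    return n

# What the model actually generated last turn (the template emits \n\n)...
evaluated = "<|start_header_id|>assistant<|end_header_id|>\n\nHi there!"
# ...vs. a history rebuilt with only one \n: the prefixes diverge right
# after the header, so the previous answer is no longer covered by the cache.
rebuilt = "<|start_header_id|>assistant<|end_header_id|>\nHi there!"
print(shared_prefix_len(evaluated, rebuilt))
```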
I saw the issue you filed. If you want to add the missing Llama 3 fix too, that would be awesome. Thank you for the contribution.
I don't use tools, so I don't know if another \n is missing there. But Llama 3 is also fixed in this PR. I think a better long-term fix in general would be my proposal in #54, i.e. to keep the leading and trailing newlines (and spaces?) produced by the model and store them in history. I don't know the code (and I am a Python noob), but it feels like they are being stripped.
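A hedged sketch of that #54 proposal (the create_completion call is llama-cpp-python style; the model path, prompt string, and history list are hypothetical stand-ins for the library's chat history):

```python
from llama_cpp import Llama  # llama-cpp-python

llm = Llama(model_path="model.gguf")  # placeholder path
history = []

prompt = "<|start_header_id|>user<|end_header_id|>\n\nHello<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
# Keep the raw completion, including leading/trailing whitespace, in history,
# so the next prompt rebuilt from history is byte-identical to what was
# already evaluated.
raw_reply = llm.create_completion(prompt)["choices"][0]["text"]
history.append({"role": "assistant", "content": raw_reply})  # no .strip() here
print(raw_reply.strip())  # strip only for display, never for the stored history
```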
We only need one more \n in Roles.assistant, not in Roles.user (there it should not matter).
Is there any reason not to merge this?
@woheller69 Sorry, was busy with other stuff, will merge this after checking it in the next few days.
Mistral Small 22B needs an additional space for faster multi-turn conversation: Roles.assistant: PromptMarkers(""" """, """""")
What is the reason for stripping the prompt? For Mistral Small it can also be fixed if the assistant reply is not stripped. I think stripping alters the history the LLM knows.
Can that be the root cause?
Also, _format_response contains strip() and seems to be part of the problem. If I set strip_prompt = False, Llama 3, Mixtral, and Gemma work fine without the modifications I proposed above.
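For anyone wanting to try the workaround, a sketch under stated assumptions: get_predefined_messages_formatter and MessagesFormatterType are names from the library, but the import path and whether strip_prompt can simply be toggled on a predefined formatter like this are assumptions.

```python
# Assumed import path; adjust to the installed llama-cpp-agent version.
from llama_cpp_agent.messages_formatter import (
    MessagesFormatterType,
    get_predefined_messages_formatter,
)

formatter = get_predefined_messages_formatter(MessagesFormatterType.LLAMA_3)
# Keep the model's own newlines so the cached prefix matches (assumption:
# strip_prompt is a plain attribute that can be overridden after construction).
formatter.strip_prompt = False
```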
Is this project still maintained? |
Hello @woheller69, I'm here to help maintain the repo.
Looks good, thank you.
Do you know the reason for this: "So essentially: what is the advantage of strip_prompt = True, which is set by default?"
No idea... but I would ask @Maximilian-Winter whenever he has time. Also @woheller69, if you have Discord we can talk there. I'm looking forward to having people help us with the repo.
Gemma needs another \n in the prompt template to avoid slow processing of follow-up prompts, see #54.
As can be seen in the attached screenshot, prompt processing is much faster.
Before: the new user prompt plus the model's previous answer are processed (42 tokens).
After: only the new user prompt is processed (18 tokens).
The longer the model's previous answer, the more relevant this becomes.
Probably there are similar issues in all other prompt templates.