
Fix GEMMA_2, LLAMA_3, and PHI_3 MessagesFormatter #73

Merged

merged 3 commits into Maximilian-Winter:master from woheller69:patch-1 on Dec 10, 2024

Conversation

woheller69
Contributor

Gemma needs another \n in the prompt template to avoid slow processing of follow-up prompts, see #54.

[Screenshot from 2024-07-07 19-16-28: prompt processing before vs. after the fix]
As can be seen in the attached screenshot, prompt processing is much faster.
Before: the new user prompt plus the model's previous answer are processed (42 tokens).
After: only the new user prompt is processed (18 tokens).

The longer the model's previous answer, the more relevant this becomes.

There are probably similar issues in the other prompt templates as well.
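
For illustration only (not code from this repo), here is a minimal sketch of why the exact prefix matters: llama.cpp-style backends reuse cached tokens only for the longest common prefix between the previously evaluated prompt and the new one, so a single missing \n invalidates everything after it.

def shared_prefix_len(cached_tokens: list[int], new_tokens: list[int]) -> int:
    """Count how many leading tokens the new prompt shares with the cached one."""
    n = 0
    for a, b in zip(cached_tokens, new_tokens):
        if a != b:
            break
        n += 1
    return n

# If the rebuilt history differs from what the model actually generated
# (e.g. a dropped "\n"), the shared prefix ends there and every later token
# has to be evaluated again, which is the slowdown shown in the screenshot.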

Gemma needs another \n to avoid slow processing of follow up prompts, see Maximilian-Winter#54
Add another \n to speed up follow up prompts
@woheller69 woheller69 changed the title Fix GEMMA MessagesFormatter Fix GEMMA_2 and LLAMA_3 MessagesFormatter Jul 7, 2024
@woheller69 woheller69 changed the title Fix GEMMA_2 and LLAMA_3 MessagesFormatter Fix GEMMA_2, LLAMA_3, and PHI_3 MessagesFormatter Jul 7, 2024
@pabl-o-ce
Collaborator

Great patch. I would check all of them to see if any are missing a typo or a newline.

Thank you

@pabl-o-ce
Collaborator

pabl-o-ce commented Jul 7, 2024

It seems some newlines are still missing in there. Do you want to add the Llama 3 fix in this PR? The correct version would be like this:

llama_3_prompt_markers = {
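    # The "\n\n" after <|end_header_id|> matches the Llama 3 chat template, so the
    # stored history lines up with the text the model actually generated.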
    Roles.system: PromptMarkers("""<|start_header_id|>system<|end_header_id|>\n\n""", """<|eot_id|>"""),
    Roles.user: PromptMarkers("""<|start_header_id|>user<|end_header_id|>\n\n""", """<|eot_id|>"""),
    Roles.assistant: PromptMarkers("""<|start_header_id|>assistant<|end_header_id|>\n\n""", """<|eot_id|>"""),
    Roles.tool: PromptMarkers("""<|start_header_id|>function_calling_results<|end_header_id|>\n""", """<|eot_id|>"""),
}

I don't know if the tool role would need another \n newline; I'm not sure.

On Gemma I'm not sure a newline is required; I have to test it.

And on Phi you are correct :)

@pabl-o-ce
Collaborator

I saw the issue you referenced for Gemma, thanks.

If you want to add the missing Llama 3 fix too, that would be awesome.

Thank you for the contribution.

@woheller69
Contributor Author

I don't use tools, so I don't know if another \n is missing there. But Llama3 is also fixed in this PR.

But I think a better long-term fix in general would be my proposal in #54, i.e. to keep the leading and trailing newlines (and spaces?) produced by the model and store them in history. I don't know the code (and I am a Python noob), but it feels like they are being stripped.

@woheller69
Contributor Author

It seems some newlines are still missing in there. Do you want to add the Llama 3 fix in this PR? The correct version would be like this:

llama_3_prompt_markers = {
    Roles.system: PromptMarkers("""<|start_header_id|>system<|end_header_id|>\n\n""", """<|eot_id|>"""),
    Roles.user: PromptMarkers("""<|start_header_id|>user<|end_header_id|>\n\n""", """<|eot_id|>"""),
    Roles.assistant: PromptMarkers("""<|start_header_id|>assistant<|end_header_id|>\n\n""", """<|eot_id|>"""),
    Roles.tool: PromptMarkers("""<|start_header_id|>function_calling_results<|end_header_id|>\n""", """<|eot_id|>"""),
}

We only need one more \n in Roles.assistant, not in Roles.user (there it should not matter).
We only need to make sure that we append the latest model answer exactly as the model produced it, so that the tokens already processed are found in memory.
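
As a purely hypothetical illustration (marker strings taken from the Llama 3 block above), this is the one-character difference that breaks prefix reuse:

# What the model saw while generating its answer:
seen_by_model = "<|start_header_id|>assistant<|end_header_id|>\n\nHello!<|eot_id|>"
# What gets rebuilt from history if the assistant start marker has only one "\n":
rebuilt_from_history = "<|start_header_id|>assistant<|end_header_id|>\nHello!<|eot_id|>"
assert seen_by_model != rebuilt_from_history  # cached tokens after the mismatch must be re-processed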

@woheller69
Contributor Author

Is there any reason not to merge this?

@Maximilian-Winter
Owner

@woheller69 Sorry, I was busy with other stuff; I will merge this after checking it in the next few days.

@woheller69
Contributor Author

Mistral Small 22B needs an additional space for faster multi-turn conversation: Roles.assistant: PromptMarkers(""" """, """""")
Unfortunately, other Mistral-based models like Mistral-Nemo do not need it...

@woheller69
Contributor Author

What is the reason for stripping the prompt? For Mistral Small it can also be fixed like this, by not stripping the assistant reply; I think stripping alters the history the LLM knows:

def _format_message_content(self, content: str, role: Roles) -> str:
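    # Keep the assistant's reply exactly as the model produced it, so the rebuilt
    # prompt matches the tokens already in the cache on the next turn.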
    if self.strip_prompt and role != Roles.assistant:
        return content.strip()
    else:
        return content

Can that be the root cause?

@woheller69
Contributor Author

def _format_response(
        self,
        formatted_messages: str,
        last_role: Roles,
        response_role: Literal[Roles.user, Roles.assistant] | None = None,
) -> Tuple[str, Roles]:
    if response_role is None:
        response_role = Roles.assistant if last_role != Roles.assistant else Roles.user

    prompt_start = self.prompt_markers[response_role].start.strip() if self.strip_prompt else self.prompt_markers[
        response_role].start
    return formatted_messages + prompt_start, response_role

Also _format_response contains strip() and seems to be part of the problem.

If I set strip_prompt = False, Llama 3, Mixtral, and Gemma work fine without the modifications I proposed above.
So essentially: what is the advantage of strip_prompt = True, which is set by default?
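
To make the question concrete, here is a small sketch (plain Python, not the repo's actual code path) of what strip() does to the assistant start marker from the Llama 3 block above:

start = "<|start_header_id|>assistant<|end_header_id|>\n\n"
print(repr(start.strip()))  # trailing "\n\n" removed, no longer matches what the model saw
print(repr(start))          # unchanged, matches the text from the previous turn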

@woheller69
Contributor Author

Is this project still maintained?

@pabl-o-ce
Collaborator

Hello @woheller69, I'm here to help / maintain the repo.

Collaborator

@pabl-o-ce pabl-o-ce left a comment


Looks good, thank you.

@pabl-o-ce pabl-o-ce merged commit 40b1eb8 into Maximilian-Winter:master Dec 10, 2024
@woheller69 woheller69 deleted the patch-1 branch December 10, 2024 17:39
@woheller69
Contributor Author

Do you know the answer to my question above, "what is the advantage of strip_prompt = True, which is set by default?"
This seems to be the root cause.

@pabl-o-ce
Collaborator

No idea... but I would ask @Maximilian-Winter whenever he has time. Also @woheller69, if you have Discord we can talk there; I'm looking forward to having people help us with the repo.
