Support for Microsoft's Phi-2 model #2548

Merged
merged 3 commits into from
Jan 11, 2024

@vince62s vince62s commented Jan 11, 2024

Just a note on this PR for future reference. Phi-2 from MSFT uses a rotary dim (32) that differs from the dim per head (2560/32 = 80), which makes things a little bit awkward: rotary embeddings are applied only to the first 32 dimensions of each head, and the remaining ones (33 to 80) are just a plain copy.
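
To make the note above concrete, here is a minimal sketch of partial rotary embeddings on a single head vector. It is not the code from this PR: the function name `apply_partial_rotary`, the base of 10000, and the half-split ("rotate half") pairing of dimensions are all assumptions chosen for illustration. Only the numbers 2560, 32 heads, and rotary dim 32 come from the note.

```python
import math


def apply_partial_rotary(x, position, rotary_dim=32, base=10000.0):
    """Rotate the first `rotary_dim` entries of one head vector; copy the rest.

    Hypothetical helper for illustration. Assumes half-split pairing:
    entry i is paired with entry i + rotary_dim // 2.
    """
    if rotary_dim % 2 != 0 or rotary_dim > len(x):
        raise ValueError("rotary_dim must be even and no larger than len(x)")
    half = rotary_dim // 2
    out = list(x)  # entries from rotary_dim onward stay as a plain copy
    for i in range(half):
        # One rotation angle per pair, scaled by the token position.
        theta = position * base ** (-2.0 * i / rotary_dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        a, b = x[i], x[i + half]
        out[i] = a * cos_t - b * sin_t
        out[i + half] = b * cos_t + a * sin_t
    return out


hidden_size, num_heads = 2560, 32
head_dim = hidden_size // num_heads  # 80, larger than the rotary dim of 32

x = [float(i + 1) for i in range(head_dim)]
y = apply_partial_rotary(x, position=3)
```

In this sketch `y[:32]` holds the rotated values and `y[32:]` is identical to `x[32:]`, which is the "plain copy" behaviour described above. A real implementation would do the same thing on batched tensors rather than Python lists.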

NOTE 2: I am trying to build a generic converter, convert_HF.py, for now compatible with Llama, Mistral, and Phi. I hope to include other filters and in the end have only a single converter.

@vince62s vince62s merged commit 8045a86 into OpenNMT:master Jan 11, 2024
2 checks passed