fix: refactor adapter weight loading and mapping #2193
Conversation
force-pushed from a2759fd to 71f9a4c
This PR has been updated to include the ability to load LoRA adapters from a local directory. The path to the LoRA adapter can be specified in the adapter list passed to the launcher; note it's possible to mix local paths and Hub adapter IDs (a hedged sketch follows below).
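A minimal sketch of how such a mixed adapter list might be parsed, assuming a comma-separated `LORA_ADAPTERS` value where each entry is either a Hub repo id or a `name=/local/path` pair. The variable name and the `name=path` form are assumptions for illustration, not taken verbatim from the PR.

```python
def parse_lora_adapters(raw: str) -> list[tuple[str, str]]:
    """Parse a comma-separated adapter spec into (adapter_id, source) pairs.

    Each entry is either a Hub repo id ("org/adapter") or, hypothetically,
    a "name=/local/path" pair pointing at an adapter directory on disk.
    """
    adapters = []
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue
        if "=" in entry:
            name, path = entry.split("=", 1)
            adapters.append((name, path))    # local directory
        else:
            adapters.append((entry, entry))  # Hugging Face Hub id
    return adapters

# Example: mixing a Hub adapter with a local one (names and paths are made up).
print(parse_lora_adapters("org/math-lora,my-adapter=/data/adapters/my-adapter"))
```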
Isn't this super brittle? It only works for llama, no?
It's a bit brittle, but it should work in most cases. Currently this will correctly load weights for llama, mistral and gemma type models (I just added some updates and tests too). The intention of moving this code here is to avoid mapping the weights inside each model's code. Currently this is the best approach I have, but I'm happy to make any changes!
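As a rough illustration of the kind of generic, string-based mapping being discussed (not the PR's actual code), the sketch below derives the layer index and target module from a PEFT-style adapter key; the key pattern and function name are assumptions.

```python
import re

# PEFT-style LoRA keys typically look like:
#   base_model.model.model.layers.12.self_attn.q_proj.lora_A.weight
KEY_RE = re.compile(r"layers\.(\d+)\.(.+?)\.lora_(A|B)\.weight$")

def map_adapter_key(key: str):
    """Return (layer_index, target_module, lora_matrix) for an adapter key,
    or None if the key does not match the expected pattern."""
    match = KEY_RE.search(key)
    if match is None:
        return None
    layer, module, matrix = match.groups()
    return int(layer), module, matrix

print(map_adapter_key(
    "base_model.model.model.layers.12.self_attn.q_proj.lora_A.weight"
))  # -> (12, 'self_attn.q_proj', 'A')
```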
I guess we could have a mapping table here per model type? But probably better for another PR.
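For what such a per-model-type mapping table might look like, here is a hedged sketch; the module names are the conventional Transformers projection names and are illustrative rather than taken from the PR.

```python
# Hypothetical lookup: adapter target-module names per model type, so a
# generic loader does not have to guess from string patterns alone.
ADAPTER_TARGET_MODULES = {
    "llama":   ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    "mistral": ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    "gemma":   ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
}

def supported_targets(model_type: str) -> list[str]:
    # Fall back to the llama layout for unknown model types.
    return ADAPTER_TARGET_MODULES.get(model_type, ADAPTER_TARGET_MODULES["llama"])
```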
force-pushed from f8ecabf to d46372f
Looks like a really nice improvement 🎉. Had two questions.
Does this also apply to LoRA adapters generated with the mistral-finetune repo? The weights in those adapters use the keys found in the consolidated.safetensors file typically shipped in recent Mistral Hugging Face repos (Mistral 7B Instruct v0.3 is the best example), and these differ from the conventional key names. For instance, I tried to use a LoRA adapter generated with mistral-finetune, but it did not work with TGI. It also did not work with vLLM; to get it into a state where it might work with vLLM, I mapped the weight keys to the conventional names, but found lm_head, embed_tokens and layernorm weight keys that vLLM could not handle. Will this PR address these issues as well? Thank you!
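For context on the key-name mismatch described above, here is a hedged sketch of the kind of remapping that would be needed between consolidated.safetensors-style keys (as produced by mistral-finetune) and the Transformers-style names; the exact key names can vary by model version and are assumptions here.

```python
# Hypothetical remapping from consolidated-checkpoint-style keys to
# Transformers-style keys; names are illustrative, not from the PR.
CONSOLIDATED_TO_HF = {
    "attention.wq": "self_attn.q_proj",
    "attention.wk": "self_attn.k_proj",
    "attention.wv": "self_attn.v_proj",
    "attention.wo": "self_attn.o_proj",
    "feed_forward.w1": "mlp.gate_proj",
    "feed_forward.w2": "mlp.down_proj",
    "feed_forward.w3": "mlp.up_proj",
}

def remap_key(key: str) -> str:
    """Rewrite a consolidated-style adapter key into the HF convention."""
    for old, new in CONSOLIDATED_TO_HF.items():
        if old in key:
            return key.replace(old, new)
    return key

print(remap_key("layers.0.attention.wq.lora_A.weight"))
# -> layers.0.self_attn.q_proj.lora_A.weight
```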
force-pushed from eced5b7 to 59022c2
Hi @tensimixt, thank you for your question. If …
Added a comment about _get_model / get_model, maybe the naming can be improved? Looks good to me otherwise.
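Going by the commit titled "fix: improve get_model_with_lora_adapters naming" further down, the public entry point ended up as get_model_with_lora_adapters. A rough sketch of the kind of split being discussed, with illustrative signatures only:

```python
# Illustrative split (not the PR's exact signatures): the private helper
# builds the bare model, and the public wrapper also loads any adapters.
def _get_model(model_id: str, **kwargs):
    ...  # instantiate the base model

def get_model_with_lora_adapters(model_id: str, lora_adapters: list[str], **kwargs):
    model = _get_model(model_id, **kwargs)
    for adapter in lora_adapters:
        ...  # load the adapter weights and attach them to the target layers
    return model
```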
* fix: refactor adapter weight loading and mapping
* feat: enable lora load from directory
* fix: adjust launcher for local lora adapters
* feat: improve weight loading and add tests
* fix: improve logging and rebase syntax issue
* fix: impove adapter merge comments and remove unused conditional
* fix: improve get_model_with_lora_adapters naming
* fix: comment typo
Any update on this PR? Will it be included in the next release build? Thanks.
Hi @ErikKaum, cool and thanks a lot!!
This PR refactors the loading and mapping of LoRA adapters and avoids the need for model-specific changes to support them. The goal is to simplify the loading flow and avoid modifying the modeling code for LoRAs to work.
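As a usage illustration of the end result (not part of the PR itself), a request could then select one of the loaded adapters per call; the endpoint URL and the adapter_id parameter name below follow TGI's multi-LoRA usage but should be treated as assumptions that may differ from your setup.

```python
import requests

# Hypothetical request shape: pick a loaded adapter by id for this generation.
response = requests.post(
    "http://localhost:3000/generate",
    json={
        "inputs": "What is LoRA?",
        "parameters": {"max_new_tokens": 64, "adapter_id": "my-adapter"},
    },
)
print(response.json())
```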