How to use this framework to train an LLM with multiple GPUs? #149
The parallel functionality is currently under development and testing, and is expected to be available by the end of the month.
There seems to be a wrong LoRA dimension in model_llama.py:301, and I got this result: [error output not preserved]. I think there are bugs in the LlamaModel class and the LLMModelArgs class.
The raw linears are created from transformers.LlamaForCausalLM, and we get the in_features and out_features from each Linear.
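A minimal sketch of that scheme, assuming only public transformers APIs; the checkpoint name and the LoRA rank `r` below are illustrative, not the project's actual configuration:

```python
import torch.nn as nn
from transformers import LlamaForCausalLM

# Illustrative checkpoint; loading Llama weights may require HF authentication.
model = LlamaForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

r = 8  # assumed LoRA rank, for illustration only
for name, module in model.named_modules():
    if isinstance(module, nn.Linear):
        # The base Linear weight has shape (out_features, in_features), so the
        # matching LoRA factors are A: (r, in_features) and B: (out_features, r),
        # and W + B @ A preserves the layer's output dimension.
        lora_a = nn.Linear(module.in_features, r, bias=False)
        lora_b = nn.Linear(r, module.out_features, bias=False)
        print(name, module.in_features, module.out_features)
```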
I solved the problem by updating transformers (I found transformers==4.30.2 works). I recommend upgrading the transformers dependency in subsequent updates.
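A quick local guard one could add while the pin is outdated; the idea that transformers 4.30.2 or newer avoids the dimension mismatch is taken from this thread, and the check below simply encodes that observation:

```python
# Fail fast if the installed transformers is older than the version the
# reporter found to work. The 4.30.2 floor is this thread's observation,
# not an official requirement of the project.
from importlib.metadata import version
from packaging.version import Version  # packaging ships with transformers

installed = Version(version("transformers"))
if installed < Version("4.30.2"):
    raise RuntimeError(
        f"transformers {installed} is older than 4.30.2; "
        "upgrade it to avoid the reported LoRA dimension mismatch"
    )
print(f"transformers {installed} OK")
```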
Thanks, we will evaluate this.
Can we close this?
Hi, thank you for the great work. Are there any updates for single-machine multi-GPU fine-tuning? In the README.md I can only find the multi-node settings; how can I fine-tune with multiple GPUs on a single machine? Thanks a lot.
The pipeline parallelism (pp) can run in any setup, such as single-machine multi-GPU or multi-machine multi-GPU; just set MASTER_ADDR=localhost.
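A minimal launcher sketch for the single-machine case, assuming the framework follows standard torch.distributed conventions (MASTER_ADDR, MASTER_PORT, RANK, WORLD_SIZE); the entry-script name `mlora.py` is a placeholder, not a confirmed CLI of this project:

```python
import os
import subprocess

world_size = 2                         # number of GPUs on this machine
base_env = os.environ.copy()
base_env["MASTER_ADDR"] = "localhost"  # single machine: rendezvous locally
base_env["MASTER_PORT"] = "29500"      # any free port

procs = []
for rank in range(world_size):
    env = base_env.copy()
    env["RANK"] = str(rank)
    env["WORLD_SIZE"] = str(world_size)
    env["CUDA_VISIBLE_DEVICES"] = str(rank)  # pin each pipeline stage to one GPU
    # "mlora.py" is a hypothetical entry point for illustration.
    procs.append(subprocess.Popen(["python", "mlora.py"], env=env))

for p in procs:
    p.wait()
```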
Thanks a lot!
Does the framework support multi-GPU training?
I want to use the framework to train a 70B model; however, I did not find the parameter settings or methods for multi-GPU training.