Fine tuning with llama-7b-model failure #256
Comments
See sources that make similar suggestions to use bfloat16:
Besides, the same error can also occur when memory is exceeded with large batch sizes, even with bfloat16. In that case, we can try smaller batch sizes and input sequence lengths.
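For reference, a minimal sketch of loading the checkpoint in bfloat16 with HuggingFace transformers (the model path is a placeholder, and the actual tuner may load the model differently):

```python
import torch
from transformers import AutoModelForCausalLM

# Placeholder path; point this at the local llama-7b checkpoint.
MODEL_PATH = "path/to/llama-7b"

# Load the weights directly in bfloat16 rather than float16/float32;
# bfloat16 keeps float32's exponent range, so it is less prone to overflow.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.bfloat16,
)
```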
@alanbraz can you verify with bfloat16 and let us know if you run into any more issues?
Running in a pod, on the same cluster as the Model Tuner, with the same resources, and still getting the error at the command:
Describe the bug
Reported by Alan Braz.
Failures are seen when fine tuning llama-7b-model with a certain set of parameters:
Platform
Please provide details about the environment you are using, including the following:
Sample Code
Run the examples/run_fine_tuning.py script with any dataset and the above config.
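Since the exact config was not captured above, here is a rough, self-contained HuggingFace sketch of an equivalent fine-tuning run with a small batch size and capped sequence length. This is not the caikit script itself; the dataset, model path, and hyperparameters are placeholder assumptions:

```python
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

MODEL_PATH = "path/to/llama-7b"  # placeholder checkpoint path

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
tokenizer.pad_token = tokenizer.eos_token  # llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, torch_dtype=torch.bfloat16)

# Any small text dataset reproduces the run; wikitext is just a placeholder choice.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
dataset = dataset.filter(lambda ex: ex["text"].strip())  # drop empty lines

def tokenize(batch):
    # Cap the sequence length to keep activation memory bounded.
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama-7b-finetuned",
        per_device_train_batch_size=1,   # small batch to stay within GPU memory
        gradient_accumulation_steps=8,   # recover a larger effective batch size
        bf16=True,                       # bfloat16 mixed-precision training
        max_steps=50,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

If the out-of-memory error persists, lowering max_length further or enabling gradient checkpointing (model.gradient_checkpointing_enable()) are the usual next steps.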
Expected behavior
Fine tuning succeeds
Observed behavior
Additional context
Add any other context about the problem here.