Cannot reproduce the results of Llama7B dora_r32. #14
Comments
This is the log of the training process and the adapter_config:
Did you install all the packages following requirements.txt?
Hi, I did not install them. Will that hurt the performance? The packages I use are listed below:

Package     Version
accelerate  0.25.0

I am trying to use exactly the same packages as in requirements.txt, and will update my results when the fine-tuning and testing finish.
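As a side note for anyone comparing environments, one way to confirm that the installed versions actually match the pins is to reinstall from requirements.txt and dump the resulting versions. This is a generic sketch; the package names in the grep are only examples (accelerate is the only one mentioned above):

```bash
# Install the pinned dependencies shipped with the repo
pip install -r requirements.txt

# Dump the versions that actually got installed so they can be compared
# line by line against requirements.txt (package names below are examples)
pip freeze | grep -i -E "accelerate|torch|transformers|peft"
```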
I used exactly the packages in requirements.txt, and the results are:
which have a 2.1% average accuracy gap from the results reported in the README.
New update: I used exactly the packages in requirements.txt, and the results for r = 8, 16, 32, and 64 still have a gap from the results reported in the README, while the result for r = 4 is better. Average accuracy:
Is this a normal result?
@xiaoshingshing2 I have encountered a similar issue. Did you manage to resolve it? Could you provide your package versions?
In the latest update, I used exactly the packages in requirements.txt with the same versions, and I still have the same problems.
First of all, using the official checkpoint is fine: the result on BoolQ is 69.63, while the official result is 69.7.
However, when I try to reproduce the results myself, I run into two problems.
The first is about Llama-7B dora_r32 without dora_simple. I changed three things in `llama_7B_Dora.sh`: the micro_batch_size from 16 to 4, the learning_rate from 2e-4 to 1e-4, and added `--dora_simple False` to avoid using dora_simple. I ran the command `sh llama_7B_Dora.sh 32 64 ./finetuned_result/dora_r32 0`, and the results are worse than the official results.
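For reference, the changes and launch described above would look roughly like this, assuming the hyperparameters are plain flags inside `llama_7B_Dora.sh` and that the positional arguments are rank, alpha, output directory, and GPU id (the flag spellings are taken from the comment, not checked against the script):

```bash
# Edited lines inside llama_7B_Dora.sh, per the description above (sketch):
#   --micro_batch_size 4     # changed from 16
#   --learning_rate 1e-4     # changed from 2e-4
#   --dora_simple False      # added, disables the dora_simple shortcut

# Launch: rank 32, alpha 64, output directory, GPU 0
sh llama_7B_Dora.sh 32 64 ./finetuned_result/dora_r32 0
```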
The second is that when I remove `--dora_simple False` to accelerate the training with dora_simple, the results are even worse.
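Put differently, the second run only removes that one flag, so the script presumably falls back to its default dora_simple behaviour; everything else stays the same (same assumptions as the sketch above):

```bash
# Same launch, but with the "--dora_simple False" line removed from
# llama_7B_Dora.sh, so training runs with dora_simple left at its default
sh llama_7B_Dora.sh 32 64 ./finetuned_result/dora_r32 0
```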