Gradient clipping occurs on this line:
waifu-diffusion/trainer/diffusers_trainer.py
Line 931 in 27d301c
However, if fp16 is enabled, the clipping is applied to the scaled gradients because of GradScaler.
According to PyTorch documentation (https://pytorch.org/docs/master/notes/amp_examples.html#gradient-clipping), the gradients should be unscaled before clipping.
So this appears to be a bug, and it could make fp16 training perform worse than it otherwise would.
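For reference, here is a minimal sketch of the pattern recommended in PyTorch's AMP gradient-clipping recipe linked above: call `scaler.unscale_(optimizer)` before `clip_grad_norm_`, so the clip threshold is compared against the true gradient norms rather than the scaled ones. The model, optimizer, and `max_grad_norm` here are illustrative stand-ins, not the trainer's actual objects.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the trainer's model/optimizer/data.
model = nn.Linear(16, 1).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()
max_grad_norm = 1.0

for _ in range(10):
    x = torch.randn(8, 16, device="cuda")
    y = torch.randn(8, 1, device="cuda")

    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        loss = nn.functional.mse_loss(model(x), y)

    # Backward on the scaled loss produces scaled gradients.
    scaler.scale(loss).backward()

    # Unscale the gradients in place first, so clipping sees the true
    # (unscaled) gradient norms.
    scaler.unscale_(optimizer)
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)

    # step() does not unscale again since unscale_ was already called;
    # it also skips the optimizer step if any gradient is inf/NaN.
    scaler.step(optimizer)
    scaler.update()
```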