HF Phi3-mini-128k returns very different gradients than reference #1441

Closed
riccardofelluga opened this issue Nov 14, 2024 · 2 comments
Labels: high priority; huggingface (for supporting HF models); thunderfx (for things that could be applicable to the dynamo+thunder frontend)

Comments

@riccardofelluga
Collaborator

riccardofelluga commented Nov 14, 2024

🐛 Bug

When testing HF Phi-3, the gradients differ from the reference by several orders of magnitude:

>       torch.testing.assert_close(grads_ref, grads_compiled, rtol=1e-2, atol=1e-2)
E       AssertionError: Tensor-likes are not close!
E
E       Mismatched elements: 39 / 128 (30.5%)
E       Greatest absolute difference: 0.0810546875 at index (0, 12) (up to 0.01 allowed)
E       Greatest relative difference: 8.125 at index (1, 3) (up to 0.01 allowed)
E
E       The failure occurred for item [0]

thunder/tests/test_networks.py:508: AssertionError
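
For readers unfamiliar with the report format above: `torch.testing.assert_close` treats an element as matching when `|actual - expected| <= atol + rtol * |expected|`, and the relative difference it reports is the absolute difference divided by `|expected|`. The following is a minimal pure-Python sketch of that rule, not the test from this issue. The helper names `is_close` and `mismatch_report` and the sample values are hypothetical, chosen only to show how the greatest absolute and greatest relative difference can come from different elements.

```python
import math


def is_close(actual, expected, rtol=1e-2, atol=1e-2):
    """Closeness rule used by assert_close: |a - e| <= atol + rtol * |e|."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


def relative_diff(actual, expected):
    """Absolute difference scaled by |expected| (inf if expected is 0 and values differ)."""
    diff = abs(actual - expected)
    if expected == 0:
        return 0.0 if diff == 0 else math.inf
    return diff / abs(expected)


def mismatch_report(actual, expected, rtol=1e-2, atol=1e-2):
    """Summarize mismatches between two equally sized flat lists of floats."""
    pairs = list(zip(actual, expected))
    return {
        "mismatched": sum(not is_close(a, e, rtol, atol) for a, e in pairs),
        "total": len(pairs),
        "max_abs_diff": max(abs(a - e) for a, e in pairs),
        "max_rel_diff": max(relative_diff(a, e) for a, e in pairs),
    }


# Hypothetical gradient values, not taken from the failing test.
expected = [1.0, 0.01, -0.5, 2.0]
actual = [1.005, 0.09125, -0.5, 2.5]

report = mismatch_report(actual, expected)
print(report)
```

In this made-up data the largest absolute difference comes from the last element, while the largest relative difference comes from the second one, because a small error on a small expected value gives a large ratio. That is the same pattern as in the report above, where the two maxima are at different indices.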

To Reproduce

Steps to reproduce the behavior:

  1. Check out the test-hf-phi3 branch
  2. Run pytest thunder/tests/test_networks.py -k phi3
  3. Wait for the assertion error

Environment

Container 20241114 with Thunder at test-phi3@4c71eaa4f15028f94910e365ce6c3894769578a5

Additional context

This is part of #1278.

cc @apaz-cli

@riccardofelluga
Collaborator Author

This seems to have been addressed in transformers 4.46.2; the version bump is in #1439.
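
If you want to check whether a local environment already has that release, something like the sketch below could work. It is an assumption-laden illustration, not part of Thunder's test suite: `parse_version`, `is_at_least`, and `installed_transformers_version` are hypothetical helpers, the parser only handles plain `X.Y.Z` strings (a suffix such as `.dev0` would raise `ValueError`), and 4.46.2 is used as the threshold only because the comment above says the problem seems to be addressed there.

```python
from importlib import metadata


def parse_version(text):
    """Parse a plain 'X.Y.Z' release string into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_at_least(installed, minimum="4.46.2"):
    """True if the installed release is the same as or newer than the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_transformers_version():
    """Return the installed transformers version string, or None if it is absent."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


found = installed_transformers_version()
if found is None:
    print("transformers is not installed")
else:
    try:
        print(found, "meets the 4.46.2 threshold:", is_at_least(found))
    except ValueError:
        # Pre-release or dev suffixes are outside what this simple parser handles.
        print("could not parse version string:", found)
```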

@tfogal tfogal added high priority thunderfx for things that could be applicable to the dynamo+thunder frontend huggingface For supporting HF models labels Nov 15, 2024
@riccardofelluga
Collaborator Author

Closing since #1439 has been merged and the test passes now.
