
Issue with converting Whisper model to ONNX #1040

Open
1 of 5 tasks
AvivSham opened this issue Nov 19, 2024 · 1 comment
Labels
bug Something isn't working

Comments


System Info

Created a new env from the following requirements file:

transformers[torch]==4.46.1
onnxruntime==1.19.2
optimum==1.23.3
onnx==1.16.2
onnxconverter-common==1.14.0
tqdm==4.66.5
onnxslim==0.1.36
--extra-index-url https://pypi.ngc.nvidia.com
onnx_graphsurgeon==0.3.27

System info:
Mac M2
Converting on the CPU device

Environment/Platform

  • Website/web-app
  • Browser extension
  • Server-side (e.g., Node.js, Deno, Bun)
  • Desktop app (e.g., Electron)
  • Other (e.g., VSCode extension)

Description

We are attempting to convert whisper-small (the HF model openai/whisper-small) by executing the command specified in the README file:
python -m scripts.convert --quantize --model_id openai/whisper-small

We get the following trace:

TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results.
  elif len(self.key_cache[layer_idx]) == 0:  # fills previously skipped layers; checking for tensor causes errors
Found different candidate ONNX initializers (likely duplicate) for the tied weights:
	model.decoder.embed_tokens.weight: {'model.decoder.embed_tokens.weight'}
	proj_out.weight: {'onnx::MatMul_3259'}
Found different candidate ONNX initializers (likely duplicate) for the tied weights:
	model.decoder.embed_tokens.weight: {'model.decoder.embed_tokens.weight'}
	proj_out.weight: {'onnx::MatMul_2910'}
		-[x] values not close enough, max diff: 0.024361729621887207 (atol: 0.001)
		-[x] values not close enough, max diff: 6.988886833190918 (atol: 0.001)
		-[x] values not close enough, max diff: 5.208465576171875 (atol: 0.001)
		-[x] values not close enough, max diff: 1.9965003728866577 (atol: 0.001)
		-[x] values not close enough, max diff: 1.4132819175720215 (atol: 0.001)
		-[x] values not close enough, max diff: 0.8667690753936768 (atol: 0.001)
		-[x] values not close enough, max diff: 3.7726752758026123 (atol: 0.001)
		-[x] values not close enough, max diff: 2.159898519515991 (atol: 0.001)
		-[x] values not close enough, max diff: 12.425561904907227 (atol: 0.001)
		-[x] values not close enough, max diff: 1.2728543281555176 (atol: 0.001)
		-[x] values not close enough, max diff: 6.912049770355225 (atol: 0.001)
		-[x] values not close enough, max diff: 1.0248034000396729 (atol: 0.001)
		-[x] values not close enough, max diff: 7.5350022315979 (atol: 0.001)
		-[x] values not close enough, max diff: 1.6307682991027832 (atol: 0.001)
		-[x] values not close enough, max diff: 7.0035505294799805 (atol: 0.001)
		-[x] values not close enough, max diff: 0.8978527784347534 (atol: 0.001)
		-[x] values not close enough, max diff: 5.2730207443237305 (atol: 0.001)
		-[x] values not close enough, max diff: 1.0290248394012451 (atol: 0.001)
		-[x] values not close enough, max diff: 5.59857177734375 (atol: 0.001)
		-[x] values not close enough, max diff: 1.0392111539840698 (atol: 0.001)
		-[x] values not close enough, max diff: 4.692121505737305 (atol: 0.001)
		-[x] values not close enough, max diff: 1.080666184425354 (atol: 0.001)
		-[x] values not close enough, max diff: 2.687824249267578 (atol: 0.001)
		-[x] values not close enough, max diff: 1.6337403059005737 (atol: 0.001)
		-[x] values not close enough, max diff: 2.598097801208496 (atol: 0.001)
		-[x] values not close enough, max diff: 1.6576173305511475 (atol: 0.001)
Validation for the model models/openai/whisper-small/encoder_model.onnx raised: The maximum absolute difference between the output of the reference model and the ONNX exported model is not within the set tolerance 0.001:
- last_hidden_state: max diff = 0.024361729621887207
The ONNX export succeeded with the warning: The maximum absolute difference between the output of the reference model and the ONNX exported model is not within the set tolerance 0.001:
- logits: max diff = 6.988886833190918
- present.0.decoder.key: max diff = 5.208465576171875
- present.0.decoder.value: max diff = 1.9965003728866577
- present.1.decoder.key: max diff = 1.4132819175720215
- present.1.decoder.value: max diff = 0.8667690753936768
- present.2.decoder.key: max diff = 3.7726752758026123
- present.2.decoder.value: max diff = 2.159898519515991
- present.3.decoder.key: max diff = 12.425561904907227
- present.3.decoder.value: max diff = 1.2728543281555176
- present.4.decoder.key: max diff = 6.912049770355225
- present.4.decoder.value: max diff = 1.0248034000396729
- present.5.decoder.key: max diff = 7.5350022315979
- present.5.decoder.value: max diff = 1.6307682991027832
- present.6.decoder.key: max diff = 7.0035505294799805
- present.6.decoder.value: max diff = 0.8978527784347534
- present.7.decoder.key: max diff = 5.2730207443237305
- present.7.decoder.value: max diff = 1.0290248394012451
- present.8.decoder.key: max diff = 5.59857177734375
- present.8.decoder.value: max diff = 1.0392111539840698
- present.9.decoder.key: max diff = 4.692121505737305
- present.9.decoder.value: max diff = 1.080666184425354
- present.10.decoder.key: max diff = 2.687824249267578
- present.10.decoder.value: max diff = 1.6337403059005737
- present.11.decoder.key: max diff = 2.598097801208496
- present.11.decoder.value: max diff = 1.6576173305511475.
 The exported model was saved at: models/openai/whisper-small

None of the decoder outputs meets the default tolerance, and for most of them the difference exceeds it by more than three orders of magnitude.
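For context, the validation step that produces the "max diff" lines above boils down to comparing each reference output against the corresponding ONNX output by maximum absolute difference. A minimal sketch of that check (assumed behavior, not the actual optimum source; `within_tolerance` is a hypothetical helper name):

```python
import numpy as np

def within_tolerance(reference, exported, atol=1e-3):
    """Return (ok, max_diff) comparing two output arrays elementwise."""
    max_diff = float(np.max(np.abs(reference - exported)))
    return max_diff <= atol, max_diff

# Synthetic example: a max diff of 0.024 (as reported above for
# last_hidden_state) fails the default atol of 0.001.
ref = np.zeros(4, dtype=np.float32)
exp = np.array([0.0, 0.01, 0.024, 0.0], dtype=np.float32)
ok, diff = within_tolerance(ref, exp)
print(ok)  # False: 0.024 > atol 0.001
```

Diffs in the single digits (e.g. 12.4 for present.3.decoder.key) are far beyond anything attributable to float rounding, which is what suggests a genuine export bug rather than a tolerance that is merely too strict.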
@xenova can you please help with this?

Thanks,

Reproduction

Just run:
python -m scripts.convert --quantize --model_id openai/whisper-small

@AvivSham AvivSham added the bug Something isn't working label Nov 19, 2024
@AvivSham AvivSham changed the title Issue with converting whisper model to ONNX Issue with converting Whisper model to ONNX Nov 19, 2024
Collaborator

xenova commented Nov 21, 2024

Thanks @AvivSham, I am able to reproduce the issue. The same thing happens with other Whisper variants. @echarlaix this looks to be an issue with Optimum, as I can also reproduce it with optimum-cli. 👀 Any idea what's up?
