Describe the issue
Hi team,
I am currently running T5-Small model inference using ONNX Runtime. The model I am using is https://huggingface.co/Xenova/t5-small/tree/main/onnx
I ran the same model on the CPU and DirectML execution providers and observed different outputs for the same input during the decoding stage.
encoder_model.onnx - This model works as expected on both the CPU and DirectML EPs.
decoder_model_merged.onnx - The CPU and DirectML outputs of this model differ beyond the acceptable threshold. It would be really helpful if someone from the ORT team could investigate.
I am attaching some CPU vs. DirectML comparison results for reference:
=== Comparing Encoder Outputs ===
Comparing Encoder outputs:
Shapes: (1, 12, 512) vs (1, 12, 512)
Statistics for first array:
mean: -0.002746098442003131
std: 0.12785771489143372
min: -0.5774061679840088
max: 0.5452761054039001
abs_max: 0.5774061679840088
has_nan: False
has_inf: False
Statistics for second array:
mean: -0.00274610030464828
std: 0.1278577446937561
min: -0.5774062871932983
max: 0.5452762246131897
abs_max: 0.5774062871932983
has_nan: False
has_inf: False
Difference analysis:
Maximum absolute difference: 5.736947059631348e-07
Mean absolute difference: 5.666575475515856e-08
Maximum relative difference: 0.07109003514051437
Position of max difference: (np.int64(0), np.int64(1), np.int64(401))
✅ Differences within acceptable threshold (1e-05)
=== Comparing Decoder Outputs ===
Comparing Decoder logits:
Shapes: (1, 1, 32128) vs (1, 1, 32128)
Statistics for first array:
mean: -19.10366439819336
std: 4.460851669311523
min: -43.21986389160156
max: -1.202622890472412
abs_max: 43.21986389160156
has_nan: False
has_inf: False
Statistics for second array:
mean: -19.10366439819336
std: 4.460851669311523
min: -43.21989059448242
max: -1.2026221752166748
abs_max: 43.21989059448242
has_nan: False
has_inf: False
Difference analysis:
Maximum absolute difference: 5.7220458984375e-05
Mean absolute difference: 7.175476639531553e-06
Maximum relative difference: 2.00232352653984e-06
Position of max difference: (np.int64(0), np.int64(0), np.int64(32113))
❌ Large difference detected! (> 1e-05)
Values at maximum difference point:
Array1: -43.13878631591797
Array2: -43.13884353637695
Surrounding values (if available):
Array1 at [np.int64(0), np.int64(0), np.int64(32112)]: -43.058406829833984
Array2 at [np.int64(0), np.int64(0), np.int64(32112)]: -43.058406829833984
Array1 at [np.int64(0), np.int64(0), np.int64(32114)]: -43.1171760559082
Array2 at [np.int64(0), np.int64(0), np.int64(32114)]: -43.11715316772461
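For reference, the statistics above were produced by a small NumPy comparison helper along the following lines (a reconstruction from the printed fields; the function name and the epsilon value are illustrative, not from a shared script):

```python
import numpy as np

def compare_outputs(a: np.ndarray, b: np.ndarray, threshold: float = 1e-5) -> None:
    # Per-array summary statistics, as printed in the logs above.
    for name, arr in (("first", a), ("second", b)):
        print(f"Statistics for {name} array:")
        print("  mean:", arr.mean(), " std:", arr.std())
        print("  min:", arr.min(), " max:", arr.max(), " abs_max:", np.abs(arr).max())
        print("  has_nan:", bool(np.isnan(arr).any()), " has_inf:", bool(np.isinf(arr).any()))
    # Element-wise differences between the CPU and DirectML outputs.
    abs_diff = np.abs(a - b)
    rel_diff = abs_diff / (np.abs(a) + 1e-12)  # epsilon avoids division by zero
    idx = np.unravel_index(np.argmax(abs_diff), abs_diff.shape)
    print("Maximum absolute difference:", abs_diff.max())
    print("Mean absolute difference:", abs_diff.mean())
    print("Maximum relative difference:", rel_diff.max())
    print("Position of max difference:", idx)
    if abs_diff.max() > threshold:
        print(f"❌ Large difference detected! (> {threshold})")
    else:
        print(f"✅ Differences within acceptable threshold ({threshold})")
```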
To reproduce
Please run the above-mentioned models (encoder, then decoder) on both the CPU and DirectML execution providers and compare the outputs; a minimal sketch is shown below.
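A minimal repro sketch, assuming the ONNX files from the Hugging Face repo above have been downloaded locally. The path and token ids are illustrative, and the input/output names assume the standard Optimum export; verify them with sess.get_inputs() if they differ:

```python
import numpy as np
import onnxruntime as ort

# Hypothetical local path to the encoder from the Hugging Face repo linked above.
ENCODER_PATH = "onnx/encoder_model.onnx"

def encoder_hidden_states(provider: str) -> np.ndarray:
    sess = ort.InferenceSession(ENCODER_PATH, providers=[provider])
    # Arbitrary fixed 12-token input; any deterministic token ids will do.
    input_ids = np.arange(3, 15, dtype=np.int64).reshape(1, 12)
    attention_mask = np.ones_like(input_ids)
    (hidden,) = sess.run(None, {"input_ids": input_ids,
                                "attention_mask": attention_mask})
    return hidden  # shape (1, 12, 512)

cpu_out = encoder_hidden_states("CPUExecutionProvider")
dml_out = encoder_hidden_states("DmlExecutionProvider")
print("max |CPU - DML|:", np.abs(cpu_out - dml_out).max())
# decoder_model_merged.onnx is exercised the same way: feed the encoder hidden
# states to a decoder session per provider and compare the returned logits.
```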
Urgency
I would like to get this resolved by the end of December 2024.
Platform
Windows
OS Version
Windows 11 Enterprise 22631.4169
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.20.0
ONNX Runtime API
Python
Architecture
X64
Execution Provider
DirectML
Execution Provider Library Version
DirectML 1.15.4
5.7220458984375e-05 does not seem like a large difference for a model.
Could you use end-to-end metrics (like precision/recall, etc.) to measure whether it makes any practical difference between CPU and DirectML?
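For instance, since greedy decoding depends only on the argmax over the vocabulary, one quick end-to-end sanity check is whether both providers select the same token ids. A sketch with synthetic logits mimicking the reported drift (not real model outputs):

```python
import numpy as np

def greedy_ids(logits: np.ndarray) -> np.ndarray:
    # Token chosen at each position under greedy decoding.
    return np.argmax(logits, axis=-1)

# Synthetic logits perturbed at the ~1e-4 level, mimicking the
# CPU-vs-DirectML drift reported above (not real model outputs).
rng = np.random.default_rng(0)
cpu_logits = rng.normal(loc=-19.1, scale=4.5, size=(1, 1, 32128)).astype(np.float32)
dml_logits = cpu_logits + rng.normal(scale=1e-4, size=cpu_logits.shape).astype(np.float32)

# True whenever the tiny numeric drift does not flip the selected token.
print(np.array_equal(greedy_ids(cpu_logits), greedy_ids(dml_logits)))
```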