
Error while converting Llama-3.1:8b to ONNX #1000

Open
charlesbvll opened this issue Oct 29, 2024 · 0 comments
Labels: question (Further information is requested)

Comments

@charlesbvll

Question

Hey @xenova,

Thanks a lot for this library! I tried converting meta-llama/Llama-3.1-8B-Instruct to ONNX using the following command (on main):

python -m scripts.convert --quantize --model_id "meta-llama/Llama-3.1-8B-Instruct"

I used the following requirements.txt file (in a fresh env):

transformers[torch]==4.43.4
onnxruntime==1.19.2
optimum==1.21.3
onnx==1.16.2
onnxconverter-common==1.14.0
tqdm==4.66.5
onnxslim==0.1.31
--extra-index-url https://pypi.ngc.nvidia.com
onnx_graphsurgeon==0.3.27

But got the following error:

Framework not specified. Using pt to export the model.
Loading checkpoint shards: 100%|████████| 4/4 [00:27<00:00,  6.99s/it]
Automatic task detection to text-generation-with-past (possible synonyms are: causal-lm-with-past).
Using the export variant default. Available variants are:
    - default: The default ONNX variant.

***** Exporting submodel 1/1: LlamaForCausalLM *****
Using framework PyTorch: 2.5.0
Overriding 1 configuration item(s)
        - use_cache -> True
We detected that you are passing `past_key_values` as a tuple and this is deprecated and will be removed in v4.43. Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/v4.41.3/en/internal/generation_utils#transformers.Cache)
/site-packages/transformers/models/llama/modeling_llama.py:1037: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if sequence_length != 1:
Traceback (most recent call last):
  File "/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "scripts/convert.py", line 462, in <module>
    main()
  File "scripts/convert.py", line 349, in main
    main_export(**export_kwargs)
  File "/site-packages/optimum/exporters/onnx/__main__.py", line 365, in main_export
    onnx_export_from_model(
  File "/site-packages/optimum/exporters/onnx/convert.py", line 1170, in onnx_export_from_model
    _, onnx_outputs = export_models(
  File "/site-packages/optimum/exporters/onnx/convert.py", line 776, in export_models
    export(
  File "/site-packages/optimum/exporters/onnx/convert.py", line 881, in export
    export_output = export_pytorch(
  File "/site-packages/optimum/exporters/onnx/convert.py", line 577, in export_pytorch
    onnx_export(
  File "/site-packages/torch/onnx/__init__.py", line 375, in export
    export(
  File "/site-packages/torch/onnx/utils.py", line 502, in export
    _export(
  File "/site-packages/torch/onnx/utils.py", line 1564, in _export
    graph, params_dict, torch_out = _model_to_graph(
  File "/site-packages/torch/onnx/utils.py", line 1117, in _model_to_graph
    graph = _optimize_graph(
  File "/site-packages/torch/onnx/utils.py", line 663, in _optimize_graph
    _C._jit_pass_onnx_graph_shape_type_inference(
RuntimeError: The serialized model is larger than the 2GiB limit imposed by the protobuf library. Therefore the output file must be a file path, so that the ONNX external data can be written to the same directory. Please specify the output file name.
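
If I read the error right, the serialized graph exceeds protobuf's 2GiB message cap, and ONNX gets around this by writing the weights to a side file as "external data". A minimal sketch of that mechanism, assuming an exported model.onnx already exists on disk (this is just my understanding of the onnx API, not what scripts/convert.py currently does):

```python
import onnx

# Re-save an ONNX model with its weights moved to an external data file,
# which keeps the protobuf message itself under the 2GiB limit.
model = onnx.load("model.onnx")  # hypothetical path; assumes an export succeeded
onnx.save_model(
    model,
    "model_external.onnx",
    save_as_external_data=True,    # move initializers out of the protobuf
    all_tensors_to_one_file=True,  # collect them in a single side file
    location="model_external.onnx.data",
    size_threshold=1024,           # only externalize tensors above 1 KiB
)
```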

I saw the somewhat related issue #967, but my error doesn't come from the ONNX library itself (and I think v3 has been merged by now).

Do you have a fix for larger models such as this one? I also tried with meta-llama/Llama-3.2-3B-Instruct, but I got the same error, even though I see here that you managed to convert it successfully.
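
In case it's useful for narrowing things down, this is roughly how I'd expect a direct call to optimum's exporter to look, with an explicit output directory so the external data can be written next to the model (argument names assumed from optimum 1.21's main_export; untested sketch):

```python
from optimum.exporters.onnx import main_export

# Sketch: export straight through optimum with an explicit output directory,
# so weights over the 2GiB protobuf limit can land as external data files.
main_export(
    model_name_or_path="meta-llama/Llama-3.1-8B-Instruct",
    output="llama-3.1-8b-onnx",        # directory for model.onnx + side files
    task="text-generation-with-past",  # the task auto-detected in the log above
)
```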

Thanks!

@charlesbvll added the question label on Oct 29, 2024