ipex-llm run benchmark error on LNL NPU #12895

Open
Lucas-cai opened this issue Feb 25, 2025 · 0 comments

I used miniforge to create the env, and I updated the NPU driver. I used test_api 'transformers_int4_npu_win' in config.yaml. Here is the log showing the memory access violation.
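For reference, a sketch of the relevant part of my config.yaml. The key names match what run.py reads in the traceback (conf['local_model_hub'], conf['warm_up'], conf['num_trials'], conf['num_beams'], plus in_out_pairs, low_bit, batch_size, and test_api); the model id, path, and values shown here are illustrative placeholders, not the exact ones I used:

```yaml
# Illustrative config.yaml for the all-in-one benchmark.
# Only test_api is the setting described above; other values are placeholders.
repo_id:
  - 'org/model-name'                      # placeholder model id
local_model_hub: 'C:\Users\intel\model'   # placeholder local path
warm_up: 1
num_trials: 3
num_beams: 1
low_bit: 'sym_int4'
batch_size: 1
in_out_pairs:
  - '32-32'
test_api:
  - 'transformers_int4_npu_win'
```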

(npu) C:\Users\intel\model\ipex-llm-main\python\llm\dev\benchmark\all-in-one>python run.py
C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\transformers\deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:12<00:00, 6.06s/it]
2025-02-25 15:49:32,322 - INFO - Converting model, it may takes up to several minutes ...
C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\torch\nn\init.py:412: UserWarning: Initializing zero-element tensors is a no-op
warnings.warn("Initializing zero-element tensors is a no-op")
2025-02-25 15:50:00,754 - INFO - Finish to convert model
decode start compiling
decode end compiling
Model saved to ./save_converted_model_dir\decoder_layer_0.xml
decode start compiling
decode end compiling
Model saved to ./save_converted_model_dir\decoder_layer_1.xml
prefill start compiling
prefill end compiling
Model saved to ./save_converted_model_dir\decoder_layer_prefill.xml
start compiling
Model saved to ./save_converted_model_dir\lm_head.xml
start compiling
C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\ipex_llm\transformers\npu_model.py:49: UserWarning: Model is already saved at ./save_converted_model_dir
warnings.warn(f"Model is already saved at {self.save_directory}")
2025-02-25 15:52:03,955 - INFO - Converted model has already saved to ./save_converted_model_dir.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.

loading of model costs 164.65547030000005s
model generate cost: 3.3208497000000534
<|begin▁of▁sentence|><|begin▁of▁sentence|>461 U.S. 238 (1983) OLIM ET AL. v. WAKINEKONA No. 201, 201, 202, 203, 204, 205, 2
model generate cost: 2.9965502000000015
<|begin▁of▁sentence|><|begin▁of▁sentence|>461 U.S. 238 (1983) OLIM ET AL. v. WAKINEKONA No. 201, 201, 202, 203, 204, 205, 2
model generate cost: 3.0089657999999417
<|begin▁of▁sentence|><|begin▁of▁sentence|>461 U.S. 238 (1983) OLIM ET AL. v. WAKINEKONA No. 201, 201, 202, 203, 204, 205, 2
model generate cost: 2.998917000000006
<|begin▁of▁sentence|><|begin▁of▁sentence|>461 U.S. 238 (1983) OLIM ET AL. v. WAKINEKONA No. 201, 201, 202, 203, 204, 205, 2
Traceback (most recent call last):
File "C:\Users\intel\model\ipex-llm-main\python\llm\dev\benchmark\all-in-one\run.py", line 2338, in <module>
run_model(model, api, in_out_pairs, conf['local_model_hub'], conf['warm_up'], conf['num_trials'], conf['num_beams'],
File "C:\Users\intel\model\ipex-llm-main\python\llm\dev\benchmark\all-in-one\run.py", line 197, in run_model
result = transformers_int4_npu_win(repo_id, local_model_hub, in_out_pairs, warm_up, num_trials, num_beams, low_bit, batch_size, optimize_model, transpose_value_cache, group_size)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\intel\model\ipex-llm-main\python\llm\dev\benchmark\all-in-one\run.py", line 673, in transformers_int4_npu_win
output_ids = model.generate(input_ids, do_sample=False, max_new_tokens=out_len,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\ipex_llm\transformers\npu_models\convert.py", line 338, in generate
return simple_generate(self, inputs=inputs, streamer=streamer, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\ipex_llm\transformers\npu_models\convert.py", line 404, in simple_generate
token = run_prefill(self.model_ptr, input_list, self.vocab_size,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\ipex_llm\transformers\npu_models\npu_llm_cpp.py", line 82, in run_prefill
plogits = _lib.run_prefill(model_ptr, input_ptr, input_len, repetition_penalty, False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: exception: access violation writing 0x000001F559302000
Exception ignored in: <function BaseNPUBackendWithPrefetch.__del__ at 0x000001F09D548360>
Traceback (most recent call last):
File "C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\intel_npu_acceleration_library\backend\base.py", line 245, in __del__
super(BaseNPUBackendWithPrefetch, self).__del__()
File "C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\intel_npu_acceleration_library\backend\base.py", line 54, in __del__
backend_lib.destroyNNFactory(self._mm)
OSError: exception: access violation reading 0xFFFFFFFFFFFFFFFF
Exception ignored in: <function BaseNPUBackendWithPrefetch.__del__ at 0x000001F09D548360>
Traceback (most recent call last):
File "C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\intel_npu_acceleration_library\backend\base.py", line 245, in __del__
super(BaseNPUBackendWithPrefetch, self).__del__()
File "C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\intel_npu_acceleration_library\backend\base.py", line 54, in __del__
backend_lib.destroyNNFactory(self._mm)
OSError: exception: access violation reading 0xFFFFFFFFFFFFFFFF
