I used miniforge to create the environment, and I updated the NPU driver. I set test_api to 'transformers_int4_npu_win' in config.yaml (a sketch of my settings is below). Here is the log showing the memory access violation.
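For reference, a sketch of the config.yaml I ran. The repo id and hub path are placeholders, and I am inferring the key names from the options that run.py reads in the traceback below (local_model_hub, warm_up, num_trials, num_beams, and the low_bit/batch_size/optimize_model/transpose_value_cache/group_size arguments passed to transformers_int4_npu_win); my exact values may differ:

```yaml
# Sketch only: repo_id and local_model_hub are placeholders, not my exact values.
repo_id:
  - '<model_repo_id>'            # the model under test
local_model_hub: 'C:\Users\intel\model'
warm_up: 1
num_trials: 3
num_beams: 1
low_bit: 'sym_int4'
batch_size: 1
in_out_pairs:
  - '32-32'
test_api:
  - 'transformers_int4_npu_win'
optimize_model: True
transpose_value_cache: True
group_size: 0
```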
(npu) C:\Users\intel\model\ipex-llm-main\python\llm\dev\benchmark\all-in-one>python run.py
C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\transformers\deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:12<00:00, 6.06s/it]
2025-02-25 15:49:32,322 - INFO - Converting model, it may takes up to several minutes ...
C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\torch\nn\init.py:412: UserWarning: Initializing zero-element tensors is a no-op
warnings.warn("Initializing zero-element tensors is a no-op")
2025-02-25 15:50:00,754 - INFO - Finish to convert model
decode start compiling
decode end compiling
Model saved to ./save_converted_model_dir\decoder_layer_0.xml
decode start compiling
decode end compiling
Model saved to ./save_converted_model_dir\decoder_layer_1.xml
prefill start compiling
prefill end compiling
Model saved to ./save_converted_model_dir\decoder_layer_prefill.xml
start compiling
Model saved to ./save_converted_model_dir\lm_head.xml
start compiling
C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\ipex_llm\transformers\npu_model.py:49: UserWarning: Model is already saved at ./save_converted_model_dir
warnings.warn(f"Model is already saved at {self.save_directory}")
2025-02-25 15:52:03,955 - INFO - Converted model has already saved to ./save_converted_model_dir.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
loading of model costs 164.65547030000005s
model generate cost: 3.3208497000000534
<|begin▁of▁sentence|><|begin▁of▁sentence|>461 U.S. 238 (1983) OLIM ET AL. v. WAKINEKONA No. 201, 201, 202, 203, 204, 205, 2
model generate cost: 2.9965502000000015
<|begin▁of▁sentence|><|begin▁of▁sentence|>461 U.S. 238 (1983) OLIM ET AL. v. WAKINEKONA No. 201, 201, 202, 203, 204, 205, 2
model generate cost: 3.0089657999999417
<|begin▁of▁sentence|><|begin▁of▁sentence|>461 U.S. 238 (1983) OLIM ET AL. v. WAKINEKONA No. 201, 201, 202, 203, 204, 205, 2
model generate cost: 2.998917000000006
<|begin▁of▁sentence|><|begin▁of▁sentence|>461 U.S. 238 (1983) OLIM ET AL. v. WAKINEKONA No. 201, 201, 202, 203, 204, 205, 2
Traceback (most recent call last):
File "C:\Users\intel\model\ipex-llm-main\python\llm\dev\benchmark\all-in-one\run.py", line 2338, in
run_model(model, api, in_out_pairs, conf['local_model_hub'], conf['warm_up'], conf['num_trials'], conf['num_beams'],
File "C:\Users\intel\model\ipex-llm-main\python\llm\dev\benchmark\all-in-one\run.py", line 197, in run_model
result = transformers_int4_npu_win(repo_id, local_model_hub, in_out_pairs, warm_up, num_trials, num_beams, low_bit, batch_size, optimize_model, transpose_value_cache, group_size)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\intel\model\ipex-llm-main\python\llm\dev\benchmark\all-in-one\run.py", line 673, in transformers_int4_npu_win
output_ids = model.generate(input_ids, do_sample=False, max_new_tokens=out_len,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\ipex_llm\transformers\npu_models\convert.py", line 338, in generate
return simple_generate(self, inputs=inputs, streamer=streamer, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\ipex_llm\transformers\npu_models\convert.py", line 404, in simple_generate
token = run_prefill(self.model_ptr, input_list, self.vocab_size,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\ipex_llm\transformers\npu_models\npu_llm_cpp.py", line 82, in run_prefill
plogits = _lib.run_prefill(model_ptr, input_ptr, input_len, repetition_penalty, False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: exception: access violation writing 0x000001F559302000
Exception ignored in: <function BaseNPUBackendWithPrefetch.__del__ at 0x000001F09D548360>
Traceback (most recent call last):
File "C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\intel_npu_acceleration_library\backend\base.py", line 245, in __del__
super(BaseNPUBackendWithPrefetch, self).__del__()
File "C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\intel_npu_acceleration_library\backend\base.py", line 54, in __del__
backend_lib.destroyNNFactory(self._mm)
OSError: exception: access violation reading 0xFFFFFFFFFFFFFFFF
Exception ignored in: <function BaseNPUBackendWithPrefetch.__del__ at 0x000001F09D548360>
Traceback (most recent call last):
File "C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\intel_npu_acceleration_library\backend\base.py", line 245, in __del__
super(BaseNPUBackendWithPrefetch, self).__del__()
File "C:\Users\intel\miniforge3\envs\npu\Lib\site-packages\intel_npu_acceleration_library\backend\base.py", line 54, in __del__
backend_lib.destroyNNFactory(self._mm)
OSError: exception: access violation reading 0xFFFFFFFFFFFFFFFF
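For what it's worth, here is a minimal standalone sketch that exercises the same load-and-generate path outside the benchmark. The model path and length limits are illustrative, and the from_pretrained arguments are my assumption of what the 'transformers_int4_npu_win' test passes, based on the sym_int4 setting and the save_directory visible in the log:

```python
# Standalone NPU repro sketch. Paths and sizes are illustrative; the
# argument set mirrors what I believe the benchmark passes for this test_api.
from transformers import AutoTokenizer
from ipex_llm.transformers.npu_model import AutoModelForCausalLM

model_path = r"C:\Users\intel\model\<model_dir>"  # placeholder local path

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    load_in_low_bit="sym_int4",                    # low_bit from config.yaml
    optimize_model=True,
    max_context_len=1024,                          # illustrative limits
    max_prompt_len=512,
    save_directory="./save_converted_model_dir",   # same dir as in the log
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

input_ids = tokenizer("Once upon a time", return_tensors="pt").input_ids
# generate() is where the log shows the crash (run_prefill -> access violation)
output_ids = model.generate(input_ids, do_sample=False, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=False))
```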