This guide demonstrates how to install and use IPEX-LLM on the Intel Arc B-Series GPU (such as B580).
Note
Ensure your GPU driver and software environment meet the prerequisites before proceeding.
- Linux
  1.1 Install Prerequisites
  1.2 Install IPEX-LLM (for PyTorch and HuggingFace)
  1.3 Install IPEX-LLM (for llama.cpp and Ollama)
- Windows
  2.1 Install Prerequisites
  2.2 Install IPEX-LLM (for PyTorch and HuggingFace)
  2.3 Install IPEX-LLM (for llama.cpp and Ollama)
- Use Cases
  3.1 PyTorch
  3.2 Ollama
  3.3 llama.cpp
  3.4 vLLM
- Troubleshooting
  4.1 RuntimeError: could not create an engine
Linux
1.1 Install Prerequisites
Note
Ensure that Resizable BAR is enabled in your system's BIOS before proceeding. This is essential for optimal GPU performance and helps avoid issues such as Bus error (core dumped). For detailed steps, please refer to the official guidance here.
We recommend Ubuntu 24.10 with kernel version 6.11 or later. According to the official documentation here, support for Battlemage was backported from kernel version 6.12 to version 6.11, which ships with Ubuntu 24.10. Because this version of Ubuntu does not include the latest compute and media packages, we offer the intel-graphics Personal Package Archive (PPA). The PPA provides early access to newer packages, along with additional tools and features such as EU debugging.
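Before installing anything, it can help to confirm that the running kernel meets the 6.11 recommendation. The following is a minimal sketch, not part of IPEX-LLM: the helper names `parse_kernel_release` and `kernel_is_recent_enough` are hypothetical, and the check only compares the major and minor numbers of the release string.

```python
import platform
import re

MIN_KERNEL = (6, 11)  # minimum kernel version recommended in this guide


def parse_kernel_release(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '6.11.0-13-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_is_recent_enough(release: str, minimum: tuple[int, int] = MIN_KERNEL) -> bool:
    """Return True when the kernel's (major, minor) is at or above the minimum."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    release = platform.release()  # on Linux this matches the output of `uname -r`
    try:
        ok = kernel_is_recent_enough(release)
        print(f"kernel {release}: {'OK' if ok else 'older than recommended'}")
    except ValueError as err:
        print(f"could not check kernel version: {err}")
```

Comparing tuples of integers rather than raw strings keeps the check correct once the minor version reaches two digits.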
Use the following commands to install the intel-graphics PPA and the necessary compute and media packages:
sudo apt-get update
sudo apt-get install -y software-properties-common
sudo add-apt-repository -y ppa:kobuk-team/intel-graphics
sudo apt-get install -y libze-intel-gpu1 libze1 intel-ocloc intel-opencl-icd clinfo intel-gsc intel-media-va-driver-non-free libmfx1 libmfx-gen1 libvpl2 libvpl-tools libva-glx2 va-driver-all vainfo
sudo apt-get install -y intel-level-zero-gpu-raytracing # Optional: Hardware ray tracing support
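The packages above include the `clinfo` and `vainfo` command-line tools, so one quick sanity check after installation is to confirm that both are on your `PATH`. This is a minimal sketch; the helper name `find_tools` is hypothetical, and finding a tool does not by itself prove that the GPU is detected.

```python
import shutil

# Command-line tools installed by the clinfo and vainfo packages above.
EXPECTED_TOOLS = ("clinfo", "vainfo")


def find_tools(names=EXPECTED_TOOLS) -> dict:
    """Map each tool name to its full path on PATH, or None if it is missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If both tools are found, running them directly shows whether the OpenCL and VA-API drivers can see the GPU.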
Download and install Miniforge:
wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
bash Miniforge3-Linux-x86_64.sh
source ~/.bashrc
Create and activate a Python environment:
conda create -n llm python=3.11
conda activate llm
With the llm environment active, install the appropriate ipex-llm package based on your use case:
1.2 Install IPEX-LLM (for PyTorch and HuggingFace)
Install the ipex-llm[xpu-arc] package. Choose either the US or CN website for extra-index-url:
- For US:
pip install --pre --upgrade ipex-llm[xpu-arc] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
- For CN:
pip install --pre --upgrade ipex-llm[xpu-arc] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
1.3 Install IPEX-LLM (for llama.cpp and Ollama)
Install the ipex-llm[cpp] package.
pip install --pre --upgrade ipex-llm[cpp]
Note
If you encounter network issues during installation, refer to the troubleshooting guide for alternative steps.
Windows
2.1 Install Prerequisites
If your driver version is lower than 32.0.101.6449/32.0.101.6256, update it from the Intel download page. After installation, reboot the system.
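Driver versions should be compared component by component, not as plain strings. The sketch below illustrates this under stated assumptions: the helper names are hypothetical, the installed version is typed in by hand (for example, as read from Device Manager), and `32.0.101.6449` is used as the minimum on the assumption that the first version listed above is the one that applies to your GPU.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn '32.0.101.6449' into (32, 0, 101, 6449) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_needs_update(installed: str, minimum: str) -> bool:
    """Return True when the installed version is lower than the minimum."""
    return parse_version(installed) < parse_version(minimum)


if __name__ == "__main__":
    # Hypothetical installed version, entered manually for illustration.
    print(driver_needs_update("32.0.101.6130", "32.0.101.6449"))  # True
```

A plain string comparison would rank a build number such as `10000` below `6449`, which is why each component is converted to an integer first.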
Download and install Miniforge for Windows from the official page. After installation, create and activate a Python environment:
conda create -n llm python=3.11 libuv
conda activate llm
With the llm environment active, install the appropriate ipex-llm package based on your use case:
2.2 Install IPEX-LLM (for PyTorch and HuggingFace)
Install the ipex-llm[xpu-arc] package. Choose either the US or CN website for extra-index-url:
- For US:
pip install --pre --upgrade ipex-llm[xpu-arc] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
- For CN:
pip install --pre --upgrade ipex-llm[xpu-arc] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
2.3 Install IPEX-LLM (for llama.cpp and Ollama)
Install the ipex-llm[cpp] package.
pip install --pre --upgrade ipex-llm[cpp]
Note
If you encounter network issues while installing IPEX, refer to this guide for troubleshooting advice.
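After either install command finishes, you may want to confirm that the package actually landed in the active llm environment. The sketch below uses only the Python standard library; the helper name `installed_version` is hypothetical, and it assumes the distribution is registered under the name `ipex-llm`, as in the pip commands above.

```python
from importlib import metadata


def installed_version(distribution: str = "ipex-llm"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(f"ipex-llm: {version or 'not installed in this environment'}")
```

Run it with the llm environment activated; a result of "not installed" usually means pip ran against a different Python interpreter.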
Use Cases
3.1 PyTorch
Run a quick PyTorch example:
- Activate the environment:
conda activate llm  # On Windows, run this in cmd
- Run the code:
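import torch
from ipex_llm.transformers import AutoModelForCausalLM  # not used below; confirms that ipex-llm imports cleanly

# Multiply two random tensors on the GPU (the 'xpu' device)
tensor_1 = torch.randn(1, 1, 40, 128).to('xpu')
tensor_2 = torch.randn(1, 1, 128, 40).to('xpu')
print(torch.matmul(tensor_1, tensor_2).size())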
- Expected Output:
torch.Size([1, 1, 40, 40])
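If the example fails at `.to('xpu')`, a more defensive check can tell you whether PyTorch sees the GPU at all. The following is a sketch under assumptions: the helper name `describe_xpu` is hypothetical, and it assumes a PyTorch build that exposes the `torch.xpu` module. It first tries to import `ipex_llm.transformers`, mirroring the example above, and falls back gracefully if that is unavailable.

```python
def describe_xpu() -> str:
    """Report whether PyTorch can see an XPU device, without raising."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"

    try:
        import ipex_llm.transformers  # noqa: F401  (same import as the quick example)
    except ImportError:
        pass  # continue with plain PyTorch if ipex-llm is missing

    xpu = getattr(torch, "xpu", None)  # not present in every PyTorch build
    if xpu is None or not xpu.is_available():
        return "PyTorch is installed, but no XPU device is available"
    return f"XPU available: {xpu.device_count()} device(s), first is {xpu.get_device_name(0)}"


if __name__ == "__main__":
    print(describe_xpu())
```

Returning a message instead of raising makes the script safe to run on machines where the driver or package setup is still incomplete.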
For benchmarks and performance measurement, refer to the Benchmark Quickstart guide.
3.2 Ollama
To integrate and run with Ollama, follow the Ollama Quickstart guide.
3.3 llama.cpp
For instructions on how to run llama.cpp with IPEX-LLM, refer to the llama.cpp Quickstart guide.
3.4 vLLM
To set up and run vLLM, follow the vLLM Quickstart guide.
Troubleshooting
4.1 RuntimeError: could not create an engine
If you encounter a RuntimeError reporting that it could not create an engine on Linux after running conda deactivate and then reactivating your environment with conda activate env, the issue is likely caused by the OCL_ICD_VENDORS environment variable.
To fix this on Linux, run the following command:
unset OCL_ICD_VENDORS
This will remove the conflicting environment variable and allow your program to function correctly.
Note: This issue only occurs on Linux systems. It does not affect Windows environments.
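When the workload is launched from a Python script and not an interactive shell, the same fix can be approximated inside the script. This is a minimal sketch, not an official IPEX-LLM mechanism: the helper name `clear_ocl_icd_vendors` is hypothetical, and it rests on the assumption that the variable is only read when the GPU runtime is first loaded, so it must be cleared before importing torch or ipex_llm.

```python
import os


def clear_ocl_icd_vendors(environ=os.environ):
    """Remove OCL_ICD_VENDORS from the mapping and return its previous value (or None)."""
    return environ.pop("OCL_ICD_VENDORS", None)


if __name__ == "__main__":
    previous = clear_ocl_icd_vendors()
    print(f"OCL_ICD_VENDORS was: {previous!r}")
    # Import torch / ipex_llm only after this point.
```

The change affects only the current Python process and any child processes it starts; the shell that launched the script keeps its original environment.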