
[Installation]: vllm on NVIDIA jetson AGX orin #5640

Open
phgcha opened this issue Jun 18, 2024 · 16 comments
Labels
installation Installation problems

Comments

@phgcha

phgcha commented Jun 18, 2024

Your current environment

root@jetson:/workspace# python collect_env.py
  File "collect_env.py", line 724
    print(msg, file=sys.stderr)
                   ^
SyntaxError: invalid syntax
root@jetson:/workspace# python3 collect_env.py
Collecting environment information...
PyTorch version: 2.0.0a0+ec3941ad.nv23.02
Is debug build: False
CUDA used to build PyTorch: 11.4
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.5 LTS (aarch64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: version 3.25.2
Libc version: glibc-2.31

Python version: 3.8.10 (default, Nov 14 2022, 12:59:47)  [GCC 9.4.0] (64-bit runtime)
Python platform: Linux-5.10.120-tegra-aarch64-with-glibc2.29
Is CUDA available: True
CUDA runtime version: 11.4.315
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Probably one of the following:
/usr/lib/aarch64-linux-gnu/libcudnn.so.8.6.0
/usr/lib/aarch64-linux-gnu/libcudnn_adv_infer.so.8.6.0
/usr/lib/aarch64-linux-gnu/libcudnn_adv_train.so.8.6.0
/usr/lib/aarch64-linux-gnu/libcudnn_cnn_infer.so.8.6.0
/usr/lib/aarch64-linux-gnu/libcudnn_cnn_train.so.8.6.0
/usr/lib/aarch64-linux-gnu/libcudnn_ops_infer.so.8.6.0
/usr/lib/aarch64-linux-gnu/libcudnn_ops_train.so.8.6.0
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: False

CPU:
Architecture:                    aarch64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
CPU(s):                          12
On-line CPU(s) list:             0-7
Off-line CPU(s) list:            8-11
Thread(s) per core:              1
Core(s) per socket:              4
Socket(s):                       2
Vendor ID:                       ARM
Model:                           1
Model name:                      ARMv8 Processor rev 1 (v8l)
Stepping:                        r0p1
CPU max MHz:                     2201.6001
CPU min MHz:                     115.2000
BogoMIPS:                        62.50
L1d cache:                       512 KiB
L1i cache:                       512 KiB
L2 cache:                        2 MiB
L3 cache:                        4 MiB
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; CSV2, but not BHB
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp uscat ilrcpc flagm

Versions of relevant libraries:
[pip3] numpy==1.17.4
[pip3] torch==2.0.0a0+ec3941ad.nv23.2
[pip3] torchaudio==0.13.1+b90d798
[pip3] torchvision==0.14.1a0+5e8e2f1
[conda] Could not collect
ROCM Version: Could not collect
Neuron SDK Version: N/A
vLLM Version: N/A
vLLM Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
Could not collect

How you are installing vllm

pip install vllm

Hi,

I'm trying to install vLLM on my Jetson AGX Orin developer kit.

I'm using the following image: nvcr.io/nvidia/l4t-pytorch:r35.2.1-pth2.0-py3, and I get this error when I run pip install vllm:

root@jetson:/workspace# pip install vllm
Collecting vllm
  Downloading vllm-0.5.0.post1.tar.gz (743 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 743.2/743.2 kB 12.6 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [16 lines of output]
      Traceback (most recent call last):
        File "/usr/local/lib/python3.8/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/usr/local/lib/python3.8/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/usr/local/lib/python3.8/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
        File "/tmp/pip-build-env-d1bct981/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
        File "/tmp/pip-build-env-d1bct981/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-d1bct981/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 311, in run_setup
          exec(code, locals())
        File "<string>", line 415, in <module>
        File "<string>", line 341, in get_vllm_version
      RuntimeError: Unknown runtime environment
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

Note the error message Unknown runtime environment.
I figured out that this is thrown here https://github.com/vllm-project/vllm/blob/main/setup.py#L347 because torch.version.cuda is None.
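
For reference, the check there is essentially a dispatch on how the torch visible at build time was compiled. A paraphrased, runnable sketch of that logic (not the exact setup.py source):

import torch

def detect_runtime_environment() -> str:
    # setup.py probes the installed torch build (CUDA, ROCm, Neuron, CPU, ...)
    # and raises if nothing matches; only the first two probes are shown here.
    if torch.version.cuda is not None:
        return "cuda"
    if getattr(torch.version, "hip", None) is not None:
        return "rocm"
    raise RuntimeError("Unknown runtime environment")

print(detect_runtime_environment())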

However, when I open a python3 prompt and check CUDA availability,

root@jetson:/workspace# python3
Python 3.8.10 (default, Nov 14 2022, 12:59:47)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.version.cuda)
11.4
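
One hypothesis: pip builds vLLM in an isolated build environment with its own freshly downloaded torch, and on aarch64 that wheel is CPU-only, so torch.version.cuda would be None during the build even though my system torch reports 11.4. If that is the cause, disabling build isolation (after installing the build dependencies up front) should let setup.py see the NVIDIA-built torch, roughly:

pip install -U pip setuptools wheel packaging ninja cmake
pip install vllm --no-build-isolation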

Any help would be appreciated. Thanks

@phgcha phgcha added the installation Installation problems label Jun 18, 2024
@youkaichao
Member

I suggest you contact your support team. You are using a custom-built PyTorch, about which we can answer only very limited questions.

@phgcha
Author

phgcha commented Jun 18, 2024

Thanks, will do. But does anyone have a rough idea of what might have caused this?

@Shatatel

Shatatel commented Jun 26, 2024

Same issue here. It appears right after a fresh flash of the Jetson AGX.

Here is some software info:

jetson_release 

Model: NVIDIA Jetson AGX Orin Developer Kit - Jetpack 6.0 [L4T 36.3.0]
NV Power Mode[0]: MAXN
Hardware:
 - Module: NVIDIA Jetson AGX Orin
Platform:
 - Distribution: Ubuntu 22.04 Jammy Jellyfish
 - Release: 5.15.136-tegra
jtop:
 - Version: 4.2.8
 - Service: Active
Libraries:
 - CUDA: 12.2.140
 - cuDNN: 8.9.4.25
 - TensorRT: 8.6.2.3
 - VPI: 3.1.5
 - Vulkan: 1.3.204
 - OpenCV: 4.10.0-dev - with CUDA: YES


Python 3.10.12 [GCC 11.4.0] on linux
>>> import torch
>>> print(torch.version.cuda)
12.2

@Irrerwirrer

Same problem here. Using print(torch.__version__), I found that the current vLLM tries to use PyTorch 2.3.0 in setup.py. However, my currently installed version is 2.1.0a0+41361538.nv23.06, which is the latest supported by my JetPack 5.1.2. According to the NVIDIA documentation, PyTorch 2.3.0 is not supported on my setup.

To work around this, I attempted to install vllm-0.2.4, which requires only PyTorch 2.1.0. However, I encountered an OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root. error, even though I have already added the correct CUDA_HOME to my .bashrc.
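
One thing worth double-checking for anyone who hits the CUDA_HOME error: ~/.bashrc is only sourced by interactive shells, so an export there may never reach the build process (e.g. under sudo or from a service). Setting it explicitly in the same shell rules that out; the path below is the JetPack default and may differ on your system:

export CUDA_HOME=/usr/local/cuda
export PATH="$CUDA_HOME/bin:$PATH"
pip install vllm==0.2.4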

@Acc1143

Acc1143 commented Jul 1, 2024

How do I solve this?

@KungFuPandaPro

> Same problem here. Using print(torch.__version__), I found that the current vLLM tries to use PyTorch 2.3.0 in setup.py. However, my currently installed version is 2.1.0a0+41361538.nv23.06, which is the latest supported by my JetPack 5.1.2. According to the NVIDIA documentation, PyTorch 2.3.0 is not supported on my setup.
>
> To work around this, I attempted to install vllm-0.2.4, which requires only PyTorch 2.1.0. However, I encountered an OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root. error, even though I have already added the correct CUDA_HOME to my .bashrc.

How do I install vLLM on Jetson?

@walker-ai

> Same problem here. Using print(torch.__version__), I found that the current vLLM tries to use PyTorch 2.3.0 in setup.py. However, my currently installed version is 2.1.0a0+41361538.nv23.06, which is the latest supported by my JetPack 5.1.2. According to the NVIDIA documentation, PyTorch 2.3.0 is not supported on my setup.
> To work around this, I attempted to install vllm-0.2.4, which requires only PyTorch 2.1.0. However, I encountered an OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root. error, even though I have already added the correct CUDA_HOME to my .bashrc.
>
> How do I install vLLM on Jetson?

Did you resolve that?

@walker-ai

@youkaichao Hey, I've hit the same problem. Could you tell me whether it's feasible to run vLLM on a Jetson AGX Orin (aarch64) if I only call some classes from it (configuration classes and operators)? If so, how should I install it?

@Irrerwirrer

@KungFuPandaPro @walker-ai
I have managed to install it properly.
Take a look at this thread/issue
#6063

@walker-ai

> @KungFuPandaPro @walker-ai I have managed to install it properly. Take a look at this thread/issue: #6063

Thank you for your time. I'm trying to install it by the method you mentioned there, but I encountered some errors. Could you show the environment you were using (JetPack, PyTorch, vLLM, etc.)? I'm afraid it's a version inconsistency.

@Irrerwirrer

@walker-ai Could you provide further details of the error? My project is already completed and I no longer have access to the environment, but maybe your error is one of the ones I also encountered along the way.

@walker-ai

walker-ai commented Oct 25, 2024

> @walker-ai Could you provide further details of the error? My project is already completed and I no longer have access to the environment, but maybe your error is one of the ones I also encountered along the way.

When I type:

# cd /workspace/vllm/
python setup.py develop

it shows:

Error: could not find CMAKE_PROJECT_NAME in Cache
Traceback (most recent call last):
  File "setup.py", line 486, in <module>
    setup(
  File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/__init__.py", line 117, in setup
    return distutils.core.setup(**attrs)
  File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 183, in setup
    return run_commands(dist)
  File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 199, in run_commands
    dist.run_commands()
  File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 954, in run_commands
    self.run_command(cmd)
  File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/dist.py", line 950, in run_command
    super().run_command(command)
  File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
    cmd_obj.run()
  File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/command/develop.py", line 35, in run
    self.install_for_development()
  File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/command/develop.py", line 112, in install_for_development
    self.run_command('build_ext')
  File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
    self.distribution.run_command(command)
  File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/dist.py", line 950, in run_command
    super().run_command(command)
  File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
    cmd_obj.run()
  File "setup.py", line 243, in run
    super().run()
  File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 98, in run
    _build_ext.run(self)
  File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
    self.build_extensions()
  File "setup.py", line 217, in build_extensions
    subprocess.check_call(["cmake", *build_args], cwd=self.build_temp)
  File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '--build', '.', '-j=8', '--target=_moe_C', '--target=vllm_flash_attn_c', '--target=_C']' returned non-zero exit status 1.

Then I try to dig into the details of the cmake error log:

cmake --build . -j8 --target=_moe_C --verbose

it shows (in part):

[1/5] Building CUDA object CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4b8.cu.o
FAILED: CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4b8.cu.o 
ccache /usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -DPy_LIMITED_API=3 -DTORCH_EXTENSION_NAME=_moe_C -D_moe_C_EXPORTS -I/home/orin/tools/vllm/csrc -isystem /home/orin/tools/anaconda3/envs/tmp/include/python3.8 -isystem /home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/torch/include -isystem /home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /usr/local/cuda/include -DONNX_NAMESPACE=onnx_c2 -Xcudafe --diag_suppress=cc_clobber_ignored,--diag_suppress=set_but_not_used,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=bad_friend_decl --expt-relaxed-constexpr --expt-extended-lambda -O2 -g -DNDEBUG -std=c++17 -Xcompiler=-fPIC -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DENABLE_FP8 --threads=1 -DENABLE_SCALED_MM_C2X=1 -D_GLIBCXX_USE_CXX11_ABI=1 -gencode arch=compute_86,code=sm_86 -MD -MT CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4b8.cu.o -MF CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4b8.cu.o.d -x cu -c /home/orin/tools/vllm/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4b8.cu -o CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4b8.cu.o
/home/orin/tools/vllm/csrc/core/scalar_type.hpp(215): error: qualified name is not allowed

/home/orin/tools/vllm/csrc/core/scalar_type.hpp(215): error: explicit type is missing ("int" assumed)

/home/orin/tools/vllm/csrc/core/scalar_type.hpp(215): error: expected a ";"

...


[2/5] Building CUDA object CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4.cu.o
FAILED: CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4.cu.o 
ccache /usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -DPy_LIMITED_API=3 -DTORCH_EXTENSION_NAME=_moe_C -D_moe_C_EXPORTS -I/home/orin/tools/vllm/csrc -isystem /home/orin/tools/anaconda3/envs/tmp/include/python3.8 -isystem /home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/torch/include -isystem /home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /usr/local/cuda/include -DONNX_NAMESPACE=onnx_c2 -Xcudafe --diag_suppress=cc_clobber_ignored,--diag_suppress=set_but_not_used,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=bad_friend_decl --expt-relaxed-constexpr --expt-extended-lambda -O2 -g -DNDEBUG -std=c++17 -Xcompiler=-fPIC -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DENABLE_FP8 --threads=1 -DENABLE_SCALED_MM_C2X=1 -D_GLIBCXX_USE_CXX11_ABI=1 -gencode arch=compute_86,code=sm_86 -MD -MT CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4.cu.o -MF CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4.cu.o.d -x cu -c /home/orin/tools/vllm/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4.cu -o CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4.cu.o
/home/orin/tools/vllm/csrc/core/scalar_type.hpp(215): error: qualified name is not allowed

/home/orin/tools/vllm/csrc/core/scalar_type.hpp(215): error: explicit type is missing ("int" assumed)

/home/orin/tools/vllm/csrc/core/scalar_type.hpp(215): error: expected a ";"

@Irrerwirrer

Ok, I did not encounter this error.
If I remember correctly, this was more or less my way (roughly the sequence sketched after this list):

  • Take care of the problematic dependencies manually (remove the torch and torchvision pins from requirements-build and requirements-cuda, because I had already installed custom-built PyTorch and torchvision).
  • pip install -e .
    With the benefit of hindsight, my problem was just badly built dependencies. Once I got those fixed one after another (torch, torchvision, xformers, etc.), vLLM wasn't a problem anymore.
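
Concretely, the sequence looked something like this (from memory, so treat it as a sketch; the --no-build-isolation flag is my assumption for keeping the preinstalled torch visible during the build):

# inside a vLLM source checkout, with NVIDIA's torch/torchvision already installed
sed -i '/^torch/d' requirements-build.txt requirements-cuda.txt   # drop torch/torchvision pins
pip install -e . --no-build-isolation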

But your error seems more like a cmake problem. I guess you already checked the Install Guide, which has been updated a lot since I did my installation.

Hope you can fix it 👍

@walker-ai

> Ok, I did not encounter this error. If I remember correctly, this was more or less my way:
>
>   • Take care of the problematic dependencies manually (remove the torch and torchvision pins from requirements-build and requirements-cuda, because I had already installed custom-built PyTorch and torchvision).
>   • pip install -e .
>     With the benefit of hindsight, my problem was just badly built dependencies. Once I got those fixed one after another (torch, torchvision, xformers, etc.), vLLM wasn't a problem anymore.
>
> But your error seems more like a cmake problem. I guess you already checked the Install Guide, which has been updated a lot since I did my installation.
>
> Hope you can fix it 👍

Thank you for providing this information, I will try that :)

@conroy-cheers
Contributor

conroy-cheers commented Oct 26, 2024

I'm currently in the process of updating the nixpkgs build definition for vLLM from 0.5.3.post1 to 0.6.3.post1.

The build works great on an x86_64 machine with an NVIDIA GPU, and I have successfully built for the aarch64 Jetson AGX Orin with the correct dependency / CUDA library versions. Unfortunately, at runtime on the Jetson, vLLM fails to start the engine process:

  File "/nix/store/x5bz4k7prp7vdvm4i01y0z0lbi4awad7-python3.12-vllm-0.6.3.post1/lib/python3.12/site-packages/vllm/utils.py", line 799, in current_memory_usage
    return mem
           ^^^
UnboundLocalError: cannot access local variable 'mem' where it is not associated with a value

I have traced this back to the root cause: newer versions of vLLM use NVML to detect CUDA support.
NVML is not supported on Jetson, so unfortunately it isn't possible to proceed without making changes to vLLM here.
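
To illustrate the kind of fallback I have in mind (a hypothetical sketch, not vLLM's actual code; the UnboundLocalError above presumably comes from a branch that only assigns mem when the device type is recognized):

import pynvml
import torch

def total_gpu_memory(device_index: int = 0) -> int:
    # Prefer NVML, but fall back to the CUDA runtime on platforms
    # (like Jetson) where NVML is unavailable.
    try:
        pynvml.nvmlInit()
        handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
        return pynvml.nvmlDeviceGetMemoryInfo(handle).total
    except pynvml.NVMLError:
        return torch.cuda.get_device_properties(device_index).total_memory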

I'd be happy to open a PR to add a fallback mode on CUDA platforms where NVML isn't supported, if the maintainers would be open to it? Otherwise I will have to maintain a patch downstream.

@conroy-cheers
Contributor

#9735
