Local cuda container build fails with "unsupported instruction `vpdpbusd'" #471

nzwulfin · 2024-11-20T00:08:18Z

When I try to build on my home system, ./container_build.sh cuda fails with the following error:

/tmp/ccnKypuJ.s: Assembler messages:
/tmp/ccnKypuJ.s:31871: Error: unsupported instruction `vpdpbusd'
/tmp/ccnKypuJ.s:31926: Error: unsupported instruction `vpdpbusd'
/tmp/ccnKypuJ.s:31995: Error: unsupported instruction `vpdpbusd'
/tmp/ccnKypuJ.s:32060: Error: unsupported instruction `vpdpbusd'
/tmp/ccnKypuJ.s:32113: Error: unsupported instruction `vpdpbusd'
gmake[2]: *** [ggml/src/CMakeFiles/ggml.dir/build.make:132: ggml/src/CMakeFiles/ggml.dir/ggml-quants.c.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:1591: ggml/src/CMakeFiles/ggml.dir/all] Error 2
gmake: *** [Makefile:146: all] Error 2

From what I can tell online, this is due to the binutils in RHEL 9 not being new enough to support the instruction.

I made some progress by adding the GCC Toolset 12 to the cuda portion of the dnf_install switch statement, but I'm not familiar enough with what needs to really get set to use the toolset correctly. I expect that scl enable is doing a lot more with the path than I exported.

    dnf install -y gcc-toolset-12
    export CC=/opt/rh/gcc-toolset-12/root/usr/bin/gcc
    # CMake reads the C++ compiler from CXX (CCXX is not a recognized variable)
    export CXX=/opt/rh/gcc-toolset-12/root/usr/bin/g++

I've hit my limit for testing but thought I'd report the issue anyhow.

nzwulfin · 2024-11-20T01:38:55Z

I examined a UBI 9 container with the toolset and the CUDA dev container and brute forced a few more exports for the build to complete. I don't think this is the right solution, but might serve as a pointer to one.

  elif [ "$containerfile" = "cuda" ]; then
    dnf install -y "${rpm_list[@]}"
    dnf install -y gcc-toolset-12
    export CC=/opt/rh/gcc-toolset-12/root/usr/bin/gcc
    export CXX=/opt/rh/gcc-toolset-12/root/usr/bin/g++
    export PKG_CONFIG_PATH=/opt/rh/gcc-toolset-12/root/usr/lib64/pkgconfig
    export INFOPATH=/opt/rh/gcc-toolset-12/root/usr/share/info
    export LD_LIBRARY_PATH=/opt/rh/gcc-toolset-12/root/usr/lib64:/opt/rh/gcc-toolset-12/root/usr/lib:$LD_LIBRARY_PATH
    export PATH=/usr/share/Modules/bin:/opt/rh/gcc-toolset-12/root/usr/bin:/root/.local/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:$PATH

bmahabirbu · 2024-11-20T07:05:46Z

@nzwulfin good analysis. Did you also try running scl enable gcc-toolset-12 bash before doing the exports? It starts a new shell with GCC Toolset 12 enabled, which should avoid the error.

In general, I have personally tested the build process on Ubuntu 24.04, Ubuntu 22.04 in WSL2, and Fedora 40, but I'm new to RHEL 9!

ericcurtin · 2024-11-20T11:33:45Z

Let's open a PR and get this change in, related issue:

ggerganov/llama.cpp#5316

nzwulfin · 2024-11-20T13:07:35Z

@bmahabirbu I did try the scl enable bash step both in the switch and after the dnf_install in the main body. I didn't see any changes to which GCC got picked up by cmake, but it also didn't throw any errors.

I didn't have any problems in a local version of the cuda:12.6.2-devel-ubi9 container:

[root@13fd7588eacd /]# scl enable gcc-toolset-12 bash

[root@30d20f630919 /]# gcc --version
gcc (GCC) 12.2.1 20221121 (Red Hat 12.2.1-7)
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

It might be the bash invocation inside a running script.

I found the enable file and it's mainly just a bunch of env exports. I'm going to test changing all my exports to

source /opt/rh/gcc-toolset-12/enable

I'll report in once I have a local build
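
For anyone following along, here is a minimal sketch of why the two approaches probably behave differently inside a script. It assumes the cause is the usual parent/child environment rule: a shell started from the script gets its own copy of the environment, so exports made there are gone once it exits, while sourcing runs the exports in the script's own shell. The temp file below is a hypothetical stand-in for /opt/rh/gcc-toolset-12/enable, and /opt/fake-toolset/bin is a made-up path.

```shell
#!/bin/sh
# Hypothetical stand-in for /opt/rh/gcc-toolset-12/enable: just an export.
fake_enable="$(mktemp)"
cat > "$fake_enable" <<'EOF'
export PATH="/opt/fake-toolset/bin:${PATH}"
EOF

original_path="$PATH"

# Case 1: a child shell applies the exports and then exits.
# The parent script's PATH is not affected.
sh -c ". '$fake_enable'"
if [ "$PATH" = "$original_path" ]; then
  after_child="unchanged"
else
  after_child="changed"
fi

# Case 2: sourcing (`.` is the POSIX spelling of `source`) runs the exports
# in the current shell, so later commands in this script see the new PATH.
. "$fake_enable"
case "$PATH" in
  /opt/fake-toolset/bin:*) after_source="changed" ;;
  *) after_source="unchanged" ;;
esac

echo "PATH after child shell: $after_child"
echo "PATH after sourcing:    $after_source"
rm -f "$fake_enable"
```

If that assumption holds, anything the build script runs after a child bash exits would still see the system GCC, which matches what I observed with cmake.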

nzwulfin · 2024-11-20T13:11:09Z

> Let's open a PR and get this change in, related issue:
>
> ggerganov/llama.cpp#5316

Well if I had read the issue Eric linked, I could have saved all my testing this morning ;)

Based on ggerganov/llama.cpp#5316 (comment) and ggerganov/llama.cpp#5316 (comment), I should have the right combo in this attempt:

  elif [ "$containerfile" = "cuda" ]; then
    dnf install -y "${rpm_list[@]}"
    dnf install -y gcc-toolset-12 
    source /opt/rh/gcc-toolset-12/enable
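
A slightly more defensive variant, as a sketch only: it checks that the enable file is readable before sourcing it, so a missing toolset fails with a clear message instead of a confusing compiler error later. The helper name enable_toolset is hypothetical and not part of container_build.sh.

```shell
#!/bin/sh
# enable_toolset is a hypothetical helper, not part of container_build.sh.
# It sources the given enable file in the current shell so that the exports
# stay in effect for the commands that follow.
enable_toolset() {
  enable_file="$1"
  if [ -r "$enable_file" ]; then
    . "$enable_file"
  else
    echo "toolset enable file not found: $enable_file" >&2
    return 1
  fi
}

# Intended call in the cuda branch, after installing gcc-toolset-12:
#   enable_toolset /opt/rh/gcc-toolset-12/enable
```

The function has to be called directly from the script rather than inside `$(...)` or a pipeline, otherwise the exports would land in a subshell again.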

nzwulfin · 2024-11-20T13:32:56Z

The llama.cpp compile was a little noisy b/c of an enabled warning, but it worked and I was able to get llama3.2 working via notes in the discussion. I'll clean up my local repo and submit a PR so folks can look at it in context.

Here's the warning I was seeing in case someone wants to think about silencing it.

/opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/include/avx512fintrin.h:5946:10: warning: '__Y' may be used uninitialized [-Wmaybe-uninitialized]
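
If someone does want to silence it, one possible approach (a sketch, not something I've tested in this build) is to pass GCC's -Wno-maybe-uninitialized through CMake's standard CMAKE_C_FLAGS and CMAKE_CXX_FLAGS variables. The variable name extra_cmake_flags and the build directory are assumptions; where the build script actually invokes cmake would need to be checked.

```shell
#!/bin/sh
# Hypothetical sketch: extra CMake arguments that turn off
# -Wmaybe-uninitialized for both C and C++ sources.
# extra_cmake_flags is an assumed variable name, not one from the build script.
extra_cmake_flags="-DCMAKE_C_FLAGS=-Wno-maybe-uninitialized -DCMAKE_CXX_FLAGS=-Wno-maybe-uninitialized"

# Example use (the build directory name is an assumption):
#   cmake -B build $extra_cmake_flags
echo "$extra_cmake_flags"
```

Setting these on the command line replaces any value given for them earlier on the same command line, so they would need to be merged with existing flags if the build already passes some.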

bmahabirbu · 2024-11-20T14:11:08Z

@ericcurtin good find for that issue! I'm surprised I didn't come upon it during my search.

@nzwulfin my apologies, but thank you for testing my suggestion anyway! I guess scl enable doesn't properly give access to GCC Toolset 12 here. It's good to know that sourcing the enable file works.

nzwulfin · 2024-11-20T14:35:55Z

@bmahabirbu no worries, I wanted to make sure I didn't miss anything the first time I tried it!

nzwulfin · 2024-11-20T14:42:17Z

PR #473 submitted, thanks y'all!

nzwulfin · 2024-11-21T00:47:36Z

PR #473 was merged, and a local test confirmed the fix.

nzwulfin mentioned this issue Nov 20, 2024

Enable GCC Toolet 12 to support AVX VNNI #473

Merged

nzwulfin closed this as completed Nov 21, 2024
