
Failed to compile SwiftTransformer #37

Open
FredHuang99 opened this issue Aug 9, 2024 · 5 comments

Comments

@FredHuang99

Command executed:
git clone https://github.com/LLMServe/SwiftTransformer.git; cd SwiftTransformer; git submodule update --init --recursive; cmake -B build; cmake --build build -j$(nproc)

Error messages:
(a representative excerpt of the errors)
/workspace/DistServe/SwiftTransformer/src/unittest/util/../unittest_utils.h:93:45: error: call of overloaded ‘fabs(__half)’ is ambiguous
93 | fabs(answer[i]-reference[i]), fabs(answer[i]-reference[i])/fabs(reference[i]));
| ~~~~^~~~~~~~~~~~~~~~~~~~~~~~
/workspace/DistServe/SwiftTransformer/src/unittest/util/../unittest_utils.h:93:75: error: call of overloaded ‘fabs(__half)’ is ambiguous
93 | fabs(answer[i]-reference[i]), fabs(answer[i]-reference[i])/fabs(reference[i]));
| ~~~~^~~~~~~~~~~~~~~~~~~~~~~~
/workspace/DistServe/SwiftTransformer/src/csrc/kernel/fused_context_stage_attention.cu(145): error: name followed by "::" must be a class or namespace name
wmma::fragment<wmma::matrix_a, 16ul, 16ul, 16ul, __half, wmma::row_major> a_frag;
^
/workspace/DistServe/SwiftTransformer/src/csrc/kernel/fused_context_stage_attention.cu(146): error: type name is not allowed
wmma::fragment<wmma::matrix_b, 16ul, 16ul, 16ul, __half, wmma::col_major> b_frag;
^
/workspace/DistServe/SwiftTransformer/src/csrc/kernel/fused_context_stage_attention.cu(146): error: identifier "b_frag" is undefined
wmma::fragment<wmma::matrix_b, 16ul, 16ul, 16ul, __half, wmma::col_major> b_frag;
^
Build environment:
nvcr.io/nvidia/pytorch:23.10-py3 image
CXX compiler: GNU 11.4.0
CUDA: NVIDIA 12.2.140
CUDAToolkit: 12.2.140
NCCL: libnccl.so.2.19.3
MPI: 3.1
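One plausible cause of the `wmma::` errors (an assumption, not confirmed in this thread): the `nvcuda::wmma` Tensor Core API is only defined when compiling for compute capability 7.0 or newer, so if CMake cannot detect a GPU at configure time it may fall back to an older default architecture, producing exactly these "name followed by '::' must be a class or namespace name" errors. A possible workaround is to pin the target architecture explicitly; `CMAKE_CUDA_ARCHITECTURES` is a standard CMake variable, but the value `80` below is only an example (A100) — substitute your GPU's compute capability.

```shell
# Hypothetical workaround: pin the CUDA architecture so nvcc compiles the
# wmma:: (Tensor Core) kernels for sm_70 or newer instead of a fallback arch.
cmake -B build -DCMAKE_CUDA_ARCHITECTURES=80
cmake --build build -j$(nproc)
```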

@duihuhu

duihuhu commented Aug 9, 2024

Is it a version problem?

@interestingLSY
Member

Have you added the --gpus=all argument when you launch the docker container, or equivalently, can you see your GPUs when you type nvidia-smi inside your docker container?
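The question above likely relates to GPU visibility at configure time: if the build auto-detects the GPU architecture and no GPU is visible inside the container, the wrong architecture may be selected. A sketch of the container launch and check being asked about (the `docker run` flags and `nvidia-smi` are standard; the image tag matches the environment reported earlier in this issue):

```shell
# Launch the NGC container with GPUs passed through:
docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:23.10-py3

# Inside the container, nvidia-smi should list your GPUs;
# if it does not, the CUDA build cannot detect an architecture either.
nvidia-smi
```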

@FredHuang99
Author

Have you added the --gpus=all argument when you launch the docker container, or equivalently, can you see your GPUs when you type nvidia-smi inside your docker container?

I have added --gpus all.

@William12github

What's the GPU type in the system?

@TZHelloWorld

Maybe you can execute git submodule update --init --recursive to make sure that all the submodules are installed.


5 participants