Crash in rpc mode #34
please try with
😢 I have tried, but it still crashes. Version: 0.02.559.011. Arguments:
/root/.local/share/pipx/venvs/gpustack/lib/python3.11/site-packages/gpustack/third_party/bin/llama-box/llama-box --host 0.0.0.0 --gpu-layers 62 --parallel 2 --ctx-size 12288 --port 40324 --model /root/DeepSeek-R1-UD-IQ1_S_1.53b/DeepSeek-R1-UD-IQ1_S.gguf --alias DeepSeek-R1-UD-IQ1_S-1.58 --no-mmap --no-warmup --rpc 172.20.10.59:50389,172.20.10.59:50556,172.20.10.59:50195,172.20.10.59:50162,172.20.10.59:50750,172.20.10.59:50883,172.20.10.59:50244 --no-context-shift --no-cache-prompt -n 6144 --metrics
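As a side note on the command above: the `--rpc` flag takes a comma-separated list of `host:port` endpoints, one per RPC server. A minimal sketch of assembling that list in shell, assuming a hypothetical helper `build_rpc_list` (the host and ports below are illustrative placeholders taken from the command above, not a recommendation):

```shell
# Hypothetical helper: join host:port pairs into the comma-separated
# endpoint list that llama-box's --rpc flag expects.
build_rpc_list() {
  local host="$1"; shift
  local out=""
  for port in "$@"; do
    # Append "host:port", prefixing a comma only if out is non-empty.
    out="${out:+$out,}${host}:${port}"
  done
  printf '%s' "$out"
}

build_rpc_list 172.20.10.59 50389 50556
# → 172.20.10.59:50389,172.20.10.59:50556
```

This can then be passed as `--rpc "$(build_rpc_list …)"` to avoid hand-editing a long endpoint string when ports change.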
Can you try with the latest version? We should track it in the main branch, as things change quickly in AI development.
Seems better than the previous version 👍, but now the local llama-box crashes 😢
Command and version:
Can you upload the full log and the reproducing steps here? We have closed a similar issue, gpustack/gpustack#1137; can you verify again with gpustack v0.5.1? Remember, since v0.0.117 introduces a new RPC command, you should upgrade all your agents to v0.5.1 too.
Distro: Rocky Linux 8.4
GPU: 8x NVIDIA RTX 2080, Driver Version: 555.42.02, CUDA Version: 12.5
nvidia-smi output
Version:
Issue: I use RPC to run DeepSeek-R1-UD-IQ1_S-1.58.gguf with two nodes (each has 8x RTX 2080). When I use evalscope to benchmark service performance with llama-box --parallel 2, it always crashes. If I use llama-box --parallel 1, it's fine... These are the messages and coredump:
llama-box main process log: