
Can I use two GPUs for vLLM? #471

Open
RL4LLM opened this issue Mar 4, 2025 · 2 comments

Comments


RL4LLM commented Mar 4, 2025

Thank you for your work.

I understand that you are using 1 GPU for vLLM inference and the other 7 GPUs for training.

Can I use two GPUs for vLLM inference instead? I noticed that using 1 GPU results in an out-of-memory (OOM) error when I set max_completion_length to a large value.

@samma1570

Same problem.


tastelikefeet commented Mar 8, 2025

You can try SWIFT (ms-swift: https://github.com/modelscope/ms-swift), which is based on TRL and supports multiple vLLM instances as well as tensor parallelism.
