
xinference[vllm] cannot launch a model on multiple GPUs on a single machine #2513

Open
1 of 3 tasks
Weishaoya opened this issue Nov 4, 2024 · 5 comments
Milestone

Comments

@Weishaoya

System Info

ubuntu 22.04
python 3.11.10

Running Xinference with Docker?

  • docker
  • pip install
  • installation from source

Version info

xinference[vllm] 0.16.2

The command used to start Xinference

GRADIO_DEFAULT_CONCURRENCY_LIMIT=10 xinference-local --host 0.0.0.0 --port 10860

Reproduction

Launching a 7B model directly with the vLLM framework on two V100 16G cards works fine, and I can chat with it normally.
When I launch a 7B model on the same two V100 16G cards through xinference[vllm], it fails. My parameter settings and the error output are in the screenshots below:
[screenshots: launch parameters and error messages]
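For readers without the screenshots, the two setups being compared can be sketched as CLI commands. This is a hedged sketch, not taken from the issue: the model name is an illustrative assumption (the report does not say which 7B model was used), and the `xinference launch` flags shown are the ones documented for Xinference around this version.

```shell
# 1) Direct vLLM with tensor parallelism across the two V100s --
#    the reporter says this configuration works:
python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen2-7B-Instruct \
  --tensor-parallel-size 2

# 2) The same hardware through Xinference: start the local server,
#    then launch the model with the vLLM engine on 2 GPUs.
#    (Model name and flags are illustrative assumptions.)
GRADIO_DEFAULT_CONCURRENCY_LIMIT=10 xinference-local --host 0.0.0.0 --port 10860
xinference launch --endpoint http://127.0.0.1:10860 \
  --model-name qwen2-instruct --model-engine vllm \
  --size-in-billions 7 --n-gpu 2
```

The second setup is the one that fails in this report; the later comments discuss whether `tensor_parallel_size` should be passed explicitly at all.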

Expected behavior

I am currently using the ragflow framework with xinference as the inference backend. This is important to me, and I hope the maintainers will prioritize this bug.

@XprobeBot XprobeBot added this to the v0.16 milestone Nov 4, 2024
@qinxuye
Contributor

qinxuye commented Nov 4, 2024

You don't need to set tensor_parallel_size; try removing it.

@Weishaoya
Author

You don't need to set tensor_parallel_size; try removing it.

[screenshots: updated launch parameters and a different error message]
I removed the tensor_parallel_size setting, and now it fails with a different error; the error output is in the screenshots above. I hope you can fix this bug, thanks!


This issue is stale because it has been open for 7 days with no activity.

@github-actions github-actions bot added the stale label Nov 12, 2024
@redreamality

same issue

@github-actions github-actions bot removed the stale label Nov 15, 2024
@Weishaoya
Author

same issue
I updated xinference to version 1.0.0, and it now works.
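Since the reporter says upgrading resolved the failure, the fix on an affected install is a version bump. The pin below comes from the comment above; the exact command shape is a standard pip upgrade, not something specified in the issue:

```shell
# Upgrade xinference with the vllm extra to 1.0.0, which the reporter
# says resolves the multi-GPU launch failure:
pip install -U "xinference[vllm]==1.0.0"
```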


4 participants