Same for me.
Is this a distributed deployment?
Yes, it's a distributed deployment, but the server running DeepSeek only has one GPU.
System Info / 系統信息
Xinference 12.2, container (Docker) version
Running Xinference with Docker? / 是否使用 Docker 运行 Xinference?
Version info / 版本信息
12.2
The command used to start Xinference / 用以启动 xinference 的命令
curl -X 'POST' \
  'http://127.0.0.12:9997/v1/models' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model_engine": "vllm",
    "model_name": "deepseek-r1-distill-qwen",
    "model_path": "/app/models/DeepSeek-R1-Distill-Qwen-14B",
    "worker_ip": "10.0.6.101",
    "max_model_len": 57744
  }'
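For reference, the same launch request can be issued from Python using only the standard library. This is a sketch, not an official client; the endpoint and all parameters mirror the curl command above.

```python
import json
import urllib.request

# Same endpoint as the curl command above.
API_URL = "http://127.0.0.12:9997/v1/models"

# Launch payload, identical to the curl -d body above.
launch_payload = {
    "model_engine": "vllm",
    "model_name": "deepseek-r1-distill-qwen",
    "model_path": "/app/models/DeepSeek-R1-Distill-Qwen-14B",
    "worker_ip": "10.0.6.101",
    "max_model_len": 57744,
}

def launch_model() -> str:
    """POST the launch request and return the server's raw response body."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(launch_payload).encode("utf-8"),
        headers={"accept": "application/json",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")

# Uncomment to launch against a live cluster:
# print(launch_model())
```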
Reproduction / 复现过程
Ask a question via the API, e.g. "What causes an engine to smoke?". After the response has been streaming for more than 30 seconds, output stops mid-answer.
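To pin down where the stream stops, the reproduction can be scripted against Xinference's OpenAI-compatible chat endpoint. This is a sketch under assumptions: the host/port and model name are taken from the launch command above, and the client timeout is set deliberately high to rule out a client-side cutoff. The timing log shows when the server stops sending tokens.

```python
import json
import time
import urllib.request

# Assumed OpenAI-compatible chat endpoint on the same host/port as above.
API_URL = "http://127.0.0.12:9997/v1/chat/completions"

def parse_sse_line(line: str):
    """Extract the delta text from one OpenAI-style SSE line, or None."""
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):]
    if payload.strip() == "[DONE]":
        return None
    chunk = json.loads(payload)
    return chunk["choices"][0].get("delta", {}).get("content") or ""

def stream_question(question: str) -> None:
    """Stream a completion and log how long tokens keep arriving."""
    body = json.dumps({
        "model": "deepseek-r1-distill-qwen",  # model name from the launch command
        "messages": [{"role": "user", "content": question}],
        "stream": True,
    }).encode("utf-8")
    req = urllib.request.Request(
        API_URL, data=body,
        headers={"Content-Type": "application/json"},
    )
    start = time.monotonic()
    # Generous timeout so any ~30 s cutoff cannot come from this client.
    with urllib.request.urlopen(req, timeout=300) as resp:
        for raw in resp:
            text = parse_sse_line(raw.decode("utf-8").strip())
            if text:
                print(f"[{time.monotonic() - start:6.1f}s] {text}",
                      end="", flush=True)

# Uncomment to run against a live deployment:
# stream_question("What causes an engine to smoke?")
```

If the timestamps show tokens stopping at roughly 30 seconds regardless of the question, the cutoff is likely a server- or proxy-side timeout rather than the model finishing its answer.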
Expected behavior / 期待表现
The full DeepSeek reasoning ("thinking") output should be returned without being cut off.