
Chat Template error when registering Deepseek-v3-awq as a custom model; only Transformers can be selected after registration #2854

Open
rexjm opened this issue Feb 13, 2025 · 4 comments
rexjm commented Feb 13, 2025

Hi everyone, I have two questions:

1. After registering Deepseek-v3-awq as a custom model, I cannot select the vLLM engine. How can I fix this?

2. I saw earlier that v3 and v2.5 share the same architecture, but launching it as v2.5 fails with `Model not found, name: deepseek-v2.5, format: pytorch, size: 236, quantization: moe_wna16`. How can I fix this?

Thanks!
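For reference, the fields in that error map one-to-one to xinference launch flags; a command along these lines (a sketch, assuming the standard CLI flag names) would produce that parameter combination:

```shell
# Sketch only: the flag values mirror the fields in the "Model not found" error above.
xinference launch \
  --model-name deepseek-v2.5 \
  --model-format pytorch \
  --size-in-billions 236 \
  --quantization moe_wna16
```

The error indicates that no registered model spec matches that (format, size, quantization) combination.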

XprobeBot added this to the v1.x milestone Feb 13, 2025
qinxuye (Contributor) commented Feb 13, 2025

Deepseek v3 will come soon; better to wait for the builtin model support.

rexjm (Author) commented Feb 13, 2025

Thanks for your reply!

If I want to deploy v3 temporarily: after registering it following the method from the issue, adding `--quantization moe_wna16` raises the error `Model DeepSeek-V3-awq cannot be run on engine vLLM, with format awq, size 671 and quantization moe_wna16`.
Without that flag the model can be found, but it OOMs.

How can I deploy it temporarily?
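For anyone following along, the registration step mentioned above looks roughly like this (a minimal sketch; the exact JSON schema varies across Xinference versions, and the model_uri, quantizations list, and chat_template below are illustrative placeholders, not values confirmed in this thread):

```shell
# Sketch of a custom model registration; all values are illustrative placeholders.
cat > deepseek-v3-awq.json <<'EOF'
{
  "version": 1,
  "model_name": "DeepSeek-V3-awq",
  "model_lang": ["en", "zh"],
  "model_ability": ["chat"],
  "model_specs": [
    {
      "model_format": "awq",
      "model_size_in_billions": 671,
      "quantizations": ["moe_wna16"],
      "model_uri": "/path/to/DeepSeek-V3-AWQ"
    }
  ],
  "chat_template": "<jinja chat template for DeepSeek-V3>"
}
EOF
# Register the spec with the running Xinference instance.
xinference register --model-type LLM --file deepseek-v3-awq.json --persist
```

Whether vLLM becomes selectable presumably depends on the registered model_format and quantization matching a combination the vLLM engine accepts, which is why the quantization value matters here.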

qinxuye (Contributor) commented Feb 13, 2025

Did you specify the number of GPUs? AWQ quantization should also need eight 80 GB GPUs.
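A launch command pinning the GPU count would look roughly like this (a sketch, assuming the standard --n-gpu flag and the custom model name registered above):

```shell
# Sketch: explicitly request 8 GPUs when launching the AWQ checkpoint.
xinference launch \
  --model-engine vllm \
  --model-name DeepSeek-V3-awq \
  --model-format awq \
  --size-in-billions 671 \
  --n-gpu 8
```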

rexjm (Author) commented Feb 13, 2025

Yes, I did. My feeling is that the `moe_wna16` quantization is simply not supported yet, so the model starts with quantization None, which leads to the OOM.
