
deepseek-r1-distill-qwen-1.5b-awq output gets truncated at around 1000 characters #2862

Open
worm128 opened this issue Feb 14, 2025 · 9 comments

worm128 commented Feb 14, 2025

After loading deepseek-r1-distill-qwen-1.5b-awq, the output gets cut off whenever a response runs slightly long, at roughly 1000+ characters. Is this caused by the AWQ int4 quantization, by dtype=float16 being set, or by something else?

Xinference version: [screenshot]

Model parameters:
Model ID: deepseek-r1-distill-qwen-1.5b-awq
Model Size: 1.5 Billion Parameters
Model Format: awq
Model Quantization: Int4

@XprobeBot XprobeBot added this to the v1.x milestone Feb 14, 2025
amzfc commented Feb 15, 2025

Have you configured the context length?

qinxuye (Contributor) commented Feb 15, 2025

Increase max_tokens in the UI.

love01211 commented

@qinxuye If I call the model through the API, the output also gets truncated. How do I adjust max_tokens in that case?

qinxuye (Contributor) commented Feb 18, 2025

https://inference.readthedocs.io/zh-cn/latest/index.html

Passing max_tokens is shown right on the documentation front page.
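
Since Xinference serves an OpenAI-compatible API under /v1, max_tokens can be passed per request. A minimal sketch of such a call; the endpoint address, port 9997, and the model UID deepseek-r1-distill-qwen are assumptions, substitute your own:

```shell
# Sketch: raise max_tokens in the request body so long completions
# are not cut off. The host/port and model UID below are assumed values.
curl http://localhost:9997/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "deepseek-r1-distill-qwen",
        "messages": [{"role": "user", "content": "hello"}],
        "max_tokens": 4096
      }'
```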

love01211 commented

> https://inference.readthedocs.io/zh-cn/latest/index.html
>
> Passing max_tokens is shown right on the documentation front page.

I'm calling it through Dify.

qinxuye (Contributor) commented Feb 18, 2025

Is there nowhere in Dify to adjust max_tokens?

love01211 commented

> Is there nowhere in Dify to adjust max_tokens?

No, there are only input fields for the model type, model name, server URL, model UID, and API key.

SharkSyl commented

I'm hitting this message: you need autoawq>0.6.2. But I checked https://github.com/casper-hansen/AutoAWQ and the latest release there is 0.2.8. Am I misunderstanding something?

```shell
(base) root@6124cea6235e:/# xinference launch --model_path /root/.cache/huggingface/hub/deepseek-r1-distill-qwen-1.5b-awq --model-engine transformers --model-name deepseek-r1-distill-qwen --size-in-billions 1_5 --model-format awq --quantization Int4
Launch model name: deepseek-r1-distill-qwen with kwargs: {'model_path': '/root/.cache/huggingface/hub/deepseek-r1-distill-qwen-1.5b-awq'}
Traceback (most recent call last):
  File "/opt/conda/bin/xinference", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1082, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/xinference/deploy/cmdline.py", line 908, in model_launch
    model_uid = client.launch_model(
                ^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/xinference/client/restful/restful_client.py", line 999, in launch_model
    raise RuntimeError(
RuntimeError: Failed to launch model, detail: [address=0.0.0.0:39895, pid=1087] To use IPEX backend, you need autoawq>0.6.2. Pl
```

I'm using xinference:v1.2.2-cpu.
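
Before digging further, it may help to confirm which autoawq build is actually installed; a quick sketch (assuming the package installs as autoawq and imports as awq, which is its usual layout):

```shell
# Version as pip records it (e.g. 0.2.8).
pip show autoawq
# Cross-check the version the package itself reports; the awq module
# name and __version__ attribute are assumptions about its layout.
python -c "import awq; print(awq.__version__)"
```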

948024326 commented

> Is there nowhere in Dify to adjust max_tokens?
>
> No, there are only input fields for the model type, model name, server URL, model UID, and API key.

Did you ever get this resolved?
