
deepseek-r1-distill-qwen-1.5b-awq output gets truncated at around 1000 characters #2862

Open
worm128 opened this issue Feb 14, 2025 · 9 comments

worm128 commented Feb 14, 2025

After loading deepseek-r1-distill-qwen-1.5b-awq, the output gets cut off whenever a response runs slightly long, at roughly 1000+ characters. Is this caused by the AWQ int4 quantization, by dtype=float16 being set, or by something else?

Xinference version: [screenshot]

Model parameters:
Model ID: deepseek-r1-distill-qwen-1.5b-awq
Model Size: 1.5 Billion Parameters
Model Format: awq
Model Quantization: Int4

@XprobeBot XprobeBot added this to the v1.x milestone Feb 14, 2025
amzfc commented Feb 15, 2025

Have you configured the context length?

qinxuye (Contributor) commented Feb 15, 2025

Increase max_tokens in the UI.

love01211 commented

@qinxuye If I call the model through the API, the output also gets truncated. How do I adjust max_tokens in that case?

qinxuye (Contributor) commented Feb 18, 2025

https://inference.readthedocs.io/zh-cn/latest/index.html

Passing max_tokens is shown right on the documentation front page.
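
Since Xinference serves an OpenAI-compatible API under /v1, max_tokens can be passed per request. A minimal sketch of such a call; the endpoint address, port 9997, and the model UID deepseek-r1-distill-qwen are assumptions, substitute your own:

```shell
# Sketch: raise max_tokens in the request body so long completions
# are not cut off. The host/port and model UID below are assumed values.
curl http://localhost:9997/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "deepseek-r1-distill-qwen",
        "messages": [{"role": "user", "content": "hello"}],
        "max_tokens": 4096
      }'
```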

love01211 commented

> https://inference.readthedocs.io/zh-cn/latest/index.html
>
> Passing max_tokens is shown right on the documentation front page.

I'm calling it through Dify.

qinxuye (Contributor) commented Feb 18, 2025

Is there nowhere in Dify to adjust max_tokens?

love01211 commented

> Is there nowhere in Dify to adjust max_tokens?

No, there are only input fields for the model type, model name, server URL, model UID, and API key.

SharkSyl commented

I'm hitting this message: you need autoawq>0.6.2. But I checked https://github.com/casper-hansen/AutoAWQ and the latest release there is 0.2.8. Am I misunderstanding something?

```shell
(base) root@6124cea6235e:/# xinference launch --model_path /root/.cache/huggingface/hub/deepseek-r1-distill-qwen-1.5b-awq --model-engine transformers --model-name deepseek-r1-distill-qwen --size-in-billions 1_5 --model-format awq --quantization Int4
Launch model name: deepseek-r1-distill-qwen with kwargs: {'model_path': '/root/.cache/huggingface/hub/deepseek-r1-distill-qwen-1.5b-awq'}
Traceback (most recent call last):
  File "/opt/conda/bin/xinference", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1082, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/xinference/deploy/cmdline.py", line 908, in model_launch
    model_uid = client.launch_model(
                ^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/xinference/client/restful/restful_client.py", line 999, in launch_model
    raise RuntimeError(
RuntimeError: Failed to launch model, detail: [address=0.0.0.0:39895, pid=1087] To use IPEX backend, you need autoawq>0.6.2. Pl
```

I'm using xinference:v1.2.2-cpu.
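
Before digging further, it may help to confirm which autoawq build is actually installed; a quick sketch (assuming the package installs as autoawq and imports as awq, which is its usual layout):

```shell
# Version as pip records it (e.g. 0.2.8).
pip show autoawq
# Cross-check the version the package itself reports; the awq module
# name and __version__ attribute are assumptions about its layout.
python -c "import awq; print(awq.__version__)"
```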

948024326 commented

> Is there nowhere in Dify to adjust max_tokens?
>
> No, there are only input fields for the model type, model name, server URL, model UID, and API key.

Did you ever get this resolved?
