Please follow the template so that we can understand your issue more clearly.
Please provide the full case so that we can try to reproduce.
Please try using guided decoding if you would like perfectly valid JSON generation.
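Guided decoding constrains sampling so the output is valid JSON by construction. To illustrate why free-form generation needs such a constraint, here is a minimal validate-and-retry fallback using only the standard library (`generate` is a hypothetical callable wrapping the model, not part of any real API):

```python
import json

def generate_json(generate, prompt, retries=3):
    """Validate-and-retry fallback when guided decoding is unavailable.

    `generate` is a hypothetical callable wrapping the model; guided
    decoding avoids this loop entirely by restricting sampling to
    tokens that keep the output parseable as JSON.
    """
    for _ in range(retries):
        text = generate(prompt)
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            continue  # re-sample and try again
    raise ValueError(f"no valid JSON after {retries} attempts")
```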
Does this happen to Qwen2.5?
Does it happen with bfloat16 precision? fp8 may severely impact performance; we advise using GPTQ, AWQ, or other quantization algorithms.
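The precision gap behind this advice can be sketched with a stdlib-only simplification that truncates a float32 mantissa to bfloat16-like (7 bits) and fp8-e4m3-like (3 bits) widths. This ignores exponent range and rounding modes, so it only illustrates the mantissa-precision difference, not real fp8 behaviour:

```python
import struct

def truncate_mantissa(x: float, keep_bits: int) -> float:
    # Truncate a float32 mantissa (23 bits) down to `keep_bits` bits.
    # Simplification: exponent range and round-to-nearest are ignored.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    mask = (0xFFFFFFFF << (23 - keep_bits)) & 0xFFFFFFFF
    return struct.unpack(">f", struct.pack(">I", bits & mask))[0]

# bfloat16 keeps 7 mantissa bits; fp8 e4m3 keeps only 3.
assert truncate_mantissa(3.14159, 7) == 3.140625  # bf16-like: small error
assert truncate_mantissa(3.14159, 3) == 3.0       # fp8-like: large error
```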
Model Series
Qwen2
What are the models used?
Qwen2-72B-Instruct
What is the scenario where the problem happened?
tensorrt-llm
Is this badcase known, and can it be solved using available techniques?
Information about environment
GPU: Nvidia H800
Description
Steps to reproduce
This happens to Qwen2-72B-Instruct with fp8 quantization.
The badcase can be reproduced with the following steps:
Prompt the model to produce its output in JSON format.
The following example input & output can be used:
Expected results
When the output is in JSON format, extra \ escape characters are added; for example, \n becomes \\n.
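The symptom described above can be reproduced in isolation with the standard `json` module (the strings below are hypothetical examples, not the actual model output):

```python
import json

# A correctly escaped JSON string contains the two characters \n,
# which parse to a real newline. The reported badcase adds an extra
# backslash, so the JSON text contains \\n instead, which parses to
# a literal backslash followed by the letter n.
good = '{"text": "line1\\nline2"}'    # JSON text contains \n
bad = '{"text": "line1\\\\nline2"}'   # JSON text contains \\n

assert json.loads(good)["text"] == "line1\nline2"   # real newline
assert json.loads(bad)["text"] == "line1\\nline2"   # backslash + "n"
```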
Attempts to fix
I have tried several ways to fix this, including:
Anything else helpful for investigation
I find that this problem also happens to ...