No output is produced #2

Closed
xxWeiDG opened this issue Jan 3, 2025 · 13 comments

Comments

@xxWeiDG

xxWeiDG commented Jan 3, 2025

The code is as follows:

from lmcsc import LMCorrector

corrector = LMCorrector(
    model="/home/Work/models/Qwen/Qwen2-7B",
    config_path="configs/default_config.yaml",
)
print("Model loaded successfully")
text = input("Please enter text: ")
outputs = corrector(text)
print(outputs)

The run looks like this:
[screenshot: the script hangs and never prints a result]

I set up the environment as instructed with pip install -r requirements.txt, but no result is ever produced and the call hangs for a long time.

@Jacob-Zhou
Owner

Jacob-Zhou commented Jan 3, 2025

Hi, Qwen2 and Qwen2.5 seem to run into problems with the default attention implementation.
You can try installing flash-attn,

or set torch_dtype=torch.bfloat16 when loading the model:

from lmcsc import LMCorrector
import torch

corrector = LMCorrector(
    model="Qwen/Qwen2-7B",
    config_path="configs/default_config.yaml",
    torch_dtype=torch.bfloat16
)
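For the flash-attn route, the usual install command from the flash-attn project is the one below (exact build requirements depend on your CUDA toolkit and PyTorch version, so check the flash-attn README first):

pip install flash-attn --no-build-isolation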

@xxWeiDG
Author

xxWeiDG commented Jan 3, 2025

> Hi, Qwen2 and Qwen2.5 seem to run into problems with the default attention implementation. You can try installing flash-attn, or set torch_dtype=torch.bfloat16 when loading the model.

[screenshot: corrector output with the modified code]
I changed the code as you suggested, but the results are not very good. This is with Qwen2.5-7B-Instruct.

@Jacob-Zhou
Owner

Results from an aligned *-Instruct model can be worse than those from the base model, because our method relies on the model's language-modeling ability. Of the models we have tried so far, the best correction results come from baichuan-inc/Baichuan2-7B-Base and baichuan-inc/Baichuan2-13B-Base.
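Concretely, switching to one of these base models is a one-line change (a sketch; only the model name differs from the earlier example):

from lmcsc import LMCorrector

# Same setup as before, but pointing at one of the recommended base models.
corrector = LMCorrector(
    model="baichuan-inc/Baichuan2-7B-Base",
    config_path="configs/default_config.yaml",
)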

@Jacob-Zhou
Owner

[screenshot: correct outputs for both examples]
On my side, testing these two examples with Qwen2.5-7B, both are corrected successfully.

@xxWeiDG
Author

xxWeiDG commented Jan 3, 2025

> [screenshot] On my side, testing these two examples with Qwen2.5-7B, both are corrected successfully.

Then why is that? I feel like the fix is not taking effect on my side.

@Jacob-Zhou
Owner

Strange. Could you share your exact environment, GPU, and so on? I will try to reproduce it.

@xxWeiDG
Author

xxWeiDG commented Jan 3, 2025

> Strange. Could you share your exact environment, GPU, and so on? I will try to reproduce it.

Docker environment built from python:3.10.14
OS: Ubuntu 22.04 Desktop
GPU: A800, Driver Version: 525.147.05, CUDA Version: 12.0
pip packages below:

Package Version


accelerate 1.2.1
altair 5.5.0
annotated-types 0.7.0
anyio 4.7.0
attrs 24.3.0
bitsandbytes 0.45.0
blinker 1.9.0
cachetools 5.5.0
certifi 2024.12.14
charset-normalizer 3.4.1
click 8.1.8
cmake 3.31.2
exceptiongroup 1.2.2
fastapi 0.115.6
filelock 3.16.1
fsspec 2024.12.0
gitdb 4.0.12
GitPython 3.1.44
h11 0.14.0
huggingface-hub 0.27.0
idna 3.10
Jinja2 3.1.5
jsonschema 4.23.0
jsonschema-specifications 2024.10.1
lit 18.1.8
loguru 0.7.3
markdown-it-py 3.0.0
MarkupSafe 3.0.2
mdurl 0.1.2
modelscope 1.21.1
mpmath 1.3.0
narwhals 1.20.1
networkx 3.4.2
numpy 1.24.4
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
nvidia-cufft-cu11 10.9.0.58
nvidia-curand-cu11 10.2.10.91
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusparse-cu11 11.7.4.91
nvidia-nccl-cu11 2.14.3
nvidia-nvtx-cu11 11.7.91
opencc-python-reimplemented 0.1.7
packaging 24.2
pandas 2.2.3
pillow 11.1.0
pip 23.0.1
protobuf 5.29.2
psutil 6.1.1
pyarrow 18.1.0
pydantic 2.10.4
pydantic_core 2.27.2
pydeck 0.9.1
Pygments 2.18.0
pypinyin 0.53.0
pypinyin-dict 0.8.0
python-dateutil 2.9.0.post0
pytz 2024.2
PyYAML 6.0.2
referencing 0.35.1
regex 2024.11.6
requests 2.32.3
rich 13.9.4
rpds-py 0.22.3
safetensors 0.5.0
sentencepiece 0.2.0
setuptools 65.5.1
six 1.17.0
smmap 5.0.2
sniffio 1.3.1
sse-starlette 2.2.1
starlette 0.41.3
streamlit 1.41.1
sympy 1.13.3
tenacity 9.0.0
tokenizers 0.21.0
toml 0.10.2
torch 2.0.1
tornado 6.4.2
tqdm 4.67.1
transformers 4.47.1
triton 2.0.0
typing_extensions 4.12.2
tzdata 2024.2
urllib3 2.3.0
uvicorn 0.34.0
watchdog 6.0.0
wheel 0.43.0
xformers 0.0.21

@xxWeiDG
Author

xxWeiDG commented Jan 3, 2025

> Strange. Could you share your exact environment, GPU, and so on? I will try to reproduce it.

Switching to the two models you mentioned does work 👍, but Qwen2.5 really does not work on my machine.

@Jacob-Zhou
Owner

> Switching to the two models you mentioned does work 👍, but Qwen2.5 really does not work on my machine.

I will look into this over the next couple of days.

@Jacob-Zhou
Owner

What I have found so far is that, on Python versions later than 3.10, in this code:

@torch.jit.script
def distortion_probs_to_cuda(
    template_tensor: torch.Tensor,
    force_eos: torch.Tensor,
    batch_size: int,
    num_beams: int,
    batch_beam_size: int,
    vocab_size: int,
    _batch_indices: List[int],
    _beam_indices: List[int],
    _token_indices: List[int],
    _distortion_probs: List[float]) -> torch.Tensor:

the @torch.jit.script at line 104 causes the constructed distortion_probs to be inconsistent with the values passed in.
After commenting it out, Qwen2.5 works again on my side. Could you check whether that fixes it for you as well?
I am still investigating the exact cause.
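If you would rather not edit the installed package, one possible workaround (untested here; it relies on PyTorch's documented PYTORCH_JIT switch rather than on anything specific to this project) is to disable TorchScript so the decorated function runs as plain Python:

# Untested workaround sketch: disable TorchScript before torch is imported,
# so @torch.jit.script-decorated functions (including distortion_probs_to_cuda)
# run as ordinary Python instead of being compiled.
import os
os.environ["PYTORCH_JIT"] = "0"

import torch
from lmcsc import LMCorrector

corrector = LMCorrector(
    model="Qwen/Qwen2-7B",
    config_path="configs/default_config.yaml",
    torch_dtype=torch.bfloat16,
)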

@xxWeiDG
Author

xxWeiDG commented Jan 6, 2025 via email

@Jacob-Zhou
Owner

> OK, I will take a look. One more question: my text contains a lot of domain-specific terms, and the corrector handles many of them poorly. How can I deal with that? Do I need to train from scratch on my own domain data?

You can try putting a description of your domain, or the text preceding the current sentence, into contexts; that may give some improvement.
You could also do continued pre-training (CPT) of the base model on your domain data, which should give the best results.
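For example (a sketch only: the contexts keyword follows the wording above and the domain string is made up; check the project README for the exact call signature):

# Hypothetical domain description passed as additional context; adjust to your field.
domain_context = "以下是一段电力设备巡检记录。"  # "The following is a power-equipment inspection log."
outputs = corrector(text, contexts=domain_context)
print(outputs)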

@xxWeiDG xxWeiDG closed this as completed Jan 7, 2025