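According to the repository README, the training data is:
https://huggingface.co/datasets/shibing624/chinese_text_correction
The test data is:
SIGHAN-2015 (sighan2015_test.tsv)
EC-LAW (ec_law_test.tsv)
MCSC (mcsc_test.tsv)
On inspection, the EC-LAW and MCSC data overlap with the training data. This matches the results on the three test sets: the scores on EC-LAW and MCSC are close to 1, while SIGHAN-2015 is oddly only 0.4917.
Was the data that appears in the test sets removed before training?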
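For anyone who wants to reproduce the overlap check, here is a minimal sketch. It assumes the sentences have already been loaded into plain Python lists; the function names `normalize`, `overlap_ratio`, and `drop_leaked` are hypothetical helpers for illustration, not part of the repository, and the toy sentences are made up rather than taken from the actual datasets.

```python
import unicodedata


def normalize(text: str) -> str:
    """NFKC-normalize and strip whitespace so trivial variants compare equal."""
    return unicodedata.normalize("NFKC", text).strip()


def overlap_ratio(train_texts, test_texts):
    """Return the fraction of test sentences that also occur in the training data."""
    train_set = {normalize(t) for t in train_texts}
    test_norm = [normalize(t) for t in test_texts]
    if not test_norm:
        return 0.0
    hits = sum(1 for t in test_norm if t in train_set)
    return hits / len(test_norm)


def drop_leaked(train_texts, test_texts):
    """Return the training sentences that do not occur in the test set."""
    test_set = {normalize(t) for t in test_texts}
    return [t for t in train_texts if normalize(t) not in test_set]


# Toy data (made up for illustration; not from the real datasets).
train = ["今天天气很好", "他在学校上课", "法院依法作出判决"]
test = ["法院依法作出判决", "我喜欢吃苹果"]

print(overlap_ratio(train, test))  # 0.5
print(drop_leaked(train, test))
```

This only catches exact matches after normalization. Near-duplicates, such as the same sentence with a different injected error, would need a fuzzier comparison.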
The training data did include the data from the test sets.