Some suggestions on training sentiment #96

Open
Koado opened this issue Oct 19, 2018 · 2 comments

Koado commented Oct 19, 2018

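For the training set, I downloaded Tan Songbo's hotel review corpus from the web. It already comes split into pos.txt and neg.txt, 3,000 reviews each, so you only need to concatenate them. (I suggest also appending the neg.txt and pos.txt bundled in the sentiment directory: the larger and broader the training set, the better the result.) Then comes training, with the following code: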
from snownlp.sentiment import train, save
train('neg.txt', 'pos.txt')  # paths point to the hotel review corpus files
save('new_sentiment.marshal')
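Training takes a while. When it finishes, the resulting file has the suffix .marshal.3. Edit the __init__.py file in the sentiment directory: in the line data_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'sentiment.marshal'), replace sentiment.marshal with new_sentiment.marshal (in my case I put the new marshal.3 file in the sentiment directory).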
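If editing the installed package is inconvenient, the module-level `load` in `snownlp.sentiment` may work as an alternative. This is only a rough sketch: it assumes `load` takes the base name without the `.3` suffix, the same way `save` does above, and that the suffix is added on Python 3. `model_file_on_disk` and `load_custom_model` are helper names made up for this example, not part of snownlp.

```python
import os
import sys


def model_file_on_disk(base_name, directory="."):
    """Return the file name that save() is expected to have written.

    Assumption: on Python 3 a '.3' suffix is appended to the base name,
    which matches the '.marshal.3' file described in this post.
    """
    suffix = ".3" if sys.version_info[0] == 3 else ""
    return os.path.join(directory, base_name + suffix)


def load_custom_model(base_name, directory="."):
    """Load a retrained model without editing snownlp's __init__.py."""
    on_disk = model_file_on_disk(base_name, directory)
    if not os.path.exists(on_disk):
        raise FileNotFoundError(on_disk)
    # Imported lazily so the path helper works without snownlp installed.
    from snownlp import sentiment
    # Pass the base name; the suffix is assumed to be added internally.
    sentiment.load(os.path.join(directory, base_name))
    return on_disk


# Hypothetical usage, with the model file in the current directory:
# load_custom_model("new_sentiment.marshal")
```

Loading this way would have to be repeated in every script that needs the retrained model, whereas the __init__.py change applies everywhere.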
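Run Python again and you can see the effect. Compare the scores for the same sentences before and after, and you will find the results are somewhat better than before.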
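The before/after comparison can be made a little more systematic. A minimal sketch, assuming each model is wrapped as a function that maps a sentence to a score between 0 and 1 (with snownlp that would presumably be something like `lambda s: SnowNLP(s).sentiments`). `compare_models` and `accuracy` are hypothetical helpers, and the 0.5 threshold is my own choice.

```python
def compare_models(sentences, score_old, score_new):
    """Return (sentence, old_score, new_score, change) for each sentence."""
    rows = []
    for sentence in sentences:
        old = score_old(sentence)
        new = score_new(sentence)
        rows.append((sentence, old, new, new - old))
    return rows


def accuracy(labelled, score, threshold=0.5):
    """Fraction of (sentence, is_positive) pairs classified correctly.

    A sentence counts as positive when its score is >= threshold.
    """
    if not labelled:
        raise ValueError("labelled must not be empty")
    correct = 0
    for sentence, is_positive in labelled:
        if (score(sentence) >= threshold) == is_positive:
            correct += 1
    return correct / float(len(labelled))
```

Sentences held out from training give a fairer picture than sentences the model has already seen.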
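To improve the results further, the only option is to find, or build yourself, more pos.txt and neg.txt files that are already sorted by class.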
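Concatenating several corpora (for example the hotel reviews plus the bundled neg.txt and pos.txt) could look roughly like this. It assumes every source file is UTF-8 text with one review per line. Dropping blank and duplicate lines is my own addition, and `merge_corpora` and the file names in the usage comment are placeholders.

```python
import io


def merge_corpora(source_paths, dest_path, encoding="utf-8"):
    """Concatenate line-based corpus files into dest_path.

    Blank lines are skipped and exact duplicate lines are written once.
    Returns the number of lines written.
    """
    seen = set()
    written = 0
    with io.open(dest_path, "w", encoding=encoding) as out:
        for path in source_paths:
            with io.open(path, "r", encoding=encoding) as src:
                for line in src:
                    text = line.strip()
                    if not text or text in seen:
                        continue
                    seen.add(text)
                    out.write(text + u"\n")
                    written += 1
    return written


# Hypothetical usage:
# merge_corpora(["hotel_pos.txt", "bundled_pos.txt"], "pos_all.txt")
# merge_corpora(["hotel_neg.txt", "bundled_neg.txt"], "neg_all.txt")
```

If a downloaded corpus uses a different encoding, pass it via `encoding` when converting that corpus first, since mixing encodings in one output file would corrupt the text.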
I hope this helps anyone who needs it.

Koado commented Oct 19, 2018

Thanks to the author for building this Chinese sentiment analysis library. It's great!

@Zero1366166516

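I did the same thing, except that I trained on a financial dictionary. Training was fast, finishing in under a minute, and it produced a .3 file. However, running the program afterwards failed with an error.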
