How do you fix the MemoryError? #3

Open
yangtianyu92 opened this issue Apr 27, 2019 · 1 comment

Comments

@yangtianyu92

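While I'm here, a word for the two people above: public test corpora are all over the place, so why not just download one yourself instead of insisting the author hand one over? Data is something you collect yourself; what the hell are you asking for.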

@yangtianyu92
Author

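from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

# Cap the vocabulary size; limiting max_features is what keeps memory in check
vectorizer = CountVectorizer(max_features=13000)
# Computes the tf-idf weight of each word
transformer = TfidfTransformer()
# corpus is assumed to be an iterable of document strings defined elsewhere
freq_word_matrix = vectorizer.fit_transform(corpus)
# Get all the words in the bag-of-words model
# (on scikit-learn < 1.0, use vectorizer.get_feature_names() instead)
word = vectorizer.get_feature_names_out()
tfidf = transformer.fit_transform(freq_word_matrix)
# weight[i][j] is the tf-idf weight of word j in the i-th text
weight = tfidf.toarray()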
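
To see why capping max_features helps, here is a rough back-of-the-envelope sketch of how much memory the dense array from toarray() needs. It assumes 8-byte floats; the corpus size and the uncapped vocabulary size are made-up numbers for illustration, since the thread does not give the real ones, and dense_matrix_bytes is a hypothetical helper, not part of scikit-learn.

```python
def dense_matrix_bytes(n_docs: int, n_features: int, itemsize: int = 8) -> int:
    """Bytes needed for a dense n_docs x n_features array of itemsize-byte values."""
    return n_docs * n_features * itemsize


# Hypothetical corpus size; the thread does not state the real one.
n_docs = 10_000

# Vocabulary capped with max_features=13000, as in the snippet above.
capped = dense_matrix_bytes(n_docs, 13_000)
# Hypothetical uncapped vocabulary of 200,000 distinct words.
uncapped = dense_matrix_bytes(n_docs, 200_000)

print(f"capped:   {capped / 1e9:.2f} GB")
print(f"uncapped: {uncapped / 1e9:.2f} GB")
```

The dense size grows linearly with the number of features, so the cap shrinks it by the same factor. If the dense array is still too big, keeping tfidf as the sparse matrix that fit_transform returns and skipping toarray() is probably the next thing to try.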
