hn-sentiment

An experimental Hacker News reader incorporating sentiment analysis of comments.

The color label for each story is produced in three steps: take the mean sentiment of its comments, multiply this by the log of the number of comments (to reduce the variance of stories with few comments), and finally estimate the area under the probability density function between this score and the mean score. This results in an even distribution of scores, which magnifies differences between scores close to the mean.

Each comment is labeled using the same method, but the histogram shows the raw scores.
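
The scoring described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names are hypothetical, the natural log is assumed, and the scores are assumed to be roughly normally distributed so that the area under the density can be computed with an approximation of the error function.

```javascript
// Sketch only: names, the natural log, and the normal-distribution
// assumption are illustrative and not taken from the project source.

// Mean comment sentiment weighted by the log of the comment count.
function storyScore(commentSentiments) {
  const n = commentSentiments.length;
  if (n === 0) return 0;
  const mean = commentSentiments.reduce((a, b) => a + b, 0) / n;
  return mean * Math.log(n);
}

// Abramowitz and Stegun approximation of the error function (7.1.26).
function erf(x) {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t -
      0.284496736) * t + 0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

// Signed area under a normal density between the mean and the score.
// The result lies between -0.5 and 0.5.
function areaFromMean(score, mean, stdDev) {
  if (stdDev === 0) return 0;
  const z = (score - mean) / stdDev;
  return 0.5 * erf(z / Math.SQRT2);
}

// Convert a list of raw scores into labels relative to their own mean.
function labelScores(scores) {
  const n = scores.length;
  const mean = scores.reduce((a, b) => a + b, 0) / n;
  const variance = scores.reduce((a, s) => a + (s - mean) ** 2, 0) / n;
  const stdDev = Math.sqrt(variance);
  return scores.map((s) => areaFromMean(s, mean, stdDev));
}

// Example: three stories with different comment sentiments.
const rawScores = [[2, 1, 3], [-1, -2], [0, 1, -1, 0]].map(storyScore);
console.log(labelScores(rawScores));
```

Because the transform is monotonic, the ordering of stories is preserved; only the spacing between labels changes.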

Uses node.js, React, D3.js, and the sentiment module.

Important terms

Use tf-idf to identify the N most "important" words within a comment thread, relative to a corpus of comments.

Generate a CSV with a large corpus of comments.
Analyze the corpus with tf-idf, and store the idf cache in a JSON file.
Given a story, load the idf cache for the corpus.
Calculate the tf-idf for each term in the comment thread (normalize by length?), relative to the corpus.
Select some N terms with the highest scores.
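
The steps above can be sketched as below. This is a hedged illustration only: the tokenizer, the smoothed idf formula, the shape of the cache object, and all function names are assumptions, and the cache is round-tripped through JSON in memory rather than written to a file.

```javascript
// Sketch only: tokenizer, idf smoothing, and cache shape are assumptions.

// Split text into lowercase word tokens (deliberately naive).
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9']+/g) || [];
}

// Build an idf table from a corpus, treating each comment as one document.
function buildIdfCache(corpus) {
  const docFreq = new Map();
  for (const doc of corpus) {
    for (const term of new Set(tokenize(doc))) {
      docFreq.set(term, (docFreq.get(term) || 0) + 1);
    }
  }
  const idf = {};
  for (const [term, df] of docFreq) {
    // Smoothed idf, so terms in every document still get a positive weight.
    idf[term] = Math.log((1 + corpus.length) / (1 + df)) + 1;
  }
  return { docCount: corpus.length, idf };
}

// Score every term in a comment thread and return the n highest.
function topTerms(threadComments, cache, n) {
  const tokens = threadComments.flatMap(tokenize);
  const counts = new Map();
  for (const t of tokens) counts.set(t, (counts.get(t) || 0) + 1);

  // Terms never seen in the corpus get the idf of document frequency 0.
  const unseenIdf = Math.log(1 + cache.docCount) + 1;
  const scored = [];
  for (const [term, count] of counts) {
    const tf = count / tokens.length; // normalized by thread length
    const known = Object.prototype.hasOwnProperty.call(cache.idf, term);
    scored.push({ term, score: tf * (known ? cache.idf[term] : unseenIdf) });
  }
  // Highest score first; ties broken alphabetically for stable output.
  scored.sort((a, b) => b.score - a.score || a.term.localeCompare(b.term));
  return scored.slice(0, n);
}

// Example: the cache survives a JSON round trip, as it would via a file.
const corpus = [
  "the cat sat on the mat",
  "the dog ate the food",
  "a bird is in the tree",
];
const cache = JSON.parse(JSON.stringify(buildIdfCache(corpus)));
const thread = ["rust is great", "the rust compiler is the best"];
console.log(topTerms(thread, cache, 3));
```

Normalizing the term frequency by thread length is one possible answer to the "normalize by length?" question above; whether it helps in practice is left open here.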

Clustering

Use the presence (or frequency?) of these words in each comment as features in a clustering model. Maybe exclude some if they are too common?

For each comment, create a vector from the presence/frequency/tf-idf of each term.
Do some kind of clustering using the vectors.
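
As one possible reading of these notes, the sketch below builds a presence vector per comment and groups the vectors with a plain k-means loop. The choice of k-means, the function names, and the naive initialization (the first k vectors become the starting centroids) are all assumptions for illustration; a real implementation would likely use a better initialization or an existing library.

```javascript
// Sketch only: k-means and the first-k initialization are assumptions.

// Split text into lowercase word tokens (deliberately naive).
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9']+/g) || [];
}

// One feature per important term: 1 if the comment contains it, else 0.
function toVector(comment, terms) {
  const present = new Set(tokenize(comment));
  return terms.map((t) => (present.has(t) ? 1 : 0));
}

function squaredDistance(a, b) {
  return a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0);
}

// Returns the cluster index assigned to each vector.
function kMeans(vectors, k, iterations = 20) {
  // Naive, deterministic start: copy the first k vectors as centroids.
  let centroids = vectors.slice(0, k).map((v) => v.slice());
  let assignments = new Array(vectors.length).fill(0);

  for (let iter = 0; iter < iterations; iter++) {
    // Assignment step: nearest centroid for every vector.
    assignments = vectors.map((v) => {
      let best = 0;
      for (let c = 1; c < centroids.length; c++) {
        if (squaredDistance(v, centroids[c]) < squaredDistance(v, centroids[best])) {
          best = c;
        }
      }
      return best;
    });
    // Update step: each centroid moves to the mean of its members.
    centroids = centroids.map((old, c) => {
      const members = vectors.filter((_, i) => assignments[i] === c);
      if (members.length === 0) return old; // keep empty clusters in place
      return old.map((_, d) => members.reduce((s, m) => s + m[d], 0) / members.length);
    });
  }
  return assignments;
}

// Example with hypothetical important terms and comments.
const terms = ["rust", "compiler", "python", "pandas"];
const comments = [
  "Rust compiler errors are great",
  "Python with pandas is handy",
  "The rust borrow checker",
  "pandas dataframes",
];
const vectors = comments.map((c) => toVector(c, terms));
console.log(kMeans(vectors, 2)); // → [ 0, 1, 0, 1 ]
```

Swapping `toVector` for a frequency or tf-idf weighting would only change how the vectors are built; the clustering loop stays the same.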

See: http://brandonrose.org/clustering

TODO:

Check distribution of comments
Performance: http://benchling.engineering/performance-engineering-with-react/

CJStadler/hn_sentiment
