Twitter Sentiment Analysis: Project Overview

Analyzed tweets from Kaggle competition data to determine whether their sentiment is positive, negative, or neutral.

Resources Used:

Python Version: 3.6

Packages: pandas, numpy, plotly, SpaCy, nltk

Distribution of sentiments in training data

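To evaluate my models I used the Jaccard index, which measures the similarity of two sets as the size of their intersection divided by the size of their union. Here the sets are the words of the two sentences being compared.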
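
As a rough illustration, a word-level Jaccard index could be computed as in the sketch below. The function name `jaccard`, the lowercasing, the whitespace tokenization, and the handling of two empty strings are assumptions made for this example, not necessarily what the notebook does.

```python
def jaccard(str1: str, str2: str) -> float:
    """Word-level Jaccard index between two strings (illustrative sketch)."""
    # Assumption: tokens are lowercased, whitespace-separated words.
    a = set(str1.lower().split())
    b = set(str2.lower().split())
    # Assumption: two empty strings count as identical.
    if not a and not b:
        return 1.0
    # Size of the intersection divided by size of the union.
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # "love" and "this" are shared; the union has four distinct words.
    print(jaccard("I love this movie", "love this"))
```

A score of 1.0 means both strings contain exactly the same set of words, and 0.0 means they share none.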

Here are the distributions of Jaccard scores between the full training tweets and the selected parts of those tweets.

List of the most common words (after removal of stopwords)
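
A list like this could be produced along the lines of the sketch below. The helper `most_common_words` and the tiny `STOPWORDS` set are hypothetical and exist only to keep the example self-contained; in practice a fuller list such as the English stopwords shipped with nltk would likely be used instead.

```python
from collections import Counter

# Hypothetical, deliberately small stopword set for illustration only.
STOPWORDS = {"the", "a", "is", "to", "and", "i", "of", "in", "it", "my"}


def most_common_words(texts, n=10, stopwords=STOPWORDS):
    """Return the n most frequent non-stopword tokens across all texts."""
    counts = Counter()
    for text in texts:
        # Assumption: lowercase, whitespace-separated tokens.
        tokens = text.lower().split()
        counts.update(t for t in tokens if t not in stopwords)
    return counts.most_common(n)


if __name__ == "__main__":
    sample = ["I love the sun", "the sun is out", "love love"]
    print(most_common_words(sample, n=2))
```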

I used SpaCy to train my named entity recogniser. My steps:
