Sentiment Analysis Honours Stage Dissertation using Python. It applies Natural Language Processing to Twitter data obtained through the API.

bdavis9725/Sentiment-Analysis-Honours-Thesis

Sentiment Analysis-Dissertation

Dissertation Project: An Analysis of Sentiment using Twitter Data

Research Question

“How can modelling techniques be used to Analyse the Sentiment of Twitter Data?”

Abstract

Sentiment Analysis is a fast-moving area of research in high demand among businesses, as it lets them adapt their strategies by understanding how consumers perceive their brand. This paper aims to analyse the Sentiment Polarity of topical tweets using modelling techniques, drawing on 70,000 tweets. Combining Deep Learning, Natural Language Processing and Word Embeddings, the project successfully trains models and classifies tweets with good results. Variations in the training data demonstrated that removing stopwords can have a negative effect on classification accuracy. Indexing and Tokenization are effective techniques for feeding data into Neural Networks. Results are evaluated on data trained with different types of hidden layers across Recurrent Neural Networks and Convolutional Neural Networks. Visualisation is used to support the evaluation metrics, allowing comparison where accuracies are too close to separate. The project achieved a high accuracy of 96% using a Convolutional Neural Network on unclean data, with lower accuracies on the other data variants. This suggests that the approach showcased here has the potential to lead to further advancements.
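
As a rough illustration of the Tokenization and Indexing steps mentioned in the abstract, here is a minimal sketch in plain Python. It is not the notebook's code: the function names, the tiny stopword list, the example tweets and the reserved ids (0 for padding, 1 for unknown words) are all assumptions made for this example. It also shows one plausible way stopword removal could hurt accuracy, since many common stopword lists include negation words such as "not"; this is an illustration, not a finding taken from the report.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not the list used in the project).
STOPWORDS = {"the", "a", "is", "not", "this"}


def tokenize(text, remove_stopwords=False):
    """Lowercase a tweet and split it into word tokens."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens


def build_index(token_lists):
    """Map each word to an integer id, most frequent words first.

    Ids 0 and 1 are reserved here for padding and unknown words.
    """
    counts = Counter(t for tokens in token_lists for t in tokens)
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=2)}


def encode(tokens, index, max_len):
    """Turn tokens into a fixed-length list of ids, truncating or padding with 0."""
    ids = [index.get(t, 1) for t in tokens][:max_len]
    return ids + [0] * (max_len - len(ids))


# Hypothetical example tweets with opposite sentiment.
tweets = ["This film is not good", "This film is good"]

raw = [tokenize(t) for t in tweets]
clean = [tokenize(t, remove_stopwords=True) for t in tweets]

index = build_index(raw)
encoded = [encode(tokens, index, max_len=6) for tokens in raw]

print(encoded)               # fixed-length integer sequences, ready for an embedding layer
print(clean[0] == clean[1])  # True: with "not" removed, the two tweets look identical
```

Fixed-length integer sequences of this kind are the usual input to an embedding layer in front of a Recurrent or Convolutional Neural Network.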

Project Contains

  • Jupyter Notebook containing the source code (.ipynb)
  • Text file and CSV files containing all the data used in raw format / cleaned format
  • Project Initiation Document PDF file
  • Final Report PDF file
  • Project Plan file
  • Saved models that can be re-run and evaluated again without the need for re-training

Important Note

  • To run the code yourself, comment out the 5th code block, which is where the Twitter credentials would go
