Dissertation Project: An Analysis of Sentiment using Twitter Data
“How can modelling techniques be used to Analyse the Sentiment of Twitter Data?”
Sentiment analysis is a fast-moving area of research in high demand among businesses, because it lets them understand how consumers see their brand and adapt their strategies accordingly. This paper analyses the sentiment polarity of topical tweets using modelling techniques applied to 70,000 tweets. Deep Learning, Natural Language Processing and Word Embeddings are combined to train models and classify tweets, with good results. Variations in the training data showed that removing stopwords can have a negative effect on classification accuracy. Indexing and tokenization proved effective for feeding data into neural networks. Results are evaluated by training on the data with different types of hidden layers across Recurrent Neural Networks and Convolutional Neural Networks. Visualisation is used to support the evaluation metrics, allowing comparison where accuracies are too close to separate the models. The project achieved a high accuracy of 96% using a Convolutional Neural Network on unclean data, with lower accuracies on the other data. This suggests that the techniques showcased here could support further advances in this area.
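As a rough illustration of the tokenization and indexing step mentioned above, the sketch below turns tweets into fixed-length lists of integer ids, with optional stopword removal. It is not the notebook's code: the function names (`tokenize`, `build_index`, `encode`), the tiny stopword list and the padding/unknown-word ids are all assumptions made for this example, and only the Python standard library is used.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; not the list used in the project).
STOPWORDS = {"a", "an", "the", "is", "to", "of", "and", "not"}


def tokenize(text, remove_stopwords=False):
    """Lowercase a tweet and split it into word-like tokens."""
    tokens = re.findall(r"[a-z0-9#@']+", text.lower())
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens


def build_index(tokenized_texts):
    """Map each token to an integer id, most frequent tokens first.

    Id 0 is reserved for padding and id 1 for unknown words
    (a common convention, assumed here).
    """
    counts = Counter(t for tokens in tokenized_texts for t in tokens)
    ordered = sorted(counts, key=lambda t: (-counts[t], t))
    return {tok: i + 2 for i, tok in enumerate(ordered)}


def encode(tokens, index, max_len):
    """Convert tokens to ids, then truncate or zero-pad to max_len."""
    ids = [index.get(t, 1) for t in tokens][:max_len]
    return ids + [0] * (max_len - len(ids))


tweets = ["This is not good", "This is good"]
index = build_index([tokenize(t) for t in tweets])
encoded = [encode(tokenize(t), index, max_len=5) for t in tweets]
```

With this toy stopword list, `tokenize("This is not good", remove_stopwords=True)` and `tokenize("This is good", remove_stopwords=True)` produce the same tokens, because the negation is dropped. That is one way stopword removal can discard sentiment-bearing words; it is offered here as an illustration, not as the explanation established by the project.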
- Jupyter Notebook containing the source code (.ipynb)
- Text and CSV files containing all the data used, in raw and cleaned formats
- Project Initiation Document PDF file
- Final Report PDF file
- Project Plan file
- Saved models that can be re-run and evaluated again without the need for re-training
- To run the code yourself, comment out the 5th code block, which is where the Twitter credentials would go