Sentiment Analysis of Imbalanced Dataset using SMOTE
Sentiment Analysis of Imbalanced Datasets (in English, Arabic, and Moroccan Dialect) using SMOTE is a novel method based on BERT embeddings (one set per language) and stacked deep learning models, namely LSTM, BiLSTM, GRU, and CNN. It provides state-of-the-art accuracy on the imbalanced dialect dataset.
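As a rough illustration of the oversampling step, SMOTE creates each synthetic minority sample by interpolating between a minority sample and one of its k nearest minority-class neighbours. The sketch below is a minimal pure-Python version of that idea, not the code used in this project: the function name `smote_sketch`, the toy 2-D vectors (standing in for BERT embeddings), and the parameter values are all assumptions made for the example.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration only; `minority` is a list of
    equal-length numeric vectors belonging to the minority class.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy 2-D vectors standing in for sentence embeddings (assumed data).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_sketch(minority, n_new=6, k=2)
print(len(new_points), "synthetic samples generated")
```

Because every synthetic point lies on a segment between two real minority points, it stays inside the region already covered by the minority class. In practice a maintained implementation, such as the `SMOTE` class in the imbalanced-learn package, is usually preferable to a hand-rolled version.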
We used three datasets in English, Arabic, and Moroccan Dialect as shown in the following table:
Language | Dataset | Positive | Negative |
---|---|---|---|
English | Reviews | 27411 | 8189 |
Arabic | Arabic_tweets | 835 | 1253 |
Moroccan Dialect | dataset | 14628 | 24932 |
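To make the imbalance in the table concrete, the sketch below computes, for each dataset, which class is the minority, the majority-to-minority ratio, and how many synthetic samples would be needed to reach parity. It assumes that oversampling brings the minority class up to the size of the majority class, which is a common default but is not stated here; the helper name `balance_report` is hypothetical.

```python
# Class counts copied from the table above.
counts = {
    "English": {"positive": 27411, "negative": 8189},
    "Arabic": {"positive": 835, "negative": 1253},
    "Moroccan Dialect": {"positive": 14628, "negative": 24932},
}


def balance_report(counts):
    """Summarise class imbalance per dataset (hypothetical helper)."""
    report = {}
    for language, by_class in counts.items():
        minority = min(by_class, key=by_class.get)
        majority = max(by_class, key=by_class.get)
        report[language] = {
            "minority": minority,
            "ratio": by_class[majority] / by_class[minority],
            # Assumption: oversample the minority class up to parity.
            "synthetic_needed": by_class[majority] - by_class[minority],
        }
    return report


report = balance_report(counts)
for language, row in report.items():
    print(f"{language}: minority={row['minority']}, "
          f"ratio={row['ratio']:.2f}, synthetic_needed={row['synthetic_needed']}")
```

Note that the minority class differs by dataset: negative reviews are the minority in English, while positive examples are the minority in the Arabic and Moroccan Dialect datasets.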
This code is compatible with Python 3.x. If Python 3 is not the default on your system, please use the python3 and pip3 commands instead of python and pip.