This repo contains a simple comment toxicity classifier: a model that takes a comment and outputs the probability that the comment is toxic (that it carries negative thoughts). It is based on this tutorial.
The data was downloaded from Kaggle.
This is definitely not a perfect classifier (there is no perfect one),
but it can analyze comments with an accuracy between 80% and 90%.
To improve these results:
- More data is needed: the dataset contains many neutral comments, which makes the model slightly biased
- Some tuning is needed to handle the bias-variance trade-off
- Transfer learning with BERT could be used to get a better text encoding
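Until more data is available, class weighting is one common way to soften the bias toward neutral comments. The notebook does not do this; the following is only a minimal sketch, assuming binary labels (0 = neutral, 1 = toxic). The function name `balanced_class_weights` and the toy label list are made up for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency.

    Uses weight = n_samples / (n_classes * class_count), so rare
    classes get larger weights and common classes get smaller ones.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy example (hypothetical data): 9 neutral comments, 1 toxic comment.
labels = [0] * 9 + [1]
weights = balanced_class_weights(labels)
print(weights)
```

A dictionary like this can be passed to Keras through the `class_weight` argument of `model.fit`, so that mistakes on the rare toxic class count more during training.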
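The notebook needs the following libraries to run correctly. If any of them is missing, install it with the corresponding command:
pip install tensorflow
pip install numpy
pip install pandas
tkinter and os are part of the Python standard library and cannot be installed with pip. If tkinter fails to import on Linux, install it through the system package manager (for example, the python3-tk package on Debian or Ubuntu).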
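Before running the notebook, it can help to check which of these libraries are actually importable. The snippet below is a small sketch using only the standard library; the helper name `missing_modules` is hypothetical and not part of this repo.

```python
import importlib.util

# Modules the notebook imports. tkinter and os come with Python itself,
# so they should normally be found without any pip install.
REQUIRED = ["tensorflow", "tkinter", "numpy", "os", "pandas"]


def missing_modules(names=REQUIRED):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
print("Missing:", ", ".join(missing) if missing else "none")
```

Any name printed after `Missing:` needs to be installed before the notebook will run.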
- Open test_app.ipynb
- Run the notebook
- Enter your sentence in the window that appears
- Click Analyze
- Zeyad Abdelreheem LinkedIn