spam_filter

In this project I explore the multinomial Naive Bayes algorithm and apply it to build an SMS spam filter. The model has an accuracy of over 90% on the test data.

The dataset used for both training and testing the algorithm was created by Tiago A. Almeida and José María Gómez Hidalgo, and can be found at the UCI Machine Learning Repository. It contains SMS messages that are already labelled as spam or not spam.
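As a minimal sketch of how the labelled messages might be read, assuming the `SMSSpamCollection` file keeps the UCI layout of one message per line, with the label (`ham` or `spam`) and the text separated by a tab. The function name `parse_sms_lines` and the sample rows are hypothetical and are not taken from the notebook.

```python
def parse_sms_lines(lines):
    """Split 'label<TAB>message' lines into (label, text) pairs."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Split on the first tab only, so tabs inside the message are preserved.
        label, _, text = line.partition("\t")
        records.append((label, text))
    return records


# Made-up rows for illustration; they are not taken from the dataset.
sample_lines = [
    "ham\tSee you at lunch?\n",
    "spam\tWIN a free prize now!\n",
]
records = parse_sms_lines(sample_lines)
print(records)
```

Because the function accepts any iterable of lines, an open file handle (for example `open("SMSSpamCollection", encoding="utf-8")`) could be passed in place of the sample list.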

The Naive Bayes algorithm assesses whether each individual SMS message is spam by evaluating the words it contains. The algorithm is 'naive' because it assumes that the words in a message are conditionally independent of one another given the class, an assumption that may not hold for real text.
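The idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the notebook's implementation: the tokenizer (lowercasing and keeping runs of letters, digits and apostrophes), the additive smoothing constant `alpha=1.0`, the choice to ignore words never seen in training, and all function names are hypothetical. Each class is scored as the log prior plus the sum of the log word likelihoods, which is where the independence assumption enters.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed preprocessing: lowercase, keep runs of letters, digits, apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def train(records, alpha=1.0):
    """Count words per class. alpha is the additive-smoothing constant (assumed)."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for label, text in records:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return {"word_counts": word_counts, "class_counts": class_counts,
            "vocab": vocab, "alpha": alpha}


def classify(model, text):
    """Return the class with the higher log posterior score.

    Assumes both classes appear at least once in the training data.
    """
    total_messages = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    scores = {}
    for label in ("spam", "ham"):
        counts = model["word_counts"][label]
        words_in_class = sum(counts.values())
        # Log prior: fraction of training messages with this label.
        score = math.log(model["class_counts"][label] / total_messages)
        for word in tokenize(text):
            if word not in model["vocab"]:
                continue  # words unseen in training are ignored (assumption)
            # Smoothed multinomial likelihood; repeated words count repeatedly.
            score += math.log((counts[word] + alpha)
                              / (words_in_class + alpha * vocab_size))
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up training messages for illustration; not taken from the dataset.
toy_records = [
    ("spam", "win a free prize now"),
    ("spam", "free cash prize claim now"),
    ("ham", "are we still meeting for lunch"),
    ("ham", "call me when you are free"),
]
model = train(toy_records)
print(classify(model, "claim your free prize"))
print(classify(model, "are we meeting for lunch"))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the smoothing term keeps a word that appears in only one class from driving the other class's probability to zero.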

Overall, the algorithm correctly classifies 98.7% of the messages in the test data. The misclassified messages contained elements that may be beyond the algorithm's capabilities, such as punctuation-based emoticons, abbreviations and acronyms.
