This project provides a Hadoop implementation of Naive Bayes for document classification and compares it with an in-memory implementation. The repository contains the following files:
- NB_in_memory.py (in-memory implementation)
- Distributed_Naive_Bayes.pdf (Project report)
- NB_Hadoop (code and log files for the Hadoop implementation of Naive Bayes)
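
For readers unfamiliar with the technique, the following is a minimal sketch of in-memory multinomial Naive Bayes for document classification. It is not the code in NB_in_memory.py: the function names `train` and `predict`, the tokenized input format, and the use of add-one (Laplace) smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Count documents per label and word occurrences per label.

    `docs` is an iterable of (label, tokens) pairs (hypothetical input format).
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, tokens in docs:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def predict(model, tokens):
    """Return the label with the highest log-posterior for `tokens`."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(label_counts):
        # Log prior: fraction of training documents carrying this label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            # Add-one smoothing (an assumption) keeps unseen words from
            # producing a zero probability.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

A usage example with made-up toy data:

```python
model = train([
    ("sports", ["ball", "goal", "team"]),
    ("tech", ["code", "python", "bug"]),
])
print(predict(model, ["goal", "team"]))
```

The counting step in `train` is the part that a MapReduce job would typically distribute, since per-label word counts can be summed independently across data splits.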