Group coursework completed with @laibe as part of the INM432 BigData course at City, University of London.
This coursework covers the classification of e-mail messages as spam or non-spam in Spark. It also introduces a few additional elements, such as NLTK and some of the preprocessing and machine learning functions that come with Spark.