craigmacartney/Spark-RDD-Programming-Exercise---NLTK-and-TF.IDF

Spark Programming - Natural Language Processing and Information Retrieval

INM432 Big Data coursework 2018

Group coursework completed together with @laibe as part of the course INM432 Big Data at City, University of London.

This coursework covers the classification of e-mail messages as spam or non-spam in Spark. It also introduces a few additional elements, such as NLTK and some of the preprocessing and machine learning functions that come with Spark.
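
As a rough illustration of the TF.IDF weighting named in the repository title, here is a minimal plain-Python sketch. It is not the coursework code and does not use Spark or NLTK: the `tokenize` and `tf_idf` helpers and the sample `emails` list are hypothetical, and the whitespace tokenizer merely stands in for an NLTK tokenizer. It assumes the common definition of term frequency times the log of inverse document frequency.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace (a stand-in for an NLTK tokenizer).
    return text.lower().split()


def tf_idf(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Relative term frequency times log inverse document frequency.
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


# Hypothetical sample messages, for illustration only.
emails = ["win money now", "meeting at noon", "win a free prize now"]
vectors = tf_idf(emails)
```

Terms that occur in fewer messages receive a higher weight, so `vectors[0]["money"]` exceeds `vectors[0]["win"]`. In Spark, the same per-document and per-term counting steps would typically be expressed as map and reduce operations over an RDD, and the resulting vectors would then feed a classifier.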
