README

This is a simple spam email classifier, which I use to showcase a data analysis pipeline.

How to reproduce this analysis

Clone the repository!
Place the test emails in data/, following the naming and format specification below.
Run make clean to clean up the workspace from any previous analysis.
Run make INPATH=data/testemails.SAMPLE_ID.txt, replacing SAMPLE_ID so that the path points to the file containing the emails to test.
Classification results are written to results/ as files named prediction.CLS_ID.SAMPLE_ID.csv (where CLS_ID is the ID of the classifier used).
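
The naming convention in the steps above can be sketched in Python as follows. This is only an illustration of how SAMPLE_ID ties the input file to the result file; the helper names are hypothetical and are not taken from the Makefile or makeHelper.py, and the classifier ID "svm" in the usage note is an assumed placeholder.

```python
from pathlib import Path

PREFIX = "testemails."
SUFFIX = ".txt"


def sample_id_from_inpath(inpath: str) -> str:
    """Extract SAMPLE_ID from a path such as data/testemails.SAMPLE_ID.txt."""
    name = Path(inpath).name
    if not (name.startswith(PREFIX) and name.endswith(SUFFIX)):
        raise ValueError(f"unexpected input file name: {name}")
    sample_id = name[len(PREFIX):-len(SUFFIX)]
    if not sample_id:
        raise ValueError(f"missing SAMPLE_ID in file name: {name}")
    return sample_id


def make_commands(inpath: str) -> list[list[str]]:
    """Return the two make invocations from the steps above, in order."""
    return [["make", "clean"], ["make", f"INPATH={inpath}"]]


def expected_result_path(inpath: str, cls_id: str) -> Path:
    """Build results/prediction.CLS_ID.SAMPLE_ID.csv for a given input file."""
    sample_id = sample_id_from_inpath(inpath)
    return Path("results") / f"prediction.{cls_id}.{sample_id}.csv"
```

For example, expected_result_path("data/testemails.fake.txt", "svm") gives results/prediction.svm.fake.csv, assuming a classifier whose ID is "svm". The lists returned by make_commands can be passed to subprocess.run one at a time.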

Test email format specification

It should follow the same format as the sample, which can be run with:
make INPATH=data/testemails.fake.txt

Autotest

It is advised to leave the auto-test running while making changes. At the project root, run Rscript tests/runtests.R. This first runs all tests in the tests/ directory, then reruns a test whenever a change is detected in either the test itself or the code in spClassR/.
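
The change-detection idea behind the auto-test can be sketched as below. This is not the logic of tests/runtests.R, which is written in R; it is a minimal Python illustration, assuming that a change is detected by comparing file modification times between two snapshots. The function names are hypothetical.

```python
from pathlib import Path


def snapshot(dirs):
    """Map every file under the given directories to its modification time."""
    state = {}
    for d in dirs:
        for path in Path(d).rglob("*"):
            if path.is_file():
                state[str(path)] = path.stat().st_mtime_ns
    return state


def changed_files(old, new):
    """Return the files that were added, removed or modified between snapshots."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))
```

A watcher would take a snapshot of tests/ and spClassR/, sleep briefly, take another snapshot, and rerun the affected tests whenever changed_files returns a non-empty list.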

Dependencies

GNU make tool
Python
R
R packages
- tm
- plyr
- kernlab
- randomForest
- e1071
- testthat
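
Before running the pipeline, the dependencies listed above can be checked with a short script. The sketch below is an assumption-laden illustration and not part of the repository: the executable names ("make", "python", "Rscript") may differ on a given system, and the helper names are hypothetical. It uses shutil.which to look for tools on the PATH and builds an Rscript command that prints the R packages that are not installed.

```python
import shutil

TOOLS = ["make", "python", "Rscript"]  # assumed executable names
R_PACKAGES = ["tm", "plyr", "kernlab", "randomForest", "e1071", "testthat"]


def missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on the PATH."""
    return [t for t in tools if which(t) is None]


def r_missing_packages_command(packages):
    """Build an Rscript call that prints the packages that are not installed."""
    quoted = ", ".join(f'"{p}"' for p in packages)
    expr = f"cat(setdiff(c({quoted}), rownames(installed.packages())))"
    return ["Rscript", "-e", expr]
```

Calling missing_tools(TOOLS) returns an empty list when all three tools are available. The list returned by r_missing_packages_command(R_PACKAGES) can be passed to subprocess.run; empty output then means that every listed package is installed.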
