tauseef1234/Spam_Labeling_Snorkel

Snorkel - Data labeling using weak supervision

Despite massive improvements in machine learning frameworks, research, and hardware, preparing training datasets remains a largely manual process. Data scientists must either label large numbers of files by hand or outsource the task to contract workers. This bottleneck is becoming more apparent as open source tools make deep learning more accessible than ever. The Snorkel project, started at Stanford in 2016, aims to solve this problem by programmatically labeling, building, and managing training data with weak supervision.

In this project, we walk through the process of using Snorkel to build a training set for classifying text messages as spam or not spam. The goal is not only to demonstrate the basic components and concepts of Snorkel, but also to dive into the actual process of iteratively developing a real application with it.
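
To make the idea of programmatic labeling concrete, here is a minimal plain-Python sketch of weak supervision for spam detection. It deliberately does not use Snorkel's API: the heuristic functions (`lf_contains_free`, `lf_contains_url`, `lf_short_message`), the sample messages, and the simple majority vote are all illustrative assumptions, not code from this repository. Snorkel itself combines labeling-function outputs with a learned label model rather than a plain majority vote.

```python
from collections import Counter

# Label constants; ABSTAIN means a heuristic has no opinion on a message.
ABSTAIN, HAM, SPAM = -1, 0, 1


# Hypothetical heuristics ("labeling functions"). Each one is noisy on its own.
def lf_contains_free(text):
    return SPAM if "free" in text.lower() else ABSTAIN


def lf_contains_url(text):
    return SPAM if "http" in text.lower() else ABSTAIN


def lf_short_message(text):
    return HAM if len(text.split()) < 5 else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_free, lf_contains_url, lf_short_message]


def majority_vote(text):
    """Combine heuristic votes; abstain when nothing fires or on a tie."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    top = Counter(votes).most_common(2)
    if len(top) == 2 and top[0][1] == top[1][1]:
        return ABSTAIN
    return top[0][0]


# Made-up example messages, used only to show the mechanics.
messages = [
    "Claim your FREE prize now at http://example.com",
    "See you at lunch",
    "The quarterly report is attached for your review today",
]
labels = [majority_vote(m) for m in messages]
print(labels)
```

Messages on which every heuristic abstains stay unlabeled, which is why iterating on the set of heuristics to improve coverage is a central part of the workflow.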