Project 3 - Classification with Natural Language Processing

Author: Grace Campbell

Problem Statement

Reddit is a content aggregation website where members can submit links, text posts, images, and videos, which other members can then comment on and discuss. The posts "are organized by subject into user-created boards called 'subreddits', which cover a variety of topics including news, science, movies, video games, music, books, fitness, food, and image-sharing." (Wikipedia)

There are two subreddits I am interested in: /r/News and /r/TheOnion. The first contains titles of news articles, while the second contains titles of satirical news articles. Can I build a classification model using natural language processing that can accurately predict which subreddit a given post came from?
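
As a rough illustration of the kind of classifier this question asks about, here is a minimal sketch, assuming a bag-of-words multinomial Naive Bayes model written with only the Python standard library. It is not the code from the project notebooks: the class name `NaiveBayesTitleClassifier`, the label strings, and the toy titles are all invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(title):
    """Lowercase a title and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", title.lower())


class NaiveBayesTitleClassifier:
    """Hypothetical multinomial Naive Bayes over word counts (add-one smoothing)."""

    def fit(self, titles, labels):
        # Number of training titles per class, used for the class priors.
        self.class_counts = Counter(labels)
        # Per-class word frequencies.
        self.word_counts = defaultdict(Counter)
        for title, label in zip(titles, labels):
            self.word_counts[label].update(tokenize(title))
        # Vocabulary is every word seen in any class during training.
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, title):
        # Words never seen in training carry no evidence, so they are skipped.
        tokens = [t for t in tokenize(title) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            denominator = total_words + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy titles, for illustration only (not scraped from either subreddit).
train_titles = [
    "Senate passes budget bill after late vote",
    "City council approves new transit funding",
    "Area man heroically finishes entire sandwich",
    "Area dad announces plan to stand in yard",
]
train_labels = ["news", "news", "theonion", "theonion"]

model = NaiveBayesTitleClassifier().fit(train_titles, train_labels)
prediction = model.predict("Area man stands in yard")
```

With real data, the titles and labels would come from the two subreddits, and accuracy would have to be judged on a held-out set of posts rather than on the training titles.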

Project Directory

Data Preparation
- Data Gathering
- Exploratory Data Analysis
Modeling

Repository Files

- materials
- project-3-aws
- .gitignore
- README.md
- data-gathering.ipynb
- exploratory-data-analysis.ipynb
- modeling-knn.ipynb
- modeling-naive-bayes.ipynb
- modeling-svm.ipynb

JCacho2007/Fake-News-Classification-NLP
