Game-of-thrones-analysis

This is a natural language processing project that analyzes subtitle files from Season 7 of Game of Thrones.

1. Natural Language Processing (NLTK)

a. tokenize words from each document
b. filter out stop words and stem the remaining words
c. rank the 100 most-used words
d. count word frequency by episode
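
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `episodes` dictionary, the helper names `stem_tokens` and `word_frequencies`, and the tiny inline stop-word set are all assumptions made for the example. In practice the stop words would more likely come from `nltk.corpus.stopwords` (after `nltk.download("stopwords")`), and the text would be read from the subtitle files.

```python
from collections import Counter

from nltk.stem import PorterStemmer
from nltk.tokenize import RegexpTokenizer

# Assumption: a tiny hand-written stop-word set so the sketch runs without
# downloading NLTK data; nltk.corpus.stopwords is the fuller alternative.
STOP_WORDS = {"the", "a", "an", "and", "is", "are", "of", "to", "in", "i", "you", "it"}

# Keeps runs of letters and apostrophes, so subtitle indices and
# timestamps (digits, colons, arrows) are dropped during tokenization.
_tokenizer = RegexpTokenizer(r"[a-z']+")
_stemmer = PorterStemmer()


def stem_tokens(text):
    """Tokenize, drop stop words, and stem what remains (steps a and b)."""
    tokens = _tokenizer.tokenize(text.lower())
    return [_stemmer.stem(t) for t in tokens if t not in STOP_WORDS]


def word_frequencies(episodes, top_n=100):
    """Rank the top_n stems overall and count them per episode (steps c and d)."""
    per_episode = {name: Counter(stem_tokens(text)) for name, text in episodes.items()}
    total = Counter()
    for counts in per_episode.values():
        total.update(counts)
    top_words = [word for word, _ in total.most_common(top_n)]
    # One row per word, one column per episode, in the order of `episodes`.
    table = {word: [per_episode[name][word] for name in episodes] for word in top_words}
    return top_words, table


# Made-up sample text standing in for two subtitle files.
episodes = {
    "S07E01": "The dragons are coming. Winter is here.",
    "S07E02": "A dragon and the queen. The queen is here.",
}
top_words, table = word_frequencies(episodes, top_n=3)
```

Here `table` maps each top-ranked stem to a list of per-episode counts, which is the kind of word-by-episode matrix a clustering step can consume.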

2. K-means Clustering (Scikit-learn)

a. K-means clustering to group similar words based on frequencies
b. PCA for dimensionality reduction
c. plot results
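
A small sketch of how these three steps might fit together with scikit-learn and matplotlib. The word list, the frequency matrix, the choice of two clusters, and the output filename are made-up assumptions for illustration; the project's real inputs and parameters may differ.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend for this sketch; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Assumption: rows are words, columns are episodes, values are made-up counts.
words = ["dragon", "queen", "winter", "king", "north", "war"]
freq = np.array(
    [
        [12, 10, 11],
        [11, 12, 10],
        [1, 2, 1],
        [2, 1, 2],
        [10, 11, 12],
        [1, 1, 2],
    ],
    dtype=float,
)

# a. Group words whose per-episode frequencies are similar.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(freq)

# b. Project the per-episode counts down to two dimensions for plotting.
coords = PCA(n_components=2).fit_transform(freq)

# c. Scatter plot of the words, colored by cluster label.
fig, ax = plt.subplots()
ax.scatter(coords[:, 0], coords[:, 1], c=labels)
for word, (x, y) in zip(words, coords):
    ax.annotate(word, (x, y))
ax.set_xlabel("PC 1")
ax.set_ylabel("PC 2")
fig.savefig("word_clusters.png")  # hypothetical output filename
```

Clustering is done on the full frequency matrix here and PCA is used only for display; clustering on the PCA-reduced coordinates instead is an equally plausible reading of the outline.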
