NLP_Maximum_Entropy_Classifier

Trains on labeled IMDB movie reviews and classifies new documents by sentiment. Please see "Maxent and Perceptron.pdf" for detailed implementation notes.

An ongoing project of mine in this repository is the empirical model of the MaxEnt perceptron. If you're curious, check out MaxEnt_Empirical_Notes_Model.py. I wrote the notes and the code following this tutorial:

http://blog.datumbox.com/machine-learning-tutorial-the-max-entropy-text-classifier/

The maximum entropy classifier I have successfully implemented uses stochastic gradient descent, whereas the empirical model uses regular (full-batch) gradient descent. This was a design choice: stochastic gradient descent updates the weights from one document (or a small sample) at a time, so it only approximates the true gradient and converges approximately, whereas the empirical model works from frequency-based probabilities over the whole training set and is therefore more exact. I wanted the program to be able to run, train, and score within a reasonable amount of time. Currently, the program is still a bit slow because of how I've handled the Bag of Words, but it still averages 74-78% accuracy across 10 test splits. The theoretical maximum is 86.7% accuracy, but that requires much more in-depth feature extraction, whereas I am essentially using Bag of Words and removing stopwords in scoring.
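To make the stochastic versus full-batch distinction concrete, here is a minimal sketch of a two-class MaxEnt (logistic) model over Bag of Words counts. It is not the code in MaxEnt.py: the function names, the tiny stopword list, the learning rate, and the four toy reviews are all assumptions made for illustration.

```python
import math
import random
from collections import Counter

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {"the", "a", "an", "and", "is", "was", "it", "this", "of", "to"}

def bag_of_words(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords, count."""
    tokens = [t.strip(".,!?\"'()") for t in text.lower().split()]
    return Counter(t for t in tokens if t and t not in STOPWORDS)

def predict_prob(weights, features):
    """P(positive | document) under a two-class MaxEnt (logistic) model."""
    score = sum(weights.get(w, 0.0) * c for w, c in features.items())
    score = max(-30.0, min(30.0, score))  # clamp so math.exp cannot overflow
    return 1.0 / (1.0 + math.exp(-score))

def sgd_epoch(weights, data, lr=0.1):
    """Stochastic: adjust the weights after every single document."""
    for features, label in data:
        error = label - predict_prob(weights, features)
        for w, c in features.items():
            weights[w] = weights.get(w, 0.0) + lr * error * c

def batch_step(weights, data, lr=0.1):
    """Full-batch: average the gradient over all documents, then update once."""
    grad = Counter()
    for features, label in data:
        error = label - predict_prob(weights, features)
        for w, c in features.items():
            grad[w] += error * c
    for w, g in grad.items():
        weights[w] = weights.get(w, 0.0) + lr * g / len(data)

# Toy reviews (made up for this sketch): 1 = positive, 0 = negative.
docs = [
    ("A wonderful, moving film", 1),
    ("Great acting and a great story", 1),
    ("Dull and boring plot", 0),
    ("Terrible acting, boring story", 0),
]
data = [(bag_of_words(text), label) for text, label in docs]

sgd_weights = {}
rng = random.Random(0)
for _ in range(50):
    rng.shuffle(data)          # visit documents in a new order each epoch
    sgd_epoch(sgd_weights, data)

batch_weights = {}
for _ in range(50):
    batch_step(batch_weights, data)

print(predict_prob(sgd_weights, bag_of_words("wonderful film")))
print(predict_prob(batch_weights, bag_of_words("boring plot")))
```

One pass of `sgd_epoch` makes as many weight updates as there are documents, while one call to `batch_step` makes a single update from the averaged gradient, which is why the stochastic version tends to make faster progress per pass on larger corpora at the cost of noisier steps.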

If you'd like to reproduce my results, please check out the implementation notes, and pass the data folders "pos" and "neg" as arguments when running MaxEnt from the Python shell.
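For readers who want to set up a comparable experiment on their own, the following sketch shows one way to load two folders of reviews and draw a random train/test split. It does not reproduce the interface of MaxEnt.py (see the implementation notes for that); `load_reviews` and `random_split` are hypothetical helper names, and the sketch assumes each folder holds one review per plain-text file.

```python
import os
import random

def load_reviews(pos_dir, neg_dir):
    """Read every file in both folders; label positives 1 and negatives 0."""
    docs = []
    for folder, label in ((pos_dir, 1), (neg_dir, 0)):
        for name in sorted(os.listdir(folder)):
            path = os.path.join(folder, name)
            if os.path.isfile(path):
                # errors="ignore" skips bytes that are not valid UTF-8
                with open(path, encoding="utf-8", errors="ignore") as f:
                    docs.append((f.read(), label))
    return docs

def random_split(docs, test_fraction=0.1, seed=0):
    """Shuffle a copy of docs and hold out test_fraction of it for testing."""
    shuffled = list(docs)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)
```

Calling `random_split` with ten different `seed` values and averaging the resulting accuracies would give a figure comparable in spirit to an average over 10 test splits, though the exact splitting procedure used for the reported numbers may differ.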

Enjoy!

Files in this repository:

neg
pos
MaxEnt.py
MaxEnt_And_Perceptron.pdf
MaxEnt_Empirical_Notes_Model.py
Maxent and Perceptron.pdf
README.md
sentimentAnalyzer.pdf

Mike001-wq/NLP_Maximum_Entropy_Classifier
