Project Report: project report/project_report.pdf
- Download 'merged_training.pkl' from the source cited in the report and put it in the folder 'emotion\datasets\Emotion Dataset for Emotion Recognition Tasks'
- Updated download link: https://www.icloud.com/iclouddrive/084E9TMZ_lykn3QhU-kIX1DDQ
- Original GitHub repo: https://github.com/dair-ai/emotion_dataset
- Execute "project_classify.ipynb"
- This will train the classification model and vectorizer, and save them both
- Please ignore the commented-out code used for comparing models and hyperparameter tuning
- This also saves metric results to a folder
- The saved model is:
- LinearSVC(random_state=RANDOM_STATE, max_iter=10000, C=0.085)
- Execute "project_ltr_v3.ipynb"
- This will create the index (if it doesn't exist) and train the learning-to-rank pipeline
- This will also save the documents dataframe for the interface, but not the trained learning-to-rank pipeline
- This also saves metric results to a folder
- Execute "project_interface.py"
- This is the user interface. It first loads the model, vectorizer, and documents dataframe, then trains the learning-to-rank model (takes a few seconds), and then runs the text-based interface
- Choice 2 is supposed to show a graph in a popup via plt.show()
- BM25 and TF-IDF both perform better than my best trained ML model. I am using the ML model anyway for the purposes of this assignment
- Bug with PyTerrier: the program throws an error and exits when no results are found
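
One possible workaround for the no-results crash is to wrap the search call so that a failure becomes an empty result list instead of ending the program. The sketch below is not the code in "project_interface.py". `safe_search`, `search_fn`, and the two stand-in functions are hypothetical names used only for illustration, and because the exact exception is not recorded here, the wrapper catches `Exception` broadly. The real PyTerrier call would need to be adapted to fit this shape.

```python
from typing import Callable, List


def safe_search(search_fn: Callable[[str], List[str]], query: str) -> List[str]:
    """Run search_fn(query); return [] instead of crashing when it fails."""
    try:
        results = search_fn(query)
    except Exception as exc:
        # Broad catch on purpose: the exact error type raised on an empty
        # result set is an unknown here, so any failure is treated as "no hits".
        print(f"No results for {query!r} ({type(exc).__name__}). Try another query.")
        return []
    # Normalise a None return value to an empty list as well.
    return list(results) if results is not None else []


# Hypothetical stand-ins for the real retrieval call, for demonstration only.
def fake_search_ok(query: str) -> List[str]:
    return ["doc1", "doc2"]


def fake_search_empty(query: str) -> List[str]:
    raise ValueError("no results found")


hits = safe_search(fake_search_ok, "happy")
misses = safe_search(fake_search_empty, "zzzz")
```

With a wrapper like this, the interface loop can check for an empty list and prompt for another query rather than exiting.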