The purpose of this project is to analyze student reviews on the RateMyProfessor site to gain insight into course difficulty, professors' teaching styles, and overall student sentiment at the professor and department level.
We use the "Big Data Set from RateMyProfessor.com for Professors' Teaching Evaluation" dataset.
The dataset has fields such as professor name, school name, number of students, student comments, student star rating, and student difficulty rating. It contains 9,543,998 comment records with valid information for 919,750 professors.
https://data.mendeley.com/datasets/fvtfjyvw7d/2
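
The professor-level and department-level summaries described above could be sketched with pandas roughly as follows. This is a minimal illustration, not the project's actual notebook code: the tiny in-memory table stands in for the real dataset, and the column names (`professor_name`, `department_name`, `student_star`, `student_difficult`) are assumptions that should be checked against the downloaded file.

```python
import pandas as pd

# Tiny stand-in for the real dataset; column names are hypothetical.
reviews = pd.DataFrame(
    {
        "professor_name": ["A. Smith", "A. Smith", "B. Jones"],
        "department_name": ["Physics", "Physics", "History"],
        "student_star": [4.0, 5.0, 2.0],
        "student_difficult": [3.0, 2.0, 4.0],
    }
)

# One row per (department, professor): review count and mean ratings.
per_professor = (
    reviews.groupby(["department_name", "professor_name"])
    .agg(
        n_reviews=("student_star", "size"),
        mean_star=("student_star", "mean"),
        mean_difficulty=("student_difficult", "mean"),
    )
    .reset_index()
)
print(per_professor)
```

Grouping by `department_name` alone would give the department-level view in the same way.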
- Python 3.9.13
- Third-party Python modules (listed in requirements.txt):
  - NLTK (for sentiment analysis)
  - spaCy (for text preprocessing)
  - Seaborn (for visualization)
  - Matplotlib (for visualization)
  - pandas
  - NumPy
  - Bokeh (for visualization)
  - Plotly
- Clone repository
- Install the Python dependencies
pip install -r requirements.txt
- Download the VADER lexicon through NLTK (from a Python prompt)
>>> import nltk
>>> nltk.download('vader_lexicon')
- Run the Jupyter notebook to render the visualizations and statistics
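
Once the lexicon is downloaded, scoring a comment with VADER could look roughly like the sketch below. The helper names `label_sentiment` and `score_comments` are hypothetical and not taken from the notebook; the ±0.05 cutoff on the compound score is the conventional VADER threshold, which the project may or may not use.

```python
def label_sentiment(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a coarse label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def score_comments(comments):
    """Return (comment, compound score, label) for each comment string."""
    # Imported lazily so label_sentiment works without NLTK installed.
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    # Requires the 'vader_lexicon' resource downloaded in the step above.
    analyzer = SentimentIntensityAnalyzer()
    results = []
    for text in comments:
        compound = analyzer.polarity_scores(text)["compound"]
        results.append((text, compound, label_sentiment(compound)))
    return results
```

For example, `score_comments(["Great lectures, very clear.", "Unfair exams and boring classes."])` returns one tuple per comment, which can then be averaged per professor or department.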