I analysed a dataset of Indian movies, primarily Hindi movies.
- Pandas
- NumPy
- Matplotlib
- fivethirtyeight: a Matplotlib style that produces plots similar to those of FiveThirtyEight.
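As a minimal sketch of how the fivethirtyeight style can be switched on in Matplotlib: the ratings here are synthetic numbers generated only for illustration, not values from the dataset.

```python
import matplotlib.pyplot as plt
import numpy as np

# Apply the built-in "fivethirtyeight" style sheet to all later plots.
plt.style.use("fivethirtyeight")

# Synthetic ratings, made up for illustration only.
rng = np.random.default_rng(0)
ratings = rng.normal(loc=6.0, scale=1.2, size=200).clip(1, 10)

fig, ax = plt.subplots()
ax.hist(ratings, bins=20)
ax.set_xlabel("Rating")
ax.set_ylabel("Number of movies")
ax.set_title("Distribution of ratings (synthetic data)")
fig.tight_layout()
```

Call `plt.show()` afterwards to display the figure interactively.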
- Basic exploration of the dataset.
  - Distribution of ratings.
  - Gross collection in the US and India.
  - Popularity of genres.
  - Unique stars and the number of roles they played.
  - Unique directors and the number of movies they directed.
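The exploration steps above could be sketched with pandas roughly as follows. The column names (`genre`, `director`, `star`, `rating`) and the rows are hypothetical placeholders, not the actual schema or contents of the dataset.

```python
import pandas as pd

# Hypothetical toy data; column names are assumptions, not the real schema.
movies = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E"],
    "genre": ["Drama", "Action", "Drama", "Comedy", "Drama"],
    "director": ["Dir X", "Dir Y", "Dir X", "Dir Z", "Dir Y"],
    "star": ["Star P", "Star Q", "Star P", "Star P", "Star R"],
    "rating": [7.1, 5.4, 8.0, 6.2, 6.9],
})

# Summary statistics describing the distribution of ratings.
rating_summary = movies["rating"].describe()

# Popularity of genres: number of movies per genre.
genre_counts = movies["genre"].value_counts()

# Unique stars and the number of roles each played.
roles_per_star = movies["star"].value_counts()

# Unique directors and the number of movies each directed.
movies_per_director = movies["director"].value_counts()

print(genre_counts)
print(roles_per_star)
print(movies_per_director)
```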
- Factors affecting movie performance, based on hypotheses.
  - Relationship between gross and genre.
  - Relationship between rating and genre.
  - Relationship between rating and gross.
  - Relationship between
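One possible way to look at these relationships is to compare per-genre medians and to compute a rank correlation between rating and gross. This is only a sketch: the column names and values are invented, and the choice of median and Spearman correlation is an assumption rather than the method used in the analysis.

```python
import pandas as pd

# Hypothetical toy data; column names and values are assumptions.
movies = pd.DataFrame({
    "genre": ["Drama", "Action", "Drama", "Comedy", "Action", "Comedy"],
    "rating": [7.5, 6.0, 8.0, 6.5, 5.5, 7.0],
    "gross": [120.0, 300.0, 150.0, 80.0, 260.0, 90.0],
})

# Gross vs. genre and rating vs. genre: median per genre.
by_genre = movies.groupby("genre")[["gross", "rating"]].median()

# Rating vs. gross: Spearman rank correlation (robust to skewed gross values).
rating_gross_corr = movies["rating"].corr(movies["gross"], method="spearman")

print(by_genre)
print(rating_gross_corr)
```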
- Trend of directors.
  - How many movies were directed by each director?
  - What genres do the directors stick to?
  - What are the popular genres for new/upcoming directors?
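The director questions above might be approached with a director-by-genre table. In this sketch the `year` column, the 2015 cutoff for "new" directors, and all names are hypothetical choices made for illustration.

```python
import pandas as pd

# Hypothetical toy data; column names and values are assumptions.
movies = pd.DataFrame({
    "director": ["Dir X", "Dir X", "Dir X", "Dir Y", "Dir Y", "Dir Z"],
    "genre": ["Drama", "Drama", "Action", "Comedy", "Comedy", "Action"],
    "year": [2001, 2005, 2010, 2015, 2017, 2018],
})

# Number of movies per director, broken down by genre.
genre_table = pd.crosstab(movies["director"], movies["genre"])

# The genre each director has worked in most often.
favourite_genre = genre_table.idxmax(axis=1)

# "New" directors: first movie released in or after an arbitrary cutoff year.
NEW_DIRECTOR_CUTOFF = 2015  # assumption, chosen only for this example
first_year = movies.groupby("director")["year"].min()
new_directors = first_year[first_year >= NEW_DIRECTOR_CUTOFF].index

# Genre popularity among movies made by those new directors.
new_director_genres = (
    movies.loc[movies["director"].isin(new_directors), "genre"].value_counts()
)

print(genre_table)
print(favourite_genre)
print(new_director_genres)
```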
- Analysis of stars.
  - Who are the highest-grossing stars, comparing Indian and US gross?
  - Who are the best-rated stars?
  - Drawing inferences from the outliers using the IQR.
  - IQR (interquartile range)
  - Skewness
  - Using the median as a more robust average than the mean.
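To illustrate the outlier points above, here is a small sketch that flags outliers with the common 1.5 × IQR rule and compares the mean with the median on a skewed series. The gross values are invented, and the 1.5 multiplier is a conventional choice assumed here, not necessarily the threshold used in the analysis.

```python
import pandas as pd

# Invented gross values with one extreme blockbuster; illustration only.
gross = pd.Series([10, 12, 14, 15, 16, 18, 20, 22, 25, 400],
                  name="gross", dtype=float)

# Interquartile range: spread of the middle 50% of the values.
q1 = gross.quantile(0.25)
q3 = gross.quantile(0.75)
iqr = q3 - q1

# Values beyond 1.5 * IQR from the quartiles are treated as outliers.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = gross[(gross < lower) | (gross > upper)]

# Positive skewness indicates a long right tail.
skewness = gross.skew()

# The outlier pulls the mean upward, while the median barely moves.
mean = gross.mean()
median = gross.median()

print(outliers.tolist(), skewness, mean, median)
```

With a long right tail like this, the median stays close to the typical movie while the mean is dominated by the single extreme value.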