The aim of this capstone project is to build a machine learning tool capable of predicting the final domestic box office gross of a given film.
The data for this project was assembled from three sources. A Kaggle dataset from The Movie Database (TMDB) served as the base, supplemented with information from the Open Movie Database (OMDB) API and box office figures from Box Office Mojo. The sources were combined using pandas and exported to the first CSV file listed below. The second CSV file is a subset of the first, containing only films with box office data and the relevant features.
- movies_dataset.csv
- boxoffice.csv
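
The combine-and-subset step described above might look roughly like the sketch below. It is only an illustration, not the code used in the notebooks: the join key `imdb_id`, the column names `budget`, `runtime`, and `domestic_gross`, the helper function names, and the toy rows are all assumptions made for the example.

```python
import pandas as pd


def combine_sources(tmdb, omdb, mojo, key="imdb_id"):
    """Left-join the OMDB and Box Office Mojo columns onto the TMDB base table.

    The join key is an assumption; the real datasets may share a different column.
    """
    merged = tmdb.merge(omdb, on=key, how="left")
    return merged.merge(mojo, on=key, how="left")


def boxoffice_subset(movies, target="domestic_gross", features=("budget", "runtime")):
    """Keep only films with a known gross, restricted to the chosen columns."""
    columns = [*features, target]
    return movies.dropna(subset=[target])[columns].reset_index(drop=True)


# Toy stand-ins for the three sources (hypothetical rows, not real figures).
tmdb = pd.DataFrame(
    {"imdb_id": ["tt1", "tt2", "tt3"], "title": ["A", "B", "C"], "budget": [10.0, 20.0, 30.0]}
)
omdb = pd.DataFrame({"imdb_id": ["tt1", "tt2", "tt3"], "runtime": [90, 120, 100]})
mojo = pd.DataFrame({"imdb_id": ["tt1", "tt3"], "domestic_gross": [50.0, 75.0]})

movies = combine_sources(tmdb, omdb, mojo)   # all films, gross missing for some
boxoffice = boxoffice_subset(movies)         # only films with a box office figure

# In the project, the two frames would then be written out, for example:
# movies.to_csv("movies_dataset.csv", index=False)
# boxoffice.to_csv("boxoffice.csv", index=False)
```

A left join keeps every film from the base table even when no box office figure is available, which is why the second file has to be produced by dropping the rows with a missing target.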
The notebooks below are listed in the order of the project flow, from data visualization to data modelling.
- Capstone 1 Data Visualization.ipynb
- Capstone 1 Statistical Data Analysis.ipynb
- Capstone 1 In Depth Analysis.ipynb
The documents below follow the project flow from proposal to final report.
- Capstone 1 Project Proposal.pdf
- Capstone 1 Milestone Report.pdf
- Capstone 1 Consolidated Report.pdf
The slide deck below gives a brief overview of the entire project.
- Predicting Box Office Gross Slide Deck.pptx