Name		Name	Last commit message	Last commit date
parent directory ..
README.md		README.md
get_data.ipynb		get_data.ipynb
parse_data.ipynb		parse_data.ipynb
predict.ipynb		predict.ipynb

README.md

Project Overview

In this project, we'll predict who will win basketball games in the NBA. We'll start by web scraping NBA box scores. We'll read in and clean the html using BeautifulSoup. Then we'll parse the box scores into pandas DataFrames that we can use for machine learning. Then we'll do feature selection to identify good predictors, and train a machine learning model to make predictions. We'll end by computing rolling predictors and improving the model.

This is a complete end-to-end project, and we'll be using web scraping, data cleaning, and machine learning.

Project Steps

Scrape standings and box score data
Clean up html with BeautifulSoup
Parse the box scores to get a DataFrame
Run feature selection to identify predictors
Train an ML model
Compute rolling predictors to improve the model

Code

You can find the code for this project here

File overview:

get_data.ipynb - download the box scores from basketball reference.
parse_data.ipynb - clean the data to get a pandas DataFrame.
predict.ipynb - make predictions using machine learning

Prerequisites

To complete this project, you'll need to have a good understanding of:

Python syntax, including functions, if statements, and data structures
Data cleaning
Pandas syntax
Using Jupyter notebook

You'll also need to know the basics of machine learning.

Please make sure you've completed these Dataquest courses (or know the material) before trying this project:

Local Setup

Installation

To follow this project, please install the following locally:

JupyerLab
Python 3.8+
Python packages
- pandas
- scikit-learn
- beautifulsoup4
- playwright

Data

We'll download our dataset from Basketball Reference.

You can find the pre-scraped data here if you don't want to run the code.
If you only want to do the second part of this project (prediction), you can download the csv file of box scores here.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

nba_games

nba_games

README.md

Project Overview

Code

Prerequisites

Local Setup

Installation

Data

Files

nba_games

Directory actions

More options

Directory actions

More options

Latest commit

History

nba_games

Folders and files

parent directory

README.md

Project Overview

Code

Prerequisites

Local Setup

Installation

Data