CS2410 NBA PREDICTION

A predictive model that lists the probability of each NBA team being the next NBA champion

First, data about every team is pulled from Basketball Reference using sportsipy
- This data includes overall team data of every year, such as their field goal percentage, blocks, etc.
- Once every team's data is pulled through sportsipy, we use beautiful soup to scrape through the website again to add on the champion for every year, just because that data isn't available within sportsipy
- When all the data is correctly pulled and merged together, we save it as a csv for easier access later down the line
Once we pulled the data we need, it starts getting cleaned.
- Remove excess data
- Champion column values gets converted to 1's and 0's (1 if the team is that year's champion, 0 if they aren't)
- Drop the features that make the data noisy and decrease accuracy
  - Found the noisy data through BalanceRandomForestClassifier's "feature importances" property
  - This allows us to weigh each feature based on how much it effects the target, meaning we can easily figure out which features are noisy and which aren't
After the data is cleaned the model gets created
- Data is split into test data and training data
- Since it's unbalanced, it goes through the BalancedRandomForestClassifier, which acts both as a data balancer and as a model we can use to predict behavior
Evaluate the model using SKLearn's metrics
Create our list of predictions by passing the current year's data through the model

Create a virtual enviornment using python's venv package and set it up
- Create a directory to use as a virtual enviornment. I'm calling mine "venv"
```
mkdir venv
```
- Go into that directory and create a virtual enviornment inside of it
```
cd venv  
python3 -m venv venv
```
- Go back out the virtual enviornment, activate it, and then install all the packages from the req.txt file
```
cd ..
. venv/bin/activate
pip3 install -r req.txt
```
- Install the sportsipy package into your venv by following the instructions within its repo
Run!
- Type python3 proj.py into your terminal or run it in your IDE and you should be good to go!
  - If you don't already have data generated, make sure to run the create_curr_year_df() method and the create_data_csv() method
- The teams should be listed out in your terminal in order of most likely to least likely, and a CSV file called final.csv should also be created with some extra information

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
.gitignore		.gitignore
2023.csv		2023.csv
2023_data.csv		2023_data.csv
README.md		README.md
final.csv		final.csv
nba.csv		nba.csv
proj.py		proj.py
req.txt		req.txt

Provide feedback