In this project, I extracted data from my Spotify playlists using the Spotify API and used it to predict whether I would like a given song.
The main purpose of this project was to become more familiar with the process of working with data and with the NumPy, pandas, seaborn, scikit-learn and TensorFlow libraries in Python.
This repository contains three Jupyter Notebooks that document my work:
Part 1 - Extracting Spotify Audio Features
Parts 2 and 3 - Exploring Playlist Data and Learning Models
Part 4 - Predictions on Separate Dataset
The playlist data that I used is in a separate directory in the repository, but it needs to be in the same directory as the notebooks in order for the code to work.
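
One way to put the data next to the notebooks is a small copy helper. This is only a sketch: the function name `copy_playlist_data` is hypothetical, and it assumes the playlist data is stored as .json files, so adjust the pattern and the paths to match the repository.

```python
import shutil
from pathlib import Path


def copy_playlist_data(data_dir, notebook_dir, pattern="*.json"):
    """Copy every file matching `pattern` from data_dir into notebook_dir.

    Returns the names of the copied files in sorted order.
    """
    data_dir, notebook_dir = Path(data_dir), Path(notebook_dir)
    copied = []
    for src in sorted(data_dir.glob(pattern)):
        if src.is_file():
            # copy2 also preserves timestamps and other file metadata
            shutil.copy2(src, notebook_dir / src.name)
            copied.append(src.name)
    return copied
```

For example, `copy_playlist_data("data", ".")` would copy the files from a directory called `data` (a placeholder name) into the current directory.
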
This repository also includes the 'extractor.py' script, which creates .json files from your own playlists. The .json files can then be imported into a pandas DataFrame that contains each song's title, unique ID and audio features. To do this:
- Follow the instructions on the Spotipy documentation page to get your credentials for the credentials.json file.
- Open the Spotify playlist (no more than 100 songs), click on the circle with three dots, and go to Share -> Copy Spotify URI.
- Paste that Spotify URI into the playlists.json file next to the key 'uri'.
- If you want to classify the playlist by whether you like it or not, set the 'like' key to true or false.
- Add or remove playlists as you please.
- Run the script (it should take 2-3 minutes per playlist).
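
The playlist entries described in the steps above can be sketched in Python. The 'uri' and 'like' keys come from the steps, but the overall layout of playlists.json (a list of objects) is an assumption, the helper names are hypothetical, and the playlist IDs are placeholders. The check relies on the standard `spotify:playlist:<id>` URI form (older URIs also carry a `user:<name>` segment).

```python
import json


def playlist_id_from_uri(uri):
    """Return the playlist ID from a Spotify playlist URI, or raise ValueError."""
    parts = uri.strip().split(":")
    # Accept "spotify:playlist:<id>" and the older
    # "spotify:user:<name>:playlist:<id>" form.
    if (len(parts) < 3 or parts[0] != "spotify"
            or parts[-2] != "playlist" or not parts[-1]):
        raise ValueError(f"not a Spotify playlist URI: {uri!r}")
    return parts[-1]


def make_entry(uri, like):
    """Build one playlist entry with the 'uri' and 'like' keys."""
    playlist_id_from_uri(uri)  # validate the URI before storing it
    return {"uri": uri.strip(), "like": bool(like)}


# Placeholder URIs; replace them with the ones copied from Spotify.
entries = [
    make_entry("spotify:playlist:PLAYLIST_ID_ONE", True),
    make_entry("spotify:playlist:PLAYLIST_ID_TWO", False),
]

# json.dumps writes Python True/False as JSON true/false.
print(json.dumps(entries, indent=2))
```

Validating the URI up front catches a pasted track or album URI before the script spends time on API calls.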