Hourly Divvy Trip Predictor Service

Introduction

Chicago is home to nearly 3 million people and is currently the third most populous city in the US. Cook County, in which it sits, is the second most populous county in the country. A population of this size calls for a range of transport options. One of these is the city's Divvy bike-sharing system, with hundreds of stations and thousands of bikes and scooters. It is currently operated by the ride-sharing company Lyft, and has been in existence for 9 years. With so many trips taking place every day for so long, Divvy's historical trip data is an attractive source of time-series data (at least for me :D), especially because the data is updated monthly.

The Business Problem

How can we predict the number of trips that will start and end at various stations in the city each hour?

Being able to anticipate spikes in activity will enable Divvy to allocate bikes and scooters more efficiently over time.
This capability could help management plan changes to the scale of their services in a given area.
Models that predict customer activity in this way can also give management more confidence in their understanding of customer behaviour.

The Objective

Build a complete end-to-end machine learning system that culminates in a simple frontend which provides the desired predictions in an interactive manner.

System Design

Feature Pipeline

ingests the available recent monthly usage data
runs preprocessing procedures to produce time series data
transforms the time series data into training data
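The last two steps above can be sketched as follows. This is a minimal illustration and not the project's actual code: the function names `hourly_counts` and `sliding_windows`, the choice of three lag hours, and the sample timestamps are all assumptions made for the example. It counts trips per hour for one station (filling empty hours with zero), then slices the resulting series into lagged features and a next-hour target.

```python
from collections import Counter
from datetime import datetime, timedelta


def hourly_counts(start_times, first_hour, n_hours):
    """Count trips per hour, filling hours without trips with zero."""
    # Floor every timestamp to the start of its hour before counting.
    counts = Counter(
        t.replace(minute=0, second=0, microsecond=0) for t in start_times
    )
    return [counts.get(first_hour + timedelta(hours=i), 0) for i in range(n_hours)]


def sliding_windows(series, n_lags):
    """Turn a series into (features, target) pairs: n_lags past values -> next value."""
    features, targets = [], []
    for i in range(len(series) - n_lags):
        features.append(series[i:i + n_lags])
        targets.append(series[i + n_lags])
    return features, targets


# Made-up departure timestamps for a single hypothetical station.
trips = [
    datetime(2024, 1, 1, 8, 5),
    datetime(2024, 1, 1, 8, 40),
    datetime(2024, 1, 1, 9, 15),
    datetime(2024, 1, 1, 11, 59),
]
series = hourly_counts(trips, first_hour=datetime(2024, 1, 1, 8), n_hours=5)
features, targets = sliding_windows(series, n_lags=3)
print(series)    # hourly counts from 08:00 to 12:00
print(features)  # each row holds three consecutive hourly counts
print(targets)   # the count in the hour that follows each row
```

Filling empty hours with zero matters here: without it, a quiet hour would silently disappear from the series and the lagged features would no longer be one hour apart.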

Training Pipeline

trains models (with selected architectures) to predict hourly arrivals and departures
implements optional hyperparameter tuning during training
logs the best model to CometML's model registry
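To give a feel for the tune-then-keep-the-best pattern described above, here is a small stand-in sketch. It does not use the project's model architectures or CometML: the "model" is a plain moving-average forecast, the single hyperparameter is its window length, and all names (`tune_window`, `moving_average_forecast`) and the sample series are assumptions made for illustration. The candidate with the lowest validation error is the one a real pipeline would go on to register.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def moving_average_forecast(history, window):
    """Predict the next hour as the mean of the last `window` hours."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def tune_window(series, candidate_windows, n_validation):
    """Pick the window with the lowest MAE on the last n_validation hours."""
    split = len(series) - n_validation
    scores = {}
    for window in candidate_windows:
        # Forecast each validation hour using only the hours before it.
        predictions = [
            moving_average_forecast(series[:split + i], window)
            for i in range(n_validation)
        ]
        scores[window] = mean_absolute_error(series[split:], predictions)
    best = min(scores, key=scores.get)
    return best, scores


# Made-up hourly departure counts with a repeating pattern.
series = [2, 4, 2, 4, 2, 4, 2, 4, 2, 4]
best_window, scores = tune_window(series, candidate_windows=[1, 2, 3], n_validation=4)
print(best_window, scores)
```

Each validation forecast only sees earlier hours, which keeps the comparison honest for time-series data where shuffling rows would leak future information.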

Inference Pipeline

Provides code that allows for interaction with the Hopsworks Feature Store API.
Backfills the Hopsworks feature store with time series data and predictions.
Delivers these predictions through a simple Streamlit frontend.
Uses GitHub Actions to backfill the feature store with new predictions every hour.

Use the App

A containerised version of the app is available here.

Alternatively, you can build the project locally by doing the following:

Clone the repository:
```
$ git clone https://github.com/maadabrandon/Hourly-Divvy-Trip-Predictor
```
Install Poetry:
```
$ curl -sSL https://install.python-poetry.org | python3 -
```

Enter the project directory and run:
```
$ poetry install
```
Register free accounts on Hopsworks and CometML. Then copy your project names (for both platforms), API keys (again for both platforms), Comet workspace name, and email address into a .env file.
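Before running the remaining steps, it can help to confirm that the .env file contains everything listed above. The sketch below is one way to do that with the standard library only. The variable names in `REQUIRED_VARIABLES` are hypothetical placeholders, so check the project's source for the names it actually reads.

```python
from pathlib import Path

# Hypothetical variable names; the project may expect different ones.
REQUIRED_VARIABLES = [
    "HOPSWORKS_PROJECT_NAME",
    "HOPSWORKS_API_KEY",
    "COMET_PROJECT_NAME",
    "COMET_API_KEY",
    "COMET_WORKSPACE",
    "EMAIL",
]


def parse_env(text):
    """Read KEY=VALUE lines, skipping blank lines and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_variables(values, required=REQUIRED_VARIABLES):
    """Return the required names that are absent or empty."""
    return [name for name in required if not values.get(name)]


if __name__ == "__main__":
    env_path = Path(".env")
    if env_path.exists():
        print("Missing:", missing_variables(parse_env(env_path.read_text())))
    else:
        print("No .env file found in the current directory.")
```

Run it from the project directory; an empty list after "Missing:" means every placeholder name has a non-empty value.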
Backfill the Hopsworks feature groups with historical data:
```
$ make backfill-features
```
Run the training pipeline:
```
$ make train-all
```
Backfill the Hopsworks feature groups with predictions:
```
$ make backfill-predictions
```
View the frontend:
```
$ make frontend
```

