iKintosh/ETNA_EDA


Ride Sharing App EDA

This project focuses on analyzing ride sharing app data to predict locations of high value.

Results

The findings of this EDA can be found in eda_notebook, which is also available as an nbviewer notebook.

How to install

If you want to explore the data or the report, you can install and set up the environment as follows.

  1. Clone the repository.
  2. In the terminal run:
    poetry install
  3. Done

I recommend using pyenv to set up the correct Python version.

Requirements

  • Python >=3.8.1,<3.10.0
  • Poetry
  • Other requirements are listed in the pyproject.toml file

Data Description

The data used in this project is in the form of a .csv file and contains the following columns:

  • start_time: the time when the order was made
  • start_lat: latitude of the location where the order was made
  • start_lng: longitude of the location where the order was made
  • end_lat: latitude of the destination point
  • end_lng: longitude of the destination point
  • ride_value: the monetary value of the ride
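
The columns above can be read into typed records with the standard library alone. The following is a minimal sketch, not the code used in eda_notebook: the `Ride` class, the `read_rides` helper, and the sample rows are hypothetical, and the timestamp format (ISO 8601) is an assumption about the data.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Ride:
    """One row of the ride CSV, with the columns described above."""
    start_time: datetime
    start_lat: float
    start_lng: float
    end_lat: float
    end_lng: float
    ride_value: float


def read_rides(handle):
    """Parse rows from an open CSV file (or any text stream) into Ride records."""
    rides = []
    for row in csv.DictReader(handle):
        rides.append(
            Ride(
                # Assumption: timestamps are ISO 8601 strings.
                start_time=datetime.fromisoformat(row["start_time"]),
                start_lat=float(row["start_lat"]),
                start_lng=float(row["start_lng"]),
                end_lat=float(row["end_lat"]),
                end_lng=float(row["end_lng"]),
                ride_value=float(row["ride_value"]),
            )
        )
    return rides


# Made-up sample rows for illustration only; not taken from the project data.
SAMPLE_CSV = """start_time,start_lat,start_lng,end_lat,end_lng,ride_value
2022-03-06 15:02:39,59.403,24.682,59.451,24.753,3.5
2022-03-06 15:47:10,59.431,24.715,59.412,24.661,1.25
"""

rides = read_rides(io.StringIO(SAMPLE_CSV))
print(len(rides), "rides parsed; first value:", rides[0].ride_value)
```

To read a real file, pass an open file handle instead of the `io.StringIO` wrapper, for example `open(path, newline="")`, where `path` points to your copy of the .csv file.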

EDA Purpose

The availability of supply for ride sharing services depends on how long it takes drivers to reach customers. We want to attract drivers to the areas with the highest ride value. The purpose of this EDA is to determine whether areas of high ride value can be predicted using only the available data.

Methodology

The data is aggregated into clusters so that demand can be predicted by location. The ETNA library is used for the forecasting tasks in this work.

Two different approaches to clustering and forecasting are used. The most promising one uses:

  • Uber H3 for clustering.
  • A CatBoost model for forecasting.
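
The aggregation step described above can be illustrated with a small, dependency-free sketch. It deliberately swaps in a plain square latitude/longitude grid for H3 hexagons, so it shows the idea (spatial cell plus hourly bucket, with ride values summed) rather than the project's actual pipeline. The helper names, the 0.01 degree cell size, and the sample rides are hypothetical; the `timestamp`, `segment`, and `target` keys are chosen to resemble a long-format time series table and are an assumption about how the data is prepared for forecasting.

```python
import math
from collections import defaultdict
from datetime import datetime


def grid_cell(lat, lng, cell_deg=0.01):
    """Map a coordinate to a square grid cell id (a simple stand-in for an H3 index)."""
    return f"{math.floor(lat / cell_deg)}_{math.floor(lng / cell_deg)}"


def aggregate(rides, cell_deg=0.01):
    """Sum ride values per (hour, cell) and return one row per pair, sorted by hour and cell."""
    totals = defaultdict(float)
    for start_time, lat, lng, value in rides:
        # Truncate the order time to the start of its hour.
        hour = start_time.replace(minute=0, second=0, microsecond=0)
        totals[(hour, grid_cell(lat, lng, cell_deg))] += value
    return [
        {"timestamp": hour, "segment": cell, "target": total}
        for (hour, cell), total in sorted(totals.items())
    ]


# Made-up rides as (start_time, start_lat, start_lng, ride_value) tuples.
rides = [
    (datetime(2022, 3, 6, 15, 2), 59.403, 24.682, 3.5),
    (datetime(2022, 3, 6, 15, 47), 59.407, 24.688, 1.25),
    (datetime(2022, 3, 6, 16, 5), 59.431, 24.715, 2.0),
]

series = aggregate(rides)
for row in series:
    print(row["timestamp"], row["segment"], row["target"])
```

Each distinct `segment` then forms its own time series of hourly ride value, which is the shape a per-region forecasting model works on. Small or sparse cells produce series with many empty hours, which is one reason the choice of clustering matters.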

Further Work

  • Rearrange the regions manually
  • Collect more data on small regions
  • Try an ensemble of models
  • Use end_lat, end_lng columns
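
As one possible way to start using the end_lat and end_lng columns, the straight-line trip distance can be derived from the start and end coordinates. This is only a sketch of a candidate feature, not something the current EDA computes; `haversine_km` is a hypothetical helper and the coordinates in the usage line are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Made-up start and end coordinates for a single ride.
distance = haversine_km(59.403, 24.682, 59.451, 24.753)
print(f"trip distance: {distance:.2f} km")
```

Comparing this distance with ride_value per region could show whether high-value areas are driven by long trips or by many short ones.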
