An end-to-end data engineering project leveraging MAGE for transformation, GCP's BigQuery for storage, and Looker for insightful visualization using Uber trip data

tejas-jm/UberFlow-Analytics

UberFlow Analytics

An end-to-end data engineering project that uses Mage for transformation, Google BigQuery for storage, and Looker Studio for visualization of Uber trip data.

Project Overview

This project covers the end-to-end process of working with Uber trip data, including:

  • Extracting raw data from CSV files.
  • Transforming the data into structured dimension and fact tables via Mage.
  • Loading the transformed data into Google BigQuery for efficient querying.
  • Creating visualizations in Looker Studio (formerly Google Data Studio).
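The extract and transform steps above can be sketched as follows. This is a minimal illustration, not the project's actual Mage blocks: it uses only the Python standard library so it runs on its own, the inline sample rows are made up, and the function names (`extract`, `build_datetime_dim`, `build_fact_table`) and surrogate-key columns (`datetime_id`, `trip_id`) are assumptions chosen for the example. The source column names follow the yellow taxi data dictionary.

```python
import csv
import io
from datetime import datetime

# Tiny inline sample standing in for a raw TLC CSV file (values are made up).
RAW_CSV = """tpep_pickup_datetime,tpep_dropoff_datetime,passenger_count,trip_distance,fare_amount
2016-03-01 00:00:00,2016-03-01 00:07:55,1,2.50,9.0
2016-03-01 00:00:00,2016-03-01 00:07:55,3,1.80,8.5
2016-03-01 08:15:30,2016-03-01 08:40:02,2,7.10,24.5
"""

DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def extract(csv_text):
    """Read raw trip rows from CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def build_datetime_dim(rows):
    """Build one dimension row per distinct (pickup, dropoff) pair."""
    dim = {}
    for row in rows:
        key = (row["tpep_pickup_datetime"], row["tpep_dropoff_datetime"])
        if key in dim:
            continue  # this pair already has a surrogate key
        pickup = datetime.strptime(key[0], DATETIME_FORMAT)
        dim[key] = {
            "datetime_id": len(dim),  # surrogate key (hypothetical column name)
            "tpep_pickup_datetime": key[0],
            "tpep_dropoff_datetime": key[1],
            "pickup_hour": pickup.hour,
            "pickup_weekday": pickup.weekday(),  # Monday is 0
        }
    return dim


def build_fact_table(rows, datetime_dim):
    """Keep the measures and replace the timestamps with the dimension key."""
    facts = []
    for trip_id, row in enumerate(rows):
        key = (row["tpep_pickup_datetime"], row["tpep_dropoff_datetime"])
        facts.append({
            "trip_id": trip_id,
            "datetime_id": datetime_dim[key]["datetime_id"],
            "passenger_count": int(row["passenger_count"]),
            "trip_distance": float(row["trip_distance"]),
            "fare_amount": float(row["fare_amount"]),
        })
    return facts


rows = extract(RAW_CSV)
datetime_dim = build_datetime_dim(rows)
fact_table = build_fact_table(rows, datetime_dim)
```

The fact table keeps only numeric measures plus foreign keys, so descriptive attributes such as the pickup hour live once in the dimension and are joined in at query time.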

Architecture

Technology Used

Programming Language - Python

Google Cloud Platform

  1. Google Cloud Storage
  2. Compute Engine instance
  3. BigQuery
  4. Looker Studio

Data Pipeline Tool - Mage AI

Dataset Used

TLC Trip Record Data

Yellow and green taxi trip records include fields capturing pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types, and driver-reported passenger counts.

More information about the dataset can be found here:

  1. Website - https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page
  2. Data Dictionary - https://www.nyc.gov/assets/tlc/downloads/pdf/data_dictionary_trip_records_yellow.pdf
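Several fields in the trip records, such as rate types and payment types, are stored as numeric codes. A minimal sketch of turning those codes into small lookup dimensions is shown below. The code-to-label mappings reflect the yellow taxi data dictionary as commonly published, so verify them against the PDF linked above before relying on them. The helper names (`build_lookup_dim`, `describe_code`) and the output column names are assumptions made for this example.

```python
# Code-to-label mappings; check against the linked data dictionary,
# since newer versions of the dictionary may add further codes.
PAYMENT_TYPE_NAMES = {
    1: "Credit card",
    2: "Cash",
    3: "No charge",
    4: "Dispute",
    5: "Unknown",
    6: "Voided trip",
}

RATE_CODE_NAMES = {
    1: "Standard rate",
    2: "JFK",
    3: "Newark",
    4: "Nassau or Westchester",
    5: "Negotiated fare",
    6: "Group ride",
}


def build_lookup_dim(code_names, id_column, name_column):
    """Turn a code-to-label mapping into a list of dimension rows."""
    return [
        {id_column: code, name_column: name}
        for code, name in sorted(code_names.items())
    ]


def describe_code(code_names, raw_value):
    """Map a raw CSV value to its label, tolerating blanks and unknown codes."""
    try:
        # Some exports store the codes as floats such as "1.0".
        code = int(float(raw_value))
    except (TypeError, ValueError, OverflowError):
        return "Missing"
    return code_names.get(code, f"Unrecognized ({code})")


# Hypothetical output column names for the two lookup dimensions.
payment_type_dim = build_lookup_dim(
    PAYMENT_TYPE_NAMES, "payment_type", "payment_type_name"
)
rate_code_dim = build_lookup_dim(RATE_CODE_NAMES, "rate_code_id", "rate_code_name")
```

Keeping the labels in lookup dimensions means dashboards can show readable names while the fact table stores only the compact integer codes.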

Data Model
