fonsecagabriella/data_engineering

Data Engineering Zoomcamp

The Data Engineering Zoomcamp is a free 9-week course that teaches the fundamentals of building data pipelines.

I gained hands-on experience with Docker, Terraform, Kestra, dbt, Spark, and Kafka, covering topics from infrastructure setup to streaming data.
The final project consolidated what I learned and gave me a chance to apply it all.

👩🏽‍💻 Link to the course

Module 1: Containerization and Infrastructure as Code

- Introduction to GCP
- Docker and Docker Compose
- Running PostgreSQL with Docker
- Infrastructure setup with Terraform

Module 2: Workflow Orchestration

- Data Lakes and Workflow Orchestration
- Workflow orchestration with Kestra

Workshop 1: Data Ingestion

- API reading and pipeline scalability
- Data normalization and incremental loading

Module 3: Data Warehousing

- Introduction to BigQuery
- Partitioning, clustering, and best practices
- Machine learning in BigQuery

Module 4: Analytics Engineering

- dbt (data build tool) with PostgreSQL & BigQuery
- Testing, documentation, and deployment
- Data visualization with Metabase

Module 5: Batch Processing

- Introduction to Apache Spark
- DataFrames and SQL
- Internals of GroupBy and Joins

Module 6: Streaming

- Introduction to Kafka
- Kafka Streams and KSQL
- Schema management with Avro
