
night-fury-me/advanced-data-engineering-fau

 
 

This repository contains data engineering and data science projects and exercises built on open data sources for the AMSE/SAKI course, taught by the FAU Chair for Open-Source Software (OSS) in the summer semester of 2024. It is a fork of the 2023-amse-template repository.

Influence of Solar Activity on Earth's Climate

Project Logo (image reference: earth.com)

Introduction

This project explores the possible connection between solar activity, such as solar flares, and climate change on Earth. Through statistical analysis of historical data, it aims to determine whether there is a clear relationship between solar events and observable shifts in global climate patterns. The question matters because the answer can help refine climate models and improve our capacity to forecast and address the consequences of climate change.

Research Question

Is there a relationship between solar activity (solar flares) and climate change on Earth?

Project Structure

project
├── ETL
│   ├── __init__.py
│   ├── extract
│   │   ├── __init__.py
│   │   ├── csv_extractor.py
│   │   └── extractor.py
│   ├── load
│   │   ├── __init__.py
│   │   ├── loader.py
│   │   └── sqlite_loader.py
│   └── transform
│       ├── __init__.py
│       ├── co2_concentration_data_transformer.py
│       ├── sea_level_data_transformer.py
│       ├── solar_flare_data_transformer.py
│       ├── temperature_data_transformer.py
│       └── transformer.py
├── config
│   └── pipeline_config.yaml
├── data
│   ├── raw
│   ├── sink
│   └── transformed
├── data-analysis
│   ├── data-analysis.ipynb
│   └── plots
├── data-report
├── logger
│   ├── __init__.py
│   ├── base_logger.py
│   ├── console_logger.py
│   ├── file_logger.py
│   └── logger.py
├── logs
│   └── 20240701_043653.log
├── notebooks
├── tasks
│   ├── __init__.py
│   └── task.py
├── tests
│   ├── __init__.py
│   ├── mock
│   │   ├── __init__.py
│   │   ├── data
│   │   │   ├── __init__.py
│   │   │   ├── co2_concentration_mock_data.py
│   │   │   ├── solar_flare_mock_data.py
│   │   │   ├── temperature_change_mock_data.py
│   │   │   ├── raw
│   │   │   ├── sink
│   │   │   └── transformed
│   │   └── mock_logger.py
│   ├── test_system.py
│   └── test_transform.py
├── utils
│   ├── __init__.py
│   ├── config.py
│   └── converters.py
├── analysis-report.pdf
├── data-report.pdf
├── pipeline.py
├── pipeline.sh
├── tests.py
├── tests.sh
├── project-plan.md
└── requirements.txt
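
The layout above suggests an extract/transform/load split with one base module per stage (extractor.py, transformer.py, loader.py) and concrete implementations next to them. The following is only a minimal sketch of how such a split could look, using the standard library alone. All class names, method names, and signatures here (Extractor, CsvExtractor, DropIncompleteRows, SqliteLoader, run_pipeline) are assumptions made for illustration and are not taken from the repository's code.

```python
import csv
import sqlite3
from abc import ABC, abstractmethod


class Extractor(ABC):
    """Hypothetical base class for the extract stage."""

    @abstractmethod
    def extract(self, source):
        """Return a list of row dictionaries read from `source`."""


class CsvExtractor(Extractor):
    def extract(self, source):
        # Read the whole CSV file into memory as a list of dicts.
        with open(source, newline="", encoding="utf-8") as handle:
            return list(csv.DictReader(handle))


class Transformer(ABC):
    """Hypothetical base class for the transform stage."""

    @abstractmethod
    def transform(self, rows):
        """Return a new list of rows."""


class DropIncompleteRows(Transformer):
    def transform(self, rows):
        # Keep only rows in which every field has a non-empty value.
        return [
            row for row in rows
            if all(value not in (None, "") for value in row.values())
        ]


class Loader(ABC):
    """Hypothetical base class for the load stage."""

    @abstractmethod
    def load(self, rows, table):
        """Persist `rows` and return the number of rows written."""


class SqliteLoader(Loader):
    def __init__(self, db_path):
        self.db_path = db_path

    def load(self, rows, table):
        if not rows:
            return 0
        columns = list(rows[0].keys())
        column_sql = ", ".join(f'"{name}"' for name in columns)
        placeholders = ", ".join("?" for _ in columns)
        conn = sqlite3.connect(self.db_path)
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    f'CREATE TABLE IF NOT EXISTS "{table}" ({column_sql})'
                )
                conn.executemany(
                    f'INSERT INTO "{table}" ({column_sql}) '
                    f"VALUES ({placeholders})",
                    [tuple(row[name] for name in columns) for row in rows],
                )
        finally:
            conn.close()
        return len(rows)


def run_pipeline(extractor, transformers, loader, source, table):
    """Chain the three stages and return the number of rows loaded."""
    rows = extractor.extract(source)
    for transformer in transformers:
        rows = transformer.transform(rows)
    return loader.load(rows, table)
```

Keeping each stage behind a small interface like this would let a dataset-specific transformer (for example one per data source) be swapped in without touching extraction or loading. Whether the actual project is organized this way is not stated here.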

Installation

Using conda

To install the necessary dependencies with conda, run the following command after cloning the repository:

conda env create -f environment.yml

This command creates a conda environment named made-env (or whatever name is set in your environment.yml file) and installs all required packages.

Using pip

If you prefer using pip, you can install the dependencies listed in requirements.txt. Make sure you have Python and pip installed, then run:

pip install -r requirements.txt

Running the Data Pipeline

You can run the data pipeline using either pipeline.sh or pipeline.py:

cd project
chmod +x pipeline.sh
./pipeline.sh
# or
python pipeline.py

Running Unit Tests

To run unit tests for the data pipeline, use either tests.sh or tests.py:

cd project
chmod +x tests.sh
./tests.sh
# or
python tests.py
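
For readers unfamiliar with the shape of such tests, here is a minimal sketch of a unit test for a transform step, written with the standard unittest module. The function normalize_year and the test class are hypothetical and exist only for this illustration. They are not the contents of test_transform.py or tests.py.

```python
import unittest


def normalize_year(row):
    """Hypothetical transform step: cast the 'year' field to int."""
    return {**row, "year": int(row["year"])}


class NormalizeYearTest(unittest.TestCase):
    def test_casts_year_to_int(self):
        row = {"year": "2003", "value": "0.4"}
        self.assertEqual(
            normalize_year(row), {"year": 2003, "value": "0.4"}
        )

    def test_rejects_non_numeric_year(self):
        # int("n/a") raises ValueError, which the test expects.
        with self.assertRaises(ValueError):
            normalize_year({"year": "n/a"})


def run_suite():
    """Load and run the test case, returning the unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        NormalizeYearTest
    )
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_suite() returns a result object whose wasSuccessful() method reports whether every test passed.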

Test Automation with GitHub Actions

Whenever a commit is pushed to this repository, GitHub Actions automatically runs the unit tests for the data pipeline. The status of the test run is posted to the Slack channel #made-ci-cd, as shown in the image below:

slack-webhook.png
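
How the notification is actually sent is not described here. As a rough sketch under stated assumptions, a script could post the test status to a Slack incoming webhook, which accepts a JSON body with a "text" field. The function names, the message format, and the idea of passing the webhook URL in as an argument are all assumptions for illustration. The repository's workflow may use a ready-made GitHub Action instead.

```python
import json
import urllib.request


def build_slack_payload(status, commit_sha, repo):
    """Build a hypothetical status message for a Slack incoming webhook."""
    label = "PASSED" if status == "success" else "FAILED"
    return {"text": f"[{repo}] tests {label} for commit {commit_sha[:7]}"}


def build_request(webhook_url, payload):
    """Wrap the payload in a JSON POST request (nothing is sent yet)."""
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def notify(webhook_url, payload):
    """Send the request and return the HTTP status code."""
    request = build_request(webhook_url, payload)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

In a CI job, the webhook URL would normally come from a repository secret rather than being hard-coded.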

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Exercise Badges (Part of the course-work)
