This repository contains data engineering and data science projects and exercises built on open data sources as part of the AMSE/SAKI course, taught by the FAU Chair for Open-Source Software (OSS) in the Summer '24 semester. The repository is forked from the 2023-amse-template repository.
The project explores the possible connection between solar activity, such as solar flares, and climate change on Earth. Through statistical analysis of historical data, the goal is to determine whether a clear relationship exists between solar events and the observed shifts in global climate patterns. The question matters because the answer could help refine climate models and improve our capacity to forecast and address the consequences of climate change.
Is there any relationship between solar activity (Solar Flares) and climate change on Earth?
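As a rough illustration of the kind of statistical check this question implies, the sketch below computes a Pearson correlation coefficient between two yearly series. It is not taken from the project's notebook: the function name `pearson_r`, the variable names, and the numbers are all hypothetical placeholders, not real measurements.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values for illustration only, NOT real measurements.
yearly_flare_counts = [10, 20, 30, 40]
yearly_temperature_anomaly = [0.1, 0.2, 0.3, 0.4]
r = pearson_r(yearly_flare_counts, yearly_temperature_anomaly)
```

A coefficient near +1 or -1 indicates a strong linear association and a value near 0 indicates none. A strong correlation alone would not show that solar activity drives climate change.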
project
├── ETL
│ ├── __init__.py
│ ├── extract
│ │ ├── __init__.py
│ │ ├── csv_extractor.py
│ │ └── extractor.py
│ ├── load
│ │ ├── __init__.py
│ │ ├── loader.py
│ │ └── sqlite_loader.py
│ └── transform
│   ├── __init__.py
│   ├── co2_concentration_data_transformer.py
│   ├── sea_level_data_transformer.py
│   ├── solar_flare_data_transformer.py
│   ├── temperature_data_transformer.py
│   └── transformer.py
├── config
│ └── pipeline_config.yaml
├── data
│ ├── raw
│ ├── sink
│ └── transformed
├── data-analysis
│ ├── data-analysis.ipynb
│ └── plots
├── data-report
├── logger
│ ├── __init__.py
│ ├── base_logger.py
│ ├── console_logger.py
│ ├── file_logger.py
│ └── logger.py
├── logs
│ └── 20240701_043653.log
├── notebooks
├── tasks
│ ├── __init__.py
│ └── task.py
├── tests
│ ├── __init__.py
│ ├── mock
│ │ ├── __init__.py
│ │ ├── data
│ │ │ ├── __init__.py
│ │ │ ├── co2_concentration_mock_data.py
│ │ │ ├── solar_flare_mock_data.py
│ │ │ ├── temperature_change_mock_data.py
│ │ │ ├── raw
│ │ │ ├── sink
│ │ │ └── transformed
│ │ └── mock_logger.py
│ ├── test_system.py
│ └── test_transform.py
├── utils
│ ├── __init__.py
│ ├── config.py
│ └── converters.py
├── analysis-report.pdf
├── data-report.pdf
├── pipeline.py
├── pipeline.sh
├── tests.py
├── tests.sh
├── project-plan.md
└── requirements.txt
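The ETL package is split into extract, load, and transform modules, each with a generic base file (extractor.py, loader.py, transformer.py) next to more specific ones. The sketch below shows one way such a layout could fit together, using only the standard library. The class names, method signatures, column names, and the `run_pipeline` helper are assumptions made for illustration and are not the project's actual code.

```python
import csv
import io
import sqlite3
from abc import ABC, abstractmethod


class Extractor(ABC):
    @abstractmethod
    def extract(self):
        """Return the raw rows as a list of dicts."""


class Transformer(ABC):
    @abstractmethod
    def transform(self, rows):
        """Return cleaned rows as a list of tuples."""


class Loader(ABC):
    @abstractmethod
    def load(self, rows):
        """Persist the cleaned rows."""


class CsvTextExtractor(Extractor):
    """Hypothetical extractor that parses CSV text held in memory."""

    def __init__(self, text):
        self.text = text

    def extract(self):
        return list(csv.DictReader(io.StringIO(self.text)))


class YearValueTransformer(Transformer):
    """Hypothetical transformer: keep rows with a valid year and value."""

    def transform(self, rows):
        cleaned = []
        for row in rows:
            try:
                cleaned.append((int(row["year"]), float(row["value"])))
            except (KeyError, TypeError, ValueError):
                continue  # skip rows with missing or unparsable fields
        return cleaned


class SqliteLoader(Loader):
    """Hypothetical loader writing (year, value) rows to a SQLite table."""

    def __init__(self, connection, table):
        self.connection = connection
        self.table = table  # assumed to be a trusted, hard-coded name

    def load(self, rows):
        self.connection.execute(
            f'CREATE TABLE IF NOT EXISTS "{self.table}" (year INTEGER, value REAL)'
        )
        self.connection.executemany(
            f'INSERT INTO "{self.table}" VALUES (?, ?)', rows
        )
        self.connection.commit()


def run_pipeline(extractor, transformer, loader):
    """Run extract -> transform -> load and return the number of rows loaded."""
    rows = transformer.transform(extractor.extract())
    loader.load(rows)
    return len(rows)


# Usage with made-up sample data and an in-memory database.
connection = sqlite3.connect(":memory:")
sample_csv = "year,value\n2000,1.5\n2001,n/a\n2002,2.5\n"
loaded = run_pipeline(
    CsvTextExtractor(sample_csv),
    YearValueTransformer(),
    SqliteLoader(connection, "demo"),
)
```

Keeping each stage behind a small base class means a new data source would only need a new transformer, while the extract and load stages could be reused.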
Using conda
To install necessary dependencies using conda, execute the following command after cloning the repository:
conda env create -f environment.yml
This command will create a conda environment named made-env (or any other name specified in your environment.yml file) and install all required packages.
Using pip
If you prefer using pip, you can install the dependencies listed in requirements.txt. Make sure you have Python and pip installed, then run:
pip install -r requirements.txt
You can run the data pipeline using either pipeline.sh or pipeline.py:
cd project
chmod +x pipeline.sh
./pipeline.sh
OR
python pipeline.py
To run unit tests for the data pipeline, use either tests.sh or tests.py:
cd project
chmod +x tests.sh
./tests.sh
OR
python tests.py
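For readers unfamiliar with how such tests are usually written, here is a minimal sketch of a transform-step unit test using Python's built-in unittest module. The function `drop_incomplete_rows`, the test class, and the `run_suite` helper are hypothetical and only stand in for the kind of checks a file like test_transform.py might contain.

```python
import unittest


def drop_incomplete_rows(rows):
    """Hypothetical transform step: keep only rows with no missing values."""
    return [
        row for row in rows
        if all(value not in (None, "") for value in row.values())
    ]


class DropIncompleteRowsTest(unittest.TestCase):
    def test_removes_rows_with_missing_values(self):
        rows = [{"year": 2000, "value": 1.0}, {"year": 2001, "value": None}]
        self.assertEqual(
            drop_incomplete_rows(rows), [{"year": 2000, "value": 1.0}]
        )

    def test_keeps_complete_rows(self):
        rows = [{"year": 2000, "value": 1.0}, {"year": 2001, "value": 2.0}]
        self.assertEqual(drop_incomplete_rows(rows), rows)


def run_suite():
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DropIncompleteRowsTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite()` returns a result object whose `wasSuccessful()` method reports whether every test passed, which is what a wrapper script could use to set its exit code.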
Whenever a commit is pushed to this repository, GitHub Actions automatically runs the unit tests for the data pipeline. The status of the test run is posted to the Slack channel #made-ci-cd, as shown in the image below:
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.