IES UDE data sets used for research and teaching

es-ude/data

UDE Intelligent Embedded Systems (IES) Data

The library collects utilities to download data used for research and teaching. In the future we might add some very basic tools for preprocessing.

Installation

Simple Approach

Make sure you have Python 3.10 or newer installed. Then you can install the package via

$ pip install git+ssh://git@github.com:es-ude/data.git

Recommended Approach

We recommend installing the uv package manager. Afterwards you can run

$ uv init --python 3.12 my-project
$ cd my-project
$ uv python install python3.12
$ uv add git+ssh://git@github.com:es-ude/data.git
$ uv sync
$ source .venv/bin/activate

Usage

Downloading

You can use

from iesude.data import MitBihAtrialFibrillationDataSet as AFDataSet

d = AFDataSet.download("my_data_dir")

This downloads the data from our public sciebo share into a temporary directory and extracts the contents into a folder called my_data_dir.

The download is skipped if that folder already exists, so if you want to download your data again you need to delete the directory first.
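The behaviour described above (fetch into a temporary directory, extract into the target folder, skip everything when the folder already exists) can be sketched with the standard library alone. This is a minimal illustration, not the code used by iesude.data: the helper name download_and_extract and its url parameter are hypothetical, and only zip archives are handled.

```python
import tempfile
import urllib.request
import zipfile
from pathlib import Path


def download_and_extract(url: str, target_dir: str) -> Path:
    """Fetch a zip archive from `url` and unpack it into `target_dir`.

    Hypothetical helper for illustration; not part of iesude.data.
    """
    target = Path(target_dir)
    # Directory presence is the only check: an existing folder is
    # treated as "already downloaded", so nothing is fetched again.
    if target.exists():
        return target
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "archive.zip"
        urllib.request.urlretrieve(url, str(archive))
        with zipfile.ZipFile(archive) as zf:
            # extractall creates the target folder if it is missing.
            zf.extractall(target)
    return target
```

With a check like this, removing the folder (for example with shutil.rmtree("my_data_dir")) is what triggers a fresh download on the next call.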

Adding Your Data

You need write access to our sciebo share. Upload your dataset, preferably as a zip file, since support for compressed tar archives is not implemented yet. Assuming you stored your data under "myproject/dataset01.zip", you can define your new dataset like so

from iesude.data import DataSet, Zip as ZipArchive

MyNewDataSet = DataSet(file_path="myproject/dataset01.zip", file_type=ZipArchive)

Features

  • automatic download from UDE IES sciebo share
  • automatic archive extraction into a given folder
  • supported file types are
    • .zip archive
    • uncompressed .tar archive
    • plain files (download a file directly to your folder without extraction)
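To make the three supported file types concrete, here is a rough sketch of how extraction could branch on the type. The FileType enum and the unpack function are hypothetical stand-ins written for this example; they are not the classes exported by iesude.data.

```python
import shutil
import tarfile
import zipfile
from enum import Enum
from pathlib import Path


class FileType(Enum):
    """Hypothetical stand-in for the library's file type markers."""

    ZIP = "zip"
    TAR = "tar"
    PLAIN = "plain"


def unpack(source: Path, target_dir: Path, file_type: FileType) -> None:
    """Place the contents of `source` into `target_dir`."""
    target_dir.mkdir(parents=True, exist_ok=True)
    if file_type is FileType.ZIP:
        with zipfile.ZipFile(source) as zf:
            zf.extractall(target_dir)
    elif file_type is FileType.TAR:
        # Mode "r:" opens uncompressed tar archives only, so a
        # .tar.gz or .tar.xz file raises tarfile.ReadError here.
        with tarfile.open(source, mode="r:") as tf:
            tf.extractall(target_dir)
    else:
        # Plain files are copied as-is, without any extraction.
        shutil.copy2(source, target_dir / source.name)
```

For archives from untrusted sources, recent Python versions accept an extraction filter, e.g. tf.extractall(target_dir, filter="data").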

Todo

  • tar.gz
  • tar.xz
  • override share endpoint in user config
  • upload data sets
  • automatically put descriptions/readmes for uploaded datasets in github repo
  • autogenerate classes when uploading a data set
  • upload checksums for datasets
  • use checksums instead of directory presence to decide whether or not to download datasets
  • download progress bar
  • logging
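As a sketch of the checksum item on this list, a download decision based on a SHA-256 digest rather than on directory presence might look as follows. Nothing here exists in the library yet: the function names, the choice of SHA-256, and the idea of comparing against a locally kept archive are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(archive: Path, expected_sha256: str) -> bool:
    """True when the local archive is missing or its digest differs.

    Hypothetical helper; `expected_sha256` would come from a
    checksum published next to the dataset.
    """
    if not archive.is_file():
        return True
    return sha256_of(archive) != expected_sha256.lower()
```

Compared with the directory check, this would also catch truncated downloads and datasets that were updated on the share.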

Contribution

To contribute, clone the repository and install uv (link in the install section). Also install pre-commit, e.g., like so

$ uv tool install pre-commit

Alternatively, you can use a devenv environment for a reproducible, declarative, and easy-to-use setup. It will take care of

  • installing uv and calling it to install Python and the dev dependencies
  • most importantly, installing and setting up pre-commit

Follow the two steps in the devenv getting started guide.

Additionally, we recommend you use direnv to activate devenv automatically upon entering the project in a shell (also supported by several IDE/Editor plugins).
