Skip to content

An open source benchmark for semi-supervised change point detection of type 1 diabetic meals from continuous glucose monitor time series data. Originally created to present to PyData Global 2024 in association with sktime and skchange.

License

Notifications You must be signed in to change notification settings

Blood-Glucose-Control/t1d-change-point-detection-benchmark

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

18 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

T1D Change Point Detection Benchmark

An open source benchmark for semi-supervised change point detection of type 1 diabetic meals from continuous glucose monitor time series data. Originally created to present to PyData Global 2024 in association with sktime and skchange.

Versioning

NOTE: The benchmark is still under heavy development and is subject to change. Consider its current state as alpha v.0.0.1

  • 0.0.1 -> Patch/bug fixes
  • 0.1.0 -> Minor data set updates where we add new patients, cgms, insulin pumps, or meal timing regimens.
  • 1.0.0 -> Major data set updates where we add new modeling tasks, like new transfer learning settings.

Dataset Metadata

property value
name T1D Semi-Supervised Change Point Detection Benchmark
url https://github.com/Blood-Glucose-Control/t1d-change-point-detection-benchmark
sameAs https://github.com/Blood-Glucose-Control/t1d-change-point-detection-benchmark
description
citation
license

Data Directory Structure

Overview

This repository contains three main data directories: raw, processed, and obfuscated, each serving different purposes in the data pipeline.

Directory Details

1. data/raw

Contains data generated directly from simglucose simulator.

Characteristics:

2. data/processed

Contains processed data derived from data/raw.

File Naming Convention

Pattern: {patientNum}_{cgmName}_{insulinPumpName}_{startDate}_{endDate}.csv

Example: ado001_Dexcom_Cozmo_2024-02-01_2024-04-30

Component Description Example
patientNum Concatenation of first 3 and last 3 characters from patient name ado001 (adolescent#001)
cgmName CGM device name Dexcom
insulinPumpName Insulin pump device name Cozmo
startDate First day of generated data 2024-02-01
endDate Last day of generated data 2024-04-30

3. data/obfuscated

Contains data obfuscated from data/processed to simulate human behavior.

File Naming Convention

Pattern: {patientNum}_{cgmName}_{insulinPumpName}_{startDate}_{endDate}_{loggingBehaviour}_{loggingTiming}.csv

Example: ado001_Dexcom_Cozmo_2024-02-01_2024-04-30_all_normal.csv

Component Description Example
patientNum Concatenation of first 3 and last 3 characters from patient name ado001 (adolescent#001)
cgmName CGM device name Dexcom
insulinPumpName Insulin pump device name Cozmo
startDate First day of generated data 2024-02-01
endDate Last day of generated data 2024-04-30

Logging Behavior Types

Filename Indicator Type Description Distribution
all All meals Logs every meal 20%
top2 Multiple meals per day Logs 1-2 largest meals (on average 1.8 logs/day) 25%
once Once per day Logs largest meal only 20%
weekly A few times per week Irregular logging (on average 3 logs/week) 20%
none Never No logging 15%

Note: Distribution percentages are subject to change

Logging Timing Patterns

Filename Indicator Pattern Description Distribution
late Left skewed Forgetful loggers (gamma distribution) 38%
early Right skewed Hasty loggers (gamma distribution) 23%
average Normal Distribution Centered around meal start time 28%
punctual Unchanged Logs exactly at meal start 11%

Note: Distribution percentages are subject to change

Note: Parameters for gamma/distribution are subject to change

Gamma/normal distribution for loggingTiming

Each graph contains 50 randomly generated curves

Average logger: Normal Distribution

Late logger: Gamma Distribution

Early logger: Gamma Distribution

T1D Change Point Detection Benchmark

A benchmark dataset for evaluating change point detection algorithms on Type 1 Diabetes data.

Installation

You can install the package using pip:

pip install t1d-cpd-benchmark

Usage

The package provides three main functions to load different types of data:

  • load_raw_data(): Load raw CGM data
  • load_processed_data(): Load processed CGM data
  • load_obfuscated_data(): Load obfuscated CGM data

Each function can load either a single patient's data by index or all patients' data.

Example

from t1d_cpd_benchmark.datasets import load_raw_data, load_processed_data, load_obfuscated_data

# Load first patient's raw data
# index for load_raw_data can be 0 to 29
data = load_raw_data(index=0)
print(data.head())

# Load first patient's processed data
# index for load_processed_data can be 0 to 179
data = load_processed_data(index=0)
print(data.head())

# Load first patient's obfuscated data
# index for load_obfuscated_data can be 0 to 59
data = load_obfuscated_data(index=0)
print(data.head())

# Load all patients' data
# Returns concatenated DataFrame of all patients
all_data = load_raw_data()  

Data Types

  • Raw Data: Original CGM measurements
  • Processed Data: Cleaned and preprocessed CGM data
  • Obfuscated Data: Anonymized CGM data

License

This project is licensed under the MIT License - see the LICENSE file for details.

About

An open source benchmark for semi-supervised change point detection of type 1 diabetic meals from continuous glucose monitor time series data. Originally created to present to PyData Global 2024 in association with sktime and skchange.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages