
Model Development - Transformations Script #96

Open
RobotPsychologist opened this issue Oct 30, 2024 · 3 comments · Fixed by #176
Assignees
Labels
modeldev Developing modeling pipelines for meal annotation task.

Comments

@RobotPsychologist
Owner

RobotPsychologist commented Oct 30, 2024

@y-mx @aryavkin

The idea for this ticket is to implement a function that takes the data produced from the data generation pipeline:

  • 0_meal_identification/meal_identification/meal_identification/datasets/dataset_cleaner.py
  • 0_meal_identification/meal_identification/meal_identification/datasets/dataset_generator.py
  • 0_meal_identification/meal_identification/meal_identification/datasets/dataset_operations.py

The above scripts are intended to facilitate the data generation and cleaning that occurs outside of the sktime library.

The transformations script will operate as the connection point between the data generated above and the training pipeline. The training pipeline should be able to call the transformation script in a loop for extended training runs, where we loop through a dictionary of sktime transformation pipelines.

Inside the transformation function itself there should be a looping mechanism that iterates through a list of provided datasets and applies the transformations to each. For example, we could have two otherwise identical datasets, one with a three-hour meal window and one with a five-hour meal window, and want to apply the same transformations to both for our experiments.
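As a rough sketch of that control flow (plain callables below stand in for sktime transformation pipelines; all labels and values are made up for illustration):

```python
# Outer loop: the training pipeline iterates over a dictionary of
# transformation pipelines. Inner loop: the transformation function iterates
# over the provided datasets, applying the same pipeline to each.
# The lambdas stand in for sktime pipelines; labels are illustrative.

pipelines = {
    "halve": lambda xs: [x / 2 for x in xs],
    "halve_then_shift": lambda xs: [x / 2 + 1 for x in xs],
}
datasets = {
    "meal_window_3hr": [54.0, 71.0],  # e.g. three-hour meal window
    "meal_window_5hr": [60.0, 88.0],  # same data, five-hour meal window
}

transformed = {}
for pipe_label, pipeline in pipelines.items():    # training pipeline's loop
    for ds_label, data in datasets.items():       # transformation script's loop
        transformed[(pipe_label, ds_label)] = pipeline(data)
```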

So one loop of the transformation script should:

  1. Check if the transformed data set already exists in: 0_meal_identification/meal_identification/data/processed
  2. If it does not exist, load the specified data set from: 0_meal_identification/meal_identification/data/interim
  3. Apply the transformation pipeline
  4. Store the transformed data for caching if specified. For a given training run, we could create a new subdirectory with the training run's label, containing a directory for each transformation pipeline applied, e.g. 0_meal_identification/meal_identification/data/processed/{training run label}/{pipeline label}

Once the looping is complete, it should:

  • Return a dictionary of the transformed data set(s) if specified
  • Record logs of the transformed data (perhaps via an external log recorder function; we don't need to write this right now, just have the function set up to call an external logger).
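A minimal sketch of one loop iteration (steps 1–4) plus the post-loop return and logging hooks, assuming datasets are stored as CSV files. The function names (`transform_one`, `log_transform_run`) and all labels are illustrative stand-ins, not the project's actual API:

```python
import csv
import tempfile
from pathlib import Path

def log_transform_run(record):
    # Placeholder hook: to be replaced by the external log recorder function.
    pass

def transform_one(dataset_label, pipeline_label, apply_pipeline,
                  interim_dir, processed_dir, run_label, cache=True):
    out_path = Path(processed_dir) / run_label / pipeline_label / f"{dataset_label}.csv"
    if out_path.exists():                                  # 1. cached copy exists?
        with out_path.open(newline="") as f:
            rows = list(csv.reader(f))
    else:
        src = Path(interim_dir) / f"{dataset_label}.csv"   # 2. load interim dataset
        with src.open(newline="") as f:
            rows = apply_pipeline(list(csv.reader(f)))     # 3. apply the pipeline
        if cache:                                          # 4. store for caching
            out_path.parent.mkdir(parents=True, exist_ok=True)
            with out_path.open("w", newline="") as f:
                csv.writer(f).writerows(rows)
    log_transform_run({"dataset": dataset_label, "pipeline": pipeline_label,
                       "n_rows": len(rows)})
    return rows

# Usage with throwaway directories standing in for data/interim and data/processed:
tmp = Path(tempfile.mkdtemp())
interim, processed = tmp / "interim", tmp / "processed"
interim.mkdir()
with (interim / "meal_3hr.csv").open("w", newline="") as f:
    csv.writer(f).writerows([["bgl"], ["54"], ["71"]])

halve = lambda rows: [rows[0]] + [[str(int(r[0]) // 2)] for r in rows[1:]]
transformed = {"meal_3hr": transform_one("meal_3hr", "halve", halve,
                                         interim, processed, "run_01")}
```

The returned dictionary of transformed dataset(s) is what the training pipeline would consume; on a second call with the same labels, step 1 short-circuits and the cached CSV is read back instead of recomputed.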

Please also write tests using pydantic, like the data team did; you can reach out to them for guidance on conforming to the standards they have been using, for consistency. Reach out to @Tony911029 @andytubeee @Phiruby if you have questions regarding this.
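For the pydantic-based tests, a minimal sketch might look like the following; the model and field names are assumptions for illustration, and the data team's actual conventions should take precedence:

```python
from pydantic import BaseModel, ValidationError

# Illustrative record describing one transformed dataset; the schema is a
# hypothetical stand-in, not the data team's actual model.
class TransformedDatasetRecord(BaseModel):
    dataset_label: str
    pipeline_label: str
    n_rows: int

def test_valid_record_parses():
    rec = TransformedDatasetRecord(dataset_label="meal_3hr",
                                   pipeline_label="baseline", n_rows=1440)
    assert rec.n_rows == 1440

def test_invalid_row_count_rejected():
    try:
        TransformedDatasetRecord(dataset_label="meal_3hr",
                                 pipeline_label="baseline", n_rows="not-a-number")
        assert False, "expected a ValidationError"
    except ValidationError:
        pass

test_valid_record_parses()
test_invalid_row_count_rejected()
```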

@fkiraly Please let me know if this makes sense or if there are any other clarifications required.

@RobotPsychologist RobotPsychologist converted this from a draft issue Oct 30, 2024
@RobotPsychologist RobotPsychologist added the modeldev Developing modeling pipelines for meal annotation task. label Oct 30, 2024
@aryavkin
Contributor

aryavkin commented Nov 6, 2024

interested

@y-mx
Contributor

y-mx commented Nov 6, 2024

add me please

@RobotPsychologist
Owner Author

@Tony911029 and @andytubeee help with unit tests.
