Model Development - Data Generation Script #90
Labels: modeldev (Developing modeling pipelines for meal annotation task.)
RobotPsychologist added the modeldev label on Oct 26, 2024.
RobotPsychologist moved this to Backlog in @RobotPsychologist's Automatic Meal Detection from Blood Glucose CGM Reading on Oct 26, 2024.
RobotPsychologist added a commit that referenced this issue on Nov 7, 2024: Adding the dataset_processing.py file for #90.
RobotPsychologist added a commit that referenced this issue on Nov 7, 2024: Adding the dataset_cleaning.py script specified in #90.
Hopefully, this is enough to get you started.
Closing this now because I think all the requirements have been fulfilled. New changes to the data generation script will either be enhancements or bug fixes.
github-project-automation (bot) moved this from In progress to Done in @RobotPsychologist's Automatic Meal Detection from Blood Glucose CGM Reading on Nov 15, 2024.
RobotPsychologist moved this from In progress to Done in @RobotPsychologist's Automatic Meal Detection from Blood Glucose CGM Reading on Nov 20, 2024.
RobotPsychologist closed this as completed by moving to Done in @RobotPsychologist's Automatic Meal Detection from Blood Glucose CGM Reading on Nov 20, 2024.
This issue captures both: #108 #109
See the README for a better understanding of where files should go.
This ticket relates to #91 - Data Cleaning Script and #96 - Transformations Script.
The dataset_generator.py script should specify which settings to use for the dataset creation. It should generally create the dataset stored in data/interim. It calls all specified data wrangling, processing, and cleaning utilities that should happen outside of sktime's API.
The data stored in data/interim is then used by the data_transformations.py script functions to apply time series machine learning-specific transformations to the data, which is the final data processing stage before modelling. The transformed data should be stored in data/processed. For more information on sktime transformers, see the sktime documentation.
dataset_processing.py
Purpose: Handles data loading, saving, and file naming.
Location: 0_meal_identification/meal_identification/meal_identification/datasets/dataset_processing.py
Functions:
get_root_dir: finds the root directory of the project.
load_data: a general data loading utility function that can load data from any data directory.
save_data: a general data saving utility function that can store data in either data/interim or data/processed.
dataset_labeler: an auto-labeller that takes in the configurations from the data processing, cleaning, and generation to create a labelled dataset that should give the user a good understanding of how the dataset was generated.
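The four utilities listed above could look roughly like the sketch below. This is only an illustration under stated assumptions, not the project's actual code: the signatures, the use of a .git marker to find the project root, the CSV file format, and the key-value label scheme are all hypothetical choices made for the example.

```python
import csv
from pathlib import Path


def get_root_dir(start=None, marker=".git"):
    """Walk upwards from `start` until a directory containing `marker` is found.

    Assumption: the project root is identified by a `.git` entry.
    """
    current = Path(start or Path.cwd()).resolve()
    for candidate in (current, *current.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"No parent directory of {current} contains {marker!r}")


def dataset_labeler(settings):
    """Build a file-name label from a settings dict (hypothetical naming scheme).

    Keys are sorted so the same settings always produce the same label.
    """
    return "_".join(f"{key}-{settings[key]}" for key in sorted(settings))


def save_data(rows, fieldnames, root, stage, label):
    """Write rows (a list of dicts) to <root>/data/<stage>/<label>.csv."""
    if stage not in ("interim", "processed"):
        raise ValueError("stage must be 'interim' or 'processed'")
    out_dir = Path(root) / "data" / stage
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{label}.csv"
    with out_path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return out_path


def load_data(root, stage, label):
    """Read <root>/data/<stage>/<label>.csv as a list of dicts (values stay strings)."""
    in_path = Path(root) / "data" / stage / f"{label}.csv"
    with in_path.open(newline="") as handle:
        return list(csv.DictReader(handle))
```

Encoding the settings in the file name is one simple way to let a reader see how a dataset was generated without opening it; a sidecar metadata file would serve the same purpose.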
dataset_cleaning.py
Purpose: Focuses on cleaning and preprocessing utilities for the dataset, such as handling overlaps and selecting top meals.
Location: 0_meal_identification/meal_identification/meal_identification/datasets/dataset_cleaning.py
Functions:
dataset_generator.py
Purpose: Handles only the dataset creation process by leveraging functions from both dataset_processing.py and dataset_cleaning.py. It should generally write the pre-transform dataset into data/interim.
Location: 0_meal_identification/meal_identification/meal_identification/datasets/dataset_generator.py
Functions:
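As a rough sketch of the generator's role described above (apply the configured cleaning steps in order, then write the pre-transform dataset to data/interim), something like the following could work. The function names, the column-based cleaning steps, and the CSV output are hypothetical stand-ins for illustration; the real cleaning utilities live in dataset_cleaning.py and may look quite different.

```python
import csv
from functools import partial
from pathlib import Path


def drop_missing(rows, column):
    """Hypothetical cleaning step: discard rows where `column` is empty or absent."""
    return [row for row in rows if row.get(column) not in (None, "")]


def keep_top_n(rows, column, n):
    """Hypothetical cleaning step: keep the n rows with the largest `column` value."""
    return sorted(rows, key=lambda row: float(row[column]), reverse=True)[:n]


def generate_dataset(rows, steps, interim_dir, file_name):
    """Apply each cleaning step in order, then write the result to the interim directory."""
    for step in steps:
        rows = step(rows)
    interim_dir = Path(interim_dir)
    interim_dir.mkdir(parents=True, exist_ok=True)
    out_path = interim_dir / file_name
    # Column order is taken from the first row; an empty result yields an empty header.
    fieldnames = list(rows[0]) if rows else []
    with out_path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return out_path


# The "settings" are expressed here as an ordered list of pre-configured steps.
example_steps = [
    partial(drop_missing, column="carbs"),
    partial(keep_top_n, column="carbs", n=2),
]
```

Passing the steps as a list keeps the generator itself free of cleaning logic, so new utilities can be added without changing it.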
plots.py
Purpose: contains a variety of plotting functions that we will frequently reuse for various tasks, usually related to assessing model performance.
Location: 0_meal_identification/meal_identification/meal_identification/plots.py