Poincare: Add mappings in dataset class for additional feature set (cat2) #126

Open
egillax opened this issue Aug 22, 2024 · 1 comment
egillax commented Aug 22, 2024

In the dataset class, categorical features are in this format:

1, 2
1, 4
1, 1
2, 3
3, 3

The first column is the rowId (one per observation) and the second column is the columnId (one per feature). columnIds start at 1 and run up to the number of features.

The features are then converted to:

1 2 4
3 0 0
3 0 0

The rows are observations. Each row is now the sequence of columnIds for that patient/observation. The tensor is zero-padded to the length of the longest sequence: patients 2 and 3 have only one feature each, so their rows are filled with zeros.
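The conversion from the first form to the second can be sketched in plain Python as below. This is only an illustration, not the dataset class's actual code: the function name `to_padded_sequences` is hypothetical, and sorting the columnIds within each row is an assumption made so that the output matches the example above (the ordering rule is not stated here).

```python
from collections import defaultdict


def to_padded_sequences(pairs, pad_value=0):
    """Convert (rowId, columnId) pairs into one zero-padded row per rowId."""
    by_row = defaultdict(list)
    for row_id, column_id in pairs:
        by_row[row_id].append(column_id)

    # Pad every row to the length of the longest sequence.
    max_len = max((len(ids) for ids in by_row.values()), default=0)

    padded = []
    for row_id in sorted(by_row):
        # Assumption: columnIds are sorted within a row to match the example.
        ids = sorted(by_row[row_id])
        padded.append(ids + [pad_value] * (max_len - len(ids)))
    return padded


pairs = [(1, 2), (1, 4), (1, 1), (2, 3), (3, 3)]
padded = to_padded_sequences(pairs)
print(padded)  # [[1, 2, 4], [3, 0, 0], [3, 0, 0]]
```

Because 0 is used for padding, columnIds have to start at 1, which is why any remapping must also produce ids starting at 1.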

For the Poincare addition, new columnIds need to be generated while the data is still in the first form. Within each feature set (cat or cat2), the new columnIds should run from 1 to the number of features in that set. The first form can then be converted to the second.

These new mappings need to be stored somehow so they can be applied at test/validation time.
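A minimal sketch of what such a mapping could look like, fitted on the training data and reused unchanged on test/validation data. The function names and the example ids are hypothetical, and silently dropping features that were not seen during training is an assumption, not a decision made in this issue.

```python
def fit_column_mapping(column_ids):
    """Map each distinct original columnId to a new id in 1..n (0 stays free for padding)."""
    distinct = sorted(set(column_ids))
    return {orig: new for new, orig in enumerate(distinct, start=1)}


def apply_column_mapping(pairs, mapping):
    """Remap (rowId, columnId) pairs with a fitted mapping.

    Assumption: features absent from the mapping are dropped.
    """
    return [(row_id, mapping[col]) for row_id, col in pairs if col in mapping]


# Fit once per feature set (e.g. cat or cat2) on the training data.
train_pairs = [(1, 1005), (1, 2010), (2, 1005), (3, 3050)]
mapping = fit_column_mapping(col for _, col in train_pairs)
train_mapped = apply_column_mapping(train_pairs, mapping)

# Reuse the same mapping at test/validation time; 9999 was never seen in training.
test_pairs = [(1, 3050), (1, 9999), (2, 2010)]
test_mapped = apply_column_mapping(test_pairs, mapping)

print(mapping)       # {1005: 1, 2010: 2, 3050: 3}
print(train_mapped)  # [(1, 1), (1, 2), (2, 1), (3, 3)]
print(test_mapped)   # [(1, 3), (2, 2)]
```

Keeping one mapping per feature set means each set gets its own contiguous id range starting at 1.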


egillax commented Aug 22, 2024

Currently we use MapIds from PLP to generate the mappings before the dataset is initialized. Since we now need additional mappings inside the dataset, we could consider adding our own mapping functionality in Python. The mapping information needs to be saved with the model so it can be reused when the model is applied.
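One possible way to persist the mappings next to the model, sketched with the standard library only. The file name `feature_mappings.json`, the function names, and the example ids are hypothetical; JSON is just one option and nothing here reflects how the package actually stores models.

```python
import json
import os
import tempfile


def save_mappings(mappings, path):
    """Write {feature_set: {original_id: new_id}} to a JSON file."""
    # JSON object keys must be strings, so convert the integer ids explicitly.
    serializable = {
        name: {str(orig): new for orig, new in mapping.items()}
        for name, mapping in mappings.items()
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(serializable, f)


def load_mappings(path):
    """Read the mappings back and restore integer keys."""
    with open(path, encoding="utf-8") as f:
        raw = json.load(f)
    return {
        name: {int(orig): new for orig, new in mapping.items()}
        for name, mapping in raw.items()
    }


mappings = {"cat": {1005: 1, 2010: 2}, "cat2": {77: 1, 42: 2, 13: 3}}

# A temporary directory stands in for the model directory here.
with tempfile.TemporaryDirectory() as model_dir:
    path = os.path.join(model_dir, "feature_mappings.json")
    save_mappings(mappings, path)
    restored = load_mappings(path)

print(restored == mappings)  # True
```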
