Poincare: Add mappings in dataset class for additional feature set (cat2) #126

egillax · 2024-08-22T12:49:59Z

In the dataset class, categorical features are in this format:

1, 2
1, 4
1, 1
2, 3
3, 3

The first column is the rowId (one per observation) and the second column is the columnId (one id per feature). The columnIds run from 1 to the number of features.

The features are then converted to:

1 2 4
3 0 0
3 0 0

The rows are observations. Each row is now a sequence of columnIds for that patient/observation. The tensor is zero-padded, since patients 2 and 3 each have only one feature.
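A minimal sketch of this conversion in plain Python, using nested lists in place of a tensor. The function name `to_padded_sequences` is hypothetical, and sorting the columnIds within each row is an assumption made to match the example above.

```python
from collections import defaultdict


def to_padded_sequences(pairs):
    """Convert (rowId, columnId) pairs into zero-padded per-row sequences."""
    by_row = defaultdict(list)
    for row_id, column_id in pairs:
        by_row[row_id].append(column_id)
    # Assumption: columnIds are sorted within each row so the output is deterministic.
    sequences = [sorted(by_row[row_id]) for row_id in sorted(by_row)]
    max_len = max((len(seq) for seq in sequences), default=0)
    # 0 is the padding value, so real columnIds have to start at 1.
    return [seq + [0] * (max_len - len(seq)) for seq in sequences]


pairs = [(1, 2), (1, 4), (1, 1), (2, 3), (3, 3)]
padded = to_padded_sequences(pairs)
print(padded)  # [[1, 2, 4], [3, 0, 0], [3, 0, 0]]
```

Because 0 is used for padding, any remapping of columnIds has to keep the ids starting at 1.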

For the Poincare addition, new columnIds need to be generated in the first form. These new columnIds should run from 1 to the number of features in that feature set (cat or cat2). The first form can then be converted to the second.

These new mappings need to be stored somehow so they can be applied at test/validation time.
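One possible way to build and persist such a mapping, sketched in plain Python. The helpers `fit_mapping` and `apply_mapping` are hypothetical names, the original columnIds are made-up values, and dropping ids that were not seen during training is an assumption rather than a settled design decision.

```python
import json


def fit_mapping(column_ids):
    """Assign new contiguous ids 1..N to the distinct original columnIds."""
    distinct = sorted(set(column_ids))
    return {original: new for new, original in enumerate(distinct, start=1)}


def apply_mapping(pairs, mapping):
    """Remap (rowId, columnId) pairs.

    Assumption: columnIds missing from the mapping are dropped.
    """
    return [
        (row_id, mapping[column_id])
        for row_id, column_id in pairs
        if column_id in mapping
    ]


# Made-up original columnIds for one feature set (e.g. cat2).
train_pairs = [(1, 17), (1, 42), (2, 5), (3, 42)]
mapping = fit_mapping(column_id for _, column_id in train_pairs)
remapped = apply_mapping(train_pairs, mapping)
print(mapping)   # {5: 1, 17: 2, 42: 3}
print(remapped)  # [(1, 2), (1, 3), (2, 1), (3, 3)]

# Round-trip through JSON, e.g. to store the mapping next to the model.
# JSON object keys are strings, so they are converted back to int on load.
serialized = json.dumps(mapping)
restored = {int(key): value for key, value in json.loads(serialized).items()}

# At test/validation time the restored mapping is applied unchanged.
test_pairs = [(1, 42), (2, 99)]
test_remapped = apply_mapping(test_pairs, restored)
print(test_remapped)  # [(1, 3)]
```

The remapped pairs are still in the first form, so they can be converted to the zero-padded second form in the same way as the existing features.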

egillax · 2024-08-22T13:02:32Z

Currently we use MapIds from PLP to generate the mappings before the dataset is initialized. Since we need to do more mappings in the dataset, we could consider adding our own mapping functionality in Python. The mapping info needs to be saved with the model so it can be used when the model is applied.
