Implement MetaLearnerGridSearch #9

Merged
merged 93 commits into from
Jul 5, 2024

Conversation

FrancescMartiEscofetQC
Contributor

@FrancescMartiEscofetQC FrancescMartiEscofetQC commented Jun 14, 2024

This PR implements MetaLearnerGridSearchCV, which performs an exhaustive search over all parameter combinations. I restricted the search to a single metalearner type because the model names differ between metalearners, which would create many special cases. A user can instead run one search per metalearner type and get the same result.
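As a rough illustration of what "exhaustive search over all parameter combinations" means when the grid is keyed by model name, here is a minimal standard-library sketch. The model names (`outcome_model`, `propensity_model`), the hyperparameters, and the helper functions are hypothetical and are not taken from the PR's implementation.

```python
from itertools import product


def expand(grid):
    """Expand {param: [values]} into a list of {param: value} dicts."""
    keys = sorted(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


def all_configurations(model_grids):
    """Cartesian product of the expanded grids of every named model."""
    names = sorted(model_grids)
    per_model = [expand(model_grids[name]) for name in names]
    return [dict(zip(names, combo)) for combo in product(*per_model)]


# Hypothetical grid: the keys are model names, which differ between metalearner types.
model_grids = {
    "outcome_model": {"max_depth": [2, 3], "n_estimators": [10, 50]},
    "propensity_model": {"C": [0.1, 1.0]},
}

configs = all_configurations(model_grids)
print(len(configs))  # 4 outcome settings x 2 propensity settings
```

Because the outer keys depend on the metalearner type, a grid written for one type would not be valid for another, which is one way to read the restriction to a single type per search.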

With the current setup, model reuse does not work. The grid search splits the data, so the in-sample predictions needed for training evaluation can't be computed. One option would be to compute only test losses; another would be not to support model reuse.

Checklist

  • Added a CHANGELOG.rst entry

@FrancescMartiEscofetQC FrancescMartiEscofetQC marked this pull request as ready for review June 14, 2024 13:34
@kklein
Collaborator

kklein commented Jul 3, 2024

With the current setup, model reuse does not work. The grid search splits the data, so the in-sample predictions needed for training evaluation can't be computed. One option would be to compute only test losses; another would be not to support model reuse.

Two thoughts:

  • If we allowed passing an optional KFold parameter to MetaLearnerGridSearchCV.__init__, we could reuse the same splits across MetaLearnerGridSearchCV instances and thereby reuse models between them.
  • If we had an evalute_in_sample_flag in MetaLearnerGridSearchCV.fit, a user could turn off the in-sample evaluation and thereby use any pretrained model (unless I'm missing something, of course).

What do you think of these options?

@FrancescMartiEscofetQC
Contributor Author

  • If we allowed passing an optional KFold parameter to MetaLearnerGridSearchCV.__init__, we could reuse the same splits across MetaLearnerGridSearchCV instances and thereby reuse models between them.

I think this could work, but we would have the same problem we had when synchronizing: we would need to pass the indices, because with just a KFold the splits are not reproducible if random_state is None. So I wouldn't suggest it.
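The reproducibility concern above can be illustrated with scikit-learn's KFold. This is only a sketch on toy data; the helper `held_out_indices` and the variable names are made up for illustration. Note that in scikit-learn, random_state only has an effect when shuffle=True.

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(20).reshape(10, 2)  # toy data: 10 samples, 2 features


def held_out_indices(cv):
    """Return the held-out indices of every fold as plain lists."""
    return [test.tolist() for _, test in cv.split(X)]


# Same integer seed: two independent KFold objects yield identical splits.
seeded_a = held_out_indices(KFold(n_splits=5, shuffle=True, random_state=0))
seeded_b = held_out_indices(KFold(n_splits=5, shuffle=True, random_state=0))

# Without shuffling, the folds are consecutive blocks and always the same.
unshuffled = held_out_indices(KFold(n_splits=5))

# With shuffle=True and random_state=None, each call to split may differ,
# so the only way to share the splits is to materialize the indices once.
stored_splits = [
    (train.tolist(), test.tolist())
    for train, test in KFold(n_splits=5, shuffle=True, random_state=None).split(X)
]
```

Reusing `stored_splits` everywhere guarantees identical folds, but it means passing indices around rather than a KFold object, which is the drawback described above.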

  • If we had an evalute_in_sample_flag in MetaLearnerGridSearchCV.fit, a user could turn off the in-sample evaluation and thereby use any pretrained model (unless I'm missing something, of course).

I think this is quite nice, but there is a problem at fit time for two-stage metalearners: the nuisance model predictions are computed with is_oos=False, which raises an error because the nuisance models were fitted on a different dataset.

@FrancescMartiEscofetQC FrancescMartiEscofetQC changed the title Implement MetaLearnerGridSearchCV Implement MetaLearnerGridSearch Jul 4, 2024
Collaborator

@kklein kklein left a comment

Mostly small things :)

CHANGELOG.rst (outdated, resolved)
metalearners/grid_search.py (outdated, resolved)
metalearners/grid_search.py (outdated, resolved)
metalearners/grid_search.py (resolved)
metalearners/grid_search.py (resolved)
metalearners/grid_search.py (outdated, resolved)
"base_learner_grid keys don't match the expected model names. base_learner_grid "
f"keys were expected to be {self.models_to_fit}."
)
self.base_learner_grid = list(ParameterGrid(base_learner_grid))
Collaborator

I'm afraid I don't quite see yet why we need/want the transformation from
{key: [value1, value2, value3]} to [{key: value1}, {key: value2}, {key: value3}] :/
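For reference, the transformation in question is what scikit-learn's ParameterGrid produces when materialized with list(), as in the snippet under review. A minimal sketch with made-up parameter names:

```python
from sklearn.model_selection import ParameterGrid

# One key with three candidate values becomes three single-value dicts.
single = list(ParameterGrid({"key": [1, 2, 3]}))

# With several keys, every combination of values is enumerated.
combined = list(ParameterGrid({"max_depth": [2, 3], "n_estimators": [10, 50]}))

print(single)
print(len(combined))
```

ParameterGrid is itself iterable, so the list() conversion is only needed if the combinations must be indexed or stored.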

Contributor Author

We don't need it in __init__, so I moved this conversion to fit.
d6c8c3f

metalearners/grid_search.py (outdated, resolved)
metalearners/grid_search.py (outdated, resolved)
tests/test_grid_search.py (outdated, resolved)
Collaborator

@kklein kklein left a comment

LGTM - thank you! :)

@FrancescMartiEscofetQC FrancescMartiEscofetQC merged commit a406292 into main Jul 5, 2024
14 of 16 checks passed
@FrancescMartiEscofetQC FrancescMartiEscofetQC deleted the implement_grid_search branch July 5, 2024 07:06
2 participants