Implement MetaLearnerGridSearch #9

FrancescMartiEscofetQC · 2024-06-14T12:58:34Z

This PR implements GridSearchCV, which performs an exhaustive search over all parameter combinations. I restricted the search to a single type of metalearner: the model names differ between metalearners, which would lead to many special cases, and a user can get the same result by running one search per metalearner type.
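For illustration, an exhaustive search over a parameter grid can be sketched as follows. This is a minimal stdlib-only sketch and not the code in this PR: `expand_grid`, `exhaustive_search` and `toy_loss` are hypothetical names, and `toy_loss` merely stands in for fitting and evaluating a metalearner.

```python
from itertools import product


def expand_grid(grid):
    """Expand {name: [values]} into one {name: value} dict per combination."""
    names = sorted(grid)
    return [dict(zip(names, combo)) for combo in product(*(grid[n] for n in names))]


def exhaustive_search(grid, loss):
    """Evaluate `loss` on every configuration and return the best one."""
    results = [(config, loss(config)) for config in expand_grid(grid)]
    best_config, best_loss = min(results, key=lambda item: item[1])
    return best_config, best_loss, results


def toy_loss(config):
    # Hypothetical stand-in for fitting and evaluating a metalearner.
    return (config["max_depth"] - 3) ** 2 + abs(config["learning_rate"] - 0.1)


grid = {"max_depth": [2, 3, 4], "learning_rate": [0.01, 0.1]}
best_config, best_loss, results = exhaustive_search(grid, toy_loss)
```

The number of evaluated configurations is the product of the list lengths in the grid (here 3 × 2 = 6), which is why the search gets expensive quickly as more parameters are added.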

With the current setup, model reuse does not work: the grid search splits the data, so the in-sample predictions needed for training evaluation can't be computed. Options would be to compute only test losses or to not support model reuse.

Checklist

Added a CHANGELOG.rst entry


kklein · 2024-07-03T09:58:55Z

> With the current setup, model reuse does not work: the grid search splits the data, so the in-sample predictions needed for training evaluation can't be computed. Options would be to compute only test losses or to not support model reuse.

Two thoughts:

- If we allowed passing an optional KFold parameter to MetaLearnerGridSearchCV.__init__, we could reuse the same splits across MetaLearnerGridSearchCV instances and thereby reuse models between them.
- If we had an evaluate_in_sample_flag in MetaLearnerGridSearchCV.fit, a user could turn off the in-sample evaluation and thereby use any pretrained model (unless I'm missing something, of course).

What do you think of these options?

FrancescMartiEscofetQC · 2024-07-03T10:33:40Z

> If we allowed passing an optional KFold parameter to MetaLearnerGridSearchCV.__init__, we could reuse the same splits across MetaLearnerGridSearchCV instances and thereby reuse models between them.

I think this could work, but we would run into the same problem we had when synchronizing: we would need to pass the indices, because with just a KFold whose random_state is None the splits are not reproducible. So I wouldn't suggest it.
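To make the reproducibility concern concrete, here is a small stdlib-only sketch. It is a hand-rolled, shuffle-based k-fold split, not scikit-learn's KFold, and `kfold_test_indices` is a hypothetical helper. It only illustrates that a shuffled split can be reconstructed from a fixed seed but not from an unseeded generator, in which case the indices themselves would have to be passed around.

```python
import random


def kfold_test_indices(n_samples, n_splits, seed=None):
    """Return the test indices of each fold after shuffling with `seed`."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Distribute the shuffled indices round-robin over the folds.
    return [sorted(indices[fold::n_splits]) for fold in range(n_splits)]


# With a fixed seed, two independent calls produce identical folds.
seeded_a = kfold_test_indices(20, 5, seed=0)
seeded_b = kfold_test_indices(20, 5, seed=0)

# With seed=None the generator is seeded from system entropy, so the folds
# generally differ between calls and cannot be reconstructed later.
unseeded = kfold_test_indices(20, 5)
```

Note that in scikit-learn, KFold only shuffles when shuffle=True; without shuffling the splits are deterministic regardless of random_state.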

> If we had an evaluate_in_sample_flag in MetaLearnerGridSearchCV.fit, a user could turn off the in-sample evaluation and thereby use any pretrained model (unless I'm missing something, of course).

I think this is quite nice. The problem is that, at fit time, metalearners with two stages compute the nuisance model predictions with is_oos=False, which raises an error because the nuisance model was fitted on a different dataset.

kklein

Mostly small things :)

CHANGELOG.rst

metalearners/grid_search.py

kklein · 2024-07-04T09:09:07Z

metalearners/grid_search.py

+                "base_learner_grid keys don't match the expected model names. base_learner_grid "
+                f"keys were expected to be {self.models_to_fit}."
+            )
+        self.base_learner_grid = list(ParameterGrid(base_learner_grid))


I'm afraid I don't quite see yet why we need/want the transformation from
{key: [value1, value2, value3]} to [{key: value1}, {key: value2}, {key: value3}] :/

FrancescMartiEscofetQC · 2024-07-04

We don't need it in __init__, so I moved this conversion to fit.
d6c8c3f
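As a rough illustration of the conversion discussed in this thread, the sketch below keeps the grid in its compact form at construction time and only expands it with scikit-learn's ParameterGrid once fit is called. `TinyGridSearch` is a hypothetical stand-in and not the class from metalearners/grid_search.py.

```python
from sklearn.model_selection import ParameterGrid


class TinyGridSearch:
    """Hypothetical stand-in, not the metalearners implementation."""

    def __init__(self, param_grid):
        # Keep the grid in its compact {key: [values]} form.
        self.param_grid = param_grid

    def fit(self):
        # Expand to [{key: value}, ...] only when the configurations
        # are actually iterated.
        self.configs_ = list(ParameterGrid(self.param_grid))
        return self


search = TinyGridSearch({"n_estimators": [10, 50], "max_depth": [3]}).fit()
```

Storing the unexpanded grid keeps the constructor arguments inspectable as passed, while the expanded list exists only as a fitted attribute.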

metalearners/grid_search.py

tests/test_grid_search.py

Co-authored-by: Kevin Klein <[email protected]>

kklein

LGTM - thank you! :)

FrancescMartiEscofetQC and others added 17 commits June 14, 2024 12:55

Speedup tests

e8b64e6

Co-authored-by: Kevin Klein <[email protected]>

Switch strict meaning in validate_number_positive

7a11445

Add classes_ to cfe

642cb2e

Fix RLoss calculation in evaluate

d7cef73

Merge pull request #3 from Quantco/speedup_tests

1234a0b

Speedup tests

Merge pull request #4 from Quantco/issue_162

8efba91

Switch `strict` meaning in `validate_number_positive`

Merge branch 'main' into cfe_classes_

32c721d

Parametrize evaluate

963debf

Merge branch 'fix_r_evaluate' into parametrize_evaluate

dc93dd1

Merge branch 'main' into fix_r_evaluate

6a4cd07

Merge branch 'cfe_classes_' into fix_r_evaluate

e3df56a

Merge branch 'fix_r_evaluate' into parametrize_evaluate

1a93bfa

run pchs

ad71c66

Implement MetaLearnerGridSearchCV

1c39193

Update CHANGELOG

e0a9239

Merge branch 'parametrize_evaluate' into implement_grid_search

5094e45

Update CHANGELOG

f0d6f6c

FrancescMartiEscofetQC marked this pull request as ready for review June 14, 2024 13:34

FrancescMartiEscofetQC requested a review from kklein as a code owner June 14, 2024 13:34

FrancescMartiEscofetQC and others added 11 commits June 17, 2024 09:19

Merge branch 'main' into cfe_classes_

a5f657d

Merge branch 'cfe_classes_' into fix_r_evaluate

9992576

Merge branch 'fix_r_evaluate' into parametrize_evaluate

f6c7d74

Merge branch 'parametrize_evaluate' into implement_grid_search

7a21186

Merge branch 'main' into parametrize_evaluate

a38ca89

Merge branch 'parametrize_evaluate' into implement_grid_search

0f54c2c

Merge branch 'main' into parametrize_evaluate

d6327ae

Merge branch 'parametrize_evaluate' into implement_grid_search

914f047

Update metalearners/metalearner.py

476a4ae

Co-authored-by: Kevin Klein <[email protected]>

Update metalearners/metalearner.py

1c4c060

Co-authored-by: Kevin Klein <[email protected]>

Update metalearners/metalearner.py

49f1556

Co-authored-by: Kevin Klein <[email protected]>

Merge branch 'main' into implement_grid_search

82d38d9

FrancescMartiEscofetQC added 2 commits July 4, 2024 10:14

Disable cv to be able to reuse models

3b841e5

Add text about reusage in docs

a7be0cd

FrancescMartiEscofetQC changed the title ~~Implement MetaLearnerGridSearchCV~~ Implement MetaLearnerGridSearch Jul 4, 2024

FrancescMartiEscofetQC requested a review from kklein July 4, 2024 08:26

Add test propensity model reuse

13eeed1

kklein reviewed Jul 4, 2024


FrancescMartiEscofetQC and others added 18 commits July 4, 2024 12:01

Update CHANGELOG.rst

0264937

Co-authored-by: Kevin Klein <[email protected]>

Update metalearners/grid_search.py

bcaab55

Co-authored-by: Kevin Klein <[email protected]>

Update metalearners/grid_search.py

8ad4b87

Co-authored-by: Kevin Klein <[email protected]>

Update metalearners/grid_search.py

5e34a35

Co-authored-by: Kevin Klein <[email protected]>

Update metalearners/grid_search.py

fa95338

Co-authored-by: Kevin Klein <[email protected]>

Update metalearners/grid_search.py

bac8cfb

Co-authored-by: Kevin Klein <[email protected]>

Adapt var name

928edd7

Use &

83f0e78

Use ParameterGrid in fit and not init

d6c8c3f

Use fixture grid_search_data

acade9e

Merge branch 'main' into implement_grid_search

183b251

Add docc about results_

5a6c91f

Index dataframe with config

991b2f1

Rename kwargs to metalerner_fit_params

29db2bb

Merge branch 'main' into implement_grid_search

4733b2a

Rephrase docs

669f37f

Spacing docs

7b97173

Merge branch 'main' into implement_grid_search

5d1dde9

kklein approved these changes Jul 5, 2024


FrancescMartiEscofetQC merged commit a406292 into main Jul 5, 2024
14 of 16 checks passed

FrancescMartiEscofetQC deleted the implement_grid_search branch July 5, 2024 07:06
