Add TestOnly resampling strategy? #976

Open
ablaom opened this issue Apr 24, 2024 · 0 comments

ablaom commented Apr 24, 2024

Sometimes you have trained a supervised machine on some data and want to evaluate it on a holdout set without retraining. Using evaluate!(..., resampling=Holdout()) doesn't allow this, so you need to call predict manually and apply each metric to the predictions and the test target, which is inconvenient, especially if you are tracking multiple metrics.
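For concreteness, here is a minimal sketch of the manual workflow described above. The synthetic data, the choice of ConstantRegressor, and the 70/30 split are illustrative assumptions, not part of the proposal.

```julia
using MLJ

# Synthetic regression data (illustrative only).
X, y = make_regression(100, 3)
train, test = partition(eachindex(y), 0.7)

# Train once, on the training rows only.
mach = machine(ConstantRegressor(), X, y)
fit!(mach, rows=train, verbosity=0)

# Manual evaluation: predict on the holdout rows, then apply each
# measure by hand to the predictions and the test target.
ŷ = predict_mean(mach, rows=test)
ytest = y[test]
measures = (l1, l2)
scores = [m(ŷ, ytest) for m in measures]
```

Every extra metric means another hand-written call, which is the bookkeeping the proposed strategy would take over.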

The idea of a TestOnly resampling strategy is that evaluate!(mach, resampling=TestOnly(), rows=test, measures=...) automates this: we assume mach is already trained (or throw an exception if it is not) and just evaluate the specified measures on predictions for the test rows.

The implementation looks pretty simple: train_test_pairs(::TestOnly, rows) = [(Int[], rows),] (i.e. an empty train set), and in evaluate! an empty train set will suppress training.
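The mechanism can be sketched in plain Julia, independently of MLJBase. Everything below (ToyMachine, toy_fit!, toy_train_test_pairs, toy_evaluate, toy_l1, toy_l2, and this stand-in TestOnly type) is hypothetical scaffolding for illustration, not MLJBase's actual internals.

```julia
# Hypothetical stand-in for the proposed strategy type.
struct TestOnly end

# One (train, test) pair with an empty train set, as proposed.
toy_train_test_pairs(::TestOnly, rows) = [(Int[], collect(rows))]

# Toy "machine": predicts the mean of the targets it was fitted on.
mutable struct ToyMachine
    y::Vector{Float64}
    fitted_mean::Union{Nothing,Float64}
end

function toy_fit!(mach::ToyMachine, rows)
    mach.fitted_mean = sum(mach.y[rows]) / length(rows)
    return mach
end

# Toy measures: mean absolute error and mean squared error.
toy_l1(ŷ, y) = sum(abs, ŷ .- y) / length(y)
toy_l2(ŷ, y) = sum(abs2, ŷ .- y) / length(y)

function toy_evaluate(mach::ToyMachine, strategy, rows, measures)
    results = Vector{Vector{Float64}}()
    for (train, test) in toy_train_test_pairs(strategy, rows)
        if isempty(train)
            # An empty train set suppresses training; the machine must
            # already be trained, otherwise we throw.
            mach.fitted_mean === nothing &&
                throw(ArgumentError("machine has not been trained"))
        else
            toy_fit!(mach, train)
        end
        ŷ = fill(mach.fitted_mean, length(test))
        push!(results, [m(ŷ, mach.y[test]) for m in measures])
    end
    return results
end

mach = ToyMachine([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], nothing)
toy_fit!(mach, 1:4)                                  # fitted mean is 2.5
results = toy_evaluate(mach, TestOnly(), 5:6, (toy_l1, toy_l2))
```

Because the strategy returns a single pair, the evaluation loop runs once and the existing per-fold machinery needs no special casing beyond the empty-train check.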

(It would be very convenient and natural if specifying no resampling strategy fell back to TestOnly(), as in:

fit!(mach, rows=train)
evaluate!(mach, rows=test, measure=l2)

but that would be technically breaking, since the current fallback is CV().)

@github-project-automation github-project-automation bot moved this to tracking/discussion/metaissues/misc in General Aug 30, 2024