Stack cache and acceleration (rebased) #785
Conversation
Codecov Report
@@ Coverage Diff @@
## dev #785 +/- ##
==========================================
+ Coverage 85.93% 85.97% +0.04%
==========================================
Files 36 36
Lines 3462 3473 +11
==========================================
+ Hits 2975 2986 +11
Misses 487 487
Continue to review full report at Codecov.
@olivierlabayle I have done some integration tests for this PR. To reproduce, run this script. Each model below is inserted as one of three base models in a stack, and the stack is evaluated both without and with acceleration (the last two columns). A sketch of the per-model check follows the tables.

julia> DataFrame(report1)[:,[:name, :package_name, :stack_evaluation, :accelerated_stack_evaluation]]
22×4 DataFrame
Row │ name package_name stack_evaluation accelerated_stack_evaluation
│ String String String String
─────┼───────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ ConstantRegressor MLJModels - -
2 │ DecisionTreeRegressor BetaML ✓ ✓
3 │ DecisionTreeRegressor DecisionTree ✓ ✓
4 │ DeterministicConstantRegressor MLJModels ✓ ✓
5 │ ElasticNetRegressor MLJLinearModels ✓ ✓
6 │ EvoTreeGaussian EvoTrees - -
7 │ EvoTreeRegressor EvoTrees ✓ ✓
8 │ HuberRegressor MLJLinearModels ✓ ✓
9 │ KNNRegressor NearestNeighborModels ✓ ✓
10 │ LADRegressor MLJLinearModels ✓ ✓
11 │ LGBMRegressor LightGBM ✓ ✓
12 │ LassoRegressor MLJLinearModels ✓ ✓
13 │ LinearRegressor GLM - -
14 │ LinearRegressor MLJLinearModels ✓ ✓
15 │ LinearRegressor MultivariateStats ✓ ✓
16 │ NeuralNetworkRegressor MLJFlux ✓ -
17 │ QuantileRegressor MLJLinearModels ✓ ✓
18 │ RandomForestRegressor BetaML ✓ -
19 │ RandomForestRegressor DecisionTree ✓ -
20 │ RidgeRegressor MLJLinearModels ✓ ✓
21 │ RidgeRegressor MultivariateStats ✓ ✓
22 │ RobustRegressor MLJLinearModels ✓ ✓
julia> DataFrame(report2)[:,[:name, :package_name, :stack_evaluation, :accelerated_stack_evaluation]]
16×4 DataFrame
Row │ name package_name stack_evaluation accelerated_stack_evaluation
│ String String String String
─────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ AdaBoostStumpClassifier DecisionTree ✓ ✓
2 │ ConstantClassifier MLJModels ✓ ✓
3 │ DSADDetector OutlierDetectionNetworks - -
4 │ DecisionTreeClassifier DecisionTree ✓ -
5 │ DeterministicConstantClassifier MLJModels - -
6 │ ESADDetector OutlierDetectionNetworks - -
7 │ EvoTreeClassifier EvoTrees ✓ ✓
8 │ GaussianNBClassifier NaiveBayes ✓ ✓
9 │ KNNClassifier NearestNeighborModels ✓ ✓
10 │ LGBMClassifier LightGBM ✓ ✓
11 │ LinearBinaryClassifier GLM ✓ ✓
12 │ LogisticClassifier MLJLinearModels ✓ ✓
13 │ NeuralNetworkClassifier MLJFlux ✓ -
14 │ PegasosClassifier BetaML ✓ ✓
15 │ PerceptronClassifier BetaML ✓ ✓
  16 │ RandomForestClassifier           DecisionTree              ✓                 -
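For readers who don't want to open the full script, here is a minimal sketch of the per-model check, assuming MLJ's `Stack` keyword constructor together with `make_regression`, `evaluate`, `CPU1()` and `CPUThreads()`; the helper name, data sizes and measure are illustrative choices, not taken from the actual script:

using MLJ  # also requires NearestNeighborModels and the candidate model's package
KNNRegressor = @load KNNRegressor pkg=NearestNeighborModels verbosity=0

# hypothetical helper: insert `model` as one of three base models in a stack
# and check that the stack evaluates, serially or with threaded acceleration
function check_stack_evaluation(model; acceleration=CPU1())
    stack = Stack(;
        metalearner=KNNRegressor(),
        resampling=CV(;nfolds=2),
        acceleration,
        knn1=KNNRegressor(K=4),
        knn2=KNNRegressor(K=6),
        tmodel=model,
    )
    X, y = make_regression(100, 3)  # small synthetic regression data
    evaluate(stack, X, y; resampling=Holdout(), measure=rms, verbosity=0)
end

check_stack_evaluation(KNNRegressor())                             # "stack_evaluation" column
check_stack_evaluation(KNNRegressor(), acceleration=CPUThreads())  # "accelerated_stack_evaluation" column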
To test nested threading (to rule out this as the source of #783), I have now replaced the simple stack in my integration tests with a "double" stack, in which the model occurs at two levels (the snippet below details this). All tests pass.

function _stack(model, resource, isregressor)
    if isregressor
        models = (knn1=KNNRegressor(K=4),
                  knn2=KNNRegressor(K=6),
                  tmodel=model)
        metalearner = KNNRegressor()
    else
        models = (knn1=KNNClassifier(K=4),
                  knn2=KNNClassifier(K=6),
                  tmodel=model)
        metalearner = KNNClassifier()
    end
    Stack(;
        metalearner,
        resampling=CV(;nfolds=2),
        acceleration=resource,
        models...
    )
end
# return a nested stack in which `model` appears at two levels, with
# both layers accelerated using `resource`:
_double_stack(model, resource, isregressor) =
    _stack(_stack(model, resource, isregressor), resource, isregressor)

I'd like to recommend this PR for merge. @OkonSamuel Are you happy for us to proceed? I think #783 is unrelated.
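Not part of the test script itself, but for concreteness, here is one way the double stack might be exercised; `make_moons`, the measure, and the fold count are my own illustrative assumptions, not taken from the actual tests:

using MLJ
KNNClassifier = @load KNNClassifier pkg=NearestNeighborModels verbosity=0

X, y = make_moons(200)  # synthetic binary classification data

# nested stack with KNNClassifier at two levels, both layers threaded --
# the scenario intended to rule out nested threading as the cause of #783
dstack = _double_stack(KNNClassifier(K=3), CPUThreads(), false)

evaluate(dstack, X, y; resampling=CV(;nfolds=3), measure=log_loss, verbosity=0)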
Going to merge this now. Thanks @olivierlabayle for your contribution.
Replaces #767