Multiple calls of .fit() has inconsistent behavior with scikit-learn #167

hombit · 2024-03-14T07:23:45Z

Currently, AADForest.fit() does nothing when called a second time, whereas in scikit-learn a second call retrains the model. The same applies to fit_known(), which ignores its data argument even if it differs from the data the model was previously trained on.

I propose adding one more method to the Coniferest interface, .tune_known(known_data, known_labels). With it:

.fit(data) would refit every time it is called, dropping all previous training.
.fit_known(data, known_data, known_labels) would also refit.
.tune_known(known_data, known_labels) would keep the same "base" (isolation forest) model and tune it for the labeled data. It would fail if called before .fit or .fit_known.
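To make the proposed contract concrete, here is a minimal sketch in plain Python. It is not Coniferest code: `ToyForest`, `NotFittedError`, `_build_base`, and the attributes `base_`, `tuning_`, and `n_base_fits` are hypothetical names, and the "training" is a placeholder (per-feature means) standing in for the isolation forest. Only the refit/tune semantics described above are illustrated.

```python
class NotFittedError(RuntimeError):
    """Raised when tune_known() is called before any base fit (hypothetical)."""


class ToyForest:
    """Hypothetical stand-in for AADForest; illustrates call semantics only."""

    def __init__(self):
        self.base_ = None      # placeholder for the "base" isolation forest
        self.tuning_ = None    # placeholder for the tuning on labeled data
        self.n_base_fits = 0   # counts how often the base model was rebuilt

    def _build_base(self, data):
        # Placeholder training: remember per-feature means of the data.
        n = len(data)
        self.base_ = [sum(col) / n for col in zip(*data)]
        self.tuning_ = None    # a refit drops all previous training
        self.n_base_fits += 1

    def fit(self, data):
        # Refits on every call, as scikit-learn estimators do.
        self._build_base(data)
        return self

    def fit_known(self, data, known_data, known_labels):
        # Also refits the base model, then tunes it on the labeled subset.
        self._build_base(data)
        return self.tune_known(known_data, known_labels)

    def tune_known(self, known_data, known_labels):
        # Reuses the existing base model; never rebuilds it.
        if self.base_ is None:
            raise NotFittedError("call fit() or fit_known() first")
        if len(known_data) != len(known_labels):
            raise ValueError("known_data and known_labels differ in length")
        # Placeholder tuning: just remember the labeled points.
        self.tuning_ = [(tuple(row), label)
                        for row, label in zip(known_data, known_labels)]
        return self
```

In this sketch, calling `fit` twice rebuilds the base model twice, while any number of `tune_known` calls leaves `n_base_fits` unchanged.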


matwey · 2024-03-14T07:24:55Z

Do these names (fit_known, tune_known) exist in sklearn?

matwey · 2024-03-14T07:26:05Z

Related to #113

hombit · 2024-03-14T07:27:41Z

Do these names (fit_known, tune_known) exist in sklearn?

No, but it would be odd if .fit and .fit_known behaved differently in this way.

matwey · 2024-03-14T07:35:14Z

Then I would propose the following alternative, since having three functions seems redundant:

.fit(data, known_labels = None, known_data = None) would refit every time it is called dropping all the previous training.
.fit_known(known_labels, known_data = None) doesn't accept data and doesn't refit.

Note the order of known_labels and known_data. If known_data is omitted, known_labels are associated with data itself.
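The two-method alternative could look roughly like the sketch below. Again this is plain Python, not Coniferest code: `ToyForestAlt` and its attributes are hypothetical, the "training" is a placeholder, and two details are assumptions the thread does not settle: that `fit_known` raises if no base model exists yet, and that labels given without `known_data` must have one entry per row of `data`.

```python
class ToyForestAlt:
    """Hypothetical stand-in for AADForest with the two-method interface."""

    def __init__(self):
        self.data_ = None      # training data kept so labels can refer to it
        self.base_ = None      # placeholder for the base isolation forest
        self.labeled_ = None   # placeholder for the tuning on labeled data
        self.n_base_fits = 0   # counts how often the base model was rebuilt

    def fit(self, data, known_labels=None, known_data=None):
        # Always refits, dropping all previous training.
        self.data_ = [tuple(row) for row in data]
        n = len(self.data_)
        self.base_ = [sum(col) / n for col in zip(*self.data_)]
        self.labeled_ = None
        self.n_base_fits += 1
        if known_labels is not None:
            self.fit_known(known_labels, known_data)
        return self

    def fit_known(self, known_labels, known_data=None):
        # Takes no `data` argument and never rebuilds the base model.
        if self.base_ is None:
            # Assumption: failing here mirrors the tune_known proposal.
            raise RuntimeError("call fit() first")
        if known_data is None:
            # Labels refer to the rows of the data passed to fit().
            points = self.data_
        else:
            points = [tuple(row) for row in known_data]
        if len(points) != len(known_labels):
            raise ValueError("labels and labeled points differ in length")
        self.labeled_ = list(zip(points, known_labels))
        return self
```

With `known_data` omitted, `fit(data, known_labels)` has the same shape as scikit-learn's `fit(X, y)`.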

hombit · 2024-03-14T07:37:56Z

I would try to be duck-consistent with scikit-learn, including .fit(X, y)

matwey · 2024-03-14T07:38:57Z

And .fit(data, known_labels) is mostly the same as .fit(X, y).

hombit added this to the coniferest 0.1 release milestone Mar 14, 2024
