feat: offset column role in Task #1225

Open: wants to merge 23 commits into base: main

@bblodfon (Contributor) commented Dec 4, 2024

  • Properly support offset for mlr3 learners
    • Some TODOs left (talked with Marc)
  • xgboost in mlr3learners now uses this (see the mlr3learners PR)
  • See mlr3proba PR for the PEM reduction pipeline (surv => regr) that uses this

@bblodfon requested a review from be-marc on December 4, 2024 16:31
@be-marc changed the title from "Add offset col_role in Task" to "feat: offset column role in Task" on Jan 16, 2025

stopf("Offset column(s) %s must be a numeric or integer column", paste0("'", new_roles[["offset"]], "'", collapse = ","))
}

if (any(task$missings(cols = new_roles[["offset"]]) > 0)) {
Member

Maybe short-circuit here (something like mlr3misc::some)

Collaborator

I would use vectorization when available, so the version here is fine, I'd say (?)
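
For reference, the two styles under discussion can be sketched in base R. This is only an illustration: the helper names `has_missing_vectorized` and `has_missing_short_circuit` and the toy `offsets` data are hypothetical and not part of mlr3, and the loop merely mimics the early-exit idea behind something like mlr3misc::some.

```r
# Toy data standing in for the offset columns of a task (hypothetical).
offsets = data.frame(
  offset_a = c(0.1, 0.2, NA),
  offset_b = c(1.0, 2.0, 3.0)
)

# Vectorized: count missings in every column, then test all counts at once.
has_missing_vectorized = function(data) {
  any(colSums(is.na(data)) > 0)
}

# Short-circuit: stop at the first column that contains a missing value.
has_missing_short_circuit = function(data) {
  for (col in names(data)) {
    if (anyNA(data[[col]])) {
      return(TRUE)
    }
  }
  FALSE
}

has_missing_vectorized(offsets)
has_missing_short_circuit(offsets)
```

With only a handful of offset columns the difference is unlikely to matter, which is consistent with keeping the vectorized version.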

@@ -4,10 +4,12 @@
#' The following properties are currently standardized and understood by learners in \CRANpkg{mlr3}:
#' * `"missings"`: The learner can handle missing values in the data.
#' * `"weights"`: The learner supports observation weights.
#' * `"offset"`: The learner can incorporate offset values to adjust predictions.
#' * `"importance"`: The learner supports extraction of importance scores, i.e. comes with an `$importance()` extractor function (see section on optional extractors in [Learner]).
Contributor Author

@be-marc hotstart_forward, hotstart_backward and featureless are not documented here, but perhaps they should be?

#' * `"offset"`: Offset values specifying fixed adjustments for model training.
#' These values can be used to provide baseline predictions from an existing model for updating another model.
#' Some learners require an offset for each target class in a multiclass setting.
#' In this case, the offset columns must be named `"offset_{target_class_name}"`.
#'
Member

Must be integer or numeric

#' * `"offset"`: Offset values specifying fixed adjustments for model training.
#' These values can be used to provide baseline predictions from an existing model for updating another model.
#' Some learners require an offset for each target class in a multiclass setting.
#' In this case, the offset columns must be named `"offset_{target_class_name}"`.
#'
Member

Simple models just shift the prediction. Also add a link to learners.
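
To illustrate the "simple models just shift the prediction" point, here is a minimal base-R sketch using `stats::lm`, not an mlr3 learner. The simulated data and the variable names (`offset_values`, `fit_offset`, `fit_shifted`) are assumptions made for the example only.

```r
set.seed(1)
n = 50
x = rnorm(n)
offset_values = runif(n)  # fixed, known adjustment per observation
y = 2 + 3 * x + offset_values + rnorm(n, sd = 0.1)

# Fit with an offset: it enters the linear predictor with a fixed coefficient of 1.
fit_offset = lm(y ~ x, offset = offset_values)

# Equivalent fit: subtract the offset from the target before fitting.
fit_shifted = lm(I(y - offset_values) ~ x)

# Both fits estimate the same intercept and slope.
all.equal(unname(coef(fit_offset)), unname(coef(fit_shifted)))

# For this linear model, the fitted value is the linear predictor plus the offset.
manual_pred = coef(fit_offset)[1] + coef(fit_offset)[2] * x + offset_values
all.equal(unname(fitted(fit_offset)), unname(manual_pred))
```

Learners with a link function or with per-class offsets may use the values differently, so this covers only the simplest case.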

task$col_roles$offset = character()
expect_true("offset" %nin% task$properties)
expect_null(task$offset)
})
Member

Add autotest for mlr3learners

return(NULL)
}

self$backend$data(private$.row_roles$use, offset_cols)
Collaborator

maybe setnames() or sanitize in some other way, as we do in the other ABs

return(NULL)
}

self$backend$data(private$.row_roles$use, offset_cols)
Member

Add row_id and offset name
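
One possible reading of the two suggestions above (sanitized names, plus a row id column) is sketched below in base R. This is not the PR's implementation: `get_offset`, `backend_data`, `rows_in_use` and the column name `log_exposure` are hypothetical, and the sketch covers only a single offset column, not the multiclass `offset_{target_class_name}` case.

```r
# Hypothetical stand-ins for the backend data and the rows in use.
backend_data = data.frame(
  row_id = 1:5,
  x = c(0.5, 1.2, 0.3, 2.1, 1.7),
  log_exposure = c(0.0, 0.1, 0.2, 0.3, 0.4)
)
rows_in_use = c(1L, 3L, 4L)
offset_col = "log_exposure"

# Return the row ids alongside the offset values under a fixed column name.
get_offset = function(data, rows, offset_col) {
  if (length(offset_col) == 0L) {
    return(NULL)  # no offset role set
  }
  out = data[data$row_id %in% rows, c("row_id", offset_col), drop = FALSE]
  names(out)[names(out) == offset_col] = "offset"
  rownames(out) = NULL
  out
}

get_offset(backend_data, rows_in_use, offset_col)
```

Returning the row ids next to a fixed `offset` name means callers do not need to know the original column name or rely on row order.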

4 participants