Include rsample in R-Instat #7210

rdstern · 2022-02-08T19:04:13Z

rdstern
Feb 8, 2022
Maintainer

This will probably (at least) replace the Prepare > Data Reshape > Random Subset dialogue, but for now it should be added as (at least one) new dialogue. The important aspects are, I think, as follows:
a) It can divide a data frame into a training data frame (default 75%) and a test data frame (default 25%). What's important is that it can do this in a structured way, allowing for strata (e.g. districts), multi-level data (e.g. families in a person-level data frame) and time-series data. Given how easily R-Instat handles multiple data frames, that is all nice and visual. I think it can easily either split into 2 data frames or just add a new factor to specify the split. (Then we can easily do the split later.)
b) It also does the bootstrap, jackknife (now called loo), etc. That might belong somewhere different, but we should investigate soon. I think this can generate many thousands of data frames, so we may not want to show them?
So there may be dialogues elsewhere that use different functions from rsample. It needs a good study, but it is useful for working with tidymodels and for getting us going on cross-validation etc.
c) I was pleased to see it has a reference to the caret package. That fits with what we have been doing to make a start on machine learning.
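
To make the "add a new factor" idea in a) concrete, here is a minimal, language-neutral sketch in Python. It is not rsample's API or its algorithm (rsample is an R package, and its stratification has more options); the function name `add_split_column`, the column name `split` and the example data are all made up for illustration. It samples within each stratum, so each district contributes about the same proportion of rows to the training set, and it records the result as an extra column instead of creating two data frames.

```python
import random
from collections import defaultdict


def add_split_column(rows, strata_key, prop=0.8, seed=1):
    """Return copies of `rows` with an extra 'split' value: 'train' or 'test'.

    Sampling is done separately within each stratum, so every stratum
    sends roughly `prop` of its rows to the training set.
    """
    rng = random.Random(seed)

    # Group row positions by the value of the stratification column.
    by_stratum = defaultdict(list)
    for i, row in enumerate(rows):
        by_stratum[row[strata_key]].append(i)

    # Pick the training rows inside each stratum, without replacement.
    train_idx = set()
    for indices in by_stratum.values():
        n_train = round(prop * len(indices))
        train_idx.update(rng.sample(indices, n_train))

    # The original rows are left untouched; only the copies get the new column.
    return [
        dict(row, split="train" if i in train_idx else "test")
        for i, row in enumerate(rows)
    ]


# Hypothetical data: 10 people in each of two districts.
rows = [{"id": i, "district": "A" if i < 10 else "B"} for i in range(20)]
labelled = add_split_column(rows, "district", prop=0.8)
```

With the split stored as a column, producing the two separate data frames later is just a filter on that column, which is the "do it later" option mentioned in a).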

I don't think the exciting parts of this package should delay us from making a simple start. That should be checked in an initial investigation.
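
For the later investigation of b), the worry about showing thousands of data frames may be avoidable if each resample is kept as a set of row indices into the one original data frame. The sketch below illustrates that idea in plain Python; it is an assumption about how this could be organised, not a description of rsample's internals, and the function names are hypothetical. The labels "analysis" and "assessment" follow the terms rsample uses for the two parts of a resample.

```python
import random


def bootstrap_indices(n, times, seed=1):
    """Yield `times` pairs (analysis, assessment) of row indices.

    analysis:   n indices drawn with replacement from 0..n-1.
    assessment: the rows that were never drawn in that resample.
    """
    rng = random.Random(seed)
    for _ in range(times):
        analysis = [rng.randrange(n) for _ in range(n)]
        drawn = set(analysis)
        assessment = [i for i in range(n) if i not in drawn]
        yield analysis, assessment


def leave_one_out_indices(n):
    """Yield n pairs (analysis, assessment); each holds out exactly one row."""
    for held_out in range(n):
        yield [i for i in range(n) if i != held_out], [held_out]


# 1000 bootstrap resamples of a hypothetical 50-row data frame:
# only lists of integers are stored, no copies of the data.
resamples = list(bootstrap_indices(n=50, times=1000))

# Leave-one-out on a hypothetical 5-row data frame.
loo = list(leave_one_out_indices(5))
```

Under this arrangement a dialogue would only need to display the original data frame plus a summary of the resamples (how many, and their sizes), not each one individually.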
