This will probably (at least) replace the Prepare > Data Reshape > Random Subset dialogue, but for now it should be added as (at least one) new dialogue. One aspect, important enough I think, is as follows:
a) It can divide a data frame into a training data frame (default 80%) and a test data frame (default 20%). What's important is that it can do this in a structured way, allowing for strata (e.g. districts), multi-level data (e.g. families in a person-level data frame) and time-series data. Given how easily we handle multiple data frames, that's all nice and visual in R-Instat. I think it can easily either split into 2 data frames or just add a new factor to specify the split. (Then we can easily do the split later.)
b) It also does bootstrap, jackknife (now called loo) etc. That might belong somewhere different, but we should investigate soon. I think this can generate many thousands of data frames, so we may not want to show them?
So there may be dialogues elsewhere that use different functions from rsample. It needs a good study, but it is useful for working with tidymodels and for getting us going on cross-validation etc.
c) I was pleased to see it has a reference to the caret package. That fits with what we have been doing to make a start on machine learning.
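As a starting point for a), here is a minimal sketch of a stratified 80/20 split with rsample. It assumes the built-in `iris` data, with `Species` standing in for a stratum such as district. The `row_id` and `split` columns are hypothetical names used only for illustration. The proportion is set explicitly rather than relying on a default.

```r
library(rsample)

set.seed(123)  # make the random split reproducible

# Add a row identifier so each row can be traced back after splitting
# (row_id is an illustrative name, not something rsample requires)
dat <- iris
dat$row_id <- seq_len(nrow(dat))

# Stratified split: 80% training, 20% test, sampled within each Species
data_split <- initial_split(dat, prop = 0.8, strata = Species)

# Option 1: two separate data frames
train_df <- training(data_split)
test_df  <- testing(data_split)

# Option 2: keep one data frame and add a factor that records the split
dat$split <- factor(
  ifelse(dat$row_id %in% train_df$row_id, "train", "test"),
  levels = c("train", "test")
)

# Cross-tabulate to check the split is balanced across strata
table(dat$split, dat$Species)
```

For the multi-level and time-series cases, rsample also provides `group_initial_split()` and `initial_time_split()`. Whether these cover everything we need would be part of the investigation.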
I don't think the exciting parts of this package should delay us from making a simple start. That should be checked in an initial investigation.
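For the initial investigation of b), a rough sketch of what the bootstrap and leave-one-out functions return, assuming the built-in `mtcars` data. As far as I understand it, the result is a single table with one row per resample, and each resample is held as a split object rather than a full copy of the data. That may answer the question of whether to show them all.

```r
library(rsample)

set.seed(123)  # make the resampling reproducible

# 1000 bootstrap resamples, returned as one table with a `splits` column
boots <- bootstraps(mtcars, times = 1000)
nrow(boots)

# A resample only becomes an ordinary data frame when we ask for it
first_boot <- analysis(boots$splits[[1]])
dim(first_boot)

# Leave-one-out: one resample per row of the original data
loo <- loo_cv(mtcars)
nrow(loo)
```

So a dialogue could keep the resample table as an object and only extract individual data frames on request.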