Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PEM Pipeline #417
base: main
Are you sure you want to change the base?
PEM Pipeline #417
Changes from 24 commits
6eb5a6e
4d49fa3
d786d08
b7bc0c6
5d6b61b
c19e4bb
717478d
976d9c0
35745f0
e2a6c21
5a75617
a3bdbf5
6bf7332
690b5c6
f341679
d684cd0
b14e678
f00f1f9
7a848bf
2fc11a4
c035e9b
cb52f0b
4e509e4
82b8af7
0532d71
ec0ffdd
a5402c3
e1404b5
bb009bd
e232df6
5788225
22ca02b
f47f72a
7c35ba3
484a9fd
1621f43
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent! I would suggest 1) remove the time-dependency as we don't support it, ie
x
instead ofx(t)
2) describe a bitg
function? 3) add reference via the bibtex file => Andreas 2018 paper (A generalized additive model approach to time-to-event analysis)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some additional descriptions for clarification. At the same time, I am wondering whether PipeOpPred is the appropriate place for these mathematical explanations. I feel, as a whole, they might be more appropriate in the doc of the pipeline, with exception of the backtransform portion, i.e. how we get surv probs from hazards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point - it makes sense to put the related doc directly in the class that implementes what the docs says - and provide the links to that in the pipeline, but up to you if you want to add extra math doc somewhere (even a bit duplicated), always welcome!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll ask Andreas what he thinks/prefers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Task: I think this is the part that sometimes results in surv probabilities that are not descreasing, right?
Example:
Can we please identify why that is happening? is it some sort of arithmetic instability thing? or some calculations above with the offset are wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Always need to specify the family/objective (depending on the learner) as "poisson" to establish the exponential link between hazard and the learned model. So this works as intended, as long as the learner is correctly specified, however, the pipeline does not check whether that has been done. I suppose one can check for this, but not sure if some learners use arguments other than features and family to specify the distributional assumption.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah nice! We should document that in the example of the pipeline (and the vignette). If you want to do the extra leg, you can create a small function that checks the learner in the pipeline for
family
/objective
parameters, and if it doesn't find the keywordpoisson
, throws a warning "PEM works correcty with learners that support poisson regression". Andreas' list of candidates learners is enough (i.e. is not that many either way)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...numeric feature... (please verify) => add this also in the DiscTime pipeop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
numeric indeed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more precisely: has a column with col_role
offset
which is the ... log of something?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mroe accurately: offset in not a
feature
, ie it doesn;'t havbe thecol_role
feature, but theoffset
oneThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... that supports poisson regression....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: when I finish the
mlr3extralearners
PR, we can safely remove this comment here.Also correct the example + make it a bit more interesting, e.g. =>
l = lrn("regr.gam", formula = pem_status ~ s(age) + s(tend), family = "poisson")
=> you definitely need the familypoisson
argument hereThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to experiment and implement the validation stuff for
xgboost
, here is a bit of what is happening:task
will have a predefined validation task here, which is not transformed. what we need to do is something like:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removing redundant empty lines in all code would be nice - some space is good, more space is unnecessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a bit more proper indentation style =>
target
should be belownew(
<= here, please check all code for thisThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: no
'
anywhere in the code please, use only"
!There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of extra space: a good informative comment!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Frankly, not quite sure what's the purpose of
data[[event_var]] = 1
As for
data[[time_var]] = max_time
, this ensures that for each subject the ped data spans over all intervals instead of only until the event time, which of course is sensible for prediction. I added this as a comment.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# for pem