Multi-sample API #211
Comments
I like the idea of a sample-group, though I don't know how well that extends to @julia326's use case(s). It might be helpful here to document the analyses that one would want to enable when there are multiple samples. This might help to motivate the API for using multi-sample data. For example: is there some change in status, such as comparing pre-tx vs post-tx? Could there be different settings on the samples (e.g. bqsr vs not)? In each of these cases, I can imagine one sample might be the "default" (pre & with-bqsr) and another might be referenced on-demand.
@jburos definitely! I suppose my motivating "analysis" to start with was running Epidisco/other pipelines for all samples.
Will also be useful to talk to @julia326 about what types of analyses we could enable here.
@tavinathanson great, I can understand wanting to run the pipelines, but then... do what with the results? I am bringing this up again because we're approaching this problem from the other end, so to speak, for a different project. For this cohort, we have a subset of patients with pre/post Tx RNA samples & already have epidisco pipeline results for these samples. I am now thinking about how to extend cohorts in order to process them. In my use case (granted, parts of this aren't yet supported by cohorts, but putting it here for the record), I'd like to be able to:
Seems to me that a lot of the above could be facilitated with a sample label or keyword. Again, not thinking here about discohorts, just cohorts.
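To make the "default" vs. on-demand idea concrete, here is a rough sketch of label-based lookup. Everything in it is hypothetical: `get_sample`, `DEFAULT_LABEL`, the label strings, and the sample IDs are invented for illustration and are not part of cohorts.

```python
from typing import Dict, Optional

# Hypothetical default label; a real API would probably make this configurable.
DEFAULT_LABEL = "pre-tx"


def get_sample(samples_by_label: Dict[str, str], label: Optional[str] = None) -> str:
    """Return the sample ID for `label`, or for the default label if none is given."""
    key = DEFAULT_LABEL if label is None else label
    if key not in samples_by_label:
        raise KeyError(
            f"no sample labeled {key!r}; available: {sorted(samples_by_label)}"
        )
    return samples_by_label[key]


# Made-up sample IDs for one patient with two labeled samples.
patient_samples = {"pre-tx": "sample_001", "post-tx": "sample_002"}

default_sample = get_sample(patient_samples)          # uses the default label
post_sample = get_sample(patient_samples, "post-tx")  # referenced on demand
```

Existing single-sample code would keep working against the default, and anything that needs the other sample asks for it by label.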
We haven't necessarily wanted to incorporate multiple samples into one `Cohort` object, as that complicates every part of the library. For example:
- [...] `missense_snv_count` count?
- [...] `tumor_sample` and `normal_sample`?

For many use cases, different `Cohort` objects can just be created with different sets of samples.

However, questions pop up like:
One thought is a separate `SampleCollection` for specifying a bunch of samples and optionally creating `Cohort`s from those samples. And perhaps a `SampleGroup` (where `Patient` can extend `SampleGroup`) to link samples together when we don't have the clinical data appropriate for a `Patient`. Something like:
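A minimal sketch of how these pieces might fit together, under stated assumptions: the class names `SampleCollection`, `SampleGroup`, and `Patient` come from the description above, but every field, the `Sample` class, and the `with_label`/`select` methods are hypothetical and do not reflect the actual cohorts API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Sample:
    """Hypothetical sample record (not the cohorts library's own class)."""
    id: str
    is_tumor: bool
    label: Optional[str] = None  # e.g. "pre-tx" or "post-tx"


@dataclass
class SampleGroup:
    """Links related samples together without requiring clinical data."""
    id: str
    samples: List[Sample] = field(default_factory=list)

    def with_label(self, label: str) -> List[Sample]:
        return [s for s in self.samples if s.label == label]


@dataclass
class Patient(SampleGroup):
    """A SampleGroup that additionally carries clinical data."""
    clinical: Dict[str, object] = field(default_factory=dict)


@dataclass
class SampleCollection:
    """Holds many groups; each selection could back one single-sample Cohort."""
    groups: List[SampleGroup] = field(default_factory=list)

    def select(self, label: str) -> Dict[str, List[Sample]]:
        """Map group ID to its samples with `label`, skipping groups with none."""
        selected = {g.id: g.with_label(label) for g in self.groups}
        return {gid: samples for gid, samples in selected.items() if samples}


# Made-up data: one Patient with clinical data, one bare SampleGroup without.
collection = SampleCollection([
    Patient(
        "p1",
        [Sample("p1-pre", True, "pre-tx"), Sample("p1-post", True, "post-tx")],
        {"age": 60},
    ),
    SampleGroup("g2", [Sample("g2-pre", True, "pre-tx")]),
])

pre_tx = collection.select("pre-tx")    # would feed a "pre-tx" Cohort
post_tx = collection.select("post-tx")  # would feed a "post-tx" Cohort
```

Under this sketch each `Cohort` still sees one set of samples, so functions such as `missense_snv_count` would not need to know about multiple samples; the multi-sample bookkeeping stays in `SampleCollection`.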