-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Account for feature data when merging objects #610
Conversation
…texp-merged-objects
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly this seems entirely fine to me and ready to go, no notes! 🎉
Well ok, one note that I can't tell how I feel about, so tell me how you feel about it and that will be the answer! But, I feel fine approving as is! Tag me back for more discussion as needed.
merge.nf
Outdated
input: | ||
tuple val(project_id), val(library_ids), path(scpca_nf_file) | ||
tuple val(project_id), val(has_adt), val(library_ids), path(scpca_nf_file) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we want to change the name of this value project_id
within this process only to something like merge_group_id
, for future us? For now, it will always refer to a project id, but the name merge_group_id
is slightly more flexible for different groupings down the line. There's a few spots to replace this value in this process (L34, L36 [here], and L38) if you want to make that change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was torn on whether or not to use project_id
or not in the process, so I'll go ahead and change it!
Closes #595
This PR accounts for libraries that have feature data before merging. Specifically, any libraries that have ADT/CITE-seq data will retain the
altExp
in the object, while any libraries with cell hashing will not include thealtExp
in the merged object. I made the following changes:--include_alt_exp
. This mirrors the argument being added in Add functionality to merge altExps scpcaTools#243, but here I am setting the default to be false. So by default thealtExp
will not be included unless the data specifically has CITE-seq/adt data.has_adt
value that gets passed to the process for merging. Ifhas_adt
is true, then the--include_alt_exp
flag is used. For all other cases, no altExp data will be kept.altExp
will be kept.publishDir
for the merge step. Now that we are only producing the merged objects and not integrating, we want to make sure we output this.The last thing I want to note is that previously there was a question about why we were including a filter for
seq_unit in ["cell", "nucleus"]
and theunique
call. On working on this, I realized that CITE/cell hashing data will have two libraries that haveseq_unit
equal tocell
ornucleus
. However the contents of thetecnology
column will be different. So the first filter is to just remove bulk and spatial data from being merged. Theunique
is to remove duplicates because of CITE/ cell hashing for the same library. I updated the comments here to reflect that.