Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Account for feature data when merging objects #610

Merged
merged 5 commits into from
Dec 11, 2023

Conversation

allyhawkins
Copy link
Member

Closes #595

This PR accounts for libraries that have feature data before merging. Specifically, any libraries that have ADT/CITE-seq data will retain the altExp in the object, while any libraries with cell hashing will not include the altExp in the merged object. I made the following changes:

  • To the script for merging SCE's, I added an option to --include_alt_exp. This mirrors the argument being added in Add functionality to merge altExps scpcaTools#243, but here I am setting the default to be false. So by default the altExp will not be included unless the data specifically has CITE-seq/adt data.
  • I create a list of all project ids that have at least one library with CITE-seq data. For each project being processed, I created a has_adt value that gets passed to the process for merging. If has_adt is true, then the --include_alt_exp flag is used. For all other cases, no altExp data will be kept.
  • This means we do not explicitly check for multiplexed data, but I don't think we need that information since they will be treated like regular single-cell/ single-nuclei where no altExp will be kept.
  • This approach also implies that if a project has some libraries with CITE/hashing and some libraries without, that will be handled within the function in Add functionality to merge altExps scpcaTools#243. My understanding of that PR is that is true.
  • The other thing I did was change the publishDir for the merge step. Now that we are only producing the merged objects and not integrating, we want to make sure we output this.

The last thing I want to note is that previously there was a question about why we were including a filter for seq_unit in ["cell", "nucleus"] and the unique call. On working on this, I realized that CITE/cell hashing data will have two libraries that have seq_unit equal to cell or nucleus. However the contents of the tecnology column will be different. So the first filter is to just remove bulk and spatial data from being merged. The unique is to remove duplicates because of CITE/ cell hashing for the same library. I updated the comments here to reflect that.

Copy link
Member

@sjspielman sjspielman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly this seems entirely fine to me and ready to go, no notes! 🎉

Well ok, one note that I can't tell how I feel about, so tell me how you feel about it and that will be the answer! But, I feel fine approving as is! Tag me back for more discussion as needed.

merge.nf Outdated
input:
tuple val(project_id), val(library_ids), path(scpca_nf_file)
tuple val(project_id), val(has_adt), val(library_ids), path(scpca_nf_file)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we want to change the name of this value project_id within this process only to something like merge_group_id, for future us? For now, it will always refer to a project id, but the name merge_group_id is slightly more flexible for different groupings down the line. There's a few spots to replace this value in this process (L34, L36 [here], and L38) if you want to make that change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was torn on whether or not to use project_id or not in the process, so I'll go ahead and change it!

@allyhawkins allyhawkins merged commit 081f319 into development Dec 11, 2023
3 checks passed
@allyhawkins allyhawkins deleted the allyhawkins/altexp-merged-objects branch December 11, 2023 15:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants