feature: updates for cellranger 9.0 #9541

stephenwilliams22 · 2024-12-10T20:02:21Z

This PR adds support for cellranger 9.0 which performs automatic cell type annotation. Creates a simple loading function similar to Load10x_spatial for 10x single cell data which will also put cell annotations in the meta.data.

~~Changes to other .Rd files result from the roxygenize() function~~

decided to let @dcollins15 roxigenize on his end to get formatting and documentation that matches the rest of the repo

rharao · 2024-12-11T16:27:06Z

Thanks for this! I'm excited to try the automatic cell type calling in Cellranger 9.0.
I'm not part of the developer team, but would you be open to some comments on this addition from a user's perspective?

stephenwilliams22 · 2024-12-11T16:52:40Z

Thanks for this! I'm excited to try the automatic cell type calling in Cellranger 9.0. I'm not part of the developer team, but would you be open to some comments on this addition from a user's perspective?

100%. Comments welcome.

rharao

Appreciate it!

NAMESPACE

rharao · 2024-12-11T17:24:25Z

R/preprocessing.R

+#' @param filename Name of H5 file containing the feature barcode matrix
+#' @param filter.matrix Only keep spots that have been determined to be over
+#' tissue
+#' @param to.upper Converts all feature names to upper case. This can provide an


(Preemptive-- this parameter doesn't seem to be implemented yet)
In my opinion, it would be better not to suggest that blind uppercasing is ever an appropriate way of finding mouse-human orthologs, even in an exploratory analysis.
If the user has some good reason to convert all the symbols to uppercase, they can simply use toupper and replace the rownames of the object.

This option was put in originally by the seurat team. I'll let them address the usefulness here.

The parameter was introduced well before my time but I tend to agree with @rharao.

R/preprocessing.R

stephenwilliams22 · 2024-12-11T20:45:35Z

@dcollins15 sorry for the little formatting edits. Not really sure where those were introduced. Feel free to fix and roxygenise and push directly to my branch as it seems what I'm going on my side is introducing little diffs.

stephenwilliams22 · 2024-12-11T21:30:54Z

@evolvedmicrobe I wanted to give you a heads up on this PR

rharao

I like it!

rharao · 2024-12-11T21:54:13Z

R/preprocessing.R

+      )
+    })
+
+    for (i in 1:seq_along(seurat.list)) {


1:seq_along(seurat.list)

should be seq_along(seurat.list).
Alternatively, you could replace the for loop with:

seurat.list <- lapply(seurat.list, \(x) if (Assays(x) %in% c("Gene Expression", "RNA")) {Add_10X_CellTypes(data.dir, x)} else {x})

I did also notice that if Add_10X_CellTypes doesn't find a cell types csv, it returns the original without warning or error, so if you want to rely on that, you could skip the conditional here.

rharao · 2024-12-11T22:06:19Z

R/preprocessing.R

+  counts <- Read10X_h5(counts.path, ...)
+
+  if (to.upper) {
+    counts <- imap(counts, ~{


Nit: because this formula operates on the counts matrices only and doesn't use the list's names, I think the imap call could be replaced by base::sapply.
Frankly, I'm not sure why this package imports purrr::imap anyway, as I can't find it used anywhere in the codebase, not even in Load10X_Spatial. But that's an issue for another time.

personally, I like the simplicity and readability of tidy functions. I'll leave it up to the seruat developers to decide if they mind the extra import. I haven't heard any complaints since we wrote this function a few years ago.

It's not really an issue; the choice one way or another doesn't affect me. (Personally, I agree; I always prefer the purrr functions in my own scripts.) I only wanted to mention it for the attention of the dev team.
Thanks for your receptiveness 😄

+1 to replacing the imap call with a sapply to match the current implementation of Load10X_Spatial.

Frankly, I'm not sure why this package imports purrr::imap anyway, as I can't find it used anywhere in the codebase, not even in Load10X_Spatial. But that's an issue for another time.

This is also a good point, @importFrom purrr imap should be dropped from Load10X_Spatial and as a suggested dependency. Now that you mention it, @importFrom grid rasterGrob could also be dropped from the function. I'll include these tweaks in #9556 👌

R/preprocessing.R

dcollins15

This is looking great 🎉 Big thanks to @rharao for the code review 🙌

For these types of loader functions, I think the two main priorities should be:

Backward compatibility
Consistency

You've done a great job emulating the existing Load10X_Spatial but I wonder if the two functions can or should share more logic 🤔

I’d also appreciate a bit more clarity on how Load10X is intended to relate to Read10X. Specifically:

What versions of Cell Ranger output are compatible with each?
As far as I can tell, Read10X hasn't been updated since just before the release of Cell Ranger v7.1.0.

I suspect that we could be doing a better job of making Load10X_Spatial and LoadXenium more ~~mutually coherent~~ cohesive but I'm not very familiar with the datatypes or output structure to propose specific suggestions.

On a related note, what's the easiest way to access 10x datasets in their "canonical" structure? I typically download datasets from the 10x Portal but end up having to manually configure the data.dir for Seurat. I assume there’s a better method that I’m just ignorant of.

This leads nicely into my other concern: testing. Creating small, representative datasets for testing was one of the more tedious parts of the Visium HD updates. Any thoughts on streamlining this process would be great.

@stephenwilliams22, now that I’ve had a thorough look, I think it would be best to hold off on merging this until after the next release (v5.2.0).

dcollins15 · 2024-12-16T23:52:44Z

R/preprocessing.R

+  counts <- Read10X_h5(counts.path, ...)
+
+  if (to.upper) {
+    counts <- imap(counts, ~{


+1 to replacing the imap call with a sapply to match the current implementation of Load10X_Spatial.

Frankly, I'm not sure why this package imports purrr::imap anyway, as I can't find it used anywhere in the codebase, not even in Load10X_Spatial. But that's an issue for another time.

This is also a good point, @importFrom purrr imap should be dropped from Load10X_Spatial and as a suggested dependency. Now that you mention it, @importFrom grid rasterGrob could also be dropped from the function. I'll include these tweaks in #9556 👌

dcollins15 · 2024-12-17T00:00:31Z

R/preprocessing.R

+#' @param filename Name of H5 file containing the feature barcode matrix
+#' @param filter.matrix Only keep spots that have been determined to be over
+#' tissue
+#' @param to.upper Converts all feature names to upper case. This can provide an


The parameter was introduced well before my time but I tend to agree with @rharao.

dcollins15 · 2024-12-17T19:30:12Z

R/preprocessing.R

+  } else {
+    object <- CreateSeuratObject(counts, assay = assay)
+    if (Assays(object)  %in%  c("Gene Expression", "RNA")) {
+      object <- Add_10X_CellTypes(data.dir, object)
+    }


I'm not sure that is.list(counts) conditional is necessary. Instead, you can just force counts to always be a list, and then you can do without the else block.

It looks like there's just one CellTypes file per data.dir? If that's the case, this could be simplified even more—the call to Add_10X_CellTypes could be made on the merged.object.

rharao reviewed Dec 11, 2024

View reviewed changes

dcollins15 self-requested a review December 11, 2024 18:40

stephenwilliams22 force-pushed the stephen/cellranger-9.0 branch from 7cd4ae0 to 90cd6d7 Compare December 11, 2024 20:33

fix little formatting

3038586

stephenwilliams22 force-pushed the stephen/cellranger-9.0 branch from 4e91581 to 3038586 Compare December 11, 2024 20:50

Merge branch 'develop' into stephen/cellranger-9.0

c36bb1c

stephenwilliams22 requested a review from rharao December 11, 2024 21:29

rharao reviewed Dec 11, 2024

View reviewed changes

use base r and small changes

1d575a1

dcollins15 reviewed Dec 17, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feature: updates for cellranger 9.0 #9541

feature: updates for cellranger 9.0 #9541

stephenwilliams22 commented Dec 10, 2024 •

edited

Loading

rharao commented Dec 11, 2024

stephenwilliams22 commented Dec 11, 2024

rharao left a comment

rharao Dec 11, 2024

stephenwilliams22 Dec 11, 2024

dcollins15 Dec 17, 2024

stephenwilliams22 commented Dec 11, 2024

stephenwilliams22 commented Dec 11, 2024

rharao left a comment

rharao Dec 11, 2024

rharao Dec 11, 2024

rharao Dec 11, 2024

stephenwilliams22 Dec 12, 2024

rharao Dec 12, 2024

dcollins15 Dec 16, 2024

dcollins15 left a comment •

edited

Loading

dcollins15 Dec 16, 2024

dcollins15 Dec 17, 2024

dcollins15 Dec 17, 2024

feature: updates for cellranger 9.0 #9541

Are you sure you want to change the base?

feature: updates for cellranger 9.0 #9541

Conversation

stephenwilliams22 commented Dec 10, 2024 • edited Loading

rharao commented Dec 11, 2024

stephenwilliams22 commented Dec 11, 2024

rharao left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

stephenwilliams22 commented Dec 11, 2024

stephenwilliams22 commented Dec 11, 2024

rharao left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dcollins15 left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

stephenwilliams22 commented Dec 10, 2024 •

edited

Loading

dcollins15 left a comment •

edited

Loading