Skip to content

2.2 Spatial RNA: Curio Seeker

Pingxu0101 edited this page Jul 4, 2024 · 22 revisions

Use code in (2.1 Spatial RNA: Visium) to analysis Brain data

Mouse Brain Curio Seeker Data Analysis Workflow

In this part, we will focus on the Curio Mouse Brain data, which is generated with the Curio Seeker platform.
The Curio Seeker platform is designed for high-resolution (10μm) spatial transcriptomics, providing detailed insights into gene expression patterns within tissue samples.
Working with high-resolution data in spatial transcriptomics can be both computationally intensive and time-consuming.
To manage this more efficiently, we will downsample a smaller dataset in this workshop.

Datainformation:

Data is download from Curio Bioscience The Curio Seeker bioinformatics pipeline report

Data Download:

Raw rds file from Curio Seeker bioinformatics pipeline Brain_10X10
Downsampled object: Brain_subset
Brain single cell RNA object Brain_SC

If you have memory error, try smaller dataset

Brain_subset_100cells

Note:

a: The spatial analysis covered in 2.1 Spatial RNA: Visium can also be applied to the Curio Brain data.
b: Both Seurat 4 and Seurat 5 can be used here.
c: CellTrek is compatible with Seurat 4.
d: Seurat 5 contains new functions, such as sketch.
The sketch function generates a "sketch" of the dataset that retains the overall structure and main features while being smaller and easier to handle.
Example Analysis of Curio Seeker Data Using Seurat (Version 5.1.0)

Activate your Conda environment named 'course4.3' on Linux

conda activate course4.3
which R

Open a new terminal window and open RStudio on Linux

conda deactivate
export RSTUDIO_WHICH_R= your_course4.3_R_directory
open -na rstudio

R code

# Set the library path
# The required packages have already been installed. 
.libPaths("~/VisiumLibrary")
# Specify the working directory
your_working_dir <- "/Users/admin/Desktop/CSHL_2024/Curio_Brain_downsample/your_working_folder/"
# Set Working Directory
setwd(your_working_dir)
# Load necessary packages 
suppressPackageStartupMessages({
    library(hdf5r)
    library(Seurat)
    library(dplyr)
    library(RColorBrewer)
    library(CellTrek)
    library(ggplot2)
    library(spacexr)
    library(paletteer)
})
# Check Seurat version
packageVersion("Seurat")
# Path to the raw output from the Curio Seeker bioinformatics pipeline
Brain_10X10_Path <- "/Users/admin/Desktop/CSHL_2024/Curio_Brain_downsample/Mouse_brain_10by10_seurat.rds"

# Path to the downsampled smaller dataset
Brain_subset_Path <- "/Users/admin/Desktop/CSHL_2024/Curio_Brain_downsample/Brain_subset.rds"

# Path to the Brain SC V4_obj
Brain_SC_Path <- "/Users/admin/Desktop/CSHL_2024/Curio_Brain_downsample/your_working_folder/MouseBrain_snRNA_V4_obj.rds"

# Path to the Processed Brain ST data
Processed_filtered_Brain_Path <-"/Users/admin/Desktop/CSHL_2024/Curio_Brain_downsample/your_working_folder/Processed_filtered_Brain.rds"
# Load rds object from Curio Seeker bioinformatics pipeline
Brain_10X10 <- readRDS(Brain_10X10_Path)
# Check UMAP and clusters
DimPlot(Brain_10X10)
image
# cluster from Curio Seeker bioinformatics pipeline
SpatialDimPlot(Brain_10X10, pt.size.factor = 0.5)
image
# cluster from Curio Seeker bioinformatics pipeline
SpatialFeaturePlot(Brain_10X10, features = c('log10_nFeature_RNA','log10_nCount_RNA'), pt.size.factor = 0.5)
image

Note: Using 'Brain_subset <- subset(x = Brain_10X10, downsample = 2000)' may result in memory limit errors.

# Code used for reproducibility to create the subset
set.seed(123) 
# Code used to downsample 1000 cells per identity class 
Brain_subset <- subset(x = Brain_10X10, downsample = 2000)
# Code used save the subset
saveRDS(Brain_subset, file = 'Brain_subset.rds') 

Instead, let's read the pre-saved subset object.

# Read the subset
Brain_subset <- readRDS(Brain_subset_Path) 
# Check object (52581 spots)
Brain_subset
image
# Remove the raw data object to free up memory
rm(Brain_10X10)

Add number of genes per UMI for each spot to metadata

Brain_subset$log10GenesPerUMI <- log10(Brain_subset$nFeature_RNA) / log10(Brain_subset$nCount_RNA)

# Check the metadata 
head(Brain_subset@meta.data)
image
# Visualize the number of transcripts per spot.
Brain_subset@meta.data %>% 
  ggplot(aes( x=nCount_RNA, fill= 'orig.ident')) + 
  geom_density(alpha = 0.2) + 
  scale_x_log10() + 
  theme_classic() +
  ylab("Cell density") +
  geom_vline(xintercept = 200)+NoLegend()
image
# Then you can tell the differences between our DCIS and Brain data
Brain_subset@meta.data %>% 
  ggplot(aes(x = nCount_RNA, fill = orig.ident)) + 
  geom_density(alpha = 0.2) + 
  scale_x_log10() + 
  theme_classic() +
  ylab("Cell density") +
  geom_vline(xintercept = 50, color = "blue") +
  geom_vline(xintercept = 200, color = "red", linetype = "dashed") +
  NoLegend()
image
# Visualize the distribution of genes detected per spot.
Brain_subset@meta.data  %>% 
  ggplot(aes( x=nFeature_RNA, fill= 'orig.ident')) + 
  geom_density(alpha = 0.2) + 
  scale_x_log10() + 
  theme_classic() +
  ylab("Cell density")  +
  geom_vline(xintercept = 50, color = "blue") +
  geom_vline(xintercept = 200, color = "red", linetype = "dashed") +
  NoLegend()
image
# Visualize the overall complexity of the gene expression by visualizing the genes detected per UMI
Brain_subset@meta.data %>%
  ggplot(aes(x=log10GenesPerUMI, fill= 'orig.ident')) +
  geom_density(alpha = 0.2) +
  theme_classic() +
  ylab("Cell density") +
  geom_vline(xintercept = 0.8, color = "blue")  +
  NoLegend()
image
# Visualize the Mitochondrial counts ratio
Brain_subset@meta.data %>%
  ggplot(aes(x=percent.mt, fill= 'orig.ident')) +
  geom_density(alpha = 0.2) +
  theme_classic() +
  ylab("Cell density") +
  geom_vline(xintercept = 0.15, color = "red", linetype = "dashed")+
  NoLegend()
image
# Filtering
filtered_Brain <- subset(x = Brain_subset, 
                         subset= (nCount_RNA >= 50) & 
                           (nFeature_RNA >= 50) & 
                           (log10GenesPerUMI > 0.80))
# Check object 
filtered_Brain
image
# Remove the Brain_subset object to free up memory
rm(Brain_subset)
# Check the X y information in the ST object
# Check SPATIAL xy information
head(filtered_Brain@reductions$SPATIAL@cell.embeddings)
ggplot(filtered_Brain@reductions$SPATIAL@cell.embeddings, aes(x=SPATIAL_1, y=SPATIAL_2)) +
  geom_point(size=0.5 ,color= '#3585A6')
image
# Normalization and runPCA
DefaultAssay(filtered_Brain) = 'RNA'
filtered_Brain <- NormalizeData(filtered_Brain, assay='RNA') %>% 
  FindVariableFeatures() %>% 
  ScaleData(features=rownames(filtered_Brain))

filtered_Brain <- RunPCA(filtered_Brain, features=VariableFeatures(filtered_Brain),
                         verbose = F)
# ElbowPlot
ElbowPlot(filtered_Brain, ndims = 50) 
image
# RunUMAP
filtered_Brain <- FindNeighbors(filtered_Brain, reduction='pca', dims = 1:25)
filtered_Brain <- FindClusters(filtered_Brain, resolution = 0.1)
filtered_Brain <- RunUMAP(filtered_Brain, dims = 1:25)

# We can then visualize the results of the clustering in UMAP space. 
plot1 <- DimPlot(filtered_Brain, label = T,reduction = "umap", 
                 cols=colorRampPalette(brewer.pal(12,"Paired"))(16))
plot1
image
# We can then visualize the results of the clustering in the spatial coordinate space. 
SpatialDimPlot(filtered_Brain, label = F, pt.size.factor = 0.5)
image
# Improve UMAP
xy_df  <- data.frame(rowname = rownames(  filtered_Brain@reductions$SPATIAL@cell.embeddings), 
                             x = filtered_Brain@reductions$SPATIAL@cell.embeddings[, 'SPATIAL_1'],
                             y = filtered_Brain@reductions$SPATIAL@cell.embeddings[, 'SPATIAL_2']
)
head(xy_df)
image
# Extract the seurat_clusters column while keeping the row names
cluster_df <- data.frame(rowname = rownames(filtered_Brain@meta.data), 
                         seurat_clusters = filtered_Brain@meta.data[, 'RNA_snn_res.0.1'])

head(cluster_df)
image
merged_df <- left_join(xy_df,cluster_df, by='rowname' )
head(merged_df)
image
# plot
ggplot(merged_df, aes(x = x, y = y, color = as.factor(seurat_clusters))) +
  geom_point(size = 0.5, shape = 16, alpha = 1) +  
  scale_color_manual(values = colorRampPalette(brewer.pal(12, "Paired"))(28),
                     name = "Clusters") +
  theme_minimal() +
  guides(color = guide_legend(override.aes = list(size = 3)))  # Increase legend key size
image
# Find marker genes
filtered_Brain_markers <- FindAllMarkers(filtered_Brain, only.pos = T)
filtered_Brain_top5 <- filtered_Brain_markers %>% 
  group_by(cluster) %>% 
  top_n(n=5,wt=avg_log2FC)
# check the filtered_Brain_top10
filtered_Brain_top5
image
# Heatmap visualize
plot1 <- DoHeatmap(filtered_Brain,
                   features = filtered_Brain_top5$gene, 
                   group.colors = colorRampPalette(brewer.pal(12,"Paired"))(16))
plot1
image
# Dot plot visualization
# cols = "RdBu" sets the color scheme to a red-to-blue gradient
plot2 <- DotPlot(object = filtered_Brain, 
                 features = unique(filtered_Brain_top5$gene),
                 col.min = -2,
                 col.max = 1,
                 cols = "RdBu") + 
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
plot2
image
# Check marker genes for Glutamatergic neuron
plot1 <- SpatialFeaturePlot(filtered_Brain, features = c('Slc17a7'), pt.size.factor = 1)+
  scale_fill_gradientn(colors = c('white', '#FF0033'))
# Check marker genes for GABAergic neuron
plot2 <-SpatialFeaturePlot(filtered_Brain, features = c('Slc32a1'), pt.size.factor = 1)+
  scale_fill_gradientn(colors = c('white', '#FF0033'))
plot1|plot2
image
# Check marker gene for Cluster 0
filtered_Brain_top5[filtered_Brain_top5$cluster == '0',]$gene
image
# Check marker gene for Cluster 4
filtered_Brain_top5[filtered_Brain_top5$cluster == '4',]$gene
image
# Check marker gene for Cluster 4
filtered_Brain_top5[filtered_Brain_top5$cluster == '4',]$gene

# Check gene expression spatially
p1 <- SpatialFeaturePlot(filtered_Brain, features = "Vxn", pt.size.factor = 1) +
  scale_fill_gradientn(colors = c('white', '#FF0033')) +
  theme(legend.position = "right")

p2 <- SpatialFeaturePlot(filtered_Brain, features = "Prkcd", pt.size.factor = 1) +
  scale_fill_gradientn(colors = c('white', '#FF0033')) +
  theme(legend.position = "right")

p1|p2
image
# Visualize Spatial cluster
plot1 <- SpatialDimPlot(filtered_Brain, pt.size.factor = 1, 
                        cells.highlight = WhichCells(filtered_Brain, idents =c( '0')),
                        cols.highlight = c("#00F5FF","white"))+ggtitle('Cluster 0 Highlighted' )+NoLegend()
plot2 <-SpatialDimPlot(filtered_Brain, pt.size.factor = 1, 
                       cells.highlight = WhichCells(filtered_Brain, idents =c( '4')),
                       cols.highlight = c("#FF0C00","white"))+ggtitle('Cluster 4 Highlighted' )+NoLegend()
plot1|plot2
image
# Visualize nCount_RNA and nFeature_RNA in each cluster
plot1 <- VlnPlot(filtered_Brain, 
                 features = c("nCount_RNA", "nFeature_RNA"),
                 pt.size = 0, 
                 cols = colorRampPalette(brewer.pal(12,"Paired"))(16))
plot1
image
# FindSpatiallyVariableFeatures
# Downsample to avoid Error: vector memory exhausted (limit reached)
filtered_Brain_V2 <- FindSpatiallyVariableFeatures(subset(x = filtered_Brain, downsample = 100),
                                                   assay = "RNA",
                                                   features = VariableFeatures(filtered_Brain)[1:100],
                                                   selection.method = "moransi")


# Access the meta.features data frame
meta_features <- filtered_Brain_V2[["RNA"]]@meta.features
meta_features <- meta_features[!is.na(meta_features$moransi.spatially.variable.rank),]
# Arrange features based on moransi.spatially.variable.rank
ranked_features <- meta_features %>%
  arrange(moransi.spatially.variable.rank)
head(ranked_features)
image
# Visualize the expression of the top 6 features identified by this measure.
SpatialFeaturePlot(filtered_Brain, 
                   features = rownames(ranked_features)[1:3],
                   pt.size.factor = 1,
                   ncol = 3)& scale_fill_gradientn(colors = c('white', '#FF0033')) 
image
# Save the intermediate file for future use.
saveRDS(filtered_Brain, file = 'Processed_filtered_Brain.rds')

Check scRNA-seq object

Note: reference data is form 10X_Brain_SC
Our group makes annotations based on gene expression profiles.

# Load
Brain_SC <- readRDS(Brain_SC_Path)
Brain_SC
unique(Idents(Brain_SC))
image
# Check scRNA data UMAP
plot1 <- DimPlot(Brain_SC, 
                 label = T,
                 repel = T,
                 cols = colorRampPalette(brewer.pal(12,"Paired"))(11))
plot1
image

Seurat label transfer

# Perform label transfer
DCIS_anchors <- FindTransferAnchors(reference = Brain_SC, query = filtered_Brain)
# Transfer data and assign predictions
DCIS_predictions.assay <- TransferData(anchorset = DCIS_anchors, 
                                       refdata = Brain_SC$Annotation, 
                                       prediction.assay = TRUE,
                                       weight.reduction = filtered_Brain[["pca"]], 
                                       dims = 1:30)
filtered_Brain[["predictions"]] <- DCIS_predictions.assay
DefaultAssay(filtered_Brain) <- "predictions"

# Check the updated object
filtered_Brain@assays$predictions@data[,1:3]
image
# Visualize cancer cells spatially

plot1 <- SpatialFeaturePlot(filtered_Brain, 
                            features = c("Glutamatergic-1",
                                         "GABAergic-9",
                                         "Astrocytes-4" ), 
                            pt.size.factor = 1,
                            ncol = 3)& scale_fill_gradientn(colors = c('white', '#FF0033')) 

plot1
image

Session info

sessionInfo()
## R version 4.1.2 (2021-11-01)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Red Hat Enterprise Linux
## 
## Matrix products: default
## BLAS/LAPACK: /usr/lib64/libopenblasp-r0.3.3.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] paletteer_1.5.0         spacexr_2.2.1           ggplot2_3.4.2          
##  [4] CellTrek_0.0.94         RColorBrewer_1.1-3      dplyr_1.1.2            
##  [7] SeuratObject_4.1.3.9001 Seurat_4.3.0            hdf5r_1.3.7            
## [10] klippy_0.0.0.9500      
## 
## loaded via a namespace (and not attached):
##   [1] backports_1.4.1        plyr_1.8.8             igraph_1.5.0.1        
##   [4] lazyeval_0.2.2         sp_2.0-0               splines_4.1.2         
##   [7] listenv_0.9.0          scattermore_0.8        digest_0.6.34         
##  [10] foreach_1.5.1          htmltools_0.5.4        fansi_1.0.4           
##  [13] magrittr_2.0.3         doParallel_1.0.14      tensor_1.5            
##  [16] cluster_2.1.2          ROCR_1.0-11            limma_3.50.3          
##  [19] globals_0.16.2         fastcluster_1.2.3      matrixStats_0.63.0    
##  [22] spatstat.sparse_3.0-0  rmdformats_1.0.4       colorspace_2.1-0      
##  [25] ggrepel_0.9.2          xfun_0.37              jsonlite_1.8.4        
##  [28] progressr_0.13.0       spatstat.data_3.0-0    iterators_1.0.14      
##  [31] survival_3.6-4         zoo_1.8-11             glue_1.6.2.9000       
##  [34] polyclip_1.10-4        gtable_0.3.1           leiden_0.4.3          
##  [37] car_3.1-0              future.apply_1.10.0    abind_1.4-5           
##  [40] scales_1.2.1           DBI_1.2.2              data.tree_1.0.0       
##  [43] rstatix_0.7.1          spatstat.random_3.1-3  miniUI_0.1.1.1        
##  [46] Rcpp_1.0.10            viridisLite_0.4.1      xtable_1.8-4          
##  [49] magic_1.6-0            reticulate_1.28        bit_4.0.5             
##  [52] akima_0.6-3.4          randomForestSRC_3.2.0  htmlwidgets_1.5.4     
##  [55] httr_1.4.5             DiagrammeR_1.0.9       ellipsis_0.3.2        
##  [58] ica_1.0-3              farver_2.1.1           pkgconfig_2.0.3       
##  [61] sass_0.4.2             uwot_0.1.14            deldir_1.0-6          
##  [64] utf8_1.2.3             dynamicTreeCut_1.63-1  labeling_0.4.2        
##  [67] tidyselect_1.2.0       rlang_1.1.0            reshape2_1.4.4        
##  [70] later_1.3.0            munsell_0.5.0          tools_4.1.2           
##  [73] visNetwork_2.1.0       cachem_1.0.7           cli_3.6.2             
##  [76] dbscan_1.1-11          generics_0.1.3         broom_1.0.4           
##  [79] ggridges_0.5.4         geometry_0.4.5         evaluate_0.23         
##  [82] stringr_1.5.0          fastmap_1.1.1          yaml_2.3.7            
##  [85] goftest_1.2-3          rematch2_2.1.2         knitr_1.41            
##  [88] bit64_4.0.5            fitdistrplus_1.1-11    purrr_1.0.1           
##  [91] RANN_2.6.1             pbapply_1.7-0          future_1.32.0         
##  [94] nlme_3.1-155           mime_0.12              ggrastr_1.0.1         
##  [97] compiler_4.1.2         rstudioapi_0.14        beeswarm_0.4.0        
## [100] plotly_4.10.0          png_0.1-8              ggsignif_0.6.3        
## [103] spatstat.utils_3.0-1   tibble_3.2.1           bslib_0.4.0           
## [106] stringi_1.7.12         highr_0.10             lattice_0.20-45       
## [109] Matrix_1.5-4           vctrs_0.6.2            pillar_1.9.0          
## [112] lifecycle_1.0.3        spatstat.geom_3.0-6    lmtest_0.9-40         
## [115] jquerylib_0.1.4        RcppAnnoy_0.0.20       data.table_1.15.0     
## [118] cowplot_1.1.1          irlba_2.3.5.1          httpuv_1.6.8          
## [121] patchwork_1.1.2        R6_2.5.1               bookdown_0.30         
## [124] promises_1.2.0.1       KernSmooth_2.23-20     gridExtra_2.3         
## [127] vipor_0.4.5            parallelly_1.35.0      codetools_0.2-18      
## [130] MASS_7.3-60            assertthat_0.2.1       philentropy_0.7.0     
## [133] withr_2.5.0            sctransform_0.3.5      parallel_4.1.2        
## [136] grid_4.1.2             tidyr_1.2.1            rmarkdown_2.14        
## [139] carData_3.0-5          packcircles_0.3.5      Rtsne_0.16            
## [142] ggpubr_0.5.0           spatstat.explore_3.0-6 shiny_1.7.2           
## [145] ggbeeswarm_0.6.0