-
Notifications
You must be signed in to change notification settings - Fork 3
2.2 Spatial RNA: Curio Seeker
In this part, we will focus on the Curio Mouse Brain data, which is generated with the Curio Seeker platform.
The Curio Seeker platform is designed for high-resolution (10μm) spatial transcriptomics, providing detailed insights into gene expression patterns within tissue samples.
Working with high-resolution data in spatial transcriptomics can be both computationally intensive and time-consuming.
To manage this more efficiently, we will downsample a smaller dataset in this workshop.
Data is download from Curio Bioscience The Curio Seeker bioinformatics pipeline report
Raw rds file from Curio Seeker bioinformatics pipeline
Brain_10X10
Downsampled object:
Brain_subset
Brain single cell RNA object
Brain_SC
a: The spatial analysis covered in 2.1 Spatial RNA: Visium can also be applied to the Curio Brain data.
b: Both Seurat 4 and Seurat 5 can be used here.
c: CellTrek is compatible with Seurat 4.
d: Seurat 5 contains new functions, such as sketch.
The sketch function generates a "sketch" of the dataset that retains the overall structure and main features while being smaller and easier to handle.
Example Analysis of Curio Seeker Data Using Seurat (Version 5.1.0)
conda activate course4.3
which R
conda deactivate
export RSTUDIO_WHICH_R= your_course4.3_R_directory
open -na rstudio
# Set the library path
# The required packages have already been installed.
.libPaths("~/VisiumLibrary")
# Specify the working directory
your_working_dir <- "/Users/admin/Desktop/CSHL_2024/Curio_Brain_downsample/your_working_folder/"
# Set Working Directory
setwd(your_working_dir)
# Load necessary packages
suppressPackageStartupMessages({
library(hdf5r)
library(Seurat)
library(dplyr)
library(RColorBrewer)
library(CellTrek)
library(ggplot2)
library(spacexr)
library(paletteer)
})
# Check Seurat version
packageVersion("Seurat")
# Path to the raw output from the Curio Seeker bioinformatics pipeline
Brain_10X10_Path <- "/Users/admin/Desktop/CSHL_2024/Curio_Brain_downsample/Mouse_brain_10by10_seurat.rds"
# Path to the downsampled smaller dataset
Brain_subset_Path <- "/Users/admin/Desktop/CSHL_2024/Curio_Brain_downsample/Brain_subset.rds"
# Path to the Brain SC V4_obj
Brain_SC_Path <- "/Users/admin/Desktop/CSHL_2024/Curio_Brain_downsample/your_working_folder/MouseBrain_snRNA_V4_obj.rds"
# Path to the Processed Brain ST data
Processed_filtered_Brain_Path <-"/Users/admin/Desktop/CSHL_2024/Curio_Brain_downsample/your_working_folder/Processed_filtered_Brain.rds"
# Load rds object from Curio Seeker bioinformatics pipeline
Brain_10X10 <- readRDS(Brain_10X10_Path)
# Check UMAP and clusters
DimPlot(Brain_10X10)
# cluster from Curio Seeker bioinformatics pipeline
SpatialDimPlot(Brain_10X10, pt.size.factor = 0.5)
# cluster from Curio Seeker bioinformatics pipeline
SpatialFeaturePlot(Brain_10X10, features = c('log10_nFeature_RNA','log10_nCount_RNA'), pt.size.factor = 0.5)
Note: Using 'Brain_subset <- subset(x = Brain_10X10, downsample = 2000)' may result in memory limit errors.
# Code used for reproducibility to create the subset
set.seed(123)
# Code used to downsample 1000 cells per identity class
Brain_subset <- subset(x = Brain_10X10, downsample = 2000)
# Code used save the subset
saveRDS(Brain_subset, file = 'Brain_subset.rds')
# Read the subset
Brain_subset <- readRDS(Brain_subset_Path)
# Check object (52581 spots)
Brain_subset
# Remove the raw data object to free up memory
rm(Brain_10X10)
Brain_subset$log10GenesPerUMI <- log10(Brain_subset$nFeature_RNA) / log10(Brain_subset$nCount_RNA)
# Check the metadata
head(Brain_subset@meta.data)
# Visualize the number of transcripts per spot.
Brain_subset@meta.data %>%
ggplot(aes( x=nCount_RNA, fill= 'orig.ident')) +
geom_density(alpha = 0.2) +
scale_x_log10() +
theme_classic() +
ylab("Cell density") +
geom_vline(xintercept = 200)+NoLegend()
# Then you can tell the differences between our DCIS and Brain data
Brain_subset@meta.data %>%
ggplot(aes(x = nCount_RNA, fill = orig.ident)) +
geom_density(alpha = 0.2) +
scale_x_log10() +
theme_classic() +
ylab("Cell density") +
geom_vline(xintercept = 50, color = "blue") +
geom_vline(xintercept = 200, color = "red", linetype = "dashed") +
NoLegend()
# Visualize the distribution of genes detected per spot.
Brain_subset@meta.data %>%
ggplot(aes( x=nFeature_RNA, fill= 'orig.ident')) +
geom_density(alpha = 0.2) +
scale_x_log10() +
theme_classic() +
ylab("Cell density") +
geom_vline(xintercept = 50, color = "blue") +
geom_vline(xintercept = 200, color = "red", linetype = "dashed") +
NoLegend()
# Visualize the overall complexity of the gene expression by visualizing the genes detected per UMI
Brain_subset@meta.data %>%
ggplot(aes(x=log10GenesPerUMI, fill= 'orig.ident')) +
geom_density(alpha = 0.2) +
theme_classic() +
ylab("Cell density") +
geom_vline(xintercept = 0.8, color = "blue") +
NoLegend()
# Visualize the Mitochondrial counts ratio
Brain_subset@meta.data %>%
ggplot(aes(x=percent.mt, fill= 'orig.ident')) +
geom_density(alpha = 0.2) +
theme_classic() +
ylab("Cell density") +
geom_vline(xintercept = 0.15, color = "red", linetype = "dashed")+
NoLegend()
# Filtering
filtered_Brain <- subset(x = Brain_subset,
subset= (nCount_RNA >= 50) &
(nFeature_RNA >= 50) &
(log10GenesPerUMI > 0.80))
# Check object
filtered_Brain
# Remove the Brain_subset object to free up memory
rm(Brain_subset)
# Check the X y information in the ST object
# Check SPATIAL xy information
head(filtered_Brain@reductions$SPATIAL@cell.embeddings)
ggplot(filtered_Brain@reductions$SPATIAL@cell.embeddings, aes(x=SPATIAL_1, y=SPATIAL_2)) +
geom_point(size=0.5 ,color= '#3585A6')
# Normalization and runPCA
DefaultAssay(filtered_Brain) = 'RNA'
filtered_Brain <- NormalizeData(filtered_Brain, assay='RNA') %>%
FindVariableFeatures() %>%
ScaleData(features=rownames(filtered_Brain))
filtered_Brain <- RunPCA(filtered_Brain, features=VariableFeatures(filtered_Brain),
verbose = F)
# ElbowPlot
ElbowPlot(filtered_Brain, ndims = 50)
# RunUMAP
filtered_Brain <- FindNeighbors(filtered_Brain, reduction='pca', dims = 1:25)
filtered_Brain <- FindClusters(filtered_Brain, resolution = 0.1)
filtered_Brain <- RunUMAP(filtered_Brain, dims = 1:25)
# We can then visualize the results of the clustering in UMAP space.
plot1 <- DimPlot(filtered_Brain, label = T,reduction = "umap",
cols=colorRampPalette(brewer.pal(12,"Paired"))(16))
plot1
# We can then visualize the results of the clustering in the spatial coordinate space.
SpatialDimPlot(filtered_Brain, label = F, pt.size.factor = 0.5)
# Improve UMAP
xy_df <- data.frame(rowname = rownames( filtered_Brain@reductions$SPATIAL@cell.embeddings),
x = filtered_Brain@reductions$SPATIAL@cell.embeddings[, 'SPATIAL_1'],
y = filtered_Brain@reductions$SPATIAL@cell.embeddings[, 'SPATIAL_2']
)
head(xy_df)
# Extract the seurat_clusters column while keeping the row names
cluster_df <- data.frame(rowname = rownames(filtered_Brain@meta.data),
seurat_clusters = filtered_Brain@meta.data[, 'RNA_snn_res.0.1'])
head(cluster_df)
merged_df <- left_join(xy_df,cluster_df, by='rowname' )
head(merged_df)
# plot
ggplot(merged_df, aes(x = x, y = y, color = as.factor(seurat_clusters))) +
geom_point(size = 0.5, shape = 16, alpha = 1) +
scale_color_manual(values = colorRampPalette(brewer.pal(12, "Paired"))(28),
name = "Clusters") +
theme_minimal() +
guides(color = guide_legend(override.aes = list(size = 3))) # Increase legend key size
# Find marker genes
filtered_Brain_markers <- FindAllMarkers(filtered_Brain, only.pos = T)
filtered_Brain_top5 <- filtered_Brain_markers %>%
group_by(cluster) %>%
top_n(n=5,wt=avg_log2FC)
# check the filtered_Brain_top10
filtered_Brain_top5
# Heatmap visualize
plot1 <- DoHeatmap(filtered_Brain,
features = filtered_Brain_top5$gene,
group.colors = colorRampPalette(brewer.pal(12,"Paired"))(16))
plot1
# Dot plot visualization
# cols = "RdBu" sets the color scheme to a red-to-blue gradient
plot2 <- DotPlot(object = filtered_Brain,
features = unique(filtered_Brain_top5$gene),
col.min = -2,
col.max = 1,
cols = "RdBu") +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
plot2
# Check marker genes for Glutamatergic neuron
plot1 <- SpatialFeaturePlot(filtered_Brain, features = c('Slc17a7'), pt.size.factor = 1)+
scale_fill_gradientn(colors = c('white', '#FF0033'))
# Check marker genes for GABAergic neuron
plot2 <-SpatialFeaturePlot(filtered_Brain, features = c('Slc32a1'), pt.size.factor = 1)+
scale_fill_gradientn(colors = c('white', '#FF0033'))
plot1|plot2
# Check marker gene for Cluster 0
filtered_Brain_top5[filtered_Brain_top5$cluster == '0',]$gene
# Check marker gene for Cluster 4
filtered_Brain_top5[filtered_Brain_top5$cluster == '4',]$gene
# Check marker gene for Cluster 4
filtered_Brain_top5[filtered_Brain_top5$cluster == '4',]$gene
# Check gene expression spatially
p1 <- SpatialFeaturePlot(filtered_Brain, features = "Vxn", pt.size.factor = 1) +
scale_fill_gradientn(colors = c('white', '#FF0033')) +
theme(legend.position = "right")
p2 <- SpatialFeaturePlot(filtered_Brain, features = "Prkcd", pt.size.factor = 1) +
scale_fill_gradientn(colors = c('white', '#FF0033')) +
theme(legend.position = "right")
p1|p2
# Visualize Spatial cluster
plot1 <- SpatialDimPlot(filtered_Brain, pt.size.factor = 1,
cells.highlight = WhichCells(filtered_Brain, idents =c( '0')),
cols.highlight = c("#00F5FF","white"))+ggtitle('Cluster 0 Highlighted' )+NoLegend()
plot2 <-SpatialDimPlot(filtered_Brain, pt.size.factor = 1,
cells.highlight = WhichCells(filtered_Brain, idents =c( '4')),
cols.highlight = c("#FF0C00","white"))+ggtitle('Cluster 4 Highlighted' )+NoLegend()
plot1|plot2
# Visualize nCount_RNA and nFeature_RNA in each cluster
plot1 <- VlnPlot(filtered_Brain,
features = c("nCount_RNA", "nFeature_RNA"),
pt.size = 0,
cols = colorRampPalette(brewer.pal(12,"Paired"))(16))
plot1
# FindSpatiallyVariableFeatures
# Downsample to avoid Error: vector memory exhausted (limit reached)
filtered_Brain_V2 <- FindSpatiallyVariableFeatures(subset(x = filtered_Brain, downsample = 100),
assay = "RNA",
features = VariableFeatures(filtered_Brain)[1:100],
selection.method = "moransi")
# Access the meta.features data frame
meta_features <- filtered_Brain_V2[["RNA"]]@meta.features
meta_features <- meta_features[!is.na(meta_features$moransi.spatially.variable.rank),]
# Arrange features based on moransi.spatially.variable.rank
ranked_features <- meta_features %>%
arrange(moransi.spatially.variable.rank)
head(ranked_features)
# Visualize the expression of the top 6 features identified by this measure.
SpatialFeaturePlot(filtered_Brain,
features = rownames(ranked_features)[1:3],
pt.size.factor = 1,
ncol = 3)& scale_fill_gradientn(colors = c('white', '#FF0033'))
# Save the intermediate file for future use.
saveRDS(filtered_Brain, file = 'Processed_filtered_Brain.rds')
Note: reference data is form 10X_Brain_SC
Our group makes annotations based on gene expression profiles.
# Load
Brain_SC <- readRDS(Brain_SC_Path)
Brain_SC
unique(Idents(Brain_SC))
# Check scRNA data UMAP
plot1 <- DimPlot(Brain_SC,
label = T,
repel = T,
cols = colorRampPalette(brewer.pal(12,"Paired"))(11))
plot1
# Perform label transfer
DCIS_anchors <- FindTransferAnchors(reference = Brain_SC, query = filtered_Brain)
# Transfer data and assign predictions
DCIS_predictions.assay <- TransferData(anchorset = DCIS_anchors,
refdata = Brain_SC$Annotation,
prediction.assay = TRUE,
weight.reduction = filtered_Brain[["pca"]],
dims = 1:30)
filtered_Brain[["predictions"]] <- DCIS_predictions.assay
DefaultAssay(filtered_Brain) <- "predictions"
# Check the updated object
filtered_Brain@assays$predictions@data[,1:3]
# Visualize cancer cells spatially
plot1 <- SpatialFeaturePlot(filtered_Brain,
features = c("Glutamatergic-1",
"GABAergic-9",
"Astrocytes-4" ),
pt.size.factor = 1,
ncol = 3)& scale_fill_gradientn(colors = c('white', '#FF0033'))
plot1
sessionInfo()
## R version 4.1.2 (2021-11-01)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Red Hat Enterprise Linux
##
## Matrix products: default
## BLAS/LAPACK: /usr/lib64/libopenblasp-r0.3.3.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] paletteer_1.5.0 spacexr_2.2.1 ggplot2_3.4.2
## [4] CellTrek_0.0.94 RColorBrewer_1.1-3 dplyr_1.1.2
## [7] SeuratObject_4.1.3.9001 Seurat_4.3.0 hdf5r_1.3.7
## [10] klippy_0.0.0.9500
##
## loaded via a namespace (and not attached):
## [1] backports_1.4.1 plyr_1.8.8 igraph_1.5.0.1
## [4] lazyeval_0.2.2 sp_2.0-0 splines_4.1.2
## [7] listenv_0.9.0 scattermore_0.8 digest_0.6.34
## [10] foreach_1.5.1 htmltools_0.5.4 fansi_1.0.4
## [13] magrittr_2.0.3 doParallel_1.0.14 tensor_1.5
## [16] cluster_2.1.2 ROCR_1.0-11 limma_3.50.3
## [19] globals_0.16.2 fastcluster_1.2.3 matrixStats_0.63.0
## [22] spatstat.sparse_3.0-0 rmdformats_1.0.4 colorspace_2.1-0
## [25] ggrepel_0.9.2 xfun_0.37 jsonlite_1.8.4
## [28] progressr_0.13.0 spatstat.data_3.0-0 iterators_1.0.14
## [31] survival_3.6-4 zoo_1.8-11 glue_1.6.2.9000
## [34] polyclip_1.10-4 gtable_0.3.1 leiden_0.4.3
## [37] car_3.1-0 future.apply_1.10.0 abind_1.4-5
## [40] scales_1.2.1 DBI_1.2.2 data.tree_1.0.0
## [43] rstatix_0.7.1 spatstat.random_3.1-3 miniUI_0.1.1.1
## [46] Rcpp_1.0.10 viridisLite_0.4.1 xtable_1.8-4
## [49] magic_1.6-0 reticulate_1.28 bit_4.0.5
## [52] akima_0.6-3.4 randomForestSRC_3.2.0 htmlwidgets_1.5.4
## [55] httr_1.4.5 DiagrammeR_1.0.9 ellipsis_0.3.2
## [58] ica_1.0-3 farver_2.1.1 pkgconfig_2.0.3
## [61] sass_0.4.2 uwot_0.1.14 deldir_1.0-6
## [64] utf8_1.2.3 dynamicTreeCut_1.63-1 labeling_0.4.2
## [67] tidyselect_1.2.0 rlang_1.1.0 reshape2_1.4.4
## [70] later_1.3.0 munsell_0.5.0 tools_4.1.2
## [73] visNetwork_2.1.0 cachem_1.0.7 cli_3.6.2
## [76] dbscan_1.1-11 generics_0.1.3 broom_1.0.4
## [79] ggridges_0.5.4 geometry_0.4.5 evaluate_0.23
## [82] stringr_1.5.0 fastmap_1.1.1 yaml_2.3.7
## [85] goftest_1.2-3 rematch2_2.1.2 knitr_1.41
## [88] bit64_4.0.5 fitdistrplus_1.1-11 purrr_1.0.1
## [91] RANN_2.6.1 pbapply_1.7-0 future_1.32.0
## [94] nlme_3.1-155 mime_0.12 ggrastr_1.0.1
## [97] compiler_4.1.2 rstudioapi_0.14 beeswarm_0.4.0
## [100] plotly_4.10.0 png_0.1-8 ggsignif_0.6.3
## [103] spatstat.utils_3.0-1 tibble_3.2.1 bslib_0.4.0
## [106] stringi_1.7.12 highr_0.10 lattice_0.20-45
## [109] Matrix_1.5-4 vctrs_0.6.2 pillar_1.9.0
## [112] lifecycle_1.0.3 spatstat.geom_3.0-6 lmtest_0.9-40
## [115] jquerylib_0.1.4 RcppAnnoy_0.0.20 data.table_1.15.0
## [118] cowplot_1.1.1 irlba_2.3.5.1 httpuv_1.6.8
## [121] patchwork_1.1.2 R6_2.5.1 bookdown_0.30
## [124] promises_1.2.0.1 KernSmooth_2.23-20 gridExtra_2.3
## [127] vipor_0.4.5 parallelly_1.35.0 codetools_0.2-18
## [130] MASS_7.3-60 assertthat_0.2.1 philentropy_0.7.0
## [133] withr_2.5.0 sctransform_0.3.5 parallel_4.1.2
## [136] grid_4.1.2 tidyr_1.2.1 rmarkdown_2.14
## [139] carData_3.0-5 packcircles_0.3.5 Rtsne_0.16
## [142] ggpubr_0.5.0 spatstat.explore_3.0-6 shiny_1.7.2
## [145] ggbeeswarm_0.6.0
For more info about us click here.