Example of results of Grouped Analysis for a school case
We will merge the data, normalize it (unless the individual normalizations are kept), reduce the dimensions, and evaluate the biases and the clustering. Alignment, quality control and filtering have already been done in the individual sample analyses (see Example of results of Individual Analysis for a school case for more information).
To simplify the explanation, we will keep the normalizations by SCTransform to allow the comparison with the integration by Seurat, but we could test other methods.
I advise recording whether or not you keep the individual normalizations in the name you give to the grouped object (here: name.grp : ["sc5p_v2_hs_PBMC_Grp_keep"]).
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Params_grp.yaml
Steps: ["Grp_Norm_DimRed_Eval_GE"]
Grp_Norm_DimRed_Eval_GE :
name.grp : ["sc5p_v2_hs_PBMC_Grp_keep"]
input.list.rda : ["/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_individual_analysis_example_of_wiki/Results/sc5p_v2_hs_PBMC_1k_5gex_GE/F200_C1000_M0-0.15_R0-1_G5/DOUBLETSFILTER_all/NORMKEPT/pca/dims35_res1.2/sc5p_v2_hs_PBMC_1k_5gex_GE_SCTransform_pca_35_1.2_ADT_TCR_BCR.rda,/mnt/beegfs/userdata/m_aglave/pipeline/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/sc5p_v2_hs_PBMC_10k_5gex_GE/F200_C1000_M0-0.15_R0-1_G5/DOUBLETSFILTER_all/NORMKEPT/pca/dims33_res0.4/sc5p_v2_hs_PBMC_10k_5gex_GE_SCTransform_pca_33_0.4_ADT_TCR_BCR.rda"]
output.dir.grp : ["/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/"]
eval.markers : "GAPDH"
author.name : "marine aglave"
author.mail : "[email protected], [email protected]"
keep.norm : TRUE
#dims.max : 100
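Because input.list.rda packs several paths into a single comma-separated string, it can be worth checking that every path exists before submitting the job. The following is a minimal Python sketch; `split_and_check` is a hypothetical helper that is not part of the pipeline, and splitting on commas is an assumption based only on how the example value is written.

```python
import os
import tempfile


def split_and_check(path_list):
    """Split a comma-separated path string and report which paths exist.

    Hypothetical helper, not part of the pipeline: it only mirrors the
    comma-separated convention visible in the example parameter file.
    """
    paths = [p.strip() for p in path_list.split(",") if p.strip()]
    return {p: os.path.isfile(p) for p in paths}


# Demo on one temporary file plus one path that does not exist.
with tempfile.NamedTemporaryFile(suffix=".rda") as tmp:
    for path, found in split_and_check(tmp.name + ",/no/such/file.rda").items():
        print("OK     " if found else "MISSING", path)
```

Run on the real value of input.list.rda, this kind of check would flag a mistyped or moved .rda file before the job is queued.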
For the traceability of the analysis, I prefer to put the command lines in a script, but it is not mandatory.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/launcher_grp.sh
#!/bin/bash
########################################################################
## Single-cell script to launch single-cell pipeline
##
## using: sbatch /mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/launcher_grp.sh
########################################################################
#SBATCH --job-name=pipeline_sc
#SBATCH --nodes=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
#SBATCH --partition=mediumq
source /mnt/beegfs/software/conda/etc/profile.d/conda.sh
conda activate /mnt/beegfs/userdata/m_aglave/.environnement_conda/single_cell_user
module load singularity
path_to_configfile="/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Params_grp.yaml"
path_to_pipeline="/mnt/beegfs/pipelines/single-cell"
snakemake --profile ${path_to_pipeline}/profiles/slurm -s ${path_to_pipeline}/Snakefile --configfile ${path_to_configfile}
conda deactivate
sbatch /mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/launcher_grp.sh
The Grp_Norm_DimRed_Eval_GE step corresponds to the Norm_DimRed_Eval_GE step but with a data merge step beforehand. The results, their interpretations, and their conclusions are similar to those described in the Norm_DimRed_Eval_GE section of the individual sample analysis, so I will only describe the analysis briefly here. For more details, please refer to the section Normalization, Dimension Reduction, Biases and Clustering Evaluation of Individual analysis.
Provided by the Grp_Norm_DimRed_Eval_GE step.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/sc5p_v2_hs_PBMC_Grp_keep_SCT_pca_dims.bias.cor.png
This graph represents the correlation between potential biases and each dimension, after normalization and dimension reduction. Here, we chose to keep the individual normalizations (to compare these results with those of the integration), so we cannot correct biases (but we will be able to estimate their impact on the final umap results). The levels of RNA encoding ribosomal proteins, the levels of stress RNA and the number of transcripts all appear to be correlated with the dimensions.
Provided by the Grp_Norm_DimRed_Eval_GE step.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/clustree_SCT_pca/uMAPs/
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca3_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca7_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca9_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca11_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca13_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca15_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca17_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca19_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca21_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca23_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca25_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca27_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca29_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca31_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca33_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca35_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca37_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca39_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca41_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca43_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca45_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca47_ALLres.png
sc5p_v2_hs_PBMC_Grp_keep_uMAPs_SCT_pca49_ALLres.png
As usual, to identify the number of dimensions to keep for clustering as well as the adequate resolution, the pipeline has drawn all possible umaps according to these 2 parameters. We have to look at all the umaps and choose the one that seems to be the most "beautiful": clusters well isolated from each other, and cells well grouped within their cluster. As in the integrated analysis, we know the expected number of clusters because we performed the individual analysis of each sample, so we know the cell types (or cell subtypes) present.
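To avoid overlooking a combination when browsing the umaps, the expected file names can be generated and compared with the content of the uMAPs directory. This is a minimal sketch, assuming the numbers of dimensions go from 3 to 49 in steps of 2 (inferred from the file list above, not from the pipeline code); `expected_umap_files` and `missing_dims` are hypothetical helpers.

```python
def expected_umap_files(name, dims=range(3, 50, 2)):
    """Build the uMAP file name expected for each number of dimensions."""
    return {d: f"{name}_uMAPs_SCT_pca{d}_ALLres.png" for d in dims}


def missing_dims(name, present_files, dims=range(3, 50, 2)):
    """Return the dimensions whose uMAP file is absent from present_files."""
    present = set(present_files)
    expected = expected_umap_files(name, dims)
    return [d for d, filename in expected.items() if filename not in present]


# In practice, present_files would come from os.listdir() on the uMAPs folder.
print(len(expected_umap_files("sc5p_v2_hs_PBMC_Grp_keep")), "files expected")
```

Applied to the file list on this page, the check returns 5, the only odd value between 3 and 49 for which no uMAP file is listed.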
The clustree plot is a tree plot used to observe the influence of a parameter on the results. Here we measure the evolution of the membership of cells to a cluster:
- the resolution is fixed and the number of dimensions to keep evolves:
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/clustree_SCT_pca/louvain_resolution
Provided by the Grp_Norm_DimRed_Eval_GE step.
sc5p_v2_hs_PBMC_Grp_keep_SCT_res0.1.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_res0.2.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_res0.3.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_res0.4.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_res0.5.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_res0.6.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_res0.7.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_res0.8.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_res0.9.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_res1.0.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_res1.1.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_res1.2.png
- the number of dimensions to keep is fixed and the resolution evolves:
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/clustree_SCT_pca/dimensions/
Provided by the Grp_Norm_DimRed_Eval_GE step.
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca3.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca5.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca7.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca9.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca11.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca13.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca15.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca17.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca19.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca21.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca23.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca25.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca27.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca29.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca31.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca33.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca35.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca37.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca39.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca41.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca43.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca45.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca47.png
sc5p_v2_hs_PBMC_Grp_keep_SCT_pca49.png
The goal is that the membership of the cells in a cluster remains relatively stable.
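The idea of stable membership can be made concrete: between two clusterings of the same cells, we can measure, for each cluster of the second clustering, the fraction of its cells that come from a single cluster of the first one. The sketch below is a simplified Python illustration of that idea, not the clustree implementation; the function name and the toy labels are hypothetical.

```python
from collections import Counter


def max_parent_fraction(labels_a, labels_b):
    """For each cluster in labels_b, fraction of its cells that come from
    its main contributing cluster in labels_a (1.0 = perfectly stable)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same cells")
    pairs = Counter(zip(labels_a, labels_b))  # (cluster_a, cluster_b) -> cells
    sizes_b = Counter(labels_b)               # cluster_b -> cells
    best = {}
    for (_, b), n in pairs.items():
        best[b] = max(best.get(b, 0.0), n / sizes_b[b])
    return best


# Toy example: six cells clustered with two different settings.
low = [0, 0, 0, 1, 1, 1]
high = [0, 0, 2, 1, 1, 2]
print(max_parent_fraction(low, high))
```

Values close to 1 for every cluster mean that changing the parameter mostly splits or keeps clusters, whereas low values mean that cells are being reshuffled between clusters.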
Here, the umaps are very stable across the different numbers of dimensions (as in the integrated analysis). I choose 25 dimensions and a resolution of 0.4: the clusters seem more compact and contaminate each other less. As in the integrated analysis, we have 13 clusters, very close to the integrated results, so the interpretation is similar. (I have tested up to 100 dimensions, but the cell subtypes are not visible there either.)
We will do the clustering, find the marker genes, make the annotation, add ADT, add TCR, add BCR and convert main results into a cerebro object.
We keep the same Configuration Parameter file, but we add the Grp_Clust_Markers_Annot_GE, Grp_Adding_ADT, Grp_Adding_TCR, Grp_Adding_BCR and Cerebro steps. Some parameters (such as name.grp and input.rda.grp) will be determined automatically thanks to the Grp_Norm_DimRed_Eval_GE step.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Params_grp.yaml
Steps: ["Grp_Norm_DimRed_Eval_GE","Grp_Clust_Markers_Annot_GE","Grp_Adding_ADT","Grp_Adding_TCR","Grp_Adding_BCR","Cerebro"]
Grp_Norm_DimRed_Eval_GE :
name.grp : ["sc5p_v2_hs_PBMC_Grp_keep"]
input.list.rda : ["/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_individual_analysis_example_of_wiki/Results/sc5p_v2_hs_PBMC_1k_5gex_GE/F200_C1000_M0-0.15_R0-1_G5/DOUBLETSFILTER_all/NORMKEPT/pca/dims35_res1.2/sc5p_v2_hs_PBMC_1k_5gex_GE_SCTransform_pca_35_1.2_ADT_TCR_BCR.rda,/mnt/beegfs/userdata/m_aglave/pipeline/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/sc5p_v2_hs_PBMC_10k_5gex_GE/F200_C1000_M0-0.15_R0-1_G5/DOUBLETSFILTER_all/NORMKEPT/pca/dims33_res0.4/sc5p_v2_hs_PBMC_10k_5gex_GE_SCTransform_pca_33_0.4_ADT_TCR_BCR.rda"]
output.dir.grp : ["/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/"]
eval.markers : "GAPDH"
author.name : "marine aglave"
author.mail : "[email protected], [email protected]"
keep.norm : TRUE
#dims.max : 100
Grp_Clust_Markers_Annot_GE:
markfile : "/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_individual_analysis_example_of_wiki/markfile.xlsx"
keep.dims : 25
keep.res : 0.4
Grp_Adding_ADT:
samples.name.adt: ["sc5p_v2_hs_PBMC_1k_5fb,sc5p_v2_hs_PBMC_10k_5fb"]
input.dirs.adt: ["/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_individual_analysis_example_of_wiki/Results/sc5p_v2_hs_PBMC_1k_5fb_ADT/KALLISTOBUS/,/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/sc5p_v2_hs_PBMC_10k_5fb_ADT/KALLISTOBUS/"]
gene.names: "CD3G,CD19,PTPRC,CD4,CD8A,CD14,FCGR3A,NCAM1,IL2RA,PTPRC,PDCD1,TIGIT,IGHG1,IGHG2,IGHG2,IL7R,FUT4,CCR7,HLA-DRA"
Grp_Adding_TCR:
vdj.input.files.tcr: ["/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_individual_analysis_example_of_wiki/Results/sc5p_v2_hs_PBMC_1k_t_TCR/sc5p_v2_hs_PBMC_1k_t_TCR_CellRanger/outs/filtered_contig_annotations.csv,/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/sc5p_v2_hs_PBMC_10k_t_TCR/sc5p_v2_hs_PBMC_10k_t_TCR_CellRanger/outs/filtered_contig_annotations.csv"]
Grp_Adding_BCR:
vdj.input.files.bcr: ["/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_individual_analysis_example_of_wiki/Results/sc5p_v2_hs_PBMC_1k_b_BCR/sc5p_v2_hs_PBMC_1k_b_BCR_CellRanger/outs/filtered_contig_annotations.csv,/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/sc5p_v2_hs_PBMC_10k_b_BCR/sc5p_v2_hs_PBMC_10k_b_BCR_CellRanger/outs/filtered_contig_annotations.csv"]
No change in /mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/launcher_grp.sh script.
sbatch /mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/launcher_grp.sh
Provided by the Grp_Clust_Markers_Annot_GE step.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/sc5p_v2_hs_PBMC_Grp_keep_SCT_pca_uMAP_dim25_res0.4.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/sc5p_v2_hs_PBMC_Grp_keep_SCT_pca_uMAP3d_dim25_res0.4.png
These graphs correspond to the 2D and 3D umaps of the data with the assignment of cells to their cluster. They show the umap chosen in the previous step.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/sc5p_v2_hs_PBMC_Grp_keep_SCT_pca_uMAP.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/sc5p_v2_hs_PBMC_Grp_keep_SCT_pca_split_uMAP.png
These graphs correspond to the umap of the data with the assignment of cells to their sample (concatenated view and split view).
We observe that the cells of the 2 samples are NOT present in all the clusters. There is a batch effect! The cluster 7 only belongs to sample sc5p_v2_hs_PBMC_1k_5gex_GE, and the clusters 1 and 3 only belong to sample sc5p_v2_hs_PBMC_10k_5gex_GE.
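The same observation can be checked numerically by computing, for each cluster, the fraction of cells coming from each sample. This is a minimal sketch on made-up labels; the function names are hypothetical and the per-cell cluster and sample labels would have to be exported from the grouped object first.

```python
from collections import Counter, defaultdict


def sample_fraction_per_cluster(clusters, samples):
    """Return {cluster: {sample: fraction of the cluster's cells}}."""
    counts = defaultdict(Counter)
    for cluster, sample in zip(clusters, samples):
        counts[cluster][sample] += 1
    return {
        c: {s: n / sum(cnt.values()) for s, n in cnt.items()}
        for c, cnt in counts.items()
    }


def single_sample_clusters(clusters, samples):
    """Clusters whose cells all come from one sample (possible batch effect)."""
    fractions = sample_fraction_per_cluster(clusters, samples)
    return sorted(c for c, f in fractions.items() if len(f) == 1)


# Toy labels (not the real data): one entry per cell.
clusters = [0, 0, 1, 1, 7]
samples = ["1k", "10k", "10k", "10k", "1k"]
print(single_sample_clusters(clusters, samples))
```

Since the two samples differ greatly in size, raw fractions lean toward the larger sample; a cluster made of a single sample is the clearest warning sign.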
Provided by the Grp_Clust_Markers_Annot_GE step.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/technical/sc5p_v2_hs_PBMC_Grp_keep_technical_MULTI_ALL_uMAPs.png
This group of graphs shows the biases plotted on the umap. The goal of these graphs is to check whether or not the biases have been corrected. The cells should not be separated according to biases, but according to the biological processes of interest.
The interpretation is similar to that of the Clust_Markers_Annot_GE step. Here, there does not seem to be any influence of mitochondrial RNAs, nor of the cell cycle. We can observe a potential effect of the RNAs which code for ribosomal proteins, but we can't correct this effect because we kept the individual normalizations (and remember: the correction of this bias depends on the studied biological process). Also, we can observe a cell of cluster 12 which appears to be highly stressed (like in integrated results).
Provided by the Grp_Clust_Markers_Annot_GE step.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/found_markers/sc5p_v2_hs_PBMC_Grp_keep_findmarkers_upset_all.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/found_markers/sc5p_v2_hs_PBMC_Grp_keep_findmarkers_upset_top10.png
As in the Clust_Markers_Annot_GE step, here you can see the number of marker genes specific to a cluster and the number of marker genes shared between several clusters (the first graph covers all the results and the second graph covers the 10 marker genes with the highest logFC for each cluster).
Provided by the Grp_Clust_Markers_Annot_GE step.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/found_markers/sc5p_v2_hs_PBMC_Grp_keep_SCT_pca.29_res.0.6_findmarkers_all.txt
First ten lines of the file:
genes | avg_log2FC | p_val | adj.P.Val | pct.1 | pct.2 | tested_cluster | control_cluster | min.pct |
---|---|---|---|---|---|---|---|---|
S100A8 | 6,5918340594 | 0 | 0 | 0,999 | 0,136 | 0 | All | 0,75 |
S100A9 | 5,6644178329 | 0 | 0 | 1 | 0,116 | 0 | All | 0,75 |
LYZ | 3,8965096893 | 0 | 0 | 0,999 | 0,089 | 0 | All | 0,75 |
VCAN | 3,6598168291 | 0 | 0 | 0,988 | 0,044 | 0 | All | 0,75 |
FOS | 3,6489220386 | 0 | 0 | 0,998 | 0,436 | 0 | All | 0,75 |
S100A12 | 3,3718582643 | 0 | 0 | 0,928 | 0,015 | 0 | All | 0,75 |
FCN1 | 3,3166683836 | 0 | 0 | 0,996 | 0,043 | 0 | All | 0,75 |
TYROBP | 2,9414135212 | 0 | 0 | 1 | 0,139 | 0 | All | 0,75 |
MNDA | 2,8871975423 | 0 | 0 | 0,971 | 0,045 | 0 | All | 0,75 |
As in the Clust_Markers_Annot_GE step, this table lists all the marker genes for each cluster (each cluster is compared against all the others), selected with:
- adj.P.Val < 5%,
- avg_log2FC > 0,5 (positive log2FC only),
- pct.1 or pct.2 > 0,75 (pct.x: percentage of cells in group x expressing the tested gene (example: 0,75 corresponds to 75% of cells); min.pct: threshold used for pct.1 and pct.2).
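These thresholds can be re-applied to the table to check a row by hand. The sketch below parses rows as they are displayed on this page (pipe-separated, decimal commas) and assumes the usual convention that a marker is kept when its adjusted p-value is below 5%; the helper names are hypothetical and the separator of the actual .txt file may differ.

```python
HEADER = ["genes", "avg_log2FC", "p_val", "adj.P.Val", "pct.1", "pct.2",
          "tested_cluster", "control_cluster", "min.pct"]


def to_float(text):
    """Convert a number written with a decimal comma (e.g. '0,75')."""
    return float(text.strip().replace(",", "."))


def parse_row(line, header=HEADER):
    """Turn one pipe-separated row into a {column: text} dictionary."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    return dict(zip(header, cells))


def is_marker(row, alpha=0.05, min_logfc=0.5):
    """Apply the three thresholds described above to one parsed row."""
    min_pct = to_float(row["min.pct"])
    return (
        to_float(row["adj.P.Val"]) < alpha
        and to_float(row["avg_log2FC"]) > min_logfc
        and max(to_float(row["pct.1"]), to_float(row["pct.2"])) > min_pct
    )


# Two rows copied from the table above.
rows = [
    "S100A8 | 6,5918340594 | 0 | 0 | 0,999 | 0,136 | 0 | All | 0,75 |",
    "FOS | 3,6489220386 | 0 | 0 | 0,998 | 0,436 | 0 | All | 0,75 |",
]
for line in rows:
    row = parse_row(line)
    print(row["genes"], is_marker(row))
```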
Provided by the Grp_Clust_Markers_Annot_GE step.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/found_markers/sc5p_v2_hs_PBMC_Grp_keep_findmarkers_top10_heatmap.png
The heatmap represents the expression of the 10 best marker genes (by logFC) of each cluster, for all cells grouped by cluster. The expressions were normalized between the genes in order to allow a visual comparison between these genes. See Clust_Markers_Annot_GE for more details.
Here, we observe that clusters 1, 2, 3 and 7 are similar, even if a few genes are expressed differently, so they may be the same cell type (thanks to the integrated analysis, we know that these cells are T lymphocytes).
Provided by the Grp_Clust_Markers_Annot_GE step.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/found_markers/
sc5p_v2_hs_PBMC_Grp_keep_findmarkers_top10_cluster0_vln.png
sc5p_v2_hs_PBMC_Grp_keep_findmarkers_top10_cluster1_vln.png
sc5p_v2_hs_PBMC_Grp_keep_findmarkers_top10_cluster2_vln.png
sc5p_v2_hs_PBMC_Grp_keep_findmarkers_top10_cluster3_vln.png
sc5p_v2_hs_PBMC_Grp_keep_findmarkers_top10_cluster4_vln.png
sc5p_v2_hs_PBMC_Grp_keep_findmarkers_top10_cluster5_vln.png
sc5p_v2_hs_PBMC_Grp_keep_findmarkers_top10_cluster6_vln.png
sc5p_v2_hs_PBMC_Grp_keep_findmarkers_top10_cluster7_vln.png
sc5p_v2_hs_PBMC_Grp_keep_findmarkers_top10_cluster8_vln.png
sc5p_v2_hs_PBMC_Grp_keep_findmarkers_top10_cluster9_vln.png
sc5p_v2_hs_PBMC_Grp_keep_findmarkers_top10_cluster10_vln.png
sc5p_v2_hs_PBMC_Grp_keep_findmarkers_top10_cluster11_vln.png
sc5p_v2_hs_PBMC_Grp_keep_findmarkers_top10_cluster12_vln.png
The violin plot represents the expression of the 10 best marker genes (by logFC) of each cluster, per cell and for all clusters. The expression of the marker gene in each cell is plotted by cluster. This makes it possible to verify that a marker gene is specific to a cluster and is not shared by other clusters.
For example, the LYZ gene is a marker gene of cluster 0 but it is also highly expressed in cluster 9, so it is not specific to cluster 0. The S100A12 gene is a marker gene of cluster 0 and it is lowly expressed in the other clusters, so it is very specific to this cluster.
Provided by the Grp_Clust_Markers_Annot_GE step.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/markers/sc5p_v2_hs_PBMC_Grp_keep_markers_ALL_uMAPs.png
This is a representation of all genes from the Markfile. It can help to annotate cell types.
Provided by the Grp_Clust_Markers_Annot_GE step.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/cells_annotation/singler/sc5p_v2_hs_PBMC_Grp_keep_SCTuMAP_SR_NovershternHematopoieticData_clust.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/cells_annotation/singler/sc5p_v2_hs_PBMC_Grp_keep_SCTuMAP_SR_NovershternHematopoieticData_cells.png
The automatic annotation is done with clusterifyR and singleR as in Clust_Markers_Annot_GE step.
Here, I present 2 examples (one computed on the clusters, the other on each cell). These are the same examples as for the integrated analysis. Warning: the references provided with the tools are not necessarily very relevant (they come from microarray or bulk RNA-seq data) but they are the best available at the moment, so the results should be interpreted with caution.
Note:
It is better to group the information to annotate the cells: the automatic annotation, the marker genes and the Markfile genes.
By cross-checking the results with the Markfile we can conclude that:
- cluster 0 corresponds to Monocytes,
- cluster 1 corresponds to CD4+ T lymphocytes,
- cluster 2 corresponds to CD4+ T lymphocytes,
- cluster 3 corresponds to CD8+ T lymphocytes,
- cluster 4 corresponds to B lymphocytes,
- cluster 5 corresponds to CD8+ T lymphocytes,
- cluster 6 corresponds to NK cells,
- cluster 7 corresponds to T lymphocytes (CD4+? CD8+?),
- cluster 8 corresponds to Monocytes,
- cluster 9 corresponds to Dendritic cells,
- cluster 10 corresponds to CD4+ T lymphocytes (?),
- cluster 11 corresponds to Dendritic cells,
- cluster 12 corresponds to Megakaryocytes? Platelets?
So, if we compare to the integrated analysis:
grouped | | integrated |
---|---|---|
cluster 0 | <=> | clusters 1 and 2 |
cluster 1 | <=> | cluster 0 |
cluster 2 | <=> | cluster 3 |
cluster 3 | <=> | cluster 4 |
cluster 4 | <=> | cluster 5 |
cluster 5 | <=> | cluster 6 |
cluster 6 | <=> | cluster 7 |
cluster 7 | <=> | cluster 6 ? |
cluster 8 | <=> | cluster 9 |
cluster 9 | <=> | cluster 8 |
cluster 10 | <=> | cluster 10 |
cluster 11 | <=> | cluster 11 |
cluster 12 | <=> | cluster 12 |
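To reuse this correspondence when comparing the two analyses (for example to relabel clusters), the table can be written as a dictionary and inverted. A minimal sketch in Python; the values are copied from the table above and the variable and function names are hypothetical.

```python
# Grouped cluster -> integrated cluster(s), copied from the table above.
GROUPED_TO_INTEGRATED = {
    0: [1, 2], 1: [0], 2: [3], 3: [4], 4: [5], 5: [6], 6: [7],
    7: [6],  # uncertain match, marked "?" in the table
    8: [9], 9: [8], 10: [10], 11: [11], 12: [12],
}


def integrated_to_grouped(mapping):
    """Invert the mapping: integrated cluster -> sorted grouped clusters."""
    inverse = {}
    for grouped, integrated_list in mapping.items():
        for integrated in integrated_list:
            inverse.setdefault(integrated, []).append(grouped)
    return {k: sorted(v) for k, v in sorted(inverse.items())}


for integrated, grouped in integrated_to_grouped(GROUPED_TO_INTEGRATED).items():
    print("integrated", integrated, "<=> grouped", grouped)
```

The inverted view makes the non one-to-one cases visible: grouped cluster 0 covers two integrated clusters, and integrated cluster 6 is matched by grouped clusters 5 and 7.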
Provided by the Grp_Adding_ADT step.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/ADT_results/ADT_dimplot.png
This plot shows the normalized expression of the genes (left) next to the normalized expression of the corresponding proteins (right). Protein expression is often very strong, with background noise due to non-specific binding of the antibodies. To mitigate this problem, we can set the cutoffs of the color legend by quantiles.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/ADT_results/ADT_dimplot_legend_cutoff.png
This plot shows exactly the same things, but with a legend cutoff (default parameters).
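The effect of a quantile cutoff can be illustrated outside of any plotting library: values below the lower quantile and above the upper quantile are clamped, so a few extreme cells no longer stretch the color scale. This is a sketch under assumptions; the quantiles 0.05 and 0.95 are illustrative placeholders, not the default parameters of the pipeline, and both functions are hypothetical.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list (0 <= q <= 1)."""
    if not sorted_values:
        raise ValueError("empty input")
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = pos - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def clip_to_quantiles(values, low_q=0.05, high_q=0.95):
    """Clamp values to the [low_q, high_q] quantile range before plotting."""
    ordered = sorted(values)
    low, high = quantile(ordered, low_q), quantile(ordered, high_q)
    return [min(max(v, low), high) for v in values]


# Toy protein signal with one extreme value.
signal = [0, 1, 2, 3, 100]
print(clip_to_quantiles(signal, low_q=0.0, high_q=0.75))
```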
Provided by the Grp_Adding_TCR step.
There are several ways to define a clonotype:
- gene: use the genes comprising the TCR.
- nt: use the nucleotide sequence of the CDR3 region.
- aa: use the amino acid sequence of the CDR3 region.
- gene+nt: use the genes comprising the TCR + the nucleotide sequence of the CDR3 region for T cells. This is the proper definition of clonotype.
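The four definitions can be sketched as four ways of building an identifier from one contig. The column names v_gene, j_gene, c_gene, cdr3 and cdr3_nt are those of the Cell Ranger filtered_contig_annotations.csv file; the function itself, the separators and the example contig are hypothetical, and a real clonotype combines the chains of a cell, which this per-contig sketch does not do.

```python
def clonotype_key(contig, definition="gene+nt"):
    """Build a clonotype identifier for one contig.

    contig is a dict with the keys v_gene, j_gene, c_gene,
    cdr3 (amino acids) and cdr3_nt (nucleotides).
    """
    genes = ".".join(contig[k] for k in ("v_gene", "j_gene", "c_gene"))
    if definition == "gene":
        return genes
    if definition == "nt":
        return contig["cdr3_nt"]
    if definition == "aa":
        return contig["cdr3"]
    if definition == "gene+nt":
        return genes + "_" + contig["cdr3_nt"]
    raise ValueError(f"unknown clonotype definition: {definition}")


# Made-up contig, for illustration only.
contig = {
    "v_gene": "TRBV7-9", "j_gene": "TRBJ2-1", "c_gene": "TRBC2",
    "cdr3": "CASSLG", "cdr3_nt": "TGTGCCAGCAGCTTAGGG",
}
for definition in ("gene", "nt", "aa", "gene+nt"):
    print(definition, "->", clonotype_key(contig, definition))
```

Two contigs with the same genes but different CDR3 nucleotide sequences share a "gene" identifier and differ with "gene+nt", which is why the number of unique clonotypes depends on the chosen definition.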
Note:
All the TCR and BCR results are the same between the integrated and the grouped analyses; only the umap and the analysis by cluster differ.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Global_analysis/quantUniqueContig.png
This group of graphs represents the number of different (unique) clonotypes in the sample.
Here, we can observe that almost all contigs are present in a single copy (they are unique), regardless of the definition of clonotypes chosen. For clonotypes defined only by their gene, we observe 444 unique contigs out of a total of 451 contigs for sc5p_v2_hs_PBMC_1k_5gex_GE and 4112 unique contigs out of a total of 4443 contigs for sc5p_v2_hs_PBMC_10k_5gex_GE.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Global_analysis/abundanceContig.png
This group of line graphs represents the number of clonotypes depending on the number of cells where the contig is present. The points of the graph are connected.
The interpretation is the same as for the TCR part of the individual analysis of samples, but we have the 2 samples present on the graphs.
Here we have mostly single clonotypes, present in a single cell, which is represented by a dot at over 400 clonotypes with an abundance of 1 for sc5p_v2_hs_PBMC_1k_5gex_GE sample and at over 4000 clonotypes with an abundance of 1 for sc5p_v2_hs_PBMC_10k_5gex_GE sample. It is in agreement with the previous graph of unique contigs.
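The abundance curve is a double count: first the number of cells per clonotype, then the number of clonotypes per abundance value. A minimal sketch on made-up identifiers (the function name is hypothetical and this is not the code used by the pipeline):

```python
from collections import Counter


def clonotype_abundance(cell_clonotypes):
    """Map abundance (cells per clonotype) -> number of clonotypes.

    cell_clonotypes holds one clonotype identifier per cell.
    """
    cells_per_clonotype = Counter(cell_clonotypes)
    return dict(sorted(Counter(cells_per_clonotype.values()).items()))


# Toy data: clonotypes A and B are seen once, C twice, D three times.
cells = ["A", "B", "C", "C", "D", "D", "D"]
print(clonotype_abundance(cells))
```

In the real data, the entry for an abundance of 1 is the point described above (over 400 clonotypes for the 1k sample and over 4000 for the 10k sample).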
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Global_analysis/clhomeo.png
By examining the clonal space, we are effectively looking at the relative space occupied by clones at specific proportions. Another way to think about this would be thinking of the total immune receptor sequencing run as a measuring cup. In this cup, we will fill liquids of different viscosity - or different number of clonal proportions. Clonal space homeostasis is asking what percentage of the cup is filled by clones in distinct proportions (or liquids of different viscosity, to extend the analogy).
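In terms of computation, each clone is assigned to a size class according to the proportion of the repertoire it occupies, and the proportions are summed per class. The sketch below illustrates this; the class names and thresholds are illustrative assumptions that should be checked against the tool actually used by the pipeline, and the function name is hypothetical.

```python
from collections import Counter

# Upper bound of each class, as a proportion of the repertoire (illustrative).
BINS = [("Rare", 1e-4), ("Small", 1e-3), ("Medium", 1e-2),
        ("Large", 1e-1), ("Hyperexpanded", 1.0)]


def clonal_space(cell_clonotypes, bins=BINS):
    """Fraction of the repertoire occupied by clones of each size class."""
    counts = Counter(cell_clonotypes)
    total = sum(counts.values())
    space = {name: 0.0 for name, _ in bins}
    for n in counts.values():
        proportion = n / total
        for name, upper in bins:
            if proportion <= upper:
                space[name] += proportion
                break
    return space


# Toy repertoire: one clone of 100 cells and 100 clones of a single cell.
cells = ["big"] * 100 + [f"single_{i}" for i in range(100)]
print(clonal_space(cells))
```

In this toy repertoire, half of the "cup" is filled by a single expanded clone and the other half by many clones of one cell each.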
The interpretation is the same as for the TCR part of the individual analysis of samples, but we have the 2 samples present on the graphs.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Global_analysis/clprop.png
Like clonal space homeostasis above, clonal proportion acts to place clones into separate bins. The key difference is instead of looking at the relative proportion of the clone to the total, the clonalProportion() function will rank the clones by total number and place them into bins. Example: [1:10] are the top 10 clonotypes in each sample.
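The ranking logic can be sketched as follows: clones are sorted from the largest to the smallest, cut into rank bins, and the share of cells held by each bin is reported. This is a simplified illustration, not the clonalProportion() implementation; the bin limits are illustrative and the function name is hypothetical.

```python
from collections import Counter

# Upper rank of each bin (illustrative): [1:10], [11:100], [101:1000], ...
RANK_SPLITS = [10, 100, 1000, 10000]


def clonal_proportion(cell_clonotypes, splits=RANK_SPLITS):
    """Share of cells held by clones in each rank bin (rank 1 = largest).

    Clones ranked beyond the last split are ignored in this sketch.
    """
    counts = sorted(Counter(cell_clonotypes).values(), reverse=True)
    total = sum(counts)
    if total == 0:
        raise ValueError("no cells provided")
    result = {}
    lower = 0
    for upper in splits:
        result[f"[{lower + 1}:{upper}]"] = sum(counts[lower:upper]) / total
        lower = upper
    return result


# Toy repertoire: one clone of 9 cells and 11 clones of a single cell.
cells = ["big"] * 9 + [f"single_{i}" for i in range(11)]
print(clonal_proportion(cells))
```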
The interpretation is the same as for the TCR part of the individual analysis of samples, but we have the 2 samples present on the graphs.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Global_analysis/Frequency_top_10_umapsc5p_v2_hs_PBMC_1k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Global_analysis/Frequency_top11to20_umapsc5p_v2_hs_PBMC_1k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Global_analysis/Frequency_top_10_umapsc5p_v2_hs_PBMC_10k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Global_analysis/Frequency_top11to20_umapsc5p_v2_hs_PBMC_10k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Global_analysis/Frequency_top_10_umapsc5p_v2_hs_PBMC_Grp_keep.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Global_analysis/Frequency_top11to20_umapsc5p_v2_hs_PBMC_Grp_keep.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Global_analysis/Frequency_umapsc5p_v2_hs_PBMC_Grp_keep.png
The frequency represents the number of cells that contain the clonotype based on its amino acid sequence. We have these results for each sample and for the grouped data.
Here, we can confirm the observations of the individual analysis of each sample. The localization of clonotypes on the umap confirms the T cell annotation of the clusters. We also observe that not all T cells have an identified TCR.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Global_analysis/lengthContig.png
This graph represents the length distribution of the CDR3 sequences (combined or separate chains) for each sample.
The interpretation of this type of graph depends on the biological context.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Global_analysis/cloneType_sc5p_v2_hs_PBMC_1k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Global_analysis/cloneType_sc5p_v2_hs_PBMC_10k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Global_analysis/cloneType_sc5p_v2_hs_PBMC_Grp_keep.png
These groups of graphs represent several umaps showing the location of each part of the TCR and the size of the TRA and TRB, for each sample and for the grouped data.
The interpretation of this type of graph depends on the biological context.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Global_analysis/cldiv.png
This graph measures the diversity of clonotypes within each sample, using 4 metrics: Shannon, inverse Simpson, Chao1, and Abundance-based Coverage Estimator (ACE).
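For reference, three of these metrics can be computed from the number of cells per clonotype with the textbook formulas below. This is a sketch, not the code used by the pipeline: the tool may use another logarithm base, resampling, or a different Chao1 variant, and ACE is left out because it needs more parameters.

```python
import math


def shannon(counts):
    """Shannon entropy (natural log) of clonotype counts."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def inverse_simpson(counts):
    """Inverse Simpson index: 1 / sum of squared clonotype proportions."""
    total = sum(counts)
    return 1.0 / sum((n / total) ** 2 for n in counts)


def chao1(counts):
    """Chao1 richness estimate from singleton (f1) and doubleton (f2) counts."""
    observed = sum(1 for n in counts if n > 0)
    f1 = sum(1 for n in counts if n == 1)
    f2 = sum(1 for n in counts if n == 2)
    if f2 == 0:
        # Bias-corrected form, used when there are no doubletons.
        return observed + f1 * (f1 - 1) / 2.0
    return observed + f1 * f1 / (2.0 * f2)


# Toy counts: number of cells for each of five clonotypes.
counts = [1, 1, 2, 2, 3]
print(shannon(counts), inverse_simpson(counts), chao1(counts))
```

A repertoire made only of clonotypes seen once gives the highest Shannon and inverse Simpson values for its size, which matches the mostly unique clonotypes observed here.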
The interpretation of this type of graph depends on the biological context.
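As a reminder of what three of these metrics compute, here is a minimal sketch in Python using the textbook definitions. It is an assumption that these definitions match what the pipeline uses internally (the actual tool may, for example, add resampling or a different correction). The function name `clonotype_diversity` is hypothetical, Chao1 is written in its bias-corrected form, and ACE is left out to keep the example short.

```python
import math


def clonotype_diversity(clonotype_counts):
    """Compute Shannon, inverse Simpson and bias-corrected Chao1.

    clonotype_counts: number of cells observed for each clonotype.
    """
    counts = [c for c in clonotype_counts if c > 0]
    total = sum(counts)
    props = [c / total for c in counts]

    # Shannon entropy (natural logarithm): H = -sum(p * ln p)
    shannon = -sum(p * math.log(p) for p in props)

    # Inverse Simpson index: 1 / sum(p^2)
    inv_simpson = 1.0 / sum(p * p for p in props)

    # Chao1 (bias-corrected): S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)),
    # where F1 and F2 are the numbers of clonotypes seen once and twice.
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    chao1 = len(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))

    return {"shannon": shannon, "inv_simpson": inv_simpson, "chao1": chao1}


# Hypothetical repertoires: four singletons versus two expanded clonotypes.
all_unique = clonotype_diversity([1, 1, 1, 1])
expanded = clonotype_diversity([5, 5])
```

With these definitions, a repertoire made only of unique clonotypes gets a high Chao1 estimate (many clonotypes are probably still unseen), whereas a repertoire made of a few expanded clonotypes gets a Chao1 equal to the observed number of clonotypes.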
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Global_analysis/aaProperties.png
This group of graphs represents the physicochemical properties of the amino acids that make up the receptors.
The interpretation of this type of graph depends on the biological context.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_1k_5gex/clust_quantContig_sc5p_v2_hs_PBMC_1k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_10k_5gex/clust_quantContig_sc5p_v2_hs_PBMC_10k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/clust_quantContig_sc5p_v2_hs_PBMC_Grp_keep.png
This group of graphs is similar to that of the global analysis, but by cluster, for each sample and for the merged dataset, so the interpretation is similar too.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/clust_abundanceContig.png
This group of graphs is similar to that of the global analysis, but by cluster and for the merged dataset, so the interpretation is similar too.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_1k_5gex/clust_clhomeo_sc5p_v2_hs_PBMC_1k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_10k_5gex/clust_clhomeo_sc5p_v2_hs_PBMC_10k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/clust_clhomeo_sc5p_v2_hs_PBMC_Grp_keep.png
This group of graphs is similar to that of the global analysis, but by cluster, for each sample and for the merged dataset, so the interpretation is similar too.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_1k_5gex/clust_clprop_sc5p_v2_hs_PBMC_1k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_10k_5gex/clust_clprop_sc5p_v2_hs_PBMC_10k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/clust_clprop_sc5p_v2_hs_PBMC_Grp_keep.png
This group of graphs is similar to that of the global analysis, but by cluster, for each sample and for the merged dataset, so the interpretation is similar too.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_1k_5gex/Frequency_top_10_clust1_umapsc5p_v2_hs_PBMC_1k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_1k_5gex/Frequency_top_10_clust2_umapsc5p_v2_hs_PBMC_1k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_1k_5gex/Frequency_top_10_clust5_umapsc5p_v2_hs_PBMC_1k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_1k_5gex/Frequency_top_10_clust7_umapsc5p_v2_hs_PBMC_1k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_1k_5gex/Frequency_top_10_clust10_umapsc5p_v2_hs_PBMC_1k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_10k_5gex/Frequency_top_10_clust0_umapsc5p_v2_hs_PBMC_10k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_10k_5gex/Frequency_top_10_clust1_umapsc5p_v2_hs_PBMC_10k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_10k_5gex/Frequency_top_10_clust2_umapsc5p_v2_hs_PBMC_10k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_10k_5gex/Frequency_top_10_clust3_umapsc5p_v2_hs_PBMC_10k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_10k_5gex/Frequency_top_10_clust4_umapsc5p_v2_hs_PBMC_10k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_10k_5gex/Frequency_top_10_clust5_umapsc5p_v2_hs_PBMC_10k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_10k_5gex/Frequency_top_10_clust6_umapsc5p_v2_hs_PBMC_10k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_10k_5gex/Frequency_top_10_clust7_umapsc5p_v2_hs_PBMC_10k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_10k_5gex/Frequency_top_10_clust10_umapsc5p_v2_hs_PBMC_10k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/Frequency_top_10_clust0_umapsc5p_v2_hs_PBMC_Grp_keep.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/Frequency_top_10_clust1_umapsc5p_v2_hs_PBMC_Grp_keep.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/Frequency_top_10_clust2_umapsc5p_v2_hs_PBMC_Grp_keep.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/Frequency_top_10_clust3_umapsc5p_v2_hs_PBMC_Grp_keep.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/Frequency_top_10_clust4_umapsc5p_v2_hs_PBMC_Grp_keep.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/Frequency_top_10_clust5_umapsc5p_v2_hs_PBMC_Grp_keep.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/Frequency_top_10_clust6_umapsc5p_v2_hs_PBMC_Grp_keep.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/Frequency_top_10_clust7_umapsc5p_v2_hs_PBMC_Grp_keep.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/Frequency_top_10_clust10_umapsc5p_v2_hs_PBMC_Grp_keep.png
This group of graphs is similar to that of the global analysis, but by cluster, for each sample and for the merged dataset, so the interpretation is similar too.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_1k_5gex/clust_clOverlap_sc5p_v2_hs_PBMC_1k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_10k_5gex/clust_clOverlap_sc5p_v2_hs_PBMC_10k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/clust_clOverlap_sc5p_v2_hs_PBMC_Grp_keep.png
The graph represents the percentages of common clonotypes between 2 clusters (scaled to the number of unique clonotypes in the smaller cluster). The tables which present the number and the sequence of common clonotypes between the clusters are not shown here but are computed too.
The interpretation is exactly the same as in the individual analysis of a sample.
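The scaling described above (common clonotypes divided by the number of unique clonotypes in the smaller cluster) corresponds to an overlap coefficient. Here is a minimal sketch in Python, assuming that clonotypes are compared by exact identity of their identifier. The function name `clonotype_overlap_percent` and the clonotype labels are hypothetical, and the pipeline's own computation may differ in its details.

```python
def clonotype_overlap_percent(cluster_a, cluster_b):
    """Percentage of shared clonotypes between two clusters.

    The number of shared unique clonotypes is scaled to the number of
    unique clonotypes in the smaller of the two clusters.
    """
    unique_a, unique_b = set(cluster_a), set(cluster_b)
    smaller = min(len(unique_a), len(unique_b))
    if smaller == 0:
        # No clonotype in at least one cluster: define the overlap as 0.
        return 0.0
    return 100.0 * len(unique_a & unique_b) / smaller


# Hypothetical clonotype labels of the cells of two clusters.
cluster_1 = ["c1", "c2", "c3", "c3"]  # 3 unique clonotypes
cluster_2 = ["c2", "c4"]              # 2 unique clonotypes

overlap = clonotype_overlap_percent(cluster_1, cluster_2)
print(overlap)
```

In this example only `c2` is shared, and the smaller cluster has 2 unique clonotypes, so the overlap is 50%.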
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_1k_5gex/clust_cldiv_sc5p_v2_hs_PBMC_1k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_10k_5gex/clust_cldiv_sc5p_v2_hs_PBMC_10k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/clust_cldiv_sc5p_v2_hs_PBMC_Grp_keep.png
This group of graphs is similar to that of the global analysis, but by cluster, for each sample and for the merged dataset, so the interpretation is similar too.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_1k_5gex/aaProperties_sc5p_v2_hs_PBMC_1k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/sc5p_v2_hs_PBMC_10k_5gex/aaProperties_sc5p_v2_hs_PBMC_10k_5gex.png
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/TCR_results/Clusters_analysis/aaProperties_sc5p_v2_hs_PBMC_Grp_keep.png
This group of graphs is similar to that of the global analysis, but by cluster, for each sample and for the merged dataset, so the interpretation is similar too.
Provided by the Adding_BCR step.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/BCR_results/
The results provided for the BCRs are the same as for the TCR analysis, so I will not explain them again. We observe that all the clonotypes are unique except 3 (1 in sc5p_v2_hs_PBMC_1k_5gex and 2 in sc5p_v2_hs_PBMC_10k_5gex, and they are not the same clonotype). In addition, the clonotypes colocalize well with the cluster of B lymphocytes identified with the annotation. The cluster analysis is not very informative because we have only one cluster of B lymphocytes in this example. There is one clonotype in cluster 0, but it is probably an artefact.
Provided by the Cerebro step.
/mnt/beegfs/userdata/m_aglave/pipeline/single-cell/examples/complete_grouped_integrated_analysis_example_of_wiki/Results/GROUPED_ANALYSIS/NO_INTEGRATED/sc5p_v2_hs_PBMC_Grp_keep/NORMKEPT/pca/dims25_res0.4/sc5p_v2_hs_PBMC_Grp_keep_SCTransform_pca_26_0.6_ADT_TCR_BCR.crb
The Cerebro file can be loaded into the CerebroApp R Shiny application to explore the main results.
- The Cerebro file is not present in the results because its size exceeds the 50 MB threshold of GitHub for storing files.