diff --git a/README.Rmd b/README.Rmd index 5843832..08a151c 100644 --- a/README.Rmd +++ b/README.Rmd @@ -102,11 +102,11 @@ We provide several vignettes demonstrating the different types of analysis that ## Tutorials -We recommend users to start with the following vignette, which demonstrates the different steps in the analysis without too many details yet. This is the recommended vignette to learn the basics of MultiNicheNet. +We recommend users to start with the following vignette, which demonstrates the different steps in the analysis and exploration of the output. This is the recommended vignette to learn MultiNicheNet. * [**MultiNicheNet - comprehensive tutorial** - Condition A vs Condition B vs Condition C](vignettes/basic_analysis_steps_MISC.knit.md) | [_R Markdown version_](vignettes/basic_analysis_steps_MISC.Rmd) | [_HTML version_](vignettes/basic_analysis_steps_MISC.html) -That vignette provides an example of a comparison between 3 groups. The following vignettes demonstrate how to analyze cell-cell communication differences in other settings. These vignettes are the best vignettes to learn how to apply MultiNicheNet to different datastes for addressing different questions. To reduce the length of these vignettes, the sections on downstream analysis has been reduced strongly. So we strongly recommend to read these vignettes to learn how to perform the analysis in other settings, but still perform all additional analyses and checks as demonstrated in the comprehensive tutorial vignette . +That vignette provides an example of a comparison between 3 groups. The following vignettes demonstrate how to analyze cell-cell communication differences in other settings. These vignettes are the best vignettes to learn how to apply MultiNicheNet to different datastes for addressing different questions. To reduce the length of these vignettes, the sections on downstream analysis has been reduced strongly and a wrapper function is sometimes used to perform the core analysis. So we strongly recommend to read these vignettes to learn how to perform the analysis in different settings, but still perform all additional analyses and checks as demonstrated in the comprehensive tutorial vignette. * [Condition A vs Condition B - without repeated subjects](vignettes/pairwise_analysis_MISC.knit.md) | [_R Markdown version_](vignettes/pairwise_analysis_MISC.Rmd) | [_HTML version_](vignettes/pairwise_analysis_MISC.html) * [Condition A vs Condition B - with **repeated subjects**: paired analysis with subject_id as covariate](vignettes/paired_analysis_SCC.knit.md) | [_R Markdown version_](vignettes/paired_analysis_SCC.Rmd) | [_HTML version_](vignettes/paired_analysis_SCC.html) diff --git a/vignettes/basic_analysis_steps_MISC.Rmd b/vignettes/basic_analysis_steps_MISC.Rmd index bff4f6f..7ed2050 100644 --- a/vignettes/basic_analysis_steps_MISC.Rmd +++ b/vignettes/basic_analysis_steps_MISC.Rmd @@ -1011,7 +1011,7 @@ The following block of code will show how to visualize the activities for the to ```{r, fig.width=13, fig.height=8} ligands_oi = multinichenet_output$prioritization_tables$ligand_activities_target_de_tbl %>% inner_join(contrast_tbl) %>% - group_by(group, receiver) %>% + group_by(group, receiver) %>% filter(direction_regulation == "up") %>% distinct(ligand, receiver, group, activity) %>% top_n(5, activity) %>% pull(ligand) %>% unique() diff --git a/vignettes/basic_analysis_steps_MISC.html b/vignettes/basic_analysis_steps_MISC.html index f566ffd..5653bd2 100644 --- a/vignettes/basic_analysis_steps_MISC.html +++ b/vignettes/basic_analysis_steps_MISC.html @@ -11,7 +11,7 @@ - +
multinichenetr 2.0.0
@@ -968,7 +968,7 @@The first plot visualizes the number of cells per celltype-sample combination, and indicates which combinations are removed during the DE analysis because there are less than min_cells
in the celltype-sample combination.
abundance_info$abund_plot_sample
-
+
The red dotted line indicates the required minimum of cells as defined above in min_cells
. We can see here that some sample-celltype combinations are left out. For the DE analysis in the next step, only cell types will be considered if there are at least two samples per group with a sufficient number of cells. But as we can see here: all cell types will be considered for the analysis and there are no condition-specific cell types.
Important: Based on the cell type abundance diagnostics, we recommend users to change their analysis settings if required (eg changing cell type annotation level, batches, …), before proceeding with the rest of the analysis. If too many celltype-sample combinations don’t pass this threshold, we recommend to define your cell types in a more general way (use one level higher of the cell type ontology hierarchy) (eg TH17 CD4T cells –> CD4T cells) or use min_cells = 5
if this would not be possible.
Evaluate the distributions of p-values:
DE_info$hist_pvals
-
+
These distributions look fine (uniform distribution, except peak at p-value <= 0.05), so we will continue using these regular p-values. In case these p-value distributions look irregular, you can estimate empirical p-values as we will demonstrate in another vignette.
empirical_pval = FALSE
if(empirical_pval == TRUE){
@@ -1188,30 +1187,30 @@ 2.4.2 Combine DE information for
)
sender_receiver_de %>% head(20)
## # A tibble: 20 × 12
-## contrast sender receiver ligand receptor lfc_ligand lfc_receptor ligand_receptor_lfc_…¹ p_val_ligand p_adj_ligand
-## <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
-## 1 M-(S+A)/2 M_Mono… M_Monoc… C3 VSIG4 2.96 5.76 4.36 0.0964 0.439
-## 2 M-(S+A)/2 M_Mono… M_Monoc… IL10 IL10RB 5.61 0.715 3.16 0.0052 0.142
-## 3 M-(S+A)/2 M_Mono… L_T_TIM… IL18 IL18RAP 1.93 4.39 3.16 0.0445 0.321
-## 4 M-(S+A)/2 M_Mono… L_T_TIM… IL10 IL10RB 5.61 0.184 2.90 0.0052 0.142
-## 5 M-(S+A)/2 L_NK_C… L_T_TIM… IL18 IL18RAP 1.29 4.39 2.84 0.00758 0.231
-## 6 M-(S+A)/2 M_Mono… L_NK_CD… IL10 IL10RA 5.61 0.0393 2.82 0.0052 0.142
-## 7 M-(S+A)/2 M_Mono… L_T_TIM… IL10 IL10RA 5.61 -0.0612 2.77 0.0052 0.142
-## 8 M-(S+A)/2 M_Mono… L_NK_CD… IL10 IL10RB 5.61 -0.0727 2.77 0.0052 0.142
-## 9 M-(S+A)/2 M_Mono… M_Monoc… IL10 IL10RA 5.61 -0.316 2.65 0.0052 0.142
-## 10 M-(S+A)/2 M_Mono… L_T_TIM… C3 ITGAM 2.96 2.2 2.58 0.0964 0.439
-## 11 A-(S+M)/2 L_T_TI… M_Monoc… SCGB3… MARCO 4.49 0.611 2.55 0.0492 0.932
-## 12 M-(S+A)/2 M_Mono… M_Monoc… IL1B IL1RAP 1.53 3.56 2.54 0.0145 0.216
-## 13 M-(S+A)/2 M_Mono… M_Monoc… THBS1 CD47 4.58 0.446 2.51 0.00784 0.166
-## 14 M-(S+A)/2 L_T_TI… M_Monoc… TNFSF… CD163 0.484 4.54 2.51 0.282 0.84
-## 15 M-(S+A)/2 M_Mono… M_Monoc… THBS1 ITGA6 4.58 0.382 2.48 0.00784 0.166
-## 16 M-(S+A)/2 M_Mono… L_NK_CD… THBS1 ITGA6 4.58 0.352 2.47 0.00784 0.166
-## 17 M-(S+A)/2 M_Mono… L_NK_CD… THBS1 ITGA4 4.58 0.283 2.43 0.00784 0.166
-## 18 M-(S+A)/2 M_Mono… L_T_TIM… THBS1 ITGA6 4.58 0.263 2.42 0.00784 0.166
-## 19 M-(S+A)/2 L_T_TI… M_Monoc… HMGB1 CD163 0.226 4.54 2.38 0.111 0.649
-## 20 M-(S+A)/2 M_Mono… M_Monoc… HMGB1 CD163 0.154 4.54 2.35 0.287 0.673
+## contrast sender receiver ligand receptor lfc_ligand lfc_receptor ligand_receptor_lfc_…¹ p_val_ligand p_adj_ligand p_val_receptor
+## <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
+## 1 M-(S+A)/2 M_Monocy… M_Monoc… C3 VSIG4 2.96 5.76 4.36 0.0964 0.439 0.00154
+## 2 M-(S+A)/2 M_Monocy… M_Monoc… IL10 IL10RB 5.61 0.715 3.16 0.0052 0.142 0.0123
+## 3 M-(S+A)/2 M_Monocy… L_T_TIM… IL18 IL18RAP 1.93 4.39 3.16 0.0445 0.321 0.000199
+## 4 M-(S+A)/2 M_Monocy… L_T_TIM… IL10 IL10RB 5.61 0.184 2.90 0.0052 0.142 0.557
+## 5 M-(S+A)/2 L_NK_CD5… L_T_TIM… IL18 IL18RAP 1.29 4.39 2.84 0.00758 0.231 0.000199
+## 6 M-(S+A)/2 M_Monocy… L_NK_CD… IL10 IL10RA 5.61 0.0393 2.82 0.0052 0.142 0.814
+## 7 M-(S+A)/2 M_Monocy… L_T_TIM… IL10 IL10RA 5.61 -0.0612 2.77 0.0052 0.142 0.728
+## 8 M-(S+A)/2 M_Monocy… L_NK_CD… IL10 IL10RB 5.61 -0.0727 2.77 0.0052 0.142 0.78
+## 9 M-(S+A)/2 M_Monocy… M_Monoc… IL10 IL10RA 5.61 -0.316 2.65 0.0052 0.142 0.233
+## 10 M-(S+A)/2 M_Monocy… L_T_TIM… C3 ITGAM 2.96 2.2 2.58 0.0964 0.439 0.00108
+## 11 A-(S+M)/2 L_T_TIM3… M_Monoc… SCGB3… MARCO 4.49 0.611 2.55 0.0492 0.932 0.577
+## 12 M-(S+A)/2 M_Monocy… M_Monoc… IL1B IL1RAP 1.53 3.56 2.54 0.0145 0.216 0.00294
+## 13 M-(S+A)/2 M_Monocy… M_Monoc… THBS1 CD47 4.58 0.446 2.51 0.00784 0.166 0.0682
+## 14 M-(S+A)/2 L_T_TIM3… M_Monoc… TNFSF… CD163 0.484 4.54 2.51 0.282 0.84 0.00403
+## 15 M-(S+A)/2 M_Monocy… M_Monoc… THBS1 ITGA6 4.58 0.382 2.48 0.00784 0.166 0.527
+## 16 M-(S+A)/2 M_Monocy… L_NK_CD… THBS1 ITGA6 4.58 0.352 2.47 0.00784 0.166 0.0844
+## 17 M-(S+A)/2 M_Monocy… L_NK_CD… THBS1 ITGA4 4.58 0.283 2.43 0.00784 0.166 0.0206
+## 18 M-(S+A)/2 M_Monocy… L_T_TIM… THBS1 ITGA6 4.58 0.263 2.42 0.00784 0.166 0.321
+## 19 M-(S+A)/2 L_T_TIM3… M_Monoc… HMGB1 CD163 0.226 4.54 2.38 0.111 0.649 0.00403
+## 20 M-(S+A)/2 M_Monocy… M_Monoc… HMGB1 CD163 0.154 4.54 2.35 0.287 0.673 0.00403
## # ℹ abbreviated name: ¹ligand_receptor_lfc_avg
-## # ℹ 2 more variables: p_val_receptor <dbl>, p_adj_receptor <dbl>
+## # ℹ 1 more variable: p_adj_receptor <dbl>
We can see here that for all cell type / contrast combinations, all geneset/background ratio’s are within the recommended range (in_range_up
and in_range_down
columns). When these geneset/background ratio’s would not be within the recommended ranges, we should interpret ligand activity results for these cell types with more caution, or use different thresholds (for these or all cell types).
For the sake of demonstration, we will also calculate these ratio’s in case we would use the adjusted p-value as threshold.
geneset_assessment_adjustedPval = contrast_tbl$contrast %>%
@@ -1256,18 +1255,18 @@ 2.5.1 Assess geneset_oi-vs-backgr
bind_rows()
geneset_assessment_adjustedPval
## # A tibble: 9 × 12
-## cluster_id n_background n_geneset_up n_geneset_down prop_geneset_up prop_geneset_down in_range_up in_range_down
-## <chr> <int> <int> <int> <dbl> <dbl> <lgl> <lgl>
-## 1 L_NK_CD56._CD1… 6621 7 0 0.00106 0 FALSE FALSE
-## 2 L_T_TIM3._CD38… 8461 15 5 0.00177 0.000591 FALSE FALSE
-## 3 M_Monocyte_CD16 8817 25 11 0.00284 0.00125 FALSE FALSE
-## 4 L_NK_CD56._CD1… 6621 28 50 0.00423 0.00755 FALSE TRUE
-## 5 L_T_TIM3._CD38… 8461 2 5 0.000236 0.000591 FALSE FALSE
-## 6 M_Monocyte_CD16 8817 10 15 0.00113 0.00170 FALSE FALSE
-## 7 L_NK_CD56._CD1… 6621 36 19 0.00544 0.00287 TRUE FALSE
-## 8 L_T_TIM3._CD38… 8461 13 1 0.00154 0.000118 FALSE FALSE
-## 9 M_Monocyte_CD16 8817 4 3 0.000454 0.000340 FALSE FALSE
-## # ℹ 4 more variables: contrast <chr>, logFC_threshold <dbl>, p_val_threshold <dbl>, adjusted <lgl>
+## cluster_id n_background n_geneset_up n_geneset_down prop_geneset_up prop_geneset_down in_range_up in_range_down contrast
+## <chr> <int> <int> <int> <dbl> <dbl> <lgl> <lgl> <chr>
+## 1 L_NK_CD56._CD16. 6621 7 0 0.00106 0 FALSE FALSE M-(S+A)/2
+## 2 L_T_TIM3._CD38._HLADR. 8461 15 5 0.00177 0.000591 FALSE FALSE M-(S+A)/2
+## 3 M_Monocyte_CD16 8817 25 11 0.00284 0.00125 FALSE FALSE M-(S+A)/2
+## 4 L_NK_CD56._CD16. 6621 28 50 0.00423 0.00755 FALSE TRUE S-(M+A)/2
+## 5 L_T_TIM3._CD38._HLADR. 8461 2 5 0.000236 0.000591 FALSE FALSE S-(M+A)/2
+## 6 M_Monocyte_CD16 8817 10 15 0.00113 0.00170 FALSE FALSE S-(M+A)/2
+## 7 L_NK_CD56._CD16. 6621 36 19 0.00544 0.00287 TRUE FALSE A-(S+M)/2
+## 8 L_T_TIM3._CD38._HLADR. 8461 13 1 0.00154 0.000118 FALSE FALSE A-(S+M)/2
+## 9 M_Monocyte_CD16 8817 4 3 0.000454 0.000340 FALSE FALSE A-(S+M)/2
+## # ℹ 3 more variables: logFC_threshold <dbl>, p_val_threshold <dbl>, adjusted <lgl>
We can see here that for most cell type / contrast combinations, the geneset/background ratio’s are not within the recommended range. Therefore, we will proceed with the default tresholds for the ligand activity analysis
First: group-based summary table
prioritization_tables$group_prioritization_tbl %>% head(20)
## # A tibble: 20 × 18
-## contrast group sender receiver ligand receptor lr_interaction id scaled_lfc_ligand scaled_p_val_ligand_…¹
-## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
-## 1 M-(S+A)/2 M M_Monocyte_… L_NK_CD… HLA.E KLRC1 HLA.E_KLRC1 HLA.… 0.816 0.982
-## 2 M-(S+A)/2 M L_T_TIM3._C… M_Monoc… IFNG IFNGR1 IFNG_IFNGR1 IFNG… 0.958 0.946
-## 3 A-(S+M)/2 A M_Monocyte_… L_T_TIM… LGALS3 LAG3 LGALS3_LAG3 LGAL… 0.918 0.980
-## 4 M-(S+A)/2 M L_T_TIM3._C… M_Monoc… IFNG IFNGR2 IFNG_IFNGR2 IFNG… 0.958 0.946
-## 5 M-(S+A)/2 M M_Monocyte_… L_T_TIM… CXCL16 CXCR6 CXCL16_CXCR6 CXCL… 0.861 0.936
-## 6 A-(S+M)/2 A M_Monocyte_… M_Monoc… VCAN TLR2 VCAN_TLR2 VCAN… 0.942 0.812
-## 7 M-(S+A)/2 M M_Monocyte_… L_T_TIM… HLA.E CD8A HLA.E_CD8A HLA.… 0.816 0.982
-## 8 M-(S+A)/2 M M_Monocyte_… M_Monoc… S100A9 CD68 S100A9_CD68 S100… 0.891 0.880
-## 9 M-(S+A)/2 M M_Monocyte_… L_T_TIM… HLA.D… LAG3 HLA.DRA_LAG3 HLA.… 0.886 0.978
-## 10 M-(S+A)/2 M M_Monocyte_… L_T_TIM… S100A8 CD69 S100A8_CD69 S100… 0.871 0.791
-## 11 M-(S+A)/2 M M_Monocyte_… M_Monoc… HLA.F LILRB1 HLA.F_LILRB1 HLA.… 0.890 0.985
-## 12 M-(S+A)/2 M M_Monocyte_… M_Monoc… TNF LTBR TNF_LTBR TNF_… 0.984 0.953
-## 13 A-(S+M)/2 A L_T_TIM3._C… L_T_TIM… LGALS… ITGB1 LGALS3BP_ITGB1 LGAL… 0.947 0.995
-## 14 A-(S+M)/2 A M_Monocyte_… M_Monoc… S100A8 CD36 S100A8_CD36 S100… 0.830 0.697
-## 15 M-(S+A)/2 M M_Monocyte_… M_Monoc… HLA.F LILRB2 HLA.F_LILRB2 HLA.… 0.890 0.985
-## 16 M-(S+A)/2 M M_Monocyte_… L_T_TIM… HLA.A CD8A HLA.A_CD8A HLA.… 0.925 0.999
-## 17 M-(S+A)/2 M M_Monocyte_… L_NK_CD… HLA.C KIR2DL1 HLA.C_KIR2DL1 HLA.… 0.897 0.972
-## 18 S-(M+A)/2 S L_T_TIM3._C… L_T_TIM… CD99 CD81 CD99_CD81 CD99… 0.656 0.780
-## 19 S-(M+A)/2 S L_NK_CD56._… M_Monoc… TGFB1 ENG TGFB1_ENG TGFB… 0.672 0.873
-## 20 S-(M+A)/2 S M_Monocyte_… M_Monoc… TGFB1 ENG TGFB1_ENG TGFB… 0.791 0.930
+## contrast group sender receiver ligand receptor lr_interaction id scaled_lfc_ligand scaled_p_val_ligand_…¹ scaled_lfc_receptor
+## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl>
+## 1 M-(S+A)/2 M M_Monocy… L_NK_CD… HLA.E KLRC1 HLA.E_KLRC1 HLA.… 0.816 0.982 0.954
+## 2 M-(S+A)/2 M L_T_TIM3… M_Monoc… IFNG IFNGR1 IFNG_IFNGR1 IFNG… 0.958 0.946 0.863
+## 3 A-(S+M)/2 A M_Monocy… L_T_TIM… LGALS3 LAG3 LGALS3_LAG3 LGAL… 0.918 0.980 0.985
+## 4 M-(S+A)/2 M L_T_TIM3… M_Monoc… IFNG IFNGR2 IFNG_IFNGR2 IFNG… 0.958 0.946 0.691
+## 5 M-(S+A)/2 M M_Monocy… L_T_TIM… CXCL16 CXCR6 CXCL16_CXCR6 CXCL… 0.861 0.936 0.966
+## 6 A-(S+M)/2 A M_Monocy… M_Monoc… VCAN TLR2 VCAN_TLR2 VCAN… 0.942 0.812 0.948
+## 7 M-(S+A)/2 M M_Monocy… L_T_TIM… HLA.E CD8A HLA.E_CD8A HLA.… 0.816 0.982 0.834
+## 8 M-(S+A)/2 M M_Monocy… M_Monoc… S100A9 CD68 S100A9_CD68 S100… 0.891 0.880 0.857
+## 9 M-(S+A)/2 M M_Monocy… L_T_TIM… HLA.D… LAG3 HLA.DRA_LAG3 HLA.… 0.886 0.978 0.784
+## 10 M-(S+A)/2 M M_Monocy… L_T_TIM… S100A8 CD69 S100A8_CD69 S100… 0.871 0.791 0.920
+## 11 M-(S+A)/2 M M_Monocy… M_Monoc… HLA.F LILRB1 HLA.F_LILRB1 HLA.… 0.890 0.985 0.982
+## 12 M-(S+A)/2 M M_Monocy… M_Monoc… TNF LTBR TNF_LTBR TNF_… 0.984 0.953 0.902
+## 13 A-(S+M)/2 A L_T_TIM3… L_T_TIM… LGALS… ITGB1 LGALS3BP_ITGB1 LGAL… 0.947 0.995 0.937
+## 14 A-(S+M)/2 A M_Monocy… M_Monoc… S100A8 CD36 S100A8_CD36 S100… 0.830 0.697 0.969
+## 15 M-(S+A)/2 M M_Monocy… M_Monoc… HLA.F LILRB2 HLA.F_LILRB2 HLA.… 0.890 0.985 0.942
+## 16 M-(S+A)/2 M M_Monocy… L_T_TIM… HLA.A CD8A HLA.A_CD8A HLA.… 0.925 0.999 0.834
+## 17 M-(S+A)/2 M M_Monocy… L_NK_CD… HLA.C KIR2DL1 HLA.C_KIR2DL1 HLA.… 0.897 0.972 0.790
+## 18 S-(M+A)/2 S L_T_TIM3… L_T_TIM… CD99 CD81 CD99_CD81 CD99… 0.656 0.780 0.962
+## 19 S-(M+A)/2 S L_NK_CD5… M_Monoc… TGFB1 ENG TGFB1_ENG TGFB… 0.672 0.873 0.974
+## 20 S-(M+A)/2 S M_Monocy… M_Monoc… TGFB1 ENG TGFB1_ENG TGFB… 0.791 0.930 0.974
## # ℹ abbreviated name: ¹scaled_p_val_ligand_adapted
-## # ℹ 8 more variables: scaled_lfc_receptor <dbl>, scaled_p_val_receptor_adapted <dbl>, max_scaled_activity <dbl>,
-## # scaled_pb_ligand <dbl>, scaled_pb_receptor <dbl>, fraction_expressing_ligand_receptor <dbl>,
-## # prioritization_score <dbl>, top_group <chr>
+## # ℹ 7 more variables: scaled_p_val_receptor_adapted <dbl>, max_scaled_activity <dbl>, scaled_pb_ligand <dbl>,
+## # scaled_pb_receptor <dbl>, fraction_expressing_ligand_receptor <dbl>, prioritization_score <dbl>, top_group <chr>
This table gives the final prioritization score of each interaction, and the values of the individual prioritization criteria.
With this step, all required steps are finished. Now, we can optionally still run the following steps * Calculate the across-samples expression correlation between ligand-receptor pairs and target genes @@ -1462,7 +1460,7 @@
Whereas these ChordDiagram circos plots show the most specific interactions per group, they don’t give insights into the data behind these predictions. Because inspecting the data behind the prioritization is recommended to decide on which interactions to validate, we created several functionalities to do this.
Therefore we will now generate “interpretable bubble plots” that indicate the different prioritization criteria used in MultiNicheNet.
+
Some notes about this plot:
* Samples that were left out of the DE analysis (because too few cells in that celltype-sample combination) are indicated with a smaller dot. This helps to indicate the samples that did not contribute to the calculation of the logFC, and thus not contributed to the final prioritization.
* As you can see, the HEBP1-FPR2 interaction does not have Omnipath DB scores. This is because this LR pair was not documented by the Omnipath LR database. Instead it was documented by the original NicheNet LR network (source: Guide2Pharmacology) as can be seen in the table (lr_network_all %>% filter(ligand == "HEBP1" & receptor == "FPR2")
).
and finally for the A group
group_oi = "A"
prioritized_tbl_oi_A_50 = prioritized_tbl_oi_all %>%
@@ -1507,7 +1505,7 @@ 3.1.2 Interpretable bubble plots<
prioritized_tbl_oi_A_50 %>% inner_join(lr_network_all)
)
plot_oi
-
+
As you could observe from the Circos ChordDiagram and Interpretable Bubble plots above: we find more specific interactions for the M-group than for the S- and A-group here. If you want to visualize more interactions specific for a group of interest, so not restricted to e.g. the top50 overall, but the top50 for a group of interest, you can run the following:
prioritized_tbl_oi_A_50 = get_top_n_lr_pairs(
@@ -1519,7 +1517,7 @@ 3.1.2 Interpretable bubble plots<
prioritized_tbl_oi_A_50 %>% inner_join(lr_network_all)
)
plot_oi
-+
Typically, there are way more than 50 differentially expressed and active ligand-receptor pairs per group across all sender-receiver combinations. Therefore it might be useful to zoom in on specific cell types as senders/receivers:
We will illustrate this for the “M_Monocyte_CD16” cell type as receiver in the M group:
group_oi = "M"
@@ -1534,7 +1532,7 @@ 3.1.2 Interpretable bubble plots<
prioritized_tbl_oi_M_50 %>% inner_join(lr_network_all)
)
plot_oi
-
+
And now as sender:
prioritized_tbl_oi_M_50 = get_top_n_lr_pairs(
multinichenet_output$prioritization_tables,
@@ -1545,7 +1543,7 @@ 3.1.2 Interpretable bubble plots<
multinichenet_output$prioritization_tables,
prioritized_tbl_oi_M_50 %>% inner_join(lr_network_all))
plot_oi
-
+
These two types of plots created above (Circos ChordDiagram and Interpretable Bubble Plot) for the most strongly prioritized interactions are the types of plot you should always create and inspect as an end-user.
The plots that we will discuss in the rest of the vignette are more optional, and can help to dive more deeply in the data. They are however not as necessary as the plots above.
So, let’s now continue with more detailed plots and downstream functionalities:
@@ -1567,18 +1565,18 @@Second, subset on ligands/receptors as target genes
lr_target_df %>% filter(target %in% union(lr_network$ligand, lr_network$receptor))
## # A tibble: 486 × 8
-## group sender receiver ligand receptor id target direction_regulation
-## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <fct>
-## 1 M M_Monocyte_CD16 L_NK_CD56._CD16. HLA.E KLRC1 HLA.E_KLRC1_M_Monocyte_C… AREG up
-## 2 M M_Monocyte_CD16 L_NK_CD56._CD16. HLA.E KLRC1 HLA.E_KLRC1_M_Monocyte_C… CD55 up
-## 3 M M_Monocyte_CD16 L_NK_CD56._CD16. HLA.E KLRC1 HLA.E_KLRC1_M_Monocyte_C… KLRC1 up
-## 4 M M_Monocyte_CD16 L_NK_CD56._CD16. HLA.E KLRC1 HLA.E_KLRC1_M_Monocyte_C… S100A8 up
-## 5 M M_Monocyte_CD16 L_NK_CD56._CD16. HLA.E KLRC1 HLA.E_KLRC1_M_Monocyte_C… SLC1A5 up
-## 6 M M_Monocyte_CD16 L_NK_CD56._CD16. HLA.E KLRC1 HLA.E_KLRC1_M_Monocyte_C… TFRC up
-## 7 M M_Monocyte_CD16 L_NK_CD56._CD16. HLA.E KLRC1 HLA.E_KLRC1_M_Monocyte_C… TIMP1 down
-## 8 M L_T_TIM3._CD38._HLADR. M_Monocyte_CD16 IFNG IFNGR1 IFNG_IFNGR1_L_T_TIM3._CD… B2M up
-## 9 M L_T_TIM3._CD38._HLADR. M_Monocyte_CD16 IFNG IFNGR1 IFNG_IFNGR1_L_T_TIM3._CD… BTN3A1 up
-## 10 M L_T_TIM3._CD38._HLADR. M_Monocyte_CD16 IFNG IFNGR1 IFNG_IFNGR1_L_T_TIM3._CD… C1QB up
+## group sender receiver ligand receptor id target direction_regulation
+## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <fct>
+## 1 M M_Monocyte_CD16 L_NK_CD56._CD16. HLA.E KLRC1 HLA.E_KLRC1_M_Monocyte_CD16_L_NK_CD56._CD… AREG up
+## 2 M M_Monocyte_CD16 L_NK_CD56._CD16. HLA.E KLRC1 HLA.E_KLRC1_M_Monocyte_CD16_L_NK_CD56._CD… CD55 up
+## 3 M M_Monocyte_CD16 L_NK_CD56._CD16. HLA.E KLRC1 HLA.E_KLRC1_M_Monocyte_CD16_L_NK_CD56._CD… KLRC1 up
+## 4 M M_Monocyte_CD16 L_NK_CD56._CD16. HLA.E KLRC1 HLA.E_KLRC1_M_Monocyte_CD16_L_NK_CD56._CD… S100A8 up
+## 5 M M_Monocyte_CD16 L_NK_CD56._CD16. HLA.E KLRC1 HLA.E_KLRC1_M_Monocyte_CD16_L_NK_CD56._CD… SLC1A5 up
+## 6 M M_Monocyte_CD16 L_NK_CD56._CD16. HLA.E KLRC1 HLA.E_KLRC1_M_Monocyte_CD16_L_NK_CD56._CD… TFRC up
+## 7 M M_Monocyte_CD16 L_NK_CD56._CD16. HLA.E KLRC1 HLA.E_KLRC1_M_Monocyte_CD16_L_NK_CD56._CD… TIMP1 down
+## 8 M L_T_TIM3._CD38._HLADR. M_Monocyte_CD16 IFNG IFNGR1 IFNG_IFNGR1_L_T_TIM3._CD38._HLADR._M_Mono… B2M up
+## 9 M L_T_TIM3._CD38._HLADR. M_Monocyte_CD16 IFNG IFNGR1 IFNG_IFNGR1_L_T_TIM3._CD38._HLADR._M_Mono… BTN3A1 up
+## 10 M L_T_TIM3._CD38._HLADR. M_Monocyte_CD16 IFNG IFNGR1 IFNG_IFNGR1_L_T_TIM3._CD38._HLADR._M_Mono… C1QB up
## # ℹ 476 more rows
Whereas these code blocks are just to demonstrate that this type of information is available in MultiNicheNet, the next block of code will infer the systems-wide intercellular regulatory network automatically:
network = infer_intercellular_regulatory_network(lr_target_df, prioritized_tbl_oi_all)
@@ -1606,7 +1604,7 @@ 3.2.1 Without filtering of target
colors_sender["L_T_TIM3._CD38._HLADR."] = "pink" # the original yellow background with white font is not very readable
network_graph = visualize_network(network, colors_sender)
network_graph$plot
-
+
As you can see here: we can see see here that several prioritized ligands seem to be regulated by other prioritized ligands! But, it may be challenging sometimes to discern individual links when several interactions are shown. Therefore, inspection of the underlying data tables (network$links
and network$nodes
) may be necessary to discern individual interactions. It is also suggested to export these data tables into more sophisticated network visualization tools (e.g., CytoScape) for better inspection of this network.
To inspect interactions involving specific ligands, such as IFNG as example, we can run the following code:
network$nodes %>% filter(gene == "IFNG")
@@ -1698,7 +1696,7 @@ 3.2.2 With filtering of target ge
colors_sender["L_T_TIM3._CD38._HLADR."] = "pink" # the original yellow background with white font is not very readable
network_graph = visualize_network(network, colors_sender)
network_graph$plot
-
+
As can be expected, we see fewer links here than in the previously generated intercellular regulatory network. The links that are not present anymore in this network are those ligand-target links that are not supported by high across-sample expression correlation. In conclusion, the links visualized here are the most trustworthy ones, since they are both supported by prior knowledge and expression correlation.
Interestingly, ligands/receptors visualized in this network can be considered as additionally prioritized because they are not only a prioritized ligand/receptor but also a target gene of another prioritized ligand-receptor interaction! So, we can also use this network to further prioritize differential CCC interactions. We can get these interactions as follows:
network$prioritized_lr_interactions
@@ -1720,18 +1718,18 @@ 3.2.2 With filtering of target ge
network$prioritized_lr_interactions)
prioritized_tbl_oi_network
## # A tibble: 27 × 8
-## group sender receiver ligand receptor id prioritization_score prioritization_rank
-## <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
-## 1 M M_Monocyte_CD16 L_NK_CD56._CD16. HLA.E KLRC1 HLA.… 0.956 1
-## 2 M L_T_TIM3._CD38._HLADR. M_Monocyte_CD16 IFNG IFNGR1 IFNG… 0.948 2
-## 3 M L_T_TIM3._CD38._HLADR. M_Monocyte_CD16 IFNG IFNGR2 IFNG… 0.928 4
-## 4 M M_Monocyte_CD16 L_T_TIM3._CD38._HLADR. CXCL16 CXCR6 CXCL… 0.922 5
-## 5 M M_Monocyte_CD16 L_T_TIM3._CD38._HLADR. HLA.E CD8A HLA.… 0.919 7
-## 6 M M_Monocyte_CD16 M_Monocyte_CD16 S100A9 CD68 S100… 0.917 8
-## 7 M M_Monocyte_CD16 L_T_TIM3._CD38._HLADR. HLA.DRA LAG3 HLA.… 0.910 9
-## 8 M M_Monocyte_CD16 M_Monocyte_CD16 HLA.F LILRB1 HLA.… 0.903 11
-## 9 M M_Monocyte_CD16 M_Monocyte_CD16 TNF LTBR TNF_… 0.902 12
-## 10 M M_Monocyte_CD16 M_Monocyte_CD16 HLA.F LILRB2 HLA.… 0.898 15
+## group sender receiver ligand receptor id prioritization_score prioritization_rank
+## <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
+## 1 M M_Monocyte_CD16 L_NK_CD56._CD16. HLA.E KLRC1 HLA.E_KLRC1_M_Monocyt… 0.956 1
+## 2 M L_T_TIM3._CD38._HLADR. M_Monocyte_CD16 IFNG IFNGR1 IFNG_IFNGR1_L_T_TIM3.… 0.948 2
+## 3 M L_T_TIM3._CD38._HLADR. M_Monocyte_CD16 IFNG IFNGR2 IFNG_IFNGR2_L_T_TIM3.… 0.928 4
+## 4 M M_Monocyte_CD16 L_T_TIM3._CD38._HLADR. CXCL16 CXCR6 CXCL16_CXCR6_M_Monocy… 0.922 5
+## 5 M M_Monocyte_CD16 L_T_TIM3._CD38._HLADR. HLA.E CD8A HLA.E_CD8A_M_Monocyte… 0.919 7
+## 6 M M_Monocyte_CD16 M_Monocyte_CD16 S100A9 CD68 S100A9_CD68_M_Monocyt… 0.917 8
+## 7 M M_Monocyte_CD16 L_T_TIM3._CD38._HLADR. HLA.DRA LAG3 HLA.DRA_LAG3_M_Monocy… 0.910 9
+## 8 M M_Monocyte_CD16 M_Monocyte_CD16 HLA.F LILRB1 HLA.F_LILRB1_M_Monocy… 0.903 11
+## 9 M M_Monocyte_CD16 M_Monocyte_CD16 TNF LTBR TNF_LTBR_M_Monocyte_C… 0.902 12
+## 10 M M_Monocyte_CD16 M_Monocyte_CD16 HLA.F LILRB2 HLA.F_LILRB2_M_Monocy… 0.898 15
## # ℹ 17 more rows
Visualize now the expression and activity of these interactions for the M group
group_oi = "M"
@@ -1742,7 +1740,7 @@ 3.2.2 With filtering of target ge
prioritized_tbl_oi_M %>% inner_join(lr_network_all)
)
plot_oi
-
+
To summarize: this interpretable bubble plot is an important and helpful plot because:
1) these LR interactions are all in the overall top50 of condition-specific interactions
2) they are a likely interaction inducing one or more other prioritized LR interaction and/or they are regulated by one or more other prioritized LR interactions.
@@ -1757,7 +1755,7 @@
3.3 Visualize sender-agnostic lig
The following block of code will show how to visualize the activities for the top5 ligands for each receiver cell type - condition combination:
ligands_oi = multinichenet_output$prioritization_tables$ligand_activities_target_de_tbl %>%
inner_join(contrast_tbl) %>%
- group_by(group, receiver) %>%
+ group_by(group, receiver) %>% filter(direction_regulation == "up") %>%
distinct(ligand, receiver, group, activity) %>%
top_n(5, activity) %>%
pull(ligand) %>% unique()
@@ -1768,7 +1766,7 @@ 3.3 Visualize sender-agnostic lig
contrast_tbl,
widths = NULL)
plot_oi
-
+
Interestingly, we can here see a clear type I interferon (IFNA1, IFNB1, …) ligand activity signature in the A-group with predicted upregulatory activity in NK cells and T cells. Because type I interferons were not (sufficiently high) expressed by the cell types in our dataset, they were not retrieved by the classic MultiNicheNet analysis. However, they may have an important role in the Adult COVID19 patient group, as is supported by literature! This demonstrates the usefulness of this analysis: it can help you in having an idea about relevant ligands not captured in the data at hand but with a strong predicted target gene signature in one of the cell types in the data.
Note you can replace the automatically determined ligands_oi
by any set of ligands that are of interest to you.
With this plot/downstream analysis, we end the overview of visualizations that can help you in finding interesting hypotheses about important differential ligand-receptor interactions in your data. In case you ended up with a shortlist of interactions for further checks and potential experimental validation, we recommend going over the visualizations that are introduced in the next section. They are some additional “sound checks” for your shortlist of interactions. However, we don’t recommend generating these plots before having thoroughly analyzed and inspected all the previous visualizations. Only go further now if you understood all the previous steps to avoid getting more overwhelmed.
@@ -1801,10 +1799,10 @@ 3.4.1.1 Without filtering of targ
plot_legend = FALSE)
combined_plot
## $combined_plot
-
+
##
## $legends
-+
One observation we can make here is that several genes upregulated in the M-group are indeed high-confident target genes of IFNG (dark purple - high regulatory potential scores). Most of these genes are also potential target genes of TNF, but some specific genes are present as well.
Whereas this plot just showed the top ligands for a certain receiver-contrast, you can also zoom in on specific ligands of interest. As example, we will look at IFNG and IL15:
group_oi = "M"
@@ -1829,10 +1827,10 @@ 3.4.1.1 Without filtering of targ
plot_legend = FALSE)
combined_plot
## $combined_plot
-
+
##
## $legends
-
+
In summary, these “Ligand activity - target gene combination plots” show how well ligand-target links are supported by general prior knowledge, but not whether they are likely to be active in the system under study. That’s what we will look at now.
+
This visualization can help users assess whether ligand-target links that are supported by general prior knowledge, are also potentially active in the system under study: target genes that show across-sample expression correlation with their upstream ligand-receptor pairs may be more likely true target genes than target genes that don’t show this pattern.
Even though this plot indicates the strength of the correlation between ligand-receptor expression and target gene expression, it’s hard to assess the pattern of correlation. To help users evaluate whether high correlation values are not due to artifacts, we provide the following LR-target expression scatter plot visualization for a selected LR pair and their targets:
ligand_oi = "IFNG"
@@ -1885,7 +1883,7 @@ 3.4.1.2 With filtering of target
multinichenet_output$grouping_tbl,
lr_target_prior_cor_filtered)
lr_target_scatter_plot
-
+
As mentioned, we can also inspect the network table documenting the underlying data source(s) behind each of the links shown in this graph. This analysis can help users to assess the trustworthiness of ligand-target predictions.
data_source_network %>% head()
## # A tibble: 6 × 5
@@ -2000,7 +1998,7 @@ 3.4.3.1 Zoom in on specific ligan
sample_id = sample_id,
celltype_id = celltype_id)
p_violin
-
+
##
## [[2]]
-
+
p_target$singlecell_plot + ggtitle("DE genes (single-cell expression)")
-
+
Among these DE genes, you may be most interested in ligands or receptors
Ligands:
group_oi = "M"
@@ -2065,9 +2063,9 @@ 3.4.4 Visualize top DE genes for
celltype_oi = receiver_oi,
multinichenet_output$grouping_tbl)
p_target$pseudobulk_plot + ggtitle("DE ligands (pseudobulk expression)")
-
+
p_target$singlecell_plot + ggtitle("DE ligands (single-cell expression)")
-
+
Receptors:
group_oi = "M"
receiver_oi = "M_Monocyte_CD16"
@@ -2089,9 +2087,9 @@ 3.4.4 Visualize top DE genes for
celltype_oi = receiver_oi,
multinichenet_output$grouping_tbl)
p_target$pseudobulk_plot + ggtitle("DE receptors (pseudobulk expression)")
-
+
p_target$singlecell_plot + ggtitle("DE receptors (single-cell expression)")
-
+