diff --git a/.DS_Store b/.DS_Store index 5008ddf..3e5a709 100644 Binary files a/.DS_Store and b/.DS_Store differ diff --git a/README.Rmd b/README.Rmd index 846ae7e..19f9055 100644 --- a/README.Rmd +++ b/README.Rmd @@ -77,8 +77,19 @@ We provide several vignettes demonstrating the different types of analysis that We recommend users to start with the following vignette, which demonstrates the different steps in the analysis without too many details yet. This is the recommended vignette to learn the basics of MultiNicheNet. +TEST +TEST +TEST +TEST + +* [MultiNicheNet analysis: MIS-C threewise comparison - step-by-step](vignettes/basic_analysis_steps_MISC.html): `vignette("basic_analysis_steps_MISC", package="multinichenetr")` * [MultiNicheNet analysis: MIS-C threewise comparison - step-by-step](vignettes/basic_analysis_steps_MISC.md): `vignette("basic_analysis_steps_MISC", package="multinichenetr")` +TEST +TEST +TEST +TEST + This vignette provides an example of a comparison between 3 groups. The following vignettes demonstrate how to analyze cell-cell communication differences in other settings. For sake of simplicity, these vignettes also use a MultiNicheNet wrapper function, which encompasses the different steps demonstrated in the previous vignette. These vignettes are the best vignettes to learn how to apply MultiNicheNet to different datastes for addressing different questions. * [MultiNicheNet analysis: MIS-C pairwise comparison - wrapper function](vignettes/pairwise_analysis_MISC.md): `vignette("pairwise_analysis_MISC.Rmd", package="multinichenetr")` diff --git a/README.md b/README.md index 4459619..6b8892c 100644 --- a/README.md +++ b/README.md @@ -111,8 +111,15 @@ with: devtools::install_github("saeyslab/nichenetr") devtools::install_github("saeyslab/multinichenetr") +It is possible that during installation the following warning is thrown: + +“glmmTMB was built with TMB version 1.9.4” “Current TMB version is 1.9.5 + +This warning can be safely ignored since this does not affect +multinichenetr’s installation and functionalities. + multinichenetr is tested via Github Actions version control on Windows, -Linux (Ubuntu) and Mac (most recently tested R version: R 4.3.0.). +Linux (Ubuntu) and Mac (most recently tested R version: R 4.3.1.). ## Learning to use multinichenetr @@ -125,10 +132,17 @@ demonstrates the different steps in the analysis without too many details yet. This is the recommended vignette to learn the basics of MultiNicheNet. +TEST TEST TEST TEST + +- [MultiNicheNet analysis: MIS-C threewise comparison - + step-by-step](vignettes/basic_analysis_steps_MISC.html): + `vignette("basic_analysis_steps_MISC", package="multinichenetr")` - [MultiNicheNet analysis: MIS-C threewise comparison - step-by-step](vignettes/basic_analysis_steps_MISC.md): `vignette("basic_analysis_steps_MISC", package="multinichenetr")` +TEST TEST TEST TEST + This vignette provides an example of a comparison between 3 groups. The following vignettes demonstrate how to analyze cell-cell communication differences in other settings. For sake of simplicity, these vignettes diff --git a/vignettes/.DS_Store b/vignettes/.DS_Store new file mode 100644 index 0000000..7aebd86 Binary files /dev/null and b/vignettes/.DS_Store differ diff --git a/vignettes/add_proteomics_MISC.Rmd b/vignettes/add_proteomics_MISC.Rmd index 5adde27..d74702b 100644 --- a/vignettes/add_proteomics_MISC.Rmd +++ b/vignettes/add_proteomics_MISC.Rmd @@ -457,8 +457,8 @@ Of this data, we will only keep information about ligands and receptors ```{r} olink_df = xlsx::read.xlsx( - "/Users/robinb/Work/current_projects/MISC OLINK/data/summary_difference_diorioMIS-C-vs-HC.xlsx", 1 - ) %>% as_tibble() %>% + url("https://zenodo.org/records/10908003/files/summary_difference_diorioMIS-C-vs-HC.xlsx"), + 1) %>% as_tibble() %>% dplyr::rename(gene = variables, logFC = mean_diff_d0, pval = adj_p_values) %>% dplyr::select(gene, logFC, pval) %>% dplyr::mutate(gene = gene %>% make.names()) %>% diff --git a/vignettes/basic_analysis_steps_MISC.Rmd b/vignettes/basic_analysis_steps_MISC.Rmd index a79ac21..bcf5efc 100644 --- a/vignettes/basic_analysis_steps_MISC.Rmd +++ b/vignettes/basic_analysis_steps_MISC.Rmd @@ -4,6 +4,7 @@ author: "Robin Browaeys" package: "`r BiocStyle::pkg_ver('multinichenetr')`" output: BiocStyle::html_document +output_dir: "/Users/robinb/Work/multinichenetr/vignettes" vignette: > %\VignetteIndexEntry{MultiNicheNet analysis: MIS-C threewise comparison - step-by-step} %\VignetteEngine{knitr::rmarkdown} @@ -26,6 +27,10 @@ knitr::opts_chunk$set( library(BiocStyle) ``` + + In this vignette, you can learn how to perform a MultiNicheNet analysis to compare cell-cell communication between conditions of interest. A MultiNicheNet analysis can be performed if you have multi-sample, multi-condition/group single-cell data. We strongly recommend having at least 4 samples in each of the groups/conditions you want to compare. With less samples, the benefits of performing a pseudobulk-based DE analysis are less clear. For those datasets, you can check and run our alternative workflow that makes use of cell-level sample-agnostic differential expression tools. As input you need a SingleCellExperiment object containing at least the raw count matrix and metadata providing the following information for each cell: the **group**, **sample** and **cell type**. @@ -647,11 +652,11 @@ In multi-sample datasets, we have the opportunity to look whether expression of ```{r} lr_target_prior_cor = lr_target_prior_cor_inference( - receivers_oi = multinichenet_output$prioritization_tables$group_prioritization_tbl$receiver %>% unique(), + receivers_oi = prioritization_tables$group_prioritization_tbl$receiver %>% unique(), abundance_expression_info = abundance_expression_info, celltype_de = celltype_de, grouping_tbl = grouping_tbl, - prioritization_tables = multinichenet_output$prioritization_tables, + prioritization_tables = prioritization_tables, ligand_target_matrix = ligand_target_matrix, logFC_threshold = logFC_threshold, p_val_threshold = p_val_threshold, diff --git a/vignettes/basic_analysis_steps_MISC.html b/vignettes/basic_analysis_steps_MISC.html index 626bcec..ec06285 100644 --- a/vignettes/basic_analysis_steps_MISC.html +++ b/vignettes/basic_analysis_steps_MISC.html @@ -1,1747 +1,1829 @@ - +
- - + - + - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + - + + + + + + -Robin Browaeys 2023-06-06
- - -In this vignette, you can learn how to perform an all-vs-all -MultiNicheNet analysis. In this vignette, we start from one -SingleCellExperiment object containing cells from both sender and -receiver cell types and from different patients.
-A MultiNicheNet analysis can be performed if you have multi-sample, -multi-group single-cell data. MultiNicheNet will look for cell-cell -communication between the cell types in your data for each sample, and -compare the cell-cell communication patterns between the groups of -interest. Therefore, the absolute minimum of meta data you need to have, -are following columns indicating for each cell: the -group, sample and cell -type.
-As example expression data of interacting cells, we will here use -scRNAseq data of immune cells in MIS-C patients and healthy siblings -from this paper of Hoste et al.: TIM3+ -TRBV11-2 T cells and IFNγ signature in patrolling monocytes and CD16+ NK -cells delineate MIS-C . MIS-C (multisystem inflammatory syndrome in children) -is a novel rare immunodysregulation syndrome that can arise after -SARS-CoV-2 infection in children. We will use NicheNet to explore immune -cell crosstalk enriched in MIS-C compared to healthy siblings.
-In this vignette, we will prepare the data and analysis parameters, -and then perform the MultiNicheNet analysis.
-The different steps of the MultiNicheNet analysis are the -following:
+ +multinichenetr 2.0.0
+ +In this vignette, you can learn how to perform a MultiNicheNet analysis to compare cell-cell communication between conditions of interest. A MultiNicheNet analysis can be performed if you have multi-sample, multi-condition/group single-cell data. We strongly recommend having at least 4 samples in each of the groups/conditions you want to compare. With less samples, the benefits of performing a pseudobulk-based DE analysis are less clear. For those datasets, you can check and run our alternative workflow that makes use of cell-level sample-agnostic differential expression tools.
+As input you need a SingleCellExperiment object containing at least the raw count matrix and metadata providing the following information for each cell: the group, sample and cell type.
+As example expression data of interacting cells, we will here use scRNAseq data of immune cells in MIS-C patients and healthy siblings from this paper of Hoste et al.: TIM3+ TRBV11-2 T cells and IFNγ signature in patrolling monocytes and CD16+ NK cells delineate MIS-C . +MIS-C (multisystem inflammatory syndrome in children) is a novel rare immunodysregulation syndrome that can arise after SARS-CoV-2 infection in children. We will use MultiNicheNet to explore immune cell crosstalk enriched in MIS-C compared to healthy siblings and adult COVID-19 patients.
+In this vignette, we will first prepare the MultiNicheNet core analysis, then run the several steps in the MultiNicheNet core analysis, and finally interpret the output.
+library(SingleCellExperiment)
+library(dplyr)
+library(ggplot2)
+library(nichenetr)
+library(multinichenetr)
+MultiNicheNet builds upon the NicheNet framework and uses the same prior knowledge networks (ligand-receptor network and ligand-target matrix, currently v2 version).
+The Nichenet v2 networks and matrices for both mouse and human can be downloaded from Zenodo .
+We will read these object in for human because our expression data is of human patients.
+Gene names are here made syntactically valid via make.names()
to avoid the loss of genes (eg H2-M3) in downstream visualizations.
organism = "human"
+options(timeout = 120)
+
+if(organism == "human"){
+
+ lr_network_all =
+ readRDS(url(
+ "https://zenodo.org/record/10229222/files/lr_network_human_allInfo_30112033.rds"
+ )) %>%
+ mutate(
+ ligand = convert_alias_to_symbols(ligand, organism = organism),
+ receptor = convert_alias_to_symbols(receptor, organism = organism))
+
+ lr_network_all = lr_network_all %>%
+ mutate(ligand = make.names(ligand), receptor = make.names(receptor))
+
+ lr_network = lr_network_all %>%
+ distinct(ligand, receptor)
+
+ ligand_target_matrix = readRDS(url(
+ "https://zenodo.org/record/7074291/files/ligand_target_matrix_nsga2r_final.rds"
+ ))
+
+ colnames(ligand_target_matrix) = colnames(ligand_target_matrix) %>%
+ convert_alias_to_symbols(organism = organism) %>% make.names()
+ rownames(ligand_target_matrix) = rownames(ligand_target_matrix) %>%
+ convert_alias_to_symbols(organism = organism) %>% make.names()
+
+ lr_network = lr_network %>% filter(ligand %in% colnames(ligand_target_matrix))
+ ligand_target_matrix = ligand_target_matrix[, lr_network$ligand %>% unique()]
+
+} else if(organism == "mouse"){
+
+ lr_network_all = readRDS(url(
+ "https://zenodo.org/record/10229222/files/lr_network_mouse_allInfo_30112033.rds"
+ )) %>%
+ mutate(
+ ligand = convert_alias_to_symbols(ligand, organism = organism),
+ receptor = convert_alias_to_symbols(receptor, organism = organism))
+
+ lr_network_all = lr_network_all %>%
+ mutate(ligand = make.names(ligand), receptor = make.names(receptor))
+ lr_network = lr_network_all %>%
+ distinct(ligand, receptor)
+
+ ligand_target_matrix = readRDS(url(
+ "https://zenodo.org/record/7074291/files/ligand_target_matrix_nsga2r_final_mouse.rds"
+ ))
+
+ colnames(ligand_target_matrix) = colnames(ligand_target_matrix) %>%
+ convert_alias_to_symbols(organism = organism) %>% make.names()
+ rownames(ligand_target_matrix) = rownames(ligand_target_matrix) %>%
+ convert_alias_to_symbols(organism = organism) %>% make.names()
+
+ lr_network = lr_network %>% filter(ligand %in% colnames(ligand_target_matrix))
+ ligand_target_matrix = ligand_target_matrix[, lr_network$ligand %>% unique()]
+
+}
+In this vignette, we will load in a subset of the scRNAseq data of the MIS-C . For the sake of demonstration, this subset only contains 3 cell types. These celltypes are some of the cell types that were found to be most interesting related to MIS-C according to Hoste et al.
+If you start from a Seurat object, you can convert it easily to a SingleCellExperiment object via sce = Seurat::as.SingleCellExperiment(seurat_obj, assay = "RNA")
.
Because the NicheNet 2.0. networks are in the most recent version of the official gene symbols, we will make sure that the gene symbols used in the expression data are also updated (= converted from their “aliases” to official gene symbols). Afterwards, we will make them again syntactically valid.
+sce = readRDS(url(
+ "https://zenodo.org/record/8010790/files/sce_subset_misc.rds"
+ ))
+sce = alias_to_symbol_SCE(sce, "human") %>% makenames_SCE()
+In this step, we will formalize our research question into MultiNicheNet input arguments.
+In this case study, we want to study differences in cell-cell communication patterns between MIS-C patients (M), their healthy siblings (S) and adult patients with severe covid (A). The meta data columns that indicate this disease status(=group/condition of interest) is MIS.C.AgeTier
.
Cell type annotations are indicated in the Annotation_v2.0
column, and the sample is indicated by the ShortID
column.
+If your cells are annotated in multiple hierarchical levels, we recommend using a relatively high level in the hierarchy. This for 2 reasons: 1) MultiNicheNet focuses on differential expression and not differential abundance, and 2) there should be sufficient cells per sample-celltype combination (see later).
sample_id = "ShortID"
+group_id = "MIS.C.AgeTier"
+celltype_id = "Annotation_v2.0"
+Important: It is required that each sample-id is uniquely assigned to only one condition/group of interest. See the vignettes about paired and multifactorial analysis to see how to define your analysis input when you have multiple samples (and conditions) per patient.
+If you would have batch effects or covariates you can correct for, you can define this here as well. However, this is not applicable to this dataset. Therefore we will use the following NA settings:
+covariates = NA
+batches = NA
+Important: for categorical covariates and batches, there should be at least one sample for every group-batch combination. If one of your groups/conditions lacks a certain level of your batch, you won’t be able to correct for the batch effect because the model is then not able to distinguish batch from group/condition effects.
+Important: The column names of group, sample, cell type, batches and covariates should be syntactically valid (make.names
)
Important: All group, sample, cell type, batch and covariate names should be syntactically valid as well (make.names
) (eg through SummarizedExperiment::colData(sce)$ShortID = SummarizedExperiment::colData(sce)$ShortID %>% make.names()
)
Here, we want to compare each patient group to the other groups, so the MIS-C (M) group vs healthy control siblings (S) and adult COVID19 patients (A) (= M vs S+A) and so on. We want to know which cell-cell communication patterns are specific for the M vs A+S group, the A vs M+S group and the S vs A+M group.
+To perform this comparison, we need to set the following contrasts:
+contrasts_oi = c("'M-(S+A)/2','S-(M+A)/2','A-(S+M)/2'")
+Very Important Note the format to indicate the contrasts! This formatting should be adhered to very strictly, and white spaces are not allowed! Check ?get_DE_info
for explanation about how to define this well. The most important points are that:
+each contrast is surrounded by single quotation marks
+contrasts are separated by a comma without any white space
+*all contrasts together are surrounded by double quotation marks.
If you compare against two groups, you should divide by 2 (as demonstrated here), if you compare against three groups, you should divide by 3 and so on.
+For downstream visualizations and linking contrasts to their main condition, we also need to run the following: +This is necessary because we will also calculate cell-type+condition specificity of ligands and receptors.
+contrast_tbl = tibble(contrast =
+ c("M-(S+A)/2","S-(M+A)/2", "A-(S+M)/2"),
+ group = c("M","S","A"))
+If you want to compare only two groups (eg M vs S), you can use the following:
+contrasts_oi = c("'M-S','S-M'")
+contrast_tbl = tibble(contrast = c("M-S","S-M"), group = c("M","S"))
Other vignettes will demonstrate how to formalize different types of research questions.
+If you want to focus the analysis on specific cell types (e.g. because you know which cell types reside in the same microenvironments based on spatial data), you can define this here. If you have sufficient computational resources and no specific idea of cell-type colocalzations, we recommend to consider all cell types as potential senders and receivers. Later on during analysis of the output it is still possible to zoom in on the cell types that interest you most, but your analysis is not biased to them.
+Here we will consider all cell types in the data:
+senders_oi = SummarizedExperiment::colData(sce)[,celltype_id] %>% unique()
+receivers_oi = SummarizedExperiment::colData(sce)[,celltype_id] %>% unique()
+sce = sce[, SummarizedExperiment::colData(sce)[,celltype_id] %in%
+ c(senders_oi, receivers_oi)
+ ]
+In case you would have samples in your data that do not belong to one of the groups/conditions of interest, we recommend removing them and only keeping conditions of interst:
+conditions_keep = c("M", "S", "A")
+sce = sce[, SummarizedExperiment::colData(sce)[,group_id] %in%
+ conditions_keep
+ ]
+Now we will run the core of a MultiNicheNet analysis. This analysis consists of the following steps:
In this vignette, we will demonstrate all these steps one by one.
-After the MultiNicheNet analysis is done, we will explore the output -of the analysis with different ways of visualization.
-library(SingleCellExperiment)
-library(dplyr)
-library(ggplot2)
-library(multinichenetr)
The Nichenet v2 networks and matrices for both mouse and human can be -downloaded from Zenodo .
-We will read these object in for human because our expression data is
-of human patients. Gene names are here made syntactically valid via
-make.names()
to avoid the loss of genes (eg H2-M3) in
-downstream visualizations.
= "human"
- organism if(organism == "human"){
-= readRDS(url("https://zenodo.org/record/7074291/files/lr_network_human_21122021.rds"))
- lr_network = lr_network %>% dplyr::rename(ligand = from, receptor = to) %>% distinct(ligand, receptor) %>% mutate(ligand = make.names(ligand), receptor = make.names(receptor))
- lr_network = readRDS(url("https://zenodo.org/record/7074291/files/ligand_target_matrix_nsga2r_final.rds"))
- ligand_target_matrix colnames(ligand_target_matrix) = colnames(ligand_target_matrix) %>% make.names()
- rownames(ligand_target_matrix) = rownames(ligand_target_matrix) %>% make.names()
- else if(organism == "mouse"){
- } = readRDS(url("https://zenodo.org/record/7074291/files/lr_network_mouse_21122021.rds"))
- lr_network = lr_network %>% dplyr::rename(ligand = from, receptor = to) %>% distinct(ligand, receptor) %>% mutate(ligand = make.names(ligand), receptor = make.names(receptor))
- lr_network = readRDS(url("https://zenodo.org/record/7074291/files/ligand_target_matrix_nsga2r_final_mouse.rds"))
- ligand_target_matrix colnames(ligand_target_matrix) = colnames(ligand_target_matrix) %>% make.names()
- rownames(ligand_target_matrix) = rownames(ligand_target_matrix) %>% make.names()
- }
In this vignette, sender and receiver cell types are in the same -SingleCellExperiment object, which we will load here. In this vignette, -we will load in a subset of the scRNAseq data of the MIS-C . For the sake of demonstration, this subset only -contains 3 cell types. These celltypes are some of the cell types that -were found to be most interesting related to MIS-C according to Hoste et -al.
-If you start from a Seurat object, you can convert it easily to a
-SingleCellExperiment via
-sce = Seurat::as.SingleCellExperiment(seurat_obj, assay = "RNA")
.
Because the NicheNet 2.0. networks are in the most recent version of -the official gene symbols, we will make sure that the gene symbols used -in the expression data are also updated (= converted from their -“aliases” to official gene symbols). Afterwards, we will make them again -syntactically valid.
-= readRDS(url("https://zenodo.org/record/8010790/files/sce_subset_misc.rds"))
- sce = alias_to_symbol_SCE(sce, "human") %>% makenames_SCE()
- sce ## [1] "there are provided symbols that are not in the alias annotation table: "
-## [1] "AL627309.1" "AL669831.5" "AL645608.8" "AL645608.2" "AL162741.1" "AL391244.3" "AL645728.1" "AL691432.2" "FO704657.1" "AL109917.1" "AL590822.2" "AL139246.5" "AL139246.3" "AL513320.1" "AL805961.2"
-## [16] "AL365255.1" "AL031848.2" "Z98884.1" "AL034417.4" "AL034417.3" "AL096855.1" "AL139424.2" "AL354956.1" "AL139423.1" "AL109811.3" "AL109811.1" "AL109811.2" "AC243836.1" "AC254633.1" "AL031283.1"
-## [31] "AL121992.3" "AL121992.1" "AL450998.2" "AL137802.2" "AL021920.3" "BX284668.2" "BX284668.5" "BX284668.6" "AL035413.1" "AL031727.2" "AL020998.1" "AL031005.1" "AL031728.1" "AL031281.2" "AL445686.2"
-## [46] "AL445471.1" "AL445471.2" "AL031432.5" "AL031432.3" "AL606491.1" "AL031280.1" "AL020996.1" "AL033528.2" "AL391650.1" "AL512408.1" "BX293535.1" "AL512288.1" "AL353622.1" "AL353622.2" "AL360012.1"
-## [61] "AC114488.2" "AC114488.3" "AL136115.2" "AL445248.1" "AL049795.1" "AL662907.3" "AC114490.3" "AC114490.2" "AC004865.2" "AL591845.1" "AL139260.1" "AL033527.3" "AL033527.5" "AL050341.2" "AL603839.3"
-## [76] "AC098484.3" "AL512353.1" "AL139289.2" "AL139289.1" "AL451062.3" "AL357079.1" "AL139220.2" "AL592166.1" "AC104170.1" "AL050343.1" "AL445685.1" "AL606760.2" "AL606760.3" "AL606760.1" "AC119428.2"
-## [91] "AL161644.1" "AC105277.1" "AC099791.2" "AL357078.1" "AL583808.1" "AC118549.1" "AC103591.3" "AL359504.2" "AL078459.1" "AL049597.2" "AC099063.4" "AC099063.1" "AC099568.2" "AC104836.1" "AC105942.1"
-## [106] "AC092802.1" "AC118553.1" "AC118553.2" "AC104506.1" "AC093157.2" "AC093157.1" "AL109741.1" "AL390036.1" "AL359258.2" "AL356488.3" "AL356488.2" "SARS" "AC000032.1" "AL450468.2" "AL160006.1"
-## [121] "AL355488.1" "AL365361.1" "AL360270.1" "AL360270.3" "AL355816.2" "AL354760.1" "AL603832.1" "AL357055.3" "AL137856.1" "AL390066.1" "AL445231.1" "AL157902.1" "AL359915.2" "AC244453.2" "AC244021.1"
-## [136] "AL592494.3" "AC239800.2" "AC239800.3" "AC245595.1" "AC246785.3" "FP700111.1" "AC245014.3" "AC245014.1" "AC243547.3" "AC243547.2" "NOTCH2NL" "AC239799.2" "AC242426.2" "AC239809.3" "AC245297.3"
-## [151] "AC245297.2" "AC239868.3" "AC239868.2" "AL590133.2" "AL391069.2" "AL391069.3" "AL391335.1" "AL589765.7" "AL450992.2" "AL162258.2" "AL358472.5" "AL358472.4" "AL358472.3" "AL358472.2" "AL451085.1"
-## [166] "AL451085.2" "AC234582.1" "AC234582.2" "AL353807.2" "AL355388.2" "AL139412.1" "AL590560.1" "AL590560.2" "AL121987.2" "AL139011.1" "AL138930.1" "AL121985.1" "AL591806.3" "AL590714.1" "AL592295.4"
-## [181] "AL359541.1" "AL596325.1" "AL356441.1" "AL451074.2" "AL451050.2" "AL359962.2" "AL031733.2" "AL021068.1" "AL031599.1" "AL645568.1" "AL121983.1" "AL121983.2" "Z99127.1" "AL513329.1" "AL359265.3"
-## [196] "AL137796.1" "AL449106.1" "AL353708.3" "AL353708.1" "AL162431.2" "AL445228.2" "AL078644.2" "AL133383.1" "AL390957.1" "AL136987.1" "AL136454.1" "AL157402.2" "AC096677.1" "AC099676.1" "AC098934.4"
-## [211] "AL512306.3" "AL512306.2" "AC244034.2" "AL137789.1" "AL031316.1" "AL606468.1" "AL451060.1" "AL360091.3" "AC092803.2" "AL590648.3" "AC096642.1" "AL513314.2" "AL592148.3" "AL392172.1" "AL359979.2"
-## [226] "AC138393.3" "AC092809.4" "AC092811.2" "AL591895.1" "AL512343.2" "AL353593.1" "AL353593.2" "AL670729.1" "AL117350.1" "AL121990.1" "AL445524.1" "AL355472.3" "AL355472.4" "AL355472.1" "AL160408.2"
-## [241] "AL391832.2" "AL391832.3" "AL359921.2" "AL359921.1" "BX323046.1" "AL356512.1" "AL591848.3" "AL390728.6" "AL390728.5" "AC138089.1" "AC098483.1" "AC092159.2" "AC093390.2" "AC231981.1" "AC108488.1"
-## [256] "AC017076.1" "AC010904.2" "AC092580.2" "AC073195.1" "AC010969.2" "AC104794.2" "AC007240.1" "AC007249.2" "AC062028.1" "AC079148.1" "AC013400.1" "AC079145.1" "AC098828.2" "AC012065.3" "AC012065.4"
-## [271] "AC018742.1" "AC012073.1" "AC012074.1" "AC104699.1" "AC013472.3" "AC013403.2" "AC074117.1" "AC093690.1" "AC104695.2" "AC104695.3" "AC092164.1" "AC016907.2" "AL121652.1" "AL121658.1" "AL133245.1"
-## [286] "AC007378.1" "AC006369.1" "AC074366.1" "AC019171.1" "AC007388.1" "AC083949.1" "AC010883.1" "AC019129.2" "AC018682.1" "AC016722.3" "AC073283.1" "AC079807.1" "AC092650.1" "AC093635.1" "AC008280.3"
-## [301] "AC093110.1" "AC012358.3" "AC015982.1" "AC007743.1" "AC016727.1" "AC107081.2" "AC009501.1" "AC012368.1" "AC008074.3" "AC007365.1" "AC017083.3" "AC017083.2" "AC017083.1" "AC022201.2" "AC007881.3"
-## [316] "AC007878.1" "AC073046.1" "AC073263.1" "AC073263.2" "AC006030.1" "AC005041.1" "AC019069.1" "AC104135.1" "AC005034.3" "AC009309.1" "AC015971.1" "AC133644.2" "AC104134.1" "AC062029.1" "AC103563.7"
-## [331] "AC092835.1" "AC009237.14" "AC021188.1" "AC092683.1" "AC109826.1" "AC079447.1" "AC092587.1" "AC018690.1" "AC092667.1" "AC016738.1" "AC104655.1" "AC012360.1" "AC012360.3" "AC010978.1" "AC013271.1"
-## [346] "AC108463.3" "AC068491.3" "AC017002.3" "AC017002.1" "AC079922.2" "AC016683.1" "AC016745.2" "AC104653.1" "AC110769.2" "AC009312.1" "AC009404.1" "AC093901.1" "AC012447.1" "AC068282.1" "AC012306.2"
-## [361] "AC073869.3" "AC011893.1" "AC114763.1" "AC092620.1" "AC007364.1" "AC091488.1" "AC009506.1" "AC009961.1" "AC009299.3" "AC008063.1" "AC009495.3" "AC007405.3" "AC078883.2" "AC078883.3" "AC078883.1"
-## [376] "AC010894.2" "AC010894.4" "AC093459.1" "AC079305.1" "AC073834.1" "AC009948.1" "AC009948.4" "AC009948.3" "AC010680.3" "AC010680.2" "AC010680.1" "AC068196.1" "AC009962.1" "AC096667.1" "AC017071.1"
-## [391] "AC013468.1" "AC108047.1" "AC005540.1" "AC064834.1" "AC114760.2" "AC020571.1" "AC013264.1" "AC010746.1" "AC011997.1" "AC007163.1" "AC005037.1" "AC064836.3" "AC007383.2" "AC007383.3" "AC008269.1"
-## [406] "AC007879.2" "AC009226.1" "AC007038.2" "AC007038.1" "AC079610.2" "AC079610.1" "AC068051.1" "AC016708.1" "AC098820.2" "AC021016.3" "AC012510.1" "AC009974.1" "AC097468.3" "AC068946.1" "AC053503.2"
-## [421] "AC079834.2" "AC097461.1" "AC097662.1" "AC009950.1" "AC012507.1" "AC012507.4" "AC073254.1" "AC105760.1" "AC105760.2" "AC112721.2" "AC104667.1" "AC012485.3" "AC016999.1" "AC062017.1" "SEPT2"
-## [436] "AC114730.1" "AC131097.4" "AC093642.1" "AC024060.1" "AC018816.1" "AC026202.2" "AC026202.3" "AC026191.1" "AC018809.2" "AC034198.2" "AC093495.1" "AC090948.1" "AC090948.2" "AC090948.3" "AC116096.1"
-## [451] "AC112220.4" "AC123023.1" "AC097359.3" "AC097359.2" "AC092053.3" "AC092053.2" "AC099541.1" "AC006059.1" "AC099329.3" "AC104184.1" "AC104187.1" "AC099669.1" "AC124045.1" "AC098613.1" "AC099778.1"
-## [466] "AC134772.1" "AC137630.2" "AC137630.1" "AC137630.3" "QARS" "AC105935.1" "U73166.1" "AC096920.1" "AC006252.1" "AC006254.1" "AC096887.1" "AC012467.2" "AC135507.1" "AC012557.2" "AC109587.1"
-## [481] "AC097634.3" "AC097634.1" "AC104435.2" "AC107204.1" "AC069222.1" "AC128688.2" "AC074044.1" "AC078785.1" "AC093010.2" "AC073352.1" "AC073352.2" "AC092910.3" "AC083798.2" "AC092903.2" "AC079848.1"
-## [496] "AC112484.3" "AC112484.1" "AC108673.2" "AC108673.3" "AC107027.1" "AC117382.2" "AC096992.2" "AC097103.2" "AC026304.1" "AC020636.1" "AC108718.1" "AC084036.1" "AC106707.1" "AC080013.6" "AC080013.1"
-## [511] "AC080013.5" "AC069224.1" "AC112491.1" "AC078802.1" "AC078795.1" "AC078795.2" "AC008040.1" "AC008040.5" "AC007823.1" "AC007620.2" "AC125618.1" "AC069431.1" "AC092953.2" "AC131235.3" "AC068631.1"
-## [526] "AC112907.3" "AC069213.1" "AC233280.1" "AC055764.2" "AC055764.1" "AC107464.1" "AC107464.3" "AC139887.4" "AC139887.2" "AC139887.1" "AC092535.4" "AC147067.1" "AC016773.1" "AL158068.2" "AC105415.1"
-## [541] "AC093323.1" "AC097382.2" "AC104825.1" "AC105345.1" "AC099550.1" "AC116651.1" "AC006160.1" "AC133961.1" "AC106047.1" "AC104078.1" "AC108471.2" "AC095057.2" "AC111006.1" "AC096734.2" "AC096586.1"
-## [556] "AC107068.1" "AC027271.1" "AC110792.3" "AC068620.1" "AC068620.2" "AC104806.2" "AC053527.1" "AC093677.2" "AC098818.2" "AC124016.2" "AC124016.1" "AC093827.5" "AC093827.4" "AC083829.2" "AC019077.1"
-## [571] "AC114811.2" "AC019131.2" "AP002026.1" "AC097460.1" "AP001816.1" "AC098487.1" "AF213884.3" "AC018797.2" "AC004069.1" "AC109361.1" "AC096564.1" "AC096564.2" "AC004067.1" "AC126283.1" "AC109347.1"
-## [586] "AC106864.1" "AC110079.2" "AC108866.1" "AC108062.1" "AC096711.3" "AC097376.2" "AC112236.2" "AC096733.2" "AC097504.2" "AC139720.1" "AC104596.1" "AC093835.1" "AC095055.1" "AC097375.1" "AC106882.1"
-## [601] "AC105285.1" "AC097534.1" "AC097534.2" "AC019163.1" "AC078881.1" "AC107214.1" "AC084871.2" "AC106897.1" "AC097652.1" "AC021087.1" "AC106772.1" "AC026740.1" "AC116351.1" "AC026412.3" "AC091965.1"
-## [616] "AC091965.4" "AC034229.4" "AC012640.2" "AC010491.1" "AC022113.1" "AC022113.2" "AC026785.2" "AC025181.2" "AC034231.1" "AC026801.2" "AC137810.1" "AC008957.1" "AC026741.1" "AC008875.1" "AC025171.3"
-## [631] "AC025171.5" "AC025171.2" "AC025171.4" "AC114947.2" "AC114956.1" "AC093297.2" "AC008966.1" "AC008914.1" "AC008892.1" "AC016596.1" "AC008937.1" "AC008937.3" "AC092343.1" "AC104113.1" "AC092354.2"
-## [646] "AC025442.1" "AC024581.1" "AC010280.2" "AC146944.4" "AC035140.1" "AC008972.2" "AC099522.2" "AC010501.1" "AC116337.3" "AC008897.2" "AC010245.2" "AC026725.1" "AC012636.1" "AC010260.1" "AC008771.1"
-## [661] "AC008434.1" "AC104118.1" "AC018754.1" "AC123595.1" "AC008840.1" "AC020900.1" "AC008906.1" "AC008906.2" "AC009126.1" "AC008522.1" "AC021086.1" "AC008467.1" "AC008494.3" "AC010226.1" "AC008549.2"
-## [676] "AC034236.2" "AC008629.1" "AC093535.1" "AC011416.3" "AC116366.3" "AC116366.2" "AC116366.1" "AC004775.1" "AC010240.3" "AC008608.2" "AC104109.3" "AC104109.2" "AC109454.2" "AC026691.1" "AC106791.1"
-## [691] "AC113382.1" "AC011405.1" "AC135457.1" "AC008667.1" "AC008438.1" "AC005618.1" "AC008781.1" "AC091959.3" "AC091948.1" "AC131025.1" "AC008641.1" "AC011337.1" "AC011374.2" "AC091982.3" "AC009185.1"
-## [706] "AC008676.1" "AC134043.1" "AC008691.1" "AC113414.1" "AC034199.1" "AC022217.3" "AC008429.1" "AC008429.3" "AC008378.1" "AC139491.1" "AC139491.7" "AC139795.3" "AC139795.2" "AC106795.2" "AC136604.2"
-## [721] "AC136604.3" "AC008393.1" "AC008610.1" "AC008443.5" "AC008443.6" "AC138035.1" "AL357054.4" "AL357054.2" "AL133351.1" "AL031963.3" "AL138831.1" "AL162718.1" "AL359643.3" "AL136307.1" "AL024498.1"
-## [736] "AL157373.2" "AL008729.1" "AL008729.2" "AL441883.1" "AL109914.1" "AL138720.1" "AL136162.1" "AL137003.1" "AL137003.2" "AL138724.1" "AL031775.1" "U91328.2" "U91328.1" "HIST1H2BC" "AL353759.1"
-## [751] "HIST1H2BE" "AL031777.3" "HIST1H2BF" "HIST1H2BG" "HIST1H2BI" "AL121936.1" "AL513548.3" "AL513548.1" "AL009179.1" "AL121944.1" "AL358933.1" "AL049543.1" "AL645929.2" "AL662797.1" "AL662844.4"
-## [766] "AL645933.2" "AL662899.2" "AL662884.4" "AL662796.1" "AL669918.1" "AL645941.3" "AL645940.1" "AL451165.2" "Z84485.1" "AL353597.1" "AL121574.1" "AL365205.1" "AL513008.1" "AL035587.1" "AL133375.1"
-## [781] "AL355802.2" "AL096865.1" "AL035701.1" "AL355353.1" "AL021368.1" "AL021368.2" "AL135905.1" "AL354719.2" "AL391807.1" "AL365226.2" "AL365226.1" "AC019205.1" "AL121972.1" "AL590428.1" "AL080250.1"
-## [796] "AL359715.3" "AL359715.1" "AL359715.2" "AL139274.2" "AL049697.1" "AL353135.1" "AL096678.1" "AL121787.1" "AL132996.1" "AL589740.1" "AL513550.1" "AL137784.2" "AL133338.1" "AL133406.3" "AL133406.2"
-## [811] "AL024507.2" "AL390208.1" "AL359711.2" "AL512303.1" "AC002464.1" "AL080317.1" "AL080317.2" "AL080317.3" "Z97989.1" "AL050331.2" "AL365275.1" "AL096711.2" "AL590006.1" "AL355581.1" "AL353596.1"
-## [826] "AL024508.2" "AL356234.2" "AL357060.1" "AL590617.2" "AL031772.1" "AL158850.1" "AL035446.1" "AL023806.1" "AL356599.1" "AL031056.1" "AL358852.1" "AL355312.4" "AL080276.2" "AL590867.1" "AL596202.1"
-## [841] "AL355297.3" "AL355297.4" "AL391863.1" "AL035634.1" "AL627422.2" "AL035530.2" "AL356417.3" "AL356417.1" "AL139393.2" "AL022069.1" "AL159163.1" "Z94721.1" "AL121935.1" "AL121935.2" "BX322234.1"
-## [856] "AL109910.2" "AC093627.5" "AC093627.4" "AC147651.1" "AC147651.4" "AC073957.3" "AC073957.2" "AC091729.3" "AC102953.2" "AC104129.1" "AC004906.1" "AC024028.1" "AC092171.4" "AC092171.3" "AC073343.2"
-## [871] "AC004982.2" "AC004948.1" "AC006042.4" "AC006042.2" "AC007029.1" "AC073333.1" "AC073332.1" "AC004130.1" "AC073072.1" "AC005082.1" "MPP6" "AC005165.1" "AC004520.1" "AC004540.1" "AC004593.3"
-## [886] "AC007285.1" "AC007036.1" "AC005154.6" "GARS" "AC018647.2" "AC007349.3" "AC006033.2" "AC011290.1" "AC072061.1" "AC004951.1" "AC004854.2" "AC073115.2" "AC020743.1" "AC074351.1" "AC118758.3"
-## [901] "AC092634.3" "AC073349.1" "AC068533.4" "AC008267.5" "AC027644.3" "AC073335.2" "AC006480.2" "AC211476.2" "AC004921.1" "AC005076.1" "AC003991.2" "AC002383.1" "AC007566.1" "AC002454.1" "AC002451.1"
-## [916] "AC004834.1" "AC004922.1" "AC073842.2" "AC073842.1" "AC092849.1" "AC069281.1" "AC105446.1" "AC006329.1" "AC093668.3" "AC005064.1" "AC007384.1" "AC005070.3" "AC073073.2" "AC007032.1" "AC004492.1"
-## [931] "AC004839.1" "AC002467.1" "AC005046.1" "AC006333.2" "AC073934.1" "AC090114.2" "AC018638.7" "AC078846.1" "AC073320.1" "AC087071.1" "AC007938.2" "AC007938.3" "AC016831.7" "AC016831.1" "AC016831.5"
-## [946] "AC058791.1" "AC008264.2" "AC009275.1" "AC083862.2" "AC009542.1" "AC024084.1" "AC078845.1" "AC083880.1" "AC004918.1" "AC093673.1" "AC004889.1" "AC074386.1" "AC005229.4" "AC073314.1" "AC004877.1"
-## [961] "AC092681.3" "AC005586.1" "AC073111.5" "AC021097.1" "AC006017.1" "AC093726.2" "AC144652.1" "AC009403.1" "AC005534.1" "AC005481.1" "AC011899.2" "AC011899.3" "AL732314.6" "AL732314.4" "AL672277.1"
-## [976] "AL683807.1" "BX890604.1" "AC073529.1" "AC131011.1" "CXorf21" "AC108879.1" "AC234772.3" "AC234772.2" "AF196972.1" "AC115618.1" "AC231533.1" "AC232271.1" "AC233976.1" "AL445472.1" "AL354793.1"
-## [991] "AL022157.1" "AL034397.2" "AL034397.3" "AL590764.1" "AL353804.1" "Z68871.1" "AL121601.1" "AL683813.1" "AL078639.1" "AC244197.3"
-## [ reached getOption("max.print") -- omitted 2569 entries ]
-## [1] "they are added to the alias annotation table, so they don't get lost"
-## [1] "following are the official gene symbols of input aliases: "
-## symbol alias
-## 1 AARS1 AARS
-## 2 ABITRAM FAM206A
-## 3 ACP3 ACPP
-## 4 ACTN1-DT ACTN1-AS1
-## 5 ADPRS ADPRHL2
-## 6 ADSS1 ADSSL1
-## 7 ADSS2 ADSS
-## 8 ANKRD20A2P ANKRD20A2
-## 9 ANKRD20A3P ANKRD20A3
-## 10 ANKRD20A4P ANKRD20A4
-## 11 ANKRD40CL LINC00483
-## 12 ANTKMT FAM173A
-## 13 AOPEP C9orf3
-## 14 ARLNC1 LINC02170
-## 15 ATP5MJ ATP5MPL
-## 16 ATP5MK ATP5MD
-## 17 ATPSCKMT FAM173B
-## 18 B3GALT9 B3GNT10
-## 19 BBLN C9orf16
-## 20 BMERB1 C16orf45
-## 21 BPNT2 IMPAD1
-## 22 BRME1 C19orf57
-## 23 CARNMT1-AS1 C9orf41-AS1
-## 24 CARS1 CARS
-## 25 CARS1-AS1 CARS-AS1
-## 26 CBY2 SPERT
-## 27 CCDC200 LINC00854
-## 28 CCDC28A-AS1 GVQW2
-## 29 CCN2 CTGF
-## 30 CCN3 NOV
-## 31 CCNP CNTD2
-## 32 CDIN1 C15orf41
-## 33 CENATAC CCDC84
-## 34 CEP20 FOPNL
-## 35 CEP43 FGFR1OP
-## 36 CERT1 COL4A3BP
-## 37 CFAP20DC C3orf67
-## 38 CFAP251 WDR66
-## 39 CFAP410 C21orf2
-## 40 CFAP91 MAATS1
-## 41 CFAP92 KIAA1257
-## 42 CIAO2A FAM96A
-## 43 CIAO2B FAM96B
-## 44 CIAO3 NARFL
-## 45 CIBAR1 FAM92A
-## 46 CIBAR2 FAM92B
-## 47 CILK1 ICK
-## 48 CLLU1-AS1 CLLU1OS
-## 49 COA8 APOPT1
-## 50 CRACD KIAA1211
-## 51 CRACDL KIAA1211L
-## 52 CRPPA ISPD
-## 53 CYRIA FAM49A
-## 54 CYRIB FAM49B
-## 55 CZIB C1orf123
-## 56 DARS1 DARS
-## 57 DARS1-AS1 DARS-AS1
-## 58 DENND10 FAM45A
-## 59 DENND11 KIAA1147
-## 60 DENND2B ST5
-## 61 DIPK1A FAM69A
-## 62 DIPK1B FAM69B
-## 63 DIPK2A C3orf58
-## 64 DMAC2L ATP5S
-## 65 DNAAF10 WDR92
-## 66 DNAAF11 LRRC6
-## 67 DNAAF8 C16orf71
-## 68 DNAAF9 C20orf194
-## 69 DNAI3 WDR63
-## 70 DNAI4 WDR78
-## 71 DNAI7 CASC1
-## 72 DOCK8-AS1 C9orf66
-## 73 DOP1A DOPEY1
-## 74 DOP1B DOPEY2
-## 75 DUSP29 DUPD1
-## 76 DYNC2I1 WDR60
-## 77 DYNC2I2 WDR34
-## 78 DYNLT2 TCTE3
-## 79 DYNLT2B TCTEX1D2
-## 80 DYNLT4 TCTEX1D4
-## 81 DYNLT5 TCTEX1D1
-## 82 ECRG4 C2orf40
-## 83 ELAPOR1 KIAA1324
-## 84 ELAPOR2 KIAA1324L
-## 85 EOLA1 CXorf40A
-## 86 EOLA2 CXorf40B
-## 87 EPRS1 EPRS
-## 88 EZHIP CXorf67
-## 89 FAM153CP FAM153C
-## 90 FAM174C C19orf24
-## 91 FAM86C1P FAM86C1
-## 92 FCSK FUK
-## 93 FHIP1A FAM160A1
-## 94 FHIP1A-DT FAM160A1-DT
-## 95 FHIP1B FAM160A2
-## 96 FHIP2A FAM160B1
-## 97 FHIP2B FAM160B2
-## 98 FLACC1 ALS2CR12
-## 99 FMC1-LUC7L2 C7orf55-LUC7L2
-## 100 GARRE1 KIAA0355
-## 101 GASK1A FAM198A
-## 102 GASK1B FAM198B
-## 103 GASK1B-AS1 FAM198B-AS1
-## 104 GCNT4 LINC01336
-## 105 GDF5-AS1 GDF5OS
-## 106 GET1 WRB
-## 107 GET3 ASNA1
-## 108 GFUS TSTA3
-## 109 GOLM2 CASC4
-## 110 H1-0 H1F0
-## 111 H1-1 HIST1H1A
-## 112 H1-10 H1FX
-## 113 H1-10-AS1 H1FX-AS1
-## 114 H1-2 HIST1H1C
-## 115 H1-3 HIST1H1D
-## 116 H1-4 HIST1H1E
-## 117 H1-5 HIST1H1B
-## 118 H1-6 HIST1H1T
-## 119 H2AB1 H2AFB1
-## 120 H2AC11 HIST1H2AG
-## 121 H2AC12 HIST1H2AH
-## 122 H2AC13 HIST1H2AI
-## 123 H2AC14 HIST1H2AJ
-## 124 H2AC15 HIST1H2AK
-## 125 H2AC16 HIST1H2AL
-## 126 H2AC17 HIST1H2AM
-## 127 H2AC18 HIST2H2AA3
-## 128 H2AC19 HIST2H2AA4
-## 129 H2AC20 HIST2H2AC
-## 130 H2AC21 HIST2H2AB
-## 131 H2AC4 HIST1H2AB
-## 132 H2AC6 HIST1H2AC
-## 133 H2AC8 HIST1H2AE
-## 134 H2AJ H2AFJ
-## 135 H2AW HIST3H2A
-## 136 H2AX H2AFX
-## 137 H2AZ1 H2AFZ
-## 138 H2AZ2 H2AFV
-## 139 H2BC11 HIST1H2BJ
-## 140 H2BC12 HIST1H2BK
-## 141 H2BC13 HIST1H2BL
-## 142 H2BC14 HIST1H2BM
-## 143 H2BC15 HIST1H2BN
-## 144 H2BC17 HIST1H2BO
-## 145 H2BC18 HIST2H2BF
-## 146 H2BC21 HIST2H2BE
-## 147 H2BC3 HIST1H2BB
-## 148 H2BC5 HIST1H2BD
-## 149 H2BC9 HIST1H2BH
-## 150 H2BU1 HIST3H2BB
-## 151 H3-2 HIST2H3PS2
-## 152 H3-3A H3F3A
-## 153 H3-3B H3F3B
-## 154 H3-5 H3F3C
-## 155 H3C1 HIST1H3A
-## 156 H3C10 HIST1H3H
-## 157 H3C11 HIST1H3I
-## 158 H3C12 HIST1H3J
-## 159 H3C13 HIST2H3D
-## 160 H3C2 HIST1H3B
-## 161 H3C3 HIST1H3C
-## 162 H3C6 HIST1H3E
-## 163 H3C7 HIST1H3F
-## 164 H3C8 HIST1H3G
-## 165 H4-16 HIST4H4
-## 166 H4C1 HIST1H4A
-## 167 H4C11 HIST1H4J
-## 168 H4C13 HIST1H4L
-## 169 H4C14 HIST2H4A
-## 170 H4C15 HIST2H4B
-## 171 H4C2 HIST1H4B
-## 172 H4C3 HIST1H4C
-## 173 H4C4 HIST1H4D
-## 174 H4C5 HIST1H4E
-## 175 H4C6 HIST1H4F
-## 176 H4C8 HIST1H4H
-## 177 H4C9 HIST1H4I
-## 178 HARS1 HARS
-## 179 HEXD HEXDC
-## 180 HROB C17orf53
-## 181 HSDL2-AS1 C9orf147
-## 182 IARS1 IARS
-## 183 IFTAP C11orf74
-## 184 ILRUN C6orf106
-## 185 IRAG1 MRVI1
-## 186 IRAG1-AS1 MRVI1-AS1
-## 187 IRAG2 LRMP
-## 188 ITPRID2 SSFA2
-## 189 KARS1 KARS
-## 190 KASH5 CCDC155
-## 191 KATNIP KIAA0556
-## 192 KICS2 C12orf66
-## 193 KIFBP KIF1BP
-## 194 KRT10-AS1 TMEM99
-## 195 LARS1 LARS
-## 196 LDAF1 TMEM159
-## 197 LINC02481 LINC002481
-## 198 LINC02693 C17orf51
-## 199 LINC02694 C15orf53
-## 200 LINC02870 C10orf91
-## 201 LINC02875 C17orf82
-## 202 LINC02899 C5orf17
-## 203 LINC02901 C6orf99
-## 204 LINC02908 C9orf139
-## 205 LINC02909 C12orf77
-## 206 LINC02910 C20orf197
-## 207 LINC02913 C9orf106
-## 208 LNCTAM34A LINC01759
-## 209 LRATD2 FAM84B
-## 210 LTO1 ORAOV1
-## 211 MACIR C5orf30
-## 212 MACROH2A1 H2AFY
-## 213 MACROH2A2 H2AFY2
-## 214 MAILR AZIN1-AS1
-## 215 MARCHF1 MARCH1
-## 216 MARCHF10 MARCH10
-## 217 MARCHF2 MARCH2
-## 218 MARCHF3 MARCH3
-## 219 MARCHF4 MARCH4
-## 220 MARCHF5 MARCH5
-## 221 MARCHF6 MARCH6
-## 222 MARCHF7 MARCH7
-## 223 MARCHF8 MARCH8
-## 224 MARCHF9 MARCH9
-## 225 MEAK7 TLDC1
-## 226 METTL25B RRNAD1
-## 227 MICOS10 MINOS1
-## 228 MICOS13 C19orf70
-## 229 MIDEAS ELMSAN1
-## 230 MINAR1 KIAA1024
-## 231 MIR3667HG C22orf34
-## 232 MIR9-1HG C1orf61
-## 233 MIX23 CCDC58
-## 234 MMUT MUT
-## 235 MROCKI LINC01268
-## 236 MRTFA MKL1
-## 237 MRTFB MKL2
-## 238 MTARC1 MARC1
-## 239 MTARC2 MARC2
-## 240 MTLN SMIM37
-## 241 MTRES1 C6orf203
-## 242 MTRFR C12orf65
-## 243 MTSS2 MTSS1L
-## 244 MYG1 C12orf10
-## 245 NARS1 NARS
-## 246 NCBP2AS2 NCBP2-AS2
-## 247 NDUFV1-DT C11orf72
-## 248 NIBAN1 FAM129A
-## 249 NIBAN2 FAM129B
-## 250 NIBAN3 FAM129C
-## 251 NOPCHAP1 C12orf45
-## 252 NTAQ1 WDYHV1
-## 253 NUP42 NUPL2
-## 254 OBI1 RNF219
-## 255 OBI1-AS1 RNF219-AS1
-## 256 ODAD1 CCDC114
-## 257 ODAD2 ARMC4
-## 258 ODAD3 CCDC151
-## 259 ODAD4 TTC25
-## 260 PABIR1 FAM122A
-## 261 PABIR2 FAM122B
-## 262 PABIR3 FAM122C
-## 263 PACC1 TMEM206
-## 264 PALM2AKAP2 AKAP2
-## 265 PALM2AKAP2 PALM2-AKAP2
-## 266 PALS1 MPP5
-## 267 PAXIP1-DT PAXIP1-AS1
-## 268 PEDS1 TMEM189
-## 269 PEDS1-UBE2V1 TMEM189-UBE2V1
-## 270 PELATON SMIM25
-## 271 PGAP4 TMEM246
-## 272 PGAP6 TMEM8A
-## 273 PHAF1 C16orf70
-## 274 PIK3IP1-DT PIK3IP1-AS1
-## 275 PITX1-AS1 C5orf66
-## 276 PLA2G2C UBXN10-AS1
-## 277 PLAAT1 HRASLS
-## 278 PLAAT2 HRASLS2
-## 279 PLAAT3 PLA2G16
-## 280 PLAAT4 RARRES3
-## 281 PLAAT5 HRASLS5
-## 282 PLEKHG7 C12orf74
-## 283 POGLUT2 KDELC1
-## 284 POGLUT3 KDELC2
-## 285 POLR1F TWISTNB
-## 286 POLR1G CD3EAP
-## 287 POLR1H ZNRD1
-## 288 PPP1R13B-DT LINC00637
-## 289 PRANCR LINC01481
-## 290 PRECSIT LINC00346
-## 291 PRORP KIAA0391
-## 292 PRSS42P PRSS42
-## 293 PRXL2A FAM213A
-## 294 PRXL2B FAM213B
-## 295 PRXL2C AAED1
-## 296 PSME3IP1 FAM192A
-## 297 RAB30-DT RAB30-AS1
-## 298 RADX CXorf57
-## 299 RAMAC RAMMET
-## 300 RARS1 RARS
-## 301 RBIS C8orf59
-## 302 RELCH KIAA1468
-## 303 RESF1 KIAA1551
-## 304 RO60 TROVE2
-## 305 RPL34-DT RPL34-AS1
-## 306 RRM2 C2orf48
-## 307 RSKR SGK494
-## 308 RUSF1 C16orf58
-## 309 SANBR KIAA1841
-## 310 SCAT1 LINC02081
-## 311 SEPTIN1 SEPT1
-## 312 SEPTIN10 SEPT10
-## 313 SEPTIN11 SEPT11
-## 314 SEPTIN3 SEPT3
-## 315 SEPTIN4 SEPT4
-## 316 SEPTIN4-AS1 SEPT4-AS1
-## 317 SEPTIN5 SEPT5
-## 318 SEPTIN6 SEPT6
-## 319 SEPTIN7 SEPT7
-## 320 SEPTIN7-DT SEPT7-AS1
-## 321 SEPTIN8 SEPT8
-## 322 SEPTIN9 SEPT9
-## 323 SHFL C19orf66
-## 324 SHOC1 C9orf84
-## 325 SLC49A4 DIRC2
-## 326 SLC66A1 PQLC2
-## 327 SLC66A2 PQLC1
-## 328 SLC66A3 PQLC3
-## 329 SMC5-DT SMC5-AS1
-## 330 SMIM43 TMEM155
-## 331 SNHG30 LINC02001
-## 332 SNHG32 C6orf48
-## 333 SPRING1 C12orf49
-## 334 SSPOP SSPO
-## 335 STAM-DT STAM-AS1
-## 336 STEEP1 CXorf56
-## 337 STIMATE-MUSTN1 TMEM110-MUSTN1
-## 338 STING1 TMEM173
-## 339 STX17-DT STX17-AS1
-## 340 TAFA1 FAM19A1
-## 341 TAFA2 FAM19A2
-## 342 TAMALIN GRASP
-## 343 TARS1 TARS
-## 344 TARS3 TARSL2
-## 345 TASOR FAM208A
-## 346 TASOR2 FAM208B
-## 347 TBC1D29P TBC1D29
-## 348 TIMM23B-AGAP6 LINC00843
-## 349 TLCD3A FAM57A
-## 350 TLCD3B FAM57B
-## 351 TLCD4 TMEM56
-## 352 TLCD5 TMEM136
-## 353 TLE5 AES
-## 354 TM4SF19-DYNLT2B TM4SF19-TCTEX1D2
-## 355 TMCC1-DT TMCC1-AS1
-## 356 TOLLIP-DT TOLLIP-AS1
-## 357 TRAPPC14 C7orf43
-## 358 TUBB8B TUBB8P12
-## 359 USP27X-DT USP27X-AS1
-## 360 USP46-DT USP46-AS1
-## 361 VARS1 VARS
-## 362 WARS1 WARS
-## 363 YAE1 YAE1D1
-## 364 YARS1 YARS
-## 365 YJU2B CCDC130
-## 366 YTHDF3-DT YTHDF3-AS1
-## 367 ZFTA C11orf95
-## 368 ZNF22-AS1 C10orf25
-## 369 ZNF407-AS1 LINC00909
-## 370 ZNF516-DT C18orf65
-## 371 ZNF582-DT ZNF582-AS1
-## 372 ZNF875 HKR1
-## 373 ZNRD2 SSSCA1
In this case study, we want to study differences in cell-cell
-communication patterns between MIS-C patients (M), their healthy
-siblings (S) and adult patients with severe covid (A). The meta data
-columns that indicate this disease status is
-MIS.C.AgeTier
.
Cell type annotations are indicated in the
-Annotation_v2.0
column, and the sample is indicated by the
-ShortID
column. If your cells are annotated in multiple
-hierarchical levels, we recommend using a high level in the hierarchy.
-This for 2 reasons: 1) MultiNicheNet focuses on differential expression
-and not differential abundance, and 2) there should be sufficient cells
-per sample-celltype combination.
If you would have batch effects or covariates you can correct for, -you can define this here as well.
-Important: for categorical covariates and batches, there should be at -least one sample for every group-batch combination. If one of your -groups/conditions lacks a certain level of your batch, you won’t be able -to correct for the batch effect because the model is then not able to -distinguish batch from group/condition effects.
-Important: the column names of group, sample, cell type, batches and
-covariates should be syntactically valid (make.names
)
Important: All group, sample, cell type, batch and covariate names
-should be syntactically valid as well (make.names
) (eg
-through
-SummarizedExperiment::colData(sce)$ShortID = SummarizedExperiment::colData(sce)$ShortID %>% make.names()
)
= "ShortID"
- sample_id = "MIS.C.AgeTier"
- group_id = "Annotation_v2.0"
- celltype_id = NA
- covariates = NA batches
Sender and receiver cell types also need to be defined. Both are here -all cell types in the dataset because we are interested in an All-vs-All -analysis.
-= SummarizedExperiment::colData(sce)[,celltype_id] %>% unique()
- senders_oi = SummarizedExperiment::colData(sce)[,celltype_id] %>% unique() receivers_oi
If the user wants it, it is possible to use only a subset of senders -and receivers. Senders and receivers can be entirely different, but also -overlapping, or the same. If you don’t use all the cell types in your -data, we recommend to continue with a subset of your data.
-= sce[, SummarizedExperiment::colData(sce)[,celltype_id] %in% c(senders_oi, receivers_oi)] sce
Now we will go to the first real step of the MultiNicheNet -analysis
-Since MultiNicheNet will infer group differences at the sample level -for each cell type (currently via Muscat - pseudobulking + EdgeR), we -need to have sufficient cells per sample of a cell type, and this for -both groups. In the following analysis we will set this minimum number -of cells per cell type per sample at 10 (recommended minimum).
-= 10 min_cells
Now we will calculate abundance and expression information for each -cell type / sample / group combination with the following functions. In -the output of this function, you can also find some ‘Cell type abundance -diagnostic plots’ that will the users which celltype-sample combinations -will be left out later on for DE calculation because the nr of cells is -lower than de defined minimum defined here above. If too many -celltype-sample combinations don’t pass this threshold, we recommend to -define your cell types in a more general way (use one level higher of -the cell type ontology hierarchy) (eg TH17 CD4T cells –> CD4T -cells).
-= get_abundance_expression_info(sce = sce, sample_id = sample_id, group_id = group_id, celltype_id = celltype_id, min_cells = min_cells, senders_oi = senders_oi, receivers_oi = receivers_oi, lr_network = lr_network, batches = batches) abundance_expression_info
Following these steps, one can optionally +* 7. Calculate the across-samples expression correlation between ligand-receptor pairs and target genes +* 8. Prioritize communication patterns involving condition-specific cell types through an alternative prioritization scheme
+After these steps, the output can be further explored as we will demonstrate in the “Downstream analysis of the MultiNicheNet output” section.
+In this vignette, we will demonstrate these steps one-by-one, which offers the most flexibility to the user to assess intermediary results. Other vignettes will demonstrate the use of the multi_nichenet_analysis
wrapper function.
In this step we will calculate and visualize cell type abundances. This will give an indication about which cell types will be retained in the analysis, and which cell types will be filtered out.
+Since MultiNicheNet will infer group differences at the sample level for each cell type (currently via Muscat - pseudobulking + EdgeR), we need to have sufficient cells per sample of a cell type, and this for all groups. In the following analysis we will set this minimum number of cells per cell type per sample at 10. Samples that have less than min_cells
cells will be excluded from the analysis for that specific cell type.
min_cells = 10
+We recommend using min_cells = 10
, except for datasets with several lowly abundant cell types of interest. For those datasets, we recommend using min_cells = 5
.
abundance_info = get_abundance_info(
+ sce = sce,
+ sample_id = sample_id, group_id = group_id, celltype_id = celltype_id,
+ min_cells = min_cells,
+ senders_oi = senders_oi, receivers_oi = receivers_oi,
+ batches = batches
+ )
First, we will check the cell type abundance diagnostic plots.
-The first plot visualizes the number of cells per celltype-sample
-combination, and indicates which combinations are removed during the DE
-analysis because there are less than min_cells
in the
-celltype-sample combination.
$abund_plot_sample abundance_expression_info
-The red dotted line indicates the required minimum of cells as defined
-above in min_cells
. We can see here that some
-sample-celltype combinations are left out. For the DE analysis in the
-next step, only cell types will be considered if there are at least two
-samples per group with a sufficient number of cells.
Important: Based on the cell type abundance -diagnostics, we recommend users to change their analysis settings if -required (eg changing cell type annotation level, batches, …), before -proceeding with the rest of the analysis.
-Now we will go over to the multi-group, multi-sample differential -expression (DE) analysis (also called ‘differential state’ analysis by -the developers of Muscat).
-Here, we want to compare each patient group to the other groups, so -the MIS-C (M) group vs healthy control siblins (S) and adult COVID19 -patients (A) etcetera. To do this comparison, we need to set the -following contrasts:
-= c("'M-(S+A)/2','S-(M+A)/2','A-(S+M)/2'") contrasts_oi
Very Important Note the format to indicate the
-contrasts! This formatting should be adhered to very strictly, and white
-spaces are not allowed! Check ?get_DE_info
for explanation
-about how to define this well. The most important things are that: each
-contrast is surrounded by single quotation marks, contrasts are
-separated by a comma without any whitespace, and alle contrasts together
-are surrounded by double quotation marks. If you compare against two
-groups, you should divide by 2, if you compare against three groups, you
-should divide by 3 etcetera.
For downstream visualizations and linking contrasts to their main -group, you need to run the following:
-= tibble(contrast =
- contrast_tbl c("M-(S+A)/2","S-(M+A)/2", "A-(S+M)/2"),
- group = c("M","S","A"))
If you want to compare only two groups (eg M vs S), you can do like
-this: contrasts_oi = c("'M-S','S-M'")
-contrast_tbl = tibble(contrast = c("M-S","S-M"), group = c("M","S"))
= get_DE_info(sce = sce, sample_id = sample_id, group_id = group_id, celltype_id = celltype_id, batches = batches, covariates = covariates, contrasts_oi = contrasts_oi, min_cells = min_cells) DE_info
Table with logFC and p-values for each gene-celltype-contrast:
-$celltype_de$de_output_tidy %>% arrange(p_adj) %>% head()
- DE_info## # A tibble: 6 x 9
-## gene cluster_id logFC logCPM F p_val p_adj.loc p_adj contrast
-## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
-## 1 TRBV11.2 L_T_TIM3._CD38._HLADR. 4.73 9.84 246 4.33e-13 0.00000000283 0.00000000283 M-(S+A)/2
-## 2 IGLV3.25 L_NK_CD56._CD16. 7.54 5 218 6.04e-13 0.00000000516 0.00000000516 A-(S+M)/2
-## 3 IGHV3.11 L_NK_CD56._CD16. 5.72 3.09 182 6.56e-11 0.00000028 0.00000028 A-(S+M)/2
-## 4 IGHV1.46 L_NK_CD56._CD16. 4.58 4.12 125 1.44e-10 0.000000411 0.000000411 A-(S+M)/2
-## 5 IGLV6.57 L_NK_CD56._CD16. 4.76 3.83 112 3.83e-10 0.000000817 0.000000817 A-(S+M)/2
-## 6 IGKV1.27 L_NK_CD56._CD16. 4.13 3.35 108 5.5 e-10 0.000000939 0.000000939 A-(S+M)/2
= DE_info$celltype_de$de_output_tidy celltype_de
In the next step, we will combine the DE information of senders and -receivers by linking their ligands and receptors together:
-abundance_expression_info$sender_receiver_info
)= combine_sender_receiver_de(
- sender_receiver_de sender_de = celltype_de,
- receiver_de = celltype_de,
- senders_oi = senders_oi,
- receivers_oi = receivers_oi,
- lr_network = lr_network
- )
%>% head(20)
- sender_receiver_de ## # A tibble: 20 x 12
-## contrast sender receiver ligand receptor lfc_ligand lfc_receptor ligand_receptor_lfc_avg p_val_ligand p_adj_ligand p_val_receptor p_adj_receptor
-## <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-## 1 M-(S+A)/2 L_T_TIM3._CD38._HLADR. M_Monocyte_CD16 CCL3 CCR1 2.89 0.8 1.85 0.00053 0.0881 0.0568 0.308
-## 2 M-(S+A)/2 L_T_TIM3._CD38._HLADR. L_T_TIM3._CD38._HLADR. CCL3 CCR5 2.89 0.4 1.64 0.00053 0.0881 0.267 0.821
-## 3 M-(S+A)/2 L_NK_CD56._CD16. M_Monocyte_CD16 S100A8 CD69 1.68 1.53 1.60 0.031 0.428 0.00275 0.0893
-## 4 A-(S+M)/2 M_Monocyte_CD16 M_Monocyte_CD16 HLA.G KLRD1 1.46 1.6 1.53 0.00266 0.268 0.0991 0.644
-## 5 M-(S+A)/2 M_Monocyte_CD16 M_Monocyte_CD16 C1QB CD33 2.99 0.0287 1.51 0.00000638 0.0131 0.926 0.992
-## 6 A-(S+M)/2 M_Monocyte_CD16 M_Monocyte_CD16 C1QB LRP1 2.21 0.578 1.39 0.000377 0.152 0.0223 0.518
-## 7 M-(S+A)/2 M_Monocyte_CD16 M_Monocyte_CD16 SLAMF7 SLAMF7 1.38 1.38 1.38 0.0318 0.239 0.0318 0.239
-## 8 M-(S+A)/2 L_T_TIM3._CD38._HLADR. M_Monocyte_CD16 GZMB IGF2R 2.28 0.423 1.35 0.0000633 0.0414 0.304 0.64
-## 9 M-(S+A)/2 M_Monocyte_CD16 M_Monocyte_CD16 C1QB LRP1 2.99 -0.326 1.33 0.00000638 0.0131 0.0727 0.346
-## 10 M-(S+A)/2 L_T_TIM3._CD38._HLADR. L_T_TIM3._CD38._HLADR. GZMB IGF2R 2.28 0.376 1.33 0.0000633 0.0414 0.198 0.755
-## 11 M-(S+A)/2 M_Monocyte_CD16 L_T_TIM3._CD38._HLADR. IL18 IL18R1 1.9 0.726 1.31 0.0637 0.327 0.0692 0.55
-## 12 A-(S+M)/2 M_Monocyte_CD16 L_T_TIM3._CD38._HLADR. LGALS3 LAG3 0.915 1.71 1.31 0.00129 0.224 0.0012 0.334
-## 13 M-(S+A)/2 L_NK_CD56._CD16. L_T_TIM3._CD38._HLADR. FAM3C ADGRG5 0.101 2.52 1.31 0.576 0.932 0.00294 0.173
-## 14 M-(S+A)/2 M_Monocyte_CD16 L_T_TIM3._CD38._HLADR. HLA.A KLRD1 0.799 1.78 1.29 0.0000086 0.0133 0.000741 0.0969
-## 15 M-(S+A)/2 M_Monocyte_CD16 L_NK_CD56._CD16. IL18 IL18R1 1.9 0.673 1.29 0.0637 0.327 0.0275 0.414
-## 16 S-(M+A)/2 M_Monocyte_CD16 L_T_TIM3._CD38._HLADR. RETN GPR25 -0.209 2.75 1.27 0.666 0.962 0.00000532 0.0171
-## 17 A-(S+M)/2 L_NK_CD56._CD16. M_Monocyte_CD16 SPON2 ITGAM 0.442 2.09 1.27 0.0424 0.569 0.0111 0.416
-## 18 M-(S+A)/2 M_Monocyte_CD16 L_T_TIM3._CD38._HLADR. TNF TNFRSF1A 1.78 0.727 1.25 0.021 0.205 0.0745 0.561
-## 19 A-(S+M)/2 L_NK_CD56._CD16. M_Monocyte_CD16 ICAM2 ITGAM 0.397 2.09 1.24 0.0228 0.446 0.0111 0.416
-## 20 A-(S+M)/2 M_Monocyte_CD16 L_NK_CD56._CD16. HLA.G LILRB1 1.46 0.996 1.23 0.00266 0.268 0.00673 0.282
Here, we need to define the thresholds that will be used to consider -genes as differentially expressed or not (logFC, p-value, decision -whether to use adjusted or normal p-value, minimum fraction of cells -that should express a gene in at least one sample in a group, whether to -use the normal p-values or empirical p-values).
-NicheNet ligand activity will then be calculated as the enrichment of -predicted target genes of ligands in this set of DE genes compared to -the genomic background. Here we choose for a minimum logFC of 0.50, -maximum p-value of 0.05, and minimum fraction of expression of 0.05.
-= 0.50
- logFC_threshold = 0.05
- p_val_threshold = 0.05 fraction_cutoff
We will here choose for applying the p-value cutoff on the normal -p-values, and not on the p-values corrected for multiple testing. This -choice was made here because this dataset has only a few samples per -group and we might have a lack of statistical power due to -pseudobulking. In case of more samples per group, and a sufficient high -number of DE genes per group-celltype (> 50), we would recommend -using the adjusted p-values.
-# p_val_adj = TRUE
-= FALSE p_val_adj
For the NicheNet ligand-target inference, we also need to select -which top n of the predicted target genes will be considered (here: top -250 targets per ligand).
-= 250 top_n_target
The NicheNet ligand activity analysis can be run in parallel for each -receiver cell type, by changing the number of cores as defined here. -This is only recommended if you have many receiver cell type.
-= TRUE
- verbose = 8
- cores_system = min(cores_system, sender_receiver_de$receiver %>% unique() %>% length()) # use one core per receiver cell type n.cores
(this might take some time)
-= suppressMessages(suppressWarnings(get_ligand_activities_targets_DEgenes(
- ligand_activities_targets_DEgenes receiver_de = celltype_de,
- receivers_oi = receivers_oi,
- ligand_target_matrix = ligand_target_matrix,
- logFC_threshold = logFC_threshold,
- p_val_threshold = p_val_threshold,
- p_val_adj = p_val_adj,
- top_n_target = top_n_target,
- verbose = verbose,
- n.cores = n.cores
- )))
Check the DE genes used for the activity analysis
-$de_genes_df %>% head(20)
- ligand_activities_targets_DEgenes## # A tibble: 20 x 6
-## gene receiver logFC p_val p_adj contrast
-## <chr> <chr> <dbl> <dbl> <dbl> <chr>
-## 1 TNFRSF14.AS1 L_NK_CD56._CD16. 0.957 0.00935 0.276 M-(S+A)/2
-## 2 ACOT7 L_NK_CD56._CD16. 0.711 0.00449 0.218 M-(S+A)/2
-## 3 KDM1A L_NK_CD56._CD16. 0.632 0.0268 0.408 M-(S+A)/2
-## 4 SMAP2 L_NK_CD56._CD16. 0.56 0.00332 0.198 M-(S+A)/2
-## 5 P3H1 L_NK_CD56._CD16. 0.58 0.00937 0.276 M-(S+A)/2
-## 6 CPT2 L_NK_CD56._CD16. 0.632 0.0256 0.401 M-(S+A)/2
-## 7 DHCR24 L_NK_CD56._CD16. 1.15 0.00182 0.165 M-(S+A)/2
-## 8 GADD45A L_NK_CD56._CD16. 0.538 0.0224 0.382 M-(S+A)/2
-## 9 S100A9 L_NK_CD56._CD16. 1.57 0.0315 0.431 M-(S+A)/2
-## 10 S100A8 L_NK_CD56._CD16. 1.68 0.031 0.428 M-(S+A)/2
-## 11 XCL1 L_NK_CD56._CD16. 0.896 0.0467 0.473 M-(S+A)/2
-## 12 SELL L_NK_CD56._CD16. 0.576 0.00131 0.15 M-(S+A)/2
-## 13 IER5 L_NK_CD56._CD16. 0.736 0.0152 0.338 M-(S+A)/2
-## 14 MIR181A1HG L_NK_CD56._CD16. 0.885 0.0153 0.339 M-(S+A)/2
-## 15 CD55 L_NK_CD56._CD16. 0.58 0.0251 0.4 M-(S+A)/2
-## 16 UTP25 L_NK_CD56._CD16. 0.697 0.0479 0.475 M-(S+A)/2
-## 17 H2AW L_NK_CD56._CD16. 0.677 0.000416 0.109 M-(S+A)/2
-## 18 RNF187 L_NK_CD56._CD16. 0.661 0.0032 0.198 M-(S+A)/2
-## 19 LPIN1 L_NK_CD56._CD16. 0.555 0.000124 0.0778 M-(S+A)/2
-## 20 POMC L_NK_CD56._CD16. 0.59 0.015 0.335 M-(S+A)/2
Check the output of the activity analysis
-$ligand_activities %>% head(20)
- ligand_activities_targets_DEgenes## # A tibble: 20 x 8
-## # Groups: receiver, contrast [1]
-## ligand activity contrast target ligand_target_weight receiver direction_regulation activity_scaled
-## <chr> <dbl> <chr> <chr> <dbl> <chr> <fct> <dbl>
-## 1 A2M 0.0224 M-(S+A)/2 CD55 0.00649 L_NK_CD56._CD16. up 0.823
-## 2 A2M 0.0224 M-(S+A)/2 FKBP5 0.00723 L_NK_CD56._CD16. up 0.823
-## 3 A2M 0.0224 M-(S+A)/2 FOS 0.0146 L_NK_CD56._CD16. up 0.823
-## 4 A2M 0.0224 M-(S+A)/2 GADD45A 0.0110 L_NK_CD56._CD16. up 0.823
-## 5 A2M 0.0224 M-(S+A)/2 H2AC6 0.00747 L_NK_CD56._CD16. up 0.823
-## 6 A2M 0.0224 M-(S+A)/2 H2BC12 0.00692 L_NK_CD56._CD16. up 0.823
-## 7 A2M 0.0224 M-(S+A)/2 ISG20 0.00738 L_NK_CD56._CD16. up 0.823
-## 8 A2M 0.0224 M-(S+A)/2 LMNB1 0.00699 L_NK_CD56._CD16. up 0.823
-## 9 A2M 0.0224 M-(S+A)/2 MYC 0.0199 L_NK_CD56._CD16. up 0.823
-## 10 A2M 0.0224 M-(S+A)/2 NFKB1 0.00895 L_NK_CD56._CD16. up 0.823
-## 11 A2M 0.0224 M-(S+A)/2 NFKBIZ 0.00661 L_NK_CD56._CD16. up 0.823
-## 12 A2M 0.0224 M-(S+A)/2 PPP1R15A 0.00776 L_NK_CD56._CD16. up 0.823
-## 13 A2M 0.0224 M-(S+A)/2 SLC1A5 0.00710 L_NK_CD56._CD16. up 0.823
-## 14 A2M 0.0224 M-(S+A)/2 SMAD3 0.00776 L_NK_CD56._CD16. up 0.823
-## 15 A2M 0.0224 M-(S+A)/2 SOCS3 0.00968 L_NK_CD56._CD16. up 0.823
-## 16 A2M 0.0224 M-(S+A)/2 TFRC 0.00712 L_NK_CD56._CD16. up 0.823
-## 17 A2M 0.0224 M-(S+A)/2 WARS1 0.00743 L_NK_CD56._CD16. up 0.823
-## 18 A2M 0.0224 M-(S+A)/2 ZFP36 0.00732 L_NK_CD56._CD16. up 0.823
-## 19 AANAT 0.0215 M-(S+A)/2 FKBP5 0.00514 L_NK_CD56._CD16. up 0.652
-## 20 AANAT 0.0215 M-(S+A)/2 FOS 0.00863 L_NK_CD56._CD16. up 0.652
In the 3 previous steps, we calculated expression, differential -expression and NicheNet activity information. Now we will combine these -different types of information in one prioritization scheme.
-MultiNicheNet allows the user to define the weights of the following -criteria to prioritize ligand-receptor interactions:
+The first plot visualizes the number of cells per celltype-sample combination, and indicates which combinations are removed during the DE analysis because there are less than min_cells
in the celltype-sample combination.
abundance_info$abund_plot_sample
+
+The red dotted line indicates the required minimum of cells as defined above in min_cells
. We can see here that some sample-celltype combinations are left out. For the DE analysis in the next step, only cell types will be considered if there are at least two samples per group with a sufficient number of cells. But as we can see here: all cell types will be considered for the analysis and there are no condition-specific cell types.
Important: Based on the cell type abundance diagnostics, we recommend users to change their analysis settings if required (eg changing cell type annotation level, batches, …), before proceeding with the rest of the analysis. If too many celltype-sample combinations don’t pass this threshold, we recommend to define your cell types in a more general way (use one level higher of the cell type ontology hierarchy) (eg TH17 CD4T cells –> CD4T cells) or use min_cells = 5
if this would not be possible.
Running the following block of code can help you determine which cell types are condition-specific and which cell types are absent.
+sample_group_celltype_df = abundance_info$abundance_data %>%
+ filter(n > min_cells) %>%
+ ungroup() %>%
+ distinct(sample_id, group_id) %>%
+ cross_join(
+ abundance_info$abundance_data %>%
+ ungroup() %>%
+ distinct(celltype_id)
+ ) %>%
+ arrange(sample_id)
+
+abundance_df = sample_group_celltype_df %>% left_join(
+ abundance_info$abundance_data %>% ungroup()
+ )
+
+abundance_df$n[is.na(abundance_df$n)] = 0
+abundance_df$keep[is.na(abundance_df$keep)] = FALSE
+abundance_df_summarized = abundance_df %>%
+ mutate(keep = as.logical(keep)) %>%
+ group_by(group_id, celltype_id) %>%
+ summarise(samples_present = sum((keep)))
+
+celltypes_absent_one_condition = abundance_df_summarized %>%
+ filter(samples_present == 0) %>% pull(celltype_id) %>% unique()
+# find truly condition-specific cell types by searching for cell types
+# truely absent in at least one condition
+
+celltypes_present_one_condition = abundance_df_summarized %>%
+ filter(samples_present >= 2) %>% pull(celltype_id) %>% unique()
+# require presence in at least 2 samples of one group so
+# it is really present in at least one condition
+
+condition_specific_celltypes = intersect(
+ celltypes_absent_one_condition,
+ celltypes_present_one_condition)
+
+total_nr_conditions = SummarizedExperiment::colData(sce)[,group_id] %>%
+ unique() %>% length()
+
+absent_celltypes = abundance_df_summarized %>%
+ filter(samples_present < 2) %>%
+ group_by(celltype_id) %>%
+ count() %>%
+ filter(n == total_nr_conditions) %>%
+ pull(celltype_id)
+
+print("condition-specific celltypes:")
+## [1] "condition-specific celltypes:"
+print(condition_specific_celltypes)
+## character(0)
+
+print("absent celltypes:")
+## [1] "absent celltypes:"
+print(absent_celltypes)
+## character(0)
+Absent cell types will be filtered out, condition-specific cell types can be filtered out if you as a user do not want to run the alternative workflow for condition-specific cell types in the optional step 8 of the core MultiNicheNet analysis.
+analyse_condition_specific_celltypes = FALSE
+if(analyse_condition_specific_celltypes == TRUE){
+ senders_oi = senders_oi %>% setdiff(absent_celltypes)
+ receivers_oi = receivers_oi %>% setdiff(absent_celltypes)
+} else {
+ senders_oi = senders_oi %>%
+ setdiff(union(absent_celltypes, condition_specific_celltypes))
+ receivers_oi = receivers_oi %>%
+ setdiff(union(absent_celltypes, condition_specific_celltypes))
+}
+
+sce = sce[, SummarizedExperiment::colData(sce)[,celltype_id] %in%
+ c(senders_oi, receivers_oi)
+ ]
+Before running the DE analysis, we will determine which genes are not sufficiently expressed and should be filtered out.
+We will perform gene filtering based on a similar procedure as used in edgeR::filterByExpr
. However, we adapted this procedure to be more interpretable for single-cell datasets.
For each cell type, we will consider genes expressed if they are expressed in at least a min_sample_prop
fraction of samples in the condition with the lowest number of samples. By default, we set min_sample_prop = 0.50
, which means that genes should be expressed in at least 2 samples if the group with lowest nr. of samples has 4 samples like this dataset.
min_sample_prop = 0.50
+But how do we define which genes are expressed in a sample? For this we will consider genes as expressed if they have non-zero expression values in a fraction_cutoff
fraction of cells of that cell type in that sample. By default, we set fraction_cutoff = 0.05
, which means that genes should show non-zero expression values in at least 5% of cells in a sample.
fraction_cutoff = 0.05
+We recommend using these default values unless there is specific interest in prioritizing (very) weakly expressed interactions. In that case, you could lower the value of fraction_cutoff
. We explicitly recommend against using fraction_cutoff > 0.10
.
Now we will calculate the information required for gene filtering with the following command:
+frq_list = get_frac_exprs(
+ sce = sce,
+ sample_id = sample_id, celltype_id = celltype_id, group_id = group_id,
+ batches = batches,
+ min_cells = min_cells,
+ fraction_cutoff = fraction_cutoff, min_sample_prop = min_sample_prop)
+## [1] "Samples are considered if they have more than 10 cells of the cell type of interest"
+## [1] "Genes with non-zero counts in at least 5% of cells of a cell type of interest in a particular sample will be considered as expressed in that sample."
+## [1] "Genes expressed in at least 2 samples will considered as expressed in the cell type: L_NK_CD56._CD16."
+## [1] "Genes expressed in at least 2 samples will considered as expressed in the cell type: L_T_TIM3._CD38._HLADR."
+## [1] "Genes expressed in at least 2 samples will considered as expressed in the cell type: M_Monocyte_CD16"
+## [1] "6621 genes are considered as expressed in the cell type: L_NK_CD56._CD16."
+## [1] "8461 genes are considered as expressed in the cell type: L_T_TIM3._CD38._HLADR."
+## [1] "8817 genes are considered as expressed in the cell type: M_Monocyte_CD16"
+Now only keep genes that are expressed by at least one cell type:
+genes_oi = frq_list$expressed_df %>%
+ filter(expressed == TRUE) %>% pull(gene) %>% unique()
+sce = sce[genes_oi, ]
+After filtering out absent cell types and genes, we will continue the analysis by calculating the different prioritization criteria that we will use to prioritize cell-cell communication patterns.
+First, we will determine and normalize per-sample pseudobulk expression levels for each expressed gene in each present cell type. The function process_abundance_expression_info
will link this expression information for ligands of the sender cell types to the corresponding receptors of the receiver cell types. This will later on allow us to define the cell-type specicificy criteria for ligands and receptors.
abundance_expression_info = process_abundance_expression_info(
+ sce = sce,
+ sample_id = sample_id, group_id = group_id, celltype_id = celltype_id,
+ min_cells = min_cells,
+ senders_oi = senders_oi, receivers_oi = receivers_oi,
+ lr_network = lr_network,
+ batches = batches,
+ frq_list = frq_list,
+ abundance_info = abundance_info)
+Normalized pseudobulk expression values per gene/celltype/sample can be inspected by:
+abundance_expression_info$celltype_info$pb_df %>% head()
+## # A tibble: 6 × 4
+## gene sample pb_sample celltype
+## <chr> <chr> <dbl> <fct>
+## 1 A1BG A1 3.10 L_T_TIM3._CD38._HLADR.
+## 2 AAAS A1 4.01 L_T_TIM3._CD38._HLADR.
+## 3 AAGAB A1 5.54 L_T_TIM3._CD38._HLADR.
+## 4 AAK1 A1 6.40 L_T_TIM3._CD38._HLADR.
+## 5 AAMDC A1 4.57 L_T_TIM3._CD38._HLADR.
+## 6 AAMP A1 5.54 L_T_TIM3._CD38._HLADR.
+An average of these sample-level expression values per condition/group can be inspected by:
+abundance_expression_info$celltype_info$pb_df_group %>% head()
+## # A tibble: 6 × 4
+## # Groups: group, celltype [1]
+## group celltype gene pb_group
+## <chr> <chr> <chr> <dbl>
+## 1 A L_NK_CD56._CD16. A1BG 3.91
+## 2 A L_NK_CD56._CD16. AAAS 4.81
+## 3 A L_NK_CD56._CD16. AAGAB 4.82
+## 4 A L_NK_CD56._CD16. AAK1 7.03
+## 5 A L_NK_CD56._CD16. AAMDC 3.66
+## 6 A L_NK_CD56._CD16. AAMP 6.04
+Inspecting these values for ligand-receptor interactions can be done by:
+abundance_expression_info$sender_receiver_info$pb_df %>% head()
+## # A tibble: 6 × 8
+## sample sender receiver ligand receptor pb_ligand pb_receptor
+## <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
+## 1 M4 M_Monocyte_CD16 M_Monocyt… B2M LILRB1 14.2 11.4
+## 2 M4 L_NK_CD56._CD16. M_Monocyt… B2M LILRB1 14.0 11.4
+## 3 M4 L_T_TIM3._CD38._HLADR. M_Monocyt… B2M LILRB1 13.8 11.4
+## 4 M5 L_NK_CD56._CD16. M_Monocyt… B2M LILRB1 14.5 10.5
+## 5 M5 M_Monocyte_CD16 M_Monocyt… B2M LILRB1 14.4 10.5
+## 6 M5 L_T_TIM3._CD38._HLADR. M_Monocyt… B2M LILRB1 14.2 10.5
+## # ℹ 1 more variable: ligand_receptor_pb_prod <dbl>
+abundance_expression_info$sender_receiver_info$pb_df_group %>% head()
+## # A tibble: 6 × 8
+## # Groups: group, sender [6]
+## group sender receiver ligand receptor pb_ligand_group pb_receptor_group
+## <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
+## 1 M L_NK_CD56._C… M_Monoc… B2M LILRB1 14.1 10.0
+## 2 M M_Monocyte_C… M_Monoc… B2M LILRB1 14.1 10.0
+## 3 M L_T_TIM3._CD… M_Monoc… B2M LILRB1 14.0 10.0
+## 4 A L_NK_CD56._C… L_NK_CD… B2M KLRD1 14.3 9.71
+## 5 S L_NK_CD56._C… L_NK_CD… B2M KLRD1 14.4 9.46
+## 6 A L_T_TIM3._CD… L_NK_CD… B2M KLRD1 13.9 9.71
+## # ℹ 1 more variable: ligand_receptor_pb_prod_group <dbl>
+In this step, we will perform genome-wide differential expression analysis of receiver and sender cell types to define DE genes between the conditions of interest (as formalized by the contrasts_oi
). Based on this analysis, we later can define the levels of differential expression of ligands in senders and receptors in receivers, and define the set of affected target genes in the receiver cell types (which will be used for the ligand activity analysis).
We will apply pseudobulking followed by EdgeR to perform multi-condition multi-sample differential expression (DE) analysis (also called ‘differential state’ analysis by the developers of Muscat).
+DE_info = get_DE_info(
+ sce = sce,
+ sample_id = sample_id, group_id = group_id, celltype_id = celltype_id,
+ batches = batches, covariates = covariates,
+ contrasts_oi = contrasts_oi,
+ min_cells = min_cells,
+ expressed_df = frq_list$expressed_df)
+## [1] "DE analysis is done:"
+## [1] "included cell types are:"
+## [1] "L_T_TIM3._CD38._HLADR." "L_NK_CD56._CD16." "M_Monocyte_CD16"
+Check DE output information in table with logFC and p-values for each gene-celltype-contrast:
+DE_info$celltype_de$de_output_tidy %>% head()
+## # A tibble: 6 × 9
+## gene cluster_id logFC logCPM F p_val p_adj.loc p_adj contrast
+## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
+## 1 A1BG L_T_TIM3._CD38._… 0.127 4.46 0.175 0.68 0.984 0.984 M-(S+A)…
+## 2 AAAS L_T_TIM3._CD38._… -0.072 4.78 0.0678 0.797 1 1 M-(S+A)…
+## 3 AAGAB L_T_TIM3._CD38._… 0.492 5.01 4.73 0.04 0.481 0.481 M-(S+A)…
+## 4 AAK1 L_T_TIM3._CD38._… -0.521 6.83 12.7 0.00163 0.138 0.138 M-(S+A)…
+## 5 AAMDC L_T_TIM3._CD38._… -0.44 3.95 2.15 0.156 0.724 0.724 M-(S+A)…
+## 6 AAMP L_T_TIM3._CD38._… 0.224 6.04 2.95 0.235 0.806 0.806 M-(S+A)…
+Evaluate the distributions of p-values:
+DE_info$hist_pvals
+
+These distributions look fine (uniform distribution, except peak at p-value <= 0.05), so we will continue using these regular p-values. In case these p-value distributions look irregular, you can estimate empirical p-values as we will demonstrate in another vignette.
+empirical_pval = FALSE
+if(empirical_pval == TRUE){
+ DE_info_emp = get_empirical_pvals(DE_info$celltype_de$de_output_tidy)
+ celltype_de = DE_info_emp$de_output_tidy_emp %>% select(-p_val, -p_adj) %>%
+ rename(p_val = p_emp, p_adj = p_adj_emp)
+} else {
+ celltype_de = DE_info$celltype_de$de_output_tidy
+}
+To end this step, we will combine the DE information of senders and receivers by linking their ligands and receptors together based on the prior knowledge ligand-receptor network.
+sender_receiver_de = combine_sender_receiver_de(
+ sender_de = celltype_de,
+ receiver_de = celltype_de,
+ senders_oi = senders_oi,
+ receivers_oi = receivers_oi,
+ lr_network = lr_network
+)
+sender_receiver_de %>% head(20)
+## # A tibble: 20 × 12
+## contrast sender receiver ligand receptor lfc_ligand lfc_receptor
+## <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
+## 1 M-(S+A)/2 M_Monocyte_CD16 M_Monoc… C3 VSIG4 2.96 5.76
+## 2 M-(S+A)/2 M_Monocyte_CD16 M_Monoc… IL10 IL10RB 5.61 0.715
+## 3 M-(S+A)/2 M_Monocyte_CD16 L_T_TIM… IL18 IL18RAP 1.93 4.39
+## 4 M-(S+A)/2 M_Monocyte_CD16 L_T_TIM… IL10 IL10RB 5.61 0.184
+## 5 M-(S+A)/2 L_NK_CD56._CD16. L_T_TIM… IL18 IL18RAP 1.29 4.39
+## 6 M-(S+A)/2 M_Monocyte_CD16 L_NK_CD… IL10 IL10RA 5.61 0.0393
+## 7 M-(S+A)/2 M_Monocyte_CD16 L_T_TIM… IL10 IL10RA 5.61 -0.0612
+## 8 M-(S+A)/2 M_Monocyte_CD16 L_NK_CD… IL10 IL10RB 5.61 -0.0727
+## 9 M-(S+A)/2 M_Monocyte_CD16 M_Monoc… IL10 IL10RA 5.61 -0.316
+## 10 M-(S+A)/2 M_Monocyte_CD16 L_T_TIM… C3 ITGAM 2.96 2.2
+## 11 A-(S+M)/2 L_T_TIM3._CD38._H… M_Monoc… SCGB3… MARCO 4.49 0.611
+## 12 M-(S+A)/2 M_Monocyte_CD16 M_Monoc… IL1B IL1RAP 1.53 3.56
+## 13 M-(S+A)/2 M_Monocyte_CD16 M_Monoc… THBS1 CD47 4.58 0.446
+## 14 M-(S+A)/2 L_T_TIM3._CD38._H… M_Monoc… TNFSF… CD163 0.484 4.54
+## 15 M-(S+A)/2 M_Monocyte_CD16 M_Monoc… THBS1 ITGA6 4.58 0.382
+## 16 M-(S+A)/2 M_Monocyte_CD16 L_NK_CD… THBS1 ITGA6 4.58 0.352
+## 17 M-(S+A)/2 M_Monocyte_CD16 L_NK_CD… THBS1 ITGA4 4.58 0.283
+## 18 M-(S+A)/2 M_Monocyte_CD16 L_T_TIM… THBS1 ITGA6 4.58 0.263
+## 19 M-(S+A)/2 L_T_TIM3._CD38._H… M_Monoc… HMGB1 CD163 0.226 4.54
+## 20 M-(S+A)/2 M_Monocyte_CD16 M_Monoc… HMGB1 CD163 0.154 4.54
+## # ℹ 5 more variables: ligand_receptor_lfc_avg <dbl>, p_val_ligand <dbl>,
+## # p_adj_ligand <dbl>, p_val_receptor <dbl>, p_adj_receptor <dbl>
+In this step, we will predict NicheNet ligand activities and NicheNet ligand-target links based on these differential expression results. We do this to prioritize interactions based on their predicted effect on a receiver cell type. We will assume that the most important group-specific interactions are those that lead to group-specific gene expression changes in a receiver cell type.
+Similarly to base NicheNet (https://github.com/saeyslab/nichenetr), we use the DE output to create a “geneset of interest”: here we assume that DE genes within a cell type may be DE because of differential cell-cell communication processes. In the ligand activity prediction, we will assess the enrichment of target genes of ligands within this geneset of interest. In case high-probabiliy target genes of a ligand are enriched in this set compared to the background of expressed genes, we predict that this ligand may have a high activity.
+Because the ligand activity analysis is an enrichment procedure, it is important that this geneset of interest should contain a sufficient but not too large number of genes. The ratio geneset_oi/background should ideally be between 1/200 and 1/10 (or close to these ratios).
+To determine the genesets of interest based on DE output, we need to define some logFC and/or p-value thresholds per cell type/contrast combination. In general, we recommend inspecting the nr. of DE genes for all cell types based on the default thresholds and adapting accordingly. By default, we will apply the p-value cutoff on the normal p-values, and not on the p-values corrected for multiple testing. This choice was made because most multi-sample single-cell transcriptomics datasets have just a few samples per group and we might have a lack of statistical power due to pseudobulking. But, if the smallest group >= 20 samples, we typically recommend using p_val_adj = TRUE. When the biological difference between the conditions is very large, we typically recommend increasing the logFC_threshold and/or using p_val_adj = TRUE.
+We will first inspect the geneset_oi-vs-background ratios for the default tresholds:
+logFC_threshold = 0.50
+p_val_threshold = 0.05
+p_val_adj = FALSE
+geneset_assessment = contrast_tbl$contrast %>%
+ lapply(
+ process_geneset_data,
+ celltype_de, logFC_threshold, p_val_adj, p_val_threshold
+ ) %>%
+ bind_rows()
+geneset_assessment
+## # A tibble: 9 × 12
+## cluster_id n_background n_geneset_up n_geneset_down prop_geneset_up
+## <chr> <int> <int> <int> <dbl>
+## 1 L_NK_CD56._CD16. 6621 162 82 0.0245
+## 2 L_T_TIM3._CD38._HLAD… 8461 401 194 0.0474
+## 3 M_Monocyte_CD16 8817 647 438 0.0734
+## 4 L_NK_CD56._CD16. 6621 150 219 0.0227
+## 5 L_T_TIM3._CD38._HLAD… 8461 201 320 0.0238
+## 6 M_Monocyte_CD16 8817 368 254 0.0417
+## 7 L_NK_CD56._CD16. 6621 118 110 0.0178
+## 8 L_T_TIM3._CD38._HLAD… 8461 225 150 0.0266
+## 9 M_Monocyte_CD16 8817 262 464 0.0297
+## # ℹ 7 more variables: prop_geneset_down <dbl>, in_range_up <lgl>,
+## # in_range_down <lgl>, contrast <chr>, logFC_threshold <dbl>,
+## # p_val_threshold <dbl>, adjusted <lgl>
+We can see here that for all cell type / contrast combinations, all geneset/background ratio’s are within the recommended range (in_range_up
and in_range_down
columns). When these geneset/background ratio’s would not be within the recommended ranges, we should interpret ligand activity results for these cell types with more caution, or use different thresholds (for these or all cell types).
For the sake of demonstration, we will also calculate these ratio’s in case we would use the adjusted p-value as threshold.
+geneset_assessment_adjustedPval = contrast_tbl$contrast %>%
+ lapply(
+ process_geneset_data,
+ celltype_de, logFC_threshold, p_val_adj = TRUE, p_val_threshold
+ ) %>%
+ bind_rows()
+geneset_assessment_adjustedPval
+## # A tibble: 9 × 12
+## cluster_id n_background n_geneset_up n_geneset_down prop_geneset_up
+## <chr> <int> <int> <int> <dbl>
+## 1 L_NK_CD56._CD16. 6621 7 0 0.00106
+## 2 L_T_TIM3._CD38._HLAD… 8461 15 5 0.00177
+## 3 M_Monocyte_CD16 8817 25 11 0.00284
+## 4 L_NK_CD56._CD16. 6621 28 50 0.00423
+## 5 L_T_TIM3._CD38._HLAD… 8461 2 5 0.000236
+## 6 M_Monocyte_CD16 8817 10 15 0.00113
+## 7 L_NK_CD56._CD16. 6621 36 19 0.00544
+## 8 L_T_TIM3._CD38._HLAD… 8461 13 1 0.00154
+## 9 M_Monocyte_CD16 8817 4 3 0.000454
+## # ℹ 7 more variables: prop_geneset_down <dbl>, in_range_up <lgl>,
+## # in_range_down <lgl>, contrast <chr>, logFC_threshold <dbl>,
+## # p_val_threshold <dbl>, adjusted <lgl>
+We can see here that for most cell type / contrast combinations, the geneset/background ratio’s are not within the recommended range. Therefore, we will proceed with the default tresholds for the ligand activity analysis
+After the ligand activity prediction, we will also infer the predicted target genes of these ligands in each contrast. For this ligand-target inference procedure, we also need to select which top n of the predicted target genes will be considered (here: top 250 targets per ligand). This parameter will not affect the ligand activity predictions. It will only affect ligand-target visualizations and construction of the intercellular regulatory network during the downstream analysis. We recommend users to test other settings in case they would be interested in exploring fewer, but more confident target genes, or vice versa.
+top_n_target = 250
+The NicheNet ligand activity analysis can be run in parallel for each receiver cell type, by changing the number of cores as defined here. Using more cores will speed up the analysis at the cost of needing more memory. This is only recommended if you have many receiver cell types of interest.
+verbose = TRUE
+cores_system = 8
+n.cores = min(cores_system, celltype_de$cluster_id %>% unique() %>% length())
+Running the ligand activity prediction will take some time (the more cell types and contrasts, the more time)
+ligand_activities_targets_DEgenes = suppressMessages(suppressWarnings(
+ get_ligand_activities_targets_DEgenes(
+ receiver_de = celltype_de,
+ receivers_oi = intersect(receivers_oi, celltype_de$cluster_id %>% unique()),
+ ligand_target_matrix = ligand_target_matrix,
+ logFC_threshold = logFC_threshold,
+ p_val_threshold = p_val_threshold,
+ p_val_adj = p_val_adj,
+ top_n_target = top_n_target,
+ verbose = verbose,
+ n.cores = n.cores
+ )
+))
+You can check the output of the ligand activity and ligand-target inference here:
+ligand_activities_targets_DEgenes$ligand_activities %>% head(20)
+## # A tibble: 20 × 8
+## # Groups: receiver, contrast [1]
+## ligand activity contrast target ligand_target_weight receiver
+## <chr> <dbl> <chr> <chr> <dbl> <chr>
+## 1 A2M 0.0282 M-(S+A)/2 ACOT7 0.00632 L_NK_CD56._CD16.
+## 2 A2M 0.0282 M-(S+A)/2 AREG 0.00638 L_NK_CD56._CD16.
+## 3 A2M 0.0282 M-(S+A)/2 CD55 0.00649 L_NK_CD56._CD16.
+## 4 A2M 0.0282 M-(S+A)/2 CD74 0.00634 L_NK_CD56._CD16.
+## 5 A2M 0.0282 M-(S+A)/2 FKBP5 0.00723 L_NK_CD56._CD16.
+## 6 A2M 0.0282 M-(S+A)/2 FOS 0.0146 L_NK_CD56._CD16.
+## 7 A2M 0.0282 M-(S+A)/2 GADD45A 0.0110 L_NK_CD56._CD16.
+## 8 A2M 0.0282 M-(S+A)/2 H2AC6 0.00747 L_NK_CD56._CD16.
+## 9 A2M 0.0282 M-(S+A)/2 H2BC12 0.00692 L_NK_CD56._CD16.
+## 10 A2M 0.0282 M-(S+A)/2 ISG20 0.00738 L_NK_CD56._CD16.
+## 11 A2M 0.0282 M-(S+A)/2 LMNB1 0.00699 L_NK_CD56._CD16.
+## 12 A2M 0.0282 M-(S+A)/2 MYC 0.0199 L_NK_CD56._CD16.
+## 13 A2M 0.0282 M-(S+A)/2 NFKB1 0.00895 L_NK_CD56._CD16.
+## 14 A2M 0.0282 M-(S+A)/2 NFKBIZ 0.00661 L_NK_CD56._CD16.
+## 15 A2M 0.0282 M-(S+A)/2 PPP1R15A 0.00776 L_NK_CD56._CD16.
+## 16 A2M 0.0282 M-(S+A)/2 SLC1A5 0.00710 L_NK_CD56._CD16.
+## 17 A2M 0.0282 M-(S+A)/2 SMAD3 0.00776 L_NK_CD56._CD16.
+## 18 A2M 0.0282 M-(S+A)/2 SOCS3 0.00968 L_NK_CD56._CD16.
+## 19 A2M 0.0282 M-(S+A)/2 TFRC 0.00712 L_NK_CD56._CD16.
+## 20 A2M 0.0282 M-(S+A)/2 WARS1 0.00743 L_NK_CD56._CD16.
+## # ℹ 2 more variables: direction_regulation <fct>, activity_scaled <dbl>
+In the previous steps, we calculated expression, differential expression and NicheNet ligand activity. In the final step, we will now combine all calculated information to rank all sender-ligand—receiver-receptor pairs according to group/condition specificity. We will use the following criteria to prioritize ligand-receptor interactions:
de_ligand
and de_receptor
frac_exprs_ligand_receptor
exprs_ligand
and
-exprs_receptor
activity_scaled
abund_sender
and abund_receiver
-(experimental feature - not recommended to give non-zero weights for
-default analyses)The different properties of the sender-ligand—receiver-receptor pairs -can be weighted according to the user’s preference and insight in the -dataset at hand.
-We will set our preference for this dataset as follows - and -recommend the user to use the same weights by default:
-= c("de_ligand" = 1,
- prioritizing_weights_DE "de_receptor" = 1)
- = c("activity_scaled" = 2)
- prioritizing_weights_activity
-= c("exprs_ligand" = 2,
- prioritizing_weights_expression_specificity "exprs_receptor" = 2)
-
-= c("frac_exprs_ligand_receptor" = 1)
- prioritizing_weights_expression_sufficiency
-= c( "abund_sender" = 0,
- prioritizing_weights_relative_abundance "abund_receiver" = 0)
= c(prioritizing_weights_DE,
- prioritizing_weights
- prioritizing_weights_activity,
- prioritizing_weights_expression_specificity,
- prioritizing_weights_expression_sufficiency, prioritizing_weights_relative_abundance)
Make necessary grouping data frame
-= sender_receiver_de %>% dplyr::distinct(sender, receiver)
- sender_receiver_tbl
-= SummarizedExperiment::colData(sce) %>% tibble::as_tibble()
- metadata_combined
-if(!is.na(batches)){
-= metadata_combined[,c(sample_id, group_id, batches)] %>% tibble::as_tibble() %>% dplyr::distinct()
- grouping_tbl colnames(grouping_tbl) = c("sample","group",batches)
- else {
- } = metadata_combined[,c(sample_id, group_id)] %>% tibble::as_tibble() %>% dplyr::distinct()
- grouping_tbl colnames(grouping_tbl) = c("sample","group")
- }
Crucial note: grouping_tbl: group should be the same as in the -contrast_tbl, and as in the expression info tables! Rename accordingly -if this would not be the case. If you followed the guidelines of this -tutorial closely, there should be no problem.
-= suppressMessages(generate_prioritization_tables(
- prioritization_tables sender_receiver_info = abundance_expression_info$sender_receiver_info,
- sender_receiver_de = sender_receiver_de,
- ligand_activities_targets_DEgenes = ligand_activities_targets_DEgenes,
- contrast_tbl = contrast_tbl,
- sender_receiver_tbl = sender_receiver_tbl,
- grouping_tbl = grouping_tbl,
- prioritizing_weights = prioritizing_weights,
- fraction_cutoff = fraction_cutoff,
- abundance_data_receiver = abundance_expression_info$abundance_data_receiver,
- abundance_data_sender = abundance_expression_info$abundance_data_sender
- ))
We will combine these prioritization criteria in a single aggregated prioritization score. In the default setting, we will weigh each of these criteria equally (scenario = "regular"
). This setting is strongly recommended. However, we also provide some additional setting to accomodate different biological scenarios. The setting scenario = "lower_DE"
halves the weight for DE criteria and doubles the weight for ligand activity. This is recommended in case your hypothesis is that the differential CCC patterns in your data are less likely to be driven by DE (eg in cases of differential migration into a niche). The setting scenario = "no_frac_LR_expr"
ignores the criterion “Sufficiently high expression levels of ligand and receptor in many samples of the same group”. This may be interesting for users that have data with a limited number of samples and don’t want to penalize interactions if they are not sufficiently expressed in some samples.
Finally, we still need to make one choice. For NicheNet ligand activity we can choose to prioritize ligands that only induce upregulation of target genes (ligand_activity_down = FALSE
) or can lead potentially lead to both up- and downregulation (ligand_activity_down = TRUE
). The benefit of ligand_activity_down = FALSE
is ease of interpretability: prioritized ligand-receptor pairs will be upregulated in the condition of interest, just like their target genes. ligand_activity_down = TRUE
can be harder to interpret because target genes of some interactions may be upregulated in the other conditions compared to the condition of interest. This is harder to interpret, but may help to pick up interactions that can also repress gene expression.
Here we will choose for setting ligand_activity_down = FALSE
and focus specifically on upregulating ligands.
ligand_activity_down = FALSE
+sender_receiver_tbl = sender_receiver_de %>% distinct(sender, receiver)
+
+metadata_combined = SummarizedExperiment::colData(sce) %>% tibble::as_tibble()
+
+if(!is.na(batches)){
+ grouping_tbl = metadata_combined[,c(sample_id, group_id, batches)] %>%
+ tibble::as_tibble() %>% distinct()
+ colnames(grouping_tbl) = c("sample","group",batches)
+} else {
+ grouping_tbl = metadata_combined[,c(sample_id, group_id)] %>%
+ tibble::as_tibble() %>% distinct()
+ colnames(grouping_tbl) = c("sample","group")
+}
+
+prioritization_tables = suppressMessages(generate_prioritization_tables(
+ sender_receiver_info = abundance_expression_info$sender_receiver_info,
+ sender_receiver_de = sender_receiver_de,
+ ligand_activities_targets_DEgenes = ligand_activities_targets_DEgenes,
+ contrast_tbl = contrast_tbl,
+ sender_receiver_tbl = sender_receiver_tbl,
+ grouping_tbl = grouping_tbl,
+ scenario = "regular", # all prioritization criteria will be weighted equally
+ fraction_cutoff = fraction_cutoff,
+ abundance_data_receiver = abundance_expression_info$abundance_data_receiver,
+ abundance_data_sender = abundance_expression_info$abundance_data_sender,
+ ligand_activity_down = ligand_activity_down
+ ))
Check the output tables
First: group-based summary table
-$group_prioritization_tbl %>% head(20)
- prioritization_tables## # A tibble: 20 x 60
-## contrast group sender recei~1 ligand recep~2 lfc_l~3 lfc_r~4 ligan~5 p_val~6 p_adj~7 p_val~8 p_adj~9 activity direc~* activ~* lr_in~* id avg_l~* avg_r~* ligan~* fract~* fract~* ligan~* rel_a~* rel_a~* sende~*
-## <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <fct> <dbl> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
-## 1 A-(S+M)/2 A M_Monocyt~ L_T_TI~ NAMPT ITGB1 0.633 0.835 0.734 0.0525 0.584 0.00849 0.643 1.06e-2 up 0.0376 NAMPT_~ NAMP~ 1.35 0.907 1.22 0.730 0.554 0.405 0.001 0.130 0.0656
-## 2 A-(S+M)/2 A M_Monocyt~ L_T_TI~ NAMPT ITGB1 0.633 0.835 0.734 0.0525 0.584 0.00849 0.643 2.13e-2 down 4.11 NAMPT_~ NAMP~ 1.35 0.907 1.22 0.730 0.554 0.405 0.001 0.130 0.0656
-## 3 M-(S+A)/2 M L_T_TIM3.~ M_Mono~ IFNG IFNGR1 1.26 0.507 0.884 0.0233 0.376 0.101 0.396 5.17e-2 up 8.69 IFNG_I~ IFNG~ 0.155 0.575 0.0889 0.114 0.498 0.0567 1.00 0.389 0.695
-## 4 M-(S+A)/2 M L_T_TIM3.~ M_Mono~ IFNG IFNGR1 1.26 0.507 0.884 0.0233 0.376 0.101 0.396 5.15e-3 down -2.51 IFNG_I~ IFNG~ 0.155 0.575 0.0889 0.114 0.498 0.0567 1.00 0.389 0.695
-## 5 A-(S+M)/2 A M_Monocyt~ L_T_TI~ LGALS3 LAG3 0.915 1.71 1.31 0.00129 0.224 0.0012 0.334 1.75e-2 up 1.35 LGALS3~ LGAL~ 1.80 0.978 1.76 0.835 0.499 0.417 0.001 0.130 0.0656
-## 6 A-(S+M)/2 A M_Monocyt~ L_T_TI~ LGALS3 LAG3 0.915 1.71 1.31 0.00129 0.224 0.0012 0.334 1.27e-2 down 0.886 LGALS3~ LGAL~ 1.80 0.978 1.76 0.835 0.499 0.417 0.001 0.130 0.0656
-## 7 S-(M+A)/2 S M_Monocyt~ L_NK_C~ TGFB1 TGFBR3 0.347 0.278 0.312 0.0336 0.467 0.0157 0.24 9.26e-3 up 4.39 TGFB1_~ TGFB~ 0.919 0.461 0.424 0.631 0.253 0.160 0.765 0.459 0.612
-## 8 S-(M+A)/2 S M_Monocyt~ L_NK_C~ TGFB1 TGFBR3 0.347 0.278 0.312 0.0336 0.467 0.0157 0.24 2.05e-2 down -0.0993 TGFB1_~ TGFB~ 0.919 0.461 0.424 0.631 0.253 0.160 0.765 0.459 0.612
-## 9 S-(M+A)/2 S L_NK_CD56~ L_NK_C~ TGFB1 TGFBR3 0.197 0.278 0.238 0.0778 0.459 0.0157 0.24 9.26e-3 up 4.39 TGFB1_~ TGFB~ 0.897 0.461 0.414 0.463 0.253 0.117 0.459 0.459 0.459
-## 10 S-(M+A)/2 S L_NK_CD56~ L_NK_C~ TGFB1 TGFBR3 0.197 0.278 0.238 0.0778 0.459 0.0157 0.24 2.05e-2 down -0.0993 TGFB1_~ TGFB~ 0.897 0.461 0.414 0.463 0.253 0.117 0.459 0.459 0.459
-## 11 S-(M+A)/2 S M_Monocyt~ M_Mono~ TGFB1 ENG 0.347 1.36 0.854 0.0336 0.467 0.00936 0.299 1.17e-2 up 1.64 TGFB1_~ TGFB~ 0.919 0.147 0.135 0.631 0.125 0.0788 0.765 0.765 0.765
-## 12 S-(M+A)/2 S M_Monocyt~ M_Mono~ TGFB1 ENG 0.347 1.36 0.854 0.0336 0.467 0.00936 0.299 3.28e-2 down 0.354 TGFB1_~ TGFB~ 0.919 0.147 0.135 0.631 0.125 0.0788 0.765 0.765 0.765
-## 13 A-(S+M)/2 A M_Monocyt~ M_Mono~ ADAM10 NOTCH4 0.391 1.69 1.04 0.119 0.663 0.00183 0.236 9.22e-3 up -1.16 ADAM10~ ADAM~ 0.685 0.183 0.126 0.532 0.181 0.0960 0.001 0.001 0.001
-## 14 A-(S+M)/2 A M_Monocyt~ M_Mono~ ADAM10 NOTCH4 0.391 1.69 1.04 0.119 0.663 0.00183 0.236 1.03e-4 down 2.04 ADAM10~ ADAM~ 0.685 0.183 0.126 0.532 0.181 0.0960 0.001 0.001 0.001
-## 15 A-(S+M)/2 A M_Monocyt~ M_Mono~ ADAM10 TSPAN14 0.391 0.372 0.382 0.119 0.663 0.0825 0.628 9.22e-3 up -1.16 ADAM10~ ADAM~ 0.685 1.10 0.751 0.532 0.710 0.378 0.001 0.001 0.001
-## 16 A-(S+M)/2 A M_Monocyt~ M_Mono~ ADAM10 TSPAN14 0.391 0.372 0.382 0.119 0.663 0.0825 0.628 1.03e-4 down 2.04 ADAM10~ ADAM~ 0.685 1.10 0.751 0.532 0.710 0.378 0.001 0.001 0.001
-## 17 S-(M+A)/2 S L_NK_CD56~ M_Mono~ TGFB1 ENG 0.197 1.36 0.779 0.0778 0.459 0.00936 0.299 1.17e-2 up 1.64 TGFB1_~ TGFB~ 0.897 0.147 0.132 0.463 0.125 0.0578 0.459 0.765 0.612
-## 18 S-(M+A)/2 S L_NK_CD56~ M_Mono~ TGFB1 ENG 0.197 1.36 0.779 0.0778 0.459 0.00936 0.299 3.28e-2 down 0.354 TGFB1_~ TGFB~ 0.897 0.147 0.132 0.463 0.125 0.0578 0.459 0.765 0.612
-## 19 A-(S+M)/2 A M_Monocyt~ M_Mono~ S100A8 CD36 0.536 1.39 0.963 0.378 0.866 0.0907 0.633 1.68e-2 up 1.65 S100A8~ S100~ 2.36 0.296 0.699 0.772 0.240 0.185 0.001 0.001 0.001
-## 20 A-(S+M)/2 A M_Monocyt~ M_Mono~ S100A8 CD36 0.536 1.39 0.963 0.378 0.866 0.0907 0.633 -3.57e-3 down -0.159 S100A8~ S100~ 2.36 0.296 0.699 0.772 0.240 0.185 0.001 0.001 0.001
-## # ... with 33 more variables: lfc_pval_ligand <dbl>, p_val_ligand_adapted <dbl>, scaled_lfc_ligand <dbl>, scaled_p_val_ligand <dbl>, scaled_lfc_pval_ligand <dbl>, scaled_p_val_ligand_adapted <dbl>,
-## # lfc_pval_receptor <dbl>, p_val_receptor_adapted <dbl>, scaled_lfc_receptor <dbl>, scaled_p_val_receptor <dbl>, scaled_lfc_pval_receptor <dbl>, scaled_p_val_receptor_adapted <dbl>, activity_up <dbl>,
-## # activity_scaled_up <dbl>, scaled_activity_scaled_up <dbl>, scaled_activity_up <dbl>, activity_down <dbl>, activity_scaled_down <dbl>, scaled_activity_scaled_down <dbl>, scaled_activity_down <dbl>,
-## # scaled_avg_exprs_ligand <dbl>, scaled_avg_frq_ligand <dbl>, pb_ligand_group <dbl>, scaled_pb_ligand <dbl>, scaled_avg_exprs_receptor <dbl>, scaled_avg_frq_receptor <dbl>, pb_receptor_group <dbl>,
-## # scaled_pb_receptor <dbl>, fraction_expressing_ligand_receptor <dbl>, max_scaled_activity <dbl>, na.rm <lgl>, prioritization_score <dbl>, top_group <chr>, and abbreviated variable names 1: receiver, 2: receptor,
-## # 3: lfc_ligand, 4: lfc_receptor, 5: ligand_receptor_lfc_avg, 6: p_val_ligand, 7: p_adj_ligand, 8: p_val_receptor, 9: p_adj_receptor, *: direction_regulation, *: activity_scaled, *: lr_interaction,
-## # *: avg_ligand_group, *: avg_receptor_group, *: ligand_receptor_prod_group, *: fraction_ligand_group, *: fraction_receptor_group, *: ligand_receptor_fraction_prod_group, *: rel_abundance_scaled_sender, ...
In multi-sample datasets, we have the opportunity to look whether -expression of ligand-receptor across all samples is correlated with the -expression of their by NicheNet predicted target genes. This is what we -will do with the following line of code:
-= lr_target_prior_cor_inference(prioritization_tables$group_prioritization_tbl$receiver %>% unique(), abundance_expression_info, celltype_de, grouping_tbl, prioritization_tables, ligand_target_matrix, logFC_threshold = logFC_threshold, p_val_threshold = p_val_threshold, p_val_adj = p_val_adj) lr_target_prior_cor
To avoid needing to redo the analysis later. All the output written -down here is sufficient to make all in-built downstream -visualizations.
-= "./"
- path
-= list(
- multinichenet_output celltype_info = abundance_expression_info$celltype_info,
- celltype_de = celltype_de,
- sender_receiver_info = abundance_expression_info$sender_receiver_info,
- sender_receiver_de = sender_receiver_de,
- ligand_activities_targets_DEgenes = ligand_activities_targets_DEgenes,
- prioritization_tables = prioritization_tables,
- grouping_tbl = grouping_tbl,
- lr_target_prior_cor = lr_target_prior_cor
-
- ) = make_lite_output(multinichenet_output)
- multinichenet_output
-= FALSE
- save if(save == TRUE){
-saveRDS(multinichenet_output, paste0(path, "multinichenet_output.rds"))
-
- }
In a first instance, we will look at the broad overview of -prioritized interactions via condition-specific Circos plots.
-We will look here at the top 50 predictions across all contrasts, -senders, and receivers of interest.
-= get_top_n_lr_pairs(multinichenet_output$prioritization_tables, 50, rank_per_group = FALSE) prioritized_tbl_oi_all
= multinichenet_output$prioritization_tables$group_prioritization_tbl %>%
- prioritized_tbl_oi filter(id %in% prioritized_tbl_oi_all$id) %>%
- distinct(id, sender, receiver, ligand, receptor, group) %>% left_join(prioritized_tbl_oi_all)
- $prioritization_score[is.na(prioritized_tbl_oi$prioritization_score)] = 0
- prioritized_tbl_oi
-= union(prioritized_tbl_oi$sender %>% unique(), prioritized_tbl_oi$receiver %>% unique()) %>% sort()
- senders_receivers
-= RColorBrewer::brewer.pal(n = length(senders_receivers), name = 'Spectral') %>% magrittr::set_names(senders_receivers)
- colors_sender = RColorBrewer::brewer.pal(n = length(senders_receivers), name = 'Spectral') %>% magrittr::set_names(senders_receivers)
- colors_receiver
-= make_circos_group_comparison(prioritized_tbl_oi, colors_sender, colors_receiver) circos_list
Now we will visualize per sample the scaled product of ligand and -receptor expression. Samples that were left out of the DE analysis are -indicated with a smaller dot (this helps to indicate the samples that -did not contribute to the calculation of the logFC, and thus not -contributed to the final prioritization)
-We will now check the top 50 interactions specific for the MIS-C -group
-= "M" group_oi
= get_top_n_lr_pairs(multinichenet_output$prioritization_tables, 50, groups_oi = group_oi)
- prioritized_tbl_oi_M_50
-= make_sample_lr_prod_activity_plots(multinichenet_output$prioritization_tables, prioritized_tbl_oi_M_50)
- plot_oi plot_oi
Typically, there are way more than 50 differentially expressed and -active ligand-receptor pairs per group across all sender-receiver -combinations. Therefore it might be useful to zoom in on specific cell -types as senders/receivers:
+prioritization_tables$group_prioritization_tbl %>% head(20)
+## # A tibble: 20 × 18
+## contrast group sender receiver ligand receptor lr_interaction id
+## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
+## 1 M-(S+A)/2 M M_Monocyte_CD16 L_NK_CD… HLA.E KLRC1 HLA.E_KLRC1 HLA.…
+## 2 M-(S+A)/2 M L_T_TIM3._CD38… M_Monoc… IFNG IFNGR1 IFNG_IFNGR1 IFNG…
+## 3 A-(S+M)/2 A M_Monocyte_CD16 L_T_TIM… LGALS3 LAG3 LGALS3_LAG3 LGAL…
+## 4 M-(S+A)/2 M L_T_TIM3._CD38… M_Monoc… IFNG IFNGR2 IFNG_IFNGR2 IFNG…
+## 5 M-(S+A)/2 M M_Monocyte_CD16 L_T_TIM… CXCL16 CXCR6 CXCL16_CXCR6 CXCL…
+## 6 A-(S+M)/2 A M_Monocyte_CD16 M_Monoc… VCAN TLR2 VCAN_TLR2 VCAN…
+## 7 M-(S+A)/2 M M_Monocyte_CD16 L_T_TIM… HLA.E CD8A HLA.E_CD8A HLA.…
+## 8 M-(S+A)/2 M M_Monocyte_CD16 M_Monoc… S100A9 CD68 S100A9_CD68 S100…
+## 9 M-(S+A)/2 M M_Monocyte_CD16 L_T_TIM… HLA.D… LAG3 HLA.DRA_LAG3 HLA.…
+## 10 M-(S+A)/2 M M_Monocyte_CD16 L_T_TIM… S100A8 CD69 S100A8_CD69 S100…
+## 11 M-(S+A)/2 M M_Monocyte_CD16 M_Monoc… HLA.F LILRB1 HLA.F_LILRB1 HLA.…
+## 12 M-(S+A)/2 M M_Monocyte_CD16 M_Monoc… TNF LTBR TNF_LTBR TNF_…
+## 13 A-(S+M)/2 A L_T_TIM3._CD38… L_T_TIM… LGALS… ITGB1 LGALS3BP_ITGB1 LGAL…
+## 14 A-(S+M)/2 A M_Monocyte_CD16 M_Monoc… S100A8 CD36 S100A8_CD36 S100…
+## 15 M-(S+A)/2 M M_Monocyte_CD16 M_Monoc… HLA.F LILRB2 HLA.F_LILRB2 HLA.…
+## 16 M-(S+A)/2 M M_Monocyte_CD16 L_T_TIM… HLA.A CD8A HLA.A_CD8A HLA.…
+## 17 M-(S+A)/2 M M_Monocyte_CD16 L_NK_CD… HLA.C KIR2DL1 HLA.C_KIR2DL1 HLA.…
+## 18 S-(M+A)/2 S L_T_TIM3._CD38… L_T_TIM… CD99 CD81 CD99_CD81 CD99…
+## 19 S-(M+A)/2 S L_NK_CD56._CD1… M_Monoc… TGFB1 ENG TGFB1_ENG TGFB…
+## 20 S-(M+A)/2 S M_Monocyte_CD16 M_Monoc… TGFB1 ENG TGFB1_ENG TGFB…
+## # ℹ 10 more variables: scaled_lfc_ligand <dbl>,
+## # scaled_p_val_ligand_adapted <dbl>, scaled_lfc_receptor <dbl>,
+## # scaled_p_val_receptor_adapted <dbl>, max_scaled_activity <dbl>,
+## # scaled_pb_ligand <dbl>, scaled_pb_receptor <dbl>,
+## # fraction_expressing_ligand_receptor <dbl>, prioritization_score <dbl>,
+## # top_group <chr>
+This table gives the final prioritization score of each interaction, and the values of the individual prioritization criteria.
+With this step, all required steps are finished. Now, we can optionally still run the following steps +* Calculate the across-samples expression correlation between ligand-receptor pairs and target genes +* Prioritize communication patterns involving condition-specific cell types through an alternative prioritization scheme
+Here we will only focus on the expression correlation step:
+In multi-sample datasets, we have the opportunity to look whether expression of ligand-receptor across all samples is correlated with the expression of their by NicheNet predicted target genes. This is what we will do with the following line of code:
+lr_target_prior_cor = lr_target_prior_cor_inference(
+ receivers_oi = prioritization_tables$group_prioritization_tbl$receiver %>% unique(),
+ abundance_expression_info = abundance_expression_info,
+ celltype_de = celltype_de,
+ grouping_tbl = grouping_tbl,
+ prioritization_tables = prioritization_tables,
+ ligand_target_matrix = ligand_target_matrix,
+ logFC_threshold = logFC_threshold,
+ p_val_threshold = p_val_threshold,
+ p_val_adj = p_val_adj
+ )
+To avoid needing to redo the analysis later, we will here to save an output object that contains all information to perform all downstream analyses.
+path = "./"
+
+multinichenet_output = list(
+ celltype_info = abundance_expression_info$celltype_info,
+ celltype_de = celltype_de,
+ sender_receiver_info = abundance_expression_info$sender_receiver_info,
+ sender_receiver_de = sender_receiver_de,
+ ligand_activities_targets_DEgenes = ligand_activities_targets_DEgenes,
+ prioritization_tables = prioritization_tables,
+ grouping_tbl = grouping_tbl,
+ lr_target_prior_cor = lr_target_prior_cor
+ )
+multinichenet_output = make_lite_output(multinichenet_output)
+
+save = FALSE
+if(save == TRUE){
+ saveRDS(multinichenet_output, paste0(path, "multinichenet_output.rds"))
+
+}
+In a first instance, we will look at the broad overview of prioritized interactions via condition-specific Chordiagram circos plots.
+We will look here at the top 50 predictions across all contrasts, senders, and receivers of interest.
+prioritized_tbl_oi_all = get_top_n_lr_pairs(
+ multinichenet_output$prioritization_tables,
+ top_n = 50,
+ rank_per_group = FALSE
+ )
+prioritized_tbl_oi =
+ multinichenet_output$prioritization_tables$group_prioritization_tbl %>%
+ filter(id %in% prioritized_tbl_oi_all$id) %>%
+ distinct(id, sender, receiver, ligand, receptor, group) %>%
+ left_join(prioritized_tbl_oi_all)
+prioritized_tbl_oi$prioritization_score[is.na(prioritized_tbl_oi$prioritization_score)] = 0
+
+senders_receivers = union(prioritized_tbl_oi$sender %>% unique(), prioritized_tbl_oi$receiver %>% unique()) %>% sort()
+
+colors_sender = RColorBrewer::brewer.pal(n = length(senders_receivers), name = 'Spectral') %>% magrittr::set_names(senders_receivers)
+colors_receiver = RColorBrewer::brewer.pal(n = length(senders_receivers), name = 'Spectral') %>% magrittr::set_names(senders_receivers)
+
+circos_list = make_circos_group_comparison(prioritized_tbl_oi, colors_sender, colors_receiver)
+
+Whereas these ChordDiagrams show the most specific interactions per group, they don’t give insights into the data behind these predictions. Therefore we will now look at visualizations that indicate the different prioritization criteria used in MultiNicheNet.
+In the next type of plots, we will 1) visualize the per-sample scaled product of normalized ligand and receptor pseudobulk expression, 2) visualize the scaled ligand activities, 3) cell-type specificity.
+We will now check the top 50 interactions specific for the MIS-C group
+group_oi = "M"
+prioritized_tbl_oi_M_50 = get_top_n_lr_pairs(
+ multinichenet_output$prioritization_tables,
+ top_n = 50,
+ groups_oi = group_oi)
+plot_oi = make_sample_lr_prod_activity_plots(
+ multinichenet_output$prioritization_tables,
+ prioritized_tbl_oi_M_50)
+plot_oi
++Samples that were left out of the DE analysis are indicated with a smaller dot (this helps to indicate the samples that did not contribute to the calculation of the logFC, and thus not contributed to the final prioritization)
+As a further help for further prioritization, we can assess the level of curation of these LR pairs as defined by the Intercellular Communication part of the Omnipath database
+prioritized_tbl_oi_M_50_omnipath = prioritized_tbl_oi_M_50 %>%
+ inner_join(lr_network_all)
+Now we add this to the bubble plot visualization:
+plot_oi = make_sample_lr_prod_activity_plots_Omnipath(
+ multinichenet_output$prioritization_tables,
+ prioritized_tbl_oi_M_50_omnipath)
+plot_oi
+
+As you can see, the HEBP1-FPR2 interaction has no Omnipath DB scores. This is because this LR pair was not documented by the Omnipath LR database. Instead it was documented by the original NicheNet LR network (source: Guide2Pharmacology) as can be seen in the table.
+Further note: Typically, there are way more than 50 differentially expressed and active ligand-receptor pairs per group across all sender-receiver combinations. Therefore it might be useful to zoom in on specific cell types as senders/receivers:
Eg M_Monocyte_CD16 as receiver:
-= get_top_n_lr_pairs(multinichenet_output$prioritization_tables, 50, groups_oi = group_oi, receivers_oi = "M_Monocyte_CD16")
- prioritized_tbl_oi_M_50
-= make_sample_lr_prod_activity_plots(multinichenet_output$prioritization_tables, prioritized_tbl_oi_M_50)
- plot_oi plot_oi
prioritized_tbl_oi_M_50 = get_top_n_lr_pairs(
+ multinichenet_output$prioritization_tables,
+ 50,
+ groups_oi = group_oi,
+ receivers_oi = "M_Monocyte_CD16")
+plot_oi = make_sample_lr_prod_activity_plots_Omnipath(
+ multinichenet_output$prioritization_tables,
+ prioritized_tbl_oi_M_50 %>% inner_join(lr_network_all))
+plot_oi
+
Eg M_Monocyte_CD16 as sender:
-= get_top_n_lr_pairs(multinichenet_output$prioritization_tables, 50, groups_oi = group_oi, senders_oi = "M_Monocyte_CD16")
- prioritized_tbl_oi_M_50
-= make_sample_lr_prod_activity_plots(multinichenet_output$prioritization_tables, prioritized_tbl_oi_M_50)
- plot_oi plot_oi
You can make these plots also for the other groups, like we will -illustrate now for the S group
-= "S" group_oi
= get_top_n_lr_pairs(multinichenet_output$prioritization_tables, 50, groups_oi = group_oi)
- prioritized_tbl_oi_S_50
-= make_sample_lr_prod_activity_plots(multinichenet_output$prioritization_tables, prioritized_tbl_oi_S_50)
- plot_oi plot_oi
Note: Use
-make_sample_lr_prod_activity_batch_plots
if you have
-batches and want to visualize them on this plot!
In another type of plot, we can visualize the ligand activities for a -group-receiver combination, and show the predicted ligand-target links, -and also the expression of the predicted target genes across -samples.
-For this, we now need to define a receiver cell type of interest. As
-example, we will take L_T_TIM3._CD38._HLADR.
cells as
-receiver, and look at the top 20 senderLigand-receiverReceptor pairs
-with these cells as receiver.
= "M"
- group_oi = "L_T_TIM3._CD38._HLADR."
- receiver_oi = get_top_n_lr_pairs(multinichenet_output$prioritization_tables, 10, groups_oi = group_oi, receivers_oi = receiver_oi) prioritized_tbl_oi_M_10
= make_ligand_activity_target_plot(group_oi, receiver_oi, prioritized_tbl_oi_M_10, multinichenet_output$prioritization_tables, multinichenet_output$ligand_activities_targets_DEgenes, contrast_tbl, multinichenet_output$grouping_tbl, multinichenet_output$celltype_info, ligand_target_matrix, plot_legend = FALSE)
- combined_plot
- combined_plot## $combined_plot
prioritized_tbl_oi_M_50 = get_top_n_lr_pairs(
+ multinichenet_output$prioritization_tables,
+ 50,
+ groups_oi = group_oi,
+ senders_oi = "M_Monocyte_CD16")
+plot_oi = make_sample_lr_prod_activity_plots_Omnipath(
+ multinichenet_output$prioritization_tables,
+ prioritized_tbl_oi_M_50 %>% inner_join(lr_network_all))
+plot_oi
+
+You can make these plots also for the other groups, like we will illustrate now for the S group
+group_oi = "S"
+prioritized_tbl_oi_S_50 = get_top_n_lr_pairs(
+ multinichenet_output$prioritization_tables,
+ 50,
+ groups_oi = group_oi)
+
+plot_oi = make_sample_lr_prod_activity_plots_Omnipath(
+ multinichenet_output$prioritization_tables,
+ prioritized_tbl_oi_S_50 %>% inner_join(lr_network_all))
+plot_oi
+
+Note: Use make_sample_lr_prod_activity_batch_plots
if you have batches and want to visualize them on this plot!
In another type of plot, we can visualize the ligand activities for a group-receiver combination, and show the predicted ligand-target links, and also the expression of the predicted target genes across samples.
+For this, we now need to define a receiver cell type of interest. As example, we will take L_T_TIM3._CD38._HLADR.
cells as receiver, and look at the top 10 senderLigand-receiverReceptor pairs with these cells as receiver.
group_oi = "M"
+receiver_oi = "M_Monocyte_CD16"
+prioritized_tbl_oi_M_10 = get_top_n_lr_pairs(
+ multinichenet_output$prioritization_tables,
+ 10,
+ groups_oi = group_oi,
+ receivers_oi = receiver_oi)
+combined_plot = make_ligand_activity_target_plot(
+ group_oi,
+ receiver_oi,
+ prioritized_tbl_oi_M_10,
+ multinichenet_output$prioritization_tables,
+ multinichenet_output$ligand_activities_targets_DEgenes, contrast_tbl,
+ multinichenet_output$grouping_tbl,
+ multinichenet_output$celltype_info,
+ ligand_target_matrix,
+ plot_legend = FALSE)
+combined_plot
+## $combined_plot
+
+##
+## $legends
+
+What if there is a specific ligand you are interested in?
+group_oi = "M"
+receiver_oi = "M_Monocyte_CD16"
+ligands_oi = c("IFNG","IL15")
+prioritized_tbl_ligands_oi = get_top_n_lr_pairs(
+ multinichenet_output$prioritization_tables,
+ 10000,
+ groups_oi = group_oi,
+ receivers_oi = receiver_oi
+ ) %>% filter(ligand %in% ligands_oi) # ligands should still be in the output tables of course
+combined_plot = make_ligand_activity_target_plot(
+ group_oi,
+ receiver_oi,
+ prioritized_tbl_ligands_oi,
+ multinichenet_output$prioritization_tables,
+ multinichenet_output$ligand_activities_targets_DEgenes,
+ contrast_tbl,
+ multinichenet_output$grouping_tbl,
+ multinichenet_output$celltype_info,
+ ligand_target_matrix,
+ plot_legend = FALSE)
+combined_plot
+## $combined_plot
+
##
## $legends
-
-Note Use
-make_DEgene_dotplot_pseudobulk_batch
if you want to
-indicate the batch of each sample to the plot
Before, we had calculated the correlation in expression between -ligand-receptor pairs and DE genes. Now we will filter out correlated -ligand-receptor –> target links that both show high expression -correlation (spearman or activity > 0.50 in this example) and have -some prior knowledge to support their link.
-= "M"
- group_oi = "M_Monocyte_CD16"
- receiver_oi = multinichenet_output$lr_target_prior_cor %>% inner_join(ligand_activities_targets_DEgenes$ligand_activities %>% distinct(ligand, target, direction_regulation, contrast)) %>% inner_join(contrast_tbl) %>% filter(group == group_oi, receiver == receiver_oi)
- lr_target_prior_cor_filtered = lr_target_prior_cor_filtered %>% filter(direction_regulation == "up") %>% filter( (rank_of_target < top_n_target) & (pearson > 0.50 | spearman > 0.50))
- lr_target_prior_cor_filtered_up = lr_target_prior_cor_filtered %>% filter(direction_regulation == "down") %>% filter( (rank_of_target < top_n_target) & (pearson < -0.50 | spearman < -0.50)) # downregulation -- negative correlation
- lr_target_prior_cor_filtered_down = bind_rows(lr_target_prior_cor_filtered_up, lr_target_prior_cor_filtered_down) lr_target_prior_cor_filtered
Now we will visualize the top correlated target genes for the LR -pairs that are also in the top 50 LR pairs discriminating the groups -from each other:
-= get_top_n_lr_pairs(prioritization_tables, 50, groups_oi = group_oi, receivers_oi = receiver_oi) prioritized_tbl_oi
= make_lr_target_correlation_plot(multinichenet_output$prioritization_tables, prioritized_tbl_oi, lr_target_prior_cor_filtered , multinichenet_output$grouping_tbl, multinichenet_output$celltype_info, receiver_oi,plot_legend = FALSE)
- lr_target_correlation_plot $combined_plot lr_target_correlation_plot
You can also visualize the expression correlation in the following -way for a selected LR pair and their targets:
-= "IFNG"
- ligand_oi = "IFNGR2"
- receptor_oi = "L_T_TIM3._CD38._HLADR."
- sender_oi = "M_Monocyte_CD16"
- receiver_oi = make_lr_target_scatter_plot(multinichenet_output$prioritization_tables, ligand_oi, receptor_oi, sender_oi, receiver_oi, multinichenet_output$celltype_info, multinichenet_output$grouping_tbl, lr_target_prior_cor_filtered)
- lr_target_scatter_plot lr_target_scatter_plot
In the plots before, we demonstrated that some DE genes have both -expression correlation and prior knowledge support to be downstream of -ligand-receptor pairs. Interestingly, some target genes can be ligands -or receptors themselves. This illustrates that cells can send signals to -other cells, who as a response to these signals produce signals -themselves to feedback to the original sender cells, or who will effect -other cell types.
-As last plot, we can generate a ‘systems’ view of these intercellular -feedback and cascade processes than can be occuring between the -different cell populations involved. In this plot, we will draw links -between ligands of sender cell types their ligand/receptor-annotated -target genes in receiver cell types. So links are ligand-target links (= -gene regulatory links) and not ligand-receptor protein-protein -interactions!
-= get_top_n_lr_pairs(prioritization_tables, 150, rank_per_group = FALSE)
- prioritized_tbl_oi
-= prioritization_tables$group_prioritization_tbl$group %>% unique() %>% lapply(function(group_oi){
- lr_target_prior_cor_filtered = multinichenet_output$lr_target_prior_cor %>% inner_join(ligand_activities_targets_DEgenes$ligand_activities %>% distinct(ligand, target, direction_regulation, contrast)) %>% inner_join(contrast_tbl) %>% filter(group == group_oi)
- lr_target_prior_cor_filtered = lr_target_prior_cor_filtered %>% filter(direction_regulation == "up") %>% filter( (rank_of_target < top_n_target) & (pearson > 0.50 | spearman > 0.50))
- lr_target_prior_cor_filtered_up = lr_target_prior_cor_filtered %>% filter(direction_regulation == "down") %>% filter( (rank_of_target < top_n_target) & (pearson < -0.50 | spearman < -0.50))
- lr_target_prior_cor_filtered_down = bind_rows(lr_target_prior_cor_filtered_up, lr_target_prior_cor_filtered_down)
- lr_target_prior_cor_filtered %>% bind_rows() })
"L_T_TIM3._CD38._HLADR."] = "pink" # the original yellow background color with white font is not very readable
- colors_sender[= make_ggraph_ligand_target_links(lr_target_prior_cor_filtered = lr_target_prior_cor_filtered, prioritized_tbl_oi = prioritized_tbl_oi, colors = colors_sender)
- graph_plot $plot graph_plot
$source_df_lt %>% head()
- graph_plot## # A tibble: 6 x 6
-## sender receiver direction_regulation group type weight
-## <chr> <chr> <fct> <chr> <chr> <dbl>
-## 1 M_Monocyte_CD16_S100A8 L_T_TIM3._CD38._HLADR._TNFSF10 up A Ligand-Target 1
-## 2 M_Monocyte_CD16_LGALS3 L_T_TIM3._CD38._HLADR._TNFSF10 up A Ligand-Target 1
-## 3 L_T_TIM3._CD38._HLADR._LGALS3BP L_T_TIM3._CD38._HLADR._TNFSF10 up A Ligand-Target 1
-## 4 L_T_TIM3._CD38._HLADR._CD40LG L_T_TIM3._CD38._HLADR._TNFSF10 up A Ligand-Target 1
-## 5 M_Monocyte_CD16_ADAM10 L_T_TIM3._CD38._HLADR._TNFSF10 up A Ligand-Target 1
-## 6 L_NK_CD56._CD16._ADAM10 L_T_TIM3._CD38._HLADR._TNFSF10 up A Ligand-Target 1
-$nodes_df %>% head()
- graph_plot## node celltype gene type_gene
-## M_Monocyte_CD16_S100A8 M_Monocyte_CD16_S100A8 M_Monocyte_CD16 S100A8 ligand
-## M_Monocyte_CD16_LGALS3 M_Monocyte_CD16_LGALS3 M_Monocyte_CD16 LGALS3 ligand
-## L_T_TIM3._CD38._HLADR._LGALS3BP L_T_TIM3._CD38._HLADR._LGALS3BP L_T_TIM3._CD38._HLADR. LGALS3BP ligand
-## L_T_TIM3._CD38._HLADR._CD40LG L_T_TIM3._CD38._HLADR._CD40LG L_T_TIM3._CD38._HLADR. CD40LG ligand
-## M_Monocyte_CD16_ADAM10 M_Monocyte_CD16_ADAM10 M_Monocyte_CD16 ADAM10 ligand
-## L_NK_CD56._CD16._ADAM10 L_NK_CD56._CD16._ADAM10 L_NK_CD56._CD16. ADAM10 ligand
In the previous plots, target genes were shown that are predicted as target gene of ligands based on prior knowledge. However, we can use the multi-sample nature of this data to filter target genes based on expression correlation between the upstream ligand-receptor pair and the downstream target gene. We will filter out correlated ligand-receptor –> target links that both show high expression correlation (spearman or pearson correlation > 0.50 in this example) and have some prior knowledge to support their link. Note that you can only make these visualization if you ran step 7 of the core MultiNicheNet analysis.
+group_oi = "M"
+receiver_oi = "M_Monocyte_CD16"
+lr_target_prior_cor_filtered = multinichenet_output$lr_target_prior_cor %>%
+ inner_join(
+ multinichenet_output$ligand_activities_targets_DEgenes$ligand_activities %>%
+ distinct(ligand, target, direction_regulation, contrast)
+ ) %>%
+ inner_join(contrast_tbl) %>% filter(group == group_oi, receiver == receiver_oi)
+
+lr_target_prior_cor_filtered_up = lr_target_prior_cor_filtered %>%
+ filter(direction_regulation == "up") %>%
+ filter( (rank_of_target < top_n_target) & (pearson > 0.50 | spearman > 0.50))
+lr_target_prior_cor_filtered_down = lr_target_prior_cor_filtered %>%
+ filter(direction_regulation == "down") %>%
+ filter( (rank_of_target < top_n_target) & (pearson < -0.50 | spearman < -0.50)) # downregulation -- negative correlation
+lr_target_prior_cor_filtered = bind_rows(
+ lr_target_prior_cor_filtered_up,
+ lr_target_prior_cor_filtered_down)
+Now we will visualize the top correlated target genes for the LR pairs that are also in the top 50 LR pairs discriminating the groups from each other:
+prioritized_tbl_oi = get_top_n_lr_pairs(
+ multinichenet_output$prioritization_tables,
+ 50,
+ groups_oi = group_oi,
+ receivers_oi = receiver_oi)
+lr_target_correlation_plot = make_lr_target_correlation_plot(
+ multinichenet_output$prioritization_tables,
+ prioritized_tbl_oi,
+ lr_target_prior_cor_filtered ,
+ multinichenet_output$grouping_tbl,
+ multinichenet_output$celltype_info,
+ receiver_oi,
+ plot_legend = FALSE)
+lr_target_correlation_plot$combined_plot
+
+You can also visualize the expression correlation in the following way for a selected LR pair and their targets:
+ligand_oi = "IFNG"
+receptor_oi = "IFNGR2"
+sender_oi = "L_T_TIM3._CD38._HLADR."
+receiver_oi = "M_Monocyte_CD16"
+lr_target_scatter_plot = make_lr_target_scatter_plot(
+ multinichenet_output$prioritization_tables,
+ ligand_oi, receptor_oi, sender_oi, receiver_oi,
+ multinichenet_output$celltype_info,
+ multinichenet_output$grouping_tbl,
+ lr_target_prior_cor_filtered)
+lr_target_scatter_plot
+
+In the plots before, we demonstrated that some DE genes have both expression correlation and prior knowledge support to be downstream of ligand-receptor pairs. Interestingly, some target genes can be ligands or receptors themselves. This illustrates that cells can send signals to other cells, who as a response to these signals produce signals themselves to feedback to the original sender cells, or who will effect other cell types.
+As last plot, we can generate a ‘systems’ view of these intercellular feedback and cascade processes than can be occuring between the different cell populations involved. In this plot, we will draw links between ligands of sender cell types their ligand/receptor-annotated target genes in receiver cell types. So links are ligand-target links (= gene regulatory links) and not ligand-receptor protein-protein interactions! We will infer this intercellular regulatory network here for the top250 interactions.
+prioritized_tbl_oi = get_top_n_lr_pairs(
+ multinichenet_output$prioritization_tables,
+ 250,
+ rank_per_group = FALSE)
+
+lr_target_prior_cor_filtered =
+ multinichenet_output$prioritization_tables$group_prioritization_tbl$group %>% unique() %>%
+ lapply(function(group_oi){
+ lr_target_prior_cor_filtered = multinichenet_output$lr_target_prior_cor %>%
+ inner_join(
+ multinichenet_output$ligand_activities_targets_DEgenes$ligand_activities %>%
+ distinct(ligand, target, direction_regulation, contrast)
+ ) %>%
+ inner_join(contrast_tbl) %>% filter(group == group_oi)
+
+ lr_target_prior_cor_filtered_up = lr_target_prior_cor_filtered %>%
+ filter(direction_regulation == "up") %>%
+ filter( (rank_of_target < top_n_target) & (pearson > 0.50 | spearman > 0.50))
+
+ lr_target_prior_cor_filtered_down = lr_target_prior_cor_filtered %>%
+ filter(direction_regulation == "down") %>%
+ filter( (rank_of_target < top_n_target) & (pearson < -0.50 | spearman < -0.50))
+ lr_target_prior_cor_filtered = bind_rows(
+ lr_target_prior_cor_filtered_up,
+ lr_target_prior_cor_filtered_down
+ )
+}) %>% bind_rows()
+
+lr_target_df = lr_target_prior_cor_filtered %>%
+ distinct(group, sender, receiver, ligand, receptor, id, target, direction_regulation)
+network = infer_intercellular_regulatory_network(lr_target_df, prioritized_tbl_oi)
+network$links %>% head()
+## # A tibble: 6 × 6
+## sender_ligand receiver_target direction_regulation group type weight
+## <chr> <chr> <fct> <chr> <chr> <dbl>
+## 1 L_T_TIM3._CD38._HLADR… L_NK_CD56._CD1… up M Liga… 1
+## 2 M_Monocyte_CD16_ICAM1 L_NK_CD56._CD1… up M Liga… 1
+## 3 M_Monocyte_CD16_B2M M_Monocyte_CD1… up M Liga… 1
+## 4 M_Monocyte_CD16_B2M M_Monocyte_CD1… up M Liga… 1
+## 5 M_Monocyte_CD16_HLA.A M_Monocyte_CD1… up M Liga… 1
+## 6 M_Monocyte_CD16_HLA.A M_Monocyte_CD1… up M Liga… 1
+network$nodes %>% head()
+## # A tibble: 6 × 4
+## node celltype gene type_gene
+## <chr> <chr> <chr> <chr>
+## 1 L_T_TIM3._CD38._HLADR._ICAM3 L_T_TIM3._CD38._HLADR. ICAM3 ligand/receptor
+## 2 M_Monocyte_CD16_LILRB2 M_Monocyte_CD16 LILRB2 ligand/receptor
+## 3 M_Monocyte_CD16_ICAM3 M_Monocyte_CD16 ICAM3 ligand/receptor
+## 4 M_Monocyte_CD16_ITGB2 M_Monocyte_CD16 ITGB2 ligand/receptor
+## 5 M_Monocyte_CD16_SIGLEC9 M_Monocyte_CD16 SIGLEC9 ligand/receptor
+## 6 M_Monocyte_CD16_ENG M_Monocyte_CD16 ENG ligand/receptor
+colors_sender["L_T_TIM3._CD38._HLADR."] = "pink" # the original yellow background with white font is not very readable
+network_graph = visualize_network(network, colors_sender)
+network_graph$plot
+
+Interestingly, we can also use this network to further prioritize differential CCC interactions. Here we will assume that the most important LR interactions are the ones that are involved in this intercellular regulatory network. We can get these interactions as follows:
+network$prioritized_lr_interactions
+## # A tibble: 117 × 5
+## group sender receiver ligand receptor
+## <chr> <chr> <chr> <chr> <chr>
+## 1 M L_T_TIM3._CD38._HLADR. L_NK_CD56._CD16. CLEC2B KLRB1
+## 2 M M_Monocyte_CD16 L_NK_CD56._CD16. ICAM1 ITGAX
+## 3 M M_Monocyte_CD16 M_Monocyte_CD16 B2M LILRB1
+## 4 M M_Monocyte_CD16 M_Monocyte_CD16 HLA.A LILRB1
+## 5 M M_Monocyte_CD16 M_Monocyte_CD16 HLA.B LILRB1
+## 6 M M_Monocyte_CD16 M_Monocyte_CD16 HLA.C LILRB1
+## 7 M L_T_TIM3._CD38._HLADR. M_Monocyte_CD16 HLA.A LILRB1
+## 8 M M_Monocyte_CD16 M_Monocyte_CD16 HLA.A LILRB2
+## 9 M M_Monocyte_CD16 M_Monocyte_CD16 HLA.C LILRB2
+## 10 M M_Monocyte_CD16 M_Monocyte_CD16 HLA.DRA CD63
+## # ℹ 107 more rows
+prioritized_tbl_oi_network = prioritized_tbl_oi %>% inner_join(
+ network$prioritized_lr_interactions)
+prioritized_tbl_oi_network
+## # A tibble: 117 × 8
+## group sender receiver ligand receptor id prioritization_score
+## <chr> <chr> <chr> <chr> <chr> <chr> <dbl>
+## 1 M M_Monocyte_CD16 L_NK_CD… HLA.E KLRC1 HLA.… 0.956
+## 2 M L_T_TIM3._CD38._HL… M_Monoc… IFNG IFNGR1 IFNG… 0.948
+## 3 M L_T_TIM3._CD38._HL… M_Monoc… IFNG IFNGR2 IFNG… 0.928
+## 4 M M_Monocyte_CD16 L_T_TIM… CXCL16 CXCR6 CXCL… 0.922
+## 5 M M_Monocyte_CD16 L_T_TIM… HLA.E CD8A HLA.… 0.919
+## 6 M M_Monocyte_CD16 L_T_TIM… HLA.D… LAG3 HLA.… 0.910
+## 7 M M_Monocyte_CD16 L_T_TIM… S100A8 CD69 S100… 0.907
+## 8 M M_Monocyte_CD16 M_Monoc… HLA.F LILRB1 HLA.… 0.903
+## 9 M M_Monocyte_CD16 M_Monoc… TNF LTBR TNF_… 0.902
+## 10 M M_Monocyte_CD16 M_Monoc… HLA.F LILRB2 HLA.… 0.898
+## # ℹ 107 more rows
+## # ℹ 1 more variable: prioritization_rank <dbl>
+Visualize now the expression and activity of these interactions for the M group
+group_oi = "M"
+prioritized_tbl_oi_M = prioritized_tbl_oi_network %>% filter(group == group_oi)
+
+plot_oi = make_sample_lr_prod_activity_plots_Omnipath(
+ multinichenet_output$prioritization_tables,
+ prioritized_tbl_oi_M %>% inner_join(lr_network_all)
+ )
+plot_oi
+
+multinichenetr 2.0.0
+ +In this vignette, you can learn how to perform a sample-agnostic/cell-level MultiNicheNet analysis to compare cell-cell communication between conditions of interest. In this workflow, cells from the same condition will be pooled together similar to regular differential cell-cell communication workflows. We only recommend running this pipeline if you have less than 3 samples in each of the groups/conditions you want to compare. Do not run this workflow if you have more samples per condition.
+As input you need a SingleCellExperiment object containing at least the raw count matrix and metadata providing the following information for each cell: the group, sample and cell type.
+As example expression data of interacting cells, we will here use scRNAseq data of immune cells in MIS-C patients and healthy siblings from this paper of Hoste et al.: TIM3+ TRBV11-2 T cells and IFNγ signature in patrolling monocytes and CD16+ NK cells delineate MIS-C . +MIS-C (multisystem inflammatory syndrome in children) is a novel rare immunodysregulation syndrome that can arise after SARS-CoV-2 infection in children. We will use MultiNicheNet to explore immune cell crosstalk enriched in MIS-C compared to healthy siblings and adult COVID-19 patients.
+In this vignette, we will first prepare the MultiNicheNet core analysis, then run the several steps in the MultiNicheNet core analysis, and finally interpret the output.
+library(SingleCellExperiment)
+library(dplyr)
+library(ggplot2)
+library(nichenetr)
+library(multinichenetr)
+MultiNicheNet builds upon the NicheNet framework and uses the same prior knowledge networks (ligand-receptor network and ligand-target matrix, currently v2 version).
+The Nichenet v2 networks and matrices for both mouse and human can be downloaded from Zenodo .
+We will read these object in for human because our expression data is of human patients.
+Gene names are here made syntactically valid via make.names()
to avoid the loss of genes (eg H2-M3) in downstream visualizations.
organism = "human"
+options(timeout = 120)
+
+if(organism == "human"){
+
+ lr_network_all =
+ readRDS(url(
+ "https://zenodo.org/record/10229222/files/lr_network_human_allInfo_30112033.rds"
+ )) %>%
+ mutate(
+ ligand = convert_alias_to_symbols(ligand, organism = organism),
+ receptor = convert_alias_to_symbols(receptor, organism = organism))
+
+ lr_network_all = lr_network_all %>%
+ mutate(ligand = make.names(ligand), receptor = make.names(receptor))
+
+ lr_network = lr_network_all %>%
+ distinct(ligand, receptor)
+
+ ligand_target_matrix = readRDS(url(
+ "https://zenodo.org/record/7074291/files/ligand_target_matrix_nsga2r_final.rds"
+ ))
+
+ colnames(ligand_target_matrix) = colnames(ligand_target_matrix) %>%
+ convert_alias_to_symbols(organism = organism) %>% make.names()
+ rownames(ligand_target_matrix) = rownames(ligand_target_matrix) %>%
+ convert_alias_to_symbols(organism = organism) %>% make.names()
+
+ lr_network = lr_network %>% filter(ligand %in% colnames(ligand_target_matrix))
+ ligand_target_matrix = ligand_target_matrix[, lr_network$ligand %>% unique()]
+
+} else if(organism == "mouse"){
+
+ lr_network_all = readRDS(url(
+ "https://zenodo.org/record/10229222/files/lr_network_mouse_allInfo_30112033.rds"
+ )) %>%
+ mutate(
+ ligand = convert_alias_to_symbols(ligand, organism = organism),
+ receptor = convert_alias_to_symbols(receptor, organism = organism))
+
+ lr_network_all = lr_network_all %>%
+ mutate(ligand = make.names(ligand), receptor = make.names(receptor))
+ lr_network = lr_network_all %>%
+ distinct(ligand, receptor)
+
+ ligand_target_matrix = readRDS(url(
+ "https://zenodo.org/record/7074291/files/ligand_target_matrix_nsga2r_final_mouse.rds"
+ ))
+
+ colnames(ligand_target_matrix) = colnames(ligand_target_matrix) %>%
+ convert_alias_to_symbols(organism = organism) %>% make.names()
+ rownames(ligand_target_matrix) = rownames(ligand_target_matrix) %>%
+ convert_alias_to_symbols(organism = organism) %>% make.names()
+
+ lr_network = lr_network %>% filter(ligand %in% colnames(ligand_target_matrix))
+ ligand_target_matrix = ligand_target_matrix[, lr_network$ligand %>% unique()]
+
+}
+In this vignette, we will load in a subset of the scRNAseq data of the MIS-C . For the sake of demonstration, this subset only contains 3 cell types. These celltypes are some of the cell types that were found to be most interesting related to MIS-C according to Hoste et al.
+If you start from a Seurat object, you can convert it easily to a SingleCellExperiment object via sce = Seurat::as.SingleCellExperiment(seurat_obj, assay = "RNA")
.
Because the NicheNet 2.0. networks are in the most recent version of the official gene symbols, we will make sure that the gene symbols used in the expression data are also updated (= converted from their “aliases” to official gene symbols). Afterwards, we will make them again syntactically valid.
+sce = readRDS(url(
+ "https://zenodo.org/record/8010790/files/sce_subset_misc.rds"
+ ))
+sce = alias_to_symbol_SCE(sce, "human") %>% makenames_SCE()
+This dataset does not yet contain normalized counts:
+sce = scuttle::logNormCounts(sce)
+In this step, we will formalize our research question into MultiNicheNet input arguments.
+In this case study, we want to study differences in cell-cell communication patterns between MIS-C patients (M), their healthy siblings (S) and adult patients with severe covid (A). The meta data columns that indicate this disease status(=group/condition of interest) is MIS.C.AgeTier
.
Cell type annotations are indicated in the Annotation_v2.0
column, and the sample is indicated by the ShortID
column.
+If your cells are annotated in multiple hierarchical levels, we recommend using a relatively high level in the hierarchy because MultiNicheNet focuses on differential expression and not differential abundance
group_id = "MIS.C.AgeTier"
+sample_id = "ShortID"
+celltype_id = "Annotation_v2.0"
+In the sammple-agnostic / cell-level worklow, it is not possible to correct for batch effects or covariates. Therefore, you here have to use the following NA settings:
+covariates = NA
+batches = NA
+Important: The column names of group, sample, and cell type should be syntactically valid (make.names
)
Important: All group, sample, and cell type names should be syntactically valid as well (make.names
) (eg through SummarizedExperiment::colData(sce)$ShortID = SummarizedExperiment::colData(sce)$ShortID %>% make.names()
)
Here, we want to compare each patient group to the other groups, so the MIS-C (M) group vs healthy control siblings (S) and adult COVID19 patients (A) (= M vs S+A) and so on. We want to know which cell-cell communication patterns are specific for the M vs A+S group, the A vs M+S group and the S vs A+M group.
+To perform this comparison, we need to set the following contrasts:
+contrasts_oi = c("'M-(S+A)/2','S-(M+A)/2','A-(S+M)/2'")
+Very Important Note the format to indicate the contrasts! This formatting should be adhered to very strictly, and white spaces are not allowed! Check ?get_DE_info
for explanation about how to define this well. The most important points are that:
+each contrast is surrounded by single quotation marks
+contrasts are separated by a comma without any white space
+*all contrasts together are surrounded by double quotation marks.
If you compare against two groups, you should divide by 2 (as demonstrated here), if you compare against three groups, you should divide by 3 and so on.
+For downstream visualizations and linking contrasts to their main condition, we also need to run the following: +This is necessary because we will also calculate cell-type+condition specificity of ligands and receptors.
+contrast_tbl = tibble(contrast =
+ c("M-(S+A)/2","S-(M+A)/2", "A-(S+M)/2"),
+ group = c("M","S","A"))
+If you want to compare only two groups (eg M vs S), you can use the following:
+contrasts_oi = c("'M-S','S-M'")
+contrast_tbl = tibble(contrast = c("M-S","S-M"), group = c("M","S"))
If you want to focus the analysis on specific cell types (e.g. because you know which cell types reside in the same microenvironments based on spatial data), you can define this here. If you have sufficient computational resources and no specific idea of cell-type colocalzations, we recommend to consider all cell types as potential senders and receivers. Later on during analysis of the output it is still possible to zoom in on the cell types that interest you most, but your analysis is not biased to them.
+Here we will consider all cell types in the data:
+senders_oi = SummarizedExperiment::colData(sce)[,celltype_id] %>% unique()
+receivers_oi = SummarizedExperiment::colData(sce)[,celltype_id] %>% unique()
+sce = sce[, SummarizedExperiment::colData(sce)[,celltype_id] %in%
+ c(senders_oi, receivers_oi)
+ ]
+In case you would have samples in your data that do not belong to one of the groups/conditions of interest, we recommend removing them and only keeping conditions of interst:
+conditions_keep = c("M", "S", "A")
+sce = sce[, SummarizedExperiment::colData(sce)[,group_id] %in%
+ conditions_keep
+ ]
+Now we will run the core of a MultiNicheNet analysis. This analysis consists of the following steps:
+Following these steps, one can optionally +* 7. Calculate the across-samples expression correlation between ligand-receptor pairs and target genes +* 8. Prioritize communication patterns involving condition-specific cell types through an alternative prioritization scheme
+After these steps, the output can be further explored as we will demonstrate in the “Downstream analysis of the MultiNicheNet output” section.
+In this vignette, we will demonstrate these steps one-by-one, which offers the most flexibility to the user to assess intermediary results. Other vignettes will demonstrate the use of the multi_nichenet_analysis
wrapper function.
In this step we will calculate and visualize cell type abundances. This will give an indication about which cell types will be retained in the analysis, and which cell types will be filtered out.
+In the following analysis we will a required minimum number of cells per cell type per condition at 10. Conditions that have less than min_cells
cells will be excluded from the analysis for that specific cell type.
min_cells = 10
+abundance_info = get_abundance_info(
+ sce = sce,
+ sample_id = group_id, group_id = group_id, celltype_id = celltype_id,
+ min_cells = min_cells,
+ senders_oi = senders_oi, receivers_oi = receivers_oi,
+ batches = batches
+ )
+First, we will check the cell type abundance diagnostic plots.
+The first plot visualizes the number of cells per celltype-condition combination, and indicates which combinations are removed during the DE analysis because there are less than min_cells
in the celltype-condition combination.
abundance_info$abund_plot_sample
+
+The red dotted line indicates the required minimum of cells as defined above in min_cells
. We can see here that all cell types are present in all conditions.
In case this plot would indicate that not all cell types are present in all conditions: +running the following block of code can help you determine which cell types are condition-specific and which cell types are absent.
+sample_group_celltype_df = abundance_info$abundance_data %>%
+ filter(n > min_cells) %>%
+ ungroup() %>%
+ distinct(sample_id, group_id) %>%
+ cross_join(
+ abundance_info$abundance_data %>%
+ ungroup() %>%
+ distinct(celltype_id)
+ ) %>%
+ arrange(sample_id)
+
+abundance_df = sample_group_celltype_df %>% left_join(
+ abundance_info$abundance_data %>% ungroup()
+ )
+
+abundance_df$n[is.na(abundance_df$n)] = 0
+abundance_df$keep[is.na(abundance_df$keep)] = FALSE
+abundance_df_summarized = abundance_df %>%
+ mutate(keep = as.logical(keep)) %>%
+ group_by(group_id, celltype_id) %>%
+ summarise(samples_present = sum((keep)))
+
+celltypes_absent_one_condition = abundance_df_summarized %>%
+ filter(samples_present == 0) %>% pull(celltype_id) %>% unique()
+# find truly condition-specific cell types by searching for cell types
+# truely absent in at least one condition
+
+celltypes_present_one_condition = abundance_df_summarized %>%
+ filter(samples_present >= 1) %>% pull(celltype_id) %>% unique()
+# require presence in at least 1 samples of one group so
+# it is really present in at least one condition
+
+condition_specific_celltypes = intersect(
+ celltypes_absent_one_condition,
+ celltypes_present_one_condition)
+
+total_nr_conditions = SummarizedExperiment::colData(sce)[,group_id] %>%
+ unique() %>% length()
+
+absent_celltypes = abundance_df_summarized %>%
+ filter(samples_present < 1) %>%
+ group_by(celltype_id) %>%
+ count() %>%
+ filter(n == total_nr_conditions) %>%
+ pull(celltype_id)
+
+print("condition-specific celltypes:")
+## [1] "condition-specific celltypes:"
+print(condition_specific_celltypes)
+## character(0)
+
+print("absent celltypes:")
+## [1] "absent celltypes:"
+print(absent_celltypes)
+## character(0)
+Absent cell types will be filtered out, condition-specific cell types can be filtered out if you as a user do not want to run the alternative workflow for condition-specific cell types at the end of the core MultiNicheNet analysis.
+analyse_condition_specific_celltypes = FALSE
+if(analyse_condition_specific_celltypes == TRUE){
+ senders_oi = senders_oi %>% setdiff(absent_celltypes)
+ receivers_oi = receivers_oi %>% setdiff(absent_celltypes)
+} else {
+ senders_oi = senders_oi %>%
+ setdiff(union(absent_celltypes, condition_specific_celltypes))
+ receivers_oi = receivers_oi %>%
+ setdiff(union(absent_celltypes, condition_specific_celltypes))
+}
+
+sce = sce[, SummarizedExperiment::colData(sce)[,celltype_id] %in%
+ c(senders_oi, receivers_oi)
+ ]
+Before running the DE analysis, we will determine which genes are not sufficiently expressed and should be filtered out.
+We will perform gene filtering based on a similar procedure as used in edgeR::filterByExpr
. However, we adapted this procedure to be more interpretable for single-cell datasets.
For each cell type, we will consider genes expressed if they are expressed in at least one condition. To do this, we need to set min_sample_prop = 1
.
min_sample_prop = 1
+But how do we define which genes are expressed in a condition? For this we will consider genes as expressed if they have non-zero expression values in a fraction_cutoff
fraction of cells of that cell type in that condition By default, we set fraction_cutoff = 0.05
, which means that genes should show non-zero expression values in at least 5% of cells in a condition
fraction_cutoff = 0.05
+We recommend using these default values unless there is specific interest in prioritizing (very) weakly expressed interactions. In that case, you could lower the value of fraction_cutoff
. We explicitly recommend against using fraction_cutoff > 0.10
.
Now we will calculate the information required for gene filtering with the following command:
+frq_list = get_frac_exprs_sampleAgnostic(
+ sce = sce,
+ sample_id = sample_id, celltype_id = celltype_id, group_id = group_id,
+ batches = batches,
+ min_cells = min_cells,
+ fraction_cutoff = fraction_cutoff, min_sample_prop = min_sample_prop)
+## # A tibble: 6 × 2
+## MIS.C.AgeTier Annotation_v2.0
+## <chr> <chr>
+## 1 M L_T_TIM3._CD38._HLADR.
+## 2 M L_T_TIM3._CD38._HLADR.
+## 3 M L_T_TIM3._CD38._HLADR.
+## 4 M L_T_TIM3._CD38._HLADR.
+## 5 M L_T_TIM3._CD38._HLADR.
+## 6 M L_T_TIM3._CD38._HLADR.
+## MIS.C.AgeTier MIS.C.AgeTier Annotation_v2.0
+## 1 M M L_T_TIM3._CD38._HLADR.
+## 2 M M L_T_TIM3._CD38._HLADR.
+## 3 M M L_T_TIM3._CD38._HLADR.
+## 4 M M L_T_TIM3._CD38._HLADR.
+## 5 M M L_T_TIM3._CD38._HLADR.
+## 6 M M L_T_TIM3._CD38._HLADR.
+## sample group celltype
+## 1 M M L_T_TIM3._CD38._HLADR.
+## 2 M M L_T_TIM3._CD38._HLADR.
+## 3 M M L_T_TIM3._CD38._HLADR.
+## 4 M M L_T_TIM3._CD38._HLADR.
+## 5 M M L_T_TIM3._CD38._HLADR.
+## 6 M M L_T_TIM3._CD38._HLADR.
+## # A tibble: 6 × 3
+## sample group celltype
+## <chr> <chr> <chr>
+## 1 M M L_T_TIM3._CD38._HLADR.
+## 2 M M L_T_TIM3._CD38._HLADR.
+## 3 M M L_T_TIM3._CD38._HLADR.
+## 4 M M L_T_TIM3._CD38._HLADR.
+## 5 M M L_T_TIM3._CD38._HLADR.
+## 6 M M L_T_TIM3._CD38._HLADR.
+## [1] "Groups are considered if they have more than 10 cells of the cell type of interest"
+## [1] "Genes with non-zero counts in at least 5% of cells of a cell type of interest in a particular group/condition will be considered as expressed in that group/condition"
+## [1] "Genes expressed in at least 1 group will considered as expressed in the cell type: L_NK_CD56._CD16."
+## [1] "Genes expressed in at least 1 group will considered as expressed in the cell type: L_T_TIM3._CD38._HLADR."
+## [1] "Genes expressed in at least 1 group will considered as expressed in the cell type: M_Monocyte_CD16"
+## [1] "5589 genes are considered as expressed in the cell type: L_NK_CD56._CD16."
+## [1] "7150 genes are considered as expressed in the cell type: L_T_TIM3._CD38._HLADR."
+## [1] "7794 genes are considered as expressed in the cell type: M_Monocyte_CD16"
+Now only keep genes that are expressed by at least one cell type:
+genes_oi = frq_list$expressed_df %>%
+ filter(expressed == TRUE) %>% pull(gene) %>% unique()
+sce = sce[genes_oi, ]
+After filtering out absent cell types and genes, we will continue the analysis by calculating the different prioritization criteria that we will use to prioritize cell-cell communication patterns.
+First, we will determine and normalize per-condition pseudobulk expression levels for each expressed gene in each present cell type. The function process_abundance_expression_info
will link this expression information for ligands of the sender cell types to the corresponding receptors of the receiver cell types. This will later on allow us to define the cell-type specicificy criteria for ligands and receptors.
abundance_expression_info = process_abundance_expression_info(
+ sce = sce,
+ sample_id = group_id, group_id = group_id, celltype_id = celltype_id,
+ min_cells = min_cells,
+ senders_oi = senders_oi, receivers_oi = receivers_oi,
+ lr_network = lr_network,
+ batches = batches,
+ frq_list = frq_list,
+ abundance_info = abundance_info)
+Normalized pseudobulk expression values per gene/celltype/condition can be inspected by:
+abundance_expression_info$celltype_info$pb_df %>% head()
+## # A tibble: 6 × 4
+## gene sample pb_sample celltype
+## <chr> <chr> <dbl> <fct>
+## 1 A1BG A 3.97 L_T_TIM3._CD38._HLADR.
+## 2 AAAS A 5.00 L_T_TIM3._CD38._HLADR.
+## 3 AAGAB A 4.77 L_T_TIM3._CD38._HLADR.
+## 4 AAK1 A 6.78 L_T_TIM3._CD38._HLADR.
+## 5 AAMDC A 4.11 L_T_TIM3._CD38._HLADR.
+## 6 AAMP A 5.84 L_T_TIM3._CD38._HLADR.
+Inspecting these values for ligand-receptor interactions can be done by:
+abundance_expression_info$sender_receiver_info$pb_df %>% head()
+## # A tibble: 6 × 8
+## sample sender receiver ligand receptor pb_ligand pb_receptor
+## <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
+## 1 M L_NK_CD56._CD16. M_Monocyt… B2M LILRB1 14.1 10.3
+## 2 M L_T_TIM3._CD38._HLADR. M_Monocyt… B2M LILRB1 13.9 10.3
+## 3 M M_Monocyte_CD16 M_Monocyt… B2M LILRB1 13.8 10.3
+## 4 A L_NK_CD56._CD16. L_NK_CD56… B2M KLRD1 14.4 9.73
+## 5 S L_NK_CD56._CD16. L_NK_CD56… B2M KLRD1 14.4 9.55
+## 6 M L_NK_CD56._CD16. L_NK_CD56… B2M KLRD1 14.1 9.62
+## # ℹ 1 more variable: ligand_receptor_pb_prod <dbl>
+In this step, we will perform genome-wide differential expression analysis of receiver and sender cell types to define DE genes between the conditions of interest (as formalized by the contrasts_oi
). Based on this analysis, we later can define the levels of differential expression of ligands in senders and receptors in receivers, and define the set of affected target genes in the receiver cell types (which will be used for the ligand activity analysis).
Because we don’t have several samples per condition, we cannot apply pseudobulking followed by EdgeR as done in the regular MultiNicheNet workflow. Instead, we will here perform a classic FindMarkers approach. This has as consequence that you cannot perform DE on multifactorial experimental designs. You can only compare one group vs other group(s).
+DE_info = get_DE_info_sampleAgnostic(
+ sce = sce,
+ group_id = group_id, celltype_id = celltype_id,
+ contrasts_oi = contrasts_oi,
+ expressed_df = frq_list$expressed_df,
+ min_cells = min_cells,
+ contrast_tbl = contrast_tbl)
+Check DE output information in table with logFC and p-values for each gene-celltype-contrast:
+celltype_de = DE_info$celltype_de_findmarkers
+celltype_de %>% arrange(-logFC) %>% head()
+## # A tibble: 6 × 6
+## gene cluster_id logFC p_val p_adj contrast
+## <chr> <chr> <dbl> <dbl> <dbl> <chr>
+## 1 TRBV11.2 L_T_TIM3._CD38._HLADR. 2.63 0 0 M-(S+A)/2
+## 2 MTRNR2L8 M_Monocyte_CD16 2.22 1.24e- 34 1.16e- 32 S-(M+A)/2
+## 3 IFI27 M_Monocyte_CD16 2.12 5.86e- 23 5.07e- 20 A-(S+M)/2
+## 4 NKG7 L_T_TIM3._CD38._HLADR. 2.02 3.16e- 87 2.51e- 84 M-(S+A)/2
+## 5 GZMB L_T_TIM3._CD38._HLADR. 1.88 1.59e-122 3.80e-119 M-(S+A)/2
+## 6 GNLY L_T_TIM3._CD38._HLADR. 1.67 6.38e- 50 2.07e- 47 M-(S+A)/2
+To end this step, we will combine the DE information of senders and receivers by linking their ligands and receptors together based on the prior knowledge ligand-receptor network.
+sender_receiver_de = combine_sender_receiver_de(
+ sender_de = celltype_de,
+ receiver_de = celltype_de,
+ senders_oi = senders_oi,
+ receivers_oi = receivers_oi,
+ lr_network = lr_network
+)
+sender_receiver_de %>% head(20)
+## # A tibble: 20 × 12
+## contrast sender receiver ligand receptor lfc_ligand lfc_receptor
+## <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
+## 1 M-(S+A)/2 M_Monocyte_CD16 M_Monoc… HLA.A LILRB1 0.880 1.32
+## 2 M-(S+A)/2 L_T_TIM3._CD38._H… M_Monoc… GZMB MCL1 1.88 0.184
+## 3 M-(S+A)/2 M_Monocyte_CD16 M_Monoc… TIMP1 CD63 1.02 1.03
+## 4 M-(S+A)/2 M_Monocyte_CD16 M_Monoc… B2M LILRB1 0.706 1.32
+## 5 A-(S+M)/2 M_Monocyte_CD16 M_Monoc… TIMP1 CD63 1.05 0.958
+## 6 M-(S+A)/2 L_T_TIM3._CD38._H… M_Monoc… HLA.C LILRB1 0.673 1.32
+## 7 M-(S+A)/2 L_T_TIM3._CD38._H… L_T_TIM… GZMB MCL1 1.88 0.0908
+## 8 M-(S+A)/2 L_T_TIM3._CD38._H… L_NK_CD… GZMB IGF2R 1.88 0.0769
+## 9 M-(S+A)/2 L_T_TIM3._CD38._H… M_Monoc… GZMB IGF2R 1.88 0.0723
+## 10 M-(S+A)/2 L_T_TIM3._CD38._H… M_Monoc… HLA.A LILRB1 0.628 1.32
+## 11 M-(S+A)/2 L_T_TIM3._CD38._H… L_NK_CD… GZMB MCL1 1.88 0.0618
+## 12 M-(S+A)/2 L_T_TIM3._CD38._H… L_T_TIM… GZMB IGF2R 1.88 0.0384
+## 13 M-(S+A)/2 M_Monocyte_CD16 M_Monoc… HLA.C LILRB1 0.520 1.32
+## 14 M-(S+A)/2 M_Monocyte_CD16 M_Monoc… HLA.B LILRB1 0.470 1.32
+## 15 M-(S+A)/2 M_Monocyte_CD16 M_Monoc… S100A9 CD68 1.11 0.675
+## 16 M-(S+A)/2 M_Monocyte_CD16 M_Monoc… HLA.A LILRB2 0.880 0.897
+## 17 M-(S+A)/2 M_Monocyte_CD16 L_T_TIM… S100A9 ITGB2 1.11 0.594
+## 18 M-(S+A)/2 M_Monocyte_CD16 M_Monoc… HLA.F LILRB1 0.347 1.32
+## 19 M-(S+A)/2 M_Monocyte_CD16 M_Monoc… S100A8 CD68 0.986 0.675
+## 20 M-(S+A)/2 M_Monocyte_CD16 M_Monoc… HLA.D… CD63 0.569 1.03
+## # ℹ 5 more variables: ligand_receptor_lfc_avg <dbl>, p_val_ligand <dbl>,
+## # p_adj_ligand <dbl>, p_val_receptor <dbl>, p_adj_receptor <dbl>
+In this step, we will predict NicheNet ligand activities and NicheNet ligand-target links based on these differential expression results. We do this to prioritize interactions based on their predicted effect on a receiver cell type. We will assume that the most important group-specific interactions are those that lead to group-specific gene expression changes in a receiver cell type.
+Similarly to base NicheNet (https://github.com/saeyslab/nichenetr), we use the DE output to create a “geneset of interest”: here we assume that DE genes within a cell type may be DE because of differential cell-cell communication processes. In the ligand activity prediction, we will assess the enrichment of target genes of ligands within this geneset of interest. In case high-probabiliy target genes of a ligand are enriched in this set compared to the background of expressed genes, we predict that this ligand may have a high activity.
+Because the ligand activity analysis is an enrichment procedure, it is important that this geneset of interest should contain a sufficient but not too large number of genes. The ratio geneset_oi/background should ideally be between 1/200 and 1/10 (or close to these ratios).
+To determine the genesets of interest based on DE output, we need to define some logFC and/or p-value thresholds per cell type/contrast combination. In general, we recommend inspecting the nr. of DE genes for all cell types based on the default thresholds and adapting accordingly.
+We will first inspect the geneset_oi-vs-background ratios for the default tresholds:
+logFC_threshold = 0.25 # lower here for FindMarkers than for Pseudobulk-EdgeR
+p_val_threshold = 0.05
+p_val_adj = TRUE
+geneset_assessment = contrast_tbl$contrast %>%
+ lapply(
+ process_geneset_data,
+ celltype_de, logFC_threshold, p_val_adj, p_val_threshold
+ ) %>%
+ bind_rows()
+geneset_assessment
+## # A tibble: 9 × 12
+## cluster_id n_background n_geneset_up n_geneset_down prop_geneset_up
+## <chr> <int> <int> <int> <dbl>
+## 1 L_NK_CD56._CD16. 5589 214 23 0.0383
+## 2 L_T_TIM3._CD38._HLAD… 7150 151 30 0.0211
+## 3 M_Monocyte_CD16 7794 435 36 0.0558
+## 4 L_NK_CD56._CD16. 5589 75 145 0.0134
+## 5 L_T_TIM3._CD38._HLAD… 7150 33 141 0.00462
+## 6 M_Monocyte_CD16 7794 81 352 0.0104
+## 7 L_NK_CD56._CD16. 5589 49 144 0.00877
+## 8 L_T_TIM3._CD38._HLAD… 7150 40 125 0.00559
+## 9 M_Monocyte_CD16 7794 85 441 0.0109
+## # ℹ 7 more variables: prop_geneset_down <dbl>, in_range_up <lgl>,
+## # in_range_down <lgl>, contrast <chr>, logFC_threshold <dbl>,
+## # p_val_threshold <dbl>, adjusted <lgl>
+We can see here that for most cell type / contrast combinations, all geneset/background ratio’s are within the recommended range (in_range_up
and in_range_down
columns). When these geneset/background ratio’s would not be within the recommended ranges, we should interpret ligand activity results for these cell types with more caution, or use different thresholds (for these or all cell types). Here, a few celltype-contrast combination are not in the recommended range for up- and or-down genes but they are close (recommend ranges between 0.005 and 0.1).
After the ligand activity prediction, we will also infer the predicted target genes of these ligands in each contrast. For this ligand-target inference procedure, we also need to select which top n of the predicted target genes will be considered (here: top 250 targets per ligand). This parameter will not affect the ligand activity predictions. It will only affect ligand-target visualizations and construction of the intercellular regulatory network during the downstream analysis. We recommend users to test other settings in case they would be interested in exploring fewer, but more confident target genes, or vice versa.
+top_n_target = 250
+The NicheNet ligand activity analysis can be run in parallel for each receiver cell type, by changing the number of cores as defined here. Using more cores will speed up the analysis at the cost of needing more memory. This is only recommended if you have many receiver cell types of interest.
+verbose = TRUE
+cores_system = 8
+n.cores = min(cores_system, celltype_de$cluster_id %>% unique() %>% length())
+Running the ligand activity prediction will take some time (the more cell types and contrasts, the more time)
+ligand_activities_targets_DEgenes = suppressMessages(suppressWarnings(
+ get_ligand_activities_targets_DEgenes(
+ receiver_de = celltype_de,
+ receivers_oi = intersect(receivers_oi, celltype_de$cluster_id %>% unique()),
+ ligand_target_matrix = ligand_target_matrix,
+ logFC_threshold = logFC_threshold,
+ p_val_threshold = p_val_threshold,
+ p_val_adj = p_val_adj,
+ top_n_target = top_n_target,
+ verbose = verbose,
+ n.cores = n.cores
+ )
+))
+You can check the output of the ligand activity and ligand-target inference here:
+ligand_activities_targets_DEgenes$ligand_activities %>% head(20)
+## # A tibble: 20 × 8
+## # Groups: receiver, contrast [1]
+## ligand activity contrast target ligand_target_weight receiver
+## <chr> <dbl> <chr> <chr> <dbl> <chr>
+## 1 A2M 0.0279 A-(S+M)/2 ALOX5AP 0.00727 L_NK_CD56._CD16.
+## 2 A2M 0.0279 A-(S+M)/2 ANXA2 0.00643 L_NK_CD56._CD16.
+## 3 A2M 0.0279 A-(S+M)/2 AREG 0.00638 L_NK_CD56._CD16.
+## 4 A2M 0.0279 A-(S+M)/2 BST2 0.00662 L_NK_CD56._CD16.
+## 5 A2M 0.0279 A-(S+M)/2 CXCR4 0.00967 L_NK_CD56._CD16.
+## 6 A2M 0.0279 A-(S+M)/2 DDIT4 0.0114 L_NK_CD56._CD16.
+## 7 A2M 0.0279 A-(S+M)/2 IRF7 0.00777 L_NK_CD56._CD16.
+## 8 A2M 0.0279 A-(S+M)/2 ISG20 0.00738 L_NK_CD56._CD16.
+## 9 A2M 0.0279 A-(S+M)/2 MT2A 0.00639 L_NK_CD56._CD16.
+## 10 A2M 0.0279 A-(S+M)/2 TSC22D3 0.00661 L_NK_CD56._CD16.
+## 11 A2M 0.0279 A-(S+M)/2 TXNIP 0.00939 L_NK_CD56._CD16.
+## 12 A2M 0.0279 A-(S+M)/2 VIM 0.00857 L_NK_CD56._CD16.
+## 13 A2M 0.0279 A-(S+M)/2 ZFP36 0.00732 L_NK_CD56._CD16.
+## 14 A2M 0.0279 A-(S+M)/2 ZFP36L2 0.00660 L_NK_CD56._CD16.
+## 15 AANAT 0.0239 A-(S+M)/2 ALOX5AP 0.00384 L_NK_CD56._CD16.
+## 16 AANAT 0.0239 A-(S+M)/2 AREG 0.00379 L_NK_CD56._CD16.
+## 17 AANAT 0.0239 A-(S+M)/2 CXCR4 0.00589 L_NK_CD56._CD16.
+## 18 AANAT 0.0239 A-(S+M)/2 DDIT4 0.00659 L_NK_CD56._CD16.
+## 19 AANAT 0.0239 A-(S+M)/2 ISG20 0.00446 L_NK_CD56._CD16.
+## 20 AANAT 0.0239 A-(S+M)/2 TSC22D3 0.00370 L_NK_CD56._CD16.
+## # ℹ 2 more variables: direction_regulation <fct>, activity_scaled <dbl>
+In the previous steps, we calculated expression, differential expression and NicheNet ligand activity. In the final step, we will now combine all calculated information to rank all sender-ligand—receiver-receptor pairs according to group/condition specificity. We will use the following criteria to prioritize ligand-receptor interactions:
+We will combine these prioritization criteria in a single aggregated prioritization score. In the default setting, we will weigh each of these criteria equally (scenario = "regular"
). This setting is strongly recommended. However, we also provide some additional setting to accomodate different biological scenarios. The setting scenario = "lower_DE"
halves the weight for DE criteria and doubles the weight for ligand activity. This is recommended in case your hypothesis is that the differential CCC patterns in your data are less likely to be driven by DE (eg in cases of differential migration into a niche). The setting scenario = "no_frac_LR_expr"
ignores the criterion “Sufficiently high expression levels of ligand and receptor in many samples of the same group”. This may be interesting for users that have data with a limited number of samples and don’t want to penalize interactions if they are not sufficiently expressed in some samples.
Finally, we still need to make one choice. For NicheNet ligand activity we can choose to prioritize ligands that only induce upregulation of target genes (ligand_activity_down = FALSE
) or can lead potentially lead to both up- and downregulation (ligand_activity_down = TRUE
). The benefit of ligand_activity_down = FALSE
is ease of interpretability: prioritized ligand-receptor pairs will be upregulated in the condition of interest, just like their target genes. ligand_activity_down = TRUE
can be harder to interpret because target genes of some interactions may be upregulated in the other conditions compared to the condition of interest. This is harder to interpret, but may help to pick up interactions that can also repress gene expression.
Here we will choose for setting ligand_activity_down = FALSE
and focus specifically on upregulating ligands.
ligand_activity_down = FALSE
+sender_receiver_tbl = sender_receiver_de %>% distinct(sender, receiver)
+
+metadata_combined = SummarizedExperiment::colData(sce) %>% tibble::as_tibble()
+
+ if(!is.na(batches)){
+ grouping_tbl = metadata_combined[,c(group_id, batches)] %>% tibble::as_tibble() %>% dplyr::distinct()
+ colnames(grouping_tbl) = c("group",batches)
+ grouping_tbl = grouping_tbl %>% mutate(sample = group)
+ grouping_tbl = grouping_tbl %>% tibble::as_tibble()
+ } else {
+ grouping_tbl = metadata_combined[,c(group_id)] %>% tibble::as_tibble() %>% dplyr::distinct()
+ colnames(grouping_tbl) = c("group")
+ grouping_tbl = grouping_tbl %>% mutate(sample = group) %>% select(sample, group)
+
+ }
+
+prioritization_tables = suppressMessages(generate_prioritization_tables(
+ sender_receiver_info = abundance_expression_info$sender_receiver_info,
+ sender_receiver_de = sender_receiver_de,
+ ligand_activities_targets_DEgenes = ligand_activities_targets_DEgenes,
+ contrast_tbl = contrast_tbl,
+ sender_receiver_tbl = sender_receiver_tbl,
+ grouping_tbl = grouping_tbl,
+ scenario = "regular", # all prioritization criteria will be weighted equally
+ fraction_cutoff = fraction_cutoff,
+ abundance_data_receiver = abundance_expression_info$abundance_data_receiver,
+ abundance_data_sender = abundance_expression_info$abundance_data_sender,
+ ligand_activity_down = ligand_activity_down
+ ))
+Check the output tables
+First: group-based summary table
+prioritization_tables$group_prioritization_tbl %>% head(20)
+## # A tibble: 20 × 18
+## contrast group sender receiver ligand receptor lr_interaction id
+## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
+## 1 A-(S+M)/2 A L_NK_CD56._CD1… L_NK_CD… HLA.A KIR3DL1 HLA.A_KIR3DL1 HLA.…
+## 2 A-(S+M)/2 A L_NK_CD56._CD1… L_NK_CD… HLA.A KLRD1 HLA.A_KLRD1 HLA.…
+## 3 M-(S+A)/2 M L_T_TIM3._CD38… M_Monoc… IFNG IFNGR1 IFNG_IFNGR1 IFNG…
+## 4 M-(S+A)/2 M L_T_TIM3._CD38… M_Monoc… IFNG IFNGR2 IFNG_IFNGR2 IFNG…
+## 5 M-(S+A)/2 M L_NK_CD56._CD1… L_T_TIM… CCL4 CCR5 CCL4_CCR5 CCL4…
+## 6 M-(S+A)/2 M M_Monocyte_CD16 M_Monoc… TNF LTBR TNF_LTBR TNF_…
+## 7 A-(S+M)/2 A L_NK_CD56._CD1… L_NK_CD… HLA.A KIR3DL2 HLA.A_KIR3DL2 HLA.…
+## 8 M-(S+A)/2 M M_Monocyte_CD16 L_T_TIM… HLA.D… LAG3 HLA.DRB5_LAG3 HLA.…
+## 9 M-(S+A)/2 M M_Monocyte_CD16 L_T_TIM… HLA.D… LAG3 HLA.DPB1_LAG3 HLA.…
+## 10 M-(S+A)/2 M M_Monocyte_CD16 L_T_TIM… HLA.D… LAG3 HLA.DRA_LAG3 HLA.…
+## 11 M-(S+A)/2 M M_Monocyte_CD16 M_Monoc… TIMP1 CD63 TIMP1_CD63 TIMP…
+## 12 A-(S+M)/2 A L_NK_CD56._CD1… L_NK_CD… HLA.B KIR3DL1 HLA.B_KIR3DL1 HLA.…
+## 13 A-(S+M)/2 A L_NK_CD56._CD1… L_NK_CD… HLA.B KLRD1 HLA.B_KLRD1 HLA.…
+## 14 M-(S+A)/2 M L_NK_CD56._CD1… M_Monoc… CCL3 CCR1 CCL3_CCR1 CCL3…
+## 15 M-(S+A)/2 M L_NK_CD56._CD1… M_Monoc… IFNG IFNGR1 IFNG_IFNGR1 IFNG…
+## 16 M-(S+A)/2 M L_NK_CD56._CD1… M_Monoc… IFNG IFNGR2 IFNG_IFNGR2 IFNG…
+## 17 M-(S+A)/2 M M_Monocyte_CD16 L_NK_CD… TYROBP KLRD1 TYROBP_KLRD1 TYRO…
+## 18 M-(S+A)/2 M L_NK_CD56._CD1… M_Monoc… CD99 PILRA CD99_PILRA CD99…
+## 19 M-(S+A)/2 M L_NK_CD56._CD1… L_T_TIM… CCL3 CCR5 CCL3_CCR5 CCL3…
+## 20 M-(S+A)/2 M L_T_TIM3._CD38… M_Monoc… CD99 PILRA CD99_PILRA CD99…
+## # ℹ 10 more variables: scaled_lfc_ligand <dbl>,
+## # scaled_p_val_ligand_adapted <dbl>, scaled_lfc_receptor <dbl>,
+## # scaled_p_val_receptor_adapted <dbl>, max_scaled_activity <dbl>,
+## # scaled_pb_ligand <dbl>, scaled_pb_receptor <dbl>,
+## # fraction_expressing_ligand_receptor <dbl>, prioritization_score <dbl>,
+## # top_group <chr>
+This table gives the final prioritization score of each interaction, and the values of the individual prioritization criteria.
+With this step, all required steps are finished. Now, we can optionally still run the following steps +* Prioritize communication patterns involving condition-specific cell types through an alternative prioritization scheme +However, this is not relevant for this dataset since there are no condition-specific cell types here.
+To avoid needing to redo the analysis later, we will here to save an output object that contains all information to perform all downstream analyses.
+path = "./"
+
+multinichenet_output = list(
+ celltype_info = abundance_expression_info$celltype_info,
+ celltype_de = celltype_de,
+ sender_receiver_info = abundance_expression_info$sender_receiver_info,
+ sender_receiver_de = sender_receiver_de,
+ ligand_activities_targets_DEgenes = ligand_activities_targets_DEgenes,
+ prioritization_tables = prioritization_tables,
+ grouping_tbl = grouping_tbl,
+ lr_target_prior_cor = tibble()
+ )
+multinichenet_output = make_lite_output(multinichenet_output)
+
+save = FALSE
+if(save == TRUE){
+ saveRDS(multinichenet_output, paste0(path, "multinichenet_output.rds"))
+
+}
+In a first instance, we will look at the broad overview of prioritized interactions via condition-specific Chordiagram circos plots.
+We will look here at the top 50 predictions across all contrasts, senders, and receivers of interest.
+prioritized_tbl_oi_all = get_top_n_lr_pairs(
+ multinichenet_output$prioritization_tables,
+ top_n = 50,
+ rank_per_group = FALSE
+ )
+prioritized_tbl_oi =
+ multinichenet_output$prioritization_tables$group_prioritization_tbl %>%
+ filter(id %in% prioritized_tbl_oi_all$id) %>%
+ distinct(id, sender, receiver, ligand, receptor, group) %>%
+ left_join(prioritized_tbl_oi_all)
+prioritized_tbl_oi$prioritization_score[is.na(prioritized_tbl_oi$prioritization_score)] = 0
+
+senders_receivers = union(prioritized_tbl_oi$sender %>% unique(), prioritized_tbl_oi$receiver %>% unique()) %>% sort()
+
+colors_sender = RColorBrewer::brewer.pal(n = length(senders_receivers), name = 'Spectral') %>% magrittr::set_names(senders_receivers)
+colors_receiver = RColorBrewer::brewer.pal(n = length(senders_receivers), name = 'Spectral') %>% magrittr::set_names(senders_receivers)
+
+circos_list = make_circos_group_comparison(prioritized_tbl_oi, colors_sender, colors_receiver)
+
+Whereas these ChordDiagrams show the most specific interactions per group, they don’t give insights into the data behind these predictions. Therefore we will now look at visualizations that indicate the different prioritization criteria used in MultiNicheNet.
+In the next type of plots, we will 1) visualize the per-sample scaled product of normalized ligand and receptor pseudobulk expression, 2) visualize the scaled ligand activities, 3) cell-type specificity.
+We will now check the top 50 interactions specific for the MIS-C group
+group_oi = "M"
+prioritized_tbl_oi_M_50 = get_top_n_lr_pairs(
+ multinichenet_output$prioritization_tables,
+ top_n = 50,
+ groups_oi = group_oi)
+plot_oi = make_sample_lr_prod_activity_plots(
+ multinichenet_output$prioritization_tables,
+ prioritized_tbl_oi_M_50)
+plot_oi
++Samples that were left out of the DE analysis are indicated with a smaller dot (this helps to indicate the samples that did not contribute to the calculation of the logFC, and thus not contributed to the final prioritization)
+As a further help for further prioritization, we can assess the level of curation of these LR pairs as defined by the Intercellular Communication part of the Omnipath database
+prioritized_tbl_oi_M_50_omnipath = prioritized_tbl_oi_M_50 %>%
+ inner_join(lr_network_all)
+Now we add this to the bubble plot visualization:
+plot_oi = make_sample_lr_prod_activity_plots_Omnipath(
+ multinichenet_output$prioritization_tables,
+ prioritized_tbl_oi_M_50_omnipath)
+plot_oi
+
+As you can see, the HEBP1-FPR2 interaction has no Omnipath DB scores. This is because this LR pair was not documented by the Omnipath LR database. Instead it was documented by the original NicheNet LR network (source: Guide2Pharmacology) as can be seen in the table.
+Further note: Typically, there are way more than 50 differentially expressed and active ligand-receptor pairs per group across all sender-receiver combinations. Therefore it might be useful to zoom in on specific cell types as senders/receivers:
+Eg M_Monocyte_CD16 as receiver:
+prioritized_tbl_oi_M_50 = get_top_n_lr_pairs(
+ multinichenet_output$prioritization_tables,
+ 50,
+ groups_oi = group_oi,
+ receivers_oi = "M_Monocyte_CD16")
+plot_oi = make_sample_lr_prod_activity_plots_Omnipath(
+ multinichenet_output$prioritization_tables,
+ prioritized_tbl_oi_M_50 %>% inner_join(lr_network_all))
+plot_oi
+
+Eg M_Monocyte_CD16 as sender:
+prioritized_tbl_oi_M_50 = get_top_n_lr_pairs(
+ multinichenet_output$prioritization_tables,
+ 50,
+ groups_oi = group_oi,
+ senders_oi = "M_Monocyte_CD16")
+plot_oi = make_sample_lr_prod_activity_plots_Omnipath(
+ multinichenet_output$prioritization_tables,
+ prioritized_tbl_oi_M_50 %>% inner_join(lr_network_all))
+plot_oi
+
+You can make these plots also for the other groups, like we will illustrate now for the S group
+group_oi = "S"
+prioritized_tbl_oi_S_50 = get_top_n_lr_pairs(
+ multinichenet_output$prioritization_tables,
+ 50,
+ groups_oi = group_oi)
+
+plot_oi = make_sample_lr_prod_activity_plots_Omnipath(
+ multinichenet_output$prioritization_tables,
+ prioritized_tbl_oi_S_50 %>% inner_join(lr_network_all))
+plot_oi
+
+In another type of plot, we can visualize the ligand activities for a group-receiver combination, and show the predicted ligand-target links, and also the expression of the predicted target genes across samples.
+For this, we now need to define a receiver cell type of interest. As example, we will take L_T_TIM3._CD38._HLADR.
cells as receiver, and look at the top 10 senderLigand-receiverReceptor pairs with these cells as receiver.
group_oi = "M"
+receiver_oi = "M_Monocyte_CD16"
+prioritized_tbl_oi_M_10 = get_top_n_lr_pairs(
+ multinichenet_output$prioritization_tables,
+ 10,
+ groups_oi = group_oi,
+ receivers_oi = receiver_oi)
+combined_plot = make_ligand_activity_target_plot(
+ group_oi,
+ receiver_oi,
+ prioritized_tbl_oi_M_10,
+ multinichenet_output$prioritization_tables,
+ multinichenet_output$ligand_activities_targets_DEgenes, contrast_tbl,
+ multinichenet_output$grouping_tbl,
+ multinichenet_output$celltype_info,
+ ligand_target_matrix,
+ plot_legend = FALSE)
+combined_plot
+## $combined_plot
+
+##
+## $legends
+
+In the plots before, we visualized target genes that are supported by prior knowledge to be downstream of ligand-receptor pairs. Interestingly, some target genes can be ligands or receptors themselves. This illustrates that cells can send signals to other cells, who as a response to these signals produce signals themselves to feedback to the original sender cells, or who will effect other cell types.
+As last plot, we can generate a ‘systems’ view of these intercellular feedback and cascade processes than can be occuring between the different cell populations involved. In this plot, we will draw links between ligands of sender cell types their ligand/receptor-annotated target genes in receiver cell types. So links are ligand-target links (= gene regulatory links) and not ligand-receptor protein-protein interactions! We will infer this intercellular regulatory network here for the top50 interactions.
+Important In the default MultiNicheNet workflow for datasets with multiple samples per condition, we further filter out target genes based on expression correlation before generating this plot. However, this would not be meaningful here since we don’t have multiple samples. As a consequence, we will have many ligand-target links here in these plots, making the plots less readable if we would consider more than the top 50 or 100 interactions.
+prioritized_tbl_oi = get_top_n_lr_pairs(
+ multinichenet_output$prioritization_tables,
+ 50,
+ rank_per_group = FALSE)
+lr_target_prior = prioritized_tbl_oi %>% inner_join(
+ multinichenet_output$ligand_activities_targets_DEgenes$ligand_activities %>%
+ distinct(ligand, target, direction_regulation, contrast) %>% inner_join(contrast_tbl) %>% ungroup()
+ )
+
+
+lr_target_df = lr_target_prior %>% distinct(group, sender, receiver, ligand, receptor, id, target, direction_regulation)
+network = infer_intercellular_regulatory_network(lr_target_df, prioritized_tbl_oi)
+network$links %>% head()
+## # A tibble: 6 × 6
+## sender_ligand receiver_target direction_regulation group type weight
+## <chr> <chr> <fct> <chr> <chr> <dbl>
+## 1 L_NK_CD56._CD16._HLA.A L_NK_CD56._CD1… up A Liga… 1
+## 2 L_T_TIM3._CD38._HLADR… M_Monocyte_CD1… up M Liga… 1
+## 3 L_T_TIM3._CD38._HLADR… M_Monocyte_CD1… up M Liga… 1
+## 4 L_T_TIM3._CD38._HLADR… M_Monocyte_CD1… up M Liga… 1
+## 5 L_T_TIM3._CD38._HLADR… M_Monocyte_CD1… up M Liga… 1
+## 6 L_T_TIM3._CD38._HLADR… M_Monocyte_CD1… up M Liga… 1
+network$nodes %>% head()
+## # A tibble: 6 × 4
+## node celltype gene type_gene
+## <chr> <chr> <chr> <chr>
+## 1 L_NK_CD56._CD16._CD99 L_NK_CD56._CD16. CD99 ligand/receptor
+## 2 L_T_TIM3._CD38._HLADR._CD99 L_T_TIM3._CD38._HLADR. CD99 ligand/receptor
+## 3 L_NK_CD56._CD16._ITGB2 L_NK_CD56._CD16. ITGB2 ligand/receptor
+## 4 L_NK_CD56._CD16._HLA.A L_NK_CD56._CD16. HLA.A ligand
+## 5 L_T_TIM3._CD38._HLADR._IFNG L_T_TIM3._CD38._HLADR. IFNG ligand
+## 6 L_NK_CD56._CD16._CCL4 L_NK_CD56._CD16. CCL4 ligand
+colors_sender["L_T_TIM3._CD38._HLADR."] = "pink" # the original yellow background with white font is not very readable
+network_graph = visualize_network(network, colors_sender)
+network_graph$plot
+
+If you want to know the exact set of top-ranked ligands that are potentially regulated by one ligand of interest, you can run following code:
+ligand_oi = "IFNG"
+group_oi = "M"
+lr_target_df %>% filter(ligand == ligand_oi & group == group_oi) %>% pull(target) %>% unique() %>% intersect(lr_network$ligand)
+## [1] "B2M" "C1QB" "CD55" "CXCL16" "HLA.A" "HLA.B"
+## [7] "HLA.C" "HLA.DPB1" "HLA.DQB1" "HLA.DRA" "HLA.DRB1" "HLA.E"
+## [13] "HLA.F" "IFITM1" "IL32" "NAMPT" "S100A4" "S100A8"
+## [19] "TNF" "TNFSF10"
+Interestingly, we can also use this network to further prioritize differential CCC interactions. Here we will assume that the most important LR interactions are the ones that are involved in this intercellular regulatory network. We can get these interactions as follows:
+network$prioritized_lr_interactions
+## # A tibble: 50 × 5
+## group sender receiver ligand receptor
+## <chr> <chr> <chr> <chr> <chr>
+## 1 A L_NK_CD56._CD16. L_NK_CD56._CD16. HLA.A KIR3DL1
+## 2 A L_NK_CD56._CD16. L_NK_CD56._CD16. HLA.A KLRD1
+## 3 M L_T_TIM3._CD38._HLADR. M_Monocyte_CD16 IFNG IFNGR1
+## 4 M L_T_TIM3._CD38._HLADR. M_Monocyte_CD16 IFNG IFNGR2
+## 5 M L_NK_CD56._CD16. L_T_TIM3._CD38._HLADR. CCL4 CCR5
+## 6 M M_Monocyte_CD16 M_Monocyte_CD16 TNF LTBR
+## 7 A L_NK_CD56._CD16. L_NK_CD56._CD16. HLA.A KIR3DL2
+## 8 M M_Monocyte_CD16 L_T_TIM3._CD38._HLADR. HLA.DRB5 LAG3
+## 9 M M_Monocyte_CD16 L_T_TIM3._CD38._HLADR. HLA.DPB1 LAG3
+## 10 M M_Monocyte_CD16 L_T_TIM3._CD38._HLADR. HLA.DRA LAG3
+## # ℹ 40 more rows
+prioritized_tbl_oi_network = prioritized_tbl_oi %>% inner_join(
+ network$prioritized_lr_interactions)
+prioritized_tbl_oi_network
+## # A tibble: 50 × 8
+## group sender receiver ligand receptor id prioritization_score
+## <chr> <chr> <chr> <chr> <chr> <chr> <dbl>
+## 1 A L_NK_CD56._CD16. L_NK_CD… HLA.A KIR3DL1 HLA.… 0.958
+## 2 A L_NK_CD56._CD16. L_NK_CD… HLA.A KLRD1 HLA.… 0.956
+## 3 M L_T_TIM3._CD38._HL… M_Monoc… IFNG IFNGR1 IFNG… 0.950
+## 4 M L_T_TIM3._CD38._HL… M_Monoc… IFNG IFNGR2 IFNG… 0.948
+## 5 M L_NK_CD56._CD16. L_T_TIM… CCL4 CCR5 CCL4… 0.940
+## 6 M M_Monocyte_CD16 M_Monoc… TNF LTBR TNF_… 0.940
+## 7 A L_NK_CD56._CD16. L_NK_CD… HLA.A KIR3DL2 HLA.… 0.936
+## 8 M M_Monocyte_CD16 L_T_TIM… HLA.D… LAG3 HLA.… 0.936
+## 9 M M_Monocyte_CD16 L_T_TIM… HLA.D… LAG3 HLA.… 0.935
+## 10 M M_Monocyte_CD16 L_T_TIM… HLA.D… LAG3 HLA.… 0.935
+## # ℹ 40 more rows
+## # ℹ 1 more variable: prioritization_rank <dbl>
+Visualize now the expression and activity of these interactions for the M group
+group_oi = "M"
+prioritized_tbl_oi_M = prioritized_tbl_oi_network %>% filter(group == group_oi)
+
+plot_oi = make_sample_lr_prod_activity_plots_Omnipath(
+ multinichenet_output$prioritization_tables,
+ prioritized_tbl_oi_M %>% inner_join(lr_network_all)
+ )
+plot_oi
+
+