updated vignettes, added demonstration of criterion filtering for robustness
browaeysrobin committed Sep 5, 2024
1 parent 2123298 commit 70678ec
Showing 69 changed files with 892 additions and 1,311 deletions.
6 changes: 1 addition & 5 deletions R/pipeline.R
@@ -370,11 +370,7 @@ multi_nichenet_analysis = function(sce,
batches = batches)

## check for condition-specific cell types
sample_group_celltype_df = abundance_info$abundance_data %>% filter(n > min_cells) %>% ungroup() %>% distinct(sample_id, group_id) %>% cross_join(abundance_info$abundance_data %>% ungroup() %>% distinct(celltype_id)) %>% arrange(sample_id)
abundance_df = sample_group_celltype_df %>% left_join(abundance_info$abundance_data %>% ungroup())
abundance_df$n[is.na(abundance_df$n)] = 0
abundance_df$keep[is.na(abundance_df$keep)] = FALSE
abundance_df_summarized = abundance_df %>% mutate(keep = as.logical(keep)) %>% group_by(group_id, celltype_id) %>% summarise(samples_present = sum((keep)))
abundance_df_summarized = abundance_info$abundance_data %>% mutate(keep = as.logical(keep)) %>% group_by(group_id, celltype_id) %>% summarise(samples_present = sum((keep)))
celltypes_absent_one_condition = abundance_df_summarized %>% filter(samples_present == 0) %>% pull(celltype_id) %>% unique() # find truly condition-specific cell types by searching for cell types truely absent in at least one condition
celltypes_present_one_condition = abundance_df_summarized %>% filter(samples_present >= 2) %>% pull(celltype_id) %>% unique() # require presence in at least 2 samples of one group so it is really present in at least one condition
condition_specific_celltypes = intersect(celltypes_absent_one_condition, celltypes_present_one_condition)
54 changes: 35 additions & 19 deletions vignettes/basic_analysis_steps_MISC.Rmd
@@ -284,24 +284,7 @@ __Important__: Based on the cell type abundance diagnostics, we recommend users
Running the following block of code can help you determine which cell types are condition-specific and which cell types are absent.

```{r}
sample_group_celltype_df = abundance_info$abundance_data %>%
filter(n > min_cells) %>%
ungroup() %>%
distinct(sample_id, group_id) %>%
cross_join(
abundance_info$abundance_data %>%
ungroup() %>%
distinct(celltype_id)
) %>%
arrange(sample_id)
abundance_df = sample_group_celltype_df %>% left_join(
abundance_info$abundance_data %>% ungroup()
)
abundance_df$n[is.na(abundance_df$n)] = 0
abundance_df$keep[is.na(abundance_df$keep)] = FALSE
abundance_df_summarized = abundance_df %>%
abundance_df_summarized = abundance_info$abundance_data %>%
mutate(keep = as.logical(keep)) %>%
group_by(group_id, celltype_id) %>%
summarise(samples_present = sum((keep)))
@@ -1000,10 +983,43 @@ Because of this, interactions in this plot may be interesting candidates for fol

__Note__: These networks were generated by only looking at the top50 interactions overall. In practice, we encourage users to explore more hits than the top50, certainly if many cell type pairs are considered in the analysis.

All the previous were informative for interactions where both the sender and receiver cell types are captured in the data and where ligand and receptor are sufficiently expressed at the RNA level. However, these two conditions are not always fulfilled and some interesting cell-cell communication signals may be missed as a consequence. Can we still have an idea about these potentially missed interactions? Yes, we can.
## Filter interactions based on specific prioritization criteria

For some use cases, users may want to filter interactions based on specific criteria. For example, if you are only interested in interactions that are strongly expressed in all samples within a condition, you can filter on that criterion, as we demonstrate now.

The scores for the individual criteria can be inspected in this data frame:

```{r}
multinichenet_output$prioritization_tables$group_prioritization_tbl
```

To only consider interactions with sufficiently high ligand-and-receptor expression in all samples of a condition (MIS-C as example, all samples: `fraction_expressing_ligand_receptor` = 1), you can run this line of code to extract all CCI ids that fulfill this criterion:

```{r}
filtered_ids = multinichenet_output$prioritization_tables$group_prioritization_tbl %>% filter(fraction_expressing_ligand_receptor == 1 & group == "M") %>% pull(id)
```
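
To make the effect of this filter concrete, here is a minimal sketch on a toy table. The data frame `toy_prioritization_tbl` and its ids and values are hypothetical and invented for illustration; only the column names `id`, `group` and `fraction_expressing_ligand_receptor` are taken from the code above. The real `group_prioritization_tbl` contains more columns.

```r
library(dplyr)

# Hypothetical stand-in for group_prioritization_tbl: only the three columns
# used by the filter are mocked, and all ids and values are made up.
toy_prioritization_tbl = data.frame(
  id = c("L1_R1_senderA_receiverB", "L2_R2_senderA_receiverB",
         "L3_R3_senderB_receiverA", "L1_R1_senderA_receiverB"),
  group = c("M", "M", "M", "S"),
  fraction_expressing_ligand_receptor = c(1, 0.75, 1, 1),
  stringsAsFactors = FALSE
)

# Same filter as above: keep ids expressed in all samples of group "M"
toy_filtered_ids = toy_prioritization_tbl %>%
  filter(fraction_expressing_ligand_receptor == 1 & group == "M") %>%
  pull(id)

toy_filtered_ids
```

In this toy table, the second interaction is dropped because its fraction is below 1, and the fourth row is dropped because it belongs to a different group, even though it has the same id as the first row.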

Now, continue with only these filtered CCIs among the top100 generally prioritized interactions for the M-group:

```{r}
prioritized_tbl_oi_M_100_filtered = get_top_n_lr_pairs(
multinichenet_output$prioritization_tables,
100,
groups_oi = "M") %>% filter(id %in% filtered_ids)
```

```{r, fig.height=10, fig.width=17}
plot_oi = make_sample_lr_prod_activity_plots_Omnipath(
multinichenet_output$prioritization_tables,
prioritized_tbl_oi_M_100_filtered %>% inner_join(lr_network_all)
)
plot_oi
```

Note that we don't recommend this as a general strategy. In general, the default prioritization framework finds a tradeoff between relevant aspects of CCC, of which sufficient expression is one criterion. However, in some use cases, users may want to emphasize some properties more than others, and for such cases, this downstream filtering may be helpful.
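
As a rough illustration of how strongly such a hard filter can reduce the number of retained interactions, the sketch below counts how many interactions survive at a few cutoffs. The table `toy_tbl`, its values, and the cutoffs 0.75 and 0.5 are hypothetical and chosen only for illustration; they are not recommended settings.

```r
library(dplyr)

# Hypothetical toy table; ids and values are invented for illustration only.
toy_tbl = data.frame(
  id = paste0("interaction_", 1:5),
  group = "M",
  fraction_expressing_ligand_receptor = c(1, 1, 0.8, 0.6, 0.2),
  stringsAsFactors = FALSE
)

# Count the interactions retained in group "M" for each illustrative cutoff
cutoffs = c(1, 0.75, 0.5)
n_retained = sapply(cutoffs, function(cutoff) {
  toy_tbl %>%
    filter(group == "M", fraction_expressing_ligand_receptor >= cutoff) %>%
    nrow()
})

data.frame(cutoff = cutoffs, n_retained = n_retained)
```

Checking such counts on your own prioritization table before plotting may help you judge whether a strict filter leaves enough interactions to interpret.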

## Visualize sender-agnostic ligand activities for each receiver-group combination

All the previous figures were informative for interactions where both the sender and receiver cell types are captured in the data and where ligand and receptor are sufficiently expressed at the RNA level. However, these two conditions are not always fulfilled and some interesting cell-cell communication signals may be missed as a consequence. Can we still have an idea about these potentially missed interactions? Yes, we can.

In the next type of plot, we plot all the ligand activities (both scaled and absolute activities) of each receiver-condition combination. This can give us some insights into active signaling pathways across conditions. Note that we can thus show top ligands based on ligand activity, irrespective and agnostic of expression in the sender. A benefit of this analysis is the possibility to infer the activity of ligands that are expressed by cell types that are not in your single-cell dataset or that are hard to pick up at the RNA level.

The following block of code will show how to visualize the activities for the top5 ligands for each receiver cell type - condition combination: