# Module 6 lab 3: scRNA scNetViz {#scNetViz-lab}
**This work is licensed under a [Creative Commons Attribution-ShareAlike 3.0 Unported License](http://creativecommons.org/licenses/by-sa/3.0/deed.en_US). This means that you are able to copy, share and modify the work, as long as the result is distributed under the same license.**
*<font color="#827e9c">By Veronique Voisin, Chaitra Sarathy and Ruth Isserlin</font>*
## Introduction
- scNetViz is a Cytoscape app designed to perform downstream analysis of single cell RNAseq (scRNA) data. The goal is to explore the results of a single cell RNAseq experiment in the context of the pathways involved and in a network analysis framework.
- It is assumed that the data have already been preprocessed, for example using Seurat or a similar pipeline. Ideally the experimental results have been published and deposited into a public repository. Biologists with little or no computational background can now load the dataset of their choice into Cytoscape.
- Each dataset is loaded with metadata indicating, for example, the groups to which the cells belong (e.g. control or treated groups) or the specific clusters that the upstream data analysis revealed. The idea is to select groups or clusters of interest and perform differential expression analysis.
- Plots visualizing the results of the differential expression analysis are created by the scNetViz app including heatmaps and violin plots.
- The app takes the top n differentially expressed genes and creates a network representing them. One of the best features of this app is that at this point it can take advantage of all the features in Cytoscape and facilitate the integration of other omics data.
- We can perform pathway enrichment analysis on the created network (the top genes) enabling the identification of functions associated with the top differentially expressed genes.
- The app takes as input scRNA data stored in the [Single Cell Expression Atlas repository](https://www.ebi.ac.uk/gxa/sc/home) hosted by EMBL's European Bioinformatics Institute. This repository contains scRNA experiments from animals, plants, fungi and protists. It features experiments from the Human Cell Atlas, the Fly Cell Atlas, the Chan Zuckerberg Biohub and more.
<p align="center">
<img src="./scNetViz_lab/images/scNetViz1.png" alt="start" width="100%" />
</p>
## Goal of the practical lab
In this example, we will browse the Single Cell Expression Atlas from within Cytoscape, explore a particular dataset, perform differential expression analysis based on one of the provided cell annotation categories, generate networks from the top differentially expressed genes for each group within the chosen category, and functionally characterize and visualize the networks.
## Data
- ACCESSION NUMBER: [E-MTAB-7417](https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-7417/results/tsne)
- TISSUE: Cells from a digested skin sample were taken from two 8-week-old female C57BL/6 mice
- INSTRUMENT(S): Illumina HiSeq 4000
- ORGANISM(S): Mus musculus
- DISEASE(S): Normal
- DATA PROCESSING: Droplet-based sequencing data was aligned, filtered and quantified using the Cell Ranger Single-Cell Software Suite (version 22.2.0), against the mouse reference genome provided by Cell Ranger.
```{block, type="rmd-tip"}
scNetViz accesses the data directly from the Single Cell Expression Atlas repository, therefore it is not necessary to download the data for the practical lab.<br>
However, if you would like to work on data offline, there is an option to import the data manually into scNetViz:
* In the Cytoscape menu bar, select **Apps** -> **scNetViz** -> **Load Experiment** -> **Import from File**.
It can also be useful if you already have your data of interest downloaded on your computer.<br>
Instructions below show you how to download the data directly from the repository but you are not required to pull your data from there.
```
- DATA AVAILABLE AT: https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-7417/downloads:
i. Scroll down to **Result files** and download **Normalized counts files (MatrixMarket archive)**.
i. An archive called [E-MTAB-7417-normalised-files.zip](./scNetViz_lab/data/E-MTAB-7417-normalised-files.zip) is now downloaded onto your computer.
* Note 1: values in the file are normalized to counts per million.
* Note 2: do not unzip the file before importing it into scNetViz.
i. Download the cluster information as well: [E-MTAB-7417.clusters.tsv](./scNetViz_lab/data/E-MTAB-7417.clusters.tsv)
* Note: The cells were clustered using the Louvain algorithm.
- Reference paper: Davidson S, Efremova M, Riedel A, Mahata B, Pramanik J et al. (2018) Single-cell RNA sequencing reveals a dynamic stromal niche within the evolving tumour microenvironment [https://pubmed.ncbi.nlm.nih.gov/32433953/].
- This paper studies melanoma using a mouse model. The scRNA samples used here come from the skin of 2 healthy mice. Only the clusters identified as fibroblasts (based on expression of Col1a1, Col1a2) were considered for comparison with the stromal clusters.
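
The downloaded values are described above as normalized to counts per million (CPM). As a minimal sketch of what that scaling means for one cell, assuming raw counts are available as a simple dictionary (the counts and the name `GeneX` below are invented for illustration and are not taken from E-MTAB-7417):

```python
def counts_per_million(raw_counts):
    """Scale the raw counts of one cell so that they sum to one million."""
    total = sum(raw_counts.values())
    if total == 0:
        # A cell without any counts cannot be scaled; return zeros.
        return {gene: 0.0 for gene in raw_counts}
    return {gene: count * 1_000_000 / total for gene, count in raw_counts.items()}


# Toy cell with invented raw counts (100 counts in total).
toy_cell = {"Col1a1": 30, "Col1a2": 10, "GeneX": 60}
cpm = counts_per_million(toy_cell)
print(cpm["Col1a1"])
```

After scaling, values are comparable between cells that were sequenced to different depths. The exact normalization pipeline used by the Single Cell Expression Atlas may include additional steps that this sketch does not cover.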
```{block, type="rmd-datadownload"}
Right click on link below and select "Save Link As...".
Place it in the corresponding module directory of your CBW work directory.
**There is no need to download this data as you can download it directly from the scNetViz app**.
```
- [E-MTAB-7417.aggregated_filtered_normalised_counts.mtx](./scNetViz_lab/data/E-MTAB-7417.aggregated_filtered_normalised_counts.mtx)
- [E-MTAB-7417.aggregated_filtered_normalised_counts.mtx_cols](./scNetViz_lab/data/E-MTAB-7417.aggregated_filtered_normalised_counts.mtx_cols)
- [E-MTAB-7417.aggregated_filtered_normalised_counts.mtx_rows](./scNetViz_lab/data/E-MTAB-7417.aggregated_filtered_normalised_counts.mtx_rows)
- [E-MTAB-7417.clusters.tsv](./scNetViz_lab/data/E-MTAB-7417.clusters.tsv)
## Steps
1. Open Cytoscape
1. Click the icon **Browse EBI Single Cell Expression Atlas** in the Cytoscape toolbar. This opens the Single Cell Expression Atlas browser.
<p align="center">
<img src="./scNetViz_lab/images/scNetViz2.png" alt="start" width="100%" />
</p>
1. Click the column header labelled **Accession** and search for **E-MTAB-7417** in the resulting table sorted by accession numbers.
1. Select the row with the accession number E-MTAB-7417 by double clicking on it.
<p align="center">
<img src="./scNetViz_lab/images/scNetViz3.png" alt="start" width="100%" />
</p>
1. An experiment table with 3 tabs will open. Select the first tab called **TPM**.<br><p align="center"><img src="./scNetViz_lab/images/scNetViz4.png" alt="start" width="100%" /></p>
<br>
<br>
```{block, type="rmd-note"}
This tab contains the genes as rows and the individual cells as columns. Each entry of the table is the normalized count for a given gene in a given cell.
The table has many blank entries. Single cell RNASeq matrices are sparse because not all genes are detected in all cells.
```
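
To illustrate why such a matrix is stored sparsely, here is a minimal sketch of a reader for the MatrixMarket coordinate format, the format of the `.mtx` download mentioned in the Data section. It keeps only the non-zero entries. The tiny matrix is invented for the example, and the sketch assumes a plain `real general` coordinate file without checking other MatrixMarket variants.

```python
import io


def read_matrix_market(handle):
    """Parse a MatrixMarket coordinate file into (n_rows, n_cols, entries).

    entries maps (row, col) with 0-based indices to the stored value.
    """
    header = handle.readline()
    if not header.startswith("%%MatrixMarket matrix coordinate"):
        raise ValueError("expected a MatrixMarket coordinate header")
    line = handle.readline()
    while line.startswith("%"):  # skip optional comment lines
        line = handle.readline()
    n_rows, n_cols, n_entries = (int(x) for x in line.split())
    entries = {}
    for _ in range(n_entries):
        row, col, value = handle.readline().split()
        entries[(int(row) - 1, int(col) - 1)] = float(value)  # file is 1-based
    return n_rows, n_cols, entries


# Invented toy matrix: 3 genes x 4 cells with only 3 detected values.
example = io.StringIO(
    "%%MatrixMarket matrix coordinate real general\n"
    "3 4 3\n"
    "1 1 12.5\n"
    "2 3 4.0\n"
    "3 4 7.25\n"
)
n_genes, n_cells, counts = read_matrix_market(example)
density = len(counts) / (n_genes * n_cells)
print(f"{len(counts)} stored values, density {density:.2f}")
```

Only 3 of the 12 possible gene/cell combinations are stored here; the blank entries in the **TPM** tab correspond to the combinations that are absent from the file.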
<br>
<ol start=6 type="1">
<li>Select the second tab called **Categories**.</li></ol>
<ol start=7 type="1">
<li> Make sure that **Cluster** is selected in the **Available categories** field. </li></ol>
<br>
<br>
```{block, type="rmd-note"}
The clustering has been computed by the Single Cell Expression Atlas. The Louvain algorithm was run with several resolution values. Higher resolutions produce a larger number of clusters and lower resolutions produce fewer clusters.<br>
A resolution of 1 is selected by default by the scNetViz app. It corresponds to the default Louvain setting and is indicated by "True" in the column sel.K. <br>
In this example, the clustering with parameter K=20 (20 clusters) is chosen for further analysis. The other columns indicate the cluster membership of each cell.
```
<br><br>
<p align="center">
<img src="./scNetViz_lab/images/scNetViz5.png" alt="start" width="100%" />
</p>
## Calculate differential expression
<ol start=8 type="1">
<li>Calculate differential expression: </li></ol>
i. To get the table of differentially expressed genes press the button labeled **Calculate Diff Exp**. This initiates a comparison of each cluster versus all other clusters. There are a total of 20 clusters and we will obtain results for each of them in the same table. <br> <p align="center">
<img src="./scNetViz_lab/images/scNetViz6.png" alt="start" width="50%" />
</p>
i. Differential gene expression is calculated using a Wilcoxon rank sum test. The log2(fold change) represents the log ratio of the mean expression of a gene in the cells of the selected cluster versus the mean expression of that gene in all the other cells.
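
As a rough sketch of the two quantities described above, the code below computes a Wilcoxon rank sum test (via its equivalent U statistic, with a normal approximation and tie correction) and the log2 fold change for one gene. The expression values are invented, and the details of scNetViz's own implementation (tie handling, exact versus approximate p-values, treatment of zero means) are not specified here, so treat this as an illustration rather than a reimplementation.

```python
import math
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(group, rest):
    """Two-sided rank sum test; returns (U statistic, approximate p-value)."""
    n1, n2 = len(group), len(rest)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups need at least one value")
    values = list(group) + list(rest)
    ranks = average_ranks(values)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    tie_term = sum(t**3 - t for t in Counter(values).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u, 1.0  # all values identical: no evidence of a difference
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return u, math.erfc(abs(z) / math.sqrt(2))


def log2_fold_change(group, rest):
    """log2 of mean(group) / mean(rest); undefined (nan) if a mean is zero."""
    mean_group = sum(group) / len(group)
    mean_rest = sum(rest) / len(rest)
    if mean_group == 0 or mean_rest == 0:
        return float("nan")
    return math.log2(mean_group / mean_rest)


# Invented expression values of one gene: cells in the cluster vs. all other cells.
in_cluster = [8.0, 9.0, 7.0, 8.0]
other_cells = [1.0, 2.0, 3.0, 2.0]
u_stat, p_value = rank_sum_test(in_cluster, other_cells)
lfc = log2_fold_change(in_cluster, other_cells)
print(u_stat, round(p_value, 4), lfc)
```

A positive log2 fold change together with a small p-value marks the gene as a candidate marker of the cluster.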
<ol start=9 type="1">
<li>Create a protein-protein network for each of the cluster marker genes: </li></ol>
i. Check **Positive Only**. Because each cluster is compared against all remaining clusters, genes with positive scores are specific to the cluster, whereas genes with negative scores are specific to the remaining set of clusters. We are therefore only interested in the genes that are positive and specific to a given cluster.
i. Click on **Create Networks**.
i. For each cluster, a maximum of 200 genes that have an FDR < 0.05 and a logFC > 0 are selected to create each cluster's representative network. (These parameters are adjustable.) <br><p align="center">
<img src="./scNetViz_lab/images/scNetViz7.png" alt="start" width="100%" />
</p>
i. As there are 20 networks to create, this step takes a few seconds.<br><p align="center">
<img src="./scNetViz_lab/images/scNetViz8.png" alt="start" width="50%" />
</p>
i. Once created, the list of networks is visible in the **Network** tab in the control panel on the left. In the picture below, cluster 20 is selected; it contains 79 genes and is displayed in the main network window.
i. In the results panel on the right side, we can see the STRING app parameters, which were used to create the protein interaction networks. Nodes (genes/proteins) are connected with each other if they are known to associate or physically interact with each other. <br><p align="center">
<img src="./scNetViz_lab/images/scNetViz9.png" alt="start" width="100%" />
</p>
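
The gene selection described in this step can be sketched as a simple filter. The default thresholds mirror the ones quoted above (at most 200 genes, FDR below 0.05, logFC above 0). Ranking the passing genes by decreasing logFC is an assumption made for this illustration, and the gene names and values are invented.

```python
def select_marker_genes(results, max_genes=200, fdr_cutoff=0.05, min_log2fc=0.0):
    """Return the names of the top positive, significant genes for one cluster."""
    passing = [
        r for r in results
        if r["fdr"] < fdr_cutoff and r["log2fc"] > min_log2fc
    ]
    # Assumption: strongest positive fold changes first.
    passing.sort(key=lambda r: r["log2fc"], reverse=True)
    return [r["gene"] for r in passing[:max_genes]]


# Invented differential expression results for one cluster.
toy_results = [
    {"gene": "GeneA", "log2fc": 3.1, "fdr": 0.001},
    {"gene": "GeneB", "log2fc": -2.4, "fdr": 0.002},  # negative: dropped
    {"gene": "GeneC", "log2fc": 1.2, "fdr": 0.20},    # not significant: dropped
    {"gene": "GeneD", "log2fc": 0.8, "fdr": 0.01},
]
markers = select_marker_genes(toy_results)
print(markers)
```

Lowering `max_genes` or tightening `fdr_cutoff` gives smaller networks, which corresponds to adjusting the parameters in the scNetViz dialog.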
## Pathway Analysis of clusters
<ol start=10 type="1">
<li>The next step is to perform pathway analysis on one of these lists of genes/one of these clusters.</li></ol>
i. Click on the network named **Cluster 15**. Cluster 15 has the highest number of connections: it contains 192 nodes (genes/proteins) and 1045 edges (associations/interactions).
i. Locate the STRING app tab in the results panel on the right, unpin it and make it larger.
i. Click on **Functional Enrichment**.<br><p align="center"> <img src="./scNetViz_lab/images/scNetViz10.png" alt="start" width="100%" /></p>
i. A **Retrieve functional enrichment** window opens up. Click on **OK**.<br><p align="center"> <img src="./scNetViz_lab/images/scNetViz11.png" alt="start" width="50%" /></p>
i. The STRING enrichment table appears in the String Enrichment Panel below the network in the Table Panel. The pathways are ranked by the FDR values of the enrichment. <br><p align="center"> <img src="./scNetViz_lab/images/scNetViz12.png" alt="start" width="100%" /> </p>
i. The STRING app uses more than 15 pathway and gene-set sources to calculate pathway enrichment. We can filter the results to display only the GO Biological Process results for clarity.
i. Click on the funnel icon in the top left of the STRING enrichment table.
i. Select categories **GO Biological Process** and click on **OK** <br><p align="center"> <img src="./scNetViz_lab/images/scNetViz13.png" alt="start" width="50%" /></p>
i. Now the table contains only the results for the **GO Biological Process** pathways.<br><p align="center"> <img src="./scNetViz_lab/images/scNetViz14.png" alt="start" width="100%" /></p>
i. Click on the top GO term. It will highlight in yellow all the genes annotated to this pathway contained in the network.
i. Optional - Perform the same analysis on the remaining clusters.
* Here are the functional enrichment results for each cluster.
i. Extracellular matrix
i. Keratinization
i. Tissue Development/ cell migration
i. Skin development/Epithelial cell differentiation
i. Regulation of metabolic process
i. Angiogenesis/ Blood Vessel Development
i. Tissue development/ Skeletal muscle cell differentiation
i. Anatomical structure development. Nervous System Development.
i. Muscle Structure Development
i. Nervous System Development
i. Tissue Development
i. Small molecule Metabolic Process
i. Response to endogenous stimulus
i. Skin Development
i. Immune system/Cell activation
i. Tissue Development/Cell migration
i. Lipid Biosynthesis
i. Regulation of Cell motility
i. Angiogenesis
i. Regulation of Cell motility
This analysis highlights the heterogeneity of skin tissue, which is composed of differentiated keratinocytes but also epidermal stem cells, fibroblasts, endothelial cells, and immune cells such as T cells, B cells and macrophages.<br>
Pathway enrichment analysis with marker genes can help identify cell types based on the scRNA clustering, and further steps might focus on clusters of interest only.
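
The enrichment table used above ranks pathways by FDR. A common way to obtain such values is an over-representation test followed by a multiple-testing correction. The sketch below uses a hypergeometric test and the Benjamini-Hochberg procedure. It is meant to show the general idea and is not a description of the STRING app's internals; all numbers are invented.

```python
import math


def hypergeom_upper_tail(k, n_drawn, n_success, n_total):
    """P(X >= k) when drawing n_drawn genes from n_total, n_success of which are in the pathway."""
    upper = min(n_drawn, n_success)
    favourable = sum(
        math.comb(n_success, i) * math.comb(n_total - n_success, n_drawn - i)
        for i in range(k, upper + 1)
    )
    return favourable / math.comb(n_total, n_drawn)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (FDR) in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Invented example: 20 background genes, 5 belong to the pathway,
# and 3 of the 4 genes in the network belong to the pathway.
p_pathway = hypergeom_upper_tail(k=3, n_drawn=4, n_success=5, n_total=20)
fdr = benjamini_hochberg([0.01, 0.04, 0.03])
print(round(p_pathway, 4), fdr)
```

With real data, one p-value is computed per pathway and the adjusted values are what a table sorted by FDR would display.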
```{block, type="rmd-tip"}
We took the top pathway to annotate the cluster. It is an arbitrary decision. STRING has a similar interface to EnrichmentTable and has the option to create an EnrichmentMap or enhanced graphics from the STRING pathway enrichment results [see cytoscape primer for steps on how to use this feature](#enrichmenttabl-features). Combined with the AutoAnnotate app, it might be a more thorough approach to explore the functions associated with each cluster.
```
<p align="center">
<img src="./scNetViz_lab/images/scNetViz20.png" alt="start" width="60%" />
</p>
<br>
<br>
```{block, type="rmd-bonus"}
**Color the nodes of cluster 15 proportional to the logFC**
1. Locate the **Style** tab and select the **Fill Color**. Expand it using the down arrow.
1. In the column field, select **Cluster 15 log2FC** attribute
1. Double-click on the color gradient<br><p align="center"> <img src="./scNetViz_lab/images/scNetViz15.png" alt="start" width="60%" /></p>
1. Choose a color palette of your choice.
1. An option to choose a colorblind-friendly color gradient is available.<br><p align="center"> <img src="./scNetViz_lab/images/scNetViz16.png" alt="start" width="60%" /></p>
1. Click **Yes** on the message "This will reset your current settings. Are you sure you want to continue?"<br><p align="center"> <img src="./scNetViz_lab/images/scNetViz17.png" alt="start" width="60%" /></p>
1. Another window called "Continuous Mapping Editor for Node Fill Color" appears. Click on **OK**.<br><p align="center"> <img src="./scNetViz_lab/images/scNetViz18.png" alt="start" width="60%" /></p>
1. The nodes of the network are now colored using the logFC (fold change) from the differential expression analysis. Red color indicates the top cluster 15 marker genes.<br><p align="center"> <img src="./scNetViz_lab/images/scNetViz19.png" alt="start" width="60%" /> </p>
```
## Automation (for advanced users)
scNetViz provides its own automation commands, which scripts can use to control scNetViz operations.
Details are available in the Swagger documentation (**Help** -> **Automation** -> **CyREST Command API**) and in the scNetViz reference paper.
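
As a minimal sketch of how a script could call such a command through CyREST, the code below builds a POST request for the CyREST commands endpoint without sending it. The base URL is CyREST's usual local default. The namespace `scnetviz`, the command name `load gxa experiment` and the `accession` argument are assumptions made for illustration; confirm the exact names and arguments in the Swagger documentation before using them.

```python
import json
import urllib.parse
import urllib.request

# Default local CyREST address; adjust if Cytoscape listens on another port.
CYREST_BASE = "http://localhost:1234/v1"


def build_command_request(namespace, command, arguments, base_url=CYREST_BASE):
    """Build (but do not send) a POST request for a Cytoscape command."""
    url = f"{base_url}/commands/{namespace}/{urllib.parse.quote(command)}"
    body = json.dumps(arguments).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical command and argument names; check them in the Swagger UI first.
request = build_command_request(
    "scnetviz", "load gxa experiment", {"accession": "E-MTAB-7417"}
)
print(request.get_method(), request.full_url)

# With Cytoscape running, the request could then be sent like this:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the URL and payload before Cytoscape is involved.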
## scNetViz References
- video ISCB 2019: https://www.youtube.com/watch?v=GGpsWKD9iQE&t=36s
- reference paper: https://pubmed.ncbi.nlm.nih.gov/34912541/
- tutorial: http://www.rbvi.ucsf.edu/cytoscape/scNetViz/index.shtml