GSEA
Objective: perform a GSEA preranked analysis on mesenchymal and immunoreactive ovarian cancer subtypes
Time estimated: 2 h; taken 1.5 h;
Date started: 2023-3-20 ; completed: 2023-3-20
Issue: could not open the Docker image with GSEA. Solution: downloaded the GSEA v4.3.2 Mac App from the official GSEA website.
- Download data:
- mesenchymal vs immuno rank file, using the GitHub URL
- newest Bader Lab gene set file, with no IEA: [Human_GOBP_AllPathways_no_GO_iea_March_02_2023_symbol.gmt](http://download.baderlab.org/EM_Genesets/March_02_2023/Human/symbol/Human_GOBP_AllPathways_no_GO_iea_March_02_2023_symbol.gmt)
- Upload using Load data
- Run GSEAPreranked
- Collapse: No_Collapse
- Basic fields: Max size 200
- The rest of the parameters were left at their defaults; the default permutation method is gene set permutation.
Issue: no Max size field available for selection. Solution: scroll right to Basic fields and click Show.
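The same settings could in principle be passed to GSEA's command-line launcher instead of the desktop app. The sketch below only assembles the argument list and does not run anything. The launcher name `gsea-cli.sh`, the file names, the report label and `nperm=1000` (assumed to be the default number of permutations) are assumptions, not values recorded in this entry.

```python
def build_preranked_command(rnk, gmt, out_dir,
                            label="mesenchymal_vs_immuno",
                            set_min=15, set_max=200, nperm=1000):
    """Assemble (but do not run) a GSEAPreranked command line."""
    return [
        "gsea-cli.sh", "GSEAPreranked",   # launcher name is an assumption
        "-rnk", rnk,                      # ranked gene list
        "-gmx", gmt,                      # gene set file (.gmt)
        "-collapse", "No_Collapse",       # ranks are already on gene symbols
        "-nperm", str(nperm),             # assumed default permutation count
        "-set_min", str(set_min),         # minimum gene set size
        "-set_max", str(set_max),         # maximum gene set size
        "-rpt_label", label,              # hypothetical report label
        "-out", out_dir,                  # hypothetical output folder
    ]

# Hypothetical file names, for illustration only.
cmd = build_preranked_command("ranks.rnk", "genesets.gmt", "gsea_out")
print(" ".join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` once the launcher is on the `PATH`.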
- mesenchymal vs immuno rank file
- genesets from the baderlab geneset collection containing GO biological process, no IEA and pathways.
- maximum geneset size of 200
- Exclude huge pathways, increase specificity of result
- minimum geneset size of 15
- gene set permutation
- We use a ranked list; the alternative, phenotype permutation, is not optimized for RNA-seq datasets.
- Permutation yields enrichment scores for every gene set S under every permutation; the actual ES is then located in this null distribution to obtain the NES, p-value and FDR.
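To make the permutation step concrete, here is a minimal sketch of a weighted running-sum enrichment score with a gene set permutation null. It is a simplified illustration, not GSEA's own implementation: the toy ranked list, the function names and the number of permutations are assumptions, and the FDR step (which pools NES values across all gene sets) is left out.

```python
import random

def enrichment_score(ranked, gene_set, p=1.0):
    """Weighted running-sum ES; `ranked` is (gene, score) sorted by descending score."""
    in_set = [gene in gene_set for gene, _ in ranked]
    weights = [abs(score) ** p if hit else 0.0
               for (_, score), hit in zip(ranked, in_set)]
    total = sum(weights)
    n_miss = in_set.count(False)
    if total == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for hit, w in zip(in_set, weights):
        running += w / total if hit else -1.0 / n_miss
        if abs(running) > abs(best):      # keep the largest deviation from zero
            best = running
    return best

def gene_set_permutation(ranked, gene_set, n_perm=200, seed=0):
    """Return (ES, NES, nominal p) using random gene sets of the same size."""
    rng = random.Random(seed)
    genes = [g for g, _ in ranked]
    size = len(gene_set & set(genes))
    observed = enrichment_score(ranked, gene_set)
    null = [enrichment_score(ranked, set(rng.sample(genes, size)))
            for _ in range(n_perm)]
    # Only null scores with the same sign as the observed ES are used.
    same_sign = [abs(e) for e in null if (e >= 0) == (observed >= 0)]
    if not same_sign:
        return observed, float("nan"), float("nan")
    nes = observed / (sum(same_sign) / len(same_sign))
    p_value = sum(e >= abs(observed) for e in same_sign) / len(same_sign)
    return observed, nes, p_value

# Toy ranked list: 20 hypothetical genes with scores 10, 9, ..., -9.
toy_ranked = [(f"G{i}", 10 - i) for i in range(20)]
toy_set = {"G0", "G1", "G2", "G3"}        # all at the top of the list
es, nes, p_value = gene_set_permutation(toy_ranked, toy_set)
print(es, nes, p_value)
```

Because the toy gene set sits entirely at the top of the list, its ES reaches the maximum of 1, and the NES rescales that by the mean of the positive null scores.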
- Explain the reasons for using each of the above parameters.
- We used the newest baderlab geneset because
- It is updated on a monthly basis, so it contains more pathways than the GSEA default collection and is more up to date.
- GO biological process is included, and IEA (electronically inferred) annotations are excluded, so the pathways are more credible.
- Maximum geneset size is set to 200
- We want to exclude huge pathways and increase specificity of our result.
- Minimum geneset size is set to 15
- We want to exclude extremely small pathways that contain only a few genes. This ensures that the resulting pathways are not too specific.
- We chose gene set permutation
- The other option, phenotype permutation, is not optimized for RNA-seq datasets.
- We used a ranked gene list, and we want to maintain the ranks during the GSEA permutations.
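The size limits above can be illustrated with a short sketch that parses GMT-formatted lines (name, description, then genes, tab-separated) and keeps only the gene sets with 15 to 200 genes. Sizes are counted after restricting each set to genes present in the ranked list, which is how GSEA is understood to apply the thresholds. The function names and the toy data are hypothetical.

```python
def parse_gmt(lines):
    """Parse GMT lines into {gene_set_name: set_of_genes}."""
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:               # need name, description, >= 1 gene
            continue
        gene_sets[fields[0]] = {g for g in fields[2:] if g}
    return gene_sets

def filter_by_size(gene_sets, ranked_genes, min_size=15, max_size=200):
    """Keep gene sets whose overlap with the ranked list is within the limits."""
    ranked = set(ranked_genes)
    kept = {}
    for name, genes in gene_sets.items():
        overlap = genes & ranked
        if min_size <= len(overlap) <= max_size:
            kept[name] = overlap
    return kept

def toy_line(name, n):
    """Build one hypothetical GMT line with n genes named G0..G(n-1)."""
    return "\t".join([name, "toy description"] + [f"G{i}" for i in range(n)])

toy_gmt = [toy_line("SMALL", 3), toy_line("MEDIUM", 20), toy_line("HUGE", 250)]
toy_ranked_genes = [f"G{i}" for i in range(300)]

kept = filter_by_size(parse_gmt(toy_gmt), toy_ranked_genes)
print(sorted(kept))
```

Only the 20-gene set survives here: the 3-gene set is too specific and the 250-gene set is too broad.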
- What is the top gene set returned for the Mesenchymal subtype?
- HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION%MSIGDBHALLMARK%HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION
- What are the p-value, ES, NES and FDR associated with it?
- pvalue: 0.000
- ES: 0.87
- NES: 2.59
- FDR: 0.000
- How many genes in its leading edge?
- 147 genes in its leading edge.
- What is the top gene associated with this gene set?
- Top gene is FBN1.
- What is the top gene set returned for the Immunoreactive subtype?
- HALLMARK_INTERFERON_ALPHA_RESPONSE%MSIGDBHALLMARK%HALLMARK_INTERFERON_ALPHA_RESPONSE
- What are the p-value, ES, NES and FDR associated with it?
- pvalue: 0.000
- ES: -0.86
- NES: -2.90
- FDR: 0.000
- How many genes in its leading edge?
- 79 genes in its leading edge.
- What is the top gene associated with this gene set?
- Top gene is PROCR.
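The leading-edge counts reported above come from the position of the running-sum peak. As a rough illustration, the sketch below returns the ES together with the leading-edge genes: for a positive ES, the set members ranked at or before the peak; for a negative ES, the set members ranked after the trough. The ten-gene ranked list and gene names are made up and are not taken from the analysis.

```python
def leading_edge(ranked, gene_set, p=1.0):
    """Return (ES, leading-edge genes) for a descending (gene, score) list."""
    in_set = [gene in gene_set for gene, _ in ranked]
    weights = [abs(score) ** p if hit else 0.0
               for (_, score), hit in zip(ranked, in_set)]
    total = sum(weights)
    n_miss = in_set.count(False)
    if total == 0 or n_miss == 0:
        return 0.0, []
    running, best, peak = 0.0, 0.0, -1
    for i, (hit, w) in enumerate(zip(in_set, weights)):
        running += w / total if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best, peak = running, i       # remember where the extreme occurs
    if best >= 0:
        window = range(0, peak + 1)       # top of the list up to the peak
    else:
        window = range(peak, len(ranked)) # from the trough to the bottom
    return best, [ranked[i][0] for i in window if in_set[i]]

# Hypothetical ranked list: positive scores first, negative scores last.
toy = list(zip("ABCDEFGHIJ", [5, 4, 3, 2, 1, -1, -2, -3, -4, -5]))

es_pos, edge_pos = leading_edge(toy, {"A", "B", "H"})   # enriched at the top
es_neg, edge_neg = leading_edge(toy, {"C", "I", "J"})   # enriched at the bottom
print(es_pos, edge_pos)
print(es_neg, edge_neg)
```

In this toy case the first set has a positive ES with `A` and `B` in its leading edge, and the second has a negative ES with `I` and `J`, mirroring how the mesenchymal (positive) and immunoreactive (negative) results are read.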
Summary: GSEA accounts for the signal from all genes instead of only the top differentially expressed ones. The positive and negative enrichment values correspond to the two ovarian cancer subtypes (positive for mesenchymal, negative for immunoreactive), which GSEA detects automatically as phenotypes.
References:
Mootha, V. K., Lindgren, C. M., Eriksson, K. F., Subramanian, A., Sihag, S., Lehar, J., Puigserver, P., Carlsson, E., Ridderstrale, M., Laurila, E., et al. (2003). PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat Genet 34, 267-273.