
G:Profiler

Lola-W edited this page Mar 6, 2023 · 2 revisions

G:Profiler enrichment analysis


Objective: Run a g:profiler enrichment analysis on a given list of genes.

Time estimated: 2 h; taken: 2 h

Date started: 2023-03-05; completed: 2023-03-05


Conditions:

  • Query set: a gene list
  • Enrichment analysis: g:profiler
  • Parameters used:
    • All results
    • Data sources: Reactome, GO biological process, and Wiki pathways
    • Multiple hypothesis testing correction (less strict): Benjamini-Hochberg FDR
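
The conditions above could also be expressed as a request body for the g:Profiler (g:GOSt) web API. The following is a minimal sketch, not the procedure used in this journal entry (the analysis here was run in the web interface). The field names follow the g:GOSt JSON API as I understand it; the organism `hsapiens`, the threshold `0.05`, and the helper name `build_gost_payload` are assumptions for illustration.

```python
import json


def build_gost_payload(genes, organism="hsapiens"):
    """Build a g:GOSt-style request body mirroring the conditions listed above.

    The organism and the 0.05 threshold are assumptions; they are not stated
    in the journal entry.
    """
    return {
        "organism": organism,
        "query": list(genes),
        # Data sources: GO biological process, Reactome, Wiki pathways
        "sources": ["GO:BP", "REAC", "WP"],
        # "All results": return non-significant terms as well
        "all_results": True,
        # Benjamini-Hochberg FDR instead of the stricter default
        "significance_threshold_method": "fdr",
        "user_threshold": 0.05,
    }


# Placeholder query: a single Ensembl ID taken from this page (TFEC).
payload = build_gost_payload(["ENSG00000105967"])
print(json.dumps(payload, indent=2))
```

A body like this would typically be sent as JSON in a POST request to the g:GOSt profile endpoint; check the current g:Profiler API documentation for the exact URL and field names before relying on it.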

Process:

  1. To resolve duplicate identifiers, choose the option "Select the Ensembl ID with the most GO annotations".
  2. Rerun query
  3. Go to Detailed Results
  4. Set the maximum term size to 200 to increase specificity.

  5. Get the number of genes: click the arrows next to the stats heading to show T (term size), Q (query size), and T∩Q (query genes in the term).
  6. Search for a specific gene in the downloaded GEM file.
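
Searching the downloaded GEM file for one gene can be scripted. This is a rough sketch that assumes the GEM file is tab-separated with the columns `GO.ID`, `Description`, `p.Val`, `FDR`, `Phenotype`, and `Genes` (a comma-separated gene list); check the header of your own download. The function name `terms_containing_gene` and the two-row example are hypothetical and are not taken from the real results.

```python
import csv
import io


def terms_containing_gene(gem_text, gene):
    """Return (term ID, description) pairs whose gene list contains `gene`."""
    reader = csv.DictReader(io.StringIO(gem_text), delimiter="\t")
    hits = []
    for row in reader:
        # The Genes column is assumed to be a comma-separated list of symbols.
        genes = {g.strip().upper() for g in row["Genes"].split(",")}
        if gene.upper() in genes:
            hits.append((row["GO.ID"], row["Description"]))
    return hits


# Made-up GEM excerpt for illustration only (not the real download).
example_gem = (
    "GO.ID\tDescription\tp.Val\tFDR\tPhenotype\tGenes\n"
    "TERM:1\texample term one\t0.01\t0.01\t1\tGENEA,TFEC\n"
    "TERM:2\texample term two\t0.02\t0.02\t1\tGENEB,GENEC\n"
)
print(terms_containing_gene(example_gem, "TFEC"))
```

For a real file, read its contents with `open(path).read()` and pass the text in place of `example_gem`.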

Results:

  1. What is the top term returned in each data source?

    Top terms for:

    • GO biological process: immune system process, GO:0002376
    • Reactome: Immune System, REAC:R-HSA-168256
    • Wiki pathways: Allograft rejection, WP:WP2328
  2. How many genes are in each of the above genesets returned? (hint, in the Detailed results tab of g:profiler results if you click on the arrows next to the stats heading you will be able to see the number of genes in a term, number of genes in your query and number of genes in your query that are also in your term)

    • 2683 genes are in GO:0002376.
    • 2041 genes are in REAC:R-HSA-168256.
    • 88 genes are in WP:WP2328.
  3. How many genes from our query are found in the above genesets?

    • In GO:0002376, 290 out of 430 genes from our query are found.
    • In REAC:R-HSA-168256, 220 out of 334 genes from our query are found.
    • In WP:WP2328, 32 out of 291 genes from our query are found.
  4. Change g:profiler settings so that you limit the size of the returned genesets. Make sure the returned genesets are between 5 and 200 genes in size. Did that change the results?

    • It changed the top term for GO biological process and Reactome, while the top hit for Wiki pathways stays the same.
    • The top terms for GO biological process and Reactome are now antigen processing and presentation (GO:0019882) and Immunoregulatory interactions between a Lymphoid and a non-Lymphoid cell (REAC:R-HSA-198933), respectively.
  5. Which of the 4 ovarian cancer expression subtypes do you think this list represents?

    • I think this list represents the immunoreactive ovarian cancer expression subtype. The top terms returned by all 3 data sources are immune-related. Moreover, when we limit the size of the returned genesets, one of the top terms involves lymphoid cells, which take part in the immune response.
  6. Bonus: The top gene returned for this comparison is TFEC (Ensembl gene ID: ENSG00000105967). Is it found annotated in any of the pathways returned by g:profiler for our query? What terms is it associated with in g:profiler?

    • When we limit the size of the returned genesets to 5-200, TFEC is found in 3 genesets returned from GO biological process: GO:0009266, GO:0009408, and GO:0034605. The associated terms concern the response to temperature and heat.
    • When we remove the size limit and use the default of 1-10000, TFEC is found in 85 genesets from GO biological process. Some of the associated terms are response to stimulus/stress, regulation of biological process, and metabolic process.
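
The counts reported in answers 2 and 3 can be turned into overlap fractions to compare the three top terms. This is a small sketch using only the numbers above; the helper name `overlap_fractions` is hypothetical.

```python
# (term ID, term size T, query size Q, overlap T∩Q), copied from answers 2 and 3
results = [
    ("GO:0002376", 2683, 430, 290),
    ("REAC:R-HSA-168256", 2041, 334, 220),
    ("WP:WP2328", 88, 291, 32),
]


def overlap_fractions(term_size, query_size, intersection):
    """Return (fraction of the query in the term, fraction of the term covered)."""
    return intersection / query_size, intersection / term_size


for term_id, t, q, tq in results:
    of_query, of_term = overlap_fractions(t, q, tq)
    print(f"{term_id}: {of_query:.1%} of query, {of_term:.1%} of term")
```

The two large immune terms contain most of the query genes but only a small share of each term is covered, whereas the much smaller Wiki pathways term has a larger share of its genes in the query. The query size differs per data source, presumably because g:profiler counts only the query genes annotated in that source.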

Summary: In g:profiler, constraining the geneset size makes the returned results more specific, while allowing larger genesets gives a broader picture of the overall category each result falls into.
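
The effect of the size constraint can be illustrated on the three top terms from the unrestricted run. This is only a sketch: the dictionary keys (`native`, `term_size`) are assumed to mirror g:Profiler result fields, the function name `filter_by_term_size` is hypothetical, and the term sizes are the ones reported in the Results.

```python
def filter_by_term_size(terms, min_size=5, max_size=200):
    """Keep only terms whose size lies within [min_size, max_size]."""
    return [t for t in terms if min_size <= t["term_size"] <= max_size]


# Top terms from the unrestricted run, with the term sizes reported above.
top_terms = [
    {"native": "GO:0002376", "term_size": 2683},
    {"native": "REAC:R-HSA-168256", "term_size": 2041},
    {"native": "WP:WP2328", "term_size": 88},
]

kept = [t["native"] for t in filter_by_term_size(top_terms)]
print(kept)
```

Only the Wiki pathways term survives the 5-200 filter, which is consistent with the observation that the top hit changed for GO biological process and Reactome but not for Wiki pathways.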


References:

Raudvere, U., Kolberg, L., Kuzmin, I., Arak, T., Adler, P., Peterson, H., & Vilo, J. (2019). g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Research. https://doi.org/10.1093/nar/gkz369

Verhaak, R. G. W., Tamayo, P., Yang, J.-Y., Hubbard, D., Zhang, H., Creighton, C. J., Fereday, S., Lawrence, M., Carter, S. L., Mermel, C. H., Kostic, A. D., Etemadmoghadam, D., Saksena, G., Cibulskis, K., Duraisamy, S., Levanon, K., Sougnez, C., Tsherniak, A., Gomez, S., … Meyerson, M. (2012). Prognostically relevant gene signatures of high-grade serous ovarian carcinoma. Journal of Clinical Investigation. https://doi.org/10.1172/jci65833
