
ENAR2024

Ali Rahnavard edited this page Jun 20, 2023 · 2 revisions

Welcome to the wiki of the ENAR 2024 Short Course!

Meta-analysis, biomarker discovery, and pathway enrichment analysis of omics data

Organized by the GW Computational Biology Institute and GW Libraries and Academic Innovation.


Abstract

High-throughput technologies, paired with methodological advances, now make it possible to capture a comprehensive multi-omics snapshot of distinct biological entities. In particular, low-cost, culture-independent profiling has made omics surveys of human health, other hosts, and the environment feasible at an unprecedented scale. The resulting data have stimulated the development of new statistical and computational approaches to analyze and integrate omics data, including human gene expression, microbial gene products, metabolites, and proteins.

Metabolomics data generated on diverse platforms are often analyzed individually. We aim to combine metabolite profiles and feed them into generic downstream analysis software with proper attention to the data's statistical properties, yielding more statistically powerful analyses and stronger biological inferences. In addition, the collection of downstream analysis software is overwhelmingly large, and selecting an appropriate tool can be difficult for untrained researchers and non-specialists.

We also present a high-level introduction to computational multi-omics, highlighting the state of the art in the field and outstanding challenges in downstream analysis methods. The workshop will cover formulating biological hypotheses and identifying the statistical methods currently available to test them. The workshop is project-focused and hands-on. Participants are encouraged to attend with a specific study or project in mind so that they can apply the workshop content in the short term. The exercises will use real data.

Rationale for Workshop

This workshop is a joint effort between George Washington University and Merck Research Laboratories, with open and FAIR resources available on GitHub. Researchers from industry and academia will share diverse perspectives on the topic, from both drug discovery and basic science angles, giving attendees a holistic view of multi-omics and clinical data integration through state-of-the-art tools applied to motivating examples and use cases. We will begin with an overview of the statistical challenges inherent in analyzing the high-dimensional data typical of multi-omics studies. Introductory lectures will cover: 1) challenges associated with precisely testing for multivariable association in population-scale meta-omics studies; 2) challenges and advances in pathway enrichment analysis, including techniques and characterization of omics features; and 3) meta-analysis of metabolomics datasets for high-sensitivity discovery and integration with other data types, such as metagenomics data.

Learning objectives

  1. Workshop attendees will use tools for metabolomics meta-analysis through multi-study data scaling, integration, and harmonization using the massSight tool.
  2. Workshop attendees will use tools for pattern discovery in multi-omics with metabolomics data. Interspersed with lecture content, attendees will work through multi-omics analysis tutorials, including
  • Tweedieverse tutorial and Tweedieverse examples: a unified statistical framework for differential analysis of multi-omics,
  • omePath: omics pathway enrichment analysis, and
  • omeClust: omics community detection using multi-resolution clustering.
  3. Attendees will practice generating publication-quality figures and effective visualization of the results.

Learning outcomes for participants

Participants will:

  1. Be able to apply novel techniques (such as massSight) to combine metabolite profiles and perform meta-analysis of metabolomics data.
  2. Understand statistical properties of metabolomics data and challenges for multivariable association testing in population-scale meta-omics studies.
  3. Understand how to apply pathway enrichment analysis to metabolomics data using a variety of statistical methods implemented in omePath, and
  4. Be able to perform a meta-analysis of metabolomics datasets by combining data from multiple studies, and perform pairwise association testing with other omics profiles in population-scale datasets.
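
For readers new to pathway enrichment, the idea behind outcome 3 can be illustrated with a generic over-representation test. The sketch below is written in Python with only the standard library. It is not omePath's API and not necessarily one of the methods omePath implements. The function name, the metabolite identifiers, and the pathway membership are all hypothetical toy values chosen for illustration. It asks: if a given number of metabolites are flagged as significant, how surprising is it that at least a certain number of them fall in a given pathway?

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_hits, n_overlap):
    """Upper-tail hypergeometric p-value, P(X >= n_overlap).

    n_background: number of measured metabolites
    n_pathway:    number of background metabolites annotated to the pathway
    n_hits:       number of metabolites flagged as significant
    n_overlap:    number of flagged metabolites that are in the pathway
    """
    max_overlap = min(n_pathway, n_hits)
    total = comb(n_background, n_hits)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_hits - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical toy data: identifiers are placeholders, not real metabolites.
background = {f"m{i}" for i in range(1, 21)}   # 20 measured metabolites
pathway = {"m1", "m2", "m3", "m4", "m5"}       # 5 annotated to one pathway
significant = {"m1", "m2", "m3", "m10"}        # 4 flagged by a differential analysis

overlap = len(pathway & significant)
p_value = hypergeom_enrichment_p(
    len(background), len(pathway), len(significant), overlap
)
print(f"overlap = {overlap}, enrichment p-value = {p_value:.4f}")
```

A small p-value suggests the pathway contains more flagged metabolites than expected by chance. In practice, many pathways are tested at once, so the p-values would also need a multiple-testing correction.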

Preparation

Preparation tasks are optional, but completing them lets the organizers focus on scientific discussion rather than troubleshooting technical issues.

  • Install the latest versions of R and RStudio on your local computer
  • Install the software listed in the learning objectives
  • Try running the demos for each tool
  • Bring your own data to apply these techniques

Tips

  • On Windows, please use Command Prompt with administrator access

Organizers

Ali Rahnavard: George Washington University (Organizer, Instructor)

Himel Mallick: Cornell University (Instructor)

Acknowledgement

This material is based upon work supported by the National Science Foundation under Grant Number 2109688 and by the Bill & Melinda Gates Foundation under Investment Number 016930.