Update paper.md
KaiyanM authored Nov 20, 2023
1 parent 46e5aa5 commit c176cb7
Showing 1 changed file with 8 additions and 9 deletions.
17 changes: 8 additions & 9 deletions paper/paper.md
@@ -3,25 +3,23 @@ title: 'MolPad: An R-Shiny Package for Cluster Co-Expression Analysis in Longitu
tags:
- R
- Shiny
- multi-omics
- microbiomics
- visualization
- cluster analysis
- network
authors:
- name: Kaiyan Ma
orcid: 0000-0002-7355-8924
equal-contrib: true
affiliation: 1
- name: Author Without ORCID
equal-contrib: true # (This is how you can denote equal contributions between multiple authors)
affiliation: "1, 2" # (Multiple affiliations must be quoted)
- name: Author with no affiliation
- name: Author
affiliation: "1, 2"
- name: Author
corresponding: true # (This is how to denote the corresponding author)
affiliation: 3
affiliations:
- name: University of Wisconsin-Madison, USA
index: 1
- name: Institution Name, Country
- name: Wisconsin Institute for Discovery, USA
index: 2
date: 18 November 2023
bibliography: paper.bib
@@ -44,11 +42,12 @@ In response to the above issues, previous studies on interactive visualization t

# Methods

## Network Generation
We first scale and cluster the trajectories of all molecular features to depict their longitudinal changes. For clustering, we use K-means with a built-in elbow method to choose the number of clusters. Then, we predict a co-expression network for the extracted patterns, similar to how GENIE3 [@GENIE3] infers a gene regulatory network. As in GENIE3, we divide the prediction process into individual regression tasks: each cluster's central pattern is predicted from the central patterns of all the other clusters using Random Forests, a tree-based ensemble method. We chose Random Forests because they can handle interacting features and non-linearity without extra assumptions. The Mean Decrease Accuracy of the top predictors, whose expression most directly influences the expression of the target cluster, is taken as an indication of a putative link. In other words, if two groups of features are strongly linked in the network, the random forest prediction indicates that they have strongly related longitudinal patterns, as shown in Fig \ref{fig:pattern}.
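
The scaling and clustering step can be illustrated with a minimal sketch. It is written in Python for illustration only and is not MolPad's R implementation. The farthest-point initialization, the second-difference elbow rule, and all function names are assumptions made here; the text above states only that K-means and an elbow method are used.

```python
import math
import random


def zscore(row):
    """Scale one trajectory to mean 0 and unit variance (constant rows become zeros)."""
    mean = sum(row) / len(row)
    sd = math.sqrt(sum((x - mean) ** 2 for x in row) / len(row))
    return [(x - mean) / sd if sd > 0 else 0.0 for x in row]


def sq_dist(a, b):
    """Squared Euclidean distance between two trajectories."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(rows, k):
    """Deterministic farthest-point initialization (an assumption, chosen for reproducibility)."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centroids))
        centroids.append(list(far))
    return centroids


def kmeans(rows, k, n_iter=100):
    """Lloyd's K-means; returns (centroids, labels, within-cluster sum of squares)."""
    centroids = init_centroids(rows, k)
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(r, centroids[c])) for r in rows]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    wss = sum(sq_dist(r, centroids[lab]) for r, lab in zip(rows, labels))
    return centroids, labels, wss


def elbow_k(rows, k_max=6):
    """Pick k where the WSS curve bends most (largest second difference)."""
    wss = [kmeans(rows, k)[2] for k in range(1, k_max + 1)]
    curvature = [wss[i - 1] - 2 * wss[i] + wss[i + 1] for i in range(1, len(wss) - 1)]
    return curvature.index(max(curvature)) + 2  # curvature[0] belongs to k = 2


# Synthetic trajectories: three hypothetical shapes, ten noisy copies each.
shapes = [[1, 2, 3, 4, 5], [5, 4, 3, 2, 1], [1, 3, 5, 3, 1]]
rng = random.Random(1)
data = [zscore([v + rng.gauss(0, 0.1) for v in s]) for s in shapes for _ in range(10)]
best_k = elbow_k(data)
print(best_k)
```

The centroids returned by `kmeans(data, best_k)` play the role of the central patterns that the regression step then tries to predict from one another.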

## Network Navigation
Navigating the network in the MolPad dashboard follows three steps. First, choose a primary functional annotation. Adjustment options for fine-tuning include the network layout and an importance threshold that controls edge density. Nodes that turn bright green (Fig \ref{fig:dashboard}.A) represent clusters containing the most features in the chosen functional annotation. Second, brush on the network to reveal patterns of taxonomic composition (Fig \ref{fig:dashboard}.B) and typical trajectories (Fig \ref{fig:dashboard}.C). The user can also zoom into specific taxonomic annotations by filtering. Third, view the feature table (Fig \ref{fig:dashboard}.D), examine the drop-down options for other related functional annotations, and follow the links to online information about items of interest. The interface is designed to support iterative exploration, encouraging the use of several steps to answer specific questions, such as comparing the pattern distribution between two functions or finding functionally important community members that metabolize a feature of interest. Overall, this aggregation adopts the focus-plus-context approach to address the low interoperability of the network graph, facilitating the examination of high-level details for individual features while providing contextual information about cluster interactions among microbiome data.
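
To show how an importance threshold can control edge density, here is a small Python sketch. The function names and the importance values are hypothetical; this is not MolPad's code or data.

```python
def threshold_edges(importance, threshold):
    """Keep directed edges (predictor -> target) whose importance meets the threshold."""
    n = len(importance)
    return [
        (src, dst, importance[src][dst])
        for src in range(n)
        for dst in range(n)
        if src != dst and importance[src][dst] >= threshold
    ]


def edge_density(edges, n_nodes):
    """Fraction of the n * (n - 1) possible directed edges that were kept."""
    return len(edges) / (n_nodes * (n_nodes - 1))


# Hypothetical importance scores for three clusters: importance[i][j] is how
# strongly cluster i's central pattern predicts cluster j's central pattern.
importance = [
    [0.0, 0.8, 0.1],
    [0.6, 0.0, 0.3],
    [0.2, 0.05, 0.0],
]
loose = threshold_edges(importance, 0.1)   # low threshold: denser network
strict = threshold_edges(importance, 0.5)  # high threshold: sparser network
print(edge_density(loose, 3), edge_density(strict, 3))
```

Raising the threshold removes weak links first, so the remaining edges are the putative connections with the strongest predictive support.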


# Case Study: Cheese Data

Here we aim to highlight the versatility of the MolPad dashboard with a case study of microbial communities on the surface of washed-rind cheese, collected during cheese ripening [@doi:10.1128/msystems.00701-22]. This dataset represents a general case that includes only single-omic measurements of the changes in Bacteria or Eukaryota in each cheese sample. It has multiple nested annotation labels ranging from kingdom to class, which makes its interpretation more flexible.
@@ -65,7 +64,7 @@ The source code for `MolPad` is stored on [Github](https://github.com/KaiyanM/Mo

![Dashboard Overview: `A`: cluster-level network, `B`: taxonomic-level bar plot, `C`: a type-level line plot, and `D`: a feature-level table. \label{fig:dashboard}](dashboard.png)

![Example of discovering related patterns with the network plot. For Groups 1, 7, and 8, the patterns are W-shaped with an evident peak in the same time section. Groups 1 and 2 both follow closely overlapping increasing trends, although Group 1 has higher volatility.\label{fig:pattern}](pattern.png){ width=60% }
![Example of discovering related patterns with the network plot. For Groups 1, 7, and 8, the patterns are W-shaped with an evident peak in the same time section. Groups 1 and 2 both follow closely overlapping increasing trends, although Group 1 has higher volatility.\label{fig:pattern}](pattern.png){ width=80% }

![Dashboard showing Groups 10, 7, 4, and 3 for the bacterial (a.) and Group 4 for the eukaryotic (b.) community. Groups 10 and 4 have decreasing trends for both cheeses, and both consist largely of Proteobacteria and Firmicutes. Groups 3 and 7, in contrast, have increasing trends and include more Actinobacteria and Bacteroidetes. Among these, Groups 7 and 4 have the strongest periodicity, suggesting a more reproducible tendency for the corresponding main components. For the eukaryotic community, most of the features follow the same stable pattern as in Group 4. \label{fig:cheesecase}](cheesecase.png){ width=80% }
