diff --git a/docs/articles/cheese.html b/docs/articles/cheese.html index e7a2862..68459b2 100644 --- a/docs/articles/cheese.html +++ b/docs/articles/cheese.html @@ -271,6 +271,44 @@
For example, when we filter for Actinomycetota (Actinobacteria) as +the functional group, we see that there are no edges connecting to Group +10 and Group 3, the clusters that have the most features associated with +Actinomycetota for Cheese Sample A (Fig 1.3.A).
+Fig 1: Dashboard showing Actinomycetota filtered +network (A) with enrichment pattern for Cheese Sample A (B) and Cheese +Sample C (C); Cluster pattern for Group 9, which is also enriched for +Type IV secretion genes (D).
+Looking at the pattern traces of these groups (Fig 1.3.B), there is +a peak in samples A4 (week 9) and A5 (week 13), which mirrors the 16S +rRNA results of Saak et al. Since these two clusters do not have edges +connecting them to other groups, this suggests that the Actinomycetota +features found in these groups follow distinct longitudinal succession +patterns that are independent of the other groups. When looking at Actinomycetota within +Cheese Sample C, we see a different pattern. Groups 2 and 5 have the +most features associated with Actinomycetota, but they are highly +connected to the other groups (Fig 1.3.A). From these results, we can +hypothesize that though Actinomycetota features are more abundant in +later time points for both cheese samples, their dynamics are +differentially influenced.
+The authors found that Type VI secretion was enriched in +Pseudomonadota bacteria (specifically, Psychrobacter), and hypothesized +this enrichment was due to the importance of physical species +interactions that occur within this habitat. Using MolPad, we searched for +genes associated with other secretion systems to understand their dynamics +within the community. Focusing on KEGG-annotated Type IV secretion +genes, we found that Group 9 contained 12 of the 13 genes. Within this +group, the features that cluster together are those that peak in Cheese sample C5 +(week 13, Fig 1.3.D). This is also the most taxonomically diverse +sample. From this, we can hypothesize that increased taxonomic diversity +is also associated with increases in genes that are related to +competitive species interactions.
library(MolPad)
+In the figure, A, B, and C can represent datasets from different
sources or different aspects of measurements. Below are two examples of
what the input data might look like and how to transform it into the
required format through pre_process()
modules.
Here is a dataset that already includes a labeled ‘type’ column:
-#> ID Day_1 Day_2 Day_3 Day_4 Day_5 Day_6
-#> 1 1 NA NA 0.104083550 -1.07999900 1.3416214 -1.733564471
-#> 2 2 NA NA 0.882578509 0.08440181 0.7165712 0.004811372
-#> 3 3 0.40203917 NA 0.675076590 0.97017740 -0.8210872 0.659073443
-#> 4 4 1.02859068 0.30178418 0.412617771 0.39306149 -0.3427640 0.837009714
-#> 5 5 -2.14036218 -0.06315709 0.656982200 -0.18301850 -1.3967562 -0.123327229
-#> 6 6 -2.04354192 1.30960484 1.371349945 -1.67403067 -1.2946150 0.760315504
-#> 7 7 0.31164367 2.26615374 0.972843085 -0.03956293 1.0190965 1.180680430
-#> 8 8 -0.79692858 -0.86038791 -3.417872877 0.62118867 -2.1983197 -0.104229763
-#> 9 9 -0.12033042 1.17585682 -0.001075426 0.01122359 0.3430992 -0.189234532
-#> 10 10 0.02592915 0.48551867 0.841254390 -0.46430744 0.6891800 0.751766756
-#> Day_7 Day_8 type
-#> 1 0.4638261 2.1318716 peptide
-#> 2 -0.2949926 -0.9310538 peptide
-#> 3 -0.2520195 0.7809977 peptide
-#> 4 -1.0120371 -0.1162361 peptide
-#> 5 -2.0085970 -0.8959521 peptide
-#> 6 0.7874222 -0.5106016 peptide
-#> 7 -0.7719437 -1.4686258 peptide
-#> 8 0.3541200 0.8224521 lipid
-#> 9 0.5674059 -0.9313882 lipid
-#> 10 -0.4097464 NA metabolite
+#> ID Day_1 Day_2 Day_3 Day_4 Day_5 Day_6
+#> 1 1 NA NA -0.82919412 0.08237778 1.30438699 0.4875914
+#> 2 2 NA NA -0.41606256 1.42523475 -0.44792066 1.1425154
+#> 3 3 1.0354708 NA 0.78841026 -1.49585251 -0.41361867 -0.3271995
+#> 4 4 -0.3005323 0.6998150 -1.86018783 1.14167433 -1.04204735 -2.4930409
+#> 5 5 -0.1307903 0.7839983 0.25834857 0.62421147 -0.07202124 -0.9214203
+#> 6 6 -0.9910321 1.8059465 -1.25395397 -0.45260649 -0.57303842 0.9990113
+#> 7 7 -0.9160734 1.4475673 -3.10452531 -0.06560360 -1.27374744 -0.5703149
+#> 8 8 -0.7381729 -0.4844302 1.10281675 0.12049188 0.20892638 -0.3115689
+#> 9 9 -0.2581716 -0.5906591 0.09794986 2.44249606 1.13135657 0.4886958
+#> 10 10 1.8831483 -1.2490571 0.55981559 -2.31731640 0.97323395 -0.7994677
+#> Day_7 Day_8 type
+#> 1 -1.4632070 0.05832042 peptide
+#> 2 -0.4292379 -0.79047610 peptide
+#> 3 -0.3508094 -0.34995898 peptide
+#> 4 -0.3011731 0.67869535 peptide
+#> 5 1.1761504 -0.86303628 peptide
+#> 6 -0.8993647 -0.44801227 peptide
+#> 7 1.1593812 -0.90149427 peptide
+#> 8 0.5463595 -1.51687039 lipid
+#> 9 -0.5016034 -0.08241911 lipid
+#> 10 -0.9442708 NA metabolite
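A table with the same shape as the one printed above (an `ID` column, eight `Day_*` columns, and a `type` label) could be simulated in base R as in the sketch below. All values here are randomly generated placeholders, and the object name `toy` is hypothetical; nothing in this block is taken from the MolPad example data.

```r
# Hypothetical toy data; none of these values come from the MolPad examples.
set.seed(1)
n_features <- 10
n_days <- 8

# One row per feature, one column per time point.
values <- matrix(rnorm(n_features * n_days), nrow = n_features)
colnames(values) <- paste0("Day_", seq_len(n_days))

# Blank out a few early measurements to mimic missing values.
values[1:2, 1:2] <- NA

# Attach an ID column and a 'type' label for each feature.
toy <- data.frame(
  ID = seq_len(n_features),
  values,
  type = rep(c("peptide", "lipid", "metabolite"), times = c(7, 2, 1))
)

head(toy, 10)
```

Keeping the time points as wide `Day_*` columns with a single label column mirrors the layout shown in the printed example.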
pre_process()
:
x1 <- pre_process(x, replaceNA = TRUE,scale = TRUE)
head(x1,10)
-#> ID Day_1 Day_2 Day_3 Day_4 Day_5 Day_6
-#> 1 1 -0.12499488 -0.1249949 -0.04022866 -1.00455207 0.9676290 -1.53681943
-#> 2 2 -0.10246242 -0.1024624 1.46237249 0.04718422 1.1680372 -0.09393173
-#> 3 3 0.16412393 -0.4940273 0.61109508 1.09418472 -1.8381738 0.58489741
-#> 4 4 1.27432720 0.1728189 0.34079228 0.31115385 -0.8040231 0.98397755
-#> 5 5 -1.35954882 0.7001734 1.41425156 0.58132077 -0.6222015 0.64050965
-#> 6 6 -1.35505505 1.0595199 1.10398207 -1.08897287 -0.8157586 0.66398092
-#> 7 7 -0.10291149 1.5438734 0.45418630 -0.39882285 0.4931574 0.62930098
-#> 8 8 -0.06748422 -0.1105542 -1.84632390 0.89499455 -1.0186109 0.40265173
-#> 9 9 -0.36947382 1.7376966 -0.17560476 -0.15561064 0.3839091 -0.48148903
-#> 10 10 -0.41003145 0.4704748 1.15201231 -1.34925296 0.8606601 0.98056711
+#> ID Day_1 Day_2 Day_3 Day_4 Day_5 Day_6
+#> 1 1 0.05451890 0.05451890 -0.9508450 0.1543986 1.6360347 0.64570353
+#> 2 2 -0.07554744 -0.07554744 -0.5950354 1.7039742 -0.6348129 1.35097605
+#> 3 3 1.49874204 0.17759695 1.1835204 -1.7309438 -0.3501343 -0.23987305
+#> 4 4 0.10376561 0.87801613 -1.1033792 1.2200072 -0.4701534 -1.59319589
+#> 5 5 -0.31530188 0.89803359 0.2008348 0.6860993 -0.2373532 -1.36395879
+#> 6 6 -0.71746086 1.90776216 -0.9642373 -0.2120986 -0.3251351 1.15037885
+#> 7 7 -0.27002806 1.37506283 -1.7931878 0.3218978 -0.5189688 -0.02938062
+#> 8 8 -0.74469269 -0.43190500 1.5246885 0.3137802 0.4227931 -0.21881953
+#> 9 9 -0.58919072 -0.91616395 -0.2389755 2.0666865 0.7772922 0.14528989
+#> 10 10 1.56649401 -0.74805373 0.5886154 -1.5374455 0.8941115 -0.41582898
#> Day_7 Day_8 type
-#> 1 0.2527477 1.6112132 peptide
-#> 2 -0.6254920 -1.7532453 peptide
-#> 3 -0.9065915 0.7844915 peptide
-#> 4 -1.8183370 -0.4607098 peptide
-#> 5 -1.2288927 -0.1256124 peptide
-#> 6 0.6835002 -0.2511966 peptide
-#> 7 -1.0158949 -1.6028889 peptide
-#> 8 0.7137346 1.0315924 lipid
-#> 9 0.7485573 -1.6879846 lipid
-#> 10 -1.2447220 -0.4597079 metabolite
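For intuition, the effect of replacing missing values and scaling can be reproduced by hand on a small table. The sketch below zero-fills `NA` entries and then standardizes each row; the printed values above look consistent with that recipe, but this is an inference from the output rather than something the source states, so treat it as an assumption and not as the implementation of `pre_process()`. The data frame `toy` and its values are hypothetical.

```r
# Illustrative only: zero-fill NAs, then z-score each row.
# This is an assumed recipe, not MolPad's documented implementation.
toy <- data.frame(
  ID = 1:3,
  Day_1 = c(NA, 0.5, -1.2),
  Day_2 = c(0.3, NA, 0.8),
  Day_3 = c(-0.7, 1.1, 0.4),
  Day_4 = c(1.4, -0.6, -0.1),
  type = c("peptide", "lipid", "metabolite")
)

# Work only on the time-point columns.
day_cols <- grep("^Day_", names(toy))
m <- as.matrix(toy[, day_cols])

m[is.na(m)] <- 0             # replace missing values with zero
m_scaled <- t(scale(t(m)))   # scale() standardizes columns, so transpose twice

scaled <- toy
scaled[, day_cols] <- m_scaled
scaled
```

After this step each row has mean 0 and standard deviation 1 across the time points, which puts features measured on different scales on a comparable footing.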
Annotation_path_taxon <- gAnnotation(annotations,"phylum","class")
To determine which feature clusters are predictive of a given trajectory, the Mean Decrease Accuracy of a subset of top predictors whose expression directly influences the expression of the target @@ -405,8 +413,11 @@
Once you’ve launched the Shiny dashboard, you can zoom in or make adjustments to explore interesting findings within your data. To effectively navigate the dashboard generated by MolPad, you’ll follow three main steps:
Start by selecting a primary functional annotation from the available options. Then, fine-tune the edge density by adjusting the threshold @@ -440,14 +453,14 @@
Brushing on the network unveils patterns of taxonomic composition and typical trajectories. You can also zoom into specific taxonomic annotations by applying filters.
Delve into the feature table to examine the specifics of the features within the selected clusters. Explore additional related function @@ -457,11 +470,13 @@
The network plot is a powerful visualization tool that displays the +relationships between different groups or features within your data. In +MolPad, the network plot helps to identify clusters of features that +share similar patterns, revealing underlying connections that might not +be immediately obvious. By visualizing these connections, users can gain +a clearer understanding of the structure within their data, making it +easier to pinpoint significant associations and trends.
+The stacked bar plot provides a detailed view of the composition of +each cluster or group in your dataset. By stacking different categories +on top of each other within a single bar, this plot allows for a quick +comparison of relative proportions across multiple groups. This is +particularly useful in microbiome experiments where understanding the +distribution of taxa across different conditions or time points is +crucial. The stacked bar plot makes it easy to see how these +distributions change between experimental conditions, facilitating +deeper insights.
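The idea of comparing relative proportions across groups can be sketched in base R, independently of MolPad. The counts, taxon names, and group labels below are hypothetical placeholders chosen for illustration, and this is not how the dashboard itself draws the plot.

```r
# Hypothetical counts of features per taxon (rows) in each cluster (columns).
counts <- matrix(
  c(12, 3, 5,
     4, 9, 7,
     6, 6, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    taxon = c("Actinomycetota", "Pseudomonadota", "Other"),
    cluster = c("Group 1", "Group 2", "Group 3")
  )
)

# Convert counts to within-cluster proportions so every bar sums to 1.
props <- prop.table(counts, margin = 2)

# barplot() stacks the rows of a matrix within each column by default.
barplot(
  props,
  col = c("steelblue", "tomato", "grey70"),
  legend.text = rownames(props),
  ylab = "Proportion of features"
)
```

Normalizing within each cluster before plotting is what makes bars of different total size directly comparable.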
+The ribbon plot is designed to visualize changes over time, making it +an ideal tool for tracking longitudinal data. In MolPad, the ribbon plot +illustrates how the abundance or expression levels of features vary +across different time points or conditions. The smooth, flowing design +of the ribbons helps to emphasize trends and patterns, enabling users to +quickly identify periods of significant change or stability. This plot +is particularly advantageous when comparing multiple groups, as it +clearly shows overlapping trends and divergences, providing a +comprehensive view of temporal dynamics in the data.
With multi-omics data and longitudinal designs +increasingly integrated into microbiome experiments, there is a growing need to +present the network clearly, especially with complex variations across +biological modalities. A network perspective helps detect the underlying co-occurrence among microbiome samples, allowing for high-level insights into the global structure. Yet when it comes to experimental data that records time series for 100,000 features, the network will collapse into -some entangled clumps and therefore unable to read. -
For the aim of network interpretation, MolPad shows improvements in 3 important aspects:
Now, let’s see what you can get from the dashboard. We’ll start with +an overview and then demonstrate how to discover patterns within your +data.
Above is the overview of the MolPad Dashboard. To explore the dashboard effectively, you can start by following the sequence A-B-C-D. This approach will guide you through the cluster-level network, @@ -170,6 +181,7 @@
Here is a short example of discovering related patterns using the network plot: The shade of the edges represents the proximity of nodes. In the brushed area, Groups 1-7-8 (circled by solid black lines) and diff --git a/docs/pkgdown.yml b/docs/pkgdown.yml index 24820a2..106fea2 100644 --- a/docs/pkgdown.yml +++ b/docs/pkgdown.yml @@ -6,7 +6,7 @@ articles: cheese: cheese.html getstarted: getstarted.html whymolpad: whymolpad.html -last_built: 2024-08-15T06:34Z +last_built: 2024-08-15T07:35Z urls: reference: https://kaiyanm.github.io/MolPad/reference article: https://kaiyanm.github.io/MolPad/articles diff --git a/docs/reference/figures/cheesecase.png b/docs/reference/figures/cheesecase.png deleted file mode 100644 index d91f265..0000000 Binary files a/docs/reference/figures/cheesecase.png and /dev/null differ diff --git a/docs/reference/figures/cheesecase1.png b/docs/reference/figures/cheesecase1.png new file mode 100644 index 0000000..6d4d8aa Binary files /dev/null and b/docs/reference/figures/cheesecase1.png differ diff --git a/docs/reference/gClusters-1.png b/docs/reference/gClusters-1.png index 993d2a4..463b550 100644 Binary files a/docs/reference/gClusters-1.png and b/docs/reference/gClusters-1.png differ diff --git a/docs/reference/gClusters.html b/docs/reference/gClusters.html index 28642a8..566cc47 100644 --- a/docs/reference/gClusters.html +++ b/docs/reference/gClusters.html @@ -102,70 +102,70 @@