JuBiotech · michaelosthege · Oct 26, 2024 · Oct 24, 2024 · Oct 24, 2024 · Oct 24, 2024
diff --git a/README.md b/README.md
@@ -9,6 +9,32 @@ PeakPerformance employs Bayesian modeling for chromatographic peak data fitting.
 This has the innate advantage of providing uncertainty quantification while jointly estimating all peak parameters united in a single peak model.
 As Markov Chain Monte Carlo (MCMC) methods are utilized to infer the posterior probability distribution, convergence checks and the aformentioned uncertainty quantification are applied as novel quality metrics for a robust peak recognition.
 
+# Installation
+
+We highly recommend installing ``PeakPerformance`` in a fresh Python environment by following these steps:
+1. Install the package manager [Mamba](https://github.com/conda-forge/miniforge/releases).
+Choose the latest installer at the top of the page, click on "show all assets", and download the installer named "Mambaforge-version number-name of your OS.exe", e.g. "Mambaforge-23.3.1-1-Windows-x86_64.exe" for a 64-bit Windows operating system. Then run the installer to install Mamba and activate the option "Add Mambaforge to my PATH environment variable".
+
+```{caution}
+If you have already installed Miniconda, you can install Mamba on top of it, but there are compatibility issues with Anaconda.
+```
+
+```{note}
+The newest conda version should also work; just replace `mamba` with `conda` in step 2.
+```
+
+2. Create a new Python environment in the command line using the provided [`environment.yml`](https://github.com/JuBiotech/peak-performance/blob/main/environment.yml) file from the repo.
+   Download `environment.yml` first, then navigate to its location on the command line interface and run the following command:
+```
+mamba env create -f environment.yml
+```
+
+Alternatively, you can install ``PeakPerformance`` directly via pip:
+
+```bash
+pip install peak-performance
+```
+
 # First steps
 Be sure to check out our thorough [documentation](https://peak-performance.readthedocs.io/en/latest). It contains not only information on how to install PeakPerformance and prepare raw data for its application but also detailed treatises about the implemented model structures, validation with both synthetic and experimental data against a commercially available vendor software, exemplary usage of diagnostic plots and investigation of various effects.
 Furthermore, you will find example notebooks and data sets showcasing different aspects of PeakPerformance.

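After either installation route in the README hunk above, a quick check can confirm that the package is visible in the active environment. The following is a minimal sketch using only the Python standard library. It assumes that the distribution name is `peak-performance` (as in the pip command) and that the import name is `peak_performance` (the name used in the earlier version of `index.md`); the helper `installation_status` is hypothetical and not part of PeakPerformance.

```python
from importlib import metadata, util


def installation_status(dist_name="peak-performance", module_name="peak_performance"):
    """Report whether a distribution and its top-level module are visible.

    Both default names are assumptions taken from the documentation text.
    """
    try:
        # Version string recorded by the installer (pip, mamba or conda).
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    # find_spec returns None when a top-level module cannot be located.
    importable = util.find_spec(module_name) is not None
    return {"version": version, "importable": importable}


if __name__ == "__main__":
    print(installation_status())
```

If `version` is `None` or `importable` is `False`, the environment that runs the check is probably not the one the package was installed into.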
diff --git a/docs/source/index.md b/docs/source/index.md
@@ -9,32 +9,51 @@ title: PeakPerformance documentation
 [![](https://zenodo.org/badge/DOI/10.5281/zenodo.10255543.svg)](https://zenodo.org/doi/10.5281/zenodo.10255543)
 
 
-``peak_performance`` is a Python toolbox for Bayesian inference of peak areas.
+``PeakPerformance`` is a Python toolbox for Bayesian inference of peak areas.
 
 It defines PyMC models describing the intensity curves of chromatographic peaks.
 
 Using Bayesian inference, this enables the fitting of peaks, yielding uncertainty estimates for retention times, peak height, area and much more.
 
+This documentation features various notebooks that demonstrate the usage.
+
 # Installation
 
+We highly recommend installing ``PeakPerformance`` in a fresh Python environment by following these steps:
+1. Install the package manager [Mamba](https://github.com/conda-forge/miniforge/releases).
+Choose the latest installer at the top of the page, click on "show all assets", and download the installer named "Mambaforge-version number-name of your OS.exe", e.g. "Mambaforge-23.3.1-1-Windows-x86_64.exe" for a 64-bit Windows operating system. Then run the installer to install Mamba and activate the option "Add Mambaforge to my PATH environment variable".
+
+```{caution}
+If you have already installed Miniconda, you can install Mamba on top of it, but there are compatibility issues with Anaconda.
+```
+
+```{note}
+The newest conda version should also work; just replace `mamba` with `conda` in step 2.
+```
+
+2. Create a new Python environment in the command line using the provided [`environment.yml`](https://github.com/JuBiotech/peak-performance/blob/main/environment.yml) file from the repo.
+   Download `environment.yml` first, then navigate to its location on the command line interface and run the following command:
+```
+mamba env create -f environment.yml
+```
+
+Alternatively, you can install ``PeakPerformance`` directly via pip:
+
 ```bash
 pip install peak-performance
 ```
 
 You can also download the latest version from [GitHub](https://github.com/JuBiotech/peak-performance).
 
-
-The documentation features various notebooks that demonstrate the usage.
-
 ```{toctree}
 :caption: Tutorials
 :maxdepth: 1
 
-markdown/Installation
-markdown/Preparing_raw_data
+notebooks/Preparing_raw_data_for_PeakPerformance
 markdown/Peak_model_composition
-markdown/PeakPerformance_validation
 markdown/PeakPerformance_workflow
+markdown/PeakPerformance_validation
+notebooks/Recreate_data_from_scratch
 markdown/Diagnostic_plots
 markdown/How_to_adapt_PeakPerformance_to_your_data
 ```

diff --git a/docs/source/markdown/Installation.md b/docs/source/markdown/Installation.md
diff --git a/docs/source/markdown/Preparing_raw_data.md b/docs/source/markdown/Preparing_raw_data.md
diff --git a/docs/source/markdown/Recreate_data_from_scratch.md b/docs/source/markdown/Recreate_data_from_scratch.md
@@ -0,0 +1,26 @@
+# Recreate the data presented in the paper and documentation from scratch
+
+## Recreate Figure 2 from the PeakPerformance publication
+
+Navigate to `docs/source/notebooks` and run the `Create_results_in_figure_2.ipynb` notebook.
+
+It is divided into two analogously structured sections.
+The first creates the results figure for the single peak, the second for the double peak.
+Both sections walk through the following sequential steps:
+  1. open and plot example raw data
+  2. define a model
+  3. perform both sampling and posterior predictive sampling
+  4. display the summary DataFrame containing the results of the peak fitting
+  5. display the cumulative plot of the posterior predictive check
+  6. display the posterior predictive check and the peak fit against the raw data points.
+
+## Recreate the validation plot from the documentation
+
+To recreate the validation plot, navigate to `docs/source/notebooks` and run the notebook `Create_validation_plot_from_raw_data.ipynb`.
+
+However, not all data loaded in this notebook is raw data.
+In particular, the data from the first stage of validation (using synthetic data sets) is pre-processed from the results of that test with the notebook `Processing_test_1_raw_data.ipynb`.
+Since all necessary files are present for both notebooks, they can be run in any order.
+
+Also, the data for the comparison with the commercial software MultiQuant in the third stage of validation is contained in `docs/source/notebooks/test3_df_comparison.xlsx`.
+The `PeakPerformance` results listed in this file have been obtained by executing a batch run with the raw data stored in `docs/source/notebooks/paper raw data` using the settings detailed in the `Template.xlsx` file in the same directory.
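Before running the notebooks named above, it can help to confirm that the files they depend on are present in a local checkout. The sketch below is one way to do that with the standard library. The helper names `missing_inputs` and `nbconvert_command` are hypothetical, and the command they build assumes that Jupyter with `nbconvert` is available in the environment, which the text above does not state.

```python
from pathlib import Path

# Locations and file names as given in the text above.
NOTEBOOK_DIR = Path("docs/source/notebooks")
REQUIRED = (
    "Create_results_in_figure_2.ipynb",
    "Create_validation_plot_from_raw_data.ipynb",
    "Processing_test_1_raw_data.ipynb",
    "test3_df_comparison.xlsx",
    "paper raw data",  # directory holding the raw data of the batch run
)


def missing_inputs(repo_root):
    """Return the required notebook inputs that are absent under repo_root."""
    base = Path(repo_root) / NOTEBOOK_DIR
    return [name for name in REQUIRED if not (base / name).exists()]


def nbconvert_command(notebook):
    """Build (but do not run) a command that executes a notebook in place."""
    return ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", notebook]


if __name__ == "__main__":
    missing = missing_inputs(".")
    if missing:
        print("Missing inputs:", ", ".join(missing))
    else:
        print(" ".join(nbconvert_command("Create_validation_plot_from_raw_data.ipynb")))
```

Run it from the repository root. The command is only printed, so nothing is executed until you choose to run it from within `docs/source/notebooks`.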