diff --git a/README.md b/README.md
index cdd4347..4d8e093 100644
--- a/README.md
+++ b/README.md
@@ -5,7 +5,9 @@
 [![DOI](https://zenodo.org/badge/713469041.svg)](https://zenodo.org/doi/10.5281/zenodo.10255543)
 
 # About PeakPerformance
-PeakPerformance employs Bayesian modelling for chromatographic peak data fitting. This has the innate advantage of providing uncertainty quantification while jointly estimating all peak parameters united in a single peak model. As Markoc Chain Monte Carlo (MCMC) methods are utilized to infer the posterior probability distribution, convergence checks and the aformentioned uncertainty quantification are applied as novel quality metrics for a robust peak recognition.
+PeakPerformance employs Bayesian modeling for chromatographic peak data fitting.
+This has the innate advantage of providing uncertainty quantification while jointly estimating all peak parameters united in a single peak model.
+As Markov Chain Monte Carlo (MCMC) methods are utilized to infer the posterior probability distribution, convergence checks and the aforementioned uncertainty quantification are applied as novel quality metrics for robust peak recognition.
 
 # First steps
 Be sure to check out our thorough [documentation](https://peak-performance.readthedocs.io/en/latest). It contains not only information on how to install PeakPerformance and prepare raw data for its application but also detailed treatises about the implemented model structures, validation with both synthetic and experimental data against a commercially available vendor software, exemplary usage of diagnostic plots and investigation of various effects.
diff --git a/docs/source/markdown/Preparing_raw_data.md b/docs/source/markdown/Preparing_raw_data.md
index fa76d08..088f109 100644
--- a/docs/source/markdown/Preparing_raw_data.md
+++ b/docs/source/markdown/Preparing_raw_data.md
@@ -1,13 +1,15 @@
 # Preparing raw data
 
-This step is crucial when using PeakPerformance. Raw data has to be supplied as time series meaning for each signal you want to analyze, save a NumPy array consisting of time in the first dimension and intensity in the second dimension (compare example data in the repository). Both time and intensity should also be NumPy arrays. If you e.g. have time and intensity of a signal as lists, you can use the following code to convert, format, and save them in the correct manner:
+This step is crucial when using PeakPerformance.
+Raw data has to be supplied as time series, meaning that for each signal you want to analyze, you save a shape `(2, ?)` NumPy array with time as the first and intensity as the second entry along the first dimension (compare example data in the repository).
+Both time and intensity should also be NumPy arrays.
+If you have, e.g., time and intensity of a signal as lists, you can use the following code to convert, format, and save them in the correct manner:
 
 ```python
 import numpy as np
-from pathlib import Path
 
 time_series = np.array([np.array(time), np.array(intensity)])
-np.save(Path(r"example_path/time_series.npy"), time_series)
+np.save("time_series.npy", time_series)
 ```
 
 The naming convention of raw data files is `___.npy`. There should be no underscores within the named sections such as `acquisition name`. Essentially, the raw data names include the acquisition and mass trace, thus yielding a recognizable and unique name for each isotopomer/fragment/metabolite/sample. This is of course only relevant when using the pre-manufactured data pipeline and does not apply to user-generated custom data pipelines.
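
Note on the `(2, ?)` shape introduced by this change: one way to sanity-check a prepared file is to load it back and confirm that the first dimension has exactly two entries, time and then intensity. The snippet below is only a minimal sketch of that round trip. The `time` and `intensity` values are made-up placeholders, and `time_series.npy` is the placeholder file name from the changed example rather than a name following the raw data naming convention.

```python
import numpy as np

# Hypothetical placeholder values; real data would come from the instrument export.
time = [0.0, 0.5, 1.0, 1.5]
intensity = [10.0, 250.0, 900.0, 120.0]

# Same conversion as in the documented example: row 0 is time, row 1 is intensity.
time_series = np.array([np.array(time), np.array(intensity)])
np.save("time_series.npy", time_series)

# Load the file back and check the layout described in the text.
loaded = np.load("time_series.npy")
if loaded.ndim != 2 or loaded.shape[0] != 2:
    raise ValueError(f"expected shape (2, N), got {loaded.shape}")

# Unpacking along the first dimension recovers the two series.
loaded_time, loaded_intensity = loaded
```

Unpacking along the first dimension only works because the array has exactly two rows, so a file saved with time and intensity as columns, shape `(N, 2)`, would be caught by the shape check rather than silently misread.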