Merge pull request #3 from EnzymeML/enzymeml-website
Enzymeml website
haeussma authored Feb 10, 2025
2 parents a7bdcf1 + bed5f95 commit 6b7f2b1
Showing 4 changed files with 189 additions and 17 deletions.
42 changes: 33 additions & 9 deletions docs/index.md
@@ -1,18 +1,42 @@
## Why EnzymeML?

EnzymeML is a data model for catalyzed reaction data.
## Unlock the Full Potential of Your Biocatalytic Data

It sets information on small molecules, proteins, and their reaction in context with reaction conditions and the measured data.
This training course is designed to empower researchers, scientists, and data analysts in biocatalysis by equipping them with the skills to manage and analyze experimental data beyond traditional Excel workflows. By leveraging Python and AI-driven tools, participants will enhance their ability to structure, process, and interpret complex datasets while ensuring adherence to FAIR data principles.

EnzymeML is a standardized data model allowing for exchange of data among colleagues, database providers, and data science tools.
## Goals

### 1. Move Beyond Excel: Smarter Data Management

## How to use EnzymeML?
Excel is a widely used tool in biocatalysis, but it has limitations when handling large-scale, multidimensional datasets. This course provides participants with alternative approaches that allow for:

Besides the data model, different tools are available to accelerate your processing and analysis of catalyzed reaction data.
- Efficient data structuring and processing.
- Automated workflows that reduce errors and improve reproducibility.
- Scalable solutions for large datasets that exceed Excel's capabilities.

- EnzymeML Suite: A desktop application for creating, editing, simulating, and visualizing EnzymeML documents.
- Chromatopy: A Python tool for processing chromatographic data.
- MTPHandler: A Python tool for processing plate reader data.
- NMRpy: A Python tool for processing NMR data.
### 2. Apply Python Directly to Your Research Data

Unlike generic coding courses that rely on theoretical examples, this training is focused on your own experimental data. Participants will:

- Work with real-world datasets from their own research projects.
- Learn how to manipulate and analyze biocatalytic data using Python.
- Gain hands-on experience in integrating computational tools into their workflows.

### 3. Simplify Python Learning with AI Assistance

For participants new to Python, the learning curve can be steep. This course integrates AI-driven tools to facilitate:

- Code generation and debugging support.
- Step-by-step guidance in writing and optimizing Python scripts.
- Automated solutions for routine data processing tasks.

### 4. Ensure FAIR Compliance with EnzymeML

The course emphasizes FAIR-compliant data management, ensuring that experimental results are:

- **Findable** – Easily searchable and indexed for future reference.
- **Accessible** – Structured in a way that allows seamless data sharing.
- **Interoperable** – Compatible with other datasets and computational tools.
- **Reusable** – Properly documented and standardized to support further research.

Through hands-on training, participants will learn how to generate EnzymeML documents, a standardized format for enzymatic reaction data that enhances data exchange and reproducibility.
68 changes: 67 additions & 1 deletion docs/sessions/01-overview.md
@@ -1,6 +1,72 @@
# Overview
# Available Tools for Processing Experimental Data

## From Raw Experimental Data to EnzymeML-Driven Analysis

Processing experimental data for analysis is often a complex and error-prone task. Typically, raw data from lab instruments such as plate readers, chromatographs, and NMR devices must be manually extracted, cleaned, and reformatted before analysis can begin. This process is time-consuming and not scalable.

To streamline this workflow, Python tools such as chromatopy, MTPHandler, and NMRpy have been developed. These tools enable direct reading of raw data files from experimental instruments, automating the transformation into a structured format that is immediately usable for analysis.

## Data Processing Workflow

Raw data is read directly from files generated by lab instruments and transformed into EnzymeML documents. EnzymeML provides a standardized structure for storing key reaction data, including reaction conditions, catalysts, and substrate properties. This ensures that data is well-organized, FAIR-compliant, and ready for computational analysis.

Within a Jupyter Notebook environment, these tools allow seamless integration of data processing, analysis, and visualization. The entire workflow—from raw data ingestion to structured analysis—is transparent, reproducible, and easy to share with others.

## From Raw Data to Analyzable Data

Once experimental data has been transformed into EnzymeML format, it becomes the foundation for further data science applications:

- Yield, conversion, and selectivity calculations
- Kinetic modeling and reaction simulations
- Comprehensive visualization of experimental results
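
To make the first of these applications concrete, the sketch below computes conversion, yield, and selectivity from concentration time courses with plain NumPy. The numbers are made up and 1:1 stoichiometry is assumed; it illustrates the arithmetic only and is not part of any of the tools named above.

```python
import numpy as np

# Made-up substrate and product concentration time courses in mM,
# assuming a single product and 1:1 stoichiometry.
c_substrate_0 = 10.0
c_substrate = np.array([10.0, 7.2, 4.1, 1.5])
c_product = np.array([0.0, 2.6, 5.5, 8.0])

consumed = c_substrate_0 - c_substrate
conversion = consumed / c_substrate_0          # fraction of substrate consumed
yield_ = c_product / c_substrate_0             # fraction of substrate turned into product
selectivity = np.divide(c_product, consumed,   # product formed per substrate consumed
                        out=np.zeros_like(c_product), where=consumed > 0)

print("conversion :", np.round(conversion, 2))
print("yield      :", np.round(yield_, 2))
print("selectivity:", np.round(selectivity, 2))
```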

The following diagram shows the workflow from raw data to analyzable data in the form of an EnzymeML document:

```mermaid
graph LR
A[🌈 Chromatographic Instrument] -->|output| A1[📄 Files]
B[🔬 Plate Reader] -->|output| B1[📄 Files]
C[🧲 NMR] -->|output| C1[📄 Files]
A1 -->|read| D
B1 -->|read| E
C1 -->|read| F
subgraph in Jupyter Notebook:
subgraph Experimental Data Processing
D{chromatopy}
E{MTPHandler}
F{NMRpy}
end
D -->|transform| DataObject[EnzymeML Object]
E -->|transform| DataObject
F -->|transform| DataObject
DataObject -.-> DS1
DataObject <-.-> DS2
DataObject -.-> DS3
subgraph with Data Science Python Tools:
DS1[Determine e.g., yield, conversion, selectivity]
DS2[Kinetic Modeling]
DS3[Visualization]
end
end
DataObject -->|transform| ExperimentalDocument
ExperimentalDocument["<b>📄 EnzymeML Document</b><br><br>
<i>Small Molecules</i><br>
<i>Proteins</i><br>
<i>Measurements</i><br>
<i>Reactions</i>"]
```


## 🔬 Photometric Data

The [MTPHandler](https://fairchemistry.github.io/MTPHandler/) Python library streamlines the processing of photometric data from plate readers. It reads data from a variety of plate reader formats and supports blank correction, concentration calculation, and export in a scalable way.
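
MTPHandler wraps these steps behind its own interface; purely to illustrate what blank correction and concentration calculation mean here, the following is a minimal NumPy sketch with invented well data and an assumed calibration slope. It is not MTPHandler code.

```python
import numpy as np

# Invented absorbance time courses (e.g., NADH at 340 nm) for two sample
# wells and one blank well; values are for illustration only.
absorbance = {
    "A1": np.array([0.12, 0.45, 0.78, 1.02]),
    "A2": np.array([0.11, 0.40, 0.70, 0.95]),
}
blank = np.array([0.05, 0.05, 0.06, 0.05])

# Assumed linear calibration: absorbance units per mM under the assay's
# path length (value chosen for the example only).
slope_per_mM = 6.22

for well, values in absorbance.items():
    corrected = values - blank              # blank correction
    conc_mM = corrected / slope_per_mM      # linear calibration: c = A / slope
    print(well, np.round(conc_mM, 3))
```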

## 🌈 Chromatographic Data

The [Chromatopy](https://fairchemistry.github.io/Chromatopy) Python library streamlines the processing of chromatographic time-course data. It reads data from a variety of chromatographic instruments and supports assignment of retention times to molecules, concentration calculation, and export in a scalable way.
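
Chromatopy provides its own readers for this; as a plain-pandas illustration of the two core ideas, matching peaks to molecules by retention time and converting peak areas to concentrations with calibration factors, here is a short sketch. All retention times, areas, and calibration factors are invented, and this is not the Chromatopy API.

```python
import pandas as pd

# Invented peak table from a single chromatogram (retention time in minutes).
peaks = pd.DataFrame({
    "retention_time": [2.31, 4.02, 6.85],
    "area": [15300.0, 48200.0, 9100.0],
})

# Assumed reference retention times and calibration factors (area per mM);
# both would come from standards in a real experiment.
molecules = {
    "substrate": {"rt": 2.3, "area_per_mM": 5100.0},
    "product": {"rt": 6.9, "area_per_mM": 4800.0},
}
tolerance_min = 0.1

for name, ref in molecules.items():
    hit = peaks[(peaks["retention_time"] - ref["rt"]).abs() <= tolerance_min]
    if hit.empty:
        print(f"{name}: no peak within ±{tolerance_min} min")
    else:
        conc_mM = hit.iloc[0]["area"] / ref["area_per_mM"]
        print(f"{name}: {conc_mM:.2f} mM")
```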

## 🧲 NMR Data

The [NMRPy](https://nmrpy.readthedocs.io/en/latest/) Python library streamlines the processing of NMR time-course data.
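
NMRPy takes care of the spectral processing itself; the sketch below only illustrates a typical last step of a time-course evaluation, converting peak integrals into concentrations against an internal standard (qNMR). All integrals, proton counts, and the standard concentration are invented, and this is not the NMRPy API.

```python
import numpy as np

# Invented peak integrals for an analyte and an internal standard over a
# reaction time course (arbitrary units).
time_min = np.array([0, 10, 20, 30, 60])
analyte_area = np.array([0.02, 0.35, 0.61, 0.80, 0.98])
standard_area = np.array([1.00, 1.01, 0.99, 1.00, 1.00])

c_standard_mM = 5.0   # known internal-standard concentration (assumed)
n_h_standard = 9      # protons behind the standard signal (assumed)
n_h_analyte = 3       # protons behind the analyte signal (assumed)

# Standard qNMR relation: concentration scales with the integral ratio,
# corrected for the number of protons contributing to each signal.
c_analyte_mM = analyte_area / standard_area * (n_h_standard / n_h_analyte) * c_standard_mM

for t, c in zip(time_min.tolist(), c_analyte_mM.tolist()):
    print(f"t = {t:3d} min -> c = {c:.2f} mM")
```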
92 changes: 89 additions & 3 deletions docs/sessions/02-programming-setup.md
@@ -1,9 +1,95 @@
# Programming Setup

## Cursor
## Required Software

### 1. Cursor IDE
Cursor is an AI-powered code editor that can help you write code faster and more efficiently. For more information and to install it, please visit the [Cursor website](https://www.cursor.com/).

## Installing a Python Interpreter
### 2. Python Environment
We'll use Conda as our Python environment manager. It's available for Windows, macOS, and Linux.

Besides the code editor, such as Cursor, you will need to install a Python interpreter, which will be used to run the code within the code editor.
## Installing Conda

Choose one of these distributions:
- **Miniconda**: Lightweight installer that ships only Conda and Python
- **Anaconda**: Full distribution with many scientific packages preinstalled
- **Miniforge**: Community-driven installer that uses the conda-forge channel by default

### Windows Installation
1. Download the installer from your chosen distribution
2. Run the `.exe` file and follow the installation wizard
3. Verify installation by opening Anaconda/Miniforge Prompt:
```bash
conda list
```

### macOS Installation
1. Download your chosen installer
2. Open terminal and run:
```bash
bash <conda-installer-name>-latest-MacOSX-x86_64.sh
```
3. Close and reopen terminal
4. Verify installation:
```bash
conda list
```

### Linux Installation
1. Download Miniconda:
```bash
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
```
2. Install:
```bash
bash Miniconda3-latest-Linux-x86_64.sh
```
3. Restart terminal
4. Verify installation:
```bash
conda list
```

## Working with Conda Environments

### Creating Environments
Create a new environment with a specific Python version and packages:
```bash
conda create -n my_project_env python=3.10 numpy pandas
```

### Managing Environments
- **Activate** an environment:
```bash
conda activate my_project_env
```
- **List** all environments:
```bash
conda env list
# or
conda info --envs
```
- **Install** packages in the active environment via `pip`:
```bash
pip install scipy matplotlib
```

## IDE Setup

### Setting Up Conda in Cursor/VSCode

1. Activate your environment:
```bash
conda activate my_project_env
```

2. Install Jupyter kernel:
```bash
conda install ipykernel
```

3. Configure Cursor/VSCode:
- Install the Python extension from the marketplace
- Press `Ctrl + Shift + P` (or `Cmd + Shift + P` on macOS)
- Search for "Python: Select Interpreter"
- Choose your Conda environment
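
To confirm that notebooks and scripts really run inside the selected Conda environment, a quick sanity check:

```python
import sys

# The interpreter path should contain your environment name,
# e.g. ".../envs/my_project_env/...".
print(sys.executable)
```

If the printed path does not point into your environment, repeat the interpreter selection above.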
4 changes: 0 additions & 4 deletions mkdocs.yml
@@ -6,10 +6,6 @@ nav:
- Session 1 - Overview: sessions/01-overview.md
- Session 2 - Programming Setup: sessions/02-programming-setup.md
- Session 3 - Read Experimental Data: sessions/03-read-experimental-data.md
- Use cases:
- Motivation: usecases/motivation.md
- Hackathon: usecases/hackathon.md


theme:
name: material