Project AKTAmercy - "CHROMER"

[VER] | [DEV] | [STATUS]

CHROMER is a Pythonic automaton that facilitates the post-processing of chromatographic data from AKTA Pure HPLC systems running UNICORN (7.0+). It is designed to be used in conjunction with a Google Sheets database that indexes purification targets and their associated data. An example can be found below.

Its current functionality is tailored to the Kulp Lab; future releases will make it more generalizable.

Features

  • PyCORN-powered Parsing: Parses the proprietary .UFol (UNICORNS) archives generated by the UNICORN Evaluation software for the chromatographic data and metadata, using the PyCORN module by Yasar L. Ahmed.
  • Chromatogram Recognition: Recognizes and annotates chromatograms with the sample name, purification method, and date.
  • Peak Detection: Detects peaks in the chromatogram and determines pool fractions based on the peak area.
  • Database Synchronicity: Utilizes the Google API suite to automatically update the database with the processed chromatograms.
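To make the peak detection and pooling idea concrete, here is a minimal sketch in plain Python. It is not CHROMER's implementation: the function names, the fraction layout, and the 25% area cutoff are assumptions chosen for illustration, and it deliberately avoids scipy.signal so that it runs with the standard library alone. It integrates the UV trace over each fraction and pools the contiguous fractions around the one with the largest area.

```python
from bisect import bisect_left, bisect_right


def trapezoid_area(xs, ys):
    """Integrate ys over xs with the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]))


def pool_fractions(volumes, signal, fractions, cutoff=0.25):
    """Return names of the contiguous fractions around the largest peak.

    volumes   -- elution volumes in mL, ascending
    signal    -- UV absorbance at each volume
    fractions -- list of (name, start_mL, end_mL), ascending
    cutoff    -- hypothetical threshold: a neighbouring fraction is pooled
                 when its area is at least this share of the largest area
    """
    areas = []
    for _, start, end in fractions:
        lo = bisect_left(volumes, start)   # first point at or after start
        hi = bisect_right(volumes, end)    # one past the last point <= end
        areas.append(trapezoid_area(volumes[lo:hi], signal[lo:hi]))

    apex = max(range(len(areas)), key=areas.__getitem__)
    if areas[apex] <= 0:
        return []  # flat trace, nothing worth pooling

    threshold = cutoff * areas[apex]
    left = right = apex
    while left > 0 and areas[left - 1] >= threshold:
        left -= 1
    while right < len(areas) - 1 and areas[right + 1] >= threshold:
        right += 1
    return [name for name, _, _ in fractions[left:right + 1]]


# Made-up trace with a single peak spanning fractions A2 and A3.
volumes = [0, 1, 2, 3, 4, 5, 6, 7, 8]
signal = [0, 0, 1, 10, 40, 10, 1, 0, 0]
fractions = [("A1", 0, 2), ("A2", 2, 4), ("A3", 4, 6), ("A4", 6, 8)]
print(pool_fractions(volumes, signal, fractions))
```

A lower cutoff widens the pool and a higher one narrows it, which is one way to tune how conservative the pooling ranges are.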

Developmental Roadmap

1.0.0
Advancements
+ SEC Chromatograms generated are now numerically accurate, accounting for flow rate and injection point.
+ Index algorithm now covers a majority of targets
+ Debug tools ?INSPECT and parseLOG added to identify deviants easily.
+ CHROMER -> CHROMER

Regressions
- Multiprocessing removed for simplicity, potential for reimplementation in a later release
- AFFINITY chromatograms are visually congruent, but numerically (x values / volume) incorrect.
- When processing a sufficiently large backlog, CHROMER requires reauthentication via Google sign-in. Chromatographic data processed during this interim is dropped.
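The SEC fix listed under Advancements amounts to expressing the x-axis as volume measured from the injection. The sketch below shows one way to do that, assuming the raw trace is sampled in minutes at a constant flow rate. The function and parameter names are hypothetical, and CHROMER's actual calculation may differ.

```python
def align_to_injection(times_min, signal, flow_rate_ml_min, injection_ml):
    """Return (volume_mL, signal) pairs with volume zeroed at the injection.

    Assumes a constant flow rate, so volume = time * flow rate.
    Points recorded before the injection are dropped.
    """
    if flow_rate_ml_min <= 0:
        raise ValueError("flow rate must be positive")
    volumes = [t * flow_rate_ml_min - injection_ml for t in times_min]
    return [(v, s) for v, s in zip(volumes, signal) if v >= 0]


# Made-up trace: 0.5 mL/min, injection mark at 1.0 mL.
print(align_to_injection([0, 2, 4, 10], [5, 6, 7, 8], 0.5, 1.0))
```

An error in either the flow rate or the injection offset shifts or scales the x values without changing the shape of the plot, which matches the "visually congruent, but numerically incorrect" symptom described for AFFINITY runs.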

Known Issues and Limitations

  • Chromatogram Recognition: The chromatogram recognition algorithm relies on an index system, in which a purification target is identified by an index value that is stored in an external database (Google Sheets) and entered into the Sample_ID field prior to purification.
  • Duplicate Logging: Even without multiprocessing, the log generates duplicate lines for each sample. parseLOG is required to process logs when looking for deviants.
  • Pooling Fractions (AFFINITY): The implementation is incomplete and cannot yet enumerate multiple peaks. Some of the pooling ranges are a bit too conservative.
  • Plotting: Revisions to the plotting are likely, to include more information and to make the plots more readable (pretty).
  • Reauthentication: When processing a large number of .Result files (500+), CHROMER will require the user to reauthenticate with their Google account credentials. Any data processed during this interim is lost and must be reprocessed.
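The index system described in the first item can be pictured as a lookup keyed on the Sample_ID field. In this sketch a plain dictionary stands in for the Google Sheets database. The assumption that Sample_ID contains a numeric index (possibly surrounded by other text) is ours, as are the function name and the sample rows.

```python
import re

# Hypothetical stand-in for rows fetched from the Google Sheets index.
PURIFICATION_INDEX = {
    101: {"target": "Example-Protein-A", "method": "SEC"},
    102: {"target": "Example-Protein-B", "method": "AFFINITY"},
}


def resolve_target(sample_id, index=PURIFICATION_INDEX):
    """Return the index row for a Sample_ID, or None if it cannot be matched."""
    match = re.search(r"\d+", sample_id or "")
    if match is None:
        return None  # no index value was entered before the run
    return index.get(int(match.group()))


print(resolve_target("101"))
print(resolve_target("no index here"))
```

Runs that return None here are the ones that cannot be annotated automatically, which is why entering the index value before purification matters.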

Please report any issues you encounter here, and feel free to contribute to the project by submitting a pull request.

Usage

Google Sheets: When using Google Sheets to store the purification index, you must set up a service account and give it sufficient permissions.

To use CHROMER, you need to have Python installed on your system. A virtual environment via Anaconda or Miniconda, both of which include Python, is recommended.

# Create a new virtual environment
conda create --name CHROMER python=3.11

# Activate the virtual environment
conda activate CHROMER

# Clone the repository
git clone https://github.com/alxdolphin/AKTAmercy.git # if you don't have git, you can download the repository as a .zip file

# Navigate to the repository
cd AKTAmercy

# Populate the virtual environment with the required packages
pip install -r requirements.txt

# Run the script
./CHROMER.py

# You may be prompted to log in to your Google account and give the script permission to access your Google Sheets / Drive. This facilitates the automatic updating of the database, and gives me access to your bank account.

The script will process all of the .UFol (UNICORNS) and .Result files in the ./data/DROP-OFF directory into the chromatograms they contain, and deposit them in the ./data/DONE directory. If the mode is configured to cloud or both in the config.json file, they will also be uploaded to the chosen data store.
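The drop-off behaviour described above can be sketched with the standard library. This is an illustration only, not CHROMER's code: the function names are hypothetical, and the key name "mode" in config.json is an assumption inferred from the description (only the values cloud and both come from this README).

```python
import json
from pathlib import Path


def pending_files(drop_off):
    """List the .UFol and .Result files waiting in the drop-off directory."""
    wanted = {".ufol", ".result"}
    return sorted(p for p in Path(drop_off).iterdir()
                  if p.suffix.lower() in wanted)


def should_upload(config_path):
    """True when config.json sets the (assumed) "mode" key to cloud or both."""
    with open(config_path, encoding="utf-8") as handle:
        config = json.load(handle)
    return config.get("mode") in {"cloud", "both"}
```

A caller would iterate over `pending_files("./data/DROP-OFF")`, write each chromatogram to ./data/DONE, and upload only when `should_upload("config.json")` returns True.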

State of the Art - UNICORN vs CHROMER

v1.0.0-rc1

Lectin (UNICORN vs. CHROMAUTOGRAM)
The X values are slightly off in the CHROMAUTOGRAM, but the plot is otherwise identical to the UNICORN plot, with the peak falling within the same fraction range and reaching the same height. I'll take it (+ no negative Y-axis), but actually I won't because I'm going to fix it.
SEC (UNICORN vs. CHROMAUTOGRAM)
Features like peak area shading, an overview plot, and more are planned for future releases.

Resources and Acknowledgements

| Category | Libraries/APIs |
| --- | --- |
| Data Handling | PyCORN, Google Sheets API, Google Drive API, gspread, pydrive, oauth2client |
| Python Standard Library | datetime, io, json, logging, os, re, struct, tarfile, tempfile, xml.etree.ElementTree, collections.OrderedDict, concurrent.futures.ProcessPoolExecutor, zipfile, time, queue, argparse, shutil |
| Data Visualization | numpy, matplotlib, mpl_toolkits, seaborn, scipy.signal |