CHROMER is a Pythonic automaton that facilitates the post-processing of chromatographic data from AKTA Pure HPLC systems running UNICORN (7.0+). It is designed to be used in conjunction with a Google Sheets database to index purification targets and their associated data. An example can be found below.
Its current functionality is limited to the Kulp Lab, but in the future it will be expanded to be more generalizable.
- PyCORN-powered Parsing: Parses the proprietary `.UFol` (UNICORNS) archives generated by the UNICORN Evaluation software for chromatographic data and metadata, using the PyCORN module by Yasar L. Ahmed.
- Chromatogram Recognition: Recognizes and annotates chromatograms with the sample name, purification method, and date.
- Peak Detection: Detects peaks in the chromatogram and determines pool fractions based on peak area.
- Database Synchronicity: Utilizes the Google API suite to automatically update the database with the processed chromatograms.
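CHROMER's actual peak-detection and pooling code isn't reproduced here; as a rough sketch of the idea, `scipy.signal.find_peaks` (which the project lists among its dependencies) can locate peaks in a UV trace and yield a pooling range per peak. The function name, threshold, and synthetic trace below are illustrative assumptions, not CHROMER's implementation:

```python
import numpy as np
from scipy.signal import find_peaks

def pool_fractions(volume, absorbance, min_height=10.0):
    """Illustrative stand-in for a pooling step: find peaks in a UV
    trace and return a (start, end) volume range around each peak."""
    peaks, props = find_peaks(absorbance, height=min_height, width=1)
    pools = []
    for i in range(len(peaks)):
        # left_ips / right_ips are interpolated sample positions at the
        # peak's half-height; map them back onto the volume axis
        left = volume[int(np.floor(props["left_ips"][i]))]
        right = volume[int(np.ceil(props["right_ips"][i]))]
        pools.append((left, right))
    return pools

# Synthetic SEC-like trace: one Gaussian peak centered at 12 mL
volume = np.linspace(0, 24, 481)
absorbance = 100.0 * np.exp(-((volume - 12.0) ** 2) / (2 * 0.5 ** 2))
pools = pool_fractions(volume, absorbance)
```

A real trace would of course need baseline correction and tuned thresholds before ranges like these could drive fraction pooling.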
1.0.0 ✅
Advancements
+ SEC Chromatograms generated are now numerically accurate, accounting for flow rate and injection point.
+ Index algorithm now covers a majority of targets
+ Debug tools `?INSPECT` and `parseLOG` added to identify deviants easily.
+ AKTAmercy -> CHROMER
Regressions
- Multiprocessing removed for simplicity; it may be reimplemented in a later release.
- AFFINITY chromatograms have the correct shape, but their x-axis values (volume) are numerically incorrect.
- When processing a sufficiently large backlog, CHROMER requires reauthentication via Google sign-in. Chromatographic data processed during this interim is dropped.
- Chromatogram Recognition: The chromatogram recognition algorithm relies on an index system, in which a purification target is identified by an index value that is stored in an external database (Google Sheets) and entered into the `Sample_ID` field prior to purification.
- Duplicate Logging: Even without multiprocessing, the log generates duplicate lines for each sample. `parseLOG` is required to process logs when looking for deviants.
- Pooling Fractions (AFFINITY): The implementation is incomplete and lacks the ability to enumerate multiple peaks. Some of the pooling ranges are a bit too conservative.
- Plotting: Revisions to the plotting are likely, to include more information and to make the plots more readable (pretty).
- Reauthentication: When processing a considerable number of `.Result` files (500+), CHROMER will require the user to reauthenticate with their Google account credentials. Data processed during this interim is lost and must be reprocessed.
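`parseLOG` itself isn't reproduced here, but the duplicate-logging symptom above can be illustrated with a minimal consecutive-deduplication pass (the log lines below are made up for the example, and the real tool's behavior may differ):

```python
def drop_consecutive_duplicates(lines):
    """Collapse runs of identical adjacent log lines -- the kind of
    pre-filtering a tool like parseLOG might apply before scanning
    a log for deviants. Sketch only, not parseLOG's actual logic."""
    out = []
    for line in lines:
        if not out or line != out[-1]:
            out.append(line)
    return out

log = [
    "INFO processed sample 0417",
    "INFO processed sample 0417",  # duplicate emitted by the logger
    "WARN deviant: sample 0418",
    "WARN deviant: sample 0418",
]
clean = drop_consecutive_duplicates(log)
```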
Please report any issues you encounter here, and feel free to contribute to the project by submitting a pull request.
Google Sheets: When using Google Sheets to store the purification index, one must set up a service account and grant it sufficient permissions.
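Once the service-account JSON key is downloaded and the spreadsheet is shared with the service account's `client_email`, a minimal connection via `gspread` looks roughly like this (the helper name and file paths are placeholders, not CHROMER's own wiring):

```python
def open_index_sheet(creds_path, sheet_name):
    """Open the purification-index spreadsheet with a service account.
    creds_path points at the downloaded service-account JSON key; the
    sheet must already be shared with the account's client_email.
    (Hypothetical helper -- CHROMER's internals may differ.)"""
    import gspread  # deferred import so the sketch loads without gspread
    gc = gspread.service_account(filename=creds_path)
    return gc.open(sheet_name).sheet1
```

`gspread.service_account()` requests the Sheets and Drive OAuth scopes by default, so no explicit scope list is needed for this usage.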
To use CHROMER, you need to have Python installed on your system. A virtual environment via Anaconda or Miniconda, which both include Python, is recommended.
# Create a new virtual environment
conda create --name CHROMER python=3.11
# Activate the virtual environment
conda activate CHROMER
# Clone the repository
git clone https://github.com/alxdolphin/AKTAmercy.git # if you don't have git, you can download the repository as a .zip file
# Navigate to the repository
cd AKTAmercy
# Populate the virtual environment with the required packages
pip install -r requirements.txt
# Run the script
./CHROMER.py
# You may be prompted to log in to your Google account and give the script permission to access your Google Sheets / Drive. This facilitates the automatic updating of the database, and gives me access to your bank account.
The script will process all of the `.UFol` (UNICORNS) and `.Result` files in the `./data/DROP-OFF` directory into the chromatograms they contain, and deposit them in the `./data/DONE` directory. If the mode is configured to `cloud` or `both` in the `config.json` file, they will also be uploaded to the chosen data store.
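The full `config.json` schema isn't documented here; based on the modes described above, the relevant key presumably looks something like this (the value `both` is just one of the options):

```json
{
  "mode": "both"
}
```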
v1.0.0-rc1
| | UNICORN | CHROMAUTOGRAM |
| --- | --- | --- |
| Lectin | | |
| SEC | | |
| Category | Libraries/APIs |
| --- | --- |
| Data Handling | PyCORN, Google Sheets API, Google Drive API, gspread, pydrive, oauth2client |
| Python Standard Library | datetime, io, json, logging, os, re, struct, tarfile, tempfile, xml.etree.ElementTree, collections.OrderedDict, concurrent.futures.ProcessPoolExecutor, zipfile, time, queue, argparse, shutil |
| Data Visualization | numpy, matplotlib, mpl_toolkits, seaborn, scipy.signal |