Merge pull request #264 from SysBioChalmers/feat/readme
doc: README for GECKO root and databases
Showing 44 changed files with 245 additions and 478 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,126 +1,81 @@ | ||
.. image:: GECKO.png | ||
:align: center | ||
|
||
|Current Version| |Tests passing| |Build Status| |PyPI Version| |Docs Status| |Gitter| | ||
|Current Version| |Tests passing| |Gitter| |Zenodo| | ||
|
||
About GECKO | ||
----------- | ||
|
||
The **GECKO** toolbox is a Matlab/Python package for enhancing a **G**\ enome-scale model to account for **E**\ nzyme **C**\ onstraints, using **K**\ inetics and **O**\ mics. It is the companion software to `this <http://www.dx.doi.org/10.15252/msb.20167411>`_ publication, and it has two main parts: | ||
The **GECKO** toolbox is able to enhance a **G**\ enome-scale model to account for **E**\ nzyme **C**\ onstraints, using **K**\ inetics and **O**\ mics. The resulting enzyme-constrained model (**ecModel**) can be used to perform simulations where enzyme allocation is either drawn from a total protein pool, or constrained by measured protein levels from proteomics data. | ||
|
||
- ``geckomat``: Matlab+Python scripts to fetch online data and build/simulate enzyme-constrained models. | ||
- ``geckopy``: a Python package which can be used with `cobrapy <https://opencobra.github.io/cobrapy/>`_ to obtain an ecYeastGEM model object, optionally adjusted for provided proteomics data. | ||
**Note:** Due to significant refactoring of the code, ecModels generated with GECKO versions 1 or 2 are not compatible with GECKO 3, and *vice versa*. The latest GECKO 2 release is available `here <https://github.com/SysBioChalmers/GECKO/releases/tag/v2.0.3>`_, while the ``gecko2`` branch is retained. | ||
|
||
Last update: 2021-02-17 | ||
**Citation** | ||
|
||
This repository is administered by Benjamin J. Sanchez (`@BenjaSanchez <https://github.com/benjasanchez>`_), Division of Systems and Synthetic Biology, Department of Biology and Biological Engineering, Chalmers University of Technology. | ||
- A GECKO 3 publication is currently under consideration; citation information will appear here in due course. | ||
- For GECKO release 2, please cite `Domenzain et al. (2022) <https://doi.org/10.1038/s41467-022-31421-1>`_. | ||
- For GECKO release 1, please cite `Sánchez et al. (2017) <https://doi.org/10.15252/msb.20167411>`_. | ||
|
||
Last update: 2023-03-05 | ||
|
||
geckomat: Building enzyme-constrained models | ||
-------------------------------------------- | ||
|
||
Required software - Python module | ||
Required software | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
- `Python 2.7 <https://www.python.org/>`_ | ||
- `setuptools for python 2.7 <http://www.lfd.uci.edu/~gohlke/pythonlibs/#setuptools>`_ | ||
- SOAPpy: | ||
|
||
:: | ||
|
||
easy_install-2.7 SOAPpy | ||
- MATLAB version 2019b or later; no additional MathWorks toolboxes are required. | ||
- `RAVEN <https://github.com/SysBioChalmers/RAVEN>`_ Toolbox version 2.7.12 or later. | ||
- `Gurobi Optimizer <https://www.gurobi.com/solutions/gurobi-optimizer/>`_ is recommended for simulations (free academic license available). Alternatively, the open-source GNU Linear Programming Kit (`GLPK <https://www.gnu.org/software/glpk/>`_, distributed with RAVEN) or SoPlex as part of the `SCIP Optimization Suite <https://scipopt.org/>`_ can be used. | ||
- `Docker <https://www.docker.com/>`_ for running DLKcat. | ||
|
||
Required software - Matlab module | ||
Installation | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
- `MATLAB <http://www.mathworks.com/>`_ 9.1 (R2016b) or higher + Optimization Toolbox. | ||
- The `COBRA toolbox for MATLAB <https://github.com/opencobra/cobratoolbox>`_. | ||
- The `RAVEN toolbox for MATLAB <https://github.com/SysBioChalmers/RAVEN>`_. | ||
- The `libSBML MATLAB API <https://sourceforge.net/projects/sbml/files/libsbml/MATLAB%20Interface>`_ (version 5.17.0 is recommended). | ||
**GECKO toolbox** | ||
|
||
Usage | ||
~~~~~ | ||
- The preferred way to download GECKO is via git clone:: | ||
|
||
- **For creating an enzyme constrained model:** | ||
git clone --depth=1 https://github.com/SysBioChalmers/GECKO | ||
|
||
- Update the following data files in ``/databases`` with your organism information: | ||
- Alternatively, a `ZIP-archive <https://github.com/SysBioChalmers/GECKO/releases>`_ can be directly downloaded from GitHub. The ZIP-archive should be extracted to a disk location where the user has read- and write-access rights. | ||
|
||
- ``databases/prot_abundance.txt``: Protein abundance Data from Pax-DB. If data is not available for your organism, then a relative proteomics dataset (in molar fractions) can be used instead. The required format is a tab-separated file, named as ``databases/relative_proteomics.txt`` , with a single header line and 2 columns; the first with gene IDs and the second with the relative abundances for each protein. | ||
- ``databases/uniprot.tab``: Gene-proteins data from uniprot. | ||
- ``databases/chemostatData.tsv``: Chemostat data for estimating GAM (optional, called by ``fitGAM.m``). | ||
- ``databases/manual_data.txt``: Kcat data from eventual manual curations (optional, called by ``manualModifications.m``). | ||
- After git clone or extracting the ZIP-archive, the user should navigate in MATLAB to the GECKO folder. GECKO can then be installed with the command that adds GECKO (sub-)folders to the MATLAB path:: | ||
|
||
- Adapt the following functions in ``/geckomat`` to your organism: | ||
cd('C:\path\to\GECKO') % Modify to match GECKO folder and OS | ||
GECKOInstaller.install | ||
|
||
- ``geckomat/getModelParameters.m`` | ||
- ``geckomat/change_model/manualModifications.m`` | ||
- ``geckomat/limit_proteins/sumProtein.m`` | ||
- ``geckomat/limit_proteins/scaleBioMass.m`` | ||
- ``geckomat/kcat_sensitivity_analysis/changeMedia_batch.m`` (optional) | ||
- ``geckomat/change_model/removeIncorrectPathways.m`` (optional, called by ``manualModifications.m``) | ||
- ``geckomat/limit_proteins/sumBioMass.m`` (optional, called by ``sumProtein.m`` & ``scaleBioMass.m``) | ||
- If desired, a removal command is available as:: | ||
|
||
- Run ``geckomat/get_enzyme_data/updateDatabases.m`` to update ``ProtDatabase.mat``. | ||
- Run ``geckomat/enhanceGEM.m`` with your metabolic model as input. | ||
GECKOInstaller.uninstall | ||
|
||
- **For performing simulations with an enzyme-constrained model:** Enzyme-constrained models can be used as any other metabolic model, with toolboxes such as COBRA or RAVEN. For more information on rxn/met naming convention, see the supporting information of `Sanchez et al. (2017) <https://dx.doi.org/10.15252/msb.20167411>`_ | ||
**RAVEN Toolbox and Gurobi** | ||
|
||
geckopy: Integrating proteomic data to ecYeastGEM | ||
------------------------------------------------- | ||
- The RAVEN Toolbox Wiki contains installation instructions for both `RAVEN Toolbox <https://github.com/SysBioChalmers/RAVEN/wiki/Installation>`_ and `Gurobi <https://github.com/SysBioChalmers/RAVEN/wiki/Installation#solvers>`_. | ||
|
||
If all you need is the ecYeastGEM model to use together with cobrapy you can use the ``geckopy`` Python package. | ||
- Briefly, RAVEN is either downloaded via git clone, as ZIP-archive from GitHub, or installed as `MATLAB AddOn <https://se.mathworks.com/matlabcentral/fileexchange/112330-raven-toolbox>`_. | ||
|
||
Required software | ||
~~~~~~~~~~~~~~~~~ | ||
- After finishing all installation instructions, the user should run installation checks in MATLAB with:: | ||
|
||
- Python 3.6, 3.7 or 3.8 | ||
- cobrapy | ||
checkInstallation | ||
|
||
Installation | ||
~~~~~~~~~~~~ | ||
**Docker** | ||
|
||
:: | ||
- Installation instructions are available at https://docs.docker.com/get-docker/. | ||
|
||
pip install geckopy | ||
|
||
Usage | ||
~~~~~ | ||
|
||
.. code:: python | ||
Getting started | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
from geckopy import GeckoModel | ||
import pandas | ||
some_measurements = pandas.Series({'P00549': 0.1, 'P31373': 0.1, 'P31382': 0.1}) | ||
model = GeckoModel('multi-pool') | ||
model.limit_proteins(some_measurements) | ||
model.optimize() | ||
In the GECKO folder, ``protocols.m`` contains instructions on how to reconstruct and analyze an ecModel for *S. cerevisiae*. | ||
|
||
Contributing | ||
------------ | ||
|
||
Contributions are always welcome! Please read the `contributing guidelines <https://github.com/SysBioChalmers/GECKO/blob/devel/.github/CONTRIBUTING.md>`_ to get started. | ||
|
||
Contributors | ||
------------ | ||
|
||
- Ivan Domenzain (`@IVANDOMENZAIN <https://github.com/IVANDOMENZAIN>`_), Chalmers University of Technology, Gothenburg Sweden | ||
- Eduard Kerkhoven (`@edkerk <https://github.com/edkerk>`_), Chalmers University of Technology, Gothenburg Sweden | ||
- Benjamin J. Sanchez (`@BenjaSanchez <https://github.com/benjasanchez>`_), Chalmers University of Technology, Gothenburg Sweden | ||
- Moritz Emanuel Beber (`@Midnighter <https://github.com/Midnighter>`_), Danish Technical University, Lyngby Denmark | ||
- Henning Redestig (`@hredestig <https://github.com/hredestig>`_), Danish Technical University, Lyngby Denmark | ||
- Cheng Zhang, Science for Life Laboratory, KTH - Royal Institute of Technology, Stockholm Sweden | ||
|
||
.. |Current Version| image:: https://badge.fury.io/gh/sysbiochalmers%2Fgecko.svg | ||
:target: https://badge.fury.io/gh/sysbiochalmers%2Fgecko | ||
.. |Tests passing| image:: https://github.com/SysBioChalmers/GECKO/actions/workflows/tests.yml/badge.svg?branch=main | ||
:target: https://github.com/SysBioChalmers/GECKO/actions | ||
.. |Build Status| image:: https://travis-ci.com/SysBioChalmers/GECKO.svg?branch=master | ||
:target: https://travis-ci.com/SysBioChalmers/GECKO | ||
.. |PyPI Version| image:: https://badge.fury.io/py/geckopy.svg | ||
:target: https://badge.fury.io/py/geckopy | ||
.. |Docs Status| image:: https://readthedocs.org/projects/geckotoolbox/badge/?version=latest | ||
:alt: Documentation Status | ||
:target: http://geckotoolbox.readthedocs.io/ | ||
.. |Gitter| image:: https://badges.gitter.im/SysBioChalmers/GECKO.svg | ||
:alt: Join the chat at https://gitter.im/SysBioChalmers/GECKO | ||
:target: https://gitter.im/SysBioChalmers/GECKO?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge | ||
.. |Zenodo| image:: https://zenodo.org/badge/DOI/10.5281/zenodo.7699818.svg | ||
:target: https://doi.org/10.5281/zenodo.7699818 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
- `DLKcatCurrencyMets.tsv` is a table of metabolites that form pairs of currency metabolites when occurring together in a reaction (one as substrate, the other as product). It is used by `writeDLKcatInput` to filter out currency metabolites. The file is manually curated to reflect common metabolite pairs, but can be extended to include more model-specific metabolite names. Such extensions can either be made in this folder (a pull request to the GitHub repository will make them available to other users), or kept in a copy of this file in the `data` subfolder of the model adapter folder. | ||
- `DLKcatIgnoreMets.tsv` is a table of small metabolites/ions that `writeDLKcatInput` filters out, as DLKcat does not predict kcat values for such substrates. Changes can either be made in this folder (a pull request to the GitHub repository will make them available to other users), or kept in a copy of this file in the `data` subfolder of the model adapter folder. | ||
- `max_KCAT.txt` is a collation of maximum kcat values per organism, reaction and substrate, as gathered from the BRENDA database by `/src/geckopy/brenda_parser`. | ||
- `max_MW.txt` is a collation of maximum molecular weights per organism and reaction (without explicitly referring to a protein identifier), as gathered from the BRENDA database by `/src/geckopy/brenda_parser`. | ||
- `max_SA.txt` is a collation of maximum specific activities per organism, reaction and substrate, as gathered from the BRENDA database by `/src/geckopy/brenda_parser`. | ||
- `PhylDist.mat` is a taxonomic tree of KEGG organisms, as generated by RAVEN Toolbox. |
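
To illustrate the filtering that `DLKcatCurrencyMets.tsv` and `DLKcatIgnoreMets.tsv` are described as supporting, here is a minimal Python sketch. It is not the actual `writeDLKcatInput` implementation (which is a MATLAB function). The two-column, header-less TSV layout, the metabolite names, and the helper names `load_pairs` and `filter_substrates` are all assumptions made for illustration only.

```python
import csv
import io


def load_pairs(tsv_text):
    """Parse a two-column TSV (assumed layout) into a set of metabolite pairs."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    pairs = set()
    for row in reader:
        if len(row) >= 2:
            pair = frozenset(name.strip().lower() for name in row[:2])
            if len(pair) == 2:  # skip degenerate rows naming the same metabolite twice
                pairs.add(pair)
    return pairs


def filter_substrates(substrates, products, currency_pairs, ignore_mets):
    """Keep substrates that are neither ignored nor one half of a currency
    pair whose other half appears among the products of the same reaction."""
    products_lc = {p.lower() for p in products}
    kept = []
    for substrate in substrates:
        s_lc = substrate.lower()
        if s_lc in ignore_mets:
            continue  # small metabolite/ion, as in the ignore list
        is_currency = any(
            s_lc in pair and (pair - {s_lc}) <= products_lc
            for pair in currency_pairs
        )
        if not is_currency:
            kept.append(substrate)
    return kept


# Hypothetical example data, not taken from the real files
pairs_tsv = "ATP\tADP\nNADH\tNAD+\n"
currency_pairs = load_pairs(pairs_tsv)
ignore_mets = {"h+", "h2o"}

kept = filter_substrates(
    ["D-glucose", "ATP", "H+"],
    ["D-glucose 6-phosphate", "ADP"],
    currency_pairs,
    ignore_mets,
)
print(kept)  # ['D-glucose']
```

In this sketch ATP is dropped only because its partner ADP appears on the product side of the same reaction; if ADP were not a product, ATP would be kept as a regular substrate.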