Chado Library

A Python library for interacting with a Chado database.

Installation

$ pip install chado

# On first use, you'll need to create a config file to connect to the database. Just run:

$ chakin init
Welcome to Chado's Chakin! (茶巾)
PGHOST: xxxx
PGDATABASE: xxxx
PGUSER: xxxx
PGPASS:
PGPORT: 5432
PGSCHEMA: public

This creates a Chakin config file at ~/.chakin.yml.
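
The prompts above use PostgreSQL-style names, while the Python API in the Examples section takes keyword arguments such as dbhost and dbname. The sketch below shows one way to translate between the two. The mapping table and the connection_kwargs helper are hypothetical and not part of the library; the pairing of each prompt with a keyword argument is an assumption based on the names alone.

```python
# Hypothetical helper, not part of the chado package.
# Assumed pairing of the `chakin init` prompts with the keyword arguments
# that ChadoInstance receives in the Examples section.
PROMPT_TO_KWARG = {
    "PGHOST": "dbhost",
    "PGDATABASE": "dbname",
    "PGUSER": "dbuser",
    "PGPASS": "dbpass",
    "PGPORT": "dbport",
    "PGSCHEMA": "dbschema",
}


def connection_kwargs(settings):
    """Translate a mapping of PG* settings into ChadoInstance-style kwargs."""
    kwargs = {}
    for prompt, kwarg in PROMPT_TO_KWARG.items():
        # Skip settings that are absent or left empty at the prompt.
        if settings.get(prompt, "") != "":
            kwargs[kwarg] = settings[prompt]
    # The Examples section passes the port as an integer.
    if "dbport" in kwargs:
        kwargs["dbport"] = int(kwargs["dbport"])
    return kwargs


example = connection_kwargs({
    "PGHOST": "localhost",
    "PGDATABASE": "chado",
    "PGUSER": "chado",
    "PGPASS": "chado",
    "PGPORT": "5432",
    "PGSCHEMA": "public",
})
print(example)
```

If the assumed pairing holds, the result could be passed on as ChadoInstance(**example).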

Examples

from chado import ChadoInstance
ci = ChadoInstance(dbhost="localhost", dbname="chado", dbuser="chado", dbpass="chado", dbschema="public", dbport=5432)

# Create human species
org = ci.organism.add_organism(genus="Homo", species="sapiens", common="Human", abbr="H.sapiens")

# Then display the list of organisms
orgs = ci.organism.get_organisms()

for org in orgs:
    print('{} {}'.format(org['genus'], org['species']))

# Create an analysis
an = ci.analysis.add_analysis(
    name="My cool analysis",
    program="Something",
    programversion="1.0",
    algorithm="Google",
    sourcename="src",
    sourceversion="2.1beta",
    sourceuri="http://example.org/",
    date_executed="2018-02-03",
)

# And load some data
ci.feature.load_fasta(fasta="./test-data/genome.fa", analysis_id=an['analysis_id'], organism_id=orgs[0]['organism_id'])
ci.feature.load_gff(gff="./test-data/annot.gff", analysis_id=an['analysis_id'], organism_id=orgs[0]['organism_id'])
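
The example above loads data into orgs[0], which is only the intended organism when the database holds a single one. A minimal sketch of selecting the organism by name before loading is given below. find_organism and load_genome are hypothetical helpers, not library functions. The sketch assumes that get_organisms() returns a list of dicts with genus, species and organism_id keys, as the JSON printed by the Chakin client suggests, and it uses only the load_fasta and load_gff keyword arguments shown above.

```python
def find_organism(organisms, genus, species):
    """Return the first organism dict matching genus and species, or None."""
    for organism in organisms:
        if organism["genus"] == genus and organism["species"] == species:
            return organism
    return None


def load_genome(ci, genus, species, analysis_id, fasta, gff):
    """Load a FASTA file and a GFF file for one organism selected by name.

    `ci` is expected to behave like the ChadoInstance used above.
    """
    organism = find_organism(ci.organism.get_organisms(), genus, species)
    if organism is None:
        raise LookupError("No organism {} {} found".format(genus, species))
    organism_id = organism["organism_id"]
    # Same order as the example above: FASTA first, then GFF.
    ci.feature.load_fasta(fasta=fasta, analysis_id=analysis_id, organism_id=organism_id)
    ci.feature.load_gff(gff=gff, analysis_id=analysis_id, organism_id=organism_id)
    return organism_id
```

With the objects from the example, a call might look like load_genome(ci, "Homo", "sapiens", an['analysis_id'], "./test-data/genome.fa", "./test-data/annot.gff").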

Or with the Chakin client:

$ my_org=$(chakin organism add_organism --species sapiens Homo Human H.sapiens | jq -r '.organism_id')

$ chakin organism get_organisms
[
    {
        "organism_id": 1133,
        "genus": "Homo",
        "species": "sapiens",
        "abbreviation": "H.sapiens",
        "common_name": "Human",
        "comment": null
    }
]

# Then load some data
$ my_analysis=$(chakin analysis add_analysis \
    "My cool analysis" \
    "Something" \
    "v1.0" \
    "src" | jq -r '.analysis_id')

$ chakin feature load_fasta \
    --analysis_id $my_analysis \
    --sequence_type contig \
    ./test-data/genome.fa $my_org
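
Where jq is not available, the JSON printed by chakin organism get_organisms can be parsed from Python instead. The sketch below builds a name-to-ID lookup from the sample output shown above. organism_ids_by_name is a hypothetical helper, and the code assumes the listing always has the shape of that sample.

```python
import json

# Sample output copied from `chakin organism get_organisms` above.
SAMPLE_OUTPUT = """
[
    {
        "organism_id": 1133,
        "genus": "Homo",
        "species": "sapiens",
        "abbreviation": "H.sapiens",
        "common_name": "Human",
        "comment": null
    }
]
"""


def organism_ids_by_name(json_text):
    """Map "Genus species" to organism_id from the client's JSON listing."""
    return {
        "{} {}".format(org["genus"], org["species"]): org["organism_id"]
        for org in json.loads(json_text)
    }


ids = organism_ids_by_name(SAMPLE_OUTPUT)
print(ids["Homo sapiens"])  # prints 1133
```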

History

2.3.3
- Now requires Python >= 3.6
- Better error reporting for Blast loader
2.3.2
- Fix InterProScan loader only loading the first result of XML v5
- Fix InterProScan loader failing to load IPR by name
2.3.1
- Fix data loading in Tripal database
2.3.0
- Fix non-working --re_parent option in fasta loader
- Allow connection using a preformatted URL (needed by Galaxy tools using pgutil)
- Added loading of Blast and InterProScan data
- Moved chakin feature load_go to chakin load go
- Fix sequence computing when landmark sequence is available in the db
- Add more options to match features in expression matrix loader (query_type, match_on_name, re_name, skip_missing)
2.2.6
- Fix requirement name for psycopg2 (name change for version >=2.8)
2.2.5
- Added support for units in expression loaders
- Fix error in load_gff when no source is specified
2.2.4
- Fix broken --skip_missing option for load_go
2.2.3
- Throw a warning instead of an exception when a GFF target feature does not exist
2.2.2
- Bug fixes and improvements to the expression module
2.2.1
- Minor release to fix broken package on PyPI, no code change
2.2.0
- Added feature.load_go() to load GO annotation (blast2go results)
- Added feature.get_feature_analyses() to fetch the analyses associated with a feature
- Added feature.get_feature_cvterms() to fetch the cvterms associated with a feature
- Added support for biomaterial/expression data (as used by tripal_analysis_expression)
- New --protein_id_attr option for feature.load_gff()
2.1.5
- bugfix: fix feature deletion when deleting an analysis
2.1.4
- bugfix: fix sporadic errors with AnalysisFeature class declaration
2.1.3
- bugfix: make --species a mandatory arg for organism creation
- bugfix: fix feature deletion when deleting an analysis or an organism
- update chado docker image
2.1.2
- skip whole database schema reflection for simple tasks (analysis and organism management)
- fix polypeptide creation for genes beginning at position 0
- fix various small bugs in phylogeny and featureprop loading
- fix bug in cvterm creation
- fix crashes in gbk/gff exporters
2.1.1
- newick: remove prefix from node labels too
- newick: fix errors with named internal nodes
2.1
- auto reflect db schema
- add phylogeny module
- load features from fasta
- load features from gff3
- load featureprops from tabular file
- make chakin util commands work when db is offline
- add unit tests
2.0
- "Chakin" CLI utility
- Complete package restructure
- Nearly all functions renamed

License

Available under the MIT License
