A Python package for pharmacogenomics research

BioinfoLabImmuno/pypgx

 
 

Introduction

pypgx is a Python package for pharmacogenomics research. It can be used both as a standalone program and as a Python module. Documentation is available at Read the Docs.

Installation

You can easily install pypgx and all of its dependencies with the Anaconda distribution. It is strongly recommended to create a new environment specifically for pypgx, as there are many required dependencies that you may not want added to an existing environment.

$ conda create -n pypgx -c conda-forge -c bioconda -c defaults -c sbslee pypgx

Before using pypgx, make sure to activate the conda environment where pypgx is installed.

$ conda activate pypgx

pypgx CLI

The pypgx CLI page describes the command-line interface (CLI) for the pypgx package.

You can display the help message for the pypgx CLI by entering:

$ pypgx -h
usage: pypgx [-v] [-h] COMMAND ...

positional arguments:
  COMMAND               Name of the command.
    compare-stargazer-calls
                        Compute the concordance between two 'genotype-
                        calls.tsv' files created by Stargazer.
    calculate-read-depth
                        Create a GDF (GATK DepthOfCoverage Format) file for
                        Stargazer from BAM files by computing read depth.
    call-variants-gatk-sge
                        Create a VCF (Variant Call Format) file for Stargazer
                        from BAM files by calling SNVs and indels.

optional arguments:
  -v, --version         Show the version and exit.
  -h, --help            Show this help message and exit.

You can display the help message for a specific command (e.g. calculate-read-depth) by entering:

$ pypgx calculate-read-depth -h
usage: pypgx calculate-read-depth -t TEXT -c TEXT [-i PATH] -o PATH [-a TEXT]
                                  [-h]

Create a GDF (GATK DepthOfCoverage Format) file for Stargazer from BAM files
by computing read depth.

Arguments:
  -t TEXT, --target-gene TEXT
                        Name of the target gene. Choices: {'abcb1', 'cacna1s',
                        'cftr', 'cyp1a1', 'cyp1a2', 'cyp1b1', 'cyp2a6',
                        'cyp2a13', 'cyp2b6', 'cyp2c8', 'cyp2c9', 'cyp2c19',
                        'cyp2d6', 'cyp2e1', 'cyp2f1', 'cyp2j2', 'cyp2r1',
                        'cyp2s1', 'cyp2w1', 'cyp3a4', 'cyp3a5', 'cyp3a7',
                        'cyp3a43', 'cyp4a11', 'cyp4a22', 'cyp4b1', 'cyp4f2',
                        'cyp17a1', 'cyp19a1', 'cyp26a1', 'dpyd', 'g6pd',
                        'gstm1', 'gstp1', 'gstt1', 'ifnl3', 'nat1', 'nat2',
                        'nudt15', 'por', 'ptgis', 'ryr1', 'slc15a2',
                        'slc22a2', 'slco1b1', 'slco1b3', 'slco2b1', 'sult1a1',
                        'tbxas1', 'tpmt', 'ugt1a1', 'ugt1a4', 'ugt2b7',
                        'ugt2b15', 'ugt2b17', 'vkorc1', 'xpc'}. [required]
  -c TEXT, --control-gene TEXT
                        Name of a preselected control gene. Used for
                        intrasample normalization during copy number analysis
                        by Stargazer. Choices: {'egfr', 'ryr1', 'vdr'}.
                        Alternatively, you can provide a custom genomic region
                        with the 'chr:start-end' format (e.g.
                        chr12:48232319-48301814). [required]
  -i PATH, --bam-path PATH
                        Read BAM files from PATH, one file path per line. Also
                        accepts single BAM file. [required]
  -o PATH, --output-file PATH
                        Path to the output file. [required]
  -a TEXT, --genome-build TEXT
                        Build of the reference genome assembly. Choices:
                        {'hg19', 'hg38'}. [default: 'hg19']
  -h, --help            Show this help message and exit.
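
The -c/--control-gene option accepts either a preselected gene name or a custom region in the 'chr:start-end' format. As a minimal sketch of how such a value could be checked before running the command, the snippet below classifies a value as one or the other. The helper parse_control and the regular expression are illustrative assumptions and are not part of pypgx; pypgx may validate the value differently.

```python
import re

# Preselected control genes listed in the help text above.
PRESET_CONTROL_GENES = {"egfr", "ryr1", "vdr"}

# Assumed pattern for the documented 'chr:start-end' form,
# e.g. chr12:48232319-48301814.
REGION_PATTERN = re.compile(r"^(chr[0-9A-Za-z]+):(\d+)-(\d+)$")


def parse_control(value):
    """Classify a control value as a preset gene or a custom region.

    Hypothetical helper for illustration only; not a pypgx function.
    """
    if value in PRESET_CONTROL_GENES:
        return ("gene", value)
    match = REGION_PATTERN.match(value)
    if match is None:
        raise ValueError(f"not a preset gene or 'chr:start-end' region: {value!r}")
    chrom = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"region start exceeds end: {value!r}")
    return ("region", chrom, start, end)


print(parse_control("vdr"))
print(parse_control("chr12:48232319-48301814"))
```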

To run the command from the command line:

$ pypgx calculate-read-depth -t cyp2d6 -c vdr -i bam-list.txt -o read-depth.gdf
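
The -i option above reads a text file with one BAM file path per line. A minimal sketch of one way to produce such a file is shown below. The helper write_bam_list and the directory name in the usage comment are hypothetical, and the sketch assumes all BAM files sit in a single directory with a .bam extension.

```python
from pathlib import Path


def write_bam_list(bam_dir, list_path):
    """Write the path of every .bam file in bam_dir to list_path, one per line.

    Hypothetical helper for illustration only; not a pypgx function.
    """
    bam_paths = sorted(str(p) for p in Path(bam_dir).glob("*.bam"))
    Path(list_path).write_text("".join(f"{p}\n" for p in bam_paths))
    return bam_paths


# Example usage (the 'bams' directory is an assumption):
# write_bam_list("bams", "bam-list.txt")
```

The resulting bam-list.txt can then be passed to calculate-read-depth with -i as in the command above.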

The output GDF file will look something like:

Locus       Total_Depth     Average_Depth_sample    Depth_for_Steven        Depth_for_John
...
chr22:42539471      190     95      53      137
chr22:42539472      192     96      54      138
chr22:42539473      190     95      53      137
...
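
To inspect a GDF file from Python, a short parser is enough. The sketch below is not part of pypgx. It assumes the columns are separated by whitespace (tabs or spaces) and that every column after Locus is numeric; the rows are copied from the example output above, and the parse_gdf name is hypothetical.

```python
# Example rows copied from the output above; a real file would be read from disk,
# e.g. with open("read-depth.gdf").read().
GDF_TEXT = """\
Locus\tTotal_Depth\tAverage_Depth_sample\tDepth_for_Steven\tDepth_for_John
chr22:42539471\t190\t95\t53\t137
chr22:42539472\t192\t96\t54\t138
chr22:42539473\t190\t95\t53\t137
"""


def parse_gdf(text):
    """Parse GDF text into a list of dicts keyed by column name.

    Hypothetical helper; assumes whitespace-separated columns.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    records = []
    for line in lines[1:]:
        fields = line.split()
        record = {"Locus": fields[0]}
        # Depth columns are stored as floats so that averages such as
        # 95.5 would also parse.
        for name, value in zip(header[1:], fields[1:]):
            record[name] = float(value)
        records.append(record)
    return records


records = parse_gdf(GDF_TEXT)
for record in records:
    sample_depths = [v for k, v in record.items() if k.startswith("Depth_for_")]
    # In the example rows, Total_Depth equals the sum of the per-sample depths.
    assert sum(sample_depths) == record["Total_Depth"]
    print(record["Locus"], record["Total_Depth"], sample_depths)
```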

pypgx API

The pypgx API page describes the application programming interface (API) for the pypgx package.

To use pypgx from within Python (e.g. the phenotyper function):

from pypgx.phenotyper import phenotyper
print(phenotyper("cyp2d6", "*1", "*1"))
print(phenotyper("cyp2d6", "*1", "*4"))
print(phenotyper("cyp2d6", "*1", "*2x2"))  # *2x2 is gene duplication.
print(phenotyper("cyp2d6", "*4", "*5"))    # *5 is gene deletion.

This prints:

normal_metabolizer
intermediate_metabolizer
ultrarapid_metabolizer
poor_metabolizer
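
When many samples are phenotyped, the per-sample results are often tallied. The following is a minimal sketch of that idea, not part of pypgx. It assumes only that phenotyper is called as phenotyper(gene, allele1, allele2), as in the calls above; the count_phenotypes helper and the cohort list are hypothetical.

```python
from collections import Counter


def count_phenotypes(diplotypes, gene, phenotype_fn):
    """Count how often each phenotype occurs across (allele1, allele2) pairs.

    phenotype_fn is called as phenotype_fn(gene, allele1, allele2),
    matching the phenotyper calls shown above.
    """
    return Counter(phenotype_fn(gene, a1, a2) for a1, a2 in diplotypes)


# Hypothetical cohort; the diplotypes reuse alleles from the example above.
cohort = [("*1", "*1"), ("*1", "*4"), ("*1", "*1"), ("*4", "*5")]

# With pypgx installed, the function could be used like this:
# from pypgx.phenotyper import phenotyper
# print(count_phenotypes(cohort, "cyp2d6", phenotyper))
```

Passing the phenotyping function as an argument keeps the tally independent of pypgx, so it can be tried with a stand-in function before running it on real calls.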
