0.6.1 release
sigven committed Nov 30, 2020
1 parent 26bb096 commit 0e91907
Showing 51 changed files with 2,921 additions and 2,051 deletions.
168 changes: 90 additions & 78 deletions README.md

Large diffs are not rendered by default.

57 changes: 57 additions & 0 deletions conda_pkg/README.md
@@ -0,0 +1,57 @@
### Installation of CPSR using Conda

This is an alternative installation approach that does not require Docker on your machine. At the moment it works only for Linux systems.

A prerequisite is that you have Conda installed. If you do not, download and install Miniconda with:

```
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
bash miniconda.sh -b -p ./miniconda
```

Run the following to add Conda to your PATH. You can also add this line to your `~/.bashrc` or `~/.zshrc` to avoid re-running it in the future:

```
. ./miniconda/etc/profile.d/conda.sh
```

#### Alternative 1
Create a new environment (`-n cpsr`) and install the _cpsr_ Conda package directly from Anaconda Cloud:

```
conda create -n cpsr -c conda-forge -c bioconda -c pcgr cpsr
```

#### Alternative 2
Build the _cpsr_ package from source, which is useful if you need to use the development code from the repository:

```
conda install conda-build
export CHANNELS="-c conda-forge -c bioconda -c pcgr -c defaults"
conda build $CHANNELS conda_pkg/cpsr
conda install --use-local $CHANNELS cpsr
```

For both alternatives you also need to download the reference data bundle for your genome build:

```
wget http://insilico.hpc.uio.no/pcgr/pcgr.databundle.20201123.grch37.tar.gz -O grch37.tar.gz
wget http://insilico.hpc.uio.no/pcgr/pcgr.databundle.20201123.grch38.tar.gz -O grch38.tar.gz
tar -xzf grch37.tar.gz # will extract into ./data/grch37/
tar -xzf grch38.tar.gz # will extract into ./data/grch38/
```

You may encounter errors during installation: because packages in public repositories are updated continuously, some may conflict with each other or be unavailable for your system. Use the Dockerized version of CPSR whenever possible.

### Running condarized CPSR

Activate your environment with:

```
conda activate cpsr
```

Run CPSR with the `--no-docker` flag. The directory given to `--pcgr_dir` now only needs to contain the `data` directory that you downloaded.

If you encounter errors with VEP, you may need to unset or reset `PERL5LIB` (e.g. `export PERL5LIB=""`); see the [following issue](https://github.com/bioconda/bioconda-recipes/issues/4390).
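
For scripted runs, the two notes above can be combined in a small wrapper. The sketch below is an illustration only: it builds a `cpsr.py` command line with `--no-docker` and prepares an environment in which `PERL5LIB` is cleared. The helper name `build_cpsr_command` and the example paths are hypothetical, and only flags that appear in this commit (`--query_vcf`, `--pcgr_dir`, `--no-docker`) are used; a real invocation needs further arguments that are not shown in this diff.

```python
import os


def build_cpsr_command(query_vcf, pcgr_dir, script="cpsr.py"):
    """Return (argv, env) for a non-Docker CPSR run (hypothetical helper)."""
    argv = [
        "python", script,
        "--query_vcf", query_vcf,
        "--pcgr_dir", pcgr_dir,  # only needs to contain the data/ directory
        "--no-docker",
    ]
    env = dict(os.environ)   # copy, so the caller's environment is untouched
    env["PERL5LIB"] = ""     # same effect as `export PERL5LIB=""`
    return argv, env


# Example paths are placeholders, not files shipped with CPSR.
argv, env = build_cpsr_command("sample.vcf.gz", ".")
print(" ".join(argv))
```

Once the remaining required arguments are appended, the pair could be passed to `subprocess.run(argv, env=env, check=True)`.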
30 changes: 17 additions & 13 deletions cpsr.py
@@ -12,9 +12,9 @@
import toml
from argparse import RawTextHelpFormatter

PCGR_VERSION = 'dev'
CPSR_VERSION = '0.6.0rc'
DB_VERSION = 'PCGR_DB_VERSION = 20200920'
PCGR_VERSION = '0.9.1'
CPSR_VERSION = '0.6.1'
DB_VERSION = 'PCGR_DB_VERSION = 20201123'
VEP_VERSION = '101'
GENCODE_VERSION = '35'
VEP_ASSEMBLY = 'GRCh38'
@@ -24,7 +24,7 @@
#global VEP_ASSEMBLY

GE_panels = {
0: "CPSR exploratory cancer predisposition panel (n = 216, TCGA + Cancer Gene Census + NCGC + Other)",
0: "CPSR exploratory cancer predisposition panel (n = 335, Genomics England PanelApp / TCGA Germline Study / Cancer Gene Census / Other)",
1: "Adult solid tumours cancer susceptibility (Genomics England PanelApp)",
2: "Adult solid tumours for rare disease (Genomics England PanelApp)",
3: "Bladder cancer pertinent cancer susceptibility (Genomics England PanelApp)",
@@ -72,7 +72,7 @@

def __main__():

panels = "0 = CPSR exploratory cancer predisposition panel (n = 216, TCGA + Cancer Gene Census + NCGC + Other)\n"
panels = "0 = CPSR exploratory cancer predisposition panel (n = 335, Genomics England PanelApp / TCGA Germline Study / Cancer Gene Census / Other)\n"
panels = panels + "1 = Adult solid tumours cancer susceptibility (Genomics England PanelApp)\n"
panels = panels + "2 = Adult solid tumours for rare disease (Genomics England PanelApp)\n"
panels = panels + "3 = Bladder cancer pertinent cancer susceptibility (Genomics England PanelApp)\n"
@@ -137,9 +137,10 @@ def __main__():
optional.add_argument('--docker-uid', dest='docker_user_id', help='Docker user ID. Default is the host system user ID. If you are experiencing permission errors,\n try setting this up to root (`--docker-uid root`), default: %(default)s')
optional.add_argument('--no-docker', action='store_true', dest='no_docker', default=False, help='Run the CPSR workflow in a non-Docker mode, default: %(default)s')
optional.add_argument('--ignore_noncoding', action='store_true',dest='ignore_noncoding',default=False,help='Do not list non-coding variants in HTML report')
optional.add_argument('--incidental_findings', action='store_true',dest='incidental_findings',default=False, help='Include variants found in ACMG-recommended list for incidental findings (v2.0)')
optional.add_argument('--secondary_findings', action='store_true',dest='secondary_findings',default=False, help='Include variants found in ACMG-recommended list for secondary findings (v2.0)')
optional.add_argument('--gwas_findings', action='store_true',dest='gwas_findings',default=False, help='Report overlap with low to moderate cancer risk variants (tag SNPs) identified from genome-wide association studies')
optional.add_argument('--classify_all', action='store_true',dest='classify_all',help='Provide CPSR variant classifications (TIER 1-5) also for variants with exising ClinVar classifications in output TSV')
optional.add_argument('--clinvar_ignore_noncancer', action='store_true', help='Ignore (exclude from report) ClinVar-classified variants reported only for phenotypes/conditions NOT related to cancer')
optional.add_argument('--maf_upper_threshold', type=float, default = 0.9, dest = 'maf_upper_threshold',help='Upper MAF limit (gnomAD global population frequency) for variants to be included in the report, default: %(default)s')
optional.add_argument('--debug',action='store_true',default=False, help='Print full docker commands to log, default: %(default)s')
required.add_argument('--query_vcf', help='VCF input file with germline query variants (SNVs/InDels).', required = True)
@@ -471,21 +472,24 @@ def run_cpsr(arg_dict, host_directories, config_options):
virtual_panel_id = -1
ignore_noncoding = 0
gwas_findings = 0
incidental_findings = 0
secondary_findings = 0
classify_all = 0
clinvar_ignore_noncancer = 0

diagnostic_grade_set = "OFF"
incidental_findings_set = "OFF"
secondary_findings_set = "OFF"
gwas_findings_set = "OFF"

if arg_dict['clinvar_ignore_noncancer']:
clinvar_ignore_noncancer = 1
if arg_dict['classify_all']:
classify_all = 1
if arg_dict['gwas_findings']:
gwas_findings = 1
gwas_findings_set = "ON"
if arg_dict['incidental_findings']:
incidental_findings = 1
incidental_findings_set = "ON"
if arg_dict['secondary_findings']:
secondary_findings = 1
secondary_findings_set = "ON"
if arg_dict['diagnostic_grade_only']:
diagnostic_grade_only = 1
diagnostic_grade_set = "ON"
@@ -607,7 +611,7 @@ def run_cpsr(arg_dict, host_directories, config_options):
else:
logger.info("Virtual gene panel: " + str(GE_panels[virtual_panel_id]))
logger.info("Diagnostic-grade genes in virtual panels (GE PanelApp): " + str(diagnostic_grade_set))
logger.info("Include incidential findings (ACMG recommended list v2.0): " + str(incidental_findings_set))
logger.info("Include incidential findings (ACMG recommended list v2.0): " + str(secondary_findings_set))
logger.info("Include low to moderate cancer risk variants from genome-wide association studies: " + str(gwas_findings_set))
logger.info("Reference population, germline variant frequencies (gnomAD): " + str(config_options['popgen']['pop_gnomad']).upper())
logger.info("Genome assembly: " + str(arg_dict['genome_assembly']))
@@ -725,7 +729,7 @@ def run_cpsr(arg_dict, host_directories, config_options):
str(PCGR_VERSION) + " " + str(CPSR_VERSION) + " " + str(arg_dict['genome_assembly']) + " " + \
str(virtual_panel_id) + " " + str(custom_bed) + " " + str(arg_dict['maf_upper_threshold']) + " " + \
str(diagnostic_grade_only) + " " + data_dir + " " + str(ignore_noncoding) + " " + \
str(incidental_findings) + " " + str(gwas_findings) + " " + str(classify_all) + docker_command_run_end)
str(clinvar_ignore_noncancer) + " " + str(secondary_findings) + " " + str(gwas_findings) + " " + str(classify_all) + docker_command_run_end)
check_subprocess(logger, cpsr_report_command)
logger.info("Finished")
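
The flag handling in `run_cpsr` above follows one pattern: each `store_true` argument becomes a 0/1 integer for the report command and, for some options, an "ON"/"OFF" label for the log. A compact sketch of that pattern, using the option names from this commit, might look as follows. The helper `flags_to_report_args` is hypothetical and not part of `cpsr.py`; the ordering mirrors the updated command string (`clinvar_ignore_noncancer`, `secondary_findings`, `gwas_findings`, `classify_all`).

```python
import argparse

# Order of the trailing flags in the updated report command of this commit.
FLAG_NAMES = ["clinvar_ignore_noncancer", "secondary_findings",
              "gwas_findings", "classify_all"]


def flags_to_report_args(arg_dict):
    """Map boolean CLI flags to a '0 1 ...' string and ON/OFF labels (hypothetical helper)."""
    ints = {name: int(bool(arg_dict[name])) for name in FLAG_NAMES}
    labels = {name: "ON" if ints[name] else "OFF" for name in FLAG_NAMES}
    positional = " ".join(str(ints[name]) for name in FLAG_NAMES)
    return positional, labels


parser = argparse.ArgumentParser()
for flag in FLAG_NAMES:
    parser.add_argument("--" + flag, action="store_true", default=False)

args = vars(parser.parse_args(["--secondary_findings", "--classify_all"]))
positional, labels = flags_to_report_args(args)
print(positional)                    # 0 1 0 1
print(labels["secondary_findings"])  # ON
```

Keeping the flag order in a single list makes it harder for the command string and the flag parsing to drift apart when an option such as `--clinvar_ignore_noncancer` is added.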

Binary file added cpsr_classification.pdf
Binary file not shown.
Binary file added cpsr_classification.png
19 changes: 19 additions & 0 deletions docs/CHANGELOG.md
@@ -1,6 +1,25 @@

## CHANGELOG

#### 0.6.1 - November 30th 2020

##### Added

- Increased number of genes in panel 0: All genes in 42 virtual panels related to cancer conditions in Genomics England PanelApp now also contribute toward panel 0
- Added option in main script (`--clinvar_ignore_noncancer`) that will exclude any query variants (from HTML report and TSV/JSON output) that have been reported and classified for non-cancer related conditions only (in ClinVar)
  - This excludes variants associated with non-cancer related phenotypes
- For the variant biomarker table, the resolution of the reported biomarker mapping is highlighted with designated background colors for the gene (exact/codon - black vs. exon/gene - orange)

##### Fixed
- Bug in GWAS hits retrieval, [Issue #30](https://github.com/sigven/cpsr/issues/18)
- Custom VCF tags (as specified by user in configuration file) not shown in output TSV files

##### Changed
- Removed DisGeNET annotations from output (associations from Open Targets Platform serve same purpose)
- Renamed report section __Genomic Biomarkers__ to __Variant Biomarkers__
- Option `--incidental_findings` changed back to `--secondary_findings` - recommended term to use according to ACMG
- Removed _MOD (mechanism-of-disease)_ from TSV output file
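
The behaviour described for `--clinvar_ignore_noncancer` in this entry can be illustrated with a small filter. This is a sketch of one reading of the changelog text, not the CPSR implementation: the record layout (`clinvar_classification`, `clinvar_conditions_cancer`) and the function `keep_variant` are assumptions made for the example, as is the choice to leave variants without any ClinVar classification untouched.

```python
def keep_variant(variant):
    """Return False only for ClinVar-classified variants with no cancer-related condition.

    `variant` is a hypothetical dict with keys:
      clinvar_classification: str or None
      clinvar_conditions_cancer: list of bool, one per reported condition
    """
    if variant["clinvar_classification"] is None:
        return True  # not classified in ClinVar: assumed unaffected by the option
    return any(variant["clinvar_conditions_cancer"])


# Made-up example records, for illustration only.
variants = [
    {"id": "v1", "clinvar_classification": "Pathogenic",
     "clinvar_conditions_cancer": [True, False]},
    {"id": "v2", "clinvar_classification": "Benign",
     "clinvar_conditions_cancer": [False]},
    {"id": "v3", "clinvar_classification": None,
     "clinvar_conditions_cancer": []},
]
kept = [v["id"] for v in variants if keep_variant(v)]
print(kept)  # ['v1', 'v3']
```

Here `v2` is dropped because it is classified in ClinVar but only for a non-cancer condition, while `v1` is kept because at least one of its reported conditions is cancer-related.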

#### 0.6.0rc - September 24th 2020

- Data updates: ClinVar, GWAS catalog, GENCODE, CIViC, CancerMine, UniProt KB, dbNSFP, Pfam, KEGG, Open Targets Platform, Genomics England PanelApp
88 changes: 71 additions & 17 deletions docs/CHANGELOG.rst
@@ -1,30 +1,78 @@
CHANGELOG
---------

0.6.0 - September 23rd 2020
^^^^^^^^^^^^^^^^^^^^^^^^^^^
0.6.1 - November 30th 2020
^^^^^^^^^^^^^^^^^^^^^^^^^^

Added
'''''

- Increased number of genes in panel 0: All genes in 42 virtual panels
related to cancer conditions in Genomics England PanelApp now also
contribute toward panel 0
- Added option in main script (``--clinvar_ignore_noncancer``) that
will exclude any query variants (from HTML report and TSV/JSON
output) that have been reported and classified for non-cancer related
conditions only (in ClinVar)

- This excludes variants associated with non-cancer related
phenotypes

- For the variant biomarker table, the resolution of the reported
biomarker mapping is highlighted with designated background colors
for the gene (exact/codon - black vs. exon/gene - orange)

Fixed
'''''

- Bug in GWAS hits retrieval, `Issue
#30 <https://github.com/sigven/cpsr/issues/18>`__
- Custom VCF tags (as specified by user in configuration file) not
shown in output TSV files

Changed
'''''''

- Removed DisGeNET annotations from output (associations from Open
Targets Platform serve same purpose)
- Renamed report section **Genomic Biomarkers** to **Variant
Biomarkers**
- Option ``--incidental_findings`` changed back to
``--secondary_findings`` - recommended term to use according to ACMG
- Removed *MOD (mechanism-of-disease)* from TSV output file

0.6.0rc - September 24th 2020
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- Data updates: ClinVar, GWAS catalog, GENCODE, CIViC, CancerMine,
UniProt KB, dbNSFP, Pfam, KEGG, Open Targets Platform, Genomics
England PanelApp
- Software updates: VEP 101

.. _fixed-1:

Fixed
'''''

- Duplicated entries in incidental findings

.. _changed-1:

Changed
'''''''

- All arguments to ``cpsr.py`` is now organized with a ``--`` (no
positional arguments)
- All arguments to ``cpsr.py`` are now non-positional
- Arguments to ``cpsr.py`` are divided into two groups: *required* and
*optional*
- ``secondary_findings`` is now coined ``incidental_findings``
- Option **gwas:gwas_hits** in CPSR configuration file is now option
``--gwas_findings`` in ``cpsr.py``
- Option **gwas:gwas_hits** in CPSR configuration file is now optional
argument ``--gwas_findings`` in ``cpsr.py``
- Option **classification:clinvar_cpsr** in CPSR configuration file is
now option ``--classify_all`` in ``cpsr.py``
now optional argument ``--classify_all`` in ``cpsr.py``
- Option **maf_imits:maf_gnomad** in CPSR configuration file is now
option ``--maf_upper_threshold`` in ``cpsr.py``
optional argument ``--maf_upper_threshold`` in ``cpsr.py``
- Option **secondary_findings:show_sf** in CPSR configuration file is
now option ``--incidental_findings`` in ``cpsr.py``
now optional argument ``--incidental_findings`` in ``cpsr.py``
- Virtual panels are now displayed through HTML (previously static
ggplot plot)
- **Settings** section of report is now divided into three:
@@ -33,13 +81,19 @@ Changed
- Report configuration
- Virtual panel

- Classifications of genes as tumor suppressors/oncogenes are now based
on a combination of CancerMine citation count and presence in Network
of Cancer Genes

.. _added-1:

Added
'''''

- Missing ACMG criteria for classification of silent and intronic
variants outside of splice regions (ACMG_BP7)
- Missing ACMG criterion for classification of silent and intronic
variants outside of splice regions (*ACMG_BP7*)
- Missing ACMG criterion for classification of variants in promoter and
untranslated regions (ACMG_BP3)
untranslated regions (*ACMG_BP3*)
- Possibility to create custom virtual panel - any combination of genes
from panel 0 provided as a single-column text file with argument
``--custom_list``
@@ -52,7 +106,7 @@ Added
0.5.2 - November 18th 2019
^^^^^^^^^^^^^^^^^^^^^^^^^^

.. _changed-1:
.. _changed-2:

Changed
'''''''
@@ -63,7 +117,7 @@ Changed
- Moved virtual panel identifier from positional argument to optional
argument (``--panel_id``) in ``cpsr.py``

.. _added-1:
.. _added-2:

Added
'''''
@@ -75,7 +129,7 @@ Added
0.5.1 - October 14th 2019
^^^^^^^^^^^^^^^^^^^^^^^^^

.. _fixed-1:
.. _fixed-2:

Fixed
'''''
@@ -88,7 +142,7 @@ Fixed
0.5.0 - September 23rd 2019
^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. _fixed-2:
.. _fixed-3:

Fixed
'''''
@@ -105,7 +159,7 @@ Fixed
- Handling of non-coding variants (synonymous, upstream_variants) in
the report, no longer excluded

.. _added-2:
.. _added-3:

Added
'''''
Binary file modified docs/_build/doctrees/CHANGELOG.doctree
Binary file not shown.
Binary file modified docs/_build/doctrees/about.doctree
Binary file not shown.
Binary file modified docs/_build/doctrees/annotation_resources.doctree
Binary file not shown.
Binary file modified docs/_build/doctrees/environment.pickle
Binary file not shown.
Binary file modified docs/_build/doctrees/getting_started.doctree
Binary file not shown.
Binary file modified docs/_build/doctrees/output.doctree
Binary file not shown.
2 changes: 1 addition & 1 deletion docs/_build/html/.buildinfo
@@ -1,4 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 430fd7bd5058a69516f266833495f846
config: 8e9d3f30cf8d8e2799e0ca09ed367f82
tags: 645f666f9bcd5a90fca523b33c5a78b7