Module/ichorcna/1.0 #178

Merged May 4, 2021 (9 commits)
Changes from 6 commits
6 changes: 4 additions & 2 deletions demo/Snakefile
@@ -37,6 +37,7 @@ configfile: "../modules/utils/2.0/config/default.yaml"
configfile: "../modules/bwa_mem/1.0/config/default.yaml"
configfile: "../modules/liftover/1.0/config/default.yaml"
configfile: "../modules/controlfreec/1.1/config/default.yaml"
configfile: "../modules/ichorcna/1.0/config/default.yaml"


# Load project-specific config, which includes the shared
@@ -67,7 +68,7 @@ include: "../modules/utils/2.0/utils.smk"
include: "../modules/bwa_mem/1.0/bwa_mem.smk"
include: "../modules/liftover/1.0/liftover.smk"
include: "../modules/controlfreec/1.1/controlfreec.smk"

include: "../modules/ichorcna/1.0/ichorcna.smk"


##### TARGETS ######
@@ -83,4 +84,5 @@ rule all:
rules._strelka_all.input,
rules._bwa_mem_all.input,
rules._liftover_all.input,
rules._controlfreec_all.input
rules._controlfreec_all.input,
rules._ichorcna_all.input
Collaborator:
add a newline after this

Member Author:
Done

7 changes: 5 additions & 2 deletions demo/config.yaml
@@ -77,5 +77,8 @@ lcr-modules:
tool: "battenberg"
inputs:
sample_seg: "data/{tool}/hg38/{tumour_sample_id}--{normal_sample_id}_subclones.igv.seg"



ichorcna:
inputs:
sample_bam: "data/{sample_id}.bam"
sample_bai: "data/{sample_id}.bam.bai"
Collaborator:
Add a newline at the end if this is the last line

Member Author:
Done

120 changes: 120 additions & 0 deletions modules/ichorcna/1.0/config/default.yaml
@@ -0,0 +1,120 @@
lcr-modules:

ichorcna:

inputs:
# Available wildcards: {seq_type} {genome_build} {sample_id}
sample_bam: "__UPDATE__"
sample_bai: "__UPDATE__"


scratch_subdirectories: []

options:
readcounter:
readCounterScript: "{MODSDIR}/src/readCounter"
chrs:
hg19: "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X,Y"
Collaborator:
Ideally we wouldn't rely on the user to specify the chromosomes this way but I can live with this for now.

grch37: "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X,Y"
hs37d5: "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X,Y"
hg38: "chr1,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chr20,chr21,chr22,chrX,chrY"
grch38: "chr1,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chr20,chr21,chr22,chrX,chrY"
qual: 20
Collaborator:
Explain with a comment what this does.
e.g.
qual: 20 #set the minimum mapping quality (or whatever this actually means)

Member Author:
Done

binSize: 1000000 # set window size to compute coverage
# available binSizes are: 1000000, 500000, 50000, 10000
run:
ichorCNA_libdir: "{MODSDIR}/src/"
ichorCNA_rscript: "{MODSDIR}/src/runIchorCNA.R"
# use panel matching same bin size (optional)
ichorCNA_normalPanel:
"1000000": "{MODSDIR}/src/inst/extdata/HD_ULP_PoN_{genome_build}_1Mb_median_normAutosome_median.rds"
"500000": "{MODSDIR}/src/inst/extdata/HD_ULP_PoN_{genome_build}_500kb_median_normAutosome_median.rds"
# must use gc wig file corresponding to same binSize (required)
ichorCNA_gcWig:
"1000000": "{MODSDIR}/src/inst/extdata/gc_{genome_build}_1000kb.wig"
Collaborator:
Does this genome_build naming match the one we use? I assume it does, since this was run in GAMBL, right?

Member Author:
Yes. Unfortunately, ichorCNA's GitHub repo is messy and its naming conventions are inconsistent. In my original version, I had to manually rename some of their reference files to fit this format. In the current version, one rule creates a set of symlinks that rename the reference files so they fit the downstream rules.
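
The symlink approach described above can be sketched roughly as follows. This is only an illustration, not the module's actual rule: `link_reference` and `normalized_gc_wig_name` are hypothetical helpers, and the upstream file name used in the usage note is made up. Only the target pattern `gc_{genome_build}_{size}kb.wig` comes from the config in this PR.

```python
import os
import tempfile


def normalized_gc_wig_name(genome_build, bin_kb):
    """Build the file name the config expects (pattern taken from default.yaml)."""
    return f"gc_{genome_build}_{bin_kb}kb.wig"


def link_reference(src, dest):
    """Create dest as a symlink pointing at src, replacing a stale link if present.

    Hypothetical helper: the real module does this inside a Snakemake rule.
    """
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    if os.path.islink(dest):
        os.remove(dest)
    # Use an absolute target so the link stays valid regardless of the working directory.
    os.symlink(os.path.abspath(src), dest)
    return dest
```

For example, `link_reference("upstream_gc_file.wig", os.path.join("extdata", normalized_gc_wig_name("hg38", 1000)))` would expose a (made-up) upstream file under the name `gc_hg38_1000kb.wig`, so downstream rules can rely on a single naming scheme.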

"500000": "{MODSDIR}/src/inst/extdata/gc_{genome_build}_500kb.wig"
"50000": "{MODSDIR}/src/inst/extdata/gc_{genome_build}_50kb.wig"
"10000": "{MODSDIR}/src/inst/extdata/gc_{genome_build}_10kb.wig"
# must use map wig file corresponding to same binSize (required)
ichorCNA_mapWig:
"1000000": "{MODSDIR}/src/inst/extdata/map_{genome_build}_1000kb.wig"
"500000": "{MODSDIR}/src/inst/extdata/map_{genome_build}_500kb.wig"
"50000": "{MODSDIR}/src/inst/extdata/map_{genome_build}_50kb.wig"
"10000": "{MODSDIR}/src/inst/extdata/map_{genome_build}_10kb.wig"
# use bed file if sample has targeted regions, eg. exome data (optional)
ichorCNA_exons: NULL
ichorCNA_centromere:
grch37: "{MODSDIR}/src/inst/extdata/GRCh37.p13_centromere_UCSC-gapTable.txt"
hg19: "{MODSDIR}/src/inst/extdata/GRCh37.p13_centromere_UCSC-gapTable.txt"
hs37d5: "{MODSDIR}/src/inst/extdata/GRCh37.p13_centromere_UCSC-gapTable.txt"
grch38: "{MODSDIR}/src/inst/extdata/GRCh38.GCA_000001405.2_centromere_acen.txt"
hg38: "{MODSDIR}/src/inst/extdata/GRCh38.GCA_000001405.2_centromere_acen.txt"
ichorCNA_minMapScore: 0.75
ichorCNA_chrs:
grch37: "c('1', '2', '3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18','19','20','21','22','X')"
Collaborator:
This looks very redundant with the chrs in the config near the top. What's the deal?

Member Author:
It's just the input syntax for two different programs: the first list is formatted for readCounter (part of hmmcopy_utils), and this one is formatted for ichorCNA (to be used in R).

Collaborator:
Can this be moved to the snakefile instead of being in the config? There, you can use the file listing chromosomes generated by the reference files (main_chromosomes.txt), or use a function to generate the chromosome names (there is an example in the sage module).

Collaborator:
Good suggestion. I agree it's better to embed anything that generates code (Python, R, etc.) in the snakefile.
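
One way the suggestion above could look is sketched below. It is a minimal illustration, not the sage module's code: the helper names are hypothetical, and the rule that `hg38`/`grch38` carry the `chr` prefix while the other builds do not is an assumption taken from the `chrs` values in this config. Both strings (the comma-separated list for readCounter and the R vector for ichorCNA) are derived from one chromosome list.

```python
def main_chromosomes(genome_build, include_y=True):
    """Return the main chromosome names for a genome build (hypothetical helper)."""
    # Assumption: only hg38 and grch38 use "chr"-prefixed names, as in this config.
    prefix = "chr" if genome_build in {"hg38", "grch38"} else ""
    names = [str(i) for i in range(1, 23)] + ["X"]
    if include_y:
        names.append("Y")
    return [prefix + name for name in names]


def readcounter_chrs(genome_build):
    """Comma-separated list, the format readCounter takes (autosomes, X and Y)."""
    return ",".join(main_chromosomes(genome_build, include_y=True))


def ichorcna_chrs(genome_build):
    """R character vector as a string, the format passed to ichorCNA (no Y)."""
    quoted = ",".join(f"'{c}'" for c in main_chromosomes(genome_build, include_y=False))
    return f"c({quoted})"
```

In a snakefile, functions like these could be called from a `params` entry with `wildcards.genome_build`, which would remove the two hand-written chromosome tables from the config.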

hg19: "c('1', '2', '3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18','19','20','21','22','X')"
hs37d5: "c('1', '2', '3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18','19','20','21','22','X')"
grch38: "c('chr1', 'chr2', 'chr3','chr4','chr5','chr6','chr7','chr8','chr9','chr10','chr11','chr12','chr13','chr14','chr15','chr16','chr17','chr18','chr19','chr20','chr21','chr22','chrX')"
hg38: "c('chr1', 'chr2', 'chr3','chr4','chr5','chr6','chr7','chr8','chr9','chr10','chr11','chr12','chr13','chr14','chr15','chr16','chr17','chr18','chr19','chr20','chr21','chr22','chrX')"
ichorCNA_fracReadsInChrYForMale: 0.002
Collaborator:
Can you briefly document this?

Member Author:
Done

ichorCNA_genomeStyle: # can set this to UCSC or NCBI
grch37: "NCBI"
hg19: "NCBI"
hs37d5: "NCBI"
grch38: "UCSC"
hg38: "UCSC"
# chrs used for training ichorCNA parameters, e.g. tumor fraction.
ichorCNA_chrTrain:
grch37: "c(1:22)"
hg19: "c(1:22)"
hs37d5: "c(1:22)"
grch38: "paste0('chr', c(1:22))"
hg38: "paste0('chr', c(1:22))"
# non-tumor fraction parameter restart values; higher values should be included for cfDNA
ichorCNA_normal: "c(0.5,0.6,0.7,0.8,0.9,0.95)"
# ploidy parameter restart values
ichorCNA_ploidy: "c(2,3)"
ichorCNA_estimateNormal: TRUE
ichorCNA_estimatePloidy: TRUE
ichorCNA_estimateClonality: TRUE
# states to use for subclonal CN
ichorCNA_scStates: "c(1,3)"
# set maximum copy number to use
ichorCNA_maxCN: 5
# TRUE/FALSE to include homozygous deletion state # FALSE for low coverage libraries (ex. 0.1x) ; can turn on for higher coverage data (ex. >10x)
ichorCNA_includeHOMD: FALSE
# Exclude solutions if total length of subclonal CNAs > this fraction of the genome
ichorCNA_maxFracGenomeSubclone: 0.5
# Exclude solutions if total length of subclonal CNAs > this fraction of total CNA length
ichorCNA_maxFracCNASubclone: 0.7
# control segmentation - higher (e.g. 0.9999999) leads to higher specificity and fewer segments
# lower (e.g. 0.99) leads to higher sensitivity and more segments
ichorCNA_txnE: 0.9999
# control segmentation - higher (e.g. 10000000) leads to higher specificity and fewer segments
# lower (e.g. 100) leads to higher sensitivity and more segments
ichorCNA_txnStrength: 10000
ichorCNA_plotFileType: "pdf"
ichorCNA_plotYlim: "c(-2,2)"


conda_envs:
ichorcna: "{MODSDIR}/envs/ichorcna.env.yaml"


threads:
readcounter: 4
run: 4

resources:
readcounter:
mem_mb: 2000
bam: 1
run:
mem_mb: 2000
bam: 1

pairing_config:
genome:
run_paired_tumours: False
run_unpaired_tumours_with: "no_normal"
run_paired_tumours_as_unpaired: True
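
The config above keys the GC wig, map wig and normal panel paths by `binSize`, with the wig files required and the panel optional. A rough sketch of how such a lookup could be resolved and validated is shown here; `resolve_by_bin_size` is a hypothetical helper and not necessarily how `ichorcna.smk` does it. The two tables are abbreviated copies of the config values.

```python
# Abbreviated copies of the binSize-keyed tables from default.yaml.
GC_WIG = {
    "1000000": "{MODSDIR}/src/inst/extdata/gc_{genome_build}_1000kb.wig",
    "500000": "{MODSDIR}/src/inst/extdata/gc_{genome_build}_500kb.wig",
}
NORMAL_PANEL = {
    "1000000": "{MODSDIR}/src/inst/extdata/HD_ULP_PoN_{genome_build}_1Mb_median_normAutosome_median.rds",
}


def resolve_by_bin_size(table, bin_size, genome_build, required=True):
    """Pick the entry matching bin_size and fill in the genome build.

    Returns None for a missing optional entry; raises ValueError for a
    missing required one. {MODSDIR} is left untouched for later expansion.
    """
    key = str(bin_size)  # the config keys are quoted strings
    if key not in table:
        if required:
            available = ", ".join(sorted(table, key=int))
            raise ValueError(f"No file for binSize {bin_size}; available: {available}")
        return None
    # str.replace instead of str.format so the {MODSDIR} placeholder survives.
    return table[key].replace("{genome_build}", genome_build)
```

With these tables, `resolve_by_bin_size(GC_WIG, 1000000, "hg38")` yields the 1000kb GC wig path for hg38, while `resolve_by_bin_size(NORMAL_PANEL, 500000, "hg38", required=False)` returns `None` because this abbreviated panel table has no 500 kb entry.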
106 changes: 106 additions & 0 deletions modules/ichorcna/1.0/envs/ichorcna.env.yaml
@@ -0,0 +1,106 @@
name: null
channels:
- conda-forge
- bioconda
- defaults
- r
dependencies:
- _libgcc_mutex=0.1
- _openmp_mutex=4.5
- _r-mutex=1.0.1
- binutils_impl_linux-64=2.35.1
- binutils_linux-64=2.35
- bioconductor-biocgenerics=0.36.0
- bioconductor-genomeinfodb=1.26.0
- bioconductor-genomeinfodbdata=1.2.4
- bioconductor-genomicranges=1.42.0
- bioconductor-hmmcopy=1.32.0
- bioconductor-iranges=2.24.0
- bioconductor-s4vectors=0.28.0
- bioconductor-xvector=0.30.0
- bioconductor-zlibbioc=1.36.0
- bwidget=1.9.14
- bzip2=1.0.8
- ca-certificates=2020.12.5
- cairo=1.16.0
- curl=7.71.1
- fontconfig=2.13.1
- freetype=2.10.4
- fribidi=1.0.10
- gcc_impl_linux-64=9.3.0
- gcc_linux-64=9.3.0
- gettext=0.19.8.1
- gfortran_impl_linux-64=9.3.0
- gfortran_linux-64=9.3.0
- graphite2=1.3.13
- gsl=2.6
- gxx_impl_linux-64=9.3.0
- gxx_linux-64=9.3.0
- harfbuzz=2.8.0
- icu=68.1
- jpeg=9d
- kernel-headers_linux-64=2.6.32
- krb5=1.17.2
- ld_impl_linux-64=2.35.1
- libblas=3.8.0
- libcblas=3.8.0
- libcurl=7.71.1
- libedit=3.1.20191231
- libffi=3.3
- libgcc-devel_linux-64=9.3.0
- libgcc-ng=9.3.0
- libgfortran-ng=9.3.0
- libgfortran5=9.3.0
- libglib=2.66.7
- libgomp=9.3.0
- libiconv=1.16
- liblapack=3.8.0
- libopenblas=0.3.10
- libpng=1.6.37
- libssh2=1.9.0
- libstdcxx-devel_linux-64=9.3.0
- libstdcxx-ng=9.3.0
- libtiff=4.2.0
- libuuid=2.32.1
- libwebp-base=1.2.0
- libxcb=1.13
- libxml2=2.9.10
- lz4-c=1.9.3
- make=4.3
- ncurses=6.2
- openssl=1.1.1j
- pango=1.42.4
- pcre=8.44
- pcre2=10.36
- pixman=0.40.0
- pthread-stubs=0.4
- r-base=4.0.3
- r-bitops=1.0_6
- r-data.table=1.14.0
- r-getopt=1.20.3
- r-ichorcna=0.2.0
- r-optparse=1.6.6
- r-plyr=1.8.6
- r-rcpp=1.0.6
- r-rcurl=1.98_1.2
- readline=8.0
- sed=4.8
- sysroot_linux-64=2.12
- tk=8.6.10
- tktable=2.10
- xorg-kbproto=1.0.7
- xorg-libice=1.0.10
- xorg-libsm=1.2.3
- xorg-libx11=1.7.0
- xorg-libxau=1.0.9
- xorg-libxdmcp=1.1.3
- xorg-libxext=1.3.4
- xorg-libxrender=0.9.10
- xorg-libxt=1.2.1
- xorg-renderproto=0.11.1
- xorg-xextproto=7.3.0
- xorg-xproto=7.0.31
- xz=5.2.5
- zlib=1.2.11
- zstd=1.4.9
prefix: /projects/rmorin/projects/tumour_contam/envs/ichorcna