
Commit

Merge pull request #914 from mskcc/hotfix/1.4_official
Hotfix/1.4 official
anoronh4 authored Aug 25, 2021
2 parents 499ee2a + 172da37 commit 18ee862
Showing 3 changed files with 11 additions and 5 deletions.
4 changes: 3 additions & 1 deletion docs/reference-files.md
@@ -15,9 +15,10 @@ Part of the [GATK bundle](https://software.broadinstitute.org/gatk/download/bund
BED files that specify the regions of the genome to consider for variant calling are specified in the [input files](running-the-pipeline.md#input-files).

### Exome Capture Platforms
- For exomes, use BED file corresponding to the platform used for target capture. Currently, Tempo supports:
+ For exomes, use BED file corresponding to the platform used for target capture. Currently, Juno reference files are configured to support:
- __AgilentExon_51MB__: SureSelectXT Human All Exon V4 from Agilent.
- __IDT_Exome__: xGen Exome Research Panel v1.0 from IDT.
+ - __IDT_Exome_v2__: xGen Exome Research Panel v2.0 from IDT.

::: tip Note
Contact us if you are interested in support for other sequencing assays or capture kits.
@@ -78,6 +79,7 @@ You can have as many targets as you like. Under each folder are the 6 target fil
In this case, the files are symbolically linked to the original. Whether soft links or hard links are used, the files in this folder should strictly match the names `coding.bed`, `baits.interval_list`, `targets.interval_list`, `targets.bed`, `targets.bed.gz`, `targets.bed.gz.tbi`.

When running Tempo, use the parameter `--targets_base <targets_base folder>` so that Nextflow will know where to find your target files.
+ When running with `--assayType genome`, only the `<targets_base>/wgs` target folder will be available. Conversely, the `<targets_base>/wgs` target folder will not be available when `--assayType genome` is not set.
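
The availability rule described in the added sentence can be sketched as follows. This is a Python paraphrase for illustration only, not the pipeline's Groovy code; the function name `visible_targets` is hypothetical, and the folder names are taken from the platform list in this document.

```python
def visible_targets(target_dirs, assay_type):
    """Return the target folders that stay available for a given assay type.

    With assay type "genome" only "wgs" is kept; with any other
    assay type "wgs" is dropped and everything else is kept.
    """
    if assay_type == "genome":
        return [t for t in target_dirs if t == "wgs"]
    return [t for t in target_dirs if t != "wgs"]


# Example contents of a <targets_base> folder (illustrative).
dirs = ["AgilentExon_51MB", "IDT_Exome", "wgs"]
print(visible_targets(dirs, "genome"))
print(visible_targets(dirs, "exome"))
```

With these inputs, the first call keeps only `wgs` and the second keeps the two exome platforms.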

## RepeatMasker and Mappability Blacklist
BED files with genomic repeat and mappability information are used to annotate the VCFs with somatic and germline SNV/indels. These data are from [RepeatMasker](http://www.repeatmasker.org/) and the [ENCODE consortium](http://rohsdb.cmb.usc.edu/GBshape/ENCODE/index.html), and the files are retrieved from the [UCSC Genome Browser](https://genome.ucsc.edu) and parsed as such:
2 changes: 1 addition & 1 deletion lib/TempoUtils.groovy
@@ -105,7 +105,7 @@ class TempoUtils {
static def checkTarget(it, assayType, availableTargets) {
def supportedTargets = []
if(assayType == "genome"){ supportedTargets = ["wgs"] }
- else if(assayType == "exome"){ supportedTargets = availableTargets }
+ else if(assayType == "exome"){ supportedTargets = availableTargets ; supportedTargets.remove("wgs") }
else {} // this is covered by checkAssayType(){} above

if(!supportedTargets.contains(it)){
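
A note on the changed `exome` branch: assigning `availableTargets` to `supportedTargets` does not copy the collection, so `remove("wgs")` also removes the entry from the collection the caller passed in. Whether that matters depends on how the caller uses `availableTargets` afterwards, which this hunk does not show. A minimal sketch of a non-mutating variant, written in Python rather than Groovy so that it runs standalone; `supported_targets` is a hypothetical name, not a function in the repository:

```python
def supported_targets(assay_type, available_targets):
    """Return supported target ids without modifying `available_targets`."""
    if assay_type == "genome":
        return ["wgs"]
    if assay_type == "exome":
        # Build a new list instead of removing "wgs" in place.
        return [t for t in available_targets if t != "wgs"]
    return []  # other assay types are assumed to be rejected by an earlier check


available = ["IDT_Exome", "wgs"]
print(supported_targets("exome", available))
print(available)  # the caller's list is left untouched
```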
10 changes: 7 additions & 3 deletions pipeline.nf
@@ -823,12 +823,13 @@ process CreateScatteredIntervals {

script:
scatterCount = params.scatterCount
+ subdivision_mode = targetId == "wgs" ? "INTERVAL_SUBDIVISION" : "BALANCING_WITHOUT_INTERVAL_SUBDIVISION_WITH_OVERFLOW"
"""
gatk SplitIntervals \
--reference ${genomeFile} \
--intervals ${targets} \
--scatter-count ${scatterCount} \
- --subdivision-mode BALANCING_WITHOUT_INTERVAL_SUBDIVISION_WITH_OVERFLOW \
+ --subdivision-mode ${subdivision_mode} \
--output $targetId
for i in $targetId/*.interval_list;
@@ -1649,12 +1650,13 @@ process RunPolysolver {
output:
set val("placeHolder"), idNormal, target, file("${outputPrefix}.hla.txt") into hlaOutput, hlaOutputForLOHHLA, hlaOutputForMetaDataParser

- when: "polysolver" in tools && runSomatic
+ when: "polysolver" in tools && runSomatic && ["GRCh38","GRCh37"].contains(params.genome)

script:
outputPrefix = "${idNormal}"
outputDir = "."
tmpDir = "${outputDir}-nf-scratch"
+ genome_ = params.genome == "GRCh37" ? "hg19" : "hg38"
"""
cp /home/polysolver/scripts/shell_call_hla_type .
@@ -1664,7 +1666,7 @@ process RunPolysolver {
${bamNormal} \
Unknown \
1 \
- hg19 \
+ ${genome_} \
STDFQ \
0 \
${outputDir}
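
In the `RunPolysolver` hunks above, the ternary maps every genome other than `GRCh37` to `hg38`; it is the added `when:` condition that restricts the process to `GRCh37` and `GRCh38`. A small sketch of the combined behaviour with an explicit lookup table, in Python for illustration; `polysolver_build` is a hypothetical helper, not part of the pipeline:

```python
# Mapping implied by the diff: only these two genomes reach Polysolver.
BUILD_BY_GENOME = {"GRCh37": "hg19", "GRCh38": "hg38"}


def polysolver_build(genome):
    """Return the build label for `genome`, or None if the process would be skipped."""
    return BUILD_BY_GENOME.get(genome)


for g in ("GRCh37", "GRCh38", "some_other_genome"):
    print(g, polysolver_build(g))
```

An explicit table makes an unsupported genome visible as `None` instead of relying on a separate guard to keep it away from the fallback value.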
@@ -3837,6 +3839,8 @@ def loadTargetReferences(){
def result_array = [:]
new File(params.targets_base).eachDir{ i ->
def target_id = i.getBaseName()
+ if (params.assayType == "genome" && target_id != "wgs" ){ return }
+ if (params.assayType != "genome" && target_id == "wgs" ){ return }
result_array["${target_id}"] = [:]
for ( j in params.targets.keySet()) { // baitsInterval, targetsInterval, targetsBedGz, targetsBedGzTbi, codingBed
result_array."${target_id}" << [ ("$j".toString()) : evalTargetPath(j,target_id)]