Releases: dkoboldt/varscan

VarScan v2.4.6

28 Mar 16:11
99ba0e0

VarScan v2.4.6 is the latest release and contains key improvements and fixes to the mpileup2somatic functionality introduced in v2.4.5.

See the release notes in the description.txt file for key improvements.

VarScan v2.4.5

31 Jan 22:50
51fbe2e

RELEASE NOTES FOR VARSCAN V2.4.5

31-January-2023

VarScan v2.4.5 is the current release. Yes, it has been a while, but we continue to support this tool.
VarScan v2.4.4 is the previous release.
VarScan v2.4.0 was the first release to VarScan's new home at GitHub, http://dkoboldt.github.io/varscan/
VarScan v2.3.9 and prior releases will persist on SourceForge, as will the support forums and other resources.

LICENSE
VarScan 2 is free for non-commercial use by academic, government, and non-profit/not-for-profit institutions.
A commercial version of the software is available, and licensed through the Office of Technology Management at
Washington University School of Medicine. For more information, please contact:

Office of Technology Management
[email protected]
+1 314-362-5426
https://otm.wustl.edu/for-industry/tools/

SUPPORT
Please submit any support issues to the SourceForge forum or e-mail DanKoboldt (at) gmail to contact me directly.

VERSION 2.4.5 CHANGES
The primary change with this release is a new capability: somatic calling across multiple tumor samples simultaneously. The VarScan subcommand for this tool is mpileup2somatic. As usual, it expects a multi-sample mpileup file as input, with the matched normal sample as the first sample. Most parameters for paired somatic calling are available, except the indelFilter. Usage is as follows:

USAGE: java -jar VarScan.jar mpileup2somatic [multi-sample.mpileup] OPTIONS
multi-sample.mpileup - The mpileup file from normal BAM and tumor BAM(s)

OPTIONS:
--min-coverage - Minimum coverage in normal and tumor to call variant [8]
--min-coverage-normal - Minimum coverage in normal to call somatic, supersedes min-coverage [8]
--min-coverage-tumor - Minimum coverage in tumor to call somatic, supersedes min-coverage [6]
--min-avg-qual - Minimum Phred quality to count a base [15]
--min-var-freq - Minimum variant frequency to call a heterozygote [0.20]
--min-freq-for-hom - Minimum frequency to call homozygote [0.75]
--normal-purity - Estimated purity (non-tumor content) of normal sample [1.00]
--tumor-purity - Estimated purity (tumor content) of tumor sample [1.00]
--p-value - P-value threshold to call a heterozygote [0.99]
--somatic-p-value - P-value threshold to call a somatic site [0.05]
--strand-filter - If set to 1, removes variants with >90% strand bias
--validation - If set to 1, outputs all compared positions even if non-variant
--output-file - If specified, output printed to this file rather than stdout
--vcf-sample-list - A list of sample names to use for output columns, one per line. Recommended.
--output-vcf - If set to 1, output VCF instead of VarScan native format
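
As a rough illustration of how these pieces fit together, the Python sketch below assembles a `samtools mpileup` call (normal BAM listed first) and a matching `mpileup2somatic` call as argument lists. The file names (`ref.fa`, `normal.bam`, `samples.txt`, and so on) and the two helper functions are placeholders invented for this example. Only the options listed above and the standard `samtools mpileup` flags `-f` and `-o` are used.

```python
from typing import Dict, List


def mpileup_command(reference: str, normal_bam: str, tumor_bams: List[str],
                    out_path: str) -> List[str]:
    """Build a samtools mpileup call with the normal BAM listed first."""
    if not tumor_bams:
        raise ValueError("at least one tumor BAM is required")
    return ["samtools", "mpileup", "-f", reference, "-o", out_path,
            normal_bam, *tumor_bams]


def mpileup2somatic_command(mpileup_path: str, options: Dict[str, object],
                            jar: str = "VarScan.jar") -> List[str]:
    """Build a VarScan mpileup2somatic call from an {option: value} mapping."""
    cmd = ["java", "-jar", jar, "mpileup2somatic", mpileup_path]
    for name, value in options.items():
        cmd += [f"--{name}", str(value)]
    return cmd


# Hypothetical inputs: one normal and two tumor samples.
pileup = mpileup_command("ref.fa", "normal.bam",
                         ["tumor1.bam", "tumor2.bam"], "multi-sample.mpileup")
varscan = mpileup2somatic_command("multi-sample.mpileup", {
    "min-coverage": 8,
    "somatic-p-value": 0.05,
    "strand-filter": 1,
    "vcf-sample-list": "samples.txt",
    "output-vcf": 1,
    "output-file": "somatic.vcf",
})
print(" ".join(pileup))
print(" ".join(varscan))
```

Each list can be handed to `subprocess.run(cmd, check=True)` once the placeholder paths point at real files.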

REMINDER: PLEASE USE THE FALSE POSITIVE FILTER
The scientific basis of this filter is described in the VarScan 2 publication. It will improve
the precision of variant and mutation calling by removing artifacts associated with short-read alignment.
-For somatic mutations, generate bam-readcounts with the Tumor BAM. For LOH and Germline, generate readcounts with the Normal BAM.
-For de novo mutations (trio calling), generate readcounts with the child BAM.
The filter requires the bam-readcount utility: https://github.com/genome/bam-readcount
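
One way to prepare the readcounts is sketched below, under a few assumptions: the variant file is tab-delimited with chromosome and position in its first two columns (true of VCF; assumed here for VarScan-native output, whose header row is taken to start with `chrom`), and bam-readcount is run with its `-f` (reference) and `-l` (site list) options. The helper names and file names are made up for illustration, and indels may need wider regions than the single-base sites produced here.

```python
from typing import Iterable, List, Tuple


def site_list(variant_lines: Iterable[str]) -> List[Tuple[str, int, int]]:
    """Return (chrom, start, end) sites from VarScan-native or VCF lines."""
    sites = []
    for line in variant_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip VCF headers, the VarScan-native header row, and short lines.
        if line.startswith("#") or len(fields) < 2 or fields[0].lower() == "chrom":
            continue
        pos = int(fields[1])
        sites.append((fields[0], pos, pos))
    return sites


def readcount_command(reference: str, site_file: str, bam: str) -> List[str]:
    """Build a bam-readcount call restricted to the listed sites."""
    return ["bam-readcount", "-f", reference, "-l", site_file, bam]


# Hypothetical variant lines; real files have many more columns.
example = [
    "chrom\tposition\tref\tvar\n",
    "chr1\t12345\tA\tG\n",
    "chr2\t67890\tC\tT\n",
]
sites = site_list(example)
for chrom, start, end in sites:
    print(f"{chrom}\t{start}\t{end}")
print(" ".join(readcount_command("ref.fa", "sites.txt", "tumor.bam")))
```

The printed site lines would be written to `sites.txt`, and the bam-readcount output redirected to the readcount file that fpfilter expects.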

VarScan v2.4.2

26 May 17:45

RELEASE NOTES FOR VARSCAN V2.4.2

26-May-2016

VarScan v2.4.2 is the current release
VarScan v2.4.1 is the previous release
VarScan v2.4.0 was the first release to VarScan's new home at GitHub, http://dkoboldt.github.io/varscan/
VarScan v2.3.9 and prior releases will persist on SourceForge, as will the support forums and other resources.

LICENSE
VarScan 2 is free for non-commercial use by academic, government, and non-profit/not-for-profit institutions.
A commercial version of the software is available, and licensed through the Office of Technology Management at
Washington University School of Medicine. For more information, please contact:

Paul Carter, Business Development Director
[email protected]
+1 314-362-5426
https://otm.wustl.edu/for-industry/tools/

VERSION 2.4.2 CHANGES
The primary changes are minor bugfixes for user-reported issues:

1.) Better handling of malformed mpileup lines in VarScan copynumber

2.) Addressing missing segregation STATUS codes in VarScan trio and the addition of a new code (0) for "Unknown"

3.) Addressing a VCF format issue for somatic mutation calling, in which multi-allelic sites (classified as "UNKNOWN") were reported
in non-standard VCF format (with forward slashes in the ALT column). VarScan 2 somatic should now report multiple variant alleles
using the ALT1,ALT2 format and the genotype fields for normal and tumor samples should reflect the appropriate variant allele.
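
For VCF files produced by earlier versions, the ALT column can be repaired after the fact. The sketch below is not VarScan's own code, and the function names are invented. It only rewrites a slash-separated ALT value into the comma-separated form and does not touch the genotype fields, which would still need to be checked by hand.

```python
def normalize_alt(alt: str) -> str:
    """Rewrite a slash-separated ALT value ("G/T") as comma-separated ("G,T")."""
    alleles = [allele for allele in alt.split("/") if allele]
    return ",".join(alleles)


def fix_vcf_line(line: str) -> str:
    """Normalize the ALT column (5th field) of one tab-delimited VCF record."""
    if line.startswith("#"):
        return line.rstrip("\n")
    fields = line.rstrip("\n").split("\t")
    if len(fields) > 4:
        fields[4] = normalize_alt(fields[4])
    return "\t".join(fields)


# Hypothetical record with two variant alleles in the old notation.
record = "chr1\t100\t.\tA\tG/T\t.\tPASS\tSS=5"
fixed = fix_vcf_line(record)
print(fixed)
```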

REMINDER: PLEASE USE THE FALSE POSITIVE FILTER
The scientific basis of this filter is described in the VarScan 2 publication. It will improve
the precision of variant and mutation calling by removing artifacts associated with short-read alignment.
-For somatic mutations, generate bam-readcounts with the Tumor BAM. For LOH and Germline, generate readcounts with the Normal BAM.
-For de novo mutations (trio calling), generate readcounts with the child BAM.
The filter requires the bam-readcount utility: https://github.com/genome/bam-readcount

USAGE: java -jar VarScan.jar fpfilter [variant file] [readcount file] OPTIONS
variant file - A file of SNPs or indels in VarScan-native or VCF format
readcount file - The output file from bam-readcount for those positions
_For detailed filtering instructions, please visit http://varscan.sourceforge.net_

OPTIONS:
--output-file       Optional output file for filter-pass variants
--filtered-file     Optional output file for filter-fail variants
--dream3-settings   If set to 1, optimizes filter parameters based on TCGA-ICGC DREAM-3 SNV Challenge results
--keep-failures     If set to 1, includes failures in the output file

FILTERING PARAMETERS:
--min-var-count     Minimum number of variant-supporting reads [4]
--min-var-count-lc  Minimum number of variant-supporting reads when depth below somaticPdepth [2]
--min-var-freq      Minimum variant allele frequency [0.05]
--max-somatic-p     Maximum somatic p-value [0.05]
--max-somatic-p-depth   Depth required to test max somatic p-value [10]
--min-ref-readpos   Minimum average read position of ref-supporting reads [0.1]
--min-var-readpos   Minimum average read position of var-supporting reads [0.1]
--min-ref-dist3     Minimum average distance to effective 3' end (ref) [0.1]
--min-var-dist3     Minimum average distance to effective 3' end (var) [0.1]
--min-strandedness  Minimum fraction of variant reads from each strand [0.01]
--min-strand-reads  Minimum allele depth required to perform the strand tests [5]
--min-ref-basequal  Minimum average base quality for ref allele [15]
--min-var-basequal  Minimum average base quality for var allele [15]
--min-ref-avgrl     Minimum average trimmed read length for ref allele [90]
--min-var-avgrl     Minimum average trimmed read length for var allele [90]
--max-rl-diff       Maximum average relative read length difference (ref - var) [0.25]
--max-ref-mmqs      Maximum mismatch quality sum of reference-supporting reads [100]
--max-var-mmqs      Maximum mismatch quality sum of variant-supporting reads [100]
--max-mmqs-diff     Maximum average mismatch quality sum (var - ref) [50]
--min-ref-mapqual   Minimum average mapping quality for ref allele [15]
--min-var-mapqual   Minimum average mapping quality for var allele [15]
--max-mapqual-diff  Maximum average mapping quality (ref - var) [50]
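
To make a few of these thresholds concrete, here is a simplified sketch of three checks (variant read count, variant allele frequency, and strandedness) using the default values listed above. It is an illustration only, not VarScan's implementation: the `VariantCounts` fields and the failure labels are hypothetical, and the real filter evaluates many more metrics from the bam-readcount output.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VariantCounts:
    depth: int       # total reads at the site
    var_plus: int    # variant-supporting reads on the forward strand
    var_minus: int   # variant-supporting reads on the reverse strand


def failed_checks(c: VariantCounts, min_var_count: int = 4,
                  min_var_freq: float = 0.05, min_strandedness: float = 0.01,
                  min_strand_reads: int = 5) -> List[str]:
    """Return the (hypothetical) labels of the checks this site fails."""
    var_reads = c.var_plus + c.var_minus
    failures = []
    if var_reads < min_var_count:
        failures.append("VarCount")
    if c.depth == 0 or var_reads / c.depth < min_var_freq:
        failures.append("VarFreq")
    # The strand test is only applied when enough variant reads are available.
    if var_reads >= min_strand_reads:
        minor_fraction = min(c.var_plus, c.var_minus) / var_reads
        if minor_fraction < min_strandedness:
            failures.append("Strand")
    return failures


print(failed_checks(VariantCounts(depth=100, var_plus=3, var_minus=3)))
print(failed_checks(VariantCounts(depth=100, var_plus=6, var_minus=0)))
print(failed_checks(VariantCounts(depth=200, var_plus=2, var_minus=1)))
```

The second site fails only the strand check because all six variant reads come from one strand. The third has too few variant reads for the strand test to be attempted at all.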

DREAM-3 SETTINGS FOR FPFILTER
Please note the --dream3-settings parameter for fpfilter, which (if set to 1) will optimize the
false positive filter settings based on the fine-tuning we did for the TCGA-ICGC DREAM-3
SNV Challenge. See the "in silico 3" dataset described here:
https://www.synapse.org/#!Synapse:syn312572/wiki/62018
This dataset modeled 100% tumor purity, but three subclones at 50%, 33%, and 20% variant allele frequency.
Optimal VarScan settings were established as follows:
For SAMtools:
mpileup -B (disables BAQ)

For VarScan somatic:
    --min-coverage 3 --min-var-freq 0.08 --p-value 0.10 --somatic-p-value 0.05 --strand-filter 0

For VarScan fpfilter:
    --min-var-count = 3
    --min-var-count-lc = 1
    --min-strandedness = 0
    --min-var-basequal = 30
    --min-ref-readpos = 0.20
    --min-ref-dist3 = 0.20
    --min-var-readpos = 0.15
    --min-var-dist3 = 0.15
    --max-rl-diff = 0.05
    --max-mapqual-diff = 10
    --min-ref-mapqual = 20
    --min-var-mapqual = 30
    --max-var-mmqs = 100
    --max-ref-mmqs = 50
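
If you want to start from these values and adjust one of them, it can help to spell them out rather than rely on --dream3-settings 1. The sketch below stores the list above in a dictionary and turns it into an fpfilter argument list. The helper name and file names are placeholders, and whether explicit values behave identically to the master option has not been verified here.

```python
from typing import Dict, List

# Values copied from the DREAM-3 fpfilter list above.
DREAM3_FPFILTER: Dict[str, float] = {
    "min-var-count": 3,
    "min-var-count-lc": 1,
    "min-strandedness": 0,
    "min-var-basequal": 30,
    "min-ref-readpos": 0.20,
    "min-ref-dist3": 0.20,
    "min-var-readpos": 0.15,
    "min-var-dist3": 0.15,
    "max-rl-diff": 0.05,
    "max-mapqual-diff": 10,
    "min-ref-mapqual": 20,
    "min-var-mapqual": 30,
    "max-var-mmqs": 100,
    "max-ref-mmqs": 50,
}


def fpfilter_command(variant_file: str, readcount_file: str,
                     settings: Dict[str, float], output_file: str,
                     jar: str = "VarScan.jar") -> List[str]:
    """Build a VarScan fpfilter call with explicit filtering parameters."""
    cmd = ["java", "-jar", jar, "fpfilter", variant_file, readcount_file,
           "--output-file", output_file]
    for name, value in settings.items():
        cmd += [f"--{name}", str(value)]
    return cmd


# Start from the DREAM-3 values but require one more variant-supporting read.
custom = {**DREAM3_FPFILTER, "min-var-count": 4}
cmd = fpfilter_command("somatic.snp", "tumor.readcounts", custom,
                       "somatic.snp.fpfilter")
print(" ".join(cmd))
```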

CITING VARSCAN
If you use VarScan, please note the version number and cite this publication along with the
version-appropriate URL:

Koboldt DC, Zhang Q, Larson DE, Shen D, McLellan MD, Lin L, Miller CA, Mardis ER, Ding L, Wilson RK.
VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing.
Genome Res. 2012 Mar;22(3):568-76. doi: 10.1101/gr.129684.111.

https://github.com/dkoboldt/varscan (v2.4.0 and beyond)
or
http://varscan.sourceforge.net (v2.3.9 and before)

VarScan v2.4.1

23 Oct 19:37

RELEASE NOTES FOR VARSCAN V2.4.1

23-Oct-2015

VarScan v2.4.1 is the current release
VarScan v2.4.0 was the first release to VarScan's new home at GitHub, http://dkoboldt.github.io/varscan/
VarScan v2.3.9 and prior releases will persist on SourceForge, as will the support forums and other resources.

LICENSE
VarScan 2 is free for non-commercial use by academic, government, and non-profit/not-for-profit institutions.
A commercial version of the software is available, and licensed through the Office of Technology Management at
Washington University School of Medicine. For more information, please contact:

Paul Carter, Business Development Director
[email protected]
+1 314-362-5426

VERSION 2.4.1 CHANGES
Major changes include:

1.) Correction of a bug in v2.4.0 that caused some somatic mutation calling oddities, especially for indels.
This issue was introduced in v2.4.0 (minor fix #1) in an attempt to more accurately count variant-supporting
reads in the normal sample during somatic mutation calling. That fix is now rolled back.

2.) Expansion and improvement of the fpfilter command. It now includes many more parameters for fine-tuning
the filter, and a master option (--dream3-settings 1) that will optimize parameters based on the DREAM-3
mutation calling challenge (see below).

3.) The fpfilter command now outputs the average trimmed read length of reference- and variant-supporting
reads, so there will be two new columns appended to the tab-delimited output file.

REMINDER: PLEASE USE THE FALSE POSITIVE FILTER
The scientific basis of this filter is described in the VarScan 2 publication. It will improve
the precision of variant and mutation calling by removing artifacts associated with short-read alignment.
-For somatic mutations, generate bam-readcounts with the Tumor BAM. For LOH and Germline, generate readcounts with the Normal BAM.
-For de novo mutations (trio calling), generate readcounts with the child BAM.
The filter requires the bam-readcount utility: https://github.com/genome/bam-readcount

USAGE: java -jar VarScan.jar fpfilter [variant file] [readcount file] OPTIONS
variant file - A file of SNPs or indels in VarScan-native or VCF format
readcount file - The output file from bam-readcount for those positions
_For detailed filtering instructions, please visit http://varscan.sourceforge.net_

OPTIONS:
--output-file       Optional output file for filter-pass variants
--filtered-file     Optional output file for filter-fail variants
--dream3-settings   If set to 1, optimizes filter parameters based on TCGA-ICGC DREAM-3 SNV Challenge results
--keep-failures     If set to 1, includes failures in the output file

FILTERING PARAMETERS:
--min-var-count     Minimum number of variant-supporting reads [4]
--min-var-count-lc  Minimum number of variant-supporting reads when depth below somaticPdepth [2]
--min-var-freq      Minimum variant allele frequency [0.05]
--max-somatic-p     Maximum somatic p-value [0.05]
--max-somatic-p-depth   Depth required to test max somatic p-value [10]
--min-ref-readpos   Minimum average read position of ref-supporting reads [0.1]
--min-var-readpos   Minimum average read position of var-supporting reads [0.1]
--min-ref-dist3     Minimum average distance to effective 3' end (ref) [0.1]
--min-var-dist3     Minimum average distance to effective 3' end (var) [0.1]
--min-strandedness  Minimum fraction of variant reads from each strand [0.01]
--min-strand-reads  Minimum allele depth required to perform the strand tests [5]
--min-ref-basequal  Minimum average base quality for ref allele [15]
--min-var-basequal  Minimum average base quality for var allele [15]
--min-ref-avgrl     Minimum average trimmed read length for ref allele [90]
--min-var-avgrl     Minimum average trimmed read length for var allele [90]
--max-rl-diff       Maximum average relative read length difference (ref - var) [0.25]
--max-ref-mmqs      Maximum mismatch quality sum of reference-supporting reads [100]
--max-var-mmqs      Maximum mismatch quality sum of variant-supporting reads [100]
--max-mmqs-diff     Maximum average mismatch quality sum (var - ref) [50]
--min-ref-mapqual   Minimum average mapping quality for ref allele [15]
--min-var-mapqual   Minimum average mapping quality for var allele [15]
--max-mapqual-diff  Maximum average mapping quality (ref - var) [50]

DREAM-3 SETTINGS FOR FPFILTER
Please note the --dream3-settings parameter for fpfilter, which (if set to 1) will optimize the
false positive filter settings based on the fine-tuning we did for the TCGA-ICGC DREAM-3
SNV Challenge. See the "in silico 3" dataset described here:
https://www.synapse.org/#!Synapse:syn312572/wiki/62018
This dataset modeled 100% tumor purity, but three subclones at 50%, 33%, and 20% variant allele frequency.
Optimal VarScan settings were established as follows:
For SAMtools:
mpileup -B (disables BAQ)

For VarScan somatic:
    --min-coverage 3 --min-var-freq 0.08 --p-value 0.10 --somatic-p-value 0.05 --strand-filter 0

For VarScan fpfilter:
    --min-var-count = 3
    --min-var-count-lc = 1
    --min-strandedness = 0
    --min-var-basequal = 30
    --min-ref-readpos = 0.20
    --min-ref-dist3 = 0.20
    --min-var-readpos = 0.15
    --min-var-dist3 = 0.15
    --max-rl-diff = 0.05
    --max-mapqual-diff = 10
    --min-ref-mapqual = 20
    --min-var-mapqual = 30
    --max-var-mmqs = 100
    --max-ref-mmqs = 50

VarScan v2.4.0

14 Sep 16:47

20-Aug-2015

VarScan v2.4.0 is the first release to VarScan's new home at GitHub, http://dkoboldt.github.io/varscan/

VERSION 2.4.0 CHANGES
The major change in v2.4.0 is the implementation of a SmartFileReader class, which addresses a known bug
in the Java runtime that could cause VarScan to hang if given an empty input file. Hat tip to Bina Technologies
for contributing this code.
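
Pipelines that still run an older release could guard against this themselves by checking the input before launching VarScan. The sketch below is a generic wrapper-side check in Python, not the SmartFileReader class. The function name is invented, and it only applies to input passed as a file path, not to data piped on stdin.

```python
import os
import tempfile


def is_usable_input(path: str) -> bool:
    """Return True only for an existing, non-empty regular file."""
    return os.path.isfile(path) and os.path.getsize(path) > 0


# Demonstrate with one empty and one non-empty temporary file.
with tempfile.TemporaryDirectory() as tmp:
    empty = os.path.join(tmp, "empty.mpileup")
    filled = os.path.join(tmp, "filled.mpileup")
    open(empty, "w").close()
    with open(filled, "w") as handle:
        handle.write("chr1\t100\tA\t10\t..........\tIIIIIIIIII\n")
    results = (is_usable_input(empty), is_usable_input(filled))

print(results)
```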

Minor issues addressed in the current release include:
1.) A correction in the way normal_reads2 values are counted when the mutation allele is not observed. Prior
to this fix, a non-reference base would be counted as a variant allele even if it didn't match the actual mutation
allele called in the tumor. Now, only observations of the tumor variant allele will be counted and go into the FET.

2.) Improved parameter-handling logic for two flags (--validation and --strand-filter), which previously were
sometimes considered "turned on" if the user provided them, even if the value provided was a zero.

3.) Catching the rare ArrayIndexOutOfBoundsException errors thrown in the copynumber and trio functions when
VarScan encountered incomplete mpileup columns.

4.) Addressed a typo-bug in the tumor-purity parameter, which previously only had an effect if the user provided percentage
values (e.g. 15) rather than fractions (e.g. 0.15).
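
Items 2 through 4 all come down to defensive handling of parameters and input. The sketch below shows one way such checks might look. It is not taken from the VarScan source, the function names are invented, and the mpileup check assumes the default samtools layout of three site columns followed by three columns (depth, bases, qualities) per sample.

```python
from typing import Dict


def flag_enabled(params: Dict[str, str], name: str) -> bool:
    """Treat a 0/1 flag as on only if it was supplied with a non-zero value."""
    value = params.get(name)
    if value is None:
        return False
    try:
        return int(value) != 0
    except ValueError:
        return False


def normalize_purity(value: float) -> float:
    """Accept purity as a fraction (0.15) or a percentage (15)."""
    if value <= 0 or value > 100:
        raise ValueError("purity must be in (0, 1] or (1, 100]")
    # Values above 1 are read as percentages; exactly 1 is read as 100%.
    return value / 100.0 if value > 1 else value


def mpileup_columns_ok(line: str, n_samples: int) -> bool:
    """Check for 3 site columns plus 3 columns for each sample."""
    fields = line.rstrip("\n").split("\t")
    return len(fields) == 3 + 3 * n_samples


params = {"strand-filter": "0", "validation": "1"}
print(flag_enabled(params, "strand-filter"), flag_enabled(params, "validation"))
print(normalize_purity(15), normalize_purity(0.15))
print(mpileup_columns_ok("chr1\t100\tA\t10\t....,,,,..\tIIIIIIIIII\t5\t..T..", 2))
```

The last call reports an incomplete line: the second sample is missing its quality column, which is the kind of input that should be skipped rather than indexed into.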