Output

#Output of the classify command

Introduction

All subcommands of classify produce a file in the same standard format: a tab-delimited file whose first line is a header naming each column.

The file will look like the following:

chrom	position	ref_base	var_base	normal_counts_a	normal_counts_b	tumour_counts_a	tumour_counts_b	p_AA_AA	p_AA_AB	p_AA_BB	p_AB_AA	p_AB_AB	p_AB_BB	p_BB_AA	p_BB_AB	p_BB_BB
1	1299268	T	C	26	25	3	17	0.0000	0.0000	0.0000	0.0000	0.0000	1.0000	0.0000	0.0000	0.0000

The last nine columns of the file list the posterior probability of each of the joint genotypes. They have the form p_gN_gT, where gN is the normal genotype and gT is the tumour genotype. For deterministic methods, exactly one of these columns will be non-zero, and it will have a value of 1.

The rows of the file correspond to genomic positions. The columns are as follows:

chrom - Chromosome the site is on.
position - 1-based position on the chromosome.
ref_base - Base found in reference genome at this position.
var_base - Variant base found at this position. If no variant base is found this will be N.
normal_counts_a - Number of reads matching ref_base in the normal at this position.
normal_counts_b - Number of reads matching var_base in the normal at this position.
tumour_counts_a - Number of reads matching ref_base in the tumour at this position.
tumour_counts_b - Number of reads matching var_base in the tumour at this position.
p_AA_AA - Probability of joint genotype AA_AA
p_AA_AB - Probability of joint genotype AA_AB
p_AA_BB - Probability of joint genotype AA_BB
p_AB_AA - Probability of joint genotype AB_AA
p_AB_AB - Probability of joint genotype AB_AB
p_AB_BB - Probability of joint genotype AB_BB
p_BB_AA - Probability of joint genotype BB_AA
p_BB_AB - Probability of joint genotype BB_AB
p_BB_BB - Probability of joint genotype BB_BB

To extract somatic positions from this file, I suggest summing p_AA_AB and p_AA_BB to get the somatic genotype probability. You can then threshold this value at whatever level is appropriate.
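
A minimal sketch of that calculation, assuming each row has already been parsed into a dictionary keyed by the column names above. The function names and the default threshold of 0.9 are illustrative assumptions, not part of the tool; pick a threshold that suits your data.

```python
def somatic_probability(row):
    """Sum the posteriors of the two somatic joint genotypes (AA_AB and AA_BB)."""
    return float(row["p_AA_AB"]) + float(row["p_AA_BB"])


def is_somatic(row, threshold=0.9):
    """Return True when the somatic probability reaches the threshold.

    The default of 0.9 is an arbitrary placeholder, not a recommended value.
    """
    return somatic_probability(row) >= threshold


# Values as they would appear in a parsed row (strings, as read from the file).
example_row = {"p_AA_AB": "0.2500", "p_AA_BB": "0.5000"}
print(somatic_probability(example_row))  # 0.75
print(is_somatic(example_row))           # False at the placeholder threshold
```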

This file format can easily be manipulated using Python and the csv module, which is part of the standard library. The csv.DictReader class is especially useful, since it returns each row as a dictionary keyed by the column names in the header.
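
As a rough illustration, the sketch below reads the format with csv.DictReader and reports the most probable joint genotype at each position. The function name is hypothetical, and the in-memory sample simply reproduces the example row shown earlier so the snippet runs without a real output file.

```python
import csv
import io

GENOTYPES = ("AA", "AB", "BB")
# The nine posterior columns, in the same order as the file header.
PROB_COLUMNS = ["p_{0}_{1}".format(n, t) for n in GENOTYPES for t in GENOTYPES]


def most_probable_genotypes(handle):
    """Yield (chrom, position, best_column, probability) for each row."""
    for row in csv.DictReader(handle, delimiter="\t"):
        # DictReader returns strings, so convert before comparing.
        best = max(PROB_COLUMNS, key=lambda col: float(row[col]))
        yield row["chrom"], int(row["position"]), best, float(row[best])


# In-memory stand-in for an output file, built from the sample row above.
header = ["chrom", "position", "ref_base", "var_base",
          "normal_counts_a", "normal_counts_b",
          "tumour_counts_a", "tumour_counts_b"] + PROB_COLUMNS
values = ["1", "1299268", "T", "C", "26", "25", "3", "17",
          "0.0000", "0.0000", "0.0000", "0.0000", "0.0000",
          "1.0000", "0.0000", "0.0000", "0.0000"]
sample = "\t".join(header) + "\n" + "\t".join(values) + "\n"

for record in most_probable_genotypes(io.StringIO(sample)):
    print(record)  # ('1', 1299268, 'p_AB_BB', 1.0)
```

With a real file, pass an open file handle instead of the io.StringIO object; the csv documentation recommends opening the file with newline="".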
