You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I tried to change the option -q in ivar variants to values below the default 20 and discovered a problem with calling deletions once the q-score threshold is set to a value below 10.
My data is SARS-CoV2 and the deletion in question is the del69/70 in spike at position 21764.
With this command I tried all values for -q from 0 to 20:
for i in `seq 0 20`;
do
samtools mpileup -A -d 0 --reference nCoV-2019.reference.fasta -B -Q 0 input.bam | \
ivar variants -r nCoV-2019.reference.fasta -m 20 -p q$i.variants -q $i -t 0.25;
done
The output for position 21764 is:
grep 21764 q*.variants.tsv | column -t -s$'\t'
q0.variants.tsv:MN908947.3 21764 A -TACATG 4079 4070 33 4058 0 0 0.990239 4098 1 FALSE NA NA NA NA NA
q1.variants.tsv:MN908947.3 21764 A -TACATG 4070 4061 34 4058 0 1 0.990239 4098 1 FALSE NA NA NA NA NA
q2.variants.tsv:MN908947.3 21764 A -TACATG 4070 4061 34 4058 0 2 0.990239 4098 1 FALSE NA NA NA NA NA
q3.variants.tsv:MN908947.3 21764 A -TACATG 4070 4061 34 4058 0 3 0.990239 4098 1 FALSE NA NA NA NA NA
q4.variants.tsv:MN908947.3 21764 A -TACATG 4070 4061 34 4058 0 4 0.990239 4098 1 FALSE NA NA NA NA NA
q5.variants.tsv:MN908947.3 21764 A -TACATG 4070 4061 34 4058 0 5 0.990239 4098 1 FALSE NA NA NA NA NA
q6.variants.tsv:MN908947.3 21764 A -TACATG 4070 4061 34 4058 0 6 0.990239 4098 1 FALSE NA NA NA NA NA
q7.variants.tsv:MN908947.3 21764 A -TACATG 4070 4061 34 4058 0 7 0.990239 4098 1 FALSE NA NA NA NA NA
q8.variants.tsv:MN908947.3 21764 A -TACATG 4070 4061 34 4058 0 8 0.990239 4098 1 FALSE NA NA NA NA NA
q9.variants.tsv:MN908947.3 21764 A -TACATG 4070 4061 34 4058 0 9 0.990239 4098 1 FALSE NA NA NA NA NA
q10.variants.tsv:MN908947.3 21764 A -TACATG 4070 4061 34 4058 0 10 0.990239 4098 0 TRUE NA NA NA NA NA
q11.variants.tsv:MN908947.3 21764 A -TACATG 4070 4061 34 4058 0 11 0.990239 4098 0 TRUE NA NA NA NA NA
q12.variants.tsv:MN908947.3 21764 A -TACATG 4070 4061 34 4058 0 12 0.990239 4098 0 TRUE NA NA NA NA NA
q13.variants.tsv:MN908947.3 21764 A -TACATG 4070 4061 34 4058 0 13 0.990239 4098 0 TRUE NA NA NA NA NA
q14.variants.tsv:MN908947.3 21764 A -TACATG 4070 4061 34 4058 0 14 0.990239 4098 0 TRUE NA NA NA NA NA
q15.variants.tsv:MN908947.3 21764 A -TACATG 3891 3882 34 4058 0 15 0.990239 4098 0 TRUE NA NA NA NA NA
q16.variants.tsv:MN908947.3 21764 A -TACATG 3891 3882 34 4058 0 16 0.990239 4098 0 TRUE NA NA NA NA NA
q17.variants.tsv:MN908947.3 21764 A -TACATG 3891 3882 34 4058 0 17 0.990239 4098 0 TRUE NA NA NA NA NA
q18.variants.tsv:MN908947.3 21764 A -TACATG 3891 3882 34 4058 0 18 0.990239 4098 0 TRUE NA NA NA NA NA
q19.variants.tsv:MN908947.3 21764 A -TACATG 3891 3882 34 4058 0 19 0.990239 4098 0 TRUE NA NA NA NA NA
q20.variants.tsv:MN908947.3 21764 A -TACATG 3891 3882 34 4058 0 20 0.990239 4098 0 TRUE NA NA NA NA NA
For -q 10 and above, the variant is found with P-value = 0, and for -q 9 and below, it is not found with P-value = 1.
The main difference between the files is that the column ALT_QUAL takes on exactly the value set for -q, probably because of
Seeing the same issue with the 6p deletion GAGTTCA22028G:EFR156G in Delta:
When setting -q 10, the P-value is all good:
samtools mpileup -A -d 0 --reference nCoV-2019.reference.fasta -B -Q 0 mapped.primertrimmed.sorted.bam | \
grep -w 22028 | \
ivar variants -r nCoV-2019.reference.fasta -m 20 -p variants -q 10 -t 0.25 && column -t -s$'\t' variants.tsv
[mpileup] 1 samples in 1 input files
[mpileup] Max depth set to maximum value (2147483647)
A GFF file containing the open reading frames (ORFs) has not been provided. Amino acid translation will not be done.
REGION POS REF ALT REF_DP REF_RV REF_QUAL ALT_DP ALT_RV ALT_QUAL ALT_FREQ TOTAL_DP PVAL PASS GFF_FEATURE REF_CODON REF_AA ALT_CODON ALT_AA
MN908947.3 22028 G -AGTTCA 25 25 34 25 0 10 1 25 0.000322779 TRUE NA NA NA NA NA
When setting -q 0, the P-value is set to 1:
samtools mpileup -A -d 0 --reference nCoV-2019.reference.fasta -B -Q 0 mapped.primertrimmed.sorted.bam | \
grep -w 22028 | \
ivar variants -r nCoV-2019.reference.fasta -m 20 -p variants -q 0 -t 0.25 && column -t -s$'\t' variants.tsv
[mpileup] 1 samples in 1 input files
[mpileup] Max depth set to maximum value (2147483647)
A GFF file containing the open reading frames (ORFs) has not been provided. Amino acid translation will not be done.
REGION POS REF ALT REF_DP REF_RV REF_QUAL ALT_DP ALT_RV ALT_QUAL ALT_FREQ TOTAL_DP PVAL PASS GFF_FEATURE REF_CODON REF_AA ALT_CODON ALT_AA
MN908947.3 22028 G -AGTTCA 25 25 34 25 0 0 1 25 1 FALSE NA NA NA NA NA
Dear @gkarthik ,
I tried to change the option
-q
inivar variants
to values below the default 20 and discovered a problem with calling deletions once the q-score threshold is set to a value below 10.My data is SARS-CoV2 and the deletion in question is the del69/70 in spike at position 21764.
With this command I tried all values for
-q
from 0 to 20:The output for position 21764 is:
For
-q 10
and above, the variant is found with P-value = 0, and for-q 9
and below, it is not found with P-value = 1.The main difference between the files is that the column ALT_QUAL takes on exactly the value set for
-q
, probably because ofivar/src/allele_functions.cpp
Line 90 in ce1490a
This value is then used here
ivar/src/call_variants.cpp
Line 138 in ce1490a
The text was updated successfully, but these errors were encountered: