Deletions

kseniakh edited this page Mar 10, 2017 · 1 revision

Deletion: a stretch of bases that is present in the reference sequence but absent from the query sequence.



Figure 1: Deletion example



If a deletion has caused alignment fragmentation, it is reported in the query_struct.gff and ref_struct.gff files; otherwise it is reported in the query_snps.gff and ref_snps.gff files.

An example of deletion entries in query_snps.gff:

##gff-version 3
##sequence-region	query_1	1	13000
query_1	NucDiff_v2.0	SO:0000159	500	500	.	.	.	ID=SNP_1;Name=deletion;del_len=5;query_dir=1;ref_sequence=ref_1;ref_coord=501-505;query_bases=-;ref_bases=tctcg;color=#0000EE
query_1	NucDiff_v2.0	SO:0000159	1499	1499	.	.	.	ID=SNP_2;Name=deletion;del_len=20;query_dir=1;ref_sequence=ref_1;ref_coord=1505-1524;query_bases=-;ref_bases=tcttgattaactgttagata;color=#0000EE
query_1	NucDiff_v2.0	SO:0000159	2500	2500	.	.	.	ID=SNP_3;Name=deletion;del_len=50;query_dir=1;ref_sequence=ref_1;ref_coord=2526-2575;query_bases=-;ref_bases=agatcagacctacgggaccaactattggatcagccgcgagaattagttag;color=#0000EE
query_1	NucDiff_v2.0	SO:0000159	3500	3500	.	.	.	ID=SNP_4;Name=deletion;del_len=65;query_dir=1;ref_sequence=ref_1;ref_coord=3576-3640;query_bases=-;ref_bases=cgactgtgttgaatagtgtagttgtagataactgagcacaatgtatggtctaatttttacgtgaa;color=#0000EE



The query_snps.gff file contains the following information (see Figure 1 for notations):

| GFF3 fields | Content | Notes |
|---|---|---|
| col 1 | Query_seq | |
| col 2 | NucDiff_v2.0 | name and current version of the tool |
| col 3 | SO:0000159 | Sequence Ontology accession number corresponding to the "deletion" SO term |
| col 4 | Q_pos | |
| col 5 | Q_pos | |
| col 6/col 7/col 8 | . | score/strand/phase fields are not used |
| col 9, ID | "SNP_1" | ID in query_snps.gff is equal to ID in ref_snps.gff |
| col 9, Name | "deletion" | |
| col 9, del_len | Length(Deletion) | |
| col 9, query_dir | "1" or "-1" | -1 if the deleted fragment should be reverse complemented before its insertion to a Query_seq |
| col 9, ref_sequence | Ref_seq | |
| col 9, ref_coord | Del_st - Del_end | |
| col 9, query_bases | "-" | |
| col 9, ref_bases | ATGC's | the subsequence is reverse complemented if the query_dir value is equal to -1 |
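
The field layout in the table above can be read programmatically. The following is a minimal sketch, not part of NucDiff itself: the helper name `parse_gff_record` and the dictionary keys are hypothetical, and the code assumes the records are tab-separated with `key=value` pairs joined by `;` in column 9, as in the example entries.

```python
def parse_gff_record(line):
    """Split one tab-separated GFF3 record into typed columns plus an attribute dict."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    # Column 9 holds ';'-separated key=value pairs (ID, Name, del_len, ...).
    attrs = dict(pair.split("=", 1) for pair in cols[8].split(";") if pair)
    return {
        "seqid": cols[0],
        "source": cols[1],
        "type": cols[2],
        "start": int(cols[3]),
        "end": int(cols[4]),
        "attrs": attrs,
    }


# First record of the query_snps.gff example above.
line = (
    "query_1\tNucDiff_v2.0\tSO:0000159\t500\t500\t.\t.\t.\t"
    "ID=SNP_1;Name=deletion;del_len=5;query_dir=1;ref_sequence=ref_1;"
    "ref_coord=501-505;query_bases=-;ref_bases=tctcg;color=#0000EE"
)
record = parse_gff_record(line)
# ref_coord is written as "Del_st-Del_end"; split it into two integers.
del_st, del_end = (int(x) for x in record["attrs"]["ref_coord"].split("-"))
print(record["seqid"], record["start"], del_st, del_end)
```

In query_snps.gff, columns 4 and 5 both hold Q_pos, so `record["start"]` and `record["end"]` are equal, while the extent of the deletion on the reference comes from `ref_coord`.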



An example of deletion entries in ref_snps.gff:

##gff-version 3
##sequence-region	ref_1	1	15063
ref_1	NucDiff_v2.0	SO:0000159	501	505	.	.	.	ID=SNP_1;Name=deletion;del_len=5;query_dir=1;query_sequence=query_1;query_coord=500;query_bases=-;ref_bases=tctcg;color=#0000EE
ref_1	NucDiff_v2.0	SO:0000159	1505	1524	.	.	.	ID=SNP_2;Name=deletion;del_len=20;query_dir=1;query_sequence=query_1;query_coord=1499;query_bases=-;ref_bases=tcttgattaactgttagata;color=#0000EE
ref_1	NucDiff_v2.0	SO:0000159	2526	2575	.	.	.	ID=SNP_3;Name=deletion;del_len=50;query_dir=1;query_sequence=query_1;query_coord=2500;query_bases=-;ref_bases=agatcagacctacgggaccaactattggatcagccgcgagaattagttag;color=#0000EE
ref_1	NucDiff_v2.0	SO:0000159	3576	3640	.	.	.	ID=SNP_4;Name=deletion;del_len=65;query_dir=1;query_sequence=query_1;query_coord=3500;query_bases=-;ref_bases=cgactgtgttgaatagtgtagttgtagataactgagcacaatgtatggtctaatttttacgtgaa;color=#0000EE



The ref_snps.gff file contains the following information (see Figure 1 for notations):

| GFF3 fields | Content | Notes |
|---|---|---|
| col 1 | Ref_seq | |
| col 2 | NucDiff_v2.0 | name and current version of the tool |
| col 3 | SO:0000159 | Sequence Ontology accession number corresponding to the "deletion" SO term |
| col 4 | Del_st | |
| col 5 | Del_end | |
| col 6/col 7/col 8 | . | score/strand/phase fields are not used |
| col 9, ID | "SNP_1" | ID in ref_snps.gff is equal to ID in query_snps.gff |
| col 9, Name | "deletion" | |
| col 9, del_len | Length(Deletion) | |
| col 9, query_dir | "1" or "-1" | -1 if the deleted fragment should be reverse complemented before its insertion to a Query_seq |
| col 9, query_sequence | Query_seq | |
| col 9, query_coord | Q_pos | |
| col 9, query_bases | "-" | |
| col 9, ref_bases | ATGC's | |
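
The fields of a ref_snps.gff deletion record are redundant in a useful way: in the example entries, the span of columns 4 and 5, the `del_len` attribute, and the length of `ref_bases` all agree. A minimal sanity-check sketch follows; the function name `check_ref_deletion` is hypothetical, and treating these three values as always equal is an assumption drawn from the examples, not a documented guarantee.

```python
def check_ref_deletion(line):
    """Return a list of inconsistencies found in one ref_snps.gff deletion record."""
    cols = line.rstrip("\n").split("\t")
    start, end = int(cols[3]), int(cols[4])
    attrs = dict(pair.split("=", 1) for pair in cols[8].split(";") if pair)
    del_len = int(attrs["del_len"])
    problems = []
    # Columns 4 and 5 are Del_st and Del_end, both inclusive.
    if end - start + 1 != del_len:
        problems.append("col 4/col 5 span differs from del_len")
    if len(attrs["ref_bases"]) != del_len:
        problems.append("ref_bases length differs from del_len")
    if attrs["query_bases"] != "-":
        problems.append("query_bases is not '-'")
    return problems


# First record of the ref_snps.gff example above.
good = (
    "ref_1\tNucDiff_v2.0\tSO:0000159\t501\t505\t.\t.\t.\t"
    "ID=SNP_1;Name=deletion;del_len=5;query_dir=1;query_sequence=query_1;"
    "query_coord=500;query_bases=-;ref_bases=tctcg;color=#0000EE"
)
# The same record with a deliberately wrong end coordinate.
bad = good.replace("\t505\t", "\t506\t")
good_problems = check_ref_deletion(good)
bad_problems = check_ref_deletion(bad)
print(good_problems, bad_problems)
```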



An example of deletion entries in query_struct.gff:

##gff-version 3
##sequence-region	query_1	1	13000
query_1	NucDiff_v2.0	SO:0000159	5500	5500	.	.	.	ID=SV_1;Name=deletion;del_len=88;query_dir=1;ref_sequence=ref_1;ref_coord=5726-5813;color=#0000EE
query_1	NucDiff_v2.0	SO:0000159	6500	6500	.	.	.	ID=SV_2;Name=deletion;del_len=100;query_dir=1;ref_sequence=ref_1;ref_coord=6814-6913;color=#0000EE
query_1	NucDiff_v2.0	SO:0000159	7500	7500	.	.	.	ID=SV_3;Name=deletion;del_len=150;query_dir=1;ref_sequence=ref_1;ref_coord=7914-8063;color=#0000EE



The query_struct.gff file contains the following information (see Figure 1 for notations):

| GFF3 fields | Content | Notes |
|---|---|---|
| col 1 | Query_seq | |
| col 2 | NucDiff_v2.0 | name and current version of the tool |
| col 3 | SO:0000159 | Sequence Ontology accession number corresponding to the "deletion" SO term |
| col 4 | Q_pos | |
| col 5 | Q_pos | |
| col 6/col 7/col 8 | . | score/strand/phase fields are not used |
| col 9, ID | "SV_1" | ID in query_struct.gff is equal to ID in ref_struct.gff |
| col 9, Name | "deletion" | |
| col 9, del_len | Length(Deletion) | |
| col 9, query_dir | "1" or "-1" | -1 if the deleted fragment should be reverse complemented before its insertion to a Query_seq |
| col 9, ref_sequence | Ref_seq | |
| col 9, ref_coord | Del_st - Del_end | |
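
Unlike the query_snps.gff entries, the query_struct.gff entries carry no `ref_bases` attribute, so the deleted bases have to be looked up in the reference sequence. The sketch below shows one way to do that under stated assumptions: `ref_coord` is taken to be 1-based and inclusive (consistent with `del_len` in the examples), the function name `deleted_bases` and the toy reference string are hypothetical, and the reverse complement is applied when `query_dir` is -1, following the note in the table.

```python
# Translation table for complementing bases; case is preserved.
COMPLEMENT = str.maketrans("acgtnACGTN", "tgcanTGCAN")


def deleted_bases(ref_seq, ref_coord, query_dir):
    """Return the deleted fragment in the orientation of the query sequence."""
    start, end = (int(x) for x in ref_coord.split("-"))
    # Convert 1-based inclusive coordinates to a Python slice.
    fragment = ref_seq[start - 1:end]
    if int(query_dir) == -1:
        # Reverse complement: complement every base, then reverse the string.
        fragment = fragment.translate(COMPLEMENT)[::-1]
    return fragment


toy_ref = "acgtttgcaa"  # made-up sequence, for illustration only
forward = deleted_bases(toy_ref, "2-5", "1")
reverse = deleted_bases(toy_ref, "2-5", "-1")
print(forward, reverse)
```

In practice `ref_seq` would be the sequence named by the `ref_sequence` attribute, loaded from the reference FASTA file.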



An example of deletion entries in ref_struct.gff:

##gff-version 3
##sequence-region	ref_1	1	15063
ref_1	NucDiff_v2.0	SO:0000159	5726	5813	.	.	.	ID=SV_1;Name=deletion;del_len=88;query_dir=1;query_sequence=query_1;query_coord=5500;color=#0000EE
ref_1	NucDiff_v2.0	SO:0000159	6814	6913	.	.	.	ID=SV_2;Name=deletion;del_len=100;query_dir=1;query_sequence=query_1;query_coord=6500;color=#0000EE
ref_1	NucDiff_v2.0	SO:0000159	7914	8063	.	.	.	ID=SV_3;Name=deletion;del_len=150;query_dir=1;query_sequence=query_1;query_coord=7500;color=#0000EE



The ref_struct.gff file contains the following information (see Figure 1 for notations):

| GFF3 fields | Content | Notes |
|---|---|---|
| col 1 | Ref_seq | |
| col 2 | NucDiff_v2.0 | name and current version of the tool |
| col 3 | SO:0000159 | Sequence Ontology accession number corresponding to the "deletion" SO term |
| col 4 | Del_st | |
| col 5 | Del_end | |
| col 6/col 7/col 8 | . | score/strand/phase fields are not used |
| col 9, ID | "SV_1" | ID in ref_struct.gff is equal to ID in query_struct.gff |
| col 9, Name | "deletion" | |
| col 9, del_len | Length(Deletion) | |
| col 9, query_dir | "1" or "-1" | -1 if the deleted fragment should be reverse complemented before its insertion to a Query_seq |
| col 9, query_sequence | Query_seq | |
| col 9, query_coord | Q_pos | |
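
Because the same ID is used in the query-based and reference-based files, the two views of a deletion can be joined on that attribute. The following is a rough sketch of such a join using the first entries of the query_struct.gff and ref_struct.gff examples; the helper `index_by_id` and the tuple layout are hypothetical choices made for illustration.

```python
def index_by_id(lines):
    """Map each record's ID attribute to (seqid, start, end, attributes)."""
    records = {}
    for line in lines:
        # Skip blank lines and '##' header lines such as ##gff-version.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        attrs = dict(pair.split("=", 1) for pair in cols[8].split(";") if pair)
        records[attrs["ID"]] = (cols[0], int(cols[3]), int(cols[4]), attrs)
    return records


query_lines = [
    "##gff-version 3",
    "query_1\tNucDiff_v2.0\tSO:0000159\t5500\t5500\t.\t.\t.\t"
    "ID=SV_1;Name=deletion;del_len=88;query_dir=1;ref_sequence=ref_1;"
    "ref_coord=5726-5813;color=#0000EE",
]
ref_lines = [
    "##gff-version 3",
    "ref_1\tNucDiff_v2.0\tSO:0000159\t5726\t5813\t.\t.\t.\t"
    "ID=SV_1;Name=deletion;del_len=88;query_dir=1;query_sequence=query_1;"
    "query_coord=5500;color=#0000EE",
]

query_by_id = index_by_id(query_lines)
ref_by_id = index_by_id(ref_lines)

# Join the two files on the IDs they have in common.
pairs = {}
for sv_id in query_by_id.keys() & ref_by_id.keys():
    q_seq, q_pos, _, _ = query_by_id[sv_id]
    r_seq, del_st, del_end, _ = ref_by_id[sv_id]
    pairs[sv_id] = (q_seq, q_pos, r_seq, del_st, del_end)
print(pairs)
```

The same join applies to query_snps.gff and ref_snps.gff, whose IDs also match across the two files.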