
Unaligned beginnings

kseniakh edited this page Mar 10, 2017 · 1 revision


An unaligned beginning is a run of unaligned bases at the beginning of a query sequence.



Figure 1: Unaligned beginning example



An unaligned beginning is reported as a difference in both the query_struct.gff and ref_struct.gff files.

Example unaligned beginning entries in query_struct.gff:

##gff-version 3
##sequence-region	query_4	1	1065
query_4	NucDiff_v2.0	SO:0000667	1	498	.	.	.	ID=SV_1;Name=unaligned_beginning;ins_len=498;query_dir=1;ref_sequence=ref_1;ref_coord=47573;color=#EE0000
##sequence-region	query_5	1	1085
query_5	NucDiff_v2.0	SO:0000667	1	499	.	.	.	ID=SV_2;Name=unaligned_beginning;ins_len=499;query_dir=1;ref_sequence=ref_1;ref_coord=59639;color=#EE0000




The query_struct.gff file contains the following information (see Figure 1 for the notation):

| GFF3 fields | Content | Notes |
| --- | --- | --- |
| col 1 | Query_seq | |
| col 2 | NucDiff_v2.0 | name and current version of the tool |
| col 3 | SO:0000667 | Sequence Ontology accession number corresponding to the "insertion" SO term |
| col 4 | Ins_st | |
| col 5 | Ins_end | |
| col 6/col 7/col 8 | . | score/strand/phase fields are not used |
| col 9, ID | "SV_1" | ID in query_struct.gff is equal to ID in ref_struct.gff |
| col 9, Name | "unaligned_beginning" | |
| col 9, ins_len | Length(Unaligned_beginning) | |
| col 9, query_dir | "1" or "-1" | -1 if the inserted fragment should be reverse complemented before it is inserted into Ref_seq |
| col 9, ref_sequence | Ref_seq | |
| col 9, ref_coord | Ref_pos | |
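
The fields listed above can be read with a few lines of Python. The following is a minimal sketch, not part of NucDiff: the helper names `parse_gff3_line` and `unaligned_fragment` are hypothetical, and the code assumes tab-separated columns and 1-based inclusive coordinates, as in the example entries. It parses the first example entry and shows how `query_dir` could be applied when extracting the unaligned bases from a query sequence.

```python
# Hypothetical helpers for illustration; they are not part of NucDiff.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def parse_gff3_line(line):
    """Split one tab-separated GFF3 feature line into a dict."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    # Column 9 holds semicolon-separated key=value pairs.
    attrs = dict(pair.split("=", 1) for pair in cols[8].split(";") if pair)
    return {
        "seqid": cols[0],
        "source": cols[1],
        "type": cols[2],
        "start": int(cols[3]),  # Ins_st, assumed 1-based and inclusive
        "end": int(cols[4]),    # Ins_end, assumed 1-based and inclusive
        "attributes": attrs,
    }


def unaligned_fragment(query_seq, record):
    """Return the unaligned bases, reverse complemented when query_dir is -1."""
    fragment = query_seq[record["start"] - 1 : record["end"]]
    if record["attributes"]["query_dir"] == "-1":
        fragment = fragment.translate(_COMPLEMENT)[::-1]
    return fragment


# First example entry from query_struct.gff, copied from this page.
line = (
    "query_4\tNucDiff_v2.0\tSO:0000667\t1\t498\t.\t.\t.\t"
    "ID=SV_1;Name=unaligned_beginning;ins_len=498;query_dir=1;"
    "ref_sequence=ref_1;ref_coord=47573;color=#EE0000"
)
record = parse_gff3_line(line)

# In the example entries, ins_len matches the inclusive span of col 4 to col 5.
span = record["end"] - record["start"] + 1
print(record["attributes"]["Name"], span)
```

In the example entries, `ins_len` equals `Ins_end - Ins_st + 1`; the sketch relies on that observation rather than on a documented guarantee.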



Example unaligned beginning entries in ref_struct.gff:

##gff-version 3
##sequence-region	ref_1	1	158063
ref_1	NucDiff_v2.0	SO:0000667	47573	47573	.	.	.	ID=SV_1;Name=unaligned_beginning;ins_len=498;query_dir=1;query_sequence=query_4;query_coord=1-498;color=#EE0000
ref_1	NucDiff_v2.0	SO:0000667	59639	59639	.	.	.	ID=SV_2;Name=unaligned_beginning;ins_len=499;query_dir=1;query_sequence=query_5;query_coord=1-499;color=#EE0000



The ref_struct.gff file contains the following information (see Figure 1 for the notation):

| GFF3 fields | Content | Notes |
| --- | --- | --- |
| col 1 | Ref_seq | |
| col 2 | NucDiff_v2.0 | name and current version of the tool |
| col 3 | SO:0000667 | Sequence Ontology accession number corresponding to the "insertion" SO term |
| col 4 | Ref_pos | |
| col 5 | Ref_pos | |
| col 6/col 7/col 8 | . | score/strand/phase fields are not used |
| col 9, ID | "SV_1" | ID in ref_struct.gff is equal to ID in query_struct.gff |
| col 9, Name | "unaligned_beginning" | |
| col 9, ins_len | Length(Unaligned_beginning) | |
| col 9, query_dir | "1" or "-1" | -1 if the inserted fragment should be reverse complemented before it is inserted into Ref_seq |
| col 9, query_sequence | Query_seq | |
| col 9, query_coord | Ins_st-Ins_end | |
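
Because the same ID appears in both files, entries can be paired and cross-checked. The sketch below is an illustration under stated assumptions, not NucDiff code: the helpers `parse_gff3_line`, `index_by_id`, and `is_consistent` are hypothetical, and the input is the example entries from this page held as in-memory lists of lines. It pairs records by ID and checks that the coordinates on each side agree with the attributes on the other.

```python
# Hypothetical helpers for illustration; they are not part of NucDiff.
def parse_gff3_line(line):
    """Split one tab-separated GFF3 feature line into a dict."""
    cols = line.rstrip("\n").split("\t")
    attrs = dict(pair.split("=", 1) for pair in cols[8].split(";") if pair)
    return {"seqid": cols[0], "start": int(cols[3]), "end": int(cols[4]),
            "attributes": attrs}


def index_by_id(lines):
    """Map ID -> record for unaligned_beginning features, skipping # lines."""
    records = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        rec = parse_gff3_line(line)
        if rec["attributes"].get("Name") == "unaligned_beginning":
            records[rec["attributes"]["ID"]] = rec
    return records


def is_consistent(q, r):
    """Check that a query-side and a ref-side record describe the same event."""
    qa, ra = q["attributes"], r["attributes"]
    q_start, q_end = (int(x) for x in ra["query_coord"].split("-"))
    return (
        qa["ref_sequence"] == r["seqid"]
        and int(qa["ref_coord"]) == r["start"] == r["end"]  # col 4 == col 5
        and ra["query_sequence"] == q["seqid"]
        and (q_start, q_end) == (q["start"], q["end"])
        and qa["ins_len"] == ra["ins_len"]
        and qa["query_dir"] == ra["query_dir"]
    )


# Example entries copied from this page.
query_lines = [
    "##gff-version 3",
    "##sequence-region\tquery_4\t1\t1065",
    "query_4\tNucDiff_v2.0\tSO:0000667\t1\t498\t.\t.\t.\tID=SV_1;Name=unaligned_beginning;ins_len=498;query_dir=1;ref_sequence=ref_1;ref_coord=47573;color=#EE0000",
    "##sequence-region\tquery_5\t1\t1085",
    "query_5\tNucDiff_v2.0\tSO:0000667\t1\t499\t.\t.\t.\tID=SV_2;Name=unaligned_beginning;ins_len=499;query_dir=1;ref_sequence=ref_1;ref_coord=59639;color=#EE0000",
]
ref_lines = [
    "##gff-version 3",
    "##sequence-region\tref_1\t1\t158063",
    "ref_1\tNucDiff_v2.0\tSO:0000667\t47573\t47573\t.\t.\t.\tID=SV_1;Name=unaligned_beginning;ins_len=498;query_dir=1;query_sequence=query_4;query_coord=1-498;color=#EE0000",
    "ref_1\tNucDiff_v2.0\tSO:0000667\t59639\t59639\t.\t.\t.\tID=SV_2;Name=unaligned_beginning;ins_len=499;query_dir=1;query_sequence=query_5;query_coord=1-499;color=#EE0000",
]

q_idx = index_by_id(query_lines)
r_idx = index_by_id(ref_lines)
shared = sorted(q_idx.keys() & r_idx.keys())
checks = {sv: is_consistent(q_idx[sv], r_idx[sv]) for sv in shared}
print(checks)
```

For real output files, the two lists could be replaced by the lines read from query_struct.gff and ref_struct.gff. Pairing by ID alone assumes that IDs are unique within each file, which holds for the example entries.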