
query_blocks.gff

kseniakh edited this page Mar 10, 2017 · 2 revisions


query_blocks.gff is a GFF3 file that contains information about all relocated, translocated, inverted, and reshuffled query regions. It also contains information about all aligned query fragments.



An example of the query_blocks.gff file when the query sequence does not contain any structural differences:

##gff-version 3
##sequence-region	query_1	1	400
query_1	NucDiff_v2.0	SO:0000001	1	400	.	.	.	ID=Blk_1;Name=Block;blk_length=400;query_dir=1;ref_sequence=ref_1;ref_coord=1664-2063;color=#000000

In this case, the file describes a single aligned query fragment, which spans the whole query sequence.



The query_blocks.gff file contains the following information for each aligned region:

| GFF3 fields | Content | Notes |
|---|---|---|
| col 1 | Query_seq_name | |
| col 2 | NucDiff_v2.0 | name and current version of the tool |
| col 3 | SO:0000001 | Sequence Ontology accession number corresponding to the "region" SO term |
| col 4 | St_q | start of the aligned query fragment |
| col 5 | End_q | end of the aligned query fragment |
| col 6/col 7/col 8 | . | score/strand/phase fields are not used |
| col 9, ID | "Blk_1" | |
| col 9, Name | "Block" | |
| col 9, blk_length | Length(aligned_query_fragment) | |
| col 9, query_dir | "1" or "-1" | 1 if an aligned query fragment has the same direction as the corresponding reference fragment, otherwise -1 |
| col 9, ref_sequence | Ref_seq_name | |
| col 9, ref_coord | St_r - End_r | start and end of the corresponding reference fragment |
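
The column layout above can be read with a few lines of Python. This is only a minimal sketch, not part of NucDiff: the function name `parse_block_line` and the keys of the returned dictionary are illustrative assumptions, and the input is the record from the first example (without its `color` attribute). It also assumes that `blk_length` equals `End_q - St_q + 1`, which is what the examples on this page suggest.

```python
def parse_block_line(line):
    """Split one query_blocks.gff record into its nine tab-separated columns.

    Hypothetical helper for illustration; it is not a NucDiff function.
    """
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated GFF3 columns")

    # Column 9 holds 'key=value' pairs separated by ';'.
    attrs = dict(pair.split("=", 1) for pair in cols[8].split(";") if pair)

    record = {
        "query": cols[0],        # col 1: Query_seq_name
        "source": cols[1],       # col 2: tool name and version
        "type": cols[2],         # col 3: SO accession number
        "start": int(cols[3]),   # col 4: St_q
        "end": int(cols[4]),     # col 5: End_q
        "attributes": attrs,     # col 9: ID, Name, blk_length, ...
    }

    # ref_coord is only present for aligned, inverted and reshuffled blocks.
    if "ref_coord" in attrs:
        st_r, end_r = attrs["ref_coord"].split("-")
        record["ref_start"] = int(st_r)
        record["ref_end"] = int(end_r)
    return record


example = "\t".join([
    "query_1", "NucDiff_v2.0", "SO:0000001", "1", "400", ".", ".", ".",
    "ID=Blk_1;Name=Block;blk_length=400;query_dir=1;"
    "ref_sequence=ref_1;ref_coord=1664-2063",
])
block = parse_block_line(example)

# Assumption: blk_length is the inclusive length of the query interval.
length_matches = (
    int(block["attributes"]["blk_length"]) == block["end"] - block["start"] + 1
)
```

All attribute values are kept as strings in `block["attributes"]`; only the coordinates are converted to integers.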



If the query sequence has inversion and/or reshuffling differences, then information about inverted and/or reshuffled regions will be added to the file.

An example of the query_blocks.gff file when the query sequence has inversion and reshuffling differences:

##gff-version 3
##sequence-region	query_1	1	3500
query_1	NucDiff_v2.0	SO:0000001	1	500	.	.	.	ID=Blk_1;Name=Block;blk_length=500;query_dir=-1;ref_sequence=ref_1;ref_coord=1501-2000;color=#000000
query_1	NucDiff_v2.0	SO:0000001	1	500	.	.	.	ID=Blk_2;Name=Reshuffling-part_3_gr_0;blk_length=500;query_dir=-1;ref_sequence=ref_1;ref_coord=1501-2000;color=#04B404
query_1	NucDiff_v2.0	SO:0000001	501	1000	.	.	.	ID=Blk_5;Name=Block;blk_length=500;query_dir=-1;ref_sequence=ref_1;ref_coord=501-1000;color=#000000
query_1	NucDiff_v2.0	SO:0000001	501	1000	.	.	.	ID=Blk_6;Name=Inversion;blk_length=500;query_dir=-1;ref_sequence=ref_1;ref_coord=501-1000;color=#DF0101
query_1	NucDiff_v2.0	SO:0000001	501	1000	.	.	.	ID=Blk_7;Name=Reshuffling-part_1_gr_0;blk_length=500;query_dir=-1;ref_sequence=ref_1;ref_coord=501-1000;color=#04B404
...



The query_blocks.gff file contains the following information for each inverted/reshuffled region:

| GFF3 fields | Content | Notes |
|---|---|---|
| col 1 | Query_seq_name | |
| col 2 | NucDiff_v2.0 | name and current version of the tool |
| col 3 | SO:0000001 | Sequence Ontology accession number corresponding to the "region" SO term |
| col 4 | St_q | start of the inverted/reshuffled query region |
| col 5 | End_q | end of the inverted/reshuffled query region |
| col 6/col 7/col 8 | . | score/strand/phase fields are not used |
| col 9, ID | "Blk_1" | |
| col 9, Name | "Inversion" or "Reshuffling-part_1_gr_0" | part_X is an order number within the reshuffled region, gr_Y is an order number of the reshuffled region |
| col 9, blk_length | Length(inv/resh_query_region) | |
| col 9, query_dir | "1" or "-1" | 1 if an inverted/reshuffled query region has the same direction as the corresponding reference region, otherwise -1 |
| col 9, ref_sequence | Ref_seq_name | |
| col 9, ref_coord | St_r - End_r | start and end of the corresponding reference fragment |
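
The `part_X` and `gr_Y` numbers can be pulled out of the Name attribute with a regular expression. The sketch below is an illustration only: the function name `parse_reshuffling_name` is hypothetical, and the pattern assumes that Name values always follow the `Reshuffling-part_X_gr_Y` form shown in the example above.

```python
import re

# Pattern assumed from the example Name values, e.g. 'Reshuffling-part_3_gr_0'.
RESHUFFLING_NAME = re.compile(r"^Reshuffling-part_(\d+)_gr_(\d+)$")


def parse_reshuffling_name(name):
    """Return (part, group) for a reshuffling Name, or None for any other Name.

    Hypothetical helper for illustration; it is not a NucDiff function.
    """
    match = RESHUFFLING_NAME.match(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


part_and_group = parse_reshuffling_name("Reshuffling-part_3_gr_0")
not_reshuffling = parse_reshuffling_name("Inversion")
```

Records that share the same `gr_Y` value belong to the same reshuffled region, so the second element of the tuple is a natural grouping key.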



If the query sequence has translocation and/or relocation differences, then information about translocated and/or relocated regions will be added to the file.

An example of the query_blocks.gff file when the query sequence has relocation and translocation differences:

##gff-version 3
##sequence-region	query_1	1	3500
query_1	NucDiff_v2.0	SO:0000001	1	2500	.	.	.	ID=Blk_1;Name=Block;blk_length=2500;query_dir=-1;ref_sequence=ref_1;ref_coord=1501-4000;color=#000000
query_1	NucDiff_v2.0	SO:0000001	1	2500	.	.	.	ID=Blk_3;Name=Relocation_block;blk_length=2500;color=#01DFD7
query_1	NucDiff_v2.0	SO:0000001	1	3000	.	.	.	ID=Blk_4;Name=Translocation_block;blk_length=3000;color=#0404B4
query_1	NucDiff_v2.0	SO:0000001	2501	3000	.	.	.	ID=Blk_15;Name=Block;blk_length=500;query_dir=1;ref_sequence=ref_1;ref_coord=13501-14000;color=#000000
query_1	NucDiff_v2.0	SO:0000001	2501	3000	.	.	.	ID=Blk_16;Name=Relocation_block;blk_length=500;color=#01DFD7
query_1	NucDiff_v2.0	SO:0000001	3000	3500	.	.	.	ID=Blk_17;Name=Block;blk_length=501;query_dir=1;ref_sequence=ref_2;ref_coord=500-1000;color=#000000
query_1	NucDiff_v2.0	SO:0000001	3000	3500	.	.	.	ID=Blk_18;Name=Translocation_block;blk_length=501;color=#0404B4



The query_blocks.gff file contains the following information for each relocated/translocated region:

| GFF3 fields | Content | Notes |
|---|---|---|
| col 1 | Query_seq_name | |
| col 2 | NucDiff_v2.0 | name and current version of the tool |
| col 3 | SO:0000001 | Sequence Ontology accession number corresponding to the "region" SO term |
| col 4 | St_q | start of the relocated/translocated query region |
| col 5 | End_q | end of the relocated/translocated query region |
| col 6/col 7/col 8 | . | score/strand/phase fields are not used |
| col 9, ID | "Blk_1" | |
| col 9, Name | "Relocation_block" or "Translocation_block" | |
| col 9, blk_length | Length(relocated/translocated_query_region) | |
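
Because all record types share column 3, the Name attribute is what distinguishes aligned blocks from relocation and translocation blocks. The following sketch groups records by Name. It is not part of NucDiff: the function name `group_blocks_by_name` is an illustrative assumption, and the sample lines are abbreviated versions of the records in the example above (the `color` attribute is omitted).

```python
from collections import defaultdict


def group_blocks_by_name(lines):
    """Group query_blocks.gff records by the Name attribute in column 9.

    Hypothetical helper for illustration; it is not a NucDiff function.
    Returns a dict mapping Name -> list of (start, end, ID) tuples.
    """
    groups = defaultdict(list)
    for line in lines:
        # Skip blank lines and '##' directive lines such as '##gff-version 3'.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a complete GFF3 record
        attrs = dict(pair.split("=", 1) for pair in cols[8].split(";") if pair)
        groups[attrs["Name"]].append((int(cols[3]), int(cols[4]), attrs["ID"]))
    return dict(groups)


def make_line(start, end, attributes):
    """Build a tab-separated record in the layout described above."""
    return "\t".join(["query_1", "NucDiff_v2.0", "SO:0000001",
                      str(start), str(end), ".", ".", ".", attributes])


sample = [
    "##gff-version 3",
    "##sequence-region\tquery_1\t1\t3500",
    make_line(1, 2500, "ID=Blk_1;Name=Block;blk_length=2500;query_dir=-1;"
                       "ref_sequence=ref_1;ref_coord=1501-4000"),
    make_line(1, 2500, "ID=Blk_3;Name=Relocation_block;blk_length=2500"),
    make_line(1, 3000, "ID=Blk_4;Name=Translocation_block;blk_length=3000"),
]
groups = group_blocks_by_name(sample)
```

Relocation and translocation records carry only ID, Name and blk_length, so code that reads `ref_sequence` or `ref_coord` should first check that the record's Name is "Block", "Inversion" or a reshuffling name.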



IGV visualisation of the query_blocks.gff file:

Figure 1: IGV visualisation of the results output in the query_blocks.gff file