Circular genome start
Circular genome start is a special case that may appear in circular genomes when the start of the query sequence does not coincide with the start of the reference sequence, which causes the alignment to fragment. It is not treated as a difference, although it is included in the output.
Figure 1: Circular genome start breakpoint example.
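The fragmentation described above can be illustrated with a small sketch. It is not NucDiff's algorithm: it assumes an idealized, gapless alignment of the whole query onto a circular reference, and the function name `split_at_circular_start` and its parameters are hypothetical. Coordinates are 1-based and inclusive, as in the GFF examples on this page.

```python
def split_at_circular_start(ref_len, query_len, ref_start):
    """Split a query placed on a circular reference into linear blocks.

    ref_start is the reference position aligned to query base 1.
    Returns a list of ((query_start, query_end), (ref_start, ref_end)) pairs.
    Assumption: the alignment is gapless, so block lengths match exactly.
    """
    ref_end = ref_start + query_len - 1
    if ref_end <= ref_len:
        # The query does not cross the reference start: one block, no breakpoint.
        return [((1, query_len), (ref_start, ref_end))]

    # Number of query bases that fit before the reference end.
    first_len = ref_len - ref_start + 1
    return [
        # Block before the breakpoint: runs up to the last reference base.
        ((1, first_len), (ref_start, ref_len)),
        # Block after the breakpoint: restarts at reference base 1.
        ((first_len + 1, query_len), (1, query_len - first_len)),
    ]


# Values taken from the example entries on this page.
blocks = split_at_circular_start(ref_len=12900, query_len=1920, ref_start=11941)
```

With these inputs the two blocks correspond to the `blk_1_*` and `blk_2_*` coordinates reported in the query_struct.gff example below.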
A circular genome start breakpoint is output in the query_struct.gff and ref_struct.gff files. Information about the blocks before and after the breakpoint is also output in the ref_blocks.gff and query_blocks.gff files. Descriptions and examples of the last two files can be found on their wiki pages.
An example with the circular genome start breakpoint entries in query_struct.gff:
##gff-version 3
##sequence-region query_1 1 1920
query_1 NucDiff_v2.0 SO:0001874 961 961 . . . ID=SV_1;Name=circular_genome_start;ref_sequence=ref_1;blk_1_query=1-960;blk_1_ref=11941-12900;blk_1_query_len=960;blk_1_ref_len=960;blk_1_st_query=1;blk_1_st_ref=11941;blk_1_end_query=960;blk_1_end_ref=12900;blk_2_query=961-1920;blk_2_ref=1-960;blk_2_query_len=960;blk_2_ref_len=960;blk_2_st_query=961;blk_2_st_ref=1;blk_2_end_query=1920;blk_2_end_ref=960;color=#990099
The query_struct.gff file contains the following information (see Figure 1a for notations):
GFF3 fields | Content | Notes |
---|---|---|
col 1 | Query_seq | |
col 2 | NucDiff_v2.0 | name and current version of the tool |
col 3 | SO:0001874 | Sequence Ontology accession number corresponding to the "intrachromosomal_breakpoint" SO term |
col 4 | End_q_1 | |
col 5 | St_q_2 | |
col 6/col 7/col 8 | . | score/strand/phase fields are not used |
col 9, ID | "SV_1" | ID in query_struct.gff is related to ID in ref_struct.gff |
col 9, Name | "circular_genome_start" | |
col 9, ref_sequence | Ref_seq | |
col 9, blk_1_query | St_q_1 - End_q_1 | |
col 9, blk_1_ref | St_r_1 - Ref_end | |
col 9, blk_1_query_len | Length(B) | |
col 9, blk_1_ref_len | Length(B*) | |
col 9, blk_1_st_query | St_q_1 | |
col 9, blk_1_st_ref | St_r_1 | |
col 9, blk_1_end_query | End_q_1 | |
col 9, blk_1_end_ref | Ref_end | |
col 9, blk_2_query | St_q_2 - End_q_2 | |
col 9, blk_2_ref | Ref_st - End_r_2 | |
col 9, blk_2_query_len | Length(A) | |
col 9, blk_2_ref_len | Length(A*) | |
col 9, blk_2_st_query | St_q_2 | |
col 9, blk_2_st_ref | Ref_st | |
col 9, blk_2_end_query | End_q_2 | |
col 9, blk_2_end_ref | End_r_2 |
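As a rough sketch of how the fields in the table could be read programmatically, the snippet below parses the example query_struct.gff entry with plain Python. It assumes the file is tab-delimited, as the GFF3 format requires; the function name `parse_struct_line` and the layout of the returned dictionary are illustrative choices, not part of NucDiff.

```python
from urllib.parse import unquote


def parse_struct_line(line):
    """Parse one GFF3 feature line into a dict with typed coordinates."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated GFF3 columns")
    attributes = {}
    for pair in cols[8].split(";"):
        if pair:
            key, _, value = pair.partition("=")
            # GFF3 percent-encodes reserved characters in attribute values.
            attributes[key] = unquote(value)
    return {
        "seqid": cols[0],
        "source": cols[1],
        "type": cols[2],
        "start": int(cols[3]),
        "end": int(cols[4]),
        "attributes": attributes,
    }


# The example entry from this page, rebuilt with tab separators.
example_line = "\t".join([
    "query_1", "NucDiff_v2.0", "SO:0001874", "961", "961", ".", ".", ".",
    "ID=SV_1;Name=circular_genome_start;ref_sequence=ref_1;"
    "blk_1_query=1-960;blk_1_ref=11941-12900;blk_1_query_len=960;blk_1_ref_len=960;"
    "blk_1_st_query=1;blk_1_st_ref=11941;blk_1_end_query=960;blk_1_end_ref=12900;"
    "blk_2_query=961-1920;blk_2_ref=1-960;blk_2_query_len=960;blk_2_ref_len=960;"
    "blk_2_st_query=961;blk_2_st_ref=1;blk_2_end_query=1920;blk_2_end_ref=960;"
    "color=#990099",
])

record = parse_struct_line(example_line)
attrs = record["attributes"]

# Sanity check: each reported block length should match its coordinate range.
lengths_ok = True
for n in (1, 2):
    start, end = (int(x) for x in attrs[f"blk_{n}_query"].split("-"))
    if end - start + 1 != int(attrs[f"blk_{n}_query_len"]):
        lengths_ok = False
```

The length check at the end is only an example of the kind of consistency test that the redundant `blk_*` attributes make possible.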
An example with the circular genome start breakpoint entries in ref_struct.gff:
##gff-version 3
##sequence-region ref_1 1 12900
ref_1 NucDiff_v2.0 SO:0001874 1 1 . . . ID=SV_1.2;Name=circular_genome_start;query_sequence=query_1;query_coord=961;breakpoint_query=961-961;blk_query=961-1920;blk_ref=1-960;blk_query_len=960;blk_ref_len=960;color=#990099
ref_1 NucDiff_v2.0 SO:0001874 12900 12900 . . . ID=SV_1.1;Name=circular_genome_start;query_sequence=query_1;query_coord=960;breakpoint_query=961-961;blk_query=1-960;blk_ref=11941-12900;blk_query_len=960;blk_ref_len=960;color=#990099
The ref_struct.gff file contains the following information (see Figure 1a for notations):
GFF3 fields | Content for block 1 | Content for block 2 | Notes |
---|---|---|---|
col 1 | Ref_seq | Ref_seq | |
col 2 | NucDiff_v2.0 | NucDiff_v2.0 | name and current version of the tool |
col 3 | SO:0001874 | SO:0001874 | Sequence Ontology accession number corresponding to the "intrachromosomal_breakpoint" SO term |
col 4 | Ref_end | Ref_st | |
col 5 | Ref_end | Ref_st | |
col 6/col 7/col 8 | . | . | score/strand/phase fields are not used |
col 9, ID | "SV_1.1" | "SV_1.2" | ID in ref_struct.gff is related to ID in query_struct.gff |
col 9, Name | "circular_genome_start" | "circular_genome_start" | |
col 9, query_sequence | Query_seq | Query_seq | |
col 9, query_coord | End_q_1 | St_q_2 | a query_coord base corresponds to the reference base from col 4 |
col 9, breakpoint_query | End_q_1 - St_q_2 | End_q_1 - St_q_2 | |
col 9, blk_query | St_q_1 - End_q_1 | St_q_2 - End_q_2 | |
col 9, blk_ref | St_r_1 - Ref_end | Ref_st - End_r_2 | |
col 9, blk_query_len | Length(B) | Length(A) | |
col 9, blk_ref_len | Length(B*) | Length(A*) |
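The two ref_struct.gff entries can be tied back to the single query_struct.gff entry through their IDs. The sketch below assumes, based only on the examples on this page, that a reference-side ID such as `SV_1.1` is the query-side ID `SV_1` followed by a dot and a numeric suffix. The helper names and the dictionary layout are hypothetical.

```python
from collections import defaultdict


def parent_id(ref_id):
    """Strip a trailing '.<number>' suffix, e.g. 'SV_1.2' -> 'SV_1'."""
    head, sep, tail = ref_id.rpartition(".")
    return head if sep and tail.isdigit() else ref_id


def sub_index(ref_id):
    """Return the numeric suffix of an ID, or 0 if there is none."""
    tail = ref_id.rpartition(".")[2]
    return int(tail) if tail.isdigit() else 0


def group_ref_entries(entries):
    """Group ref_struct entries by query-side ID, ordered by suffix."""
    groups = defaultdict(list)
    for entry in entries:
        groups[parent_id(entry["ID"])].append(entry)
    return {
        key: sorted(members, key=lambda e: sub_index(e["ID"]))
        for key, members in groups.items()
    }


# Selected attributes from the two example ref_struct.gff entries above.
ref_entries = [
    {"ID": "SV_1.2", "start": 1, "query_coord": 961,
     "blk_query": "961-1920", "blk_ref": "1-960"},
    {"ID": "SV_1.1", "start": 12900, "query_coord": 960,
     "blk_query": "1-960", "blk_ref": "11941-12900"},
]

grouped = group_ref_entries(ref_entries)
```

Here both entries end up under the key `SV_1`, with the block that ends at the reference end (`SV_1.1`) listed before the block that starts at the reference start (`SV_1.2`).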