Translocations with inserted gap
Translocation - a group of inter-chromosomal structural rearrangements that occur when two regions located on different reference sequences are placed next to each other in the same query sequence.
Translocation with inserted gap - a translocation in which the two query fragments are separated by a stretch of unknown bases (N's). The inserted region is treated as an inserted gap difference.
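To make the coordinates of such a stretch concrete, here is a minimal Python sketch that reports runs of N's in a sequence as 1-based inclusive intervals, the convention GFF3 uses. It is not NucDiff's detection method; the function name `n_runs` and the toy sequence are made up for illustration.

```python
import re

def n_runs(seq):
    """Return 1-based inclusive (start, end) coordinates of every run of N/n in seq."""
    # m.start() is 0-based, so add 1; m.end() is exclusive, so it already
    # equals the 1-based inclusive end coordinate.
    return [(m.start() + 1, m.end()) for m in re.finditer(r"[Nn]+", seq)]

# Toy query: an 8-base fragment, 5 unknown bases, another 8-base fragment
query = "ACGT" * 2 + "N" * 5 + "TTGA" * 2
runs = n_runs(query)
print(runs)  # [(9, 13)]
```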
Figure 1: Translocation with inserted gap example. The reference coordinates End_r_1 and St_r_2, corresponding to the end of the query translocated block A and to the start of the query translocated block B (End_q_1 and St_q_2, respectively), coincide with the end of the reference translocated block A* and with the start of the reference translocated block B* (Trl_end_r_1 and Trl_st_r_2).
Figure 2: Translocation with inserted gap example. The reference coordinates End_r_1 and St_r_2, corresponding to the end of the query translocated block A and to the start of the query translocated block B (End_q_1 and St_q_2, respectively), do not coincide with the end of the reference translocated block A* and with the start of the reference translocated block B* (Trl_end_r_1 and Trl_st_r_2).
A translocation with inserted gap difference is reported in the query_struct.gff and ref_struct.gff files. Information about the translocated blocks is also reported in the ref_blocks.gff and query_blocks.gff files. Descriptions and examples of the latter two files can be found on their wiki pages.
Example of translocation with inserted gap entries in query_struct.gff:
##gff-version 3
##sequence-region query_1 1 1005
query_1 NucDiff_v2.0 SO:0001873 501 505 . . . ID=SV_1;Name=translocation-inserted_gap;ins_len=5;ref_sequence_1=ref_1;blk_1_query=1-500;blk_1_ref=501-1000;blk_1_query_len=500;blk_1_ref_len=500;blk_1_st_query=1;blk_1_st_ref=501;blk_1_end_query=500;blk_1_end_ref=1000;ref_sequence_2=ref_2;blk_2_query=506-1005;blk_2_ref=501-1000;blk_2_query_len=500;blk_2_ref_len=500;blk_2_st_query=506;blk_2_st_ref=501;blk_2_end_query=1005;blk_2_end_ref=1000;color=#A0A0A0
##sequence-region query_2 1 1020
query_2 NucDiff_v2.0 SO:0001873 501 520 . . . ID=SV_2;Name=translocation-inserted_gap;ins_len=20;ref_sequence_1=ref_1;blk_1_query=1-500;blk_1_ref=2001-2500;blk_1_query_len=500;blk_1_ref_len=500;blk_1_st_query=1;blk_1_st_ref=2001;blk_1_end_query=500;blk_1_end_ref=2500;ref_sequence_2=ref_2;blk_2_query=521-1020;blk_2_ref=2001-2500;blk_2_query_len=500;blk_2_ref_len=500;blk_2_st_query=521;blk_2_st_ref=2001;blk_2_end_query=1020;blk_2_end_ref=2500;color=#A0A0A0
The query_struct.gff file contains the following information (see Figure 1 for notation):
GFF3 fields | Content | Notes |
---|---|---|
col 1 | Query_seq | |
col 2 | NucDiff_v2.0 | name and current version of the tool |
col 3 | SO:0001873 | Sequence Ontology accession number corresponding to the "interchromosomal_breakpoint" SO term |
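col 4 | End_q_1 | in the example above this is the first base of the inserted gap (blk_1_end_query + 1) |
col 5 | St_q_2 | in the example above this is the last base of the inserted gap (blk_2_st_query - 1) |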
col 6/col 7/col 8 | . | score/strand/phase fields are not used |
col 9, ID | "SV_1" | ID in query_struct.gff is related to ID in ref_struct.gff |
col 9, Name | "translocation-inserted_gap" | |
col 9, ins_len | Length(Inserted_gap) | |
col 9, ref_sequence_1 | Ref_seq_1 | |
col 9, blk_1_query | St_q_1 - End_q_1 | |
col 9, blk_1_ref | Trl_st_r_1 - Trl_end_r_1 | |
col 9, blk_1_query_len | Length(A) | |
col 9, blk_1_ref_len | Length(A*) | |
col 9, blk_1_st_query | St_q_1 | |
col 9, blk_1_st_ref | St_r_1 | |
col 9, blk_1_end_query | End_q_1 | |
col 9, blk_1_end_ref | End_r_1 | |
col 9, ref_sequence_2 | Ref_seq_2 | |
col 9, blk_2_query | St_q_2 - End_q_2 | |
col 9, blk_2_ref | Trl_st_r_2 - Trl_end_r_2 | |
col 9, blk_2_query_len | Length(B) | |
col 9, blk_2_ref_len | Length(B*) | |
col 9, blk_2_st_query | St_q_2 | |
col 9, blk_2_st_ref | St_r_2 | |
col 9, blk_2_end_query | End_q_2 | |
col 9, blk_2_end_ref | End_r_2 |
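
The relationships between the fields above can be checked with a short script. The following is a minimal Python sketch, not part of NucDiff: the helper `parse_gff3_line` is hypothetical, and the input is assumed to be tab-delimited as the GFF3 format requires. It uses entry SV_1 from the example, shortened to the attributes it needs, and confirms that cols 4 and 5 span exactly `ins_len` bases lying between the two blocks.

```python
def parse_gff3_line(line):
    """Split one tab-delimited GFF3 feature line into fixed fields and col 9 attributes."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 columns, got {len(cols)}")
    # col 9 is a ';'-separated list of key=value pairs
    attrs = dict(pair.split("=", 1) for pair in cols[8].split(";") if pair)
    return {
        "seqid": cols[0],
        "source": cols[1],
        "type": cols[2],
        "start": int(cols[3]),
        "end": int(cols[4]),
        "attrs": attrs,
    }

# Entry SV_1 from the example above, shortened to the attributes used here
line = "\t".join([
    "query_1", "NucDiff_v2.0", "SO:0001873", "501", "505", ".", ".", ".",
    "ID=SV_1;Name=translocation-inserted_gap;ins_len=5;"
    "blk_1_end_query=500;blk_2_st_query=506",
])

rec = parse_gff3_line(line)
attrs = rec["attrs"]

# GFF3 intervals are 1-based and inclusive
gap_len = rec["end"] - rec["start"] + 1
gap_follows_block_1 = int(attrs["blk_1_end_query"]) + 1 == rec["start"]
gap_precedes_block_2 = int(attrs["blk_2_st_query"]) - 1 == rec["end"]

print(gap_len == int(attrs["ins_len"]), gap_follows_block_1, gap_precedes_block_2)
```

For the example entry all three checks hold: the gap occupies query bases 501-505, directly after block 1 (ending at 500) and directly before block 2 (starting at 506).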
Example of translocation with inserted gap entries in ref_struct.gff:
##gff-version 3
##sequence-region ref_1 1 19500
ref_1 NucDiff_v2.0 SO:0001873 1000 1000 . . . ID=SV_1.1;Name=translocation-inserted_gap;ins_len=5;query_sequence=query_1;query_coord=500;breakpoint_query=501-505;blk_query=1-500;blk_ref=501-1000;blk_query_len=500;blk_ref_len=500;color=#A0A0A0
ref_1 NucDiff_v2.0 SO:0001873 2500 2500 . . . ID=SV_2.1;Name=translocation-inserted_gap;ins_len=20;query_sequence=query_2;query_coord=500;breakpoint_query=501-520;blk_query=1-500;blk_ref=2001-2500;blk_query_len=500;blk_ref_len=500;color=#A0A0A0
##sequence-region ref_2 1 19500
ref_2 NucDiff_v2.0 SO:0001873 501 501 . . . ID=SV_1.2;Name=translocation-inserted_gap;ins_len=5;query_sequence=query_1;query_coord=506;breakpoint_query=501-505;blk_query=506-1005;blk_ref=501-1000;blk_query_len=500;blk_ref_len=500;color=#A0A0A0
ref_2 NucDiff_v2.0 SO:0001873 2001 2001 . . . ID=SV_2.2;Name=translocation-inserted_gap;ins_len=20;query_sequence=query_2;query_coord=521;breakpoint_query=501-520;blk_query=521-1020;blk_ref=2001-2500;blk_query_len=500;blk_ref_len=500;color=#A0A0A0
The ref_struct.gff file contains the following information (see Figure 1 for notation):
GFF3 fields | Content for Translocation block 1 | Content for Translocation block 2 | Notes |
---|---|---|---|
col 1 | Ref_seq_1 | Ref_seq_2 | |
col 2 | NucDiff_v2.0 | NucDiff_v2.0 | name and current version of the tool |
col 3 | SO:0001873 | SO:0001873 | Sequence Ontology accession number corresponding to the "interchromosomal_breakpoint" SO term |
col 4 | End_r_1 | St_r_2 | |
col 5 | End_r_1 | St_r_2 | |
col 6/col 7/col 8 | . | . | score/strand/phase fields are not used |
col 9, ID | "SV_1.1" | "SV_1.2" | ID in ref_struct.gff is related to ID in query_struct.gff |
col 9, Name | "translocation-inserted_gap" | "translocation-inserted_gap" | |
col 9, ins_len | Length(Inserted_gap) | Length(Inserted_gap) | |
col 9, query_sequence | Query_seq | Query_seq | |
col 9, query_coord | End_q_1 | St_q_2 | a query_coord base corresponds to the reference base from col 4 |
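col 9, breakpoint_query | End_q_1 - St_q_2 | End_q_1 - St_q_2 | in the example above the range covers only the inserted gap bases, between the two query_coord values |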
col 9, blk_query | St_q_1 - End_q_1 | St_q_2 - End_q_2 | |
col 9, blk_ref | Trl_st_r_1 - Trl_end_r_1 | Trl_st_r_2 - Trl_end_r_2 | |
col 9, blk_query_len | Length(A) | Length(B) | |
col 9, blk_ref_len | Length(A*) | Length(B*) |
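
Because each translocation produces two ref_struct.gff entries, it can be useful to pair them up again. The sketch below is a minimal Python illustration, not NucDiff code. It assumes, based only on the example IDs (SV_1.1 and SV_1.2 versus SV_1), that the part of the ID before the last dot is the query_struct.gff ID and the part after it is the block number. The helper `parse_attrs` and the shortened attribute strings are for illustration.

```python
from collections import defaultdict

def parse_attrs(col9):
    """Turn a GFF3 col 9 string into a dict of key -> value."""
    return dict(pair.split("=", 1) for pair in col9.split(";") if pair)

# (col 1, col 4, col 9) for SV_1.1 and SV_1.2 from the example, attributes shortened
ref_entries = [
    ("ref_1", 1000, "ID=SV_1.1;ins_len=5;query_sequence=query_1;"
                    "query_coord=500;breakpoint_query=501-505"),
    ("ref_2", 501, "ID=SV_1.2;ins_len=5;query_sequence=query_1;"
                   "query_coord=506;breakpoint_query=501-505"),
]

# Group entries by the assumed parent ID ("SV_1"), keyed by block number (1 or 2)
pairs = defaultdict(dict)
for seqid, pos, col9 in ref_entries:
    attrs = parse_attrs(col9)
    parent_id, block_no = attrs["ID"].rsplit(".", 1)
    pairs[parent_id][int(block_no)] = (seqid, pos, attrs)

for parent_id, blocks in pairs.items():
    (ref_a, pos_a, attrs_a), (ref_b, pos_b, attrs_b) = blocks[1], blocks[2]
    gap_start, gap_end = map(int, attrs_a["breakpoint_query"].split("-"))
    # Both entries should describe the same gap, flanked by their query_coord bases
    consistent = (
        attrs_a["breakpoint_query"] == attrs_b["breakpoint_query"]
        and int(attrs_a["query_coord"]) == gap_start - 1
        and int(attrs_b["query_coord"]) == gap_end + 1
        and gap_end - gap_start + 1 == int(attrs_a["ins_len"])
    )
    print(parent_id, f"{ref_a}:{pos_a}", f"{ref_b}:{pos_b}", consistent)
```

For the example this pairs ref_1:1000 with ref_2:501 under SV_1 and reports the two entries as consistent.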