
Translocations

kseniakh edited this page Mar 10, 2017 · 1 revision

Simple translocations

Translocation - a class of inter-chromosomal structural rearrangements that occur when two regions located on different reference sequences are placed next to each other in the same query sequence.

Simple translocation - a translocation where two query fragments are placed adjacent to each other.



Figure 1: Simple translocation example. The reference coordinates End_r_1 and St_r_2, corresponding to the breakpoint ends End_q_1 and St_q_2, respectively, coincide with the translocated block ends Trl_end_r_1 and Trl_st_r_2.



Figure 2: Simple translocation example. The reference coordinates End_r_1 and St_r_2, corresponding to the breakpoint ends End_q_1 and St_q_2, respectively, do not coincide with the translocated block ends Trl_end_r_1 and Trl_st_r_2.



A translocation difference is reported in the query_struct.gff and ref_struct.gff files. Information about the translocated blocks is also written to the ref_blocks.gff and query_blocks.gff files; descriptions and examples of those two files can be found on their own wiki pages.

Example translocation entries in query_struct.gff:

##gff-version 3
##sequence-region	query_1	1	1000
query_1	NucDiff_v2.0	SO:0001873	500	501	.	.	.	ID=SV_1;Name=translocation;ref_sequence_1=ref_1;blk_1_query=1-500;blk_1_ref=501-1000;blk_1_query_len=500;blk_1_ref_len=500;blk_1_st_query=1;blk_1_st_ref=501;blk_1_end_query=500;blk_1_end_ref=1000;ref_sequence_2=ref_2;blk_2_query=501-1000;blk_2_ref=501-1000;blk_2_query_len=500;blk_2_ref_len=500;blk_2_st_query=501;blk_2_st_ref=501;blk_2_end_query=1000;blk_2_end_ref=1000;color=#A0A0A0
##sequence-region	query_2	1	1000
query_2	NucDiff_v2.0	SO:0001873	500	501	.	.	.	ID=SV_2;Name=translocation;ref_sequence_1=ref_1;blk_1_query=1-500;blk_1_ref=2001-2500;blk_1_query_len=500;blk_1_ref_len=500;blk_1_st_query=1;blk_1_st_ref=2001;blk_1_end_query=500;blk_1_end_ref=2500;ref_sequence_2=ref_2;blk_2_query=501-1000;blk_2_ref=2001-2500;blk_2_query_len=500;blk_2_ref_len=500;blk_2_st_query=501;blk_2_st_ref=2001;blk_2_end_query=1000;blk_2_end_ref=2500;color=#A0A0A0
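The entries above follow the usual GFF3 layout of nine tab-separated columns, with `key=value` pairs separated by `;` in column 9. A minimal sketch of reading such an entry in Python is shown below. The function names `parse_gff3_line` and `read_struct_gff` are hypothetical helpers written for this illustration and are not part of NucDiff; the sample line is a shortened copy of the SV_1 entry.

```python
def parse_gff3_line(line):
    """Split one GFF3 feature line into its fixed columns plus an attribute dict."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqid, source, so_type, start, end, score, strand, phase, attrs = cols
    attributes = {}
    for pair in attrs.split(";"):
        if pair:
            # Split on the first '=' only, so values such as '#A0A0A0' stay intact.
            key, _, value = pair.partition("=")
            attributes[key] = value
    return {
        "seqid": seqid,
        "source": source,
        "type": so_type,
        "start": int(start),  # col 4
        "end": int(end),      # col 5
        "attributes": attributes,
    }


def read_struct_gff(lines):
    """Yield parsed feature records, skipping '##' directives and blank lines."""
    for line in lines:
        if line.strip() and not line.startswith("#"):
            yield parse_gff3_line(line)


# Shortened copy of the SV_1 entry from the example above.
example = (
    "query_1\tNucDiff_v2.0\tSO:0001873\t500\t501\t.\t.\t.\t"
    "ID=SV_1;Name=translocation;ref_sequence_1=ref_1;blk_1_query=1-500;"
    "blk_1_ref=501-1000;ref_sequence_2=ref_2;blk_2_query=501-1000;"
    "blk_2_ref=501-1000;color=#A0A0A0"
)
record = parse_gff3_line(example)
print(record["start"], record["end"], record["attributes"]["blk_1_ref"])
```

To process a whole file, `read_struct_gff` can be applied to an open file handle; the `##gff-version` and `##sequence-region` directives are skipped.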



The query_struct.gff file contains the following information (see Figure 1 for the notation):

| GFF3 fields | Content | Notes |
| --- | --- | --- |
| col 1 | Query_seq | |
| col 2 | NucDiff_v2.0 | name and current version of the tool |
| col 3 | SO:0001873 | Sequence Ontology accession number corresponding to the "interchromosomal_breakpoint" SO term |
| col 4 | End_q_1 | |
| col 5 | St_q_2 | |
| col 6/col 7/col 8 | . | score/strand/phase fields are not used |
| col 9, ID | "SV_1" | ID in query_struct.gff is related to ID in ref_struct.gff |
| col 9, Name | "translocation" | |
| col 9, ref_sequence_1 | Ref_seq_1 | |
| col 9, blk_1_query | St_q_1 - End_q_1 | |
| col 9, blk_1_ref | Trl_st_r_1 - Trl_end_r_1 | |
| col 9, blk_1_query_len | Length(A) | |
| col 9, blk_1_ref_len | Length(A*) | |
| col 9, blk_1_st_query | St_q_1 | |
| col 9, blk_1_st_ref | St_r_1 | |
| col 9, blk_1_end_query | End_q_1 | |
| col 9, blk_1_end_ref | End_r_1 | |
| col 9, ref_sequence_2 | Ref_seq_2 | |
| col 9, blk_2_query | St_q_2 - End_q_2 | |
| col 9, blk_2_ref | Trl_st_r_2 - Trl_end_r_2 | |
| col 9, blk_2_query_len | Length(B) | |
| col 9, blk_2_ref_len | Length(B*) | |
| col 9, blk_2_st_query | St_q_2 | |
| col 9, blk_2_st_ref | St_r_2 | |
| col 9, blk_2_end_query | End_q_2 | |
| col 9, blk_2_end_ref | End_r_2 | |
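The fields above make it possible to tell the Figure 1 case from the Figure 2 case: in Figure 1, End_r_1 (`blk_1_end_ref`) equals Trl_end_r_1 (the end of `blk_1_ref`) and St_r_2 (`blk_2_st_ref`) equals Trl_st_r_2 (the start of `blk_2_ref`). The sketch below applies that comparison literally to a dictionary of column 9 attributes. The helper names are hypothetical, and the second set of values is invented purely to illustrate a Figure 2-like entry; it is not NucDiff output.

```python
def parse_range(text):
    """Turn a 'start-end' string such as '501-1000' into a tuple of ints."""
    start, end = text.split("-")
    return int(start), int(end)


def breakpoints_coincide_with_blocks(attrs):
    """Return True when End_r_1 == Trl_end_r_1 and St_r_2 == Trl_st_r_2 (Figure 1 case)."""
    _, trl_end_r_1 = parse_range(attrs["blk_1_ref"])
    trl_st_r_2, _ = parse_range(attrs["blk_2_ref"])
    end_r_1 = int(attrs["blk_1_end_ref"])
    st_r_2 = int(attrs["blk_2_st_ref"])
    return end_r_1 == trl_end_r_1 and st_r_2 == trl_st_r_2


# Values copied from the SV_1 entry in the query_struct.gff example.
sv_1 = {
    "blk_1_ref": "501-1000", "blk_1_end_ref": "1000",
    "blk_2_ref": "501-1000", "blk_2_st_ref": "501",
}

# Invented values (assumption, for illustration only): the breakpoint
# coordinates lie inside the translocated blocks, as in Figure 2.
figure_2_like = {
    "blk_1_ref": "501-1000", "blk_1_end_ref": "990",
    "blk_2_ref": "501-1000", "blk_2_st_ref": "511",
}

print(breakpoints_coincide_with_blocks(sv_1))
print(breakpoints_coincide_with_blocks(figure_2_like))
```

The check only compares the coordinates named in the figure captions; how the coordinates are ordered for blocks aligned in reverse orientation is not covered here.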



Example translocation entries in ref_struct.gff:

##gff-version 3
##sequence-region	ref_1	1	15000
ref_1	NucDiff_v2.0	SO:0001873	1000	1000	.	.	.	ID=SV_1.1;Name=translocation;query_sequence=query_1;query_coord=500;breakpoint_query=500-501;blk_query=1-500;blk_ref=501-1000;blk_query_len=500;blk_ref_len=500;color=#A0A0A0
ref_1	NucDiff_v2.0	SO:0001873	2500	2500	.	.	.	ID=SV_2.1;Name=translocation;query_sequence=query_2;query_coord=500;breakpoint_query=500-501;blk_query=1-500;blk_ref=2001-2500;blk_query_len=500;blk_ref_len=500;color=#A0A0A0
##sequence-region	ref_2	1	15000
ref_2	NucDiff_v2.0	SO:0001873	501	501	.	.	.	ID=SV_1.2;Name=translocation;query_sequence=query_1;query_coord=501;breakpoint_query=500-501;blk_query=501-1000;blk_ref=501-1000;blk_query_len=500;blk_ref_len=500;color=#A0A0A0
ref_2	NucDiff_v2.0	SO:0001873	2001	2001	.	.	.	ID=SV_2.2;Name=translocation;query_sequence=query_2;query_coord=501;breakpoint_query=500-501;blk_query=501-1000;blk_ref=2001-2500;blk_query_len=500;blk_ref_len=500;color=#A0A0A0



The ref_struct.gff file contains the following information (see Figure 1 for the notation):

| GFF3 fields | Content for Translocation block 1 | Content for Translocation block 2 | Notes |
| --- | --- | --- | --- |
| col 1 | Ref_seq_1 | Ref_seq_2 | |
| col 2 | NucDiff_v2.0 | NucDiff_v2.0 | name and current version of the tool |
| col 3 | SO:0001873 | SO:0001873 | Sequence Ontology accession number corresponding to the "interchromosomal_breakpoint" SO term |
| col 4 | End_r_1 | St_r_2 | |
| col 5 | End_r_1 | St_r_2 | |
| col 6/col 7/col 8 | . | . | score/strand/phase fields are not used |
| col 9, ID | "SV_1.1" | "SV_1.2" | ID in ref_struct.gff is related to ID in query_struct.gff |
| col 9, Name | "translocation" | "translocation" | |
| col 9, query_sequence | Query_seq | Query_seq | |
| col 9, query_coord | End_q_1 | St_q_2 | a query_coord base corresponds to the reference base from col 4 |
| col 9, breakpoint_query | End_q_1 - St_q_2 | End_q_1 - St_q_2 | |
| col 9, blk_query | St_q_1 - End_q_1 | St_q_2 - End_q_2 | |
| col 9, blk_ref | Trl_st_r_1 - Trl_end_r_1 | Trl_st_r_2 - Trl_end_r_2 | |
| col 9, blk_query_len | Length(A) | Length(B) | |
| col 9, blk_ref_len | Length(A*) | Length(B*) | |
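In the examples on this page, the two ref_struct.gff entries SV_1.1 and SV_1.2 belong to the query_struct.gff entry SV_1. Assuming that this `<query ID>.<block number>` pattern holds in general (it is inferred from the examples, not stated as a rule), the entries can be grouped as sketched below. The helper names are hypothetical.

```python
from collections import defaultdict


def parent_sv_id(ref_id):
    """Map a ref_struct.gff ID to its query_struct.gff ID, e.g. 'SV_1.2' -> 'SV_1'.

    Assumes the '<query ID>.<block number>' pattern seen in the examples.
    """
    return ref_id.rsplit(".", 1)[0]


def group_by_query_sv(ref_entries):
    """Group (reference sequence, ref_struct ID) pairs by their query_struct ID."""
    groups = defaultdict(list)
    for seqid, ref_id in ref_entries:
        groups[parent_sv_id(ref_id)].append((seqid, ref_id))
    return dict(groups)


# (col 1, ID) pairs taken from the ref_struct.gff example, in file order.
ref_entries = [
    ("ref_1", "SV_1.1"),
    ("ref_1", "SV_2.1"),
    ("ref_2", "SV_1.2"),
    ("ref_2", "SV_2.2"),
]
groups = group_by_query_sv(ref_entries)
print(groups["SV_1"])
print(groups["SV_2"])
```

Each group then holds one entry per translocation block, which can be matched against the `blk_1_*` and `blk_2_*` attributes of the corresponding query_struct.gff entry.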