ref_additional.gff is a GFF3 file that contains information about:

  • duplicated reference regions
  • repeated reference regions involved in duplications, tandem duplications, collapsed repeats, and tandem collapsed repeats
  • reference regions corresponding to the overlapped query regions in relocations and translocations with overlap differences
  • unmapped reference regions not involved in deletion differences

An example of the ref_additional.gff file, together with a detailed description of the information provided for repeated regions and for regions involved in overlaps, can be found on the wiki page of the corresponding difference.

An example of duplicated reference region entries in the ref_additional.gff file:

##gff-version 3
##sequence-region	ref_1	1	296114
ref_1	NucDiff_v2.0	SO:0000001	53630	53661	.	.	.	ID=Region_1;Name=Ref_duplication;duplic_len=32;color=#4005BF
ref_1	NucDiff_v2.0	SO:0000001	53707	53724	.	.	.	ID=Region_2;Name=Ref_duplication;duplic_len=18;color=#4005BF

The ref_additional.gff file contains the following information for each reference region:

GFF3 fields         Content                     Notes
col 1               Ref_seq                     reference sequence name
col 2               NucDiff_v2.0                name and current version of the tool
col 3               SO:0000001                  Sequence Ontology accession number corresponding to the "region" SO term
col 4               Pos_st                      start position of the duplicated region
col 5               Pos_end                     end position of the duplicated region
col 6/col 7/col 8   .                           score/strand/phase fields are not used
col 9, ID           "Region_1"
col 9, Name         "Ref_duplication"
col 9, duplic_len   Length(duplicated_region)
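
The column layout described above can be read with a few lines of Python. The following is a minimal sketch, not part of NucDiff itself: the names RefRegion and parse_ref_additional are hypothetical, and the code assumes tab-separated feature lines with nine columns and "key=value" pairs separated by ";" in column 9, as in the example entries.

```python
from typing import Dict, Iterable, Iterator, NamedTuple


class RefRegion(NamedTuple):
    """One feature line of ref_additional.gff (hypothetical helper type)."""
    seq_name: str               # col 1
    source: str                 # col 2
    so_term: str                # col 3
    start: int                  # col 4
    end: int                    # col 5
    attributes: Dict[str, str]  # col 9, split into key/value pairs


def parse_ref_additional(lines: Iterable[str]) -> Iterator[RefRegion]:
    """Yield a RefRegion per feature line, skipping directives and blank lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # '##gff-version', '##sequence-region', comments
        cols = line.split("\t")
        if len(cols) != 9:
            raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
        # Column 9 looks like 'ID=Region_1;Name=Ref_duplication;...'
        attrs = dict(pair.split("=", 1) for pair in cols[8].split(";") if pair)
        yield RefRegion(cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]), attrs)


# The duplicated-region example from this page, written with explicit tabs.
example = [
    "##gff-version 3",
    "##sequence-region\tref_1\t1\t296114",
    "ref_1\tNucDiff_v2.0\tSO:0000001\t53630\t53661\t.\t.\t."
    "\tID=Region_1;Name=Ref_duplication;duplic_len=32;color=#4005BF",
    "ref_1\tNucDiff_v2.0\tSO:0000001\t53707\t53724\t.\t.\t."
    "\tID=Region_2;Name=Ref_duplication;duplic_len=18;color=#4005BF",
]

regions = list(parse_ref_additional(example))
for region in regions:
    print(region.seq_name, region.start, region.end, region.attributes["Name"])
```

To read a real file, pass an open file handle instead of the list, for example parse_ref_additional(open("ref_additional.gff")). The path is a placeholder; the actual name and location depend on how NucDiff was run.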

An example of unmapped reference region entries in the ref_additional.gff file:

##gff-version 3
##sequence-region	ref_1	1	115000
ref_1	NucDiff_v2.0	SO:0000001	501	10999	.	.	.	ID=SV_1;Name=uncovered_region;region_len=10499;color=#990000

The ref_additional.gff file contains the following information for each unmapped reference region:

GFF3 fields         Content                     Notes
col 1               Ref_seq                     reference sequence name
col 2               NucDiff_v2.0                name and current version of the tool
col 3               SO:0000001                  Sequence Ontology accession number corresponding to the "region" SO term
col 4               pos_st                      start of the unmapped reference region
col 5               pos_end                     end of the unmapped reference region
col 6/col 7/col 8   .                           score/strand/phase fields are not used
col 9, ID           "SV_1"
col 9, Name         "uncovered_region"
col 9, region_len   Length(unmapped_region)
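
Because duplicated and unmapped regions are distinguished only by the Name attribute, a downstream script may want to group entries by Name and sanity-check the stored length. The sketch below is illustrative only: group_by_name is a hypothetical helper, and it assumes that the length attribute (duplic_len or region_len) equals end - start + 1, which holds for the example entries on this page (for instance 10999 - 501 + 1 = 10499) but is not stated as a general rule here.

```python
from typing import Dict, Iterable, List, Tuple

# (seq_name, start, end, span, length_matches)
Entry = Tuple[str, int, int, int, bool]


def group_by_name(lines: Iterable[str]) -> Dict[str, List[Entry]]:
    """Group feature lines by their Name attribute and check the stored length."""
    groups: Dict[str, List[Entry]] = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip directives such as '##gff-version 3'
        cols = line.split("\t")
        start, end = int(cols[3]), int(cols[4])
        attrs = dict(pair.split("=", 1) for pair in cols[8].split(";") if pair)
        # GFF3 coordinates are 1-based and inclusive, hence the +1.
        span = end - start + 1
        # Assumption: the length is stored under one of these two keys.
        stored = attrs.get("duplic_len") or attrs.get("region_len")
        matches = stored is not None and int(stored) == span
        name = attrs.get("Name", "")
        groups.setdefault(name, []).append((cols[0], start, end, span, matches))
    return groups


# Example entries from this page, written with explicit tabs.
example = [
    "##gff-version 3",
    "ref_1\tNucDiff_v2.0\tSO:0000001\t53630\t53661\t.\t.\t."
    "\tID=Region_1;Name=Ref_duplication;duplic_len=32;color=#4005BF",
    "ref_1\tNucDiff_v2.0\tSO:0000001\t501\t10999\t.\t.\t."
    "\tID=SV_1;Name=uncovered_region;region_len=10499;color=#990000",
]

groups = group_by_name(example)
for name, entries in groups.items():
    print(name, entries)
```

Entries for which the last tuple element is False would indicate either a different length convention or an attribute key not covered by this sketch.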

IGV visualisation of the ref_additional.gff file:

Figure 1: IGV visualisation of the results output in the ref_additional.gff file