-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathManual
57 lines (27 loc) · 3.19 KB
/
Manual
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
1.Introduction
2.Basic Workflow
3.Advanced Options
4.Output Files
1. Introduction
GTFmerge is a C++ script written by Matthew Patrick at the University of Michigan Dermatology Department designed to produce a singular genomic annnotation containing all non-overlapping information from several input files. The GTFmerge tool takes input files in Gene Transfer Format (GTF).
The tool is designed to exclude all exons of a particular gene in which at least one exon of the gene overlaps with any exon within any input GTF file. One input file is designated as the Reference GTF which serves as the base of the merged output. In the situation in which exons of genes
from one input GTF overlap with an exon of another input, information from the designated reference will be preserved while the gene from the other GTF (which is designated as a "Target GTF") will be discarded. # What about in the event where 2 non-reference GTF have overlapping exons? Which is preserved? GTFmerge does not merge within a GTF file, so both are preserved.
2. Basic Workflow
Gtfmergetool has the following format:
gtfmerge --"advanced option" "Reference.GTF" "Target.GTF" --outPrefix [/path/to/directory]
# Gtfmergetool can only handle two files within a single run. In order to merge several target GTF files, it is necessary to designate the merged file of one run as the --refGTF in the second run.
3. Advanced Options
--help produce help message
--outPrefix arg set output prefix (default: "./out")
--minBP arg set minimum number basepairs for overlap (default: 50)
--minPercent arg set minimum percentage for overlap (default: 20.0)
--numChromosomes arg set number of numeric autosomes (default: 22)
--allowOtherSequenceNames arg indicate whether to restrict to numeric autosomes
--outPrefix arg Specifies a prefix all outputting files of gtfmerge contain
--minBP arg Controls the minimum number of basepairs of overlap between two exons overwhich gtfmergetool considers an overlap. Overlaps between exons below this set number are not considered overlaps by the program and this information will be preserved in the merged file.
--minPercent arg Sets the minimum percentage of overlap between two exons above which gtfmergetool considers an overlap. An exon overlapping other exons within less than the set percentage of its basepairs is not considered a true overlap and is preserved.
4. Output files
Gtfmergetool outputs three files (added.gtf, combined.gtf, overlap.gtf)
Added.gtf contains all genes merged from the target gtf into the merged file.
Combined.gtf contains the union of non-overlapping genes from the target and reference gtf
Overlap.gtf contains all genomic information from the target gtf with at least one exon that overlaps to an exon of the reference gtf. This information has been discarded from the merged.gtf