.\" Process this file with
.\" groff -man -Tascii foo.1
.\"
.TH BLASTN 1 "Rowan" "User Manuals"
.SH NAME
blastn \- Basic Local Alignment Search Tool for nucleotide sequences: a practical, unofficial quick start manual
.SH SYNOPSIS
.B blastn [-db
.I blast-database
.B ]
.B [-query
.I file
.B ]
.B [-out
.I out-file
.B ]
.B ...
.SH DESCRIPTION
.B BLASTN
is a common and popular tool for finding nucleotide sequences in a database that match a query sequence.
It has been in use for a long time and is quite fast at searching many query sequences against many database sequences.
.B BLASTP
is the version of this tool for proteins.
Much of this manual is taken from the NCBI website, from other websites such as Biostars, or from school trainings on how to use BLAST.
.SH OPTIONS
.IP -evalue
Expectation value (E) threshold for saving hits.
The default is 10.
.IP -outfmt
Writes the output in the specified format: 6 is tab-separated, and 7 is tab-separated with comment lines that name each column. See OUTPUT TYPES below and blastn(2).
.IP -perc_identity
Minimum percent identity that a database sequence must share with the query sequence for the hit to be reported.
.IP "-num_alignments <integer> >= 0"
Number of database sequences to show alignments for. Too many hits can make the results hard to interpret, so this option is useful for capping them.
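.PP
As a rough illustration of how the options above combine with the flags from the synopsis, the following Python sketch assembles a blastn argument list without running anything. The function name build_blastn_command and the database and file names in the usage lines are hypothetical; only the flags themselves come from this page. The sketch picks format 6 as its own default purely for convenience.
.nf
```python
import shlex


def build_blastn_command(db, query, out, evalue=10.0, outfmt="6",
                         perc_identity=None, num_alignments=None):
    """Return a blastn argument list built from the options on this page."""
    cmd = ["blastn", "-db", db, "-query", query, "-out", out,
           "-evalue", str(evalue), "-outfmt", str(outfmt)]
    # Optional filters are only added when the caller sets them.
    if perc_identity is not None:
        cmd += ["-perc_identity", str(perc_identity)]
    if num_alignments is not None:
        cmd += ["-num_alignments", str(num_alignments)]
    return cmd


if __name__ == "__main__":
    # Hypothetical database and file names, for illustration only.
    cmd = build_blastn_command("my_db", "query.fasta", "hits.tsv",
                               evalue=1e-5, perc_identity=90)
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(cmd))
```
.fi
Passing the returned list to subprocess.run would start the search, assuming blastn is installed and on the PATH. Keeping the command as a list means a custom format such as "6 qseqid sseqid pident" stays a single argument.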
.SH OUTPUT TYPES
The following section is copied from the output of
.BR "blastn \-help" .
.nf
\-outfmt <String>
  alignment view options:
    0 = Pairwise,
    1 = Query-anchored showing identities,
    2 = Query-anchored no identities,
    3 = Flat query-anchored showing identities,
    4 = Flat query-anchored no identities,
    5 = BLAST XML,
    6 = Tabular,
    7 = Tabular with comment lines,
    8 = Seqalign (Text ASN.1),
    9 = Seqalign (Binary ASN.1),
    10 = Comma-separated values,
    11 = BLAST archive (ASN.1),
    12 = Seqalign (JSON),
    13 = Multiple-file BLAST JSON,
    14 = Multiple-file BLAST XML2,
    15 = Single-file BLAST JSON,
    16 = Single-file BLAST XML2,
    17 = Sequence Alignment/Map (SAM),
    18 = Organism Report
  Options 6, 7, 10 and 17 can be additionally configured to produce
  a custom format specified by space delimited format specifiers.
  The supported format specifiers for options 6, 7 and 10 are:
    qseqid means Query Seq-id
    qgi means Query GI
    qacc means Query accession
    qaccver means Query accession.version
    qlen means Query sequence length
    sseqid means Subject Seq-id
    sallseqid means All subject Seq-id(s), separated by a ';'
    sgi means Subject GI
    sallgi means All subject GIs
    sacc means Subject accession
    saccver means Subject accession.version
    sallacc means All subject accessions
    slen means Subject sequence length
    qstart means Start of alignment in query
    qend means End of alignment in query
    sstart means Start of alignment in subject
    send means End of alignment in subject
    qseq means Aligned part of query sequence
    sseq means Aligned part of subject sequence
    evalue means Expect value
    bitscore means Bit score
    score means Raw score
    length means Alignment length
    pident means Percentage of identical matches
    nident means Number of identical matches
    mismatch means Number of mismatches
    positive means Number of positive-scoring matches
    gapopen means Number of gap openings
    gaps means Total number of gaps
    ppos means Percentage of positive-scoring matches
    frames means Query and subject frames separated by a '/'
    qframe means Query frame
    sframe means Subject frame
    btop means Blast traceback operations (BTOP)
    staxid means Subject Taxonomy ID
    ssciname means Subject Scientific Name
    scomname means Subject Common Name
    sblastname means Subject Blast Name
    sskingdom means Subject Super Kingdom
    staxids means unique Subject Taxonomy ID(s), separated by a ';'
      (in numerical order)
    sscinames means unique Subject Scientific Name(s), separated by a ';'
    scomnames means unique Subject Common Name(s), separated by a ';'
    sblastnames means unique Subject Blast Name(s), separated by a ';'
      (in alphabetical order)
    sskingdoms means unique Subject Super Kingdom(s), separated by a ';'
      (in alphabetical order)
    stitle means Subject Title
    salltitles means All Subject Title(s), separated by a '<>'
    sstrand means Subject Strand
    qcovs means Query Coverage Per Subject
    qcovhsp means Query Coverage Per HSP
    qcovus means Query Coverage Per Unique Subject (blastn only)
  When not provided, the default value is:
  \&'qaccver saccver pident length mismatch gapopen qstart qend sstart send
  evalue bitscore', which is equivalent to the keyword 'std'
.fi
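.PP
To show how the default 'std' columns line up with a tabular result file, here is a minimal Python sketch that reads format 6 or 7 output into dictionaries. The name parse_tabular and the sample hit are made up for illustration, and the sketch assumes the file was written with the default column set; pass your own column list if you used a custom format.
.nf
```python
import csv

# Column names of the default "std" layout, as listed above.
STD_COLUMNS = ("qaccver saccver pident length mismatch gapopen "
               "qstart qend sstart send evalue bitscore").split()


def parse_tabular(lines, columns=STD_COLUMNS):
    """Yield one dict per hit from an iterable of tabular output lines."""
    # The excel-tab dialect splits on tabs; QUOTE_NONE keeps any quote
    # characters in titles as plain text.
    reader = csv.reader(lines, dialect="excel-tab", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Format 7 adds comment lines starting with "#"; skip those
        # along with blank lines.
        if not row or row[0].startswith("#"):
            continue
        yield dict(zip(columns, row))


if __name__ == "__main__":
    # A made-up hit, joined with tab characters (chr(9) is a tab).
    fields = ["query1", "subject1", "98.5", "200", "3", "0",
              "1", "200", "51", "250", "1e-100", "350"]
    sample = ["# Fields: made-up example", chr(9).join(fields)]
    for hit in parse_tabular(sample):
        # Values arrive as strings; convert the ones you compare.
        if float(hit["pident"]) >= 90 and float(hit["evalue"]) <= 1e-5:
            print(hit["qaccver"], hit["saccver"], hit["bitscore"])
```
.fi
For a real run, open the file given to \-out and pass the file object to parse_tabular in place of the sample list.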
.SH BUGS
The documentation can be hard to read. So far I have not had any issues with BLAST itself;
however, more information is available at
.IR https://www.ncbi.nlm.nih.gov/books/NBK279670/
.SH AUTHOR
Rowan Callahan and sundry internet blogs. If you have actual questions, please email someone who isn't me: <blast-help at ncbi.nlm.nih.gov>
.SH "SEE ALSO"
.BR blastn (2),
.BR blastp (1)