module.xml

<?xml version="1.0" encoding="UTF-8"?>
<modules>
    <module>
        <name>searchGUI</name>
        <category>search</category>
        <description>description1</description>
        <inputFile>input.xml</inputFile>
        <inputParam>true</inputParam>
        <outputFile_required>false</outputFile_required>
        <outputFile>N/A</outputFile>
        <outputParam>true</outputParam>
        <params>-spectrum_files xyz.mgf -output_folder folder_path -id_params params.par</params>
        <command>java -cp SearchGUI-3.3.3.jar eu.isas.searchgui.cmd.SearchCLI</command>
    </module>
    <module>
        <name>ProteoGrouper</name>
        <category>Grouper</category>
        <description>This tool can perform sequence-based protein inference, based on a set of PSMs. It should be parameterized with the CV accession for the PSM score used to create a protein score. The tool also needs to know whether the score should be log transformed (true for e/p-values etc) to create a positive protein score.</description>
        <inputFile>[input].mzid or [input].mzid.gz</inputFile>
        <inputParam>false</inputParam>
        <outputFile_required>true</outputFile_required>
        <outputFile>output2.txt</outputFile>
        <outputParam>false</outputParam>
        <params>-requireSIIsToPassThreshold true -verboseOutput false -cvAccForSIIScore \"MS:1001171\" -logTransScore false -version1_1 true -compress true</params>
        <command>java -jar "mzidlib-1.7.jar" ProteoGrouper mydata_fdr_threshold.mzid.gz mydata_fdr_threshold_groups.mzid.gz</command>
    </module>
    <module>
        <name>Omssa2mzid</name>
        <category>MZID</category>
        <description>This tool converts OMSSA omx (XML) files into mzid. It has optional parameters for inserting fragment ions into mzid (much larger files). If a decoy regex is specified, the mzid attribute isDecoy will be set correctly for peptides. No protein inference is done by this tool (no protein list produced). To make valid mzid output, OMSSA must have been run with the option "-w" (include spectra and search params in search results). Without this option, search parameters cannot be extracted from OMSSA. In this case, the OMSSA CSV converter should be used.</description>
        <inputFile>[input].omx or [input].omx.gz</inputFile>
        <inputParam>false</inputParam>
        <outputFile_required>true</outputFile_required>
        <outputFile>[output].mzid or [output].mzid.gz</outputFile>
        <outputParam>false</outputParam>
        <params>-outputFragmentation false -decoyRegex REVERSED -mzidVer 1.2 -compress false</params>
        <command>java -jar "mzidlib-1.7.jar" Omssa2mzid mydata.omx mydata_omssa.mzid.gz</command>
    </module>
    <module>
        <name>Tandem2mzid</name>
        <category>MZID</category>
        <description>This tool converts X!Tandem XML results files into mzid. There are several optional parameters: whether to export fragment ions (makes bigger files), and a decoy regular expression to set the isDecoy attribute in mzid. Valid mzid files require several pieces of metadata that are difficult to extract from X!Tandem files: the format of the database searched and the file format of the input spectra. If these parameters are not set, the converter attempts to guess them from the file extension. In X!Tandem, the numbering of spectra depends on the input spectra type: the IDs start at zero for mzML files and at one for other spectra types, e.g. MGF. This is controlled by a command line parameter, which should be set to make sure that the mzid file references the correct spectrum in the source spectrum file.</description>
        <inputFile>[input].xml or [input].xml.gz</inputFile>
        <inputParam>false</inputParam>
        <outputFile_required>true</outputFile_required>
        <outputFile>[output].mzid or [output].mzid.gz</outputFile>
        <outputParam>false</outputParam>
        <params>-outputFragmentation (true|false) -decoyRegex decoyRegex -databaseFileFormatID (e.g. MS:1001348 is FASTA format) "MS:100XXX" -massSpecFileFormatID (e.g. MS:1001062 is MGF) "MS:100XXX" -idsStartAtZero (true for mzML searched, false otherwise) true|false -compress true|false</params>
        <command>java -jar "mzidlib-1.7.jar" Tandem2mzid mydata.xml mydata_tandem.mzid.gz</command>
    </module>
    <module>
        <name>FalseDiscoveryRateGlobal</name>
        <category>n/a</category>
        <description>The Global FDR module calculates the FDR on one of three levels: 1) PSM, 2) Peptide, 3) ProteinGroup. If ProteinGroup is chosen, there are two options for the protein level: PAG or PDH.</description>
        <inputFile>[input].mzid or [input].mzid.gz</inputFile>
        <inputParam>false</inputParam>
        <outputFile_required>true</outputFile_required>
        <outputFile>[output].mzid or [output].mzid.gz</outputFile>
        <outputParam>false</outputParam>
        <params>-decoyValue decoyToTargetRatio -decoyRegex decoyRegex -cvTerm cvTerm -betterScoresAreLower true|false -fdrLevel fdrLevel -proteinLevel proteinLevel [-compress true|false]</params>
        <command>java -jar "mzidlib-1.7.jar" FalseDiscoveryRateGlobal mydata.mzid mydata_fdr.mzid.gz</command>
    </module>
    <module>
        <name>Threshold</name>
        <category>n/a</category>
        <description>This tool can be used to set the passThreshold parameter for PSMs or proteins in an mzid file, to indicate high-quality identifications that will be used by another tool. It can handle any type of score (sourced from the PSI-MS CV) and scores can be ordered low to high or vice versa. If deleteUnderThreshold is specified, PSMs and referenced proteins under the threshold will be removed from the file.</description>
        <inputFile>[input].mzid or [input].mzid.gz</inputFile>
        <inputParam>false</inputParam>
        <outputFile_required>true</outputFile_required>
        <outputFile>[output].mzid or [output].mzid.gz</outputFile>
        <outputParam>false</outputParam>
        <params>-isPSMThreshold true|false -cvAccessionForScoreThreshold "MS:100XXX" -threshValue doubleValue -betterScoresAreLower true|false -deleteUnderThreshold true|false [-compress true|false]</params>
        <command>java -jar "mzidlib-1.7.jar" Threshold mydata_fdr.mzid.gz mydata_fdr_threshold.mzid.gz</command>
    </module>
    <module>
        <name>Mzid2Csv</name>
        <category>n/a</category>
        <description>This tool can export from an mzid file into CSV, according to one of the export types listed in the parameters.</description>
        <inputFile>[input].mzid or [input].mzid.gz</inputFile>
        <inputParam>false</inputParam>
        <outputFile_required>true</outputFile_required>
        <outputFile>[output].csv or [output].csv.gz</outputFile>
        <outputParam>false</outputParam>
        <params>-exportType exportProteinGroups|exportPSMs|exportProteinsOnly|exportRepProteinPerPAGOnly|exportProteoAnnotator [-verboseOutput true|false] [-compress true|false]</params>
        <command>java -jar "mzidlib-1.7.jar" Mzid2Csv mydata_fdr.mzid.gz mydata.csv</command>
    </module>
    <module>
        <name>AddRetentionTimeToMzid</name>
        <category>n/a</category>
        <description>Add Retention Time to Mzid</description>
        <inputFile>[input].mzid or [input].mzid.gz</inputFile>
        <inputParam>false</inputParam>
        <outputFile_required>true</outputFile_required>
        <outputFile>[output].mzid or [output].mzid.gz</outputFile>
        <outputParam>false</outputParam>
        <params>-compress true|false</params>
        <command>java -jar "mzidlib-1.7.jar" AddRetentionTimeToMzid input.mzid output.mzid</command>
    </module>
    <module>
        <name>msconvert</name>
        <category>n/a</category>
        <description>msconvert is a command line tool for converting between various file formats. Full documentation for this tool can be found at: http://proteowizard.sourceforge.net/tools/msconvert.html</description>
        <inputFile>data.RAW</inputFile>
        <inputParam>false</inputParam>
        <outputFile_required>false</outputFile_required>
        <outputFile>N/A</outputFile>
        <outputParam>false</outputParam>
        <params>-f [ --filelist ] arg : specify text file containing filenames
-o [ --outdir ] arg (=.) : set output directory ('-' for stdout) [.]
-c [ --config ] arg : configuration file (optionName=value)
--outfile arg : Override the name of output file.
-e [ --ext ] arg : set extension for output files [mzML|mzXML|mgf|txt|mz5]
--mzML : write mzML format [default]
--mzXML : write mzXML format
--mz5 : write mz5 format
--mgf : write Mascot generic format
--text : write ProteoWizard internal text format
--ms1 : write MS1 format
--cms1 : write CMS1 format
--ms2 : write MS2 format
--cms2 : write CMS2 format
-v [ --verbose ] : display detailed progress information
--64 : set default binary encoding to 64-bit precision [default]
--32 : set default binary encoding to 32-bit precision
--mz64 : encode m/z values in 64-bit precision [default]
--mz32 : encode m/z values in 32-bit precision
--inten64 : encode intensity values in 64-bit precision
--inten32 : encode intensity values in 32-bit precision [default]
--noindex : do not write index
-i [ --contactInfo ] arg : filename for contact info
-z [ --zlib ] : use zlib compression for binary data
--numpressLinear [toler] : use numpress linear prediction lossy compression for binary mz and rt data (relative error guaranteed less than given tolerance, default is 2e-009)
--numpressPic : use numpress positive integer lossy compression for binary intensities (maximum 0.5 absolute error guaranteed)
--numpressSlof [toler] : use numpress short logged float lossy compression for binary intensities (relative error guaranteed less than given tolerance, default is 0.0002)
-n [ --numpressAll ] : same as --numpressLinear --numpressSlof (see https://github.com/fickludd/ms-numpress for more info)
--numpressLinearAbsTol : desired absolute tolerance for linear numpress prediction (e.g. use 1e-4 for a mass accuracy of 0.2 ppm at 500 m/z, default uses -1.0 for maximal accuracy). Note: setting this value may substantially reduce file size, this overrides relative accuracy tolerance.
Numpress may be used at the same time as zlib (-z) for best compression, though some older mzML parsers may not handle this properly.
-g [ --gzip ] : gzip entire output file (adds .gz to filename)
--filter arg : add a spectrum list filter
--merge : create a single output file from multiple input files by merging file-level metadata and concatenating spectrum lists
--simAsSpectra : write selected ion monitoring as spectra, not chromatograms
--srmAsSpectra : write selected reaction monitoring as spectra, not chromatograms
--combineIonMobilitySpectra : write all drift bins/scans in a frame/block as one spectrum instead of individual spectra
--acceptZeroLengthSpectra : some vendor readers have an efficient way of filtering out empty spectra, but it takes more time to open the file
--ignoreUnknownInstrumentError : if true, if an instrument cannot be determined from a vendor file, it will not be an error
--help : show this message, with extra detail on filter options</params>
        <command>msconvert.exe data.raw</command>
    </module>
    <module>
        <name>msaccess</name>
        <category>N/A</category>
        <description>msaccess is a command line tool for extracting data and metadata from data files. Full documentation for this tool can be found at: http://proteowizard.sourceforge.net/tools/msaccess.html</description>
        <inputFile>data.mzML</inputFile>
        <inputParam>false</inputParam>
        <outputFile_required>false</outputFile_required>
        <outputFile>false</outputFile>
        <outputParam>true</outputParam>
        <params>-f [ --filelist ] arg : text file containing filenames to process
-o [ --outdir ] arg (=.) : output directory
-c [ --config ] arg : configuration file (containing settings as optionName=value)
-x [ --exec ] arg : execute command, e.g. --exec "tic mz=409-412"
--filter arg : add a spectrum list filter, e.g. --filter="msLevel [2,3]"
(for a full list of supported filter types, see the documentation linked in the description)
-v [ --verbose ] : print progress messages</params>
        <command>msaccess.exe data.mzML</command>
    </module>
    <module>
        <name>idconvert</name>
        <category>N/A</category>
        <description>idconvert is a command line tool for converting between various file formats. Read: pepXML, protXML, mzIdentML. Write: pepXML, mzIdentML. Full documentation for this tool can be found at: http://proteowizard.sourceforge.net/tools/idconvert.html</description>
        <inputFile>data.pepXML, data.protXML or data.mzIdentML</inputFile>
        <inputParam>false</inputParam>
        <outputFile_required>false</outputFile_required>
        <outputFile>data.pepXML, data.protXML, data.mzIdentML</outputFile>
        <outputParam>false</outputParam>
        <params>--pepXML -o my_output_dir</params>
        <command>idconvert data.pepXML</command>
    </module>
    <module>
        <name>mspicture</name>
        <category>n/a</category>
        <description>msPicture is a tool that produces pseudo-2D gels from mass spectrometry data. There are many options available for manipulating the layout, color scheme, and markup of the resulting image. Being part of the ProteoWizard suite, msPicture can read a wide variety of MS data formats. Peptide locations can be marked by giving the location of a pepXML file, an msInspect output file, or a flat file.</description>
        <inputFile>data.mzML</inputFile>
        <inputParam>false</inputParam>
        <outputFile_required>false</outputFile_required>
        <outputFile>example2.mzXML.image</outputFile>
        <outputParam>false</outputParam>
        <params>-o [ --outdir ] arg (=.) : output directory
-c [ --config ] arg      : configuration file (optionName=value) (ignored)
-l [ --label ] arg       : set filename label to xxx
--mzLow arg              : set low m/z cutoff
--mzHigh arg             : set high m/z cutoff
--timeScale arg          : set scale of time axis
-b [ --binCount ] arg    : set histogram bin count
-t [ --time ]            : render linearly to time
-s [ --scan ]            : render linearly to scans
-z [ --zRadius ] arg     : set intensity function z-score radius [=2]
--bry                    : use blue-red-yellow gradient
--grey                   : use grey-scale gradient
--binSum                 : sum intensity in bins [default = max intensity]
-m [ --ms2locs ]         : indicate masses selected for ms2
--shape arg              : shape of the pseudo2d gel markup [circle(default)|square].
-p [ --pepxml ] arg      : pepxml file location
-i [ --msi ] arg         : msInspect output file location
-f [ --flat ] arg        : peptide file location (nativeID rt mz score seq)
-w [ --width ] arg       : set image width in pixels [default is calculated]
-h [ --height ] arg      : set image height in pixels [default is calculated]
-v [ --verbose ]         : prints extra information.
-h [ --help ]            : print this helpful message.

Commands:
      label=xxxx (set filename label to xxxx)
      mzLow=N (set low m/z cutoff)
      mzHigh=N (set high m/z cutoff)
      timeScale=N (set scaling factor for time axis)
      binCount=N (set histogram bin count)
      zRadius=N (set intensity function z-score radius [=2])
      scan (render y-axis linear with scans)
      time (render y-axis linear with time)
      bry (use blue-red-yellow gradient)
      grey (use grey-scale gradient)
      binSum (sum intensity in bins [default = max intensity])
      ms2locs (indicate masses selected for ms2)
      pepxml=xxx (set ms2 id's from pepxml file xxx)
      msi=xxx (set ms2 id's from msinspect output file xxx)
      flat=xxx (set ms2 id's from tab delim file xxx)</params>
        <command>mspicture.exe filename.mzML</command>
    </module>
    <module>
        <name>qtofpeakpicker</name>
        <category>n/a</category>
        <description>qtofpeakpicker is a command line tool for peak detection in TOF (time-of-flight) spectra. Full documentation for this tool can be found at: http://proteowizard.sourceforge.net/tools/qtofpeakpicker.html</description>
        <inputFile>All ProteoWizard formats are supported.</inputFile>
        <inputParam>true</inputParam>
        <outputFile_required>true</outputFile_required>
        <outputFile>true</outputFile>
        <outputParam>true</outputParam>
        <params>File Handling:
-H [ --help ] : produce help message
-V [ --version ] : produces version information
-I [ --in ] arg : input file
-O [ --out ] arg : output file
-C [ --config-file ] arg : configuration file
Processing Options:
--resolution arg (=20000) : instrument resolution
--area arg (=1) : default area, otherwise store intensity (0)
--threshold arg (=10) : removes peaks less than threshold times smallest intensity in spectrum
--numberofpeaks arg (=0) : maximum number of peaks per spectrum (0 = no limit)
Advanced Processing Options:
-i [ --widthint ] arg (=2) : peak apex +- integration width
--smoothwidth arg (=1) : smoothing width</params>
        <command>qtofpeakpicker.exe</command>
    </module>
    <module>
        <name>blastn</name>
        <category>search</category>
        <description>The blastn application searches a nucleotide query against nucleotide subject sequences or a nucleotide database. An option of type "flag" takes no arguments, but if present the argument is true. 
Four different tasks are supported:
    1.) "megablast", for very similar sequences (e.g., sequencing errors),
    2.) "dc-megablast", typically used for inter-species comparisons,
    3.) "blastn", the traditional program used for inter-species comparisons,
    4.) "blastn-short", optimized for sequences less than 30 nucleotides.
    Full documentation for this tool can be found on the following page: https://www.ncbi.nlm.nih.gov/books/NBK279684/#appendices.Options_for_the_commandline_a
        </description>
        <inputFile>[file].fasta</inputFile>
        <inputParam>true</inputParam>
        <outputFile_required>true</outputFile_required>
        <outputFile>[results].out</outputFile>
        <outputParam>true</outputParam>
        <params>Parameters common to all BLAST+ search modules:
Option  Type    Default value   Description/notes
db	string	none	BLAST database name.
query	string	stdin	Query file name.
query_loc	string	none	Location on the query sequence (Format: start-stop)
out	string	stdout	Output file name
evalue	real	10.0	Expect value (E) for saving hits
subject	string	none	File with subject sequence(s) to search.
subject_loc	string	none	Location on the subject sequence (Format: start-stop).
show_gis	flag	N/A	Show NCBI GIs in report.
num_descriptions	integer	500	Show one-line descriptions for this number of database sequences.
num_alignments	integer	250	Show alignments for this number of database sequences.
max_target_seqs	integer	500	Number of aligned sequences to keep. Use with report formats that do not have separate definition line and alignment sections such as tabular (all outfmt > 4). Not compatible with num_descriptions or num_alignments.
max_hsps	integer	none	Maximum number of HSPs (alignments) to keep for any single query-subject pair. The HSPs shown will be the best as judged by expect value. This number should be an integer that is one or greater. If this option is not set, BLAST shows all HSPs meeting the expect value criteria. Setting it to one will show only the best HSP for every query-subject pair.
html	flag	N/A	Produce HTML output
gilist	string	none	Restrict search of database to GIs listed in this file. Local searches only.
negative_gilist	string	none	Restrict search of database to everything except the GIs listed in this file. Local searches only.
entrez_query	string	none	Restrict search with the given Entrez query. Remote searches only.
culling_limit	integer	none	Delete a hit that is enveloped by at least this many higher-scoring hits.
best_hit_overhang	real	none	Best Hit algorithm overhang value (recommended value: 0.1)
best_hit_score_edge	real	none	Best Hit algorithm score edge value (recommended value: 0.1)
dbsize	integer	none	Effective size of the database
searchsp	integer	none	Effective length of the search space
import_search_strategy	string	none	Search strategy file to read.
export_search_strategy	string	none	Record search strategy to this file.
parse_deflines	flag	N/A	Parse query and subject bar delimited sequence identifiers (e.g., gi|129295).
num_threads	integer	1	Number of threads (CPUs) to use in blast search.
remote	flag	N/A	Execute search on NCBI servers?
outfmt	string	0	alignment view options:
                            0 = pairwise,
                            1 = query-anchored showing identities,
                            2 = query-anchored no identities,
                            3 = flat query-anchored, show identities,
                            4 = flat query-anchored, no identities,
                            5 = XML Blast output,
                            6 = tabular,
                            7 = tabular with comment lines,
                            8 = Text ASN.1,
                            9 = Binary ASN.1
                            10 = Comma-separated values
                            11 = BLAST archive format (ASN.1)
                            Options 6, 7, and 10 can be additionally configured to produce a custom format specified by space delimited format specifiers.
                            The supported format specifiers are:
                            qseqid means Query Seq-id
                            qgi means Query GI
                            qacc means Query accession
                            sseqid means Subject Seq-id
                            sallseqid means All subject Seq-id(s), separated by a ';'
                            sgi means Subject GI
                            sallgi means All subject GIs
                            sacc means Subject accession
                            sallacc means All subject accessions
                            qstart means Start of alignment in query
                            qend means End of alignment in query
                            sstart means Start of alignment in subject
                            send means End of alignment in subject
                            qseq means Aligned part of query sequence
                            sseq means Aligned part of subject sequence
                            evalue means Expect value
                            bitscore means Bit score
                            score means Raw score
                            length means Alignment length
                            pident means Percentage of identical matches
                            nident means Number of identical matches
                            mismatch means Number of mismatches
                            positive means Number of positive-scoring matches
                            gapopen means Number of gap openings
                            gaps means Total number of gaps
                            ppos means Percentage of positive-scoring matches
                            frames means Query and subject frames separated by a '/'
                            qframe means Query frame
                            sframe means Subject frame
                            btop means Blast traceback operations (BTOP)
                            staxids means unique Subject Taxonomy ID(s), separated by a ';' (in numerical order)
                            sscinames means unique Subject Scientific Name(s), separated by a ';'
                            scomnames means unique Subject Common Name(s), separated by a ';'
                            sblastnames means unique Subject Blast Name(s), separated by a ';' (in alphabetical order)
                            sskingdoms means unique Subject Super Kingdom(s), separated by a ';' (in alphabetical order)
                            stitle means Subject Title
                            salltitles means All Subject Title(s), separated by a '&lt;&gt;'
                            sstrand means Subject Strand
                            qcovs means Query Coverage Per Subject (for all HSPs)
                            qcovhsp means Query Coverage Per HSP
                            qcovus is a measure of Query Coverage that counts a position in a subject sequence for this measure only once. The second time the position is aligned to the query is not counted towards this measure.
                            When not provided, the default value is:
                            'qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore', which is equivalent to the keyword 'std'
MODULE SPECIFIC PARAMS:
option	task(s)	type	default value	description and notes
word_size	megablast	integer	28	Length of initial exact match.
word_size	dc-megablast	integer	11	Number of matching nucleotides in initial match. dc-megablast allows non-consecutive letters to match.
word_size	blastn	integer	11	Length of initial exact match.
word_size	blastn-short	integer	7	Length of initial exact match.
gapopen	megablast	integer	0	Cost to open a gap. See appendix "BLASTN reward/penalty values".
gapextend	megablast	integer	none	Cost to extend a gap. This default is a function of reward/penalty value. See appendix "BLASTN reward/penalty values".
gapopen	blastn, blastn-short, dc-megablast	integer	5	Cost to open a gap. See appendix "BLASTN reward/penalty values".
gapextend	blastn, blastn-short, dc-megablast	integer	2	Cost to extend a gap. See appendix "BLASTN reward/penalty values".
reward	megablast	integer	1	Reward for a nucleotide match.
penalty	megablast	integer	-2	Penalty for a nucleotide mismatch.
reward	blastn, dc-megablast	integer	2	Reward for a nucleotide match.
penalty	blastn, dc-megablast	integer	-3	Penalty for a nucleotide mismatch.
reward	blastn-short	integer	1	Reward for a nucleotide match.
penalty	blastn-short	integer	-3	Penalty for a nucleotide mismatch.
strand	all	string	both	Query strand(s) to search against database/subject. Choice of both, minus, or plus.
dust	all	string	20 64 1	Filter query sequence with dust.
filtering_db	all	string	none	Mask query using the sequences in this database.
window_masker_taxid	all	integer	none	Enable WindowMasker filtering using a Taxonomic ID.
window_masker_db	all	string	none	Enable WindowMasker filtering using this file.
soft_masking	all	boolean	true	Apply filtering locations as soft masks (i.e., only for finding initial matches).
lcase_masking	all	flag	N/A	Use lower case filtering in query and subject sequence(s).
db_soft_mask	all	integer	none	Filtering algorithm ID to apply to the BLAST database as soft mask (i.e., only for finding initial matches).
db_hard_mask	all	integer	none	Filtering algorithm ID to apply to the BLAST database as hard mask (i.e., sequence is masked for all phases of search).
perc_identity	all	integer	0	Percent identity cutoff.
template_type	dc-megablast	string	coding	Discontiguous MegaBLAST template type. Allowed values are coding, optimal and coding_and_optimal.
template_length	dc-megablast	integer	18	Discontiguous MegaBLAST template length.
use_index	megablast	boolean	false	Use MegaBLAST database index. Indices may be created with the makembindex application.
index_name	megablast	string	none	MegaBLAST database index name.
xdrop_ungap	all	real	20	Heuristic value (in bits) for ungapped extensions.
xdrop_gap	all	real	30	Heuristic value (in bits) for preliminary gapped extensions.
xdrop_gap_final	all	real	100	Heuristic value (in bits) for final gapped alignment.
no_greedy	megablast	flag	N/A	Use non-greedy dynamic programming extension.
min_raw_gapped_score	all	integer	none	Minimum raw gapped score to keep an alignment in the preliminary gapped and trace-back stages. Normally set based upon expect value.
ungapped	all	flag	N/A	Perform ungapped alignment.
window_size	dc-megablast	integer	40	Multiple hits window size, use 0 to specify 1-hit algorithm
        </params>
        <command>blastn.exe</command>
    </module>
    <module>
        <name>blastp</name>
        <category>search</category>
        <description>The blastp application searches a protein sequence against protein subject sequences or a protein database.
An option of type "flag" takes no arguments, but if present the argument is true.
Three different tasks are supported:
        1.) "blastp", for standard protein-protein comparisons,
        2.) "blastp-short", optimized for query sequences shorter than 30 residues, and
        3.) "blastp-fast", a faster version that uses a larger word-size per https://www.ncbi.nlm.nih.gov/pubmed/17921491.
This table reflects the 2.2.27 BLAST+ release.
Full documentation for this tool can be found on the following page: https://www.ncbi.nlm.nih.gov/books/NBK279684/#appendices.Options_for_the_commandline_a
        </description>
        <inputFile>[file].fasta</inputFile>
        <inputParam>true</inputParam>
        <outputFile_required>true</outputFile_required>
        <outputFile>[results].out</outputFile>
        <outputParam>true</outputParam>
        <params>Parameters common to all BLAST+ search modules:
Option  Type    Default value   Description/notes
db	string	none	BLAST database name.
query	string	stdin	Query file name.
query_loc	string	none	Location on the query sequence (Format: start-stop)
out	string	stdout	Output file name
evalue	real	10.0	Expect value (E) for saving hits
subject	string	none	File with subject sequence(s) to search.
subject_loc	string	none	Location on the subject sequence (Format: start-stop).
show_gis	flag	N/A	Show NCBI GIs in report.
num_descriptions	integer	500	Show one-line descriptions for this number of database sequences.
num_alignments	integer	250	Show alignments for this number of database sequences.
max_target_seqs	integer	500	Number of aligned sequences to keep. Use with report formats that do not have separate definition line and alignment sections such as tabular (all outfmt > 4). Not compatible with num_descriptions or num_alignments.
max_hsps	integer	none	Maximum number of HSPs (alignments) to keep for any single query-subject pair. The HSPs shown will be the best as judged by expect value. This number should be an integer that is one or greater. If this option is not set, BLAST shows all HSPs meeting the expect value criteria. Setting it to one will show only the best HSP for every query-subject pair.
html	flag	N/A	Produce HTML output
gilist	string	none	Restrict search of database to GIs listed in this file. Local searches only.
negative_gilist	string	none	Restrict search of database to everything except the GIs listed in this file. Local searches only.
entrez_query	string	none	Restrict search with the given Entrez query. Remote searches only.
culling_limit	integer	none	Delete a hit that is enveloped by at least this many higher-scoring hits.
best_hit_overhang	real	none	Best Hit algorithm overhang value (recommended value: 0.1)
best_hit_score_edge	real	none	Best Hit algorithm score edge value (recommended value: 0.1)
dbsize	integer	none	Effective size of the database
searchsp	integer	none	Effective length of the search space
import_search_strategy	string	none	Search strategy file to read.
export_search_strategy	string	none	Record search strategy to this file.
parse_deflines	flag	N/A	Parse query and subject bar delimited sequence identifiers (e.g., gi|129295).
num_threads	integer	1	Number of threads (CPUs) to use in blast search.
remote	flag	N/A	Execute search on NCBI servers?
outfmt	string	0	alignment view options:
                    0 = pairwise,
                    1 = query-anchored showing identities,
                    2 = query-anchored no identities,
                    3 = flat query-anchored, show identities,
                    4 = flat query-anchored, no identities,
                    5 = XML Blast output,
                    6 = tabular,
                    7 = tabular with comment lines,
                    8 = Text ASN.1,
                    9 = Binary ASN.1
                    10 = Comma-separated values
                    11 = BLAST archive format (ASN.1)
                    Options 6, 7, and 10 can be additionally configured to produce a custom format specified by space delimited format specifiers.
                    The supported format specifiers are:
                    qseqid means Query Seq-id
                    qgi means Query GI
                    qacc means Query accession
                    sseqid means Subject Seq-id
                    sallseqid means All subject Seq-id(s), separated by a ';'
                    sgi means Subject GI
                    sallgi means All subject GIs
                    sacc means Subject accession
                    sallacc means All subject accessions
                    qstart means Start of alignment in query
                    qend means End of alignment in query
                    sstart means Start of alignment in subject
                    send means End of alignment in subject
                    qseq means Aligned part of query sequence
                    sseq means Aligned part of subject sequence
                    evalue means Expect value
                    bitscore means Bit score
                    score means Raw score
                    length means Alignment length
                    pident means Percentage of identical matches
                    nident means Number of identical matches
                    mismatch means Number of mismatches
                    positive means Number of positive-scoring matches
                    gapopen means Number of gap openings
                    gaps means Total number of gaps
                    ppos means Percentage of positive-scoring matches
                    frames means Query and subject frames separated by a '/'
                    qframe means Query frame
                    sframe means Subject frame
                    btop means Blast traceback operations (BTOP)
                    staxids means unique Subject Taxonomy ID(s), separated by a ';' (in numerical order)
                    sscinames means unique Subject Scientific Name(s), separated by a ';'
                    scomnames means unique Subject Common Name(s), separated by a ';'
                    sblastnames means unique Subject Blast Name(s), separated by a ';' (in alphabetical order)
                    sskingdoms means unique Subject Super Kingdom(s), separated by a ';' (in alphabetical order)
                    stitle means Subject Title
                    salltitles means All Subject Title(s), separated by a '&lt;&gt;'
                    sstrand means Subject Strand
                    qcovs means Query Coverage Per Subject (for all HSPs)
                    qcovhsp means Query Coverage Per HSP
                    qcovus is a measure of Query Coverage that counts a position in a subject sequence for this measure only once. The second time the position is aligned to the query is not counted towards this measure.
                    When not provided, the default value is:
                    'qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore', which is equivalent to the keyword 'std'
MODULE SPECIFIC PARAMS:
option	task	type	default value	description and notes
word_size	blastp	integer	3	Word size of initial match. Valid word sizes are 2-7.
word_size	blastp-short	integer	2	Word size of initial match.
word_size	blastp-fast	integer	6	Word size of initial match
gapopen	blastp and blastp-fast	integer	11	Cost to open a gap.
gapextend	blastp and blastp-fast	integer	1	Cost to extend a gap.
gapopen	blastp-short	integer	9	Cost to open a gap.
gapextend	blastp-short	integer	1	Cost to extend a gap.
matrix	blastp and blastp-fast	string	BLOSUM62	Scoring matrix name.
matrix	blastp-short	string	PAM30	Scoring matrix name.
threshold	blastp	integer	11	Minimum score to add a word to the BLAST lookup table.
threshold	blastp-short	integer	16	Minimum score to add a word to the BLAST lookup table.
threshold	blastp-fast	integer	21	Minimum score to add a word to the BLAST lookup table.
comp_based_stats	blastp and blastp-fast	string	2	Use composition-based statistics:
D or d: default (equivalent to 2)
0 or F or f: no composition-based statistics
1: Composition-based statistics as in NAR 29:2994-3005, 2001
2 or T or t: Composition-based score adjustment as in Bioinformatics 21:902-911, 2005, conditioned on sequence properties
3: Composition-based score adjustment as in Bioinformatics 21:902-911, 2005, unconditionally
comp_based_stats	blastp-short	string	0	Use composition-based statistics:
D or d: default (equivalent to 2)
0 or F or f: no composition-based statistics
1: Composition-based statistics as in NAR 29:2994-3005, 2001
2 or T or t: Composition-based score adjustment as in Bioinformatics 21:902-911, 2005, conditioned on sequence properties
3: Composition-based score adjustment as in Bioinformatics 21:902-911, 2005, unconditionally
seg	all	string	no	Filter query sequence with SEG (Format: 'yes', 'window locut hicut', or 'no' to disable).
soft_masking	all	boolean	false	Apply filtering locations as soft masks (i.e., only for finding initial matches).
lcase_masking	all	flag	N/A	Use lower case filtering in query and subject sequence(s).
db_soft_mask	all	integer	none	Filtering algorithm ID to apply to the BLAST database as soft mask (i.e., only for finding initial matches).
db_hard_mask	all	integer	none	Filtering algorithm ID to apply to the BLAST database as hard mask (i.e., sequence is masked for all phases of search).
xdrop_gap_final	all	real	25	Heuristic value (in bits) for final gapped alignment.
window_size	blastp and blastp-fast	integer	40	Multiple hits window size, use 0 to specify 1-hit algorithm.
window_size	blastp-short	integer	15	Multiple hits window size, use 0 to specify 1-hit algorithm.
use_sw_tback	all	flag	N/A	Compute locally optimal Smith-Waterman alignments?</params>
        <command>blastp.exe</command>
    </module>
    <module>
        <name>blastx</name>
        <category>search</category>
        <description>The blastx application translates a nucleotide query and searches it against protein subject sequences or a protein database. 
Two different tasks are supported:
    1.) "blastx" for standard translated nucleotide-protein comparison and
    2.) "blastx-fast", a faster version that uses a larger word-size based on https://www.ncbi.nlm.nih.gov/pubmed/17921491.
Full documentation for this tool can be found on the following page: https://www.ncbi.nlm.nih.gov/books/NBK279684/#appendices.Options_for_the_commandline_a
        </description>
        <inputFile>[file].fasta</inputFile>
        <inputParam>true</inputParam>
        <outputFile_required>true</outputFile_required>
        <outputFile>[results].out</outputFile>
        <outputParam>true</outputParam>
        <params>Parameters common to all BLAST+ search modules:
Option  Type    Default value   Description/notes
db	string	none	BLAST database name.
query	string	stdin	Query file name.
query_loc	string	none	Location on the query sequence (Format: start-stop)
out	string	stdout	Output file name
evalue	real	10.0	Expect value (E) for saving hits
subject	string	none	File with subject sequence(s) to search.
subject_loc	string	none	Location on the subject sequence (Format: start-stop).
show_gis	flag	N/A	Show NCBI GIs in report.
num_descriptions	integer	500	Show one-line descriptions for this number of database sequences.
num_alignments	integer	250	Show alignments for this number of database sequences.
max_target_seqs	integer	500	Number of aligned sequences to keep. Use with report formats that do not have separate definition line and alignment sections such as tabular (all outfmt > 4). Not compatible with num_descriptions or num_alignments.
max_hsps	integer	none	Maximum number of HSPs (alignments) to keep for any single query-subject pair. The HSPs shown will be the best as judged by expect value. This number should be an integer that is one or greater. If this option is not set, BLAST shows all HSPs meeting the expect value criteria. Setting it to one will show only the best HSP for every query-subject pair.
html	flag	N/A	Produce HTML output
gilist	string	none	Restrict search of database to GIs listed in this file. Local searches only.
negative_gilist	string	none	Restrict search of database to everything except the GIs listed in this file. Local searches only.
entrez_query	string	none	Restrict search with the given Entrez query. Remote searches only.
culling_limit	integer	none	Delete a hit that is enveloped by at least this many higher-scoring hits.
best_hit_overhang	real	none	Best Hit algorithm overhang value (recommended value: 0.1)
best_hit_score_edge	real	none	Best Hit algorithm score edge value (recommended value: 0.1)
dbsize	integer	none	Effective size of the database
searchsp	integer	none	Effective length of the search space
import_search_strategy	string	none	Search strategy file to read.
export_search_strategy	string	none	Record search strategy to this file.
parse_deflines	flag	N/A	Parse query and subject bar delimited sequence identifiers (e.g., gi|129295).
num_threads	integer	1	Number of threads (CPUs) to use in blast search.
remote	flag	N/A	Execute search on NCBI servers?
outfmt	string	0	alignment view options:
                    0 = pairwise,
                    1 = query-anchored showing identities,
                    2 = query-anchored no identities,
                    3 = flat query-anchored, show identities,
                    4 = flat query-anchored, no identities,
                    5 = XML Blast output,
                    6 = tabular,
                    7 = tabular with comment lines,
                    8 = Text ASN.1,
                    9 = Binary ASN.1
                    10 = Comma-separated values
                    11 = BLAST archive format (ASN.1)
                    Options 6, 7, and 10 can be additionally configured to produce a custom format specified by space delimited format specifiers.
                    The supported format specifiers are:
                    qseqid means Query Seq-id
                    qgi means Query GI
                    qacc means Query accession
                    sseqid means Subject Seq-id
                    sallseqid means All subject Seq-id(s), separated by a ';'
                    sgi means Subject GI
                    sallgi means All subject GIs
                    sacc means Subject accession
                    sallacc means All subject accessions
                    qstart means Start of alignment in query
                    qend means End of alignment in query
                    sstart means Start of alignment in subject
                    send means End of alignment in subject
                    qseq means Aligned part of query sequence
                    sseq means Aligned part of subject sequence
                    evalue means Expect value
                    bitscore means Bit score
                    score means Raw score
                    length means Alignment length
                    pident means Percentage of identical matches
                    nident means Number of identical matches
                    mismatch means Number of mismatches
                    positive means Number of positive-scoring matches
                    gapopen means Number of gap openings
                    gaps means Total number of gaps
                    ppos means Percentage of positive-scoring matches
                    frames means Query and subject frames separated by a '/'
                    qframe means Query frame
                    sframe means Subject frame
                    btop means Blast traceback operations (BTOP)
                    staxids means unique Subject Taxonomy ID(s), separated by a ';' (in numerical order)
                    sscinames means unique Subject Scientific Name(s), separated by a ';'
                    scomnames means unique Subject Common Name(s), separated by a ';'
                    sblastnames means unique Subject Blast Name(s), separated by a ';' (in alphabetical order)
                    sskingdoms means unique Subject Super Kingdom(s), separated by a ';' (in alphabetical order)
                    stitle means Subject Title
                    salltitles means All Subject Title(s), separated by a '&lt;&gt;'
                    sstrand means Subject Strand
                    qcovs means Query Coverage Per Subject (for all HSPs)
                    qcovhsp means Query Coverage Per HSP
                    qcovus is a measure of Query Coverage that counts a position in a subject sequence for this measure only once. The second time the position is aligned to the query is not counted towards this measure.
                    When not provided, the default value is:
                    'qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore', which is equivalent to the keyword 'std'
MODULE SPECIFIC PARAMS:
option	task	type	default value	description and notes
word_size	blastx	integer	3	Word size for initial match. Valid word sizes are 2-7.
word_size	blastx-fast	integer	6	Word size for initial match.
gapopen	all	integer	11	Cost to open a gap.
gapextend	all	integer	1	Cost to extend a gap.
matrix	all	string	BLOSUM62	Scoring matrix name.
threshold	blastx	integer	12	Minimum score to add a word to the BLAST lookup table.
threshold	blastx-fast	integer	21	Minimum score to add a word to the BLAST lookup table.
seg	all	string	12 2.2 2.5	Filter query sequence with SEG (Format: 'yes', 'window locut hicut', or 'no' to disable).
soft_masking	all	boolean	false	Apply filtering locations as soft masks (i.e., only for finding initial matches).
lcase_masking	all	flag	N/A	Use lower case filtering in query and subject sequence(s).
db_soft_mask	all	integer	none	Filtering algorithm ID to apply to the BLAST database as soft mask (i.e., only for finding initial matches).
db_hard_mask	all	integer	none	Filtering algorithm ID to apply to the BLAST database as hard mask (i.e., sequence is masked for all phases of search).
xdrop_gap_final	all	real	25	Heuristic value (in bits) for final gapped alignment.
window_size	all	integer	40	Multiple hits window size, use 0 to specify 1-hit algorithm.
strand	all	string	both	Query strand(s) to search against database/subject. Choice of both, minus, or plus.
query_genetic_code	all	integer	1	Genetic code to translate query, see ftp://ftp.ncbi.nih.gov/entrez/misc/data/gc.prt
max_intron_length	all	integer	0	Length of the largest intron allowed in a translated nucleotide sequence when linking multiple distinct alignments (a negative value disables linking).
comp_based_stats	all	string	2	Use composition-based statistics for blastx:
                                            D or d: default (equivalent to 2)
                                            0 or F or f: no composition-based statistics
                                            1: Composition-based statistics as in NAR 29:2994-3005, 2001
                                            2 or T or t: Composition-based score adjustment as in Bioinformatics 21:902-911, 2005, conditioned on sequence properties
                                            3: Composition-based score adjustment as in Bioinformatics 21:902-911, 2005, unconditionally
                                            Default = `2'
        </params>
        <command>blastx.exe</command>
    </module>
    <module>
        <name>tblastn</name>
        <category>search</category>
        <description>The tblastn application searches a protein query against nucleotide subject sequences or a nucleotide database translated at search time. 
Two different tasks are supported:
    1.) "tblastn" for a standard protein-translated nucleotide comparison and
    2.) "tblastn-fast" for a faster version with a larger word-size based on https://www.ncbi.nlm.nih.gov/pubmed/17921491.
Full documentation for this tool can be found on the following page: https://www.ncbi.nlm.nih.gov/books/NBK279684/#appendices.Options_for_the_commandline_a
        </description>
        <inputFile>[file].fasta</inputFile>
        <inputParam>true</inputParam>
        <outputFile_required>true</outputFile_required>
        <outputFile>[results].out</outputFile>
        <outputParam>true</outputParam>
        <params>Parameters common to all BLAST+ search modules:
Option  Type    Default value   Description/notes
db	string	none	BLAST database name.
query	string	stdin	Query file name.
query_loc	string	none	Location on the query sequence (Format: start-stop)
out	string	stdout	Output file name
evalue	real	10.0	Expect value (E) for saving hits
subject	string	none	File with subject sequence(s) to search.
subject_loc	string	none	Location on the subject sequence (Format: start-stop).
show_gis	flag	N/A	Show NCBI GIs in report.
num_descriptions	integer	500	Show one-line descriptions for this number of database sequences.
num_alignments	integer	250	Show alignments for this number of database sequences.
max_target_seqs	integer	500	Number of aligned sequences to keep. Use with report formats that do not have separate definition line and alignment sections such as tabular (all outfmt > 4). Not compatible with num_descriptions or num_alignments.
max_hsps	integer	none	Maximum number of HSPs (alignments) to keep for any single query-subject pair. The HSPs shown will be the best as judged by expect value. This number should be an integer that is one or greater. If this option is not set, BLAST shows all HSPs meeting the expect value criteria. Setting it to one will show only the best HSP for every query-subject pair.
html	flag	N/A	Produce HTML output
gilist	string	none	Restrict search of database to GIs listed in this file. Local searches only.
negative_gilist	string	none	Restrict search of database to everything except the GIs listed in this file. Local searches only.
entrez_query	string	none	Restrict search with the given Entrez query. Remote searches only.
culling_limit	integer	none	Delete a hit that is enveloped by at least this many higher-scoring hits.
best_hit_overhang	real	none	Best Hit algorithm overhang value (recommended value: 0.1)
best_hit_score_edge	real	none	Best Hit algorithm score edge value (recommended value: 0.1)
dbsize	integer	none	Effective size of the database
searchsp	integer	none	Effective length of the search space
import_search_strategy	string	none	Search strategy file to read.
export_search_strategy	string	none	Record search strategy to this file.
parse_deflines	flag	N/A	Parse query and subject bar delimited sequence identifiers (e.g., gi|129295).
num_threads	integer	1	Number of threads (CPUs) to use in blast search.
remote	flag	N/A	Execute search on NCBI servers?
outfmt	string	0	alignment view options:
                    0 = pairwise,
                    1 = query-anchored showing identities,
                    2 = query-anchored no identities,
                    3 = flat query-anchored, show identities,
                    4 = flat query-anchored, no identities,
                    5 = XML Blast output,
                    6 = tabular,
                    7 = tabular with comment lines,
                    8 = Text ASN.1,
                    9 = Binary ASN.1
                    10 = Comma-separated values
                    11 = BLAST archive format (ASN.1)
                    Options 6, 7, and 10 can be additionally configured to produce a custom format specified by space delimited format specifiers.
                    The supported format specifiers are:
                    qseqid means Query Seq-id
                    qgi means Query GI
                    qacc means Query accession
                    sseqid means Subject Seq-id
                    sallseqid means All subject Seq-id(s), separated by a ';'
                    sgi means Subject GI
                    sallgi means All subject GIs
                    sacc means Subject accession
                    sallacc means All subject accessions
                    qstart means Start of alignment in query
                    qend means End of alignment in query
                    sstart means Start of alignment in subject
                    send means End of alignment in subject
                    qseq means Aligned part of query sequence
                    sseq means Aligned part of subject sequence
                    evalue means Expect value
                    bitscore means Bit score
                    score means Raw score
                    length means Alignment length
                    pident means Percentage of identical matches
                    nident means Number of identical matches
                    mismatch means Number of mismatches
                    positive means Number of positive-scoring matches
                    gapopen means Number of gap openings
                    gaps means Total number of gaps
                    ppos means Percentage of positive-scoring matches
                    frames means Query and subject frames separated by a '/'
                    qframe means Query frame
                    sframe means Subject frame
                    btop means Blast traceback operations (BTOP)
                    staxids means unique Subject Taxonomy ID(s), separated by a ';' (in numerical order)
                    sscinames means unique Subject Scientific Name(s), separated by a ';'
                    scomnames means unique Subject Common Name(s), separated by a ';'
                    sblastnames means unique Subject Blast Name(s), separated by a ';' (in alphabetical order)
                    sskingdoms means unique Subject Super Kingdom(s), separated by a ';' (in alphabetical order)
                    stitle means Subject Title
                    salltitles means All Subject Title(s), separated by a '&lt;&gt;'
                    sstrand means Subject Strand
                    qcovs means Query Coverage Per Subject (for all HSPs)
                    qcovhsp means Query Coverage Per HSP
                    qcovus is a measure of Query Coverage that counts a position in a subject sequence for this measure only once. The second time the position is aligned to the query is not counted towards this measure.
                    When not provided, the default value is:
                    'qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore', which is equivalent to the keyword 'std'
MODULE SPECIFIC PARAMS:
option	task	type	default value	description and notes
word_size	tblastn	integer	3	Word size for initial match. Valid word sizes are 2-7.
word_size	tblastn-fast	integer	6	Word size for initial match.
gapopen	all	integer	11	Cost to open a gap.
gapextend	all	integer	1	Cost to extend a gap.
matrix	all	string	BLOSUM62	Scoring matrix name.
threshold	tblastn	integer	13	Minimum score to add a word to the BLAST lookup table.
threshold	tblastn-fast	integer	21	Minimum score to add a word to the BLAST lookup table.
seg	all	string	12 2.2 2.5	Filter query sequence with SEG (Format: 'yes', 'window locut hicut', or 'no' to disable).
soft_masking	all	boolean	false	Apply filtering locations as soft masks (i.e., only for finding initial matches).
lcase_masking	all	flag	N/A	Use lower case filtering in query and subject sequence(s).
db_soft_mask	all	integer	none	Filtering algorithm ID to apply to the BLAST database as soft mask (i.e., only for finding initial matches).
db_hard_mask	all	integer	none	Filtering algorithm ID to apply to the BLAST database as hard mask (i.e., sequence is masked for all phases of search).
xdrop_gap_final	all	real	25	Heuristic value (in bits) for final gapped alignment.
window_size	all	integer	40	Multiple hits window size, use 0 to specify 1-hit algorithm.
db_gen_code	all	integer	1	Genetic code to translate subject sequences, see ftp://ftp.ncbi.nih.gov/entrez/misc/data/gc.prt
max_intron_length	all	integer	0	Length of the largest intron allowed in a translated nucleotide sequence when linking multiple distinct alignments (a negative value disables linking).
comp_based_stats	all	string	2	Use composition-based statistics for tblastn:
                                        D or d: default (equivalent to 2)
                                        0 or F or f: no composition-based statistics
                                        1: Composition-based statistics as in NAR 29:2994-3005, 2001
                                        2 or T or t: Composition-based score adjustment as in Bioinformatics 21:902-911, 2005, conditioned on sequence properties
                                        3: Composition-based score adjustment as in Bioinformatics 21:902-911, 2005, unconditionally
                                        Default = `2'
        </params>
        <command>tblastn.exe</command>
    </module>
    <module>
        <name>tblastx</name>
        <category>search</category>
        <description>The tblastx application searches a translated nucleotide query against translated nucleotide subject sequences or a translated nucleotide database.
An option of type "flag" takes no argument; when the flag is present, its value is true. This table reflects the 2.2.27 BLAST+ release.
Only ungapped searches are supported for tblastx.
Full documentation for this tool can be found on the following page: https://www.ncbi.nlm.nih.gov/books/NBK279684/#appendices.Options_for_the_commandline_a
        </description>
        <inputFile>[file].fasta</inputFile>
        <inputParam>true</inputParam>
        <outputFile_required>true</outputFile_required>
        <outputFile>[results].out</outputFile>
        <outputParam>true</outputParam>
        <params>Parameters common to all BLAST+ search modules:
Option  Type    Default value   Description/notes
db	string	none	BLAST database name.
query	string	stdin	Query file name.
query_loc	string	none	Location on the query sequence (Format: start-stop)
out	string	stdout	Output file name
evalue	real	10.0	Expect value (E) for saving hits
subject	string	none	File with subject sequence(s) to search.
subject_loc	string	none	Location on the subject sequence (Format: start-stop).
show_gis	flag	N/A	Show NCBI GIs in report.
num_descriptions	integer	500	Show one-line descriptions for this number of database sequences.
num_alignments	integer	250	Show alignments for this number of database sequences.
max_target_seqs	integer	500	Number of aligned sequences to keep. Use with report formats that do not have separate definition line and alignment sections such as tabular (all outfmt > 4). Not compatible with num_descriptions or num_alignments.
max_hsps	integer	none	Maximum number of HSPs (alignments) to keep for any single query-subject pair. The HSPs shown will be the best as judged by expect value. This number should be an integer that is one or greater. If this option is not set, BLAST shows all HSPs meeting the expect value criteria. Setting it to one will show only the best HSP for every query-subject pair.
html	flag	N/A	Produce HTML output
gilist	string	none	Restrict search of database to GIs listed in this file. Local searches only.
negative_gilist	string	none	Restrict search of database to everything except the GIs listed in this file. Local searches only.
entrez_query	string	none	Restrict search with the given Entrez query. Remote searches only.
culling_limit	integer	none	Delete a hit that is enveloped by at least this many higher-scoring hits.
best_hit_overhang	real	none	Best Hit algorithm overhang value (recommended value: 0.1)
best_hit_score_edge	real	none	Best Hit algorithm score edge value (recommended value: 0.1)
dbsize	integer	none	Effective size of the database
searchsp	integer	none	Effective length of the search space
import_search_strategy	string	none	Search strategy file to read.
export_search_strategy	string	none	Record search strategy to this file.
parse_deflines	flag	N/A	Parse query and subject bar delimited sequence identifiers (e.g., gi|129295).
num_threads	integer	1	Number of threads (CPUs) to use in blast search.
remote	flag	N/A	Execute search on NCBI servers?
outfmt	string	0	alignment view options:
                    0 = pairwise,
                    1 = query-anchored showing identities,
                    2 = query-anchored no identities,
                    3 = flat query-anchored, show identities,
                    4 = flat query-anchored, no identities,
                    5 = XML Blast output,
                    6 = tabular,
                    7 = tabular with comment lines,
                    8 = Text ASN.1,
                    9 = Binary ASN.1
                    10 = Comma-separated values
                    11 = BLAST archive format (ASN.1)
                    Options 6, 7, and 10 can be additionally configured to produce a custom format specified by space delimited format specifiers.
                    The supported format specifiers are:
                    qseqid means Query Seq-id
                    qgi means Query GI
                    qacc means Query accession
                    sseqid means Subject Seq-id
                    sallseqid means All subject Seq-id(s), separated by a ';'
                    sgi means Subject GI
                    sallgi means All subject GIs
                    sacc means Subject accession
                    sallacc means All subject accessions
                    qstart means Start of alignment in query
                    qend means End of alignment in query
                    sstart means Start of alignment in subject
                    send means End of alignment in subject
                    qseq means Aligned part of query sequence
                    sseq means Aligned part of subject sequence
                    evalue means Expect value
                    bitscore means Bit score
                    score means Raw score
                    length means Alignment length
                    pident means Percentage of identical matches
                    nident means Number of identical matches
                    mismatch means Number of mismatches
                    positive means Number of positive-scoring matches
                    gapopen means Number of gap openings
                    gaps means Total number of gaps
                    ppos means Percentage of positive-scoring matches
                    frames means Query and subject frames separated by a '/'
                    qframe means Query frame
                    sframe means Subject frame
                    btop means Blast traceback operations (BTOP)
                    staxids means unique Subject Taxonomy ID(s), separated by a ';' (in numerical order)
                    sscinames means unique Subject Scientific Name(s), separated by a ';'
                    scomnames means unique Subject Common Name(s), separated by a ';'
                    sblastnames means unique Subject Blast Name(s), separated by a ';' (in alphabetical order)
                    sskingdoms means unique Subject Super Kingdom(s), separated by a ';' (in alphabetical order)
                    stitle means Subject Title
                    salltitles means All Subject Title(s), separated by a '&lt;&gt;'
                    sstrand means Subject Strand
                    qcovs means Query Coverage Per Subject (for all HSPs)
                    qcovhsp means Query Coverage Per HSP
                    qcovus is a measure of Query Coverage that counts a position in a subject sequence for this measure only once. The second time the position is aligned to the query is not counted towards this measure.
                    When not provided, the default value is:
                    'qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore', which is equivalent to the keyword 'std'
MODULE SPECIFIC PARAMS:
option	type	default value	description and notes
word_size	integer	3	Word size for initial match.
matrix	string	BLOSUM62	Scoring matrix name.
threshold	integer	13	Minimum word score to add the word to the BLAST lookup table.
seg	string	12 2.2 2.5	Filter query sequence with SEG (Format: 'yes', 'window locut hicut', or 'no' to disable).
soft_masking	boolean	false	Apply filtering locations as soft masks (i.e., only for finding initial matches).
lcase_masking	flag	N/A	Use lower case filtering in query and subject sequence(s).
db_soft_mask	integer	none	Filtering algorithm ID to apply to the BLAST database as soft mask (i.e., only for finding initial matches).
db_hard_mask	integer	none	Filtering algorithm ID to apply to the BLAST database as hard mask (i.e., sequence is masked for all phases of search).
strand	string	both	Query strand(s) to search against database subject sequences. Choice of both, minus, or plus.
query_genetic_code	integer	1	Genetic code to translate query, see ftp://ftp.ncbi.nih.gov/entrez/misc/data/gc.prt
db_gen_code	integer	1	Genetic code to translate subject sequences, see ftp://ftp.ncbi.nih.gov/entrez/misc/data/gc.prt
max_intron_length	integer	0	Length of the largest intron allowed in a translated nucleotide sequence when linking multiple distinct alignments (a negative value disables linking)
        </params>
        <command>tblastx.exe</command>
    </module>
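    <!-- Usage sketch for the outfmt option of tblastx (illustrative only; query.fasta, nucl_db, and results.tsv are hypothetical names, and the query, db, and out options are assumed to be the common BLAST+ options).
         Tabular output (format 6) with the default 'std' columns plus the query and subject frames:
             tblastx.exe -query query.fasta -db nucl_db -outfmt "6 std qframe sframe" -out results.tsv
    -->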
    <!-- <module>
        <name>rpsblast</name>
        <category>search</category>
        <description>The rpsblast application searches a protein query against the conserved domain database (CDD), which is a set of protein profiles. 
        Many of the common options such as matrix or word threshold are set when the CDD is built and cannot be changed by the rpsblast application. 
        A search ready CDD can be downloaded from ftp://ftp.ncbi.nih.gov/pub/mmdb/cdd/</description>
        <inputFile>[file].fasta</inputFile>
        <inputParam>true</inputParam>
        <outputFile_required>true</outputFile_required>
        <outputFile>[results].out</outputFile>
        <outputParam>true</outputParam>
        <params>
        
        </params>
        <command></command>
    </module> -->
    <module>
        <name>Makeblastdb</name>
        <category>Database Builder</category>
        <description>This application builds a BLAST database. An option of type "flag" takes no arguments, but if present the argument is true.
Full documentation for this tool can be found on the following page: https://www.ncbi.nlm.nih.gov/books/NBK279684/#appendices.Options_for_the_commandline_a
        </description>
        <inputFile>fasta: for FASTA file(s)
blastdb: for BLAST database(s)
asn1_txt: for Seq-entries in text ASN.1 format
asn1_bin: for Seq-entries in binary ASN.1 format</inputFile>
        <inputParam>true</inputParam>
        <outputFile_required>true</outputFile_required>
        <outputFile>true</outputFile>
        <outputParam>true</outputParam>
        <params>option	type	default value	Description and notes
in	string	stdin	Input file/database name
input_type	string	fasta	Input file type; it may be any of the following:
fasta: for FASTA file(s)
blastdb: for BLAST database(s)
asn1_txt: for Seq-entries in text ASN.1 format
asn1_bin: for Seq-entries in binary ASN.1 format
dbtype	string	prot	Molecule type of input, values can be nucl or prot.
title	string	none	Title for BLAST database. If not set, the input file name will be used.
parse_seqids	flag	N/A	Parse bar-delimited sequence identifiers (e.g., gi|129295) in FASTA input.
hash_index	flag	N/A	Create index of sequence hash values.
mask_data	string	none	Comma-separated list of input files containing masking data as produced by NCBI masking applications (e.g. dustmasker, segmasker, windowmasker).
out	string	input file name	Name of BLAST database to be created. Input file name is used if none provided. This field is required if input consists of multiple files.
max_file_size	string	1GB	Maximum file size to use for BLAST database.
taxid	integer	none	Taxonomy ID to assign to all sequences.
taxid_map	string	none	File with two columns mapping sequence ID to the taxonomy ID. The first column is the sequence ID represented as one of:
                                1. fasta with accessions (e.g., emb|X17276.1|)
                                2. fasta with GI (e.g., gi|4)
                                3. GI as a bare number (e.g., 4)
                                4. A local ID. The local ID must be prefixed with "lcl" (e.g., lcl|4).
                                The second column should be the NCBI taxonomy ID (e.g., 9606 for human).
logfile	string	none	Program log file (default is stderr).
        </params>
        <command>makeblastdb.exe</command>
    </module>
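    <!-- Usage sketch for Makeblastdb (illustrative only; proteins.fasta and proteins_db are hypothetical names).
         Build a protein database from a FASTA file, parsing the sequence identifiers:
             makeblastdb.exe -in proteins.fasta -input_type fasta -dbtype prot -parse_seqids -out proteins_db
    -->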
    <module>
        <name>Makeprofiledb</name>
        <category>Database</category>
        <description>This application builds an RPS-BLAST database. An option of type "flag" takes no arguments, but if present the argument is true.
COBALT (a multiple sequence alignment program) and DELTA-BLAST both use RPS-BLAST searches as part of their processing, but use specialized versions of the database.
This application can build databases for COBALT, DELTA-BLAST, and a standard RPS-BLAST search.
The "dbtype" option (see entry in table) determines which flavor of the database is built.
Full documentation for this tool can be found on the following page: https://www.ncbi.nlm.nih.gov/books/NBK279684/#appendices.Options_for_the_commandline_a</description>
        <inputFile>Input file that contains a list of scoremat files (delimited by space, tab, or newline)</inputFile>
        <inputParam>true</inputParam>
        <outputFile_required>true</outputFile_required>
        <outputFile>Name of BLAST database to be created. Input file name is used if none provided.</outputFile>
        <outputParam>true</outputParam>
        <params>option	type	default value	Description and notes
in	string	stdin	Input file that contains a list of scoremat files (delimited by space, tab, or newline)
binary	flag	N/A	The scoremat files are binary ASN.1
title	string	none	Title for RPS-BLAST database. If not set, the input file name will be used.
threshold	real	9.82	Threshold for RPSBLAST lookup table.
out	string	input file name	Name of BLAST database to be created. Input file name is used if none provided.
max_file_size	string	1GB	Maximum file size to use for BLAST database.
dbtype	string	rps	Specifies use for RPSBLAST db. One of rps, cobalt, or delta.
index	flag	N/A	Creates index files.
gapopen	integer	none	Cost to open a gap. Used only if scoremat files do not contain PSSM scores, otherwise ignored.
gapextend	integer	none	Cost to extend a gap by one residue. Used only if scoremat files do not contain PSSM scores, otherwise ignored.
scale	real	100	PSSM scale factor.
matrix	string	BLOSUM62	Matrix to use in constructing PSSM. One of BLOSUM45, BLOSUM50, BLOSUM62, BLOSUM80, BLOSUM90, PAM250, PAM30 or PAM70. Used only if scoremat files do not contain PSSM scores, otherwise ignored.
obsr_threshold	real	6	Exclude domains with maximum number of independent observations below this value (for use in DELTA-BLAST searches).
exclude_invalid	boolean	true	Exclude domains that do not pass validation test (for use in DELTA-BLAST searches).
logfile	string	none	Program log file (default is stderr).
        </params>
        <command>makeprofiledb.exe</command>
    </module>
    <module>
        <name>Blastdbcmd</name>
        <category>Database</category>
        <description>This application reads a BLAST database and produces reports.
Full documentation for this tool can be found on the following page: https://www.ncbi.nlm.nih.gov/books/NBK279684/#appendices.Options_for_the_commandline_a</description>
        <inputFile>BLAST database</inputFile>
        <inputParam>true</inputParam>
        <outputFile_required>true</outputFile_required>
        <outputFile>true</outputFile>
        <outputParam>true</outputParam>
        <params>option	type	default value	description and notes
db	string	nr	BLAST database name.
dbtype	string	guess	Molecule type stored in BLAST database, one of nucl, prot, or guess.
entry	string	none	Comma-delimited search string(s) of sequence identifiers, e.g., 555, AC147927, 'gnl|dbname|tag', or 'all' to select all sequences in the database.
entry_batch	string	none	Input file for batch processing. The format requires one entry per line; each line should begin with the sequence ID followed by any of the following optional specifiers (in any order): range (format: 'from-to', inclusive in 1-offsets), strand ('plus' or 'minus'), or masking algorithm ID (integer value representing the available masking algorithm). Omitting the ending range (e.g., '10-') is supported, but there should not be any spaces around the '-'.
pig	integer	none	PIG (protein identity group) to retrieve.
info	flag	N/A	Print BLAST database information.
range	string	none	Range of sequence to extract (Format: start-stop).
strand	string	plus	Strand of nucleotide sequence to extract. Choice of plus or minus.
mask_sequence_with	string	none	Produce lower-case masked FASTA using the algorithm IDs specified.
out	string	stdout	Output file name.
outfmt	string	%f	Output format, where the available format specifiers are:
                            %f means sequence in FASTA format
                            %s means sequence data (without defline)
                            %a means accession
                            %g means gi
                            %o means ordinal id (OID)
                            %t means sequence title
                            %l means sequence length
                            %T means taxid
                            %L means common taxonomic name
                            %S means scientific name
                            %P means PIG
                            %mX means sequence masking data, where X is an optional comma-separated list of integers to specify the algorithm ID(s) to display (or all masks if absent or invalid specification). Masking data will be displayed as a series of 'N-M' values separated by ';' or the word 'none' if none are available. For every format except '%f', each line of output will correspond to a sequence.
target_only	flag	N/A	Definition line should contain target GI only.
get_dups	flag	N/A	Retrieve duplicate accessions.
line_length	integer	80	Line length for output.
ctrl_a	flag	N/A	Use Ctrl-A as the non-redundant definition line separator.
        </params>
        <command>blastdbcmd.exe</command>
    </module>
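    <!-- Usage sketch for Blastdbcmd (illustrative only; proteins_db and summary.txt are hypothetical names).
         Print the accession, sequence length, and title of every sequence in a database:
             blastdbcmd.exe -db proteins_db -entry all -outfmt "%a %l %t" -out summary.txt
         Print general information about the database:
             blastdbcmd.exe -db proteins_db -info
    -->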
    <module>
        <name>Makembindex</name>
        <category>Database</category>
        <description>The indexed databases created by makembindex are used by production MegaBLAST software and by a new srsearch utility designed to quickly search for nearly exact matches (up to one mismatch) of short queries against a genomic database. 
When a FASTA formatted file is used as the input, then masking by lower case letters is incorporated in the index.
Makembindex can currently build two types of indices, called "old style" and "new style" indexing.
The NCBI offers full support for the new style and has deprecated the old style.
A MegaBLAST search with a new style index requires that both the index and the corresponding BLAST database be present.
Full documentation for this tool can be found on the following page: https://www.ncbi.nlm.nih.gov/books/NBK279684/#appendices.Options_for_the_commandline_a</description>
        <inputFile>input</inputFile>
        <inputParam>true</inputParam>
        <outputFile_required>true</outputFile_required>
        <outputFile>true</outputFile>
        <outputParam>true</outputParam>
        <params>option	type	default value	Description and notes
input	string	stdin	Input file name or BLAST database name, depending on the value of the iformat parameter. For FASTA formatted input, this parameter is optional and defaults to the program's standard input stream.
output	string	none	The resulting index name. The index itself can consist of multiple files, called volumes, named &lt;index_name&gt;.00.idx, &lt;index_name&gt;.01.idx,...
This option should not be used with new style indices.
iformat	string	fasta	The input format selector. Possible values are 'fasta' and 'blastdb'.
old_style_index	boolean	false	The old_style_index is no longer supported. If set to 'false' the new style index is created. New style indices require a BLAST database as input (use -iformat blastdb), which can be downloaded from the NCBI FTP site or created with makeblastdb. The option -output is ignored for a new style index. New style indices are always created at the same location as the corresponding BLAST database.
db_mask	integer	none	Exclude masked regions of BLAST db from the index. Use makeblastdb to discover the algorithm ID to be used as input for this argument.
legacy	boolean	true	This is a compatibility feature to support current production MegaBLAST. If true, then -stride, -nmer, and -ws_hint are ignored. The legacy format must be used for BLAST.
nmer	integer	12	N-mer size to use. Ignored if -legacy is specified.
ws_hint	integer	28	This is an optimization hint for makembindex that indicates an expected minimum match size in searches that use the index. If n is the value of the -nmer parameter and s is the value of the -stride parameter, then the value of -ws_hint must be at least n + s - 1.
stride	integer	5	makembindex will index every stride-th N-mer of the database.
volsize	integer	1536	Target index volume size in megabytes.</params>
        <command>makembindex.exe</command>
    </module>
    <!-- <module>
        <name></name>
        <category></category>
        <description></description>
        <inputFile></inputFile>
        <inputParam></inputParam>
        <outputFile_required></outputFile_required>
        <outputFile></outputFile>
        <outputParam></outputParam>
        <params></params>
        <command></command>
    </module> -->
</modules>