
Parameter Docs

Sateesh Peri edited this page May 11, 2022 · 13 revisions
------------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  CDCgov/mycosnp-nf v1.2
------------------------------------------------------
Typical pipeline command:

  nextflow run CDCgov/mycosnp-nf -profile singularity,test

Input/output options
  --input                      [string]  Path to comma-separated file containing information about the samples in the experiment.
  --add_sra_file               [string]  Path to comma-separated file containing SRA IDs to download from NCBI. Format: Name,SRAID
  --add_vcf_file               [string]  Path to a text file listing VCF files generated by previous runs of this workflow to include in this analysis.
                                         They must use the exact same reference, and each *.tbi file must be in the same location as its VCF file.
                                         Format of each entry in the list: /path/to/vcf/file.gz
  --outdir                     [string]  Path to the output directory where the results will be saved. [default: ./results]
  --publish_dir_mode           [string]  Method used to save pipeline results to output directory. [default: copy]

Reference genome options
  --fasta                      [string]  Path to FASTA formatted reference genome file.
  --ref_dir                    [string]  Path to a directory containing the reference genome masked files, Picard dict, samtools fai, and BWA index from a
                                         previous run. Using this option invalidates `--fasta --ref_masked_fasta --ref_fai --ref_bwa --ref_dict`
  --ref_masked_fasta           [string]  Path to reference genome masked FASTA file. If you use this option, you must provide all of `--ref_masked_fasta
                                         --ref_fai --ref_bwa --ref_dict`
  --ref_fai                    [string]  Path to reference genome samtools fai file. If you use this option, you must provide all of `--ref_masked_fasta
                                         --ref_fai --ref_bwa --ref_dict`
  --ref_dict                   [string]  Path to reference genome Picard dict file. If you use this option, you must provide all of `--ref_masked_fasta
                                         --ref_fai --ref_bwa --ref_dict`
  --ref_bwa                    [string]  Path to reference genome BWA index directory. If you use this option, you must provide all of `--ref_masked_fasta
                                         --ref_fai --ref_bwa --ref_dict`
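
The four options `--ref_masked_fasta --ref_fai --ref_bwa --ref_dict` are all-or-nothing. The sketch below is a hypothetical pre-flight check, not part of the pipeline, that could be run in a wrapper script before launching; the function name `check_ref_args` and the example paths are assumptions made for illustration.

```shell
# check_ref_args MASKED_FASTA FAI BWA_DIR DICT  (hypothetical helper)
# Succeeds when none or all four of the paths are non-empty; fails otherwise.
check_ref_args() {
    given=0
    for value in "$1" "$2" "$3" "$4"; do
        if [ -n "$value" ]; then
            given=$((given + 1))
        fi
    done
    [ "$given" -eq 0 ] || [ "$given" -eq 4 ]
}

# Hypothetical paths; the BWA directory is deliberately left empty here.
if check_ref_args "ref/masked.fasta" "ref/masked.fasta.fai" "" "ref/masked.dict"; then
    echo "reference options are consistent"
else
    echo "provide all of --ref_masked_fasta --ref_fai --ref_bwa --ref_dict, or none"
fi
```

When the prepared reference files from a previous run live in one directory, passing that directory with `--ref_dir` avoids having to supply the four paths individually.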

MycoSNP Run/Save/Skip Options
  --save_reference             [boolean] Saves the reference genome/index files to the results folder [default: true]
  --save_alignment             [boolean] Saves the reference alignment BAM files to the samples results folder [default: true]
  --rapidnj                    [boolean] Build a tree using the RapidNJ neighbour-joining algorithm [default: true]
  --fasttree                   [boolean] Build a tree using the FastTree approximate ML algorithm [default: true]
  --iqtree                     [boolean] Build a tree using the IQ-TREE ML algorithm
  --raxmlng                    [boolean] Build a tree using the RAxML-NG ML algorithm
  --save_debug                 [boolean] Save intermediate files for debugging
  --skip_samples               [string]  A comma-separated list of IDs to skip for variant calling and analysis
  --skip_samples_file          [string]  A file with a newline-separated list of IDs to skip for variant calling and analysis
  --skip_combined_analysis     [boolean] Skip combined variant analysis (run only reference prep and mapping)
  --skip_phylogeny             [boolean] Skip phylogenetic tree creation
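
`--skip_samples` takes the IDs as a comma-separated string, while `--skip_samples_file` takes the same IDs one per line. As a small sketch, a comma-separated list can be converted into the file form like this; the sample IDs and the file name `skip_samples.txt` are hypothetical.

```shell
# Hypothetical sample IDs, as they would be given to --skip_samples.
skip_ids="sampleA,sampleB,sampleC"

# Write one ID per line, the layout described for --skip_samples_file.
printf '%s\n' "$skip_ids" | tr ',' '\n' > skip_samples.txt

cat skip_samples.txt
```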

MycoSNP Run Params
  --sample_ploidy              [integer] Ploidy of sample (GATK) [default: 1]
  --coverage                   [integer] If coverage is specified and rate is not, coverage is used to calculate a down-sampling rate that results in the
                                         specified coverage. For example, if coverage is 70, FASTQ files are down-sampled so that, when aligned to the
                                         reference, the result is approximately 70x coverage [default: 70]
  --rate                       [number]  If rate is specified, coverage is ignored. Rate is the fraction of reads retained when down-sampling FASTQ files. A
                                         rate of 1.0 retains 100% of the reads in the FASTQ files, which effectively "skips" down-sampling
  --gvcfs_filter               [string]  Filter criteria for variants (GATK) [default: QD < 2.0 || FS > 60.0 || MQ < 40.0 || DP < 10]
  --gatkgenotypes_filter       [string]  Filter criteria for script filterGatkGenotypes [default: --min_GQ "50" --keep_GQ_0_refs --min_percent_alt_in_AD 
                                         "0.8" --min_total_DP "10" --keep_all_ref] 
  --max_amb_samples            [integer] Max number of samples with ambiguous calls for inclusion (GATK) [default: 10000000]
  --max_perc_amb_samples       [integer] Max percent of samples with ambiguous calls for inclusion (GATK) [default: 10]
  --min_depth                  [integer] Min depth for a base to be called as the consensus sequence, otherwise it will be called as an N. Set to 0 to 
                                         disable. [default: 50] 
  --mask                       [boolean] Perform masking of reference genome before analysis [default: true]
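
To make the relationship between `--coverage` and `--rate` concrete, here is a rough sketch of how a down-sampling rate could be derived from a target coverage. The formula (target coverage times reference length, divided by total read bases, capped at 1.0) is an assumption for illustration and is not taken from the pipeline's source. The function name `downsample_rate` and the example numbers are hypothetical.

```shell
# downsample_rate COVERAGE REFERENCE_BP READ_BP  (hypothetical helper)
# Prints the fraction of reads to keep so that the expected depth is about
# COVERAGE, never exceeding 1.0 (keeping every read).
downsample_rate() {
    awk -v cov="$1" -v ref="$2" -v bases="$3" 'BEGIN {
        rate = cov * ref / bases
        if (rate > 1) rate = 1
        printf "%.3f\n", rate
    }'
}

# 12.4 Mb reference with 2.48 Gb of reads (about 200x), target 70x.
downsample_rate 70 12400000 2480000000   # prints 0.350

# Only 0.62 Gb of reads (about 50x): the target cannot be reached, so keep all reads.
downsample_rate 70 12400000 620000000    # prints 1.000
```

Under this assumption, passing `--rate 0.35` directly would have the same effect as `--coverage 70` for the first sample, and `--rate 1.0` disables down-sampling.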

Generic options
  --help                       [boolean] Display help text.
  --tmpdir                     [string]  Temporary directory for shell and Java processes [default: /tmp]