
Subcommand: simulate

Lucas Czech edited this page Mar 21, 2023 · 20 revisions

Create a file with simulated random frequency data.

Usage: grenedalf simulate --coverages TEXT --length UINT [other options]
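
As an illustration, the following Python sketch assembles a `grenedalf simulate` call from options documented on this page. The helper name `build_simulate_command` is hypothetical, and the sketch only builds and prints the argument list; running it (for example via `subprocess.run`) assumes a `grenedalf` binary is available on the `PATH`.

```python
import shlex


def build_simulate_command(coverages, length, fmt="pileup", seed=None, out_dir="."):
    """Assemble an argument list for `grenedalf simulate` (hypothetical helper)."""
    cmd = [
        "grenedalf", "simulate",
        "--coverages", coverages,   # required: one entry per sample
        "--length", str(length),    # required: chromosome length
        "--format", fmt,            # "pileup" or "sync"
        "--out-dir", out_dir,
    ]
    if seed is not None:
        # A fixed seed makes the simulated data reproducible.
        cmd += ["--random-seed", str(seed)]
    return cmd


# Two samples: fixed coverage 10, and coverage drawn between 20 and 30.
cmd = build_simulate_command("10,20/30", 1000, fmt="sync", seed=42)
print(shlex.join(cmd))
```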

Options

Settings

--format
TEXT:{pileup,sync}=pileup
Select the output file format: either (m)pileup or PoPoolation2 sync.
--random-seed
UINT=0
Set the random seed for generating values, which allows reproducible results. If not provided, the system clock is used to obtain a random seed.

Samples

--coverages
TEXT
Required. Coverages of the samples to simulate, as a comma- or tab-separated list. The coverage of each sample is used as the total count per position, which is randomly distributed across nucleotides. Per sample, the list can contain either a single number, which is used as the coverage for that sample at each position, or two numbers separated by a slash, which are used as min/max to generate a random coverage at each position. The length of this list also determines the number of samples to simulate.
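
To make the `--coverages` syntax concrete, here is a minimal Python sketch of how such a list could be interpreted. It is not grenedalf's implementation: the function names are hypothetical, and drawing the coverage uniformly from min/max and spreading it uniformly over A, C, G, T are assumptions made for illustration only.

```python
import random
import re


def parse_coverages(spec):
    """Parse e.g. "10,20/30" into [(10, 10), (20, 30)]: one (min, max) per sample."""
    ranges = []
    for item in re.split(r"[,\t]", spec.strip()):
        lo, _, hi = item.partition("/")
        lo = int(lo)
        hi = int(hi) if hi else lo  # a single number means min == max
        if lo > hi:
            raise ValueError(f"min exceeds max in {item!r}")
        ranges.append((lo, hi))
    return ranges


def draw_counts(ranges, rng):
    """Draw one coverage per sample and split it at random across A, C, G, T.

    Uniform draws are an assumption of this sketch, not documented behavior.
    """
    counts = []
    for lo, hi in ranges:
        coverage = rng.randint(lo, hi)
        sample = {base: 0 for base in "ACGT"}
        for _ in range(coverage):
            sample[rng.choice("ACGT")] += 1
        counts.append(sample)
    return counts


ranges = parse_coverages("10,20/30")             # two entries, hence two samples
counts = draw_counts(ranges, random.Random(42))  # seeded for reproducibility
print(ranges, counts)
```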

Genome

--chromosome
TEXT=A
Name of the chromosome. This is simply used as the first column in the output file. At the moment, only one chromosome is supported.
--mutation-rate
FLOAT:(FLOAT in [0 - 1]) AND (POSITIVE)=1e-08 Excludes: --mutation-count
Mutation rate to simulate. This rate times the --length is used as the number of mutations to generate in total (which can alternatively be directly provided via --mutation-count).
--mutation-count
UINT=0 Excludes: --mutation-rate
Number of mutations to simulate in total across the chromosome, spread across the --length.
--length
UINT=0
Required. Total length of the chromosome to simulate. Mutations are spread across this length.
--omit-invariant-positions
If set, only write the mutated positions to the output file. Note that the resulting files are no longer standard (m)pileup or sync files, but the option can still be useful.
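
The relation between `--mutation-rate`, `--length`, and the total number of mutations can be sketched as follows. The function name is hypothetical, and rounding to the nearest integer is an assumption; this page does not state how a fractional product is handled.

```python
def mutation_count(mutation_rate, length):
    """Number of mutations implied by a rate and a chromosome length.

    Rounding to the nearest integer is an assumption of this sketch.
    """
    if not 0.0 < mutation_rate <= 1.0:
        raise ValueError("mutation rate must be positive and at most 1")
    return round(mutation_rate * length)


# With the default rate of 1e-08, a chromosome of 1e9 positions yields 10 mutations.
print(mutation_count(1e-8, 1_000_000_000))
# For a short chromosome the product is far below 1.
print(mutation_count(1e-8, 1000))
```

Under this assumption, short test chromosomes get no mutations at the default rate, so for small `--length` values it may be more practical to set `--mutation-count` directly.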

Pileup

--with-quality-scores
If set, phred-scaled quality scores are written when simulating an (m)pileup file, using the --min-phred-score and --max-phred-score settings. Ignored otherwise.
--min-phred-score
UINT:UINT in [0 - 90]=10
Minimum phred score to use when simulating an (m)pileup file. Ignored otherwise.
--max-phred-score
UINT:UINT in [0 - 90]=40
Maximum phred score to use when simulating an (m)pileup file. Ignored otherwise.
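
As a rough illustration of what bounded phred scores look like in a pileup quality column, the sketch below draws scores between a minimum and a maximum and encodes them as ASCII characters. It assumes the common Phred+33 encoding and uniform draws; neither is confirmed by this page, and the function name is hypothetical.

```python
import random


def simulate_quality_string(n_bases, min_phred=10, max_phred=40, rng=None):
    """Return n_bases quality characters with scores in [min_phred, max_phred].

    Phred+33 encoding and uniform sampling are assumptions of this sketch.
    """
    if not 0 <= min_phred <= max_phred <= 90:
        raise ValueError("phred scores must satisfy 0 <= min <= max <= 90")
    rng = rng or random.Random()
    return "".join(
        chr(33 + rng.randint(min_phred, max_phred)) for _ in range(n_bases)
    )


quality = simulate_quality_string(8, rng=random.Random(1))
print(quality)
```

With the defaults of 10 and 40, every character falls between `+` (score 10) and `I` (score 40) under Phred+33.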

Output

--out-dir
TEXT=.
Directory to write output files to.
--file-prefix
TEXT
File prefix for output files. Most grenedalf commands use the command name as the base name for file output. This option amends the base name, to distinguish runs with different data.
--file-suffix
TEXT
File suffix for output files. Most grenedalf commands use the command name as the base name for file output. This option amends the base name, to distinguish runs with different data.
--compress
If set, compress the output files using gzip. Output file extensions are automatically extended by .gz.

Global Options

--allow-file-overwriting
Allow existing output files to be overwritten instead of aborting the command.
--verbose
Produce more verbose output.
--threads
UINT
Number of threads to use for calculations. If not set, a reasonable number of threads is guessed by checking, in the given order of precedence, (1) the environment variable OMP_NUM_THREADS (OpenMP), (2) the environment variable SLURM_CPUS_PER_TASK (slurm), and (3) the hardware concurrency, taking hyperthreads into account.
--log-file
TEXT
Write all output to a log file, in addition to standard output to the terminal.
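
The order of precedence described for `--threads` can be sketched in Python as follows. This only mirrors the documented order; the function name is hypothetical, and the hyperthreading adjustment mentioned above is deliberately left out because its details are not given on this page.

```python
import os


def guess_thread_count(env=None, hardware_threads=None):
    """Pick a thread count: OMP_NUM_THREADS, then SLURM_CPUS_PER_TASK, then hardware.

    No hyperthreading correction is applied in this sketch.
    """
    env = os.environ if env is None else env
    for name in ("OMP_NUM_THREADS", "SLURM_CPUS_PER_TASK"):
        value = env.get(name, "")
        if value.isdigit() and int(value) > 0:
            return int(value)
    # Fall back to the hardware concurrency reported by the system.
    hw = hardware_threads if hardware_threads is not None else os.cpu_count()
    return max(1, hw or 1)


print(guess_thread_count())
```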

Citation

When using this method, please do not forget to cite:

Lucas Czech, Moises Exposito-Alonso. Grenedalf: Genome Analyses of Differential Allele Frequencies. Manuscript in preparation, 2021. doi:
