
Subcommand: simulate

Lucas Czech edited this page Mar 21, 2023 · 20 revisions

Create a file with simulated random frequency data.

Usage: grenedalf simulate --coverages TEXT --length UINT [other options]
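As an illustration of the usage line above, the following sketch assembles a command line from options documented on this page. The helper name `build_simulate_command` and its parameters are hypothetical and not part of grenedalf. Only the flags themselves are taken from this page. The sketch prints the command instead of running it, since running it requires grenedalf to be installed.

```python
import shlex

def build_simulate_command(coverages, length, fmt="pileup", seed=None,
                           mutation_rate=None, mutation_count=None,
                           out_dir=".", compress=False):
    """Hypothetical helper: build an argument list for `grenedalf simulate`."""
    # The two mutation options exclude each other, as documented on this page.
    if mutation_rate is not None and mutation_count is not None:
        raise ValueError("--mutation-rate and --mutation-count exclude each other")
    if fmt not in ("pileup", "sync"):
        raise ValueError("format must be 'pileup' or 'sync'")

    # --coverages and --length are the two required options.
    args = ["grenedalf", "simulate",
            "--coverages", coverages,
            "--length", str(length),
            "--format", fmt,
            "--out-dir", out_dir]
    if seed is not None:
        args += ["--random-seed", str(seed)]
    if mutation_rate is not None:
        args += ["--mutation-rate", str(mutation_rate)]
    if mutation_count is not None:
        args += ["--mutation-count", str(mutation_count)]
    if compress:
        args.append("--compress")
    return args

# Two samples: fixed coverage 10, and random coverage between 20 and 30.
cmd = build_simulate_command("10,20/30", 100000, fmt="sync",
                             seed=42, mutation_count=500)
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` on a machine where grenedalf is available.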

Options

Settings

--format TEXT:{pileup,sync}=pileup
Select the output file format, either (m)pileup or PoPoolation2 sync.
--random-seed UINT=0
Set the random seed for generating values, which allows reproducible results. If not provided, the system clock is used to obtain a random seed.

Samples

--coverages TEXT
Required. Coverages of the samples to simulate, as a comma- or tab-separated list. The coverage of each sample is used as the total count per position, which is randomly distributed across nucleotides. Each entry in the list is either a single number, which is used as the coverage for that sample at every position, or two numbers separated by a slash, which are used as min/max to generate a random coverage at each position. The length of this list also determines the number of samples to simulate.
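The list format described above can be illustrated with a small sketch. This is not grenedalf's parser: the function names are hypothetical, and both the uniform draw with inclusive bounds and the rejection of entries where min exceeds max are assumptions made for the example.

```python
import random

def parse_coverages(spec):
    """Parse a list such as "10,20/30" into (min, max) pairs, one per sample."""
    pairs = []
    # Entries may be separated by commas or tabs.
    for entry in spec.replace("\t", ",").split(","):
        entry = entry.strip()
        if "/" in entry:
            # Two numbers separated by a slash: min/max for a random coverage.
            low, high = (int(part) for part in entry.split("/"))
        else:
            # A single number: fixed coverage at every position.
            low = high = int(entry)
        if low > high:
            # Assumption of this sketch: such an entry is treated as an error.
            raise ValueError(f"min exceeds max in coverage entry {entry!r}")
        pairs.append((low, high))
    return pairs

def draw_coverages(pairs, rng):
    """Draw one coverage per sample for a single position (assumed uniform)."""
    return [rng.randint(low, high) for low, high in pairs]

pairs = parse_coverages("10,20/30\t50")
print(len(pairs))  # the number of entries is the number of samples
print(pairs)
print(draw_coverages(pairs, random.Random(42)))
```

Here the specification yields three samples: one with fixed coverage 10, one with a coverage drawn between 20 and 30 at each position, and one with fixed coverage 50.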

Genome

--chromosome TEXT=A
Name of the chromosome. This is simply used as the first column in the output file. At the moment, only one chromosome is supported.
--mutation-rate FLOAT:(FLOAT in [0 - 1]) AND (POSITIVE)=1e-08 Excludes: --mutation-count
Mutation rate to simulate. This rate multiplied by --length gives the total number of mutations to generate (which can alternatively be provided directly via --mutation-count).
--mutation-count UINT=0 Excludes: --mutation-rate
Number of mutations to simulate in total across the chromosome, spread across the --length.
--length UINT=0
Required. Total length of the chromosome to simulate. Mutations are spread across this length.
--omit-invariant-positions
If set, only write the mutated positions to the output file. Note that the resulting files are no longer standard (m)pileup or sync files, but this option can still be useful.

Pileup

--with-quality-scores
If set, phred-scaled quality scores are written when simulating an (m)pileup file, using the --min-phred-score and --max-phred-score settings. Ignored otherwise.
--min-phred-score UINT:UINT in [0 - 90]=10
Minimum phred score to use when simulating an (m)pileup file. Ignored otherwise.
--max-phred-score UINT:UINT in [0 - 90]=40
Maximum phred score to use when simulating an (m)pileup file. Ignored otherwise.
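To illustrate what the phred score range means for the output, the sketch below draws one score per base and encodes it as a character. Two assumptions are made that this page does not confirm: scores are drawn uniformly between the minimum and maximum (inclusive), and they are encoded with the common Phred+33 ASCII offset. The function name is hypothetical.

```python
import random

PHRED_OFFSET = 33  # assumption: Phred+33 (Sanger) ASCII encoding

def simulate_quality_string(coverage, min_phred, max_phred, rng):
    """Return one quality character per base at a position with the given coverage."""
    # Both settings are documented to lie in the range 0 to 90.
    if not 0 <= min_phred <= max_phred <= 90:
        raise ValueError("phred scores must satisfy 0 <= min <= max <= 90")
    return "".join(
        chr(rng.randint(min_phred, max_phred) + PHRED_OFFSET)
        for _ in range(coverage)
    )

# Default range of 10 to 40, for a position with coverage 12.
print(simulate_quality_string(12, 10, 40, random.Random(42)))
```

Under these assumptions, the default range maps to characters between `+` (score 10) and `I` (score 40).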

Output

--out-dir TEXT=.
Directory to write files to.
--file-prefix TEXT
File prefix for output files. Most grenedalf commands use the command name as the base name for file output. This option amends the base name, to distinguish runs with different data.
--file-suffix TEXT
File suffix for output files. Most grenedalf commands use the command name as the base name for file output. This option amends the base name, to distinguish runs with different data.
--compress
If set, compress the output files using gzip. Output file extensions are automatically extended by .gz.

Global Options

--allow-file-overwriting
Allow overwriting existing output files instead of aborting the command.
--verbose
Produce more verbose output.
--threads UINT
Number of threads to use for calculations. If not set, a reasonable number of threads is guessed by checking, in this order of precedence, (1) the environment variable OMP_NUM_THREADS (OpenMP), (2) the environment variable SLURM_CPUS_PER_TASK (Slurm), and (3) the hardware concurrency, taking hyperthreads into account.
--log-file TEXT
Write all output to a log file, in addition to standard output to the terminal.
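The order of precedence described for --threads can be sketched as below. This is an illustration only, not grenedalf's implementation: the function name is hypothetical, and `os.cpu_count()` is used as a stand-in for the hardware concurrency. It reports logical cores, so unlike grenedalf, this sketch does not correct for hyperthreads.

```python
import os

def guess_thread_count(environ=None):
    """Guess a thread count: OMP_NUM_THREADS, then SLURM_CPUS_PER_TASK, then hardware."""
    environ = os.environ if environ is None else environ
    # Check the environment variables in the documented order of precedence.
    for name in ("OMP_NUM_THREADS", "SLURM_CPUS_PER_TASK"):
        value = environ.get(name, "").strip()
        if value.isdigit() and int(value) > 0:
            return int(value)
    # Fall back to the number of logical cores (no hyperthread correction here).
    return os.cpu_count() or 1

print(guess_thread_count({"OMP_NUM_THREADS": "4", "SLURM_CPUS_PER_TASK": "8"}))
print(guess_thread_count({"SLURM_CPUS_PER_TASK": "8"}))
print(guess_thread_count({}))
```

Passing a plain dictionary instead of the real environment makes the precedence easy to check without changing the shell environment.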

Citation

When using this method, please do not forget to cite:

Lucas Czech, Moises Exposito-Alonso. Grenedalf: Genome Analyses of Differential Allele Frequencies. Manuscript in preparation, 2021. doi:
