Subcommand: cathedral plot

Lucas Czech edited this page Sep 16, 2024 · 11 revisions

Synopsis: Create a cathedral plot, using the pre-computed cathedral data.

Usage: grenedalf cathedral-plot [options]

Documentation for grenedalf v0.6.2

Description

Create cathedral plots. This is the second step after fst-cathedral, and turns the matrix computed there into the actual plots. We split this into two commands for efficiency, so that it is faster to iterate over different color schemes and other plotting settings.

The command takes either the csv or the json files produced by fst-cathedral as input, and infers the respective other file; both have to be in the same directory and share the same base name. It then colors the value matrix according to the provided color map settings, and stores the result as a bmp bitmap picture. Furthermore, it creates an svg file that additionally contains axes, a legend for the color map, and a title, and that can be edited and refined later with any vector graphics program.
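The pairing of the two input files can be pictured with a small sketch. This is only an illustration of the "same directory, same base name" rule stated above, not grenedalf's own code; the function name `companion_file` and the example path are hypothetical.

```python
from pathlib import Path


def companion_file(path: str) -> Path:
    """Return the assumed counterpart (.csv <-> .json) of a cathedral data file."""
    p = Path(path)
    swap = {".csv": ".json", ".json": ".csv"}
    ext = p.suffix.lower()
    if ext not in swap:
        raise ValueError(f"expected a .csv or .json file, got: {p.name}")
    # Same directory, same base name; only the extension differs.
    return p.with_suffix(swap[ext])


# Hypothetical file name, for illustration only.
print(companion_file("results/sample.csv"))
```

If the counterpart is missing from that directory, the command has nothing to infer, so it is worth keeping both files from an fst-cathedral run together.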

[Figure: FST cathedral plot]

See fst-cathedral for details on this plot.

Note that the pool-sequencing corrected estimator of FST applies a correction term that can yield FST values below zero. This is expected, and a consequence of correcting for the statistical noise. For details, see our equations document. To keep these artifacts from influencing the plot, and to keep plots consistent, we recommend clipping negative values to zero by providing --min-value 0 --clip-under. The first option limits the scale to non-negative values, and the second option makes sure that negative values are clipped to 0 instead of being highlighted in the --under-color.
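To make the difference between clipping and highlighting concrete, here is a minimal Python sketch of what happens to a single value at the lower bound. It is a conceptual illustration under our own naming (`apply_lower_bound` and the `UNDER` marker are hypothetical), not the code that grenedalf runs.

```python
UNDER = "under"  # placeholder standing in for the --under-color


def apply_lower_bound(value: float, min_value: float, clip_under: bool):
    """Return the value to be colored, or the UNDER marker if it is highlighted."""
    if value >= min_value:
        return value
    # Below the minimum: either clamp to the minimum or flag as out of range.
    return min_value if clip_under else UNDER


fst_values = [-0.02, 0.0, 0.15]  # made-up example values
print([apply_lower_bound(v, 0.0, clip_under=True) for v in fst_values])
# -> [0.0, 0.0, 0.15]
print([apply_lower_bound(v, 0.0, clip_under=False) for v in fst_values])
# -> ['under', 0.0, 0.15]
```

With clipping, the slightly negative estimate is drawn in the same color as zero; without it, the pixel is drawn in the under color and stands out.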

Similarly, it might be beneficial to use --max-value X --clip-over with some reasonable maximum value X, if multiple plots are created that need to be compared to each other. That way, all plots will have the same scale, and hence have comparable color values.
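As a sketch of how a shared scale could be applied to several plots, the following Python snippet assembles one command line per input file, using only the options documented on this page. The input file names and the maximum of 0.5 are hypothetical placeholders; pick a maximum that suits your data.

```python
import shlex


def cathedral_plot_command(json_path: str, max_value: float) -> list:
    """Build the argument list for one cathedral-plot run with a fixed scale."""
    return [
        "grenedalf", "cathedral-plot",
        "--json-path", json_path,
        "--min-value", "0",
        "--clip-under",
        "--max-value", str(max_value),
        "--clip-over",
    ]


# Hypothetical inputs; every plot gets the same [0, 0.5] scale.
for path in ["pair_a.json", "pair_b.json"]:
    print(shlex.join(cathedral_plot_command(path, 0.5)))
```

The snippet only prints the commands. They can be run from a shell, or the argument list can be passed to `subprocess.run` once grenedalf is installed.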

Lastly, at the moment, we have only implemented cathedral plots for FST. They are, however, also possible for any other window-based statistic, such as the diversity metrics. If this is something that you are interested in, please open an issue to tell us.

Colors

For the options of this command, the single colors and the main gradient can be specified as described here. That description documents the color usage of our tool gappa, but the same applies here in grenedalf. We generally recommend keeping the default colors, however, as they were designed to work well for cathedral plots.

Options

Input

--json-path
TEXT:PATH(existing)=[] ... Excludes: --csv-path
List of json files or directories to process. For directories, only files with the extension .json are processed. To input more than one file or directory, either separate them with spaces, or provide this option multiple times.
--csv-path
TEXT:PATH(existing)=[] ... Excludes: --json-path
List of csv files or directories to process. For directories, only files with the extension .csv are processed. To input more than one file or directory, either separate them with spaces, or provide this option multiple times.

Color

--color-list
TEXT=inferno
List of colors to use for the palette. Can either be the name of a color list, a file containing one color per line, or an actual comma-separated list of colors. Colors can be specified in the format #rrggbb using hex values, or by web color names.
--reverse-color-list
FLAG
If set, the order of colors of the --color-list is reversed.
--under-color
TEXT=#00ffff
Color used to indicate values below the min value. Color can be specified in the format #rrggbb using hex values, or by web color names.
--clip-under
FLAG
Clip (i.e., clamp) values less than min to be inside [ min, max ], by setting values that are too low to the specified min value. If set, --under-color is not used to indicate values out of range.
--over-color
TEXT=#ff00ff
Color used to indicate values above the max value. Color can be specified in the format #rrggbb using hex values, or by web color names.
--clip-over
FLAG
Clip (i.e., clamp) values greater than max to be inside [ min, max ], by setting values that are too high to the specified max value. If set, --over-color is not used to indicate values out of range.
--clip
FLAG
Clip (i.e., clamp) values to be inside [ min, max ], by setting values outside of that interval to the nearest boundary of it. This option is a shortcut to set --clip-under and --clip-over at once.
--color-normalization
TEXT:{linear,logarithmic}=linear
To create the cathedral plot, the value of each pixel needs to be translated into a color, by mapping from the range of values into the range of the color map. This translation can be done as a simple linear transform, or logarithmic, so that low values can be distinguished with more detail.
--min-value
FLOAT=nan
As an alternative to determining the range of values automatically, the range limits can be set explicitly. This allows for instance to cap the visualization in cases of outliers that would otherwise hide detail in the lower values. Any value that is below the min specified here will then be mapped to the under color, or clipped to the lowest value in the color map.
--max-value
FLOAT=nan
See --min-value; this is the equivalent upper limit of values. Any value that is above the max specified here will then be mapped to the over color, or be clipped to the highest value in the color map.
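The effect of --color-normalization can be illustrated with a generic sketch that maps a value to a position between 0 and 1 on the color map. The formulas below are the textbook linear and logarithmic scalings and are an assumption; this page does not state the exact formula used by grenedalf, nor how it treats a minimum of zero in logarithmic mode. The function name `normalize` is hypothetical.

```python
import math


def normalize(value, vmin, vmax, mode="linear"):
    """Map a value in [vmin, vmax] to a position in [0, 1] on the color map."""
    if not vmin < vmax:
        raise ValueError("vmin must be smaller than vmax")
    if mode == "linear":
        return (value - vmin) / (vmax - vmin)
    if mode == "logarithmic":
        # Assumption of this sketch: logarithms need strictly positive inputs.
        if vmin <= 0 or value <= 0:
            raise ValueError("logarithmic mapping needs positive values")
        return (math.log(value) - math.log(vmin)) / (math.log(vmax) - math.log(vmin))
    raise ValueError(f"unknown mode: {mode}")


# Compare where small values land on the color map under each mode.
for v in (0.001, 0.01, 0.1, 1.0):
    lin = normalize(v, 0.001, 1.0, "linear")
    log = normalize(v, 0.001, 1.0, "logarithmic")
    print(f"{v:>6}: linear={lin:.3f} logarithmic={log:.3f}")
```

In the linear case, 0.01 and 0.001 end up almost at the same position at the bottom of the color map, while the logarithmic mapping spreads them a third of the scale apart. This is the sense in which low values can be distinguished with more detail.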

Output

--out-dir
TEXT=.
Directory to write files to.
--file-prefix
TEXT
File prefix for output files. Most grenedalf commands use the command name as the base name for file output. This option amends the base name, to distinguish runs with different data.
--file-suffix
TEXT
File suffix for output files. Most grenedalf commands use the command name as the base name for file output. This option amends the base name, to distinguish runs with different data.

Global Options

--allow-file-overwriting
FLAG
Allow overwriting existing output files instead of aborting the command. By default, we abort if any output file already exists, to avoid overwriting by mistake.
--verbose
FLAG
Produce more verbose output.
--threads
UINT
Number of threads to use for calculations. If not set, we guess a reasonable number of threads by looking at the environment variables (1) OMP_NUM_THREADS (OpenMP) and (2) SLURM_CPUS_PER_TASK (slurm), as well as (3) the hardware concurrency (number of CPU cores), taking hyperthreads into account, in the given order of precedence.
--log-file
TEXT
Write all output to a log file, in addition to standard output to the terminal.
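The order of precedence described for --threads can be sketched as follows. This is a simplified illustration, not grenedalf's implementation: the function name `guess_threads` is hypothetical, values that are not plain positive integers are simply skipped, and the fallback uses Python's `os.cpu_count()` without the hyperthreading adjustment mentioned above.

```python
import os


def guess_threads(env=None) -> int:
    """Pick a thread count: OMP_NUM_THREADS, then SLURM_CPUS_PER_TASK, then CPU count."""
    env = os.environ if env is None else env
    for name in ("OMP_NUM_THREADS", "SLURM_CPUS_PER_TASK"):
        raw = env.get(name, "").strip()
        # Only accept plain positive integers in this sketch.
        if raw.isdigit() and int(raw) > 0:
            return int(raw)
    # Fallback: logical CPU count (no hyperthreading correction here).
    return os.cpu_count() or 1


print(guess_threads({"OMP_NUM_THREADS": "4", "SLURM_CPUS_PER_TASK": "8"}))  # -> 4
print(guess_threads({"SLURM_CPUS_PER_TASK": "8"}))  # -> 8
```

Passing --threads explicitly bypasses this guessing altogether, which is the simplest way to get a predictable thread count on shared machines.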

Citation

When using this method, please do not forget to cite

Lucas Czech, Jeffrey Spence, Moises Exposito-Alonso. grenedalf: population genetic statistics for the next generation of pool sequencing. Bioinformatics, vol. 40, no. 8, 2024. doi:10.1093/bioinformatics/btae508