Subcommand: cathedral-plot
Synopsis: Create a cathedral plot, using the pre-computed cathedral data.
Usage: grenedalf cathedral-plot [options]
Documentation for grenedalf v0.5.2
Create cathedral plots. This is the second step after fst-cathedral, and turns the matrix computed there into the actual plots. We split this into two commands for efficiency, so that it is faster to iterate over different color schemes and other plotting settings.
The command takes either the csv or the json files produced by fst-cathedral as input. It infers the respective other file; both have to be in the same directory, with the same base name. It then colors the value matrix according to the provided color map settings, and stores the result as a bmp bitmap picture. Furthermore, it creates an svg file that additionally contains axes, a legend for the color map, and a title, and can be edited and refined later with any vector graphics program.
See fst-cathedral for details on this plot.
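The "same directory, same base name" pairing of the two input files can be pictured with a small Python sketch. It only illustrates the rule described above and is not grenedalf's code; the helper name `partner_file` and the example path are hypothetical.

```python
from pathlib import Path


def partner_file(path):
    """Return the sibling file that shares directory and base name.

    Hypothetical helper: maps x.csv to x.json and x.json to x.csv.
    """
    p = Path(path)
    swap = {".csv": ".json", ".json": ".csv"}
    suffix = p.suffix.lower()
    if suffix not in swap:
        raise ValueError(f"expected a .csv or .json file, got: {p.name}")
    # with_suffix keeps the directory and base name, and replaces the extension.
    return p.with_suffix(swap[suffix])


print(partner_file("results/run1.csv"))
```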
Note that the pool-sequencing corrected estimator of FST applies a correction term that can yield
FST values below zero. This is expected, and a consequence of correcting for the statistical noise.
For details, see our equations document.
To keep these artifacts from influencing the plot, and to keep plots consistent, we recommend clipping negative values to zero by providing --min-value 0 --clip-under. The first option limits the scale to non-negative values, and the second option makes sure that negative values are clipped to 0 instead of being highlighted in the --under-color.
Similarly, it might be beneficial to use --max-value X --clip-over with some reasonable maximum value X if multiple plots are created that need to be compared to each other. That way, all plots will have the same scale, and hence have comparable color values.
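The interplay of the range limits and the clip flags can be sketched in Python as follows. This is a minimal illustration of the behavior described above, not grenedalf's implementation; the function name `resolve_value`, its return convention, and the example limit of 0.5 are assumptions made for this sketch.

```python
def resolve_value(value, vmin, vmax, clip_under=False, clip_over=False):
    """Decide how a single matrix value is treated before coloring.

    Returns ("under", None) or ("over", None) when the value is out of range
    and not clipped, and ("in_range", v) otherwise, with v inside [vmin, vmax].
    """
    if value < vmin:
        # With clipping, the value is set to the min; otherwise it is flagged
        # so that the under color can be used for it.
        return ("in_range", vmin) if clip_under else ("under", None)
    if value > vmax:
        return ("in_range", vmax) if clip_over else ("over", None)
    return ("in_range", value)


# A slightly negative FST estimate, with a min of 0 and a hypothetical max of 0.5:
print(resolve_value(-0.02, 0.0, 0.5, clip_under=True))  # ('in_range', 0.0)
print(resolve_value(-0.02, 0.0, 0.5))                   # ('under', None)
print(resolve_value(0.8, 0.0, 0.5, clip_over=True))     # ('in_range', 0.5)
```

With both limits fixed and both clip flags set, every plot maps the same value to the same position on the color map, which is what makes several plots comparable.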
Lastly, at the moment, we have only implemented cathedral plots for FST. They are, however, also possible for any other window-based statistic, such as the diversity metrics. If this is something that you are interested in, please open an issue to tell us.
For the options of this command, the single colors and the main gradient can be specified as follows.
Single colors can be specified
- by name, as one of the 140 web colors, that is, the basic 16 html color names and the extended 124 X11 color names. This is case-independent and insensitive to white spaces.
- by name, as one of the 954 xkcd colors, again case- and white-space-insensitive.
- by hex code in the format #RRGGBB or #RRGGBBAA (with alpha, which might be useful when producing svg files), using hexadecimal coding for each of the red, green, and blue values, case insensitive. For example, use #000000 for black and #ffffff for white. Note that # also happens to denote the start of a comment in command lines; hence, you probably need to put this in quotation marks.
A typical color specification might hence look like this: --under-color "#ff00ff" or --mask-color orange.
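To make the accepted formats concrete, here is a minimal Python sketch of how such specifications could be interpreted. It is not taken from grenedalf; the function names `parse_hex_color` and `normalize_color_name` are hypothetical, and the actual lookup of web or xkcd color names is left out.

```python
import string


def parse_hex_color(spec):
    """Parse "#RRGGBB" or "#RRGGBBAA" into a tuple of integers in 0..255."""
    digits = spec[1:] if spec.startswith("#") else ""
    if len(digits) not in (6, 8) or any(c not in string.hexdigits for c in digits):
        raise ValueError(f"not a hex color: {spec!r}")
    # Each channel is two hex digits; upper and lower case are both accepted.
    return tuple(int(digits[i:i + 2], 16) for i in range(0, len(digits), 2))


def normalize_color_name(name):
    """Make a color name case- and white-space-insensitive for a lookup."""
    return "".join(name.split()).lower()


print(parse_hex_color("#ff00ff"))           # (255, 0, 255)
print(parse_hex_color("#000000FF"))         # (0, 0, 0, 255)
print(normalize_color_name("Dark Orange"))  # darkorange
```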
Gradients and lists of colors can be specified as
- a comma-separated list of colors following the above specifications for single colors (this list can either be provided in a file with one color per line, or directly as a string on the command line), or
- one of the following named color lists/gradients:
Depending on context, not all of these lists might be well suited; for example, it does not make much sense to use a (categorical) qualitative color list as a (continuous) gradient.
When specifying individual colors to build a custom gradient, the specified colors are evenly spaced out across the range of values, and then linearly interpolated to create the gradient. For example, a gradient from black to red to yellow could be specified as --color-list "#000000,#ff0000,#ffff00".
Our internal interpolation between colors to create a gradient is (currently) done linearly in RGB color space, which does not always yield the best looking results. We hence recommend constructing a gradient with several (5 or more) intermediate colors using external tools that operate in LCH space (e.g., this gradient generator), and then using these intermediate colors as input. This way, we only need to interpolate between nearby similar colors in RGB, which works and looks better than RGB interpolation between vastly different colors.
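The even spacing and linear RGB interpolation described above can be illustrated with a short Python sketch. It follows the description in the text only and is not grenedalf's code; the function name `gradient_color` and the rounding to whole channel values are assumptions.

```python
def gradient_color(colors, position):
    """Linearly interpolate in RGB between evenly spaced colors.

    colors:   list of at least two (r, g, b) tuples
    position: float in [0, 1], the location along the gradient
    """
    if len(colors) < 2:
        raise ValueError("need at least two colors")
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be in [0, 1]")
    # Evenly spaced stops: color i sits at i / (len(colors) - 1).
    scaled = position * (len(colors) - 1)
    lower = min(int(scaled), len(colors) - 2)
    frac = scaled - lower
    c0, c1 = colors[lower], colors[lower + 1]
    return tuple(round(a + (b - a) * frac) for a, b in zip(c0, c1))


# The black, red, yellow example, i.e. "#000000,#ff0000,#ffff00":
black_red_yellow = [(0, 0, 0), (255, 0, 0), (255, 255, 0)]
print(gradient_color(black_red_yellow, 0.0))   # (0, 0, 0)
print(gradient_color(black_red_yellow, 0.5))   # (255, 0, 0)
print(gradient_color(black_red_yellow, 0.75))  # (255, 128, 0)
print(gradient_color(black_red_yellow, 1.0))   # (255, 255, 0)
```

Supplying more intermediate colors shortens each interpolated segment, which is why pre-computing the stops in a perceptual color space reduces the visible drawbacks of RGB interpolation.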
--json-path
TEXT:PATH(existing)=[] ... Excludes: --csv-path
List of json files or directories to process. For directories, only files with the extension .json are processed. To input more than one file or directory, either separate them with spaces, or provide this option multiple times.
--csv-path
TEXT:PATH(existing)=[] ... Excludes: --json-path
List of csv files or directories to process. For directories, only files with the extension .csv are processed. To input more than one file or directory, either separate them with spaces, or provide this option multiple times.
--color-list
TEXT=inferno
List of colors to use for the palette. Can either be the name of a color list, a file containing one color per line, or an actual comma-separated list of colors. Colors can be specified in the format #rrggbb using hex values, or by web color names.
--reverse-color-list
FLAG
If set, the order of colors of the --color-list is reversed.
--under-color
TEXT=#00ffff
Color used to indicate values below the min value. Color can be specified in the format #rrggbb using hex values, or by web color names.
--clip-under
FLAG
Clip (i.e., clamp) values less than min to be inside [ min, max ], by setting values that are too low to the specified min value. If set, --under-color is not used to indicate values out of range.
--over-color
TEXT=#ff00ff
Color used to indicate values above the max value. Color can be specified in the format #rrggbb using hex values, or by web color names.
--clip-over
FLAG
Clip (i.e., clamp) values greater than max to be inside [ min, max ], by setting values that are too high to the specified max value. If set, --over-color is not used to indicate values out of range.
--clip
FLAG
Clip (i.e., clamp) values to be inside [ min, max ], by setting values outside of that interval to the nearest boundary of it. This option is a shortcut to set --clip-under and --clip-over at once.
--color-normalization
TEXT:{linear,logarithmic}=linear
To create the cathedral plot, the value of each pixel needs to be translated into a color, by mapping from the range of values into the range of the color map. This translation can be done as a simple linear transform, or logarithmic, so that low values can be distinguished with more detail.
--min-value
FLOAT=nan
As an alternative to determining the range of values automatically, the range limits can be set explicitly. This allows for instance to cap the visualization in cases of outliers that would otherwise hide detail in the lower values. Any value that is below the min specified here will then be mapped to the under color, or clipped to the lowest value in the color map.
--max-value
FLOAT=nan
See --min-value; this is the equivalent upper limit of values. Any value that is above the max specified here will then be mapped to the over color, or be clipped to the highest value in the color map.
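As an illustration of the two --color-normalization modes, the following Python sketch maps a value inside [ min, max ] to a position between 0 and 1 on the color map. The exact formulas that grenedalf uses are not given here, so both branches are assumptions; in particular, this logarithmic variant requires a positive min, and the function name `normalize` is hypothetical.

```python
import math


def normalize(value, vmin, vmax, mode="linear"):
    """Map a value in [vmin, vmax] to a position in [0, 1] on the color map."""
    if not vmin < vmax:
        raise ValueError("vmin must be smaller than vmax")
    if not vmin <= value <= vmax:
        raise ValueError("out-of-range values need to be clipped or flagged first")
    if mode == "linear":
        return (value - vmin) / (vmax - vmin)
    if mode == "logarithmic":
        # Assumption of this sketch: take logarithms of the value and the limits.
        if vmin <= 0:
            raise ValueError("this logarithmic sketch needs vmin > 0")
        return (math.log(value) - math.log(vmin)) / (math.log(vmax) - math.log(vmin))
    raise ValueError(f"unknown mode: {mode}")


# A low value occupies far more of the color map in logarithmic mode:
print(round(normalize(0.01, 0.001, 1.0, "linear"), 3))       # 0.009
print(round(normalize(0.01, 0.001, 1.0, "logarithmic"), 3))  # 0.333
```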
--out-dir
TEXT=.
Directory to write files to.
--file-prefix
TEXT
File prefix for output files. Most grenedalf commands use the command name as the base name for file output. This option amends the base name, to distinguish runs with different data.
--file-suffix
TEXT
File suffix for output files. Most grenedalf commands use the command name as the base name for file output. This option amends the base name, to distinguish runs with different data.
--allow-file-overwriting
FLAG
Allow to overwrite existing output files instead of aborting the command.
--verbose
FLAG
Produce more verbose output.
--threads
UINT
Number of threads to use for calculations. If not set, we guess a reasonable number of threads, by looking at the environmental variables (1) OMP_NUM_THREADS (OpenMP) and (2) SLURM_CPUS_PER_TASK (slurm), as well as (3) the hardware concurrency, taking hyperthreads into account, in the given order of precedence.
--log-file
TEXT
Write all output to a log file, in addition to standard output to the terminal.
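The order of precedence used when --threads is not set can be pictured with the Python sketch below. It mirrors only the order stated above and is not grenedalf's code: the function name `guess_thread_count` is hypothetical, and `os.cpu_count()` reports logical CPUs, so the hyperthreading adjustment mentioned in the text is not reproduced here.

```python
import os


def guess_thread_count(env=None, hardware_threads=None):
    """Pick a thread count: OMP_NUM_THREADS, then SLURM_CPUS_PER_TASK, then hardware."""
    env = os.environ if env is None else env
    for name in ("OMP_NUM_THREADS", "SLURM_CPUS_PER_TASK"):
        raw = env.get(name, "").strip()
        # Only plain positive integers are accepted in this sketch.
        if raw.isdigit() and int(raw) > 0:
            return int(raw)
    if hardware_threads is None:
        # Logical CPU count; may be None on some platforms, hence the fallback.
        hardware_threads = os.cpu_count() or 1
    return max(1, hardware_threads)


print(guess_thread_count({"OMP_NUM_THREADS": "4", "SLURM_CPUS_PER_TASK": "8"}, 16))  # 4
print(guess_thread_count({"SLURM_CPUS_PER_TASK": "8"}, 16))                          # 8
print(guess_thread_count({}, 16))                                                    # 16
```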
When using this method, please do not forget to cite
Lucas Czech, Jeffrey Spence, Moises Exposito-Alonso. grenedalf: population genetic statistics for the next generation of pool sequencing. arXiv, 2023. doi:10.48550/arXiv.2306.11622