Report and Quality Control
Many tools report their output statistics via MultiQC. We store the resulting HTML file in qc/multiqc.html for each run.
See the Rule Call Graph for an overview of the data flow into MultiQC, and see the multiqc rule for full details.
In some cases, you might only want the quality control without the actual variant calling, for example to get an overview of the quality of a sequencing run. You can then run just the pipeline steps needed to obtain the MultiQC report by calling
snakemake [other options] all_qc
which is a special rule that only runs the jobs necessary to get the MultiQC report.
Important remark: SnpEff and VEP are also inputs to MultiQC, and both depend on the variant calls.
Hence, if either of them is activated, the variant calling is still run. If SnpEff and VEP are not needed in your MultiQC report, make sure to deactivate them in your config.yaml first!
Lastly, we provide statistics on the reference genome (number of sequences and their lengths), using SeqKit, doi:10.1371/journal.pone.0163962. The output file with the statistics is stored next to the reference genome file, with the additional suffix .seqkit.
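To make concrete what "number of sequences and their lengths" means, here is a minimal pure-Python sketch that computes these two quantities from FASTA-formatted lines. It is only an illustration: it is not how SeqKit or the pipeline computes the statistics, and the function name fasta_lengths and the example records are made up for this purpose.

```python
from typing import Dict, Iterable


def fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Map each sequence name in FASTA-formatted lines to its length."""
    lengths: Dict[str, int] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: use the first whitespace-separated token as name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            # Sequence line: add its length to the current record.
            lengths[name] += len(line)
    return lengths


# Hypothetical toy input, standing in for a reference genome file.
example = [">chr1 toy record", "ACGT", "AC", ">chr2", "GGG"]
stats = fasta_lengths(example)
print("sequences:", len(stats))
for seq_name, seq_len in stats.items():
    print(seq_name, seq_len)
```

For a real reference genome, the same function could be fed an open file handle instead of the toy list.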
We also support automatically generating a Snakemake report for a run of the pipeline:
snakemake --report my-analysis/report.html
Add the --directory option to this command as needed.
For this to work, the Python packages networkx and pygraphviz must be installed:
sudo apt-get install python3-dev graphviz libgraphviz-dev pkg-config python-pip
sudo pip install networkx pygraphviz
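Before generating the report, it can help to check that both packages are importable in the Python environment that runs Snakemake. The following is a small sketch of such a check using only the standard library; the helper name missing_packages is hypothetical and not part of the pipeline.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


# The two packages named above as requirements for the report.
missing = missing_packages(["networkx", "pygraphviz"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All report dependencies are importable.")
```

Note that this only tests whether the modules can be located, not whether the underlying Graphviz libraries work correctly.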