KlusterCaller-to-VCF is a shell script that takes the output of the KlusterCaller software and converts it into a compressed Variant Call Format file (.vcf.gz), using base R functions for the data manipulation. The resulting VCF complies with the format's requirements and can be read by other programs such as TASSEL.
KlusterCaller-to-VCF requires three inputs:
- A KlusterCaller formatted output - example
- A keyfile which relates KlusterCaller output to allelic states - example
- A string for the output name (e.g., "output_example")
Both input files must be tab-delimited. Marker names are case-sensitive and must match exactly between the two files.
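Before running the conversion, it can help to confirm that both inputs really are tab-delimited. The following is a minimal sketch, not part of the repository: the helper name `is_tab_delimited` is hypothetical, and it only checks that every non-empty line contains at least one tab. It does not compare marker names, since the column layout of the two files is not described here.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of klustercaller_to_vcf.sh).
# Succeeds only if no line in the file is a single field, i.e. every
# non-empty line contains at least one tab character.
is_tab_delimited() {
    awk -F'\t' 'NF == 1 { bad = 1 } END { exit bad + 0 }' "$1"
}

for f in klustercaller_keyfile_example.txt klustercaller_file_example.txt; do
    [ -f "$f" ] || continue   # skip files that are not in the current directory
    if is_tab_delimited "$f"; then
        echo "$f: looks tab-delimited"
    else
        echo "$f: at least one line has no tab" >&2
    fi
done
```

A file that was saved with spaces or commas instead of tabs will be reported by the second branch.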
To run KlusterCaller-to-VCF, call the klustercaller_to_vcf.sh file directly using the following command:
bash klustercaller_to_vcf.sh
Run this either from the directory where you have pulled the repository or by giving a direct path to the location of the script. To view the usage message for further assistance, use the following command:
bash klustercaller_to_vcf.sh --help
To run the provided example and view its textual output, use the following command:
bash klustercaller_to_vcf.sh \
-k klustercaller_keyfile_example.txt \
-c klustercaller_file_example.txt \
-o example_output \
-v
This will result in a compressed VCF file that looks like this example.
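To take a quick look at the result from the command line, something like the sketch below can be used. It assumes the output file is named `example_output.vcf.gz` (the name passed to `-o` plus a `.vcf.gz` extension, which is a guess), and the helper `peek_vcf` is hypothetical. Because bgzip-compressed files are gzip-compatible, plain `gzip -cd` is enough to read them.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository).
# Print the column-header line and the first few records of a compressed VCF,
# skipping the "##" metadata lines. The second argument is the number of
# lines to show (default 5).
peek_vcf() {
    gzip -cd "$1" | grep -v '^##' | head -n "${2:-5}"
}

out=example_output.vcf.gz   # assumed output name for "-o example_output"
if [ -f "$out" ]; then
    peek_vcf "$out"
fi
```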
The bash script provided in this GitHub repository requires the following to run properly:
- R statistical programming language - link to page
- samtools/bcftools - link to page
Please make sure both programs are installed properly before running the script to avoid errors.
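One way to check for the requirements before running is sketched below. The executable names `Rscript` and `bcftools` are assumptions about what the script calls, and the helper `missing_tools` is hypothetical; adjust the names to match your installation.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository).
# Print the name of every listed command that is not found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed executable names; adjust to match your installation.
missing=$(missing_tools Rscript bcftools)
if [ -n "$missing" ]; then
    printf 'Not found on PATH:\n%s\n' "$missing" >&2
else
    echo "R and bcftools were both found."
fi
```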