This tutorial is forked from vappiah/bacterial-genomics-tutorial and shared with our lab co-workers for educational use. Thanks a lot!
conda config --add channels conda-forge
conda config --add channels bioconda
conda config --add channels daler
conda config --add channels defaults
git clone https://github.com/vappiah/bacterial-genomics-tutorial.git
cd bacterial-genomics-tutorial
conda env create -f environment.yaml
mkdir apps
wget https://github.com/broadinstitute/pilon/releases/download/v1.23/pilon-1.23.jar -O apps/pilon.jar
source activate bacterial-genomics-tutorial
chmod +x *.{py,sh,pl}
pip install -r pip-requirements.txt
./download_data.sh
./qc_raw_reads.sh
./trim_reads.sh
./qc_trimmed_reads.sh
./assemble.sh
### Step 7: Perform QC for both raw assembly and polished assembly
./qc_assembly.sh
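One metric commonly used to compare a raw and a polished assembly is the contig N50. The sketch below is only an illustration of that metric and is not taken from qc_assembly.sh, whose exact checks are not shown here. The function names `contig_lengths` and `n50` are hypothetical.

```python
def contig_lengths(fasta_lines):
    """Return the length of each sequence in an iterable of FASTA lines."""
    lengths = []
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            # A header line starts a new contig.
            lengths.append(0)
        elif line and lengths:
            lengths[-1] += len(line)
    return lengths


def n50(lengths):
    """Return the contig length at which the running total, taken from
    the longest contig down, first reaches half of the assembly size."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty assembly


# Tiny in-memory example; a real run would pass an open FASTA file instead.
demo = [">contig1", "ACGTACGT", "ACGT", ">contig2", "ACGT", ">contig3", "AC"]
print(n50(contig_lengths(demo)))
```

Running it on both the raw and the polished FASTA gives two numbers that can be compared directly.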
### Step 8: Generate a draft genome by reordering contigs against a reference genome using RagTag
./reorder_contigs.sh
./amr.sh
./annotate.sh
Features such as genes and CDS will be counted and displayed. The script requires you to specify the folder where the annotations were saved, e.g. P7741_annotation for sample P7741. Use Python to run the script.
python get_annot_stats.py P7741_annotation P7741
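To give a feel for what counting annotation features involves, here is a minimal sketch that tallies the feature-type column of a GFF3 file. It is not the code of get_annot_stats.py; the function name `count_gff_features` is hypothetical, and the sketch assumes a Prokka-style GFF3 in which the sequence follows a `##FASTA` line.

```python
from collections import Counter


def count_gff_features(gff_lines):
    """Count feature types (third column) in an iterable of GFF3 lines."""
    counts = Counter()
    for line in gff_lines:
        # Assumption: the genome sequence is appended after a ##FASTA line.
        if line.startswith("##FASTA"):
            break
        # Skip directives, comments and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts


# Small in-memory example; a real run would pass an open .gff file.
demo = [
    "##gff-version 3\n",
    "contig1\tprokka\tgene\t1\t90\t.\t+\t.\tID=g1_gene\n",
    "contig1\tprokka\tCDS\t1\t90\t.\t+\t0\tID=g1\n",
    "contig1\tprokka\tgene\t200\t290\t.\t-\t.\tID=g2_gene\n",
    "##FASTA\n",
    ">contig1\n",
    "ACGT\n",
]
for feature, n in count_gff_features(demo).items():
    print(feature, n)
```

With a real annotation, the same function could be called as `count_gff_features(open(path))`, where `path` points to the GFF file inside the annotation folder.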
./dendogram.sh
The pangenome analysis takes GFF (version 3) files as input. Prokka-generated GFF files are recommended, so we generate GFFs for the files in the genome folder by reannotating them with Prokka. We use the get_genome_gffs script:
./get_genome_gffs.sh
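The reannotation step amounts to running Prokka once per genome. The sketch below only builds the command lines and does not run them. It is not the content of get_genome_gffs.sh; the function name `prokka_commands`, the output folder `genome_gffs` and the example FASTA paths are assumptions made for illustration. `--outdir` and `--prefix` are standard Prokka options.

```python
from pathlib import Path


def prokka_commands(fasta_paths, out_root="genome_gffs"):
    """Build one Prokka command per genome FASTA (returned, not executed)."""
    commands = []
    for fasta in map(Path, fasta_paths):
        name = fasta.stem  # e.g. "Agy99" for "Agy99.fasta"
        commands.append(
            [
                "prokka",
                "--outdir", str(Path(out_root) / name),
                "--prefix", name,
                str(fasta),
            ]
        )
    return commands


# Hypothetical input files; adjust to the actual contents of the genome folder.
for cmd in prokka_commands(["genomes/Agy99.fasta", "genomes/Liflandii.fasta"]):
    print(" ".join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` once Prokka is available in the active environment.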
Then perform the pangenome analysis:
./get_pangenome.sh
### Step 15: Get a gene summary for three of the organisms
The defaults are P7741, Agy99 and Liflandii; feel free to change them. A Venn diagram will be generated (gene_count_summary.png).
python gene_count_summary.py P7741 Agy99 Liflandii pangenome/gene_presence_absence.csv
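The numbers behind a three-way Venn diagram can be derived from the presence/absence table by counting, for each gene cluster, which of the chosen organisms carry it. This is a minimal sketch, not the code of gene_count_summary.py. It assumes the CSV has one column per organism and that an empty cell means the gene is absent; the function name `venn_counts` is hypothetical.

```python
import csv
import io
from collections import Counter


def venn_counts(csv_handle, organisms):
    """Count gene clusters for each combination of organisms that carry them."""
    counts = Counter()
    for row in csv.DictReader(csv_handle):
        # Assumption: a non-empty cell in an organism's column means "present".
        present = tuple(
            name for name in organisms if (row.get(name) or "").strip()
        )
        if present:
            counts[present] += 1
    return counts


# Small in-memory table standing in for pangenome/gene_presence_absence.csv.
demo = io.StringIO(
    "Gene,P7741,Agy99,Liflandii\n"
    "geneA,p1,a1,l1\n"
    "geneB,p2,,\n"
    "geneC,p3,a3,\n"
    "geneD,,,l4\n"
)
for combo, n in sorted(venn_counts(demo, ["P7741", "Agy99", "Liflandii"]).items()):
    print(" & ".join(combo), n)
```

Each key corresponds to one region of the Venn diagram: a single name is a gene set unique to that organism, and all three names together form the shared core.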
If you are working on a cluster, you will want to combine the analysis results into a zip file that you can download and view locally.
./zip_results.sh
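If you prefer to bundle a results folder from Python, the standard library can do it in one call. This is a sketch under stated assumptions and not the content of zip_results.sh: the function name `zip_folder` and the folder and file names in the demo are hypothetical, and the demo runs on a throwaway directory.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def zip_folder(folder, archive_base):
    """Pack everything under `folder` into `<archive_base>.zip`; return its path."""
    return shutil.make_archive(str(archive_base), "zip", root_dir=str(folder))


# Demo on a temporary directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    results = Path(tmp) / "results"  # hypothetical results folder
    results.mkdir()
    (results / "summary.txt").write_text("demo\n")
    # Write the archive outside the folder being zipped.
    archive = zip_folder(results, Path(tmp) / "results_archive")
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
    print(Path(archive).name, names)
```

Keeping the archive outside the folder being packed avoids the zip file trying to include itself.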
### Step 16: Compare your draft genome with the other organisms in the genomes folder by generating circular structures for them
Use the tutorial here to guide you: https://youtu.be/pobQgE4z-5Q
An interpretation of the results is available in my YouTube video tutorial: https://youtu.be/S_sRo_85jhs
Now that you have performed a bacterial comparative genome analysis, it's time to apply your skills to real-world data. Good luck and see you next time!
Vincent Appiah, 2020. Bacterial Genomics Tutorial https://github.com/vappiah/bacterial-genomics-tutorial