- Scripts author: Guillem Ylla
- Article: "Insights into the genomic evolution of insects from cricket genomes"
- Genome database: https://gbimaculatusgenome.rc.fas.harvard.edu
This repository includes:
- A compilation of the scripts used for the de novo annotation of the genomes of the crickets G. bimaculatus and L. kohalensis.
- The Analysis directory contains the scripts used to analyze these genomes and generate the results shown in the publication.
- The Annotation_files directory contains the annotations (gff3 files) for G. bimaculatus and L. kohalensis.
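As a quick way to inspect one of the gff3 annotation files, the following is a minimal Python sketch that counts features by type (gene, mRNA, exon, and so on). It is not part of the repository's scripts; the function name `count_feature_types` and the example file name are hypothetical, and it assumes only the standard nine-column, tab-separated GFF3 layout.

```python
import sys
from collections import Counter


def count_feature_types(lines):
    """Count GFF3 features by the value of their third column (feature type)."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # An embedded FASTA section, if present, ends the annotation part.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, directives, and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A valid GFF3 feature line has exactly nine tab-separated columns.
        if len(fields) != 9:
            continue
        counts[fields[2]] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python count_features.py some_annotation.gff3
    with open(sys.argv[1]) as handle:
        for feature_type, n in count_feature_types(handle).most_common():
            print(feature_type, n, sep="\t")
```

Malformed lines are skipped silently here; a stricter check may be preferable when validating an annotation.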
The genome annotation pipeline involves:
- Repeat masking with RepeatMasker, using a custom repeat library generated by combining five different tools.
- Protein-coding gene annotation based on the MAKER2 pipeline.
- Functional annotation based on InterProScan and BLAST, and creation of an SQL database.
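To illustrate the last step, here is a minimal sketch of loading InterProScan results into an SQL database. It is an assumption-laden example, not the repository's actual implementation: it uses SQLite through Python's built-in `sqlite3` module, the table name `domains` and the function name `load_interproscan_tsv` are hypothetical, and it assumes InterProScan's tab-separated output, in which the first 11 columns are always present and the InterPro accession is the 12th.

```python
import sqlite3

# Leading columns of InterProScan's TSV output; the optional GO and pathway
# columns that may follow are ignored in this sketch.
COLUMNS = [
    "protein_id", "md5", "length", "analysis", "signature_acc",
    "signature_desc", "start", "stop", "score", "status", "date",
    "interpro_acc", "interpro_desc",
]


def load_interproscan_tsv(handle, conn):
    """Insert InterProScan TSV rows into a (hypothetical) 'domains' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS domains ("
        "protein_id TEXT, analysis TEXT, signature_acc TEXT, "
        "signature_desc TEXT, start INTEGER, stop INTEGER, interpro_acc TEXT)"
    )
    rows = []
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # skip blank or truncated lines
        rec = dict(zip(COLUMNS, fields))
        rows.append((
            rec["protein_id"], rec["analysis"], rec["signature_acc"],
            rec["signature_desc"], int(rec["start"]), int(rec["stop"]),
            rec.get("interpro_acc", "-"),  # "-" when no InterPro entry is listed
        ))
    conn.executemany("INSERT INTO domains VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

Once loaded, a query such as `SELECT protein_id FROM domains WHERE signature_acc = ?` retrieves all proteins carrying a given domain signature.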
The pipeline is based on multiple online sources. The most relevant resources used were:
- MAKER tutorial
- MAKER tutorial to create custom repeat libraries
- Daren Card's pipeline for annotating the Boa constrictor genome (https://gist.github.com/darencard/bb1001ac1532dd4225b030cf0cd61ce2)
The annotation pipeline used for both crickets is very similar. If you would like to use it for another species, I recommend starting from the G. bimaculatus one, since its scripts are better explained.
Schematic representation of the G. bimaculatus genome annotation:
Please cite:
Ylla, G., Nakamura, T., Itoh, T. et al. Insights into the genomic evolution of insects from cricket genomes. Commun Biol 4, 733 (2021). https://doi.org/10.1038/s42003-021-02197-9