Genome build
We have included a suite of tools, including genome size survey, genetic map and Hi-C heatmap, to check the quality of a genome build.
Tip
Download the test dataset here.
The raw sequencing data provides a way to estimate the size, ploidy, heterozygosity and repeat content of a genome, similar to GenomeScope. Let's say that you have a kmer count histogram (commonly generated by Jellyfish or another kmer counter) in a file reads.histo:
1 1281576854
2 89292133
3 21588481
4 9347716
5 5569400
6 4705214
The 1st column is the frequency of a kmer in the sequencing data, and the 2nd column is the abundance, i.e. the number of distinct kmers with that frequency. From this histogram, all the genome statistics can be inferred and annotated directly on the kmer histogram plot.
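To make the two-column layout concrete, here is a minimal sketch that parses the rows shown above and tallies the number of distinct kmers and the total number of kmer occurrences. The helper names `read_histo` and `summarize` are hypothetical and are not part of jcvi; this only illustrates the file format, not how jcvi infers genome statistics.

```python
from io import StringIO


def read_histo(handle):
    """Parse a two-column kmer histogram into (frequency, abundance) pairs."""
    rows = []
    for line in handle:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        rows.append((int(fields[0]), int(fields[1])))
    return rows


def summarize(rows):
    """Return (distinct kmers, total kmer occurrences) for the parsed rows."""
    distinct = sum(abundance for _, abundance in rows)
    total = sum(freq * abundance for freq, abundance in rows)
    return distinct, total


# The first six rows of reads.histo, as listed above (a real file has many more rows)
sample = StringIO(
    "1 1281576854\n"
    "2 89292133\n"
    "3 21588481\n"
    "4 9347716\n"
    "5 5569400\n"
    "6 4705214\n"
)
rows = read_histo(sample)
distinct, total = summarize(rows)
print(f"rows={len(rows)} distinct_kmers={distinct} total_kmers={total}")
```

To read a real file, pass an open file handle instead of the `StringIO` object, e.g. `read_histo(open("reads.histo"))`.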
python -m jcvi.assembly.kmer histogram reads.histo "*S. species* ‘Variety 1’" 21
This takes the kmer count histogram and the species name that goes in the title, and finally the kmer size K that was used to generate the kmer histogram.
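Because the title contains spaces and asterisks, it must reach the program as a single argument. The following sketch assembles the same command shown above as an argument list and prints a shell-safe version of it. The helper `build_command` is hypothetical; only the argument order (histogram file, title, K) is taken from the command above.

```python
import shlex


def build_command(histo, title, k):
    """Assemble the kmer histogram command as an argument list (file, title, K)."""
    return [
        "python", "-m", "jcvi.assembly.kmer", "histogram",
        histo,   # kmer count histogram file
        title,   # species name shown in the plot title
        str(k),  # kmer size K used when counting
    ]


cmd = build_command("reads.histo", "*S. species* ‘Variety 1’", 21)
# shlex.join quotes each argument so the title stays a single shell word
print(shlex.join(cmd))
```

The list form can also be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting altogether; this assumes jcvi is installed in the active Python environment.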
© Haibao Tang, 2010-2024