Skip to content

Experiments optimising python (with Cython)

Notifications You must be signed in to change notification settings

pathogen-informatics/PySnpSites

 
 

Repository files navigation

PySnpSites

Experiments profiling and optimising something like snp-sites written in Python.

Have a look at the git history to see the difference optimisations used, it's now plenty quick (10s for 1GB aligned fasta?). The full log entries include run times for processing a cynthetic 1GB fasta (with a couple of exceptions).

This uses Cython; there are probably more annotations I could add to make this even quicker

Usage

Create an aligned fasta with random sequences in your current directory (211 x 5MBases ~= 1GB fasta)

./random_fasta.py

Create a VCF from an aligned fasta

./snp_sites.py random.fa random.python.vcf

This should take about 10s per GB of input fasta

WARNING

This is not a replacement for snp-sites. For one thing, it uses a really hacky fasta parser which probably doesn't play nicely with any input not created by the random_fasta.py script used in this project. For another, it doesn't sanity check the file are all.

Basically don't use this for anything, it is retained more as a quick reference for me.

About

Experiments optimising python (with Cython)

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 77.7%
  • Cython 22.3%