University class project. Fun simple assembler. Based loosely on Ben Langmead's scripts

hansiu/TSG2-little_assembler

TSG2-little_assembler

This was a university project for a class focused on NGS data analysis.

The task was to create a simple assembler that returns contigs. We were allowed to make the following simplifying assumptions about the data:

  • reads come from one strand of one chromosome
  • read length is 80
  • average coverage of the reference sequence is 5
  • reads contain 1%, 2%, or 5% errors (three data sets)
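
To get a feel for what these assumptions mean in practice, here is a minimal Python sketch of the arithmetic behind them. The function names and the 1000 bp reference length in the usage note are made up for illustration and are not part of the assignment or of this repository.

```python
READ_LENGTH = 80   # from the assumptions above
COVERAGE = 5       # average coverage from the assumptions above


def expected_read_count(reference_length, read_length=READ_LENGTH, coverage=COVERAGE):
    """Number of reads needed so that coverage = reads * read_length / reference_length."""
    return coverage * reference_length / read_length


def expected_errors_per_read(error_rate, read_length=READ_LENGTH):
    """Average number of erroneous bases in a single read."""
    return error_rate * read_length


if __name__ == "__main__":
    # Hypothetical 1000 bp reference, used only as an example.
    print(expected_read_count(1000))
    for rate in (0.01, 0.02, 0.05):
        print(rate, expected_errors_per_read(rate))
```

At the 5% error rate an 80-base read carries about four wrong bases on average, which is why exact-match overlaps alone are unlikely to be enough and some form of error correction helps.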

We were allowed to use the code shown and used in class, which was based on Ben Langmead's code.

My assembler is based on the idea of having some fun by mixing the OLC (overlap-layout-consensus) and DBG (de Bruijn graph) approaches. The simplest explanation: it uses a greedy OLC approach with DBG-like error correction.
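
The general idea can be sketched as follows. This is only an illustration of the technique, not the code in this repository: the function names, the `min_count` threshold, and the rule of applying a correction only when exactly one substitution makes a rare k-mer frequent are all assumptions made for the example.

```python
from collections import Counter
from itertools import permutations


def kmer_counts(reads, k):
    """Count every k-mer across all reads (the DBG-like part)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def correct_read(read, counts, k, min_count=2, alphabet="ACGT"):
    """Fix a base when exactly one substitution turns a rare k-mer into a frequent one."""
    bases = list(read)
    for i in range(len(bases) - k + 1):
        kmer = "".join(bases[i:i + k])
        if counts[kmer] >= min_count:
            continue  # k-mer is seen often enough, trust it
        candidates = [
            (j, base)
            for j in range(k)
            for base in alphabet
            if base != kmer[j] and counts[kmer[:j] + base + kmer[j + 1:]] >= min_count
        ]
        if len(candidates) == 1:  # skip ambiguous corrections
            j, base = candidates[0]
            bases[i + j] = base
    return "".join(bases)


def overlap(a, b, min_length):
    """Length of the longest suffix of a matching a prefix of b, or 0 if below min_length."""
    start = 0
    while True:
        start = a.find(b[:min_length], start)
        if start == -1:
            return 0
        if b.startswith(a[start:]):
            return len(a) - start
        start += 1


def greedy_assemble(reads, min_overlap):
    """Repeatedly merge the pair with the longest overlap (the greedy OLC part)."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        best_len, best_a, best_b = 0, None, None
        for a, b in permutations(range(len(contigs)), 2):
            length = overlap(contigs[a], contigs[b], min_overlap)
            if length > best_len:
                best_len, best_a, best_b = length, a, b
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_a] + contigs[best_b][best_len:]
        contigs = [c for i, c in enumerate(contigs) if i not in (best_a, best_b)]
        contigs.append(merged)


if __name__ == "__main__":
    # Toy data: a made-up 14-base reference, 8-base reads, one read with an error.
    reference = "ATGGCGTGCAATCC"
    clean = [reference[i:i + 8] for i in range(0, 7, 2)]
    reads = clean * 2 + ["GGCATGCA"]
    counts = kmer_counts(reads, 4)
    corrected = [correct_read(r, counts, 4) for r in reads]
    print(greedy_assemble(corrected, 4))
```

On the toy data the erroneous read is restored before overlapping, so the greedy merge ends with a single contig equal to the reference. With real 80-base reads the values of `k`, `min_count`, and `min_overlap` would need tuning, and the all-pairs overlap search shown here is quadratic in the number of contigs.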

Usage

To use the script, run the following in the console:

assembly input.fasta output.fasta

with input and output filenames of your choosing.
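
For readers unfamiliar with the format, reading reads from a FASTA file and writing contigs back could look roughly like the sketch below. The helper names, the `contig_N` record names, and the 60-character line width are assumptions for illustration; the actual script may handle its files differently.

```python
def read_fasta(handle):
    """Return a list of (name, sequence) tuples from an open FASTA file."""
    records = []
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line)  # sequences may span several lines
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def write_fasta(handle, sequences, prefix="contig", width=60):
    """Write sequences as FASTA records named prefix_1, prefix_2, ..."""
    for number, sequence in enumerate(sequences, start=1):
        handle.write(f">{prefix}_{number}\n")
        for i in range(0, len(sequence), width):
            handle.write(sequence[i:i + width] + "\n")
```

Both helpers take open file objects, so they can be used with `open("input.fasta")` and `open("output.fasta", "w")` or with in-memory buffers when testing.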
