TSG2-little_assembler

This was a university project for a class focused on NGS data analysis.

The task was to create a simple assembler that returns contigs. We were allowed to make the following simplifying assumptions about the data:

reads come from one strand of one chromosome
read length is 80
average coverage of the reference sequence is 5
reads contain 1%, 2%, or 5% errors (three data sets)
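
To get a feel for what these assumptions imply, the short sketch below works through the arithmetic. It is only an illustration: the function names are made up for this README, and the reference length of 1000 and k-mer size of 15 in the demo are hypothetical values, not values taken from the assignment or from the script.

```python
import math


def expected_errors_per_read(read_length, error_rate):
    """Mean number of erroneous bases in one read."""
    return read_length * error_rate


def error_free_kmer_probability(k, error_rate):
    """Chance that a k-mer contains no error, assuming independent errors."""
    return (1 - error_rate) ** k


def reads_needed(reference_length, coverage, read_length):
    """Number of reads required for the given average coverage."""
    return math.ceil(reference_length * coverage / read_length)


if __name__ == "__main__":
    # Hypothetical reference length; 80 and 5 come from the assumptions above.
    print("reads for a 1000 bp reference:", reads_needed(1000, 5, 80))
    for rate in (0.01, 0.02, 0.05):
        errors = expected_errors_per_read(80, rate)
        clean = error_free_kmer_probability(15, rate)  # k=15 is hypothetical
        print(f"error rate {rate:.0%}: {errors:.1f} errors/read, "
              f"{clean:.2f} of 15-mers error-free")
```

The higher the error rate, the fewer k-mers survive intact, which is why some form of error correction matters at a coverage as low as 5.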

We were allowed to use the code shown in class, which is based on Ben Langmead's code.

My assembler mixes a bit of the OLC (overlap-layout-consensus) and DBG (de Bruijn graph) approaches, mostly for fun. The simplest explanation: it uses a greedy OLC approach with DBG-like error correction.
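
The general idea can be sketched as follows. This is a minimal illustration written for this README, not the code of the `assembly` script: the function names, the k-mer size, the count threshold, and the minimum overlap are all assumptions. Reads are first corrected using k-mer counts (the DBG-like part), then merged greedily by longest suffix/prefix overlap (the OLC part).

```python
from collections import Counter
from itertools import permutations


def kmer_counts(reads, k):
    """Count every k-mer across all reads (the k-mer spectrum)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def correct_read(read, counts, k, threshold, alphabet="ACGT"):
    """Fix a base when one substitution turns a rare k-mer into a frequent one."""
    read = list(read)
    for i in range(len(read) - k + 1):
        kmer = "".join(read[i:i + k])
        if counts[kmer] >= threshold:
            continue  # k-mer is frequent enough to be trusted
        fixed = False
        for j in range(k):
            for base in alphabet:
                candidate = kmer[:j] + base + kmer[j + 1:]
                if base != kmer[j] and counts[candidate] >= threshold:
                    read[i + j] = base
                    fixed = True
                    break
            if fixed:
                break
    return "".join(read)


def overlap(a, b, min_length):
    """Length of the longest suffix of a that is a prefix of b (0 if too short)."""
    start = 0
    while True:
        start = a.find(b[:min_length], start)
        if start == -1:
            return 0
        if b.startswith(a[start:]):
            return len(a) - start
        start += 1


def greedy_assemble(reads, min_overlap):
    """Repeatedly merge the pair with the longest overlap; return the contigs."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_pair = 0, None
        for a, b in permutations(reads, 2):
            olen = overlap(a, b, min_overlap)
            if olen > best_len:
                best_len, best_pair = olen, (a, b)
        if best_pair is None:
            return reads  # no overlaps left: these are the contigs
        a, b = best_pair
        reads.remove(a)
        reads.remove(b)
        reads.append(a + b[best_len:])


if __name__ == "__main__":
    raw = ["ATGGCGTGCA"] * 3 + ["ATGGAGTGCA", "GTGCAATCCG", "ATCCGATTAC"]
    counts = kmer_counts(raw, k=5)
    cleaned = [correct_read(r, counts, k=5, threshold=2) for r in raw]
    print(greedy_assemble(cleaned, min_overlap=4))
```

The all-pairs overlap search is quadratic in the number of reads per merge step, so a sketch like this is only practical for small inputs. The greedy strategy can also collapse repeats incorrectly, which is a known limitation of the approach.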

Usage

To use the script, run the following in the console:

assembly input.fasta output.fasta

with input and output filenames of your choosing.
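
Both files are in FASTA format. For readers unfamiliar with it, here is a minimal sketch of reading and writing such files. It is an assumption about how this can be done in general, not the script's actual I/O code; the function names, record names, and line width are hypothetical.

```python
import io


def parse_fasta(handle):
    """Return a list of (name, sequence) tuples from an open FASTA file."""
    records = []
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line)  # sequences may span several lines
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def write_fasta(records, handle, width=60):
    """Write (name, sequence) tuples, wrapping sequences at `width` characters."""
    for name, seq in records:
        handle.write(f">{name}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")


if __name__ == "__main__":
    reads = parse_fasta(io.StringIO(">read1\nACGT\nTT\n>read2\nGG\n"))
    out = io.StringIO()
    write_fasta(reads, out, width=4)
    print(out.getvalue(), end="")
```

With real files, `open("input.fasta")` and `open("output.fasta", "w")` would take the place of the `io.StringIO` objects used in the demo.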
