Tools for dealing with Unique Molecular Identifiers

This repository contains tools for dealing with Unique Molecular Identifiers (UMIs)/Random Molecular Tags (RMTs). Currently there are two tools:

  • extract: Flexible removal of UMI sequences from fastq reads.
    UMIs are removed and appended to the read name. Any other barcode, for example a library barcode, is left on the read.
  • dedup: Implements a number of different UMI deduplication schemes.
    The recommended method is directional_adjacency.
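
To make the extract behaviour concrete, here is a minimal sketch of moving UMI bases into the read name while leaving other barcode bases on the read. It is not the umi_tools implementation: the function name `extract_umi`, the pattern convention (`N` marks a UMI base, any other character marks a base to keep) and the underscore separator are assumptions made for illustration.

```python
def extract_umi(name, seq, qual, pattern="NNNXXXXNN"):
    """Move UMI bases from the start of a read into its name.

    Hypothetical helper for illustration only. Positions marked 'N' in
    `pattern` are treated as UMI bases; all other positions (for example
    a library barcode) stay on the read.
    """
    if len(seq) < len(pattern):
        raise ValueError("read is shorter than the barcode pattern")
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")

    # Collect the UMI bases and the indices of the bases that stay.
    umi = "".join(base for base, p in zip(seq, pattern) if p == "N")
    keep = [i for i, p in enumerate(pattern) if p != "N"]

    # Retained pattern bases, followed by the rest of the read.
    new_seq = "".join(seq[i] for i in keep) + seq[len(pattern):]
    new_qual = "".join(qual[i] for i in keep) + qual[len(pattern):]

    # Append the UMI to the read identifier, keeping any trailing comment.
    read_id, sep, comment = name.partition(" ")
    new_name = f"{read_id}_{umi}{sep}{comment}"
    return new_name, new_seq, new_qual


name, seq, qual = extract_umi("@read1 1:N:0", "ACGTTAGGCATTGCA", "IIIIIIIIIFFFFFF")
print(name)  # @read1_ACGGC 1:N:0
print(seq)   # TTAGATTGCA
print(qual)  # IIIIFFFFFF
```

In this example the four bases under `X` (`TTAG`) play the role of a library barcode and remain at the start of the trimmed read, while the five `N` bases end up in the read name.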

See simulation results at the CGAT blog.

Genome Science 2015 poster.

Biorxiv Preprint.

Installation

If you're using Conda, you can use:

conda install -c https://conda.anaconda.org/toms umi_tools

Or pip:

pip install umi_tools

Or if you'd like to work directly from the git repository:

git clone git@github.com:CGATOxford/UMI-tools.git

Enter the repository directory and run:

python setup.py install

Help

To get help on umi_tools, run:

`umi_tools --help`

To get help on umi_tools extract, run:

`umi_tools extract --help`

To get help on umi_tools dedup, run:

`umi_tools dedup --help`

Dependencies

umi_tools depends on numpy, pandas, cython, pysam and future.
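
As a quick sanity check before installing from source, a short script along these lines can report which of the dependencies are not importable in the current environment. It is only a sketch: the helper `missing_modules` is hypothetical, and the import names in `REQUIRED` (in particular `Cython` with a capital C) are assumed to match the packages listed above.

```python
import importlib.util

# Import names assumed to correspond to the dependencies listed above.
REQUIRED = ["numpy", "pandas", "Cython", "pysam", "future"]


def missing_modules(names):
    """Return the names for which no importable top-level module is found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


missing = missing_modules(REQUIRED)
print("missing:", ", ".join(missing) if missing else "none")
```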
