Michael Shaffer edited this page Dec 20, 2019 · 15 revisions

Welcome to the wiki for DRAM (Distilling and Refining Annotations of Metabolism)! Here you will find all you need to know to install, set up, and run DRAM and DRAM-v.

DRAM Overview


Like the process of making the eponymous glass of whiskey, DRAM distills genome annotations into metabolic functions at three levels that scale in information: (1) Raw, (2) Distillate, and (3) Liquor. Through this distillation process, DRAM can annotate high volumes of microbial genomes and organize the resulting information in a way that highlights functional guilds, allowing users to infer organismal metabolism across hundreds of genomes.

To obtain the Raw output, DRAM calls genes on the input genomes, searches each gene against seven databases, and considers all derived annotations together. This approach increases the number of database searches by at least 25% over other annotators such as DFAST, MetaERG, and Prokka. The Raw output contains all database hits per gene in every input genome, which is where most annotators stop. DRAM goes beyond this raw output by providing a first-of-its-kind organization and visualization of all annotations into ecosystem-relevant functions.

[Figure: DRAM Overview]

[Figure: DRAM-v Overview]

Quickstart

These are the commands needed to quickly install, set up, and start running DRAM and DRAM-v.

DRAM installation

We recommend installing DRAM within a conda environment. If you would like to install DRAM manually, see the How to Install and Set Up DRAM section of the wiki.

wget https://raw.githubusercontent.com/shafferm/DRAM/master/environment.yaml
conda env create -f environment.yaml -n DRAM

If this installation method is used, then all further steps should be run inside the created DRAM environment.
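
Conda environments are entered with `conda activate`, so a session would typically begin as sketched below. The `in_dram_env` function is a hypothetical helper, not part of DRAM. It relies on the `CONDA_DEFAULT_ENV` variable that conda sets on activation and assumes the environment was named DRAM, as in the command above.

```shell
# Enter the environment created above (run this in an interactive shell):
#   conda activate DRAM

# Hypothetical helper: succeeds only when the active conda environment is DRAM.
in_dram_env() {
    [ "${CONDA_DEFAULT_ENV:-}" = "DRAM" ]
}

if in_dram_env; then
    echo "DRAM environment is active"
else
    echo "DRAM environment is not active; run: conda activate DRAM" >&2
fi
```

Placing a guard like this at the top of a job script can catch the case where a scheduler starts the job outside the environment.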

DRAM setup

Then set up DRAM using the following command:

DRAM.py prepare_databases --output_dir DRAM_data --kegg_loc kegg.pep
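
Because the processed databases take up about 20 GB with a standard setup (see System Requirements), it can be worth checking free space before preparing them. The sketch below is a hypothetical pre-flight check, not part of DRAM. The names `free_gb`, `required_gb`, and `target_dir` are illustrative, and the check assumes a POSIX `df`. It considers only the final footprint quoted on this page, not any temporary space used while the databases are being built.

```shell
# Hypothetical helper: print free space (whole GB) on the filesystem holding $1.
free_gb() {
    # -P forces the one-line-per-filesystem POSIX layout; -k reports 1024-byte blocks.
    # Column 4 of the data line is the available space.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1048576) }'
}

required_gb=20   # approximate size of the processed databases
target_dir=.     # directory in which DRAM_data would be created

available_gb=$(free_gb "$target_dir")
if [ "$available_gb" -lt "$required_gb" ]; then
    echo "Only ${available_gb} GB free; about ${required_gb} GB is needed" >&2
else
    echo "${available_gb} GB free; enough for the processed databases"
fi
```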

Running DRAM

Once DRAM is set up you are ready to annotate some MAGs. The following commands will generate the full annotation and distillation of MAGs:

DRAM.py annotate -i 'my_bins/*.fa' -o annotation
DRAM.py distill -i annotation/annotations.tsv -o genome_summaries --trna_path annotation/trnas.tsv --rrna_path annotation/rrnas.tsv
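
Since the distill step reads three files written by the annotate step, a quick check between the two commands can save a failed run. The following is a minimal sketch and not part of DRAM. `check_annotation_outputs` is a hypothetical helper, and it assumes all three files named in the distill command above are expected to exist and be non-empty.

```shell
# Hypothetical helper: verify that the files the distill command reads
# exist and are non-empty in the given annotation directory.
check_annotation_outputs() {
    dir=$1
    rc=0
    for name in annotations.tsv trnas.tsv rrnas.tsv; do
        if [ ! -s "$dir/$name" ]; then
            echo "missing or empty: $dir/$name" >&2
            rc=1
        fi
    done
    return "$rc"
}

if check_annotation_outputs annotation; then
    echo "annotation outputs present; ready to distill"
fi
```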

Running DRAM-v

Annotating and distilling viral contigs requires some preprocessing and an additional input. The contigs must first be processed with VirSorter; the resulting viral contigs and the VIRSorter_affi-contigs.tab file are then used as input to DRAM-v. The following commands will generate the full annotation and distillation of viral contigs:

DRAM-v.py annotate -i my_viral_contigs.fa -v VIRSorter_affi-contigs.tab -o annotation
DRAM-v.py distill -i annotation/annotations.tsv -o annotation/distilled

System Requirements

DRAM has a large memory burden and is designed to be run on servers. DRAM annotates against a wide variety of databases, which must be processed and stored. With a standard setup, the processed DRAM databases take up about 20 GB of storage. Memory usage depends on the databases used. When annotating with UniRef90, around 220 GB of RAM is required. If the KEGG gene database has been provided and the --skip_uniref flag is used, memory usage is around 100 GB of RAM. If KOfam is used to annotate KEGG along with the --skip_uniref flag, less than 50 GB of RAM is required. DRAM can be run with any number of processors on a single node.
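
The memory figures above can be turned into a rough pre-flight check. The sketch below is not part of DRAM. `dram_ram_gb` and its mode names (`uniref`, `kegg`, `kofam`) are hypothetical labels for the three configurations described in this section, and the comparison against installed memory assumes a Linux host that exposes `/proc/meminfo`.

```shell
# Hypothetical helper: map a configuration to the approximate RAM (GB) quoted above.
#   uniref : annotating with UniRef90
#   kegg   : KEGG gene database provided, --skip_uniref used
#   kofam  : KOfam used for KEGG, --skip_uniref used (50 is an upper bound)
dram_ram_gb() {
    case "$1" in
        uniref) echo 220 ;;
        kegg)   echo 100 ;;
        kofam)  echo 50 ;;
        *)      echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

mode=kofam
needed_gb=$(dram_ram_gb "$mode")

# MemTotal in /proc/meminfo is reported in kB; the file exists only on Linux.
if [ -r /proc/meminfo ]; then
    total_gb=$(awk '/^MemTotal:/ { print int($2 / 1048576) }' /proc/meminfo)
    if [ "$total_gb" -lt "$needed_gb" ]; then
        echo "Machine has ${total_gb} GB RAM; mode '$mode' needs about ${needed_gb} GB" >&2
    else
        echo "Machine has ${total_gb} GB RAM; sufficient for mode '$mode'"
    fi
fi
```

On a cluster, the same number could instead be used as the memory request passed to the scheduler.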