Home
Welcome to the wiki for the tool for Distilling and Refining Annotations of Metabolism (DRAM)! Here you will find all you need to know to set up, install, and run DRAM and DRAM-v.
Like the process of making the eponymous glass of whiskey, DRAM distills genome annotations to metabolic functions in three levels that scale in information: (1) Raw, (2) Distillate, and (3) Liquor. Through this distillation process, DRAM can annotate high volumes of microbial genomes and organize the resulting information in a way that highlights functional guilds, allowing users to infer organismal metabolism across hundreds of genomes. To obtain the Raw output, DRAM calls genes on input genomes, searches each gene against seven databases, and considers all derived annotations together. This approach increases database searches by at least 25% beyond other annotators such as DFAST, MetaERG, and Prokka. The DRAM Raw output contains all database hits per gene in every input genome, which is the final output for most annotators. DRAM advances genome annotation beyond this raw output by providing a first-of-its-kind organization and visualization of all annotations into ecosystem-relevant functions.
These are the commands needed to install, set up, and get started running DRAM and DRAM-v.
It is recommended to install DRAM within a conda environment. If you would like to install DRAM manually, see the How to Install and Set Up DRAM section of the Wiki.
wget https://raw.githubusercontent.com/shafferm/DRAM/master/environment.yaml
conda env create -f environment.yaml -n DRAM
If this installation method is used, all further steps should be run inside the created DRAM environment (activate it with conda activate DRAM).
Then set up DRAM using the following command:
DRAM.py prepare_databases --output_dir DRAM_data --kegg_loc kegg.pep
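Before running prepare_databases, it can help to confirm that the target disk has room. This is a minimal sketch that assumes the figure of about 20 GB for processed databases given in the system requirements on this page. The helper name `has_free_space` is made up for illustration and is not part of DRAM.

```shell
# Hypothetical helper, not part of DRAM: succeed if the filesystem holding
# directory $1 has at least $2 GB available.
has_free_space() {
    dir=$1
    need_gb=$2
    # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 of the
    # second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# 20 GB is the approximate size of the processed databases with a standard setup.
if has_free_space . 20; then
    echo "enough space for the processed DRAM databases"
else
    echo "less than 20 GB free here; consider another --output_dir location"
fi
```

The check is run against the current directory here; point it at the parent of the intended --output_dir if the databases will live on another filesystem.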
Once DRAM is set up you are ready to annotate some MAGs. The following commands will generate the full annotation and distillation of MAGs:
DRAM.py annotate -i 'my_bins/*.fa' -o annotation
DRAM.py distill -i annotation/annotations.tsv -o genome_summaries --trna_path annotation/trnas.tsv --rrna_path annotation/rrnas.tsv
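The distill command above reads three files from the annotation output directory. A small pre-check such as the following sketch can report which of them are absent before distill is launched. The helper name `missing_annotation_files` is hypothetical and not part of DRAM. Whether DRAM always writes trnas.tsv and rrnas.tsv is not stated here, so treat a missing file as a prompt to check the annotate output rather than as a definite error.

```shell
# Hypothetical helper, not part of DRAM: list which of the files passed to
# `DRAM.py distill` are missing from an annotation output directory.
missing_annotation_files() {
    dir=$1
    for name in annotations.tsv trnas.tsv rrnas.tsv; do
        [ -f "$dir/$name" ] || echo "$name"
    done
}

missing=$(missing_annotation_files annotation)
if [ -z "$missing" ]; then
    echo "all distill inputs present"
else
    # $missing is left unquoted on purpose so the names print on one line.
    echo "missing from annotation/:" $missing
fi
```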
Annotating and distilling viral contigs requires some preprocessing and an additional input. The contigs must be processed with VirSorter, and the processed viral contigs and VIRSorter_affi-contigs.tab are used as input to DRAM-v. The following commands will generate the full annotation and distillation of viral contigs:
DRAM-v.py annotate -i my_viral_contigs.fa -v VIRSorter_affi-contigs.tab -o annotation
DRAM-v.py distill -i annotation/annotations.tsv -o annotation/distilled
DRAM has a large memory burden and is designed to be run on servers. DRAM annotates against a large variety of databases, which must be processed and stored. With a standard setup, the processed DRAM databases take up about 20 GB of storage. DRAM memory usage depends on the databases used. When annotating with UniRef90, around 220 GB of RAM is required. If the KEGG gene database has been provided and the --skip_uniref flag is used, memory usage is around 100 GB of RAM. If KOfam is used to annotate KEGG along with the --skip_uniref flag, less than 50 GB of RAM is required. DRAM can be run with any number of processors on a single node.
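The memory figures above can be turned into a quick lookup when deciding which database setup fits a given server. This is only a sketch: the function name `suggest_dram_config` is hypothetical, the thresholds are the approximate values quoted above rather than hard limits, and the amount of RAM is passed in by hand as whole gigabytes.

```shell
# Hypothetical helper, not part of DRAM: map available RAM (whole GB) to the
# database configuration described above. Thresholds are approximate.
suggest_dram_config() {
    ram_gb=$1
    if [ "$ram_gb" -ge 220 ]; then
        echo "UniRef90"
    elif [ "$ram_gb" -ge 100 ]; then
        echo "KEGG genes with --skip_uniref"
    elif [ "$ram_gb" -ge 50 ]; then
        echo "KOfam with --skip_uniref"
    else
        echo "below documented requirements"
    fi
}

suggest_dram_config 128   # prints: KEGG genes with --skip_uniref
```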