BeeDeeM is a general-purpose Bioinformatics Databank Manager.
It provides a suite of command-line and graphical tools to manage (download, unarchive, index, install) major sequence databanks and biological classifications, and to make them easy to use.
BeeDeeM automatically performs:
- the download of the database files from remote sites (via FTP, HTTP or Aspera),
- the decompression of the files (gzip files),
- the un-archiving of the files (tar files),
- the conversion of native sequence banks (e.g. Genbank) to FASTA files,
- the preparation of databases in BLAST format from native sequence bank formats,
- the preparation of other indexes such as Diamond, Bowtie, Hisat, etc.
- the indexing of Genbank, Refseq, Embl, Genpept, Swissprot, TrEmbl and Fasta files allowing their efficient querying by way of sequence identifiers,
- the indexing of sequence features and ontology data (NCBI Taxonomy, Gene Ontology, Enzyme Commission, InterPro domains and Pfam domains),
- the preparation of taxonomic subsets out of annotated sequence banks,
- the filtering of sequence banks with user-defined constraints.
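To make the idea of identifier-based indexing concrete, here is a minimal sketch. It is not BeeDeeM's implementation and says nothing about its actual index format: it only illustrates the general principle of recording where each record starts so that a sequence can be fetched by ID without scanning the whole file. The helper names `build_fasta_index` and `fetch_by_id` are hypothetical, and the sketch assumes a FASTA file with Unix line endings.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating ID-based indexing of a FASTA file.
# This is NOT BeeDeeM's index format, only a sketch of the general idea.

# Write one "<id><TAB><byte offset of the header line>" entry per record.
build_fasta_index() {
  local fasta="$1" index="$2"
  LC_ALL=C awk '
    /^>/ { id = substr($1, 2); printf "%s\t%d\n", id, offset }
    { offset += length($0) + 1 }   # +1 accounts for the newline character
  ' "$fasta" > "$index"
}

# Print the record whose ID matches, using the offset stored in the index.
fetch_by_id() {
  local fasta="$1" index="$2" id="$3" offset
  offset=$(awk -F'\t' -v id="$id" '$1 == id { print $2; exit }' "$index")
  [ -n "$offset" ] || return 1
  # Jump to the record, then print lines until the next header.
  tail -c "+$((offset + 1))" "$fasta" | awk 'NR > 1 && /^>/ { exit } { print }'
}

# Small demonstration on a throw-away file.
demo_dir=$(mktemp -d)
printf '>seq1 first\nACGT\nAC\n>seq2 second\nGGCC\n' > "$demo_dir/demo.fasta"
build_fasta_index "$demo_dir/demo.fasta" "$demo_dir/demo.idx"
fetch_by_id "$demo_dir/demo.fasta" "$demo_dir/demo.idx" seq2
```

A real index would also need to cope with the native formats listed above (Genbank, Embl, Swissprot, etc.), which this sketch does not attempt.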
Task execution extension:
- Any kind of pre- and post-processing of data can be done using external scripts
- Such scripts can be executed on the host computer (local mode) or through an SGE, PBS or SLURM scheduler (cluster mode)
- Task executions are controlled by configuration files, e.g. to specify resources (RAM, CPU, walltime), how software is accessed (direct execution or through Conda), etc.
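The difference between local and cluster mode can be pictured as a small dispatcher that turns a script and its resource needs into a command line. The sketch below is purely illustrative: `build_submit_command` and its arguments are hypothetical and are not part of BeeDeeM, whose real behaviour is driven by its configuration files. Only `qsub` and `sbatch` are the standard submission commands of the schedulers, and the PBS resource syntax mirrors the one used in the job script further down.

```shell
#!/usr/bin/env bash
# Hypothetical dispatcher: prints the command that would run a task script.
# Arguments: mode, memory in GB, number of CPUs, walltime (HH:MM:SS), script.
build_submit_command() {
  local mode="$1" mem_gb="$2" cpus="$3" walltime="$4" script="$5"
  case "$mode" in
    local)
      echo "bash $script" ;;
    sge)
      # SGE resource flags are site-specific, so none are added here.
      echo "qsub $script" ;;
    pbs)
      echo "qsub -l mem=${mem_gb}gb -l ncpus=${cpus} -l walltime=${walltime} $script" ;;
    slurm)
      echo "sbatch --mem=${mem_gb}G --cpus-per-task=${cpus} --time=${walltime} $script" ;;
    *)
      echo "unsupported mode: $mode" >&2
      return 1 ;;
  esac
}

# Show what a SLURM submission of a post-processing script would look like.
build_submit_command slurm 4 8 72:00:00 post-process.sh
```

Printing the command instead of running it keeps the sketch safe to try on a machine without any scheduler installed.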
Index creation extension:
- Using the task execution engine, additional indexes can be created quite easily, in a fully automated way (e.g. Diamond, Bowtie, etc.)
More: read the user manual!
It is the ideal companion of sequence comparison tools (e.g. BLAST, PLAST, Diamond), as well as of tools such as the ORSON annotation pipeline, the BLAST Viewer platform and the Galaxy platform.
BeeDeeM provides a toolchain made of:
- a command-line tool to automate databank installation
- a UI front-end to do the same in a more friendly way (see below)
- a command-line tool to annotate BLAST results
- a command-line tool to query databanks using sequence IDs
More.
Here is an example of a script to start a bank installation (PDB_protein in this case) on Ifremer's DATARMOR supercomputer:
#!/usr/bin/env bash
#PBS -q web
#PBS -l mem=4gb
#PBS -l ncpus=8
#PBS -l walltime=72:00:00
# Release of BeeDeeM to use
BDM_HOME="$SOFT/bioinfo/beedeem"
BDM_VER="5.0.0"
# Load BeeDeeM environment
module load java/1.8.0_121
# Tell BeeDeeM where its working directory is and where it has to install banks
# (adapt! This is for a test)
export KL_WORKING_DIR="$HOME/bdm-test" ; mkdir -p "$KL_WORKING_DIR"
export KL_mirror__path="$HOME/bdm-banks" ; mkdir -p "$KL_mirror__path"
# prefix of '.dsc' file that must exist in $BDM_HOME/conf/descriptor
DESCRIPTOR="PDB_protein"
export KL_LOG_FILE=${DESCRIPTOR}.log
$BDM_HOME/$BDM_VER/bdm install \
-desc ${DESCRIPTOR} \
>& "$HOME/beedeem/logs/${DESCRIPTOR}-pbs.out"
You can easily automate bank installation using such scripts. The script above relies on a standalone installation of the software, but you can also use a Conda, Docker or Singularity installation.
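As an illustration of such automation, the following sketch installs several banks in a row. It only uses the `bdm install -desc` invocation and the `KL_LOG_FILE` variable shown in the script above; the `install_banks` function and the `DRY_RUN` switch are hypothetical additions, the default value of `BDM_HOME` is a placeholder to adapt, and the second descriptor name, Genbank_CoreNucleotide, is assumed to match an existing '.dsc' file.

```shell
#!/usr/bin/env bash
# Placeholder defaults (assumptions): adapt to your own installation.
BDM_HOME="${BDM_HOME:-$HOME/beedeem}"
BDM_VER="${BDM_VER:-5.0.0}"

# Hypothetical helper: install each bank whose descriptor name is given.
# With DRY_RUN=1 the commands are printed instead of being executed.
install_banks() {
  local desc status=0
  for desc in "$@"; do
    export KL_LOG_FILE="${desc}.log"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "$BDM_HOME/$BDM_VER/bdm install -desc $desc"
    elif ! "$BDM_HOME/$BDM_VER/bdm" install -desc "$desc"; then
      # Keep going with the next bank, but remember the failure.
      echo "installation failed: $desc" >&2
      status=1
    fi
  done
  return "$status"
}

# Preview the commands without running anything.
DRY_RUN=1 install_banks PDB_protein Genbank_CoreNucleotide
```

Running the dry run first is a cheap way to check paths and descriptor names before submitting a long job to a scheduler.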
In addition to the command line, BeeDeeM also comes with a user-friendly graphical interface.
Among others, these databanks can be used to:
- prepare and maintain up-to-date local copies of useful data
- run BLAST, Diamond or PLAST sequence comparison jobs
- annotate BLAST, Diamond or PLAST results with sequence features and ontologies
BeeDeeM features and data are accessible from:
- ORSON nextflow pipeline
- BioDocument Viewer
- BLAST Viewer
- BLAST Filter Tool
- Plealog Bioinformatics Core API
It is worth noting that BeeDeeM is capable of creating Galaxy Data Manager loc files, enabling a Galaxy web portal to use banks installed by BeeDeeM.
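For readers unfamiliar with Galaxy loc files: they are plain tab-separated tables, one line per data entry. The sketch below writes one such line using the three-column layout commonly used for BLAST databases (unique ID, caption, base path). That layout is an assumption here, the files BeeDeeM actually generates may differ, and both the `loc_line` helper and the path are hypothetical placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one tab-separated loc entry
# (unique ID, human-readable caption, base path of the bank).
loc_line() {
  printf '%s\t%s\t%s\n' "$1" "$2" "$3"
}

# Placeholder values: the real path depends on where banks are installed.
loc_line "pdb_protein" "PDB protein" "/path/to/banks/PDB_protein"
```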
This manual explains how to install, configure and use BeeDeeM.
Use a Java Virtual Machine 1.8 (or above) from Oracle.
BeeDeeM has not been tested with JVMs from other providers, so there is no guarantee that it will work as expected with them. More about BeeDeeM requirements.
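A quick way to check this requirement before launching BeeDeeM is to look at the version reported by `java -version`. The sketch below is a suggestion, not part of BeeDeeM: the `java_major_version` helper is hypothetical, and it assumes the usual version strings (`1.8.0_121` for Java 8, `11.0.2` or `17` for later releases). It checks the version number only, not the JVM provider.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reduce a Java version string to its major number.
java_major_version() {
  local v="$1"
  v="${v#1.}"        # legacy scheme: 1.8.0_121 becomes 8.0_121
  v="${v%%[._-]*}"   # keep only the leading number
  echo "$v"
}

if command -v java >/dev/null 2>&1; then
  # 'java -version' writes to stderr; the version is the quoted string.
  raw=$(java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }')
  major=$(java_major_version "$raw")
  if [ "${major:-0}" -ge 8 ]; then
    echo "Java $raw is recent enough"
  else
    echo "Java $raw is too old: version 1.8 or above is required" >&2
  fi
else
  echo "java not found on PATH" >&2
fi
```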
BeeDeeM itself is released under the GNU Affero General Public License, Version 3.0 (AGPL).
It depends on several third-party libraries as stated in the NOTICE.txt file provided with this project.
(c) 2003-2023 - Patrick G. Durand
BeeDeeM development started in early 2003 with the development of the Core API for BLAST Viewer. The first release of BeeDeeM was out by mid 2007... a long, long story by now! ;-)
Contributors: Ludovic Antin (2013-15; JUnit test suite, data filtering engine), Pierre Cuzin (2021; configuration).