Edited the instructions for building the nr Diamond database
eeaunin committed Nov 19, 2024
1 parent d838d08 commit 0f66058
Showing 1 changed file with 9 additions and 1 deletion.
10 changes: 9 additions & 1 deletion docs/databases.md
@@ -32,7 +32,15 @@ Download and set up according to the instructions at https://blobtoolkit.genomeh

## NCBI nr Diamond database

Download the nr database protein FASTA files from the NCBI ftp server (`wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz`) and build the database similarly to the Uniprot Diamond database, following the instructions at https://blobtoolkit.genomehubs.org/install/.
Download the nr protein FASTA file from the NCBI FTP server (`wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz`). The database is built in the same way as the UniProt Diamond database (as described at https://blobtoolkit.genomehubs.org/install/).
An example command for building the nr Diamond database looks like this:
```
diamond makedb --threads 16 --in ./nr/nr.gz -d ./ncbi_taxonomy/proteins/nr \
    --taxonmap ./ncbi_taxonomy/proteins/prot.accession2taxid.FULL \
    --taxonnodes ./ncbi_taxonomy/proteins/taxdump/nodes.dmp \
    --taxonnames ./ncbi_taxonomy/proteins/taxdump/names.dmp
```
The `prot.accession2taxid.FULL` file comes from https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/accession2taxid/.
The taxdump files come from ftp://ftp.ncbi.nih.gov/pub/taxonomy/new_taxdump/new_taxdump.tar.gz.
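
Building the nr database can take a long time, so it may be worth confirming that all four input files are in place before starting. The following is a minimal sketch, not part of the pipeline: the `check_inputs` helper is hypothetical, and it assumes the downloaded files have already been decompressed and extracted into the directory layout used in the example command.

```shell
#!/bin/sh
# Hypothetical preflight helper: checks that the inputs named in the
# example diamond makedb command exist and are not empty.
# The directory layout is assumed to match that command; adjust as needed.

check_inputs() {
    # $1: base directory containing ./nr and ./ncbi_taxonomy (defaults to .)
    base="${1:-.}"
    missing=0
    for f in \
        "$base/nr/nr.gz" \
        "$base/ncbi_taxonomy/proteins/prot.accession2taxid.FULL" \
        "$base/ncbi_taxonomy/proteins/taxdump/nodes.dmp" \
        "$base/ncbi_taxonomy/proteins/taxdump/names.dmp"
    do
        # -s is true only if the file exists and has a size greater than zero
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            missing=1
        fi
    done
    # Return 0 when every file was found, 1 otherwise
    return "$missing"
}
```

After sourcing this snippet, `check_inputs . && echo "inputs OK"` reports any missing file on standard error before the build is started.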


## NCBI accession2taxid
