-
Notifications
You must be signed in to change notification settings - Fork 1
/
search.Rmd
34 lines (24 loc) · 1.42 KB
/
search.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
---
title: "Download the database"
output: html_document
---
```{r setup, include=FALSE}
source("style.R")
```
There are multiple releases of the database.
Usually, you should use the latest release, unless you are trying to reproduce a previous analysis.
You can download the entire database here or just a subset.
## The FASTA header format
The database is a FASTA file with headers in the following format:
```
>name=Aphanomyces_invadans|strain=NJM9701|ncbi_acc=KX405005|ncbi_taxid=157072|oodb_id=13|taxonomy=cellular_organisms;Eukaryota;Stramenopiles;Oomycetes;Saprolegniales;Saprolegniaceae;Aphanomyces_invadans
```
The following fields are present:
* "name": The binomial species name of the organism with spaces replaced by underscores.
* "strain": The name of the strain/isolate if available. If the strain is not available, it is left empty.
* "ncbi_acc: The NCBI accession number for the sequence submitted to genbank. Note that the version number (the number at the end, after the period) is not included.
* "ncbi_taxid": The NCBI taxonomy id. This can be looked up using the NCBI accession number.
* "oodb_id": This is the unique numeric ID specific to OomyceteDB.
* "taxonomy": The taxonomic classification separated by semicolons. This classification is curated by us and is not the taxonomic classification from NCBI associated with the NCBI taxid.
## Releases
<h3><a href="data/releases/release_1.fa" download>Release 1</a></h3>