This repository has been archived by the owner on Jan 12, 2025. It is now read-only.

wf-fastq.md

File metadata and controls

38 lines (32 loc) · 2.02 KB

Uploading FastQ sequencing read data to ENA using ENAdumper

Prerequisites

  1. Register a Webin account at ENA; it is required for submitting files
  2. Register a study at ENA and note its study ID
  3. Register your samples at ENA and download the resulting sample sheet
  4. The sample sheet acts as a key file that links your sample names (ALIAS) to the corresponding ENA sample IDs (ACCESSION). Example below:
TYPE ACCESSION ALIAS
SAMPLE ERS00000001 S1
SAMPLE ERS00000002 S2
SAMPLE ERS00000003 S3
SAMPLE ERS00000004 S4
SAMPLE ERS00000005 S5
SAMPLE ERS00000006 S6
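
To make the role of the key file concrete, the sketch below reads a sample sheet of the shape shown above into an alias-to-accession lookup. It is an illustration only, not ENAdumper's own code: the function name `read_sample_sheet` is hypothetical, and it assumes whitespace- or tab-separated columns with no spaces inside the values.

```python
import io

def read_sample_sheet(handle):
    """Map each ALIAS to its ENA ACCESSION, skipping the header row."""
    mapping = {}
    for line in handle:
        # Assumption: columns are separated by tabs or spaces.
        fields = line.split()
        if len(fields) != 3 or fields[0] == "TYPE":
            continue  # skip the header and blank or malformed lines
        _type, accession, alias = fields
        mapping[alias] = accession
    return mapping

# Small in-memory stand-in for sample_sheet.tsv
example = io.StringIO(
    "TYPE\tACCESSION\tALIAS\n"
    "SAMPLE\tERS00000001\tS1\n"
    "SAMPLE\tERS00000002\tS2\n"
)
sample_ids = read_sample_sheet(example)
print(sample_ids["S2"])
```

A lookup like this makes it easy to see which sample names the sheet actually covers before starting an upload.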

Running the ENAdumper workflow

  1. The workflow assumes that the FastQ files are already compressed (as in the .fastq.gz files below) to save storage space
  2. The main input is a TSV file listing the path and, optionally, the sample name (ALIAS) of each compressed FastQ file. Example below:
FILEPATH ALIAS
/path/to/reads/S1.fastq.gz S1
/path/to/reads/S2.fastq.gz
/path/to/reads/S3.fastq.gz
/path/to/reads/S4.fastq.gz S4
/path/to/reads/S5.fastq.gz
/path/to/reads/S6.fastq.gz S6
  3. If no sample name is given for a file, the basename of the read file is used instead.
  4. Example command for uploading FastQ data with ENAdumper:
    ENAdumper --fastq_list list.tsv --key sample_sheet.tsv --study PRJEB00001 -n BATCH1 -user Webin-0001 -pass Banana1
  5. ENAdumper uploads FastQ files in parallel; the number of parallel processes can be set with the --processes option
  6. After the upload is complete, ENAdumper generates a template spreadsheet that can be submitted to ENA to include the files in the permanent archive
  7. It is highly recommended to double-check the template spreadsheet before submitting it to ENA
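
Before running an upload, it can help to check that every entry in the FastQ list resolves to a sample name present in the sample sheet. The sketch below is a hypothetical pre-flight check, not part of ENAdumper. The helper names are made up, and it assumes that the basename fallback drops a `.fastq.gz` or `.fq.gz` suffix; the text above does not say whether ENAdumper strips the extension.

```python
import io
import os

def resolve_alias(fields):
    """Return (path, alias); fall back to the file basename if no alias is given."""
    path = fields[0]
    if len(fields) > 1:
        return path, fields[1]
    name = os.path.basename(path)
    # Assumption: the compressed-FastQ suffix is not part of the sample name.
    for suffix in (".fastq.gz", ".fq.gz"):
        if name.endswith(suffix):
            name = name[: -len(suffix)]
            break
    return path, name

def read_fastq_list(handle):
    """Parse the FILEPATH/ALIAS list, skipping the header and blank lines."""
    entries = []
    for line in handle:
        fields = line.split()  # assumes no whitespace inside paths
        if not fields or fields[0] == "FILEPATH":
            continue
        entries.append(resolve_alias(fields))
    return entries

def missing_aliases(entries, known_aliases):
    """List aliases that have no matching entry in the sample sheet."""
    return [alias for _path, alias in entries if alias not in known_aliases]

# Small in-memory stand-in for list.tsv; S7 is deliberately unregistered.
listing = io.StringIO(
    "FILEPATH\tALIAS\n"
    "/path/to/reads/S1.fastq.gz\tS1\n"
    "/path/to/reads/S2.fastq.gz\n"
    "/path/to/reads/S7.fastq.gz\n"
)
entries = read_fastq_list(listing)
unknown = missing_aliases(entries, {"S1", "S2", "S3"})
print(unknown)
```

Any alias reported here would have no ENA sample ID to link to, so it is worth fixing the list or the sample sheet before uploading.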