From 3ebb9fe01f561752fd62ee4df15cc138d5bb7cfb Mon Sep 17 00:00:00 2001
From: fraser-combe
Date: Thu, 14 Nov 2024 15:59:31 -0600
Subject: [PATCH] initial updates

---
 ...retrieve_srr_metadata.md => fetch_srr_accession.md} | 10 +++-------
 docs/workflows_overview/workflows_alphabetically.md | 2 +-
 docs/workflows_overview/workflows_kingdom.md | 2 +-
 docs/workflows_overview/workflows_type.md | 2 +-
 ...h_srr_metadata.wdl => task_fetch_srr_accession.wdl} | 3 +--
 ...ate_srr_metadata.wdl => wf_fetch_srr_accession.wdl} | 4 +---
 6 files changed, 8 insertions(+), 15 deletions(-)
 rename docs/workflows/public_data_sharing/{retrieve_srr_metadata.md => fetch_srr_accession.md} (90%)
 rename tasks/utilities/data_handling/{task_fetch_srr_metadata.wdl => task_fetch_srr_accession.wdl} (98%)
 rename workflows/utilities/data_import/{wf_update_srr_metadata.wdl => wf_fetch_srr_accession.wdl} (85%)

diff --git a/docs/workflows/public_data_sharing/retrieve_srr_metadata.md b/docs/workflows/public_data_sharing/fetch_srr_accession.md
similarity index 90%
rename from docs/workflows/public_data_sharing/retrieve_srr_metadata.md
rename to docs/workflows/public_data_sharing/fetch_srr_accession.md
index 9a52c6179..efd1dfae8 100644
--- a/docs/workflows/public_data_sharing/retrieve_srr_metadata.md
+++ b/docs/workflows/public_data_sharing/fetch_srr_accession.md
@@ -1,4 +1,4 @@
-# Retrieve SRR Metadata Workflow
+# Fetch SRR Accession Workflow

 ## Quick Facts

@@ -6,7 +6,7 @@
 |---|---|---|---|---|
 | [Public Data Sharing](../../workflows_overview/workflows_type.md/#public-data-sharing) | [Any Taxa](../../workflows_overview/workflows_kingdom.md/#any-taxa) | PHB v2.3.0 | Yes | Sample-level |

-## Retrieve SRR Metadata
+## Fetch SRR Accession

 This workflow is designed to retrieve the Sequence Read Archive (SRA) accession (SRR) associated with a given sample accession.
 The primary inputs are BioSample IDs (e.g., SAMN00000000) or SRA Experiment IDs (e.g., SRX000000), which link to sequencing data in the SRA repository.
@@ -14,8 +14,6 @@ The workflow uses the fastq-dl tool to fetch metadata from SRA and specifically

 ### Inputs

-
-
 | **Terra Task Name** | **Variable** | **Type** | **Description**| **Default Value** | **Terra Status** |
 | --- | --- | --- | --- | --- | --- |
 | fetch_srr_metadata | **sample_accession** | String | SRA-compatible accession, such as a **BioSample ID** (e.g., "SAMN00000000") or **SRA Experiment ID** (e.g., "SRX000000"), used to retrieve SRR metadata. | | Required |
@@ -24,14 +22,12 @@ The workflow uses the fastq-dl tool to fetch metadata from SRA and specifically
 | fetch_srr_metadata | **cpu** | Int | Number of CPUs allocated for the task. | 2 | Optional |
 | fetch_srr_metadata | **memory** | Int | Memory in GB allocated for the task. | 8 | Optional |

-
-
 ### Workflow Tasks

 This workflow has a single task that performs metadata retrieval for the specified sample accession.

 ??? task "`fastq-dl`: Fetches SRR metadata for sample accession"
-    Fetches metadata for a given sample accession using the `fastq-dl` tool. This task uses a Docker container and retrieves the SRR accession by parsing the metadata output.
+    When provided a BioSample accession or SRA experiment ID, 'fastq-dl' collects metadata and returns the appropriate SRR accession.

     !!! techdetails "fastq-dl Technical Details"
         | | Links |
diff --git a/docs/workflows_overview/workflows_alphabetically.md b/docs/workflows_overview/workflows_alphabetically.md
index 6f6314618..359eb066a 100644
--- a/docs/workflows_overview/workflows_alphabetically.md
+++ b/docs/workflows_overview/workflows_alphabetically.md
@@ -47,7 +47,7 @@ title: Alphabetical Workflows
 | [**TheiaValidate**](../workflows/standalone/theiavalidate.md)| This workflow performs basic comparisons between user-designated columns in two separate tables. | Any taxa | | No | v2.0.0 | [TheiaValidate_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/TheiaValidate_PHB:main?tab=info) |
 | [**Transfer_Column_Content**](../workflows/data_export/transfer_column_content.md)| Transfer contents of a specified Terra data table column for many samples ("entities") to a GCP storage bucket location | Any taxa | Set-level | Yes | v1.3.0 | [Transfer_Column_Content_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Transfer_Column_Content_PHB:main?tab=info) |
 | [**Samples_to_Ref_Tree**](../workflows/phylogenetic_placement/usher.md)| Use UShER to rapidly and accurately place your samples on any existing phylogenetic tree | Monkeypox virus, SARS-CoV-2, Viral | Sample-level, Set-level | Yes | v2.1.0 | [Usher_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Usher_PHB:main?tab=info) |
-| [**Update_SRR_Metadata**](../workflows/public_data_sharing/update_srr_metadata.md)| Update SRR metadata in a Terra data table | Any taxa | | Yes | v2.3.0 | [Update_SRR_Metadata_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Update_SRR_Metadata_PHB:main?tab=info) |
+| [**Fetch_SRR_Accession**](../workflows/public_data_sharing/fetch_srr_accession.md)| Update SRR metadata in a Terra data table | Any taxa | | Yes | v2.3.0 | [*Fetch_SRR_Accession_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Fetch_SRR_Accession_PHB:main?tab=info) |
 | [**Usher_PHB**](../workflows/genomic_characterization/vadr_update.md)| Update VADR assignments | HAV, Influenza, Monkeypox virus, RSV-A, RSV-B, SARS-CoV-2, Viral, WNV | Sample-level | Yes | v1.2.1 | [VADR_Update_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/VADR_Update_PHB:main?tab=info) |
 | [**Zip_Column_Content**](../workflows/data_export/zip_column_content.md)| Zip contents of a specified Terra data table column for many samples ("entities") | Any taxa | Set-level | Yes | v2.1.0 | [Zip_Column_Content_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Zip_Column_Content_PHB:main?tab=info) |
diff --git a/docs/workflows_overview/workflows_kingdom.md b/docs/workflows_overview/workflows_kingdom.md
index 4775b0963..e9d54a396 100644
--- a/docs/workflows_overview/workflows_kingdom.md
+++ b/docs/workflows_overview/workflows_kingdom.md
@@ -24,7 +24,7 @@ title: Workflows by Kingdom
 | [**TheiaMeta**](../workflows/genomic_characterization/theiameta.md) | Genome assembly and QC from metagenomic sequencing | Any taxa | Sample-level | Yes | v2.0.0 | [TheiaMeta_Illumina_PE_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/TheiaMeta_Illumina_PE_PHB:main?tab=info) |
 | [**TheiaValidate**](../workflows/standalone/theiavalidate.md)| This workflow performs basic comparisons between user-designated columns in two separate tables. | Any taxa | | No | v2.0.0 | [TheiaValidate_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/TheiaValidate_PHB:main?tab=info) |
 | [**Transfer_Column_Content**](../workflows/data_export/transfer_column_content.md)| Transfer contents of a specified Terra data table column for many samples ("entities") to a GCP storage bucket location | Any taxa | Set-level | Yes | v1.3.0 | [Transfer_Column_Content_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Transfer_Column_Content_PHB:main?tab=info) |
-| [**Update_SRR_Metadata**](../workflows/data_import/update_srr_metadata.md)| Update SRR metadata in a Terra data table | Any taxa | Set-level | Yes | v2.1.0 | [Update_SRR_Metadata_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Update_SRR_Metadata_PHB:main?tab=info) |
+| [**Fetch_SRR_Accession**](../workflows/public_data_sharing/fetch_srr_accession.md)| Update SRR metadata in a Terra data table | Any taxa | Set-level | Yes | v2.3.0 | [Fetch_SRR_Accession_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Fetch_SRR_Accession_PHB:main?tab=info) |
 | [**Zip_Column_Content**](../workflows/data_export/zip_column_content.md)| Zip contents of a specified Terra data table column for many samples ("entities") | Any taxa | Set-level | Yes | v2.1.0 | [Zip_Column_Content_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Zip_Column_Content_PHB:main?tab=info) |
diff --git a/docs/workflows_overview/workflows_type.md b/docs/workflows_overview/workflows_type.md
index 7fff0b59b..1e6ce735f 100644
--- a/docs/workflows_overview/workflows_type.md
+++ b/docs/workflows_overview/workflows_type.md
@@ -75,7 +75,7 @@ title: Workflows by Type
 | [**Mercury_Prep_N_Batch**](../workflows/public_data_sharing/mercury_prep_n_batch.md)| Prepare metadata and sequence data for submission to NCBI and GISAID | Influenza, Monkeypox virus, SARS-CoV-2, Viral | Set-level | No | v2.2.0 | [Mercury_Prep_N_Batch_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Mercury_Prep_N_Batch_PHB:main?tab=info) |
 | [**Terra_2_GISAID**](../workflows/public_data_sharing/terra_2_gisaid.md)| Upload of assembly data to GISAID | SARS-CoV-2, Viral | Set-level | Yes | v1.2.1 | [Terra_2_GISAID_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Terra_2_GISAID_PHB:main?tab=info) |
 | [**Terra_2_NCBI**](../workflows/public_data_sharing/terra_2_ncbi.md)| Upload of sequence data to NCBI | Bacteria, Mycotics, Viral | Set-level | No | v2.1.0 | [Terra_2_NCBI_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Terra_2_NCBI_PHB:main?tab=info) |
-| [**Update_SRR_Metadata**](../workflows/public_data_sharing/update_srr_metadata.md)| Update SRR metadata in a Terra data table | Any taxa | | Yes | v2.3.0 | [Update_SRR_Metadata_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Update_SRR_Metadata_PHB:main?tab=info) |
+| [**Fetch_SRR_Accession**](../workflows/public_data_sharing/fetch_srr_accession.md)| Update SRR metadata in a Terra data table | Any taxa | | Yes | v2.3.0 | [Fetch_SRR_Accession_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Fetch_SRR_Accession_PHB:main?tab=info) |
diff --git a/tasks/utilities/data_handling/task_fetch_srr_metadata.wdl b/tasks/utilities/data_handling/task_fetch_srr_accession.wdl
similarity index 98%
rename from tasks/utilities/data_handling/task_fetch_srr_metadata.wdl
rename to tasks/utilities/data_handling/task_fetch_srr_accession.wdl
index 17fc5650a..5c1a0044f 100644
--- a/tasks/utilities/data_handling/task_fetch_srr_metadata.wdl
+++ b/tasks/utilities/data_handling/task_fetch_srr_accession.wdl
@@ -1,6 +1,6 @@
 version 1.0

-task fetch_srr_metadata {
+task fetch_srr_accession {
   input {
     String sample_accession
     String docker = "us-docker.pkg.dev/general-theiagen/biocontainers/fastq-dl:2.0.4--pyhdfd78af_0"
@@ -41,7 +41,6 @@ task fetch_srr_metadata {
       echo "No SRR accession found" > metadata_output/srr_accession.txt
     fi
   >>>
-
   output {
     String srr_accession = read_string("metadata_output/srr_accession.txt")
   }
diff --git a/workflows/utilities/data_import/wf_update_srr_metadata.wdl b/workflows/utilities/data_import/wf_fetch_srr_accession.wdl
similarity index 85%
rename from workflows/utilities/data_import/wf_update_srr_metadata.wdl
rename to workflows/utilities/data_import/wf_fetch_srr_accession.wdl
index 564859e41..966695f80 100644
--- a/workflows/utilities/data_import/wf_update_srr_metadata.wdl
+++ b/workflows/utilities/data_import/wf_fetch_srr_accession.wdl
@@ -1,6 +1,6 @@
 version 1.0

-import "../../../tasks/utilities/data_handling/task_fetch_srr_metadata.wdl" as srr_task
+import "../../../tasks/utilities/data_handling/fetch_srr_accession.wdl" as srr_task

 workflow wf_retrieve_srr {
   meta {
@@ -9,12 +9,10 @@ workflow wf_retrieve_srr {
   input {
     String sample_accession
   }
-
   call srr_task.fetch_srr_metadata {
     input:
       sample_accession = sample_accession
   }
-
   output {
     String srr_accession = fetch_srr_metadata.srr_accession
   }
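
Note on the task's fallback branch: the hunk at line 41 of the task file shows only the tail of the command block, where "No SRR accession found" is written to `metadata_output/srr_accession.txt`. The sketch below is one way that parse-or-fallback step could look in isolation. It is not taken from the patch: the function name `extract_srr_accession`, the `run_accession` column name, the tab-separated layout, and the comma-joining of multiple runs are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of this patch.
# Prints the run accession(s) found in a tab-separated metadata file, or the
# same fallback message the task writes when none is present.
# Assumptions: the file has a header row, and one of its columns is named
# "run_accession". Multiple runs are joined with commas (an arbitrary choice).
extract_srr_accession() {
  local tsv="$1"
  local result=""
  # Only parse when the file exists and is non-empty.
  if [ -s "$tsv" ]; then
    result=$(awk -F'\t' '
      # Header row: remember which column holds the run accession.
      NR == 1 { for (i = 1; i <= NF; i++) if ($i == "run_accession") col = i; next }
      # Data rows: collect non-empty values from that column.
      col && $col != "" { printf "%s%s", sep, $col; sep = "," }
    ' "$tsv")
  fi
  if [ -n "$result" ]; then
    echo "$result"
  else
    echo "No SRR accession found"
  fi
}
```

Under these assumptions it would be called as `extract_srr_accession metadata_output/run-info.tsv > metadata_output/srr_accession.txt`, where `run-info.tsv` is a placeholder file name. Emitting the fallback string rather than an empty file keeps `read_string` in the task's output block from returning an empty value.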