Commit
Michal-Babins committed May 11, 2021
1 parent de2e05e, commit 0d2a984
Showing 1 changed file with 62 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,62 @@ | ||
# Validation Workflow | ||
|
||
## Summary | ||
|
||
The metaAssembly validation workflow compares reference test data in JSON format, available on the NMDC GitHub, against data generated from a user's own test run. The purpose is to ensure that the user's test data matches, or falls within an acceptable range of, the NMDC test data, in an effort to push towards reproducible data standards. The **test_assembly.wdl** file contains most of the inputs needed for the metaAssembly workflow and will fetch the user-generated JSON file, but the user must supply the path of the outdir in their local environment and update the local paths for test data in input.json. | ||
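
To make "matches or falls within an acceptable range" concrete, here is a minimal sketch of a tolerance-based comparison between two stats dictionaries. It is not the implementation used by the comparejson container. The function name `compare_stats`, the 5% relative tolerance, and the field names `contigs` and `gc_avg` are hypothetical and chosen only for illustration.

```python
import math


def compare_stats(reference, candidate, rel_tol=0.05):
    """Return a list of differences between two stats dictionaries.

    Numeric values pass if they agree within rel_tol (an assumed 5% default);
    all other values must match exactly.
    """
    differences = []
    for key, ref_value in reference.items():
        if key not in candidate:
            differences.append(f"{key}: missing from candidate")
            continue
        value = candidate[key]
        # Treat ints and floats as numeric, but not booleans.
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in (ref_value, value)
        )
        if numeric:
            if not math.isclose(ref_value, value, rel_tol=rel_tol):
                differences.append(f"{key}: expected ~{ref_value}, got {value}")
        elif ref_value != value:
            differences.append(f"{key}: expected {ref_value!r}, got {value!r}")
    return differences


# Hypothetical field names; the real stats JSON may use different keys.
reference = {"contigs": 100, "gc_avg": 0.5}
candidate = {"contigs": 102, "gc_avg": 0.5}
print(compare_stats(reference, candidate) or "No differences detected")
```

An empty list means every reference field was found and was within tolerance, which corresponds to the "test validated" outcome described in this document.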
|
||
|
||
## Inputs for test_assembly.wdl | ||
1. BBTools container (provided) | ||
2. SPAdes container (provided) | ||
3. Comparejson container (provided) | ||
4. Prefix to rename contigs (provided, can be changed by user) | ||
5. Kmer parameter (provided) | ||
6. Memory (optional, can be changed by user) | ||
7. Threads (optional, can be changed by user) | ||
8. Outdir (user specified) | ||
9. URL for test FASTQ (provided) | ||
10. URL for NMDC test JSON (provided for small test data) | ||
``` | ||
workflow test_assembly { | ||
String bbtools_container="microbiomedata/bbtools:38.90" | ||
String spades_container="microbiomedata/spades:3.15.0" | ||
String validate_container="mbabinski17/comparejson:0.1" | ||
String rename_contig_prefix="scaffold" | ||
Float uniquekmer=1000 | ||
String? memory="60G" | ||
String? threads="8" | ||
String? outdir="/vol_b/nmdc_workflows/test_nmdc/metaAssembly/test_output" | ||
String url="https://portal.nersc.gov/cfs/m3408/test_data/Ecoli_10x-int.fastq.gz" | ||
String ref_json="https://raw.githubusercontent.com/microbiomedata/metaAssembly/master/test_output/small_test_stats.json" | ||
``` | ||
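
Since the outdir has to be set by the user, the following sketch shows one way to update it programmatically. It assumes that input.json uses Cromwell's usual `<workflow name>.<input name>` key convention, so the key would be `test_assembly.outdir`. The helper name `set_outdir` and the example path are hypothetical.

```python
import json


def set_outdir(inputs, outdir, workflow="test_assembly"):
    """Return a copy of a Cromwell inputs dict with the workflow's outdir set.

    Assumes keys follow the "<workflow>.<input>" convention.
    """
    updated = dict(inputs)
    updated[f"{workflow}.outdir"] = outdir
    return updated


# Illustrative inputs only; replace the path with a directory on your system.
example = set_outdir({"test_assembly.threads": "8"}, "/path/to/test_output")
print(json.dumps(example, indent=2))
```

In practice the dictionary would be loaded from input.json with `json.load` and written back with `json.dump` before the workflow is launched.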
## Docker containers can be found here: | ||
BBTools: [microbiomedata/bbtools:38.44](https://hub.docker.com/r/microbiomedata/bbtools) | ||
SPAdes: [microbiomedata/spades:3.15.0](https://hub.docker.com/r/microbiomedata/spades) | ||
Comparejson: [microbiomedata/comparejson:0.1](https://hub.docker.com/r/microbiomedata/comparejson) | ||
|
||
## Running the Test Validation Workflow | ||
|
||
The command for running test validation is similar to the one in the submit.sh file, except that jgi_assembly.wdl is replaced with test_assembly.wdl. The command uses the following files: | ||
|
||
- `test_assembly.wdl` file: the WDL file for test validation | ||
- `input.json` file: the test input for the workflow | ||
- `cromwell.conf` file: the conf file for running Cromwell. | ||
- `cromwell.jar` file: the jar file for running Cromwell. | ||
- `metadata_out.json` file: collects metadata about the run; it is created when the command is run | ||
|
||
Example: | ||
``` | ||
java -XX:ParallelGCThreads=32 -Dconfig.file=cromwell.conf -jar cromwell.jar run -m metadata_out.json -i input.json test_assembly.wdl | ||
``` | ||
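
For scripted runs, the same command can be assembled as an argument list. This is a sketch only: the helper name `build_cromwell_command` is hypothetical, and the default file names are the ones listed above and assumed to be in the current directory.

```python
import shlex


def build_cromwell_command(
    wdl="test_assembly.wdl",
    inputs="input.json",
    conf="cromwell.conf",
    jar="cromwell.jar",
    metadata="metadata_out.json",
    gc_threads=32,
):
    """Build the Cromwell run command from the example as an argument list."""
    return [
        "java",
        f"-XX:ParallelGCThreads={gc_threads}",
        f"-Dconfig.file={conf}",
        "-jar", jar,
        "run",
        "-m", metadata,  # file that Cromwell writes run metadata to
        "-i", inputs,    # workflow inputs
        wdl,
    ]


command = build_cromwell_command()
print(shlex.join(command))
```

The list can then be passed to `subprocess.run(command, check=True)`, which avoids shell quoting problems when paths contain spaces.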
|
||
## Validation Metric | ||
The validation result is reported in a statement printed to the command line, which will read: | ||
``` | ||
"test.validate.result": ["No differences detected: test validated"] | ||
``` | ||
or | ||
``` | ||
"test.validate.result": ["Test Failed"] | ||
``` | ||
|
||
If the test fails, please check the inputs or contact your local system administrators to ensure that no system issues are causing discrepancies in the results. |
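
The pass/fail check described above can also be automated. The sketch below assumes that the run metadata is a JSON object with a top-level `outputs` mapping that contains the `test.validate.result` key shown above. The helper name `validation_passed` is hypothetical, and the layout of your `metadata_out.json` should be confirmed before relying on it.

```python
import json

PASS_MESSAGE = "No differences detected: test validated"


def validation_passed(metadata, output_key="test.validate.result"):
    """Return True only if the validation output contains the pass message.

    Assumes outputs are stored under a top-level "outputs" object.
    """
    result = metadata.get("outputs", {}).get(output_key)
    if result is None:
        return False
    if isinstance(result, str):
        result = [result]  # normalize a bare string to a list
    return PASS_MESSAGE in result


# Inline example; a real run would parse metadata_out.json with json.load.
sample = json.loads(
    '{"outputs": {"test.validate.result": ["No differences detected: test validated"]}}'
)
print("validated" if validation_passed(sample) else "failed")
```

A missing key is treated as a failure so that an incomplete or crashed run is not mistaken for a validated one.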