Readme for metaAssembly test wdl
Michal-Babins committed May 11, 2021
1 parent de2e05e commit 0d2a984
62 changes: 62 additions & 0 deletions test_output/README.md
# Validation Workflow

## Summary

The metaAssembly validation workflow compares reference test data in JSON format, available on the NMDC GitHub, against data generated from user test input. Its purpose is to confirm that user test data matches, or falls within an acceptable range of, the NMDC test data, in support of reproducible data standards. The **test_assembly.wdl** file contains most of the inputs needed for the metaAssembly workflow and will pick up the user-generated JSON file. Users must still supply the path of the outdir in their local environment and update the local paths for test data in input.json.
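
The idea of "matches or falls within an acceptable range" can be illustrated with a small Python sketch. This is not the logic of the comparejson container, which is not shown here; the stat names (`scaffolds`, `gc_avg`), the 5% relative tolerance, and the function name `compare_stats` are all assumptions chosen for illustration.

```python
import json
import math


def compare_stats(reference, candidate, rel_tol=0.05):
    """Return a list of readable differences between two flat stats dicts.

    Numeric values are compared with a relative tolerance (an assumed 5%);
    all other values must match exactly.
    """
    differences = []
    for key in sorted(set(reference) | set(candidate)):
        if key not in reference or key not in candidate:
            differences.append(f"{key}: missing from one file")
            continue
        ref_val, new_val = reference[key], candidate[key]
        both_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in (ref_val, new_val)
        )
        if both_numeric:
            if not math.isclose(ref_val, new_val, rel_tol=rel_tol):
                differences.append(f"{key}: {new_val} outside tolerance of {ref_val}")
        elif ref_val != new_val:
            differences.append(f"{key}: {new_val!r} != {ref_val!r}")
    return differences


if __name__ == "__main__":
    # Hypothetical stats; real NMDC stats files may use different keys.
    reference = json.loads('{"scaffolds": 100, "gc_avg": 0.50}')
    candidate = json.loads('{"scaffolds": 102, "gc_avg": 0.50}')
    print(compare_stats(reference, candidate) or "No differences detected")
```

An empty list means every shared value agreed within tolerance; any entry in the list names the key that diverged.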


## Inputs for test_assembly.wdl
1. Bbtools container (provided)
2. Spades container (provided)
3. Comparejson container (provided)
4. Prefix to rename contigs (provided, can be changed by user)
5. Kmer parameter (provided)
6. Memory (optional, can be changed by user)
7. Threads (optional, can be changed by user)
8. Outdir (user specified)
9. URL for the test FASTQ file (provided)
10. URL for the NMDC test JSON (provided for small test data)
```
workflow test_assembly {
String bbtools_container="microbiomedata/bbtools:38.90"
String spades_container="microbiomedata/spades:3.15.0"
String validate_container="mbabinski17/comparejson:0.1"
String rename_contig_prefix="scaffold"
Float uniquekmer=1000
String? memory="60G"
String? threads="8"
String? outdir="/vol_b/nmdc_workflows/test_nmdc/metaAssembly/test_output"
String url="https://portal.nersc.gov/cfs/m3408/test_data/Ecoli_10x-int.fastq.gz"
String ref_json="https://raw.githubusercontent.com/microbiomedata/metaAssembly/master/test_output/small_test_stats.json"
```
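
The user-specific values can be collected in input.json. The sketch below builds such a dictionary in Python, assuming Cromwell's usual `<workflow name>.<input name>` key convention and assuming these workflow-level defaults can be overridden from the inputs file. The helper name `build_inputs` and the placeholder path are hypothetical, and the repository's own input.json may contain additional keys.

```python
import json


def build_inputs(outdir, memory="60G", threads="8", workflow="test_assembly"):
    """Build a Cromwell inputs dict keyed as '<workflow>.<input name>'.

    Defaults mirror the values declared in test_assembly.wdl; the key
    convention is an assumption about how the inputs file is read.
    """
    return {
        f"{workflow}.outdir": outdir,
        f"{workflow}.memory": memory,
        f"{workflow}.threads": threads,
    }


if __name__ == "__main__":
    # Placeholder path: replace with the outdir in your local environment.
    inputs = build_inputs("/path/to/your/test_output")
    print(json.dumps(inputs, indent=2))
```

Printing rather than writing the file avoids overwriting an existing input.json; redirect the output once the values look right.
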
## Docker containers can be found here:
- Bbtools: [microbiomedata/bbtools:38.44](https://hub.docker.com/r/microbiomedata/bbtools)
- Spades: [microbiomedata/spades:3.15.0](https://hub.docker.com/r/microbiomedata/spades)
- Comparejson: [microbiomedata/comparejson:0.1](https://hub.docker.com/r/microbiomedata/comparejson)

## Running Testing Validation Workflow

The command for running the test validation is similar to the one in the submit.sh file, except that test_assembly.wdl replaces jgi_assembly.wdl.

- `test_assembly.wdl` file: the WDL file for test validation
- `input.json` file: the test input for the workflow
- `cromwell.conf` file: the conf file for running Cromwell.
- `cromwell.jar` file: the jar file for running Cromwell.
- `metadata_out.json` file: collects run metadata; created when the command runs.

Example:
```
java -XX:ParallelGCThreads=32 -Dconfig.file=cromwell.conf -jar cromwell.jar run -m metadata_out.json -i input.json test_assembly.wdl
```
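
For scripted runs, the same command can be assembled in Python. This is a minimal sketch: the function name `build_cromwell_command` is hypothetical, and the default file names are taken from the list above and assumed to sit in the current directory.

```python
import shlex


def build_cromwell_command(wdl="test_assembly.wdl", inputs="input.json",
                           conf="cromwell.conf", jar="cromwell.jar",
                           metadata="metadata_out.json", gc_threads=32):
    """Return the Cromwell run command as an argument list."""
    return [
        "java",
        f"-XX:ParallelGCThreads={gc_threads}",
        f"-Dconfig.file={conf}",
        "-jar", jar,
        "run",
        "-m", metadata,
        "-i", inputs,
        wdl,
    ]


if __name__ == "__main__":
    # Print the command for inspection instead of launching Cromwell.
    print(shlex.join(build_cromwell_command()))
```

To launch the run, the list can be passed to `subprocess.run(command, check=True)`, which raises an error if Cromwell exits with a non-zero status.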

## Validation Metric
The validation result is printed to the command line and will read:
```
"test.validate.result": ["No differences detected: test validated"]
```
or
```
"test.validate.result": ["Test Failed"]
```

If the test fails, please check the inputs or contact your local system administrators to ensure no system issues are causing the discrepancy in results.
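
The result can also be checked programmatically. The sketch below looks for the pass message under the `test.validate.result` key shown above. It assumes that `metadata_out.json` stores workflow outputs under a top-level `outputs` key, which should be confirmed against an actual run; the function names are hypothetical.

```python
import json

PASS_MESSAGE = "No differences detected: test validated"
RESULT_KEY = "test.validate.result"


def validation_passed(outputs):
    """Return True when the validate result list contains the pass message."""
    return PASS_MESSAGE in outputs.get(RESULT_KEY, [])


def load_outputs(metadata_path):
    """Load the outputs mapping from a Cromwell metadata file."""
    with open(metadata_path) as handle:
        metadata = json.load(handle)
    # Assumption: outputs sit under "outputs"; otherwise use the top level.
    return metadata.get("outputs", metadata)


if __name__ == "__main__":
    example = {RESULT_KEY: ["Test Failed"]}
    print("passed" if validation_passed(example) else "failed")
```

After a run, `validation_passed(load_outputs("metadata_out.json"))` gives a boolean that a wrapper script can use as its exit condition.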
