diff --git a/test_output/README.md b/test_output/README.md
new file mode 100644
index 0000000..34e9d01
--- /dev/null
+++ b/test_output/README.md
@@ -0,0 +1,62 @@
+# Validation Workflow
+
+## Summary
+
+The metaAssembly validation workflow compares reference test data in JSON format, available on the NMDC GitHub, against data generated from user test input. The purpose is to ensure that user test data matches, or falls within an acceptable range of, the NMDC test data, in an effort to push toward reproducible data standards. The **test_assembly.wdl** file contains most of the necessary inputs for the metaAssembly workflow and picks up the user-generated JSON file, but the user must set the path of the outdir in their local environment and update the local paths for test data in input.json.
+
+
+## Inputs for test_assembly.wdl
+1. Bbtools container (provided)
+2. Spades container (provided)
+3. Comparejson container (provided)
+4. Prefix to rename contigs (provided, can be changed by user)
+5. Kmer parameter (provided)
+6. Memory (optional, can be changed by user)
+7. Threads (optional, can be changed by user)
+8. Outdir (user specified)
+9. URL for test FASTQ (provided)
+10. URL for NMDC test JSON (provided for small test data)
+```
+workflow test_assembly {
+  String bbtools_container="microbiomedata/bbtools:38.90"
+  String spades_container="microbiomedata/spades:3.15.0"
+  String validate_container="mbabinski17/comparejson:0.1"
+  String rename_contig_prefix="scaffold"
+  Float uniquekmer=1000
+  String? memory="60G"
+  String? threads="8"
+  String? outdir="/vol_b/nmdc_workflows/test_nmdc/metaAssembly/test_output"
+  String url="https://portal.nersc.gov/cfs/m3408/test_data/Ecoli_10x-int.fastq.gz"
+  String ref_json="https://raw.githubusercontent.com/microbiomedata/metaAssembly/master/test_output/small_test_stats.json"
+```
+## Docker containers can be found here:
+ - Bbtools: [microbiomedata/bbtools:38.44](https://hub.docker.com/r/microbiomedata/bbtools)
+ - Spades: [microbiomedata/spades:3.15.0](https://hub.docker.com/r/microbiomedata/spades)
+ - Comparejson: [microbiomedata/comparejson:0.1](https://hub.docker.com/r/microbiomedata/comparejson)
+
+## Running the Test Validation Workflow
+
+The command for running test validation is similar to the one in the submit.sh file, except that test_assembly.wdl replaces jgi_assembly.wdl.
+
+ - `test_assembly.wdl` file: the WDL file for test validation
+ - `input.json` file: the test input for the workflow
+ - `cromwell.conf` file: the conf file for running Cromwell
+ - `cromwell.jar` file: the jar file for running Cromwell
+ - `metadata_out.json` file: collects run metadata; created when the command runs
+
+Example:
+```
+java -XX:ParallelGCThreads=32 -Dconfig.file=cromwell.conf -jar cromwell.jar run -m metadata_out.json -i input.json test_assembly.wdl
+```
+
+## Validation Metric
+The validation result is reported in a statement printed to the command line, which will read:
+```
+"test.validate.result": ["No differences detected: test validated"]
+```
+or
+```
+"test.validate.result": ["Test Failed"]
+```
+
+If the test fails, please check the inputs or contact your local system administrators to ensure that no system issues are causing the discrepancy in results.
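
The pass/fail check described under Validation Metric could also be scripted rather than read off the terminal. The following is a minimal sketch, not part of the workflow: it assumes that `metadata_out.json` stores workflow outputs under a top-level `outputs` object and that the key is `test.validate.result` as printed above. `validation_passed` and `check_file` are hypothetical helper names introduced here for illustration.

```python
import json

# Message the README says a passing run prints.
PASS_MESSAGE = "No differences detected: test validated"


def validation_passed(metadata, key="test.validate.result"):
    """Return True if the validation output under `key` holds the pass message.

    Assumption: `metadata` is the parsed metadata_out.json and keeps workflow
    outputs under a top-level "outputs" object. Adjust `key` or the lookup
    if your file is laid out differently.
    """
    outputs = metadata.get("outputs") or {}
    result = outputs.get(key) or []
    # The README shows the result as a one-element list; accept a bare string too.
    if isinstance(result, str):
        result = [result]
    return PASS_MESSAGE in result


def check_file(path="metadata_out.json"):
    """Load a metadata file from disk and report whether the test validated."""
    with open(path) as handle:
        return validation_passed(json.load(handle))
```

Calling `check_file("metadata_out.json")` after the Cromwell run finishes returns `True` only when the pass message is present. A missing key or a `"Test Failed"` entry both yield `False`, so a failed lookup is never mistaken for a validated test.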