You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When an evaluation expression exceeds the timeout, the resulting error should be clear that is the cause.
Actual Behavior
At least some of the time, this situation raises a JSONDecodeError exception. If the user doesn't realize the timeout might be the issue, this can cause them to lose a lot of time trying to debug an issue in the expression itself where none exists.
This issue was noticed with a "real" expression that similarly built a large array of an array of files, and hit the default evaluation timeout. However, you can reproduce the issue using the attached minimal reproduction and a small evaluation timeout:
% venv/bin/cwltool --debug --eval-timeout=.1 wf.cwl array-in.yml
INFO venv/bin/cwltool 3.1.20240404144621
INFO Resolved 'wf.cwl' to 'file:///home/brett/Scratch/wf.cwl'
DEBUG [workflow ] initialized from file:///home/brett/Scratch/wf.cwl
INFO [workflow ] start
DEBUG [workflow ] inputs {}
INFO [workflow ] starting step build_array
DEBUG [step build_array] job input {}
DEBUG [step build_array] evaluated job input to {}
INFO [step build_array] start
Expecting value: line 1 column 1 (char 0)
script was:
01 "use strict";
02 var inputs = {};
03 var self = null;
04 var runtime = {
05 "cores": 1,
06 "ram": 256,
07 "tmpdirSize": 1024,
08 "outdirSize": 1024,
09 "tmpdir": null,
10 "outdir": null
11 };
12 (function(){
13 var result = [];
14 for (var i=0; i<100000; i++) {
15 var arr = [];
16 result.push(arr);
17 for (var j=0; j<100000; j++) {
18 var num = i * 100000 + j;
19 arr.push({
20 class: "File",
21 location: "file:///tmp/array.cwl/file" + num.toString() + ".fq",
22 format: "http://edamontology.org/format_1930"
23 });
24 }
25 }
26 return {"out_arr": result};
27 })()
stdout was: ''
stderr was: ''
Traceback (most recent call last):
File "/home/brett/Scratch/venv/lib/python3.12/site-packages/cwl_utils/sandboxjs.py", line 512, in eval
return cast(CWLOutputType, json.loads(stdout))
^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
The above exception was the direct cause of the following exception:
[Note: Everything from here on is basically propagation of the same root cause exception, but here it is for completeness:]
Traceback (most recent call last):
File "/home/brett/Scratch/venv/lib/python3.12/site-packages/cwl_utils/expression.py", line 315, in do_eval
return interpolate(
^^^^^^^^^^^^
File "/home/brett/Scratch/venv/lib/python3.12/site-packages/cwl_utils/expression.py", line 215, in interpolate
e = evaluator(
^^^^^^^^^^
File "/home/brett/Scratch/venv/lib/python3.12/site-packages/cwl_utils/expression.py", line 161, in evaluator
return cast(CWLOutputType, js_engine.eval(ex, jslib, **kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/brett/Scratch/venv/lib/python3.12/site-packages/cwl_utils/sandboxjs.py", line 514, in eval
raise JavascriptException(
cwl_utils.errors.JavascriptException: Expecting value: line 1 column 1 (char 0)
script was:
01 "use strict";
02 var inputs = {};
03 var self = null;
04 var runtime = {
05 "cores": 1,
06 "ram": 256,
07 "tmpdirSize": 1024,
08 "outdirSize": 1024,
09 "tmpdir": null,
10 "outdir": null
11 };
12 (function(){
13 var result = [];
14 for (var i=0; i<100000; i++) {
15 var arr = [];
16 result.push(arr);
17 for (var j=0; j<100000; j++) {
18 var num = i * 100000 + j;
19 arr.push({
20 class: "File",
21 location: "file:///tmp/array.cwl/file" + num.toString() + ".fq",
22 format: "http://edamontology.org/format_1930"
23 });
24 }
25 }
26 return {"out_arr": result};
27 })()
stdout was: ''
stderr was: ''
ERROR Got workflow error
[error messages and tracebacks repeat from here]
Your Environment
cwltool version: 3.1.20240404144621
Diagnosis
This occurs because of a few related issues in the way that the expression evaluation timeout is implemented:
In general, there's a lot of code in exec_js_process that seems like it could be handled by the subprocess module directly, following the example code documented with Popen.communicate. A rough sketch would be something like:
This would still need code at the top to find the name of the Node command, handle other arguments, etc. But I think this could replace all the process and timeout management code that currently lives in the method, while addressing the bugs above.
The text was updated successfully, but these errors were encountered:
Expected Behavior
When an evaluation expression exceeds the timeout, the resulting error should be clear that is the cause.
Actual Behavior
At least some of the time, this situation raises a
JSONDecodeError
exception. If the user doesn't realize the timeout might be the issue, this can cause them to lose a lot of time trying to debug an issue in the expression itself where none exists.This is sort of a sequel issue to #1680.
Workflow Code
eval-timeout.zip
Full Traceback
This issue was noticed with a "real" expression that similarly built a large array of an array of files, and hit the default evaluation timeout. However, you can reproduce the issue using the attached minimal reproduction and a small evaluation timeout:
Your Environment
Diagnosis
This occurs because of a few related issues in the way that the expression evaluation timeout is implemented:
nodejs.poll()
will returnNone
, causing the conditional to assume it succeeded: https://github.com/common-workflow-language/cwl-utils/blob/0ad6983898f0d9001fe0f416f97c4d8b940e384a/cwl_utils/sandboxjs.py#L272-L280In general, there's a lot of code in
exec_js_process
that seems like it could be handled by the subprocess module directly, following the example code documented withPopen.communicate
. A rough sketch would be something like:This would still need code at the top to find the name of the Node command, handle other arguments, etc. But I think this could replace all the process and timeout management code that currently lives in the method, while addressing the bugs above.
The text was updated successfully, but these errors were encountered: