[ related to metrumresearchgroup/bbr#685, +cc @seth127 @barrettk ]
For the various errors where `RecordConcurrentError` is invoked (including a non-zero exit of the `nmfe*` subprocess; see bbi/cmd/local.go, lines 185 to 191 at 835953e), `true` is sent to the `cancel` channel (see bbi/cmd/root.go, lines 177 to 178 at 835953e), which stops the execution before the turnstile cleanup phase.
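For illustration, here's a minimal sketch of that cancel-channel pattern. The structure and names are hypothetical, not bbi's actual code; the point is just that once `true` lands on the channel, execution never reaches the cleanup step:

```go
package main

import (
	"fmt"
	"os/exec"
)

func main() {
	// cancel mirrors bbi's cancel channel: errors are reported by
	// sending true, which the main goroutine treats as "stop now".
	cancel := make(chan bool, 1)
	done := make(chan struct{})

	go func() {
		// "false" always exits non-zero, standing in for a failed
		// nmfe* subprocess.
		if err := exec.Command("false").Run(); err != nil {
			// Analogous to RecordConcurrentError: signal an abort.
			cancel <- true
			return
		}
		close(done)
	}()

	select {
	case <-cancel:
		// Execution stops here, before any cleanup phase, so work
		// like writing bbi_config.json never happens.
		fmt.Println("canceled: cleanup skipped")
	case <-done:
		fmt.Println("finished: cleanup would run here")
	}
}
```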
As a result of that early cancellation, `bbi_config.json` isn't written. As discussed at metrumresearchgroup/bbr#685, from bbr's perspective it would be useful to have `bbi_config.json` consistently written as an indication that a model run has completed (even if unsuccessfully).

When considering this change, we should go through local's `Cleanup` and decide which parts should run in case of a failure.

Here are three related issues mentioned in the bbr thread. (At this point I'll track them here, but if they don't end up getting handled by the same series, they should be split off into dedicated issues.)
- if bbi aborts early due to the data file not existing, it doesn't write the config file or create the output directory. From a consistency perspective, it might be good if `bbi_config.json` were created. I'm not sure where to draw the line between a pure bbi early non-zero exit (e.g., unknown argument given) and a model run error. But if the data file doesn't exist, that's something nmfe would fail on, and bbi is just catching it early, so it seems reasonable for it to behave consistently with other nmfe failures. There are perhaps other cases that behave similarly to the non-existent data path.
- if `--overwrite` is passed but bbi aborts due to a non-existent data path (again, there may be other cases), the directory is not removed. That's confusing from bbr's perspective because it looks at `bbi_config.json` and will think it's from its latest model submission.
- bbr is interested in knowing whether a model is currently running. To help, bbi could do the combination of 1) adding a file before launching `nmfe*` and removing it on cleanup and 2) ensuring cleanup still fires when the `nmfe*` subprocess exits with a non-zero status. (A sketch of this pattern follows the list.)
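As a sketch of that last point, here's one way the sentinel-plus-deferred-cleanup combination could look. The file name `bbi_running` and the overall structure are assumptions for illustration, not bbi's actual implementation:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// runWithSentinel drops a sentinel file in outputDir before launching
// the command. The deferred removal runs on every return path, including
// a non-zero exit, so "is the model running?" can be answered by
// checking for the sentinel's existence.
func runWithSentinel(outputDir string, cmd *exec.Cmd) error {
	sentinel := filepath.Join(outputDir, "bbi_running") // hypothetical name
	if err := os.WriteFile(sentinel, nil, 0o644); err != nil {
		return err
	}
	// defer fires even when cmd.Run returns an error below.
	defer os.Remove(sentinel)

	return cmd.Run()
}

func main() {
	dir, _ := os.MkdirTemp("", "demo")
	defer os.RemoveAll(dir)

	// "false" always exits non-zero, standing in for a failed nmfe* run.
	if err := runWithSentinel(dir, exec.Command("false")); err != nil {
		fmt.Println("run failed:", err)
	}
	if _, err := os.Stat(filepath.Join(dir, "bbi_running")); os.IsNotExist(err) {
		fmt.Println("sentinel removed even though the run failed")
	}
}
```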