Always raise a received result-as-error in spawn tasks #288
base: master
Conversation
Ahah, so the (main) issue with this solution is that we don't get all errors from all child actors accumulated as before, since some actors may get cancelled and their results (even when errors) abandoned. I've been trying some other things, but likely the way to solve the underlying issue and get as much error collecting as possible would be to do a 2nd pass of all child portals at nursery close, once all subprocs are known to be terminated, and then fill any later received errors into the |
In an effort to support `.run_in_actor()` error raising by our nursery we ideally collect as many child errors as possible during nursery teardown and error collection/propagation. Here we try a couple things:
- factor the per-actor error retrieval into a new `pack_and_report_errors()`
- when a result retrieval via `exhaust_portal()` is cancelled, pack the `trio.Cancelled` into the `errors: dict`, expecting to rescan for errors for any such entries after process termination
- at the end of the spawn task, conduct a timed-out 2nd retrieval of any late-delivered error from the child task, for each entry in `errors` containing a cancelled

This causes a bunch of cancellation tests to still fail, seemingly due to the race case where the OCA nursery may have requested cancellation of children *before* they can remote-error, and thus the `MultiError` matching expectations aren't (always) going to be correct. Previously we always waited for all `.run_in_actor()` results to arrive and did **not** raise any errors early (which in turn triggers local cancellation).
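A rough sketch of the pack-then-rescan pattern, using asyncio as a stand-in for trio (the `exhaust_portal` name and the queue-backed "portal" here are hypothetical simplifications, not tractor's real API):

```python
import asyncio

async def pack_errors_demo() -> dict:
    # Stand-in "portal": the child's final result arrives on a queue.
    result_q: asyncio.Queue = asyncio.Queue()
    errors: dict = {}
    uid = ("child", "0")

    async def exhaust_portal():
        # Stand-in for draining a portal's final result.
        return await result_q.get()

    # First pass: the retrieval is cancelled before the child errors.
    task = asyncio.ensure_future(exhaust_portal())
    await asyncio.sleep(0)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError as err:
        # Pack the cancellation into the errors dict; this entry
        # marks the child for a rescan after process termination.
        errors[uid] = err

    # The child "terminates" and its error is finally delivered.
    result_q.put_nowait(RuntimeError("boom"))

    # Second pass: a timed-out retrieval of any late-delivered error,
    # only for entries that currently hold a cancellation.
    for uid, err in list(errors.items()):
        if isinstance(err, asyncio.CancelledError):
            try:
                final = await asyncio.wait_for(exhaust_portal(), timeout=0.1)
            except asyncio.TimeoutError:
                continue
            if isinstance(final, Exception):
                errors[uid] = final
    return errors

errors = asyncio.run(pack_errors_demo())
```

After the second pass the entry for the child holds the real `RuntimeError` rather than the cancellation that masked it.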
Oh yeah ignore the |
@@ -369,6 +397,11 @@ async def new_proc(
                f"{subactor.uid}")
            nursery.cancel_scope.cancel()

            # if errors:
This was the alternative approach mentioned in #287: cancelling the ria_nursery
instead of raising any final error retrieved from the portal.
I think this approach is a bit harder to extend wrt supervisor strategies being overridden by a user.
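For contrast, the alternative strategy can be sketched like so, again with asyncio standing in for trio's nurseries and cancel scopes (all names here are illustrative, not tractor's actual internals): on the first child error, cancel the remaining children and collect everything, instead of raising immediately.

```python
import asyncio

async def supervise() -> dict:
    errors: dict = {}

    async def child(uid: str, fail: bool):
        if fail:
            raise RuntimeError(f"{uid} failed")
        await asyncio.sleep(10)  # runs "forever" unless cancelled

    tasks = {
        uid: asyncio.ensure_future(child(uid, fail))
        for uid, fail in [("a", True), ("b", False)]
    }
    done, pending = await asyncio.wait(
        tasks.values(), return_when=asyncio.FIRST_EXCEPTION
    )

    # Alternative strategy: rather than raising the first retrieved
    # error, cancel the remaining children (the "ria_nursery") and
    # collect all outcomes for a later decision.
    for t in pending:
        t.cancel()
    for uid, t in tasks.items():
        try:
            await t
        except asyncio.CancelledError:
            errors[uid] = "cancelled"
        except Exception as err:
            errors[uid] = str(err)
    return errors

errors = asyncio.run(supervise())
```

The trade-off the comment points at: with cancellation baked in here, a user-supplied supervisor strategy has fewer hooks to decide what happens on a child error.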
Hopefully fixes #287.
This small change ensures that any actor spawn task that receives an error from its remote target will raise that error locally, thus triggering a surrounding nursery error and possible teardown.
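A minimal sketch of that raise-on-received-error behavior, with asyncio standing in for the trio nursery and hypothetical stand-ins for the remote target and spawn task:

```python
import asyncio

async def spawn_task(remote_result):
    # Stand-in for an actor spawn task draining its portal's result.
    result = await remote_result
    if isinstance(result, Exception):
        # Always raise a received result-as-error locally, so the
        # surrounding nursery sees it and can tear down.
        raise result
    return result

async def main() -> str:
    async def remote_ok():
        return 42

    async def remote_err():
        # A child whose "result" is actually an error.
        return RuntimeError("child failed")

    try:
        await asyncio.gather(spawn_task(remote_ok()), spawn_task(remote_err()))
    except RuntimeError as err:
        # The surrounding scope observes the child's error.
        return str(err)
    return "no error"

msg = asyncio.run(main())
```

The key point is that the error surfaces in the parent scope as soon as it is received, rather than being silently recorded until all results arrive.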
ping @didimelli