Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix lift disruption issue #393

Merged
merged 5 commits into from
Nov 12, 2024
Merged

Fix lift disruption issue #393

merged 5 commits into from
Nov 12, 2024

Conversation

mxgrey
Copy link
Contributor

@mxgrey mxgrey commented Nov 11, 2024

This PR fixes #389

The cause of the problem was a subtle error in some conditional logic which only gets triggered when a task gets cancelled or a replan occurs while a robot is waiting for a lift but the lift has not yet arrived. Beginning in #310 the horizon of a robot is only reported up to the point where it reaches a lift and will need to wait for the lift to arrive. Once the lift arrives, the robot will replan to produce the next plan horizon that it will report to the traffic schedule. There was a flawed conditional statement inside of RequestLift to watch for when this occurs and then short-circuit the RequestLift logic.

That conditional statement is fixed by having RequestLift preserve the arrived / not arrived state inside the RobotContext and then checking that before it tries to short-circuit.

Signed-off-by: Michael X. Grey <[email protected]>
@mxgrey mxgrey requested a review from xiyuoh November 11, 2024 09:25
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
@xiyuoh
Copy link
Member

xiyuoh commented Nov 12, 2024

Thanks for tracking down the cause and providing the fix! It definitely makes sense to check for the stores arrived / not arrived state.

I noticed that the short-circuit logic will never be triggered, since _lift_arrived will only be set to true when RequestLift::ActivePhase::_finish() is called, which will only happen after the has_lift_arrived() check at the start of _init_obs(). This lead me to wonder if we need the short-circuit logic at all. Let me know if I'm misunderstanding the logic here.

@mxgrey
Copy link
Contributor Author

mxgrey commented Nov 12, 2024

_lift_arrived will only be set to true when RequestLift::ActivePhase::_finish() is called, which will only happen after the has_lift_arrived() check at the start of _init_obs().

Unfortunately it's a very confusing and circular workflow, but notice that _context->request_replan() gets triggered after _set_lift_arrived is used. This will cause the plan to cycle back and redo the RequestLift, at which point we want to short-circuit the logic so we don't get thrown back into that cycle.

This strange cycle is done because waiting for the lift can take so long that the remainder of the plan might need to change by the time the lift arrives. Once we know that the lift has arrived and is ready for the robot to board, we do this replan based on the latest information about what's going on in the environment before sending the robot into the lift.

For the next generation we should do something much cleaner with workflows, but we're stuck with this weird cyclical logic for now.

Copy link
Member

@xiyuoh xiyuoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks for explaining! The fix LGTM, tested it a few times in a modified hotel world as well and the bug seems to be gone 👍

@mxgrey mxgrey merged commit dd91566 into main Nov 12, 2024
6 checks passed
@mxgrey mxgrey deleted the fix_lift_disruption branch November 12, 2024 05:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Robot enters lift without session when its ongoing task was cancelled and assigned with another task
2 participants