Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The tolerance of navfn_planner and bt_navigator issues when a robot is stuck #1683

Closed
QQting opened this issue May 5, 2020 · 9 comments
Closed

Comments

@QQting
Copy link
Contributor

QQting commented May 5, 2020

Bug report

Required Info:

  • Operating System:
    • Ubuntu 18.04
  • Version or commit hash:
    current master branch
    ecc42fb
  • RMW:
    CycloneDDS

Steps to reproduce issue

I've tested my robot in a 2x2 meter square area with current master branch. I intentionally make my robot stuck in a corner and the robot has no way to escape. Then I send a goal asking my robot to navigate to another corner. There is no valid path can be generated from the robot's position to the goal because my robot is stuck. However, the navigation system just tells me the robot reaches the goal even my robot is (1.5 meter) far from the goal.

robot_stuck

It took me a while to figure out until I found the tolerance of navfn_planner is set to 2.0 meters according to this discussion. After I change the value back to 0.0, it looks reasonable because the planner_server failed to create a plan. But there is another issue: planner_server aborts its handle and no replanning or recovery happens.

Expected behavior

What I expected is the stuck robot should NOT be able to create a valid path and then going to replanning or recovery by following navigate_w_replanning_and_recovery.xml.

Actual behavior

When the tolerance of navfn_planner is set to 2.0:
Navigation succeed but actually the robot is still stuck at the same place.

When the tolerance of navfn_planner is set to 0.0:
ComputePathToPoseAction keeps ticking on_wait_for_result() even the planner server has aborted its handle. In this situation, new goal can't preempt the current goal until I restart navigation stack.

@SteveMacenski
Copy link
Member

When the tolerance of navfn_planner is set to 2.0:
Navigation succeed but actually the robot is still stuck at the same place.

Well, it was in tolerance so I'll say I think that's expected behavior. There's no way to tell if its valid because you're in tolerance of 2m (which could be totally reasonable depending on the application, like security). I think you just need to update your config file to reflect wanting a smaller tolerance. I will say that 2 seems high. If you submitted a PR reducing it to 1m or 0.5m, I'd merge it.

Do you think this issue has caused your problem? #1652 It says more or less that the abort is used as a preemption trigger and therefore aborts aren't working well. I think from what you describe I would think that it would continously be trying to replan, not fail to replan or not trigger recoveries. Since the controller server also reports an abort (I believe? maybe we should check that they both do it the same way / handled the same way by their BT nodes) when it fails to control, I believe that that aspect should be working. If you get in a bad control situation, you still see the controller fail, abort, and trigger recoveries.

@QQting
Copy link
Contributor Author

QQting commented May 5, 2020

Do you think this issue has caused your problem? #1652 It says more or less that the abort is used as a preemption trigger and therefore aborts aren't working well.

Yes, thanks for pointing out there is a similar issue. I just tried what @maxlein has mentioned in #1652:

If I remove the if clause, everything works as expected when for example the controller is aborting.

And then the bt_navigator goes well.

Updated:
If I mark below if clause, it seems the global path will not avoid the dynamic obstacle.

https://github.com/ros-planning/navigation2/blob/6fb43d4a0543897d29d38b3e4a749b6bea6108e1/nav2_behavior_tree/include/nav2_behavior_tree/bt_action_node.hpp#L221

@SteveMacenski
Copy link
Member

@QQting I haven't had time to look into this. Can you look into why the controller server when it fails it will trigger recoveries and why you've found that the planner won't? Maybe they behave differently in the BT plugin nodes or in the nav2_controller/nav2_planner servers.

@SteveMacenski
Copy link
Member

@QQting is that ticket causing your problems as well? If so, I'd like to close this and converge the conversations over there.

@QQting
Copy link
Contributor Author

QQting commented May 6, 2020

@SteveMacenski My ticket consists of two issues:

  1. The tolerance of navfn_planner
    For this issue, can we change the default tolerance back to 0.0 meter?
    Change to the same value as transform_tolerance of DWBLocalPlanner may also be possible?
  2. The same as BTActionNode does not abort #1652, I can try to look into it, thanks.

@SteveMacenski
Copy link
Member

For this issue, can we change the default tolerance back to 0.0 meter?

I'd accept a PR to reduce it to something like 0.5m, but I don't think its good behavior to have it default to 0. The configuration file can be used to change this value.

The same as #1652, I can try to look into it, thanks.

That would be great. Lets continue the discussion over there.

@QQting
Copy link
Contributor Author

QQting commented May 7, 2020

@QQting I haven't had time to look into this. Can you look into why the controller server when it fails it will trigger recoveries and why you've found that the planner won't? Maybe they behave differently in the BT plugin nodes or in the nav2_controller/nav2_planner servers.

Hi @SteveMacenski ,
After testing on my robot, I summarize three test results:

  1. accept_pending_goal() which is called by ControllerServer::updateGlobalPath() aborts current handle without any exception. So, the follow_path action server in ControllerServer continues to do its jobs.
    https://github.com/ros-planning/navigation2/blob/0c4aea88a2c553504693decd109e8733cf463137/nav2_util/include/nav2_util/simple_action_server.hpp#L237-L249

  2. If progress_checker throws a "Failed to make progress error" in ControllerServer, the follow_path action server will terminate itself as below codes. However, follow_path action client as a bt_action_node is still waiting for the result.
    https://github.com/ros-planning/navigation2/blob/0c4aea88a2c553504693decd109e8733cf463137/nav2_controller/src/nav2_controller.cpp#L275-L280

  3. If planner_server failed to create a plan like my stuck robot, compute_path_to_pose action server will terminate itself as below codes. Likewise, compute_path_to_pose action client as a bt_action_node is still waiting for the result.
    https://github.com/ros-planning/navigation2/blob/0c4aea88a2c553504693decd109e8733cf463137/nav2_planner/src/planner_server.cpp#L252-L259

@SteveMacenski
Copy link
Member

aborts current handle without any exception

Why would there be an exception? Its aborting the goal and the goal handle of the client will return a result of being aborted.

On point 2: the terminate_currentwill also send the same abort signal https://github.com/ros-planning/navigation2/blob/master/nav2_util/include/nav2_util/simple_action_server.hpp#L377 because it wasn't cancelled, it errorred out. Therefore the abort from an error looks identical to an abort from getting a new goal from preemption, as far as the action server result codes goes.

On point 3, I see the same terminate_current -> abort codes.

These all look identical to me

@SteveMacenski
Copy link
Member

Closing as its a duplicate of #1652. @QQting please submit a PR to change the defaults to 0.5m if you like.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants