-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ClearEntireCostmap returns before all costmap layer include valid data #4734
Comments
That is why there is a ‘isCurrent’ API that is checked before planning or control is allowed to make sure there is new data before the cost map is allowed to be used. Why the cost map might be in a state without valid data after a clear, it is not used until the data is again valid for all layers. Does that make sense? |
Thank you for the quick answer - that definitely makes sense :) . |
A collision checker though would be used inside of a controller/planner plugin, so if the controller / planner is not being utilized until data is valid yet, then there is no issue :-) There shouldn't be any calls to costmap data structures by the task server's algorithms until |
Are you saying to use the costmap topic collision detector object? That is a good point - I think though we could adjust how the costmap publishing is done so that we don't publish a new costmap until its current again. I don't think blocking the clear costmap service is a good idea since the costmap update rate is user-defined so it could be very slow. We could not update the published costmap topics as well as blocking the algorithm's use of the costmap until Is that what you mean / work for you? |
Thank your for your quick replies. Yes exactly - I was talking about the Yes that would be amazing and definitely solve our problem. I'm not sure if it could cause any problems for other existing users though. Maybe we could also introduce a parameter to determine if the use of the costmap should be blocked until |
@schupf2 I looked over the details and it should be as easy as adding As long as we keep the Would you be open to testing that out to make sure that resolves your concern and then submitting the PR? :-) |
Amazing thank you. |
I implemented the discussed change today and tested it. Unfortunately, it still doesn't solve the problem. I discovered that the
|
Ah - I didn't catch that in my initial analysis. I believe we set that to So from that perspective, this feels correct - we did successfully update the costmap layer, it just didn't contain a promise that data was processed yet. Though, if using a static layer, we know a map and inflation are now present on a single update. However the nature of "current" seems insufficient for the case where we (1) want to know if we performed a valid update and (2) the costmap layer actually had data that iteration as well. If you had a 5 hz publishing sensor and a 10 hz updating costmap, not each costmap update would have a new piece of sensor data and that is an expected behavior. Or a situation with multiple observations in a layer and only some subset of them are updated in the first iteration. That doesn't address our problem here, though. I'd be open to suggestions. A couple of initial thoughts:
It seems like we'd need to make some promise about the fact that each observation source in each data processing costmap layer has not only been called to update, but that update contained new current data. And, do so in a general purpose way that accounts for the fact that some data sources come in faster / slower, regular or irregular. And also if that's really necessary in the way that we're thinking about it right now. Any thoughts? |
Thanks for your detailed answer :). Yes, I agree that it does make sense for a costmap to be valid in general after it has undergone a complete update procedure. However, I think this makes the So from my perspective, also taking into account the presence of sensor data (also after a reset), seems like a valid solution. If I'm not missing something here, it might also be less likely to introduce further problems then the points you mentioned (especially since you already mentioned the problems ^^). The changes I would suggest:
However, I noticed another thing here: If the map was cleared, the obstacle layer reports being current already after the call to So to sum it up. I think if we remove line 760 and 544 in the Combined with the change of only publishing the costmap if it's current, this would also solve my problem (at least from what I can see in a preliminary analysis) :). |
I think it is consistent: it takes time before the sensor timeout can trigger. If you have a 10 hz updating sensor and you set your timeout to 0.2 seconds to give it some leeway, then when we
I don't entirely disagree with you, though I'm cautious because I know costmaps can be implemented with inputs that are either not regularly updating sensors (i.e. not a regular frequency) or based on slow updating inputs (1hz AI output or even on event basis). If we do this, then we would (1) have to wait until all costmap layers and their constituent observation sources have some new data and (2) block publication and algorithms until such a time -- which could be arguably good. We could make a carve out for special cases like the Static Layer that are not updated regularly, but that now means its another API that has to be implemented and managed. Doable, but starts to feel like patches on top of patches and is further complicated if the inputs are not simple lasers, depth cameras, etc that update regularly and frequently.
I tried to find the origins of this, but its been done since ROS 1 Navigation, so its hard for me to pinpoint. Its at least 11 years old, but it seems to me to have been in the stack since the very first commit on costmap 2d, or so close to it that even GitHub itself doesn't have the stored Blame information for me to backtrack further. I believe this could be due to a reset operation taking arbitrarily long, or reset is meant to reset all state and that's an important element of the layer's state. It used to unsubscribe to the topic, reset the map, then resubscribe to the topic, and finally reset the last updated time (which makes sense if a subscription is new). There was an So, I'm a bit cautious about a change to that.
Update bounds is called in the I can really see an argument for removing that
@mikeferguson can you provide some historical context and/or let me know your thoughts?
Tentatively agree. @mikeferguson do you concur? |
You're definitely right there. I also think we should keep it as simple as is and only consider the expected_update_rate instead of the actual presence of data.
Yes, if we used to unsubscribe and resubscribe to the sensor topic the In terms of "a reset operation taking arbitrarily long" - this should not matter right? For the
Definitely understand that though :-).
As far as I can see it, we should not obtain a observation buffer that is empty and valid IF we set the
:D
Yes exactly, but in the However, I didn't notice any problems with that so far, so I guess this should not cause problems since the costmap will be updated right after. Looking forward to @mikeferguson thoughts :). |
@mikeferguson: Still looking forward to your opinion on this :) We also suggest the same changes for the Spatio Temporal Voxel Layer , as we obtained the same problems here. We already have working implementations for both repositories which are currently undergoing some thorough testing. Furthermore, we observed a problem with removing stale observations in the stvl, which we mentioned in this issue. |
Sure thing, STVL / NPVL would probably be good to both be updated by this (though NPVL is so simple, it probably isn't required) |
I think, for costmap historical stuff, @DLu might have more knowledge than me (I read this entire ticket but actually don't have anything of value to add) |
Greetings - For historical perspective, the As far as I've found, the current status was not checked before running recovery behaviors such as clearing the costmap. I also don't believe that the recovery itself checked or set the current status. Line 760 was introduced in 2019 by @crdelsey in #1412. Line 544 was @SteveMacenski in 2020 with #2090. Ultimately, I think that there are three distinct states intertwined with this bool, which I'm going to assign arbitrary names.
I haven't yet traced through all the changes to |
Thank you @DLu for your detailed response and providing historical insight :) Why do you think removing line 760 prevents the costmap from being |
Since I already have the implementation for all my suggestions, I created a pull request (#4800) so some changes might be more visible / easier to discuss. Looking forward to your comments on this. |
Bug report
Required Info:
Steps to reproduce issue
When working with a costmap with more then one different layers (in our case, we use
["static_layer", "obstacle_layer", "inflation_layer"]
), we face a problem after clearing the costmap: The thenav2_msgs::srv::ClearEntireCostmap
service returns after the costmap is cleared. However, at this point no scanner measurements are considered in the obstacle layer. Thus, the space which is not considered in the static layer is marked as free for a short time until the next scanner measurement is included in the obstacle layer.Expected behavior
There should be an option to either enforce waiting on information in all layers being present before returning from the ClearEntireCostmap service call, or some option to verify the presence of information on all layers.
Actual behavior
After the ClearEntireCostmap service call the costmap only shows values form that static layer and no scanner measurements are yet considered in the obstacle layer. Therefore, obstacles in front of the robot are ignored for a brief time.
The text was updated successfully, but these errors were encountered: