-
Notifications
You must be signed in to change notification settings - Fork 191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fixes #199 #200
fixes #199 #200
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of questions, but overall LGTM now. Next steps after resolving these questions is to cherry pick this commit to noetic / ros2 / foxy
@@ -716,10 +723,12 @@ void SpatioTemporalVoxelLayer::updateBounds( \ | |||
ObservationsResetAfterReading(); | |||
current_ = current; | |||
|
|||
std::unordered_set<volume_grid::occupany_cell> cleared_cells; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why an unordered_set
? Why not a vector for instance? Was this experimentally shown to be faster?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose unordered_set
to avoid duplicates. When clearing the world points, we don't care about the Z coordinate. If we clear world point (1, 2, 3) and (1, 2, 4), they're both gonna be in the same costmap cell, so there is no need to store and iterate through it twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, that seems reasonable. I'd ask though if you'd test both and use which ever is practically faster. The STL Vector implementation is very cache friendly, so even areas where you wouldn't think it would be faster to use (because of duplications for instance), it ends up being faster due to continuous memory buffers (I ran into that alot while developing this: what you "think" should be faster based on theory and what is actually faster are not always intuitive). I don't know if it would be faster or not, so please test and let me know.
You can create an update timer pretty easily with chrono
StartTime = std::chrono::system_clock::now();
... do work
endTime = std::chrono::system_clock::now();
std::cout << "time took " std::chrono::duration_cast<std::chrono::milliseconds>(endTime - StartTime).count() << " ms."<< std::endl;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, vector
is a lot faster than unordered_set
when iterating through its contents, even though its size is a lot bigger due to duplicated elements. However, vector
is only faster if we don't do anything while iterating. Since we need to touch every element, then unordered_set
becomes the faster alternative since it has fewer elements - meaning less calls to touch
.
I recorded two small clips where you can see the time it takes for both cases, when decay time is set to 0 (i.e. instant clearing):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the looping testing? Just the loop without touch
(so empty for
?)
Keep in mind its not just access for iteration, its also about insertion time since vector insertion would probably be faster due to no hash table lookups for existing keys. You should try to measure the full update and take some metrics based on that since the insertions are pretty built into the clearing function. I agree its looking good for the unordered_set
but just want to validate. I did alot of testing on datastructures to make STVL real-time so I want to ensure this doesn't have an undo impact on the overall performance.
So tl;dr
- Measure this update cycle metric
- Measure this with vector update cycle metric
- Measure the baseline before-you-changed-anything metric
So we can answer the questions:
- Is vector or unordered_set the right answer?
- What impact does this have on the update time compared to what existed before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I averaged 450 update cycles with the following scenario (decay time set to 0):
The start time was set right before defining the cleared_cells
variable:
std::unordered_set<volume_grid::occupany_cell> cleared_cells; |
and end time at the very bottom of the function, right before return
.
If I set the start time at the top of the function the execution times were very inconsistent, I'm guessing due to the calls to GetObservations
methods. I don't think my changes should affect the performance before we clear anyway, so I set the start time after the observations are obtained, in order to get more consistent times.
Results (units in microseconds):
- baseline
- mean: 2983
- std: 430
- unordered_set
- mean: 3761
- std: 921
- vector
- mean: 3512
- std: 836
So unordered_set
and vector
have basically the same performance, which was ~0.5ms worse than the baseline.
I think there's something wrong with the way I add the new marked points to If both the robot and the obstacle are static, the point cloud should be almost the same every cycle. However, with this PR the point cloud is a lot denser than before, meaning the points I add when marking have different coordinates than the ones that are added on clearing. |
Not sure, it looks OK to me, you should only see the new points from the measurement sensor that weren't in the grid already. Are they overlapping each other? The only thing I can think of is that since you input the global coordinates directly from the sensor, that these points weren't already in the |
Yup, that was the problem. I compared the size of
Since it seems like we can't input the global coordinates directly, I removed my overload of the I think the only problem left is the one I mentioned in the OP about the sides of the costmap not being properly cleaned. |
I don't recall that, what's that issue |
|
Were those points outside of the bounds of the local costmap? If so, they shouldn't be marking because they shouldn't be |
Nop.
The upstream branch didn't clear the obstacles at all. This PR at least clears them under certain conditions, although it seems like the main problem is still not solved. After some more testing, I think the problem is that in the video I posted in the OP there were no walls on the left side. Take a look at this video, where the robot is in the middle of the map with no walls near it. The obstacles stay permanently in the costmap until a different obstacle is detected in the same direction behind it. |
How does this PR not solve that issue? If your caching the clears, then it should be part of the next update bounds update to get queried and cleared out |
I don't know...! From what I can tell, there are only two places where the voxels are cleared, and I'm caching the cells in both cases: spatio_temporal_voxel_layer/src/spatio_temporal_voxel_grid.cpp Lines 206 to 207 in 651fcd7
spatio_temporal_voxel_layer/src/spatio_temporal_voxel_grid.cpp Lines 231 to 232 in 651fcd7
The current state of affairs is:
|
Make any progress on this? |
Hi, unfortunately I wasn't able to find the cause of the issue. Since touching the cleared cells doesn't seem to fix it, then maybe the underlying problem is something different from your initial guess? If you have any ideas I'd be happy to try them out! Otherwise I can also provide a minimum working example if you'd like. |
@siferati is it as simple as
Keep in mind there could be multiple cells in that column that could still be marked so it shouldn't be always clearing
|
The problem remains even if I do that. But while testing your idea I found something very interesting! a) If I mark cleared cells as free space before lethal obstacles are marked, the problem remains: void SpatioTemporalVoxelLayer::UpdateROSCostmap( \
double* min_x, double* min_y, \
double* max_x, double* max_y, \
std::unordered_set<volume_grid::occupany_cell>& cleared_cells)
{
// grabs map of occupied cells from grid and adds to costmap_
Costmap2D::resetMaps();
std::unordered_set<volume_grid::occupany_cell>::iterator cell;
for (cell = cleared_cells.begin(); cell != cleared_cells.end(); ++cell)
{
uint map_x, map_y;
if (worldToMap(cell->x, cell->y, map_x, map_y))
{
costmap_[getIndex(map_x, map_y)] = costmap_2d::FREE_SPACE;
touch(cell->x, cell->y, min_x, min_y, max_x, max_y);
}
}
std::unordered_map<volume_grid::occupany_cell, uint>::iterator it;
for (it = _voxel_grid->GetFlattenedCostmap()->begin();
it != _voxel_grid->GetFlattenedCostmap()->end(); ++it)
{
uint map_x, map_y;
if ( it->second >= _mark_threshold && \
worldToMap(it->first.x, it->first.y, map_x, map_y))
{
costmap_[getIndex(map_x, map_y)] = costmap_2d::LETHAL_OBSTACLE;
touch(it->first.x, it->first.y, min_x, min_y, max_x, max_y);
}
}
} b) However, if I mark cleared cells as free space after lethal obstacles are marked, the problem is fixed: void SpatioTemporalVoxelLayer::UpdateROSCostmap( \
double* min_x, double* min_y, \
double* max_x, double* max_y, \
std::unordered_set<volume_grid::occupany_cell>& cleared_cells)
{
// grabs map of occupied cells from grid and adds to costmap_
Costmap2D::resetMaps();
std::unordered_map<volume_grid::occupany_cell, uint>::iterator it;
for (it = _voxel_grid->GetFlattenedCostmap()->begin();
it != _voxel_grid->GetFlattenedCostmap()->end(); ++it)
{
uint map_x, map_y;
if ( it->second >= _mark_threshold && \
worldToMap(it->first.x, it->first.y, map_x, map_y))
{
costmap_[getIndex(map_x, map_y)] = costmap_2d::LETHAL_OBSTACLE;
touch(it->first.x, it->first.y, min_x, min_y, max_x, max_y);
}
}
std::unordered_set<volume_grid::occupany_cell>::iterator cell;
for (cell = cleared_cells.begin(); cell != cleared_cells.end(); ++cell)
{
uint map_x, map_y;
if (worldToMap(cell->x, cell->y, map_x, map_y))
{
costmap_[getIndex(map_x, map_y)] = costmap_2d::FREE_SPACE;
touch(cell->x, cell->y, min_x, min_y, max_x, max_y);
}
}
} Since b) works but both a) and your suggestion don't, it seems to me that the problem is that the voxel flattened costmap contains cells that should have been cleared but weren't. Or rather, since
is called every update, it seems more logical to say that there is obstacle information being added to the costmap about obstacles that no longer exist. |
I'll need to think about that a little.
This isn't quite right I don't think. You didn't check if the cleared voxel clears enough of the column of the voxels to make it as free now. If you had a column of 10 voxels and you cleared one, you shouldn't mark the costmap now as clear, its still occupied. You need to check for the number of cells still marked in the column and if this clearing operation now brings the number of voxels marked in this column of the costmap to Though, I have to admit, I would need to sketch it out, but the logic of placing the clearing after marking might implicitly solve that for you since we only mark as LETHAL when the mark threshold is already met (and is cleared every iteration). In fact, since we're clearing the costmap every single iteration ( Where is the state being tracked between iterations incorrectly needing any of this? I don't think that block of code I suggested above should actually be there, if that's working, that means something is marking points that shouldn't have been marked at all.
I don't see where we went wrong that there would be anything marked in the The only slight hiccup I see is that when you added I'm trying to follow that train of logic and I don't see how that would result in over marking if the obstacle simply vanished (because there would be no marking at that cell to cause it), but maybe that's what's actually causing this? |
I did it according to your suggestion. I just didn't post the code snippet since you had already posted the pseudo-code above. And since marking every cleared cell as free space before lethal obstacles are marked doesn't fix the problem, it makes sense that your suggestion also wouldn't work. Marking every cleared cell as free space and then overwriting the ones that are above a threshold with lethal obstacles would result in the same costmap as first marking the cells above the threshold as lethal obstacles and then marking as free space the cells below that threshold.
Because we need to touch the cleared cells to update the bounds. Quoting your previous comment: the internal voxel grid is empty on clearing, then nothing new marks it because the obstacle instantly disappeared, and then when it comes time to update the bounds, there's nothing to update them with (and there's no static layer to set the reasonable update bounds).
The only way I can think of to solve this is to keep track of the points that we marked in the clearing update in some container, and then when marking new points check for inclusion in that container: if it exists, then we skip the first +1. But this would likely slow down the process quite a bit, since for every marking observation we would have to check if it exists in that container. I'm not sure if it's worth doing this in order to avoid the column count sometimes being off by 1. I finally found the problem!
When the point is cleared here spatio_temporal_voxel_layer/src/spatio_temporal_voxel_grid.cpp Lines 227 to 232 in e077795
we spatio_temporal_voxel_layer/src/spatio_temporal_voxel_grid.cpp Lines 203 to 206 in e077795
we do nothing. So despite clearing this point, we will still populate it in spatio_temporal_voxel_layer/src/spatio_temporal_voxel_grid.cpp Lines 234 to 235 in e077795
I added a flag to avoid calling |
Rather than That also makes me wonder if we also break on L238 after the other clearing operation, then we never hit the I think the fact that I didn't add |
If you spatio_temporal_voxel_layer/src/spatio_temporal_voxel_grid.cpp Lines 171 to 185 in e077795
If you break on L238, then you're breaking out of the outer loop (grid points), meaning you will return from the function and not evaluate the remaining points. That's why there was a I simply replaced the
I thought a bit more about this and realized that it is actually not just off by 1, but off by double. If we have some object that is made up of, for example, 10 voxels in a vertical line (so they all share the same (x, y) coords), then what's gonna happen is that in the clearing update we're gonna update the corresponding costmap cell 10 times, resulting in the cell having a value of 10, and then when marking we're gonna update it another 10 times -- resulting in that cell having a column value of 20! Given the above, then I think it is better to either fix it or revert the marking logic so that it stays 1 update behind (i.e. the current upstream behaviour, before this PR). What do you think? |
Yes, that would be the point, then you move onto the next point, which would then have the opportunity to be populated. But that would I think, unless I'm missing something, that flag isn't necessary. I agree with you on L238, that should be a I do really think though from the looking I did this morning is if we re-add the From the behavior you reported too, that logic tracks for me. The reason it didn't clear is because it was still in the frustum view, and when we cleared, we didnt
Exactly, that's why I mentioned that since you added the |
Ok, I agree. Adding the
By inserting the spatio_temporal_voxel_layer/src/spatio_temporal_voxel_grid.cpp Lines 222 to 233 in e077795
Since spatio_temporal_voxel_layer/src/spatio_temporal_voxel_grid.cpp Lines 234 to 235 in e077795
So you will still populate the costmap with the cleared point. In order to avoid this, you would need to
Agreed. |
Ah yes, we break out of the inner loop, it doesn't just reset to the outer, it still has that code left. OK. A bool
after the |
But that's exactly the same behavior of what I have already done... The only difference is that with the way you're suggesting I would have to write Edit: only thing missing is to add the |
Well I think we've found from this method, we don't require |
We've already discussed this a few comments above:
There are 3 different problems at play here:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I think the rest of this tracks for me after that change. Have you tested this that it resolves the issues and doesn't break anything?
This PR fixes #199. I didn't find any problems when testing. |
OK here's what I recommend for next steps:
@doisyg can you test this either in ROS1 or ROS2 (once a PR for ROS2 is made)? Just to sanity check this doesn't break anything for you? |
Done. But I don't have access to a test environment for those distributions, so I wasn't able to test if things actually worked -- I'll leave that to someone else. |
I may have some bandwith to do a RO2 sanity check on our hw platform (which uses 2 3D cameras in STVL) at the end of the week. Please ping me back then if I don't |
@doisyg friendly ping :-) |
I tested the PR on our ROS2 hardware, it solves the issue without impacting the behaviour of the STVL |
Sweet, merging! |
* cherry-picked melodic fix #200 (#204) * bumping to 2.2.0 for galactic release (#219) * Fixing the offset that exists between the grid and the raw data (#226) Co-authored-by: Tiago Silva <[email protected]> Co-authored-by: Kaven Yau <[email protected]>
Implementation of the suggestion by @SteveMacenski to fix #199:
Cells that were cleared are stored in an
unordered_set
and touched when updating the ROS costmap.Bugs fixed in this PR:
cleared_cells
container solves this.cleared_point
flag solves this.(11/05) Edit: Updated to better describe the work that was done.