Perception
ArUco markers (a type of fiducial marker) are special patterns that encode numbers. The course will often have these placed around so we can orient our rover and execute a certain task.
OpenCV is used to run detection on the camera stream. This gives us information about where the tag is in pixel space, specifically its four corners. We can then fuse this with point cloud data, which gives us the xyz position of any given pixel relative to the camera. Specifically, we query the point cloud at the center of the marker and thus find its transform relative to the rover.
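The corner-averaging and point cloud lookup can be sketched as follows. This is a minimal Python sketch using a toy organized point cloud (one xyz point per pixel, in the same row-major layout as the image); all names are illustrative, not from the actual codebase.

```python
# Sketch: given the four corner pixels of a detected ArUco tag, find its
# center pixel and look up the corresponding 3D point in an organized
# point cloud. Names are illustrative, not from the actual codebase.

def tag_center(corners):
    """Average the four (u, v) corner pixels to get the center pixel."""
    u = sum(c[0] for c in corners) / 4.0
    v = sum(c[1] for c in corners) / 4.0
    return round(u), round(v)

def query_cloud(cloud, width, u, v):
    """Index an organized cloud (flat list of xyz tuples) at pixel (u, v).

    Returns None where the stereo matcher produced no depth, which the
    update loop must tolerate.
    """
    return cloud[v * width + u]

# Toy 4x2 organized cloud: one xyz tuple per pixel, row-major.
width = 4
cloud = [
    (0.0, 0.0, 1.0), (0.1, 0.0, 1.0), (0.2, 0.0, 1.1), None,
    (0.0, 0.1, 1.0), (0.1, 0.1, 1.0), (0.2, 0.1, 1.1), None,
]

center = tag_center([(0, 0), (2, 0), (2, 1), (0, 1)])  # -> (1, 0)
xyz = query_cloud(cloud, width, *center)               # -> (0.1, 0.0, 1.0)
```

In the real pipeline the cloud comes from the stereo camera driver and is queried in a callback; the key point is that an organized cloud lets a pixel coordinate index a 3D point directly.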
We then publish the tags to the tf tree.
Update Loop:
- Detect the IDs and vertices in pixel space of ArUco tags from the current camera frame.
- Add any new tags to the "immediate" map or update existing ones. We calculate the center here by averaging the four vertices. If we also have a point cloud reading for this tag, publish it to the TF tree as an immediate tag relative to the rover. These readings are filled in by another callback.
- Decrement the hit counter of any tags that were not seen this frame. If a counter reaches zero, remove that tag entirely from the immediate map.
- Publish all tags to the TF tree that have been seen enough times. Importantly, this time they will be relative to the map frame, not the rover.
- Draw the detected markers onto an image and then publish it.
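The hit-counter bookkeeping in the steps above can be sketched as a small Python function. The cap and publish threshold are illustrative assumptions, not values from the actual node.

```python
# Sketch of the hit-counter bookkeeping in the update loop (illustrative,
# not the actual node's code). A tag gains a hit each frame it is seen and
# loses one each frame it is not; it is only trusted (published relative to
# the map frame) once its count crosses a threshold, and it is removed from
# the immediate map once its count reaches zero.

MAX_HITS = 5           # assumed cap so counters do not grow without bound
PUBLISH_THRESHOLD = 3  # assumed minimum sightings before trusting a tag

def update(immediate, seen_ids):
    """Update the immediate map given the tag IDs detected this frame.

    immediate: dict mapping tag id -> hit count (mutated in place)
    seen_ids:  set of tag ids detected in the current camera frame
    Returns the ids confident enough to publish in the map frame.
    """
    for tag_id in seen_ids:
        immediate[tag_id] = min(immediate.get(tag_id, 0) + 1, MAX_HITS)
    for tag_id in list(immediate):
        if tag_id not in seen_ids:
            immediate[tag_id] -= 1
            if immediate[tag_id] <= 0:
                del immediate[tag_id]  # remove entirely from the map
    return [t for t, hits in immediate.items() if hits >= PUBLISH_THRESHOLD]
```

For example, a tag seen in three consecutive frames starts being published, and a tag that disappears for as many frames as it was seen gets dropped. This filtering keeps one-frame false detections from ever reaching the TF tree.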
We have the option of using the ZED's built-in tracking or RTAB-Map stereo odometry. We have found that both are high quality, but the ZED's built-in tracking runs at a higher refresh rate at the cost of being more of a black box.
Communication between ROS nodes uses sockets by default, since each node runs in a separate process. We instead use nodelets, which all run inside the same process. In this way they share a virtual address space and can pass messages via pointers (zero-copy). This vastly increases the update rate at which perception is able to run.
At least 720p is recommended; anything lower will not work at long range. We also try to hit at least 10 Hz so information propagates to navigation fast enough.
- ArUco: Special pattern of black and white blocks that encodes a number. Often called markers/tags/targets
- Stereo Camera: A camera that uses stereo rectification to produce point clouds
- ZED 2i: The stereo camera that we use
- OpenCV: A computer vision library
- Point Cloud: A collection of 3D points that roughly describe a scene
- Odometry: The "pose" of an object, in other words a description of where it is in the world (usually position and rotation)
- Pixel Space (or Camera Space): x and y coordinates of where a pixel is in an image