Technical architecture
This notebook explores various OpenCV techniques, focused on pre-processing, background subtraction, and blob tracking. Low-pass filtering and grayscale conversion are the pre-processing techniques tried. Background subtraction serves as a mask for movement between two image frames. Blob tracking, which identifies contiguous blob segments of a certain shape, size, etc., works best in this case on the raw image itself rather than on pre-processed or background-subtracted images. This is likely because the image resolution is low and the vantage point is far from the tram, making it more difficult to identify a clear blob. In the cases where blobs are detected, the function returns the size and location of each blob. The largest blob is chosen as the tram, and depending on the x position of the tram between the two frames, the location is set as "arriving" or "leaving" relative to Roosevelt Island.
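A minimal sketch of the blob approach, assuming OpenCV's SimpleBlobDetector; the area bounds and the mapping of decreasing x to "arriving" are illustrative assumptions, not the notebook's exact tuning.

```python
import cv2

def largest_blob(frame):
    """Return ((x, y), diameter) of the largest detected blob, or None."""
    params = cv2.SimpleBlobDetector_Params()
    params.filterByArea = True
    params.minArea = 50       # rough pixel-area bounds around the tram's size
    params.maxArea = 5000
    detector = cv2.SimpleBlobDetector_create(params)
    keypoints = detector.detect(frame)
    if not keypoints:
        return None
    kp = max(keypoints, key=lambda k: k.size)   # assume the largest is the tram
    return kp.pt, kp.size

def direction(frame_a, frame_b):
    """Compare the tram's x position across two frames to label its motion."""
    a, b = largest_blob(frame_a), largest_blob(frame_b)
    if a is None or b is None:
        return None
    # The left/right-to-arriving/leaving mapping is an assumption.
    return "arriving" if b[0][0] < a[0][0] else "leaving"
```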
This notebook runs a simple test extracting basic intensity features of the tram images, namely the average and standard deviation of the RGB channels. Because there is no significant difference in these features between images with and without the tram, these features were not used.
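The test amounts to a couple of NumPy reductions per frame; a quick sketch (the filename is hypothetical, and note that OpenCV loads channels in BGR order):

```python
import cv2

frame = cv2.imread("tram_snapshot.jpg")    # hypothetical snapshot file
mean_bgr = frame.mean(axis=(0, 1))         # per-channel average intensity
std_bgr = frame.std(axis=(0, 1))           # per-channel standard deviation
print("mean:", mean_bgr, "std:", std_bgr)
```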
This notebook explores using background subtraction to identify tram presence and movement. Departure and arrival images are captured. From the absolute difference in intensity values between frames, we see a major spike when the tram is moving. When the tram is not there, the summed difference is around 470,000; when it is fully in frame, it spikes to around 1,000,000. Finally, based on the position of the first detected tram image, if it is more on the left side of the image it is classified as arriving; if it is more on the right side, it is classified as departing.
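A minimal sketch of the frame-differencing test, assuming grayscale frames of equal size; the 700,000 cut-off is an illustrative midpoint between the ~470,000 baseline and the ~1,000,000 spike reported above.

```python
import cv2
import numpy as np

def frame_delta(prev_frame, curr_frame):
    return cv2.absdiff(prev_frame, curr_frame)

def tram_present(diff, threshold=700_000):
    return int(diff.sum()) > threshold

def classify_direction(diff):
    """Column of peak change: left half => arriving, right half => departing."""
    col = int(np.argmax(diff.sum(axis=0)))
    return "arriving" if col < diff.shape[1] // 2 else "departing"
```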
The BerryNetProvider class definition wraps the following open-source library: https://github.com/DT42/BerryNet. BerryNet is a very useful component that leverages Tiny-YOLO to do frame-by-frame object recognition across 80 classes. Our provider allows us to request a snapshot, retrieve the detection results in JSON format, and parse the results. The product worked really well, and we showcased it in our 4/18 technical check-in. The issue we ran into was training the model to add detection of tram cars. This proved difficult and overly complex because we only needed to track the tram and nothing else, and our camera is in a fixed position. Tiny-YOLO is really useful for scenarios where the user would like to classify many different items with a moving camera. In addition, due to the Pi's limited computational capabilities we could only process around one image every three seconds, which was a little too slow to accurately detect the tram as it departs or arrives under the 59th Street Bridge.
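A hedged sketch of the JSON-parsing half of the provider. The result path and field names ("annotations", "label", "confidence") are assumptions about the detection output, not confirmed details of the BerryNet API.

```python
import json

def parse_detections(result_path, min_confidence=0.5):
    """Return (label, confidence) pairs above a confidence floor."""
    with open(result_path) as f:
        result = json.load(f)
    return [(ann["label"], ann["confidence"])
            for ann in result.get("annotations", [])
            if ann.get("confidence", 0.0) >= min_confidence]
```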
The BerryNetServer main method instantiates our BerryNetProvider and uses a loop to analyze snapshots from the USB camera every three seconds. As snapshots are created, the images are stored on the Pi in the current directory and the parsed detections are output to the console.
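The loop itself is straightforward; a sketch, assuming the provider exposes snapshot() and detect() methods (those names are hypothetical):

```python
import time

def main():
    provider = BerryNetProvider()
    while True:
        image_path = provider.snapshot()         # capture and save locally
        print(provider.detect(image_path))       # parsed detections to console
        time.sleep(3)                            # roughly one frame every 3 s
```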
The TramBlobTracker class definition leverages OpenCV to use a tuned blob detection scheme to find the key points in an image that contain an object around the tram's size. This method did not prove to be very accurate, but more information can be found in openCV.ipynb, where all of the prototyping was documented in a Jupyter notebook.
The TramDiffTracker class definition leverages OpenCV to compare two images for general change. Using the absdiff method provided by OpenCV, we are able to compare two images sampled in one-second increments and very accurately detect the tram in a carefully cropped region of the frame. Given the fixed camera position of our setup, this works remarkably well and has allowed us to digitally crop out noise from the road, waterways, and bridge. In addition to this logic, we found that we could also very accurately detect the direction of the tram based on which side of our cropped section we detected the tram in first. The TramDiffTracker also maintains some local state, specifically the number of consecutive detections seen and the direction of the first detection. We do this so that we can set a threshold on the number of detections that have to occur before reporting an arrival or departure; this extra logic removed the few false positives we were seeing on first deployment. Second, we track the direction at the first detection because it most accurately reflects the direction in which the tram is traveling: if we waited to detect direction on a later detection, the tram would most likely have moved into the second half of the frame by then.
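A condensed sketch of the logic described above, assuming single-channel (grayscale) frames: absdiff over a cropped region, a consecutive-detection threshold, and direction taken from the first detection. The crop bounds, thresholds, and left/right mapping are illustrative placeholders, not the deployed values.

```python
import cv2

class TramDiffTracker:
    CROP = (slice(100, 200), slice(0, 400))   # (rows, cols) covering the tram path
    DIFF_THRESHOLD = 700_000                   # summed-absdiff cut-off
    MIN_CONSECUTIVE = 3                        # hits required before reporting

    def __init__(self):
        self.consecutive = 0
        self.first_direction = None

    def update(self, prev_frame, curr_frame):
        """Return "arriving"/"departing" after enough consecutive hits, else None."""
        diff = cv2.absdiff(prev_frame[self.CROP], curr_frame[self.CROP])
        if int(diff.sum()) < self.DIFF_THRESHOLD:
            self.consecutive = 0               # reset on any quiet frame
            self.first_direction = None
            return None
        if self.consecutive == 0:
            # Direction comes from the first detection: which half of the
            # crop shows the most change.
            col = int(diff.sum(axis=0).argmax())
            half = diff.shape[1] // 2
            self.first_direction = "arriving" if col < half else "departing"
        self.consecutive += 1
        return self.first_direction if self.consecutive >= self.MIN_CONSECUTIVE else None
```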
The TramState class definition contains the internal state and logic to report the current status of a docked or incoming tram to Roosevelt Island. The class keeps track of the last departure, the last arrival, the current running interval, and a boolean representing whether the tram is currently docked. It maintains three public methods: set arrival, set departure, and get wait (see the sketch after this list). The set methods are called when our detection logic determines that a tram is arriving at or leaving the island. The get wait method leverages the current state and interval to return one of the following three results:
- Unknown - The state of the tram is unknown: either the current time is not within the hours of operation, or the system has just booted and has not yet detected an arrival or departure.
- Docked - There is at least one tram docked on the island. This state also returns a count of how many seconds the first tram to dock has been waiting (i.e., no departure has happened since at least one tram docked).
- Estimate - The last event detected was a departure, and we estimate the time until the next arrival based on the RIOC hours of operation and rush-hour schedule. The value returned along with this state is the number of seconds until the next arrival; the value may be negative if we find the tram to be delayed.
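A hedged sketch of the TramState interface described above; method and field names are paraphrased from the prose, and the interval logic is a simplified stand-in for the full RIOC schedule handling (the hours-of-operation check for the Unknown state is omitted for brevity).

```python
import time

class TramState:
    def __init__(self):
        self.last_arrival = None
        self.last_departure = None
        self.docked = False
        self.interval = 15 * 60            # placeholder running interval (s)

    def set_arrival(self):
        if not self.docked:
            # Track the first tram to dock since the last departure.
            self.last_arrival = time.time()
        self.docked = True

    def set_departure(self):
        self.last_departure = time.time()
        self.docked = False

    def get_wait(self):
        now = time.time()
        if self.last_arrival is None and self.last_departure is None:
            return ("Unknown", None)       # just booted, nothing detected yet
        if self.docked:
            return ("Docked", now - self.last_arrival)
        # Last event was a departure: seconds until the estimated next
        # arrival; a negative value means the tram appears delayed.
        return ("Estimate", self.last_departure + self.interval - now)
```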
Internally, the TramState class has many methods representing the status of the RIOC schedule: specifically, the hours of operation, weekend versus weekday night schedules, and the difference between rush-hour and normal operation based on the current time.
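An illustrative sketch of what those helpers look like; all of the times and intervals below are placeholders, not the actual RIOC schedule that the real class encodes.

```python
from datetime import datetime

def in_hours_of_operation(now: datetime) -> bool:
    # Placeholder window; weekend nights run later in the real schedule.
    return now.hour >= 6 or now.hour < 2

def is_rush_hour(now: datetime) -> bool:
    weekday = now.weekday() < 5
    return weekday and (7 <= now.hour < 10 or 16 <= now.hour < 19)

def current_interval(now: datetime) -> int:
    """Expected seconds between trams for the current time of day."""
    return 8 * 60 if is_rush_hour(now) else 15 * 60
```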
The TramServer main and run methods leverage OpenCV to capture snapshots from the USB camera, and ImageQueue, TramState, and TramDiffTracker to detect the tram on one-second intervals. This is where most of the technology we built comes together, and where the functionality to notify the frontend dashboard and light server over HTTP lives. The business logic within TramState is applied once here so we can efficiently report the results to both frontend user experiences, keeping the frontends minimal and avoiding duplicated logic. The TramServer is also responsible for creating a new timestamped image whenever a detection occurs; this image is shown on our frontend dashboard and is a great way to gain user trust in our estimations. When talking to users, it was clear that seeing the actual tram and data was the best way to prove that the system was working and reliable.
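A condensed sketch of how those pieces fit together in the loop, reusing the interfaces sketched in the other sections; the dashboard endpoint URL is hypothetical.

```python
import time
import cv2
import requests

def run():
    camera = cv2.VideoCapture(0)               # USB camera
    queue = ImageQueue(depth=1)
    tracker = TramDiffTracker()
    state = TramState()
    while True:
        ok, frame = camera.read()
        if ok:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            prev = queue.push(gray)            # frame from one second earlier
            if prev is not None:
                event = tracker.update(prev, gray)
                if event == "arriving":
                    state.set_arrival()
                elif event == "departing":
                    state.set_departure()
                if event:
                    # Timestamped evidence image for the dashboard.
                    cv2.imwrite(f"detection-{int(time.time())}.jpg", frame)
                status, value = state.get_wait()
                requests.post("http://dashboard.local/update",   # hypothetical URL
                              json={"status": status, "value": value})
        time.sleep(1)
```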
The ImageQueue class definition wraps a standard FIFO queue to pipeline captured image snapshots. The images given to the queue are already represented as NumPy matrices for processing. Our final product only queues a single image at a time, but we tested various queue depths to see what level of separation in our image comparison would give the most accurate and quickest detections.
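A minimal sketch of such a wrapper around a deque; the push-returns-displaced-frame interface is an assumption made to match the loop above.

```python
from collections import deque

class ImageQueue:
    def __init__(self, depth=1):
        self.depth = depth                 # separation between compared frames
        self.frames = deque()

    def push(self, frame):
        """Queue a NumPy frame; return the frame it displaces, if any."""
        displaced = self.frames.popleft() if len(self.frames) >= self.depth else None
        self.frames.append(frame)
        return displaced
```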
The LightServer main method leverages Flask to enable an endpoint for controlling the light display. The light display is a laser-printed representation of the tram towers and a strip of 60 LEDs representing the current trajectory of a docked or approaching tram. The TramServer periodically pushes updates to the LightServer, and the LightServer connects over serial to issue the respective updates to the Arduino microcontroller that controls the LED strip. We designed it this way so that the light strip display could be connected to any machine running the LightServer and be placed anywhere for people to see and interact with. The LED strip is folded in half on the display, and each set of 30 LEDs shows the same trajectory to users on either side of the display.
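A hedged sketch of that endpoint using Flask and pySerial; the route path, payload field, and serial device path are assumptions.

```python
import serial
from flask import Flask, request

app = Flask(__name__)
arduino = serial.Serial("/dev/ttyACM0", 9600)   # typical Arduino serial device

@app.route("/update", methods=["POST"])
def update():
    data = request.get_json()
    # Forward a position along the strip (e.g. 0-29; the Arduino sketch
    # mirrors it across the folded 60-LED strip) as a line of text.
    arduino.write(f"{data['position']}\n".encode())
    return "ok"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```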
The frontend components of the dashboard leverage Node.js to build a simple experience for users to see the current state of the tram in relation to Roosevelt Island. The architecture is very similar to what we built in the distant pictures lab exercise. WebSockets are heavily utilized to communicate updates back to connected client browsers, and an HTTP POST route is enabled to allow the TramServer to send updates to the frontend service. Building this component was a great way to learn more about CSS and general frontend design/coding.