Handle Small/Tiny and Fast/High Speed moving object detection/tracking with stable inference #1246
Comments
ReID won't solve your tracking issues. High-speed objects have high motion uncertainty, especially if your camera's FPS is very low. In those cases IoU is not a viable association metric, as there will most certainly be no overlap at all between the bboxes. Because of this, a centroid-based metric would be more suitable.
@tgbaoo I have built a bunch of code around a centroid-based system and got it to work quite well. It's really quite a different paradigm than IoU and there are a few gotchas. Let me know if you want some help.
import numpy as np

def centroid_batch(bboxes1, bboxes2, w, h):
    """
    Computes the normalized centroid distance between two sets of bounding boxes.
    Bounding boxes are in the format [x1, y1, x2, y2].
    w, h (width, height) are used to normalize the distance.
    """
    # Calculate centroids
    centroids1 = np.stack(((bboxes1[..., 0] + bboxes1[..., 2]) / 2,
                           (bboxes1[..., 1] + bboxes1[..., 3]) / 2), axis=-1)
    centroids2 = np.stack(((bboxes2[..., 0] + bboxes2[..., 2]) / 2,
                           (bboxes2[..., 1] + bboxes2[..., 3]) / 2), axis=-1)
    # Expand dimensions for broadcasting
    centroids1 = np.expand_dims(centroids1, 1)
    centroids2 = np.expand_dims(centroids2, 0)
    # Calculate Euclidean distances
    distances = np.sqrt(np.sum((centroids1 - centroids2) ** 2, axis=-1))
    # Normalize distances
    norm_factor = np.sqrt(w**2 + h**2)
    normalized_distances = distances / norm_factor
    return normalized_distances

The output is normalized with respect to the diagonal of the image in pixels, i.e. the maximum possible distance between two centroids in the input image.
In order to use the centroid cost generated above interchangeably with iou_cost, the cost matrix should be inverted: return 1 - normalized_distances. The rationale for the thresholding would then be that objects beyond a certain distance are ignored, instead of thresholding based on overlap.
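A minimal sketch of that inversion, assuming the centroid_batch function from the snippet above is in scope (the helper name centroid_similarity and the numbers are purely illustrative, not part of the library):

import numpy as np

def centroid_similarity(bboxes1, bboxes2, w, h):
    # 1.0 means the centroids coincide, 0.0 means they are a full image
    # diagonal apart, so higher = better match, mirroring how IoU behaves.
    return 1.0 - centroid_batch(bboxes1, bboxes2, w, h)

# Example: gate associations the same way an IoU threshold would.
dets = np.array([[10, 10, 30, 30], [200, 200, 240, 240]], dtype=float)
trks = np.array([[12, 11, 32, 31]], dtype=float)
sim = centroid_similarity(dets, trks, w=640, h=480)
matches_allowed = sim > 0.3   # pairs with similarity below the threshold are ignored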
A ReID model won't help you because all the golf balls basically look the same @tgbaoo. But maybe you are detecting more objects?
You can now try:

from boxmot import OCSORT

tracker = OCSORT(
    asso_func="centroid",
    iou_threshold=0.3  # use this to set the centroid threshold that matches your use-case best
)

or:

from boxmot import DeepOCSORT

tracker = DeepOCSORT(
    asso_func="centroid",
    iou_threshold=0.3  # use this to set the centroid threshold that matches your use-case best
)

At the moment this only works for OCSORT and DeepOCSORT.
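For context, a rough tracking-loop sketch around one of these trackers. It assumes a detector function (my_detector, hypothetical) that returns an (N, 6) array [x1, y1, x2, y2, conf, cls] per frame, and the input clip name is a placeholder; the exact column layout returned by tracker.update depends on the boxmot version:

import cv2
import numpy as np
from boxmot import OCSORT

tracker = OCSORT(asso_func="centroid", iou_threshold=0.3)

cap = cv2.VideoCapture("golf_putt.mp4")   # placeholder input clip
while True:
    ok, frame = cap.read()
    if not ok:
        break
    dets = my_detector(frame)             # hypothetical detector call, (N, 6)
    tracks = tracker.update(dets, frame)  # rows contain boxes plus track ids
    for t in tracks:
        x1, y1, x2, y2, track_id = t[:5].astype(int)
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(frame, str(track_id), (x1, y1 - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
cap.release()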
@mikel-brostrom, @colonelpanic8, thanks for your passion and for applying the new centroid-based cost logic so quickly. At this stage I only detect the golf ball; in a future stage I will also detect the golfer, so the system in my app can show which player is taking the shot and which user the result belongs to. I am applying the new logic now, so the clip results will be updated soon. Thank you so much for your consultation and support once again.
@mikel-brostrom I already ran a quick test and the results are surprisingly robust. I am writing some code to imwrite the video and will post some results soon, stay tuned! Thanks again for the support from @colonelpanic8 and @mikel-brostrom <3
@colonelpanic8, @mikel-brostrom Here are some of my test results on OCSORT: duy_swing_result.2.mp4, test_putt_cam_03_trimed.mp4, test_putt_cam_01_trimed.mp4. I am going to test DeepOCSORT and update this same comment. I noticed that this works well for fast-moving objects (like golf putting) but does not seem to work well for smaller and fast-moving objects (like the golf swing clip). Do you have any solution or idea for handling an insanely fast-moving and very small object like the ball in a golf swing? P/S: I have another question: if I apply this to the Deep-based trackers (DeepSORT, DeepOCSORT, HybridSORT, StrongSORT), do I have to use a ReID model? You mentioned that ReID would not work for my project because the ball looks the same in every case.
Exactly what I was expecting 🔥. The question of tracking objects with dynamic speed (fast to slow / slow to fast) is a matter of adaptive Kalman filters (KFs). In order to get this working you will need to carefully study the system and its dynamics. So it makes total sense that this works for more static dynamics like putting but not for swinging.
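To make the "adaptive KF" idea concrete, here is a minimal sketch (not the tracker's actual filter): a constant-velocity Kalman filter on the ball centroid whose process noise grows with the estimated speed, so fast motion (swing) loosens the motion model while slow motion (putt) keeps it tight. All values are illustrative.

import numpy as np

class AdaptiveCentroidKF:
    def __init__(self, cx, cy, dt=1.0, base_q=1.0, r=5.0):
        self.x = np.array([cx, cy, 0.0, 0.0])         # state: [cx, cy, vx, vy]
        self.P = np.eye(4) * 100.0                    # initial uncertainty
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)
        self.R = np.eye(2) * r                        # measurement noise
        self.base_q = base_q

    def predict(self):
        speed = np.linalg.norm(self.x[2:])            # px / frame
        q = self.base_q * (1.0 + speed)               # adapt process noise to speed
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + np.eye(4) * q
        return self.x[:2]

    def update(self, cx, cy):
        z = np.array([cx, cy])
        y = z - self.H @ self.x                       # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)      # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P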
@mikel-brostrom Really appreciate your knowledge and expertise again. The next stage of the project is to try to draw the golf swing path line for sample videos like the ones attached above. From your experience, do you have any recommendation, consulting, solution, or even just a few keywords for me to research on my own, in order to draw the exact (or nearly exact) path line of the golf swing? I have done some research, and it gives me keywords related to motion tracking rather than visual object tracking. One of the results is this repo; it is C++ code (which is a little bit hardcore for me), but if you have time, your attention on this repo and your opinion or recommendation would be very useful to me: Thanks for your work ⛳🙌🙌🙌🔥🔥🔥
Have you tried SAHI for the swing video with the metric I just implemented? Given that the golf ball is considerably smaller than in the rest of the videos, this approach could be a valid alternative. Let's start small and not over-complicate things 😄
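For reference, a rough sketch of feeding SAHI sliced predictions into the tracker, assuming a recent SAHI version and a YOLOv8 golf-ball model; the weights path and slice sizes are placeholders, and exact attribute access may differ between SAHI releases:

import numpy as np
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

det_model = AutoDetectionModel.from_pretrained(
    model_type="yolov8",
    model_path="golf_ball_yolov8.pt",   # hypothetical weights file
    confidence_threshold=0.25,
)

def sahi_detections(frame):
    # Slice the frame into overlapping tiles so a tiny ball becomes
    # relatively larger for the detector, then merge the predictions.
    result = get_sliced_prediction(
        frame, det_model,
        slice_height=512, slice_width=512,
        overlap_height_ratio=0.2, overlap_width_ratio=0.2,
    )
    rows = []
    for p in result.object_prediction_list:
        x1, y1, x2, y2 = p.bbox.to_xyxy()
        rows.append([x1, y1, x2, y2, p.score.value, p.category.id])
    return np.array(rows) if rows else np.empty((0, 6))

# dets = sahi_detections(frame); tracks = tracker.update(dets, frame)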
@mikel-brostrom, I already applied SAHI. Although the result was good, the inference delay is not compatible with my real-time camera stream application. I considered designing my own algorithm: for the first few frames I use YOLO as normal; once I detect the ball successfully, I create a window as a 'detect area' and run detection only inside that 'detect area'; when the ball is hit and moves away from the center of the detect area, I move the 'detect area' to follow the ball; and when the ball becomes too small and detection is lost, I hard-code a 'drop down' effect like in the PGA Tour sample video below. From a technical point of view, what do you think about this logic? Is it possible to apply? I will write some code with ChatGPT's support for the algorithm.
If it works well using SAHI, then this, I guess, should work too.
This, I believe, will be very difficult to achieve with a realistic outcome. But sure, try it out 😄
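A rough sketch of the 'detect area' idea described two comments above: run full-frame detection until the ball is found, then crop a moving ROI around the last known centre and detect only inside it. Here my_detector is a hypothetical function returning an (N, 6) array [x1, y1, x2, y2, conf, cls], and the frame is assumed to be larger than the ROI.

import numpy as np

ROI_SIZE = 320   # illustrative crop size in pixels

def detect_with_roi(frame, last_center, detector):
    h, w = frame.shape[:2]
    if last_center is None:
        dets = detector(frame)                      # full-frame fallback
        ox, oy = 0, 0
    else:
        cx, cy = last_center
        x0 = int(np.clip(cx - ROI_SIZE // 2, 0, w - ROI_SIZE))
        y0 = int(np.clip(cy - ROI_SIZE // 2, 0, h - ROI_SIZE))
        crop = frame[y0:y0 + ROI_SIZE, x0:x0 + ROI_SIZE]
        dets = detector(crop)                       # detect only in the ROI
        ox, oy = x0, y0
    if len(dets):
        dets = dets.copy()
        dets[:, [0, 2]] += ox                       # map boxes back to full frame
        dets[:, [1, 3]] += oy
        cx = (dets[0, 0] + dets[0, 2]) / 2
        cy = (dets[0, 1] + dets[0, 3]) / 2
        last_center = (cx, cy)                      # the ROI follows the ball
    return dets, last_center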
👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.
@mikel-brostrom Hi, may I ask which file this code should be added to?
Just substitute the tracker in any of the examples (https://github.com/mikel-brostrom/yolo_tracking#custom-object-detection-model-tracking-example) with any of the supported trackers using this association function: #1246 (comment) @050603
@mikel-brostrom Thanks a lot
@mikel-brostrom Hello Mikel, now that I am moving my project to production, may I open another question to discuss how to keep the tracker and detector processing in real time (with as high a frame rate and as low a latency/delay as possible) on frames streamed from an IP camera?
Sure @tgbaoo !
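For the real-time question above, one common low-latency pattern is sketched below: a reader thread keeps only the most recent frame, so detection/tracking always works on the freshest image instead of building a backlog. The RTSP URL and the detector/tracker calls are placeholders, not part of this repo.

import threading
import cv2

class LatestFrameGrabber:
    def __init__(self, url):
        self.cap = cv2.VideoCapture(url)
        self.frame = None
        self.lock = threading.Lock()
        self.running = True
        threading.Thread(target=self._reader, daemon=True).start()

    def _reader(self):
        while self.running:
            ok, frame = self.cap.read()
            if not ok:
                continue
            with self.lock:
                self.frame = frame     # overwrite: stale frames are dropped

    def read(self):
        with self.lock:
            return None if self.frame is None else self.frame.copy()

grabber = LatestFrameGrabber("rtsp://user:pass@camera-ip/stream")  # placeholder URL
# while True:
#     frame = grabber.read()
#     if frame is None:
#         continue
#     dets = my_detector(frame)            # hypothetical detector
#     tracks = tracker.update(dets, frame)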
Search before asking
Question
Hello @mikel-brostrom, I am involved in a project that predicts ball-in-hole classification. I am having a discussion in the Ultralytics issue below:
ultralytics/ultralytics#7109
That issue also contains my result clip demo, training results, and other relevant information.
Then I found your repo with the different pluggable SORT tracking algorithms. From your experience tracking small/high-speed objects like a golf ball, do you have any ideas or strategies for keeping track of them well? I am inclined to re-train the ReID model.
If you have a better idea for my case, please help me; I am a noob in computer vision so I really need your advice.
Thank you so much.