Add RT-DETR models and YOLOv8x-Seg version
MaxJa4 committed Nov 18, 2023
1 parent 6bf0514 commit 2c88540
Showing 4 changed files with 49 additions and 30 deletions.
72 changes: 43 additions & 29 deletions doc/06_perception/experiments/model_evaluation/README.md
Expand Up @@ -115,37 +115,43 @@ Only the inference time was measured.
The model versions are different sizes of the same model.
The following models were evaluated (sorted descending by recognition performance):

1. yolo-nas-l
2. yolo-nas-m
3. yolo-nas-s
4. yolov8x
5. yolov8l
6. yolov8m
7. yolov8s
8. yolov8n
1. yolo-rtdetr-x
2. yolo-rtdetr-l
3. yolo-nas-l
4. yolo-nas-m
5. yolo-nas-s
6. yolov8x / yolov8x-seg
7. yolov8l
8. yolov8m
9. yolov8s
10. yolov8n

Images with bounding boxes: [Google Drive](https://drive.google.com/drive/folders/1u6T0Q3kd9FqjiBWMqzlT-3-fglMqlkBB?usp=sharing)

#### Summary

| Model | Cyclists | Traffic lights | Cars | Noise | Speed |
|------------|----------|----------------|------|-------|-------|
| yolo-nas-l | ++ | ++ | ++ | ++ | + |
| yolo-nas-m | ++ | ++ | ++ | ++ | + |
| yolo-nas-s | ++ | ++ | ++ | ++ | + |
| yolov8x | ++ | ++ | ++ | ++ | + |
| yolov8l | ++ | ++ | ++ | ++ | + |
| yolov8m | ++ | ++ | ++ | + | ++ |
| yolov8s | + | + | ++ | ++ | ++ |
| yolov8n | + | - | + | ++ | ++ |
| Model | Cyclists | Traffic lights | Cars | Noise | Speed |
|---------------|----------|----------------|------|-------|-------|
| yolo-rtdetr-x | ++ | ++ | ++ | + | - |
| yolo-rtdetr-l | ++ | ++ | ++ | + | - |
| yolo-nas-l | ++ | ++ | ++ | ++ | + |
| yolo-nas-m | ++ | ++ | ++ | ++ | + |
| yolo-nas-s | ++ | ++ | ++ | ++ | + |
| yolov8x/-seg | ++ | ++ | ++ | ++ | + |
| yolov8l | ++ | ++ | ++ | ++ | + |
| yolov8m | ++ | ++ | ++ | + | ++ |
| yolov8s | + | + | ++ | ++ | ++ |
| yolov8n | + | - | + | ++ | ++ |

#### Recognition

All model versions performed very well. Only the smallest (`v8n`) version missed some cars. The `v8s` version was already visibly better, although the `v8x`, `v8l` and `v8m` versions recognized more detail: for example, instead of just a person, they detected a person and the bicycle underneath.

The same holds for traffic lights: `v8x`, `v8l` and `v8m` detected them from a greater distance, while `v8n` and `v8s` needed to be closer.

The YOLO-NAS family of models is similar to the best `v8` version but with higher confidence scores.
The `YOLO-NAS` family of models is similar to the best `v8` version but with higher confidence scores.

The `RT-DETR` models recognized slightly more detail, and with higher confidence. At the same time, they produce more noise and detect irrelevant objects (these can be filtered out, though).
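
The filtering mentioned above could look roughly like the following minimal sketch. It is not taken from the evaluation script: the `Detection` type, the `RELEVANT_LABELS` set and the `0.5` confidence threshold are hypothetical and only illustrate dropping irrelevant or low-confidence predictions after inference.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Detection:
    """Hypothetical container for one predicted object."""
    label: str
    confidence: float


# Assumed set of classes that matter for driving; adjust as needed.
RELEVANT_LABELS = frozenset({"person", "bicycle", "car", "traffic light"})


def filter_detections(
    detections: Iterable[Detection],
    relevant_labels: frozenset = RELEVANT_LABELS,
    min_confidence: float = 0.5,
) -> List[Detection]:
    """Keep only detections of relevant classes above a confidence threshold."""
    return [
        d
        for d in detections
        if d.label in relevant_labels and d.confidence >= min_confidence
    ]


raw = [
    Detection("car", 0.93),
    Detection("potted plant", 0.81),   # irrelevant class, dropped
    Detection("traffic light", 0.32),  # relevant but below threshold, dropped
    Detection("bicycle", 0.77),
]
kept = filter_detections(raw)
print([d.label for d in kept])
```

Filtering after inference keeps the model output untouched, so the threshold and class list can be tuned without re-running the model.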

Throughout all versions, almost no noise (random wrong or duplicate predictions) was present without tweaking any values; only `v8m` showed some noise.

Expand All @@ -154,16 +160,18 @@ Throughout all versions, almost no noise (random wrong/duplicate predictions ) w
These values are meant to be compared between the models, not as a representative performance indicator in general.
Only the inference time was measured.

| Model | Time | FPS |
|------------|------|------|
| yolov8n | ~2ms | 500 |
| yolov8s | ~2ms | 500 |
| yolov8m | ~3ms | ~333 |
| yolov8l | ~4ms | 250 |
| yolov8x | ~6ms | ~166 |
| yolo-nas-l | ~6ms | ~166 |
| yolo-nas-m | ~6ms | ~166 |
| yolo-nas-s | ~7ms | ~142 |
| Model | Time | FPS |
|---------------|--------|------|
| yolov8n | ~2ms | 500 |
| yolov8s | ~2ms | 500 |
| yolov8m | ~3ms | ~333 |
| yolov8l | ~4ms | 250 |
| yolov8x/-seg | ~6/7ms | ~166 |
| yolo-nas-l | ~6ms | ~166 |
| yolo-nas-m | ~6ms | ~166 |
| yolo-nas-s | ~7ms | ~142 |
| yolo-rtdetr-l | ~13ms | ~77 |
| yolo-rtdetr-x | ~16ms | ~62 |
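
The FPS column appears to follow directly from the inference time (frames per second = 1000 / time in ms). A minimal sketch of that conversion, assuming the listed times are per-image latencies; the function name is illustrative and the table values are approximate, so small rounding differences are expected:

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Convert a per-image inference time in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# Approximate inference times taken from the table above.
latencies_ms = {
    "yolov8n": 2,
    "yolov8m": 3,
    "yolov8l": 4,
    "yolov8x": 6,
    "yolo-rtdetr-l": 13,
    "yolo-rtdetr-x": 16,
}

for name, ms in latencies_ms.items():
    print(f"{name}: {ms} ms -> {fps_from_latency_ms(ms):.1f} FPS")
```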

## Conclusion

Expand All @@ -173,12 +181,18 @@ Since the `v8m` version is sometimes to sensitive and the `v8x` version is the l

If the best detection results are the most important, the `v8x` version and `nas` family should be analyzed further with more images and situations.

For segmentation, `sam` and `fast-sam` were also tested. `sam` needs multiple seconds for inference and, like `fast-sam`, is not suitable for Carla at all, since both segment the entire image and, for example, segment individual windows of a car or building.

| ![1619_TF_faster-rcnn.jpg](asset-copies/1619_TF_faster-rcnn.jpg) |
|:--:|
| ^ *Pylot - Faster RCNN (26ms)* ^ |
| ![1619_PT_fasterrcnn_resnet50_fpn_v2.jpg](asset-copies/1619_PT_fasterrcnn_resnet50_fpn_v2.jpg) |
| ^ *Pytorch - Faster RCNN Resnet50 FPN V2 (45ms)* ^ |
| ![1619_yolov8x.jpg](asset-copies/1619_yolov8x.jpg) |
| ^ *YOLOv8x (6ms)* ^ |
| ![1619_yolov8x_seg.jpg](asset-copies/1619_yolov8x_seg.jpg) |
| ^ *YOLOv8x-Seg (7ms)* ^ |
| ![1619_yolo_nas_l.jpg](asset-copies/1619_yolo_nas_l.jpg) |
| ^ *YOLO-nas-l (7ms)* ^ |
| ![1619_yolo_rtdetr_x.jpg](asset-copies/1619_yolo_rtdetr_x.jpg) |
| ^ *YOLO-rtdetr-x (16ms)* ^ |
7 changes: 6 additions & 1 deletion doc/06_perception/experiments/model_evaluation/yolo.py
Expand Up @@ -4,7 +4,7 @@

import os
from globals import IMAGE_BASE_FOLDER, IMAGES_FOR_TEST
from ultralytics import NAS, YOLO
from ultralytics import NAS, YOLO, RTDETR, SAM, FastSAM
from PIL import Image
import torch

Expand All @@ -17,6 +17,11 @@
'yolo_nas_l': NAS,
'yolo_nas_m': NAS,
'yolo_nas_s': NAS,
'rtdetr-l': RTDETR,
'rtdetr-x': RTDETR,
'yolov8x-seg': YOLO,
'sam-l': SAM,
'FastSAM-x': FastSAM,
}
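
A name-to-class mapping like the one above is typically used to pick the right loader for each model. The sketch below shows one way this could work; it is an assumption, not the code from `yolo.py`. The `load_model` and `weights_file` helpers and the `<name>.pt` weights naming are hypothetical, and a stand-in class replaces the `ultralytics` classes so the example runs on its own.

```python
from typing import Any, Callable, Dict


def weights_file(model_name: str) -> str:
    """Assumed convention: weights are stored as '<model name>.pt'."""
    return f"{model_name}.pt"


def load_model(model_name: str, registry: Dict[str, Callable[[str], Any]]) -> Any:
    """Instantiate the class registered for model_name with its weights file."""
    try:
        model_cls = registry[model_name]
    except KeyError:
        known = ", ".join(sorted(registry))
        raise ValueError(f"Unknown model {model_name!r}; known models: {known}") from None
    return model_cls(weights_file(model_name))


class FakeModel:
    """Stand-in for a model class such as RTDETR; only records the weights path."""

    def __init__(self, weights: str) -> None:
        self.weights = weights


# With ultralytics installed, the registry would map names to YOLO, NAS, RTDETR, etc.
registry = {"rtdetr-l": FakeModel, "yolov8x-seg": FakeModel}
model = load_model("rtdetr-l", registry)
print(model.weights)
```

Keeping the mapping as plain data means adding a new model only requires one new dictionary entry.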

