GitHub repository for the paper titled "Open-Source Visual Target Tracking System Both on Simulation Environment and Real Drone".
This package contains the trt_yolo_v7.py node, which runs inference with NVIDIA's TensorRT engine. The trt_yolo_v7.py node publishes the detection data to the tracker_offboard.cpp node, which uses its internal PID controller to compute the vehicle commands (linear velocity and angular velocity). These commands are passed to the MAVROS package, which converts them into the MAVLink message format.
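As a rough illustration of the control step described above, here is a minimal Python sketch of a PID controller that turns the horizontal offset of a detected bounding box into a yaw-rate command. It is not the code in tracker_offboard.cpp (which is written in C++). The names `PID` and `yaw_rate_from_bbox`, the gains, the image width, and the output limit are all hypothetical and chosen only for the example.

```python
class PID:
    """Textbook discrete PID controller with output clamping (illustrative only)."""

    def __init__(self, kp, ki, kd, out_limit):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_limit = out_limit      # symmetric saturation limit on the output
        self.integral = 0.0
        self.prev_error = None          # no derivative term on the first sample

    def update(self, error, dt):
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        out = self.kp * error + self.ki * self.integral + self.kd * derivative
        # Clamp so that the commanded rate never exceeds the chosen limit.
        return max(-self.out_limit, min(self.out_limit, out))


def yaw_rate_from_bbox(pid, bbox_center_x, image_width, dt):
    """Map the horizontal pixel offset of the target to a yaw-rate command."""
    half_width = image_width / 2.0
    # Normalized error in [-1, 1]; 0 means the target is centered in the frame.
    error = (bbox_center_x - half_width) / half_width
    return pid.update(error, dt)


if __name__ == "__main__":
    pid = PID(kp=0.5, ki=0.0, kd=0.0, out_limit=1.0)   # hypothetical gains
    print(yaw_rate_from_bbox(pid, bbox_center_x=300, image_width=416, dt=0.1))
```

The sign of the resulting command depends on the body-frame convention of the flight stack, so it may need to be inverted in practice. A similar controller on the vertical offset or on the bounding-box size could drive the linear velocity.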
The official paper can be found at the following link:
https://link.springer.com/chapter/10.1007/978-3-031-52760-9_11
This work presents an investigation into the domain of dynamic target tracking through object detection, particularly emphasizing the context of open-source applications like PX4, ROS, and YOLO. Over the years, achieving real-time object tracking on UAVs in dynamic environments has been a formidable challenge, necessitating offline computations or substantial onboard processing resources. However, contemporary UAVs are now equipped with advanced edge embedded devices, sensors, and cameras, enabling the integration of deep learning-based vision applications. This advancement offers the prospect of directly deploying cutting-edge applications onto UAVs, thereby expanding their utility in areas such as surveillance, search and rescue, and videography. To fully harness the potential of these vision applications, a communication infrastructure interfacing with the UAV’s underlying closed controllers becomes imperative. We’ve developed an integrated visual target-tracking system that connects a flight controller unit with a graphical unit by leveraging ROS tools and open-source deep learning packages. The overall integrated system based on ROS, deep learning applications, and custom PID controllers is shared on GitHub as an open-source software package in a way that benefits everyone interested: https://github.com/miralab-ai/vision-ROS.
- Jetson Nano
- ROS Melodic
- Ubuntu 18.04
- Jetpack 4.5.1
- TensorRT 7+
- OpenCV 3.x
- numpy 1.15.1
- Protobuf 3.8.0
- Pycuda 2019.1.2
- onnx 1.4.1 (depends on Protobuf)
Install PyCUDA (takes a while)
$ cd ${HOME}/catkin_ws/src/vision-ROS/dependencies
$ ./install_pycuda.sh
Install Protobuf (takes a while)
$ cd ${HOME}/catkin_ws/src/vision-ROS/dependencies
$ ./install_protobuf-3.8.0.sh
Install onnx (depends on Protobuf above)
$ sudo pip3 install onnx==1.4.1
- Please also install jetson-inference
- Note: This package uses node names similar to those in the ros_deep_learning package. Place a CATKIN_IGNORE file in that package to avoid a duplicate node name error from catkin_make.
- If these scripts do not work for you, refer to this repository by jefflgaol, which covers installing the above packages (and more) on Jetson ARM devices.
$ cd ~/catkin_ws && catkin_make
$ source devel/setup.bash
$ cd ${HOME}/catkin_ws/src/trt_yolo_v7/plugins
$ make
This generates a libyolo_layer.so file.
$ cd ${HOME}/catkin_ws/src/trt_yolo_v7/yolo
Please name the weights and configuration files as follows:
- yolov7.weights
- yolov7.cfg
Run the conversion script to convert to TensorRT engine file
$ ./convert_yolo_trt
- Input the appropriate arguments
- The conversion might take a while
- The optimised TensorRT engine is saved as yolov7-416.trt
$ cd ${HOME}/catkin_ws/src/trt_yolo_v7/utils
$ vim yolo_classes.py
- Change the class labels to suit your model
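For illustration, a class-label table for a custom model could look like the sketch below. The actual contents and function names in yolo_classes.py may differ. `CUSTOM_CLASSES_LIST` and `build_class_dict` are hypothetical names used only in this example. The order of the labels should match the order of the classes the model was trained with, since the detector reports integer class IDs.

```python
# Hypothetical label list for a custom model; replace with your own classes,
# in the same order as in the names file used during training.
CUSTOM_CLASSES_LIST = ["person", "car"]


def build_class_dict(class_names):
    """Map the integer class IDs produced by the detector to readable labels."""
    return {class_id: name for class_id, name in enumerate(class_names)}


if __name__ == "__main__":
    cls_dict = build_class_dict(CUSTOM_CLASSES_LIST)
    print(cls_dict.get(0, "unknown"))
```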
$ cd ${HOME}/catkin_ws/src/trt_yolo_v7/launch
- trt_yolo_v7.launch: change the topic_name
- video_source.launch: change the input format (refer to this link)
  - video_source.launch requires jetson-inference to be installed
  - Default input is CSI camera
Note: Run the launch files separately in different terminals
# For csi input
$ roslaunch trt_yolo_v7 video_source.launch input:=csi://0
# For video input
$ roslaunch trt_yolo_v7 video_source.launch input:=/path_to_video/video.mp4
# For USB camera
$ roslaunch trt_yolo_v7 video_source.launch input:=v4l2://0
# For YOLOv7 (single input)
$ roslaunch trt_yolo_v7 trt_yolo_v7.launch
$ sudo nvpmodel -m 0 # Select power mode 0 (maximum performance)
$ sudo jetson_clocks # Maximise CPU/GPU clock frequencies
- These commands are taken from this forum post
- Please ensure the Jetson device is cooled appropriately to prevent overheating
- Default Input FPS from CSI camera = 30.0
- To change this, go to jetson-inference/utils/camera/gstCamera.cpp
# In line 359, change this line
mOptions.frameRate = 15
# To desired frame_rate
mOptions.frameRate = desired_frame_rate
This node enables users to establish a connection between the companion computer and the main flight controller (such as PX4 or ArduPilot) using the MAVROS package. The next two lines denote the subscriber and publisher functions. Should one wish to transmit processed vision data to a different computer, this node can be adapted to one's specific needs. In this scenario, the MAVROS subscriber acquires data from MAVLink to gain awareness of the vehicle's position. Subsequently, the MAVROS publisher transmits computed vehicle commands back to MAVLink, enabling the vehicle to execute corresponding actions.
ros::Subscriber pose_stamped = nh.subscribe<geometry_msgs::PoseStamped>("mavros/local_position/pose", 1, pose_stamped_cb);
ros::Publisher body_vel_pub = nh.advertise<mavros_msgs::PositionTarget>("mavros/setpoint_raw/local", 1);
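The publisher above sends mavros_msgs/PositionTarget messages, whose type_mask field tells the autopilot which setpoint fields to ignore. The sketch below shows one way a velocity plus yaw-rate command could be assembled. The bit values are copied from the PositionTarget message definition as plain integers so that the example runs without ROS. The helper names and the choice of mask and frame are assumptions, not necessarily what tracker_offboard.cpp uses.

```python
# Bit values mirrored from the mavros_msgs/PositionTarget message definition.
IGNORE_PX, IGNORE_PY, IGNORE_PZ = 1, 2, 4
IGNORE_VX, IGNORE_VY, IGNORE_VZ = 8, 16, 32
IGNORE_AFX, IGNORE_AFY, IGNORE_AFZ = 64, 128, 256
FORCE = 512
IGNORE_YAW, IGNORE_YAW_RATE = 1024, 2048
FRAME_BODY_NED = 8


def velocity_yaw_rate_mask():
    """Ignore position, acceleration, and yaw; keep velocity and yaw rate."""
    return (IGNORE_PX | IGNORE_PY | IGNORE_PZ
            | IGNORE_AFX | IGNORE_AFY | IGNORE_AFZ
            | IGNORE_YAW)


def make_setpoint(vx, vy, vz, yaw_rate):
    """Plain dict standing in for a PositionTarget message (hypothetical helper)."""
    return {
        "coordinate_frame": FRAME_BODY_NED,   # assumed frame for body-relative velocities
        "type_mask": velocity_yaw_rate_mask(),
        "velocity": (vx, vy, vz),
        "yaw_rate": yaw_rate,
    }


if __name__ == "__main__":
    print(make_setpoint(1.0, 0.0, 0.0, 0.2))
```

In a real node the same values would be assigned to the fields of a PositionTarget message before publishing it on mavros/setpoint_raw/local.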
For details on installing MAVROS and MAVLink, see the User Guide.
1. TensorRT samples from jkjung-avt
2. SORT from Hyun-je
If you use our research in your work, please cite our related publication:
@inproceedings{yilmaz2023open,
title={Open-Source Visual Target-Tracking System Both on Simulation Environment and Real Unmanned Aerial Vehicles},
author={Y{\i}lmaz, Celil and Ozgun, Abdulkadir and Erol, Berat Alper and Gumus, Abdurrahman},
booktitle={International Congress of Electrical and Computer Engineering},
pages={147--159},
year={2023},
organization={Springer}
}