Back | Next | Contents
Object Detection

Coding Your Own Object Detection Program

In this step of the tutorial, we'll walk through coding the previous example ourselves: a real-time object detection program that runs on a live camera feed in only 10 lines of Python code. The program will load the detection network with the detectNet object, capture video frames and process them, and then render the detected objects to the display.

For your convenience and reference, the completed source is available in the python/examples/my-detection.py file of the repo, but the guide below assumes that the script resides in your home directory or in an arbitrary directory of your choosing.

Here's a quick preview of the Python code we'll be walking through:

import jetson.inference
import jetson.utils

net = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)
camera = jetson.utils.gstCamera(1280, 720, "/dev/video0")  # using V4L2
display = jetson.utils.glDisplay()

while display.IsOpen():
	img, width, height = camera.CaptureRGBA()
	detections = net.Detect(img, width, height)
	display.RenderOnce(img, width, height)
	display.SetTitle("Object Detection | Network {:.0f} FPS".format(net.GetNetworkFPS()))

Source Code

First, open up your text editor of choice and create a new file. Below we'll assume that you'll save it to your user's home directory as ~/my-detection.py, but you can name and store it where you wish.

Importing Modules

At the top of the source file, we'll import the Python modules that we're going to use in the script. Add import statements to load the jetson.inference and jetson.utils modules used for object detection and camera capture.

import jetson.inference
import jetson.utils

note: these Jetson modules are installed during the sudo make install step of building the repo.
      If you did not run sudo make install, these packages won't be found when the example is run.

Loading the Detection Model

Next, use the following line to create a detectNet object instance that loads the 91-class SSD-Mobilenet-v2 model:

# load the object detection model
net = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)

Note that you can change the model string to one of the values from this table to load a different detection model. We also set the detection threshold here to the default of 0.5 for illustrative purposes - you can tweak it later if needed.
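For example, if you wanted to try one of the other pre-trained networks with a higher threshold, the swap is a one-liner. The snippet below is a hypothetical variation for illustration (ssd-inception-v2 is another model name from the table):

# hypothetical variation: load SSD-Inception-v2 and keep only detections above 70% confidence
net = jetson.inference.detectNet("ssd-inception-v2", threshold=0.7)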

Opening the Camera Stream

To connect to the camera device for streaming, we'll create an instance of the gstCamera object:

camera = jetson.utils.gstCamera(1280, 720, "/dev/video0")  # using V4L2

Its constructor accepts 3 parameters: the desired width, height, and video device to use. Substitute the appropriate snippet below depending on whether you are using a MIPI CSI camera or a V4L2 USB camera, along with your preferred resolution:

  • MIPI CSI cameras are used by specifying the sensor index ("0" or "1", etc.)
     camera = jetson.utils.gstCamera(1280, 720, "0")
  • V4L2 USB cameras are used by specifying their /dev/video node ("/dev/video0", "/dev/video1", etc.)
     camera = jetson.utils.gstCamera(1280, 720, "/dev/video0")
  • The width and height should be a resolution that the camera supports.
    • Query the available resolutions with the following commands:
      $ sudo apt-get install v4l-utils
      $ v4l2-ctl --list-formats-ext
    • If needed, change 1280 and 720 above to the desired width/height

note: for compatible cameras to use, see these sections of the Jetson Wiki:
      - Nano:    https://eLinux.org/Jetson_Nano#Cameras
      - Xavier:  https://eLinux.org/Jetson_AGX_Xavier#Ecosystem_Products_.26_Cameras
      - TX1/TX2: developer kits include an onboard MIPI CSI sensor module (OV5693)

Display Loop

Next, we'll create an OpenGL display with the glDisplay object and set up a main loop that will run until the user exits:

display = jetson.utils.glDisplay()

while display.IsOpen():
	# main loop will go here

Note that the remainder of the code below should be indented underneath this while loop.

Camera Capture

The first thing that happens in the main loop is to capture the next video frame from the camera. camera.CaptureRGBA() will wait until the next frame has been sent from the camera, and after it's been acquired by the Jetson, it will convert it to RGBA floating-point format residing in GPU memory.

	img, width, height = camera.CaptureRGBA()

Returned is a tuple containing a reference to the image data in GPU memory, along with the image's width and height.
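If you'd like to verify what the camera is delivering, a quick way is to temporarily print the returned dimensions inside the loop. This is a throwaway debugging line for illustration, not part of the final script:

	# optional debug output: confirm the dimensions of the captured frame
	print("captured frame with dimensions {}x{}".format(width, height))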

Detecting Objects

Next the detection network processes the image with the net.Detect() function. It takes in the image, width, and height from camera.CaptureRGBA() and returns a list of detections:

	detections = net.Detect(img, width, height)

This function will also automatically overlay the detection results on top of the input image.

If you want, you can add a print(detections) statement here, and the coordinates, confidence, and class info will be printed to the terminal for each detection result. Also see the detectNet documentation for info about the different members of the Detection structures that are returned, should you want to access them directly in a custom application.
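For example, here's a minimal sketch of inspecting the results inside the loop. The member names match the Detection structure from the detectNet documentation, but treat this as an illustrative snippet rather than a required part of the tutorial script:

	# optional: print the class, confidence, and bounding box of each detection
	for detection in detections:
		print("detected '{:s}' with {:.1f}% confidence".format(net.GetClassDesc(detection.ClassID), detection.Confidence * 100))
		print("   bounding box: ({:.0f}, {:.0f}) -> ({:.0f}, {:.0f})".format(detection.Left, detection.Top, detection.Right, detection.Bottom))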

Rendering

Finally we'll visualize the results with OpenGL and update the title of the window to display the current performance:

	display.RenderOnce(img, width, height)
	display.SetTitle("Object Detection | Network {:.0f} FPS".format(net.GetNetworkFPS()))

The RenderOnce() function will automatically flip the backbuffer and is used when we only have one image to render.

Source Listing

That's it! For completeness, here's the full source of the Python script that we just created:

import jetson.inference
import jetson.utils

net = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)
camera = jetson.utils.gstCamera(1280, 720, "/dev/video0")  # using V4L2
display = jetson.utils.glDisplay()

while display.IsOpen():
	img, width, height = camera.CaptureRGBA()
	detections = net.Detect(img, width, height)
	display.RenderOnce(img, width, height)
	display.SetTitle("Object Detection | Network {:.0f} FPS".format(net.GetNetworkFPS()))

Note that this version assumes you are using a V4L2 USB camera. See the Opening the Camera Stream section above for info about changing it to use a MIPI CSI camera or supporting different resolutions.

Running the Program

To run the application we just coded, simply launch it from a terminal with the Python interpreter:

$ python my-detection.py

To tweak the results, you can try changing the model that's loaded along with the detection threshold. Have fun!
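If you'd like to experiment without editing the script each time, one option is to read the model name and threshold from the command line. Below is a minimal sketch of that variation; the argument handling is our own addition for illustration, and it assumes the same V4L2 camera setup as above:

import sys
import jetson.inference
import jetson.utils

# optionally read the model name and detection threshold from the command line,
# e.g.:  python my-detection.py ssd-inception-v2 0.7
model = sys.argv[1] if len(sys.argv) > 1 else "ssd-mobilenet-v2"
threshold = float(sys.argv[2]) if len(sys.argv) > 2 else 0.5

net = jetson.inference.detectNet(model, threshold=threshold)
camera = jetson.utils.gstCamera(1280, 720, "/dev/video0")  # using V4L2
display = jetson.utils.glDisplay()

while display.IsOpen():
	img, width, height = camera.CaptureRGBA()
	detections = net.Detect(img, width, height)
	display.RenderOnce(img, width, height)
	display.SetTitle("Object Detection | Network {:.0f} FPS".format(net.GetNetworkFPS()))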

Next | Semantic Segmentation with SegNet
Back | Running the Live Camera Detection Demo

© 2016-2019 NVIDIA | Table of Contents