detectnet/clustering.py incorrectly calculates bounding box height #557

Closed

samsparks opened this issue Feb 7, 2019 · 8 comments

@samsparks

vote_boxes() in clustering.py calculates each detection height by subtracting each bounding box's index 1 from index 3.

However, since a bounding box is a cv::Rect, index 3 is height and index 1 is y. The clustering algorithm should test bounding box height as follows:

            if rect[3] >= self.min_height:
@samsparks
Author

Actually, as I look closer, the issue is more invasive. clustering.py assumes [[x1, y1, x2, y2]] boxes as both the input and output of vote_boxes(). Therefore, the algorithm needs to convert the input prior to the call to groupRectangles(), test the height using rect[3], and convert back to [x1, y1, x2, y2] when populating detections_per_image.
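
To be concrete, the round-trip I have in mind looks roughly like this (the helper names are mine, not from clustering.py):

    def corners_to_rects(boxes):
        # [[x1, y1, x2, y2], ...] corner boxes -> cv::Rect-style [x, y, width, height]
        return [[int(x1), int(y1), int(x2 - x1), int(y2 - y1)]
                for x1, y1, x2, y2 in boxes]

    def rects_to_corners(rects):
        # cv::Rect-style [x, y, width, height] -> [[x1, y1, x2, y2], ...] corner boxes
        return [[int(x), int(y), int(x + w), int(y + h)]
                for x, y, w, h in rects]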

@samsparks
Author

Is there a better venue for this discussion? I haven't received a response on NVIDIA's forums.
TIA

@drnikolaev

Hi @samsparks, sorry for the delay. I'm trying this, but I'd appreciate a sample or unit test to check that it works correctly.

@samsparks
Author

Sure, @drnikolaev, I would be happy to provide example code. However, as this is an interface issue between clustering.py and OpenCV, I'm not sure what to provide beyond an inspection of the code.

Lines 167-172 show the extraction of the top-left and bottom-right coordinates of each bounding box into candidate boxes:

    x1 = (np.asarray([net_boxes[0][y[i]][x[i]] for i in list(range(x.size))]) + mx)
    y1 = (np.asarray([net_boxes[1][y[i]][x[i]] for i in list(range(x.size))]) + my)
    x2 = (np.asarray([net_boxes[2][y[i]][x[i]] for i in list(range(x.size))]) + mx)
    y2 = (np.asarray([net_boxes[3][y[i]][x[i]] for i in list(range(x.size))]) + my)

    boxes = np.transpose(np.vstack((x1, y1, x2, y2)))

These coordinates are returned from gridbox_to_boxes() and passed to vote_boxes() on lines 224-226:

            propose_boxes, propose_cvgs, mask = gridbox_to_boxes(cur_cvg, cur_boxes, self)
            # Vote across the proposals to get bboxes
            boxes_cur_image = vote_boxes(propose_boxes, propose_cvgs, mask, self)

Finally (unless I am missing something), vote_boxes() passes these values to groupRectangles() on line 189 without converting them to (x, y, width, height):

    nboxes, weights = cv.groupRectangles(
        np.array(propose_boxes).tolist(),
        self.gridbox_rect_thresh,
        self.gridbox_rect_eps)

This looks wrong based on the OpenCV documentation.
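
For comparison, this is the calling convention I believe the documentation describes (the box values below are made up, and the threshold/eps are not the DIGITS settings):

    import cv2 as cv

    # Two nearly identical detections given as (x, y, width, height), i.e. cv::Rect layout.
    rects = [[100, 120, 50, 80],
             [102, 118, 52, 78]]

    # groupThreshold=1 keeps clusters of at least two rectangles; eps controls similarity.
    grouped, weights = cv.groupRectangles(rects, 1, 0.5)
    print(grouped)   # a single merged (x, y, width, height) rectangle
    print(weights)   # how many input rectangles were merged into it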

Additionally, I rebuilt OpenCV to test the interface after posting this question on their forum. By adding debug statements to the implementation of groupRectangles(), I was able to confirm that the Python call is expected to pass (x, y, width, height).

Do you have an idea for what I can provide as example code? I am happy to do whatever I can to help.

@drnikolaev

@samsparks please just give me an example of how exactly you execute clustering.py, against what dataset and/or model, and what you expect as the correct outcome.

@samsparks
Author

samsparks commented Feb 11, 2019

Hi @drnikolaev - I have not forgotten about this. Unfortunately, I do not have a trained model I can provide, and DIGITS does not allow testing of pretrained models :-(. So I am going to have to train something from scratch.

In the meantime, I have an example where I modified clustering.py to print the input to groupRectangles() right before it is called, as follows:
print("proposed: {}".format(np.array(propose_boxes).tolist()))

This outputs the following set of bounding boxes in clustering.py:
[[547,432,701,639],[557,435,700,640],[560,438,695,641],[560,438,694,640],[88,443,336,663],[83,444,357,671],[83,444,373,676],[87,449,377,676],[87,454,380,677],[76,453,388,680],[72,447,394,683],[80,437,393,683],[101,430,392,678],[547,433,702,641],[555,433,701,645],[558,437,696,647],[556,440,696,644],[84,443,357,664],[73,448,369,665],[74,449,375,664],[81,451,373,664],[85,454,375,666],[81,454,385,672],[74,452,392,676],[77,445,396,679],[91,433,392,680],[547,430,705,644],[553,429,704,649],[555,434,697,649],[552,438,695,649],[85,445,365,661],[69,451,376,662],[69,452,379,663],[76,453,374,663],[80,452,377,666],[79,451,382,671],[74,449,388,673],[77,445,393,673],[90,434,389,674],[546,429,706,643],[553,428,703,647],[554,432,695,649],[553,435,693,654],[81,445,370,663],[68,454,383,664],[67,455,388,664],[72,454,384,667],[77,452,382,669],[71,448,386,671],[66,443,388,672],[73,438,389,671],[92,429,388,673],[545,429,706,642],[553,429,703,643],[553,432,695,647],[553,432,696,658],[79,450,367,664],[72,459,379,663],[71,459,387,665],[75,458,388,667],[75,455,390,666],[65,448,389,668],[63,441,387,669],[73,433,384,672],[100,425,388,675],[549,429,707,648],[550,429,701,652],[554,434,703,662],[79,462,356,665],[73,462,374,665],[74,461,383,666],[73,460,387,667],[69,457,391,664],[60,447,390,668],[63,435,385,673],[81,430,384,676],[116,433,390,677]]

And it returns:
[[553, 433, 700, 647], [ 75, 449, 382, 669], [ 95, 430, 390, 676]]

Passing the same values to groupRectangles() in C++ returns the following:
[[546,431,704,642],[70,447,389,672],[555,435,696,647],[74,455,381,666]]

I expect these two to match, but they do not. I think the problem is in how clustering.py is calling groupRectangles().

The full source of the example can be found here
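
In condensed form, the comparison boils down to something like this (only the first few boxes from above are shown, and the threshold/eps values are placeholders rather than self.gridbox_rect_thresh/self.gridbox_rect_eps):

    import cv2 as cv

    # First few of the [x1, y1, x2, y2] corner boxes printed above.
    propose_boxes = [[547, 432, 701, 639], [557, 435, 700, 640],
                     [560, 438, 695, 641], [560, 438, 694, 640],
                     [88, 443, 336, 663], [83, 444, 357, 671]]

    # What clustering.py currently does: corners passed straight through.
    as_is, _ = cv.groupRectangles([list(b) for b in propose_boxes], 1, 0.2)

    # What the documentation asks for: convert to (x, y, width, height) first,
    # then convert the grouped rectangles back to corners.
    converted = [[x1, y1, x2 - x1, y2 - y1] for x1, y1, x2, y2 in propose_boxes]
    grouped, _ = cv.groupRectangles(converted, 1, 0.2)
    fixed = [[x, y, x + w, y + h] for x, y, w, h in grouped]

    print("as-is    :", list(map(list, as_is)))
    print("converted:", fixed)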

@samsparks
Author

Hi @drnikolaev -

I used the default DIGITS DetectNet (KITTI) model and KITTI images contained in data_object_image_2.zip.

The two images 003716.png and 003719.png provide good examples for the problem.

  • DIGITS' clustering.py finds 4 bounding boxes for 003716.png
  • DIGITS' clustering.py finds 9 bounding boxes for 003719.png.

I can only reproduce this reliably in jetson-inference by malforming the construction of the cv::Rect objects.

I believe the current implementation of clustering.py works most of the time because groupRectangles() is grouping like objects. It is reasonably forgiving if you pass in [x1, y1, x2, y2] instead of [x, y, width, height], because it ends up matching a pair of points instead of a point plus a width and height. However, it does not work as well when detections are in the bottom right (too inclusive) or top left (too exclusive) of the image.
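
For what it's worth, this matches my reading of the similarity test groupRectangles() applies between two (x, y, width, height) rectangles (transcribed to Python; the eps below is chosen only to make the contrast visible):

    def similar_rects(r1, r2, eps):
        # delta scales with the rectangle sizes; two rects cluster if their
        # top-left and bottom-right corners each agree to within delta
        x1, y1, w1, h1 = r1
        x2, y2, w2, h2 = r2
        delta = eps * (min(w1, w2) + min(h1, h2)) * 0.5
        return (abs(x1 - x2) <= delta and abs(y1 - y2) <= delta and
                abs(x1 + w1 - x2 - w2) <= delta and abs(y1 + h1 - y2 - h2) <= delta)

    # Passing corners as-is puts x2/y2 in the width/height slots, so delta grows with
    # the box's position in the image: over-grouping bottom-right, under-grouping top-left.
    print(similar_rects([547, 432, 701, 639], [557, 435, 700, 640], eps=0.05))  # corners: True
    print(similar_rects([547, 432, 154, 207], [557, 435, 143, 205], eps=0.05))  # converted: False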

See my fork of jetson-inference for the "broken" C++ code that replicates clustering.py. There is a define of REPLICATE_CLUSTERING_PY in detectNet.cpp that switches between the correct and incorrect construction of the cv::Rect objects.

Please note this change will affect the required value of epsilon (self.gridbox_rect_eps). I plan on retraining my network after applying the following patch:

index 380df4a..d5c0589 100644
--- a/python/caffe/layers/detectnet/clustering.py
+++ b/python/caffe/layers/detectnet/clustering.py
@@ -188,14 +188,14 @@ def vote_boxes(propose_boxes, propose_cvgs, mask, self):
     # GROUP RECTANGLES Clustering
     ######################################################################
     nboxes, weights = cv.groupRectangles(
-        np.array(propose_boxes).tolist(),
+        [[e[0],e[1],e[2]-e[0],e[3]-e[1]] for e in np.array(propose_boxes).tolist()],
         self.gridbox_rect_thresh,
         self.gridbox_rect_eps)
     if len(nboxes):
         for rect, weight in zip(nboxes, weights):
-            if (rect[3] - rect[1]) >= self.min_height:
+            if rect[3] >= self.min_height:
                 confidence = math.log(weight[0])
-                detection = [rect[0], rect[1], rect[2], rect[3], confidence]
+                detection = [rect[0], rect[1], rect[0]+rect[2], rect[1]+rect[3], confidence]
                 detections_per_image.append(detection)
 
     return detections_per_image
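
If it helps as the sample/unit test requested above, something along these lines exercises the patched path end to end (the helper name, box values, and the threshold/eps/min_height values are mine, not DIGITS defaults):

    import math
    import cv2 as cv

    def vote_boxes_fixed(propose_boxes, rect_thresh, rect_eps, min_height):
        # vote_boxes() with the patch above applied:
        # corners -> (x, y, w, h) for groupRectangles, then back to corners.
        rects = [[x1, y1, x2 - x1, y2 - y1] for x1, y1, x2, y2 in propose_boxes]
        nboxes, weights = cv.groupRectangles(rects, rect_thresh, rect_eps)
        detections = []
        for rect, weight in zip(nboxes, weights):
            if rect[3] >= min_height:
                confidence = math.log(weight[0])
                detections.append([rect[0], rect[1],
                                   rect[0] + rect[2], rect[1] + rect[3], confidence])
        return detections

    # Two clusters of similar corner boxes; each should collapse to one detection
    # whose corners stay close to the originals.
    boxes = [[100, 100, 200, 260], [102, 98, 198, 262], [99, 101, 201, 259],
             [400, 150, 480, 320], [402, 152, 478, 318], [399, 149, 481, 321]]
    print(vote_boxes_fixed(boxes, rect_thresh=1, rect_eps=0.2, min_height=10))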

drnikolaev added a commit to drnikolaev/caffe that referenced this issue Feb 28, 2019
@drnikolaev

Fixed in v0.17.3
