google · MarkDaoust · Jun 24, 2024
@@ -394,7 +394,9 @@
       "source": [
         "### Get bounding boxes\n",
         "\n",
-        "You can ask the model for the coordinates of bounding boxes for objects in images."
+        "You can ask the model for the coordinates of bounding boxes for objects in images. For object detection, the Gemini model has been trained to return\n",
+        "each coordinate as a fraction of the image width or height in the range `[0, 1]`, scaled by 1000 and converted to an integer. Effectively, the coordinates describe a\n",
+        "1000x1000 version of the original image, and must be converted back to the original image's dimensions."
       ]
     },
     {
@@ -414,6 +416,19 @@
         "print(response.text)"
       ]
     },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "b8e422c55df2"
+      },
+      "source": [
+        "To convert these coordinates to the dimensions of the original image:\n",
+        "\n",
+        "1.    Divide each output coordinate by 1000.\n",
+        "1.    Multiply the x-coordinates by the original image width.\n",
+        "1.    Multiply the y-coordinates by the original image height."
+      ]
+    },
     {
       "cell_type": "markdown",
       "metadata": {