plots: interactive plots with toggling bounding box #10198
Some thoughts around how we could do this (open to other ideas): Here is a nice lib that shows the different BB formats and converts between them: https://github.com/devrimcavusoglu/pybboxes. Basically, the formats are:
These can be encoded in a JSON or other structured file, but unfortunately there doesn't seem to be any standardization on the file format, so we would either need users to specify the file structure, or we can start by having dvclive write out a standard format that we can parse, which could be something like:
[
{"path": "image1.png", "label": "cat", "bbox": [100, 110, 5, 20]},
{"path": "image2.png": ...},
If we do that, we should have enough info to render bounding boxes in VS Code and Studio. To find this annotations file, we could:
|
In my mind this option makes the most sense. This allows the user full control over where and how the annotation files are being written, and it allows easy integration with dvclive if some frameworks automatically output this info at a set location. |
Should it be an annotation file per image though, folks? Let's say I have a pipeline that produces some new images every time and doesn't delete old ones. In this case, how am I supposed to update that single file? Also, a single file can become super painful to parse: it can be slow, we can run out of memory, etc. |
Yes, I think one annotation file per image is more standard (considering there are very few standardizations here). In my mind, the path would actually be a directory path which contains x structured files, one for each image you would like to display. For example, if I run my newly trained model on 5 images and produce bounding boxes for all 5, there would be 5 images and 5 annotation files. |
Right, one annotation file per image is better. I wonder then if it's worth introducing |
For this do you mean have a set location that DVC looks for files and then display in studio/vscode if those files exist? Like store them all in something like |
I was thinking the simplest is to have them right next to the images themselves so that we don't have to worry about mapping the annotations to the images. For example:
Both, but yes I am mostly focused on dvclive here. I'm taking inspiration from https://dvc.org/doc/dvclive/live/log_image#images-per-step, where we have a similar convention. I am not against ultimately codifying this in
|
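A minimal sketch of the layout discussed above (one annotation file per image, stored right next to the image). It assumes the annotation shares the image's file name with a `.json` extension, which is only one possible convention and was not settled in this thread; `pair_images_with_annotations` and `IMAGE_SUFFIXES` are hypothetical names.

```python
from pathlib import Path

# Assumption: these are the image extensions we care about.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def pair_images_with_annotations(directory):
    """Map each image in `directory` to its sibling annotation file.

    The annotation is assumed to be `<image stem>.json` in the same
    directory. Images without an annotation map to None.
    """
    pairs = {}
    for image in sorted(Path(directory).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        annotation = image.with_suffix(".json")
        pairs[image] = annotation if annotation.exists() else None
    return pairs
```

Because each image is resolved independently, a pipeline that keeps adding images never has to rewrite one large shared file.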
Hello, I wanted to check on the status of this issue. Have we decided that users will place the JSON files next to the images themselves? If so, what BB format will we be using for the JSON files? |
This makes the most sense to me.
I think x1,y1,x2,y2 (top left, bottom right points) - is more common. Obviously I do not get the final say, but just my $.02 as a user. |
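For reference, a small sketch of how the corner format mentioned above (x1,y1,x2,y2) relates to a top-left-plus-size format (x, y, width, height). The function names are hypothetical, and pybboxes (linked earlier) provides conversions between more formats; this is only plain arithmetic.

```python
def xywh_to_corners(x, y, width, height):
    """Convert a (top-left x, top-left y, width, height) box to
    (x1, y1, x2, y2), i.e. top-left and bottom-right points."""
    return (x, y, x + width, y + height)


def corners_to_xywh(x1, y1, x2, y2):
    """Inverse conversion: corner points back to top-left plus size."""
    return (x1, y1, x2 - x1, y2 - y1)
```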
Okay, added iterative/dvclive#766 to propose how we want to do this on the backend. |
Thanks! Just wanted to make sure I'm understanding things correctly, are we planning to have these JSON file contents or paths be shown in any way inside of Update: Discussed synchronously in VSCode planning, and concluded that there would need to be changes made to |
@julieg18 @mattseddon Is there a plan for how to implement it? Do you need more clarification on anything? |
No, we haven't decided on a plan on how to implement this on DVC's end. We were planning to discuss what would need to be done in Slack. |
@julieg18 @mattseddon @AlexandreKempf Can you two agree on the API of what dvc should pass to get the bounding box info in vs code and studio? |
I'm planning on having a |
Currently, my PRs are expecting a format like this: {
"boxes": [
{
"label": "cat",
"box": {
"left": 100, "right": 110, "top": 5, "bottom": 20
}
},
{
"label": "cat",
"box": {
"left": 30, "right": 55, "top": 75, "bottom": 90
}
},
{
"label": "dog",
"box": {
"left": 80, "right": 100, "top": 25, "bottom": 50
}
}
]
} But would it be possible for the boxes to be sorted by the label? That would reduce processing on the Studio/VS Code side. Example:
Outdated example
```json
{
  "boxes": [
    {
      "label": "cat",
      "boxes": [
        { "left": 100, "right": 110, "top": 5, "bottom": 20 },
        { "left": 30, "right": 55, "top": 75, "bottom": 90 }
      ]
    },
    {
      "label": "dog",
      "boxes": [{ "left": 80, "right": 100, "top": 25, "bottom": 50 }]
    }
  ]
}
```
Update, I meant: {
"boxes": {
"cat": [
{ "left": 100, "right": 110, "top": 5, "bottom": 20 },
{ "left": 30, "right": 55, "top": 75, "bottom": 90 }
],
"dog": [
{ "left": 80, "right": 100, "top": 25, "bottom": 50 }
]
}
} |
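To make the difference between the two shapes concrete, here is a rough sketch of the grouping step that would otherwise have to run in both Studio and VS Code. It assumes the flat format shown first (a list of `{"label", "box"}` entries); `group_boxes_by_label` is a hypothetical helper, not an existing DVC or dvclive function.

```python
from collections import defaultdict


def group_boxes_by_label(flat):
    """Turn {"boxes": [{"label": ..., "box": {...}}, ...]} into
    {"boxes": {label: [box, ...]}}, keeping the original box order."""
    grouped = defaultdict(list)
    for entry in flat["boxes"]:
        grouped[entry["label"]].append(entry["box"])
    return {"boxes": dict(grouped)}
```

Running this once at logging time is a single pass over the boxes, so the cost is small wherever it ends up living.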
@julieg18 Technically, it is possible, yes. I personally prefer the first format, but if that helps you a lot, we might take the second one. For most object detection tasks, running this kind of processing (grouping by label) during the logging phase is probably smarter because it is not very time-sensitive. If the plots are lagging, it can be frustrating for the user, so let me know if it really increases performance on your side (enough so that we take the tradeoff with the json format). |
If I'm understanding your question correctly, the frontend can handle both whole numbers and numbers with decimals when it comes to the box coordinates. |
While it wouldn't increase performance tremendously, every bit helps, especially when it comes to handling huge numbers of images/boxes. It also reduces code repetition in VSCode and Studio, since both products have to sort the boxes by label. We can definitely work with the first format though if it works better on DVC's end. |
Yeah, that was my question. But to make extra sure:
|
I needed to be more explicit in my previous comment, I apologize. In data science, we have two units for the bounding box's coordinates: pixels and %. I believe we should support both. It is quite easy to determine which unit the user is using (all values are in the <0,1> range for %) but then the user needs to give us details about the image size. We agree with @julieg18 that DVClive should be the one in charge of handling the conversion from % to pixels. |
Whoops! Apologies, I made the example incorrectly 🤦♀️. I meant something like this: {
"boxes": {
"cat": [
{ "left": 100, "right": 110, "top": 5, "bottom": 20 },
{ "left": 30, "right": 55, "top": 75, "bottom": 90 }
],
"dog": [
{ "left": 80, "right": 100, "top": 25, "bottom": 50 }
]
}
} |
@dberenbaum Can we agree to follow this schema or do you see any problem with it? :) |
I think it's fine as a starting point, although can we agree it may change as we try it? We still may need to reconsider the tradeoffs between rendering performance and useful schema. For example, if someone wants to resize an image without recomputing bounding boxes, it would be nice to allow for relative coordinates in the schema, but not sure if this really outweighs the performance benefits of precomputing fixed coordinates. |
We took some time to discuss with @julieg18 the difference between the two schemas (#10198 (comment)). There were some things that were discussed:
|
@dberenbaum Concerning the relative coordinates, if the bounding boxes are stored alongside the image, we know the size of the image. So we could allow the user to log the bounding boxes with <0,1> values in the python code but save it in the JSON with pixel values. |
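A minimal sketch of that conversion, assuming the `left`/`right`/`top`/`bottom` keys used in this thread and the "all values within 0 to 1 means relative" heuristic mentioned earlier. `to_pixel_box` is a hypothetical name, and how dvclive would actually obtain the image size is not decided here.

```python
def to_pixel_box(box, image_width, image_height):
    """Return `box` in pixel coordinates.

    If every value lies in [0, 1] the box is treated as relative and is
    scaled by the image size; otherwise it is returned unchanged.
    Caveat: a pixel box whose values are all 0 or 1 is ambiguous under
    this heuristic and would be scaled as if it were relative.
    """
    is_relative = all(0 <= value <= 1 for value in box.values())
    if not is_relative:
        return dict(box)
    return {
        "left": round(box["left"] * image_width),
        "right": round(box["right"] * image_width),
        "top": round(box["top"] * image_height),
        "bottom": round(box["bottom"] * image_height),
    }
```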
I think it's fine. My point was more that we should expect that things might change after we play with it more, so let's not spend too much time focusing on whether it's the right schema yet. |
We've decided on starting with: {
"boxes": {
"cat": [
{ "box": {"left": 100, "right": 110, "top": 5, "bottom": 20}, "score": 0.8 },
{ "box": {"left": 80, "right": 130, "top": 13, "bottom": 55}, "score": 0.5 }
],
"dog": [
{ "box": {"left": 81, "right": 160, "top": 16, "bottom": 52}, "score": 0.1 }
]
}
} I'll update Studio and VSCode's PRs to use this format. |
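With this schema, toggling classes on and off amounts to selecting keys from `boxes`. A rough sketch in Python for illustration only (the real rendering lives in the Studio and VS Code frontends); `visible_boxes` is a hypothetical helper.

```python
def visible_boxes(annotations, enabled_labels):
    """Return only the entries whose label is currently toggled on.

    `annotations` follows the schema above:
    {"boxes": {label: [{"box": {...}, "score": ...}, ...]}}.
    """
    return {
        label: entries
        for label, entries in annotations["boxes"].items()
        if label in enabled_labels
    }
```

Since boxes are already grouped by label, no per-box filtering is needed when a class is toggled.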
…y bbox on images Fixes #10198
What is the status of this issue? Does #10312 need to be taken over? I'd be willing to look into it if need be! |
Hey @BradyJ27, absolutely, feel free to take a look and help us get it done! Thanks. |
Issue stems from:
iterative/vscode-dvc#4917
The idea is to have an interactive plot (likely not a template of the current plot system), for which the user can view different labels for each class, and toggle labels on and off to see bounding boxes for specific classes.