Interesting work, but I have a question regarding the grounding process used for the CoM samples.
In the paper, you mention using a grounding tool to tag the data. However, when I tried applying the GroundingDINO model during the CoM construction process, I encountered difficulties in achieving accurate recognition, particularly with dots inside polyline graphs.
My question is: were the bounding boxes and results generated during the grounding step manually annotated, or were they produced through tuning the GroundingDINO model on the chart data?
Any clarification on this process would be greatly appreciated!
Thanks!
Thank you very much for your interest in our work.
We manually annotated the CoM reasoning data sourced from artificial graphical images (i.e., ChartQA, MathVista), as we found it difficult to use tools like GroundingDINO to label boxes in these images. The manual annotation includes the reasoning process with visual evidence (i.e., boxes, lines, OCR results) for each data sample, and it follows the same paradigm and data structure as the automated generation pipeline used for natural images, except for the line drawing.
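To make the description above more concrete, here is a minimal sketch of what one manually annotated record might look like. It is only an illustration: the class names, field names, and the `boxes_in_bounds` helper are hypothetical and are not taken from the released CoM data format or code.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical coordinate types: (x1, y1, x2, y2) in pixels.
Box = Tuple[int, int, int, int]
Line = Tuple[int, int, int, int]


@dataclass
class ReasoningStep:
    """One step of the reasoning chain with its visual evidence (assumed layout)."""
    text: str
    boxes: List[Box] = field(default_factory=list)   # manually drawn boxes
    lines: List[Line] = field(default_factory=list)  # manually drawn lines
    ocr: Optional[str] = None                        # text read from the region


@dataclass
class AnnotatedSample:
    """A chart question with its manually annotated reasoning steps."""
    image_path: str
    question: str
    answer: str
    steps: List[ReasoningStep]


def boxes_in_bounds(sample: AnnotatedSample, width: int, height: int) -> bool:
    """Sanity check: every box is well-formed and lies inside the image."""
    for step in sample.steps:
        for x1, y1, x2, y2 in step.boxes:
            if not (0 <= x1 < x2 <= width and 0 <= y1 < y2 <= height):
                return False
    return True


# Illustrative record with made-up values.
sample = AnnotatedSample(
    image_path="chart_0001.png",
    question="What is the value of the highest point on the line?",
    answer="42",
    steps=[
        ReasoningStep(
            text="Locate the highest dot on the polyline.",
            boxes=[(120, 40, 132, 52)],
        ),
        ReasoningStep(
            text="Project the dot onto the y-axis and read the tick label.",
            lines=[(126, 46, 30, 46)],
            ocr="42",
        ),
    ],
)
```

A check such as `boxes_in_bounds(sample, width, height)` is one way to catch annotation slips before the records are used for training.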
Using tools to obtain useful information from artificial images is challenging, and we are uncertain whether DINO-X can effectively address this issue. We would be happy to discuss any problems related to this topic in the future.