When fine-tuning the pretrained model, I noticed the PixMo pointing data is divided into four parts: pixmo_points, pixmo_points_counting, pixmo_points_high_freq, and pixmo_points_high_freq_counting. I'm wondering why it was split this way, and whether it improves pointing performance.
The parts are divided first into counting vs. non-counting, and then into high-frequency vs. low-frequency. The high-frequency data was collected through a slightly different pipeline that encouraged annotators to focus on high-frequency objects. The counting vs. non-counting split just formats the inputs differently.
In practice, I don't think it would make a big difference to merge all these parts into one big pointing dataset. It is just that way in the code due to how the data was collected, and to conveniently have both pointing and counting examples built from each point annotation.
The exception is that using these splits effectively up-samples the high-frequency data a bit, because smaller datasets are upsampled slightly and the high-frequency dataset is small, although this is not something we have actually ablated.
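The up-sampling effect described above can be illustrated with a minimal sketch. This is not the actual Molmo training code; the `alpha` smoothing exponent, the function name, and the split sizes are all hypothetical, chosen only to show how size-smoothed mixture weights give a small split a larger relative share than its raw size would.

```python
def mixture_weights(sizes, alpha=0.5):
    """Hypothetical smoothing: normalized sampling weights ~ size**alpha.

    With alpha < 1, smaller datasets receive a larger share of the
    mixture than their raw proportion of examples.
    """
    raw = {name: n ** alpha for name, n in sizes.items()}
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}


# Hypothetical split sizes: the high-frequency splits are much smaller.
sizes = {
    "pixmo_points": 100_000,
    "pixmo_points_counting": 100_000,
    "pixmo_points_high_freq": 10_000,
    "pixmo_points_high_freq_counting": 10_000,
}

weights = mixture_weights(sizes)

# Raw share of the high-freq split: 10k / 220k ~ 4.5%.
raw_share = sizes["pixmo_points_high_freq"] / sum(sizes.values())
# Its smoothed mixture weight is noticeably larger than its raw share.
print(weights["pixmo_points_high_freq"], raw_share)
```

Merging everything into one big pointing dataset would remove this implicit up-sampling, which is the one behavioral difference the splits introduce.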