---
abstract: 'Despite the advances in Visual Question Answering (VQA), many VQA models currently suffer from language priors (i.e., generating answers directly from questions without using images), which severely reduces their robustness in real-world scenarios. We propose a novel training strategy called Loss Rebalancing Label and Global Context (LRLGC) to alleviate the above problem. Specifically, the Loss Rebalancing Label (LRL) is dynamically constructed based on the degree of sample bias to accurately adjust losses across samples and ensure a more balanced form of total losses in VQA. In addition, the Global Context (GC) provides the model with valid global information to assist the model in predicting answers more accurately. Finally, the model is trained through an ensemble-based approach that retains the beneficial effects of biased samples on the model while reducing their importance. Our approach is model-agnostic and enables end-to-end training. Extensive experimental results show that LRLGC (1) improves performance for various VQA models and (2) performs competitively on the VQA-CP v2 benchmark.'
openreview: FEDih5lICO
title: Overcoming Language Priors for Visual Question Answering via Loss Rebalancing Label and Global Context
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: cao23a
month: 0
tex_title: Overcoming Language Priors for Visual Question Answering via Loss Rebalancing Label and Global Context
firstpage: 249
lastpage: 259
page: 249-259
order: 249
cycles: false
bibtex_author: Cao, Runlin and Li, Zhixin
author:
- given: Runlin
  family: Cao
- given: Zhixin
  family: Li
date: 2023-07-02
address:
container-title: Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence
volume: 216
genre: inproceedings
issued:
  date-parts:
  - 2023
  - 7
  - 2
pdf:
extras:
---
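The loss-rebalancing idea described in the abstract (down-weighting samples whose answers a language-prior-driven model already gets right) can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's exact LRL construction: the question-only bias branch, the bias score, and the weighting scheme are all assumptions.

```python
import torch
import torch.nn.functional as F

def rebalanced_loss(logits, q_only_logits, targets):
    """Hypothetical sketch of loss rebalancing for VQA debiasing.

    Samples that a question-only (biased) branch answers confidently
    contribute less to the total loss, so the full VQA model cannot
    rely purely on language priors. Not the authors' exact method.
    """
    with torch.no_grad():
        # Bias score: probability the question-only branch assigns to
        # the ground-truth answer (high => strongly biased sample).
        bias = torch.softmax(q_only_logits, dim=-1).gather(
            1, targets.unsqueeze(1)).squeeze(1)
    # Per-sample cross-entropy for the full (image + question) model.
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    # Rebalance: biased samples are retained but down-weighted.
    return ((1.0 - bias) * per_sample).mean()
```

The key design choice is that biased samples are kept in training with reduced weight, rather than removed, which matches the abstract's point that the ensemble-based approach "retains the beneficial effects of biased samples … while reducing their importance."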