During the pre-training phase, taking the VG dataset as an example, multiple captions correspond to the same image. What is unclear to me is the ITM loss: if the same image appears multiple times in a batch, each time with a different caption, one of those captions can be sampled as a hard negative for another copy of that image. That caption is actually a valid description of the image, yet by implementation it receives label 0, i.e., not a match. Could you please explain the reasoning here? Should we somehow prevent the same image from appearing multiple times in a batch to avoid this issue?
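One workaround I have seen discussed (this is a sketch, not code from this repo; the function and variable names are illustrative) is to carry an image ID alongside each caption and mask out same-image pairs when choosing hard negatives, so a duplicate image's other captions can never be labeled 0 against it:

```python
# Sketch of a false-negative mask for ITM hard-negative mining.
# Assumption: each batch element knows the ID of its underlying image
# (e.g., the VG image id). Names here are illustrative, not from the repo.

def hard_negative_mask(image_ids):
    """Return an NxN boolean mask where mask[i][j] is True iff caption j
    is allowed to serve as a hard negative for image i, i.e. the two
    batch elements come from different underlying images."""
    n = len(image_ids)
    return [[image_ids[i] != image_ids[j] for j in range(n)] for i in range(n)]

# Example batch: VG image 7 contributes two captions (indices 0 and 1).
batch_image_ids = [7, 7, 9, 3]
mask = hard_negative_mask(batch_image_ids)

# Caption 1 describes the same image as element 0, so it must not be
# sampled as a negative for it:
assert mask[0][1] is False
# Caption 2 comes from a different image, so it is a valid negative:
assert mask[0][2] is True
```

In a real training loop this mask would be applied to the similarity matrix (e.g., by setting masked entries to -inf) before sampling hard negatives, which avoids the label-0 contradiction without having to de-duplicate images at the sampler level.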
My recent work has the same problem. Because of the overlap of texts or images in a batch, the model cannot learn to separate the negative samples, and the ITC and ITM losses fail to converge. Have you solved this problem?