I have two questions.

(1) I notice that in your code (vokenization/vlm/model.py, Line 238 in 5601b79) you design three loss functions: voken classification, voken regression, and voken contrastive. But you only report voken classification in the paper. Did you perhaps find after trials that voken regression and voken contrastive don't work, or even harm model performance? Is my guess correct? (Because image features are far different from language embeddings.)

(2) What is the intuition for why the voken classification loss can improve model performance? I suspect that different words with similar semantics will receive the same voken labels, so the voken classification loss will optimize their similarity. What is your opinion? Could you give me some intuition from your point of view?
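For context, here is a minimal sketch (my own illustration, not the actual code in model.py) of how the three objectives could be implemented on top of per-token hidden states; the names `cls_head`, `proj_head`, `voken_feats`, and `temperature` are hypothetical:

```python
import torch
import torch.nn.functional as F

def voken_classification_loss(hidden_states, cls_head, voken_labels):
    # Per-token cross-entropy over the discrete voken vocabulary,
    # analogous to a masked-LM head but with voken ids as targets.
    logits = cls_head(hidden_states)                       # (B, T, num_vokens)
    return F.cross_entropy(logits.flatten(0, 1), voken_labels.flatten())

def voken_regression_loss(hidden_states, proj_head, voken_feats):
    # L2 regression: push the projected token representation toward the
    # image (voken) feature vector assigned to that token.
    pred = proj_head(hidden_states)                        # (B, T, D_img)
    return F.mse_loss(pred, voken_feats)

def voken_contrastive_loss(hidden_states, proj_head, voken_feats, temperature=0.07):
    # InfoNCE-style contrastive loss: each token should match its own
    # voken feature against the other vokens in the batch.
    pred = F.normalize(proj_head(hidden_states).flatten(0, 1), dim=-1)  # (B*T, D)
    tgt = F.normalize(voken_feats.flatten(0, 1), dim=-1)                # (B*T, D)
    logits = pred @ tgt.t() / temperature                               # (B*T, B*T)
    labels = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, labels)
```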
Yes, these voken losses perform similarly, so we chose the simplest one; to me, that's classification.
The voken label is also a strong supervision signal. To me, contrastive and L2-regression losses are mostly used for distillation, but hard labels can do the same job (e.g., in language-model distillation). Some other works to look at are wav2vec 2.0 and DINO.
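To make the distillation analogy in this reply concrete (again a hypothetical sketch of mine, not code from the repo): L2-regression and contrastive losses match a continuous teacher signal, much like soft-target distillation, while voken classification uses hard labels, as in hard-label distillation of language models:

```python
import torch
import torch.nn.functional as F

student_logits = torch.randn(4, 1000)                      # per-token voken logits
teacher_probs = F.softmax(torch.randn(4, 1000), dim=-1)    # continuous teacher signal
hard_labels = teacher_probs.argmax(dim=-1)                 # discretized targets (voken ids)

# Soft-target (distillation-like) objective, analogous to L2-reg / contrastive.
soft_loss = F.kl_div(F.log_softmax(student_logits, dim=-1),
                     teacher_probs, reduction='batchmean')

# Hard-label objective, as used for voken classification.
hard_loss = F.cross_entropy(student_logits, hard_labels)
```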