There should be a stop gradient in the simsiam model #39

nikheelpandey · 2021-05-27T12:08:51Z

Hello,

I was using your implementation of SimSiam for contrastive learning, and I noticed a problem with the model you have created:

The "stop_gradient" part of the network is absent from your implementation. As a result, the model is effectively training both paths.

Could you please clarify how and where you are taking care of it?

hhhdw · 2021-07-05T09:46:20Z

import torch.nn.functional as F

def D(p, z, version='simplified'):  # negative cosine similarity
    if version == 'original':
        z = z.detach()  # stop gradient
        p = F.normalize(p, dim=1)  # l2-normalize
        z = F.normalize(z, dim=1)  # l2-normalize
        return -(p * z).sum(dim=1).mean()
    elif version == 'simplified':  # same thing, much faster. Scroll down, speed test in __main__
        return -F.cosine_similarity(p, z.detach(), dim=-1).mean()
    else:
        raise Exception

There is a 'detach' on 'z' when the loss is computed; that is the stop gradient.
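One way to check this claim is to run a backward pass and see which tensors receive gradients. The following is a minimal sketch, not code from the repository: the random tensors `p1`, `p2`, `z1`, `z2` are hypothetical stand-ins for the predictor and projector outputs of two augmented views, and `D` is reduced to the 'simplified' branch. The loss is symmetrized as `D(p1, z2) / 2 + D(p2, z1) / 2`, following the SimSiam formulation.

```python
import torch
import torch.nn.functional as F


def D(p, z):
    # Negative cosine similarity; z.detach() is the stop-gradient.
    return -F.cosine_similarity(p, z.detach(), dim=-1).mean()


torch.manual_seed(0)

# Hypothetical stand-ins (assumption): in a real model these come from
# the predictor (p) and the projector (z) for two augmented views.
p1 = torch.randn(4, 8, requires_grad=True)
p2 = torch.randn(4, 8, requires_grad=True)
z1 = torch.randn(4, 8, requires_grad=True)
z2 = torch.randn(4, 8, requires_grad=True)

# Symmetrized loss over the two views.
loss = D(p1, z2) / 2 + D(p2, z1) / 2
loss.backward()

# Gradients reach the p inputs but not the detached z inputs.
print("p1 has grad:", p1.grad is not None)
print("p2 has grad:", p2.grad is not None)
print("z1 has grad:", z1.grad is not None)
print("z2 has grad:", z2.grad is not None)
```

In a full model the encoder is shared, so it still receives gradients through the `p` path; the detach only blocks the gradient that would otherwise flow back through the `z` branch.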
