How to distinguish the sos token (default = 0) from quantized image token zero? #242

JJJYmmm · 2024-03-26T12:46:51Z

The transformer takes in the quantized image tokens generated by VQGAN, whose codebook has indices 0 to n_embed-1, and the transformer's sos token also defaults to zero. Could you tell me why we don't distinguish codebook vector 0 from the sos token when training the transformer?

srasoulzadeh · 2024-09-03T14:13:50Z

This was my question as well. I believe the reason is that the transformer uses positional encodings to add information about the position of each token in the sequence. The token at position 0 is always the sos_token, while image tokens only ever appear at later positions, so this positional information lets the model distinguish the sos_token from codebook index 0.
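A minimal sketch of this argument, assuming GPT-style learned position embeddings that are added to the token embeddings. The sizes, the `random_table` helper, and the `embed` function are hypothetical stand-ins for illustration, not the repository's code. It shows that the sos token and a codebook index 0 share the same token embedding row, yet the vectors the model actually receives differ because their positions differ.

```python
import random

random.seed(0)

n_embed = 4      # hypothetical codebook size (indices 0..n_embed-1)
seq_len = 5      # hypothetical: 1 sos slot + 4 image tokens
dim = 3          # hypothetical embedding width
sos_token = 0    # shares its id with codebook index 0, as in the question


def random_table(rows, cols):
    """Stand-in for a learned embedding table."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


tok_emb = random_table(n_embed, dim)   # one row per token id
pos_emb = random_table(seq_len, dim)   # one row per sequence position


def embed(tokens):
    """Token embedding plus position embedding, element-wise."""
    return [
        [t + p for t, p in zip(tok_emb[tok], pos_emb[pos])]
        for pos, tok in enumerate(tokens)
    ]


image_indices = [2, 0, 3, 1]              # quantized image tokens; one is 0
sequence = [sos_token] + image_indices    # sos always sits at position 0
inputs = embed(sequence)

# Positions 0 (sos) and 2 (codebook index 0) carry the same token id.
same_token_row = tok_emb[sequence[0]] == tok_emb[sequence[2]]
same_model_input = inputs[0] == inputs[2]
print(same_token_row, same_model_input)
```

Here the two positions look up the same token row but end up with different input vectors, which is the separation described above. Whether a trained model relies on exactly this mechanism is an assumption of the sketch.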

JJJYmmm · 2024-09-07T15:38:40Z

Sounds reasonable!
thx!
