Why doesn't the transformer model share the same weight matrix between the two embedding layers? #478
Unanswered · suishitian asked this question in Q&A · Replies: 0 comments
Hi, I have a question from studying the transformer model:
According to the paper "Attention Is All You Need", Section 3.4, the model shares the same weight matrix between the two embedding layers (and the pre-softmax linear transformation). However, I couldn't find this in the transformer code. There are only two separate embedding initializations -- "self.embedding = tf.keras.layers.Embedding(...)" -- one in the Encoder and one in the Decoder, and nothing seems to connect the two layers. So my question is: where is the weight sharing between the two embedding layers?
Thank you
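
For reference, here is a minimal sketch of what I think the sharing from Section 3.4 would look like in Keras. This is not the tutorial's code; the class and variable names are my own, just to illustrate the idea of one embedding matrix being reused on both sides and for the output projection.

```python
import tensorflow as tf

# Illustrative sketch only: one Embedding instance is reused by both the
# encoder and the decoder, and its matrix also serves as the pre-softmax
# projection. Names and sizes are hypothetical.

vocab_size, d_model = 8000, 512

shared_embedding = tf.keras.layers.Embedding(vocab_size, d_model)

class TiedOutputLayer(tf.keras.layers.Layer):
    """Projects decoder outputs to vocab logits using the shared embedding matrix."""
    def __init__(self, embedding_layer):
        super().__init__()
        self.embedding_layer = embedding_layer

    def call(self, x):
        # x: (batch, seq_len, d_model) -> logits: (batch, seq_len, vocab_size)
        return tf.einsum('bld,vd->blv', x, self.embedding_layer.embeddings)

# Instead of the Encoder and Decoder each creating their own
# tf.keras.layers.Embedding, both would be handed the SAME layer object,
# so lookups on either side read (and train) the same weight matrix.
enc_emb = shared_embedding(tf.constant([[1, 2, 3]]))   # encoder-side lookup
dec_emb = shared_embedding(tf.constant([[4, 5]]))      # decoder-side lookup
logits = TiedOutputLayer(shared_embedding)(dec_emb)
print(logits.shape)  # (1, 2, 8000)
```

Is something like this missing from the tutorial code, or is it intentionally left out?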