Why doesn't the transformer model share the same weight matrix between the two embedding layers? #478
Unanswered · suishitian asked this question in Q&A · Replies: 0 comments
Hi, I have a question from studying the transformer model:
According to the paper "Attention Is All You Need", Section 3.4, the model shares the same weight matrix between the two embedding layers (and the pre-softmax linear transformation). However, I couldn't find this in the transformer code. There are only two separate embedding initializations -- "self.embedding = tf.keras.layers.Embedding(...)" -- one in the Encoder and one in the Decoder, and nothing seems to connect the two layers. So my question is: where is the weight sharing between the two embedding layers?
Thank you
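
For reference, here is a minimal sketch of what I think the sharing from Section 3.4 would look like in Keras. This is not the tutorial's code; the class and variable names are my own, just to illustrate the idea of one embedding matrix being reused on both sides and for the output projection.

```python
import tensorflow as tf

# Illustrative sketch only: one Embedding instance is reused by both the
# encoder and the decoder, and its matrix also serves as the pre-softmax
# projection. Names and sizes are hypothetical.

vocab_size, d_model = 8000, 512

shared_embedding = tf.keras.layers.Embedding(vocab_size, d_model)

class TiedOutputLayer(tf.keras.layers.Layer):
    """Projects decoder outputs to vocab logits using the shared embedding matrix."""
    def __init__(self, embedding_layer):
        super().__init__()
        self.embedding_layer = embedding_layer

    def call(self, x):
        # x: (batch, seq_len, d_model) -> logits: (batch, seq_len, vocab_size)
        return tf.einsum('bld,vd->blv', x, self.embedding_layer.embeddings)

# Instead of the Encoder and Decoder each creating their own
# tf.keras.layers.Embedding, both would be handed the SAME layer object,
# so lookups on either side read (and train) the same weight matrix.
enc_emb = shared_embedding(tf.constant([[1, 2, 3]]))   # encoder-side lookup
dec_emb = shared_embedding(tf.constant([[4, 5]]))      # decoder-side lookup
logits = TiedOutputLayer(shared_embedding)(dec_emb)
print(logits.shape)  # (1, 2, 8000)
```

Is something like this missing from the tutorial code, or is it intentionally left out?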