
pointer of encoder/decoder word_embedding.weight are same #8

Open · HumanIearning opened this issue Nov 29, 2022 · 0 comments
[Screenshot: 2022-11-30 at 1:05:07 AM]

[Screenshot: 2022-11-30 at 1:04:25 AM]

Even when I load_state_dict from a model whose encoder and decoder embedding layer weights differ, as in the screenshots above,

[Screenshot: 2022-11-30 at 1:08:30 AM]

the decoder embedding layer's weight ends up in both the encoder and the decoder, as shown above.
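For reference, this seems to happen because the two parameters share the same storage; a quick way to check is to compare data pointers. A minimal sketch, where the decoder key is my assumption mirroring the encoder key from my snippet below:

```python
# Sketch: if the encoder and decoder embeddings are tied, their weight
# tensors share storage and report the same data pointer.
# The decoder key here is an assumption, mirroring the encoder key.
enc_w = model.state_dict()['encoder.embeddings.word_embeddings.weight']
dec_w = model.state_dict()['decoder.embeddings.word_embeddings.weight']
print(enc_w.data_ptr() == dec_w.data_ptr())  # True when the weights are tied
```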

So I tried putting the values into the encoder directly with:

```python
model.state_dict()['encoder.embeddings.word_embeddings.weight'].copy_(ckpt['state_dict']['encoder.embeddings.word_embeddings.weight'])
```

but then both the encoder and the decoder embedding layers end up with only the encoder embedding layer's values.
[Screenshot: 2022-11-30 at 1:11:01 AM]

I'm not sure whether this is intended, but my model was trained so that the two parts hold different values, and I'd like to load each of them separately. Is there a way to do this?
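In case it helps, the workaround I have in mind is to untie the weights before loading: give the decoder embedding its own storage, then copy each checkpoint tensor in separately. This is only a sketch; the module paths and the decoder checkpoint key are assumptions mirroring the encoder key above:

```python
import torch

# Untie: replace the shared weight with an independent copy so the
# encoder and decoder embeddings no longer point at the same storage.
# Module paths are assumptions based on the state-dict key above.
dec_emb = model.decoder.embeddings.word_embeddings
dec_emb.weight = torch.nn.Parameter(dec_emb.weight.detach().clone())

# With separate storage, each side can now be loaded independently.
with torch.no_grad():
    model.encoder.embeddings.word_embeddings.weight.copy_(
        ckpt['state_dict']['encoder.embeddings.word_embeddings.weight'])
    model.decoder.embeddings.word_embeddings.weight.copy_(
        ckpt['state_dict']['decoder.embeddings.word_embeddings.weight'])
```

If the repository ties the embeddings on purpose (e.g., in the model constructor), it would be good to know whether untying them like this breaks any other assumption.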
