https://arxiv.org/abs/2305.09312

Exploring the Impact of Layer Normalization for Zero-shot Neural Machine Translation (Zhuoyuan Mao, Raj Dabre, Qianying Liu, Haiyue Song, Chenhui Chu, Sadao Kurohashi)

Another result suggesting that post norm works better. Training stability aside, post norm does look like the right choice... but that very training stability (and precedent) is exactly the problem.
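For reference, a minimal sketch of the two placements being compared, assuming a single residual sublayer and a layer norm without learnable gain or bias. The names `post_norm_block`, `pre_norm_block`, and `toy_sublayer` are hypothetical and not taken from the paper; `toy_sublayer` merely stands in for attention or a feed-forward layer.

```python
import math

def layer_norm(x, eps=1e-5):
    # Normalize a vector to zero mean and unit variance (no learnable gain/bias).
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def post_norm_block(x, sublayer):
    # Post-norm: LayerNorm(x + Sublayer(x)); the residual sum itself is normalized.
    return layer_norm([a + b for a, b in zip(x, sublayer(x))])

def pre_norm_block(x, sublayer):
    # Pre-norm: x + Sublayer(LayerNorm(x)); the residual path is left unnormalized.
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]

def toy_sublayer(x):
    # Hypothetical stand-in for attention or a feed-forward layer.
    return [2.0 * v for v in x]

x = [1.0, 2.0, 3.0, 4.0]
post = post_norm_block(x, toy_sublayer)
pre = pre_norm_block(x, toy_sublayer)
```

In the pre-norm form the input `x` reaches the output through an identity path, which is the usual explanation for its easier optimization; in the post-norm form every block output is re-normalized.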

#normalization
