https://arxiv.org/abs/2104.06644

Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little (Koustuv Sinha, Robin Jia, Dieuwke Hupkes, Joelle Pineau, Adina Williams, Douwe Kiela)

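The argument runs: if MLM pretraining were really modeling word order, pretraining on shuffled text should hurt, yet performance stays strong even then, so perhaps what it models is the word distribution instead. Looking at the MDL results, though, I think it is more accurate to say that a fair amount of word order is still being learned.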
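To make the "shuffled pretraining" setup concrete, here is a minimal sketch of sentence-level word shuffling. It assumes whitespace tokenization and a plain unigram shuffle; the function name `shuffle_words` and the `seed` parameter are hypothetical, and the paper's actual corpus-construction procedure may differ.

```python
import random


def shuffle_words(sentence: str, seed: int = 0) -> str:
    """Return the sentence with its whitespace-separated words in random order.

    Assumption: words are split on whitespace; this is an illustrative
    stand-in, not the paper's preprocessing code.
    """
    words = sentence.split()
    rng = random.Random(seed)  # local RNG so the result is reproducible
    rng.shuffle(words)         # in-place permutation of the word list
    return " ".join(words)


original = "the cat sat on the mat"
shuffled = shuffle_words(original, seed=0)
print(shuffled)
```

The shuffled sentence keeps exactly the same bag of words, so only co-occurrence (distributional) information survives within the sentence, which is the condition the note is reasoning about.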

#pretraining #language_model #mlm
