https://arxiv.org/abs/2007.00811

Go Wide, Then Narrow: Efficient Training of Deep Thin Networks (Denny Zhou, Mao Ye, Chen Chen, Tianjian Meng, Mingxing Tan, Xiaodan Song, Quoc Le, Qiang Liu, Dale Schuurmans)

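1. Train a wide model. 2. Attach linear layers to the narrow model so its width matches the wide model, then do feature matching. 3. Fine-tune, then merge the linear layers. Reaches ResNet-101 / BERT Large performance with ResNet-50 / BERT Base. It looks like glorified feature KD, but the results are good either way.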
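A minimal NumPy sketch of steps 2 and 3, under stated assumptions rather than the paper's exact procedure: the sizes, the names `U`, `D`, `W_next`, and the choice of a mean-squared-error matching loss are all hypothetical. It illustrates why the extra linear layers cost nothing at inference: an up-projection followed by a down-projection and a linear layer composes into a single narrow linear layer.

```python
import numpy as np

rng = np.random.default_rng(0)
d_narrow, d_wide = 4, 16  # hypothetical widths of the narrow and wide models

# Hypothetical adapters: U lifts narrow features to the wide model's width,
# D maps them back before the next narrow layer.
U = rng.normal(size=(d_wide, d_narrow))
D = rng.normal(size=(d_narrow, d_wide))

# Next linear layer of the narrow model (assumed purely linear here).
W_next = rng.normal(size=(d_narrow, d_narrow))
b_next = rng.normal(size=d_narrow)

def feature_matching_loss(student_wide, teacher_wide):
    """Mean squared error between projected student and teacher features."""
    return float(np.mean((student_wide - teacher_wide) ** 2))

def forward_with_adapters(h):
    """Narrow features -> wide (for matching) -> narrow -> next layer."""
    wide = U @ h
    return W_next @ (D @ wide) + b_next, wide

def merge_adapters(W_next, D, U):
    """Fold the adapters into the next layer: W_next (D (U h)) = (W_next D U) h."""
    return W_next @ D @ U

h = rng.normal(size=d_narrow)           # narrow model's intermediate features
teacher_wide = rng.normal(size=d_wide)  # stand-in for the wide model's features

out_adapters, wide = forward_with_adapters(h)
loss = feature_matching_loss(wide, teacher_wide)

W_merged = merge_adapters(W_next, D, U)
out_merged = W_merged @ h + b_next      # same output, no adapters left
```

After merging, `W_merged` has the narrow model's shape, so the deployed network keeps its original size while having been trained against the wide model's features.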

#distillation #lightweight
