Section 5.1 (Implementation details) mentions a layer decay schedule: "To avoid deteriorating the general representations obtained from the previous stage, a layer decay schedule is adopted to train the student model for all downstream tasks."
Could you share more details about the layer decay schedule, or point me to the code or a reference for it?
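For context, my guess is that this refers to the layer-wise learning-rate decay used in fine-tuning recipes such as ELECTRA or BEiT, where layers closer to the input get a smaller learning rate than layers closer to the task head. Below is a minimal sketch of that idea, not the paper's actual schedule. The function names, the depth convention (0 = embeddings, 1..num_layers = blocks, num_layers + 1 = head), and the decay value are all my own assumptions.

```python
# Hypothetical sketch of layer-wise learning-rate decay.
# Assumption: this is what "layer decay schedule" means; the paper does not confirm it.

def layerwise_lr_scales(num_layers, decay):
    """Return one LR multiplier per depth.

    Depth 0 is the embedding layer, depths 1..num_layers are the
    transformer blocks, and depth num_layers + 1 is the task head.
    The head keeps the full learning rate (multiplier 1.0); each step
    toward the input multiplies the rate by `decay` once more.
    """
    return [decay ** (num_layers + 1 - depth) for depth in range(num_layers + 2)]


def build_param_groups(depth_of_param, base_lr, num_layers, decay):
    """Group parameter names by depth and attach the scaled learning rate.

    `depth_of_param` maps a parameter name to its depth (hypothetical
    bookkeeping; a real model would derive this from parameter names).
    """
    scales = layerwise_lr_scales(num_layers, decay)
    groups = {}
    for name, depth in depth_of_param.items():
        group = groups.setdefault(depth, {"names": [], "lr": base_lr * scales[depth]})
        group["names"].append(name)
    # Return groups ordered from the input side to the head.
    return [groups[depth] for depth in sorted(groups)]
```

With this convention, each resulting group could be passed to an optimizer as a separate parameter group with its own learning rate. Is this roughly what the paper does, and if so, which decay factor was used?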
Thanks