You'll really see a pattern in what I'm reading lately...
The typical way to get a sparse model is to train a dense model and drop connections (possibly over multiple iterations). But this approach has two main limitations:

1. The maximum size of the sparse model is limited by the largest dense model that can be trained.
2. A large amount of computation is spent on parameters that are zero-valued, or that will be zero during inference.
In this new approach, called "RigL", a sparse neural network is randomly initialized, and at regularly spaced intervals a fraction of connections is removed based on their magnitudes and new ones are activated using gradient information.
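The drop/grow step above can be sketched roughly as follows. This is an illustrative NumPy sketch, not the authors' implementation: the function name `rigl_update`, the `(weights, mask, grads)` interface, and the `drop_fraction` parameter are my own assumptions about how to structure one update on a single weight matrix.

```python
import numpy as np

def rigl_update(weights, mask, grads, drop_fraction=0.3):
    """One RigL-style drop/grow step (illustrative sketch).

    `mask` is a boolean array marking active connections; `grads` is the
    dense gradient for ALL connections (active and inactive), which RigL
    uses to decide where to grow new ones.
    """
    n_drop = int(drop_fraction * mask.sum())
    new_mask = mask.copy()

    # Drop: deactivate the active connections with the smallest magnitude.
    magnitude = np.where(mask, np.abs(weights), np.inf)
    drop_idx = np.argsort(magnitude, axis=None)[:n_drop]
    new_mask.flat[drop_idx] = False

    # Grow: activate the inactive connections with the largest gradient
    # magnitude; newly grown connections start at zero weight.
    grad_mag = np.where(new_mask, -np.inf, np.abs(grads))
    grow_idx = np.argsort(grad_mag, axis=None)[-n_drop:]
    new_mask.flat[grow_idx] = True

    # Keep weights only for connections that stayed active throughout;
    # dropped and newly grown connections are zero.
    new_weights = np.where(new_mask & mask, weights, 0.0)
    return new_weights, new_mask
```

Because the same number of connections is dropped and grown, the overall sparsity level stays fixed for the whole run, which is what lets RigL train a network that was never dense in the first place.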
Stumbled on this here: https://www.reddit.com/r/MachineLearning/comments/j0xr24/d_machine_learning_wayr_what_are_you_reading_week/g71offh/?utm_source=reddit&utm_medium=web2x&context=3
Link to the paper: https://arxiv.org/pdf/1911.11134.pdf