Hi Matthieu,

Thanks for your BinaryConnect paper and the implementation here, which are really inspiring and helpful. I have a concern about how the trained weights get updated. Empirically, the parameter changes produced by each gradient-descent step are tiny, as your paper also illustrates. With such tiny changes, the binarized weights may remain unchanged after each epoch's binarization; for example, it is hard for a weight to flip from 1 to -1 when the updates are so small. If the binarized weights stay the same, the forward pass gives similar results epoch after epoch, which in turn produces more tiny weight changes. Hence, after several epochs of training, the weights are hardly updated and the network is still far from the optimum. How do you solve this issue?

Another question: in your figure, the distribution of the weights after training is concentrated around -1 and 1. I don't know why, but my trained weights look somewhat random. Do you know why?

Thanks.
Hi yananliusdu!
For the first question, the author uses a variable `W_LR_scale` to scale the learning rate, so the updates to the real-valued weights are not too small to ever flip a sign. For the details, see the `clipping_scaling` function in `binary_connect.py`.
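To make the mechanism concrete, here is a minimal NumPy sketch of the idea (not the repo's actual Theano code): the real-valued "shadow" weights are the ones that receive the scaled updates and get clipped, while the binarized copy is only used in the forward and backward pass. The constants `H` and `W_LR_scale` below are assumed placeholder values standing in for what the repo derives per layer.

```python
import numpy as np

# Minimal sketch of the BinaryConnect-style update (assumed values, not the
# repo's Theano code). H and W_LR_scale stand in for the per-layer constants
# that clipping_scaling in binary_connect.py uses to scale and clip updates.

H = 1.0            # binarized weights take values in {-H, +H}
W_LR_scale = 10.0  # per-layer learning-rate scale (assumed value)
lr = 0.001

rng = np.random.default_rng(0)
W = rng.uniform(-0.1, 0.1, size=(4, 3))  # real-valued "shadow" weights
W0 = W.copy()

def binarize(w, H=1.0):
    # deterministic binarization: sign of the real-valued weights
    return np.where(w >= 0.0, H, -H)

for step in range(1000):
    Wb = binarize(W, H)                          # used in the forward/backward pass
    grad = rng.normal(scale=0.1, size=W.shape)   # placeholder for dLoss/dWb
    W -= lr * W_LR_scale * grad                  # tiny steps accumulate in W, not Wb
    W = np.clip(W, -H, H)                        # keep W near the threshold so
                                                 # sign flips stay reachable

print("fraction of binarized weights that flipped sign:",
      np.mean(binarize(W) != binarize(W0)))
```

Because the tiny gradient steps accumulate in the real-valued weights rather than in the binarized ones, a weight whose gradient consistently points the other way will eventually cross zero and flip its binarized sign, so training does not stall the way the question worries about.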