Reproduced ResNet18 CIFAR10 result is 10% lower than reported #7
Comments
@csyhhu Do you have any update on the accuracy? I got 87% on CIFAR10, still 4 points lower than the paper.
Hi @blueardour, I don't have any update since then.
Hi @csyhhu @blueardour @czhu95,
Model (ResNet 34 on CIFAR 10):
def resnet_shortcut(l, n_out, stride, activation=tf.identity):
def resnet_group(name, l, block_func, features, count, stride):
Final Train Output:
@Yulun-Yao Hi, I'm very sorry, but I've given up on TTN. I spent weeks trying different optimizer strategies and modifying the gradient based on the empirical experience gathered so far, but I still found the training unstable and could not recover the accuracy. Recently, I moved to LQ-net and Dorefa. With those I consistently obtained accuracy better than the numbers reported in the papers, without much effort, in many scenarios. Even for the a2w1 or a1w1 bit configurations, I got better results than with TTN.
Thank you for your reply and suggestions! Btw, have you ever encountered the problem I mentioned above? (Negative scaling factors resulted in both weights having the same sign). If you did encounter it, were you able to fix it? Would you mind sharing your model and parameters?
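To make the sign problem mentioned above concrete, here is a minimal NumPy sketch of TTQ-style ternarization. It is not the repository's code: the names `ternarize`, `w_p`, and `w_n` are hypothetical, and it assumes the quantized levels are `+w_p`, `0`, and `-w_n` with a layer-wise threshold of `t * max|w|`. Under those assumptions, if the learned `w_n` drifts below zero, the "negative" level `-w_n` becomes positive and both nonzero levels share a sign.

```python
import numpy as np

def ternarize(w, w_p, w_n, t=0.05):
    """Sketch of TTQ-style ternarization (assumed form, not the repo's code).

    Weights above +delta map to +w_p, weights below -delta map to -w_n,
    and everything in between maps to 0.
    """
    delta = t * np.max(np.abs(w))      # layer-wise threshold
    out = np.zeros_like(w)
    out[w > delta] = w_p               # positive level
    out[w < -delta] = -w_n             # negative level
    return out

w = np.array([0.9, -0.8, 0.01, -0.02, 0.5])

# Both scaling factors positive: levels have opposite signs, as intended.
healthy = ternarize(w, w_p=1.0, w_n=1.0)

# w_n has drifted negative during training: -w_n is now positive,
# so every nonzero quantized weight ends up with the same sign.
collapsed = ternarize(w, w_p=1.0, w_n=-0.3)

# One possible guard (an assumption, not a confirmed fix): force the
# scaling factors to stay non-negative before applying them.
guarded = ternarize(w, w_p=abs(1.0), w_n=abs(-0.3))

print(healthy, collapsed, guarded)
```

Whether constraining the scaling factors this way helps training stability is untested here; the sketch only shows how the symptom can arise.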
Hi @Yulun-Yao, sorry for the late reply. I have also given up on TTQ. For quantization problems, maybe you can add me on WeChat (csyhhu) for further discussion.
Hi @czhu95,
Thanks for providing the code!
Recently I used your code to ternarize a ResNet18 on CIFAR10. First, I used tensorpack to train a full-precision ResNet18 to a validation error of 0.083. However, when I use this model as the initial state and ternarize it (using the example code) with the default delta t=0.05 from your code, the validation error always stays around 0.1843. I tried other values of t, but the error is still around 0.18, which is about 10% lower in accuracy than your paper reports.
Are there any tricks I am missing, or is there a mistake I made?
Best regards,
Shangyu
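For readers sweeping t as described above, a small NumPy sketch of how the threshold factor changes the share of weights mapped to zero may be useful. It assumes the threshold is `t * max|w|` per layer; the function name `threshold_stats` and the synthetic Gaussian weights are hypothetical stand-ins, not values from this model.

```python
import numpy as np

def threshold_stats(w, t):
    """Fractions of weights falling in each ternary bucket for a given t.

    Assumes a layer-wise threshold delta = t * max|w| (a sketch, not the
    repository's implementation).
    """
    delta = t * np.max(np.abs(w))
    frac_pos = float(np.mean(w > delta))
    frac_neg = float(np.mean(w < -delta))
    return {
        "delta": float(delta),
        "frac_pos": frac_pos,
        "frac_neg": frac_neg,
        "frac_zero": 1.0 - frac_pos - frac_neg,
    }

# Synthetic stand-in for one layer's pretrained weights.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=10_000)

ts = (0.01, 0.05, 0.1, 0.2)
stats = [threshold_stats(w, t) for t in ts]
for t, s in zip(ts, stats):
    print(f"t={t:<5} delta={s['delta']:.4f} zero={s['frac_zero']:.3f}")
```

Printing these fractions per layer for a real checkpoint is one way to check whether changing t actually changes the ternary pattern, or whether the error plateau comes from somewhere else.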