Why use -6.58 as the initialization of bias #20

Open
Beanocean opened this issue Dec 14, 2018 · 3 comments

Comments

@Beanocean

bias_filler { type: "constant" value: -6.58 }

Hi Saurabh,
I have recently been trying to port this code from Caffe to PyTorch. While tuning the fully convolutional network, I came across an interesting trick in the code.
The bias of the classification layer has to be initialized to -6.58; otherwise the optimization goes astray. For example, if the bias is initialized to zero, the model does not converge at all. Why did you choose this value for the initialization, and how did you find it?
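
For readers porting the same layer, here is a minimal sketch of what the constant means numerically. It assumes the classification layer feeds a sigmoid, which this thread does not confirm. Under that assumption, a constant bias fixes the initial predicted probability when the weighted input is near zero. The helper names `sigmoid` and `bias_for_prior` are hypothetical and not part of the original code.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function: maps a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def bias_for_prior(p: float) -> float:
    """Logit whose sigmoid equals p, i.e. log(p / (1 - p))."""
    return math.log(p / (1.0 - p))


BIAS = -6.58  # value taken from the bias_filler line in this issue

# Assumption: with near-zero weighted input, the layer's initial output
# probability is simply sigmoid(bias).
initial_prob = sigmoid(BIAS)
zero_bias_prob = sigmoid(0.0)

print(f"sigmoid({BIAS}) = {initial_prob:.6f}")
print(f"sigmoid(0.0) = {zero_bias_prob:.1f}")
print(f"bias_for_prior({initial_prob:.6f}) = {bias_for_prior(initial_prob):.2f}")
```

If that reading holds, a zero bias would start every output at probability 0.5, while -6.58 starts each output close to zero. In PyTorch, the constant filler itself could be reproduced with `torch.nn.init.constant_(layer.bias, -6.58)`, where `layer` stands for whatever module holds the classification layer in the port.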

@mememimis

Did you port this code to PyTorch?

@Beanocean
Author

Beanocean commented Mar 25, 2019

> Did you port this code to PyTorch?

@mememimis, I have ported it to PyTorch, but I still have some trouble with it. I will release the code later, and I hope you can help refine it.

@lifeGWT

lifeGWT commented Mar 2, 2020

> > Did you port this code to PyTorch?
>
> @mememimis, I have ported it to PyTorch, but I still have some trouble with it. I will release the code later, and I hope you can help refine it.

Can you share your PyTorch code? Thank you very much.
