Why use -6.58 as the initialization of bias #20

Beanocean · 2018-12-14T07:02:03Z

visual-concepts/output/v1/mil_finetune.prototxt

Line 184 in 0e22363

bias_filler { type: "constant" value: -6.58 }

Hi Saurabh,
Recently, I have been trying to port this code from Caffe to PyTorch. While tuning the model as a fully convolutional network, I found a very interesting trick in the code.
The bias of the classification layer has to be set to -6.58; otherwise the optimization is misled. For example, if this value is initialized to zero, the model does not even converge. So I would like to know why you use this value as the initialization and how you found it.
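
For anyone attempting the same port, here is a minimal sketch of how the constant `bias_filler` could be reproduced in PyTorch. The layer sizes and the names `NUM_FEATURES`, `NUM_CLASSES`, and `classifier` are hypothetical and not taken from the repository; only the value -6.58 comes from the prototxt. The sketch also assumes the classification layer feeds a sigmoid. Under that assumption the bias corresponds to an initial per-location probability of about 0.0014, which is one possible reading of the value and is not confirmed by the author in this thread.

```python
import math

import torch
import torch.nn as nn

# Value taken from bias_filler in mil_finetune.prototxt.
BIAS_INIT = -6.58

# Hypothetical sizes, chosen only for illustration.
NUM_FEATURES = 512
NUM_CLASSES = 1000

# A 1x1 convolution standing in for the classification layer of the
# fully convolutional network (an assumption about the ported model).
classifier = nn.Conv2d(NUM_FEATURES, NUM_CLASSES, kernel_size=1)

# Equivalent of Caffe's: bias_filler { type: "constant" value: -6.58 }
nn.init.constant_(classifier.bias, BIAS_INIT)

# If the layer output goes through a sigmoid, this is the probability
# predicted when the weighted input is zero (about 0.0014).
initial_prob = torch.sigmoid(torch.tensor(BIAS_INIT)).item()


def bias_for_prior(prior: float) -> float:
    """Return the bias (logit) whose sigmoid equals the given prior."""
    return math.log(prior / (1.0 - prior))


print(f"initial probability: {initial_prob:.5f}")
print(f"bias recovered from that probability: {bias_for_prior(initial_prob):.2f}")
```

With a zero bias the same layer would start at a probability of 0.5 for every class at every location, which may help explain the convergence problem described above, although that too is an interpretation rather than something stated in the source.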

mememimis · 2019-03-21T16:09:39Z

Did you port this code to PyTorch?

Beanocean · 2019-03-25T04:33:37Z

Did you export this code to pytorch?

@mememimis, I have ported it to PyTorch, but I still have some trouble with it. I will release the code later; I hope you can help refine it.

lifeGWT · 2020-03-02T09:48:13Z

Did you export this code to pytorch?

@mememimis, I have exported it to Pytorch. But still, have some troubles. I will release the code later, hope you can help to refine it together.

Can you share your PyTorch code? Thank you very much.
