Loss is not decreasing #43
Comments
Did you load the pre-trained weights? It works fine with my dataset. |
Or maybe you didn't set the mode (train or test) correctly in the config file. |
@jinfagang Have you solved the problem? I have the same issue. @1453042287 I trained yolov2-mobilenet-v2 from scratch. You mentioned a 'pre-trained model': do you mean only the pre-trained backbone network (such as mobilenetv2), or both the backbone and the detection model? In my training, none of the parameters are pre-trained. |
@blueardour First, make sure you change PHASE in the .yml file to 'train'. Second, I believe it's inappropriate to train a model from scratch, so you should at least load the pre-trained backbone. I use the whole pre-trained weight file the author provided (including the backbone, the extra layers and so on), but I set RESUME_SCOPE in the .yml file to 'base' only, and the result is almost the same as with fine-tuning. |
@1453042287 Hi, thanks for the advice. My current training seems to be working. My only remaining problem is test speed: the NMS in the test procedure seems very slow. It has been discussed in #16, but there is no good solution yet. |
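For readers unfamiliar with the NMS step mentioned above, here is a minimal pure-Python sketch of greedy non-maximum suppression. It is not the code from this repo. The `score_thresh` and `top_k` parameters, and their default values, are assumptions added for illustration: greedy NMS compares each candidate against every box already kept, so limiting the number of candidates is one common way to bound its cost. Whether that is the cause of the slowdown discussed in #16 is not confirmed in this thread.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thresh=0.5, score_thresh=0.01, top_k=200):
    """Greedy NMS; returns indices of kept boxes, highest score first.

    score_thresh and top_k (illustrative defaults) shrink the candidate
    list before the pairwise IoU comparisons start.
    """
    order = [i for i, s in enumerate(scores) if s > score_thresh]
    order.sort(key=lambda i: scores[i], reverse=True)
    order = order[:top_k]
    keep = []
    for i in order:
        # Keep box i only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep


# Made-up boxes: 0 and 1 overlap heavily, 2 is separate, 3 has a tiny score.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 10, 10)]
scores = [0.9, 0.8, 0.7, 0.001]
kept = nms(boxes, scores)
```

In this toy input, box 1 is suppressed by box 0 and box 3 is dropped by the score filter before any IoU is computed.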
@blueardour Hi, below is my test result for fssd_mobilenet_v2 on coco2017, using my own config files instead of the given ones and training from scratch without any pre-trained model.
|
OK... it seems that training from scratch might not be well supported. |
Yes, making all parameters trainable seems to make it hard to converge. This year, Mr He published a paper named 'Rethinking ImageNet Pre-training', which claims that pre-training on ImageNet is not necessary. However, it takes skill to give the network a good initialization. |
Yes, agree with you. |
Hi @1453042287 @cvtower, I have another issue, about the training precision and loss curves. The following is the result from tensorboardX. It can be seen that the precision increases slowly and then jumps at around the 89th epoch. I don't know why the precision changes so dramatically at this point. The loc and cls losses, as well as the learning rate, do not seem to change much. Have you observed a similar phenomenon, or do you have any explanation for it? |
Hi @blueardour, I did not use the CosineAnnealing LR and no such phenomenon ever happened during training. |
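For reference, the cosine annealing schedule mentioned above can be sketched in a few lines. This is the standard closed form (no warm restarts), not code from this repo. The function name, the 100-epoch horizon and the 0.01 peak learning rate are assumptions chosen for illustration, since the thread does not state the actual settings. The sketch only shows how the learning rate evolves; it does not explain the precision jump.

```python
import math


def cosine_annealing_lr(epoch, total_epochs, lr_max, lr_min=0.0):
    """Learning rate at `epoch` for a single cosine decay from lr_max to lr_min."""
    # Clamp so that epochs outside [0, total_epochs] stay at the end points.
    progress = min(max(epoch, 0), total_epochs) / total_epochs
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


# Assumed settings: 100 epochs, peak learning rate 0.01, decay to 0.
schedule = [cosine_annealing_lr(e, 100, 0.01) for e in range(101)]
```

The schedule starts at `lr_max`, passes through the midpoint value halfway, and flattens out near `lr_min` in the last epochs, so the learning rate changes only slowly late in training.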
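Hello, may I ask how you obtained the pre-trained weight file provided by the author? I don't have a weights directory, so I have no pre-trained weight file. Or did you get it some other way? Thank you! @1453042287 |
@XiaSunny Just download it... It's in this repo's README, the text in blue. |
@1453042287 OK, thank you. |
Hello, the config file I use is fssd_vgg16_train_coco.yml. When I train on coco2017, conf_loss stays around 5 and loc_loss around 2, and they never go down. My config file is as follows: TRAIN: TEST: MATCHER: POST_PROCESS: DATASET: EXP_DIR: './experiments/models/fssd_vgg16_coco' |
@XiaSunny Hello, I ran into the same problem. Have you solved it? |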
@1453042287 @XiaSunny Hello, I would like to use the pre-trained model.
TRAINABLE_SCOPE: 'base,norm,extras,loc,conf'
RESUME_SCOPE: 'base,norm,extras,loc,conf'
How should I change these parameters? Thanks!
|
TRAINABLE_SCOPE is the set of modules that will be trained. RESUME_SCOPE is the set of modules whose weights are restored from the pre-trained model. First, you should remove conf (because the number of classes is different). For the rest, check whether anything else needs to change for your situation.
|
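The effect of RESUME_SCOPE described above can be illustrated with a small pure-Python sketch. It is not the loader used by this repo. The function name, the parameter names such as `base.0.weight`, and the shapes are all hypothetical, and plain shape tuples stand in for real tensors. It assumes that the scope is matched against the first component of each parameter name, and that conf is removed from RESUME_SCOPE specifically, since that is the list that controls what is restored.

```python
def select_resume_weights(pretrained, target_shapes, resume_scope):
    """Split checkpoint entries into those to restore and those to skip.

    pretrained and target_shapes map parameter name -> shape tuple.
    resume_scope is a comma-separated string such as 'base,norm,extras,loc'.
    """
    scopes = {s.strip() for s in resume_scope.split(",") if s.strip()}
    kept, skipped = {}, []
    for name, shape in pretrained.items():
        module = name.split(".", 1)[0]  # e.g. 'conf' from 'conf.0.weight'
        # Restore only entries inside the scope whose shape also matches.
        if module in scopes and target_shapes.get(name) == shape:
            kept[name] = shape
        else:
            skipped.append(name)
    return kept, skipped


# Hypothetical checkpoint trained with 81 classes and 6 anchors per location.
pretrained = {
    "base.0.weight": (32, 3, 3, 3),
    "extras.0.weight": (256, 512, 1, 1),
    "loc.0.weight": (24, 512, 3, 3),
    "conf.0.weight": (486, 512, 3, 3),  # 6 anchors * 81 classes
}
# Hypothetical target model with 21 classes: only the conf head differs.
target = dict(pretrained)
target["conf.0.weight"] = (126, 512, 3, 3)  # 6 anchors * 21 classes

kept, skipped = select_resume_weights(pretrained, target, "base,norm,extras,loc")
```

With conf left out of the scope, the classification head keeps its fresh initialization and is learned on the new dataset, while the backbone, extra layers and localization head start from the pre-trained values.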
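Hello, I recently ran into the same problem in training: the loss does not decrease and stays at around 4. I downloaded the model and made no modifications, only reloading base for training. How did you finally solve it? Many thanks~ |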
I have trained SSD with mobilenetv2 on VOC, but after almost 500 epochs the loss is still like this:
It doesn't change, and the loss is very high... What's the problem with the implementation?