for a good result #5

Open
mjanddy opened this issue Dec 3, 2019 · 26 comments

Comments


mjanddy commented Dec 3, 2019

Hi, would you mind telling me how your model was trained? I couldn't reproduce your model's results using the code.

Owner

reshow commented Dec 3, 2019

If you run the code directly and correctly, the result will be slightly worse than mine (2D landmark NME of about 3.30±0.03), since the network has fewer parameters than the one in PRN's paper.
To achieve good performance, I used several data augmentation methods that differ from PRN's, such as random erasing and Gaussian blur. These methods are somewhat arbitrary, so I removed them from my code.
Another option is to increase the number of parameters in the network. Here I use exactly the same network structure as PRN's released model. That model is 52MB, while the model in their paper is more than 150MB. I'm not sure about this part.
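
For readers unfamiliar with the two augmentations named above, here is a minimal, dependency-free sketch of random erasing and a 3x3 Gaussian blur on a grayscale image stored as a nested list. It is not the repository's augmentation code: the function names, the `max_frac` parameter, and the kernel size are assumptions made for illustration, and a real pipeline would normally work on NumPy arrays or tensors.

```python
import random

def random_erase(img, max_frac=0.3, fill=0.0, rng=random):
    """Return a copy of img (2-D list of floats) with one random rectangle set to fill.

    max_frac (hypothetical parameter) bounds the rectangle's height and width
    as a fraction of the image size.
    """
    h, w = len(img), len(img[0])
    eh = rng.randint(1, max(1, int(h * max_frac)))
    ew = rng.randint(1, max(1, int(w * max_frac)))
    top = rng.randint(0, h - eh)
    left = rng.randint(0, w - ew)
    out = [row[:] for row in img]  # copy so the input is left untouched
    for y in range(top, top + eh):
        for x in range(left, left + ew):
            out[y][x] = fill
    return out

# 3x3 Gaussian kernel; the weights sum to 1, so overall brightness is preserved.
KERNEL = [[1 / 16, 2 / 16, 1 / 16],
          [2 / 16, 4 / 16, 2 / 16],
          [1 / 16, 2 / 16, 1 / 16]]

def gaussian_blur3(img):
    """Blur img with the 3x3 kernel, replicating edge pixels at the border."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += KERNEL[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc
    return out

# Example: apply both augmentations to a uniform 10x10 image.
image = [[1.0] * 10 for _ in range(10)]
augmented = gaussian_blur3(random_erase(image, rng=random.Random(0)))
```

In training, each augmentation would typically be applied with some probability per sample, so the network also sees unmodified images.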

Author

mjanddy commented Dec 3, 2019

I trained your code for 30 epochs but only got a 2D NME of 3.8.

Owner

reshow commented Dec 3, 2019

How about the NME on training data?

Author

mjanddy commented Dec 3, 2019

I didn't test on the training data.

Owner

reshow commented Dec 3, 2019

Sorry, I mean the printed 'metrics0' values for the training dataset and the evaluation dataset.

Author

mjanddy commented Dec 3, 2019

I'm sorry, I didn't record it.

Author

mjanddy commented Dec 3, 2019

I generated the dataset with the official generation method, not with yours. Does this affect the results?

Author

mjanddy commented Dec 3, 2019

I reloaded the model and got this result:

[epoch:0, iter:111/7653, time:51] Loss: 0.1049 Metrics0: 0.0379

Owner

reshow commented Dec 3, 2019

I didn't try it. There are some differences between our generation codes but I don't think they will affect the performance.

The metrics0 should reach 0.03 in less than 10 epochs.

Try using my generation code.

Also try changing line 96 of torchmodel.py as follows, and remember to record metrics0:

scheduler_exp = optim.lr_scheduler.ExponentialLR(self.optimizer, 0.9)
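
With `gamma=0.9`, `ExponentialLR` multiplies the learning rate by 0.9 each time the scheduler is stepped, normally once per epoch. A minimal sketch of the resulting schedule in plain Python, assuming a base learning rate of 1e-4 (the thread does not state the value in use at this point):

```python
def exponential_lr(base_lr, gamma, epoch):
    """Learning rate after `epoch` scheduler steps: base_lr * gamma ** epoch."""
    return base_lr * gamma ** epoch

BASE_LR = 1e-4  # assumed for illustration; not taken from the repository
GAMMA = 0.9     # the value passed to ExponentialLR above

# Print the learning rate at a few epochs of a 30-epoch run.
for epoch in (0, 5, 10, 20, 30):
    print(f"epoch {epoch:2d}: lr = {exponential_lr(BASE_LR, GAMMA, epoch):.3e}")
```

Under these assumptions the learning rate has fallen to roughly 4% of its starting value by epoch 30, which is worth keeping in mind when judging whether further epochs can still improve the result.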

Author

mjanddy commented Dec 3, 2019

OK, I will try it. Thanks a lot.

Author

mjanddy commented Dec 4, 2019

I followed all of your code, but the results are still not good.

Here is the result:

[epoch:29, iter:7654/7653, time:1802] Loss: 0.0329 Metrics0: 0.0130

nme2d 0.04015569452557179
nme3d 0.054406630244023056
landmark2d 0.043106316771823916
landmark3d 0.05833802395872772

Look forward to your reply.
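
As background for the numbers above, normalized mean error (NME) is usually the mean point-to-point Euclidean distance between predicted and ground-truth points, divided by a normalization length. The sketch below is a generic illustration, not the repository's evaluation code: the thread does not say which normalization is used, so the bounding-box size `sqrt(width * height)` and the helper names here are assumptions.

```python
import math

def bbox_norm(points):
    """Assumed normalization: sqrt(width * height) of the ground-truth bounding box."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return math.sqrt((max(xs) - min(xs)) * (max(ys) - min(ys)))

def nme(pred, gt, norm):
    """Mean Euclidean distance between matching points, divided by `norm`."""
    if len(pred) != len(gt) or not gt:
        raise ValueError("pred and gt must be non-empty and the same length")
    total = sum(math.dist(p, g) for p, g in zip(pred, gt))
    return total / (len(gt) * norm)

# Toy example: four landmarks, each prediction off by 0.5 pixels.
gt = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
pred = [(x + 0.3, y + 0.4) for x, y in gt]
print(nme(pred, gt, bbox_norm(gt)))
```

In this toy case every point is 0.5 pixels off and the box is 10 pixels on a side, so the NME is about 0.05, the same order of magnitude as the values reported in this thread.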

Owner

reshow commented Dec 4, 2019

The result on the training set is good, even better than mine, but the evaluation result is bad.
I guess this is because I removed some of the augmentation code. Please give me an email address and I'll send it to you.
I'll update it right now.

Owner

reshow commented Dec 4, 2019

It is my email: [email protected] .

I've updated it. Sorry for the trouble.

Author

mjanddy commented Dec 4, 2019

thanks

Author

mjanddy commented Dec 9, 2019

I'm sorry to bother you again. I used your augmentation code and trained for about 45 epochs, but only got:

nme2d 0.03363224604973234
nme3d 0.04689772832815957

The loss no longer decreases. Is this normal?

Owner

reshow commented Dec 9, 2019

I trained it again myself and got nme3d 0.0445 in 30 epochs.
I don't know what causes this difference.
You can try another learning rate scheduler in the code:

self.scheduler = optim.lr_scheduler.StepLR(self.optimizer, step_size=5, gamma=0.5)

and set the learning rate to 2.5e-5.

I used this scheduler a long time ago; it takes more epochs.
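
`StepLR` with `step_size=5, gamma=0.5` keeps the learning rate constant for five epochs and then halves it. A minimal plain-Python sketch of that schedule, using the 2.5e-5 starting rate suggested above (the helper name `step_lr` and the 30-epoch horizon are chosen here for illustration):

```python
def step_lr(base_lr, step_size, gamma, epoch):
    """Learning rate at `epoch`: multiplied by gamma once every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)

# Schedule for a 30-epoch run with the settings from the comment above.
schedule = [step_lr(2.5e-5, 5, 0.5, epoch) for epoch in range(30)]

# Show only the epochs where the learning rate changes.
for epoch in range(0, 30, 5):
    print(f"epochs {epoch:2d}-{epoch + 4:2d}: lr = {schedule[epoch]:.3e}")
```

Compared with a smooth per-epoch decay, this schedule holds each rate for a full five-epoch block before dropping.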

Author

mjanddy commented Dec 9, 2019

How many epochs did you train to get nme2d=0.031?

Owner

reshow commented Dec 9, 2019

Before changing the scheduler, I suggest adjusting the learning rate first, increasing or decreasing it tenfold, to see whether the result improves.

Owner

reshow commented Dec 9, 2019

> How many epochs did you train to get nme2d=0.031?

I don't remember, but 45 epochs is enough.

Author

mjanddy commented Dec 9, 2019

I first trained for 30 epochs with lr=2e-4 and got nme2d 0.0345, then decreased lr to 2e-5 and retrained for 45 epochs, getting nme2d 0.0336.

Owner

reshow commented Dec 9, 2019

That's strange. Could you train it from scratch with an even smaller learning rate (lr=8e-6)? My intuition is that it will help.

Author

mjanddy commented Dec 9, 2019

OK, I will try it.

Author

mjanddy commented Dec 11, 2019

Excuse me again. If I use RandomColor from your augmentation code, the NME always stays around 0.04 and can't drop to 0.03. Is this normal?

Author

mjanddy commented Dec 12, 2019

Also, if I train from scratch with a smaller learning rate (lr=8e-6), the NME drops more slowly than before (lr=1e-4).

Owner

reshow commented Dec 15, 2019

I don't use the RandomColor function in practice, so you can ignore it.
If you use a smaller learning rate, does it eventually reach a good result? If training is unbearably slow, you could try strategies such as warm-up (I don't actually use it myself).
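
For reference, a common form of warm-up ramps the learning rate linearly from a small value up to the target rate over the first training steps, then holds it there. The sketch below is a generic illustration in plain Python; the function name, the 1e-4 target rate, and the 500-step warm-up length are assumptions, not settings from the repository.

```python
def warmup_lr(base_lr, step, warmup_steps):
    """Linear warm-up: scale base_lr by (step + 1) / warmup_steps, capped at 1."""
    if warmup_steps <= 0:
        return base_lr  # warm-up disabled
    return base_lr * min(1.0, (step + 1) / warmup_steps)

BASE_LR = 1e-4       # assumed target learning rate
WARMUP_STEPS = 500   # assumed warm-up length, in iterations

# Print the learning rate at a few iterations during and after warm-up.
for step in (0, 99, 249, 499, 1000):
    print(f"step {step:4d}: lr = {warmup_lr(BASE_LR, step, WARMUP_STEPS):.3e}")
```

In PyTorch, a scale factor like this can be supplied to `torch.optim.lr_scheduler.LambdaLR` as its `lr_lambda` argument, with the scheduler stepped once per iteration rather than once per epoch.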

Author

mjanddy commented Dec 16, 2019

I didn't get a good result with either a smaller learning rate or optim.lr_scheduler.StepLR. The best result is nme2d 0.0336.
