Correctly save DP/DDP checkpoints #9

danieltudosiu · 2022-06-27T10:29:42Z

The correct way to save DP/DDP checkpoints is to access the module attribute of the wrapper class and save the state dict of the underlying model.

Please do that instead of saving the state dict of the whole DP/DDP wrapper and then trimming the key names.
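
To illustrate the request, here is a minimal sketch of saving through the wrapper's module attribute. The helper name `unwrap_model`, the toy `nn.Linear` model, and the in-memory buffer are assumptions made for this example and are not taken from the repository's code.

```python
import io

import torch
from torch import nn
from torch.nn.parallel import DistributedDataParallel


def unwrap_model(model: nn.Module) -> nn.Module:
    """Return the wrapped module for DP/DDP models, or the model itself otherwise.

    Hypothetical helper for illustration only.
    """
    if isinstance(model, (nn.DataParallel, DistributedDataParallel)):
        return model.module
    return model


# Toy model standing in for the real network (assumption).
net = nn.Linear(4, 2)
wrapped = nn.DataParallel(net)

# Keys taken from the wrapper carry the "module." prefix.
wrapped_keys = list(wrapped.state_dict().keys())
# Keys taken from the underlying module do not.
clean_keys = list(unwrap_model(wrapped).state_dict().keys())

# Save the unwrapped state dict; a file path would work the same way as this buffer.
buffer = io.BytesIO()
torch.save(unwrap_model(wrapped).state_dict(), buffer)

# The checkpoint loads into a plain, unwrapped model with no key renaming.
buffer.seek(0)
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(buffer))
```

A checkpoint saved this way can be loaded whether or not the model is later wrapped in DP/DDP, as long as it is loaded into the underlying module.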

kayhan-batmanghelich · 2022-06-27T15:03:45Z

Please fork the repo, make the changes, and open a pull request. Li will look into it.


lisun-ai · 2022-06-27T22:09:20Z

Hi there,

Thanks for your message. We followed the official PyTorch ImageNet training code, which saves the state dict of the DP/DDP wrapper class. Trimming the name prefix is a method commonly adopted in other repos. We will add an annotation to this part. If you have further concerns, please open a pull request.

Thanks,
Li
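
For readers unfamiliar with the prefix-trimming approach mentioned above, a rough sketch follows. The function name `strip_module_prefix` and the placeholder state dict are assumptions for illustration; this is not the repository's actual implementation.

```python
from collections import OrderedDict
from typing import Any, Mapping


def strip_module_prefix(
    state_dict: Mapping[str, Any], prefix: str = "module."
) -> "OrderedDict[str, Any]":
    """Return a copy of state_dict with a leading prefix removed from each key.

    Keys without the prefix are kept unchanged. Hypothetical helper.
    """
    return OrderedDict(
        (key[len(prefix):] if key.startswith(prefix) else key, value)
        for key, value in state_dict.items()
    )


# Placeholder values stand in for tensors saved from a DP/DDP wrapper.
saved = OrderedDict(
    [("module.conv.weight", 1), ("module.conv.bias", 2), ("step", 3)]
)
trimmed = strip_module_prefix(saved)
```

This works, but every consumer of the checkpoint has to remember the trimming step, which is the motivation for saving the unwrapped state dict in the first place.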

danieltudosiu · 2022-06-27T22:17:39Z

Hi Li,

From my knowledge of Ignite/MONAI, that is not the cleanest way. Ignite handles DP/DDP models in its checkpoint handler here:

https://github.com/pytorch/ignite/blob/master/ignite/handlers/checkpoint.py#L463

Cheers,

Dan
