Question on training time with ResNet18 #4

Open
jackzhan01 opened this issue Dec 25, 2024 · 0 comments

I am intrigued by your work, which demonstrates the effectiveness of ZO optimization for training large-scale models. Your main CIFAR-10 experiments with ResNet20 take approximately 60 minutes per epoch (a result I have successfully replicated).
However, your framework, DeepZero, uses CGE (coordinate-wise gradient estimation), so the number of forward passes needed per gradient estimate grows linearly with the model size. In Appendix D, Table A3, you report training ResNet18, whose model size is approximately ten times larger than ResNet20. How long did it take to train ResNet18 with the DeepZero framework?
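
To make the scaling concern concrete, here is a minimal sketch of how I understand the cost, assuming CGE is a plain forward-difference estimate taken over every coordinate. This is not DeepZero's actual code: `cge_gradient` and `quadratic` are hypothetical names, the loss is a toy function, and the sketch ignores any sparsity or parallelization the framework may apply.

```python
def cge_gradient(f, theta, mu=1e-3):
    """Coordinate-wise forward-difference gradient estimate (illustrative only).

    Returns the estimate and the number of calls made to f.
    """
    base = f(theta)          # one query at the unperturbed point
    queries = 1
    grad = []
    for i in range(len(theta)):
        perturbed = list(theta)
        perturbed[i] += mu   # perturb a single coordinate
        grad.append((f(perturbed) - base) / mu)
        queries += 1         # one extra query per coordinate
    return grad, queries


def quadratic(theta):
    # Toy loss standing in for a network's forward pass; its gradient is theta.
    return 0.5 * sum(t * t for t in theta)


for d in (10, 100, 1000):
    theta = [0.1] * d
    _, queries = cge_gradient(quadratic, theta)
    print(d, queries)        # queries is d + 1 for each d
```

Under these assumptions, one gradient estimate costs d + 1 forward passes for d parameters, which is why I would expect the per-epoch time to grow roughly in proportion to the parameter count.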
