test speed #3

Open
jjn037 opened this issue Jan 31, 2018 · 3 comments

jjn037 commented Jan 31, 2018

Have you tested the speed? I get a lower speed (30 ms/img) with resnet18, 224*224 input, batch size 1.

Author

jjn037 commented Jan 31, 2018

auto output_tensor = CPU(kByte).tensorFromBlob(data, {output_height, output_width, 3});

This line takes an abnormally long time.

Owner

warmspringwinds commented Feb 6, 2018

Sorry for the late reply

@jjn037 This piece of code is slow because it transfers the data from the GPU to the CPU. This is usually an expensive operation and should be slow in the original PyTorch too.

It would be great if you could compare the timing of the C++ line with the PyTorch equivalent, output.cpu(), and see whether there is a significant difference in runtime.
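
For the comparison suggested here, a small wall-clock timer is usually enough. The following is only a minimal sketch: `mean_time_ms` and `copy_image_sized_buffer` are hypothetical names, not part of pytorch-cpp or ATen, and the workload is a plain host-side buffer copy standing in for the real line under test. If the timed call follows GPU work, note that CUDA kernel launches are asynchronous, so the measured line may also absorb the wait for earlier kernels unless the device is synchronized first (for example with `cudaDeviceSynchronize()`).

```cpp
#include <cassert>
#include <chrono>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of pytorch-cpp or ATen): returns the mean
// wall-clock time of `fn` in milliseconds over `runs` timed calls, after
// `warmup` untimed calls that let caches and allocators settle.
template <typename Fn>
double mean_time_ms(Fn&& fn, int runs = 100, int warmup = 10) {
    assert(runs > 0);
    for (int i = 0; i < warmup; ++i) fn();
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < runs; ++i) fn();
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / runs;
}

// Stand-in workload (an assumption for illustration): copies a buffer the
// size of one 224x224x3 byte image and returns the sum of its bytes.
// Replace the body with the line you actually want to measure.
inline unsigned long copy_image_sized_buffer() {
    static const std::vector<unsigned char> src(224 * 224 * 3, 1);
    std::vector<unsigned char> dst(src);
    return std::accumulate(dst.begin(), dst.end(), 0UL);
}
```

Calling `mean_time_ms([] { copy_image_sized_buffer(); })` gives the average time per call in milliseconds. Timing the PyTorch `output.cpu()` call with the same warmup and run counts keeps the two numbers comparable.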

@warmspringwinds
Owner

FYI, I have just added a file with a speed benchmark:
https://github.com/warmspringwinds/pytorch-cpp/blob/master/examples/resnet_18_8s_benchmark.cpp
