Options to increase fps #8

Open
carstenschwede opened this issue Jan 12, 2017 · 22 comments

Comments

@carstenschwede

Are there any options to increase fps besides reducing resolution or adding GPUs? Is it possible to restrict detection to certain joints (e.g. Heads) in order to speed up processing?

@ZheC
Member

ZheC commented Jan 12, 2017

(1) Use the MPI model instead of the COCO model. (2) Use a single scale for testing. Both reduce processing time. Restricting the detections does not help, because the CNN still has to run the same trained model, so the forward-pass time stays the same.
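
As a rough illustration of the single-scale suggestion: the forward cost of a fully convolutional network grows roughly with the number of input pixels, so each additional test scale adds cost in proportion to its squared scale factor. The sketch below assumes that proportionality, and the scale values are hypothetical; neither is taken from the rtpose code.

```python
def relative_multiscale_cost(scales):
    """Cost of one forward pass per scale, relative to a single pass at scale 1.0.

    Assumption: cost is proportional to pixel count, i.e. to scale squared.
    """
    return sum(s * s for s in scales)


# Hypothetical scale set, for illustration only.
scales = [0.5, 1.0, 1.5, 2.0]
cost = relative_multiscale_cost(scales)
print(f"{len(scales)} scales cost about {cost:.2f}x a single pass at scale 1.0")
```

Under this assumption, dropping the larger scales saves the most, since they dominate the sum.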

@ZheC
Member

ZheC commented Jan 13, 2017

Another option is to modify text.prototxt and reduce the number of stages from 6 to 3.

@carstenschwede
Author

Thanks, I will try both. Any idea of what kind of speedup I could expect?

@Warden7

Warden7 commented Feb 13, 2017

Hi, I have a question about the fps. I run the rtpose demo on an AWS p2.large instance (with one K80 GPU, 24 GB), but it takes 1.1 s to process a frame.
Could this be because the K80 has a compute capability of 3.7, lower than the 6.1 of the GTX 1080?

@gineshidalgo99
Member

This is a preliminary benchmark we made with the new version we are working on (it will be released in around 1 month). The current version you are using should be around 25-30% slower. Let me know if you are using the same flags. If so, are you using cuDNN 5.1? Older versions of cuDNN might also slow down the program. Thanks!

Current benchmark:
https://docs.google.com/spreadsheets/d/1-DynFGvoScvfWDA1P4jDInCkbD4lg0IKOYbXgEq0sK0/edit#gid=0

@wangzhangup

@Warden7
Their peak compute throughput is similar:
K80: 8.73 TFLOPS
1080: 9 TFLOPS

The low fps may have other causes.

@Warden7

Warden7 commented Feb 21, 2017

Thanks for your kind analysis. The cuDNN version is 5.0 and CUDA is 7.5. The "Volatile GPU-Util" field in the GPU information always shows 99%, even when nothing is running on the GPU. Maybe some further debugging is needed.

@wangzhangup

@Warden7
Kill the processes running on the GPU.
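
Before killing anything, it helps to see which processes are actually holding the GPU. The sketch below is one possible way to do that from Python, assuming `nvidia-smi` is on the PATH; the helper names `parse_compute_apps` and `list_gpu_processes` are made up for illustration, and the sample text only imitates the CSV shape that the query prints.

```python
import csv
import io
import subprocess

# nvidia-smi query that prints one CSV row per compute process.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader",
]


def parse_compute_apps(csv_text):
    """Parse 'pid, process_name, used_memory' rows into a list of dicts."""
    rows = []
    for fields in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        pid, name, mem = (f.strip() for f in fields)
        rows.append({"pid": int(pid), "name": name, "used_memory": mem})
    return rows


def list_gpu_processes():
    """Run nvidia-smi and return the compute processes it reports."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_compute_apps(out)


# Made-up sample output, so the parser can be tried without a GPU.
sample = "1234, python, 500 MiB\n5678, ./rtpose.bin, 3000 MiB\n"
for proc in parse_compute_apps(sample):
    print(proc["pid"], proc["name"], proc["used_memory"])
```

On a machine with an NVIDIA driver, `list_gpu_processes()` returns the same structure for the live system; any stale PID it shows can then be stopped with the usual OS tools.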

@gineshidalgo99
Member

Another way to speed it up is by using the new version (~25% faster):
https://github.com/CMU-Perceptual-Computing-Lab/openpose

@wangzhangup

Reduce the number of feature maps.
I changed the output number of the conv layers in stages 3-6 from 128 to 64. The result is as good as the original version, with a 25% speedup!
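
To see why narrowing the layers helps, note that the multiply-accumulate count of a convolution scales with the product of its input and output channels. The sketch below counts those operations for one layer; the kernel size and feature-map size are hypothetical values chosen for illustration, not read from the actual prototxt.

```python
def conv_macs(c_in, c_out, kernel, height, width):
    """Multiply-accumulate count of one stride-1 convolution layer (bias ignored)."""
    return kernel * kernel * c_in * c_out * height * width


# Hypothetical layer: 7x7 kernel on a 46x46 feature map.
full = conv_macs(128, 128, 7, 46, 46)
slim = conv_macs(64, 64, 7, 46, 46)
print(f"128->128: {full} MACs, 64->64: {slim} MACs, ratio {slim / full}")
```

A layer whose input and output widths are both halved needs a quarter of the operations. Layers that keep their width are unaffected, which would be consistent with an end-to-end gain much smaller than 4x.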

@carstenschwede
Author

@wangzhangup Thanks, can you try your modification also on the newer version at https://github.com/CMU-Perceptual-Computing-Lab/openpose? Would be interesting to see what overall speedup you are able to get.

@carstenschwede
Author

@gineshidalgo99 Thanks for the update!

@gineshidalgo99
Member

@wangzhangup Thank you so much for your idea! Could you please email me at [email protected] to discuss how you did it in more detail? We are interested in adding it to our system if that is OK with you!

@wangzhangup

@gineshidalgo99 OK!

@wangzhangup

@gineshidalgo99 @carstenschwede Here is the speedup model: https://drive.google.com/open?id=0B-SxboVJxF-WNmtpWGc5emZrRDg

@gineshidalgo99
Member

gineshidalgo99 commented May 10, 2017

@wangzhangup The speedup is impressive. The accuracy does decrease a bit, but that is a fine trade-off for such a large speedup. Do you mind if I add it to the new OpenPose? (On my desktop I went from 14 to 20 fps and from 30 to 22 mAP.) Alternatively, you can make a pull request with your new prototxt, and I will fix the other details (so you would appear as a contributor to OpenPose). Thanks!

https://github.com/CMU-Perceptual-Computing-Lab/openpose
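
The figures quoted above (14 to 20 fps, 30 to 22 mAP) can be put on a common footing with a little arithmetic. The helper below is only an illustrative sketch; the function name `tradeoff` and the choice of metrics are assumptions, not part of OpenPose.

```python
def tradeoff(fps_before, fps_after, map_before, map_after):
    """Summarize a speed/accuracy trade-off from before/after measurements."""
    return {
        "speedup": fps_after / fps_before,
        "ms_per_frame_before": 1000.0 / fps_before,
        "ms_per_frame_after": 1000.0 / fps_after,
        "relative_map_drop": (map_before - map_after) / map_before,
    }


# Numbers quoted in the comment above.
result = tradeoff(14, 20, 30, 22)
for key, value in result.items():
    print(f"{key}: {value:.3f}")
```

With these inputs the model is about 1.43x faster (roughly 71 ms down to 50 ms per frame), at the price of a relative mAP drop of about 27%.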

@carstenschwede
Author

@wangzhangup thanks for the model, impressive speedup!

@gineshidalgo99 is a similar speedup expected for the upcoming "extended" models at OpenPose (e.g. finger tracking)?

@gineshidalgo99
Member

@carstenschwede The speed up applies to the body pose, but finger tracking is made on top of it (you need to know the body location to detect the hand), so it will take advantage of it too if this model is used (I did not measure the accuracy impact yet though, I guess I will add both models: 1 for better accuracy and 1 for speed).

@carstenschwede
Author

carstenschwede commented May 11, 2017

> I guess I will add both models: 1 for better accuracy and 1 for speed

Sounds perfect. Can't wait to try out the finger detection.

@wangzhangup

@gineshidalgo99 Could you share your measurement code?

@gineshidalgo99
Member

It is still quite messy: it uses Matlab and C++, and it is not completely finished. I would prefer to wait until I finish it properly. Sorry!

@aakendi

aakendi commented Nov 20, 2018

> @carstenschwede The speed up applies to the body pose, but finger tracking is made on top of it (you need to know the body location to detect the hand), so it will take advantage of it too if this model is used (I did not measure the accuracy impact yet though, I guess I will add both models: 1 for better accuracy and 1 for speed).

I just tried finger tracking at 640x480, also with tracking 5, but I only get around 10 fps. Could you give me some advice?

6 participants