The TensorRT Difference #30
painebenjamin
started this conversation in
Show and tell
Many people haven't given TensorRT a shot yet. I understand: it's complicated, takes a long time to set up, and has some prohibitive hardware requirements - so what do you even really get out of it?
I made the claim of 50-100% faster inference, which sounds like a lot, but do I actually mean it? The answer is yes.

Using the new log window, you can follow along as your invocation executes, which includes a readout of how many iterations per second (it/s) it is executing.

When not using TensorRT, the fastest speed I can come up with on my 3090 Ti is 17.15 it/s. When using TensorRT, with the exact same settings, I reached a whopping 37.39 it/s - an astonishing 118% peak speed increase. When a TensorRT engine is hot (indicated by a green status indicator), you can get results in as little as three seconds.
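For anyone who wants to sanity-check that percentage, here's the arithmetic using the two it/s figures above:

```python
# Quick check of the claimed speedup, using the numbers from this post.
baseline_its = 17.15   # it/s without TensorRT (3090 Ti)
tensorrt_its = 37.39   # it/s with TensorRT, same settings

speedup = tensorrt_its / baseline_its        # ratio of the two rates
increase_pct = (speedup - 1) * 100           # percent increase over baseline

print(f"{speedup:.2f}x faster, a {increase_pct:.0f}% increase")
# prints "2.18x faster, a 118% increase"
```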
Not all parts of the execution are enhanced by TensorRT, so some phases will not see speed gains. However, 90%+ of the time spent during inference is during the denoising phase, which is TensorRT-powered.
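A rough way to see how that translates end-to-end: if (as assumed here from the figure above) ~90% of inference time is denoising and only that phase gets the ~2.18x boost, Amdahl's law gives an estimate of the overall gain. This is an illustrative sketch, not a measurement:

```python
# Hedged estimate: overall speedup when only the denoising phase is accelerated.
denoise_fraction = 0.90             # assumed share of total inference time
phase_speedup = 37.39 / 17.15       # ~2.18x, from the it/s numbers above

# Amdahl's law: overall = 1 / ((1 - f) + f / s)
overall = 1 / ((1 - denoise_fraction) + denoise_fraction / phase_speedup)
print(f"estimated end-to-end speedup: {overall:.2f}x")
# prints "estimated end-to-end speedup: 1.95x"
```

The un-accelerated 10% caps the total gain below the raw per-iteration speedup, which is why wall-clock improvement lands a bit under 2x even with a 118% faster denoising loop.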