The TensorRT Difference #30
painebenjamin
started this conversation in
Show and tell
Many people haven't given TensorRT a shot yet. I understand: it's complicated, takes a long time to set up, and has some prohibitive hardware requirements - so what do you even really get out of it?
I made the claim of 50-100% faster inference, which sounds like a lot, but do I actually mean it? The answer is yes.

Using the new log window, you can follow along as your invocation executes, which includes a readout of how many iterations per second (it/s) it is executing.

When not using TensorRT, the fastest speed I can come up with on my 3090 Ti is 17.15 it/s. When using TensorRT, with the exact same settings, I reached a whopping 37.39 it/s - an astonishing 118% peak speed increase. When a TensorRT engine is hot (indicated by a green status indicator), you can get results in as little as three seconds.
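For anyone who wants to sanity-check that percentage, here's the arithmetic using the two it/s figures above:

```python
# Quick check of the claimed speedup, using the numbers from this post.
baseline_its = 17.15   # it/s without TensorRT (3090 Ti)
tensorrt_its = 37.39   # it/s with TensorRT, same settings

speedup = tensorrt_its / baseline_its        # ratio of the two rates
increase_pct = (speedup - 1) * 100           # percent increase over baseline

print(f"{speedup:.2f}x faster, a {increase_pct:.0f}% increase")
# prints "2.18x faster, a 118% increase"
```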
Not all parts of the execution are enhanced by TensorRT, so some phases will not see speed gains. However, 90%+ of the time spent during inference is during the denoising phase, which is TensorRT-powered.
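A rough way to see how that translates end-to-end: if (as assumed here from the figure above) ~90% of inference time is denoising and only that phase gets the ~2.18x boost, Amdahl's law gives an estimate of the overall gain. This is an illustrative sketch, not a measurement:

```python
# Hedged estimate: overall speedup when only the denoising phase is accelerated.
denoise_fraction = 0.90             # assumed share of total inference time
phase_speedup = 37.39 / 17.15       # ~2.18x, from the it/s numbers above

# Amdahl's law: overall = 1 / ((1 - f) + f / s)
overall = 1 / ((1 - denoise_fraction) + denoise_fraction / phase_speedup)
print(f"estimated end-to-end speedup: {overall:.2f}x")
# prints "estimated end-to-end speedup: 1.95x"
```

The un-accelerated 10% caps the total gain below the raw per-iteration speedup, which is why wall-clock improvement lands a bit under 2x even with a 118% faster denoising loop.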