Describe the bug
Inference for local prompt-tuned models appears to be roughly an order of magnitude faster than for local text generation models (at least for some model sizes). This should not be the case when parameter counts and data types are the same.
Sample Code
WIP
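Until the WIP repro lands, a minimal out-of-Caikit timing sketch along these lines could serve as a starting point. It assumes Hugging Face transformers and peft; the base model ID and adapter path are placeholders, not the actual models under test:

```python
# Hypothetical out-of-Caikit repro sketch: compare generate() latency for a
# base causal LM vs. the same base model wrapped with a prompt-tuning adapter.
import time

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_MODEL = "bigscience/bloom-560m"            # placeholder base model
PEFT_ADAPTER = "/path/to/prompt_tuned_adapter"  # placeholder adapter path
PROMPT = "The quick brown fox"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
inputs = tokenizer(PROMPT, return_tensors="pt")

def time_generate(model, n_runs=5, max_new_tokens=20):
    """Average wall-clock seconds per generate() call over n_runs."""
    model.eval()
    with torch.no_grad():
        # Warmup call so one-time initialization doesn't skew the timings
        model.generate(**inputs, max_new_tokens=max_new_tokens)
        start = time.perf_counter()
        for _ in range(n_runs):
            model.generate(**inputs, max_new_tokens=max_new_tokens)
    return (time.perf_counter() - start) / n_runs

# Time the plain text generation model first, then wrap the same base
# weights with the prompt-tuning adapter and time that.
base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.float32)
print(f"text generation model: {time_generate(base):.3f}s per generate")

tuned = PeftModel.from_pretrained(base, PEFT_ADAPTER)
print(f"prompt-tuned model:    {time_generate(tuned):.3f}s per generate")
```

If the two numbers diverge here as well, the gap is upstream of Caikit NLP; if they are comparable, the consolidated generate path in Caikit NLP is the more likely culprit.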
Expected behavior
Text generation model inference speed is on par with or faster than that of prompt-tuned models.
Observed behavior
Text generation models are much slower than prompt-tuned models.
Additional context
Note that generate calls have been consolidated; it would be best to have a repro case both in and out of Caikit NLP.