max_new_tokens defaults to 1000. It can be set in ktransformers.local_chat via --max_new_tokens, but not in the server.
Please add a --max_new_tokens option to the ktransformers server so we can specify longer output lengths, and add more generation options (such as input context length).
Apologies for the inconvenience. If you’re building from source, you can modify the max_new_tokens parameter in ktransformers/server/backend/args.py. We will include this update in the next Docker release.
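For anyone patching from source in the meantime, a minimal sketch of what exposing the setting as a CLI flag could look like, using Python's standard argparse. The flag name and the 1000-token default come from this issue; everything else (the parser setup, help text, and how args.py actually wires arguments) is an assumption and may differ from the real ktransformers/server/backend/args.py:

```python
import argparse

# Hypothetical sketch: expose max_new_tokens as a server CLI flag.
# The actual argument wiring in ktransformers/server/backend/args.py may differ.
parser = argparse.ArgumentParser(description="ktransformers server (sketch)")
parser.add_argument(
    "--max_new_tokens",
    type=int,
    default=1000,  # default reported in this issue
    help="Maximum number of tokens to generate per request",
)

# Example: user overrides the default on the command line.
args = parser.parse_args(["--max_new_tokens", "4000"])
print(args.max_new_tokens)  # prints 4000
```

With a flag like this, the server-side generation config could read the value from the parsed args instead of a hard-coded constant.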
I just encountered this limitation. It would be even better if the REST API honored the maximum context length and maximum number of generation tokens.