Describe the bug
When calling LlamaCppGenerator.run() with generation_kwargs={"stream": True}, a TypeError ("'generator' object is not subscriptable") is raised at line 97, replies = [output["choices"][0]["text"]], because the create_completion function of the underlying llama-cpp-python module returns a generator object in this case.
To Reproduce
Reproducible whenever run() is called with generation_kwargs={"stream": True}, e.g.:
from haystack_integrations.components.generators.llama_cpp import LlamaCppGenerator

g = LlamaCppGenerator(
    model="llama.cpp/models/llama-2-7b-chat/ggml-models-Q4_K_M.gguf",
    n_ctx=2048,
    n_batch=128,
    model_kwargs={"verbose": False, "use_mlock": True},  # happens no matter the model_kwargs
)
g.warm_up()
g.run("The purpose of life is", generation_kwargs={"stream": True})
(This won't run as-is because of the model path on my machine, obviously.)
Expected behaviour
The underlying create_completion function returns a generator in this case, so the run function should do the same.
Fix suggestion
The easiest fix would probably be to return the generator object in this case.
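A rough sketch of what that could look like inside run() (the surrounding variable names such as updated_generation_kwargs are my guesses at the actual generators.py code, not verified):

output = self.model.create_completion(prompt=prompt, **updated_generation_kwargs)
if updated_generation_kwargs.get("stream"):
    # With stream=True, output is a generator of chunk dicts, so hand it back
    # to the caller instead of trying to index into it.
    return {"replies": output, "meta": []}
replies = [output["choices"][0]["text"]]
return {"replies": replies, "meta": [output]}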
Describe your environment (please complete the following information):
OS: Ubuntu Linux (WSL)
Haystack version: haystack_ai-2.0.1
Integration version: llama-cpp-haystack-0.3.0
This is unfortunately expected, as currently no components are capable of returning a streamable object. We're working on a solution in Haystack; when it's ready, we'll roll it out to all the integrations that need it.
Thanks for replying. Yes, I figured so. For anyone interested, I have a workaround for the time being: add something along the following lines to the run function of generators.py.
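Roughly (exact variable names may differ slightly from the actual generators.py; the idea is to drain the generator right after the create_completion call and join the streamed text into one reply so the rest of run() keeps working):

if updated_generation_kwargs.get("stream"):
    # output is a generator of chunk dicts when stream=True; collect the
    # streamed text pieces into a single reply.
    chunks = list(output)
    replies = ["".join(chunk["choices"][0]["text"] for chunk in chunks)]
    return {"replies": replies, "meta": chunks}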