Add streaming to acall #158
base: main
Conversation
And here's how you use it:

```python
import asyncio

# assumes OllamaClient and Generator are imported from the library
async def my_fn(stream_response=True):
    model_client = OllamaClient(host="http://localhost:11434")
    model_kwargs = {"model": "llama3.1", "stream": stream_response}
    generator = Generator(model_client=model_client, model_kwargs=model_kwargs)
    response = await generator.acall({"input_str": "What would happen if a lion and an elephant met three dogs and four hyenas?"})
    if stream_response:
        async for chunk in response.data:
            print(chunk, end='', flush=True)
    else:
        print(response.data)

y = asyncio.run(my_fn())
```

Running it produces output like:

> That's quite an interesting scenario! If a lion and an elephant were to meet with three dogs and four hyenas, I think the outcome would depend on various factors such as the size and ferocity of each individual animal. Initially, the lion might try to assert its dominance over the smaller animals (the three dogs). However, the presence of the elephant and the hyenas could change the dynamics. Elephants are known for their strength and memory, so they might be wary of the lion's intentions. The four hyenas, being scavengers and often seen as a pack, would likely try to capitalize on any potential chaos. They're known for their cunning and ability to work together, which could put them at an advantage in this scenario. The three dogs, depending on their breed and temperament, might either run away or try to join forces with the hyenas against the lion and elephant. Ultimately, it would likely be a chaotic scene, with each animal trying to protect itself. The size and strength of the lion and elephant would initially give them an advantage, but if the dogs and hyenas were able to work together, they might be able to drive out or distract the larger predators, creating an opportunity for escape or counterattack. Of course, this is all speculation, and in reality, such a scenario would likely play out differently depending on various factors like habitat, weather conditions, and individual animal personalities!
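For reference, here is a minimal sketch (not the PR's actual implementation) of how the streaming path inside `OllamaClient.acall` could work, assuming the `ollama` package's `AsyncClient`; the function name `acall_sketch` and the `host` parameter are illustrative:

```python
from ollama import AsyncClient

# Hypothetical sketch: with stream=True, ollama's async generate returns an
# async generator of chunk dicts, which acall can pass straight back so the
# caller can iterate over it with `async for`.
async def acall_sketch(api_kwargs: dict, host: str = "http://localhost:11434"):
    client = AsyncClient(host=host)
    return await client.generate(**api_kwargs)
```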
@liyin2015 Added tests around this code.

@liyin2015 I don't think the …
```diff
@@ -263,8 +286,8 @@ async def acall(
         api_kwargs = self._pre_call(prompt_kwargs, model_kwargs)
         completion = await self.model_client.acall(
             api_kwargs=api_kwargs, model_type=self.model_type
```
I'm thinking we need new functions, `self.model_client_call` and `model_client_acall`, to call the model client and parse the completion. This way, `_pre_call` and `_post_call` don't need to be async for now.
I'm working on something that also needs this separation:

```python
def _model_client_call(self, api_kwargs: Dict) -> Any:
    # call the model client
    try:
        # check the cache first
        index_content = json.dumps(api_kwargs)  # all messages
        cached_completion = self._check_cache(index_content)
        if cached_completion is not None:
            return cached_completion
        completion = self.model_client.call(
            api_kwargs=api_kwargs, model_type=self.model_type
        )
        # save the fresh completion to the cache
        self._save_cache(index_content, completion)
        return completion
    except Exception as e:
        log.error(f"Error calling the model: {e}")
        raise e
```
You can use this minus the cache. Here is how it's used in `call`:
```python
output: GeneratorOutputType = None
# call the model client
completion = None
try:
    completion = self._model_client_call(api_kwargs=api_kwargs)
except Exception as e:
    log.error(f"Error calling the model: {e}")
    output = GeneratorOutput(error=str(e))
# process the completion
if completion:
    try:
        output = self._post_call(completion)
    except Exception as e:
        log.error(f"Error processing the output: {e}")
        output = GeneratorOutput(raw_response=str(completion), error=str(e))
```
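The async counterpart mentioned earlier isn't shown in the thread; a minimal sketch of what `model_client_acall` might look like (minus the cache, mirroring the sync version above, and assuming `model_client.acall` as seen in the diff) could be:

```python
async def _model_client_acall(self, api_kwargs: Dict) -> Any:
    # sketch only: async counterpart of _model_client_call, without caching
    try:
        completion = await self.model_client.acall(
            api_kwargs=api_kwargs, model_type=self.model_type
        )
        return completion
    except Exception as e:
        log.error(f"Error calling the model: {e}")
        raise e
```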
@mneedham thanks for the PR, it's great work. Only one change is needed.
@liyin2015 Is the change that you mentioned committed now? I guess I can pull down your changes and fix this PR.
@mneedham sorry, you have to do a rebase now, and I will finish reviewing this time. It's really close.
@liyin2015 I played around with how to add streaming to the `acall` function, but I don't know whether this is the right way to do it, as I'm a newbie when it comes to using async. So I've just implemented it for the Ollama client for the time being.

Let me know what you think?
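One hedged sketch of how the raw Ollama stream could be surfaced as plain text chunks (hypothetical helper name; assumes each streamed chunk from ollama's generate endpoint carries its text under the `response` key):

```python
from typing import Any, AsyncGenerator

async def extract_text(raw_stream: Any) -> AsyncGenerator[str, None]:
    # hypothetical helper: yield only the text piece from each streamed chunk
    async for chunk in raw_stream:
        yield chunk.get("response", "")
```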