(uoft) Include chat history in semantic cache #27548
AntonioFerreras started this conversation in Ideas
Replies: 1 comment 3 replies
-
I don't know how practical a semantic cache is beyond a first level of interaction. If it substantially increases recall, I don't see how that doesn't come at the expense of precision.
-
Feature request
Semantic cache implementations such as Upstash, Redis, and GPTCache do not offer easy support for including chat history. This proposal would extend their functionality to add this crucial feature.
Motivation
LLM caches allow repeated prompts to an LLM to be cached and reused, saving wait time. A semantic cache increases the cache hit rate by matching not only string-equivalent prompts but semantically similar ones as well.
However, there is a flaw in this: a single prompt may not capture the semantic meaning within a chat. Previous chat history is important in determining the response to the latest prompt. Here is an example:
Prompt: "Tell me a joke"
Response: "Why did the chicken cross the road, to get to the other side!"
Prompt: "I want car jokes"
Response: "--a car joke--"
Prompt: "tell me a joke"
Cached response: "Why did the chicken..."
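To make the idea concrete, here is a minimal, self-contained sketch of history-aware cache keying: the last few turns are folded into the lookup key before embedding, so the same surface prompt can hit different entries depending on context. All names here are hypothetical (this is not any existing cache's API), and the bag-of-words "embedding" is a toy stand-in for a real sentence-embedding model:

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real cache would call an embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class HistoryAwareSemanticCache:
    """Hypothetical semantic cache that keys on recent chat history + prompt."""

    def __init__(self, threshold=0.8, history_window=2):
        self.threshold = threshold          # min similarity for a cache hit
        self.history_window = history_window  # how many prior turns to fold in
        self.entries = []                   # list of (embedding, response)

    def _key(self, history, prompt):
        # Concatenate the last few turns with the prompt, so "tell me a joke"
        # after a car-joke exchange keys differently than a fresh request.
        return " ".join(history[-self.history_window:] + [prompt])

    def store(self, history, prompt, response):
        self.entries.append((embed(self._key(history, prompt)), response))

    def lookup(self, history, prompt):
        query = embed(self._key(history, prompt))
        best = max(self.entries, key=lambda e: cosine(e[0], query), default=None)
        if best and cosine(best[0], query) >= self.threshold:
            return best[1]
        return None  # cache miss -> caller falls through to the LLM

if __name__ == "__main__":
    cache = HistoryAwareSemanticCache()
    cache.store([], "Tell me a joke", "chicken joke")
    cache.store(["Tell me a joke", "chicken joke"], "I want car jokes", "car joke")
    # Same surface prompt, different hits depending on history:
    print(cache.lookup([], "tell me a joke"))                                # chicken joke
    print(cache.lookup(["I want car jokes", "car joke"], "tell me a joke"))  # car joke
```

The design question the proposal raises is exactly the tunable shown here: how many turns (`history_window`) to fold into the key, and how to weight history against the latest prompt, since too much history lowers hit rate and too little reproduces the stale-hit problem above.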
Proposal (If applicable)
We are a team of 4 students in CSCD01 at UTSC. We would like to know if this is a good idea to contribute!