Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

community[patch]: Graceful handling of redis errors in RedisCache and AsyncRedisCache #17171

Merged
merged 2 commits into from
Feb 21, 2024

Conversation

snsten
Copy link
Contributor

@snsten snsten commented Feb 7, 2024

  • Description:
    The existing RedisCache implementation lacks proper handling for redis client failures, such as ConnectionRefusedError, leading to subsequent failures in pipeline components like LLM calls. This pull request aims to improve error handling for redis client issues, ensuring a more robust and graceful handling of such errors.

  • Issue: Fixes RedisCache does't handle errors from redis.  #16866

  • Dependencies: No new dependency

  • Twitter handle: N/A

Copy link

vercel bot commented Feb 7, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated (UTC)
langchain ⬜️ Ignored (Inspect) Visit Preview Feb 21, 2024 3:39pm

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. Ɑ: memory Related to memory module 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature 🔌: redis Primarily related to Redis integrations labels Feb 7, 2024
generations.append(Generation(text=text))
# Gotta catch'em all
except Exception as e:
logger.warning(f"Redis lookup failed: {e}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this an error level issue rather than a warning?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will change the log level to error

# In a previous life we stored the raw text directly
# in the table, so assume it's in that format.
generations.append(Generation(text=text))
try:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two other reasonable behaviors.

  1. Retry once prior to giving up
  2. Circuit breaker pattern (out of scope)

What are your thoughts on these?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Adding retry with some timeout make sense.
  2. About circuit breaker for long term failure scenario:
    • We keep logging errors while redis is down until circuit is open.
    • When circuit is open do the error logs continue for redis failure in this duration?
      Or
    • Circuit is open we log error that the circuit is open so redis is not being used?

Ideally it would be better to continue showing error logs since the api costs are incurred without cache, the service user should be aware of the redis failure even during the open circuit state from continued failure.

I can start with retry and close this and start a new pr for circuit breaker considering all the scenarios or do it here. Let me know what do you suggest.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think users would appreciate being able to configure the retry behavior since a retry involves some additional latency and isn't guaranteed to pay off.

Would you be able to suggest a parameterization for the initializer that you think would make sense to configure the retry behavior?


For a circuit breaker, I think it probably make sense to log periodically, but not every request. We should probably figure out a separate way to tackle this. I'd appreciate if we could design a circuit breaker as a higher level primitive that accepts a cache as an instance and dresses it up as a circuit breaker


cc @cbornet / @dzmitry-kankalovich tagging you in case this is of interest for you

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think even as-is current change set is great, because cache layer IO should not break LLM calls in my opinion.

Taking it further, I do agree that more configuration around it would be nice, like the fact whether it should error out or not (default "not"?); and if retry to be added, then its definitely need to have configuration for that as well (which I'd probably set to no retry by default? you rightfully mentioned with cache retries you need to be sure this is what you want, as it can just defeat the purpose of a cache in the first place).

And yes, a general circuit breaker for cache interface would be great, but this is probably a substantial changeset, which IDK if @snsten would go for.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree on all points!

@snsten do you want to proceed with the PR as is and just update the logger.warning -> logger.error? This will probably spam someone's logs at some point, but at least their service will stay up.

And if you want to tackle more configuration separately that will be great (or a circuit breaker).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will change the error log only in this pr for the two classes mentioned

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@dzmitry-kankalovich
Copy link
Contributor

I'd also add it to AsyncRedisCache class (should pop up in your branch once you rebase onto latest master)

@dosubot dosubot bot removed the size:M This PR changes 30-99 lines, ignoring generated files. label Feb 15, 2024
@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Feb 15, 2024
@efriis efriis self-assigned this Feb 15, 2024
@snsten snsten closed this Feb 15, 2024
@snsten snsten force-pushed the redis-cache-error-handling branch from f7ea281 to 53b8c86 Compare February 15, 2024 10:29
@dosubot dosubot bot added size:XS This PR changes 0-9 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels Feb 15, 2024
@snsten
Copy link
Contributor Author

snsten commented Feb 15, 2024

Added the error logs for RedisCache and AsyncRedisCache classes.

@snsten snsten reopened this Feb 15, 2024
@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:XS This PR changes 0-9 lines, ignoring generated files. labels Feb 15, 2024
@dzmitry-kankalovich
Copy link
Contributor

LGTM 👍

@eyurtsev eyurtsev changed the title community: Graceful handling of redis errors community[patch]: Graceful handling of redis errors in RedisCache and AsyncRedisCache Feb 21, 2024
# In a previous life we stored the raw text directly
# in the table, so assume it's in that format.
generations.append(Generation(text=text))
try:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Feb 21, 2024
@eyurtsev eyurtsev removed Ɑ: memory Related to memory module 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature partner labels Feb 21, 2024
@eyurtsev eyurtsev removed the template label Feb 21, 2024
@eyurtsev eyurtsev merged commit 8381f85 into langchain-ai:master Feb 21, 2024
58 checks passed
al1p pushed a commit to al1p/langchain that referenced this pull request Feb 27, 2024
… AsyncRedisCache (langchain-ai#17171)

- **Description:**
The existing `RedisCache` implementation lacks proper handling for redis
client failures, such as `ConnectionRefusedError`, leading to subsequent
failures in pipeline components like LLM calls. This pull request aims
to improve error handling for redis client issues, ensuring a more
robust and graceful handling of such errors.

  - **Issue:**  Fixes langchain-ai#16866
  - **Dependencies:** No new dependency
  - **Twitter handle:** N/A

Co-authored-by: snsten <>
Co-authored-by: Eugene Yurtsev <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm PR looks good. Use to confirm that a PR is ready for merging. 🔌: redis Primarily related to Redis integrations size:M This PR changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RedisCache does't handle errors from redis.
4 participants