How do I "always cache", ignoring response cache headers? #116

Closed
jeffjose opened this issue Nov 27, 2023 · 6 comments

Comments

@jeffjose

The endpoint I'm dealing with has Cache-Control: private, no-cache, no-store, max-age=0, must-revalidate. However, I'd still like to cache the response, and take advantage of hishel's TTL (say 1min) before I hit the endpoint again.

How do I ignore response headers, and make hishel always cache responses?

@karpetrosyan
Owner

karpetrosyan commented Nov 28, 2023

Hi! I believe a custom controller is what you need.

Here is the example:

import hishel


class AlwaysCacheController(hishel.Controller):

    def is_cachable(self, request, response) -> bool:
        # Treat every response as storable, whatever its Cache-Control says.
        return True

    def construct_response_from_cache(self, request, response, original_request):
        # Return the stored response as-is, skipping freshness validation.
        return response


client = hishel.CacheClient(controller=AlwaysCacheController())

The controller validates the stored response in the construct_response_from_cache method and decides whether a response can be stored in the is_cachable method, so overriding these two methods makes hishel cache every single response.

You can also use storage TTLs with this custom controller because the controllers and storages are completely independent.

I think it would be great to add an extension for such cases. Now that we have the cache_disabled extension, we could also add always_cache for a better user experience.

@karpetrosyan
Owner

I have opened a pull request that can resolve this problem using the HTTPX extensions.

Similar issue: #85
PR: #117

@jeffjose
Author

The AlwaysCacheController works, but I ran into another issue when using it with a TTL. The second client.get resets the TTL, so the cache never expires as long as there's another request within the TTL window.

I'd have expected the TTL to run down independently, irrespective of whether there was a cache hit.

In other words,

ACTUAL

client.get(url) # Hits the endpoint and caches the response for, say, 60 seconds

# immediately
client.get(url) # Retrieves from cache and resets the TTL to 60 seconds

# after 59 seconds
client.get(url) # Retrieves from cache and resets the TTL to 60 seconds.

# after 3 seconds (total time since the original response was cached = 62 seconds)
client.get(url) # Retrieves from cache and resets the TTL to 60 seconds.

EXPECTED

client.get(url) # Hits the endpoint and caches the response for, say, 60 seconds

# immediately
client.get(url) # Retrieves from cache 

# after 59 seconds
client.get(url) # Retrieves from cache

# after 3 seconds (total time since the original response was cached = 62 seconds)
client.get(url) # Hits the endpoint and caches the response for, say, 60 seconds
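
To make the ACTUAL vs EXPECTED difference concrete, here is a minimal in-memory sketch of the two expiry policies. It is not hishel code: TTLCache, FakeClock, replay and the sliding flag are hypothetical names used only for illustration, and the fake clock stands in for real elapsed time.

```python
import time
from typing import Any, Callable, Dict, List, Optional, Tuple


class TTLCache:
    """Toy cache (hypothetical, not part of hishel).

    sliding=False: an entry expires `ttl` seconds after it was stored.
    sliding=True:  every cache hit restarts the `ttl` countdown.
    """

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic,
                 sliding: bool = False) -> None:
        self.ttl = ttl
        self.clock = clock
        self.sliding = sliding
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self.clock(), value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at >= self.ttl:
            # Expired: drop the entry so the caller refetches.
            del self._entries[key]
            return None
        if self.sliding:
            # This is the TTL reset described under ACTUAL.
            self._entries[key] = (self.clock(), value)
        return value


class FakeClock:
    """Manually advanced clock so the timeline can be replayed instantly."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


def replay(sliding: bool) -> List[bool]:
    """Replay requests at t=0, t=59 and t=62 seconds; True means cache hit."""
    clock = FakeClock()
    cache = TTLCache(ttl=60, clock=clock, sliding=sliding)
    cache.set("url", "response")  # the first request populated the cache
    hits = []
    for t in (0, 59, 62):
        clock.now = float(t)
        hits.append(cache.get("url") is not None)
    return hits


print(replay(sliding=True))   # ACTUAL:   [True, True, True]
print(replay(sliding=False))  # EXPECTED: [True, True, False]
```

With the fixed policy the request at 62 seconds misses and would go back to the endpoint, which is the behavior described under EXPECTED.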

@jeffjose
Author

The solution posted over at #85 (comment) seems to work the way I want. The only downside is that the cache entry (in my case, in Redis) never expires. I had to remove the TTL on the RedisStorage to make it work.

@karpetrosyan
Owner

Actually, the Redis and FileSystem storages currently do not support that behavior: they reset the TTL after each change, whereas SQLStorage doesn't. So yes, we should tidy things up here.

For example, the filesystem storage uses st_mtime, the last modification time, to check whether an entry is stale. If we used the creation time instead, the problem would be fixed.
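
A rough sketch of that idea, assuming the storage rewrites the entry file whenever the entry changes. This is not how hishel implements it: write_entry, read_entry, is_stale and the JSON layout are hypothetical. The sketch also stores the creation timestamp inside the entry rather than reading it from the filesystem, because a file's creation time is not exposed uniformly across platforms.

```python
import json
import os
import tempfile
import time
from typing import Optional, Tuple


def write_entry(path: str, body: str, created_at: Optional[float] = None) -> None:
    """Write a cache entry; pass created_at to preserve it across rewrites."""
    stamp = time.time() if created_at is None else created_at
    with open(path, "w") as f:
        json.dump({"created_at": stamp, "body": body}, f)


def read_entry(path: str) -> dict:
    with open(path) as f:
        return json.load(f)


def is_stale(path: str, ttl: float, now: Optional[float] = None) -> bool:
    """Staleness based on the stored creation time, not on st_mtime."""
    now = time.time() if now is None else now
    return now - read_entry(path)["created_at"] > ttl


def demo() -> Tuple[bool, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "entry.json")
        now = time.time()
        # Pretend the response was first cached 120 seconds ago.
        write_entry(path, "response", created_at=now - 120)
        # A later change rewrites the file, which bumps st_mtime to "now".
        entry = read_entry(path)
        write_entry(path, entry["body"], created_at=entry["created_at"])
        stale_by_mtime = now - os.stat(path).st_mtime > 60
        stale_by_created_at = is_stale(path, ttl=60, now=now)
        return stale_by_mtime, stale_by_created_at


print(demo())  # (False, True)
```

The mtime check reports the rewritten entry as fresh, while the check against the preserved creation timestamp reports it as stale, so the TTL runs down independently of later writes.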

@karpetrosyan
Owner

Resolved in #117
