Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a caching mechanism #13

Merged
merged 3 commits into from
Oct 4, 2024
Merged

Conversation

FrostyX
Copy link
Member

@FrostyX FrostyX commented Mar 24, 2024

Fix #11

@FrostyX
Copy link
Member Author

FrostyX commented Mar 24, 2024

What do you think? I tried to keep the caching as simple as possible and as explicit as possible. I don't really like the need for nested-try-except in the client code (see README) but I don't know how to avoid it.

README.md Outdated
aliases = load_distro_aliases_cache()
print([x.branch for x in aliases["fedora-all"]])

except BadAliasesCache as ex:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand it correctly, this gets thrown from load_distro_aliases_cache()… wouldn't it be easier to wrap the whole thing for users? Something like:

def get_distro_aliases(with_cache=True, fallback=True):
  …

Where with_cache triggers the caching, and fallback indicates retrieval from cache if Bodhi request fails. Or maybe with different defaults.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @mfocko, sorry for the delay here.
The BadAliasesCache is thrown by load_distro_aliases_cache() when the cache is unusable (the file doesn't exist, is older than allowed, etc). Probably nothing to wrap here since this situation is fatal and the caller can't do anything but print the error to the user.

wouldn't it be easier to wrap the whole thing for users? Something like:
Where with_cache triggers the caching, and fallback indicates retrieval from cache if Bodhi request fails. Or maybe with different defaults.

This is where I agree with you. If you take a look at the README.md change, there is a proposed usage which leaves a lot of error handling for the caller. This was because of several reasons:

  1. @praiskup was concerned about the difficulty of debugging when caching is involved, so I wanted to make it as explicit as possible to the user that caching is indeed happening
  2. I wasn't able to figure out how to implement it well
  3. It allowed the user to trivially add additional code (such as logging) to the error-handling boilerplate

However, the usage becomes really ugly with the two nested try blocks and so on. So, I agree with you, we probably want to wrap the whole thing. Based on your suggestion, we could do this:

def get_distro_aliases(caching=False):
    try:
        releases = bodhi_active_releases()
        if caching:
            save_distro_aliases_cache(releases)
    except requests.exceptions.RequestException as ex:
        if not caching:
            raise BodhiDown from ex
        releases = load_distro_aliases_cache()
    ...

But I am considering a bit more generic approach

def get_distro_aliases(cache=None):
    try:
        releases = bodhi_active_releases()
        if cache:
            cache.save(releases)
    except requests.exceptions.RequestException as ex:
        if not cache:
            raise BodhiDown from ex
        releases = cache.load()
    ...


class Cache:
    def __init__(self, path=None, ttl=None):
        self.path = path or PATH
        self.ttl = ttl

    def save(self, releases):
        ...

    def load(self):
        ...

# Usage
cache = Cache(ttl=999)
aliases = get_distro_aliases(cache=cache)

This would allow the user to modify the caching parameters like path and how old cache is too old (aka TTL :D). It would also allow the user to provide custom caching implementation, and maybe it could be re-used by opensuse-distro-aliases.

WDYT @mfocko and @dcermak?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the API proposal from @mfocko for its simplicity.

TBH, I wouldn't make everything user configurable, like the TTL (this is something where we imho can pick a reasonable default).

I am not sure whether I'll really be reusing the caching mechanism, as I don't want to introduce more dependencies to opensuse-distro-aliases. But I might steal your code to avoid the Thursday morning downtime.

@LecrisUT
Copy link

Where would the caching be local to? If the caller is in the same python environment, than it should be fairly easy to implement, but then you wouldn't be able to cache for calls that spawn a new python environment like a subprocess.run. On the other hand, if it's locally stored, how do we wipe the cache? Maybe the idea is to always do a bodhi call and store the latest result, at the expense of performance gains from caching.

Comment on lines 40 to 41
except requests.exceptions.RequestException as ex:
raise BodhiDown from ex
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this not mask a 404?

@FrostyX
Copy link
Member Author

FrostyX commented Aug 6, 2024

You guys didn't tell me that I forgot to commit the actual cache.py and test_cache.py files!
I thought I did, sorry for tagging you all to review an imaginary code :trollface:

@LecrisUT
Copy link

LecrisUT commented Aug 6, 2024

Goes to show that a simple .github/workflows wouldn't be a bad idea. Still amazing that there is otherwise no mention of import cache. How is it being used?

@FrostyX
Copy link
Member Author

FrostyX commented Aug 6, 2024

Goes to show that a simple .github/workflows wouldn't be a bad idea. Still amazing that there is otherwise no mention of import cache. How is it being used?

I will soon commit the real caching implementation so it will be clear how to use it and we will finally have something concrete to talk about :-)

@FrostyX
Copy link
Member Author

FrostyX commented Aug 6, 2024

I finally pushed the actual caching implementation. Please let me know your thoughts.

@dcermak
Copy link

dcermak commented Aug 30, 2024

JFYI, I needed to also supply an offline available version in opensuse-distro-aliases, but opted for a simpler solution: save the result of get_distro_aliases in a constant. This is imho quite feasible in openSUSE, as most distros release about once per year. See: rpm-software-management/opensuse-distro-aliases#18

Copy link
Collaborator

@mfocko mfocko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few minor nitpicks, otherwise LGTM.

README.md Outdated
Comment on lines 94 to 95
To configure parameters for the cache, or use your own caching mechanism, pass
an object instead of `True`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably add an example of what is expected to appear…

  1. your own caching mechanism implies an object satisfying some protocol/interface, so that's clear
  2. configure parameters is not that straightforward though, is it a dictionary, dataclass, or something else?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @mfocko, good point. Hopefully fixed now.

if cache:
cache.save(releases)
except requests.exceptions.RequestException as ex:
if not cache:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you override __bool__ so that it returns False when the cache is empty, i.e., I'm caching, created a valid cache, but failed to download the releases for the first time?

Never mind, it raises BadCache in such event…

fedora_distro_aliases/cache.py Show resolved Hide resolved
There is no reason for them to be munches, and it will simplify
caching when they are not.
@mfocko
Copy link
Collaborator

mfocko commented Oct 3, 2024

Are we blocked on something here? 👀

@FrostyX
Copy link
Member Author

FrostyX commented Oct 4, 2024

I think we are not @mfocko. I will merge and do a release shortly.

@FrostyX FrostyX merged commit fedb62e into rpm-software-management:main Oct 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fallback for when Bodhi API is down
4 participants