redis driver for blob cache information and metadb #2865

Draft: wants to merge 8 commits into main
Conversation

andaaron (Contributor) commented Jan 7, 2025

Rebased #2412 and continued work on the missing method implementations for the cache driver and metadb.

Still need to:

  • run the linter
  • make sure all existing tests pass
  • add more tests to cover Redis
  • see if we can do anything for image signatures (right now we are testing with local storage, but we should have an alternative)
  • use an effective locking system:
    • need to check what support this library offers for concurrent goroutines
    • even if the library supports concurrent goroutines, we need a solution that works with multiple zot instances sharing the same database instance

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

elee1766 and others added 2 commits January 4, 2025 08:15
Currently, we have DynamoDB as the remote shared cache, but it is ideal only
for the cloud use case.
For the on-prem use case, add support for Redis.

Signed-off-by: Ramkumar Chinchani <[email protected]>
elee1766 (Contributor) commented Jan 7, 2025

@andaaron

This is awesome work, thank you.

I wish I could finish this, but company needs put me elsewhere. I will be super excited to use this feature when it is ready.

I wanted to note a few things, since I see some notes in your PR.

  1. For locks, I have had good success with https://github.com/go-redsync/redsync in the past, using it with the redis v9 driver; see the sketch after this list. However, it only provides a somewhat-safe mutex, so you still need to either implement leader election on top of the mutex primitive or force the hot paths to take the mutex. For my needs this has been okay, since I mostly use mutexes to save on duplication of work.

  2. This is a slight tangent, but the more modern Redis library is https://github.com/redis/rueidis, which has https://github.com/redis/rueidis/tree/main/rueidislock. I have been using it for some more recent work, as it claims to function better in a Redis cluster; however, the locking package is much less tested and used than redsync/v4. I wouldn't use it for this project personally, but the option does exist, so I thought I would mention it.

  3. Redis cluster, as you may know, has very interesting behavior and latency during failover, reshuffling, etc., at least in my experience. I am not sure whether this is in scope for the project, but especially if a lock is going to be on the hot path, it may be important to test that zot does not explode when random queries to Redis hit 10-15 second timeouts.

  4. FWIW, I don't think Redis is a great choice for a cache driver. The max value size is 512 MB, which is much smaller than the max size of an OCI blob (I think it is a few GB?), and there are many layers larger than 512 MB. You could potentially shard/multipart the blob into multiple keys by creating some container format around blob parts to indicate where the next "part" of the blob is (now you get into some weird stuff with eviction, so you probably want this to be a Lua script), but I don't know, it seems like a lot more effort than it is worth.
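To make point 1 concrete, here is a minimal sketch (not part of this PR) of taking a distributed lock with redsync v4 on top of the go-redis v9 client; the lock name and address below are just placeholders:

```go
package main

import (
	"fmt"

	"github.com/go-redsync/redsync/v4"
	"github.com/go-redsync/redsync/v4/redis/goredis/v9"
	goredislib "github.com/redis/go-redis/v9"
)

func main() {
	// Connect with the go-redis v9 client and wrap it in a redsync pool.
	client := goredislib.NewClient(&goredislib.Options{Addr: "localhost:6379"})
	pool := goredis.NewPool(client)
	rs := redsync.New(pool)

	// One mutex per shared resource, e.g. a blob digest being deduped (placeholder name).
	mutex := rs.NewMutex("zot:lock:example-resource")

	if err := mutex.Lock(); err != nil {
		panic(err) // could not acquire the distributed lock
	}
	defer func() {
		// Unlock reports whether the lock was still held when it was released.
		if ok, err := mutex.Unlock(); !ok || err != nil {
			fmt.Println("failed to release lock:", err)
		}
	}()

	// ... critical section: update the shared cache entry ...
}
```

As mentioned above, this only gives a per-key mutex; coordination across multiple zot instances (lock extension for long operations, leader election, etc.) would still have to be layered on top of it.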

rchincha (Contributor) commented Jan 7, 2025

@andaaron thx for taking this up

andaaron (Contributor, Author) commented Jan 8, 2025

> FWIW, I don't think Redis is a great choice for a cache driver. The max value size is 512 MB, which is much smaller than the max size of an OCI blob (I think it is a few GB?), and there are many layers larger than 512 MB. You could potentially shard/multipart the blob into multiple keys by creating some container format around blob parts to indicate where the next "part" of the blob is (now you get into some weird stuff with eviction, so you probably want this to be a Lua script), but I don't know, it seems like a lot more effort than it is worth.

Regarding this specific point, there is a misunderstanding. The role of the cache driver is to store information about which blobs were deduped and where the duplicates are. The role of metadb is to store information taken from manifests, as well as other metadata about manifests, repos, and users.

We wouldn't be using Redis to store and serve the actual image blobs; that is a separate storage implementation, for which we support local folders and S3.
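To make the distinction concrete, here is a rough sketch of the kind of small bookkeeping the cache driver keeps in Redis (digest to duplicate blob locations), using the go-redis v9 client; the key prefix and helper names are hypothetical, not the actual zot interface:

```go
package main

import (
	"context"
	"fmt"

	"github.com/redis/go-redis/v9"
)

// putDedupeEntry records that a blob with the given digest also exists at path.
// Only the path string is stored; the blob bytes stay in local storage or S3.
func putDedupeEntry(ctx context.Context, rdb *redis.Client, digest, path string) error {
	return rdb.SAdd(ctx, "zot:dedupe:"+digest, path).Err()
}

// getDuplicateBlobs returns all known locations for a deduped blob digest.
func getDuplicateBlobs(ctx context.Context, rdb *redis.Client, digest string) ([]string, error) {
	return rdb.SMembers(ctx, "zot:dedupe:"+digest).Result()
}

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

	digest := "sha256:deadbeef" // placeholder digest
	_ = putDedupeEntry(ctx, rdb, digest, "repo-a/blobs/sha256/deadbeef")
	_ = putDedupeEntry(ctx, rdb, digest, "repo-b/blobs/sha256/deadbeef")

	paths, _ := getDuplicateBlobs(ctx, rdb, digest)
	fmt.Println(paths) // each entry is only a path, not blob contents
}
```

Since only paths and metadata end up in Redis, the 512 MB value-size limit mentioned in point 4 is not a concern for this use case.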
