Thoughts on keeping more metadata? #295
Comments
Storing more metadata sounds great! The behavior would need to be the same across different backends, though. The kwargs in Package will probably get you about 50% of the way there. You'll probably want to change `pypicloud/cache/base.py` (lines 94 to 102 at 046126f).

The two places that ingest packages and metadata are `pypicloud/views/api.py` (line 70 at 046126f) and `pypicloud/views/simple.py` (lines 25 to 27 at 046126f).

Poking around the backends a bit, it looks like the storage and caching options should just pick up the metadata additions to Package. Might require some tests to be sure.
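The "kwargs catch-all" idea can be illustrated with a minimal sketch. Note this is a hypothetical simplification, not pypicloud's actual `Package` class or constructor signature; it only shows how extra metadata (classifiers, `requires_python`, etc.) could ride along in a key-value dict that backends persist without schema changes:

```python
# Hypothetical sketch of a Package with a KV catch-all for extra metadata.
# Names and signature are assumptions for illustration, not pypicloud's API.
class Package:
    def __init__(self, name, version, filename, **kwargs):
        self.name = name
        self.version = version
        self.filename = filename
        # Catch-all dict: any extra metadata fields land here, so storage
        # and cache backends can persist them without knowing the schema.
        self.data = kwargs


pkg = Package(
    "mypkg",
    "1.0",
    "mypkg-1.0-py3-none-any.whl",
    classifiers=["Programming Language :: Python :: 3"],
    requires_python=">=3.8",
)
```

The appeal of this shape is that adding a new metadata field is purely additive: ingestion passes one more kwarg, and backends that serialize `data` as-is need no migration.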
Just found that this would probably speed up poetry's dependency resolution: https://python-poetry.org/docs/faq#why-is-the-dependency-resolution-process-slow. If implemented, it would be beneficial if something like …
I was also looking into this, specifically for poetry. From reading the poetry docs, I'm not sure whether poetry uses the JSON API for anything other than PyPI itself. That's unfortunate if true, but c'est la vie; it might be worth asking on the poetry repo for details, and whether they'd add support for non-PyPI package repositories to use the JSON API. That said, I could be wrong about how poetry works here; it's possible it would work but doesn't at the moment because poetry considers the metadata "invalid" since it's missing the required fields you mentioned.

One alternative would be to add support for PEP 658: https://peps.python.org/pep-0658/

pypicloud provides a ton of value, and I really appreciate the hard work here, so I'm happy to contribute back time-permitting. To me, doing both seems worthwhile in the long run. I can't promise any specific timeline, but I'd be interested in tackling the two things above, i.e. volunteering instead of just requesting.
I think PEP 658 would be very reasonable to implement. It shouldn't be too difficult to add basic functionality, though adding the metadata hashes might be tricky without causing performance issues. I would happily review a PR for this.
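For context, the core of a PEP 658 backend is extracting the wheel's `METADATA` file and hashing it, so the simple index can serve it at `{file_url}.metadata` and advertise it via the `data-dist-info-metadata="sha256=<hex>"` attribute on the file's anchor tag. A minimal sketch of that extraction step (the function name is an assumption, not pypicloud code):

```python
# Sketch: pull the core METADATA file out of a wheel and hash it, per
# PEP 658. Wheels are zip archives; per PEP 427 the metadata lives at
# {name}-{version}.dist-info/METADATA.
import hashlib
import zipfile


def extract_wheel_metadata(wheel_path):
    """Return (metadata_bytes, sha256_hex) for a wheel's METADATA file."""
    with zipfile.ZipFile(wheel_path) as zf:
        for name in zf.namelist():
            if name.endswith(".dist-info/METADATA"):
                data = zf.read(name)
                return data, hashlib.sha256(data).hexdigest()
    raise ValueError("no METADATA found in %s" % wheel_path)
```

The performance concern mentioned above is that hashing requires reading (part of) every archive at upload or index time; doing it once at upload and caching the digest alongside the package record would avoid repeated work on index requests.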
Minor note from me: Lyft has migrated away from PyPICloud, so I will not be taking this on.
Hey @stevearc,

Quick question so I don't end up working on something that doesn't make sense: would you be philosophically against storing more metadata (say, classifiers) in the cache and making it available in the `json` endpoint? And potentially also implementing a per-release JSON endpoint like Warehouse's?
Motivation: It'd be helpful to have more metadata upfront before downloading.
It seems relatively straightforward given that the Package model already has a KV catch-all.
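For reference, a per-release endpoint modeled on Warehouse's `/pypi/{name}/{version}/json` would return roughly the shape below (field names follow PyPI's public JSON API; which fields pypicloud could actually populate, and from where, is the open question):

```python
# Rough shape of a per-release JSON response, modeled on Warehouse's
# /pypi/{name}/{version}/json endpoint. Values here are placeholders.
release_json = {
    "info": {
        "name": "mypkg",
        "version": "1.0",
        "summary": "Example package",
        "classifiers": ["Programming Language :: Python :: 3"],
        "requires_python": ">=3.8",
        "requires_dist": ["requests (>=2.0)"],
    },
    "urls": [
        {
            "filename": "mypkg-1.0-py3-none-any.whl",
            "url": "https://example.com/packages/mypkg-1.0-py3-none-any.whl",
            "digests": {"sha256": "..."},
        }
    ],
}
```

The `info` block is exactly the kind of metadata a resolver wants before downloading anything, which ties back to the motivation above.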
Follow-up: I only need this for s3 (I mention because of cache rebuilding). If I wanted to implement the feature above, would it need to have parity across backends?
Thanks!