Support AWS S3 #164
Hey @karpetrosyan! Love this idea! One approach to consider may be to simply sub in fsspec methods in the FileStorages. It's a well-documented and well-supported part of the Dask high-performance computing ecosystem. It supports async/await and an extremely long list of cloud storage types, with a ton of built-in backends and a growing list of additional extensions, including S3, Azure Blob Storage, and Google Cloud Storage. For the API, you would just need to pass a "base_dir" into the file storage cache that matches a registered file system type, and maybe also support passing in authentication and other backend-specific configuration parameters in kwargs. Just a thought - might be an easy way to instantly support a ton of cloud storage providers.
Hey @parkerhancock! I believe that if we decide to support multiple cloud services, we should probably use that library.
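The "base_dir" idea discussed above can be illustrated with a small sketch that uses only the standard library. This is not fsspec's API and not Hishel's implementation: `LocalBackend`, `MemoryBackend`, `BACKENDS`, and `open_backend` are hypothetical names, and the in-memory backend merely stands in for a cloud store. It only shows how a URL scheme in "base_dir" could select a storage backend while extra kwargs carry backend-specific configuration.

```python
from pathlib import Path
from urllib.parse import urlsplit


class LocalBackend:
    """Hypothetical backend: stores each entry as a file under a root directory."""

    def __init__(self, root, **options):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def write(self, key, data):
        (self.root / key).write_bytes(data)

    def read(self, key):
        return (self.root / key).read_bytes()


class MemoryBackend:
    """Hypothetical backend: keeps entries in a dict, standing in for a remote store."""

    def __init__(self, root, **options):
        self.root = root
        self.options = options  # e.g. credentials a real cloud backend would need
        self._entries = {}

    def write(self, key, data):
        self._entries[key] = data

    def read(self, key):
        return self._entries[key]


# Registry mapping URL schemes to backend classes (an assumption for this sketch).
BACKENDS = {"": LocalBackend, "file": LocalBackend, "memory": MemoryBackend}


def open_backend(base_dir, **options):
    """Pick a backend from the scheme of base_dir and forward options to it."""
    parts = urlsplit(base_dir)
    try:
        backend_cls = BACKENDS[parts.scheme]
    except KeyError:
        raise ValueError(f"no backend registered for scheme {parts.scheme!r}") from None
    return backend_cls(parts.netloc + parts.path, **options)


if __name__ == "__main__":
    backend = open_backend("memory://cache", token="hypothetical-token")
    backend.write("response-key", b"cached body")
    print(backend.read("response-key"))
```

A library such as fsspec provides this kind of scheme-based dispatch for real backends, which is why it was suggested as a way to cover many providers at once.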
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff            @@
##            master     #164   +/-   ##
=========================================
  Coverage   100.00%  100.00%
=========================================
  Files           29       29
  Lines         2064     2067    +3
=========================================
+ Hits          2064     2067    +3
S3 support in Hishel allows you to share cached responses across your services. When used correctly, it can significantly reduce traffic to the origin server.
Amazon Simple Storage Service (Amazon S3) is a scalable and widely used object storage service provided by Amazon Web Services (AWS). In a nutshell, here are the key features and aspects of Amazon S3:
Object Storage: Amazon S3 is designed to store and retrieve any amount of data, and it treats data as objects. Each object consists of data, a key (unique within a bucket), and metadata.
Scalability: S3 is highly scalable, allowing you to store an unlimited amount of data. It can scale both in terms of storage capacity and request throughput.
Durability and Availability: Amazon S3 is designed for 99.999999999% (11 nines) durability, meaning your data is highly resilient. It achieves this by automatically storing objects redundantly across multiple devices and, for most storage classes, multiple Availability Zones.
Data Management Features:
Versioning: You can enable versioning for a bucket to keep multiple versions of an object.
Lifecycle Policies: Define rules to automatically transition objects to different storage classes or delete them after a specified period.
Cross-Region Replication (CRR): Replicate objects across different AWS regions for data redundancy and compliance.
Security and Access Control:
Access Control Lists (ACLs): Control access to buckets and objects using ACLs.
Bucket Policies: Define fine-grained access controls using JSON-based policies.
Identity and Access Management (IAM): Use IAM roles and policies to manage access at a more granular level.
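To make the Lifecycle Policies and Bucket Policies items above more concrete, here is a minimal sketch that builds both documents as plain Python dictionaries. The bucket name, key prefix, account ID, role name, and the helper functions `build_expiration_rule` and `build_read_only_policy` are all hypothetical and chosen for illustration; nothing here is taken from Hishel itself.

```python
import json


def build_expiration_rule(prefix, days):
    """Lifecycle configuration that deletes objects under `prefix` after `days` days."""
    return {
        "Rules": [
            {
                "ID": f"expire-{prefix.strip('/')}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


def build_read_only_policy(bucket, principal_arn):
    """Bucket policy that lets one IAM principal read every object in `bucket`."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCacheReads",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical names: replace with your own bucket, prefix, and role ARN.
    lifecycle = build_expiration_rule("hishel/", days=7)
    policy = build_read_only_policy(
        "cached-responses", "arn:aws:iam::123456789012:role/cache-reader"
    )
    print(json.dumps(lifecycle, indent=2))
    print(json.dumps(policy, indent=2))
```

With boto3, documents of this shape are what `put_bucket_lifecycle_configuration` (as the `LifecycleConfiguration` argument) and `put_bucket_policy` (as a JSON string in `Policy`) expect. An expiration rule like this one may be a convenient way to keep a shared response cache from growing without bound.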