Support AWS S3 #164

Merged: 17 commits merged into master on Jan 18, 2024

Conversation

karpetrosyan
Owner

S3 support in Hishel lets your services share cached responses, so a response fetched by one service can be reused by the others.
Used correctly, this can significantly reduce traffic to the origin server.

Amazon Simple Storage Service (Amazon S3) is a scalable, widely used object storage service provided by Amazon Web Services (AWS). In a nutshell, its key features are:

  • Object Storage: Amazon S3 is designed to store and retrieve any amount of data, and it treats data as objects. Each object consists of data, a key (unique within a bucket), and metadata.

  • Scalability: S3 is highly scalable, allowing you to store an unlimited amount of data. It can scale both in terms of storage capacity and request throughput.

  • Durability and Availability: Amazon S3 is designed for 99.999999999% (11 nines) durability, meaning your data is highly resilient. It achieves this by automatically storing data redundantly across multiple devices and data centers.

  • Data Management Features:
    Versioning: You can enable versioning for a bucket to keep multiple versions of an object.
    Lifecycle Policies: Define rules to automatically transition objects to different storage classes or delete them after a specified period.
    Cross-Region Replication (CRR): Replicate objects across different AWS regions for data redundancy and compliance.

  • Security and Access Control:
    Access Control Lists (ACLs): Control access to buckets and objects using ACLs.
    Bucket Policies: Define fine-grained access controls using JSON-based policies.
    Identity and Access Management (IAM): Use IAM roles and policies to manage access at a more granular level.
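
As a rough illustration of the object model listed above (data, a unique key, and metadata) and of how a shared bucket could act as a response cache for several services, here is a minimal sketch. `InMemoryBucket`, `SharedResponseCache`, and `fetch` are hypothetical names written for this example only; they are not Hishel's or boto3's API, and a plain dict stands in for a real S3 bucket.

```python
import hashlib
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StoredObject:
    """An object as described above: data plus metadata (the key lives in the bucket)."""
    data: bytes
    metadata: Dict[str, str] = field(default_factory=dict)


class InMemoryBucket:
    """Hypothetical stand-in for an S3 bucket: maps unique keys to objects."""

    def __init__(self) -> None:
        self._objects: Dict[str, StoredObject] = {}

    def put_object(self, key: str, data: bytes, metadata: Dict[str, str]) -> None:
        self._objects[key] = StoredObject(data, dict(metadata))

    def get_object(self, key: str) -> Optional[StoredObject]:
        return self._objects.get(key)


class SharedResponseCache:
    """Hypothetical cache that stores response bodies in a shared bucket."""

    def __init__(self, bucket: InMemoryBucket, ttl: Optional[float] = None) -> None:
        self.bucket = bucket
        self.ttl = ttl  # seconds; None means entries never expire

    @staticmethod
    def _key(method: str, url: str) -> str:
        # Derive a stable object key from the request line.
        return hashlib.sha256(f"{method} {url}".encode()).hexdigest()

    def store(self, method: str, url: str, body: bytes) -> None:
        metadata = {"created_at": str(time.time())}
        self.bucket.put_object(self._key(method, url), body, metadata)

    def retrieve(self, method: str, url: str) -> Optional[bytes]:
        obj = self.bucket.get_object(self._key(method, url))
        if obj is None:
            return None
        age = time.time() - float(obj.metadata["created_at"])
        if self.ttl is not None and age > self.ttl:
            return None  # treat expired entries as a cache miss
        return obj.data


def fetch(cache: SharedResponseCache, url: str, origin_calls: List[str]) -> bytes:
    """Return the cached body, contacting the (simulated) origin only on a miss."""
    cached = cache.retrieve("GET", url)
    if cached is not None:
        return cached
    origin_calls.append(url)  # simulated request to the origin server
    body = b"hello from origin"
    cache.store("GET", url, body)
    return body


# Two services share one bucket, so only the first request reaches the origin.
bucket = InMemoryBucket()
service_a = SharedResponseCache(bucket)
service_b = SharedResponseCache(bucket)
origin_calls: List[str] = []
fetch(service_a, "https://example.com/data", origin_calls)
fetch(service_b, "https://example.com/data", origin_calls)
print(len(origin_calls))  # 1
```

A real implementation would also have to serialize headers and status codes and apply HTTP caching rules; the sketch only shows why a shared store reduces origin traffic.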

@karpetrosyan karpetrosyan added the enhancement New feature or request label Jan 16, 2024
@parkerhancock
Collaborator

Hey @karpetrosyan!

Love this idea! One approach to consider may be to simply sub in fsspec methods in the FileStorage classes. fsspec is a well-documented and well-supported part of the Dask high-performance computing ecosystem. It supports async/await and a long list of storage types, with many built-in backends and a growing list of extensions, including S3, Azure Blob Storage, and Google Cloud Storage.

For the API, you would just need to pass a "base_dir" into the file storage cache that matches a registered file system type, and maybe also support passing in authentication and other backend-specific configuration parameters in kwargs.

Just a thought - might be an easy way to instantly support a ton of cloud storage providers.
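
The "base_dir" idea above could look roughly like the following sketch, which picks a backend from the URL scheme of the path and forwards any extra kwargs to it. This is not fsspec's API and not Hishel's: `LocalBackend`, `S3Backend`, `open_storage`, and the `profile` option are hypothetical names used only to show the dispatch pattern with the standard library.

```python
from typing import Any, Dict, Type
from urllib.parse import urlsplit


class LocalBackend:
    """Hypothetical backend for plain filesystem paths."""

    def __init__(self, root: str, **options: Any) -> None:
        self.root = root
        self.options = options


class S3Backend:
    """Hypothetical backend for s3:// locations; options could carry credentials."""

    def __init__(self, root: str, **options: Any) -> None:
        self.root = root
        self.options = options


# Registry of known file system types, keyed by URL scheme.
BACKENDS: Dict[str, Type[Any]] = {"file": LocalBackend, "s3": S3Backend}


def open_storage(base_dir: str, **kwargs: Any) -> Any:
    """Choose a backend from the scheme of base_dir and pass kwargs through."""
    parts = urlsplit(base_dir)
    scheme = parts.scheme or "file"  # bare paths default to the local backend
    if scheme not in BACKENDS:
        raise ValueError(f"no backend registered for scheme {scheme!r}")
    # For URLs, keep bucket/host plus path; for bare paths, keep the path as given.
    root = parts.netloc + parts.path if parts.scheme else base_dir
    return BACKENDS[scheme](root, **kwargs)


local = open_storage(".cache/hishel")
remote = open_storage("s3://my-bucket/cache", profile="dev")
print(type(local).__name__, local.root)
print(type(remote).__name__, remote.root, remote.options)
```

With a real library such as fsspec, the registry and the backends would come from the library rather than being hand-written, which is the appeal of the suggestion.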

@karpetrosyan
Owner Author

Hey @parkerhancock!

I believe that if we decide to support multiple cloud services, we should probably use that library.
For the time being, let's simply introduce S3 so that users can test it and determine whether it is truly necessary or useful.

@karpetrosyan marked this pull request as draft on January 17, 2024 14:52
@karpetrosyan marked this pull request as ready for review on January 18, 2024 05:04

codecov bot commented Jan 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (e4ec838) 100.00% compared to head (6dad418) 100.00%.

Additional details and impacted files
@@            Coverage Diff            @@
##            master      #164   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           29        29           
  Lines         2064      2067    +3     
=========================================
+ Hits          2064      2067    +3     


@karpetrosyan karpetrosyan merged commit 09672fd into master Jan 18, 2024
8 checks passed
@karpetrosyan karpetrosyan mentioned this pull request Jan 27, 2024
2 participants