
Fix bug in s3hub cp subcommand #46

Merged
9 commits merged into main on Jan 27, 2024

Conversation

nao1215 (Owner) commented Jan 26, 2024

Summary by CodeRabbit

  • New Features
    • Enhanced S3 operations to ensure all objects within a bucket are deleted before the bucket itself is deleted, improving data management and cleanup.
    • Introduced new functionalities in the S3 hub for copying files, including better handling of local and S3 targets, and a new method for managing local copy targets.
  • Tests
    • Added tests for the new copyPathPair functionality, ensuring robustness and reliability in file copying scenarios across different storage locations.

coderabbitai bot (Contributor) commented Jan 26, 2024

Warning

Rate Limit Exceeded

@nao1215 has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 6 minutes and 51 seconds before requesting another review.

How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.
Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.
Please see our FAQ for further information.

Commits: files that changed from the base of the PR, between c79562b and 28a9ab8.

Walkthrough

The recent modifications enhance the functionality related to S3 bucket management and file copying operations. Specifically, they ensure that all objects within an S3 bucket are deleted before the bucket itself is removed, and improve the copying process by refining how local and S3 targets are handled. This includes better struct organization, method refactoring, and improved testing for robustness and reliability in file management tasks.

Changes

File Path | Change Summary
app/external/s3_helper_test.go | Modified to delete objects in S3 buckets before deletion and use explicit context in S3 operations.
app/domain/model/s3.go | Modified Split method to handle an edge case and use filepath.Clean for returning the S3Key.
app/external/s3.go | Minor modification in the UploadS3Object method, adding an extra line break.
app/interactor/mock/s3.go | New file providing a mock implementation of the S3ObjectLister interface.
app/interactor/s3.go | Minor modification in the UploadFile method.
cmd/subcmd/s3hub/cp.go | Added gfile import, refactored a struct field, and added methods for improved copy handling.
LICENSE | Minor modification in the text.
cmd/subcmd/s3hub/cp_test.go | Added Test_newCopyPathPair with test cases verifying the creation of the copyPathPair struct.

"In the digital meadow under the byte-sized sun,
A rabbit hopped, code in tow, refactoring fun.
With each line tweaked, a bug retreats,
And files dance in sync, to the disk's heartbeat. 🐇💾"

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?


Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit-tests for this file.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit tests for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository from git and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit tests.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

CodeRabbit Discord Community

Join our Discord Community to get help, request features, and share feedback.


coderabbitai bot (Contributor) left a comment

Review Status

Actionable comments generated: 4

Configuration used: CodeRabbit UI

Commits: files that changed from the base of the PR, between d03dc7a and 694bbbc.
Files ignored due to path filters (2)
  • go.mod is excluded by: !**/*.mod
  • go.sum is excluded by: !**/*.sum
Files selected for processing (3)
  • app/external/s3_helper_test.go (1 hunks)
  • cmd/subcmd/s3hub/cp.go (4 hunks)
  • cmd/subcmd/s3hub/cp_test.go (2 hunks)
Additional comments: 1
cmd/subcmd/s3hub/cp.go (1)
  • 133-142: The method copyTargetsInLocal uses file.WalkDir to list files when the source is a directory. Ensure that file.WalkDir correctly handles symbolic links and does not follow them, to prevent potential security issues or unintended behavior.
Verification successful

The implementation of WalkDir in utils/file/file.go uses filepath.WalkDir from the Go standard library, which iterates over each entry in a directory. The function checks if an entry is a directory and skips it if true, appending only file paths to the files slice. This implementation does not explicitly handle symbolic links, relying on the behavior of filepath.WalkDir.

The Go documentation for filepath.WalkDir states that it does not follow symbolic links unless the link is to a directory and the function is called directly on that link. Since WalkDir does not explicitly follow symbolic links and only appends file paths (excluding directories), it inherently does not follow symbolic links to files. However, if a symbolic link points to a directory and WalkDir is called on that directory, the behavior would follow the standard filepath.WalkDir behavior, which includes traversing into symbolic links that point to directories.

Given this, the original concern about file.WalkDir potentially following symbolic links and leading to security issues or unintended behavior is addressed by the inherent behavior of filepath.WalkDir. It does not follow symbolic links to files, and its handling of symbolic links to directories is consistent with the documented behavior of the Go standard library.
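
For reference, here is a minimal sketch of the WalkDir behavior described above. It is a reconstruction for illustration only, not the actual utils/file/file.go; the exact signature is an assumption.

package file

import (
    "io/fs"
    "path/filepath"
)

// WalkDir returns the paths of all files under rootDir.
// Directory entries are skipped, so only file paths are collected.
// filepath.WalkDir does not follow symbolic links by default, so links
// to files are reported as paths but never traversed.
func WalkDir(rootDir string) ([]string, error) {
    files := make([]string, 0)
    err := filepath.WalkDir(rootDir, func(path string, d fs.DirEntry, err error) error {
        if err != nil {
            return err
        }
        if d.IsDir() {
            return nil // skip directories; only files are appended
        }
        files = append(files, path)
        return nil
    })
    if err != nil {
        return nil, err
    }
    return files, nil
}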

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Verify that file.WalkDir does not follow symbolic links

Length of output: 0



Script:

#!/bin/bash
# Attempt to find the implementation of file.WalkDir to verify its behavior regarding symbolic links
rg "func WalkDir" --type=go

Length of output: 97



Script:

#!/bin/bash
# Now that we've located the file, let's examine the implementation of WalkDir to understand how it handles symbolic links
cat utils/file/file.go

Length of output: 587

Comment on lines +39 to +66
 // DeleteAllS3BucketDelete deletes all S3 buckets and objects.
 func DeleteAllS3BucketDelete(t *testing.T, client *s3.Client) {
     t.Helper()
+    ctx := context.Background()

-    buckets, err := client.ListBuckets(context.Background(), &s3.ListBucketsInput{})
+    buckets, err := client.ListBuckets(ctx, &s3.ListBucketsInput{})
     if err != nil {
         t.Fatal(err)
     }

     for _, bucket := range buckets.Buckets {
-        if _, err := client.DeleteBucket(context.Background(), &s3.DeleteBucketInput{Bucket: bucket.Name}); err != nil {
+        output, err := client.ListObjects(ctx, &s3.ListObjectsInput{
+            Bucket: bucket.Name,
+        })
+        if err != nil {
+            t.Fatal(err)
+        }
+
+        for _, object := range output.Contents {
+            if _, err := client.DeleteObject(ctx, &s3.DeleteObjectInput{
+                Bucket: bucket.Name,
+                Key:    object.Key,
+            }); err != nil {
+                t.Fatal(err)
+            }
+        }
+
+        if _, err := client.DeleteBucket(ctx, &s3.DeleteBucketInput{Bucket: bucket.Name}); err != nil {
coderabbitai bot (Contributor) commented Jan 26, 2024

The implementation of DeleteAllS3BucketDelete now includes deleting all objects within an S3 bucket before attempting to delete the bucket itself. This change addresses potential issues with deleting non-empty buckets. However, it's important to handle pagination in the ListObjects call to ensure all objects are deleted, especially in buckets with a large number of objects.

+ // Ensure to handle pagination for buckets with many objects
+ var continuationToken *string = nil
+ for {
+     output, err := client.ListObjects(ctx, &s3.ListObjectsInput{
+         Bucket: bucket.Name,
+         ContinuationToken: continuationToken,
+     })
+     if err != nil {
+         t.Fatal(err)
+     }
+     for _, object := range output.Contents {
+         _, err := client.DeleteObject(ctx, &s3.DeleteObjectInput{
+             Bucket: bucket.Name,
+             Key:    object.Key,
+         })
+         if err != nil {
+             t.Fatal(err)
+         }
+     }
+     if !output.IsTruncated {
+         break
+     }
+     continuationToken = output.NextContinuationToken
+ }

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change (the full function after applying the suggestion, replacing the hunk shown above):

// DeleteAllS3BucketDelete deletes all S3 buckets and objects.
func DeleteAllS3BucketDelete(t *testing.T, client *s3.Client) {
    t.Helper()
    ctx := context.Background()
    buckets, err := client.ListBuckets(ctx, &s3.ListBucketsInput{})
    if err != nil {
        t.Fatal(err)
    }
    for _, bucket := range buckets.Buckets {
        // Ensure to handle pagination for buckets with many objects
        var continuationToken *string = nil
        for {
            output, err := client.ListObjects(ctx, &s3.ListObjectsInput{
                Bucket:            bucket.Name,
                ContinuationToken: continuationToken,
            })
            if err != nil {
                t.Fatal(err)
            }
            for _, object := range output.Contents {
                _, err := client.DeleteObject(ctx, &s3.DeleteObjectInput{
                    Bucket: bucket.Name,
                    Key:    object.Key,
                })
                if err != nil {
                    t.Fatal(err)
                }
            }
            if !output.IsTruncated {
                break
            }
            continuationToken = output.NextContinuationToken
        }
        if _, err := client.DeleteBucket(ctx, &s3.DeleteBucketInput{Bucket: bucket.Name}); err != nil {
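
For reference, aws-sdk-go-v2 also provides a built-in paginator that manages the continuation token automatically. The sketch below is illustrative only and is not part of this PR; it assumes a switch to the ListObjectsV2 API (the V1 ListObjects API pages with Marker rather than ContinuationToken), and the package and function names here are hypothetical.

package external_test // hypothetical package name

import (
    "context"
    "testing"

    "github.com/aws/aws-sdk-go-v2/service/s3"
)

// deleteAllObjects removes every object in the given bucket, letting the
// ListObjectsV2 paginator handle continuation tokens across pages.
func deleteAllObjects(ctx context.Context, t *testing.T, client *s3.Client, bucket *string) {
    t.Helper()
    paginator := s3.NewListObjectsV2Paginator(client, &s3.ListObjectsV2Input{Bucket: bucket})
    for paginator.HasMorePages() {
        page, err := paginator.NextPage(ctx)
        if err != nil {
            t.Fatal(err)
        }
        for _, object := range page.Contents {
            if _, err := client.DeleteObject(ctx, &s3.DeleteObjectInput{
                Bucket: bucket,
                Key:    object.Key,
            }); err != nil {
                t.Fatal(err)
            }
        }
    }
}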

nao1215 (Owner, Author) commented:

The DeleteAllS3BucketDelete() method is only used for unit tests, so there are not many S3 objects. We don't use pagination when executing unit tests.

coderabbitai bot (Contributor) commented:

@nao1215, understood! Given that DeleteAllS3BucketDelete is used solely for unit testing with a limited number of objects, your approach makes sense. Skipping pagination under these conditions seems reasonable. Thank you for clarifying the context and usage of this method. If there's anything else you need, feel free to ask!

Resolved review threads:
  • cmd/subcmd/s3hub/cp_test.go (outdated)
  • cmd/subcmd/s3hub/cp.go
  • cmd/subcmd/s3hub/cp.go (outdated)


nao1215 (Owner, Author) commented Jan 26, 2024

@coderabbitai review

coderabbitai bot (Contributor) left a comment

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits: files that changed from the base of the PR, between 694bbbc and 77a2e1d.
Files selected for processing (4)
  • app/domain/model/s3.go (1 hunks)
  • app/domain/model/s3_test.go (1 hunks)
  • cmd/subcmd/s3hub/cp.go (4 hunks)
  • cmd/subcmd/s3hub/cp_test.go (2 hunks)
Files skipped from review as they are similar to previous changes (2)
  • cmd/subcmd/s3hub/cp.go
  • cmd/subcmd/s3hub/cp_test.go
Additional comments: 2
app/domain/model/s3.go (1)
  • 232-237: The Split method now cleans the S3Key using filepath.Clean before returning it. This is a good practice to ensure the S3Key is in a standard format, especially when dealing with file paths that might include relative components such as ".." (see the illustrative snippet after this list).
app/domain/model/s3_test.go (1)
  • 662-673: The new test cases added to TestBucket_Split effectively cover scenarios with multiple levels in the bucket path and include parent directory references. These additions enhance the test coverage for the Split method, ensuring it behaves as expected in more complex scenarios.
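
As an illustration of the filepath.Clean behavior mentioned above (this snippet is not from the PR), on a Unix-like system it collapses "." and ".." elements and redundant separators:

package main

import (
    "fmt"
    "path/filepath"
)

func main() {
    // filepath.Clean returns the shortest lexically equivalent path.
    fmt.Println(filepath.Clean("dir/../object.txt"))   // object.txt
    fmt.Println(filepath.Clean("./a/b/../c/file.txt")) // a/c/file.txt
    fmt.Println(filepath.Clean("a//b///c"))            // a/b/c
}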

Resolved review thread: cmd/subcmd/s3hub/cp.go (outdated)


coderabbitai bot (Contributor) left a comment

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits: files that changed from the base of the PR, between 77a2e1d and c79562b.
Files selected for processing (6)
  • LICENSE (1 hunks)
  • app/external/s3.go (1 hunks)
  • app/interactor/mock/s3.go (1 hunks)
  • app/interactor/s3.go (1 hunks)
  • cmd/subcmd/s3hub/cp.go (5 hunks)
  • cmd/subcmd/s3hub/cp_test.go (2 hunks)
Files skipped from review due to trivial changes (3)
  • LICENSE
  • app/external/s3.go
  • app/interactor/s3.go
Files skipped from review as they are similar to previous changes (2)
  • cmd/subcmd/s3hub/cp.go
  • cmd/subcmd/s3hub/cp_test.go
Additional comments: 3
app/interactor/mock/s3.go (3)
  • 1-7: The package and import declarations are correctly structured, ensuring that the mock implementation is properly isolated within the mock package and has access to necessary types from the usecase package.
  • 9-10: The S3ObjectsLister type is declared as a function type that matches the signature of the ListS3Objects method in the S3ObjectLister interface. This is a common and effective pattern for creating mocks in Go, allowing for flexible testing scenarios.
  • 12-15: The implementation of the ListS3Objects method correctly delegates the call to the function itself, passing through the context and input parameters. This ensures that any function assigned to an instance of S3ObjectsLister will be called when ListS3Objects is invoked, allowing for dynamic behavior in tests based on the function provided. This pattern is both simple and powerful for mocking external dependencies.
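
A minimal sketch of the function-type mock pattern described above. The import path and the input/output type names are assumptions made for illustration; the actual app/interactor/mock/s3.go may differ.

package mock

import (
    "context"

    "example.com/project/app/usecase" // hypothetical import path
)

// S3ObjectsLister is a function type whose signature matches the
// S3ObjectLister interface's ListS3Objects method.
type S3ObjectsLister func(ctx context.Context, input *usecase.S3ObjectsListerInput) (*usecase.S3ObjectsListerOutput, error)

// ListS3Objects delegates to the function value itself, so a test can inject
// arbitrary behavior by assigning a closure to an S3ObjectsLister.
func (m S3ObjectsLister) ListS3Objects(ctx context.Context, input *usecase.S3ObjectsListerInput) (*usecase.S3ObjectsListerOutput, error) {
    return m(ctx, input)
}

A test can then construct the mock inline, for example:

mock.S3ObjectsLister(func(ctx context.Context, in *usecase.S3ObjectsListerInput) (*usecase.S3ObjectsListerOutput, error) {
    return &usecase.S3ObjectsListerOutput{}, nil
})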


HOTTEST report

PASS | FAIL | SKIP | TOTAL | DURATION
283 | 0 | 7 | 290 | 3.271989722s

Reported by hottest

Code Metrics Report

                    | main (d03dc7a) | #46 (f389d2f) | +/-
Coverage            | 27.3%          | 29.2%         | +1.9%
Test Execution Time | 4s             | 6s            | +2s
Details
  |                     | main (d03dc7a) | #46 (f389d2f) |  +/-  |
  |---------------------|----------------|---------------|-------|
+ | Coverage            |          27.3% |         29.2% | +1.9% |
  |   Files             |             42 |            43 |    +1 |
  |   Lines             |           1348 |          1357 |    +9 |
+ |   Covered           |            368 |           396 |   +28 |
- | Test Execution Time |             4s |            6s |   +2s |

Code coverage of files in pull request scope (60.8% → 66.9%)

Files | Coverage | +/-
app/domain/model/s3.go | 99.0% | +0.0%
app/external/s3.go | 41.4% | 0.0%
app/interactor/mock/s3.go | 100.0% | +100.0%
app/interactor/s3.go | 100.0% | 0.0%
cmd/subcmd/s3hub/cp.go | 25.8% | +25.8%

Reported by octocov

nao1215 merged commit f4aa82b into main on Jan 27, 2024
8 checks passed