This repository has been archived by the owner on Feb 12, 2021. It is now read-only.

deleteRemoved option with downloadDir causes some local directories to disappear entirely #54

Open
brzpegasus opened this issue Oct 17, 2014 · 6 comments · May be fixed by #57 or #202

Comments

@brzpegasus

I'm trying to download the content of an entire directory from an S3 bucket. The directory contains two subdirectories, which themselves contain a bunch of files. It essentially looks something like this:

+ some-bucket-name
  |_ test/
      |_ foo/
      |_ bar/

The first time that I call downloadDir() with the deleteRemoved flag set to true, both subdirectories (foo and bar in this example) download to a local directory as expected:

target-dir/
|_ foo/
|_ bar/

However, on each subsequent downloadDir() call, something weird happens: the local target-dir ends up with only one directory left. Sometimes, it's foo, and sometimes, it's bar.

The code for the download is pretty straightforward:

var downloader = client.downloadDir({
  localDir: "target-dir",
  s3Params: {
    Bucket: "some-bucket-name",
    Prefix: "test/"
  },
  deleteRemoved: true
});

I haven't been able to dig into this project's code yet, but is there something that might immediately come to mind?
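
To make the symptom described above easy to check between runs, here is a minimal sketch that snapshots the subdirectories of the local target directory so two runs can be compared. The helpers `listSubdirs` and `missingAfter` are hypothetical names introduced for illustration; they are not part of this library.

```javascript
var fs = require("fs");
var path = require("path");

// Return the sorted names of the immediate subdirectories of `dir`.
function listSubdirs(dir) {
  return fs.readdirSync(dir).filter(function (name) {
    return fs.statSync(path.join(dir, name)).isDirectory();
  }).sort();
}

// Return the names present in `before` but absent from `after`.
function missingAfter(before, after) {
  return before.filter(function (name) {
    return after.indexOf(name) === -1;
  });
}

// Intended use (assumes "target-dir" exists after the first download):
//   var before = listSubdirs("target-dir");   // after the first downloadDir() run
//   ... wait for the second downloadDir() run to finish ...
//   var after = listSubdirs("target-dir");
//   console.log(missingAfter(before, after)); // directories that disappeared
```

Taking the second snapshot only once the second download has finished keeps the comparison from racing with files that are still being written.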

@andrewrk
Owner

Thanks for the report and the test case. I haven't taken a look at this yet, but it looks like a serious issue that needs to be solved. Nothing immediately comes to mind but I plan to dig into it within a couple weeks. Feel free to dig in yourself if you want things to happen sooner.

brzpegasus added a commit to brzpegasus/node-s3-client that referenced this issue Oct 29, 2014
enabled.

* allS3Objects should not contain the directory specified by 'prefix'.

Fixes andrewrk#54
@andrewrk
Owner

andrewrk commented Nov 3, 2014

I've been trying to reproduce this issue and I can't seem to get it to happen. Also, looking at the code, I don't understand why it would happen or why your fix would solve it. Do you think you could whip up a .tar.gz with some example code that demonstrates the problem?

@brzpegasus
Author

@andrewrk Thanks for looking at this issue.

While trying to reproduce this, I realized there was a bit more to the story than I originally thought and described. Basically, the issue is only observed if you manually create the directory in an S3 bucket (from the S3 management console) and then upload contents to it. It doesn't happen when the entire directory is uploaded to a bucket.

Let me try to elaborate, with screenshots to avoid any confusion:

Code for the download

The code is very straightforward. It's all in this gist.

Directory structure

I created a directory on my local system, s3-test, containing two subfolders, subdir1 and subdir2. Each of these subfolders contains a very simple text file:

s3-test/
|_ subdir1/
  |_ file1.txt
|_ subdir2/
  |_ file2.txt

Scenario 1: Upload entire directory to S3 bucket (works as expected)

I set up a bucket and uploaded the entire s3-test directory to it.

[Screenshots: 2014-11-03, 10:46:22 pm and 10:47:45 pm]

When I then execute the code to download from the bucket and s3-test directory, I get both subdir1 and subdir2 downloaded into my local assets folder. If I execute the code a second time, I still get the same contents, so no issue here.

[Screenshot: 2014-11-03, 10:50:20 pm]

Scenario 2: Create a directory in a bucket, then upload contents (subdirectory gets deleted on download)

Here, I first create a new directory inside the bucket, s3-test2, using the S3 management console. I then upload just the contents of the local s3-test directory (so just subdir1 and subdir2).

[Screenshots: 2014-11-03, 10:51:10 pm, 10:52:07 pm, 10:52:31 pm and 10:53:02 pm]

When I execute the code to download from the bucket and s3-test2 directory, the first time, everything works fine and both subdir1 and subdir2 get downloaded into the assets directory. If I re-run the code (without first deleting the assets directory), subdir1 goes missing.

[Screenshot: 2014-11-03, 10:54:00 pm]

Hope that helps clear things up!
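
The difference between the two scenarios can be sketched as follows, assuming that creating a folder in the S3 management console stores a zero-byte object whose key ends in `/`. The key listings below are illustrative guesses (the real listings were not posted), and `isDirectoryPlaceholder` and `placeholders` are hypothetical helpers, not part of this library.

```javascript
// Scenario 1: the whole directory was uploaded, so only file keys exist.
var uploadedWhole = [
  "s3-test/subdir1/file1.txt",
  "s3-test/subdir2/file2.txt"
];

// Scenario 2: the folder was created in the console first, which is assumed
// to leave a placeholder object whose key equals the prefix itself.
var createdInConsole = [
  "s3-test2/",
  "s3-test2/subdir1/file1.txt",
  "s3-test2/subdir2/file2.txt"
];

// Treat a key as a directory placeholder when it ends with "/".
function isDirectoryPlaceholder(key) {
  return key.charAt(key.length - 1) === "/";
}

// Return only the placeholder keys from a listing.
function placeholders(keys) {
  return keys.filter(isDirectoryPlaceholder);
}

console.log(placeholders(uploadedWhole));    // prints []
console.log(placeholders(createdInConsole)); // prints [ 's3-test2/' ]
```

Only the second listing contains a key that is not a file, which is the one structural difference between the working and failing cases.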

andrewrk added the bug label Nov 11, 2014
@rousseauo

I got the same problem... +1 for finding the working/failing cases... Wasn't easy.
Any solutions yet? @brzpegasus

OmgImAlexis modified the milestone: Backlog Aug 19, 2017
@zdila

zdila commented Oct 27, 2017

I can reproduce the issue. Files on S3 were created manually.

@nealshail

nealshail commented Mar 8, 2018

I am also experiencing this error scenario now. I am noticing that when you create the scenario on S3 by hand, S3 creates 'directory' objects for empty directories. It seems this might be causing an error in the special-case handling of directory matching in checkDoMoreWork(). Fixed in the pull request linked below.
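
As a rough illustration of how the 'directory' objects described above could be kept out of the deleteRemoved comparison, here is a minimal sketch. It is not the library's checkDoMoreWork() and not the change in the linked pull request; `remoteFilePaths`, `localFilesToDelete` and the sample data are hypothetical.

```javascript
// Map remote keys under `prefix` to relative file paths, skipping directory
// placeholder keys (keys ending in "/"), including the prefix key itself.
function remoteFilePaths(keys, prefix) {
  return keys
    .filter(function (key) {
      return key.indexOf(prefix) === 0 && key.charAt(key.length - 1) !== "/";
    })
    .map(function (key) {
      return key.slice(prefix.length);
    });
}

// A local file is a deletion candidate only when no remote file maps to it.
function localFilesToDelete(localFiles, keys, prefix) {
  var remote = new Set(remoteFilePaths(keys, prefix));
  return localFiles.filter(function (relPath) {
    return !remote.has(relPath);
  });
}

// Sample data: two placeholder keys plus two real files on the remote side,
// and one stale file that exists only locally.
var keys = [
  "s3-test2/",
  "s3-test2/subdir1/",
  "s3-test2/subdir1/file1.txt",
  "s3-test2/subdir2/file2.txt"
];
var localFiles = ["subdir1/file1.txt", "subdir2/file2.txt", "stale.txt"];

console.log(localFilesToDelete(localFiles, keys, "s3-test2/")); // prints [ 'stale.txt' ]
```

Comparing files only, and never directory entries, means a placeholder key cannot cause a populated local directory to be treated as removed.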

nealshail linked a pull request Mar 8, 2018 that will close this issue