Sync does not remove deleted files from s3 #94

Open
coen-hyde opened this issue Aug 18, 2013 · 11 comments

Comments

@coen-hyde
Contributor

Sync will upload new and changed files but will not delete files that were previously uploaded to s3 and have since been removed locally.

@geedew
Contributor

geedew commented Aug 18, 2013

Do we really need/want something that will automatically delete files? I for one know that I would stop using that code entirely if that were in place. Web files are cheap, not having them there for consumption is expensive. Are there good use cases for it? Can we make sure it has an off switch when using sync?

@coen-hyde
Contributor Author

It's certainly a bit of a dangerous feature and it should be off by default. I am maintaining a large number of assets on s3. Changes are mostly additions but sometimes they are also deletes. Having a unified way to maintain the assets would be good.

@geedew
Contributor

geedew commented Aug 18, 2013

I'll take a stab at it. My interest is piqued. I'm wondering if it would be better to do a whitelist with rules, rather than a delete:on option.

For instance:
del: {
  files: [ '/only/these/*/files/.js' ], // can be deleted
  between: [ null, new Date(new Date().setDate(new Date().getDate() - 10)) ]
}

So a file is only deleted if it matches files and its date falls inside the between window, here from the beginning of time up to 10 days ago (nothing from the last 10 days is deleted). The delete functionality would obviously be updated to use this, so that sync can take advantage of it.
The hardest part is really just knowing which files are on S3 to delete. I don't think that's possible with Knox, so it will require the AWS lib.

http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/frames.html
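
For illustration, the whitelist rule proposed above could be evaluated roughly as follows. This is only a sketch under stated assumptions: the object shape (Key, LastModified) mirrors what knox documents for list results, while globToRegExp and deletable are hypothetical helper names that exist in neither knox nor this project. The glob handling is deliberately minimal ('*' matches within a single path segment).

```javascript
// Hypothetical helper (not part of knox or this project): turn a simple glob
// into a RegExp. '*' matches any run of characters except '/'.
function globToRegExp(glob) {
  var escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '[^/]*') + '$');
}

// Hypothetical helper: return the keys that the rule allows to be deleted.
// objects: [{ Key: string, LastModified: Date }]
// rule:    { files: [glob, ...], between: [startOrNull, endOrNull] }
function deletable(objects, rule) {
  var patterns = rule.files.map(globToRegExp);
  var from = rule.between[0];
  var to = rule.between[1];
  return objects.filter(function (obj) {
    var matches = patterns.some(function (re) { return re.test(obj.Key); });
    // A null bound means "unbounded" on that side of the window.
    var afterStart = from === null || obj.LastModified >= from;
    var beforeEnd = to === null || obj.LastModified <= to;
    return matches && afterStart && beforeEnd;
  }).map(function (obj) { return obj.Key; });
}
```

Anything modified after the upper bound is left alone, which is what keeps recently uploaded files safe from deletion.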

@coen-hyde
Contributor Author

By the way, thanks for implementing the initial sync functionality. For my use case the above filters would complicate the solution, though they may be useful to someone else. I would be interested to hear the opinions of other people using this project. To me, 'sync' implies making whatever changes are necessary to the files stored on s3 so that they reflect the current state of the local files (PUT'ing and DELETE'ing).

We could use https://github.com/segmentio/s3-lister. It uses a knox client to implement a streaming interface for listing a bucket. Though it probably makes sense to build this on top of AWS's node sdk.
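
As a rough sketch of the 'sync' semantics described above, the set of PUTs and DELETEs can be planned from two lists: the local paths and the keys currently in the bucket. The names planSync and removeDeleted are hypothetical and not an existing option of this project. The sketch only covers new and removed files; detecting changed files would need an additional comparison that is not shown here.

```javascript
// Hypothetical planner: decide what to upload and what to delete.
// localPaths: relative file paths on disk; remoteKeys: keys in the bucket.
function planSync(localPaths, remoteKeys, options) {
  // Object.create(null) avoids false hits on inherited names like 'constructor'.
  var local = Object.create(null);
  localPaths.forEach(function (p) { local[p] = true; });
  var remote = Object.create(null);
  remoteKeys.forEach(function (k) { remote[k] = true; });

  // Assumed opt-in flag; deletion stays off unless explicitly requested.
  var removeDeleted = Boolean(options && options.removeDeleted);

  return {
    upload: localPaths.filter(function (p) { return !remote[p]; }),
    remove: removeDeleted
      ? remoteKeys.filter(function (k) { return !local[k]; })
      : []
  };
}
```

Keeping the deletions behind an explicit flag matches the earlier point that this behaviour should be off by default.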

@geedew
Contributor

geedew commented Aug 18, 2013

It is actually possible to get a list of files in a bucket with knox, so it should be very possible to add this in.

client.list({ prefix: 'my-prefix' }, function(err, data){ /* data.Contents lists the matching objects */ });

@coen-hyde
Contributor Author

The only problem is that it can only list 1000 files at a time, so some paging functionality would have to be implemented.
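
The paging mentioned here could look something like the sketch below. It assumes that the client's list(params, callback) forwards a marker parameter to S3 and that the response exposes IsTruncated and Contents[].Key, as in knox's documented list output; listAllKeys itself is a hypothetical helper, not part of knox. Each request resumes after the last key of the previous page.

```javascript
// Hypothetical helper: collect every key under a prefix, one page at a time.
// client is assumed to expose knox's list(params, callback) method.
function listAllKeys(client, prefix, done) {
  var keys = [];

  function page(marker) {
    var params = { prefix: prefix };
    if (marker) params.marker = marker; // resume after the last key seen

    client.list(params, function (err, data) {
      if (err) return done(err);
      var contents = data.Contents || [];
      contents.forEach(function (obj) { keys.push(obj.Key); });

      // Keep going only while S3 reports more results and we made progress.
      if (data.IsTruncated && contents.length > 0) {
        page(contents[contents.length - 1].Key);
      } else {
        done(null, keys);
      }
    });
  }

  page(null);
}
```

The resulting key list is what a delete step would compare against the local files.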

@geedew
Contributor

geedew commented Aug 18, 2013

One thing at a time. Getting the first 1000 to work first would be a great step forward :)

@coen-hyde
Contributor Author

Yes, it would :)

@wclr

wclr commented Jan 17, 2014

+1. I think this feature should be added so that a whole folder can be synced: upload files that are not in the bucket yet and delete objects that no longer exist on the file system. A helper such as s3.list would also be useful for composing custom tasks.

@dgil

dgil commented Apr 10, 2014

+1. It would be great to delete objects that no longer exist on the filesystem.

@andrewboni

+1 for this as well
