store full original path as archival record #178
Comments
Any document can have an arbitrary number of filenames associated with it in its metadata.
Cool, and thanks. After we run the SHA process, can we collect all the different names used across every copy of the same file?
We can easily run a process that deletes the duplicate files and collapses the documents down to one with the other filenames recorded within it. If we do that, the hard bit will be deciding which categorization we keep.
(1) seems to be the logical choice, although it does not consider the possibility that other metadata could differ (e.g. through a spreadsheet import). If that is the case, then both metadata blocks should probably remain (as an optimization, they could be modified to point at the same actual S3 file).
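As a rough illustration of the collapse discussed above, here is a minimal sketch, not the project's actual code. It assumes each document is a plain dict with hypothetical `content`, `filenames`, and `metadata` keys, and it assumes SHA-256 as the hash (the thread only says "sha"). Documents with identical content and identical metadata are merged into one record carrying all filenames; documents whose metadata differs keep separate records that share the same digest, which stands in for pointing at the same stored file.

```python
import hashlib
from collections import defaultdict


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of the file content (assumed hash choice)."""
    return hashlib.sha256(data).hexdigest()


def collapse_duplicates(documents):
    """Collapse documents with identical content.

    Each input document is a dict with hypothetical keys:
    'content' (bytes), 'filenames' (list of str), 'metadata' (dict).
    """
    # Group documents by content digest.
    by_digest = defaultdict(list)
    for doc in documents:
        by_digest[sha256_of(doc["content"])].append(doc)

    collapsed = []
    for digest, group in by_digest.items():
        merged = []  # one record per distinct metadata block for this digest
        for doc in group:
            for record in merged:
                if record["metadata"] == doc["metadata"]:
                    # Same content and same metadata: just record the extra names.
                    for name in doc["filenames"]:
                        if name not in record["filenames"]:
                            record["filenames"].append(name)
                    break
            else:
                # Metadata differs (or first occurrence): keep a separate block
                # that refers to the same content via the shared digest.
                merged.append({
                    "sha256": digest,
                    "filenames": list(doc["filenames"]),
                    "metadata": dict(doc["metadata"]),
                })
        collapsed.extend(merged)
    return collapsed
```

Deciding which categorization to keep when metadata conflicts is deliberately left out of the sketch; it simply retains every distinct metadata block.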
Thanks, Paul. Option 1, with multiple metadata files where the information conflicts, sounds like the way forward.
The full original path is useful as an archival record. Otherwise, we would store directory names as attribute tags and perhaps eventually change the folder structure.
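A minimal sketch of what recording both could look like, assuming POSIX-style paths and a hypothetical record layout (`original_path`, `filename`, `tags` are illustrative names, not fields from this project). The full path is kept verbatim as the archival record, while directory names are also exposed as tags so the folder structure can change later without losing the original location.

```python
from pathlib import PurePosixPath


def path_record(original_path: str) -> dict:
    """Build a metadata fragment from a file's original path (hypothetical layout)."""
    path = PurePosixPath(original_path)
    return {
        # Stored unchanged so the original location survives any reorganisation.
        "original_path": original_path,
        "filename": path.name,
        # Directory names as tags; the root "/" is not a meaningful tag.
        "tags": [part for part in path.parent.parts if part != "/"],
    }
```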