[SPIKE] Batch process to point existing DCS parent to reingested object in Preservica #2168

Closed
1 of 2 tasks
motropuk opened this issue Jul 28, 2022 · 6 comments


motropuk commented Jul 28, 2022

Story

As a collection owner, I often need to make multiple changes to an existing object in DCS that is connected to Preservica: adding, replacing, deleting or re-sorting files. While it is possible to do this one image at a time in Preservica, when multiple changes are needed, best practice in Preservica is to download the existing object to your local machine, make the necessary changes, and reingest the package as a new object in Preservica.

Once this is done, as a collection owner, I would then need to repoint the existing parent in DCS to the new object in Preservica by replacing the existing Preservica URI for that DCS parent with the new Preservica URI.

However, when pulling in the new files from the new Preservica object, I would not want to lose the existing DCS child OIDs, labels or captions for files that were already in DCS.

A suggested approach is to create a new batch process that:

  1. Takes as input a CSV with:
    1. a DCS parent OID and its new Preservica URI
    2. optionally, a list of DCS child OIDs and, for each, the new checksum to match against in Preservica
  2. Updates the DCS parent to point to the new Preservica URI provided in the input CSV
  3. Updates the Preservica checksum of each child OID provided in the input CSV
  4. Pulls in all of the files from the new Preservica object. As this pull happens, DCS should check the checksums of the incoming files. If a checksum matches a file already in DCS, the existing child record for that file should be updated to overwrite its child-level Preservica URIs. However, the child OID, label and caption fields for that record should not be updated
  5. For each child OID in the CSV:
    1. replaces the existing file in the pair tree with the new version of the file, downloaded from Preservica
    2. sets the height and width of the child to null to force regeneration of the PTIFFs
  6. Creates a new child record with a new child OID for an incoming file if checksum matching fails and no exception has been provided
  7. Deletes or unpublishes (to discuss) the child record for any file already in DCS whose checksum is not matched against the incoming files
  8. Reorders the children to match the order in the new Preservica object
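
To make the matching rules in steps 3 to 8 easier to discuss, here is a minimal Python sketch of the reconciliation logic only. It is not DCS code: the `Child` record, the CSV column names (`child_oid`, `new_checksum`), and the `reconcile` function are all hypothetical, and fetching files from Preservica, writing to the pair tree, and PTIFF regeneration are left out. It also assumes checksums are unique within an object, so two identical files would collide.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional


@dataclass
class Child:
    """Hypothetical stand-in for a DCS child record."""
    oid: Optional[int]  # None until DCS assigns a new child OID
    checksum: str
    preservica_uri: str
    label: str = ""
    caption: str = ""


def read_overrides(csv_text):
    """Map child OID -> new checksum from the optional CSV columns.

    The column names are assumptions made for this sketch.
    """
    overrides = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row.get("child_oid") and row.get("new_checksum"):
            overrides[int(row["child_oid"])] = row["new_checksum"]
    return overrides


def reconcile(existing, incoming, overrides):
    """Match incoming Preservica files to existing children by checksum.

    existing:  list of Child records currently in DCS
    incoming:  list of (checksum, preservica_uri) in Preservica order
    overrides: child OID -> new checksum, as read from the CSV

    Returns (ordered, replaced, unmatched):
      ordered   - children in the new Preservica order (step 8)
      replaced  - OIDs whose file must be replaced and whose
                  height/width must be nulled (step 5)
      unmatched - existing children with no incoming match (step 7)
    """
    by_checksum = {}
    for child in existing:
        # Step 3: apply the checksum supplied in the CSV, if any.
        if child.oid in overrides:
            child.checksum = overrides[child.oid]
        by_checksum[child.checksum] = child

    ordered, replaced = [], []
    for checksum, uri in incoming:
        child = by_checksum.pop(checksum, None)
        if child is None:
            # Step 6: no match, so a new child record is needed.
            child = Child(oid=None, checksum=checksum, preservica_uri=uri)
        else:
            # Step 4: overwrite the URI; keep OID, label and caption.
            child.preservica_uri = uri
            if child.oid in overrides:
                replaced.append(child.oid)
        ordered.append(child)

    return ordered, replaced, list(by_checksum.values())
```

Whether the children returned as `unmatched` are deleted or unpublished is the open question in step 7; the sketch only identifies them.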

Acceptance

@motropuk motropuk changed the title Batch process to point existing DCS parent to reingested object in Preservica [SPIKE] Batch process to point existing DCS parent to reingested object in Preservica Jul 28, 2022
@motropuk
Author

Set up meeting with Josh, Jon and Summer

@motropuk
Author

Revisit tomorrow (Weds) and have a discussion after standup

@motropuk
Author

Discussion started today. Work on #2193 and then return to this discussion to consider whether the proposed steps work.

@motropuk
Author

David Cirella has created a Python script that outputs all of the checksums for a Preservica object based on parent ID. It's here: https://git.yale.edu/dec69/bitstream_checksums_v6.py (you need to be on the Yale network to access it). He will turn this into a desktop application in time, but for now we could use it to get the output needed when working on this ticket.
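
For comparing that output against files downloaded locally, a checksum can be computed with Python's standard `hashlib`. This is a small sketch and not part of David's script: the function names are made up for illustration, and the default algorithm (`sha512`) is an assumption, so it should be set to whatever algorithm the Preservica checksums actually use.

```python
import hashlib
import io


def stream_checksum(stream, algorithm="sha512", chunk_size=1024 * 1024):
    """Return the hex digest of a binary stream, read in chunks.

    The default algorithm is an assumption; pass the one that
    matches the checksums reported for the Preservica object.
    """
    digest = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def file_checksum(path, algorithm="sha512"):
    """Return the hex digest of the file at the given path."""
    with open(path, "rb") as handle:
        return stream_checksum(handle, algorithm)


# Example with in-memory bytes standing in for a downloaded file.
sample = io.BytesIO(b"example image bytes")
print(stream_checksum(sample, "sha256"))
```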


K8Sewell commented Jun 7, 2023

Ticket for review - #2510

@sshetenhelm

Jon has stated that ticket #2510 looks good!
