Create Batch Job to Reingest with Preservica #2510

Open
8 tasks
K8Sewell opened this issue Jun 7, 2023 · 5 comments

K8Sewell commented Jun 7, 2023

Summary

Add a batch process that takes an existing parent object and replaces its child objects’ images with images from Preservica.

Acceptance Criteria

  • Test current feature work in UAT

  • Discuss and finalize acceptance criteria of this ticket

  • CSV will contain the parent object OID, digital object source, Preservica URI, and Preservica representation type

  • Will update the parent object with the new Preservica information

  • Will maintain each child object’s original OID, caption, and label

  • Will persist each child object’s order from Preservica, overwriting the original order

  • Will persist each child object’s checksum from Preservica, overwriting the original checksum

  • Will remove existing child objects that do not have Preservica information
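
As a rough illustration of the CSV input described above, the following sketch validates that each row carries the four values the criteria list. The column header names (`oid`, `digital_object_source`, `preservica_uri`, `preservica_representation_type`) and the function name `read_reingest_rows` are assumptions made for this example, not names confirmed by this ticket or by the draft PR.

```python
import csv
import io

# Assumed header names; the ticket lists the four fields but not their exact spelling.
REQUIRED_COLUMNS = (
    "oid",
    "digital_object_source",
    "preservica_uri",
    "preservica_representation_type",
)


def read_reingest_rows(csv_text):
    """Parse the batch CSV and return one dict per parent object.

    Raises ValueError if a required column is missing from the header
    or if any row leaves a required value blank.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        raise ValueError("CSV is missing required columns: " + ", ".join(missing))

    rows = []
    # Data starts on line 2 because line 1 is the header.
    for line_number, row in enumerate(reader, start=2):
        blank = [c for c in REQUIRED_COLUMNS if not (row[c] or "").strip()]
        if blank:
            raise ValueError(
                f"Row {line_number} has blank values for: " + ", ".join(blank)
            )
        rows.append({c: row[c].strip() for c in REQUIRED_COLUMNS})
    return rows
```

Rejecting the whole file on the first bad row is only one possible design; a real batch job might instead record a per-row failure and continue with the remaining parents.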

Engineering Notes

Like the sync process, this may require a subsequent ‘Recreate Child Ptiffs’ batch process before the change takes effect in Blacklight.
Draft PR - yalelibrary/yul-dc-management#1205

Original Acceptance Outline from ticket #2168

  1. The job takes as input a CSV with:
      • a DCS parent OID and its new Preservica URI
      • optionally, this contains a list of DCS child OIDs and the new checksum to match against in Preservica
  2. Updates a DCS parent to point to a new Preservica URI provided in the input CSV
  3. Updates the Preservica checksum of each child OID provided in the input CSV
  4. Pulls in all of the files from the new Preservica object. As this pull happens, DCS should check the checksums of the incoming files. If they match something already in DCS, the existing child record for that file should be updated to overwrite the child level Preservica URIs. However, the child oid, label and caption field for that record should not be updated
  5. For each child OID in the CSV,
      • replace the existing file in the pair tree with the new version of the file, downloaded from Preservica
      • set the height and width of the child as null to force regeneration of the PTIFFs
  6. If checksum matching fails and no exception has been provided, DCS creates a new child record with new child oid for the incoming file.
  7. If checksum for a file already in DCS is not matched against incoming files, the child record for that file in DCS is deleted or unpublished (to discuss)
  8. The children are reordered to match the order in the new Preservica object
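
The checksum-matching steps in the outline above can be sketched roughly as follows. This is a minimal in-memory model, not the DCS implementation: the `Child` and `IncomingFile` classes, the `reconcile` function, and the `next_oid` parameter are all hypothetical names introduced for illustration, and the sketch assumes checksums are unique among the children of one parent. It leaves the "deleted or unpublished" question open by simply returning the unmatched children.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Child:
    """Hypothetical stand-in for a DCS child record."""
    oid: int
    checksum: str
    label: Optional[str] = None
    caption: Optional[str] = None
    order: Optional[int] = None
    preservica_uri: Optional[str] = None


@dataclass
class IncomingFile:
    """Hypothetical stand-in for a file listed in the new Preservica object."""
    checksum: str
    preservica_uri: str
    order: int


def reconcile(
    existing: List[Child], incoming: List[IncomingFile], next_oid: int
) -> Tuple[List[Child], List[Child], List[Child]]:
    """Return (children in Preservica order, newly created, unmatched existing)."""
    # Assumes checksums are unique within one parent.
    by_checksum = {child.checksum: child for child in existing}
    ordered, created = [], []

    for incoming_file in sorted(incoming, key=lambda f: f.order):
        child = by_checksum.pop(incoming_file.checksum, None)
        if child is None:
            # No checksum match: create a new child record with a new OID.
            child = Child(oid=next_oid, checksum=incoming_file.checksum)
            next_oid += 1
            created.append(child)
        # Matched or new: overwrite Preservica URI and order only.
        # The OID, label, and caption of a matched child are left untouched.
        child.preservica_uri = incoming_file.preservica_uri
        child.order = incoming_file.order
        ordered.append(child)

    # Existing children whose checksum matched nothing incoming;
    # the outline leaves open whether these are deleted or unpublished.
    unmatched = list(by_checksum.values())
    return ordered, created, unmatched
```

Returning the unmatched children rather than deleting them inside the function keeps the "to discuss" decision in the caller.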
@motropuk

I have started work on this, but the changed package is not ingesting into Preservica. I need to keep working on a fix for that before I can test in DCS UAT.

@motropuk added this to the Preservica Integration milestone on Jun 15, 2023
@jillpe added the "waiting" (waiting on external resources) label on Jun 16, 2023
@sshetenhelm

@motropuk, was the issue you mentioned above resolved, or should we test again in Preservica before pulling this ticket into a sprint?

@motropuk

motropuk commented Jan 5, 2024

@sshetenhelm no, this was not resolved, so it will need to be tested again.

@sshetenhelm assigned sshetenhelm and unassigned motropuk on Oct 22, 2024
@sshetenhelm removed the "waiting" (waiting on external resources) label on Oct 22, 2024
@sshetenhelm

sshetenhelm commented Oct 22, 2024

Next steps:

  • Clean up branch with previous PR
  • Merge into UAT
  • Let everyone else know via Teams
  • Test in UAT (because it's hooked up to Preservica TEST)

@K8Sewell
Author

Draft PR: yalelibrary/yul-dc-management#1205
