Segment large batch processes #2873

Open
9 tasks
K8Sewell opened this issue Jun 25, 2024 · 0 comments

Story
As described in a comment in #2859, a job sometimes times out and fails before all records in a CSV are processed, which causes some jobs to run multiple times. We would like to change the batch process behavior so that CSVs are processed in segments of 50 rows at a time, which should prevent the process from timing out and re-running.

This behavior should be applied to the following batch processes:

  • DeleteParentObjects
  • GenerateManifest
  • GeneratePdf
  • GeneratePtiff
  • ReassociateChildOids
  • RecreateChildOidPtiffs
  • SetupMetadata
  • SolrIndex
  • UpdateParentObjects
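
For illustration only, here is a minimal sketch of the segmentation described above. It is written in Python rather than in this project's code, and the names `segments` and `segment_csv` are hypothetical. It assumes the CSV has a header row and splits the remaining rows into consecutive groups of at most 50.

```python
import csv
import io
from itertools import islice
from typing import Iterable, Iterator, List

SEGMENT_SIZE = 50  # segment size named in this issue


def segments(rows: Iterable[dict], size: int = SEGMENT_SIZE) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most `size` rows (hypothetical helper)."""
    iterator = iter(rows)
    while True:
        segment = list(islice(iterator, size))
        if not segment:
            return  # no rows left
        yield segment


def segment_csv(csv_text: str, size: int = SEGMENT_SIZE) -> List[List[dict]]:
    """Parse CSV text with a header row and return its rows grouped into segments."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return list(segments(reader, size))
```

Under these assumptions, a CSV with 120 data rows yields three segments of 50, 50, and 20 rows, and each segment could be handed to its own unit of work.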

Acceptance
The following jobs run in segments of 50 rows until completion:

  • DeleteParentObjects
  • GenerateManifest
  • GeneratePdf
  • GeneratePtiff
  • ReassociateChildOids
  • RecreateChildOidPtiffs
  • SetupMetadata
  • SolrIndex
  • UpdateParentObjects
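
One way to read "run in segments of 50 rows until completion" is that each job handles a single segment and then enqueues the next one. The sketch below illustrates that reading in Python. The in-memory queue and the names `run_in_segments` and `process_row` are assumptions for illustration, not this project's job or queue API.

```python
from collections import deque
from typing import Callable, Sequence

SEGMENT_SIZE = 50  # segment size named in this issue


def run_in_segments(
    rows: Sequence[str],
    process_row: Callable[[str], None],
    size: int = SEGMENT_SIZE,
) -> int:
    """Process `rows` one segment per simulated job; return how many jobs ran."""
    queue = deque([0])  # stand-in for a job queue; each entry is a start offset
    jobs_run = 0
    while queue:
        start = queue.popleft()
        # One "job": handle only the rows in this segment.
        for row in rows[start:start + size]:
            process_row(row)
        jobs_run += 1
        # Enqueue the next segment only if rows remain.
        next_start = start + size
        if next_start < len(rows):
            queue.append(next_start)
    return jobs_run
```

Because each simulated job touches at most 50 rows, a timeout or retry in this design would repeat one segment rather than the whole CSV.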

Engineering Notes
Jobs with existing batching patterns to pull from:

  • SolrReindexAll
  • UpdateAllMetadata
  • UpdateDigitalObjects
  • UpdateManifests

sshetenhelm added this to the Batch Process Refactoring milestone Jul 1, 2024
sshetenhelm changed the title from "[NEEDS EDITING] Segment large batch processes" to "[Segment large batch processes" Jul 1, 2024
sshetenhelm changed the title from "[Segment large batch processes" to "Segment large batch processes" Jul 1, 2024
jpengst self-assigned this Jul 11, 2024