282 watcher running with nmdc dev configuration finds jobs but no data objects for the jobs #287

mbthornton-lbl

This PR updates the workflow automation so that the Watcher can:

  • Find and process job runner data files
  • Process additional output files for MAGs
  • Handle incorrectly formatted MagBin data
  • Save checkpoints after processing failed and successful jobs
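
For reviewers who want a picture of the checkpoint step, here is a minimal sketch of the general idea, not the code in this PR. It assumes the state file is JSON with a top-level `jobs` list; `save_checkpoint` and `process_finished` are hypothetical names used only for illustration. The tracked job list is rewritten to the state file once a batch of finished jobs has been handled, so a restart does not pick them up again.

```python
import json
import os
import tempfile
from pathlib import Path


def save_checkpoint(state_file: Path, jobs: list) -> None:
    """Write the tracked jobs to the state file (hypothetical helper).

    Assumption: the state file is JSON with a top-level "jobs" list.
    The data is written to a temporary file in the same directory and
    then moved into place, so a crash never leaves a half-written file.
    """
    state_file.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=state_file.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump({"jobs": jobs}, handle, indent=2)
    os.replace(tmp_name, state_file)


def process_finished(state_file: Path, jobs: list, finished_ids: set) -> list:
    """Drop finished jobs (successful or failed) and checkpoint the rest."""
    remaining = [job for job in jobs if job["id"] not in finished_ids]
    save_checkpoint(state_file, remaining)
    return remaining
```

Calling `process_finished` once for the successful jobs and once for the failed jobs gives one checkpoint per group; whether the Watcher drops or merely flags finished jobs is an assumption of this sketch.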

Testing Results against dev:

(nmdc-automation-py3.11) (nersc-python) nmdcda@perlmutter:login29:~/nmdc_automation/dev> ./run_dev.sh
2024-11-18 14:14:07,463 INFO: Config file: /global/homes/n/nmdcda/nmdc_automation/dev/site_configuration_nersc.toml
2024-11-18 14:14:07,464 INFO: Using state file from config: /global/cfs/cdirs/m3408/var/dev/agent.state
2024-11-18 14:14:07,464 INFO: Reading state from /global/cfs/cdirs/m3408/var/dev/agent.state
2024-11-18 14:14:07,490 INFO: Restored 351 jobs from state
2024-11-18 14:14:07,490 INFO: Restoring 351 jobs from state.
2024-11-18 14:14:08,741 INFO: Entering polling loop
2024-11-18 14:14:08,742 INFO: Reading state from /global/cfs/cdirs/m3408/var/dev/agent.state
2024-11-18 14:14:08,753 INFO: Restored 0 jobs from state
2024-11-18 14:14:09,555 INFO: Found 0 unclaimed jobs.
2024-11-18 14:14:09,629 INFO: Found 1 successful jobs.
2024-11-18 14:14:09,629 INFO: Found 1 failed jobs.
2024-11-18 14:14:09,629 INFO: Process successful job:  nmdc:sys0m369xp60
2024-11-18 14:14:09,630 INFO: Getting job runner metadata for job 9492a397-eb30-472b-9d3b-b44b676f4652
2024-11-18 14:14:10,001 INFO: Processing output nmdc_mags.final_checkm
2024-11-18 14:14:10,001 INFO: Output file path: /pscratch/sd/n/nmdcda/cromwell-executions/nmdc_mags/9492a397-eb30-472b-9d3b-b44b676f4652/call-finish_mags/execution/nmdc_wfmag-11-g7msr323.1_checkm_qa.out
2024-11-18 14:14:10,155 INFO: Processing output nmdc_mags.final_hqmq_bins_zip
2024-11-18 14:14:10,155 INFO: Output file path: /pscratch/sd/n/nmdcda/cromwell-executions/nmdc_mags/9492a397-eb30-472b-9d3b-b44b676f4652/call-finish_mags/execution/nmdc_wfmag-11-g7msr323.1_hqmq_bin.zip
2024-11-18 14:14:10,226 INFO: Processing output nmdc_mags.final_gtdbtk_bac_summary
2024-11-18 14:14:10,226 INFO: Output file path: /pscratch/sd/n/nmdcda/cromwell-executions/nmdc_mags/9492a397-eb30-472b-9d3b-b44b676f4652/call-finish_mags/execution/nmdc_wfmag-11-g7msr323.1_gtdbtk.bac122.summary.tsv
2024-11-18 14:14:10,229 INFO: Processing output nmdc_mags.final_gtdbtk_ar_summary
2024-11-18 14:14:10,229 INFO: Output file path: /pscratch/sd/n/nmdcda/cromwell-executions/nmdc_mags/9492a397-eb30-472b-9d3b-b44b676f4652/call-finish_mags/execution/nmdc_wfmag-11-g7msr323.1_gtdbtk.ar122.summary.tsv
2024-11-18 14:14:10,232 INFO: Processing output nmdc_mags.mags_version
2024-11-18 14:14:10,232 INFO: Output file path: /pscratch/sd/n/nmdcda/cromwell-executions/nmdc_mags/9492a397-eb30-472b-9d3b-b44b676f4652/call-finish_mags/execution/nmdc_wfmag-11-g7msr323.1_bin.info
2024-11-18 14:14:10,234 INFO: Found 5 data objects for job nmdc:sys0m369xp60
2024-11-18 14:14:11,466 INFO: Created workflow execution record for job nmdc:sys0m369xp60
2024-11-18 14:14:14,315 INFO: Database object validated for job nmdc:sys0m369xp60
2024-11-18 14:14:16,420 INFO: Posted objects response: {'message': 'jobs accepted'}
2024-11-18 14:14:17,942 INFO: Updated operation nmdc:sys0m369xp60 response {'id': 'nmdc:sys0m369xp60', 'done': True, 'expire_time': '2024-10-16T19:33:30.785000', 'result': None,


aclum left a comment

The import config YAML also needs its MAGs version bumped. Everything else looks good.

mbthornton-lbl requested a review from aclum November 18, 2024 23:00
mbthornton-lbl merged commit f6e4634 into main Nov 18, 2024
1 check passed
mbthornton-lbl deleted the 282-watcher-running-with-nmdc-dev-configuration-finds-jobs-but-no-data-objects-for-the-jobs branch November 18, 2024 23:15
Successfully merging this pull request may close these issues.

Watcher running with nmdc-dev configuration finds jobs but no data objects for the jobs