Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CUMULUS-3960: Updated PostToCmr task to be able to republish granules #3906

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

jennyhliu
Copy link
Contributor

@jennyhliu jennyhliu commented Jan 21, 2025

Summary: Summary of changes

Addresses CUMULUS-3960: Update CMR metadata with new collection info

Changes

  • Updated PostToCmr task to be able to republish granules

Testing

  • Verified that PostToCmr can republish granule:
    • Added a workflow with the RepublishTest:
        "RepublishTest": {
        "Parameters": {
          "cma": {
            "event.$": "$",
            "ReplaceConfig": {
              "FullMessage": true
            },
            "task_config": {
              "bucket": "{$.meta.buckets.internal.name}",
              "stack": "{$.meta.stack}",
              "cmr": "{$.meta.cmr}",
              "launchpad": "{$.meta.launchpad}",
              "etags": "{$.meta.file_etags}",
              "concurrency": 10,
              "republish": true
            }
          }
        },
        "Type": "Task",
        "Resource": "${post_to_cmr_task_arn}",
        "Next": "BackupGranulesToLzards",
        "Retry": [
          {
            "ErrorEquals": [
              "Lambda.ServiceException",
              "Lambda.AWSLambdaException",
              "Lambda.SdkClientException"
            ],
            "IntervalSeconds": 2,
            "MaxAttempts": 6,
            "BackoffRate": 2
          }
        ]
      },
    
    • Updated spec/parallel/orca/OrcaBackupAndRecoverySpec.js to ingest a granule only (xdescribe any tests except S3 Ingest Granules, comment out the cleanup code)
    • Got the output cumulus message in s3 from IngestAndPublishGranuleWithOrca workflow execution.
    • Manually updated .cmr.xml in s3 to have collection ShortName MOD09GQ-> MOD09GQ-AZ
    • Ran the workflow with the cumulus message we got from IngestAndPublishGranuleWithOrca with etags removed. We can update etags as well.
    • Verified CMR granule has updated collection ShortName: https://cmr.uat.earthdata.nasa.gov/search/granules.umm_json?provider=CUMULUS&granule_ur=MOD09GQ.A5513498.Nji_76.006.6698559176437
      Note: The cmr concept-id remains the same in my test

PR Checklist

  • Update CHANGELOG
  • Unit tests
  • Ad-hoc testing - Deploy changes and test manually
  • Integration tests

.eslintrc.js Outdated
@@ -62,6 +62,7 @@ module.exports = {
},
globals: {
JSX: true,
AggregateError: false,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to resolve eslint error error 'AggregateError' is not defined. Later versions of eslint have resolved the error.

const results = await pMap(
updatedCMRFiles,
(cmrFile) => publish2CMR(cmrFile, cmrSettings, cmrRevisionId),
{ concurrency }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stopOnError: true is default, if set to false, it will change current lambda behavior

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this even be set by a user or someone? it doesnt seem defined anyways

@@ -5,6 +5,6 @@
],
"statements": 94.0,
"functions": 88.0,
"branches": 97.0,
"branches": 88.0,
Copy link
Contributor Author

@jennyhliu jennyhliu Jan 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the handler function is not covered as before, not sure how to improve this.

@jennyhliu jennyhliu marked this pull request as ready for review January 29, 2025 15:10
Copy link
Contributor

@Nnaga1 Nnaga1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some questions/comments I had, one or two small NITs, once answered + some local testing I'll approve 🙌

@@ -35,6 +35,16 @@ const {

const log = new Logger({ sender: '@cumulus/cmrjs/src/cmr-utils' });

/**
* @typedef {{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 better than having a {provider, clientId, username, password....} everywhere

@@ -233,6 +243,17 @@ async function publish2CMR(cmrPublishObject, creds, cmrRevisionId) {
throw new Error(`invalid cmrPublishObject passed to publis2CMR ${JSON.stringify(cmrPublishObject)}`);
}

/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason why this wasn't here before? did we never have the case of having to remove from CMR until now or was there just another way to do it (via API or something)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mainly for my curiousity I guess

@@ -25,6 +25,8 @@ Config object fields:
| process | string | (required) | Process the granules went through
| stack | string | (required) | Name of deployment stack
| cmr | object | (required) | CMR credentials object
| concurrency | number | 20 | Maximum concurrency of requests to CMR
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is 20/false the default, maybe you could say:

| concurrency | (optional) | Maximum concurrency of requests to CMR, defaults to 20

or something like that, just a nit tho

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other task puts default value to this column.
https://github.com/nasa/cumulus/blob/master/tasks/files-to-granules/README.md

* @throws {Error} - Error from CMR request
*/
async function removeGranuleFromCmr({ granules, cmrSettings, concurrency }) {
const granulesToUnpublish = granules.filter((granule) => granule.published || !!granule.cmrLink);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its considered published if it doesnt have a granule.cmrLink ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory published granule should have cmrLink, but we should be able to delete the granule from cmr if we don't have the link

const granulesToUnpublish = granules.filter((granule) => granule.published || !!granule.cmrLink);
await pMap(
granulesToUnpublish,
(granule) => removeFromCMR(granule.granuleId, cmrSettings),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

const results = await pMap(
updatedCMRFiles,
(cmrFile) => publish2CMR(cmrFile, cmrSettings, cmrRevisionId),
{ concurrency }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this even be set by a user or someone? it doesnt seem defined anyways

cmrClient.CMR.prototype.ingestGranule.restore();
});

await s3PutObject({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get the other calls/functions but what does this s3PutObject do in the case of the test?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants