[Bug]: DISP-S1 reprocessing with date range doesn't retrieve k-granules sometimes. Also processes extra frame that it shouldn't be #964

philipjyoon · 2024-08-23T00:25:26Z

Checked for duplicates

Yes - I've already checked

Describe the bug

When running a CSLC query (for DISP-S1 processing) in reprocessing mode, specifying a date range (as opposed to a native_id) sometimes fails to retrieve the k granules correctly: it submits download jobs with just one batch_id even when k is larger than 1. This seems to coincide with the query job submitting a download for an adjacent frame, which it shouldn't submit at all, but which surprisingly does get the correct k granules. The two symptoms are probably related.
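A quick way to spot the first symptom is to group the submitted download jobs by frame and count distinct batch_ids. This is only a sketch: the `(frame_id, batch_id)` pair layout and the function name `frames_missing_k` are hypothetical and are not taken from opera-pcm.

```python
from collections import defaultdict


def frames_missing_k(jobs, k):
    """Return {frame_id: n_batches} for frames with fewer than k distinct batch_ids.

    `jobs` is an iterable of (frame_id, batch_id) pairs. This layout is an
    assumption made for illustration, not the actual job record format.
    """
    batches = defaultdict(set)
    for frame_id, batch_id in jobs:
        batches[frame_id].add(batch_id)
    return {frame: len(ids) for frame, ids in batches.items() if len(ids) < k}


# Illustrative data only: one frame got a single batch_id, the other got four.
example_jobs = [
    ("frame_A", "batch_1"),
    ("frame_B", "batch_1"),
    ("frame_B", "batch_2"),
    ("frame_B", "batch_3"),
    ("frame_B", "batch_4"),
]
short_frames = frames_missing_k(example_jobs, k=4)
```

With `--k=4`, any frame reported here would be one that was submitted with too few batch_ids.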

The issue was first found using the "Originally" command under Reproducible steps below. I was able to come up with a simpler test case, listed there as "Simplified".

However, not every date range reprocessing run (with or without the optional frame-id) exhibits this bug. The following two commands work correctly:

python ~/mozart/ops/opera-pcm/data_subscriber/daac_data_subscriber.py query -c OPERA_L2_CSLC-S1_V1 --chunk-size=1 --k=4 --m=1 --job-queue=opera-job_worker-cslc_data_download --processing-mode=reprocessing --start-date=2024-06-05T02:00:00Z --end-date=2024-06-05T02:01:11Z
python ~/mozart/ops/opera-pcm/data_subscriber/daac_data_subscriber.py query -c OPERA_L2_CSLC-S1_V1 --chunk-size=1 --k=4 --m=1 --job-queue=opera-job_worker-cslc_data_download --processing-mode=reprocessing --start-date=2024-06-05T02:00:00Z --end-date=2024-06-05T02:01:11Z --frame-id=34481

What may be happening: when one random granule is picked out of a date range query and expanded to the full frame for reprocessing, the bug surfaces if that granule happens to be one of the 6 (out of 27 total) bursts that belong to more than one frame. There is probably logic somewhere in the code that, correctly, ensures a single native_id reprocessing request produces only one frame download job. This can be easily tested by running reprocessing with the burst native_id that was chosen to represent the date range reprocessing case.
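The hypothesis can be sketched as follows. This is not the opera-pcm code: the mapping, the IDs, and the function name `frames_to_submit` are all hypothetical, chosen only to show how a burst shared by two frames could pull in an adjacent frame, and how filtering on the requested frame-id would avoid that.

```python
# Hypothetical burst -> frames mapping. Most bursts belong to one frame,
# but an overlap burst belongs to two adjacent frames.
BURST_TO_FRAMES = {
    "burst_inner": ["frame_A"],
    "burst_shared": ["frame_A", "frame_B"],
}


def frames_to_submit(burst_id, burst_to_frames, frame_id=None):
    """Expand one representative burst to the frames that would be processed.

    If frame_id is given, keep only that frame (an assumed fix, for
    illustration); otherwise every frame containing the burst is returned.
    """
    frames = burst_to_frames[burst_id]
    if frame_id is not None:
        frames = [f for f in frames if f == frame_id]
    return list(frames)


# Picking the shared burst without a filter pulls in the adjacent frame.
unfiltered = frames_to_submit("burst_shared", BURST_TO_FRAMES)
# Restricting to the requested frame leaves exactly one frame.
filtered = frames_to_submit("burst_shared", BURST_TO_FRAMES, frame_id="frame_A")
```

Under this assumption, whether the bug appears would depend on which burst is picked to represent the date range, which would be consistent with it only showing up sometimes.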

What did you expect?

n/a

Reproducible steps

Originally:
python ~/mozart/ops/opera-pcm/data_subscriber/daac_data_subscriber.py query -c OPERA_L2_CSLC-S1_V1 --chunk-size=1 --k=4 --m=1  --job-queue=opera-job_worker-cslc_data_download --processing-mode=reprocessing --start-date=2024-06-05T02:00:00Z --end-date=2024-06-05T02:30:00Z

Simplified:
python ~/mozart/ops/opera-pcm/data_subscriber/daac_data_subscriber.py query -c OPERA_L2_CSLC-S1_V1 --chunk-size=1 --k=4 --m=1  --job-queue=opera-job_worker-cslc_data_download --processing-mode=reprocessing --start-date=2024-06-05T02:00:00Z --end-date=2024-06-05T02:30:11Z --frame-id=34996

Environment

- Version of this software [e.g. vX.Y.Z]
- Operating System: [e.g. MacOSX with Docker Desktop vX.Y]
...


philipjyoon · 2024-08-23T00:30:16Z

Both of the following native-id reprocessing runs work correctly. But I still think some interaction between these native-ids and date range reprocessing is causing this bug.

python ~/mozart/ops/opera-pcm/data_subscriber/daac_data_subscriber.py query -c OPERA_L2_CSLC-S1_V1 --chunk-size=1 --k=4 --m=1 --job-queue=opera-job_worker-cslc_data_download --processing-mode=reprocessing --native-id=OPERA_L2_CSLC-S1_T131-279961-IW1_20240224T163057Z_20240605T012555Z_S1A_VV_v1.1

python ~/mozart/ops/opera-pcm/data_subscriber/daac_data_subscriber.py query -c OPERA_L2_CSLC-S1_V1 --chunk-size=1 --k=4 --m=1 --job-queue=opera-job_worker-cslc_data_download --processing-mode=reprocessing --native-id=OPERA_L2_CSLC-S1_T131-279969-IW1_20240224T163119Z_20240605T012546Z_S1A_VV_v1.1

philipjyoon · 2024-10-14T22:22:39Z

This bug was fixed as part of addressing feature #1001

philipjyoon added bug Something isn't working needs triage Issue that requires triage labels Aug 23, 2024

philipjyoon self-assigned this Aug 23, 2024

philipjyoon added a commit that referenced this issue Aug 23, 2024

#964: Creating a test gold file. Passing this cslc scenario is necessary, but not sufficient, for fixing this bug. It currently fails

24c52c8

philipjyoon added a commit that referenced this issue Aug 23, 2024

#964: Made partial fix where when date range and frame_id are specified, it processes just one frame instead of multiple

00371b4
