
WIP: Geocoded burst stack processor #18

Closed · wants to merge 12 commits

Conversation

@vbrancat (Contributor)

This PR adds a script for the geocoded burst stack processor.

The script works on a static stack of S1-A/B SAFE files and operates at the burst level. Given a reference acquisition date in the stack, the script identifies the common bursts among the static stack of SAFE files and generates a runconfig and a bash file used to run the geocoded burst workflow.

Stacks of bursts with the same burst ID are geocoded on the same grid, which makes their stitching easier. If the orbit directory is not provided, the script automatically downloads the orbit ephemerides files for each SAFE file of the stack.
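As an illustration of the common-burst selection described above, here is a minimal sketch (the function and variable names are hypothetical, not taken from the PR):

```python
from functools import reduce

def common_burst_ids(burst_ids_per_date):
    '''Intersect the per-date sets of burst IDs to find the bursts
    present in every acquisition of the stack.

    burst_ids_per_date: dict mapping acquisition date -> set of burst IDs
    '''
    return reduce(set.intersection, burst_ids_per_date.values())
```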

TO DO:

  1. Have the script be capable of reading options from a configuration file

@vbrancat requested review from hfattahi, LiangJYu and yunjunz on May 10, 2022 at 20:22
@hfattahi (Contributor)

@vbrancat I don't think you have pushed any code. Would you please check and push the code?

@hfattahi (Contributor)

@vbrancat now that the repo reorganization is merged, would you please rebase so that we can start evaluating this PR?

@vbrancat (Contributor, Author)

@hfattahi let me try to test the branch. I will push all the updates when ready :s

```python
with open(runfile_name, 'w') as rsh:
    path = os.path.dirname(os.path.realpath(__file__))
    rsh.write(
        f'python {path}/geo_cslc.py {runconfig_path}\n')
```
Contributor:

Suggested change:
```diff
-        f'python {path}/geo_cslc.py {runconfig_path}\n')
+        f'python {path}/s1_cslc.py {runconfig_path}\n')
```

```python
bursts = load_bursts(safe, orbit_path, subswath)
for burst in bursts:
    if burst.burst_id in list(set(burst_map['burst_id'])):
        runconfig_path = create_runconfig(burst, safe, orbit_path,
```
Contributor:

Currently runconfig_path is a relative path. It would be great to make it an absolute path.

@vbrancat (Contributor, Author):

No, the runconfig_path is already an absolute path. If you inspect one of the generated run files you can see, e.g.:

```
python /u/aurora0/vbrancat/COMPASS/src/compass/s1_cslc.py /mnt/aurora-r0/vbrancat/data/S1/stack_processor/Rosamond/geocoded//runconfigs/geo_runconfig_20220501_t64_iw3_b207.yaml
```

@hfattahi (Contributor)

Very exciting PR. Before digging into the code, I'm trying it out as a user; I provided two minor comments above. One nice piece of functionality would be a start and end date: assume there is a folder with all SLCs, but we want to process only certain dates; it would be nice to have that flexibility. Another relevant piece of functionality would be allowing the user to process a list of specific dates, or to exclude a list of specific dates. Hypothetical command-line options along these lines are sketched below.
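For instance, a minimal argparse sketch of the requested options (all option names here are hypothetical, not part of the PR):

```python
import argparse

parser = argparse.ArgumentParser(description='Select dates from an SLC stack')
# hypothetical options illustrating the requested flexibility
parser.add_argument('--start-date', help='first acquisition date to process (YYYYMMDD)')
parser.add_argument('--end-date', help='last acquisition date to process (YYYYMMDD)')
parser.add_argument('--include-dates', nargs='*', default=None,
                    help='process only these dates (YYYYMMDD)')
parser.add_argument('--exclude-dates', nargs='*', default=None,
                    help='skip these dates (YYYYMMDD)')
args = parser.parse_args()
```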

Comment on lines +494 to +496

```python
with open(runfile_name, 'w') as rsh:
    path = os.path.dirname(os.path.realpath(__file__))
    rsh.write(
```
Contributor:

This currently creates many run files, each of which has only one command to run. An alternative would be one run file with many commands, one per line. The latter allows running multiple commands in parallel, for example using this. I wonder whether it is preferred to have one run file with one command per line, or multiple run files with one command per file?
Maybe @seongsujeong has a preference?
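(For illustration, a single run file with one command per line could also be driven in parallel from Python; a minimal sketch, not the tool linked above:)

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_all(run_file, max_workers=10):
    # Run each non-empty line of the run file as a shell command,
    # at most max_workers at a time.
    with open(run_file) as f:
        commands = [line.strip() for line in f if line.strip()]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(lambda cmd: subprocess.run(cmd, shell=True), commands))
```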

@vbrancat (Contributor, Author):

@seongsujeong, before I make any modifications, what is your opinion on the comment above?

Contributor:

I've been running them in parallel like

```sh
ls run_files | xargs -P 10 sh -c 'bash run_files/$0 > logfile-$0.log'
```

so for now I've kept them as separate files
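(Note: as written, xargs may pass several filenames to a single sh invocation, of which only $0 is used; adding -n 1, i.e. xargs -n 1 -P 10, guarantees one script per invocation so that up to 10 really run in parallel.)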

@vbrancat (Contributor, Author)

@hfattahi the latest commit should address the following:

  1. Inspect the stack and identify the common bursts across the stack. If no burst_id is specified for processing, only the common bursts are used
  2. Specify a start_date and an end_date. If specified, all the burst IDs with dates in this interval will be processed
  3. Ability to exclude dates from the stack

I still need to do some code clean-up (move the orbit download to the reader) and improve the pandas dataframe filtering (it is a bit slow); a vectorized sketch of that filtering follows.
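(A vectorized selection is one common way to speed up this kind of filtering; a minimal sketch, assuming the 'date' column and the start_date/end_date/exclude_dates parameters used elsewhere in this PR:)

```python
import pandas as pd

def filter_dates(burst_map, start_date, end_date, exclude_dates=None):
    # Keep rows inside [start_date, end_date] and drop excluded dates
    # in a single boolean-mask pass.
    mask = (burst_map['date'] >= start_date) & (burst_map['date'] <= end_date)
    if exclude_dates is not None:
        mask &= ~burst_map['date'].isin(exclude_dates)
    return burst_map[mask]
```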

Comment on lines +128 to +201
```python
def get_orbit_dict(sensor_id, start_time, end_time, orbit_type):
    '''
    Query Copernicus GNSS API to find latest orbit file

    Parameters:
    ----------
    sensor_id: str
        Sentinel satellite identifier ('A' or 'B')
    start_time: datetime object
        Sentinel start acquisition time
    end_time: datetime object
        Sentinel end acquisition time
    orbit_type: str
        Type of orbit to download (AUX_POEORB: precise, AUX_RESORB: restituted)

    Returns:
    -------
    orbit_dict: dict
        Python dictionary with [orbit_name, orbit_type, download_url]
    '''
    # Check if correct orbit_type
    if orbit_type not in ['AUX_POEORB', 'AUX_RESORB']:
        err_msg = f'{orbit_type} not a valid orbit type'
        raise ValueError(err_msg)

    # Add a 30 min margin to start_time and end_time
    pad_start_time = start_time - datetime.timedelta(hours=0.5)
    pad_end_time = end_time + datetime.timedelta(hours=0.5)
    new_start_time = pad_start_time.strftime('%Y-%m-%dT%H:%M:%S')
    new_end_time = pad_end_time.strftime('%Y-%m-%dT%H:%M:%S')
    query_string = f"startswith(Name,'S1{sensor_id}') and substringof('{orbit_type}',Name) " \
                   f"and ContentDate/Start lt datetime'{new_start_time}' and ContentDate/End gt datetime'{new_end_time}'"
    query_params = {'$top': 1, '$orderby': 'ContentDate/Start asc',
                    '$filter': query_string}
    query_response = requests.get(url=scihub_url, params=query_params,
                                  auth=(scihub_user, scihub_password))
    # Parse XML tree from query response
    xml_tree = ElementTree.fromstring(query_response.content)
    # Extract w3.org URL
    w3_url = xml_tree.tag.split('feed')[0]

    # Extract orbit's name, id, url
    orbit_id = xml_tree.findtext(
        f'.//{w3_url}entry/{m_url}properties/{d_url}Id')
    orbit_url = f"{scihub_url}('{orbit_id}')/$value"
    orbit_name = xml_tree.findtext(f'./{w3_url}entry/{w3_url}title')

    if orbit_id is not None:
        orbit_dict = {'orbit_name': orbit_name, 'orbit_type': orbit_type,
                      'orbit_url': orbit_url}
    else:
        orbit_dict = None
    return orbit_dict


def download_orbit(output_folder, orbit_url):
    '''
    Download S1-A/B orbits

    Parameters:
    ----------
    output_folder: str
        Path to directory where to store orbits
    orbit_url: str
        Remote url of orbit file to download
    '''
    response = requests.get(url=orbit_url, auth=(scihub_user, scihub_password))
    # Get header and find filename
    header = response.headers['content-disposition']
    _, header_params = cgi.parse_header(header)
    # construct orbit filename
    orbit_filename = os.path.join(output_folder, header_params['filename'])
    # Save orbits
    open(orbit_filename, 'wb').write(response.content)
```
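For reference, an illustrative use of the two helpers above (scihub_url, scihub_user and scihub_password are module-level globals in this file; the sensor and times below are placeholders):

```python
import datetime

start = datetime.datetime(2022, 5, 1, 6, 0, 0)
end = datetime.datetime(2022, 5, 1, 6, 0, 25)

orbit_dict = get_orbit_dict('A', start, end, 'AUX_POEORB')
if orbit_dict is not None:
    download_orbit('orbits', orbit_dict['orbit_url'])
```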
Contributor:

Eventually replace both of these with this?

@vbrancat (Contributor, Author), Jul 11, 2022:

Agreed, this will be replaced with the orbit download functionality in the s1-reader. I will wait for that PR to be merged before applying any related modification to this PR

Comment on lines +513 to +514

```python
if burst_id is not None:
    burst_map = prune_dataframe(burst_map, 'burst_id', burst_id)
```
Contributor:

This prune_dataframe call can be avoided if generate_burst_map takes in burst_id and uses it to filter bursts. Like this. A rough sketch follows.
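(A rough sketch of the suggested refactor; the real generate_burst_map signature in the PR differs, and the burst objects are only assumed to expose .burst_id, as used elsewhere in this PR:)

```python
import pandas as pd

def generate_burst_map(bursts, burst_ids=None):
    # Filter while building the map so no prune_dataframe call is
    # needed afterwards.
    rows = [{'burst_id': b.burst_id}
            for b in bursts
            if burst_ids is None or b.burst_id in burst_ids]
    return pd.DataFrame(rows)
```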

Comment on lines +517 to +524

```python
if start_date is not None:
    burst_map = burst_map[burst_map['date'] >= start_date]
if end_date is not None:
    burst_map = burst_map[burst_map['date'] <= end_date]

# Exclude some dates if the user requires it
if exclude_dates is not None:
    burst_map = prune_dataframe(burst_map, 'date', exclude_dates)
```
Contributor:

Like burst_id, these bits of pruning could also be handled in generate_burst_map.

Contributor:

The start and end date pruning now occurs before the burst map generation, so that the number of zip files to load is minimized. A sketch of such pre-filtering follows.
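(For illustration, acquisition dates can be read off the SAFE filename before any zip is opened; this helper is hypothetical, not the PR's code, and assumes dates are 'YYYYMMDD' strings:)

```python
import os

def safe_date(safe_path):
    # SAFE names embed the start time, e.g. S1A_IW_SLC__1SDV_20220501T134905_...;
    # splitting on '_' puts that timestamp in field 5 (the double
    # underscore yields an empty field 3)
    return os.path.basename(safe_path).split('_')[5][:8]  # 'YYYYMMDD'

def zips_in_window(zip_list, start_date, end_date):
    # lexicographic comparison works for zero-padded 'YYYYMMDD' strings
    return [z for z in zip_list if start_date <= safe_date(z) <= end_date]
```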

```python
orbit_path = get_orbit_file_from_list(safe,
                                      glob.glob(f'{orbit_dir}/S1*'))
for subswath in i_subswath:
    bursts = load_bursts(safe, orbit_path, subswath)
```
Contributor:

If the Sentinel1BurstSlc objects are stored in burst_map, then they don't need to be reloaded here.

If my suggestions to filter in generate_burst_map are accepted, that should keep the burst_map object from getting too big? A sketch of the caching idea follows.
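(A minimal sketch of that caching idea, with a hypothetical 'burst' column holding the loaded objects; the actual implementation landed in #53:)

```python
import pandas as pd

def generate_burst_map_cached(safe_list, orbit_path, subswaths):
    # Store each loaded Sentinel1BurstSlc next to its metadata so every
    # zip is opened exactly once; later steps reuse row['burst'] instead
    # of calling load_bursts a second time.
    records = []
    for safe in safe_list:
        for subswath in subswaths:
            for burst in load_bursts(safe, orbit_path, subswath):
                records.append({'burst_id': burst.burst_id,
                                'safe': safe,
                                'burst': burst})
    return pd.DataFrame(records)
```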

Contributor:

This suggestion is implemented in #53, and it cuts the script run time in half (since 95% or more of the script time is spent loading, and every zip was loaded twice).

```python
                             is_split_spectrum,
                             low_band, high_band, pol,
                             x_spac, y_spac)
date_str = str(date)
```
Contributor:

I think this line should be moved to right after line 534. Right now if the "else" triggers, the logging entry on line 552 uses date_str, but it isn't defined.

@scottstanie (Contributor)

I believe I've addressed the comments here, so moving the final implementation to #53

@vbrancat deleted the geo_burst_stack branch on November 17, 2022 at 15:02