
Stitching geocoded bursts for stack processing #14

Closed
wants to merge 12 commits into from

Conversation

@vbrancat (Contributor) commented Apr 4, 2022

This PR provides a script to stitch S1-A/B bursts for stack processing, i.e., the stitched bursts forming the output SLCs all have the same shape and can be directly interfered. To efficiently handle the list of burst IDs to stitch, the PR uses pandas, which would be a new (but lightweight) dependency for COMPASS.

The algorithm implements the following steps:

  1. If a list of burst IDs to stitch is not provided, it identifies the burst IDs common to all dates and uses those for stitching. Otherwise, only the provided list of burst IDs is used.

  2. For each unique burst ID, it identifies the common burst boundary (on the ground) among the different dates and uses it to cut the bursts at each date. This is implemented by saving a shapely.geometry.Polygon to an ESRI Shapefile, which gdal.Warp can then use via its cutline feature to cut the different bursts.

  3. All the cut bursts for the same date are then stitched together to form the output SLCs. All the output SLCs have the same shape, as they are formed from the commonly identified bursts.
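The common-burst identification in step 1 amounts to a set intersection over the per-date burst lists. A minimal sketch with made-up burst IDs and dates (the real script walks the date-organized burst directory instead):

```python
from functools import reduce

# Hypothetical mapping of acquisition date -> burst IDs found on disk
bursts_by_date = {
    "20220101": ["t071_151200_iw1", "t071_151201_iw1", "t071_151202_iw1"],
    "20220108": ["t071_151201_iw1", "t071_151202_iw1"],
    "20220115": ["t071_151200_iw1", "t071_151201_iw1", "t071_151202_iw1"],
}

# Burst IDs present on every date; only these are stitched, so every
# output SLC ends up with the same shape
common_ids = sorted(reduce(set.intersection,
                           (set(ids) for ids in bursts_by_date.values())))
print(common_ids)  # ['t071_151201_iw1', 't071_151202_iw1']
```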

Assumptions
The script makes the following assumptions:

  1. The directory containing the input bursts is organized by date. This should be a pretty common assumption for a stack processor.

  2. Metadata are in JSON format and contain info on the granule_id (i.e., filename), date, polygon, burst_id, and epsg. These should all be pretty reasonable assumptions.

  3. Untested: the algorithm should also work for range/Doppler co-registered stacks of bursts, assuming the same metadata (e.g., polygon of valid pixels) are provided.
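For reference, a metadata file satisfying assumption 2 might look like the following sketch (all field values are made up for illustration; the polygon is WKT):

```json
{
  "granule_id": "t071_151201_iw1_20220101.slc",
  "date": "20220101",
  "burst_id": "t071_151201_iw1",
  "polygon": "POLYGON ((399960 3900000, 409960 3900000, 409960 3890000, 399960 3890000, 399960 3900000))",
  "epsg": 32611
}
```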

Testing
The PR has been tested on a small stack of S1-A/B data that can be found on the aurora server at /mnt/aurora-r0/vbrancat/data/S1/gburst_stitching/ . Below is a picture of an interferogram formed by randomly selecting a reference and a secondary stitched SLC from the processed stack.

TO DO

  1. Check whether gdal.Warp allows setting the number of threads to use and expose it as a command-line option
  2. Expose the X/Y resolution of the stitched bursts and the EPSG code for reprojection as command-line options
  3. Check that the algorithm can stitch multiband rasters (e.g., geocoded bursts with main, low, and high bands)

[Image: interferogram formed from the stitched SLCs]

@vbrancat vbrancat changed the title Stitching geocoded burst for stack processing Stitching geocoded bursts for stack processing Apr 4, 2022
Comment on lines 1 to 14
import argparse
import glob
import json
import os
import time

import isce3
import journal
import pandas as pd
import shapely.wkt
from datetime import datetime
from compass.utils import helpers
from osgeo import gdal, ogr
from shapely.geometry import Polygon
Contributor:
Suggested change

Original:

    import argparse
    import glob
    import json
    import os
    import time
    import isce3
    import journal
    import pandas as pd
    import shapely.wkt
    from datetime import datetime
    from compass.utils import helpers
    from osgeo import gdal, ogr
    from shapely.geometry import Polygon

Suggested:

    import argparse
    from datetime import datetime
    import glob
    import json
    import os
    import time

    import isce3
    import journal
    from osgeo import gdal, ogr
    import pandas as pd
    import shapely.wkt
    from shapely.geometry import Polygon

    from compass.utils import helpers

PEP8 ordering and grouping

Contributor Author:

Just a heads-up: using isort (a package that sorts imports according to the PEP 8 convention) gives me a slightly different result.

Contributor:

The isort output appears to group from x import y incorrectly; e.g., from datetime import datetime should be with the standard library group.
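If the import ordering should be enforced automatically, isort can be configured to keep from x import y lines interleaved with plain imports inside each section. A possible pyproject.toml fragment (option names are from isort's documented settings; whether this exactly reproduces the suggested layout is untested):

```toml
[tool.isort]
# Sort "from x import y" together with "import x" inside each section,
# so "from datetime import datetime" lands in the standard-library group
force_sort_within_sections = true
# Treat the project package as first-party so it sorts into its own group
known_first_party = ["compass"]
```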

return poly_int, epsg_int


def get_stitching_dict(indir):
Contributor:

Suggested change

    - def get_stitching_dict(indir):
    + def get_stitching_dataframe(bursts_dir, wanted_burst_ids):

Misnomer; rename for clarity.
What do you think about filtering unwanted burst IDs on dataframe init vs. filtering them out later?

Contributor Author:

I am not sure it is a good idea. We might want to filter for different reasons at different stages. One thing that I have noticed is that for bigger dataframes the filtering operation might take some time. Any hints on how to make it faster with pandas dataframes?
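On the filtering-speed question: for exact burst-ID matches, Series.isin is typically faster than a regex-based str.contains and cannot over-match substrings. A small sketch with made-up IDs:

```python
import pandas as pd

# Toy stitching dataframe; burst IDs and dates are made up
df = pd.DataFrame({
    "burst_id": ["t071_151200_iw1", "t071_151201_iw1", "t071_151202_iw1"],
    "date": ["20220101", "20220101", "20220108"],
})

wanted = ["t071_151200_iw1", "t071_151202_iw1"]

# isin() does exact hash-based membership tests; str.contains() compiles
# a regex and scans every string, which is slower and matches substrings
pruned = df[df["burst_id"].isin(wanted)].reset_index(drop=True)
print(pruned["burst_id"].tolist())  # ['t071_151200_iw1', 't071_151202_iw1']
```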

Contributor:

I made my suggestion because this block of code:

    data_dict = get_stitching_dict(indir)

    # If stitching some bursts, prune dataframe to
    # contains only the burst IDs to stitch
    if burst_list is not None:
        data_dict = prune_dataframe(data_dict,
                                    'burst_id', burst_list)

By filtering out unwanted bursts in get_stitching_dict, the call to prune_dataframe is no longer needed, and other calls to prune_dataframe will work on smaller dataframes. I wager a smaller dataframe will be faster to filter. If burst_list is None, downstream behavior is not impacted since everything in indir will be in data_dict.

My perspective on filtering is based on what's present in the code of this PR. What other filtering stages besides the 2 here do you have in mind?

[Several resolved review comments on src/compass/utils/stitching/stitch_burst.py — outdated]
Comment on lines 97 to 102
    # Identify common burst IDs among different dates
    ids2stitch = get_common_burst_ids(data_dict)

    # Prune dataframe to contain only the IDs to stitch
    data_dict = prune_dataframe(data_dict,
                                'burst_id', ids2stitch)
Contributor:

If prune_dataframe above is removed, then get_common_burst_ids and prune_dataframe can be merged into a prune_uncommon_burst_ids function? Unless you see prune_dataframe being something to be imported and used elsewhere...

Something like:

def prune_uncommon_burst_ids(data):
    '''
    Keep only rows whose burst ID is common to all dates

    Parameters
    ----------
    data: pandas.DataFrame
        Dataframe to be pruned

    Returns
    -------
    dataf: pandas.DataFrame
        Pruned dataframe containing only burst IDs common to all dates
    '''
    unique_dates = list(set(data['date']))

    # Initialize list of common burst IDs
    common_id = data.burst_id[data.date == unique_dates[0]]

    for date in unique_dates:
        ids = data.burst_id[data.date == date]
        common_id = sorted(list(set(ids.tolist()) & set(common_id)))

    # Remove burst IDs not common to all dates
    pattern = '|'.join(common_id)
    dataf = data.loc[data['burst_id'].str.contains(pattern,
                                                   case=False)]
    return dataf
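For comparison, a sketch of the same pruning built on groupby plus an exact-match isin filter, which sidesteps the substring matching of str.contains (function and column names follow the PR; the toy dataframe is made up):

```python
from functools import reduce

import pandas as pd


def prune_uncommon_burst_ids(data: pd.DataFrame) -> pd.DataFrame:
    '''Keep only rows whose burst_id occurs on every date.'''
    # One set of burst IDs per acquisition date
    ids_per_date = data.groupby('date')['burst_id'].apply(set)
    # Burst IDs present on every date
    common_ids = reduce(set.intersection, ids_per_date)
    # Exact-match filter; avoids regex substring over-matching
    return data[data['burst_id'].isin(common_ids)]


df = pd.DataFrame({
    'date': ['20220101', '20220101', '20220108'],
    'burst_id': ['b1', 'b2', 'b1'],
})
print(prune_uncommon_burst_ids(df)['burst_id'].tolist())  # ['b1', 'b1']
```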

Contributor Author:

The geocoded burst stack processor would reuse the functionality for pruning the dataframe based on several criteria. Therefore, I am inclined to keep this functionality as it is and reuse it elsewhere

Contributor:

Thank you for pointing out the commonality with #18; I completely missed this. In light of this, I think prune_dataframe and get_common_burst_ids could live in their own file, src/compass/utils/dataframe_tools.py, as a shared import between stack processing and stitching.

[Two more resolved review comments on src/compass/utils/stitching/stitch_burst.py — outdated]
@vbrancat (Contributor Author) commented Feb 1, 2023

@scottstanie Maybe we can close this PR? The stitching code has been incorporated in dolphin and it is much easier :)

@scottstanie (Contributor):

Sure! There's a chance that we later come across some case where stitching geocoded SLCs leads to an easier time... but for the current effort of the displacement workflow you're right that we've got that now in dolphin 👍

@vbrancat (Contributor Author) commented Feb 1, 2023

Closed as duplicated in the displacement workflow.

@vbrancat vbrancat closed this Feb 1, 2023
@vbrancat vbrancat deleted the geo_stitching branch February 22, 2023 00:16
4 participants