Stitching geocoded bursts for stack processing #14
Conversation
```python
import argparse
import glob
import json
import os
import time

import isce3
import journal
import pandas as pd
import shapely.wkt
from datetime import datetime
from compass.utils import helpers
from osgeo import gdal, ogr
from shapely.geometry import Polygon
```
Suggested change (PEP8 ordering and grouping):

```suggestion
import argparse
from datetime import datetime
import glob
import json
import os
import time

import isce3
import journal
from osgeo import gdal, ogr
import pandas as pd
import shapely.wkt
from shapely.geometry import Polygon

from compass.utils import helpers
```
Just a heads-up: running `isort` (a package that sorts imports according to the PEP 8 convention) gives me a slightly different result.
The `isort` output appears to group `from x import y` statements incorrectly, e.g. `from datetime import datetime` should be in the standard-library group.
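If `isort`'s defaults don't match the desired grouping, its behavior is configurable. A hypothetical `setup.cfg` fragment (option names from the isort documentation) that interleaves `import x` and `from x import y` within each section and pins `compass` as first-party might look like:

```ini
[isort]
# sort 'import x' and 'from x import y' together within each section
force_sort_within_sections = true
# treat the project package as first-party (local) imports
known_first_party = compass
```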
```python
    return poly_int, epsg_int


def get_stitching_dict(indir):
```
Suggested change:

```suggestion
def get_stitching_dataframe(bursts_dir, wanted_burst_ids):
```

The current name is a misnomer (the function returns a dataframe, not a dict); rename for clarity.
What do you think about filtering unwanted burst IDs on dataframe init vs filtering them out later?
I am not sure it is a good idea. We might want to filter for different reasons at different stages. One thing I have noticed is that for bigger dataframes the filtering operation can take some time. Any hints on how to make it faster using pandas dataframes?
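One generic speedup worth trying (a sketch with invented data, not code from this PR): when the filter is an exact membership test, `Series.isin` is usually much faster than the regex-based `str.contains` used in `prune_dataframe`, since no pattern is compiled and no per-row substring scan is needed:

```python
import pandas as pd

# Hypothetical stitching dataframe (column names follow the PR's
# 'burst_id'/'date' convention; the values are invented)
df = pd.DataFrame({
    'burst_id': ['t064_135518_iw1', 't064_135519_iw2', 't064_135520_iw3'],
    'date': ['20200101', '20200101', '20200113'],
})

wanted = ['t064_135518_iw1', 't064_135520_iw3']

# Vectorized exact-match filter: no '|'.join(...) regex needed
pruned = df[df['burst_id'].isin(wanted)]
print(pruned['burst_id'].tolist())
```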
I made my suggestion because of this block of code:

```python
data_dict = get_stitching_dict(indir)

# If stitching some bursts, prune dataframe to
# contain only the burst IDs to stitch
if burst_list is not None:
    data_dict = prune_dataframe(data_dict,
                                'burst_id', burst_list)
```

By filtering out unwanted bursts in `get_stitching_dict`, the call to `prune_dataframe` is no longer needed, and other calls to `prune_dataframe` will work on smaller dataframes. I wager a smaller dataframe will be faster to filter. If `burst_list` is `None`, downstream behavior is not impacted, since everything in `indir` will end up in `data_dict`.
My perspective on filtering is based on what's present in this PR's code. What filtering stages besides the two here do you have in mind?
```python
# Identify common burst IDs among different dates
ids2stitch = get_common_burst_ids(data_dict)

# Prune dataframe to contain only the IDs to stitch
data_dict = prune_dataframe(data_dict,
                            'burst_id', ids2stitch)
```
If the `prune_dataframe` call above is removed, then `get_common_burst_ids` and `prune_dataframe` can be merged into a single `prune_uncommon_burst_ids` function? Unless you see `prune_dataframe` being something to be imported and used elsewhere... Something like:
```python
def prune_uncommon_burst_ids(data):
    '''
    Prune dataframe to keep only burst IDs common to all dates

    Parameters
    ----------
    data: pandas.DataFrame
        Dataframe that needs to be pruned

    Returns
    -------
    dataf: pandas.DataFrame
        Pruned dataframe containing only rows whose
        burst IDs occur on every date
    '''
    unique_dates = list(set(data['date']))

    # Initialize the burst ID list with those of the first date
    common_id = data.burst_id[data.date == unique_dates[0]]
    for date in unique_dates:
        ids = data.burst_id[data.date == date]
        common_id = sorted(list(set(ids.tolist()) & set(common_id)))

    # Remove burst IDs not common to all dates
    pattern = '|'.join(common_id)
    dataf = data.loc[data['burst_id'].str.contains(pattern,
                                                   case=False)]
    return dataf
```
The geocoded burst stack processor would reuse the functionality for pruning the dataframe based on several criteria. Therefore, I am inclined to keep this functionality as it is and reuse it elsewhere.
Thank you for pointing out the commonality with #18; I completely missed this. In light of this, I think `prune_dataframe` and `get_common_burst_ids` could live in their own file, `src/compass/utils/dataframe_tools.py`, as a shared import between stack processing and stitching.
@scottstanie Maybe we can close this PR? The stitching code has been incorporated in dolphin.

Sure! There's a chance that we later come across some case where stitching geocoded SLCs leads to an easier time... but for the current effort of the displacement workflow, you're right that we've got that now in dolphin 👍

Closing as duplicated in the displacement workflow.
This PR provides a script to stitch S1-A/B bursts for stack processing, i.e., the stitched bursts forming the stitched SLCs all have the same shape and can be directly interfered. To efficiently handle the list of burst IDs to stitch, the PR makes use of `pandas`, which would be a new (but light) dependency for `COMPASS`.

The algorithm implements the following steps:

1. If a list of burst IDs to stitch is not provided, it identifies a common list of bursts among all the different dates and uses those burst IDs for stitching. Otherwise, only the provided list of burst IDs is used for stitching.
2. For each unique burst ID, it identifies the common burst boundary (on the ground) among the different dates and uses that common boundary to cut the bursts at the different dates. This is implemented by saving a `shapely.geometry.Polygon` into an `ESRI Shapefile`, which `gdal.Warp` can then use to cut the different bursts via the `cutline` feature.
3. All the cut bursts for the same date are then stitched together to form the output SLCs. All the formed SLCs have the same shape, as they have been formed from the commonly identified bursts.
Assumptions

The script makes the following assumptions:

- The directory containing the input bursts is organized by date. This should be a pretty common assumption for a stack processor.
- Metadata are in `json` format and contain info on the `granule_id` (i.e., the filename), `date`, `polygon`, `burst_id`, and `epsg`. These should all be pretty reasonable assumptions.
- Untested: the algorithm should equivalently work for range/Doppler co-registered stacks of bursts, assuming the same metadata (e.g., the polygon of valid pixels) are provided.
Testing

The PR has been tested on a small stack of S1-A/B data that can be found on the aurora server at `/mnt/aurora-r0/vbrancat/data/S1/gburst_stitching/`. Below is a picture of an interferogram formed by randomly selecting a reference and a secondary stitched SLC from the processed stack.

TO DO

- `gdal.Warp` allows allocating the number of threads to use; expose this as a command-line option.