[Feature] - nice progress bar during downloading of SLC data #56

Open
cmarshak opened this issue Aug 19, 2021 · 3 comments · May be fixed by #81

Comments

@cmarshak

cmarshak commented Aug 19, 2021

Is your feature request related to a problem? Please describe.
Downloading SLC data takes a long time. It would be nice to track the progress.

Describe the solution you'd like
Something like this: https://gist.github.com/wy193777/0e2a4932e81afc6aa4c8f7a2984f34e2. I would be happy to make a pull request if the feature would be welcome, and I understand if this is out of scope.

Describe alternatives you've considered
I think there are other progress bars, but am not familiar with them.

Additional context
There is a lot of output during downloading e.g.

response: 307
Redirect to https://sentinel1.asf.alaska.edu/SLC/SB/S1B_IW_SLC__1SDV_20210723T014947_20210723T015014_027915_0354B4_B3A9.zip
response: 302
Redirect to https://dy4owt9f80bz7.cloudfront.net/...
response: 200

I would rather be able to track progress more clearly.

Edit: I realize this feature request might be a bit hasty, as there might be use cases in which a progress bar is undesired or annoying, but hopefully there is a smart way to integrate such a feature.

@jhkennedy
Contributor

@glshort we've added a tqdm progress bar to the HyP3 SDK, so you get pretty progress bars in Jupyter Notebooks and text ones everywhere else. E.g.,
https://github.com/ASFHyP3/hyp3-sdk/blob/develop/hyp3_sdk/util.py#L113-L121

One thing to be aware of: if you do want to use the Jupyter support, you'll need to be careful with the import, because you can be running in a Jupyter kernel without having all the expected Jupyter dependencies. So we do this:
https://github.com/ASFHyP3/hyp3-sdk/blob/develop/hyp3_sdk/util.py#L54-L61
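
A minimal sketch of that kind of guarded import, written for illustration rather than copied from the linked file. The helper name `get_progress_bar_class` is hypothetical, and the sketch assumes the missing Jupyter dependency is `ipywidgets`:

```python
def get_progress_bar_class():
    """Return a tqdm class that should be safe to use in the current environment."""
    try:
        # The notebook-style bar relies on ipywidgets (assumption of this sketch),
        # so check for it before picking the auto-detecting tqdm.
        import ipywidgets  # noqa: F401
        from tqdm.auto import tqdm
    except ImportError:
        # Fall back to the plain-text bar, which has no Jupyter dependencies.
        from tqdm.std import tqdm
    return tqdm
```

Calling `get_progress_bar_class()` once and using the returned class everywhere keeps the environment check in one place.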

@carmine1990

@jhkennedy using your link, I customized the download.py script in the download folder so that, when the processes parameter is 1, you can see the progress of the download (the customization doesn't support the pool() method).

The function to modify is download_url in asf_search/download/download.py:

from tqdm import tqdm  # new dependency; the module's existing imports are unchanged


def download_url(url: str, path: str, filename: str = None, session: ASFSession = None) -> None:
    """
    Downloads a product from the specified URL to the specified location and (optional) filename.
    :param url: URL from which to download
    :param path: Local path in which to save the product
    :param filename: Optional filename to be used, extracted from the URL by default
    :param session: The session to use, in most cases should be authenticated beforehand
    :return:
    """
    if filename is None:
        filename = os.path.split(urllib.parse.urlparse(url).path)[1]

    if not os.path.isdir(path):
        raise ASFDownloadError(f'Error downloading {url}: directory not found: {path}')

    if os.path.isfile(os.path.join(path, filename)):
        warnings.warn(f'File already exists, skipping download: {os.path.join(path, filename)}')
        return

    if session is None:
        session = ASFSession()

    def strip_auth_if_aws(r, *args, **kwargs):
        # Response hook: on a redirect to amazonaws.com, drop every response header except Location.
        if 300 <= r.status_code <= 399 and 'amazonaws.com' in urllib.parse.urlparse(r.headers['location']).netloc:
            location = r.headers['location']
            r.headers.clear()
            r.headers['location'] = location

    response = session.get(url, stream=True, hooks={'response': strip_auth_if_aws})
    response.raise_for_status()

    # Content-Length may be missing; total=None then shows a bar without a percentage.
    total = int(response.headers.get('content-length', 0)) or None
    # Open the file in its own context manager so it is closed explicitly.
    with open(os.path.join(path, filename), 'wb') as raw, \
            tqdm.wrapattr(raw, 'write', miniters=1, desc=filename, total=total) as f:
        for chunk in response.iter_content(chunk_size=31457280):  # 30 MiB per chunk
            f.write(chunk)

@scottstanie

Is there still interest in an implementation of this? If nothing else, a progress bar on the top-level loop, showing the total number of files to download, would be only a one-line change, and it's nice that you can now disable the progress bars with TQDM_DISABLE=1. Not sure if you'd want to add the dependency, though.
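
A rough sketch of what that one-line change could look like, not the asf_search implementation. The helper name `iter_with_progress` and the `show_progress` flag are hypothetical; importing tqdm inside the helper is one possible way to keep the dependency optional:

```python
from typing import Iterable


def iter_with_progress(urls: Iterable[str], show_progress: bool = True) -> Iterable[str]:
    """Wrap urls in a tqdm bar when tqdm is importable; otherwise return urls unchanged."""
    if not show_progress:
        return urls
    try:
        from tqdm import tqdm  # optional dependency in this sketch
    except ImportError:
        return urls
    # One tick per file; tqdm reads the total from len(urls) when it is available.
    return tqdm(urls, desc="Downloading", unit="file")


# Usage: the loop body stands in for whatever already downloads a single URL.
for url in iter_with_progress(["a.zip", "b.zip", "c.zip"]):
    pass  # e.g. download_url(url, path, session=session)
```

Because the helper returns a plain iterable in every case, the surrounding loop stays the same whether or not a bar is shown.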
