Sanchez should ignore files in emwin directory when stitching #71

Open
KiwiInNZ opened this issue Feb 6, 2021 · 0 comments
Labels
enhancement New feature or request


KiwiInNZ commented Feb 6, 2021

Is your feature request related to a problem? Please describe.
When producing a stitched image, e.g. with the command below, Sanchez also parses the contents of the emwin directory, which contains text files that can be downloaded from GOES17 (probably GOES16 too).

This directory can accumulate roughly 20k files per day of captured data, so parsing each filename only to determine that there is no handler for it can take a significant chunk of time. For the command below, the total time was 20 seconds; once the emwin directory was removed, this dropped to 14 seconds, even though the emwin directory held only about 12 hours of data.

~/sanchez/Sanchez reproject -s ~/goes -o ~/stitched.jpg -T 2021-02-06T19:50:00 -a

emwin directory example structure:
emwin/
    2021-02-01/    [this is the UTC-aligned date in YYYY-MM-DD format]
        .TXT / .GIF / .PNG files

example filenames:
A_ABUS23KWBC010930_C_KWIN_20210201093015_126851-2-NWX005US.TXT
A_SACN82CWAO011600_C_KWIN_20210201161811_149938-2-SAHOURLY.TXT
Z_QAEA00TJSJ011218_C_KWIN_20210201121855_136854-3-RADALLPR.GIF
Z_QGTO88KWNS010304_C_KWIN_20210201150443_145446-4-GPHJ88US.PNG

For everything you may need to know about these files - https://www.weather.gov/media/emwin/EMWIN_Image_and_Text_Data_Capture_Catalog_table_v1.2_180222_1313.pdf

For anyone collecting all of the emwin files, the result could be a significant increase in the time taken for file parsing to complete, with no added benefit.

Describe the solution you'd like

The ideal outcome is that an emwin directory inside the directory structure parsed from the -s argument has little or no performance impact.

Some possible options as a starting point:

  • Always exclude the emwin directory from being parsed
  • Add an argument to exclude one or more directories within the source directory from being parsed
  • Another ingenious idea from the mind of the developer of Sanchez
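To illustrate the first two options, here is a minimal sketch of pruning named directories while walking the source tree. It is written in Python purely for illustration and is not Sanchez code; the function name `iter_source_files` and the `exclude_dirs` parameter are hypothetical, and how Sanchez actually enumerates files is not something this issue describes.

```python
import os
from typing import Iterable, Iterator


def iter_source_files(root: str, exclude_dirs: Iterable[str] = ("emwin",)) -> Iterator[str]:
    """Yield file paths under root, never descending into excluded directories.

    Hypothetical helper: the name and the default exclusion list are
    assumptions made for this sketch only.
    """
    excluded = {name.lower() for name in exclude_dirs}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place: os.walk (top-down) skips any directory removed here,
        # so the files inside it are never listed or parsed.
        dirnames[:] = [d for d in dirnames if d.lower() not in excluded]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

The point of pruning at the directory level is that the cost becomes independent of how many files emwin holds, since none of its filenames are ever examined.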

Describe alternatives you've considered
My solution was to stop collecting emwin data, since the volume of files being downloaded was high and their value to me relatively limited. However, this may not be ideal for everyone.

Another option is to move the emwin directory, or its contents, from the standard location to an alternative one, perhaps using a cron job to do the move, but this seems to be a bit of a hack.
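For anyone who wants to try the move-it-elsewhere workaround in the meantime, a rough sketch of a script that a cron job could call is shown here. The function name `relocate_emwin` and both directory arguments are hypothetical, and it assumes the emwin/YYYY-MM-DD layout shown earlier. It also assumes nothing is still writing to a file at the moment it is moved, which may not hold while the receiver is running.

```python
import shutil
from pathlib import Path


def relocate_emwin(source_root, archive_root, dirname="emwin"):
    """Move files from <source_root>/emwin/<date>/ to <archive_root>/emwin/<date>/.

    Hypothetical helper for this sketch; returns the number of files moved.
    """
    src = Path(source_root) / dirname
    if not src.is_dir():
        return 0  # nothing to relocate

    moved = 0
    for day_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        dest = Path(archive_root) / dirname / day_dir.name
        dest.mkdir(parents=True, exist_ok=True)
        for f in day_dir.iterdir():
            if f.is_file():
                shutil.move(str(f), str(dest / f.name))
                moved += 1
    return moved
```

The date directories are left in place (now empty) so that whatever is writing the emwin files can keep using them.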

Additional context
The details above should be enough to reproduce the behavior and for the creator of Sanchez to implement what could be a relatively straightforward change, on a time frame that fits their own priorities and schedule for this feature request.

@KiwiInNZ KiwiInNZ added the enhancement New feature or request label Feb 6, 2021