anmn_temp_gridded_product - possible future improvements #812

mhidas · 2018-02-20T23:42:07Z

The temperature gridded product code (after #809) seems to work fine and it's reasonably clear what it does. However, I think it could be made clearer, simpler and possibly faster/more efficient by making use of some existing packages:

- Use numpy arrays for all data handling and avoid looping over arrays or through lists of arrays.
- Better still, use the xarray package to handle both netCDF I/O and array arithmetic, e.g. it has methods that could replace some of the binning code.
- Use boto3 to get files from S3 directly, rather than via HTTP.
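
To illustrate the first point, here is a minimal sketch of loop-free binning with numpy. It assumes the binning step amounts to averaging temperature samples into regular time bins, which may not match what the product code actually does. `bin_average` and its arguments are hypothetical names, not taken from the existing code.

```python
import numpy as np


def bin_average(time, temp, bin_edges):
    """Mean of temp within each [edge[i], edge[i+1]) bin, without a Python loop.

    Hypothetical helper: the real product code may bin differently.
    Empty bins come back as NaN.
    """
    time = np.asarray(time, dtype=float)
    temp = np.asarray(temp, dtype=float)
    n_bins = len(bin_edges) - 1

    # Bin index of every sample; -1 or n_bins means outside the edges.
    idx = np.digitize(time, bin_edges) - 1

    # Keep only samples inside the grid that have a valid temperature.
    valid = (idx >= 0) & (idx < n_bins) & ~np.isnan(temp)

    # Per-bin sum and count in one vectorised pass each.
    sums = np.bincount(idx[valid], weights=temp[valid], minlength=n_bins)
    counts = np.bincount(idx[valid], minlength=n_bins)

    # 0/0 for empty bins yields NaN; silence the warning it would raise.
    with np.errstate(invalid="ignore", divide="ignore"):
        return sums / counts
```

With xarray, the same idea could probably be expressed through `groupby_bins` or `resample` on a dataset opened straight from the netCDF file, which would also take care of the I/O.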

Don't know if we'll ever have time to work on these, just wanted to make a note while I thought of it. To an extent this also applies to the burst-averaging code.

lbesnard · 2018-02-21T01:36:04Z

I'm pretty sure using boto3 means you would have to develop and test the code only on an authorised machine (@lwgordonimos?). Also, if I'm correct, it means you cannot share it with people outside of the IMOS AODN organisation.

ghost · 2018-02-21T02:42:57Z

Not necessarily. If it relates to a public bucket, it is still possible to access it anonymously and get the benefit of efficient S3 interaction. Refer to https://github.com/aodn/utilities/blob/master/jenkins/get_latest_artifact.py#L19 (note: that script is overly condensed, so disregard the rest; the key point is the UNSIGNED requests to S3).

One other benefit would be that it has efficient download code internally, which avoids manually chunking the downloads, i.e. https://github.com/aodn/data-services/pull/809/files#diff-84abcef0df7fb618752454ed770a069aR157

The only drawbacks I could see are that:

- it becomes S3 only, instead of potentially any URL
- it creates a new dependency on boto3 (however, I suspect this is not a problem, since it is likely to be installed anywhere this would be used anyway)

pblain assigned smancini Mar 1, 2018
