Option to reduce the number of server requests #60

Open
Paul-Rutten-MPS opened this issue Dec 3, 2024 · 1 comment
@Paul-Rutten-MPS

For each dataset query we make, xarray.open_dataset() sends multiple requests to the server:

  • 4 metadata requests (to the .dds, .das, .dds, and .dods endpoints respectively)
  • 1 request per parameter we want data for.

Given that some datasets contain over 20 parameters, this can result in a lot of requests to get some data.
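To make the arithmetic concrete, here is a small sketch that counts requests under the figures quoted in this issue (4 metadata requests plus 1 per parameter, versus 2 for a combined request). The function name `request_count` is made up for illustration.

```python
def request_count(n_params: int, combined: bool = False) -> int:
    """Estimate the number of server requests for one dataset query.

    Assumes the figures from this issue: 4 metadata requests plus one
    request per parameter for xarray.open_dataset(), or 2 requests in
    total when everything is fetched through a single .dods URL.
    """
    if combined:
        return 2
    return 4 + n_params


print(request_count(20))                 # 24
print(request_count(20, combined=True))  # 2
```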

It is possible to reduce this to 2 requests by filling in the OPeNDAP Dataset Access Form programmatically and converting the resulting data to an xarray Dataset.

The example below returns an xarray.Dataset with dimensions (y: 2, x: 2, time: 1, height: 6) and variables (latitude, longitude, wind_speed, wind_direction):

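import xarray as xr
from pydap.client import open_dods_url

# Fetch the subset through the .dods endpoint. The constraint expression after "?"
# lists each variable with its index ranges ([start:stop] or [start:stride:stop], stop inclusive).
dd = open_dods_url('https://thredds.met.no/thredds/dodsC/nora3_subset_atmos/wind_hourly/arome3kmwind_1hr_202312.nc.dods?latitude[0:1][0:1],longitude[0:1][0:1],x[0:1],y[0:1],wind_speed[0:1:0][0:1:5][0:1][0:1],wind_direction[0:1:0][0:1:5][0:1][0:1]')

# Wrap the pydap dataset in a data store that xarray can open.
ds = xr.open_dataset(xr.backends.PydapDataStore(dd))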
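The constraint expression in that URL could be generated rather than typed by hand. The following is a minimal sketch, assuming DAP2 hyperslab syntax with inclusive stop indices; `hyperslab` and `build_dods_url` are hypothetical helper names, not part of xarray or pydap.

```python
def hyperslab(start: int, stop: int, stride: int = 1) -> str:
    """Format one DAP2 index range; the stop index is inclusive."""
    if start == stop:
        return f"[{start}]"
    if stride == 1:
        return f"[{start}:{stop}]"
    return f"[{start}:{stride}:{stop}]"


def build_dods_url(base_url: str, selections: dict) -> str:
    """Build a .dods URL from a mapping of variable name to index ranges.

    `selections` maps each variable to a list of (start, stop) or
    (start, stop, stride) tuples, one tuple per dimension, in the
    variable's own dimension order.
    """
    parts = [
        name + "".join(hyperslab(*rng) for rng in ranges)
        for name, ranges in selections.items()
    ]
    return f"{base_url}.dods?{','.join(parts)}"


url = build_dods_url(
    "https://thredds.met.no/thredds/dodsC/nora3_subset_atmos/wind_hourly/arome3kmwind_1hr_202312.nc",
    {
        "x": [(0, 1)],
        "y": [(0, 1)],
        "wind_speed": [(0, 0), (0, 5), (0, 1), (0, 1)],
    },
)
print(url)
```

The resulting string can be passed to open_dods_url() as in the example above. A single index is written as `[0]` here, which selects the same element as the `[0:1:0]` form used in the hand-written URL.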
@efvik
Collaborator

efvik commented Jan 13, 2025

I tested this with the file 'https://thredds.met.no/thredds/dodsC/windsurfer/mywavewam3km_files/1975/01/19750101_MyWam3km_hindcast.nc' and timed the two alternatives:

xr.open_dataset(filename).isel({'rlat':500,'rlon':2000})[variables].load()
xr.load_dataset(xr.backends.PydapDataStore(open_dods_url(nora3.filenames[0]+request)))

where request = '.dods?rlat[500],rlon[2000],time[0:23],projection_ob_tran,ff[0:23][500][2000]' etc.

For 5 variables I got 241 ms and 309 ms respectively, and for all 51 variables 4.54 s and 3.81 s respectively. So there is little difference in speed, but this approach may reduce server load.
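For anyone who wants to repeat the comparison, a timing harness along these lines should work. It is only a sketch: `time_call` is a made-up helper, and the two loaders in the trailing comments stand in for the two alternatives above. Server-side caching may affect the numbers, so the helper reports the fastest of several runs.

```python
import time


def time_call(fn, repeats: int = 3) -> float:
    """Return the fastest wall-clock time of fn() in milliseconds."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - t0)
    return best * 1000.0


# Intended use (needs network access; names as in the comment above):
# t_xarray = time_call(lambda: xr.open_dataset(filename).isel({'rlat': 500, 'rlon': 2000})[variables].load())
# t_dods = time_call(lambda: xr.load_dataset(xr.backends.PydapDataStore(open_dods_url(filename + request))))
```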

3 participants