[Enhancement]: Update coords="minimal" and compat="override" as defaults to improve performance of xc.open_mfdataset()?
#641
Labels
type: enhancement
Is your feature request related to a problem?
xarray.open_mfdataset() has a few issues related to (1) incorrectly concatenating coords on variables (e.g., "time" gets added to "lat_bnds") and (2) performance. xCDAT addresses (1) by defaulting to data_vars="minimal". To address (2), the post and docs below suggest adding coords="minimal" and compat="override".
pydata/xarray#1385 (comment)
pydata/xarray#1385 (comment)
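For reference, here is a minimal sketch of what a call with these settings could look like. The helper names fast_open_kwargs and open_files are hypothetical and not part of xCDAT or Xarray. Only the keyword arguments themselves (data_vars, coords, compat, parallel) are existing xarray.open_mfdataset() parameters.

```python
# Hypothetical helpers for illustration only (not part of xCDAT or Xarray).
def fast_open_kwargs(parallel=False):
    """Return the xarray.open_mfdataset() keyword arguments discussed above."""
    return {
        # Only concatenate data variables that already have the concat dimension.
        "data_vars": "minimal",
        # Likewise for coordinates.
        "coords": "minimal",
        # Skip equality checks and take non-concatenated variables from the first file.
        "compat": "override",
        # Xarray's default; see the caveats about parallel=True under "Additional context".
        "parallel": parallel,
    }


def open_files(paths, **overrides):
    """Open several files with the settings above; explicit overrides win."""
    import xarray as xr  # imported lazily so fast_open_kwargs() works without xarray

    return xr.open_mfdataset(paths, **{**fast_open_kwargs(), **overrides})
```

Note that compat="override" assumes the non-concatenated variables are consistent across files, since they are no longer compared.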
Describe the solution you'd like
Xarray documentation
Describe alternatives you've considered
No response
Additional context
I don't know how reliable parallel=True is for speeding up reading coordinate information. There is an Xarray GitHub issue, #7079, with comments suggesting that parallel=True is not thread-safe and might cause resource locking on some filesystems, unlike the default parallel=False. Tony B and I ran into this in e3sm_to_cmip (related issue).
In this e3sm_diags PR, we are getting a TimeoutError: Timed out when using xcdat.open_mfdataset(). There might be some performance issues with the underlying call to xarray.open_mfdataset(). I think this e3sm_diags issue is actually related to compatibility with the multiprocessing scheduler manually defined in e3sm_diags (related issue).
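One way to check whether parallel=True helps on a given filesystem is to time the open call with each setting. This is a rough sketch: time_open is a hypothetical helper, and since open_mfdataset() is typically lazy, the elapsed time mostly reflects reading metadata and coordinates rather than loading data.

```python
import time


def time_open(opener, paths, **kwargs):
    """Call opener(paths, **kwargs) and return (result, elapsed_seconds).

    Hypothetical helper for illustration; `opener` is any callable with an
    open_mfdataset-like signature, e.g. xarray.open_mfdataset.
    """
    start = time.perf_counter()
    result = opener(paths, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed
```

For example, assuming xarray is imported as xr and paths is a list of files, time_open(xr.open_mfdataset, paths, parallel=False) and time_open(xr.open_mfdataset, paths, parallel=True) could be compared over a few repetitions.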