Not obvious what packages need to be installed to run Cubed examples #500

Closed
rbavery opened this issue Jul 14, 2024 · 4 comments

@rbavery
Contributor

rbavery commented Jul 14, 2024

This error occurs in this example notebook: https://github.com/cubed-dev/cubed/blob/main/examples/pangeo-4-climatological-anomalies.ipynb

The groupby .mean() call does not accept the method argument.

Full Traceback:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[8], line 3
      1 # Note: we actually want skipna=True, but this isn't implemented in xarray yet
      2 # see https://github.com/pydata/xarray/issues/7243
----> 3 mean = ds.groupby("time.dayofyear").mean(method="map-reduce", skipna=False)
      4 mean

File ~/miniforge3/lib/python3.10/site-packages/xarray/core/_aggregations.py:2974, in DatasetGroupByAggregations.mean(self, dim, skipna, keep_attrs, **kwargs)
   2964     return self._flox_reduce(
   2965         func="mean",
   2966         dim=dim,
   (...)
   2971         **kwargs,
   2972     )
   2973 else:
-> 2974     return self._reduce_without_squeeze_warn(
   2975         duck_array_ops.mean,
   2976         dim=dim,
   2977         skipna=skipna,
   2978         numeric_only=True,
   2979         keep_attrs=keep_attrs,
   2980         **kwargs,
   2981     )

File ~/miniforge3/lib/python3.10/site-packages/xarray/core/groupby.py:1993, in DatasetGroupByBase._reduce_without_squeeze_warn(self, func, dim, axis, keep_attrs, keepdims, shortcut, **kwargs)
   1990     warnings.filterwarnings("ignore", message="The `squeeze` kwarg")
   1991     check_reduce_dims(dim, self.dims)
-> 1993 return self._map_maybe_warn(reduce_dataset, warn_squeeze=False)

File ~/miniforge3/lib/python3.10/site-packages/xarray/core/groupby.py:1839, in DatasetGroupByBase._map_maybe_warn(self, func, args, shortcut, warn_squeeze, **kwargs)
   1829 def _map_maybe_warn(
   1830     self,
   1831     func: Callable[..., Dataset],
   (...)
   1836 ) -> Dataset:
   1837     # ignore shortcut if set (for now)
   1838     applied = (func(ds, *args, **kwargs) for ds in self._iter_grouped(warn_squeeze))
-> 1839     return self._combine(applied)

File ~/miniforge3/lib/python3.10/site-packages/xarray/core/groupby.py:1859, in DatasetGroupByBase._combine(self, applied)
   1857 def _combine(self, applied):
   1858     """Recombine the applied objects like the original."""
-> 1859     applied_example, applied = peek_at(applied)
   1860     coord, dim, positions = self._infer_concat_args(applied_example)
   1861     combined = concat(applied, dim)

File ~/miniforge3/lib/python3.10/site-packages/xarray/core/utils.py:205, in peek_at(iterable)
    201 """Returns the first value from iterable, as well as a new iterator with
    202 the same content as the original iterable
    203 """
    204 gen = iter(iterable)
--> 205 peek = next(gen)
    206 return peek, itertools.chain([peek], gen)

File ~/miniforge3/lib/python3.10/site-packages/xarray/core/groupby.py:1838, in <genexpr>(.0)
   1829 def _map_maybe_warn(
   1830     self,
   1831     func: Callable[..., Dataset],
   (...)
   1836 ) -> Dataset:
   1837     # ignore shortcut if set (for now)
-> 1838     applied = (func(ds, *args, **kwargs) for ds in self._iter_grouped(warn_squeeze))
   1839     return self._combine(applied)

File ~/miniforge3/lib/python3.10/site-packages/xarray/core/groupby.py:1980, in DatasetGroupByBase._reduce_without_squeeze_warn.<locals>.reduce_dataset(ds)
   1979 def reduce_dataset(ds: Dataset) -> Dataset:
-> 1980     return ds.reduce(
   1981         func=func,
   1982         dim=dim,
   1983         axis=axis,
   1984         keep_attrs=keep_attrs,
   1985         keepdims=keepdims,
   1986         **kwargs,
   1987     )

File ~/miniforge3/lib/python3.10/site-packages/xarray/core/dataset.py:6942, in Dataset.reduce(self, func, dim, keep_attrs, keepdims, numeric_only, **kwargs)
   6922         if (
   6923             # Some reduction functions (e.g. std, var) need to run on variables
   6924             # that don't have the reduce dims: PR5393
   (...)
   6935             # the former is often more efficient
   6936             # keep single-element dims as list, to support Hashables
   6937             reduce_maybe_single = (
   6938                 None
   6939                 if len(reduce_dims) == var.ndim and var.ndim != 1
   6940                 else reduce_dims
   6941             )
-> 6942             variables[name] = var.reduce(
   6943                 func,
   6944                 dim=reduce_maybe_single,
   6945                 keep_attrs=keep_attrs,
   6946                 keepdims=keepdims,
   6947                 **kwargs,
   6948             )
   6950 coord_names = {k for k in self.coords if k in variables}
   6951 indexes = {k: v for k, v in self._indexes.items() if k in variables}

File ~/miniforge3/lib/python3.10/site-packages/xarray/core/variable.py:1662, in Variable.reduce(self, func, dim, axis, keep_attrs, keepdims, **kwargs)
   1655 keep_attrs_ = (
   1656     _get_keep_attrs(default=False) if keep_attrs is None else keep_attrs
   1657 )
   1659 # Noe that the call order for Variable.mean is
   1660 #    Variable.mean -> NamedArray.mean -> Variable.reduce
   1661 #    -> NamedArray.reduce
-> 1662 result = super().reduce(
   1663     func=func, dim=dim, axis=axis, keepdims=keepdims, **kwargs
   1664 )
   1666 # return Variable always to support IndexVariable
   1667 return Variable(
   1668     result.dims, result._data, attrs=result._attrs if keep_attrs_ else None
   1669 )

File ~/miniforge3/lib/python3.10/site-packages/xarray/namedarray/core.py:905, in NamedArray.reduce(self, func, dim, axis, keepdims, **kwargs)
    901     if isinstance(axis, tuple) and len(axis) == 1:
    902         # unpack axis for the benefit of functions
    903         # like np.argmin which can't handle tuple arguments
    904         axis = axis[0]
--> 905     data = func(self.data, axis=axis, **kwargs)
    906 else:
    907     data = func(self.data, **kwargs)

File ~/miniforge3/lib/python3.10/site-packages/xarray/core/duck_array_ops.py:704, in mean(array, axis, skipna, **kwargs)
    702     return _to_pytimedelta(mean_timedeltas, unit="us") + offset
    703 else:
--> 704     return _mean(array, axis=axis, skipna=skipna, **kwargs)

File ~/miniforge3/lib/python3.10/site-packages/xarray/core/duck_array_ops.py:471, in _create_nan_agg_method.<locals>.f(values, axis, skipna, **kwargs)
    469     with warnings.catch_warnings():
    470         warnings.filterwarnings("ignore", "All-NaN slice encountered")
--> 471         return func(values, axis=axis, **kwargs)
    472 except AttributeError:
    473     if not is_duck_dask_array(values):

TypeError: mean() got an unexpected keyword argument 'method'
@rbavery
Contributor Author

rbavery commented Jul 14, 2024

Fixed by installing flox. I'll push a PR that lists the required packages at the top of the notebook.
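
The first traceback shows why the missing package matters: `DatasetGroupByAggregations.mean` has one branch that calls `_flox_reduce` and another that forwards `**kwargs` (including `method`) down to a plain `mean()`, which rejects it. Below is a minimal sketch of guarding the call, assuming the branch taken depends on whether flox is importable. The helper names `flox_available` and `groupby_mean_kwargs` are hypothetical and not part of xarray, flox, or Cubed.

```python
from importlib.util import find_spec


def flox_available() -> bool:
    """Return True if the flox package can be imported in this environment."""
    return find_spec("flox") is not None


def groupby_mean_kwargs(skipna: bool = False) -> dict:
    """Build keyword arguments for a groupby .mean() call.

    Hypothetical helper: `method` is only included when flox is importable,
    because the traceback shows it being rejected on the non-flox code path.
    """
    kwargs = {"skipna": skipna}
    if flox_available():
        kwargs["method"] = "map-reduce"
    return kwargs


# Usage, with `ds` being the dataset from the notebook:
# mean = ds.groupby("time.dayofyear").mean(**groupby_mean_kwargs())
```

Dropping `method` silently avoids the TypeError but may fall back to a slower reduction, so for the Cubed examples installing flox is still the better fix.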

@rbavery
Contributor Author

rbavery commented Jul 14, 2024

New error in the last cell of that notebook:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[13], line 1
----> 1 anomaly.to_zarr("anomaly.zarr", chunkmanager_store_kwargs=dict(callbacks=[RichProgressBar()]))

File ~/miniforge3/lib/python3.10/site-packages/xarray/core/dataset.py:2549, in Dataset.to_zarr(self, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options, zarr_version, write_empty_chunks, chunkmanager_store_kwargs)
   2404 """Write dataset contents to a zarr group.
   2405 
   2406 Zarr chunks are determined in the following way:
   (...)
   2545     The I/O user guide, with more details and examples.
   2546 """
   2547 from xarray.backends.api import to_zarr
-> 2549 return to_zarr(  # type: ignore[call-overload,misc]
   2550     self,
   2551     store=store,
   2552     chunk_store=chunk_store,
   2553     storage_options=storage_options,
   2554     mode=mode,
   2555     synchronizer=synchronizer,
   2556     group=group,
   2557     encoding=encoding,
   2558     compute=compute,
   2559     consolidated=consolidated,
   2560     append_dim=append_dim,
   2561     region=region,
   2562     safe_chunks=safe_chunks,
   2563     zarr_version=zarr_version,
   2564     write_empty_chunks=write_empty_chunks,
   2565     chunkmanager_store_kwargs=chunkmanager_store_kwargs,
   2566 )

File ~/miniforge3/lib/python3.10/site-packages/xarray/backends/api.py:1698, in to_zarr(dataset, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options, zarr_version, write_empty_chunks, chunkmanager_store_kwargs)
   1696 # TODO: figure out how to properly handle unlimited_dims
   1697 dump_to_store(dataset, zstore, writer, encoding=encoding)
-> 1698 writes = writer.sync(
   1699     compute=compute, chunkmanager_store_kwargs=chunkmanager_store_kwargs
   1700 )
   1702 if compute:
   1703     _finalize_store(writes, zstore)

File ~/miniforge3/lib/python3.10/site-packages/xarray/backends/common.py:258, in ArrayWriter.sync(self, compute, chunkmanager_store_kwargs)
    256 def sync(self, compute=True, chunkmanager_store_kwargs=None):
    257     if self.sources:
--> 258         chunkmanager = get_chunked_array_type(*self.sources)
    260         # TODO: consider wrapping targets with dask.delayed, if this makes
    261         # for any discernible difference in performance, e.g.,
    262         # targets = [dask.delayed(t) for t in self.targets]
    264         if chunkmanager_store_kwargs is None:

File ~/miniforge3/lib/python3.10/site-packages/xarray/namedarray/parallelcompat.py:165, in get_chunked_array_type(*args)
    159 selected = [
    160     chunkmanager
    161     for chunkmanager in chunkmanagers.values()
    162     if chunkmanager.is_chunked_array(chunked_arr)
    163 ]
    164 if not selected:
--> 165     raise TypeError(
    166         f"Could not find a Chunk Manager which recognises type {type(chunked_arr)}"
    167     )
    168 elif len(selected) >= 2:
    169     raise TypeError(f"Multiple ChunkManagers recognise type {type(chunked_arr)}")

TypeError: Could not find a Chunk Manager which recognises type <class 'cubed.array_api.array_object.Array'>
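
This second error is raised by `get_chunked_array_type`, which filters the chunk managers xarray knows about and finds none that recognises the Cubed array. Below is a rough sketch for listing what is registered. It assumes xarray discovers chunk managers through the `xarray.chunkmanagers` entry-point group and that a separate integration package (cubed-xarray, as far as I know) registers the Cubed one. The function name `registered_chunkmanagers` is hypothetical.

```python
from importlib.metadata import entry_points


def registered_chunkmanagers() -> list:
    """List the names of chunk managers registered via entry points.

    Assumption: the entry-point group is "xarray.chunkmanagers".
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Newer importlib.metadata API (Python 3.10+)
        selected = eps.select(group="xarray.chunkmanagers")
    else:
        # Older API returns a dict keyed by group name
        selected = eps.get("xarray.chunkmanagers", [])
    return sorted(ep.name for ep in selected)


print(registered_chunkmanagers())
```

If the printed list has no entry for Cubed, xarray has no way to match `cubed.array_api.array_object.Array` to a chunk manager, which is consistent with the TypeError above.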

rbavery changed the title from "TypeError: mean() got an unexpected keyword argument 'method'" to "Not obvious what packages need to be installed to run Cubed examples" on Jul 14, 2024
@rbavery
Contributor Author

rbavery commented Jul 14, 2024

Here is where I specified which packages need to be installed and their purpose. Once these are installed, the notebook example runs without any changes.

https://github.com/cubed-dev/cubed/pull/501/files#diff-49aaa2819e35a856818ecec8c9fa7e1c79ad028d3f44bd749736353cfb51bac9R62

A to-do might be to explain on the docs site which packages Cubed integrates with and how. For example, I hadn't heard of flox before, but it seems like an important prerequisite for using Cubed for map-reduce operations.
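
Until the docs cover this, a notebook could run a small preflight cell that reports missing integration packages together with the reason each is needed. This is only a sketch: flox is the only package named in this issue, the `cubed_xarray` module name and both purpose strings are my assumptions, and `missing_packages` is a hypothetical helper.

```python
from importlib.util import find_spec

# Assumed module names and purposes; only flox is named in this issue.
OPTIONAL_PACKAGES = {
    "flox": "groupby reductions called with method='map-reduce'",
    "cubed_xarray": "lets xarray recognise Cubed arrays as chunked arrays",
}


def missing_packages(packages: dict) -> dict:
    """Return the subset of `packages` whose modules cannot be imported."""
    return {
        name: purpose
        for name, purpose in packages.items()
        if find_spec(name) is None
    }


for name, purpose in missing_packages(OPTIONAL_PACKAGES).items():
    print(f"Missing {name}: needed for {purpose}")
```

Running this at the top of an example surfaces a missing package before a long traceback from deep inside xarray does.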

@rbavery
Contributor Author

rbavery commented Jul 23, 2024

Addressed by #507.

rbavery closed this as completed on Jul 23, 2024