Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

demo xarray with attrs #57

Open
wants to merge 13 commits into
base: develop
Choose a base branch
from
Open

Conversation

wpbonelli
Copy link
Member

@wpbonelli wpbonelli commented Dec 7, 2024

Demo how an attrs-based component hierarchy might expose a xarray.DataTree view of itself.

The idea is to provide a .data attribute on input component classes which is either a data tree or a data set, depending whether the item is a node or a leaf in the hierarchy.

This is implemented with a datatree decorator which is applied on top of attrs.define().

@datatree
@define(slots=False)
class Npf:
    ...

It could also be implemented as a base class, or a decorator wrapping attrs.define() directly, though I think the attrs maintainers recommend composing instead of wrapping.

The decorator intercepts the __attrs_post_init__ hook and runs some extra code after it, including registering parent/child relationships so the full tree is accessible from any item in it.

This pattern assumes simulations are constructed from the top down as usual, i.e. first a simulation, which is passed into the model's initializer, which is passed into the package's initializer, etc

sim = Sim()
tdis = Tdis(sim=sim, nper=1, perioddata=[Tdis.PeriodData()])
gwf = Gwf(sim=sim)
dis = Dis(model=gwf)
ic = Ic(model=gwf, strt=1.0)
oc = Oc(model=gwf, perioddata=[Oc.Steps()])
npf = Npf(model=gwf, icelltype=0, k=1.0)

Here is a sample view from the top (simulation) level:

sim.data
<xarray.DataTree 'sim'>
Group: /
├── Group: /tdis
│       Dimensions:     (nper: 1)
│       Dimensions without coordinates: nper
│       Data variables:
│           perioddata  (nper) object 8B Tdis.PeriodData(perlen=1.0, nstp=1, tsmult=1.0)
└── Group: /gwf
    ├── Group: /gwf/dis
    │       Dimensions:  (ncol: 2, nrow: 2, nlay: 1)
    │       Dimensions without coordinates: ncol, nrow, nlay
    │       Data variables:
    │           delr     (ncol) float64 16B 1.0 1.0
    │           delc     (nrow) float64 16B 1.0 1.0
    │           top      (ncol, nrow) float64 32B 1.0 1.0 1.0 1.0
    │           botm     (ncol, nrow, nlay) float64 32B 0.0 0.0 0.0 0.0
    ├── Group: /gwf/ic
    │       Dimensions:  (nodes: 4)
    │       Dimensions without coordinates: nodes
    │       Data variables:
    │           strt     (nodes) float64 32B 1.0 1.0 1.0 1.0
    ├── Group: /gwf/oc
    │       Dimensions:     (nper: 1)
    │       Dimensions without coordinates: nper
    │       Data variables:
    │           perioddata  (nper) object 8B Oc.Steps(first='first', last=None, all=None,...
    └── Group: /gwf/npf
            Dimensions:    (nodes: 4)
            Dimensions without coordinates: nodes
            Data variables:
                icelltype  (nodes) int64 32B 0 0 0 0
                k          (nodes) float64 32B 1.0 1.0 1.0 1.0

Note there are several ways we could represent certain record types (e.g. for period data), there is some flexibility in the conversion from mf6 definitions. This is just a sketch of one option.

Dimensions so far are just dis and tdis i.e. nlay, ncol, nrow, nper.

I think a similar pattern could work for output as well. Output would also need space/time dimensions/coordinates.

I discovered xarray can only represent list data if records are classes — it treats tuples as an additional dimension in the array, instead of considering the tuple the item itself. This is true even of NamedTuple subclasses. This is more reason we should use proper classes for record types I suppose.

There are several shortcuts taken in this demo just to get it working, e.g.

  • the way subcomponents discover dis/tdis is a hack and may need a better implementation
  • shape parsing is currently only done at init time, but it should run as a validation step whenever the attribute is set

@wpbonelli wpbonelli marked this pull request as ready for review December 12, 2024 20:30
@deltamarnix
Copy link
Contributor

mypy is giving quite some errors in the code.

For example this piece

def _parse_dim_names(s: str) -> tuple[str]:
    return tuple( # returns a tuple[str, ...], instead of tuple[str]
        [
            ss.strip()
            for ss in s.strip().replace("(", "").replace(")", "").split(",")
            if any(ss)
        ]
    )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants