Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overhaul of Python architecture #731

Merged
merged 22 commits into from
Nov 14, 2023
Merged

Overhaul of Python architecture #731

merged 22 commits into from
Nov 14, 2023

Conversation

evetion
Copy link
Member

@evetion evetion commented Nov 2, 2023

Fixes #512
Groundwork for #630 & #252

The code has become much simpler in some places (e.g. write_toml).

from ribasim import Model

m = Model(filepath="generated_testmodels/basic/ribasim.toml")
m.database.node.df # Node table
m.basin.static.df # BasinStatc table
m.write("test")

Some notes:

  • The config.py file cannot be autogenerated anymore. The schemas still can, but I disabled it for now to be sure (some imports error).

Changes:

I created new (parent) classes:

  • BaseModel, from Pydantic, with our own config
  • FileModel, like hydrolib-core (but now Pydantic v2), which can take a single filepath for initilization. Models who inherit require defining _load and _save, dealing with that filepath.
  • NodeModel, a class for nodes (where add will be).
  • TableModel, a class to read/write individual tables from/to gpkg/arrow.
  • SpatialTableModel, inherits TableModel, but reads/writes spatial stuff.

I changed:

  • the Model class to be a carbon copy of Config (which has been deleted), so it mirrors the toml.
  • in turn this created a database NodeModel class (reflecting the field in the toml), with only Node and Edge underneath.
  • the NodeModel classes Basin from their node_type version to the one in Config, and set the type of the underlying table with a TypeVar like so:
class Terminal(NodeModel):
    static: TableModel[TerminalStaticSchema]

Yet to do:

  • Update tests
  • Fix sort! rules
  • Delete node_types folder
  • Link schemas to their Pydantic class (TableModel[TerminalStaticSchema] => TerminalStatic)

visr and others added 9 commits October 16, 2023 22:17
Does not yet work with ribasim-python.
The testmodels need to be present for the build process to work

I've also formatted, so the diff is slightly bigger than expected
Mypy doesn't like these dict defaults in config.py, since they are not the same type as the annotated type. Can't we somehow avoid copying the same defaults twice here?
@visr
Copy link
Member

visr commented Nov 2, 2023

The config cannot be autogenerated without overwriting this code.

What do you mean by this?

@evetion
Copy link
Member Author

evetion commented Nov 2, 2023

The config cannot be autogenerated without overwriting this code.

What do you mean by this?

We currently autogenerate all code in config.py, but I changed that code manually and use it extensively. Unless I write a new template for the codegenerator, I don't see autogeneration happening in the near future. (we can/do still autogenerate the schemas though).

@visr
Copy link
Member

visr commented Nov 2, 2023

Ok thanks. That's fine, I don't see that as a blocker.

Copy link
Contributor

@Hofer-Julian Hofer-Julian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good @evetion.

To later review the usage, tests or even better the example jupyter notebook would be helpful.

pixi.toml Outdated Show resolved Hide resolved
python/ribasim/pyproject.toml Outdated Show resolved Hide resolved
python/ribasim/ribasim/config.py Outdated Show resolved Hide resolved
python/ribasim/ribasim/config.py Outdated Show resolved Hide resolved
python/ribasim/ribasim/config.py Outdated Show resolved Hide resolved
python/ribasim/ribasim/model.py Outdated Show resolved Hide resolved
python/ribasim/ribasim/model.py Outdated Show resolved Hide resolved
python/ribasim/ribasim/model.py Outdated Show resolved Hide resolved
python/ribasim/ribasim/model.py Outdated Show resolved Hide resolved
python/ribasim/ribasim/config.py Outdated Show resolved Hide resolved
@Hofer-Julian
Copy link
Contributor

I've added 507d925 to your branch so similar type hint lints are caught by ruff directly.

@evetion
Copy link
Member Author

evetion commented Nov 3, 2023

I've added 507d925 to your branch so similar type hint lints are caught by ruff directly.

I had to revert it for now, as the autogenerated schemas still use the old notation. :/ Will see if I can fix that at a later stage.

@visr
Copy link
Member

visr commented Nov 7, 2023

the autogenerated schemas still use the old notation

Is that fixed by target-python-version? datamodel-codegen --target-python-version=3.10
Mentioned in #653 (comment).

@evetion
Copy link
Member Author

evetion commented Nov 13, 2023

the autogenerated schemas still use the old notation

Is that fixed by target-python-version? datamodel-codegen --target-python-version=3.10 Mentioned in #653 (comment).

Nope, but it was --use-union-operator 🎉

@evetion evetion marked this pull request as ready for review November 13, 2023 16:43
@evetion evetion requested review from visr and Hofer-Julian November 13, 2023 17:17
pixi.toml Outdated Show resolved Hide resolved
@Hofer-Julian
Copy link
Contributor

I've added be766f9 as an example how to simplify the code

Copy link
Contributor

@Hofer-Julian Hofer-Julian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changeset is naturally massive so I didn't check every detail.
But overall it looks good

python/ribasim/ribasim/input_base.py Outdated Show resolved Hide resolved
Copy link
Member

@visr visr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Left some comments, but they could also be moved out to separate issues or follow up PRs if you prefer.

{"flow_rate": 1, "node_id": -1, "remarks": "", "active": True},
ignore_index=True,
)
# Currently can't handle mixed NaN and None in a DataFrame
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a pandas limitation? We could benefit from a clear guide on how we deal with / represent missing data for different data types on both Julia, Python, Arrow and GeoPackage sides.

See also this comment: @evetion what do you recommend for missing / default string columns; missing or ""?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pandas is fine, it's just that with adding new columns, mixing None (object) and NaN (float) doesn't play nice with pandera. I might've fixed that in the last commit. I will write a guide in the subsequent PR on how to handle missing values in Python. (I think it's pretty clear in Julia & geopackage).

Comment on lines +100 to +103
database=ribasim.Database(
node=node,
edge=edge,
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that these are now together in a single class, and understand you named it Database since these tables always have to be in the database. But wouldn't a name like Network better represent what it really is? If all tables are stored in the database, users may expect to find all tables in ribasim.Database.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The field is database because it is named so in the toml. I thought about naming the class Topology, but Network is a good alternative. This can be renamed in the follow up PR, when we actually start using said class more extensively.

@evetion evetion merged commit 633d635 into main Nov 14, 2023
@evetion evetion deleted the feat/pyrefact branch November 14, 2023 15:13
visr added a commit that referenced this pull request Nov 14, 2023
Fixes #512
Groundwork for #630 & #252

The code has become much simpler in some places (e.g. write_toml).

```python
from ribasim import Model

m = Model(filepath="generated_testmodels/basic/ribasim.toml")
m.database.node.df # Node table
m.basin.static.df # BasinStatc table
m.write("test")
```

### Some notes:
- The config.py file cannot be autogenerated anymore. The schemas still
can, but I disabled it for now to be sure (some imports error).

### Changes:
I created new (parent) classes:
- BaseModel, from Pydantic, with our own config
- FileModel, like hydrolib-core (but now Pydantic v2), which can take a
single filepath for initilization. Models who inherit require defining
_load and _save, dealing with that filepath.
- NodeModel, a class for nodes (where `add` will be).
- TableModel, a class to read/write individual tables from/to
gpkg/arrow.
- SpatialTableModel, inherits TableModel, but reads/writes spatial
stuff.

I changed:
- the Model class to be a carbon copy of Config (which has been
deleted), so it mirrors the toml.
- in turn this created a `database` NodeModel class (reflecting the
field in the toml), with only Node and Edge underneath.
- the NodeModel classes Basin from their node_type version to the one in
Config, and set the type of the underlying table with a TypeVar like so:
```python
class Terminal(NodeModel):
    static: TableModel[TerminalStaticSchema]
```

### Yet to do:
- [x] Update tests
- [x] Fix sort! rules
- [x] Delete node_types folder
- [x] Link schemas to their Pydantic class
(TableModel[TerminalStaticSchema] => TerminalStatic)

---------

Co-authored-by: Martijn Visser <[email protected]>
Co-authored-by: Hofer-Julian <[email protected]>
Co-authored-by: Hofer-Julian <[email protected]>
visr added a commit that referenced this pull request Nov 14, 2023
I noticed with #731 we went back from Python 3.12 to 3.11. This updates
all dependencies, including Python back to 3.12.

I did this using pixi 0.7.0 which just came out:
https://github.com/prefix-dev/pixi/releases/tag/v0.7.0

The lockfile format seems the same, so I'm hoping this works on TeamCity
with pixi 0.6.0, but I triggered tests to make sure. There is a new top
level `version: 2` entry on top of the lockfile, so it looks like they
started versioning it.
evetion added a commit that referenced this pull request Nov 16, 2023
Follow-up as mentioned in
#731 (comment)

Not passing the tests yet, as I don't understand how to have a different
TOML key `geopackage` and field `network`. Writing to TOML is 'fixed' by
bbb2982, but tests are still failing.
@evetion could you help out?

---------

Co-authored-by: Maarten Pronk <[email protected]>
This was referenced Nov 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor Python Model
3 participants