PyVortex #729
67a213d to 4d55c61
I think we can move some things around. Creating the tokio runtime is a bit annoying, but that's a future problem to solve.
3f31390 to 2bfb65b
PyVortex
--------

The generated documentation for this branch is available at https://spiraldb.github.io/vortex/docs/

The Python package is now structured like this:

- `vortex`
  - `array()`: converts a list or an Arrow array into a Vortex array.
  - `encodings`
    - `Array`: In Rust this is called a PyArray, and it is just a PyO3 wrapper around a Vortex Rust Array.
      - `to_pandas`
      - `to_numpy`
    - `compress()`: compresses an Array.
  - `dtype`: A module containing dtype constructors, e.g. `uint(32, nullable=False)`
  - `io`: Readers and writers which currently only work for Struct arrays without top-level nulls.
    - `read()`
    - `write()`
  - `expr`
    - `Expr`: a class, implemented in Rust, which constructs vortex-exprs using the obvious Python operators.

I also added `python_repr`, which returns a Display-able struct that renders itself in the Python `repr` style. In particular, the dtypes look like `uint(32, False)` rather than `u32`.

I think the only bugfixes in this PR are:

1. pyvortex/src/encode.rs: propagate the nullability from Arrow to `Array::from_arrow`.
2. arrow/recordbatch.rs and arrow/dtype.rs need to return compatible nullability and validity.

Future Work
-----------

1. Automatically generate and deploy the documentation to github.io.
2. Run `cd pyvortex/docs && make doctest` on every commit.
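To illustrate the `python_repr` idea described above, here is a minimal sketch of a borrowing wrapper whose `Display` impl prints the Python-style form. The `DType` enum, the `PythonRepr` struct, the `py_bool` helper, and the `?` suffix for nullable types are all assumptions made for illustration; they are not taken from the Vortex source, whose dtype has many more variants.

```rust
use std::fmt;

// Hypothetical, simplified stand-in for a Vortex dtype (assumption).
enum DType {
    UInt { bits: u8, nullable: bool },
    Int { bits: u8, nullable: bool },
}

// Compact Rust-style rendering, e.g. `u32`. The `?` suffix for
// nullable types is an assumption made for this sketch.
impl fmt::Display for DType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let (prefix, bits, nullable) = match self {
            DType::UInt { bits, nullable } => ("u", bits, nullable),
            DType::Int { bits, nullable } => ("i", bits, nullable),
        };
        write!(f, "{}{}{}", prefix, bits, if *nullable { "?" } else { "" })
    }
}

// Hypothetical wrapper: borrows the dtype and renders it in Python `repr` style.
struct PythonRepr<'a>(&'a DType);

impl DType {
    // Returns a Display-able value instead of allocating a String up front.
    fn python_repr(&self) -> PythonRepr<'_> {
        PythonRepr(self)
    }
}

// Python spells booleans `True` / `False`.
fn py_bool(b: bool) -> &'static str {
    if b { "True" } else { "False" }
}

impl fmt::Display for PythonRepr<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self.0 {
            DType::UInt { bits, nullable } => {
                write!(f, "uint({}, {})", bits, py_bool(*nullable))
            }
            DType::Int { bits, nullable } => {
                write!(f, "int({}, {})", bits, py_bool(*nullable))
            }
        }
    }
}

fn main() {
    let unsigned = DType::UInt { bits: 32, nullable: false };
    let signed = DType::Int { bits: 64, nullable: true };
    println!("{}", unsigned);               // Rust-style form
    println!("{}", unsigned.python_repr()); // Python-style form
    println!("{}", signed.python_repr());
}
```

Returning a wrapper rather than a `String` lets the caller decide whether to allocate, and it composes with `write!` and `format!` like any other `Display` value.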
f1bd829 to 52851b3
pyvortex/src/dtype.rs (outdated)
@@ -26,8 +31,12 @@ impl PyDType {
        format!("{}", self.inner)
    }

    fn __repr__(&self) -> String {
        format!("{}", self.inner.python_repr())
This can be `self.inner.python_repr().to_string()` instead of `format!("{}", self.inner.python_repr())`.
Done.
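For context on the suggestion above: any type that implements `Display` gets `to_string()` from the standard library's blanket `ToString` impl, so the two spellings produce the same `String`. A small sketch, where `Repr` is a hypothetical stand-in for whatever `python_repr()` returns:

```rust
use std::fmt;

// Hypothetical stand-in for the value returned by `python_repr()`.
struct Repr(u32);

impl fmt::Display for Repr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "uint({}, False)", self.0)
    }
}

// The original spelling from the diff.
fn via_format(r: &Repr) -> String {
    format!("{}", r)
}

// The suggested spelling; `to_string` comes from `impl<T: Display> ToString for T`.
fn via_to_string(r: &Repr) -> String {
    r.to_string()
}

fn main() {
    let r = Repr(32);
    // Both paths go through the same Display impl.
    assert_eq!(via_format(&r), via_to_string(&r));
    println!("{}", via_to_string(&r));
}
```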