PyVortex
--------

The generated documentation for this branch is available at https://spiraldb.github.io/vortex/docs/

The Python package is now structured like this:

- `vortex`
  - `array()`: converts a list or an Arrow array into a Vortex array.
  - `encoding`
    - `Array`: In Rust this is called a `PyArray`; it is just a PyO3 wrapper around a Vortex Rust `Array`.
      - `to_pandas`
      - `to_numpy`
    - `compress()`: compresses an Array.
  - `dtype`: A module containing dtype constructors, e.g. `uint(32, nullable=False)`
  - `io`: Readers and writers which currently only work for Struct arrays without top-level nulls.
    - `read()`
    - `write()`
  - `expr`
    - `Expr`: a class, implemented in Rust, which constructs vortex-exprs using the obvious Python operators.
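
To illustrate what "constructs expressions using Python operators" means, here is a minimal pure-Python sketch of the operator-overloading technique. It is not the Rust-backed `vortex.expr.Expr`: the class body, the `column` helper, and the rendered output format are all hypothetical stand-ins chosen for this example.

```python
class Expr:
    """Hypothetical expression node; NOT the real vortex.expr.Expr."""

    def __init__(self, op, *args):
        self.op = op
        self.args = args

    def _binary(self, op, other):
        # Wrap plain Python values as literal nodes so both sides are Exprs.
        rhs = other if isinstance(other, Expr) else Expr("literal", other)
        return Expr(op, self, rhs)

    # Comparison operators build tree nodes instead of returning booleans.
    def __gt__(self, other):
        return self._binary(">", other)

    def __lt__(self, other):
        return self._binary("<", other)

    def __eq__(self, other):
        return self._binary("==", other)

    # `&` and `|` stand in for boolean and/or, which cannot be overloaded.
    def __and__(self, other):
        return self._binary("and", other)

    def __or__(self, other):
        return self._binary("or", other)

    def __repr__(self):
        if self.op == "column":
            return f"column({self.args[0]!r})"
        if self.op == "literal":
            return repr(self.args[0])
        lhs, rhs = self.args
        return f"({lhs!r} {self.op} {rhs!r})"


def column(name):
    """Hypothetical helper that refers to a column by name."""
    return Expr("column", name)


row_filter = (column("age") > 30) & (column("name") == "Joseph")
print(row_filter)  # ((column('age') > 30) and (column('name') == 'Joseph'))
```

The parentheses around each comparison are required because `&` binds more tightly than `>` and `==` in Python.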

I also added `python_repr`, which returns a Display-able struct that renders itself in the Python
`repr` style. In particular, the dtypes look like `uint(32, False)` rather than `u32`.
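
A rough Python analogue of that wrapper pattern, for illustration only. The real `python_repr` is Rust code; the `UInt` and `PythonRepr` classes below are hypothetical, and the compact form here ignores nullability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UInt:
    """Hypothetical unsigned-integer dtype, used only for this sketch."""

    bits: int
    nullable: bool

    def __str__(self):
        # Compact form, like the default Rust rendering mentioned above.
        return f"u{self.bits}"


class PythonRepr:
    """Wrapper whose string form mimics a Python constructor call."""

    def __init__(self, dtype):
        self.dtype = dtype

    def __str__(self):
        return f"uint({self.dtype.bits}, {self.dtype.nullable})"


def python_repr(dtype):
    # Return a display-only wrapper instead of changing the dtype itself.
    return PythonRepr(dtype)


dtype = UInt(32, False)
print(dtype)               # u32
print(python_repr(dtype))  # uint(32, False)
```

Keeping the alternate rendering in a separate wrapper leaves the default display untouched for callers that expect the compact form.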

I think the only bugfixes in this PR are:

1. pyvortex/src/encode.rs: propagate the nullability from Arrow to `Array::from_arrow`.
2. arrow/recordbatch.rs and arrow/dtype.rs need to return compatible nullability and validity.

Future Work
-----------

1. Automatically generate and deploy the documentation to github.io.
2. Run `cd pyvortex/docs && make doctest` on every commit.
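
For context on item 2: a doctest run executes the `>>>` examples embedded in docstrings and compares the printed results. The sketch below shows the same idea with the standard-library `doctest` module rather than Sphinx; `clamp` and `check_docstring` are hypothetical names that are not part of this PR.

```python
import doctest


def clamp(value, low, high):
    """Clamp ``value`` into the closed interval [low, high].

    >>> clamp(5, 0, 3)
    3
    >>> clamp(-1, 0, 3)
    0
    """
    return max(low, min(value, high))


def check_docstring(func):
    """Run the examples in a function's docstring; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    # The globals dict makes the function visible to its own examples.
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted


failed, attempted = check_docstring(clamp)
print(f"{attempted} examples, {failed} failures")  # 2 examples, 0 failures
```

Running such a check on every commit catches documentation examples that drift out of sync with the code.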
danking committed Sep 4, 2024
1 parent fd49140 commit 67a213d
Showing 34 changed files with 1,618 additions and 212 deletions.
5 changes: 4 additions & 1 deletion .gitignore
@@ -72,7 +72,7 @@ instance/
 .scrapy

 # Sphinx documentation
-docs/_build/
+pyvortex/docs/_build/

 # PyBuilder
 .pybuilder/
@@ -196,3 +196,6 @@ data/

 # vscode
 .vscode/
+
+# Emacs
+*~
7 changes: 7 additions & 0 deletions Cargo.lock

8 changes: 8 additions & 0 deletions pyvortex/Cargo.toml
@@ -18,21 +18,29 @@ workspace = true
[lib]
name = "pyvortex"
crate-type = ["rlib", "cdylib"]
doctest = false

[dependencies]
arrow = { workspace = true, features = ["pyarrow"] }
flexbuffers = { workspace = true }
futures = { workspace = true }
log = { workspace = true }
paste = { workspace = true }
pyo3 = { workspace = true }
pyo3-log = { workspace = true }
tokio = { workspace = true, features = ["fs"] }
vortex-alp = { workspace = true }
vortex-array = { workspace = true }
vortex-dict = { workspace = true }
vortex-dtype = { workspace = true }
vortex-error = { workspace = true }
vortex-expr = { workspace = true }
vortex-fastlanes = { workspace = true }
vortex-roaring = { workspace = true }
vortex-runend = { workspace = true }
vortex-sampling-compressor = { workspace = true }
vortex-serde = { workspace = true, features = ["tokio"] }
vortex-scalar = { workspace = true }
vortex-zigzag = { workspace = true }

# We may need this workaround?
20 changes: 20 additions & 0 deletions pyvortex/docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?= -W --keep-going
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
40 changes: 40 additions & 0 deletions pyvortex/docs/conf.py
@@ -0,0 +1,40 @@
# Configuration file for the Sphinx documentation builder.
#
# For the full list of built-in configuration values, see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

project = "Vortex"
copyright = "2024, Spiral"
author = "Spiral"

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration

extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.intersphinx",
    "sphinx.ext.doctest",
]

templates_path = ["_templates"]
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]

intersphinx_mapping = {
    "python": ("https://docs.python.org/3", None),
    "pyarrow": ("https://arrow.apache.org/docs/", None),
    "pandas": ("https://pandas.pydata.org/docs/", None),
    "numpy": ("https://numpy.org/doc/stable/", None),
}

nitpicky = True # ensures all :class:, :obj:, etc. links are valid

doctest_global_setup = "import pyarrow; import vortex"

# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_theme = "pydata_sphinx_theme"
# html_static_path = ['_static'] # no static files yet
7 changes: 7 additions & 0 deletions pyvortex/docs/dtype.rst
@@ -0,0 +1,7 @@
Array Data Types
================

.. automodule:: vortex.dtype
   :members:
   :imported-members:

7 changes: 7 additions & 0 deletions pyvortex/docs/encoding.rst
@@ -0,0 +1,7 @@
Arrays
======

.. automodule:: vortex.encoding
   :members:
   :imported-members:
   :special-members: __len__
6 changes: 6 additions & 0 deletions pyvortex/docs/expr.rst
@@ -0,0 +1,6 @@
Row Filter Expressions
======================

.. automodule:: vortex.expr
   :members:
   :imported-members:
19 changes: 19 additions & 0 deletions pyvortex/docs/index.rst
@@ -0,0 +1,19 @@
.. Vortex documentation master file, created by
   sphinx-quickstart on Wed Aug 28 10:10:21 2024.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Vortex documentation
====================

Vortex is an Apache Arrow-compatible toolkit for working with compressed array data.

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   encoding
   dtype
   io
   expr

6 changes: 6 additions & 0 deletions pyvortex/docs/io.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
Input and Output
================

.. automodule:: vortex.io
   :members:
   :imported-members:
35 changes: 35 additions & 0 deletions pyvortex/docs/make.bat
@@ -0,0 +1,35 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=.
set BUILDDIR=_build

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.https://www.sphinx-doc.org/
exit /b 1
)

if "%1" == "" goto help

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
9 changes: 7 additions & 2 deletions pyvortex/pyproject.toml
@@ -5,7 +5,9 @@ description = "Add your description here"
 authors = [
 { name = "Nicholas Gates", email = "[email protected]" }
 ]
-dependencies = []
+dependencies = [
+"pydata-sphinx-theme>=0.15.4",
+]
 requires-python = ">= 3.11"
 classifiers = ["Private :: Do Not Upload"]

@@ -17,7 +19,10 @@ build-backend = "maturin"
 managed = true
 dev-dependencies = [
 "pyarrow>=15.0.0",
-"pip"
+"pip",
+"sphinx>=8.0.2",
+"ipython>=8.26.0",
+"pandas>=2.2.2",
 ]

 [tool.maturin]
7 changes: 5 additions & 2 deletions pyvortex/python/vortex/__init__.py
@@ -1,4 +1,7 @@
from ._lib import * # noqa: F403
-from ._lib import __doc__ as module_docs
+from ._lib import __doc__ as module_docs, io, expr, dtype
+from . import encoding


__doc__ = module_docs
del module_docs
array = encoding.array