---
layout: home
permalink: /v_alpha/
title: "Seldonian | Version alpha"
---
The Seldonian Toolkit is currently in the alpha stage of development. To install the latest version of the libraries in the toolkit:
{% highlight bash %}
pip install --upgrade seldonian-engine
pip install --upgrade seldonian-experiments
{% endhighlight %}

These versions contain the latest bug fixes and features.
In this document, we cover what is included in, and excluded from, the alpha version.
### Packages/libraries

- Seldonian Engine API, a library that implements the Seldonian algorithm described here.
- [Seldonian Experiments API](https://seldonian-toolkit.github.io/Experiments/build/html/index.html), a library for evaluating the safety and performance of Seldonian algorithms run with the Engine.
- [Seldonian Interface GUI](https://seldonian-toolkit.github.io/GUI/build/html/index.html), an interactive GUI for creating behavioral constraints via drag-and-drop; one example of a Seldonian interface.
### Engine features

- A command-line Seldonian interface.
- Student's $t$-test for the confidence-bound calculation.
- A parse tree capable of handling a wide range of user-provided behavioral constraints. Constraints can consist of the basic mathematical operations (+, -, /, *) and any combination of the min, max, abs, and exp functions.
- A parse tree visualizer.
- Efficient bound propagation in the parse tree, which limits the number of confidence intervals that need to be calculated.
- Support for an arbitrary number of behavioral constraints in a single Seldonian algorithm.
- A user-specified split fraction between candidate selection and the safety test.
- Dataset loaders for CSV-formatted datasets.
- Gradient descent with the Adam optimizer as a module option for candidate selection.
- Black-box optimization using SciPy with a barrier function as a module option for candidate selection.
- A gradient descent visualizer.
- Automatic differentiation via the "autograd" Python library for gradient descent.
- Support for user-provided gradient functions for custom primary objectives, plus several built-in gradient functions that are often faster than autograd.
- Support for parametric supervised learning algorithms (binary classification and regression) as well as offline ("batch") reinforcement learning algorithms.
- Example reinforcement learning policies, such as softmax, supporting discrete and continuous observation spaces.
- A modular design that makes implementing user-defined models and constraints seamless for developers, with tutorials to help guide the design.
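To make the confidence-bound feature concrete, here is a minimal sketch of a one-sided upper confidence bound on a mean computed with Student's $t$-test, the kind of calculation used when bounding a behavioral constraint. The function name and signature are illustrative assumptions, not the Engine's actual API; it uses SciPy's `t.ppf` for the inverse $t$ CDF.

```python
import math
import statistics

from scipy import stats  # scipy.stats.t is the Student's t distribution


def ttest_upper_bound(samples, delta):
    """Illustrative one-sided (1 - delta) upper confidence bound on the
    mean of `samples`, via Student's t-test. NOT the Engine's API."""
    n = len(samples)
    mean = statistics.fmean(samples)
    std = statistics.stdev(samples)  # sample standard deviation (ddof=1)
    # Inverse CDF of the t distribution with n - 1 degrees of freedom
    t = stats.t.ppf(1.0 - delta, n - 1)
    return mean + std / math.sqrt(n) * t
```

As expected, a stricter confidence level (smaller `delta`) yields a larger, more conservative upper bound.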
### Experiments features

- A three-plot generator (performance, solution rate, failure rate) for supervised learning and reinforcement learning Seldonian algorithms.
- Logistic regression and random classifier baseline models for comparison against Seldonian classification algorithms.
- A Fairlearn experiment runner covering several types of fairness constraints, for comparison against Seldonian classification algorithms.
- Generation of resampled datasets that approximate ground truth using no additional data (supervised learning only).
- Generation of new episodes, to use as ground truth, from existing policy parameterizations (reinforcement learning only).
- A modular design that makes implementing new baselines seamless for developers.
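The resampled-dataset feature can be pictured as bootstrap resampling: drawing rows with replacement to build same-sized stand-in ground-truth datasets from the original data alone. The sketch below is a generic illustration of that idea; the function name and signature are assumptions, not the Experiments library's API.

```python
import random


def resample_datasets(rows, n_trials, seed=0):
    """Illustrative bootstrap resampling: build `n_trials` copies of a
    dataset by sampling rows with replacement. Each copy has the same
    number of rows as the original. NOT the Experiments library's API."""
    rng = random.Random(seed)  # seeded for reproducible trials
    return [[rng.choice(rows) for _ in rows] for _ in range(n_trials)]
```

Each resampled copy plays the role of a fresh dataset drawn from the (approximated) ground-truth distribution, so an experiment can be repeated many times without collecting new data.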
### GUI features

- A Flask application that users run locally.
- Upload of locally stored datasets.
- Drag-and-drop construction of a wide array of behavioral constraints.
- Five definitions of fairness, hardcoded for quick reference and use.
### Excluded from the alpha version

Many of the features below are in development. Check the Coming soon page to learn more, and feel free to raise an issue on GitHub requesting new features.

- The $t$-test confidence bound used when calculating the upper bound on the behavioral constraint relies on reasonable but possibly false assumptions about the distribution of the data. As a result, the algorithms implemented in version alpha are quasi-Seldonian. [Hoeffding's](https://en.wikipedia.org/wiki/Hoeffding%27s_inequality) concentration inequality does not rely on such assumptions and, once incorporated into the Engine, will enable running true Seldonian algorithms.
- Multiclass classification is not yet supported.
- Multiple label columns in a dataset are not supported (supervised learning); currently, only a single label column is allowed.
- Nonparametric machine learning models (e.g., random forest) are not yet supported.
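For contrast with the $t$-test bound above, here is a sketch of the distribution-free upper bound Hoeffding's inequality gives for samples known to lie in $[a, b]$: $\hat{\mu} + (b - a)\sqrt{\ln(1/\delta)/(2n)}$. The function name is an illustrative assumption, not the Engine's planned API.

```python
import math
import statistics


def hoeffding_upper_bound(samples, delta, a=0.0, b=1.0):
    """Illustrative one-sided (1 - delta) upper confidence bound on the
    mean of `samples`, each assumed to lie in [a, b], via Hoeffding's
    inequality. Makes no distributional assumptions. NOT the Engine's API."""
    n = len(samples)
    # Hoeffding's bound: mean + (b - a) * sqrt(ln(1/delta) / (2n))
    return statistics.fmean(samples) + (b - a) * math.sqrt(
        math.log(1.0 / delta) / (2.0 * n)
    )
```

The bound holds with probability at least $1 - \delta$ regardless of the data's distribution, which is what makes an algorithm built on it Seldonian rather than quasi-Seldonian; the price is that it is typically looser than the $t$-test bound at the same sample size.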