Merge pull request #134 from promised-ai/docs/2023-09-28-updates-and-fixes

Docs/2023 09 28 updates and fixes
BaxterEaves authored Sep 29, 2023
2 parents 829b8fb + 95b443c commit c4c9464
Showing 11 changed files with 93 additions and 16 deletions.
101 changes: 89 additions & 12 deletions README.md
@@ -68,6 +68,26 @@ distribution over your dataset, which enables users to...

and more, all in one place, without any explicit model building.

```python
import pandas as pd
import lace

# Create an engine from a dataframe
df = pd.read_csv("animals.csv", index_col=0)
engine = lace.Engine.from_df(df)

# Fit a model to the dataframe over 5000 steps of the fitting procedure
engine.update(5000)

# Show the statistical structure of the data -- which features are likely
# dependent (predictive) on each other
engine.clustermap("depprob", zmin=0, zmax=1)
```

![Animals dataset dependence probability](assets/animals-depprob.png)



## The Problem

The goal of lace is to fill some of the massive chasm between standard machine
@@ -105,36 +125,62 @@ themselves from scratch, meaning they must know (or at least guess) the model.
PPL users must also know how to specify such a model in a way that is
compatible with the underlying inference procedure.

### Who should not use lace
### Example use cases

- **Combine data sources and understand how they interact.** For example, we
may wish to predict cognitive decline from demographics, survey or task
performance, EKG data, and other clinical data. Combined, this data would
typically be very sparse (most patients will not have all fields filled
in), and it is difficult to know how to explicitly model the interaction of
these data layers. In Lace, we would just concatenate the layers and run
them through.
- **Understanding the amount and causes of uncertainty over time.** For
example, a farmer may wish to understand the likelihood of achieving a
specific yield over the growing season. As the season progresses, new
weather data can be added to the prediction in the form of conditions.
Uncertainty can be visualized as variance in the prediction, disagreement
between posterior samples, or multi-modality in the predictive distribution
(see [this blog post](https://redpoll.ai/blog/ml-uncertainty/) for more
information on uncertainty).
- **Data quality control.** Use `surprisal` to find anomalous data in the table
and use `-logp` to identify anomalies before they enter the table. Because
Lace creates a model of the data, we can also contrive methods to find data
that are *inconsistent* with that model, which we have used to good effect
in error finding.
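
The data quality bullet can be illustrated with a toy sketch. It assumes that surprisal means the negative log-likelihood of an observed value under a model, which is the standard definition. It does not call Lace: the single-Gaussian model, the helper names `gaussian_logp` and `surprisal`, and the `periods` values are all made up for illustration.

```python
import math


def gaussian_logp(x, mean, std):
    """Log density of x under a normal distribution (toy stand-in for a model)."""
    var = std * std
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def surprisal(values):
    """Negative log-likelihood of each value under a normal model fit to all values."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [-gaussian_logp(v, mean, std) for v in values]


# Hypothetical orbital periods in minutes; the last entry is the odd one out.
periods = [95.0, 96.5, 94.8, 97.1, 95.9, 1436.0]
scores = surprisal(periods)

# The row with the highest surprisal is the first candidate for review.
most_surprising = max(range(len(periods)), key=lambda i: scores[i])
```

In this toy, a higher score means the value is less probable under the fitted model. Lace's own model is far richer than a single Gaussian, so the sketch only conveys the idea of ranking cells by how unexpected they are.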

### Who should not use Lace

There are a number of use cases for which Lace is not suited:

- Non-tabular data such as images and text
- Highly optimizing specific predictions
+ Lace would rather over-generalize than overfit.


## Quick start

Install the CLI and pylace (requires [rust and
cargo](https://www.rust-lang.org/tools/install))
### Installation

```console
Lace requires Rust.

To install the CLI:
```
$ cargo install --locked lace
$ pip install py-lace
```

First, use the CLI to fit a model to your data
To install pylace

```console
$ lace run --csv satellites.csv -n 5000 -s 32 --seed 1337 satellites.lace
```
$ pip install pylace
```

Then load the model and start asking questions
### Examples

Lace comes with two pre-fit example data sets: Satellites and Animals.

```python
>>> from lace import Engine
>>> engine = Engine(metadata='satellites.lace')
>>> from lace.examples import Satellites
>>> engine = Satellites()

# Predict the class of orbit given the satellite has a 75-minute
# orbital period and that it has a missing value of geosynchronous
@@ -176,9 +222,13 @@ And similarly in rust:

```rust,noplayground
use lace::prelude::*;
use lace::examples::Example;

fn main() {
let mut engine = Engine::load("satellites.lace").unrwap();
// In Rust, you can create an Engine or an Oracle. The Oracle is an
// immutable version of an Engine; it has the same inference functions as
// the Engine, but you cannot train or edit data.
let mut engine = Example::Satellites.engine().unwrap();

// Predict the class of orbit given the satellite has a 75-minute
// orbital period and that it has a missing value of geosynchronous
@@ -196,6 +246,33 @@ fn main() {
}
```

### Fitting a model

To fit a model to your own data you can use the CLI

```console
$ lace run --csv my-data.csv -n 1000 my-data.lace
```

...or initialize an engine from a file or dataframe.

```python
>>> import pandas as pd # Lace supports polars as well
>>> from lace import Engine
>>> engine = Engine.from_df(pd.read_csv("my-data.csv", index_col=0))
>>> engine.update(1_000)
>>> engine.save("my-data.lace")
```

You can monitor the progress of the training using diagnostic plots

```python
>>> from lace.plot import diagnostics
>>> diagnostics(engine)
```

![Animals MCMC convergence](assets/animals-convergence.png)

## License

Lace is licensed under Server Side Public License (SSPL), which is a copyleft
Binary file added assets/animals-convergence.png
Binary file added assets/animals-depprob.png
Binary file added assets/sats-depprob.png
Binary file added assets/sats-period-uncertainty.png
2 changes: 1 addition & 1 deletion book/src/appendix/references.md
@@ -51,4 +51,4 @@ and examples, see Mansinghka et al [^pcc-jmlr].
[^pcc-jmlr]: Mansinghka, V., Shafto, P., Jonas, E., Petschulat, C., Gasner,
M., & Tenenbaum, J. B. (2016). Crosscat: A fully bayesian nonparametric
method for analyzing heterogeneous, high dimensional data.
[(PDF)](jmlr.org/papers/volume17/11-392/11-392.pdf)
[(PDF)](https://jmlr.org/papers/volume17/11-392/11-392.pdf)
2 changes: 1 addition & 1 deletion book/src/appendix/stats-primer.md
@@ -76,5 +76,5 @@ The CRP metaphor works like this: you are on your lunch break and, as one often

where \\(z_i\\) is the table of customer i, \\(n_k\\) is the number of customers currently seated at table \\(k\\), and \\(N_{-i}\\) is the total number of seated customers, not including customer i (who is still deciding where to sit).

Under the CRP formalism, we make inferences about what datum belongs to which category. The weight vector is implicit. That's it. For information on how inference is done in DPMMs check out the [literature recommendations](#literature-recommendations).
Under the CRP formalism, we make inferences about what datum belongs to which category. The weight vector is implicit. That's it. For information on how inference is done in DPMMs check out the [literature recommendations](stats-primer.md).
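
The seating rule described above can be sketched as a short simulation. This is a minimal illustration, not Lace's inference code: the function name `crp_assignments` and the concentration parameter `alpha` are assumptions that do not appear in the text shown here. The sketch uses the standard CRP rule, in which a customer joins table \\(k\\) with weight \\(n_k\\) and opens a new table with weight `alpha`.

```python
import random


def crp_assignments(n_customers, alpha, seed=0):
    """Seat customers one at a time according to the Chinese restaurant process.

    `alpha` (assumed name) is the concentration parameter and must be positive.
    """
    rng = random.Random(seed)
    tables = []       # tables[k] holds n_k, the number seated at table k
    assignments = []  # assignments[i] holds z_i, the table of customer i
    for _ in range(n_customers):
        # Existing tables are weighted by n_k and a new table by alpha.
        # random.choices normalizes the weights, so dividing by the total
        # (N_{-i} + alpha) is implicit.
        weights = tables + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(tables):
            tables.append(1)  # the customer opens a new table
        else:
            tables[k] += 1    # the customer joins an occupied table
        assignments.append(k)
    return assignments, tables


assignments, tables = crp_assignments(50, alpha=1.0)
```

The first customer always opens table 0 because no other table exists yet. Larger values of `alpha` tend to produce more tables, while the weight vector over categories never has to be represented explicitly.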

Binary file added book/src/assets/animals-convergence.png
Binary file added book/src/assets/animals-depprob.png
2 changes: 1 addition & 1 deletion book/src/workflow/workflow.md
@@ -29,7 +29,7 @@ Open the model in lace
```python
import lace

engine = lace.Engine(metadata='metadata.lace')
engine = lace.Engine.load('metadata.lace')
```

```rust,noplayground
2 changes: 1 addition & 1 deletion book/theme/index.hbs
@@ -133,7 +133,7 @@
<i class="fa fa-paint-brush"></i>
</button>
<ul id="theme-list" class="theme-popup" aria-label="Themes" role="menu">
<!-- <li role="none"><button role="menuitem" class="theme" id="light">{{ theme_option "Light" }}</button></li> -->
<li role="none"><button role="menuitem" class="theme" id="light">{{ theme_option "Light" }}</button></li>
<!-- <li role="none"><button role="menuitem" class="theme" id="rust">{{ theme_option "Rust" }}</button></li> -->
<!-- <li role="none"><button role="menuitem" class="theme" id="coal">{{ theme_option "Coal" }}</button></li> -->
<!-- <li role="none"><button role="menuitem" class="theme" id="navy">{{ theme_option "Navy" }}</button></li> -->