Builder -> Engine builder; typo in UpdateHandler
Baxter Eaves committed Nov 15, 2023
1 parent f11e8f4 commit f01c2ae
Showing 14 changed files with 97 additions and 65 deletions.
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -20,6 +20,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Moved `Example` code into the `examples` feature flag (on by default)
- Replaced instances of `once_cell::sync::OnceCell` with `std::sync::OnceLock`
- Renamed all files/methods with the name `feather` to `arrow`
- Renamed `Builder` to `EngineBuilder`
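
For readers unfamiliar with the standard-library type named in the `OnceCell` entry above, here is a minimal sketch of one-time initialization with `std::sync::OnceLock` (stable since Rust 1.70). The `CONFIG` static and the `config` function are hypothetical names chosen for illustration and do not come from Lace.

```rust
use std::sync::OnceLock;

// Hypothetical global used only for illustration; not a name taken from Lace.
static CONFIG: OnceLock<String> = OnceLock::new();

/// Returns the lazily initialized value. The closure runs at most once,
/// even if several threads call this function concurrently.
fn config() -> &'static str {
    CONFIG.get_or_init(|| {
        // Stand-in for any expensive one-time setup.
        String::from("default")
    })
}
```

After the first call to `config()`, `CONFIG.get()` returns `Some(..)` and a later `CONFIG.set(..)` returns an `Err` carrying the rejected value.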

### Fixed

- Fixed typo: `UpdateHandler::finialize` is now `UpdateHandler::finalize`

## [python-0.4.1] - 2023-10-19

1 change: 0 additions & 1 deletion README.md
@@ -28,7 +28,6 @@
<a href='https://www.lace.dev/'>User guide</a> |
<a href='https://docs.rs/lace/latest/lace/'>Rust API</a> |
<a href='https://pylace.readthedocs.io/en/latest/'>Python API</a> |
<a href='#'>CLI</a>
</div>
<div>
<strong>Installation</strong>:
61 changes: 42 additions & 19 deletions book/src/workflow/workflow.md
@@ -9,32 +9,55 @@ The typical workflow consists of two or three steps:
Step 1 is optional in many cases as Lace usually does a good job of inferring
the types of your data. The condensed workflow looks like this.


Create an optional codebook using the CLI.

```console
$ lace codebook --csv data.csv codebook.yaml
```

Run a model.

```console
$ lace run --csv data.csv --codebook codebook.yaml -n 5000 metadata.lace
```

Open the model in lace

<div class=tabbed-blocks>

```python
import pandas as pd
import lace

engine = lace.Engine.load('metadata.lace')
df = pd.read_csv("mydata.csv", index_col=0)

# 1. Create a codebook (optional)
codebook = lace.Codebook.from_df(df)

# 2. Initialize a new Engine from the prior. If no codebook is provided, a
# default will be generated
engine = lace.Engine.from_df(df, codebook=codebook)

# 3. Run inference
engine.run(5000)
```

```rust,noplayground
use lace::Engine;
let engine = Engine::load("metadata.lace")?;
use polars::prelude::{SerReader, CsvReader};
use lace::prelude::*;
let df = CsvReader::from_path("mydata.csv")
    .unwrap()
    .has_header(true)
    .finish()
    .unwrap();
// 1. Create a codebook (optional)
let codebook = Codebook::from_df(&df, None, None, false).unwrap();
// 2. Build an engine
let mut engine = EngineBuilder::new(DataSource::Polars(df))
    .with_codebook(codebook)
    .build()
    .unwrap();
// 3. Run inference
// Use `run` to fit with the default transition set and update handlers; use
// `update` for more control.
engine.run(5_000);
```

</div>
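
The builder in step 2 of the Rust tab uses a consuming-builder pattern. The following is a simplified stand-in sketch of that pattern, not Lace's implementation: it reuses the defaults visible in `builder.rs` in this commit (8 states, ID offset 0) and the behaviors its tests check (state IDs start at the offset, zero states is an error). The `ZeroStatesError` type, the `Option` type of the `id_offset` field, and a `build` that returns only the state IDs are assumptions made to keep the example self-contained.

```rust
// Simplified stand-in for illustration only; this is NOT Lace's implementation.
const DEFAULT_NSTATES: usize = 8;
const DEFAULT_ID_OFFSET: usize = 0;

/// Hypothetical error type for a request to build zero states.
#[derive(Debug, PartialEq)]
struct ZeroStatesError;

struct EngineBuilder {
    n_states: Option<usize>,
    id_offset: Option<usize>, // assumed field type
}

impl EngineBuilder {
    #[must_use]
    fn new() -> Self {
        Self { n_states: None, id_offset: None }
    }

    /// Each setter takes `self` by value and returns it, so calls chain.
    #[must_use]
    fn with_nstates(mut self, n_states: usize) -> Self {
        self.n_states = Some(n_states);
        self
    }

    #[must_use]
    fn id_offset(mut self, id_offset: usize) -> Self {
        self.id_offset = Some(id_offset);
        self
    }

    /// Resolves unset options to their defaults and returns the state IDs
    /// that a built engine would carry.
    fn build(self) -> Result<Vec<usize>, ZeroStatesError> {
        let n_states = self.n_states.unwrap_or(DEFAULT_NSTATES);
        if n_states == 0 {
            return Err(ZeroStatesError);
        }
        let offset = self.id_offset.unwrap_or(DEFAULT_ID_OFFSET);
        Ok((offset..offset + n_states).collect())
    }
}
```

In this sketch, `EngineBuilder::new().id_offset(3).build()` yields the IDs 3 through 10, and `with_nstates(0)` makes `build` return an error.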

You can also use the CLI to create codebooks and run inference. Creating a default YAML codebook with the CLI and then editing it manually is a good way to fine-tune models.

```console
$ lace codebook --csv mydata.csv codebook.yaml
$ lace run --csv mydata.csv --codebook codebook.yaml -n 5000 metadata.lace
```
2 changes: 1 addition & 1 deletion cli/Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 2 additions & 2 deletions cli/src/routes.rs
@@ -5,7 +5,7 @@ use lace::codebook::Codebook;
use lace::metadata::{deserialize_file, serialize_obj};
use lace::stats::rv::dist::Gamma;
use lace::update_handler::{CtrlC, ProgressBar, Timeout};
use lace::{Builder, Engine};
use lace::{Engine, EngineBuilder};

use crate::opt;

@@ -84,7 +84,7 @@ fn new_engine(cmd: opt::RunArgs) -> i32 {
return 1;
};

let mut builder = Builder::new(data_source)
let mut builder = EngineBuilder::new(data_source)
.with_nstates(cmd.nstates)
.id_offset(cmd.id_offset);

2 changes: 1 addition & 1 deletion lace/examples/count_model.rs
@@ -24,7 +24,7 @@ fn main() {
writeln!(file, "{},{}", ix, x).unwrap();
});

Builder::new(DataSource::Csv(file.path().into()))
EngineBuilder::new(DataSource::Csv(file.path().into()))
.with_nstates(2)
.seed_from_u64(1337)
.build()
25 changes: 15 additions & 10 deletions lace/src/interface/engine/builder.rs
@@ -11,7 +11,7 @@ const DEFAULT_NSTATES: usize = 8;
const DEFAULT_ID_OFFSET: usize = 0;

/// Builds `Engine`s
pub struct Builder {
pub struct EngineBuilder {
n_states: Option<usize>,
codebook: Option<Codebook>,
data_source: DataSource,
@@ -28,7 +28,7 @@ pub enum BuildEngineError {
DefaultCodebookError(#[from] DefaultCodebookError),
}

impl Builder {
impl EngineBuilder {
#[must_use]
pub fn new(data_source: DataSource) -> Self {
Self {
@@ -41,7 +41,7 @@ impl Builder {
}
}

/// Eith a certain number of states
/// With a certain number of states
#[must_use]
pub fn with_nstates(mut self, n_states: usize) -> Self {
self.n_states = Some(n_states);
@@ -132,7 +132,7 @@ mod tests {

#[test]
fn default_build_settings() {
let engine = Builder::new(animals_csv()).build().unwrap();
let engine = EngineBuilder::new(animals_csv()).build().unwrap();
let state_ids: BTreeSet<usize> =
engine.state_ids.iter().copied().collect();
let target_ids: BTreeSet<usize> = btreeset! {0, 1, 2, 3, 4, 5, 6, 7};
@@ -142,7 +142,10 @@ mod tests {

#[test]
fn with_id_offet_3() {
let engine = Builder::new(animals_csv()).id_offset(3).build().unwrap();
let engine = EngineBuilder::new(animals_csv())
.id_offset(3)
.build()
.unwrap();
let state_ids: BTreeSet<usize> =
engine.state_ids.iter().copied().collect();
let target_ids: BTreeSet<usize> = btreeset! {3, 4, 5, 6, 7, 8, 9, 10};
@@ -152,8 +155,10 @@ mod tests {

#[test]
fn with_nstates_3() {
let engine =
Builder::new(animals_csv()).with_nstates(3).build().unwrap();
let engine = EngineBuilder::new(animals_csv())
.with_nstates(3)
.build()
.unwrap();
let state_ids: BTreeSet<usize> =
engine.state_ids.iter().copied().collect();
let target_ids: BTreeSet<usize> = btreeset! {0, 1, 2};
@@ -163,7 +168,7 @@ mod tests {

#[test]
fn with_nstates_0_causes_error() {
let result = Builder::new(animals_csv()).with_nstates(0).build();
let result = EngineBuilder::new(animals_csv()).with_nstates(0).build();

assert!(result.is_err());
}
@@ -172,13 +177,13 @@ mod tests {
fn seeding_engine_works() {
let seed: u64 = 8_675_309;
let nstates = 4;
let mut engine_1 = Builder::new(animals_csv())
let mut engine_1 = EngineBuilder::new(animals_csv())
.with_nstates(nstates)
.seed_from_u64(seed)
.build()
.unwrap();

let mut engine_2 = Builder::new(animals_csv())
let mut engine_2 = EngineBuilder::new(animals_csv())
.with_nstates(nstates)
.seed_from_u64(seed)
.build()
10 changes: 5 additions & 5 deletions lace/src/interface/engine/mod.rs
@@ -3,7 +3,7 @@ mod data;
pub mod error;
pub mod update_handler;

pub use builder::{BuildEngineError, Builder};
pub use builder::{BuildEngineError, EngineBuilder};
pub use data::{
AppendStrategy, InsertDataActions, InsertMode, OverwriteMode, Row,
SupportExtension, Value, WriteMode,
@@ -1030,7 +1030,7 @@ impl Engine {
.collect::<Result<Vec<State>, _>>()?;
}
std::mem::drop(update_handlers);
update_handler.finialize();
update_handler.finalize();

Ok(())
}
@@ -1119,12 +1119,12 @@ mod tests {
false
}

fn finialize(&mut self) {
fn finalize(&mut self) {
self.0.write().unwrap().insert("finalize".to_string());
}
}

let mut engine = Builder::new(animals_csv()).build().unwrap();
let mut engine = EngineBuilder::new(animals_csv()).build().unwrap();

let called_methods = Arc::new(RwLock::new(HashSet::new()));
let update_handler = TestingHandler(called_methods.clone());
@@ -1158,7 +1158,7 @@ mod tests {
// It does not test that the StateTimeout successfully ends states that have gone over the duration
#[test]
fn state_timeout_update_handler() {
let mut engine = Builder::new(animals_csv()).build().unwrap();
let mut engine = EngineBuilder::new(animals_csv()).build().unwrap();

let config = EngineUpdateConfig::new().default_transitions().n_iters(1);

16 changes: 8 additions & 8 deletions lace/src/interface/engine/update_handler.rs
@@ -45,7 +45,7 @@ use crate::EngineUpdateConfig;
/// self.timings.lock().unwrap().push(Instant::now());
/// }
///
/// fn finialize(&mut self) {
/// fn finalize(&mut self) {
/// let timings = self.timings.lock().unwrap();
/// let mean_time_between_updates =
/// timings.iter().zip(timings.iter().skip(1))
@@ -106,7 +106,7 @@ pub trait UpdateHandler: Clone + Send + Sync {
///
/// This method is called when all updating is complete.
/// Uses for this method include cleanup, report generation, etc.
fn finialize(&mut self) {}
fn finalize(&mut self) {}
}
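
The trait method above has an empty default body, and the tuple and collection implementations later in this file forward the call to each member. A minimal self-contained sketch of that forwarding pattern follows. It uses a stand-in trait with only the `finalize` method. Lace's real `UpdateHandler` has more methods and additional bounds, and `CountingHandler` and `Silent` are hypothetical types for illustration.

```rust
// Simplified stand-in trait for illustration; Lace's real `UpdateHandler`
// has more methods and requires `Clone + Send + Sync`.
trait UpdateHandler {
    /// Called once when all updating is complete; does nothing by default.
    fn finalize(&mut self) {}
}

/// Hypothetical handler that counts how often `finalize` ran.
#[derive(Default)]
struct CountingHandler {
    finalized: usize,
}

impl UpdateHandler for CountingHandler {
    fn finalize(&mut self) {
        self.finalized += 1;
    }
}

/// Hypothetical handler that relies on the default no-op `finalize`.
struct Silent;

impl UpdateHandler for Silent {}

// A vector of handlers forwards `finalize` to every element.
impl<T: UpdateHandler> UpdateHandler for Vec<T> {
    fn finalize(&mut self) {
        self.iter_mut().for_each(|handler| handler.finalize());
    }
}

// A pair of handlers forwards `finalize` to both members in order.
impl<A: UpdateHandler, B: UpdateHandler> UpdateHandler for (A, B) {
    fn finalize(&mut self) {
        self.0.finalize();
        self.1.finalize();
    }
}
```

With these implementations, calling `finalize` on a `Vec<CountingHandler>` or on a `(CountingHandler, Silent)` pair runs each member's hook exactly once.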

macro_rules! impl_tuple {
@@ -151,9 +151,9 @@ macro_rules! impl_tuple {
)||+
}

fn finialize(&mut self) {
fn finalize(&mut self) {
$(
self.$idx.finialize();
self.$idx.finalize();
)+
}

@@ -204,8 +204,8 @@ where
false
}

fn finialize(&mut self) {
self.iter_mut().for_each(|handler| handler.finialize());
fn finalize(&mut self) {
self.iter_mut().for_each(|handler| handler.finalize());
}
}

@@ -285,7 +285,7 @@ impl UpdateHandler for Timeout {
}
}

fn finialize(&mut self) {}
fn finalize(&mut self) {}
}

/// Limit the time each state can run for during an `Engine::update`.
@@ -427,7 +427,7 @@ impl UpdateHandler for ProgressBar {
false
}

fn finialize(&mut self) {
fn finalize(&mut self) {
if let Self::Initialized { sender, handle } = std::mem::take(self) {
std::mem::drop(sender);

2 changes: 1 addition & 1 deletion lace/src/interface/mod.rs
@@ -5,7 +5,7 @@ mod metadata;
mod oracle;

pub use engine::{
update_handler, AppendStrategy, BuildEngineError, Builder, Engine,
update_handler, AppendStrategy, BuildEngineError, Engine, EngineBuilder,
InsertDataActions, InsertMode, OverwriteMode, Row, SupportExtension, Value,
WriteMode,
};
8 changes: 4 additions & 4 deletions lace/src/lib.rs
@@ -188,10 +188,10 @@ pub use index::*;
pub use config::EngineUpdateConfig;

pub use interface::{
update_handler, utils, AppendStrategy, BuildEngineError, Builder,
ConditionalEntropyType, DatalessOracle, Engine, Given, HasData, HasStates,
ImputeUncertaintyType, InsertDataActions, InsertMode, Metadata,
MiComponents, MiType, Oracle, OracleT, OverwriteMode,
update_handler, utils, AppendStrategy, BuildEngineError,
ConditionalEntropyType, DatalessOracle, Engine, EngineBuilder, Given,
HasData, HasStates, ImputeUncertaintyType, InsertDataActions, InsertMode,
Metadata, MiComponents, MiType, Oracle, OracleT, OverwriteMode,
PredictUncertaintyType, Row, RowSimilarityVariant, SupportExtension, Value,
WriteMode,
};
8 changes: 4 additions & 4 deletions lace/src/prelude.rs
@@ -1,10 +1,10 @@
//! Common import for general use.
pub use crate::{
update_handler, AppendStrategy, Builder, Datum, Engine, EngineUpdateConfig,
Given, ImputeUncertaintyType, InsertMode, MiType, OracleT, OverwriteMode,
PredictUncertaintyType, Row, RowSimilarityVariant, SupportExtension, Value,
WriteMode,
update_handler, AppendStrategy, Datum, Engine, EngineBuilder,
EngineUpdateConfig, Given, ImputeUncertaintyType, InsertMode, MiType,
OracleT, OverwriteMode, PredictUncertaintyType, Row, RowSimilarityVariant,
SupportExtension, Value, WriteMode,
};

pub use crate::data::DataSource;