vivienm/rust-uniq-ch

uniq-ch

A Rust library for counting distinct elements in a stream, using the ClickHouse uniq data structure.

It uses BJKST, a probabilistic algorithm based on adaptive sampling that provides fast, accurate, and deterministic estimates. Two BJKSTs can be merged, which makes the data structure well suited to map-reduce settings.
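To illustrate the idea of adaptive sampling and merging, here is a minimal, self-contained sketch using only the standard library. It is not the uniq-ch implementation or its API: the names `AdaptiveSampler`, `capacity`, `level`, `insert`, `merge`, and `estimate` are hypothetical, and the real data structure may differ in details such as hashing and bias correction. The sketch keeps only hashes whose lowest `level` bits are zero, raises `level` whenever the sample grows past `capacity`, and scales the sample size by `2^level` to estimate the count.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Toy adaptive-sampling counter (hypothetical, not the uniq-ch type).
struct AdaptiveSampler {
    /// Only hashes whose lowest `level` bits are all zero are retained.
    level: u32,
    /// Maximum number of retained hashes before the sampling rate is halved.
    capacity: usize,
    /// The retained hashes.
    sample: HashSet<u64>,
}

impl AdaptiveSampler {
    fn new(capacity: usize) -> Self {
        Self { level: 0, capacity, sample: HashSet::new() }
    }

    /// Hash an item with fixed keys, so the same input gives the same hash.
    fn hash_of<T: Hash>(item: &T) -> u64 {
        let mut hasher = DefaultHasher::new();
        item.hash(&mut hasher);
        hasher.finish()
    }

    /// Raise the level (halving the sampling rate) until the sample fits.
    fn shrink(&mut self) {
        while self.sample.len() > self.capacity {
            self.level += 1;
            let level = self.level;
            self.sample.retain(|h| h.trailing_zeros() >= level);
        }
    }

    /// Add one item; duplicates hash to the same value and are absorbed by the set.
    fn insert<T: Hash>(&mut self, item: &T) {
        let hash = Self::hash_of(item);
        if hash.trailing_zeros() >= self.level {
            self.sample.insert(hash);
            self.shrink();
        }
    }

    /// Merge another sampler: adopt the higher level, union the samples, re-shrink.
    fn merge(&mut self, other: &Self) {
        self.level = self.level.max(other.level);
        let level = self.level;
        self.sample.retain(|h| h.trailing_zeros() >= level);
        self.sample.extend(
            other.sample.iter().copied().filter(|h| h.trailing_zeros() >= level),
        );
        self.shrink();
    }

    /// Estimated number of distinct items: sample size times 2^level.
    fn estimate(&self) -> u64 {
        (self.sample.len() as u64) << self.level
    }
}
```

In this sketch, two samplers built independently over different shards of the data and then merged end up in the same state as a single sampler fed the whole stream, provided both use the same `capacity`. That property is what makes this kind of structure convenient in map-reduce settings. A larger `capacity` trades memory for a tighter estimate.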

Documentation

Examples

use uniq_ch::Bjkst;

let mut bjkst = Bjkst::new();

// Add some elements, with duplicates.
bjkst.extend(0..75_000);
bjkst.extend(25_000..100_000);

// Count the distinct elements.
assert!((99_000..101_000).contains(&bjkst.len()));
