Merge pull request #2 from GG2002/for-review
add some doc
bsbds authored Jul 18, 2024
2 parents f08d86f + 18afea8 commit 45232af
Showing 10 changed files with 2,354 additions and 0 deletions.
648 changes: 648 additions & 0 deletions Cargo.lock

Large diffs are not rendered by default.

23 changes: 23 additions & 0 deletions Cargo.toml
@@ -0,0 +1,23 @@
[package]
name = "interval_map"
version = "0.1.0"
edition = "2021"
authors = ["feathercyc [email protected]"]
description = "`interval_map` is a map based on interval tree."
license = "Apache-2.0"
keywords = ["Interval Tree", "Augmented Tree", "Red-Black Tree"]

[dependencies]

[dev-dependencies]
criterion = "0.5.1"
rand = "0.8.5"

[features]
default = []
interval_tree_find_overlap_ordered = []

[[bench]]
name = "bench"
path = "benches/bench.rs"
harness = false
36 changes: 36 additions & 0 deletions README.md
@@ -0,0 +1,36 @@
# interval_map

`interval_map` is a map based on an interval tree. It fully implements red-black tree insertion and deletion, so each modifying operation takes $O(\log N)$ time.

The interval tree implementation in `interval_map` follows "Introduction to Algorithms" (3rd ed., Section 14.3: Interval trees, pp. 348–354).

To handle insertion and deletion safely and efficiently in Rust, `interval_map` **uses arrays to simulate pointers**: the parent and child references of the red-black tree are stored as indices into an array of nodes rather than as raw pointers. This approach also ensures that `IntervalMap` has the `Send` and `Unpin` traits, so it can be transferred between threads and can still be moved after being pinned, for example when held across asynchronous operations.
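
As a rough illustration of the index-based approach, here is a minimal sketch (not the crate's actual code). The names `Idx`, `NIL`, `Node`, `Tree`, and `assert_send_unpin` are hypothetical, and the insert is a plain unbalanced binary-search-tree insert, without the red-black rebalancing or interval augmentation that `interval_map` performs.

```rust
/// Hypothetical index newtype; `usize::MAX` plays the role of a null pointer.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
struct Idx(usize);

const NIL: Idx = Idx(usize::MAX);

#[derive(Debug)]
struct Node {
    key: i32,
    left: Idx,
    right: Idx,
    parent: Idx,
}

/// All nodes live in one `Vec`; every link is an index into that `Vec`.
#[derive(Debug)]
struct Tree {
    nodes: Vec<Node>,
    root: Idx,
}

impl Tree {
    fn new() -> Self {
        Tree { nodes: Vec::new(), root: NIL }
    }

    /// Plain (unbalanced) BST insert; returns the index of the new node.
    fn insert(&mut self, key: i32) -> Idx {
        let new = Idx(self.nodes.len());
        let mut parent = NIL;
        let mut cur = self.root;
        // Walk down to the leaf position, following index links.
        while cur != NIL {
            parent = cur;
            cur = if key < self.nodes[cur.0].key {
                self.nodes[cur.0].left
            } else {
                self.nodes[cur.0].right
            };
        }
        self.nodes.push(Node { key, left: NIL, right: NIL, parent });
        if parent == NIL {
            self.root = new;
        } else if key < self.nodes[parent.0].key {
            self.nodes[parent.0].left = new;
        } else {
            self.nodes[parent.0].right = new;
        }
        new
    }

    /// Iterative in-order traversal, returning the keys in sorted order.
    fn in_order(&self) -> Vec<i32> {
        let mut out = Vec::new();
        let mut stack = Vec::new();
        let mut cur = self.root;
        while cur != NIL || !stack.is_empty() {
            while cur != NIL {
                stack.push(cur);
                cur = self.nodes[cur.0].left;
            }
            // The stack is non-empty here, so `pop` cannot fail.
            let top = stack.pop().unwrap();
            out.push(self.nodes[top.0].key);
            cur = self.nodes[top.0].right;
        }
        out
    }
}

/// Compiles only for types that are both `Send` and `Unpin`.
fn assert_send_unpin<T: Send + Unpin>() {}
```

Because the links are plain integers rather than raw pointers or `Rc` handles, calling `assert_send_unpin::<Tree>()` compiles without any `unsafe` code, which is the property the paragraph above refers to.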

`interval_map` implements an `IntervalMap` struct:
- It accepts `Interval<T>` as the key, where `T` can be any type that implements `Ord + Clone`. Therefore, intervals such as $[1, 2)$ and $["aaa", "bbb")$ are allowed.
- The value can be of any type.

`interval_map` supports the `insert`, `remove`, and `iter` operations. Traversal follows the ordering of `Interval<T>`, which compares the lower bound first and then the upper bound. For instance, with intervals of type `Interval<u32>`:
- $[1,4)<[2,5)$, because $1<2$
- $[1,4)<[1,5)$, because the lower bounds are equal and $4<5$

So the order of intervals in `IntervalMap` is $[1,4)<[1,5)<[2,5)$.
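
The ordering described above can be sketched with a derived lexicographic `Ord`. This is only an illustration under stated assumptions: the `Interval` struct and the `sorted` helper below are hypothetical stand-ins, and the crate's own `Interval<T>` may define its ordering differently internally.

```rust
/// Hypothetical stand-in for a half-open interval `[low, high)`.
/// Deriving `Ord` compares fields in declaration order: `low`, then `high`.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct Interval<T> {
    low: T,
    high: T,
}

impl<T: Ord> Interval<T> {
    fn new(low: T, high: T) -> Self {
        Interval { low, high }
    }
}

/// Sorts intervals using the derived ordering.
fn sorted(mut v: Vec<Interval<u32>>) -> Vec<Interval<u32>> {
    v.sort();
    v
}
```

With this definition, sorting $[2,5)$, $[1,5)$, $[1,4)$ yields the same sequence as listed above, and the comparison works equally for string bounds such as `Interval::new("aaa", "bbb")`.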

Currently, `interval_map` only supports half-open intervals, i.e., $[...,...)$.
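
To make the half-open convention concrete, the sketch below uses the standard library's `std::ops::Range`, which is also half-open, as a stand-in for the crate's interval type. The helper names `overlaps` and `contains_point` are hypothetical and are not part of `interval_map`.

```rust
use std::ops::Range;

/// Two half-open intervals overlap iff each one starts before the other ends.
fn overlaps<T: Ord>(a: &Range<T>, b: &Range<T>) -> bool {
    a.start < b.end && b.start < a.end
}

/// A half-open interval includes its lower bound and excludes its upper bound.
fn contains_point<T: Ord>(r: &Range<T>, p: &T) -> bool {
    r.start <= *p && *p < r.end
}
```

Under this convention $[1,4)$ and $[4,6)$ do not overlap, because the point $4$ belongs only to the second interval.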

## Benchmark

The benchmark was run on an `AMD R7 7840H` with `DDR5 5600MHz` memory. Each figure is the time for one full benchmark iteration over N intervals, not for a single operation. The results are as follows:

1. Insert only

| insert             | 100       | 1,000     | 10,000    | 100,000   |
| ------------------ | --------- | --------- | --------- | --------- |
| Time for N inserts | 5.4168 µs | 80.518 µs | 2.2823 ms | 36.528 ms |

2. Insert N and remove N

| insert_and_remove              | 100       | 1,000     | 10,000    | 100,000   |
| ------------------------------ | --------- | --------- | --------- | --------- |
| Time for N inserts + N removes | 10.333 µs | 223.43 µs | 4.9358 ms | 81.634 ms |

## TODO
- [ ] Support for $(...,...)$, $[...,...]$ and $(...,...]$ interval types.
- [ ] Add more tests like [etcd](https://github.com/etcd-io/etcd/blob/main/pkg/adt/interval_tree_test.go)
- [ ] Add Point type for Interval
113 changes: 113 additions & 0 deletions benches/bench.rs
@@ -0,0 +1,113 @@
use criterion::{criterion_group, criterion_main, Bencher, Criterion};
use interval_map::{Interval, IntervalMap};
use std::hint::black_box;

/// Deterministic xorshift32 PRNG, so every run benchmarks the same intervals.
struct Rng {
    state: u32,
}

impl Rng {
    fn new() -> Self {
        Self { state: 0x87654321 }
    }

    fn gen_u32(&mut self) -> u32 {
        self.state ^= self.state << 13;
        self.state ^= self.state >> 17;
        self.state ^= self.state << 5;
        self.state
    }

    /// Returns a value in `[low, high)`; requires `low < high`.
    fn gen_range_i32(&mut self, low: i32, high: i32) -> i32 {
        let d = (high - low) as u32;
        low + (self.gen_u32() % d) as i32
    }
}

struct IntervalGenerator {
    rng: Rng,
    limit: i32,
}

impl IntervalGenerator {
    fn new() -> Self {
        const LIMIT: i32 = 100000;
        Self {
            rng: Rng::new(),
            limit: LIMIT,
        }
    }

    fn next(&mut self) -> Interval<i32> {
        let low = self.rng.gen_range_i32(0, self.limit - 1);
        let high = self.rng.gen_range_i32(low + 1, self.limit);
        Interval::new(low, high)
    }
}

// Helper: inserts `count` pre-generated intervals into a fresh map per iteration.
fn interval_map_insert(count: usize, bench: &mut Bencher) {
    let mut gen = IntervalGenerator::new();
    let intervals: Vec<_> = std::iter::repeat_with(|| gen.next()).take(count).collect();
    bench.iter(|| {
        let mut map = IntervalMap::new();
        for i in intervals.clone() {
            black_box(map.insert(i, ()));
        }
    });
}

// Helper: inserts `count` intervals, then removes each of them again.
fn interval_map_insert_remove(count: usize, bench: &mut Bencher) {
    let mut gen = IntervalGenerator::new();
    let intervals: Vec<_> = std::iter::repeat_with(|| gen.next()).take(count).collect();
    bench.iter(|| {
        let mut map = IntervalMap::new();
        for i in intervals.clone() {
            black_box(map.insert(i, ()));
        }
        for i in &intervals {
            // `i` is already a reference, so no extra `&` is needed.
            black_box(map.remove(i));
        }
    });
}

fn bench_interval_map_insert(c: &mut Criterion) {
    c.bench_function("bench_interval_map_insert_100", |b| {
        interval_map_insert(100, b)
    });
    c.bench_function("bench_interval_map_insert_1000", |b| {
        interval_map_insert(1000, b)
    });
    c.bench_function("bench_interval_map_insert_10,000", |b| {
        interval_map_insert(10_000, b)
    });
    c.bench_function("bench_interval_map_insert_100,000", |b| {
        interval_map_insert(100_000, b)
    });
}

fn bench_interval_map_insert_remove(c: &mut Criterion) {
    c.bench_function("bench_interval_map_insert_remove_100", |b| {
        interval_map_insert_remove(100, b)
    });
    c.bench_function("bench_interval_map_insert_remove_1000", |b| {
        interval_map_insert_remove(1000, b)
    });
    c.bench_function("bench_interval_map_insert_remove_10,000", |b| {
        interval_map_insert_remove(10_000, b)
    });
    c.bench_function("bench_interval_map_insert_remove_100,000", |b| {
        interval_map_insert_remove(100_000, b)
    });
}

fn criterion_config() -> Criterion {
    Criterion::default().configure_from_args().without_plots()
}

criterion_group! {
    name = benches;
    config = criterion_config();
    targets = bench_interval_map_insert, bench_interval_map_insert_remove
}

criterion_main!(benches);
97 changes: 97 additions & 0 deletions src/entry.rs
@@ -0,0 +1,97 @@
use crate::index::{IndexType, NodeIndex};
use crate::interval::Interval;
use crate::intervalmap::IntervalMap;
use crate::node::Node;

/// A view into a single entry in a map, which may either be vacant or occupied.
#[derive(Debug)]
pub enum Entry<'a, T, V, Ix> {
    /// An occupied entry.
    Occupied(OccupiedEntry<'a, T, V, Ix>),
    /// A vacant entry.
    Vacant(VacantEntry<'a, T, V, Ix>),
}

/// A view into an occupied entry in an `IntervalMap`.
/// It is part of the [`Entry`] enum.
#[derive(Debug)]
pub struct OccupiedEntry<'a, T, V, Ix> {
    /// Mutable reference to the map
    pub map_ref: &'a mut IntervalMap<T, V, Ix>,
    /// The entry node
    pub node: NodeIndex<Ix>,
}

/// A view into a vacant entry in an `IntervalMap`.
/// It is part of the [`Entry`] enum.
#[derive(Debug)]
pub struct VacantEntry<'a, T, V, Ix> {
    /// Mutable reference to the map
    pub map_ref: &'a mut IntervalMap<T, V, Ix>,
    /// The interval of this entry
    pub interval: Interval<T>,
}

impl<'a, T, V, Ix> Entry<'a, T, V, Ix>
where
    T: Ord,
    Ix: IndexType,
{
    /// Ensures a value is in the entry by inserting the default if empty, and returns
    /// a mutable reference to the value in the entry.
    ///
    /// # Example
    /// ```rust
    /// use interval_map::{Interval, IntervalMap, Entry};
    ///
    /// let mut map = IntervalMap::new();
    /// assert!(matches!(map.entry(Interval::new(1, 2)), Entry::Vacant(_)));
    /// map.entry(Interval::new(1, 2)).or_insert(3);
    /// assert!(matches!(map.entry(Interval::new(1, 2)), Entry::Occupied(_)));
    /// assert_eq!(map.get(&Interval::new(1, 2)), Some(&3));
    /// ```
    #[inline]
    pub fn or_insert(self, default: V) -> &'a mut V {
        match self {
            Entry::Occupied(entry) => entry.map_ref.node_mut(entry.node, Node::value_mut),
            Entry::Vacant(entry) => {
                // Relies on `insert` appending the new node at the end of `nodes`,
                // so the index of the new node is the current length.
                let entry_idx = NodeIndex::new(entry.map_ref.nodes.len());
                let _ignore = entry.map_ref.insert(entry.interval, default);
                entry.map_ref.node_mut(entry_idx, Node::value_mut)
            }
        }
    }

    /// Provides in-place mutable access to an occupied entry before any
    /// potential inserts into the map.
    ///
    /// # Panics
    ///
    /// This method panics when the node is a sentinel node.
    ///
    /// # Example
    /// ```rust
    /// use interval_map::{Interval, IntervalMap, Entry};
    ///
    /// let mut map = IntervalMap::new();
    ///
    /// map.insert(Interval::new(6, 7), 3);
    /// assert!(matches!(map.entry(Interval::new(6, 7)), Entry::Occupied(_)));
    /// map.entry(Interval::new(6, 7)).and_modify(|v| *v += 1);
    /// assert_eq!(map.get(&Interval::new(6, 7)), Some(&4));
    /// ```
    #[inline]
    #[must_use]
    pub fn and_modify<F>(self, f: F) -> Self
    where
        F: FnOnce(&mut V),
    {
        match self {
            Entry::Occupied(entry) => {
                f(entry.map_ref.node_mut(entry.node, Node::value_mut));
                Self::Occupied(entry)
            }
            Entry::Vacant(entry) => Self::Vacant(entry),
        }
    }
}
64 changes: 64 additions & 0 deletions src/index.rs
@@ -0,0 +1,64 @@
use std::fmt;
use std::hash::Hash;

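/// The default integer type used for node indices.
pub type DefaultIx = u32;

/// An integer type that can be used as a node index.
pub unsafe trait IndexType: Copy + Default + Hash + Ord + fmt::Debug + 'static {
    /// Converts a `usize` into the index type.
    fn new(x: usize) -> Self;
    /// Converts the index back into a `usize`.
    fn index(&self) -> usize;
    /// The largest representable index.
    fn max() -> Self;
}

unsafe impl IndexType for u32 {
    #[inline(always)]
    fn new(x: usize) -> Self {
        // Note: `as` silently truncates values larger than `u32::MAX`.
        x as u32
    }
    #[inline(always)]
    fn index(&self) -> usize {
        *self as usize
    }
    #[inline(always)]
    fn max() -> Self {
        u32::MAX
    }
}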

/// Node identifier.
#[derive(Copy, Clone, Default, PartialEq, PartialOrd, Eq, Ord, Hash)]
pub struct NodeIndex<Ix = DefaultIx>(Ix);

impl<Ix: IndexType> NodeIndex<Ix> {
    #[inline]
    pub fn new(x: usize) -> Self {
        NodeIndex(IndexType::new(x))
    }

    #[inline]
    pub fn index(self) -> usize {
        self.0.index()
    }

    #[inline]
    pub fn end() -> Self {
        NodeIndex(IndexType::max())
    }
}

unsafe impl<Ix: IndexType> IndexType for NodeIndex<Ix> {
    fn index(&self) -> usize {
        self.0.index()
    }
    fn new(x: usize) -> Self {
        NodeIndex::new(x)
    }
    fn max() -> Self {
        NodeIndex(<Ix as IndexType>::max())
    }
}

impl<Ix: fmt::Debug> fmt::Debug for NodeIndex<Ix> {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "NodeIndex({:?})", self.0)
    }
}