Commit

Update 2015-10-30-optimizing-hash-tries.md
jurgenvinju authored Jun 12, 2024
1 parent 3a3a2b6 commit 687b7be
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions blog/2015-10-30-optimizing-hash-tries.md
@@ -5,7 +5,7 @@ authors: [jvinju]

Hash-tries are the data-structure under Rascal's sets, maps and relations. These papers explain how they work and how we make them lean and fast on the JVM. [Others](https://blog.acolyer.org/2015/11/27/hamt/) have blogged about these results as well. The code can be found in the [Capsule project](http://www.usethesource.io/projects/capsule).

```
```bibtex
@inproceedings{oopsla2015,
title = {Optimizing Hash-Array Mapped Tries for Fast and Lean Immutable JVM Collections},
author = {Michael J. Steindorfer and Jurgen J. Vinju},
@@ -29,4 +29,4 @@ Hash-tries are the data-structure under Rascal's sets, maps and relations. These
}
```

The data structures under-pinning collection API (e.g. lists, sets, maps) in the standard libraries of programming languages are used intensively in many applications. The standard libraries of recent Java Virtual Machine languages, such as Clojure or Scala, contain scalable and well-performing immutable collection data structures that are implemented as Hash-Array Mapped Tries (HAMTs). HAMTs already feature efficient lookup, insert, and delete operations, however due to their tree-based nature their memory footprints and the runtime performance of iteration and equality checking lag behind array-based counterparts. This particularly prohibits their application in programs which process larger data sets. In this paper, we propose changes to the HAMT design that increase the overall performance of immutable sets and maps. The resulting general purpose design increases cache locality and features a canonical representation. It outperforms Scala’s and Clojure’s data structure implementations in terms of memory footprint and runtime efficiency of iteration (1.3– 6.7 x) and equality checking (3–25.4 x).
The data structures under-pinning collection API (e.g. lists, sets, maps) in the standard libraries of programming languages are used intensively in many applications. The standard libraries of recent Java Virtual Machine languages, such as Clojure or Scala, contain scalable and well-performing immutable collection data structures that are implemented as Hash-Array Mapped Tries (HAMTs). HAMTs already feature efficient lookup, insert, and delete operations, however due to their tree-based nature their memory footprints and the runtime performance of iteration and equality checking lag behind array-based counterparts. This particularly prohibits their application in programs which process larger data sets. In this paper, we propose changes to the HAMT design that increase the overall performance of immutable sets and maps. The resulting general purpose design increases cache locality and features a canonical representation. It outperforms Scala’s and Clojure’s data structure implementations in terms of memory footprint and runtime efficiency of iteration (1.3– 6.7 x) and equality checking (3–25.4 x).
