Merge pull request #173 from MPLLang/prep-v0.4-release
(WIP) v0.4 release
shwestrick authored Aug 15, 2023
2 parents 11373d1 + 66ac61d commit f10dab4
Showing 68 changed files with 4,474 additions and 1,383 deletions.
165 changes: 83 additions & 82 deletions README.md
@@ -5,7 +5,7 @@ compiler for Standard ML which implements support for
nested (fork-join) parallelism. MPL generates executables with
excellent multicore performance, utilizing a novel approach to
memory management based on the theory of disentanglement
[[1](#rmab16),[2](#gwraf18),[3](#wyfa20),[4](#awa21),[5](#waa22)].
[[1](#rmab16),[2](#gwraf18),[3](#wyfa20),[4](#awa21),[5](#waa22),[6](#awa23)].

MPL is research software and is being actively developed.

@@ -25,7 +25,7 @@ $ docker run -it shwestrick/mpl /bin/bash
...# examples/bin/primes @mpl procs 4 --
```

If you want to try out MPL by writing and compiling your own code, we recommend
To write and compile your own code, we recommend
mounting a local directory inside the container. For example, here's how you
can use MPL to compile and run your own `main.mlb` in the current directory.
(To mount some other directory, replace `$(pwd -P)` with a different path.)
@@ -38,46 +38,36 @@ $ docker run -it -v $(pwd -P):/root/mycode shwestrick/mpl /bin/bash
...# ./main @mpl procs 4 --
```

## Benchmark Suite

## Build and Install (from source)
The [Parallel ML benchmark suite](https://github.com/MPLLang/parallel-ml-bench)
provides many examples of sophisticated parallel algorithms and
applications in MPL, as well as cross-language performance comparisons with
C++, Go, Java,
and multicore OCaml.

### Requirements
## Libraries

MPL has only been tested on Linux with x86-64. The following software is
required.
* [GCC](http://gcc.gnu.org)
* [GMP](http://gmplib.org) (GNU Multiple Precision arithmetic library)
* [GNU Make](http://savannah.gnu.org/projects/make), [GNU Bash](http://www.gnu.org/software/bash/)
* binutils (`ar`, `ranlib`, `strip`, ...)
* miscellaneous Unix utilities (`diff`, `find`, `grep`, `gzip`, `patch`, `sed`, `tar`, `xargs`, ...)
* Standard ML compiler and tools:
- Recommended: [MLton](http://mlton.org) (`mlton`, `mllex`, and `mlyacc`). Pre-built binary packages for MLton can be installed via an OS package manager or (for select platforms) obtained from http://mlton.org.
- Supported but not recommended: [SML/NJ](http://www.smlnj.org) (`sml`, `ml-lex`, `ml-yacc`).
We recommend using the [smlpkg](https://github.com/diku-dk/smlpkg) package
manager. MPL supports the full SML language, so existing libraries for
SML can be used.

### Instructions
In addition, here are a few libraries that make use of MPL for parallelism:
* [`github.com/MPLLang/mpllib`](https://github.com/MPLLang/mpllib): implements
a variety of data structures (sequences, sets, dictionaries, graphs, matrices, meshes,
images, etc.) and parallel algorithms (map, reduce, scan, filter, sorting,
search, tokenization, graph processing, computational geometry, etc.). Also
includes basic utilities (e.g. parsing command-line arguments) and
benchmarking infrastructure.
* [`github.com/shwestrick/sml-audio`](https://github.com/shwestrick/sml-audio):
a library for audio processing with I/O support for `.wav` files.

The following builds the compiler at `build/bin/mpl`.
```
$ make all
```

After building, MPL can then be installed to `/usr/local`:
```
$ make install
```
or to a custom directory with the `PREFIX` option:
```
$ make PREFIX=/opt/mpl install
```

## Parallel and Concurrent Extensions

MPL extends SML with a number of primitives for parallelism and concurrency.
Take a look at `examples/` to see these primitives in action.
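
As a minimal sketch of how the `par` primitive from the `ForkJoin` structure is typically used, the function below computes Fibonacci numbers by evaluating the two recursive calls in parallel. The name `fib` is illustrative and not part of MPL; the code assumes it is compiled with MPL so that `ForkJoin.par` is in scope.

```sml
(* Illustrative only: `fib` is a hypothetical example function.
 * Assumes compilation with MPL, which provides ForkJoin.par. *)
fun fib (n: int) : int =
  if n < 2 then
    n
  else
    let
      (* Run both recursive calls as parallel tasks and wait for both. *)
      val (a, b) =
        ForkJoin.par (fn () => fib (n - 1), fn () => fib (n - 2))
    in
      a + b
    end

(* Print the result from the main thread after the parallel work is done. *)
val _ = print (Int.toString (fib 30) ^ "\n")
```

Spawning a task for every recursive call keeps the sketch short; a practical version would likely switch to a sequential computation below some problem size to reduce scheduling overhead.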

**Note**: Before writing any of your own code, make sure to read the section
"Disentanglement" below.

### The `ForkJoin` Structure
```
val par: (unit -> 'a) * (unit -> 'b) -> 'a * 'b
@@ -163,8 +153,6 @@ by default.
* `-debug true -debug-runtime true -keep g` For debugging, keeps the generated
C files and uses the debug version of the runtime (with assertions enabled).
The resulting executable is somewhat peruse-able with tools like `gdb`.
* `-detect-entanglement true` enables the dynamic entanglement detector.
See below for more information.

For example:
```
@@ -198,60 +186,13 @@ argument `bar` using 4 pinned processors.
$ foo @mpl procs 4 set-affinity -- bar
```

## Disentanglement

Currently, MPL only supports programs that are **disentangled**, which
(roughly speaking) is the property that concurrent threads remain oblivious
to each other's allocations [[3](#wyfa20)].

Here are a number of different ways to guarantee that your code is
disentangled.
- (Option 1) Use only purely functional data (no `ref`s or `array`s). This is
the simplest but most restrictive approach.
- (Option 2) If using mutable data, use only non-pointer data. MPL guarantees
that simple types (`int`, `word`, `char`, `real`, etc.) are never
indirected through a
pointer, so for example it is safe to use `int array`. Other types such as
`int list array` and `int array array` should be avoided. This approach
is very easy to check and is surprisingly general. Data races are fine!
- (Option 3) Make sure that your program is race-free. This can be
tricky to check but allows you to use any type of data. Many of our example
programs are race-free.
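
Option 2 can be sketched as follows, assuming the code is compiled with MPL so that `ForkJoin.par` is available. The function name `fillSquares` is hypothetical. Two parallel tasks write to one shared `int array`, which holds only non-pointer data.

```sml
(* Illustrative only: `fillSquares` is a hypothetical example function.
 * Assumes compilation with MPL, which provides ForkJoin.par. *)
fun fillSquares (n: int) : int array =
  let
    (* An int array stores its elements directly, with no pointers. *)
    val a = Array.array (n, 0)

    (* Sequentially write i*i into every index i in [lo, hi). *)
    fun fill (lo, hi) =
      if lo >= hi then ()
      else (Array.update (a, lo, lo * lo); fill (lo + 1, hi))

    val mid = n div 2

    (* The two tasks update disjoint halves of the same array. *)
    val _ = ForkJoin.par (fn () => fill (0, mid), fn () => fill (mid, n))
  in
    a
  end
```

Calling `fillSquares 8` returns an array whose element at index `i` is `i * i`. Because the two halves do not overlap, this sketch also happens to be race-free, so it would satisfy Option 3 as well.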

## Entanglement Detection

Whenever a thread acquires a reference
to an object allocated concurrently by some other thread, then we say that
the two threads are **entangled**. This is a violation of disentanglement,
which MPL currently does not allow.

MPL has a built-in dynamic entanglement detector which is enabled by default.
The entanglement detector monitors individual reads and writes during execution;
if entanglement is found, the program will terminate with an error message.

The entanglement detector is both "sound" and "complete": there are neither
false negatives nor false positives. In other words, the detector always raises
an alarm when entanglement occurs, and never raises an alarm otherwise. Note
however that entanglement (and therefore also entanglement detection) can
be execution-dependent: if your program is non-deterministic (e.g. racy),
then entanglement may or may not occur depending on the outcome of a race
condition. Similarly, entanglement could be input-dependent.

Entanglement detection is highly optimized, and typically has negligible
overhead (see [[5](#waa22)]). It can be disabled at compile-time by passing
`-detect-entanglement false`; however, we recommend against doing so. MPL
relies on entanglement detection to ensure memory safety. We recommend leaving
entanglement detection enabled at all times.

## Bugs and Known Issues

### Basis Library
In general, the basis library has not yet been thoroughly scrubbed, and many
functions may not be safe for parallelism
The basis library is inherited from (sequential) SML. It has not yet been
thoroughly scrubbed, and some functions may not be safe for parallelism
([#41](https://github.com/MPLLang/mpl/issues/41)).
Some known issues:
* `Int.toString` is racy when called in parallel.
* `Real.fromString` may throw an error when called in parallel.

### Garbage Collection
* ([#115](https://github.com/MPLLang/mpl/issues/115)) The GC is currently
@@ -274,6 +215,61 @@ unsupported, including (but not limited to):
* `Weak`
* `World`


## Build and Install (from source)

### Requirements

MPL has only been tested on Linux with x86-64. The following software is
required.
* [GCC](http://gcc.gnu.org)
* [GMP](http://gmplib.org) (GNU Multiple Precision arithmetic library)
* [GNU Make](http://savannah.gnu.org/projects/make), [GNU Bash](http://www.gnu.org/software/bash/)
* binutils (`ar`, `ranlib`, `strip`, ...)
* miscellaneous Unix utilities (`diff`, `find`, `grep`, `gzip`, `patch`, `sed`, `tar`, `xargs`, ...)
* Standard ML compiler and tools:
- Recommended: [MLton](http://mlton.org) (`mlton`, `mllex`, and `mlyacc`). Pre-built binary packages for MLton can be installed via an OS package manager or (for select platforms) obtained from http://mlton.org.
- Supported but not recommended: [SML/NJ](http://www.smlnj.org) (`sml`, `ml-lex`, `ml-yacc`).
* (If using [`mpl-switch`](https://github.com/mpllang/mpl-switch)): Python 3, and `git`.

### Installation with `mpl-switch`

The [`mpl-switch`](https://github.com/mpllang/mpl-switch) utility makes it
easy to install multiple versions of MPL on the same system and switch
between them. After setting up `mpl-switch`, you can install MPL as follows:
```
$ mpl-switch install v0.4
$ mpl-switch select v0.4
```

You can use any commit hash or tag name from the MPL repo to pick a
particular version of MPL. Installed versions are stored in `~/.mpl/`; this
folder is safe to delete at any moment, as it can always be regenerated. To
see what versions of MPL are currently installed, do:
```
$ mpl-switch list
```

### Manual Instructions

Alternatively, you can manually build `mpl` by cloning this repo and then
performing the following.

**Build the executable**. This produces an executable at `build/bin/mpl`:
```
$ make
```

**Put it where you want it**. After building, MPL can then be installed to
`/usr/local`:
```
$ make install
```
or to a custom directory with the `PREFIX` option:
```
$ make PREFIX=/opt/mpl install
```

## References

[<a name="rmab16">1</a>]
@@ -300,3 +296,8 @@ POPL 2021.
[Entanglement Detection with Near-Zero Cost](http://www.cs.cmu.edu/~swestric/22/icfp-detect.pdf).
Sam Westrick, Jatin Arora, and Umut A. Acar.
ICFP 2022.

[<a name="awa23">6</a>]
[Efficient Parallel Functional Programming with Effects](https://www.cs.cmu.edu/~swestric/23/epfpe.pdf).
Jatin Arora, Sam Westrick, and Umut A. Acar.
PLDI 2023.
12 changes: 12 additions & 0 deletions basis-library/mlton/thread.sig
@@ -42,6 +42,8 @@ signature MLTON_THREAD =
structure HierarchicalHeap :
sig
type thread = Basic.t
type clear_set
type finished_clear_set_grain

(* The level (depth) of a thread's heap in the hierarchy. *)
val getDepth : thread -> int
@@ -69,6 +71,16 @@ signature MLTON_THREAD =
(* Move all chunks at the current depth up one level. *)
val promoteChunks : thread -> unit

val clearSuspectsAtDepth: thread * int -> unit
val numSuspectsAtDepth: thread * int -> int
val takeClearSetAtDepth: thread * int -> clear_set
val numChunksInClearSet: clear_set -> int
val processClearSetGrain: clear_set * int * int -> finished_clear_set_grain
val commitFinishedClearSetGrain: thread * finished_clear_set_grain -> unit
val deleteClearSet: clear_set -> unit

val updateBytesPinnedEntangledWatermark: unit -> unit

(* "put a new thread in the hierarchy *)
val moveNewThreadToDepth : thread * int -> unit

27 changes: 27 additions & 0 deletions basis-library/mlton/thread.sml
@@ -73,6 +73,9 @@ struct
type thread = Basic.t
type t = MLtonPointer.t

type clear_set = MLtonPointer.t
type finished_clear_set_grain = MLtonPointer.t

fun forceLeftHeap (myId, t) = Prim.forceLeftHeap(Word32.fromInt myId, t)
fun forceNewChunk () = Prim.forceNewChunk (gcState ())
fun registerCont (kl, kr, k, t) = Prim.registerCont(kl, kr, k, t)
@@ -90,6 +93,30 @@ struct
Prim.moveNewThreadToDepth (t, Word32.fromInt d)
fun checkFinishedCCReadyToJoin () =
Prim.checkFinishedCCReadyToJoin (gcState ())

fun clearSuspectsAtDepth (t, d) =
Prim.clearSuspectsAtDepth (gcState (), t, Word32.fromInt d)

fun numSuspectsAtDepth (t, d) =
Word64.toInt (Prim.numSuspectsAtDepth (gcState (), t, Word32.fromInt d))

fun takeClearSetAtDepth (t, d) =
Prim.takeClearSetAtDepth (gcState (), t, Word32.fromInt d)

fun numChunksInClearSet c =
Word64.toInt (Prim.numChunksInClearSet (gcState (), c))

fun processClearSetGrain (c, start, stop) =
Prim.processClearSetGrain (gcState (), c, Word64.fromInt start, Word64.fromInt stop)

fun commitFinishedClearSetGrain (t, fcsg) =
Prim.commitFinishedClearSetGrain (gcState (), t, fcsg)

fun deleteClearSet c =
Prim.deleteClearSet (gcState (), c)

fun updateBytesPinnedEntangledWatermark () =
Prim.updateBytesPinnedEntangledWatermark (gcState ())
end

structure Disentanglement =
28 changes: 20 additions & 8 deletions basis-library/mpl/gc.sig
@@ -17,12 +17,15 @@ sig
*)
val numberDisentanglementChecks: unit -> IntInf.int

(* How many times entanglement has been detected at a read barrier.
*)
val numberEntanglementsDetected: unit -> IntInf.int
(* How many times the entanglement is detected *)
val numberEntanglements: unit -> IntInf.int

val approxRaceFactor: unit -> Real32.real

val numberSuspectsMarked: unit -> IntInf.int
val numberSuspectsCleared: unit -> IntInf.int
val bytesPinnedEntangled: unit -> IntInf.int
val bytesPinnedEntangledWatermark: unit -> IntInf.int

val getControlMaxCCDepth: unit -> int

@@ -43,6 +46,8 @@ sig
val localBytesReclaimed: unit -> IntInf.int
val localBytesReclaimedOfProc: int -> IntInf.int

val bytesInScopeForLocal: unit -> IntInf.int

val numLocalGCs: unit -> IntInf.int
val numLocalGCsOfProc: int -> IntInf.int

@@ -52,21 +57,28 @@ sig
val promoTime: unit -> Time.time
val promoTimeOfProc: int -> Time.time

val numCCs: unit -> IntInf.t
val numCCsOfProc: int -> IntInf.t

val ccBytesReclaimed: unit -> IntInf.int
val ccBytesReclaimedOfProc: int -> IntInf.int

val bytesInScopeForCC: unit -> IntInf.int

val ccTime: unit -> Time.time
val ccTimeOfProc: int -> Time.time

(* DEPRECATED *)
val rootBytesReclaimed: unit -> IntInf.int
val rootBytesReclaimedOfProc: int -> IntInf.int

val internalBytesReclaimed: unit -> IntInf.int
val internalBytesReclaimedOfProc: int -> IntInf.int

val numRootCCs: unit -> IntInf.int
val numRootCCsOfProc: int -> IntInf.int

val numInternalCCs: unit -> IntInf.int
val numInternalCCsOfProc: int -> IntInf.int

val rootCCTime: unit -> Time.time
val rootCCTimeOfProc: int -> Time.time

val internalCCTime: unit -> Time.time
val internalCCTimeOfProc: int -> Time.time
end