Design Motivation

James Yang edited this page Apr 12, 2020 · 1 revision

Definitions

What is a random variable?

A random variable is some quantity that can be observed. This quantity can also be sampled.

What is a probabilistic model?

What is a distribution?

Desires

User POV

The simplest example is to list a random variable and associate it with a distribution.

X ~ Normal(0,1)
  • BIG question: how do we represent this?
  • Must be able to:
    • sample values
    • compute pdf and logpdf for a particular value of X
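One possible shape for the simple case, as a minimal sketch only: the `Normal` struct and its member names below are hypothetical, not a settled design. It just shows the three required operations (sample, pdf, logpdf) living on one object.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Hypothetical type for illustration: a normal distribution that
// supports the three operations listed above.
struct Normal {
    double mean;
    double stddev;

    // Draw one value using the caller's random engine.
    template <class Engine>
    double sample(Engine& gen) const {
        assert(stddev > 0.0);
        std::normal_distribution<double> dist(mean, stddev);
        return dist(gen);
    }

    // Log-density at x: -z^2/2 - log(stddev) - log(sqrt(2*pi)).
    double log_pdf(double x) const {
        assert(stddev > 0.0);
        const double log_sqrt_two_pi = 0.5 * std::log(2.0 * std::acos(-1.0));
        const double z = (x - mean) / stddev;
        return -0.5 * z * z - std::log(stddev) - log_sqrt_two_pi;
    }

    // Density at x, computed from the log-density.
    double pdf(double x) const { return std::exp(log_pdf(x)); }
};
```

With this sketch, `X ~ Normal(0,1)` would correspond to `Normal{0.0, 1.0}`, and a sampler would call `sample(gen)` with something like a `std::mt19937` engine.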

The next, more complicated example has parameters that are themselves random variables:

X ~ Normal(theta1 + theta2, 1)
theta1 ~ Uniform(0,1)
theta2 ~ Normal(0,1)
  • BIG question: how do we represent this dependency between X and theta1, theta2?
    • the graph design must degenerate to the solution for the simple example
  • Must be able to:
    • sample values for X, theta1, theta2 from the joint distribution
    • compute the joint pdf for given values of X, theta1, and theta2
  • Possibly useful:
    • sample from a subset of variables specified in the model
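To make the two requirements concrete, here is a hand-written sketch for this one model, assuming nothing about the eventual graph representation. The names `Draw`, `sample_joint`, and `joint_log_pdf` are made up for illustration. Sampling goes parents first (theta1, theta2, then X), and the joint log-pdf is the sum of each variable's log-pdf given its parents.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <random>

// Hypothetical container for one joint draw of (theta1, theta2, X).
struct Draw {
    double theta1;
    double theta2;
    double x;
};

inline double normal_log_pdf(double x, double mean, double stddev) {
    assert(stddev > 0.0);
    const double z = (x - mean) / stddev;
    return -0.5 * z * z - std::log(stddev)
           - 0.5 * std::log(2.0 * std::acos(-1.0));
}

inline double uniform_log_pdf(double x, double lo, double hi) {
    assert(lo < hi);
    // Zero density outside the support, so the log-density is -infinity.
    if (x < lo || x > hi) return -std::numeric_limits<double>::infinity();
    return -std::log(hi - lo);
}

// Sample parents before children so X can use theta1 + theta2 as its mean.
template <class Engine>
Draw sample_joint(Engine& gen) {
    Draw d;
    d.theta1 = std::uniform_real_distribution<double>(0.0, 1.0)(gen);
    d.theta2 = std::normal_distribution<double>(0.0, 1.0)(gen);
    d.x = std::normal_distribution<double>(d.theta1 + d.theta2, 1.0)(gen);
    return d;
}

// log p(x, theta1, theta2)
//   = log p(theta1) + log p(theta2) + log p(x | theta1, theta2)
inline double joint_log_pdf(const Draw& d) {
    return uniform_log_pdf(d.theta1, 0.0, 1.0)
         + normal_log_pdf(d.theta2, 0.0, 1.0)
         + normal_log_pdf(d.x, d.theta1 + d.theta2, 1.0);
}

inline double joint_pdf(const Draw& d) { return std::exp(joint_log_pdf(d)); }
```

A real design would presumably generate this from the model description rather than hard-code it; the sketch only pins down what "sample from the joint" and "compute the joint pdf" have to produce.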

A similarly complicated example, where two variables share one parameter:

X ~ Normal(theta, 1)
Y ~ Normal(theta, 1)
theta ~ Uniform(0,1)

Algorithm POV

One of the sampling algorithms we plan to write is Metropolis-Hastings (MH). If we want to sample from a distribution specified by the pdf p(x), the algorithm only requires a function f(x) that is a positive constant multiple of p(x), because the acceptance ratio f(x')/f(x) cancels the constant. Hence, the only model input to MH is some representation of this function f, and MH must be able to call it internally.
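A minimal sketch of what that interface could look like, under some assumptions not stated above: the proposal is a symmetric Gaussian random walk (so this is the plain Metropolis special case), the callable returns log f rather than f for numerical stability, and the function name and parameters are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// log_f: any callable returning log f(x), where f is a positive
// constant multiple of the target pdf p.
// step:  standard deviation of the random-walk proposal (an assumption
//        of this sketch, not something the notes above specify).
template <class LogDensity, class Engine>
std::vector<double> metropolis_hastings(LogDensity log_f, double init,
                                        std::size_t n_samples, double step,
                                        Engine& gen) {
    assert(step > 0.0);
    std::normal_distribution<double> proposal(0.0, step);
    std::uniform_real_distribution<double> unif(0.0, 1.0);

    std::vector<double> samples;
    samples.reserve(n_samples);

    double curr = init;
    double curr_log = log_f(curr);
    assert(std::isfinite(curr_log));  // must start inside the support

    for (std::size_t i = 0; i < n_samples; ++i) {
        const double cand = curr + proposal(gen);
        const double cand_log = log_f(cand);
        // Accept with probability min(1, f(cand) / f(curr)).
        // The unknown normalizing constant cancels in this ratio.
        if (std::log(unif(gen)) < cand_log - curr_log) {
            curr = cand;
            curr_log = cand_log;
        }
        samples.push_back(curr);  // a rejected move repeats the current state
    }
    return samples;
}
```

Taking the density as a template parameter means MH never needs to know how the model is represented; anything callable with a `double` works, including a lambda built from the model graph.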

Efficiency POV

Bjarne said not to worry about efficiency due to time constraints. Screw that, we goin hard fam.

Graph

  • contiguous memory
  • cheap memory
    • contain simple pointers, for example
    • no other data structure should be required
  • What happens when the graph gets too big?
    • When is it more desirable to heap-allocate chunks of contiguous sub-graphs?
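As one way to picture the "contiguous, cheap" goal, here is a sketch that stores every node in a single flat array and every edge in a second flat array. The `Graph` and `Node` types are hypothetical, and the sketch uses indices instead of raw pointers, which is an assumption on my part: pointers into a `std::vector` are invalidated when it reallocates, while indices stay valid.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node: a value plus a slice of the shared parent array.
struct Node {
    double value;
    std::size_t first_parent;  // offset into the graph's parent array
    std::size_t n_parents;     // length of this node's slice
};

class Graph {
public:
    // Parents must already exist, so insertion order is a topological order.
    std::size_t add_node(double value,
                         const std::vector<std::size_t>& parent_ids) {
        for (std::size_t id : parent_ids) assert(id < nodes_.size());
        nodes_.push_back(Node{value, parents_.size(), parent_ids.size()});
        parents_.insert(parents_.end(), parent_ids.begin(), parent_ids.end());
        return nodes_.size() - 1;
    }

    const Node& node(std::size_t id) const {
        assert(id < nodes_.size());
        return nodes_[id];
    }

    // Index of the k-th parent of node `id`.
    std::size_t parent(std::size_t id, std::size_t k) const {
        const Node& n = node(id);
        assert(k < n.n_parents);
        return parents_[n.first_parent + k];
    }

    std::size_t size() const { return nodes_.size(); }

private:
    std::vector<Node> nodes_;           // all nodes, contiguous
    std::vector<std::size_t> parents_;  // all edges, contiguous
};
```

For the model above, `theta1` and `theta2` would be added with no parents and `X` with both as parents. The open question about very large graphs still stands: this layout relies on two growing heap blocks, and whether chunked sub-graphs do better is something to measure.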

Other Existing Library Design POV