Merge pull request #73 from JuliaPOMDP/parametric
Parametric
zsunberg committed Apr 15, 2016
2 parents ea9614d + 548f2ed commit 8d3f3c6
Showing 22 changed files with 703 additions and 640 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -0,0 +1,2 @@
docs/build/
docs/site/
5 changes: 5 additions & 0 deletions .travis.yml
@@ -5,6 +5,11 @@ julia:
- release
notifications:
email: false
before_script:
- export PATH=$HOME/.local/bin:$PATH
script:
- if [[ -a .git/shallow ]]; then git fetch --unshallow; fi
- julia --check-bounds=yes -e 'Pkg.clone(pwd()); Pkg.test("POMDPs")'
after_success:
- julia -e 'Pkg.clone("https://github.com/MichaelHatherly/Documenter.jl")'
- julia -e 'cd(Pkg.dir("POMDPs")); include(joinpath("docs", "make.jl"))'
144 changes: 5 additions & 139 deletions README.md
@@ -4,12 +4,15 @@

This package provides a basic interface for working with partially observable Markov decision processes (POMDPs).

NEWS: We recently made a significant change to the interface, introducing parametric types (see issue #56). If you wish to continue using the old interface, the v0.1 release may be used, but we recommend that all projects update to the new version.
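
Under the new interface, a problem type declares its state, action, and observation types as parameters of `POMDP{S,A,O}`, so solvers can recover them by dispatch. A minimal sketch (the `MyPOMDP` type and its field are hypothetical, for illustration only):

```julia
using POMDPs

# Hypothetical problem type: states are Ints, actions are Symbols, and
# observations are Bools. Solvers can read these types off the POMDP{S,A,O}
# parameters instead of relying on abstract supertypes.
type MyPOMDP <: POMDP{Int, Symbol, Bool}
    discount::Float64
end

POMDPs.discount(p::MyPOMDP) = p.discount
```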

The goal is to provide a common programming vocabulary for researchers and students to use primarily for three tasks:

1. Expressing problems using the POMDP format.
2. Writing solver software.
3. Running simulations efficiently.

For problems and solvers that only use a generative model (rather than explicit transition and observation distributions), see also [GenerativeModels.jl](https://github.com/JuliaPOMDP/GenerativeModels.jl).

## Installation
```julia
@@ -35,7 +38,6 @@ using POMDPs
POMDPs.add("SARSOP")
```


## Tutorials

The following tutorials aim to get you up to speed with POMDPs.jl:
@@ -45,142 +47,6 @@ The following tutorials aim to get you up to speed with POMDPs.jl:
of using SARSOP and QMDP to solve the tiger problem


## Core Interface

The core interface provides tools to express problems, program solvers, and setup simulations.

**TODO** this list is not complete! There are some functions in `src` that are missing documentation and are not included here.


### Distributions

`AbstractDistribution` - Base type for a probability distribution

- `rand(rng::AbstractRNG, d::AbstractDistribution, sample::Any)` fill with random sample from distribution and return the sample
- `pdf(d::AbstractDistribution, x)` value of probability distribution function at x

**XXX** There are functions missing from this list that are included in `src/distribution.jl`
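
As a rough illustration of this pair of functions, a distribution over two integer-valued states might be implemented as below (a sketch only; `TwoStateDistribution` and its field are hypothetical, and because `Int` is immutable the `sample` argument is simply replaced rather than mutated):

```julia
using POMDPs
import POMDPs: pdf, rand

# Hypothetical distribution over the two states 1 and 2
type TwoStateDistribution <: AbstractDistribution
    p1::Float64                 # probability of state 1
end

pdf(d::TwoStateDistribution, s::Int) = (s == 1 ? d.p1 : 1.0 - d.p1)

# Fill `sample` with a draw from `d` and return it; the drawn Int is returned directly
rand(rng::AbstractRNG, d::TwoStateDistribution, sample::Int) = Base.rand(rng) < d.p1 ? 1 : 2
```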

### Problem Model

`POMDP` - Base type for a problem definition<br>
`AbstractSpace` - Base type for state, action, and observation spaces<br>
`State` - Base type for states<br>
`Action` - Base type for actions<br>
`Observation` - Base type for observations

- `states(pomdp::POMDP)` returns the complete state space
- `states(pomdp::POMDP, state::State, sts::AbstractSpace=states(pomdp))` modifies `sts` to the state space accessible from the given state and returns it
- `actions(pomdp::POMDP)` returns the complete action space
- `actions(pomdp::POMDP, state::State, aspace::AbstractSpace=actions(pomdp))` modifies `aspace` to the action space accessible from the given state and returns it
- `actions(pomdp::POMDP, belief::Belief, aspace::AbstractSpace=actions(pomdp))` modifies `aspace` to the action space accessible from the states with nonzero belief and returns it
- `observations(pomdp::POMDP)` returns the complete observation space
- `observations(pomdp::POMDP, state::State, ospace::AbstractSpace)` modifies `ospace` to the observation space accessible from the given state and returns it
- `reward(pomdp::POMDP, state::State, action::Action, statep::State)` returns the immediate reward for the s-a-s' triple
- `transition(pomdp::POMDP, state::State, action::Action, distribution=create_transition_distribution(pomdp))` modifies `distribution` to the transition distribution from the current state-action pair and returns it
- `observation(pomdp::POMDP, state::State, action::Action, statep::State, distribution=create_observation_distribution(pomdp))` modifies `distribution` to the observation distribution for the s-a-s' tuple (state, action, and next state) and returns it
- `observation(pomdp::POMDP, state::State, action::Action, distribution=create_observation_distribution(pomdp))` modifies `distribution` to the observation distribution for the s-a pair (state and action) and returns it
- `discount(pomdp::POMDP)` returns the discount factor
- `isterminal(pomdp::POMDP, state::State)` checks if a state is terminal
- `isterminal(pomdp::POMDP, observation::Observation)` checks if an observation is terminal. A terminal observation should be generated only upon transition to a terminal state.

**XXX** Missing functions such as `n_states`, `n_actions` (see `src/pomdp.jl`)
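
As a rough sketch of how a problem might implement a few of these functions, consider a hypothetical problem with Boolean states and actions (all of the type names, fields, and dynamics below are invented for illustration and do not correspond to any real package):

```julia
using POMDPs
import POMDPs: reward, transition, discount, create_transition_distribution

type LightSwitchPOMDP <: POMDP end      # hypothetical problem

type SwitchState <: State
    on::Bool
end

type SwitchAction <: Action
    flip::Bool
end

# Distribution object that transition fills in (hypothetical)
type SwitchDistribution <: AbstractDistribution
    p_on::Float64                       # probability the light is on in the next state
end

create_transition_distribution(p::LightSwitchPOMDP) = SwitchDistribution(0.5)

reward(p::LightSwitchPOMDP, s::SwitchState, a::SwitchAction, sp::SwitchState) = sp.on ? 1.0 : 0.0

# Modify `d` in place to the next-state distribution and return it
function transition(p::LightSwitchPOMDP, s::SwitchState, a::SwitchAction,
                    d::SwitchDistribution=create_transition_distribution(p))
    d.p_on = a.flip ? (s.on ? 0.1 : 0.9) : (s.on ? 0.9 : 0.1)
    return d
end

discount(p::LightSwitchPOMDP) = 0.95
```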

### Solvers and Policies

`Solver` - Base type for a solver<br>
`Policy` - Base type for a policy (a map from every possible belief, or more abstract policy state, to an optimal or suboptimal action)

- `solve(solver::Solver, pomdp::POMDP, policy::Policy=create_policy(solver, pomdp))` solves the POMDP, modifies `policy` to be the solution of `pomdp`, and returns it
- `action(policy::Policy, belief::Belief)` or `action(policy::Policy, belief::Belief, act::Action)` returns an action for the current belief given the policy (the method with three arguments modifies `act` and returns it)
- `action(policy::Policy, state::State)` or `action(policy::Policy, state::State, act::Action)` returns an action for the current state given the policy
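
A rough sketch of how a solver and policy plug into these functions (a hypothetical uniform-random policy; the `RandomSolver` and `RandomPolicy` names are invented):

```julia
using POMDPs
import POMDPs: solve, action, create_policy

type RandomSolver <: Solver
    rng::AbstractRNG
end

# A policy that ignores the belief and samples actions uniformly (illustrative only)
type RandomPolicy <: Policy
    rng::AbstractRNG
    pomdp::POMDP
end

create_policy(solver::RandomSolver, pomdp::POMDP) = RandomPolicy(solver.rng, pomdp)

# "Solving" is trivial here; a real solver would do its computation inside solve
solve(solver::RandomSolver, pomdp::POMDP, policy::RandomPolicy=create_policy(solver, pomdp)) = policy

# Return an action for the current belief by sampling from the action space
function action(policy::RandomPolicy, belief::Belief, act::Action=create_action(policy.pomdp))
    return rand(policy.rng, actions(policy.pomdp), act)
end
```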

### Belief

`Belief` - Base type for an object representing some knowledge about the state (often a probability distribution)<br>
`BeliefUpdater` - Base type for an object that defines how a belief should be updated

- `update(updater::BeliefUpdater, belief_old::Belief, action::Action, obs::Observation, belief_new::Belief=create_belief(updater))` modifies `belief_new` to reflect `belief_old` updated with the latest action and observation, and returns the updated belief.
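
A sketch of a simple discrete Bayes filter implementing `update` (the `DiscreteBelief` and `DiscreteUpdater` types and the use of `n_states`, `iterator`, `transition`, `observation`, and `pdf` below are illustrative assumptions about a finite problem, not part of the required interface):

```julia
using POMDPs
import POMDPs: update, create_belief

# Hypothetical discrete belief: b[i] is the probability of the i-th state
type DiscreteBelief <: Belief
    b::Vector{Float64}
end

type DiscreteUpdater <: BeliefUpdater
    pomdp::POMDP
end

create_belief(u::DiscreteUpdater) = DiscreteBelief(zeros(n_states(u.pomdp)))

# Discrete Bayes filter: b'(s') ∝ Σ_s O(o | s, a, s') T(s' | s, a) b(s)
function update(u::DiscreteUpdater, belief_old::DiscreteBelief, a::Action, o::Observation,
                belief_new::DiscreteBelief=create_belief(u))
    ss = collect(iterator(states(u.pomdp)))
    for (ip, sp) in enumerate(ss)
        total = 0.0
        for (i, s) in enumerate(ss)
            td = transition(u.pomdp, s, a)
            od = observation(u.pomdp, s, a, sp)
            total += pdf(od, o) * pdf(td, sp) * belief_old.b[i]
        end
        belief_new.b[ip] = total
    end
    z = sum(belief_new.b)
    for i in 1:length(belief_new.b)
        belief_new.b[i] /= z            # normalize so the belief sums to one
    end
    return belief_new
end
```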

### Simulation

`Simulator` - Base type for an object defining how a simulation should be carried out

- `simulate(simulator::Simulator, pomdp::POMDP, policy::Policy, updater::BeliefUpdater, initial_belief::Belief)` runs a simulation using the specified policy and returns the accumulated reward

## Minor Components
## Documentation

### Convenience Functions

Several convenience functions are also provided in the interface. They supply a standard vocabulary for common tasks and may be used by some solvers or in simulation, but they are not strictly necessary for expressing problems.

- `index(pomdp::POMDP, state::State)` returns the index of the given state for a discrete POMDP
- `initial_belief(pomdp::POMDP)` returns an example initial belief for the pomdp
- `iterator(space::AbstractSpace)` returns an iterator over a space or an iterable object containing the space (such as an array)
- `dimensions(s::AbstractSpace)` returns the number (integer) of dimensions in a space
- `lowerbound(s::AbstractSpace, i::Int)` returns the lower bound of dimension `i`
- `upperbound(s::AbstractSpace, i::Int)` returns the upper bound of dimension `i`
- `rand(rng::AbstractRNG, d::AbstractSpace, sample::Any)` fill with random sample from space and return the sample
- `value(policy::Policy, belief::Belief)` returns the utility value of the belief under `policy`
- `value(policy::Policy, state::State)` returns the utility value of the state under `policy`
- `convert_belief(updater::BeliefUpdater, b::Belief)` returns a belief with a distribution similar to `b` that can be updated using `updater` (this conversion may be lossy)
- `updater(p::Policy)` returns a default BeliefUpdater appropriate for the belief type that policy `p` uses
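
For instance, a solver that needs to enumerate a discrete space or sample from a continuous one might use these roughly as follows (a sketch; `pomdp` and `rng` are assumed to already exist):

```julia
# Enumerate a discrete state space
for s in iterator(states(pomdp))
    i = index(pomdp, s)                 # integer index of the state
    # ... e.g. fill in an entry of a value table
end

# Sample from a (possibly continuous) action space
aspace = actions(pomdp)
a = rand(rng, aspace, create_action(pomdp))

# Query the box bounds of each dimension of the space
for d in 1:dimensions(aspace)
    lo = lowerbound(aspace, d)
    hi = upperbound(aspace, d)
    # ... e.g. scale a sample into [lo, hi]
end
```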

### Object Creators

In many cases, it is more efficient to fill pre-allocated objects with new data rather than create new objects at each iteration of an algorithm or simulation. When a new object is needed, the following functions may be called. They should return an object of the appropriate type as efficiently as possible. The data in the object does not matter - it will be overwritten when the object is used.

- `create_state(pomdp::POMDP)` creates a single state object (for preallocation purposes)
- `create_observation(pomdp::POMDP)` creates a single observation object (for preallocation purposes)
- `create_transition_distribution(pomdp::POMDP)` returns a transition distribution
- `create_observation_distribution(pomdp::POMDP)` returns an observation distribution
- `create_policy(solver::Solver, pomdp::POMDP)` creates a policy object (for preallocation purposes)
- `create_action(pomdp::POMDP)` creates an action object (for preallocation purposes)
- `create_belief(updater::BeliefUpdater)` creates a belief object of the type used by `updater` (for preallocation purposes)
- `create_belief(pomdp::POMDP)` creates an empty problem-native belief object (for preallocation purposes)
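
Continuing the hypothetical light-switch sketch from the Problem Model section above, the creators can simply construct placeholder objects whose contents will be overwritten (`SwitchObservation` is another invented type):

```julia
import POMDPs: create_state, create_action, create_observation

type SwitchObservation <: Observation   # hypothetical observation type
    on::Bool
end

# The contents are placeholders; solvers and simulators overwrite them
create_state(p::LightSwitchPOMDP) = SwitchState(false)
create_action(p::LightSwitchPOMDP) = SwitchAction(false)
create_observation(p::LightSwitchPOMDP) = SwitchObservation(false)
# create_transition_distribution for this problem was defined in the sketch above
```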


## Reference Simulation Implementation

This reference simulation implementation shows how the various functions will be used. Please note that this example is written for clarity and not efficiency (see [TODO: link to main doc] for efficiency tips).

```julia
type ReferenceSimulator
    rng::AbstractRNG
    max_steps::Int
end

function simulate(sim::ReferenceSimulator,
                  pomdp::POMDP,
                  policy::Policy,
                  updater::BeliefUpdater,
                  initial_belief::Belief)

    # preallocate state and observation objects and draw the initial state
    s = create_state(pomdp)
    o = create_observation(pomdp)
    rand(sim.rng, initial_belief, s)

    # convert the initial belief to the representation used by the updater
    b = convert_belief(updater, initial_belief)

    step = 1
    disc = 1.0
    r = 0.0

    while step <= sim.max_steps && !isterminal(pomdp, s)
        a = action(policy, b)

        # sample the next state
        sp = create_state(pomdp)
        trans_dist = transition(pomdp, s, a)
        rand(sim.rng, trans_dist, sp)

        r += disc*reward(pomdp, s, a, sp)

        # sample the observation and update the belief
        obs_dist = observation(pomdp, s, a, sp)
        rand(sim.rng, obs_dist, o)
        b = update(updater, b, a, o)

        s = sp
        disc *= discount(pomdp)
        step += 1
    end

    return r    # accumulated discounted reward
end
```
Detailed documentation can be found [here](http://juliapomdp.github.io/POMDPs.jl/latest/).
12 changes: 12 additions & 0 deletions docs/make.jl
@@ -0,0 +1,12 @@
using Documenter, POMDPs

makedocs(
# options
modules = [POMDPs]
)

deploydocs(
repo = "github.com/JuliaPOMDP/POMDPs.jl.git",
julia = "release",
osname = "linux"
)
32 changes: 32 additions & 0 deletions docs/mkdocs.yml
@@ -0,0 +1,32 @@
site_name: POMDPs.jl
repo_url: https://github.com/JuliaPOMDP/POMDPs.jl
site_description: API for solving partially observable Markov decision processes in Julia.
site_author: Maxim Egorov

theme: readthedocs

extra:
palette:
primary: 'indigo'
accent: 'blue'

extra_css:
- assets/Documenter.css

markdown_extensions:
- codehilite
- extra
- tables
- fenced_code

extra_javascript:
- https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS_HTML
- assets/mathjaxhelper.js

docs_dir: 'build'

pages:
- Home: index.md
- Manual: guide.md
- API: api.md

112 changes: 112 additions & 0 deletions docs/src/api.md
@@ -0,0 +1,112 @@
# Solver Documentation

Documentation for the `POMDPs.jl` user interface. You can get help for any type or
function in the module by typing `?` in the Julia REPL followed by the name of the
type or function. For example:

```julia
julia> using POMDPs
julia> ?
help?> reward
search: reward

reward{S,A,O}(pomdp::POMDP{S,A,O}, state::S, action::A, statep::S)

Returns the immediate reward for the s-a-s' triple

reward{S,A,O}(pomdp::POMDP{S,A,O}, state::S, action::A)

Returns the immediate reward for the s-a pair

```

{meta}
CurrentModule = POMDPs

## Contents

{contents}
Pages = ["api.md"]

## Index

{index}
Pages = ["api.md"]


## Types

{docs}
POMDP
MDP
AbstractSpace
AbstractDistribution
Solver
Policy
Belief
BeliefUpdater

## Model Functions

{docs}
states
actions
observations
reward
transition
observation
isterminal
isterminal_obs
n_states
n_actions
n_observations
state_index
action_index
obs_index
discount

## Distribution/Space Functions

{docs}
rand
pdf
dimensions
iterator
create_transition_distribution
create_observation_distribution

## Belief Functions

{docs}
initial_belief
create_belief
update
convert_belief

## Policy and Solver Functions

{docs}
create_policy
solve
updater
action
value

## Simulator

{docs}
Simulator
simulate

## Utility Tools

{docs}
add
@pomdp_func
strip_arg

## Constants

{docs}
REMOTE_URL
SUPPORTED_SOLVERS