From 3e82f810861fdfe3a7c8dcf4eaaa8f77276fd53d Mon Sep 17 00:00:00 2001
From: "Documenter.jl"
Date: Wed, 17 Jul 2024 02:59:23 +0000
Subject: [PATCH] build based on 430cfc8

---
 dev/.documenter-siteinfo.json |  2 +-
 dev/api/index.html            | 12 ++++++------
 dev/circular/index.html       |  2 +-
 dev/compressors/index.html    |  2 +-
 dev/index.html                |  2 +-
 dev/samplers/index.html       |  6 +++---
 6 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json
index 7d3083f..becb251 100644
--- a/dev/.documenter-siteinfo.json
+++ b/dev/.documenter-siteinfo.json
@@ -1 +1 @@
-{"documenter":{"julia_version":"1.10.4","generation_timestamp":"2024-07-17T02:50:21","documenter_version":"1.3.0"}}
\ No newline at end of file
+{"documenter":{"julia_version":"1.10.4","generation_timestamp":"2024-07-17T02:59:19","documenter_version":"1.3.0"}}
\ No newline at end of file
diff --git a/dev/api/index.html b/dev/api/index.html
index a549d6f..7d87311 100644
--- a/dev/api/index.html
+++ b/dev/api/index.html
@@ -1,12 +1,12 @@
-API Documentation · CompressedBeliefMDPs

API Documentation

Contents

Index

Types/Functors

CompressedBeliefMDPs.CompressedBeliefMDP - Type
CompressedBeliefMDP{B, A}

The CompressedBeliefMDP struct is a generalization of the compressed belief-state MDP presented in Exponential Family PCA for Belief Compression in POMDPs.

Type Parameters

  • B: The type of compressed belief states.
  • A: The type of actions.

Fields

  • bmdp::GenerativeBeliefMDP: The generative belief-state MDP.
  • compressor::Compressor: The compressor used to compress belief states.
  • ϕ::Bijection: A bijection representing the mapping from uncompressed belief states to compressed belief states. See notes.

Constructors

CompressedBeliefMDP(pomdp::POMDP, updater::Updater, compressor::Compressor)
+API Documentation · CompressedBeliefMDPs

API Documentation

Contents

Index

Types/Functors

CompressedBeliefMDPs.CompressedBeliefMDP - Type
CompressedBeliefMDP{B, A}

The CompressedBeliefMDP struct is a generalization of the compressed belief-state MDP presented in Exponential Family PCA for Belief Compression in POMDPs.

Type Parameters

  • B: The type of compressed belief states.
  • A: The type of actions.

Fields

  • bmdp::GenerativeBeliefMDP: The generative belief-state MDP.
  • compressor::Compressor: The compressor used to compress belief states.
  • ϕ::Bijection: A bijection representing the mapping from uncompressed belief states to compressed belief states. See notes.

Constructors

CompressedBeliefMDP(pomdp::POMDP, updater::Updater, compressor::Compressor)
 CompressedBeliefMDP(pomdp::POMDP, sampler::Sampler, updater::Updater, compressor::Compressor)

Constructs a CompressedBeliefMDP using the specified POMDP, updater, and compressor.

Warning

The 4-argument constructor is a quality-of-life constructor that calls fit! on the given compressor.

Example Usage

pomdp = TigerPOMDP()
 updater = DiscreteUpdater(pomdp)
 compressor = PCACompressor(1)
-mdp = CompressedBeliefMDP(pomdp, updater, compressor)

For continuous POMDPs, see ParticleFilters.jl.

Notes

  • While compressions aren't usually injective, we cache beliefs and their compressions on a first-come, first-served basis, so we can effectively use a bijection without loss of generality.
source
CompressedBeliefMDPs.CompressedBeliefPolicy - Type
CompressedBeliefPolicy

Maps a base policy for the compressed belief-state MDP to a policy for the true POMDP.

Fields

  • m::CompressedBeliefMDP: The compressed belief-state MDP.
  • base_policy::Policy: The base policy used for decision-making in the compressed belief-state MDP.

Constructors

CompressedBeliefPolicy(m::CompressedBeliefMDP, base_policy::Policy)

Constructs a CompressedBeliefPolicy using the specified compressed belief-state MDP and base policy.

Example Usage

policy = solve(solver, pomdp)
+mdp = CompressedBeliefMDP(pomdp, updater, compressor)

For continuous POMDPs, see ParticleFilters.jl.

Notes

  • While compressions aren't usually injective, we cache beliefs and their compressions on a first-come, first-served basis, so we can effectively use a bijection without loss of generality.
source
CompressedBeliefMDPs.CompressedBeliefPolicy - Type
CompressedBeliefPolicy

Maps a base policy for the compressed belief-state MDP to a policy for the true POMDP.

Fields

  • m::CompressedBeliefMDP: The compressed belief-state MDP.
  • base_policy::Policy: The base policy used for decision-making in the compressed belief-state MDP.

Constructors

CompressedBeliefPolicy(m::CompressedBeliefMDP, base_policy::Policy)

Constructs a CompressedBeliefPolicy using the specified compressed belief-state MDP and base policy.

Example Usage

policy = solve(solver, pomdp)
 s = initialstate(pomdp)
 a = action(policy, s) # returns the approximately optimal action for state s
-v = value(policy, s)  # returns the approximately optimal value for state s
source
CompressedBeliefMDPs.CompressedBeliefSolver - Type
CompressedBeliefSolver

The CompressedBeliefSolver struct represents a solver for compressed belief-state MDPs. It combines a compressed belief-state MDP with a base solver to approximate the value function.

Fields

  • m::CompressedBeliefMDP: The compressed belief-state MDP.
  • base_solver::Solver: The base solver used to solve the compressed belief-state MDP.

Constructors

CompressedBeliefSolver(pomdp::POMDP, base_solver::Solver; updater::Updater=DiscreteUpdater(pomdp), sampler::Sampler=BeliefExpansionSampler(pomdp), compressor::Compressor=PCACompressor(1))
+v = value(policy, s)  # returns the approximately optimal value for state s
source
CompressedBeliefMDPs.CompressedBeliefSolver - Type
CompressedBeliefSolver

The CompressedBeliefSolver struct represents a solver for compressed belief-state MDPs. It combines a compressed belief-state MDP with a base solver to approximate the value function.

Fields

  • m::CompressedBeliefMDP: The compressed belief-state MDP.
  • base_solver::Solver: The base solver used to solve the compressed belief-state MDP.

Constructors

CompressedBeliefSolver(pomdp::POMDP, base_solver::Solver; updater::Updater=DiscreteUpdater(pomdp), sampler::Sampler=BeliefExpansionSampler(pomdp), compressor::Compressor=PCACompressor(1))
 CompressedBeliefSolver(pomdp::POMDP; updater::Updater=DiscreteUpdater(pomdp), sampler::Sampler=BeliefExpansionSampler(pomdp), compressor::Compressor=PCACompressor(1), interp::Union{Nothing, LocalFunctionApproximator}=nothing, k::Int=1, verbose::Bool=false, max_iterations::Int=1000, n_generative_samples::Int=10, belres::Float64=1e-3)

Constructs a CompressedBeliefSolver using the specified POMDP, base solver, updater, sampler, and compressor. Alternatively, you can omit the base solver, in which case a LocalApproximationValueIterationSolver (https://github.com/JuliaPOMDP/LocalApproximationValueIteration.jl) is created instead. Some problems need a different base solver, for example when the POMDP state and action spaces are continuous.

Example Usage

julia> pomdp = TigerPOMDP();
 julia> solver = CompressedBeliefSolver(pomdp; verbose=true, max_iterations=10);
 julia> solve(solver, pomdp);
@@ -19,7 +19,7 @@
 [Iteration 7   ] residual:       6.03 | iteration runtime:      0.495 ms, (     0.639 s total)
 [Iteration 8   ] residual:       5.73 | iteration runtime:      0.585 ms, (     0.639 s total)
 [Iteration 9   ] residual:       4.02 | iteration runtime:      0.463 ms, (      0.64 s total)
-[Iteration 10  ] residual:       7.28 | iteration runtime:      0.576 ms, (      0.64 s total)
source

Functions

CompressedBeliefMDPs.make_cache - Function
make_cache(B, B̃)

Helper function that creates a cache that maps each unique belief from the set B to its corresponding compressed representation in B̃.

Arguments

  • B::Vector{<:Any}: A vector of beliefs.
  • B̃::Matrix{Float64}: A matrix where each row corresponds to the compressed representation of the beliefs in B.

Returns

  • Dict{<:Any, Vector{Float64}}: A dictionary mapping each unique belief in B to its corresponding compressed representation in B̃.

Example Usage

B = [belief1, belief2, belief3]
+[Iteration 10  ] residual:       7.28 | iteration runtime:      0.576 ms, (      0.64 s total)
source

Functions

CompressedBeliefMDPs.make_cache - Function
make_cache(B, B̃)

Helper function that creates a cache that maps each unique belief from the set B to its corresponding compressed representation in B̃.

Arguments

  • B::Vector{<:Any}: A vector of beliefs.
  • B̃::Matrix{Float64}: A matrix where each row corresponds to the compressed representation of the beliefs in B.

Returns

  • Dict{<:Any, Vector{Float64}}: A dictionary mapping each unique belief in B to its corresponding compressed representation in B̃.

Example Usage

B = [belief1, belief2, belief3]
 B̃ = [compressed_belief1; compressed_belief2; compressed_belief3]
-ϕ = make_cache(B, B̃)
source
CompressedBeliefMDPs.make_numerical - Function
make_numerical(B, pomdp)

Helper function that converts a set of beliefs B into a numerical matrix representation suitable for processing by numerical algorithms/compressors.

Arguments

  • B::Vector{<:Any}: A vector of beliefs.
  • pomdp::POMDP: The POMDP model associated with the beliefs.

Returns

  • Matrix{Float64}: A matrix where each row corresponds to a numerical representation of a belief in B.

Example Usage

B = [belief1, belief2, belief3]
-B_numerical = make_numerical(B, pomdp)
source
CompressedBeliefMDPs.compress_POMDP - Function
compress_POMDP(pomdp, sampler, updater, compressor)

Creates a compressed belief-state MDP by sampling, compressing, and caching beliefs from the given POMDP.

Arguments

  • pomdp::POMDP: The POMDP model to be compressed.
  • sampler::Sampler: A sampler to generate a set of beliefs from the POMDP.
  • updater::Updater: An updater to initialize beliefs from states.
  • compressor::Compressor: A compressor to reduce the dimensionality of the beliefs.

Returns

  • CompressedBeliefMDP: The constructed compressed belief-state MDP.
  • Matrix{Float64}: A matrix where each row corresponds to the compressed representation of the sampled beliefs.

Example Usage

pomdp = TigerPOMDP()
sampler = BeliefExpansionSampler(pomdp)
updater = DiscreteUpdater(pomdp)
compressor = PCACompressor(2)
m, B̃ = compress_POMDP(pomdp, sampler, updater, compressor)

source
+ϕ = make_cache(B, B̃)
source
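
The caching behaviour described for make_cache can be illustrated with a minimal sketch in plain Julia. This is not the package's implementation: `make_cache_sketch` is a hypothetical name, beliefs are stood in for by plain probability vectors, and the first-come, first-served rule from the CompressedBeliefMDP notes is assumed to decide which compression is kept when a belief appears more than once.

```julia
# Hypothetical stand-in for make_cache; not the package's actual code.
function make_cache_sketch(B::Vector, B̃::Matrix{Float64})
    @assert length(B) == size(B̃, 1) "B̃ needs one row per belief in B"
    ϕ = Dict{Any, Vector{Float64}}()
    for (i, b) in enumerate(B)
        # Keep the first compression seen for each belief
        # (first-come, first-served); later duplicates are ignored.
        get!(ϕ, b, B̃[i, :])
    end
    return ϕ
end

# Toy beliefs over two states; the second and third entries are duplicates.
B = [[0.5, 0.5], [0.85, 0.15], [0.85, 0.15]]
B̃ = reshape([0.0, 0.35, 0.99], 3, 1)  # one compressed row per belief
ϕ = make_cache_sketch(B, B̃)
```

Because the duplicate belief keeps its first compression, the resulting dictionary has two entries rather than three.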
CompressedBeliefMDPs.make_numerical - Function
make_numerical(B, pomdp)

Helper function that converts a set of beliefs B into a numerical matrix representation suitable for processing by numerical algorithms/compressors.

Arguments

  • B::Vector{<:Any}: A vector of beliefs.
  • pomdp::POMDP: The POMDP model associated with the beliefs.

Returns

  • Matrix{Float64}: A matrix where each row corresponds to a numerical representation of a belief in B.

Example Usage

B = [belief1, belief2, belief3]
+B_numerical = make_numerical(B, pomdp)
source
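
The row-per-belief layout returned by make_numerical can be sketched as follows. The sketch assumes each belief is already available as a vector of state probabilities, which is a simplification: the real function takes belief objects together with the POMDP, and `make_numerical_sketch` is a hypothetical name used only for illustration.

```julia
# Hypothetical stand-in for make_numerical, assuming each belief is already
# a vector of state probabilities (the real function also takes the POMDP).
function make_numerical_sketch(B::Vector{<:AbstractVector{<:Real}})
    n_states = length(first(B))
    M = Matrix{Float64}(undef, length(B), n_states)
    for (i, b) in enumerate(B)
        @assert length(b) == n_states "all beliefs must cover the same states"
        M[i, :] = b  # one belief per row
    end
    return M
end

B = [[0.5, 0.5], [0.85, 0.15], [0.15, 0.85]]
B_numerical = make_numerical_sketch(B)
```

Each row of the result is one belief, so the matrix can be handed directly to a compressor that expects observations in rows.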
CompressedBeliefMDPs.compress_POMDP - Function
compress_POMDP(pomdp, sampler, updater, compressor)

Creates a compressed belief-state MDP by sampling, compressing, and caching beliefs from the given POMDP.

Arguments

  • pomdp::POMDP: The POMDP model to be compressed.
  • sampler::Sampler: A sampler to generate a set of beliefs from the POMDP.
  • updater::Updater: An updater to initialize beliefs from states.
  • compressor::Compressor: A compressor to reduce the dimensionality of the beliefs.

Returns

  • CompressedBeliefMDP: The constructed compressed belief-state MDP.
  • Matrix{Float64}: A matrix where each row corresponds to the compressed representation of the sampled beliefs.

Example Usage

pomdp = TigerPOMDP()
sampler = BeliefExpansionSampler(pomdp)
updater = DiscreteUpdater(pomdp)
compressor = PCACompressor(2)
m, B̃ = compress_POMDP(pomdp, sampler, updater, compressor)

source
diff --git a/dev/circular/index.html b/dev/circular/index.html
index a66b6af..d252958 100644
--- a/dev/circular/index.html
+++ b/dev/circular/index.html
@@ -5,4 +5,4 @@
 n_corridors = 8
 corridor_length = 25
-maze = CircularMaze(n_corridors, corridor_length)
source
CompressedBeliefMDPs.CircularMazeState - Type
CircularMazeState(corridor::Integer, x::Integer)

The CircularMazeState struct represents the state of an agent in a circular maze.

Fields

  • corridor::Integer: The corridor number. The value ranges from 1 to n_corridors.
  • x::Integer: The position of the state within the corridor. The value ranges from 1 to the corridor_length.
source
+maze = CircularMaze(n_corridors, corridor_length)
source
CompressedBeliefMDPs.CircularMazeState - Type
CircularMazeState(corridor::Integer, x::Integer)

The CircularMazeState struct represents the state of an agent in a circular maze.

Fields

  • corridor::Integer: The corridor number. The value ranges from 1 to n_corridors.
  • x::Integer: The position of the state within the corridor. The value ranges from 1 to the corridor_length.
source
diff --git a/dev/compressors/index.html b/dev/compressors/index.html
index 14a3f89..b7509bc 100644
--- a/dev/compressors/index.html
+++ b/dev/compressors/index.html
@@ -12,4 +12,4 @@
 function fit!(c::MyCompressor, beliefs)
     # YOUR CODE HERE
-end

Implementation Tips

Implemented Compressors

CompressedBeliefMDPs currently provides wrappers for the following compression types:

Principal Component Analysis (PCA)

CompressedBeliefMDPs.PCACompressor - Function

Wrapper for MultivariateStats.PCA.

source

Kernel PCA

CompressedBeliefMDPs.KernelPCACompressor - Function

Wrapper for MultivariateStats.KernelPCA.

source

Probabilistic PCA

CompressedBeliefMDPs.PPCACompressor - Function

Wrapper for MultivariateStats.PPCA.

source

Factor Analysis

CompressedBeliefMDPs.FactorAnalysisCompressor - Function

Wrapper for MultivariateStats.FactorAnalysis.

source

Isomap

CompressedBeliefMDPs.IsomapCompressor - Function

Wrapper for ManifoldLearning.Isomap.

source

Autoencoder

CompressedBeliefMDPs.AutoencoderCompressor - Type

Implements an autoencoder in Flux.

source

Variational Auto-Encoder (VAE)

CompressedBeliefMDPs.VAECompressor - Type

Implements a VAE in Flux.

source
Warning

Some compression algorithms aren't optimized for large belief spaces. While they pass our unit tests, they may fail on large POMDPs or without seeding. For large POMDPs, users may want a custom Compressor.

+end

Implementation Tips

Implemented Compressors

CompressedBeliefMDPs currently provides wrappers for the following compression types:

Principal Component Analysis (PCA)

CompressedBeliefMDPs.PCACompressor - Function

Wrapper for MultivariateStats.PCA.

source
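
As a rough illustration of what a PCA-style compressor does to a matrix of beliefs, the sketch below uses only the LinearAlgebra standard library rather than MultivariateStats.PCA. The function name `pca_compress_sketch` is hypothetical, and sign and scaling conventions may differ from what the wrapped library returns.

```julia
using LinearAlgebra

# Project the rows of X (one belief per row) onto the top-k principal
# components, computed from an SVD of the centred data.
function pca_compress_sketch(X::Matrix{Float64}, k::Int)
    μ = sum(X, dims=1) ./ size(X, 1)  # 1×d column means
    Xc = X .- μ                       # centre each column
    F = svd(Xc)                       # Xc = U * Diagonal(S) * V'
    W = F.V[:, 1:k]                   # d×k matrix of top-k directions
    return Xc * W                     # n×k compressed beliefs
end

# Four beliefs over two states; they lie on the line p1 + p2 = 1,
# so a single component captures all of their variance.
X = [0.5 0.5; 0.85 0.15; 0.15 0.85; 0.97 0.03]
X̃ = pca_compress_sketch(X, 1)
```

For two-state problems such as the Tiger POMDP, beliefs vary along one direction only, which is why a one-dimensional compression loses nothing in this toy case.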

Kernel PCA

CompressedBeliefMDPs.KernelPCACompressor - Function

Wrapper for MultivariateStats.KernelPCA.

source

Probabilistic PCA

CompressedBeliefMDPs.PPCACompressor - Function

Wrapper for MultivariateStats.PPCA.

source

Factor Analysis

CompressedBeliefMDPs.FactorAnalysisCompressor - Function

Wrapper for MultivariateStats.FactorAnalysis.

source

Isomap

CompressedBeliefMDPs.IsomapCompressor - Function

Wrapper for ManifoldLearning.Isomap.

source

Autoencoder

CompressedBeliefMDPs.AutoencoderCompressor - Type

Implements an autoencoder in Flux.

source

Variational Auto-Encoder (VAE)

CompressedBeliefMDPs.VAECompressor - Type

Implements a VAE in Flux.

source
Warning

Some compression algorithms aren't optimized for large belief spaces. While they pass our unit tests, they may fail on large POMDPs or without seeding. For large POMDPs, users may want a custom Compressor.

diff --git a/dev/index.html b/dev/index.html
index d4ced9b..9629f41 100644
--- a/dev/index.html
+++ b/dev/index.html
@@ -65,4 +65,4 @@
 )
 policy = solve(solver, pomdp)
 rs = RolloutSimulator(max_steps=50)
-r = simulate(rs, pomdp, policy)

Concepts and Architecture

CompressedBeliefMDPs.jl aims to implement a generalization of the belief compression algorithm for solving large POMDPs. The algorithm has four steps:

  1. collect belief samples,
  2. compress the samples,
  3. create the compressed belief-state MDP,
  4. solve the MDP.

The steps are handled by Sampler, Compressor, CompressedBeliefMDP, and CompressedBeliefSolver, respectively.

For more details, please see the rest of the documentation or the associated paper.

+r = simulate(rs, pomdp, policy)

Concepts and Architecture

CompressedBeliefMDPs.jl aims to implement a generalization of the belief compression algorithm for solving large POMDPs. The algorithm has four steps:

  1. collect belief samples,
  2. compress the samples,
  3. create the compressed belief-state MDP,
  4. solve the MDP.

The steps are handled by Sampler, Compressor, CompressedBeliefMDP, and CompressedBeliefSolver, respectively.

For more details, please see the rest of the documentation or the associated paper.

diff --git a/dev/samplers/index.html b/dev/samplers/index.html
index e050478..b83fca3 100644
--- a/dev/samplers/index.html
+++ b/dev/samplers/index.html
@@ -16,13 +16,13 @@
  DiscreteBelief{TigerPOMDP, Bool}(TigerPOMDP(-1.0, -100.0, 10.0, 0.85, 0.95), Bool[0, 1], [0.15000000000000002, 0.85])
  DiscreteBelief{TigerPOMDP, Bool}(TigerPOMDP(-1.0, -100.0, 10.0, 0.85, 0.95), Bool[0, 1], [0.5, 0.5])
  DiscreteBelief{TigerPOMDP, Bool}(TigerPOMDP(-1.0, -100.0, 10.0, 0.85, 0.95), Bool[0, 1], [0.85, 0.15000000000000002])
- DiscreteBelief{TigerPOMDP, Bool}(TigerPOMDP(-1.0, -100.0, 10.0, 0.85, 0.95), Bool[0, 1], [0.9697986577181208, 0.030201342281879207])
source

Policy Sampler

CompressedBeliefMDPs.PolicySampler - Type
PolicySampler

Samples belief states by rolling out a Policy.

Fields

  • policy::Policy: The policy used for decision making.
  • updater::Updater: The updater used for updating beliefs.
  • n::Integer: The maximum number of simulated steps.
  • rng::AbstractRNG: The random number generator used for sampling.
  • verbose::Bool: Whether to use a progress bar while sampling.

Constructors

PolicySampler(pomdp::POMDP; policy::Policy=RandomPolicy(pomdp), 
+  DiscreteBelief{TigerPOMDP, Bool}(TigerPOMDP(-1.0, -100.0, 10.0, 0.85, 0.95), Bool[0, 1], [0.9697986577181208, 0.030201342281879207])
source

Policy Sampler

CompressedBeliefMDPs.PolicySampler - Type
PolicySampler

Samples belief states by rolling out a Policy.

Fields

  • policy::Policy: The policy used for decision making.
  • updater::Updater: The updater used for updating beliefs.
  • n::Integer: The maximum number of simulated steps.
  • rng::AbstractRNG: The random number generator used for sampling.
  • verbose::Bool: Whether to use a progress bar while sampling.

Constructors

PolicySampler(pomdp::POMDP; policy::Policy=RandomPolicy(pomdp), 
 updater::Updater=DiscreteUpdater(pomdp), n::Integer=10, 
 rng::AbstractRNG=Random.GLOBAL_RNG)

Methods

(s::PolicySampler)(pomdp::POMDP)

Returns a vector of unique belief states.

Example

julia> pomdp = TigerPOMDP();
 julia> sampler = PolicySampler(pomdp; n=3); 
 julia> sampler(pomdp)
 2-element Vector{Any}:
 DiscreteBelief{TigerPOMDP, Bool}(TigerPOMDP(-1.0, -100.0, 10.0, 0.85, 0.95), Bool[0, 1], [0.5, 0.5])
-DiscreteBelief{TigerPOMDP, Bool}(TigerPOMDP(-1.0, -100.0, 10.0, 0.85, 0.95), Bool[0, 1], [0.15000000000000002, 0.85])
source

ExplorationPolicy Sampler

CompressedBeliefMDPs.ExplorationPolicySampler - Type
ExplorationPolicySampler

Samples belief states by rolling out an ExplorationPolicy. Essentially identical to PolicySampler.

Fields

  • explorer::ExplorationPolicy: The ExplorationPolicy used for decision making.
  • on_policy::Policy: The fallback Policy used for decision making when not exploring.
  • updater::Updater: The updater used for updating beliefs.
  • n::Integer: The maximum number of simulated steps.
  • rng::AbstractRNG: The random number generator used for sampling.
  • verbose::Bool: Whether to use a progress bar while sampling.

Constructors

ExplorationPolicySampler(pomdp::POMDP; rng::AbstractRNG=Random.GLOBAL_RNG,
+DiscreteBelief{TigerPOMDP, Bool}(TigerPOMDP(-1.0, -100.0, 10.0, 0.85, 0.95), Bool[0, 1], [0.15000000000000002, 0.85])
source

ExplorationPolicy Sampler

CompressedBeliefMDPs.ExplorationPolicySampler - Type
ExplorationPolicySampler

Samples belief states by rolling out an ExplorationPolicy. Essentially identical to PolicySampler.

Fields

  • explorer::ExplorationPolicy: The ExplorationPolicy used for decision making.
  • on_policy::Policy: The fallback Policy used for decision making when not exploring.
  • updater::Updater: The updater used for updating beliefs.
  • n::Integer: The maximum number of simulated steps.
  • rng::AbstractRNG: The random number generator used for sampling.
  • verbose::Bool: Whether to use a progress bar while sampling.

Constructors

ExplorationPolicySampler(pomdp::POMDP; rng::AbstractRNG=Random.GLOBAL_RNG,
 explorer::ExplorationPolicy=EpsGreedyPolicy(pomdp, 0.1; rng=rng), on_policy=RandomPolicy(pomdp),
 updater::Updater=DiscreteUpdater(pomdp), n::Integer=10)

Methods

(s::ExplorationPolicySampler)(pomdp::POMDP)

Returns a vector of unique belief states.

Example Usage

julia> pomdp = TigerPOMDP()
 julia> sampler = ExplorationPolicySampler(pomdp; n=30)
@@ -30,4 +30,4 @@
 3-element Vector{Any}:
  DiscreteBelief{TigerPOMDP, Bool}(TigerPOMDP(-1.0, -100.0, 10.0, 0.85, 0.95), Bool[0, 1], [0.5, 0.5])
  DiscreteBelief{TigerPOMDP, Bool}(TigerPOMDP(-1.0, -100.0, 10.0, 0.85, 0.95), Bool[0, 1], [0.85, 0.15000000000000002])
- DiscreteBelief{TigerPOMDP, Bool}(TigerPOMDP(-1.0, -100.0, 10.0, 0.85, 0.95), Bool[0, 1], [0.9697986577181208, 0.030201342281879207])
source
+ DiscreteBelief{TigerPOMDP, Bool}(TigerPOMDP(-1.0, -100.0, 10.0, 0.85, 0.95), Bool[0, 1], [0.9697986577181208, 0.030201342281879207])
source