
StackOverflowError when solving TigerPOMDP #17

Open
dominikstrb opened this issue Jun 26, 2020 · 3 comments
Comments

@dominikstrb

Hi everyone,

I am currently trying to solve a version of the TigerPOMDP, but with continuous observations instead of the Bernoulli observation model from the original tiger problem. Since MCVI is one of the few solvers that supports continuous observations, I thought I'd give it a try.

As a first attempt, I tried to run MCVI on the original TigerPOMDP as implemented in POMDPModels.jl. I basically adapted the code from the tests in this repo to the tiger problem. This is my code:

using MCVI
import MCVI: init_lower_action, lower_bound, upper_bound
using POMDPs
using Random
using POMDPModelTools
using POMDPSimulators
using POMDPModels

# Bounds
mutable struct TigerLowerBound
    rng::AbstractRNG
end

mutable struct TigerUpperBound
    rng::AbstractRNG
end

function lower_bound(lb::TigerLowerBound, p::TigerPOMDP, s::Bool)
    return p.r_findtiger
end

function upper_bound(ub::TigerUpperBound, p::TigerPOMDP, s::Bool)
    return p.r_escapetiger
end

function init_lower_action(p::TigerPOMDP)
    return 0 # Worst? This depends on the initial state? XXX
end

# State-only observation model: reuse the observation distribution for the listen action
function POMDPs.observation(p::TigerPOMDP, s::Bool)
    return observation(p, TIGER_LISTEN, s)
end

prob = TigerPOMDP()
sim = MCVISimulator()

solver = MCVISolver(sim, nothing, 1, 100, 8, 500, 1000, 5000, 50, TigerLowerBound(sim.rng), TigerUpperBound(sim.rng))
solve(solver, prob)

I get the following error message, and I have no idea what is causing it:

StackOverflowError:
Stacktrace:
 [1] dot at /build/julia/src/julia-1.4.2/usr/share/julia/stdlib/v1.4/LinearAlgebra/src/blas.jl:297 [inlined]
 [2] dot(::Array{Float64,1}, ::Array{Float64,1}) at /build/julia/src/julia-1.4.2/usr/share/julia/stdlib/v1.4/LinearAlgebra/src/blas.jl:346
 [3] dot at /build/julia/src/julia-1.4.2/usr/share/julia/stdlib/v1.4/LinearAlgebra/src/matmul.jl:9 [inlined]
 [4] reward at /home/dominik/.julia/packages/MCVI/rUku8/src/subspace.jl:80 [inlined]
 [5] reward at /home/dominik/.julia/packages/MCVI/rUku8/src/belief.jl:109 [inlined]
 [6] expand!(::MCVI.BeliefNode{Bool}, ::MCVISolver, ::TigerPOMDP; debug::Bool) at /home/dominik/.julia/packages/MCVI/rUku8/src/solver.jl:83
 [7] search!(::MCVI.BeliefNode{Bool}, ::MCVISolver, ::MCVIPolicy, ::TigerPOMDP, ::Float64; debug::Bool) at /home/dominik/.julia/packages/MCVI/rUku8/src/solver.jl:169
 [8] search!(::MCVI.ActionNode{Bool,Int64}, ::MCVISolver, ::MCVIPolicy, ::TigerPOMDP, ::Float64; debug::Bool) at /home/dominik/.julia/packages/MCVI/rUku8/src/solver.jl:216
 [9] search!(::MCVI.BeliefNode{Bool}, ::MCVISolver, ::MCVIPolicy, ::TigerPOMDP, ::Float64; debug::Bool) at /home/dominik/.julia/packages/MCVI/rUku8/src/solver.jl:185
 ... (the last 2 lines are repeated 11896 more times)
 [23802] macro expansion at /home/dominik/.julia/packages/MCVI/rUku8/src/solver.jl:240 [inlined]
 [23803] macro expansion at ./util.jl:234 [inlined]
 [23804] solve(::MCVISolver, ::TigerPOMDP, ::MCVIPolicy; debug::Bool) at /home/dominik/.julia/packages/MCVI/rUku8/src/solver.jl:239
 [23805] solve at /home/dominik/.julia/packages/MCVI/rUku8/src/solver.jl:226 [inlined] (repeats 2 times)
 [23806] top-level scope at ./util.jl:175 [inlined]
in expression starting at /home/dominik/repos/psy-pomdps/test-mcvi.jl:43

Does anyone have an idea how to use this solver for problems other than LightDark1D?

Thanks,
Dominik

@zsunberg
Member

It appears that search! is being called recursively here until it runs out of stack memory. This is likely because the bounds are too far apart. Unfortunately, I don't know exactly what the bounds mean. If they are bounds on the value function for the state, it seems like both the upper and lower bounds could be r_escapetiger, but that produces a different error. Someone should investigate what the bounds mean and document them appropriately (also, ideally the bounds should just be functions instead of objects).
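
For illustration, here is a sketch of what the bounds could look like if they are meant to bound the total discounted reward obtainable from a state (an assumption; the documentation does not say). The lower bound below is the value of the always-listen policy, which is a valid lower bound on the optimal value, and the upper bound is the best one-step reward received forever. Whether this is tight enough to stop the recursion is untested:

# Sketch only: assumes lower_bound/upper_bound bound the total discounted value from state s
function lower_bound(lb::TigerLowerBound, p::TigerPOMDP, s::Bool)
    # value of the always-listen policy: a valid lower bound on the optimal value
    return p.r_listen / (1.0 - discount(p))
end

function upper_bound(ub::TigerUpperBound, p::TigerPOMDP, s::Bool)
    # no policy can do better than receiving the best one-step reward forever
    return p.r_escapetiger / (1.0 - discount(p))
end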

@dominikstrb
Author

dominikstrb commented Jun 28, 2020

Thanks, that makes sense. My main problem with this solver was that the documentation does not explain at all what the bounds mean. The MCVI paper also did not help.

@gdaddi

gdaddi commented Nov 2, 2022

Has anyone recently found a way to solve this issue? I ran into the same StackOverflowError while trying to solve the mountaincar domain with MCVI.

This is my version, which tries to keep the bounds as close together as possible:

mutable struct MCarLowerBound
    rng::AbstractRNG
end

mutable struct MCarUpperBound
    rng::AbstractRNG
end

function lower_bound(lb::MCarLowerBound, p, s::Tuple)
    return s[1] - 0.0001
end

function upper_bound(ub::MCarUpperBound, p, s::Tuple)
    return s[1] + 0.0001
end

function init_lower_action(p::QuickPOMDP)
    return 0
end

function POMDPs.observation(p::QuickPOMDP, s::Tuple)
    return observation(p, s) 
end
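
For completeness, wiring these into the solver would presumably mirror the construction from the first post; mc below is a placeholder for the QuickPOMDP mountain car model (defined elsewhere), and the numeric solver parameters are simply copied from above rather than tuned:

# Sketch: mirrors the MCVISolver call from the first post; `mc` is a placeholder
# for the QuickPOMDP mountain car model
sim = MCVISimulator()
solver = MCVISolver(sim, nothing, 1, 100, 8, 500, 1000, 5000, 50,
                    MCarLowerBound(sim.rng), MCarUpperBound(sim.rng))
policy = solve(solver, mc)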
