LKJ bijector is having problems #144
I don't think there's anything wrong here; it's just that it's numerically very unstable. If I run the code with a print statement inserted in the definition of the inverse evaluation, I get the following:

```
10×10 Array{Float64,2}:
 0.0  -1630.14   -857.317   -8051.24  …    -825.602   -1598.03    -1801.29
 0.0      0.0    8426.06    25379.5        -5603.79   -6840.33    -3651.21
 0.0      0.0       0.0     30092.8        -2565.67   -4733.44      -56.0023
 0.0      0.0       0.0         0.0        -7583.1   -10285.7      8297.9
 0.0      0.0       0.0         0.0        -4432.82   -6084.45     6599.2
 0.0      0.0       0.0         0.0  …    -23834.4   -31462.6    -55327.8
 0.0      0.0       0.0         0.0       -33831.2   -42291.8     22720.5
 0.0      0.0       0.0         0.0            0.0    42018.5     13520.2
 0.0      0.0       0.0         0.0            0.0        0.0     32250.5
 0.0      0.0       0.0         0.0            0.0        0.0         0.0
```

Now, that doesn't look too pretty. The "transformed" version (from unconstrained to constrained) is:
```
10×10 Array{Float64,2}:
  1.0  -1.0  -1.0  -1.0  -1.0  -1.0   1.0  -1.0  -1.0  -1.0
 -1.0   1.0   1.0   1.0   1.0   1.0  -1.0   1.0   1.0   1.0
 -1.0   1.0   1.0   1.0   1.0   1.0  -1.0   1.0   1.0   1.0
 -1.0   1.0   1.0   1.0   1.0   1.0  -1.0   1.0   1.0   1.0
 -1.0   1.0   1.0   1.0   1.0   1.0  -1.0   1.0   1.0   1.0
 -1.0   1.0   1.0   1.0   1.0   1.0  -1.0   1.0   1.0   1.0
  1.0  -1.0  -1.0  -1.0  -1.0  -1.0   1.0  -1.0  -1.0  -1.0
 -1.0   1.0   1.0   1.0   1.0   1.0  -1.0   1.0   1.0   1.0
 -1.0   1.0   1.0   1.0   1.0   1.0  -1.0   1.0   1.0   1.0
```

I need to look more into the LKJ distribution to see how we can deal with this, though. Possibly related: https://mc-stan.org/docs/2_18/functions-reference/cholesky-lkj-correlation-distribution.html
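If, as in Stan's correlation-matrix transform, the bijector maps each unconstrained entry through `tanh` to obtain a canonical partial correlation, the all-±1 matrix above is exactly what saturation looks like: with inputs in the hundreds or thousands, `tanh` rounds to exactly ±1 in Float64. A quick check (Python here, purely for illustration):

```python
import math

# tanh maps an unconstrained real to (-1, 1). In Float64 it rounds to
# exactly +/-1 once |x| is a bit above ~19, because 1 - tanh(x) ~ 2*exp(-2x)
# falls below half a unit in the last place of 1.0.
print(math.tanh(1.0))    # well inside (-1, 1)
print(math.tanh(18.0))   # still strictly less than 1
print(math.tanh(40.0))   # exactly 1.0

# At the magnitudes printed above (hundreds to tens of thousands),
# every off-diagonal entry saturates exactly:
assert math.tanh(1630.14) == 1.0
assert math.tanh(-8051.24) == -1.0
```

Once saturated, the gradient of `tanh` underflows to zero as well, so the sampler gets no signal pushing it back into the well-behaved region.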
Actually, I wonder if this is just an initialization problem, i.e. if we tilted the initialization towards regions of the unconstrained space that actually map to meaningfully differentiable correlation matrices. Edit: bad solution, I know, because then it's a different density.
No, this is fine. This is essentially what we do in Turing with the "robust" init methods. But this is more of an issue with Turing than with Bijectors, no? EDIT: As in, it's fine to change the initialization procedure, but not the actual distribution, yeah.
Possibly also related: #134
Yeah, perhaps, though the real issue might be the numerical instability. If the "true" bijector exists and is genuinely numerically stable, then it shouldn't matter how it's initialized.
Just a data point that may be relevant to this issue: I noticed that LKJ seems to never work with NUTS (giving the above posdef error), but it does seem to work just fine with HMC. E.g.,

```julia
using Turing

@model function LKJ_demo(P)
    y ~ LKJ(P, 1)
end

model = LKJ_demo(5)
m1 = sample(model, HMC(0.01, 20), MCMCThreads(), 1, 4)  # Works
m2 = sample(model, NUTS(1, 0.65), MCMCThreads(), 1, 4)  # Always fails
```
@bratslavia This is because NUTS has an adaptation phase where it will try out possibly "extreme" step-sizes, while HMC uses the fixed step size you give it.
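The point about extreme step sizes can be seen in a toy example (Python, purely illustrative; this is not Turing's integrator). The leapfrog integrator is only stable below a problem-dependent step-size threshold; a step chosen during adaptation that exceeds it makes the unconstrained coordinates blow up geometrically, which is exactly the kind of huge values shown in the first comment:

```python
# Leapfrog on a standard Gaussian target, where grad U(q) = q.
# For this problem the integrator is stable only for step sizes eps < 2.
def leapfrog(q, p, eps, n, grad):
    for _ in range(n):
        p -= 0.5 * eps * grad(q)
        q += eps * p
        p -= 0.5 * eps * grad(q)
    return q, p

grad = lambda q: q  # gradient of U(q) = q^2 / 2

q_small, _ = leapfrog(1.0, 0.0, 0.1, 20, grad)  # stable: stays bounded
q_big, _ = leapfrog(1.0, 0.0, 3.0, 20, grad)    # unstable: diverges

assert abs(q_small) < 2.0
assert abs(q_big) > 1e3
```

With a fixed, conservative step size (as in the `HMC(0.01, 20)` call above) the trajectory never leaves the stable regime, which would explain why HMC avoids the error that NUTS adaptation triggers.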
So what is the current suggested workaround, if any?
1. Matrix-variate dists yield a PosDefException (probably due to the slicer trying extreme values; see TuringLang/Bijectors.jl#144 (comment), where a similar issue is reported with NUTS but not HMC).
2. Other matrix-variate dists do not have a bijector implemented.
3. Anyway, it seems they are flattened by the time they become a vi.
I wish I could be more specific about the problems in the issue description, but I'm having a hard time codifying them. I've gotten singular exceptions, non-PD exceptions, and non-Hermitian matrix exceptions, none of which seems to have an obvious cause to me.
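The singular and non-PD exceptions are consistent with the saturated ±1 matrix shown in the first comment: a "correlation matrix" whose off-diagonal entries are exactly ±1 is a rank-one outer product, so any Cholesky factorization along the way hits a zero pivot and fails. A minimal sketch in Python/NumPy (illustrative only, not the Bijectors.jl code path):

```python
import numpy as np

# Sign pattern from the saturated matrix in the first comment:
# every entry is s_i * s_j, i.e. the rank-one outer product s s^T.
s = np.array([1.0, -1.0, -1.0, -1.0, -1.0, -1.0, 1.0, -1.0, -1.0, -1.0])
C = np.outer(s, s)  # "correlation matrix" with all entries +/-1

# Rank one: all but one eigenvalue is zero, so C is singular
# (positive semi-definite, but not positive definite).
print(np.linalg.matrix_rank(C))  # 1

try:
    np.linalg.cholesky(C)
except np.linalg.LinAlgError as e:
    print("Cholesky failed:", e)
```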
MWE:
Running the example above gets me
If I reduce `dims` to 3, sampling basically just hangs forever. Occasionally it'll throw a singular exception, but I haven't been able to recreate it, so I don't have a callstack. Any thoughts here?