Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Need for MvNormal with sparse precision matrix #89

Open
jrfaulkner opened this issue Jun 24, 2020 · 18 comments · Fixed by JuliaDiff/ChainRules.jl#730
Open

Need for MvNormal with sparse precision matrix #89

jrfaulkner opened this issue Jun 24, 2020 · 18 comments · Fixed by JuliaDiff/ChainRules.jl#730

Comments

@jrfaulkner
Copy link

After much digging and testing, it appears you are currently not supporting sparse matrix constructions for AD. Is this correct? In particular, I am in need of a MvNormal distribution that is specified using a sparse precision matrix and will work with the autodiff in Turing. This is common for large spatial problems using Gaussian Markov random fields. I came to Turing for this because stan does not support sparse matrix operations. This would be a really valuable addition to your package if you can make it work.

@devmotion
Copy link
Member

No, MvNormalCanon (which you are referring to) is not supported yet. Would be a valuable addition.

@jrfaulkner
Copy link
Author

One that allows sparse precision matrix is the key.

@devmotion
Copy link
Member

Sure, I understand, I was just referring to the fact that MvNormalCanon is the distribution type in Distributions for multivariate normal distributions that are parameterized by precision matrices.

@jrfaulkner
Copy link
Author

Yes, thanks for pointing that out. Is there any reason that sparse operations would cause problems for autodiff?

@ElOceanografo
Copy link

This is also something I would find really valuable. From trying out a toy model, am I correct that Distributions is using PDMats which, when it constructs a PDSparseMat, calls a cholesky routine from CHOLMOD, which chokes on the Dual numbers?

using Turing
using SparseArrays
using PDMats

@model function MvNormalCanonTest(x)
    n = length(x)
    ρ ~ Uniform(0, 0.49)
    Q = spdiagm(-1 => ρ * ones(n-1), 0 => ones(n), 1 => ρ * ones(n-1))
    x ~ MvNormalCanon(PDSparseMat(Q))
end

x = randn(10)
m = MvNormalCanonTest(x)
sample(m, NUTS(), 100)

TypeError: in Sparse, in Tv, expected Tv<:Union{Complex{Float64}, Float64}, got Type{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f#7"{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Model{var"###evaluator#299",(:x,),Tuple{Array{Float64,1}},(),DynamicPPL.ModelGen{var"###generator#300",(:x,),(),Tuple{}}},DynamicPPL.Sampler{NUTS{Turing.Core.ForwardDiffAD{40},(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}}},Float64},Float64,1}}
SuiteSparse.CHOLMOD.Sparse(::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f#7"{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Model{var"###evaluator#299",(:x,),Tuple{Array{Float64,1}},(),DynamicPPL.ModelGen{var"###generator#300",(:x,),(),Tuple{}}},DynamicPPL.Sampler{NUTS{Turing.Core.ForwardDiffAD{40},(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}}},Float64},Float64,1},Int64}, ::Int64) at cholmod.jl:933
SuiteSparse.CHOLMOD.Sparse(::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f#7"{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Model{var"###evaluator#299",(:x,),Tuple{Array{Float64,1}},(),DynamicPPL.ModelGen{var"###generator#300",(:x,),(),Tuple{}}},DynamicPPL.Sampler{NUTS{Turing.Core.ForwardDiffAD{40},(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}}},Float64},Float64,1},Int64}) at cholmod.jl:939
cholesky(::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f#7"{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Model{var"###evaluator#299",(:x,),Tuple{Array{Float64,1}},(),DynamicPPL.ModelGen{var"###generator#300",(:x,),(),Tuple{}}},DynamicPPL.Sampler{NUTS{Turing.Core.ForwardDiffAD{40},(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}}},Float64},Float64,1},Int64}; kws::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at cholmod.jl:1458
cholesky at cholmod.jl:1458 [inlined]
PDSparseMat(::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f#7"{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Model{var"###evaluator#299",(:x,),Tuple{Array{Float64,1}},(),DynamicPPL.ModelGen{var"###generator#300",(:x,),(),Tuple{}}},DynamicPPL.Sampler{NUTS{Turing.Core.ForwardDiffAD{40},(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}}},Float64},Float64,1},Int64}) at pdsparsemat.jl:22
macro expansion at untitled-d3c70e9b18556a75b02f77b5bb73e87c:245 [inlined]
##evaluator#299(::Random._GLOBAL_RNG, ::DynamicPPL.Model{var"###evaluator#299",(:x,),Tuple{Array{Float64,1}},(),DynamicPPL.ModelGen{var"###generator#300",(:x,),(),Tuple{}}}, ::DynamicPPL.ThreadSafeVarInfo{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f...

@jkbest2
Copy link

jkbest2 commented Sep 16, 2020

This PR is probably relevant: JuliaDiff/ChainRules.jl#246. If I understand the AD machinery being used here (I don't), once sparse array support is added, rules for sparse Choleskys can be added that will enable differentiating through. Unfortunately CHOLMOD doesn't know what Dual numbers are, so ForwardDiff AD fails.

@ElOceanografo
Copy link

Reading the stacktrace for my error above more carefully, I the error is actually coming from the Sparse constructor, not the Cholesky decomposition, so hopefully JuliaDiff/ChainRules.jl#246 will solve the problem.

@ElOceanografo
Copy link

I've done more investigating and think that it might be easier to implement a custom TuringMvNormalCanon type in this package, at least as long as Distributions uses PDMats. The issue is that even with a custom adjoint defined for the Cholesky decomposition, the rrule still has to calculate the factorization--which, as long as it is done via SuiteSparse.CHOLMOD, means it won't be able to handle dual numbers.

However, I recently found LDLFactorizations.jl, a pure-Julia port of the LDL decomposition from SuiteSparse--which can be used to calculate the Cholesky factors, even for non-float matrices. Using this approach, I've hacked together a custom MvNormalCanon implementation that works in my example Turing model above. @devmotion , does this sound like a worthwhile PR?

@devmotion
Copy link
Member

The issue is that even with a custom adjoint defined for the Cholesky decomposition, the rrule still has to calculate the factorization--which, as long as it is done via SuiteSparse.CHOLMOD, means it won't be able to handle dual numbers.

I don't understand why there is a problem with the custom adjoint here - what is missing or broken in the definition of the adjoint in ChainRules (https://github.com/JuliaDiff/ChainRules.jl/blob/f63f385e7a8477c5112f0b2cf24790ca1abf6320/src/rulesets/LinearAlgebra/factorization.jl#L73)?

However, I recently found LDLFactorizations.jl, a pure-Julia port of the LDL decomposition from SuiteSparse--which can be used to calculate the Cholesky factors, even for non-float matrices.

Interesting, I didn't know this package. I wonder if it is problematic that it uses the LGPL license.

@ElOceanografo
Copy link

In my example above, the precision matrix Q is defined in terms of the unknown parameter ρ, so when applying any autodiff routine to the model, Q ends up being a SparseMatrixCSC with elements of some dual number type. The error is coming from the PDSparseMat constructor trying to precalculate cholesky(Q) using CHOLMOD, not the actual calculation of the likelihood or its derivative.

The rrule seems to be defined correctly, but it still calls CHOLMOD if the matrix X is sparse (line 74 in the file you linked). My understanding of how rrules work is pretty superficial, so I may be wrong, but I think it would run into the same problem there, right?

Regarding the LGPL...I'm not a lawyer, but I believe it's okay to use LGPL libraries in non-GPL code, as long as the non-GPL code isn't a derivative work. SuiteSparse itself is under the LGPL, so if using LDLFactorizations causes license issues, they're at least not new ones...

@devmotion
Copy link
Member

Ah OK, I think I understand now. I'm a bit surprised that ForwardDiff doesn't support cholesky of sparse matrices but I've never checked.

Did you try to use Zygote instead of ForwardDiff? It uses the adjoints in ChainRules and does not operate with dual numbers.

@ElOceanografo
Copy link

Zygote doesn't work either, it fails with a Mutating arrays is not supported error. Some of the linear algebra functions build up their results in-place.

Traceback ```julia Mutating arrays is not supported error(::String) at error.jl:33 (::Zygote.var"#364#365")(::Nothing) at array.jl:58 (::Zygote.var"#2245#back#366"{Zygote.var"#364#365"})(::Nothing) at adjoint.jl:49 diag at cholmod.jl:1792 [inlined] (::typeof(∂(diag)))(::Array{Float64,1}) at interface2.jl:0 logdet at cholmod.jl:1801 [inlined] (::typeof(∂(logdet)))(::Float64) at interface2.jl:0 logdet at pdsparsemat.jl:55 [inlined] (::typeof(∂(logdet)))(::Float64) at interface2.jl:0 logdetcov at mvnormalcanon.jl:165 [inlined] (::typeof(∂(logdetcov)))(::Float64) at interface2.jl:0 mvnormal_c0 at mvnormal.jl:99 [inlined] (::typeof(∂(mvnormal_c0)))(::Int64) at interface2.jl:0 _logpdf at mvnormal.jl:127 [inlined] (::typeof(∂(_logpdf)))(::Int64) at interface2.jl:0 JuliaLang/julia#119 at multivariates.jl:262 [inlined] (::typeof(∂(λ)))(::Int64) at interface2.jl:0 JuliaLang/julia#1071 at broadcast.jl:140 [inlined] JuliaLang/julia#3 at generator.jl:36 [inlined] iterate at generator.jl:47 [inlined] collect(::Base.Generator{Base.Iterators.Zip{Tuple{Array{typeof(∂(λ)),1},FillArrays.Fill{Int64,1,Tuple{Base.OneTo{Int64}}}}},Base.var"#3#4"{Zygote.var"#1071#1078"}}) at array.jl:686 map at abstractarray.jl:2248 [inlined] (::Zygote.var"#1070#1077"{Tuple{UnitRange{Int64}},Val{2},Array{typeof(∂(λ)),1}})(::FillArrays.Fill{Int64,1,Tuple{Base.OneTo{Int64}}}) at broadcast.jl:140 JuliaLang/julia#3862#back at adjoint.jl:49 [inlined] (::Zygote.var"#150#151"{Zygote.var"#3862#back#1081"{Zygote.var"#1070#1077"{Tuple{UnitRange{Int64}},Val{2},Array{typeof(∂(λ)),1}}},Tuple{Tuple{Nothing,Nothing,Nothing},Tuple{}}})(::FillArrays.Fill{Int64,1,Tuple{Base.OneTo{Int64}}}) at lib.jl:191 JuliaLang/julia#1693#back at adjoint.jl:49 [inlined] broadcasted at broadcast.jl:1257 [inlined] (::typeof(∂(broadcasted)))(::FillArrays.Fill{Int64,1,Tuple{Base.OneTo{Int64}}}) at interface2.jl:0 JuliaLang/julia#556 at array.jl:243 [inlined] (::typeof(∂(#556)))(::Int64) at interface2.jl:0 JuliaLang/julia#41 at interface.jl:45 [inlined] JuliaLang/julia#557 at array.jl:244 [inlined] sum at reducedim.jl:720 [inlined] (::typeof(∂(sum)))(::Int64) at interface2.jl:0 loglikelihood at multivariates.jl:262 [inlined] (::typeof(∂(loglikelihood)))(::Int64) at interface2.jl:0 observe at context_implementations.jl:152 [inlined] observe at hmc.jl:557 [inlined] _tilde at context_implementations.jl:109 [inlined] tilde at context_implementations.jl:67 [inlined] (::typeof(∂(tilde)))(::Int64) at interface2.jl:0 tilde_observe at context_implementations.jl:89 [inlined] (::typeof(∂(tilde_observe)))(::Nothing) at interface2.jl:0 JuliaLang/julia#7 at untitled-80c47956c154a0947dfbe4a5c1a877ba:92 [inlined] (::typeof(∂(#7)))(::Nothing) at interface2.jl:0 macro expansion at model.jl:0 [inlined] _evaluate at model.jl:160 [inlined] (::typeof(∂(_evaluate)))(::Nothing) at interface2.jl:0 evaluate_threadsafe at model.jl:150 [inlined] (::typeof(∂(evaluate_threadsafe)))(::Nothing) at interface2.jl:0 Model at model.jl:94 [inlined] (::typeof(∂(λ)))(::Nothing) at interface2.jl:0 JuliaLang/julia#150 at lib.jl:191 [inlined] JuliaLang/julia#1693#back at adjoint.jl:49 [inlined] Model at model.jl:98 [inlined] f at ad.jl:165 [inlined] (::typeof(∂(λ)))(::Int64) at interface2.jl:0 (::Zygote.var"#41#42"{typeof(∂(λ))})(::Int64) at interface.jl:45 gradient_logp(::Turing.Core.ZygoteAD, ::Array{Float64,1}, ::DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}, ::DynamicPPL.Model{var"#7#8",(:x,),(),(),Tuple{Array{Float64,1}},Tuple{}}, ::DynamicPPL.Sampler{NUTS{Turing.Core.ZygoteAD,(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}}, ::DynamicPPL.DefaultContext) at ad.jl:171 gradient_logp at ad.jl:83 [inlined] gradient_logp at ad.jl:83 [inlined] ∂logπ∂θ at hmc.jl:474 [inlined] ∂H∂θ at hamiltonian.jl:31 [inlined] phasepoint at hamiltonian.jl:69 [inlined] find_good_stepsize(::Random._GLOBAL_RNG, ::AdvancedHMC.Hamiltonian{AdvancedHMC.DiagEuclideanMetric{Float64,Array{Float64,1}},Turing.Inference.var"#logπ#55"{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Sampler{NUTS{Turing.Core.ZygoteAD,(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}},DynamicPPL.Model{var"#7#8",(:x,),(),(),Tuple{Array{Float64,1}},Tuple{}}},Turing.Inference.var"#∂logπ∂θ#54"{DynamicPPL.VarI... ```

@devmotion
Copy link
Member

Sorry, it's not clear to me what exactly you tested here. Is it the Turing model above? That might still require some additional work but I was just referring to the statements about AD of cholesky above - I assumed that this should work with sparse matrices since Zygote should use the custom adjoint defined in ChainRules.

@ElOceanografo
Copy link

Yes I just ran the Turing model above after setting the AD backend to Zygote...sorry for not being clearer. However, I also get the same error with a hand-coded likelihood function :

using Distributions, Zygote, SparseArrays, PDMats

const xobs = randn(10)

function loglik(θ)
    ρ = 0.5tanh(θ[1])
    n = length(xobs)
    Q = spdiagm(-1 => ρ * ones(n-1), 0 => ones(n), 1 => ρ * ones(n-1))
    d = MvNormalCanon(PDSparseMat(Q))
    return logpdf(d, xobs)
end

Zygote.gradient(loglik, [1.0])
# ERROR: Mutating arrays is not supported

@jkbest2
Copy link

jkbest2 commented Nov 16, 2020

For what it's worth, the pullback and pushforward of cholesky with a sparse argument appears to work:

julia> n = 100
julia> ρ = 0.5
julia> Q = spdiagm(-1 => ρ * ones(n-1), 0 => ones(n), 1 => ρ * ones(n-1))

julia> pullback(cholesky, Q)
(SuiteSparse.CHOLMOD.Factor{Float64}
type:    LLt
method:  simplicial
maxnnz:  19
nnz:     19
success: true
, Zygote.var"#41#42"{Zygote.ZBack{ChainRules.var"#cholesky_pullback#1615"{SuiteSparse.CHOLMOD.Factor{Float64}}}}(Zygote.ZBack{ChainRules.var"#cholesky_pullback#1615"{SuiteSparse.CHOLMOD.Factor{Float64}}}(ChainRules.var"#cholesky_pullback#1615"{SuiteSparse.CHOLMOD.Factor{Float64}}(SuiteSparse.CHOLMOD.Factor{Float64}
type:    LLt
method:  simplicial
maxnnz:  19
nnz:     19
success: true
))))

julia> pushforward(cholesky, Q)
JuliaLang/julia#8 (generic function with 1 method)

But trying to get the gradient of logdet with respect to a sparse matrix fails:

julia> Zygote.gradient(logdet, Q)
ERROR: MethodError: no method matching logabsdet(::SuiteSparse.UMFPACK.UmfpackLU{Float64,Int64})
Closest candidates are:
  logabsdet(::SymTridiagonal; shift) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/tridiag.jl:441
  logabsdet(::UnitUpperTriangular{T,S} where S<:AbstractArray{T,2}) where T at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/triangular.jl:2652
  logabsdet(::UnitLowerTriangular{T,S} where S<:AbstractArray{T,2}) where T at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/triangular.jl:2653
  ...
Stacktrace:
 [1] logabsdet(::SparseMatrixCSC{Float64,Int64}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/generic.jl:1573
 [2] adjoint at /home/jkbest/.julia/packages/Zygote/c0awc/src/lib/array.jl:378 [inlined]
 [3] _pullback at /home/jkbest/.julia/packages/ZygoteRules/6nssF/src/adjoint.jl:47 [inlined]
 [4] logdet at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/generic.jl:1596 [inlined]
 [5] _pullback(::Zygote.Context, ::typeof(logdet), ::SparseMatrixCSC{Float64,Int64}) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface2.jl:0
 [6] _pullback(::Function, ::SparseMatrixCSC{Float64,Int64}) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface.jl:38
 [7] pullback(::Function, ::SparseMatrixCSC{Float64,Int64}) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface.jl:44
 [8] gradient(::Function, ::SparseMatrixCSC{Float64,Int64}) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface.jl:53
 [9] top-level scope at REPL[22]:1

And if you use a function to create a AR-1 precision matrix and try to take logdet with respect to ρ, you get the same error about mutating arrays.

julia> spchol(ρ; n = 100) =  cholesky(spdiagm(-1 => ρ * ones(n-1), 0 => ones(n), 1 => ρ * ones(n-1)))
julia> Zygote.gradient-> logdet(spchol(ρ)), 0.5)
ERROR: Mutating arrays is not supported
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] (::Zygote.var"#364#365")(::Nothing) at /home/jkbest/.julia/packages/Zygote/c0awc/src/lib/array.jl:58
 [3] (::Zygote.var"#2245#back#366"{Zygote.var"#364#365"})(::Nothing) at /home/jkbest/.julia/packages/ZygoteRules/6nssF/src/adjoint.jl:49
 [4] diag at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/SuiteSparse/src/cholmod.jl:1792 [inlined]
 [5] (::typeof((diag)))(::Array{Float64,1}) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface2.jl:0
 [6] logdet at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/SuiteSparse/src/cholmod.jl:1801 [inlined]
 [7] (::typeof((logdet)))(::Float64) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface2.jl:0
 [8] JuliaLang/julia#4 at ./REPL[20]:1 [inlined]
 [9] (::typeof((#4)))(::Float64) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface2.jl:0
 [10] (::Zygote.var"#41#42"{typeof((#4))})(::Float64) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface.jl:45
 [11] gradient(::Function, ::Float64) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface.jl:54
 [12] top-level scope at REPL[20]:1

I'm not sure if any of this is helpful, but maybe a Zygote issue would be more productive?

@devmotion
Copy link
Member

For what it's worth, the pullback and pushforward of cholesky with a sparse argument appears to work:

That's exactly what I assumed and why I thought that for Zygote the problem mentioned above for ForwardDiff

The error is coming from the PDSparseMat constructor trying to precalculate cholesky(Q) using CHOLMOD, not the actual calculation of the likelihood or its derivative.

should not exist.

But trying to get the gradient of logdet with respect to a sparse matrix fails:

It seems logabsdet is not defined for sparse matrices (and its LU decomposition)?!

@ElOceanografo
Copy link

It seems logabsdet is not defined for sparse matrices (and its LU decomposition)?!

Weird that we'd be the first people to notice this, but that seems to be the case. There are a few closed issues for other missing logabsdet methods, but not for SparseMatrixCSC. I've opened an issue about it: JuliaLang/LinearAlgebra.jl#788

Anyway, in the meantime...what do you think is the best way to move forward on this issue?

@ElOceanografo
Copy link

Some progress on this: logabsdet will now work for sparse matrices (on Julia master, should be released in v1.7). Once the AD issues get sorted out, this issue can probably be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants