Add COCG method for complex symmetric linear systems #289

Open · wants to merge 16 commits into base: master
2 changes: 1 addition & 1 deletion docs/make.jl
@@ -16,7 +16,7 @@ makedocs(
"Getting started" => "getting_started.md",
"Preconditioning" => "preconditioning.md",
"Linear systems" => [
"Conjugate Gradients" => "linear_systems/cg.md",
"Conjugate Gradient" => "linear_systems/cg.md",
"Chebyshev iteration" => "linear_systems/chebyshev.md",
"MINRES" => "linear_systems/minres.md",
"BiCGStab(l)" => "linear_systems/bicgstabl.md",
4 changes: 2 additions & 2 deletions docs/src/index.md
@@ -12,13 +12,13 @@ When solving linear systems $Ax = b$ for a square matrix $A$ there are quite som

| Method | When to use it |
|---------------------|--------------------------------------------------------------------------|
| [Conjugate Gradients](@ref CG) | Best choice for **symmetric**, **positive-definite** matrices |
| [Conjugate Gradient](@ref CG) | Best choice for **symmetric**, **positive-definite** matrices |
| [MINRES](@ref MINRES) | For **symmetric**, **indefinite** matrices |
| [GMRES](@ref GMRES) | For **nonsymmetric** matrices when a good [preconditioner](@ref Preconditioning) is available |
| [IDR(s)](@ref IDRs) | For **nonsymmetric**, **strongly indefinite** problems without a good preconditioner |
| [BiCGStab(l)](@ref BiCGStabl) | Otherwise for **nonsymmetric** problems |

We also offer [Chebyshev iteration](@ref Chebyshev) as an alternative to Conjugate Gradients when bounds on the spectrum are known.
We also offer [Chebyshev iteration](@ref Chebyshev) as an alternative to Conjugate Gradient when bounds on the spectrum are known.

Stationary methods like [Jacobi](@ref), [Gauss-Seidel](@ref), [SOR](@ref) and [SSOR](@ref) can be used as smoothers to reduce high-frequency components in the error in just a few iterations.

8 changes: 6 additions & 2 deletions docs/src/linear_systems/cg.md
@@ -1,12 +1,16 @@
# [Conjugate Gradients (CG)](@id CG)
# [Conjugate Gradient (CG)](@id CG)

Conjugate Gradients solves $Ax = b$ approximately for $x$ where $A$ is a symmetric, positive-definite linear operator and $b$ the right-hand side vector. The method uses short recurrences and therefore has fixed memory costs and fixed computational costs per iteration.
CG solves $Ax = b$ approximately for $x$ where $A$ is a Hermitian, positive-definite linear operator and $b$ the right-hand side vector. The method uses short recurrences and therefore has fixed memory costs and fixed computational costs per iteration.

A simple variant called Conjugate Orthogonal Conjugate Gradient (COCG) is obtained by replacing the conjugated dot product ($x^*y$) in the CG algorithm with the unconjugated dot product ($xᵀy$). For complex symmetric matrices ($A = Aᵀ$), COCG is equivalent to Biconjugate Gradient (BiCG) but needs only one matrix-vector multiplication per iteration instead of the two required by BiCG, so it converges in roughly half the time. Also, as in BiCG, $A$ does not need to be positive-definite in COCG.
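
For illustration, the following is a minimal sketch of the COCG recurrence (assuming a zero initial guess and no preconditioner); it is not the package implementation, just the textbook CG loop with every inner product taken without conjugation:

```julia
using LinearAlgebra

# Minimal COCG sketch: plain CG, but with unconjugated inner products (rᵀr and uᵀAu).
function cocg_sketch(A, b; maxiter = size(A, 2), reltol = sqrt(eps(real(eltype(b)))))
    x = zeros(eltype(b), size(A, 2))
    r = copy(b)                      # residual of the zero initial guess
    u = zero(r)
    ρ_prev = one(eltype(b))
    for _ in 1:maxiter
        ρ = transpose(r) * r         # unconjugated: rᵀr instead of r'r
        u = r .+ (ρ / ρ_prev) .* u
        c = A * u
        α = ρ / (transpose(u) * c)   # unconjugated: uᵀ(Au) instead of u'(Au)
        x .+= α .* u
        r .-= α .* c
        ρ_prev = ρ
        norm(r) ≤ reltol * norm(b) && break
    end
    return x
end
```

Using the conjugated products `dot(r, r)` and `dot(u, c)` instead recovers ordinary CG.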

## Usage

```@docs
cg
cg!
cocg
cocg!
```

## On the GPU
2 changes: 1 addition & 1 deletion docs/src/linear_systems/chebyshev.md
@@ -2,7 +2,7 @@

Chebyshev iteration solves the problem $Ax=b$ approximately for $x$ where $A$ is a symmetric, definite linear operator and $b$ the right-hand side vector. The method assumes the interval $[\lambda_{min}, \lambda_{max}]$ containing all eigenvalues of $A$ is known, so that $x$ can be iteratively constructed via a Chebyshev polynomial with zeros in this interval. This polynomial ultimately acts as a filter that removes components in the direction of the eigenvectors from the initial residual.

The main advantage with respect to Conjugate Gradients is that BLAS1 operations such as inner products are avoided.
The main advantage with respect to Conjugate Gradient is that BLAS1 operations such as inner products are avoided.

## Usage

61 changes: 40 additions & 21 deletions src/cg.jl
@@ -1,21 +1,22 @@
import Base: iterate
using Printf
export cg, cg!, CGIterable, PCGIterable, cg_iterator!, CGStateVariables
export cg, cg!, cocg, cocg!, CGIterable, PCGIterable, cg_iterator!, CGStateVariables

mutable struct CGIterable{matT, solT, vecT, numT <: Real}
mutable struct CGIterable{matT, solT, vecT, numT <: Real, paramT <: Number, dotT <: AbstractDot}
A::matT
x::solT
r::vecT
c::vecT
u::vecT
tol::numT
residual::numT
prev_residual::numT
ρ_prev::paramT
maxiter::Int
mv_products::Int
dotproduct::dotT
end

mutable struct PCGIterable{precT, matT, solT, vecT, numT <: Real, paramT <: Number}
mutable struct PCGIterable{precT, matT, solT, vecT, numT <: Real, paramT <: Number, dotT <: AbstractDot}
Pl::precT
A::matT
x::solT
@@ -24,9 +25,10 @@ mutable struct PCGIterable{precT, matT, solT, vecT, numT <: Real, paramT <: Numb
u::vecT
tol::numT
residual::numT
ρ::paramT
ρ_prev::paramT
maxiter::Int
mv_products::Int
dotproduct::dotT
end

@inline converged(it::Union{CGIterable, PCGIterable}) = it.residual ≤ it.tol
@@ -47,18 +49,20 @@ function iterate(it::CGIterable, iteration::Int=start(it))
end

# u := r + βu (almost an axpy)
β = it.residual^2 / it.prev_residual^2
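# ρ = ⟨r, r⟩: for the conjugated dot product this equals residual², for the unconjugated one it is recomputed as rᵀr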
ρ = isa(it.dotproduct, ConjugatedDot) ? it.residual^2 : _norm(it.r, it.dotproduct)^2
β = ρ / it.ρ_prev

it.u .= it.r .+ β .* it.u

# c = A * u
mul!(it.c, it.A, it.u)
α = it.residual^2 / dot(it.u, it.c)
α = ρ / _dot(it.u, it.c, it.dotproduct)

# Improve solution and residual
it.ρ_prev = ρ
it.x .+= α .* it.u
it.r .-= α .* it.c

it.prev_residual = it.residual
it.residual = norm(it.r)

# Return the residual at item and iteration number as state
@@ -78,18 +82,17 @@ function iterate(it::PCGIterable, iteration::Int=start(it))
# Apply left preconditioner
ldiv!(it.c, it.Pl, it.r)

ρ_prev = it.ρ
it.ρ = dot(it.c, it.r)

# u := c + βu (almost an axpy)
β = it.ρ / ρ_prev
ρ = _dot(it.r, it.c, it.dotproduct)
β = ρ / it.ρ_prev
it.u .= it.c .+ β .* it.u

# c = A * u
mul!(it.c, it.A, it.u)
α = it.ρ / dot(it.u, it.c)
α = ρ / _dot(it.u, it.c, it.dotproduct)

# Improve solution and residual
it.ρ_prev = ρ
it.x .+= α .* it.u
it.r .-= α .* it.c

@@ -122,7 +125,8 @@ function cg_iterator!(x, A, b, Pl = Identity();
reltol::Real = sqrt(eps(real(eltype(b)))),
maxiter::Int = size(A, 2),
statevars::CGStateVariables = CGStateVariables(zero(x), similar(x), similar(x)),
initially_zero::Bool = false)
initially_zero::Bool = false,
dotproduct::AbstractDot = ConjugatedDot())
u = statevars.u
r = statevars.r
c = statevars.c
@@ -143,14 +147,12 @@
# Return the iterable
if isa(Pl, Identity)
return CGIterable(A, x, r, c, u,
tolerance, residual, one(residual),
maxiter, mv_products
)
tolerance, residual, one(eltype(r)),
maxiter, mv_products, dotproduct)
else
return PCGIterable(Pl, A, x, r, c, u,
tolerance, residual, one(eltype(x)),
maxiter, mv_products
)
tolerance, residual, one(eltype(r)),
maxiter, mv_products, dotproduct)
end
end

@@ -214,6 +216,7 @@ function cg!(x, A, b;
statevars::CGStateVariables = CGStateVariables(zero(x), similar(x), similar(x)),
verbose::Bool = false,
Pl = Identity(),
dotproduct::AbstractDot = ConjugatedDot(),
kwargs...)
history = ConvergenceHistory(partial = !log)
history[:abstol] = abstol
@@ -222,7 +225,7 @@

# Actually perform CG
iterable = cg_iterator!(x, A, b, Pl; abstol = abstol, reltol = reltol, maxiter = maxiter,
statevars = statevars, kwargs...)
statevars = statevars, dotproduct = dotproduct, kwargs...)
if log
history.mvps = iterable.mv_products
end
@@ -240,3 +243,19 @@

log ? (iterable.x, history) : iterable.x
end

"""
cocg(A, b; kwargs...) -> x, [history]

Same as [`cocg!`](@ref), but allocates a solution vector `x` initialized with zeros.
"""
cocg(A, b; kwargs...) = cocg!(zerox(A, b), A, b; initially_zero = true, kwargs...)

"""
cocg!(x, A, b; kwargs...) -> x, [history]

Same as [`cg!`](@ref), but uses the unconjugated dot product (`xᵀy`) instead of the usual,
conjugated dot product (`x'y`) in the algorithm. It is for solving linear systems with
matrices `A` that are complex-symmetric (`Aᵀ == A`) rather than Hermitian (`A' == A`).
"""
cocg!(x, A, b; kwargs...) = cg!(x, A, b; dotproduct = UnconjugatedDot(), kwargs...)
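
For context, a hedged usage sketch along the lines of the new tests (the matrix construction here is illustrative, not prescribed by the package):

```julia
using LinearAlgebra, IterativeSolvers

n = 10
B = rand(ComplexF64, n, n)
A = B + transpose(B) + 15I          # complex symmetric: transpose(A) == A, but A' != A
b = rand(ComplexF64, n)
reltol = sqrt(eps(Float64))

x, history = cocg(A, b; reltol = reltol, maxiter = 100, log = true)
norm(A * x - b) / norm(b)           # ≲ reltol once history.isconverged

# Equivalently, cocg! simply forwards to cg! with the unconjugated dot product:
y = cg!(zeros(ComplexF64, n), A, b; dotproduct = UnconjugatedDot(), maxiter = 100)
```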
15 changes: 14 additions & 1 deletion src/common.jl
@@ -1,6 +1,6 @@
import LinearAlgebra: ldiv!, \

export Identity
export Identity, ConjugatedDot, UnconjugatedDot

#### Type-handling
"""
Expand Down Expand Up @@ -30,3 +30,16 @@ struct Identity end
\(::Identity, x) = copy(x)
ldiv!(::Identity, x) = x
ldiv!(y, ::Identity, x) = copyto!(y, x)

"""
Conjugated and unconjugated dot products
"""
abstract type AbstractDot end
struct ConjugatedDot <: AbstractDot end
struct UnconjugatedDot <: AbstractDot end

_norm(x, ::ConjugatedDot) = norm(x)
_dot(x, y, ::ConjugatedDot) = dot(x, y)

_norm(x, ::UnconjugatedDot) = sqrt(sum(xₖ->xₖ^2, x))
_dot(x, y, ::UnconjugatedDot) = transpose(@view(x[:])) * @view(y[:])
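
As a small illustration (a REPL-style sketch, not part of the diff), the two dot products differ for complex vectors:

```julia
using IterativeSolvers

x = [1.0 + 2.0im, 3.0 - 1.0im]
y = [0.5im, 2.0 + 0.0im]

IterativeSolvers._dot(x, y, ConjugatedDot())    # dot(x, y) == x'y: conjugates x
IterativeSolvers._dot(x, y, UnconjugatedDot())  # transpose(x) * y: no conjugation
```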
44 changes: 35 additions & 9 deletions test/cg.jl
@@ -17,22 +17,22 @@ end

ldiv!(y, P::JacobiPrec, x) = y .= x ./ P.diagonal

@testset "Conjugate Gradients" begin
@testset "Conjugate Gradient" begin

Random.seed!(1234321)

@testset "Small full system" begin
n = 10

@testset "Matrix{$T}" for T in (Float32, Float64, ComplexF32, ComplexF64)
@testset "Matrix{$T}, conjugated dot product" for T in (Float32, Float64, ComplexF32, ComplexF64)
A = rand(T, n, n)
A = A' * A + I
b = rand(T, n)
reltol = √eps(real(T))

x,ch = cg(A, b; reltol=reltol, maxiter=2n, log=true)
@test isa(ch, ConvergenceHistory)
@test norm(A*x - b) / norm(b) ≤ reltol
@test A*x ≈ b rtol=reltol
@test ch.isconverged

# If you start from the exact solution, you should converge immediately
@@ -50,6 +50,32 @@ Random.seed!(1234321)
x0 = cg(A, zeros(T, n))
@test x0 == zeros(T, n)
end

@testset "Matrix{$T}, unconjugated dot product" for T in (Float32, Float64, ComplexF32, ComplexF64)
A = rand(T, n, n)
A = A + transpose(A) + 15I
x = ones(T, n)
b = A * x

reltol = √eps(real(T))

# Solve without preconditioner
x1, his1 = cocg(A, b, reltol = reltol, maxiter = 100, log = true)
@test isa(his1, ConvergenceHistory)
@test A*x1 ≈ b rtol=reltol

# With an initial guess
x_guess = rand(T, n)
x2, his2 = cocg!(x_guess, A, b, reltol = reltol, maxiter = 100, log = true)
@test isa(his2, ConvergenceHistory)
@test x2 == x_guess
@test A*x2 ≈ b rtol=reltol

# Do an exact LU decomp of a nearby matrix
F = lu(A + rand(T, n, n))
x3, his3 = cocg(A, b, Pl = F, maxiter = 100, reltol = reltol, log = true)
@test A*x3 ≈ b rtol=reltol
end
end

@testset "Sparse Laplacian" begin
@@ -64,24 +90,24 @@
@testset "SparseMatrixCSC{$T, $Ti}" for T in (Float64, Float32), Ti in (Int64, Int32)
xCG = cg(A, rhs; reltol=reltol, maxiter=100)
xJAC = cg(A, rhs; Pl=P, reltol=reltol, maxiter=100)
@test norm(A * xCG - rhs) ≤ reltol
@test norm(A * xJAC - rhs) ≤ reltol
@test A*xCG ≈ rhs rtol=reltol
@test A*xJAC ≈ rhs rtol=reltol
end

Af = LinearMap(A)
@testset "Function" begin
xCG = cg(Af, rhs; reltol=reltol, maxiter=100)
xJAC = cg(Af, rhs; Pl=P, reltol=reltol, maxiter=100)
@test norm(A * xCG - rhs) ≤ reltol
@test norm(A * xJAC - rhs) ≤ reltol
@test A*xCG ≈ rhs rtol=reltol
@test A*xJAC ≈ rhs rtol=reltol
end

@testset "Function with specified starting guess" begin
x0 = randn(size(rhs))
xCG, hCG = cg!(copy(x0), Af, rhs; abstol=abstol, reltol=0.0, maxiter=100, log=true)
xJAC, hJAC = cg!(copy(x0), Af, rhs; Pl=P, abstol=abstol, reltol=0.0, maxiter=100, log=true)
@test norm(A * xCG - rhs) ≤ reltol
@test norm(A * xJAC - rhs) ≤ reltol
@test A*xCG ≈ rhs rtol=reltol
@test A*xJAC ≈ rhs rtol=reltol
@test niters(hJAC) == niters(hCG)
end
end
14 changes: 14 additions & 0 deletions test/common.jl
@@ -27,6 +27,20 @@ end
@test ldiv!(y, P, copy(x)) == x
end

@testset "Vector{$T}, conjugated and unconjugated dot products" for T in (ComplexF32, ComplexF64)
n = 100
x = rand(T, n)
y = rand(T, n)

# Conjugated dot product
@test IterativeSolvers._norm(x, ConjugatedDot()) ≈ sqrt(x'x)
@test IterativeSolvers._dot(x, y, ConjugatedDot()) ≈ x'y

# Unconjugated dot product
@test IterativeSolvers._norm(x, UnconjugatedDot()) ≈ sqrt(transpose(x) * x)
@test IterativeSolvers._dot(x, y, UnconjugatedDot()) ≈ transpose(x) * y
end

end

DocMeta.setdocmeta!(IterativeSolvers, :DocTestSetup, :(using IterativeSolvers); recursive=true)
2 changes: 1 addition & 1 deletion test/runtests.jl
@@ -9,7 +9,7 @@ include("hessenberg.jl")
#Stationary solvers
include("stationary.jl")

#Conjugate gradients
#Conjugate gradient
include("cg.jl")

#BiCGStab(l)