Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add smoothing cubic splines #103

Merged
merged 6 commits into from
Feb 6, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@ This package provides:

- evaluation of splines and their derivatives and integrals;

- spline interpolations and function approximation;
- spline interpolations, smoothing splines and function approximation;

- basis recombination, for generating bases satisfying homogeneous boundary
conditions using linear combinations of B-splines.
Expand Down
2 changes: 1 addition & 1 deletion docs/Project.toml
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"

[compat]
Documenter = "1"
Makie = "0.21"
Makie = "0.22"

[extras]
CPUSummary = "2a0fbf3d-bb9c-48f3-b0a9-814d99fd7ab9"
4 changes: 2 additions & 2 deletions docs/src/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,8 +12,8 @@ This package provides:
- evaluation of [splines](@ref Splines-api) and their derivatives and
integrals;

- [spline interpolations](@ref interpolation-api) and
[function approximation](@ref function-approximation-api);
- [spline interpolations](@ref interpolation-example), [smoothing splines](@ref smoothing-example) and
[function approximation](@ref function-approximation-example);

- [basis recombination](@ref basis-recombination-api), for generating bases
satisfying homogeneous boundary conditions using linear combinations of
Expand Down
3 changes: 2 additions & 1 deletion docs/src/interpolation.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,13 +4,14 @@
CurrentModule = BSplineKit.SplineInterpolations
```

High-level interpolation of gridded data using B-splines.
High-level interpolation and fitting of gridded data using B-splines.

## Functions

```@docs
interpolate
interpolate!
fit
```

## Types
Expand Down
52 changes: 52 additions & 0 deletions examples/interpolation.jl
Original file line number Diff line number Diff line change
Expand Up @@ -86,6 +86,58 @@ current_figure() # hide
# Clearly, the spurious oscillations are strongly suppressed near the
# boundaries.

# ## [Smoothing cubic splines](@id smoothing-example)
#
# One can use [smoothing splines](https://en.wikipedia.org/wiki/Smoothing_spline)
# to fit noisy data.
# A smoothing spline is a curve which passes close to the input data, while avoiding strong
# fluctuations due to possible noise.
# The smoothign strength is controlled by a regularisation parameter ``λ``.
# Setting ``λ = 0`` corresponds to a regular interpolation (the obtained spline passes
# through all the points), while increasing ``λ`` leads to a smoother curve which roughly
# approximates the data.
#
# Given a set of data points ``(x_i, y_i)``, the idea is to construct a spline ``S(x)`` that
# minimises:
#
# ```math
# ∑_{i = 1}^N w_i |y_i - S(x_i)|^2 + λ ∫_{x_1}^{x_N} \left[ S''(x) \right]^2 \, \mathrm{d}x
# ```
#
# Here ``w_i`` are optional weights that may be used to give "priority" to certain data
# points.
#
# Note that only cubic splines (order ``k = 4``) are currently supported.

rng = MersenneTwister(42)
Ndata = 20
xs = sort!(rand(rng, Ndata))
xs[begin] = 0; xs[end] = 1; # not strictly necessary; just to set the data limits
ys = cospi.(2 .* xs) .+ 0.04 .* randn(rng, Ndata);

# Create smoothing spline from data:

λ = 1e-3
S_fit = fit(xs, ys, λ)

# If we want the spline to pass very near a single data point, we can assign a
# larger weight to that point:

weights = fill!(similar(xs), 1)
weights[12] = 100 # larger weight to point i = 12
S_fit_weight = fit(xs, ys, λ; weights)

# Plot results and compare with natural cubic spline interpolation:

S_interp = interpolate(xs, ys, BSplineOrder(4), Natural())

scatter(xs, ys; label = "Data", color = :black)
lines!(0..1, S_interp; label = "Interpolation", linewidth = 2)
lines!(0..1, S_fit; label = "Fit (λ = $λ)", linewidth = 2)
lines!(0..1, S_fit_weight; label = "Fit (λ = $λ) with weight", linestyle = :dash, linewidth = 2)
axislegend(position = (0.5, 1))
current_figure() # hide

# ## [Extrapolations](@id extrapolation-example)
#
# One can use extrapolation to evaluate splines outside of their domain of definition.
Expand Down
5 changes: 4 additions & 1 deletion src/SplineInterpolations/SplineInterpolations.jl
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ module SplineInterpolations
using ..BSplines
using ..Collocation
using ..Collocation: CyclicTridiagonalMatrix, IdentityMatrix
using ..DifferentialOps: Derivative
using ..Splines

using BandedMatrices
Expand All @@ -21,7 +22,9 @@ using ..Recombinations:
using ..BoundaryConditions

export
SplineInterpolation, spline, interpolate, interpolate!
SplineInterpolation, spline, interpolate, interpolate!, fit

include("smoothing.jl")

"""
SplineInterpolation
Expand Down
110 changes: 110 additions & 0 deletions src/SplineInterpolations/smoothing.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,110 @@
@doc raw"""
fit(xs, ys, λ::Real, [BSplineOrder(4)]; [weights = nothing])

Fit a cubic smoothing spline to the given data.

Returns a natural cubic spline which roughly passes through the data (points `(xs[i], ys[i])`)
given some smoothing parameter ``λ``.
Note that ``λ = 0`` means no smoothing and the results are equivalent to
all the data.

One can optionally pass a `weights` vector if one desires to give different weights to
different data points (e.g. if one wants the curve to pass as close as possible to a
specific point). By default all weights are ``w_i = 1``.

More precisely, the returned spline ``S(x)`` minimises:

```math
∑_{i = 1}^N w_i |y_i - S(x_i)|^2 + λ ∫_{x_1}^{x_N} \left[ S''(x) \right]^2 \, \mathrm{d}x
```

Only cubic splines (`BSplineOrder(4)`) are currently supported.

# Examples

```jldoctest smoothing_spline
julia> xs = (0:0.01:1).^2;

julia> ys = @. cospi(2 * xs) + 0.1 * sinpi(200 * xs); # smooth + highly fluctuating components

julia> λ = 0.001; # smoothing parameter

julia> S = fit(xs, ys, λ)
101-element Spline{Float64}:
basis: 101-element RecombinedBSplineBasis of order 4, domain [0.0, 1.0], BCs {left => (D{2},), right => (D{2},)}
order: 4
knots: [0.0, 0.0, 0.0, 0.0, 0.0001, 0.0004, 0.0009, 0.0016, 0.0025, 0.0036 … 0.8836, 0.9025, 0.9216, 0.9409, 0.9604, 0.9801, 1.0, 1.0, 1.0, 1.0]
coefficients: [0.946872, 0.631018, 1.05101, 1.04986, 1.04825, 1.04618, 1.04366, 1.04067, 1.03722, 1.03331 … 0.437844, 0.534546, 0.627651, 0.716043, 0.798813, 0.875733, 0.947428, 1.01524, 0.721199, 0.954231]
```
"""
function fit end

function fit(
xs::AbstractVector, ys::AbstractVector, λ::Real, order::BSplineOrder{4};
weights::Union{Nothing, AbstractVector} = nothing,
)
λ ≥ 0 || throw(DomainError(λ, "the smoothing parameter λ must be non-negative"))
eachindex(xs) == eachindex(ys) || throw(DimensionMismatch("x and y vectors must have the same length"))
N = length(xs)
cs = similar(xs)

T = eltype(cs)

# Create natural cubic B-spline basis with knots = input points
B = BSplineBasis(order, copy(xs))
R = RecombinedBSplineBasis(B, Natural())

# Compute collocation matrices for derivatives 0 and 2
A = BandedMatrix{T}(undef, (N, N), (1, 1)) # 3 bands are enough
D = similar(A)
collocation_matrix!(A, R, xs, Derivative(0))
collocation_matrix!(D, R, xs, Derivative(2))

# Matrix containing grid steps (banded + symmetric, so we don't compute the lower part)
Δ_upper = BandedMatrix{T}(undef, (N, N), (0, 1))
fill!(Δ_upper, 0)
@inbounds for i ∈ axes(Δ_upper, 1)[2:end-1]
# Δ_upper[i, i - 1] = xs[i] - xs[i - 1] # this one is obtained by symmetry
Δ_upper[i, i] = 2 * (xs[i + 1] - xs[i - 1])
Δ_upper[i, i + 1] = xs[i + 1] - xs[i]
end
Δ = Hermitian(Δ_upper) # symmetric matrix with 3 bands

# The integral of the squared second derivative is (H * cs) ⋅ (D * cs) / 6.
H = Δ * D

# Directly construct LHS matrix
# M = Hermitian((H'D + D'H) * (λ / 6) + 2 * A' * W * A) # usually positive definite

# Construct LHS matrix trying to reduce computations
B = H' * D # this matrix has 5 bands
buf_1 = (B .+ B') .* (λ / 6) # = (H'D + D'H) * (λ / 6)
buf_2 = if weights === nothing
A' * A
else
A' * Diagonal(weights) * A # = A' * W * A
end

@. buf_1 = buf_1 + 2 * buf_2 # (H'D + D'H) * (λ / 6) + 2 * A' * W * A
M = Hermitian(buf_1) # the matrix is actually symmetric (usually positive definite)
F = cholesky!(M) # factorise matrix (assuming posdef)

# Directly construct RHS
# z = 2 * A' * (W * ys) # RHS
# ldiv!(cs, F, z)

# Construct RHS trying to reduce allocations
zs = copy(ys)
if weights !== nothing
eachindex(weights) == eachindex(xs) || throw(DimensionMismatch("the `weights` vector must have the same length as the data"))
lmul!(Diagonal(weights), zs) # zs = W * ys
end
mul!(cs, A', zs) # cs = A' * (W * ys)
lmul!(2, cs) # cs = 2 * A' * (W * ys)

ldiv!(F, cs) # solve linear system

Spline(R, cs)
end

fit(x, y, λ; kws...) = fit(x, y, λ, BSplineOrder(4); kws...)
1 change: 1 addition & 0 deletions test/runtests.jl
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ include("splines.jl")
include("periodic.jl")
include("natural.jl")
include("interpolation.jl")
include("smoothing.jl")
include("extrapolation.jl")
include("approximation.jl")
include("recombination.jl")
Expand Down
66 changes: 66 additions & 0 deletions test/smoothing.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
# Test smoothing splines

using BSplineKit
using QuadGK: quadgk
using Test

# Returns the integral of |S''(x)| (the "curvature") over the whole spline.
function total_curvature(S::Spline)
ts = knots(S)
k = order(S)
xs = @view ts[(begin - 1 + k):(end + 1 - k)] # exclude repeated knots at the boundaries
@assert length(xs) == length(S)
S″ = Derivative(2) * S
curv, err = quadgk(xs) do x
abs2(S″(x))
end
curv
end

function distance_from_data(S::Spline, xs, ys)
dist = zero(eltype(ys))
for i in eachindex(xs, ys)
dist += abs2(ys[i] - S(xs[i]))
end
sqrt(dist / length(xs))
end

@testset "Smoothing cubic splines" begin
xs = (0:0.01:1).^2
ys = @. cospi(2 * xs) + 0.1 * sinpi(200 * xs) # smooth + highly fluctuating components

# Check that smoothing with λ = 0 is equivalent to interpolation
@testset "Fit λ = 0" begin
λ = 0
S = @inferred fit(xs, ys, λ)
S_interp = spline(interpolate(xs, ys, BSplineOrder(4), Natural()))
@test coefficients(S) ≈ coefficients(S_interp)
end

@testset "Smoothing" begin
λs = [0.0, 1e-6, 1e-5, 1e-4, 1e-3, 1e-2]
curvatures = similar(λs)
distances = similar(λs)
for i in eachindex(λs)
S = @inferred fit(xs, ys, λs[i])
curvatures[i] = total_curvature(S)
distances[i] = distance_from_data(S, xs, ys)
end
@test issorted(curvatures; rev = true) # in decreasing order (small λ => large curvature)
@test issorted(distances) # in increasing order (large λ => large distance from data)
end

@testset "Weights" begin
λ = 1e-3
weights = fill!(similar(xs), 1)
S = fit(xs, ys, λ)
Sw = @inferred fit(xs, ys, λ; weights) # equivalent to the default (all weights are 1)
@test coefficients(S) == coefficients(Sw)
# Now give more weight to point i = 3
weights[3] = 1000
Sw = fit(xs, ys, λ; weights)
@test abs(Sw(xs[3]) - ys[3]) < abs(S(xs[3]) - ys[3]) # the new curve is closer to the data point i = 3
@test total_curvature(Sw) > total_curvature(S) # since we give more importance to data fitting (basically, the sum of weights is larger)
@test distance_from_data(Sw, xs, ys) < distance_from_data(S, xs, ys)
end
end