MIN_DIAGONAL and MAX_DIAGONAL constants depend on overall scaling ("units") #39

stevengj · 2024-11-08T13:39:39Z

One basic invariant that it is good to maintain if possible is that the algorithm should be insensitive to the overall scaling ("units") of the problem. e.g. if my variables x are in meters, and I change them to µm, which will scale the Jacobian matrix J by 1e6, the algorithm should take the same steps (scaled by 1e6). Unfortunately, the current implementation violates this assumption.

In particular, the "magic" constants MIN_DIAGONAL = 1e-6 and MAX_DIAGONAL = 1e32, which are used to limit the damping/regularization factor:

LeastSquaresOptim.jl/src/optimizer/levenberg_marquardt.jl

Lines 87 to 89 in 2aea700

    
           colsumabs2!(dtd, J) 
        
           clamp!(dtd, MIN_DIAGONAL, MAX_DIAGONAL) 
        
           rmul!(dtd, 1/Δ)

don't scale with the units. This is falls into a common floating-point fallacy, confusing floating-point arithmetic with fixed-point arithmetic — it assumes that numbers are scaled to be of order unity, so that 1e-6 is "small" and 1e32 is "large" in an "absolute" sense, which makes no sense since floating-point numbers have 15 digits of precision over 600 orders of magnitude.

Instead, you have to ask whether the diagonal elements are small or large compared to what. In this case, I think that a natural comparison would be to something related to norm(J)^2 == sum(colsumabs2!(dtd, J)), which sets the overall scale of the entries here and the matrix $J^T J$ that you are regularizing. In particular, a natural choice would be norm(J)^2 / n, which is simply the mean of the dtd entries before clamping. So maybe just change clamp!(dtd, MIN_DIAGONAL, MAX_DIAGONAL) to:

dtd_mean = sum(dtd) / length(dtd)
clamp!(dtd, MIN_DIAGONAL * dtd_mean, MAX_DIAGONAL * dtd_mean)

which would make the algorithm insensitive to overall scaling unless I'm missing something (your other "magic constants" seem to be dimensionless).

cc @mochen4

The text was updated successfully, but these errors were encountered:

matthieugomez · 2024-11-08T16:57:42Z

Thanks a lot for the report! This is fixed by #40

matthieugomez closed this as completed Nov 8, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MIN_DIAGONAL and MAX_DIAGONAL constants depend on overall scaling ("units") #39

MIN_DIAGONAL and MAX_DIAGONAL constants depend on overall scaling ("units") #39

stevengj commented Nov 8, 2024 •

edited

Loading

matthieugomez commented Nov 8, 2024 •

edited

Loading

MIN_DIAGONAL and MAX_DIAGONAL constants depend on overall scaling ("units") #39

MIN_DIAGONAL and MAX_DIAGONAL constants depend on overall scaling ("units") #39

Comments

stevengj commented Nov 8, 2024 • edited Loading

matthieugomez commented Nov 8, 2024 • edited Loading

stevengj commented Nov 8, 2024 •

edited

Loading

matthieugomez commented Nov 8, 2024 •

edited

Loading