Cannot fit data: Newton-Raphson does not converge #17

Hoeze · 2023-01-08T19:02:44Z

Hi @jzluo, I found a sample dataset which fails fitting.
Consider the example data I attached here:
sample.zip

#! unzip sample.zip
with open(r"X.pickle", "rb") as fd:
    X = pickle.load(fd)
with open(r"y.pickle", "rb") as fd:
    y = pickle.load(fd)

from firthlogist import FirthLogisticRegression
FirthLogisticRegression().fit(X, y)

This fails after the second round of optimization because all coefficients get NaN.

Detailed logs + stack trace:

/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/sklearn/utils/validation.py:993: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:322: RuntimeWarning: invalid value encountered in log
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:322: RuntimeWarning: invalid value encountered in log
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:322: RuntimeWarning: invalid value encountered in log
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:322: RuntimeWarning: invalid value encountered in log
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:322: RuntimeWarning: invalid value encountered in log
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:322: RuntimeWarning: invalid value encountered in log
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:322: RuntimeWarning: invalid value encountered in log
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:322: RuntimeWarning: invalid value encountered in log
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:322: RuntimeWarning: invalid value encountered in log
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:322: RuntimeWarning: invalid value encountered in log
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:322: RuntimeWarning: invalid value encountered in log
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:322: RuntimeWarning: invalid value encountered in log
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:322: RuntimeWarning: invalid value encountered in log
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:322: RuntimeWarning: invalid value encountered in log
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:322: RuntimeWarning: invalid value encountered in log
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:322: RuntimeWarning: invalid value encountered in log
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:322: RuntimeWarning: invalid value encountered in log
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:322: RuntimeWarning: invalid value encountered in log
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:153: ConvergenceWarning: Firth logistic regression failed to converge. Try increasing max_iter.
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:322: RuntimeWarning: invalid value encountered in log
/opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/numpy/linalg/linalg.py:2154: RuntimeWarning: invalid value encountered in det

---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
Cell In[164], line 2
      1 from firthlogist import FirthLogisticRegression
----> 2 FirthLogisticRegression().fit(X, y)

File /opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:161, in FirthLogisticRegression.fit(self, X, y)
    159 if not self.skip_ci:
    160     if not self.wald:
--> 161         self.ci_ = _profile_likelihood_ci(
    162             X=X,
    163             y=y,
    164             fitted_coef=self.coef_,
    165             full_loglik=self.loglik_,
    166             max_iter=self.pl_max_iter,
    167             max_stepsize=self.pl_max_stepsize,
    168             max_halfstep=self.pl_max_halfstep,
    169             tol=self.tol,
    170             alpha=self.alpha,
    171             test_vars=self.test_vars,
    172         )
    173     else:
    174         self.ci_ = _wald_ci(self.coef_, self.bse_, self.alpha)

File /opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/firthlogist/firthlogist.py:440, in _profile_likelihood_ci(X, y, fitted_coef, full_loglik, max_iter, max_stepsize, max_halfstep, tol, alpha, test_vars)
    438 U_star = np.matmul(X.T, y - preds + np.multiply(hat, 0.5 - preds))
    439 # https://github.com/georgheinze/logistf/blob/master/src/logistf.c#L780-L781
--> 440 inv_fisher = np.linalg.pinv(fisher_info_mtx)
    441 tmp1x1 = U_star @ np.negative(inv_fisher) @ U_star
    442 underRoot = (
    443     -2
    444     * ((LL0 - loglike) + 0.5 * tmp1x1)
    445     / (inv_fisher[coef_idx, coef_idx])
    446 )

File <__array_function__ internals>:180, in pinv(*args, **kwargs)

File /opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/numpy/linalg/linalg.py:1998, in pinv(a, rcond, hermitian)
   1996     return wrap(res)
   1997 a = a.conjugate()
-> 1998 u, s, vt = svd(a, full_matrices=False, hermitian=hermitian)
   2000 # discard small singular values
   2001 cutoff = rcond[..., newaxis] * amax(s, axis=-1, keepdims=True)

File <__array_function__ internals>:180, in svd(*args, **kwargs)

File /opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/numpy/linalg/linalg.py:1657, in svd(a, full_matrices, compute_uv, hermitian)
   1654         gufunc = _umath_linalg.svd_n_s
   1656 signature = 'D->DdD' if isComplexType(t) else 'd->ddd'
-> 1657 u, s, vh = gufunc(a, signature=signature, extobj=extobj)
   1658 u = u.astype(result_t, copy=False)
   1659 s = s.astype(_realType(result_t), copy=False)

File /opt/anaconda/envs/firthlogist/lib/python3.9/site-packages/numpy/linalg/linalg.py:98, in _raise_linalgerror_svd_nonconvergence(err, flag)
     97 def _raise_linalgerror_svd_nonconvergence(err, flag):
---> 98     raise LinAlgError("SVD did not converge")

LinAlgError: SVD did not converge

The text was updated successfully, but these errors were encountered:

Hoeze · 2023-01-09T16:21:34Z

@jzluo I debugged a bit and found that the issue is the determinant getting negative which results in NaN's in the penalty.
I tried to fix that in #18.

Another thing I found:
With FirthLogisticRegression(skip_ci=True, wald=True).fit(X, y) the fitting works.
Otherwise, the regression runs in an infinite loop during CI/pvalue calculation if I do not set max_halfsteps > 0.

Still, the NR steps do not converge even after 200 iterations.

jzluo · 2023-01-11T23:16:58Z

Thanks so much! I will take a closer look tonight.

jzluo · 2023-01-13T19:12:58Z

Sorry, things have been a bit hectic for me and it may be longer before I can sit down and dig into this.
Just as an initial sanity check, I tried your sample data with logistf and brglm2. Fitting fails in logistf, and the results for brglm2:

Call:
glm(formula = y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + 
    X10 + X11 + X12 + X13 + X14 + X15 + X16 + X17 + X18 + X19 + 
    X20 + X21 + X22 + X23 + X24 + X25 + X26 + X27 + X28 + X29 + 
    X30 + X31 + X32 + X33 + X34, family = binomial(logit), data = data, 
    method = "brglmFit")

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.6170  -0.4648  -0.3425  -0.2405   3.3041  

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept) -4.379e+00  1.144e-01 -38.281  < 2e-16 ***
X1          -3.372e-01  1.347e-01  -2.504 0.012296 *  
X2           1.067e-02  1.486e-03   7.180 6.95e-13 ***
X3           6.447e-04  2.326e-03   0.277 0.781616    
X4           8.806e-03  5.879e-03   1.498 0.134193    
X5           1.978e-03  6.121e-03   0.323 0.746594    
X6          -9.555e-03  5.908e-03  -1.617 0.105788    
X7           9.907e-03  4.557e-03   2.174 0.029729 *  
X8          -2.614e-03  1.972e-03  -1.325 0.185152    
X9           5.465e-03  5.678e-03   0.962 0.335837    
X10          4.911e-03  5.117e-03   0.960 0.337250    
X11          2.094e-03  5.129e-03   0.408 0.683103    
X12         -5.939e-03  2.064e-03  -2.877 0.004015 ** 
X13         -4.947e-03  4.903e-03  -1.009 0.312996    
X14          6.875e-03  3.670e-03   1.873 0.061015 .  
X15         -1.744e-02  4.920e-03  -3.544 0.000394 ***
X16         -6.035e-03  5.629e-03  -1.072 0.283699    
X17         -1.795e-02  2.999e-03  -5.986 2.15e-09 ***
X18         -2.030e-03  5.223e-03  -0.389 0.697615    
X19         -7.145e-03  3.056e-03  -2.338 0.019371 *  
X20         -3.240e-03  4.584e-03  -0.707 0.479723    
X21          2.543e-03  3.137e-03   0.811 0.417493    
X22         -3.022e-03  3.157e-03  -0.957 0.338335    
X23          2.704e-03  3.184e-03   0.849 0.395848    
X24          7.830e-03  3.291e-03   2.379 0.017356 *  
X25         -3.968e-03  3.175e-03  -1.250 0.211373    
X26         -6.351e-03  3.194e-03  -1.989 0.046745 *  
X27          5.167e-03  3.549e-03   1.456 0.145381    
X28          1.475e-03  3.346e-03   0.441 0.659453    
X29          5.652e-05  3.306e-03   0.017 0.986360    
X30          2.458e-03  3.278e-03   0.750 0.453359    
X31         -5.930e-03  3.310e-03  -1.792 0.073172 .  
X32         -1.083e-03  3.294e-03  -0.329 0.742266    
X33          4.086e-03  3.219e-03   1.269 0.204388    
X34          1.131e+05  1.224e+03  92.404  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 98119  on 161633  degrees of freedom
Residual deviance: 88017  on 161599  degrees of freedom
AIC:  88087

Type of estimator: AS_mixed (mixed bias-reducing adjusted score equations)
Number of Fisher Scoring iterations: 3

Can I ask what kind of data this is?

Hoeze · 2023-01-17T14:13:08Z

Unfortunately, I cannot tell you what kind of data this is.
But the fitting succeeds if you standard-normalize all features in X.

Without normalizing and using the default statsmodels logistic regression, you need L-BFGS-B to get a good fit. Newton-Raphson also fails to converge there.

So probably some L-BFGS-B implementation would help here a lot?
This would also make firthlogist scale better to large numbers of variables.

Hoeze linked a pull request Jan 9, 2023 that will close this issue

Use fast_logdet to compute penalty #18

Open

Hoeze changed the title ~~Cannot fit data: Problem with numerical accuracy~~ Cannot fit data: Newton-Raphson does not converge Jan 9, 2023

Hoeze mentioned this issue Jan 9, 2023

ENH: penalized mixin: custom penalization with model properties statsmodels/statsmodels#8594

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cannot fit data: Newton-Raphson does not converge #17

Cannot fit data: Newton-Raphson does not converge #17

Hoeze commented Jan 8, 2023 •

edited

Loading

Hoeze commented Jan 9, 2023 •

edited

Loading

jzluo commented Jan 11, 2023

jzluo commented Jan 13, 2023

Hoeze commented Jan 17, 2023

Cannot fit data: Newton-Raphson does not converge #17

Cannot fit data: Newton-Raphson does not converge #17

Comments

Hoeze commented Jan 8, 2023 • edited Loading

Hoeze commented Jan 9, 2023 • edited Loading

jzluo commented Jan 11, 2023

jzluo commented Jan 13, 2023

Hoeze commented Jan 17, 2023

Hoeze commented Jan 8, 2023 •

edited

Loading

Hoeze commented Jan 9, 2023 •

edited

Loading