Negative tau2 error in DerSimonianLaird random effects estimation #40

Closed
tsalo opened this issue Jun 24, 2020 · 9 comments · Fixed by #51
Labels: bug (Something isn't working)

Comments

tsalo (Member) commented Jun 24, 2020

Updated summary

There is a bug in Estimators wherein, during random-effects stats estimation, tau2 may drop below zero and raise an exception. I don't know what's causing this, but for documentation of the actual bug, see #40 (comment).

To replicate (on my laptop):

import numpy as np
from pymare.stats import q_profile

# Per-study estimates (y), sampling variances (v), and an intercept-only
# design matrix (X) for the problematic voxel (21 studies)
y = np.array([
    0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, -5.2165329e-01,
    7.4963748e-01, 3.3219385e-01, -1.2351226e+00, 1.7527617e+00, -3.0110748e+00,
    1.0946554e+02, -4.8926108e-02, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
    -6.2120113e+01, -2.8613630e+01, -1.2308966e+01, 5.3221474e+00, 0.0000000e+00,
    -9.2870388e+00,
])
v = np.array([
    0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 5.54480810e-01,
    1.21734442e-01, 2.25415160e-01, 1.74327830e+00, 6.00752247e-01, 1.43908420e+00,
    3.25888647e+03, 4.66549950e+01, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
    2.91479907e+03, 2.15909521e+03, 2.17931152e+03, 1.63775589e+02, 0.00000000e+00,
    1.01077698e+02,
])
X = np.ones((21, 1))
alpha = 0.05

# Raises the negative-tau2 error described above
q_cis = q_profile(y=y, X=X, v=v, alpha=alpha)

Original post

I tried running the DerSimonianLaird estimator on the 21-pain dataset we use in NiMARE to test IBMA methods. There is apparently some mismatch between NiMARE's default brain mask and the maps in the Dataset, so some voxels have all zeroes for both beta and varcope maps. These zero voxels cause divide-by-zero issues in the estimator, which produce NaNs in the tau2 estimates and an exception when trying to get RE stats.

For an example, see this notebook.
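
For reference, here is a textbook sketch of the DerSimonian-Laird tau2 estimate for an intercept-only model (illustrative only, not PyMARE's internal code). It makes the failure mode concrete: when a study's sampling variance is zero, its inverse-variance weight is infinite, and the downstream sums degenerate to NaN.

import numpy as np

def dl_tau2(y, v):
    # Inverse-variance weights; divides by zero wherever v == 0
    w = 1.0 / v
    # Fixed-effects mean; inf * 0 terms from all-zero studies yield NaN
    theta_fe = np.sum(w * y) / np.sum(w)
    # Cochran's Q and the standard DL denominator
    q = np.sum(w * (y - theta_fe) ** 2)
    denom = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    # Truncated at zero by convention
    return max(0.0, (q - (len(y) - 1)) / denom)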

tsalo added the bug label on Jun 24, 2020
tyarkoni (Contributor):

What behavior would you like it to have? Return NaNs at voxels where the input data contained NaNs?

tsalo (Member, Author) commented Jun 24, 2020

I think returning NaNs at voxels where the input data contained all zeros or NaNs would be good. Unless zero would be better when all studies have zeros at a given voxel?

I think that methods like get_re_stats() will need to do some masking/unmasking w.r.t. NaNs.

tyarkoni (Contributor):

Okay, I'll have it return NaNs. Might be a while before I can get to it though, so feel free to open a PR if you need this urgently.

tsalo (Member, Author) commented Jun 24, 2020

I kept digging and found that it's not the all-zero voxels causing the problem. Sorry about the misreported bug!

I've isolated the voxel that consistently raises the exception, but I can't figure out what is causing it.

To replicate (on my laptop at least):

import numpy as np
from pymare.stats import q_profile

# Per-study estimates (y), sampling variances (v), and an intercept-only
# design matrix (X) for the isolated voxel (21 studies)
y = np.array([
    0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, -5.2165329e-01,
    7.4963748e-01, 3.3219385e-01, -1.2351226e+00, 1.7527617e+00, -3.0110748e+00,
    1.0946554e+02, -4.8926108e-02, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
    -6.2120113e+01, -2.8613630e+01, -1.2308966e+01, 5.3221474e+00, 0.0000000e+00,
    -9.2870388e+00,
])
v = np.array([
    0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 5.54480810e-01,
    1.21734442e-01, 2.25415160e-01, 1.74327830e+00, 6.00752247e-01, 1.43908420e+00,
    3.25888647e+03, 4.66549950e+01, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
    2.91479907e+03, 2.15909521e+03, 2.17931152e+03, 1.63775589e+02, 0.00000000e+00,
    1.01077698e+02,
])
X = np.ones((21, 1))
alpha = 0.05

# Consistently raises the exception
q_cis = q_profile(y=y, X=X, v=v, alpha=alpha)

The behavior for all-zero voxels seems to be that tau2 ends up NaN, but the confidence intervals end up at their respective bounds (i.e., [0, 100]). I do think those should be NaNs, but that's more an enhancement to request than a bug to report. I will update this issue and open a new one for the NaN-handling soonish.

tyarkoni (Contributor):

Setting aside the issue for this particular voxel (which we do need to get to the bottom of), I don't recommend treating voxels with missing data for some studies as if the estimates and variances are 0. My naive expectation is that PyMARE estimators will always return a NaN value if you do that (if they don't, let me know!), which would be fine, since you could just mask those voxels out. BUT even if that's true, I don't think you want to rely on PyMARE to do the right thing here. If you know you have missing data for at least one study at a voxel, I suggest either setting the estimates and variances for those studies to NaN or (better) just masking those voxels out before you hand off to PyMARE in the first place.

(Alternatively, you could list-wise delete the missing studies, so that PyMARE gets only complete data. But that would be very misleading to the end user, because they'd have no way of knowing that different voxels could be based on different studies. So I'd recommend just erring on the side of caution and dropping all voxels that don't have complete data everywhere.)
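
A minimal sketch of that pre-masking step, under the assumption that the per-study data are (n_studies, n_voxels) arrays (the names y_maps and v_maps, and the fit call below, are illustrative assumptions, not PyMARE's actual API):

import numpy as np

def complete_data_mask(y_maps, v_maps):
    # True only for voxels where every study has a finite estimate and a
    # finite, positive sampling variance
    usable = np.isfinite(y_maps) & np.isfinite(v_maps) & (v_maps > 0)
    return usable.all(axis=0)

# Hypothetical usage: fit only the complete-data voxels
# mask = complete_data_mask(y_maps, v_maps)
# results = estimator.fit(y_maps[:, mask], v=v_maps[:, mask])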

tsalo (Member, Author) commented Jun 25, 2020

I've opened #42 to continue discussion of handling missing data, and will change the title and summary for this issue to match the bug of unknown cause.

tsalo changed the title from "Voxels with all zeros result in NaN-related errors in Estimators" to "Negative tau2 error in DerSimonianLaird random effects estimation" on Jun 25, 2020
tyarkoni (Contributor) commented Oct 9, 2020

It looks like this is an optimization problem: scipy's minimize() is failing to stay within the specified bounds (variances shouldn't be negative). The problem is that this seems to be extremely dependent on the starting values for the Q-profiled CIs... I was able to reproduce the error from your inputs, but if I tweak the starting value even slightly, it converges successfully. Unfortunately, this isn't just a matter of picking a different fixed starting point; I imagine the behavior is heavily dependent on the input data, and you just happened to find inputs at a voxel that send the optimizer off the rails.

What I'm going to do for the moment is pick a starting value based on the data, which will hopefully at least reduce the number of failures of this sort. I don't want to explicitly handle exceptions just yet, as I want to get a sense of how rare these failures are in practice before we take that step. But if this crops up again, as seems likely, I think we're going to have to set NaN for all voxels that run into convergence problems (probably with a warning).
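
To illustrate the idea of a data-driven start (a sketch only, not the change that was eventually merged in #51; q_gap is a hypothetical stand-in for the profiled Q objective), a cheap method-of-moments estimate can seed the bounded search:

import numpy as np
from scipy.optimize import minimize

def fit_tau2(q_gap, y, v, bounds=(0.0, 100.0)):
    # Crude data-driven start: observed spread of the estimates in excess
    # of the average sampling variance, clipped into the feasible interval
    x0 = np.clip(np.var(y, ddof=1) - np.mean(v), *bounds)
    res = minimize(q_gap, x0=[x0], bounds=[bounds], method="L-BFGS-B")
    return float(res.x[0])

Starting near a plausible tau2 value should keep the optimizer away from the pathological trajectories described above, at least for well-behaved inputs.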

tyarkoni (Contributor) commented Oct 9, 2020

@tsalo on a semi-related note, getting CIs for the tau^2 estimate via get_re_stats() is likely to be impractical for the foreseeable future in the fMRI context... there's no easy way to parallelize the optimization over voxels. The CI estimates currently obtained via the Q-profile method are also a bit dubious, and probably won't be super useful to users anyway. So at least for the time being, I would probably avoid trying to report these in NiMARE. We definitely want to report the point estimates for tau^2, but I think the NaNs you were getting for that were a separate issue (still to be addressed).

tsalo (Member, Author) commented Oct 13, 2020

I'm totally fine not including CIs in NiMARE, at least unless users start requesting them (though I doubt that will happen any time soon). Thanks for the heads-up.
