Negative tau2 error in DerSimonianLaird random effects estimation #40

Closed
tsalo opened this issue Jun 24, 2020 · 9 comments · Fixed by #51
Labels: bug (Something isn't working)

Comments

tsalo (Member) commented Jun 24, 2020

Updated summary

There is a bug in Estimators wherein, during random-effects stats estimation, tau2 may drop below zero and raise an exception. I don't know what's causing this, but for documentation of the actual bug, see #40 (comment).

To replicate (on my laptop):

import numpy as np
from pymare.stats import q_profile

# Per-study estimates (y), sampling variances (v), and an intercept-only
# design matrix (X) for the problematic voxel (21 studies)
y = np.array([
    0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, -5.2165329e-01,
    7.4963748e-01, 3.3219385e-01, -1.2351226e+00, 1.7527617e+00, -3.0110748e+00,
    1.0946554e+02, -4.8926108e-02, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
    -6.2120113e+01, -2.8613630e+01, -1.2308966e+01, 5.3221474e+00, 0.0000000e+00,
    -9.2870388e+00,
])
v = np.array([
    0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 5.54480810e-01,
    1.21734442e-01, 2.25415160e-01, 1.74327830e+00, 6.00752247e-01, 1.43908420e+00,
    3.25888647e+03, 4.66549950e+01, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
    2.91479907e+03, 2.15909521e+03, 2.17931152e+03, 1.63775589e+02, 0.00000000e+00,
    1.01077698e+02,
])
X = np.ones((21, 1))
alpha = 0.05

# Raises the negative-tau2 error described above
q_cis = q_profile(y=y, X=X, v=v, alpha=alpha)

Original post

I tried running the DerSimonianLaird estimator on the 21-pain dataset we use in NiMARE to test IBMA methods. There is apparently some mismatch between NiMARE's default brain mask and the maps in the Dataset, so some voxels have all zeroes for both beta and varcope maps. These zero voxels cause divide-by-zero issues in the estimator, which produce NaNs in the tau2 estimates and an exception when trying to get RE stats.

For an example, see this notebook.
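
For reference, here is a textbook sketch of the DerSimonian-Laird tau2 estimate for an intercept-only model (illustrative only, not PyMARE's internal code). It makes the failure mode concrete: when a study's sampling variance is zero, its inverse-variance weight is infinite, and the downstream sums degenerate to NaN.

import numpy as np

def dl_tau2(y, v):
    # Inverse-variance weights; divides by zero wherever v == 0
    w = 1.0 / v
    # Fixed-effects mean; inf * 0 terms from all-zero studies yield NaN
    theta_fe = np.sum(w * y) / np.sum(w)
    # Cochran's Q and the standard DL denominator
    q = np.sum(w * (y - theta_fe) ** 2)
    denom = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    # Truncated at zero by convention
    return max(0.0, (q - (len(y) - 1)) / denom)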

tsalo added the bug label on Jun 24, 2020
tyarkoni (Contributor):

What behavior would you like it to have? Return NaNs at voxels where the input data contained NaNs?

tsalo (Member, Author) commented Jun 24, 2020

I think returning NaNs at voxels where the input data contained all zeros or NaNs would be good. Unless zero would be better when all studies have zeros at a given voxel?

I think that methods like get_re_stats() will need to do some masking/unmasking w.r.t. NaNs.

tyarkoni (Contributor):

Okay, I'll have it return NaNs. Might be a while before I can get to it though, so feel free to open a PR if you need this urgently.

tsalo (Member, Author) commented Jun 24, 2020

I kept digging and found that it's not the all-zero voxels causing the problem. Sorry about the misreported bug!

I've isolated the voxel that consistently raises the exception, but I can't figure out what is causing it.

To replicate (on my laptop at least):

import numpy as np
from pymare.stats import q_profile

# Per-study estimates (y), sampling variances (v), and an intercept-only
# design matrix (X) for the isolated voxel (21 studies)
y = np.array([
    0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, -5.2165329e-01,
    7.4963748e-01, 3.3219385e-01, -1.2351226e+00, 1.7527617e+00, -3.0110748e+00,
    1.0946554e+02, -4.8926108e-02, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
    -6.2120113e+01, -2.8613630e+01, -1.2308966e+01, 5.3221474e+00, 0.0000000e+00,
    -9.2870388e+00,
])
v = np.array([
    0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 5.54480810e-01,
    1.21734442e-01, 2.25415160e-01, 1.74327830e+00, 6.00752247e-01, 1.43908420e+00,
    3.25888647e+03, 4.66549950e+01, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
    2.91479907e+03, 2.15909521e+03, 2.17931152e+03, 1.63775589e+02, 0.00000000e+00,
    1.01077698e+02,
])
X = np.ones((21, 1))
alpha = 0.05

# Consistently raises the exception
q_cis = q_profile(y=y, X=X, v=v, alpha=alpha)

The behavior for all-zero voxels seems to be that tau2 ends up NaN, but the confidence intervals end up at their respective bounds (i.e., [0, 100]). I do think those should be NaNs, but that's more an enhancement to request than a bug to report. I will update this issue and open a new one for the NaN-handling soonish.

tyarkoni (Contributor):

Setting aside the issue for this particular voxel (which we do need to get to the bottom of), I don't recommend treating voxels with missing data for some studies as if the estimates and variances are 0. My naive expectation is that PyMARE estimators will always return a NaN value if you do that (if they don't, let me know!), which would be fine, since you could just mask those voxels out. BUT even if that's true, I don't think you want to rely on PyMARE to do the right thing here. If you know you have missing data for at least one study at a voxel, I suggest either setting the estimates and variances for those studies to NaN or (better) just masking those voxels out before you hand off to PyMARE in the first place.

(Alternatively, you could list-wise delete the missing studies, so that PyMARE gets only complete data. But that would be very misleading to the end user, because they'd have no way of knowing that different voxels could be based on different studies. So I'd recommend just erring on the side of caution and dropping all voxels that don't have complete data everywhere.)
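
A minimal sketch of that pre-masking step, under the assumption that the per-study data are (n_studies, n_voxels) arrays (the names y_maps and v_maps, and the fit call below, are illustrative assumptions, not PyMARE's actual API):

import numpy as np

def complete_data_mask(y_maps, v_maps):
    # True only for voxels where every study has a finite estimate and a
    # finite, positive sampling variance
    usable = np.isfinite(y_maps) & np.isfinite(v_maps) & (v_maps > 0)
    return usable.all(axis=0)

# Hypothetical usage: fit only the complete-data voxels
# mask = complete_data_mask(y_maps, v_maps)
# results = estimator.fit(y_maps[:, mask], v=v_maps[:, mask])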

tsalo (Member, Author) commented Jun 25, 2020

I've opened #42 to continue discussion of handling missing data, and will change the title and summary for this issue to match the bug of unknown cause.

tsalo changed the title from "Voxels with all zeros result in NaN-related errors in Estimators" to "Negative tau2 error in DerSimonianLaird random effects estimation" on Jun 25, 2020
tyarkoni (Contributor) commented Oct 9, 2020

It looks like this is an optimization problem: scipy's minimize() is failing to stay within the specified bounds (variances shouldn't be negative). The problem is that this seems to be extremely dependent on the starting values for the Q-profiled CIs... I was able to reproduce the error from your inputs, but if I tweak the starting value even slightly, it converges successfully. Unfortunately, this isn't just a matter of picking a different fixed starting point; I imagine the behavior is heavily dependent on the input data, and you just happened to find inputs at a voxel that send the optimizer off the rails.

What I'm going to do for the moment is pick a starting value based on the data, which will hopefully at least reduce the number of failures of this sort. I don't want to explicitly handle exceptions just yet, as I want to get a sense of how rare these failures are in practice before we take that step. But if this crops up again, as seems likely, I think we're going to have to set NaN for all voxels that run into convergence problems (probably with a warning).
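
To illustrate the idea of a data-driven start (a sketch only, not the change that was eventually merged in #51; q_gap is a hypothetical stand-in for the profiled Q objective), a cheap method-of-moments estimate can seed the bounded search:

import numpy as np
from scipy.optimize import minimize

def fit_tau2(q_gap, y, v, bounds=(0.0, 100.0)):
    # Crude data-driven start: observed spread of the estimates in excess
    # of the average sampling variance, clipped into the feasible interval
    x0 = np.clip(np.var(y, ddof=1) - np.mean(v), *bounds)
    res = minimize(q_gap, x0=[x0], bounds=[bounds], method="L-BFGS-B")
    return float(res.x[0])

Starting near a plausible tau2 value should keep the optimizer away from the pathological trajectories described above, at least for well-behaved inputs.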

tyarkoni (Contributor) commented Oct 9, 2020

@tsalo on a semi-related note, getting CIs for the tau^2 estimate via get_re_stats() is likely to be impractical for the foreseeable future in the fMRI context... there's no easy way to parallelize the optimization over voxels. The CI estimates currently obtained via the Q-profile method are also a bit dubious, and probably won't be super useful to users anyway. So at least for the time being, I would probably avoid trying to report these in NiMARE. We definitely want to report the point estimates for tau^2, but I think the NaNs you were getting for that were a separate issue (still to be addressed).

tsalo (Member, Author) commented Oct 13, 2020

I'm totally fine not including CIs in NiMARE, at least unless users start requesting them (though I doubt that will happen any time soon). Thanks for the heads-up.
