
Meta-analysis #267

Open

JamesYang007 opened this issue Feb 7, 2023 · 5 comments

Comments

@JamesYang007
Member

Not sure if this is what people call meta-analysis, but I raised this a long time ago and we might be getting closer to doing it:

I wanted to look at past cases of drugs that went through certain designs and see which drugs were effective and which were ineffective. Ultimately, I want to show that if practitioners had used imprint as a means of validating their design and had run the trials with the calibrated thresholds, we could have predicted which drugs were indeed effective or ineffective. Realistically, it's probably only possible to show which drugs were claimed to be effective but are actually ineffective; people don't publish about drugs that were declared ineffective but were actually effective.

I'm not sure the exact procedure we would go about this, but my thoughts are:

  • Collect past published studies on drugs claimed to be "effective" but that turned out to be ineffective in practice. We probably need to somehow infer the latter from other evidence and take it to be the ground truth.
  • See what designs were used for those drugs.
  • Validate the designs to show a tight upper bound on Type I Error (TIE).
  • Might be fun to note which designs ended up having significantly higher TIE than alpha.
  • Calibrate the designs, run them with the real data, and see which designs reject. Hopefully, we reject much less often (a rough sketch of the validate/calibrate step is below).

The story here is that if people went through our framework, fewer ineffective drugs would get through, because we would properly find a valid design.
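To make the validate/calibrate bullets concrete, here is a minimal sketch on a toy two-look (group-sequential) z-test. This is just an illustration of the idea, not imprint's actual API; the design, the null grid, and the thresholds are made-up assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, looks, n_sims = 0.025, (50, 100), 100_000

def max_z_over_looks(mu):
    """Simulate n_sims trials with true effect mu; return each trial's largest z across the looks."""
    x = rng.normal(mu, 1.0, size=(n_sims, looks[-1]))
    zs = np.stack([x[:, :n].mean(axis=1) * np.sqrt(n) for n in looks])
    return zs.max(axis=0)

# Composite null: every mu <= 0. Sweep a small grid to mimic validating over
# the whole null space (the worst point turns out to be the boundary mu = 0).
null_grid = np.linspace(-0.3, 0.0, 7)
max_z = {mu: max_z_over_looks(mu) for mu in null_grid}

# 1) Validate: with the naive fixed-sample threshold, worst-case TIE exceeds alpha.
z_naive = stats.norm.ppf(1 - alpha)                            # ~1.96
worst_tie = max((mz > z_naive).mean() for mz in max_z.values())
print(f"worst-case TIE at z={z_naive:.2f}: {worst_tie:.4f}")   # around 0.04 here

# 2) Calibrate: pick the threshold whose worst-case simulated TIE comes back
#    down to alpha; the trial would then be run against this stricter cutoff.
z_cal = max(np.quantile(mz, 1 - alpha) for mz in max_z.values())
print(f"calibrated threshold: z={z_cal:.2f}")                  # around 2.18 here
```

The point is just that a naive fixed-sample cutoff overshoots alpha once there are multiple looks, and the calibration step pulls the worst-case simulated TIE back under it.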

@tbenthompson
Member

Oh, this sounds kinda fun! One important question: are drug trials often near the threshold? I remember in the talks with Amgen that one of the folks said something like "we really don't want to be near p=0.025, and most of the time we only declare success if p is really tiny, like 0.001 or better." Obviously there are marginal cases, and that's why FDA guidance and regulation are important, but maybe those marginal cases are rare?

I'm not sure Mike gets emails for these issues, so let's pester him: @mikesklar

@JamesYang007
Member Author

JamesYang007 commented Feb 7, 2023

I do recall something like that now that you say it, but I didn't get it at the time. All I was thinking was: "you can make your p-value as small as you like by not considering the right part of the null space... if you told me you got a low p-value, I would just be skeptical about whether you even picked the right null points to compute it." There's still merit to studying this, imo. Let's say these people got 0.001. I bet if we validate over the whole null space, the largest p-value is way higher than 0.025. It's still something to call out (a toy illustration of the null-point issue is sketched below).
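A toy illustration of the "which null points did you pick" concern, assuming a made-up two-arm binomial comparison (the counts and the grid are invented, not from any real trial): the null here is composite (p_t = p_c = pi for any pi), so the p-value you report depends on which pi you evaluate it at, and a validation over the whole space would look at the supremum.

```python
import numpy as np

rng = np.random.default_rng(0)
n, x_treat, x_ctrl, n_sims = 20, 14, 7, 200_000

def p_value_at(pi):
    """Monte Carlo P(difference in response rates >= observed) at the null point p_t = p_c = pi."""
    t = rng.binomial(n, pi, n_sims) / n
    c = rng.binomial(n, pi, n_sims) / n
    return np.mean(t - c >= (x_treat - x_ctrl) / n)

pooled = (x_treat + x_ctrl) / (2 * n)   # convenient plug-in estimate of the nuisance rate
grid = np.linspace(0.01, 0.99, 99)      # ...vs. sweeping the whole null space

print("p-value at the plug-in null point:", p_value_at(pooled))
print("sup of p-value over the null grid:", max(p_value_at(pi) for pi in grid))
```

Reporting only the plug-in number understates the honest answer whenever the supremum over the composite null is attained at a different null point.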

@mikesklar
Contributor

mikesklar commented Feb 7, 2023

Notes:

- Meta-analysis usually refers to something else, where you have a group of studies that you're combining. Here we're actually analyzing the analysis itself, which might be closer to the Greek, but we'll have to call it something else. Maybe "analysis comparison".

- I think that if we revisit previous designs with our methods, we are more likely to identify missed-effective drugs (Type II errors) than missed-ineffective drugs (Type I errors), because when it's hard to establish accurate Type I Error control or do fancier stuff, the sponsor must either (1) use a more-conservative-than-necessary design with a large sample size, in which it's very unlikely to get a false positive, or (2) decline to run the design, or go slower, because the job can't be done at lower cost / with fewer patients.

- Per the above, our methods are unlikely to totally overturn things. But we can try to argue, "this design could have stopped X patients sooner with a more accurate Type I Error calibration."

- Re: Amgen stopping at a very small p-value: this decision-making attitude is about "at what point do we decide to lock in the stopping," because data can change unexpectedly due to collection failures or additional follow-up. There is still a rough regulatory target of 0.025 underneath it; there's just fear that you'll spin out of control at the last second. Nonetheless, if we offer savings/refunds for Type I Error, that's still real, since it basically corresponds to a sample size improvement of some constant fraction, no matter what (a back-of-the-envelope version of this is sketched after these notes).

- Re: calling people out for studying the wrong part of the space and therefore failing TIE control: this would be a lot easier if people would actually publish their freaking designs. Most of them don't... We should still analyze this to the extent that we can from the published literature.
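A back-of-the-envelope version of the "constant fraction, no matter what" point above, using the standard one-sided z-test sample-size formula (the alphas below are just example numbers): the required n scales like ((z_{1-alpha} + z_{1-beta}) / delta)^2, so the ratio of sample sizes under two different alphas doesn't depend on the effect size delta.

```python
from scipy import stats

def n_factor(alpha, beta=0.10):
    """Sample-size scale factor (z_{1-alpha} + z_{1-beta})^2; the effect size cancels in ratios."""
    return (stats.norm.ppf(1 - alpha) + stats.norm.ppf(1 - beta)) ** 2

ratio = n_factor(0.030) / n_factor(0.025)
print(f"n(alpha=0.030) / n(alpha=0.025) = {ratio:.3f}")  # ~0.95, i.e. ~5% fewer patients at 90% power
```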

@JamesYang007
Member Author

JamesYang007 commented Feb 7, 2023

> - I think that if we revisit previous designs with our methods, we are more likely to identify missed-effective drugs (Type II errors) than missed-ineffective drugs (Type I errors), because when it's hard to establish accurate Type I Error control or do fancier stuff, the sponsor must either (1) use a more-conservative-than-necessary design with a large sample size, in which it's very unlikely to get a false positive, or (2) decline to run the design, or go slower, because the job can't be done at lower cost / with fewer patients.

Wait, but then you're saying that truly effective drugs went through (1) or (2), and therefore people were not able to reject the null and say they're effective? I was under the impression that people wouldn't publish about drugs where they couldn't reject.

> this would be a lot easier if people would actually publish their freaking designs. Most of them don't... We should still analyze this to the extent that we can from the published literature.

The nightmare continues...

@mikesklar
Contributor

mikesklar commented Feb 7, 2023

You're right, the publication bias is a big problem for this!

There should be some government-funded studies with well-published protocols and results that we can look at. These probably don't use simulations for calibration at all, though.
