inference with iMRMC - R package - numerical score outcome #174
I think you are trying to analyze the mean score given modality A (Abar) minus the mean score given modality B (Bbar). I ran the code that you posted. The first two elements of the result were the mean and the variance:
The standard error of the difference is approximately 0.2 = sqrt(0.04). Assuming your data are approximately normal, the mean and standard error can be used to estimate the lower and upper limits of the confidence interval.
I hope this is what you want.
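In R, the step above could be sketched as follows. The variance 0.04 comes from this thread; the point estimate 0.10 is a hypothetical placeholder, since the actual output was not included in the comment:

```r
# Normal-approximation 95% CI for the A-minus-B mean difference.
# The variance 0.04 is from the discussion above; the point
# estimate 0.10 is a hypothetical placeholder.
est <- 0.10
v <- 0.04
se <- sqrt(v)                              # standard error, 0.2
ci <- est + c(-1, 1) * qnorm(0.975) * se   # lower and upper 95% limits
print(ci)
```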
Thanks for the quick reply, Brandon. I set set.seed(123), ran the analysis with uStat11.jointD, computed the normal-approximation CIs with the calculated variance, and got
Running the analysis with the lmer package as a mixed model gives CIs (profile likelihood) that are about 1/3 smaller.
Using a standard t.test gives
Taking the t-distribution and adjusting by sqrt(n) gives smaller CIs, as expected. The question is now: which ones would be reasonable to use? (Here: 5 readers and 40 observations.)
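For reference, the widening from swapping the normal quantile for a t quantile with readers-minus-one degrees of freedom (one of the options discussed in this thread, not an endorsement of it) can be seen directly:

```r
# Compare the 97.5% quantile of the standard normal with that of a
# t distribution with 5 - 1 = 4 degrees of freedom (5 readers).
z <- qnorm(0.975)            # approx 1.96
t4 <- qt(0.975, df = 5 - 1)  # approx 2.78, noticeably wider
ratio <- t4 / z              # factor by which the CI half-width grows
print(c(z = z, t4 = t4, ratio = ratio))
```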
OK, I ran several simulations now with more readers (10) and more cases (n = 120); then the results are more consistent between the t-test, lmer, and the normal approximation. Would it be legitimate to calculate the confidence intervals with the same formula for a study that is not fully crossed, for example a split-plot design or a randomized design? To be more conservative, one could apply t-distributed CIs instead of normal ones, with the number of readers minus 1 as the degrees of freedom, or? Many thanks.
Hi Michael, I would have to say we are moving into statistical consulting rather than software debugging. If you don't mind, please email me; I'd like to know who I am working with: [email protected]
I think you are on the wrong path. I have concerns about the methods you are comparing and the conclusions you are making. I shared your comments with my colleague, Si Wen (@SiWen314), and she concurs. Below I share her responses to my questions about your comments.
BDG: Is the lmer function treating readers and the interaction term as random effects? It seems they are being treated the same as modalityID, which is a fixed effect. This would explain the smaller confidence intervals.
SW: The function
However, even if the function is written to treat both readerID and the interaction terms as random effects, like the following:
Another issue for this lmer function is that it does not account for case variability. The ANOVA model in the OR method only shows the modality, reader, and reader x modality effects. However, the dependent variable in the OR method is the AUC for each reader-modality combination, not the raw score, and the error term in the OR method includes the case variability, includes the correlations, and is managed explicitly. For this reason, the OR method cannot be executed with standard ANOVA software, and the lmer function does not account for case variability.
BDG: Do you understand the closing question? The equation at the end is wrong because the author is dividing by sqrt(40). The variance output from
SW: I agree with you that it does not make sense to divide the standard error by sqrt(40). Also, it seems that the author tries to use the quantile of the t distribution, qt(0.975, 5-1), to replace 1.96. It is not reasonable to use the degrees of freedom for the readers, 5-1, as the degrees of freedom for the difference score, because the degrees of freedom for the difference score is a complex function of the degrees of freedom of the readers, cases, and interaction terms. I think the author tries to manipulate the U-stat result to be similar to what is produced from
BDG: Could your ANOVA-based software do this analysis better than the U-stat estimates?
SW: Basically, what the author did is calculate the confidence interval of the mean difference, which is also available via the iMRMC::laBRBM function in the iMRMC package. The laBRBM function uses uStat11.conditionalD, but the results are the same, which is the same as is produced by the U-statistics approach. My ANOVA-based method gives exactly the same result.
BDG: I need a summary about where my U-stat estimate compares well with yours and where it fails.
SW: When the study is fully crossed, the results from the U-stat method and the ANOVA method are the same. When the study is not fully crossed, but the number of cases read by each reader is the same and the readings are paired across the two modalities, the relative difference in the variance estimation is about 2%. When the study does not keep the number of cases the same across readers, the results from the two methods are very different.
Thanks for the long response, Brandon, especially the part about the linear mixed model. That clarifies the main points on our side. Also, I didn't know about the other ANOVA functions from SW that are provided here. As our study is not fully crossed, but the number of cases read by each reader is the same and the readings are paired across the two modalities, I expect to be on the safe side using the normal approximation for the CIs. We will run some simulations with both methods, the U-stat method and the ANOVA functions/SW method provided in the iMRMC package, and compare the results to see the differences there. Also, thanks for providing the contact email for future consulting. KR
We would be happy to hear what your simulations show. Please report back. We would also like to better know who you are. That is why I shared my email, so you can share your full name and affiliation. Si has submitted her work that treats arbitrary study designs and is awaiting the journal's response. She might be willing to share that and the corresponding software, though it might depend on the review status. Brandon |
Thanks Brandon, I will share more details in a direct email to you; I work for a larger pharma company.
Hi,
we have a question regarding the usage of the R package.
Our experiment is a random study design with 100 scans and 9 readers, where each scan is assessed by a block of 3 readers randomly selected out of the 9. The outcome variable is a score of scan quality from 1-100. It is a paired design because the same 3 readers assess the scans before and after processing the image through an AI system.
We found that this package may be of use for our random design with a paired A and B test and multiple readers.
We used the sample code from the package, as below, to test the functionality. The output of uStat11.jointD and uStat11.conditionalD gives us the means and variances as well as the moments and coefficients.
Our question would now be: how do we get inference for A - B between the scores in this example? We want to compute confidence intervals. We haven't found information about this in the paper and are unsure how to get from here to the confidence intervals described in the Obuchowski & Rockette and Hillis papers. Any help, or links to papers that can lead us from the estimates of these functions to the calculation of the proper confidence interval of the difference of mean scores between the A and B modalities, would be appreciated.
Thanks
Michael Blin
$moments
c0r0 c1r0 c0r1 c1r1
AB 0.9898495 1.550436 1.067417 2.084773
CD 1.0220998 1.801375 1.041513 2.286694
ABminusCD 1.0262877 1.405781 1.013911 1.662502
$coeff
c0r0 c1r0 c0r1 c1r1
AB -0.22 0.02 0.195 0.005
CD -0.22 0.02 0.195 0.005
ABminusCD -0.22 0.02 0.195 0.005
library(iMRMC)

# Simulate an MRMC data set from the Roe & Metz model with default settings
simRoeMetz.config <- sim.gRoeMetz.config()
df.MRMC <- sim.gRoeMetz(simRoeMetz.config)

# Reshape the simulated data into a long-format data frame
df <- undoIMRMCdf(df.MRMC)

# Keep only the signal-present ("pos") cases
df <- droplevels(df[grepl("pos", df$caseID), ])

# U-statistic analysis of the mean score difference between the two modalities
result.jointD.identity <- uStat11.jointD(
  df,
  kernelFlag = 1,
  keyColumns = c("readerID", "caseID", "modalityID", "score"),
  modalitiesToCompare = c("testA", "testB"))

cat("\n")
cat("uStat11.jointD.identity \n")
print(result.jointD.identity[1:10])
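Following the reply above, here is a hedged sketch of turning the uStat11.jointD output into a confidence interval. The list structure and names below are a stub mirroring the thread's description (first element holds the means, second the variances); the numbers are hypothetical except for the variance 0.04 mentioned above:

```r
# Stub mirroring the structure described in the thread; with real data,
# take the first two elements of the uStat11.jointD() result instead.
result <- list(
  mean = c(AB = 1.02, CD = 0.92, ABminusCD = 0.10),  # hypothetical values
  var  = c(AB = 0.03, CD = 0.03, ABminusCD = 0.04))  # 0.04 from the thread

est <- result$mean["ABminusCD"]
se  <- sqrt(result$var["ABminusCD"])
ci <- est + c(-1, 1) * qnorm(0.975) * se  # normal-approximation 95% CI
print(ci)
```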