diff --git a/_articles/RJ-2023-026/RJ-2023-026.R b/_articles/RJ-2023-026/RJ-2023-026.R index 29905a6a9..e24e1ca6d 100644 --- a/_articles/RJ-2023-026/RJ-2023-026.R +++ b/_articles/RJ-2023-026/RJ-2023-026.R @@ -2,12 +2,12 @@ # Please edit RJ-2023-026.Rmd to modify this file ## ----setup, include=FALSE----------------------------------------------------- -knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE) +knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE, out.width="100%") library(ggplot2) #library(kableExtra) -## ----Fig0, fig.height = 12, fig.width=8, fig.cap = "The visual predictive check plot. The solid red line represents the $50^{th}$ percentile of the observed data, and dashed red lines represent the $10^{th}$ and $90^{th}$ percentiles of the observed data. The solid blue line represents the $50^{th}$ percentile of the simularted data, and dashed blue lines represent the $10^{th}$ and $90^{th}$ percentiles of the simulated data. Light blue and pink areas represent the 95% confidence areas of the $10^{th}$, $50^{th}$ and $90^{th}$ percentile lines."---- +## ----Fig0, fig.height = 12, fig.width=8, fig.cap = "The visual predictive check plot. The solid red line represents the $50^{th}$ percentile of the observed data, and dashed red lines represent the $10^{th}$ and $90^{th}$ percentiles of the observed data. The solid blue line represents the $50^{th}$ percentile of the simularted data, and dashed blue lines represent the $10^{th}$ and $90^{th}$ percentiles of the simulated data. Light blue and pink areas represent the 95\\% confidence areas of the $10^{th}$, $50^{th}$ and $90^{th}$ percentile lines."---- library(nlmeVPC) library(ggplot2) library(gridExtra) @@ -23,15 +23,15 @@ C=VPCgraph(origdata,simdata,N_xbin=8,type="CI")+ grid.arrange(A,B,C,nrow=3) -## ----Fig1, fig.height = 4, fig.width=8, fig.cap = "The additive equantile VPC plot. Dots indicate the observed data. 
The solid and dashed blue lines represent the $10^{th}$, $50^{th}$, and $90^{th}$ percentiles of the simulated data. The solid red line represents the $50^{th}$ percentile line. Light blue and pink areas represent the 95% confidence areas of the $10^{th}$, $50^{th}$ and $90^{th}$ percentile lines."---- +## ----Fig1, fig.height = 4, fig.width=8, fig.cap = "The additive equantile VPC plot. Dots indicate the observed data. The solid and dashed blue lines represent the $10^{th}$, $50^{th}$, and $90^{th}$ percentiles of the simulated data. The solid red line represents the $50^{th}$ percentile line. Light blue and pink areas represent the 95\\% confidence areas of the $10^{th}$, $50^{th}$ and $90^{th}$ percentile lines."---- aqrVPC(origdata,simdata) +labs(caption="") -## ----Fig2, fig.height = 4, fig.width=8, fig.cap = "The bootstrap VPC plot. Dots indicate the observed data. The solid and dashed blue lines represent the $10^{th}$, $50^{th}$, and $90^{th}$ percentiles of the simulated data. The solid red line represents the $50^{th}$ percentile line, and the pink areas represent the 95% confidence areas of the $50^{th}$ percentile line, calculated from the bootstrap samples of the observed data."---- +## ----Fig2, fig.height = 4, fig.width=8, fig.cap = "The bootstrap VPC plot. Dots indicate the observed data. The solid and dashed blue lines represent the $10^{th}$, $50^{th}$, and $90^{th}$ percentiles of the simulated data. The solid red line represents the $50^{th}$ percentile line, and the pink areas represent the 95\\% confidence areas of the $50^{th}$ percentile line, calculated from the bootstrap samples of the observed data."---- bootVPC(origdata,simdata,N_xbin=8) -## ----Fig3, fig.height = 8, fig.width=8,fig.cap = "The average shifted VPC plot. Dots indicate the observed data. The solid line represents the 50th quantiles of the observed data, and dashed lines represent the $10^{th}$ and $90^{th}$ percentiles of the observed data. 
Light blue and pink areas represent the 95% confidence areas of the $10^{th}$, $50^{th}$, and $90^{th}$ percentiles."---- +## ----Fig3, fig.height = 8, fig.width=8,fig.cap = "The average shifted VPC plot. Dots indicate the observed data. The solid line represents the 50th quantiles of the observed data, and dashed lines represent the $10^{th}$ and $90^{th}$ percentiles of the observed data. Light blue and pink areas represent the 95\\% confidence areas of the $10^{th}$, $50^{th}$, and $90^{th}$ percentiles."---- A=asVPC(origdata,simdata,type="CI",N_xbin=8,N_hist=3,weight_method="bin") +labs(caption="") B=asVPC(origdata,simdata,type="CI",N_xbin=8,N_hist=3,weight_method="distance")+labs(caption="") grid.arrange(A,B,nrow=2) @@ -41,7 +41,7 @@ grid.arrange(A,B,nrow=2) NumericalCheck(origdata,simdata,pred.level=c(0,0.2,0.4,0.6,0.8,0.9),N_xbin=8)$NPC -## ----Fig4, fig.height = 10, fig.width=6, fig.cap = "The coverage plot and the coverage detailed plot for the 80% prediction interval. In the coverage plot, the X-axis is the level of the prediction interval. The Y-axis is the ratio between the number of observed data and the number of expected data of the lower and upper parts in each level of the prediction interval. The white line is the reference line, and the gray area represents the confidence area of the ratios. If the solid lines are near the white line, we can conclude that the suggested model is suitable. In the coverage detailed plot, the white dots represent the expected percentages of lower and upper prediction intervals of, 10%, and 90%, respectively. The upper and lower percentages of observation in each time bin are darker gray."---- +## ----Fig4, out.width="90%", fig.height = 10, fig.width=6, fig.cap = "The coverage plot and the coverage detailed plot for the 80\\% prediction interval. In the coverage plot, the X-axis is the level of the prediction interval. 
The Y-axis is the ratio between the number of observed data and the number of expected data of the lower and upper parts in each level of the prediction interval. The white line is the reference line, and the gray area represents the confidence area of the ratios. If the solid lines are near the white line, we can conclude that the suggested model is suitable. In the coverage detailed plot, the white dots represent the expected percentages of lower and upper prediction intervals of, 10\\%, and 90\\%, respectively. The upper and lower percentages of observation in each time bin are darker gray."---- A=coverageplot(origdata,simdata,N_xbin=8) +ggtitle("(A) Coverage Plot") B=coverageDetailplot(origdata,simdata,N_xbin=8,predL=0.8) + @@ -256,13 +256,13 @@ B=coverageplot(origdata,simdata.F,conf.level=0.9,N_xbin=8)+labs(title="Model 2") grid.arrange(A,B,ncol=2) -## ----M16,fig.height = 3, fig.width=7.5,fig.cap = "The coverage detailed plots for Model 1 and Model 2 when PI=50%."---- +## ----M16,fig.height = 3, fig.width=7.5,fig.cap = "The coverage detailed plots for Model 1 and Model 2 when PI=50\\%."---- A=coverageDetailplot(origdata,simdata.T,predL=0.5,N_xbin=8)+labs(title="Model 1") B=coverageDetailplot(origdata,simdata.F,predL=0.5,N_xbin=8)+labs(title="Model 2") grid.arrange(A,B,ncol=2) -## ----M17, fig.height = 3, fig.width=7.5,fig.cap = "The coverage detailed plots for Model 1 and Model 2 when PI=80%."---- +## ----M17, fig.height = 3, fig.width=7.5,fig.cap = "The coverage detailed plots for Model 1 and Model 2 when PI=80\\%."---- A=coverageDetailplot(origdata,simdata.T,predL=0.8,N_xbin=8)+labs(title="Model 1") B=coverageDetailplot(origdata,simdata.F,predL=0.8,N_xbin=8)+labs(title="Model 2") grid.arrange(A,B,ncol=2) diff --git a/_articles/RJ-2023-026/RJ-2023-026.Rmd b/_articles/RJ-2023-026/RJ-2023-026.Rmd index 4fe0c65f6..adad4b536 100644 --- a/_articles/RJ-2023-026/RJ-2023-026.Rmd +++ b/_articles/RJ-2023-026/RJ-2023-026.Rmd @@ -29,7 +29,7 @@ author: orcid: 
0000-0003-0817-5000 type: package output: - rjtools::rjournal_article: + rjtools::rjournal_web_article: self_contained: yes toc: no bibliography: EunKyung_Lee.bib @@ -44,9 +44,8 @@ journal: --- - ```{r setup, include=FALSE} -knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE) +knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE, out.width="100%") library(ggplot2) #library(kableExtra) ``` @@ -97,7 +96,7 @@ As the number of bins decreases, the lines become smoother and more regular, how `VPCgraph` provides the automatic binning with `optK` and `makeCOVbin`; here, `optK` finds the optimal number of bins, and `makeCOVbin` finds the optimal cutoffs of bins using Lavielle and Bleakley's method. -```{r Fig0, fig.height = 12, fig.width=8, fig.cap = "The visual predictive check plot. The solid red line represents the $50^{th}$ percentile of the observed data, and dashed red lines represent the $10^{th}$ and $90^{th}$ percentiles of the observed data. The solid blue line represents the $50^{th}$ percentile of the simularted data, and dashed blue lines represent the $10^{th}$ and $90^{th}$ percentiles of the simulated data. Light blue and pink areas represent the 95% confidence areas of the $10^{th}$, $50^{th}$ and $90^{th}$ percentile lines."} +```{r Fig0, fig.height = 12, fig.width=8, fig.cap = "The visual predictive check plot. The solid red line represents the $50^{th}$ percentile of the observed data, and dashed red lines represent the $10^{th}$ and $90^{th}$ percentiles of the observed data. The solid blue line represents the $50^{th}$ percentile of the simularted data, and dashed blue lines represent the $10^{th}$ and $90^{th}$ percentiles of the simulated data. 
Light blue and pink areas represent the 95\\% confidence areas of the $10^{th}$, $50^{th}$ and $90^{th}$ percentile lines."} library(nlmeVPC) library(ggplot2) library(gridExtra) @@ -119,7 +118,7 @@ To overcome the difficulties of making bins as well as determining the number of @jamsen2018regression used additive quantile regression to calculate the quantiles of the observed and simulated data. This regression method makes it possible to estimate quantiles without discrete binning, which is especially useful when the data are insufficient, irregular, or inappropriate to configure the bins. To fit the additive quantile regression, we used the `rqss` function in the \CRANpkg{quantreg} [@quantreg] package and developed the `aqrVPC` function to draw the VPC type plot with additive quantile regression. Figure \@ref(fig:Fig1) shows the additive quantile regression VPC plot. The solid and dashed lines represent the $10^{th}$, $50^{th}$, and $90^{th}$ additive quantile regression lines of the observed data, and the pink and light blue areas represent the confidence areas of the additive quantile regression lines of the simulated data. Lines and areas in the additive quantile regression VPC plot are much smoother than those in the original VPC plot. -```{r Fig1, fig.height = 4, fig.width=8, fig.cap = "The additive equantile VPC plot. Dots indicate the observed data. The solid and dashed blue lines represent the $10^{th}$, $50^{th}$, and $90^{th}$ percentiles of the simulated data. The solid red line represents the $50^{th}$ percentile line. Light blue and pink areas represent the 95% confidence areas of the $10^{th}$, $50^{th}$ and $90^{th}$ percentile lines."} +```{r Fig1, fig.height = 4, fig.width=8, fig.cap = "The additive equantile VPC plot. Dots indicate the observed data. The solid and dashed blue lines represent the $10^{th}$, $50^{th}$, and $90^{th}$ percentiles of the simulated data. The solid red line represents the $50^{th}$ percentile line. 
Light blue and pink areas represent the 95\\% confidence areas of the $10^{th}$, $50^{th}$ and $90^{th}$ percentile lines."} aqrVPC(origdata,simdata) +labs(caption="") ``` @@ -131,7 +130,7 @@ This plot reflects the uncertainty of the observed data and allows for more obje Figure \@ref(fig:Fig2) shows the bootstrap VPC plot using `bootVPC`. The solid and dashed blue lines represent the $10^{th}$, $50^{th}$, and $90^{th}$ percentiles of the simulated data. The solid red line represents the $50^{th}$ percentile line, and the pink areas represent the 95$\%$ confidence areas of the $50^{th}$ percentile line, calculated from the bootstrap samples of the observed data. If the solid blue line and the solid red line are similar, the solid blue line is in the pink area, and the pink area is located between two dashed blue lines, then this is evidence that the fitted model fit the observed data well. -```{r Fig2, fig.height = 4, fig.width=8, fig.cap = "The bootstrap VPC plot. Dots indicate the observed data. The solid and dashed blue lines represent the $10^{th}$, $50^{th}$, and $90^{th}$ percentiles of the simulated data. The solid red line represents the $50^{th}$ percentile line, and the pink areas represent the 95% confidence areas of the $50^{th}$ percentile line, calculated from the bootstrap samples of the observed data."} +```{r Fig2, fig.height = 4, fig.width=8, fig.cap = "The bootstrap VPC plot. Dots indicate the observed data. The solid and dashed blue lines represent the $10^{th}$, $50^{th}$, and $90^{th}$ percentiles of the simulated data. The solid red line represents the $50^{th}$ percentile line, and the pink areas represent the 95\\% confidence areas of the $50^{th}$ percentile line, calculated from the bootstrap samples of the observed data."} bootVPC(origdata,simdata,N_xbin=8) ``` @@ -180,7 +179,7 @@ In the asVPC plot, the observations in each bin are combined using weights. 
Typi Figure \@ref(fig:Fig3) shows the results from the `asVPC` function using bin-related weights and distance-related weights. The solid and dashed lines represent the average shifted quantile lines of the observed data, and the pink and light blue areas represent the confidence areas of the simulated data. The lines in the asVPC plot are smoother than those in the original VPC plot, and the confidence areas in the asVPC plot are thinner than those in the original VPC plot. -```{r Fig3, fig.height = 8, fig.width=8,fig.cap = "The average shifted VPC plot. Dots indicate the observed data. The solid line represents the 50th quantiles of the observed data, and dashed lines represent the $10^{th}$ and $90^{th}$ percentiles of the observed data. Light blue and pink areas represent the 95% confidence areas of the $10^{th}$, $50^{th}$, and $90^{th}$ percentiles."} +```{r Fig3, fig.height = 8, fig.width=8,fig.cap = "The average shifted VPC plot. Dots indicate the observed data. The solid line represents the 50th quantiles of the observed data, and dashed lines represent the $10^{th}$ and $90^{th}$ percentiles of the observed data. Light blue and pink areas represent the 95\\% confidence areas of the $10^{th}$, $50^{th}$, and $90^{th}$ percentiles."} A=asVPC(origdata,simdata,type="CI",N_xbin=8,N_hist=3,weight_method="bin") +labs(caption="") B=asVPC(origdata,simdata,type="CI",N_xbin=8,N_hist=3,weight_method="distance")+labs(caption="") grid.arrange(A,B,nrow=2) @@ -228,7 +227,7 @@ Unlike the VPC plot, which represents the data space, the information in the obs Figure \@ref(fig:Fig4)(B) is the result of `coverageDetailplot` when the prediction level is 80$\%$. The white dots represent the expected percentages of the lower and upper the prediction intervals, 10$\%$, and 90$\%$, respectively. The upper and lower percentages of observation in each time bin are shown in darker gray. 
The left bin(before 0.045 hours) shows all light gray in the coverage detailed plot, and it is quite different patterns from the expected one. However, it is mainly due to the characteristics of this example data. All observations in this bin are 0. It makes the lower and upper bound of the prediction interval all 0, and the lower and upper percentages become 0. -```{r Fig4, fig.height = 10, fig.width=6, fig.cap = "The coverage plot and the coverage detailed plot for the 80% prediction interval. In the coverage plot, the X-axis is the level of the prediction interval. The Y-axis is the ratio between the number of observed data and the number of expected data of the lower and upper parts in each level of the prediction interval. The white line is the reference line, and the gray area represents the confidence area of the ratios. If the solid lines are near the white line, we can conclude that the suggested model is suitable. In the coverage detailed plot, the white dots represent the expected percentages of lower and upper prediction intervals of, 10%, and 90%, respectively. The upper and lower percentages of observation in each time bin are darker gray."} +```{r Fig4, out.width="90%", fig.height = 10, fig.width=6, fig.cap = "The coverage plot and the coverage detailed plot for the 80\\% prediction interval. In the coverage plot, the X-axis is the level of the prediction interval. The Y-axis is the ratio between the number of observed data and the number of expected data of the lower and upper parts in each level of the prediction interval. The white line is the reference line, and the gray area represents the confidence area of the ratios. If the solid lines are near the white line, we can conclude that the suggested model is suitable. In the coverage detailed plot, the white dots represent the expected percentages of lower and upper prediction intervals of, 10\\%, and 90\\%, respectively. 
The upper and lower percentages of observation in each time bin are darker gray."} A=coverageplot(origdata,simdata,N_xbin=8) +ggtitle("(A) Coverage Plot") B=coverageDetailplot(origdata,simdata,N_xbin=8,predL=0.8) + @@ -503,13 +502,13 @@ Figure \@ref(fig:M15) shows the `coverageplot` results for Model 1 and Model 2. The upper and lower percentages in both figures are close to the white points in Model 1. On the other hand, the upper percentages of the most time bins are far from the white points in Model 2, especially the time bin (3.54,5.28] when PI = 50$\%$. When PI = 80$\%$, most upper and lower percentages are far from the white points. -```{r M16,fig.height = 3, fig.width=7.5,fig.cap = "The coverage detailed plots for Model 1 and Model 2 when PI=50%."} +```{r M16,fig.height = 3, fig.width=7.5,fig.cap = "The coverage detailed plots for Model 1 and Model 2 when PI=50\\%."} A=coverageDetailplot(origdata,simdata.T,predL=0.5,N_xbin=8)+labs(title="Model 1") B=coverageDetailplot(origdata,simdata.F,predL=0.5,N_xbin=8)+labs(title="Model 2") grid.arrange(A,B,ncol=2) ``` -```{r M17, fig.height = 3, fig.width=7.5,fig.cap = "The coverage detailed plots for Model 1 and Model 2 when PI=80%."} +```{r M17, fig.height = 3, fig.width=7.5,fig.cap = "The coverage detailed plots for Model 1 and Model 2 when PI=80\\%."} A=coverageDetailplot(origdata,simdata.T,predL=0.8,N_xbin=8)+labs(title="Model 1") B=coverageDetailplot(origdata,simdata.F,predL=0.8,N_xbin=8)+labs(title="Model 2") grid.arrange(A,B,ncol=2) diff --git a/_articles/RJ-2023-026/RJ-2023-026.html b/_articles/RJ-2023-026/RJ-2023-026.html index 03fe18a6d..6fdbd4ba2 100644 --- a/_articles/RJ-2023-026/RJ-2023-026.html +++ b/_articles/RJ-2023-026/RJ-2023-026.html @@ -1,19 +1,19 @@ - + - - - + + + + /* Hide doc at startup (prevent jankiness while JS renders/transforms) */ + body { + visibility: hidden; + } + @@ -35,149 +35,150 @@ pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; } } pre.numberSource code -{ 
counter-reset: source-line 0; } + { counter-reset: source-line 0; } pre.numberSource code > span -{ position: relative; left: -4em; counter-increment: source-line; } + { position: relative; left: -4em; counter-increment: source-line; } pre.numberSource code > span > a:first-child::before -{ content: counter(source-line); -position: relative; left: -1em; text-align: right; vertical-align: baseline; -border: none; display: inline-block; --webkit-touch-callout: none; -webkit-user-select: none; --khtml-user-select: none; -moz-user-select: none; --ms-user-select: none; user-select: none; -padding: 0 4px; width: 4em; -color: #aaaaaa; -} -pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; } + { content: counter(source-line); + position: relative; left: -1em; text-align: right; vertical-align: baseline; + border: none; display: inline-block; + -webkit-touch-callout: none; -webkit-user-select: none; + -khtml-user-select: none; -moz-user-select: none; + -ms-user-select: none; user-select: none; + padding: 0 4px; width: 4em; + color: #aaaaaa; + } +pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; } div.sourceCode -{ color: #00769e; background-color: #f1f3f5; } + { color: #00769e; background-color: #f1f3f5; } @media screen { pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; } } -code span { color: #00769e; } -code span.al { color: #ad0000; } -code span.an { color: #5e5e5e; } -code span.at { color: #657422; } -code span.bn { color: #ad0000; } -code span.bu { } -code span.cf { color: #00769e; } -code span.ch { color: #20794d; } -code span.cn { color: #8f5902; } -code span.co { color: #5e5e5e; } -code span.cv { color: #5e5e5e; font-style: italic; } -code span.do { color: #5e5e5e; font-style: italic; } -code span.dt { color: #ad0000; } -code span.dv { color: #ad0000; } -code span.er { color: #ad0000; } -code span.ex { } -code span.fl { color: #ad0000; } -code span.fu { 
color: #4758ab; } -code span.im { } -code span.in { color: #5e5e5e; } -code span.kw { color: #00769e; } -code span.op { color: #5e5e5e; } -code span.ot { color: #00769e; } -code span.pp { color: #ad0000; } -code span.sc { color: #5e5e5e; } -code span.ss { color: #20794d; } -code span.st { color: #20794d; } -code span.va { color: #111111; } -code span.vs { color: #20794d; } -code span.wa { color: #5e5e5e; font-style: italic; } +code span { color: #00769e; } /* Normal */ +code span.al { color: #ad0000; } /* Alert */ +code span.an { color: #5e5e5e; } /* Annotation */ +code span.at { color: #657422; } /* Attribute */ +code span.bn { color: #ad0000; } /* BaseN */ +code span.bu { } /* BuiltIn */ +code span.cf { color: #00769e; } /* ControlFlow */ +code span.ch { color: #20794d; } /* Char */ +code span.cn { color: #8f5902; } /* Constant */ +code span.co { color: #5e5e5e; } /* Comment */ +code span.cv { color: #5e5e5e; font-style: italic; } /* CommentVar */ +code span.do { color: #5e5e5e; font-style: italic; } /* Documentation */ +code span.dt { color: #ad0000; } /* DataType */ +code span.dv { color: #ad0000; } /* DecVal */ +code span.er { color: #ad0000; } /* Error */ +code span.ex { } /* Extension */ +code span.fl { color: #ad0000; } /* Float */ +code span.fu { color: #4758ab; } /* Function */ +code span.im { } /* Import */ +code span.in { color: #5e5e5e; } /* Information */ +code span.kw { color: #00769e; } /* Keyword */ +code span.op { color: #5e5e5e; } /* Operator */ +code span.ot { color: #00769e; } /* Other */ +code span.pp { color: #ad0000; } /* Preprocessor */ +code span.sc { color: #5e5e5e; } /* SpecialChar */ +code span.ss { color: #20794d; } /* SpecialString */ +code span.st { color: #20794d; } /* String */ +code span.va { color: #111111; } /* Variable */ +code span.vs { color: #20794d; } /* VerbatimString */ +code span.wa { color: #5e5e5e; font-style: italic; } /* Warning */ nlmeVPC: Visual Model Diagnosis for the Nonlinear Mixed Effect Model - + + - - - - - + 
+ + + + - - - - + + + + - - - + + + - - - - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + + + + + - - - - - - - - - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + + + + + + + + + + + .l-screen .caption { + margin-left: 10px; + } - + .sidebar-section { + margin-bottom: 30px; + } - - - - - - - - - - - -`);class qr extends Rr(HTMLElement){}const Fr=function(){if(1>window.distillRunlevel)throw new Error('Insufficient Runlevel for Distill Template!');if('distillTemplateIsLoading'in window&&window.distillTemplateIsLoading)throw new Error('Runlevel 1: Distill Template is getting loaded more than once, aborting!');else window.distillTemplateIsLoading=!0,console.info('Runlevel 1: Distill Template has started loading.');p(document),console.info('Runlevel 1: Static Distill styles have been added.'),console.info('Runlevel 1->2.'),window.distillRunlevel+=1;for(const[e,t]of Object.entries(hi.listeners))'function'==typeof t?document.addEventListener(e,t):console.error('Runlevel 2: Controller listeners need to be functions!');console.info('Runlevel 2: We can now listen to controller events.'),console.info('Runlevel 2->3.'),window.distillRunlevel+=1;if(2>window.distillRunlevel)throw new Error('Insufficient Runlevel for adding custom elements!');const e=[ki,wi,Ci,Li,Ai,Di,Oi,Ni,Ri,Fi,pi,Hi,zi,T,Bi,Wi,Vi,Mr,$i].concat([Ir,jr,qr]);for(const t of e)console.info('Runlevel 2: Registering custom element: '+t.is),customElements.define(t.is,t);console.info('Runlevel 3: Distill Template finished registering custom elements.'),console.info('Runlevel 3->4.'),window.distillRunlevel+=1,hi.listeners.DOMContentLoaded(),console.info('Runlevel 4: Distill Template initialisation complete.')};window.distillRunlevel=0,yi.browserSupportsAllFeatures()?(console.info('Runlevel 0: No need for polyfills.'),console.info('Runlevel 0->1.'),window.distillRunlevel+=1,Fr()):(console.info('Runlevel 0: Distill Template is loading polyfills.'),yi.load(Fr))}); -//# 
sourceMappingURL=template.v2.js.map -} - + + + + + + + + + + + + d-byline .byline { + grid-template-columns: 2fr 2fr 2fr 2fr; + } + + d-byline .rjournal { + grid-column-end: span 2; + grid-template-columns: 1fr 1fr; + margin-bottom: 0; + } + + d-title h1, d-title p, d-title figure, + d-abstract p, d-abstract b { + grid-column: page; + } + + d-title .dt-tags { + grid-column: page; + } + + .dt-tags .dt-tag { + text-transform: lowercase; + } + + d-article h1 { + line-height: 1.1em; + } + + d-abstract p, d-article p { + text-align: justify; + } + + @media(min-width: 1000px) { + .d-contents.d-contents-float { + justify-self: end; + } + + nav.toc { + border-right: 1px solid rgba(0, 0, 0, 0.1); + border-right-width: 1px; + border-right-style: solid; + border-right-color: rgba(0, 0, 0, 0.1); + } + } + + .posts-list .dt-tags .dt-tag { + text-transform: lowercase; + } + + @keyframes highlight-target { + 0% { + background-color: #ffa; + } + 66% { + background-color: #ffa; + } + 100% { + background-color: none; + } + } + + d-article :target, d-appendix :target { + animation: highlight-target 3s; + } + + .header-section-number { + margin-right: 0.5em; + } + + d-appendix .citation-appendix, + .d-appendix .citation-appendix { + color: rgb(60, 60, 60); + } + + d-article h2 { + border-bottom: 0px solid rgba(0, 0, 0, 0.1); + padding-bottom: 0rem; + } + d-article h3 { + font-size: 20px; + } + d-article h4 { + font-size: 18px; + text-transform: none; + } + + @media (min-width: 1024px) { + d-article h2 { + font-size: 32px; + } + d-article h3 { + font-size: 24px; + } + d-article h4 { + font-size: 20px; + } + } + @@ -2540,6 +1769,7 @@

This article is in review.

nlmeVPC: Visual Model Diagnosis for the Nonlinear Mixed Effect Model

+

A nonlinear mixed effects model is useful when the data are repeatedly measured within the same unit or correlated between units. Such models are widely used in medicine, disease mechanics, pharmacology, ecology, social science, psychology, etc. After fitting the nonlinear mixed effect model, model diagnostics are essential for verifying that the results are reliable. The visual predictive check (VPC) has recently been highlighted as a visual diagnostic tool for pharmacometric models. This method can also be applied to general nonlinear mixed effects models. However, functions for VPCs in existing R packages are specialized for pharmacometric model diagnosis, and are not suitable for general nonlinear mixed effect models. In this paper, we propose nlmeVPC, an R package for the visual diagnosis of various nonlinear mixed effect models. The nlmeVPC package allows for more diverse model diagnostics, including visual diagnostic tools that extend the concept of VPCs along with the capabilities of existing R packages.

@@ -2552,17 +1782,17 @@

nlmeVPC: Visual Model Diagnosis for the Nonlinear Mixed Effect Model

, Eun-Kyung Lee (Ewha Womans University) -
2023-08-26 +
2023-08-26
-

1 Introduction

+

1 Introduction

After fitting a model, diagnosing the fitted model is essential for verifying that the results are reliable (Nguyen et al. 2017). For linear models, the residuals are usually used to determine the goodness of fit of the fitted model. However, due to random effects and nonlinearity, residuals are less useful for diagnosing fit in nonlinear mixed effects models. Therefore, various diagnostic tools for this type of model have been developed. The nonlinear mixed effect model is useful when the data are repeatedly measured within the same unit, or the relationship between the dependent and independent variables is nonlinear. It is widely used in various fields, including medicine, disease mechanics, pharmacology, ecology, pharmacometrics, social science, and psychology (Pinheiro and Bates 2000; Davidian 2017). Recently, among the various diagnostic tools applicable to nonlinear mixed models, simulation-based diagnostic methods have been developed in the field of pharmacology (Karlsson and Savic 2007). The visual predictive check (VPC; Karlsson and Holford (2008)) is a critical diagnostic tool that visually tests the adequacy of the fitted model. It allows for the diagnosis of fixed and random effects in mixed models (Karlsson and Holford 2008) in the original data space. Currently, it is a widely used method for diagnosing population pharmacometric models: Heus et al. (2022) used the VPC method to evaluate their population pharmacokinetic models of vancomycin, Mudde et al. (2022) checked their final population PK model for each regimen of antituberculosis drug using the VPC method, and Otto et al. (2021) compared the predictive performance of parent-metabolite models of (S)-ketamine with the VPC method. This VPC method can be used for general nonlinear mixed effect models, including hierarchical models.

The psn (Lindbom et al. 2004) and xpose4 (Keizer et al. 2013) packages provide various diagnostic methods for pharmacometric models in R (R Core Team 2023), including the VPC plot. These packages were developed for pharmacometricians, and users need the NONMEM software (Bauer 2011) to generate the inputs for the functions in these two packages. However, NONMEM is licensed software, and it is designed mainly for the analysis of population pharmacometric models. Therefore, it is not easy for non-pharmacometricians to use NONMEM and to draw the VPC plot through psn and xpose4. Recently, the vpc (Keizer 2021) package and the nlmixr2 (Fidler et al. 2019) package have been developed to draw VPC plots in R without results from NONMEM. However, the vpc package provides a function for drawing only the original VPC plot. nlmixr2 was developed initially for fitting general dynamic models. To check the fitted model, nlmixr2 provides a function that applies the graphical diagnostics in xpose4 to the nlmixr2 model object. nlmixr2 also draws the VPC plot from the vpc package with the nlmixr2 model object. Therefore, both packages provide only a function to draw the basic VPC plot, and the other newly developed simulation-based methods, including extensions of the VPC, are not provided.

We have developed a new R package, nlmeVPC, to provide a suite of visual checking methods for nonlinear mixed models. This package includes various state-of-the-art model diagnostic methods based on the visual comparison between the original and simulated data. Most methods compare the statistics calculated from the observed data to the statistics from the simulated data. Percentiles, for example, the \(10^{th}\), \(50^{th}\), and \(90^{th}\) percentiles, are widely used to summarize relevant statistics from the observed and simulated data. We compare the similarities between the statistics from the observed data and those from the simulated data in two different spaces: the data space and the model space (Wickham et al. 2015). The original data comprise the data space. Usually, the data space is represented by the independent and dependent variables, such as time and blood concentration in pharmacokinetic data. On the other hand, the model space is composed of quantities obtained from the fitted model, for example, residuals and summary values from the fitted model. From this viewpoint, we group the well-known visual diagnostic tools into two categories. One category compares the observed and simulated data in the original data space, and the other compares them in the model space. For the data space, we developed functions for the original VPC (VPCgraph), the additive quantile regression VPC (aqrVPC), and the bootstrap VPC (bootVPC). In addition, we propose a new VPC method: the average shifted VPC (asVPC). For the model space, the coverage plot (coverageplot) and the quantified VPC (quantVPC) are included. For a more detailed diagnosis, we developed a coverage detailed plot (coverageDetailplot). In nlmeVPC, the ggplot2 package (Wickham 2016) is used to create all plots.

-

2 Nonlinear mixed effect model

+

2 Nonlinear mixed effect model

The general nonlinear mixed effect model is defined as follows: \[ y_{ij} = f(x_{ij}, \theta, \eta_i, \epsilon_{ij}) \\ @@ -2570,12 +1800,12 @@

Here, \(y_{ij}\) is the dependent variable of the \(j^{th}\) observation of the \(i^{th}\) individual, \(x_{ij}\) is the independent variable, \(f\) is the nonlinear function of \(x_{ij}\), \(\theta\) is the population parameter, \(\eta_i\) represents the variability of the individual \(i\), and \(\epsilon_{ij}\) represents the random error. From the data, \(\theta\), \(\Omega\), and \(\Sigma\) are estimated. The simulated data are generated from the fitted model \(y_{ij} = f(x_{ij}, \hat{\theta}, \eta_i, \epsilon_{ij})\), with \(\eta_i \sim N(0,\hat\Omega)\) and \(\epsilon_{ij} \sim N(0, \hat\Sigma)\).
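As a minimal sketch of how simulated data can be drawn from a fitted model, the base R code below generates replicate datasets from a hypothetical one-compartment curve. The function `f_pk`, the parameter values, the scalar variances `omega_hat` and `sigma_hat`, and the additive error are all assumptions made for illustration; none of them is taken from nlmeVPC.

```r
set.seed(1)

# Hypothetical structural model: one-compartment oral absorption curve.
# theta = c(ka, ke, V); eta shifts log(ke) for each individual (assumed form).
f_pk <- function(time, theta, eta, dose = 100) {
  ka <- theta[["ka"]]
  ke <- theta[["ke"]] * exp(eta)
  V  <- theta[["V"]]
  dose * ka / (V * (ka - ke)) * (exp(-ke * time) - exp(-ka * time))
}

theta_hat <- c(ka = 1.5, ke = 0.1, V = 10)  # assumed estimates of theta
omega_hat <- 0.04                            # assumed variance of eta_i
sigma_hat <- 0.05                            # assumed variance of epsilon_ij

times <- c(0.5, 1, 2, 4, 8, 12, 24)
n_id  <- 12    # number of individuals
n_sim <- 100   # number of simulated replicates

simulate_once <- function() {
  # eta_i ~ N(0, omega_hat), one draw per individual
  eta <- rnorm(n_id, mean = 0, sd = sqrt(omega_hat))
  y <- sapply(eta, function(e) {
    # additive error epsilon_ij ~ N(0, sigma_hat) (assumed error structure)
    f_pk(times, theta_hat, e) + rnorm(length(times), 0, sqrt(sigma_hat))
  })
  as.vector(y)  # stacked by individual: all times for id 1, then id 2, ...
}

# One column per simulated replicate; rows align with sim_design.
sim_matrix <- replicate(n_sim, simulate_once())
sim_design <- data.frame(id   = rep(seq_len(n_id), each = length(times)),
                         time = rep(times, times = n_id))
```

Storing the replicates as columns that share one design keeps every simulated dataset aligned with the observed sampling times, which is convenient when percentiles are later computed per time bin.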

-

3 Validation in the data space

+

3 Validation in the data space

The VPC is the most popular model validation method in pharmacometrics. It was developed to diagnose population pharmacokinetic/pharmacodynamic models visually. The main idea of the VPC is to compare the distribution of observations with the distribution of predicted values, where the distribution of predicted values is obtained from simulated data drawn from the fitted model. If the fitted model explains the observed data well, these two distributions should be similar. Both distributions can be represented in the original data space, with the independent variable on the X axis and the dependent variable on the Y axis. This allows us to compare the observed data with the fitted model in the original data space. In nlmeVPC, we include the original VPC plot, the additive quantile regression VPC (Jamsen et al. 2018), and the bootstrap VPC (Post et al. 2008). We also propose a new approach to drawing the VPC: the average shifted VPC.

-

Visual Predictive Check

+

3.1 Visual Predictive Check

The visual predictive check (VPC; Karlsson and Holford (2008)) is based on the principle that if the fitted model adequately describes the observed data, the distribution of the simulated data from the fitted model should be similar to the distribution of the observed data. There are several ways to compare the two distributions. In the VPC approach, profiles of quantiles are used. Two profiles are mainly used to compare the distributions of observations and predictions: one from the upper bound of the prediction intervals and the other from the lower bound. These prediction intervals are calculated from the simulated data. A 90\(\%\) prediction interval is usually used; for small and sparse samples, an 80\(\%\) prediction interval is also used. The lower and upper bounds of the 80\(\%\) prediction interval are the \(10^{th}\) and \(90^{th}\) percentiles of the simulated data. Figure 1(A) shows the “scatter” type of the VPC plot. Dots indicate the observed data. Two dashed blue lines represent profiles of the \(10^{th}\) and \(90^{th}\) percentiles of the simulated data, and the solid blue line represents the \(50^{th}\) percentile. If the fitted model represents the observed data well, most observed data should lie between the profiles of the \(10^{th}\) and \(90^{th}\) percentiles.

Figure 1(B) is the “percentile” type of the VPC plot. In this plot, profiles of percentiles from the observed data are compared to profiles of percentiles from the simulated data. Two dashed red lines represent profiles of the \(10^{th}\) and \(90^{th}\) percentiles of the observed data, and the solid red line represents the profile of the \(50^{th}\) percentile of the observed data. If the fitted model represents the observed data well, the two profiles for each percentile, one from the original data and the other from the simulated data, should be similar.
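The percentile profiles behind the “scatter” and “percentile” views can be sketched in base R as follows. The toy data, the choice of one bin per sampling time, and the object names are assumptions for illustration; this is not the internal code of VPCgraph.

```r
set.seed(2)

# Toy data (assumed): 30 observations at each of 8 sampling times.
times <- c(0.5, 1, 2, 4, 6, 8, 12, 24)
curve_mean <- function(t) 10 * (exp(-0.1 * t) - exp(-1.5 * t))
obs <- data.frame(time = rep(times, each = 30))
obs$conc <- curve_mean(obs$time) * exp(rnorm(nrow(obs), 0, 0.3))

# 200 simulated replicates with the same design, one column per replicate.
sim <- replicate(200, curve_mean(obs$time) * exp(rnorm(nrow(obs), 0, 0.3)))

probs <- c(0.10, 0.50, 0.90)

# Here each sampling time is its own bin; irregular designs need real binning.
bin <- factor(obs$time)

# Observed profiles: one row per bin, one column per percentile.
obs_profile <- t(sapply(split(obs$conc, bin), quantile, probs = probs))

# Simulated profiles: pool all replicates within each bin.
sim_profile <- t(sapply(split(seq_len(nrow(obs)), bin), function(rows) {
  quantile(sim[rows, ], probs = probs)
}))

# "Scatter" check: share of observations inside the simulated 10th-90th band.
lower  <- sim_profile[as.character(bin), "10%"]
upper  <- sim_profile[as.character(bin), "90%"]
inside <- mean(obs$conc >= lower & obs$conc <= upper)
```

If the model is adequate, `inside` should be close to 0.8 for an 80% prediction interval, and the rows of `obs_profile` and `sim_profile` should track each other.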

Figure 1(C) is the “CI” type of the VPC plot. The solid red line represents the \(50^{th}\) percentile of the observed data, and dashed red lines represent the \(10^{th}\) and \(90^{th}\) percentiles of the observed data. Light blue areas represent the 95\(\%\) confidence areas of the \(10^{th}\) and \(90^{th}\) percentiles, and pink areas represent the 95\(\%\) confidence areas of the \(50^{th}\) percentile. These confidence areas are calculated from the simulated data. After calculating the percentiles in each simulated dataset, we find the 95\(\%\) confidence interval for each percentile and use these intervals to draw the areas. In this plot, it is necessary to verify that each percentile profile of the original data lies within the corresponding confidence area from the simulated data. If each percentile line of the observed data is in the corresponding confidence area, this can be evidence @@ -2585,37 +1815,37 @@

Visual Predictive Check

VPCgraph provides automatic binning through optK and makeCOVbin: optK finds the optimal number of bins, and makeCOVbin finds the optimal bin cutoffs using Lavielle and Bleakley’s method.

-The visual predictive check plot. The solid red line represents the $50^{th}$ percentile of the observed data, and dashed red lines represent the $10^{th}$ and $90^{th}$ percentiles of the observed data. The solid blue line represents the $50^{th}$ percentile of the simularted data, and dashed blue lines represent the $10^{th}$ and $90^{th}$ percentiles of the simulated data. Light blue and pink areas represent the 95% confidence areas of the $10^{th}$, $50^{th}$ and $90^{th}$ percentile lines. +The visual predictive check plot. The solid red line represents the $50^{th}$ percentile of the observed data, and dashed red lines represent the $10^{th}$ and $90^{th}$ percentiles of the observed data. The solid blue line represents the $50^{th}$ percentile of the simularted data, and dashed blue lines represent the $10^{th}$ and $90^{th}$ percentiles of the simulated data. Light blue and pink areas represent the 95\% confidence areas of the $10^{th}$, $50^{th}$ and $90^{th}$ percentile lines.

Figure 1: The visual predictive check plot. The solid red line represents the \(50^{th}\) percentile of the observed data, and dashed red lines represent the \(10^{th}\) and \(90^{th}\) percentiles of the observed data. The solid blue line represents the \(50^{th}\) percentile of the simulated data, and dashed blue lines represent the \(10^{th}\) and \(90^{th}\) percentiles of the simulated data. Light blue and pink areas represent the 95% confidence areas of the \(10^{th}\), \(50^{th}\) and \(90^{th}\) percentile lines.
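The confidence areas mentioned in the caption can be sketched in base R: percentiles are computed within each simulated replicate, and the 2.5% and 97.5% points of those percentiles across replicates give the band. The toy design, the replicate layout, and the object names are assumptions for illustration only.

```r
set.seed(3)

times      <- c(0.5, 1, 2, 4, 6, 8, 12, 24)
n_per_time <- 30
time_col   <- rep(times, each = n_per_time)
curve_mean <- function(t) 10 * (exp(-0.1 * t) - exp(-1.5 * t))

# Assumed layout: rows are observations, columns are simulated replicates.
sim <- replicate(200,
                 curve_mean(time_col) * exp(rnorm(length(time_col), 0, 0.3)))

probs <- c(0.10, 0.50, 0.90)
bins  <- split(seq_along(time_col), factor(time_col))

# For every bin: percentiles within each replicate, then the 2.5% and 97.5%
# points of those percentiles across replicates.
ci_list <- lapply(bins, function(rows) {
  per_rep <- apply(sim[rows, , drop = FALSE], 2, quantile, probs = probs)
  t(apply(per_rep, 1, quantile, probs = c(0.025, 0.975)))
})

# ci_list[["4"]] is a 3 x 2 matrix: rows 10%, 50%, 90%; columns 2.5%, 97.5%.
```

Plotting the two columns of each matrix against the bin position gives the lower and upper edges of the shaded area for the corresponding percentile.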

-

Additive quantile regression VPC

+

3.2 Additive quantile regression VPC

To overcome the difficulties of making bins as well as determining the number of bins, Jamsen et al. (2018) used additive quantile regression to calculate the quantiles of the observed and simulated data. This regression method makes it possible to estimate quantiles without discrete binning, which is especially useful when the data are insufficient, irregular, or inappropriate to configure the bins. To fit the additive quantile regression, we used the rqss function in the quantreg (Koenker 2023) package and developed the aqrVPC function to draw the VPC type plot with additive quantile regression. Figure 2 shows the additive quantile regression VPC plot. The solid and dashed lines represent the \(10^{th}\), \(50^{th}\), and \(90^{th}\) additive quantile regression lines of the observed data, and the pink and light blue areas represent the confidence areas of the additive quantile regression lines of the simulated data. Lines and areas in the additive quantile regression VPC plot are much smoother than those in the original VPC plot.
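A minimal sketch of bin-free percentile lines with `rqss` and `qss` from quantreg is shown below. The toy data, the smoothing value `lambda = 2`, and the prediction grid are assumptions for illustration; aqrVPC may choose these differently. The block is skipped when quantreg is not installed.

```r
set.seed(4)

# Toy data (assumed): irregular sampling times, so natural bins do not exist.
n   <- 300
dat <- data.frame(time = sort(runif(n, 0.5, 24)))
dat$conc <- 10 * (exp(-0.1 * dat$time) - exp(-1.5 * dat$time)) *
  exp(rnorm(n, 0, 0.3))

taus <- c(0.10, 0.50, 0.90)
# predict() for rqss fits does not extrapolate, so the grid stays inside
# the observed range of time.
grid <- data.frame(time = seq(1, 23, length.out = 50))

aqr_lines <- NULL
if (requireNamespace("quantreg", quietly = TRUE)) {
  library(quantreg)  # qss() must be visible when the formula is evaluated
  aqr_lines <- sapply(taus, function(tau) {
    # lambda controls smoothness; 2 is an arbitrary illustrative value
    fit <- rqss(conc ~ qss(time, lambda = 2), tau = tau, data = dat)
    as.numeric(predict(fit, newdata = grid))
  })
  colnames(aqr_lines) <- paste0(taus * 100, "%")
}
```

Applying the same fit to each simulated dataset and summarizing the fitted lines across replicates would give the confidence areas described above.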

-The additive equantile VPC plot.  Dots indicate the observed data. The solid and dashed blue lines represent the $10^{th}$, $50^{th}$, and $90^{th}$ percentiles of the simulated data. The solid red line represents the $50^{th}$ percentile line. Light blue and pink areas represent the 95% confidence areas of the $10^{th}$, $50^{th}$ and $90^{th}$ percentile lines. +The additive equantile VPC plot.  Dots indicate the observed data. The solid and dashed blue lines represent the $10^{th}$, $50^{th}$, and $90^{th}$ percentiles of the simulated data. The solid red line represents the $50^{th}$ percentile line. Light blue and pink areas represent the 95\% confidence areas of the $10^{th}$, $50^{th}$ and $90^{th}$ percentile lines.

Figure 2: The additive quantile regression VPC plot. Dots indicate the observed data. The solid and dashed blue lines represent the \(10^{th}\), \(50^{th}\), and \(90^{th}\) percentiles of the simulated data. The solid red line represents the \(50^{th}\) percentile line. Light blue and pink areas represent the 95% confidence areas of the \(10^{th}\), \(50^{th}\) and \(90^{th}\) percentile lines.

-

Bootstrap VPC

+

3.3 Bootstrap VPC

The bootstrap VPC (Post et al. 2008) compares the distribution of the simulated data to the distribution of the bootstrap samples drawn from the observed data. This plot reflects the uncertainty of the observed data and allows for more objective comparisons with the predicted median.

Figure 3 shows the bootstrap VPC plot using bootVPC. The solid and dashed blue lines represent the \(10^{th}\), \(50^{th}\), and \(90^{th}\) percentiles of the simulated data. The solid red line represents the \(50^{th}\) percentile line, and the pink areas represent the 95\(\%\) confidence areas of the \(50^{th}\) percentile line, calculated from the bootstrap samples of the observed data. If the solid blue line and the solid red line are similar, the solid blue line is in the pink area, and the pink area is located between the two dashed blue lines, then this is evidence that the fitted model fits the observed data well.
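The pink band for the observed median can be sketched with a simple bootstrap in base R. Resampling whole individuals with replacement is an assumption made here for illustration, as are the toy data and object names; bootVPC may resample differently.

```r
set.seed(5)

# Toy observed data (assumed): 20 individuals with common sampling times.
times <- c(0.5, 1, 2, 4, 8, 12, 24)
n_id  <- 20
obs   <- expand.grid(time = times, id = seq_len(n_id))
obs$conc <- 10 * (exp(-0.1 * obs$time) - exp(-1.5 * obs$time)) *
  exp(rnorm(nrow(obs), 0, 0.3))

n_boot <- 500
by_id  <- split(obs, obs$id)

# Each bootstrap sample redraws whole individuals with replacement (assumed
# scheme), then records the median concentration at every time point.
boot_medians <- replicate(n_boot, {
  picked    <- sample(seq_len(n_id), n_id, replace = TRUE)
  resampled <- do.call(rbind, unname(by_id[picked]))
  as.vector(tapply(resampled$conc, resampled$time, median))
})

# 95% band for the observed median: one row per time point.
median_band <- t(apply(boot_medians, 1, quantile, probs = c(0.025, 0.975)))
obs_median  <- as.vector(tapply(obs$conc, obs$time, median))
```

The rows of `median_band` correspond to the sorted sampling times, so they can be drawn as a ribbon around `obs_median` and compared with the simulated median line.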

-The bootstrap VPC plot. Dots indicate the observed data. The solid and dashed blue lines represent the $10^{th}$, $50^{th}$, and $90^{th}$ percentiles of the simulated data. The solid red line represents the $50^{th}$ percentile line, and the pink areas represent the 95% confidence areas of the $50^{th}$ percentile line, calculated from the bootstrap samples of the observed data. +The bootstrap VPC plot. Dots indicate the observed data. The solid and dashed blue lines represent the $10^{th}$, $50^{th}$, and $90^{th}$ percentiles of the simulated data. The solid red line represents the $50^{th}$ percentile line, and the pink areas represent the 95\% confidence areas of the $50^{th}$ percentile line, calculated from the bootstrap samples of the observed data.

Figure 3: The bootstrap VPC plot. Dots indicate the observed data. The solid and dashed blue lines represent the \(10^{th}\), \(50^{th}\), and \(90^{th}\) percentiles of the simulated data. The solid red line represents the \(50^{th}\) percentile line, and the pink areas represent the 95% confidence areas of the \(50^{th}\) percentile line, calculated from the bootstrap samples of the observed data.

-

Average shifted VPC

+

3.4 Average shifted VPC

Even though binning mitigates the problem with highly irregular data, the VPC plot still has a precision problem with sparse data. In this paper, we propose a new approach that adapts the VPC plot using the idea of average shifted histograms (Scott 1985). A histogram is a widely used method for displaying the density of a single continuous variable. However, histograms can look quite different depending on the choice of bin width and anchor. This requires computing an optimal bin width. @@ -2640,7 +1870,7 @@

Average shifted VPC

  • Collect the medians of the independent variable and the weighted percentiles of the dependent variable from step 2, and connect them with lines.

  • We can implement (A) and (B) by applying these procedures separately to the observed and simulated data. Additionally, (C) can be implemented using these procedures for each simulated dataset. First, we find the weighted percentiles, combine the results from each simulated dataset, and then calculate the 95\(\%\) confidence intervals of each percentile. Using these three quantities (A), (B), and (C), we can draw the VPC type plot with the ASH approach, producing the asVPC plot.
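The weighted percentiles used in these steps can be sketched in base R as follows. The definition of the weighted percentile (the smallest value whose cumulative normalized weight reaches the requested probability) and the triangular weight are hypothetical choices for illustration; they are not necessarily the definitions used by asVPC.

```r
# Weighted percentile: smallest value whose cumulative normalized weight
# reaches the requested probability (one of several possible definitions).
weighted_quantile <- function(x, w, probs) {
  ord   <- order(x)
  x     <- x[ord]
  cum_w <- cumsum(w[ord]) / sum(w)
  # small tolerance guards against floating point error in the cumulative sum
  sapply(probs, function(p) x[which(cum_w >= p - 1e-12)[1]])
}

# Hypothetical triangular weight: 1 at the bin center, falling linearly to 0
# at a distance of half_width from it.
triangular_weight <- function(x, center, half_width) {
  pmax(0, 1 - abs(x - center) / half_width)
}

set.seed(6)
time <- runif(200, 0, 24)
conc <- 10 * exp(-0.1 * time) * exp(rnorm(200, 0, 0.2))

# Weighted 10th, 50th, and 90th percentiles for a bin centered at time 6.
w    <- triangular_weight(time, center = 6, half_width = 3)
keep <- w > 0
asvpc_point <- weighted_quantile(conc[keep], w[keep], probs = c(0.1, 0.5, 0.9))

# With equal weights the result reduces to an ordinary inverse-ECDF quantile.
equal_check <- weighted_quantile(c(1, 2, 3, 4), rep(1, 4), probs = 0.5)
```

Repeating the weighted calculation over a sequence of bin centers, and over each simulated dataset, would give the lines and confidence areas of a VPC-type plot.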

    -

    Determining the weights

    +

    Determining the weights

    In the asVPC plot, the observations in each bin are combined using weights. Typically, the data near the center of the integrated bin have higher weights, and the data far from the center have smaller weights. This idea is used in the ASH algorithm as well as the density estimation literature. We suggest two different ways to apply weights for the asVPC calculation.