Skip to content

Commit

Permalink
0.2.1
Browse files Browse the repository at this point in the history
  • Loading branch information
PharmCat committed Jan 27, 2021
1 parent 12907e9 commit bc51a68
Show file tree
Hide file tree
Showing 18 changed files with 270 additions and 85 deletions.
2 changes: 1 addition & 1 deletion Project.toml
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
name = "Metida"
uuid = "a1dec852-9fe5-11e9-361f-8d9fde67cfa2"
authors = ["Vladimir Arnautov <[email protected]>"]
version = "0.2.0"
version = "0.2.1"

[deps]
CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"
Expand Down
5 changes: 5 additions & 0 deletions change.log
Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
- 0.2.1
* StatsBase methods
* ARMA
* docs

- 0.2.0
* many changes in VarianceType, covariance structure
* code cleaning
Expand Down
36 changes: 32 additions & 4 deletions docs/src/details.md
Original file line number Diff line number Diff line change
@@ -1,25 +1,53 @@
## Details

### V
#### Model

In matrix notation a mixed effect model can be represented as:

```math
y = X\beta + Zu + \epsilon
```

#### V

```math
V_{i} = Z_{i}GZ_i'+R_{i}
```

### REML
#### Henderson's «mixed model equations»

```math
\begin{pmatrix}X'R^{-1}X&X'R^{-1}Z\\Z'R^{-1}X&Z'R^{-1}Z+G_{-1}\end{pmatrix} \begin{pmatrix}\widehat{\beta} \\ \widehat{u} \end{pmatrix}= \begin{pmatrix}X'R^{-1}y\\Z'R^{-1}y\end{pmatrix}
```

The solution to the mixed model equations is a maximum likelihood estimate when the distribution of the errors is normal. PROC MIXED in SAS / MIXED SPSS used restricted maximum likelihood (REML) approach by default. REML equation by: Henderson, 1959; Laird et.al. 1982; Jennrich 1986; Lindstrom & Bates, 1988; Gurka et.al 2006.

#### REML

```math
logREML(\theta,\beta) = -\frac{N-p}{2} - \frac{1}{2}\sum_{i=1}^nlog|V_{i}|-
-\frac{1}{2}log|\sum_{i=1}^nX_i'V_i^{-1}X_i|-\frac{1}{2}\sum_{i=1}^n(y_i - X_{i}\beta)'V_i^{-1}(y_i - X_{i}\beta)
```

### Beta
#### Beta (β)

```math
\beta = {(\sum_{i=1}^n X_{i}'V_i^{-1}X_{i})}^{-1}(\sum_{i=1}^n X_{i}'V_i^{-1}y_{i})
```

### Sweep
#### F

```math
F = \frac{\beta'L'(LCL')^{-1}L\beta}{rank(LCL')}
```

#### Variance covariance matrix of β

```math
C = (\sum_{i=1}^{n} X_i'V_i^{-1}X_i)^{-1}
```

#### Sweep

Details see: https://github.com/joshday/SweepOperator.jl
38 changes: 17 additions & 21 deletions docs/src/examples.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,11 +2,10 @@

```
using Metida, StatsBase, StatsModels, CSV, DataFrames
df = CSV.File(dirname(pathof(Metida))*"\\..\\test\\csv\\df0.csv") |> DataFrame
df0 = CSV.File(dirname(pathof(Metida))*"\\..\\test\\csv\\df0.csv") |> DataFrame
# EXAMPLE 1
################################################################################
################################################################################
# EXAMPLE 1
################################################################################
#=
PROC MIXED data=df0;
Expand All @@ -15,16 +14,15 @@ MODEL var = sequence period formulation/ DDFM=SATTERTH s;
RANDOM formulation/TYPE=CSH SUB=subject G V;
REPEATED/GRP=formulation SUB=subject R;
RUN;
REML: 10.06523862
=#
lmm = LMM(@formula(var ~ sequence + period + formulation), df;
lmm = LMM(@formula(var ~ sequence + period + formulation), df0;
random = VarEffect(@covstr(formulation), CSH),
repeated = VarEffect(@covstr(formulation), VC),
repeated = VarEffect(@covstr(formulation), DIAG),
subject = :subject)
fit!(lmm)
#=
Linear Mixed Model: var ~ sequence + period + formulation
Random 1:
Expand Down Expand Up @@ -61,9 +59,8 @@ Repeated formulation: 1 var 0.0206445
Repeated formulation: 2 var 0.0422948
=#
# EXAMPLE 2
################################################################################
################################################################################
# EXAMPLE 2
################################################################################
#=
PROC MIXED data=df0;
Expand All @@ -77,9 +74,9 @@ REML: 16.06148160
=#
lmm = LMM(
@formula(var ~ sequence + period + formulation), df;
@formula(var ~ sequence + period + formulation), df0;
random = VarEffect(@covstr(formulation), SI),
repeated = VarEffect(@covstr(formulation), VC),
repeated = VarEffect(@covstr(formulation), DIAG),
subject = :subject,
)
fit!(lmm)
Expand Down Expand Up @@ -117,9 +114,9 @@ Repeated formulation: 1 var 0.0210784
Repeated formulation: 2 var 0.048761
=#
#EXAMPLE 3
################################################################################
################################################################################
# EXAMPLE 3
################################################################################
#=
PROC MIXED data=df0;
Expand All @@ -131,7 +128,7 @@ RUN;
REML: 10.86212458
=#
lmm = LMM(@formula(var ~ sequence + period + formulation), df;
lmm = LMM(@formula(var ~ sequence + period + formulation), df0;
random = VarEffect(@covstr(subject), SI)
)
fit!(lmm)
Expand Down Expand Up @@ -168,9 +165,8 @@ Random 1 Var var 0.168571
Repeated Var var 0.0333559
=#
#EXAMPLE 4
################################################################################
################################################################################
# EXAMPLE 4
################################################################################
#=
PROC MIXED data=df0;
Expand All @@ -184,7 +180,7 @@ REML: 25.12948063
=#
lmm = LMM(
@formula(var ~ sequence + period + formulation), df;
random = [VarEffect(@covstr(period), VC), VarEffect(@covstr(formulation), VC)]
random = [VarEffect(@covstr(period), VC), VarEffect(@covstr(formulation), DIAG)]
)
fit!(lmm)
Expand Down Expand Up @@ -232,7 +228,7 @@ Repeated Var var 0.177845
#EXAMPLE 5
################################################################################
################################################################################
# EXAMPLE 5
################################################################################
#=
PROC MIXED data=df0;
Expand All @@ -244,8 +240,8 @@ RUN;
REML: 16.24111264
=#
lmm = LMM(@formula(var ~ sequence + period + formulation), df;
random = VarEffect(@covstr(formulation), VC),
lmm = LMM(@formula(var ~ sequence + period + formulation), df0;
random = VarEffect(@covstr(formulation), DIAG),
subject = :subject)
fit!(lmm)
Expand Down Expand Up @@ -285,7 +281,7 @@ Repeated Var var 0.0342502
#EXAMPLE 6
################################################################################
################################################################################
# EXAMPLE 6
################################################################################
using MixedModels
Expand Down
64 changes: 52 additions & 12 deletions docs/src/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,32 +6,72 @@ CurrentModule = Metida

## Mixed Models

Metida.jl is a Julia package for fitting mixed-effects models with flexible covariance structure. At this moment package is in early development stage.
Metida.jl is a Julia package for fitting mixed-effects models with flexible covariance structure.

Main goal to make reproducible output corresponding to SAS/SPSS.

Now implemented covariance structures:
Implemented covariance structures:

* Scaled Identity (SI)
* Variance Components / Diagonal (VC)
* Diagonal (DIAG)
* Autoregressive (AR)
* Heterogeneous Autoregressive (ARH)
* Compound Symmetry (CS)
* Heterogeneous Compound Symmetry (CSH)
* Autoregressive Moving Average (ARMA)

### Usage

Usage:
#### Model construction

`LMM(model, data; subject = nothing, random = nothing, repeated = nothing)`

where
where:

* `model` is a fixed-effect model (`@formula`), example: `@formula(var ~ sequence + period + formulation)`

* `random` vector of random effects or single random effect. Effect can be declared like this: `VarEffect(@covstr(formulation), CSH)`. `@covstr` is a effect model: `@covstr(formulation)`. `CSH` is a CovarianceType structure. Premade constants: SI, DIAG, AR, ARH, CS, CSH, ARMA.

* `repeated` is a repeated effect (only single).

* `subject` is a block-diagonal factor.

#### Fitting

```
fit!(lmm::LMM{T};
solver::Symbol = :default,
verbose = :auto,
varlinkf = :exp,
rholinkf = :sigm,
aifirst::Bool = false,
g_tol::Float64 = 1e-12,
x_tol::Float64 = 1e-12,
f_tol::Float64 = 1e-12,
hcalck::Bool = true,
init = nothing,
io::IO = stdout)
```

where:

* `solver` - :default / :nlopt / :cuda

* `verbose` - :auto / 1 / 2 / 3

* `varlinkf` - not implemented

* `rholinkf` - not implemented

* `g_tol` - absolute tolerance in the gradient

* `x_tol` - absolute tolerance of theta vector

`model` is a fixed-effect model (`@formula`), example: `@formula(var ~ sequence + period + formulation)`
* `f_tol` - absolute tolerance in changes of the REML

`random` vector of random effects or single random effect. Effect can be declared like this: `VarEffect(@covstr(formulation), CSH)`. `@covstr` is a effect model: `@covstr(formulation)`. `CSH` is a CovarianceType structure. Premade constants: SI, VC, AR, ARH, CSH.
* `hcalck` - calculate REML Hessian

`repeated` is a repeated effect (only single).
* `init` - initial theta values

`subject` is a block-diagonal factor.
* `io` - uotput IO

```@contents
Pages = [
Expand All @@ -43,4 +83,4 @@ Depth = 3

See also:

https://github.com/JuliaStats/MixedModels.jl
[MixedModels.jl](https://github.com/JuliaStats/MixedModels.jl)
4 changes: 2 additions & 2 deletions docs/src/ref.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,8 @@ More:

* Henderson, C. R., et al. “The Estimation of Environmental and Genetic Trends from Records Subject to Culling.” Biometrics, vol. 15, no. 2, 1959, pp. 192–218. JSTOR, www.jstor.org/stable/2527669.

* Laird, Nan M., and James H. Ware. “Random-Effects Models for Longitudinal Data.” Biometrics, vol. 38, no. 4, 1982, pp. 963–974. JSTOR, www.jstor.org/stable/2529876.

* Lindstrom & J.; Bates, M. (1988). Newton—Raphson and EM Algorithms for Linear Mixed-Effects Models for Repeated-Measures Data. Journal of the American Statistical Association. 83. 1014. 10.1080/01621459.1988.10478693.

* Gurka, Matthew. (2006). Selecting the Best Linear Mixed Model under REML. The American Statistician. 60. 19-26. 10.1198/000313006X90396.
Expand Down Expand Up @@ -45,8 +47,6 @@ More:

### And more

* Laird, Nan M., and James H. Ware. “Random-Effects Models for Longitudinal Data.” Biometrics, vol. 38, no. 4, 1982, pp. 963–974. JSTOR, www.jstor.org/stable/2529876.

* Giesbrecht, F. G., and Burns, J. C. (1985), "Two-Stage Analysis Based on a Mixed Model: Large-sample Asymptotic Theory and Small-Sample Simulation Results," Biometrics, 41, 853-862.

* Jennrich, R., & Schluchter, M. (1986). Unbalanced Repeated-Measures Models with Structured Covariance Matrices. Biometrics, 42(4), 805-820. doi:10.2307/2530695
Expand Down
51 changes: 38 additions & 13 deletions docs/src/validation.md
Original file line number Diff line number Diff line change
@@ -1,14 +1,18 @@
## Validation

### sleepstudy.csv

| Model | REML Metida | REML SPSS|
|--------|--------|-------|
| 1 | 1729.4925602367025 | 1729.492560 |
| 2 | 1904.3265170722132 | 1904.327 |
| 3 | 1772.0953251997046 | 1772.095 |
| 4 | 1730.1895427398322 | 1730.189543 |
### REML result table

| Model | DataSet | REML Metida | REML SPSS|
|--------|--------|--------|-------|
| 1 | sleepstudy.csv | 1729.4925602367025 | 1729.492560 |
| 2 | sleepstudy.csv | 1904.3265170722132 | 1904.327 |
| 3 | sleepstudy.csv | 1772.0953251997046 | 1772.095 |
| 4 | sleepstudy.csv | 1730.1895427398322 | 1730.189543 |
| 5 | pastes.csv | 246.99074585348623 | 246.990746 |
| 6 | pastes.csv | 246.81895071012508 | 246.818951 |
| 7 | penicillin.csv | 330.86058899109184 | 330.860589 |

### sleepstudy.csv

#### Model 1

Expand Down Expand Up @@ -59,11 +63,7 @@ lmm = Metida.LMM(@formula(Reaction~1), df;
```
```

### Pastes.csv

| Model | REML Metida | REML SPSS|
|--------|--------|-------|
| 5 | 246.99074585348623 | 246.990746 |
### pastes.csv

#### Model 5

Expand All @@ -77,6 +77,31 @@ Metida.fit!(lmm)
```
```

#### Model 6

```
lmm = Metida.LMM(@formula(strength~1), df;
random = Metida.VarEffect(Metida.@covstr(cask), Metida.ARMA, subj = :batch),
)
Metida.fit!(lmm)
```

```
```

### penicillin.csv

#### Model 7

```
lmm = Metida.LMM(@formula(diameter~1), df;
random = [Metida.VarEffect(Metida.SI, subj = :plate), Metida.VarEffect(Metida.SI, subj = :sample)]
)
Metida.fit!(lmm)
```

```
```
### Reference dataset

* Schütz, H., Labes, D., Tomashevskiy, M. et al. Reference Datasets for Studies in a Replicate Design Intended for Average Bioequivalence with Expanding Limits. AAPS J 22, 44 (2020). https://doi.org/10.1208/s12248-020-0427-6
Expand Down
Loading

2 comments on commit bc51a68

@PharmCat
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JuliaRegistrator
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Registration pull request created: JuliaRegistries/General/28795

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v0.2.1 -m "<description of version>" bc51a68b457ecd2ebcc1c98b5416b939edc54d38
git push origin v0.2.1

Please sign in to comment.