[1] "Created: Wed Apr 1 16:35:44 2015"
See the introduction for an overview.
library(lme4)
library(ggplot2)
options(digits=5,show.signif.stars=FALSE)
Load in and plot the data:
data(abrasion, package="faraway")
abrasion
run position material wear
1 1 1 C 235
2 1 2 D 236
3 1 3 B 218
4 1 4 A 268
5 2 1 A 251
6 2 2 B 241
7 2 3 D 227
8 2 4 C 229
9 3 1 D 234
10 3 2 C 273
11 3 3 A 274
12 3 4 B 226
13 4 1 B 195
14 4 2 A 270
15 4 3 C 230
16 4 4 D 225
ggplot(abrasion,aes(x=material, y=wear, shape=run, color=position))+geom_point(position = position_jitter(width=0.1, height=0.0))
mmod <- lmer(wear ~ material + (1|run) + (1|position), abrasion)
print(summary(mmod),correlation=FALSE)
Linear mixed model fit by REML ['lmerMod']
Formula: wear ~ material + (1 | run) + (1 | position)
Data: abrasion
REML criterion at convergence: 100.3
Scaled residuals:
Min 1Q Median 3Q Max
-1.090 -0.302 0.027 0.422 1.210
Random effects:
Groups Name Variance Std.Dev.
run (Intercept) 66.9 8.18
position (Intercept) 107.1 10.35
Residual 61.3 7.83
Number of obs: 16, groups: run, 4; position, 4
Fixed effects:
Estimate Std. Error t value
(Intercept) 265.75 7.67 34.7
materialB -45.75 5.53 -8.3
materialC -24.00 5.53 -4.3
materialD -35.25 5.53 -6.4
anova(mmod)
Analysis of Variance Table
Df Sum Sq Mean Sq F value
material 3 4621 1540 25.1
No inferential output is provided (as expected)
Construct confidence intervals using the parametric bootstrap:
confint(mmod, method="boot")
2.5 % 97.5 %
sd_(Intercept)|run 0.0000 16.418
sd_(Intercept)|position 0.0000 19.511
sigma 3.6584 11.830
(Intercept) 251.0119 281.808
materialB -57.7385 -35.538
materialC -34.5255 -12.980
materialD -45.4872 -24.729
We can apply the Kenward-Roger adjustment to test the fixed effect like this:
library(pbkrtest)
mmod <- lmer(wear ~ material + (1|run) + (1|position), abrasion,REML=FALSE)
nmod <- lmer(wear ~ 1+ (1|run) + (1|position), abrasion,REML=FALSE)
KRmodcomp(mmod, nmod)
F-test with Kenward-Roger approximation; computing time: 0.16 sec.
large : wear ~ material + (1 | run) + (1 | position)
small : wear ~ 1 + (1 | run) + (1 | position)
stat ndf ddf F.scaling p.value
Ftest 25.1 3.0 6.0 1 0.00085
We see strong evidence of a difference between materials.
The RLRsim
package cannot be used here as there is more than one random effect. We can implement the parametric bootstrap directly to test whether there is any variance in the runs (for example)
First we compute the observed LRT statistic
rmod <- lmer(wear ~ material + (1|position), abrasion,REML=FALSE)
(olrt <- 2*(logLik(mmod)-logLik(rmod)))
'log Lik.' 5.4164 (df=7)
Now we resample 1000 times:
lrstat <- numeric(1000)
for(i in 1:1000){
rwear <- unlist(simulate(rmod))
bmod <- refit(mmod,rwear)
smod <- refit(rmod,rwear)
lrstat[i] <- 2*(logLik(bmod)-logLik(smod))
}
mean(lrstat > olrt)
[1] 0.036
This gives a p-value close to 0.05 so we might run this a lot more bootstrap samples if we cared deeply about this hypothesis test.
library(lmerTest)
mmod <- lmer(wear ~ material + (1|run) + (1|position), abrasion,REML=FALSE)
print(summary(mmod),correlation=FALSE)
Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
approximations to degrees of freedom [merModLmerTest]
Formula: wear ~ material + (1 | run) + (1 | position)
Data: abrasion
AIC BIC logLik deviance df.resid
134.3 139.7 -60.2 120.3 9
Scaled residuals:
Min 1Q Median 3Q Max
-1.3040 -0.3777 0.0011 0.5490 1.4368
Random effects:
Groups Name Variance Std.Dev.
run (Intercept) 61.4 7.84
position (Intercept) 91.1 9.54
Residual 41.1 6.41
Number of obs: 16, groups: run, 4; position, 4
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 265.75 6.96 8.52 38.20 7.9e-11
materialB -45.75 4.53 8.87 -10.09 3.7e-06
materialC -24.00 4.53 8.87 -5.29 0.00052
materialD -35.25 4.53 8.87 -7.77 3.0e-05
anova(mmod)
Analysis of Variance Table of type 3 with Satterthwaite
approximation for degrees of freedom
Sum Sq Mean Sq NumDF DenDF F.value Pr(>F)
material 4621 1540 3 8.87 37.5 2.3e-05
We prefer the Kenward-Rogers (seen above) to the Satterthwaite adjustment shown here.
Can test the random effects:
rand(mmod)
Analysis of Random effects Table:
Chi.sq Chi.DF p.value
run 3.05 1 0.08
position 4.59 1 0.03
but we prefer the parametric bootstrap result because of problems with asymptotic distribution of the LRT in such cases.
May also be useful to have the pairwise difference of the fixed effects:
difflsmeans(mmod)
Differences of LSMEANS:
Estimate Standard Error DF t-value Lower CI Upper CI
material A - B 45.8 4.53 8.9 10.09 35.469 56.031
material A - C 24.0 4.53 8.9 5.29 13.719 34.281
material A - D 35.2 4.53 8.9 7.77 24.969 45.531
material B - C -21.8 4.53 8.9 -4.80 -32.031 -11.469
material B - D -10.5 4.53 8.9 -2.32 -20.781 -0.219
material C - D 11.2 4.53 8.9 2.48 0.969 21.531
p-value
material A - B <2e-16
material A - C 5e-04
material A - D <2e-16
material B - C 0.001
material B - D 0.046
material C - D 0.035
All pairwise difference appear significant.
We use the expanded prior for the reasons established in our one random effect example. In contrast to the previous nested example the labeling of the factors already indicates a crossed design for the random effects.
library(MCMCglmm)
eprior <- list(R=list(V=1,nu=0.02),G=list(G1=list(V=1,nu=0.02,alpha.V=1000),G2=list(V=1,nu=0.02,alpha.V=1000)))
bmod <- MCMCglmm(wear ~ material,random=~run+position,data=abrasion,verbose=FALSE,prior=eprior,pr=TRUE)
xyplot(log(bmod$VCV))
The diagnostics are satisfactory - can also check the fixed effect terms and autocorrelations.
summary(bmod)
Iterations = 3001:12991
Thinning interval = 10
Sample size = 1000
DIC: 124.95
G-structure: ~run
post.mean l-95% CI u-95% CI eff.samp
run 176 4.76e-05 639 1000
~position
post.mean l-95% CI u-95% CI eff.samp
position 300 0.00213 912 1000
R-structure: ~units
post.mean l-95% CI u-95% CI eff.samp
units 122 18.8 322 825
Location effects: wear ~ material
post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept) 265.35 242.46 290.82 1000 <0.001
materialB -45.71 -63.88 -30.85 1000 0.002
materialC -23.93 -39.98 -8.55 1000 0.016
materialD -35.35 -51.61 -20.44 1127 <0.001
There is strong evidence of differences between the materials. The credible intervals for the variance components for run and position are quite wide so we are less sure about these effects.
We can view the posterior distributions for the SDs corresponding to the three components of variation:
library(reshape2)
ref <- data.frame(bmod$VCV)
rdf <- melt(ref, value.name="wear")
ggplot(rdf, aes(x=sqrt(wear),color=variable))+geom_density()
We see that the error SD distribution is more compact. We see that run and position variances could be quite small or relatively much larger. It is not surprising that we cannot narrow these down more as we have only four runs and four positions.
We can examine the posterior distributions for the material contrasts:
ref <- data.frame(bmod$Sol[,2:4])
colnames(ref) <- c("B-A","C-A","D-A")
rdf <- melt(ref, value.name="wear", variable.name="material")
ggplot(rdf, aes(x=wear, color=material))+geom_density()
We see that B, C, and D all show clearly less wear than A because all three difference distributions are concentrated well-below zero. B appears best in showing least wear but we are not sure that it is better than D. There is not much overlap between B and C. However, we would need to plot these contrasts to be certain about this.
We can also plot the posteriors for the random effects:
ref <- data.frame(bmod$Sol[,5:8])
rdf <- melt(ref, value.name="wear", variable.name="run")
ggplot(rdf, aes(x=wear, color=run))+geom_density()
We can see our previous uncertainty about whether there really is a difference between the runs expressed in the overlapping posterior distributions.
Check on versions of R packages used:
sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] reshape2_1.2.2 MCMCglmm_2.21 ape_3.1-4 coda_0.16-1
[5] lattice_0.20-29 lmerTest_2.0-20 pbkrtest_0.4-1 MASS_7.3-33
[9] ggplot2_0.9.3.1 lme4_1.1-7 Rcpp_0.11.3 Matrix_1.1-4
[13] knitr_1.6 rmarkdown_0.2.68
loaded via a namespace (and not attached):
[1] bitops_1.0-6 boot_1.3-11 caTools_1.16
[4] cluster_1.15.2 colorspace_1.2-4 corpcor_1.6.7
[7] dichromat_2.0-0 digest_0.6.4 evaluate_0.5.3
[10] formatR_0.10 Formula_1.1-1 gdata_2.13.3
[13] gplots_2.13.0 grid_3.1.1 gtable_0.1.2
[16] gtools_3.3.1 Hmisc_3.14-3 htmltools_0.2.4
[19] KernSmooth_2.23-12 labeling_0.2 latticeExtra_0.6-26
[22] minqa_1.2.3 munsell_0.4.2 nlme_3.1-117
[25] nloptr_1.0.4 numDeriv_2012.9-1 parallel_3.1.1
[28] plyr_1.8.1 proto_0.3-10 RColorBrewer_1.0-5
[31] scales_0.2.3 splines_3.1.1 stringr_0.6.2
[34] survival_2.37-7 tensorA_0.36 tools_3.1.1
[37] yaml_2.1.11