[1] "Created: Wed Apr  1 16:47:20 2015"

See the introduction for an overview.

library(lme4)
library(ggplot2)
options(digits=5,show.signif.stars=FALSE)

Load in and summarize the data:

data(vision, package="faraway")
vision$npower <- rep(1:4,14)   # numeric version of power, used for plotting
summary(vision)
     acuity     power       eye     subject     npower    
 Min.   : 94   6/6 :14   left :28   1:8     Min.   :1.00  
 1st Qu.:110   6/18:14   right:28   2:8     1st Qu.:1.75  
 Median :115   6/36:14              3:8     Median :2.50  
 Mean   :113   6/60:14              4:8     Mean   :2.50  
 3rd Qu.:118                        5:8     3rd Qu.:3.25  
 Max.   :124                        6:8     Max.   :4.00  
                                    7:8                   

Plot the data:

ggplot(vision, aes(y=acuity, x=npower, linetype=eye))+geom_line()+facet_wrap(~ subject)

[Figure: acuity against power for each subject, with line type distinguishing left and right eyes]

Fit the model:

mmod <- lmer(acuity~power+(1|subject)+(1|subject:eye),vision)
print(summary(mmod),corr=FALSE)
Linear mixed model fit by REML ['lmerMod']
Formula: acuity ~ power + (1 | subject) + (1 | subject:eye)
   Data: vision

REML criterion at convergence: 328.7

Scaled residuals: 
   Min     1Q Median     3Q    Max 
-3.424 -0.323  0.011  0.441  2.466 

Random effects:
 Groups      Name        Variance Std.Dev.
 subject:eye (Intercept) 10.3     3.21    
 subject     (Intercept) 21.5     4.64    
 Residual                16.6     4.07    
Number of obs: 56, groups:  subject:eye, 14; subject, 7

Fixed effects:
            Estimate Std. Error t value
(Intercept)  112.643      2.235    50.4
power6/18      0.786      1.540     0.5
power6/36     -1.000      1.540    -0.6
power6/60      3.286      1.540     2.1
anova(mmod)
Analysis of Variance Table
      Df Sum Sq Mean Sq F value
power  3    141    46.9    2.83

We get estimates but, as expected, no p-values or other inferential output.

Parametric bootstrap-based confidence intervals:

confint(mmod, method="boot")
                               2.5 %   97.5 %
sd_(Intercept)|subject:eye   0.00000   5.3654
sd_(Intercept)|subject       0.00000   8.0743
sigma                        3.16913   4.9847
(Intercept)                108.27293 117.2798
power6/18                   -2.11527   3.9079
power6/36                   -3.98635   2.1837
power6/60                    0.31575   6.5202
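
Profile likelihood intervals are an alternative. A sketch (not run here; profiling can be slow and may warn when a variance estimate approaches the zero boundary):

# profile likelihood intervals for all parameters
confint(mmod, method="profile")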

pbkrtest package

We execute the Kenward-Roger (KR) adjusted F-test for the power fixed effect:

library(pbkrtest)
mmod <- lmer(acuity~power+(1|subject)+(1|subject:eye),vision,REML=FALSE)
nmod <- lmer(acuity~1+(1|subject)+(1|subject:eye),vision,REML=FALSE)
KRmodcomp(mmod, nmod)
F-test with Kenward-Roger approximation; computing time: 0.16 sec.
large : acuity ~ power + (1 | subject) + (1 | subject:eye)
small : acuity ~ 1 + (1 | subject) + (1 | subject:eye)
       stat   ndf   ddf F.scaling p.value
Ftest  2.83  3.00 39.00         1   0.051

The p-value just exceeding 5% is interesting since it contrasts with previous results falling just below 5%. In truth, the result is borderline either way but, of course, which side of the fence you land on will matter greatly to some. We might also try the parametric bootstrap.
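
The pbkrtest package makes this easy via PBmodcomp. A sketch (not run here; nsim=1000 is an arbitrary choice, and larger values give a more precise p-value):

# parametric bootstrap comparison of the same two nested models
pbtest <- PBmodcomp(mmod, nmod, nsim=1000)
summary(pbtest)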

Testing the random effects

The RLRsim package cannot be used because there is more than one random effect.

We can implement the parametric bootstrap directly to test whether there is any variance in the eye-within-subject random effect term.

First we compute the observed LRT statistic:

mmod <- lmer(acuity~power+(1|subject)+(1|subject:eye),vision,REML=FALSE)
rmod <- lmer(acuity~power+(1|subject),vision,REML=FALSE)
(olrt <- 2*(logLik(mmod)-logLik(rmod)))
'log Lik.' 6.9608 (df=7)

Now we resample 1000 times and compute the p-value:

lrstat <- numeric(1000)
for(i in 1:1000){
    # simulate a response from the null model (no eye-within-subject term)
    racuity <- unlist(simulate(rmod))
    # refit both models to the simulated response
    bmod <- refit(mmod, racuity)
    smod <- refit(rmod, racuity)
    # LRT statistic for this bootstrap sample
    lrstat[i] <- 2*(logLik(bmod)-logLik(smod))
}
mean(lrstat > olrt)
[1] 0.003
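
We can gauge the Monte Carlo error in this p-value using the usual binomial approximation (a quick sketch):

# standard error of the bootstrap p-value estimate
phat <- mean(lrstat > olrt)
sqrt(phat*(1-phat)/length(lrstat))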

It looks like there is some variation between the eyes within subjects.

lmerTest package

library(lmerTest)
mmod <- lmer(acuity~power+(1|subject)+(1|subject:eye),vision)
print(summary(mmod),corr=FALSE)
Linear mixed model fit by REML t-tests use Satterthwaite approximations
  to degrees of freedom [merModLmerTest]
Formula: acuity ~ power + (1 | subject) + (1 | subject:eye)
   Data: vision

REML criterion at convergence: 328.7

Scaled residuals: 
   Min     1Q Median     3Q    Max 
-3.424 -0.323  0.011  0.441  2.466 

Random effects:
 Groups      Name        Variance Std.Dev.
 subject:eye (Intercept) 10.3     3.21    
 subject     (Intercept) 21.5     4.64    
 Residual                16.6     4.07    
Number of obs: 56, groups:  subject:eye, 14; subject, 7

Fixed effects:
            Estimate Std. Error      df t value Pr(>|t|)
(Intercept)  112.643      2.235   8.800   50.40  3.7e-12
power6/18      0.786      1.540  39.000    0.51    0.613
power6/36     -1.000      1.540  39.000   -0.65    0.520
power6/60      3.286      1.540  39.000    2.13    0.039
anova(mmod)
Analysis of Variance Table of type 3  with  Satterthwaite 
approximation for degrees of freedom
      Sum Sq Mean Sq NumDF DenDF F.value Pr(>F)
power    141    46.9     3    39    2.83  0.051

We can test the random effects:

rand(mmod)
Analysis of Random effects Table:
            Chi.sq Chi.DF p.value
subject       2.96      1    0.09
subject:eye   5.98      1    0.01

However, we prefer the parametric bootstrap result because of problems with the asymptotic distribution of the LRT in such cases.

Pairwise differences may also be of interest:

difflsmeans(mmod)
Differences of LSMEANS:
                  Estimate Standard Error   DF t-value Lower CI Upper CI p-value
power 6/6 - 6/18      -0.8          1.540 39.0   -0.51    -3.90    2.329   0.613
power 6/6 - 6/36       1.0          1.540 39.0    0.65    -2.12    4.115   0.520
power 6/6 - 6/60      -3.3          1.540 39.0   -2.13    -6.40   -0.171   0.039
power 6/18 - 6/36      1.8          1.540 39.0    1.16    -1.33    4.901   0.253
power 6/18 - 6/60     -2.5          1.540 39.0   -1.62    -5.62    0.615   0.113
power 6/36 - 6/60     -4.3          1.540 39.0   -2.78    -7.40   -1.171   0.008

MCMCglmm

The syntax of MCMCglmm requires that nested factors be uniquely labeled so that, in this case, the left eye of subject one is distinguished from the left eye of subject two (as these are different eyes!). We use the expanded prior for the reasons established in our one random effect example.

library(MCMCglmm)
vision$eyesub <- paste0(vision$eye,vision$subject)  # unique label for each eye
eprior <- list(R=list(V=1,nu=0.02),
               G=list(G1=list(V=1,nu=0.02,alpha.V=1000),
                      G2=list(V=1,nu=0.02,alpha.V=1000)))
bmod <- MCMCglmm(acuity ~ power, random=~subject+eyesub, data=vision,
                 verbose=FALSE, prior=eprior, pr=TRUE)
xyplot(log(bmod$VCV))

[Figure: trace plots of the logged variance components from the MCMC run]

The diagnostics are satisfactory.
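
As a numerical supplement to the trace plots, we can check effective sample sizes with coda (loaded with MCMCglmm); values near the nominal 1000 indicate good mixing. A sketch:

# effective sample sizes of the variance component chains
effectiveSize(bmod$VCV)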

summary(bmod)

 Iterations = 3001:12991
 Thinning interval  = 10
 Sample size  = 1000 

 DIC: 334.26 

 G-structure:  ~subject

        post.mean l-95% CI u-95% CI eff.samp
subject      30.7  0.00383     98.3     1000

               ~eyesub

       post.mean l-95% CI u-95% CI eff.samp
eyesub      17.9  0.00336       47     1000

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units        18     11.1     27.6     1000

 Location effects: acuity ~ power 

            post.mean l-95% CI u-95% CI eff.samp  pMCMC
(Intercept)  112.6641 107.8443 118.1216     1000 <0.001
power6/18      0.7473  -2.3682   3.8104     1000  0.638
power6/36     -0.9847  -4.3784   2.0154      813  0.536
power6/60      3.2479  -0.0462   6.0731      625  0.046

The results are quite similar to the lme4 fit. We see that the highest power level is the only one with much evidence of a difference from the rest.
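
As a quick numerical check of that similarity (a sketch), we can set the posterior means of the MCMC variance components beside the REML estimates from the earlier fit:

# posterior means of the variances from the MCMC fit
colMeans(bmod$VCV)
# REML estimates (printed as standard deviations) from the lmer fit
VarCorr(mmod)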

We can examine the posterior distributions of the random effects terms.

library(reshape2)
ref <- data.frame(sqrt(bmod$VCV))
rdf <- melt(ref, value.name="acuity")
ggplot(rdf, aes(x=acuity,color=variable))+geom_density()

[Figure: posterior densities of the three random effect SDs]

These form the usual pattern where the error SD is more precisely estimated. We see there is slightly more variation between subjects than between eyes within subjects. Perhaps one might expect the latter to be significantly smaller but we don’t quite see that in this fit.
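
We can put a number on that comparison (a sketch; the column names are as printed in bmod$VCV):

# posterior probability that the between-subject variance exceeds the
# between-eyes-within-subject variance
mean(bmod$VCV[,"subject"] > bmod$VCV[,"eyesub"])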

Let’s look at the posterior distributions for the contrasts with the lowest power level:

ref <- data.frame(bmod$Sol[,2:4])
colnames(ref) <- c("18-6","36-6","60-6")
rdf <- melt(ref, value.name="acuity", variable.name="power")
ggplot(rdf, aes(x=acuity, color=power))+geom_density()

[Figure: posterior densities of the contrasts with the 6/6 power level]

We see that only the 6/60 vs. 6/6 contrast has a distribution that indicates a clear difference from zero.

It is also interesting to look at the posterior distributions of the random effects. We focus on the eye effects and label whether eye is left or right.

ref <- data.frame(bmod$Sol)
rdf <- melt(ref[,grep("eye",colnames(ref))],value.name="acuity",variable.name="eyesub")
rdf$hand <- substring(rdf$eyesub,8,8)  # 8th character is "l" (left) or "r" (right)
ggplot(rdf, aes(x=acuity,group=eyesub, color=hand))+geom_density()

[Figure: posterior densities of the eye-within-subject effects, colored by left/right eye]

We see there is not much distinction between left and right eyes (unlike hands). We also see one case where there is a large difference. We can figure out which case this is:

posterior.mode(bmod$Sol)
  (Intercept)     power6/18     power6/36     power6/60     subject.1 
  112.9392405     0.5903626    -1.2241856     4.0310278     3.6405395 
    subject.2     subject.3     subject.4     subject.5     subject.6 
    0.0667272     3.9021509    -0.1296614    -0.1785706    -7.5305802 
    subject.7  eyesub.left1  eyesub.left2  eyesub.left3  eyesub.left4 
   -0.0033437     3.0636185    -0.0716572     0.0239942     0.2224728 
 eyesub.left5  eyesub.left6  eyesub.left7 eyesub.right1 eyesub.right2 
   -0.3153598     1.5774931    -0.4528449     0.2237597    -1.6250469 
eyesub.right3 eyesub.right4 eyesub.right5 eyesub.right6 eyesub.right7 
    2.0656315     0.8727361     0.0602976    -6.3729642    -1.3415210 

We see that it corresponds to the right eye of subject 6, a problem we noticed earlier. Either there is a problem with the measurement of this eye or the subject's eyes are quite asymmetrical.

Predicted values

The Bayesian approach provides a predictive distribution rather than just a point estimate. We can summarise the distribution with a posterior mean. As with the lme4 approach, there are different kinds of predicted values depending on how we use the random effects.

Suppose we were interested in a new individual - not one of the subjects in the study. In this case, we need to marginalize out the subject and eye effects:

vision$pv0 <- predict(bmod,marginal=~subject+eyesub)
p <- ggplot(vision, aes(y=acuity, x=npower, linetype=eye))+geom_line()+facet_wrap(~ subject)
p+geom_line(aes(y=pv0,color="red"))

[Figure: observed data with the marginal predicted values overlaid in red]

We see that this produces the same predicted value for all subjects and eyes.

We might know the subject but not which eye. To obtain this kind of predicted value, we marginalize out only the eye effect:

vision$pvsub <- predict(bmod,marginal=~eyesub)
p <- ggplot(vision, aes(y=acuity, x=npower, linetype=eye))+geom_line()+facet_wrap(~ subject)
p+geom_line(aes(y=pvsub,color="red"))

[Figure: observed data with subject-level predicted values overlaid in red]

Now we can see that the predicted values vary (by a constant amount) across subjects but not across eyes.

Finally, we can specify the subject and eye. This is sometimes called the full predictive distribution:

vision$pvfull <- predict(bmod,marginal=NULL)
p <- ggplot(vision, aes(y=acuity, x=npower, linetype=eye))+geom_line()+facet_wrap(~ subject)
p+geom_line(aes(y=pvfull,color="red"))

[Figure: observed data with subject- and eye-level predicted values overlaid in red]

Now we see that the predicted value varies by subject and eye (with constant differences).
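
Beyond the posterior mean, predict.MCMCglmm can also return credible intervals for these predictions. A sketch (not run here; the interval argument may depend on the MCMCglmm version):

# 95% prediction intervals from the full predictive distribution
head(predict(bmod, marginal=NULL, interval="prediction"))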

Diagnostics

Having produced predicted values, we can also compute some residuals:

vision$res <- with(vision, acuity-pvfull)

We can make the standard residual vs. fitted plot:

ggplot(vision, aes(x=pvfull,y=res)) + geom_point()+xlab("Predicted")+ylab("Residuals")+geom_hline(yintercept=0)

[Figure: residuals against predicted values]

We can see the outlier that we previously identified. Otherwise everything looks fine.

A Q-Q plot of the residuals:

qqnorm(vision$res)
qqline(vision$res)

[Figure: normal Q-Q plot of the residuals]

Again we see the outlier but otherwise we have approximate normality.

These diagnostics do provide something useful in checking the model but they are not entirely appropriate for a Bayesian-style analysis. Assuming we recognize the value in model checking (I certainly do!), we should check that data simulated from the fitted model looks like the data we actually observed. This idea can be implemented in various ways which we leave as a TODO.
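
As a starting point for that TODO, here is one minimal sketch, using simulate on the earlier lmer fit as a stand-in for full posterior predictive draws:

# generate 1000 replicate datasets from the fitted mixed model
simdat <- simulate(mmod, nsim=1000)
# how often the simulated minimum is at least as small as the observed
# minimum - a crude check on the low outlier
mean(sapply(simdat, min) <= min(vision$acuity))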

Check on versions of R packages used:

sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] reshape2_1.2.2   MCMCglmm_2.21    ape_3.1-4        coda_0.16-1     
 [5] lattice_0.20-29  lmerTest_2.0-20  pbkrtest_0.4-1   MASS_7.3-33     
 [9] ggplot2_0.9.3.1  lme4_1.1-7       Rcpp_0.11.3      Matrix_1.1-4    
[13] knitr_1.6        rmarkdown_0.2.68

loaded via a namespace (and not attached):
 [1] bitops_1.0-6        boot_1.3-11         caTools_1.16       
 [4] cluster_1.15.2      colorspace_1.2-4    corpcor_1.6.7      
 [7] dichromat_2.0-0     digest_0.6.4        evaluate_0.5.3     
[10] formatR_0.10        Formula_1.1-1       gdata_2.13.3       
[13] gplots_2.13.0       grid_3.1.1          gtable_0.1.2       
[16] gtools_3.3.1        Hmisc_3.14-3        htmltools_0.2.4    
[19] KernSmooth_2.23-12  labeling_0.2        latticeExtra_0.6-26
[22] minqa_1.2.3         munsell_0.4.2       nlme_3.1-117       
[25] nloptr_1.0.4        numDeriv_2012.9-1   parallel_3.1.1     
[28] plyr_1.8.1          proto_0.3-10        RColorBrewer_1.0-5 
[31] scales_0.2.3        splines_3.1.1       stringr_0.6.2      
[34] survival_2.37-7     tensorA_0.36        tools_3.1.1        
[37] yaml_2.1.11