Multiple Response Multilevel Models

Fit the model
pbkrtest package
lmerTest package
MCMCglmm analysis

[1] "Created: Wed Apr  1 16:45:08 2015"

See the introduction for an overview.

library(lme4)
library(ggplot2)
options(digits=5,show.signif.stars=FALSE)

Load in and summarize the data:

data(jsp, package="faraway")
jspr <- jsp[jsp$year==2,]
mjspr <- data.frame(rbind(jspr[,1:6],jspr[,1:6]),subject=factor(rep(c("english","math"),c(953,953))),score=c(jspr$english/100,jspr$math/40))
summary(mjspr)

     school     class     gender        social        raven     
 48     : 126   1:1134   boy :916   4      :738   Min.   : 4.0  
 33     :  88   2: 584   girl:990   9      :276   1st Qu.:22.0  
 42     :  82   3: 108              2      :252   Median :26.0  
 31     :  70   4:  80              5      :170   Mean   :25.3  
 47     :  66                       3      :160   3rd Qu.:30.0  
 21     :  56                       6      :116   Max.   :36.0  
 (Other):1418                       (Other):194                 
       id          subject        score      
 1      :   2   english:953   Min.   :0.000  
 3      :   2   math   :953   1st Qu.:0.390  
 4      :   2                 Median :0.625  
 6      :   2                 Mean   :0.594  
 7      :   2                 3rd Qu.:0.825  
 8      :   2                 Max.   :1.000  
 (Other):1894

Plot the data

ggplot(mjspr, aes(x=raven, y=score))+geom_jitter(alpha=0.25)+facet_grid(gender ~ subject)

plot of chunk unnamed-chunk-4

Fit the model

mjspr$craven <- mjspr$raven-mean(mjspr$raven)
mmod <- lmer(score ~ subject*gender+craven*subject+social+(1|school)+(1|school:class)+(1|school:class:id),mjspr)
print(summary(mmod),corr=FALSE)

Linear mixed model fit by REML ['lmerMod']
Formula: 
score ~ subject * gender + craven * subject + social + (1 | school) +  
    (1 | school:class) + (1 | school:class:id)
   Data: mjspr

REML criterion at convergence: -1741.6

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.6654 -0.5692  0.0072  0.5641  2.5870 

Random effects:
 Groups          Name        Variance Std.Dev.
 school:class:id (Intercept) 0.010252 0.1013  
 school:class    (Intercept) 0.000582 0.0241  
 school          (Intercept) 0.002231 0.0472  
 Residual                    0.013592 0.1166  
Number of obs: 1906, groups:  
school:class:id, 953; school:class, 90; school, 48

Fixed effects:
                        Estimate Std. Error t value
(Intercept)             0.441578   0.026459    16.7
subjectmath             0.366565   0.007710    47.5
gendergirl              0.063351   0.010254     6.2
craven                  0.017390   0.000925    18.8
social2                 0.013754   0.027230     0.5
social3                -0.020768   0.028972    -0.7
social4                -0.070708   0.025868    -2.7
social5                -0.050474   0.028818    -1.8
social6                -0.087852   0.030672    -2.9
social7                -0.099408   0.031607    -3.1
social8                -0.081623   0.042352    -1.9
social9                -0.047337   0.027445    -1.7
subjectmath:gendergirl -0.059194   0.010706    -5.5
subjectmath:craven     -0.003720   0.000930    -4.0

anova(mmod)

Analysis of Variance Table
               Df Sum Sq Mean Sq F value
subject         1   53.7    53.7 3953.67
gender          1    0.1     0.1    7.46
craven          1    6.0     6.0  444.63
social          8    0.7     0.1    6.47
subject:gender  1    0.4     0.4   28.23
subject:craven  1    0.2     0.2   15.99

Estimates but no inferential output as expected

confint(mmod, method="boot")

                                    2.5 %     97.5 %
sd_(Intercept)|school:class:id  0.0917857  0.1097861
sd_(Intercept)|school:class     0.0000000  0.0423214
sd_(Intercept)|school           0.0294989  0.0613863
sigma                           0.1113921  0.1212957
(Intercept)                     0.3909352  0.4935450
subjectmath                     0.3518251  0.3805216
gendergirl                      0.0422825  0.0833143
craven                          0.0155256  0.0192694
social2                        -0.0391574  0.0662833
social3                        -0.0729605  0.0302451
social4                        -0.1167668 -0.0194739
social5                        -0.1045879  0.0040000
social6                        -0.1432620 -0.0305644
social7                        -0.1599951 -0.0331508
social8                        -0.1636875  0.0050874
social9                        -0.0989284  0.0055644
subjectmath:gendergirl         -0.0795496 -0.0385646
subjectmath:craven             -0.0054501 -0.0017955

Again some evidence that there is little variation between the classes at a school.

pbkrtest package

We can reconstruct an F-test of the subject by craven term in the model using the KR adjustment.

library(pbkrtest)
mmod <- lmer(score ~ subject*gender+craven*subject+social+  (1|school)+(1|school:class)+(1|school:class:id),mjspr, REML=FALSE)
mmodr <- lmer(score ~ subject*gender+craven+subject+social+(1|school)+(1|school:class)+(1|school:class:id),mjspr, REML=FALSE)
KRmodcomp(mmod, mmodr)

F-test with Kenward-Roger approximation; computing time: 0.61 sec.
large : score ~ subject + gender + craven + social + (1 | school) + (1 | 
    school:class) + (1 | school:class:id) + subject:gender + 
    subject:craven
small : score ~ subject * gender + craven + subject + social + (1 | school) + 
    (1 | school:class) + (1 | school:class:id)
      stat ndf ddf F.scaling p.value
Ftest   16   1 950         1 6.9e-05

The outcome is very similar to the printed result. This is not surprising given the larger size of the sample.

lmerTest package

library(lmerTest)
mmod <- lmer(score ~ subject*gender+craven*subject+social+(1|school)+(1|school:class)+(1|school:class:id),mjspr)
print(summary(mmod),corr=FALSE)

Linear mixed model fit by REML t-tests use Satterthwaite approximations
  to degrees of freedom [merModLmerTest]
Formula: 
score ~ subject * gender + craven * subject + social + (1 | school) +  
    (1 | school:class) + (1 | school:class:id)
   Data: mjspr

REML criterion at convergence: -1741.6

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.6654 -0.5692  0.0072  0.5641  2.5870 

Random effects:
 Groups          Name        Variance Std.Dev.
 school:class:id (Intercept) 0.010252 0.1013  
 school:class    (Intercept) 0.000582 0.0241  
 school          (Intercept) 0.002231 0.0472  
 Residual                    0.013592 0.1166  
Number of obs: 1906, groups:  
school:class:id, 953; school:class, 90; school, 48

Fixed effects:
                        Estimate Std. Error        df t value Pr(>|t|)
(Intercept)             4.42e-01   2.65e-02  8.36e+02   16.69  < 2e-16
subjectmath             3.67e-01   7.71e-03  9.50e+02   47.54  < 2e-16
gendergirl              6.34e-02   1.03e-02  1.52e+03    6.18  8.3e-10
craven                  1.74e-02   9.25e-04  1.49e+03   18.81  < 2e-16
social2                 1.38e-02   2.72e-02  9.27e+02    0.51   0.6136
social3                -2.08e-02   2.90e-02  9.29e+02   -0.72   0.4737
social4                -7.07e-02   2.59e-02  9.36e+02   -2.73   0.0064
social5                -5.05e-02   2.88e-02  9.34e+02   -1.75   0.0802
social6                -8.79e-02   3.07e-02  9.35e+02   -2.86   0.0043
social7                -9.94e-02   3.16e-02  9.39e+02   -3.15   0.0017
social8                -8.16e-02   4.24e-02  9.22e+02   -1.93   0.0543
social9                -4.73e-02   2.74e-02  9.37e+02   -1.72   0.0849
subjectmath:gendergirl -5.92e-02   1.07e-02  9.50e+02   -5.53  4.2e-08
subjectmath:craven     -3.72e-03   9.30e-04  9.50e+02   -4.00  6.9e-05

anova(mmod)

Analysis of Variance Table of type 3  with  Satterthwaite 
approximation for degrees of freedom
               Sum Sq Mean Sq NumDF DenDF F.value  Pr(>F)
subject          54.0    54.0     1   950    3975 < 2e-16
gender            0.2     0.2     1   915      15 0.00012
craven            5.1     5.1     1   923     378 < 2e-16
social            0.7     0.1     8   927       6 3.2e-08
subject:gender    0.4     0.4     1   950      31 4.2e-08
subject:craven    0.2     0.2     1   950      16 6.9e-05

Can test the random effects:

rand(mmod)

Analysis of Random effects Table:
                Chi.sq Chi.DF p.value
school            8.63      1   0.003
school:class      1.44      1   0.230
school:class:id 186.70      1  <2e-16

MCMCglmm analysis

We need to create unique labels for the classes within schools and the students within classes within schools. The reason for this is explained in the nested example.

library(MCMCglmm)
mjspr$classch <- paste(mjspr$school,mjspr$class,sep=".")
mjspr$idclassch <- paste(mjspr$school,mjspr$class,mjspr$id,sep=".")

We need to use an expanded prior as in previous examples.

eprior <- list(R=list(V=1,nu=0.02),G=list(G1=list(V=1,nu=0.02,alpha.V=1000),G2=list(V=1,nu=0.02,alpha.V=1000),G3=list(V=1,nu=0.02,alpha.V=1000)))
bmod <- MCMCglmm(score ~ subject*gender+craven*subject+social, random=~school+classch+idclassch,data=mjspr, verbose=FALSE, prior=eprior)
xyplot(log(bmod$VCV))

plot of chunk unnamed-chunk-11

The diagnostics for the variance components look satisfactory.

summary(bmod)


 Iterations = 3001:12991
 Thinning interval  = 10
 Sample size  = 1000 

 DIC: -2179.3 

 G-structure:  ~school

       post.mean l-95% CI u-95% CI eff.samp
school   0.00231 0.000613  0.00415      599

               ~classch

        post.mean l-95% CI u-95% CI eff.samp
classch  0.000692 9.23e-10  0.00195      371

               ~idclassch

          post.mean l-95% CI u-95% CI eff.samp
idclassch    0.0103  0.00859    0.012     1000

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units    0.0136   0.0124   0.0149     1000

 Location effects: score ~ subject * gender + craven * subject + social 

                       post.mean l-95% CI u-95% CI eff.samp  pMCMC
(Intercept)              0.44059  0.38982  0.49213     1000 <0.001
subjectmath              0.36694  0.35075  0.38002     1000 <0.001
gendergirl               0.06345  0.04297  0.08194     1000 <0.001
craven                   0.01732  0.01558  0.01918     1000 <0.001
social2                  0.01489 -0.03251  0.07277     1000  0.584
social3                 -0.02000 -0.07337  0.03750     1000  0.506
social4                 -0.06980 -0.12007 -0.01788     1000  0.006
social5                 -0.04887 -0.10700  0.00441     1000  0.098
social6                 -0.08775 -0.14386 -0.02943     1000  0.004
social7                 -0.09884 -0.15992 -0.03638     1000  0.002
social8                 -0.07940 -0.15918  0.00969     1000  0.064
social9                 -0.04632 -0.09927  0.00603     1000  0.096
subjectmath:gendergirl  -0.05937 -0.08086 -0.04009     1000 <0.001
subjectmath:craven      -0.00370 -0.00550 -0.00203      804 <0.001

The results are quite similar to the lme4 fit with all the fixed effects looking significant. Again the class variation seems to be smallest of three components of variation in the heirarchy.

We can view the posterior distributions for the SDs corresponding to the four components of variation:

library(reshape2)
ref <- data.frame(sqrt(bmod$VCV))
rdf <- melt(ref, value.name="score")
ggplot(rdf, aes(x=score,color=variable))+geom_density()

plot of chunk unnamed-chunk-13

We can clearly see the range for these components. As usual, we are most sure about the error SD which will represent the difference between maths and english scores for the same student. Interestingly, this variation is the largest.

Similar plots can be made of the fixed effect posteriors.

Check on versions of R packages used:

sessionInfo()

R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] reshape2_1.2.2   MCMCglmm_2.21    ape_3.1-4        coda_0.16-1     
 [5] lattice_0.20-29  lmerTest_2.0-20  pbkrtest_0.4-1   MASS_7.3-33     
 [9] ggplot2_0.9.3.1  lme4_1.1-7       Rcpp_0.11.3      Matrix_1.1-4    
[13] knitr_1.6        rmarkdown_0.2.68

loaded via a namespace (and not attached):
 [1] bitops_1.0-6        boot_1.3-11         caTools_1.16       
 [4] cluster_1.15.2      colorspace_1.2-4    corpcor_1.6.7      
 [7] dichromat_2.0-0     digest_0.6.4        evaluate_0.5.3     
[10] formatR_0.10        Formula_1.1-1       gdata_2.13.3       
[13] gplots_2.13.0       grid_3.1.1          gtable_0.1.2       
[16] gtools_3.3.1        Hmisc_3.14-3        htmltools_0.2.4    
[19] KernSmooth_2.23-12  labeling_0.2        latticeExtra_0.6-26
[22] minqa_1.2.3         munsell_0.4.2       nlme_3.1-117       
[25] nloptr_1.0.4        numDeriv_2012.9-1   parallel_3.1.1     
[28] plyr_1.8.1          proto_0.3-10        RColorBrewer_1.0-5 
[31] scales_0.2.3        splines_3.1.1       stringr_0.6.2      
[34] survival_2.37-7     tensorA_0.36        tools_3.1.1        
[37] yaml_2.1.11

Multiple Response Multilevel Models

Julian Faraway

Fit the model

pbkrtest package

lmerTest package

MCMCglmm analysis