Inference for mixed effect models is difficult. In 2005, I published Extending the Linear Model with R (Faraway 2006), which has three chapters on these models. The inferential methods described in that book, and implemented in the `lme4` package as available at the time of publication, were based on some approximations. In some simple balanced cases, the inference is exactly correct; in other cases the approximation is adequate; and in yet other cases the approximation is poor and the results could be misleading. It is not easy to specify when the approximations are adequate or better, so the author of the package made the decision to withdraw these methods entirely. Unfortunately, this left readers of my text wondering why the output from current R implementations did not provide the inferential results presented in the text.
I have gathered together in Changes to the Mixed Effects Models chapters in ELM (PDF) a summary of what is different between the current `lme4` and the 2005 `lme4`.
In recent years, several R packages have presented alternative methods for performing the inference.
pbkrtest - The `pbkrtest` package provides two ways in which the fixed effects terms in a linear mixed model may be tested. One method is based on the F-statistic. In some cases where a balanced layout exists, the F-statistic has an exact F null distribution with the expected degrees of freedom. For unbalanced data, the degrees of freedom are not correct and the null distribution may not follow the F-distribution exactly. Kenward and Roger (1997) proposed a method of adjusting the degrees of freedom to obtain better approximations to the null distribution. This method has been implemented in the package (Halekoh and Højsgaard 2014). The `pbkrtest` package also implements the parametric bootstrap. The idea is the same as explained in the text, but the implementation in this package saves us the trouble of explicitly coding the procedure. Furthermore, the package makes it easy to take advantage of the multiple processing cores available on many computers. Given that the parametric bootstrap is expensive to compute, this is well worthwhile for very little additional effort for the user.
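As a rough illustration of the two `pbkrtest` approaches, here is a minimal sketch using the penicillin data from the `faraway` package. The model formula and the small number of bootstrap samples are my choices for the example, not a recommendation.

```r
library(lme4)
library(pbkrtest)
data(penicillin, package = "faraway")

# REML fits for the Kenward-Roger adjusted F-test of the treatment effect
full <- lmer(yield ~ treat + (1 | blend), data = penicillin)
null <- lmer(yield ~ 1 + (1 | blend), data = penicillin)
kr <- KRmodcomp(full, null)
print(kr)

# Maximum likelihood fits for the parametric bootstrap likelihood ratio test
full_ml <- lmer(yield ~ treat + (1 | blend), data = penicillin, REML = FALSE)
null_ml <- lmer(yield ~ 1 + (1 | blend), data = penicillin, REML = FALSE)
set.seed(123)
# nsim is kept small here so the example runs quickly; use more in practice
pb <- PBmodcomp(full_ml, null_ml, nsim = 200)
print(pb)
```

`PBmodcomp` also accepts a `cl` argument; passing a cluster created with `parallel::makeCluster` is the route I would try for spreading the bootstrap over several cores.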
RLRsim - The `RLRsim` package offers a convenient way of testing a random effect term. For more details, see Scheipl, Greven, and Kuechenhoff (2008). Sadly, `RLRsim` only deals with cases where a single random effect term is used, so it cannot help us in testing the terms in the crossed or nested examples in the textbook. For these examples, we can code our own parametric bootstrap solution.
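For the single random effect case, a test might look like the following sketch. It uses the pulp data from the `faraway` package and the `exactRLRT` function; the model is my choice for illustration.

```r
library(lme4)
library(RLRsim)
data(pulp, package = "faraway")

# REML fit with a single random effect for operator
mod <- lmer(bright ~ 1 + (1 | operator), data = pulp)

# Simulation-based restricted likelihood ratio test of
# H0: the operator variance is zero
set.seed(123)
res <- exactRLRT(mod)
print(res)
```

The p-value comes from simulating the finite-sample null distribution of the test statistic, so it will vary slightly from run to run unless the seed is fixed.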
lmerTest - This package recreates much of the inferential output seen in the earlier version of `lme4`. It also has some additional functionality. Of course, this does not solve the problems found in those inferential methods.
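A minimal sketch of what this looks like in practice, again using the penicillin data from the `faraway` package as an assumed example. Loading `lmerTest` replaces `lmer` with a version whose `summary` and `anova` output includes degrees of freedom and p-values (by default using the Satterthwaite approximation).

```r
library(lmerTest)
data(penicillin, package = "faraway")

# lmerTest's lmer() is a drop-in replacement for the lme4 version
mod <- lmer(yield ~ treat + (1 | blend), data = penicillin)

# F-test for the fixed effect, with approximate denominator df and p-value
tab <- anova(mod)
print(tab)

# Coefficient table with df and p-value columns added
coefs <- summary(mod)$coefficients
print(coefs)
```

The caveats above still apply: these p-values rest on approximations that may or may not be adequate for a given dataset.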
aov - This function has been available in R for many years. It can provide exact F-tests for fixed effects in some simple balanced examples.
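For a balanced design such as the penicillin experiment (taken here from the `faraway` package), a sketch of the `aov` approach puts the random blocking factor inside an `Error()` term:

```r
data(penicillin, package = "faraway")

# Treatments are tested within blends; blend is the error stratum
amod <- aov(yield ~ treat + Error(blend), data = penicillin)
print(summary(amod))
```

The F-test for `treat` appears in the within-blend stratum of the summary.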
MCMC - An alternative way of conducting inference is via Bayesian methods implemented via Markov chain Monte Carlo (MCMC). A general introduction to these methods may be found in texts such as Gelman et al. (2014). The idea is to assign a non-informative prior on the parameters of the mixed model and then generate a sample from their posterior distribution. This approach is no longer available in the current version of `lme4`. As the help page states:
One of the most frequently asked questions about `lme4` is "how do I calculate p-values for estimated parameters?" Previous versions of `lme4` provided the `mcmcsamp` function, which efficiently generated a Markov chain Monte Carlo sample from the posterior distribution of the parameters, assuming flat (scaled likelihood) priors. Due to difficulty in constructing a version of `mcmcsamp` that was reliable even in cases where the estimated random effect variances were near zero (e.g. <https://stat.ethz.ch/pipermail/r-sig-mixed-models/2009q4/003115.html>), `mcmcsamp` has been withdrawn (or more precisely, not updated to work with `lme4` versions greater than 1.0.0).
MCMCglmm - This package allows the fitting of generalized linear mixed models using Bayesian methods via MCMC. For more details, see Hadfield (2010).
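A minimal sketch of a fit, assuming the pulp data from the `faraway` package and leaving the priors at the package defaults. With only four operators, the default prior on the operator variance may well be a poor choice, so treat this as a starting point rather than a finished analysis.

```r
library(MCMCglmm)
data(pulp, package = "faraway")

set.seed(123)
# Fixed intercept, random operator effect, default priors (an assumption
# made for brevity; a more carefully chosen prior may be needed)
bmod <- MCMCglmm(bright ~ 1, random = ~ operator, data = pulp,
                 verbose = FALSE)
print(summary(bmod))

# Posterior samples of the variance components (operator and residual)
print(head(bmod$VCV))
```

Inference is then based on the posterior samples stored in `bmod$Sol` (fixed effects) and `bmod$VCV` (variances) rather than on p-values.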
STAN - Software for fitting Bayesian models. Worked examples for the data in the book are available.
I demonstrate these methods for each of the examples in the text. You’ll need to read the text for more background on datasets and the interpretations or you can just look at the help pages for the datasets.
pulp data
penicillin data
irrigation data
eggs data
abrasion data
jsp data
psid data
vision data
jsp data

Faraway, J. 2006. Extending the Linear Model with R. London: Chapman & Hall.
Gelman, A., J. Carlin, H. Stern, D. Dunson, A. Vehtari, and D. Rubin. 2014. Bayesian Data Analysis. 3rd ed. London: Chapman & Hall.
Hadfield, Jarrod D. 2010. “MCMC Methods for Multi-Response Generalized Linear Mixed Models: The MCMCglmm R Package.” Journal of Statistical Software 33 (2): 1–22. http://www.jstatsoft.org/v33/i02/.
Halekoh, Ulrich, and Søren Højsgaard. 2014. “A Kenward-Roger Approximation and Parametric Bootstrap Methods for Tests in Linear Mixed Models – the R Package pbkrtest.” Journal of Statistical Software 59 (9): ??–??. http://www.jstatsoft.org/v59/i09.
Kenward, MG, and JH Roger. 1997. “Small Sample Inference for Fixed Effects from Restricted Maximum Likelihood.” Biometrics 53 (3): 983–97.
Scheipl, Fabian, Sonja Greven, and Helmut Kuechenhoff. 2008. “Size and Power of Tests for a Zero Random Effect Variance or Polynomial Regression in Additive and Linear Mixed Models.” Computational Statistics & Data Analysis 52 (7): 3283–99.