First read in the response data:
# Load the response data (the list `relb` of raw elbow-angle curves and the
# data frame `targs` of trial/target information used below).
# NOTE(review): this http URL may no longer resolve — Faraway's site has
# moved since this was written; verify before running.
load(url("http://people.bath.ac.uk/jjf23/fda/jffda.rda"))
The first observation is
# Raw measurements for the first trial; each element of `relb` is one
# trial's recorded curve (lengths may differ between trials, hence the
# smoothing onto a common grid below).
relb[[1]]
## [1] 129.6 129.1 129.2 129.7 130.8 132.4 134.5 136.9 139.5 142.4 145.8
## [12] 148.6 150.9 153.6 156.5 158.5 159.8 160.5 160.8 161.0 161.1 161.2
## [23] 161.3 161.6 161.8
Now read in the R functions that we will use
# Load the helper functions used throughout: roughapp, freg, bsfttr,
# ntfttr and predfreg.  NOTE(review): same caveat — the URL may be dead.
source("http://people.bath.ac.uk/jjf23/fda/frafuncs.R")
Generate the response as 20-point, equally spaced smoothing-spline approximations to the raw data:
# Reduce each of the 60 raw curves to a 20-point equally spaced
# smoothing-spline approximation; column j of y is the smoothed curve
# for trial j.
y <- matrix(NA, 20, 60)
for (trial in seq_len(60)) {
  y[, trial] <- roughapp(relb[[trial]])
}
We now construct a plot of the smoothed curves (Fig. 1 in the paper). First I construct a matrix indexing the locations:
# Row k of trialindex holds the indices (into 1:60) of the three
# replicate trials performed at the k-th target location.
trialindex <- matrix(0, 20, 3)
locs <- levels(targs$loc)
for (k in seq_along(locs)) {
  trialindex[k, ] <- (1:60)[targs$loc == locs[k]]
}
# First five rows: the replicate-trial indices for the first five
# target locations.
trialindex[1:5, ]
## [,1] [,2] [,3]
## [1,] 5 7 11
## [2,] 47 48 58
## [3,] 6 8 16
## [4,] 2 26 34
## [5,] 50 52 54
and now make the plot:
# Figure 1: for each of the 20 target locations, overlay the three
# replicate smoothed curves (colours 1-3) on a common vertical scale,
# labelling each panel with the location name.
yr <- range(y)
par(mfrow = c(5, 4), mai = rep(0.1, 4), omi = c(0.3, 0.3, 0, 0),
    mgp = c(1.5, 0.5, 0), tck = -0.01)
xg <- seq(0, 1, len = 20)  # common abscissa for all curves
for (panel in 1:20) {
  plot(xg, y[, trialindex[panel, 1]], type = "l", ylim = yr,
       xlab = "", ylab = "")
  lines(xg, y[, trialindex[panel, 2]], col = 2)
  lines(xg, y[, trialindex[panel, 3]], col = 3)
  text(0.3, yr[2], as.character(targs$loc[trialindex[panel, 1]]))
}
Leave out obs. 37 because it is extreme:
# Drop the extreme trial 37, leaving 59 columns; the design matrices
# below must have row 37 removed to match.
y <- y[, -37]
Now compute the X-matrices and fit the models. The one way anova model:
# One-way anova design: intercept plus dummy columns for target location.
xc <- model.matrix(~targs$loc)
# Remove the row for the dropped trial 37 to match y.
xc <- xc[-37, ]
# Functional regression fit (freg comes from frafuncs.R sourced above).
gc <- freg(xc, y)
The linear model:
# Linear design in the three target co-ordinates (columns 3:5 of targs).
xl <- model.matrix(~as.matrix(targs[, 3:5]))
xl <- xl[-37, ]
gl <- freg(xl, y)
The quadratic model (rescale the co-ordinates for numerical accuracy)
# Target co-ordinates only (drop the first two columns of targs),
# rescaled by 100 for numerical accuracy of the quadratic terms.
x <- targs[, -(1:2)]
x <- x/100
# Full quadratic design: intercept, linear, squared and cross-product
# terms.  NOTE(review): cbind on a data frame yields a data frame here;
# freg is presumed to accept it — confirm against frafuncs.R.
xq <- cbind(rep(1, 60), x, x[, 1]^2, x[, 2]^2, x[, 3]^2, x[, 1] * x[, 2], x[,
2] * x[, 3], x[, 3] * x[, 1])
xq <- xq[-37, ]
gq <- freg(xq, y)
We may now make a plot of the fitted curves for these three models
as in Figure 2 of the paper:
# Figure 2: per location, overlay the fitted curves from the anova (lty 1),
# quadratic (lty 2) and linear (lty 5) models.
par(mfrow = c(5, 4), mai = rep(0.1, 4), omi = c(0.3, 0.3, 0, 0), mgp = c(1.5,
0.5, 0), tck = -0.01)
yr <- range(y)
for (i in 1:20) {
j <- trialindex[i, 1]
# Column 37 was removed from the fits, so indices past 37 shift down by
# one.  NOTE(review): if any trialindex[i, 1] equals 37 exactly, this
# picks a wrong (shifted) column — verify none of the first replicates
# is trial 37.
if (j > 37)
j <- j - 1
z <- cbind(gc$pred[, j], gq$pred[, j], gl$pred[, j])
matplot(seq(0, 1, len = 20), z, type = "l", ylim = yr, lty = c(1, 2, 5),
xlab = "", ylab = "")
text(0.3, yr[2], as.character(targs$loc[j]))
}
We can test the hypotheses in the paper as follows.
First compare the linear model to the one way anova model
# Bootstrap test of the linear model against the one-way anova model:
# bd[1] is the observed test statistic, bd[-1] the B = 200 bootstrap
# replicates.
bd <- bsfttr(gl, gc, xl, xc, B = 200)
bd[1]
## [1] 756.8
# The largest bootstrap statistic is below the observed value, so the
# estimated p-value is 0.
max(bd)
## [1] 823.1
The observed test statistic is larger than all the bootstrapped test
statistics so our estimated p-value is 0 i.e. we reject the null
hypothesis that the linear model is adequate. The same result is found
when using the test based on the assumption of a Gaussian stochastic process.
Now compare the quadratic model to the one way anova model. The bootstrap test:
# Bootstrap test of the quadratic model against the one-way anova model.
# bd[1] is the observed statistic; bd[-1] are the B = 200 bootstrap
# replicates, so the estimated p-value is the proportion of replicates
# exceeding the observed value.
bd <- bsfttr(gq, gc, xq, xc, B = 200)
# sum() over the logical comparison replaces the roundabout
# length((1:200)[...]) construction; the value is unchanged.
sum(bd[-1] > bd[1])/200
## [1] 0.365
The Gaussian based test
# The corresponding test based on the assumption of a Gaussian stochastic
# process; same p-value computation as for the bootstrap test above,
# using sum() on the logical vector rather than length((1:200)[...]).
nd <- ntfttr(gq, gc, xq, xc, B = 200)
sum(nd[-1] > nd[1])/200
## [1] 0.345
So there is no evidence to reject the quadratic model.
The residual plot in Figure 3 is constructed as follows. The m.l.e of the covariance is
# M.l.e. of the error covariance, (1/n) res res', eigendecomposed for the
# residual plot.  NOTE(review): tcrossprod(gq$res)/gq$n would be faster and
# exactly symmetric, but eigenvector signs could flip relative to this
# transcript, so the original form is kept.
sloe <- eigen(gq$res %*% t(gq$res)/gq$n)
Now construct the plot:
# Figure 3: plot the first four eigenvectors of the residual covariance
# estimate, labelling each panel with its (rounded) eigenvalue.
par(mfrow = c(2, 2), mar = c(0, 2, 0, 0))
yr <- range(sloe$vectors[, 1:4])
for (k in 1:4) {
  plot((1:20)/20, sloe$vectors[, k], type = "l", ylim = yr,
       ylab = "", xlab = "")
  abline(h = 0, lty = 5)
  text(0.5, 0.4, as.character(round(sloe$values[k], 1)))
}
A new point for which a predicted curve is constructed with
simultaneous confidence bands - somewhere between the glove and the radio.
# Select the trials whose target is at the "glove" or the "radio"
# location.  %in% replaces the sum-of-indicators + as.logical
# construction (note: %in% maps NA to FALSE rather than propagating it).
i <- targs[, 2] %in% c("glove", "radio")
# Mean target co-ordinates over those trials, rescaled by 100 to match
# the scaling used when fitting the quadratic design xq above.
x0 <- colMeans(targs[i, 3:5])
x0 <- x0/100
# Expand to the quadratic design vector: intercept, linear terms,
# squares, and cross-products, in the same column order as xq.
x0 <- c(1, x0, x0^2, x0[1] * x0[2], x0[2] * x0[3], x0[3] * x0[1])
x0
## x y z x y z x y
## 1.0000 2.2003 3.5161 -0.7038 4.8414 12.3631 0.4953 7.7366 -2.4746
## z
## -1.5485
And now the bootstrapping:
# Bootstrap the prediction at x0 (predfreg from frafuncs.R); bsp$bstat
# holds the bootstrapped statistics used for simultaneous bands, sorted
# so that quantiles can be read off by index.
bsp <- predfreg(xq, y, x0)
bss <- sort(bsp$bstat)
We want the 95th percentile for 95% simultaneous confidence bands:
# 95th percentile, assuming (presumably) 1000 bootstrap statistics.
# NOTE(review): quantile(bsp$bstat, 0.95) would avoid hard-coding the
# index 950 — confirm the bootstrap sample size in predfreg.
bss[950]
## [1] 2.332
Compare this with the t-based value:
# Pointwise two-sided t critical value (49 df) for comparison with the
# simultaneous bootstrap value above.
qt(0.975, 49)
## [1] 2.01
95% confidence bands for the mean response and a new observation
are constructed as in Figure 4 of the paper.
The predicted response is
# Predicted mean curve at x0 on the 20-point grid.
ypred <- bsp$pred
The standard error for the mean response is sef * sqrt(xxxx)
and for the new observation it is sef * sqrt(1+xxxx).
# Squared leverage-type factor for x0 (see the text above: the
# mean-response s.e. is sef * sqrt(xxxx), the new-observation s.e. is
# sef * sqrt(1 + xxxx)).
xxxx <- bsp$sx^2
# Baseline standard-error curve from the fit.
sef <- bsp$sef
# Columns of xp: prediction, mean-response band (lo, hi), new-observation
# band (lo, hi).  NOTE(review): the critical value 2.29 matches neither
# bss[950] = 2.332 nor qt(0.975, 49) = 2.01 computed above — presumably
# the value used in the paper; confirm.
xp <- cbind(ypred, ypred - 2.29 * sef * sqrt(xxxx), ypred + 2.29 * sef * sqrt(xxxx),
ypred - 2.29 * sef * sqrt(1 + xxxx), ypred + 2.29 * sef * sqrt(1 + xxxx))
par(mfrow = c(1, 1))
matplot(seq(0, 1, len = 20), xp, type = "l", lty = c(1, 2, 2, 3, 3), xlab = "Motion",
ylab = "Degrees", col = c(1, 2, 2, 3, 3))
We may also calculate the R^2 for the 3 models
# Total sum of squares about the mean at each of the 20 time points:
# 58 = n - 1 with n = 59 trials (after dropping obs. 37), so
# 58 * var equals sum((y - ybar)^2) per row.
nmr <- 58 * apply(y, 1, var)
# Residual sum of squares helper, applied row-wise below.
tf <- function(x) sum(x^2)
For the one-way anova model
# Pointwise R^2 for the one-way anova model (one value per time point),
# followed by its average over the motion.
cmr <- apply(gc$res, 1, tf)
1 - cmr/nmr
## [1] 0.6738 0.7226 0.7381 0.7390 0.7375 0.7481 0.7710 0.7943 0.8211 0.8463
## [11] 0.8658 0.8812 0.8959 0.9099 0.9221 0.9330 0.9399 0.9454 0.9506 0.9534
mean(1 - cmr/nmr)
## [1] 0.8395
For the linear model
# Pointwise R^2 for the linear model, and its average.
lmr <- apply(gl$res, 1, tf)
1 - lmr/nmr
## [1] 0.4642 0.5298 0.5619 0.5761 0.5858 0.6108 0.6532 0.6960 0.7353 0.7717
## [11] 0.7933 0.8016 0.7980 0.7889 0.7779 0.7619 0.7370 0.7118 0.6900 0.6693
mean(1 - lmr/nmr)
## [1] 0.6857
For the quadratic model
# Pointwise R^2 for the quadratic model, and its average.
qmr <- apply(gq$res, 1, tf)
1 - qmr/nmr
## [1] 0.6184 0.6712 0.6887 0.6917 0.6930 0.7063 0.7333 0.7622 0.7902 0.8203
## [11] 0.8451 0.8643 0.8795 0.8922 0.9027 0.9107 0.9126 0.9133 0.9149 0.9145
mean(1 - qmr/nmr)
## [1] 0.8062
As mentioned in the paper, the R^2 does increase towards the end of the motion.