11.6 Diagnostics in R

As with everything else we have covered relating to the linear model, model diagnostics can easily be calculated in R. Consider again the pressure data used in the last two examples. Start by fitting the linear model

> L <- lm(pressure$Pressure~pressure$Temp)

If we apply the base plot function to an lm fit, we can obtain a total of six possible diagnostic plots. We shall look at four of these: residuals against fitted values, a normal QQ plot of the standardised residuals, Cook’s distances, and standardised residuals against leverage. (The plot of the square root of the absolute standardised residuals against the fitted values is one of the two plots not selected here.)

> par(mfrow=c(2,2))
> plot(L,which=c(1:2,4,5))

The results can be seen in Figure 11.7.
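The quantities behind these four plots can also be extracted directly from the fitted model. The following is a minimal sketch rather than part of the original example: because the pressure data are not reproduced here, it uses R's built-in cars data as a stand-in (an assumption made purely for illustration), and the object names fit and diagnostics are hypothetical.

```r
# Stand-in data: the built-in 'cars' data set (an assumption; in practice
# use the pressure data and the fitted model L from above).
fit <- lm(dist ~ speed, data = cars)

# Collect the quantities that the four diagnostic plots display
diagnostics <- data.frame(
  fitted   = fitted(fit),          # fitted values
  resid    = residuals(fit),       # raw residuals
  std.res  = rstandard(fit),       # standardised residuals
  cooks    = cooks.distance(fit),  # Cook's distances
  leverage = hatvalues(fit)        # leverages (diagonal of the hat matrix)
)

# Inspect the first few rows
head(diagnostics)
```

Having the values in a data frame makes it straightforward to sort or filter observations instead of reading them off the plots.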

  1. For the first plot (residuals against fitted values), we expect to see no pattern, since residuals and fitted values should be independent. The red line is a smooth curve through the points, which highlights any trend in the plot.

  2. For the QQ plot we hope to see points lying on the line y=x, indicated by the dotted line.

  3. For the Cook’s distance plot we are looking for any particularly large values. The observation numbers for these will be given (in this example 12, 2 and 17).

  4. For the residuals vs. leverage plot we are looking for any points that have an unusually large leverage, an unusually large residual, or both. The red dashed lines show contours of equal Cook’s distance.

Fig. 11.7: Diagnostics for the pressure temperature regression model.
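The observations that the plots label can also be found numerically. The sketch below is illustrative only: it again uses the built-in cars data as a stand-in for the pressure data (an assumption), and all object names are hypothetical. It also checks the standard identity D_i = r_i^2 H_ii / (p (1 - H_ii)), where r_i is the standardised residual, H_ii the leverage and p the number of regression coefficients. This identity is the reason contours of equal Cook’s distance can be drawn on the residuals vs. leverage plot.

```r
# Stand-in data (an assumption; replace with the pressure fit L in practice)
fit <- lm(dist ~ speed, data = cars)

cd <- cooks.distance(fit)   # Cook's distances
h  <- hatvalues(fit)        # leverages H_ii
r  <- rstandard(fit)        # standardised residuals
p  <- length(coef(fit))     # number of regression coefficients

# The three largest Cook's distances, with their observation numbers as names
top_cooks <- sort(cd, decreasing = TRUE)[1:3]
top_cooks

# Cook's distance recomputed from standardised residuals and leverages
cd_by_hand <- r^2 * h / (p * (1 - h))
all.equal(cd, cd_by_hand)
```

Sorting the leverages h in the same way gives the observations furthest to the right in the residuals vs. leverage plot.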

An alternative is to download and install the car package, which provides many functions for regression diagnostics. For example,

> library("car")
> influencePlot(L)

produces a bubble plot of the Studentised residuals $\hat{s}_i$ against the hat values $H_{ii}$. The size of each bubble is proportional to Cook’s distance. In the pressure data example (see Figure 11.8) we can see that observation 12 is a clear outlier. In addition, this function prints the Cook’s distance, the hat value and the Studentised residual for any ‘unusual’ observations.

Fig. 11.8: Hat values, Studentised residuals and Cook’s distance for the pressure temperature regression model. The sizes of the bubbles are proportional to Cook’s distance.
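If the car package is not available, a broadly similar display can be assembled from base R functions. The following is a rough sketch and not a reproduction of influencePlot: it uses the built-in cars data as a stand-in for the pressure data (an assumption), hypothetical object names, and circle areas chosen to be proportional to Cook’s distance.

```r
# Stand-in data (an assumption; replace with the pressure fit L in practice)
fit <- lm(dist ~ speed, data = cars)

h  <- hatvalues(fit)        # hat values H_ii
s  <- rstudent(fit)         # Studentised residuals (externally Studentised in R)
cd <- cooks.distance(fit)   # Cook's distances

# symbols() takes circle radii, so sqrt(cd) makes the areas proportional
# to Cook's distance; 'inches' sets the size of the largest circle.
symbols(h, s, circles = sqrt(cd), inches = 0.25,
        xlab = "Hat values", ylab = "Studentised residuals")
abline(h = 0, lty = 2)      # reference line at zero residual

# Label the observation with the largest Cook's distance
worst <- which.max(cd)
text(h[worst], s[worst], labels = names(worst), pos = 4)
```

Points towards the right of such a plot have high leverage, points far from the zero line have large residuals, and large bubbles combine the two.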