11.7 Summary

  • 1

    Even though you have selected a best model using appropriate covariate selection techniques, it is still necessary to check that the model fits well. Diagnostics provide us with a set of tools to do this.

  • 2

    Diagnostics check that key assumptions made when fitting the linear regression model are in fact satisfied.

  • 3

    QQ and PP plots can be used to check that the estimated residuals are approximately normally distributed.

  • 4

    Plots of estimated residuals vs. fitted values, and of estimated residuals vs. each explanatory variable, should also be made to check that the residuals are independent of both.

  • 5

    The hat matrix

    $H = X(X^{T}X)^{-1}X^{T}$

    can be used to prove independence of estimated residuals and fitted values, and of estimated residuals and explanatory variables.

  • 6

    In addition, the data should be checked for outliers and points of strong influence.

    • 1

      Outlier: a data point which is unusual compared to the rest of the sample. It usually has a studentised residual that is very large in absolute value.

    • 2

      Influential observation: a data point which makes a larger than expected contribution to the estimate $\hat{\beta}$. It will have a large value of Cook’s distance.
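
The quantities in this summary can be computed directly from the hat matrix. The following is a minimal sketch in Python with NumPy, not part of the original notes. The function name `regression_diagnostics`, the simulated data and the planted unusual point are all hypothetical, and the studentised residuals are assumed to be the internally studentised version, $\hat{\epsilon}_i / (\hat{\sigma}\sqrt{1 - h_{ii}})$. The sketch checks numerically that the estimated residuals are orthogonal to the fitted values and to the columns of $X$, which is the property behind the independence result under normally distributed errors.

```python
import numpy as np


def regression_diagnostics(X, y):
    """Diagnostics for the linear model y = X beta + error.

    X is the n x p design matrix, including a column of ones for the intercept.
    The function name and return layout are illustrative assumptions.
    """
    n, p = X.shape
    # Hat matrix H = X (X^T X)^{-1} X^T; solve() avoids forming the inverse explicitly.
    H = X @ np.linalg.solve(X.T @ X, X.T)
    fitted = H @ y
    residuals = y - fitted
    # Leverages are the diagonal entries h_ii of the hat matrix.
    leverage = np.diag(H)
    # Unbiased estimate of the residual variance.
    sigma2_hat = residuals @ residuals / (n - p)
    # Internally studentised residuals (an assumed definition, see lead-in).
    studentised = residuals / np.sqrt(sigma2_hat * (1.0 - leverage))
    # Cook's distance written in terms of studentised residuals and leverages.
    cooks_distance = studentised**2 * leverage / (p * (1.0 - leverage))
    return {
        "H": H,
        "fitted": fitted,
        "residuals": residuals,
        "leverage": leverage,
        "studentised": studentised,
        "cooks_distance": cooks_distance,
    }


# Simulated data (hypothetical): a straight line with small normal errors.
rng = np.random.default_rng(0)
n = 30
x = np.linspace(0.0, 1.0, n)
x[-1] = 3.0  # far from the other x values, so high leverage
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=n)
y[-1] += 5.0  # and well off the line, so also an outlier

X = np.column_stack([np.ones(n), x])
diag = regression_diagnostics(X, y)

# Residuals are orthogonal to the fitted values and to each column of X.
print("residuals . fitted:", diag["residuals"] @ diag["fitted"])
print("X^T residuals:", X.T @ diag["residuals"])
# The planted point should stand out on both measures.
print("largest |studentised residual| at index", np.argmax(np.abs(diag["studentised"])))
print("largest Cook's distance at index", np.argmax(diag["cooks_distance"]))
```

In practice, a plot of the studentised residuals against the fitted values and a plot of Cook's distance against observation index would show the same information graphically.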