6 Model Choice

Relationship between LRT and AIC

The AIC can be applied more generally than the LRT, but what is the relationship between them when comparing two nested models?

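Let $\ell_0$ denote the maximised log-likelihood under the simpler of the models, with $p$ parameters, and $\ell_1$ the maximised log-likelihood under a more complicated model, with one extra parameter. Since 3.84 is the 95% point of the chi-squared distribution with one degree of freedom, the LRT at the 5% significance level would prefer the more complex model if

$$2(\ell_1 - \ell_0) > 3.84,$$

that is, if

$$\ell_1 > \ell_0 + 1.92.$$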

On the other hand, AIC would prefer the more complex model if

$$2(-\ell_1 + p + 1) < 2(-\ell_0 + p),$$

that is, if

$$\ell_1 > \ell_0 + 1.$$
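The two decision rules can be compared numerically. The following is a minimal sketch in Python, not part of the original notes: the function names and the log-likelihood values are hypothetical, chosen only for illustration. It assumes a 5% significance level for the LRT and uses the fact that the chi-squared quantile with one degree of freedom is the square of a standard normal quantile.

```python
from statistics import NormalDist


def lrt_prefers_complex(ll0, ll1, alpha=0.05):
    """LRT for one extra parameter: prefer the complex model if
    2 * (ll1 - ll0) exceeds the chi-squared(1) critical value."""
    # The upper-alpha chi-squared(1) quantile equals the square of the
    # upper-alpha/2 standard normal quantile (about 3.84 for alpha = 0.05).
    critical = NormalDist().inv_cdf(1 - alpha / 2) ** 2
    return 2 * (ll1 - ll0) > critical


def aic(ll, n_params):
    """AIC = 2 * (-log-likelihood + number of parameters)."""
    return 2 * (-ll + n_params)


def aic_prefers_complex(ll0, ll1, p):
    """AIC prefers the complex model (p + 1 parameters) if its AIC is smaller."""
    return aic(ll1, p + 1) < aic(ll0, p)


# Hypothetical values: simpler model with p = 3 and log-likelihood -100.
ll0, p = -100.0, 3
for gain in (0.5, 1.5, 2.5):
    ll1 = ll0 + gain
    print(gain, lrt_prefers_complex(ll0, ll1), aic_prefers_complex(ll0, ll1, p))
```

With a gain of 1.5 in log-likelihood, which lies between the two thresholds of 1 and 1.92, the criteria disagree: AIC selects the more complex model while the LRT retains the simpler one.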

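In other words, AIC accepts the extra parameter for a smaller gain in log-likelihood (more than 1, rather than more than 1.92), so it is more willing to accept a more complex model than the LRT. What should we make of this difference?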

The answer is that the two criteria have slightly different objectives, and place the burden of proof differently. The LRT says ‘I will only add an extra parameter if I am sure it improves the model’. The AIC says ‘I will pick whichever model has the best predictive properties’.

This highlights why model selection is a hard problem. The choice of model complexity should depend on our objectives, and even then arguments could be made for different criteria. These issues are explored further at MSc level, and could form a topic for a PhD thesis.