3 Week 3: Bayesian statistics: Prediction

3.2 The predictive for model checking

The posterior p-value

Bayesians use the predictive for model checking. In this case there are four points that would be highly unlikely under the model we have suggested.


The posterior p-value of a point is the probability of a future observation being at least as extreme as that observed in the data.
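In symbols, writing $y^{*}$ for a draw from the posterior predictive and $y_{\text{obs}}$ for the point being checked (notation added here for clarity), the one-sided version for a surprisingly large count is

$$ p = \Pr\left(y^{*} \ge y_{\text{obs}} \mid x\right) = \int \Pr\left(y^{*} \ge y_{\text{obs}} \mid \theta\right) h(\theta \mid x)\, d\theta. $$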

In our case the Poisson likelihood did not have enough variability to describe four of the points.

3.2.1 Example 1: Using the predictive for model checking


Example

A cancer laboratory is estimating the rate of tumorigenesis in mice. They have tumor count data for 5 mice in a particular strain. The question of interest is whether the Poisson model is appropriate for these counts.

$Y$: 1, 2, 0, 4, 8

Assume a Poisson sampling distribution and the following vague prior distribution: $\theta \sim \text{Gamma}(1, 0.1)$.

  1. Simulate from the posterior for $\theta$ (the conjugate update is derived just below).

  2. Simulate from the predictive and obtain a 95% CI.

  3. Are the points adequately described by a Poisson?
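Before simulating, it is worth writing the posterior down explicitly. For a Poisson likelihood with a Gamma$(\alpha, \beta)$ prior the update is conjugate; sketching the standard calculation:

$$ h(\theta \mid y) \propto \pi(\theta) f(y \mid \theta) \propto \theta^{\alpha - 1} e^{-\beta\theta} \prod_{i=1}^{n} \theta^{y_i} e^{-\theta} = \theta^{\alpha + \sum_i y_i - 1} e^{-(\beta + n)\theta}, $$

so $\theta \mid y \sim \text{Gamma}\left(\alpha + \sum_i y_i,\ \beta + n\right)$. With $\alpha = 1$, $\beta = 0.1$, $\sum_i y_i = 15$ and $n = 5$ this is $\text{Gamma}(16, 5.1)$, which is the posterior simulated in the code below.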


# Use Bayes' theorem to find the conjugate posterior for theta:
# a Gamma(al, be) prior with a Poisson likelihood gives a
# Gamma(al + sum(y), be + n) posterior.
al = 1; be = 0.1
y = c(1, 2, 0, 4, 8)
alp = al + sum(y)
bep = be + length(y)

# Simulate theta from the posterior, then future observations
# from the posterior predictive.
theta = rgamma(10000, alp, bep)
ys = rpois(10000, theta)

# Find a 95% HPD interval for the predictive.
library(TeachingDemos)
CI = emp.hpd(ys, conf = 0.95)
CI
# lower upper
#     0     7

# Plot the predictive as relative frequencies, colouring the bars
# inside the HPD interval (0 to 7) red.
barplot(table(ys) / length(ys), main = "The predictive",
        ylab = "probability", col = c(rep(2, 8), rep(0, 5)))


Figure 3.5: The predictive for the tumour data. The bars coloured red form a 95% CI. Note that the point y = 8 lies outside this interval and is thus poorly explained by the Poisson distribution.
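The posterior p-value for the flagged point can be read straight off the simulation. A minimal sketch, reusing the predictive draws ys generated above:

# Posterior p-value for y = 8: the proportion of posterior predictive
# draws at least as extreme as the observed count.
mean(ys >= 8)  # small (around 0.02 to 0.03 here), matching the plot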

The ingredients of Bayesian statistics

The likelihood: $f(x \mid \theta) = \prod_{i=1}^{n} f(x_i \mid \theta)$

is a function of $\theta$.

The prior: $\pi(\theta)$

is the probability of $\theta$ prior to data collection.

The joint distribution: $h(x; \theta)$

of $\theta$ and $x$, factorised as $h(x; \theta) = f(x \mid \theta)\, \pi(\theta)$.

The marginal likelihood: $m(x)$

or evidence, can be obtained by integrating out $\theta$ from the joint distribution: $m(x) = \int_{\theta} \pi(\theta) f(x \mid \theta)\, d\theta$.
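As a concrete instance, in the Poisson-Gamma model of Example 1 this integral has a closed form; the standard computation (added here for illustration) gives

$$ m(y) = \int_{0}^{\infty} \prod_{i=1}^{n} \frac{\theta^{y_i} e^{-\theta}}{y_i!} \cdot \frac{\beta^{\alpha}}{\Gamma(\alpha)} \theta^{\alpha - 1} e^{-\beta\theta}\, d\theta = \frac{\beta^{\alpha}}{\Gamma(\alpha) \prod_i y_i!} \cdot \frac{\Gamma\left(\alpha + \sum_i y_i\right)}{(\beta + n)^{\alpha + \sum_i y_i}}. $$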

The posterior distribution: $h(\theta \mid x)$

is the probability of the unknown upon consideration of the current data.

The prior predictive distribution: $f(x^{*})$

is the probability of a future observation, $x^{*}$, before the data is looked at.

The posterior predictive distribution: $f(x^{*} \mid x)$

is the probability of a future observation, $x^{*}$, given the data in hand, $x$.
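For the Poisson-Gamma model this posterior predictive has a closed form: a negative binomial with size equal to the posterior shape and probability $\beta'/(\beta' + 1)$, where $\beta'$ is the posterior rate. This is a standard identity rather than something stated in the notes, and it gives a quick check on the two-step simulation used in Example 1:

# Two equivalent ways to draw from the Poisson-Gamma posterior
# predictive: composition (theta, then y*) versus the closed-form
# negative binomial.
alp = 16; bep = 5.1            # posterior parameters from Example 1
theta = rgamma(10000, alp, bep)
ys.comp = rpois(10000, theta)  # composition sampling
ys.nb = rnbinom(10000, size = alp, prob = bep / (bep + 1))
c(mean(ys.comp), mean(ys.nb)) # both close to alp / bep, about 3.14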