Bayesians use the predictive distribution for model checking. In this case there are four points that would be highly unlikely under the model we have suggested.
The posterior p-value of a point is the probability of a future observation being at least as extreme as that observed in the data.
In our case the Poisson likelihood did not have enough variability to describe four of the points.
A cancer laboratory is estimating the rate of tumorigenesis in mice. They have tumor count data for 5 mice in a particular strain. The question of interest is whether the Poisson model is appropriate for these counts.
| Y | 1 | 2 | 0 | 4 | 8 |
|---|---|---|---|---|---|
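Assume a Poisson sampling distribution, $y_i \mid \theta \sim \text{Poisson}(\theta)$, and the following vague prior distribution: $\theta \sim \text{Gamma}(\alpha = 1, \beta = 0.1)$, with shape $\alpha$ and rate $\beta$.
Simulate from the posterior for $\theta$.
Simulate from the predictive and obtain a 95% CI.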
Are the points adequately described by a Poisson?
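The conjugate update used in the first step can be written out as a short derivation. This is a sketch that assumes the gamma prior is parameterized by shape $\alpha$ and rate $\beta$, which matches how `al` and `be` are passed to `rgamma` in the code; $n$ denotes the number of mice.

```latex
\begin{align*}
p(\theta \mid y) &\propto p(y \mid \theta)\, p(\theta) \\
&\propto \left( \prod_{i=1}^{n} \theta^{y_i} e^{-\theta} \right) \theta^{\alpha - 1} e^{-\beta \theta} \\
&= \theta^{\alpha + \sum_i y_i - 1}\, e^{-(\beta + n)\theta}
\end{align*}
```

This is the kernel of a gamma density, so $\theta \mid y \sim \text{Gamma}(\alpha + \sum_i y_i,\ \beta + n)$. With $\alpha = 1$, $\beta = 0.1$, $\sum_i y_i = 15$ and $n = 5$, the posterior is $\text{Gamma}(16, 5.1)$, which corresponds to `alp` and `bep` in the code.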
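# Use Bayes' theorem to find the conjugate (gamma) posterior of theta.
al <- 1; be <- 0.1               # prior shape and rate
y <- c(1, 2, 0, 4, 8)            # observed tumor counts
alp <- al + sum(y)               # posterior shape
bep <- be + length(y)            # posterior rate
# Simulate theta from the posterior, then y from the predictive.
theta <- rgamma(10000, alp, bep)
ys <- rpois(10000, theta)
# Find a 95% HPD interval for the predictive.
library(TeachingDemos)
CI <- emp.hpd(ys, conf = 0.95)
CI
# lower upper
#     0     7
# Plot predictive probabilities (not raw counts).
# Values inside the interval are red (2), the rest are white (0).
tab <- table(ys) / length(ys)
vals <- as.numeric(names(tab))
barplot(tab, main = "The predictive", ylab = "probability",
        col = ifelse(vals >= CI[1] & vals <= CI[2], 2, 0))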
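The posterior p-value defined at the start of these notes can be estimated from the same kind of simulation. The sketch below is one possible reading, not necessarily the calculation the notes intend: it assumes that "at least as extreme" means the upper tail, $P(\tilde{y} \ge y_i \mid y)$, and the seed and the name `p_upper` are illustrative choices.

```r
# Sketch: upper-tail posterior predictive p-values, P(y_new >= y_i | y).
set.seed(1)                          # assumed seed, for reproducibility only
al <- 1; be <- 0.1                   # prior shape and rate, as in the notes
y <- c(1, 2, 0, 4, 8)                # observed tumor counts
alp <- al + sum(y)                   # posterior shape
bep <- be + length(y)                # posterior rate
theta <- rgamma(10000, alp, bep)     # posterior draws of theta
ys <- rpois(10000, theta)            # posterior predictive draws
# For each observed count, the share of simulated counts at least as large.
p_upper <- sapply(y, function(obs) mean(ys >= obs))
names(p_upper) <- y
p_upper
```

Under this reading, a small value flags an observation that the fitted Poisson model rarely reproduces. Here the count of 8 receives a small tail probability (a few percent), consistent with it lying above the upper end of the 95% interval.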
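The likelihood, $p(y \mid \theta)$, is a function of $\theta$.
The prior, $p(\theta)$, is the probability of $\theta$ prior to data collection.
The joint distribution of $y$ and $\theta$, factorized as $p(y, \theta) = p(y \mid \theta)\, p(\theta)$.
The marginal likelihood, or evidence, can be obtained by integrating $\theta$ out from the joint distribution: $p(y) = \int p(y \mid \theta)\, p(\theta)\, d\theta$.
The posterior, $p(\theta \mid y)$, is the probability of the unknown $\theta$ upon consideration of the current data.
The prior predictive, $p(\tilde{y})$, is the probability of a future observation, $\tilde{y}$, before the data is looked at.
The posterior predictive, $p(\tilde{y} \mid y)$, is the probability of a future observation, $\tilde{y}$, given the data in hand, $y$.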
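For the tumor count example, the posterior predictive in this list has a closed form that can be used to check the simulation. With a Poisson likelihood and a gamma posterior with shape $a$ and rate $b$, integrating $\theta$ out gives a negative binomial distribution with size $a$ and success probability $b / (b + 1)$, which is a standard result. The sketch below compares it with simulated draws; the seed, the range `k` and the names `sim` and `exact` are illustrative assumptions.

```r
# Sketch: compare the simulated posterior predictive with its closed form.
set.seed(1)                          # assumed seed, for reproducibility only
al <- 1; be <- 0.1                   # prior shape and rate, as in the notes
y <- c(1, 2, 0, 4, 8)                # observed tumor counts
alp <- al + sum(y)                   # posterior shape
bep <- be + length(y)                # posterior rate
# Draw theta from the posterior, then a future count given each theta.
ys <- rpois(10000, rgamma(10000, alp, bep))
k <- 0:12                            # counts at which to compare
sim <- sapply(k, function(v) mean(ys == v))               # simulated probabilities
exact <- dnbinom(k, size = alp, prob = bep / (bep + 1))   # negative binomial pmf
round(cbind(k, sim, exact), 3)
```

The two probability columns should agree up to Monte Carlo error, which suggests that simulating $\theta$ and then $\tilde{y}$ is a sampling version of the integral over $\theta$.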