in a case-control study, sampling of cases and controls is based upon their disease status: diseased, $D$, or non-diseased, $\bar{D}$
information regarding exposure is then obtained retrospectively: $E$ (exposed) and $\bar{E}$ (not exposed)
let $p_1 = P(D \mid E)$ denote the risk of disease amongst those exposed and $p_0 = P(D \mid \bar{E})$ denote the risk of disease amongst the non-exposed
we are interested in estimating $p_1$ and $p_0$ and, in particular, comparing the disease risk for the exposed and non-exposed groups
in a case-control study it is not possible to estimate disease risks, differences in disease risk or relative risks, since sampling is undertaken based upon disease status
in case-control studies we can estimate the disease odds ratio:
$$\mathrm{OR} = \frac{p_1/(1-p_1)}{p_0/(1-p_0)}$$
note that if the disease is rare then $1 - p_1 \approx 1$ and $1 - p_0 \approx 1$
and the odds ratio approximates the relative risk: $\mathrm{OR} \approx p_1/p_0$
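A minimal numerical sketch of the rare-disease approximation. The risk values below are hypothetical, chosen only so that the relative risk is 2 in every scenario; the function names are illustrative and not part of the notes.

```python
def odds(p):
    """Convert a probability (risk) into odds."""
    return p / (1.0 - p)


def relative_risk(p1, p0):
    """Risk among the exposed divided by risk among the non-exposed."""
    return p1 / p0


def disease_odds_ratio(p1, p0):
    """Disease odds among the exposed divided by disease odds among the non-exposed."""
    return odds(p1) / odds(p0)


# Hypothetical (p1, p0) pairs: risk among exposed, risk among non-exposed.
# Each pair has a relative risk of 2; only the rarity of the disease changes.
scenarios = [(0.002, 0.001), (0.02, 0.01), (0.4, 0.2)]

for p1, p0 in scenarios:
    rr = relative_risk(p1, p0)
    or_ = disease_odds_ratio(p1, p0)
    print(f"p1={p1:<6} p0={p0:<6} RR={rr:.3f} OR={or_:.3f}")
```

The odds ratio stays close to the relative risk while both risks are small, and drifts away from it as the disease becomes common.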
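the outcome measure in a case-control study is the conditional exposure status of the cases and the controls: $P(E \mid D)$ and $P(E \mid \bar{D})$
using Bayes' theorem we have (and similarly for $\bar{E}$):
$$P(E \mid D) = \frac{P(D \mid E)\,P(E)}{P(D)}$$
and
$$P(E \mid \bar{D}) = \frac{P(\bar{D} \mid E)\,P(E)}{P(\bar{D})}$$
the exposure odds for cases (diseased) takes the form:
$$\frac{P(E \mid D)}{P(\bar{E} \mid D)} = \frac{P(D \mid E)\,P(E)}{P(D \mid \bar{E})\,P(\bar{E})}$$
and for a control (not diseased):
$$\frac{P(E \mid \bar{D})}{P(\bar{E} \mid \bar{D})} = \frac{P(\bar{D} \mid E)\,P(E)}{P(\bar{D} \mid \bar{E})\,P(\bar{E})}$$
the exposure odds ratio is the ratio of the two exposure odds:
$$\frac{P(E \mid D)/P(\bar{E} \mid D)}{P(E \mid \bar{D})/P(\bar{E} \mid \bar{D})} = \frac{P(D \mid E)/P(\bar{D} \mid E)}{P(D \mid \bar{E})/P(\bar{D} \mid \bar{E})}$$
fundamental relation: the exposure odds ratio is equal to the disease odds ratio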
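The relation can be checked numerically. The sketch below assumes a hypothetical population (the exposure prevalence and the two disease risks are made-up values), applies Bayes' theorem to obtain the exposure probabilities among the diseased and non-diseased, and compares the two odds ratios.

```python
import math

# Hypothetical population quantities (assumptions for illustration only).
p_e = 0.30               # P(E): prevalence of exposure
p_d_given_e = 0.05       # P(D | E): disease risk among the exposed
p_d_given_not_e = 0.02   # P(D | not E): disease risk among the non-exposed

# Joint probabilities of disease status and exposure status.
p_d_and_e = p_d_given_e * p_e
p_d_and_not_e = p_d_given_not_e * (1 - p_e)
p_not_d_and_e = (1 - p_d_given_e) * p_e

# Marginal probability of disease, then Bayes' theorem.
p_d = p_d_and_e + p_d_and_not_e
p_e_given_d = p_d_and_e / p_d                 # P(E | D)
p_e_given_not_d = p_not_d_and_e / (1 - p_d)   # P(E | not D)


def odds(p):
    """Convert a probability into odds."""
    return p / (1 - p)


exposure_or = odds(p_e_given_d) / odds(p_e_given_not_d)
disease_or = odds(p_d_given_e) / odds(p_d_given_not_e)

print(f"exposure odds ratio = {exposure_or:.4f}")
print(f"disease odds ratio  = {disease_or:.4f}")
```

The two printed values agree up to floating-point rounding, whatever values are chosen for the three inputs.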
suppose the exposure status of the cases (diseased) and controls (non-diseased) has been ascertained: exposed, $E$, and non-exposed, $\bar{E}$
the results of the case-control study can then be illustrated using the following general tabular representation:

| | Cases | Controls | Total |
|---|---|---|---|
| Exposed | a | b | a+b |
| Not exposed | c | d | c+d |
| Total | a+c | b+d | n |
rationale: measure the frequency of exposure in the case and control groups and calculate a measure of association
exposure odds ratio: $\dfrac{a/c}{b/d} = \dfrac{ad}{bc}$
disease odds ratio: $\dfrac{a/b}{c/d} = \dfrac{ad}{bc}$
we can thus calculate the exposure odds ratio and a corresponding 95% confidence interval for the true exposure odds ratio and interpret in terms of the disease odds ratio, comparing exposed to non-exposed groups
study into alcohol consumption and laryngeal cancer (a relatively rare condition)
| | Cases | Controls | Total |
|---|---|---|---|
| Alcohol | 160 | 90 | 250 |
| No Alcohol | 40 | 110 | 150 |
| Total | 200 | 200 | 400 |
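estimate the exposure odds ratio = disease odds ratio: $\widehat{\mathrm{OR}} = \dfrac{160 \times 110}{90 \times 40} = 4.89$
compute the standard error of the natural logarithm of the odds ratio: $\mathrm{SE}(\log \widehat{\mathrm{OR}}) = \sqrt{\dfrac{1}{160} + \dfrac{1}{90} + \dfrac{1}{40} + \dfrac{1}{110}} = 0.227$
compute a confidence interval for the logarithm of the odds ratio and then back-transform (exponentiate): 95% CI: $\log(4.89) \pm 1.96 \times 0.227 = (1.14, 2.03)$; exponentiate to give: $(3.13, 7.63)$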
conclusion: the odds of laryngeal cancer were estimated to be 4.9 times as high amongst those exposed (i.e. consuming alcohol) as amongst those who abstain (no alcohol). We are 95% confident that the true disease odds ratio lies between 3.13 and 7.63. The interval does not contain the value 1 and thus the increase in odds is statistically significant (at the 5% level). Since laryngeal cancer is relatively rare, the odds ratio also approximates the relative risk
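The calculation for the alcohol and laryngeal cancer table can be sketched in a few lines of Python. The function name `odds_ratio_ci` and its argument order (a, b, c, d as in the general table: exposed cases, exposed controls, non-exposed cases, non-exposed controls) are choices made for this illustration; the interval is the usual large-sample interval on the log scale, with 1.96 assumed as the normal critical value for 95% coverage.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio ad/bc with a large-sample confidence interval.

    a, c: exposed and non-exposed cases
    b, d: exposed and non-exposed controls
    z:    normal critical value (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the natural log of the odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    # Interval on the log scale, then back-transform by exponentiating.
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, se_log_or, lower, upper


# Counts from the alcohol / laryngeal cancer table.
or_hat, se, lo, hi = odds_ratio_ci(a=160, b=90, c=40, d=110)
print(f"OR = {or_hat:.2f}, SE(log OR) = {se:.3f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

Run on the table above, this reproduces the odds ratio of about 4.89 and the interval (3.13, 7.63) quoted in the conclusion.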
the number of available cases is often limited and so the cases are often a complete enumeration of the diseased (i.e. all known disease cases in a region in a specified time period)
controls (disease free) may be more readily available and thus we can usually choose the number of controls to include
we can increase the precision of our estimate of the disease odds ratio by increasing the number of controls (i.e. increasing $b$ and $d$ in the standard error formula for the log odds ratio)
increasing beyond about 5 controls per case yields little additional benefit, since the contribution of the cases to the standard error is unaffected
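The diminishing return from extra controls can be illustrated numerically. This sketch makes two assumptions that are not in the notes: the case counts are fixed at those of the alcohol example (160 exposed, 40 non-exposed), and the proportion of controls exposed stays at 0.45 however many controls are recruited, so b and d may be non-integer.

```python
import math


def se_log_or(a, b, c, d):
    """Standard error of the log odds ratio for a 2x2 table."""
    return math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)


# Assumed fixed case counts (taken from the alcohol example).
cases_exposed, cases_unexposed = 160, 40
n_cases = cases_exposed + cases_unexposed

# Assumed constant exposure proportion among controls (90/200 in the example).
control_exposure_fraction = 0.45


def se_for_ratio(k):
    """SE of the log odds ratio with k controls per case."""
    n_controls = k * n_cases
    b = control_exposure_fraction * n_controls        # exposed controls
    d = (1 - control_exposure_fraction) * n_controls  # non-exposed controls
    return se_log_or(cases_exposed, b, cases_unexposed, d)


# Lower bound on the SE: the part contributed by the cases alone.
se_floor = math.sqrt(1 / cases_exposed + 1 / cases_unexposed)

for k in (1, 2, 3, 4, 5, 10, 20):
    print(f"{k:>2} controls per case: SE = {se_for_ratio(k):.4f}")
print(f"limit as controls grow without bound: SE = {se_floor:.4f}")
```

Under these assumptions most of the reduction in the standard error is achieved by the first few controls per case, and the standard error can never fall below the floor set by the number of cases.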
controls may need to be screened to avoid inadvertent inclusion of cases
controls may be a simple random sample from the disease free population at risk or may be matched to the cases
matching is used to handle known confounders
we shall consider 1:1 matching later this week