2 Second Chapter

2.4 Analysis of cohort studies: binary exposure

  • consider a cohort whose members have been categorised by exposure status (E: exposed / $\bar{E}$: non-exposed) and by disease status after follow-up (D: diseased / $\bar{D}$: non-diseased)

  • the results of the study can then be illustrated via the tabular representation:

                     Disease status
                 Diseased   Non-diseased   Total
    Exposed         a            b          a+b
    Not exposed     c            d          c+d
    Total          a+c          b+d          n

  • what is the estimated risk of disease in the exposed group?

  • what is the estimated risk of disease in the unexposed group?

  • how might we compare the risk of disease for the exposed and non-exposed?

Measures of association: cohort studies

Let $p_1$ and $p_0$ denote the respective disease risks (event probabilities) for the exposed and non-exposed groups.

  • The risk difference, $p_1 - p_0$, is estimated by:

    \[ \mathrm{RD} = \frac{a}{a+b} - \frac{c}{c+d} \]

  • The relative risk, $p_1 / p_0$, is estimated by:

    \[ \mathrm{RR} = \frac{a/(a+b)}{c/(c+d)} \]

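  • The disease odds, $p/(1-p)$, are the ratio of the probability of 'success' (here death) to the probability of 'failure' (here survival). The odds ratio comparing the exposed and non-exposed groups is estimated by:

    \[ \mathrm{OR} = \frac{p_1}{1-p_1} \div \frac{p_0}{1-p_0} = \frac{a/b}{c/d} = \frac{ad}{bc} \]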
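
The three estimators above can be sketched in Python as follows. This is a minimal illustration rather than part of the original notes: the names `Measures` and `association_measures` are hypothetical, and the counts in the usage lines are made up so that the arithmetic is easy to check by hand.

```python
from typing import NamedTuple


class Measures(NamedTuple):
    risk_difference: float
    relative_risk: float
    odds_ratio: float


def association_measures(a: int, b: int, c: int, d: int) -> Measures:
    """Estimate RD, RR and OR from a 2x2 cohort table.

    a, b: diseased / non-diseased counts among the exposed
    c, d: diseased / non-diseased counts among the non-exposed
    Hypothetical helper; assumes b, c and both row totals are non-zero.
    """
    p1 = a / (a + b)  # estimated risk in the exposed group
    p0 = c / (c + d)  # estimated risk in the non-exposed group
    return Measures(
        risk_difference=p1 - p0,
        relative_risk=p1 / p0,
        odds_ratio=(a * d) / (b * c),
    )


# Illustrative (made-up) counts: 20 of 100 exposed and 10 of 100
# non-exposed subjects develop the disease.
m = association_measures(a=20, b=80, c=10, d=90)
print(m)
```

With these counts the risks are 0.2 and 0.1, so the risk difference is 0.1, the relative risk is 2 and the odds ratio is (20 × 90)/(80 × 10) = 2.25. The odds ratio exceeds the relative risk here because the disease is not rare in this made-up table.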

Confidence Intervals for a risk difference, relative risk and the odds ratio of disease

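  • Intervals are based upon the Normal approximation; a $100(1-\alpha)\%$ CI is:

    \[ \hat{\theta} \pm z_{1-\alpha/2}\,\mathrm{SE}(\hat{\theta}) \]

    where $\hat{\theta}$ is the estimate (taken on the natural log scale for the relative risk and the odds ratio)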

  • risk difference:

    \[ \mathrm{SE}(\mathrm{RD}) = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_0(1-p_0)}{n_0}} \]

    with $p_1 = \frac{a}{a+b}$, $p_0 = \frac{c}{c+d}$, $n_1 = a+b$ and $n_0 = c+d$.

  • relative risk (natural log scale):

    \[ \mathrm{SE}(\log_e(\mathrm{RR})) = \sqrt{\frac{1}{a} - \frac{1}{a+b} + \frac{1}{c} - \frac{1}{c+d}} \]

  • odds ratio (natural log scale):

    \[ \mathrm{SE}(\log_e(\mathrm{OR})) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}} \]
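
As a rough sketch of how these standard errors and intervals might be computed, the Python below implements each formula directly. All function names are hypothetical and the counts in the usage lines are made up for illustration. The Normal quantile comes from the standard library's `statistics.NormalDist`, and the ratio measures are assumed to be handled on the log scale and then exponentiated.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def se_risk_difference(a, b, c, d):
    """SE of the risk difference from the 2x2 table counts."""
    n1, n0 = a + b, c + d
    p1, p0 = a / n1, c / n0
    return sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)


def se_log_relative_risk(a, b, c, d):
    """SE of log(RR); assumes a and c are non-zero."""
    return sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))


def se_log_odds_ratio(a, b, c, d):
    """SE of log(OR); assumes all four cells are non-zero."""
    return sqrt(1 / a + 1 / b + 1 / c + 1 / d)


def wald_interval(estimate, se, alpha=0.05):
    """Normal-approximation interval: estimate +/- z_{1-alpha/2} * SE."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate - z * se, estimate + z * se


def ratio_interval(ratio, se_log, alpha=0.05):
    """Interval for a ratio measure: build it on the log scale, then exponentiate."""
    lo, hi = wald_interval(log(ratio), se_log, alpha)
    return exp(lo), exp(hi)


# Illustrative (made-up) counts: a=20, b=80, c=10, d=90.
counts = (20, 80, 10, 90)
rd_ci = wald_interval(0.2 - 0.1, se_risk_difference(*counts))
rr_ci = ratio_interval(0.2 / 0.1, se_log_relative_risk(*counts))
or_ci = ratio_interval(2.25, se_log_odds_ratio(*counts))
print(rd_ci, rr_ci, or_ci)
```

Because the ratio intervals are symmetric on the log scale, they are asymmetric around the estimate after back-transformation: the product of the two limits equals the square of the point estimate.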

Example: cohort study

  • Study of all-cause mortality at 7 years among smoking and non-smoking doctors

                     Disease status
                   Died     Alive     Total
    Smokers         133     25636     25769
    Non-smokers       3      5436      5439
    Total           136     31072     31208

  • Study aim: to compare the ‘all cause’ mortality risk amongst smokers and non-smokers

  • Prospective cohort study: we can investigate the association using either the risk difference, the relative risk or the odds ratio

  • Example: relative risk (smokers compared to non-smokers)

    \[ \mathrm{RR} = \frac{133/(133+25636)}{3/(3+5436)} = 9.357 \]

  • RR on the natural log scale: $\log_e(9.357) = 2.236$

Example: cohort study continued

In addition to estimating the relative risk, we should construct a confidence interval for the true relative risk:

  • first compute the standard error of the natural logarithm of the relative risk:

    \[ \mathrm{SE}(\log_e(\mathrm{RR})) = \sqrt{\frac{1}{133} - \frac{1}{25769} + \frac{1}{3} - \frac{1}{5439}} = 0.584 \]

  • then compute a 95% confidence interval for the logarithm of the relative risk:

    \[ 2.236 \pm 1.96 \times 0.584 = [1.091;\ 3.381] \]

  • then back transform (exponentiate) to the relative risk scale:

    \[ [\exp(1.091),\ \exp(3.381)] = [2.977,\ 29.400] \]

  • Conclusion: the estimated ‘all cause’ mortality risk amongst smokers was 9.36 times that of non-smokers. We are 95% confident that the true relative risk lies between 2.98 and 29.40; since this interval does not contain one, we have significant evidence (at the 5% level) of increased mortality risk amongst smokers compared with non-smokers.

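  • Note that the interval is very wide, so there is large uncertainty about the true magnitude of the relative risk. The width is driven mainly by the small number of deaths (3) among non-smokers, which contributes the dominant 1/3 term to the standard error.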
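
The worked example can be checked with a few lines of Python. This is a sketch that simply re-applies the formulas above to the counts in the table; the variable names are illustrative, and z = 1.96 is taken from the hand calculation. Because the code carries full precision through every step, while the hand calculation rounds at each stage, the upper limit comes out at roughly 29.37 rather than 29.40.

```python
from math import exp, log, sqrt

# Counts from the doctors' smoking table.
a, b = 133, 25636  # smokers: died, alive
c, d = 3, 5436     # non-smokers: died, alive

# Relative risk and its natural logarithm.
rr = (a / (a + b)) / (c / (c + d))
log_rr = log(rr)

# Standard error of log(RR).
se_log_rr = sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))

# 95% interval on the log scale, then back-transformed.
z = 1.96  # 97.5% point of the standard Normal, as used in the hand calculation
lower = exp(log_rr - z * se_log_rr)
upper = exp(log_rr + z * se_log_rr)

print(f"RR = {rr:.3f}, log RR = {log_rr:.3f}, SE = {se_log_rr:.3f}")
print(f"95% CI for RR: [{lower:.2f}, {upper:.2f}]")
```

The conclusion is unchanged by the rounding difference: the interval still excludes one and remains very wide.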