epidemiological studies: interest often lies in studying the effect of an exposure on the risk of developing disease
the study results may reflect the true effect of an exposure on the disease outcome; however, an alternative explanation for the findings should always be considered
a confounder is a third variable that is:
associated with the exposure of interest
independently associated with the risk of disease
a confounder explains partially or fully the relationship between exposure and disease
a confounder should not lie on the causal pathway between exposure and disease
consequences of confounding:
the creation of an apparent (i.e. spurious) relationship between exposure and disease
masking of a true relationship between exposure and disease (i.e. conceals a true relationship)
an overestimation or underestimation of a true effect
example: study into alcohol consumption and CHD
findings: increased level of alcohol consumption associated with increased risk of coronary heart disease (CHD)
confounder: smoking is a risk factor for CHD and is associated with alcohol consumption
on controlling for smoking there may be no association between alcohol and CHD
confounder: age? overestimation of effect
confounder: gender? underestimation of effect
see handout for illustration of confounding
a number of approaches can be applied to control for potential
confounders. Methods may be design-based or analysis-based
design based approaches:
randomisation: random allocation of exposures promotes balance over potential confounders
restriction: limits participation in the study to individuals who are similar in relation to the confounder (e.g. smokers only)
matching: selecting controls to be similar to cases in
terms of confounders (e.g. age, gender, smoking habits)
analysis based approaches:
stratification: examine exposure-disease associations within strata (e.g. age groups, smoking groups) and compute a pooled estimate of the association measure, adjusted for the confounding effect
standardisation: controlling confounding using an external population to adjust for age, gender etc., yielding standardised rates
multivariate analysis/regression models: include confounding variable in the model (model adjustment)
Stratification: allows the association between exposure and
outcome to be examined within strata of the confounding variable
assuming the association measures are relatively uniform they
may then be pooled to yield an adjusted estimate
Mantel-Haenszel methods most widely used
history: method first described for use in stratified
case-control studies
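2×2 tables: for each stratum $i = 1, \dots, k$ we can
tabulate (using the usual format) the results for the stratum-specific participants

Stratum $i$ | Diseased | Non-diseased |
---|---|---|
Exposed | $a_i$ | $b_i$ |
Not Exposed | $c_i$ | $d_i$ |

with $n_i = a_i + b_i + c_i + d_i$ participants in stratum $i$
assumption: common odds ratio across strata
idea:
weighted average of the strata odds ratios
weight equal to precision (inverse of the variance)
Mantel-Haenszel estimate of the common odds ratio: $\widehat{OR}_{MH} = \frac{\sum_i a_i d_i / n_i}{\sum_i b_i c_i / n_i}$
standard error for $\log \widehat{OR}_{MH}$: see for example Robins J et al. (1986) American Journal of Epidemiology 124: 719–723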
(Woodward 1999, Example 4.9)
Scottish Heart Health Study (SHHS), six-year follow-up of men
potential confounders: various lifestyle factors (in particular smoking)
Housing tenure | CHD: Yes | CHD: No | Risk |
---|---|---|---|
Rented | 85 | 1821 | 0.0446 |
Owner-occupied | 77 | 2400 | 0.0311 |
Odds ratio | 1.45 | | |
stratified analysis by smoking status
strata: smokers (1) and non-smokers (2)
exposure: living in rented accommodation versus owner-occupied
non-smokers:

Housing tenure | CHD: Yes | CHD: No | Risk |
---|---|---|---|
Rented | 33 | 923 | 0.0345 |
Owner-occupied | 48 | 1722 | 0.0271 |
Odds ratio | 1.28 | | |

smokers:

Housing tenure | CHD: Yes | CHD: No | Risk |
---|---|---|---|
Rented | 52 | 898 | 0.0547 |
Owner-occupied | 29 | 678 | 0.0410 |
Odds ratio | 1.35 | | |
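Mantel-Haenszel estimate of the common odds ratio: $\widehat{OR}_{MH} = \frac{33 \times 1722 / 2726 + 52 \times 678 / 1657}{923 \times 48 / 2726 + 898 \times 29 / 1657} = \frac{42.12}{31.97} = 1.32$
standard error of $\log \widehat{OR}_{MH}$ (computed from the two strata tables with the Robins et al. 1986 formula): approximately 0.165
approximate 95% confidence interval: $\exp(\log \widehat{OR}_{MH} \pm 1.96 \times SE)$, approximately (0.95, 1.82)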
conclusion: by strata the odds ratios are 1.28 (non-smokers)
and 1.35 (smokers). The adjusted estimate is 1.32 (a weighted
average of the two). The 95% confidence interval for the true odds
ratio spans one, hence the association between housing tenure and CHD
is not significant at the 5% level
some confounding is manifest here, since both stratum-specific odds ratios are lower than the crude odds ratio of 1.45. Since the reductions are small, the degree of confounding is small
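The stratified analysis above can be reproduced with a short script. This is a minimal sketch, not the handout's own code: the function names and the `(a, b, c, d)` cell layout are assumptions made for illustration, and the variance is the Robins et al. (1986) estimator for the log Mantel-Haenszel odds ratio as commonly stated.

```python
from math import exp, log, sqrt

# Each stratum is (a, b, c, d) with
#   a = exposed & diseased,   b = exposed & non-diseased,
#   c = unexposed & diseased, d = unexposed & non-diseased
# Counts are taken from the SHHS housing tenure tables.
strata = {
    "non-smokers": (33, 923, 48, 1722),
    "smokers": (52, 898, 29, 678),
}


def odds_ratio(a, b, c, d):
    """Odds ratio of a single 2x2 table."""
    return (a * d) / (b * c)


def mantel_haenszel(tables):
    """Return (OR_MH, SE of log OR_MH) over a sequence of 2x2 tables."""
    R = S = PR = PS_QR = QS = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        r, s = a * d / n, b * c / n          # numerator / denominator terms
        p, q = (a + d) / n, (b + c) / n      # used by the variance estimator
        R += r
        S += s
        PR += p * r
        PS_QR += p * s + q * r
        QS += q * s
    or_mh = R / S
    var_log = PR / (2 * R**2) + PS_QR / (2 * R * S) + QS / (2 * S**2)
    return or_mh, sqrt(var_log)


# Crude table: add the strata cell by cell.
crude = tuple(sum(t[i] for t in strata.values()) for i in range(4))
or_mh, se = mantel_haenszel(list(strata.values()))
lower = exp(log(or_mh) - 1.96 * se)
upper = exp(log(or_mh) + 1.96 * se)

print(f"crude OR: {odds_ratio(*crude):.2f}")
for name, table in strata.items():
    print(f"{name} OR: {odds_ratio(*table):.2f}")
print(f"MH OR: {or_mh:.2f}, 95% CI: ({lower:.2f}, {upper:.2f})")
```

Comparing the crude odds ratio with the Mantel-Haenszel estimate gives a quick check on the direction and size of the confounding.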
case-control studies can be unstratified or stratified
unstratified: choose controls randomly
stratified: match controls to cases according to
confounding variables
group matching (constant ratio of cases and controls
within broad strata)
individual matching: matching controls to each case
(e.g. 1:1 matching)
the rationale in a matched case-control study is to eliminate
confounders by design
matching is a design-based approach to controlling confounding
advantages
control confounders by elimination
gain in efficiency (depending on strength of confounder)
avoid/minimize selection bias (e.g. neighbourhood matching)
disadvantages
more complicated study design
it is not possible to study the effect of matching variables on
the outcome of interest: if you match on age, for example, you cannot
study the effect of age on the disease outcome
overmatching (e.g. matching variable strongly related to exposure, but not to disease)
we shall focus upon the analysis of matched pairs
for each case a control is selected matched on values of
confounders
in each matched pair we can classify the case and the control as
exposed or not exposed
we can then tabulate the frequencies of case-control pairs
History of case | History of control: Exposed | History of control: Unexposed |
---|---|---|
Exposed | $a$ | $b$ |
Unexposed | $c$ | $d$ |
the counts in the table represent the number of pairs, not
individuals
this corresponds to one 2×2 table for each pair
the tabulated values arise from the four possible tables for the case-control pairs
(1) | Case | Control | Total |
---|---|---|---|
Exposed | 1 | 1 | 2 |
Unexposed | 0 | 0 | 0 |
Total | 1 | 1 | 2 |

(2) | Case | Control | Total |
---|---|---|---|
Exposed | 1 | 0 | 1 |
Unexposed | 0 | 1 | 1 |
Total | 1 | 1 | 2 |

(3) | Case | Control | Total |
---|---|---|---|
Exposed | 0 | 1 | 1 |
Unexposed | 1 | 0 | 1 |
Total | 1 | 1 | 2 |

(4) | Case | Control | Total |
---|---|---|---|
Exposed | 0 | 0 | 0 |
Unexposed | 1 | 1 | 2 |
Total | 1 | 1 | 2 |
condition each stratum table on both margins
conditional on the margins, tables (1) and (4) are deterministic
only tables (2) and (3) are relevant
the values $b$ and $c$ in the pairs frequency table are the numbers of so-called
discordant pairs
although there are $a + b + c + d$ pairs in total, we are only interested in the pairs with discordant exposures (i.e. the discordant pairs)
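the maximum likelihood estimate of the exposure odds ratio is: $\widehat{OR} = b / c$, where $b$ counts the pairs with an exposed case and unexposed control and $c$ the pairs with an unexposed case and exposed control
with standard error (on the log scale): $SE(\log \widehat{OR}) = \sqrt{1/b + 1/c}$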
an approximate confidence interval can be computed for the true odds ratio and
interpreted in context of the disease odds ratio
objective: relationship between history of tonsillectomy and incidence of Hodgkin’s disease
case-control study: 85 pairs of cases and controls

History of case | History of control: Positive | History of control: Negative |
---|---|---|
Positive | 26 | 15 |
Negative | 7 | 37 |

odds ratio: $15 / 7 = 2.14$
standard error (log scale): $\sqrt{1/15 + 1/7} = 0.458$
approx. 95% CI: [0.872; 5.249]
more examples in the workshop
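The matched-pairs calculation for the tonsillectomy example can be checked with a few lines of Python. This is a sketch under stated assumptions: the function name `matched_pairs_or` is hypothetical, and the interval is the usual log-scale Wald interval built from the discordant pair counts.

```python
from math import exp, log, sqrt


def matched_pairs_or(b, c, z=1.96):
    """Conditional odds ratio for 1:1 matched pairs.

    b: pairs with exposed case and unexposed control
    c: pairs with unexposed case and exposed control
    Returns (odds ratio, SE of log odds ratio, lower limit, upper limit).
    """
    or_hat = b / c
    se_log = sqrt(1 / b + 1 / c)
    lower = exp(log(or_hat) - z * se_log)
    upper = exp(log(or_hat) + z * se_log)
    return or_hat, se_log, lower, upper


# Discordant pairs from the tonsillectomy table: 15 and 7.
or_hat, se_log, lower, upper = matched_pairs_or(b=15, c=7)
print(f"OR = {or_hat:.2f}, SE(log OR) = {se_log:.3f}")
print(f"95% CI: [{lower:.3f}; {upper:.3f}]")
```

Working with unrounded intermediate values can shift the third decimal of the limits slightly relative to a hand calculation that rounds the odds ratio and standard error first.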
individually matched studies
standard analysis ignoring the 1:1 matching is misleading
should use conditional analysis as above (discordant pairs)
several controls per case: arguments above may be extended
Mantel-Haenszel method: note that if you consider each matched pair
as a stratum you obtain the same estimator and standard error (exercise)
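A numerical check (not a proof) of the estimator part of this remark: treating each of the 85 tonsillectomy pairs as its own stratum and applying the Mantel-Haenszel formula gives the ratio of discordant pairs. The helper names below are hypothetical, and the `(a, b, c, d)` layout is the assumed exposed/unexposed by case/control layout of the per-pair tables.

```python
def mh_odds_ratio(tables):
    """Mantel-Haenszel odds ratio over 2x2 tables given as (a, b, c, d)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


def pair_table(case_exposed, control_exposed):
    """2x2 table for a single matched pair (one case, one control)."""
    a = 1 if case_exposed else 0      # exposed case
    b = 1 if control_exposed else 0   # exposed control
    c = 1 - a                         # unexposed case
    d = 1 - b                         # unexposed control
    return (a, b, c, d)


# Pair counts from the tonsillectomy table: (case exposed, control exposed).
pairs = (
    [(True, True)] * 26
    + [(True, False)] * 15
    + [(False, True)] * 7
    + [(False, False)] * 37
)

mh = mh_odds_ratio([pair_table(case, control) for case, control in pairs])
print(f"MH OR over {len(pairs)} pair strata: {mh:.3f}")
print(f"discordant pair ratio 15/7:       {15 / 7:.3f}")
```

Concordant pairs contribute zero to both sums, which is why only the discordant pairs matter.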
risk, risk difference, relative risk: cannot be estimated
in matched case-control studies
a principal role in epidemiology is to compare the incidence of
disease or mortality between two or more populations
comparing crude mortality rates can be misleading since
populations may differ with respect to confounders such as age and
gender which in turn will impact upon mortality
one approach is to simply produce rates by strata of the
confounder such as mortality rates by age group
when comparing a large number of populations over various strata
the data can become unmanageable
an alternative approach is to combine category specific rates in
such a way that the result is adjusted for the confounding factor: standardisation
standardisation is a process aimed at removing confounding
by choosing a ‘standard population’ with a known distribution
of the confounder (e.g. known age structure)
most common: age standardisation, age/sex standardisation
example: comparison of mortality rates between a seaside
resort and an industrialised town
there are two methods of standardisation commonly used in
epidemiology: direct and indirect
both methods require a ‘standard’
direct standardisation: the disease rates in the population
of interest (study population(s)) are applied to the ‘standard’ population
indirect standardisation: the disease rates in the
standard population are applied to the population of interest (study population(s))
both direct and indirect methods involve calculating expected
numbers of events (e.g. deaths) and comparing them to the observed
number of events
the most common ‘standard’ is based upon age strata or age/gender strata
direct standardised event rate: the expected event rate
in the ‘standard’ population if the age-specific event rates in the
study population prevailed
so: if adjusting for age, the category specific event rates for
each population being compared will be applied to a single standard population
standardised event ratios can then be calculated and regions compared
notation: standardising by age
$e_i$: observed no. of events in the $i$th age group of the study population
$p_i$: size of the $i$th age group of the study population
$P_i$: size of the $i$th age group of the standard population
$P = \sum_i P_i$: total size of standard population
direct age standardised event rate per 1,000: $1000 \times \frac{1}{P} \sum_i P_i \frac{e_i}{p_i}$
hypothetical example: see handout
note
strictly speaking it is a proportion, not a rate
however, average population size × study length (usually one year) ≈ person-years
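Direct standardisation can be sketched in a few lines. The numbers below are hypothetical and chosen only to make the arithmetic easy to follow; they are not the handout example. The names `events`, `study_sizes` and `standard_sizes` are assumptions standing for the observed events and group sizes in the study population and the group sizes in the standard population.

```python
def direct_standardised_rate(events, study_sizes, standard_sizes, per=1000):
    """Apply study-population age-specific rates to the standard population."""
    total_standard = sum(standard_sizes)
    # Expected events in the standard population if the study rates prevailed.
    expected = sum(
        size_std * e / size_study
        for e, size_study, size_std in zip(events, study_sizes, standard_sizes)
    )
    return per * expected / total_standard


# Hypothetical data for three age groups (young, middle, old).
events = [5, 20, 75]                      # deaths in the study population
study_sizes = [5000, 4000, 3000]          # study population by age group
standard_sizes = [40000, 35000, 25000]    # standard population by age group

crude = 1000 * sum(events) / sum(study_sizes)
dsr = direct_standardised_rate(events, study_sizes, standard_sizes)
print(f"crude rate per 1,000: {crude:.2f}")
print(f"directly standardised rate per 1,000: {dsr:.2f}")
```

The crude and standardised rates differ because the study and standard populations have different age structures.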
indirect methods are commonly used when age specific rates are
not available
the strata specific rates of the standard population are applied
to the strata of study population and the expected number of events calculated
assuming the standard population rate prevails
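methods give rise to a standardised event ratio (SER) or
standardised mortality ratio (SMR)
$SMR = O / E$, with $O$ the observed number of events in the study population and $E$ the expected number of events, obtained by applying the stratum-specific rates of the standard population to the strata of the study population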
the SMR is typically multiplied by 100 and expressed as a percentage
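A minimal sketch of the indirect calculation, again with hypothetical numbers rather than data from the notes. The function name and argument names are assumptions: the standard population supplies stratum-specific events and sizes, the study population supplies only its age structure and its total number of observed events.

```python
def standardised_mortality_ratio(observed_total, study_sizes,
                                 standard_events, standard_sizes):
    """Return (SMR, expected events) by indirect standardisation."""
    # Expected events if the standard population's rates applied
    # to each stratum of the study population.
    expected = sum(
        size_study * e_std / size_std
        for size_study, e_std, size_std
        in zip(study_sizes, standard_events, standard_sizes)
    )
    return observed_total / expected, expected


# Hypothetical data for three age groups.
study_sizes = [5000, 4000, 3000]          # study population by age group
standard_events = [80, 140, 500]          # deaths in the standard population
standard_sizes = [40000, 35000, 25000]    # standard population by age group
observed_total = 100                      # total deaths in the study population

smr, expected = standardised_mortality_ratio(
    observed_total, study_sizes, standard_events, standard_sizes
)
print(f"expected deaths: {expected:.1f}")
print(f"SMR: {100 * smr:.0f}%")
```

Only the total observed count is needed from the study population, which is why the indirect method remains usable when age-specific event counts are unavailable or small.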
direct standardisation requires age specific rates for all
populations being studied
the indirect method requires only the total number of cases in the study population
the ratio of observed to expected events is called the standardised incidence ratio (SIR) or the standardised mortality ratio (SMR)
indirect standardisation more frequently used
indirect standardisation more stable in case of small numbers of events
indirect standardisation requires age-specific rates for
standard population
the choice of standard population should be stated
clearly. Often the National standard population will
be used when comparing sub regions