3 Introduction to Survival Analysis

3.2 Comparing survival distributions between subgroups

  • often we are interested in comparing the survival of two (or more) groups, for example two treatment groups, males versus females, or smokers versus non-smokers

  • the usual two-group methods (e.g. t-tests to compare group means) are not valid because of censoring: for censored subjects the survival time is only partially observed

  • separate Kaplan-Meier plots with confidence intervals can be used to investigate groups informally

  • example: lung cancer data, examining survival outcome by gender

Lung cancer survival by gender

  • males: red (solid) curve; females: blue (dashed) curve

    [Figure: Kaplan-Meier survival curves with confidence intervals for the lung cancer data, by gender]

  • comments: survival appears, on average, to be longer in females, although there is some overlap at the upper limit of the confidence interval

  • question: potential confounders?
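A plot of this kind could be produced along the following lines. This is a minimal sketch, assuming the lung data set shipped with the R survival package (time in days, sex coded 1 = male, 2 = female); the object name km_fit, the axis labels and the legend position are illustrative choices, not taken from the notes.

```r
library(survival)

# Kaplan-Meier estimate of the survival function, one curve per sex
km_fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Males: red solid curve; females: blue dashed curve; with confidence intervals
plot(km_fit, conf.int = TRUE,
     col = c("red", "blue"), lty = c(1, 2),
     xlab = "Days", ylab = "Survival probability")
legend("topright", legend = c("Male", "Female"),
       col = c("red", "blue"), lty = c(1, 2))

# Printing the fit gives group sizes, event counts and median survival times
print(km_fit)
```

Setting conf.int = FALSE in the plot() call drops the confidence bands, which can help when the curves are hard to tell apart.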

Comparing two groups: the log-rank test

  • a formal comparison can be made using the log-rank test

  • the null hypothesis is that the survival distributions are equal for the sub-groups (i.e. no difference in survival)

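  • let $t_j$ denote the observed event times and $d_j$ the number of events at time $t_j$

  • further, let $n_j$ denote the number at risk at time $t_j$ (e.g. alive and still under observation just before $t_j$), of which $n_{j1}$ are in group 1 and $n_{j2}$ are in group 2

  • if there is no difference between the groups, the expected number of events in group $k$ at time $t_j$ is: $E_{jk} = d_j n_{jk} / n_j$

  • we actually observe: $O_{jk} = d_{jk}$, $k = 1, 2$, where $d_{jk}$ is the number of events in group $k$ at time $t_j$

  • summing over the failure times for the two groups gives $E_k = \sum_j E_{jk}$ and $O_k = \sum_j O_{jk}$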

  • the log-rank test statistic:

    $$X^2 = \sum_{k=1}^{2} \frac{(O_k - E_k)^2}{E_k} \sim \chi^2_1$$
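The observed and expected counts can be built up by hand to see how the statistic is assembled. The sketch below assumes the lung data set from the R survival package, where status is coded 1 = censored and 2 = dead; the variable names (event, tj, O, E, X2) are illustrative and simply mirror the notation above.

```r
library(survival)

# Event indicator: in the lung data, status 2 = dead, status 1 = censored
event <- lung$status == 2
time  <- lung$time
group <- lung$sex                      # 1 = male, 2 = female

tj <- sort(unique(time[event]))        # distinct observed event times t_j

O <- c(0, 0)                           # observed events per group, O_k
E <- c(0, 0)                           # expected events per group, E_k
for (tt in tj) {
  at_risk <- time >= tt                # at risk just before time t_j
  nj <- sum(at_risk)                   # n_j
  dj <- sum(event & time == tt)        # d_j
  for (k in 1:2) {
    njk <- sum(at_risk & group == k)             # n_jk
    djk <- sum(event & time == tt & group == k)  # d_jk
    O[k] <- O[k] + djk
    E[k] <- E[k] + dj * njk / nj                 # E_jk = d_j * n_jk / n_j
  }
}

# Test statistic in the form given above, referred to chi-squared on 1 df
X2 <- sum((O - E)^2 / E)
p  <- pchisq(X2, df = 1, lower.tail = FALSE)
print(rbind(O = O, E = E))
print(c(X2 = X2, p = p))
```

Note that survdiff() bases its reported chi-squared value on a variance-based form of the statistic (the (O-E)^2/V column of its output), so the value computed here may differ slightly from the one it prints.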

Lung cancer survival by gender: log-rank test using R

  • the function survdiff(), from the survival package, conducts the log-rank test in R

  • the command and its output are shown below

    
    > library(survival)
    > survdiff(Surv(time, status) ~ sex, data = lung)
    Call:
    survdiff(formula = Surv(time, status) ~ sex,data = lung)
    
            N Observed Expected (O-E)^2/E (O-E)^2/V
    sex=1 138      112     91.6      4.55      10.3
    sex=2  90       53     73.4      5.68      10.3
    
     Chisq= 10.3  on 1 degrees of freedom, p= 0.00131
    
    
  • the log-rank test result indicates a statistically significant difference in survival between male and female lung cancer patients (p = 0.0013)

  • comments?
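The pieces of the printed output can also be pulled out of the fitted object for further use. This is a small sketch, assuming the components obs, exp, n and chisq of the object returned by survdiff(); the names lr_fit and p_value are illustrative.

```r
library(survival)

lr_fit <- survdiff(Surv(time, status) ~ sex, data = lung)

# Observed and expected numbers of events in each group
print(lr_fit$obs)
print(lr_fit$exp)

# Chi-squared statistic and its p-value on (number of groups - 1) df
df      <- length(lr_fit$n) - 1
p_value <- pchisq(lr_fit$chisq, df = df, lower.tail = FALSE)
print(c(chisq = lr_fit$chisq, df = df, p = p_value))
```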

Comparing more than two groups

  • Kaplan-Meier curves can be obtained for more than two sub-groups and survival compared informally

  • it may be preferable not to add the confidence intervals since the plots can become confusing

  • the log-rank test can also be used to compare more than two groups

  • more generally, a model can be fitted to the data so that potential confounders can be accounted for

  • the Cox proportional hazards model is commonly used to flexibly model covariate effects on the hazard function
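The points above might be sketched in R as follows. The choice of grouping variable (the ECOG performance score ph.ecog in the lung data) and of age as an adjustment covariate are assumptions made purely for illustration, as are the object names; they are not taken from the notes.

```r
library(survival)

# More than two groups: Kaplan-Meier curves by ECOG performance score.
# Rows with a missing ph.ecog are dropped by the default na.action.
km_ecog <- survfit(Surv(time, status) ~ ph.ecog, data = lung)
plot(km_ecog, conf.int = FALSE,               # no confidence bands, for clarity
     col = seq_along(km_ecog$strata),
     xlab = "Days", ylab = "Survival probability")
legend("topright", legend = names(km_ecog$strata),
       col = seq_along(km_ecog$strata), lty = 1)

# Log-rank test across all groups (degrees of freedom = number of groups - 1)
lr_ecog <- survdiff(Surv(time, status) ~ ph.ecog, data = lung)
print(lr_ecog)

# Cox proportional hazards model: effect of sex adjusted for age
cox_fit <- coxph(Surv(time, status) ~ sex + age, data = lung)
print(summary(cox_fit))
```

In the Cox model summary, exp(coef) is the estimated hazard ratio for each covariate, holding the other covariates fixed.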