Censoring is a term used to indicate that a measurement is incomplete or only partially observed. Even when an event cannot be observed directly, the fact that it is censored still provides useful information. For example, terminally ill patients in a survival trial receive a treatment to slow the progression of their illness. The survival times of patients who are still alive at the end of the study are censored: each is known only to exceed the length of the study.
There are three main reasons why censoring may occur:
Events that are entirely independent of the study make it impossible to observe the event of interest. For example, the fire alarm goes off and so the experiment is cut short.
Limitations of the experimental design. For example, there is a time constraint on the experiment, after which it must end.
Censoring occurs as a consequence of the experiment. For example, a patient in a clinical trial has an adverse reaction to the treatment and so must be removed from the study for their safety.
The first two reasons for censoring are independent of the event of interest. In the third case, by contrast, the reason for censoring depends on the study itself; this is known as informative censoring. This type of censoring requires further consideration and so is not discussed in this chapter.
Remark. Censoring is not the same as truncation. When a measurement is censored, the event of interest could have occurred, but for some reason it was not possible to observe it. By contrast, if some constraint prevents a particular event from happening at all, whether observed or not, then this is truncation. For example, the number of students who can attend a lecture is limited by the number of seats available in the room, say 100 seats. If we are interested in checking that attendance is at least 80%, then the lecturer need only count up to 80 students. So the maximum number of students the lecturer could count is 100 (truncation), but she need not count beyond 80 students (the censoring point).
Exercise 8.60
For the following scenarios, state whether the event has been censored.
A river depth measurement cannot be recorded because the depth is greater than the length of the meter ruler.
A participant in a clinical trial cannot attend regular meetings with the doctor because they have moved overseas.
Parts of a multiple-choice questionnaire on household income have not been filled in.
120 marks are available on an exam paper, but the number of marks awarded is capped at 100.
When discussing censoring, it is important to clarify the direction of censoring:
Right censoring – The value of the event is greater than a specified value but it is unknown by how much. For example, the survival times of patients who are alive at the end of the study are right censored.
Left censoring – The value of the event is lower than a specified value but it is unknown by how much. For example, in child development studies, the time for a child to learn a particular task is left censored if they can already perform the task when they enter the study.
Interval censoring – The observed events occur within a specified interval; values smaller than the lower bound are left censored and values greater than the upper bound are right censored. For example, Figure 8.3 depicts a line-up of three prisoners where the height of the first prisoner is left censored at 4 feet, the height of the second is observed at 5.75 feet and the height of the third is right censored at 7 feet.
Suppose you are to undertake a product quality study to investigate the lifetime of AA batteries. The experiment involves purchasing off-the-shelf batteries made by the same manufacturer from a variety of stores and putting them in a device that drains each battery under a constant 50 mA current. The time taken to deplete the battery is recorded. However, the device you are using is in high demand and so you must complete your experiment within 24 hours.
Let $T_i$, for $i = 1, 2, \ldots, n$, denote the lifetime of the $i$th battery, and suppose that the $T_i$ are independent and identically distributed random variables with probability density function $f(t \mid \theta)$. If the lifetime is observed to be $t_i$, then its contribution to the likelihood function is $f(t_i \mid \theta)$. By contrast, if the lifetime is censored at $c$ (here the known fixed time of 24 hours), then we do not know the true lifetime, only that it is greater than $c$; i.e. it is right censored at $c$.
Exercise 8.61
Which is the correct contribution to the likelihood from the right-censored event $T_i > c$ (the lifetime of the $i$th battery exceeding the censoring time $c$):
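Let $\delta_i$ be an indicator of whether the $i$th battery lifetime was observed or censored: $\delta_i = 1$ if the lifetime $t_i$ was observed, and $\delta_i = 0$ if it was censored at $c$.
The likelihood for the sample data is given by:
$$L(\theta) = \prod_{i=1}^{n} f(t_i \mid \theta)^{\delta_i} \, \left[ 1 - F(c \mid \theta) \right]^{1 - \delta_i},$$
where $F$ is the distribution function corresponding to the density $f$.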
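A right-censored likelihood of this kind can be sketched in Python: each observed lifetime contributes its log-density, and each censored lifetime contributes the log-probability of surviving beyond the censoring time. The function name `censored_log_likelihood`, the exponential model and the rate of 0.1 are illustrative assumptions, not part of the original text.

```python
import math

def censored_log_likelihood(times, observed, log_pdf, log_survival):
    """Log-likelihood for right-censored data.

    times[i] is the recorded time for unit i; observed[i] is True if that
    time is an observed lifetime and False if it is a censoring time.
    """
    total = 0.0
    for t, is_observed in zip(times, observed):
        if is_observed:
            total += log_pdf(t)        # density of an observed lifetime
        else:
            total += log_survival(t)   # log P(T > t) for a censored lifetime
    return total

# Hypothetical illustration: exponential lifetimes with an assumed rate of 0.1.
rate = 0.1
exp_log_pdf = lambda t: math.log(rate) - rate * t
exp_log_survival = lambda t: -rate * t

times = [5.0, 12.0, 24.0]         # made-up recorded times in hours
observed = [True, True, False]    # the third lifetime is censored at 24
loglik = censored_log_likelihood(times, observed, exp_log_pdf, exp_log_survival)
```

Passing the log-density and log-survival functions as arguments keeps the sketch independent of any particular lifetime distribution.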
Suppose the battery lifetimes are independent and identically exponentially distributed with pdf:
$$f(t \mid \lambda) = \lambda e^{-\lambda t}, \quad t > 0.$$
Of the $n$ batteries, lifetime measurements for $m$ batteries were observed, $t_1, t_2, \ldots, t_m$, with the remaining $n - m$ lifetimes censored at $c$, a known constant.
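To see how such a sample arises, the experiment can be simulated. The sketch below draws exponential lifetimes and records the censoring time whenever a battery outlasts it. The function name `simulate_censored_lifetimes`, the rate of 0.05 per hour and the seed are assumptions chosen purely for illustration.

```python
import random

def simulate_censored_lifetimes(n, rate, c, seed=0):
    """Simulate n exponential lifetimes, right censored at time c.

    Returns the recorded times and a parallel list of flags that are
    True where the lifetime was observed and False where it was censored.
    """
    rng = random.Random(seed)
    recorded, observed = [], []
    for _ in range(n):
        lifetime = rng.expovariate(rate)   # true (possibly unseen) lifetime
        if lifetime <= c:
            recorded.append(lifetime)      # battery ran out before time c
            observed.append(True)
        else:
            recorded.append(c)             # only known to exceed c
            observed.append(False)
    return recorded, observed

# Hypothetical settings: 30 batteries, rate 0.05 per hour, 24-hour limit.
recorded, observed = simulate_censored_lifetimes(n=30, rate=0.05, c=24.0)
m = sum(observed)               # number of observed lifetimes
n_censored = len(observed) - m  # number censored at c
```

In the simulation the true lifetime of a censored battery is discarded, which mirrors the information actually available to the experimenter.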
Exercise 8.62
Write down the likelihood for the rate parameter $\lambda$.
Exercise 8.63
Calculate the maximum likelihood estimate of the rate parameter $\lambda$.
Table 8.1: Recorded battery lifetimes in hours.
24 | 16 | 22 | 3 | 24
24 | 17 | 3 | 24 | 3
23 | 20 | 13 | 23 | 3
21 | 13 | 10 | 24 | 6
6 | 24 | 24 | 12 | 9
1 | 23 | 8 | 6 | 5
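The data from the experiment are presented in Table 8.1, with Figure 8.2 depicting the lifetimes of the $n = 30$ batteries. Of these, $m = 23$ lifetimes were observed (the entries below 24), with a total time of $\sum_{i=1}^{m} t_i = 266$ hours. The lifetimes for the remaining $n - m = 7$ batteries are censored at $c = 24$ hours. The maximum likelihood estimate for the rate parameter is:
$$\hat{\lambda} = \frac{m}{\sum_{i=1}^{m} t_i + (n - m)c} = \frac{23}{266 + 7 \times 24} = \frac{23}{434} \approx 0.053 \text{ per hour}.$$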
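The estimate can be checked numerically from the values in Table 8.1. The sketch below assumes that every entry equal to 24 is a censored lifetime and every smaller entry is an observed one. It computes the closed-form estimate for the exponential model and confirms it against a coarse grid search over the log-likelihood. The variable names and the grid are illustrative choices.

```python
import math

# Recorded lifetimes in hours, copied from Table 8.1.
table = [
    [24, 16, 22, 3, 24],
    [24, 17, 3, 24, 3],
    [23, 20, 13, 23, 3],
    [21, 13, 10, 24, 6],
    [6, 24, 24, 12, 9],
    [1, 23, 8, 6, 5],
]
c = 24  # censoring time in hours

lifetimes = [t for row in table for t in row]
# Assumption: an entry equal to c is censored; anything below c is observed.
observed_times = [t for t in lifetimes if t < c]

n = len(lifetimes)                       # total number of batteries
m = len(observed_times)                  # number of observed lifetimes
total_observed = sum(observed_times)     # sum of the observed lifetimes
total_time_on_test = total_observed + (n - m) * c

# Closed-form maximum likelihood estimate of the exponential rate.
rate_hat = m / total_time_on_test

def log_likelihood(rate):
    """Exponential log-likelihood with right censoring at c."""
    return m * math.log(rate) - rate * total_time_on_test

# Coarse grid search as an independent check on the closed form.
grid = [k / 10000 for k in range(1, 2001)]
rate_grid = max(grid, key=log_likelihood)

print(n, m, total_observed, round(rate_hat, 4))
```

The quantity `total_time_on_test` counts each censored battery as contributing its full 24 hours, which is why censored observations still carry information about the rate.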
Exercise 8.64
Calculate an approximate 95% confidence interval for the rate parameter $\lambda$.