An estimator of $\theta$ is a function of the random variables $X_1, \ldots, X_n$. An estimator is itself a random variable.
An estimate of $\theta$ is a function of the observed sample $x_1, \ldots, x_n$. An estimate is a number, and is a realisation of the estimator.
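The distinction can be seen in a small simulation. The sketch below assumes, purely for illustration, a Normal population with mean 5 and standard deviation 2 and a sample size of 10; none of these values come from the notes, and the names `estimator` and `draw_sample` are hypothetical. Each new sample gives a different estimate, which is what it means for the estimator to be a random variable.

```r
set.seed(1)  # make the illustration reproducible

# Hypothetical estimator: a rule (function) that is applied to a sample
estimator <- function(obs) mean(obs)

# Assumed population, for illustration only: Normal with mean 5 and sd 2
draw_sample <- function(n = 10) rnorm(n, mean = 5, sd = 2)

# One observed sample gives one estimate, which is a single number
x_obs <- draw_sample()
estimate <- estimator(x_obs)

# Repeating the experiment shows that the estimator varies between samples
estimates <- replicate(1000, estimator(draw_sample()))

print(estimate)
summary(estimates)
```

A histogram of `estimates` would approximate the sampling distribution of the estimator under these assumptions.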
One basic model is to assume that the sample $x_1, \ldots, x_n$ is a realisation of an IID sequence of random variables $X_1, \ldots, X_n$, which have a Normal distribution, $N(\mu, \sigma^2)$. In this case, the parameter vector, $\theta = (\mu, \sigma^2)$, represents the unknown population mean and population variance.
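Sensible estimators for the population mean $\mu$ and population variance $\sigma^2$ are the sample mean and sample variance (justification for this will be given in the maximum likelihood part of the course),
$$\hat{\mu} = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i,$$
$$\hat{\sigma}^2 = S^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2.$$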
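As a rough sketch, the two estimators can be coded directly from their definitions. The function names `sample_mean` and `sample_variance` are hypothetical, the values in `x` are made up for illustration, and the $n-1$ divisor in the variance is an assumption chosen to match R's built-in `var`.

```r
# Sample mean: the sum of the observations divided by the sample size
sample_mean <- function(x) sum(x) / length(x)

# Sample variance, assuming the n - 1 divisor (as used by R's var())
sample_variance <- function(x) {
  n <- length(x)
  sum((x - sample_mean(x))^2) / (n - 1)
}

# Made-up observations, for illustration only
x <- c(4, 0, 3, 7, 6)

sample_mean(x)      # 4
sample_variance(x)  # 7.5
```

Applying these functions to observed data turns the estimators into estimates.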
The following example shows how to obtain estimates from these estimators.
A study by Vallbona et al. [Vallbona, C., Hazlewood, C.F. and Jurida, G. (1997) Response of pain to static magnetic fields in postpolio patients: a double-blind pilot study] investigated the possibility that magnetic fields can reduce pain in human tissue. They conducted a double-blind experiment on 50 patients with postpolio syndrome (causing muscular pain). Patients were given either an active or a placebo magnetic device, to be worn for 45 minutes. Each patient rated their pain before and after treatment. Pain scores before and after treatment, for both groups, can be found in the file magnetExperiment.Rdata.
First we calculate the sample mean and variance of the post-treatment scores for the group who received an active treatment. Begin by loading the data into R and extracting the relevant scores.
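A minimal sketch of this step follows. It assumes that magnetExperiment.Rdata sits in the working directory and that loading it creates a data frame called `magnet` with columns `Active` and `Score_post`. The vector name `active_post` is hypothetical, and the small stand-in data frame, used only when the file is absent so that the code still runs, contains made-up values that are not the study data.

```r
if (file.exists("magnetExperiment.Rdata")) {
  # Assumed to create a data frame called `magnet`
  load("magnetExperiment.Rdata")
} else {
  # Stand-in with made-up values, NOT the study data
  magnet <- data.frame(Active     = c(1, 1, 2, 2, 1),
                       Score_post = c(4, 0, 9, 8, 3))
}

# Keep only the post-treatment scores of patients with Active equal to 1
active_post <- magnet$Score_post[magnet$Active == 1]
active_post
```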
The column Active takes the value 1 (active treatment) or 2 (placebo). The condition given in the square brackets is used to extract only those values of the variable magnet$Score_post which correspond to individuals who received active treatment.
The observed post-treatment pain scores for the group with active treatment are the values $x_1, \ldots, x_n$ extracted above. Using the sample mean and sample variance, we obtain the following parameter estimates for the population mean and population variance,
$$\hat{\mu} = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,$$
$$\hat{\sigma}^2 = s^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2.$$
Alternatively, in R, the sample mean, variance and standard deviation are calculated as follows:
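A sketch of these calls, using the built-in functions `mean`, `var` and `sd`. The vector name `active_post` is hypothetical, and the values below are a made-up stand-in so that the block runs on its own; in practice it would hold the scores extracted from `magnet`.

```r
# Made-up stand-in for the extracted post-treatment scores
active_post <- c(2, 5, 1, 8, 4)

mean(active_post)  # sample mean
var(active_post)   # sample variance (n - 1 divisor)
sd(active_post)    # sample standard deviation, the square root of var()
```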
It is one thing to fit a model, but in order to trust any inferences that we might make from that model, we should ensure that it describes the observed data well. A range of tools called diagnostics are available for this purpose. Which diagnostic(s) we choose will depend on the model fitted. Suppose that our model assumes that the data are an IID sample from a probability distribution $F$, with parameters $\theta$ which we have estimated by $\hat{\theta}$. In Math104, you used QQ plots to compare quantiles of the standardized data against quantiles of the standard Normal distribution, $N(0, 1)$, to assess fit to a Normal distribution.
We will look at a range of diagnostic tools in detail in Chapter 11.
You can produce QQ plots in R using the qqnorm function. Use this function to assess the normality of the post-treatment scores above. Hint: you need to standardize the data values first. How would you do this in R? What does a good fit look like?
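One possible approach, offered as a sketch rather than a model answer: subtract the sample mean and divide by the sample standard deviation, then pass the result to `qqnorm`. The vector `scores` below is a randomly generated stand-in (made-up values, not the study data), and the name `z` is hypothetical.

```r
set.seed(2)

# Made-up stand-in for the post-treatment scores; replace with the real vector
scores <- round(runif(30, min = 0, max = 10))

# Standardize: centre on the sample mean, scale by the sample standard deviation
z <- (scores - mean(scores)) / sd(scores)

# QQ plot of the standardized values against standard Normal quantiles
qqnorm(z)
abline(0, 1)  # reference line y = x
```

For a good fit, the points lie close to the reference line, with no systematic curvature in the middle or at the tails.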