The Central Limit Theorem is one of the most important results in probability theory and statistics and is the reason the Normal distribution plays such a prominent role. It asserts that the sum (or the mean) of many independent identically distributed random variables is approximately Normally distributed. Remarkably, this is true whatever the common distribution of the random variables, as long as it has finite expectation and variance.
The Central Limit Theorem. Suppose $X_1, X_2, \dots$ is a sequence of iid random variables with expectation $\mu$ and finite variance $\sigma^2$. Then for any number $x$,
$$\lim_{n \to \infty} P\!\left(\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \le x\right) = \Phi(x),$$
where $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$ and $\Phi(x)$ is the cumulative distribution function for the standard Normal distribution, $N(0,1)$, evaluated at $x$.
Whereas the WLLN only tells us that $\bar{X}_n$ converges to $\mu$, the CLT gives us the stronger information that the deviations of $\bar{X}_n$ from $\mu$, scaled by $\sigma/\sqrt{n}$, follow a $N(0,1)$ distribution in the limit. The practical use of this is that for reasonably large $n$ we can assume that
$$\bar{X}_n \sim N\!\left(\mu, \frac{\sigma^2}{n}\right),$$
approximately.
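This approximation is easy to check by simulation. Below is a minimal Python sketch (standard library only); the Exponential(1) distribution, the sample size $n = 50$, and the number of replications are illustrative choices, not from the notes:

```python
import math
import random

random.seed(1)

# X_i ~ Exponential(1): mu = 1, sigma = 1, a clearly non-Normal distribution.
mu, sigma, n = 1.0, 1.0, 50

def phi(x):
    """Standard Normal CDF, computed via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Estimate P(Xbar_n <= mu + sigma/sqrt(n)) by simulation, then compare
# with the CLT approximation Phi(1).
reps = 20000
threshold = mu + sigma / math.sqrt(n)
hits = 0
for _ in range(reps):
    xbar = sum(random.expovariate(1.0) for _ in range(n)) / n
    if xbar <= threshold:
        hits += 1

print(round(hits / reps, 3))  # simulated probability, close to Phi(1)
print(round(phi(1.0), 3))     # CLT approximation: Phi(1) ≈ 0.841
```

Even for a heavily skewed distribution such as the exponential, the simulated probability lands close to the Normal approximation at this sample size.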
A large company claims to pay an average wage of $\mu$ pounds an hour with a standard deviation of $\sigma$ pounds. A sample of $n$ workers was found to have an average wage of $\bar{x}$ pounds. Find the probability of observing a sample mean as low as this, or lower, by random chance alone if the company's claim is true.
Solution. Let $X_1, \dots, X_n$ be the wages in pounds of the $n$ workers. If the company's claim is true these should have expectation $\mu$ and standard deviation $\sigma$. By the CLT the average satisfies
$$\bar{X}_n \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$$
approximately. The probability of getting a value of $\bar{x}$ or lower in this Normal distribution is
$$P(\bar{X}_n \le \bar{x}) = \Phi\!\left(\frac{\bar{x} - \mu}{\sigma/\sqrt{n}}\right) = \Phi(-1.6),$$
which is pnorm(-1.6) $\approx 0.055$. There is only around a 5.5% chance of observing such a low average wage for randomly selected workers.
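The value of pnorm(-1.6) can be reproduced without R; a short Python equivalent of pnorm, built on the standard library's error function, is:

```python
import math

def pnorm(x):
    """CDF of the standard Normal distribution, the analogue of R's pnorm."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

print(round(pnorm(-1.6), 4))  # 0.0548
```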
How large an iid sample should be taken from a normal distribution in order for the probability to be at least 0.99 that the sample mean will be within one standard deviation of the expectation of the distribution? (cf. Example 9.2.1)
Solution. Since the sample is from a normal distribution, $\bar{X}_n \sim N(\mu, \sigma^2/n)$ exactly, so by symmetry
$$P(|\bar{X}_n - \mu| \le \sigma) = P\!\left(|Z| \le \sqrt{n}\right) = 2\Phi(\sqrt{n}) - 1,$$
which is at least $0.99$ if and only if $\Phi(\sqrt{n}) \ge 0.995$. Now qnorm(0.995) $\approx 2.576$, so we need $\sqrt{n} \ge 2.576$, i.e. $n \ge 2.576^2 \approx 6.64$. So $n = 7$ is sufficient.
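R's qnorm can likewise be imitated in Python; below is a simple bisection sketch (the search interval and iteration count are arbitrary choices), together with the resulting sample size:

```python
import math

def pnorm(x):
    """Standard Normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def qnorm(p, lo=-10.0, hi=10.0):
    """Standard Normal quantile by bisection (analogue of R's qnorm)."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if pnorm(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

z = qnorm(0.995)
print(round(z, 3))        # 2.576
print(math.ceil(z ** 2))  # smallest n with sqrt(n) >= z: n = 7
```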
How large does $n$ have to be for the normal approximation to be valid? This depends on how close the original distribution of the $X_i$'s is to normal in the first place: the closer it is, the quicker the approximation becomes accurate. Almost always $n \ge 30$ will be enough to justify the approximation, and sometimes a much smaller $n$ will do.
(Exam 2016.) A clumsy robot has been programmed to use a litre bucket to fill a litre tub with water. It fills the bucket at a tap, carries it to the tub, and then empties it into the tub. During each trip from the tap to the tub it spills litres of water from the bucket.
Write down the exact probability that the tub is full to the brim after the robot has made trips.
Solution. 0 (since the robot would have to spill no water on any trip, and even spilling no water on one trip has a probability of 0).
Let $T_n$ be the total amount of water in the tub after $n$ trips. Find $E(T_n)$ and $\mathrm{Var}(T_n)$ and hence write down an approximate distribution for $T_n$.
Solution. Writing $b$ for the bucket's volume in litres and $U_i$ for the amount spilled on trip $i$, we have $T_n = \sum_{i=1}^n (b - U_i)$, so $E(T_n) = n\left(b - E(U_1)\right)$ and, by independence, $\mathrm{Var}(T_n) = n\,\mathrm{Var}(U_1)$.
So by the CLT, $T_n \sim N\!\left(n(b - E(U_1)),\, n\,\mathrm{Var}(U_1)\right)$, approximately.
Use the approximation in (b) to estimate the probability that the tub is full after trips. Write your answer in terms of , the cdf of the standard normal distribution.
Solution. Writing $V$ for the tub's capacity in litres, the tub is full when $T_n \ge V$, so, standardising the approximate Normal distribution from (b),
$$P(T_n \ge V) \approx 1 - \Phi\!\left(\frac{V - E(T_n)}{\sqrt{\mathrm{Var}(T_n)}}\right).$$
Use the following approximate values to comment on the accuracy of the approximation that you used in (c):
$x$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
---|---|---|---|---|---|---|---|---
$\Phi(-x)$ | 0.159 | 0.023 | 0.0013 | $3.2\times 10^{-5}$ | $2.9\times 10^{-7}$ | $9.9\times 10^{-10}$ | $1.3\times 10^{-12}$ | $6.2\times 10^{-16}$
Solution. The approximation gives a probability very close to the truth, zero, so when the individual distributions are uniform the CLT seems to be pretty accurate even for quite small $n$.
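The figures in this exam question did not survive, but the pattern of the calculation — a Normal approximation to a sum of iid uniforms — can be sketched in Python with purely hypothetical numbers: a 1-litre bucket, a 10-litre tub, spills $U_i \sim U(0, 0.2)$ and $n = 10$ trips are all invented for illustration.

```python
import math

# Hypothetical numbers (the original exam figures are not in the notes):
# 1-litre bucket, 10-litre tub, spill U_i ~ Uniform(0, 0.2), n = 10 trips.
bucket, tub, spill_max, n = 1.0, 10.0, 0.2, 10

# Water delivered after n trips: T_n = n*bucket - sum(U_i).
mean_T = n * (bucket - spill_max / 2.0)  # E(T_n)
var_T = n * spill_max ** 2 / 12.0        # Var(T_n); Var(U(0, b)) = b^2 / 12

# CLT: T_n is approximately N(mean_T, var_T); P(tub full) = P(T_n >= tub).
z = (tub - mean_T) / math.sqrt(var_T)
p_full = 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))  # 1 - Phi(z)
print(round(z, 3))  # standardised value, about 5.477
print(p_full)       # tiny, consistent with the exact answer of 0
```

With these invented numbers the approximation gives a probability of order $10^{-8}$, echoing part (d): the exact probability is zero and the CLT answer is very close to it.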
The proof of the CLT is not examinable, but we provide a sketch below. For completeness, a formal proof (subject to conditions on the existence of moment generating functions) appears in Appendix C. The key simplification below is that we ignore all of the remainder terms from the (two) Taylor expansions; we also ignore the possible non-existence of $M_X(t)$ for some $t$, and we assume that the random variables in the sequence have all been standardised: $E(X_i) = 0$ and $\mathrm{Var}(X_i) = 1$.
We will prove the CLT in terms of $S_n = X_1 + \dots + X_n$, i.e. that
$$P\!\left(\frac{S_n}{\sqrt{n}} \le x\right) \to \Phi(x)$$
for all $x$. Part 1 of Theorem 6.4.1 (the MGF theorem) says that the distribution (CDF) of a random variable, $X$, is uniquely determined by its moment generating function (MGF) $M_X(t) = E(e^{tX})$. That is, if two random variables have the same MGF then they have the same CDF.
Let $Z_n = S_n/\sqrt{n}$; then, since the $X_i$ are independent and identically distributed,
$$M_{Z_n}(t) = E\!\left(e^{tS_n/\sqrt{n}}\right) = \prod_{i=1}^n E\!\left(e^{tX_i/\sqrt{n}}\right) = \left[M_X\!\left(\frac{t}{\sqrt{n}}\right)\right]^n.$$
Hence, taking logs,
$$\log M_{Z_n}(t) = n \log M_X\!\left(\frac{t}{\sqrt{n}}\right). \tag{9.2}$$
Since $X$ has been standardised,
$$M_X(0) = E(e^0) = 1,$$
$$M_X'(0) = E(X) = 0,$$
$$M_X''(0) = E(X^2) = \mathrm{Var}(X) = 1.$$
Hence, by Taylor expansion about $0$,
$$M_X(s) \approx M_X(0) + s M_X'(0) + \frac{s^2}{2} M_X''(0) = 1 + \frac{s^2}{2}.$$
But $\log(1 + u) \approx u$ for small $u$, so, setting $s = t/\sqrt{n}$,
$$\log M_X\!\left(\frac{t}{\sqrt{n}}\right) \approx \frac{t^2}{2n}.$$
Thus, using (9.2), $\log M_{Z_n}(t)$ is approximately $n \times \frac{t^2}{2n} = \frac{t^2}{2}$.
As $n \to \infty$ the approximations become exact, as detailed in the appendix. Thus $M_{Z_n}(t) \to e^{t^2/2}$, the mgf of a $N(0,1)$ random variable, as required. ∎
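The limit can also be checked numerically. For a standardised Uniform$(-\sqrt{3}, \sqrt{3})$ variable (mean $0$, variance $1$) the MGF is $M_X(t) = \sinh(\sqrt{3}\,t)/(\sqrt{3}\,t)$, and $[M_X(t/\sqrt{n})]^n$ should approach $e^{t^2/2}$ as $n$ grows; a small Python check (the choices of $t$ and $n$ are arbitrary):

```python
import math

a = math.sqrt(3.0)  # Uniform(-a, a) has mean 0 and variance a^2 / 3 = 1

def mgf_uniform(t):
    """MGF of the standardised uniform: sinh(a t) / (a t)."""
    return 1.0 if t == 0 else math.sinh(a * t) / (a * t)

t = 1.0
target = math.exp(t * t / 2.0)  # e^{t^2/2}, the N(0,1) mgf at t
for n in (1, 10, 100, 10000):
    approx = mgf_uniform(t / math.sqrt(n)) ** n
    print(n, round(approx, 5), round(target, 5))
```

The printed values converge towards $e^{1/2} \approx 1.6487$, illustrating the convergence of MGFs that drives the proof.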