5.4 Binomial random variables

Consider an experiment in which $n$ independent Bernoulli trials are carried out, each with success probability $\theta$. Let $R$ be the random variable reporting the number of successes in these $n$ trials. The induced sample space is $\{0,1,\dots,n\}$. The random variable $R$ is termed a Binomial random variable with parameters $n$ and $\theta$, written $R \sim \text{Bin}(n,\theta)$.

Examples include:

  • the number of heads in n tosses of a biased coin,

  • the number of patients with cancer in the next n examined,

  • the number of 6ft tall smokers in a tutorial of size n.
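
The definition of R as a count of successes can be sketched directly in R by simulating the individual Bernoulli trials. This is a minimal illustration only: the values of `n` and `theta`, the seed, and the variable names are assumptions chosen for the example and do not come from the notes.

```r
set.seed(42)  # arbitrary seed, for reproducibility only
n <- 10       # number of trials (illustrative)
theta <- 0.3  # success probability (illustrative)

# n independent Bernoulli trials: 1 = success, 0 = failure
trials <- rbinom(n, size = 1, prob = theta)

# R is the number of successes, a value in {0, 1, ..., n}
r <- sum(trials)
r

# A single draw with the same distribution, Bin(n, theta)
rbinom(1, size = n, prob = theta)
```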

The derivation is a little more complex here so first consider the n=3 case with S and F denoting success and failure respectively. The sample space for the experiment is

Ω={SSS,SSF,SFS,FSS,SFF,FSF,FFS,FFF}.

The random variable of interest, R, is the number of successes.

Exercise 5.4.

Find P(R=r) for r=0,1,2,3.

Solution.

Previously, when θ=0.5, we used equi-probable outcomes to derive the pmf. This is not possible with an arbitrary θ. Instead we need to use independence to calculate the probabilities of the sample points. This results in the following calculations:

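$$
\begin{aligned}
p_R(0) &= P(\{FFF\}) \\
       &= (1-\theta)(1-\theta)(1-\theta) \\
       &= (1-\theta)^3, \\
p_R(1) &= P(\{SFF\}) + P(\{FSF\}) + P(\{FFS\}) \\
       &= 3\theta(1-\theta)^2, \\
p_R(2) &= P(\{SSF\}) + P(\{SFS\}) + P(\{FSS\}) \\
       &= 3\theta^2(1-\theta), \\
p_R(3) &= P(\{SSS\}) \\
       &= \theta^3,
\end{aligned}
$$

with $p_R(r)=0$ for other values of $r$.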

A general formula which summarises these results is

$$p_R(r) = \binom{3}{r}\theta^r(1-\theta)^{3-r}$$

for r=0,1,2,3.
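
The n=3 derivation can be checked numerically by enumerating the eight sample points and adding up their probabilities. The following is a sketch under stated assumptions: `theta = 0.3` is an arbitrary illustrative value, and the names `outcomes`, `point_prob`, `pmf_enum` and `pmf_formula` are introduced here only for the example.

```r
theta <- 0.3  # illustrative value; any 0 < theta < 1 works

# All 2^3 sample points, coded 1 = success (S), 0 = failure (F)
outcomes <- expand.grid(t1 = 0:1, t2 = 0:1, t3 = 0:1)

# Number of successes R for each sample point
successes <- rowSums(outcomes)

# Probability of each sample point, by independence of the trials
point_prob <- theta^successes * (1 - theta)^(3 - successes)

# pmf of R: add the probabilities of the sample points with R = r
pmf_enum <- sapply(0:3, function(r) sum(point_prob[successes == r]))

# General formula for n = 3
r <- 0:3
pmf_formula <- choose(3, r) * theta^r * (1 - theta)^(3 - r)

all.equal(pmf_enum, pmf_formula)
```

The enumeration mirrors the hand calculation: each sample point gets a probability from independence, and the pmf collects the points with the same number of successes.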

Example 5.5.

Show that

$$\sum_{r=0}^{3}\binom{3}{r}\theta^r(1-\theta)^{3-r} = 1.$$

Solution.

The binomial theorem states that

$$(a+b)^3 = \sum_{r=0}^{3}\binom{3}{r}a^r b^{3-r}.$$

Putting a=θ and b=1-θ gives the result.

The more general form for the pmf is as follows:

Lemma 5.6 (pmf of a Binomial random variable).

The pmf of a Binomial random variable $R \sim \text{Bin}(n,\theta)$ is

$$p_R(r) = \binom{n}{r}\theta^r(1-\theta)^{n-r}$$

for $r=0,1,2,\dots,n$, with $p_R(r)=0$ otherwise, where $0<\theta<1$.

Proof.

  i. For any sample point with $r$ S's and $n-r$ F's, the probability of the event consisting solely of that sample point is $\theta^r(1-\theta)^{n-r}$ by independence.

  ii. There are $\binom{n}{r}$ sample points with $r$ successes and $n-r$ failures (choose $r$ of the $n$ trials to be S, with the others F).

  iii. Hence $P(R=r) = \sum_{\omega:\,R(\omega)=r} P(\{\omega\}) = \sum_{\omega:\,R(\omega)=r} \theta^r(1-\theta)^{n-r} = \binom{n}{r}\theta^r(1-\theta)^{n-r}$.
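
The formula in Lemma 5.6 can be written as a short R function and compared with the built-in `dbinom`. This is a sketch only: `binom_pmf` is a hypothetical helper introduced for illustration (it is not part of base R), and the values of `n` and `theta` are arbitrary.

```r
# Hypothetical helper (not part of base R): Binomial pmf as in Lemma 5.6
binom_pmf <- function(r, n, theta) {
  # choose(n, r) * theta^r * (1 - theta)^(n - r) on 0, 1, ..., n; zero otherwise
  ifelse(r %in% 0:n, choose(n, r) * theta^r * (1 - theta)^(n - r), 0)
}

n <- 10        # illustrative number of trials
theta <- 0.25  # illustrative success probability

# Agreement with the built-in pmf on the whole range 0, ..., n
all.equal(binom_pmf(0:n, n, theta), dbinom(0:n, size = n, prob = theta))

# A value outside the range has probability 0
binom_pmf(11, n, theta)
```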


Exercise 5.7.

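Show that $\sum_{r=0}^{n} p_R(r) = 1$.

Solution.

Argue as in Example 5.5, with $n$ in place of 3: by the binomial theorem with $a=\theta$ and $b=1-\theta$,

$$\sum_{r=0}^{n}\binom{n}{r}\theta^r(1-\theta)^{n-r} = (\theta + 1 - \theta)^n = 1.$$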
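
As a numerical sanity check of this result (not a proof), the pmf can be summed over its range for a few parameter choices. The helper name `total_prob` and the particular (n, θ) pairs are assumptions made for illustration.

```r
# Sum the Binomial pmf over r = 0, ..., n
total_prob <- function(n, theta) sum(dbinom(0:n, size = n, prob = theta))

# Each total should equal 1 up to floating-point rounding
total_prob(3, 0.5)
total_prob(10, 0.4)
total_prob(50, 0.9)
```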

The software package R can evaluate pmfs from standard probability models, including the Binomial:

Example 5.8.

Let $R \sim \text{Bin}(3,0.5)$. Use R to evaluate and plot the pmf of $R$. Repeat with $\theta=0.4$.

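# pmf at r = 0, 1, 2, 3 for theta = 0.5 and theta = 0.4
dbinom(0:3, size = 3, prob = 0.5)
dbinom(0:3, size = 3, prob = 0.4)  # note how the probabilities change

# Bar plot of the pmf for theta = 0.5
p <- dbinom(0:3, size = 3, prob = 0.5)
barplot(p, names.arg = 0:3)

# Repeat the plot with theta = 0.4
barplot(dbinom(0:3, size = 3, prob = 0.4), names.arg = 0:3)

Exercise 5.9.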

Find the probability of rolling a fair die and finding

  i. 2 sixes in 4 rolls,

  ii. 2 sixes in 5 rolls,

  iii. at least 2 sixes in 4 rolls.

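Solution.

  i. 2 sixes in 4 rolls: model as $R \sim \text{Bin}(4,1/6)$.

    $$P(R=2) = \binom{4}{2}\left(\tfrac{1}{6}\right)^2\left(\tfrac{5}{6}\right)^2 = 25/216 \approx 0.116.$$

  ii. 2 sixes in 5 rolls: model as $R \sim \text{Bin}(5,1/6)$.

    $$P(R=2) = \binom{5}{2}\left(\tfrac{1}{6}\right)^2\left(\tfrac{5}{6}\right)^3 \approx 0.161.$$

  iii. at least 2 sixes in 4 rolls: model as $R \sim \text{Bin}(4,1/6)$.

    $$
    \begin{aligned}
    P(R \ge 2) &= 1 - P(R<2) \\
               &= 1 - P(R=0) - P(R=1) \\
               &= 1 - \binom{4}{0}\left(\tfrac{1}{6}\right)^0\left(\tfrac{5}{6}\right)^4 - \binom{4}{1}\left(\tfrac{1}{6}\right)^1\left(\tfrac{5}{6}\right)^3 \\
               &= 19/144 \approx 0.132.
    \end{aligned}
    $$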

We could calculate these in R using the following commands:

dbinom(2,size=4,prob=1/6)
dbinom(2,size=5,prob=1/6)
1-dbinom(0,size=4,prob=1/6)-dbinom(1,size=4,prob=1/6)
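
Part iii can also be computed from the cumulative distribution function. As a sketch: `pbinom(q, ...)` returns P(R ≤ q), and with `lower.tail = FALSE` it returns P(R > q), so P(R ≥ 2) is the upper tail beyond 1. The variable names `p_upper` and `p_sum` are introduced here for illustration.

```r
# P(R >= 2) = P(R > 1) for R ~ Bin(4, 1/6), via the upper tail of the cdf
p_upper <- pbinom(1, size = 4, prob = 1/6, lower.tail = FALSE)

# Same quantity by summing the pmf over r = 2, 3, 4
p_sum <- sum(dbinom(2:4, size = 4, prob = 1/6))

# Both agree with the exact answer 19/144
c(p_upper, p_sum, 19/144)
```

The upper-tail form avoids writing out each excluded term, which becomes convenient when the number of trials is large.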

Exercise 5.10.

There are two families, each with three children. If each child is equally likely to be a boy or a girl, and the genders of the children are independent, find the probability that the two families have the same number of girls.

R hint: sum( dbinom(0:3, size=3, prob=1/2)^2 )

Solution.

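Let $R$ be the number of girls in a family. Because the genders of the children are independent and each child is a girl with the same probability, we model $R \sim \text{Bin}(3,0.5)$. For the two families, $R_1 \sim \text{Bin}(3,0.5)$ and $R_2 \sim \text{Bin}(3,0.5)$, with $R_1$ and $R_2$ independent of each other.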

Now

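$$
\begin{aligned}
P(R_1=R_2) &= \sum_{r=0}^{3} P(R_1=r,\,R_2=r) && \text{(additivity axiom)} \\
           &= \sum_{r=0}^{3} P(R_1=r)\,P(R_2=r) && \text{(independence)} \\
           &= \sum_{r=0}^{3}\left[\binom{3}{r}\left(\tfrac{1}{2}\right)^r\left(\tfrac{1}{2}\right)^{3-r}\right]^2 \\
           &= \left(\tfrac{1}{2}\right)^6\sum_{r=0}^{3}\binom{3}{r}^2 \\
           &= \tfrac{1}{64}\left[1^2+3^2+3^2+1^2\right] \\
           &= 20/64 = 0.3125.
\end{aligned}
$$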
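
The exact answer can be compared with a Monte Carlo estimate. The sketch below simulates many pairs of families; the seed, the number of simulations `n_sim`, and the variable names are assumptions chosen for illustration, and the estimate will differ slightly from run to run if the seed is changed.

```r
set.seed(1)      # arbitrary seed, for reproducibility only
n_sim <- 100000  # number of simulated pairs of families (illustrative)

# Number of girls in each family, simulated independently
r1 <- rbinom(n_sim, size = 3, prob = 1/2)
r2 <- rbinom(n_sim, size = 3, prob = 1/2)

# Proportion of simulated pairs with the same number of girls
est <- mean(r1 == r2)

# Exact value from the R hint: 20/64 = 0.3125
exact <- sum(dbinom(0:3, size = 3, prob = 1/2)^2)

c(estimate = est, exact = exact)
```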