Probability
Introduction
In general we use probability theory as the basis of statistical inference - the drawing of
conclusions about a population from a sample. The definition of probability can be
problematic, but we ignore that here.
Definitions
P(A) = probability that the event A will occur.
Probabilities must obey certain laws:
1) certainty: P(A) = 1. Outcome A is certain to occur.
2) certainty not to occur: P(A) = 0; P(not A) = 1.
3) in all other cases 0 < P(A) < 1, so that 0 ≤ P(A) ≤ 1 always.
4) Sum of probabilities of all outcomes in sample space = 1, e.g. P(heads) + P(tails) = 1.
Obviously also, P(not heads) = 1 - P(heads). These are complements.
Probability is the basis of the study of random variables. A random variable is (intuitively)
a variable whose value cannot be predicted with accuracy, although we may know the range
of values (e.g. scores on a die) and perhaps the probability with which each may occur. These
values and their associated probabilities are summarised in a probability distribution or
alternatively a probability density function (pdf).
Random variables may be discrete (e.g. the score on a die) or continuous (e.g. the arrival
time of a train).
Example I (discrete random variable): the Binomial distribution
A coin is tossed twice; what are the probabilities of 0, 1, 2 heads? The number of heads is a
random variable, h. Hence we wish to find its probability distribution, summarising the
probabilities of all possible outcomes.
A tree diagram allows us to enumerate the possibilities and calculate the probabilities.
First toss   Second toss   Outcome   Probability
H            H             HH        P = 1/4
H            T             HT        P = 1/4  }
                                              }  P = 1/2
T            H             TH        P = 1/4  }
T            T             TT        P = 1/4
Hence we obtain:
P(h = 0) = 0.25
P(h = 1) = 0.5
P(h = 2) = 0.25.
This summarises the pdf for h. It could also be summarised in a graph:
[Figure: bar chart of P(h) against h, with bars of height 0.25, 0.5 and 0.25 at h = 0, h = 1 and h = 2; vertical axis from 0 to 0.6.]
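The same distribution can be checked by brute-force enumeration rather than a tree diagram. The following Python sketch is not part of the original notes; the function name heads_distribution is an illustrative choice, and a fair coin is assumed so that every sequence of tosses is equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_tosses):
    """Return {h: P(h)} for the number of heads h in n_tosses fair coin tosses."""
    # Assumption: a fair coin, so every H/T sequence is equally likely.
    sequences = list(product("HT", repeat=n_tosses))
    # Count how many sequences contain each possible number of heads.
    counts = Counter(seq.count("H") for seq in sequences)
    total = len(sequences)
    return {h: Fraction(c, total) for h, c in sorted(counts.items())}


dist = heads_distribution(2)
for h, p in dist.items():
    print(f"P(h = {h}) = {float(p)}")
```

For two tosses this lists the four outcomes HH, HT, TH and TT, so the probabilities necessarily sum to 1.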
Note that we have used multiplication and addition to calculate these probabilities:
The multiplication rule
P(h = 2) = P(H and H) = P(H) × P(H) = ½ × ½ = ¼
Generally, P(A and B) = P(A) × P(B).
This formula assumes independence, i.e. that P(B) is the same whether or not A has
occurred. Tosses of a coin are obviously independent. In this case we may write:
P(B|A) = P(B|not A) = P(B)
where P(B|A) means “the probability of B occurring given that A has occurred”.
Independence is not always the case. If drawing two cards from a pack, the probability that
the second is an Ace depends upon whether the first card drawn was an Ace or not. In
obvious notation:
P(A2|A1) = 3/51, P(A2|not A1) = 4/51.
P(B|A) is the conditional probability of event B, conditional on event A having occurred.
Hence in general:
P(A and B) = P(A) × P(B|A)
which simplifies to
P(A and B) = P(A) × P(B)
with independence.
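The card example can be verified by counting. The sketch below is an illustration rather than part of the original notes: it assumes a standard 52-card pack with four Aces, and the names pack, draws and conditional are hypothetical helpers introduced here.

```python
from fractions import Fraction
from itertools import permutations

# Assumption: a standard pack where only "Ace or not" matters (4 Aces, 48 others).
pack = ["Ace"] * 4 + ["Other"] * 48

# All ordered draws of two distinct cards: 52 * 51 equally likely outcomes.
draws = list(permutations(range(52), 2))


def conditional(event_b, event_a):
    """P(B|A) = (number of draws in both A and B) / (number of draws in A)."""
    in_a = [d for d in draws if event_a(d)]
    in_both = [d for d in in_a if event_b(d)]
    return Fraction(len(in_both), len(in_a))


def first_ace(d):
    return pack[d[0]] == "Ace"


def first_not_ace(d):
    return pack[d[0]] != "Ace"


def second_ace(d):
    return pack[d[1]] == "Ace"


p_a2_given_a1 = conditional(second_ace, first_ace)          # equals 3/51
p_a2_given_not_a1 = conditional(second_ace, first_not_ace)  # equals 4/51

# General multiplication rule: P(A1 and A2) = P(A1) * P(A2|A1).
p_a1 = Fraction(4, 52)
n_both = sum(1 for d in draws if first_ace(d) and second_ace(d))
p_both = Fraction(n_both, len(draws))
assert p_both == p_a1 * p_a2_given_a1
```

Because the two conditional probabilities differ, the draws are not independent, and the simplified rule P(A and B) = P(A) × P(B) would give the wrong answer here.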
The addition rule
P(h = 1) = P(HT or TH) = P(HT) + P(TH) = 0.5 × 0.5 + 0.5 × 0.5 = 0.5
To evaluate the probability of either of two (or more) events occurring, the individual
probabilities are added. Hence
P(A or B) = P(A) + P(B).
This formula assumes A and B are mutually exclusive events, i.e. that they cannot both
occur. Obviously TH and HT cannot both occur, so they satisfy the criterion.
As a counter-example, think of the probability of drawing a Queen or a Spade from a pack of
cards. Using the rule:
P(Q or S) = P(Q) + P(S) = 4/52 + 13/52 = 17/52.
But this is wrong. The ways of satisfying the objective are drawing one of the thirteen spades
(which includes the Queen of Spades) or one of the three other Queens. Hence the correct
answer is 16/52. In the formula above the Queen of Spades was double counted, once as a
Queen and again as a Spade. To correct the formula, we must subtract off the probability of a
Queen and a Spade, so it is only counted once. Hence:
P(Q or S) = P(Q) + P(S) – P(Q and S) = 4/52 + 13/52 – 1/52 = 16/52.
In general then,
P(A or B) = P(A) + P(B) – P(A and B)
which simplifies to
P(A or B) = P(A) + P(B)
if the events are mutually exclusive. With these rules, one can solve most problems of
probability.
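The Queen-or-Spade calculation can be confirmed by listing the pack. This is a minimal sketch, not taken from the original notes; the deck representation and the helper prob are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumption: a standard 52-card pack, 13 ranks in each of 4 suits.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["Spades", "Hearts", "Diamonds", "Clubs"]
deck = list(product(ranks, suits))


def prob(event):
    """Probability of an event when every card is equally likely to be drawn."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))


def is_queen(card):
    return card[0] == "Q"


def is_spade(card):
    return card[1] == "Spades"


p_q = prob(is_queen)
p_s = prob(is_spade)
p_q_and_s = prob(lambda c: is_queen(c) and is_spade(c))
p_q_or_s = prob(lambda c: is_queen(c) or is_spade(c))

# Naive addition double counts the Queen of Spades ...
naive = p_q + p_s
# ... while the general addition rule subtracts the overlap once.
general = p_q + p_s - p_q_and_s
assert general == p_q_or_s
```

Direct counting and the general addition rule agree, while the naive sum overshoots by exactly P(Q and S) = 1/52.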
The combinatorial formula
An alternative way of getting P(h = 1) is to use the combinatorial formula. We find the
probability of getting a head then a tail, and multiply by the number of ways (combinations)
of getting one tail and one head in two tosses (two, obviously). With bigger numbers it is not
so obvious.
Consider the probability of getting three heads in eight tosses. There are in fact 56 ways of
achieving this, so using a tree diagram to establish this fact is laborious. A formula is
simpler.
The probability of getting three heads followed by five tails (or any single ordering) is given
by 0.5³ × 0.5⁵. The number of ways of getting three heads in eight tosses is obtained from
the combinatorial formula:
8C3 = 8! / (3! 5!) = (8 × 7 × 6 × ... × 1) / ((3 × 2 × 1) × (5 × 4 × ... × 1)) = 56.
Hence 0.5³ × 0.5⁵ × 56 = 0.21875 gives the desired answer.
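The count of 56 and the final probability can be reproduced in a few lines. The sketch below is illustrative only: binomial_probability is a hypothetical helper name, a fair coin (p = 0.5) is assumed by default, and math.comb requires Python 3.8 or later.

```python
from itertools import product
from math import comb


def binomial_probability(k, n, p=0.5):
    """P(exactly k successes in n independent trials, each with success probability p)."""
    # Number of orderings times the probability of any single ordering.
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Number of orderings with three heads in eight tosses, from the formula.
ways = comb(8, 3)

# Cross-check by listing all 2**8 sequences and counting those with three heads.
brute_force = sum(1 for seq in product("HT", repeat=8) if seq.count("H") == 3)

print(ways, brute_force, binomial_probability(3, 8))
```

The same function gives P(h = 1) for two tosses as binomial_probability(1, 2), matching the value of 0.5 found from the tree diagram.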