S2250
PROBABILITY CALCULATIONS
DR ROLA ZINTOUT
ENGLISH
PART 3
Syllabus S2250
W1 chapter 1: Introduction
W2 chapter 2: Discrete variables- Uniform, Bernoulli, and Binomial distributions
W3 chapter 2: Poisson, geometric and Hypergeometric distributions
W4 chapter 3: Continuous distributions- Uniform
W5 chapter 3: Normal and exponential distributions
W6 chapter 4: CI for the mean and the proportion
W7 chapter 4: CI for the variance
W8 chapter 5: hypothesis testing for the mean of 1 population
W9 chapter 5: hypothesis testing for the proportion and the variance of 1 population
W10 chapter 5: hypothesis testing for the mean, the proportion and the variance of 2 independent
populations
W11 chapter 5: hypothesis testing for the mean of 2 dependent populations
W12 chapter 6: hypothesis testing for the mean of several populations: ANOVA
W13 chapter 7: Chi square testing
W14 chapter 7: Chi square testing
CHAPTER 2: DISCRETE VARIABLES
WEEK 3
PART 3
Binomial Distribution
The Bernoulli distribution is closely related to the Binomial distribution. As
long as the individual Bernoulli trials are independent, the number of
successes in a series of Bernoulli trials has a Binomial distribution. The
Bernoulli distribution can also be defined as the Binomial distribution with
n = 1.
- 2 possible outcomes (success or failure)
- n independent trials
- n is known.
- We note X~ B (n, p) where p is the probability of success.
The probability distribution is: $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$
where $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$, p is the probability of success and n is the number of
trials.
The expectation of X is: E(X) = np
The variance is Var(X) = np(1 − p).
Example 7:
Consider an unfair coin where the probability of observing a tail (T) is p(T) = 0.6. Let us
denote tails by “1” and heads by “0”. Suppose the coin is tossed three times. In total, there
are $2^3 = 8$ possible outcomes, listed below. X is a random variable representing the
number of points that you get, that is, the number of tails:
Possibilities   X
111             3
110             2
101             2
011             2
100             1
010             1
001             1
000             0

X       P(X)
0       $\binom{3}{0} 0.6^0 (1-0.6)^{3-0} = 0.064$
1       $\binom{3}{1} 0.6^1 (1-0.6)^{3-1} = 0.288$
2       $\binom{3}{2} 0.6^2 (1-0.6)^{3-2} = 0.432$
3       $\binom{3}{3} 0.6^3 (1-0.6)^{3-3} = 0.216$
Total   1
The expectation is : np = 3 x 0.6 = 1.8
The variance is : np(1-p) = 3 x 0.6 x 0.4 = 0.72
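The table and the two results above can be checked numerically. The following is a minimal Python sketch using only the standard library; the helper name binomial_pmf is an illustrative choice and is not part of the course material.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p). Helper name is an illustrative assumption."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 3, 0.6  # three tosses, P(tail) = 0.6, as in Example 7

# Probability of each possible number of tails
pmf = {k: binomial_pmf(k, n, p) for k in range(n + 1)}
for k, prob in pmf.items():
    print(f"P(X={k}) = {prob:.3f}")

# Mean and variance computed directly from the PMF,
# to be compared with np and np(1 - p)
mean = sum(k * prob for k, prob in pmf.items())
variance = sum((k - mean) ** 2 * prob for k, prob in pmf.items())
print(f"E(X) = {mean:.2f}, Var(X) = {variance:.2f}")
```

Computing the mean and variance from the PMF, rather than from the formulas, gives an independent check of np and np(1 − p).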
Example 8:
Assuming that 2% of people are left-handed, calculate the probability that
a- Among 100 people, exactly three are left-handed.
b- Among 100 people, three at most are left-handed.
c- Among 100 people, nobody is left-handed.
d- Among 100 people, 2 or more are left-handed.
n = 100; success: the person is left-handed; p = 2% = 2/100 = 0.02; 1 − p = 0.98
a- $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$
$P(X = 3) = \binom{100}{3} 0.02^3 (0.98)^{97} = 0.1823$
b- $P(X \le 3) = P(X=0) + P(X=1) + P(X=2) + P(X=3)$
$= \binom{100}{0} 0.02^0 (0.98)^{100} + \dots + \binom{100}{2} 0.02^2 (0.98)^{98} + 0.1823$
$= 0.1326 + 0.2707 + 0.2734 + 0.1823 = 0.859$
About 85.9% of groups of 100 people have at most 3 left-handed people.
c- $P(X = 0) = \binom{100}{0} 0.02^0 (0.98)^{100} = 0.1326$
d- $P(X \ge 2) = 1 - P(X < 2) = 1 - [P(X=0) + P(X=1)] = 1 - (0.1326 + 0.2707) = 0.5967$
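The four answers of Example 8 can be reproduced with a short Python sketch. It assumes only the binomial PMF given earlier; the function and variable names are illustrative choices.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p). Helper name is an illustrative assumption."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 100, 0.02  # 100 people, 2% left-handed

# a- exactly three left-handed people
p_exactly_3 = binomial_pmf(3, n, p)

# b- at most three: sum of P(X = 0) ... P(X = 3)
p_at_most_3 = sum(binomial_pmf(k, n, p) for k in range(4))

# c- nobody is left-handed
p_none = binomial_pmf(0, n, p)

# d- two or more: complement of P(X = 0) + P(X = 1)
p_at_least_2 = 1 - (binomial_pmf(0, n, p) + binomial_pmf(1, n, p))

print(f"a- {p_exactly_3:.4f}")
print(f"b- {p_at_most_3:.4f}")
print(f"c- {p_none:.4f}")
print(f"d- {p_at_least_2:.4f}")
```

Part d uses the complement because summing P(X = 2) up to P(X = 100) directly would need 99 terms instead of 2.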
Example 9:
A dentist pulls his patients' teeth at random. Each patient has one bad tooth among the thirty-two teeth
present before the operation.
1) Consider the first ten patients: calculate the probability that none of these ten patients still has the
bad tooth.
2) Consider the first ten patients: calculate the probability that at least one of those patients still has
the bad tooth.
3) How many people should be treated if the dentist wants to extract at least one bad tooth with a probability
greater than 0.6?
Example 10:
The 25 employees of a factory have the following numbers of children:
Number of children (X)   Frequency   P(X)
1                        3
2                        5
3                        12
4                        4
5                        1
a- Find the mean.
b- Find the variance and standard deviation.
c- Find the probability that an employee has 3 children or fewer.
Poisson Distribution
The Poisson distribution is popular for modeling the number of times an
event occurs in an interval of time or space.
It is used:
- As an approximation to the binomial distribution (especially when n is
unknown)
- 2 possible outcomes (success or failure)
- n independent trials
- n can be known or unknown.
- the number of trials n is very large and the probability of success p is
very small.
- We note X ~ P(λ) where λ is the mean and is equal to np.
Example:
- the number of alpha particles emitted by a radioactive substance entering a particular region
in a given short time interval. Note that the number of emitted alpha particles is very high but
only a few particles are transmitted through the region in a given short time interval.
- the number of flu cases in a country within one year,
- the number of tropical storms within a given area in one year,
- the number of bacteria found in a biological investigation.
A discrete random variable X is said to follow a Poisson distribution with parameter λ > 0 if its
PMF is given by
$P(X = k) = \frac{\lambda^k}{k!} e^{-\lambda}$
where e is Euler's number (e = 2.71828...)
E(X) = V(X) = λ
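To illustrate the use of the Poisson distribution as an approximation to the binomial when n is large and p is small, the sketch below compares the two PMFs. It borrows n = 100 and p = 0.02 from Example 8 as an illustrative choice, so that λ = np = 2; the function names are assumptions made for this sketch.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ P(lam)."""
    return lam**k / factorial(k) * exp(-lam)


n, p = 100, 0.02  # values borrowed from Example 8 (illustrative choice)
lam = n * p       # Poisson parameter: lambda = np

# Side-by-side comparison for small k
for k in range(5):
    b = binomial_pmf(k, n, p)
    q = poisson_pmf(k, lam)
    print(f"k={k}: binomial={b:.4f}  poisson={q:.4f}")

# Largest absolute difference between the two PMFs over k = 0..n
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(n + 1))
print(f"largest gap: {max_gap:.4f}")
```

The gap shrinks further as n grows and p decreases with np held fixed.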
Example 11:
A country experiences on average 4 tropical storms per year.
a- Write the probability distribution of X, where X is the number of tropical storms.
k       P(X = k)
0       $\frac{\lambda^k}{k!} e^{-\lambda} = \frac{4^0}{0!} e^{-4} = 0.01832$
1       $\frac{4^1}{1!} e^{-4} = 0.07326$
2       $\frac{4^2}{2!} e^{-4} = 0.1465$
3       0.1954
4       0.1954
5       0.1563
6       0.1042
7       0.0595
…
Total   1
b- What is the expectation of X, and what is the variance?
E(X) = V(X) = λ = 4
c- What is the probability of suffering from only two tropical storms per year?
P(X=2) = 0.1465
d- What is the probability of suffering from more than two tropical storms per year?
P(X>2) = 1- P(X≤ 2) = 1-[P(X=0) + P(X=1) + P(X=2)]=
= 1- (0.01832 + 0.07326 + 0.1465)
= 1- 0.23808 = 0.76192
e- What is the probability of suffering from only one tropical storm in a period of 5 years?
If the average is λ = 4 per year, then $\lambda_0 = 4 \times 5 = 20$ per 5 years.
$P(X = 1) = \frac{\lambda_0^k}{k!} e^{-\lambda_0} = \frac{20^1}{1!} e^{-20} = 4.1223 \times 10^{-8}$
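The answers to parts c, d and e of Example 11 can be verified with the following Python sketch. The function and variable names are illustrative; the only inputs are the average of 4 storms per year and the 5-year period from the example.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for X ~ P(lam). Helper name is an illustrative assumption."""
    return lam**k / factorial(k) * exp(-lam)


lam = 4  # average number of tropical storms per year

# c- exactly two storms in one year
p_two = poisson_pmf(2, lam)

# d- more than two storms: complement of P(X = 0) + P(X = 1) + P(X = 2)
p_more_than_two = 1 - sum(poisson_pmf(k, lam) for k in range(3))

# e- the rate scales with the length of the interval: 5 years -> 4 * 5 = 20
lam_5_years = lam * 5
p_one_in_5_years = poisson_pmf(1, lam_5_years)

print(f"c- {p_two:.4f}")
print(f"d- {p_more_than_two:.5f}")
print(f"e- {p_one_in_5_years:.4e}")
```

Part e shows the key modelling step: the parameter λ must always match the interval over which events are counted.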
Example 12:
It is assumed that the probability that a traveler forgets his luggage on the train is 0.005.
A train is carrying 850 travelers. We assume that these travelers are grouped at random and that their
behavior regarding their luggage is independent of one another.
X denotes the random variable representing the number of travelers who have forgotten their
luggage on the train.
1) What is the probability distribution of the random variable X? Calculate its expectation and its
variance.
2) Give a probability distribution for approximating the law found in the previous question. Using
this approximate law, calculate an approximate value of the probability of the following events:
a) no traveler has forgotten his luggage,
b) at least five passengers have forgotten their luggage.
Example 13:
A factory produces water bottles. Among these bottles, 3% are defective. X is the random variable
representing the number of defective bottles in any batch of 100 randomly selected bottles. It is assumed that
X has a Poisson distribution with parameter 3.
a- Determine the probability that this lot has two defective bottles.
b- Determine the probability that this lot has at most two defective bottles.
c- Determine the probability that such a lot has more than three defective bottles.
d- We suppose now that we buy a lot of 500 bottles. What is the probability of having exactly 10 defective
bottles?
Geometric Distribution
We are interested in determining how many independent Bernoulli trials are needed until the event of
interest occurs for the first time.
The geometric distribution can be used to determine the probability that the event of interest happens
at the kth trial for the first time.
It is used:
- 2 possible outcomes (success or failure)
- independent trials, repeated until the first success
- We note X ~ G(p) where p is the probability of success.
Example:
- How many tickets to buy in a raffle until we win for the first time,
- How many different drugs to try to successfully tackle a severe migraine, etc.
A discrete random variable X is said to follow a geometric distribution with parameter p if its PMF is
given by
$P(X = k) = p (1-p)^{k-1}$, k = 1, 2, 3, ...
The mean (expectation) is $E(X) = \frac{1}{p}$ and the variance is $\mathrm{Var}(X) = \frac{1}{p}\left(\frac{1}{p} - 1\right)$.
Example 14:
A coin is tossed until “head” is obtained for the first time. The probability of getting a head is p = 0.5
for each toss. We have the following probabilities:
P(X = 1) = 0.5
P(X = 2) = 0.5(1 − 0.5) = 0.25
$P(X = 3) = 0.5 (1 - 0.5)^2 = 0.125$ and $P(X = 4) = 0.5 (1 - 0.5)^3 = 0.0625$
The mean and the variance are: E(X) = 1/0.5 = 2; Var(X) = (1/0.5)(1/0.5 − 1) = 2.
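The probabilities of Example 14 and the formulas for the mean and the variance can be checked with a small Python sketch. The helper name geometric_pmf and the cutoff of 200 terms for the numerical sums are assumptions made for illustration.

```python
def geometric_pmf(k, p):
    """P(X = k): the first success occurs on trial k (k = 1, 2, 3, ...)."""
    return p * (1 - p) ** (k - 1)


p = 0.5  # probability of a head on each toss

for k in range(1, 5):
    print(f"P(X={k}) = {geometric_pmf(k, p)}")

# Closed-form mean and variance from the formulas above
mean = 1 / p
variance = (1 / p) * (1 / p - 1)

# Numerical check: truncate the infinite sums after 200 terms
# (arbitrary cutoff; the remaining tail is negligible for p = 0.5)
ks = range(1, 201)
mean_numeric = sum(k * geometric_pmf(k, p) for k in ks)
variance_numeric = sum((k - mean_numeric) ** 2 * geometric_pmf(k, p) for k in ks)

print(f"E(X): formula={mean}, numeric={mean_numeric:.6f}")
print(f"Var(X): formula={variance}, numeric={variance_numeric:.6f}")
```

For a smaller p the tail decays more slowly, so the cutoff would need to be larger.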
Example 15:
A drunk man returns home with 10 different keys in his pocket. To open the door, he tries a key at
random and, if the door does not open, he puts the key back in his pocket and starts again. X is the
number of keys tested until the door opens. What is the probability of having X = 6?
Example 16:
We roll a balanced die until we get a 6. X is the number of throws until we get 6. What is the
probability of having X = 2?
Example 17:
A hamster is placed in a cage. It faces 5 gates, of which only one allows it to leave the
cage. At each unsuccessful attempt, it receives an electric shock and is returned to its original
location.
1. Assuming that the hamster is not capable of learning and therefore chooses with equal probability among
the gates at each new attempt, determine the probability of the events:
(A) the hamster leaves on the first attempt,
(B) the hamster leaves on the third attempt,
(C) the hamster leaves on the seventh attempt.
Example 17 (continued):
2. The hamster now memorizes the unsuccessful attempts and chooses with equal probability among the gates it has
not yet tried. We denote by X the random variable equal to the number of attempts made.
A) What values can X take? Determine its probability distribution.
B) Determine the mathematical expectation E(X) and interpret the result.
C) Determine the variance Var(X).
Hypergeometric Distribution
It is used:
- When we draw simultaneously.
- several possible outcomes.
- We note X ∼ H(n, M, N).
Consider an urn with N balls, M white balls and the rest are black.
We draw simultaneously n balls (without replacement), the order in which the balls are
drawn is assumed to be of no interest; only the number of drawn white balls is of
relevance.
We define the following random variable X : “number of white balls (x) among the n
drawn balls”.
- There are $\binom{M}{x}$ possibilities to choose x white balls from the total of M white balls,
- There are $\binom{N-M}{n-x}$ possibilities to choose (n − x) black balls from the total of N − M
black balls.
- The number of combinations for all possible events is $\binom{N}{n}$. In total, we draw n out of
N balls.
- The number of favorable events is $\binom{M}{x}\binom{N-M}{n-x}$ because we draw, independently of
each other, x out of M balls and n − x out of N − M balls.
A random variable X is said to follow a hypergeometric distribution with parameters n, M, N,
i.e. X ∼ H(n, M, N), if its PMF is given by
$P(X = x) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}}$
The mean is E(X)= nM/N
Example 18:
A box contains 20 balls (4 are red and 16 are white). A person randomly and simultaneously
picks 5 balls :
a- Explain why this is a hypergeometric distribution. What are x, M and N?
b- What is the probability that among the selected balls there are exactly 4 white?
c- What is the probability that among the selected balls there are exactly 4 red?
d- What is the probability that all the selected balls are red?
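Solution:
a- It is a hypergeometric distribution because we draw simultaneously (without replacement). If X is the random
variable representing the number of white balls, then x = 4, M = 16 and N = 20, with n = 5 balls drawn.
b- $P(X = 4) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}} = \frac{\binom{16}{4}\binom{20-16}{5-4}}{\binom{20}{5}} = 0.46956$
c- Let Y be the number of red balls among the 5 drawn. Four red balls means one white ball, so
$P(Y = 4) = P(X = 1) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}} = \frac{\binom{16}{1}\binom{20-16}{5-1}}{\binom{20}{5}} = \frac{\binom{16}{1}\binom{4}{4}}{\binom{20}{5}} = 0.00103$
d- P(Y = 5) = 0 since we have only 4 red balls.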
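The solution of Example 18 can be reproduced with the following Python sketch, which uses math.comb from the standard library for the binomial coefficients. The helper name hypergeom_pmf and its argument order (x, n, M, N) are illustrative choices that follow the notation H(n, M, N).

```python
from math import comb


def hypergeom_pmf(x, n, M, N):
    """P(X = x) for X ~ H(n, M, N): x white balls among n drawn
    from an urn of N balls, M of which are white."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)


n, M, N = 5, 16, 20  # 5 balls drawn, 16 white balls, 20 balls in total

# b- exactly 4 white balls
p_4_white = hypergeom_pmf(4, n, M, N)

# c- exactly 4 red balls, i.e. exactly 1 white ball
p_4_red = hypergeom_pmf(1, n, M, N)

# d- 5 red balls, i.e. 0 white balls: comb(4, 5) is 0, so the result is 0
p_5_red = hypergeom_pmf(0, n, M, N)

print(f"b- {p_4_white:.5f}")
print(f"c- {p_4_red:.5f}")
print(f"d- {p_5_red}")
```

math.comb returns 0 when asked to choose more items than are available, which is why part d needs no special case.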
Example 19:
The German national lottery draws 6 out of 49 balls from a rotating bowl. Each ball is
associated with a number between 1 and 49. A simple bet is to choose 6 numbers between 1 and
49. If 3 or more chosen numbers correspond to the numbers drawn in the lottery, then one wins
a certain amount of money.
What is the probability of choosing 4 correct numbers?
We can utilize the hypergeometric distribution with x = 4, M = 6, N = 49, and n = 6 to calculate
such probabilities.
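$P(X = x) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}} = \frac{\binom{6}{4}\binom{49-6}{6-4}}{\binom{49}{6}} = \frac{\binom{6}{4}\binom{43}{2}}{\binom{49}{6}} = 9.686 \times 10^{-4}$, which is almost 0.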
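Since a bet wins when 3 or more numbers are correct, the same formula also gives the overall probability of winning. The Python sketch below computes the whole distribution for x = 0, ..., 6; the helper name hypergeom_pmf and the variable names are illustrative assumptions.

```python
from math import comb


def hypergeom_pmf(x, n, M, N):
    """P(X = x) for X ~ H(n, M, N)."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)


n, M, N = 6, 6, 49  # 6 numbers chosen, 6 winning numbers, 49 balls

# Full distribution of the number of correct numbers
pmf = {x: hypergeom_pmf(x, n, M, N) for x in range(n + 1)}
for x, prob in pmf.items():
    print(f"P(X={x}) = {prob:.3e}")

# Probability of exactly 4 correct numbers
p_four = pmf[4]

# Probability of winning something: 3 or more correct numbers
p_win = sum(pmf[x] for x in range(3, n + 1))
print(f"P(X >= 3) = {p_win:.5f}")
```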
Example 20:
24 persons including 8 women have applied for a job. 5 of the candidates are randomly
selected.
a- What is the probability of choosing exactly three women?
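$P(X = 3) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}} = \frac{\binom{8}{3}\binom{24-8}{5-3}}{\binom{24}{5}} = \frac{56 \times 120}{42504} \approx 0.1581$
b- What is the expected number of women among the selected candidates?
E(X) = nM/N = 5 × 8 / 24 = 1.67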
Example 21:
A list contains 30 cities among which 12 are in Europe and the rest in Asia.
4 cities are randomly chosen to be visited:
a- What is the probability that they are all European?
b- What is the probability that only two among them are European?
c- What is the probability of having more than 2 European cities?
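Distribution     P(X)                                                    E(X)              V(X)
Uniform          $\frac{1}{k}$                                           $\frac{k+1}{2}$   $\frac{k^2-1}{12}$
Bernoulli        p                                                       p                 p(1 − p)
Binomial         $\binom{n}{k} p^k (1-p)^{n-k}$                          np                np(1 − p)
Poisson          $\frac{\lambda^k}{k!} e^{-\lambda}$                     λ                 λ
Geometric        $p (1-p)^{k-1}$                                         $\frac{1}{p}$     $\frac{1}{p}\left(\frac{1}{p} - 1\right)$
Hypergeometric   $\frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}}$     nM/N              -------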