Probability, Statistics and
Information
2024 - 2025 Fall
Course Objective: This course aims to teach
concepts and ideas about statistics and
probability, to establish meaningful relationships
between these concepts and ideas, and to
develop statistical thinking and reasoning skills.
Chapter 5
Some Discrete
Probability
Distributions
Copyright © 2017 Pearson Education, Ltd. All rights reserved.
Section 5.2
Binomial and
Multinomial
Distributions
Section 5.2:
The Bernoulli Distribution
We use the Bernoulli distribution when we have
an experiment which can result in one of two
outcomes. One outcome is labeled “success,”
and the other outcome is labeled “failure.”
The probability of a success is denoted by p. The
probability of a failure is then 1 – p.
Such a trial is called a Bernoulli trial with
success probability p.
McGraw-Hill ©2014 by The McGraw-Hill Companies, Inc. All rights reserved.
Examples
1. The simplest Bernoulli trial is the toss of a coin.
The two outcomes are heads and tails. If we define
heads to be the success outcome, then p is the
probability that the coin comes up heads. For a fair
coin, p = 0.5.
2. Another Bernoulli trial is a selection of a
component from a population of components, some
of which are defective. If we define “success” to be
a defective component, then p is the proportion of
defective components in the population.
The Bernoulli Distribution
• There is a single trial.
• The trial can result in one of two possible outcomes: success or failure.
• P(success)=p
• P(failure)=1-p
X ~ Bernoulli(p)
For any Bernoulli trial, we define a random variable X
as follows:
If the experiment results in a success, then X = 1.
Otherwise, X = 0.
It follows that X is a discrete random variable, with
probability mass function p(x) defined by
p(0) = P(X = 0) = 1 – p (probability of failure)
p(1) = P(X = 1) = p (probability of success)
p(x) = 0 for any value of x other than 0 or 1
Equivalently, $p(x) = p^x (1-p)^{1-x}$ for x = 0, 1.
Mean and Variance
If X ~ Bernoulli(p), then
$$\mu_X = 0(1-p) + 1(p) = p$$
$$\sigma_X^2 = (0-p)^2(1-p) + (1-p)^2(p) = p(1-p)$$
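The mean and variance of a Bernoulli random variable can be checked by summing over its two possible values. The following is a minimal Python sketch of that check; the helper names `bernoulli_pmf` and `mean_and_variance` and the value p = 0.3 are illustrative assumptions, not part of the course material.

```python
def bernoulli_pmf(x, p):
    """pmf of X ~ Bernoulli(p): p(1) = p, p(0) = 1 - p, and 0 elsewhere."""
    if x == 1:
        return p
    if x == 0:
        return 1 - p
    return 0.0


def mean_and_variance(p):
    """Compute the mean and variance by summing over the support {0, 1}."""
    mean = sum(x * bernoulli_pmf(x, p) for x in (0, 1))
    var = sum((x - mean) ** 2 * bernoulli_pmf(x, p) for x in (0, 1))
    return mean, var


# Illustrative value p = 0.3 (an assumption chosen for this sketch)
mu, var = mean_and_variance(0.3)
print(mu, var)  # should agree with p = 0.3 and p(1 - p) = 0.21 up to rounding
```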
Example
Ten percent of components manufactured by a
certain process are defective. A component is
chosen at random. Let X = 1 if the component is
defective, and
X = 0 otherwise.
1. What is the distribution of X?
2. Find the mean and variance of X.
• P(success)=P(defective)=p=0.1
• P(failure)=P(nondefective)=1-p=0.9
• X ~ Bernoulli(0.1)
• E(X) = p = 0.1
• V(X) = p(1 – p) = (0.1)(0.9) = 0.09
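As a rough sanity check on this example, one can simulate many draws of X ~ Bernoulli(0.1) and look at the sample mean and sample variance. This is only a sketch: the function name `simulate_bernoulli`, the number of draws, and the seed are assumptions made for illustration.

```python
import random


def simulate_bernoulli(p, n_draws, seed=0):
    """Return n_draws simulated Bernoulli(p) values (1 = defective, 0 = not)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [1 if rng.random() < p else 0 for _ in range(n_draws)]


draws = simulate_bernoulli(0.1, 100_000)
sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / len(draws)

# Both values should be near the theoretical mean 0.1 and variance 0.09
print(sample_mean, sample_var)
```

The simulated values will not match the theoretical ones exactly; they only get closer as the number of draws grows.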
Section 5.2:
The Binomial Distribution
If a total of n Bernoulli trials are conducted, and
• the trials are independent,
• each trial has the same success probability p, and
• X is the number of successes in the n trials,
then X has the binomial distribution with
parameters n and p, denoted X ~ Bin(n, p).
Example
A fair coin is tossed 10 times. Let X be the
number of heads that appear. What is the
distribution of X?
Independence of trials
• When selecting items from a box, you must replace
each selected item before the next draw to make your
selections independent (sampling with replacement).
• In some cases selections can be treated as
independent even though they are made
without replacement:
– Selections from an infinite population
– Selections from a finite population, when the sample is
small relative to the population
Another Use of the Binomial
Assume that:
a finite population contains items of two types,
successes and failures,
and that a simple random sample is drawn from the
population.
Then if the sample size is no more than 5% of
the population, the binomial distribution may be
used to model the number of successes.
Sample size ≤ 0.05 × (population size)
Example
A lot contains several thousand components, 10%
of which are defective. Seven components are
sampled from the lot. Let X represent the number
of defective components in the sample. What is
the distribution of X?
• Since the sample size is small compared to the
population (i.e., less than 5%), the number of
successes in the sample approximately follows
a binomial distribution.
• Therefore we model X with the Bin(7, 0.1) distribution.
Binomial R.V.:
pmf, mean, and variance
If X ~ Bin(n, p), the probability mass
function of X is
$$p(x) = P(X = x) = \begin{cases} \dfrac{n!}{x!\,(n-x)!}\, p^x (1-p)^{n-x}, & x = 0, 1, \ldots, n \\ 0, & \text{otherwise} \end{cases}$$
Mean: $\mu_X = np$
Variance: $\sigma_X^2 = np(1-p)$
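The binomial pmf, mean, and variance can be evaluated numerically with the Python standard library. The sketch below applies them to the two earlier examples, the fair coin tossed 10 times, modeled as Bin(10, 0.5), and the sample of seven components, modeled as Bin(7, 0.1). The function names `binomial_pmf` and `summarize` are assumptions made for illustration.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p); zero outside x = 0, 1, ..., n."""
    if not (isinstance(x, int) and 0 <= x <= n):
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def summarize(n, p):
    """Return (total probability, mean, variance) computed from the pmf."""
    probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
    mean = sum(x * q for x, q in enumerate(probs))
    var = sum((x - mean) ** 2 * q for x, q in enumerate(probs))
    return sum(probs), mean, var


# Fair coin tossed 10 times: P(X = 5) = 252/1024
print(binomial_pmf(5, 10, 0.5))

# Totals should be 1; means and variances should match np and np(1 - p)
print(summarize(10, 0.5))
print(summarize(7, 0.1))
```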
More on the Binomial Distribution
• Assume n independent Bernoulli trials are conducted.
• Each trial has probability of success p.
• Let Y1, …, Yn be defined as follows: Yi = 1 if the ith
trial results in success, and Yi = 0 otherwise. (Each of
the Yi has the Bernoulli(p) distribution.)
• Now, let X represent the number of successes among
the n trials. So, X = Y1 + …+ Yn .
This shows that a binomial random variable can be
expressed as a sum of Bernoulli random variables.
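The representation X = Y1 + … + Yn also gives a simple way to simulate a binomial random variable. In the sketch below, each draw of X is built by adding n simulated Bernoulli trials, and the observed frequencies are printed next to the exact Bin(n, p) probabilities. The values n = 10, p = 0.5, the number of repetitions, the seed, and the name `simulate_binomial` are all assumptions chosen for illustration.

```python
import random
from math import comb


def simulate_binomial(n, p, rng):
    """One draw of X = Y1 + ... + Yn, where each Yi ~ Bernoulli(p)."""
    return sum(1 if rng.random() < p else 0 for _ in range(n))


rng = random.Random(1)          # fixed seed so the run is repeatable
n, p, reps = 10, 0.5, 50_000    # illustrative values
counts = [0] * (n + 1)
for _ in range(reps):
    counts[simulate_binomial(n, p, rng)] += 1

# Empirical frequency next to the exact binomial probability
for x in range(n + 1):
    exact = comb(n, x) * p**x * (1 - p) ** (n - x)
    print(x, counts[x] / reps, round(exact, 4))
```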
Multinomial Trials
A Bernoulli trial is a process that results in one
of two possible outcomes. A generalization of
the Bernoulli trial is the multinomial trial,
which is a process that can result in any of k
outcomes, where k ≥ 2.
We denote the probabilities of the k outcomes by
p1,…,pk .
Multinomial Distribution
• Now assume that n independent multinomial trials are
conducted each with k possible outcomes and with the
same probabilities p1,…,pk .
• Number the outcomes 1, 2, …, k. For each outcome i,
let Xi denote the number of trials that result in that
outcome.
• Then X1,…, Xk are discrete random variables.
• The collection X1,…, Xk is said to have the multinomial
distribution with parameters n, p1,…, pk. We write X1,
…, Xk ~ MN(n, p1,…, pk).
Multinomial R.V.
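If X1,…, Xk ~ MN(n, p1,…, pk), then the pmf of X1,…, Xk is
$$p(x_1, \ldots, x_k) = P(X_1 = x_1, \ldots, X_k = x_k) = \begin{cases} \dfrac{n!}{x_1!\, x_2! \cdots x_k!}\, p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}, & x_i \in \{0, 1, \ldots, n\},\ \sum_{i=1}^{k} x_i = n \\ 0, & \text{otherwise} \end{cases}$$
Note that if X1,…, Xk ~ MN(n, p1,…, pk), then for each i, Xi
~ Bin(n, pi).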
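The claim that each Xi is marginally Bin(n, pi) can be checked numerically. The sketch below uses the standard multinomial pmf, n!/(x1!⋯xk!) · p1^x1 ⋯ pk^xk when the counts sum to n, and sums it over all outcomes to obtain the marginal distribution of X1. The parameters n = 5 and (p1, p2, p3) = (0.2, 0.3, 0.5) and the name `multinomial_pmf` are assumptions chosen for illustration.

```python
from math import comb, factorial, prod


def multinomial_pmf(xs, n, ps):
    """Joint pmf of (X1, ..., Xk) ~ MN(n, p1, ..., pk) at the counts xs."""
    if any(x < 0 for x in xs) or sum(xs) != n:
        return 0.0
    coef = factorial(n)
    for x in xs:
        coef //= factorial(x)  # each partial quotient is an exact integer
    return coef * prod(p**x for p, x in zip(ps, xs))


n, ps = 5, (0.2, 0.3, 0.5)  # illustrative parameters with k = 3
total = 0.0
marginal_x1 = [0.0] * (n + 1)
for x1 in range(n + 1):
    for x2 in range(n + 1 - x1):
        x3 = n - x1 - x2
        q = multinomial_pmf((x1, x2, x3), n, ps)
        total += q
        marginal_x1[x1] += q

# Marginal of X1 obtained by summation, next to Bin(n, p1)
binom_x1 = [comb(n, x) * ps[0] ** x * (1 - ps[0]) ** (n - x) for x in range(n + 1)]
print(total)
print(marginal_x1)
print(binom_x1)
```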
Section 5.3
Hypergeometric
Distribution
5.3. Hypergeometric Distribution
• Consider a finite population containing two types of
items, which may be called successes and failures.
• A simple random sample is drawn from the population.
• Each item sampled constitutes a Bernoulli trial.
• As each item is selected, the proportion of successes in
the remaining population decreases or increases,
depending on whether the sampled item was a success or
a failure (sampling without replacement).
Hypergeometric
• For this reason the trials are not independent,
so the number of successes in the sample does
not follow a binomial distribution.
• The distribution that properly describes the
number of successes is the hypergeometric
distribution.
Hypergeometric pmf
Assume a finite population contains N items, of
which R are classified as successes and N – R are
classified as failures. Assume that n items are
sampled from this population, and let X represent
the number of successes in the sample. Then X
has a hypergeometric distribution with
parameters N, R, and n, which can be denoted X
~ H(N, R, n).
Hypergeometric pmf
The probability mass function of X is
$$p(x) = P(X = x) = \begin{cases} \dfrac{\dbinom{R}{x}\dbinom{N-R}{n-x}}{\dbinom{N}{n}}, & \max(0, R+n-N) \le x \le \min(n, R) \\ 0, & \text{otherwise} \end{cases}$$
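The hypergeometric pmf is easy to evaluate with binomial coefficients, and doing so shows why the binomial model is acceptable when the sample is a small fraction of the population. The sketch below revisits the earlier lot example. The source says only that the lot contains "several thousand" components, so N = 5000 and R = 500 (10% defective) are hypothetical values chosen for illustration, as are the function names.

```python
from math import comb


def hypergeom_pmf(x, N, R, n):
    """P(X = x) for X ~ H(N, R, n); zero outside the support."""
    lo, hi = max(0, R + n - N), min(n, R)
    if not (isinstance(x, int) and lo <= x <= hi):
        return 0.0
    return comb(R, x) * comb(N - R, n - x) / comb(N, n)


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p), for x = 0, 1, ..., n."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Hypothetical lot: N = 5000 components, R = 500 defective, sample of n = 7
N, R, n = 5000, 500, 7
for x in range(n + 1):
    exact = hypergeom_pmf(x, N, R, n)     # sampling without replacement
    approx = binomial_pmf(x, n, R / N)    # Bin(7, 0.1) model
    print(x, round(exact, 5), round(approx, 5))
```

With the sample far below 5% of the population, the two columns should differ only in the later decimal places.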
Mean and Variance of the
Hypergeometric Distribution
If X ~ H(N, R, n), then
Mean of X: $\mu_X = \dfrac{nR}{N}$
Variance of X: $\sigma_X^2 = n \left(\dfrac{R}{N}\right)\left(1 - \dfrac{R}{N}\right)\left(\dfrac{N-n}{N-1}\right)$
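These two formulas can be compared against the mean and variance computed directly from the pmf. The following sketch does this for a small hypothetical population (N = 20, R = 8, n = 5); these numbers and the helper names are assumptions chosen only to keep the check quick.

```python
from math import comb


def hypergeom_pmf(x, N, R, n):
    """P(X = x) for X ~ H(N, R, n); zero outside the support."""
    if not (max(0, R + n - N) <= x <= min(n, R)):
        return 0.0
    return comb(R, x) * comb(N - R, n - x) / comb(N, n)


def hypergeom_moments(N, R, n):
    """Mean and variance obtained by summing over the support of the pmf."""
    support = range(max(0, R + n - N), min(n, R) + 1)
    mean = sum(x * hypergeom_pmf(x, N, R, n) for x in support)
    var = sum((x - mean) ** 2 * hypergeom_pmf(x, N, R, n) for x in support)
    return mean, var


N, R, n = 20, 8, 5  # hypothetical small population
mean, var = hypergeom_moments(N, R, n)
formula_mean = n * R / N
formula_var = n * (R / N) * (1 - R / N) * (N - n) / (N - 1)

# Each pair should agree up to floating-point rounding
print(mean, formula_mean)
print(var, formula_var)
```

The factor (N − n)/(N − 1) is what distinguishes this variance from the binomial variance np(1 − p) with p = R/N; it approaches 1 when N is much larger than n.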
Section 5.4
Negative
Binomial and
Geometric
Distributions
5.4. Geometric Distribution
• Assume that a sequence of independent
Bernoulli trials is conducted, each with the
same probability of success, p.
• Let X represent the number of trials up to and
including (until) the first success.
• Then X is a discrete random variable, which is
said to have the geometric distribution with
parameter p.
• We write X ~ Geom(p).
Geometric R.V.:
pmf, mean, and variance
If X ~ Geom(p), then
The pmf of X is
$$p(x) = P(X = x) = \begin{cases} p(1-p)^{x-1}, & x = 1, 2, \ldots \\ 0, & \text{otherwise} \end{cases}$$
The mean of X is $\mu_X = \dfrac{1}{p}$.
The variance of X is $\sigma_X^2 = \dfrac{1-p}{p^2}$.
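Because the geometric distribution has infinitely many possible values, a numerical check of its mean and variance has to truncate the sum. The sketch below does so for p = 0.2, where the probability beyond x = 2000 is negligible. The value of p, the truncation point, and the name `geometric_pmf` are assumptions chosen for illustration.

```python
def geometric_pmf(x, p):
    """P(X = x) for X ~ Geom(p): the first success occurs on trial x."""
    if not (isinstance(x, int) and x >= 1):
        return 0.0
    return p * (1 - p) ** (x - 1)


p = 0.2               # illustrative success probability
xs = range(1, 2001)   # truncated support; the remaining tail is negligible here

total = sum(geometric_pmf(x, p) for x in xs)
mean = sum(x * geometric_pmf(x, p) for x in xs)
var = sum((x - mean) ** 2 * geometric_pmf(x, p) for x in xs)

print(total)                    # should be very close to 1
print(mean, 1 / p)              # numerical mean next to 1/p
print(var, (1 - p) / p**2)      # numerical variance next to (1 - p)/p^2
```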
5.4. Negative Binomial Distribution
The negative binomial distribution is an extension of the
geometric distribution. Let r be a positive integer.
Assume that independent Bernoulli trials, each with
success probability p, are conducted, and let
X denote the number of trials up to and including
(until) the rth success.
Then X has the negative binomial distribution with
parameters r and p. We write X ~ NB(r,p).
Note: If X ~ NB(r,p), then X = Y1 + …+ Yr where Y1,
…,Yr are independent random variables, each with
Geom(p) distribution.
Negative Binomial R.V.:
pmf, mean, and variance
If X ~ NB(r,p), then
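The pmf of X is
$$p(x) = P(X = x) = \begin{cases} \dbinom{x-1}{r-1}\, p^r (1-p)^{x-r}, & x = r, r+1, \ldots \\ 0, & \text{otherwise} \end{cases}$$
The mean of X is $\mu_X = \dfrac{r}{p}$.
The variance of X is $\sigma_X^2 = \dfrac{r(1-p)}{p^2}$.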
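The same kind of truncated-sum check works for the negative binomial distribution, and it also shows that the case r = 1 reduces to the geometric pmf. In the sketch below, r = 3, p = 0.4, the truncation point, and the name `neg_binomial_pmf` are assumptions chosen for illustration.

```python
from math import comb


def neg_binomial_pmf(x, r, p):
    """P(X = x) for X ~ NB(r, p): the r-th success occurs on trial x."""
    if not (isinstance(x, int) and x >= r):
        return 0.0
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)


r, p = 3, 0.4          # illustrative parameters
xs = range(r, 1001)    # truncated support; the remaining tail is negligible here

total = sum(neg_binomial_pmf(x, r, p) for x in xs)
mean = sum(x * neg_binomial_pmf(x, r, p) for x in xs)
var = sum((x - mean) ** 2 * neg_binomial_pmf(x, r, p) for x in xs)

print(total)                      # should be very close to 1
print(mean, r / p)                # numerical mean next to r/p
print(var, r * (1 - p) / p**2)    # numerical variance next to r(1 - p)/p^2

# With r = 1 the pmf is the geometric pmf p(1 - p)^(x - 1)
print(neg_binomial_pmf(4, 1, p), p * (1 - p) ** 3)
```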
Section 5.5
Poisson
Distribution and
the Poisson
Process
Section 5.5:
The Poisson Distribution
One way to think of the Poisson distribution is as
an approximation to the binomial distribution when n
is large and p is small.
When n is large and p is small, the binomial mass
function depends almost entirely on the mean np, and
very little on the specific values of n and p.
We can therefore approximate the binomial mass
function by one that depends only on the quantity λ = np;
this λ is the parameter of the Poisson distribution.
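To see that the binomial mass function depends mostly on np in this regime, one can compare two binomial distributions with very different n and p but the same mean. The pairs below, (n, p) = (1000, 0.002) and (10000, 0.0002), both with np = 2, are hypothetical values chosen for this sketch, as is the function name.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p), for x = 0, 1, ..., n."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Two hypothetical (n, p) pairs sharing the same mean np = 2
pairs = [(1000, 0.002), (10_000, 0.0002)]

for x in range(6):
    row = [binomial_pmf(x, n, p) for n, p in pairs]
    print(x, [round(q, 5) for q in row])
```

The two columns should be nearly identical, even though n and p differ by a factor of ten.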
The Poisson Distribution
• The Poisson distribution expresses the probability of a
given number of events occurring in a fixed interval
of time or space if these events occur at a known
constant rate and independently of the time since the
last event.
• The Poisson distribution can also be used for the
number of events in other specified intervals such as
distance, area or volume.
The Poisson Distribution
• X: the number of events that occur in a fixed interval of time or
space. Examples:
• The number of births per hour during a given day
• The number of particles emitted by a radioactive source in a
given time
• The number of cases of a disease in different towns
• The number of hits on a web site in one hour
• The number of goals scored by a football team in a match
• The number of accidents in a certain part of the road
• The number of customers who enter a market during an hour
Poisson R.V.:
pmf, mean, and variance
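If X ~ Poisson(λ), the probability mass function of X
is
$$p(x) = P(X = x) = \begin{cases} \dfrac{e^{-\lambda} \lambda^x}{x!}, & x = 0, 1, 2, \ldots \\ 0, & \text{otherwise} \end{cases}$$
Mean: $\mu_X = \lambda$
Variance: $\sigma_X^2 = \lambda$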
Note: X is a discrete random variable and λ must be a
positive constant.
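The Poisson pmf e^(−λ) λ^x / x! can be evaluated directly, which allows a numerical check that the mean and variance both equal λ, and a comparison with a binomial distribution that has large n, small p, and np = λ. In the sketch below, λ = 2, the truncation at x = 100, the binomial parameters n = 1000 and p = 0.002, and the function names are assumptions chosen for illustration.

```python
from math import comb, exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam); zero unless x is a nonnegative integer."""
    if not (isinstance(x, int) and x >= 0):
        return 0.0
    return exp(-lam) * lam**x / factorial(x)


lam = 2.0
xs = range(0, 101)  # truncated support; the remaining tail is negligible for lam = 2

total = sum(poisson_pmf(x, lam) for x in xs)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
var = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)
print(total, mean, var)  # should be close to 1, lam, and lam

# Compare with Bin(n, p) for a hypothetical large n and small p with np = lam
n, p = 1000, 0.002
for x in range(6):
    binom = comb(n, x) * p**x * (1 - p) ** (n - x)
    print(x, round(binom, 5), round(poisson_pmf(x, lam), 5))
```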
Summary
• Bernoulli trials
• Binomial distribution
• Multinomial distribution
• Hypergeometric distribution
• Geometric distribution
• Negative Binomial distribution
• Poisson distribution