CHAPTER 4
COMMONLY USED
DISTRIBUTIONS
BY:
DANGLI
LAVA
SUMAGAYSAY
BETITO
BANDADA
BORNALES
INTRODUCTION
Statistical inference involves drawing a sample from a
population and analyzing the sample data to learn about the
population. In many situations, one has an approximate
knowledge of the probability mass function or probability
density function of the population. In these cases, the
probability mass or density function can often be well
approximated by one of several standard families of curves,
or functions. In this chapter, we describe some of these
standard functions, and for each one we describe some
conditions under which it is appropriate.
4.1 THE BERNOULLI
DISTRIBUTION
• Imagine an experiment that can result in one of two outcomes.
One outcome is labeled “success,” and the other outcome is
labeled “failure.” The probability of success is denoted by p. The
probability of failure is therefore 1 − p. Such a trial is called a
Bernoulli trial with success probability p.
• The simplest Bernoulli trial is the toss of a coin. The two
outcomes are heads and tails. If we define heads to be the
success outcome, then p is the probability that the coin comes up
heads. For a fair coin, p = 1/2.
For any Bernoulli trial, we define a random variable X as follows: If the
experiment results in success, then X = 1. Otherwise X = 0. It follows
that X is a discrete random variable, with probability mass function
p(x) defined by:
p(0) = P(X = 0) = 1 − p
p(1) = P(X = 1) = p
p(x) = 0 for any value of x other than 0 or 1
EXAMPLE 4.1
A coin has probability 0.5 of landing heads when tossed.
Let X = 1 if the coin comes up heads, and X = 0 if the
coin comes up tails. What is the distribution of X?
Solution:
Since X = 1 when heads comes up, heads is the success
outcome. The success probability, P(X = 1), is equal to
0.5. Therefore X ∼ Bernoulli(0.5).
EXAMPLES
Example 4.2. A die has probability 1/6 of coming up 6 when rolled. Let X = 1
if the die comes up 6, and X = 0 otherwise. What is the distribution of X?
Solution:
The success probability is p = P(X = 1) = 1/6. Therefore X ∼ Bernoulli(1/6).
Example 4.3. Ten percent of the components manufactured by a certain
process are defective. A component is chosen at random. Let X = 1 if the
component is defective, and X = 0 otherwise. What is the distribution of X?
Solution:
The success probability is p = P(X = 1) = 0.1. Therefore X ∼ Bernoulli(0.1).
MEAN AND VARIANCE OF A BERNOULLI
RANDOM VARIABLE
If X ∼ Bernoulli(p), then
μX = 0(1 − p) + 1(p) = p    (4.1)
σ²X = (0 − p)²(1 − p) + (1 − p)²(p) = p(1 − p)    (4.2)
EXAMPLE 4.4
Refer to Example 4.3. Find μX and σ²X.
Solution:
Since X ∼ Bernoulli(0.1), the success probability p is equal to
0.1. Using Equations (4.1) and (4.2), μX = 0.1 and σ²X = 0.1(1
− 0.1) = 0.09.
4.2 The Binomial
Distribution
Sampling a single item to check if it's defective is an example of a
Bernoulli trial. When multiple items are sampled independently with
the same probability of success (e.g., being defective), the total
number of successes follows a binomial distribution.
If n is the number of trials and p is the probability of success in each
trial, then the number of successes X is a discrete random variable
with a binomial distribution:
X ∼ Bin(n, p)
Possible values of X are 0, 1, 2, ..., n.
EXAMPLE 4.5
A fair coin is tossed 10 times. Let X be the number of heads that
appear. What is the distribution of X?
Solution:
There are 10 independent Bernoulli trials, each with success
probability p = 0.5. The random variable X is equal to the number of
successes in the 10 trials. Therefore X ∼ Bin(10, 0.5).
EXAMPLE 4.6
A lot contains several thousand components, 10% of which are
defective. Seven components are sampled from the lot. Let X
represent the number of defective components in the sample. What is
the distribution of X?
Solution:
Since the sample size is small compared to the population (i.e., less
than 5%), the number of successes in the sample approximately
follows a binomial distribution. Therefore we model X with the Bin(7,
0.1) distribution.
Probability Mass Function of
a Binomial Random Variable
We now derive the probability mass function of a binomial random variable by
considering an example. A biased coin has probability 0.6 of coming up heads.
The coin is tossed three times. Let X be the number of heads. Then X
∼ Bin(3, 0.6). We will compute P(X = 2).
There are three arrangements of two heads in three tosses of a coin: HHT, HTH,
and THH. We first compute the probability of HHT. The event HHT is a sequence
of independent events: H on the first toss, H on the second toss, T on the third
toss. We know the probabilities of each of these events separately:
P(H on the first toss) = 0.6, P(H on the second toss) = 0.6, P(T on the third
toss) = 0.4
Since the events are independent, the probability that they all occur is equal to the
product of their probabilities (Equation 2.20 in Section 2.3). Thus
P(HHT) = (0.6)(0.6)(0.4) = (0.6)²(0.4)¹
Similarly, P(HTH) = (0.6)(0.4)(0.6) = (0.6)²(0.4)¹, and P(THH) = (0.4)(0.6)(0.6) =
(0.6)²(0.4)¹. It is easy to see that all the different arrangements of two heads and
one tail have the same probability. Now
P(X = 2) = P(HHT or HTH or THH)
= P(HHT) + P(HTH) + P(THH)
= (0.6)²(0.4)¹ + (0.6)²(0.4)¹ + (0.6)²(0.4)¹
= 3(0.6)²(0.4)¹
Examining this result, we see that the number 3 represents the number of
arrangements of two successes (heads) and one failure (tails), 0.6 is the success
probability p, the exponent 2 is the number of successes, 0.4 is the failure
probability 1 − p, and the exponent 1 is the number of failures.
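The count of arrangements can be checked by brute force. The following is a minimal Python sketch, not part of the original slides, that enumerates every head/tail sequence for the biased coin above and adds up the probabilities of the sequences with exactly two heads. The function name `prob_of_heads_count` is illustrative.

```python
from itertools import product

def prob_of_heads_count(n_tosses, n_heads, p_head):
    """Return (arrangements, probability) for exactly n_heads heads in n_tosses tosses."""
    arrangements = []
    total = 0.0
    for seq in product("HT", repeat=n_tosses):
        if seq.count("H") != n_heads:
            continue
        # Tosses are independent, so multiply the per-toss probabilities.
        prob = 1.0
        for toss in seq:
            prob *= p_head if toss == "H" else 1.0 - p_head
        arrangements.append("".join(seq))
        total += prob
    return arrangements, total

arrangements, p_two_heads = prob_of_heads_count(3, 2, 0.6)
print(arrangements)            # the arrangements with two heads
print(round(p_two_heads, 3))   # 3 * 0.6**2 * 0.4
```

For three tosses this lists HHT, HTH, and THH, and the total agrees with 3(0.6)²(0.4)¹ = 0.432.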
We can now generalize this result to produce a formula for the probability of x
successes in n independent Bernoulli trials with success probability p, in terms
of x, n, and p. In other words, we can compute P(X = x) where X ∼ Bin(n, p).
We can see that
P(X = x) = (number of arrangements of x successes in n trials) · p^x (1 − p)^(n−x)
All we need to do now is to provide an expression for the number of
arrangements of x successes in n trials. To describe this number, we need
factorial notation. For any positive integer n, the quantity n! (read “n factorial”)
is the number
(n)(n − 1)(n − 2)···(3)(2)(1)
We also define 0! = 1. The number of arrangements of x successes in n trials is
n!/[x!(n − x)!]. (A derivation of this result is presented in Section 2.2.) We can
now define the probability mass function for a binomial random variable. If X ∼ Bin(n, p), then
p(x) = P(X = x) = n!/[x!(n − x)!] · p^x (1 − p)^(n−x) for x = 0, 1, ..., n
p(x) = 0 otherwise
Figure 4.2 presents probability histograms for the Bin(10,0.4) and Bin(20, 0.1) probability
mass functions.
EXAMPLE 4.7
Find the probability mass function of the random variable X if X ∼ Bin(10, 0.4).
Find P(X = 5).
Example 4.8
A fair die is rolled eight times. Find the probability that no more than 2 sixes
come up.
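Examples 4.7 and 4.8 can be worked directly from the binomial probability mass function, p(x) = n!/[x!(n − x)!] · p^x (1 − p)^(n−x). The sketch below is one possible way to do this in Python; the helper names `binomial_pmf` and `binomial_cdf` are illustrative and not taken from the slides.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p); zero outside 0, 1, ..., n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_cdf(x, n, p):
    """P(X <= x) for X ~ Bin(n, p), summing the pmf from 0 to x."""
    return sum(binomial_pmf(k, n, p) for k in range(0, x + 1))

# Example 4.7: X ~ Bin(10, 0.4); tabulate the pmf and pick out P(X = 5).
pmf_10_04 = [binomial_pmf(x, 10, 0.4) for x in range(11)]
p_exactly_5 = pmf_10_04[5]

# Example 4.8: number of sixes in 8 rolls is Bin(8, 1/6); find P(X <= 2).
p_at_most_2_sixes = binomial_cdf(2, 8, 1 / 6)

print(round(p_exactly_5, 4), round(p_at_most_2_sixes, 4))
```

Under these assumptions the results are approximately 0.2007 for Example 4.7 and 0.8652 for Example 4.8.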
EXAMPLE 4.9
A large industrial firm allows a discount on any invoice that is paid within 30
days. Of all invoices, 10% receive the discount. In a company audit, 12 invoices
are sampled at random. What is the probability that fewer than 4 of the 12
sampled invoices receive the discount?
Solution:
Let X represent the number of invoices in the sample that receive discounts.
Then X ∼ Bin(12, 0.1). The probability that fewer than four invoices receive
discounts is P(X ≤ 3). We consult Table A.1 with n = 12, p = 0.1, and x = 3. We find
that P(X ≤ 3) = 0.974.
Example 4.10
Refer to Example 4.9. What is the probability that more than 1 of the 12
sampled invoices receives a discount?
Solution:
Let X represent the number of invoices in the sample that receive discounts.
We wish to compute the probability P(X > 1). Table A.1 presents probabilities
of the form P(X ≤ x). Therefore we note that P(X > 1) = 1 − P(X ≤ 1).
Consulting the table with n = 12, p = 0.1, x = 1, we find that P(X ≤ 1) = 0.659.
Therefore P(X > 1) = 1 − 0.659 = 0.341.
A Binomial Random Variable
Is a Sum of Bernoulli
Random Variables
Assume n independent Bernoulli trials are conducted, each with success probability p.
Let Y1, ..., Yn be defined as follows: Yi = 1 if the ith trial results in success, and Yi = 0
otherwise. Then each of the random variables Yi has the Bernoulli(p) distribution.
Now let X represent the number of successes among the n trials. Then X ∼ Bin(n, p).
Since each Yi is either 0 or 1, the sum Y1 + ··· + Yn is equal to the number of the Yi that
have the value 1, which is the number of successes among the n trials. Therefore X =
Y1 + ··· + Yn. This shows that a binomial random variable can be expressed as a sum of
Bernoulli random variables. Put another way, sampling a single value from a Bin(n, p)
population is equivalent to drawing a sample of size n from a Bernoulli(p) population,
and then summing the sample values.
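One way to see this equivalence numerically is to build the distribution of Y1 + ··· + Yn one Bernoulli trial at a time and compare it with the binomial formula. The Python sketch below does this for the hypothetical choice n = 7, p = 0.1; the function names are illustrative.

```python
from math import comb

def add_independent_bernoulli(pmf, p):
    """Given the pmf of S on 0..len(pmf)-1, return the pmf of S + Y,
    where Y ~ Bernoulli(p) is independent of S."""
    new_pmf = [0.0] * (len(pmf) + 1)
    for k, prob in enumerate(pmf):
        new_pmf[k] += prob * (1 - p)   # Y = 0: the total stays at k
        new_pmf[k + 1] += prob * p     # Y = 1: the total moves to k + 1
    return new_pmf

def pmf_of_bernoulli_sum(n, p):
    """pmf of Y1 + ... + Yn for independent Bernoulli(p) variables."""
    pmf = [1.0]  # a sum of zero trials equals 0 with probability 1
    for _ in range(n):
        pmf = add_independent_bernoulli(pmf, p)
    return pmf

n, p = 7, 0.1
from_sum = pmf_of_bernoulli_sum(n, p)
from_formula = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
max_gap = max(abs(a - b) for a, b in zip(from_sum, from_formula))
print(max_gap < 1e-12)
```

The two lists agree up to floating-point rounding, which is what the argument above predicts.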
The Mean and Variance of a
Binomial Random Variable
With a little thought, it is easy to see how to compute the mean of a
binomial random variable. For example, if a fair coin is tossed 10 times, we
expect on the average to see five heads. The number 5 comes from
multiplying the success probability (0.5) by the number of trials (10). This
method works in general. If we perform n Bernoulli trials, each with success
probability p, the mean number of successes is np. Therefore, if X ∼ Bin(n,
p), then μX = np.
We can verify this intuition by noting that X is the sum of n Bernoulli
variables, each with mean p. The mean of X is therefore the sum of the
means of the Bernoulli random variables that compose it, which is equal to
np. We can compute σ²X by again noting that X is the sum of independent
Bernoulli random variables and recalling that the variance of a Bernoulli
random variable is p(1 − p). The variance of X is therefore the sum of the
variances of the Bernoulli random variables that compose it, which is equal to
np(1 − p).
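The formulas np and np(1 − p) can also be checked against the definitions of mean and variance applied to the binomial probability mass function. The following Python sketch does so for the coin example (n = 10, p = 0.5); the function name is illustrative.

```python
from math import comb

def binomial_mean_and_variance(n, p):
    """Compute the mean and variance of Bin(n, p) directly from its pmf."""
    pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
    mean = sum(x * px for x, px in enumerate(pmf))
    variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))
    return mean, variance

mean, variance = binomial_mean_and_variance(10, 0.5)
print(round(mean, 6), round(variance, 6))  # compare with np = 5 and np(1 - p) = 2.5
```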
Using a Sample Proportion to
Estimate a Success
Probability
In many cases we do not know the success probability p associated with a certain
Bernoulli trial, and we wish to estimate its value. A natural way to do this is to conduct n
independent trials and count the number X of successes. To estimate the success
probability p we compute the sample proportion p̂:
p̂ = (number of successes)/(number of trials) = X/n
This notation follows a pattern that is important to know. The success probability, which
is unknown, is denoted by p. The sample proportion, which is known, is denoted p̂. The
“hat” (ˆ) indicates that p̂ is used to estimate the unknown value p.
EXAMPLE 4.11
A quality engineer is testing the calibration of a machine that packs ice
cream into containers. In a sample of 20 containers, 3 are underfilled.
Estimate the probability p that the machine underfills a container.
Solution:
The sample proportion of underfilled containers is p̂ = 3/20 = 0.15. We
estimate that the probability p that the machine underfills a container is
0.15 as well.
Uncertainty in the Sample
Proportion
It is important to realize that the sample proportion p̂ is just an estimate of
the success probability p, and in general, is not equal to p. If another
sample were taken, the value of p̂ would probably come out differently. In
other words, there is uncertainty in p̂. For p̂ to be useful, we must compute
its bias and its uncertainty. We now do this. Let n denote the sample size,
and let X denote the number of successes, where X ∼ Bin(n, p).
Since p̂ = X/n, it follows that μp̂ = μX/n = np/n = p, so p̂ is unbiased, and
σp̂ = σX/n = √(np(1 − p))/n = √(p(1 − p)/n). In practice p is unknown, so p̂ is
substituted for p when computing the uncertainty.
EXAMPLE 4.12
The safety commissioner in a large city wants to estimate the
proportion of buildings in the city that are in violation of fire codes. A
random sample of 40 buildings is chosen for inspection, and 4 of
them are found to have fire code violations. Estimate the proportion
of buildings in the city that have fire code violations, and find the
uncertainty in the estimate.
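A minimal Python sketch of Example 4.12, assuming the usual results that the sample proportion is p̂ = X/n and that its uncertainty is estimated by √(p̂(1 − p̂)/n). The function name is illustrative, not from the slides.

```python
from math import sqrt

def sample_proportion_with_uncertainty(successes, n):
    """Return (p_hat, estimated standard deviation of p_hat)."""
    p_hat = successes / n
    # p is unknown, so p_hat is substituted into sqrt(p(1 - p)/n).
    sigma_p_hat = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, sigma_p_hat

# 4 of 40 inspected buildings have fire code violations.
p_hat, sigma_p_hat = sample_proportion_with_uncertainty(4, 40)
print(round(p_hat, 2), round(sigma_p_hat, 3))
```

With these inputs the estimate is 0.10 with an uncertainty of about 0.047.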
EXAMPLE 4.13
In Example 4.12, approximately how many additional buildings must be
inspected so that the uncertainty in the sample proportion of buildings in
violation will be only 0.02?
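Example 4.13 can be approached by solving √(p̂(1 − p̂)/n) = 0.02 for n, with p̂ = 4/40 taken from Example 4.12. The sketch below assumes that approach and uses exact fractions so that rounding up is not disturbed by floating-point error; the function name is illustrative.

```python
from fractions import Fraction
from math import ceil

def required_sample_size(p_hat, target_uncertainty):
    """Smallest n with sqrt(p_hat (1 - p_hat) / n) <= target_uncertainty."""
    return ceil(p_hat * (1 - p_hat) / target_uncertainty**2)

p_hat = Fraction(4, 40)           # sample proportion from Example 4.12
already_inspected = 40
n_required = required_sample_size(p_hat, Fraction(2, 100))
additional = n_required - already_inspected
print(n_required, additional)
```

Under these assumptions a total of 225 buildings is needed, so about 185 additional buildings must be inspected.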
EXAMPLE 4.14
In a sample of 100 newly manufactured automobile tires, 7 are found to have
minor flaws in the tread. If four newly manufactured tires are selected at random
and installed on a car, estimate the probability that none of the four tires have a
flaw, and find the uncertainty in this estimate.
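One way to work Example 4.14 is sketched below in Python. It assumes the four tires are independent, so that the probability that none has a flaw is p⁴, where p is the probability that a single tire has no flaw, and it assumes the first-order propagation of error approximation σ ≈ |d(p⁴)/dp| σp̂. Neither the function name nor this particular method is confirmed by the slides.

```python
from math import sqrt

def estimate_power_of_proportion(successes, n, k):
    """Estimate p**k and its uncertainty from `successes` out of `n` trials."""
    p_hat = successes / n
    sigma_p_hat = sqrt(p_hat * (1 - p_hat) / n)
    estimate = p_hat**k
    # First-order propagation of error: d/dp (p^k) = k p^(k-1).
    sigma_estimate = abs(k * p_hat ** (k - 1)) * sigma_p_hat
    return estimate, sigma_estimate

# 93 of the 100 sampled tires have no flaw; four tires go on the car.
estimate, sigma_estimate = estimate_power_of_proportion(93, 100, 4)
print(round(estimate, 4), round(sigma_estimate, 3))
```

With these assumptions the estimate is roughly 0.748 with an uncertainty of roughly 0.082.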
4.3 The Poisson
Distribution
The Poisson distribution arises frequently in scientific
work. One way to think of the Poisson distribution is as an
approximation to the binomial distribution when n is large
and p is small.
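The approximation can be illustrated numerically. The Python sketch below uses the standard Poisson probability mass function, p(x) = e^(−λ) λ^x / x!, and compares it with Bin(n, λ/n) for increasing n. The value λ = 3 and the chosen values of n are hypothetical choices for illustration.

```python
from math import comb, exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam), x a non-negative integer."""
    return exp(-lam) * lam**x / factorial(x)

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p), 0 <= x <= n."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

lam = 3.0
gaps = []
for n in (10, 100, 10000):
    p = lam / n  # keep np = lam fixed while n grows and p shrinks
    gap = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(11))
    gaps.append(gap)
    print(n, gap)
```

The largest difference between the two probability mass functions shrinks as n grows with np held fixed, which is the sense in which the Poisson distribution approximates the binomial.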
EXAMPLE 4.15
If X ∼ Poisson(3), compute P(X = 2), P(X = 10), P(X = 0), P(X = −1), and P(X =
0.5).
EXAMPLE 4.16
If X ∼ Poisson(4), compute P(X ≤ 2) and P(X > 1).
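Examples 4.15 and 4.16 can be worked from the standard Poisson probability mass function, p(x) = e^(−λ) λ^x / x! for x = 0, 1, 2, ..., and p(x) = 0 for any other x. The following Python sketch is one way to do this; the helper name `poisson_pmf` is illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam); zero unless x is a non-negative integer."""
    if x < 0 or x != int(x):
        return 0.0
    return exp(-lam) * lam ** int(x) / factorial(int(x))

# Example 4.15: X ~ Poisson(3)
example_4_15 = {x: poisson_pmf(x, 3) for x in (2, 10, 0, -1, 0.5)}

# Example 4.16: X ~ Poisson(4)
p_at_most_2 = sum(poisson_pmf(k, 4) for k in range(3))
p_more_than_1 = 1 - sum(poisson_pmf(k, 4) for k in range(2))

print(example_4_15)
print(round(p_at_most_2, 4), round(p_more_than_1, 4))
```

The probabilities at −1 and 0.5 are zero because a Poisson random variable takes only non-negative integer values. For Example 4.16 the results are approximately 0.2381 and 0.9084.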
The Mean and Variance of a Poisson
Random Variable
If X ∼ Poisson(λ), we can think of X as a binomial random variable with large
n, small p, and np = λ. Since the mean of a binomial random variable is np, it
follows that the mean of a Poisson random variable is λ. The variance of a
binomial random variable is np(1 − p). Since p is very small, we replace 1 − p
with 1, and conclude that the variance of a Poisson random variable is np = λ.
Note that the variance of a Poisson random variable is equal to its mean.
EXAMPLE 4.18
Particles are suspended in a liquid medium at a concentration of 6 particles per
mL. A large volume of the suspension is thoroughly agitated, and then 3 mL are
withdrawn. What is the probability that exactly 15 particles are withdrawn?
Solution:
Let X represent the number of particles withdrawn. The mean number of
particles in a 3 mL volume is 18. Therefore X ∼ Poisson(18). The
probability that exactly 15 particles are withdrawn is
P(X = 15) = e^(−18) · 18¹⁵/15! = 0.0786
EXAMPLE 4.19
Grandma bakes chocolate chip cookies in batches of 100. She puts 300 chips
into the dough. When the cookies are done, she gives you one. What is the
probability that your cookie contains no chocolate chips?
Solution:
This is another instance of particles in a suspension. Let X represent the
number of chips in your cookie. The mean number of chips is 3 per cookie,
so X ∼ Poisson(3). It follows that
P(X = 0) = e^(−3) · 3⁰/0! = 0.0498
EXAMPLE 4.20
Grandma’s grandchildren have been complaining that Grandma is too stingy with the
chocolate chips. Grandma agrees to add enough chips to the dough so that only 1 % of
the cookies will contain no chips. How many chips must she include in a batch of 100
cookies to achieve this?
Solution:
Let n be the number of chips to include in a batch of 100 cookies, and let X be the number of
chips in your cookie. The mean number of chips is 0.01n per cookie, so X ∼ Poisson(0.01n).
We must find the value of n for which P(X = 0) = 0.01. Using the Poisson(0.01n) probability
mass function, P(X = 0) = e^(−0.01n) = 0.01. Therefore 0.01n = −ln(0.01) ≈ 4.605, so n ≈ 461.
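The calculation in Example 4.20 can be sketched in Python. The code assumes, as in the solution above, that the number of chips per cookie is Poisson with mean n/100, so that P(X = 0) = e^(−n/100); the function name is illustrative.

```python
from math import ceil, exp, log

def chips_needed(n_cookies, target_prob_no_chips):
    """Solve exp(-n / n_cookies) = target for n, and round up to whole chips."""
    exact = -n_cookies * log(target_prob_no_chips)
    return exact, ceil(exact)

exact_n, whole_chips = chips_needed(100, 0.01)
prob_no_chips = exp(-whole_chips / 100)  # check the rounded-up answer
print(round(exact_n, 1), whole_chips, prob_no_chips <= 0.01)
```

Rounding up rather than to the nearest integer guarantees that the probability of a chipless cookie does not exceed the 1% target.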
Using the Poisson
Distribution to Estimate a
Rate
EXAMPLE 4.23
A suspension contains particles at an unknown
concentration of λ per mL. The suspension is thoroughly
agitated, and then 4 mL are withdrawn and 17 particles are
counted. Estimate λ.
Uncertainty in the
Estimated Rate
EXAMPLE 4.24
A 5 mL sample of a suspension is withdrawn, and 47 particles are counted.
Estimate the mean number of particles per mL, and find the uncertainty in
the estimate.
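Examples 4.23 and 4.24 can be sketched as follows, assuming the usual results for a Poisson count X observed over t units: the rate estimate is λ̂ = X/t, and its uncertainty is √(λ/t), estimated by substituting λ̂ for λ. The function name `estimate_rate` is illustrative.

```python
from math import sqrt

def estimate_rate(count, amount):
    """Estimate a Poisson rate from `count` events observed in `amount` units.

    Returns (rate_hat, estimated uncertainty of rate_hat).
    """
    rate_hat = count / amount
    sigma_rate_hat = sqrt(rate_hat / amount)
    return rate_hat, sigma_rate_hat

# Example 4.23: 17 particles in 4 mL.
rate_4_23, _ = estimate_rate(17, 4)

# Example 4.24: 47 particles in 5 mL.
rate_4_24, sigma_4_24 = estimate_rate(47, 5)

print(rate_4_23, rate_4_24, round(sigma_4_24, 2))
```

Under these assumptions the estimates are 4.25 particles per mL for Example 4.23 and 9.4 particles per mL, with an uncertainty of about 1.4, for Example 4.24.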
EXAMPLE 4.25
A certain mass of a radioactive substance emits alpha particles at a mean
rate of λ particles per second. A physicist counts 1594 emissions in 100
seconds. Estimate λ, and find the uncertainty in the estimate.
EXAMPLE 4.26
In Example 4.25, for how many seconds should emissions be counted to
reduce the uncertainty to 0.3 emissions per second?
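Examples 4.25 and 4.26 follow the same pattern. The sketch below assumes the rate estimate λ̂ = X/t with uncertainty √(λ̂/t), and solves √(λ̂/t) = 0.3 for t to find the required counting time; the names are illustrative.

```python
from math import sqrt

def time_for_uncertainty(rate_hat, target_uncertainty):
    """Solve sqrt(rate_hat / t) = target_uncertainty for t."""
    return rate_hat / target_uncertainty**2

# Example 4.25: 1594 emissions in 100 seconds.
count, seconds = 1594, 100
rate_hat = count / seconds
sigma_rate_hat = sqrt(rate_hat / seconds)

# Example 4.26: counting time needed for an uncertainty of 0.3 per second.
seconds_needed = time_for_uncertainty(rate_hat, 0.3)

print(rate_hat, round(sigma_rate_hat, 2), round(seconds_needed, 1))
```

With these assumptions the rate is about 15.94 emissions per second with uncertainty about 0.40, and roughly 177 seconds of counting are needed to bring the uncertainty down to 0.3.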
4.5 The Normal
Distribution
- THE NORMAL DISTRIBUTION (ALSO CALLED THE GAUSSIAN DISTRIBUTION)
IS BY FAR THE MOST COMMONLY USED
DISTRIBUTION IN STATISTICS.
- THIS DISTRIBUTION PROVIDES A GOOD MODEL
FOR MANY, ALTHOUGH NOT ALL, CONTINUOUS
POPULATIONS.
- IT IS A CONTINUOUS, SYMMETRIC, BELL-SHAPED
DISTRIBUTION OF A VARIABLE.
THE PROBABILITY DENSITY FUNCTION OF A
NORMAL RANDOM VARIABLE WITH MEAN μ AND
VARIANCE σ² IS
f(x) = [1/(σ√(2π))] e^(−(x − μ)²/(2σ²))
PROBABILITY DENSITY FUNCTION OF A
NORMAL RANDOM VARIABLE WITH MEAN μ
AND VARIANCE σ².
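The normal density and the corresponding probabilities can be evaluated with the Python standard library. The sketch below uses the standard formula f(x) = [1/(σ√(2π))] e^(−(x − μ)²/(2σ²)) and the error function for the cumulative distribution. The population with mean 50 and standard deviation 5 is a hypothetical example, not one from the slides.

```python
from math import erf, exp, pi, sqrt

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal random variable with mean mu and variance sigma**2."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2 * pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a normal random variable, via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Hypothetical population: mean 50, standard deviation 5.
mu, sigma = 50.0, 5.0
z_score = (60.0 - mu) / sigma            # standard units (z-score) for x = 60
p_below_60 = normal_cdf(60.0, mu, sigma)
print(z_score, round(p_below_60, 4))
```

Converting to a z-score and using the standard normal distribution gives the same probability as evaluating the cumulative distribution with μ and σ directly.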
EXAMPLES:
STANDARD NORMAL POPULATION
4.7 The
Exponential
Distribution
- The exponential distribution is a continuous distribution that is
sometimes used to model the time that
elapses before an event occurs.
- It is sometimes used to model the lifetime of a
component.
-The probability density function of the
exponential distribution involves a
parameter, which is a positive constant λ
whose value determines the density
function’s location and shape.
-If X is a random variable whose distribution
is exponential with parameter λ, we write X
∼ Exp(λ).
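For reference, the standard exponential density is f(x) = λe^(−λx) for x > 0 (and 0 otherwise), with cumulative distribution function F(x) = 1 − e^(−λx) for x > 0. The Python sketch below evaluates both; the rate λ = 0.5 and the time x = 3 are hypothetical values chosen for illustration.

```python
from math import exp

def exponential_pdf(x, lam):
    """Density of X ~ Exp(lam): lam * exp(-lam * x) for x > 0, else 0."""
    return lam * exp(-lam * x) if x > 0 else 0.0

def exponential_cdf(x, lam):
    """P(X <= x) for X ~ Exp(lam): 1 - exp(-lam * x) for x > 0, else 0."""
    return 1.0 - exp(-lam * x) if x > 0 else 0.0

lam = 0.5                                # hypothetical rate parameter
p_within_3 = exponential_cdf(3.0, lam)   # P(X <= 3)
p_beyond_3 = 1.0 - p_within_3            # P(X > 3) = exp(-lam * 3)
print(round(p_within_3, 4), round(p_beyond_3, 4))
```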
- The mean and variance of an exponential
random variable can be computed by using
integration by parts. If X ∼ Exp(λ), then
μX = 1/λ and σ²X = 1/λ².
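The standard results μX = 1/λ and σ²X = 1/λ² can be checked without doing the integration by parts by hand. The Python sketch below approximates the integrals of x·f(x) and x²·f(x) with a simple midpoint rule; the rate λ = 2, the number of steps, and the truncation point are hypothetical choices made for illustration.

```python
from math import exp

def exponential_moments(lam, steps=200_000):
    """Approximate the mean and variance of Exp(lam) by midpoint-rule integration."""
    upper = 50.0 / lam   # the density is negligible beyond this point
    h = upper / steps
    mean = 0.0
    second_moment = 0.0
    for i in range(steps):
        x = (i + 0.5) * h                 # midpoint of the i-th subinterval
        density = lam * exp(-lam * x)
        mean += x * density * h
        second_moment += x * x * density * h
    return mean, second_moment - mean**2

mean, variance = exponential_moments(2.0)
print(round(mean, 4), round(variance, 4))  # compare with 1/lam and 1/lam**2
```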
EXAMPLE 4.56
EXAMPLE 4.57
EXAMPLE 4.58
• Lack of Memory Property
- This means that the probability of waiting an
additional time ‘t’ does not depend on how
much time has already passed. In other words,
the distribution "forgets" the past.
EXAMPLE 4.59
EXAMPLE 4.60
-The exponential distribution has the memoryless property:
the probability of waiting an additional ‘t’ units, given that
you have already waited ‘s’ units, is the same as the
probability of waiting ‘t’ units from the start. In other words,
the process does not "remember" how long you have already
waited. For example, if a component’s lifetime is
exponentially distributed, the chance it lasts ‘t’ more units of
time is the same whether it is new or already ‘s’ units old; its
age does not affect future probabilities.
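The memoryless property can be verified numerically from the survival function P(T > t) = e^(−λt) of an exponential random variable. In the Python sketch below, the rate λ = 0.2, the additional time t = 3, and the ages s are hypothetical values.

```python
from math import exp

def survival(t, lam):
    """P(T > t) for T ~ Exp(lam)."""
    return exp(-lam * t)

def conditional_survival(s, t, lam):
    """P(T > s + t | T > s) = P(T > s + t) / P(T > s)."""
    return survival(s + t, lam) / survival(s, lam)

lam, t = 0.2, 3.0
fresh = survival(t, lam)   # probability a new component lasts t more units
aged = [conditional_survival(s, t, lam) for s in (0.0, 1.0, 5.0, 20.0)]
print(fresh, aged)
```

Every entry of `aged` matches `fresh` up to floating-point rounding, because e^(−λ(s+t)) / e^(−λs) = e^(−λt) regardless of s.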
EXAMPLE 4.61
EXAMPLE 4.62
4.11
THE CENTRAL LIMIT
THEOREM
The Central Limit Theorem is a fundamental concept in statistics. It
states that:
• When you take a large enough sample size (n) from a population, the
average of that sample will approximately follow a normal distribution
(bell-shaped curve).
• This is true regardless of the shape of the population's original distribution.
IMPORTANCE:
• Allows us to use statistical methods that assume normal distributions.
• Enables calculation of probabilities for sample averages using standard
tables (z-table).
EXAMPLE OF THE CLT APPLICATION:
Getting the average height of adults in a city. We take random
samples of 100 adults each from the population.
1. Take many random samples from a population (e.g., heights).
2. Calculate the average ( sample mean ) of each sample.
3. The averages will form an approximately normal distribution (bell-shaped
curve), even if the original population isn't normal.
IMPORTANCE: Allows us to make predictions and estimates about
the population average using statistical methods.
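The three steps above can be sketched as a small simulation in Python. Instead of heights, the sketch draws from a strongly skewed population, an exponential distribution with mean 1 and standard deviation 1, which is a hypothetical choice meant to show that the population need not be normal. The sample size, number of samples, and seed are likewise illustrative.

```python
import random
from math import sqrt

def simulate_sample_means(sample_size, n_samples, seed=12345):
    """Draw n_samples samples of the given size and return their means."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(sum(sample) / sample_size)
    return means

sample_size, n_samples = 100, 2000
means = simulate_sample_means(sample_size, n_samples)

grand_mean = sum(means) / n_samples
sd_of_means = sqrt(sum((m - grand_mean) ** 2 for m in means) / (n_samples - 1))
predicted_sd = 1.0 / sqrt(sample_size)   # sigma / sqrt(n), with sigma = 1 here

# For a normal distribution, about 68% of values lie within one standard deviation.
within_one_sd = sum(abs(m - 1.0) <= predicted_sd for m in means) / n_samples
print(grand_mean, sd_of_means, within_one_sd)
```

The sample means should cluster around the population mean with spread close to σ/√n, and the share within one standard deviation should be near the 68% expected for a normal curve.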
The Central Limit Theorem says that the sample mean X̄ and the sample sum Sn are
approximately normally distributed, if the sample size n is large enough.
The natural question to ask is: How large is large enough?
Note that two of the original distributions are continuous, and one is discrete.
The Central Limit Theorem holds for both continuous and discrete distributions.
EXAMPLE 4.70
EXAMPLE 4.72
NORMAL APPROXIMATION TO THE BINOMIAL
When we have a binomial distribution with a large sample size, we
can use the normal distribution to approximate it. This is useful
because normal distributions are easier to work with, especially
for calculating probabilities.
THE CONTINUITY CORRECTION
When we use a normal distribution to approximate a
binomial distribution, we are using a continuous
curve to estimate a discrete set of values. To improve
the accuracy of this approximation, we use something
called the continuity correction.
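The effect of the continuity correction can be seen in a small Python sketch. It assumes the usual approach: X ∼ Bin(n, p) is approximated by a normal distribution with mean np and variance np(1 − p), and the interval a ≤ X ≤ b is widened by 0.5 on each side before computing the normal probability. The values n = 100, p = 0.5, a = 45, and b = 55 are hypothetical.

```python
from math import comb, erf, sqrt

def normal_cdf(x, mu, sigma):
    """P(Y <= x) for a normal random variable with mean mu and sd sigma."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def binomial_interval_exact(a, b, n, p):
    """Exact P(a <= X <= b) for X ~ Bin(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(a, b + 1))

n, p = 100, 0.5
a, b = 45, 55
mu = n * p
sigma = sqrt(n * p * (1 - p))

exact = binomial_interval_exact(a, b, n, p)
uncorrected = normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)
corrected = normal_cdf(b + 0.5, mu, sigma) - normal_cdf(a - 0.5, mu, sigma)
print(exact, uncorrected, corrected)
```

In this example the corrected approximation lands much closer to the exact binomial probability than the uncorrected one, because the half-unit adjustment accounts for the probability mass sitting at the endpoints 45 and 55.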
EXAMPLE 4.73
EXAMPLE 4.74
EXAMPLE 4.75
ACCURACY OF THE CONTINUITY CORRECTION
NORMAL APPROXIMATION TO THE POISSON
EXAMPLE 4.76
THANK YOU!