Unit IV
Theoretical Probability Distributions
Objectives
• After studying this unit, you should be able to:
1. Identify the situations where discrete probability distributions can be applied.
2. Calculate binomial probabilities and find the mean and standard deviation of a binomial variable.
3. Calculate Poisson probabilities and find the mean and standard deviation of a Poisson variable.
4. Identify the situations where continuous probability distributions can be applied.
5. Identify properties of normal distribution.
6. Determine probabilities based on z-scores.
Introduction
• In this chapter we make precise the basic elements of the theory of distributions. Under certain given
conditions, observed (empirical) frequency distributions can be approximated by standard theoretical
distributions, which help decision makers in many business decisions. In a population, the values of a
variable may be distributed according to some law of probability; such distributions are termed
‘Theoretical Probability Distributions’. They describe the behavior of the variable under certain known or
given conditions. Every probability distribution can be classified as discrete or continuous, depending on
whether it defines probabilities for a discrete variable or for a continuous variable. In this chapter we
discuss the following important discrete and continuous distributions:
• Discrete Probability Distributions:
• Binomial Distribution
• Poisson Distribution
• Continuous Probability distribution:
• Normal Distribution
Discrete Probability Distributions
• A discrete probability distribution is used to calculate probabilities for a countable number of occurrences of an event.
Types of discrete probability distribution:
There are two types of discrete probability distribution.
Finite discrete probability distribution: A finite discrete distribution is used to calculate probabilities for a finite
number of values.
• For example, when a die is rolled there are 6 possible outcomes, so the total number of outcomes is the finite number 6.
• The binomial distribution is a finite discrete probability distribution.
Infinite discrete probability distribution: An infinite discrete distribution is used to calculate probabilities for a
countably infinite number of values.
• For example, two dice are rolled together repeatedly until both show a six; the number of rolls required has no upper limit, so there are infinitely many possible outcomes.
• The Poisson distribution is an infinite discrete probability distribution.
Binomial Distribution
• The binomial distribution is also known as the ‘Bernoulli distribution’ after the Swiss mathematician James Bernoulli, who discovered it in
1700; it was first published in 1713, eight years after his death.
• Many types of probability problems have only two outcomes, or they can be reduced to two outcomes. For example, when a coin is
tossed, it can land heads or tails. When a baby is born, it will be either male or female. In a basketball game, a team either wins or
loses. A true-false item can be answered in only two ways, true or false. Other situations can be reduced to two outcomes. For
example, a medical treatment can be classified as effective or ineffective, depending on the results. A person can be classified as
having normal or abnormal blood pressure, depending on the measure of the blood pressure gauge. A multiple-choice question, even
though there are four or five answer choices, can be classified as correct or incorrect. Situations like these are called binomial
experiments.
• This distribution can be used under the following conditions.
1. The random experiment is performed repeatedly a finite number of times, i.e., n, the number of trials, is finite and fixed.
2. The outcome of each trial results in a dichotomous classification of events, i.e., the outcomes of each trial
may be classified into two mutually disjoint categories, called success and failure.
3. All the trials are independent, i.e., the result of any trial is not affected in any way by the preceding trials and does not affect the
results of succeeding trials.
4. The probability of success in any trial is p and is constant for each trial. Then q = 1 − p is termed the probability of failure and
is also constant for each trial.
Suppose an experiment is repeated n times and each trial is independent.
Let us assume that each trial results in two possible mutually exclusive and exhaustive outcomes, i.e., success
and failure.
Let X be the random variable representing the total number of successes in n trials. Let the probability of success in each
trial be p and the probability of failure be q = 1 − p, with p remaining constant from trial to trial.
Now we have to find the probability of x successes in n trials.
Let us suppose that a particular order of outcomes of x successes in n repetitions be as follows:
S S S S S F F F S S F S … F S (x number of successes and n-x failures)
Since the trials are all independent, the probability of the joint occurrence of these events is
p p p p p q q q p p q p ... q p
= (p p p ... x times) (q q q ... (n − x) times)
= $p^x q^{n-x}$
Further, in a series of n trials, x successes and n − x failures can occur in ${}^{n}C_{x}$ ways, each with the same probability $p^x q^{n-x}$. So the required probability
of x successes in n trials is
$P(X = x) = {}^{n}C_{x}\, p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$
This is called the probability distribution of the binomial random variable X, or simply the binomial distribution.
Definition: A random variable X is said to follow a binomial distribution if its probability function is given
by
$P(X = x) = {}^{n}C_{x}\, p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$
where p + q = 1.
➢ n and p are called the parameters of the binomial distribution.
➢ X is called a binomial variable.
➢ Symbolically, for a binomial random variable X with parameters n and p we use the notation X ~ B(n, p); this
can also be written as B(X; n, p).
➢ The total of the probabilities of the binomial distribution is unity.
➢ Mean of the binomial distribution:
The mean of the binomial distribution is $E(X) = \sum_{x=0}^{n} x\,P(X = x) = np$
➢ Variance of the binomial distribution:
The variance of the binomial distribution is $V(X) = E(X^2) - [E(X)]^2 = npq$
Note: In the binomial distribution, mean = np and variance = npq. Since p + q = 1, we have q < 1 (for p > 0), so npq < np, i.e., mean > variance.
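The formulas above can be checked numerically. The following is a minimal sketch, assuming illustrative values n = 10 and p = 0.3 that do not come from the text; the function names are hypothetical. It evaluates the probability function directly and recovers the total probability, mean and variance by summation.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p): nCx * p^x * q^(n - x)."""
    q = 1.0 - p
    return comb(n, x) * p**x * q ** (n - x)


def binomial_moments(n: int, p: float) -> tuple[float, float]:
    """Mean and variance obtained by summing over x = 0, 1, ..., n."""
    probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
    mean = sum(x * w for x, w in enumerate(probs))
    second_moment = sum(x * x * w for x, w in enumerate(probs))
    return mean, second_moment - mean**2


n, p = 10, 0.3  # illustrative values (assumption, not from the text)
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))
mean, variance = binomial_moments(n, p)
print(f"total probability = {total:.6f}")
print(f"mean     = {mean:.4f}  (np  = {n * p:.4f})")
print(f"variance = {variance:.4f}  (npq = {n * p * (1 - p):.4f})")
```

The summed mean and variance should agree with np and npq up to floating-point rounding, and the mean exceeds the variance as stated in the note.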
Properties of binomial distribution
(1) Binomial distribution is a discrete probability distribution with two parameters n and p and finite range
from 0 to n.
(2) The mean and the variance of the binomial distribution are np and npq respectively, and mean > variance.
(3) The measure of skewness of the binomial distribution is $\beta_1 = \dfrac{(1 - 2p)^2}{npq}$.
(4) If $p = \tfrac{1}{2}$, the distribution is symmetric.
If $p < \tfrac{1}{2}$, the distribution is positively skewed.
If $p > \tfrac{1}{2}$, the distribution is negatively skewed.
(5) The measure of kurtosis of the binomial distribution, $\beta_2 = 3 + \dfrac{1 - 6pq}{npq}$, reveals that as the number of trials n → ∞, the
distribution tends to be mesokurtic ($\beta_2 \to 3$).
(6) If $pq = \tfrac{1}{6}$, the distribution is mesokurtic.
If $pq < \tfrac{1}{6}$, the distribution is leptokurtic.
If $pq > \tfrac{1}{6}$, the distribution is platykurtic.
Also, for a symmetric binomial distribution, i.e. for $p = \tfrac{1}{2}$, the kurtosis of the binomial distribution is $\beta_2 = 3 - \tfrac{2}{n}$.
(7) For $p = \tfrac{1}{2}$, the binomial distribution has maximum probability at $x = \tfrac{n}{2}$ if n is even, and at $x = \tfrac{n-1}{2}$ and
$x = \tfrac{n+1}{2}$ if n is odd.
(8) Under certain conditions the binomial distribution approaches the Poisson and normal distributions.
(9) Additive or reproductive property of binomial distribution is :
If X~ B (n1, p) and Y~ B (n2, p) and X and Y are Independent, then X+Y ~ B (n1 + n2, p).
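The shape measures in properties (3) to (6) can be verified from the central moments. The sketch below assumes the standard definitions β1 = μ3²/μ2³ and β2 = μ4/μ2², and compares them with the closed forms (1 − 2p)²/(npq) and 3 + (1 − 6pq)/(npq); the latter is the standard textbook expression and is an assumption here where the text does not print it in full. The (n, p) pairs and function names are illustrative.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def central_moment(n: int, p: float, k: int) -> float:
    """k-th central moment of B(n, p), by direct summation."""
    probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
    mean = sum(x * w for x, w in enumerate(probs))
    return sum((x - mean) ** k * w for x, w in enumerate(probs))


def beta1(n: int, p: float) -> float:
    """Skewness measure: mu3^2 / mu2^3."""
    return central_moment(n, p, 3) ** 2 / central_moment(n, p, 2) ** 3


def beta2(n: int, p: float) -> float:
    """Kurtosis measure: mu4 / mu2^2."""
    return central_moment(n, p, 4) / central_moment(n, p, 2) ** 2


for n, p in [(20, 0.2), (20, 0.5), (20, 0.8)]:  # illustrative values
    q = 1 - p
    print(f"n={n}, p={p}: "
          f"beta1={beta1(n, p):.4f} vs {(1 - 2 * p) ** 2 / (n * p * q):.4f}, "
          f"beta2={beta2(n, p):.4f} vs {3 + (1 - 6 * p * q) / (n * p * q):.4f}")
```

For p = 0.5 the skewness measure vanishes and β2 equals 3 − 2/n, in line with the symmetric case described above.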
Uses of Binomial Distribution
(1) It has a major application in industrial quality control, where items are
classified as defective or non-defective.
(2) This distribution is used in public opinion surveys, where the
voters may be in favor of or against a candidate.
(3) This distribution is also used in market research, where a consumer may prefer the
product of brand A or brand B.
(4) This distribution is used in medical research, where a particular drug might or might not cure a
person.
(5) This distribution is also used in economic surveys, where respondents are for or against
a certain economic policy of the government.
Example 2: If a student randomly guesses at five multiple-choice questions, find the probability that the
student gets exactly three correct. Each question has five possible choices.
Solution: In this case n = 5, x = 3, and p = 1/5, since there is one chance in five of guessing a correct
answer. Then,
$P(X = 3) = {}^{5}C_{3}\,(1/5)^3 (4/5)^2 = 10 \times 0.008 \times 0.64 = 0.0512$
Example 3: A survey from Teenage Research Unlimited (Northbrook, Ill.) found that 30% of teenage
consumers receive their spending money from part-time jobs. If five teenagers are selected at random, find
the probability that at least three of them will have part-time jobs.
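Solution: Here n = 5 and p = 0.30. To find the probability that at least three have a part-time job, it is necessary to find the individual
probabilities for 3, 4, and 5 and then add them to get the total probability.
$P(X = 3) = {}^{5}C_{3}\,(0.3)^3 (0.7)^2 = 0.132$
$P(X = 4) = {}^{5}C_{4}\,(0.3)^4 (0.7)^1 = 0.028$
$P(X = 5) = {}^{5}C_{5}\,(0.3)^5 (0.7)^0 = 0.002$
Hence,
P (at least three teenagers have part-time jobs) = 0.132 + 0.028 + 0.002 = 0.162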
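Both worked examples can be reproduced in a few lines. This is a minimal sketch; the function and variable names are hypothetical. Because it does not round the intermediate terms, the second result differs slightly from the hand calculation in the third decimal place.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Guessing on five multiple-choice questions, five choices each: P(X = 3)
p_three_correct = binomial_pmf(3, 5, 1 / 5)

# Five teenagers, 30% with part-time jobs: P(X >= 3) = P(3) + P(4) + P(5)
p_at_least_three = sum(binomial_pmf(x, 5, 0.30) for x in range(3, 6))

print(round(p_three_correct, 4))
print(round(p_at_least_three, 4))
```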
Poisson Distribution
The Poisson distribution was discovered by the French mathematician and physicist Siméon Denis Poisson in 1837. He
derived it as a limiting case of the binomial distribution.
The Poisson distribution has widespread applications in areas such as analyzing traffic flow, fault prediction in
electric cables, defects occurring in manufactured objects such as castings, email messages arriving at your computer
and the prediction of randomly occurring events or accidents. Following are some examples of variables that follow a Poisson
distribution:
1. Number of raindrops falling in one minute.
2. Number of cars passing by you in an hour.
3. Number of visitors to a certain web site between 10:00 and 11:00 pm.
4. The number of children born blind per year in a large city.
5. The number of printing mistakes per page in a large volume of a book.
6. The number of air pockets in a glass sheet.
7. The number of accidents occurring annually at a busy crossing in a city.
8. The number of defective articles produced by a high-quality machine.
Definition: A discrete random variable X is said to follow a Poisson distribution if its probability mass
function is given by
$P(X = x) = P(x; m) = \dfrac{e^{-m} m^x}{x!}, \quad x = 0, 1, 2, 3, \ldots$
where e = 2.7183 and m > 0.
➢ Here m is called the parameter of the Poisson distribution. The Greek small letter λ (read as
‘lambda’) may be used as the parameter instead of m.
➢ X is called a Poisson random variable.
➢ Symbolically, for a Poisson random variable X with parameter m we use the notation X ~ P(m); this can
also be written as P(X; m).
➢ The total of the probabilities of the Poisson distribution is unity, i.e. $\sum_{x=0}^{\infty} P(x; m) = 1$
➢ Mean and variance of the Poisson distribution:
Mean = E(X) = m
Variance = V(X) = $E(X^2) - [E(X)]^2$ = m
Note: In the Poisson distribution the mean and the variance are equal; both are m.
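As a numerical check of the statements above, the sketch below evaluates the Poisson probability function and sums a truncated series. The value m = 2.5 and the cut-off of 100 terms are assumptions chosen for illustration (the neglected tail is negligible for small m); the function names are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(x: int, m: float) -> float:
    """P(X = x) for X ~ P(m): e^(-m) * m^x / x!."""
    return exp(-m) * m**x / factorial(x)


def poisson_summary(m: float, terms: int = 100) -> tuple[float, float, float]:
    """Total probability, mean and variance from the first `terms` values of x."""
    probs = [poisson_pmf(x, m) for x in range(terms)]
    total = sum(probs)
    mean = sum(x * w for x, w in enumerate(probs))
    second_moment = sum(x * x * w for x, w in enumerate(probs))
    return total, mean, second_moment - mean**2


m = 2.5  # illustrative parameter (assumption, not from the text)
total, mean, variance = poisson_summary(m)
print(f"total = {total:.6f}, mean = {mean:.6f}, variance = {variance:.6f}")
```

The total should be 1 and both the mean and the variance should equal m, up to rounding.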
Poisson distribution as the limiting case of Binomial distribution
The Poisson distribution can be obtained as a limiting case of the binomial distribution under the following conditions.
1. The number of trials n is indefinitely large, i.e., n → ∞.
2. The probability of success p in each trial is indefinitely small,
i.e., p → 0.
3. np = m is finite.
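The three conditions can be illustrated numerically. In this sketch the product np is held fixed at m = 2 (an illustrative assumption) while n grows and p = m/n shrinks; it reports the largest absolute difference between the binomial and Poisson probabilities over x = 0, ..., 10. The helper name max_gap is hypothetical.

```python
from math import comb, exp, factorial


def binomial_pmf(x: int, n: int, p: float) -> float:
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x: int, m: float) -> float:
    return exp(-m) * m**x / factorial(x)


def max_gap(n: int, m: float, upto: int = 10) -> float:
    """Largest |binomial - Poisson| probability over x = 0..upto, with p = m / n."""
    p = m / n
    return max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, m))
               for x in range(upto + 1))


m = 2.0  # np is held fixed at this value (illustrative)
for n in (10, 100, 1000, 10000):
    print(f"n = {n:>5}, p = {m / n:.4f}, largest gap = {max_gap(n, m):.6f}")
```

The gap should shrink steadily as n increases, which is the sense in which the binomial distribution tends to the Poisson distribution.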
Properties of Poisson distribution
The following are some of the properties of the Poisson distribution.
1. The Poisson distribution is a discrete probability distribution with a single parameter m.
2. Both the mean and the variance of the Poisson distribution are equal to m.
3. The distribution is positively skewed and leptokurtic.
4. It is the asymptotic form of the binomial distribution when p is small, n is large and np is finite.
5. The normal distribution is a limiting form of the Poisson distribution as m → ∞.
6. The distribution of rare events generally approximates a Poisson distribution.
7. If $X_1$ and $X_2$ are two independent Poisson variates with means $m_1$ and $m_2$ respectively, then
$X = X_1 + X_2$ is also a Poisson variate with mean $m_1 + m_2$.
Applications of Poisson distribution
This distribution is used to describe the behavior of rare events. For example:
1. It is widely used in waiting-line (queuing) problems in management studies.
2. It has wide applications in industrial quality control.
3. It is used in determining the number of deaths in a given period from a rare disease.
Example 1: A machine which is known to produce 1% defective components is used for a production run of
40 components. We wish to calculate the probability that two defective items are produced.
{Remark: Essentially we are assuming that X ~ B(40, 0.01) and are asking for P(X = 2). We use both the
binomial distribution and its Poisson approximation for comparison. This shows how the binomial
distribution converges to the Poisson distribution under the limiting conditions (large n, small p, finite np).}
Solution:
Using the binomial distribution, we have
$P(X = 2) = {}^{40}C_{2}\,(0.99)^{40-2}(0.01)^2$
$= \dfrac{40 \times 39}{2 \times 1}\,(0.99)^{38}(0.01)^2$
$= 0.0532$
Note that the arithmetic involved is unwieldy. Using the Poisson approximation with m = np = 40 × 0.01 = 0.4, we have
$P(X = 2) = \dfrac{e^{-0.4}(0.4)^2}{2!}$
$= 0.0536$
Note that the arithmetic involved is simpler and the approximation is reasonable.
Example 2: Suppose mass-produced needles are packed in boxes of 1000. It is believed that 1 needle in 2000 on
average is substandard. What is the probability that a box contains more than 2 defectives? Using the Poisson
distribution, calculate P(X = 0), P(X = 1) and P(X = 2), and hence calculate P(more than 2 defectives). ($e^{-0.5} = 0.6065$)
Solution:
Here m = np = 1000 × (1/2000) = 1/2 = 0.5
Now $P(X = 0) = \dfrac{e^{-0.5}(0.5)^0}{0!} = e^{-0.5}$
$P(X = 1) = \dfrac{e^{-0.5}(0.5)^1}{1!} = e^{-0.5}(0.5)$
$P(X = 2) = \dfrac{e^{-0.5}(0.5)^2}{2!} = e^{-0.5}(0.125)$
P(more than 2 defectives) = P(X > 2) = 1 − P(X ≤ 2)
= 1 − [P(X = 0) + P(X = 1) + P(X = 2)]
$= 1 - \left[e^{-0.5} + e^{-0.5}(0.5) + \dfrac{e^{-0.5}(0.5)^2}{2!}\right]$
$= 1 - e^{-0.5}\left[1 + 0.5 + \dfrac{(0.5)^2}{2}\right]$
$= 1 - e^{-0.5}(1.625)$
= 1 − 0.985612
= 0.014388
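The complement calculation in this example generalizes to any "more than k" question. The sketch below is one way to write it; poisson_cdf is a hypothetical helper name, not something defined in the text.

```python
from math import exp, factorial


def poisson_pmf(x: int, m: float) -> float:
    return exp(-m) * m**x / factorial(x)


def poisson_cdf(k: int, m: float) -> float:
    """P(X <= k) = P(X = 0) + P(X = 1) + ... + P(X = k)."""
    return sum(poisson_pmf(x, m) for x in range(k + 1))


# Needles example: n = 1000 per box, p = 1/2000, so m = np = 0.5
m = 1000 * (1 / 2000)
p_more_than_two = 1 - poisson_cdf(2, m)
print(f"m = {m}, P(X > 2) = {p_more_than_two:.6f}")
```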
Example 3:
In the manufacture of glassware, bubbles can occur in the glass, which reduces the status of the glassware to
that of a ‘second’. If, on average, one in every 1000 items produced has a bubble, calculate the probability
that exactly six items in a batch of three thousand are seconds. ($e^{-3} = 0.0498$)
Solution:
Let X = number of items with bubbles.
Since n = 3000 is large and p = 0.001 is very small, we can use the Poisson distribution with m = np =
3000 × 0.001 = 3.
P (exactly six items in a batch of three thousand are seconds) = P(X = 6)
$= \dfrac{e^{-3}\,3^6}{6!}$
≈ 0.0498 × 1.0125
≈ 0.05
The result means that we have about a 5% chance of finding exactly six seconds in a batch of three thousand
items of glassware.
Example 4:
A manufacturer produces light bulbs that are packed into boxes of 100. If quality control studies indicate that
0.5% of the light bulbs produced are defective, what percentage of the boxes will contain (a) no defectives, (b)
2 or more defectives? ($e^{-0.5} = 0.6065$)
Solution:
As n is large and p, the probability of a defective bulb, is small, we use the Poisson distribution.
Here, X = number of defective bulbs in a box, and
X ~ P(m) where m = n × p = 100 × 0.005 = 0.5
Hence,
(a) $P(X = 0) = \dfrac{e^{-0.5}(0.5)^0}{0!} = 0.6065 \approx 61\%$
(b) P(X = 2 or more) = P(X = 2) + P(X = 3) + P(X = 4) + . . .
But it is easier to consider
P(X ≥ 2) = 1 − [P(X = 0) + P(X = 1)]
$P(X = 1) = \dfrac{e^{-0.5}(0.5)^1}{1!} = 0.3033$
i.e., P(X ≥ 2) = 1 − [0.6065 + 0.3033] = 0.0902 ≈ 9%
Example 5: As an example of a waiting-for-occurrence application, consider a telephone operator who, on
average, handles five calls every 3 minutes. What is the probability that there will be no calls in the next minute?
At least two calls? ($e^{-5/3} = 0.189$)
Solution: Let X = number of calls in a minute.
Here X has a Poisson distribution with E(X) = m = 5/3.
P (no calls in the next minute) = P(X = 0) = $\dfrac{e^{-5/3}(5/3)^0}{0!}$
= 0.189
P (at least two calls in the next minute) = P(X ≥ 2)
= 1 − P(X = 0) − P(X = 1)
$= 1 - 0.189 - \dfrac{e^{-5/3}(5/3)^1}{1!}$
= 1 − 0.189 − 0.315
= 0.496
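The key step in this example is converting "five calls every 3 minutes" into an expected count for a one-minute window. The sketch below makes that step explicit, assuming calls arrive at a constant average rate so that the mean scales in proportion to the window length; expected_count is a hypothetical helper.

```python
from math import exp, factorial


def poisson_pmf(x: int, m: float) -> float:
    return exp(-m) * m**x / factorial(x)


def expected_count(events: float, per_minutes: float, window_minutes: float) -> float:
    """Poisson mean for a window, assuming a constant average arrival rate."""
    return events / per_minutes * window_minutes


m = expected_count(5, 3, 1)  # five calls per 3 minutes, 1-minute window
p_no_calls = poisson_pmf(0, m)
p_at_least_two = 1 - poisson_pmf(0, m) - poisson_pmf(1, m)
print(round(m, 4), round(p_no_calls, 3), round(p_at_least_two, 3))
```

Changing the last argument of expected_count gives the corresponding probabilities for other window lengths under the same assumption.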
Continuous Probability Distributions
We have already discussed discrete probability distributions, where the random
variable takes either a finite or a countably infinite number of values. Now we discuss
probability distributions in which the random variable can take any value within a given interval.
A continuous probability distribution differs from a discrete probability distribution in several
ways.
• The probability that a continuous random variable will assume a particular value is zero.
• As a result, a continuous probability distribution cannot be expressed in tabular form.
• Instead, an equation or formula is used to describe a continuous probability distribution.
• The probability that a continuous variable lies in an interval is the area under its density
curve over that interval.
Normal Distribution
It is the most widely used probability distribution for continuous random variables. It is also the most important
continuous probability distribution, because much of the data relating to business, social or physical situations
conforms to, or can be approximated by, this distribution. A normal distribution can also serve as a
satisfactory approximation to the binomial and the Poisson distributions.
The normal distribution was first discovered by De Moivre in 1733 and was also known to Laplace in 1774. Later
it was derived by Carl Friedrich Gauss in 1809, who used it for the study of errors in astronomy. The
credit for the normal distribution has been given to Gauss, and it is often called the ‘Gaussian distribution’.
Definition: A continuous random variable X is said to have a normal distribution with parameters μ and σ²
if its density function is given by the probability law
$f(x; \mu, \sigma^2) = N(\mu, \sigma^2) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left[-\dfrac{1}{2}\left(\dfrac{x - \mu}{\sigma}\right)^2\right]$
where −∞ < x < ∞, −∞ < μ < ∞, σ > 0, e = 2.7183 and π = 3.1416.
Here μ and σ² are the mean and variance of the normal distribution respectively.
Note: A random variable X with mean μ and variance σ² that follows the normal distribution is
denoted by X ~ N(μ, σ²).
Properties of normal distribution:
The following points are important properties of normal distribution.
1. The normal curve is symmetrical and bell shaped. The range of the
distribution is −∞ to ∞.
2. The values of the mean, median and mode coincide, as the distribution is
symmetrical,
i.e., mean = median = mode.
3. The parameters μ and σ² represent the mean and variance of the
distribution. For different values of the parameters we get different normal distributions.
4. It has only one mode, i.e. the distribution is unimodal, and the mode occurs at x = μ.
5. The skewness of the distribution is $\beta_1 = 0$ and the kurtosis is $\beta_2 = 3$.
6. The mean deviation from the mean is $\sigma\sqrt{2/\pi} \approx \tfrac{4}{5}\sigma$.
7. The total area bounded by the curve and the horizontal axis is equal to 1.
8. P[a < X ≤ b] = area bounded by the curve and the ordinates at a and b; in particular,
(i) P [μ − σ < X ≤ μ + σ] = 0.6827 = 68.27%
(ii) P [μ − 2σ < X ≤ μ + 2σ] = 0.9545 = 95.45%
(iii) P [μ − 3σ < X ≤ μ + 3σ] = 0.9973 = 99.73%
9. The maximum ordinate occurs at x = μ and its value is $\dfrac{1}{\sigma\sqrt{2\pi}}$.
10. The quartile deviation is $\dfrac{Q_3 - Q_1}{2} = 0.6745\,\sigma$.
11. The first quartile is $Q_1 = \mu - 0.6745\,\sigma$ and the third quartile is $Q_3 = \mu + 0.6745\,\sigma$.
12. The coefficient of quartile deviation is $\dfrac{Q_3 - Q_1}{Q_3 + Q_1} = 0.6745\,\dfrac{\sigma}{\mu}$.
13. If X is a normal variate with mean μ and standard deviation σ, then the distribution of $Z = \dfrac{X - \mu}{\sigma}$ is
also normal with mean 0 and variance 1. Here Z is called the standard normal variable.
Symbolically, if X ~ N(μ, σ²) then $Z = \dfrac{X - \mu}{\sigma}$ ~ N(0, 1).
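Several of these properties can be checked numerically from the density alone. The sketch below uses a simple trapezoidal rule for the areas; μ = 50 and σ = 5 are illustrative assumptions, the function names are hypothetical, and the integration limits of ±10σ stand in for −∞ to ∞.

```python
from math import exp, pi, sqrt


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2 * pi))


def area(a: float, b: float, mu: float, sigma: float, steps: int = 20_000) -> float:
    """Area under the density between a and b (trapezoidal rule)."""
    h = (b - a) / steps
    inner = sum(normal_pdf(a + i * h, mu, sigma) for i in range(1, steps))
    return h * (0.5 * normal_pdf(a, mu, sigma) + inner + 0.5 * normal_pdf(b, mu, sigma))


mu, sigma = 50.0, 5.0  # illustrative values (assumption, not from the text)
print("total area:", round(area(mu - 10 * sigma, mu + 10 * sigma, mu, sigma), 6))
for k in (1, 2, 3):
    print(f"within {k} sigma:", round(area(mu - k * sigma, mu + k * sigma, mu, sigma), 4))
print("peak ordinate:", normal_pdf(mu, mu, sigma), "vs", 1 / (sigma * sqrt(2 * pi)))
```

The printed areas should agree with properties 7 and 8 to about three decimal places, and the ordinate at x = μ with property 9.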
Standard normal distribution: It is the normal distribution having mean 0 and standard deviation 1. The
probability density function of the standard normal variable Z is
$f(z) = \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} z^2}, \quad -\infty < z < \infty$
Area under normal curve:
As the normal variable is a continuous random variable, the probability that the random variable X assumes a
value between x = x1 and x = x2 is represented by the area under the probability curve bounded by the values x1 and x2,
and is defined as
$\mathrm{Prob}(x_1 \le x \le x_2) = \int_{x_1}^{x_2} \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2} dx$
Since the normal curve depends on two parameters μ and σ², the area
represented by Prob(x1 ≤ x ≤ x2) also depends on μ and σ². Though theoretically this probability can be
calculated by integral calculus, normal integral tables are available for the use of
practicing statisticians. It would be a very voluminous work to compile tables for all possible values of μ and σ²; in
fact infinitely many such tables would be needed, because −∞ < μ < ∞ and σ > 0.
To facilitate the preparation of tables, the normal variable is standardized, i.e. transformed to a new variable
which is also normal but has mean 0 and variance 1. Thus if X is a normal variable with mean μ and
variance σ², then $Z = \dfrac{X - \mu}{\sigma}$ is a standardized normal variable having mean 0 and variance 1.
And thus $\mathrm{Prob}(x_1 \le x \le x_2) = \mathrm{Prob}\left(\dfrac{x_1 - \mu}{\sigma} \le \dfrac{x - \mu}{\sigma} \le \dfrac{x_2 - \mu}{\sigma}\right) = \mathrm{Prob}(z_1 \le z \le z_2)$
Some of the areas under the standardized normal curve:
Distance from the mean ordinate    Area under curve
±0.6745                            50%
±1                                 68.27%
±1.96                              95%
±2                                 95.45%
±2.58                              99%
±3                                 99.73%
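The areas in this table can be regenerated without printed tables. The sketch below relies on the identity P(−z ≤ Z ≤ z) = erf(z/√2) for a standard normal Z, using `math.erf` from the Python standard library; central_area is a hypothetical helper name.

```python
from math import erf, sqrt


def central_area(z: float) -> float:
    """P(-z <= Z <= z) for a standard normal variable Z."""
    return erf(z / sqrt(2))


for z in (0.6745, 1, 1.96, 2, 2.58, 3):
    print(f"±{z:<6} {100 * central_area(z):.2f}%")
```

The 99% row is rounded in the table; the area for ±2.58 is very slightly above 99%.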
Example 1: Let Z be a standard normal random variable. Calculate (i) P (Z ≤ 1.1), (ii) P (Z > 0.8),
(iii) P (Z ≤ −1.52), (iv) P (−0.4 ≤ Z ≤ 1.32), and (v) P (−0.34 ≤ Z ≤ −0.2).
Solutions:
(i) P (Z ≤ 1.1): Using the table value for P (0 ≤ Z ≤ 1.1),
P (Z ≤ 1.1) = P (−∞ < Z ≤ 0) + P (0 ≤ Z ≤ 1.1)
= 0.50 + 0.3643
= 0.8643
(ii) P (Z > 0.8) = 0.5 − P (0 ≤ Z ≤ 0.8)
= 0.5 − 0.2881
= 0.2119
(iii) P (Z ≤ −1.52): By symmetry, this is the same as P (Z ≥ 1.52).
P (Z ≤ −1.52) = 0.5 − P (0 ≤ Z ≤ 1.52)
= 0.5 − 0.4357
= 0.0643
(iv) P (−0.4 ≤ Z ≤ 1.32): To calculate this, we note that
P (−0.4 ≤ Z ≤ 1.32) = P (0 ≤ Z ≤ 0.4) + P (0 ≤ Z ≤ 1.32)
= 0.1554 + 0.4066 = 0.5620
(v) Similarly, by symmetry,
P (−0.34 ≤ Z ≤ −0.2) = P (0.2 ≤ Z ≤ 0.34)
= P (0 ≤ Z ≤ 0.34) − P (0 ≤ Z ≤ 0.2)
= 0.1331 − 0.0793
= 0.0538
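The table look-ups in this example can be cross-checked with the standard normal cumulative distribution function, written here in terms of `math.erf`. This is a minimal sketch; phi is a hypothetical helper name, and part (v) is taken as the interval from −0.34 to −0.2.

```python
from math import erf, sqrt


def phi(z: float) -> float:
    """Standard normal CDF, P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))


answers = {
    "(i)   P(Z <= 1.1)": phi(1.1),
    "(ii)  P(Z > 0.8)": 1 - phi(0.8),
    "(iii) P(Z <= -1.52)": phi(-1.52),
    "(iv)  P(-0.4 <= Z <= 1.32)": phi(1.32) - phi(-0.4),
    "(v)   P(-0.34 <= Z <= -0.2)": phi(-0.2) - phi(-0.34),
}
for label, value in answers.items():
    print(f"{label}: {value:.4f}")
```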
Example 2: The actual volume of soup in 500 ml jars follows a normal distribution with mean 500 ml and
variance 16 ml² (so that σ = 4 ml). If X denotes the actual volume of soup in a jar, find
(i) P(X > 496),
(ii) P(X < 498),
(iii) P(492 < X < 506),
(iv) P(X > 493).
Solution:
(i) $P(X > 496) = P\left(Z > \dfrac{496 - 500}{4}\right)$
= P (Z > −1)
= 0.5 + P (0 < Z < 1)
= 0.5 + 0.3413
= 0.8413
(ii) $P(X < 498) = P\left(Z < \dfrac{498 - 500}{4}\right)$
= P (Z < −0.5)
= 0.5 − P (0 < Z < 0.5)
= 0.5 − 0.1915
= 0.3085
(iii) $P(492 < X < 506) = P\left(\dfrac{492 - 500}{4} < Z < \dfrac{506 - 500}{4}\right)$
= P (−2 < Z < 1.5)
= P (0 < Z < 2) + P (0 < Z < 1.5)
= 0.4772 + 0.4332
= 0.9104
(iv) $P(X > 493) = P\left(Z > \dfrac{493 - 500}{4}\right)$
= P (Z > −1.75)
= 0.5 + P (0 < Z < 1.75)
= 0.5 + 0.4599
= 0.9599
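The standardization step used in each part can be folded into a single function. The sketch below computes the four probabilities for the soup example directly from μ = 500 and σ = √16 = 4; normal_cdf is a hypothetical helper built on `math.erf`.

```python
from math import erf, sqrt


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X <= x) for X ~ N(mu, sigma^2), via z = (x - mu) / sigma."""
    z = (x - mu) / sigma
    return 0.5 * (1 + erf(z / sqrt(2)))


mu, sigma = 500.0, sqrt(16.0)  # mean 500 ml, variance 16, so sigma = 4 ml

p_i = 1 - normal_cdf(496, mu, sigma)                            # P(X > 496)
p_ii = normal_cdf(498, mu, sigma)                               # P(X < 498)
p_iii = normal_cdf(506, mu, sigma) - normal_cdf(492, mu, sigma)  # P(492 < X < 506)
p_iv = 1 - normal_cdf(493, mu, sigma)                           # P(X > 493)

for label, value in [("(i)", p_i), ("(ii)", p_ii), ("(iii)", p_iii), ("(iv)", p_iv)]:
    print(f"{label} {value:.4f}")
```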
Example