Ch-6 Probability Distribution
PROBABILITY DISTRIBUTION
Learning Objectives: -
INTRODUCTION
The three distributions covered in this chapter (Binomial, Poisson and Normal) are very useful in statistical studies as well as in deducing other forms of distribution.
2. The terms of the Binomial Distribution are the terms of the binomial expansion, which is why it is called the Binomial Distribution. Let ‘p’ be the probability of success and ‘q’ be the probability of failure in a single toss or trial. Then evidently p + q = 1. In general, if we toss ‘n’ coins, the probabilities of 0, 1, 2, …, n successes are given by the successive terms in the expansion of $(q + p)^n$.
The probability of ‘r’ successes and ‘n − r’ failures is
$P(r) = {}^{n}C_{r}\, p^{r} q^{n-r}$, where r = 0, 1, 2, …, n, and
${}^{n}C_{r} = \dfrac{n!}{r!\,(n-r)!}$
The two parameters of the Binomial Distribution are: ‘n’& ‘p’ or ‘n’ & ‘q’
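The formula for P(r) can be tried out numerically. The following is a minimal Python sketch; the function name `binomial_pmf` and the three-coin example are illustrative assumptions, not part of the original text.

```python
from math import comb

def binomial_pmf(r: int, n: int, p: float) -> float:
    """Probability of exactly r successes in n independent trials."""
    q = 1.0 - p  # probability of failure in a single trial
    return comb(n, r) * p**r * q**(n - r)

# Probabilities of 0, 1, 2 and 3 heads when 3 fair coins are tossed
probs = [binomial_pmf(r, 3, 0.5) for r in range(4)]
print(probs)  # [0.125, 0.375, 0.375, 0.125]
```

The four values are the terms of the expansion of (q + p)^3 with p = q = 1/2.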
(a) Total probability of all the events is one, i.e. ΣP(r) = 1, where ‘P(r)’ is the
probability of ‘r’ successes in ‘n’ trials.
(b) In each trial the probability of success ‘p’ and the probability of failure ‘q’ remain
constant.
(d) The SD of the number of successes in ‘n’ trials is √(npq) and the variance is ‘npq’.
(e) When ‘n’ is very large, that is if the number of trials n → ∞, and p = q, then the
Binomial Distribution tends to the Normal Distribution.
(g) The two parameters of the Binomial Distribution are ‘n’ & ‘p’ or ‘n’ & ‘q’.
(h) If the number of trials n → ∞ and p → 0 then the Binomial Distribution tends to
the Poisson Distribution.
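Properties (a) and (d) can be checked numerically for any particular ‘n’ and ‘p’. The sketch below uses n = 12 and p = 0.25, which are illustrative values chosen here and not taken from the text.

```python
from math import comb, sqrt

n, p = 12, 0.25          # illustrative values, not from the text
q = 1 - p
pmf = [comb(n, r) * p**r * q**(n - r) for r in range(n + 1)]

total = sum(pmf)                                   # property (a): should be 1
mean = sum(r * pr for r, pr in enumerate(pmf))     # should equal n*p
variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(pmf))  # n*p*q
sd = sqrt(variance)                                # property (d): sqrt(n*p*q)
print(total, mean, variance, sd)
```

For these values np = 3 and npq = 2.25, so the computed mean, variance and SD should agree with 3, 2.25 and 1.5 up to rounding error.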
If ‘n’ independent trials are made under the same conditions, then the probability of each possible number of successes X out of the ‘n’ trials can be explained by the Binomial Distribution in the following way: -
● Out of ‘n’ trials, if the number of successes X is zero then
P(X=0) = P(number of successes ‘0’ and number of failures ‘n’)
= q.q.q … up to ‘n’ times = $^{n}C_{0}\, p^{0} q^{n}$
● Out of ‘n’ trials, if the number of successes X is one then
P(X=1) = P(number of successes ‘1’ and number of failures ‘n−1’) = $^{n}C_{1}\, p^{1} q^{n-1}$
● Out of ‘n’ trials, if the number of successes X is two then
P(X=2) = P(number of successes ‘2’ and number of failures ‘n−2’)
For one particular order of successes and failures the probability is
P(S).P(S).P(F).P(F) … up to ‘n−2’ times = p.p.q.q … up to ‘n−2’ times = $p^{2} q^{n-2}$
Since the 2 successes can occur in $^{n}C_{2}$ different orders,
P(X=2) = $^{n}C_{2}\, p^{2} q^{n-2}$
In the same way
● Out of ‘n’ trials, if the number of successes X is ‘r’ then
P(X=r) = $^{n}C_{r}\, p^{r} q^{n-r}$, where $^{n}C_{r} = \dfrac{n!}{r!\,(n-r)!}$
5. Mean of the Binomial Distribution. Here we will prove that the mean of the Binomial
Distribution is ‘np’.
Let ‘p’ be the probability of success and ‘q’ be the probability of failure of an event ‘X’
in one trial. Then evidently p + q = 1.
The parameters of the Binomial Distribution are ‘n’ and ‘p’. The probability function of
the Binomial Distribution is:
$P(X = r) = {}^{n}C_{r}\, p^{r} q^{n-r}$, r = 0, 1, 2, …, n
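The proof announced above can be sketched as follows. This is the standard textbook argument, supplied here as a reconstruction under the definitions already given (p + q = 1 and the binomial probability function), and not necessarily the author's own working.

```latex
\begin{aligned}
E(X) &= \sum_{r=0}^{n} r \,{}^{n}C_{r}\, p^{r} q^{n-r}
      = \sum_{r=1}^{n} r \,\frac{n!}{r!\,(n-r)!}\, p^{r} q^{n-r} \\
     &= np \sum_{r=1}^{n} \frac{(n-1)!}{(r-1)!\,(n-r)!}\, p^{r-1} q^{(n-1)-(r-1)} \\
     &= np\,(q+p)^{n-1} = np \qquad (\text{since } p+q=1).
\end{aligned}
```

The term for r = 0 contributes nothing, and the remaining sum is the full binomial expansion for n − 1 trials, which equals one.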
6. Variance of the Binomial Distribution. Here, we will prove that the variance
of the Binomial Distribution is ‘npq’.
Let ‘p’ be the probability of success and ‘q’ be the probability of failure of an event ‘X’
in one trial. Then evidently p + q = 1.
The parameters of the Binomial Distribution are ‘n’ and ‘p’. The probability function of
the Binomial Distribution is: -
$P(X = r) = {}^{n}C_{r}\, p^{r} q^{n-r}$, r = 0, 1, 2, …, n
Variance is
$V(X) = E(X^2) - \{E(X)\}^2$
$= E\{X(X-1) + X\} - (np)^2$ (since the mean is $E(X) = np$)
$= E\{X(X-1)\} + E(X) - (np)^2$
$= E\{X(X-1)\} + np - n^2p^2$
$= n(n-1)p^2 + np - n^2p^2$ (mathematically, it can be proved that $E\{X(X-1)\} = n(n-1)p^2$)
$= n^2p^2 - np^2 + np - n^2p^2$
$= np(1-p) = npq$ (since $p + q = 1$)
Solution:
Here, the mean of the binomial distribution is np = 4 ……….(1)
The standard deviation of the binomial distribution is √(npq) = √3 ……….(2)
Squaring (2) gives npq = 3; dividing by (1), 4q = 3, so q = 3/4. Hence p = 1 − q = 1/4 and n = 4/p = 16.
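The same elimination can be written as a short Python sketch. The helper name `binomial_params` is hypothetical; it takes the mean np and the variance npq (the square of the SD) and returns n, p and q as exact fractions.

```python
from fractions import Fraction

def binomial_params(mean, sd_squared):
    """Recover (n, p, q) from the mean np and the variance npq."""
    q = Fraction(sd_squared) / Fraction(mean)   # npq / np = q
    p = 1 - q
    n = Fraction(mean) / p                      # np / p = n
    return n, p, q

n, p, q = binomial_params(4, 3)
print(n, p, q)  # 16 1/4 3/4
```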
Example 2: A box contains 7 white and 8 black balls. 6 balls are drawn from the box with
replacement. What is the probability of getting 2 white and 4 black balls?
As the balls are drawn with replacement, the draws are independent and we can apply the Binomial probability law. In
each draw the probability of getting a white ball is p = 7/15 and the probability of getting a black ball is
q = 8/15.
Now out of 6 balls drawn, the probability of getting 2 white and 4 black balls is
$P(X=2) = {}^{6}C_{2}\,(7/15)^{2}\,(8/15)^{4} \approx 0.2643$
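The arithmetic for this example can be checked with exact fractions. This is a minimal sketch; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

p = Fraction(7, 15)   # white ball on a single draw
q = Fraction(8, 15)   # black ball on a single draw
prob = comb(6, 2) * p**2 * q**4   # exactly 2 white among 6 draws
print(prob, float(prob))
```

The decimal value comes out at roughly 0.264, that is, a little over one chance in four.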
Example 3: What is the probability of getting 3 heads when 6 coins are tossed?
According to the Binomial distribution, if n coins are tossed the probability of getting r heads is
$P = {}^{n}C_{r}\, p^{r} q^{n-r} = {}^{6}C_{3}\,(1/2)^{3}(1/2)^{3} = 20/64 = 5/16$
Example 4: In 10 tosses of a fair coin, what are the chances of getting 5 heads within these 10
trials?
Solution:
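One way to work this out is to apply the binomial formula with n = 10, r = 5 and p = q = 1/2. The following Python lines are a sketch of that calculation, not the original worked solution.

```python
from math import comb

n, r = 10, 5
p = q = 0.5                       # fair coin
prob = comb(n, r) * p**r * q**(n - r)
print(prob)  # 0.24609375, i.e. 252/1024
```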
Example 6: The probability that a player wins a game is 3/4. Determine the following
probabilities within 5 games: -
Solution:
P(X>3) = P(X=4) + P(X=5)
$P(X=1) = {}^{5}C_{1}\, p^{1} q^{5-1} = \dfrac{5!}{1!\,(5-1)!}\,(3/4)^{1}(1/4)^{4} = 15/1024$
$P(X=2) = {}^{5}C_{2}\, p^{2} q^{5-2} = \dfrac{5!}{2!\,(5-2)!}\,(3/4)^{2}(1/4)^{3} = 90/1024$
$P(X=4) = {}^{5}C_{4}\, p^{4} q^{5-4} = \dfrac{5!}{4!\,(5-4)!}\,(3/4)^{4}(1/4)^{1} = 405/1024$
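The individual probabilities in this example can be tabulated in one pass. The sketch below builds the whole distribution for n = 5 and p = 3/4 with exact fractions; the dictionary name `pmf` is an illustrative choice.

```python
from fractions import Fraction
from math import comb

n, p = 5, Fraction(3, 4)
q = 1 - p
pmf = {r: comb(n, r) * p**r * q**(n - r) for r in range(n + 1)}

print(pmf[1], pmf[2], pmf[4])   # 15/1024 45/512 405/1024
more_than_3 = pmf[4] + pmf[5]   # P(X > 3)
print(more_than_3)              # 81/128
```

Fractions are printed in lowest terms, so 90/1024 appears as 45/512 and 648/1024 as 81/128.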
Example 7: In 10 tosses of a fair coin, what are the chances of: -
(d) At least 1 head within these 10 trials?
(e) At most 1 head within these 10 trials?
Solution:
$P(X=3) = {}^{10}C_{3}\, p^{3} q^{10-3} = \dfrac{10!}{3!\,(10-3)!}\,(1/2)^{3}(1/2)^{7} = \dfrac{120}{1024} = \dfrac{15}{128}$
$P(X=1) = {}^{10}C_{1}\, p^{1} q^{10-1} = \dfrac{10!}{1!\,(10-1)!}\,(1/2)^{1}(1/2)^{9} = \dfrac{10 \cdot 9!}{1 \cdot 9!} \cdot \dfrac{1}{2^{10}} = \dfrac{10}{1024} = \dfrac{5}{512}$
$P(X=0) = {}^{10}C_{0}\, p^{0} q^{10-0} = \dfrac{10!}{0!\,(10-0)!}\,(1/2)^{0}(1/2)^{10} = \dfrac{10!}{1 \cdot 10!} \cdot \dfrac{1}{2^{10}} = \dfrac{1}{1024}$
(d) The probability of getting at least one head within these 10 trials is:
$P(X \ge 1) = 1 - P(X=0) = 1 - \dfrac{1}{1024} = \dfrac{1023}{1024}$
(e) The probability of getting at most one head within these 10 trials is:
$P(X \le 1) = P(X=0) + P(X=1) = \dfrac{1}{1024} + \dfrac{10}{1024} = \dfrac{11}{1024}$
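The results for the 10-toss example can be verified with a few lines of Python. This is a sketch using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

n = 10
half = Fraction(1, 2)
pmf = [comb(n, r) * half**n for r in range(n + 1)]  # p = q = 1/2

at_least_one = 1 - pmf[0]          # part (d)
at_most_one = pmf[0] + pmf[1]      # part (e)
print(pmf[3], at_least_one, at_most_one)  # 15/128 1023/1024 11/1024
```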
9. The limiting form of the Binomial Distribution is the Poisson distribution. When the
probability of success is very small (less than 0.1) and the number of trials is very large
(greater than 30), i.e. if n → ∞ and p → 0 such that np = m, a constant, then the Binomial
Distribution becomes the Poisson distribution. The Poisson distribution can be written in the
following form: -
(b) The probability formula for a Poisson variable X is
$P(X = r) = \dfrac{m^{r} e^{-m}}{r!}$, r = 0, 1, 2, …
(c) Arithmetic mean of the Poisson variable = m.
(e) When n → ∞ and p → 0 such that np = m, the Binomial Distribution tends to the
Poisson Distribution.
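The limiting behaviour in (e) can be seen numerically by comparing the two probability functions for a large ‘n’ and a small ‘p’. The values n = 1000 and p = 0.002 below are illustrative assumptions chosen so that m = np = 2; the function names are likewise hypothetical.

```python
from math import comb, exp, factorial

def poisson_pmf(r: int, m: float) -> float:
    """P(X = r) for a Poisson variable with mean m."""
    return m**r * exp(-m) / factorial(r)

def binomial_pmf(r: int, n: int, p: float) -> float:
    """P(X = r) for a binomial variable with parameters n and p."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 1000, 0.002       # illustrative: n large, p small
m = n * p                # m = 2
for r in range(5):
    print(r, round(binomial_pmf(r, n, p), 4), round(poisson_pmf(r, m), 4))
```

The two columns printed for each r should agree to about three decimal places, and the agreement improves as n grows with np held fixed.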
11. In most natural phenomena the variable does not take a fixed value but lies
within a specific range. In statistics the Normal Distribution refers to the shape and pattern in
which the individual values in a set of data are placed or located. For continuous variates such as
the heights and weights of people, the lengths of bolts manufactured, or the time taken to produce
an item, the normal distribution is the most appropriate and has become the most important
probability model in statistical analysis.
12. The idea of the normal distribution was finally developed by Karl Gauss. The Normal Distribution
is also referred to as the Gaussian Distribution after the name of Karl Gauss. The normal
distribution or Gaussian distribution is the most important of all types of distribution. It is also
known as the bell-shaped distribution.
[Fig. 5.02: frequency curve, with a strip of width dx between x and x + dx marked on the horizontal axis]
If the frequency diagram represented in Fig. 5.02 satisfies the following conditions
then it is called a frequency curve or Gauss Curve: -
$X_{j+1} > X_{j}$ and $|X_{j+1} - X_{j}| = dx$
14. Normal Variable. If the number of trials is very large, i.e. if n → ∞, and the probability of
success p and the probability of failure q are approximately equal, i.e. p = q, then the variable of
the Binomial Distribution becomes a Normal variable. A Normal variable is represented by X
and its value lies in the range −∞ < X < ∞.
15. Normal Curve or Gaussian Curve. A variable exhibiting the normal distribution,
when represented in a graph, assumes the form of a symmetrical, bell-shaped, mesokurtic
curve. This curve is called the normal curve. It is a special type of symmetrical frequency curve.
The normal curve is a theoretical concept defined by a mathematical formula (the
normal equation) and is rarely found exactly in practice. Many distributions approximately take
the form of the normal curve in the long run.
If the normal distribution is represented by a graph then it is called the normal curve. The area
bounded by the normal curve and the X-axis is always one. The normal curve extends to
infinity at both ends. The normal curve is symmetrical about the arithmetic mean.
The concept of the normal curve has important uses in connection with the SD. The relation of
the SD to the normal curve makes it possible to find the number of cases falling within any given
distance from the mean. A normal distribution curve is shown below: -
(a) The normal curve is symmetrical about the mean (i.e. the values of mean, mode
and median coincide). Mean = Mode = Median.
(b) The height of the normal curve is maximum at the mean; thus mean, mode and median
are all equal. The distribution is symmetrical. The skewness is zero.
(e) The height of curve declines as we go in either direction from the mean. The
curve approaches nearer and nearer to the base on either side.
(f) The area under the normal curve representing proportionate frequency is one.
(p) The following are the areas distributed under the normal curve: -
(i) Mean ± 1SD = µ ± 1σ => 68.27% area
(ii) Mean ± 2SD = µ ± 2σ => 95.45% area
(iii) Mean ± 3SD = µ ± 3σ => 99.73% area
In other words: -
(i) 99.73% of the values will lie in the range µ − 3σ to µ + 3σ.
(ii) 95.45% of the values will lie in the range µ − 2σ to µ + 2σ.
(iii) 68.27% of the values will lie in the range µ − σ to µ + σ.
Fig. 5.03
(r) Taken over its full range, the distribution sums to one. That is, the total
probability in the Normal probability distribution is one.
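The areas listed under (p) can be reproduced without tables. The sketch below relies on the identity that the area within k standard deviations of the mean equals erf(k/√2); the function name `area_within` is an illustrative assumption.

```python
from math import erf, sqrt

def area_within(k: float) -> float:
    """Area under the normal curve between mean - k*SD and mean + k*SD."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(k, round(100 * area_within(k), 2))
```

Values read from four-figure tables can differ from these in the last digit because of rounding in the table entries.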
STANDARD VARIABLE
17. Standard Variable. If we subtract the arithmetic mean M from each value Xi of the variable
and then divide by the standard deviation σ, we get a new variable Z, which is called the Standard
Variable.
$Z_i = \dfrac{X_i - M}{\sigma}$
(a) Arithmetic mean of the standard variable is zero. That is, Mean of Zi = 0.
(b) Variance of the standard variable is one. That is, Variance of Zi =1.
19. Property-1: Prove that the arithmetic mean of the standard variable is zero. That is,
Mean of Zi = 0.
20. Property-2: Prove that the variance of the standard variable is one. That is, Variance
of Zi = 1.
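Proof:
$\sigma_z^2 = \dfrac{\sum (Z_i - M_z)^2}{N} = \dfrac{\sum (Z_i - 0)^2}{N} = \dfrac{1}{N}\sum Z_i^2 = \dfrac{1}{N}\sum \dfrac{(X_i - M)^2}{\sigma^2} = \dfrac{\sigma^2}{\sigma^2} = 1$
Here $M_z = 0$ by Property-1, and $\dfrac{1}{N}\sum (X_i - M)^2 = \sigma^2$ by the definition of the variance.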
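Both properties of the standard variable can be checked on any data set. The five values below are made up purely for illustration; the SD is the population SD (divisor N), as in the proof.

```python
from math import sqrt

data = [12.0, 15.0, 9.0, 20.0, 14.0]      # illustrative values only
N = len(data)
M = sum(data) / N                                   # arithmetic mean
sigma = sqrt(sum((x - M) ** 2 for x in data) / N)   # population SD

z = [(x - M) / sigma for x in data]                 # standard variable
mean_z = sum(z) / N
var_z = sum((zi - mean_z) ** 2 for zi in z) / N
print(round(mean_z, 10), round(var_z, 10))
```

Up to floating-point rounding, the mean of Z comes out as zero and its variance as one, whatever data are used.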
21. Standardized Normal Distribution. If we subtract the normal mean µ from the
normal variable X and then divide by the standard deviation σ, we get a new variable
Z, which is called the Standard Normal Variable.
$Z = \dfrac{X - \mu}{\sigma}$
If the height of the Gaussian distribution curve is normalized so that the area underneath
it is equal to unity, the curve is called a probability curve. Although the theoretical probability
curve requires a very large number of measurements, the curve obtained from as few as 8
or 10 determinations is practically the same as the theoretical curve.
The narrower the range within which the true value lies, the more precise is the
measurement. A small spread of the probability curve indicates high precision whereas a
large spread indicates low precision. The distribution of Z is called the standardized normal
distribution.
(k) The Standard Normal variable is represented by Z and its value lies in the range −∞ < Z < ∞.
(l) Taken over its full range, the distribution sums to one. That is, the total
probability in the Normal probability distribution is one.
(m) The narrower the range within which the true value lies, the more precise is
the measurement. A small spread of the probability curve indicates high precision
whereas a large spread indicates low precision.
(a) The concept of the normal curve has important uses in connection with σ.
(b) The relation of σ with the normal curve enables us to put the normal curve to
practical use.
(c) The relationship permits us to find out, from the curve, the number of cases
falling within any given distance from the mean.
(d) The normal curve permits us to find out the number of cases lying between
any two values in the distribution.
(e) From the normal curve we can find out the probability that any value drawn at
random will fall above or below a particular point.
(f) The normal distribution represented in graphical form gives us the normal
curve.
(g) The narrower the range within which the true value lies, the more precise
is the measurement. A small spread of the probability curve indicates high precision
whereas a large spread indicates low precision.
(j) To find the area under the curve from the mean, we need not calculate it every time,
as standard tables are available.
(k) In order to facilitate finding the area under the curve we convert each curve into the
standard normal curve, where the mean is 0 and the standard deviation is 1.
(l) The approximate probabilities of a few important intervals are given by: -
24. In any problem in which we are interested in determining area under normal curve
where mean is M and SD is σ, we change the mean to 0 and SD to 1 and obtain Z as standard
normal variate. Now we look up the table entry corresponding to Z and that will give the area.
The area under normal curve of z (indicating probability) between any two values of z (say
z1 and z2) can be calculated from the standard tables.
Example 8: If Z is the standard normal variable, then find the probability for the following
cases: -
(a) The value of Z lies between 0 and 0.87
Solution:
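The table lookup for part (a) can be imitated in code. The sketch below computes the cumulative area with the error function; the helper names `phi` and `area_between` are illustrative assumptions and stand in for the standard table.

```python
from math import erf, sqrt

def phi(z: float) -> float:
    """Cumulative area under the standard normal curve up to z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def area_between(z1: float, z2: float) -> float:
    """Area under the standard normal curve between z1 and z2."""
    return phi(z2) - phi(z1)

print(round(area_between(0.0, 0.87), 3))  # 0.308
```

The same helper gives the area between any two values z1 and z2, which is how the remaining parts of such a question would be handled.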
Example 9: The mean and SD of a normal distribution are 20 and 10 respectively. What is the
probability that the values of X will lie between 15 and 40?
Between X1 = 15 and X2 = 40
$Z_1 = \dfrac{X_1 - \mu}{\sigma} = \dfrac{15 - 20}{10} = -0.5$
$Z_2 = \dfrac{X_2 - \mu}{\sigma} = \dfrac{40 - 20}{10} = 2$
Probability that the values of X will lie
between 15 and 40 is
= P(15≤X≤40)
= P(-0.5≤Z≤2)
= P(-0.5≤Z≤0) + P(0≤Z≤2)
= P(0≤Z≤0.5) + P(0≤Z≤2) (by symmetry of the normal curve)
= 0.1915 + 0.4772
= 0.6687
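The steps of this example (standardize, then take the area between the two Z values) can be sketched in Python. The helper `phi` is an illustrative stand-in for the standard normal table.

```python
from math import erf, sqrt

def phi(z: float) -> float:
    """Cumulative area under the standard normal curve up to z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma = 20.0, 10.0
z1 = (15 - mu) / sigma          # lower limit in standard units
z2 = (40 - mu) / sigma          # upper limit in standard units
prob = phi(z2) - phi(z1)        # P(15 <= X <= 40)
print(z1, z2, round(prob, 4))   # -0.5 2.0 0.6687
```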
Example 10: Mark the area under normal curve when mean is 24 and SD is 4.
(a) Between Mean and X = 30
(b) Between X1 = 21 and X2 = 31
Example 11: The mean and SD of a normal distribution are 50 and 5 respectively. What is the
probability that the values of X will lie between 40 and 55?
Example 12: The income of a group of 10,000 persons was found to be normally distributed
with mean = Rs.750 per day and standard deviation = Rs.50/-. What percentage of persons
have income exceeding Rs.668/- and how many have income over Rs.860/- per day?
The persons having income more than Rs. 860/- per day fall in the right-hand tail, beyond Z = (860 − 750)/50 = 2.2.
% of persons having income more than Rs. 860/- = P(Z > 2.2) = 0.5 − 0.4861 = 0.0139 = 1.39%
The number of people having income more than Rs.860/- per day = 10^4 x 0.0139 = 139
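Both parts of the income example can be checked with Python's standard library. This is a sketch using `statistics.NormalDist` in place of the printed tables; the variable names are illustrative.

```python
from statistics import NormalDist

income = NormalDist(mu=750, sigma=50)   # daily income in Rs.
persons = 10_000

share_above_668 = 1 - income.cdf(668)   # proportion earning more than Rs.668
share_above_860 = 1 - income.cdf(860)   # proportion earning more than Rs.860
print(round(100 * share_above_668, 2), round(persons * share_above_860))
```

The first figure is the percentage of persons above Rs.668 (about 95%), and the second is the head count above Rs.860.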
Example 13: 1000 light bulbs with a mean life of 120 days are installed in a new factory;
their length of life is normally distributed with SD 20 days.
(b) If it is decided to replace all bulbs together, what interval should be allowed
between replacements, if not more than 10% should expire before replacement?
For X = 90 days, Z = (90 − 120)/20 = −1.5. From the normal integral table we get P(Z < -1.5) = 0.0668.
The shaded area represents the proportion of bulbs that will fuse before 90 days.
No. of bulbs that will fuse before 90 days = 1000 x 0.0668 = 66.8 or 67 approximately
For part (b), the value of Z that leaves 10% of the area in the lower tail is found from the table.
Hence Z = -1.28
$Z = \dfrac{X - \mu}{\sigma}$
$-1.28 = \dfrac{X - 120}{20}$
X = 120 – 20 x 1.28 = 94.4
X = 94 days approximately
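The two table lookups in the bulb example (an area for a given X, and an X for a given area) correspond to the `cdf` and `inv_cdf` methods of `statistics.NormalDist`. The following is a sketch under the example's own figures; the variable names are illustrative.

```python
from statistics import NormalDist

life = NormalDist(mu=120, sigma=20)        # bulb life in days
fused_before_90 = 1000 * life.cdf(90)      # expected failures before day 90
replace_after = life.inv_cdf(0.10)         # day by which 10% have failed
print(round(fused_before_90), round(replace_after, 1))
```

The replacement interval comes out slightly above 94 days, in line with the table-based answer.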
Example 14: 10% of the light bulbs produced by a machine are defective. A sample of 400 light
bulbs is selected from the bulbs produced by that machine. Find the probability that out of
these 400 bulbs: -
(a) Maximum 30 bulbs are defective
(b) 30 to 50 bulbs are defective
(c) More than 55 bulbs are defective
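Here n = 400, p = 0.1 and q = 0.9, so µ = np = 40 and σ = √(npq) = √36 = 6.
(b) For X1 = 30 and X2 = 50:
$Z_1 = \dfrac{X_1 - \mu}{\sigma} = \dfrac{30 - 40}{6} = \dfrac{-10}{6} = -1.67$
$Z_2 = \dfrac{X_2 - \mu}{\sigma} = \dfrac{50 - 40}{6} = \dfrac{10}{6} = 1.67$
Probability that the value of X lies between 30 and 50
= P(30≤X≤50)
= P(-1.67≤Z≤1.67)
= P(-1.67≤Z≤0) + P(0≤Z≤1.67)
= 0.4525 + 0.4525 = 0.9050 ≈ 90.50%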
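All three parts of the defective-bulb example follow the same pattern once the binomial is replaced by a normal curve with µ = np and σ = √(npq). The sketch below does this without a continuity correction, which is an assumption made here to match the working shown for part (b).

```python
from math import sqrt
from statistics import NormalDist

n, p = 400, 0.10
mu = n * p                     # 40
sigma = sqrt(n * p * (1 - p))  # 6
defectives = NormalDist(mu, sigma)

at_most_30 = defectives.cdf(30)                          # part (a)
between_30_50 = defectives.cdf(50) - defectives.cdf(30)  # part (b)
more_than_55 = 1 - defectives.cdf(55)                    # part (c)
print(round(at_most_30, 4), round(between_30_50, 4), round(more_than_55, 4))
```

Because the code uses Z = 10/6 unrounded, its answer for part (b) differs slightly in the third decimal place from the table value obtained with Z = 1.67.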
Example 15: A sample of 400 articles has average weight of 25kg and standard
deviation 5kg. How many articles will lie between 20 to 30kg?
Solution: µ = 25, σ = 5
Articles between 20 and 30 kg: X1 = 20, X2 = 30
$Z_1 = \dfrac{X_1 - \mu}{\sigma} = \dfrac{20 - 25}{5} = \dfrac{-5}{5} = -1$
$Z_2 = \dfrac{X_2 - \mu}{\sigma} = \dfrac{30 - 25}{5} = \dfrac{5}{5} = 1$
Probability that the value of X lies between 20 and 30
= P(20≤X≤30)
= P(-1≤Z≤1)
= P(-1≤Z≤0) + P(0≤Z≤1)
= 0.34134 + 0.34134 = 0.68268 = 68.268%
Number of articles lying between 20 and 30
= 2,000 x 68.268% = 2,000 x 0.68268 = 1365.36 ≈ 1365
Example 16: The hourly mean wages of a certain group of workers in a factory is Rs.285
with a standard deviation of Rs.50. Find what percentage of workers get above Rs.200.
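One way to approach this example is to standardize Rs.200 and take the area to its right. The following is a sketch with `statistics.NormalDist`, offered as an illustration and not as the original worked solution.

```python
from statistics import NormalDist

wages = NormalDist(mu=285, sigma=50)    # hourly wages in Rs.
z = (200 - 285) / 50                    # Rs.200 in standard units
share_above_200 = 1 - wages.cdf(200)    # proportion earning more than Rs.200
print(z, round(100 * share_above_200, 2))
```

Since Rs.200 lies 1.7 standard deviations below the mean, the great majority of workers (a little over 95%) earn more than this.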
Exercises
Exercise 1: The Goa Municipal Corporation installed 2000 bulbs in the streets of
Chandigarh. If these bulbs have an average life of 1000 hours with a standard deviation of 200
hours, what number of bulbs is expected to fail in the first 700 burning hours?
Exercise 2: A firm has received orders to supply 10,000 bolts of 2.5 cm ± 0.2 cm.
During the manufacturing process, a sample of 100 bolts was taken. The mean length was
found to be 2.6 cm and the standard deviation 0.26 cm. Find out how many bolts out of the 10,000
produced are expected to be within acceptable limits.
Exercise 3: A Naval base supply depot has received a consignment of 50,000 pairs
of boots for issue to Sailors. A study of similar shoes received in the past revealed a mean life
of 2 years and an SD of 6 months. How many shoes are expected to be received for exchange
after 18 months and how many shoes are expected to last for at least 36 months?