NORMAL DISTRIBUTION
A. N. Mulaga (PhD)
Department of Mathematical Sciences
February 2023
NORMAL DISTRIBUTION
• Recall that random variables can be either
discrete or continuous.
• A discrete variable cannot assume all values
between any two given values of the variable.
• A continuous variable, on the other hand, can
assume all values between any two given
values of the variable.
NORMAL DISTRIBUTION
• Many continuous variables have distributions
that are bell-shaped, and these are called
approximately normally distributed variables.
• The normal distribution is also known as the bell
curve or the Gaussian distribution.
• The theoretical curve, called a normal distribution
curve, can be used to study many variables that
are not perfectly normally distributed but are
nevertheless approximately normal.
NORMAL DISTRIBUTION
• The mathematical equation for a normal
distribution is
\[ y = \frac{e^{-(X-\mu)^2/(2\sigma^2)}}{\sigma\sqrt{2\pi}} \]
• The shape and position of a normal
distribution curve depend on two parameters,
the mean \(\mu\) and the standard deviation \(\sigma\).
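As a minimal sketch, the normal curve equation can be evaluated directly in Python. The function name `normal_pdf` and the parameter values (mean 100, standard deviation 15) are assumptions made up for illustration and do not come from the slides. The last line cross-checks the result against `statistics.NormalDist` from the standard library.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Height y of the normal curve at x for mean mu and standard deviation sigma."""
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical parameters, chosen only for illustration.
mu, sigma = 100.0, 15.0
peak = normal_pdf(mu, mu, sigma)         # the curve is highest at the mean
left = normal_pdf(mu - 10, mu, sigma)    # same height 10 units either side,
right = normal_pdf(mu + 10, mu, sigma)   # because the curve is symmetric
print(peak, left, right)
print(NormalDist(mu, sigma).pdf(mu))     # standard-library cross-check
```

Changing `mu` shifts the curve along the axis, while changing `sigma` makes it wider or narrower.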
NORMAL DISTRIBUTION
• Each normally distributed variable has its own
normal distribution curve, which depends on
the values of the variable’s mean and standard
deviation.
• A normal distribution is a continuous,
symmetric, bell-shaped distribution of a
variable.
Properties of a normal distribution
• A normal distribution curve is bell-shaped.
• The mean, median, and mode are equal and
are located at the center of the distribution.
• A normal distribution curve is unimodal (i.e., it
has only one mode).
• The curve is symmetric about the mean.
• The total area under a normal distribution
curve is equal to 1.00, or 100%.
The Standard Normal Distribution
• Since each normally distributed variable has its
own mean and standard deviation, as stated
earlier, the shape and location of these curves
will vary.
• In practical applications, then, you would have to
have a table of areas under the curve for each
variable.
• To simplify this situation, statisticians use what is
called the standard normal distribution.
The Standard Normal Distribution
• The standard normal distribution is a normal
distribution with a mean of 0 and a standard
deviation of 1.
• The formula for the standard normal
distribution is
\[ y = \frac{e^{-z^2/2}}{\sqrt{2\pi}} \]
The Standard Normal Distribution
• All normally distributed variables can be
transformed into the standard normally
distributed variable by using the formula for
the standard score:
\[ z = \frac{X - \mu}{\sigma} \]
The Standard Normal Distribution
• Once the X values are transformed by using
the preceding formula, they are called z values.
• The z value, or z score, is the number of
standard deviations that a particular X value
lies away from the mean.
The Standard Normal Distribution
• The area under a standard normal distribution
curve is used to solve practical application
problems, such as finding the percentage of
adult women whose height is between 5 feet 4
inches and 5 feet 7 inches, or finding the
probability that a new battery will last longer
than 4 years.
Finding Areas Under the Standard Normal
Distribution Curve
a. Find the area to the left of z = 2.06.
Answer: 0.9803
b. Find the area to the right of z = -1.19.
Answer: 1 - 0.1170 = 0.8830
c. Find the area between z = -1.37 and z = 1.68.
Answer: 0.4147 + 0.4535 = 0.8682,
or equivalently 0.9535 - 0.0853 = 0.8682
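The three table look-ups can be checked with Python's standard library. This is a sketch that uses `statistics.NormalDist` (Python 3.8 or later) in place of the printed table. Part c is taken as the area between z = -1.37 and z = 1.68, which is what the worked sums correspond to.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

area_left = Z.cdf(2.06)                     # a. left of z = 2.06
area_right = 1 - Z.cdf(-1.19)               # b. right of z = -1.19
area_between = Z.cdf(1.68) - Z.cdf(-1.37)   # c. between z = -1.37 and z = 1.68

for label, area in [("a", area_left), ("b", area_right), ("c", area_between)]:
    print(label, round(area, 4))
```

The `cdf` method always returns the area to the left of a value. An area to the right is 1 minus the cdf, and an area between two values is the difference of two cdf values.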
A Normal Distribution Curve as a
Probability Distribution Curve
• A normal distribution curve can be used as a
probability distribution curve for normally
distributed variables.
• The area under the standard normal
distribution curve can also be thought of as a
probability.
A Normal Distribution Curve as a
Probability Distribution Curve
• That is, if it were possible to select any z value
at random, the probability of choosing one
between, say, 0 and 2.00 would be the same as
the area under the curve between 0 and 2.00,
which is 0.4772.
• Therefore, the probability of randomly
selecting any z value between 0 and 2.00 is
0.4772.
A Normal Distribution Curve as a
Probability Distribution Curve
• Thus problems involving probability are solved
in the same manner as finding areas under the
standard normal distribution curve.
• For example, if the problem is to find the
probability of selecting a z value between 2.25
and 2.94, this could be solved by finding the
area under the standard normal curve
between those two values.
A Normal Distribution Curve as a
Probability Distribution Curve
• For probabilities, a special notation is used.
• For example, if the problem is to find the
probability of any z value between 0 and 2.32,
this probability is written as P(0 < z < 2.32).
• Example: Find the probability for each.
a. P(0 < z< 2.32)
b. P(z < 1.65)
c. P(z >1.91)
A Normal Distribution Curve as a
Probability Distribution Curve
Solution:
• P(0 < z <2.32) means to find the area under the
standard normal distribution curve between 0
and 2.32.
• First look up the area corresponding to 2.32. It is
0.9898.
• Then look up the area corresponding to z =0. It is
0.500.
• Subtract the two areas: 0.9898 - 0.5000 = 0.4898.
Hence the probability is 0.4898, or 48.98%.
A Normal Distribution Curve as a
Probability Distribution Curve
b. P(z< 1.65)
Solution:
• Look up the area corresponding to z = 1.65 in
the standard normal tables.
• It is 0.9505. Hence, P(z < 1.65) = 0.9505, or
95.05%.
A Normal Distribution Curve as a
Probability Distribution Curve
c. P(z >1.91)
Solution:
• Look up the area that corresponds to z = 1.91.
It is 0.9719.
• Then subtract this area from 1.0000.
• P(z >1.91)= 1.0000 - 0.9719 = 0.0281, or
2.81%.
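The table values used in parts a to c can be reproduced from the error function, since the area to the left of z equals 0.5(1 + erf(z/√2)). The following is a sketch only. The helper name `phi` is an assumption for illustration and is not notation from the slides.

```python
import math

def phi(z):
    """Cumulative area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p_a = phi(2.32) - phi(0.0)   # a. P(0 < z < 2.32)
p_b = phi(1.65)              # b. P(z < 1.65)
p_c = 1.0 - phi(1.91)        # c. P(z > 1.91)
print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```

Each line mirrors one table procedure: subtract two areas for an interval, read one area for "less than", and subtract from 1 for "greater than".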
A Normal Distribution Curve as a
Probability Distribution Curve
• Sometimes, one must find a specific z value
for a given area under the standard normal
distribution curve.
• The procedure is to work backward, using the
standard normal Tables.
• Example: Find the z value such that the area
under the standard normal distribution curve
between 0 and the z value is 0.2123.
A Normal Distribution Curve as a
Probability Distribution Curve
• Solution:
• From the problem, \(P(0 < z < z_1) = 0.2123\).
• But \(P(z < 0) = 0.5000\).
• So \(P(z < z_1) = 0.2123 + 0.5000 = 0.7123\).
• Hence look up this area (probability) in the
standard normal tables.
• The value in the left column is 0.5, and the top
value is 0.06. Add these two values to get z = 0.56.
A Normal Distribution Curve as a
Probability Distribution Curve
• Note: if the exact area cannot be found,
use the closest value.
• For example, if you wanted to find the z value
for an area 0.9241, the closest area is 0.9236,
which gives a z value of 1.43.
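Working backward from an area to a z value is what an inverse cumulative distribution function does. As a sketch, `NormalDist.inv_cdf` from Python's standard library can stand in for the reverse table look-up. Rounding to two decimals mimics the resolution of a printed table.

```python
from statistics import NormalDist

Z = NormalDist()

# Area between 0 and z1 is 0.2123, so the area to the left of z1 is 0.7123.
z1 = Z.inv_cdf(0.5 + 0.2123)
# Area to the left is 0.9241; the table's closest entry corresponds to z = 1.43.
z2 = Z.inv_cdf(0.9241)
print(round(z1, 2), round(z2, 2))
```

Unlike the table, `inv_cdf` does not need a "closest value" step, because it accepts any area strictly between 0 and 1.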
Applications of the Normal
Distribution
• The standard normal distribution curve can be
used to solve a wide variety of practical
problems.
• The requirement to use the standard normal
distribution curve to solve such problems is
that the variable should be normally or
approximately normally distributed.
Applications of the Normal
Distribution
• To solve problems by using the standard
normal distribution, transform the original
variable to a standard normal distribution
variable by using the formula
\[ z = \frac{X - \mu}{\sigma} \]
Applications of the Normal
Distribution
• This formula transforms the values of the
variable into standard units or z values.
• Once the variable is transformed, the
standard normal tables can be used to solve
probability problems.
Applications of the Normal
Distribution
• Example: A survey found that women spend
on average $146.21 on beauty products
during the summer months. Assume the
standard deviation is $29.44. Find the
percentage of women who spend less than
$160.00. Assume the variable is normally
distributed.
Applications of the Normal
Distribution
• Solution:
• Find the z value corresponding to $160.00.
• We have \(X = 160\), \(\mu = 146.21\), \(\sigma = 29.44\), so
\[ z = \frac{X - \mu}{\sigma} = \frac{160 - 146.21}{29.44} = 0.47 \]
• We are looking for the probability P(z < 0.47). From
the tables, P(z < 0.47) = 0.6808.
• Therefore 0.6808, or 68.08%, of the women spend less
than $160.00 on beauty products during the summer
months.
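The beauty-products calculation can be sketched in Python as follows. The variable names are illustrative. `p_table` rounds z to two decimals first, as one does with a printed table, while `p_exact` skips the rounding, so the two differ slightly in the fourth decimal place.

```python
from statistics import NormalDist

mu, sigma = 146.21, 29.44   # mean and standard deviation from the example
x = 160.00

z = (x - mu) / sigma                      # standard score, about 0.468
p_table = NormalDist().cdf(round(z, 2))   # z rounded to 0.47, as with a table
p_exact = NormalDist(mu, sigma).cdf(x)    # same probability without rounding z
print(round(z, 2), round(p_table, 4), round(p_exact, 4))
```

Passing `mu` and `sigma` to `NormalDist` performs the standardisation internally, so the z transformation does not have to be written out by hand.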
Applications of the Normal
Distribution
• Example: Each month, a Malawian household
generates an average of 28 kg of newspaper
for garbage or recycling. Assume the standard
deviation is 2 kg. If a household is selected at
random, find the probability of its generating
a. Between 27 and 31 kg per month
b. More than 30.2 kg per month
Assume the variable is approximately normally
distributed.
Applications of the Normal
Distribution
• Solution:
• We have \(X_1 = 27\), \(X_2 = 31\), \(\mu = 28\), \(\sigma = 2\).
• Transforming into standard z units:
\[ z_1 = \frac{27 - 28}{2} = -\frac{1}{2} = -0.5 \]
\[ z_2 = \frac{31 - 28}{2} = \frac{3}{2} = 1.5 \]
• Thus P(-0.5 < z < 1.5) = 0.9332 - 0.3085 = 0.6247.
• Hence, the probability that a randomly
selected household generates between 27 and
31 kg of newspapers per month is 0.6247, or 62.47%.
Applications of the Normal
Distribution
b. More than 30.2 kg per month
Solution:
• Transforming into z units, we have
\[ z = \frac{30.2 - 28}{2} = \frac{2.2}{2} = 1.1 \]
• We are looking for P(z > 1.1) = 1.0000 - 0.8643 = 0.1357.
• Hence, the probability that a randomly selected
household will accumulate more than 30.2 kg of
newspapers is 0.1357, or 13.57%.
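Both parts of the newspaper example can be checked with a short Python sketch. It assumes, as the example does, a normal distribution with mean 28 kg and standard deviation 2 kg.

```python
from statistics import NormalDist

X = NormalDist(mu=28, sigma=2)   # monthly newspaper weight in kg, from the example

p_between = X.cdf(31) - X.cdf(27)   # a. P(27 < X < 31)
p_more = 1 - X.cdf(30.2)            # b. P(X > 30.2)
print(round(p_between, 4), round(p_more, 4))
```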
Applications of the Normal
Distribution
• The standard normal distribution can also be
used to find specific data values for given
percentages (probabilities).
• READING ASSIGNMENT: read about finding
data values given probabilities.
• In the previous example, compute the weight
of newspaper that is exceeded by 90% of
households.
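Finding a data value from a probability reverses the usual steps: find the z value for the required area, then back-transform with X = μ + zσ. Here is a sketch for the newspaper example, assuming the same mean of 28 kg and standard deviation of 2 kg. A value exceeded by 90% of households has 10% of the area to its left.

```python
from statistics import NormalDist

mu, sigma = 28, 2   # newspaper example: mean and standard deviation in kg

# "Exceeded by 90% of households" means 10% of households fall below the value.
z = NormalDist().inv_cdf(0.10)              # z value with area 0.10 to its left
x = mu + z * sigma                          # back-transform: X = mu + z * sigma
direct = NormalDist(mu, sigma).inv_cdf(0.10)   # same value in one step
print(round(z, 2), round(x, 2), round(direct, 2))
```

The z value is negative because the required weight lies below the mean.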
Determining Normality
• A normally shaped or bell-shaped distribution
is only one of many shapes that a distribution
can assume.
• Normality is very important, since many
statistical methods require that the
distribution of values be normally or
approximately normally shaped.
• There are several ways statisticians check for
normality. The easiest way is to draw a
histogram for the data and check its shape.
Determining Normality
• If the histogram is not approximately bell
shaped, then the data are not normally
distributed.
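• Skewness can be checked by using the
Pearson coefficient of skewness (PC), also
called Pearson's index of skewness.
• The formula is
\[ PC = \frac{3(\bar{X} - \text{median})}{s} \]
where \(\bar{X}\) is the sample mean and \(s\) is the
sample standard deviation.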
Determining Normality
• If the mean > mode, the skew is positive.
• If the mean < mode, the skew is negative.
• If the mean = mode, the skew is zero and the
distribution is symmetrical.
• If the index is greater than or equal to +1 or
less than or equal to -1, it can be concluded
that the data are significantly skewed.
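The skewness check can be sketched in a few lines of Python. This sketch assumes the median-based form of Pearson's index, PC = 3(mean - median)/s, and the two data sets are hypothetical values invented only to show the sign and size of the index.

```python
from statistics import mean, median, stdev

def pearson_skewness(data):
    """Pearson's index of skewness, median-based form: 3 * (mean - median) / s."""
    return 3 * (mean(data) - median(data)) / stdev(data)

# Hypothetical data sets, invented only to illustrate the index.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 4, 15]

for name, data in [("symmetric", symmetric), ("right_skewed", right_skewed)]:
    pc = pearson_skewness(data)
    verdict = "significantly skewed" if abs(pc) >= 1 else "not significantly skewed"
    print(name, round(pc, 2), verdict)
```

`stdev` is the sample standard deviation (divisor n - 1), matching the s in the formula.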
Determining Normality
• There are several other methods used to
check for normality.
• These include:
• Using normal probability graph paper
• Using the chi-square goodness-of-fit test
• Using the Kolmogorov-Smirnov test.
The Central Limit Theorem
Distribution of Sample Means
• Suppose a researcher selects a sample of 30 adult
males and finds the mean of the blood pressure
levels for the sample subjects to be 120.
• Then suppose a second sample is selected, and
the mean of that sample is found to be 115.
• Suppose the researcher continues the process for
100 samples.
• What happens then is that the mean becomes a
random variable, and the sample means 120, 115,
113.3, . . . , 126.72 constitute a sampling
distribution of sample means.
The Central Limit Theorem
• A sampling distribution of sample means is a
distribution using the means computed from
all possible random samples of a specific size
taken from a population.
• If the samples are randomly selected with
replacement, the sample means, for the most
part, will be somewhat different from the
population mean.
• These differences are caused by sampling
error.
The Central Limit Theorem
• Sampling error is the difference between the
sample measure and the corresponding
population measure due to the fact that the
sample is not a perfect representation of the
population.
Properties of the Distribution of
Sample Means
1. The mean of the sample means will be the
same as the population mean.
• If all possible samples of size n are taken with
replacement from the same population, the
mean of the sample means, denoted by \(\mu_{\bar{X}}\),
equals the population mean \(\mu\).
• Thus
\[ \mu_{\bar{X}} = \mu \]
Properties of the Distribution of
Sample Means
2. The standard deviation of the sample means
will be smaller than the standard deviation of
the population, and it will be equal to the
population standard deviation divided by the
square root of the sample size.
• The standard deviation of the sample means,
denoted by \(\sigma_{\bar{X}}\), equals \(\sigma/\sqrt{n}\).
• Hence
\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \]
Properties of the Distribution of
Sample Means
• The standard deviation of the sample means is
called the standard error of the mean.
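Both properties can be illustrated with a small simulation. This is a sketch under assumed settings that are not from the slides: a hypothetical population that is uniform on [0, 1] (so μ = 0.5 and σ = √(1/12)), samples of size 25, and 5000 repeated samples. A simulation draws only a finite number of samples, so the agreement is approximate rather than exact.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(0)        # fixed seed so the run is repeatable
n, num_samples = 25, 5000     # hypothetical sample size and number of samples

# Hypothetical population: uniform on [0, 1], so mu = 0.5 and sigma = sqrt(1/12).
mu, sigma = 0.5, sqrt(1 / 12)

# Draw many samples of size n and record each sample mean.
sample_means = [
    sum(rng.random() for _ in range(n)) / n
    for _ in range(num_samples)
]

mean_of_means = mean(sample_means)
sd_of_means = pstdev(sample_means)
print(round(mean_of_means, 3), "vs population mean", mu)
print(round(sd_of_means, 4), "vs sigma / sqrt(n) =", round(sigma / sqrt(n), 4))
```

The population here is not bell-shaped, yet the spread of the sample means is still close to σ/√n, which is much smaller than σ.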
The Central Limit Theorem
• As the sample size n increases without limit,
the shape of the distribution of the sample
means taken with replacement from a
population with mean \(\mu\) and standard
deviation \(\sigma\) will approach a normal
distribution.
• As previously shown, this distribution will
have a mean \(\mu\) and a standard deviation \(\sigma/\sqrt{n}\).
The Central Limit Theorem
• If the sample size is sufficiently large, the
central limit theorem can be used to answer
questions about sample means in the same
manner that a normal distribution can be used
to answer questions about individual values.
• The only difference is that a new formula must
be used for the z values. It is
\[ z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \]
The Central Limit Theorem
• It’s important to remember two things when you
use the central limit theorem:
1. When the original variable is normally distributed,
the distribution of the sample means will be
normally distributed, for any sample size n.
2. When the distribution of the original variable
might not be normal, a sample size of 30 or more
is needed to use a normal distribution to
approximate the distribution of the sample
means. The larger the sample, the better the
approximation will be.
The Central Limit Theorem
• Example: Research info masters LTD reported
that children between the ages of 2 and 5
watch an average of 25 hours of television per
week. Assume the variable is normally
distributed and the standard deviation is 3
hours. If 20 children between the ages of 2
and 5 are randomly selected, find the
probability that the mean of the number of
hours they watch television will be greater
than 26.3 hours.
The Central Limit Theorem
Solution:
• We are looking for \(P(\bar{X} > 26.3)\).
• Since the variable is approximately normally
distributed, the distribution of sample means
will be approximately normal, with a mean of
25. The standard deviation of the sample
means is
\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{20}} = 0.671 \]
• Transforming 26.3 into z units we have
The Central Limit Theorem
• The z value is
\[ z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{26.3 - 25}{3/\sqrt{20}} = \frac{1.3}{0.671} = 1.94 \]
• The area to the right of 1.94 is 1.0000 - 0.9738
= 0.0262, or 2.62%.
• One can conclude that the probability of
obtaining a sample mean larger than 26.3
hours is 2.62% [i.e., \(P(\bar{X} > 26.3) = 2.62\%\)].
• Therefore P(z > 1.94) = 0.0262.
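The television example can be reproduced as a short Python sketch using the values given in the problem (mean 25 hours, standard deviation 3 hours, sample size 20). Because z is not rounded to two decimals here, the result can differ from the table answer in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 25, 3, 20
standard_error = sigma / sqrt(n)    # standard deviation of the sample means

z = (26.3 - mu) / standard_error    # z value for a sample mean of 26.3
p = 1 - NormalDist().cdf(z)         # P(sample mean > 26.3)
print(round(standard_error, 3), round(z, 2), round(p, 4))
```

Dividing by the standard error rather than by σ is the only change from the calculation for an individual value.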
READING ASSIGNMENT
The Normal Approximation to the Binomial
Distribution
EXPONENTIAL DISTRIBUTION
• The density curve corresponding to any
normal distribution is bell-shaped and
therefore symmetric.
• There are many practical situations in which
the variable of interest to an investigator
might have a skewed distribution.
• One family of distributions that has this
property is the gamma family.
• We consider a special case of gamma
distribution called the exponential distribution
EXPONENTIAL DISTRIBUTION
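• X is said to have an exponential distribution
with parameter \(\lambda\) (\(\lambda > 0\)) if the probability
density function of X is
\[ f(x; \lambda) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \]
• The formula for the exponential distribution
curve is given by
\[ y = \lambda e^{-\lambda x}, \quad x \ge 0 \]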
EXPONENTIAL DISTRIBUTION
• For a random variable X which follows an
exponential distribution:
• The mean is given by
\[ E(X) = \mu = \frac{1}{\lambda} \]
• The variance is
\[ \mathrm{Var}(X) = \sigma^2 = \frac{1}{\lambda^2} \]
• The standard deviation is
\[ \sigma = \sqrt{\sigma^2} = \frac{1}{\lambda} \]
EXPONENTIAL DISTRIBUTION
• Example: The article “Probabilistic Fatigue
Evaluation of Riveted Railway Bridges”
suggested the exponential distribution with
mean value 6 MPa as a model for the
distribution of stress range in certain bridge
connections. Let’s assume that this is in fact
the true model. Find the probability that the
stress range is at most 10 MPa.
EXPONENTIAL DISTRIBUTION
• Solution:
• The mean is \(\mu = 1/\lambda = 6\),
• which implies that \(\lambda = 1/6 \approx 0.1667\).
• \(P(X \le 10) = 1 - e^{-(0.1667)(10)} = 1 - 0.189 = 0.811\)
• Exercise: find the probability that the stress
range is between 5 and 10 MPa.
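The "at most 10 MPa" probability comes from the exponential cumulative distribution function, P(X ≤ x) = 1 - e^(-λx) for x ≥ 0. The sketch below uses an illustrative helper called `exp_cdf`, which is an assumed name and not notation from the slides.

```python
import math

def exp_cdf(x, lam):
    """P(X <= x) for an exponential distribution with rate lam; 0 for x < 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

mean_stress = 6.0          # MPa, from the example
lam = 1.0 / mean_stress    # rate parameter, about 0.1667

p_at_most_10 = exp_cdf(10, lam)
print(round(lam, 4), round(p_at_most_10, 3))
```

A probability for an interval follows by differencing: P(a ≤ X ≤ b) = exp_cdf(b, lam) - exp_cdf(a, lam).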
EXPONENTIAL DISTRIBUTION
• The exponential distribution is frequently used
as a model for the distribution of times
between the occurrence of successive events,
such as customers arriving at a service facility
or calls coming in to a switchboard.