4lecture Notes Continuous Probability Distributions

The document provides an overview of continuous probability distributions, focusing on continuous random variables and the Normal distribution. It explains key concepts such as probability density functions, properties of the Normal distribution, and how to calculate probabilities. Additionally, it covers the standard normal distribution, the normal approximation to the binomial distribution, and the Central Limit Theorem, along with various examples to illustrate these concepts.

Uploaded by

hoyiali98

Statistics 1S1/ 1P1 Slides

Rakesh Bhurtun

Department of Statistics

Rakesh Bhurtun (Rhodes University) Statistics 1S1/1P1 (slide 1) 1 / 32


Chapter 5: Continuous Probability Distributions

Aims
To introduce continuous random variables
To discuss the Normal distribution as an example of continuous
probability distribution
To discuss the normal approximation to the binomial



Continuous Random Variable

Definition
A continuous random variable (r.v.), denoted X, Y, Z, etc., is a random
variable that can assume any value on the numerical scale within the range
relevant to the data at hand. Examples: age, length, blood pressure.
Definition
The probability density function (as distinct from the probability mass
function of a discrete distribution) of a random variable X is a function such
that the area under the density curve between any two points a and b is equal
to the probability that X falls between a and b.
Thus the total area under the density curve over the entire range of
possible values of the random variable is 1.
Normal Distribution

It is the distribution of a normally distributed random variable.
The normal, or Gaussian or “bell-shaped,” distribution is the
cornerstone of most methods of estimation and hypothesis testing.
Many random variables, such as birth weights or blood pressures in the
general population, tend to follow approximately a normal distribution.
Using the normal distribution is desirable since it is easy to use and tables
for it are more widely available than are tables for other distributions.
Properties of the Normal Distribution

Distinct “bell-shaped” distribution

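The shape is determined by 2 parameters, the mean $\mu$ and the variance $\sigma^2$

If X has a normal distribution, then $X \sim N(\mu, \sigma^2)$, with density

$$ f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{1}{2\sigma^2}(x - \mu)^2\right], \qquad -\infty < x < \infty $$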
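As a rough numerical check of the density formula, the sketch below codes f(x) directly and approximates areas under it with a midpoint rule. The helper names `normal_pdf` and `area`, the step count, and the choice µ = 5, σ² = 1 are illustrative assumptions, not part of the notes.

```python
import math

def normal_pdf(x, mu, sigma2):
    """Density of N(mu, sigma2) at x, coded directly from the formula for f(x)."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def area(a, b, mu, sigma2, steps=20_000):
    """Midpoint-rule approximation of the area under the density between a and b."""
    width = (b - a) / steps
    return sum(normal_pdf(a + (i + 0.5) * width, mu, sigma2) for i in range(steps)) * width

mu, sigma2 = 5.0, 1.0                            # illustrative parameters
total_area = area(mu - 10, mu + 10, mu, sigma2)  # effectively the whole real line
prob_3_to_7 = area(3, 7, mu, sigma2)             # P(3 < X < 7)
print(round(total_area, 6), round(prob_3_to_7, 4))
```

The total area should come out very close to 1, and the area between 3 and 7 is the probability that X falls within two standard deviations of its mean.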
Shapes of the normal distribution

Figure 5.6: same spread but different centers


Figure 5.7: same center but different spreads
Calculating Probabilities
For a continuous probability distribution, probability ≡ area under the curve
Note that for a continuous r.v. X and a constant c, P (X > c) = P (X ≥ c)
Total area under the graph is 1.
The implication is
P (X > c) = 1 − P (X < c)
for some constant c.
[Figure: normal density split at c; the area to the right of c is P(X > c) and the area to the left is 1 − P(X > c)]
Since a normal random variable is continuous, the area above a single point
on the curve is zero, which means

P (X = c) = 0

for some constant c.


The distribution is symmetrical around the mean (centre), hence

P (X > µ + c) = P (X < µ − c)

[Figure: normal density with equal shaded tail areas P(X < µ − c) and P(X > µ + c) on either side of µ]

for some constant c.


If µ = 0, then
P (X > c) = P (X < −c)
Since the distribution is symmetrical and the total area is one, the
area to the left of the mean is equal to the area to the right of the
mean which is equal to 0.5.
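These area properties can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the parameters µ = 5, σ = 1 and the constant c = 1.3 are arbitrary choices made for illustration.

```python
from statistics import NormalDist

X = NormalDist(mu=5, sigma=1)   # illustrative normal distribution
c = 1.3                         # arbitrary constant

right_tail = 1 - X.cdf(X.mean + c)   # P(X > mu + c), via the complement rule
left_tail = X.cdf(X.mean - c)        # P(X < mu - c)
below_mean = X.cdf(X.mean)           # area to the left of the mean

print(right_tail, left_tail, below_mean)
```

The two tail areas agree up to floating-point error, and the area below the mean is one half.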
[Figure: density of the Normal(5, 1) distribution in three panels, shading P(3 < X < 7), P(X < 3) and P(X > 7) for X ∼ N(µ = 5, σ² = 1)]


The Standard Normal Distribution
A normal distribution with mean 0 and variance 1 is called a standard
normal distribution, i.e. Z ∼ N(0, 1).
This distribution is symmetric about 0.

About 68% of the area under the standard normal density lies
between +1 and -1,
about 95% of the area lies between +2 and -2, and
about 99% lies between +2.5 and -2.5.
P(−1 < Z < 1) = 0.68
P(−1.96 < Z < 1.96) = 0.95
P(−2.576 < Z < 2.576) = 0.99
For a general normally distributed random variable $X \sim N(\mu, \sigma^2)$,
we define Z as

$$ Z = \frac{X - \mu}{\sigma} \sim N(0, 1) $$

Note that any normal random variable can be converted into a standard
normal random variable, a process called standardization
[Figure: three normal curves, X ~ N(35, 2.5), X ~ N(61, 5) and X ~ N(16, 1.8), each mapped by Z = (X − μ)/σ onto the single standard normal curve Z ~ N(0, 1)]
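Standardization means that a probability for X can be read off the standard normal curve. The sketch below checks this for one value; the parameters µ = 35 and σ = 2.5 (taken here as a standard deviation) and the point x = 38 are assumptions chosen only for illustration.

```python
from statistics import NormalDist

mu, sigma = 35.0, 2.5          # illustrative mean and standard deviation
X = NormalDist(mu, sigma)
Z = NormalDist()               # standard normal

x = 38.0
z = (x - mu) / sigma           # standardized value

p_direct = X.cdf(x)            # P(X < x) computed on the original scale
p_standardized = Z.cdf(z)      # P(Z < z) computed on the standard scale
print(z, p_direct, p_standardized)
```

Both routes give the same probability, which is why a single table for Z is enough.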
Standard Normal Tables
Table 1: Areas under the Normal Curve

Tabulated probabilities Φ(z) represent the area to the left of z.
Only tabulated for positive z values; negative z values are calculated by
symmetry: Φ(−z) = 1 − Φ(z)

z      0.00     0.01     ...   0.09

0.0    0.5000   0.5040   ...   0.5359
0.1    0.5398   0.5438   ...   0.5753
0.2    0.5793   0.5832   ...   0.6141
0.3    0.6179   0.6217   ...   0.6517
0.4    0.6554   0.6591   ...   0.6879
0.5    0.6915   0.6950   ...   0.7224
...
4.0    0.99997  0.99997  ...   0.99998
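Rows of such a table can be regenerated in software, which is handy for checking a lookup. The following is a minimal sketch; the helper names `table_row` and `phi_negative` are assumptions introduced for this example.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

def table_row(z_tenths):
    """Phi(z) for z = z_tenths + 0.00, 0.01, ..., 0.09, rounded to 4 decimals."""
    return [round(Z.cdf(z_tenths + h / 100), 4) for h in range(10)]

def phi_negative(z):
    """Phi(-z) obtained by symmetry from the tabulated value Phi(z)."""
    return 1 - Z.cdf(z)

for tenths in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5):
    print(tenths, table_row(tenths))
```

Each printed list is one row of the table, read from the 0.00 column to the 0.09 column.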


Example (not in the book)
Consider Z ∼ N(0, 1). Obtain the following probabilities:
1 P(Z < 2.1)
2 P(Z > 2.42)
3 P(Z < −1.54)
4 P(1.5 < Z < 2.6)
5 P(Z > −1.82)
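Answers read from the table can be cross-checked in software. The sketch below evaluates the five probabilities with `NormalDist`; the variable names `p1` to `p5` are simply labels for the five parts.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

p1 = Z.cdf(2.1)                  # P(Z < 2.1)
p2 = 1 - Z.cdf(2.42)             # P(Z > 2.42), complement rule
p3 = Z.cdf(-1.54)                # P(Z < -1.54), equals 1 - Phi(1.54) by symmetry
p4 = Z.cdf(2.6) - Z.cdf(1.5)     # P(1.5 < Z < 2.6), difference of two left areas
p5 = 1 - Z.cdf(-1.82)            # P(Z > -1.82), equals Phi(1.82) by symmetry

for label, p in zip("12345", (p1, p2, p3, p4, p5)):
    print(label, round(p, 5))
```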
Example 5.20 pg120
Suppose a mild hypertensive is defined as a person whose DBP is between
90 and 100 mm Hg inclusive, and the subjects are 35- to 44-year old men
whose blood pressures are normally distributed with mean 80 and variance
144.
What is the probability that a randomly selected person from this
population will be a mild hypertensive?
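One possible way to evaluate this probability in software is sketched below. It treats DBP as exactly continuous and applies no correction for readings rounded to whole numbers, so a textbook answer that adjusts the limits for rounding may differ slightly; that modelling choice is an assumption of this sketch.

```python
from math import sqrt
from statistics import NormalDist

mu, variance = 80, 144
X = NormalDist(mu, sqrt(variance))   # NormalDist takes the standard deviation, here 12

# P(90 <= X <= 100); for a continuous r.v. the endpoints carry no probability
p_mild = X.cdf(100) - X.cdf(90)
print(round(p_mild, 4))
```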
Example (not in book)
Durations of clinical research studies awarded by a particular federal
agency are assumed to follow a normal distribution, with a mean of 24
months and a standard deviation of 2.3 months.
1 Find the probability that a study is awarded with a duration of more
than 20 months.
2 Find the probability that a study is awarded a duration between 16
and 29 months.
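The two probabilities can be sketched in Python as follows, computing the standardized limits explicitly so the steps mirror a table-based solution. The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 24.0, 2.3          # mean and standard deviation in months
Z = NormalDist()               # standard normal

# 1. P(X > 20): standardize, then take the right-hand area
z_20 = (20 - mu) / sigma
p_more_than_20 = 1 - Z.cdf(z_20)

# 2. P(16 < X < 29): difference of two left-hand areas
z_16 = (16 - mu) / sigma
z_29 = (29 - mu) / sigma
p_between = Z.cdf(z_29) - Z.cdf(z_16)

print(round(z_20, 2), round(p_more_than_20, 4))
print(round(z_16, 2), round(z_29, 2), round(p_between, 4))
```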
Inverse Normal Distribution

Used when we are given the probability (or area) and wish to
determine the corresponding Z or X value.

The (100 × u)th percentile of a standard normal distribution is
denoted by $z_u$, where $P(Z < z_u) = u$ and $Z \sim N(0, 1)$
To evaluate $z_u$ we locate the area u in the body of the normal table and then
read off the value $z_u$ that corresponds to this area.
Example (not in the book)
Consider a standard normal random variable, Z ∼ N(0, 1). Find the value
of a in each of the following:
1 P(Z < a) = 0.9;
2 P(Z < a) = 0.3;
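In software the reverse table lookup is done by the inverse of the cumulative distribution function. A minimal sketch using `NormalDist.inv_cdf` from the standard library:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

a_90 = Z.inv_cdf(0.9)   # value a with P(Z < a) = 0.9
a_30 = Z.inv_cdf(0.3)   # value a with P(Z < a) = 0.3; negative because 0.3 < 0.5

print(round(a_90, 3), round(a_30, 3))
```

With a table that lists only positive z, the second answer is found by looking up the area 1 − 0.3 = 0.7 and changing the sign of the result.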
Example (not in the book)
Suppose contents of bottles of water coming off a production line have a
normal distribution with mean 9.15 ounces and standard deviation 0.1.
1 If every bottle is labeled 9 ounces, what proportion of bottles contain
less than the labeled amount?
2 If only 3% of bottles exceed weight w , what is the value of w ?
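Part 1 is a forward calculation and part 2 an inverse one. The sketch below works both through the standard normal scale; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 9.15, 0.1          # ounces
Z = NormalDist()               # standard normal

# 1. Proportion of bottles containing less than the labelled 9 ounces
z_label = (9 - mu) / sigma
p_underfilled = Z.cdf(z_label)

# 2. Only 3% exceed w, so P(X < w) = 0.97; un-standardize the 97th percentile
z_97 = Z.inv_cdf(0.97)
w = mu + sigma * z_97

print(round(z_label, 2), round(p_underfilled, 4))
print(round(z_97, 3), round(w, 3))
```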
Normal Approximation to the Binomial

When n is large for X ∼ b(n, p), it is difficult to work out
probabilities directly
We consider a normal approximation to the binomial
If n is moderately large and p is near 0 or 1, then the binomial
distribution will be very positively or negatively skewed, respectively

If n is moderately large and p is not too extreme, then the binomial
distribution tends to be symmetric and is well approximated by a
normal distribution.
If X is a binomial random variable with parameters n and p, then
P(a ≤ X ≤ b) is approximated by the area under a Y ∼ N(np, npq)
curve from a − 1/2 to b + 1/2, where q = 1 − p.
The process of subtracting and adding 1/2 is called continuity correction
The normal distribution with mean np and variance npq can be used
to approximate a binomial distribution with parameters n and p when
npq = np(1 − p) ≥ 5, as recommended by your textbook.
Example 5.34, pg 132
Suppose that we want to compute the probability that between 50 and 75
(inclusive) of 100 white blood cells will be neutrophils, where the
probability that any one cell is a neutrophil is 0.6. These values are chosen
as proposed limits to the range of neutrophils in normal people, and we
wish to predict what proportion of people will be in the normal range
according to this definition.
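A sketch of this calculation in Python is given below. It applies the continuity-corrected normal approximation and, purely as a sanity check, also sums the exact binomial probabilities; the exact sum is an addition for comparison, not part of the textbook method.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.6
q = 1 - p
a, b = 50, 75                                   # inclusive limits

# Normal approximation Y ~ N(np, npq) with continuity correction
Y = NormalDist(mu=n * p, sigma=sqrt(n * p * q))
approx = Y.cdf(b + 0.5) - Y.cdf(a - 0.5)

# Exact binomial probability, summed term by term
exact = sum(comb(n, k) * p**k * q**(n - k) for k in range(a, b + 1))

print(n * p * q >= 5, round(approx, 4), round(exact, 4))
```

Here npq = 24, so the rule of thumb for using the approximation is satisfied.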
Example (not in the book)
On a Saturday afternoon, 160 customers will be observed during check-out
and the number paying by card (i.e. credit or debit) will be recorded.
Records from the store suggest that 45% of customers pay by card.
Estimate the probability that:
1 more than 60 will pay by card;
2 between 80 and 90, inclusive, will pay by card.
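The same recipe applies here. In the sketch below, "more than 60" is read as X ≥ 61, so the corrected lower limit is 60.5; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 160, 0.45
mean = n * p                    # np = 72
variance = n * p * (1 - p)      # npq = 39.6, which is at least 5
Y = NormalDist(mu=mean, sigma=sqrt(variance))

# 1. P(X > 60) = P(X >= 61): area to the right of 60.5
p_more_than_60 = 1 - Y.cdf(60.5)

# 2. P(80 <= X <= 90): area from 79.5 to 90.5
p_80_to_90 = Y.cdf(90.5) - Y.cdf(79.5)

print(round(p_more_than_60, 4), round(p_80_to_90, 4))
```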
Estimation of the Mean of a Distribution
Note that a natural estimator of the population mean $\mu$ is
the sample mean

$$ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \quad \text{(point estimate)} $$

$\bar{x}$ is a single realization of the random variable $\bar{X}$ over all possible
samples of size n that could have been selected from the population
The distribution of $\bar{X}$, called a sampling distribution, is the
distribution of values of $\bar{x}$ over all possible samples of size n that
could have been selected from the reference population
Let $X_1, \ldots, X_n$ be a random sample drawn from some population with
mean $\mu$ and variance $\sigma^2$. Then for the sample mean $\bar{X}$,
$E(\bar{X}) = \mu$ regardless of the underlying distribution.
The set of sample means in repeated random samples of size n from
this population has variance $\sigma^2/n$, i.e. $\mathrm{Var}(\bar{X}) = \sigma^2/n$
The standard deviation of this set of sample means is thus $\sigma/\sqrt{n}$ and is
referred to as the standard error (se) of the mean.
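A small simulation can make the sampling distribution concrete. The sketch below repeatedly draws samples of size n from a normal population and compares the spread of the resulting sample means with the standard error σ/√n. The population parameters, the number of repetitions and the random seed are all arbitrary assumptions.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)                 # fixed seed so the run is repeatable
mu, sigma = 112.0, 20.6                # illustrative population mean and sd
n, reps = 10, 5000                     # sample size and number of repeated samples

# One sample mean per repeated sample of size n
sample_means = [sum(rng.gauss(mu, sigma) for _ in range(n)) / n for _ in range(reps)]

standard_error = sigma / sqrt(n)       # theoretical sd of the sample mean
print(round(mean(sample_means), 2), round(stdev(sample_means), 2), round(standard_error, 2))
```

The average of the simulated sample means should sit close to µ, and their standard deviation close to σ/√n.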
Central Limit Theorem

Definition
Let $X_1, \ldots, X_n$ be a random sample from some population with mean $\mu$
and variance $\sigma^2$. Then for large n, $\bar{X} \sim N(\mu, \sigma^2/n)$ approximately, even if the
underlying distribution of individual observations in the population is not normal.

Note: if the underlying population is normal, then
$\bar{X} \sim N(\mu, \sigma^2/n)$ exactly, for any n.
This theorem allows us to perform statistical inference based on the
approximate normality of the sample mean despite the nonnormality
of the distribution of individual observations.
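The theorem can be illustrated by simulation with a clearly non-normal population. The sketch below uses an exponential population with mean 1 and variance 1 (an assumption chosen for its strong skew) and checks how often the sample mean lands within 1.96 standard errors of µ; under approximate normality this should happen about 95% of the time.

```python
import random
from math import sqrt

rng = random.Random(1)          # fixed seed so the run is repeatable
mu, sigma = 1.0, 1.0            # mean and sd of an Exponential(rate=1) population
n, reps = 50, 5000              # sample size and number of repeated samples

se = sigma / sqrt(n)            # standard error of the mean
inside = 0
for _ in range(reps):
    x_bar = sum(rng.expovariate(1.0) for _ in range(n)) / n
    if abs(x_bar - mu) < 1.96 * se:
        inside += 1

coverage = inside / reps        # proportion of sample means within 1.96 se of mu
print(round(coverage, 3))
```

Repeating the experiment with a much smaller n would typically show a poorer match, since the normal approximation needs n to be reasonably large for skewed populations.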
Example 6.27
Compute the probability that the mean birthweight from a sample of 10
infants from the Boston City Hospital population in Table 6.2 (p155) will
fall between 98.0 and 126.0 oz if the mean birthweight for the 1000
birthweights is 112 with a standard deviation of 20.6 oz assuming the
birthweights are normally distributed.
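Solution
For n = 10, $\bar{X} \sim N\left(\mu_{\bar{X}} = \mu = 112,\ \sigma^2_{\bar{X}} = \frac{\sigma^2}{n} = \frac{20.6^2}{10}\right)$

$$
\begin{aligned}
P(98 < \bar{X} < 126) &= P\left(\frac{98 - 112}{20.6/\sqrt{10}} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{126 - 112}{20.6/\sqrt{10}}\right) \\
&= P(-2.15 < Z < 2.15) \\
&= P(Z < 2.15) - P(Z < -2.15) \\
&= P(Z < 2.15) - P(Z > 2.15) \\
&= P(Z < 2.15) - [1 - P(Z < 2.15)] \\
&= 0.98422 - 1 + 0.98422 \\
&= 0.9684
\end{aligned}
$$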
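The hand calculation can be cross-checked in software. The sketch below builds the distribution of the sample mean directly, using the standard error as its standard deviation; because it does not round z to two decimals, the result may differ from the table answer in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 112.0, 20.6, 10
se = sigma / sqrt(n)                   # standard error of the mean

X_bar = NormalDist(mu, se)             # sampling distribution of the sample mean
p_between = X_bar.cdf(126) - X_bar.cdf(98)

z_upper = (126 - mu) / se              # standardized upper limit, about 2.15
print(round(z_upper, 2), round(p_between, 4))
```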
Example (not in the book)
According to the growth chart that doctors use as a reference, the heights
of two-year-old boys are normally distributed with mean 34.5 inches and
standard deviation 1.3 inches. For a random sample of 12 two-year-old
boys, find the probability that the sample mean will be between 33.3 and
35.1 inches.
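Solution
Let X = the height of a two-year-old boy; then $\bar{X} \sim N\left(34.5, \frac{1.3^2}{12}\right)$.

$$
\begin{aligned}
P(33.3 < \bar{X} < 35.1) &= P\left(\frac{33.3 - 34.5}{1.3/\sqrt{12}} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{35.1 - 34.5}{1.3/\sqrt{12}}\right) \\
&= P(-3.20 < Z < 1.60) \\
&= P(Z < 1.60) - P(Z < -3.20) \\
&= P(Z < 1.60) - P(Z > 3.20) \\
&= P(Z < 1.60) - [1 - P(Z < 3.20)] \\
&= 0.9452 - 1 + 0.99931 \\
&= 0.94451
\end{aligned}
$$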
Example (not in the book)
Suppose the amount of a popular sport drink in bottles leaving a filling machine has a normal
distribution with mean 101.5ml and standard deviation 1.6ml.
1 If the bottles are labeled 100ml, what proportion of the bottles contain less than the
labeled amount?
2 If four bottles are randomly selected, find the mean and the standard deviation of the
average content. Give its distribution.
3 What is the probability that the average content of four bottles is less than 100ml?
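Solution
Let X be the r.v. representing the amount of sport drink in a bottle, $X \sim N(101.5, 1.6^2)$.

1
$$
\begin{aligned}
P(X < 100) &= P\left(\frac{X - \mu}{\sigma} < \frac{100 - 101.5}{1.6}\right) \\
&= P(Z < -0.94) \\
&= P(Z > 0.94) \\
&= 1 - P(Z < 0.94) \\
&= 1 - 0.82639 \\
&= 0.1736
\end{aligned}
$$

2 Given n = 4, $E(\bar{X}) = \mu = 101.5$ and $\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n} = \frac{1.6^2}{4}$, and hence $\mathrm{sd}(\bar{X}) = \frac{1.6}{2}$,
∴ $\bar{X} \sim N\left(101.5, \frac{1.6^2}{4}\right)$.

3
$$
\begin{aligned}
P(\bar{X} < 100) &= P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{100 - 101.5}{1.6/2}\right) \\
&= P(Z < -1.88) \\
&= P(Z > 1.88) \\
&= 1 - P(Z < 1.88) \\
&= 1 - 0.96995 \\
&= 0.0301
\end{aligned}
$$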
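The contrast between a single bottle and the average of four bottles can be seen side by side in a short script. This is a sketch with illustrative variable names; it uses unrounded z values, so its output can differ from the table-based answers in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 101.5, 1.6, 4

single = NormalDist(mu, sigma)             # one bottle: X ~ N(101.5, 1.6^2)
average = NormalDist(mu, sigma / sqrt(n))  # mean of four: sd is the standard error

p_single = single.cdf(100)                 # P(X < 100)
p_average = average.cdf(100)               # P(X-bar < 100)

print(round(average.stdev, 2), round(p_single, 4), round(p_average, 4))
```

Averaging shrinks the spread from 1.6 to 0.8, which is why an average below 100 ml is much less likely than a single bottle below 100 ml.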
Example (not in the book)
A company that manufactures car mufflers finds that the labour to set up and run an
automatic machine has a mean µ = 2.1 hours and standard deviation σ = 0.9 hours. For a
random sample of size n = 49,
1 Determine the mean and standard deviation of $\bar{X}$, the sample mean, and give the
distribution of $\bar{X}$.
2 Evaluate $P(\bar{X} > 2.2)$
Solution: Let X be the r.v. representing the labour to set up and run an automatic machine,
$X \sim (\mu = 2.1, \sigma^2 = 0.9^2)$
1 For n = 49, $E(\bar{X}) = 2.1$ and $\mathrm{Var}(\bar{X}) = \frac{0.9^2}{49}$, ∴ $\mathrm{sd}(\bar{X}) = \frac{0.9}{7}$, thus (by large
sample argument) $\bar{X} \sim N\left(\mu_{\bar{X}} = 2.1,\ \sigma^2_{\bar{X}} = \frac{0.9^2}{49}\right)$

2
$$
\begin{aligned}
P(\bar{X} > 2.2) &= P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} > \frac{2.2 - 2.1}{0.9/7}\right) \\
&= P(Z > 0.78) \\
&= 1 - P(Z < 0.78) \\
&= 1 - 0.78230 \\
&= 0.2177
\end{aligned}
$$
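A quick software check of the last step is sketched below. The population here is not assumed normal, so the normal distribution for the sample mean rests on the large-sample (Central Limit Theorem) argument; the code uses the unrounded z value, so the answer can differ slightly from the table-based one.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2.1, 0.9, 49
se = sigma / sqrt(n)                 # standard error of the mean, 0.9 / 7

X_bar = NormalDist(mu, se)           # approximate distribution of the sample mean
p_above = 1 - X_bar.cdf(2.2)         # P(X-bar > 2.2)

z = (2.2 - mu) / se                  # standardized cut-off, about 0.78
print(round(z, 2), round(p_above, 4))
```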
