Sampling
Distributions
Introduction
In this chapter we study some
relationships between population and
sample characteristics.
Generally, we are interested in
population parameters such as
Mean return
Variability of demand
Proportion of defectives in a production line
Introduction
Such parameters are usually unknown.
Therefore, we draw a sample from the
population and use it to make
inferences about the parameters.
This is done by constructing sample
statistics that are closely related to
the population parameters.
Introduction
Samples are random, so a sample
statistic is a random variable.
As such, it has a sampling distribution.
Sampling distributions for various
statistics are studied in this chapter.
9.1 Sampling Distribution of
the Mean
Example 1
A die is thrown infinitely many times. Let X
represent the number of spots showing on
any throw.
The probability distribution of X is

x      1    2    3    4    5    6
p(x)  1/6  1/6  1/6  1/6  1/6  1/6

\(E(X) = 1(1/6) + 2(1/6) + 3(1/6) + \dots + 6(1/6) = 3.5\)
\(V(X) = (1 - 3.5)^2(1/6) + (2 - 3.5)^2(1/6) + \dots + (6 - 3.5)^2(1/6) = 2.92\)
Throwing a die twice – sample mean
Suppose we want to estimate \(\mu\)
from the mean \(\bar{x}\) of a sample of
size n = 2.
What is the distribution of \(\bar{x}\)?
Throwing a die twice – sample mean
These are all the possible pairs of values for the 2 throws, and the mean of each pair.

Sample  Pair  Mean    Sample  Pair  Mean    Sample  Pair  Mean
   1    1,1   1         13    3,1   2         25    5,1   3
   2    1,2   1.5       14    3,2   2.5       26    5,2   3.5
   3    1,3   2         15    3,3   3         27    5,3   4
   4    1,4   2.5       16    3,4   3.5       28    5,4   4.5
   5    1,5   3         17    3,5   4         29    5,5   5
   6    1,6   3.5       18    3,6   4.5       30    5,6   5.5
   7    2,1   1.5       19    4,1   2.5       31    6,1   3.5
   8    2,2   2         20    4,2   3         32    6,2   4
   9    2,3   2.5       21    4,3   3.5       33    6,3   4.5
  10    2,4   3         22    4,4   4         34    6,4   5
  11    2,5   3.5       23    4,5   4.5       35    6,5   5.5
  12    2,6   4         24    4,6   5         36    6,6   6
The distribution of \(\bar{x}\) when n = 2
Calculating the relative frequency of each value
of \(\bar{x}\), we have the following results:

x-bar            1     1.5   2.0   2.5   3.0   3.5   4.0   4.5   5.0   5.5   6.0
Frequency        1     2     3     4     5     6     5     4     3     2     1
Relative freq    1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

Notice there are 36 possible pairs of values:
1,1  1,2  …  1,6
2,1  2,2  …  2,6
…
6,1  6,2  …  6,6
For example, (1+1)/2 = 1; (1+2)/2 = (2+1)/2 = 1.5; (1+3)/2 = (2+2)/2 = (3+1)/2 = 2.
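The frequency table for n = 2 can be reproduced by brute force. The sketch below is only an illustration (the variable names are ours, not from the slides): it lists all 36 ordered pairs, averages each pair, and counts how often each mean occurs. Exact fractions are used, so relative frequencies print in reduced form (for example 2/36 appears as 1/18).

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Population mean and variance of a single throw.
mu = Fraction(sum(faces), 6)
var = sum((Fraction(x) - mu) ** 2 for x in faces) / 6

# All 36 equally likely ordered pairs and the mean of each pair.
pair_means = [Fraction(a + b, 2) for a, b in product(faces, repeat=2)]
freq = Counter(pair_means)

# Relative frequencies give the sampling distribution of the mean for n = 2.
dist = {m: Fraction(c, 36) for m, c in sorted(freq.items())}
mean_of_means = sum(m * p for m, p in dist.items())
var_of_means = sum((m - mean_of_means) ** 2 * p for m, p in dist.items())

for m, p in dist.items():
    print(float(m), freq[m], p)
print(float(mu), float(var), float(mean_of_means), float(var_of_means))
```

The last line prints the population mean and variance next to the mean and variance of the sample mean, so the two can be compared directly.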
The Relationship between the
sample size and the sampling
distribution of the sample mean
[Figure: sampling distributions of \(\bar{x}\) for n = 5, n = 10 and n = 25]

n = 5     \(\mu_{\bar{x}} = 3.5\)    \(\sigma^2_{\bar{x}} = .5833\ (= \sigma^2_x / 5)\)
n = 10    \(\mu_{\bar{x}} = 3.5\)    \(\sigma^2_{\bar{x}} = .2917\ (= \sigma^2_x / 10)\)
n = 25    \(\mu_{\bar{x}} = 3.5\)    \(\sigma^2_{\bar{x}} = .1167\ (= \sigma^2_x / 25)\)

As the sample size changes, the
mean of the sample mean does not
change!
As the sample size increases, the
variance of the sample mean
decreases!
Also, note the interesting relationship
between the sample size and the
variance of the sample mean.
We’ll formalize this relationship soon.
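The figures for n = 5, 10 and 25 can be checked without simulation. The following is a minimal sketch, assuming a fair die; the function names are hypothetical. It builds the exact distribution of the total of n throws by repeated convolution, then computes the mean and variance of the sample mean from it.

```python
from fractions import Fraction

def sum_distribution(n):
    """Exact distribution of the total of n fair dice, built by repeated convolution."""
    dist = {0: Fraction(1)}
    for _ in range(n):
        nxt = {}
        for total, p in dist.items():
            for face in range(1, 7):
                nxt[total + face] = nxt.get(total + face, Fraction(0)) + p / 6
        dist = nxt
    return dist

def mean_and_variance_of_sample_mean(n):
    """Mean and variance of the sample mean (total divided by n)."""
    dist = sum_distribution(n)
    mean = sum(Fraction(t, n) * p for t, p in dist.items())
    var = sum((Fraction(t, n) - mean) ** 2 * p for t, p in dist.items())
    return mean, var

results = {n: mean_and_variance_of_sample_mean(n) for n in (5, 10, 25)}
for n, (mean, var) in results.items():
    print(n, float(mean), round(float(var), 4))
```

The mean stays at 3.5 for every n, while the variance equals the population variance 35/12 divided by n.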
The Sample Variance
Demonstration: why is the variance of the sample
mean smaller than the population variance?
Let us take samples of two observations.
[Figure: number line marked 1, 1.5, 2, 2.5, 3, showing the population and the sample means 1.5, 2 and 2.5]
Compare the range of the population
to the range of the sample mean.
The Central Limit Theorem
If a random sample is drawn from any
population, the sampling distribution of the
sample mean is:
Normal if the parent population is normal,
Approximately normal if the parent population is
not normal, provided the sample size is sufficiently
large.
The larger the sample size, the more closely
the sampling distribution of \(\bar{x}\) will resemble a
normal distribution.
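A small simulation can illustrate the theorem for a population that is clearly not normal. This is a sketch under our own assumptions, none of which come from the slides: an exponential population with mean 1 and standard deviation 1, a sample size of n = 40, 5000 repetitions, and a fixed seed. If the sample mean is approximately normal, about 95% of the sample means should fall within 1.96 standard errors of the population mean.

```python
import random
from statistics import NormalDist, fmean

random.seed(0)  # fixed seed so the run is repeatable

n = 40            # sample size (assumed "sufficiently large" for this illustration)
reps = 5000       # number of samples drawn
mu = sigma = 1.0  # mean and standard deviation of an exponential(1) population

# Draw many samples from the skewed population and record each sample mean.
sample_means = [fmean(random.expovariate(1.0) for _ in range(n)) for _ in range(reps)]

# Share of sample means inside mu +/- 1.96 * sigma / sqrt(n).
half_width = NormalDist().inv_cdf(0.975) * sigma / n ** 0.5
inside = sum(abs(m - mu) <= half_width for m in sample_means) / reps
print(round(fmean(sample_means), 3), round(inside, 3))
```

Repeating the run with a smaller n (say 3) should push the share further from .95, which is one informal way to see what "sufficiently large" means.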
The Parameters of the
Sampling Distribution of \(\bar{x}\)
The mean of \(\bar{x}\) is equal to the mean of the
parent population:
\(\mu_{\bar{x}} = \mu_x\)
The variance of \(\bar{x}\) is equal to the parent
population variance divided by n:
\(\sigma^2_{\bar{x}} = \frac{\sigma^2_x}{n}\)
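Both results follow from the laws of expected value and variance. A short derivation, assuming the observations \(X_1, \dots, X_n\) are independent draws from a population with mean \(\mu_x\) and variance \(\sigma^2_x\) (the independence assumption is stated here explicitly; the slide does not spell it out):

```latex
\begin{align*}
\mu_{\bar{x}}
  &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n}\sum_{i=1}^{n} E(X_i)
   = \frac{1}{n}\, n\,\mu_x
   = \mu_x \\
\sigma^2_{\bar{x}}
  &= V\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^2}\sum_{i=1}^{n} V(X_i)
   = \frac{1}{n^2}\, n\,\sigma^2_x
   = \frac{\sigma^2_x}{n}
\end{align*}
```

For the die, \(\sigma^2_x = 35/12 \approx 2.92\), so a sample of n = 5 gives \(2.92 / 5 \approx .583\).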
The Sampling Distribution of \(\bar{x}\)
- Example
Example 2
The amount of soda pop in each bottle is
normally distributed with a mean of 32.2
ounces and a standard deviation of .3
ounces.
Find the probability that a bottle bought by
a customer will contain more than 32
ounces.
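The Sampling Distribution of \(\bar{x}\)
- Example
Example 2
Solution
The random variable X is the amount of soda in a bottle.
\(P(x > 32) = P\left(\frac{x - \mu}{\sigma_x} > \frac{32 - 32.2}{.3}\right) = P(z > -.67) = 0.7486\)
[Figure: normal curve with \(\mu = 32.2\); the area to the right of x = 32 is 0.7486]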
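The Sampling Distribution of \(\bar{x}\)
Find the probability that a carton of four bottles
will have a mean of more than 32 ounces of
soda per bottle.
Solution
Define the random variable as the mean amount of soda
per bottle.
\(P(\bar{x} > 32) = P\left(\frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} > \frac{32 - 32.2}{.3/\sqrt{4}}\right) = P(z > -1.33) = 0.9082\)
[Figure: normal curve with \(\mu_{\bar{x}} = 32.2\); the area to the right of \(\bar{x} = 32\) is 0.9082]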
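Both parts of Example 2 can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are ours. Because the code does not round z to two decimals as the normal table does, its results should be close to, but not exactly, the table values .7486 and .9082.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 32.2, 0.3   # ounces, from Example 2

# One bottle: X ~ Normal(mu, sigma)
p_bottle = 1 - NormalDist(mu, sigma).cdf(32)

# Mean of a carton of n = 4 bottles: x-bar ~ Normal(mu, sigma / sqrt(n))
n = 4
p_carton = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(32)

print(round(p_bottle, 4), round(p_carton, 4))
```

The carton probability is larger because the standard deviation of the mean of four bottles is half that of a single bottle.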
The Sampling Distribution of \(\bar{x}\)
Example 3
The average weekly income of B.B.A. graduates
one year after graduation is $600.
Suppose the distribution of weekly income has a
standard deviation of $100. What is the probability
that 35 randomly selected graduates have an
average weekly income of less than $550?
Solution
\(P(\bar{x} < 550) = P\left(\frac{\bar{x} - \mu}{\sigma_{\bar{x}}} < \frac{550 - 600}{100/\sqrt{35}}\right) = P(z < -2.96) = 0.0015\)
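The standardization in Example 3 can be reproduced in a few lines. This is a sketch with our own variable names, using the values given in the example (mean 600, standard deviation 100, n = 35).

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 600, 100, 35      # values given in Example 3
std_error = sigma / sqrt(n)      # standard deviation of the sample mean
z = (550 - mu) / std_error       # standardized value of the sample mean 550
p = NormalDist().cdf(z)          # P(x-bar < 550)
print(round(std_error, 2), round(z, 2), round(p, 4))
```

The standard error is about 16.9, so a sample mean of 550 lies almost three standard errors below 600.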
The Sampling Distribution of \(\bar{x}\)
Example 3 – continued
If a random sample of 35 graduates actually had
an average weekly income of $550, what would
you conclude about the validity of the claim that
the average weekly income is $600?
Solution
With \(\mu\) = 600, the probability of observing a sample mean as
low as 550 is very small (0.0015). The claim that the
mean weekly income is $600 is probably unjustified.
It is more reasonable to assume that \(\mu\) is smaller
than $600, because then a sample mean of $550
becomes more probable.
9.2 Sampling Distribution of
a Sample Proportion (\(\hat{p}\))
The parameter of interest for qualitative
(nominal) data is the proportion of times
a particular outcome (success) occurs for
a given population.
This is the motivation for studying the
distribution of the sample proportion.
9.2 Sampling Distribution of
a Sample Proportion (\(\hat{p}\))
Let X be the number of times an event of interest takes
place (we can call such an event a success, just like the
definition we used for the binomial experiment).
The sample proportion is the number of successes divided by the sample size:
\(\hat{p} = \frac{X}{n}\)
9.2 Sampling Distribution of
a Sample Proportion (\(\hat{p}\))
Since X is binomial, probabilities for \(\hat{p}\) can
be calculated from the binomial
distribution.
Yet, for inference about \(\hat{p}\) we prefer to use
the normal approximation to the binomial.
Approximate Sampling Distribution
of a Sample Proportion
From the laws of expected value and
variance, it can be shown that \(\mu_{\hat{p}} = p\) and
\(\sigma^2_{\hat{p}} = p(1-p)/n\).
Z is calculated by:
\(Z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}\)
If both np > 5 and n(1-p) > 5, then Z is
approximately standard normal.
Approximate Sampling Distribution
of a Sample Proportion
Example 5
A state representative received 52% of the
votes in the last election.
One year later the representative wanted
to study his popularity.
If his popularity has not changed, what is
the probability that more than half of a
sample of 300 voters would vote for him?
Approximate Sampling Distribution
of a Sample Proportion
Example 5
Solution
The number of respondents who prefer the
representative is binomial with n = 300 and p
= .52. Thus, np = 300(.52) = 156 > 5 and
n(1-p) = 300(1-.52) = 144 > 5. The normal
approximation can be applied here:
\(P(\hat{p} > .50) = P\left(\frac{\hat{p} - p}{\sqrt{p(1-p)/n}} > \frac{.50 - .52}{.0288}\right) = P(z > -.69) = .7549\)
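Since the number of successes is binomial, the normal approximation in Example 5 can be compared with the exact binomial probability. The sketch below is illustrative only, with our own variable names. It reads "more than half of 300" as 151 or more voters, and it applies no continuity correction, matching the slide's calculation, so the two numbers are expected to differ somewhat.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 300, 0.52

# Rule of thumb from the slides: both np and n(1-p) should exceed 5.
assert n * p > 5 and n * (1 - p) > 5

# Normal approximation: p-hat ~ Normal(p, sqrt(p(1-p)/n))
std_error = sqrt(p * (1 - p) / n)
p_normal = 1 - NormalDist(p, std_error).cdf(0.50)

# Exact binomial probability that 151 or more of the 300 voters prefer him.
p_exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(151, n + 1))

print(round(std_error, 4), round(p_normal, 4), round(p_exact, 4))
```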
Using Sampling Distributions for
Inference
Sampling distributions can be used to make
inferences about population parameters.
For example, let us look at an inference about the
population mean.
Generally, we’ll compare the actual sample mean
with a hypothesized value of the unknown
population mean, and make an informed decision
about the likelihood of this hypothesis.
Using Sampling Distributions for
Inference
Let us guess what the value of \(\mu\) is, and build a symmetrical interval
around \(\mu\) large enough to make it very likely that the sample mean
falls inside it.
If the sample mean falls outside the interval (although this is very
unlikely), we tend to believe that \(\mu\) is different from the value we
guessed.
The sampling distribution of the sample mean helps in performing the
calculations.
[Figure: sampling distribution of \(\bar{x}\) with a bracketed interval around \(\mu\); there is a large probability that \(\bar{x}\) falls inside it]
Using Sampling Distributions for
Inference
Suppose .95 is considered a sufficiently large probability that
the sample mean falls inside the interval.
Let us build a symmetrical interval around \(\mu\).
We need lower and upper limits such that
\(P(\text{lower limit} \le \bar{x} \le \text{upper limit}) = .95\).
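Using Sampling Distributions for
Inference
Performing the usual standardization, we find that the interval covering
95% of the distribution of the sample mean is:
\(\mu - 1.96\,\frac{\sigma}{\sqrt{n}} \le \bar{x} \le \mu + 1.96\,\frac{\sigma}{\sqrt{n}}\)
[Figure: sampling distribution of \(\bar{x}\); the area between \(\mu - 1.96\,\sigma/\sqrt{n}\) and \(\mu + 1.96\,\sigma/\sqrt{n}\) is 0.95]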
Using Sampling Distributions for
Inference
Now let us apply this interval to Example 3, where n = 35.
\(P\left(\mu - 1.96\,\frac{\sigma}{\sqrt{n}} \le \bar{x} \le \mu + 1.96\,\frac{\sigma}{\sqrt{n}}\right) = .95\)
\(P\left(600 - 1.96\,\frac{100}{\sqrt{35}} \le \bar{x} \le 600 + 1.96\,\frac{100}{\sqrt{35}}\right) = .95\)
Which reduces to \(P(566.9 \le \bar{x} \le 633.1) = .95\)
Conclusion
There is a 95% chance that the sample mean falls within
the interval [566.9, 633.1] if the population mean is
600.
Since the sample mean was 550, the population mean
is probably not 600.
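The interval check can be written as a short script. This is a sketch with our own variable names. It takes the hypothesized mean 600, the standard deviation 100 and the sample size n = 35 as stated in Example 3, and it obtains the 1.96 multiplier from the inverse normal CDF instead of a table.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n = 600, 100, 35          # hypothesized mean and Example 3 values
z = NormalDist().inv_cdf(0.975)       # about 1.96 for a central 95% interval
half_width = z * sigma / sqrt(n)
lower, upper = mu0 - half_width, mu0 + half_width

x_bar = 550                           # observed sample mean
inside = lower <= x_bar <= upper
print(round(lower, 1), round(upper, 1), inside)
```

When `inside` is false, the observed sample mean lies outside the 95% interval built around the hypothesized mean, which is the basis for doubting that hypothesis.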
Optional: Sampling Distribution of the
Difference Between Two Means
The difference between two means can
become a parameter of interest when the
comparison between two populations is
studied.
To make an inference about \(\mu_1 - \mu_2\) we
observe the distribution of \(\bar{x}_1 - \bar{x}_2\).
9.3 Normal Distribution of the
Difference Between two Sample
Means
The distribution of \(\bar{x}_1 - \bar{x}_2\) is normal if
The two samples are independent, and
The parent populations are normally
distributed.
If the two populations are not both
normally distributed, but the sample
sizes are 30 or more, the distribution of
\(\bar{x}_1 - \bar{x}_2\)
is approximately normal.
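9.3 Normal Distribution of the
Difference Between two Sample
Means
Applying the laws of expected value
and variance we have:
\(\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2\)
\(\sigma^2_{\bar{x}_1 - \bar{x}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\)
We can define:
\(Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}\)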
9.3 Normal Distribution of the
Difference Between two Sample
Means
Example 6
The mean starting salaries of MBA students from
two universities (WLU and UWO) are
$62,000 (stand. dev. = $14,500) and
$60,000 (stand. dev. = $18,300), respectively.
What is the probability that the sample mean of
WLU students will exceed the sample mean of
UWO students? (\(n_{WLU}\) = 50; \(n_{UWO}\) = 60)
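9.3 Normal Distribution of the
Difference Between two Sample
Means
Example 6 – Solution
We need to determine \(P(\bar{x}_1 - \bar{x}_2 > 0)\).
\(\mu_1 - \mu_2 = 62,000 - 60,000 = \$2,000\)
\(\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{14,500^2}{50} + \frac{18,300^2}{60}} = \$3,128\)
\(P(\bar{x}_1 - \bar{x}_2 > 0) = P\left(\frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} > \frac{0 - 2000}{3128}\right) = P(z > -.64) = .5 + .2389 = .7389\)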
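Example 6 can be verified numerically in the same way. The sketch below uses our own variable names and treats the difference of sample means as normal with the mean and standard error given by the formulas for the difference between two sample means. Its result should be close to the table value .7389, with a small gap because z is not rounded to two decimals.

```python
from math import sqrt
from statistics import NormalDist

mu1, sigma1, n1 = 62_000, 14_500, 50   # WLU
mu2, sigma2, n2 = 60_000, 18_300, 60   # UWO

mean_diff = mu1 - mu2                                # mean of x1-bar minus x2-bar
std_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)    # its standard deviation
p = 1 - NormalDist(mean_diff, std_error).cdf(0)      # P(x1-bar - x2-bar > 0)
print(mean_diff, round(std_error), round(p, 4))
```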