
Understanding Sampling Distributions

The document provides an introduction to sampling distributions, emphasizing the role of statistics in making inferences about populations based on sample data. It covers key concepts such as random samples, sample mean, sample variance, and the Central Limit Theorem, which states that the distribution of the sample mean approaches a normal distribution as the sample size increases. Additionally, it discusses the implications of these concepts for statistical inference and the importance of using appropriate sampling methods.

Sampling Distributions

Gül İnan

Department of Mathematics

November 25, 2025
Introduction to Sampling Distributions


Introduction

Statistics concerns itself mainly with conclusions and predictions
resulting from chance outcomes that occur in carefully planned
experiments or investigations.
Drawing such conclusions usually involves taking sample
observations from a given population and using the sample results
to make inferences about the population, such as its mean or variance.


Role of Statistics in Inference

To do this requires that we first find the distributions of certain
functions of the random variables whose values make up the sample,
called statistics.
An example of such a statistic is the sample mean.
The properties of these distributions then allow us to make
probability statements about the resulting inferences drawn from the
sample about the population.
The theory to be given in this chapter forms an important
foundation for the theory of statistical inference.


Population

Definition
A set of numbers from which a sample is drawn is referred to as a
population.
The distribution of the numbers constituting a population is called
the population distribution.


Examples of Samples and Populations

Suppose a teacher wants to know how many hours students study
each week.
She randomly selects 5 students from a class of 40 and records their
study times.
A layperson might say that these 5 students constitute the sample.
In statistics, it is preferable to regard the study times of these 5
students as a sample drawn from the population, which consists of
the study times of all 40 students.
Both the population and the sample consist of numerical values.
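
The teacher's selection can be sketched in Python as a simple random sample drawn without replacement. The study times below are simulated placeholder values rather than data from the example, and the names `population` and `sample` are chosen here only for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: weekly study hours of all 40 students
# (simulated values, assumed purely for illustration).
population = [round(random.uniform(2, 20), 1) for _ in range(40)]

# Simple random sample of 5 study times, drawn without replacement.
sample = random.sample(population, 5)

print("population size:", len(population))
print("sample of study times:", sample)
```

Both `population` and `sample` are lists of numbers, which mirrors the point that the population and the sample consist of numerical values.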


Examples of Samples and Populations

Also, suppose that a medical doctor wants to estimate the average
blood pressure of students at a school.
She randomly selects 10 students and measures their blood pressure.
These 10 measurements form a sample from the population of all
students’ blood pressure readings.


Example of an Unrepresentative Sample

Suppose a school wants to estimate the average weekly exercise time
of all students.
If the survey only includes students from the tennis club, the results
will likely overestimate the average, because these students are more
active than the general student population.
This sample is biased and not representative of all students.


Random Samples and Inference

On the other hand, it is evident that not all samples provide a
reliable basis for generalizing to the populations from which they are
drawn.
The majority of inferential techniques presented in this chapter rely
on the assumption that the samples being analyzed are random.


Random Sample

Definition
If $X_1, X_2, \ldots, X_n$ are independent and identically distributed random
variables, we say that they constitute a random sample from the
population given by their common distribution.


Sample Mean and Sample Variance

Statistical inferences are typically based on statistics, which are
random variables that are functions of a set of random variables
$X_1, X_2, \ldots, X_n$ forming a random sample.
Common examples of statistics include the sample mean and the
sample variance.


Sample Mean and Sample Variance

Definition
If $X_1, X_2, \ldots, X_n$ constitute a random sample, then:
The sample mean is given by
\[ \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i \]
The sample variance is given by
\[ S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2 \]


Sample Mean and Sample Variance: Observed Values

It is common practice to apply the terms “random sample”,
“statistic”, “sample mean”, and “sample variance” to the observed
values of the random variables rather than to the random variables
themselves.


Sample Mean and Sample Variance: Observed Values

For observed sample data, we compute:
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \quad \text{and} \quad s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2, \]
and refer to these quantities as the sample mean and sample
variance.
Here, $x_i$, $\bar{x}$, and $s^2$ are the observed values corresponding to the
random variables $X_i$, $\bar{X}$, and $S^2$.
These formulas are also used for any dataset, not necessarily a
sample, in which case $\bar{x}$ and $s^2$ are referred to simply as the mean
and variance.
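
As a small illustration of the two formulas, the following Python sketch computes the observed sample mean and sample variance directly from their definitions. The five data values are hypothetical, chosen only so the arithmetic is easy to follow, and the comparison with the standard library's `statistics` module is included as a sanity check.

```python
import statistics

# Hypothetical observed study times (hours per week) of 5 students.
x = [4.0, 7.0, 5.0, 9.0, 5.0]
n = len(x)

x_bar = sum(x) / n                                 # sample mean
s2 = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)  # sample variance, divisor n - 1

# statistics.variance also uses the n - 1 divisor, so the results agree.
assert abs(x_bar - statistics.mean(x)) < 1e-12
assert abs(s2 - statistics.variance(x)) < 1e-12

print(x_bar, s2)  # 6.0 4.0
```

The deviations from the mean are -2, 1, -1, 3, -1, so the sum of squares is 16 and dividing by n - 1 = 4 gives 4.0.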


Role of Sample Statistics in Inference

The statistics introduced in this chapter are fundamental to
statistical inference.
Sample measures such as the sample mean and sample variance are
used to estimate population parameters based on observed data.


The Sampling Distribution of the Mean

Because sample statistics vary across random samples, it is essential
to describe their probability distributions.
These distributions are called sampling distributions and they are
crucial in determining the properties of the inferences we make
about the population parameters from which the sample is drawn.
We begin by examining the sampling distribution of the mean under
general assumptions about the population.


Theorem: Linear Combinations of Random Variables

Theorem
Let $X_1, X_2, \ldots, X_n$ be random variables and let
\[ Y = \sum_{i=1}^{n} a_i X_i \]
where $a_1, a_2, \ldots, a_n$ are constants. Then:
\[ E[Y] = \sum_{i=1}^{n} a_i E[X_i] \]
If $X_1, X_2, \ldots, X_n$ are independent random variables, then
\[ \operatorname{Var}(Y) = \sum_{i=1}^{n} a_i^2 \operatorname{Var}(X_i) \]


Example

Problem
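Suppose $X_1$ and $X_2$ are two independent random variables with
$E[X_1] = 2$, $E[X_2] = 3$, $\operatorname{Var}(X_1) = 1$, $\operatorname{Var}(X_2) = 4$, and let
$Y = 3X_1 - 2X_2$. Find $E[Y]$ and $\operatorname{Var}(Y)$.

\[ E[Y] = E[3X_1 - 2X_2] = 3E[X_1] - 2E[X_2] = 3 \cdot 2 - 2 \cdot 3 = 0 \]

\[ \operatorname{Var}(Y) = \operatorname{Var}(3X_1 - 2X_2) = 3^2 \operatorname{Var}(X_1) + (-2)^2 \operatorname{Var}(X_2) = 9 \cdot 1 + 4 \cdot 4 = 25 \]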
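
The same calculation can be checked with a short Python sketch of the linear-combination rules. The helper name `linear_combination_moments` is hypothetical, and the variance formula it applies is valid only under the independence assumption stated in the theorem.

```python
def linear_combination_moments(a, means, variances):
    """Mean and variance of Y = sum(a_i * X_i) for independent X_i."""
    mean_y = sum(ai * mi for ai, mi in zip(a, means))
    # Each coefficient is squared, so a negative sign never reduces the variance.
    var_y = sum(ai ** 2 * vi for ai, vi in zip(a, variances))
    return mean_y, var_y

# Y = 3*X1 - 2*X2 with E[X1] = 2, E[X2] = 3, Var(X1) = 1, Var(X2) = 4.
mean_y, var_y = linear_combination_moments(a=[3, -2], means=[2, 3], variances=[1, 4])
print(mean_y, var_y)  # 0 25
```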


Sampling Distribution of the Mean

Theorem
If $X_1, X_2, \ldots, X_n$ constitute a random sample from a population with
mean $\mu$ and variance $\sigma^2$, then
\[ E(\bar{X}) = \mu \quad \text{and} \quad \operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n}. \]


Proof of Theorem

Proof
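Let $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$. Then,
\[ E(\bar{X}) = \sum_{i=1}^{n} \frac{1}{n} E(X_i) = n \cdot \frac{1}{n} \mu = \mu, \]
where $E(X_i) = \mu$. Since the $X_i$'s are independent, using the result
\[ \text{if } Y = \sum_{i=1}^{n} a_i X_i, \text{ then } \operatorname{Var}(Y) = \sum_{i=1}^{n} a_i^2 \operatorname{Var}(X_i), \]
we obtain
\[ \operatorname{Var}(\bar{X}) = \sum_{i=1}^{n} \left(\frac{1}{n}\right)^2 \sigma^2 = n \cdot \frac{1}{n^2} \sigma^2 = \frac{\sigma^2}{n}. \]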


Standard Error of the Mean

It is customary to write $E(\bar{X})$ as $\mu_{\bar{X}}$ and $\operatorname{Var}(\bar{X})$ as $\sigma_{\bar{X}}^2$.
We refer to $\sigma_{\bar{X}}$ as the standard error of the mean.
The formula for the standard error of the mean:
\[ \sigma_{\bar{X}} = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}} \]
This shows that the standard deviation of the distribution of $\bar{X}$
decreases when the sample size $n$ increases.
As $n$ becomes larger and we have more information, the values of $\bar{X}$
tend to be closer to $\mu$, the quantity they are intended to estimate.
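
To make the effect of the sample size concrete, here is a minimal Python sketch that evaluates the standard error for several values of n. The population standard deviation of 15 and the chosen sample sizes are assumptions made only for illustration.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 15.0  # assumed population standard deviation
for n in (9, 36, 144, 576):
    # Each step multiplies n by 4, which halves the standard error.
    print(n, standard_error(sigma, n))
```

Because the standard error shrinks with the square root of n, halving it requires four times as many observations.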


Central Limit Theorem

The Central Limit Theorem (CLT) is one of the most important
theorems in statistics.
It concerns the limiting distribution of the standardized sample mean
as $n \to \infty$.

Theorem
Let $X_1, X_2, \ldots, X_n$ be a random sample from an infinite population with
mean $\mu$ and variance $\sigma^2$.
Then the limiting distribution of
\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \]
as $n \to \infty$ is the standard normal distribution.


Practical Implication of the Central Limit Theorem

The Central Limit Theorem justifies approximating the distribution
of $\bar{X}$ with a normal distribution having:
mean $\mu$
variance $\sigma^2/n$
As a rule of thumb, this approximation is used in practice when the
sample size is $n \ge 30$, whatever the shape of the population, although
strongly skewed or heavy-tailed populations may require larger samples.
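
The approximation can be explored with a small simulation. The sketch below assumes a skewed Exponential population with rate 1 (so that its mean and standard deviation are both 1), a sample size of 30, and 5000 repetitions. These choices are illustrative and do not come from the slides.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

mu, sigma = 1.0, 1.0   # mean and standard deviation of Exponential(rate = 1)
n, reps = 30, 5000     # sample size and number of simulated samples

def sample_mean():
    """Mean of one random sample of size n from the exponential population."""
    return sum(random.expovariate(1.0) for _ in range(n)) / n

means = [sample_mean() for _ in range(reps)]
se = sigma / math.sqrt(n)  # theoretical standard error of the mean

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)
# Share of standardized means within +/- 1.96; about 0.95 under normality.
coverage = sum(abs(m - mu) / se <= 1.96 for m in means) / reps

print(mean_of_means, sd_of_means, se, coverage)
```

If the normal approximation is reasonable, the simulated means should centre near 1, their standard deviation should be close to the theoretical standard error, and roughly 95% of the standardized means should fall within 1.96 of zero.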


Example

Problem
A soft-drink vending machine dispenses a random amount of drink with a
mean of µ = 200 ml and a standard deviation of σ = 15 ml.
What is the probability that the average amount dispensed in a random
sample of size n = 36 is at least 204 milliliters?


Example

Solution
According to the theorem on the sampling distribution of the mean,
Mean of the sample mean: $\mu_{\bar{X}} = \mu = 200$
Standard error: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{36}} = 2.5$.
According to the Central Limit Theorem, the distribution of $\bar{X}$ is
approximately normal.

Compute the standardized value:
\[ z = \frac{204 - 200}{2.5} = 1.6 \]
From standard normal tables:
\[ P(\bar{X} \ge 204) = P(Z \ge 1.6) = 0.5000 - 0.4452 = 0.0548 \]
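
The table lookup can be reproduced with Python's standard library. The sketch below uses `statistics.NormalDist` as a stand-in for the normal table, and the variable names are chosen here for illustration.

```python
import math
from statistics import NormalDist

mu, sigma, n = 200.0, 15.0, 36

se = sigma / math.sqrt(n)                  # standard error of the mean
x_bar_dist = NormalDist(mu=mu, sigma=se)   # CLT approximation for the sample mean

# Upper-tail probability P(sample mean >= 204).
p = 1.0 - x_bar_dist.cdf(204.0)
print(round(se, 4), round(p, 4))  # 2.5 0.0548
```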



Sampling from a Normal Population

It is of interest to note that when the population we are sampling
from is normal, the distribution of $\bar{X}$ is normal regardless of the
sample size $n$.

Theorem
If $\bar{X}$ is the mean of a random sample of size $n$ from a normal population
with mean $\mu$ and variance $\sigma^2$, then the sampling distribution of $\bar{X}$ is a
normal distribution with mean $\mu$ and variance $\sigma^2/n$.


Example

Problem
A bank employee is servicing customers. The amount of time it takes the
employee to service a customer is distributed as follows:
(a) N(4, 1)
(b) Uniform[3, 5]
What is the probability that the average service time for 50 customers is
between 3.8 and 4.2 minutes in cases (a) and (b) ?


Solution: Case (a)

Distribution
Service time: $X \sim N(4, 1)$. Since the population is normal, the
distribution of $\bar{X}$ is exactly normal.
Mean and standard deviation of the sample mean:
\[ \mu_{\bar{X}} = 4, \qquad \sigma_{\bar{X}} = \frac{1}{\sqrt{50}} = 0.1414 \]
Compute the z-scores:
\[ z_1 = \frac{3.8 - 4}{0.1414} = -1.414, \qquad z_2 = \frac{4.2 - 4}{0.1414} = 1.414 \]
Probability:
\[ P(3.8 \le \bar{X} \le 4.2) = P(-1.414 \le Z \le 1.414) = 0.8427 \]


Solution: Case (b)

Distribution
Service time: $X \sim \text{Uniform}(3, 5)$. Here the CLT is used to approximate
the distribution of $\bar{X}$ by a normal distribution, because the population
is not normal and $n = 50$ is large.
Parameters:
\[ \mu = \frac{5 + 3}{2} = 4, \qquad \sigma^2 = \frac{(5 - 3)^2}{12} = \frac{1}{3}, \qquad \sigma = 0.57735 \]
Standard deviation of the sample mean:
\[ \sigma_{\bar{X}} = \frac{0.57735}{\sqrt{50}} = 0.08165 \]


Solution: Case (b)

Distribution
Compute z-scores:
\[ z_1 = \frac{3.8 - 4}{0.08165} = -2.45, \qquad z_2 = \frac{4.2 - 4}{0.08165} = 2.45 \]
Probability:
\[ P(3.8 \le \bar{X} \le 4.2) = P(-2.45 \le Z \le 2.45) \approx 0.9857 \]
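
Both cases follow the same recipe, so they can be sketched with one helper in Python. The function name `prob_mean_between` is hypothetical. In case (a) the normal distribution of the sample mean is exact, while in case (b) it is the CLT approximation.

```python
import math
from statistics import NormalDist

def prob_mean_between(lo, hi, mu, sigma, n):
    """P(lo <= sample mean <= hi) when the sample mean is N(mu, sigma^2 / n)."""
    se = sigma / math.sqrt(n)
    dist = NormalDist(mu, se)
    return dist.cdf(hi) - dist.cdf(lo)

n = 50

# Case (a): normal population with mean 4 and variance 1.
p_a = prob_mean_between(3.8, 4.2, mu=4.0, sigma=1.0, n=n)

# Case (b): Uniform(3, 5) population, mean (a + b) / 2 and sd (b - a) / sqrt(12).
a, b = 3.0, 5.0
p_b = prob_mean_between(3.8, 4.2, mu=(a + b) / 2, sigma=(b - a) / math.sqrt(12), n=n)

print(round(p_a, 4), round(p_b, 4))
```

The uniform population has the smaller variance (1/3 versus 1), so its sample mean is more concentrated around 4 and the probability in case (b) is larger.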


The t distribution

We showed that for random samples from a normal population with
mean $\mu$ and variance $\sigma^2$, the sample mean $\bar{X}$ has a normal
distribution with mean $\mu$ and variance $\sigma^2/n$; in other words,
\[ \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1). \]
This is an important result, but the major difficulty in applying it is
that in most realistic applications the population standard deviation
$\sigma$ is unknown.
This makes it necessary to replace $\sigma$ with an estimate, usually with
the value of the sample standard deviation $S$.
Thus, the theory that follows leads to the exact distribution of
\[ \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
for random samples from normal populations, which motivates the t
distribution.

The t distribution

Theorem
If $\bar{X}$ and $S^2$ are the mean and the variance of a random sample of size $n$
from a normal population with mean $\mu$ and variance $\sigma^2$, then
\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
has the t distribution with $n - 1$ degrees of freedom.
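
A simulation gives a feel for why the t distribution, rather than the standard normal, is needed for small samples. The sketch below assumes a normal population with mean 10 and standard deviation 3, samples of size 5, and 20000 repetitions, all chosen for illustration. The critical value 2.132 is the upper 5% point of the t distribution with 4 degrees of freedom, taken from standard t tables.

```python
import math
import random
from statistics import NormalDist

random.seed(2)  # fixed seed so the sketch is reproducible

mu, sigma = 10.0, 3.0   # assumed normal population parameters
n, reps = 5, 20000      # small sample size, number of simulated samples
t_crit = 2.132          # upper 5% point of t with n - 1 = 4 degrees of freedom

def t_statistic():
    """T = (x_bar - mu) / (s / sqrt(n)) for one simulated normal sample."""
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = sum(xs) / n
    s = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    return (x_bar - mu) / (s / math.sqrt(n))

# Observed frequency of T exceeding the t critical value (should be near 0.05).
tail_freq = sum(t_statistic() > t_crit for _ in range(reps)) / reps

# Tail probability a standard normal would assign to the same cutoff.
normal_tail = 1.0 - NormalDist().cdf(t_crit)

print(tail_freq, normal_tail)
```

The simulated tail frequency should be close to 0.05, whereas the standard normal puts noticeably less probability beyond the same cutoff, which reflects the heavier tails of the t distribution.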


The t Distribution: History

The t distribution was introduced by W. S. Gosset, who published
under the pen name “Student”, because the brewery he worked for
did not permit employees to publish.
Hence, it is also called the Student t distribution or Student’s t
distribution.
Graphs of t distributions with different numbers of degrees of
freedom ν resemble the standard normal distribution, but have larger
variances.
For large values of ν, the t distribution approaches the standard
normal distribution.
You can explore an interactive plot of the t distribution at
[Link]


The t Distribution: Tables and Applications

The t distribution has been tabulated extensively.
Table IV of “Statistical Tables” contains values of $t_{\alpha,\nu}$ for
$\alpha = 0.10, 0.05, 0.025, 0.01, 0.005$ and $\nu = 1, 2, \ldots, 29$,
where $t_{\alpha,\nu}$ satisfies
\[ P(T > t_{\alpha,\nu}) = \alpha \]
for a random variable $T$ with a t distribution with $\nu$ degrees of
freedom.


The t Distribution: Tables and Applications

Due to symmetry about $t = 0$, $t_{1-\alpha,\nu} = -t_{\alpha,\nu}$.
For $\nu \ge 30$, probabilities are usually approximated using the normal
distribution.
