Sampling Distributions
Gül İnan
Department of Mathematics
November 25, 2025
Introduction to Sampling Distributions
Introduction
Statistics concerns itself mainly with conclusions and predictions
resulting from chance outcomes that occur in carefully planned
experiments or investigations.
Drawing such conclusions usually involves taking sample
observations from a given population and using the sample results
to make inferences about the population, such as its mean or variance.
Role of Statistics in Inference
Doing so requires that we first find the distributions of certain
functions of the random variables whose values make up the sample.
These functions are called statistics.
An example of such a statistic is the sample mean.
The properties of these distributions then allow us to make
probability statements about the resulting inferences drawn from the
sample about the population.
The theory to be given in this chapter forms an important
foundation for the theory of statistical inference.
Population
Definition
A set of numbers from which a sample is drawn is referred to as a
population.
The distribution of the numbers constituting a population is called
the population distribution.
Examples of Samples and Populations
Suppose a teacher wants to know how many hours students study
each week.
She randomly selects 5 students from a class of 40 and records their
study times.
A layperson might say that these 5 students constitute the sample.
In statistics, it is preferable to regard the study times of these 5
students as a sample drawn from the population, which consists of
the study times of all 40 students.
Both the population and the sample consist of numerical values.
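As a minimal sketch of this setup, the snippet below builds a hypothetical population of 40 study times (the values are invented for illustration, not taken from the example) and draws 5 of them at random without replacement. Both objects are plain lists of numbers, which matches the statistical use of the terms.

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible

# Hypothetical population: weekly study hours of all 40 students (made-up values).
population = [round(random.uniform(0, 20), 1) for _ in range(40)]

# The sample: 5 study times drawn at random, without replacement.
sample = random.sample(population, 5)

print("population size:", len(population))
print("sample:", sample)
```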
Examples of Samples and Populations
Also, suppose that a medical doctor wants to estimate the average
blood pressure of students at a school.
She randomly selects 10 students and measures their blood pressure.
These 10 measurements form a sample from the population of all
students’ blood pressure readings.
Example of an Unrepresentative Sample
Suppose a school wants to estimate the average weekly exercise time
of all students.
If the survey only includes students from the tennis club, the results
will likely overestimate the average, because these students are more
active than the general student population.
This sample is biased and not representative of all students.
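The effect can be sketched with made-up numbers. The snippet below assumes a hypothetical school of 300 students in which 30 tennis-club members exercise 6 to 10 hours per week and everyone else 0 to 5 hours; these ranges are assumptions chosen only to illustrate the bias, not data from the example.

```python
import random

random.seed(1)

# Hypothetical weekly exercise hours (made-up ranges, for illustration only):
# 30 tennis-club members exercise 6 to 10 hours, 270 other students 0 to 5 hours.
tennis_club = [random.uniform(6, 10) for _ in range(30)]
others = [random.uniform(0, 5) for _ in range(270)]
population = tennis_club + others

def mean(values):
    return sum(values) / len(values)

population_mean = mean(population)
biased_mean = mean(random.sample(tennis_club, 20))   # survey restricted to the club
random_mean = mean(random.sample(population, 20))    # simple random sample

print("population mean:", population_mean)
print("tennis-club survey mean:", biased_mean)
print("random sample mean:", random_mean)
```

Under these assumptions the club-only survey must overestimate the population mean, because every surveyed value is at least 6 hours while the population mean cannot exceed 5.5 hours.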
Random Samples and Inference
As the previous example shows, not all samples provide a
reliable basis for generalizing to the populations from which they are
drawn.
The majority of inferential techniques presented in this chapter rely
on the assumption that the samples being analyzed are random.
Random Sample
Definition
If $X_1, X_2, \ldots, X_n$ are independent and identically distributed random
variables, we say that they constitute a random sample from the
population given by their common distribution.
Sample Mean and Sample Variance
Statistical inferences are typically based on statistics, which are
random variables that are functions of a set of random variables
$X_1, X_2, \ldots, X_n$ forming a random sample.
Common examples of statistics include the sample mean and the
sample variance.
Sample Mean and Sample Variance
Definition
If $X_1, X_2, \ldots, X_n$ constitute a random sample, then:
The sample mean is given by
\[ \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i \]
The sample variance is given by
\[ S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2 \]
Sample Mean and Sample Variance: Observed Values
It is common practice to apply the terms “random sample”,
“statistic”, “sample mean”, and “sample variance” to the observed
values of the random variables rather than to the random variables
themselves.
Sample Mean and Sample Variance: Observed Values
For observed sample data, we compute:
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \quad \text{and} \quad s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2, \]
and refer to these quantities as the sample mean and sample
variance.
Here, $x_i$, $\bar{x}$, and $s^2$ are the observed values corresponding to the
random variables $X_i$, $\bar{X}$, and $S^2$.
These formulas are also used for any dataset, not necessarily a
sample, in which case $\bar{x}$ and $s^2$ are referred to simply as the mean
and variance.
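A short sketch of these two formulas on observed data follows. The five values are hypothetical study times invented for illustration. The hand-written formulas are compared with `statistics.mean` and `statistics.variance` from the Python standard library, which also use the divisor n - 1 for the variance.

```python
import statistics

# Hypothetical observed study times (hours per week) for 5 students.
x = [4.0, 6.5, 3.0, 8.0, 5.5]

n = len(x)
x_bar = sum(x) / n                                  # observed sample mean
s2 = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)   # observed sample variance

# Cross-check against the standard library (same n - 1 divisor).
lib_mean = statistics.mean(x)
lib_var = statistics.variance(x)

print(x_bar, s2)
print(lib_mean, lib_var)
```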
Role of Sample Statistics in Inference
The statistics introduced in this chapter are fundamental to
statistical inference.
Sample measures such as the sample mean and sample variance are
used to estimate population parameters based on observed data.
The Sampling Distribution of the Mean
Because sample statistics vary across random samples, it is essential
to describe their probability distributions.
These distributions are called sampling distributions and they are
crucial in determining the properties of the inferences we make
about the population parameters from which the sample is drawn.
We begin by examining the sampling distribution of the mean under
general assumptions about the population.
Theorem: Linear Combinations of Random Variables
Theorem
Let $X_1, X_2, \ldots, X_n$ be random variables and let
\[ Y = \sum_{i=1}^{n} a_i X_i \]
where $a_1, a_2, \ldots, a_n$ are constants. Then:
\[ E[Y] = \sum_{i=1}^{n} a_i E[X_i] \]
If $X_1, X_2, \ldots, X_n$ are independent random variables, then
\[ \mathrm{Var}(Y) = \sum_{i=1}^{n} a_i^2 \, \mathrm{Var}(X_i) \]
Example
Problem
Suppose $X_1$ and $X_2$ are two independent random variables with
$E[X_1] = 2$, $E[X_2] = 3$, $\mathrm{Var}(X_1) = 1$, $\mathrm{Var}(X_2) = 4$, and let
$Y = 3X_1 - 2X_2$. Find $E[Y]$ and $\mathrm{Var}(Y)$.
\[ E[Y] = E[3X_1 - 2X_2] = 3E[X_1] - 2E[X_2] = 3 \cdot 2 - 2 \cdot 3 = 0 \]
\[ \mathrm{Var}(Y) = \mathrm{Var}(3X_1 - 2X_2) = 3^2 \, \mathrm{Var}(X_1) + (-2)^2 \, \mathrm{Var}(X_2) = 9 \cdot 1 + 4 \cdot 4 = 25 \]
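The calculation can be checked with a short sketch. The helper `linear_combination_moments` is a hypothetical name introduced here; it applies the two formulas of the theorem directly. The simulation that follows assumes, purely for illustration, that both variables are normal. The theorem itself does not require any particular distribution.

```python
import random

def linear_combination_moments(a, means, variances):
    """Mean and variance of Y = sum(a_i * X_i) for independent X_i."""
    mean_y = sum(ai * m for ai, m in zip(a, means))
    var_y = sum(ai ** 2 * v for ai, v in zip(a, variances))
    return mean_y, var_y

mean_y, var_y = linear_combination_moments([3, -2], [2, 3], [1, 4])

# Simulation check. Assumption: X1 ~ N(2, 1) and X2 ~ N(3, 4), i.e. standard
# deviations 1 and 2; any distributions with these moments would do.
random.seed(42)
ys = [3 * random.gauss(2, 1) - 2 * random.gauss(3, 2) for _ in range(100_000)]
sim_mean = sum(ys) / len(ys)
sim_var = sum((y - sim_mean) ** 2 for y in ys) / (len(ys) - 1)

print("theory:", mean_y, var_y)
print("simulation:", sim_mean, sim_var)
```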
Sampling Distribution of the Mean
Theorem
If $X_1, X_2, \ldots, X_n$ constitute a random sample from a population with
mean $\mu$ and variance $\sigma^2$, then
\[ E(\bar{X}) = \mu \quad \text{and} \quad \mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}. \]
Proof of Theorem
Proof
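Let $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$. Then,
\[ E(\bar{X}) = \sum_{i=1}^{n} \frac{1}{n} E(X_i) = n \cdot \frac{1}{n} \mu = \mu, \]
where $E(X_i) = \mu$. Since the $X_i$'s are independent, we can use the result
\[ \text{if } Y = \sum_{i=1}^{n} a_i X_i, \text{ then } \mathrm{Var}(Y) = \sum_{i=1}^{n} a_i^2 \, \mathrm{Var}(X_i), \]
with $a_i = 1/n$ to obtain
\[ \mathrm{Var}(\bar{X}) = \sum_{i=1}^{n} \left(\frac{1}{n}\right)^2 \sigma^2 = n \cdot \frac{1}{n^2} \sigma^2 = \frac{\sigma^2}{n}. \]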
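The result can also be checked empirically. The sketch below assumes an exponential population with mean 2 (so variance 4), chosen only as an example of a non-normal population. It draws many random samples of size n = 10 and compares the mean and variance of the resulting sample means with $\mu$ and $\sigma^2/n$.

```python
import random

random.seed(7)

n = 10                  # sample size
reps = 20_000           # number of simulated random samples
mu, sigma2 = 2.0, 4.0   # exponential population with mean 2 has variance 4

# Each entry is the mean of one random sample of size n.
means = [sum(random.expovariate(1 / mu) for _ in range(n)) / n
         for _ in range(reps)]

avg = sum(means) / reps
var = sum((m - avg) ** 2 for m in means) / (reps - 1)

print("mean of sample means:", avg, "theory:", mu)
print("variance of sample means:", var, "theory:", sigma2 / n)
```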
Standard Error of the Mean
It is customary to write $E(\bar{X})$ as $\mu_{\bar{X}}$ and $\mathrm{Var}(\bar{X})$ as $\sigma_{\bar{X}}^2$.
We refer to $\sigma_{\bar{X}}$ as the standard error of the mean.
The formula for the standard error of the mean is
\[ \sigma_{\bar{X}} = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}} \]
This shows that the standard deviation of the distribution of $\bar{X}$
decreases as the sample size $n$ increases.
As $n$ becomes larger and we have more information, the values of $\bar{X}$
tend to be closer to $\mu$, the quantity they are intended to estimate.
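To make the dependence on n concrete, the sketch below evaluates the standard error for a few sample sizes. The function name `standard_error` and the value of sigma are assumptions chosen for illustration.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 15.0  # hypothetical population standard deviation
for n in (9, 36, 144):
    print(n, standard_error(sigma, n))
```

Because the standard error is proportional to $1/\sqrt{n}$, quadrupling the sample size only halves it.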
Central Limit Theorem
The Central Limit Theorem (CLT) is one of the most important
theorems in statistics.
It concerns the limiting distribution of the standardized sample mean
as n → ∞.
Theorem
Let $X_1, X_2, \ldots, X_n$ be a random sample from an infinite population with
mean $\mu$ and variance $\sigma^2$.
Then the limiting distribution of
\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \]
as $n \to \infty$ is the standard normal distribution.
Practical Implication of the Central Limit Theorem
The Central Limit Theorem justifies approximating the distribution
of $\bar{X}$ with a normal distribution having:
mean $\mu$
variance $\sigma^2 / n$
This approximation is generally used in practice when the sample
size $n \ge 30$, regardless of the shape of the population.
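A small simulation can illustrate the approximation. The sketch below assumes a strongly skewed population, an exponential distribution with mean 1 and standard deviation 1, and checks how often the standardized mean of n = 30 observations falls within ±1.96. For an exactly standard normal variable this share is about 0.95.

```python
import math
import random

random.seed(3)

mu = sigma = 1.0     # exponential population with rate 1: mean 1, std. dev. 1
n, reps = 30, 20_000

def standardized_mean():
    """Draw one random sample of size n and return Z = (x_bar - mu) / (sigma / sqrt(n))."""
    x_bar = sum(random.expovariate(1.0) for _ in range(n)) / n
    return (x_bar - mu) / (sigma / math.sqrt(n))

zs = [standardized_mean() for _ in range(reps)]
inside = sum(abs(z) <= 1.96 for z in zs) / reps

print("share of |Z| <= 1.96:", inside)
```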
Example
Problem
A soft-drink vending machine dispenses a random amount of drink with a
mean of µ = 200 ml and a standard deviation of σ = 15 ml.
What is the probability that the average amount dispensed in a random
sample of size n = 36 is at least 204 milliliters?
Example
Solution
According to the theorem on the sampling distribution of the mean,
Mean of the sample mean: $\mu_{\bar{X}} = \mu = 200$
Standard error: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{36}} = 2.5$.
According to the Central Limit Theorem, the distribution of $\bar{X}$ is
approximately normal.
Compute the standardized value:
\[ z = \frac{204 - 200}{2.5} = 1.6 \]
From standard normal tables:
\[ P(\bar{X} \ge 204) = P(Z \ge 1.6) = 0.5000 - 0.4452 = 0.0548 \]
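The table lookup can be reproduced in software. The sketch below uses `statistics.NormalDist` from the Python standard library to represent the approximating normal distribution of the sample mean. The variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 200.0, 15.0, 36
se = sigma / sqrt(n)              # standard error of the mean

# Normal approximation to the distribution of the sample mean (CLT).
x_bar_dist = NormalDist(mu, se)
p = 1 - x_bar_dist.cdf(204)       # P(sample mean >= 204)

print("standard error:", se)
print("P(sample mean >= 204):", p)
```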
Sampling from a Normal Population
It is of interest to note that when the population we are sampling
from is normal, the distribution of $\bar{X}$ is normal regardless of the
sample size $n$.
Theorem
If $\bar{X}$ is the mean of a random sample of size $n$ from a normal population
with mean $\mu$ and variance $\sigma^2$, then the sampling distribution of $\bar{X}$ is a
normal distribution with mean $\mu$ and variance $\sigma^2 / n$.
Example
Problem
A bank employee is servicing customers. The amount of time it takes the
employee to service a customer is distributed as follows:
(a) N(4, 1)
(b) Uniform[3, 5]
What is the probability that the average service time for 50 customers is
between 3.8 and 4.2 minutes in cases (a) and (b) ?
Solution: Case (a)
Distribution
Service time: $X \sim N(4, 1)$. Since the population is normal, the
distribution of $\bar{X}$ is exactly normal.
Mean and standard deviation of the sample mean:
\[ \mu_{\bar{X}} = 4, \qquad \sigma_{\bar{X}} = \frac{1}{\sqrt{50}} = 0.1414 \]
Compute the z-scores:
\[ z_1 = \frac{3.8 - 4}{0.1414} = -1.414, \qquad z_2 = \frac{4.2 - 4}{0.1414} = 1.414 \]
Probability:
\[ P(3.8 \le \bar{X} \le 4.2) = P(-1.414 \le Z \le 1.414) = 0.8427 \]
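As a quick check of case (a), the sketch below evaluates the same probability with `statistics.NormalDist`, without rounding the z-scores. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 4.0, 1.0, 50

# Exact sampling distribution of the mean for a normal population.
x_bar_dist = NormalDist(mu, sigma / sqrt(n))
p_a = x_bar_dist.cdf(4.2) - x_bar_dist.cdf(3.8)

print("P(3.8 <= sample mean <= 4.2):", p_a)
```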
Solution: Case (b)
Distribution
Service time: $X \sim \text{Uniform}[3, 5]$. Here the CLT is used to approximate
the distribution of $\bar{X}$ by a normal distribution, because the population
is not normal and $n = 50$ is large.
Parameters:
\[ \mu = \frac{5 + 3}{2} = 4, \qquad \sigma^2 = \frac{(5 - 3)^2}{12} = \frac{1}{3}, \qquad \sigma = 0.57735 \]
Standard deviation of the sample mean:
\[ \sigma_{\bar{X}} = \frac{0.57735}{\sqrt{50}} = 0.08165 \]
Solution: Case (b)
Distribution
Compute z-scores:
\[ z_1 = \frac{3.8 - 4}{0.08165} = -2.45, \qquad z_2 = \frac{4.2 - 4}{0.08165} = 2.45 \]
Probability:
\[ P(3.8 \le \bar{X} \le 4.2) \approx P(-2.45 \le Z \le 2.45) = 0.9857 \]
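Because case (b) relies on an approximation, it is worth comparing the CLT answer with a direct simulation. The sketch below computes the normal approximation with `statistics.NormalDist` and then estimates the same probability by Monte Carlo from simulated samples of 50 uniform service times. The number of replications and the seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist

a, b, n = 3.0, 5.0, 50
mu = (a + b) / 2
sigma = sqrt((b - a) ** 2 / 12)
se = sigma / sqrt(n)

# Normal (CLT) approximation to the distribution of the sample mean.
x_bar_dist = NormalDist(mu, se)
p_clt = x_bar_dist.cdf(4.2) - x_bar_dist.cdf(3.8)

# Monte Carlo estimate from simulated samples of 50 uniform service times.
random.seed(11)
reps = 20_000
hits = 0
for _ in range(reps):
    x_bar = sum(random.uniform(a, b) for _ in range(n)) / n
    hits += 3.8 <= x_bar <= 4.2
p_mc = hits / reps

print("CLT approximation:", p_clt)
print("Monte Carlo estimate:", p_mc)
```

The two numbers should land close to each other, which supports using the normal approximation at this sample size.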
The t distribution
We showed that for random samples from a normal population with
mean $\mu$ and variance $\sigma^2$, the sample mean $\bar{X}$ has a normal
distribution with mean $\mu$ and variance $\sigma^2 / n$; in other words,
\[ \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1). \]
This is an important result, but the major difficulty in applying it is
that in most realistic applications the population standard deviation
σ is unknown.
This makes it necessary to replace σ with an estimate, usually with
the value of the sample standard deviation S.
Thus, the theory that follows leads to the exact distribution of
\[ \frac{\bar{X} - \mu}{S / \sqrt{n}} \]
for random samples from normal populations, which motivates the t
distribution.
The t distribution
Theorem
If $\bar{X}$ and $S^2$ are the mean and the variance of a random sample of size $n$
from a normal population with mean $\mu$ and variance $\sigma^2$, then
\[ T = \frac{\bar{X} - \mu}{S / \sqrt{n}} \]
has the t distribution with $n - 1$ degrees of freedom.
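The practical consequence of replacing $\sigma$ with $S$ shows up in the tails. The sketch below simulates the statistic $T$ for small samples (n = 5) from a standard normal population, both choices being assumptions for illustration, and records how often $|T|$ exceeds 1.96. For a standard normal variable that share would be about 0.05.

```python
import math
import random

random.seed(5)

mu, sigma, n, reps = 0.0, 1.0, 5, 20_000

def t_statistic():
    """Draw one normal sample of size n and return T = (x_bar - mu) / (s / sqrt(n))."""
    x = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = sum(x) / n
    s = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
    return (x_bar - mu) / (s / math.sqrt(n))

ts = [t_statistic() for _ in range(reps)]
tail = sum(abs(t) > 1.96 for t in ts) / reps

print("share of |T| > 1.96:", tail)
```

With only 4 degrees of freedom the share comes out well above 0.05, reflecting the heavier tails of the t distribution.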
The t Distribution: History
The t distribution was introduced by W. S. Gosset, who published
under the pen name “Student”, because the brewery he worked for
did not permit employees to publish.
Hence, it is also called the Student t distribution or Student’s t
distribution.
Graphs of t distributions with different numbers of degrees of
freedom ν resemble the standard normal distribution, but have larger
variances.
For large values of ν, the t distribution approaches the standard
normal distribution.
You can explore an interactive plot of the t distribution at
[Link]
The t Distribution: Tables and Applications
The t distribution has been tabulated extensively.
Table IV of “Statistical Tables” contains values of $t_{\alpha,\nu}$ for
$\alpha = 0.10, 0.05, 0.025, 0.01, 0.005$ and $\nu = 1, 2, \ldots, 29$, where
$t_{\alpha,\nu}$ satisfies
\[ P(T > t_{\alpha,\nu}) = \alpha \]
for a random variable $T$ with a t distribution with $\nu$ degrees of
freedom.
The t Distribution: Tables and Applications
Due to symmetry about $t = 0$,
\[ t_{1-\alpha,\nu} = -t_{\alpha,\nu}. \]
For $\nu \ge 30$, probabilities are usually approximated using the normal
distribution.
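Table values of this kind can be reproduced numerically. The sketch below is one possible approach using only the standard library, not the method behind the printed table. It integrates the density of the t distribution with Simpson's rule to get $P(T > t)$ and then finds $t_{\alpha,\nu}$ by bisection. The function names `t_pdf`, `upper_tail`, and `t_critical` are hypothetical.

```python
import math

def t_pdf(t, nu):
    """Density of the t distribution with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + t * t / nu) ** (-(nu + 1) / 2)

def upper_tail(t, nu, steps=2000):
    """P(T > t) = 0.5 - integral of the density from 0 to t (composite Simpson)."""
    h = t / steps
    total = t_pdf(0.0, nu) + t_pdf(t, nu)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, nu)
    return 0.5 - total * h / 3

def t_critical(alpha, nu, lo=-100.0, hi=100.0):
    """Solve P(T > t) = alpha for t by bisection (upper_tail is decreasing in t)."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if upper_tail(mid, nu) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print("t_{0.05,4}  =", t_critical(0.05, 4))
print("t_{0.95,4}  =", t_critical(0.95, 4))   # symmetry: the negative of t_{0.05,4}
print("t_{0.05,29} =", t_critical(0.05, 29))  # already close to the normal value
```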