Sampling Distributions Part B

Chapter 6 discusses key sampling distributions for inference, focusing on the Normal Distribution and its properties. It includes theorems and examples demonstrating how to calculate the distribution, expected value, and variance of linear combinations of normal variables. Additionally, it covers the sampling distributions of sample means and proportions, providing formulas and proofs for their normality under certain conditions.

Uploaded by

Mutamba

Chapter 6: Sampling Distributions

Part B: Key Sampling Distributions for Inference

STK 220

© University of Pretoria

September 15, 2025


STK 220 (© University of Pretoria) Ch6 Sampling Distributions Part B September 15, 2025 1 / 107
The Normal Distribution

Theorem (8.4)
If $X_i \sim N(\mu_i, \sigma_i^2)$, $i = 1, 2, \ldots, n$, denote independent normal variables, then
\[ Y = \sum_{i=1}^{n} a_i X_i \sim N\left( \sum_{i=1}^{n} a_i \mu_i,\ \sum_{i=1}^{n} a_i^2 \sigma_i^2 \right) \]
where $a_1, a_2, \ldots, a_n$ are constants.

Note:

1. This is a more general form of Theorem 8.4 in the textbook.
2. In the textbook the sample mean $\bar{X}$ from a normal population is considered; it will be treated as a special case of Theorem 8.4 above.
Example (Normal 1)
Let $X_1$, $X_2$ and $X_3$ be independent normal random variables with
\[ X_1 \sim N(0, 1), \quad X_2 \sim N(10, 16) \quad \text{and} \quad X_3 \sim N(-4, 9). \]
Let $Y = 5X_1 + 4X_2 - 2X_3$.

1. Give the distribution of $Y$.
2. Give $P_{10}$, the 10th percentile of $Y$.
3. Generate the empirical distribution of $Y$ with $n = 1000$ observations as follows:
   a. For $X_1 \sim N(0, 1)$, use the RANDSEED call with a seed of 123.
   b. For $X_2 \sim N(10, 16)$, use the RANDSEED call with a seed of 456.
   c. For $X_3 \sim N(-4, 9)$, use the RANDSEED call with a seed of 789.

Note: Compare the mean, variance and $P_{10}$ of the empirical distribution with those of the theoretical distribution.
1. $Y = 5X_1 + 4X_2 - 2X_3$ has a normal distribution with

- Expected value
\[ \mu_Y = (5)(0) + (4)(10) + (-2)(-4) = 48 \]
- Variance
\[ \sigma_Y^2 = (5)^2(1) + (4)^2(16) + (-2)^2(9) = 317 \]

\[ \therefore Y \sim N(48, 317) \]
Note: $\sigma_Y = \sqrt{317} = 17.804$

NB: See the SAS program below.
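Matrix notation:
\[ Y = 5X_1 + 4X_2 - 2X_3 = \begin{pmatrix} 5 & 4 & -2 \end{pmatrix} \begin{pmatrix} X_1 \\ X_2 \\ X_3 \end{pmatrix} = a'X \]

- Expected value
\[ \mu_Y = a'\mu = \begin{pmatrix} 5 & 4 & -2 \end{pmatrix} \begin{pmatrix} 0 \\ 10 \\ -4 \end{pmatrix} = 48 \]
- Variance
\[ \sigma_Y^2 = a'\Sigma a = \begin{pmatrix} 5 & 4 & -2 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 16 & 0 \\ 0 & 0 & 9 \end{pmatrix} \begin{pmatrix} 5 \\ 4 \\ -2 \end{pmatrix} = 317 \]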
2.
\[ 0.1 = P(Y \le P_{10}) = P\left( \frac{Y - 48}{\sqrt{317}} < \frac{P_{10} - 48}{\sqrt{317}} \right) = P(Z < z_{0.1}), \quad Z \sim N(0, 1) \]
From Table I
\[ \therefore \frac{P_{10} - 48}{\sqrt{317}} = z_{0.1} = -1.28 \]
\[ \therefore P_{10} = 48 - 1.28\sqrt{317} = 25.21 \]
The table value of $z_{0.1}$ is rounded to two decimals; SAS uses the unrounded quantile:

quantile('normal',0.1,48,sqrt(317)) = 25.1826
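The calculation for Example (Normal 1) can be cross-checked outside SAS. The following is a minimal Python sketch using only the standard library; the helper name `linear_combination` is an illustrative choice and not part of the course material.

```python
from math import sqrt
from statistics import NormalDist

def linear_combination(coeffs, means, variances):
    """Mean and variance of sum(a_i * X_i) for independent normal X_i (Theorem 8.4)."""
    mean = sum(a * m for a, m in zip(coeffs, means))
    var = sum(a * a * v for a, v in zip(coeffs, variances))
    return mean, var

# Y = 5*X1 + 4*X2 - 2*X3 with X1 ~ N(0,1), X2 ~ N(10,16), X3 ~ N(-4,9)
mu_y, var_y = linear_combination([5, 4, -2], [0, 10, -4], [1, 16, 9])

# NormalDist takes the standard deviation, not the variance
p10 = NormalDist(mu_y, sqrt(var_y)).inv_cdf(0.10)
print(mu_y, var_y, p10)
```

The 10th percentile obtained this way should agree with the SAS `quantile` value rather than with the table-based 25.21, because the table rounds $z_{0.1}$ to two decimals.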
SAS Program:

proc iml;
r=1000;                                              /* number of observations */
/* randfun takes the standard deviation, not the variance */
call randseed(123,1); x1=randfun(r,'normal');        /* X1 ~ N(0,1)   */
call randseed(456,1); x2=randfun(r,'normal',10,4);   /* X2 ~ N(10,16) */
call randseed(789,1); x3=randfun(r,'normal',-4,3);   /* X3 ~ N(-4,9)  */
matrix=5*x1+4*x2-2*x3;                               /* Y = 5X1 + 4X2 - 2X3 */
create sim from matrix[colname={y}];
append from matrix;
quit;

ods graphics off;

proc univariate data=sim normal;
var y;
histogram / endpoints=-10 to 110 by 10 cfill=magenta
normal(mu=48 sigma=17.8045);
title 'Linear combination of normal variables';
run;
SAS Output:
Linear combination of normal variables
The UNIVARIATE Procedure
Variable: Y
Moments
N 1000 Sum Weights 1000
Mean 47.6683059 Sum Observations 47668.3059
Std Deviation 18.483242 Variance 341.630234
Skewness 0.06428775 Kurtosis -0.0727985
Uncorrected SS 2613555.99 Corrected SS 341288.604
Coeff Variation 38.7746987 Std Error Mean 0.58449143

Fitted Normal Distribution for Y


Parameters for Normal Distribution
Parameter Symbol Estimate
Mean Mu 48
Std Dev Sigma 17.8045

Goodness-of-Fit Tests for Normal Distribution
Test ----Statistic----- ------p Value------
Kolmogorov-Smirnov D 0.03174255 Pr > D >0.250
Cramer-von Mises W-Sq 0.15374778 Pr > W-Sq >0.250
Anderson-Darling A-Sq 1.02864721 Pr > A-Sq >0.250

Quantiles for Normal Distribution


-------Quantile------
Percent Observed Estimated
1.0 6.00801 6.58054
5.0 16.90288 18.71420
10.0 24.33307 25.18262
25.0 34.78080 35.99105
50.0 47.44552 48.00000
75.0 59.47725 60.00895
90.0 72.88150 70.81738
95.0 78.80980 77.28580
99.0 90.66819 89.41946

Remember: STK 210

Fact (MGF)
Let $M_X(t)$ be the MGF of $X$ and let $Y = aX + b$. Then
\[
\begin{aligned}
M_Y(t) = M_{aX+b}(t) &= E\left[ e^{(aX+b)t} \right] \\
&= E\left[ e^{aXt + bt} \right] = e^{bt} E\left[ e^{X(at)} \right] \\
&= e^{bt} M_X(at)
\end{aligned}
\]
Theorem (8.4)
If $X_i \sim N(\mu_i, \sigma_i^2)$, $i = 1, 2, \ldots, n$, denote independent normal variables, then
\[ Y = \sum_{i=1}^{n} a_i X_i \sim N\left( \sum_{i=1}^{n} a_i \mu_i,\ \sum_{i=1}^{n} a_i^2 \sigma_i^2 \right) \]
where $a_1, a_2, \ldots, a_n$ are constants.

For the Proof:

The following notation will be used
\[ e^x = \exp[x] \]
Therefore the MGF of a normal variable can be written as
\[ M_X(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2} = \exp\left[ \mu t + \frac{1}{2}\sigma^2 t^2 \right] \]
Proof:
\[
\begin{aligned}
M_Y(t) &= \prod_{i=1}^{n} M_{a_i X_i}(t) && \text{Theorem 7.3} \\
&= \prod_{i=1}^{n} M_{X_i}(a_i t) && \text{Theorem 4.10} \\
&= \prod_{i=1}^{n} \exp\left[ \mu_i (a_i t) + \frac{1}{2}\sigma_i^2 (a_i t)^2 \right] \\
&= \exp\left[ \sum_{i=1}^{n} \left( \mu_i (a_i t) + \frac{1}{2}\sigma_i^2 (a_i t)^2 \right) \right] \\
&= \exp\left[ \left( \sum_{i=1}^{n} a_i \mu_i \right) t + \frac{1}{2}\left( \sum_{i=1}^{n} a_i^2 \sigma_i^2 \right) t^2 \right]
\end{aligned}
\]
This is the MGF of a normal random variable with expected value $\sum_{i=1}^{n} a_i \mu_i$ and variance $\sum_{i=1}^{n} a_i^2 \sigma_i^2$.
\[ \therefore Y = \sum_{i=1}^{n} a_i X_i \sim N\left( \sum_{i=1}^{n} a_i \mu_i,\ \sum_{i=1}^{n} a_i^2 \sigma_i^2 \right) \]
Corollary (Sample mean)
Suppose that $X_1, X_2, \ldots, X_n$ is a random sample from a normal population with expected value $\mu$ and variance $\sigma^2$. Then
\[ \bar{X} \sim N\left( \mu, \frac{\sigma^2}{n} \right) \]

Proof: We have that
\[ \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i = \sum_{i=1}^{n} \frac{1}{n} X_i \]
From Theorem 8.4 (with $a_i = 1/n$) it follows that $\bar{X}$ is normally distributed with
\[ E(\bar{X}) = \sum_{i=1}^{n} \frac{1}{n} E(X_i) = \sum_{i=1}^{n} \frac{1}{n} \mu = n \cdot \frac{1}{n} \mu = \mu \]
\[ Var(\bar{X}) = \sum_{i=1}^{n} \left( \frac{1}{n} \right)^2 Var(X_i) = \sum_{i=1}^{n} \left( \frac{1}{n} \right)^2 \sigma^2 = n \cdot \frac{1}{n^2} \sigma^2 = \frac{\sigma^2}{n} \]
Example (Normal 2)
Suppose the marks of students are normally distributed with mean $\mu = 60$ and variance $\sigma^2 = 100$. Let $\bar{X}$ be the sample mean of a random sample of size $n = 4$. Give the distribution of $\bar{X}$.

Since the marks are drawn from a normal population, it follows from Corollary 8.4(a) that the sample mean $\bar{X}$ is also normally distributed, i.e.
\[ \bar{X} \sim N\left( \mu_{\bar{X}}, \sigma^2_{\bar{X}} \right) \]
where
\[ \mu_{\bar{X}} = \mu = 60 \]
\[ \sigma^2_{\bar{X}} = \frac{\sigma^2}{n} = \frac{100}{4} = 25 \]
\[ \therefore \bar{X} \sim N(60, 25) \]
Corollary (Difference between two sample means)
Let $X_1, X_2, \ldots, X_{n_1}$ and $Y_1, Y_2, \ldots, Y_{n_2}$ be two independent random samples from two different normal populations
\[ X_i \sim N(\mu_1, \sigma_1^2) \quad \text{and} \quad Y_i \sim N(\mu_2, \sigma_2^2) \]
then
\[ \bar{X} - \bar{Y} \sim N\left( \mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \right) \]

Proof: From Corollary 8.4(a) it follows that
\[ \bar{X} \sim N\left( \mu_1, \frac{\sigma_1^2}{n_1} \right) \quad \text{and} \quad \bar{Y} \sim N\left( \mu_2, \frac{\sigma_2^2}{n_2} \right) \]
Further, $\bar{X}$ and $\bar{Y}$ are independent random variables, since the two random samples are independent. Hence
\[ \bar{X} - \bar{Y} = (1)\bar{X} + (-1)\bar{Y} \]
is a linear combination of two independent normal variables. It follows from Theorem 8.4 that $\bar{X} - \bar{Y}$ is a normal random variable with
\[ \mu_{\bar{X} - \bar{Y}} = (1)\mu_1 + (-1)\mu_2 = \mu_1 - \mu_2 \]
and
\[ \sigma^2_{\bar{X} - \bar{Y}} = (1)^2 \frac{\sigma_1^2}{n_1} + (-1)^2 \frac{\sigma_2^2}{n_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \]
Example (Normal 3)
Let $\bar{X}$ be the sample mean of a random sample of size $n_1 = 5$ from a $N(20, 25)$ distribution and let $\bar{Y}$ be the sample mean of an independent random sample of size $n_2 = 4$ from a $N(10, 16)$ distribution. Give the sampling distribution of $\bar{X} - \bar{Y}$.

Solution: It follows from Corollary 8.4(b) that
\[ \bar{X} - \bar{Y} \sim N\left( \mu_{\bar{X} - \bar{Y}}, \sigma^2_{\bar{X} - \bar{Y}} \right) \]
where
\[ \mu_{\bar{X} - \bar{Y}} = 20 - 10 = 10 \]
\[ \sigma^2_{\bar{X} - \bar{Y}} = \frac{25}{5} + \frac{16}{4} = 5 + 4 = 9 \]
\[ \therefore \bar{X} - \bar{Y} \sim N(10, 9) \]
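The parameters in Example (Normal 3) follow mechanically from the corollary. A small Python sketch of that bookkeeping is given here; the function name `diff_of_means` is hypothetical and only serves the illustration.

```python
def diff_of_means(mu1, var1, n1, mu2, var2, n2):
    """Mean and variance of Xbar - Ybar for two independent normal samples.

    var1 and var2 are the population variances (not standard deviations).
    """
    mean = mu1 - mu2
    var = var1 / n1 + var2 / n2
    return mean, var

# Xbar from n1 = 5 draws of N(20, 25), Ybar from n2 = 4 draws of N(10, 16)
mean_d, var_d = diff_of_means(20, 25, 5, 10, 16, 4)
print(mean_d, var_d)
```

Note that the variances add even though the means are subtracted, because the coefficient $-1$ is squared.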
Corollary (Sample proportion)
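Let $X \sim BIN(n, \theta)$. If the rule of thumb holds, i.e.
\[ n\theta \ge 5 \quad \text{and} \quad n\theta(1 - \theta) \ge 5, \]
then $X$ is approximately $N(n\theta, n\theta(1 - \theta))$ by Theorem 6.8, and the sample proportion satisfies
\[ \hat{\Theta} = \frac{\text{number of successes}}{n} = \frac{1}{n} X \overset{\text{approx}}{\sim} N\left( \theta, \frac{\theta(1 - \theta)}{n} \right) \]

According to Theorem 8.4 we know that $\hat{\Theta} \overset{\text{approx}}{\sim} N\left( \mu_{\hat{\Theta}}, \sigma^2_{\hat{\Theta}} \right)$ with
\[ \mu_{\hat{\Theta}} = \frac{1}{n} n\theta = \theta \]
\[ \sigma^2_{\hat{\Theta}} = \left( \frac{1}{n} \right)^2 n\theta(1 - \theta) = \frac{\theta(1 - \theta)}{n} \]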
The Chi-Square Distribution

Definition (6.2)
A random variable $X$ has a gamma distribution with $\alpha > 0$ and $\beta > 0$ if
\[ g(x; \alpha, \beta) = \begin{cases} \dfrac{1}{\beta^{\alpha} \Gamma(\alpha)} x^{\alpha - 1} e^{-x/\beta} & x > 0 \\ 0 & \text{elsewhere} \end{cases} \]

Notation:
\[ X \sim GAM(\alpha, \beta) \]
where $\alpha$ is the shape parameter and $\beta$ is the scale parameter.

Note: There are three basic shapes depending on whether $\alpha < 1$, $\alpha = 1$ or $\alpha > 1$.
[Figure: gamma densities $f(x)$ against $x$ for $\alpha = 0.5, 1, 2$.]
Definition (Gamma function)
The gamma function, denoted by $\Gamma(\kappa)$ for $\kappa > 0$, is given by
\[ \Gamma(\kappa) = \int_0^{\infty} t^{\kappa - 1} e^{-t}\, dt \]
Note:
\[ \Gamma(1) = \int_0^{\infty} e^{-t}\, dt = 1 \]

Theorem (Gamma function)
The gamma function satisfies the following properties:

- $\Gamma(\kappa) = (\kappa - 1)\Gamma(\kappa - 1)$
- $\Gamma(n) = (n - 1)!$ for $n = 1, 2, 3, \ldots$
- $\Gamma\left( \frac{1}{2} \right) = \sqrt{\pi}$
SAS Program:
\[ \Gamma(\kappa) = \int_0^{\infty} t^{\kappa - 1} e^{-t}\, dt \]

proc iml;
a=gamma(1);              /* Gamma(1) = 1 */
b=gamma(5);              /* Gamma(5) = 4! */
fact_4=fact(4);          /* 4! for comparison */
c=(gamma(0.5))##2;       /* Gamma(1/2)^2 = pi */
pi=22/7;                 /* rough rational approximation of pi */
print a b fact_4 c[f=10.8 l='correct pi'] pi[f=10.8];
quit;

SAS Output:

a b fact_4 correct pi pi

1 24 24 3.14159265 3.14285714
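The same three properties of the gamma function can be checked with Python's standard `math` module. This is only a sketch mirroring the SAS program above, with the variable names reused for comparison.

```python
import math

a = math.gamma(1)            # Gamma(1) = 1
b = math.gamma(5)            # Gamma(5) = 4!
fact_4 = math.factorial(4)   # 4! computed directly
c = math.gamma(0.5) ** 2     # Gamma(1/2)^2 should equal pi

print(a, b, fact_4, c, math.pi)
```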
Fact (Special Gamma distributions)
- If $X \sim GAM(\alpha, 2)$ then $X \sim \chi^2(2\alpha)$.
- If $X \sim GAM(1, \beta)$ then $X \sim EXP(\beta)$.

Theorem (6.3)
The mean and variance of the gamma distribution $X \sim GAM(\alpha, \beta)$ are given by
\[ \mu = \alpha\beta \quad \text{and} \quad \sigma^2 = \alpha\beta^2 \]

Theorem (6.4)
The MGF of the gamma distribution $X \sim GAM(\alpha, \beta)$ is given by
\[ M_X(t) = (1 - \beta t)^{-\alpha} \]
Example (Chi-Square 1)
Use the properties of $X \sim GAM(\alpha, \beta)$ to give the expected value, variance and MGF of

1. $X \sim EXP(\theta)$
2. $X \sim \chi^2(\nu)$

Solution:

1. $X \sim EXP(\theta) \equiv GAM(1, \theta)$
   a. Moments
   \[ \mu = (1)\theta = \theta \quad \text{and} \quad \sigma^2 = (1)\theta^2 = \theta^2 \]
   b. MGF
   \[ M_X(t) = (1 - \theta t)^{-1} = \frac{1}{1 - \theta t} \]
2. $X \sim \chi^2(\nu) \equiv GAM\left( \frac{\nu}{2}, 2 \right)$
   a. Moments
   \[ \mu = \frac{\nu}{2} \cdot 2 = \nu \quad \text{and} \quad \sigma^2 = \frac{\nu}{2} \cdot 2^2 = 2\nu \]
   b. MGF
   \[ M_X(t) = (1 - 2t)^{-\nu/2} \]
Example
Let $X_i \sim GAM(\alpha_i, \beta)$, $i = 1, 2, \ldots, n$, be independent random variables. Use the MGF technique to obtain the distribution of
\[ Y = \sum_{i=1}^{n} X_i. \]

\[ M_Y(t) = \prod_{i=1}^{n} M_{X_i}(t) = \prod_{i=1}^{n} (1 - \beta t)^{-\alpha_i} = (1 - \beta t)^{-\sum_{i=1}^{n} \alpha_i} \]
\[ \therefore Y \sim GAM\left( \sum_{i=1}^{n} \alpha_i,\ \beta \right). \]
Example (Chi-Square 2)
Let $X_i$, $i = 1, 2, \ldots, n$, be independent random variables. Give the distribution of $Y = \sum_{i=1}^{n} X_i$ when

1. $X_i \sim EXP(\theta)$
2. $X_i \sim \chi^2(\nu_i)$

Solution:

1. $X_i \sim EXP(\theta) \equiv GAM(1, \theta)$, so
\[ Y \sim GAM(n, \theta) \]
2. $X_i \sim \chi^2(\nu_i) \equiv GAM\left( \frac{\nu_i}{2}, 2 \right)$, so
\[ Y \sim GAM\left( \sum_{i=1}^{n} \frac{\nu_i}{2},\ 2 \right) \equiv GAM\left( \frac{1}{2} \sum_{i=1}^{n} \nu_i,\ 2 \right) \equiv \chi^2\left( \sum_{i=1}^{n} \nu_i \right) \]
Fact
If $X \sim \chi^2(\nu)$, then the parameter $\nu$ is referred to as the degrees of freedom and
\[ X \sim GAM\left( \frac{\nu}{2}, 2 \right) \]

Definition (6.4)
A random variable $X$ has a chi-square distribution with $\nu$ degrees of freedom if and only if its probability density is given by
\[ f(x; \nu) = \begin{cases} \dfrac{1}{2^{\nu/2} \Gamma(\nu/2)} x^{\nu/2 - 1} e^{-x/2} & x > 0 \\ 0 & \text{elsewhere} \end{cases} \]
[Figure: gamma densities $f(x)$ against $x$ with $\alpha = 0.5, 1, 2, 3$ and $\beta = 2$, i.e. $\chi^2$-distributions.]
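Table III: If $P\left( \chi^2 \le \chi^2_{0.95}(3) \right) = 0.95$ then $\chi^2_{0.95}(3) = 7.815$.
Percentiles of the $\chi^2$ distribution: values of $\chi^2_{\gamma}(\nu)$, where $\gamma = \int_0^{\chi^2_{\gamma}(\nu)} f(x; \nu)\, dx$. Columns are indexed by $\gamma$, rows by $\nu$.
ν 0.005 0.01 0.025 0.05 0.95 0.975 0.99 0.995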
2 0.010 0.020 0.051 0.103 5.991 7.378 9.210 10.597
3 0.072 0.115 0.216 0.352 7.815 9.348 11.345 12.838
4 0.207 0.297 0.484 0.711 9.488 11.143 13.277 14.860
5 0.412 0.554 0.831 1.145 11.070 12.833 15.086 16.750
9 1.735 2.088 2.700 3.325 16.919 19.023 21.666 23.589
10 2.156 2.558 3.247 3.940 18.307 20.483 23.209 25.188
11 2.603 3.053 3.816 4.575 19.675 21.920 24.725 26.757
27 11.808 12.879 14.573 16.151 40.113 43.195 46.963 49.645
28 12.461 13.565 15.308 16.928 41.337 44.461 48.278 50.993
29 13.121 14.256 16.047 17.708 42.557 45.722 49.588 52.336
30 13.787 14.953 16.791 18.493 43.773 46.979 50.892 53.672

[Figure: density $f(x)$ of the $\chi^2(3)$ distribution with the area 0.95 to the left of $\chi^2_{0.95}(3) = 7.815$ shaded.]
SAS Program

proc iml;
/* 95th percentile: GAM(1.5, 2) and chi-square(3) are the same distribution */
QNT_GAM=quantile('gamma',0.95,1.5,2);
QNT_CHI=quantile('chisq',0.95,3);
print QNT_GAM[f=8.4] QNT_CHI[f=8.4];

/* cumulative probability at the table value 7.815 */
CDF_GAM=cdf('gamma',7.815,1.5,2);
CDF_CHI=cdf('chisq',7.815,3);
print CDF_GAM[f=8.2] CDF_CHI[f=8.2];

/* quantiles for a grid of probabilities and CDF values for x = 1,...,10 */
p1={0.005,0.01,0.025,0.05,0.1,0.2,0.25,0.3,0.5};
p2=1-p1; p=unique(p1//p2)`;
x=(1:10)`;
QNT_chi_3=quantile('chisq',p,3);
CDF_chi_3=cdf('chisq',x,3);
print QNT_chi_3[r=p f=8.4] ' ' CDF_chi_3[r=x f=8.4];
quit;
SAS Output

QNT_GAM QNT_CHI
7.8147 7.8147

CDF_GAM CDF_CHI
0.95 0.95
SAS Output (Continue)
QNT_chi_3 CDF_chi_3
0.005 0.0717 1 0.1987
0.01 0.1148 2 0.4276
0.025 0.2158 3 0.6084
0.05 0.3518 4 0.7385
0.1 0.5844 5 0.8282
0.2 1.0052 6 0.8884
0.25 1.2125 7 0.9281
0.3 1.4237 8 0.9540
0.5 2.3660 9 0.9707
0.7 3.6649 10 0.9814
0.75 4.1083
0.8 4.6416
0.9 6.2514
0.95 7.8147
0.975 9.3484
0.99 11.3449
0.995 12.8382

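For readers without SAS, the chi-square CDF and quantile can be approximated with the standard library alone. The sketch below is one possible approach, not the method SAS uses: it evaluates the gamma CDF with the power series for the regularized lower incomplete gamma function and inverts it by bisection. The function names are illustrative.

```python
import math

def gamma_cdf(x, alpha, beta):
    """P(X <= x) for X ~ GAM(alpha, beta), shape alpha and scale beta."""
    if x <= 0:
        return 0.0
    z = x / beta
    # Series: sum over k of z^k / (alpha (alpha+1) ... (alpha+k))
    term = 1.0 / alpha
    total = term
    k = 1
    while term > 1e-15 * total:
        term *= z / (alpha + k)
        total += term
        k += 1
    # Prefactor z^alpha e^{-z} / Gamma(alpha), computed on the log scale
    return total * math.exp(-z + alpha * math.log(z) - math.lgamma(alpha))

def chi2_cdf(x, nu):
    """Chi-square(nu) is GAM(nu/2, 2)."""
    return gamma_cdf(x, nu / 2, 2)

def chi2_quantile(p, nu):
    """Invert the CDF by bisection (assumes 0 < p < 1)."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, nu) < p:   # expand until the bracket contains the quantile
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, nu) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(chi2_quantile(0.95, 3), chi2_cdf(7.815, 3))
```

The series converges for every positive argument but becomes slow for very large `x`; for the table ranges used here it is adequate.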
Fact
When the degrees of freedom $\nu$ is greater than 30, probabilities related to the $\chi^2$-distribution can be approximated with the normal distribution, i.e.
\[ \chi^2(\nu) \approx N(\nu, 2\nu) \]
[Figure: $\chi^2$ densities with $\nu = 5, 10, 15$ degrees of freedom and the $N(15, 30)$ density.]
Example (Chi-Square 3)
$X \sim \chi^2(30)$

1. Give the 95th percentile of $X$.
2. Use the normal approximation of the $\chi^2$-distribution to approximate the 95th percentile of $X$.

Solution:

1. From Table III, $\chi^2_{0.95}(30) = 43.773$.
2. Approximately $X \sim N(30, 60)$, so
\[ 0.95 = P(X \le P_{95}) = P\left( \frac{X - 30}{\sqrt{60}} \le \frac{P_{95} - 30}{\sqrt{60}} \right) \]
\[ \therefore \frac{P_{95} - 30}{\sqrt{60}} = z_{0.95} = 1.645 \]
\[ \therefore P_{95} = 30 + 1.645\sqrt{60} = 42.742 \approx \chi^2_{0.95}(30) \]
[Figure: densities of the $\chi^2(30)$ and $N(30, 60)$ distributions, with $P_{95} = 43.773$ for the $\chi^2(30)$ and $P_{95} = 42.742$ for the $N(30, 60)$ marked.]

Note: In general $\chi^2_{\gamma}(\nu) \approx \nu + z_{\gamma}\sqrt{2\nu}$ for $\nu$ large.
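The general approximation in the note is easy to evaluate numerically. A minimal Python sketch follows; the function name is an illustrative choice, and the unrounded $z_{0.95}$ is used, so the result differs slightly from the 42.742 obtained with $z_{0.95} = 1.645$.

```python
from math import sqrt
from statistics import NormalDist

def chi2_quantile_normal_approx(gamma, nu):
    """Approximate chi-square percentile: nu + z_gamma * sqrt(2 * nu)."""
    z = NormalDist().inv_cdf(gamma)   # standard normal quantile z_gamma
    return nu + z * sqrt(2 * nu)

approx = chi2_quantile_normal_approx(0.95, 30)
print(approx)
```

Comparing the result with the exact table value 43.773 shows that the approximation is still about one unit too low at $\nu = 30$, which reflects the right skewness of the $\chi^2$-distribution.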
Example (Chi-Square 4)
Suppose $X \sim \chi^2(12)$. The moment generating function of $X$ is
\[ M_X(t) = (1 - 2t)^{-6} \]
[Figure: density $f(x)$ of the $\chi^2(12)$ distribution.]

Calculate the expected value and variance of $X$ by making use of $M_X(t)$.
Calculate $\mu_1' = E(X)$
\[ M_X'(t) = -6(1 - 2t)^{-7}(-2) = 12(1 - 2t)^{-7} \]
\[ \mu_1' = E(X) = M_X'(0) = 12(1 - 2(0))^{-7} = 12 \]

Calculate $\mu_2' = E(X^2)$
\[ M_X''(t) = -84(1 - 2t)^{-8}(-2) = 168(1 - 2t)^{-8} \]
\[ \mu_2' = E(X^2) = M_X''(0) = 168 \]

Calculate $Var(X)$
\[ Var(X) = E(X^2) - [E(X)]^2 = 168 - 12^2 = 24 \]

Note: In general if $X \sim \chi^2(\nu)$ then
\[ M_X(t) = (1 - 2t)^{-\nu/2}, \quad \mu = \nu \quad \text{and} \quad \sigma^2 = 2\nu. \]
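The derivatives of the MGF at $t = 0$ can also be approximated numerically, which gives a quick check on the hand calculation. The sketch below uses central finite differences; the step size `h` and the function names are assumptions made for the illustration.

```python
def mgf_chi2(t, nu):
    """MGF of the chi-square(nu) distribution, valid for t < 1/2."""
    return (1 - 2 * t) ** (-nu / 2)

def moments_from_mgf(mgf, h=1e-4):
    """Approximate E(X) and Var(X) from central differences of the MGF at 0."""
    m1 = (mgf(h) - mgf(-h)) / (2 * h)                # approximates M'(0)  = E(X)
    m2 = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h ** 2  # approximates M''(0) = E(X^2)
    return m1, m2 - m1 ** 2

mean, var = moments_from_mgf(lambda t: mgf_chi2(t, 12))
print(mean, var)
```

The results are approximations whose error depends on `h`; they should be close to the exact values 12 and 24 but will not match them digit for digit.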
Theorem (8.7)
If $X \sim N(0, 1)$ then $X^2 \sim \chi^2(1)$.

Proof: For $t < \frac{1}{2}$,
\[
\begin{aligned}
M_{X^2}(t) &= E\left[ e^{tX^2} \right] \\
&= \int_{-\infty}^{\infty} e^{tx^2} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\, dx \\
&= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left[ -\frac{1}{2}x^2 + tx^2 \right] dx \\
&= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left[ -\frac{1}{2}x^2 (1 - 2t) \right] dx \\
&= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left[ -\frac{1}{2} \frac{(x - 0)^2}{(1 - 2t)^{-1}} \right] dx \\
&= \sqrt{(1 - 2t)^{-1}} \underbrace{\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\sqrt{(1 - 2t)^{-1}}} \exp\left[ -\frac{1}{2} \frac{(x - 0)^2}{(1 - 2t)^{-1}} \right] dx}_{\text{pdf of } N(0,\,(1 - 2t)^{-1}) \text{ integrates to } 1} \\
&= \frac{1}{\sqrt{1 - 2t}} = (1 - 2t)^{-1/2}
\end{aligned}
\]
i.e. the MGF of $\chi^2(1)$.
SAS Program:
proc iml;
call randseed(123);
z=randfun(1000,'normal');        /* 1000 draws from N(0,1) */
z_sq=z##2;                       /* squared values, chi-square(1) by Theorem 8.7 */
create sim var{z z_sq}; append;
quit;
proc univariate data=sim;
var z;
histogram / endpoints=-4 to 4 by 0.5 cfill=yellow
normal(mu=0 sigma=1);
title 'Standard normal variable';
run;
proc univariate data=sim;
var z_sq;
/* chi-square(1) fitted as GAM(0.5, 2); sigma is the scale parameter here */
histogram / endpoints=0 to 9 by 1 cfill=yellow
gamma(alpha=0.5 sigma=2);
title 'Chi-squared variable with 1 df';
run;
SAS Output:
The UNIVARIATE Procedure
Variable: Z_SQ
Moments
N 1000 Sum Weights 1000
Mean 1.01125058 Sum Observations 1011.25058
Std Deviation 1.43937936 Variance 2.07181294
Skewness 2.80591679 Kurtosis 10.3206346
Uncorrected SS 3092.36886 Corrected SS 2069.74113
Coeff Variation 142.336567 Std Error Mean 0.04551717
Fitted Gamma Distribution for Z_SQ
Parameters for Gamma Distribution
Parameter Symbol Estimate
Threshold Theta 0
Scale Sigma 2
Shape Alpha 0.5

Fitted Gamma Distribution for Z_SQ
Goodness-of-Fit Tests for Gamma Distribution
Test ----Statistic----- ------p Value------
Kolmogorov-Smirnov D 0.01772873 Pr > D >0.250
Cramer-von Mises W-Sq 0.03098187 Pr > W-Sq >0.250
Anderson-Darling A-Sq 0.22125090 Pr > A-Sq >0.250

Quantiles for Gamma Distribution


------Quantile------
Percent Observed Estimated
1.0 0.00024 0.00016
5.0 0.00329 0.00393
10.0 0.01818 0.01579
25.0 0.10637 0.10153
50.0 0.46033 0.45494
75.0 1.36249 1.32330
90.0 2.68974 2.70554
95.0 3.67182 3.84146
99.0 7.52377 6.63490

[Figure: empirical distribution of $Z \sim N(0, 1)$.]

[Figure: empirical distribution of $Z^2 \sim \chi^2(1)$.]
Theorem (8.9)
If $X_i \sim \chi^2(\nu_i)$, $i = 1, 2, \ldots, n$, are independent chi-square variables, then
\[ Y = \sum_{i=1}^{n} X_i \sim \chi^2\left( \sum_{i=1}^{n} \nu_i \right) \]

Proof:
\[
\begin{aligned}
M_Y(t) &= \prod_{i=1}^{n} M_{X_i}(t) && \text{independent variables} \\
&= \prod_{i=1}^{n} (1 - 2t)^{-\nu_i/2} = (1 - 2t)^{-\sum_{i=1}^{n} \nu_i / 2}
\end{aligned}
\]
i.e. the MGF of $\chi^2\left( \sum_{i=1}^{n} \nu_i \right)$.
Example (Chi-Square 5)
Suppose $X_1 \sim \chi^2(5)$, $X_2 \sim GAM(4, 2)$ and $X_3 \sim EXP(2)$ are independent random variables. Let $Y = X_1 + X_2 + X_3$.

1. Give the distribution of $Y$.

\[ X_1 \sim \chi^2(5) \]
\[ X_2 \sim GAM(4, 2) \equiv \chi^2(8) \]
\[ X_3 \sim EXP(2) \equiv GAM(1, 2) \equiv \chi^2(2) \]
Therefore $Y$ has a $\chi^2$-distribution with
\[ \nu = 5 + 8 + 2 = 15 \]
degrees of freedom, i.e. $Y \sim \chi^2(15)$.
[Figure: densities $f(x)$ of $X_1 \sim \chi^2(5)$, $X_2 \sim \chi^2(8)$, $X_3 \sim \chi^2(2)$ and $Y = \sum_{i=1}^{3} X_i \sim \chi^2(15)$.]
Example (Chi-Square 5, continued)

2. Give the 5th and the 95th percentile of $Y$.

\[ P_{05} = \chi^2_{0.05}(15) = 7.261 \]
\[ P_{95} = \chi^2_{0.95}(15) = 24.996 \]

quantile('chisq',0.05,15) = 7.2609
quantile('chisq',0.95,15) = 24.9958
Example (Chi-Square 5, continued)

3. Generate the empirical distribution of $Y$ with $n = 1000$ observations as follows:
   - For $X_1 \sim \chi^2(5)$, use the RANDSEED call with a seed of 123.
   - For $X_2 \sim GAM(4, 2)$, use the RANDSEED call with a seed of 456.
   - For $X_3 \sim EXP(2)$, use the RANDSEED call with a seed of 789.

Note: Compare the mean, variance, $P_5$ and $P_{95}$ of the empirical distribution with those of the theoretical distribution.
SAS Program:

proc iml;
/* a single seed is set here, as on the slide; the exercise statement lists seeds 123, 456 and 789 */
call randseed(123,1);
x1=randfun(1000,'chisquare',5);      /* X1 ~ chi-square(5) */
x2=randfun(1000,'gamma',4,2);        /* X2 ~ GAM(4,2)      */
x3=randfun(1000,'exponential',2);    /* X3 ~ EXP(2)        */
y=x1+x2+x3;
create sim var{y}; append;
quit;

ods graphics off;

proc univariate data=sim;
var y;
/* chi-square(15) fitted as GAM(7.5, 2) */
histogram / endpoints=0 to 40 by 5 cfill=red
gamma(alpha=7.5 sigma=2);
run;
SAS Output:
The UNIVARIATE Procedure
Variable: Y
Moments
N 1000 Sum Weights 1000
Mean 15.2761254 Sum Observations 15276.1254
Std Deviation 5.38771349 Variance 29.0274567
Skewness 0.71155841 Kurtosis 0.65223801
Uncorrected SS 262358.437 Corrected SS 28998.4292
Coeff Variation 35.2688482 Std Error Mean 0.17037446
Fitted Gamma Distribution for Y
Parameters for Gamma Distribution
Parameter Symbol Estimate
Threshold Theta 0
Scale Sigma 2
Shape Alpha 7.5
Mean 15
Std Dev 5.477226

Goodness-of-Fit Tests for Gamma Distribution
Test ----Statistic----- ------p Value------
Kolmogorov-Smirnov D 0.03480413 Pr > D 0.178
Cramer-von Mises W-Sq 0.33855866 Pr > W-Sq 0.107
Anderson-Darling A-Sq 1.92598711 Pr > A-Sq 0.101
Quantiles for Gamma Distribution
-------Quantile------
Percent Observed Estimated
1.0 5.78072 5.22935
5.0 7.62567 7.26094
10.0 8.86100 8.54676
25.0 11.34526 11.03654
50.0 14.63383 14.33886
75.0 18.50925 18.24509
90.0 22.54529 22.30713
95.0 25.02918 24.99579
99.0 31.21160 30.57791

[Figure: empirical distribution of $Y = X_1 + X_2 + X_3 \sim \chi^2(15)$.]
Theorem (8.11)
If $\bar{X}$ and $S^2$ are the mean and variance of a random sample of size $n$ from a normal population with mean $\mu$ and standard deviation $\sigma$, then

1. $\bar{X}$ and $S^2$ are independent
2. the random variable $\dfrac{(n - 1)S^2}{\sigma^2} \sim \chi^2(n - 1)$

Fact (Sum of Squared Deviations (SSD))
\[ \sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} (X_i - \mu)^2 - n(\bar{X} - \mu)^2 \]
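The SSD identity is stated without derivation. One way to see it, sketched here using only the definition $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$, is to add and subtract $\bar{X}$ inside the square:

```latex
\[
\begin{aligned}
\sum_{i=1}^{n} (X_i - \mu)^2
  &= \sum_{i=1}^{n} \left[ (X_i - \bar{X}) + (\bar{X} - \mu) \right]^2 \\
  &= \sum_{i=1}^{n} (X_i - \bar{X})^2
     + 2(\bar{X} - \mu) \sum_{i=1}^{n} (X_i - \bar{X})
     + n(\bar{X} - \mu)^2 \\
  &= \sum_{i=1}^{n} (X_i - \bar{X})^2 + n(\bar{X} - \mu)^2
\end{aligned}
\]
```

The cross term vanishes because $\sum_{i=1}^{n} (X_i - \bar{X}) = 0$. Moving $n(\bar{X} - \mu)^2$ to the other side gives the identity as stated.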
From SSD it follows that
\[ \sum_{i=1}^{n} (X_i - \mu)^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 + n(\bar{X} - \mu)^2 \]
Divide by $\sigma^2$ and use $\sum_{i=1}^{n} (X_i - \bar{X})^2 = (n - 1)S^2$:
\[ \underbrace{\sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^2}_{V_1} = \underbrace{\frac{(n - 1)S^2}{\sigma^2}}_{V_2} + \underbrace{\left( \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \right)^2}_{V_3} \]
\[ V_1 = V_2 + V_3 \]
Proof:

1. Assume independence of $\bar{X}$ and $S^2$.
2. Consider $V_1$, $V_3$ and then $V_2$.

- For $V_1 = \sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^2$:
  - $\frac{X_i - \mu}{\sigma}$, $i = 1, 2, \ldots, n$, are independent $N(0, 1)$ variables
  - $\left( \frac{X_i - \mu}{\sigma} \right)^2$, $i = 1, 2, \ldots, n$, are independent $\chi^2(1)$ variables, Theorem 8.7
  - $V_1 = \sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^2 \sim \chi^2(n)$, Theorem 8.9
- For $V_3 = \left( \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \right)^2$:
  - $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$, Corollary 8.4(a)
  - $\left( \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \right)^2 \sim \chi^2(1)$, Theorem 8.7
- Since $\bar{X}$ and $S^2$ are independent, it follows that $V_2$ and $V_3$ are independent.
- $V_1 = V_2 + V_3$ is the sum of two independent random variables, so
\[
\begin{aligned}
M_{V_1}(t) &= M_{V_2}(t)\, M_{V_3}(t) && \text{Theorem 7.3} \\
(1 - 2t)^{-n/2} &= M_{V_2}(t)\, (1 - 2t)^{-1/2} \\
\therefore M_{V_2}(t) &= (1 - 2t)^{-(n - 1)/2}
\end{aligned}
\]
Therefore $M_{V_2}$ is the MGF of a $\chi^2(n - 1)$ variable, so
\[ V_2 = \frac{(n - 1)S^2}{\sigma^2} \sim \chi^2(n - 1). \]
Lemma
Suppose $X_1, X_2, \ldots, X_n$ is a random sample from $N(\mu, \sigma^2)$. Then
\[ Var(S^2) = \frac{2\sigma^4}{n - 1} \]

Proof: Since
\[ \frac{(n - 1)S^2}{\sigma^2} \sim \chi^2(n - 1) \]
we know that
\[ Var\left( \frac{(n - 1)S^2}{\sigma^2} \right) = 2(n - 1) \implies \frac{(n - 1)^2}{\sigma^4} Var(S^2) = 2(n - 1) \]
\[ \therefore Var(S^2) = \frac{2\sigma^4}{n - 1} \]
Fact (Sample Variance)
Suppose that $X_i \sim N(\mu, \sigma^2)$, $i = 1, 2, \ldots, n$, is a random sample. Then the sample variance $S^2$ is a consistent estimator of $\sigma^2$.

Proof:

1. $S^2$ is an unbiased estimator of $\sigma^2$, since $E(S^2) = \sigma^2$, Theorem 10.1.
2. Since $\dfrac{(n - 1)S^2}{\sigma^2} \sim \chi^2(n - 1)$ it follows that
\[ Var\left( \frac{(n - 1)S^2}{\sigma^2} \right) = 2(n - 1) \implies \left( \frac{n - 1}{\sigma^2} \right)^2 Var(S^2) = 2(n - 1) \]
\[ \therefore Var(S^2) = \frac{\sigma^4}{(n - 1)^2}\, 2(n - 1) = \frac{2\sigma^4}{n - 1} \to 0 \quad \text{as } n \to \infty \]

Therefore, according to Theorem 10.3, $S^2$ is a consistent estimator of $\sigma^2$.
Example (Chi-Square 6)
A random sample of size $n = 10$ is taken from a normally distributed population of marks $X$, where $X \sim N(60, 10^2)$. The following statistics are considered:
\[ S^2 = \frac{\sum (X_i - \bar{X})^2}{9} \quad \text{and} \quad T = \frac{9S^2}{100} \]
Simulation: Generate $r = 1000$ samples with a seed of 123 from the population described above and compute the values of $S^2$ and $T$ for each sample.

1. Give the theoretical and empirical values for the following:
   a. $E(T)$ and $Var(T)$
   b. $P(T > 19.023)$
   c. The value $a$ such that $P(T < a) = 0.05$.
2. Give the theoretical and empirical values for the following:
   a. $E(S^2)$ and $Var(S^2)$
   b. $P(S^2 < 30)$
   c. The value $b$ such that $P(S^2 < b) = 0.95$.
SAS Program:

proc iml;
call randseed(123);
mu=60; sigma=10; n=10;
x=randfun({1000,10},'normal',mu,sigma);   /* 1000 samples (rows) of size 10 */
xbar=(mean(x`))`;                         /* sample mean of each row */
S2=(var(x`))`;                            /* sample variance of each row */
T=(n-1)*S2/(sigma##2);                    /* T = (n-1)S^2/sigma^2 */
create D6 var{xbar S2 T}; append; close D6;
quit;

ods graphics off;

proc univariate data=D6;
var T;
/* chi-square(9) fitted as GAM(4.5, 2) */
histogram / endpoints=0 to 20 by 2 cfill=purple
gamma(alpha=4.5 sigma=2);
run;
SAS Output:
The UNIVARIATE Procedure
Variable: T
Moments
N 1000 Sum Weights 1000
Mean 9.08734131 Sum Observations 9087.34131
Std Deviation 4.25733689 Variance 18.1249174
Skewness 0.87736093 Kurtosis 0.7979859
Uncorrected SS 100686.565 Corrected SS 18106.7925
Coeff Variation 46.8490919 Std Error Mean 0.13462881

Fitted Gamma Distribution for T


Parameters for Gamma Distribution
Parameter Symbol Estimate
Threshold Theta 0
Scale Sigma 2
Shape Alpha 4.5
Mean 9
Std Dev 4.242641
Goodness-of-Fit Tests for Gamma Distribution
Test ----Statistic----- ------p Value------
Kolmogorov-Smirnov D 0.01788304 Pr > D >0.250
Cramer-von Mises W-Sq 0.05738230 Pr > W-Sq >0.250
Anderson-Darling A-Sq 0.36603765 Pr > A-Sq >0.250

Quantiles for Gamma Distribution


-------Quantile------
Percent Observed Estimated
1.0 2.29483 2.08790
5.0 3.40832 3.32511
10.0 4.11774 4.16816
25.0 6.01724 5.89883
50.0 8.40758 8.34283
75.0 11.47591 11.38875
90.0 14.74488 14.68366
95.0 17.37229 16.91898
99.0 22.08728 21.66599

[Figure: empirical distribution of $T = \dfrac{9S^2}{100}$.]
Answers:

1. $T = \dfrac{9S^2}{100} = \dfrac{(n - 1)S^2}{\sigma^2} \sim \chi^2(9)$
   a. $E(T) = 9$ and $Var(T) = 18$
      - Empirical: mean(T) = 9.0873
      - Empirical: var(T) = 18.1249
   b. From the $\chi^2$-table, $P(T > 19.023) = 0.025$
      - Theoretical: 1-cdf('chisq',19.023,9) = 0.0250
      - Empirical: mean(T>19.023) = 0.0280
   c. From the $\chi^2$-table it follows that $P(T < 3.325) = 0.05$, so $a = 3.325$.
      - Theoretical: quantile('chisq',0.05,9) = 3.3251
      - Empirical: call qntl(Q1c_Em,T,0.05) = 3.4083
2. $\dfrac{(n - 1)S^2}{\sigma^2} \sim \chi^2(9)$
   a. Expected value:
   \[ E\left( \frac{9S^2}{100} \right) = 9 \implies \frac{9}{100} E(S^2) = 9 \implies E(S^2) = 100 = \sigma^2 \quad \text{(unbiased estimator)} \]
      - Empirical: mean(S2) = 100.9705

      Variance:
   \[ Var\left( \frac{9S^2}{100} \right) = 18 \implies \left( \frac{9}{100} \right)^2 Var(S^2) = 18 \implies Var(S^2) = 18 \left( \frac{100}{9} \right)^2 \approx 2222.2 \]
      - Empirical: var(S2) = 2237.6
   b.
   \[
   \begin{aligned}
   P(S^2 < 30) &= P\left( \frac{9S^2}{100} < \frac{9(30)}{100} \right) \\
   &= P\left( \chi^2(9) < \frac{27}{10} \right) \\
   &= P\left( \chi^2(9) < 2.7 \right) \\
   &= 0.025
   \end{aligned}
   \]
      - Theoretical: cdf('chisq',2.7,9) = 0.0250
      - Empirical: mean(S2<30) = 0.0240
   c.
   \[ 0.95 = P(S^2 < b) = P\left( \frac{9S^2}{100} < \frac{9b}{100} \right), \quad \text{so } \frac{9b}{100} = \chi^2_{0.95}(9) \]
   \[ \therefore \frac{9b}{100} = 16.919 \implies b = 16.919 \cdot \frac{100}{9} = 187.99 \]
      - Theoretical: (sigma##2/df)*quantile('chisq',0.95,9) = 187.988
      - Empirical: call qntl(Q2c_Em,S2,0.95) = 193.0254
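The simulation in Example (Chi-Square 6) can be imitated with Python's standard library. This is a sketch only: Python's random number stream differs from SAS, so the empirical values will not reproduce the SAS output digit for digit, even with the same seed value.

```python
import random
from statistics import mean, variance

rng = random.Random(123)            # seed value echoes the slides; the stream is not SAS's
mu, sigma, n, r = 60, 10, 10, 1000  # population N(60, 10^2), r samples of size n

t_values = []
for _ in range(r):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    s2 = variance(sample)                        # sample variance, divisor n - 1
    t_values.append((n - 1) * s2 / sigma ** 2)   # T = (n-1) S^2 / sigma^2

t_mean = mean(t_values)                          # theory: E(T) = 9
t_var = variance(t_values)                       # theory: Var(T) = 18
tail = sum(t > 19.023 for t in t_values) / r     # theory: P(T > 19.023) = 0.025

print(t_mean, t_var, tail)
```

With $r = 1000$ the empirical mean has a standard error of roughly $\sqrt{18/1000} \approx 0.13$, so values within a few tenths of 9 are to be expected.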
The t-Distribution

For a random sample from a normal population with mean $\mu$ and variance $\sigma^2$, the random variable $\bar{X}$ has a normal distribution with mean $\mu$ and variance $\dfrac{\sigma^2}{n}$, i.e.
\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1) \]

In most applications the population standard deviation $\sigma$ is unknown.

What is the exact distribution of
\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
for random samples from normal populations?
Theorem (8.12)
If $Y \sim \chi^2(\nu)$ and $Z \sim N(0, 1)$ are independent random variables, then
\[ T = \frac{Z}{\sqrt{Y/\nu}} \sim t(\nu) \]
with pdf
\[ f(t) = \frac{\Gamma\left( \frac{\nu + 1}{2} \right)}{\sqrt{\pi\nu}\, \Gamma\left( \frac{\nu}{2} \right)} \left( 1 + \frac{t^2}{\nu} \right)^{-\frac{\nu + 1}{2}}, \quad -\infty < t < \infty \]
This is called the $t$ distribution with $\nu$ degrees of freedom.
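Table II: If $P[T \le t_{0.95}(10)] = 0.95$ then $t_{0.95}(10) = 1.812$.
Percentiles of the $t$ distribution: values of $t_{\gamma}(\nu)$, where $\gamma = \int_{-\infty}^{t_{\gamma}(\nu)} f(t; \nu)\, dt$. Columns are indexed by $\gamma$, rows by $\nu$.
ν 0.9 0.95 0.975 0.99 0.995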
1 3.078 6.314 12.706 31.821 63.657
2 1.886 2.920 4.303 6.965 9.925
5 1.476 2.015 2.571 3.365 4.032
6 1.440 1.943 2.447 3.143 3.707
7 1.415 1.895 2.365 2.998 3.499
8 1.397 1.860 2.306 2.896 3.355
9 1.383 1.833 2.262 2.821 3.250
10 1.372 1.812 2.228 2.764 3.169
29 1.311 1.699 2.045 2.462 2.756
∞ 1.282 1.645 1.960 2.326 2.576

[Figure: density of the $t(10)$ distribution with the central area 0.9 shaded; $t_{0.05}(10) = -t_{0.95}(10) = -1.812$ and $t_{0.95}(10) = 1.812$.]
SAS Program and Output

proc iml;
t_05=quantile('t',0.05,10);
t_95=quantile('t',0.95,10);
print t_05[f=8.4] t_95[f=8.4];

prob1=cdf('t',-1.8125,10);
prob2=cdf('t',1.8125,10);
print prob1[f=8.4] prob2[f=8.4];

    t_05     t_95
 -1.8125   1.8125

   prob1    prob2
  0.0500   0.9500
[Figure: pdfs f(t; df) of the t(1), t(2), t(3) and t(9) distributions and the N(0, 1) distribution, plotted against t.]
Fact
The t distribution is symmetrical about t = 0.
For $\nu \ge 30$, probabilities may be approximated with the standard normal distribution.

p={0.05,0.95};
t30=quantile('t',p,30);
t50=quantile('t',p,50);
t100=quantile('t',p,100);
z=quantile('normal',p);
print p t30[f=8.4] t50[f=8.4] t100[f=8.4] z[f=8.4];

    p       t30       t50      t100         z
 0.05   -1.6973   -1.6759   -1.6602   -1.6449
 0.95    1.6973    1.6759    1.6602    1.6449
Theorem (8.13)
If $\bar{X}$ and $S^2$ are the mean and variance of a random sample of size n from a normal population with mean $\mu$ and variance $\sigma^2$, then

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t(n-1)$$

Proof:

From Corollary 8.4(a) it follows that

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$$

From Theorem 8.11 it follows that

$$Y = \frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1)$$

Since $\bar{X}$ and $S^2$ are independent according to Theorem 8.11, it follows that Y and Z are independent.

From Theorem 8.12 it follows that

$$T = \frac{Z}{\sqrt{Y/(n-1)}} = \frac{\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}}}{\sqrt{\dfrac{(n-1)S^2}{\sigma^2 (n-1)}}} = \frac{\bar{X}-\mu}{S/\sqrt{n}} \sim t(n-1)$$
Example (t distribution)
Let $X_1, X_2, \ldots, X_9$ be a random sample from a normal distribution, where

$$X_i \sim N(6, 25).$$

Let $\bar{X}$ and $S^2$ represent the sample mean and sample variance, respectively.
Simulation:
Generate r = 1000 samples with a seed of 123 from the population described above and compute the values of $\bar{X}$ and $S^2$ for each sample.
Calculate the theoretical and empirical values for the following:

1. $P(3 < \bar{X} < 7)$
2. $P\left(1.860 < \frac{3(\bar{X} - 6)}{S}\right)$
3. $P(S^2 < 6.8125)$
4. $P(S^2 < 31.9375)$
SAS Program:

proc iml;
n=9; mu=6; sigma=sqrt(25);
stderr=sigma/sqrt(9);
r=1000; dim=r//n;

call randseed(123); x=randfun(dim,'normal',mu,sigma);

xbar=(mean(x`))`;
s2=(var(x`))`;
t=(xbar-mu)/sqrt(s2/n);
create DT var{xbar s2 t}; append;
show contents; closefile;
quit;

proc univariate data=DT;
var t;
run;
SAS Output:
The UNIVARIATE Procedure
Variable: T
Moments
N 1000 Sum Weights 1000
Mean 0.02386244 Sum Observations 23.8624437
Std Deviation 1.13113618 Variance 1.27946906
Skewness 0.06800334 Kurtosis 1.28370653
Uncorrected SS 1278.75901 Corrected SS 1278.18959
Coeff Variation 4740.23615 Std Error Mean 0.03576967
Quantiles (Definition 5)
Level Quantile
100% Max 5.59736225
99% 2.58432837
95% 1.83053765
90% 1.40047200
75% Q3 0.74458650
50% Median 0.00927994
25% Q1 -0.69036993
10% -1.33760939
5% -1.74870274
1% -2.70103880
0% Min -4.59805608
Answers:
1. Since $\bar{X} \sim N\left(6, \frac{25}{9}\right)$ it follows that

$$\begin{aligned}
P(3 < \bar{X} < 7) &= P\left(\frac{3-6}{5/3} < \frac{\bar{X}-6}{5/3} < \frac{7-6}{5/3}\right) \\
&= P(-1.8 < Z < 0.6), \qquad Z \sim N(0, 1) \\
&= \Phi(0.6) - \Phi(-1.8) \\
&= 0.7257 - 0.0359 = 0.6898
\end{aligned}$$

- Theoretical:
  cdf('normal',7,mu,stderr)-cdf('normal',3,mu,stderr)
  OR:
  cdf('normal',(7-mu)/stderr)-cdf('normal',(3-mu)/stderr);
  gives 0.6898
- Empirical: mean(xbar>3 & xbar<7) gives 0.6750
2.

$$\begin{aligned}
P\left(\frac{3(\bar{X}-6)}{S} > 1.86\right) &= P\left(\frac{\bar{X}-6}{S/\sqrt{9}} > 1.86\right) \\
&= P(t(8) > 1.86) \\
&= 1 - P(t(8) \le 1.86) \\
&= 1 - 0.95 = 0.05
\end{aligned}$$

- Theoretical: 1-cdf('t',1.86,n-1) gives 0.0500
- Empirical: mean(t>1.86) gives 0.0470
3.

$$\begin{aligned}
P\left(S^2 < 6.8125\right) &= P\left(\frac{8S^2}{25} < \frac{8(6.8125)}{25}\right) \\
&= P\left(\chi^2(8) < 2.18\right), \qquad \frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1) \\
&= 0.025
\end{aligned}$$

- $\chi^2$-Table: $\chi^2_{0.025}(8) = 2.18$
- Theoretical: cdf('chisq',(n-1)/sigma##2*6.8125,n-1) gives 0.0250
- Empirical: mean(S2<6.8125) gives 0.0130
4.

$$\begin{aligned}
P\left(S^2 < 31.9375\right) &= P\left(\frac{(n-1)S^2}{\sigma^2} < \frac{8(31.9375)}{25}\right) \\
&= P\left(\chi^2(8) < 10.22\right), \qquad \frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1) \\
&= 0.75
\end{aligned}$$

- Theoretical: cdf('chisq',(n-1)/sigma##2*31.9375,n-1) gives 0.7501
- Empirical: mean(S2<31.9375) gives 0.7460
The F-Distribution

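Theorem (8.14)
If $U \sim \chi^2(\nu_1)$ and $V \sim \chi^2(\nu_2)$ are independent random variables then

$$F = \frac{U/\nu_1}{V/\nu_2} \sim F(\nu_1, \nu_2)$$

is a random variable having an F distribution with pdf

$$g(f) = \frac{\Gamma\left(\frac{\nu_1+\nu_2}{2}\right)}{\Gamma\left(\frac{\nu_1}{2}\right)\Gamma\left(\frac{\nu_2}{2}\right)} \left(\frac{\nu_1}{\nu_2}\right)^{\frac{\nu_1}{2}} f^{\frac{\nu_1}{2}-1} \left(1 + \frac{\nu_1}{\nu_2} f\right)^{-\frac{1}{2}(\nu_1+\nu_2)}$$

for f > 0 and g(f) = 0 elsewhere.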
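The F density in Theorem 8.14 can be checked in the same way by coding the formula in PROC IML and comparing it with the built-in F density. A minimal sketch, assuming 5 and 10 degrees of freedom and a few arbitrary evaluation points:

```sas
proc iml;
v1 = 5; v2 = 10;              /* numerator and denominator df (illustrative) */
f  = {0.5, 1, 3.33};          /* points at which the density is evaluated */

/* constant part: Gamma((v1+v2)/2) / (Gamma(v1/2)*Gamma(v2/2)) * (v1/v2)^(v1/2) */
c = gamma((v1+v2)/2) / (gamma(v1/2)*gamma(v2/2)) * (v1/v2)##(v1/2);

/* density written out from the formula in Theorem 8.14 */
g_formula = c # f##(v1/2 - 1) # (1 + (v1/v2)#f)##(-(v1+v2)/2);

/* built-in F density for comparison */
g_builtin = pdf('F', f, v1, v2);

print f g_formula[f=8.5] g_builtin[f=8.5];
quit;
```

The two printed columns should coincide if the exponents and the ratio $\nu_1/\nu_2$ have been entered as in the theorem.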


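Corollary (8.14)
If $U \sim F(\nu_1, \nu_2)$ then

$$\frac{1}{U} \sim F(\nu_2, \nu_1)$$

Proof: Suppose $X_1 \sim \chi^2(\nu_1)$ and $X_2 \sim \chi^2(\nu_2)$ are independent. Then

$$U = \frac{X_1/\nu_1}{X_2/\nu_2} \sim F(\nu_1, \nu_2)$$

but

$$\frac{1}{U} = \frac{X_2/\nu_2}{X_1/\nu_1} \sim F(\nu_2, \nu_1) \quad \text{according to Theorem 8.14.}$$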
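One way to see Corollary 8.14 numerically is through the cdf: for x > 0, the event U ≤ x is the same as 1/U ≥ 1/x, so P(U ≤ x) computed under F(ν1, ν2) should equal 1 − P(1/U ≤ 1/x) computed under F(ν2, ν1). The sketch below assumes ν1 = 5, ν2 = 10 and a few arbitrary x values.

```sas
proc iml;
v1 = 5; v2 = 10;              /* illustrative degrees of freedom */
x  = {0.5, 1, 2, 3};          /* arbitrary positive evaluation points */

/* P(U <= x) with U ~ F(v1, v2) */
lhs = cdf('F', x, v1, v2);

/* P(1/U >= 1/x) = 1 - P(1/U <= 1/x), using 1/U ~ F(v2, v1) */
rhs = 1 - cdf('F', 1/x, v2, v1);

print x lhs[f=8.4] rhs[f=8.4];
quit;
```

Agreement between the lhs and rhs columns is what the corollary predicts.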

Table IV: $P[F \le f_{0.95}(5, 10)] = 0.95$, so $f_{0.95}(5, 10) = 3.33$.

Table IV: Values of $f_{0.95}(\nu_1, \nu_2)$

            ν1 = Degrees of freedom for numerator
 ν2       1       2       3       4       5       6      10
  1     161     200     216     225     230     234     242
  2   18.51   19.00   19.16   19.25   19.30   19.33   19.40
  3   10.13    9.55    9.28    9.12    9.01    8.94    8.79
  4    7.71    6.94    6.59    6.39    6.26    6.16    5.96
  5    6.61    5.79    5.41    5.19    5.05    4.95    4.74
  6    5.99    5.14    4.76    4.53    4.39    4.28    4.06
  7    5.59    4.74    4.35    4.12    3.97    3.87    3.64
  8    5.32    4.46    4.07    3.84    3.69    3.58    3.35
  9    5.12    4.26    3.86    3.63    3.48    3.37    3.14
 10    4.96    4.10    3.71    3.48    3.33    3.22    2.98
 ν2 = Degrees of freedom for denominator
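Rows of Table IV can be regenerated with the QUANTILE function. The following is a sketch only: it rebuilds the rows for ν2 = 5 and ν2 = 10, and the matrix names (v1, v2, tab) are chosen here for illustration.

```sas
proc iml;
v1 = {1 2 3 4 5 6 10};            /* numerator degrees of freedom (columns) */
v2 = {5, 10};                     /* two denominator degrees of freedom (rows) */
tab = j(nrow(v2), ncol(v1), .);   /* empty table to be filled */

do i = 1 to nrow(v2);
   do k = 1 to ncol(v1);
      /* 95th percentile of F(v1[k], v2[i]) */
      tab[i,k] = quantile('F', 0.95, v1[k], v2[i]);
   end;
end;

print v2 tab[f=8.2];
quit;
```

The printed rows can be compared against the corresponding rows of Table IV.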

$P(F \le f_{0.95}(5, 10)) = 0.95$, so $f_{0.95}(5, 10) = 3.33$.

[Figure: pdf of the F(5, 10) distribution plotted against f, with an area of 0.95 marked.]
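Fact

$$f_\gamma(\nu_1, \nu_2) = \frac{1}{f_{1-\gamma}(\nu_2, \nu_1)}$$

Proof: Suppose $F \sim F(\nu_2, \nu_1)$. By Corollary 8.14, $1/F \sim F(\nu_1, \nu_2)$, so

$$\gamma = P\left[F \le f_\gamma(\nu_2, \nu_1)\right] = \underbrace{P\left[\frac{1}{F} \le f_\gamma(\nu_1, \nu_2)\right]}_{\gamma}$$

By definition it follows that

$$\begin{aligned}
1 - \gamma &= P\left[F \le f_{1-\gamma}(\nu_2, \nu_1)\right] \\
&= P\left[\frac{1}{F} \ge \frac{1}{f_{1-\gamma}(\nu_2, \nu_1)}\right] = 1 - \underbrace{P\left[\frac{1}{F} \le \frac{1}{f_{1-\gamma}(\nu_2, \nu_1)}\right]}_{\gamma}
\end{aligned}$$

$$\therefore\ f_\gamma(\nu_1, \nu_2) = \frac{1}{f_{1-\gamma}(\nu_2, \nu_1)}$$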
Example (F-distribution 1)
Suppose $F \sim F(5, 10)$.

1. Give the values for $P_{05}$ and $P_{95}$.
2. Give the values for $P_{05}$ and $P_{95}$ with SAS.

1. From Table IV(a) it follows that

$$f_{0.05}(5, 10) = \frac{1}{f_{0.95}(10, 5)} = \frac{1}{4.74} = 0.211$$

$$f_{0.95}(5, 10) = 3.33$$
[Figure: pdf of the F(5,10) distribution, with an area of 0.9 marked.]
2. SAS Program and Output:

proc iml;
v1=quantile('F',0.05,5,10); v1_alt=1/quantile('F',0.95,10,5);
v2=quantile('F',0.95,5,10); v2_alt=1/quantile('F',0.05,10,5);
print v1 [f=8.3] v1_alt [f=8.3] v2 [f=8.3] v2_alt [f=8.3];

me=quantile('F',0.5,5,10);
pp1=cdf('F',1,5,10); pp2=cdf('F',2,5,10); pp3=cdf('F',3,5,10);
print me[f=8.3 l='me']
pp1 [f=8.3 l='1'] pp2 [f=8.3 l='2'] pp3 [f=8.3 l='3'];

     v1   v1_alt       v2   v2_alt
  0.211    0.211    3.326    3.326

     me        1        2        3
  0.932    0.535    0.836    0.934
Theorem (8.15)
If $S_1^2$ and $S_2^2$ are the variances of independent random samples of sizes $n_1$ and $n_2$ from normal populations with the variances $\sigma_1^2$ and $\sigma_2^2$, then

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F(n_1 - 1, n_2 - 1)$$

Proof:
From Theorem 8.11 it follows that

$$U = \frac{(n_1-1)S_1^2}{\sigma_1^2} \sim \chi^2(n_1-1) \quad \text{and} \quad V = \frac{(n_2-1)S_2^2}{\sigma_2^2} \sim \chi^2(n_2-1)$$

are two independent random variables, since the samples are independent.
From Theorem 8.14 it follows that

$$F = \frac{U/(n_1-1)}{V/(n_2-1)} = \frac{\dfrac{(n_1-1)S_1^2}{\sigma_1^2}\Big/(n_1-1)}{\dfrac{(n_2-1)S_2^2}{\sigma_2^2}\Big/(n_2-1)} = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F(n_1-1, n_2-1)$$
Example (F-distribution 2)
Consider the following two independent random samples

$$X_i \sim N(10, 18), \quad \text{for } i = 1, 2, \ldots, 9$$

and

$$Y_j \sim N(20, 24), \quad \text{for } j = 1, 2, \ldots, 8.$$

Let $\bar{X}$ and $S_X^2$ represent the sample mean and sample variance of the first random sample.
Let $\bar{Y}$ and $S_Y^2$ represent the sample mean and sample variance of the second random sample.
Define the statistics

$$U = \frac{S_X^2/18}{S_Y^2/24}, \qquad V = \frac{S_X^2}{S_Y^2} \qquad \text{and} \qquad W = 3\bar{X} - 2\bar{Y}.$$
Example (F-distribution 2 (Continued))
Simulation: Generate 1000 samples from the following distributions:

- $X \sim N(10, 18)$: use the RANDSEED call with a seed of 123 to generate samples of size n = 9.
- $Y \sim N(20, 24)$: use the RANDSEED call with a seed of 987 to generate samples of size n = 8.

Compute the values for U, V and W for each sample.

Questions: Calculate the theoretical and empirical values of the following:

1. The 5th and the 95th percentile of U.
2. $P(S_X^2 < S_Y^2)$. Use SAS to calculate the theoretical probability.
3. The 5th and the 95th percentile of V.
4. $P(3\bar{X} < 2\bar{Y})$
5. The 5th and the 95th percentile of W.
6. The value d such that $P(|W - \mu_W| < d) = 0.95$.
SAS Program: Simulation

proc iml;
mu1=10; sigma1=sqrt(18);
mu2=20; sigma2=sqrt(24);

call randseed(123,1); XX=randfun({1000,9},'normal',mu1,sigma1);
call randseed(987,1); YY=randfun({1000,8},'normal',mu2,sigma2);

MeanX=(mean(XX`))`; VarX=(var(XX`))`;
MeanY=(mean(YY`))`; VarY=(var(YY`))`;

U=(VarX/(sigma1##2))/(VarY/(sigma2##2));
V=VarX/VarY;
W=3*MeanX-2*MeanY;

create D2 var{U V W}; append; show contents; closefile;

quit;
1.

$$U = \frac{S_X^2/18}{S_Y^2/24} = \frac{S_X^2/\sigma_X^2}{S_Y^2/\sigma_Y^2} \sim F(n_1 - 1, n_2 - 1) = F(8, 7)$$

a. From the F-table the 95th percentile of U is

$$f_{0.95}(8, 7) = 3.73.$$

b. From the F-table the 5th percentile of U is

$$\frac{1}{f_{0.95}(7, 8)} = \frac{1}{3.5} = 0.2857$$

Let: pp={0.95,0.05};

- Theoretical: quantile('F',pp,8,7) gives 3.7257 and 0.2857
- Empirical: call qntl(Q1_Em,U,pp) gives 3.6420 and 0.3071
2.

$$\begin{aligned}
P\left(S_X^2 < S_Y^2\right) &= P\left(\frac{S_X^2}{S_Y^2} < 1\right) = P\left(\frac{S_X^2/18}{S_Y^2/24} < \frac{24}{18}\right) \\
&= P\left(F(8, 7) < \frac{4}{3}\right)
\end{aligned}$$

- Theoretical: cdf('F',(sigma2##2/sigma1##2),8,7) gives 0.6413
- Empirical: mean(varX<varY) gives 0.6530
3. Solve for a:

$$0.95 = P(V < a) = P\left(\frac{S_X^2}{S_Y^2} < a\right) = P\left(\underbrace{\frac{S_X^2/\sigma_1^2}{S_Y^2/\sigma_2^2}}_{F(8,7)} < \underbrace{a \cdot \frac{24}{18}}_{f_{0.95}(8,7)}\right)$$

From the F-table it follows that

$$a \cdot \frac{24}{18} = \underbrace{3.73}_{f_{0.95}(8,7)} \implies a = \underbrace{3.73}_{f_{0.95}(8,7)} \cdot \frac{18}{24} = 2.7975$$

- Theoretical: (18/24)*quantile('F',pp,8,7) gives 2.7943
- Empirical: call qntl(Q3_Em,V,pp) gives 2.7315
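Question 3 also asks for the 5th percentile of V, which is not worked out above. A sketch of that step, following the same scaling argument and reusing the value $f_{0.05}(8,7) = 1/f_{0.95}(7,8) = 0.2857$ from the answer to question 1 (the symbol $a_{05}$ is introduced here only for illustration):

```latex
\begin{aligned}
0.05 &= P(V < a_{05}) = P\left(\frac{S_X^2/18}{S_Y^2/24} < a_{05} \cdot \frac{24}{18}\right) \\
\Rightarrow\ a_{05} \cdot \frac{24}{18} &= f_{0.05}(8, 7) = 0.2857 \\
\Rightarrow\ a_{05} &= 0.2857 \cdot \frac{18}{24} \approx 0.214
\end{aligned}
```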

4. We know that

$$\bar{X} \sim N(10, 18/9) = N(10, 2)$$

$$\bar{Y} \sim N(20, 24/8) = N(20, 3)$$

are two independent normal variables, therefore

$$W = 3\bar{X} - 2\bar{Y} \sim N\left(\mu_W, \sigma_W^2\right)$$

where

$$\mu_W = 3(10) + (-2)(20) = -10$$

$$\sigma_W^2 = 3^2(2) + (-2)^2(3) = 30$$
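The mean and variance of W are an instance of Theorem 8.4 for a linear combination of independent normal variables. A small PROC IML sketch of that arithmetic, with vector names (a, m, v) chosen here for illustration:

```sas
proc iml;
a = {3, -2};                  /* coefficients of Xbar and Ybar in W */
m = {10, 20};                 /* means of Xbar and Ybar */
v = {2, 3};                   /* variances of Xbar and Ybar: 18/9 and 24/8 */

muW  = sum(a # m);            /* sum of a_i * mu_i        = 3(10) + (-2)(20) */
varW = sum(a##2 # v);         /* sum of a_i^2 * sigma_i^2 = 9(2) + 4(3)      */

print muW varW;               /* -10 and 30, as derived above */
quit;
```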

$$\begin{aligned}
\therefore\ P(3\bar{X} < 2\bar{Y}) &= P(3\bar{X} - 2\bar{Y} < 0) \\
&= P(W < 0) \\
&= P\left(Z < \frac{0 - (-10)}{\sqrt{30}}\right) \\
&= P(Z < 1.83) = 0.9664
\end{aligned}$$

- Theoretical: cdf('normal',0,-10,sqrt(30)) gives 0.9661
- Empirical: mean(3*meanx<2*meany) gives 0.9750
5. From no. 4 we know that

$$W = 3\bar{X} - 2\bar{Y} \sim N(-10, 30).$$

$$0.95 = P(W < P_{95}) = P\left(Z < \underbrace{\frac{P_{95} - (-10)}{\sqrt{30}}}_{\Phi^{-1}(0.95) = z_{0.95}}\right)$$

$$\therefore\ \frac{P_{95} + 10}{\sqrt{30}} = \Phi^{-1}(0.95) = z_{0.95} = 1.645$$

$$P_{95} = -10 + 1.645\sqrt{30} = -0.98996$$
Note:

$$P_{95} = \mu_W + \Phi^{-1}(0.95)\,\sigma_W = -10 + 1.645\sqrt{30} = -0.98996$$

$$P_{05} = \mu_W + \Phi^{-1}(0.05)\,\sigma_W = -10 - 1.645\sqrt{30} = -19.01$$

- pp={0.05, 0.95};
- Theoretical: quantile('normal',pp,-10,sqrt(30)) gives -19.0092 and -0.9908
- Empirical: call qntl(Q5_Em,W,pp) gives -18.5693 and -1.6064
6. From no. 4 we know that

$$W = 3\bar{X} - 2\bar{Y} \sim N(-10, 30).$$

- Theoretical:
  quantile('normal',0.975)*sqrt(30)
  OR:
  quantile('normal',0.975,-10,sqrt(30))-(-10)
  gives 10.7352
- Empirical: call qntl(Q6_Em,abs(W-(-10)),0.95) gives 9.9957
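The SAS expressions for question 6 rest on standardising W. A sketch of the hand calculation, using $z_{0.975} = 1.960$ from the last row of Table II:

```latex
\begin{aligned}
0.95 &= P\left(|W - \mu_W| < d\right) = P\left(|Z| < \frac{d}{\sqrt{30}}\right), \qquad Z = \frac{W - \mu_W}{\sigma_W} \sim N(0, 1) \\
\Rightarrow\ \frac{d}{\sqrt{30}} &= z_{0.975} = 1.960 \\
\Rightarrow\ d &= 1.960\sqrt{30} \approx 10.735
\end{aligned}
```

The 0.975 quantile appears because the central 95% of a symmetric distribution leaves 2.5% in each tail.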
