Sampling Distributions Part B
STK 220
© University of Pretoria
Theorem (8.4)
If $X_i \sim N(\mu_i, \sigma_i^2)$, $i = 1, 2, \ldots, n$, denote independent normal variables, then
\[ Y = \sum_{i=1}^{n} a_i X_i \sim N\left( \sum_{i=1}^{n} a_i \mu_i ,\; \sum_{i=1}^{n} a_i^2 \sigma_i^2 \right) \]
Note:
STK 220 ( c University of Pretoria) Ch6 Sampling Distributions Part B September 15, 2025 2 / 107
Example (Normal 1)
Let $X_1$, $X_2$ and $X_3$ be independent normal random variables with
a For $X_1 \sim N(0, 1)$, use the RANDSEED call with a seed of 123.
b For $X_2 \sim N(10, 16)$, use the RANDSEED call with a seed of 456.
c For $X_3 \sim N(-4, 9)$, use the RANDSEED call with a seed of 789.
Note: Compare the mean, variance and P10 of the empirical distribution
with that of the theoretical distribution.
1 $Y = 5X_1 + 4X_2 - 2X_3$ has a normal distribution with
- Expected value
\[ \mu_Y = 5(0) + 4(10) - 2(-4) = 48 \]
- Variance
\[ \sigma_Y^2 = 5^2(1) + 4^2(16) + (-2)^2(9) = 317 \]
\[ \Rightarrow Y \sim N(48, 317) \]
Note: $\sigma = \sqrt{317} = 17.804$
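Matrix notation:
- Expected value
\[ \mu_Y = a'\mu = \begin{pmatrix} 5 & 4 & -2 \end{pmatrix} \begin{pmatrix} 0 \\ 10 \\ -4 \end{pmatrix} = 48 \]
- Variance
\[ \sigma_Y^2 = a'\Sigma a = \begin{pmatrix} 5 & 4 & -2 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 16 & 0 \\ 0 & 0 & 9 \end{pmatrix} \begin{pmatrix} 5 \\ 4 \\ -2 \end{pmatrix} = 317 \]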
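The matrix calculation can be checked in PROC IML. The sketch below is an illustration only: the names `a`, `mu`, `Sigma`, `muY`, `varY` and `P10` are chosen here and do not come from the course programs.

```sas
proc iml;
/* coefficients, means and variances taken from Example (Normal 1) */
a     = {5, 4, -2};
mu    = {0, 10, -4};
Sigma = diag({1, 16, 9});     /* independent variables: zero covariances */
muY   = t(a) * mu;            /* a'mu      */
varY  = t(a) * Sigma * a;     /* a'Sigma a */
P10   = quantile('normal', 0.1, muY, sqrt(varY));
print muY varY P10[f=8.4];
quit;
```

The printed values should agree with $\mu_Y = 48$ and $\sigma_Y^2 = 317$ above, and with the 10th percentile 25.1826 that the slides report for the QUANTILE call.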
2
\[ 0.1 = P(Y \le P_{10}) = P\left( \frac{Y - 48}{\sqrt{317}} < \frac{P_{10} - 48}{\sqrt{317}} \right) \]
From Table I
\[ \Rightarrow \frac{P_{10} - 48}{\sqrt{317}} = z_{0.1} = -1.28 \]
\[ \Rightarrow P_{10} = 48 - 1.28\sqrt{317} = 25.21 \]
quantile('normal',0.1,48,sqrt(317)) 25.1826
SAS Program:
proc iml;
r=1000;                                             /* number of simulated values */
/* the second argument 1 makes RANDSEED reinitialise the stream on every call */
call randseed(123,1); x1=randfun(r,'normal',0,1);   /* X1 ~ N(0,1) */
call randseed(456,1); x2=randfun(r,'normal',10,4);  /* X2 ~ N(10,16), sd = 4 */
call randseed(789,1); x3=randfun(r,'normal',-4,3);  /* X3 ~ N(-4,9), sd = 3 */
matrix=5*x1+4*x2-2*x3;                              /* Y = 5X1 + 4X2 - 2X3 */
create sim from matrix[colname={y}];
append from matrix;
quit;
SAS Output:
Linear combination of normal variables
The UNIVARIATE Procedure
Variable: Y
Moments
N 1000 Sum Weights 1000
Mean 47.6683059 Sum Observations 47668.3059
Std Deviation 18.483242 Variance 341.630234
Skewness 0.06428775 Kurtosis -0.0727985
Uncorrected SS 2613555.99 Corrected SS 341288.604
Coeff Variation 38.7746987 Std Error Mean 0.58449143
Goodness-of-Fit Tests for Normal Distribution
Test ----Statistic----- ------p Value------
Kolmogorov-Smirnov D 0.03174255 Pr > D >0.250
Cramer-von Mises W-Sq 0.15374778 Pr > W-Sq >0.250
Anderson-Darling A-Sq 1.02864721 Pr > A-Sq >0.250
Remember: STK 210
Fact (MGF)
Let $M_X(t)$ be the MGF of $X$ and let
\[ Y = aX + b \]
then
\[ M_Y(t) = e^{bt} M_X(at) \]
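This fact follows directly from the definition of the MGF. The short derivation below is added as a reminder and is not part of the original slide.

```latex
\begin{align*}
M_Y(t) &= E\left[e^{tY}\right] = E\left[e^{t(aX+b)}\right] \\
       &= e^{bt}\,E\left[e^{(at)X}\right] \\
       &= e^{bt}\,M_X(at)
\end{align*}
```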
Theorem (8.4)
If $X_i \sim N(\mu_i, \sigma_i^2)$, $i = 1, 2, \ldots, n$, denote independent normal variables, then
\[ Y = \sum_{i=1}^{n} a_i X_i \sim N\left( \sum_{i=1}^{n} a_i \mu_i ,\; \sum_{i=1}^{n} a_i^2 \sigma_i^2 \right) \]
Notation: $e^x = \exp[x]$
Proof:
\begin{align*}
M_Y(t) &= \prod_{i=1}^{n} M_{a_i X_i}(t) && \text{Theorem 7.3} \\
       &= \prod_{i=1}^{n} M_{X_i}(a_i t) && \text{Theorem 4.10} \\
       &= \prod_{i=1}^{n} \exp\left[ \mu_i (a_i t) + \tfrac{1}{2} \sigma_i^2 (a_i t)^2 \right] \\
       &= \exp\left[ \sum_{i=1}^{n} \left( \mu_i (a_i t) + \tfrac{1}{2} \sigma_i^2 (a_i t)^2 \right) \right] \\
       &= \exp\left[ \left( \sum_{i=1}^{n} a_i \mu_i \right) t + \tfrac{1}{2} \left( \sum_{i=1}^{n} a_i^2 \sigma_i^2 \right) t^2 \right]
\end{align*}
This is the MGF of a normal random variable with expected value $\sum_{i=1}^{n} a_i \mu_i$ and variance $\sum_{i=1}^{n} a_i^2 \sigma_i^2$.
\[ \Rightarrow Y = \sum_{i=1}^{n} a_i X_i \sim N\left( \sum_{i=1}^{n} a_i \mu_i ,\; \sum_{i=1}^{n} a_i^2 \sigma_i^2 \right) \]
Corollary (Sample mean)
Suppose that $X_1, X_2, \ldots, X_n$ is a random sample from a normal population with expected value $\mu$ and variance $\sigma^2$, then
\[ \bar{X} \sim N\left( \mu, \frac{\sigma^2}{n} \right) \]
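The corollary is the special case of Theorem 8.4 with equal weights. The working below is a sketch added for clarity; it sets $a_i = 1/n$, $\mu_i = \mu$ and $\sigma_i^2 = \sigma^2$ for every $i$.

```latex
\begin{align*}
\bar{X} &= \sum_{i=1}^{n} \frac{1}{n} X_i, \qquad a_i = \frac{1}{n} \\
E(\bar{X}) &= \sum_{i=1}^{n} \frac{1}{n}\,\mu = \mu \\
Var(\bar{X}) &= \sum_{i=1}^{n} \frac{1}{n^2}\,\sigma^2 = \frac{\sigma^2}{n}
\end{align*}
```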
Example (Normal 2)
Suppose the marks of students are normally distributed with mean $\mu = 60$ and variance $\sigma^2 = 100$. Let $\bar{X}$ be the sample mean of a random sample of size $n = 4$. Give the distribution of $\bar{X}$.
Since the marks are drawn from a normal population it follows from Corollary 8.4(a) that the sample mean $\bar{X}$ is also normally distributed, i.e.
\[ \bar{X} \sim N\left( \mu_{\bar{X}}, \sigma^2_{\bar{X}} \right) \]
where
\[ \mu_{\bar{X}} = \mu = 60 \qquad \sigma^2_{\bar{X}} = \frac{\sigma^2}{n} = \frac{100}{4} = 25 \]
\[ \Rightarrow \bar{X} \sim N(60, 25) \]
Corollary (Difference between two sample means)
Let $X_1, X_2, \ldots, X_{n_1}$ and $Y_1, Y_2, \ldots, Y_{n_2}$ be two independent random samples from two different normal populations, $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$ respectively,
then
\[ \bar{X} - \bar{Y} \sim N\left( \mu_1 - \mu_2 ,\; \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \right) \]
Proof: From Corollary 8.4(a) it follows that
\[ \bar{X} \sim N\left( \mu_1, \frac{\sigma_1^2}{n_1} \right) \quad \text{and} \quad \bar{Y} \sim N\left( \mu_2, \frac{\sigma_2^2}{n_2} \right) \]
Further, $\bar{X}$ and $\bar{Y}$ are independent random variables, since the two random samples are independent. Hence
\[ \bar{X} - \bar{Y} = (1)\bar{X} + (-1)\bar{Y} \]
\[ \mu_{\bar{X} - \bar{Y}} = (1)\mu_1 + (-1)\mu_2 = \mu_1 - \mu_2 \]
and
\[ \sigma^2_{\bar{X} - \bar{Y}} = (1)^2 \frac{\sigma_1^2}{n_1} + (-1)^2 \frac{\sigma_2^2}{n_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \]
Example (Normal 3)
Let $\bar{X}$ be the sample mean of a random sample of size $n_1 = 5$ from a $N(20, 25)$ distribution and let $\bar{Y}$ be the sample mean of an independent random sample of size $n_2 = 4$ from a $N(10, 16)$ distribution. Give the sampling distribution of $\bar{X} - \bar{Y}$.
\[ \bar{X} - \bar{Y} \sim N\left( \mu_{\bar{X} - \bar{Y}}, \sigma^2_{\bar{X} - \bar{Y}} \right) \]
where
\[ \mu_{\bar{X} - \bar{Y}} = 20 - 10 = 10 \]
\[ \sigma^2_{\bar{X} - \bar{Y}} = \frac{25}{5} + \frac{16}{4} = 5 + 4 = 9 \]
\[ \Rightarrow \bar{X} - \bar{Y} \sim N(10, 9) \]
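Corollary (Sample proportion)
Let $X \sim BIN(n, \theta)$. If the rule of thumb holds, i.e.
\[ n\theta \ge 5 \quad \text{and} \quad n\theta(1 - \theta) \ge 5, \]
then
\[ \hat{\Theta} = \frac{\text{number of successes}}{n} = \frac{1}{n} X \;\overset{\text{approx.}}{\sim}\; N\left( \theta, \frac{\theta(1 - \theta)}{n} \right) \]
where
\[ \sigma^2_{\hat{\Theta}} = \left( \frac{1}{n} \right)^2 n\theta(1 - \theta) = \frac{\theta(1 - \theta)}{n} \]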
The Chi-Square Distribution
Definition (6.2)
A random variable $X$ has a gamma distribution with $\alpha > 0$ and $\beta > 0$ if
\[ g(x; \alpha, \beta) = \begin{cases} \dfrac{1}{\beta^{\alpha} \Gamma(\alpha)}\, x^{\alpha - 1} e^{-x/\beta} & x > 0 \\ 0 & \text{elsewhere} \end{cases} \]
Notation:
\[ X \sim GAM(\alpha, \beta) \]
where
$\alpha$ = shape parameter and $\beta$ = scale parameter
Note: There are three basic shapes depending on whether $\alpha < 1$, $\alpha = 1$ or $\alpha > 1$.
[Figure: Gamma distributions with $\alpha = 0.5, 1, 2$; axes $x$ and $f(x)$.]
Definition (Gamma function)
The gamma function denoted by $\Gamma(\kappa)$, for $\kappa > 0$, is given by
\[ \Gamma(\kappa) = \int_0^{\infty} t^{\kappa - 1} e^{-t}\, dt \]
Note:
\[ \Gamma(1) = \int_0^{\infty} e^{-t}\, dt = 1 \]
\[ \Gamma(\kappa) = (\kappa - 1)\, \Gamma(\kappa - 1) \]
\[ \Gamma(n) = (n - 1)! \qquad n = 1, 2, 3, \ldots \]
\[ \Gamma\left( \tfrac{1}{2} \right) = \sqrt{\pi} \]
SAS Program:
\[ \Gamma(\kappa) = \int_0^{\infty} t^{\kappa - 1} e^{-t}\, dt \]
proc iml;
a=gamma(1);
b=gamma(5);
fact_4=fact(4);          /* Gamma(5) = 4! */
c=(gamma(0.5))##2;       /* Gamma(1/2) squared equals pi */
pi=22/7;                 /* rational approximation, for comparison */
print a b fact_4 c[f=10.8 l='correct pi'] pi[f=10.8];
quit;
SAS Output:
a b fact_4 correct pi pi
1 24 24 3.14159265 3.14285714
Fact (Special Gamma distributions)
If $X \sim GAM(\alpha, 2)$ then $X \sim \chi^2(2\alpha)$.
If $X \sim GAM(1, \beta)$ then $X \sim EXP(\beta)$.
Theorem (6.3)
The mean and variance of the gamma distribution $X \sim GAM(\alpha, \beta)$ are given by
\[ \mu = \alpha\beta \quad \text{and} \quad \sigma^2 = \alpha\beta^2 \]
Theorem (6.4)
The MGF of the gamma distribution $X \sim GAM(\alpha, \beta)$ is given by
\[ M_X(t) = (1 - \beta t)^{-\alpha} \]
Example (Chi-Square 1)
Use the properties of $X \sim GAM(\alpha, \beta)$ to give the expected value, variance and MGF of
1 $X \sim EXP(\theta)$
2 $X \sim \chi^2(\nu)$
1 $X \sim EXP(\theta) \equiv GAM(1, \theta)$
a Moments
\[ \mu = (1)\theta = \theta \quad \text{and} \quad \sigma^2 = (1)\theta^2 = \theta^2 \]
b MGF
\[ M_X(t) = (1 - \theta t)^{-1} = \frac{1}{1 - \theta t} \]
2 $X \sim \chi^2(\nu) \equiv GAM\left( \frac{\nu}{2}, 2 \right)$
a Moments
\[ \mu = \frac{\nu}{2} \cdot 2 = \nu \quad \text{and} \quad \sigma^2 = \frac{\nu}{2} \cdot 2^2 = 2\nu \]
b MGF
\[ M_X(t) = (1 - 2t)^{-\nu/2} \]
Example
Let $X_i \sim GAM(\alpha_i, \beta)$, $i = 1, 2, \ldots, n$, be independent random variables. Use the MGF technique to obtain the distribution of
\[ Y = \sum_{i=1}^{n} X_i . \]
\[ M_Y(t) = \prod_{i=1}^{n} M_{X_i}(t) = \prod_{i=1}^{n} (1 - \beta t)^{-\alpha_i} = (1 - \beta t)^{-\sum_{i=1}^{n} \alpha_i} \]
\[ \Rightarrow Y \sim GAM\left( \sum_{i=1}^{n} \alpha_i ,\; \beta \right) . \]
Example (Chi-Square 2)
Let $X_i$, $i = 1, 2, \ldots, n$, be independent random variables. Give the distribution of
\[ Y = \sum_{i=1}^{n} X_i \]
when
1 $X_i \sim EXP(\theta)$
2 $X_i \sim \chi^2(\nu_i)$
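Both parts can be answered by writing each distribution as a gamma distribution with a common scale parameter and using the result that independent $GAM(\alpha_i, \beta)$ variables sum to $GAM(\sum \alpha_i, \beta)$. The working below is a sketch of that argument, not the official memorandum.

```latex
\begin{align*}
\text{1.}\quad & X_i \sim EXP(\theta) \equiv GAM(1, \theta)
  \;\Rightarrow\; Y \sim GAM\left( \sum_{i=1}^{n} 1,\ \theta \right) = GAM(n, \theta) \\
\text{2.}\quad & X_i \sim \chi^2(\nu_i) \equiv GAM\left( \tfrac{\nu_i}{2}, 2 \right)
  \;\Rightarrow\; Y \sim GAM\left( \tfrac{1}{2}\sum_{i=1}^{n} \nu_i,\ 2 \right)
  \equiv \chi^2\left( \sum_{i=1}^{n} \nu_i \right)
\end{align*}
```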
Fact
If
\[ X \sim \chi^2(\nu) \]
then the parameter $\nu$ is referred to as the degrees of freedom and
\[ X \sim GAM\left( \frac{\nu}{2}, 2 \right) \]
Definition (6.4)
A random variable $X$ has a chi-square distribution with $\nu$ degrees of freedom if and only if the probability density is given by
\[ f(x; \nu) = \begin{cases} \dfrac{1}{2^{\nu/2}\, \Gamma(\nu/2)}\, x^{\nu/2 - 1} e^{-x/2} & x > 0 \\ 0 & \text{elsewhere} \end{cases} \]
[Figure: Gamma distributions with $\alpha = 0.5, 1, 2, 3$ and $\beta = 2$, i.e. $\chi^2$-distributions; axes $x$ and $f(x)$.]
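Table III: If $P\left( \chi^2 \le \chi^2_{0.95}(3) \right) = 0.95$ then $\chi^2_{0.95}(3) = 7.815$.
Percentiles of the $\chi^2$ distribution: values of $\chi^2_{\gamma}(\nu)$, where $\gamma = \int_0^{\chi^2_{\gamma}(\nu)} f(x; \nu)\, dx$.
Columns: $\gamma$; rows: $\nu$.
ν 0.005 0.01 0.025 0.05 0.95 0.975 0.99 0.995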
2 0.010 0.020 0.051 0.103 5.991 7.378 9.210 10.597
3 0.072 0.115 0.216 0.352 7.815 9.348 11.345 12.838
4 0.207 0.297 0.484 0.711 9.488 11.143 13.277 14.860
5 0.412 0.554 0.831 1.145 11.070 12.833 15.086 16.750
9 1.735 2.088 2.700 3.325 16.919 19.023 21.666 23.589
10 2.156 2.558 3.247 3.940 18.307 20.483 23.209 25.188
11 2.603 3.053 3.816 4.575 19.675 21.920 24.725 26.757
27 11.808 12.879 14.573 16.151 40.113 43.195 46.963 49.645
28 12.461 13.565 15.308 16.928 41.337 44.461 48.278 50.993
29 13.121 14.256 16.047 17.708 42.557 45.722 49.588 52.336
30 13.787 14.953 16.791 18.493 43.773 46.979 50.892 53.672
[Figure: density of $\chi^2(3)$ with area 0.95 to the left of $\chi^2_{0.95}(3) = 7.815$; axes $x$ and $f(x)$.]
SAS Program
proc iml;
/* chi-square(3) is the same distribution as GAM(1.5, 2) */
QNT_GAM=quantile('gamma',0.95,1.5,2);
QNT_CHI=quantile('chisq',0.95,3);
print QNT_GAM[f=8.4] QNT_CHI[f=8.4];
CDF_GAM=cdf('gamma',7.815,1.5,2);
CDF_CHI=cdf('chisq',7.815,3);
print CDF_GAM[f=8.2] CDF_CHI[f=8.2];
/* percentiles and cumulative probabilities of chi-square(3) */
p1={0.005,0.01,0.025,0.05,0.1,0.2,0.25,0.3,0.5};
p2=1-p1; p=unique(p1//p2)`;
x=(1:10)`;
QNT_chi_3=quantile('chisq',p,3);
CDF_chi_3=cdf('chisq',x,3);
print QNT_chi_3[r=p f=8.4] ' ' CDF_chi_3[r=x f=8.4];
quit;
SAS Output
QNT_GAM QNT_CHI
7.8147 7.8147
CDF_GAM CDF_CHI
0.95 0.95
SAS Output (continued)
QNT_chi_3 CDF_chi_3
0.005 0.0717 1 0.1987
0.01 0.1148 2 0.4276
0.025 0.2158 3 0.6084
0.05 0.3518 4 0.7385
0.1 0.5844 5 0.8282
0.2 1.0052 6 0.8884
0.25 1.2125 7 0.9281
0.3 1.4237 8 0.9540
0.5 2.3660 9 0.9707
0.7 3.6649 10 0.9814
0.75 4.1083
0.8 4.6416
0.9 6.2514
0.95 7.8147
0.975 9.3484
0.99 11.3449
0.995 12.8382
Fact
When the degrees of freedom $\nu$ is greater than 30, probabilities related to the $\chi^2$-distribution can be approximated with the normal distribution, i.e.
\[ \chi^2(\nu) \approx N(\nu, 2\nu) \]
[Graph: $\chi^2(\nu)$ densities with $\nu = 5, 10, 15$ df and the $N(15, 30)$ density; axes $x$ and $f(x)$.]
Example (Chi-Square 3)
\[ X \sim \chi^2(30) \]
\[ \Rightarrow \frac{P_{95} - 30}{\sqrt{60}} = z_{0.95} = 1.645 \]
\[ \Rightarrow P_{95} = 30 + 1.645\sqrt{60} = 42.742 \approx \chi^2_{0.95}(30) \]
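The quality of the normal approximation can be judged by computing both percentiles in PROC IML. This is a minimal sketch; the variable names `nu`, `exact`, `approx` and `diff` are chosen here for illustration.

```sas
proc iml;
nu     = 30;
exact  = quantile('chisq', 0.95, nu);               /* exact 95th percentile of chi-square(30) */
approx = quantile('normal', 0.95, nu, sqrt(2*nu));  /* 95th percentile of N(nu, 2nu) */
diff   = exact - approx;                            /* error of the approximation */
print exact[f=8.3] approx[f=8.3] diff[f=8.3];
quit;
```

Table III lists $\chi^2_{0.95}(30) = 43.773$ and the hand calculation gives 42.742, so the approximation is about one unit too small at $\nu = 30$.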
[Figure: $P_{95}$ for the $\chi^2(30)$ density ($P_{95} = 43.773$) and the $N(30, 60)$ density ($P_{95} = 42.742$); axes $x$ and $f(x)$.]
Note: In general $\chi^2_{\gamma}(\nu) \approx \nu + z_{\gamma}\sqrt{2\nu}$, for $\nu$ large.
Example (Chi-Square 4)
Suppose $X \sim \chi^2(12)$. The moment generating function of $X$ is
\[ M_X(t) = (1 - 2t)^{-6} \]
[Figure: density of $\chi^2(12)$; axes $x$ and $f(x)$.]
Calculate the expected value and variance of $X$ by making use of $M_X(t)$.
Calculate $\mu_1' = E(X)$
\[ M_X'(t) = -6(1 - 2t)^{-7}(-2) = 12(1 - 2t)^{-7} \]
\[ \mu_1' = E(X) = M_X'(0) = 12(1 - 2(0))^{-7} = 12 \]
Calculate $\mu_2' = E(X^2)$
\[ M_X''(t) = -84(1 - 2t)^{-8}(-2) = 168(1 - 2t)^{-8} \]
\[ \mu_2' = E(X^2) = M_X''(0) = 168 \]
Calculate $Var(X)$
\[ Var(X) = E(X^2) - [E(X)]^2 = 168 - 12^2 = 24 \]
Theorem (8.7)
If $X \sim N(0, 1)$ then $X^2 \sim \chi^2(1)$.
Proof: For $t < \tfrac{1}{2}$,
\begin{align*}
M_{X^2}(t) &= E\left[ e^{tX^2} \right] \\
 &= \int_{-\infty}^{\infty} e^{tx^2} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\, dx \\
 &= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left[ -\tfrac{1}{2} x^2 + t x^2 \right] dx \\
 &= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left[ -\tfrac{1}{2} x^2 (1 - 2t) \right] dx \\
 &= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left[ -\frac{1}{2} \frac{(x - 0)^2}{(1 - 2t)^{-1}} \right] dx \\
 &= \sqrt{(1 - 2t)^{-1}} \underbrace{\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\sqrt{(1 - 2t)^{-1}}} \exp\left[ -\frac{1}{2} \frac{(x - 0)^2}{(1 - 2t)^{-1}} \right] dx}_{=\,1,\ \text{density of } N\left(0,\,(1 - 2t)^{-1}\right)} \\
 &= \frac{1}{\sqrt{1 - 2t}} = (1 - 2t)^{-1/2}
\end{align*}
i.e. the MGF of $\chi^2(1)$.
SAS Program:
proc iml;
call randseed(123);
z=randfun(1000,'normal',0,1);   /* 1000 standard normal values */
z_sq=z##2;                      /* squared values, chi-square(1) by Theorem 8.7 */
create sim var{z z_sq}; append;
quit;
proc univariate data=sim;
var z;
histogram / endpoints=-4 to 4 by 0.5 cfill=yellow
normal(mu=0 sigma=1);
title 'Standard normal variable';
run;
proc univariate data=sim;
var z_sq;
/* chi-square(1) is fitted as GAM(0.5, 2) */
histogram / endpoints=0 to 9 by 1 cfill=yellow
gamma(alpha=0.5 sigma=2);
title 'Chi-squared variable with 1 df';
run;
SAS Output:
The UNIVARIATE Procedure
Variable: Z_SQ
Moments
N 1000 Sum Weights 1000
Mean 1.01125058 Sum Observations 1011.25058
Std Deviation 1.43937936 Variance 2.07181294
Skewness 2.80591679 Kurtosis 10.3206346
Uncorrected SS 3092.36886 Corrected SS 2069.74113
Coeff Variation 142.336567 Std Error Mean 0.04551717
Fitted Gamma Distribution for Z_SQ
Parameters for Gamma Distribution
Parameter Symbol Estimate
Threshold Theta 0
Scale Sigma 2
Shape Alpha 0.5
Fitted Gamma Distribution for Z_SQ
Goodness-of-Fit Tests for Gamma Distribution
Test ----Statistic----- ------p Value------
Kolmogorov-Smirnov D 0.01772873 Pr > D >0.250
Cramer-von Mises W-Sq 0.03098187 Pr > W-Sq >0.250
Anderson-Darling A-Sq 0.22125090 Pr > A-Sq >0.250
[Figure: Empirical distribution of $Z \sim N(0, 1)$.]
[Figure: Empirical distribution of $Z^2 \sim \chi^2(1)$.]
Theorem (8.9)
If $X_i \sim \chi^2(\nu_i)$, $i = 1, 2, \ldots, n$, are independent chi-square variables, then
\[ Y = \sum_{i=1}^{n} X_i \sim \chi^2\left( \sum_{i=1}^{n} \nu_i \right) \]
Proof:
\begin{align*}
M_Y(t) &= \prod_{i=1}^{n} M_{X_i}(t) && \text{independent variables} \\
       &= \prod_{i=1}^{n} (1 - 2t)^{-\nu_i/2} = (1 - 2t)^{-\frac{1}{2}\sum_{i=1}^{n} \nu_i}
\end{align*}
Example (Chi-Square 5)
Suppose $X_1 \sim \chi^2(5)$, $X_2 \sim GAM(4, 2)$ and $X_3 \sim EXP(2)$ are independent random variables. Let: $Y = X_1 + X_2 + X_3$
1. Give the distribution of $Y$.
\[ X_1 \sim \chi^2(5) \]
\[ X_2 \sim GAM(4, 2) \equiv \chi^2(8) \]
\[ X_3 \sim EXP(2) \equiv GAM(1, 2) \equiv \chi^2(2) \]
\[ \nu = 5 + 8 + 2 = 15 \quad \Rightarrow \quad Y = \sum_{i=1}^{3} X_i \sim \chi^2(15) \]
[Figure: densities of $X_1 \sim \chi^2(5)$, $X_2 \sim \chi^2(8)$, $X_3 \sim \chi^2(2)$ and $Y \sim \chi^2(15)$; axes $x$ and $f(x)$.]
quantile('chisq',0.05,15) 7.2609
quantile('chisq',0.95,15) 24.9958
SAS Program:
proc iml;
call randseed(123,1);
x1=randfun(1000,'chisquare',5);     /* X1 ~ chi-square(5) */
x2=randfun(1000,'gamma',4,2);       /* X2 ~ GAM(4,2) */
x3=randfun(1000,'exponential',2);   /* X3 ~ EXP(2) */
y=x1+x2+x3;
create sim var{y}; append;
quit;
SAS Output:
The UNIVARIATE Procedure
Variable: Y
Moments
N 1000 Sum Weights 1000
Mean 15.2761254 Sum Observations 15276.1254
Std Deviation 5.38771349 Variance 29.0274567
Skewness 0.71155841 Kurtosis 0.65223801
Uncorrected SS 262358.437 Corrected SS 28998.4292
Coeff Variation 35.2688482 Std Error Mean 0.17037446
Fitted Gamma Distribution for Y
Parameters for Gamma Distribution
Parameter Symbol Estimate
Threshold Theta 0
Scale Sigma 2
Shape Alpha 7.5
Mean 15
Std Dev 5.477226
Goodness-of-Fit Tests for Gamma Distribution
Test ----Statistic----- ------p Value------
Kolmogorov-Smirnov D 0.03480413 Pr > D 0.178
Cramer-von Mises W-Sq 0.33855866 Pr > W-Sq 0.107
Anderson-Darling A-Sq 1.92598711 Pr > A-Sq 0.101
Quantiles for Gamma Distribution
-------Quantile------
Percent Observed Estimated
1.0 5.78072 5.22935
5.0 7.62567 7.26094
10.0 8.86100 8.54676
25.0 11.34526 11.03654
50.0 14.63383 14.33886
75.0 18.50925 18.24509
90.0 22.54529 22.30713
95.0 25.02918 24.99579
99.0 31.21160 30.57791
[Figure: Empirical distribution of $Y = X_1 + X_2 + X_3 \sim \chi^2(15)$.]
Theorem (8.11)
If $\bar{X}$ and $S^2$ are the mean and variance of a random sample of size $n$ from a normal population with mean $\mu$ and standard deviation $\sigma$, then
1 $\bar{X}$ and $S^2$ are independent;
2 $\dfrac{(n - 1) S^2}{\sigma^2} \sim \chi^2(n - 1)$.
From SSD it follows that
\[ \sum_{i=1}^{n} (X_i - \mu)^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 + n(\bar{X} - \mu)^2 \]
Divide by $\sigma^2$
\[ \underbrace{\sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^2}_{V_1} = \underbrace{\frac{(n - 1) S^2}{\sigma^2}}_{V_2} + \underbrace{\left( \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \right)^2}_{V_3} \]
\[ V_1 = V_2 + V_3 \]
since $\sum_{i=1}^{n} (X_i - \bar{X})^2 = (n - 1) S^2$.
Proof:
- For $V_1 = \sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^2$:
  * $\frac{X_i - \mu}{\sigma}$, $i = 1, 2, \ldots, n$, are indep. $N(0, 1)$ variables
  * $\left( \frac{X_i - \mu}{\sigma} \right)^2$, $i = 1, 2, \ldots, n$, are indep. $\chi^2(1)$, Theorem 8.7
  * $V_1 = \sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^2 \sim \chi^2(n)$, Theorem 8.9
- For $V_3 = \left( \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \right)^2$:
  * $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$, Corollary 8.4a
  * $\left( \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \right)^2 \sim \chi^2(1)$, Theorem 8.7
- Since $\bar{X}$ and $S^2$ are indep., it follows that $V_2$ and $V_3$ are indep.
- $V_1 = V_2 + V_3$ is the sum of two independent random variables
\begin{align*}
M_{V_1}(t) &= M_{V_2}(t)\, M_{V_3}(t) && \text{Theorem 7.3} \\
(1 - 2t)^{-n/2} &= M_{V_2}(t)\, (1 - 2t)^{-1/2} \\
\Rightarrow M_{V_2}(t) &= (1 - 2t)^{-(n - 1)/2}
\end{align*}
$\Rightarrow M_{V_2}$ is the MGF of a $\chi^2(n - 1)$, so that
\[ V_2 = \frac{(n - 1) S^2}{\sigma^2} \sim \chi^2(n - 1) . \]
Lemma
Suppose $X_1, X_2, \ldots, X_n$ is a random sample from $N(\mu, \sigma^2)$, then
\[ Var(S^2) = \frac{2\sigma^4}{n - 1} \]
Proof: Since
\[ \frac{(n - 1) S^2}{\sigma^2} \sim \chi^2(n - 1) \]
we know that
\[ Var\left( \frac{(n - 1) S^2}{\sigma^2} \right) = 2(n - 1) \;\Longrightarrow\; \frac{(n - 1)^2}{\sigma^4} Var(S^2) = 2(n - 1) \]
\[ \Rightarrow Var(S^2) = \frac{2\sigma^4}{n - 1} \]
Fact (Sample Variance)
Suppose that $X_i \sim N(\mu, \sigma^2)$, $i = 1, 2, \ldots, n$, is a random sample, then the sample variance $S^2$ is a consistent estimator of $\sigma^2$.
Proof:
\[ Var\left( \frac{(n - 1) S^2}{\sigma^2} \right) = 2(n - 1) \;\Longrightarrow\; \left( \frac{n - 1}{\sigma^2} \right)^2 Var(S^2) = 2(n - 1) \]
\[ \Rightarrow Var(S^2) = 2(n - 1) \frac{\sigma^4}{(n - 1)^2} = \frac{2\sigma^4}{n - 1} \to 0 \quad \text{as } n \to \infty \]
Example (Chi-Square 6)
A random sample of size $n = 10$ is taken from a normally distributed population of marks $X$, where $X \sim N(60, 10^2)$. The following statistics are considered:
\[ S^2 = \frac{\sum (X_i - \bar{X})^2}{9} \quad \text{and} \quad T = \frac{9 S^2}{100} \]
Simulation: Generate $r = 1000$ samples with a seed of 123 from the population described above and compute the values of $S^2$ and $T$ for each sample.
1 Give the theoretical and empirical values for the following:
a $E(T)$ and $Var(T)$
b $P(T > 19.023)$
c The value $a$ such that $P(T < a) = 0.05$.
2 Give the theoretical and empirical values for the following:
a $E(S^2)$ and $Var(S^2)$
b $P(S^2 < 30)$
c The value $b$ such that $P(S^2 < b) = 0.95$.
SAS Program:
proc iml;
call randseed(123);
mu=60; sigma=10; n=10;
x=randfun({1000,10},'normal',mu,sigma);   /* 1000 samples of size 10, one per row */
xbar=(mean(x`))`;                         /* sample mean of each row */
S2=(var(x`))`;                            /* sample variance of each row */
T=(n-1)*S2/(sigma##2);
create D6 var{xbar S2 T}; append; close D6;
quit;
SAS Output:
The UNIVARIATE Procedure
Variable: T
Moments
N 1000 Sum Weights 1000
Mean 9.08734131 Sum Observations 9087.34131
Std Deviation 4.25733689 Variance 18.1249174
Skewness 0.87736093 Kurtosis 0.7979859
Uncorrected SS 100686.565 Corrected SS 18106.7925
Coeff Variation 46.8490919 Std Error Mean 0.13462881
[Figure: Empirical distribution of $T = \dfrac{9 S^2}{100}$.]
Answers:
1 $T = \dfrac{9 S^2}{100} = \dfrac{(n - 1) S^2}{\sigma^2} \sim \chi^2(9)$
a
\[ E(T) = 9 \quad \text{and} \quad Var(T) = 18 \]
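Parts 1b and 1c are not worked out on the slides. The sketch below shows one way to obtain the theoretical and empirical values in PROC IML; it repeats the simulation step of the program above, and the names `p_theo`, `p_emp`, `a_theo` and `a_emp` are chosen here for illustration.

```sas
proc iml;
call randseed(123);
mu=60; sigma=10; n=10;
x  = randfun({1000,10}, 'normal', mu, sigma);  /* 1000 samples of size 10, one per row */
S2 = t(var(t(x)));                             /* sample variance of each row */
T  = (n-1)*S2/(sigma##2);                      /* T ~ chi-square(9) */

/* 1b: P(T > 19.023) */
p_theo = 1 - cdf('chisq', 19.023, n-1);
p_emp  = mean(T > 19.023);                     /* proportion of simulated T above 19.023 */

/* 1c: a such that P(T < a) = 0.05 */
a_theo = quantile('chisq', 0.05, n-1);
call qntl(a_emp, T, 0.05);                     /* empirical 5th percentile of T */

print p_theo[f=8.4] p_emp[f=8.4] a_theo[f=8.4] a_emp[f=8.4];
quit;
```

According to Table III, $\chi^2_{0.975}(9) = 19.023$ and $\chi^2_{0.05}(9) = 3.325$, so the theoretical answers should be close to 0.025 and 3.325.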
2 $\dfrac{(n - 1) S^2}{\sigma^2} \sim \chi^2(9)$
a
\[ E\left( \frac{9 S^2}{100} \right) = 9 \]
\[ \frac{9}{100} E(S^2) = 9 \;\Longrightarrow\; E(S^2) = 100 = \sigma^2 \]
so $S^2$ is an unbiased estimator of $\sigma^2$.
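The variance asked for in 2a is not shown on the slide. Applying the Lemma $Var(S^2) = 2\sigma^4/(n - 1)$ with $\sigma^2 = 100$ and $n = 10$ gives the following; this working is added here as a sketch.

```latex
\begin{align*}
Var(S^2) &= \frac{2\sigma^4}{n - 1} = \frac{2(100)^2}{9} = \frac{20000}{9} \approx 2222.22
\end{align*}
```

Equivalently, $Var(S^2) = \left( \frac{100}{9} \right)^2 Var(T) = \left( \frac{100}{9} \right)^2 (18)$.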
b
\begin{align*}
P(S^2 < 30) &= P\left( \frac{9 S^2}{100} < \frac{9 \times 30}{100} \right) \\
 &= P\left( \chi^2(9) < \frac{27}{10} \right) \\
 &= P\left( \chi^2(9) < 2.7 \right) \\
 &= 0.025
\end{align*}
c
\[ 0.95 = P(S^2 < b) = P\Bigg( \frac{9 S^2}{100} < \underbrace{\frac{9 b}{100}}_{\chi^2_{0.95}(9)} \Bigg) \]
\[ \Rightarrow \frac{9 b}{100} = 16.919 \;\Longrightarrow\; b = 16.919 \times \frac{100}{9} = 187.99 \]
The t-Distribution
For a random sample from a normal population with mean $\mu$ and variance $\sigma^2$, the random variable $\bar{X}$ has a normal distribution with mean $\mu$ and variance $\dfrac{\sigma^2}{n}$, i.e.
\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1) \]
\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
Theorem (8.12)
If $Y \sim \chi^2(\nu)$ and $Z \sim N(0, 1)$ are independent random variables, then
\[ T = \frac{Z}{\sqrt{Y/\nu}} \sim t(\nu) \]
with pdf
\[ f(t) = \frac{\Gamma\left( \frac{\nu + 1}{2} \right)}{\sqrt{\pi\nu}\; \Gamma\left( \frac{\nu}{2} \right)} \left( 1 + \frac{t^2}{\nu} \right)^{-\frac{\nu + 1}{2}} \qquad -\infty < t < \infty \]
and is called the t distribution with $\nu$ degrees of freedom.
Table II: If $P[T \le t_{0.95}(10)] = 0.95$ then $t_{0.95}(10) = 1.812$.
Percentiles of the t distribution: values of $t_{\gamma}(\nu)$, where $\gamma = \int_{-\infty}^{t_{\gamma}(\nu)} f(t; \nu)\, dt$.
Columns: $\gamma$; rows: $\nu$.
ν 0.9 0.95 0.975 0.99 0.995
1 3.078 6.314 12.706 31.821 63.657
2 1.886 2.920 4.303 6.965 9.925
5 1.476 2.015 2.571 3.365 4.032
6 1.440 1.943 2.447 3.143 3.707
7 1.415 1.895 2.365 2.998 3.499
8 1.397 1.860 2.306 2.896 3.355
9 1.383 1.833 2.262 2.821 3.250
10 1.372 1.812 2.228 2.764 3.169
29 1.311 1.699 2.045 2.462 2.756
∞ 1.282 1.645 1.960 2.326 2.576
$t_{0.05}(10) = -t_{0.95}(10) = -1.812$ and $t_{0.95}(10) = 1.812$
[Figure: density of $t(10)$ with area 0.9 between $-1.812$ and $1.812$; axes $t$ and $f$.]
SAS Program and Output
proc iml;
t_05=quantile('t',0.05,10);
t_95=quantile('t',0.95,10);
print t_05[f=8.4] t_95[f=8.4];
prob1=cdf('t',-1.8125,10);
prob2=cdf('t',1.8125,10);
print prob1[f=8.4] prob2[f=8.4];
quit;
t_05 t_95
-1.8125 1.8125
prob1 prob2
0.0500 0.9500
[Figure: densities of $t(1)$, $t(2)$, $t(3)$, $t(9)$ and the $N(0, 1)$ distribution; axes $t$ and $f(t; df)$.]
Fact
The t distribution is symmetrical about $t = 0$.
For $\nu \ge 30$, probabilities may be approximated with the standard normal distribution.
proc iml;
p={0.05,0.95};
t30=quantile('t',p,30);
t50=quantile('t',p,50);
t100=quantile('t',p,100);
z=quantile('normal',p);
print p t30[f=8.4] t50[f=8.4] t100[f=8.4] z[f=8.4];
quit;
Theorem (8.13)
If $\bar{X}$ and $S^2$ are the mean and variance of a random sample of size $n$ from a normal population with mean $\mu$ and variance $\sigma^2$, then
\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t(n - 1) \]
Proof:
\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1) \]
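From Theorem 8.11 it follows that
\[ Y = \frac{(n - 1) S^2}{\sigma^2} \sim \chi^2(n - 1) \]
and that $Z$ and $Y$ are independent. By Theorem 8.12,
\[ T = \frac{Z}{\sqrt{Y/(n - 1)}} = \frac{\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}}{\sqrt{\dfrac{(n - 1) S^2}{\sigma^2 (n - 1)}}} = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t(n - 1) \]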
Example (t distribution)
Let $X_1, X_2, \ldots, X_9$ be a random sample from a normal distribution, where $X_i \sim N(6, 25)$.
Let $\bar{X}$ and $S^2$ represent the sample mean and sample variance, respectively.
Simulation:
Generate $r = 1000$ samples with a seed of 123 from the population described above and compute the values of $\bar{X}$ and $S^2$ for each sample.
Calculate the theoretical and empirical values for the following:
1 $P(3 < \bar{X} < 7)$
2 $P\left( 1.860 < \dfrac{3(\bar{X} - 6)}{S} \right)$
3 $P(S^2 < 6.8125)$
4 $P(S^2 < 31.9375)$
SAS Program:
proc iml;
n=9; mu=6; sigma=sqrt(25);
stderr=sigma/sqrt(9);
r=1000; dim=r//n;
SAS Output:
The UNIVARIATE Procedure
Variable: T
Moments
N 1000 Sum Weights 1000
Mean 0.02386244 Sum Observations 23.8624437
Std Deviation 1.13113618 Variance 1.27946906
Skewness 0.06800334 Kurtosis 1.28370653
Uncorrected SS 1278.75901 Corrected SS 1278.18959
Coeff Variation 4740.23615 Std Error Mean 0.03576967
Quantiles (Definition 5)
Level Quantile
100% Max 5.59736225
99% 2.58432837
95% 1.83053765
90% 1.40047200
75% Q3 0.74458650
50% Median 0.00927994
25% Q1 -0.69036993
10% -1.33760939
5% -1.74870274
1% -2.70103880
0% Min -4.59805608
Answers:
1 Since $\bar{X} \sim N\left( 6, \dfrac{25}{9} \right)$ it follows that
\begin{align*}
P(3 < \bar{X} < 7) &= P\left( \frac{3 - 6}{5/3} < \frac{\bar{X} - 6}{5/3} < \frac{7 - 6}{5/3} \right) \\
 &= P(-1.8 < Z < 0.6), \qquad Z \sim N(0, 1) \\
 &= \Phi(0.6) - \Phi(-1.8) \\
 &= 0.7257 - 0.0359 = 0.6898
\end{align*}
- Theoretical:
cdf('normal',7,mu,stderr)-cdf('normal',3,mu,stderr)
OR:
cdf('normal',(7-mu)/stderr)-cdf('normal',(3-mu)/stderr);
0.6898
2
\[ P\left( \frac{3(\bar{X} - 6)}{S} > 1.86 \right) = P\left( \frac{\bar{X} - 6}{S/\sqrt{9}} > 1.86 \right) \]
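The slide stops before the probability is evaluated. By Theorem 8.13 the statistic $\frac{\bar{X} - 6}{S/\sqrt{9}}$ has a $t(8)$ distribution, so the value can be sketched in PROC IML as below. The code assumes the inequality reads exactly as printed (an upper-tail probability), and the name `p_upper` is chosen here for illustration.

```sas
proc iml;
n = 9;
/* assumption: the event is T > 1.86 with T ~ t(n-1), as printed on the slide */
p_upper = 1 - cdf('t', 1.86, n-1);
print p_upper[f=8.4];
quit;
```

Table II lists $t_{0.95}(8) = 1.860$, so the result should be close to 0.05.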
3
\begin{align*}
P(S^2 < 6.8125) &= P\left( \frac{8 S^2}{25} < \frac{8 \times 6.8125}{25} \right) \\
 &= P\left( \chi^2(8) < 2.18 \right), \qquad \frac{(n - 1) S^2}{\sigma^2} \sim \chi^2(n - 1) \\
 &= 0.025
\end{align*}
- Theoretical: cdf('chisq',(n-1)/sigma##2*6.8125,n-1)
0.0250
4
\begin{align*}
P(S^2 < 31.9375) &= P\left( \frac{(n - 1) S^2}{\sigma^2} < \frac{(8)\, 31.9375}{25} \right) \\
 &= P\left( \chi^2(8) < 10.22 \right), \qquad \frac{(n - 1) S^2}{\sigma^2} \sim \chi^2(n - 1) \\
 &= 0.75
\end{align*}
- Theoretical: cdf('chisq',(n-1)/sigma##2*31.9375,n-1)
0.7501
The F-Distribution
Theorem (8.14)
If $U \sim \chi^2(\nu_1)$ and $V \sim \chi^2(\nu_2)$ are independent random variables, then
\[ F = \frac{U/\nu_1}{V/\nu_2} \sim F(\nu_1, \nu_2) \]
with pdf
\[ g(f) = \frac{\Gamma\left( \frac{\nu_1 + \nu_2}{2} \right)}{\Gamma\left( \frac{\nu_1}{2} \right) \Gamma\left( \frac{\nu_2}{2} \right)} \left( \frac{\nu_1}{\nu_2} \right)^{\frac{\nu_1}{2}} f^{\frac{\nu_1}{2} - 1} \left( 1 + \frac{\nu_1}{\nu_2} f \right)^{-\frac{1}{2}(\nu_1 + \nu_2)} \]
for $f > 0$ and $g(f) = 0$ elsewhere.
Corollary (8.14)
If $U \sim F(\nu_1, \nu_2)$ then
\[ \frac{1}{U} \sim F(\nu_2, \nu_1) \]
Proof:
\[ U = \frac{X_1/\nu_1}{X_2/\nu_2} \sim F(\nu_1, \nu_2) \]
but
\[ \frac{1}{U} = \frac{X_2/\nu_2}{X_1/\nu_1} \sim F(\nu_2, \nu_1) \quad \text{according to Theorem 8.14} \]
Table IV: If $P[F \le f_{0.95}(5, 10)] = 0.95$ then $f_{0.95}(5, 10) = 3.33$.
[Figure: density of $F(5, 10)$ with area 0.95 to the left of $f_{0.95}(5, 10) = 3.33$; horizontal axis $f$.]
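Fact
\[ f_{\gamma}(\nu_1, \nu_2) = \frac{1}{f_{1 - \gamma}(\nu_2, \nu_1)} \]
Proof: Let $F \sim F(\nu_2, \nu_1)$, so that $\frac{1}{F} \sim F(\nu_1, \nu_2)$ by Corollary 8.14. Then
\[ \gamma = P\left[ F \le f_{\gamma}(\nu_2, \nu_1) \right] = \underbrace{P\left[ \frac{1}{F} \le f_{\gamma}(\nu_1, \nu_2) \right]}_{\gamma} \]
\begin{align*}
1 - \gamma &= P\left[ F \le f_{1 - \gamma}(\nu_2, \nu_1) \right] \\
 &= P\left[ \frac{1}{F} \ge \frac{1}{f_{1 - \gamma}(\nu_2, \nu_1)} \right] = 1 - \underbrace{P\left[ \frac{1}{F} \le \frac{1}{f_{1 - \gamma}(\nu_2, \nu_1)} \right]}_{\gamma}
\end{align*}
\[ \Rightarrow f_{\gamma}(\nu_1, \nu_2) = \frac{1}{f_{1 - \gamma}(\nu_2, \nu_1)} \]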
Example (F-distribution 1)
Suppose $F \sim F(5, 10)$.
1
\[ f_{0.05}(5, 10) = \frac{1}{f_{0.95}(10, 5)} = \frac{1}{4.74} = 0.211 \]
[Figure: density of $F(5, 10)$ with a marked area of 0.9; horizontal axis $F(5, 10)$.]
2 SAS Program and Output:
proc iml;
/* each percentile is computed directly and via the reciprocal relationship */
v1=quantile('F',0.05,5,10); v1_alt=1/quantile('F',0.95,10,5);
v2=quantile('F',0.95,5,10); v2_alt=1/quantile('F',0.05,10,5);
print v1 [f=8.3] v1_alt [f=8.3] v2 [f=8.3] v2_alt [f=8.3];
me=quantile('F',0.5,5,10);   /* median of F(5,10) */
pp1=cdf('F',1,5,10); pp2=cdf('F',2,5,10); pp3=cdf('F',3,5,10);
print me[f=8.3 l='me']
pp1 [f=8.3 l='1'] pp2 [f=8.3 l='2'] pp3 [f=8.3 l='3'];
quit;
v1 v1_alt v2 v2_alt
0.211 0.211 3.326 3.326
me 1 2 3
0.932 0.535 0.836 0.934
Theorem (8.15)
If $S_1^2$ and $S_2^2$ are the variances of independent random samples of sizes $n_1$ and $n_2$ from normal populations with the variances $\sigma_1^2$ and $\sigma_2^2$, then
\[ F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F(n_1 - 1, n_2 - 1) \]
Proof:
From Theorem 8.11 it follows that
\[ U = \frac{(n_1 - 1) S_1^2}{\sigma_1^2} \sim \chi^2(n_1 - 1) \quad \text{and} \quad V = \frac{(n_2 - 1) S_2^2}{\sigma_2^2} \sim \chi^2(n_2 - 1) \]
are two independent random variables, since the samples are indep.
From Theorem 8.14 it follows that
\[ F = \frac{U/(n_1 - 1)}{V/(n_2 - 1)} = \frac{\dfrac{(n_1 - 1) S_1^2}{\sigma_1^2} \Big/ (n_1 - 1)}{\dfrac{(n_2 - 1) S_2^2}{\sigma_2^2} \Big/ (n_2 - 1)} = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F(n_1 - 1, n_2 - 1) \]
Example (F-distribution 2)
Consider the following two independent random samples
Let $\bar{X}$ and $S_X^2$ represent the sample mean and sample variance of the first random sample.
Let $\bar{Y}$ and $S_Y^2$ represent the sample mean and sample variance of the second random sample.
Define the statistics
\[ U = \frac{S_X^2/18}{S_Y^2/24}, \qquad V = \frac{S_X^2}{S_Y^2} \qquad \text{and} \qquad W = 3\bar{X} - 2\bar{Y} . \]
Example (F-distribution 2, continued)
Simulation: Generate 1000 samples from the following distributions:
$X \sim N(10, 18)$: use the RANDSEED call with a seed of 123 to generate samples of size $n = 9$.
$Y \sim N(20, 24)$: use the RANDSEED call with a seed of 987 to generate samples of size $n = 8$.
Compute the values for $U$, $V$ and $W$ for each sample.
SAS Program: Simulation
proc iml;
mu1=10; sigma1=sqrt(18);
mu2=20; sigma2=sqrt(24);
/* XX (1000 x 9) and YY (1000 x 8) hold one simulated sample per row; */
/* the statements that generate them are not shown in this extract    */
MeanX=(mean(XX`))`; VarX=(var(XX`))`;
MeanY=(mean(YY`))`; VarY=(var(YY`))`;
U=(VarX/(sigma1##2))/(VarY/(sigma2##2));
V=VarX/VarY;
W=3*MeanX-2*MeanY;
1
\[ U = \frac{S_X^2/18}{S_Y^2/24} = \frac{S_X^2/\sigma_X^2}{S_Y^2/\sigma_Y^2} \sim F(n_1 - 1, n_2 - 1) = F(8, 7) \]
\[ f_{0.05}(8, 7) = \frac{1}{f_{0.95}(7, 8)} = \frac{1}{3.5} = 0.2857 \]
Let: pp={0.95,0.05};
Theoretical: quantile('F',pp,8,7) returns 3.7257 and 0.2857
Empirical: call qntl(Q1_Em,U,pp) returns 3.6420 and 0.3071
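2
\begin{align*}
P\left( S_X^2 < S_Y^2 \right) &= P\left( \frac{S_X^2}{S_Y^2} < 1 \right) = P\left( \frac{S_X^2/18}{S_Y^2/24} < \frac{24}{18} \right) \\
 &= P\left( F(8, 7) < \frac{4}{3} \right)
\end{align*}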
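The slide leaves this probability unevaluated. A minimal PROC IML sketch for the theoretical value follows; the name `p` is chosen here for illustration.

```sas
proc iml;
p = cdf('F', 4/3, 8, 7);   /* P(F(8,7) < 4/3) */
print p[f=8.4];
quit;
```

If the simulated vector `V` from the simulation program is available in the same IML session, `mean(V < 1)` would give the empirical counterpart, since $S_X^2 < S_Y^2$ is the same event as $V < 1$.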
3 Solve for $a$
\[ 0.95 = P(V < a) = P\left( \frac{S_X^2}{S_Y^2} < a \right) = P\Bigg( \underbrace{\frac{S_X^2/\sigma_1^2}{S_Y^2/\sigma_2^2}}_{F(8, 7)} < \underbrace{a \cdot \frac{24}{18}}_{f_{0.95}(8, 7)} \Bigg) \]
- Theoretical:
(18/24)*quantile('F',pp,8,7) 2.7943
- Empirical:
call qntl(Q3_Em,V,pp) 2.7315
4 We know that
\[ W = 3\bar{X} - 2\bar{Y} \sim N\left( \mu_W, \sigma_W^2 \right) \]
where
\[ \mu_W = 3(10) + (-2)(20) = -10 \]
\[ \sigma_W^2 = 3^2 \left( \frac{18}{9} \right) + (-2)^2 \left( \frac{24}{8} \right) = 18 + 12 = 30 \]
\[ \Rightarrow W \sim N(-10, 30) \]
5 From no. 4 we know that
\[ 0.95 = P(W < P_{95}) = P\Bigg( Z < \underbrace{\frac{P_{95} - (-10)}{\sqrt{30}}}_{\Phi^{-1}(0.95) = z_{0.95}} \Bigg) \]
\[ \Rightarrow \frac{P_{95} + 10}{\sqrt{30}} = \Phi^{-1}(0.95) = z_{0.95} = 1.645 \]
\[ P_{95} = -10 + 1.645\sqrt{30} = -0.98996 \]
Note:
\[ P_{95} = \mu_W + \Phi^{-1}(0.95)\, \sigma_W = -10 + 1.645\sqrt{30} = -0.98996 \]
\[ P_{05} = \mu_W + \Phi^{-1}(0.05)\, \sigma_W = -10 - 1.645\sqrt{30} = -19.01 \]
- pp={0.05, 0.95};
- Theoretical: quantile('normal',pp,-10,sqrt(30)) returns -19.0092 and -0.9908
- Empirical: call qntl(Q5_Em,W,pp) returns -18.5693 and -1.6064
6 From no. 4 we know that
- Theoretical:
quantile('normal',0.975)*sqrt(30)
OR:
quantile('normal',0.975,-10,sqrt(30))-(-10)
10.7352
- Empirical:
call qntl(Q6_Em,abs(W-(-10)),0.95) 9.9957