Unit 3 Notes
Moments about an arbitrary point A: $\mu_r' = \dfrac{\sum_{i=1}^{n} f_i (x_i - A)^r}{N}$
Moments about origin: $\nu_r = \dfrac{\sum_{i=1}^{n} f_i x_i^r}{N}$
Qns: The first four moments of a distribution about the value 4 of the variable are −1.5, 17, −30 and 308. Find the moments $\mu_1, \mu_2, \mu_3, \mu_4$ about the mean and the moments about the origin. Also find $\beta_1$ and $\beta_2$.
Sol: We have A = 4, $\mu_1' = -1.5$, $\mu_2' = 17$, $\mu_3' = -30$, $\mu_4' = 308$
Moments about mean
$\mu_1 = \mu_1' - \mu_1' = 0$
$\mu_2 = \mu_2' - \mu_1'^2 = 17 - (-1.5)^2 = 14.75$
$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2\mu_1'^3 = -30 - 3(17)(-1.5) + 2(-1.5)^3 = 39.75$
$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'\mu_1'^2 - 3\mu_1'^4 = 308 - 4(-30)(-1.5) + 6(17)(-1.5)^2 - 3(-1.5)^4 = 342.3125$
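Moments about origin
$\nu_1 = \bar{x} = \mu_1' + A = -1.5 + 4 = 2.5$
$\nu_2 = \mu_2 + \bar{x}^2 = 14.75 + (2.5)^2 = 21$
$\nu_3 = \mu_3 + 3\mu_2\bar{x} + \bar{x}^3 = 39.75 + 3(14.75)(2.5) + (2.5)^3 = 166$
$\nu_4 = \mu_4 + 4\mu_3\bar{x} + 6\mu_2\bar{x}^2 + \bar{x}^4 = 342.3125 + 4(39.75)(2.5) + 6(14.75)(2.5)^2 + (2.5)^4 = 1332$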
Calculation of $\beta_1$ and $\beta_2$
Skewness, $\beta_1 = \dfrac{\mu_3^2}{\mu_2^3} = 0.492377$
Kurtosis, $\beta_2 = \dfrac{\mu_4}{\mu_2^2} = 1.573398 < 3$
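The conversions used in this solution can be checked with a short Python sketch. The function names `central_from_raw` and `origin_from_central` are illustrative choices, not part of the notes; the formulas are the ones written above.

```python
def central_from_raw(m1, m2, m3, m4):
    """Central moments from the first four moments about an arbitrary point A."""
    mu2 = m2 - m1**2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1**3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1**2 - 3 * m1**4
    return 0.0, mu2, mu3, mu4


def origin_from_central(mean, mu2, mu3, mu4):
    """Moments about the origin from the mean and the central moments."""
    nu2 = mu2 + mean**2
    nu3 = mu3 + 3 * mu2 * mean + mean**3
    nu4 = mu4 + 4 * mu3 * mean + 6 * mu2 * mean**2 + mean**4
    return mean, nu2, nu3, nu4


A = 4
raw = (-1.5, 17, -30, 308)           # moments about A from the question
mu1, mu2, mu3, mu4 = central_from_raw(*raw)
mean = raw[0] + A                    # x-bar = mu_1' + A
nu = origin_from_central(mean, mu2, mu3, mu4)
beta1 = mu3**2 / mu2**3              # skewness coefficient
beta2 = mu4 / mu2**2                 # kurtosis coefficient

print(mu2, mu3, mu4)
print(nu)
print(round(beta1, 6), round(beta2, 6))
```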
Qns: Calculate the first four moments about the mean (central moments) of the following data:
x 0 1 2 3 4 5 6 7 8
f 1 8 28 56 70 56 28 8 1
Solution
x    f     fx    x−x̄   (x−x̄)²   f(x−x̄)   f(x−x̄)²   (x−x̄)³   f(x−x̄)³   (x−x̄)⁴   f(x−x̄)⁴
0    1     0     -4     16       -4        16         -64      -64        256      256
1    8     8     -3     9        -24       72         -27      -216       81       648
2    28    56    -2     4        -56       112        -8       -224       16       448
3    56    168   -1     1        -56       56         -1       -56        1        56
4    70    280   0      0        0         0          0        0          0        0
5    56    280   1      1        56        56         1        56         1        56
6    28    168   2      4        56        112        8        224        16       448
7    8     56    3      9        24        72         27       216        81       648
8    1     8     4      16       4         16         64       64         256      256
Totals: Σf = 256, Σfx = 1024, Σf(x−x̄) = 0, Σf(x−x̄)² = 512, Σf(x−x̄)³ = 0, Σf(x−x̄)⁴ = 2816
Mean: $\bar{x} = \dfrac{\sum fx}{N} = \dfrac{1024}{256} = 4$
$\mu_1 = \dfrac{\sum_{i=1}^{n} f_i (x_i - \bar{x})}{N} = 0$
$\mu_2 = \dfrac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^2}{N} = \dfrac{512}{256} = 2$
$\mu_3 = \dfrac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^3}{N} = 0$
$\mu_4 = \dfrac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^4}{N} = \dfrac{2816}{256} = 11$
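As a cross-check on the table, the central moments of a frequency distribution can be computed directly. This is a minimal sketch; the helper name `central_moments` is an assumption made for illustration.

```python
def central_moments(xs, fs, max_order=4):
    """Return the mean and the central moments mu_1..mu_max_order of a frequency table."""
    n = sum(fs)
    mean = sum(f * x for x, f in zip(xs, fs)) / n
    moments = [
        sum(f * (x - mean) ** r for x, f in zip(xs, fs)) / n
        for r in range(1, max_order + 1)
    ]
    return mean, moments


xs = list(range(9))                          # x = 0, 1, ..., 8
fs = [1, 8, 28, 56, 70, 56, 28, 8, 1]        # frequencies from the question
mean, (mu1, mu2, mu3, mu4) = central_moments(xs, fs)
print(mean, mu1, mu2, mu3, mu4)
```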
Skewness:
1. Symmetric Skewness: A perfectly symmetric distribution is one in which the frequencies are distributed identically on both sides of the centre point of the frequency curve. In this case Mean = Median = Mode, and there is no skewness.
2. Asymmetric Skewness: An asymmetrical or skewed distribution is one in which the spread of the frequencies differs on the two sides of the centre point, i.e. the frequency curve is more stretched towards one side. The mean, median and mode fall at different points.
(A) Positive Skewness: The frequency curve is stretched towards the higher values of the variable, i.e. the right tail is longer than the left tail.
(B) Negative Skewness: The frequency curve is stretched towards the lower values of the variable, i.e. the left tail is longer than the right tail.
What is Kurtosis?
Kurtosis is also a characteristic of the frequency distribution and gives an idea of its shape. It measures the extent to which a frequency distribution is peaked in comparison with the normal curve, i.e. the degree of peakedness of a distribution.
Types of Kurtosis
Kurtosis is classified into three types:
1. Leptokurtic: A curve with a higher peak than the normal curve ($\beta_2 > 3$). There is a heavy concentration of items near the central value.
2. Mesokurtic: A curve with the same peak as the normal curve ($\beta_2 = 3$). Items are distributed around the central value as in the normal curve.
3. Platykurtic: A curve with a lower peak than the normal curve ($\beta_2 < 3$). There is less concentration of items around the central value.
Skewness, $\beta_1 = \dfrac{\mu_3^2}{\mu_2^3}$, Kurtosis, $\beta_2 = \dfrac{\mu_4}{\mu_2^2}$
Solution:
C.I.    f    x    fx    x−x̄   f(x−x̄)   (x−x̄)²   f(x−x̄)²   (x−x̄)³   f(x−x̄)³   (x−x̄)⁴   f(x−x̄)⁴
5-15    1    10   10    -25    -25       625      625        -15625   -15625     390625   390625
15-25   3    20   60    -15    -45       225      675        -3375    -10125     50625    151875
25-35   5    30   150   -5     -25       25       125        -125     -625       625      3125
35-45   7    40   280   5      35        25       175        125      875        625      4375
45-55   4    50   200   15     60        225      900        3375     13500      50625    202500
Total   20        700          0                  2500                -12000              752500
$\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{700}{20} = 35$
$\mu_1 = \dfrac{\sum_{i=1}^{n} f_i (x_i - \bar{x})}{N} = 0$
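The remaining moments and the coefficients $\beta_1$, $\beta_2$ follow from the column totals in the table. A small Python sketch, assuming the class mid-points represent each class interval (as the table does):

```python
classes = [(5, 15), (15, 25), (25, 35), (35, 45), (45, 55)]
freqs = [1, 3, 5, 7, 4]

mids = [(lo + hi) / 2 for lo, hi in classes]    # class mid-points x
n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, mids)) / n


def mu(r):
    """r-th central moment of the grouped data."""
    return sum(f * (x - mean) ** r for f, x in zip(freqs, mids)) / n


beta1 = mu(3) ** 2 / mu(2) ** 3    # skewness coefficient
beta2 = mu(4) / mu(2) ** 2         # kurtosis coefficient
print(mean, mu(2), mu(3), mu(4), beta1, beta2)
```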
$\sum xy = a\sum x + b\sum x^2$
Qns: Using the method of least squares, fit a straight line to the following data
X 1 2 3 4 5
y 14 27 40 55 68
Solution:
x            y      xy     x²
1            14     14     1
2            27     54     4
3            40     120    9
4            55     220    16
5            68     340    25
Total: 15    204    748    55
Normal equations:
$\sum y = na + b\sum x$: 204 = 5a + 15b ………(1)
$\sum xy = a\sum x + b\sum x^2$: 748 = 15a + 55b …….(2)
Solving (1) and (2): b = 13.6, a = 0
Fitted straight line y = a + bx:
y = 13.6x ……(3)
From (3): at x = 15, y = 204; at x = 20, y = 272.
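The two normal equations can be solved in closed form. The sketch below does this in plain Python; `fit_line` is an illustrative name, not something defined in the notes.

```python
def fit_line(xs, ys):
    """Least-squares line y = a + b*x from the two normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b = (n * sxy - sx * sy) / (n * sxx - sx**2)   # eliminate a between (1) and (2)
    a = (sy - b * sx) / n                         # back-substitute into (1)
    return a, b


a, b = fit_line([1, 2, 3, 4, 5], [14, 27, 40, 55, 68])
predict = lambda x: a + b * x
print(a, b, predict(15), predict(20))
```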
Qns: Using the method of least squares, fit an exponential curve $y = ae^{bx}$ to the following data
x 1 5 7 9 12
y 10 15 12 15 21
Solution:
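The working below uses the data of the previous example (x = 1, 2, 3, 4, 5; y = 14, 27, 40, 55, 68).
Taking logarithms (base 10) of $y = ae^{bx}$ gives $\log y = \log a + (b\log e)\,x$, i.e.
Y = A + BX
where Y = log y, A = log a, B = b log e, X = x
x = X        y      Y = log y    XY         X²
1            14     1.1461       1.1461     1
2            27     1.4313       2.8626     4
3            40     1.6020       4.8060     9
4            55     1.7403       6.9612     16
5            68     1.8325       9.1625     25
Total: 15           7.7522       24.9384    55
$\sum Y = nA + B\sum X$: 7.7522 = 5A + 15B ………(1)
$\sum XY = A\sum X + B\sum X^2$: 24.9384 = 15A + 55B ………(2)
Solving (1) and (2):
A = log a = 1.0459
B = b log e = 0.16818
a = Antilog(1.0459) = 11.115
b = B / log e = 0.16818 / 0.4343 = 0.3872
$y = 11.115\,e^{0.3872x}$
Equivalently, since Antilog(0.16818) = 1.4729, $y = 11.115\,(1.4729)^x$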
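The log-linearisation can be sketched in Python as follows. This version uses natural logarithms, so the slope of the fitted line is b itself and the intercept is ln a. The data are the x = 1..5, y = 14..68 values used in the solution table; `fit_exponential` is an illustrative name.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a*exp(b*x) by a least-squares line through (x, ln y)."""
    n = len(xs)
    logs = [math.log(y) for y in ys]          # Y = ln y
    sx, sy = sum(xs), sum(logs)
    sxy = sum(x * ly for x, ly in zip(xs, logs))
    sxx = sum(x * x for x in xs)
    b = (n * sxy - sx * sy) / (n * sxx - sx**2)   # slope = b
    ln_a = (sy - b * sx) / n                      # intercept = ln a
    return math.exp(ln_a), b


a, b = fit_exponential([1, 2, 3, 4, 5], [14, 27, 40, 55, 68])
print(round(a, 3), round(b, 4))
```

Note that this minimises the squared error in ln y, not in y, which is the usual consequence of linearising before fitting.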
$\sum y = na + b\sum x$ …….(1)
$\dfrac{\partial U}{\partial b} = 2\sum_{i=1}^{n} (y_i - a - bx_i)(-x_i) = 0$
$\sum xy = a\sum x + b\sum x^2$ ……(2)
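2) Fitting the parabolic curve $y = a + bx + cx^2$
Error of estimate for the ith point $(x_i, y_i)$ is $E_i = y_i - a - bx_i - cx_i^2$
By the principle of least squares, the values of a, b and c are those that minimise $U = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (y_i - a - bx_i - cx_i^2)^2$, so
$\dfrac{\partial U}{\partial a} = 0, \quad \dfrac{\partial U}{\partial b} = 0, \quad \dfrac{\partial U}{\partial c} = 0$
which give the normal equations
$\sum y = na + b\sum x + c\sum x^2$
$\sum xy = a\sum x + b\sum x^2 + c\sum x^3$
$\sum x^2y = a\sum x^2 + b\sum x^3 + c\sum x^4$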
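A minimal sketch of how the three normal equations can be assembled and solved in plain Python. The data points are hypothetical (chosen to lie exactly on y = 1 + 2x + 3x²) and the helper names `solve` and `fit_parabola` are illustrative only.

```python
def solve(matrix, rhs):
    """Gauss-Jordan elimination with partial pivoting for a small square system."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_parabola(xs, ys):
    """Solve the normal equations for y = a + b*x + c*x^2."""
    def s(p):                      # sum of x^p
        return sum(x**p for x in xs)

    def t(p):                      # sum of x^p * y
        return sum((x**p) * y for x, y in zip(xs, ys))

    matrix = [[len(xs), s(1), s(2)],
              [s(1),    s(2), s(3)],
              [s(2),    s(3), s(4)]]
    return solve(matrix, [t(0), t(1), t(2)])


# Hypothetical data lying exactly on y = 1 + 2x + 3x^2
xs = [0, 1, 2, 3, 4]
ys = [1, 6, 17, 34, 57]
a, b, c = fit_parabola(xs, ys)
print(a, b, c)
```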
3) Fitting of the curve 𝑦 = 𝑎𝑥 + 𝑏𝑥 2
Error of estimate for the ith point $(x_i, y_i)$ is $E_i = y_i - ax_i - bx_i^2$
By the principle of least squares, the values of a and b are those that minimise $U = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (y_i - ax_i - bx_i^2)^2$
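The normal equations for this curve follow by the same steps as for the straight line and the parabola. A sketch of the derivation, assuming the sum of squared errors $U = \sum E_i^2$ is the quantity being minimised:

```latex
\begin{align*}
U &= \sum_{i=1}^{n} \left(y_i - a x_i - b x_i^2\right)^2 \\
\frac{\partial U}{\partial a} &= 2\sum_{i=1}^{n} \left(y_i - a x_i - b x_i^2\right)(-x_i) = 0
  \;\Rightarrow\; \sum xy = a\sum x^2 + b\sum x^3 \\
\frac{\partial U}{\partial b} &= 2\sum_{i=1}^{n} \left(y_i - a x_i - b x_i^2\right)(-x_i^2) = 0
  \;\Rightarrow\; \sum x^2 y = a\sum x^3 + b\sum x^4
\end{align*}
```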
1. Use the method of least squares to fit the curve $y = \dfrac{c_0}{x} + c_1\sqrt{x}$ for the following data.
Normal equations:
$\sum \dfrac{y}{x} = c_0 \sum \dfrac{1}{x^2} + c_1 \sum \dfrac{1}{\sqrt{x}}$ ………..(1)
$\sum y\sqrt{x} = c_0 \sum \dfrac{1}{\sqrt{x}} + c_1 \sum x$ ……..(2)
x             y     y/x      1/x²     1/√x      y√x
0.1           21    210      100      3.162     6.641
0.2           11    55       25       2.236     4.919
0.4           7     17.5     6.25     1.581     4.427
0.5           6     12       4        1.414     4.243
1             5     5        1        1.000     5.000
2             6     3        0.25     0.707     8.485
Total: 4.2          302.5    136.5    10.100    33.715
Substituting the totals into the normal equations:
$302.5 = 136.5\,c_0 + 10.100\,c_1$ ……..(1)
$33.715 = 10.100\,c_0 + 4.2\,c_1$ ……….(2)
Solving (1) and (2): $c_0 = 1.973$, $c_1 = 3.282$
$y = \dfrac{1.973}{x} + 3.282\sqrt{x}$
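The sums in the table and the solution of the two normal equations can be reproduced with a few lines of Python. This is a sketch using Cramer's rule on the 2×2 system; the variable names are illustrative.

```python
import math

xs = [0.1, 0.2, 0.4, 0.5, 1, 2]
ys = [21, 11, 7, 6, 5, 6]

# Sums that appear in the normal equations
s_y_over_x = sum(y / x for x, y in zip(xs, ys))
s_inv_x2 = sum(1 / x**2 for x in xs)
s_inv_sqrt = sum(1 / math.sqrt(x) for x in xs)
s_y_sqrt = sum(y * math.sqrt(x) for x, y in zip(xs, ys))
s_x = sum(xs)

# Solve  s_y_over_x = c0*s_inv_x2   + c1*s_inv_sqrt
#        s_y_sqrt   = c0*s_inv_sqrt + c1*s_x
det = s_inv_x2 * s_x - s_inv_sqrt**2
c0 = (s_y_over_x * s_x - s_inv_sqrt * s_y_sqrt) / det
c1 = (s_inv_x2 * s_y_sqrt - s_inv_sqrt * s_y_over_x) / det
print(round(c0, 3), round(c1, 3))
```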
$r = \dfrac{\sum XY}{\sqrt{\sum X^2}\,\sqrt{\sum Y^2}}$ where $X = x - \bar{x}$, $Y = y - \bar{y}$
$r = \dfrac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$
2. Calculate the correlation coefficient for the following heights (in inches) of
fathers(𝑋) and their sons (𝑌):
x 65 66 67 67 68 69 70 72
y 67 68 65 68 72 72 69 71
x      y      dx = x−x̄   dy = y−ȳ   dx²   dy²   dx·dy
65     67     -3          -2          9     4     6
66     68     -2          -1          4     1     2
67     65     -1          -4          1     16    4
67     68     -1          -1          1     1     1
68     72     0           3           0     9     0
69     72     1           3           1     9     3
70     69     2           0           4     0     0
72     71     4           2           16    4     8
Σx = 544, Σy = 552, Σdx² = 36, Σdy² = 44, Σdx·dy = 24
$\bar{x} = \dfrac{\sum x}{N} = \dfrac{544}{8} = 68$, $\bar{y} = \dfrac{\sum y}{N} = \dfrac{552}{8} = 69$
$r = \dfrac{\sum d_x d_y}{\sqrt{\sum d_x^2}\,\sqrt{\sum d_y^2}} = \dfrac{24}{\sqrt{36}\,\sqrt{44}} = 0.603$
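The deviation form of the correlation coefficient translates directly into code. A minimal sketch, with `pearson_r` as an illustrative function name:

```python
import math


def pearson_r(xs, ys):
    """Correlation coefficient from deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    s_dxdy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    s_dx2 = sum((x - mx) ** 2 for x in xs)
    s_dy2 = sum((y - my) ** 2 for y in ys)
    return s_dxdy / (math.sqrt(s_dx2) * math.sqrt(s_dy2))


fathers = [65, 66, 67, 67, 68, 69, 70, 72]
sons = [67, 68, 65, 68, 72, 72, 69, 71]
r = pearson_r(fathers, sons)
print(round(r, 3))
```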
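Qns: From the data given below, find the number of items n: r = 0.5, $\sum XY = 120$, $\sum X^2 = 90$, $\sigma_y = 8$, where X and Y are deviations from the arithmetic mean.
$\sigma_y^2 = \dfrac{\sum Y^2}{n}$
$64 = \dfrac{\sum Y^2}{n}$
$\sum Y^2 = 64n$
$r = \dfrac{\sum XY}{\sqrt{\sum X^2}\,\sqrt{\sum Y^2}}$
$0.5 = \dfrac{120}{\sqrt{90}\,\sqrt{64n}}$
Squaring both sides:
$0.25 = \dfrac{14400}{90 \times 64n}$
$n = \dfrac{14400}{0.25 \times 90 \times 64} = 10$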
Karl Pearson’s Coefficient of Skewness:
$\text{skewness} = \dfrac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}} = \dfrac{3(\text{Mean} - \text{Median})}{\text{Standard Deviation}}$
Bowley’s Coefficient of Skewness:
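$\text{skewness} = \dfrac{Q_1 + Q_3 - 2M}{Q_3 - Q_1}$
$Q_1 = l_1 + \dfrac{\frac{N}{4} - pcf}{f} \times i$
$Q_3 = l_1 + \dfrac{\frac{3N}{4} - pcf}{f} \times i$
$M = l_1 + \dfrac{\frac{N}{2} - pcf}{f} \times i$
where $l_1$ is the lower limit of the class containing the required item, pcf is the cumulative frequency of the preceding class, f is the frequency of that class and i is the class width.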
C.I. f cf
0-10 3 3
10-20 5 8
20-30 10 18
30-40 8 26
40-50 6 32
Total 32
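Median position: $\dfrac{N}{2} = \dfrac{32}{2} = 16$, so the median class is 20-30.
$M = l_1 + \dfrac{\frac{N}{2} - pcf}{f} \times i = 20 + \dfrac{16 - 8}{10} \times 10 = 28$
First quartile position: $\dfrac{N}{4} = 8$, so the $Q_1$ class is 10-20.
$Q_1 = l_1 + \dfrac{\frac{N}{4} - pcf}{f} \times i = 10 + \dfrac{8 - 3}{5} \times 10 = 20$
Third quartile position: $\dfrac{3N}{4} = 24$, so the $Q_3$ class is 30-40.
$Q_3 = l_1 + \dfrac{\frac{3N}{4} - pcf}{f} \times i = 30 + \dfrac{24 - 18}{8} \times 10 = 37.5$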
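With M, Q1 and Q3 known, Bowley's coefficient follows from the formula given earlier. The sketch below interpolates the three values from the grouped table and then evaluates the coefficient; `grouped_quantile` is an illustrative helper, and it takes the first class whose cumulative frequency reaches the required position, matching the working above.

```python
def grouped_quantile(classes, freqs, position):
    """Interpolated value at a cumulative-frequency position: l1 + (position - pcf)/f * i."""
    pcf = 0                                   # cumulative frequency of preceding classes
    for (lower, upper), f in zip(classes, freqs):
        if pcf + f >= position:
            return lower + (position - pcf) / f * (upper - lower)
        pcf += f
    raise ValueError("position exceeds total frequency")


classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [3, 5, 10, 8, 6]
n = sum(freqs)

q1 = grouped_quantile(classes, freqs, n / 4)
median = grouped_quantile(classes, freqs, n / 2)
q3 = grouped_quantile(classes, freqs, 3 * n / 4)
bowley = (q1 + q3 - 2 * median) / (q3 - q1)
print(q1, median, q3, round(bowley, 4))
```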
3. Ten students got the following percentage of marks in Economics and Statistics:
Roll No.               1    2    3    4    5    6    7    8    9    10
Marks in Economics     78   36   98   25   75   82   90   62   65   39
Marks in Statistics    84   51   91   60   68   62   86   58   53   47
Find the correlation coefficient between the marks in Economics and Statistics.
x y x² y² xy
78 84 6084 7056 6552
36 51 1296 2601 1836
98 91 9604 8281 8918
25 60 625 3600 1500
75 68 5625 4624 5100
82 62 6724 3844 5084
90 86 8100 7396 7740
62 58 3844 3364 3596
65 53 4225 2809 3445
39 47 1521 2209 1833
Total=650 660 47648 45784 45604
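$\bar{x} = \dfrac{650}{10} = 65$, $\bar{y} = \dfrac{660}{10} = 66$
$\sum d_x d_y = \sum xy - n\bar{x}\bar{y} = 45604 - 42900 = 2704$
$\sum d_x^2 = \sum x^2 - n\bar{x}^2 = 47648 - 42250 = 5398$
$\sum d_y^2 = \sum y^2 - n\bar{y}^2 = 45784 - 43560 = 2224$
$r = \dfrac{\sum d_x d_y}{\sqrt{\sum d_x^2}\,\sqrt{\sum d_y^2}} = \dfrac{2704}{\sqrt{5398}\,\sqrt{2224}} = 0.78$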
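The same result can be obtained from the raw column totals with the second form of the correlation formula, without computing deviations. A brief sketch (`pearson_r_raw` is an illustrative name):

```python
import math


def pearson_r_raw(xs, ys):
    """Correlation coefficient from the raw sums: n, sum x, sum y, sum x^2, sum y^2, sum xy."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    num = n * sxy - sx * sy
    den = math.sqrt(n * sxx - sx**2) * math.sqrt(n * syy - sy**2)
    return num / den


economics = [78, 36, 98, 25, 75, 82, 90, 62, 65, 39]
stats_marks = [84, 51, 91, 60, 68, 62, 86, 58, 53, 47]
r = pearson_r_raw(economics, stats_marks)
print(round(r, 2))
```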
Calculation of Coefficient of Correlation for a Bivariate Frequency Distribution:
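$r_{xy} = \dfrac{n\sum fxy - \sum fx \sum fy}{\sqrt{n\sum fx^2 - (\sum fx)^2}\,\sqrt{n\sum fy^2 - (\sum fy)^2}}$
In terms of step deviations u and v:
$r_{xy} = \dfrac{n\sum fuv - \sum fu \sum fv}{\sqrt{n\sum fu^2 - (\sum fu)^2}\,\sqrt{n\sum fv^2 - (\sum fv)^2}}$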
Qns: The following table gives, according to age, the frequency of the marks obtained by 100 students in an intelligence test.
Marks \ Age in years    18    19    20    21    Total
10-20                   4     2     2     -     8
20-30                   5     4     6     4     19
30-40                   6     8     10    11    35
40-50                   4     4     6     8     22
50-60                   -     2     4     4     10
60-70                   -     2     3     1     6
Total                   19    22    31    28    100
Calculate the coefficient of correlation between age and intelligence.
Solution
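Here x is the mid value of the marks class, y is the age in years, a = 45, i = 10 and b = 20. The bracketed figure in each cell is f·u·v for that cell.
Mid value (x)   Marks    18       19       20=b     21        f      u=(x−a)/i   fu     fu²    fuv
15              10-20    4(24)    2(6)     2(0)     -----     8      -3          -24    72     30
25              20-30    5(20)    4(8)     6(0)     4(-8)     19     -2          -38    76     20
35              30-40    6(12)    8(8)     10(0)    11(-11)   35     -1          -35    35     9
45=a            40-50    4(0)     4(0)     6(0)     8(0)      22     0           0      0      0
55              50-60    -----    2(-2)    4(0)     4(4)      10     1           10     10     2
65              60-70    -----    2(-4)    3(0)     1(2)      6      2           12     24     -2
f                        19       22       31       28        100    Total       -75    217    59
v = y−b                  -2       -1       0        1
fv                       -38      -22      0        28        -32 = Σfv
fv²                      76       22       0        28        126 = Σfv²
fuv                      56       16       0        -13       59 = Σfuv
$r_{xy} = \dfrac{n\sum fuv - \sum fu \sum fv}{\sqrt{n\sum fu^2 - (\sum fu)^2}\,\sqrt{n\sum fv^2 - (\sum fv)^2}} = \dfrac{100 \times 59 - (-75)(-32)}{\sqrt{100 \times 217 - (-75)^2}\,\sqrt{100 \times 126 - (-32)^2}} = \dfrac{3500}{\sqrt{16075}\,\sqrt{11576}} = 0.257$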
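The row and column sums of a bivariate table are easy to get wrong by hand, so a direct computation is a useful check. The sketch below assumes blank cells are zero frequencies and uses the same origins a = 45, b = 20 and class width i = 10 as the solution table.

```python
import math

marks_mid = [15, 25, 35, 45, 55, 65]        # mid values of the marks classes (x)
ages = [18, 19, 20, 21]                     # y
table = [                                   # rows: marks classes, columns: ages
    [4, 2, 2, 0],
    [5, 4, 6, 4],
    [6, 8, 10, 11],
    [4, 4, 6, 8],
    [0, 2, 4, 4],
    [0, 2, 3, 1],
]
a, i, b = 45, 10, 20                        # assumed origins and class width

us = [(x - a) / i for x in marks_mid]       # step deviations of marks
vs = [y - b for y in ages]                  # deviations of age

n = sum(sum(row) for row in table)
cells = [(f, u, v) for row, u in zip(table, us) for f, v in zip(row, vs)]
s_fu = sum(f * u for f, u, v in cells)
s_fv = sum(f * v for f, u, v in cells)
s_fu2 = sum(f * u * u for f, u, v in cells)
s_fv2 = sum(f * v * v for f, u, v in cells)
s_fuv = sum(f * u * v for f, u, v in cells)

r = (n * s_fuv - s_fu * s_fv) / (
    math.sqrt(n * s_fu2 - s_fu**2) * math.sqrt(n * s_fv2 - s_fv**2)
)
print(s_fu, s_fv, s_fu2, s_fv2, s_fuv, round(r, 3))
```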
Rank correlation coefficient:
$r = 1 - \dfrac{6\sum D^2}{N(N^2 - 1)} = 1 - \dfrac{6 \times 76}{10(100 - 1)} = 1 - 0.46 = 0.54$
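A short sketch of the rank correlation computation. The first call uses the values ΣD² = 76 and N = 10 from the line above; the second, purely as an illustration, ranks the Economics and Statistics marks from the earlier example (an assumption, since the notes do not say which data ΣD² = 76 comes from). The ranking helper assumes there are no tied values.

```python
def spearman_from_d2(sum_d2, n):
    """Rank correlation coefficient: 1 - 6*sum(D^2) / (N*(N^2 - 1))."""
    return 1 - 6 * sum_d2 / (n * (n**2 - 1))


def ranks(values):
    """Rank 1 for the largest value; assumes no ties."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]


r_example = spearman_from_d2(76, 10)

economics = [78, 36, 98, 25, 75, 82, 90, 62, 65, 39]
stats_marks = [84, 51, 91, 60, 68, 62, 86, 58, 53, 47]
d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(economics), ranks(stats_marks)))
r_marks = spearman_from_d2(d2, len(economics))
print(round(r_example, 3), d2, round(r_marks, 3))
```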