Business Statistics
(Assignment)
Manpreet
BMS 1-C (19118)
Scatter Plot
X Y
5 5
4 5
2 1
5 5
1 1
2 1
1 4
1 3
2 4
2 5
4 4
4 5
2 2
3 5
1 1
3 3
3 5
3 4
2 5
4 4
4 2
2 5
5 3
5 4
4 4
5 3
4 5
3 5
3 1
2 4
91 108
[Figure: scatter plot of Y against X; both axes run from 0 to 6]
Conclusion:
There appears to be a direct correlation between X & Y, as most of the data points move upwards from left to right.
X Y X- x̅ Y- y̅ (X- x̅)(Y- y̅)
5 5 1.97 1.4 2.75
4 5 0.97 1.4 1.35
2 1 -1.03 -2.6 2.69
5 5 1.97 1.4 2.75
1 1 -2.03 -2.6 5.29
2 1 -1.03 -2.6 2.69
1 4 -2.03 0.4 -0.81
1 3 -2.03 -0.6 1.22
2 4 -1.03 0.4 -0.41
2 5 -1.03 1.4 -1.45
4 4 0.97 0.4 0.39
4 5 0.97 1.4 1.35
2 2 -1.03 -1.6 1.65
3 5 -0.03 1.4 -0.05
1 1 -2.03 -2.6 5.29
3 3 -0.03 -0.6 0.02
3 5 -0.03 1.4 -0.05
3 4 -0.03 0.4 -0.01
2 5 -1.03 1.4 -1.45
4 4 0.97 0.4 0.39
4 2 0.97 -1.6 -1.55
2 5 -1.03 1.4 -1.45
5 3 1.97 -0.6 -1.18
5 4 1.97 0.4 0.79
4 4 0.97 0.4 0.39
5 3 1.97 -0.6 -1.18
4 5 0.97 1.4 1.35
3 5 -0.03 1.4 -0.05
3 1 -0.03 -2.6 0.09
2 4 -1.03 0.4 -0.41
91 108 0.00 0.00 20.40
Karl Pearson’s Coefficient of Correlation
r = Σ(X- x̅)(Y- y̅) / √[Σ(X- x̅)² × Σ(Y- y̅)²] = 20.40 / √(50.97 × 63.2) ≈ 0.3594
r = 0.359442
r (using Excel function) = 0.359442
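The coefficient can be cross-checked outside Excel. The following is a minimal sketch in plain Python, assuming the 30 (X, Y) pairs are entered exactly as in the table; the function name `pearson_r` is a hypothetical helper, not something taken from the assignment.

```python
from math import sqrt

# The 30 (X, Y) observations, in the order they appear in the table.
X = [5, 4, 2, 5, 1, 2, 1, 1, 2, 2, 4, 4, 2, 3, 1,
     3, 3, 3, 2, 4, 4, 2, 5, 5, 4, 5, 4, 3, 3, 2]
Y = [5, 5, 1, 5, 1, 1, 4, 3, 4, 5, 4, 5, 2, 5, 1,
     3, 5, 4, 5, 4, 2, 5, 3, 4, 4, 3, 5, 5, 1, 4]


def pearson_r(xs, ys):
    """Karl Pearson's r from sums of deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


r = pearson_r(X, Y)
print(f"r = {r:.6f}")
```

The three sums inside the function correspond to the column totals of the deviation table, so the script also serves as a check on those totals.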
Linear Curve
[Figure: scatter plot of Y against X with a linear trendline; both axes run from 0 to 6]
y = 0.4003x + 2.3859
R² = 0.1292
Parabolic
[Figure: scatter plot of Y against X with a parabolic trendline; both axes run from 0 to 6]
y = -0.1912x² + 1.5676x + 0.9295
R² = 0.1707
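The two trendline equations can be reproduced by ordinary least squares. The sketch below is one way to do this in plain Python, assuming the fit is obtained from the normal equations; `fit_polynomial` is a hypothetical helper written for this illustration and is not the routine Excel uses internally.

```python
# The 30 (X, Y) observations, in the order they appear in the table.
X = [5, 4, 2, 5, 1, 2, 1, 1, 2, 2, 4, 4, 2, 3, 1,
     3, 3, 3, 2, 4, 4, 2, 5, 5, 4, 5, 4, 3, 3, 2]
Y = [5, 5, 1, 5, 1, 1, 4, 3, 4, 5, 4, 5, 2, 5, 1,
     3, 5, 4, 5, 4, 2, 5, 3, 4, 4, 3, 5, 5, 1, 4]


def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [a0, a1, ..., ad], constant term first."""
    m = degree + 1
    # Normal equations: sum_j a_j * sum(x^(i+j)) = sum(y * x^i), i = 0..degree.
    A = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            factor = A[row][col] / A[col][col]
            for k in range(col, m):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(A[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs


b0, b1 = fit_polynomial(X, Y, 1)
c0, c1, c2 = fit_polynomial(X, Y, 2)
print(f"linear:    y = {b1:.4f}x + {b0:.4f}")
print(f"parabolic: y = {c2:.4f}x^2 + {c1:.4f}x + {c0:.4f}")
```

Solving the normal equations directly is adequate for a small, low-degree problem like this one; for higher degrees a library routine would be the safer choice.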
X Y X- x̅ Y- y̅ (X- x̅)(Y- y̅) (X- x̅)² (Y- y̅)² ŷ (Linear) ŷ (Parabolic)
5 5 1.97 1.4 2.75 3.87 1.96 4.39 3.99
4 5 0.97 1.4 1.35 0.93 1.96 3.99 4.14
2 1 -1.03 -2.6 2.69 1.07 6.76 3.19 3.30
5 5 1.97 1.4 2.75 3.87 1.96 4.39 3.99
1 1 -2.03 -2.6 5.29 4.13 6.76 2.79 2.31
2 1 -1.03 -2.6 2.69 1.07 6.76 3.19 3.30
1 4 -2.03 0.4 -0.81 4.13 0.16 2.79 2.31
1 3 -2.03 -0.6 1.22 4.13 0.36 2.79 2.31
2 4 -1.03 0.4 -0.41 1.07 0.16 3.19 3.30
2 5 -1.03 1.4 -1.45 1.07 1.96 3.19 3.30
4 4 0.97 0.4 0.39 0.93 0.16 3.99 4.14
4 5 0.97 1.4 1.35 0.93 1.96 3.99 4.14
2 2 -1.03 -1.6 1.65 1.07 2.56 3.19 3.30
3 5 -0.03 1.4 -0.05 0.00 1.96 3.59 3.91
1 1 -2.03 -2.6 5.29 4.13 6.76 2.79 2.31
3 3 -0.03 -0.6 0.02 0.00 0.36 3.59 3.91
3 5 -0.03 1.4 -0.05 0.00 1.96 3.59 3.91
3 4 -0.03 0.4 -0.01 0.00 0.16 3.59 3.91
2 5 -1.03 1.4 -1.45 1.07 1.96 3.19 3.30
4 4 0.97 0.4 0.39 0.93 0.16 3.99 4.14
4 2 0.97 -1.6 -1.55 0.93 2.56 3.99 4.14
2 5 -1.03 1.4 -1.45 1.07 1.96 3.19 3.30
5 3 1.97 -0.6 -1.18 3.87 0.36 4.39 3.99
5 4 1.97 0.4 0.79 3.87 0.16 4.39 3.99
4 4 0.97 0.4 0.39 0.93 0.16 3.99 4.14
5 3 1.97 -0.6 -1.18 3.87 0.36 4.39 3.99
4 5 0.97 1.4 1.35 0.93 1.96 3.99 4.14
3 5 -0.03 1.4 -0.05 0.00 1.96 3.59 3.91
3 1 -0.03 -2.6 0.09 0.00 6.76 3.59 3.91
2 4 -1.03 0.4 -0.41 1.07 0.16 3.19 3.30
91 108 0.00 0.00 20.40 50.97 63.2
[Figure: chart of actual and estimated Y for observations 1 to 30, linear model; legend: Series1, Series2]
Here,
BLUE = ACTUAL DATA
ORANGE = ESTIMATED DATA with respect to ŷ (Linear)
Trendline equation shown on the chart: y = 0.1292x + 3.135
[Figure: chart of actual and estimated Y for observations 1 to 30, parabolic model; legend: Series1, Series2]
Here,
BLUE = ACTUAL DATA
ORANGE = ESTIMATED DATA with respect to ŷ (Parabolic)
Trendline equation shown on the chart: y = 0.1708x + 2.9856
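1. R^2 of the linear model is 0.1292 and that of the parabolic model is 0.1707.
2. R^2 lies between 0 and 1, and the closer it is to 1 the better the fit; taking 0.6 as the benchmark for a good fit, both models fall well short of it.
3. The higher R^2 of the parabolic model indicates that it is the better fit of the two, though neither model explains much of the variation in Y.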
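The two R^2 values can be recomputed as 1 minus the ratio of the residual sum of squares to the total sum of squares. This is a minimal sketch, assuming the rounded trendline coefficients reported in the assignment; `r_squared` is a hypothetical helper name.

```python
# The 30 (X, Y) observations, in the order they appear in the table.
X = [5, 4, 2, 5, 1, 2, 1, 1, 2, 2, 4, 4, 2, 3, 1,
     3, 3, 3, 2, 4, 4, 2, 5, 5, 4, 5, 4, 3, 3, 2]
Y = [5, 5, 1, 5, 1, 1, 4, 3, 4, 5, 4, 5, 2, 5, 1,
     3, 5, 4, 5, 4, 2, 5, 3, 4, 4, 3, 5, 5, 1, 4]


def r_squared(ys, y_hats):
    """R^2 = 1 - SSE/SST for actual values ys and estimates y_hats."""
    y_bar = sum(ys) / len(ys)
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats))
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst


# Estimates from the two reported trendlines (coefficients rounded to 4 places).
linear_hat = [0.4003 * x + 2.3859 for x in X]
parabolic_hat = [-0.1912 * x ** 2 + 1.5676 * x + 0.9295 for x in X]

r2_linear = r_squared(Y, linear_hat)
r2_parabolic = r_squared(Y, parabolic_hat)
print(f"R^2 linear    = {r2_linear:.4f}")
print(f"R^2 parabolic = {r2_parabolic:.4f}")
```

Because the linear model is a special case of the parabolic one (zero coefficient on the squared term), the least-squares parabola cannot have a lower R^2 on the same data, so the gap between the two values is what matters rather than its sign.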
STANDARD ERROR
(Y- ŷ)² (Linear) (Y- ŷ)² (Parabolic)
0.38 1.03
1.03 0.74
4.78 5.29
0.38 1.03
3.19 1.71
4.78 5.29
1.47 2.87
0.05 0.48
0.66 0.49
3.29 2.89
0.00 0.02
1.03 0.74
1.41 1.69
2.00 1.18
3.19 1.71
0.34 0.83
2.00 1.18
0.17 0.01
3.29 2.89
0.00 0.02
3.95 4.58
3.29 2.89
1.92 0.98
0.15 0.00
0.00 0.02
1.92 0.98
1.03 0.74
2.00 1.18
6.69 8.48
0.66 0.49
55.03 52.41
Standard Error Formula
√[Σ(Y- ŷ)² / (n-2)]
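Standard Error in Linear
Σ(Y- ŷ)²/(n-2) = 1.965524, so standard error = √1.965524 ≈ 1.4020
Standard Error in Parabolic
Σ(Y- ŷ)²/(n-2) = 1.871806, so standard error = √1.871806 ≈ 1.3681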
The lower standard error of the parabolic model implies that it is the better model,
since the typical deviation between actual and estimated values is smaller than
in the linear model.
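The standard error comparison can be sketched in Python as follows. The sketch assumes the rounded trendline coefficients reported earlier and, like the assignment, divides by n - 2 for both models; `standard_error` and its `n_params` argument are hypothetical names introduced for this illustration. Dividing by n - 3 for the parabola (one degree of freedom per fitted coefficient) is a common alternative, shown in the last line as an assumption rather than as the assignment's method.

```python
from math import sqrt

# The 30 (X, Y) observations, in the order they appear in the table.
X = [5, 4, 2, 5, 1, 2, 1, 1, 2, 2, 4, 4, 2, 3, 1,
     3, 3, 3, 2, 4, 4, 2, 5, 5, 4, 5, 4, 3, 3, 2]
Y = [5, 5, 1, 5, 1, 1, 4, 3, 4, 5, 4, 5, 2, 5, 1,
     3, 5, 4, 5, 4, 2, 5, 3, 4, 4, 3, 5, 5, 1, 4]


def standard_error(ys, y_hats, n_params=2):
    """Square root of SSE / (n - n_params) for estimates y_hats."""
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats))
    return sqrt(sse / (len(ys) - n_params))


# Estimates from the two reported trendlines (coefficients rounded to 4 places).
linear_hat = [0.4003 * x + 2.3859 for x in X]
parabolic_hat = [-0.1912 * x ** 2 + 1.5676 * x + 0.9295 for x in X]

se_linear = standard_error(Y, linear_hat)
se_parabolic = standard_error(Y, parabolic_hat)
# Alternative divisor (assumption): three fitted coefficients in the parabola.
se_parabolic_df3 = standard_error(Y, parabolic_hat, n_params=3)

print(f"linear:              {se_linear:.4f}")
print(f"parabolic (n - 2):   {se_parabolic:.4f}")
print(f"parabolic (n - 3):   {se_parabolic_df3:.4f}")
```

Under either divisor the parabolic model keeps the smaller standard error, so the ranking of the two models does not depend on that choice.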