Curve Fitting Using Least Squares Method

The document discusses the Least Squares Method for curve fitting, detailing how to derive the equation of a curve based on given data points to minimize error. It explains the formulation of normal equations for linear, second-degree, power, and exponential curves, providing examples for each case. The importance of squaring residuals to avoid cancellation of errors and to facilitate calculus in finding the best fit is also highlighted.

Statistics (Lecture Note - 1)

Fitting of Curves (Least Squares Method)


Let $Y = Y(X)$ be the unknown equation of a curve, where X is the independent variable and Y is the dependent variable.
Further, suppose we are given a set of n observed data points
$(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$
that approximately satisfy $Y = Y(X)$.
Our task is to obtain the equation $Y = Y(X)$ from the given data such that the error is minimum (Principle of Least Squares).

Curve Fitting by Least Squares Method


[Figure: data plotted against the y axis with the fitted line y = 0.96x + 0.2]

Error: $U = \sum_{i=1}^{n} (y_i - Y_i)^2$, where $Y_i = Y(x_i)$.

Let us consider the equation of the unknown curve as

$$Y = a_0 + a_1 X + a_2 X^2 + \dots + a_n X^n$$

(degree = n), where $a_0, a_1, \dots, a_n$ are (n + 1) constants. Substituting the given data:

$$\begin{aligned}
y_1 &= a_0 + a_1 x_1 + a_2 x_1^2 + \dots + a_n x_1^n \\
y_2 &= a_0 + a_1 x_2 + a_2 x_2^2 + \dots + a_n x_2^n \\
&\;\;\vdots \\
y_n &= a_0 + a_1 x_n + a_2 x_n^2 + \dots + a_n x_n^n
\end{aligned}$$

Now, our task is to determine the constants $a_0, a_1, \dots, a_n$ so that the resulting curve is the curve of best fit.
The curve fits best when the error is minimum, which need not be zero. We therefore define the quantity U, the sum of squares of errors:
$$U = \sum_{i=1}^{n} (y_i - Y_i)^2 = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i - a_2 x_i^2 - \dots - a_n x_i^n\right)^2$$

According to the principle of maxima and minima, the extreme value of the function U is obtained from
$$\frac{\partial U}{\partial a_j} = 0 \quad \text{for } j = 0, 1, 2, \dots, n$$

Dr. Avijit Das 1 NIT Silchar


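(provided the partial derivatives exist.) Therefore,

$$\begin{aligned}
\frac{\partial U}{\partial a_0} = 0 &\Rightarrow 2\sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i - a_2 x_i^2 - \dots - a_n x_i^n\right)(-1) = 0 \\
&\Rightarrow \sum_{i=1}^{n} y_i = n a_0 + a_1 \sum_{i=1}^{n} x_i + a_2 \sum_{i=1}^{n} x_i^2 + \dots + a_n \sum_{i=1}^{n} x_i^n \\
\frac{\partial U}{\partial a_1} = 0 &\Rightarrow \sum_{i=1}^{n} x_i y_i = a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 + a_2 \sum_{i=1}^{n} x_i^3 + \dots + a_n \sum_{i=1}^{n} x_i^{n+1} \\
\frac{\partial U}{\partial a_2} = 0 &\Rightarrow \sum_{i=1}^{n} x_i^2 y_i = a_0 \sum_{i=1}^{n} x_i^2 + a_1 \sum_{i=1}^{n} x_i^3 + \dots + a_n \sum_{i=1}^{n} x_i^{n+2} \\
&\;\;\vdots \\
\frac{\partial U}{\partial a_n} = 0 &\Rightarrow \sum_{i=1}^{n} x_i^n y_i = a_0 \sum_{i=1}^{n} x_i^n + a_1 \sum_{i=1}^{n} x_i^{n+1} + \dots + a_n \sum_{i=1}^{n} x_i^{2n}
\end{aligned}$$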

These equations are known as Normal Equations.
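
The construction of the normal equations can be sketched in code. The following is a minimal illustration, not a production routine: the function names `normal_equations`, `solve_linear_system` and `fit_polynomial` are hypothetical, the data are made up so that they lie exactly on a known parabola, and Gaussian elimination is one assumed way of solving the system.

```python
def normal_equations(xs, ys, degree):
    """Build the (degree+1) x (degree+1) normal-equation system for a polynomial fit."""
    # power_sums[k] = sum of x_i^k, needed for k = 0 .. 2*degree
    power_sums = [sum(x ** k for x in xs) for k in range(2 * degree + 1)]
    # Row j is the equation obtained from dU/da_j = 0
    matrix = [[power_sums[j + k] for k in range(degree + 1)]
              for j in range(degree + 1)]
    rhs = [sum((x ** j) * y for x, y in zip(xs, ys)) for j in range(degree + 1)]
    return matrix, rhs


def solve_linear_system(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the inputs are left untouched
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution
    coeffs = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = (aug[row][n] - tail) / aug[row][row]
    return coeffs


def fit_polynomial(xs, ys, degree):
    """Return [a_0, a_1, ..., a_degree] for the least-squares polynomial."""
    matrix, rhs = normal_equations(xs, ys, degree)
    return solve_linear_system(matrix, rhs)


# Hypothetical data lying exactly on Y = 1 + 2X + 3X^2
xs = [0, 1, 2, 3, 4]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]
coeffs = fit_polynomial(xs, ys, 2)
print(coeffs)  # approximately [1.0, 2.0, 3.0]
```

Because the sample points lie exactly on a second-degree curve, the recovered coefficients match that curve up to floating-point rounding; with noisy data the same code returns the coefficients that minimize U.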

Why are the errors (residuals) squared in the Least Squares Method?
In the method of least squares, the difference between the observed and fitted (predicted) values is called the residual:

$$e_i = y_i - Y_i$$

To measure the total error, we could consider the sum of these residuals: $\sum e_i = \sum (y_i - Y_i)$. However, positive and negative errors cancel each other, giving a misleading measure of overall deviation. To avoid this, we consider the sum of squared residuals:

$$U = \sum e_i^2 = \sum (y_i - Y_i)^2$$

The residuals are squared for the following reasons:

a) It removes the effect of sign (no cancellation of positive and negative errors).

b) The squared function is smooth and differentiable everywhere, unlike $|e_i|$, which is not differentiable at $e_i = 0$. Hence, calculus can be applied easily to find the minimum.

c) Squaring penalizes larger errors more strongly, so the fitted curve avoids large deviations at individual points.

d) If the errors are normally distributed, minimizing $\sum e_i^2$ gives the maximum likelihood estimates of the parameters.
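
Reason (a) can be seen numerically. The sketch below uses the data and fitted line of Example 2 in the straight-line section of these notes; the flat line Y = mean(y) is a deliberately poor candidate introduced here only for comparison, and the helper name `residuals` is hypothetical.

```python
# Data and fitted line from Example 2 (straight-line fitting) of these notes
xs = [0, 1, 2, 3, 4]
ys = [1.0, 1.8, 3.3, 4.5, 6.3]


def residuals(predict):
    """Residuals e_i = y_i - Y_i for a candidate curve Y = predict(x)."""
    return [y - predict(x) for x, y in zip(xs, ys)]


mean_y = sum(ys) / len(ys)
flat = residuals(lambda x: mean_y)            # poor candidate: ignores x entirely
line = residuals(lambda x: 0.72 + 1.33 * x)   # least-squares line

sum_flat, sum_line = sum(flat), sum(line)
sq_flat = sum(e ** 2 for e in flat)
sq_line = sum(e ** 2 for e in line)

print(f"sum of residuals: flat={sum_flat:.6f}  line={sum_line:.6f}")
print(f"sum of squares:   flat={sq_flat:.4f}  line={sq_line:.4f}")
```

Both candidates have a residual sum of essentially zero, so the plain sum cannot tell them apart, while the sum of squares is far larger for the flat line.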

Fitting of Straight Line


Let $Y = a + bX$ be the unknown curve (a straight line) with two constants a and b. The corresponding least squares error is given by
$$U = \sum_{i=1}^{n} (y_i - Y_i)^2 = \sum_{i=1}^{n} (y_i - a - b x_i)^2$$

Hence, the normal equations are

$$\frac{\partial U}{\partial a} = 0 \Rightarrow \sum y_i = na + b\sum x_i$$
$$\frac{\partial U}{\partial b} = 0 \Rightarrow \sum x_i y_i = a\sum x_i + b\sum x_i^2$$
Solve these two equations for the constants a and b to obtain the expression of the unknown curve.
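
The two normal equations form a 2×2 linear system, so a and b can be written in closed form. A minimal sketch follows; the function name `fit_line` and the sample data are hypothetical, chosen only to illustrate the computation.

```python
def fit_line(xs, ys):
    """Return (a, b) for Y = a + bX by solving the two normal equations."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx  # determinant of the 2x2 system
    if det == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b = (n * sxy - sx * sy) / det
    a = (sy - b * sx) / n    # rearranged first normal equation
    return a, b


# Hypothetical data scattered around Y = 2 + 3X
xs = [0, 1, 2, 3]
ys = [2.1, 4.9, 8.2, 10.8]
a, b = fit_line(xs, ys)
print(a, b)  # approximately 2.09 and 2.94
```

The determinant check guards the one case in which the normal equations have no unique solution, namely when every x value is the same.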

Example 1: Fit a straight line to the following data:

x 1 6 11 16 20 26
y 13 16 17 23 24 31



The normal equations are:
$$\sum y_i = na + b\sum x_i$$
$$\sum x_i y_i = a\sum x_i + b\sum x_i^2$$

These require $\sum y_i$, $\sum x_i$, $\sum x_i y_i$ and $\sum x_i^2$:

x       y       x^2     xy
1       13      1       13
6       16      36      96
11      17      121     187
16      23      256     368
20      24      400     480
26      31      676     806
Σx = 80   Σy = 124   Σx^2 = 1490   Σxy = 1950
Hence, the normal equations become
124 = 6a + 80b, 1950 = 80a + 1490b
⇒ a = 11.3228, b = 0.7008
so the least-squares line is
y = 11.3228 + 0.7008x.

Example 2: Fit a straight line to the following data:

x 0 1 2 3 4
y 1.0 1.8 3.3 4.5 6.3

The normal equations are:
$$\sum y_i = na + b\sum x_i$$
$$\sum x_i y_i = a\sum x_i + b\sum x_i^2$$

These require $\sum y_i$, $\sum x_i$, $\sum x_i y_i$ and $\sum x_i^2$:

x       y       x^2     xy
0       1.0     0       0.00
1       1.8     1       1.80
2       3.3     4       6.60
3       4.5     9       13.50
4       6.3     16      25.20
Σx = 10   Σy = 16.9   Σx^2 = 30   Σxy = 47.1
Hence, the normal equations become
16.9 = 5a + 10b, 47.1 = 10a + 30b.
Solving gives
a = 0.72, b = 1.33
so the least-squares line is
y = 0.72 + 1.33x.
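
The solving step is not written out in the example. One way to carry it out, shown here as a sketch using elimination (Cramer's rule) on the two normal equations with n = 5 and the sums from the table, is:

$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2} = \frac{5(47.1) - 10(16.9)}{5(30) - 10^2} = \frac{66.5}{50} = 1.33$$

$$a = \frac{\sum y - b\sum x}{n} = \frac{16.9 - 1.33(10)}{5} = \frac{3.6}{5} = 0.72$$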

Fitting of Second Degree Curve (Parabola)


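Let the equation of the curve be
$$Y = a + bX + cX^2$$
The normal equations are given by:
$$\frac{\partial U}{\partial a} = 0 \Rightarrow \sum y = na + b\sum x + c\sum x^2$$
$$\frac{\partial U}{\partial b} = 0 \Rightarrow \sum xy = a\sum x + b\sum x^2 + c\sum x^3$$
$$\frac{\partial U}{\partial c} = 0 \Rightarrow \sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4$$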

Example 1: Fit a second degree parabola to the following data:



x 0 1 2 3 4
y 0 3 4 5 6
Now, compute the necessary powers and products to form the normal equations:

x     y     x^2   x^3   x^4   xy    x^2 y
0     0     0     0     0     0     0
1     3     1     1     1     3     3
2     4     4     8     16    8     16
3     5     9     27    81    15    45
4     6     16    64    256   24    96
Σ     10    18    30    100   354   50    160

Normal equations:

18 = 5a + 10b + 30c
50 = 10a + 30b + 100c
160 = 30a + 100b + 354c
⇒ a = 0.2286, b = 2.5429, c = −0.2857
Example 2: Fit a parabola $y = ax^2 + bx + c$ to the given data.

x 10 12 15 23 20
y 14 17 23 25 21

Solution: Let the parabola of best fit be
$$y = ax^2 + bx + c.$$
The normal equations (for n observations) are:
$$\begin{aligned}
\sum y_i &= a\sum x_i^2 + b\sum x_i + cn, \\
\sum x_i y_i &= a\sum x_i^3 + b\sum x_i^2 + c\sum x_i, \\
\sum x_i^2 y_i &= a\sum x_i^4 + b\sum x_i^3 + c\sum x_i^2.
\end{aligned}$$
Compute the required sums:

x     y     x^2   x^3     x^4      xy    x^2 y
10    14    100   1000    10000    140   1400
12    17    144   1728    20736    204   2448
15    23    225   3375    50625    345   5175
23    25    529   12167   279841   575   13225
20    21    400   8000    160000   420   8400
Σx = 80   Σy = 100   Σx^2 = 1398   Σx^3 = 26270   Σx^4 = 521202   Σxy = 1684   Σx^2 y = 30648

Substitute into the normal equations:

100 = a(1398) + b(80) + c(5),


1684 = a(26270) + b(1398) + c(80),
30648 = a(521202) + b(26270) + c(1398).
Solving this system (by elimination or matrix methods) gives

a = −0.06950, b = 3.00993, c = −8.72790

(rounded to five decimal places). Thus the best-fit parabola is

$$y = -0.06950\,x^2 + 3.00993\,x - 8.72790.$$
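
The coefficients of this example can be cross-checked in code. The sketch below solves the three normal equations by Cramer's rule, which is one assumed choice among several (elimination or a matrix library would do equally well); the function names `det3` and `fit_parabola` are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_parabola(xs, ys):
    """Return (a, b, c) for y = a x^2 + b x + c via the three normal equations."""
    n = len(xs)
    s1 = sum(xs)
    s2 = sum(x ** 2 for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x * x * y for x, y in zip(xs, ys))
    # Unknowns ordered (a, b, c), matching the equations in the text
    matrix = [[s2, s1, n], [s3, s2, s1], [s4, s3, s2]]
    rhs = [sy, sxy, sx2y]
    d = det3(matrix)
    if d == 0:
        raise ValueError("normal equations are singular")
    solution = []
    for col in range(3):
        # Cramer's rule: replace one column with the right-hand side
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return tuple(solution)


# Data of Example 2
xs = [10, 12, 15, 23, 20]
ys = [14, 17, 23, 25, 21]
a, b, c = fit_parabola(xs, ys)
print(round(a, 5), round(b, 5), round(c, 5))
```

Since the data are integers, Python evaluates the determinants exactly and only the final divisions are rounded, so the printed values can be compared directly with the five-decimal coefficients quoted in the solution.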



Fitting a Power Curve
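Let the equation of the unknown curve be
$$Y = aX^b$$
Taking logarithms on both sides:
$$\log Y = \log a + b \log X$$
Let
$$P = \log Y, \quad A = \log a, \quad Q = \log X$$
Hence $P = A + bQ$, which is a linear equation with two constants A and b. Therefore, the corresponding normal equations are

$$\sum P = nA + b\sum Q$$

$$\sum PQ = A\sum Q + b\sum Q^2$$
Once A is found, the constant a is recovered as $a = 10^A$ when common (base-10) logarithms are used.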
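
The linearization can be sketched in code as follows. The function name `fit_power` is hypothetical, base-10 logarithms are assumed, and the sample data are made up so that they lie exactly on a known power curve.

```python
import math


def fit_power(xs, ys):
    """Return (a, b) for Y = a * X**b using the log-linearized normal equations."""
    if any(x <= 0 for x in xs) or any(y <= 0 for y in ys):
        raise ValueError("logarithms require strictly positive x and y")
    q = [math.log10(x) for x in xs]  # Q = log X
    p = [math.log10(y) for y in ys]  # P = log Y
    n = len(xs)
    sq, sp = sum(q), sum(p)
    sqq = sum(v * v for v in q)
    spq = sum(u * v for u, v in zip(p, q))
    b = (n * spq - sq * sp) / (n * sqq - sq * sq)
    big_a = (sp - b * sq) / n        # A = log a
    return 10 ** big_a, b


# Hypothetical data lying exactly on Y = 2 * X**1.5
xs = [1, 2, 4, 8]
ys = [2 * x ** 1.5 for x in xs]
a, b = fit_power(xs, ys)
print(a, b)  # approximately 2.0 and 1.5
```

Note that this procedure minimizes the squared error in log Y rather than in Y itself, so for noisy data the result can differ from a direct least-squares fit of the power curve.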
Example: Fit a power curve $Y = aX^b$ for the following data:

x 6 2 10 5 8 31
y 9 11 12 8 7 47
Now, compute the logarithmic values needed for the linearized form log Y = log a + b log X:

x     y     P = log y   Q = log x   PQ       Q^2
6     9     0.9542      0.7782      0.7425   0.6056
2     11    1.0414      0.3010      0.3135   0.0906
10    12    1.0792      1.0000      1.0792   1.0000
5     8     0.9031      0.6990      0.6313   0.4886
8     7     0.8451      0.9031      0.7635   0.8156
31    47    1.6721      1.4914      2.4938   2.2243
ΣP = 6.4951   ΣQ = 5.1727   ΣPQ = 6.0238   ΣQ^2 = 5.2247

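Substituting into the normal equations with n = 6:
6.4951 = 6A + 5.1727b, 6.0238 = 5.1727A + 5.2247b
⇒ A ≈ 0.60, b ≈ 0.55, so a = 10^A ≈ 4.0

Answer: $Y \approx 4.0\,X^{0.55}$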

Fitting an Exponential Curve


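Let
$$Y = ab^X$$
Taking logarithms on both sides:
$$\log Y = \log a + X \log b$$
Let $P = \log Y$, $A = \log a$, $B = \log b$; hence
$$P = A + BX$$
which is linear in X, so A and B follow from the normal equations of a straight line.

Another form: Let $Y = ae^{bX}$. Then

$$\log Y = \log a + bX \log e$$

Hence $P = A + BX$, where now $B = b \log e$.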
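
Both exponential forms reduce to a straight-line fit of log Y against X. The sketch below assumes natural logarithms, for which log e = 1 and therefore B = b directly; the function name `fit_exponential` and the sample data are hypothetical.

```python
import math


def fit_exponential(xs, ys):
    """Return (a, b) for Y = a * exp(b * X) from a straight-line fit of ln Y on X."""
    if any(y <= 0 for y in ys):
        raise ValueError("logarithms require strictly positive y")
    p = [math.log(y) for y in ys]      # natural log, so B equals b directly
    n = len(xs)
    sx, sp = sum(xs), sum(p)
    sxx = sum(x * x for x in xs)
    sxp = sum(x * v for x, v in zip(xs, p))
    slope = (n * sxp - sx * sp) / (n * sxx - sx * sx)
    intercept = (sp - slope * sx) / n  # A = ln a
    return math.exp(intercept), slope


# Hypothetical data lying exactly on Y = 3 * 2**X
xs = [0, 1, 2, 3, 4]
ys = [3 * 2 ** x for x in xs]
a, b = fit_exponential(xs, ys)
base = math.exp(b)                     # rewrites Y = a e^{bX} as Y = a * base**X
print(a, b, base)  # approximately 3.0, 0.6931 (ln 2) and 2.0
```

The last step shows how the two forms are related: the base of the first form is the exponential of the rate constant in the second.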
