
EE 231 - Winter 2002

Numerical Methods
Lecture 25 April 8, 2002
Curve Fitting or Least Squares Approximation
Linear Regression
Linear least squares approximation is sometimes called linear regression. Suppose we have n data points

(x1 , y1 ), (x2 , y2 ), . . . , (xn , yn )

which are subject to experimental error, and they look like they almost fit on a straight line, as in the
figure.
[Figure: data points (x_k, y_k) scattered about a line y = ax + b; D_k(a, b) denotes the deviation of the
point (x_k, y_k) from the line.]
Since the data may contain experimental errors, instead of trying to find a smooth function which passes
through all of the points, we try to find the straight line, or regression line, that gives the best fit in the
least squares sense.
To make this more precise, suppose that the regression line or approximating line has equation

y = ax + b,

we measure how good a fit this line is to the data by the sum of the squares of the deviations

    D_k = y_k - (a x_k + b)

for k = 1, 2, \ldots, n, that is, we want

    E(a, b) = \sum_{k=1}^{n} D_k(a, b)^2 = \sum_{k=1}^{n} \left[ y_k - (a x_k + b) \right]^2

to be a minimum.
As a function of the two variables a and b, the minimum of E(a, b) will occur either at a critical point where

    \frac{\partial E(a, b)}{\partial a} = 0 \quad \text{and} \quad \frac{\partial E(a, b)}{\partial b} = 0,

or at a point where E(a, b) is not differentiable. Clearly, there are no points where E(a, b) is not differentiable,
so we should look for the minimum value of E(a, b) at a critical point. Therefore, we set
\partial E(a, b)/\partial a = 0 and \partial E(a, b)/\partial b = 0 and solve for the values of a and b at the
critical point. Note that we really should test the critical point to see that we do, indeed, have a global
minimum for the function E(a, b).
Differentiating E(a, b) = \sum_{k=1}^{n} [y_k - (a x_k + b)]^2, we have

    0 = \frac{\partial E(a, b)}{\partial a} = \frac{\partial}{\partial a} \sum_{k=1}^{n} [y_k - (a x_k + b)]^2 = 2 \sum_{k=1}^{n} [y_k - (a x_k + b)](-x_k)

and

    0 = \frac{\partial E(a, b)}{\partial b} = \frac{\partial}{\partial b} \sum_{k=1}^{n} [y_k - (a x_k + b)]^2 = 2 \sum_{k=1}^{n} [y_k - (a x_k + b)](-1).

These equations simplify to the following,

    a \sum_{k=1}^{n} x_k^2 + b \sum_{k=1}^{n} x_k = \sum_{k=1}^{n} x_k y_k

    a \sum_{k=1}^{n} x_k + b \sum_{k=1}^{n} 1 = \sum_{k=1}^{n} y_k

and are called the normal equations.


Solving the normal equations for a and b, we get

    a = \frac{n \sum_{k=1}^{n} x_k y_k - \left( \sum_{k=1}^{n} x_k \right) \left( \sum_{k=1}^{n} y_k \right)}{n \sum_{k=1}^{n} x_k^2 - \left( \sum_{k=1}^{n} x_k \right)^2}

    b = \frac{\left( \sum_{k=1}^{n} x_k^2 \right) \left( \sum_{k=1}^{n} y_k \right) - \left( \sum_{k=1}^{n} x_k \right) \left( \sum_{k=1}^{n} x_k y_k \right)}{n \sum_{k=1}^{n} x_k^2 - \left( \sum_{k=1}^{n} x_k \right)^2}.

Now that we have the values of a and b, the line that best approximates the data points in the least squares
sense is
y = ax + b,
and is called the least squares line or the regression line.
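
These closed-form formulas translate directly into a short computation. The following is a minimal sketch in
Python (not part of the original notes); the function name fit_line and the use of plain Python lists are our
own choices:

    # A minimal sketch of the closed-form least squares line, assuming the data
    # are given as two equal-length lists of numbers.
    def fit_line(xs, ys):
        """Return (a, b) for the least squares line y = a*x + b."""
        n = len(xs)
        sum_x = sum(xs)
        sum_y = sum(ys)
        sum_x2 = sum(x * x for x in xs)
        sum_xy = sum(x * y for x, y in zip(xs, ys))
        denom = n * sum_x2 - sum_x ** 2          # n*Sum(x^2) - (Sum x)^2
        a = (n * sum_xy - sum_x * sum_y) / denom
        b = (sum_x2 * sum_y - sum_x * sum_xy) / denom
        return a, b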

Example 1. Given the experimental data

(x1 , y1 ) = (1, 1.3), (x2 , y2 ) = (2, 3.5), (x3 , y3 ) = (3, 4.2), (x4 , y4 ) = (4, 5.0), (x5 , y5 ) = (5, 7.0)
(x6 , y6 ) = (6, 8.8), (x7 , y7 ) = (7, 10.1), (x8 , y8 ) = (8, 12.5), (x9 , y9 ) = (9, 13.0), (x10 , y10 ) = (10, 15.6)

Find the least squares line approximating this experimental data. Also, find the total least squares error

    E(a, b) = \sum_{k=1}^{10} \left[ y_k - (a x_k + b) \right]^2

made with this approximation.

Solution. The data are tabulated below; note that the last column was completed after the coefficients a
and b were found.

    x_k    y_k    x_k^2   x_k y_k   a x_k + b

     1      1.3      1       1.3       1.18
     2      3.5      4       7.0       2.72
     3      4.2      9      12.6       4.25
     4      5.0     16      20.0       5.79
     5      7.0     25      35.0       7.33
     6      8.8     36      52.8       8.87
     7     10.1     49      70.7      10.41
     8     12.5     64     100.0      11.94
     9     13.0     81     117.0      13.48
    10     15.6    100     156.0      15.02
    -------------------------------------------------
    Σ:  55     81.0    385     572.4     E(a, b) ≈ 2.34

From the normal equations, we have

    a = \frac{10(572.4) - 55(81)}{10(385) - (55)^2} = 1.538

and

    b = \frac{385(81) - 55(572.4)}{10(385) - (55)^2} = -0.360.
The equation of the least squares line is therefore

y = 1.538x - 0.360.

The approximate values given by the least squares technique at the data points are given in the last column
of the table.

The graph of the least squares line and the data points were plotted using gnuplot and are shown in the
figure below.

[Figure: the data points from example1.dat and the least squares line y = 1.538x - 0.360, plotted in gnuplot
as 1.538*t - 0.360.]
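
As a rough check, applying the hypothetical fit_line sketch from above to the Example 1 data should
reproduce the coefficients and total error found in the table:

    # Example 1 data; the printed values should be approximately 1.538, -0.360, 2.34.
    xs = list(range(1, 11))
    ys = [1.3, 3.5, 4.2, 5.0, 7.0, 8.8, 10.1, 12.5, 13.0, 15.6]
    a, b = fit_line(xs, ys)
    E = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    print(a, b, E)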

General Least Squares Polynomial Approximation


Now we consider the general problem of approximating a data set

    \{ (x_i, y_i) : i = 1, 2, \ldots, m \}

with an algebraic polynomial

    p_n(x) = \sum_{k=0}^{n} a_k x^k

of degree n, where n + 1 < m, using the least squares procedure as above.


First note that if we want to fit the data exactly, then we have m linear equations in the n + 1 unknowns
a_0, a_1, \ldots, a_n:

    a_0 + a_1 x_1 + a_2 x_1^2 + \cdots + a_n x_1^n = y_1
    a_0 + a_1 x_2 + a_2 x_2^2 + \cdots + a_n x_2^n = y_2
    \vdots
    a_0 + a_1 x_m + a_2 x_m^2 + \cdots + a_n x_m^n = y_m

Since m > n + 1, we have more equations than unknowns, so the system is overdetermined, and an
overdetermined system usually has no solution.
In this situation, when there is no exact fit, we have to find values of the coefficients which give a best fit in
the least squares sense. Again, we choose the coefficients a0 , a1 , . . . , an to minimize the least squares error,
that is, the sum of the squares of the deviations,
    E(a_0, a_1, \ldots, a_n) = \sum_{i=1}^{m} [y_i - p_n(x_i)]^2 = \sum_{i=1}^{m} y_i^2 - 2 \sum_{i=1}^{m} y_i p_n(x_i) + \sum_{i=1}^{m} p_n(x_i)^2.

After simplifying, we have
    E(a_0, a_1, \ldots, a_n) = \sum_{i=1}^{m} y_i^2 - 2 \sum_{j=0}^{n} a_j \left( \sum_{i=1}^{m} x_i^j y_i \right) + \sum_{j=0}^{n} \sum_{k=0}^{n} a_j a_k \left( \sum_{i=1}^{m} x_i^{j+k} \right).

As in the linear case, for E to be minimized, it is necessary that


    \frac{\partial E}{\partial a_j} = -2 \sum_{i=1}^{m} y_i x_i^j + 2 \sum_{k=0}^{n} a_k \sum_{i=1}^{m} x_i^{j+k} = 0,

and this gives n + 1 normal equations


    \sum_{k=0}^{n} a_k \sum_{i=1}^{m} x_i^{j+k} = \sum_{i=1}^{m} y_i x_i^j,

for j = 0, 1, 2, . . . , n.
This is a system of n + 1 linear equations in n + 1 unknowns:
    a_0 \sum_{i=1}^{m} x_i^0 + a_1 \sum_{i=1}^{m} x_i^1 + a_2 \sum_{i=1}^{m} x_i^2 + \cdots + a_n \sum_{i=1}^{m} x_i^n = \sum_{i=1}^{m} y_i x_i^0

    a_0 \sum_{i=1}^{m} x_i^1 + a_1 \sum_{i=1}^{m} x_i^2 + a_2 \sum_{i=1}^{m} x_i^3 + \cdots + a_n \sum_{i=1}^{m} x_i^{n+1} = \sum_{i=1}^{m} y_i x_i^1

    \vdots

    a_0 \sum_{i=1}^{m} x_i^n + a_1 \sum_{i=1}^{m} x_i^{n+1} + a_2 \sum_{i=1}^{m} x_i^{n+2} + \cdots + a_n \sum_{i=1}^{m} x_i^{2n} = \sum_{i=1}^{m} y_i x_i^n.

It can be shown that this system of equations has a unique solution for a0 , a1 , . . . , an provided that the xi ,
i = 1, 2, . . . , m, are all distinct.
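
Since the normal equations are just a linear system, they can be assembled and solved directly. Below is a
minimal sketch, assuming NumPy is available; the name poly_lstsq and its interface are our own:

    import numpy as np

    def poly_lstsq(xs, ys, n):
        """Return the coefficients a_0, a_1, ..., a_n of the least squares
        polynomial p_n(x) = a_0 + a_1*x + ... + a_n*x^n of degree n."""
        xs = np.asarray(xs, dtype=float)
        ys = np.asarray(ys, dtype=float)
        # Entry (j, k) of the matrix is Sum_i x_i^(j+k); the j-th entry of the
        # right-hand side is Sum_i y_i * x_i^j, exactly as in the normal equations.
        A = np.array([[np.sum(xs ** (j + k)) for k in range(n + 1)]
                      for j in range(n + 1)])
        rhs = np.array([np.sum(ys * xs ** j) for j in range(n + 1)])
        return np.linalg.solve(A, rhs)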

For example, suppose that we have decided to use a quadratic polynomial

    p_2(x) = a x^2 + b x + c,

then the least squares error is

    E(a, b, c) = \sum_{i=1}^{m} [y_i - p_2(x_i)]^2 = \sum_{i=1}^{m} \left[ y_i - (a x_i^2 + b x_i + c) \right]^2,

and differentiating, we get


    \frac{\partial E}{\partial a} = 2 \sum_{i=1}^{m} \left[ y_i - (a x_i^2 + b x_i + c) \right] (-x_i^2) = 0

    \frac{\partial E}{\partial b} = 2 \sum_{i=1}^{m} \left[ y_i - (a x_i^2 + b x_i + c) \right] (-x_i) = 0

    \frac{\partial E}{\partial c} = 2 \sum_{i=1}^{m} \left[ y_i - (a x_i^2 + b x_i + c) \right] (-1) = 0.
The normal equations are
    a \sum_{i=1}^{m} x_i^4 + b \sum_{i=1}^{m} x_i^3 + c \sum_{i=1}^{m} x_i^2 = \sum_{i=1}^{m} x_i^2 y_i

    a \sum_{i=1}^{m} x_i^3 + b \sum_{i=1}^{m} x_i^2 + c \sum_{i=1}^{m} x_i = \sum_{i=1}^{m} x_i y_i

    a \sum_{i=1}^{m} x_i^2 + b \sum_{i=1}^{m} x_i + c \sum_{i=1}^{m} 1 = \sum_{i=1}^{m} y_i,

which is a system of 3 linear equations in 3 unknowns (a, b, c), which has a unique solution provided that
x1 , x2 , . . . , xm are all distinct.
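
Since \sum_{i=1}^{m} 1 = m, these three equations can equivalently be written as a single matrix equation
(this restatement is ours, but it contains exactly the same sums),

    \begin{pmatrix}
      \sum_{i=1}^{m} x_i^4 & \sum_{i=1}^{m} x_i^3 & \sum_{i=1}^{m} x_i^2 \\
      \sum_{i=1}^{m} x_i^3 & \sum_{i=1}^{m} x_i^2 & \sum_{i=1}^{m} x_i   \\
      \sum_{i=1}^{m} x_i^2 & \sum_{i=1}^{m} x_i   & m
    \end{pmatrix}
    \begin{pmatrix} a \\ b \\ c \end{pmatrix}
    =
    \begin{pmatrix}
      \sum_{i=1}^{m} x_i^2 y_i \\
      \sum_{i=1}^{m} x_i y_i   \\
      \sum_{i=1}^{m} y_i
    \end{pmatrix},

which can be handed to any linear system solver.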
Note: In the general case, we can also determine the value of the least squares error
    E(a_0, a_1, \ldots, a_n) = \sum_{i=1}^{m} [y_i - p_n(x_i)]^2

and this gives a numerical value which indicates how well the curve fits the data.
Also, different polynomials, or other functions, can be tried and the one that gives the smallest value of E
can be used.
High-order polynomials can give a perfect fit in principle, but in practice the results can be useless. The
degree of the polynomial should be kept to a minimum, say no more than 2 or 3.

Example 2. Given the data

(x1 , y1 ) = (1.0, 1.7), (x2 , y2 ) = (2.0, 1.8), (x3 , y3 ) = (3.0, 2.3), (x4 , y4 ) = (4.0, 3.1),

which curve should we use to fit the data?


Solution.
It is a good idea to actually plot the data first and draw a best fit curve by hand. This may reveal what
type of curve should be used. For example, for this data we have the following plot.

[Figure: plot of the four data points from Example 2.]

The plot seems to indicate that we should use a quadratic polynomial for the least squares approximation.

We choose a quadratic

    p_2(x) = a x^2 + b x + c

and use a table to calculate the summations used in the normal equations.

    x_k   y_k   x_k y_k   x_k^2 y_k   x_k^2   x_k^3   x_k^4   p_2(x_k)   (y_k - p_2(x_k))^2

    1.0   1.7     1.7        1.7       1.0     1.0      1.0    1.695       2.5 × 10^-5
    2.0   1.8     3.6        7.2       4.0     8.0     16.0    1.815       2.25 × 10^-4
    3.0   2.3     6.9       20.7       9.0    27.0     81.0    2.285       2.25 × 10^-4
    4.0   3.1    12.4       49.6      16.0    64.0    256.0    3.105       2.5 × 10^-5
    ------------------------------------------------------------------------------------
    Σ:   10.0    8.9       24.6       79.2    30.0   100.0    354.0    —       5.0 × 10^-4

Again, note that the last two columns were calculated after solving the normal equations for a, b, and c.
The normal equations are

    354a + 100b + 30c = 79.2
    100a +  30b + 10c = 24.6
     30a +  10b +  4c =  8.9

Solving these equations, we have a = 0.175, b = -0.405, c = 1.925, and the quadratic which gives a best fit
to the data set in the least squares sense is

    p_2(x) = 0.175 x^2 - 0.405 x + 1.925,

and the value of the error is E(a, b, c) = 5.0 × 10^-4. Using gnuplot we plotted the least squares quadratic
and the data points on the same graph. The result is shown below.
[Figure: the data points from quadratic.dat and the least squares quadratic, plotted in gnuplot as
0.175*t**2 - 0.405*t + 1.925.]
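
As a check, the hypothetical poly_lstsq sketch from the previous section, applied to this data with n = 2,
should return coefficients (a_0, a_1, a_2) matching c = 1.925, b = -0.405, and a = 0.175:

    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [1.7, 1.8, 2.3, 3.1]
    a0, a1, a2 = poly_lstsq(xs, ys, 2)
    print(a2, a1, a0)        # approximately 0.175, -0.405, 1.925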
If instead of a quadratic we had chosen a linear polynomial to get the best fit to the data, say

p1 (x) = ax + b,

then the least squares error to be minimized is


    E(a, b) = \sum_{i=1}^{m} \left[ y_i - (a x_i + b) \right]^2,

and setting the partial derivatives equal to zero, we get

    \frac{\partial E(a, b)}{\partial a} = -2 \sum_{i=1}^{m} [y_i - (a x_i + b)] (x_i) = 0

    \frac{\partial E(a, b)}{\partial b} = -2 \sum_{i=1}^{m} [y_i - (a x_i + b)] (1) = 0

and the normal equations are


    a \sum_{i=1}^{m} x_i^2 + b \sum_{i=1}^{m} x_i = \sum_{i=1}^{m} x_i y_i

    a \sum_{i=1}^{m} x_i + b \sum_{i=1}^{m} 1 = \sum_{i=1}^{m} y_i.

Now we use a table to compute the summations needed in the normal equations.

    x_k   y_k   x_k y_k   x_k^2   p_1(x_k)   (y_k - p_1(x_k))^2

    1.0   1.7     1.7      1.0      1.52       3.24 × 10^-2
    2.0   1.8     3.6      4.0      1.99       3.61 × 10^-2
    3.0   2.3     6.9      9.0      2.46       2.56 × 10^-2
    4.0   3.1    12.4     16.0      2.93       2.89 × 10^-2
    ------------------------------------------------------------
    Σ:   10.0    8.9     24.6     30.0      —        0.123

Again, note that the last two columns were completed after the normal equations were solved for a and b.
The normal equations are

    30a + 10b = 24.6
    10a +  4b =  8.9

with solution a = 0.47 and b = 1.05.

Therefore, the linear polynomial that gives a best fit to the given data in the least squares sense is

y = 0.47x + 1.05.

Note that the least squares error in the linear case is E(a, b) = 0.123, while the least squares error in the
quadratic case was E(a, b, c) = 5.0 × 10^-4, and we conclude that the quadratic gives a better least squares fit
to the data.
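
The two error values can also be recomputed directly from the fitted polynomials; a short sketch using the
coefficients found above:

    # Total least squares errors for the quadratic and linear fits of Example 2.
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [1.7, 1.8, 2.3, 3.1]
    E_quad = sum((y - (0.175 * x**2 - 0.405 * x + 1.925)) ** 2 for x, y in zip(xs, ys))
    E_lin  = sum((y - (0.47 * x + 1.05)) ** 2 for x, y in zip(xs, ys))
    print(E_quad, E_lin)     # approximately 5.0e-4 and 0.123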
We plotted the least squares quadratic, the least squares line, and the original data on the same graph using
gnuplot, and the result is shown below.

[Figure: the data points from quadratic.dat, the least squares quadratic 0.175*t**2 - 0.405*t + 1.925, and
the least squares line 0.47*t + 1.05, plotted together in gnuplot.]

Note: There is no reason to use only polynomials in the least squares approximation process; we can use
other functions as well. Using polynomials, our basis functions were the powers

    \{ 1, x, x^2, x^3, \ldots, x^n, \ldots \}.

However, we are free to use other bases, for example,

    \{ a_1^x, a_2^x, a_3^x, \ldots \}

where \{ a_1, a_2, \ldots, a_n, \ldots \} is a sequence of distinct positive real numbers. Or, we could use the
trigonometric functions to get a trigonometric polynomial, for example,

    \{ 1, \cos x, \sin x, \cos 2x, \sin 2x, \ldots, \cos nx, \sin nx, \ldots \}.

Or, we could use any family of continuous functions which is linearly independent on the real line.
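
For a finite basis \{ \varphi_0, \varphi_1, \ldots, \varphi_n \}, the same least squares procedure leads to
normal equations of the same shape, with the powers of x replaced by the basis functions. The sketch below,
which assumes NumPy and is only one way to organize the computation (the name basis_lstsq is our own),
forms and solves those normal equations in matrix form:

    import numpy as np

    def basis_lstsq(xs, ys, basis):
        """Return coefficients c_0, ..., c_n so that sum_j c_j * basis[j](x)
        is the least squares fit; basis is a list of vectorized callables."""
        xs = np.asarray(xs, dtype=float)
        ys = np.asarray(ys, dtype=float)
        Phi = np.column_stack([f(xs) for f in basis])    # m rows, n+1 columns
        # Normal equations in matrix form: (Phi^T Phi) c = Phi^T y
        return np.linalg.solve(Phi.T @ Phi, Phi.T @ ys)

    # For example, a first-order trigonometric polynomial a0 + a1*cos(x) + a2*sin(x):
    # coeffs = basis_lstsq(xs, ys, [np.ones_like, np.cos, np.sin])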

In most cases, we can reduce the problem to a linear regression problem, or a quadratic least squares
approximation, by making the appropriate transformations.
For example, if our original function is a hyperbola, say

    p(x) = \frac{1}{a_0 + a_1 x},

then the appropriate transformation is

    z = \frac{1}{p(x)} = a_0 + a_1 x.

If our original function is an exponential, say

    p(x) = a b^x,

then the appropriate transformation is

    \log p(x) = \log a + x \log b,

and so z = A + Bx, where A = \log a and B = \log b.


If the original function is a power (geometric) function, say

    p(x) = a x^b,

then the appropriate transformation is

    \log p(x) = \log a + b \log x,

and so z = A + Bt, where A = \log a, B = b, and t = \log x.


If the original function involves trigonometric functions, say

    p(x) = a_0 + a_1 \cos x,

then the appropriate transformation is

    p = a_0 + a_1 t,

where t = \cos x.
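
As one concrete illustration, the exponential case p(x) = a b^x can be fitted by applying linear regression to
the pairs (x_i, \log y_i) and transforming back. This sketch reuses the hypothetical fit_line helper from
the linear regression section; it assumes all y_i are positive and uses natural logarithms, although any base
works:

    import math

    def fit_exponential(xs, ys):
        """Fit y = a * b**x by linear regression on (x, log y); requires y > 0."""
        # fit_line returns (slope, intercept) for z = B*x + A, where z = log y.
        B, A = fit_line(xs, [math.log(y) for y in ys])
        return math.exp(A), math.exp(B)        # a = e^A, b = e^B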
