
CE 205: Numerical Methods

Curve-fitting (Linear and Non-linear Regression)

CE 205: Numerical Methods Dr. Tanvir Ahmed


Motivation
- Sampled data are acquired at discrete points,
- but many engineering applications need values at points where no measurement was taken.

Interpolation
- for very precise data
- Example: density of water at different temperatures

Least-squares Regression
- for imprecise/noisy data
- Purpose is trend analysis





Interpolation and linear regression

Least-squares regression
Visually sketch a line that conforms to the data

Linear interpolation
Connect the data points consecutively by line segments

Polynomial interpolation
Connect the data points consecutively by
simple curves
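As an illustration of the linear-interpolation idea above, the following is a minimal sketch rather than a definitive implementation. The function name `linear_interpolate` and the sample points are hypothetical, and the x-values are assumed to be sorted in increasing order.

```python
def linear_interpolate(xs, ys, x):
    """Estimate y at x by joining neighbouring data points with a straight line.

    Assumes xs is sorted in increasing order and x lies inside the data range.
    """
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x is outside the data range")
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            # Fractional position of x between the two neighbouring points
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])


# Hypothetical data, for illustration only
xs = [0.0, 10.0, 20.0]
ys = [1.0, 3.0, 7.0]
y_mid = linear_interpolate(xs, ys, 5.0)  # halfway between the first two points
```

Because the interpolant passes through every data point, any noise in the measurements is reproduced exactly, which is why this approach suits very precise data.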



Linear Least Squares Regression
A method to determine the “best” coefficients in a linear model for a given data set [(x1, y1), (x2, y2), ..., (xn, yn)].

A straight-line model:

$$y = a_0 + a_1 x + e$$

where $a_0$ and $a_1$ are the coefficients and $e$ is the residual error.

“Best” coefficients can be obtained by minimizing the sum of the squares of the estimate residuals:

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i \right)^2$$



Estimating the best-fit parameters

$$a_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2}$$

$$a_0 = \bar{y} - a_1 \bar{x}$$

Standard error of the estimate:

$$s_{y/x} = \sqrt{\frac{S_r}{n - 2}}$$
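The closed-form expressions above translate directly into code. The following is a minimal sketch using only the Python standard library; the function name `linear_fit` and the sample data are hypothetical and chosen so that the points lie exactly on y = 1 + 2x.

```python
import math


def linear_fit(x, y):
    """Return (a0, a1, s_yx) for the least-squares line y = a0 + a1*x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired data points")
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    # Slope and intercept from the closed-form best-fit expressions
    a1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a0 = sum_y / n - a1 * sum_x / n
    # Sum of squared residuals about the fitted line
    s_r = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))
    # Standard error of the estimate (n - 2 degrees of freedom)
    s_yx = math.sqrt(s_r / (n - 2))
    return a0, a1, s_yx


# Hypothetical data lying exactly on y = 1 + 2x
a0, a1, s_yx = linear_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

For this noise-free data the fit recovers an intercept of 1 and a slope of 2, with a standard error of zero; real measurements would give a nonzero standard error.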



Improvement by linear regression

(a) The spread of the data around the mean of the dependent data, measured by the sum of squares of residuals between the data points and the mean:

$$S_t = \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2$$

(b) The spread of the data around the best-fit line, measured by the sum of squares of residuals between the data points and the regression line:

$$S_r = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i \right)^2$$



Assessing the “Goodness” of fit

$S_t - S_r$ quantifies the improvement or error reduction due to describing the data in terms of a straight line rather than as an average value.

$$r^2 = \frac{S_t - S_r}{S_t}$$

$r^2$ represents the fraction of the original uncertainty explained by the model (Coefficient of Determination).
- For a perfect fit, $S_r = 0$ and $r^2 = 1$.
- If $r^2 = 0$, there is no improvement over simply picking the mean.
- If $r^2 < 0$, the model is worse than simply picking the mean!
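The three cases listed above can be checked numerically. This is a minimal sketch; the function name `r_squared`, the data, and the trial coefficients are hypothetical and chosen only to reproduce each case.

```python
def r_squared(x, y, a0, a1):
    """Coefficient of determination for the line y = a0 + a1*x."""
    y_mean = sum(y) / len(y)
    # Spread of the data around the mean
    s_t = sum((yi - y_mean) ** 2 for yi in y)
    # Spread of the data around the candidate line
    s_r = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))
    return (s_t - s_r) / s_t


# Hypothetical data lying exactly on y = 1 + 2x (mean of y is 4)
x_data = [0.0, 1.0, 2.0, 3.0]
y_data = [1.0, 3.0, 5.0, 7.0]

perfect = r_squared(x_data, y_data, 1.0, 2.0)     # line through every point
mean_only = r_squared(x_data, y_data, 4.0, 0.0)   # horizontal line at the mean
worse = r_squared(x_data, y_data, 10.0, 0.0)      # horizontal line far from the data
```

Here `perfect` evaluates to 1, `mean_only` to 0, and `worse` is negative because that line has a larger residual sum than the mean itself.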



Polynomial regression

Some engineering data is poorly represented by a straight line.

A higher-order polynomial (e.g. a parabola) may be better suited.



Polynomial regression
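For a second-order polynomial, the least-squares regression can be extended to

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right)^2$$

Setting the partial derivatives of $S_r$ with respect to $a_0$, $a_1$ and $a_2$ to zero gives three linear equations; the coefficients can be determined from them using Gaussian elimination.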
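The system referred to here is not reproduced in this text. As a sketch, the standard normal equations for the second-order fit, written in the same notation (all sums run over i = 1 to n), are:

```latex
\begin{bmatrix}
n            & \sum x_i    & \sum x_i^2 \\
\sum x_i     & \sum x_i^2  & \sum x_i^3 \\
\sum x_i^2   & \sum x_i^3  & \sum x_i^4
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \end{bmatrix}
```

Each row comes from one condition $\partial S_r / \partial a_k = 0$ for $k = 0, 1, 2$.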




Polynomial regression
The second-order case can easily be extended to an m-th order polynomial:

$$y = a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m + e$$

The sum of squares of residuals:

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 - \cdots - a_m x_i^m \right)^2$$

Can you formulate the matrix to determine the coefficients?
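One possible answer, offered as a sketch rather than the course's own solution: entry (j, k) of the normal-equation matrix is the sum of x_i^(j+k), and entry j of the right-hand side is the sum of x_i^j * y_i. The helper names `gauss_solve` and `poly_fit` are hypothetical, partial pivoting is an added safeguard not mentioned in the slides, and the data are made up so that they lie exactly on y = 1 + x^2.

```python
def gauss_solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left unchanged
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Bring the largest remaining entry of this column onto the diagonal
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate the entries below the pivot
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def poly_fit(x, y, order):
    """Return [a0, a1, ..., am] for the least-squares polynomial of the given order."""
    size = order + 1
    # Normal equations: matrix of power sums and right-hand side of moment sums
    a = [[sum(xi ** (j + k) for xi in x) for k in range(size)] for j in range(size)]
    b = [sum((xi ** j) * yi for xi, yi in zip(x, y)) for j in range(size)]
    return gauss_solve(a, b)


# Hypothetical data lying exactly on y = 1 + x^2
coeffs = poly_fit([0.0, 1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 5.0, 10.0, 17.0], 2)
```

For high polynomial orders the power-sum matrix becomes ill-conditioned, so this direct approach is best kept to low orders.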



Multiple linear regression

y is a linear function of two or more independent variables:

$$y = a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_m x_m$$

Again, the best fit is obtained by minimizing the sum of the squares of the estimate residuals:

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i} - \cdots - a_m x_{m,i} \right)^2$$
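As a worked illustration for the two-variable case (m = 2), setting the partial derivatives of $S_r$ with respect to $a_0$, $a_1$ and $a_2$ to zero gives the following system. It is a sketch written in the notation above, not taken from the slides; all sums run over i = 1 to n.

```latex
\begin{bmatrix}
n              & \sum x_{1,i}          & \sum x_{2,i} \\
\sum x_{1,i}   & \sum x_{1,i}^2        & \sum x_{1,i} x_{2,i} \\
\sum x_{2,i}   & \sum x_{1,i} x_{2,i}  & \sum x_{2,i}^2
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} \sum y_i \\ \sum x_{1,i} y_i \\ \sum x_{2,i} y_i \end{bmatrix}
```

The system has the same structure as in polynomial regression, with the powers of x replaced by the separate independent variables, so it can be solved in the same way.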



Nonlinear relationship
Linear regression is predicated on the assumption that the relationship between the dependent and independent variables is linear; this is not always the case.

Three common examples are:

exponential: $y = \alpha_1 e^{\beta_1 x}$

power: $y = \alpha_2 x^{\beta_2}$

saturation-growth-rate: $y = \alpha_3 \dfrac{x}{\beta_3 + x}$



Linearizing nonlinear relationship

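This may involve taking logarithms or inversion:

Model                     Nonlinear                                   Linearized
exponential               $y = \alpha_1 e^{\beta_1 x}$                $\ln y = \ln \alpha_1 + \beta_1 x$
power                     $y = \alpha_2 x^{\beta_2}$                  $\log y = \log \alpha_2 + \beta_2 \log x$
saturation-growth-rate    $y = \alpha_3 \dfrac{x}{\beta_3 + x}$       $\dfrac{1}{y} = \dfrac{1}{\alpha_3} + \dfrac{\beta_3}{\alpha_3} \dfrac{1}{x}$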



