
CURVE FITTING

Curve fitting describes techniques for fitting curves to discrete data in order to obtain intermediate estimates.

There are two general approaches for curve fitting:

• Least-squares regression: the data exhibit a significant degree of scatter. The strategy is to derive a single curve that represents the general trend of the data.
• Interpolation: the data are very precise. The strategy is to pass a curve or a series of curves through each of the points.
Introduction
In engineering, two types of applications are encountered:
– Trend analysis: predicting values of the dependent variable; this may include extrapolation beyond the data points or interpolation between data points.
– Hypothesis testing: comparing an existing mathematical model with measured data.
Mathematical Background
• Arithmetic mean. The sum of the individual data points (yi) divided by the number of points (n):

$\bar{y} = \frac{\sum y_i}{n}, \quad i = 1, \ldots, n$

• Standard deviation. The most common measure of spread for a sample:

$S_y = \sqrt{\frac{S_t}{n - 1}}, \qquad S_t = \sum (y_i - \bar{y})^2$
Mathematical Background (cont’d)

• Variance. Representation of spread by the square of the standard deviation:

$S_y^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1} \quad \text{or} \quad S_y^2 = \frac{\sum y_i^2 - \left(\sum y_i\right)^2 / n}{n - 1}$

• Coefficient of variation. Quantifies the spread of the data relative to the mean:

$c.v. = \frac{S_y}{\bar{y}} \times 100\%$
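These statistics translate directly into code. Below is a minimal Python sketch (the function name summary_stats is illustrative, not from the slides) that computes the mean, standard deviation, variance, and coefficient of variation exactly as defined above:

```python
import math

def summary_stats(y):
    """Mean, standard deviation, variance, and coefficient of
    variation for a sample y, using the formulas above."""
    n = len(y)
    y_bar = sum(y) / n                          # arithmetic mean
    s_t = sum((yi - y_bar) ** 2 for yi in y)    # sum of squares about the mean
    variance = s_t / (n - 1)                    # sample variance
    s_y = math.sqrt(variance)                   # standard deviation
    cv = s_y / y_bar * 100                      # coefficient of variation (%)
    return y_bar, s_y, variance, cv

# The y-values from the linear-regression example later in these notes:
y = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]
print(summary_stats(y))  # mean ~ 3.4286, S_y ~ 1.9457
```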
Linear Regression

Fitting a straight line to a set of paired observations: $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$:

$y = a_0 + a_1 x + e$

• $a_1$ – slope
• $a_0$ – intercept
• $e$ – error, or residual, between the model and the observations
Linear Regression: Residual
Criteria for a “Best” Fit
How can a0 and a1 be found so that the error is minimized? Two obvious criteria are inadequate:

$\min \sum_{i=1}^{n} e_i = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)$

$\min \sum_{i=1}^{n} |e_i| = \sum_{i=1}^{n} |y_i - a_0 - a_1 x_i|$

The first fails because positive and negative residuals cancel: a poor line for which $e_1 = -e_2$ still gives a zero sum.
Cont.
The least-squares criterion minimizes the sum of the squares of the residuals:

$\min S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_{i,\text{measured}} - y_{i,\text{model}})^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$

Setting the partial derivatives to zero,

$\frac{\partial S_r}{\partial a_0} = 0, \qquad \frac{\partial S_r}{\partial a_1} = 0,$

yields a unique line for a given set of data.
Determination of a0 and a1

$\frac{\partial S_r}{\partial a_0} = -2 \sum (y_i - a_0 - a_1 x_i) = 0$

$\frac{\partial S_r}{\partial a_1} = -2 \sum (y_i - a_0 - a_1 x_i)\, x_i = 0$

Expanding gives the normal equations, 2 equations with 2 unknowns that can be solved simultaneously:

$n a_0 + \left(\sum x_i\right) a_1 = \sum y_i$

$\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 = \sum x_i y_i$

Their solution is

$a_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}, \qquad a_0 = \bar{y} - a_1 \bar{x}$
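A minimal Python sketch of these closed-form formulas (the function name fit_line is illustrative):

```python
def fit_line(x, y):
    """Least-squares straight line y = a0 + a1*x via the normal equations."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi ** 2 for xi in x)
    a1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)  # slope
    a0 = sy / n - a1 * sx / n                       # intercept: y_bar - a1*x_bar
    return a0, a1

# Data from the worked example below; expect a0 ~ 0.0714, a1 ~ 0.8393
x = [1, 2, 3, 4, 5, 6, 7]
y = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]
print(fit_line(x, y))
```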
Interpretation
Error Quantification
• The total sum of the squares around the mean for the dependent variable, y, is

$S_t = \sum (y_i - \bar{y})^2$

• The sum of the squares of the residuals around the regression line is

$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$

• $S_t - S_r$ quantifies the improvement, or error reduction, due to describing the data in terms of a straight line rather than as an average value:

$r^2 = \frac{S_t - S_r}{S_t}$

$r^2$: coefficient of determination; $r$: correlation coefficient
Error Quantification of Linear Regression

For a perfect fit:
• $S_r = 0$ and $r = r^2 = 1$, signifying that the line explains 100 percent of the variability of the data.
• For $r = r^2 = 0$, $S_r = S_t$ and the fit represents no improvement.
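The same quantities are easy to check numerically. A short sketch (names again illustrative) that evaluates $S_t$, $S_r$, and $r^2$ for a fitted line:

```python
def r_squared(x, y, a0, a1):
    """Coefficient of determination for the line y = a0 + a1*x."""
    y_bar = sum(y) / len(y)
    s_t = sum((yi - y_bar) ** 2 for yi in y)                     # spread about the mean
    s_r = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))  # spread about the line
    return (s_t - s_r) / s_t

x = [1, 2, 3, 4, 5, 6, 7]
y = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]
print(r_squared(x, y, 0.07142857, 0.8392857))  # ~0.868, as in the example below
```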
Example

Fit a straight line to the x and y values in the following table:

xi    yi     xi·yi   xi²
1     0.5    0.5     1
2     2.5    5       4
3     2      6       9
4     4      16      16
5     3.5    17.5    25
6     6      36      36
7     5.5    38.5    49
Σ:    28     24.0    119.5   140

$\sum x_i = 28, \quad \sum y_i = 24.0, \quad \sum x_i y_i = 119.5, \quad \sum x_i^2 = 140$

$\bar{x} = \frac{28}{7} = 4, \qquad \bar{y} = \frac{24}{7} = 3.428571$
(cont’d)

$a_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2} = \frac{7 \times 119.5 - 28 \times 24}{7 \times 140 - 28^2} = 0.8392857$

$a_0 = \bar{y} - a_1 \bar{x} = 3.428571 - 0.8392857 \times 4 = 0.07142857$

y = 0.07142857 + 0.8392857 x
Example (Error Analysis)

xi    yi     (yi − ȳ)²   ei² = (yi − ŷi)²
1     0.5    8.5765      0.1687
2     2.5    0.8622      0.5625
3     2.0    2.0408      0.3473
4     4.0    0.3265      0.3265
5     3.5    0.0051      0.5896
6     6.0    6.6122      0.7972
7     5.5    4.2908      0.1993
Σ:    28     24.0        22.7143     2.9911

$S_t = \sum (y_i - \bar{y})^2 = 22.7143$

$S_r = \sum e_i^2 = 2.9911$

$r^2 = \frac{S_t - S_r}{S_t} = 0.868, \qquad r = \sqrt{0.868} = 0.932$
Cont.

• The standard deviation (quantifies the spread around the mean):

$s_y = \sqrt{\frac{S_t}{n-1}} = \sqrt{\frac{22.7143}{7-1}} = 1.9457$

• The standard error of estimate (quantifies the spread around the regression line):

$s_{y/x} = \sqrt{\frac{S_r}{n-2}} = \sqrt{\frac{2.9911}{7-2}} = 0.7735$

Because $s_{y/x} < s_y$, the linear regression model is a good fit.
Linearization of Nonlinear Relationships
• Linear regression requires that the relationship between the dependent and independent variables be linear.
• However, a few types of nonlinear functions can be transformed into linear regression problems:
– The exponential equation
– The power equation
– The saturation-growth-rate equation
Linearization of Nonlinear Relationships
1. The exponential equation:

$y = a_1 e^{b_1 x} \;\Rightarrow\; \ln y = \ln a_1 + b_1 x$

$y^* = a_0 + a_1 x$
Linearization of Nonlinear Relationships
2. The power equation:

$y = a_2 x^{b_2} \;\Rightarrow\; \log y = \log a_2 + b_2 \log x$

$y^* = a_0 + a_1 x^*$
Linearization of Nonlinear Relationships
3. The saturation-growth-rate equation:

$y = a_3 \frac{x}{b_3 + x} \;\Rightarrow\; \frac{1}{y} = \frac{1}{a_3} + \frac{b_3}{a_3}\left(\frac{1}{x}\right)$

with $y^* = 1/y$, $x^* = 1/x$, $a_0 = 1/a_3$, $a_1 = b_3/a_3$.
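Each of these transformations reduces the problem to the straight-line fit derived earlier: transform the data, fit a line, then back-transform the coefficients. A sketch for the power-equation case (fit_line repeats the closed-form helper from above; all names are illustrative):

```python
import math

def fit_line(x, y):
    """Least-squares line y = a0 + a1*x (closed-form normal equations)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi ** 2 for xi in x)
    a1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    return sy / n - a1 * sx / n, a1

def fit_power(x, y):
    """Fit y = a2 * x**b2 by regressing log y on log x."""
    a0, a1 = fit_line([math.log10(xi) for xi in x],
                      [math.log10(yi) for yi in y])
    return 10 ** a0, a1  # back-transform: a2 = 10**a0, b2 = a1
```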
Example
Fit the equation $y = a_2 x^{b_2}$ to the data in the following table.

Taking logarithms, $\log y = \log(a_2 x^{b_2}) = \log a_2 + b_2 \log x$. Letting $Y^* = \log y$, $X^* = \log x$, $a_0 = \log a_2$, and $a_1 = b_2$ gives the linear form $Y^* = a_0 + a_1 X^*$.

xi    yi     X* = log xi   Y* = log yi
1     0.5    0.000         −0.301
2     1.7    0.301         0.230
3     3.4    0.477         0.531
4     5.7    0.602         0.756
5     8.4    0.699         0.924
Σ:    15     19.7          2.079         2.141
Example (cont’d)

xi    yi     X* = log x   Y* = log y   X*·Y*    X*²
1     0.5    0.0000       −0.3010      0.0000   0.0000
2     1.7    0.3010       0.2304       0.0694   0.0906
3     3.4    0.4771       0.5315       0.2536   0.2276
4     5.7    0.6021       0.7559       0.4551   0.3625
5     8.4    0.6990       0.9243       0.6460   0.4886
Sum   15     19.700       2.079        2.141    1.424    1.169

$a_1 = \frac{n \sum X_i^* Y_i^* - \sum X_i^* \sum Y_i^*}{n \sum X_i^{*2} - \left(\sum X_i^*\right)^2} = \frac{5 \times 1.424 - 2.079 \times 2.141}{5 \times 1.169 - 2.079^2} = 1.75$

$a_0 = \bar{Y}^* - a_1 \bar{X}^* = 0.4282 - 1.75 \times 0.41584 = -0.300$
Linearization of Nonlinear Functions: Example

$\log y = -0.300 + 1.75 \log x$

$y = 10^{-0.300}\, x^{1.75} = 0.50\, x^{1.75}$
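As a cross-check of this example, the same fit can be reproduced with NumPy (a sketch, assuming NumPy is available; for a degree-1 fit np.polyfit returns the slope first):

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([0.5, 1.7, 3.4, 5.7, 8.4])

# Regress log10(y) on log10(x), then back-transform the intercept
b2, log_a2 = np.polyfit(np.log10(x), np.log10(y), 1)
print(10 ** log_a2, b2)  # ~0.50 and ~1.75, matching the result above
```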
Polynomial Regression

• Some engineering data is poorly represented by a straight line.
• For these cases, a curve is better suited to fit the data.
• The least-squares method can readily be extended to fit the data to higher-order polynomials.

Polynomial Regression (cont’d)

(Figure: scattered data with curvature, for which a parabola is preferable to a straight line.)
Polynomial Regression (cont’d)

• A 2nd-order polynomial (quadratic) is defined by:

$y = a_0 + a_1 x + a_2 x^2 + e$

• The residual between the model and the data:

$e_i = y_i - a_0 - a_1 x_i - a_2 x_i^2$

• The sum of the squares of the residuals:

$S_r = \sum e_i^2 = \sum \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right)^2$
Polynomial Regression (cont’d)

$\frac{\partial S_r}{\partial a_0} = -2 \sum (y_i - a_0 - a_1 x_i - a_2 x_i^2) = 0$

$\frac{\partial S_r}{\partial a_1} = -2 \sum (y_i - a_0 - a_1 x_i - a_2 x_i^2)\, x_i = 0$

$\frac{\partial S_r}{\partial a_2} = -2 \sum (y_i - a_0 - a_1 x_i - a_2 x_i^2)\, x_i^2 = 0$

These give 3 linear equations in the 3 unknowns (a0, a1, a2), which can be solved simultaneously:

$n a_0 + \left(\sum x_i\right) a_1 + \left(\sum x_i^2\right) a_2 = \sum y_i$

$\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 + \left(\sum x_i^3\right) a_2 = \sum x_i y_i$

$\left(\sum x_i^2\right) a_0 + \left(\sum x_i^3\right) a_1 + \left(\sum x_i^4\right) a_2 = \sum x_i^2 y_i$
Polynomial Regression (cont’d)

• A system of 3×3 equations needs to be solved to determine the coefficients of the polynomial:

$\begin{bmatrix} n & \sum x_i & \sum x_i^2 \\ \sum x_i & \sum x_i^2 & \sum x_i^3 \\ \sum x_i^2 & \sum x_i^3 & \sum x_i^4 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \end{bmatrix}$

• The standard error and the coefficient of determination:

$s_{y/x} = \sqrt{\frac{S_r}{n-3}}, \qquad r^2 = \frac{S_t - S_r}{S_t}$
Polynomial Regression (cont’d)

General: the mth-order polynomial is

$y = a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m + e$

• A system of (m+1)×(m+1) linear equations must be solved to determine the coefficients of the mth-order polynomial (see the sketch below).
• The standard error:

$s_{y/x} = \sqrt{\frac{S_r}{n - (m+1)}}$

• The coefficient of determination:

$r^2 = \frac{S_t - S_r}{S_t}$
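A minimal sketch of this general procedure, assuming NumPy is available (the function name fit_poly is illustrative): it assembles the (m+1)×(m+1) normal-equation system from the power sums and solves it directly.

```python
import numpy as np

def fit_poly(x, y, m):
    """Least-squares polynomial of order m via the normal equations."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    # A[j][k] = sum(x**(j+k)),  b[j] = sum(x**j * y)
    A = np.array([[np.sum(x ** (j + k)) for k in range(m + 1)]
                  for j in range(m + 1)])
    b = np.array([np.sum(x ** j * y) for j in range(m + 1)])
    return np.linalg.solve(A, b)  # coefficients [a0, a1, ..., am]
```

Note that the normal equations become ill-conditioned as m grows; library routines such as numpy.polyfit use more stable formulations, but for the low orders used here the direct solve is adequate.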
Polynomial Regression - Example
Fit a second-order polynomial to the data:

xi    yi      xi²   xi³   xi⁴   xi·yi   xi²·yi
0     2.1     0     0     0     0       0
1     7.7     1     1     1     7.7     7.7
2     13.6    4     8     16    27.2    54.4
3     27.2    9     27    81    81.6    244.8
4     40.9    16    64    256   163.6   654.4
5     61.1    25    125   625   305.5   1527.5
Σ:    15      152.6 55    225   979     585.6   2488.8

$\sum x_i = 15, \;\; \sum y_i = 152.6, \;\; \sum x_i^2 = 55, \;\; \sum x_i^3 = 225, \;\; \sum x_i^4 = 979$

$\sum x_i y_i = 585.6, \qquad \sum x_i^2 y_i = 2488.8$

$\bar{x} = \frac{15}{6} = 2.5, \qquad \bar{y} = \frac{152.6}{6} = 25.433$
Example (cont’d)

• The system of simultaneous linear equations:

$\begin{bmatrix} 6 & 15 & 55 \\ 15 & 55 & 225 \\ 55 & 225 & 979 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} 152.6 \\ 585.6 \\ 2488.8 \end{bmatrix}$

Solving gives $a_0 = 2.47857$, $a_1 = 2.35929$, $a_2 = 1.86071$:

$y = 2.47857 + 2.35929\, x + 1.86071\, x^2$

$S_r = \sum e_i^2 = 3.74657, \qquad S_t = \sum (y_i - \bar{y})^2 = 2513.39$
Polynomial Regression - Example (cont’d)

xi    yi      y_model    ei²       (yi − ȳ)²
0     2.1     2.4786     0.14332   544.42889
1     7.7     6.6986     1.00286   314.45929
2     13.6    14.640     1.08158   140.01989
3     27.2    26.303     0.80491   3.12229
4     40.9    41.687     0.61951   239.22809
5     61.1    60.793     0.09439   1272.13489
Σ:    15      152.6                3.74657   2513.39333

• The standard error of estimate:

$s_{y/x} = \sqrt{\frac{3.74657}{6-3}} = 1.12$

• The coefficient of determination:

$r^2 = \frac{2513.39 - 3.74657}{2513.39} = 0.99851, \qquad r = \sqrt{r^2} = 0.99925$
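This example can be verified in a few lines (a sketch, assuming NumPy is available):

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5], dtype=float)
y = np.array([2.1, 7.7, 13.6, 27.2, 40.9, 61.1])

# np.polyfit returns coefficients from the highest degree down: [a2, a1, a0]
a2, a1, a0 = np.polyfit(x, y, 2)
print(a0, a1, a2)                    # ~2.47857, ~2.35929, ~1.86071

y_model = a0 + a1 * x + a2 * x ** 2
s_r = np.sum((y - y_model) ** 2)     # ~3.74657
s_t = np.sum((y - y.mean()) ** 2)    # ~2513.39
print((s_t - s_r) / s_t)             # r^2 ~ 0.99851
```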
