Optimized Curve Fitting



Case 1

• Tabulated data (interest tables, steam tables, etc.)

• Estimates are required at intermediate values
from such tables


Case 2

• Experimentation
• Independent (predictor) variable X
• Dependent (response) variable Y
• Data available at discrete points or times
• Estimates are required at points between the
discrete values (as it is impractical or expensive
to actually measure them)



Case 3

• Function substitution
• An implicit (complicated) function or program is known
• Results at all values are possible but time consuming
• Further mathematical operations (integration,
differentiation, finding maximum or minimum points) are
difficult


Case 4

• Hypothesis testing
• Alternative mathematical models are given
• Which one is best for a given situation?


Solution
• Graphically represent the data points
• Develop a mathematical relation (curve fitting)
that describes the relationship between the variables
• Draw the curve for the developed mathematical
relation that best represents the given data
points
• Use the mathematical relation or the curve
  • to estimate intermediate values and to extrapolate
  • for further mathematical operations
  • for hypothesis testing


Problem

• Sketch a line that visually conforms to the data
points
• And obtain the y value for x = 8

i     1      2      3      4      5
x    2.10   6.22   7.17   10.5   13.7
y    2.90   3.83   5.98   5.71   7.74


Curve Fitting

[Figure: four different hand-sketched curves through the same five data points; x from 0 to 16, y from 0 to 10]


Curve Fitting

[Figure: another sketched curve through the same data points; x from 0 to 16, y from −4 to 10]


General observations

• Hand-drawn curves depend on the subjective viewpoint
• For the same data set, there should not be
different curves
• There may be errors in reading values from
the graph


Curve Fitting
• Two methods - depending on error in data

• Interpolation
  • Precise data
  • Force through each data point
[Figure: temperature (deg F) vs time (s); curve passing through every data point]

• Regression
  • X values are accurate
  • Y values are noisy (experimental)
  • Represent trend of the data
    without matching individual points
[Figure: f(x) vs x; straight line through scattered points]


Regression steps
(example model: Y = A*exp(–X/X0))

• Model selection
• Define a merit function measuring closeness of fit
• Compute the values of the parameters of the model
• Interpretation of results & assessment of goodness-of-fit


Right Model selection
• Understand the basic principles of the problem
• The model should represent the data trends
  • Linear model y = a0 + a1 x
  • Polynomial model
  • Non-linear model
    • Exponential law
    • Power law
    • Logarithmic law
    • Gaussian law
  • Multiple variable y = b1x1 + b2x2 + ... + bnxn + c
Describing Merit Function
• Method of least squares
• Outliers & weighting function

Residual: e = y − (a0 + a1x)
Regression model: y = a0 + a1x

[Figure: data points y1 … y5 scattered about the regression line, with residuals e2, e3 shown as vertical distances from the line]


Computing parameter values
Obtain the parameters (a0 & a1) that minimize the sum of
squares of the errors between the data points and the line.

Can be solved explicitly:
• Linear regression y = a0 + a1 x
• Polynomial regression
• Multiple regression y = b1x1 + b2x2 + ... + bnxn + c
• Exponential law
• Power law
• Logarithmic law

Solved iteratively, using the Levenberg-Marquardt algorithm:
• Non-linear regression
• Gaussian law
Goodness-of-fit

Visual inspection
Random distribution of residuals around the fitted line
Coefficient of determination r²
Standard error of parameters
Confidence interval
Prediction interval


Interpretation of results

• Curve fitting provides a correlation between the
variables
• It means that ‘X predicts Y’, not ‘X causes Y’
• Parameter values must also make sense


Linear Regression
Model selection
Assume a linear model: y = a0 + a1 x

Merit function: sum of squares of the residual errors

yi = a0 + a1 xi + ei
ei = yi − a0 − a1 xi

Sr = ∑ ei² = ∑ (yi − a0 − a1 xi)²   (sums over i = 1 … n)

[Figure: data points with residuals e = y − (a0 + a1x) about the regression model y = a0 + a1x]


Linear Regression
• Parameter computation
Find the values of a0 and a1 that minimize Sr.
Minimize Sr by setting its derivatives with respect to a0 and a1 to zero.

• First, a0:

∂Sr/∂a0 = ∂/∂a0 [ ∑ (yi − a0 − a1 xi)² ]
        = ∑ 2 (yi − a0 − a1 xi)(−1)
        = 0

Finally:
n a0 + (∑ xi) a1 = ∑ yi,   i.e.   a0 + x̄ a1 = ȳ
Linear Regression
• Second, a1:

∂Sr/∂a1 = ∂/∂a1 [ ∑ (yi − a0 − a1 xi)² ]
        = ∑ 2 (yi − a0 − a1 xi)(−xi)
        = 0

Finally:
(∑ xi) a0 + (∑ xi²) a1 = ∑ xi yi
Linear Regression

• Equations (the normal equations):

n a0 + (∑ xi) a1 = ∑ yi
(∑ xi) a0 + (∑ xi²) a1 = ∑ xi yi

• Solution:

a0 = [ (1/n) ∑ yi ∑ xi² − (1/n) ∑ xi ∑ xi yi ] / [ ∑ xi² − (1/n)(∑ xi)² ]

a1 = [ ∑ xi yi − (1/n) ∑ xi ∑ yi ] / [ ∑ xi² − (1/n)(∑ xi)² ]


Linear Regression
• Sum of squared values
• Variances & covariance of a0 and a1
[formulas lost in extraction]


Example
i     xi      yi      xi²      xi yi    yi²
1    2.10    2.90     4.41     6.09     8.41
2    6.22    3.83    38.69    23.82    14.67
3    7.17    5.98    51.41    42.88    35.76
4   10.50    5.71   110.25    59.96    32.60
5   13.70    7.74   187.69   106.04    59.91
Sum 39.69   26.16   392.45   238.78   151.35

∑ xi = 39.69     ∑ xi² = 392.45
∑ yi = 26.16     ∑ xi yi = 238.78

a0 = [ (1/n) ∑ yi ∑ xi² − (1/n) ∑ xi ∑ xi yi ] / [ ∑ xi² − (1/n)(∑ xi)² ]

a1 = [ ∑ xi yi − (1/n) ∑ xi ∑ yi ] / [ ∑ xi² − (1/n)(∑ xi)² ]
Example
With n = 5, ∑ xi = 39.69, ∑ xi² = 392.45, ∑ yi = 26.16, ∑ xi yi = 238.78:

a0 = [ (1/5)(26.16)(392.45) − (1/5)(39.69)(238.78) ] / [ 392.45 − (1/5)(39.69)² ] = 2.038

a1 = [ 238.78 − (1/5)(39.69)(26.16) ] / [ 392.45 − (1/5)(39.69)² ] = 0.4023

y = 2.038 + 0.4023x
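As a quick numerical check, the same closed-form solution can be evaluated in a few lines. This is a minimal Python/NumPy sketch (an illustration, not part of the original slides); it reproduces a0 ≈ 2.038 and a1 ≈ 0.4023 and answers the earlier question of estimating y at x = 8:

    import numpy as np

    x = np.array([2.10, 6.22, 7.17, 10.50, 13.70])
    y = np.array([2.90, 3.83, 5.98, 5.71, 7.74])
    n = len(x)

    # closed-form least-squares solution from the normal equations
    a1 = (np.sum(x * y) - np.sum(x) * np.sum(y) / n) / (np.sum(x**2) - np.sum(x)**2 / n)
    a0 = np.mean(y) - a1 * np.mean(x)

    print(a0, a1)          # ~2.038, ~0.4023
    print(a0 + a1 * 8.0)   # estimate at x = 8 -> ~5.26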
Another Approach

[Z] {A} = {Y}

[Z]ᵀ [Z] {A} = [Z]ᵀ {Y}

{A} = ( [Z]ᵀ [Z] )⁻¹ [Z]ᵀ {Y}


Example
i    xi      yi
1   2.10    2.90
2   6.22    3.83
3   7.17    5.98
4  10.50    5.71
5  13.70    7.74

y = a0 + a1 x

2.10 a1 + a0 = 2.90
6.22 a1 + a0 = 3.83
7.17 a1 + a0 = 5.98
10.50 a1 + a0 = 5.71
13.70 a1 + a0 = 7.74

[  2.10  1 ]            [ 2.90 ]
[  6.22  1 ]   [ a1 ]   [ 3.83 ]
[  7.17  1 ] * [ a0 ] = [ 5.98 ]
[ 10.50  1 ]            [ 5.71 ]
[ 13.70  1 ]            [ 7.74 ]
Example
2.10 1 2.90 
6.22 1  3.83 
2.10 6.22 7.17 10.50 13.70   a1  2.10 6.22 7.17 10.50 13.70   
 1 1 1 1 1  * 7.17 1*  = 
a 1 1 1 1 1  * 5.98 
  10.50 
1  0   5.71 
  
13.70 1  7.74 
  

392.45 39.69  a1  238.78 


39.69 *  = 
 5  a0  26.16 

a1  0.012922 - 0.10257  238.78 


a  =  *
 0  - 0.10257 1.014232  26.16 

a1  0.4022 
  = 2.0395 
a0  
y = 2.038 + 0.4023x
21/4/2006 Anuj Jain, Astt Prof, AMD, MNNIT, Allahabad 27
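The matrix route maps directly onto a linear-algebra library. A minimal sketch (again an illustration, not from the slides):

    import numpy as np

    x = np.array([2.10, 6.22, 7.17, 10.50, 13.70])
    y = np.array([2.90, 3.83, 5.98, 5.71, 7.74])

    # design matrix [Z]: one column per parameter, ordered [a1, a0]
    Z = np.column_stack([x, np.ones_like(x)])

    # normal equations: (Z^T Z) {A} = Z^T {Y}
    A = np.linalg.solve(Z.T @ Z, Z.T @ y)
    print(A)   # ~[0.4022, 2.0395]

In practice np.linalg.lstsq(Z, y, rcond=None) is preferred over forming Zᵀ Z explicitly, since the normal equations square the condition number of the problem.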
Goodness-of-fit - I
Visual inspection: linear trend matching

[Figure: the five data points with the fitted line; x from 0 to 16, y from 0 to 10]

y = 2.038 + 0.4023x


Goodness-of-fit - I
Predicted values ŷ and residuals e

xi       yi     ŷ      e = yi − ŷ
2.10    2.90   2.88     0.02
6.22    3.83   4.54    -0.71
7.17    5.98   4.92     1.06
10.50   5.71   6.26    -0.55
13.70   7.74   7.55     0.19

ŷ = 2.038 + 0.4023 x


Goodness-of-fit - I
Visual inspection

[Figure: predicted y vs measured y; the points lie near the 45° line, within deviation bands of +18.5% and −17.5%]

ŷ = 2.038 + 0.4023 x


Goodness-of-fit - II

[Figure: predicted y vs measured y, with a trend line fitted through these points: y = 0.8644x + 0.7097, R² = 0.8644]

ŷ = 2.038 + 0.4023x

This plot can be used to compare alternative mathematical models.


Goodness-of-fit - III
Residual Analysis
• If a fitted equation is representative of the data, then its
residuals should not form a pattern when plotted against
the values of the experimental variables or the fitted values.
• Sometimes a normal probability plot is used to see whether
the residuals form a pattern (the normal distribution is
representative of random variation).
These procedures allow us to investigate outliers, test
assumptions, and assess fits.


Goodness-of-fit - III
Residual Analysis

[Figure: a residual plot exhibiting a clear pattern]

The residual plot shows a pattern, indicating that the fitted
equation is not representative of the data.
Goodness-of-fit - III
Residual plot: e vs ŷ

[Figure: residuals e between −1.0 and 1.5 plotted against ŷ from 0 to 8]

There is no pattern: the residuals are randomly distributed,
indicating that the fitted equation is representative of the data.
Goodness-of-fit - IV
[Figure: the data points, their mean, and the fitted line, with St = SSyy, Sr = SSE and SSR marked]

Maximum possible residual -- total sum of squares of the residuals
between the data points and the mean:

St = Syy = ∑ (yi − ȳ)²

Unexplained residual after linear regression -- sum of squares of the
residuals between the data points and the y predicted by the linear model:

Sr = SSE = ∑ (yi − a0 − a1 xi)²

Sum of squares of residuals due to regression -- the error reduction due to
describing the data by a straight line rather than by an average value:

SSR = St − Sr
Goodness-of-fit - IV
Coefficient of Determination
Fraction of the total variation (residual) in y that is accounted for
by the fitted equation:

r² = (sum of squares of residuals due to regression) / (total sum of squares of residuals)

r² = (St − Sr) / St

For a perfect fit, Sr = 0 ⇒ r² = 1.
For no improvement over the mean, Sr = St ⇒ r² = 0.
The magnitude of r² is a measure of the relative strength of
the linear association between x and y.
Goodness-of-fit – IV
Correlation coefficient, r,
assigns a signed number between −1 and 1 that is a
measure of the strength of the relationship between
the variables.
r = 0: there is no relationship between the variables.
r = 1: there is a perfect positive relationship between the
variables; the dependent variable y can be exactly predicted
from the independent variable x by the equation of a straight line.
r = −1: there is a perfect negative relationship between the
variables; again, y can be exactly predicted from x by the
equation of a straight line.
Goodness-of-fit - IV
In practice, the value of r is never exactly 1 or −1.
Positive r:
as x gets larger, y also increases.
Negative r:
the variables are inversely related;
as x gets larger, y decreases, and
as x decreases, y increases.
Just because r is close to 1 does not mean that the
fit is necessarily good.
To confirm, always inspect a plot of the data
along with the regression line.




Goodness-of-fit - IV
Spread of the dependent variable
around the mean of the dependent variable, ȳ = 5.232

[Figure: the data points with a horizontal line at ȳ = 5.232]

Total sum of squares of the residuals between the data points and the mean:
St = Syy = ∑ (yi − ȳ)²

Standard deviation:  sy = sqrt( St / (n − 1) )

Coefficient of variation:  c.v. = sy / ȳ
Goodness-of-fit - IV
Spread of the dependent variable
around the mean of the dependent variable

i      yi     yi − ȳ   (yi − ȳ)²
1     2.90   -2.33      5.44
2     3.83   -1.40      1.97
3     5.98    0.75      0.56
4     5.71    0.48      0.23
5     7.74    2.51      6.29
Sum  26.16     St =    14.48

ȳ = 5.232
Sample standard deviation:  sy = sqrt( St / (n − 1) ) = 1.90
Coefficient of variation:  c.v. = sy / ȳ = 0.364
Goodness-of-fit - IV
Spread of the dependent variable
around the linear regression

[Figure: the data points with the fitted regression line]

Total sum of squares of the residuals between the measured y and the
y calculated with the linear model:
Sr = SSE = ∑ (yi − a0 − a1 xi)²

Standard error of estimate:  sy/x = sqrt( Sr / (n − 2) )
Goodness-of-fit - IV
Spread of the dependent variable
around the linear regression

yi      ŷ      e = yi − ŷ    e²
2.90   2.88     0.02        0.00
3.83   4.54    -0.71        0.51
5.98   4.92     1.06        1.12
5.71   6.26    -0.55        0.31
7.74   7.55     0.19        0.04
                    Sr =    1.96

Standard error of estimate:  sy/x = sqrt( Sr / (n − 2) ) = 0.81
Goodness-of-fit - IV
St = 14.48
Sr = 1.96

Coefficient of determination:
r² = (St − Sr) / St = 0.864

Correlation coefficient:
r = 0.93
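These goodness-of-fit statistics are easy to reproduce numerically. A minimal Python sketch (illustrative, not from the slides):

    import numpy as np

    x = np.array([2.10, 6.22, 7.17, 10.50, 13.70])
    y = np.array([2.90, 3.83, 5.98, 5.71, 7.74])
    y_hat = 2.038 + 0.4023 * x            # fitted line from the example

    St = np.sum((y - np.mean(y))**2)      # total sum of squares, ~14.48
    Sr = np.sum((y - y_hat)**2)           # residual sum of squares, ~1.96
    r2 = (St - Sr) / St                   # coefficient of determination, ~0.864
    s_yx = np.sqrt(Sr / (len(x) - 2))     # standard error of estimate, ~0.81

    print(St, Sr, r2, s_yx)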


Goodness of fit V
Standard error of estimate:
sy/x = sqrt( Sr / (n − 2) ) = 0.81

Standard errors in a0 and a1:
s(a0) = 0.81492
s(a1) = 0.09198

cv(a0) = 0.81492 / 2.038 = 0.40
cv(a1) = 0.09198 / 0.4023 = 0.23
Goodness of fit VI
If the measurements are normally distributed:
the range ȳ − sy to ȳ + sy will encompass approximately
68% of the measurements;
the range ȳ − 2sy to ȳ + 2sy will encompass approximately
95% of the measurements.
It is therefore possible to define an interval within which a
measurement is likely to fall with a certain confidence (probability).


Goodness of fit VI
Confidence Interval
The range around an estimated parameter within which the
true value of the parameter is expected to lie with a given
probability.
The probability that the true value falls within the bounds
L to U is 1 − α, where α is the significance level:

L = ai − s(ai) t(α/2, ν)        U = ai + s(ai) t(α/2, ν)

(ν = degrees of freedom; ν = n − 2 for a fitted straight line)
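A sketch of this computation with SciPy (illustrative). For the straight-line fit of the example, ν = n − 2 = 3, which reproduces the 95% bounds in the Excel output shown later:

    from scipy import stats

    t = stats.t.ppf(1 - 0.05 / 2, df=5 - 2)   # two-sided 95%, n - 2 dof

    for name, val, se in [("a0", 2.0395, 0.81492), ("a1", 0.4022, 0.09198)]:
        print(name, val - t * se, val + t * se)
    # a0: (-0.554, 4.633)   a1: (0.109, 0.695)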


Regression Plot

[figure not recoverable]


Polynomial Regression
• Minimize the residual between the data points
and the curve -- least-squares regression

Linear:     yi = a0 + a1 xi
Quadratic:  yi = a0 + a1 xi + a2 xi²
Cubic:      yi = a0 + a1 xi + a2 xi² + a3 xi³
General:    yi = a0 + a1 xi + a2 xi² + a3 xi³ + … + am xi^m

Must find the values of a0, a1, a2, …, am
Polynomial Regression
• Residual:

ei = yi − (a0 + a1 xi + a2 xi² + a3 xi³ + … + am xi^m)

Sum of squared residuals:

Sr = ∑ ei² = ∑ [ yi − (a0 + a1 xi + a2 xi² + a3 xi³ + … + am xi^m) ]²

• Minimize by taking derivatives with respect to each coefficient


Polynomial Regression
• Normal Equations

[ n         ∑xi        ∑xi²       …   ∑xi^m      ] [ a0 ]   [ ∑yi      ]
[ ∑xi       ∑xi²       ∑xi³       …   ∑xi^(m+1)  ] [ a1 ]   [ ∑xi yi   ]
[ ∑xi²      ∑xi³       ∑xi⁴       …   ∑xi^(m+2)  ] [ a2 ] = [ ∑xi² yi  ]
[ …         …          …          …   …          ] [ …  ]   [ …        ]
[ ∑xi^m     ∑xi^(m+1)  ∑xi^(m+2)  …   ∑xi^(2m)   ] [ am ]   [ ∑xi^m yi ]


Polynomial Regression
• Solution

[Z] {A} = {Y}
[Z]ᵀ [Z] {A} = [Z]ᵀ {Y}
{A} = ( [Z]ᵀ [Z] )⁻¹ [Z]ᵀ {Y}


Example
x   0     1.0   1.5   2.3   2.5   4.0   5.1   6.0   6.5   7.0   8.1   9.0
y   0.2   0.8   2.5   2.5   3.5   4.3   3.0   5.0   3.5   2.4   1.3   2.0
x   9.3  11.0  11.3  12.1  13.1  14.0  15.5  16.0  17.5  17.8  19.0  20.0
y  -0.3  -1.3  -3.0  -4.0  -4.9  -4.0  -5.2  -3.0  -3.5  -1.6  -1.4  -0.1

Cubic fit -- normal equations:

[ n      ∑xi    ∑xi²   ∑xi³ ] [ a0 ]   [ ∑yi     ]
[ ∑xi    ∑xi²   ∑xi³   ∑xi⁴ ] [ a1 ]   [ ∑xi yi  ]
[ ∑xi²   ∑xi³   ∑xi⁴   ∑xi⁵ ] [ a2 ] = [ ∑xi² yi ]
[ ∑xi³   ∑xi⁴   ∑xi⁵   ∑xi⁶ ] [ a3 ]   [ ∑xi³ yi ]

[ 24        229.6       3060.2       46342.8     ] [ a0 ]   [ -1.30    ]
[ 229.6     3060.2      46342.8      752835.2    ] [ a1 ]   [ -316.9   ]
[ 3060.2    46342.8     752835.2     12780147.7  ] [ a2 ] = [ -6037.2  ]
[ 46342.8   752835.2    12780147.7   223518116.8 ] [ a3 ]   [ -9943.36 ]
Example

[ a0 ]   [ -0.3593 ]
[ a1 ]   [  2.3051 ]
[ a2 ] = [ -0.3532 ]
[ a3 ]   [  0.0121 ]

Regression equation:
y = -0.359 + 2.305x - 0.353x² + 0.012x³

[Figure: the 24 data points and the fitted cubic; x from 0 to 25, f(x) from −6 to 6]
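The same cubic fit in a few lines of Python (an illustration; np.polyfit solves the identical least-squares problem, via a more stable factorization than the explicit normal equations):

    import numpy as np

    x = np.array([0, 1.0, 1.5, 2.3, 2.5, 4.0, 5.1, 6.0, 6.5, 7.0, 8.1, 9.0,
                  9.3, 11.0, 11.3, 12.1, 13.1, 14.0, 15.5, 16.0, 17.5, 17.8, 19.0, 20.0])
    y = np.array([0.2, 0.8, 2.5, 2.5, 3.5, 4.3, 3.0, 5.0, 3.5, 2.4, 1.3, 2.0,
                  -0.3, -1.3, -3.0, -4.0, -4.9, -4.0, -5.2, -3.0, -3.5, -1.6, -1.4, -0.1])

    coeffs = np.polyfit(x, y, deg=3)   # highest power first: [a3, a2, a1, a0]
    print(coeffs[::-1])                # ~[-0.359, 2.305, -0.353, 0.012]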
Exponential function
• If the relationship is an exponential function:
y = a e^(bx)
To make it linear, take the logarithm of both sides:
ln(y) = ln(a) + b x
Now it is a linear relation between ln(y) and x.
Linear regression of ln(y) on x gives b as the slope
and ln(a) as the intercept.
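A minimal sketch of this log-transform fit; the data below are made up purely for illustration (the slides give no numbers for this case):

    import numpy as np

    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    y = np.array([2.0, 3.3, 5.4, 8.9, 14.8])   # roughly y = 2 e^(0.5x)

    b, ln_a = np.polyfit(x, np.log(y), 1)      # fit ln(y) = ln(a) + b x
    print(np.exp(ln_a), b)                     # ~2.0, ~0.5

As the next slide notes, fitting in the log domain weights small y values heavily; a weighted fit can compensate.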


Exponential function
• The log transformation gives greater weight to small y values.
• It is better to minimize a suitably weighted function;
weighted linear regression then gives ln(a) and b.
[weighting formulas lost in extraction]




Power Function
• If the relationship is a power function:
y = a x^b
To make it linear, take the logarithm of both sides:
ln(y) = ln(a) + b ln(x)
Now it is a linear relation between ln(y) and ln(x):
the slope is b and the intercept is ln(a).
Power Function
x      y     X = ln(x)   Y = ln(y)
1.2    2.1    0.18        0.74
2.8   11.5    1.03        2.44
4.3   28.1    1.46        3.34
5.4   41.9    1.69        3.74
6.8   72.3    1.92        4.28
7.9   91.4    2.07        4.52

[Figure, left: x vs y, a curved trend; right: X = ln(x) vs Y = ln(y), a straight-line trend]
Power Function
Using the X’s and Y’s, not the original x’s and y’s

 n   n  5 5
 n ∑ Xi   ∑ Yi  ∑ X i = ∑ ln (xi ) = 8.34
 i=1    =  i=1 
a
i =1 i =1
n n 2  B  n 
  5 2 5
 ∑ Xi ∑ Xi   ∑ X Y
i i 2
i=1 i=1  i=1  ∑ X i = ∑ ln (xi ) = 14.0
i =1 i =1
5 5
∑ Yi = ∑ ln (yi ) = 19.1
i =1 i =1
 6 8.34  a  19.1  5 5
8.34 14.0   B  = 31.4 ∑ X iYi = ∑ ln (xi ) ln (yi ) = 31.4
    
i =1 i =1

21/4/2006 Anuj Jain, Astt Prof, AMD, MNNIT, Allahabad 60
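Solving this 2×2 system gives A ≈ 0.38 and B ≈ 2.0, i.e. a = e^A ≈ 1.5 and b ≈ 2.0. The same fit in Python (an illustration):

    import numpy as np

    x = np.array([1.2, 2.8, 4.3, 5.4, 6.8, 7.9])
    y = np.array([2.1, 11.5, 28.1, 41.9, 72.3, 91.4])

    b, ln_a = np.polyfit(np.log(x), np.log(y), 1)   # fit Y = A + B X
    print(np.exp(ln_a), b)                          # a ~1.5, b ~2.0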


Power Function
Example – Carbon Adsorption
q = pollutant mass sorbed per carbon mass
c = concentration of pollutant in solution
K = coefficient
n = measure of the energy of the reaction

q = K c^n
log10(q) = log10(K) + n log10(c)


Power Function
Fit on logarithmic axes:
log10(K) = 1.8733, so K = 10^1.8733 = 74.696;  n = 0.2289

[Figure: Y = log(q) vs X = log(c), with the straight line log10(q) = log10(K) + n log10(c)]
Power Function
Fit on arithmetic axes: K = 74.702 and n = 0.2289

[Figure: q vs c from 0 to 600, with the fitted curve q = K c^n]
Nonlinear Relation

Define the residuals dβ for an initial guess of the parameters λ.

Obtain the parameter corrections dλ needed to reduce the residuals dβ to zero.

In concise matrix form, with A the matrix of derivatives of the
model with respect to the parameters:

Aᵀ dβ = (Aᵀ A) dλ
dλ = (Aᵀ A)⁻¹ (Aᵀ dβ)
Nonlinear Relation
Gaussian function, with parameters (A, x0, σ):

y = A exp( −(x − x0)² / (2σ²) )

In matrix form, the same update is iterated:

Aᵀ dβ = (Aᵀ A) dλ
dλ = (Aᵀ A)⁻¹ (Aᵀ dβ)
Nonlinear Relation
Parameters (A, x0, σ):
Initial guess       (0.8, 15, 4)
Converged values    (1.03, 20.14, 4.86)
Actual values       (1, 20, 5)
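In practice this iteration is handled by a library routine. Below is a sketch using SciPy's curve_fit, which applies a Levenberg-Marquardt-type algorithm to unconstrained problems; the data are synthetic, generated around the "actual" parameters above (an illustration, not the original data):

    import numpy as np
    from scipy.optimize import curve_fit

    def gaussian(x, A, x0, sigma):
        return A * np.exp(-(x - x0)**2 / (2 * sigma**2))

    rng = np.random.default_rng(0)
    x = np.linspace(0, 40, 80)
    y = gaussian(x, 1.0, 20.0, 5.0) + rng.normal(0, 0.02, x.size)  # noisy samples

    popt, pcov = curve_fit(gaussian, x, y, p0=(0.8, 15, 4))  # initial guess from the slide
    print(popt)   # converges close to (1, 20, 5)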


Software
• Although the method involves sophisticated
mathematics,
• typical software requires only initializing the
model and parameters and pressing a button;
it then provides the results along with the statistical values.
• No software can pick a model - it can only help
in differentiating between models.
• Better programs allow users to specify their
own function.


EXCEL functions
 ToolsData AnalysisRegression FORM
 Input X range
 Input Y range
 Labels (column heading)
 Constant is zero
 Confidence level
 Output range
 Residuals & standardized residual
 Residual plots
 Line fit plot
 Normal probability plot

21/4/2006 Anuj Jain, Astt Prof, AMD, MNNIT, Allahabad 68


EXCEL functions
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.9297
R Square             0.8644
Adjusted R Square    0.8191
Standard Error       0.8092
Observations         5

ANOVA
             df    SS        MS        F         Significance F
Regression    1   12.5176   12.5176   19.1175    0.0221
Residual      3    1.9643    0.6548
Total         4   14.4819

              Coefficients  Standard Error  t Stat   P-value   Lower 95%  Upper 95%
Intercept     2.0395        0.8149          2.5027   0.0875    -0.5540    4.6329
X Variable 1  0.4022        0.0920          4.3724   0.0221     0.1095    0.6949

RESIDUAL OUTPUT
Observation  Predicted Y  Residuals  Standard Residuals
1            2.8841        0.0159     0.0227
2            4.5411       -0.7111    -1.0147
3            4.9231        1.0569     1.5082
4            6.2624       -0.5524    -0.7883
5            7.5494        0.1906     0.2720

PROBABILITY OUTPUT
Percentile   Y
10           2.90
30           3.83
50           5.71
70           5.98
90           7.74
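The headline numbers in this output can be reproduced with scipy.stats.linregress (a sketch; intercept_stderr requires SciPy >= 1.6):

    import numpy as np
    from scipy import stats

    x = np.array([2.10, 6.22, 7.17, 10.50, 13.70])
    y = np.array([2.90, 3.83, 5.98, 5.71, 7.74])

    res = stats.linregress(x, y)
    print(res.intercept, res.slope)           # ~2.0395, ~0.4022
    print(res.rvalue**2)                      # R Square, ~0.8644
    print(res.intercept_stderr, res.stderr)   # ~0.8149, ~0.0920
    print(res.pvalue)                         # ~0.0221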
EXCEL functions
[Figure: X Variable 1 Residual Plot -- residuals vs X from 0 to 16]

[Figure: X Variable 1 Line Fit Plot -- Y and Predicted Y vs X]

[Figure: Normal Probability Plot -- Y vs sample percentile]


EXCEL functions
• INTERCEPT(Xdata, Ydata) intercept with the y axis of the
best-fit straight line
• SLOPE(Xdata, Ydata) slope of the best-fit straight line
• LINEST(Xdata, Ydata, stats) best-fit straight line
• TREND(Xdata, Ydata, newXdata, const) y values
along the linear trend
• LOGEST(Xdata, Ydata, stats) best-fit exponential
curve
• CORREL(array1, array2) correlation coefficient
• PEARSON(array1, array2) Pearson correlation coefficient
• RSQ(array1, array2) square of the Pearson correlation
coefficient
EXCEL functions
• DEVSQ(array) sum of squares of deviations of the data
points from the sample mean
• STEYX(Xdata, Ydata) standard error of the predicted y
for each x in the regression
• TINV(probability, dof) Student's t-distribution
• CONFIDENCE(alpha, std dev, size) confidence
interval for a population mean
• CHITEST(actual range, expected range) test for
independence
• FTEST(array1, array2) tests whether the variances of the
two arrays are significantly different


Example 1

[figures not recoverable]


Example 2
General model:

Mh / (A_CA · D_C · ρs) = a1 (ms/ma)^a2 (v_Ci² / (g D_C))^a3 (dp/D_C)^a4

S.No.  Correlation                                          SSE
1.     Mh / (A_CA D_C ρs) = 0.0129 (ms/ma)^0.70             0.0003105
2.     Mh / (A_CA D_C ρs) = 0.0014 (v_Ci²/(g D_C))^0.24     0.0011567
3.     Mh / (A_CA D_C ρs) = 0.04 (dp/D_C)^(-0.36)           0.0009327


Example 2

S.No.  Correlation                                                                        SSE
1.     Mh / (A_CA D_C ρs) = 0.00096 (ms/ma)^1.00 (v_Ci²/(g D_C))^0.54                     0.00001030
2.     Mh / (A_CA D_C ρs) = 0.013 (ms/ma)^0.70 (dp/D_C)^0.004                             0.00031046
3.     Mh / (A_CA D_C ρs) = 0.00069 (ms/ma)^1.02 (v_Ci²/(g D_C))^0.54 (dp/D_C)^(-0.055)   0.00000892




Example 2
S.No.  Parameter in Eq. (5.4)  Value of the parameter  Standard error of the parameter  Coefficient of variation of the parameter (%)
1.     a1                      9.5739E-4               0.5185E-4                        5.416
2.     a2                      1.0046                  0.0142                           1.412
3.     a3                      0.5365                  0.0105                           1.958

R² = 0.987


Example 3

[figures not recoverable]


Thanks



Example

• Often it is difficult to determine which model is best
simply by looking at the scatter plot. In these
cases, one should find the regression equations
for the two or three most appropriate models, then
plot the data and graph each of the regression
models in the same viewing window, and decide which
model is the best fit by determining which one
follows more of the data points.
