Vision: Transforming lives by Educating for the BEST.
Mission: CSU is committed to transform the lives of people and communities through high-quality instruction and innovative research, development, production and extension.
Core values: Productivity, Compassion, Competence, Accessibility, Accountability, Relevance, Self-discipline, Excellence, Universal adeptness.
Program: BSChE
I. INTRODUCTION
Data are often given for discrete values along a continuum. However, you may
require estimates at points between the discrete values. Curve fitting describes
techniques to fit curves to such data to obtain intermediate estimates. In addition, you may
require a simplified version of a complicated function. One way to do this is to compute
values of the function at a number of discrete values along the range of interest. Then, a
simpler function may be derived to fit these values. Both of these applications are known as
curve fitting.
There are two general approaches for curve fitting that are distinguished from each
other on the basis of the amount of error associated with the data. First, where the data exhibit
a significant degree of error or “scatter,” the strategy is to derive a single curve that
represents the general trend of the data. Because any individual data point may be incorrect,
we make no effort to intersect every point. Rather, the curve is designed to follow the pattern
of the points taken as a group. One approach of this nature is called least-squares regression.
Second, where the data are known to be very precise, the basic approach is to fit a
curve or a series of curves that pass directly through each of the points. Such data usually
originate from tables. Examples are values for the densities of water or for the heat capacity
of gases as a function of temperature. The estimation of values between well-known discrete
points is called interpolation.
Chapter I
LINEAR REGRESSION
I. INTRODUCTION
Linear regression is a common statistical data-analysis technique. It is used to
determine the extent to which there is a linear relationship between a dependent variable
and one or more independent variables. It can be used in business to evaluate trends and
make estimates or forecasts. For example, if a company's sales have increased steadily
every month for the past few years, conducting a linear regression analysis on the sales data,
with monthly sales on the y-axis and time on the x-axis, would produce a line that depicts
the upward trend in sales. In essence, it shows how the variation in the
dependent variable can be captured by change in the independent variables.
The mean of a sample is defined as
ȳ = Σyi / n
and the standard deviation as
sy = sqrt( St / (n − 1) )
where St is the total sum of the squares of the residuals between the data points and the
mean, or
St = Σ(yi − ȳ)²
Thus, if the individual measurements are spread out widely around the mean, St
will be large. If they are grouped tightly, the standard deviation will be small. The
spread can also be represented by the square of the standard deviation, which is called
the variance:
s²y = Σ(yi − ȳ)² / (n − 1)
A more convenient formula for computing the variance is
s²y = [ Σyi² − (Σyi)²/n ] / (n − 1)
A final statistic that has utility in quantifying the spread of data is the coefficient
of variation (c.v.). This statistic is the ratio of the standard deviation to the mean. As
such, it provides a normalized measure of the spread. It is often multiplied by 100 so
that it can be expressed in the form of a percent:
c.v. = (sy / ȳ) × 100%
Another characteristic that bears on the present discussion is the data distribution,
that is, the shape with which the data are spread around the mean. A histogram
provides a simple visual representation of the distribution. A histogram is constructed
by sorting the measurements into intervals, or bins. The units of measurement are
plotted on the abscissa and the frequency of occurrence of each interval is plotted
on the ordinate.
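These summary statistics can be cross-checked with a short script. The sketch below is plain Python rather than MATLAB, and the five-point data set is purely illustrative; it computes the mean, St, the standard deviation, the variance, and the coefficient of variation exactly as defined above.

```python
import math

y = [2.0, 3.5, 1.5, 4.0, 3.0]               # illustrative data set
n = len(y)

ybar = sum(y) / n                            # mean
St = sum((yi - ybar) ** 2 for yi in y)       # total sum of squares about the mean
sy = math.sqrt(St / (n - 1))                 # standard deviation
var = St / (n - 1)                           # variance
cv = sy / ybar * 100                         # coefficient of variation, percent

print(ybar, St, sy, var, cv)
```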
2. Random Numbers And Simulation
In this section, we will describe two MATLAB functions that can be used to
produce sequences of random numbers. The first (rand) generates numbers that are
uniformly distributed; the second (randn) generates numbers that have a normal
distribution.
MATLAB Function: rand
Function rand generates a sequence of numbers that are uniformly distributed
between 0 and 1. A simple representation of its syntax is
r = rand(m, n)
where r = an m-by-n matrix of random numbers. The following formula can be used
to generate a uniform distribution on another interval:
runiform = low + (up − low) * rand(m, n)
where low = the lower bound and up = the upper bound.
MATLAB Function: randn
Function randn generates a sequence of numbers that are normally distributed with a
mean of 0 and a standard deviation of 1. A simple representation of its syntax is
r = randn(m, n)
where r = an m-by-n matrix of random numbers. The following formula can be then
used to generate a normal distribution with a different mean (mn) and standard
deviation (s),
rnormal = mn + s * randn(m, n)
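The two scaling formulas carry over directly to any random-number generator. The sketch below is plain Python rather than MATLAB, and the interval bounds and the mean/standard deviation are illustrative values; it applies the same scalings with the standard random module.

```python
import random

random.seed(1)                       # fix the seed so the run is reproducible

low, up = 2.0, 5.0                   # illustrative bounds for the uniform case
mn, s = 10.0, 0.5                    # illustrative mean and standard deviation

# uniform on [low, up]: low + (up - low) * r, with r uniform on [0, 1)
runiform = [low + (up - low) * random.random() for _ in range(10000)]

# normal with mean mn and standard deviation s: mn + s * z, z standard normal
rnormal = [mn + s * random.gauss(0.0, 1.0) for _ in range(10000)]

print(min(runiform), max(runiform))
print(sum(rnormal) / len(rnormal))
```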
The mathematical expression for the straight line is
y = a0 + a1x + e
where a0 and a1 are coefficients representing the intercept and the slope, respectively,
and e is the error, or residual, between the model and the observations, which can
be represented as
e = y − a0 − a1x
Thus, the residual is the discrepancy between the true value of y and the
approximate value, a0 + a1x, predicted by the linear equation.
The best strategy is to minimize the sum of the squares of the residuals between the
measured y and the y calculated with the linear model:
Sr = Σ ei² = Σ (yi − a0 − a1xi)²    (summations from i = 1 to n)
This criterion, which is called least squares, has a number of advantages, including
that it yields a unique line for a given set of data.
To determine the values of a0 and a1 that minimize Sr, the equation is differentiated
with respect to each unknown coefficient:
∂Sr/∂a0 = −2 Σ (yi − a0 − a1xi)
∂Sr/∂a1 = −2 Σ (yi − a0 − a1xi) xi
Note that we have simplified the summation symbols; unless otherwise indicated, all
summations are from i = 1 to n. Setting these derivatives equal to zero will result in a
minimum Sr. If this is done, the equations can be expressed as
0 = Σyi − Σa0 − Σa1xi
0 = Σxiyi − Σa0xi − Σa1xi²
Now, realizing that Σa0 = na0, we can express these equations as a set of two
simultaneous linear equations with two unknowns, a0 and a1:
na0 + (Σxi)a1 = Σyi
(Σxi)a0 + (Σxi²)a1 = Σxiyi
These are called the normal equations. They can be solved simultaneously for the slope:
a1 = [ n Σxiyi − Σxi Σyi ] / [ n Σxi² − (Σxi)² ]
The intercept then follows as a0 = ȳ − a1x̄, and the spread of the data around the
regression line is quantified by
sy/x = sqrt( Sr / (n − 2) )
where sy/x is called the standard error of the estimate.
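The normal-equation formulas can be sketched in a few lines. The code below is a plain-Python illustration (not the document's M-file), using a small made-up data set; it computes a1, a0, and the standard error sy/x.

```python
import math

x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.1, 7.0, 8.9]            # illustrative, roughly y = 2x + 1

n = len(x)
sx, sy_ = sum(x), sum(y)
sxy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum(xi * xi for xi in x)

a1 = (n * sxy - sx * sy_) / (n * sxx - sx ** 2)   # slope
a0 = sy_ / n - a1 * sx / n                        # intercept, a0 = ybar - a1*xbar

Sr = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))
syx = math.sqrt(Sr / (n - 2))                     # standard error of the estimate

print(a1, a0, syx)
```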
The difference between the two quantities, St − Sr, quantifies the improvement or error
reduction due to describing the data in terms of a straight line rather than as an
average value. Because the magnitude of this quantity is scale-dependent, the
difference is normalized to St to yield
r² = (St − Sr) / St
where r² is called the coefficient of determination. For a perfect fit, Sr = 0 and r² = 1,
signifying that the line explains 100% of the variability of the data. For r² = 0, Sr = St
and the fit represents no improvement. An
alternative formulation for r that is more convenient for computer implementation is
r = [ n Σxiyi − (Σxi)(Σyi) ] / [ sqrt( n Σxi² − (Σxi)² ) · sqrt( n Σyi² − (Σyi)² ) ]
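This computer-oriented formula for r can be evaluated in a single pass over the data. The sketch below is plain Python with an illustrative, nearly linear data set.

```python
import math

x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.1, 7.0, 8.9]    # illustrative, nearly linear data

n = len(x)
sx, sy = sum(x), sum(y)
sxy = sum(a * b for a, b in zip(x, y))
sxx = sum(a * a for a in x)
syy = sum(b * b for b in y)

# correlation coefficient in the single-pass form
r = (n * sxy - sx * sy) / (
    math.sqrt(n * sxx - sx ** 2) * math.sqrt(n * syy - sy ** 2))

print(r)
```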
Many nonlinear models can also be fit to data. One example is the exponential model
y = α1 e^(β1 x)
where α1 and β1 are constant coefficients. This model is used in many fields of
engineering and science. A second example is the power model
y = α2 x^β2
where α2 and β2 are constant coefficients. This model has wide applicability in all
fields of engineering and science.
A third example of a nonlinear model is the saturation-growth-rate equation
y = α3 x / (β3 + x)
where α3 and β3 are constant coefficients. This model, which is particularly well-
suited for characterizing population growth rate under limiting conditions, also
represents a nonlinear relationship between y and x that levels off, or “saturates,” as x
increases. It has many applications, particularly in biologically related areas of both
engineering and science.
III. NUMERICAL ANALYSIS
A. General Algorithm
Step 1: START.
Step 2: Read the data, the values of x and y.
Step 3: Plot the given values.
Step 4: Ask whether the plot appears to be linear.
Step 5: If YES, calculate r, the correlation coefficient:
r = [ n Σ(xiyi) − (Σxi)(Σyi) ] / [ sqrt( n Σxi² − (Σxi)² ) · sqrt( n Σyi² − (Σyi)² ) ]
If NO, perform nonlinear regression analysis.
Step 6: Ask whether r indicates an acceptable linear fit.
Step 7: If YES, calculate a1 = [ n Σxiyi − Σxi Σyi ] / [ n Σxi² − (Σxi)² ], yave, xave,
and a0 = ȳ − a1x̄. If NO, perform nonlinear regression analysis.
Step 8: Display the equation of the line, y = a1x + a0.
Step 9: END.
B. General Flowchart
START → Read the data (x, y) → Plot the data
→ Decision: does the plot appear linear?
  NO → perform nonlinear regression.
  YES → calculate r = [ n Σ(xiyi) − (Σxi)(Σyi) ] / [ sqrt( n Σxi² − (Σxi)² ) · sqrt( n Σyi² − (Σyi)² ) ]
→ Decision: is r equal to, greater than, or less than zero?
  NO → perform nonlinear regression.
  YES → calculate a1 = [ n Σxiyi − Σxi Σyi ] / [ n Σxi² − (Σxi)² ], yave, xave, and a0 = ȳ − a1x̄
→ Display y = a1x + a0 → END
The program starts, and then the program will read the data, which are the values
of x and y. The program will plot the given values. The program will then ask whether
the plot appears to be linear; if not, the program will perform nonlinear regression.
Otherwise, the program will proceed to calculate r, the correlation coefficient. After
calculating r, the program will ask whether r is greater than or less than zero; if not,
the program will perform nonlinear regression. If yes, the program will calculate a1,
the slope of the line; a0, the y-intercept of the line; and yave and xave. The program will
then display the equation of the line that best fits the given set of data, and the
program ends.
C. M-files
clear, clc, clf, format short g
%input how many random numbers
m=1000; n=1;
%bounds of the uniform interval
low=0; up=1;
%uniformly distributed numbers on [low, up]
runiform=low+(up-low)*rand(m,n);
%histogram of the random numbers
hist(runiform),title('Distribution of uniform random numbers')
xlabel('value')
M-file of Linear Least-Squares Regression
x=[]; %input values of x
y=[]; %input values of y
n=length(x)
coef=polyfit(x,y,1);
ycal=polyval(coef,x);
S=sum((ycal-y).^2);
yave=sum(y)/n;
sdev=sum((y-yave).^2);
std=sqrt(S/(n-2));
cor=sqrt(1-S/sdev);
coef1=coef;
%equation of the line
fprintf('y=a+bx,a=%8.5f,b=%8.5f\n',coef(2),coef(1))
%standard error
fprintf('Standard error=%8.4f\n',std)
%Correlation coefficient
fprintf('Correlation coefficient=%8.4f\n',cor)
coef=polyfit(x,y,2);
ycal=polyval(coef,x);
S=sum((ycal-y).^2);
yave=sum(y)/n;
Sdev=sum((y-yave).^2);
std=sqrt(S/(n-3));
cor=sqrt(1-S/Sdev);
%equation of the curve
fprintf('y=a+bx+cx^2,a=%8.5f,b=%8.5f,c=%8.5f\n',coef(3),coef(2),coef(1))
%Standard error
fprintf('Standard error=%8.4f\n',std)
%correlation coefficient
fprintf('Correlation coefficient=%8.4\n',cor)
xmax=max(x);xmin=min(x);
x1=[xmin xmax];
y1=polyval(coef1,x1);
dx=(xmax-xmin)/50;
x2=xmin:dx:xmax;
y2=polyval(coef,x2);
plot(x,y,'o',x1,y1,x2,y2);
xlabel('x');ylabel('y')
clear, clc, clf
x=[];% input values of x
y=[];%input values of y
a=polyfit(x,log(y),1) %coefficient of the polynomial
xp=[0:.1:2.5];
yp=a(1)*xp+a(2);
alpha1=exp(a(2));beta1=a(1);
ype=alpha1*exp(beta1*xp);
subplot(2,1,1)
plot(x,y,'o',xp,ype)
title('y versus x'),ylabel('y')
subplot(2,1,2)
plot(x,log(y),'o',xp,yp)
title('ln(y) versus x'),xlabel('x'),ylabel('ln(y)')
D. Syntax
Syntax Description
polyfit Finds the coefficients of the polynomial p(x) of degree n that fits
the data in a least-squares sense.
IV. CONCLUSION
Linear regression can be used in fields from medicine to business accounting.
It can predict the trend of the data without necessarily matching the individual points.
For example, one would like to know not just whether patients have high blood
pressure, but also whether the likelihood of having high blood pressure is influenced
by factors such as age and weight. The variable to explain is called the dependent
variable or, alternatively, the response variable. The variables that explain it are
called the independent variables or predictor variables. In business, linear regression
can be used to predict the flow of sales.
Sample Problems:
1. For the data set given in the M-file below, determine (a) the mean, (b) the median,
(c) the mode, (d) the range, (e) the standard deviation, (f) the variance, and (g) the
coefficient of variation.
A. Computational method
a. Mean
ȳ = Σyi / n = 40.61 / 25 = 1.6244
b. Median
(n + 1)/2 = (25 + 1)/2 = 13; the 13th value of the sorted data is 1.65, so the median is 1.65.
c. Mode is 1.35 and 1.47.
d. Range
range = yi,largest − yi,smallest = 2.29 − 0.90 = 1.39
e. Standard deviation
sy = sqrt( Σ(yi − ȳ)² / (n − 1) ) = sqrt( 2.774892 / (25 − 1) ) = 0.3400
f. Variance
s²y = St / (n − 1) = 2.774892 / (25 − 1) = 0.1156
g. Coefficient of variation
c.v. = (sy / ȳ) × 100% = (0.3400 / 1.6244) × 100% = 20.93%
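The hand computation can be cross-checked with a short script. The sketch below is plain Python (the document's own M-file follows), using the same 25 values; it recomputes the mean, median, range, standard deviation, variance, and coefficient of variation.

```python
import math
import statistics

y = [0.9, 1.32, 1.96, 1.85, 2.29, 1.42, 1.35, 1.47, 1.74, 1.82, 1.30, 1.47,
     1.92, 1.65, 2.06, 1.55, 1.95, 1.35, 1.78, 2.14, 1.63, 1.66, 1.05, 1.71,
     1.27]

mean = sum(y) / len(y)
median = statistics.median(y)        # middle value of the sorted data
rng = max(y) - min(y)
var = statistics.variance(y)         # sample variance, divisor n - 1
sdev = math.sqrt(var)
cv = sdev / mean * 100               # coefficient of variation, percent

print(mean, median, rng, sdev, var, cv)
```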
B. M-file
y=[0.9; 1.32; 1.96; 1.85; 2.29; 1.42; 1.35; 1.47; 1.74; 1.82; 1.30; 1.47;
1.92; 1.65; 2.06; 1.55; 1.95; 1.35; 1.78; 2.14; 1.63; 1.66; 1.05; 1.71;
1.27];
c=std(y)/mean(y)*100; %coefficient of variation in percent
fprintf('total of the data=%8.4f\n',sum(y)) %total of the data
fprintf('mean=%8.4f\n',mean(y)) %mean
fprintf('median=%8.4f\n',median(y)) %median
fprintf('mode=%8.4f\n',mode(y)) %mode
fprintf('range=%8.4f\n',range(y)) %range
fprintf('std=%8.4f\n',std(y)) %Standard deviation
fprintf('var=%8.4f\n',var(y)) %variance
fprintf('Coefficient of variation=%8.4f\n',c) %coefficient of variation
hist(y)
C. MATLAB Output
The output is a histogram of the 25 data values (figure not reproduced).
2. Use least-squares regression to fit a straight line to
X 0 2 4 6 9 11 12 15 17 19
Y 5 6 7 6 9 8 8 10 12 12
Along with the slope and intercept, compute the standard error of the estimate and the
correlation coefficient. Plot the data and the regression line.
A. M-file
x=[0 2 4 6 9 11 12 15 17 19]; %independent variable
y=[5 6 7 6 9 8 8 10 12 12]; %dependent variable
n=length(x)
coef=polyfit(x,y,1);
ycal=polyval(coef,x);
S=sum((ycal-y).^2);
yave=sum(y)/n;
sdev=sum((y-yave).^2);
std=sqrt(S/(n-2));
cor=sqrt(1-S/sdev);
coef1=coef;
%equation of the line
fprintf('y=a+bx,a=%8.5f,b=%8.5f\n',coef(2),coef(1))
%standard error
fprintf('Standard error=%8.4f\n',std)
%Correlation coefficient
fprintf('Correlation coefficient=%8.4f\n',cor)
coef=polyfit(x,y,2);
ycal=polyval(coef,x);
S=sum((ycal-y).^2);
yave=sum(y)/n;
Sdev=sum((y-yave).^2);
std=sqrt(S/(n-3));
cor=sqrt(1-S/Sdev);
%equation of the curve
fprintf('y=a+bx+cx^2,a=%8.5f,b=%8.5f,c=%8.5f\n',coef(3),coef(2),coef(1))
%Standard error
fprintf('Standard error=%8.4f\n',std)
%correlation coefficient
fprintf('Correlation coefficient=%8.4f\n',cor)
xmax=max(x);xmin=min(x);
x1=[xmin xmax];
y1=polyval(coef1,x1);
dx=(xmax-xmin)/50;
x2=xmin:dx:xmax;
y2=polyval(coef,x2);
plot(x,y,'o',x1,y1,x2,y2);
xlabel('x');ylabel('y')
B. MATLAB Output
3. Perform the same computation as in Example 14.2, but in addition to the drag
coefficient, also vary the mass uniformly by ±10%.
Example 14.2: If the initial velocity is zero, the downward velocity of the free-falling
bungee jumper can be predicted with the following analytical solution (Eq. 1.9):
v = sqrt( g m / cd ) · tanh( sqrt( g cd / m ) · t )
Suppose that g = 9.81 m/s² and m = 68.1 kg, but cd is not known precisely. For example,
you might know that it varies uniformly between 0.225 and 0.275 (i.e., ±10% around a
mean value of 0.25 kg/m). Use the rand function to generate 1000 random uniformly
distributed values of cd and then employ these values along with the analytical solution to
compute the resulting distribution of velocities at t = 4s.
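The same Monte Carlo experiment can be sketched in plain Python (the document's MATLAB version follows); random.uniform plays the role of the low + (up − low)*rand scaling, and the velocities come from the analytical solution above.

```python
import math
import random

random.seed(1)                       # reproducible run

g, m, t = 9.81, 68.1, 4.0            # gravity, mass (kg), time (s)
cd_mean = 0.25
cd_lo, cd_hi = 0.9 * cd_mean, 1.1 * cd_mean   # +/-10% around the mean

v = []
for _ in range(1000):
    cd = random.uniform(cd_lo, cd_hi)         # uniformly distributed drag
    v.append(math.sqrt(g * m / cd) * math.tanh(math.sqrt(g * cd / m) * t))

mean_v = sum(v) / len(v)
print(mean_v)
```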
A. M-file
clear, clc, clf, format shortg
%n is the number of values that will show
n=10;t=4;g=9.8;
%equations of minimum and maximum drag coefficients
cd=0.25;cdmin=cd-0.1*cd,cdmax=cd+0.1*cd
r=rand(n,1)
%random numbers
cdrand=cdmin+(cdmax-cdmin)*r;
%mean of the drag coefficients
meancd=mean(cdrand),stdcd=std(cdrand)
%change in the drag coefficients
Deltacd=(max(cdrand)-min(cdrand))/meancd/2*100.
subplot(3,1,1)
%histogram of the drag coeficients
hist(cdrand),title('(a) Distibution of drag')
xlabel('cd(kg/m)')
%equation of the minimum and maximum jumper's mass
m=68.1;mmin=m-0.1*m,mmax=m+0.1*m
r=rand(n,1);
%jumper's mass random numbers equation
mrand=mmin+(mmax-mmin)*r;
meanm=mean(mrand),stdm=std(mrand)
%change in the jumper's mass
Deltam=(max(mrand)-min(mrand))/meanm/2*100.
subplot(3,1,2)
%histogram of the jumper's mass
hist(mrand),title('(b) Distibution of mas')
xlabel('m(kg)')
B. Output
4. Perform the same computation as in Example 14.3, but in addition to the drag coefficient,
also vary the mass normally around its mean value with a coefficient of variation of
5.7887%.
Example 14.3: Analyze the same case as in Example 14.2, but rather than employing a
uniform distribution, generate normally distributed drag coefficients with a mean of 0.25
and a standard deviation of 0.01443.
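For the normally distributed case, random.gauss supplies the mn + s*randn scaling directly. The sketch below is plain Python (the document's MATLAB version follows); it generates the drag coefficients and checks their coefficient of variation against the target of roughly 5.77%.

```python
import math
import random

random.seed(1)                       # reproducible run

n = 1000
cd_mean, cd_std = 0.25, 0.01443      # mean and standard deviation of cd

cd = [random.gauss(cd_mean, cd_std) for _ in range(n)]

mean_cd = sum(cd) / n
var_cd = sum((c - mean_cd) ** 2 for c in cd) / (n - 1)
cv_cd = math.sqrt(var_cd) / mean_cd * 100    # coefficient of variation, percent

print(mean_cd, cv_cd)
```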
A. M-file
clear, clc, clf, format short g
n=1000;t=4;g=9.81;
cd=0.25;stdev=0.01443;
r=randn(n,1);
%equation random numbers
cdrand=cd+stdev*r;
%mean of the drag coefficient
meancd=mean(cdrand),stdevcd=std(cdrand)
cvcd=stdevcd/meancd*100.
subplot(3,1,1)
%histogram of the drag coefficients
hist(cdrand),title('(a) Distribution of drag')
xlabel('cd(kg/m)')
m=68.1;stdev=0.057887*m;
r=randn(n,1);
%normally distributed random masses
mrand=m+stdev*r;
%mean of the masses
meanm=mean(mrand),stdevm=std(mrand)
cvm=stdevm/meanm*100.
subplot(3,1,2)
%histogram of the masses
hist(mrand),title('(b) Distribution of mass')
xlabel('m(kg)')
%equation of the random numbers
vrand=sqrt(g*mrand./cdrand).*tanh(sqrt(g*cdrand./mrand)*t);
%mean of the velocities
meanv=mean(vrand),stdevv=std(vrand)
cvv=stdevv/meanv*100.
subplot(3,1,3)
%histogram of the velocities
hist(vrand),title('(c) Distribution of velocity')
xlabel('v(m/s)')
B. Output
5. Fit an exponential model to the following data:
X 0.4 0.8 1.2 1.6 2 2.3
Y 800 985 1490 1950 2850 3600
Plot the data and the equation on both standard and semi-logarithmic graphs with the
MATLAB subplot function.
A. M-file
clear, clc, clf
x=[0.4 0.8 1.2 1.6 2 2.3];
y=[800 985 1490 1950 2850 3600];
%coefficient of the polynomial
a=polyfit(x,log(y),1)
xp=[0:.1:2.5];
yp=a(1)*xp+a(2);
alpha1=exp(a(2));beta1=a(1);
ype=alpha1*exp(beta1*xp);
subplot(2,1,1)
plot(x,y,'o',xp,ype)
title('y versus x'),ylabel('y')
subplot(2,1,2)
plot(x,log(y),'o',xp,yp)
title('ln(y) versus x'),xlabel('x'),ylabel('ln(y)')
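The linearization that the M-file performs — fitting a straight line to ln(y) versus x and recovering α = e^intercept and β = slope — can be sketched in plain Python with the same data:

```python
import math

x = [0.4, 0.8, 1.2, 1.6, 2.0, 2.3]
y = [800, 985, 1490, 1950, 2850, 3600]

# least-squares line through (x, ln y)
ly = [math.log(v) for v in y]
n = len(x)
sx, sl = sum(x), sum(ly)
sxy = sum(a * b for a, b in zip(x, ly))
sxx = sum(a * a for a in x)

beta = (n * sxy - sx * sl) / (n * sxx - sx ** 2)   # slope -> exponent
alpha = math.exp(sl / n - beta * sx / n)           # e^intercept -> coefficient

print(alpha, beta)        # model: y = alpha * exp(beta * x)
```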
B. Output
Chapter II
GENERAL LINEAR LEAST-SQUARES AND NONLINEAR REGRESSION
I. INTRODUCTION
General linear least squares is a technique for estimating the unknown parameters in a
linear regression model. General linear least squares can be used to perform linear regression
when there is a certain degree of correlation between the residuals in a regression model.
Nonlinear regression is a form of regression analysis in which observational data are modeled
by a function that is a nonlinear combination of the model parameters and depends on one or
more independent variables. The data are fitted by a method of successive approximations.
Consider, for example, fitting a second-order polynomial, y = a0 + a1x + a2x² + e. In
this case, the sum of the squares of the residuals is
Sr = Σ (yi − a0 − a1xi − a2xi²)²    (summation from i = 1 to n)
To generate the least-squares fit, we take the derivative of the sum of the squares
of the residuals with respect to each of the unknown coefficients of the polynomial, as in
∂Sr/∂a0 = −2 Σ (yi − a0 − a1xi − a2xi²)
∂Sr/∂a1 = −2 Σ xi (yi − a0 − a1xi − a2xi²)
∂Sr/∂a2 = −2 Σ xi² (yi − a0 − a1xi − a2xi²)
These equations can be set equal to zero and rearranged to develop the following set of
normal equations:
n a0 + (Σxi)a1 + (Σxi²)a2 = Σyi
(Σxi)a0 + (Σxi²)a1 + (Σxi³)a2 = Σxiyi
(Σxi²)a0 + (Σxi³)a1 + (Σxi⁴)a2 = Σxi²yi
where all summations are from i = 1 through n. Note that the preceding three equations
are linear and have three unknowns: a0, a1, and a2. The coefficients of the unknowns can
be calculated directly from the observed data.
For this case, we see that the problem of determining a least-squares second-order
polynomial is equivalent to solving a system of three simultaneous linear equations. The
two-dimensional case can be easily extended to an mth-order polynomial as in
y = a0 + a1x + a2x² + · · · + amx^m + e
The foregoing analysis can be easily extended to this more general case. Thus, we can
recognize that determining the coefficients of an mth-order polynomial is equivalent to
solving a system of m+1 simultaneous linear equations. For this case, the standard error is
formulated as
sy/x = sqrt( Sr / (n − (m + 1)) )
Another useful extension of linear least squares is the case where y is a linear function
of two or more independent variables, for example,
y = a0 + a1x1 + a2x2 + e
As with the previous cases, the “best” values of the coefficients are determined by
formulating the sum of the squares of the residuals:
Sr = Σ (yi − a0 − a1x1,i − a2x2,i)²    (summation from i = 1 to n)
∂Sr/∂a0 = −2 Σ (yi − a0 − a1x1,i − a2x2,i)
∂Sr/∂a1 = −2 Σ x1,i (yi − a0 − a1x1,i − a2x2,i)
∂Sr/∂a2 = −2 Σ x2,i (yi − a0 − a1x1,i − a2x2,i)
The coefficients yielding the minimum sum of the squares of the residuals are
obtained by setting the partial derivatives equal to zero and expressing the result in matrix
form as
[ n          Σx1,i          Σx2,i        ] [a0]   [ Σyi       ]
[ Σx1,i      Σx1,i²         Σx1,i x2,i   ] [a1] = [ Σx1,i yi  ]
[ Σx2,i      Σx1,i x2,i     Σx2,i²       ] [a2]   [ Σx2,i yi  ]
and the standard error is again
sy/x = sqrt( Sr / (n − (m + 1)) )
All of the preceding cases are special instances of the general linear least-squares model
y = a0z0 + a1z1 + a2z2 + · · · + amzm + e
where z0, z1, ..., zm are m+1 basis functions. It can easily be seen how simple linear and
multiple linear regression fall within this model—that is, z0 = 1, z1 = x1, z2 = x2, ..., zm = xm.
Further, polynomial regression is also included if the basis functions are simple
monomials, as in z0 = 1, z1 = x, z2 = x², ..., zm = x^m.
Note that the terminology “linear” refers only to the model's dependence on its
parameters—that is, the a's. As in the case of polynomial regression, the functions
themselves can be highly nonlinear. For example, the z's can be sinusoids, as in
y = a0 + a1 cos(ωt) + a2 sin(ωt)
The coefficient of determination and the standard error can also be formulated in
terms of matrix algebra. Recall that r2 is defined as
r² = (St − Sr) / St
where ŷi = the prediction of the least-squares fit. The residuals between the best-fit
curve and the data, yi − ŷi, can be expressed in vector form as
{y} − [Z]{a}
4. Nonlinear regression
There are many cases in engineering and science where nonlinear models must be fit
to data. In the present context, these models are defined as those that have a nonlinear
dependence on their parameters. For example,
y = a0(1 − e^(−a1x)) + e
As with linear least squares, nonlinear regression is based on determining the values
of the parameters that minimize the sum of the squares of the residuals. However, for the
nonlinear case, the solution must proceed in an iterative fashion. There are techniques
expressly designed for nonlinear regression. For example, the Gauss-Newton method uses
a Taylor series expansion to express the original nonlinear equation in an approximate,
linear form. Then least-squares theory can be used to obtain new estimates of the
parameters that move in the direction of minimizing the residual.
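As a concrete illustration of this iterative approach, the sketch below (plain Python; the data are generated from the model itself with assumed true values a0 = 2 and a1 = 1) applies the Gauss-Newton method to y = a0(1 − e^(−a1x)). Each pass linearizes the model with the partial derivatives ∂f/∂a0 = 1 − e^(−a1x) and ∂f/∂a1 = a0·x·e^(−a1x) and solves the resulting 2×2 normal equations for the parameter update.

```python
import math

# synthetic, noise-free data from y = 2*(1 - exp(-1*x))
xs = [0.25 * k for k in range(1, 13)]
ys = [2.0 * (1.0 - math.exp(-1.0 * x)) for x in xs]

a0, a1 = 1.5, 0.8                    # initial guesses
for _ in range(20):                  # Gauss-Newton iterations
    # residuals and Jacobian columns at the current parameters
    r = [y - a0 * (1.0 - math.exp(-a1 * x)) for x, y in zip(xs, ys)]
    j0 = [1.0 - math.exp(-a1 * x) for x in xs]        # d f / d a0
    j1 = [a0 * x * math.exp(-a1 * x) for x in xs]     # d f / d a1
    # normal equations (J^T J) d = J^T r, solved by Cramer's rule
    A = sum(v * v for v in j0)
    B = sum(u * v for u, v in zip(j0, j1))
    C = sum(v * v for v in j1)
    b0 = sum(u * v for u, v in zip(j0, r))
    b1 = sum(u * v for u, v in zip(j1, r))
    det = A * C - B * B
    a0 += (b0 * C - B * b1) / det    # update the parameters
    a1 += (A * b1 - B * b0) / det

print(a0, a1)
```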
III. NUMERICAL ANALYSIS
A. General algorithm
Step 1: START.
Step 2: Read the data, the values of x and y.
Step 3: Plot the given data.
Step 4: Ask whether the plot appears to be nonlinear. If NO, perform linear regression analysis.
Step 5: If YES, calculate r², the coefficient of determination.
Step 6: Ask whether r² indicates an acceptable fit. If NO, perform linear regression analysis.
Step 7: If YES, calculate the coefficients a0, a1, and a2.
Step 8: Display the equation y = a0 + a1x + a2x².
Step 9: END.
B. General flow chart
Start → Read the data → Plot the data
→ Decision: does the plot appear to be nonlinear?
  No → perform linear regression.
  Yes → calculate r² = [ Σ(yi − ȳ)² − Σ(yi − yp)² ] / Σ(yi − ȳ)²
→ Decision: is r² equal to, greater than, or less than zero?
  No → perform linear regression.
  Yes → Display y = a0 + a1x + a2x² → End
The program starts and then reads the data; if no values are supplied, yi = 0, ai = 0, xi = 0.
The program will plot the given data and ask whether the plot appears to be nonlinear;
if No, the program will perform linear regression. Otherwise the program will calculate
r², the coefficient of determination, and proceed to evaluate its value. The program will
then ask whether r² is greater than or less than zero; if No, the program will perform
linear regression. If Yes, the program will calculate the coefficients a0, a1, and a2, and
then display the equation of the curve, y = a0 + a1x + a2x². Then the program ends.
C. SYNTAX
Function Description
plot(x, y) This MATLAB function plots the values of x versus y.
; Separates rows in a matrix and suppresses display of a statement's output.
, Separates elements within a row of a matrix.
'o' Marker style that plots the data points as circles.
% Marks the rest of the line as a comment.
IV. CONCLUSION
General linear least-squares and nonlinear regression can be used in many
fields of application. Consider, for example, an experiment to evaluate the rate of
absorption and the rate of removal of a certain drug administered intravenously to
human subjects, in which the investigator collects blood samples at half-hour intervals
for six hours. By using nonlinear regression, one can estimate the number of hours
required for the body to eliminate enough of the drug so that the average drug
concentration in the bloodstream falls below 1 microgram/deciliter, and by graphing
the given data one can verify the answer.
Sample problems:
1. Fit a parabola to the data from Table 14.1. Determine the r2 for the fit and
comment on the efficacy of the result.
Table 14.1
v, m/s 10 20 30 40 50 60 70 80
F, N 25 70 380 550 610 1220 830 1450
A. Computational method
i x y x² x³ x⁴ xy x²y
1 10 25 100 1000 10000 250 2500
2 20 70 400 8000 160000 1400 28000
3 30 380 900 27000 810000 11400 342000
4 40 550 1600 64000 2560000 22000 880000
5 50 610 2500 125000 6250000 30500 1525000
6 60 1220 3600 216000 12960000 73200 4392000
7 70 830 4900 343000 24010000 58200 4067000
8 80 1450 6400 512000 40960000 116000 9280000
∑ 360 5135 20400 1296000 87720000 312850 20516500
The normal equations are
n a0 + (Σxi)a1 + (Σxi²)a2 = Σyi
(Σxi)a0 + (Σxi²)a1 + (Σxi³)a2 = Σxiyi
(Σxi²)a0 + (Σxi³)a1 + (Σxi⁴)a2 = Σxi²yi
Substituting the tabulated sums gives, in matrix form:
[ 8      360      20400    ] [a0]   [ 5135     ]
[ 360    20400    1296000  ] [a1] = [ 312850   ]
[ 20400  1296000  87720000 ] [a2]   [ 20516500 ]
B. M-file
Input:
v=[10 20 30 40 50 60 70 80]';
F=[25 70 380 550 610 1220 830 1450]';
coef=polyfit(v,F,2); %fit a second-order polynomial
Fcal=polyval(coef,v);
r2=1-sum((F-Fcal).^2)/sum((F-mean(F)).^2)
%plot the data points and the fitted curve
plot(v,F,'o',v,Fcal),grid on
Output
y = −178.4821 + 16.1220x + 0.0372x²
Predicted values at x = 10, 20, 30, 40, 50, 60, 70, and 80 are approximately
−13.5, 158.8, 338.7, 525.9, 720.6, 922.8, 1132.4, and 1349.4.
ȳ = (25 + 70 + 380 + 550 + 610 + 1220 + 830 + 1450)/8 = 641.875
r² is defined as
r² = [ Σ(yi − ȳ)² − Σ(yi − yp)² ] / Σ(yi − ȳ)²
r² = (1808297 − 213790) / 1808297 = 0.8818
so the parabola explains about 88% of the variability of the data.
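The normal-equation solution and r² can be verified with a short script. The sketch below is plain Python; it solves the 3×3 system assembled from the tabulated sums by Gaussian elimination and then evaluates r² against the data.

```python
# normal equations assembled from the tabulated sums
A = [[8.0, 360.0, 20400.0],
     [360.0, 20400.0, 1296000.0],
     [20400.0, 1296000.0, 87720000.0]]
b = [5135.0, 312850.0, 20516500.0]

# Gaussian elimination without pivoting (fine for this well-posed system)
for k in range(3):
    for i in range(k + 1, 3):
        f = A[i][k] / A[k][k]
        for j in range(k, 3):
            A[i][j] -= f * A[k][j]
        b[i] -= f * b[k]
a2 = b[2] / A[2][2]
a1 = (b[1] - A[1][2] * a2) / A[1][1]
a0 = (b[0] - A[0][1] * a1 - A[0][2] * a2) / A[0][0]

v = [10, 20, 30, 40, 50, 60, 70, 80]
F = [25, 70, 380, 550, 610, 1220, 830, 1450]
Fbar = sum(F) / len(F)
St = sum((f - Fbar) ** 2 for f in F)
Sr = sum((f - (a0 + a1 * x + a2 * x * x)) ** 2 for x, f in zip(v, F))
r2 = (St - Sr) / St
print(a0, a1, a2, r2)
```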
2. Use the following set of pressure-volume data to find the best possible virial constants
(A1 and A2) for the equation of state shown below. R =82.05 mL atm/gmol K and T =303
K.
PV/(RT) = 1 + A1/V + A2/V²
A. Computational method
Starting from
PV/(RT) = 1 + A1/V + A2/V²
rearrange to
A1/V + A2/V² = PV/(RT) − 1 ; let R = 82.05 mL·atm/(gmol·K) and T = 303 K
so that RT = (82.05)(303) = 24861 and
A1/V + A2/V² = PV/24861 − 1
Writing this for each data point gives the overdetermined system
[V]{A} = {K}
where each row of [V] is [1/V, 1/V²], {A} = [A1; A2], and K = PV/24861 − 1.
B. M-file
Input
p=[0.985 1.108 1.363 1.631];
v=[25000 22200 18000 15000];
for i=1:4
V(i,1)=v(i).^-1;
V(i,2)=v(i).^-2;
end
%equation of constant K
K=[p.*v./24861-1]';
disp(['V='])
disp(V)
disp(['K='])
disp(K)
%equation to get the virial constants
A=(V'*V)\(V'*K)
%to locate (a curve) by plotted points
plot(p,v, 'o'),grid on
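The overdetermined system [V]{A} = {K} is solved in the M-file through the normal equations, A = (VᵀV)⁻¹VᵀK. The sketch below is plain Python; it builds the same system from the data and solves the 2×2 normal equations by Cramer's rule.

```python
p = [0.985, 1.108, 1.363, 1.631]
v = [25000.0, 22200.0, 18000.0, 15000.0]

# right-hand side K = PV/RT - 1, with RT = 82.05 * 303 rounded to 24861
K = [pi * vi / 24861.0 - 1.0 for pi, vi in zip(p, v)]

# columns of the design matrix: 1/V and 1/V^2
c1 = [1.0 / vi for vi in v]
c2 = [1.0 / vi ** 2 for vi in v]

# 2x2 normal equations (V'V) A = V'K, solved by Cramer's rule
s11 = sum(a * a for a in c1)
s12 = sum(a * b for a, b in zip(c1, c2))
s22 = sum(b * b for b in c2)
t1 = sum(a * k for a, k in zip(c1, K))
t2 = sum(b * k for b, k in zip(c2, K))
det = s11 * s22 - s12 * s12
A1 = (t1 * s22 - s12 * t2) / det     # first virial constant
A2 = (s11 * t2 - s12 * t1) / det     # second virial constant

print(A1, A2)
```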
Output
C. Graph
Chapter III
FOURIER ANALYSIS
I. INTRODUCTION
Fourier analysis deals with both the time and the frequency domains. As a general
overview of Fourier approximation, an important aspect will be to familiarize you with
the frequency domain. Periodic data can be analyzed by several methods: least-squares
fitting with sinusoidal functions, the Fourier series, and the discrete Fourier transform.
A periodic function is one for which
f (t) = f (t + T ) (16.1)
where T is a constant called the period that is the smallest value of time for which Eq. (16.1)
holds.
In this discussion, we will use the term sinusoid to represent any waveform that can
be described as a sine or cosine. There is no clear-cut convention for choosing either
function, and in any case, the results will be identical because the two functions are simply
offset in time by π/2 radians. For this chapter, we will use the cosine, which can be expressed
generally as
f(t) = A0 + C1 cos(w0t + θ) (16.2)
where,
The mean value A0 sets the average height above the abscissa.
The amplitude C1 specifies the height of the oscillation.
The angular frequency w0 characterizes how often the cycles occur.
The phase angle (or phase shift) θ parameterizes the extent to which the sinusoid is
shifted horizontally.
Note that the angular frequency (in radians/time) is related to the ordinary frequency
f (in cycles/time) by
w0 = 2π f (16.3)
where the frequency f is
f = 1/T
In addition, the phase angle represents the distance in radians from t = 0 to the point
at which the cosine function begins a new cycle. A negative value is referred to as a lagging
phase angle because the curve cos(w0t − θ) begins a new cycle θ radians after cos(w0t). Thus,
cos(w0t − θ) is said to lag cos(w0t).
Although Eq. (16.2) is an adequate mathematical characterization of a sinusoid, it is
awkward to work with from the standpoint of curve fitting because the phase shift is included
in the argument of the cosine function. This deficiency can be overcome by invoking the
trigonometric identity:
C1 cos(w0t + θ) = C1[cos(w0t) cos(θ) − sin(w0t) sin(θ)] (16.4)
Substituting Eq. (16.4) into Eq. (16.2):
y = A0 + A1 cos(w0t) + B1 sin(w0t)
where
A1 = C1 cos(θ),  B1 = −C1 sin(θ)
θ = arctan(−B1 / A1),  C1 = sqrt(A1² + B1²)
In a least-squares fit of a sinusoid, we can find the values of A0, A1, and B1 from a matrix
equation of the form
[ N              Σcos(w0t)            Σsin(w0t)          ] [A0]   [ Σy           ]
[ Σcos(w0t)      Σcos²(w0t)           Σcos(w0t)sin(w0t)  ] [A1] = [ Σy cos(w0t)  ]
[ Σsin(w0t)      Σcos(w0t)sin(w0t)    Σsin²(w0t)         ] [B1]   [ Σy sin(w0t)  ]
where N is the number of data points.
In finding the discrete Fourier transform with the fast Fourier transform, we need the
sampling frequency fx, the total record length tn, and the maximum and minimum
frequencies fmax and fmin:
fx = 1/Δt : sampling frequency for the given time increment
tn = N/fx : total length of the record
A. Algorithm :
Step 1: Start
Step 2: Declare the variables w0 and N
Step 3: Compute for A0, A1, B1 and C1
Step 4: Get the equation of a sinusoidal
Step 5: Display the obtained values and equation
Step 6: End
B. Flow Chart:
To start, an oval shape indicates the starting point. A hexagonal shape is used to
indicate the variables needed (W0, t, N, ph). Next, a rhombic shape indicates the
values of the input variables. Then, rectangular shapes indicate the computations
to be performed and are also used to display the results of the computation.
Lastly, an oval shape ends the process.
Start → initialize W0 = 0, N = 0, t = 0, ph = 0 → Input W0, ph, t, and N
→ compute A0, A1, B1, and C1 → display the obtained values and equation → End
A. Algorithm:
Step 1: Start
Step 2: Declare the variables N and dt
Step 3: Compute for fx, tn, fmax and fmin
Step 4: Print the values of fx, tn, fmax and fmin
Step 5: Plot the obtained values of the DFT
Step 6: End
B. Flow Chart:
First, an oval shape indicates the starting point. A hexagonal shape is used to
indicate the variables needed (N and dt). Next, a rhombic shape indicates the values of
the input variables. Then, rectangular shapes indicate the computations to be performed
and are also used to display the results of the computation. Lastly, an oval shape ends
the process.
Start → initialize N = 0, dt = 0 → Input N and dt →
fx = 1/Δt ; tn = N/fx ; fmax = 0.5fx ; fmin = 1/tn
→ End
C. Syntax:
Function Description
[L,U] = lu(A) This MATLAB function returns a lower triangular matrix L and an upper
triangular matrix U such that A = L*U (L may be a permuted lower triangular matrix).
sum(A) This MATLAB function returns the sum along different dimensions of an array.
plot(x,y) This MATLAB function plots the values of x versus y.
; Separates rows in a matrix and suppresses display of a statement's output.
, Separates elements within a row of a matrix.
sqrt This function computes the square root.
% Marks the rest of the line as a comment.
IV. APPLICATION
Sample Problems:
Problem no. 1
The pH in a reactor varies sinusoidally over the course of a day. Use least-squares regression
to fit Eq. (16.11) to the following data. Use your fit to determine the mean, amplitude, and
time of maximum pH. Note that the period is 24 h.
Time t (hr): 0 2 4 5 7 9 12 15 20 22 24
pH: 7.6 7.2 7 6.5 7.5 7.2 8.9 9.1 8.9 7.9 7
A. Computational method:
t (hr) | pH | Ycos(W0t) | Ysin(W0t) | Cos(W0t) | Sin(W0t) | Cos(W0t)·Sin(W0t) | Cos²(W0t) | Sin²(W0t)
0 7.6 7.6 0 1.0 0 0 1.0 0
2 7.2 6.23538 3.6 0.86603 0.5 0.43301 0.75 0.25
4 7 3.5 6.06218 0.5 0.86603 0.43301 0.25 0.75
5 6.5 1.68232 6.27852 0.258852 0.96593 0.25 0.06699 0.93301
7 7.5 -1.94114 7.24444 -0.258852 0.96593 -0.25 0.06699 0.93301
9 7.2 -5.09117 5.09117 -0.70711 0.70711 -0.5 0.5 0.5
12 8.9 -8.9 0 -1.0 0 0 1.0 0
15 9.1 -6.43467 -6.43467 -0.70711 -0.70711 0.5 0.5 0.5
20 8.9 4.45 -7.70763 0.5 -0.86603 -0.43301 0.25 0.75
22 7.9 6.8416 -3.95 0.86603 -0.5 -0.43301 0.75 0.25
24 7 7 0 1.0 0 0 1.0 0
=84.8 =14.94232 =10.18401 =2.31784 =1.93185 0 =6.13397 =4.86603
w0 = 2π/24 = 0.261799 rad/hr
Phase angle (taking the quadrant into account, since A1 < 0):
θ = arctan(−B1/A1) = 2.0705 rad = 2.0705 × 24/(2π) = 7.9087 hrs
Time of maximum pH:
tmax = 24 − 7.9087 = 16.09 hrs
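The 3×3 system assembled from the table sums can be checked independently. The sketch below is plain Python (the document's M-file uses lu for the same purpose); it solves the system by Gaussian elimination and recovers the amplitude and phase from A1 and B1.

```python
import math

# normal equations from the tabulated sums (N = 11 data points)
A = [[11.0, 2.31784, 1.93185],
     [2.31784, 6.13397, 0.0],
     [1.93185, 0.0, 4.86603]]
b = [84.8, 14.94232, 10.18401]

# Gaussian elimination without pivoting
for k in range(3):
    for i in range(k + 1, 3):
        f = A[i][k] / A[k][k]
        for j in range(k, 3):
            A[i][j] -= f * A[k][j]
        b[i] -= f * b[k]
B1 = b[2] / A[2][2]
A1 = (b[1] - A[1][2] * B1) / A[1][1]
A0 = (b[0] - A[0][1] * A1 - A[0][2] * B1) / A[0][0]

C1 = math.sqrt(A1 ** 2 + B1 ** 2)    # amplitude
theta = math.atan2(-B1, A1)          # phase angle, rad (quadrant-aware)
print(A0, A1, B1, C1, theta)
```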
B. M-file:
Input:
%normal equations assembled from the tabulated sums
A=[11 2.31784 1.93185; 2.31784 6.13397 0; 1.93185 0 4.86603];
b=[84.8; 14.94232; 10.18401];
[L,U]=lu(A);
d=L\b;
x=U\d
A1 = -0.5972;
B1 = -1.0939;
% Q is the Phase angle
Q = 7.9087;
% To compute for the amplitude
Amplitude = sqrt(A1^2+B1^2)
tmax = 24-Q
Output:
Input:
w0=2*pi/24;
t=[0 2 4 5 7 9 12 15 20 22 24]';
ph=[7.6 7.2 7 6.5 7.5 7.2 8.9 9.1 8.9 7.9 7]';
Z=[ones(size(t)) cos(w0*t) sin(w0*t)]; %basis functions
a=(Z'*Z)\(Z'*ph);
Sr= sum((ph-Z*a).^2);
syx=sqrt(Sr/(length(t)-length(a)));
tp=(0:24);
php=a(1)+a(2)*cos(w0*tp)+a(3)*sin(w0*tp);
plot(t,ph,'bo',tp,php,'r-')
title('Time vs pH')
xlabel('Time(hours)')
ylabel('pH')
grid on
axis tight
mean=a(1);
theta=atan2(-a(3),a(2))*24/(2*pi);
amplitude=sqrt(a(2)^2+a(3)^2);
time_max_ph=24-theta;
Output:
Plot of the data and the least-squares sinusoid, titled “Time vs pH” (x-axis: Time (hours); y-axis: pH).
Sample problem 2:
Duplicate Example 16.3, but for 64 points sampled at a rate of dt = 0.01 s from the function
f(t) = cos[2π(12.5)t] + cos[2π(25)t]
Use fft to generate a DFT of these values and plot the results.
A. Computational method:
N = 64
dt = 0.01s
therefore,
fx = 1/0.01 = 100 Hz
tn = 64/100 = 0.64 s
fmax = 0.5(100) = 50 Hz
fmin = 1/0.64 = 1.5625 Hz
B. M-file:
Input:
n=64;dt=0.01;fs=1/dt;T=n*dt;
tspan=(0:n-1)*dt;
y=cos(2*pi*12.5*tspan)+cos(2*pi*25*tspan);
subplot(3,1,1);
plot(tspan,y,'-ok','linewidth',2,'MarkerFaceColor','black');
title('(a) f(t) versus time (s)');
Y=fft(y)/n;
Y';
nyquist=fs/2;fmin=1/T;
f = linspace(fmin,nyquist,n/2);
Y(1)=[];YP=Y(1:n/2);
subplot(3,1,2)
stem(f,real(YP),'linewidth',2,'MarkerFaceColor','blue')
grid;title('(b) Real component versus frequency')
subplot(3,1,3)
stem(f,imag(YP),'linewidth',2,'MarkerFaceColor','blue')
grid;title('(c) Imaginary component versus frequency')
xlabel('frequency (Hz)')
Output:
Three-panel plot: (a) f(t) versus time (s); (b) real component versus frequency; (c) imaginary component versus frequency (Hz).
Chapter IV
POLYNOMIAL INTERPOLATION
I. INTRODUCTION
The most common method of estimating the intermediate values between precise data
points is polynomial interpolation. There are a variety of alternative forms for expressing an
interpolating polynomial beyond the familiar format of the equation

f(x) = p1 x^(n−1) + p2 x^(n−2) + ··· + p(n−1) x + pn

Among these, the Newton and Lagrange interpolating polynomials are considered the
most popular and useful forms. Inverse interpolation is also used.
f(n−1)(x) = f(x1) + (x − x1) f[x2, x1] + (x − x1)(x − x2) f[x3, x2, x1] + ··· + (x − x1)(x − x2)···(x − x(n−1)) f[xn, x(n−1), ..., x2, x1]

where the bracketed function evaluations are finite divided differences. For example, the first
finite divided difference is represented generally as

f[xi, xj] = (f(xi) − f(xj)) / (xi − xj)
The second finite divided difference, which represents the difference of two first divided
differences, is expressed generally as

f[xi, xj, xk] = (f[xi, xj] − f[xj, xk]) / (xi − xk)

Similarly, the nth finite divided difference is

f[xn, x(n−1), ..., x2, x1] = (f[xn, x(n−1), ..., x2] − f[x(n−1), x(n−2), ..., x1]) / (xn − x1)
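The recursion behind these divided differences can be sketched in Python (the helper name is mine, not the textbook's):

```python
import numpy as np

# Sketch of the divided-difference recursion above:
# b[i][j] holds the j-th divided difference f[x_i, ..., x_{i+j}].
def divided_differences(x, y):
    n = len(x)
    b = np.zeros((n, n))
    b[:, 0] = y
    for j in range(1, n):
        for i in range(n - j):
            b[i, j] = (b[i + 1, j - 1] - b[i, j - 1]) / (x[i + j] - x[i])
    return b

# For f(x) = x^2 at x = 1, 2, 4: the first differences are 3 and 6, and
# the second divided difference of a quadratic is always 1.
b = divided_differences([1.0, 2.0, 4.0], [1.0, 4.0, 16.0])
```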
It is not necessary that the data points be arranged in ascending order, but the points
should be ordered so that they are centered around, and as close as possible to, the unknown.
The simplest form, linear interpolation, is

f1(x) = f(x1) + ((f(x2) − f(x1)) / (x2 − x1)) (x − x1)
On the other hand, quadratic interpolation is used to introduce some curvature into the line
connecting the points. A particularly convenient form for this purpose is

f2(x) = b1 + b2(x − x1) + b3(x − x1)(x − x2)

Where:

b1 = f(x1)

b2 = (f(x2) − f(x1)) / (x2 − x1)

b3 = [ (f(x3) − f(x2))/(x3 − x2) − (f(x2) − f(x1))/(x2 − x1) ] / (x3 − x1)
The linear Lagrange interpolating formula yields a straight line that connects the data
points and can be calculated in the general form

f1(x) = ((x − x2)/(x1 − x2)) f(x1) + ((x − x1)/(x2 − x1)) f(x2)
where the nomenclature f1(x) designates that this is a first-order polynomial. The same
strategy can be employed to fit a parabola through three points. For this case three parabolas
would be used with each one passing through one of the points and equaling zero at the other
two. Their sum would then represent the unique parabola that connects the three points.
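A minimal Python sketch of this three-parabola construction (the helper name is mine):

```python
# Each basis parabola below is 1 at its own node and 0 at the other two,
# so their weighted sum is the unique parabola through all three points.
def lagrange2(x, y, xx):
    x1, x2, x3 = x
    L1 = (xx - x2) * (xx - x3) / ((x1 - x2) * (x1 - x3))
    L2 = (xx - x1) * (xx - x3) / ((x2 - x1) * (x2 - x3))
    L3 = (xx - x1) * (xx - x2) / ((x3 - x1) * (x3 - x2))
    return y[0] * L1 + y[1] * L2 + y[2] * L3

# The parabola through (0, 0), (1, 1), (2, 4) is y = x^2, so the
# interpolated value at x = 1.5 is 2.25.
val = lagrange2([0.0, 1.0, 2.0], [0.0, 1.0, 4.0], 1.5)
```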
a) Linear Interpolation
Step 1: Start.
Step 2: Enter the values of x0, y0, x1, y1.
Step 3: Check if x0 = x1. If so, then go back to step 1 because the value of the
function is undefined in this condition. If not, then proceed to step 4.
Step 4: Enter the value of x.
Step 5: Check whether min{x0, x1} ≤ x ≤ max{x0, x1}. If not, then enter another
value of x. If yes, then proceed to step 6.
Step 6: Calculate P using P = y0 + ((y1 − y0)/(x1 − x0))(x − x0).
Step 7: Check if y0 = y1. If yes, then y = y0; if not, then y = P.
Step 8: Write the result y.
Step 9: End.
b) Quadratic Interpolation
Step 1: Start.
Step 2: Enter the values of x0, y0, x1, y1, x2, y2.
Step 3: Check if x0 < x1 < x2. If not, then go back to step 1 because the function
value is undefined under these conditions. If so, then proceed to step 4.
Step 4: Enter the value of x.
Step 5: Check whether min{x0, x1, x2} ≤ x ≤ max{x0, x1, x2}. If not, then enter
another value of x. If yes, then proceed step 6.
Step 6: Calculate F01 = (y1 − y0)/(x1 − x0), F12 = (y2 − y1)/(x2 − x1), and
F012 = (F12 − F01)/(x2 − x0).
Step 7: Calculate P = y0 + F01(x − x0) + F012(x − x0)(x − x1).
Step 8: Check if F012 = 0. If yes, then the resulting equation is linear. If not then
the resulting equation is a quadratic equation.
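These steps can be sketched in Python (reusing the F01, F12, F012 names from the steps above; the evaluation is the Newton form of the quadratic):

```python
# Sketch of the quadratic interpolation steps above.
def quadratic_interp(x0, y0, x1, y1, x2, y2, x):
    F01 = (y1 - y0) / (x1 - x0)
    F12 = (y2 - y1) / (x2 - x1)
    F012 = (F12 - F01) / (x2 - x0)      # zero when the data are collinear
    return y0 + F01 * (x - x0) + F012 * (x - x0) * (x - x1)

# Through (0, 0), (1, 1), (2, 4) -- points on y = x^2 -- the value at
# x = 1.5 is 2.25.
p = quadratic_interp(0.0, 0.0, 1.0, 1.0, 2.0, 4.0, 1.5)
```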
c) Lagrange Interpolation
Step 1: Start.
Step 5: End.
B. Flow Chart
1. Newton Interpolating Polynomials
a) Linear Interpolation
The program starts by initializing all the input data to zero. The program asks
whether x0 = x1; if yes, the program returns to the step in which the input data are set
to zero. If x0 and x1 are not equal, the program proceeds to the next step, in which the
value of x is entered. The program then checks whether min{x0, x1} ≤ x ≤ max{x0, x1}.
If not, another value of x is entered. If yes, the program proceeds to calculate the
unknown value using the general Newton interpolating polynomial formula. The
program then checks whether y0 = y1; if they are the same, P = y0 is obtained. The
result y = P is written, and the program ends.
[Flowchart: start → initialize x0, y0, x1, y1 to zero → input x0, y0, x1, y1 → if x0 = x1, return to input → input x → if x is outside [min{x0, x1}, max{x0, x1}], re-enter x → compute P = y0 + ((y1 − y0)/(x1 − x0))(x − x0) → if y0 = y1 print y = y0, else print y = P → end]
b) Quadratic Interpolation
The program starts by declaring the variables x0, y0, x1, y1, x2, and y2. The program
checks whether x0 < x1 < x2. If not, the program goes back to the previous step because the
function value is undefined under these conditions; otherwise it proceeds to the next step,
which requires the value of x to be entered. The program then checks whether
min{x0, x1, x2} ≤ x ≤ max{x0, x1, x2}. If not, another value of x is entered. If yes, the program
proceeds to calculate the required value using the quadratic interpolating polynomial formula.
The program then checks whether F012 = 0: if yes, the resulting equation is linear; if not, it is
quadratic. The result y = P is written. Lastly, the program ends.
[Flowchart: start → input x0, y0, x1, y1, x2, y2 → if not x0 < x1 < x2, return to input → input x → if x is outside [min{x0, x1, x2}, max{x0, x1, x2}], re-enter x → compute F01 = (y1 − y0)/(x1 − x0), F12 = (y2 − y1)/(x2 − x1), F012 = (F12 − F01)/(x2 − x0) → if F012 = 0 the result is linear, otherwise quadratic → print y = P → end]
2. Lagrange Interpolating Polynomials
The program starts by declaring the variables x and n, which are then read. For i = 1 to
(n + 1) in steps of 1, read xi and fi; these statements read the x values and the corresponding
values of f. Set Sum = 0. For i = 1 to (n + 1) in steps of 1, set Prodfunc = 1; then for j = 1 to
(n + 1) in steps of 1, if j ≠ i, update Prodfunc = Prodfunc × (x − xj)/(xi − xj). After the inner
loop, update Sum = Sum + fi × Prodfunc; Sum is the value of f at x. The program writes x and
Sum. Lastly, the program stops.
[Flowchart: start → read x and n → for i = 0 to n, read xi, fi → Sum = 0 → for i = 0 to n: Prod = 1; for j = 0 to n, if j ≠ i then Prod = Prod × (x − xj)/(xi − xj) → Sum = Sum + fi × Prod → end]
C. Syntax
Functions Description
Polyfit polyfit(x,y,n) finds the coefficients of a polynomial p(x) of degree n that fits the y data
by minimizing the sum of the squares of the deviations of the data from the model
(least-squares fit).
Polyval polyval(p,x) returns the value of a polynomial of degree n that was determined
by polyfit, evaluated at x.
Newtint This function uses the Newton interpolating polynomial to determine an interpolated value at a given point.
Lagrange This function uses the Lagrange interpolating polynomial to determine an interpolated value at a given point.
D. General M-files
For Newton’s Interpolating Polynomial
function yint = Newtint(x,y,xx)
% Newtint: Newton interpolating polynomial
% yint = Newtint(x,y,xx): Uses an (n - 1)-order Newton
% interpolating polynomial based on n data points (x, y)
% to determine a value of the dependent variable (yint)
% at a given value of the independent variable, xx.
% input:
% x = independent variable
% y = dependent variable
% xx = value of independent variable at which
% interpolation is calculated
% output:
% yint = interpolated value of dependent variable
% compute the finite divided differences in the form of a
% difference table
n = length(x);
if length(y)~=n, error('x and y must be same length'); end
b = zeros(n,n); % assign dependent variables to the first column of b
b(:,1) = y(:); % the (:) ensures that y is a column vector.
for j = 2:n
for i = 1:n-j+1
b(i,j) = (b(i+1,j-1)-b(i,j-1))/(x(i+j-1)-x(i));
end
end
% use the finite divided differences to interpolate
xt = 1;
yint = b(1,1);
for j = 1:n-1
xt = xt*(xx-x(j));
yint = yint+b(1,j+1)*xt;
end
For Lagrange Interpolating Polynomials
function yint = Lagrange(x,y,xx)
% Lagrange: Lagrange interpolating polynomial
% yint = Lagrange(x,y,xx): Uses an (n - 1)-order
% Lagrange interpolating polynomial based on n data points
% to determine a value of the dependent variable (yint) at
% a given value of the independent variable, xx.
% input:
% x = independent variable
% y = dependent variable
% xx = value of independent variable at which the
% interpolation is calculated
% output:
% yint = interpolated value of dependent variable
n = length(x);
if length(y)~=n, error('x and y must be same length'); end
s = 0;
for i = 1:n
product = y(i);
for j = 1:n
if i ~= j
product = product*(xx-x(j))/(x(i)-x(j));
end
end
s = s+product;
end
yint = s;
IV. CONCLUSION
Interpolation is a method of numerical analysis that provides the easiest way of
finding a member of a class of functions that agrees with the given data. The method is
used to approximate functions, usually with a polynomial, and it is extensively applied in
models of different phenomena. Interpolation may also be used to produce a smooth
graph of a function whose values are known only at discrete points, either from
measurements or calculations. A function that agrees with the given data can be
calculated with Newton interpolation and Lagrange interpolation, and these two methods
give similar values. The polyfit and polyval functions can also be used in polynomial
interpolation; they give similar values when compared to Newton and Lagrange
interpolation.
Sample Problem:
1. The following data for the density of nitrogen gas versus temperature come from a
table that was measured with high precision. Use first- through fifth-order
polynomials to estimate the density at a temperature of 330 K. What is your best
estimate?
A. Computational Method:
Using the model developed above, expressions for the desired results can be obtained.
First, arrange the data so that the points are closest to and centered on the unknown.
f[xi, xj] = (f(xi) − f(xj)) / (xi − xj)

f[x4, x3] = (1.367 − 0.854)/(250 − 400) = −3.42×10⁻³

f[xi, xj, xk] = (f[xi, xj] − f[xj, xk]) / (xi − xk)

f[x4, x3, x2] = (−3.42×10⁻³ − (−2.26×10⁻³))/(250 − 350) = 1.16×10⁻⁵

f[xi, xj, xk, xl] = (f[xi, xj, xk] − f[xj, xk, xl]) / (xi − xl)

f[xi, xj, xk, xl, xm] = (f[xi, xj, xk, xl] − f[xj, xk, xl, xm]) / (xi − xm)

f[x5, x4, x3, x2, x1] = (−4×10⁻⁸ − 4×10⁻⁹)/(450 − 300) = −2.933×10⁻¹⁰

f[xi, xj, xk, xl, xm, xn] = (f[xi, xj, xk, xl, xm] − f[xj, xk, xl, xm, xn]) / (xi − xn)
Thus, the divided difference table is constructed, and the interpolating polynomials of
increasing order are

f1(x) = f(x1) + (x − x1) f[x2, x1]

f2(x) = f(x1) + (x − x1) f[x2, x1] + (x − x1)(x − x2) f[x3, x2, x1]

f3(x) = f2(x) + (x − x1)(x − x2)(x − x3) f[x4, x3, x2, x1]
For the 4th-order fit:

f4(x) = f3(x) + (x − x1)(x − x2)(x − x3)(x − x4) f[x5, x4, x3, x2, x1]

f4(330) = 1.0289 + (330 − 300)(330 − 350)(330 − 400)(330 − 250)(−2.933×10⁻¹⁰) = 1.0279 kg/m³

For the 5th-order fit:

f5(x) = f4(x) + (x − x1)(x − x2)(x − x3)(x − x4)(x − x5) f[x6, x5, x4, x3, x2, x1]
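As a cross-check of the hand computation, the same divided-difference table can be built in Python (NumPy assumed; this is a sketch, not the document's M-file), using the same centered ordering of the nitrogen data:

```python
import numpy as np

# Nitrogen data in the centered order used above: 300, 350, 400, 250, 450, 200 K.
x = np.array([300.0, 350, 400, 250, 450, 200])
y = np.array([1.139, 0.967, 0.854, 1.367, 0.759, 1.708])

n = len(x)
b = np.zeros((n, n))        # b[0, j] holds f[x1, ..., x_{j+1}]
b[:, 0] = y
for j in range(1, n):
    for i in range(n - j):
        b[i, j] = (b[i + 1, j - 1] - b[i, j - 1]) / (x[i + j] - x[i])

# Evaluate the Newton polynomials of increasing order at T = 330 K.
xt, est, estimates = 1.0, b[0, 0], []
for j in range(1, n):
    xt *= 330.0 - x[j - 1]
    est += b[0, j] * xt
    estimates.append(est)   # orders 1 through 5
```

The fourth entry of `estimates` reproduces the 4th-order estimate of 1.0279 kg/m³ obtained above.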
B. M-file
Using the polyfit and polyval function
At 1st order:
T=[300 350];
D=[1.139 0.967];
p=polyfit (T,D,1)
Density=polyval (p,330)
warning off MATLAB:polyfit:RepeatedPoints
At 2nd order:
% Interpolation using the polyfit function
% T=independent variable
% D=dependent variable
% Output:
% Density=interpolated value of the dependent variable
At 3rd order:
% Interpolation using the polyfit function
% T=independent variable
% D=dependent variable
% Output:
% Density=interpolated value of the dependent variable
At 4th order:
At 5th order:
M-file to implement Newton interpolation
function yint = Newtint(x,y,xx)
% Newtint: Newton interpolating polynomial
% yint = Newtint(x,y,xx): Uses an (n - 1)-order Newton
% interpolating polynomial based on n data points (x, y)
% to determine a value of the dependent variable (yint)
% at a given value of the independent variable, xx.
% input:
% x = independent variable
% y = dependent variable
% xx = value of independent variable at which
% interpolation is calculated
% output:
% yint = interpolated value of dependent variable
% compute the finite divided differences in the form of a
% difference table
x=[300 350 400 250 450 200];
y=[1.139 0.967 0.854 1.367 0.759 1.708];
n=5
xx=330
n = length(x);
if length(y)~=n, error('x and y must be same length'); end
b = zeros(n,n); % assign dependent variables to the first column of b
b(:,1) = y(:); % the (:) ensures that y is a column vector.
for j = 2:n
for i = 1:n-j+1
b(i,j) = (b(i+1,j-1)-b(i,j-1))/(x(i+j-1)-x(i));
end
end
% use the finite divided differences to interpolate
xt = 1;
yint = b(1,1);
for j = 1:n-1
xt = xt*(xx-x(j));
yint = yint+b(1,j+1)*xt;
end
M-file to implement Lagrange Interpolation
function yint = Lagrange(x,y,xx)
% Lagrange: Lagrange interpolating polynomial
% yint = Lagrange(x,y,xx): Uses an (n - 1)-order
% Lagrange interpolating polynomial based on n data points
% to determine a value of the dependent variable (yint) at
% a given value of the independent variable, xx.
% input:
% x = independent variable
% y = dependent variable
% xx = value of independent variable at which the
% interpolation is calculated
% output:
% yint = interpolated value of dependent variable
x=[300 350 400 250 450 200];
y=[1.139 0.967 0.854 1.367 0.759 1.708];
n=5
xx=330
n = length(x);
if length(y)~=n, error('x and y must be same length'); end
s = 0;
for i = 1:n
product = y(i);
for j = 1:n
if i ~= j
product = product*(xx-x(j))/(x(i)-x(j));
end
end
s = s+product;
end
yint = s;
At 1st order:
At 2nd order:
At 3rd order:
At 4th order:
At 5th order:
For Newton Interpolation
D. Graph
Chapter V
SPLINE AND PIECEWISE INTERPOLATION
I. INTRODUCTION
In interpolating between n data points, (n−1)th-order polynomials were used.
For example, for eight points, we can derive a perfect seventh-order polynomial. This
curve would capture all the meanderings (at least up to and including seventh
derivatives) suggested by the points. However, there are cases where these functions
can lead to erroneous results because of round-off error and oscillations. An
alternative approach is to apply lower-order polynomials in a piecewise fashion to
subsets of data points. Such connecting polynomials are called spline functions.
In interpolating a function on an interval [a, b], it is not wise to use a high-degree
interpolating polynomial and equally-spaced interpolation points unless this interval is
sufficiently small. If the fitting function is only required to have a few continuous
derivatives, then one can construct a piecewise polynomial to fit the data.
A. Linear Splines
For the interval from xi to xi+1, the simplest spline is a straight line,
si(x) = ai + bi(x − xi), with ai = fi and bi = (fi+1 − fi)/(xi+1 − xi), where fi is shorthand
for f(xi). Substituting the first two equations into the last gives

si(x) = fi + ((fi+1 − fi)/(xi+1 − xi))(x − xi)
These equations can be used to evaluate the function at any point between x1 and xn
by first locating the interval within which the point lies. Then the appropriate equation
is used to determine the function value within the interval.
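The interval-location step can be sketched in Python (bisect-based; the helper name is mine):

```python
import bisect

# Sketch of linear spline evaluation: locate the interval [x_i, x_{i+1}]
# containing xq, then apply the linear formula for that interval.
def linear_spline(x, f, xq):
    i = bisect.bisect_right(x, xq) - 1
    i = max(0, min(i, len(x) - 2))          # clamp to a valid interval
    return f[i] + (f[i + 1] - f[i]) / (x[i + 1] - x[i]) * (xq - x[i])

x = [1.0, 2.0, 4.0]
f = [3.0, 7.0, 11.0]
v = linear_spline(x, f, 3.0)   # halfway through the interval [2, 4]
```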
B. Quadratic Spline
The objective in quadratic splines is to derive a second-order polynomial for
each interval between data points. The polynomial for each interval can be
represented generally as
si(x) = ai + bi(x − xi) + ci(x − xi)²
For n data points (i =1,2,...,n), there are n−1 intervals and, consequently, 3(n−1)
unknown constants (the a’s, b’s, and c’s) to evaluate. Therefore, 3(n−1) equations or
conditions are required to evaluate the unknowns.
C. Cubic Splines
The goal of cubic spline interpolation is to get an interpolation formula that is
continuous in both the first and second derivatives, both within the intervals and at the
interpolating nodes. This type of Spline is the most frequently used because it gives
the smoothest interpolating function. Quartic or higher-order splines are not used
because they tend to exhibit the instabilities inherent in higher-order polynomials.
The objective in cubic splines is to derive a third-order polynomial for each
interval between knots as represented generally by
si(x) = ai + bi(x − xi) + ci(x − xi)² + di(x − xi)³
Thus, for n data points (i =1,2,...,n), there are n−1 intervals and 4(n−1) unknown
coefficients to evaluate. Consequently, 4(n−1) conditions are required for their
evaluation.
There are several options for obtaining the two additional equations needed for cubic
splines. They are obtained by imposing one of the conditions below:
1. Natural end conditions, where the second derivative is set to zero at the first and
last knots.
2. Clamped end conditions, where the first derivative is prescribed at the first and
last knots.
3. "Not-a-knot" end conditions, where continuity is forced in the third derivative at
the second and penultimate (next-to-the-last) points.
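A compact Python sketch of a cubic spline with natural end conditions (second derivative zero at the end knots; NumPy assumed, helper name mine) shows the bookkeeping. SciPy's `CubicSpline` provides the "not-a-knot" end condition described above as its default:

```python
import numpy as np

# Natural cubic spline sketch: solve a tridiagonal system for the second
# derivatives m_i at the knots, then evaluate the cubic for the interval
# containing xq.
def cubic_spline_natural(x, y, xq):
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    h = np.diff(x)
    A = np.zeros((n, n))
    r = np.zeros(n)
    A[0, 0] = A[-1, -1] = 1.0   # natural end conditions: m_0 = m_{n-1} = 0
    for i in range(1, n - 1):
        A[i, i - 1 : i + 2] = h[i - 1], 2 * (h[i - 1] + h[i]), h[i]
        r[i] = 6 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
    m = np.linalg.solve(A, r)   # second derivatives at the knots
    i = int(np.clip(np.searchsorted(x, xq) - 1, 0, n - 2))
    t = xq - x[i]
    b = (y[i + 1] - y[i]) / h[i] - h[i] * (2 * m[i] + m[i + 1]) / 6
    return y[i] + b * t + m[i] / 2 * t**2 + (m[i + 1] - m[i]) / (6 * h[i]) * t**3

# Straight-line data is reproduced exactly by a natural spline.
v_line = cubic_spline_natural([0, 1, 2, 3], [1, 3, 5, 7], 1.5)
```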
[Flowchart: start → for i = 0 to n, input xi, yi → plot values → end]
Start. Then, from the initial ith component of value zero up to the nth component, plug in
the values of the independent variable x with their corresponding dependent variable y.
Following this step, the values are plotted depending on what type of function must be
graphed. Once done, the results are displayed. End.
C. Syntax
Functions Description
polyfit(x,y,n) finds the coefficients of a polynomial p(x) of degree n that fits
Polyfit the y data by minimizing the sum of the squares of the deviations of the data from the
model (least-squares fit).
polyval(p,x) returns the value of a polynomial of degree n that was determined
Polyval
by polyfit, evaluated at x.
This built-in function provides a handy means to implement a number of different
Interp1
types of piecewise one-dimensional interpolation.
Methods Description
nearest neighbour interpolation. This method sets the value of an interpolated point
Nearest to the value of the nearest existing data point. Thus, the interpolation looks like a
series of plateaus, which can be thought of as zero-order polynomials.
Linear linear interpolation. This method uses straight lines to connect the points.
Spline piecewise cubic spline interpolation. This is identical to the spline function.
pchip or cubic piecewise cubic Hermite interpolation.
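Rough Python analogues of two of the interp1 methods in the table (NumPy assumed; variable names are mine):

```python
import numpy as np

# Sketches of the 'linear' and 'nearest' methods.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 2.0, 1.0, 3.0])
xq = 1.4

linear = np.interp(xq, x, y)            # 'linear': straight-line connection
nearest = y[np.argmin(np.abs(x - xq))]  # 'nearest': value of the closest point
```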
IV. CONCLUSION
Splines minimize oscillation and round-off error by fitting lower-order
polynomials to data in a piecewise fashion. Among all the spline functions, cubic splines are
the most commonly used because they provide the simplest representation that exhibits the
desired appearance of smoothness.
An application of the cubic spline and piecewise interpolation formulas was the
computation of heat transfer across the thermocline depth of three lakes in the study area of
Auchi in Edo State of Nigeria. Eight temperature values each, for depths of 1 m to 8 m, were
collected from the lakes. Graphs of these temperatures against the depths were plotted, and a
cubic spline interpolation equation was modelled. MAPLE 15 software was used to simulate
the modelled equation using the values of temperatures and depths in order to obtain the
unknown coefficients of the variables in the 21 new equations. Three optimal equations were
found to represent the thermocline depth for the three lakes. These equations were used to
obtain the thermocline gradients (dT/dz) and subsequently to compute the heat flux across the
thermocline for the three lakes. Similar methods were used for cubic piecewise interpolation.
The analytical and numerical results obtained for the computation of thermocline depth and
temperature were presented for the three lakes with relative error analysis. The absolute
relative error |εaS| for the analytic solution with cubic spline interpolation was 0.41%, 0.70%
and 0.74%, while the absolute relative error |εaP| for the analytic solution with cubic
piecewise interpolation was 0.82%, 2.11% and 1.48%, respectively. Comparative analysis
showed that the results obtained with the cubic spline interpolation method had a smaller
percentage error than the cubic piecewise interpolation method.
Sample Problem:
Runge’s function is written as
f(x) = 1 / (1 + 25x²)
Generate five equidistantly spaced values of this function over the interval: [-1, 1]. Fit
these data with (a) a fourth-order polynomial, (b) a linear spline, and (c) a cubic
spline. Present your results graphically.
A. M-text File
a) Fourth-Order Polynomial
% create vector of 5 equally spaced points in [-1,1]
x=linspace(-1,1,5);
% compute corresponding y-values
y=1./(1+25*x.^2);
% compute 4th-degree interpolating polynomial
p=polyfit(x,y,4);
% for plotting, create vector of 100 equally spaced points
xx=linspace(-1,1);
% compute corresponding y-values to plot function
yy=1./(1+25*xx.^2);
% plot function
plot(xx,yy)
% tell MATLAB that next plot should be superimposed on current one
hold on
% plot polynomial, using polyval to compute values and a red dashed curve
plot(xx,polyval(p,xx),'r--')
% indicate interpolation points on plot using circles
plot(x,y,'o')
% label axes
xlabel('x')
ylabel('y')
% set caption
title('Fitting Runge''s function with 4th-Order Polynomial')
%setting legends
legend('function','4th-order polynomial')
b) Linear Spline
% create vector of 5 equally spaced points in [-1,1]
x=linspace(-1,1,5);
% compute corresponding y-values
y=1./(1+25*x.^2);
% for plotting, create vector of 100 equally spaced points
xx = linspace(-1,1);
% compute corresponding y-values to plot function
yy = interp1(x,y,xx,'linear');
%generate values for Runge’s function and display with the spline fit and
%the original data:
yr = 1./(1+25*xx.^2);
plot(x,y,'o',xx,yy,xx,yr,'b--')
% label axes
xlabel('x')
ylabel('y')
% set caption
title('Fitting Runge''s function with Linear Spline')
%setting legends
legend('function','linear spline')
c) Cubic Spline
% create vector of 5 equally spaced points in [-1,1]
x=linspace(-1,1,5);
% compute corresponding y-values
y=1./(1+25*x.^2);
% for plotting, create vector of 100 equally spaced points
xx = linspace(-1,1);
% compute corresponding y-values to plot function
yy = spline(x,y,xx);
%generate values for Runge’s function and display with the spline fit and
%the original data:
yr = 1./(1+25*xx.^2);
plot(x,y,'o',xx,yy,xx,yr,'--')
% label axes
xlabel('x')
ylabel('y')
% set caption
title('Fitting Runge''s function with Cubic Spline')
%setting legends
legend('function','cubic spline')
B. Matlab Outputs
a) Fourth-order Polynomial
b) Linear Spline
c) Cubic Spline
References
Chapman, S. J. (2001). MATLAB Programming for Engineers.
Chapra, S. C. (2005). Applied Numerical Methods with MATLAB for Engineers and Scientists.
New York: McGraw-Hill Inc.
Hahn, B. D., & Valentine, D. T. (2001). Essential MATLAB for Engineers and Scientists.
Oxford: Elsevier Ltd.