
Lecture Notes on Numerical Analysis

Manuel Julio García


Department of Mechanical Engineering
EAFIT University, Medellín, Colombia

April 5, 2011

Contents

Preface

Notation

1 Linear Systems
  1.1 Linear Systems of Equations
    1.1.1 Geometric interpretation
    1.1.2 Singular cases
  1.2 Gauß Elimination
    1.2.1 Forward elimination
    1.2.2 Backward substitution
    1.2.3 Operations in a matrix
    1.2.4 Forward elimination: general case
    1.2.5 Backward substitution: general case
  1.3 LU Decomposition
    1.3.1 Cholesky Factorisation
  1.4 Special Types of Matrices
  1.5 Exercises

2 Iterative Methods
  2.1 Vector Space
  2.2 Vector Norms
    2.2.1 Distance between two vectors
    2.2.2 Some norms in R^n
  2.3 Sequence of Vectors
    2.3.1 Equivalent norms
  2.4 Inner Product
  2.5 Matrix Norms
    2.5.1 Natural norms
  2.6 Eigenvalues
    2.6.1 Applications
  2.7 Iterative Methods
    2.7.1 Spectral Radius
    2.7.2 Convergent Matrix
    2.7.3 General characteristics of Iterative Methods
    2.7.4 Jacobi's Method
    2.7.5 Gauß-Seidel
    2.7.6 Steepest Descent
    2.7.7 Conjugate Gradient
  2.8 Condition Number
  2.9 Exercises

3 Interpolation
  3.1 Introduction
    3.1.1 Polynomial approximation
  3.2 Lagrange polynomials
    3.2.1 Second order Lagrange polynomials
    3.2.2 General case
    3.2.3 Other Approximations
  3.3 Polynomials defined by parts
    3.3.1 One-dimensional interpolation
    3.3.2 Two-dimensional interpolation
  3.4 Computation of the gradient of a scalar field
    3.4.1 Computation of the gradient for an element
    3.4.2 Computation of the gradient for a field
  3.5 Exercises

4 The Finite Element Method
  4.1 Classification of the Partial Differential Equations (PDE)
  4.2 Boundary Value Problems
    4.2.1 One-dimensional boundary problems
  4.3 Preliminary mathematics
    4.3.1 The Divergence (Gauß) theorem
    4.3.2 Green's equation
    4.3.3 Bilinear Operators
  4.4 Overview of the Finite Element Method
    4.4.1 Weak Form
    4.4.2 Abstract Form
    4.4.3 Variational Form
    4.4.4 Discrete Problem (Galerkin Method)
  4.5 Variational Formulation
    4.5.1 Reduction to Homogeneous Boundary Conditions
    4.5.2 The Ritz-Galerkin Method
    4.5.3 Other Methods
  4.6 One-dimensional problems
    4.6.1 Dirichlet boundary conditions
    4.6.2 Pragmatics
    4.6.3 Computation of the ℓ vector
    4.6.4 von Neumann Boundary Conditions
  4.7 Exercises

5 Two-dimensional Elliptic Problems
  5.1 Poisson's equation
  5.2 Weak form of the problem
    5.2.1 Weak form of the Dirichlet homogeneous boundary problem
    5.2.2 Weak form of the von Neumann homogeneous boundary problem
  5.3 Discrete problem
  5.4 Computation of the stiffness matrix
  5.5 Non-homogeneous Dirichlet boundary problem
  5.6 Non-homogeneous von Neumann boundary problems
  5.7 Fourier boundary conditions
  5.8 Exercises

6 Affine Transformations
  6.1 Change of variable in an integral
  6.2 Transformation of a Standard Element
    6.2.1 Computation of the transformation function
    6.2.2 Base functions for the standard element
    6.2.3 Computation of integrals over a finite element
  6.3 Exercises

7 Parabolic Problems
  7.1 Finite Difference Approximation
  7.2 Method for time integration

A Barycentric Coordinates

B Gradient
  B.1 Computation of the gradient of a function

Preface

The aim of Engineering Analysis is to model the physical behaviour of structures under the interaction of external loads, such as forces and temperatures, in order to verify that they comply with design specifications.

The course Advanced Methods in Numerical Analysis is a postgraduate course of the Mechanical Engineering Department at EAFIT University in Medellín, Colombia. The aim of the course is to study numerical (computational) solutions to mathematically formulated physical problems. The course covers an introduction to state-of-the-art linear algebra matrix methods as well as an introduction to the numerical formulation of continuum mechanics problems such as heat transfer, potential flow, diffusion, and others that can be modelled by a second-order differential equation. The method of solution studied in this part is the Finite Element Method.

These notes grew from my undergraduate notes taken while attending Professor José Rafael Toro's course, Numerical Analysis, at Los Andes University. From there they evolved over several years of teaching the field, first at Los Andes University and then at EAFIT University. They do not pretend to be a mathematical theory of numerical methods or finite elements. Instead, they intend to introduce students to powerful mathematical and numerical techniques and to motivate them to further study in the topic. Numerical modelling can be summarised as follows: start by establishing the nature of the problem and the mathematical equations that model its physics, transform the mathematical formulation into a form suitable to be solved by a computer, and finally implement the solution in computer code. That covers the whole cycle. The course is not all-inclusive in topics, but it tries to cover and explain every single step involved in the process of solving a basic continuum problem by computer. There is no preference for computer languages, but Matlab, Maple, and C++ are used extensively. However, the algorithms presented in the text are generic and are not tied to any of these languages.

Care must be taken: these notes are not in their final version and may have some typos and missing parts. I apologise for that. They are intended as a guide during the course, to help students understand the concepts and algorithms presented. New examples will be added continuously. I hope you find this work to be of help.

A preliminary course in calculus of several variables and a computer programming course are required for success in the course.

Manuel García
Medellín, September 2002


Notation

A           matrix
a_ij        component in row i, column j of the matrix A
a_j         the jth column of A
a_i         the ith row of A
x           a vector of n elements
x^T         the transpose of x, a row vector
α, β        scalar quantities
E_i         the ith equation of a system of equations
(E1)        a property of a vector space
N(x)        a norm of x
W_i         the ith base function
Ω           a domain
∂Ω          the boundary of a domain
x^k         the kth vector of a sequence
a(u, v)     a bilinear operator
ℓ(v)        a linear operator
⟨·, ·⟩      inner product
A_D         Dirichlet stiffness matrix; it does not contain rows or columns corresponding to the Dirichlet boundary
∇v          gradient of v
∇²v         Laplacian of v

Chapter 1

Linear Systems

1.1 Linear Systems of Equations

This section deals with finding the values of the variables x_1, x_2, x_3, ..., x_n that simultaneously satisfy a set of linear equations, which arise in the solution of many physical problems. In general, a system of n linear equations can be represented as

a_11 x_1 + a_12 x_2 + ... + a_1n x_n = b_1
a_21 x_1 + a_22 x_2 + ... + a_2n x_n = b_2
...
a_n1 x_1 + a_n2 x_2 + ... + a_nn x_n = b_n

where the a_ij represent real coefficients and the b_i independent real constants. A system of equations has a geometric interpretation that helps to understand the meaning of a system of equations, the mechanics of the method of solution, and the meaning of singular systems. In the next section we introduce these concepts starting with a system of two equations, and later we extend them to the general case.

1.1.1 Geometric interpretation

Consider the following system of two equations:

2x_1 + 4x_2 = 2      (1.1)
4x_1 + 11x_2 = 1.    (1.2)

This system can be written in matrix form as

[2  4 ] [x_1]   [2]
[4  11] [x_2] = [1]

or in compact notation

Ax = b

or

[A] x = b

with

A = [2  4 ]    and    x = [x_1]
    [4  11],              [x_2].

We refer to x_i as the ith component of the vector x. In the literature it is also common to refer to the ith component of x as [x]_i or x(i). In the same way, a_ij represents the component in the ith row and jth column of the matrix A. Unless explicitly stated otherwise, a_ij and x_i are real numbers, that is, x_i, a_ij ∈ R. Rows and columns are denoted using Matlab colon notation: A[i, :] represents the ith row and A[:, i] represents the ith column.

A system of equations has a geometric interpretation that differs depending on whether we look at the rows or the columns of the system.

Row representation (system of two equations)

Equations 1.1 and 1.2 are linear equations that can be plotted as lines in a two-dimensional plane. The simultaneous solution represents the intersection of the two lines in the plane; see figure 1.1.

Figure 1.1: Geometric representation of the solution of a system of equations. Row representation.

Column representation (system of two equations)

Notice that equations 1.1 and 1.2 can also be written in the following way:

x_1 [2] + x_2 [ 4]   [2]
    [4]       [11] = [1].

That is, the sum of two vectors, each multiplied by a scalar x_i. Remember that when a vector is multiplied by a scalar quantity, its magnitude is changed by that factor. Figure 1.2 shows the vectors represented by the columns of the system. If we multiply the first vector (2, 4) by 3 and the second (4, 11) by −1 and then add them, the result is the vector (2, 1), the right-hand side of the system; the multipliers (3, −1) are the solution. This result is not a coincidence at all. As the columns are linearly independent, they form a base of the space, which in this particular case is a base of R². Therefore any vector in the space can be found as a linear combination of the elements of the base.

Figure 1.2: Column representation of a system of two equations.

An important conclusion is that any vector in the plane can be reproduced by these two vectors if we choose x_1 and x_2 properly. Notice that this result can also be extended to any two vectors different from (2, 4) and (4, 11) if and only if they are not parallel to each other. Why?
Row representation (system of three equations)

[a_11 a_12 a_13] [x_1]   [b_1]
[a_21 a_22 a_23] [x_2] = [b_2]
[a_31 a_32 a_33] [x_3]   [b_3]

Every row represents a plane in three-dimensional space. That way, for example, (a_11, a_12, a_13) is the vector normal to the plane represented by the first row. It can also be written as A[1, :] using the Matlab colon notation. In the same way, (a_21, a_22, a_23) = A[2, :] defines another plane, and so forth.

The solution to the system of equations is given by the intersection of the planes. That is, plane A[1, :] intersects plane A[2, :] in a line, and that line intersects plane A[3, :] in a point, which is the solution to the system.


Column representation (system of three equations)

In the same way as in the n = 2 case, each matrix column represents a vector in three-dimensional space. In Matlab colon notation the columns are written as

[a_11]             [a_12]             [a_13]
[a_21] = A(:, 1)   [a_22] = A(:, 2)   [a_23] = A(:, 3)
[a_31]             [a_32]             [a_33]

and the vector solution is obtained as a linear combination of these three vectors, that is, multiplying the column vectors A[:, 1], A[:, 2], and A[:, 3] by the scalar quantities x_1, x_2, and x_3 and then adding the result. See figure 1.3.

Figure 1.3: Column representation of the solution of a system of 3 linear equations.

1.1.2 Singular cases

There are cases when the system does not have a solution. This can be interpreted from the geometric point of view as follows. If we use the row representation, then each matrix row represents a plane and the solution is the simultaneous intersection of the planes. If one of the planes is parallel to the intersection of the other two planes, then the three planes never intersect in a point and therefore there is no solution. See figure 1.4(a). On the other hand, when we plot the column vectors of the same singular system, they all lie in the same plane. See figure 1.4(b). This means the vectors do not form a base of R³.

Figure 1.4: Geometric representation of a singular system. (a) Row representation; (b) column representation.

1.2 Gauß Elimination

From the geometric point of view, each equation (row) of a system of n equations represents a hyperplane in n-dimensional space. Finding the solution to the system is therefore equivalent to finding the intersection of all n hyperplanes. This procedure can be illustrated as follows: eliminate one variable by intersecting two hyperplanes; the result is a hyperplane of dimension n − 1. Then consecutively find the intersection of the resulting hyperplane with the next hyperplane. Every intersection reduces the dimension of the resulting hyperplane by one. After n − 1 operations we obtain a point, which is the solution of the system.

Gauß elimination consists of two main steps: forward elimination and backward substitution.

1.2.1 Forward elimination

The purpose of forward elimination is to reduce the set of equations to an upper triangular system. The process starts by eliminating the coefficients of the first column from the second equation down to the last equation. Then it eliminates the coefficients of the second column from the third equation downwards, and so forth, until the (n − 1)th column of the system. That way the last equation will have only one unknown. To eliminate a coefficient a_ik from the ith equation, equation k must be multiplied by −a_ik/a_kk and added to equation i. Equation k is called the pivot equation and the term a_kk is called the pivot coefficient.

1.2.2 Backward substitution

After the forward elimination, the original matrix is transformed into an upper triangular matrix. The last of the equations (n) has only one unknown: a_nn x_n = b_n. The unknown x_n is found and replaced back into equation n − 1, a_{n−1,n−1} x_{n−1} + a_{n−1,n} x_n = b_{n−1}, which now has only one unknown, x_{n−1}. The equation is solved and the procedure is repeated until we reach the first equation, at which point all the values are known.

Known problems:

- During the process of forward elimination and backward substitution a division by zero can occur. In the same way, due to computer arithmetic, even if a pivot is not exactly zero but close to zero, the same problem can arise.
- Round-off errors can result in inexact solutions.
- Ill-conditioned systems of equations, where small changes in the coefficients give rise to large changes in the solution.
- Singular systems of equations.

1.2.3 Operations in a matrix

Let A ∈ R^{n×n} and x, b ∈ R^n. The system of n equations can be written in terms of A, x, and b as

[a_11 a_12 ... a_1n] [x_1]   [b_1]    (E_1)
[a_21 a_22 ... a_2n] [x_2]   [b_2]    (E_2)
[ ...           ...] [...] = [...]     ...
[a_n1 a_n2 ... a_nn] [x_n]   [b_n]    (E_n)

where (E_i) denotes equation i of the system. The following operations can be performed without altering the result:

- An equation E_i can be multiplied by a scalar λ, with λ ≠ 0, and the resulting equation can replace E_i: (λE_i) → (E_i).
- An equation E_j can be multiplied by λ and added to equation E_i: (E_i + λE_j) → (E_i).
- The order of the equations can be changed: (E_i) ↔ (E_j).
Example 1.1
Consider the following system of two equations:

2x_1 + 4x_2 = 2      (1.3)
4x_1 + 11x_2 = 1.    (1.4)

To eliminate the first variable we multiply equation (1.3) by −4/2 and add the result to equation (1.4):

2x_1 + 4x_2 = 2
0 + 3x_2 = −3.

From the last equation we solve for x_2 as x_2 = −1. The second step consists of replacing x_2 = −1 back into equation (1.3) and solving for x_1 (back substitution), which gives x_1 = 3.


Example 1.2
Solve the following system of equations, represented in matrix form, using forward elimination and backward substitution:

[2 4 0] [x_1]   [2]    (E_1)
[1 4 1] [x_2] = [1]    (E_2)
[2 2 2] [x_3]   [4]    (E_3)

We operate on the system in the following way. First, E_2 ← E_2 − (1/2) E_1:

[2 4 0] [x_1]   [2]
[0 2 1] [x_2] = [0]
[2 2 2] [x_3]   [4]

Then E_3 ← E_3 − (2/2) E_1:

[2  4 0] [x_1]   [2]
[0  2 1] [x_2] = [0]
[0 −2 2] [x_3]   [2]

Notice that variable x_1 has been eliminated from all the equations below the first (column one equals zero). Now we can proceed with column two, E_3 ← E_3 + E_2:

[2 4 0] [x_1]   [2]
[0 2 1] [x_2] = [0]
[0 0 3] [x_3]   [2]

The second step consists of back-replacing the unknowns. From the last equation we have

x_3 = 2/3,

then, after successive replacements,

2x_2 + 2/3 = 0        →   x_2 = −1/3,
2x_1 + 4(−1/3) = 2    →   x_1 = 5/3.

Notice that the forward elimination transformed the original system Ax = b into an upper triangular system Ux = c with

U = [2 4 0]    and    c = [2]
    [0 2 1]               [0]
    [0 0 3]               [2]

which has the same solution, since equality was maintained by applying the same operations to A and b during the process.

1.2.4 Forward elimination: general case

Given the following set of n linear equations, E_1, ..., E_n,

a_11 x_1 + a_12 x_2 + ... + a_1n x_n = b_1    (E_1)
a_21 x_1 + a_22 x_2 + ... + a_2n x_n = b_2    (E_2)
...
a_n1 x_1 + a_n2 x_2 + ... + a_nn x_n = b_n    (E_n)

we want to find x_i for i = 1, ..., n. The solution is obtained by applying forward elimination followed by back substitution. Forward elimination transforms the original system into an upper triangular one. Algorithm 1 presents the forward elimination algorithm. For each row i it puts zeros in the positions underneath the diagonal (outer loop). To do this it divides equation E_i by a_ii so that the diagonal becomes one, that is, a_ii = 1. Then it multiplies E_i by a_ji (the value that we want to eliminate) and subtracts it from E_j. The resulting equation replaces E_j.
Algorithm 1 Forward Elimination

n is the dimension of the matrix
for i = 1 to n do
    {divide the ith equation (row) by a_ii so that a_ii = 1}
    E_i ← E_i / a_ii                        (n − i + 2 operations)
    {put zeros in the ith column under the diagonal}
    for j > i do
        E_j ← E_j − a_ji E_i                (n − i + 2 operations)
    end for
end for

In order to have a measure of how long the algorithm takes to solve a system, we compute the total number of operations (multiplications, subtractions, divisions and sums). We can approximate this value by counting only the number of multiplications and divisions. For each cycle the operations per cycle op_i are (n − i + 2) divisions plus the number of operations of the j loop, which is executed (n − i) times. The number of operations per j cycle is (n − i + 2), resulting from multiplying a_ji E_i. That is,

op_i = (n − i + 2) + (n − i)(n − i + 2)
     = (n − i + 2)(1 + n − i)
     = n² − 2ni + 3n + i² − 3i + 2,

which is the total number of operations per ith cycle. Remembering that

Σ_{j=1}^{m} 1 = m,    Σ_{j=1}^{m} j = m(m + 1)/2,    Σ_{j=1}^{m} j² = m(m + 1)(2m + 1)/6,


then the total number of operations after n cycles will be

op = Σ_{i=1}^{n} (n² − 2ni + 3n + i² − 3i + 2)
   = n² Σ 1 − 2n Σ i + 3n Σ 1 + Σ i² − 3 Σ i + 2 Σ 1
   = n³ − 2n · n(n + 1)/2 + 3n² + n(n + 1)(2n + 1)/6 − 3 n(n + 1)/2 + 2n
   = (1/3) n³ + n² + (2/3) n.

When n is large, the number of operations op will be of the order of O(n³).

1.2.5 Backward substitution: general case

After applying the forward elimination algorithm the matrix is transformed into an upper triangular system like this:

     [U_11 ... U_1i ... U_1n]
     [         ...          ]
Ux = [     U_ii  ...  U_in  ] x = c.
     [            ...       ]
     [ 0               U_nn ]

To solve an upper triangular system we solve for the components of x starting from the last equation and moving backwards. For the last equation we have

U_nn x_n = c_n   →   x_n = c_n / U_nn.

With x_n known, we replace this value into row (equation) n − 1,

U_{n−1,n−1} x_{n−1} + U_{n−1,n} x_n = c_{n−1},

and solve for x_{n−1}:

x_{n−1} = (c_{n−1} − U_{n−1,n} x_n) / U_{n−1,n−1}.

Now with x_n and x_{n−1} known, we can move up to row n − 2 and solve for x_{n−2}. This procedure can be repeated for equation n − 3 and so forth. In general, suppose we are replacing in equation j. Equation j has the following form:

U_j^T x = U_jj x_j + U_{j,j+1} x_{j+1} + ... + U_jn x_n = c_j.

Figure 1.5: The operation of the jth row times vector x has x_j as its only unknown value.

Notice that equation E_j is obtained by multiplying the jth row of the matrix by the vector x. Figure 1.5 illustrates this multiplication. At this stage, vector x consists of two parts: the upper part, made of unknown values, and the lower, known part. Then, when x is multiplied by the jth row of the matrix, all the unknown components of x are cancelled except the x_j term. This is because U is an upper triangular matrix and therefore U_jk = 0 for k < j:

U_j1 x_1 + ... + U_jj x_j + U_{j,j+1} x_{j+1} + ... + U_jn x_n = c_j,

where the terms before U_jj x_j vanish and the terms after it involve only the known part of x. Then, solving for x_j,

U_jj x_j = c_j − (U_{j,j+1} x_{j+1} + ... + U_jn x_n)
U_jj x_j = c_j − Σ_{k>j} U_jk x_k
x_j = (c_j − Σ_{k>j} U_jk x_k) / U_jj.

This result is summarised in Algorithm 2.


The total number of multiplications and divisions will be

Σ_{j=1}^{n} ((n − j) + 1) = n Σ 1 − Σ j + Σ 1
                          = n·n − n(n + 1)/2 + n
                          = n² − n²/2 − n/2 + n
                          = (n² + n)/2,


Algorithm 2 Backward Substitution

n is the dimension of the matrix
x_n = c_n / U_nn
for j = n − 1 to 1 do
    temp = 0
    for k = j + 1 to n do
        temp = temp + U_jk x_k
    end for
    x_j = (c_j − temp) / U_jj
end for

which is of order O(n²), recalling that the first part of the algorithm was of order O(n³). This shows the advantage of doing backward substitution instead of setting zeros in the upper triangular positions of the matrix (backward elimination). This result leads us to the conclusion that substitution must be preferred over elimination, due to the order of complexity of the algorithms.
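A minimal sketch of Algorithms 1 and 2 back to back might look as follows (Python is used here for illustration; the function name is ours, and no pivoting is performed, so every pivot a_ii is assumed non-zero):

```python
import numpy as np

def gauss_solve(A, b):
    """Forward elimination (Algorithm 1) followed by
    backward substitution (Algorithm 2). No pivoting."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    # Forward elimination: E_i <- E_i / a_ii, then E_j <- E_j - a_ji E_i
    for i in range(n):
        pivot = A[i, i]
        A[i, i:] /= pivot
        b[i] /= pivot
        for j in range(i + 1, n):
            factor = A[j, i]
            A[j, i:] -= factor * A[i, i:]
            b[j] -= factor * b[i]
    # Backward substitution on the resulting system Ux = c
    x = np.zeros(n)
    for j in range(n - 1, -1, -1):
        x[j] = (b[j] - A[j, j + 1:] @ x[j + 1:]) / A[j, j]
    return x

# The system of Example 1.2
A = np.array([[2., 4., 0.], [1., 4., 1.], [2., 2., 2.]])
b = np.array([2., 1., 4.])
print(gauss_solve(A, b))   # [5/3, -1/3, 2/3]
```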

1.3 LU Decomposition

Backward substitution starts from an upper triangular system of equations and solves for x in Ux = c. As we showed in the previous section, this procedure is of a lower order of complexity than forward elimination. So if a system of equations can be written as

LUx = b,    (1.5)

with L a lower triangular and U an upper triangular matrix, then it can be solved with two substitution procedures: i) use forward substitution to find c from Lc = b, and ii) use backward substitution to find x from Ux = c. These two substitutions are of lower order than one elimination plus one substitution. However, not many systems can be directly expressed in the form (1.5). In general, one can obtain the factorisation during the elimination procedure itself. This of course brings no advantage in terms of efficiency, because the factorisation itself is an O(n³) algorithm. Nevertheless, it pays off when we are solving several systems with the same matrix but different right-hand sides b.
To illustrate the factorisation procedure, consider the following system of four equations:

     [2 1 0 0] [x_1]   [2]
Ax = [1 2 1 0] [x_2] = [1] = b.
     [0 1 2 1] [x_3]   [4]
     [0 0 1 2] [x_4]   [8]

Reducing the system to an upper triangular system is a straightforward procedure because of the tridiagonal nature of the matrix, where most of the terms are already zero. We proceed using forward elimination. To eliminate a_21, we multiply equation E_1 by l_21 = a_21/a_11 = 1/2 and subtract it from equation E_2:

                        [2  1   0 0]
E_2 ← E_2 − (1/2) E_1:  [0 3/2  1 0]
                        [0  1   2 1]
                        [0  0   1 2]

Now, to eliminate a_32, we multiply equation E_2 by l_32 = a_32/a_22 = 2/3 and subtract it from equation E_3:

                        [2  1   0  0]
E_3 ← E_3 − (2/3) E_2:  [0 3/2  1  0]
                        [0  0  4/3 1]
                        [0  0   1  2]

Finally, to eliminate a_43, we multiply equation E_3 by l_43 = a_43/a_33 = 3/4 and subtract it from equation E_4:

                        [2  1   0   0 ]
E_4 ← E_4 − (3/4) E_3:  [0 3/2  1   0 ]
                        [0  0  4/3  1 ]
                        [0  0   0  5/4]

The result is an upper triangular matrix. At the same time as we operate on the matrix, we must operate on the b vector, which is transformed into the vector c:


b_2 ← b_2 − l_21 b_1:  (2, 0, 4, 8),
b_3 ← b_3 − l_32 b_2:  (2, 0, 4, 8),
b_4 ← b_4 − l_43 b_3:  (2, 0, 4, 5) = c,

where the same multiples l_ij used for the matrix are used to operate on the vector. The original problem Ax = b has been transformed into an upper triangular system Ux = c:

[2  1   0   0 ] [x_1]   [2]
[0 3/2  1   0 ] [x_2] = [0]
[0  0  4/3  1 ] [x_3]   [4]
[0  0   0  5/4] [x_4]   [5]
       U          x   =  c
Now we concentrate on how c was obtained. First notice that the operations only involve multiplications by the l_ij multiples. Observing this sequence of operations in detail, it can be noticed that if these multiples are put into matrix form (L), then the steps from b to c are exactly the same as solving Lc = b. That is, the procedure from b to c is equivalent to applying forward substitution to the following system:

[ 1   0   0  0] [c_1]   [2]
[1/2  1   0  0] [c_2] = [1]
[ 0  2/3  1  0] [c_3]   [4]
[ 0   0  3/4 1] [c_4]   [8]
       L          c   =  b

In general,

[ 1    0    0   ...    0     0] [c_1]   [b_1]
[l_21  1    0   ...    0     0] [c_2]   [b_2]
[l_31 l_32  1   ...    0     0] [c_3] = [b_3]
[ ...                  ...    ] [...]   [...]
[l_n1 l_n2 ... l_{n,n−1}     1] [c_n]   [b_n]

In conclusion, we started with a system Ax = b and then, by forward elimination, transformed it into Ux = c. Both systems have the same solution, since equality was maintained at each step by applying the same operations to A and b.

As c is given by Lc = b and Ux = c, then Ux = L⁻¹b, and therefore

L U x = b,

which implies the original matrix A was factorised into a lower triangular matrix L and an upper triangular matrix U. A good Gauß elimination code consists of two main steps:

i. factorise A into L and U, and
ii. compute x from LUx = b.

Please note that we used a tridiagonal symmetric system only to simplify the computation. In general, this method applies to fully populated non-symmetric matrices.
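As an illustration of this two-step strategy, here is a compact sketch (a Doolittle-style factorisation without pivoting; the helper names are ours). Once L and U are stored, each additional right-hand side costs only the two O(n²) substitutions:

```python
import numpy as np

def lu_factor(A):
    """Factorise A = L U, with L unit lower triangular.
    No pivoting: assumes the pivots never vanish."""
    n = A.shape[0]
    L = np.eye(n)
    U = A.astype(float).copy()
    for k in range(n):
        for j in range(k + 1, n):
            L[j, k] = U[j, k] / U[k, k]      # the multiple l_jk
            U[j, k:] -= L[j, k] * U[k, k:]   # E_j <- E_j - l_jk E_k
    return L, U

def lu_solve(L, U, b):
    n = len(b)
    c = np.zeros(n)
    for i in range(n):                   # forward substitution, L c = b
        c[i] = b[i] - L[i, :i] @ c[:i]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):       # backward substitution, U x = c
        x[i] = (c[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# The tridiagonal example above: one factorisation, many b's
A = np.array([[2., 1., 0., 0.], [1., 2., 1., 0.],
              [0., 1., 2., 1.], [0., 0., 1., 2.]])
L, U = lu_factor(A)
print(lu_solve(L, U, np.array([2., 1., 4., 8.])))  # [1. 0. 0. 4.]
```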

1.3.1 Cholesky Factorisation

Theorem 1.3.1. If A is a symmetric positive definite matrix, then A can be factorised as A = CC^T, where C is a lower triangular matrix and C^T is its transpose.

Proof. The proof serves at the same time to deduce the algorithm. Proving that A = CC^T is equivalent to finding C; we proceed by induction. First we show that we can calculate the values of the first column of C, and then we suppose the values of the first p − 1 columns known and show that we can compute the values of the pth column.

First column:

A_i1 = Σ_{k=1}^{n} C_ik C^T_k1,    i = 1, ..., n.

Expanding,

A_i1 = C_i1 C^T_11 + C_i2 C^T_21 + ... + C_in C^T_n1,

and using C^T_ij = C_ji, we have

A_i1 = C_i1 C_11 + C_i2 C_12 + ... + C_in C_1n.

Because C is lower triangular, C_1k = 0 for k > 1, so all the terms on the right-hand side except the first vanish:

A_i1 = C_i1 C_11,

and solving for C_i1,

C_i1 = A_i1 / C_11.

For i = 1 we have

A_11 = C_11 C_11 = C_11²   →   C_11 = √A_11.

Because A is positive definite, A_11 > 0. We can then compute the values of the first column of C as

C_i1 = A_i1 / √A_11.

Now suppose we know the values of the first p − 1 columns of C; we want to show that we can compute the values of the pth column. We accomplish this by expanding a component of A in the pth column (see figure 1.6). That is,

A_ip = Σ_{k=1}^{n} C_ik C^T_kp,

and by definition of C^T,

A_ip = Σ_{k=1}^{n} C_ik C_pk,

Figure 1.6: Cholesky factorisation.

which is equivalent to multiplying rows p and i of matrix C (see figure 1.6). Notice that C_pk = 0 for k > p, therefore the upper limit of the sum can be reduced from n to p:

A_ip = Σ_{k=1}^{p} C_ik C_pk.

Expanding the last term,

A_ip = Σ_{k=1}^{p−1} C_ik C_pk + C_ip C_pp,

and solving for C_ip, we have

C_ip = ( A_ip − Σ_{k=1}^{p−1} C_ik C_pk ) / C_pp,    (1.6)

where C_pp can be calculated by setting i = p in the last equation:

C_pp = sqrt( A_pp − Σ_{k=1}^{p−1} C_pk C_pk ).    (1.7)

Equations 1.6 and 1.7 demonstrate that we can compute the matrix C for any positive definite A.
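A direct transcription of equations (1.6) and (1.7), column by column, might read as follows (a minimal sketch that assumes A is symmetric positive definite and performs no checking):

```python
import numpy as np

def cholesky(A):
    """Build the lower triangular C with A = C C^T,
    following equations (1.6) and (1.7)."""
    n = A.shape[0]
    C = np.zeros((n, n))
    for p in range(n):
        # Diagonal entry, equation (1.7)
        C[p, p] = np.sqrt(A[p, p] - C[p, :p] @ C[p, :p])
        # Entries below the diagonal, equation (1.6)
        for i in range(p + 1, n):
            C[i, p] = (A[i, p] - C[i, :p] @ C[p, :p]) / C[p, p]
    return C

A = np.array([[4., 2.], [2., 5.]])
C = cholesky(A)                  # [[2, 0], [1, 2]]
print(np.allclose(C @ C.T, A))   # True
```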

1.4 Special Types of Matrices

Diagonally dominant matrix

A square matrix A ∈ R^{n×n} is strictly diagonally dominant if

|a_ii| > Σ_{j=1, j≠i}^{n} |a_ij|,    i = 1, ..., n.

Figure 1.7: Banded matrix.

A strictly diagonally dominant matrix is nonsingular. A symmetric diagonally dominant real matrix with nonnegative diagonal entries is positive semidefinite.

Positive definite

A matrix A ∈ R^{n×n} is positive definite if it is symmetric and x^T A x > 0 for all x ≠ 0, x ∈ R^n.

Banded

A matrix A is banded if there exist integers p and q, 1 < p, q < n, such that a_ij = 0 whenever j ≥ i + p or i ≥ j + q. In other words, the non-zero values of the matrix lie around the diagonal within a distance given by p and q. See figure 1.7.
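These definitions translate directly into small predicates; the following sketch (with illustrative helper names) tests strict diagonal dominance and measures the band parameters p and q:

```python
import numpy as np

def is_strictly_diagonally_dominant(A):
    """|a_ii| > sum over j != i of |a_ij|, for every row i."""
    d = np.abs(np.diag(A))
    off = np.abs(A).sum(axis=1) - d
    return bool(np.all(d > off))

def band_parameters(A):
    """Smallest p, q with a_ij = 0 whenever j >= i + p or i >= j + q."""
    i, j = np.nonzero(A)
    return int((j - i).max()) + 1, int((i - j).max()) + 1
```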

1.5 Exercises

1. Matrix and vector operations. Assume that α ∈ R, x, y, z ∈ R^n and A ∈ R^{n×n}; implement the following operations in computer code.

(a) Scalar-vector multiplication: z = αx, z_i = αx_i
(b) Vector addition: z = x + y, z_i = x_i + y_i
(c) Dot product (inner product): α = x · y = x^T y, α = Σ_i x_i y_i
(d) Componentwise vector multiplication: z = x ∗ y, z_i = x_i y_i
(e) Scalar-matrix multiplication: B = αA, B_ij = αA_ij
(f) Matrix addition: C = A + B, C_ij = A_ij + B_ij
(g) Matrix-vector multiplication: y = Ax, y_i = Σ_j a_ij x_j
(h) Matrix-matrix multiplication: let A ∈ R^{n×p} and B ∈ R^{p×m}; then C = AB, C_ij = Σ_{k=1}^{p} a_ik b_kj, and in colon notation C_ij = A(i, :) · B(:, j)

2. Given the following matrix

A = [4   1 ]
    [1  5/4]

find the Cholesky factorisation C such that CC^T = A.

3. Given b = (5, 9/4)^T, solve the system Ax = b using the Cholesky method and the factorisation found in the previous exercise.

4. The stress tensor [σ] for a two-dimensional case, represented by a 2 × 2 matrix, is given by

[σ] = λ tr(ε) [I] + 2μ [ε]    (1.8)

where [ε] is a 2 × 2 matrix that represents the strain tensor, [I] is the identity matrix, and λ and μ are the Lamé constants, which are related to the Young modulus E and the Poisson's ratio ν by

λ = νE / ((1 − 2ν)(1 + ν)),    μ = E / (2(1 + ν)).    (1.9)

The traction q (force per unit area) over a plane with normal e = (e_1, e_2) is given by the following expression:

q = [σ] e.    (1.10)

Write a computer program that, given the strain tensor ε, the material constants E and ν, and the plane normal vector e, computes the traction vector over the plane.

5. Given the following matrices L and S, show that L is the inverse of S:

L = [  1         ]       S = [ 1        ]
    [−l_21  1    ]           [l_21  1   ]
    [−l_31  0   1]           [l_31  0  1]

6. Write the system Ax = b corresponding to the following plot.

[Plot: two lines intersecting at the point (2, 5)]

7. Show that if A = CC^T then A is symmetric.

8. A matrix A is positive definite if x^T A x > 0 whenever x ≠ 0. Show that the matrix A defined by

A = [a 0]
    [0 c]

is positive definite if and only if a and c are greater than zero.

9. Let A be a positive definite and symmetric matrix in R^{n×n}. Compute the number of operations (multiplications and divisions) needed to accomplish the Cholesky factorisation A = CC^T. Assume that the matrix is stored in complete form (lower and upper entries) and that its bandwidth equals b. The number of operations should be expressed in terms of n and b.

Chapter 2

Iterative Methods

2.1 Vector Space

A vector space E is a set with two operations defined on it: addition and scalar multiplication. Additionally, if x, y, and z belong to E and α and β are real numbers, then the following axioms are satisfied.

(E1) Commutativity: x + y = y + x
(E2) Associativity: (x + y) + z = x + (y + z)
(E3) Identity element: there is an element in E, denoted by 0, such that 0 + x = x + 0 = x
(E4) Inverse element: for each x in E there is an element −x in E such that x + (−x) = (−x) + x = 0
(E5) Distributivity of scalar multiplication with respect to scalar addition: (α + β)x = αx + βx
(E6) Distributivity of scalar multiplication with respect to vector addition: α(x + y) = αx + αy
(E7) Compatibility of scalar multiplication with scalar multiplication: α(βx) = (αβ)x
(E8) Identity element of scalar multiplication: 1 · x = x

2.2 Vector Norms

A vector norm on E is a function N = ‖·‖ defined from E to R with the following properties. Let x, y ∈ E and α ∈ R; then

i) N(x) ≥ 0, ∀x ∈ E
ii) N(x) = 0 if and only if x = (0, 0, ..., 0)^T = 0
iii) N(αx) = |α| N(x)
iv) N(x + y) ≤ N(x) + N(y).

2.2.1 Distance between two vectors

Let x, y ∈ E. The distance d ∈ R between two vectors with respect to the norm ‖·‖ is defined as

d = N(x − y);

then d is referred to as a metric for R^n.

Properties of a metric
Let x, y, z ∈ E; then

i) d(x, y) ≥ 0, and d(x, y) = 0 if and only if x = y
ii) d(x, y) = d(y, x)
iii) d(x, y) ≤ d(x, z) + d(z, y).

2.2.2 Some norms in R^n

The norms l_1, l_2, and l_∞ are defined as follows:

‖x‖_1 = Σ_i |x_i|,
‖x‖_2 = { Σ_i x_i² }^{1/2},
‖x‖_∞ = max_{1≤i≤n} |x_i|.
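Computed directly, for instance (a small NumPy sketch):

```python
import numpy as np

def norms(x):
    """The l1, l2 and l-infinity norms defined above."""
    x = np.asarray(x, dtype=float)
    return np.abs(x).sum(), np.sqrt((x ** 2).sum()), np.abs(x).max()

print(norms([1., 1., 2.]))   # (4.0, 2.449..., 2.0), cf. Example 2.2
```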

Example 2.1
Show that ‖x‖_∞ is a norm.
Properties i) to iii) are straightforward and are left as an exercise. For property iv) we have

‖x + y‖_∞ = max_i |x_i + y_i| ≤ max_i (|x_i| + |y_i|) ≤ max_i |x_i| + max_i |y_i| = ‖x‖_∞ + ‖y‖_∞.

Example 2.2
Find the l_1, l_2 and l_∞ norms of the following vectors:

a. x = (1, 1, 2)^T
b. x = (3, 4, 0, 3/2)^T
c. x = (sin k, cos k, 2^k)^T, k ∈ Z⁺.

For case a. we have

‖x‖_1 = 1 + 1 + 2 = 4,
‖x‖_2 = sqrt(1² + 1² + 2²) = sqrt(6) = 2.449,

and

‖x‖_∞ = max {|1|, |1|, |2|} = 2.


Example 2.3
Given x = (1, 1, 1) and y = (1.2, 1, 1), find the distance between x and y.
Ans.

‖x − y‖_2 = 0.2    (2.1)
‖x − y‖_∞ = 0.2    (2.2)

Example 2.4
In an experiment, theory predicts that the solution to a problem is x = (1, 1, 1), but the experimental result is x_e = (1.001, 0.989, 0.93). Find the distance between the two results using l_∞ and l_2.

l_∞:  d = ‖(1, 1, 1) − (1.001, 0.989, 0.93)‖_∞ = ‖(−0.001, 0.011, 0.07)‖_∞ = 0.07

l_2:  d = ‖(−0.001, 0.011, 0.07)‖_2 = sqrt(0.001² + 0.011² + 0.07²) = 0.070866
Some norms, such as the l_2 norm, require a bit more work to be proved. The following theorem is an intermediate result necessary to prove that l_2 is a well defined norm.

Theorem 2.2.1 (The Cauchy-Buniakowsky-Schwarz inequality). For all x, y ∈ R^n,

Σ_{i=1}^{n} |x_i y_i| ≤ { Σ_{i=1}^{n} x_i² }^{1/2} { Σ_{i=1}^{n} y_i² }^{1/2}.

Proof. If y = 0 or x = 0, the result is straightforward because both sides of the inequality are zero.

Suppose then that y ≠ 0 and x ≠ 0. For each λ ∈ R we have

0 ≤ ‖x − λy‖_2² = Σ_{i=1}^{n} (x_i − λy_i)² = Σ x_i² − 2λ Σ x_i y_i + λ² Σ y_i²,

so that

2λ Σ x_i y_i ≤ Σ x_i² + λ² Σ y_i² = ‖x‖_2² + λ² ‖y‖_2².

Because ‖x‖_2 > 0 and ‖y‖_2 > 0, the last inequality holds in particular for λ = ‖x‖_2 / ‖y‖_2. Then

2 (‖x‖_2/‖y‖_2) Σ x_i y_i ≤ ‖x‖_2² + (‖x‖_2²/‖y‖_2²) ‖y‖_2² = 2‖x‖_2².

Dividing by 2‖x‖_2/‖y‖_2,

Σ x_i y_i ≤ ‖x‖_2 ‖y‖_2.

Define x̃ as the vector whose components are x̃_i = ±x_i, with the sign chosen so that x̃_i y_i ≥ 0. It is true that ‖x̃‖_2 = ‖x‖_2, and

Σ |x_i y_i| = Σ x̃_i y_i ≤ ‖x̃‖_2 ‖y‖_2 = ‖x‖_2 ‖y‖_2,

that is,

Σ_{i=1}^{n} |x_i y_i| ≤ { Σ |x_i|² }^{1/2} { Σ |y_i|² }^{1/2}.

Example 2.5
Prove that ‖·‖_2 is a well defined norm, that is:

i. ‖x‖_2 ≥ 0, and ‖x‖_2 = 0 ⇔ x = 0
ii. ‖αx‖_2 = |α| ‖x‖_2
iii. ‖x + y‖_2 ≤ ‖x‖_2 + ‖y‖_2

Properties i. and ii. are immediate. To prove iii. we can use the Cauchy-Buniakowsky-Schwarz inequality. Let x, y ∈ R^n:

‖x + y‖_2² = Σ (x_i + y_i)² = Σ (x_i² + 2x_i y_i + y_i²)
           = Σ x_i² + 2 Σ x_i y_i + Σ y_i²
           ≤ Σ x_i² + 2‖x‖_2 ‖y‖_2 + Σ y_i²
           = ‖x‖_2² + 2‖x‖_2 ‖y‖_2 + ‖y‖_2²
           = (‖x‖_2 + ‖y‖_2)².

Then ‖x + y‖_2² ≤ (‖x‖_2 + ‖y‖_2)², and therefore

‖x + y‖_2 ≤ ‖x‖_2 + ‖y‖_2.

2.3 Sequence of Vectors

A sequence of vectors {x^(k)}_{k=1}^{∞} converges to x in R^n if ∀ε > 0 there exists N(ε) such that

‖x^(k) − x‖ < ε,    ∀k > N(ε).

Theorem 2.3.1. A sequence of vectors {x^(k)} converges to x ∈ R^n with respect to ‖·‖_∞ if and only if

lim_{k→∞} x_i^(k) = x_i,    ∀i ≤ n.

Proof. It consists of two parts.

a. Suppose that {x^(k)}_{k=1}^{∞} converges to x with respect to the ‖·‖_∞ norm. Then, given any ε > 0, there exists an integer N(ε) such that for all k ≥ N(ε),

‖x^(k) − x‖_∞ < ε,

that is,

max_i |x_i^(k) − x_i| < ε.

This implies that |x_i^(k) − x_i| < ε for all i, and therefore

lim_{k→∞} x_i^(k) = x_i.

b. Suppose that lim_{k→∞} x_i^(k) = x_i for all i = 1, 2, ..., n. That is, for a given ε > 0 there exists N_i(ε) such that |x_i^(k) − x_i| < ε whenever k > N_i(ε). Now define N(ε) = max_i N_i(ε). Then, if k ≥ N(ε), we have |x_i^(k) − x_i| < ε for each i and, in particular,

max_i |x_i^(k) − x_i| < ε,    (2.3)

so that

‖x^(k) − x‖_∞ < ε,    (2.4)

which implies that {x^(k)}_{k=1}^{∞} converges to x.

Example 2.6
Show that the following vector sequence converges:

x^k = (1, 2 + 1/k, 3/k², e^{−k} sin(k)).

According to the theorem, to show that the vector x^k converges with respect to the norm l_∞ it is enough to show that each of its components converges. We have

lim 1 = 1,    lim (2 + 1/k) = 2,    lim 3/k² = 0,    lim e^{−k} sin(k) = 0;

as each component converges, we conclude that x^k converges.

2.3.1 Equivalent norms

Definition: Let N_1 and N_2 be two norms on a vector space E. These norms are equivalent norms if there exist positive real numbers c and d such that

c N_1(x) ≤ N_2(x) ≤ d N_1(x)

for all x ∈ E. An equivalent condition is that there exists a number C > 0 such that

(1/C) N_1(x) ≤ N_2(x) ≤ C N_1(x)

for all x ∈ E. To see the equivalence, set C = max{1/c, d}.

Some key results are as follows:

i. On a finite dimensional vector space all norms are equivalent. The same is not true for vector spaces of infinite dimension (Kreyszig, 1978).
ii. It follows that on a finite dimensional vector space one can check the convergence of a sequence with respect to any norm. If a sequence converges in one norm, it converges in all norms.

iii. If two norms are equivalent on a vector space E, they induce the same topology on E (Kreyszig, 1978).

Theorem 2.3.2. Let x ∈ R^n; then the norms l_2 and l_∞ are equivalent in R^n:

‖x‖_∞ ≤ ‖x‖_2 ≤ √n ‖x‖_∞.

Graphically this result is illustrated in figure 2.1 for the n = 2 case.

Proof. Let us select x_j as the maximum component of x, that is, |x_j| = max_i |x_i|. It follows that

‖x‖_∞² = |x_j|² = x_j² ≤ Σ_{i=1}^{n} x_i² = ‖x‖_2².

Additionally,

‖x‖_2² = Σ_{i=1}^{n} x_i² ≤ Σ_{i=1}^{n} x_j² = n x_j² = n ‖x‖_∞².

Figure 2.1: Equivalence of the l_∞ and l_2 norms.

Example 2.7
Given

x^k = (1, 2 + 1/k, 3/k², e^{−k} sin(k)),

show that x^k converges to x = (1, 2, 0, 0) with respect to the norm ‖·‖_2.

Solution. We already proved that this sequence of vectors converges with respect to the l_∞ norm. Therefore, given any ε ∈ R, ε > 0, there exists an integer N(ε/2) with the property that

‖x^k − x‖_∞ < ε/2

whenever k > N(ε/2). Using the result of theorem 2.3.2 (here n = 4),

‖x^k − x‖_2 ≤ √4 ‖x^k − x‖_∞ < 2 (ε/2) = ε

whenever k > N(ε/2); therefore x^k converges to x with respect to ‖·‖_2.

Note: It can be shown that all the norms of R^n are equivalent with respect to convergence.

2.4 Inner Product

An inner product is a way to multiply vectors together such that the result of the multiplication is a scalar number. For a vector space V, an inner product is a map ⟨·, ·⟩ : V × V → R that satisfies the following properties. Let x, y, and z be elements in V and α be a scalar; then:

i. ⟨x, y⟩ = ⟨y, x⟩
ii. ⟨αx, y⟩ = α ⟨x, y⟩
iii. ⟨x + z, y⟩ = ⟨x, y⟩ + ⟨z, y⟩
iv. ⟨x, x⟩ ≥ 0
v. ⟨x, x⟩ = 0 ⇔ x = 0

A vector space together with an inner product on it is called an inner product space. This definition also applies to an abstract vector space over any field.

Examples of inner product spaces include:

a. The Euclidean space R^n, where the inner product is given by the dot product

⟨(x_1, x_2, ..., x_n), (y_1, y_2, ..., y_n)⟩ = Σ_{i=1}^{n} x_i y_i.

b. The vector space of real functions whose domain is a closed interval [a, b], with inner product

⟨f, g⟩ = ∫_a^b f g dx.

Every inner product space is a metric space, with the metric given by

g(v, w) = ⟨v − w, v − w⟩^{1/2}.    (2.6)

If this process results in a complete metric space (a metric space in which every Cauchy sequence is convergent), it is called a Hilbert space.

2.5 Matrix Norms

Let A, B ∈ R^{n×n} and α ∈ R. A matrix norm in R^{n×n} is a function ‖·‖ defined from R^{n×n} to R with the following properties:

i) ‖A‖ ≥ 0
ii) ‖A‖ = 0 if and only if A = 0 (the zero matrix)
iii) ‖αA‖ = |α| ‖A‖
iv) ‖A + B‖ ≤ ‖A‖ + ‖B‖
v) ‖AB‖ ≤ ‖A‖ ‖B‖

The distance between two matrices can be defined in the usual way as d = ‖A − B‖.
Figure 2.2: Geometric effect of multiplying a matrix A by a set of vectors x with norm equal to one.

2.5.1 Natural norms

A natural matrix norm is derived from a vector norm. To understand how a vector norm can be used to define a matrix norm, let us first observe the geometric effect of matrix-vector multiplication. When a matrix A is multiplied by a vector x, the result is a new vector which is rotated and scaled in comparison with the original vector x. Figure 2.2 illustrates this transformation for a set of vectors x ∈ R² with Euclidean norm equal to one. This set of vectors represents a circle. When the operator A is applied to the vectors, the original circle is transformed into an ellipse. All vectors are scaled by different factors. If we choose a factor C large enough to be larger than the maximum scale factor, then we can affirm that

‖Ax‖ ≤ C ‖x‖.

If C is the smallest number for which the inequality holds for all x, that is, C is the maximum factor by which A can stretch a vector, then ‖A‖ is defined as the supremum of this ratio over all vectors:

‖A‖ = sup C = sup_{x≠0} ‖Ax‖ / ‖x‖,

or equivalently

‖A‖ = sup_{‖x‖=1} ‖Ax‖.

Figure 2.3: Examples of norms.

Theorem 2.5.1. If ‖·‖ is a vector norm in R^n, then

‖A‖ = max_{‖x‖=1} ‖Ax‖

is a matrix norm, called the natural norm.

Proof. In the following proof we use Einstein notation to avoid the overuse of the summation symbol: the symbol Σ is suppressed and summation is assumed over repeated indices, so that a_ij x_j stands for Σ_j a_ij x_j.

i. ‖A‖ ≥ 0:

‖Ax‖ = ‖(a_1j x_j, ..., a_ij x_j, ..., a_nj x_j)^T‖ > 0 if and only if a_ij x_j ≠ 0;

because ‖x‖ ≠ 0, a_ij x_j = 0 for every x if and only if a_ij = 0.

ii. ‖αA‖ = |α| ‖A‖:

‖αA‖ = max_{‖x‖=1} ‖α a_ij x_j‖ = |α| max_{‖x‖=1} ‖a_ij x_j‖ = |α| ‖A‖.

iii. ‖A + B‖ ≤ ‖A‖ + ‖B‖:

‖A + B‖ = max_{‖x‖=1} ‖(a_ij + b_ij) x_j‖ = max_{‖x‖=1} ‖a_ij x_j + b_ij x_j‖
        ≤ max_{‖x‖=1} (‖a_ij x_j‖ + ‖b_ij x_j‖) ≤ ‖A‖ + ‖B‖.

Example 2.8
It can be shown that if A = (a_ij), then ‖A‖_∞ can be calculated as

‖A‖_∞ = max_i Σ_{j=1}^{n} |a_ij|

(see exercise 4). If A is defined as

A = [1  2 −1]
    [0  3 −1]
    [5 −1  1]

then

Σ_j |a_1j| = 4,    Σ_j |a_2j| = 4,    Σ_j |a_3j| = 7,

and therefore ‖A‖_∞ = 7.
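A quick numerical check of this row-sum formula (a NumPy sketch):

```python
import numpy as np

# The matrix of Example 2.8: the natural infinity norm
# is the maximum absolute row sum.
A = np.array([[1., 2., -1.],
              [0., 3., -1.],
              [5., -1., 1.]])
print(np.abs(A).sum(axis=1).max())   # 7.0
print(np.linalg.norm(A, np.inf))     # 7.0, NumPy's built-in
```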

2.6 Eigenvalues

For any square matrix A we can look for vectors x that are in the same direction as Ax. These vectors are called eigenvectors. Multiplication by a matrix A normally changes the direction of a vector, but for certain exceptional vectors Ax is a multiple of x, that is,

Ax = λx.

In this way, the effect of multiplying x by A is to stretch, contract, or reverse x by a factor λ. This is illustrated in figure 2.4.

Figure 2.4: Multiplication of matrix A by one of these special vectors only changes its magnitude.
Example 2.9
The matrix

A = [3 0 0]
    [0 2 0]
    [0 0 1]

has the following eigenvalues and eigenvectors:

λ_1 = 3, x_1 = (1, 0, 0)^T;    λ_2 = 2, x_2 = (0, 1, 0)^T;    λ_3 = 1, x_3 = (0, 0, 1)^T.

Additionally, any vector in the space can be written as a linear combination of the eigenvectors (this is possible here because the eigenvalues are all different):

y = α_1 x_1 + α_2 x_2 + α_3 x_3,

with α_i ∈ R and x_i ∈ R³. If we apply A to the vector y, we have

Ay = A(α_1 x_1 + α_2 x_2 + α_3 x_3)
   = α_1 Ax_1 + α_2 Ax_2 + α_3 Ax_3
   = α_1 λ_1 x_1 + α_2 λ_2 x_2 + α_3 λ_3 x_3.

The action of A on any vector y is still determined by the eigenvectors.
Diagonal matrices are certainly the simplest: the eigenvalues are the diagonal entries themselves and the eigenvectors are in the directions of the Cartesian axes. For other matrices we find the eigenvalues, and then the eigenvectors, in the following form. If Ax = λx, then (A − λI)x = 0, and because x ≠ 0, the matrix A − λI has dependent columns and its determinant must be zero; in other words, shifting the matrix by λI makes it singular.
Example 2.10
Let

A = [ 2 −1]
    [−1  2],

then

A − λI = [2 − λ   −1  ]
         [ −1    2 − λ].    (2.7)

The eigenvalues of A are the numbers λ that make the determinant of A − λI equal to zero:

det(A − λI) = (2 − λ)² − 1 = 0
            = λ² − 4λ + 3 = 0
            = (λ − 1)(λ − 3) = 0,

which leads to λ_1 = 1 and λ_2 = 3. Replacing λ_1 = 1 into equation 2.7,

A − λ_1 I = [ 1 −1]
            [−1  1],

and because (A − λI)x^(1) = 0,

[ 1 −1] [x_x^(1)]   [0]
[−1  1] [x_y^(1)] = [0],

which leads to

x^(1) = [1]
        [1].

For λ_2 = 3,

A − λ_2 I = [−1 −1]
            [−1 −1],

and

x^(2) = [ 1]
        [−1].

In the case n = 2 the characteristic polynomial is quadratic and there is an exact formula for computing its roots. For n = 3 and n = 4 the characteristic polynomials are of order 3 and 4, for which there also exist formulas for computing the roots. For n > 4 there is no such formula (and there never will be). Numerical methods must be used in those cases. See (Golub and Loan, 1996) for a detailed presentation of such algorithms.
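In practice one calls a library routine. For instance, a quick NumPy check of Example 2.10 (shown only as an illustration of such numerical methods):

```python
import numpy as np

A = np.array([[2., -1.], [-1., 2.]])
values, vectors = np.linalg.eig(A)
print(values)    # [3. 1.] (the order may differ)
print(vectors)   # columns are normalised eigenvectors, e.g. (1, -1)/sqrt(2)
```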

2.6.1 Applications

There is probably no other mathematical property with wider physical application than the eigenvalues and eigenvectors of a matrix.

In solid mechanics, the stress tensor represents the state of internal forces at a point in a continuum. The actual force over a surface is given by the multiplication of the stress matrix by a vector normal to the surface. In this way, depending on the orientation of the plane, there will be different components of shear or tension stress. The directions given by the eigenvectors of the stress tensor represent the principal directions of stress, in which the tension stress is maximum and the shear stress is zero.

A second example is found in vibration analysis. A dynamic system can be identified in terms of the mass and spring constants of its components. When the equations of motion for the system are written, the result is a set of differential equations that can be written in matrix form. The solution of these equations leads to the conclusion that the eigenvectors of the matrix identifying the system correspond to the natural vibration modes of the dynamic system (Clough and Penzien, 1993).

Principal component analysis is a well established method used to explain the relationship between multidimensional data (n observations) and a lower number of variables m. The method finds the covariance matrix that relates the data and computes its eigenvectors and eigenvalues. The eigenvectors represent the principal components of the data analysed (Fukunaga, 1990).

There are many other applications, in fields as varied as web search engines (the famous Google PageRank; Berry and Browne, 2005; Vise and Malseed).

2.7 Iterative Methods

The first part of this section introduces some basic concepts essential to the understanding of iterative methods in linear algebra.

2.7.1 Spectral Radius

The spectral radius of a matrix A is defined as ρ(A) = max |λ|, where λ ranges over the eigenvalues of A.

Theorem 2.7.1 (Norm of Spectral Radius). If A is an n × n matrix, then:

i. ‖A‖_2 = sqrt(ρ(A^T A));
ii. ρ(A) ≤ ‖A‖ for any natural norm ‖·‖.

Proof. The first part is left as an exercise. To prove the second part, suppose that λ is an eigenvalue of A with associated eigenvector x, with ‖x‖ = 1. Then

|λ| = |λ| ‖x‖ = ‖λx‖ = ‖Ax‖ ≤ ‖A‖ ‖x‖ = ‖A‖;

therefore, taking the eigenvalue of maximum modulus,

ρ(A) = max |λ| ≤ ‖A‖.

2.7.2 Convergent Matrix

A matrix A of dimensions n × n is called convergent if

lim_{k→∞} (A^k)_{ij} = 0,    i, j = 1, ..., n.

Theorem 2.7.2 (Convergent Matrix). For a given n × n matrix A, the following statements are equivalent:

i. A is convergent;
ii. lim_{n→∞} ‖A^n‖ = 0 for some natural norm;
iii. lim_{n→∞} ‖A^n‖ = 0 for every natural norm;
iv. ρ(A) < 1;
v. lim_{n→∞} A^n x = 0 for every x.

Proof. The proof of this theorem can be found in (Isaacson and Keller, 1966), page 14.

2.7.3 General characteristics of Iterative Methods

An iterative method generates a sequence of vectors in the following way:

x^(k) = T x^(k−1) + c.

Examples of iterative methods include the Jacobi and Gauss-Seidel methods and, more recently, the Conjugate Gradient method (Hestenes and Stiefel, 1952), GMRES (Saad and Schultz, 1986), and many others. The development of, and interest in, iterative methods has increased greatly over the years. While they do not seem to be efficient for small systems of equations, they are a complete success for large linear systems. The following results are useful to understand when they converge.
Theorem 2.7.3. If the spectral radius ρ(T) satisfies ρ(T) < 1, then (I − T)⁻¹ exists and

(I − T)⁻¹ = I + T + T² + ... = Σ_{j=0}^{∞} T^j.

Proof. As Tx = λx holds for an eigenvalue λ and its eigenvector x, multiplying by −1 and adding x to both sides we have

x − Tx = x − λx,
(I − T)x = (1 − λ)x,

meaning that (1 − λ) is an eigenvalue of the matrix (I − T). But |λ| ≤ ρ(T) < 1, so λ = 1 is not an eigenvalue of T, and hence 0 is not an eigenvalue of (I − T). Therefore (I − T)x ≠ 0 for x ≠ 0, which means (I − T)⁻¹ exists.

Let S_m = I + T + T² + ... + T^m; then

(I − T)S_m = (I + T + T² + ... + T^m) − (T + T² + ... + T^{m+1}) = I − T^{m+1}.

As T is convergent,

lim_{m→∞} (I − T)S_m = lim_{m→∞} (I − T^{m+1}) = I,

and therefore

(I − T)⁻¹ = lim_{m→∞} S_m = I + T + T² + ... = Σ_{j=0}^{∞} T^j.

Theorem 2.7.4. For all x^(0) ∈ R^n, the sequence {x^(k)}_{k=0}^{∞} defined by

x^(k) = T x^(k−1) + c,    k ≥ 1,

converges to the unique solution of x = Tx + c if and only if ρ(T) < 1.

Proof. First suppose that ρ(T) < 1. Then

x^(k) = T x^(k−1) + c,

but in turn x^(k−1) is expressed in terms of x^(k−2), and so forth:

x^(k) = T(T x^(k−2) + c) + c
      = T² x^(k−2) + Tc + c
      = T² x^(k−2) + (T + I)c
      ...
      = T^k x^(0) + (T^{k−1} + T^{k−2} + ... + I)c
      = T^k x^(0) + S_{k−1} c.

In the limit,

lim_{k→∞} x^(k) = 0 + lim_{k→∞} S_{k−1} c,

because T is convergent. Additionally,

lim_{k→∞} S_{k−1} = (I − T)⁻¹.

Therefore

lim_{k→∞} x^(k) = (I − T)⁻¹ c,

so the limit x satisfies x = (I − T)⁻¹ c, that is, (I − T)x = c, or x − Tx = c.

Now the opposite direction: if x^(k) = T x^(k−1) + c converges to x with x = Tx + c, then ρ(T) < 1.

For an arbitrary initial guess x^(0), define z = x − x^(0). For the exact value x we have

x = Tx + c,

from which we subtract

x^(k) = T x^(k−1) + c

to obtain

x − x^(k) = Tx − T x^(k−1) = T(x − x^(k−1)).

Using this relation recursively for x − x^(k−1),

x − x^(k) = T(T(x − x^(k−2))) = T²(x − x^(k−2)) = ... = T^k (x − x^(0)) = T^k z.

Because the sequence converges,

T^k z → 0 as k → ∞

for every z; then T^k converges to zero, which holds if and only if ρ(T) < 1.

2.7.4 Jacobi's Method

Matrix-vector multiplication can be expressed in index notation as

E_i:  Σ_j a_ij x_j = b_i,

which corresponds to the multiplication of the ith row of A by the vector x. Taking the ith term of the sum (a_ii x_i) apart from the rest of the sum, we have

E_i:  a_ii x_i + Σ_{j≠i} a_ij x_j = b_i.

Solving for x_i,

x_i = ( −Σ_{j≠i} a_ij x_j + b_i ) / a_ii,

which can be written as

x_i = Σ_j T_ij x_j + c_i    (2.8)

where

T_ij = 0 if i = j,    T_ij = −a_ij / a_ii if i ≠ j,    and    c_i = b_i / a_ii.

Equation 2.8 can be used to produce a sequence of the form

x^(k) = T x^(k−1) + c

with the stop condition

‖x^k − x^{k−1}‖ / ‖x^k‖ < ε.

If we define

(D)_ij = a_ij if i = j, and 0 if i ≠ j,
(L)_ij = −a_ij if i > j, and 0 if i ≤ j,
(U)_ij = −a_ij if i < j, and 0 if i ≥ j,

then

A = D − L − U,

and we can write the same result in compact matrix notation as

Ax = b
(D − L − U)x = b
Dx = (L + U)x + b
x = D⁻¹(L + U)x + D⁻¹b.
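A compact sketch of the resulting iteration (the function name is ours; a non-zero diagonal and a convergent iteration matrix are assumed):

```python
import numpy as np

def jacobi(A, b, x0=None, tol=1e-8, max_iter=500):
    """Jacobi iteration x^(k) = D^{-1}(L+U) x^(k-1) + D^{-1} b,
    stopping when ||x^k - x^{k-1}|| / ||x^k|| < tol."""
    n = len(b)
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float)
    d = np.diag(A)
    R = A - np.diagflat(d)         # off-diagonal part, R = -(L + U)
    for _ in range(max_iter):
        x_new = (b - R @ x) / d    # componentwise form of equation (2.8)
        if np.linalg.norm(x_new - x) / np.linalg.norm(x_new) < tol:
            return x_new
        x = x_new
    return x
```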

2.7.5 Gauß-Seidel

In the Jacobi method, at each iteration (k) the vector x^(k) is computed using the information of the previous computation, x^(k−1). However, as the components of x^(k) are computed one by one, when computing x_i^(k) one can use the freshly computed components x_j^(k) for j < i. That is, from equation 2.8 we have

x_i^(k) = ( −Σ_{j≠i} a_ij x_j^(k−1) + b_i ) / a_ii.

Dividing the sum in two,

x_i^(k) = ( −Σ_{j<i} a_ij x_j^(k−1) − Σ_{j>i} a_ij x_j^(k−1) + b_i ) / a_ii,

and replacing x_j^(k−1) by x_j^(k) in the first sum,

x_i^(k) = ( −Σ_{j<i} a_ij x_j^(k) − Σ_{j>i} a_ij x_j^(k−1) + b_i ) / a_ii.

This can be seen as an improvement over the Jacobi method. However, neither of these methods is famous for its convergence speed.
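The same sweep written out explicitly (an illustrative sketch; note that the iterate is updated in place, so fresh values are reused immediately):

```python
import numpy as np

def gauss_seidel(A, b, x0=None, tol=1e-8, max_iter=500):
    """Gauss-Seidel iteration: x_j^(k) for j < i is used as
    soon as it is available within the same sweep."""
    n = len(b)
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            s_new = A[i, :i] @ x[:i]          # fresh values, j < i
            s_old = A[i, i + 1:] @ x[i + 1:]  # previous values, j > i
            x[i] = (b[i] - s_new - s_old) / A[i, i]
        if np.linalg.norm(x - x_old) / np.linalg.norm(x) < tol:
            break
    return x
```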

Theorem 2.7.5 (Symmetric Matrix). If A is symmetric, then

y^T A x = x^T A y.

Proof.

y^T (Ax) = Σ_i y_i Σ_j a_ij x_j = Σ_i Σ_j y_i a_ij x_j.

Since A = A^T, a_ij = a_ji; then

y^T (Ax) = Σ_i Σ_j y_i a_ji x_j = Σ_j x_j Σ_i a_ji y_i = x^T (Ay),

which completes the proof.
Positive definite matrix

A symmetric matrix A ∈ R^{n×n} is said to be positive definite if

⟨x, Ax⟩ > 0,    ∀x ∈ R^n, x ≠ 0.

Theorem 2.7.6. If A is a positive definite matrix, then the quadratic

P(x) = (1/2) ⟨x, Ax⟩ − ⟨x, b⟩

is minimised at the point where Ax = b, and the minimum value is

P(A⁻¹b) = −(1/2) ⟨b, A⁻¹b⟩.

Proof. For the case n = 1 it is quite simple:

P(x) = (1/2) a x² − b x,
dP(x)/dx = a x − b = 0   →   a x = b.

If a is positive, then P(x) is a parabola which opens upward.

For n > 1, suppose x is the solution of Ax = b; we want to show that at any other point y, P(y) is larger than P(x):

P(y) − P(x) = (1/2)⟨y, Ay⟩ − ⟨y, b⟩ − (1/2)⟨x, Ax⟩ + ⟨x, b⟩.

Replacing b = Ax, we have

P(y) − P(x) = (1/2)⟨y, Ay⟩ − ⟨y, Ax⟩ − (1/2)⟨x, Ax⟩ + ⟨x, Ax⟩
            = (1/2)⟨y, Ay⟩ − ⟨y, Ax⟩ + (1/2)⟨x, Ax⟩.

Because A is symmetric,

⟨y, Ax⟩ = (1/2)⟨y, Ax⟩ + (1/2)⟨x, Ay⟩;

then

P(y) − P(x) = (1/2)⟨y, Ay⟩ − (1/2)⟨y, Ax⟩ − (1/2)⟨x, Ay⟩ + (1/2)⟨x, Ax⟩
            = (1/2)⟨y, (Ay − Ax)⟩ − (1/2)⟨x, (Ay − Ax)⟩
            = (1/2)⟨y, A(y − x)⟩ − (1/2)⟨x, A(y − x)⟩
            = (1/2)⟨(y − x), A(y − x)⟩.

Since A is positive definite, the last expression can never be negative; it is equal to zero only if y = x. Therefore P(y) is larger than P(x) and the minimum occurs at x = A⁻¹b:

P_min = (1/2)⟨A⁻¹b, A(A⁻¹b)⟩ − ⟨A⁻¹b, b⟩
      = (1/2)⟨A⁻¹b, b⟩ − ⟨A⁻¹b, b⟩
      = −(1/2)⟨A⁻¹b, b⟩.

In conclusion, minimising P(x) is equivalent to solving Ax = b. Figure 2.5 illustrates this result in R²: P(x) represents a paraboloid facing upward, with its minimum value at x = A⁻¹b.

Figure 2.5: Two dimensional representation of P(x)

2.7.6 Steepest Descent

One of the simplest strategies to minimise $P(x)$ is the Steepest Descent method. At a given point $x_c$ the function $P(x)$ decreases most rapidly in the direction of the negative gradient, $-\nabla P(x_c) = b - Ax_c$. We call $r_c = b - Ax_c$ the residual of $x_c$. If the residual is non-zero, then there exists a positive $\alpha$ such that $P(x_c + \alpha r_c) < P(x_c)$. In the method of steepest descent (with exact line search) we set $\alpha = \langle r_c, r_c\rangle / \langle r_c, Ar_c\rangle$, thereby minimising $P(x_c + \alpha r_c)$.
To show this, let us expand P at the point $(x_c + \alpha r_c)$. If P(x) is defined as
$$P(x) = \frac12\langle x, Ax\rangle - \langle x, b\rangle,$$
then
$$P(x_c + \alpha r_c) = \frac12\langle x_c + \alpha r_c,\ A(x_c + \alpha r_c)\rangle - \langle x_c + \alpha r_c,\ b\rangle$$
$$= \frac12\left[\langle x_c, A(x_c + \alpha r_c)\rangle + \alpha\langle r_c, A(x_c + \alpha r_c)\rangle\right] - \langle x_c + \alpha r_c,\ b\rangle$$
$$= \frac12\left[\langle x_c, Ax_c\rangle + \alpha\langle x_c, Ar_c\rangle + \alpha\langle r_c, Ax_c\rangle + \alpha^2\langle r_c, Ar_c\rangle\right] - \langle x_c, b\rangle - \alpha\langle r_c, b\rangle.$$
Sorting terms, we have
$$P(x_c + \alpha r_c) = \frac12\langle x_c, Ax_c\rangle - \langle x_c, b\rangle + \frac{\alpha}{2}\left[\langle x_c, Ar_c\rangle + \langle r_c, Ax_c\rangle\right] + \frac{\alpha^2}{2}\langle r_c, Ar_c\rangle - \alpha\langle r_c, b\rangle.$$
Notice that the first two terms represent $P(x_c)$. Since A is symmetric, $\langle x_c, Ar_c\rangle = \langle r_c, Ax_c\rangle$, so
$$P(x_c + \alpha r_c) = P(x_c) + \alpha\langle x_c, Ar_c\rangle + \frac{\alpha^2}{2}\langle r_c, Ar_c\rangle - \alpha\langle r_c, b\rangle.$$
Replacing $b = r_c + Ax_c$,
$$P(x_c + \alpha r_c) = P(x_c) + \alpha\langle x_c, Ar_c\rangle + \frac{\alpha^2}{2}\langle r_c, Ar_c\rangle - \alpha\langle r_c, (r_c + Ax_c)\rangle.$$
Expanding the last term and using the symmetry of A again,
$$P(x_c + \alpha r_c) = P(x_c) + \alpha\langle x_c, Ar_c\rangle + \frac{\alpha^2}{2}\langle r_c, Ar_c\rangle - \alpha\langle r_c, r_c\rangle - \alpha\langle r_c, Ax_c\rangle = P(x_c) + \frac{\alpha^2}{2}\langle r_c, Ar_c\rangle - \alpha\langle r_c, r_c\rangle,$$
which is minimum when
$$\frac{d}{d\alpha}\left[\frac{\alpha^2}{2}\langle r_c, Ar_c\rangle - \alpha\langle r_c, r_c\rangle\right] = 0,$$
that is,
$$\alpha\langle r_c, Ar_c\rangle = \langle r_c, r_c\rangle;$$
therefore
$$\alpha = \frac{\langle r_c, r_c\rangle}{\langle r_c, Ar_c\rangle}.$$
Algorithm 3 summarises these results. Notice that in a practical computer implementation it is not necessary to store all the terms ($x^{(k)}$, $r^{(k)}$, and $\alpha^{(k)}$) in the sequence.
Algorithm 3 Steepest Descent
Set initial guess x^(0)
Set ε = small number
Compute r^(0) = b − Ax^(0)
k = 0
while ‖r^(k)‖ > ε do
    α^(k) = ⟨r^(k), r^(k)⟩ / ⟨r^(k), Ar^(k)⟩
    x^(k+1) = x^(k) + α^(k) r^(k)
    r^(k+1) = b − Ax^(k+1)
    k = k + 1
end while
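Algorithm 3 translates almost line by line into Python. The sketch below assumes, as the derivation requires, that A is symmetric positive definite; the function name and limits are illustrative.

import numpy as np

def steepest_descent(A, b, eps=1e-10, max_iter=10000):
    x = np.zeros_like(b, dtype=float)    # initial guess x^(0)
    r = b - A @ x                        # initial residual r^(0)
    k = 0
    while np.linalg.norm(r) > eps and k < max_iter:
        alpha = (r @ r) / (r @ (A @ r))  # exact line search along r
        x = x + alpha * r
        r = b - A @ x                    # residual of the new iterate
        k += 1
    return x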

2.7.7 Conjugate Gradient

This is an improvement over the previous method. The goal, as before, is to minimise the function $P(x)$ defined by
$$P(x) = \frac12\langle x, Ax\rangle - \langle x, b\rangle. \qquad (2.9)$$
The general search directions are defined by an arbitrary direction vector $v$ as
$$x^{(k)} = x^{(k-1)} + t^{(k)} v^{(k)} \qquad (2.10)$$


and this search direction will be optimum for $t$ given by
$$t^{(k)} = \frac{\langle v^{(k)}, r^{(k-1)}\rangle}{\langle v^{(k)}, Av^{(k)}\rangle} \qquad (2.11)$$
where $r$ is the residual. This result can easily be demonstrated using the same procedure used in the Steepest Descent to find the optimum when the search direction is the negative of the gradient.
In contrast to the Steepest Descent method, the search directions in the Conjugate Gradient method are not based on local information only; instead, the method accounts for all the previous search directions, taking care that the new direction be A-orthogonal to all the previous ones. This will have some advantages that will be mentioned, but first let us define the meaning of A-orthogonality.

A-orthogonality
A set of vectors $V = \{v^{(1)}, v^{(2)}, \ldots, v^{(n)}\}$ is called A-orthogonal with respect to a matrix $A$ if
$$\langle v^{(i)}, Av^{(j)}\rangle = 0 \quad \text{if } i \neq j. \qquad (2.12)$$
As a result, the set of vectors $V$ is linearly independent.
Theorem 2.7.7. Given $V = \{v^{(1)}, v^{(2)}, \ldots, v^{(n)}\}$ a set of A-orthogonal vectors in $\mathbb{R}^n$, and given $x^{(0)}$ an arbitrary vector in $\mathbb{R}^n$, the sequence
$$x^{(k)} = x^{(k-1)} + t^{(k)} v^{(k)},$$
with $t^{(k)}$ as in (2.11), will converge to the exact solution of $Ax = b$ in $n$ iterations, assuming exact arithmetic.
Proof. A proof can be found in (Burden, 1985).
Theorem 2.7.8. The residual vectors $r^{(k)}$, with $k = 1, 2, \ldots, n$, for the conjugate gradient method satisfy
$$\langle r^{(k)}, v^{(j)}\rangle = 0 \quad \text{for } j = 1, 2, \ldots, k.$$
Proof. A proof can be found in reference (Burden, 1985).
Given that the residual $r^{(k-1)}$ is orthogonal to all the previous search directions $v^{(j)}$ ($j = 1, 2, \ldots, k-2$), $v^{(k)}$ can be constructed in terms of $r^{(k-1)}$ and $v^{(k-1)}$ as
$$v^{(k)} = r^{(k-1)} + s^{(k-1)} v^{(k-1)}. \qquad (2.13)$$
Selecting this direction also implies that any two different residuals from the sequence will be orthogonal; see (Shewchuk, 1994) for a proof. As we want $v^{(k)}$ to be A-orthogonal to all the previous directions, it should satisfy
$$\langle v^{(k-1)}, Av^{(k)}\rangle = 0$$


and replacing $v^{(k)}$ by (2.13),
$$0 = \langle v^{(k-1)}, Av^{(k)}\rangle = \langle v^{(k-1)},\ A(r^{(k-1)} + s^{(k-1)}v^{(k-1)})\rangle = \langle v^{(k-1)}, Ar^{(k-1)}\rangle + s^{(k-1)}\langle v^{(k-1)}, Av^{(k-1)}\rangle;$$
solving for $s^{(k-1)}$,
$$s^{(k-1)} = -\frac{\langle v^{(k-1)}, Ar^{(k-1)}\rangle}{\langle v^{(k-1)}, Av^{(k-1)}\rangle}. \qquad (2.14)$$

To compute $x^{(k)}$ we first need to compute $t^{(k)}$ from (2.11):
$$t^{(k)} = \frac{\langle v^{(k)}, r^{(k-1)}\rangle}{\langle v^{(k)}, Av^{(k)}\rangle} = \frac{\langle r^{(k-1)} + s^{(k-1)}v^{(k-1)},\ r^{(k-1)}\rangle}{\langle v^{(k)}, Av^{(k)}\rangle} = \frac{\langle r^{(k-1)}, r^{(k-1)}\rangle + s^{(k-1)}\langle v^{(k-1)}, r^{(k-1)}\rangle}{\langle v^{(k)}, Av^{(k)}\rangle},$$
and by theorem 2.7.8, $\langle v^{(k-1)}, r^{(k-1)}\rangle = 0$; therefore
$$t^{(k)} = \frac{\langle r^{(k-1)}, r^{(k-1)}\rangle}{\langle v^{(k)}, Av^{(k)}\rangle}. \qquad (2.15)$$

Finally, the new residual is found as $r^{(k)} = b - Ax^{(k)}$. Equations (2.14), (2.13), (2.15), (2.10) can be used in that order to produce the sequence of $x^{(k)}$ vectors. However, a further simplification can be made for $s^{(k-1)}$, avoiding one matrix-vector multiplication. Let us first re-write the residual $r^{(k)} = b - Ax^{(k)}$ using equation (2.10):
$$r^{(k)} = b - Ax^{(k)} = b - A\left(x^{(k-1)} + t^{(k)}v^{(k)}\right),$$
that is,
$$r^{(k)} = r^{(k-1)} - t^{(k)}Av^{(k)}. \qquad (2.16)$$
Using this result in the inner product of $r^{(k)}$ with itself, and recalling that any two different residuals are orthogonal, we have
$$\langle r^{(k)}, r^{(k)}\rangle = \langle r^{(k)},\ r^{(k-1)} - t^{(k)}Av^{(k)}\rangle = \langle r^{(k)}, r^{(k-1)}\rangle - t^{(k)}\langle r^{(k)}, Av^{(k)}\rangle = -t^{(k)}\langle r^{(k)}, Av^{(k)}\rangle.$$
Besides, we can get an expression for $\langle v^{(k)}, Av^{(k)}\rangle$ from equation (2.15):
$$\langle v^{(k)}, Av^{(k)}\rangle = \left(1/t^{(k)}\right)\langle r^{(k-1)}, r^{(k-1)}\rangle.$$


Replacing these two last results into equation (2.14) for $s^{(k)}$:
$$s^{(k)} = -\frac{\langle v^{(k)}, Ar^{(k)}\rangle}{\langle v^{(k)}, Av^{(k)}\rangle} = -\frac{\langle r^{(k)}, Av^{(k)}\rangle}{\langle v^{(k)}, Av^{(k)}\rangle} = \frac{\left(1/t^{(k)}\right)\langle r^{(k)}, r^{(k)}\rangle}{\left(1/t^{(k)}\right)\langle r^{(k-1)}, r^{(k-1)}\rangle}.$$
That is,
$$s^{(k)} = \frac{\langle r^{(k)}, r^{(k)}\rangle}{\langle r^{(k-1)}, r^{(k-1)}\rangle}. \qquad (2.17)$$
Note that this equation now solves for $s^{(k)}$ instead of $s^{(k-1)}$; this changes the order of application of the equations used to produce the sequence for $x^{(k)}$. Before completing the algorithm, it is necessary to set the initial guess $x^{(0)}$ and the initial residual and search direction. The new order to evaluate a step in the iteration process is (2.15), (2.10), (2.16), (2.17), and (2.13). This is summarised in algorithm 4. However, this version of the method does not have fast convergence properties. The preconditioned version of the method, in which the system is pre-multiplied at each iteration by a special matrix, largely improves its convergence properties. This is out of the scope of this book; the reader is referred to the SIAM book, Templates for the Solution of Linear Systems (Barrett et al., 1994).
Algorithm 4 Conjugate Gradient
Set initial guess x^(0)
Compute r^(0) = b − Ax^(0) and set v^(1) = r^(0)
for k = 1 to n do
    t^(k) = ⟨r^(k−1), r^(k−1)⟩ / ⟨v^(k), Av^(k)⟩
    x^(k) = x^(k−1) + t^(k) v^(k)
    r^(k) = r^(k−1) − t^(k) Av^(k)
    s^(k) = ⟨r^(k), r^(k)⟩ / ⟨r^(k−1), r^(k−1)⟩
    v^(k+1) = r^(k) + s^(k) v^(k)
end for
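The following Python sketch is a direct transcription of Algorithm 4, assuming (as the theory requires) a symmetric positive definite A; in exact arithmetic it would terminate in n steps. The early-exit tolerance is an illustrative addition.

import numpy as np

def conjugate_gradient(A, b):
    n = len(b)
    x = np.zeros(n)             # initial guess x^(0)
    r = b - A @ x               # r^(0)
    v = r.copy()                # v^(1) = r^(0)
    for _ in range(n):
        rr_old = r @ r
        if np.sqrt(rr_old) < 1e-12:
            break                          # residual already negligible
        t = rr_old / (v @ (A @ v))         # equation (2.15)
        x = x + t * v                      # equation (2.10)
        r = r - t * (A @ v)                # equation (2.16)
        s = (r @ r) / rr_old               # equation (2.17)
        v = r + s * v                      # equation (2.13)
    return x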

2.8 Condition Number

The condition number of a non-singular matrix $A$ relative to the norm $\|\cdot\|$ is
$$\kappa(A) = \|A\|\,\|A^{-1}\|.$$
The base-$b$ logarithm of $\kappa(A)$ is an estimate of how many base-$b$ digits are lost in solving a linear system with that matrix. It approximates the worst-case loss of precision. For any non-singular matrix $A$ and any norm $\|\cdot\|$,
$$1 = \|I\| = \|AA^{-1}\| \leq \|A\|\,\|A^{-1}\| = \kappa(A).$$
A system is said to be singular if the condition number is infinite, and ill-conditioned if it is too large, where "too large" means roughly the reciprocal of the precision of the matrix entries.
Consider for example
$$A = \begin{pmatrix} 1 & 2\\ 1.0001 & 2 \end{pmatrix},$$
for which $\|A\|_\infty = 3.0001$. This is not a large number. However,
$$A^{-1} = \begin{pmatrix} -10000 & 10000\\ 5000.5 & -5000 \end{pmatrix}$$
and its norm is $\|A^{-1}\|_\infty = 20000$. Therefore the condition number will be $\kappa(A) = (3.0001)(20000) = 60002$.
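NumPy reproduces this computation directly; np.linalg.cond with p=np.inf uses the maximum-row-sum norm applied above.

import numpy as np

A = np.array([[1.0, 2.0], [1.0001, 2.0]])
# kappa(A) = ||A|| * ||A^{-1}|| in the infinity (maximum row sum) norm
print(np.linalg.cond(A, p=np.inf))  # approximately 60002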

2.9 Exercises

1. Show that the norm defined by
$$\|A\| = \max_{i,j} |a_{i,j}|$$
is not a consistent norm; that is, it does not comply with the property
$$\|AB\| \leq \|A\|\,\|B\|.$$
Hint: find a counterexample with A = B.

2. The Hilbert-Schmidt or Frobenius norm is defined as
$$\|A\|_F = \sqrt{\sum_{i,j} (a_{i,j})^2}.$$
Show the triangle inequality for this norm.

3. Show that if $\|A\|_1$ is the matrix norm induced by the vector norm $\|x\|_1 = \sum_{i=1}^m |x_i|$, then $\|A\|_1$ is equal to the maximum absolute column sum of A, that is,
$$\|A\|_1 = \max_j \left(\sum_{i=1}^n |a_{i,j}|\right).$$

4. Show that if $\|A\|_\infty$ is the matrix norm induced by the vector norm $\|x\|_\infty = \max_i |x_i|$, then $\|A\|_\infty$ is equal to the maximum absolute row sum of A, that is,
$$\|A\|_\infty = \max_i \sum_{j=1}^n |a_{i,j}|.$$

5. For the following system
$$\begin{pmatrix} 8 & 1\\ 1 & 2 \end{pmatrix} \begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} 9\\ 3 \end{pmatrix},$$
find the first two terms of the sequence generated by the steepest descent method. Given that the solution is [x, y] = [1, 1], find the maximum-norm error and the two-norm error for these two terms.

6. Show that if A is a symmetric matrix then its eigenvalues are real numbers and its eigenvectors are perpendicular.

7. Prove that $\|A\|_2 = \sqrt{\rho(A^T A)}$.

8. Show that the gradient of the quadratic form $P(x) = \frac12\langle x, Ax\rangle - \langle x, b\rangle$ satisfies $-\nabla P(x_c) = b - Ax_c$.

Chapter 3

Interpolation

3.1 Introduction

Suppose that as a result of an experiment a set of data points (x, u), related to each other, is obtained. The relationship between x and u is expressed as u(x), but from the experimental data we only know the values at certain points, that is, $u_i = u(x_i)$. For example, suppose we have the following data:

x: 1.00  1.90  2.80  3.70  4.60  5.50  6.40  7.30
u: 1.29  1.74  2.38  3.19  4.03  4.65  4.55  3.04

This data is plotted in figure 3.1. The continuous line represents the function we want to interpolate. Unfortunately this function is usually unknown, and the only information available is that presented in the table. The problem that arises is to evaluate the function u(x) at intermediate points x, that is, $x_i < x < x_{i+1}$. We then need to find a continuous function $\bar u(x)$ that approximates u(x) in such a way that it is exact at the data points: $\bar u(x_i) = u_i$.

Figure 3.1: Approximation of a set of data points

3.1.1 Polynomial approximation

Polynomials are functions that provide advantages for approximating a set of points: they are continuous, differentiable, and integrable, and these operations can be accomplished and implemented straight away.
In general, a polynomial of order n can be written as
$$P_n(x) = a_n x^n + \ldots + a_0$$
with n a positive integer and $a_n, \ldots, a_0$ real coefficients. Its derivative is
$$P_n'(x) = n a_n x^{n-1} + \ldots + a_1$$
and its integral
$$\int P_n(x)\,dx = \frac{a_n x^{n+1}}{n+1} + \ldots + a_0 x + C.$$

Weierstrass approximation theorem
Suppose that f is a function defined and continuous on the interval [a, b]. Then for each $\epsilon > 0$ there exists a polynomial P(x) defined over [a, b] with the following property:
$$|f(x) - P(x)| < \epsilon \quad \forall x \in [a, b].$$
Figure 3.2 illustrates such a result.

Figure 3.2: Weierstrass theorem of approximation

Example 3.1 Taylor Polynomials.
Taylor polynomials are very good for locally approximating a function around a point. Suppose that $f(x) \in C^n[a,b]$, that $f^{(n+1)}$ exists in [a, b], and that $x_0 \in [a, b]$. Then for all $x \in [a, b]$ at a distance $h = x - x_0$ it is true that
$$f(x) = f(x_0 + h) = f(x_0) + f'(x_0)\,h + \frac{f''(x_0)}{2!}h^2 + \ldots + \frac{f^{(n)}(x_0)}{n!}h^n + O(h^{n+1}),$$
where $O(h^{n+1})$ is the error of the n-th order approximation and is a function of $h^{n+1}$. Figure 3.3 plots f(x) = cos(x) and the local Taylor approximations, of order zero and one, around the point x = 1. For n = 0, p0 is a constant function defined by p0(x) = cos(1). For n = 1, p1 is a line given by p1(x) = cos(1) − sin(1)(x − 1). Thus, when the order of the Taylor polynomial increases, a better approximation of the function in the neighbourhood of x = 1 is obtained, as it approximates not only the function but also its derivatives. However, the error of approximation increases as we move away from x = 1.

Figure 3.3: Taylor polynomial approximation of the cos function at the point x = 1. The zero- and first-order approximations p0(x) and p1(x) are shown.

3.2 Lagrange polynomials

Suppose that you want to find a first degree polynomial that passes through the points $(x_0, u_0)$ and $(x_1, u_1)$. There are many ways to find a straight line that passes through these points. The main idea behind Lagrange polynomials is to find a set of polynomials of first degree whose linear combination is equal to the desired polynomial. In this case, the line that joins the points $(x_0, u_0)$ and $(x_1, u_1)$ is given by the first degree polynomial P(x) defined as
$$P(x) = \frac{x - x_1}{x_0 - x_1}\,u_0 + \frac{x - x_0}{x_1 - x_0}\,u_1,$$
which defines the line crossing the points $(x_0, u_0)$ and $(x_1, u_1)$ by adding the lines $P_1 = \frac{x - x_1}{x_0 - x_1}u_0$ and $P_2 = \frac{x - x_0}{x_1 - x_0}u_1$. If we define the functions $W_0(x)$ and $W_1(x)$ as
$$W_0(x) = \frac{x - x_1}{x_0 - x_1}, \qquad W_1(x) = \frac{x - x_0}{x_1 - x_0},$$
then P(x) can also be written as
$$P(x) = W_0(x)\,u_0 + W_1(x)\,u_1.$$
Notice that the W functions were found using the following rule:
$$W_0(x_0) = 1, \qquad W_0(x_1) = 0,$$
and
$$W_1(x_0) = 0, \qquad W_1(x_1) = 1.$$
Consequently, when the polynomial $W_0$ is multiplied by the function value at the point $x_0$, that is $u_0 = u(x_0)$, the function $W_0 u_0$ crosses the points $(x_0, u_0)$ and $(x_1, 0)$. The same applies to $W_1 u_1$, which crosses the points $(x_0, 0)$ and $(x_1, u_1)$. When both functions are added, the resulting first order polynomial crosses through the given points $(x_0, u_0)$ and $(x_1, u_1)$.

3.2.1 Second order Lagrange polynomials

Suppose now that you have three points and you want to find the polynomial of second degree that interpolates those points. Using the technique explained above, we want to find functions $W_0(x)$, $W_1(x)$, $W_2(x)$ such that
$$P(x) = W_0(x)\,u_0 + W_1(x)\,u_1 + W_2(x)\,u_2$$
with $W_i$ second order polynomials. The result of adding second order polynomials is a second order polynomial. Additionally they must comply with
$$P(x_0) = u_0, \qquad P(x_1) = u_1, \qquad P(x_2) = u_2. \qquad (3.1)$$
In other words, the functions must go through (interpolate) the points.
In order to obtain the $W_i$ functions we proceed in the same way as for the first order polynomials; that is, to guarantee the condition in equation 3.1 it is enough that the functions $W_i$ be defined as
$$W_i(x_j) = \begin{cases} 1 & \text{if } i = j,\\ 0 & \text{if } i \neq j. \end{cases} \qquad (3.2)$$
That is, $W_i(x_j)$ is equal to one at the point $x_i$ and zero at the other points. This function can be computed in different ways. For example, to construct the function $W_0$ that cancels at each $x_i$ with $i \neq 0$, we choose a series of binomial factors, each one cancelling at the points $x_i$:
$$W_0 = (x - x_1)(x - x_2). \qquad (3.3)$$
Now, in order to satisfy $W_0(x_0) = 1$, we divide this result by the same product of binomials as in equation 3.3 evaluated at $x = x_0$:
$$W_0 = \frac{(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)}. \qquad (3.4)$$
In this way $W_0$ complies with 3.2. In a similar way we can construct the functions $W_1$ and $W_2$. Figure 3.4 shows these three functions. Notice that any quadratic function passing through the points $x_0$, $x_1$ and $x_2$ can be constructed by a linear combination of these three functions.

Figure 3.4: Second degree Lagrange polynomials

3.2.2 General case

Given a set of n + 1 discrete points through which a function passes, the set of points can be interpolated by an n-degree polynomial obtained by the linear combination of n + 1 polynomials of degree n with the following property:
$$W_i(x_k) = \begin{cases} 1 & \text{if } i = k,\\ 0 & \text{if } i \neq k, \end{cases}$$
which can be constructed by
$$W_i = \prod_{\substack{k=0\\ k\neq i}}^{n} \frac{(x - x_k)}{(x_i - x_k)}.$$
Polynomial interpolation provides continuous and differentiable functions; besides, differentiation and integration of polynomials are straightforward. However, when the order of the polynomial increases, the oscillation of the polynomial between the data points also increases, resulting in a poor local approximation. So high order polynomials must be avoided in the construction of interpolated functions; instead, interpolation by parts must be considered. This will be discussed in the next section.
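A short Python sketch of this general construction follows; the function names are illustrative, and the sample nodes are taken from exercise 1 of section 3.5.

import numpy as np

def W(xs, i, x):
    # W_i(x) = prod_{k != i} (x - x_k) / (x_i - x_k)
    W_i = 1.0
    for k, xk in enumerate(xs):
        if k != i:
            W_i *= (x - xk) / (xs[i] - xk)
    return W_i

def interpolate(xs, us, x):
    # P(x) = sum_i u_i W_i(x); exact at the data points by construction
    return sum(u_i * W(xs, i, x) for i, u_i in enumerate(us))

xs = [0.0, 0.6, 0.9]
us = [np.cos(xk) for xk in xs]   # sample u(x) = cos(x) at the nodes
print(interpolate(xs, us, 0.45), np.cos(0.45))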

3.2.3 Other Approximations

Lagrange polynomials are not the only way of interpolating a set of data points. In general, if v is a function defined over n points $x_i$ by $v_i$ with
$$v_i = v(x_i),$$
then the function v(x) can be interpolated by $\bar v(x) \in V^n$, defined in terms of the $W_i(x)$ functions as
$$\bar v(x) = \sum_i v_i W_i(x).$$
The $W_i$ functions conform a base that generates an approximation space. For example, the space of second order functions $P^2$ can be generated by the base
$$P^2 = \text{span}\{W_1(x), W_2(x), W_3(x)\}.$$

Example 3.2
Given the functions $W_1 = 1$, $W_2 = x$, $W_3 = x^2$, any polynomial $p \in P^2$ is given by the linear combination
$$p = a_1 W_1 + a_2 W_2 + a_3 W_3.$$
For example, find $a_1$, $a_2$ and $a_3$ such that p is equal to $p(x) = 3x^2 + 2x + 5$. (Matching coefficients gives $a_1 = 5$, $a_2 = 2$, $a_3 = 3$.)

3.3 Polynomials defined by parts

Figure 3.5: Oscillation due to high order polynomial approximation

One disadvantage of polynomial interpolation is that as the number of points increases, the order of the polynomial increases. A higher polynomial degree does not necessarily mean a good approximation, as it can introduce undesirable oscillations, as shown in figure 3.5. To avoid this, the domain can be subdivided and the function approximated by a series of functions in each subdomain.
Let Ω be the domain of a function u defined as
$$u : \Omega \to \mathbb{R}.$$
Ω can be expressed in terms of subdivisions of the domain such that
$$\Omega = K_1 \cup K_2 \cup K_3 \cup \ldots \cup K_n,$$
where $K_i$ represents a finite element and $K_i$ intersects $K_j$ at most at the boundary. Some examples of domains defined by parts for the one-, two-, and three-dimensional case are shown next.
One dimensional case

Figure 3.6: One dimensional domain defined by parts

Figure 3.6 presents a one-dimensional domain defined by parts. The segment [a, b] is subdivided into a set of elements $K_i$. Each element consists of a segment of line, and in this particular case they have four nodes per element. Additionally, the union of the elements is equal to the domain, $\bigcup K_i = \Omega$, and two elements intersect at most at one point:
$$K_i \cap K_j = \begin{cases} \emptyset,\\ \text{1 vertex.} \end{cases}$$
Two dimensional case

Figure 3.7: Two dimensional domain defined by parts (Ω = union of triangles)

Figure 3.7 presents a two-dimensional domain defined by parts. The domain is subdivided into a set of triangles $K_i$. Each triangle consists of three vertices and three line segments. In this particular case the nodes are defined at the vertices of the triangles. The union of the elements is equal to the domain, $\bigcup K_i = \Omega$, and two elements intersect only at one point or at an edge:
$$K_i \cap K_j = \begin{cases} \emptyset,\\ \text{1 vertex},\\ \text{1 edge.} \end{cases}$$
Three dimensional case

Figure 3.8 presents a three-dimensional domain defined by parts. The domain is subdivided into a set of tetrahedral elements $K_i$. Each tetrahedron consists of four vertices, four triangular faces and six line segments. In this particular case the nodes are defined at the vertices of the tetrahedra. The union of the elements is equal to the domain, $\bigcup K_i = \Omega$, and two elements intersect only at one face, edge or vertex:
$$K_i \cap K_j = \begin{cases} \emptyset,\\ \text{1 vertex},\\ \text{1 edge},\\ \text{1 face.} \end{cases}$$

Figure 3.8: Three dimensional domain defined by parts (Ω = union of tetrahedra)

Base (Shape) functions

Base functions are defined for the elements $K_i$ in the following way:
$$P^n = \left\{ u \in P^n(K_i) \;\middle|\; u(x) = \sum_{i=0}^{n} a_i x^i \right\}. \qquad (3.5)$$
So in two dimensions we have
$$P^n = \left\{ u \in P^n(K_i) \;\middle|\; u(x) = \sum_{i+j\leq n} a_{ij}\,x^i y^j \right\}.$$
It can be observed that the interpolation conditions can be written in terms of the base functions of $P^n$ as
$$u_1 = W_1(x_1), \quad u_2 = W_2(x_2), \quad \ldots, \quad u_n = W_n(x_n).$$
Expanding the terms of the base, and remembering that $W_i \in P^n$, this delivers
$$W_1(x_1) = a_{10} + a_{11}x_1 + \cdots + a_{1n}x_1^n = u_1$$
$$\vdots$$
$$W_i(x_i) = a_{i0} + a_{i1}x_i + \cdots + a_{in}x_i^n = u_i,$$
which can be written in matrix form as
$$[A]\,a = u,$$
a linear system for the coefficients of the base functions.
Notice that these functions are defined for each element $K_i$; therefore the limits of the elements cannot be arbitrarily chosen. Two adjacent elements should have common vertices in order to preserve the continuity of the function. Therefore, if $x_i$ is a common node for two elements, and $u_r(x)$ is the function defined at element $K_r$ and $u_s(x)$ the function defined at element $K_s$, then it must be true that
$$u_r(x_i) = u_s(x_i).$$

3.3.1 One-dimensional Interpolation

First order polynomials
A function whose value is only known at certain points can be represented by a pair of vectors:
$$x = \begin{pmatrix} x_1\\ \vdots\\ x_m \end{pmatrix}, \qquad u = \begin{pmatrix} u_1\\ \vdots\\ u_m \end{pmatrix}.$$
The domain is represented by the interval $[x_1, x_m]$. With this information, a function $P^n(x) = \bar u(x)$ that interpolates the function u(x) at the m points can be found. Suppose that you want to know the value of the function at a certain point x in the domain. Proceed as follows (a sketch of these steps is given below):
i. Find $k_i$ such that $x \in k_i$.
ii. Compute the base functions $W|_{k_i}$.
iii. Compute $\bar u(x)$ as $\bar u(x) = \sum_i u_i W_i(x)$.
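A minimal Python sketch of steps i-iii for first order elements; the data are the experimental table of section 3.1, and locating the element by binary search is an implementation choice, not part of the text.

import numpy as np

def interp_p1(xs, us, x):
    i = np.searchsorted(xs, x) - 1        # step i: find k_i with x in [x_i, x_{i+1}]
    i = int(np.clip(i, 0, len(xs) - 2))
    x0, x1 = xs[i], xs[i + 1]
    W0 = (x1 - x) / (x1 - x0)             # step ii: local base functions
    W1 = (x - x0) / (x1 - x0)
    return us[i] * W0 + us[i + 1] * W1    # step iii: u(x) = sum u_i W_i(x)

xs = np.array([1.00, 1.90, 2.80, 3.70, 4.60, 5.50, 6.40, 7.30])
us = np.array([1.29, 1.74, 2.38, 3.19, 4.03, 4.65, 4.55, 3.04])
print(interp_p1(xs, us, 3.0))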
Differential Operators
Let v be a function known at n points $x_i$; that is, $v_i = v(x_i)$ is known for n points. Then the function is interpolated in V as
$$\bar v(x) = \sum_i v_i W_i(x),$$
where the functions $W_i$ form the basis of V. Figure 3.9 plots a function and its interpolation using five elements. The derivative of $\bar v \in V$ is given by
$$\frac{d\bar v}{dx} = \sum_i v_i \frac{d}{dx} W_i(x) = \sum_i v_i \frac{dW_i(x)}{dx}.$$
Notice that while the continuity of the interpolated function is guaranteed by the interpolation, its derivative is only differentiable by parts, as at the node points the derivative is not well defined. One could try to interpolate this result again, but in general the derivative of the interpolated function is different from the interpolation of the derivative.

Figure 3.9: Piece-wise linear approximation of a smooth function

Second order polynomials

The base functions are again defined by
$$W_i(x_j) = \begin{cases} 1 & \text{if } i = j,\\ 0 & \text{if } i \neq j. \end{cases}$$
Let us analyse the meaning of having a domain defined by parts. First, take a data point inside an element, for example the point $x_4 \in k_2$. Notice that $W_4$ is completely defined in element $k_2$; outside of it, it takes the value zero. However, when we look at the function $W_3$ at the point $x_3$, belonging to elements $k_1$ and $k_2$, we notice that the function is defined over these two elements. See figure 3.10.

Figure 3.10: Piece-wise quadratic interpolation using Lagrange polynomials

Figure 3.11: Interpolation using a second order polynomial defined by parts

Example
Figure 3.11 shows a second order polynomial interpolating u by parts. Notice that each element consists of three nodes, and therefore a second order polynomial can be obtained at each element.

Exercises
i. Using Matlab or Maple, draw the function u(x) = (x/5) sin(5x) + x defined over the interval [0, 4].
ii. Using the same partition of the domain as in the last example, plot the second order Lagrange interpolation $\bar u(x)$ of u(x).
iii. Global numeration vs. local numeration: from the last example we see the need for a global numeration of the domain nodes, but in order to compute the base functions it is convenient to have a local numeration.

3.3.2 Two dimensional interpolation

Let Ω be the domain of a function u(x, y); Ω is bounded by a polygon and $\Omega = \bigcup K_i$ (the union of subdomains), where $K_i$ can be a triangle or a square. The approximation space $P^n(x_i)$ is defined for each element $K_i$.

Example
Approximation space with $P^2$:
$$P(x) = \sum_{i+j\leq n} a_{ij}\,x^i y^j.$$
The number of nodes needed to define a polynomial is given by the number of coefficients of the polynomial. In that way:
$P^1$: N(1) = 3, $\quad P^2$: N(2) = 6, $\quad P^3$: N(3) = 10.

Bases in 2-D:
$$P^1 = \text{span}\{W_1(x), W_2(x), W_3(x)\}$$
$$P^2 = \text{span}\{W_1(x), \ldots, W_6(x)\}.$$
In general, $W_r(x)$ is defined as
$$W_r(x) = \sum_{i+j\leq n} a^r_{ij}\,x^i y^j,$$
and the coefficients $a^r_{ij}$ can be calculated using the following definition:
$$W_r(x_s) = \delta_{rs}.$$
To guarantee continuity, care must be taken at nodes belonging to several elements: the function defined over those elements and evaluated at the common node must be the same.

3.4 Computation of the gradient of a scalar field

One of the problems of computing the gradient is that we only have the information of the scalar field u at the nodes. Thus, using Lagrange elements of order one, the gradient field cannot be guaranteed to be continuous; that is, when we compute the gradient at a point common to two elements, the computation based on one element differs from the computation made with the neighbouring element.

3.4.1 Computation of the gradient for one element

We must compute
$$\nabla u = \left(\frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}\right). \qquad (3.6)$$
Additionally, u is approximated in terms of the base functions as
$$u = \sum_i u_i\,\phi_i(x, y) \qquad (3.7)$$
with $u_i$ the function evaluated at the vertices, for the case of P1 elements. Replacing the value of the approximated function in the computation of the gradient, we have
$$\frac{\partial u}{\partial x} = \frac{\partial \sum u_i \phi_i}{\partial x} = \sum_i u_i \frac{\partial \phi_i}{\partial x}, \qquad (3.8)$$
which means that only the gradient of the base functions over the element needs to be computed. For P1 elements the base functions are linear, of the form
$$\phi_i(x, y) = a_i x + b_i y + c_i;$$
therefore
$$\nabla \phi_i = (a_i, b_i). \qquad (3.9)$$
The gradient can then be computed for each of the nodes of the triangle using equations 3.8 and 3.9.

3.4.2 Computation of the gradient for a field

Once the gradient has been computed for a node, such as node 3 of the figure, we will notice that its value differs depending on the element with which the computation was made. An acceptable option is to average the results of each element. That is, if there are n elements sharing a node i,
$$\frac{\partial u_i}{\partial x_j} = \frac{\sum_{e=1}^{n} \left(\frac{\partial u_i}{\partial x_j}\right)_e}{n}, \qquad (3.10)$$
where $\left(\frac{\partial u_i}{\partial x_j}\right)_e$ is the value obtained using the element e.
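A Python sketch of equations 3.8-3.10 for P1 triangles follows; coords is an array of node coordinates, tris the triangle connectivity, and u the nodal values (all names are illustrative assumptions).

import numpy as np

def p1_nodal_gradients(coords, tris, u):
    grad = np.zeros((len(coords), 2))   # accumulated gradient per node
    count = np.zeros(len(coords))       # number of elements sharing each node
    for tri in tris:
        p = coords[tri]                              # 3x2 vertex coordinates
        M = np.column_stack([p, np.ones(3)])         # rows [x_i, y_i, 1]
        a = np.linalg.solve(M, u[tri])               # u = a*x + b*y + c on the element
        grad[tri] += a[:2]                           # constant element gradient (a, b)
        count[tri] += 1
    return grad / count[:, None]        # average over the sharing elements (eq. 3.10)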

3.5 Exercises

1. Given the points {x₀ = 0, x₁ = 0.6, x₂ = 0.9}, defining a real interval and its middle point, find the Lagrange polynomials $W_i(x)$ of first and second order.

2. For the functions f(x) = cos(x), f(x) = ln(x + 1), and f(x) = tan(x), find the values of the function at the points $f_i = f(x_i)$ and use the Lagrange polynomials from the previous point to find the interpolating function
$$g(x) = \sum_i f_i W_i(x).$$

3. Find the approximation error for each of the functions of the previous point when first and second order Lagrange interpolation is used. The error is defined as
$$\text{error} = \int_a^b |f(x) - g(x)|\,dx.$$

4. The aim of this exercise is to explore the interpolation and approximation functions available in commercial software such as Matlab, Maple, and Excel/OpenOffice. Consider the following table, which relates density to altitude (data from U.S. Standard Atmosphere, 1962, U.S. Government Printing Office):

Altitude h (m)    Density (kg/m³)
0                 1.225
2000              1.007
4000              0.819
6000              0.660
8000              0.526
10000             0.414
12000             0.312
14000             0.228
16000             0.166
18000             0.122
20000             0.089

(a) Find the first, second, third and fourth order polynomials that best approximate the curve. With these functions construct a vector of approximated values and compute the ℓ₂ and ℓ∞ norm errors.
(b) If dP/dz = −ρg is the equation that relates pressure and altitude for a hydrostatic fluid, find the function that describes the pressure as a function of height. With this function write a table and compare the results with the values found in standard tables.
5. Compute the first order Lagrange polynomials $W_1$, $W_2$, $W_3$ for the triangle shown in the figure, using the following equation, which gives the coefficients of the $W_i$ functions:
$$\begin{pmatrix} x_1 & y_1 & 1\\ x_2 & y_2 & 1\\ x_3 & y_3 & 1 \end{pmatrix} \begin{pmatrix} a_i\\ b_i\\ c_i \end{pmatrix} = e_i,$$
where $e_i$ is a canonical unit vector in the i direction (it is equal to one in the i position and zero otherwise). The coordinates of the triangle are given in the following figure.

Figure: triangle with vertex nodes (3, 3), (5, 4), (4, 5) and mid-side nodes (4, 3.5), (4.5, 4.5), (3.5, 4).

6. Find the interpolation $g(x, y) = \sum_i f_i W_i(x, y)$, with $f_i = f(x_i, y_i)$, for the following functions:
(a) $f(x, y) = 2x^2 + 3y^2 + 1$
(b) $f(x, y) = \sin(3x)\cos(3y)$.
Plot each of these functions and its interpolation g(x, y).

7. Select ten points on the line connecting the points (3, 3) and (4.5, 4.5). Find the value of the functions and of their interpolations. Plot the results and compute the error using the ℓ₂ norm and the ℓ∞ norm.

8. Repeat the same procedure using second order Lagrange polynomials.
9. The goal of this exercise is to plot a field given by a function f(x), f : R² → R, defined over a domain Ω, and to interpolate the value of the field at any arbitrary point inside Ω. The domain is described by three different text files, with extensions .cor, .tri, and .f:
<file_name>.cor
<file_name>.tri
<file_name>.f
The first one contains the coordinates of the nodes, the second the connectivities of the triangles, and the third the value of a function at the node points (in the same order as they appear in the .cor file). As an example, consider a domain consisting of two triangular elements I and II with four nodes. If the files are called simple they will look like:

simple.cor      simple.tri      simple.f
0.0 0.0         1 2 4           0.0
1.0 0.0         1 4 3           1.5
0.0 2.0                         7.3
1.0 2.0                         3.8

(a) Plot the scalar field (using Matlab, Maple or Octave) and show a map of colours associated with the magnitude of the scalar field.
(b) Develop a procedure that asks the user for a point and returns the value of the function at that point (if the point is inside the domain). The point input could be by keyboard or mouse.

Chapter 4

The Finite Element Method

4.1 Classification of the Partial Differential Equations (PDE)

The general form of a Partial Differential Equation (PDE) with two independent variables, u = u(x, y), defined over a two dimensional domain Ω, is:
$$a\frac{\partial^2 u}{\partial x^2} + 2h\frac{\partial^2 u}{\partial x\,\partial y} + b\frac{\partial^2 u}{\partial y^2} + f = 0 \qquad (4.1)$$
where a, h, and b represent real constants or functions of x and y, and f is a function of $\partial u/\partial x$, $\partial u/\partial y$ and u. This general form (equation 4.1) is quite similar to the general equation of a conic,
$$ax^2 + 2hxy + by^2 + 2cx + 2dy + e = 0, \qquad (4.2)$$
which represents an ellipse when $ab - h^2 > 0$, a parabola when $ab - h^2 = 0$, or a hyperbola when $ab - h^2 < 0$.
In the same way, the PDEs are classified. A PDE is:
Elliptic if $ab - h^2 > 0$
Parabolic if $ab - h^2 = 0$
Hyperbolic if $ab - h^2 < 0$.
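The classification rule reduces to the sign of the discriminant, as this small Python helper (illustrative only) shows:

def pde_type(a, h, b):
    # classify a*u_xx + 2h*u_xy + b*u_yy + f = 0 by the sign of ab - h^2
    d = a * b - h * h
    return "elliptic" if d > 0 else ("parabolic" if d == 0 else "hyperbolic")

print(pde_type(1, 0, 1))   # Laplace equation: elliptic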
Example 4.1

i. Diffusion Equation
$$\alpha^2 \frac{\partial^2 u}{\partial x^2} = \frac{\partial u}{\partial t} \qquad (4.3)$$
where t represents the time and α is the diffusion coefficient of the material. (Notice that y = t for this equation.) Then $a = \alpha^2$, h = 0, and b = 0. Therefore $ab - h^2 = 0$, and one concludes that the diffusion equation is parabolic.

ii. Wave Equation
$$\alpha^2 \frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 u}{\partial t^2} \qquad (4.4)$$
By comparison with 4.1, $a = \alpha^2$, h = 0, and b = −1. Therefore $ab - h^2 = -\alpha^2 < 0$, and one concludes that the wave equation is hyperbolic.

Figure 4.1: A wing profile for the Tricomi equation

iii. Laplace Equation
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \qquad (4.5)$$
By comparison with 4.1, a = 1, h = 0, and b = 1. Therefore $ab - h^2 = 1 > 0$, and one concludes that the Laplace equation is elliptic.
In the last examples a, b, and h were real constants, but in general they can be functions of x and y (equations with variable coefficients). They can also switch type depending on the domain; that is, they can change type when they change from one region of the x-y plane to another. To illustrate this fact, consider the following example.
iv. The Tricomi Equation
One important application is the perturbation in the air caused by the displacement of a wing. The usual way of solving the problem is to keep the wing still and move the air (wind) around the wing. For a non-viscous fluid,
$$q = u\,\hat{\imath} + \nabla w,$$
where q is the velocity and w is the potential:
$$\left(1 - M^2\right)\frac{\partial^2 w}{\partial x^2} + \frac{\partial^2 w}{\partial y^2} = 0, \qquad (4.6)$$
where M is known as the Mach number, which is equal to the ratio between u and the speed of sound in the medium.

Subsonic case: here $a = (1 - M^2)$ and $b = 1$, so $ab - h^2 = (1 - M^2)$. As the flow regime is subsonic, M < 1 and
$$ab - h^2 = (1 - M^2) > 0, \qquad (4.7)$$
and therefore equation (4.6) is elliptic. If the fluid is incompressible, the speed of sound in the fluid is infinite; hence M = 0, and equation (4.6) reduces to the Laplace equation.

Supersonic case: $a = (1 - M^2)$, $b = 1$. As the flow regime is supersonic, M > 1 and
$$ab - h^2 = (1 - M^2) < 0,$$
and therefore the Tricomi equation (4.6) is hyperbolic.

Transonic case: when M ≈ 1, the flow regime is considered transonic. This case is more interesting and difficult to solve, since we cannot neglect the nonlinear part when obtaining the equation. After some change of variables we get
$$\eta\,\frac{\partial^2 u}{\partial \xi^2} + \frac{\partial^2 u}{\partial \eta^2} = 0. \qquad (4.8)$$
Now $ab - h^2 = (\eta)(1) = \eta$, so the equation is elliptic when $\eta > 0$ and hyperbolic when $\eta < 0$. This situation is reflected in the fact that the flow is mixed in the original variables x and y.

4.2 Boundary Value Problems

To determine a unique solution for equation (4.1), one needs to specify the boundary conditions.
Let Ω be an open subset of the plane or space and Γ be the boundary of that region. The boundary conditions can be classified as follows (let u = u(x, y)):
$$u|_\Gamma = g \qquad \text{Dirichlet} \qquad (4.9)$$
$$\left.\frac{\partial u}{\partial n}\right|_\Gamma = (\nabla u \cdot n)|_\Gamma = g \qquad \text{Neumann} \qquad (4.10)$$
$$\left.\left(\alpha\,u + \beta\,\frac{\partial u}{\partial n}\right)\right|_\Gamma = g \qquad \text{Fourier.} \qquad (4.11)$$
One problem could present one or more of these conditions on one or more parts of its boundary Γ.

4.2.1 One dimensional boundary problems

Consider the following second order differential equation with boundary conditions, defined over a closed domain [a, b] ⊂ R:
$$-\frac{d}{dx}\left(k\frac{du}{dx}\right) = f(x),$$
where f(x) is a known function and k is a constant. The problem can have one of the following types of boundary conditions:
$$u(a),\; u(b) \qquad \text{Dirichlet boundary conditions}$$
$$k\frac{du}{dx}\Big|_a,\; k\frac{du}{dx}\Big|_b \qquad \text{von Neumann boundary conditions}$$
$$\alpha\,u(b) + \beta\,k\frac{du}{dx}\Big|_b \qquad \text{Fourier boundary conditions.}$$

Depending on the problem and its governing equations, the boundary conditions have different physical meanings.

Heat transfer: u represents the temperature.
Dirichlet: $u_a$, the temperature at a; $u_b$, the temperature at b.
von Neumann: $-k\,\frac{\partial u}{\partial x}\big|_a^b$, the heat flow.
Fourier: $-k\,\frac{\partial u}{\partial x} = h\,(T_w - T_\infty)$, convection.

Elasticity: the equation of the elastic curve for beams, where u represents the vertical displacement of a point on the neutral surface (Beer and Jr., 1992, pp. 481):
$$\frac{d^2 u}{dx^2} = \frac{M(x)}{EI}$$
Dirichlet: $u_0$, the displacement at $x_0$; $u_n$, the displacement at $x_n$.
von Neumann: $\frac{\partial u}{\partial x}\big|_a^b$, the slope.

Fluid mechanics: in determining potential flow, $u = \phi$ represents the potential or stream function, and the velocity is defined by
$$v_x = \frac{\partial \phi}{\partial x}, \qquad v_y = \frac{\partial \phi}{\partial y},$$
with the boundary conditions
von Neumann: $v_i = \frac{\partial \phi}{\partial x_i}\big|_{\text{surface}}$, the velocity at the boundary.

In the following sections we will analyse the solution of the boundary problem with Dirichlet, von Neumann, and Fourier conditions. The boundary problem with mixed conditions will also be considered; that is, a Dirichlet boundary condition at a and a von Neumann condition at b.

4.3 Preliminary mathematics

4.3.1 The Divergence (Gauß) theorem

Figure 4.2: Definition of the domain and boundary

This theorem relates the volume integral of a vector function over a volume with a surface integral of the same function over the surface delimiting that volume. Let σ be a vector function defined over a domain Ω, and let Γ be its surrounding surface. (Notice that this definition can be applied in two and three dimensions.)
$$\sigma : \mathbb{R}^n \to \mathbb{R}^n, \qquad \Omega \subset \mathbb{R}^n, \qquad \Gamma = \text{boundary}(\Omega).$$
Then the divergence theorem states that
$$\int_\Gamma \langle \sigma, n\rangle\, d\Gamma = \int_\Omega \operatorname{div}(\sigma)\, d\Omega. \qquad (4.12)$$

4.3.2 Green's equation

Let r, u, v be scalar functions $\mathbb{R}^n \to \mathbb{R}$ and σ, z vector functions $\mathbb{R}^n \to \mathbb{R}^n$. It can be shown that the divergence of the product of v times z is equal to
$$\operatorname{div}(vz) = v\operatorname{div}(z) + \langle z, \nabla v\rangle. \qquad (4.13)$$
Applying the divergence theorem (equation 4.12) with σ = vz, and using this last result, we obtain
$$\int_\Gamma v\,\langle z, n\rangle\, d\Gamma = \int_\Omega \operatorname{div}(vz)\, d\Omega = \int_\Omega \left(v\operatorname{div}(z) + \langle z, \nabla v\rangle\right) d\Omega.$$
If we choose $z = \nabla u$ for some u, then
$$\int_\Gamma v\,\langle \nabla u, n\rangle\, d\Gamma = \int_\Omega \left(v\operatorname{div}(\nabla u) + \langle \nabla u, \nabla v\rangle\right) d\Omega,$$
and with $\operatorname{div}(\nabla u) = \Delta u$, the Laplacian of u, we obtain
$$\int_\Gamma v\,\langle \nabla u, n\rangle\, d\Gamma = \int_\Omega v\,\Delta u\, d\Omega + \int_\Omega \langle \nabla u, \nabla v\rangle\, d\Omega. \qquad (4.14)$$
Equation (4.14) is known as Green's equation.

4.3.3 Bilinear Operators

A bilinear form a(u, v) defined over a vector space (u, v ∈ V) is a scalar function
$$a(u, v) : V \times V \to \mathbb{R}$$
which is linear in each argument separately. In other words, if $\alpha, \beta \in \mathbb{R}$ and $u, v, w \in V$,
$$a(\alpha u + \beta v, w) = \alpha\, a(u, w) + \beta\, a(v, w)$$
and
$$a(u, \alpha v + \beta w) = \alpha\, a(u, v) + \beta\, a(u, w).$$

Example 4.2
Let V be the space of integrable functions on the real interval [a, b], and let a(·,·) be defined as
$$a(u, v) = \int_a^b u\,v\,dx.$$
Show that a(u, v) is a bilinear form.

Solution. Using the properties of the integral,
$$a(\alpha u + \beta v, w) = \int_a^b (\alpha u + \beta v)\,w\,dx = \alpha\int_a^b u\,w\,dx + \beta\int_a^b v\,w\,dx = \alpha\, a(u, w) + \beta\, a(v, w),$$
which proves linearity with respect to the first argument. Linearity with respect to the second argument can be proved similarly, or in this case by proving that the form is symmetric, that is, a(u, v) = a(v, u). This indeed is straightforward, as
$$\int_a^b u\,v\,dx = \int_a^b v\,u\,dx.$$

4.4 Overview of Finite Element Method

Let us have the following boundary value problem:
$$-\frac{d}{dx}\left(k\frac{du}{dx}\right) = f \;\text{ in } (0, 1), \qquad u(0) = g \;\text{ and }\; k\frac{du}{dx}\Big|_1 = h. \qquad (4.15)$$
This is an elliptic one-dimensional equation, sometimes referred to as the Euler equation. In the following steps we will transform the differential equation into an integral form, which can be expressed in a more general abstract form that allows generalisation to a whole family of elliptic, parabolic or hyperbolic equations. The final step is the discretisation of the equations by using a set of finite base functions to represent the solution.

4.4.1 Weak Form

A functional is a special function whose domain is itself a set of functions, and whose range is another set of functions that may be numerical constants.
The idea of the method is to transform the differential equation into an integral problem. This can be obtained by multiplying the differential equation by an arbitrary function v and then integrating over the domain. Thus for equation (4.15) we have
$$\int_0^1 -\frac{d}{dx}\left(k\frac{du}{dx}\right) v\,dx - \int_0^1 f\,v\,dx = 0;$$
integration by parts gives
$$\int_0^1 k\frac{du}{dx}\frac{dv}{dx}\,dx - \left[kv\frac{du}{dx}\right]_0^1 - \int_0^1 f\,v\,dx = 0,$$
which is called the weak form of the problem.

4.4.2 Abstract Form

Let us define
$$a(u, v) = \int_0^1 k\frac{du}{dx}\frac{dv}{dx}\,dx, \qquad \ell(v) = \int_0^1 f\,v\,dx. \qquad (4.16)$$
(Recall the integration by parts formula $\int_a^b f'g\,dx = [fg]_a^b - \int_a^b fg'\,dx$.)

The boundary value problem can then be rewritten as
$$a(u, v) - \ell(v) - \left[kv\frac{du}{dx}\right]_a^b = 0 \qquad (4.17)$$
and is called the abstract form of the problem.
Let V be the vector space with the following property:
$$V = \left\{ v \in L^2(a, b) : a(v, v) < \infty,\; v(0) = 0 \right\},$$
where $L^2$ is the space defined by
$$L^2 = \left\{ v : \Omega \to \mathbb{R} \text{ such that } \int_\Omega v^2\,dx < \infty \right\}, \qquad (4.18)$$
that is, the square integrable functions. Then, if u is the solution to (4.15), it is characterised by $u \in V$ such that $a(u, v) = (f, v)\;\; \forall v \in V$.

4.4.3 Variational Form

In order to find the variational form of the problem, let us define a functional J(v) based on (4.17) as follows:
$$J(v) = \frac12 a(v, v) - \ell(v) - \left[kv\frac{dv}{dx}\right]_a^b. \qquad (4.19)$$
It can then be shown that J(v) has a minimum at u, that is,
$$J(u) < J(v) \qquad \forall v \in V,\; v \neq u. \qquad (4.20)$$
This is called the variational form of the problem.
We can conclude that we have equivalent relationships among the following three forms:
$$-\frac{d}{dx}\left(k\frac{du}{dx}\right) = f \text{ in } (0, 1), \quad u(0) = g, \quad k\frac{du}{dx}\Big|_1 = h \qquad \text{(P1)}$$
$$\int_0^1 \left(k\frac{du}{dx}\frac{dv}{dx} - f\,v\right) dx - \left[kv\frac{du}{dx}\right]_0^1 = 0 \qquad \text{(P2)}$$
$$J(u) < J(v) \quad \forall v \in V. \qquad \text{(P3)}$$
We shall call (P1), (P2), and (P3) the local, weak, and variational forms respectively. In the above, the functional J is chosen for the Euler equation (4.15). However, for a given boundary value problem it might be difficult to find the corresponding functional for the variational formulation. Nevertheless, it is not necessary to find J(v) to solve the problem, and the weak form (P2) can be used instead of (P3), since the form (P2) is easily obtained from the differential form by the procedure shown in this section (Kikuchi, 1986). The space V can therefore be seen as the set of v such that $v' \in L^2$ and v(a) = 0. The next step is to find a set of functions v that satisfy the problem.

4.4.4 Discrete Problem (Galerkin Method)

Let us have the following boundary value problem:
$$-\frac{d}{dx}\left(k\frac{du}{dx}\right) = f \;\text{ in } (a, b), \qquad u(a) = g.$$
Integration by parts takes us to the weak form of the problem:
$$\int_a^b k\frac{du}{dx}\frac{dv}{dx}\,dx - \left[kv\frac{du}{dx}\right]_a^b = \int_a^b f\,v\,dx,$$
with $v \in V$, where V is the space of admissible functions. The problem can be expressed in an abstract way as: find $u \in U$ such that
$$a(u, v) = \ell(v) \quad \forall v \in V,$$
for which, in the above case,
$$a(u, v) = \int_a^b k\frac{du}{dx}\frac{dv}{dx}\,dx - \left[kv\frac{du}{dx}\right]_a^b, \qquad \ell(v) = \int_a^b f\,v\,dx.$$
Notice that a(u, v) is a bilinear operator.
The Galerkin method consists of choosing u and v belonging to the same space of functions U,
$$U = \left\{ u,\ \text{admissible function},\ \in H^1 \right\}.$$
If span{w₁, w₂, ...} is a base of the space, we can express
$$u(x) = \sum_i \alpha_i w_i(x).$$
Notice that the dimension of the base is unknown. The function u can be approximated by a base of finite dimension as
$$u(x) \approx \bar u(x) = \sum_{i=1}^n u_i w_i(x),$$
where $u_i = u(x_i)$ is the value of the function at a point $x_i$.
Because u and v are chosen in the same space of functions U, they are expressed in the same base:
$$u(x) = \sum_i u_i w_i(x), \qquad v(x) = \sum_j v_j w_j(x).$$

Replacing the interpolated functions into the abstract form $a(u, v) = \ell(v)$,
$$a\Big(\sum_i u_i w_i,\ \sum_j v_j w_j\Big) = \ell\Big(\sum_j v_j w_j\Big). \qquad (4.21)$$
Because a is a bilinear operator, we have
$$a\Big(\sum_i u_i w_i,\ \sum_j v_j w_j\Big) = \sum_i u_i\, a\Big(w_i,\ \sum_j v_j w_j\Big) = \sum_{ij} u_i v_j\, a(w_i, w_j).$$
In the same way, for ℓ(v) we have
$$\ell(v) = \ell\Big(\sum_j v_j w_j\Big) = \sum_j v_j\, \ell(w_j).$$
Equation (4.21) can be rewritten as
$$\sum_{ij} u_i v_j\, a(w_i, w_j) = \sum_j v_j\, \ell(w_j).$$
The term $a(w_i, w_j)$ is a real value that can be computed by evaluating the integrals in terms of the base functions, that is,
$$a(w_i, w_j) = \int_a^b \frac{dw_i}{dx}\frac{dw_j}{dx}\,dx. \qquad (4.22)$$
Then $a_{ij}$ represents a matrix whose dimension is that of the space of approximation V. The discrete problem can be expressed in vector form as
$$\langle Au, v\rangle = \langle \ell, v\rangle, \qquad (4.23)$$
where A is the matrix formed by $A = a_{ij} = a(w_i, w_j)$, $u = (u_1, \ldots, u_n)$, $v = (v_1, \ldots, v_n)$, and $\ell_j = \ell(w_j)$.
Using the properties of the inner product, equation (4.23) is transformed into
$$\langle Au - \ell, v\rangle = 0,$$
and because this equation must be satisfied for all $v \in V$,
$$Au - \ell = 0.$$
Solving for u we find the discrete approximation to the solution.
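For a concrete feel, the sketch below assembles and solves the Galerkin system Au = ℓ for −u″ = 1 on (0, 1) with u(0) = u(1) = 0, using P1 hat functions on a uniform mesh; the mesh size and the load f = 1 are illustrative assumptions, not part of the text.

import numpy as np

n, h = 9, 1.0 / 10                          # 9 interior nodes, mesh size h
# a(w_i, w_j) = int w_i' w_j' dx for P1 hats: 2/h on the diagonal, -1/h off it
A = (1/h) * (2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))
l = h * np.ones(n)                          # l_j = int f w_j dx with f = 1
u = np.linalg.solve(A, l)                   # nodal values of the discrete solution
x = h * np.arange(1, n + 1)
print(np.max(np.abs(u - 0.5 * x * (1 - x))))  # exact solution is x(1-x)/2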

4.5 Variational Formulation

Before formulating linear elliptic problems as variational problems, we first present the following abstract result (Braess, 1997).
Theorem 4.5.1 (Characterisation Theorem). Let V be a linear space, and suppose $a : V \times V \to \mathbb{R}$ is a symmetric positive bilinear form, i.e., $a(v, v) > 0$ for all $v \in V$, $v \neq 0$. In addition, let
$$\ell : V \to \mathbb{R}$$
be a linear functional. Then the quantity
$$J(v) = \frac12 a(v, v) - \ell(v)$$
attains its minimum over V at u if and only if
$$a(u, v) = \ell(v) \quad \text{for all } v \in V. \qquad (4.24)$$
Moreover, there is at most one solution of (4.24).


Proof. For u, v V and t R, we have
1
J(u + tv) = a(u + tv, u + tv) `(u + tv)
2
1
= J(u) + t[a(u, v) `(v)] + t2 a(v, v).
2

(4.25)

If u V satisfies (4.24), then (4.25) with t = 1 implies


1
J(u + v) = J(u) + a(v, v) for all v V
2
> J(u), if v 6= 0.

(4.26)

Thus, u is a unique minimal point. Conversely, if J has a minimum at u, then for


every v V , the derivative of the function t 7 J(u + tv) must vanish at t = 0. By (4.25)
the derivative is a(u, v) `(v), and (4.24) follows.
The relation (4.25) describes the size of J at a distance v from a minimal point u.

4.5.1 Reduction to Homogeneous Boundary Conditions

In the following, let L be a second order elliptic partial differential operator with divergence structure:
$$L(u) = Lu = -\sum_{i,k} \partial_i (a_{ik}\,\partial_k u) + a_0 u, \qquad (4.27)$$
where $a_0(x) \geq 0$ for $x \in \Omega$. Here we use the notation $\partial_j = \partial/\partial x_j$ for simplicity.
One begins by transforming the associated Dirichlet problem
$$Lu = f \;\text{ in } \Omega, \qquad u = g \;\text{ on } \Gamma \qquad (4.28)$$
into one with homogeneous boundary conditions, assuming that there is a function $u_0$ which coincides with g on the boundary and for which $Lu_0$ exists. Then
$$Lu - Lu_0 = f - Lu_0,$$
and therefore
$$Lw = f_1 \;\text{ in } \Omega, \qquad w = 0 \;\text{ on } \Gamma, \qquad (4.29)$$
where $w = u - u_0$ and $f_1 = f - Lu_0$. For simplicity, one can assume that the boundary condition in (4.28) is already homogeneous. The next step is to show that the boundary-value problem (4.29) characterises the solution of a variational problem. The differential equation Lu = f is called the Euler equation of the variational problem, after L. Euler, who first carried out this analysis.
Theorem 4.5.2 (Minimal Property). Every classical solution of the boundary-value problem
$$-\sum_{i,k} \partial_i (a_{ik}\,\partial_k u) + a_0 u = f \;\text{ in } \Omega, \qquad u = 0 \;\text{ on } \Gamma$$
is a solution of the variational problem
$$J(v) := \int_\Omega \left[\frac12 \sum_{i,k} a_{ik}\,\partial_i v\,\partial_k v + \frac12 a_0 v^2 - f v\right] dx \;\to\; \min! \qquad (4.30)$$
among all functions in $C^2(\Omega) \cap C^0(\bar\Omega)$ with zero boundary values.
Proof. The proof proceeds with the help of Green's formula
$$\int_\Omega v\,\partial_i w\,dx = -\int_\Omega w\,\partial_i v\,dx + \int_\Gamma v\,w\,n_i\,ds. \qquad (4.31)$$
Here v and w are assumed to be $C^1$ functions, and $n_i$ is the i-th component of the outward-pointing normal n. Inserting $w := a_{ik}\,\partial_k u$ in (4.31), we have
$$\int_\Omega v\,\partial_i(a_{ik}\,\partial_k u)\,dx = -\int_\Omega a_{ik}\,\partial_i v\,\partial_k u\,dx, \qquad (4.32)$$
provided v = 0 on Γ. Let
$$a(u, v) := \int_\Omega \left[\sum_{i,k} a_{ik}\,\partial_i u\,\partial_k v + a_0\,u\,v\right] dx, \qquad (4.33)$$
$$\ell(v) := \int_\Omega f\,v\,dx. \qquad (4.34)$$
Summing (4.32) over i and k gives, for every $v \in C^1(\Omega) \cap C(\bar\Omega)$ with v = 0 on Γ,
$$a(u, v) - \ell(v) = \int_\Omega v\left[-\sum_{i,k}\partial_i(a_{ik}\,\partial_k u) + a_0 u - f\right] dx \qquad (4.35)$$
$$= \int_\Omega v\,[Lu - f]\,dx = 0, \qquad (4.36)$$
provided Lu = f. This is true if u is a classical solution. Now the Characterisation Theorem implies the minimal property.
The same method of proof shows that every solution of the variational problem which lies in the space $C^2(\Omega) \cap C^0(\bar\Omega)$ is a classical solution of the boundary-value problem.
The above connection was observed by Thomson in 1847, and later by Dirichlet for the Laplace equation. Dirichlet asserted that the boundedness of J(u) from below implies that J attains its minimum for some function u. This argument is now called the Dirichlet principle. However, in 1870 Weierstrass showed that it does not hold in general. In particular, the integral
$$J(u) = \int_0^1 u^2(t)\,dt$$

4.5.2 The Ritz-Galerkin Method

There is a simple and natural approach to the numerical solution of elliptic boundary-value problems. Instead of minimising the functional J defining the corresponding variational problem over all of $H^m(\Omega)$ or $H_0^m(\Omega)$, respectively, we minimise it over some suitable finite-dimensional subspace [Ritz 1908]. The standard notation for the subspace is $S_h$. Here h stands for a discretisation parameter, and the notation suggests that the approximate solution will converge to the true solution of the given (continuous) problem as $h \to 0$.
We first consider approximation in general subspaces, and later show how to apply it to a model problem. The solution of the variational problem
$$J(v) = \frac12 a(v, v) - \ell(v) \;\to\; \min_{S_h}!$$
in the subspace $S_h$ can be computed using the Characterisation Theorem 4.5.1. In particular, $u_h$ is a solution provided
$$a(u_h, v) = \ell(v) \quad \text{for all } v \in S_h. \qquad (4.37)$$

Suppose $\{w_1, w_2, \ldots, w_N\}$ is a basis for $S_h$. Then (4.37) is equivalent to
$$a(u_h, w_i) = \ell(w_i), \quad i = 1, 2, \ldots, N.$$
Assuming $u_h$ has the form
$$u_h = \sum_{k=1}^N z_k w_k, \qquad (4.38)$$
we are led to the system of equations
$$\sum_{k=1}^N a(w_k, w_i)\, z_k = \ell(w_i), \quad i = 1, 2, \ldots, N,$$
which we can write in matrix-vector form as
$$Az = b,$$
where $A = a_{i,k} := a(w_k, w_i)$ and $b_i := \ell(w_i)$. Whenever a is an $H^m$-elliptic bilinear form, the matrix A is positive definite:
$$z^T A z = \sum_{i,k} z_i A_{ik} z_k = a\Big(\sum_k z_k w_k,\ \sum_i z_i w_i\Big) = a(u_h, u_h) \geq \alpha \|u_h\|_m^2,$$
and so $z^T A z > 0$ for $z \neq 0$. Here we have made use of the bijective mapping $\mathbb{R}^N \to S_h$ defined by (4.38). Without explicitly referring to this canonical mapping, in the sequel we will identify the function space $S_h$ with $\mathbb{R}^N$. In the engineering sciences, and in particular if the problem comes from continuum mechanics, the matrix A is called the stiffness matrix or system matrix.

4.5.3 Other Methods

There are several related methods:

Rayleigh-Ritz Method: here the minimum of J is sought in the space $S_h$. Instead of the basis-free derivation via (4.37), usually one finds $u_h$ as in (4.38) by solving the equations $(\partial/\partial z_i)\, J(\sum_k z_k w_k) = 0$.

Galerkin Method: the weak equation (4.37) is solved for problems where the bilinear form is not necessarily symmetric. If the weak equations arise from a variational problem with a positive quadratic form, then often the term Ritz-Galerkin Method is used.

Petrov-Galerkin Method: here we seek $u_h \in S_h$ with
$$a(u_h, v) = \ell(v) \quad \text{for all } v \in T_h,$$
where the two N-dimensional spaces $S_h$ and $T_h$ need not be the same. The choice of a space of test functions different from $S_h$ is particularly useful for problems with singularities.

As we saw in previous sections, the boundary conditions determine whether a problem should be formulated in $H^m(\Omega)$ or in $H_0^m(\Omega)$. For the purposes of a unified notation, in the following we always suppose $V \subset H^m(\Omega)$, and that the bilinear form a is always V-elliptic, i.e.,
$$a(v, v) \geq \alpha \|v\|_m^2 \quad \text{and} \quad |a(u, v)| \leq C\,\|u\|_m \|v\|_m \quad \text{for all } u, v \in V,$$
where $0 < \alpha \leq C$. The norm induced by a is thus equivalent to the energy norm, which we use to get our first error bounds. In addition, let $\ell \in V'$ with $|\ell(v)| \leq \|\ell\|\,\|v\|_m$ for $v \in V$. Here $\|\ell\|$ is the (dual) norm of ℓ.

4.6 One dimensional problems

In this section we will take a one-dimensional elliptic problem and transform it from its differential form all the way to its computer implementation.

4.6.1 Dirichlet boundary conditions

Let us have the following boundary value problem:
$$-\frac{d}{dx}\left(k\frac{du}{dx}\right) = f \;\text{ in } [a, b],$$
with boundary conditions
$$u(a) = u_a, \qquad u(b) = u_b.$$
The weak form of the problem is
$$\int_a^b k\frac{du}{dx}\frac{dv}{dx}\,dx - \left[v\,k\frac{du}{dx}\right]_a^b = \int_a^b f\,v\,dx. \qquad (4.39)$$
Select the appropriate spaces of approximation of the functions u and v as
$$u \in H^1, \text{ where } H^1 = \{u \text{ such that } u \text{ is continuous and derivable by parts}\},$$
$$v \in H_0^1, \text{ where } H_0^1 = \{v \in H^1 \text{ and } v(a) = v(b) = 0\}.$$
The effect of selecting $v \in H_0^1$ is to cancel (without loss of generality) the second term on the left side of (4.39). The spaces $H^1$ and $H_0^1$ can be expressed with the same base as
$$(H^1)^n = \text{span}\{w_i,\ i = 1 \ldots n\}$$
$$(H_0^1)^n = \text{span}\{w_i,\ i = 2 \ldots n-1\}.$$

That is, the functions $w_1$ and $w_n$ are omitted in $(H_0^1)^n$. In the Dirichlet boundary problem the values of the function at the boundary points a and b are known. Then u can be expressed in terms of $H_0^1$ in the following way:
$$u(x) \approx u_a w_1 + \sum_{i=2}^{n-1} u_i w_i + u_b w_n.$$
The abstract form of the Dirichlet boundary problem in (4.39) can then be written as $a(u, v) = \ell(v)$:
$$a\Big(u_a w_1 + \sum_{i=2}^{n-1} u_i w_i + u_b w_n,\ \sum_{j=2}^{n-1} v_j w_j\Big) = \ell\Big(\sum_{j=2}^{n-1} v_j w_j\Big)$$

and applying the bi-linearity properties of the a operator we have
$$a\Big(u_a w_1 + u_b w_n,\ \sum_{j=2}^{n-1} v_j w_j\Big) + a\Big(\sum_{i=2}^{n-1} u_i w_i,\ \sum_{j=2}^{n-1} v_j w_j\Big) = \ell\Big(\sum_{j=2}^{n-1} v_j w_j\Big)$$
$$\sum_{j=2}^{n-1} v_j\, a(u_a w_1 + u_b w_n,\ w_j) + \sum_{i=2}^{n-1} u_i \sum_{j=2}^{n-1} v_j\, a(w_i, w_j) = \sum_{j=2}^{n-1} v_j\, \ell(w_j).$$
Defining $A^D = a(w_i, w_j)$ and $\ell^D = \ell(w_j)$ for $i, j = 2 \ldots n-1$, the last expression can be simplified as
$$\sum_{j=2}^{n-1} v_j\, a(u_a w_1 + u_b w_n,\ w_j) + \langle A^D u, v\rangle = \langle \ell^D, v\rangle,$$
where u and v represent vectors with components $(2 \ldots n-1)$. Applying the bi-linearity property once again to the first term of the left side of the equation, we have
$$u_a \sum_{j=2}^{n-1} v_j\, a(w_1, w_j) + u_b \sum_{j=2}^{n-1} v_j\, a(w_n, w_j) + \langle A^D u, v\rangle = \langle \ell^D, v\rangle.$$
Notice that $a(w_1, w_j)$ with $j = 2, \ldots, n-1$ corresponds to the first column of the matrix A, but without the boundary rows 1 and n. We denote this column vector of the matrix by $a_1^D$. Similarly, $a(w_n, w_j) = a_n^D$ is the n-th column of matrix A, again without the boundary rows 1 and n. (As the matrix is symmetric, there is no difference between a column and a row vector.) Using this definition, the last equation can be rewritten in compact form as
$$\langle A^D u, v\rangle = \langle \ell^D, v\rangle - u_a \langle a_1^D, v\rangle - u_b \langle a_n^D, v\rangle,$$
and applying the properties of the dot product we have
$$\langle A^D u, v\rangle = \langle \ell^D - u_a a_1^D - u_b a_n^D,\ v\rangle. \qquad (4.40)$$

Because (4.40) must be valid for all v, it is true that
$$A^D u = \ell^D - u_a a_1^D - u_b a_n^D, \qquad (4.41)$$
or, defining $g = u_a a_1^D + u_b a_n^D$,
$$A^D u = \ell^D - g. \qquad (4.42)$$
Solving the linear system of equations represented by (4.42) we find the vector u, that is, the values of u(x) at the nodes: $u_i = u(x_i)$.

4.6.2 Pragmatics

How to find $A^D$ and g?
First notice that the matrix A can be expressed in terms of its columns as
$$A = \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n}\\ \vdots & \vdots & & \vdots\\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{pmatrix} = \begin{pmatrix} a_1 & a_2 & \ldots & a_n \end{pmatrix},$$
where $a_i$ represents the i-th column of matrix A. In the same way, if we multiply the matrix by a vector that has only the j-th position different from zero, the result is the j-th column of the matrix times this value:
$$A\,(0, \ldots, 0, u_j, 0, \ldots, 0)^T = u_j\, a_j.$$
Notice that the matrix-vector product can be expressed in terms of the column description of a matrix as
$$A\,(u_1, u_2, \ldots, u_n)^T = u_1 a_1 + u_2 a_2 + \ldots + u_n a_n.$$
This result can be used to compute the value of $u_a a_1^D$. Notice that if we multiply A by a vector with zeros in all the entries but the first one, which is set equal to $u_a$, then we have
$$A\,(u_a, 0, \ldots, 0)^T = u_a\, a_1.$$
If we now remove rows 1 and n from the column vector $a_1$, we obtain the desired value $a_1^D$. The matrices A and $A^D$ can be graphically seen as shown in figure 4.3.

Figure 4.3: Schematic representation of matrices A and $A^D$
In summary (a sketch of these steps is given below):
i. Compute A and ℓ.
ii. Construct the vector $g^0 = (u_a, 0, \ldots, 0, u_b)^T$.
iii. Compute $u_a a_1^D + u_b a_n^D$ in two steps: multiply $A g^0$, then remove the boundary rows 1 and n from the vector $A g^0$; that is, $g = (A g^0)^D$.
iv. Remove the boundary rows and columns (1 and n) from A to compute $A^D$, and the rows (1 and n) from the vector ℓ to obtain $\ell^D$.
v. Solve u from $A^D u = \ell^D - g$.
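A Python sketch of these five steps follows; the function and variable names are illustrative, and a fixed ordering (Dirichlet nodes first and last) is assumed.

import numpy as np

def solve_dirichlet(A, l, ua, ub):
    n = A.shape[0]
    g0 = np.zeros(n)
    g0[0], g0[-1] = ua, ub               # known boundary values at nodes 1 and n
    g = (A @ g0)[1:-1]                   # g = (A g0)^D: drop the boundary rows
    AD = A[1:-1, 1:-1]                   # remove boundary rows and columns
    lD = l[1:-1]
    u_int = np.linalg.solve(AD, lD - g)  # A^D u = l^D - g
    return np.concatenate([[ua], u_int, [ub]])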

4.6.3 Computation of the ℓ vector

The ℓ vector is given by
$$\ell_j = \ell(w_j) = \int_\Omega f\,w_j\,dx. \qquad (4.43)$$
In general we can express the function f in terms of the base of the space of approximation, so
$$f = \sum_s f_s w_s.$$
Replacing this value in equation (4.43) we get
$$\ell_j = \int_a^b \sum_s f_s\,w_s\,w_j = \sum_s f_s \int_a^b w_s\,w_j, \qquad (4.44)$$
that is, each entry $\ell_j$ is a weighted sum of the integrals $\int_a^b w_s w_j\,dx$.

4.6.4 von Neumann Boundary Conditions

Let us suppose now that the von Neumann boundary conditions are known at the extreme points a and b:
$$-k\frac{du}{dx}\Big|_a = q_a, \qquad -k\frac{du}{dx}\Big|_b = q_b.$$
The minus sign was attached to the constant in order to give it the same meaning as in heat transfer, where the heat flux q is proportional to the negative of the temperature gradient; that is, heat flows from high to lower temperatures. In order to consider von Neumann boundary conditions we come back to the weak form (equation 4.16):
$$\int_a^b k\frac{du}{dx}\frac{dv}{dx}\,dx - \left[v\,k\frac{du}{dx}\right]_a^b = \int_a^b f\,v\,dx.$$
The second term of the weak form can now not be cancelled as in the Dirichlet boundary problem. In the Dirichlet problem it was cancelled by making the function v(x) vanish at the boundary. Now we need to refine the definition so that it does not cancel the terms where the von Neumann boundary condition is known. This is easily accomplished by defining v(x) to vanish just at the Dirichlet boundary and not on the whole boundary. If v is expressed in terms of the base, $v(x) = \sum_j v_j w_j(x)$, then $v_j = 0$ at the Dirichlet boundary. Replacing this in the second term of the weak form we have
$$\left[v\,k\frac{du}{dx}\right]_a^b = \sum_j v_j \left[w_j\,k\frac{du}{dx}\right]_a^b = \langle v, \bar q\rangle,$$
where $\bar q$ was defined as
$$\bar q_j = \left[w_j\,k\frac{du}{dx}\right]_a^b. \qquad (4.45)$$
Notice that the relation of $\bar q_j$ to the heat flux is $\bar q_j = -q_j$,

where the minus sign is introduced to be coherent with the heat flux definition:
$$\bar q_j = \left[w_j\,k\frac{du}{dx}\right]_a^b = w_j(b)\,k\frac{du}{dx}\Big|_b - w_j(a)\,k\frac{du}{dx}\Big|_a, \qquad (4.46, 4.47)$$
and because
$$w_j(a) = \begin{cases} 1 & \text{for } j = 1\\ 0 & \text{for } j \neq 1 \end{cases} \qquad\text{and}\qquad w_j(b) = \begin{cases} 1 & \text{for } j = n\\ 0 & \text{for } j \neq n, \end{cases}$$
then
$$\bar q = \begin{pmatrix} -k\frac{du}{dx}\big|_a\\ 0\\ \vdots\\ 0\\ k\frac{du}{dx}\big|_b \end{pmatrix}. \qquad (4.48)$$
As a final remark, remember that in general at least one Dirichlet boundary condition is essential to guarantee a unique solution, in which case the first (a) or the last (b) entry of $\bar q$ will be cancelled.
Example 4.3
Find u(x) that solves
$$-\frac{d}{dx}\left(k\frac{du}{dx}\right) = f(x)$$
defined over Ω = [0, 2], with k = 1 and f(x) = 0, and boundary conditions
Dirichlet: u(a) = 10
von Neumann: q(b) = 1.

Solution. In this case we will use only two linear elements of size one. On each element the local base functions are defined by
$$w_1(x) = -x + 1, \qquad w_2(x) = x,$$
and their derivatives by
$$\frac{dw_1}{dx} = -1, \qquad \frac{dw_2}{dx} = 1.$$
The local matrix for the element is given by
$$\int_0^1 \frac{dw_i}{dx}\frac{dw_j}{dx}\,dx = \begin{pmatrix} 1 & -1\\ -1 & 1 \end{pmatrix}.$$
The global matrix can be constructed as
$$A = \begin{pmatrix} 1 & -1 & 0\\ -1 & 2 & -1\\ 0 & -1 & 1 \end{pmatrix}.$$
Dirichlet boundary conditions:
$$d = \begin{pmatrix} 1 & -1 & 0\\ -1 & 2 & -1\\ 0 & -1 & 1 \end{pmatrix} \begin{pmatrix} 10\\ 0\\ 0 \end{pmatrix} = \begin{pmatrix} 10\\ -10\\ 0 \end{pmatrix}.$$
von Neumann boundary condition at b: if the heat flow is q(b) = 1, then
$$-k\frac{du}{dx}\Big|_b = q(b) = 1,$$
and therefore the $\bar q$ vector is given by
$$\bar q = \begin{pmatrix} 0\\ 0\\ -1 \end{pmatrix}.$$
The final system is given by
$$[A]u + d - \bar q = \ell.$$
Reducing the size by suppressing the Dirichlet position (row one),
$$\begin{pmatrix} 2 & -1\\ -1 & 1 \end{pmatrix} \begin{pmatrix} u_1\\ u_2 \end{pmatrix} = \begin{pmatrix} 0\\ 0 \end{pmatrix} + \begin{pmatrix} 10\\ 0 \end{pmatrix} + \begin{pmatrix} 0\\ -1 \end{pmatrix} = \begin{pmatrix} 10\\ -1 \end{pmatrix},$$
which results in
$$u = \begin{pmatrix} 9\\ 8 \end{pmatrix};$$
after coupling with the Dirichlet boundary conditions,
$$u = \begin{pmatrix} 10\\ 9\\ 8 \end{pmatrix}.$$
From this we can notice that the slope in the second element is equal to (8 − 9)/1 = −1, as expected from the von Neumann boundary condition (q(b) = −k du/dx|_b = 1). The slope is the same in the first element, as expected from the fact that the source f(x) = 0.
Example 4.4
Construct the matrix of the system corresponding to the one-dimensional conduction problem defined over an interval [a, b], using second order Lagrange polynomials. See figure 4.4.

Solution.

Figure 4.4: Finite element subdivision of a one-dimensional domain. This subdivision corresponds to second order Lagrange elements. For this example the elements are defined to have constant size h, and all the nodes are separated by a constant distance h/2.

Figure 4.5: Base functions $w_1$, $w_2$, $w_3$ for a one-dimensional second order Lagrange element

Let us consider a one-dimensional domain defined by the interval [a, b] as in figure 4.4. The interval is partitioned into four elements of equal size. This is not always the case, but it will simplify the computations in this example. We are interested in finding the temperature at each of the nodes (1..9).
A second order Lagrange element consists of three base functions, as can be observed in figure 4.5. Each of the functions has the form
$$w_i = a_i x^2 + b_i x + c_i.$$
An element k will have nodes numbered $2k-1$, $2k$, and $2k+1$. If ne is the number of elements, then the total number of nodes will be $2(ne) + 1$.
For an element whose nodes are located at r, $r + h/2$, and $r + h$, the base functions are given by
$$w_1 = \frac{2}{h^2}x^2 - \frac{4r + 3h}{h^2}x + \frac{2r^2 + 3rh + h^2}{h^2}$$
$$w_2 = -\frac{4}{h^2}x^2 + \frac{4(h + 2r)}{h^2}x - \frac{4r(r + h)}{h^2}$$
$$w_3 = \frac{2}{h^2}x^2 - \frac{4r + h}{h^2}x + \frac{r(h + 2r)}{h^2}$$
Here the base functions $w_1$, $w_2$, and $w_3$ were locally numbered, when in fact they correspond to the global indices $2k-1$, $2k$, and $2k+1$. The local system matrix for this element can be calculated by solving the integrals
$$a^{\text{local}}_{ij} = \int_r^{r+h} \frac{dw_i}{dx}\frac{dw_j}{dx}\,dx.$$
That gives as a result
$$A^{\text{local}} = \begin{pmatrix} \frac{7}{3h} & -\frac{8}{3h} & \frac{1}{3h}\\ -\frac{8}{3h} & \frac{16}{3h} & -\frac{8}{3h}\\ \frac{1}{3h} & -\frac{8}{3h} & \frac{7}{3h} \end{pmatrix},$$
which completely defines the integrals for one general element.

Algorithm 5 Global matrix creation for example


ne is the number of elements
for k = 1 to ne do
g(1) := 2k 1
g(2) := 2k
g(3) := 2k + 1
for i = 1 to 3 do
for j = 1 to 3 do
A[g(i), g(j)] := A[g(i), g(j)] + Alocal [i][j]
end for
end for
end for
However from (4.22) we have that the global system matrix is defined as an integral
over the whole domain. This integral can be decomposed as the sum of integrals over the
total number of elements

Z b
ne Z
X
dwi dwj
dwi dwj
dx =
.
a(wi , wj ) =
ke dx dx
a dx dx
ke

One could be tempted to transcript the last equation into an algorithm in order to compute
the global matrix. However, this procedure can be very inefficient as most of the terms
of the matrix will be equal to zero. This can be concluded by observing a function wi in
the figure 4.4. In general, a function wi is only different to zero in the range from xi2
i dwj
to xi+2 . Therefore, the product dw
dx dx will be different to zero only if |i j| < 4 for the
case of P2 Lagrange polynomials in one dimension.
A better approach to construct the global system matrix is presented in Algorithm 5.

4.7

Exercises

1. For which values of a, the equation (1 a2 )


hyperbolic? Justify your answer.

d2 u d2 u
+ 2 = 0, is elliptic, parabolic and
dx2
dy

86

The Finite Element Method

2. Given the following differential equation




du
d
b(x)
+ c(x)u = f (x).

dx
dx
for u : R R with b(x), c(x), y f (x) known functions
(a) Find the abstract form and the weak form of the problem a(u, v) = `(v).
(b) Shows that a(u, v) is a bilinear form.
(c) Find the stiffness matrix of the system.

Chapter 5

Two-dimensional Elliptic Problems


5.1

Poissons equation

Let u be a scalar field defined over a domain R2 with boundary = 1 2 3 as


in figure 5.1. Then Poissons equation is defined as
2 u = f (x).

(5.1)

In the heat transfer case, Poissons equation governs the steady state temperature distribution for which f (x) represents the internal heat source. To complete the boundary
problem, the following boundary conditions are defined:
Dirichlet: u|1 = g
temperature at boundary 1 .
 
u
Newman: hu, ni|2 =
= q heat flow at boundary 2
n 2
Fourier:

( hu, ni + u)3 =

convection at boundary 3 .

where 1 2 3 =

3
1

2
Figure 5.1: Definition of the domain and boundary

87

88

5.2

Two-dimensional Elliptic Problems

Weak form of the problem

To obtain the weak form of the problem we multiply (5.1) for a test function v and integrate
over the whole domain
Z
Z
2 u v d =

f v d.

Then we apply Greens equation (4.14) to reduce the order of the equation and obtain
Z

Z
uv d

Z
v hu, ni d =

f v d

(5.2)

which is the weak form of equation (5.1).


Before going further, we need to identify the function space of approximation for which
(5.2) has a solution. Let L2 be the space of functions which are square integrable, that is


L () =

Z


2

v : R,
|v| d < ,

and let H 1 be the space of functions whose first partial derivatives are in L2 ,


H () =




u
2
2

u : R u L and
L () .
xi

Then to guarantee a solution, functions u and v must belong to a space U which is a


subset of H 1 . Additionally it is necessary to define one more space, the space of functions
in H 1 that are equal to zero in the boundary of the domain,
H01 () =




v : R v H1 and v = 0 on the boundary .

Figure 5.2 shows a two-dimensional example of a function v in H01

v() = 0
x

Figure 5.2: Schematic view of a function v : R, v H01 , and R2

5.3 Discrete problem

5.2.1

89

Weak form of the Dirichlet homogeneous boundary problem

A boundary condition that is equal to zero it is called homogeneous. For a given function
u defined over with boundary the Dirichlet homogenous boundary condition is defined
as
u| = 0.
If v H01 () the second term of (5.2) is cancelled and the weak form of the problem
is simplified as
Z
Z
uv d =
f v d
(5.3)

5.2.2

Weak form of the von Newman homogeneous boundary problem

This means, the directional derivative normal to the surface is equal to zero. For applications in the area of heat flow it means the boundary is thermally insulated and the heat
flux throughout the boundary is zero.

u
= 0.
n
Assuming v H 1 the second term of (5.2) is cancelled by the Newman homogeneous
condition. The weak form of the problem is transformed into
Z
Z
uv d =
f v d.
(5.4)

Care must be taken as problems with only von Newman boundary conditions give rise
to singular systems.

5.3

Discrete problem

Let
H 1 = span {w1 , w2 , . . . , wn }
then
u (x) =

ui wi (x)

where

ui = u (xi ) ,

In the same way


v (x) =

vj wj , (x)

defining:
Z
uv d.

a (u, v) =

x R2 .

90

Two-dimensional Elliptic Problems

It can be shown that a (u, v) is a bilinear operator, therefore

X
X
= a
ui wi (x),
vj wj (x)
i

ui vj a (wi (x) , wj (x))

i,j

and

Z
` (v) =

f v d

which is a lineal operator and therefore


` (v) =

vj ` (wj ).

Moreover, (5.3) is transformed into


X
X
ui vj a (wi , wj ) =
vj ` (wj ).
i,j

(5.5)

If a (wi , wj ) = aij and ` (wj ) = `j , then (5.5) can be rewritten in terms of matrices and
inner products as
hAu, vi = h`, vi ,
(5.6)
where
A = aij
u = (u1 , u2 , . . . , un )T
v = (v1 , v2 , . . . , vn )T
` = (`1 , `2 , . . . , `n )T .
By the properties of the inner product
hAu, vi = h`, vi
hAu, vi h`, vi = 0
hAu `, vi = 0
.
Because this is valid for all v H 1 then
Au ` = 0
Au = `.
Solving the system we found u that satisfies 2 u = f at nodal points. The matrix
A is usually referred as the stiffness matrix or as the system matrix. This is due to the

5.4 Computation of the stiffness matrix

91

similarity with the elasticity problem were the method was first developed. It can be
calculated by
Z
wi wj d.
(5.7)
A = aij =

Dividing the domain into a set of finite elements, the stiffness matrix over the domain
can be calculated as the sum of integrals over the total number of elements, that is,
XZ
wi wj de
aij =
e

5.4

Computation of the stiffness matrix


y

wA

a
h

wA

c
A

x
x

Figure 5.3: Geometric computation of Lagrange P1 integrals

This section discusses a method for the computation of the integrals that constitute
matrix terms ai,j of the discrete system. This indeed depends on the space of approximation selected for the solution. For the present example, the space of approximation consists
of first order Lagrange polynomials (P1 Elements). In order to evaluate the integrals, we
first have to evaluate the gradient of these functions. So let ABC define a triangle in the
domain as shown in figure 5.3. Vertex A, B, and C are ordered in the counterclockwise
direction. The length of the sides of the triangles are defined in the following way:






a = BC , b = CA , c = AB .
Each triangle has three possible functions: wA (x), wB (x), wC (x) which are first order
functions defining a plane. The gradient of each one of these functions can be easily
calculated by visualising the plane and computing its slope and direction. We will compute
the gradient for function wA (x) and then extend the result for wB and wC . If wA (x) is a
lineal function such that wA (A) = 1 and wA (B) = wA (C) = 0, see figure 5.3. Then wA
is orthogonal to the BC side. (BC is the iso-level line and the gradient is always normal
to the iso-level lines.) As wA goes from zero to one from the BC side to the A vertex and
h is the height of the BC side of the triangle, then the slope is equal to 1/h, that is:
|wA | =

1
.
h

92

Two-dimensional Elliptic Problems

To obtain the direction of the gradient, we proceed in the following way. If k is a unit
vector following the z-axis direction, then a vector in the xy plane perpendicular to the

BC side is given by the cross product of k and BC, that is:

k BC
k BC
=
.

a
BC
Because the area of the triangle is given by Area = ah/2, then the gradient vector can be
expressed as

1 k BC
k BC
wA =
=
.
h
a
2 Area
Notice that the gradient of the base function wA is equal to the cross-product between k
and the vector defined by the opposite vertex (in the counter clock direction) and divided
by 2 times the area of the triangle. This result can be extended to base functions wb and
wC
wB =
wC

k CA
2 Area

k AB
.
2 Area

Next we need to compute the dot product of each of the two gradients in the triangle.
For example:
wA wB =
=

k BC k CA

2 Area 2 Area

BC CA
.
4 Area2

And the integral of this product over the triangle


Z
BC CA
wA wB d =
d
2

4A
Z
BC CA
=
d
4 A2


BC CA
=
.
4A

In a similar way it can be found that


Z
a2
wA wA d =
.
4A

5.5 Non-homogeneous Dirichlet boundary problem

5.5

93

Non-homogeneous Dirichlet boundary problem

In this case, the general integral form of the problem is given by


Z
Z
Z
fv
u v v u n =

with boundary conditions


u| = g,
where g is a known value. We select u U and v V defined as before,
U H 1 [] continuum and derivable by parts
V H01 [] equal to zero in the boundary
R
in such a way that the second term of the integral equation, v u n, is equal to zero.
The abstract form of the problem becomes,
a(u, v) = `(v).
The solution space H 1 is selected as a finite space with base H 1 = span{w1 , w2 , . . . , wn }.
A function wi is referred as base (or shape) function. The main characteristic of these
functions is that they only take values for a small domain around the point where they
are defined and zero in the rest. That is, wi is defined around the point (xi ) and zero
everywhere else. As such, there are functions that corresponds to boundary points wj ,
such as xj and functions that correspond to inside points wi such as xi I. This way
we can classify the nodes into two sets and I, with I = . If a function v H01 ,
then vi = v(xi ) = 0 for all i so H01 can be expressed in terms of the same base wi if
we only use the functions with inside indexes. Given
n
v H01
and
H 1 = span {wi }n1 ,
then v can be expressed in terms of the H01 base if we only take the functions wi that do
not belong to boundary points (xi
/ )
v=

n
X

vi wi (x) =

vi wi (x)

iI

where i I means xi I. Notice that in general, the node index j = i . . . n can be


classified as inside (i I) if the point xj or in the boundary j , if the point
xj .
In order to express a function u in terms of the base, we have to take into account the
known values of the function at the boundary
u| = g
where g is the Dirichlet boundary condition expressed in terms of boundary nodes,
u (xi ) = gi ,

for all xi .

94

Two-dimensional Elliptic Problems

A*

AD

Figure 5.4: Graphic representation of a matrix for a two or three dimensional problem. The rows
and columns with indexes corresponding to inside points (I) are organised at the beginning of the
matrix while the rows and columns with indexes belonging to boundary points are located at
the end.

Using these known values, the function u can be expressed in terms of the same base
of functions by separating the unknown values u
from the known values g at the boundary
X
X
u=
u
i wi +
gi wi .
iI

Replacing the values of functions u and v for their interpolated approximation into the
abstract form, we have
!
X
X
X
u
i wi +
gi wi ,
v j wj
a (u, v) = a
I

!
= a

u
i wi ,

v j wj

+a

gi wi ,

u
i vj a (wi , wj ) +

i,jI

!
X

v j wj

gi vj a (wi , wj )

i,jI

u
i vj aij +

i,jI

gi vj aij .

(5.8)

i,jI

The indices for the first term of the right-hand side correspond to inside points only and
therefore can be written in compact notation as
X


, v
u
i vj aij = AD u
i,jI

where AD is the matrix that contains only the indices of the inside points. Figure 5.4
shows a schematic representation of this matrix.
The second term of (5.8) can be decomposed as
X
X
X
gi vj aij =
vj dj
with dj =
aij gi .
i,jI

jI

5.6 Non-homogeneous von Newman boundary problems


Remember that gj are the values of u at the boundary points, that is
gj = u(xj ). Then, if we define the vector g as
(
0
if i I
gi =
u(xi ) if i

95

(5.9)

then we can extend the limits of the sum until I + and apply the symmetry property
of matrix aij to obtain
dj =

aij gi =

iI+

aji gi

for all j I,

iI+

which is a matrix vector multiplication operation between matrix A times vector g. As


j I from this multiplication we suppress rows corresponding to boundary points, j .
This is illustrated in figure 5.5 and expressed mathematically as
d = (Ag)D ,
Then (5.8) is transformed into


, v + hd, vi .
a(u, v) AD u
Using this result the abstract form of the Dirichlet problem is transformed into

D

, v = h`, vi hd, vi .
A u
And by the properties of the inner product, we have

D

A u ` + d, v = 0.
Because this result must be valid for all v in V , we have
=`d
AD u

(5.10)

which is the discrete form of the Dirichlet problem of the Poisson equation (5.1).

5.6

Non-homogeneous von Newman boundary problems

Let u(x) defined over Rn satisfy the Poisson equation (5.1)


2 u = f (x),
with boundary conditions:
Dirichlet:
u|1 = g
von Newman:

hu, ni|2 =

function defined at boundary 1 .




u
n


=q
2

gradient defined at boundary 2

96

Two-dimensional Elliptic Problems


g

A
A*
I

= dj

Figure 5.5: Graphic representation of the computation of vector d as the multiplication of the jth
row of A times vector g. Notice that j I only.

If we subdivide the boundary in 1 + 2 , the weak form (equation 5.2) can be written
as
Z
Z
Z
Z
uv d
v hu, ni d
v hu, ni d =
f v d

As a difference to the Dirichlet boundary problem, the integral over the boundary is
known over a section of it 2 .
To find the solution in a finite space we select the solution space for u as u H 1 . For
the trial function v we select a variation of the space H01 where a function that belongs to
this space is equal to zero only at the boundary 1 . In this way we can cancel the integral
over the section 1 of the boundary and leave the integral over the section 2 which is given
by the von Newman boundary condition. Replacing the value of the boundary condition
into the weak form we have
Z

Z
uv d

Z
vq d =

f v d

The only difference at this point is in the second term of the left hand side of the
equation. Besides we will consider an index as i 1 if xi 1 and i I if xi
/ 1 then
the functions u and v can be expressed in terms of the base as

u=

ui wi +

iI

v=

X
jI

X
i1

v j wj

gi wi

5.6 Non-homogeneous von Newman boundary problems

97

then
Z

q(x)

q(x) v d =

Z
q(x)wj d

vj
2

jI

vj wj d

jI

vj qj

jI


j
= v, q
where qj was defined as
Z
qj =

q(x)wj d
2

Notice that this integral is only defined over the boundary 2 . This means that qj 6= 0
only for the indicesPj where xj 2 expressing q(x) in terms of the interpolation base
functions, q(x) = ( i qi wi ), we have
!

qj =
2

qi wi

wj d,

that is
qj =

X
i

Z
qi

wi wj d.
2

Example 5.1 Diamond domain


Solve the equation 2 u = 10 for the domain represented in figure 5.6, with the following
set of boundary conditions:
a. Dirichlet Only
Dirichlet at the nodes u(x2 ) = 2, u(x3 ) = 3, u(x4 ) = 11, u(x5 ) = 14
b. Dirichlet + von Newman

u
Dirichlet at the nodes u(x4 ) = 2, u(x5 ) = 2 and von Newman
= 3x + 3y along the
n

u
segment formed by nodes 2 and 3 and
= 0 along segments 5-3 and 2-4.
n

c. Dirichlet + von Newman assuming u = 2x2 + 3y 2 .


Set the proper von Newman boundary conditions for the same segments of the item
before such that u = 2x2 + 3y 2 . Then find u by the FEA method and compare the
results
Solution.

98

Two-dimensional Elliptic Problems


y

2
III

IV

3
I

II
x

Figure 5.6: Four elements defining a diamond shaped domain

Solution a. Dirichlet Only Before proceeding notice that this problem has an analytical solution equal to u = 2x2 + 3y 2 ; this can be proved by computing the Laplacian
of u
 2

u 2u
2
u=
+ 2 = 10.
x2
y
and verifying the Dirichlet boundary conditions with u = 2x2 + 3y 2 at nodes two to
five.
(0,1)

B
(1,0)

(0,0)

Figure 5.7: Standard element of the domain

As all the elements have the same shape, the element stiffness matrices are the same
for all of them. Figure 5.7 shows an element and its local notation used. Using the
procedure of section 5.4 the local stiffness matrix is found to be

1
1/2 1/2
0
hwi , wj i = 1/2 1/2
e
1/2
0
1/2

(5.11)

On the other hand, using analytical procedures we can calculate the following matrix
used for the source terms

Z
1/12 1/24 1/24
hwi , wj i = 1/24 1/12 1/24
(5.12)
e
1/24 1/24 1/12

5.6 Non-homogeneous von Newman boundary problems

99

In the Dirichlet problem, the nodes that belong to 1 are 2, 3, 4, 5 and only the
node number 1 is left after removing the Dirichlet nodes. We end up with a one
dimensional problem with A = a11 ,
d = (A g)D d = d1 =

a1j gj

with g formed by the Dirichlet boundary conditions as g = [0, 2, 3, 11, 14]T


To compute A we need to calculate the first row of A, aT1 . To compute A, the integral
over the domain is splitted into integrals over the elements
Z
XZ
hwi , wj i
aij =
hwi , wj i =

For example
Z
Z
Z
a11 = hw1 , w1 i +
hw1 , w1 i +
I

II

Z
hw1 , w1 i +

III

hw1 , w1 i
IV

=1+1+1+1=4

Z
hw1 , w2 i +

a12 =

hw1 , w2 i = 1

II

Z
hw1 , w3 i +

a13 =

hw1 , w3 i = 1

III

In the same way we compute for a14 and a15 completing the first row of A
aT1 = [4, 1, 1, 1, 1]


in this way d1 = (A g)D = aT1 , g = 30
Vector `
Vector ` consist only of one component
Z
`1 =

5
X

fj wj w1

Z
f (x) w1 =

j=1

and dividing the integral over the domain as the sum of integrals over the elements
an with fi = 10 constant for all i we have

5 Z
X X

`1 = (10)
wj w1
e

j=1

100

Two-dimensional Elliptic Problems


As all the elements in this example are equal, then the inner sum is the sum over
the first row of (5.12)
`1 = (10)(4)(1/12 + 1/24 + 1/24) = 20/3

AD u = ` d
4u = 20/3 30

(5.13)
(5.14)

form where u = 5.83. As the analytical solution for u(1,1) is equal to 5 then the
error is 16%.
Solution b. Dirichlet + von Newman The boundary is divided into a Dirichlet boundary 1 and von Newman boundary 2 . We will define the boundary based on the
nodes in the following way:
2 = 53 32 24
where 53 means the segment of line from node 5 to 3 and so forth. For this
example we will define the von newman boundary conditions as
q(x)|32 = 3x + 3y

and q(x)|53 = q(x)|24 = 0

that is: a flux per unit area over the segment 3-2 and isolated over segments 5-3 and
2-4.
In this case we need to compute AD which is a matrix of dimension three corresponding to nodes 1 to 3. The final system we need to solve is
AD u = ` d + q

(5.15)

where the dimension of all the vectors is three.


Computation of the AD and d
It is necessary to complete the first three rows of the complete A matrix. As we have

5.6 Non-homogeneous von Newman boundary problems

101

already calculated the first row, and using the symmetry of the matrix, we have

a21 = a12 = 1
Z
Z
hw2 , w2 i = 1/2 + 1/2 = 1
a22 = hw2 , w2 i +
II
I
Z
a23 = hw2 , w3 i = 0
ZI
hw2 , w3 i = 0
a24 =
II

a25 = 0
a31 = a13 = 1
a32 = a23 = 1/2
Z
Z
a33 = hw2 , w2 i +
I

hw2 , w2 i = 1/2 + 1/2 = 1

III

a34 = 0
a35 = 0

in compact notation

A=

4 1 1 1 1
1 1
0
0
0

1 0
1
0
0

where the stars stand for values that are not needed to be computed. The vector
with Dirichlet boundary conditions is g = [0, 0, 0, 11, 14]T , then d is equal to

d=


25
4 1 1 1 1
0


0
1 1
0
0
0

0
1 0
1
0
0 . 0 =

11

14

Vector `
Additionally to `1 we need to compute `2 and `3

102

Two-dimensional Elliptic Problems

Z
f (x) w2 =

`2 =

= 10

5
X
XZ
j=1

fj wj w2

!
wj w2

IV

II

w3 w2

w2 w2 +

w1 w2 +

w3 w2 +

w2 w2 +
II

Z
w1 w2 +

II

j=1

Z
= 10

5
X

IV

IV

= 10(2)(1/12 + 1/24 + 1/24)


= 10/3
Finally by symmetry `3 = `2 so we have

2
10
1
`=
3
1
vector
Computation of the q
vector will be different from zero only at positions 2 and 3. So we will proceed
The q
to compute q2 as
Z
Z
Z
Z
q2 =
q(x)w2 d =
q(x)w2 d +
q(x)w2 d +
q(x)w2 d
2

53

32

24

In the first section, 53 , the integral is zero because q(x) is equal to zero and also
w2 (x) is equal to zero. In the third section, 24 , the integral is zero because q(x) is
equal to zero (in this case w2 (x) is different to zero and we would have to compute
this integral if q(x) were not zero ). So finally q2 is reduced to:
Z
q(x)w2 d
q2 =
32

This integral can be computed in two different ways. The first one is only practical
when solving problems by hand, the second one is used in a general way and it is
easy to program.
a Direct integration of q(x)
Z
q2 =
32

Z
q(x)w2 d =

(3x + 3y)w2 d
32

to be able to compute this integral we should first change the coordinate system
x y to a local one-dimensional system in the direction of the segment 3 2
and with origin located at 3. Figure 5.8 shows the projection of q(x) and w2

5.6 Non-homogeneous von Newman boundary problems

103

q(x)

3
2

w2

h
Figure 5.8: Line integral

over the local


coordinate system. Here h is the length of the segment that is
equivalent to 2 in this case. Notice that q(x) becomes a constant function
q() = 3. in the same way w became a straight line w() = /2
The value of q2 can be easily calculated replacing q(x) by q() and w2 (x) by
w2 ()

Z
q2 =

(3)(/2) d
0

2
Z 2
3
3 2
=
d =
2 0
2 2 0

3 2
=
2
b Integration of q(x) in terms of the base
Z
q2 =

Z
q(x)w2 (x) d =

qr wr (x)w2 (x) d

as qr is a constant, equal to du/dn evaluated at the point xr , we can take it


out of the integral
X Z
q2 =
qr
wr (x)w2 (x) d
r

In a similar way to case a. we change the coordinate system to a local onedimensional system , as appreciate in figure 5.9. Over this coordinate system
the two base functions locally called as wA () for node A and wB () for a node
B.
The value of the base functions in the local coordinate system is therefore
wA () =

+1
h

wB () =

104

Two-dimensional Elliptic Problems

Figure 5.9: line integral for a general case b.

The total number of different integrals over the segment is four from which
three are different. They can be easily computed as
h

h

 

h
wA wB d =
+1
d =
h
h
6
0
0


Z h
Z h

h
wA wA d =
+1
+ 1 d =
h
h
3
0
0
Z h
Z h  

d =
wB wB d =
h
h
3
0
0
Z

Summarising
Z


wi wj d =

h/3 h/6
h/6 h/3

Finally, the value for q2 can be calculated assuming h =

q2 =

(5.16)

2 as

Z
qr

wr (x)w2 (x) d
2

Z
= qA wA wB + qB

2
2
=3
+3
6
3
3
=
2
2

Z
wB wB

(5.17)

which is the same for case a. Notice that this later procedure seems a bit more
intensive in labour but it allows a general way to compute this line integrals.
Also, the computation of the standard line integral has to be performed only
once.

5.6 Non-homogeneous von Newman boundary problems

105

In the same way we compute the value of q3 .


X Z
wr (x)w3 (x) d
qr
q3 =
2

= qA wA wA + qB

2
2
=3
+3
3
6
3
=
2
2

wB wA

(5.18)

In compact form we have

q =
2

0
3
3
0
0

And the final system after removing Dirichlet positions

25
2
0
4 1 1
10
2

1 1
0
1
3
0
+

u=
3
2
0
1
3
1 0
1

whose solution is

7.95

u=
6.74
6.74
Solution c. Dirichlet + von Newman assuming u = 2x2 + 3y 2 As we verified in Solution a, the Dirichlet boundary conditions over 1 already fulfil with this
u|1 = 2x2 + 3y 2 .
In order to set von Newman boundary conditions over 2 we need to compute the
gradient of u and the normal over the boundary segments. The gradient of u is given
by

u
!
4x
x

u =
u =
6y
y
In the same way as before, the von Newman boundary 2 is divided in three sections
2 = 53 32 24

106

Two-dimensional Elliptic Problems


and each section has a external normal vector:


1
1

32 : n
32 =
,
2
2


1
1
24 : n
24 = ,
2
2


1 1
53 : n
53 = ,
2 2
So q(x) can be computed for each boundary segment as



u
1

=
= hu, n
32 i = (4x + 6y)
q(x)

n 32
2


32

1
u
= hu, n
24 i = (4x 6y)
q(x)
=

n 24
2

24


u
1
q(x)
=
= hu, n
53 i = (4x + 6y)

n
2
53
53

Having defined the von Newman boundary conditions we proceed to compute the
new q and solve the resulting system with the terms previously computed. In order
to compute this vector the flux function q(x) should be evaluated at each boundary
node. As the nodes are shared by two segments this value will depend on the segment
(due to its different normal orientation). Figure 5.10 shows the different values of
q(x) according to the boundary segment.
The only boundary values we need to compute for this calculation are q2 and q3 .
Z
q3 =
q(x) w3 (x) d
2
Z
Z
=
q(x) w3 (x) d +
q(x) w3 (x) d + 0
53
32
X Z
X Z
wr (x)w3 (x) d +
qr
=
qr
wr (x)w3 (x) d
r

53

32

and expanding the sums over the local segments we have



 Z

 Z
Z
Z
q3 = qA wA wB + qB wB wB
+ qA wA wA + qB wB wA
=
2
=
3

!
2
6
+
6
2

!!
2
3

53

53

!
2
4

3
2

!!
2
6

32

32

(5.19)

5.6 Non-homogeneous von Newman boundary problems

q5 =

107

8
2

q3 =

6
2

q4 =

2
2

q3 = 62
2

q2 =

q2 =

42

4
2

Figure 5.10: Definition of the flux function q(x) at each segment. Notice that q2 and q3 take
different values depending on the segment they are defined on.

In the same way for q2


Z
q2 =

q(x) w2 (x) d

(5.20)

Z
q(x) w2 (x) d +
q(x) w2 (x) d + 0
32
24
X Z
X Z
=
qr
wr (x)w2 (x) d +
qr
wr (x)w2 (x) d

32

24

and expanding the sums over the local segments (see figure 5.10) we have


q2 = qA
6
=
2
=

Z
wA wB + qB
!
2
4

6
2

 Z

Z
+ qA wA wA + qB wB wA

wB wB
32

!!
2
3

32

2
3

!
2
2
+
3
2

!!
2
6

24

24

(5.21)

In compact form we have

0
q = 2/3
2/3

108

Two-dimensional Elliptic Problems


And the final system after removing Dirichlet positions

4 1 1
2
25
0
10
1 1
0 u = 1
0 + 2/3
3
1 0
1
1
0
2/3
whose solution is

7.95

u=
6.74
6.74

5

kue uk

As the exact solution is ue =


2 the approximation error is kue k = 0.14.
3
Example 5.2 Square domain
Solve the following partial differential equation defined over the two-dimensional domain
shown in figure 5.11 with Dirichlet and von Newman boundary conditions
2 u = 4
u=0

for x = 0,

y = 0...1

u = 0 for y = 0, x = 0 . . . 1
u
= 3x + 2x2 + 4xy for y = 1,
n
u
= 3y + 2y 2 + 4xy for x = 1,
n
where u : R2 R, R and given by


= (x, y) 0 x 1,

x = 0...1

and

y = 0 . . . 1.


0y1

Find:
i. The Dirichlet matrix of the system AD for the given topology.
ii. Compute the ` vector.
iii. Find the Dirichlet boundary conditions resulting vector.
iv. Find the von Newman boundary conditions resulting vector.
v. Solve the system.
Solution.
D ,
To find the solution it is necessary to solve the system AD uD = `D (Ag)D q

5.6 Non-homogeneous von Newman boundary problems

109

y
5

0.5

0.5

Figure 5.11: Discrete representation of the domain for Square Domain example

Dirichlet matrix (local matrix Al ): As all the elements have the same shape we only
have to compute the local stiffness matrix once over one element and reproduce the
results to the other elements.
For simplicity, we select the VI element and number it as
A

AB = (0, 1/2)

(5.22)

BC = (1/2, 1/2)

(5.23)

CA = (1/2, 0)

(5.24)

By definition, the stiffness matrix for 2D linear Lagrange elements (P1) is given by

hBC, BCi hBC, CAi hBC, ABi


1
hCA, BCi hCA, CAi hCA, ABi
Al =
4 Area
hAB, BCi hAB, CAi hAB, ABi
Which is a symmetric Matrix. As the Area of the element is (1/8) then
and the local matrix becomes

1/2 1/4 1/4


1
1/2 1/2
0 = 1/2 1/2
0
Al = 2 1/4 1/4
1/4
0
1/4
1/2
0
1/2

1
4 Area

=2

Dirichlet Matrix (global matrix A): Because the problem is a Dirichlet Homogeneous
problem we do not have to compute the total matrix but only the dirichlet matrix.
Cancelling all the dirichlet positions we are left with positions one to four.
When computing each entry of the matrix we should notice where in the domain the
base functions related to that entry are different from zero. For example, position
a11 of the global matrix is defined as
XZ
a11 =
hw1 , w1 i
e

110

Two-dimensional Elliptic Problems


From all the elements e we notice that w1 is only different from zero at elements VI
and VII. therefore
Z
Z
a11 =
hw1 , w1 i.
hw1 , w1 i +
V II

VI

Using local notation


Z
a11 =

Z
hwA , wA i +

hwA , wA i = 1 + 1 = 2.

VI

V II

In a similar way for position a33 :


Z
XZ
hw3 , w3 i = 8 hwB , wB i = 4,
a33 =
e

and for position a22 :


a22 =

XZ

Z
hw2 , w2 i = 2

hwB , wB i = 1.
e

Also by symmetry
a44 = a11 = 2.
In similar way the positions off the diagonal are
Z
Z
a13 =
hwA , wB i +
hwA , wC i = 1
V II
ZV I
1
a12 =
hwA , wC i =
2
VI
a14 = 0
Z
Z
a23 =
hwB , wC i +
hwC , wB i = 0
VI
ZV
1
a24 =
hwB , wA i =
2
ZV
Z
a34 =
hwC , wA i +
hwB , wA i = 1
V

IV

Putting all the terms together we have

2
1/2 1
0
1/2
1
0 1/2
.
A=
1
0
4
1
0
1/2 1
2
Vector `: Given f (x) =constant the term `j becomes easy to compute for a standard
element:
Z
XZ
XZ
`j =
4wj =
4wL(j) = 4
wL(j) ,

5.6 Non-homogeneous von Newman boundary problems

111

C (0,1/2)
y= x + 1/2

A
(0,0)

B
(1/2,0)

Figure 5.12: Standard element for the computation of the integrals of the source term

where L(j) is the local index at element e. As all the elements are the same We take
an standard element as shown in figure 5.12
The base functions over the element are
wA = 1 2x 2y
wB = 2x
wC = 2y
In order to evaluate the integral over the standard element the limits are defined as
Z
Z 1/2 Z x+1/2
wj =
wj dy dx.
e

which result after evaluation in


Z
wA = 1/24
Ze
wB = 1/24
Ze
wC = 1/24
e

To compute the vector ` the only positions we need to compute are one to four:
Z

Z
Z
`1 = 4 w1 = 4
wA +
wA = (4)(2)(1/24) = 1/3

VI
V II
Z

Z
Z
`2 = 4 w2 = 4
wC +
wC = (4)(2)(1/24) = 1/3
 VZII  V III
Z
`3 = 4 w3 = 4 8 wB = (4)(8)(1/24) = 4/3
Z e

Z
Z
`4 = 4 w4 = 4
wA +
wA = (4)(2)(1/24) = 1/3

IV

112

Two-dimensional Elliptic Problems


in other words,

1/3
1/3

`=
4/3
1/3

Dirichlet Boundary Conditions As mentioned before Dirichlet boundary conditions


are zero, therefore g = 0 and Ag = 0
von Newman Boundary Conditions There are two possible ways to compute the vector q. The first one is a direct computation of the integral preserving u/n = q(x),
with x = (x, y), as an explicit function. The second one interpolates the functions
q(x) in terms of the same finite element base.
version 1 The von Newman boundary conditions vector is defined as
Z
qi =

q(x)wi (x) d

It is only necessary to compute the positions one, two and four. To compute the
integral we cut the domain with a plane at the boundary at y = 1 and visualise
the base functions. We also cut the domain with a plane at the boundary at
x = 1. These two sections are shown in figure 5.13.

q(x)

q(x)

w1

1
2x

w2
2x +2

0.5
for y =1

2x 1
1

w4

2y

w2
2y +2

0.5
for x=1

2y 1

Figure 5.13: One dimensional projections of the base functions and q(x) on the boundary at y = 1
and x = 1

These two sections represent the complete von Newman domain. To compute
the von newman vector we proceed by replacing the function q(x) for each

5.6 Non-homogeneous von Newman boundary problems

113

section
Z
q(x)w1 (x)

q1 =
Z2
=

(3x + 2x2 + 4x)w1 (x)

2
1/2

(3x + 2x2 + 4x)(2x) dx +

(3x + 2x2 + 4x)(2x + 2) dx

1/2

= 13/24
In the same way q2 can be computed as
Z
Z 1
2
(3x + 2x + 4x)(2x 1) dx +
q2 =

(3y + 2y 2 + 4y)(2y 1) dx

1/2

1/2

= 9/8
and finally q4
Z 1/2
Z
2
q4 =
(3y + 2y + 4y)(2y) dy +
0

(3y + 2y 2 + 4y)(2y + 2) dx

1/2

= 13/24
summarising

13/24
9/8

q =
0 .
13/24
version 2 Another possibility is to approximate the function q(wx) in terms of the
base functions. This approach is more suitable for programming
Z
X
qi =
q(x)wi (x) with q(x) =
qr wr

with qr = q(xr ), the result of evaluating the given function at boundary nodes.
Then we have
X Z
qi =
qr wi wr
r

R
This way the only integrals we need to compute are wi wr over one line
element of length 1/2 as shown in figure 5.14
R 1/2
R 1/2
R 1/2
In this case we need to compute 0 wA wA , 0 wA wB , and 0 wB wB , the
results are as follows
R 1/2
0

wi wr = A
B

A
1/6
1/12

B
1/12
1/12

114

Two-dimensional Elliptic Problems

u w= 2x +1
1

w= 2x

1/2

Figure 5.14: Standard 1-D element for the computation of boundary integrals

In order to compute q4 we expand the sum


Z
Z
Z
q4 = q9 w9 w4 + q4 w4 w4 + q2 w2 w4
replacing the values of the integral and q9 = 0, q4 = 1, and q2 = 3
Z
Z
Z
q4 = q9 wA wB + q4 (2) wA wA + q2 wA wB
= 0 + 2/6 + 3/12
= 7/12

In the same way for q2


Z
q2 = q4

Z
w4 w2 + q1

Z
w2 w1 + q2

w2 w4

replacing the values of the integral and q4 = 1, q2 = 2, and q1 = 1


Z
Z
Z
q2 = q4 wA wB + q1 wA wB + q2 (2) wA wA
= 7/6

Finally by symmetry q1 = q4 . Summarising

7/12
7/6

q =
0 .
7/12
final solution
Au = ` + q

5.7 Fourier boundary conditions

115

Solving the system we obtain wu for the different versions of wq

2.0
1.94

3.5
3.40

uversion b =
uversion a =

1.33
1.30

2.0
1.94
where version 1 is the most accurate due to the exact computation of the von
Newman term.

5.7

Fourier boundary conditions

Fourier boundary conditions


Find

revisar
esta
seccion

u(x) : Rn R
R
that satisfies
2 u = f (x),

(5.25)

with boundary conditions:


Dirichlet:



u(x)

= g(x)


for all x 1

u
=q
n 2

von Newman:

hu(x), ni|2 =

Fourier:



u
=
u +
n
3

(5.26)
(5.27)
(5.28)

In a heat transfer context Dirichlet boundary conditions represent the known temperatures at a given surface. von Newman boundary conditions represent the heat flux and
Fourier boundary conditions represent a convection surfaces and it is usually written in
terms of the temperature of the surrounding fluid u and the convection coefficient h, as
k

u
= h(u u )
n

(5.29)

The equivalence with (5.28) can be easily demonstrated by doing h = / and u = /


The weak form of the problem (5.2) can be written as
Z
Z
Z
Z
uv d
v hu, ni d
v hu, ni d
v hu, ni d

1
2
3
Z
f v d
=

116

Two-dimensional Elliptic Problems

where the integral of the second term of (5.2) was divided into three sections = 1 +
2 + 3 corresponding to each boundary condition. Additionally, we select u H 1 , and
in order to cancel the integral over 1 we select v H01 . then the weak form becomes
Z
Z
Z
Z
uv d
v hu, ni d
v hu, ni d =
f v d

Where the integral over 3 can be replaced by the Fourier boundary condition (5.29)
in the following way,
Z
Z
Z
u
v d = v h(u u )d3
hvu, n
i d =
(5.30)
n

3
3
3

Expressing u and v in terms of a base of a certain finite space of functions, we have


X
X
u=
ui wi (x) +
gi wi (x)
iI

v=

i1

(5.31)

vj wj (x)

and replacing (5.31) into (5.30), we have,


Z

v h(u u )d3 =
3

vj

Z
ui

h wj (x) wi (x)d3 +

Z
vj

h u wj (x)d3 (5.32)
3

defining the matrix Cji as


Z
Cji =

h wj (x) wi (x)d3
3

and the vector mj as


Z
mj =

h u wj (x) d3
3

then (5.32) can be written as


Z
d3 = hCu, vi hm, vi
v hu, ni

(5.33)

The discrete form of the Laplace equation with fourier boundary conditions can be
written as
hA u, vi hC u, vi + hm, vi = 0
(5.34)

5.8 Exercises

117

The complete solution including all the boundary conditions is


A u, v + (Ag)D , v hp, vi hC u, vi + hm, vi = 0

(5.35)

and because it should be valid for any arbitrary v


(A C)D u = ` d + p m

(5.36)

where (A C)D R(1 )(1 ) and `, d, p, and m R(1 )

5.8

Exercises

1. Let 2 T = 0 over the domain shown in the figure and with Dirichlet boundary conditions as is shown. Find:
T=5 10

11 T=2
3

T=2
8

9 T=1

T=5
1

2
5

T=5

T=1
4

6
T=2

a The Dirichlet matrix AD .


b The vector resulting from the boundary conditions.
c The Solution of the system.
2. About the Galerkin method. Which of the following sentences is true?
a The Galerkin method is used only for homogeneous problems
b In the Galerkin method the weighted function is the first function.
c It is the weighted residual method where the weight function is equals to the interpolation base.
d The Galerkin method uses one degree polynomials

d
3. Write the variational formulation for the equation dx
a(x) du
dx = f (x)
4. The von Newman boundary condition:
a is equivalent to the boundary temperature for the heat conduction problem,
b is only enunciate for the Laplace equation.
c is equal to hgrad(u), ni, where n is a vector normal to the boundary.
d is equal to hgrad(u), dli, where dl is a differential vector in the direction of the
boundary

118

Two-dimensional Elliptic Problems

5. For questions 5, 6 and 7 consider the following heat transfer problem for which the
temperature u is given by 2 u = 0 over the following domain
16

15

14

13
10

11

and boundary conditions: (a) du


dx = xy boundary conditions for nodes 10, 13, 15, 16
and (b) u(x, y) = 0 for every other node.
Each segment has a horizontal or vertical line size of one.
6. Find the heat flow contribution to the q vector at positions 15 and 13.
7. Make a drawing showing the non zero positions of the matrix.
8. Find the position A(15,15) from the matrix.
9. Find the position A(9, 10) from the matrix.
10. Solve the example 5.1.a using second order elements. How is the error?
11. The biharmonic equation is given by
4 u = f in , u = 0 y

u
= 0 in
n

The solution u(x), with x R2 , can be seen as the deformation of a membrane or plate
clamped at the sides. Assuming small deformations, find the weak form of the problem.

Chapter 6

Afin Transformations
6.1

Change of variable in an integral


x2

x2
g(x)

(x1 ,x2)

x1

(x1 ,x2)

x1

Figure 6.1: Change of variable

Let f be a scalar function defined over a domain and let x . The integral of
f
over a region in the x1 , y2 coordinate system can be transformed
(x)dx extended
R
extended over a domain
in the x
into an integral F (
x)d
1 x
2 plane. Next, we are

going to study the relationship between the regions and and the integrals f (x) and
are related by function g as g(
F (
x). Variables x and x
x) = x. Notice that g(x) is a
vector application, so it is composed of scalar functions gi as

 

g1 (x1 , x2 )
x1
g(
x) =
=
.
(6.1)
g2 (x1 , x2 )
x2

Geometrically, it can be considered that the two equations 6.1 define an application
that takes a point (x1 , x2 ) in the plane x1 x2 and corresponds to a point (x, y) in the xy
points in the x1 x2 plane is mapped into the set in the x1 x2 plane
plane. The set of
as is represented in figure 6.1. Sometimes the system of equations 6.1 can be solved for
the x
s variables in function of the xs variables. When this is possible, we can express the
result in the form
= g 1 (x).
x
119

120

Afin Transformations

These equations define an application from the x1 x2 plane to the x1 x2 plane and are the
inverse application of g(
x), as defined in (6.1), because they transform the points from
Among these applications, those called one-to-one applications are of special
into .
into different points in . In other
importance. They transform different points from

words, two different points in are not mapped into the same point in by a one-to-one
application.
We will consider applications for which the functions g1 , g2 are continuous and have
continuous partial derivatives gi /xj for i, j = 1, 2. For functions gi1 we make similar
assumptions. These considerations are not very rigorous given that this is valid for most
applications resulting from practical problems.
The formula for transformation of double integrals can be written in the following way:


Z
Z
gi


d.
f (x)d =
f (g(
x)) det
(6.2)

Where the factor det |gi /xj | that appears in the integral of the right-hand side of the
equation is called the Jacobian of the transformation



g1 g1
gi x
2
= 1 x
(6.3)
det
.
x
j g2 g2


x
1 x
2

6.2

Transformation of a Standard Element


C

x2

x2
C=(0,1)

g(x)

A
k
A=(0,0)

B=(1,0)

x1

x1

Figure 6.2: Transformation to a standard element

Let k represent a triangular finite element in the x1 x2 plane and let k be a finite
B,
and C as defined in figure 6.2. Then there is
element in the x
1 x
2 plane with vertex A,
a mapping function g(x) that transforms any point in the element k into an element k.
Such transformations are unique to each triangle k and can be calculated in the following
way:
g(
x) : R2 R2
k k

6.2 Transformation of a Standard Element

121

where
=A
g(A)
=B
g(B)

(6.5)

= C.
g(C)

(6.6)

(6.4)

Equations 6.4, 6.5, and 6.6 conform to a set of six equations that give rise to the
following afin transformation:

     
m11 m12 x
1
b
x
g(
x) =
+ 1 = 1 .
(6.7)
m21 m22 x
2
b2
x2
The matrix M plays the role of scaling and rotating while vector b is responsible for the
translation to the origin. It can be shown that a transformation defined in this way is
one-to-one and preserves the proportion and the relative position among the transformed
points. For example, a point p located half way between A and B will be transformed

which is located half way between the points A and B.


into a point p

6.2.1

Computation of transformation function

i.

m11
m21

 
0
=A
A =
and
g(A)
0
     
   
m12 0
b
A1
b
A1
+ 1 =
1 =
m22 0
b2
A2
b2
A2

(6.8)

ii.

m11
m21

 
1
=B

and
g(B)
B=
0
     

 

m12 1
A1
B1
m11
B 1 A1
+
=

=
m22 0
A2
B2
m21
B 2 A2


m11
m21

 
0

=C
C=
and
g(C)
1
     

 

m12 0
A1
C1
m12
C 1 A1
+
=

=
m22 1
A2
C2
m22
C 2 A2

(6.9)

iii.

(6.10)

Then from equations 6.8, 6.9,and 6.10 the application g(


x) can be written in matrix
notation as


B1 A1 C1 A1
g(
x) =
B2 A2 C2 A2

   
x
1
A
+ 1 .
x
2
A2

(6.11)

122

6.2.2

Afin Transformations

Base functions for the standard element

Figure 6.3 shows the equivalence between the base function wA defined over an element
According to the
k and its equivalent version w
A defined over the standard element k.
definition of the Lagrange base function, it must true that
wi (
xj ) = ij

for

i = A, B, C

B,
C}.

j = {A,
x

and

If g(
x) is the coordinate transformation function as defined in the last numeral, then for
all x k with x = g(
x) we have
w
i (
x) = wi (x).

x2
1

wA

(6.12)

wA

x2
C

g( x1 , x2 )
C

B
A

A
B

x1

x1

Figure 6.3: Representation of base functions for a standard element

6.2.3

Computation of integrals over a finite element

We can use the results from the change of variable equation, equation 6.2, to calculate
integrals over a finite element domain by computing the integrals over a standard element
in the following way.
If
X
f (x)
f (xj )wj (x)


Z
Z
gi

dk
f (x)dk = f (g(
(6.13)
x)) det
xj

k
k
where the Jacobian |gi /xj | can be computed from equation 6.11 as



gi

= m11 m12
xj
m21 m22

(6.14)

and integrals over the standard finite element k can be defined as r = x1 and s = x2
Z 1 Z r+1
f (r, s) ds dr.
(6.15)
0

6.2 Transformation of a Standard Element

123
x2

x2

8
2x 4

2x +20

g(x)

C=(0,1)

k
A=(0,0)

x 1+1

B=(1,0)

x1

x1

Figure 6.4: An element and its transformation to a standard element. See example 6.1

Example 6.1
Let us consider a triangle with coordinates A = (6, 6), B = (8, 6) and C = (7, 8) as shown
in figure 6.4. If wA , wB , and wC are the Lagrange first order polynomials defined over the
triangle in the usual way, then they can be written in terms of the x1 and x2 coordinates
as follows:
wA = 1/2x1 1/4x2 + 11/2

(6.16)

wB = 1/2x1 1/4x2 3/2


wC = 3 + 1/2x2 .
For example if we want to compute the integral of wA over the element, then we divide
the integral in two parts: one from x1 = 6 until x1 = 7, and the second from x1 = 7 until
x1 = 8 in the following way.
Z 7 Z 2x1 4
Z 8 Z 2x1 +20
Z
wA dk =
wA dx2 dx1 +
wA dx2 dx1 .
k

By replacing the value of wA from equation 6.16 into the last equation, this integral can
be calculated by hand or by using a computer algebra system to obtain
Z
2
wA dk = .
3
k
To compute the value of the integral by using the method of transformation of variables
exposed in this section, we need first to compute the values of the transformed function
g(
x) defined in equation 6.11 in order to obtain the Jacobian and then compute the integral
over the standard element. Computation of the Jacobian for the present example leads us
to


 
 

gi
= B1 A1 C1 A1 = 8 6 7 6 = 2 1

xj
B2 A2 C2 A2
66 86
0 2
and


2 1
det
= 4.
0 2

124

Afin Transformations

Then we compute the standard integral of w


A over the k domain
Z 1 Z x1 +1
Z
w
A (
x1 , x
2 ) d
x2 dx1 .
wA dk =

For a standard finite element k the base functions, w


j , can be calculated from the definition
of Lagrange polynomials. The following values are obtained:
w
A = x1 x2 + 1

(6.17)

w
B = x1

(6.18)

w
C = x2 .

(6.19)

Replacing the value for w


A enables us to compute the integral
Z
Z 1 Z x1 +1
1

wA dk =
(x1 x2 + 1) d
x2 dx1 = .
6

k
0
0
Finally we replace this value into the change of coordinated equation (6.13) to obtain:


Z
Z
gi
dk = 1 4 = 2

(6.20)
wA (x)dk = w
A (g(
x)) det
xj
6
3

k
k
which is the same value obtained by the direct computation of the integral.
Example 6.2
Find the discrete solution to the following equation
2 u = 10
Defined over a domain with the shape of a regular hexagon inscribed in a circle of radio
equal to one, as shown in the figure 6.5, and with the following boundary conditions:
Dirichlet:
Fourier:

u(x7 ) = 100
u
= h(u u ),
n

(6.21)
u = 20,

h = 10

(6.22)

where n
is the external normal to the boundary segment.
6
4

5
7

3
1

Figure 6.5: Discrete domain. The Dirichlet boundary condition is defined at the central point

6.2 Transformation of a Standard Element

125

Solution.
To find the discrete solution of the Problem we need to solve the following linear system
of equations.
[A C]u = ` d + p m
where A is the stiffness matrix, C is the resulting matrix from the Fourier boundary
conditions, ` is the resulting vector from the source term f (x), d is the vector with the
Dirichlet boundary conditions, p the vector with the von Newman boundary conditions,
and m is the vector with the Fourier boundary conditions.
Matrix A
As all the elements have the same shape it is only necessary to calculate the integrals for
one element. Consider the element with nodes 2, 7, 4 and locate the origen of coordinates
in node 2. To generalise, lets denote node 2 by A, node 7 by B and node 4 by C. The
coordinates of nodes A, B and C is
A = (0, 0),

B = ( 3/2, 1/2),

4=C
7=B

C = (0, 1).

2=A

The area of the element is

1 3/2
3
bh
=
=
.
Area =
2
2
4
As the element is an equilateral triangle, the local stiffness matrix can be calculated easily.
The first term of the diagonal with index A,A
Z
kBCk2
1
hwA , wA i =
=
4 Area
3
e
and is the same value for all the diagonal positions
Z
Z
Z
1
hwB , wB i = hwC , wC i = hwA , wA i =
3
e
e
e
R
The term wA wB can be calculates as
Z
hBC, CAi
1
hwA , wB i =
=
4 Area
2 3
e
However, it is the same for all the non diagonal terms
Z
Z
Z
1
hwA , wC i = hwB , wC i = hwA , wB i ==
2 3
e
e
e
rearranging terms in matrix form
Alocal

1
1/2 1/2
1 1
=
/2
1
1/2
3
1
1
/2 /2
1

126

Afin Transformations

With this result the global matrix can be assembled. For example for a node in the
perimeter (i = 1..6), the position Aii in the matrix will be equal to
Z
2
Aii = 2 hwA , wA i =
3
e
because it only has two element sharing that node. Similarly the position 7 in the diagonal
will be
Z
2
A77 = 6 hwA , wA i = .
3
e
The nodes at sides of the hexagon will have the same value
Z
1
a1,2 = a2,4 = a4,6 = a6,5 = a13 = a1,5 = hwA , wB i = .
2 3
e
Finally the central node is connected to each of the elements in the perimeter by two
elements,
Z
1
a7,1 = a7,2 = a7,3 = a7,4 = a7,5 = a7,6 = 2 hwA , wB i = .
3
e
organising this terms in matrix form and taking into account symmetry we have

2
1/2 1/2
0
0
0
1
1/
2
0
1/2
0
0
1

1
0
2
0
1/2
0
1
. /2


A = 1 3 0
1/2
0
2
0
1/2 1 .

0
0
1/2
0
2
1/2 1

0
0
0
1/2 1/2
2
1
1
1
1
1
1
1
6
The Dirichlet stiffness matrix is easily computed from this by removing the last row and
column.
Vector `
Vector ` is defined by
Z
`j =

f (x)wj d =

Z
fr

wr wj d

where the domain can be decomposed as the sum over the finite elements. However the
resulting integrals over the elements needs to be calculated using and afin transformation.
Z
XX Z
XX
`j =
fr wr wj dk =
fr det |Jx(x) | w
r w
j dk
k

The integrals over the standard elements where

Z
1/12

w
r w
j dk = 1/24

k
1/24

previously calculated as

1/24 1/24
1/12 1/24
1/24 1/12

6.2 Transformation of a Standard Element

127

As f (x) is constant over the domain, and all the elements have the same shape, and
therefore the same Jacobian, the expression for `j transforms into
`j = 10 det |Jx(x) |

Ij

where Ij corresponds to the sum of a row (or a column) of the matrix


Ij =

XZ
r

w
r w
j dk = 1/12 + 1/24 + 1/124 = 1/6

The Jacobian can also be easily computed from the coordinates of the element as det |Jx(x) | =

3 . Replacing this values we have
2

`j = 10

3X
(1/6)
2
k

The sum will be accomplished over 2 elements for index one to six and over 6 elements for
index seven.

3
`j = 10
2 (1/6)

j = 1..6
2
3
`j = 10
6 (1/6)
if
j=7
2
Finally we remove the last row to obtain the equivalent to (`D )

` = 10
6

1
1
1
1
1
1

Vector d
We start by defining a vector
gj0

(
0
=
100

if

j = 1..6
j=7

Then we proceed as usual


d = (Ag 0 )D

128

Afin Transformations

the result is the multiplication of the last row of the A matrix multiplied by 100 and then
suppress the Dirichlet boundary index that correspond to the last row:

1
1

100
1

d =

3 1

1
1
Fourier Boundary Conditions, Matrix C The positions in the C matrix are defined
by boundary integrals of the following form
Z
Cij =
h wi wj d3
3

To evaluate the integral any segment of the boundary can be taken as all have the same
length (equal to one). The integral over the boundary is an line integral for the case of
2D problems. See figure 6.6.
b
wa (x)

wb (x)

a
a
0

b
1

Figure 6.6: One dimensional projection of the shape function on a boundary segment

For this one dimensional case the shape functions became


wa (x) = x + 1,

wb (x) = x

The local integrals for this boundary segments are calculated analytically by
1
Z 1
(1 x)3
1
2
C localaa =
(1 x) dx =
= ,

2
3
0
0
1
Z 1
3
1
x
= ,
C localbb =
x2 dx =

3 0 3
0
 2
1
Z 1
1
x
x3
2
C localab =
x x dx =
+
= .
3
3
6
0
0
In matrix form
Clocal

1
=
3

1 1/2
1/2 1

6.2 Transformation of a Standard Element

129

The contributions to the global matrix have the following form


Z
XZ
wi wj dt
Cij = h
wi wj = h
3

where t represent a boundary segment. In our case one of the sides of the hexagon. As
the integral is over the boundary, the contribution to the matrix is only made at the i,
and j index on the boundary So for the index 1 to six we have terms in the diagonal
Z
2h
Cii = 2h wi wi = 2h C1,1 =
,
j = 1..6
3
t
besides each boundary node connected with another share only one element. Explicitally,
we have
Z
h
C1,2 = C2,4 = C4,6 = C6,5 = C13 = C1,5 = h wi wj =
6
t
reordering this into matrix form (and omitting the Dirichlet index)

C=
3

2 1/2 1/2 0
0
0
1/
0 1/2 0
0

2 2

1/
1/
0
2
0
0

2
2
.
0 1/2 0
2
0 1/2

0
0 1/2 0
2 1/2
0
0
0 1/2 1/2 2

Fourier Boundary Conditions, Vector m


The positions of the m vector are defined as
Z
XZ
mj =
hu0 wj d3 =
hu0 wj dt
3

where h and u0 are constants and


Z

t wj dt

for a standard segment is easily computed as

Z
(1 x) dx = 1/2

w1 dt =
t

and

Z
w2 dt =

x dx = 1/2
0

adding all the contributions and rearrange them into vector form we have

m = h u

1
1
1
1
1
1

= 200

1
1
1
1
1
1

130

Afin Transformations

Finally, the system to solve is:

[A C]u =

3 100
10
+ 200

6
3

1
1
1
1
1
1

Whose solution is

6.3

u1
u2
u3
u4
u5
u6

15.404
15.404
15.404
15.404
15.404
15.404

Exercises

1. For this problem consider a squared domain () discretise as shown in the figure below.
If f (x) = 2x2 + 5y 2 is an analytical function defined over the domain, we wish to
compute is integral
ZZ
f (x)dx

using Lagrange interpolation of first and second order.Compute the function at the
element nodes and use afin transformation in both cases. Finally compare the results
with the analytical solution.
y 2

Notice that in the case of second order Lagrange elements six points are necessary to
compute the polynomials. In this case use the middle points between vertex for the
additional nodes

6.3 Exercises

131
3
1
y
0.5

6
k

1
0

5
k

0.5

Chapter 7

Parabolic Problems
In the previous chapters the Finite Elements Method was used to solve elliptic problems.
In particular the heat conduction problem with temperature and heat flux boundary conditions was studied. We will discuss in this chapter heat conduction problems that involves
the time derivative. If u = u(x, t) represents the temperature of a point x at a time t,
then the initial boundary value problem of the heat conduction can be stated as
u
= 2 u + f
t

in

(7.1)


u
= m q(x)
n 2

(7.2)

with boundary conditions:


u|1 = g(x),
and initial conditions
u(x, 0) = u0 (x)

(7.3)

1
u1 = g

2
q

Figure 7.1: Domain definition and boundary conditions for a heat transfer problem, Notice that
1 2 =

To derive the finite element approximation studied in Chapter 5, the corresponding


weak form must be obtained. Multiplying both sides of equation 7.1 by an arbitrary
133

134

Parabolic Problems

virtual function v (virtual temperature) and applying the Greens equation, we have
Z
Z
Z
u
2
v= uv+ f v
t

Z
Z
Z
Z
u
u
uv +
v =
v+ f v
t
n

The right hand side of this equation can be treated as usual. That is, by expressing u
and v in terms of a base of approximation we obtain the abstract form of the right hand
side of the equation and then obtain the discrete form
Z
u
v = a(u, v) + `(v)
t
Z
u
v =hAu, vi + h`, vi
(7.4)
t
Now the left hand side of the equation contains the time derivate of u that can be expressed
in terms of the same base as
 X

X
u X ui wi
ui
=
=
wi =
uw
i
(7.5)
t
t
t
i

where

ui
= u.
Then the left hand side can be computed as
t
Z
Z X
Z
X
X
u
v=
ui wi
v j wj =
ui vj
wi wj
t

(7.6)

ij

which can be expressed in matrix form as


Z
u
v = hM u,
vi
t

(7.7)

Incorporating this result into (7.4) gives,


hM u,
vi = hAu, vi + h`, vi

(7.8)

Because this results must be true for any function v in admissible space of approximation,
we have,
M u + Au = +`
(7.9)

7.1

Finite Difference Approximation

The time derivate of u can be approximated by its forward finite difference approximation
in the following way, If un is the temperature at time nt, un = u(x, nt), then
u =

un un1
t

(7.10)

7.2 Method for time integration

135

replacing (7.1) into (7.9)



M

un un1
t


+ Au + g = `

M un M un1
+ Au = ` g
t
M un M un1 + t(Aun ) = t(` g)
M un + t(Aun ) = t(` g) + M un1
then the implicit form of the transient heat conduction problem
(M + tA)un = t(` g) + M un1

7.2

(7.11)

Method for time integration

A more general way of approximating the time derivate is called the method for time integration, to see how this method works, let us consider a simplified form a time dependent
problem
dx
= f (x), x = x0 at t = 0.
dt
For a given increment t, with xn = x(nt), this equation can be approximated by
xn xn1
= f (xn )
t
which is called the forward difference scheme or if we choose f (xn1 ) instead of f (xn ) it
will lead as to the backward difference scheme
xn xn1
= f (xn1 )
t
Selection of each scheme will depend upon the problem, as each one seems to be equally
approximated. A weighted averaged approximation of the right hand side of the backward
and forward schemes take us to a general form
xn xn1
= f (xn ) + (1 )f (xn1 )
t
with 0 1. Notice that when = 0 we have the backward difference approximation
and when = 1 we obtain the forward difference approximation.
M u + Au + g = `
Applying the method approximation we have
 n

u un1
M
+ Aun + (1 )Aun1 + g = `
t

(7.12)

(7.13)

136

Parabolic Problems

expanding the first term and grouping the terms with un and the terms with un+1 ,
un
un1
M
+ Aun + (1 )Aun1 + g = `,
t
t




M
M
n
+ A u +
+ (1 )A un1 + g = `.
t
t
M

Multiplying by t
(M + t A)un + (M + t(1 )A) un1 + g t = `t,
which led as to the final system
(M + t A)un = (M t(1 )A) un1 g t + `t

(7.14)

Example

As an example consider the one-dimensional (single degree of freedom) case of this equation, in which $M$ and $A$ reduce to scalars. Solving for $u^n$ we have
\[
u^n = \frac{(M - \Delta t (1-\theta) A)\, u^{n-1} - g\,\Delta t + \ell\,\Delta t}{M + \theta\,\Delta t\, A}.
\]
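A time stepper for the final system (7.14) follows the same pattern as the implicit sketch above; again $M$, $A$, $\ell$ and $g$ are assumed to be assembled beforehand and constant in time:

\begin{verbatim}
import numpy as np

def theta_method_solve(M, A, u0, ell, g, dt, theta, steps):
    """March (M + theta*dt*A) u_n =
    (M - dt*(1-theta)*A) u_prev - g*dt + ell*dt, as in (7.14)."""
    lhs = M + theta * dt * A          # constant: factorise once in practice
    rhs_mat = M - dt * (1.0 - theta) * A
    u = u0.copy()
    history = [u0.copy()]
    for _ in range(steps):
        u = np.linalg.solve(lhs, rhs_mat @ u + dt * (ell - g))
        history.append(u.copy())
    return np.array(history)
\end{verbatim}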

Appendix A

Barycentric Coordinates

Figure A.1: Barycentric triangle coordinates: a triangle with vertices $A(x_1, y_1)$, $B(x_2, y_2)$, $C(x_3, y_3)$ and a point $P$.

Given a triangle $ABC$ and some point $P$, we define the barycentric coordinates $m_1$, $m_2$, and $m_3$ as the masses associated with each triangle vertex such that the mass centre is located at $P$, that is,
\[
M P = m_1 A + m_2 B + m_3 C,
\]
where $M$ is the sum of all the masses, $M = \sum_i m_i$, and $A = (x_1, y_1)$, $B = (x_2, y_2)$, $C = (x_3, y_3)$; see Figure A.1. Assuming that $M = 1$, then
\[
m_2 = (x_1 y - x_1 y_3 - x\, y_1 + x\, y_3 + x_3 y_1 - x_3 y)/b \tag{A.1}
\]
\[
m_3 = (x_1 y_2 - x_1 y - x_2 y_1 + x_2 y + x\, y_1 - x\, y_2)/b \tag{A.2}
\]
\[
m_1 = (x\, y_2 - x\, y_3 - x_2 y + x_2 y_3 + x_3 y - x_3 y_2)/b \tag{A.3}
\]
with $b = x_1 y_2 - x_1 y_3 - x_2 y_1 + x_2 y_3 + x_3 y_1 - x_3 y_2$, where $P = (x, y)$.
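Equations (A.1)-(A.3) transcribe directly into code. A small sketch (the function name and argument layout are illustrative):

\begin{verbatim}
def barycentric(P, A, B, C):
    """Barycentric coordinates (m1, m2, m3) of point P = (x, y) with
    respect to triangle ABC, normalised so that m1 + m2 + m3 = 1."""
    x, y = P
    x1, y1 = A
    x2, y2 = B
    x3, y3 = C
    b = x1*y2 - x1*y3 - x2*y1 + x2*y3 + x3*y1 - x3*y2
    m1 = (x*y2 - x*y3 - x2*y + x2*y3 + x3*y - x3*y2) / b
    m2 = (x1*y - x1*y3 - x*y1 + x*y3 + x3*y1 - x3*y) / b
    m3 = (x1*y2 - x1*y - x2*y1 + x2*y + x*y1 - x*y2) / b
    return m1, m2, m3

# The centroid of a triangle has all three coordinates equal to 1/3:
print(barycentric((1/3, 1/3), (0, 0), (1, 0), (0, 1)))
\end{verbatim}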
If all the masses are equal, then the mass centre coincides with the geometric centre of the triangle. If $m_3 = 0$ and $m_1, m_2 > 0$, then the barycentre lies on the segment of the line that joins the other two points. If, however, one of the masses is negative, then the barycentre lies on the line through the two points but outside the segment that joins them; see Figure A.2.
Figure A.2: Location of the barycentre of two points: inside the segment when $m_1 > 0$ and $m_2 > 0$, outside it when $m_1 < 0$ and $m_2 > 0$.

This can be easily proved by setting the origin of the coordinate system at point 1. The barycentre $\bar{x}$ then satisfies
\[
m_1 x_1 + m_2 x_2 = M \bar{x}.
\]
As $x_1 = 0$, solving for $\bar{x}$ gives $\bar{x} = (m_2/M)\, x_2$. Because $m_1 < 0$ (with $m_2 > 0$ and $M > 0$), we have $m_2/M > 1$ and therefore $\bar{x} > x_2$: the barycentre lies outside the segment.
This argument can be generalised to conclude that if a point $P$ lies outside a triangle $ABC$, then at least one of its barycentric coordinates is negative. The proof is left to the enjoyment of the reader.
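The claim can also be checked numerically with the barycentric sketch given above: a point outside the triangle produces a negative coordinate.

\begin{verbatim}
# P = (2, 2) lies outside the triangle (0,0)-(1,0)-(0,1):
print(barycentric((2.0, 2.0), (0, 0), (1, 0), (0, 1)))   # (-3.0, 2.0, 2.0)
\end{verbatim}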

Appendix B

Gradient

B.1 Computation of the gradient of a function

Let $\sigma = u'$, where $u'$ denotes $\partial u / \partial x_i$ with $i = 1..3$. The projection $\sigma$ is defined by the weak statement
\[
\int_\Omega (\sigma - u')\, v = 0.
\]

Defining
\[
\sigma = \sigma_i w_i, \qquad u = u_k w_k, \qquad v = v_j w_j,
\]
and replacing this into the previous expression we have
\[
\int_\Omega (\sigma_i w_i - u_k w_k')(v_j w_j) = 0
\]
\[
\int_\Omega \sigma_i w_i\, v_j w_j - \int_\Omega u_k w_k'\, v_j w_j = 0
\]
\[
\sigma_i v_j \int_\Omega w_i w_j - u_k v_j \int_\Omega w_k' w_j = 0.
\]
Defining
\[
\int_\Omega w_i w_j = H_{ij} \tag{B.1}
\]
\[
\int_\Omega w_k' w_j = B_{kj} \tag{B.2}
\]
this becomes
\[
\sigma_i v_j H_{ij} - u_k v_j B_{kj} = 0.
\]
As $H_{ij}$ is symmetric,
\[
v_j (H_{ji}\sigma_i) - v_j (B_{kj} u_k) = 0,
\]
and since this must hold for all $v_j$,
\[
[H]\,\sigma = [B]^T u = L. \tag{B.3}
\]
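As a one-dimensional illustration, the following sketch assembles $H$ (the usual mass matrix) and $B$ for linear hat functions on an arbitrary mesh and solves (B.3) for the nodal values of $u'$ (the assembly loop is illustrative, not from the notes):

\begin{verbatim}
import numpy as np

def gradient_projection_1d(x, u):
    """Project u' onto the hat-function basis on the 1D mesh x:
    assemble H[i,j] = integral(w_i w_j) and B[k,j] = integral(w_k' w_j),
    then solve H sigma = B^T u, as in (B.3)."""
    n = len(x)
    H = np.zeros((n, n))
    B = np.zeros((n, n))
    for e in range(n - 1):                      # element [x_e, x_{e+1}]
        h = x[e + 1] - x[e]
        idx = [e, e + 1]
        H_loc = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
        B_loc = 0.5 * np.array([[-1.0, -1.0], [1.0, 1.0]])
        for a in range(2):
            for c in range(2):
                H[idx[a], idx[c]] += H_loc[a, c]
                B[idx[a], idx[c]] += B_loc[a, c]
    return np.linalg.solve(H, B.T @ u)

# Recover the constant gradient of u(x) = 2x on a non-uniform mesh:
x = np.array([0.0, 0.3, 0.7, 1.0])
print(gradient_projection_1d(x, 2.0 * x))       # approximately [2, 2, 2]
\end{verbatim}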

