Sei sulla pagina 1di 243

# INTRODUCTION TO MATLAB FOR

ENGINEERING STUDENTS
David Houcque
Northwestern University
(version 1.2, August 2005)

Contents
1 Tutorial lessons 1

1.1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.2

Basic features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.3

## A minimum MATLAB session . . . . . . . . . . . . . . . . . . . . . . . . . .

1.3.1

Starting MATLAB . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.3.2

## Using MATLAB as a calculator . . . . . . . . . . . . . . . . . . . . .

1.3.3

Quitting MATLAB . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Getting started . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.4.1

## Creating MATLAB variables . . . . . . . . . . . . . . . . . . . . . . .

1.4.2

Overwriting variable . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.4.3

Error messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.4.4

Making corrections . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.4.5

1.4.6

1.4.7

1.4.8

1.4.9

10

## 1.4.11 Getting help . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

10

Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

11

1.4

1.5

2 Tutorial lessons 2
2.1

12

Mathematical functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

12

2.1.1

13

Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
i

2.2

Basic plotting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

14

2.2.1

overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

14

2.2.2

14

2.2.3

15

2.2.4

16

2.2.5

## Specifying line styles and colors . . . . . . . . . . . . . . . . . . . . .

17

2.3

Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

18

2.4

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

19

2.5

Matrix generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

19

2.5.1

Entering a vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

19

2.5.2

Entering a matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

20

2.5.3

Matrix indexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

21

2.5.4

Colon operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

22

2.5.5

Linear spacing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

22

2.5.6

## Colon operator in a matrix . . . . . . . . . . . . . . . . . . . . . . . .

22

2.5.7

Creating a sub-matrix . . . . . . . . . . . . . . . . . . . . . . . . . .

23

2.5.8

## Deleting row or column . . . . . . . . . . . . . . . . . . . . . . . . . .

25

2.5.9

Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

25

2.5.10 Continuation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

26

26

26

27

## 2.5.14 Special matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

28

Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

29

2.6

## 3 Array operations and Linear equations

3.1

3.2

30

Array operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

30

3.1.1

30

3.1.2

30

## Solving linear equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

32

3.2.1

33

Matrix inverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

ii

3.2.2
3.3

Matrix functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

34

Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

34

## 4 Introduction to programming in MATLAB

35

4.1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

35

4.2

M-File Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

35

4.2.1

Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

36

4.2.2

Script side-effects . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

37

M-File functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

38

4.3.1

38

4.3.2

40

4.4

## Input to a script file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

40

4.5

Output commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

41

4.6

Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

42

4.3

## 5 Control flow and operators

43

5.1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

43

5.2

Control flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

43

5.2.1

43

5.2.2

45

5.2.3

45

5.2.4

46

5.2.5

## Other flow structures . . . . . . . . . . . . . . . . . . . . . . . . . . .

46

5.2.6

Operator precedence . . . . . . . . . . . . . . . . . . . . . . . . . . .

47

5.3

## Saving output to a file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

47

5.4

Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

48

6 Debugging M-files

49

6.1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

49

6.2

Debugging process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

49

6.2.1

## Preparing for debugging . . . . . . . . . . . . . . . . . . . . . . . . .

50

6.2.2

Setting breakpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . .

50

iii

6.2.3

## Running with breakpoints . . . . . . . . . . . . . . . . . . . . . . . .

50

6.2.4

Examining values . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

51

6.2.5

## Correcting and ending debugging . . . . . . . . . . . . . . . . . . . .

51

6.2.6

Ending debugging . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

51

6.2.7

Correcting an M-file . . . . . . . . . . . . . . . . . . . . . . . . . . .

51

A Summary of commands

53

58

58

60

## B.3 Further details

60

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

## C Main characteristics of MATLAB

62

C.1 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

62

C.2 Strengths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

62

C.3 Weaknesses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

63

C.4 Competition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

63

iv

List of Tables
1.1

1.2

## Hierarchy of arithmetic operations . . . . . . . . . . . . . . . . . . . . . . . .

2.1

Elementary functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

12

2.2

13

2.3

## Attributes for plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

18

2.4

Elementary matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

27

2.5

Special matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

28

3.1

Array operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

31

3.2

## Summary of matrix and array operations . . . . . . . . . . . . . . . . . . . .

32

3.3

Matrix functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

34

4.1

38

4.2

39

4.3

40

4.4

41

5.1

## Relational and logical operators . . . . . . . . . . . . . . . . . . . . . . . . .

45

5.2

Operator precedence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

47

53

54

54

55

55

56

56

56

57

vi

List of Figures
1.1

2.1

15

2.2

16

2.3

## Typical example of multiple plots . . . . . . . . . . . . . . . . . . . . . . . .

17

vii

Preface
Introduction to MATLAB for Engineering Students is a document for an introductory
R 1
course in MATLAB
and technical computing. It is used for freshmen classes at Northwestern University. This document is not a comprehensive introduction or a reference manual. Instead, it focuses on the specific features of MATLAB that are useful for engineering
classes. The lab sessions are used with one main goal: to allow students to become familiar
with computer software (e.g., MATLAB) to solve application problems. We assume that the
students have no prior experience with MATLAB.
The availability of technical computing environment such as MATLAB is now reshaping
the role and applications of computer laboratory projects to involve students in more intense
problem-solving experience. This availability also provides an opportunity to easily conduct
numerical experiments and to tackle realistic and more complicated problems.
Originally, the manual is divided into computer laboratory sessions (labs). The lab
document is designed to be used by the students while working at the computer. The
emphasis here is learning by doing. This quiz-like session is supposed to be fully completed
in 50 minutes in class.
The seven lab sessions include not only the basic concepts of MATLAB, but also an introduction to scientific computing, in which they will be useful for the upcoming engineering
courses. In addition, engineering students will see MATLAB in their other courses.
The end of this document contains two useful sections: a Glossary which contains the
brief summary of the commands and built-in functions as well as a collection of release notes.
The release notes, which include several new features of the Release 14 with Service Pack
2, well known as R14SP2, can also be found in Appendix. All of the MATLAB commands
have been tested to take advantage with new features of the current version of MATLAB
available here at Northwestern (R14SP2). Although, most of the examples and exercises still
work with previous releases as well.
This manual reflects the ongoing effort of the McCormick School of Engineering and
Applied Science leading by Dean Stephen Carr to institute a significant technical computing
R 2
in the Engineering First
courses taught at Northwestern University.
Finally, the students - Engineering Analysis (EA) Section - deserve my special gratitude. They were very active participants in class.
David Houcque
Evanston, Illinois
August 2005

R
MATLAB
is a registered trademark of MathWorks, Inc.
R
2
Engineering First
is a registered trademark of McCormick
School of Engineering and Applied Science (Northwestern University)
1

viii

Acknowledgements
I would like to thank Dean Stephen Carr for his constant support. I am grateful to a number
(Fall Quarter 2004), in particular Randy Freeman, Jorge Nocedal, and Allen Taflove for
their helpful reviews on some specific parts of the document. I also want to thank Malcomb
MacIver, EA3 Honors instructor (Spring 2005) for helping me to better understand the
animation of system dynamics using MATLAB. I am particularly indebted to the many
students (340 or so) who have used these materials, and have communicated their comments
and suggestions. Finally, I want to thank IT personnel for helping setting up the classes and
other computer related work: Rebecca Swierz, Jesse Becker, Rick Mazec, Alan Wolff, Ken
Kalan, Mike Vilches, and Daniel Lee.

David Houcque has more than 25 years experience in the modeling and simulation of structures and solid continua including 14 years in industry. In industry, he has been working as
R&D engineer in the fields of nuclear engineering, oil rig platform offshore design, oil reservoir engineering, and steel industry. All of these include working in different international
environments: Germany, France, Norway, and United Arab Emirates. Among other things,
he has a combined background experience: scientific computing and engineering expertise.
He earned his academic degrees from Europe and the United States.
Here at Northwestern University, he is working under the supervision of Professor Brian
Moran, a world-renowned expert in fracture mechanics, to investigate the integrity assessment of the aging highway bridges under severe operating conditions and corrosion.

ix

Chapter 1
Tutorial lessons 1
1.1

Introduction

The tutorials are independent of the rest of the document. The primarily objective is to help
you learn quickly the first steps. The emphasis here is learning by doing. Therefore, the
best way to learn is by trying it yourself. Working through the examples will give you a feel
for the way that MATLAB operates. In this introduction we will describe how MATLAB
handles simple numerical expressions and mathematical formulas.
The name MATLAB stands for MATrix LABoratory. MATLAB was written originally
to provide easy access to matrix software developed by the LINPACK (linear system package)
and EISPACK (Eigen system package) projects.
MATLAB  is a high-performance language for technical computing. It integrates
computation, visualization, and programming environment. Furthermore, MATLAB is a
modern programming language environment: it has sophisticated data structures, contains
built-in editing and debugging tools, and supports object-oriented programming. These factors
make MATLAB an excellent tool for teaching and research.
MATLAB has many advantages compared to conventional computer languages (e.g.,
C, FORTRAN) for solving technical problems. MATLAB is an interactive system whose
basic data element is an array that does not require dimensioning. The software package
has been commercially available since 1984 and is now considered as a standard tool at most
universities and industries worldwide.
It has powerful built-in routines that enable a very wide variety of computations. It
also has easy to use graphics commands that make the visualization of results immediately
available. Specific applications are collected in packages referred to as toolbox. There are
toolboxes for signal processing, symbolic computation, control theory, simulation, optimization, and several other fields of applied science and engineering.
In addition to the MATLAB documentation which is mostly available on-line, we would
1

recommend the following books: , , , , , , , and . They are excellent in
their specific applications.

1.2

Basic features

As we mentioned earlier, the following tutorial lessons are designed to get you started
quickly in MATLAB. The lessons are intended to make you familiar with the basics of
MATLAB. We urge you to complete the exercises given at the end of each lesson.

1.3

## A minimum MATLAB session

The goal of this minimum session (also called starting and exiting sessions) is to learn the
first steps:
How to log on
Invoke MATLAB
Do a few simple calculations
How to quit MATLAB

1.3.1

Starting MATLAB

After logging into your account, you can enter MATLAB by double-clicking on the MATLAB
shortcut icon (MATLAB 7.0.4) on your Windows desktop. When you start MATLAB, a
special window called the MATLAB desktop appears. The desktop is a window that contains
other windows. The major tools within or accessible from the desktop are:
The Command Window
The Command History
The Workspace
The Current Directory
The Help Browser
The Start button

## Figure 1.1: The graphical interface to the MATLAB workspace

When MATLAB is started for the first time, the screen looks like the one that shown
in the Figure 1.1. This illustration also shows the default configuration of the MATLAB
desktop. You can customize the arrangement of tools and documents to suit your needs.
Now, we are interested in doing some simple calculations. We will assume that you
have sufficient understanding of your computer under which MATLAB is being run.
You are now faced with the MATLAB desktop on your computer, which contains the prompt
(>>) in the Command Window. Usually, there are 2 types of prompt:
>>
EDU>

## for full version

for educational version

Note: To simplify the notation, we will use this prompt, >>, as a standard prompt sign,
though our MATLAB version is for educational purpose.

1.3.2

## Using MATLAB as a calculator

As an example of a simple interactive calculation, just type the expression you want to
evaluate. Lets start at the very beginning. For example, lets suppose you want to calculate
the expression, 1 + 2 3. You type it at the prompt command (>>) as follows,
>> 1+2*3
ans =
7
You will have noticed that if you do not specify an output variable, MATLAB uses a
default variable ans, short for answer, to store the results of the current calculation. Note
that the variable ans is created (or overwritten, if it is already existed). To avoid this, you
may assign a value to a variable or output argument name. For example,
>> x = 1+2*3
x =
7
will result in x being given the value 1 + 2 3 = 7. This variable name can always
be used to refer to the results of the previous computations. Therefore, computing 4x will
result in
>> 4*x
ans =
28.0000
Before we conclude this minimum session, Table 1.1 gives the partial list of arithmetic
operators.
4

Table 1.1:
Symbol
+

1.3.3

## Basic arithmetic operators

Operation
Example
2+3
Subtraction
23
Multiplication
23
Division
2/3

Quitting MATLAB

To end your MATLAB session, type quit in the Command Window, or select File Exit
MATLAB in the desktop main menu.

1.4

Getting started

After learning the minimum MATLAB session, we will now learn to use some additional
operations.

1.4.1

## Creating MATLAB variables

MATLAB variables are created with an assignment statement. The syntax of variable assignment is
variable name = a value (or an expression)
For example,
>> x = expression
where expression is a combination of numerical values, mathematical operators, variables,
and function calls. On other words, expression can involve:
manual entry
built-in functions
user-defined functions

1.4.2

Overwriting variable

Once a variable has been created, it can be reassigned. In addition, if you do not wish to
see the intermediate results, you can suppress the numerical output by putting a semicolon
(;) at the end of the line. Then the sequence of commands looks like this:
>> t = 5;
>> t = t+1
t
=
6

1.4.3

Error messages

If we enter an expression incorrectly, MATLAB will return an error message. For example,
in the following, we left out the multiplication sign, *, in the following expression
>> x = 10;
>> 5x
??? 5x
|
Error: Unexpected MATLAB expression.

1.4.4

Making corrections

To make corrections, we can, of course retype the expressions. But if the expression is
lengthy, we make more mistakes by typing a second time. A previously typed command
can be recalled with the up-arrow key . When the command is displayed at the command
prompt, it can be modified if needed and executed.

1.4.5

## Controlling the hierarchy of operations or precedence

Lets consider the previous arithmetic operation, but now we will include parentheses. For
example, 1 + 2 3 will become (1 + 2) 3
>> (1+2)*3
ans
=
9
and, from previous example

>> 1+2*3
ans
=
7
By adding parentheses, these two expressions give different results: 9 and 7.
The order in which MATLAB performs arithmetic operations is exactly that taught
in high school algebra courses. Exponentiations are done first, followed by multiplications
and divisions, and finally by additions and subtractions. However, the standard order of
precedence of arithmetic operations can be changed by inserting parentheses. For example,
the result of 1+23 is quite different than the similar expression with parentheses (1+2)3.
The results are 7 and 9 respectively. Parentheses can always be used to overrule priority,
and their use is recommended in some complex expressions to avoid ambiguity.
Therefore, to make the evaluation of expressions unambiguous, MATLAB has established a series of rules. The order in which the arithmetic operations are evaluated is given
in Table 1.2. MATLAB arithmetic operators obey the same precedence rules as those in
Table 1.2: Hierarchy of arithmetic operations
Precedence Mathematical operations
First
The contents of all parentheses are evaluated first, starting
from the innermost parentheses and working outward.
Second
All exponentials are evaluated, working from left to right
Third
All multiplications and divisions are evaluated, working
from left to right
Fourth
All additions and subtractions are evaluated, starting
from left to right
most computer programs. For operators of equal precedence, evaluation is from left to right.
Now, consider another example:
1
4 6
+
2
2+3
5 7
In MATLAB, it becomes
>> 1/(2+3^2)+4/5*6/7
ans =
0.7766
or, if parentheses are missing,
>> 1/2+3^2+4/5*6/7
ans =
10.1857
7

So here what we get: two different results. Therefore, we want to emphasize the importance
of precedence rule in order to avoid ambiguity.

1.4.6

## Controlling the appearance of floating point number

MATLAB by default displays only 4 decimals in the result of the calculations, for example
163.6667, as shown in above examples. However, MATLAB does numerical calculations
in double precision, which is 15 digits. The command format controls how the results of
computations are displayed. Here are some examples of the different formats together with
the resulting outputs.
>> format short
>> x=-163.6667
If we want to see all 15 digits, we use the command format long
>> format long
>> x= -1.636666666666667e+002
To return to the standard format, enter format short, or simply format.
There are several other formats. For more details, see the MATLAB documentation,
or type help format.
Note - Up to now, we have let MATLAB repeat everything that we enter at the
prompt (>>). Sometimes this is not quite useful, in particular when the output is pages en
length. To prevent MATLAB from echoing what we type, simply enter a semicolon (;) at
the end of the command. For example,
>> x=-163.6667;
>> x
x
=
-163.6667

1.4.7

## Managing the workspace

The contents of the workspace persist between the executions of separate commands. Therefore, it is possible for the results of one problem to have an effect on the next one. To avoid
this possibility, it is a good idea to issue a clear command at the start of each new independent calculation.
8

>> clear
The command clear or clear all removes all variables from the workspace. This
frees up system memory. In order to display a list of the variables currently in the memory,
type
>> who
while, whos will give more details which include size, space allocation, and class of the
variables.

1.4.8

## Keeping track of your work session

It is possible to keep track of everything done during a MATLAB session with the diary
command.
>> diary
or give a name to a created file,
>> diary FileName
where FileName could be any arbitrary name you choose.
The function diary is useful if you want to save a complete MATLAB session. They
save all input and output as they appear in the MATLAB window. When you want to stop
the recording, enter diary off. If you want to start recording again, enter diary on. The
file that is created is a simple text file. It can be opened by an editor or a word processing
use the function type to view the diary file or you can edit in a text editor or print. This
command is useful, for example in the process of preparing a homework or lab submission.

1.4.9

## Entering multiple statements per line

It is possible to enter multiple statements per line. Use commas (,) or semicolons (;) to
enter more than one statement at once. Commas (,) allow multiple statements per line
without suppressing output.
>> a=7; b=cos(a), c=cosh(a)
b
=
0.6570
c
=
548.3170
9

1.4.10

Miscellaneous commands

## Here are few additional useful commands:

To clear the Command Window, type clc
To abort a MATLAB computation, type ctrl-c
To continue a line, type . . .

1.4.11

Getting help

To view the online documentation, select MATLAB Help from Help menu or MATLAB Help
directly in the Command Window. The preferred method is to use the Help Browser. The
Help Browser can be started by selecting the ? icon from the desktop toolbar. On the other
hand, information about any command is available by typing
>> help Command
Another way to get help is to use the lookfor command. The lookfor command differs
from the help command. The help command searches for an exact function name match,
while the lookfor command searches the quick summary information in each function for
a match. For example, suppose that we were looking for a function to take the inverse of
a matrix. Since MATLAB does not have a function named inverse, the command help
inverse will produce nothing. On the other hand, the command lookfor inverse will
produce detailed information, which includes the function of interest, inv.
>> lookfor inverse
Note - At this particular time of our study, it is important to emphasize one main point.
Because MATLAB is a huge program; it is impossible to cover all the details of each function
one by one. However, we will give you information how to get help. Here are some examples:
Use on-line help to request info on a specific function
>> help sqrt

In the current version (MATLAB version 7), the doc function opens the on-line version
of the help manual. This is very helpful for more complex commands.
>> doc plot

10

## Use lookfor to find functions by keywords. The general form is

>> lookfor FunctionName

1.5

Exercises

Note: Due to the teaching class during this Fall 2005, the problems are temporarily removed
from this section.

11

Chapter 2
Tutorial lessons 2
2.1

Mathematical functions

MATLAB offers many predefined mathematical functions for technical computing which
contains a large set of mathematical functions.
Typing help elfun and help specfun calls up full lists of elementary and special
functions respectively.
There is a long list of mathematical functions that are built into MATLAB. These
functions are called built-ins. Many standard mathematical functions, such as sin(x), cos(x),
tan(x), ex , ln(x), are evaluated by the functions sin, cos, tan, exp, and log respectively in
MATLAB.
Table 2.1 lists some commonly used functions, where variables x and y can be numbers,
vectors, or matrices.
Table 2.1: Elementary functions
cos(x)
sin(x)
tan(x)
acos(x)
asin(x)
atan(x)
exp(x)
sqrt(x)
log(x)
log10(x)

Cosine
Sine
Tangent
Arc cosine
Arc sine
Arc tangent
Exponential
Square root
Natural logarithm
Common logarithm

abs(x)
sign(x)
max(x)
min(x)
ceil(x)
floor(x)
round(x)
rem(x)
angle(x)
conj(x)

Absolute value
Signum function
Maximum value
Minimum value
Round towards +
Round towards
Round to nearest integer
Remainder after division
Phase angle
Complex conjugate

## In addition to the elementary functions, MATLAB includes a number of predefined

12

constant values. A list of the most common values is given in Table 2.2.
Table 2.2: Predefined constant values
pi
i,j
Inf
NaN

2.1.1

...

## The imaginary unit i, 1

The infinity,
Not a number

Examples

We illustrate here some typical examples which related to the elementary functions previously
defined.

## As a first example, the value of the expression y = ea sin(x) + 10 y, for a = 5, x = 2, and

y = 8 is computed by
>> a = 5; x = 2; y = 8;
>> y = exp(-a)*sin(x)+10*sqrt(y)
y =
28.2904
The subsequent examples are
>> log(142)
ans
=
4.9558
>> log10(142)
ans
=
2.1523
Note the difference between the natural logarithm log(x) and the decimal logarithm (base
10) log10(x).
To calculate sin(/4) and e10 , we enter the following commands in MATLAB,
>> sin(pi/4)
ans
=
0.7071
>> exp(10)
ans
=
2.2026e+004
13

Notes:
Only use built-in functions on the right hand side of an expression. Reassigning the
value to a built-in function can create problems.

There are some exceptions. For example, i and j are pre-assigned to 1. However,
one or both of i or j are often used as loop indices.
To avoid any possible confusion, it is suggested to use instead ii or jj as loop indices.

2.2
2.2.1

Basic plotting
overview

MATLAB has an excellent set of graphic tools. Plotting a given data set or the results
of computation is possible with very few commands. You are highly encouraged to plot
mathematical functions and results of analysis as often as possible. Trying to understand
mathematical equations with graphics is an enjoyable and very efficient way of learning mathematics. Being able to plot mathematical functions and data freely is the most important
step, and this section is written to assist you to do just that.

2.2.2

## Creating simple plots

The basic MATLAB graphing procedure, for example in 2D, is to take a vector of xcoordinates, x = (x1 , . . . , xN ), and a vector of y-coordinates, y = (y1 , . . . , yN ), locate the
points (xi , yi ), with i = 1, 2, . . . , n and then join them by straight lines. You need to prepare
x and y in an identical array form; namely, x and y are both row arrays or column arrays of
the same length.
The MATLAB command to plot a graph is plot(x,y). The vectors x = (1, 2, 3, 4, 5, 6)
and y = (3, 1, 2, 4, 5, 1) produce the picture shown in Figure 2.1.
>> x = [1 2 3 4 5 6];
>> y = [3 -1 2 4 5 1];
>> plot(x,y)
Note: The plot functions has different forms depending on the input arguments. If y is a
vector plot(y)produces a piecewise linear graph of the elements of y versus the index of the
elements of y. If we specify two vectors, as mentioned above, plot(x,y) produces a graph
of y versus x.
For example, to plot the function sin (x) on the interval [0, 2], we first create a vector of
x values ranging from 0 to 2, then compute the sine of these values, and finally plot the
result:
14

## Figure 2.1: Plot for the vectors x and y

>> x = 0:pi/100:2*pi;
>> y = sin(x);
>> plot(x,y)
Notes:
0:pi/100:2*pi yields a vector that
starts at 0,
takes steps (or increments) of /100,
stops when 2 is reached.
If you omit the increment, MATLAB automatically increments by 1.

2.2.3

## Adding titles, axis labels, and annotations

MATLAB enables you to add axis labels and titles. For example, using the graph from the
previous example, add an x- and y-axis labels.
Now label the axes and add a title. The character \pi creates the symbol . An
example of 2D plot is shown in Figure 2.2.

15

1
0.8
0.6
0.4

Sine of x

0.2
0
0.2
0.4
0.6
0.8
1

x = 0:2

## Figure 2.2: Plot of the Sine function

>> xlabel(x = 0:2\pi)
>> ylabel(Sine of x)
>> title(Plot of the Sine function)
The color of a single curve is, by default, blue, but other colors are possible. The desired
color is indicated by a third argument. For example, red is selected by plot(x,y,r). Note
the single quotes, , around r.

2.2.4

## Multiple data sets in one plot

Multiple (x, y) pairs arguments create multiple graphs with a single call to plot. For example,
these statements plot three related functions of x: y1 = 2 cos(x), y2 = cos(x), and y3 =
0.5 cos(x), in the interval 0 x 2.
>>
>>
>>
>>
>>
>>
>>
>>

x = 0:pi/100:2*pi;
y1 = 2*cos(x);
y2 = cos(x);
y3 = 0.5*cos(x);
plot(x,y1,--,x,y2,-,x,y3,:)
xlabel(0 \leq x \leq 2\pi)
ylabel(Cosine functions)
legend(2*cos(x),cos(x),0.5*cos(x))
16

## >> title(Typical example of multiple plots)

>> axis([0 2*pi -3 3])
The result of multiple data sets in one graph plot is shown in Figure 2.3.
Typical example of multiple plots
3
2*cos(x)
cos(x)
0.5*cos(x)

Cosine functions

3
0 x 2

## Figure 2.3: Typical example of multiple plots

By default, MATLAB uses line style and color to distinguish the data sets plotted in
the graph. However, you can change the appearance of these graphic components or add
annotations to the graph to help explain your data for presentation.

2.2.5

## Specifying line styles and colors

It is possible to specify line styles, colors, and markers (e.g., circles, plus signs, . . . ) using
the plot command:
plot(x,y,style_color_marker)
where style_color_marker is a triplet of values from Table 2.3.
To find additional information, type help plot or doc plot.

17

Symbol Color
k
r
b
g
c
m
y

2.3

Black
Red
Blue
Green
Cyan
Magenta
Yellow

## Symbol Line Style

:
.
none

Solid
Dashed
Dotted
Dash-dot
No line

Symbol Marker
+
o

s
d

Plus sign
Circle
Asterisk
Point
Cross
Square
Diamond

Exercises

Note: Due to the teaching class during this Fall Quarter 2005, the problems are temporarily
removed from this section.

18

2.4

Introduction

Matrices are the basic elements of the MATLAB environment. A matrix is a two-dimensional
array consisting of m rows and n columns. Special cases are column vectors (n = 1) and row
vectors (m = 1).
In this section we will illustrate how to apply different operations on matrices. The following
topics are discussed: vectors and matrices in MATLAB, the inverse of a matrix, determinants,
and matrix manipulation.
MATLAB supports two types of operations, known as matrix operations and array operations. Matrix operations will be discussed first.

2.5

Matrix generation

Matrices are fundamental to MATLAB. Therefore, we need to become familiar with matrix
generation and manipulation. Matrices can be generated in several ways.

2.5.1

Entering a vector

A vector is a special case of a matrix. The purpose of this section is to show how to create
vectors and matrices in MATLAB. As discussed earlier, an array of dimension 1 n is called
a row vector, whereas an array of dimension m 1 is called a column vector. The elements
of vectors in MATLAB are enclosed by square brackets and are separated by spaces or by
commas. For example, to enter a row vector, v, type
>> v = [1 4 7 10 13]
v =
1
4
7

10

13

Column vectors are created in a similar way, however, semicolon (;) must separate the
components of a column vector,
>> w = [1;4;7;10;13]
w =
1
4
7
10
13
On the other hand, a row vector is converted to a column vector using the transpose operator.
The transpose operation is denoted by an apostrophe or a single quote ().
19

>> w = v
w =
1
4
7
10
13
Thus, v(1) is the first element of vector v, v(2) its second element, and so forth.
Furthermore, to access blocks of elements, we use MATLABs colon notation (:). For example, to access the first three elements of v, we write,
>> v(1:3)
ans =
1
4

Or, all elements from the third through the last elements,
>> v(3,end)
ans =
7
10

13

where end signifies the last element in the vector. If v is a vector, writing
>> v(:)
produces a column vector, whereas writing
>> v(1:end)
produces a row vector.

2.5.2

Entering a matrix

## A matrix is an array of numbers. To type a matrix into MATLAB you must

begin with a square bracket, [
separate elements in a row with spaces or commas (,)
use a semicolon (;) to separate rows
end the matrix with another square bracket, ].
20

## Here is a typical example. To enter a matrix A, such as,

1 2 3
A= 4 5 6
7 8 9

(2.1)

type,
>> A = [1 2 3; 4 5 6; 7 8 9]
MATLAB then displays the 3 3 matrix as follows,
A

=
1
4
7

2
5
8

3
6
9

Note that the use of semicolons (;) here is different from their use mentioned earlier to
suppress output or to write multiple commands in a single line.
Once we have entered the matrix, it is automatically stored and remembered in the
Workspace. We can refer to it simply as matrix A. We can then view a particular element in
a matrix by specifying its location. We write,
>> A(2,1)
ans =
4
A(2,1) is an element located in the second row and first column. Its value is 4.

2.5.3

Matrix indexing

We select elements in a matrix just as we did for vectors, but now we need two indices.
The element of row i and column j of the matrix A is denoted by A(i,j). Thus, A(i,j)
in MATLAB refers to the element Aij of matrix A. The first index is the row number and
the second index is the column number. For example, A(1,3) is an element of first row and
third column. Here, A(1,3)=3.
Correcting any entry is easy through indexing.
A(3,3)=0. The result is
>> A(3,3) = 0
A
=
1
2
3
4
5
6
7
8
0
21

## Here we substitute A(3,3)=9 by

Single elements of a matrix are accessed as A(i,j), where i 1 and j 1. Zero or negative
subscripts are not supported in MATLAB.

2.5.4

Colon operator

The colon operator will prove very useful and understanding how it works is the key to
efficient and convenient usage of MATLAB. It occurs in several different forms.
Often we must deal with matrices or vectors that are too large to enter one element at a time. For example, suppose we want to enter a vector x consisting of points
(0, 0.1, 0.2, 0.3, , 5). We can use the command
>> x = 0:0.1:5;
The row vector has 51 elements.

2.5.5

Linear spacing

On the other hand, there is a command to generate linearly spaced vectors: linspace. It
is similar to the colon operator (:), but gives direct control over the number of points. For
example,
y = linspace(a,b)
generates a row vector y of 100 points linearly spaced between and including a and b.
y = linspace(a,b,n)
generates a row vector y of n points linearly spaced between and including a and b. This is
useful when we want to divide an interval into a number of subintervals of the same length.
For example,
>> theta = linspace(0,2*pi,101)
divides the interval [0, 2] into 100 equal subintervals, then creating a vector of 101 elements.

2.5.6

## Colon operator in a matrix

The colon operator can also be used to pick out a certain row or column. For example, the
statement A(m:n,k:l specifies rows m to n and column k to l. Subscript expressions refer
to portions of a matrix. For example,
22

>> A(2,:)
ans =
4
5

## is the second row elements of A.

The colon operator can also be used to extract a sub-matrix from a matrix A.
>> A(:,2:3)
ans =
2
3
5
6
8
0
A(:,2:3) is a sub-matrix with the last two columns of A.
A row or a column of a matrix can be deleted by setting it to a null vector, [ ].
>> A(:,2)=[]
ans =
1
3
4
6
7
0

2.5.7

Creating a sub-matrix

To extract a submatrix B consisting of rows 2 and 3 and columns 1 and 2 of the matrix A,
do the following
>> B = A([2 3],[1 2])
B
=
4
5
7
8
To interchange rows 1 and 2 of A, use the vector of row indices together with the colon
operator.
>> C = A([2 1 3],:)
C
=
4
5
6
1
2
3
7
8
0
It is important to note that the colon operator (:) stands for all columns or all rows. To
create a vector version of matrix A, do the following
23

>> A(:)
ans =
1
2
3
4
5
6
7
8
0
The submatrix comprising the intersection of rows p to q and columns r to s is denoted by
A(p:q,r:s).
As a special case, a colon (:) as the row or column specifier covers all entries in that row or
column; thus
A(:,j) is the jth column of A, while
A(i,:) is the ith row, and
A(end,:) picks out the last row of A.
The keyword end, used in A(end,:), denotes the last index in the specified dimension. Here
are some examples.
>> A
A =
1
4
7

2
5
8

3
6
9

>> A(2:3,2:3)
ans =
5
6
8
9
>> A(end:-1:1,end)
ans =
9
6
3
24

ans =
2
3
8
9

2.5.8

## To delete a row or column of a matrix, use the empty vector operator, [ ].

>> A(3,:) = []
A =
1
2
3
4
5
6
Third row of matrix A is now deleted. To restore the third row, we use a technique for
creating a matrix
>> A = [A(1,:);A(2,:);[7 8 0]]
A =
1
2
3
4
5
6
7
8
0
Matrix A is now restored to its original form.

2.5.9

Dimension

To determine the dimensions of a matrix or vector, use the command size. For example,
>> size(A)
ans
=
3

## means 3 rows and 3 columns.

Or more explicitly with,
>> [m,n]=size(A)

25

2.5.10

Continuation

If it is not possible to type the entire input on the same line, use consecutive periods, called
an ellipsis . . ., to signal continuation, then continue the input on the next line.
B = [4/5
1/x^2
x-7

7.23*tan(x)
0
sqrt(3)

sqrt(6); ...
3/(x*log(x)); ...
x*sin(x)];

Note that blank spaces around +, , = signs are optional, but they improve readability.

2.5.11

Transposing a matrix

The transpose operation is denoted by an apostrophe or a single quote (). It flips a matrix
about its main diagonal and it turns a row vector into a column vector. Thus,
>> A
ans =
1
2
3

4
5
6

7
8
0

By using linear algebra notation, the transpose of m n real matrix A is the n m matrix
that results from interchanging the rows and columns of A. The transpose matrix is denoted
AT .

2.5.12

Concatenating matrices

Matrices can be made up of sub-matrices. Here is an example. First, lets recall our previous
matrix A.
A

=
1
4
7

2
5
8

3
6
9

## The new matrix B will be,

>> B = [A 10*A; -A [1 0 0; 0 1 0; 0 0 1]]
B
=
1
2
3
10
20
30
26

4
7
-1
-4
-7

2.5.13

5
8
-2
-5
-8

6
9
-3
-6
-9

40
70
1
0
0

50
80
0
1
0

60
90
0
0
1

Matrix generators

MATLAB provides functions that generates elementary matrices. The matrix of zeros, the
matrix of ones, and the identity matrix are returned by the functions zeros, ones, and eye,
respectively.
Table 2.4: Elementary matrices
eye(m,n)
eye(n)
zeros(m,n)
ones(m,n)
diag(A)
rand(m,n)

## Returns an m-by-n matrix with 1 on the main diagonal

Returns an n-by-n square identity matrix
Returns an m-by-n matrix of zeros
Returns an m-by-n matrix of ones
Extracts the diagonal of matrix A
Returns an m-by-n matrix of random numbers

For a complete list of elementary matrices and matrix manipulations, type help elmat
or doc elmat. Here are some examples:
1.

>> b=ones(3,1)
b
=
1
1
1
Equivalently, we can define b as >> b=[1;1;1]

2.

3.

>> eye(3)
ans
=
1
0
0
1
0
0

0
0
1

>> c=zeros(2,3)
c
=
0
0
0
27

In addition, it is important to remember that the three elementary operations of addition (+), subtraction (), and multiplication () apply also to matrices whenever the
dimensions are compatible.
Two other important matrix generation functions are rand and randn, which generate
matrices of (pseudo-)random numbers using the same syntax as eye.
In addition, matrices can be constructed in a block form. With C defined by C = [1
2; 3 4], we may create a matrix D as follows
>> D = [C zeros(2); ones(2) eye(2)]
D =
1
2
0
0
3
4
0
0
1
1
1
0
1
1
0
1

2.5.14

Special matrices

MATLAB provides a number of special matrices (see Table 2.5). These matrices have interesting properties that make them useful for constructing examples and for testing algorithms.
Table 2.5: Special matrices
hilb
invhilb
magic
pascal
toeplitz
vander
wilkinson

Hilbert matrix
Inverse Hilbert matrix
Magic square
Pascal matrix
Toeplitz matrix
Vandermonde matrix
Wilkinsons eigenvalue test matrix

28

2.6

Exercises

Note: Due to the teaching class during this Fall Quarter 2005, the problems are temporarily
removed from this section.

29

Chapter 3
Array operations and Linear
equations
3.1

Array operations

MATLAB has two different types of arithmetic operations: matrix arithmetic operations
and array arithmetic operations. We have seen matrix arithmetic operations in the previous
lab. Now, we are interested in array operations.

3.1.1

## As we mentioned earlier, MATLAB allows arithmetic operations: +, , , and to be

carried out on matrices. Thus,
A+B or B+A
A*B
A^2
*A or A*

3.1.2

## is valid if A and B are of the same size

is valid if As number of column equals Bs number of rows
is valid if A is square and equals A*A
multiplies each element of A by

## Array arithmetic operations

On the other hand, array arithmetic operations or array operations for short, are done
element-by-element. The period character, ., distinguishes the array operations from the
matrix operations. However, since the matrix and array operations are the same for addition
(+) and subtraction (), the character pairs (.+) and (.) are not used. The list of array
operators is shown below in Table 3.2. If A and B are two matrices of the same size with
elements A = [aij ] and B = [bij ], then the command
30

.*
./
.^

Element-by-element multiplication
Element-by-element division
Element-by-element exponentiation
Table 3.1: Array operators

>> C = A.*B
produces another matrix C of the same size with elements cij
the same 3 3 matrices,

10 20
1 2 3

B = 40 50
A= 4 5 6 ,
70 80
7 8 9

## = aij bij . For example, using

30
60
90

we have,
>> C = A.*B
C
=
10
40
160
250
490
640

90
360
810

To raise a scalar to a power, we use for example the command 10^2. If we want the
operation to be applied to each element of a matrix, we use .^2. For example, if we want
to produce a new matrix whose elements are the square of the elements of the matrix A, we
enter
>> A.^2
ans
=
1
4
16
25
49
64

9
36
81

The relations below summarize the above operations. To simplify, lets consider two
vectors U and V with elements U = [ui ] and V = [vj ].
U. V
U./V
U.V

produces [u1 v1 u2 v2 . . . un vn ]
produces [u1 /v1 u2 /v2 . . . un /vn ]
produces [uv11 uv22 . . . uvnn ]

31

Subtraction
Multiplication
Division
Left division
Exponentiation

/
\

.
./
.\
.

3.2

## Solving linear equations

One of the problems encountered most frequently in scientific computation is the solution of
systems of simultaneous linear equations. With matrix notation, a system of simultaneous
linear equations is written
Ax = b
(3.1)
where there are as many equations as unknown. A is a given square matrix of order n, b is a
given column vector of n components, and x is an unknown column vector of n components.
In linear algebra we learn that the solution to Ax = b can be written as x = A1 b, where
A1 is the inverse of A.
For example, consider the following system of linear equations

x + 2y + 3z = 1
4x + 5y + 6z = 1

7x + 8y
= 1
The coefficient matrix A is

1 2 3
A= 4 5 6
7 8 9

1
and the vector b = 1
1

## With matrix notation, a system of simultaneous linear equations is written

Ax = b
This equation can be solved for x using linear algebra. The result is x = A1 b.
There are typically two ways to solve for x in MATLAB:
1. The first one is to use the matrix inverse, inv.

32

(3.2)

>> A = [1 2 3; 4 5 6; 7 8 0];
>> b = [1; 1; 1];
>> x = inv(A)*b
x
=
-1.0000
1.0000
-0.0000

2. The second one is to use the backslash (\)operator. The numerical algorithm behind
this operator is computationally efficient. This is a numerically reliable way of solving
system of linear equations by using a well-known process of Gaussian elimination.
>> A = [1 2 3; 4 5 6; 7 8 0];
>> b = [1; 1; 1];
>> x = A\b
x
=
-1.0000
1.0000
-0.0000

This problem is at the heart of many problems in scientific computation. Hence it is important that we know how to solve this type of problem efficiently.
Now, we know how to solve a system of linear equations. In addition to this, we will
see some additional details which relate to this particular topic.

3.2.1

Matrix inverse

## Lets consider the same matrix A.

1 2 3
A= 4 5 6
7 8 0
Calculating the inverse of A manually is probably not a pleasant work. Here the handcalculation of A1 gives as a final result:

16 8 1
1
A1 = 14 7 2
9
1 2 1
In MATLAB, however, it becomes as simple as the following commands:
33

>> A = [1 2 3; 4 5 6; 7 8 0];
>> inv(A)
ans =
-1.7778
0.8889
-0.1111
1.5556
-0.7778
0.2222
-0.1111
0.2222
-0.1111
which is similar to:

A1

16
8 1
1
2
= 14 7
9
1
2 1

## and the determinant of A is

>> det(A)
ans =
27
For further details on applied numerical linear algebra, see  and .

3.2.2

Matrix functions

MATLAB provides many matrix functions for various matrix/vector manipulations; see
Table 3.3 for some of these functions. Use the online help of MATLAB to find how to use
these functions.
det
diag
eig
inv
norm
rank

Determinant
Diagonal matrices and diagonals of a matrix
Eigenvalues and eigenvectors
Matrix inverse
Matrix and vector norms
Number of linearly independent rows or columns
Table 3.3: Matrix functions

3.3

Exercises

Note: Due to the teaching class during this Fall Quarter 2005, the problems are temporarily
removed from this section.

34

Chapter 4
Introduction to programming in
MATLAB
4.1

Introduction

So far in these lab sessions, all the commands were executed in the Command Window.
The problem is that the commands entered in the Command Window cannot be saved
and executed again for several times. Therefore, a different way of executing repeatedly
commands with MATLAB is:
1. to create a file with a list of commands,
2. save the file, and
3. run the file.
If needed, corrections or changes can be made to the commands in the file. The files that
are used for this purpose are called script files or scripts for short.
This section covers the following topics:
M-File Scripts
M-File Functions

4.2

M-File Scripts

A script file is an external file that contains a sequence of MATLAB statements. Script
files have a filename extension .m and are often called M-files. M-files can be scripts that
simply execute a series of MATLAB statements, or they can be functions that can accept
arguments and can produce one or more outputs.
35

4.2.1

Examples

## Here are two simple scripts.

Example 1
Consider the system of equations:

x + 2y + 3z = 1
3x + 3y + 4z = 1

2x + 3y + 3z = 2
Find the solution x to the system of equations.
Solution:
Use the MATLAB editor to create a file: File New M-file.
Enter the following statements in the file:
A = [1 2 3; 3 3 4; 2 3 3];
b = [1; 1; 2];
x = A\b

## Save the file, for example, example1.m.

Run the file, in the command line, by typing:
>> example1
x
=
-0.5000
1.5000
-0.5000

When execution completes, the variables (A, b, and x) remain in the workspace. To see a
listing of them, enter whos at the command prompt.
Note: The MATLAB editor is both a text editor specialized for creating M-files and a
graphical MATLAB debugger. The MATLAB editor has numerous menus for tasks such as
saving, viewing, and debugging. Because it performs some simple checks and also uses color
to differentiate between various elements of codes, this text editor is recommended as the
tool of choice for writing and editing M-files.
There is another way to open the editor:
36

>> edit
or
>> edit filename.m
to open filename.m.
Example 2
Plot the following cosine functions, y1 = 2 cos(x), y2 = cos(x), and y3 = 0.5 cos(x), in the
interval 0 x 2. This example has been presented in previous Chapter. Here we put
the commands in a file.
Create a file, say example2.m, which contains the following commands:
x = 0:pi/100:2*pi;
y1 = 2*cos(x);
y2 = cos(x);
y3 = 0.5*cos(x);
plot(x,y1,--,x,y2,-,x,y3,:)
xlabel(0 \leq x \leq 2\pi)
ylabel(Cosine functions)
legend(2*cos(x),cos(x),0.5*cos(x))
title(Typical example of multiple plots)
axis([0 2*pi -3 3])

## Run the file by typing example2 in the Command Window.

4.2.2

Script side-effects

All variables created in a script file are added to the workspace. This may have undesirable
effects, because:
Variables already existing in the workspace may be overwritten.
The execution of the script can be affected by the state variables in the workspace.
As a result, because scripts have some undesirable side-effects, it is better to code any
complicated applications using rather function M-file.

37

4.3

M-File functions

As mentioned earlier, functions are programs (or routines) that accept input arguments and
return output arguments. Each M-file function (or function or M-file for short) has its own
area of workspace, separated from the MATLAB base workspace.

4.3.1

## This simple function shows the basic parts of an M-file.

function f = factorial(n)
% FACTORIAL(N) returns the factorial of N.
% Compute a factorial value.

(1)
(2)
(3)

f = prod(1:n);

(4)

The first line of a function M-file starts with the keyword function. It gives the function
name and order of arguments. In the case of function factorial, there are up to one output
argument and one input argument. Table 4.1 summarizes the M-file function.
As an example, for n = 5, the result is,
>> f = factorial(5)
f =
120
Table 4.1: Anatomy of a M-File function
Part no.

M-file element

Description

(1)

(2)

Function
definition
line
H1 line

(3)

Help text

(4)

Function body

## Define the function name, and the

number and order of input and
output arguments
A one line summary description
of the program, displayed when you
request Help
A more detailed description of
the program
Program code that performs
the actual computations

Both functions and scripts can have all of these parts, except for the function definition
line which applies to function only.
38

In addition, it is important to note that function name must begin with a letter, and
must be no longer than than the maximum of 63 characters. Furthermore, the name of the
text file that you save will consist of the function name with the extension .m. Thus, the
above example file would be factorial.m.
Table 4.2 summarizes the differences between scripts and functions.
Table 4.2: Difference between scripts and functions
Scripts

Functions

## - Do not accept input

arguments or return output
arguments.
- Store variables in a
workspace that is shared
with other scripts
- Are useful for automating
a series of commands

## - Can accept input arguments and

return output arguments.
- Store variables in a workspace
internal to the function.
- Are useful for extending the MATLAB

39

4.3.2

## Input and output arguments

As mentioned above, the input arguments are listed inside parentheses following the function
name. The output arguments are listed inside the brackets on the left side. They are used
to transfer the output from the function file. The general form looks like this
function [outputs] = function_name(inputs)
Function file can have none, one, or several output arguments. Table 4.3 illustrates some
possible combinations of input and output arguments.
Table 4.3: Example of input and output arguments
function C=FtoC(F)
function area=TrapArea(a,b,h)
function [h,d]=motion(v,angle)

4.4

## One input argument and

one output argument
Three inputs and one output
Two inputs and two outputs

## Input to a script file

When a script file is executed, the variables that are used in the calculations within the file
must have assigned values. The assignment of a value to a variable can be done in three
ways.
1. The variable is defined in the script file.
2. The variable is defined in the command prompt.
3. The variable is entered when the script is executed.
We have already seen the two first cases. Here, we will focus our attention on the third one.
In this case, the variable is defined in the script file. When the file is executed, the user is
prompted to assign a value to the variable in the command prompt. This is done by using
the input command. Here is an example.
%
%
%
%

## This script file calculates the average of points

scored in three games.
The point from each game are assigned to a variable
by using the input command.

40

## game2 = input(Enter the points scored in the second game );

game3 = input(Enter the points scored in the third game );
average = (game1+game2+game3)/3
The following shows the command prompt when this script file (saved as example3) is
executed.
>>
>>
>>
>>

example3
Enter the points scored in the first game
Enter the points scored in the second game
Enter the points scored in the third game

15
23
10

average =
16
The input command can also be used to assign string to a variable. For more information,
see MATLAB documentation.
A typical example of M-file function programming can be found in a recent paper which
related to the solution of the ordinary differential equation (ODE) .

4.5

Output commands

As discussed before, MATLAB automatically generates a display when commands are executed. In addition to this automatic display, MATLAB has several commands that can be
used to generate displays or outputs.
Two commands that are frequently used to generate output are: disp and fprintf.
The main differences between these two commands can be summarized as follows (Table
4.4).
Table 4.4: disp and fprintf commands
disp

. Simple to use.
. Provide limited control over the appearance of output

fprintf

## . Slightly more complicated than disp.

. Provide total control over the appearance of output

41

4.6

Exercises

1. Liz buys three apples, a dozen bananas, and one cantaloupe for $2.36. Bob buys a dozen apples and two cantaloupe for$5.26. Carol buys two bananas and three cantaloupe
for $2.77. How much do single pieces of each fruit cost? 2. Write a function file that converts temperature in degrees Fahrenheit ( F) to degrees Centigrade ( C). Use input and fprintf commands to display a mix of text and numbers. Recall the conversion formulation, C = 5/9 (F 32). 3. Write a user-defined MATLAB function, with two input and two output arguments that determines the height in centimeters (cm) and mass in kilograms (kg)of a person from his height in inches (in.) and weight in pounds (lb). (a) Determine in SI units the height and mass of a 5 ft.15 in. person who weight 180 lb. (b) Determine your own height and weight in SI units. 42 Chapter 5 Control flow and operators 5.1 Introduction ## MATLAB is also a programming language. Like other computer programming languages, MATLAB has some decision making structures for control of command execution. These decision making or control flow structures include for loops, while loops, and if-else-end constructions. Control flow structures are often used in script M-files and function M-files. By creating a file with the extension .m, we can easily write and run programs. We do not need to compile the program since MATLAB is an interpretative (not compiled) language. MATLAB has thousand of functions, and you can add your own using m-files. MATLAB provides several tools that can be used to control the flow of a program (script or function). In a simple program as shown in the previous Chapter, the commands are executed one after the other. Here we introduce the flow control structure that make possible to skip commands or to execute specific group of commands. 5.2 Control flow MATLAB has four control flow structures: the if statement, the for loop, the while loop, and the switch statement. 5.2.1 ## The if...end structure ## MATLAB supports the variants of if construct. if ... end if ... else ... end 43 if ... elseif ... else ... end ## The simplest form of the if statement is if expression statements end Here are some examples based on the familiar quadratic formula. 1. ## discr = b*b - 4*a*c; if discr < 0 disp(Warning: discriminant is negative, roots are imaginary); end 2. ## discr = b*b - 4*a*c; if discr < 0 disp(Warning: discriminant is negative, roots are imaginary); else disp(Roots are real, but may be repeated) end 3. ## discr = b*b - 4*a*c; if discr < 0 disp(Warning: discriminant is negative, roots are imaginary); elseif discr == 0 disp(Discriminant is zero, roots are repeated) else disp(Roots are real) end ## It should be noted that: elseif has no space between else and if (one word) no semicolon (;) is needed at the end of lines containing if, else, end indentation of if block is not required, but facilitate the reading. the end statement is required 44 5.2.2 ## Relational and logical operators ## A relational operator compares two numbers by determining whether a comparison is true or false. Relational operators are shown in Table 5.1. Table 5.1: Relational and logical operators Operator Description > < >= <= == = & | Greater than Less than Greater than or equal to Less than or equal to Equal to Not equal to AND operator OR operator NOT operator Note that the equal to relational operator consists of two equal signs (==) (with no space between them), since = is reserved for the assignment operator. 5.2.3 ## The for...end loop In the for ... end loop, the execution of a command is repeated at a fixed and predetermined number of times. The syntax is for variable = expression statements end Usually, expression is a vector of the form i:s:j. A simple example of for loop is for ii=1:5 x=ii*ii end It is a good idea to indent the loops for readability, especially when they are nested. Note that MATLAB editor does it automatically. Multiple for loops can be nested, in which case indentation helps to improve the readability. The following statements form the 5-by-5 symmetric matrix A with (i, j) element i/j for j i: 45 n = 5; A = eye(n); for j=2:n for i=1:j-1 A(i,j)=i/j; A(j,i)=i/j; end end 5.2.4 ## The while...end loop This loop is used when the number of passes is not specified. The looping continues until a stated condition is satisfied. The while loop has the form: while expression statements end The statements are executed as long as expression is true. x = 1 while x <= 10 x = 3*x end It is important to note that if the condition inside the looping is not well defined, the looping will continue indefinitely. If this happens, we can stop the execution by pressing Ctrl-C. 5.2.5 ## Other flow structures The break statement. A while loop can be terminated with the break statement, which passes control to the first statement after the corresponding end. The break statement can also be used to exit a for loop. The continue statement can also be used to exit a for loop to pass immediately to the next iteration of the loop, skipping the remaining statements in the loop. Other control statements include return, continue, switch, etc. For more detail about these commands, consul MATLAB documentation. 46 5.2.6 Operator precedence We can build expressions that use any combination of arithmetic, relational, and logical operators. Precedence rules determine the order in which MATLAB evaluates an expression. We have already seen this in the Tutorial Lessons. Here we add other operators in the list. The precedence rules for MATLAB are shown in this list (Table 5.2), ordered from highest (1) to lowest (9) precedence level. Operators are evaluated from left to right. Table 5.2: Operator precedence Precedence 1 2 3 4 5 6 7 8 9 5.3 Operator Parentheses () Transpose (. 0 ), power (.), matrix power () Unary plus (+), unary minus (), logical negation () Multiplication (. ), right division (. /), left division (.\), matrix multiplication (), matrix right division (/), matrix left division (\) Addition (+), subtraction () Colon operator (:) Less than (<), less than or equal to (), greater (>), greater than or equal to (), equal to (==), not equal to (=) Element-wise AND, (&) Element-wise OR, (|) ## Saving output to a file In addition to displaying output on the screen, the command fprintf can be used for writing the output to a file. The saved data can subsequently be used by MATLAB or other softwares. To save the results of some computation to a file in a text format requires the following steps: 1. Open a file using fopen 2. Write the output using fprintf 3. Close the file using fclose Here is an example (script) of its use. 47 ## % write some variable length strings to a file op = fopen(weekdays.txt,wt); fprintf(op,Sunday\nMonday\nTuesday\nWednesday\n); fprintf(op,Thursday\nFriday\nSaturday\n); fclose(op); This file (weekdays.txt) can be opened with any program that can read .txt file. 5.4 Exercises Note: Due to the teaching class during this Fall Quarter 2005, the problems are temporarily removed from this section. 48 Chapter 6 Debugging M-files 6.1 Introduction This section introduces general techniques for finding errors in M-files. Debugging is the process by which you isolate and fix errors in your program or code. Debugging helps to correct two kind of errors: Syntax errors - For example omitting a parenthesis or misspelling a function name. Run-time errors - Run-time errors are usually apparent and difficult to track down. They produce unexpected results. 6.2 Debugging process We can debug the M-files using the Editor/Debugger as well as using debugging functions from the Command Window. The debugging process consists of Preparing for debugging Setting breakpoints Running an M-file with breakpoints Stepping through an M-file Examining values Correcting problems Ending debugging 49 6.2.1 ## Preparing for debugging Here we use the Editor/Debugger for debugging. Do the following to prepare for debugging: Open the file Save changes Be sure the file you run and any files it calls are in the directories that are on the search path. 6.2.2 Setting breakpoints Set breakpoints to pause execution of the function, so we can examine where the problem might be. There are three basic types of breakpoints: A standard breakpoint, which stops at a specified line. A conditional breakpoint, which stops at a specified line and under specified conditions. An error breakpoint that stops when it produces the specified type of warning, error, NaN, or infinite value. You cannot set breakpoints while MATLAB is busy, for example, running an M-file. 6.2.3 ## Running with breakpoints After setting breakpoints, run the M-file from the Editor/Debugger or from the Command Window. Running the M-file results in the following: The prompt in the Command Window changes to K>> ## indicating that MATLAB is in debug mode. The program pauses at the first breakpoint. This means that line will be executed when you continue. The pause is indicated by the green arrow. In breakpoint, we can examine variable, step through programs, and run other calling functions. 50 6.2.4 Examining values While the program is paused, we can view the value of any variable currently in the workspace. Examine values when we want to see whether a line of code has produced the expected result or not. If the result is as expected, step to the next line, and continue running. If the result is not as expected, then that line, or the previous line, contains an error. When we run a program, the current workspace is shown in the Stack field. Use who or whos to list the variables in the current workspace. Viewing values as datatips First, we position the cursor to the left of a variable on that line. Its current value appears. This is called a datatip, which is like a tooltip for data. If you have trouble getting the datatip to appear, click in the line and then move the cursor next to the variable. 6.2.5 ## Correcting and ending debugging While debugging, we can change the value of a variable to see if the new value produces expected results. While the program is paused, assign a new value to the variable in the Command Window, Workspace browser, or Array Editor. Then continue running and stepping through the program. 6.2.6 Ending debugging After identifying a problem, end the debugging session. It is best to quit debug mode before editing an M-file. Otherwise, you can get unexpected results when you run the file. To end debugging, select Exit Debug Mode from the Debug menu. 6.2.7 Correcting an M-file ## To correct errors in an M-file, Quit debugging Do not make changes to an M-file while MATLAB is in debug mode Make changes to the M-file Save the M-file Clear breakpoints 51 ## Run the M-file again to be sure it produces the expected results. For details on debugging process, see MATLAB documentation. 52 Appendix A Summary of commands Table A.1: Arithmetic operators and special characters Character + : ; , ... % 0 = () [] Description Addition Subtraction Multiplication (scalar and array) Division (right) Power or exponentiation Colon; creates vectors with equally spaced elements Semi-colon; suppresses display; ends row in array Comma; separates array subscripts Continuation of lines Percent; denotes a comment; specifies output format Single quote; creates string; specifies matrix transpose Assignment operator Parentheses; encloses elements of arrays and input arguments Brackets; encloses matrix elements and output arguments 53 ## Table A.2: Array operators Character . ./ . .\ .0 Description Array Array Array Array Array multiplication (right) division power (left) division (nonconjugated) transpose ## Table A.3: Relational and logical operators Character < > == = & | && || Description Less than Less than or equal to Greater than Greater than or equal to Equal to Not equal to Logical or element-wise AND Logical or element-wise OR Short-circuit AND Short-circuit OR 54 ## Table A.4: Managing workspace and file commands Command cd clc clear (all) clear x copyfile delete dir exist help lookfor mkdir movefile pwd rmdir type what which who whos Description Change current directory Clear the Command Window Removes all variables from the workspace Remove x from the workspace Copy file or directory Delete files Display directory listing Check if variables or functions are defined Display help for MATLAB functions Search for specified word in all help entries Make new directory Move file or directory Identify current directory Remove directory Display contents of file List MATLAB files in current directory Locate functions and files Display variables currently in the workspace Display information on variables in the workspace ## Table A.5: Predefined variables and math constants Variable ans eps i Inf eps j NaN pi Description Value of last variable (answer) Floating-point relative accuracy Imaginary unit of a complex number Infinity () Floating-point relative accuracy Imaginary unit of a complex number Not a number The number (3.14159 . . .) 55 ## Table A.6: Elementary matrices and arrays Command eye linspace ones rand zeros Description Identity matrix Generate linearly space vectors Create array of all ones Uniformly distributed random numbers and arrays Create array of all zeros ## Table A.7: Arrays and Matrices: Basic information Command disp isempty isequal length ndims numel size Description Display text or array Determine if input is empty matrix Test arrays for equality Length of vector Number of dimensions Number of elements Size of matrix ## Table A.8: Arrays and Matrices: operations and manipulation Command cross diag dot end find kron max min prod reshape sort sum size Description Vector cross product Diagonal matrices and diagonals of matrix Vector dot product Indicate last index of array Find indices of nonzero elements Kronecker tensor product Maximum value of array Minimum value of array Product of array elements Reshape array Sort array elements Sum of array elements Size of matrix 56 Table A.9: Arrays and Matrices: matrix analysis and linear equations Command cond det inv linsolve lu norm null orth rank rref trace Description Condition number with respect to inversion Determinant Matrix inverse Solve linear system of equations LU factorization Matrix or vector norm Null space Orthogonalization Matrix rank Reduced row echelon form Sum of diagonal elements 57 Appendix B Release notes for Release 14 with Service Pack 2 B.1 Summary of changes MATLAB 7 Release 14 with Service Pack 2 (R14SP2) includes several new features. The major focus of R14SP2 is on improving the quality of the product. This document doesnt attempt to provide a complete specification of every single feature, but instead provides a brief introduction to each of them. For full details, you should refer to the MATLAB documentation (Release Notes). The following key points may be relevant: 1. Spaces before numbers - For example: A* .5, you will typically get a mystifying message saying that A was previously used as a variable. There are two workarounds: (a) Remove all the spaces: A*.5 (b) Or, put a zero in front of the dot: A * 0.5 2. RHS empty matrix - The right-hand side must literally be the empty matrix [ ]. It cannot be a variable that has the value [ ], as shown here: rhs = []; A(:,2) = rhs ??? Subscripted assignment dimension mismatch 58 3. New format option - We can display MATLAB output using two new formats: short eng and long eng. short eng Displays output in engineering format that has at least 5 digits and a power that is a multiple of three. >> format short eng >> pi ans = 3.1416e+000 long eng Displays output in engineering format that has 16 significant digits and a power that is a multiple of three. >> format long eng >> pi ans = 3.14159265358979e+000 4. Help - To get help for a subfunction, use >> help function_name>subfunction_name In previous versions, the syntax was >> help function_name/subfunction_name This change was introduced in R14 (MATLAB 7.0) but was not documented. Use the MathWorks Web site search features to look for the latest information. 5. Publishing - Publishing to LATEXnow respects the image file type you specify in preferences rather than always using EPSC2-files. The Publish image options in Editor/Debugger preferences for Publishing Images have changed slightly. The changes prevent you from choosing invalid formats. The files created when publishing using cells now have more natural extensions. For example, JPEG-files now have a .jpg instead of a .jpeg extension, and EPSC2files now have an .eps instead of an .epsc2 extension. Notebook will no longer support Microsoft Word 97 starting in the next release of MATLAB. 6. Debugging - Go directly to a subfunction or using the enhanced Go To dialog box. Click the Name column header to arrange the list of function alphabetically, or click the Line column header to arrange the list by the position of the functions in the file. 59 B.2 Other changes 1. There is a new command mlint, which will scan an M-file and show inefficiencies in the code. For example, it will tell you if youve defined a variable youve never used, if youve failed to pre-allocate an array, etc. These are common mistakes in EA1 which produce runnable but inefficient code. 2. You can comment-out a block of code without putting % at the beginning of each line. The format is %{ Stuff you want MATLAB to ignore... %} The delimiters %{ and %} must appear on lines by themselves, and it may not work with the comments used in functions to interact with the help system (like the H1 line). 3. There is a new function linsolve which will solve Ax = b but with the users choice of algorithm. This is in addition to left division x = A\b which uses a default algorithm. 4. The eps constant now takes an optional argument. eps(x) is the same as the old eps*abs(x). 5. You can break an M-file up into named cells (blocks of code), each of which you can run separately. This may be useful for testing/debugging code. 6. Functions now optionally end with the end keyword. This keyword is mandatory when working with nested functions. B.3 Further details 1. You can dock and un-dock windows from the main window by clicking on an icon. Thus you can choose to have all Figures, M-files being edited, help browser, command window, etc. All appear as panes in a single window. 2. Error messages in the command window resulting from running an M-file now include a clickable link to the offending line in the editor window containing the M-file. 3. You can customize figure interactively (labels, line styles, etc.) and then automatically generate the code which reproduces the customized figure. 60 4. feval is no longer needed when working with function handles, but still works for backward compatibility. For example, x=@sin; x(pi) will produce sin(pi) just like feval(x,pi) does, but faster. 5. You can use function handles to create anonymous functions. 6. There is support for nested functions, namely, functions defined within the body of another function. This is in addition to sub-functions already available in version 6.5. 7. There is more support in arithmetic operations for numeric data types other than double, e.g. single, int8, int16, uint8, uint32, etc. Finally, please visit our webpage for other details: http://computing.mccormick.northwestern.edu/matlab/ 61 Appendix C Main characteristics of MATLAB C.1 History ## Developed primarily by Cleve Moler in the 1970s Derived from FORTRAN subroutines LINPACK and EISPACK, linear and eigenvalue systems. Developed primarily as an interactive system to access LINPACK and EISPACK. Gained its popularity through word of mouth, because it was not officially distributed. Rewritten in C in the 1980s with more functionality, which include plotting routines. The MathWorks Inc. was created (1984) to market and continue development of MATLAB. According to Cleve Moler, three other men played important roles in the origins of MATLAB: J. H. Wilkinson, George Forsythe, and John Todd. It is also interesting to mention the authors of LINPACK: Jack Dongara, Pete Steward, Jim Bunch, and Cleve Moler. Since then another package emerged: LAPACK. LAPACK stands for Linear Algebra Package. It has been designed to supersede LINPACK and EISPACK. C.2 Strengths ## MATLAB may behave as a calculator or as a programming language MATLAB combine nicely calculation and graphic plotting. MATLAB is relatively easy to learn 62 ## MATLAB is interpreted (not compiled), errors are easy to fix MATLAB is optimized to be relatively fast when performing matrix operations MATLAB does have some object-oriented elements C.3 Weaknesses ## MATLAB is not a general purpose programming language such as C, C++, or FORTRAN MATLAB is designed for scientific computing, and is not well suitable for other applications MATLAB is an interpreted language, slower than a compiled language such as C++ MATLAB commands are specific for MATLAB usage. Most of them do not have a direct equivalent with other programming language commands C.4 Competition ## One of MATLABs competitors is Mathematica, the symbolic computation program. MATLAB is more convenient for numerical analysis and linear algebra. It is frequently used in engineering community. Mathematica has superior symbolic manipulation, making it popular among physicists. There are other competitors: Scilab GNU Octave Rlab 63 Bibliography  The MathWorks Inc. MATLAB 7.0 (R14SP2). The MathWorks Inc., 2005.  S. J. Chapman. MATLAB Programming for Engineers. Thomson, 2004.  C. B. Moler. Numerical Computing with MATLAB. Siam, 2004.  C. F. Van Loan. Introduction to Scientific Computing. Prentice Hall, 1997.  D. J. Higham and N. J. Higham. MATLAB Guide. Siam, second edition edition, 2005.  K. R. Coombes, B. R. Hunt, R. L. Lipsman, J. E. Osborn, and G. J. Stuck. Differential Equations with MATLAB. John Wiley and Sons, 2000.  A. Gilat. MATLAB: An introduction with Applications. John Wiley and Sons, 2004.  J. Cooper. A MATLAB Companion for Multivariable Calculus. Academic Press, 2001.  J. C. Polking and D. Arnold. ODE using MATLAB. Prentice Hall, 2004.  D. Kahaner, C. Moler, and S. Nash. Numerical Methods and Software. Prentice-Hall, 1989.  J. W. Demmel. Applied Numerical Linear Algebra. Siam, 1997.  D. Houcque. Applications of MATLAB: Ordinary Differential Equations. Internal communication, Northwestern University, pages 112, 2005. 64 Polynomials in Matlab Polynomials f(x) = anx n + an-1x n-1 + ... + a1x + a0 n is the degree of the polynomial Examples: f(x) = 2x 2 - 4x + 10 degree 2 f(x) = 6 degree 0 Polynomials in Matlab Represented by a row vector in which the elements are the coefficients. Must include all coefficients, even if 0 Examples 8x + 5 p = [8 5] 6x 2 - 150 h = [6 0 -150] Value of a Polynomial Recall that we can compute the value of a polynomial for any value of x directly. Ex: f(x) = 5x 3 + 6x 2 - 7x + 3 x = 2; y = (5 * x ^ 3) + (6 * x ^ 2) - (7 * x) + 3 y= 53 Value of a Polynomial Matlab can also compute the value of a polynomial at point x using a function, which is sometimes more convenient polyval (p, x) p is a vector with the coefficients of the polynomial x is a number, variable or expression Value of a polynomial Ex: f(x) = 5x 3 + 6x 2 - 7x + 3 p = [5 6 -7 3]; x = 2; y = polyval(p, x) y= 53 Roots of a Polynomial Recall that the roots of a polynomial are the values of the argument for which the polynomial is equal to zero Ex: f(x) = x2 - 2x -3 0 = x 2 - 2x -3 0 = (x + 1)(x - 3) 0=x+1 OR 0 = x - 3 x = -1 x=3 Roots of a Polynomial Matlab can compute the roots of a function r = roots(p) p is a row vector with the coefficients of the polynomial r is a column vector with the roots of the polynomial Roots of a Polynomial Ex: f(x) = x2 - 2x -3 p = [1 -2 -3]; r = roots(p) r= 3.0000 -1.0000 Polynomial Coefficients Given the roots of a polynomial, the polynomial itself can be calculated Ex: roots = -3, +2 x = -3 OR x = 2 0=x+3 0 = x -2 0 = (x + 3)(x - 2) f(x) = x 2 + x - 6 Polynomial Coefficients Given the roots of a polynomial, Matlab can compute the coefficients p = poly(r) r is a row or column vector with the roots of the polynomial p is a row vector with the coefficients Polynomial Coefficients Ex: roots = -3, +2 r = [-3 +2]; p = poly(r) p= 1 1 -6 % f(x) = x 2 + x -6 ## Adding and Subtracting Polynomials Polynomials can be added or subtracted Ex: f1(x) + f2(x) f1(x) = 3x 6 + 15x 5 - 10x 3 - 3x 2 +15x - 40 f2(x) = 3x 3 - 2x - 6 6 5 3 2 3x + 15x - 7x -3x +13x - 46 ## Adding and Subtracting Polynomials ## Can do this in Matlab by just adding or subtracting the coefficient vectors Both vectors must be of the same size, so the shorter vector must be padded with zeros ## Adding and Subtracting Polynomials Ex: f1(x) = 3x 6 + 15x 5 - 10x 3 - 3x 2 +15x - 40 f2(x) = 3x 3 - 2x - 6 p1 = [3 15 0 -10 -3 15 -40]; p2 = [0 0 0 3 0 -2 -6]; p = p1+p2 p= 3 15 0 -7 -3 13 -46 %f(x) = 3x 6 + 15x 5 -7x 3 -3x 2 +13x -46 Multiplying Polynomials Polynomials can be multiplied: Ex: (2x 2 + x -3) * (x + 1) = 2x 3 + x 2 - 3x + 2x 2 + x -3 2x 3 +3x 2 -2x -3 Multiplying Polynomials Matlab can also multiply polynomials c = conv(a, b) a and b are the vectors of the coefficients of the functions being multiplied c is the vector of the coefficients of the product Multiplying Polynomials Ex: (2x 2 + x -3) * (x + 1) a = [2 1 -3]; b = [1 1]; c = conv(a, b) c= 2 3 -2 -3 % 2x 3 + 3x 2 -2x -3 Dividing Polynomials Recall that polynomials can also be divided x -10 x + 1 x 2 - 9x - 10 -x 2 - x -10x -10 -10x -10 0 Dividing Polynomials Matlab can also divide polynomials [q,r] = deconv(u, v) u is the coefficient vector of the numerator v is the coefficient vector of the denominator q is the coefficient vector of the quotient r is the coefficient vector of the remainder Dividing Polynomials Ex: (x 2 - 9x -10) (x + 1) u = [1 -9 -10]; v = [1 1]; [q, r] = deconv(u, v) q= 1 -10 % quotient is (x - 10) r= 0 0 0 % remainder is 0 Example 1 Write a program to calculate when an object thrown straight up will hit the ground. The equation is s(t) = -gt2 +v 0t + h0 s is the position at time t (a position of zero means that the object has hit the ground) g is the acceleration of gravity: 9.8m/s 2 v 0 is the initial velocity in m/s h0is the initial height in meters (ground level is 0, a positive height means that the object was thrown from a raised platform) Example 1 Prompt for and read in initial velocity Prompt for and read in initial height Find the roots of the equation Solution is the positive root Display solution Example 1 v = input('Please enter initial velocity: '); h = input('Please enter initial height: '); x = [-4.9 v h]; y = roots(x); if y(1) >= 0 fprintf('The object will hit the ground in %.2f seconds \n', y(1)) else fprintf('The object will hit the ground in %.2f seconds \n', y(2)) end Example 1 Please enter initial velocity: 19.6 Please enter initial height: 58.8 The object will hit the ground in 6.00 seconds Derivatives of Polynomials We can take the derivative of polynomials f(x) = 3x 2 -2x + 4 dy = 6x - 2 dx Derivatives of Polynomials Matlab can also calculate the derivatives of polynomials k = polyder(p) p is the coefficient vector of the polynomial k is the coefficient vector of the derivative Derivatives of Polynomials Ex: f(x) = 3x 2 - 2x + 4 p = [3 -2 4]; k = polyder(p) k= 6 -2 % dy/dx = 6x - 2 Integrals of Polynomials ?6x 2 dx = 6 ?x 2 dx = 6 * ? x3 = 2 x3 Integrals of Polynomials Matlab can also calculate the integral of a polynomial g = polyint(h, k) h is the coefficient vector of the polynomial g is the coefficient vector of the integral k is the constant of integration - assumed to be 0 if not present 10 Integration of Polynomials ?6x 2 dx h = [6 0 0]; g = polyint(h) g= 2 0 0 % g(x) = 2x 3 ## Polynomial Curve Fitting Curve fitting is fitting a function to a set of data points That function can then be used for various mathematical analyses Curve fitting can be tricky, as there are many possible functions and coefficients Curve Fitting Polynomial curve fitting uses the method of least squares Determine the difference between each data point and the point on the curve being fitted, called the residual Minimize the sum of the squares of all of the residuals to determine the best fit 11 Curve Fitting A best-fit curve may not pass through any actual data points A high-order polynomial may pass through all the points, but the line may deviate from the trend of the data 12 ## Matlab Curve Fitting Matlab provides an excellent polynomial curve fitting interface First, plot the data that you want to fit t = [0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0]; w = [6.00 4.83 3.70 3.15 2.41 1.83 1.49 1.21 0.96 0.73 0.64]; plot(t, w) ## Choose Tools/Basic Fitting from the menu on the top of the plot 13 ## Matlab Curve Fitting Choose a fit it will be displayed on the plot the numerical results will show the equation and the coefficients the norm of residuals is a measure of the quality of the fit. A smaller residual is a better fit. ## Repeat until you find the curve with the best fit Linear Fit Linear Fit 14 ## 8th Degree Polynomial ## 8th Degree Polynomial Example 2 Find the parabola that best fits the data points (-1, 10) (0, 6) (1, 2) (2, 1) (3, 0) (4, 2) (5, 4) and (6, 7) The equation for a parabola is f(x) = ax 2 + bx + a 15 Example 2 X = [-1, 0, 1, 2, 3, 4, 5, 6]; Y= [10, 6, 2, 1, 0, 2, 4, 7]; plot (X, Y) 16 Other curves All previous examples use polynomial curves. However, the best fit curve may also be power, exponential, logarithmic or reciprocal See your textbook for information on fitting data to these types of curves Interpolation Interpolation is estimating values between data points Simplest way is to assume a line between each pair of points Can also assume a quadratic or cubic polynomial curve connects each pair of points Interpolation yi = interp1(x, y, xi, 'method') interp1 - last character is one x is vector with x points y is a vector with y points xi is the x coordinate of the point to be interpolated yi is the y coordinate of the point being interpolated 17 Interpolation method is optional: 'nearest' - returns y value of the data point that is nearest to the interpolated x point 'linear' - assume linear curve between each two points (default) 'spline' - assumes a cubic curve between each two points Interpolation Example: x = [0 1 2 3 4 5]; y = [1.0 -0.6 -1.4 3.2 -0.7 -6.4]; yi = interp1(x, y, 1.5, 'linear') yi = -1 yj = interp1(x, y, 1.5, 'spline') yj = -1.7817 18 Lecture 8 Matrices and Matrix Operations in Matlab Matrix operations Recall how to multiply a matrix A times a vector v:        1 2 1 1 (1) + 2 2 3 Av = = = . 3 4 2 3 (1) + 4 2 5 This is a special case of matrix multiplication. To multiply two matrices, A and B you proceed as follows:        3 0 1 + 4 2 + 2 1 2 1 2 . = = AB = 5 2 3 + 8 6 + 4 2 1 3 4 Here both A and B are 2 2 matrices. Matrices can be multiplied together in this way provided that the number of columns of A match the number of rows of B. We always list the size of a matrix by rows, then columns, so a 3 5 matrix would have 3 rows and 5 columns. So, if A is m n and B is p q, then we can multiply AB if and only if n = p. A column vector can be thought of as a p 1 matrix and a row vector as a 1 q matrix. Unless otherwise specified we will assume a vector v to be a column vector and so Av makes sense as long as the number of columns of A matches the number of entries in v. Printing matrices on the screen takes up a lot of space, so you may want to use > format compact Enter a matrix into Matlab with the following syntax: > A = [ 1 3 -2 5 ; -1 -1 5 4 ; 0 1 -9 0] Also enter a vector u: > u = [ 1 2 3 4] To multiply a matrix times a vector Au use *: > A*u Since A is 3 by 4 and u is 4 by 1 this multiplication is valid and the result is a 3 by 1 vector. Now enter another matrix B using: > B = [3 2 1; 7 6 5; 4 3 2] You can multiply B times A: > B*A but A times B is not defined and > A*B will result in an error message. You can multiply a matrix by a scalar: > 2*A 28 29 Adding matrices A + A will give the same result: > A + A You can even add a number to a matrix: > A + 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . This should add 3 to every entry of A. Component-wise operations Just as for vectors, adding a . before *, /, or ^ produces entry-wise multiplication, division and exponentiation. If you enter: > B*B the result will be BB, i.e. matrix multiplication of B times itself. But, if you enter: > B.*B the entries of the resulting matrix will contain the squares of the same entries of B. Similarly if you want B multiplied by itself 3 times then enter: > B^3 but, if you want to cube all the entries of B then enter: > B.^3 Note that B*B and B^3 only make sense if B is square, but B.*B and B.^3 make sense for any size matrix. ## The identity matrix and the inverse of a matrix The n n identity matrix is a square matrix with ones on the diagonal and zeros everywhere else. It is called the identity because it plays the same role that 1 plays in multiplication, i.e. AI = A, IA = A, Iv = v for any matrix A or vector v where the sizes match. An identity matrix in Matlab is produced by the command: > I = eye(3) A square matrix A can have an inverse which is denoted by A1 . The definition of the inverse is that: AA1 = I and A1 A = I. In theory an inverse is very important, because if you have an equation: Ax = b where A and b are known and x is unknown (as we will see, such problems are very common and important) then the theoretical solution is: x = A1 b. We will see later that this is not a practical way to solve an equation, and A1 is only important for the purpose of derivations. In Matlab we can calculate a matrixs inverse very conveniently: > C = randn(5,5) > inv(C) However, not all square matrices have inverses: > D = ones(5,5) > inv(D) 30 ## LECTURE 8. MATRICES AND MATRIX OPERATIONS IN MATLAB ## The Norm of a matrix For a vector, the norm means the same thing as the length. Another way to think of it is how far the vector is from being the zero vector. We want to measure a matrix in much the same way and the norm is such a quantity. The usual definition of the norm of a matrix is the following: Definition 1 Suppose A is a m n matrix. The norm of A is |A| max |Av|. |v|=1 The maximum in the definition is taken over all vectors with length 1 (unit vectors), so the definition means the largest factor that the matrix stretches (or shrinks) a unit vector. This definition seems cumbersome at first, but it turns out to be the best one. For example, with this definition we have the following inequality for any vector v: |Av| |A||v|. In Matlab the norm of a matrix is obtained by the command: > norm(A) For instance the norm of an identity matrix is 1: > norm(eye(100)) and the norm of a zero matrix is 0: > norm(zeros(50,50)) For a matrix the norm defined above and calculated by Matlab is not the square root of the sum of the square of its entries. That quantity is called the Froebenius norm, which is also sometimes useful, but we will not need it. ## Some other useful commands Try out the following: > C = rand(5,5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . random matrix with uniform distribution in [0, 1]. > size(C) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . gives the dimensions (m n) of C. > det(C) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . the determinant of the matrix. > max(C) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . the maximum of each column. > min(C) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .the minimum in each column. > sum(C) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . sums each column. > mean(C) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . the average of each column. > diag(C) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . just the diagonal elements. > C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .tranpose the matrix. In addition to ones, eye, zeros, rand and randn, Matlab has several other commands that automatically produce special matrices: > hilb(6) > pascal(5) 31 Exercises 8.1 Enter the matrix M by > M = [1,3,-1,6;2,4,0,-1;0,-2,3,-1;-1,2,-5,1] and also the matrix 1 3 3 2 1 6 . N= 1 4 1 2 1 2 Multiply M and N using M * N. Can the order of multiplication be switched? Why or why not? Try it to see how Matlab reacts. 8.2 By hand, calculate Av, AB, 2 4 A = 2 1 1 1 and BA for: 0 1 9 , B = 1 1 0 1 1 0 2 , 2 0 3 v = 1 . 1 Check the results using Matlab. Think about how fast computers are. Turn in your hand work. 8.3 (a) Write a Matlab function program myinvcheck that makes a n n random matrix (normally distributed, A = randn(n,n)), calculates its inverse (B = inv(A)), multiplies the two back together, calculates the error equal to the norm of the difference of the result from the n n identity matrix (eye(n)), and returns error. (b) Write a Matlab script program that calls myinvcheck for n = 10, 20, 40, . . . , 2i 10, records the results of each trial, and plots the error versus n using a log plot. (See help loglog.) What happens to error as n gets big? Turn in a printout of the programs, the plot, and a very brief report on the results of your experiments. Section 4.2 ## Control Structures in Matlab 299 ## 4.2 Control Structures in Matlab In this section we will discuss the control structures offered by the Matlab programming language that allow us to add more levels of complexity to the simple programs we have written thus far. Without further ado and fanfare, lets begin. If If evaluates a logical expression and executes a block of statements based on whether the logical expressions evaluates to true (logical 1) or false (logical 0). The basic structure is as follows. if logical_expression statements end If the logical_expression evaluates as true (logical 1), then the block of statements that follow if logical_expression are executed, otherwise the statements are skipped and program control is transferred to the first statement that follows end. Lets look at an example of this control structures use. Matlabs rem(a,b) returns the remainder when a is divided by b. Thus, if a is an even integer, then rem(a,2) will equal zero. What follows is a short program to test if an integer is even. Open the Matlab editor, enter the following script, then save the file as evenodd.m. n = input(Enter an integer: ); if (rem(n,2)==0) fprintf(The integer %d is even.\n, n) end Return to the command window and run the script by entering evenodd at the Matlab prompt. Matlab responds by asking you to enter an integer. As a response, enter the integer 12 and press Enter. ## Copyrighted material. See: http://msenux.redwoods.edu/Math4Textbook/ 300 Chapter 4 Programming in Matlab >> evenodd Enter an integer: 12 The integer 12 is even. Run the program again. When prompted, enter the nonnegative integer 17 and press Enter. >> evenodd Enter an integer: 17 There is no response in this case, because our program provides no alternative if the input is odd. Besides the new conditional control structure, we have two new commands that warrant attention. 1. Matlabs input command, when used in the form n = input(Enter an integer: ), will display the string as a prompt and wait for the user to enter a number and hit Enter on the keyboard, whereupon it stores the number input by the user in the variable n. 2. The command fprintf is used to print formatted data to a file. The default syntax is fprintf(FID,FORMAT,A,...). i. The argument FID is a file identifier. If no file identifier is present, then fprintf prints to the screen. ii. The argument FORMAT is a format string, which may contain contain conversion specifications from the C programming language. In the evenodd.m script, %d is conversion specification which will format the first argument following the format string as a signed decimal number. Note the \n at the end of the format string. This is a newline character, which creates a new line after printing the format string to the screen. iii. The format string can be followed by zero or more arguments, which will be substituted in sequence for the C-language conversion specifications in the format string. Else We can provide an alternative if the logical_expression evaluates as false. The basic structure is as follows. Section 4.2 ## Control Structures in Matlab 301 if logical_expression statements else statments end In this form, if the logical_expression evaluates as true (logical 1), the block of statements between if and else are executed. If the logical_expression evaluates as false (logical 0), then the block of statements between else and end are executed. We can provide an alternative to our evenodd.m script. Add the following lines to the script and resave as evenodd.m. n = input(Enter an integer: ); if (rem(n,2)==0) fprintf(The integer %d is even.\n, n) else fprintf(The integer %d is odd.\n, n) end Run the program, enter 17, and note that we now have a different response. >> evenodd Enter an integer: 17 The integer 17 is odd. Because the logical expression rem(n,2)==0 evaluates as false when n = 17, the fprintf command that lies between else and end is executed, informing us that the input is an odd integer. Elseif Sometimes we need to add more than one alternative. For this, we have the following structure. 302 Chapter 4 Programming in Matlab if logical_expression1 statements elseif logical_expression2 statements else statements end Program flow is as follows. 1. If logical_expression1 evaluates as true, then the block of statements between if and elseif are executed. 2. If logical_expression1 evaluates as false, then program control is passed to elseif. At that point, if logical_expression2 evaluates as true, then the block of statements between elseif and else are executed. If the logical expression logical_expression2 evaluates as false, then the block of statments between else and end are executed. You can have more than one elseif in this control structure, depending on need. As an example of use, consider a program that asks a user to make a choice from a menu, then reacts accordingly. First, ask for input, then set up a menu of choices with the following commands. a=input(Enter a number a: ); b=input(Enter a number b: ); fprintf(\n) fprintf(1) Add a and b.\n) fprintf(2) Subtract b from a.\n) fprintf(3) Multiply a and b.\n) fprintf(4) Divide a by b.\n) fprintf(\n) n=input(Enter your choice: ); Now, execute the appropriate choice based on the users menu selection. Section 4.2 if n==1 fprintf(The elseif n==2 fprintf(The elseif n==3 fprintf(The elseif n==4 fprintf(The else fprintf(Not end ## Control Structures in Matlab 303 ## sum of %0.2f and %0.2f is %0.2f.\n,a,b,a+b) difference of %0.2f and %.2f is %.2f.\n,a,b,a-b) product of %.2f and %.2f is %.2f.\n,a,b,a*b) quotient of %.2f and %.2f is %.2f.\n,a,b,a/b) a valid choice.\n) Save this script as mymenu.m, then execute the command mymenu at the command window prompt. You will be prompted to enter two numbers a and b. As shown, we entered 15.637 for a and 28.4 for b. The program presents a menu of choices and prompts us for a choice. Enter a number a: 15.637 Enter a number b: 28.4 1). 2). 3). 4). Add a and b. Subtract b from a. Multiply a and b. Divide a by b. ## Enter your choice: We enter 1 as our choice and the program responds by adding the values of a and b. Enter your choice: 1 The sum of 15.64 and 28.40 is 44.04. Some comments are in order. 1. Note that after several elseif statments, we still list an else command to catch invalid entries for n. The choice for n should match a menu designation, either 1, 2, 3, or 4, but any other choice for n causes the statement following else to 304 Chapter 4 Programming in Matlab ## be executed. This is a good programming practice, having a sort of otherwise block as a catchall for unexpected responses by the user. 2. We use the conversion specification %0.2f in this example, which outputs a number in fixed point format with two decimal places. Note that this results in rounding in the output of numbers with more than two decimal places. ## Switch, Case, and Otherwise As the number of elseif entries in a conditional control structure increases, the code becomes harder to understand. Matlab provides a much simpler control construct designed for this situation. switch expression case value1 statements case value2 statements ... otherwise statements end The expression can be a scalar or string. Switch works by comparing the input expression to each case value. It executes the statements following the first match it finds, then transfers control to the line following the end. If switch finds no match, it executes the statment block following the otherwise statement. Lets create a second script having the same menu building code. a=input(Enter a number a: ); b=input(Enter a number b: ); fprintf(\n) fprintf(1) Add a and b.\n) fprintf(2) Subtract b from a.\n) fprintf(3) Multiply a and b.\n) fprintf(4) Divide a by b.\n) fprintf(\n) n=input(Enter your choice: ); fprintf(\n) Section 4.2 ## Control Structures in Matlab 305 ## However, instead of using an if ...elseif...else...end control structure to select a block of statments to execute based on the users choice of menu item, we will implement a switch structure instead. switch n case 1 fprintf(The case 2 fprintf(The case 3 fprintf(The case 4 fprintf(The otherwise fprintf(Not end ## sum of %.2f and %.2f is %.2f.\n,a,b,a+b) difference of %.2f and %.2f is %.2f.\n,a,b,a-b) product of %.2f and %.2f is %.2f.\n,a,b,a*b) quotient of %.2f and %.2f is %.2f.\n,a,b,a/b) a valid choice.\n) Save the script as mymenuswitch.m, change to the command window, and enter and execute the command mymenuswitch at the command prompt. Note that the behavior of the program mymenuswitch is identical to that of mymenu. Enter a number a: 15.637 Enter a number b: 28.4 1) 2) 3) 4) Add a and b. Subtract b from a. Multiply a and b. Divide a by b. ## Enter your choice: 1 The sum of 15.64 and 28.40 is 44.04. If the user enters an invalid choice, note how control is passed to the statment following otherwise. 306 Chapter 4 Programming in Matlab ## Enter a number a: 15.637 Enter a number b: 28.4 1) 2) 3) 4) Add a and b. Subtract b from a. Multiply a and b. Divide a by b. ## Enter your choice: 5 Not a valid choice. Loops A for loop is a control structure that is designed to execute a block of statements a predetermined number of times. Here is its general use syntax. for index=start:increment:finish statements end As an example, we display the squares of every other integer from 5 to 13, inclusive. for k=5:2:13 fprintf(The square of %d is %d.\n, k, k^2) end The output of this simple loop follows. The The The The The square square square square square of of of of of 5 is 25. 7 is 49. 9 is 81. 11 is 121. 13 is 169. Another looping construct is Matlabs while structure whos basic syntax follows. Section 4.2 ## Control Structures in Matlab 307 while expression statements end The while loop executes its block of statements as long as the logical controlling expresssion evaluates as true (logical 1). For example, we can duplicate the output of the previous for loop with the following while loop. k=5; while k<=13 fprintf(The square of %d is %d.\n, k, k^2) k=k+2; end When using while, it is not difficult to fall into a trap of programming a loop that iterates indefinitely. In the example above, the loop will iterate as long a k<=13, so it is imperative that we increment the value of k in the interior of the loop, as we do with the command k=k+2. This command adds 2 to the current value of k, then replaces k with this incremented value. Thus, the first time through the loop, k = 5, the second time through the loop, k = 7, etc. When k is no longer less than or equal to 13, the loop terminates. The simple use of for and while loops to generate the squares of every other integer from 5 to 13 can be accomplished just as easily with array operations. k=5:2:13; A=[k; k.^2]; fprintf(The square of %d is %d.\n, A) The use of fprintf in this example warrants some explanation. First, the command A=[k; k.^2] generates a matrix with two rows, the first row holding the values of k, the second the squares of k. A = 5 25 7 49 9 81 11 121 13 169 308 Chapter 4 Programming in Matlab Again, with roots deeply planted in Fortran, Matlab enters the entries of matrix A in columnwise fashion. The following command will print all entries in the matrix A. >> A(:) ans = 5 25 7 49 9 81 11 121 13 169 It is important to note the order in which the entries were extracted from the matrix A, that is, the entries from the first column, followed by the entries from the second column, etc, until all of the entries of matrix A are on display. This is precisely the order in which the fprintf command picks off the entries of matrix A in our example above. The first time through the loop, fprintf replaces the two occurrences of %d in its format string with 5 and 25. The second time through the loop, the two occurrences of %d are replaced with 7 and 49, and so on, until the loop terminates. For additional help on fprintf, type doc fprintf at the Matlab command prompt. Although a good introductory example of using loops, producing the squares of integers is not a very interesting task. Lets look at a more interesting example of the use of for and while loops. I Example 1. For a number of years, a number of leading mathematicicans believed that the infinite series X 1 1 1 1 = 1 + 2 + 2 + 2 + 2 k 2 3 4 (4.1) k=1 converged to a finite sum, but none of them could determine what that sum was. Until 1795, that is, when Leonhard Euler produced the suprising result that the sum of the series equaled 2 /6. Use a for loop to sum the first twenty terms of the series and compare this partial sum with 2 /6 (compute the relative Section 4.2 ## Control Structures in Matlab 309 error). Secondly, write a while loop that will determine how many terms must be summed to produce an approximation of pi2 /6 that is correct to 4 significant digits. We wish to sum the first 20 terms, so we begin by setting n = 20. We will store the running sum in the variable s, which we initialize to zero. A for loop is used to track the running sum. On each pass of the for loop, the sum is updated with the command s=s+1/k^2, which takes the previous value of the running sum stored in s, adds the value of the next term in the series, and stores the result back in s. n=20; s=0; for k=1:n s=s+1/k^2; end The true sum of the series in 2 /6. Hence, we compute the relative error with the following statement. rel=abs(s-pi^2/6)/abs(pi^2/6); We can then use fprintf to report the results. fprintf(The sum of the first %d terms is %f.\n, n, s) fprintf(The relative error is %e.\n, rel) You should obtain the following results. The sum of the first 20 terms is 1.596163. The relative error is 2.964911e-02. Note that the relative error is less than 5 102 , so our approximation of 2 /6 (using the first 20 terms of the series) contains only 2 significant digits. The next question asks how many terms must be summed to produce and approximation of 2 /6 correct to 4 significant digits. Thus, the relative error in the approximation must be less than 5 104 . 310 Chapter 4 Programming in Matlab We start by setting tol=5e-4. The intent is to loop until the relative error is less than this tolerance, which signifies that we have 4 signicant digits in our approximation of 2 /6 using the infinite series. The variable s will again contain the running sum and we again initialize the running sum to zero. The variable k contains the term number, so we initialize k to 1. We next initialize rel_error to 1 so that we enter the while loop at least once. tol=5e-4; s=0; k=1; rel_error=1; Each pass through the loop, we do three things: 1. We update the running sum by adding the next term in the series to the previous running sum. This is accomplished with the statement s=s+1/k^2. 2. We increment the term number by 1, using the statement k=k+1 for this purpose. 3. We calculate the current relative error. We continue to loop while the relative error is greater than or equal to the tolerance. while rel_error>=tol s=s+1/k^2; k=k+1; rel_error=abs(s-pi^2/6)/abs(pi^2/6); end When the loop is completed, we use fprintf to output results. fprintf(The actual value of pi^2/6 is %f.\n, pi^2/6) fprintf(The sum of the first %d terms is %f.\n, k, s) fprintf(The relative error is %f.\n, rel_error) The results follow. Section 4.2 ## Control Structures in Matlab 311 ## The actual value of pi^2/6 is 1.644934. The sum of the first 1217 terms is 1.644112. The relative error is 0.000500. Note that we have agreement in approximately 4 significant digits. This example demonstrates when we should use a for loop and when a while loop is more appropriate. We offer the following advice. What type of loop should I use? When you know in advance the precise number of times the loop should interate, use a for loop. On the other hand, if you have no predetermined knowledge of how many times you will need the loop to execute, use a while loop. For k = A Matlab also supports a type of for loop whose block of statements are executed for each column in a matrix A. The syntax is as follows. for k = A statements end In this construct, the columns of matrix A are stored one at a time in the variable k while the following statments, up to the end, are executed. For example, enter the matrix A. >> A=[1 2;3 4] A = 1 2 3 4 Now, enter and exercute the following loop. 312 Chapter 4 Programming in Matlab ## >> for k=A, k, end k = 1 3 k = 2 4 Note that k is assigned a column of matrix A at each iteration of the loop. This column assignment does not contradict the use we have already investigated, i.e., for k=start:increment:finish. In this case, the Matlab construct start:increment:finish creates a row vector and k is assigned a new column of the row vector at each iteration. Of course, this means that k is assigned a new element of the row vector at each iteration. As an example of use, lets plot the family of functions defined by the equation 2 y = 2Ctet , (4.2) ## where C is one of a number of specific constants. The following code produces the family of curves shown in Figure 4.1. At each iteration of the loop, C is assigned the next value in the row vector [-5,-3,-1,0,1,3,5]. t=linspace(-4,4,200); for C=[-5,-3,-1,0,1,3,5] y=-2*C*t.*exp(-t.^2); line(t,y) end ## Break and Continue There are two sitautions that frequently occur when writing loops. 1. If a certain state is achieved, the programmer wishes to terminate the loop and pass control to the code that follows the end of the loop. 2. The programmer doesnt want to exit the loop, but does want to pass control to the next iteration of the loop, thereby skipping any remaining code that remains in the loop. As an example of the first situation, consider the following snippet of code. Matlabs break command will cause the loop to terminate if a prime is found in the loop index k. Section 4.2 ## Control Structures in Matlab 313 ## Figure 4.1. A family of curves for C = 5, 3, 1, 0, 1, 3, and 5. for k=[4,8,12,15,17,19,21,24] if isprime(k) fprintf(Prime found, %d, exiting loop.\n,k) break end fprintf(Current value of index k is: %d\n,k) end Note that the output lists each value of k until a prime is found. The fprinft command issues a warning message and the break command terminates the loop. Current value of Current value of Current value of Current value of Prime found, 17, index k index k index k index k exiting is: 4 is: 8 is: 12 is: 15 loop. In this next code snippet, we want to print the multiples of three that are less than or equal to 20. At each iteration of the while loop, Matlab checks to see if k is divisible by 3. If not, it increments the counter k, then the continue statment 314 Chapter 4 Programming in Matlab that follows skips the remaining statements in the loop, and returns control to the next iteration of the loop. N=20; k=1; while k<=N if mod(k,3)~=0 k=k+1; continue end fprintf(Multiple of 3: %d\n,k) k=k+1; end If a multiple of 3 is found, fprintf is used to output the result in a nicely formatted statement. Multiple Multiple Multiple Multiple Multiple Multiple of of of of of of 3: 3: 3: 3: 3: 3: 3 6 9 12 15 18 ## Any and All The command any returns true if any element of a vector is a nonzero number or is logical 1 (true). >> v=[0 0 1 0 1]; any(v) ans = 1 On the other hand, any returns false if non of the elements of a vector are nonzero numbers or logical 1s. Section 4.2 ## Control Structures in Matlab 315 ## >> v=[0 0 0 0]; any(v) ans = 0 The command all returns true if all of the elements of a vector are nonzero numbers or are logical 1s (true). >> v=[1 1 1 1]; all(v) ans = 1 On the other hand, all returns false if the vector contains any zeros or logical 0s. >> v=[1 0 1 1 1]; all(v) ans = 0 The any and all command can be quite useful in programs. Suppose for example, that you want to pick out all the numbers from 1 to 2000 that are multiples of 8, 12, and 15. Heres a code snippet that will perform this task. Each time through the loop, we use the mod function to determine the remainder when the current value of k is divided by the numbers in divisors. If all remainders are equal to zero, then k is divisible by each of the numbers in divisors and we append k to a list of multiples of 8, 12, and 15. N=2000; divisors=[8,12,15]; multiples=[]; for k=1:N if all(mod(k,divisors)==0) multiples=[multiples,k]; end end fmt=%5d %5d %5d %5d %5d\n; fprintf(fmt,multiples) 316 Chapter 4 Programming in Matlab ## The resulting output follows. 120 720 1320 1920 240 840 1440 360 960 1560 480 1080 1680 600 1200 1800 Vectorizing the Code. In this case we can avoid loops altogether by using logical indexing. k=1:2000; b=(mod(k,8)==0) & (mod(k,12)==0) & (mod(k,15)==0); fmt=%5d %5d %5d %5d %5d\n; fprintf(fmt,k(b)) The reader should check that this code snippet produces the same result, i.e., all multiples of 8, 12, and 15 from 1 to 2000. Nested Loops Its possible to nest one loop inside another. You can nest a second for loop inside a primary for loop, or you can nest a for loop inside a while loop. Other combinations are also possible. Lets look at some situations where this is useful. I Example 2. A Hilbert Matrix H is an n n matrix defined by H(i, j) = 1 , i+j1 (4.3) ## where 1 i, j n. Set up nested for loops to construct a Hilbert Matrix of order n = 5. We could create this Hilbert Matrix using hand calculations. For example, the entry in row 3 column 2 is H(3, 2) = 1 1 = . 3+21 4 These calculations are not difficult, but for large order matrices, they would be tedious. Lets let Matlab do the work for us. Section 4.2 ## Control Structures in Matlab 317 n=5; H=zeros(n); Now, doubly nested for loops achieve the desired result. for i=1:n for j=1:n H(i,j)=1/(i+j-1); end end Set rational display format, display the matrix H, then set the format back to default. format rat H format The result is shown in the following Hilbert Matrix of order 5. H = 1 1/2 1/3 1/4 1/5 1/2 1/3 1/4 1/5 1/6 1/3 1/4 1/5 1/6 1/7 1/4 1/5 1/6 1/7 1/8 1/5 1/6 1/7 1/8 1/9 Note that the entry is row 3 column 2 is 1/4, as predicted by our hand calculations above. Similar hand calculations can be used to verify other entries in this result. How does it work? Because n = 5, the primary loop becomes for i=1:5. Similarly, the inner loop becomes for j=1:5. Now, here is the way the nested structure proceeds. First, set i = 1, then execute the statement H(i,j)=1/(i+j1) for j = 1, 2, 3, 4, and 5. With this first pass, we set the entries H(1, 1), H(1, 2), H(1, 3), H(1, 4), and H(1, 5). The inner loop terminates and control returns to the primary loop, where the program next sets i = 2 and again executes the statement H(i,j)=1/(i+j-1) for j = 1, 2, 3, 4, and 5. With this second pass, we set the entries H(2, 1), H(2, 2), H(2, 3), H(2, 4), and H(2, 5). A third and fourth 318 Chapter 4 Programming in Matlab pass of the primary loop occur next, with i = 3 and 4, each time iterating the inner loop for j = 1, 2, 3, 4, and 5. Finally, on the last pass through the primary, the program sets i = 5, then executes the statement H(i,j)=1/(i+j-1) for j = 1, 2, 3, 4, and 5. On this last pass, the program sets the entries H(5, 1), H(5, 2), H(5, 3), H(5, 4), and H(5, 5) and terminates. ## Fourier Series An Application of a For Loop Matlabs mod(a,b) works equally well with real numbers, providing the remainder when a is divided by b. For example, if you divide 5.7 by 2, the quotient is 2 and the remainder is 1.7. The command mod(5.7,2) should return the remainder, namely 1.7. >> mod(5.7,2) ans = 1.7000 The following commands produce what engineers call a sawtooth curve, shown in Figure 4.2(a). t=linspace(0,6,500); y=mod(t,2); plot(t,y) In Figure 4.2(a), we plot the remainders when the time is divided by 2. Hence, we get the a periodic function of period 2, where the remainders grow from 0 to just less than 2, then repeat every 2 units. With the following adjustment, we produce what engineers call a square wave (shown in Figure 4.2(b)). Recall that y<1 returns true (logical 1) when y is less than 1, and false (logical 0) when y is not less than 1. y=(y<1); plot(t,y,*) This type of curve can emulate a switch that is on for one second (y-value 1), then off for the next second (y-value 0), and then periodically repeats this on-off cycle over its domain. Section 4.2 ## Control Structures in Matlab 319 (a) (b) ## Figure 4.2. Generating a square wave with mod. Lets make one last adjustment, increasing the amplitude by 2, then shifting the graph downward 1 unit. This produces the square wave shown in Figure 4.3. y=2*y-1; plot(t,y,*) ## Figure 4.3. A square wave with period 2. Note that this square waves alternates between the values 1 and 1. The curve equals 1 on the first half of its period, then 1 on its second half. This pattern then repeats with period 2 over the remainder of its domain. Fourier Series. Using advanced mathematics, it can be shown that the following Fourier Series converges to the square wave pictured in Figure 4.3 X n=0 4 sin(2n + 1)t (2n + 1) (4.4) 320 Chapter 4 Programming in Matlab It is certain remarkable, if not somewhat implausible, that one can sum a series of sinusoidal function and get the resulting square wave picture in Figure 4.3. We will generate and plot the first five terms of the series (4.4), saving each term in a row of matrix A for later use. First, we initialize the time and the number of terms of the series that we will use, then we allocate appropriate space for matrix A. Note how we wisely save the number of points and the number of terms in variables, so that if we later decide to change these values, we wont have to scan every line of our program making changes. N=500; numterms=5; t=linspace(0,6,N); A=zeros(numterms,N); Next, we use a for loop to compute each of the first 5 terms (numterms) of series (4.4), storing each term in a row of matrix A, then plotting the sinusoid in three space, where we use the angular frequency as the third dimension. for n=0:numterms-1 y=4/((2*n+1)*pi)*sin((2*n+1)*pi*t); A(n+1,:)=y; line((2*n+1)*pi*ones(size(t)),t,y) end We orient the 3D-view, add a box for depth, turn on the grid, and annotate each axis to produce the image shown in Figure 4.4(a). view(20,20) box on grid on xlabel(Angular Frequency) ylabel(t-axis) zlabel(y-axis) As one would expect, after examining the terms of series (4.4), each each consecutive term of the series is a sinusoid with increasing angular frequency and decreasing amplitude. This is evident with the sequence of terms shown in Figure 4.4(a). Section 4.2 ## Control Structures in Matlab 321 ## However, a truly remarkable result occurs when we add the sinusoids in Figure 4.4(a). This is easy to do because we saved each individual term in a row of matrix A. Matlabs sum command will sum the columns of matrix A, effectively summing the first five terms of series (4.4) at each instant of time t. figure plot(t,sum(A)) We tighten the axes, add a grid, then annotate each axis. This produces the image in Figure 4.4(b). Note the striking similarity to the square wave in Figure 4.3. axis tight grid on xlabel(t-axis) ylabel(y-axis) (a) (b) ## Figure 4.4. Approximating a square wave with a Fourier series (5 terms). Because we smartly stored the number of terms in the variable numterms, to see the effect of adding the first 10 terms of the fourier series (4.4) is a simple matter of changing numterms=5 to numterms=10 and running the program again. The ouptut is shown in Figures 4.5(a) and (b). In Figure 4.5(b), note the even closer resemblance to the square wave in Figure 4.3, except at the ends, where the ringing exhibited there is known as Gibbs Phenomenon. If you one 322 Chapter 4 Programming in Matlab day get a chance to take a good upper-division differential equations course, you will study Fourier series in more depth. (a) (b) ## Figure 4.5. Adding additional terms of the Fourier series (4.4). Section 4.2 ## Control Structures in Matlab 323 4.2 Exercises 1. Write a for loop that will output the cubes of the first 10 positive integers. Use fprintf to output the results, which should include the integer and its cube. Write a second program that uses a while loop to produce an indentical result. 2. Without appealing to the Matlab command factorial, write a for loop to output the factorial of the numbers 10 through 15. Use fprintf to format the output. Write a second program that uses a while loop to produce an indentical result. Hint: Consider the prod command. 3. Write a single program that will count the number of divisors of each of the following integers: 20, 36, 84, and 96. Use fprintf to output each result in a form similar to The number of divisors of 12 is 6. 4. Set A=magic(5). Write a program that uses nested for loops to find the sum of the squares of all entries of matrix A. Use fprintf to format the output. Write a second program that uses array operations and Matlabs sum function to obtain the same result. ## hard Euler shows that 4 1 1 1 = 1 + 4 + 4 + 4 + . 90 2 3 4 Write a program that uses a for loop to sum the first 20 terms of this series. Compute the relative error when this sum is used as an approximation of 4 /90. Write a second program that uses a while loop to determine the number of terms required so that the sum approximates 4 /90 to four significant digits. In both programs, use fprintf to format your output. 7. Some attribute the following series to Leibniz. 1 1 1 1 = 1 + + 4 3 5 7 9 Write a program that uses a for loop to sum the first 20 terms of this series. Compute the relative error when this sum is used as an approximation of /4. Write a second program that uses a while loop to determine the number of terms required so that the sum approximates /4 to four significant digits. In both programs, use fprintf to format your output. ## 8. Goldbachs Conjecture is one of the most famous unproved conjectures 5. Write a program that uses nested in all of mathematics, which is remarkfor loops to produce Pythagorean Triples, able in light of its simplistic statement: positive integers a, b and c that satisfy All even integers greater than a2 +b2 = c2 . Find all such triples such 2 can be expressed as a sum that 1 a, b, c 20 and use fprintf of two prime integers. Write to produce nicely formatted results. a program that expresses 1202 as the sum of two primes in ten 6. Another result proved by Leon- 324 Chapter 4 Programming in Matlab ## different ways. Use fprintf to format your output. 9. Here is a simple idea for generating a list of prime integers. Create a vector primes with a single entry, the prime integer 2. Write a program to test each integer less than 100 to see if it is a prime using the following procedure. i. If an integer is divisible by any of the integers in primes, skip it and go to the next integer. ii. If an integer is not divisible by any of the integers in primes, append the integer to the vector primes and go to the next integer. Use fprintf to output the vector primes in five columns, right justified. 10. There is a famous stoy about Sir Thomas Hardy and Srinivasa Ramanujan, which Hardy relates in his famous work A Mathematicians Apology, a copy of which resides in the CR library along with the work The Man Who Knew Infinity: A Life of the Genius Ramanujan. I remember once going to see him when he was ill at Putney. I had ridden in taxi cab number 1729 and remarked that the number seemed to me rather a dull one, and that I hoped it was not an unfavorable omen. No, he replied, it is a very interesting number; it is the smallest number expressible as the sum of two cubes in two different ways. Write a program with nested loops to ## find integers a and b (in two ways) so that a3 + b3 = 1729. Use fprintf to format the output of your program. 11. Write a program to perform each of the following tasks. i. Use Matlab to draw a circle of radius 1 centered at the origin and inscribed in a square having vertices (1, 0), (1, 1), (1, 1), and (1, 1). The ratio of the area of the circle to the area of the square is : 4 or /4. Hence, if we were to throw darts at the square in a rondom fashion, the ratio of darts inside the circle to the number of darts thrown should be approximately equal to /4. ii. Write a for loop that will plot 1000 randomly generated points inside the square. Use Matlabs rand command for this task. Each time random point lands within the unit circle, increment a counter hits. When the for loop terminates, use fprintf to output the ratio darts that land inside the circle to the number of darts thrown. Calculate the relative error in approximating pi/4 with this ratio. 12. Write a program to perform each of the following tasks. i. Prompt the user to enter a 3 4 matrix. Store the result in the matrix A. ii. Use if..elseif..else to provide a menu with three choices: (1) Switch two rows of the matrix; (2) Multiply a row of the matrix by a scalar; and (3) Subtract a scalar multiple of a Section 4.2 ## Control Structures in Matlab 325 ## row from another row of the matrix. iii. Perform the task in the requested menu item, then return the resulting matrix to the user. a. In the case of (1), your program should prompt the user for the rows to switch. b. In the case of (2), your program should prompt the user for a scalar and a row that will be multiplied by the scalar. c. In the case of (3), your program should prompt the user for a scalar and two row numbers, the first of which is to be multiplied by the scalar and subtracted from the second. 13. The solutions of the quadratic equation ax2 + bx + c = 0 are given by the quadratic formula b b2 4ac . x= 2a b2 4ac The discriminant D = is used to predict the number of roots. There are three cases. 1. If D > 0, there are two real solutions. 2. If D = 0, there is one real solution. 3. If D < 0, there are no real solutions. Write a program the performs each of the following tasks. i. The program prompts the user to input a, b, and c. ii. The programs computes the discriminant D which it uses with the conditional if..elseif..else to de- ## termine and output the number of solutions. iii. The program should also output the solutions, if any. Use well crafted format strings with fprintf to output all results. 14. Plot the modified sawtooth curve. N=500; t=linspace(0,6*pi,N); y=mod(t+pi,2*pi)-pi; line(t,y) It can be shown that the Fourier series X 2(1)n+1 sin nx n n=1 ## converges to the sawtooth curve. Perform each of the following tasks. i. Sketch each of the individual terms in 3-space, as shown in the narrative. ii. Sketch the sawtooth, hold the graph, then sum the first five terms of the series and superimpose the plot of the result. 15. Logical arrays are helpful when it comes to drawing piecewise functions. For example, consider the piecewise definition  0, if x < 0 f (x) = x, if 0 x . Enter the following to plot the piecewise function f . 326 Chapter 4 Programming in Matlab N=500; x=linspace(-pi,pi,N); y=0*(x<0)+(pi-x).*(x>=0); plot(x,y,.) A Fourier series representation of the function f is given by a0 X + [an cos nx + bn sin nx] , 2 n=1 ## where a0 = /2, and an = 1 cos n n2 and bn = 1 . n ## Hold the plot of f , then superimpose the Fourier series sum for n = 1 to n = 5. Hint: This is a nice place for anonymous functions, for example, set: b = @(n) 1/n; Section 4.2 ## Control Structures in Matlab 327 4.2 Answers 1. This for loop will print the cubes of the first 10 positive integers. for k=1:10 fprintf(The cube of %d is %d.\n,k,k^3) end This while loop will print the cubes of the first 10 positive integers. k=1; while k<=10 fprintf(The cube of %d is %d.\n,k,k^3) k=k+1; end 3. The following program will print the number of divisors of 20, 36, 84, and 96. for k=[20, 36, 84, 96] count=0; divisor=1; while (divisor<=k) if mod(k,divisor)==0 count=count+1; end divisor=divisor+1; end fprintf(The number of divisors of %d is %d.\n,k,count) end ## 5. This program will print Pythagorean Triples so that 1 a, b, c 20. 328 Chapter 4 Programming in Matlab N=20; for a=1:N for b=1:N for c=1:N if (c^2==a^2+b^2) fprintf(Pythagorean Triple: %d, %d, %d\n, a, b, c) end end end end ## 7. The following loop will sum the first 20 terms. N=20; s=0; for k=1:N; s=s+(-1)^(k+1)/(2*k-1); end Compute the relative error in approximating /4 with the sum of the first 20 terms. rel=abs(s-pi/4)/abs(pi/4); Output the results. fprintf(The actual value of pi/4 is %.6f.\n,pi/4) fprintf(The sum of the first %d terms is %f.\n,N,s) fprintf(The relative error is %.2e.\n,rel) ## 9. The mod(m,primes)==0 comparison will produce a logical vector which contains a 1 in each position that indicates that the current number m is divisible by the number in the prime vector in the corresponding position. If any of these are 1s, then the number m is divisible by at least one of the primes in the current list primes. In that case, we continue to the next value of m. Otherwise, we append m to the list of primes. ## Control Structures in Matlab 329 Section 4.2 primes=2; for m=3:100 if any(mod(m,primes)==0) continue else primes=[primes,m]; end end We create a format of 5 fields, each of width 5, then fprintf the vector primes using this format. fmt=%5d %5d %5d %5d %5d\n; The resulting list of primes is nicely formatted in 5 columns. 2 13 31 53 73 3 17 37 59 79 5 19 41 61 83 7 23 43 67 89 11 29 47 71 97 11. We open a figure window and set a lighter background color of gray. fig=figure; set(fig,Color,[0.95,0.95,0.95]) We draw a circle of radius one, increase its line width, change its color to dark red, set the axis equal, then turn the axes off. t=linspace(0,2*pi,200); x=cos(t); y=sin(t); line(x,y,LineWidth,2,Color,[0.625,0,0]) axis equal axis off 330 Chapter 4 Programming in Matlab ## We draw the square in counterclockwise order of vertices, starting at the point (1, 1). In that order, we set the x- and y-values of the vertices in the vectors x and y. We then draw the square with a thickened line width and color it dark blue. x=[1,-1,-1,1,1]; y=[1,1,-1,-1,1]; line(x,y,LineWidth,2,Color,[0,0,0.625]) Next, we set the number of darts thrown at 1000. Then we plot one dart at the origin, change its linestyle to none (points will not be connected with line segements), choose a marker style, and color it dark blue. N=1000; dart=line(0,0,... LineStyle,none,... Marker,.,... Color,[0,0,0.625]); In the loop that follows, we use the handle to this initial dart to add x- and y-data (additional darts). First, we set the number of hits (darts that land inside the circle) to zero (we dont count the initial dart at (0, 0)). hits=0; for k=1:N x=2*rand-1; y=2*rand-1; set(dart,... XData,[get(dart,XData),x],... YData,[get(dart,YData),y]) %drawnow if (x^2+y^2<1) hits=hits+1; end end In the body of the loop, we choose uniform random numbers between 1 and 1 for both x and y. We then add these new x- and y-values to the XData and YData of the existing dart by using its handle dart. We do this for x by Section 4.2 ## Control Structures in Matlab 331 first getting the XData for the dart with get(dart,XData), then we append the current value of x with [get(dart,XData),x]. The set command is used to set the updated list of x-values. A similar task updates the darts y-values. At each iteration of the loop, we also increment the hits counter if the dart lies within the border of the circle, that is, if x2 + y 2 < 1. You can uncomment the drawnow command to animate the dart throwing at the expense of waiting for 1000 darts to be thrown to the screen. Finally, we output the requested data to the command window. fprintf(%d of %d darts fell inside the circle.\n,hits,N) fprintf(The ratio of hits to throws is %f\n\n,hits/N) fprintf(The ratio of the area of the circle to the area\n) fprintf(the area of the square is pi/4, or approximately %f.\n\n,pi/4) rel_err=abs(hits/N-pi/4)/abs(pi/4); fprintf(The relative error in approximating pi/4 wiht the ratio\n) fprintf(of hits to total darts thrown is %.2e.\n,rel_err) ## 15. We begin by using logical relations to craft the piecewise function. N=500; x=linspace(-pi,pi,N); y=0*(x<0)+(pi-x).*(x>=0); line(x,y,... LineStyle,none,... Marker,.) Next, we compose two anonymous functions for the coefficients an and bn . a=@(n) (1-cos(n*pi))/(n^2*pi); b=@(n) 1/n; Note that a0 = /2, so a0 /2 = /4, which must be added to the sum of the first five terms of the series. Thus, well start our running sum at s = /4. The for loop then sums the first five terms. We use another line command to plot the result. 332 Chapter 4 Programming in Matlab s=pi/4; N=5; for k=1:N s=s+(a(k)*cos(k*x)+b(k)*sin(k*x)); end line(x,s,LineWidth,2,Color,[0.625,0,0]) Chapter 6 ## Eigenvalues and Eigenvectors 6.1 Introduction to Eigenvalues Linear equations Ax D b come from steady state problems. Eigenvalues have their greatest importance in dynamic problems. The solution of d u=dt D Au is changing with time growing or decaying or oscillating. We cant find it by elimination. This chapter enters a new part of linear algebra, based on Ax D x. All matrices in this chapter are square. A good model comes from the powers A; A 2 ; A3 ; : : : of a matrix. Suppose you need the hundredth power A100 . The starting matrix A becomes unrecognizable after a few steps, and A100 is very close to :6 :6I :4 :4  :  :8 :3 :2 :7 A   :70 :45 :30 :55 A2   :650 :525 :350 :475 A3   :6000 :6000 :4000 :4000 A100 A100 was found by using the eigenvalues of A, not by multiplying 100 matrices. Those eigenvalues (here they are 1 and 1=2) are a new way to see into the heart of a matrix. To explain eigenvalues, we first explain eigenvectors. Almost all vectors change direction, when they are multiplied by A. Certain exceptional vectors x are in the same direction as Ax. Those are the eigenvectors. Multiply an eigenvector by A, and the vector Ax is a number  times the original x. The basic equation is Ax D x. The number  is an eigenvalue of A. The eigenvalue  tells whether the special vector x is stretched or shrunk or reversed or left unchangedwhen it is multiplied by A. We may find  D 2 or 12 or 1 or 1. The eigenvalue  could be zero! Then Ax D 0x means that this eigenvector x is in the nullspace. If A is the identity matrix, every vector has Ax D x. All vectors are eigenvectors of I . All eigenvalues lambda are  D 1. This is unusual to say the least. Most 2 by 2 matrices have two eigenvector directions and two eigenvalues. We will show that det.A  I / D 0. 283 284 ## Chapter 6. Eigenvalues and Eigenvectors This section will explain how to compute the xs and s. It can come early in the course because we only need the determinant of a 2 by 2 matrix. Let me use det.A  I / D 0 to find the eigenvalues for this first example, and then derive it properly in equation (3). The matrix A has two eigenvalues  D 1 and  D 1=2. Look at det.AI /:      3 1 1 :8 :3 :8   :3 2 AD det D    C D .  1/   : :2 :7 :2 :7   2 2 2 Example 1 I factored the quadratic into   1 times   12 , to see the two eigenvalues  D 1 and  D 12 . For those numbers, the matrix A  I becomes singular (zero determinant). The eigenvectors x 1 and x 2 are in the nullspaces of A  I and A  12 I . .A  I /x 1 D 0 is Ax 1 D x 1 and the first eigenvector is .:6; :4/. .A  12 I /x 2 D 0 is Ax 2 D 12 x 2 and the second eigenvector is .1; 1/:      :6 :8 :3 :6 x1 D and Ax 1 D D x 1 (Ax D x means that 1 D 1) :4 :2 :7 :4        1 :8 :3 1 :5 x2 D and Ax 2 D D (this is 12 x 2 so 2 D 12 ). 1 :2 :7 1 :5 If x1 is multiplied again by A, we still get x 1 . Every power of A will give An x 1 D x 1 . Multiplying x 2 by A gave 12 x 2 , and if we multiply again we get . 12 /2 times x 2 . When A is squared, the eigenvectors stay the same. The eigenvalues are squared. This pattern keeps going, because the eigenvectors stay in their own directions (Figure 6.1) and never get mixed. The eigenvectors of A 100 are the same x 1 and x 2 . The eigenvalues of A100 are 1100 D 1 and . 12 /100 D very small number.  D1 Ax1 D x 1 D :6 :4 2 D 1    D :5  2 D :25 :5 :5 Ax 2 D 2 x 2 D  1 x2 D 1 A2 x 1 D .1/2 x 1 A2 x 2 D .:5/2 x 2 D :25 :25 Ax D x A2 x D 2 x Figure 6.1: The eigenvectors keep their directions. A 2 has eigenvalues 12 and .:5/2 . Other vectors do change direction. But all other vectors are combinations of the two eigenvectors. The first column of A is the combination x 1 C .:2/x 2 :       :6 :2 :8 C : (1) Separate into eigenvectors D x 1 C .:2/x 2 D :4 :2 :2 ## 6.1. Introduction to Eigenvalues 285 Multiplying by A gives .:7; :3/, the first column of A 2 . Do it separately for x 1 and .:2/x 2 . Of course Ax 1 D x 1 . And A multiplies x 2 by its eigenvalue 12 :         1 :8 :7 :6 :1 Multiply each x i by i A D is x 1 C .:2/x 2 D C : :2 :3 :4 :1 2 Each eigenvector is multiplied by its eigenvalue, when we multiply by A. We didnt need these eigenvectors to find A 2 . But it is the good way to do 99 multiplications. At every step x 1 is unchanged and x 2 is multiplied by . 12 /, so we have . 12 /99 : 2 3     very 1 :8 :6 A99 is really x 1 C .:2/. /99 x 2 D C 4 small 5 : :2 :4 2 vector This is the first column of A100 . The number we originally wrote as :6000 was not exact. We left out .:2/. 21 /99 which wouldnt show up for 30 decimal places. The eigenvector x 1 is a steady state that doesnt change (because 1 D 1/. The eigenvector x 2 is a decaying mode that virtually disappears (because  2 D :5/. The higher the power of A, the closer its columns approach the steady state. We mention that this particular A is a Markov matrix. Its entries are positive and every column adds to 1. Those facts guarantee that the largest eigenvalue is  D 1 (as we found). Its eigenvector x 1 D .:6; :4/ is the steady statewhich all columns of A k will approach. Section 8.3 shows how Markov matrices appear in applications like Google. For projections we can spot the steady state . D 1/ and the nullspace . D 0/.   :5 :5 Example 2 The projection matrix P D has eigenvalues  D 1 and  D 0. :5 :5 Its eigenvectors are x 1 D .1; 1/ and x 2 D .1; 1/. For those vectors, P x 1 D x 1 (steady state) and P x 2 D 0 (nullspace). This example illustrates Markov matrices and singular matrices and (most important) symmetric matrices. All have special s and xs:   :5 :5 1. Each column of P D adds to 1, so  D 1 is an eigenvalue. :5 :5 2. P is singular, so  D 0 is an eigenvalue. 3. P is symmetric, so its eigenvectors .1; 1/ and .1; 1/ are perpendicular. The only eigenvalues of a projection matrix are 0 and 1. The eigenvectors for  D 0 (which means P x D 0x/ fill up the nullspace. The eigenvectors for  D 1 (which means P x D x/ fill up the column space. The nullspace is projected to zero. The column space projects onto itself. The projection keeps the column space and destroys the nullspace:         1 2 0 2 Project each part v D C projects onto P v D C : 1 2 0 2 Special properties of a matrix lead to special eigenvalues and eigenvectors. That is a major theme of this chapter (it is captured in a table at the very end). 286 ## Chapter 6. Eigenvalues and Eigenvectors Projections have  D 0 and 1. Permutations have all jj D 1. The next matrix R (a reflection and at the same time a permutation) is also special.   Example 3 The reflection matrix R D 01 10 has eigenvalues 1 and 1. The eigenvector .1; 1/ is unchanged by R. The second eigenvector is .1; 1/its signs are reversed by R. A matrix with no negative entries can still have a negative eigenvalue! The eigenvectors for R are the same as for P , because reflection D 2.projection/  I :       0 1 :5 :5 1 0 R D 2P  I D2  : (2) 1 0 :5 :5 0 1 Here is the point. If P x D x then 2P x D 2x. The eigenvalues are doubled when the matrix is doubled. Now subtract I x D x. The result is .2P  I /x D .2  1/x. When a matrix is shifted by I , each  is shifted by 1. No change in eigenvectors. Figure 6.2: Projections P have eigenvalues 1 and 0. Reflections R have  D 1 and 1. A typical x changes direction, but not the eigenvectors x 1 and x 2 . Key idea: The eigenvalues of R and P are related exactly as the matrices are related: The eigenvalues of R D 2P  I are 2.1/  1 D 1 and 2.0/  1 D 1. The eigenvalues of R 2 are 2 . In this case R2 D I . Check .1/2 D 1 and .1/2 D 1. ## The Equation for the Eigenvalues For projections and reflections we found s and xs by geometry: P x D x; P x D 0; Rx D x. Now we use determinants and linear algebra. This is the key calculation in the chapteralmost every application starts by solving Ax D x. First move x to the left side. Write the equation Ax D x as .A  I /x D 0. The matrix A  I times the eigenvector x is the zero vector. The eigenvectors make up the nullspace of A  I . When we know an eigenvalue , we find an eigenvector by solving .A  I /x D 0. Eigenvalues first. If .A  I /x D 0 has a nonzero solution, A  I is not invertible. The determinant of A  I must be zero. This is how to recognize an eigenvalue : 287 ## 6.1. Introduction to Eigenvalues ## Eigenvalues The number  is an eigenvalue of A if and only if A  I is singular: det.A  I / D 0: (3) ## This characteristic equation det.A  I / D 0 involves only , not x. When A is n by n, the equation has degree n. Then A has n eigenvalues and each  leads to x: For each  solve .A  I /x D 0 or Ax D x to find an eigenvector x:  Example 4 AD 1 2 2 4  is already singular (zero determinant). Find its s and xs. ## When A is singular,  D 0 is one of the eigenvalues. The equation Ax D 0x has solutions. They are the eigenvectors for  D 0. But det.A  I / D 0 is the way to find all s and xs. Always subtract I from A:   1 2 Subtract  from the diagonal to find A  I D : (4) 2 4 Take the determinant ad  bc of this 2 by 2 matrix. From 1   times 4  , the ad part is 2  5 C 4. The bc part, not containing , is 2 times 2.   1 2 (5) det D .1  /.4  /  .2/.2/ D 2  5: 2 4 Set this determinant  2  5 to zero. One solution is  D 0 (as expected, since A is singular). Factoring into  times   5, the other root is  D 5: det.A  I / D 2  5 D 0 ## yields the eigenvalues 1 D 0 and 2 D 5 : ## Now find the eigenvectors. Solve .A  I /x D 0 separately for  1 D 0 and 2 D 5:          y 0 y 2 D yields an eigenvector D for 1 D 0 z 0 z 1          4 2 y 0 y 1 .A  5I /x D D yields an eigenvector D for 2 D 5: 2 1 z 0 z 2 .A  0I /x D 1 2 2 4 The matrices A  0I and A  5I are singular (because 0 and 5 are eigenvalues). The eigenvectors .2; 1/ and .1; 2/ are in the nullspaces: .A  I /x D 0 is Ax D x. We need to emphasize: There is nothing exceptional about  D 0. Like every other number, zero might be an eigenvalue and it might not. If A is singular, it is. The eigenvectors fill the nullspace: Ax D 0x D 0. If A is invertible, zero is not an eigenvalue. We shift A by a multiple of I to make it singular. In the example, the shifted matrix A  5I is singular and 5 is the other eigenvalue. 288 ## Chapter 6. Eigenvalues and Eigenvectors Summary To solve the eigenvalue problem for an n by n matrix, follow these steps: ## 1. Compute the determinant of A  I . With  subtracted along the diagonal, this determinant starts with n or n . It is a polynomial in  of degree n. 2. Find the roots of this polynomial, by solving det.A  I / D 0. The n roots are the n eigenvalues of A. They make A  I singular. 3. For each eigenvalue , solve .A  I /x D 0 to find an eigenvector x. ## A note on the eigenvectors of 2 by 2 matrices. When A  I is singular, both rows are multiples of a vector .a; b/. The eigenvector is any multiple of .b; a/. The example had  D 0 and  D 5:  D 0 W rows of A  0I in the direction .1; 2/; eigenvector in the direction .2; 1/  D 5 W rows of A  5I in the direction .4; 2/; eigenvector in the direction .2; 4/: Previously we wrote that last eigenvector as .1; 2/. Both .1; 2/ and .2; 4/ are correct. There is a whole line of eigenvectorsany nonzero multiple of x is as good as x. MATLABs eig.A/ divides by the length, to make the eigenvector into a unit vector. We end with a warning. Some 2 by 2 matrices have only one line of eigenvectors. This can only happen when two eigenvalues are equal. (On the other hand A D I has equal eigenvalues and plenty of eigenvectors.) Similarly some n by n matrices dont have n independent eigenvectors. Without n eigenvectors, we dont have a basis. We cant write every v as a combination of eigenvectors. In the language of the next section, we cant diagonalize a matrix without n independent eigenvectors. ## Good News, Bad News Bad news first: If you add a row of A to another row, or exchange rows, the eigenvalues usually change. Elimination does not preserve the s. The triangular U has its eigenvalues sitting along the diagonalthey are the pivots. But they are not the eigenvalues of A! Eigenvalues are changed when row 1 is added to row 2:     1 3 1 3 U D has  D 0 and  D 1; A D has  D 0 and  D 7: 0 0 2 6 Good news second: The product  1 times 2 and the sum 1 C 2 can be found quickly from the matrix. For this A, the product is 0 times 7. That agrees with the determinant (which is 0/. The sum of eigenvalues is 0 C 7. That agrees with the sum down the main diagonal (the trace is 1 C 6/. These quick checks always work: The product of the n eigenvalues equals the determinant. The sum of the n eigenvalues equals the sum of the n diagonal entries. 289 ## 6.1. Introduction to Eigenvalues The sum of the entries on the main diagonal is called the trace of A: 1 C 2 C    C n D trace D a11 C a22 C    C ann : (6) Those checks are very useful. They are proved in Problems 1617 and again in the next section. They dont remove the pain of computing s. But when the computation is wrong, they generally tell us so. To compute the correct s, go back to det.A  I / D 0. The determinant test makes the product of the s equal to the product of the pivots (assuming no row exchanges). But the sum of the s is not the sum of the pivotsas the example showed. The individual s have almost nothing to do with the pivots. In this new part of linear algebra, the key equation is really nonlinear:  multiplies x. Why do the eigenvalues of a triangular matrix lie on its diagonal? Imaginary Eigenvalues One more bit of news (not too terrible). The eigenvalues might not be real numbers. Example 5 The 90 rotation QD 01 1 0 ## has no real eigenvectors. Its eigenvalues are  D i and  D i . Sum of s D trace D 0. Product D determinant D 1. ## After a rotation, no vector Qx stays in the same direction as x (except x D 0 which is useless). There cannot be an eigenvector, unless we go to imaginary numbers. Which we do. To see how i can help, look at Q 2 which is I . If Q is rotation through 90 , then Q2 is rotation through 180 . Its eigenvalues are 1 and 1. (Certainly I x D 1x.) Squaring Q will square each , so we must have  2 D 1. The eigenvalues of the 90 rotation matrix Q are Ci and i , because i 2 D 1. Those s come as usual from det.Q  I / D 0. This equation gives 2 C 1 D 0. Its roots are i and i . We meet the imaginary number i also in the eigenvectors: Complex eigenvectors 0 1 1 0     1 1 Di i i  and 0 1 1 0     i i D i : 1 1 Somehow these complex vectors x 1 D .1; i / and x 2 D .i; 1/ keep their direction as they are rotated. Dont ask me how. This example makes the all-important point that real matrices can easily have complex eigenvalues and eigenvectors. The particular eigenvalues i and i also illustrate two special properties of Q: 1. Q is an orthogonal matrix so the absolute value of each  is jj D 1. 2. Q is a skew-symmetric matrix so each  is pure imaginary. 290 ## Chapter 6. Eigenvalues and Eigenvectors ## A symmetric matrix .AT D A/ can be compared to a real number. A skew-symmetric matrix .AT D A/ can be compared to an imaginary number. An orthogonal matrix .AT A D I / can be compared to a complex number with jj D 1. For the eigenvalues those are more than analogiesthey are theorems to be proved in Section 6:4. The eigenvectors for all these special matrices are perpendicular. Somehow .i; 1/ and .1; i / are perpendicular (Chapter 10 explains the dot product of complex vectors). Eigshow in MATLAB There is a MATLAB demo (just type eigshow), displaying the eigenvalue problem for a 2 by 2 matrix. It starts with the unit vector x D .1; 0/. The mouse makes this vector move around the unit circle. At the same time the screen shows Ax, in color and also moving. Possibly Ax is ahead of x. Possibly Ax is behind x. Sometimes Ax is parallel to x. At that parallel moment, Ax D x (at x 1 and x 2 in the second figure). The eigenvalue  is the length of Ax, when the unit eigenvector x lines up. The built-in choices for A illustrate three possibilities: 0; 1, or 2 directions where Ax crosses x. 0. There are no real eigenvectors. Ax stays behind or ahead of x. This means the eigenvalues and eigenvectors are complex, as they are for the rotation Q. 1. There is only one line of eigenvectors (unusual). The moving directions Ax and x touch but dont cross over. This happens for the last 2 by 2 matrix below. 2. There are eigenvectors in two independent directions. This is typical! Ax crosses x at the first eigenvector x 1 , and it crosses back at the second eigenvector x 2 . Then Ax and x cross again at x 1 and x 2 . You can mentally follow x and Ax for these five matrices. Under the matrices I will count their real eigenvectors. Can you see where Ax lines up with x?           2 0 0 1 0 1 1 1 1 1 AD 0 1 1 0 1 0 1 1 0 1 2 2 0 1 1 291 ## 6.1. Introduction to Eigenvalues When A is singular (rank one), its column space is a line. The vector Ax goes up and down that line while x circles around. One eigenvector x is along the line. Another eigenvector appears when Ax 2 D 0. Zero is an eigenvalue of a singular matrix. ## REVIEW OF THE KEY IDEAS 1. Ax D x says that eigenvectors x keep the same direction when multiplied by A. 2. Ax D x also says that det.A  I / D 0. This determines n eigenvalues. 3. The eigenvalues of A2 and A1 are 2 and 1 , with the same eigenvectors. 4. The sum of the s equals the sum down the main diagonal of A (the trace). The product of the s equals the determinant. 5. Projections P , reflections R, 90 rotations Q have special eigenvalues 1; 0; 1; i; i . Singular matrices have  D 0. Triangular matrices have s on their diagonal. WORKED EXAMPLES 6.1 A ## Find the eigenvalues and eigenvectors of A and A 2 and A1 and A C 4I :     2 1 5 4 AD and A2 D : 1 2 4 5 ## Check the trace 1 C 2 and the determinant 1 2 for A and also A2 . Solution ## The eigenvalues of A come from det.A  I / D 0: 2   1 det.A  I / D D 2  4 C 3 D 0: 1 2   This factors into .1/.3/ D 0 so the eigenvalues of A are 1 D 1 and 2 D 3. For the trace, the sum 2C2 agrees with 1C3. The determinant 3 agrees with the product  1 2 D 3. The eigenvectors come separately by solving .A  I /x D 0 which is Ax D x:        1 1 x 0 1  D 1: .A  I /x D D gives the eigenvector x 1 D 1 1 y 0 1        1 1 x 0 1  D 3: .A  3I /x D D gives the eigenvector x 2 D 1 1 y 0 1 292 ## Chapter 6. Eigenvalues and Eigenvectors A2 and A1 and A C 4I keep the same eigenvectors as A. Their eigenvalues are  2 and 1 and  C 4: A2 has eigenvalues 12 D 1 and 32 D 9 A1 has 1 1 and 1 3 A C 4I has 1C4 D5 3C4 D7 ## The trace of A2 is 5 C 5 which agrees with 1 C 9. The determinant is 25  16 D 9. Notes for later sections: A has orthogonal eigenvectors (Section 6.4 on symmetric matrices). A can be diagonalized since 1 2 (Section 6.2). A is similar to any 2 by 2 matrix with eigenvalues 1 and 3 (Section 6.6). A is a positive definite matrix (Section 6.5) since A D AT and the s are positive. Find the eigenvalues and eigenvectors of this 3 by 3 matrix A: 2 3 Symmetric matrix 1 1 0 Singular matrix 2 1 5 A D 41 Trace 1 C 2 C 1 D 4 0 1 1 6.1 B Solution Since all rows of A add to zero, the vector x D .1; 1; 1/ gives Ax D 0. This is an eigenvector for the eigenvalue  D 0. To find  2 and 3 I will compute the 3 by 3 determinant: 1   1 0 D .1  /.2  /.1  /  2.1  / 2 1 D .1  /.2  /.1  /  2 det.A  I / D 1 0 1 1   D .1  /./.3  /: That factor  confirms that  D 0 is a root, and an eigenvalue of A. The other factors .1  / and .3  / give the other eigenvalues 1 and 3, adding to 4 (the trace). Each eigenvalue 0; 1; 3 corresponds to an eigenvector : 2 3 2 3 2 3 1 1 1 x 1 D 4 1 5 Ax 1 D 0x 1 x 2 D 4 0 5 Ax 2 D 1x 2 x 3 D 4 2 5 Ax 3 D 3x 3 : 1 1 1 I notice again that eigenvectors are perpendicular when A is symmetric. The 3 by 3 matrix produced a third-degree (cubic) polynomial for det.A  I / D 3 C 42  3. We were lucky to find simple roots  D 0; 1; 3. Normally we would use a command like eig.A/, and the computation will never even use determinants (Section 9.3 shows a better way for large matrices). The full command S; D D eig.A/ will produce unit eigenvectors in the columns of the eigenvector matrixp S . The first one happens to have three minus signs, reversed from .1; 1; 1/ and divided by 3. The eigenvalues of A will be on the diagonal of the eigenvalue matrix (typed as D but soon called ). 293 ## 6.1. Introduction to Eigenvalues ## Problem Set 6.1 1 The example at the start of the chapter has powers of this matrix A:       :8 :3 :70 :45 :6 :6 AD and A2 D and A1 D : :2 :7 :30 :55 :4 :4 Find the eigenvalues of these matrices. All powers have the same eigenvectors. (a) Show from A how a row exchange can produce different eigenvalues. (b) Why is a zero eigenvalue not changed by the steps of elimination? ## Find the eigenvalues and the eigenvectors of these two matrices:     1 4 2 4 AD and A C I D : 2 3 2 4 A C I has the by 1. ## Compute the eigenvalues and eigenvectors of A and A 1 . Check the trace !     0 2 1=2 1 1 AD and A D : 1 1 1=2 0 A1 has the has eigenvalues ## eigenvectors as A. Its eigenvalues are ## eigenvectors as A. When A has eigenvalues 1 and 2 , its inverse . ## Compute the eigenvalues and eigenvectors of A and A 2 :     1 3 7 3 2 AD and A D : 2 0 2 6 A2 has the same as A. When A has eigenvalues 1 and 2 , A2 has eigenvalues . In this example, why is 21 C 22 D 13? ## Find the eigenvalues of A and B (easy for triangular matrices) and A C B:       3 0 1 1 4 1 AD and B D and A C B D : 1 1 0 3 1 4 Eigenvalues of A C B (are equal to)(are not equal to) eigenvalues of A plus eigenvalues of B. ## Find the eigenvalues of A and B and AB and BA:         1 0 1 2 1 2 3 2 AD and B D and AB D and BA D : 1 1 0 1 1 3 1 1 (a) Are the eigenvalues of AB equal to eigenvalues of A times eigenvalues of B? (b) Are the eigenvalues of AB equal to the eigenvalues of BA? 294 7 ## Chapter 6. Eigenvalues and Eigenvectors ## Elimination produces A D LU . The eigenvalues of U are on its diagonal; they . The eigenvalues of L are on its diagonal; they are all . The are the eigenvalues of A are not the same as . (a) If you know that x is an eigenvector, the way to find  is to (b) If you know that  is an eigenvalue, the way to find x is to . . What do you do to the equation Ax D x, in order to prove (a), (b), and (c)? (a) 2 is an eigenvalue of A2 , as in Problem 4. (b) 1 is an eigenvalue of A1 , as in Problem 3. (c)  C 1 is an eigenvalue of A C I , as in Problem 2. 10 Find the eigenvalues and eigenvectors for both of these Markov matrices A and A 1 . Explain from those answers why A100 is close to A1 :     1=3 1=3 :6 :2 1 : AD and A D 2=3 2=3 :4 :8 11 ## Here is a strange fact about 2 by 2 matrices with eigenvalues  1 2 : The columns of A  1 I are multiples of the eigenvector x 2 . Any idea why this should be? 12 Find three eigenvectors for this matrix P (projection matrices have  D 1 and 0): 2 3 :2 :4 0 Projection matrix P D 4 :4 :8 0 5 : 0 0 1 13 If two eigenvectors share the same , so do all their linear combinations. Find an eigenvector of P with no zero components.  From the unit vector u D 16 ; 16 ; 36 ; 56 construct the rank one projection matrix P D uuT . This matrix has P 2 D P because uT u D 1. (a) P u D u comes from .uuT /u D u. ## /. Then u is an eigenvector with  D 1. ## (b) If v is perpendicular to u show that P v D 0. Then  D 0. (c) Find three independent eigenvectors of P all with eigenvalue  D 0. 14 ## Solve det.Q  I / D 0 by the quadratic formula to reach  D cos  i sin  :   cos   sin  QD rotates the xy plane by the angle  . No real s. sin  cos  Find the eigenvectors of Q by solving .Q  I /x D 0. Use i 2 D 1. 295 ## 6.1. Introduction to Eigenvalues 15 ## Every permutation matrix leaves x D .1; 1; : : :; 1/ unchanged. Then  D 1. Find two more s (possibly complex) for these permutations, from det.P  I / D 0 : 2 3 2 3 0 1 0 0 0 1 P D 4 0 0 1 5 and P D 4 0 1 0 5 : 1 0 0 1 0 0 16 ## The determinant of A equals the product  1 2    n . Start with the polynomial det.A  I / separated into its n factors (always possible). Then set  D 0 : det.A  I / D .1  /.2  /    .n  / so det A D Check this rule in Example 1 where the Markov matrix has  D 1 and 12 . 17 The sum of the diagonal entries (the trace) equals the sum of the eigenvalues:   a b AD has det.A  I / D 2  .a C d / C ad  bc D 0: c d p The quadratic formula gives the eigenvalues  D .aCd C /=2 and  D Their sum is . If A has 1 D 3 and 2 D 4 then det.A  I / D . 18 ## If A has 1 D 4 and 2 D 5 then det.A  I / D .  4/.  5/ D 2  9 C 20. Find three matrices that have trace a C d D 9 and determinant 20 and  D 4; 5. 19 ## A 3 by 3 matrix B is known to have eigenvalues 0; 1; 2. This information is enough to find three of these (give the answers where possible) : (a) the rank of B (b) the determinant of B T B (c) the eigenvalues of B T B (d) the eigenvalues of .B 2 C I /1 . 20 ## Choose the last rows of A and C to give eigenvalues 4; 7 and 1; 2; 3: 2 3   0 1 0 0 1 Companion matrices AD C D 40 0 15 :      21 ## The eigenvalues of A equal the eigenvalues of A T . This is because det.A  I / equals det.AT  I /. That is true because . Show by an example that the T eigenvectors of A and A are not the same. 22 Construct any 3 by 3 Markov matrix M : positive entries down each column add to 1. Show that M T .1; 1; 1/ D .1; 1; 1/. By Problem 21,  D 1 is also an eigenvalue of M . Challenge: A 3 by 3 singular Markov matrix with trace 12 has what s ? 296 ## Chapter 6. Eigenvalues and Eigenvectors 23 Find three 2 by 2 matrices that have  1 D 2 D 0. The trace is zero and the determinant is zero. A might not be the zero matrix but check that A 2 D 0. 24 This matrix is singular with rank one. Find three s and three eigenvectors: 2 3 2 3 1  2 1 2  A D 425 2 1 2 D 44 2 45 : 1 2 1 2 25 Suppose A and B have the same eigenvalues  1 ; : : :; n with the same independent eigenvectors x 1 ; : : :; x n . Then A D B. Reason: Any vector x is a combination c1 x 1 C    C cn x n . What is Ax? What is Bx? 26 The block B has eigenvalues 1; 2 and C has eigenvalues 3; 4 and D has eigenvalues 5; 7. Find the eigenvalues of the 4 by 4 matrix A: 2 3 0 1 3 0   62 3 0 4 7 B C 7 AD D6 4 0 0 6 15 : 0 D 0 0 1 6 27 ## Find the rank and the four eigenvalues of A and C : 2 3 2 1 1 1 1 1 61 1 1 17 60 7 6 AD6 4 1 1 1 1 5 and C D 4 1 1 1 1 1 0 0 1 0 1 1 0 1 0 3 0 17 7: 05 1 28 Subtract I from the previous A. Find the s and then the determinants of 2 3 2 3 0 1 1 1 0 1 1 1 61 0 1 17 6 1 0 1 1 7 7 6 7: B DAI D6 4 1 1 0 1 5 and C D I  A D 4 1 1 0 1 5 1 1 1 0 1 1 1 0 29 ## (Review) Find the eigenvalues of A, B, and C : 2 3 2 3 1 2 3 0 0 1 A D 4 0 4 5 5 and B D 4 0 2 0 5 0 0 6 3 0 0 30 2 2 and C D 4 2 2 2 2 3 2 25 : 2 ## When a C b D c C d show that .1; 1/ is an eigenvector and find both eigenvalues :   a b AD : c d 297 ## 6.1. Introduction to Eigenvalues 31 If we exchange rows 1 and 2 and columns 1 and 2, the eigenvalues dont change. Find eigenvectors of A and B for  D 11. Rank one gives 2 D 3 D 0. 2 3 2 3 1 2 1 6 3 3 A D 4 3 6 3 5 and B D PAP T D 4 2 1 1 5 : 4 8 4 8 4 4 32 ## Suppose A has eigenvalues 0; 3; 5 with independent eigenvectors u; v; w. (a) Give a basis for the nullspace and a basis for the column space. (b) Find a particular solution to Ax D v C w. Find all solutions. (c) Ax D u has no solution. If it did then ## would be in the column space. 33 ## Suppose u; v are orthonormal vectors in R 2 , and A D uvT . Compute A2 D uvT uvT to discover the eigenvalues of A. Check that the trace of A agrees with  1 C 2 . 34 ## Find the eigenvalues of this permutation matrix P from det .P  I / D 0. Which vectors are not changed by the permutation? They are eigenvectors for  D 1. Can you find three more eigenvectors? 2 3 0 0 0 1 6 1 0 0 0 7 7 P D6 4 0 1 0 0 5: 0 0 1 0 Challenge Problems 35 There are six 3 by 3 permutation matrices P . What numbers can be the determinants of P ? What numbers can be pivots? What numbers can be the trace of P ? What four numbers can be eigenvalues of P , as in Problem 15? 36 Is there a real 2 by 2 matrix (other than I ) with A3 D I ? Its eigenvalues must satisfy 3 D 1. They can be e 2i=3 and e 2i=3 . What trace and determinant would this give? Construct a rotation matrix as A (which angle of rotation?). 37 ## (a) Find the eigenvalues and eigenvectors of A. They depend on c:   :4 1  c AD : :6 c (b) Show that A has just one line of eigenvectors when c D 1:6. (c) This is a Markov matrix when c D :8. Then An will approach what matrix A 1 ? Outlines ## Cramers Rule and Gauss Elimination Mike Renfro September 28, 2004 Mike Renfro ## Cramers Rule and Gauss Elimination Outlines ## Part I: Review of Previous Lecture Part II: Cramers Rule and Gauss Elimination ## Review of Previous Lecture Mike Renfro ## Cramers Rule and Gauss Elimination Outlines ## Part I: Review of Previous Lecture Part II: Cramers Rule and Gauss Elimination ## Cramers Rule and Gauss Elimination Cramers Rule Introduction Example Advantages and Disadvantages Gauss Elimination Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages Homework Mike Renfro ## Cramers Rule and Gauss Elimination Part I Review of Previous Lecture Mike Renfro ## Cramers Rule and Gauss Elimination ## Review of Previous Lecture Graphical interpretation Solvable and unsolvable problems Linear dependence and independence Ill-conditioning Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework Part II Cramers Rule and Gauss Elimination Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework Introduction Example Advantages and Disadvantages ## Cramers Rule (1750) A linear system of equations can be solved by using Cramers rule, which for a system of 2 equations      a11 a12 x1 b1 = a21 a22 x2 b2 yields x1 = |[A2 ]| |[A1 ]| , x2 = |[A]| |[A]| where  b1 a12 b2 a22 a11 b1 a21 b2 A1 = A2 = Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework Introduction Example Advantages and Disadvantages ## Cramers Rule Details ## For a system of n equations, Cramers rule requires that you calculate n + 1 determinants of n n matrices. In the general case for a system of equations [A]{x} = {b}, the matrix [Ai ] is obtained by replacing the ith column of the original [A] matrix with the contents of the {b} vector. Each unknown variable xi is found by dividing the determinant |[Ai ]| by the determinant of the original coefficient matrix |[A]|. Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework Introduction Example Advantages and Disadvantages ## Cramers Rule Example Solve the following system of equations using Cramers rule: x1 x2 + x3 = 3 2x1 + x2 x3 = 0 3x1 + 2x2 + 2x3 = 15 Convert the system of equations into matrix form: 1 1 1 x1 3 2 1 1 x2 0 = 3 2 2 x3 15 Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework Introduction Example Advantages and Disadvantages ## Cramers Rule Example (continued) 1 1 1 x1 3 1 1 , {x} = x2 0 [A] = 2 , {b} = 3 2 2 x3 15 ## Define matrices [A1 ], [A2 ], and [A3 ] as 3 1 1 1 3 1 1 1 , [A2 ] = 2 0 1 , [A1 ] = 0 15 2 2 3 15 2 1 1 3 1 0 [A3 ] = 2 3 2 15 Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework Introduction Example Advantages and Disadvantages ## Cramers Rule Example (continued) Calculate determinants of [A], [A1 ], [A2 ], and [A3 ]: |[A]| = 12 |[A1 ]| = 12 |[A2 ]| = 24 |[A3 ]| = 48 ## Unknowns x1 , x2 , and x3 are then calculated as x1 = |[A1 ]| 12 |[A2 ]| 24 |[A3 ]| 48 = = 1, x2 = = = 2, x3 = = =4 |[A]| 12 |[A]| 12 |[A]| 12 Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework Introduction Example Advantages and Disadvantages ## Cramers Rule Example (continued) ## Be sure to double-check your answers by substituting them into the original equations: x1 x2 + x3 = 1 2 + 4 = 2x1 + x2 x3 = 2 + 2 4 = ## 3x1 + 2x2 + 2x3 = 3 + 4 + 8 = 15 Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework Introduction Example Advantages and Disadvantages ## Cramers Rule Advantages/Disadvantages Advantages Easy to remember steps Disadvantages Computationally intensive compared to other methods: the most efficient ways of calculating the determinant of an n n matrix require (n 1)(n!) operations. So Cramers rule would require (n 1)((n + 1)!) total operations. For 8 equations, that works out to 7(9!) = 2540160 operations, or around 700 hours if you can perform one operation per second. Roundoff error may become significant on large problems with non-integer coefficients. Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages Gauss Elimination Recall the scaffolding problem from the beginning of Chapter 3. Its matrix form was P T 1 1 1 1 0 1 1 0 9 5P T 1 4 0 7 1 B 0 P T 0 1 1 1 0 2 C = 0 P2 T 0 0 3 2 0 P T 0 0 0 1 1 3 E P3 TF 0 0 0 0 0 4 Notice that its coefficient matrix contains nothing but zeroes below the diagonal. This is an example of an upper triangular matrix, and these systems of equations are very easy to solve. Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages ## Gauss Elimination Introduction (continued) The original system of equations on the scaffolding problem was TA +TB 9TB TC +TC TC TD +4TD +TD 3TD TF +7TF TE +2TE TE +TF 4TF = P1 = 5P1 = P2 . = P2 = P3 = P3 Notice that we can solve for TF using only the sixth equation in the system. That is, TF = P43 . After solving for TF , we can solve for TE using only the fifth equation. The pattern continues, back-substituting through the system of equations until finally we solve for TA using the first equation. Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages ## Gauss Elimination Introduction (continued) ## The goal of Gauss elimination is to convert any given system of equations into an equivalent upper triangular form. Once converted, we can back-substitute through the equations, solving for the unknowns algebraically. Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages ## Gauss Elimination Rules ## The operations used in converting a system of equations to upper triangular form are known as elementary operations and are: Any equation may be multiplied by a nonzero scalar. Any equation may be added to (or subtracted from) another equation. The positions of any two equations in the system may be swapped. Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages ## Gauss Elimination Example 2x1 x2 + x3 = 4 (1) 4x1 + 3x2 x3 = 6 (2) ## 3x1 + 2x2 + 2x3 = 15 (3) ## To eliminate the 4x1 term in Equation 2, multiply Equation 1 by 2 and subtract it from Equation 2. To eliminate the 3x1 term in Equation 3, multiply Equation 1 by 23 and subtract it from Equation 3. This gives a system of equations 2x1 x2 + x3 = 4 5x2 3x3 = 2 7 1 x 2 + x3 = 9 2 2 Mike Renfro ## Cramers Rule and Gauss Elimination (4) (5) (6) Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages ## Gauss Elimination Example (continued) ## To eliminate the 27 x2 term from Equation 6, multiply Equation 5 7 and subtract it from Equation 6. This gives a system of by 10 equations 2x1 x2 + x3 = 4 5x2 3x3 = 2 13 52 x3 = 5 5 Mike Renfro ## Cramers Rule and Gauss Elimination (7) (8) (9) Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages ## Gauss Elimination Example (continued) Equation 9 can easily be solved for x3 .   52 5 x3 = =4 5 13 Equation 8 can easily be solved for x2 , once x3 is known. x2 = 1 1 (2 + 3x3 ) = (2 + 3(4)) = 2 5 5 ## Equation 7 can easily be solved for x1 , once both x2 and x3 are known. 1 1 x1 = (4 + x2 x3 ) = (4 + 2 4) = 1 2 2 Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages ## Matrix Version of Gauss Elimination ## The Gauss elimination method can be applied to a system of equations in matrix form. Instead of eliminating terms from equations, well be replacing certain elements of the coefficient matrix with zeroes. Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages ## Matrix Version (Step 0) Start by defining the augmented matrix [C (0) ] for the problem: a1n (0) a1,n+1 (0) (0) a21 a22 ]= .. .. .. . . . (0) (0) an1 an2 a2n .. . (0) a2,n+1 .. . (0) an,n+1 [C (0) (0) a11 (0) a12 ann (0) (0) (0) where the first n columns are the elements of the original [A] matrix, and the last column is the elements of the original {b} matrix. Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages ## Matrix Version (Step 1) Zero out the first column of the [C ] matrix, rows 2 n. To turn 21 a21 to a zero, multiply row 1 by aa11 , then subtract the numbers on row 1 from row 2. To turn a31 to a zero, multiply row 1 by aa31 , 11 then subtract the numbers on row 1 from row 3. Repeat for rows 4 n. [C (1) (0) (0) a1n (0) a1,n+1 (1) .. . a2n .. . (1) a2,n+1 .. . (1) ann (1) an,n+1 a11 a12 0 ]= .. . 0 a22 .. . an2 Mike Renfro (0) (1) (1) ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages ## Matrix Version (Step 2) Zero out the second column of the [C ] matrix, rows 3 n. To 32 turn a32 to a zero, multiply row 2 by aa22 , then subtract the numbers on row 2 from row 3. To turn a42 to a zero, multiply row 42 , then subtract the numbers on row 2 from row 4. Repeat 2 by aa22 for rows 5 n. (0) (0) (0) (0) (0) a11 a12 a13 a1n a1,n+1 (1) (1) (1) (1) 0 a22 a13 a2n a2,n+1 (2) (2) (2) ## 0 a33 a3n a3,n+1 [C (1) ] = 0 . .. .. .. .. .. . . . . . . . (2) (2) (2) 0 0 an3 ann an,n+1 Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages ## Matrix Version (Step n-1) Zero out the (n 1)th column of the [C ] matrix, row n. To turn an,n1 an,n1 to a zero, multiply row n 1 by an1,n1 , then subtract the numbers on row n 1 from row n. (0) (0) (0) (0) (0) a11 a12 a13 a1n a1,n+1 (1) (1) (1) (1) 0 a22 a13 a2n a2,n+1 (2) (2) (2) 0 a33 a3n a3,n+1 [C (1) ] = 0 . .. .. .. .. .. . . . . . . . (n1) (n1) 0 0 0 ann an,n+1 Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages Example Solve the following system of equations with Gauss elimination: 2 1 1 x1 4 4 3 1 x2 6 = 3 2 2 x3 15 First, set up the augmented matrix [C (0) ]: 2 1 1 4 3 1 6 [C (0) ] = 4 3 2 2 15 Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages Example (continued) Step 1a: eliminate the 4 on row 2, column 1. Multiply all the 21 elements of row 1 by aa11 = 42 = 2, then subtract them from the elements of row 2. 2 1 1 4 [C ] = 4 (2)(2) 3 (2)(1) 1 (2)(1) 6 (2)(4) 3 2 2 15 2 1 1 4 5 3 2 = 0 3 2 2 15 Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages Example (continued) Step 1b: eliminate the 3 31 elements of row 1 by aa11 elements of row 3. 2 (1) 0 [C ] = 3 (1.5)(2) 2 1 1 5 3 = 0 0 3.5 0.5 ## on row 3, column 1. Multiply all the = 32 = 1.5, then subtract them from the 1 1 4 5 3 2 2 (1.5)(1) 2 (1.5)(1) 15 (1.5)(4) 4 2 9 ## This completes the first elimination step. Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages Example (continued) Step 2: eliminate the 3.5 on row 3, column 2. Multiply all the 32 elements of row 2 by aa22 = 3.5 5 = 0.7, then subtract them from the elements of row 3. 2 1 1 4 5 3 2 [C (2) ] = 0 0 3.5 (0.7)(5) 0.5 (0.7)(3) 9 (0.7)(2) 2 1 1 4 5 3 2 = 0 0 0 2.6 10.4 This completes the second elimination step. Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages Example (continued) Weve now converted the original system of equations 2 1 1 x1 4 4 3 1 x2 6 = 3 2 2 x3 15 into an equivalent upper-triangular system of equations 4 2 1 1 x1 0 x2 2 5 3 = x3 10.4 0 0 2.6 Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages Example (continued) The new system of equations can be converted back to algebraic form as: 2x1 x2 + x3 = 4 5x2 3x3 = 2 2.6x3 = 10.4 (10) (11) (12) ## Solve Equation 12 for x3 : x3 = 10.4 2.6 = 4. Then solve Equation 11 1 for x2 : x2 = 5 (2 + 3x3 ) = 2. Then solve Equation 10 for x1 : x1 = 21 (4 + x2 x3 ) = 1. Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages Example (continued) ## Double-check the solution by substituting the values of x1 , x2 , and x3 into the original equations: 2x1 x2 + x3 = 2(1) 2 + 4 4x1 3x2 x3 = 4(1) + 3(2) 4 3x1 + 2x2 + 2x3 = 3(1) + 2(2) + 2(4) Mike Renfro =4 =6 = 15 ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework ## Introduction and Rules Example Matrix Version and Example Advantages and Disadvantages ## Gauss Elimination Advantages/Disadvantages Advantages Much less computation required for larger problems. Gauss 3 elimination requires n3 multiplications to solve a system of n equations. For 8 equations, this works out to around 170 operations, versus the roughly 2.5 million operations for Cramers rule. Disadvantages Not quite as easy to remember the procedure for hand solutions. Roundoff error may become significant, but can be partially mitigated by using more advanced techniques such as pivoting or scaling. Mike Renfro ## Cramers Rule and Gauss Elimination Cramers Rule Gauss Elimination Homework Homework ## Solve Problem 3.4 using: Cramers rule, Gauss elimination, and MATLABs \ operator. Double-check your answers by substituting them back into the original equations. Mike Renfro ## Cramers Rule and Gauss Elimination Lecture 12 LU Decomposition In many applications where linear systems appear, one needs to solve Ax = b for many different vectors b. For instance, a structure must be tested under several different loads, not just one. As in the example of a truss (9.2), the loading in such a problem is usually represented by the vector b. Gaussian elimination with pivoting is the most efficient and accurate way to solve a linear system. Most of the work in this method is spent on the matrix A itself. If we need to solve several different systems with the same A, and A is big, then we would like to avoid repeating the steps of Gaussian elimination on A for every different b. This can be accomplished by the LU decomposition, which in effect records the steps of Gaussian elimination. LU decomposition The main idea of the LU decomposition is to record the steps used in Gaussian elimination on A in the places where the zero is produced. Consider the matrix: 1 2 3 A = 2 5 12 . 0 2 10 The first step of Gaussian elimination is to subtract 2 times the first row from the second row. In order to record what we have done, we will put the multiplier, 2, into the place it was used to make a zero, i.e. the second row, first column. In order to make it clear that it is a record of the step and not an element of A, we will put it in parentheses. This leads to: 1 2 3 (2) 1 6 . 0 2 10 There is already a zero in the lower left corner, so we dont need to eliminate anything there. We record this fact with a (0). To eliminate the third row, second column, we need to subtract 2 times the second row from the third row. Recording the 2 in the spot it was used we have: 1 2 3 (2) 1 6 . (0) (2) 2 Let U be the upper triangular matrix produced, and let L be the lower triangular matrix with the records and ones on the diagonal, i.e.: 1 2 3 1 0 0 U = 0 1 6 , L = 2 1 0 , 0 0 2 0 2 1 43 44 ## LECTURE 12. LU DECOMPOSITION ## then we have the following mysterious coincidence: 1 2 1 0 0 1 2 3 LU = 2 1 0 0 1 6 = 2 5 0 2 0 2 1 0 0 2 3 12 = A. 10 Thus we see that A is actually the product of L and U . Here L is lower triangular and U is upper triangular. When a matrix can be written as a product of simpler matrices, we call that a decomposition of A and this one we call the LU decomposition. ## Using LU to solve equations If we also include pivoting, then an LU decomposition for A consists of three matrices P , L and U such that: P A = LU. (12.1) The pivot matrix P is the identity matrix, with the same rows switched as the rows of A are switched in the pivoting. For instance: 1 0 0 P = 0 0 1 , 0 1 0 would be the pivot matrix if the second and third rows of A are switched by pivoting. Matlab will produce an LU decomposition with pivoting for a matrix A with the following command: > [L U P] = lu(A) where P is the pivot matrix. To use this information to solve Ax = b we first pivot both sides by multiplying by the pivot matrix: P Ax = P b d. Substituting LU for P A we get: LU x = d. Then we need only to solve two back substitution problems: Ly = d and U x = y. In Matlab this would work as follows: > A = rand(5,5) > [L U P] = lu(A) > b = rand(5,1) > d = P*b > y = L\d > x = U\y > rnorm = norm(A*x - b) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Check the result. We can then solve for any other b without redoing the LU step. Repeat the sequence for a new right hand side: c = randn(5,1); you can start at the third line. While this may not seem like a big savings, it would be if A were a large matrix from an actual application. 45 Exercises 12.1 Solve the systems below by hand using the LU decomposition. Pivot if appropriate. In each of the two problems, check by hand that LU = P A and Ax = b.     0 2 4 , b= (a) A = 3 .5 4     1 4 3 (b) A = , b= 3 5 2 12.2 Finish the following Matlab function program: function [x1, e1, x2, e2] = mysolve(A,b) % Solves linear systems using the LU decomposition with pivoting % and also with the built-in solve function A\b. % Inputs: A -- the matrix % b -- the right-hand vector % Outputs: x1 -- the solution using the LU method % e1 -- the norm of the residual using the LU method % x2 -- the solution using the built-in method % e2 -- the norm of the residual using the built-in method format long Test the program on both random matrices (randn(n,n)) and Hilbert matrices (hilb(n)) with n really big and document the results. Chapter 2 Linear Equations ## One of the problems encountered most frequently in scientific computation is the solution of systems of simultaneous linear equations. This chapter covers the solution of linear systems by Gaussian elimination and the sensitivity of the solution to errors in the data and roundoff errors in the computation. 2.1 ## Solving Linear Systems ## With matrix notation, a system of simultaneous linear equations is written Ax = b. In the most frequent case, when there are as many equations as unknowns, A is a given square matrix of order n, b is a given column vector of n components, and x is an unknown column vector of n components. Students of linear algebra learn that the solution to Ax = b can be written x = A1 b, where A1 is the inverse of A. However, in the vast majority of practical computational problems, it is unnecessary and inadvisable to actually compute A1 . As an extreme but illustrative example, consider a system consisting of just one equation, such as 7x = 21. The best way to solve such a system is by division: x= 21 = 3. 7 ## Use of the matrix inverse would lead to x = 71 21 = 0.142857 21 = 2.99997. The inverse requires more arithmetica division and a multiplication instead of just a divisionand produces a less accurate answer. Similar considerations apply February 15, 2008 ## Chapter 2. Linear Equations to systems of more than one equation. This is even true in the common situation where there are several systems of equations with the same matrix A but different right-hand sides b. Consequently, we shall concentrate on the direct solution of systems of equations rather than the computation of the inverse. 2.2 ## The MATLAB Backslash Operator To emphasize the distinction between solving linear equations and computing inverses, Matlab has introduced nonstandard notation using backward slash and forward slash operators, \ and /. If A is a matrix of any size and shape and B is a matrix with as many rows as A, then the solution to the system of simultaneous equations AX = B is denoted by X = A\B. Think of this as dividing both sides of the equation by the coefficient matrix A. Because matrix multiplication is not commutative and A occurs on the left in the original equation, this is left division. Similarly, the solution to a system with A on the right and B with as many columns as A, XA = B, is obtained by right division, X = B/A. This notation applies even if A is not square, so that the number of equations is not the same as the number of unknowns. However, in this chapter, we limit ourselves to systems with square coefficient matrices. 2.3 A 3-by-3 Example ## To illustrate the general linear equation solution algorithm, consider an example of order three: 10 7 0 x1 7 3 2 6 x2 = 4 . 5 1 5 x3 6 This, of course, represents the three simultaneous equations 10x1 7x2 = 7, 3x1 + 2x2 + 6x3 = 4, 5x1 x2 + 5x3 = 6. The first step of the solution algorithm uses the first equation to eliminate x1 from the other equations. This is accomplished by adding 0.3 times the first equation ## 2.3. A 3-by-3 Example to the second equation and subtracting 0.5 times the first equation from the third equation. The coefficient 10 of x1 in the first equation is called the first pivot and the quantities 0.3 and 0.5, obtained by dividing the coefficients of x1 in the other equations by the pivot, are called the multipliers. The first step changes the equations to 10 7 0 x1 7 0 0.1 6 x2 = 6.1 . 0 2.5 5 x3 2.5 The second step might use the second equation to eliminate x2 from the third equation. However, the second pivot, which is the coefficient of x2 in the second equation, would be 0.1, which is smaller than the other coefficients. Consequently, the last two equations are interchanged. This is called pivoting. It is not actually necessary in this example because there are no roundoff errors, but it is crucial in general: 10 7 0 x1 7 0 2.5 5 x2 = 2.5 . 0 0.1 6 x3 6.1 Now the second pivot is 2.5 and the second equation can be used to eliminate x2 from the third equation. This is accomplished by adding 0.04 times the second equation to the third equation. (What would the multiplier have been if the equations had not been interchanged?) 10 7 0 x1 7 0 2.5 5 x2 = 2.5 . 0 0 6.2 x3 6.2 The last equation is now 6.2x3 = 6.2. This can be solved to give x3 = 1. This value is substituted into the second equation: 2.5x2 + (5)(1) = 2.5. Hence x2 = 1. Finally, the values of x2 and x3 are substituted into the first equation: 10x1 + (7)(1) = 7. Hence x1 = 0. The solution is 0 x = 1 . 1 ## This solution can be easily checked 10 7 3 2 5 1 ## using the original equations: 0 0 7 6 1 = 4 . 5 1 6 ## Chapter 2. Linear Equations The entire algorithm can be compactly expressed in matrix notation. For this example, let 1 0 0 10 7 0 1 0 0 L = 0.5 1 0 , U = 0 2.5 5 , P = 0 0 1 . 0.3 0.04 1 0 0 6.2 0 1 0 The matrix L contains the multipliers used during the elimination, the matrix U is the final coefficient matrix, and the matrix P describes the pivoting. With these three matrices, we have LU = P A. In other words, the original coefficient matrix can be expressed in terms of products involving matrices with simpler structure. 2.4 ## Permutation and Triangular Matrices A permutation matrix is an identity matrix with the rows and columns interchanged. It has exactly one 1 in each row and column; all the other elements are 0. For example, 0 0 0 1 1 0 0 0 P = . 0 0 1 0 0 1 0 0 Multiplying a matrix A on the left by a permutation matrix to give P A permutes the rows of A. Multiplying on the right, AP , permutes the columns of A. Matlab can also use a permutation vector as a row or column index to rearrange the rows or columns of a matrix. Continuing with the P above, let p be the vector p = [4 1 3 2] Then P*A and A(p,:) are equal. The resulting matrix has the fourth row of A as its first row, the first row of A as its second row, and so on. Similarly, A*P and A(:,p) both produce the same permutation of the columns of A. The P*A notation is closer to traditional mathematics, P A, while the A(p,:) notation is faster and uses less memory. Linear equations involving permutation matrices are trivial to solve. The solution to Px = b is simply a rearrangement of the components of b: x = P T b. An upper triangular matrix has all its nonzero elements above or on the main diagonal. A unit lower triangular matrix has ones on the main diagonal and all the 2.5. LU Factorization ## rest of its nonzero elements below the main 1 2 0 5 U = 0 0 0 0 is upper triangular, and 1 2 L= 3 4 0 1 5 6 ## diagonal. For example, 3 4 6 7 8 9 0 10 0 0 1 7 0 0 0 1 ## is unit lower triangular. Linear equations involving triangular matrices are also easily solved. There are two variants of the algorithm for solving an n-by-n upper triangular system U x = b. Both begin by solving the last equation for the last variable, then the next-to-last equation for the next-to-last variable, and so on. One subtracts multiples of the columns of U from b. x = zeros(n,1); for k = n:-1:1 x(k) = b(k)/U(k,k); i = (1:k-1); b(i) = b(i) - x(k)*U(i,k); end The other uses inner products between the rows of U and portions of the emerging solution x. x = zeros(n,1); for k = n:-1:1 j = k+1:n; x(k) = (b(k) - U(k,j)*x(j))/U(k,k); end 2.5 LU Factorization The algorithm that is almost universally used to solve square systems of simultaneous linear equations is one of the oldest numerical methods, the systematic elimination method, generally named after C. F. Gauss. Research in the period 1955 to 1965 revealed the importance of two aspects of Gaussian elimination that were not emphasized in earlier work: the search for pivots and the proper interpretation of the effect of rounding errors. In general, Gaussian elimination has two stages, the forward elimination and the back substitution. The forward elimination consists of n 1 steps. At the kth step, multiples of the kth equation are subtracted from the remaining equations to eliminate the kth variable. If the coefficient of xk is small, it is advisable to ## Chapter 2. Linear Equations interchange equations before this is done. The elimination steps can be simultaneously applied to the right-hand side, or the interchanges and multipliers saved and applied to the right-hand side later. The back substitution consists of solving the last equation for xn , then the next-to-last equation for xn1 , and so on, until x1 is computed from the first equation. Let Pk , k = 1, . . . , n 1, denote the permutation matrix obtained by interchanging the rows of the identity matrix in the same way the rows of A are interchanged at the kth step of the elimination. Let Mk denote the unit lower triangular matrix obtained by inserting the negatives of the multipliers used at the kth step below the diagonal in the kth column of the identity matrix. Let U be the final upper triangular matrix obtained after the n 1 steps. The entire process can be described by one matrix equation, U = Mn1 Pn1 M2 P2 M1 P1 A. It turns out that this equation can be rewritten L1 L2 Ln1 U = Pn1 P2 P1 A, where Lk is obtained from Mk by permuting and changing the signs of the multipliers below the diagonal. So, if we let L = L1 L2 Ln1 , P = Pn1 P2 P1 , then we have LU = P A. The unit lower triangular matrix L contains all the multipliers used during the elimination and the permutation matrix P accounts for all the interchanges. For our example 10 7 0 A = 3 2 6, 5 1 5 the matrices defined during the elimination are 1 0 0 1 0 0 P1 = 0 1 0 , M1 = 0.3 1 0 , 0.5 0 1 0 0 1 1 0 P2 = 0 0 0 1 The corresponding Ls are 0 1 0 0 1 , M2 = 0 1 0. 0 0 0.04 1 1 0 0 1 L1 = 0.5 1 0 , L2 = 0 0.3 0 1 0 0 0 1 0. 0.04 1 ## 2.6. Why Is Pivoting Necessary? The relation LU = P A is called the LU factorization or the triangular decomposition of A. It should be emphasized that nothing new has been introduced. Computationally, elimination is done by row operations on the coefficient matrix, not by actual matrix multiplication. LU factorization is simply Gaussian elimination expressed in matrix notation. With this factorization, a general system of equations Ax = b becomes a pair of triangular systems Ly = P b, U x = y. 2.6 ## Why Is Pivoting Necessary? The diagonal elements of U are called pivots. The kth pivot is the coefficient of the kth variable in the kth equation at the kth step of the elimination. In our 3-by-3 example, the pivots are 10, 2.5, and 6.2. Both the computation of the multipliers and the back substitution require divisions by the pivots. Consequently, the algorithm cannot be carried out if any of the pivots are zero. Intuition should tell us that it is a bad idea to complete the computation if any of the pivots are nearly zero. To see this, let us change our example slightly to 10 7 0 x1 7 3 2.099 6 x2 = 3.901 . 5 1 5 x3 6 The (2, 2) element of the matrix has been changed from 2.000 to 2.099, and the right-hand side has also been changed so that the exact answer is still (0, 1, 1)T . Let us assume that the solution is to be computed on a hypothetical machine that does decimal floating-point arithmetic with five significant digits. The first step of the elimination produces 7 x1 10 7 0 0 0.001 6 x2 = 6.001 . 0 2.5 5 x3 2.5 The (2, 2) element is now quite small compared with the other elements in the matrix. Nevertheless, let us complete the elimination without using any interchanges. The next step requires adding 2.5 103 times the second equation to the third: (5 + (2.5 103 )(6))x3 = (2.5 + (2.5 103 )(6.001)). On the right-hand side, this involves multiplying 6.001 by 2.5 103 . The result is 1.50025 104 , which cannot be exactly represented in our hypothetical floating-point number system. It must be rounded to 1.5002 104 . The result is then added to 2.5 and rounded again. In other words, both of the 5s shown in italics in (5 + 1.5000 104 )x3 = (2.5 + 1.50025 104 ) ## Chapter 2. Linear Equations are lost in roundoff errors. On this hypothetical machine, the last equation becomes 1.5005 104 x3 = 1.5004 104 . The back substitution begins with x3 = 1.5004 104 = 0.99993. 1.5005 104 Because the exact answer is x3 = 1, it does not appear that the error is too serious. Unfortunately, x2 must be determined from the equation 0.001x2 + (6)(0.99993) = 6.001, which gives 1.5 103 = 1.5. 1.0 103 Finally, x1 is determined from the first equation, x2 = 10x1 + (7)(1.5) = 7, which gives x1 = 0.35. T ## Instead of (0, 1, 1) , we have obtained (0.35, 1.5, 0.99993)T . Where did things go wrong? There was no accumulation of rounding error caused by doing thousands of arithmetic operations. The matrix is not close to singular. The difficulty comes from choosing a small pivot at the second step of the elimination. As a result, the multiplier is 2.5 103 , and the final equation involves coefficients that are 103 times as large as those in the original problem. Roundoff errors that are small if compared to these large coefficients are unacceptable in terms of the original matrix and the actual solution. We leave it to the reader to verify that if the second and third equations are interchanged, then no large multipliers are necessary and the final result is accurate. This turns out to be true in general: If the multipliers are all less than or equal to one in magnitude, then the computed solution can be proved to be satisfactory. Keeping the multipliers less than one in absolute value can be ensured by a process known as partial pivoting. At the kth step of the forward elimination, the pivot is taken to be the largest (in absolute value) element in the unreduced part of the kth column. The row containing this pivot is interchanged with the kth row to bring the pivot element into the (k, k) position. The same interchanges must be done with the elements of the right-hand side b. The unknowns in x are not reordered because the columns of A are not interchanged. 2.7 ## lutx, bslashtx, lugui ## We have three functions implementing the algorithms discussed in this chapter. The first function, lutx, is a readable version of the built-in Matlab function lu. ## 2.7. lutx, bslashtx, lugui There is one outer for loop on k that counts the elimination steps. The inner loops on i and j are implemented with vector and matrix operations, so that the overall function is reasonably efficient. function [L,U,p] = lutx(A) %LU Triangular factorization % [L,U,p] = lutx(A) produces a unit lower triangular % matrix L, an upper triangular matrix U, and a % permutation vector p, so that L*U = A(p,:). [n,n] = size(A); p = (1:n) for k = 1:n-1 % Find largest element below diagonal in k-th column [r,m] = max(abs(A(k:n,k))); m = m+k-1; % Skip elimination if column is zero if (A(m,k) ~= 0) % Swap pivot row if (m ~= k) A([k m],:) = A([m k],:); p([k m]) = p([m k]); end % Compute multipliers i = k+1:n; A(i,k) = A(i,k)/A(k,k); % Update the remainder of the matrix j = k+1:n; A(i,j) = A(i,j) - A(i,k)*A(k,j); end end % Separate result L = tril(A,-1) + eye(n,n); U = triu(A); Study this function carefully. Almost all the execution time is spent in the statement A(i,j) = A(i,j) - A(i,k)*A(k,j); 10 Chapter 2. Linear Equations At the kth step of the elimination, i and j are index vectors of length n-k. The operation A(i,k)*A(k,j) multiplies a column vector by a row vector to produce a square, rank one matrix of order n-k. This matrix is then subtracted from the submatrix of the same size in the bottom right corner of A. In a programming language without vector and matrix operations, this update of a portion of A would be done with doubly nested loops on i and j. The second function, bslashtx, is a simplified version of the built-in Matlab backslash operator. It begins by checking for three important special cases: lower triangular, upper triangular, and symmetric positive definite. Linear systems with these properties can be solved in less time than a general system. function x = bslashtx(A,b) % BSLASHTX Solve linear system (backslash) % x = bslashtx(A,b) solves A*x = b [n,n] = size(A); if isequal(triu(A,1),zeros(n,n)) % Lower triangular x = forward(A,b); return elseif isequal(tril(A,-1),zeros(n,n)) % Upper triangular x = backsubs(A,b); return elseif isequal(A,A) [R,fail] = chol(A); if ~fail % Positive definite y = forward(R,b); x = backsubs(R,y); return end end If none of the special cases is detected, bslashtx calls lutx to permute and factor the coefficient matrix, then uses the permutation and factors to complete the solution of a linear system. % Triangular factorization [L,U,p] = lutx(A); % Permutation and forward elimination y = forward(L,b(p)); % Back substitution x = backsubs(U,y); ## 2.8. Effect of Roundoff Errors 11 The bslashtx function employs subfunctions to carry out the solution of lower and upper triangular systems. function x = forward(L,x) % FORWARD. Forward elimination. % For lower triangular L, x = forward(L,b) solves L*x = b. [n,n] = size(L); for k = 1:n j = 1:k-1; x(k) = (x(k) - L(k,j)*x(j))/L(k,k); end function x = backsubs(U,x) % BACKSUBS. Back substitution. % For upper triangular U, x = backsubs(U,b) solves U*x = b. [n,n] = size(U); for k = n:-1:1 j = k+1:n; x(k) = (x(k) - U(k,j)*x(j))/U(k,k); end A third function, lugui, shows the steps in LU decomposition by Gaussian elimination. It is a version of lutx that allows you to experiment with various pivot selection strategies. At the kth step of the elimination, the largest element in the unreduced portion of the kth column is shown in magenta. This is the element that partial pivoting would ordinarily select as the pivot. You can then choose among four different pivoting strategies: Pick a pivot. Use the mouse to pick the magenta element, or any other element, as pivot. Diagonal pivoting. Use the diagonal element as the pivot. Partial pivoting. Same strategy as lu and lutx. Complete pivoting. Use the largest element in the unfactored submatrix as the pivot. The chosen pivot is shown in red and the resulting elimination step is taken. As the process proceeds, the emerging columns of L are shown in green and the emerging rows of U in blue. 2.8 ## Effect of Roundoff Errors The rounding errors introduced during the solution of a linear system of equations almost always cause the computed solutionwhich we now denote by x to differ somewhat from the theoretical solution x = A1 b. In fact, if the elements of x 12 ## Chapter 2. Linear Equations are not floating-point numbers, then x cannot equal x. There are two common measures of the discrepancy in x : the error, e = x x , and the residual, r = b Ax . Matrix theory tells us that, because A is nonsingular, if one of these is zero, the other must also be zero. But they are not necessarily both small at the same time. Consider the following example: x1 0.780 0.563 0.217 = . 0.913 0.659 x2 0.254 What happens if we carry out Gaussian elimination with partial pivoting on a hypothetical three-digit decimal computer? First, the two rows (equations) are interchanged so that 0.913 becomes the pivot. Then the multiplier 0.780 = 0.854 (to three places) 0.913 is computed. Next, 0.854 times the new first row is subtracted from the new second row to produce the system 0.913 0.659 x1 0.254 = . 0 0.001 x2 0.001 Finally, the back substitution is carried out: 0.001 = 1.00 (exactly), 0.001 0.254 0.659x2 x1 = 0.913 = 0.443 (to three places). x2 = ## Thus the computed solution is x = 0.443 1.000 To assess the accuracy without knowing the exact answer, we compute the residuals (exactly): ## 0.217 ((0.780)(0.443) + (0.563)(1.00)) r = b Ax = 0.254 ((0.913)(0.443) + (0.659)(1.00)) 0.000460 = . 0.000541 ## 2.8. Effect of Roundoff Errors 13 The residuals are less than 103 . We could hardly expect better on a three-digit machine. However, it is easy to see that the exact solution to this system is 1.000 x= . 1.000 So the components of our computed solution actually have the wrong signs; the error is larger than the solution itself. Were the small residuals just a lucky fluke? You should realize that this example is highly contrived. The matrix is very close to being singular and is not typical of most problems encountered in practice. Nevertheless, let us track down the reason for the small residuals. If Gaussian elimination with partial pivoting is carried out for this example on a computer with six or more digits, the forward elimination will produce a system something like x1 0.254000 0.913000 0.659000 . = x2 0.000001 0 0.000001 Notice that the sign of U2,2 differs from that obtained with three-digit computation. Now the back substitution produces 0.000001 = 1.00000, 0.000001 0.254 0.659x2 x1 = 0.913 = 1.00000, x2 = the exact answer. On our three-digit machine, x2 was computed by dividing two quantities, both of which were on the order of rounding errors and one of which did not even have the correct sign. Hence x2 can turn out to be almost anything. Then this arbitrary value of x2 was substituted into the first equation to obtain x1 . We can reasonably expect the residual from the first equation to be small x1 was computed in such a way as to make this certain. Now comes a subtle but crucial point. We can also expect the residual from the second equation to be small, precisely because the matrix is so close to being singular. The two equations are very nearly multiples of one another, so any pair (x1 , x2 ) that nearly satisfies the first equation will also nearly satisfy the second. If the matrix were known to be exactly singular, we would not need the second equation at allany solution of the first would automatically satisfy the second. In Figure 2.1, the exact solution is marked with a circle and the computed solution with an asterisk. Even though the computed solution is far from the exact intersection, it is close to both lines because they are nearly parallel. Although this example is contrived and atypical, the conclusion we reached is not. It is probably the single most important fact that we have learned about matrix computation since the invention of the digital computer: Gaussian elimination with partial pivoting is guaranteed to produce small residuals. 14 ## Chapter 2. Linear Equations 1.5 0.5 0.5 1.5 1.5 0.5 0.5 1.5 ## Figure 2.1. The computed solution, marked by an asterisk, shows a large error, but a small residual. Now that we have stated it so strongly, we must make a couple of qualifying remarks. By guaranteed we mean it is possible to prove a precise theorem that assumes certain technical details about how the floating-point arithmetic system works and that establishes certain inequalities that the components of the residual must satisfy. If the arithmetic units work some other way or if there is a bug in the particular program, then the guarantee is void. Furthermore, by small we mean on the order of roundoff error relative to three quantities: the size of the elements of the original coefficient matrix, the size of the elements of the coefficient matrix at intermediate steps of the elimination process, and the size of the elements of the computed solution. If any of these are large, then the residual will not necessarily be small in an absolute sense. Finally, even if the residual is small, we have made no claims that the error will be small. The relationship between the size of the residual and the size of the error is determined in part by a quantity known as the condition number of the matrix, which is the subject of the next section. 2.9 ## Norms and Condition Numbers The coefficients in the matrix and right-hand side of a system of simultaneous linear equations are rarely known exactly. Some systems arise from experiments, and so the coefficients are subject to observational errors. Other systems have coefficients given by formulas that involve roundoff error in their evaluation. Even if the system can be stored exactly in the computer, it is almost inevitabe that roundoff errors will be introduced during its solution. It can be shown that roundoff errors in Gaussian elimination have the same effect on the answer as errors in the original coefficients. ## 2.9. Norms and Condition Numbers 15 ## Consequently, we are led to a fundamental question. If perturbations are made in the coefficients of a system of linear equations, how much is the solution altered? In other words, if Ax = b, how can we measure the sensitivity of x to changes in A and b? The answer to this question lies in making the idea of nearly singular precise. If A is a singular matrix, then for some bs a solution x will not exist, while for others it will not be unique. So if A is nearly singular, we can expect small changes in A and b to cause very large changes in x. On the other hand, if A is the identity matrix, then b and x are the same vector. So if A is nearly the identity, small changes in A and b should result in correspondingly small changes in x. At first glance, it might appear that there is some connection between the size of the pivots encountered in Gaussian elimination with partial pivoting and nearness to singularity, because if the arithmetic could be done exactly, all the pivots would be nonzero if and only if the matrix is nonsingular. To some extent, it is also true that if the pivots are small, then the matrix is close to singular. However, when roundoff errors are encountered, the converse is no longer truea matrix might be close to singular even though none of the pivots are small. To get a more precise, and reliable, measure of nearness to singularity than the size of the pivots, we need to introduce the concept of a norm of a vector. This is a single number that measures the general size of the elements of the vector. The family of vector norms known as lp depends on a parameter p in the range 1 p : !1/p n X p |xi | . kxkp = i=1 ## We almost always use p = 1, p = 2, or lim p : kxk1 = n X |xi |, i=1 kxk2 = n X !1/2 |xi | i=1 ## kxk = max |xi |. i The l1 -norm is also known as the Manhattan norm because it corresponds to the distance traveled on a grid of city streets. The l2 -norm is the familiar Euclidean distance. The l -norm is also known as the Chebyshev norm. The particular value of p is often unimportant and we simply use kxk. All vector norms have the following basic properties associated with the notion of distance: kxk > 0 if x 6= 0, k0k = 0, kcxk = |c|kxk for all scalars c, kx + yk kxk + kyk (the triangle inequality). 16 ## Chapter 2. Linear Equations ## In Matlab, kxkp is computed by norm(x,p), and norm(x) is the same as norm(x,2). For example, x = (1:4)/5 norm1 = norm(x,1) norm2 = norm(x) norminf = norm(x,inf) produces x = 0.2000 0.4000 0.6000 0.8000 norm1 = 2.0000 norm2 = 1.0954 norminf = 0.8000 Multiplication of a vector x by a matrix A results in a new vector Ax that can have a very different norm from x. This change in norm is directly related to the sensitivity we want to measure. The range of the possible change can be expressed by two numbers: kAxk , kxk kAxk m = min . kxk M = max The max and min are taken over all nonzero vectors x. Note that if A is singular, then m = 0. The ratio M/m is called the condition number of A: (A) = max kAxk kxk min kAxk kxk The actual numerical value of (A) depends on the vector norm being used, but we are usually only interested in order of magnitude estimates of the condition number, so the particular norm is usually not very important. Consider a system of equations Ax = b and a second system obtained by altering the right-hand side: A(x + x) = b + b. ## 2.9. Norms and Condition Numbers 17 ## We think of b as being the error in b and x as being the resulting error in x, although we need not make any assumptions that the errors are small. Because A(x) = b, the definitions of M and m immediately lead to kbk M kxk and kbk mkxk. Consequently, if m 6= 0, kbk kxk (A) . kxk kbk The quantity kbk/kbk is the relative change in the right-hand side, and the quantity kxk/kxk is the relative error caused by this change. The advantage of using relative changes is that they are dimensionless, that is, they are not affected by overall scale factors. This shows that the condition number is a relative error magnification factor. Changes in the right-hand side can cause changes (A) times as large in the solution. It turns out that the same is true of changes in the coefficient matrix itself. The condition number is also a measure of nearness to singularity. Although we have not yet developed the mathematical tools necessary to make the idea precise, the condition number can be thought of as the reciprocal of the relative distance from the matrix to the set of singular matrices. So, if (A) is large, A is close to singular. Some of the basic properties of the condition number are easily derived. Clearly, M m, and so (A) 1. If P is a permutation matrix, then the components of P x are simply a rearrangement of the components of x. It follows that kP xk = kxk for all x, and so (P ) = 1. In particular, (I) = 1. If A is multiplied by a scalar c, then M and m are both multiplied by the same scalar, and so (cA) = (A). If D is a diagonal matrix, then (D) = max |dii | . min |dii | These last two properties are two of the reasons that (A) is a better measure of nearness to singularity than the determinant of A. As an extreme example, consider a 100-by-100 diagonal matrix with 0.1 on the diagonal. Then det(A) = 10100 , which is usually regarded as a small number. But (A) = 1, and the components of Ax are simply 0.1 times the corresponding components of x. For linear systems of equations, such a matrix behaves more like the identity than like a singular matrix. 18 ## Chapter 2. Linear Equations The following example uses the l1 -norm: 4.1 2.8 A= , 9.7 6.6 4.1 b= , 9.7 1 x= . 0 Clearly, Ax = b, and kbk = 13.8, kxk = 1. If the right-hand side is changed to b = the solution becomes x = 4.11 9.70 0.34 0.97 Let b = b b and x = x x . Then kbk = 0.01, kxk = 1.63. We have made a fairly small perturbation in b that completely changes x. In fact, the relative changes are kbk = 0.0007246, kbk kxk = 1.63. kxk Because (A) is the maximum magnification factor, (A) 1.63 = 2249.4. 0.0007246 We have actually chosen the b and b that give the maximum, and so, for this example with the l1 -norm, (A) = 2249.4. It is important to realize that this example is concerned with the exact solutions to two slightly different systems of equations and that the method used to obtain the solutions is irrelevant. The example is constructed to have a fairly large condition number so that the effect of changes in b is quite pronounced, but similar behavior can be expected in any problem with a large condition number. The condition number also plays a fundamental role in the analysis of the roundoff errors introduced during the solution by Gaussian elimination. Let us ## 2.9. Norms and Condition Numbers 19 assume that A and b have elements that are exact floating-point numbers, and let x be the vector of floating-point numbers obtained from a linear equation solver such as the function we shall present in the next section. We also assume that exact singularity is not detected and that there are no underflows or overflows. Then it is possible to establish the following inequalities: kb Ax k , kAkkx k kx x k (A). kx k Here is the relative machine precision eps and is defined more carefully later, but it usually has a value no larger than about 10. The first inequality says that the relative residual can usually be expected to be about the size of roundoff error, no matter how badly conditioned the matrix is. This was illustrated by the example in the previous section. The second inequality requires that A be nonsingular and involves the exact solution x. It follows directly from the first inequality and the definition of (A) and says that the relative error will also be small if (A) is small but might be quite large if the matrix is nearly singular. In the extreme case where A is singular but the singularity is not detected, the first inequality still holds but the second has no meaning. To be more precise about the quantity , it is necessary to introduce the idea of a matrix norm and establish some further inequalities. Readers who are not interested in such details can skip the remainder of this section. The quantity M defined earlier is known as the norm of the matrix. The notation for the matrix norm is the same as for the vector norm: kAxk kAk = max . kxk It is not hard to see that kA1 k = 1/m, so an equivalent definition of the condition number is (A) = kAkkA1 k. Again, the actual numerical values of the matrix norm and condition number depend on the underlying vector norm. It is easy to compute the matrix norms corresponding to the l1 and l vector norms. In fact, it is not hard to show that X |ai,j |, kAk1 = max j kAk = max i |ai,j |. Computing the matrix norm corresponding to the l2 vector norm involves the singular value decomposition (SVD), which is discussed in a later chapter. Matlab computes matrix norms with norm(A,p) for p = 1, 2, or inf. The basic result in the study of roundoff error in Gaussian elimination is due to J. H. Wilkinson. He proved that the computed solution x exactly satisfies (A + E)x = b, 20 ## Chapter 2. Linear Equations where E is a matrix whose elements are about the size of roundoff errors in the elements of A. There are some rare situations where the intermediate matrices obtained during Gaussian elimination have elements that are larger than those of A, and there is some effect from accumulation of rounding errors in large matrices, but it can be expected that if is defined by kEk = , kAk then will rarely be bigger than about 10. From this basic result, we can immediately derive inequalities involving the residual and the error in the computed solution. The residual is given by b Ax = Ex , and hence kb Ax k = kEx k kEkkx k. The residual involves the product Ax , so it is appropriate to consider the relative residual, which compares the norm of b Ax to the norms of A and x . It follows directly from the above inequalities that kb Ax k . kAkkx k If A is nonsingular, the error can be expressed using the inverse of A by x x = A1 (b Ax ), and so kx x k kA1 kkEkkx k. It is simplest to compare the norm of the error with the norm of the computed solution. Thus the relative error satisfies kx x k kAkkA1 k. kx k Hence kx x k (A). kx k ## The actual computation of (A) requires knowing kA1 k. But computing A1 requires roughly three times as much work as solving a single linear system. Computing the l2 condition number requires the SVD and even more work. Fortunately, the exact value of (A) is rarely required. Any reasonably good estimate of it is satisfactory. Matlab has several functions for computing or estimating condition numbers. cond(A) or cond(A,2) computes 2 (A). Uses svd(A). Suitable for smaller matrices where the geometric properties of the l2 -norm are important. ## 2.10. Sparse Matrices and Band Matrices 21 ## cond(A,1) computes 1 (A). Uses inv(A). Less work than cond(A,2). cond(A,inf) computes (A). Uses inv(A). Same as cond(A,1). condest(A) estimates 1 (A). Uses lu(A) and a recent algorithm of Higham and Tisseur . Especially suitable for large, sparse matrices. rcond(A) estimates 1/1 (A). Uses lu(A) and an older algorithm developed by the LINPACK and LAPACK projects. Primarily of historical interest. 2.10 ## Sparse Matrices and Band Matrices Sparse matrices and band matrices occur frequently in technical computing. The sparsity of a matrix is the fraction of its elements that are zero. The Matlab function nnz counts the number of nonzeros in a matrix, so the sparsity of A is given by density = nnz(A)/prod(size(A)) sparsity = 1 - density A sparse matrix is a matrix whose sparsity is nearly equal to 1. The bandwidth of a matrix is the maximum distance of the nonzero elements from the main diagonal. [i,j] = find(A) bandwidth = max(abs(i-j)) A band matrix is a matrix whose bandwidth is small. As you can see, both sparsity and bandwidth are matters of degree. An n-by-n diagonal matrix with no zeros on the diagonal has sparsity 1 1/n and bandwidth 0, so it is an extreme example of both a sparse matrix and a band matrix. On the other hand, an n-by-n matrix with no zero elements, such as the one created by rand(n,n), has sparsity equal to zero and bandwidth equal to n 1, and so is far from qualifying for either category. The Matlab sparse data structure stores the nonzero elements together with information about their indices. The sparse data structure also provides efficient handling of band matrices, so Matlab does not have a separate band matrix storage class. The statement S = sparse(A) converts a matrix to its sparse representation. The statement A = full(S) reverses the process. However, most sparse matrices have orders so large that it is impractical to store the full representation. More frequently, sparse matrices are created by S = sparse(i,j,x,m,n) 22 ## Chapter 2. Linear Equations ## This produces a matrix S with [i,j,x] = find(S) [m,n] = size(S) Most Matlab matrix operations and functions can be applied to both full and sparse matrices. The dominant factor in determining the execution time and memory requirements for sparse matrix operations is the number of nonzeros, nnz(S), in the various matrices involved. A matrix with bandwidth equal to 1 is known as a tridiagonal matrix. It is worthwhile to have a specialized function for one particular band matrix operation, the solution of a tridiagonal system of simultaneous linear equations: b1 a1 c1 b2 a2 c2 b3 .. . c3 .. . an2 .. . bn1 an1 cn1 bn x1 x2 x3 .. . n1 xn = d d1 d2 d3 .. . n1 dn ## The function tridisolve is included in the NCM directory. The statement x = tridisolve(a,b,c,d) solves the tridiagonal system with subdiagonal a, diagonal b, superdiagonal c, and right-hand side d. We have already seen the algorithm that tridisolve uses; it is Gaussian elimination. In many situations involving tridiagonal matrices, the diagonal elements dominate the off-diagonal elements, so pivoting is unnecessary. Furthermore, the right-hand side is processed at the same time as the matrix itself. In this context, Gaussian elimination without pivoting is also known as the Thomas algorithm. The body of tridisolve begins by copying the right-hand side to a vector that will become the solution. x = d; n = length(x); The forward elimination step is a simple for loop. for j = 1:n-1 mu = a(j)/b(j); b(j+1) = b(j+1) - mu*c(j); x(j+1) = x(j+1) - mu*x(j); end The mus would be the multipliers on the subdiagonal of L if we were saving the LU factorization. Instead, the right-hand side is processed in the same loop. The back substitution step is another simple loop. ## 2.11. PageRank and Markov Chains 23 x(n) = x(n)/b(n); for j = n-1:-1:1 x(j) = (x(j)-c(j)*x(j+1))/b(j); end Because tridisolve does not use pivoting, the results might be inaccurate if abs(b) is much smaller than abs(a)+abs(c). More robust, but slower, alternatives that do use pivoting include generating a full matrix with diag: T = diag(a,-1) + diag(b,0) + diag(c,1); x = T\d or generating a sparse matrix with spdiags: S = spdiags([a b c],[-1 0 1],n,n); x = S\d 2.11 ## PageRank and Markov Chains One of the reasons why GoogleTM is such an effective search engine is the PageRankTM algorithm developed by Googles founders, Larry Page and Sergey Brin, when they were graduate students at Stanford University. PageRank is determined entirely by the link structure of the World Wide Web. It is recomputed about once a month and does not involve the actual content of any Web pages or individual queries. Then, for any particular query, Google finds the pages on the Web that match that query and lists those pages in the order of their PageRank. Imagine surfing the Web, going from page to page by randomly choosing an outgoing link from one page to get to the next. This can lead to dead ends at pages with no outgoing links, or cycles around cliques of interconnected pages. So, a certain fraction of the time, simply choose a random page from the Web. This theoretical random walk is known as a Markov chain or Markov process. The limiting probability that an infinitely dedicated random surfer visits any particular page is its PageRank. A page has high rank if other pages with high rank link to it. Let W be the set of Web pages that can be reached by following a chain of hyperlinks starting at some root page, and let n be the number of pages in W . For Google, the set W actually varies with time, In June 2004, their value of n was over 4 billion. Today, Google does not reveal how many pages they reach. Let G be the n-by-n connectivity matrix of a portion of the Web, that is, gij = 1 if there is a hyperlink to page i from page j and gij = 0 otherwise. The matrix G can be huge, but it is very sparse. Its jth column shows the links on the jth page. The number of nonzeros in G is the total number of hyperlinks in W . Let ri and cj be the row and column sums of G: X X ri = gij , cj = gij . j The quantities rj and cj are the in-degree and out-degree of the jth page. Let p be the probability that the random walk follows a link. A typical value is p = 0.85. 24 ## Chapter 2. Linear Equations Then 1 p is the probability that some arbitrary page is chosen and = (1 p)/n is the probability that a particular random page is chosen. Let A be the n-by-n matrix whose elements are pgij /cj + : cj 6= 0 aij = 1/n : cj = 0. Notice that A comes from scaling the connectivity matrix by its column sums. The jth column is the probability of jumping from the jth page to the other pages on the Web. If the jth page is a dead end, that is has no out-links, then we assign a uniform probability of 1/n to all the elements in its column. Most of the elements of A are equal to , the probability of jumping from one page to another without following a link. If n = 4 109 and p = 0.85, then = 3.75 1011 . The matrix A is the transition probability matrix of the Markov chain. Its elements are all strictly between zero and one and its column sums are all equal to one. An important result in matrix theory known as the PerronFrobenius theorem applies to such matrices. It concludes that a nonzero solution of the equation x = Ax exists and is unique to within a scaling factor. If this scaling factor is chosen so that X xi = 1, i then x is the state vector of the Markov chain and is Googles PageRank. The elements of x are all positive and less than one. The vector x is the solution to the singular, homogeneous linear system (I A)x = 0. For modest n, an easy way to compute x in Matlab is to start with some approximate solution, such as the PageRanks from the previous month, or x = ones(n,1)/n Then simply repeat the assignment statement x = A*x until successive vectors agree to within a specified tolerance. This is known as the power method and is about the only possible approach for very large n. In practice, the matrices G and A are never actually formed. One step of the power method would be done by one pass over a database of Web pages, updating weighted reference counts generated by the hyperlinks between pages. The best way to compute PageRank in Matlab is to take advantage of the particular structure of the Markov matrix. Here is an approach that preserves the sparsity of G. The transition matrix can be written A = pGD + ez T ## 2.11. PageRank and Markov Chains 25 where D is the diagonal matrix formed from the reciprocals of the outdegrees, 1/cj : cj = 6 0 djj = 0 : cj = 0, e is the n-vector of all ones, and z is the vector with components : cj 6= 0 zj = 1/n : cj = 0. The rank-one matrix ez T accounts for the random choices of Web pages that do not follow links. The equation x = Ax can be written (I pGD)x = e where = z T x. We do not know the value of because it depends upon the unknown vector x, but we can temporarily take = 1. As long as p is strictly less than one, the coefficient matrix I pGD is nonsingular and the equation (I pGD)x = e can be solved for x. Then the resulting x can be rescaled so that X xi = 1. i ## Notice that the vector z is not actually involved in this calculation. The following Matlab statements implement this approach c k D e I x x = = = = = = = sum(G,1); find(c~=0); sparse(k,k,1./c(k),n,n); ones(n,1); speye(n,n); (I - p*G*D)\e; x/sum(x); The power method can also be implemented in a way that does not actually form the Markov matrix and so preserves sparsity. Compute G = p*G*D; z = ((1-p)*(c~=0) + (c==0))/n; Start with x = e/n 26 ## Chapter 2. Linear Equations ## Then repeat the statement x = G*x + e*(z*x) until x settles down to several decimal places. It is also possible to use an algorithm known as inverse iteration. A = p*G*D + delta x = (I - A)\e x = x/sum(x) At first glance, this appears to be a very dangerous idea. Because I A is theoretically singular, with exact computation some diagonal element of the upper triangular factor of I A should be zero and this computation should fail. But with roundoff error, the computed matrix I - A is probably not exactly singular. Even if it is singular, roundoff during Gaussian elimination will most likely prevent any exact zero diagonal elements. We know that Gaussian elimination with partial pivoting always produces a solution with a small residual, relative to the computed solution, even if the matrix is badly conditioned. The vector obtained with the backslash operation, (I - A)\e, usually has very large components. If it is rescaled by its sum, the residual is scaled by the same factor and becomes very small. Consequently, the two vectors x and A*x equal each other to within roundoff error. In this setting, solving the singular system with Gaussian elimination blows up, but it blows up in exactly the right direction. delta alpha beta gamma sigma rho ## Figure 2.2. A tiny Web. Figure 2.2 is the graph for a tiny example, with n = 6 instead of n = 4 109 . Pages on the Web are identified by strings known as uniform resource locators, or URLs. Most URLs begin with http because they use the hypertext transfer protocol. In Matlab , we can store the URLs as an array of strings in a cell array. This example involves a 6-by-1 cell array. ## 2.11. PageRank and Markov Chains 27 U = {http://www.alpha.com http://www.beta.com http://www.gamma.com http://www.delta.com http://www.rho.com http://www.sigma.com} Two different kinds of indexing into cell arrays are possible. Parentheses denote subarrays, including individual cells, and curly braces denote the contents of the cells. If k is a scalar, then U(k) is a 1-by-1 cell array consisting of the kth cell in U, while U{k} is the string in that cell. Thus U(1) is a single cell and U{1} is the string http://www.alpha.com. Think of mail boxes with addresses on a city street. B(502) is the box at number 502, while B{502} is the mail in that box. We can generate the connectivity matrix by specifying the pairs of indices (i,j) of the nonzero elements. Because there is a link to beta.com from alpha.com, the (2,1) element of G is nonzero. The nine connections are described by i = [ 2 6 3 4 4 5 6 1 1] j = [ 1 1 2 2 3 3 3 4 6] A sparse matrix is stored in a data structure that requires memory only for the nonzero elements and their indices. This is hardly necessary for a 6-by-6 matrix with only 27 zero entries, but it becomes crucially important for larger problems. The statements n = 6 G = sparse(i,j,1,n,n); full(G) generate the sparse representation of an n-by-n matrix with ones in the positions specified by the vectors i and j and display its full representation. 0 1 0 0 0 1 0 0 1 1 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 The statement c = full(sum(G)) computes the column sums c = 2 Notice that c(5) = 0 because the 5th page, labeled rho, has no out-links. The statements 28 ## Chapter 2. Linear Equations x = (I - p*G*D)\e x = x/sum(x) ## solve the sparse linear system to produce x = 0.3210 0.1705 0.1066 0.1368 0.0643 0.2007 Page Rank 0.35 0.3 0.25 0.2 0.15 0.1 0.05 ## Figure 2.3. Page Rank for the tiny Web The bar graph of x is shown in figure 2.3. If the URLs are sorted in PageRank order and listed along with their in- and out-degrees, the result is 1 6 2 4 3 5 page-rank 0.3210 0.2007 0.1705 0.1368 0.1066 0.0643 in 2 2 1 2 1 1 out 2 1 2 1 3 0 url http://www.alpha.com http://www.sigma.com http://www.beta.com http://www.delta.com http://www.gamma.com http://www.rho.com We see that alpha has a higher PageRank than delta or sigma, even though they all have the same number of in-links. A random surfer will visit alpha over 32% of the time and rho only about 6% of the time. For this tiny example with p = .85, the smallest element of the Markov transition matrix is = .15/6 = .0250. ## 2.11. PageRank and Markov Chains A = 0.0250 0.4500 0.0250 0.0250 0.0250 0.4500 0.0250 0.0250 0.4500 0.4500 0.0250 0.0250 0.0250 0.0250 0.0250 0.3083 0.3083 0.3083 29 0.8750 0.0250 0.0250 0.0250 0.0250 0.0250 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667 0.8750 0.0250 0.0250 0.0250 0.0250 0.0250 ## Notice that the column sums of A are all equal to one. Our collection of NCM programs includes surfer.m. A statement like [U,G] = surfer(http://www.xxx.zzz,n) starts at a specified URL and tries to surf the Web until it has visited n pages. If successful, it returns an n-by-1 cell array of URLs and an n-by-n sparse connectivity matrix. The function uses urlread, which was introduced in Matlab 6.5, along with underlying Java utilities to access the Web. Surfing the Web automatically is a dangerous undertaking and this function must be used with care. Some URLs contain typographical errors and illegal characters. There is a list of URLs to avoid that includes .gif files and Web sites known to cause difficulties. Most importantly, surfer can get completely bogged down trying to read a page from a site that appears to be responding, but that never delivers the complete page. When this happens, it may be necessary to have the computers operating system ruthlessly terminate Matlab. With these precautions in mind, you can use surfer to generate your own PageRank examples. 0 50 100 150 200 250 300 350 400 450 500 0 100 200 300 nz = 2636 400 500 ## Figure 2.4. Spy plot of the harvard500 graph. 30 ## Chapter 2. Linear Equations The statement [U,G] = surfer(http://www.harvard.edu,500) accesses the home page of Harvard University and generates a 500-by-500 test case. The graph generated in August 2003 is available in the NCM directory. The statements load harvard500 spy(G) produce a spy plot (Figure 2.4) that shows the nonzero structure of the connectivity matrix. The statement pagerank(U,G) computes page ranks, produces a bar graph (Figure 2.5) of the ranks, and prints the most highly ranked URLs in PageRank order. For the harvard500 data, the dozen most highly ranked pages are 1 10 42 page-rank in 0.0843 195 0.0167 21 0.0166 42 out 26 18 0 130 18 15 9 17 46 13 260 0.0163 0.0139 0.0131 0.0114 0.0111 0.0100 0.0086 0.0086 24 45 16 21 13 18 9 26 12 46 49 27 6 21 1 1 19 0.0084 23 21 url http://www.harvard.edu http://www.hbs.edu http://search.harvard.edu:8765/ custom/query.html http://www.med.harvard.edu http://www.gse.harvard.edu http://www.hms.harvard.edu http://www.ksg.harvard.edu http://www.hsph.harvard.edu http://www.gocrimson.com http://www.hsdm.med.harvard.edu http://search.harvard.edu:8765/ query.html http://www.radcliffe.edu The URL where the search began, www.harvard.edu, dominates. Like most universities, Harvard is organized into various colleges and institutes, including the Kennedy School of Government, the Harvard Medical School, the Harvard Business School, and the Radcliffe Institute. You can see that the home pages of these schools have high PageRank. With a different sample, such as the one generated by Google itself, the ranks would be different. 2.12 Further Reading Further reading on matrix computation includes books by Demmel , Golub and Van Loan , Stewart [4, 5], and Trefethen and Bau . The definitive references on Fortran matrix computation software are the LAPACK Users Guide and Web site . The Matlab sparse matrix data structure and operations are described Exercises 31 Page Rank 0.02 0.018 0.016 0.014 0.012 0.01 0.008 0.006 0.004 0.002 0 50 100 150 200 250 300 350 400 450 500 ## Figure 2.5. PageRank of the harvard500 graph. in . Information available on Web sites about PageRank includes a brief explanation at Google , a technical report by Page, Brin, and colleagues , and a comprehensive survey by Langville and Meyer . Exercises 2.1. Alice buys three apples, a dozen bananas, and one cantaloupe for$2.36. Bob
buys a dozen apples and two cantaloupes for $5.26. Carol buys two bananas and three cantaloupes for$2.77. How much do single pieces of each fruit
cost? (You might want to set format bank.)
2.2. What Matlab function computes the reduced row echelon form of a matrix? What Matlab function generates magic square matrices? What is the
reduced row echelon form of the magic square of order six?
2.3. Figure 2.6 depicts a plane truss having 13 members (the numbered lines)
connecting 8 joints (the numbered circles). The indicated loads, in tons, are
applied at joints 2, 5, and 6, and we want to determine the resulting force on
each member of the truss.
For the truss to be in static equilibrium, there must be no net force, horizontally or vertically, at any joint. Thus, we can determine the member
forces by equating the horizontal forces to the left and right at each joint,
and similarly equating the vertical forces upward and downward at each joint.
For the eight joints, this would give 16 equations, which is more than the 13
unknown factors to be determined. For the truss to be statically determinate, that is, for there to be a unique solution, we assume that joint 1 is
rigidly fixed both horizontally and vertically and that joint 8 is fixed verti-

32

10

15

10

11

12

13

20

## Figure 2.6. A plane truss.

cally. Resolving the
member forces into horizontal and vertical components
and defining = 1/ 2, we obtain the following system of equations for the
member forces fi :
Joint 2:
Joint 3:
Joint 4:
Joint 5:
Joint 6:
Joint 7:
Joint 8:

f2 = f6 ,
f3 = 10;
f1 = f4 + f5 ,
f1 + f3 + f5 = 0;
f4 = f8 ,
f7 = 0;
f5 + f6 = f9 + f10 ,
f5 + f7 + f9 = 15;
f10 = f13 ,
f11 = 20;
f8 + f9 = f12 ,
f9 + f11 + f12 = 0;
f13 + f12 = 0.

## Solve this system of equations to find the vector f of member forces.

2.4. Figure 2.7 is the circuit diagram for a small network of resistors.
There are five nodes, eight resistors, and one constant voltage source. We
want to compute the voltage drops between the nodes and the currents around
each of the loops.
Several different linear systems of equations can be formed to describe this
circuit. Let vk , k = 1, . . . , 4, denote the voltage difference between each of
the first four nodes and node number 5 and let ik , k = 1, . . . , 4, denote the
clockwise current around each of the loops in the diagram. Ohms law says
that the voltage drop across a resistor is the resistance times the current. For

Exercises

33
3

r23
2

r13
i2
r12

r34
i3
1
i1

r25

r14

vs

i4

r45

r35

## Figure 2.7. A resistor network.

example, the branch between nodes 1 and 2 gives
v1 v2 = r12 (i2 i1 ).
Using the conductance, which is the reciprocal of the resistance, gkj = 1/rkj ,
Ohms law becomes
i2 i1 = g12 (v1 v2 ).
The voltage source is included in the equation
v3 vs = r35 i4 .
Kirchhoff s voltage law says that the sum of the voltage differences around
each loop is zero. For example, around loop 1,
(v1 v4 ) + (v4 v5 ) + (v5 v2 ) + (v2 v1 ) = 0.
Combining the voltage law with Ohms law leads to the loop equations for
the currents:
Ri = b.
Here i is the current vector,

i1
i
i = 2,
i3
i4
b is the source voltage vector,

0
0
b = ,
0
vs

34

## Chapter 2. Linear Equations

and R is the resistance matrix,

## r25 + r12 + r14 + r45

r12

r14
r45

r12
r23 + r12 + r13
r13
0

r14
r13
r14 + r13 + r34
r34

r45
0

.
r34
r35 + r45 + r34

Kirchhoff s current law says that the sum of the currents at each node is zero.
For example, at node 1,
(i1 i2 ) + (i2 i3 ) + (i3 i1 ) = 0.
Combining the current law with the conductance version of Ohms law leads
to the nodal equations for the voltages:
Gv = c.
Here v is the voltage vector,

v1
v
v = 2 ,
v3
v4
c is the source current vector,

0
0
c=
,
g35 vs
0

## g12 + g13 + g14

g12

g13
g14

g12
g12 + g23 + g25
g23
0

g13
g23
g13 + g23 + g34 + g35
g34

g14
0

.
g34
g14 + g34 + g45

You can solve the linear system obtained from the loop equations to compute
the currents and then use Ohms law to recover the voltages. Or you can solve
the linear system obtained from the node equations to compute the voltages
and then use Ohms law to recover the currents. Your assignment is to verify
that these two approaches produce the same results for this circuit. You can
choose your own numerical values for the resistances and the voltage source.
2.5. The Cholesky algorithm factors an important class of matrices known as
positive definite matrices. Andre-Louis Cholesky (18751918) was a French
military officer involved in geodesy and surveying in Crete and North Africa
just before World War I. He developed the method now named after him to
compute solutions to the normal equations for some least squares data-fitting

Exercises

35

## problems arising in geodesy. His work was posthumously published on his

behalf in 1924 by a fellow officer, Benoit, in the Bulletin Geodesique.
A real symmetric matrix A = AT is positive definite if any of the following
equivalent conditions hold:

xT Ax

## is positive for all nonzero vectors x.

All determinants formed from symmetric submatrices of any order centered on the diagonal of A are positive.
All eigenvalues (A) are positive.
There is a real matrix R such that
A = RT R.
These conditions are difficult or expensive to use as the basis for checking if
a particular matrix is positive definite. In Matlab , the best way to check
positive definiteness is with the chol function. See
help chol
(a) Which of the following families of matrices are positive definite?
M
H
P
I
R
R
R
R

=
=
=
=
=
=
=
=

magic(n)
hilb(n)
pascal(n)
eye(n,n)
randn(n,n)
randn(n,n); A = R * R
randn(n,n); A = R + R
randn(n,n); I = eye(n,n); A = R + R + n*I

## (b) If the matrix R is upper triangular, then equating individual elements in

the equation A = RT R gives
akj =

k
X

rik rij , k j.

i=1

Using these equations in different orders yields different variants of the Cholesky
algorithm for computing the elements of R. What is one such algorithm?
2.6. This example shows that a badly conditioned matrix does not necessarily
lead to small pivots in Gaussian elimination. The matrix is the n-by-n upper
triangular matrix A with elements

1, i < j,
1, i = j,
aij =

0, i > j.

36

## Chapter 2. Linear Equations

Show how to generate this matrix in Matlab with eye, ones, and triu.
Show that
1 (A) = n2n1 .

## For what n does 1 (A) exceed 1/eps?

This matrix is not singular, so Ax cannot be zero unless x is zero. However,
there are vectors x for which kAxk is much smaller than kxk. Find one such
x.
Because this matrix is already upper triangular, Gaussian elimination with
partial pivoting has no work to do. What are the pivots?
Use lugui to design a pivot strategy that will produce smaller pivots than
partial pivoting. (Even these pivots do not completely reveal the large condition number.)
2.7. The matrix factorization
LU = P A
can be used to compute the determinant of A. We have
det(L)det(U ) = det(P )det(A).
Because L is triangular with ones on the diagonal, det(L) = 1. Because U is
triangular, det(U ) = u11 u22 unn . Because P is a permutation, det(P ) =
+1 if the number of interchanges is even and 1 if it is odd. So
det(A) = u11 u22 unn .
Modify the lutx function so that it returns four outputs.
function [L,U,p,sig] = lutx(A)
%LU Triangular factorization
% [L,U,p,sig] = lutx(A) computes a unit lower triangular
% matrix L, an upper triangular matrix U, a permutation
% vector p, and a scalar sig, so that L*U = A(p,:) and
% sig = +1 or -1 if p is an even or odd permutation.
Write a function mydet(A) that uses your modified lutx to compute the
determinant of A. In Matlab, the product u11 u22 unn can be computed
by the expression prod(diag(U)).
2.8. Modify the lutx function so that it uses explicit for loops instead of Matlab
vector notation. For example, one section of your modified program will read
% Compute the multipliers
for i = k+1:n
A(i,k) = A(i,k)/A(k,k);
end
Compare the execution time of your modified lutx program with the original lutx program and with the built-in lu function by finding the order of
the matrix for which each of the three programs takes about 10 s on your
computer.

Exercises

37

2.9. Let

1 2
A = 4 5
7 8

3
1
6, b = 3.
9
5

(a) Show that the set of linear equations Ax = b has infinitely many solutions.
Describe the set of possible solutions.
(b) Suppose Gaussian elimination is used to solve Ax = b using exact arithmetic. Because there are infinitely many solutions, it is unreasonable to
expect one particular solution to be computed. What does happen?
(c) Use bslashtx to solve Ax = b on an actual computer with floating-point
arithmetic. What solution is obtained? Why? In what sense is it a good
solution? In what sense is it a bad solution?
(d) Explain why the built-in backslash operator x = A\b gives a different
solution from x = bslashtx(A,b).
2.10. Section 2.4 gives two algorithms for solving triangular systems. One subtracts
columns of the triangular matrix from the right-hand side; the other uses
inner products between the rows of the triangular matrix and the emerging
solution.
(a) Which of these two algorithms does bslashtx use?
(b) Write another function, bslashtx2, that uses the other algorithm.
2.11. The inverse of a matrix A can be defined as the matrix X whose columns xj
solve the equations
Axj = ej ,
where ej is the jth column of the identity matrix.
(a) Starting with the function bslashtx, write a Matlab function
X = myinv(A)
that computes the inverse of A. Your function should call lutx only once and
should not use the built-in Matlab backslash operator or inv function.
(b) Test your function by comparing the inverses it computes with the inverses
obtained from the built-in inv(A) on a few test matrices.
2.12. If the built-in Matlab lu function is called with only two output arguments
[L,U] = lu(A)
the permutations are incorporated into the output matrix L. The help entry
for lu describes L as psychologically lower triangular. Modify lutx so that
it does the same thing. You can use
if nargout == 2, ...
to test the number of output arguments.
2.13. (a) Why is
M = magic(8)
lugui(M)

38

## Chapter 2. Linear Equations

an interesting example?
(b) How is the behavior of lugui(M) related to rank(M)?
(c) Can you pick a sequence of pivots so that no roundoff error occurs in
lugui(M)?
2.14. The pivot selection strategy known as complete pivoting is one of the options
available in lugui. It has some slight numerical advantages over partial
pivoting. At each stage of the elimination, the element of largest magnitude
in the entire unreduced matrix is selected as the pivot. This involves both
row and column interchanges and produces two permutation vectors p and q
so that
L*U = A(p,q)
Modify lutx and bslashtx so that they use complete pivoting.
2.15. The function golub in the NCM directory is named after Stanford professor
Gene Golub. The function generates test matrices with random integer entries. The matrices are very badly conditioned, but Gaussian elimination
without pivoting fails to produce the small pivots that would reveal the large
condition number.
(a) How does condest(golub(n)) grow with increasing order n? Because
these are random matrices you cant be very precise here, but you can give
some qualitative description.
(b) What atypical behavior do you observe with the diagonal pivoting option
in lugui(golub(n))?
(c) What is det(golub(n))? Why?
2.16. The function pascal generates symmetric test matrices based on Pascals
triangle.
(a) How are the elements of pascal(n+1) related to the binomial coefficients
generated by nchoosek(n,k)?
(b) How is chol(pascal(n)) related to pascal(n)?
(c) How does condest(pascal(n)) grow with increasing order n?
(d) What is det(pascal(n))? Why?
(e) Let Q be the matrix generated by
Q = pascal(n);
Q(n,n) = Q(n,n) - 1;
How is chol(Q) related to chol(pascal(n))? Why?
(f) What is det(Q)? Why?
2.17. Play Pivot Pickin Golf with pivotgolf. The goal is to use lugui to
compute the LU decompositions of nine matrices with as little roundoff error
as possible. The score for each hole is
kRk + kL k + kU k ,
where R = LU P AQ is the residual and kL k and kU k are the nonzeros
that should be zero in L and U .

Exercises

39

(a) Can you beat the scores obtained by partial pivoting on any of the courses?
(b) Can you get a perfect score of zero on any of the courses?
2.18. The object of this exercise is to investigate how the condition numbers of
random matrices grow with their order. Let Rn denote an n-by-n matrix with
normally distributed random elements. You should observe experimentally
that there is an exponent p so that
1 (Rn ) = O(np ).
In other words, there are constants c1 and c2 so that most values of 1 (Rn )
satisfy
c1 np 1 (Rn ) c2 np .
Your job is to find p, c1 , and c2 .
The NCM M-file randncond.m is the starting point for your experiments.
This program generates random matrices with normally distributed elements
and plots their l1 condition numbers versus their order on a loglog scale.
The program also plots two lines that are intended to enclose most of the
observations. (On a loglog scale, power laws like = cnp produce straight
lines.)
(a) Modify randncond.m so that the two lines have the same slope and enclose
most of the observations.
(b) Based on this experiment, what is your guess for the exponent p in
(Rn ) = O(np )? How confident are you?
(c) The program uses (erasemode,none), so you cannot print the results. What would you have to change to make printing possible?
2.19. For n = 100, solve this tridiagonal system of equations three different ways:
2x1 x2 = 1,
xj1 + 2xj xj+1 = j, j = 2, . . . , n 1,
xn1 + 2xn = n.
(a) Use diag three times to form the coefficient matrix and then use lutx
and bslashtx to solve the system.
(b) Use spdiags once to form a sparse representation of the coefficient matrix
and then use the backslash operator to solve the system.
(c) Use tridisolve to solve the system.
(d) Use condest to estimate the condition of the coefficient matrix.
2.20. Use surfer and pagerank to compute PageRanks for some subset of the Web
that you choose. Do you see any interesting structure in the results?
2.21. Suppose that U and G are the URL cell array and the connectivity matrix
produced by surfer and that k is an integer. Explain what
U{k}, U(k), G(k,:), G(:,k), U(G(k,:)), U(G(:,k))
are.

40

## Chapter 2. Linear Equations

2.22. The connectivity matrix for the harvard500 data set has four small, almost
entirely nonzero, submatrices that produce dense patches near the diagonal
of the spy plot. You can use the zoom button to find their indices. The
first submatrix has indices around 170 and the other three have indices in
the 200s and 300s. Mathematically, a graph with every node connected to
every other node is known as a clique. Identify the organizations within the
Harvard community that are responsible for these near cliques.
2.23. A Web connectivity matrix G has gij = 1 if it is possible to get to page i
from page j with one click. If you multiply the matrix by itself, the entries
of the matrix G2 count the number of different paths of length two to page i
from page j. The matrix power Gp shows the number of paths of length p.
(a) For the harvard500 data set, find the power p where the number of
nonzeros stops increasing. In other words, for any q greater than p, nnz(G^q)
is equal to nnz(G^p).
(b) What fraction of the entries in Gp are nonzero?
(c) Use subplot and spy to show the nonzeros in the successive powers.
(d) Is there a set of interconnected pages that do not link to the other pages?
2.24. The function surfer uses a subfunction, hashfun, to speed up the search for a
possibly new URL in the list of URLs that have already been processed. Find
that have the same hashfun value.
2.25. Figure 2.8 is the graph of another six-node subset of the Web. In this example,
there are two disjoint subgraphs.
delta

alpha

beta

gamma

sigma

rho

## Figure 2.8. Another tiny Web.

(a) What is the connectivity matrix G?
(b) What are the PageRanks if the hyperlink transition probability p is the
default value 0.85?
(c) Describe what happens with this example to both the definition of PageRank and the computation done by pagerank in the limit p 1.

Exercises

41

## 2.26. The function pagerank(U,G) computes PageRanks by solving a sparse linear

system. It then plots a bar graph and prints the dominant URLs.
(a) Create pagerank1(G) by modifying pagerank so that it just computes
the PageRanks, but does not do any plotting or printing.
(b) Create pagerank2(G) by modifying pagerank1 to use inverse iteration
instead of solving the sparse linear system. The key statements are
x = (I - A)\e
x = x/sum(x)
What should be done in the unlikely event that the backslash operation involves a division by zero?
(c) Create pagerank3(G) by modifying pagerank1 to use the power method
instead of solving the sparse linear system. The key statements are
G = p*G*D
z = ((1-p)*(c~=0) + (c==0))/n;
while termination_test
x = G*x + e*(z*x)
end
What is an appropriate test for terminating the power iteration?
(d) Use your functions to compute the PageRanks of the six-node example
discussed in the text. Make sure you get the correct result from each of your
three functions.
2.27. Here is yet another function for computing PageRank. This version uses
the power method, but does not do any matrix operations. Only the link
structure of the connectivity matrix is involved.
function [x,cnt] = pagerankpow(G)
% PAGERANKPOW PageRank by power method.
% x = pagerankpow(G) is the PageRank of the graph G.
% [x,cnt] = pagerankpow(G)
%
counts the number of iterations.
[n,n] =
for j =
L{j}
c(j)
end

size(G);
1:n
= find(G(:,j));
= length(L{j});

% Power method
p = .85;
delta = (1-p)/n;
x = ones(n,1)/n;

42

## Chapter 2. Linear Equations

z = zeros(n,1);
cnt = 0;
while max(abs(x-z)) > .0001
z = x;
x = zeros(n,1);
for j = 1:n
if c(j) == 0
x = x + z(j)/n;
else
x(L{j}) = x(L{j}) + z(j)/c(j);
end
end
x = p*x + delta;
cnt = cnt+1;
end
(a) How do the storage requirements and execution time of this function
compare with the three pagerank functions from the previous exercise?
(b) Use this function as a template to write a function that computes PageRank in some other programming language.

Bibliography
 E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J.
Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney,
and D. Sorensen, LAPACK Users Guide, Third Edition, SIAM, Philadelphia,
1999.
http://www.netlib.org/lapack
 J. W. Demmel, Applied Numerical Linear Algebra, SIAM, Philadelphia, 1997.
 G. H. Golub and C. F. Van Loan, Matrix Computations, Third Edition,
The Johns Hopkins University Press, Baltimore, 1996.
 G. W. Stewart, Introduction to Matrix Computations, Academic Press, New
York, 1973.
 G. W. Stewart, Matrix Algorithms: Basic Decompositions, SIAM, Philadelphia, 1998.
 L. N. Trefethen and D. Bau, III, Numerical Linear Algebra, SIAM,
 J. R. Gilbert, C. Moler, and R. Schreiber, Sparse matrices in MATLAB:
Design and implementation, SIAM Journal on Matrix Analysis and Applications, 13 (1992), pp. 333356.
 N. J. Higham, and F. Tisseur, A block algorithm for matrix 1-norm estimation, with an application to 1-norm pseudospectra, SIAM Journal on Matrix
Analysis and Applications, 21 (2000), pp. 11851201.
 A. Langville, and C. Meyer, Deeper Inside PageRank,
http://meyer.math.ncsu.edu/Meyer/PS_Files/DeeperInsidePR.pdf
 L. Page, S. Brin, R. Motwani, and T. Winograd, The PageRank Citation Ranking: Bringing Order to the Web.
http://dbpubs.stanford.edu:8090/pub/1999-66

43

## Solution of non-linear equations

By Gilberto E. Urroz, September 2004

In this document I present methods for the solution of single non-linear equations as well
as for systems of such equations.
Solution of a single non-linear equation

Equations that can be cast in the form of a polynomial are referred to as algebraic
equations. Equations involving more complicated terms, such as trigonometric,
hyperbolic, exponential, or logarithmic functions are referred to as transcendental
equations. The methods presented in this section are numerical methods that can be
applied to the solution of such equations, to which we will refer, in general, as non-linear
equations. In general, we will we searching for one, or more, solutions to the equation,
f(x) = 0.
We will present the Newton-Raphson algorithm, and the secant method. In the secant
method we need to provide two initial values of x to get the algorithm started. In the
Newton-Raphson methods only one initial value is required.
Because the solution is not exact, the algorithms for any of the methods presented herein
will not provide the exact solution to the equation f(x) = 0, instead, we will stop the
algorithm when the equation is satisfied within an allowed tolerance or error, . In
mathematical terms this is expressed as
|f(xR)| < .
The value of x for which the non-linear equation f(x)=0 is satisfied, i.e., x = xR, will be
the solution, or root, to the equation within an error of units.

## The Newton-Raphson method

Consider the Taylor-series expansion of the function f(x) about a value x = xo:
f(x)= f(xo)+f'(xo)(x-xo)+(f"(xo)/2!)(x-xo)2+.
Using only the first two terms of the expansion, a first approximation to the root of the
equation
f(x) = 0
can be obtained from
f(x) = 0 f(xo)+f'(xo)(x1 -xo)

## Such approximation is given by,

x1 = xo - f(xo)/f'(xo).
The Newton-Raphson method consists in obtaining improved values of the approximate
root through the recurrent application of equation above. For example, the second and
third approximations to that root will be given by
and

x2 = x1 - f(x1)/f'(x1),
x3= x2 - f(x2)/f'(x2),

respectively.
This iterative procedure can be generalized by writing the following equation, where i
represents the iteration number:
xi+1 = xi - f(xi)/f'(xi).
After each iteration the program should check to see if the convergence condition,
namely,
|f(x i+1)|<,
is satisfied.
The figure below illustrates the way in which the solution is found by using the NewtonRaphson method. Notice that the equation f(x) = 0 f(xo)+f'(xo)(x1 -xo) represents a
straight line tangent to the curve y = f(x) at x = xo. This line intersects the x-axis (i.e., y =
f(x) = 0) at the point x1 as given by x1 = xo - f(xo)/f'(xo). At that point we can construct
another straight line tangent to y = f(x) whose intersection with the x-axis is the new
approximation to the root of f(x) = 0, namely, x = x2. Proceeding with the iteration we
can see that the intersection of consecutive tangent lines with the x-axis approaches the
actual root relatively fast.

The Newton-Raphson method converges relatively fast for most functions regardless of
the initial value chosen. The main disadvantage is that you need to know not only the
function f(x), but also its derivative, f'(x), in order to achieve a solution. The secant
method, discussed in the following section, utilizes an approximation to the derivative,
thus obviating such requirement.
The programming algorithm of any of these methods must include the option of stopping
the program if the number of iterations grows too large. How large is large? That will
depend of the particular problem solved. However, any Newton-Raphson, or secant
method solution that takes more than 1000 iterations to converge is either ill-posed or
contains a logical error. Debugging of the program will be called for at this point by
changing the initial values provided to the program, or by checking the program's logic.
A MATLAB function for the Newton-Raphson method

The function newton, listed below, implements the Newton-Raphson algorithm. It uses
as arguments an initial value and expressions for f(x) and f'(x).
function [x,iter]=newton(x0,f,fp)
% newton-raphson algorithm
N = 100; eps = 1.e-5; % define max. no. iterations and error
maxval = 10000.0;
% define value for divergence
xx = x0;
while (N>0)
xn = xx-f(xx)/fp(xx);
if abs(f(xn))<eps
x=xn;iter=100-N;
return;
end;
if abs(f(xx))>maxval
disp(['iterations = ',num2str(iter)]);
error('Solution diverges');
break;
end;
N = N - 1;
xx = xn;
end;
error('No convergence');
break;
% end function

We will use the Newton-Raphson method to solve for the equation, f(x) = x3-2x2+1 = 0.
The following MATLAB commands define the function f001(x) and its derivative,
f01p(x):
f001 = inline('x.^3-2*x.^2+1','x')
f001 =
Inline function:
f001(x) = x.^3-2*x.^2+1

f01p = inline('3*x.^2-2','x')
f01p =
Inline function:
f01p(x) = 3*x.^2-2

To have an idea of the location of the roots of this polynomial we'll plot the function
using the following MATLAB commands:
x = [-0.8:0.01:2.0]';y=f001(x);
plot(x,y);xlabel('x');ylabel('f001(x)');
grid on
1

f001(x)

0.5

-0.5

-1
-1

-0.5

0.5
x

1.5

We see that the function graph crosses the x-axis somewhere between 1.0 and 0.5,
close to 1.0, and between 1.5 and 2.0. To activate the function we could use, for
example, an initial value x0 = 2:
[x,iterations] = newton(2,f001,f01p)
x =

1.6180

iterations = 39

The following command are aimed at obtaining vectors of the solutions provided by
function newton.m for f001(x)=0 for initial values in the vector x0 such that 20 < x0 <
20. The solutions found are stored in variable xs while the required number of iterations
is stored in variable is.
x0 = [-20:0.5:20]; xs = []; is = [];
EDU for i = 1:length(x0)
[xx,ii] = newton(x0(i),f001,f01p);
xs = [xs,xx]; is = [is,ii];
end

## Plot of xs vs. x0, and of is vs. x0 are shown next:

4

figure(1);plot(x0,xs,'o');hold;plot(x0,xs,'-');hold;
title('Newton-Raphson solution');
xlabel('initial guess, x0');ylabel('solution, xs');

Newton-Raphson solution
1.8
1.7

solution, xs

1.6
1.5
1.4
1.3
1.2
1.1
1
-20

-15

-10

-5

0
5
initial guess, x0

10

15

20

This figure shows that for initial guesses in the range 20 < xo<20, function newton.m
converges mainly to the solution x = 1.6180, with few instances converting to the
solutions x = -1 and x = 1.
figure(2);plot(x0,is,'+'); hold;plot(x0,is,'-');hold;
title('Newton-Raphson solution');
xlabel('initial guess, x0');ylabel('iterations, is');
Newton-Raphson solution
70
60

iterations, is

50
40
30
20
10
0
-20

-15

-10

-5

0
5
initial guess, x0

10

15

20

This figure shows that the number of iterations required to achieve a solution ranges from
0 to about 65. Most of the time, about 50 iterations are required. The following example
shows a case in which the solution actually diverges:

[x,iter] = newton(-20.75,f001,f01p)
iterations = 6
??? Error using ==> newton
Solution diverges

NOTE: If the function of interest is defined by an m-file, the reference to the function
name in the call to newton.m should be placed between quotes.

## The Secant Method

In the secant method, we replace the derivative f'(xi) in the Newton-Raphson method with
f'(xi) (f(xi) - f(xi-1))/(xi - xi-1).
With this replacement, the Newton-Raphson algorithm becomes
xi +1 = xi

f ( xi )
( xi xi 1 ).
f ( xi ) f ( xi 1 )

To get the method started we need two values of x, say xo and x1, to get the first
approximation to the solution, namely,
f ( x1 )
( x1 x 0 ).
x 2 = x1
f ( x1 ) f ( xo )
As with the Newton-Raphson method, the iteration is stopped when
|f(x i+1)|<.
Figure 4, below, illustrates the way that the secant method approximates the solution of
the equation f(x) = 0.

## A MATLAB function for the secant method

The function secant, listed below, uses the secant method to solve for non-linear
equations. It requires two initial values and an expression for the function, f(x).
function [x,iter]=secant(x0,x00,f)
% newton-raphson algorithm
N = 100; eps = 1.e-5; % define max. no. iterations and error
maxval = 10000.0;
% define value for divergence
xx1 = x0; xx2 = x00;
while N>0
gp = (f(xx2)-f(xx1))/(xx2-xx1);
xn = xx1-f(xx1)/gp;
if abs(f(xn))<eps
x=xn;
iter = 100-N;
return;
end;
if abs(f(xn))>maxval
iter=100-N;
disp(['iterations = ',num2str(iter)]);
error('Solution diverges');
abort;
end;
N = N - 1;
xx1 = xx2;
xx2 = xn;
end;
iter=100-N;
disp(['iterations = ',iter]);
error('No convergence');
abort;
% end function

## Application of secant method

We use the same function f001(x) = 0 presented earlier. The following commands call
the function secant.txt to obtain a solution to the equation:
[x,iter] = secant(-10.0,-9.8,f001)
x = -0.6180
iter = 11

The following command are aimed at obtaining vectors of the solutions provided by
function Newton.m for f001(x)=0 for initial values in the vector x0 such that 20 < x0 <
20. The solutions found are stored in variable xs while the required number of iterations
is stored in variable is.
7

## x0 = [-20:0.5:20]; x00 = x0 + 0.5; xs = []; is = [];

for i = 1:length(x0)
[xx,ii] = secant(x0(i),x00(i),f001);
xs = [xs, xx]; is = [is, ii];
end

## Plot of xs vs. x0, and of is vs. x0 are shown next:

figure(1);plot(x0,xs,'o');hold;plot(x0,xs,'-');hold;
title('Secant method solution');
xlabel('first initial guess, x0');ylabel('solution, xs');
Secant method solution
2
1.5

solution, xs

1
0.5
0
-0.5
-1
-20

-15

-10

-5
0
5
first initial guess, x0

10

15

20

This figure shows that for initial guesses in the range 20 < xo<20, function newton.m
converges to the three solutions. Notice that initial guesses in the range 20 < x0 < -1,
converge to x = - 0.6180; those in the range 1< x < 1, converge to x = 1; and, those in
the range 1<x<20, converge to x =1.6180.
figure(2);plot(x0,is,'o');hold;plot(x0,is,'-');hold;
xlabel('first initial guess, x0');ylabel('iterations, is');
title('Secant method solution');

## Secant method solution

15

iterations, is

10

0
-20

-15

-10

-5
0
5
first initial guess, x0

10

15

20

This figure shows that the number of iterations required to achieve a solution ranges
from 0 to about 15. Notice also that, the closer the initial guess is to zero, the less
number of iterations to convergence are required.
NOTE: If the function of interest is defined by an m-file, the reference to the function
name in the call to newton.m should be placed between quotes.
Solving systems of non-linear equations
Consider the solution to a system of n non-linear equations in n unknowns given by
f1(x1,x2,,xn) = 0
f2(x1,x2,,xn) = 0
.
.
.
fn(x1,x2,,xn) = 0
The system can be written in a single expression using vectors, i.e.,
f(x) = 0,
where the vector x contains the independent variables, and the vector f contains the
functions fi(x):
x1
f1 ( x1 , x 2 ,..., x n ) f1 (x)
x
f ( x , x ,..., x ) f (x)
n
x = 2 , f ( x) = 2 1 2
= 2 .
M

M
M

xn
f n ( x1 , x 2 ,..., x n ) f n (x)
Newton-Raphson method to solve systems of non-linear equations

A Newton-Raphson method for solving the system of linear equations requires the
evaluation of a matrix, known as the Jacobian of the system, which is defined as:
f1 / x1

( f 1 , f 2 ,..., f n ) f 2 / x1
J=
=
( x1 , x 2 ,..., x n ) M

f n / x1

f 1 / x 2
f 2 / x 2
M
f n / x 2

L f1 / x n
L f 2 / x n
f
= [ i ] nn .
x j
O
M

L f n / x n

If x = x0 (a vector) represents the first guess for the solution, successive approximations
to the solution are obtained from
xn+1 = xn - J-1f(xn) = xn - xn,
with xn = xn+1 - xn.

A convergence criterion for the solution of a system of non-linear equation could be, for
example, that the maximum of the absolute values of the functions fi(xn) is smaller than a
certain tolerance , i.e.,
max | f i (x n ) |< .
i

Another possibility for convergence is that the magnitude of the vector f(xn) be smaller
than the tolerance, i.e.,
|f(xn)| < .
We can also use as convergence criteria the difference between consecutive values of the
solution, i.e.,
max | ( xi ) n +1 ( xi ) n | < .,
i

or,
|xn | = |xn+1 - xn| < .
The main complication with using Newton-Raphson to solve a system of non-linear
equations is having to define all the functions fi/xj, for i,j = 1,2, , n, included in the
Jacobian. As the number of equations and unknowns, n, increases, so does the number
of elements in the Jacobian, n2.
MATLAB function for Newton-Raphson method for a system of non-linear equations

The following MATLAB function, newtonm, calculates the solution to a system of n nonlinear equations, f(x) = 0, given the vector of functions f and the Jacobian J, as well as an
initial guess for the solution x0.
function [x,iter] = newtonm(x0,f,J)
%
%
%
%
%
%

## Newton-Raphson method applied to a

system of linear equations f(x) = 0,
given the jacobian function J, with
J = del(f1,f2,...,fn)/del(x1,x2,...,xn)
x = [x1;x2;...;xn], f = [f1;f2;...;fn]
x0 is an initial guess of the solution

N = 100;
% define max. number of iterations
epsilon = 1e-10;
% define tolerance
maxval = 10000.0; % define value for divergence
xx = x0;
while (N>0)
JJ = feval(J,xx);
if abs(det(JJ))<epsilon
error('newtonm - Jacobian is singular - try new x0');
abort;
end;
xn = xx - inv(JJ)*feval(f,xx);

10

if abs(feval(f,xn))<epsilon
x=xn;
iter = 100-N;
return;
end;
if abs(feval(f,xx))>maxval
iter = 100-N;
disp(['iterations = ',num2str(iter)]);
error('Solution diverges');
abort;
end;
N = N - 1;
xx = xn;
end;
error('No convergence after 100 iterations.');
abort;
% end function

The functions f and the Jacobian J need to be defined as separate functions. To illustrate
the definition of the functions consider the system of non-linear equations:
f1(x1,x2) = x12 + x22 - 50 = 0,
f2(x1,x2) = x1 x2 - 25 = 0,
whose Jacobian is
f1
x
J= 1
f 2
x1

f1
x 2 2 x1
=
f 2 x 2
x 2

2 x2
.
x1

We can define the function f as the following user-defined MATLAB function f2:
function [f] = f2(x)
% f2(x) = 0, with x = [x(1);x(2)]
% represents a system of 2 non-linear equations
f1 = x(1)^2 + x(2)^2 - 50;
f2 = x(1)*x(2) -25;
f = [f1;f2];
% end function

## The corresponding Jacobian is calculated using the user-defined MATLAB function

jacob2x2:
function [J] = jacob2x2(x)
% Evaluates the Jacobian of a 2x2
% system of non-linear equations
J(1,1) = 2*x(1); J(1,2) = 2*x(2);
J(2,1) = x(2);
J(2,2) = x(1);
% end function

11

## Before using function newtonm, we will perform some step-by-step calculations to

illustrate the algorithm. We start by defining an initial guess for the solution as:
x0 = [2;1]
x0 =
2
1

Lets calculate the function f(x) at x = x0 to see how far we are from a solution:
f2(x0)
ans =
-45
-23

Obviously, the function f(x0) is far away from being zero. Thus, we proceed to calculate
a better approximation by calculating the Jacobian J(x0):
J0 = jacob2x2(x0)
J0 =
4
1

2
2

## The new approximation to the solution, x1, is calculated as:

x1 = x0 - inv(J0)*f2(x0)
x1 =
9.3333
8.8333

## Evaluating the functions at x1 produces:

f2(x1)
ans =

115.1389

Still far away from convergence. Lets calculate a new approximation, x2:
x2 = x1-inv(jacob2x2(x1))*f2(x1)
x2 =
6.0428
5.7928

12

Evaluating the functions at x2 indicates that the values of the functions are decreasing:
f2(x2)
ans =
20.0723
10.0049

## A new approximation and the corresponding function evaluations are:

x3 = x2 - inv(jacob2x2(x2))*f2(x2)
x3 =
5.1337
5.0087
E f2(x3)
ans =
1.4414
0.7129

The functions are getting even smaller suggesting convergence towards a solution.
Solution using function newtonm

## Next, we use function newtonm to solve the problem postulated earlier.

A call to the function using the values of x0, f2, and jacob2x2 is:
[x,iter] = newtonm(x0,'f2','jacob2x2')
x =
5.0000
5.0000
iter =
16

The result shows the number of iterations required for convergence (16) and the solution
found as x1 = 5.0000 and x2 = 5.000. Evaluating the functions for those solutions results
in:
f2(x)
ans =
1.0e-010 *
0.2910
-0.1455

13

The values of the functions are close enough to zero (error in the order of 10-11).
Secant method to solve systems of non-linear equations

In this section we present a method for solving systems of non-linear equations through
the Newton-Raphson algorithm, namely, xn+1 = xn - J-1f(xn), but approximating the
Jacobian through finite differences. This approach is a generalization of the secant
method for a single non-linear equation. For that reason, we refer to the method applied
to a system of non-linear equations as a secant method, although the geometric origin
of the term not longer applies.
The secant method for a system of non-linear equations free us from having to define
the n2 functions necessary to define the Jacobian for a system of n equations. Instead, we
approximate the partial derivatives in the Jacobian with
f i ( x1 , x 2 ,L, x j + x,L, x n ) f i ( x1 , x 2 ,L , x j ,L, x n )
f i

,
x
x j
where x is a small increment in the independent variables. Notice that fi/xj represents
element Jij in the jacobian J = (f1,f2,,fn)/(x1,x2,,xn).
To calculate the Jacobian we proceed by columns, i.e., column j of the Jacobian will be
calculated as shown in the function jacobFD (jacobian calculated through Finite
Differences) listed below:
function [J] = jacobFD(f,x,delx)
% Calculates the Jacobian of the
% system of non-linear equations:
% f(x) = 0, through finite differences.
% The Jacobian is built by columns
[m n] = size(x);
for j = 1:m
xx = x;
xx(j) = x(j) + delx;
J(:,j) = (f(xx)-f(x))/delx;
end;
% end function

Notice that for each column (i.e., each value of j) we define a variable xx which is first
made equal to x, and then the j-th element is incremented by delx, before calculating the
j-th column of the Jacobian, namely, J(:,j). This is the MATLAB implementation of the
finite difference approximation for the Jacobian elements Jij = fi/xj as defined earlier.

14

## Illustrating the secant algorithm for a system of two non-linear equations

To illustrate the application of the secant algorithm we use again the system of two
non-linear equations defined earlier through the function f2.
We choose an initial guess for the solution as x0 = [2;3], and an increment in the
independent variables of x = 0.1:
x0 = [2;3]
x0 =
2
3
EDU dx = 0.1
dx =

0.1000

## Variable J0 will store the Jacobian corresponding to x0 calculated through finite

differences with the value of x defined above:
J0 = jacobFD('f2',x0,dx)
J0 =
4.1000
3.0000

6.1000
2.0000

A new estimate for the solution, namely, x1, is calculated using the Newton-Raphson
algorithm:
x1 = x0 - inv(J0)*f2(x0)
x1 =
6.1485
6.2772

## The finite-difference Jacobian corresponding to x1 gets stored in J1:

J1 = jacobFD('f2',x1,dx)
J1 =
12.3970
6.2772

12.6545
6.1485

## And a new approximation for the solution (x2) is calculated as:

x2 = x1 - inv(J1)*f2(x1)
x2 =
4.6671
5.5784

15

The next two approximations to the solution (x3 and x4) are calculated without first
storing the corresponding finite-difference Jacobians:
x3 = x2 - inv(jacobFD('f2',x2,dx))*f2(x2)
x3 =
4.7676
5.2365
EDU x4 = x3 - inv(jacobFD('f2',x3,dx))*f2(x3)
x4 =
4.8826
5.1175

## To check the value of the functions at x = x4 we use:

f2(x4)
ans =
0.0278
-0.0137

The functions are close to zero, but not yet at an acceptable error (i.e., something in the
order of 10-6). Therefore, we try one more approximation to the solution, i.e., x5:
x5 = x4 - inv(jacobFD('f2',x4,dx))*f2(x4)
x5 =
4.9413
5.0587

The functions are even closer to zero than before, suggesting a convergence to a solution.
f2(x5)
ans =
0.0069
-0.0034

## To make the process of achieving a solution automatic, we propose the following

MATLAB user -defined function, secantm:
function [x,iter] = secantm(x0,dx,f)
% Secant-type method applied to a
% system of linear equations f(x) = 0,

16

%
%
%
%
%

## given the jacobian function J, with

The Jacobian built by columns.
x = [x1;x2;...;xn], f = [f1;f2;...;fn]
x0 is the initial guess of the solution
dx is an increment in x1,x2,... variables

N = 100;
epsilon = 1.0e-10;
maxval = 10000.0;

## % define max. number of iterations

% define tolerance
% define value for divergence

if abs(dx)<epsilon
error('dx = 0, use different values');
break;
end;
xn

= x0;

[n m] = size(x0);
while (N>0)
JJ = [1,2;2,3]; xx = zeros(n,1);
for j = 1:n
% Estimating
xx = xn;
% Jacobian by
xx(j) = xn(j) + dx;
% finite
fxx = feval(f,xx);
fxn = feval(f,xn);
JJ(:,j) = (fxx-fxn)/dx; % differences
end;
% by columns
if abs(det(JJ))<epsilon
error('newtonm - Jacobian is singular - try new x0,dx');
break;
end;
xnp1 = xn - inv(JJ)*fxn;
fnp1 = feval(f,xnp1);
if abs(fnp1)<epsilon
x=xnp1;
disp(['iterations: ', num2str(100-N)]);
return;
end;
if abs(fnp1)>maxval
disp(['iterations: ', num2str(100-N)]);
error('Solution diverges');
break;
end;
N = N - 1;
xn = xnp1;
end;
error('No convergence');
break;
% end function

17

## Solution using function secantm

To solve the system represented by function f2, we now use function secantm. The
following call to function secantm produces a solution after 18 iterations:
[x,iter] = secantm(x0,dx,'f2')
iterations: 18
x =
5.0000
5.0000
iter =
18

## Solving equations with Matlab function fzero

Matlab provides function fzero for the solution of single non-linear equations. Use
help fzero

## to obtain additional information on function fzero. Also, read Chapter 7 (Function

Functions) in the Using Matlab guide.
For the solution of systems of non-linear equations Matlab provides function fsolve as
part of the Optimization package. Since this is an add-on package, function fsolve is not
available in the student version of Matlab. If using the full version of Matlab, check the
help facility for function fsolve by using:
help fsolve

If function fsolve is not available in the Matlab installation you are using, you can always
use function secantm (or newtonm) to solve systems of non-linear equations.

18