
Chapter 2

Numerical Solution of Nonlinear Equations of One Variable

In this chapter, we study methods for finding approximate solutions to the equation f(x) = 0, where f is a real-valued function of a real variable. Some classical examples include the equation x − tan x = 0 that occurs in the diffraction of light, or Kepler's equation x − b sin x = 0 used for calculating planetary orbits. Other examples include transcendental equations such as f(x) = e^x + x = 0 and algebraic equations such as x^7 + 4x^5 − 7x^2 + 6x + 3 = 0.

2.1

Bisection Method

The bisection method is simple, reliable, and can almost always be applied, but it is generally not as fast as other methods. Note that, if y = f(x), then f(x) = 0 corresponds to a point where the curve y = f(x) crosses the x-axis. The bisection method is based on the following direct consequence of the Intermediate Value Theorem.

THEOREM 2.1
Suppose that f ∈ C[a, b] and f(a)f(b) < 0. Then there is a z ∈ [a, b] such that f(z) = 0. (See Figure 2.1.)

FIGURE 2.1: Example for the Intermediate Value Theorem applied to roots of a function.

The method of bisection is simple to implement, as illustrated in the following algorithm.

ALGORITHM 2.1
(The bisection algorithm)

INPUT: An error tolerance ε.

OUTPUT: Either a point x that is within ε of a solution z, or failure to find a sign change.

1. Find a and b such that f(a)f(b) < 0. (By Theorem 2.1, there is a z ∈ [a, b] such that f(z) = 0.) (Return with failure to find a sign change if such an interval cannot be found.)

2. Let a_0 = a, b_0 = b, k = 0.

3. Let x_k = (a_k + b_k)/2.

4. IF f(x_k)f(a_k) > 0 THEN
   (a) a_{k+1} ← x_k,
   (b) b_{k+1} ← b_k.
   ELSE
   (a) b_{k+1} ← x_k,
   (b) a_{k+1} ← a_k.
   END IF

5. IF (b_k − a_k)/2 < ε THEN
   Stop, since x_k is within ε of z. (See the explanation below.)
   ELSE
   (a) k ← k + 1.
   (b) Return to step 3.
   END IF

END ALGORITHM 2.1.

Basically, in the method of bisection, the interval [a_k, b_k] contains z and b_k − a_k = (b_{k−1} − a_{k−1})/2. The interval containing z is reduced by a factor of 2 at each iteration.

Note: In practice, when programming bisection, we usually do not store the numbers a_k and b_k for all k as the iteration progresses. Instead, we usually store just two numbers a and b, replacing them by new values as indicated in Step 4 of our bisection algorithm (Algorithm 2.1).
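Algorithm 2.1, with the storage scheme of the note above, can be sketched in a few lines. The sketch below is in Python for illustration only (`bisect` is our name for it, not a library routine); a full matlab implementation appears later in this section. It is applied here to f(x) = x² − 2 on [1, 2], the same test function used by the matlab program below.

```python
import math

def bisect(f, a, b, eps):
    """Algorithm 2.1: bisection, assuming f(a)*f(b) < 0."""
    if f(a) * f(b) > 0:
        raise ValueError("no sign change on [a, b]")
    while (b - a) / 2 >= eps:
        x = a + (b - a) / 2        # midpoint x_k
        if f(x) * f(a) > 0:
            a = x                  # root lies in [x_k, b_k]
        else:
            b = x                  # root lies in [a_k, x_k]
    return a + (b - a) / 2         # final midpoint is within eps of z

root = bisect(lambda x: x * x - 2, 1.0, 2.0, 1e-8)
```

Note that only the sign of f(a) matters in Step 4, so f(a) need not be recomputed when a is replaced; the sketch above recomputes it for clarity.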

FIGURE 2.2: Graph of e^x + x for Example 2.1.

Example 2.1
f(x) = e^x + x, f(0) = 1, f(−1) = −0.632. Thus, −1 < z < 0. (There is a unique zero, because f′(x) = e^x + 1 > 0 for all x.) Setting a_0 = −1 and b_0 = 0, we obtain the following table of values.

k    a_k        b_k        x_k
0    −1         0          −1/2
1    −1         −1/2       −3/4
2    −3/4       −1/2       −0.625
3    −0.625     −0.500     −0.5625
4    −0.625     −0.5625    −0.59375

Thus, z ∈ (−0.625, −0.5625); see Figure 2.2. The method always works for f continuous, as long as a and b can be found such that f(a)f(b) < 0 (and as long as we assume roundoff error does not cause us to incorrectly evaluate the sign of f(x)). However, consider y = f(x) with f(x) ≥ 0 for every x, but f(z) = 0. Then there are no a and b such that f(a)f(b) < 0. Thus, the method is not applicable to all problems in its present form. (See Figure 2.3 for an example of a root that cannot be found by bisection.) Is there a way that we can know how many iterations to do for the method of bisection without actually performing the test in Step 5 of Algorithm 2.1?


FIGURE 2.3: Example of when the method of bisection cannot be applied.

Simply examining how the widths of the intervals decrease leads us to the following fact.

THEOREM 2.2
Suppose that f ∈ C[a, b] and f(a)f(b) < 0. Then

|x_k − z| ≤ (b − a)/2^{k+1}.

Thus, in the algorithm, if (b_k − a_k)/2 = (b − a)/2^{k+1} < ε, then |z − x_k| < ε.

Example 2.2
How many iterations are required to reduce the error to less than 10^{−6} if a = 0 and b = 1?
Solution: We need (1 − 0)/2^{k+1} < 10^{−6}. Thus, 2^{k+1} > 10^6, or k = 19.

This example illustrates the preferred way of stopping the method of bisection. Namely, if the method of bisection is programmed, it is preferable to compute an integer N such that

N > log((b − a)/ε)/log(2) − 1,

and test k > N, rather than testing the length of the interval directly as in Step 5 of Algorithm 2.1. One reason is that integer comparisons (comparing k to N, or doing so implicitly in a programming language loop, such as the matlab loop for k=1:N) are more efficient than floating point comparisons. Another reason is that, if ε were chosen too small (such as smaller than the distance between machine numbers near the solution z), the comparison in Step 5 of Algorithm 2.1 would never hold in practice, and the algorithm would never stop. The following is an example of programming Algorithm 2.1 in matlab.
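Before the matlab program, here is a quick Python check of this stopping-index formula (`bisection_iterations` is our name, chosen for this sketch); ties in the logarithm are resolved by rounding up, exactly as the ceil in the matlab code below does:

```python
import math

def bisection_iterations(a, b, eps):
    """Smallest integer N with N > log2((b - a)/eps) - 1 (up to ties),
    so that the bisection error bound (b - a)/2**(N+1) is below eps."""
    return math.ceil(math.log((b - a) / eps, 2) - 1)

N = bisection_iterations(0.0, 1.0, 1e-6)   # Example 2.2: N == 19
```

The same formula with a = 1, b = 2, and ε = 10^{−2} gives N = 6, which is the value printed in the matlab dialog shown later in this section.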


function [root,success] = bisect_method(a, b, eps, f)
% [root, success] = bisect_method(a, b, eps, f) returns the
% result of the method of bisection, with starting interval [a, b],
% tolerance eps, and with function defined by y = f(x).  For example,
% suppose an m-file xsqm2.m is available in Matlab's working
% directory, with the following contents:
%    function [y] = xsqm2(x)
%    y = x^2-2;
%    return
% Then, issuing
%    [root,success] = bisect_method(1, 2, 1e-10, 'xsqm2')
% from Matlab's command window will cause an approximation to
% the square root of 2 that, in the absence of excessive roundoff
% error, has absolute error of at most 10^{-10}
% to be returned in the variable root, and success to be set to
% true.
%
% success is set to false if f(a) and f(b) have the same
% sign.  success is also set to false if the tolerance cannot be met.
% In either case, a message is printed, and the midpoint of the present
% interval is returned in the variable root.

error = b - a;
fa = feval(f,a);
fb = feval(f,b);
success = true;

% First, handle incorrect arguments --
if (fa*fb > 0)
   disp('Error: f(a)*f(b) > 0');
   success = false;
   root = a + (b-a)/2;
   return
end
if (eps <= 0)
   disp('Error: eps is less than or equal to 0');
   success = false;
   root = a + (b-a)/2;
   return
end
if (b < a)
   disp('Error: b < a');
   success = false;
   root = (a+b)/2;
   return
end

% Set N to be the smallest integer such that N iterations of bisection

% suffices to meet the tolerance --
N = ceil( log((b-a)/eps)/log(2) - 1 )

% This is where we actually do Algorithm 2.1 --
disp('------------------------------');
disp('    Error        Estimate     ');
disp('------------------------------');
for i=1:N
   x = a + (b-a)/2;
   fx = feval(f,x);
   if (fx*fa > 0)
      a = x;
   else
      b = x;
   end
   error = b - a;
   disp(sprintf('  %12.4e %12.4e', error, x));
end

% Finally, check to see if the tolerance was actually met.  (With
% additional analysis of the minimum possible relative error
% (according to the distance between floating point
% numbers), unreasonable values of epsilon can be determined
% before the loop on i, avoiding unnecessary work.)
error = (b-a)/2;
root = a + (b-a)/2;
if (error > eps)
   disp('Error: epsilon is too small for tolerance to be met');
   success = false;
   return
end

This program includes practical considerations beyond the raw mathematical operations in the algorithm. Observe the following.

1. The comments at the beginning of the program state precisely how the function is used. In fact, within the matlab system, if the file bisect_method.m contains this program within the working directory or within matlab's search path, and one issues the command help bisect_method from the matlab command window, all of these comments (those lines starting with %) prior to the first non-comment line are printed to the command window.

2. There are statements to catch errors in the input arguments.

When developing such computer programs, it is wise to use a uniform style in the comments, indentation of if blocks and for loops, etc. To a large extent, matlab's editor does indentation automatically, and automatically highlights comments and syntax elements such as if and for in different


colors. It is also a good idea to identify the author and date programmed, as well as the package (if any) to which the program belongs. This is done for bisect_method.m in the version posted on the web page for the book, but is not reproduced here, for brevity.

The above implementation, stored in a matlab m-file, is an example of a matlab function, that is, an m-file that begins with a function statement. In such a file, quantities that are to be returned must appear in the list in brackets on the left of the =, while quantities that are input must appear in the list in parentheses on the right of the statement. In a function m-file, the only quantities from the command line that are available while the operations within the m-file are being done are those passed in the parenthesized list on the right, and the only quantities available to the command environment (or other function) from which the function is called are the ones in the bracketed list on the left. For example, consider the following dialog in the matlab command window.
>> eps = 1e-16
eps =
  1.0000e-016
>> [root,success] = bisect_method(1, 2, 1e-2, 'xsqm2')
N =
     6
------------------------------
    Error        Estimate
------------------------------
  5.0000e-001   1.5000e+000
  2.5000e-001   1.2500e+000
  1.2500e-001   1.3750e+000
  6.2500e-002   1.4375e+000
  3.1250e-002   1.4063e+000
  1.5625e-002   1.4219e+000
root =
    1.4141
success =
     1
>> N
??? Undefined function or variable 'N'.
>> eps
eps =
  1.0000e-016
>>

Observe that N is not available within the environment calling bisect_method, and eps is not available within bisect_method. This contrasts with matlab m-files that do not begin with a function statement. These files, termed matlab scripts, consist simply of sequences of commands. For example, the script run_bisect_method.m might contain the following lines.
clear
a = 1
b = 2
eps = 1e-1
[root,success] = bisect_method(a, b, eps, 'xsqm2')

The clear command removes all quantities from the environment. Observe now the following dialog in the matlab command window.
>> clear
>> a
??? Undefined function or variable 'a'.
>> b
??? Undefined function or variable 'b'.
>> run_bisect_method
a =
     1
b =
     2
eps =
    0.1000
N =
     3
------------------------------
    Error        Estimate
------------------------------
  5.0000e-001   1.5000e+000
  2.5000e-001   1.2500e+000
  1.2500e-001   1.3750e+000
root =
    1.4375
success =
     1
>> a
a =
     1
>> b
b =
     2
>> eps
eps =
    0.1000
>>

The reader is invited to use matlab's help system to explore the other aspects of the function bisect_method. We end our present discussion of matlab programs with a note on the use of the symbol =. In statements entered into the command line and in m-files, = means store the computed contents to the right of the = into the variable represented by the symbol on the left. In quantities printed by the matlab system, = means the value stored in the memory locations represented by


the printed symbol is approximately equal to the printed quantity. Note that this is significantly different from the meaning that a mathematician attaches to the symbol. For example, the approximations might not be close enough for our purposes to the intended value, due to roundoff error or other errors, or even due to error in conversion from the internal binary form to the printed decimal form.

2.2

The Fixed Point Method

The so-called fixed point method is a quite general way of viewing computational processes involving equations, systems of equations, and equilibria. We introduce it here, and will see it again when we study systems of linear and nonlinear equations. It is also seen in more advanced studies of systems of differential equations.

DEFINITION 2.1
z ∈ G is a fixed point of g if g(z) = z.

REMARK 2.1
If f(x) = g(x) − x, then a fixed point of g is a zero of f.

The fixed-point iteration method is defined by the following: For x_0 ∈ G,

x_{k+1} = g(x_k) for k = 0, 1, 2, . . . .

Example 2.3
Suppose

g(x) = (1/2)(x + 1).

Then, starting with x_0 = 0, fixed point iteration becomes

x_{k+1} = (1/2)(x_k + 1),

and the first few iterates are x_0 = 0, x_1 = 1/2, x_2 = 3/4, x_3 = 7/8, x_4 = 15/16, . . . . We see that this iteration converges to z = 1.

Example 2.4
If f is as in Example 2.1 on page 41, a corresponding g is g(x) = −e^x. We can study fixed point iteration with this g in the following matlab dialog.
>> x = -0.5

x =
   -0.5000
>> x = -exp(x)
x =
   -0.6065
>> x = -exp(x)
x =
   -0.5452
>> x = -exp(x)
x =
   -0.5797
>> x = -exp(x)
x =
   -0.5601
>> x = -exp(x)
x =
   -0.5712
>> x = -exp(x)
x =
   -0.5649
>>


(Here, we can recall the expression x = -exp(x) by simply pressing the up-arrow button on the keyboard.) We observe a convergence in which the approximations appear to alternate about the limit, but the convergence does not appear to be quadratic.

An important question is: when does {x_k}, k = 0, 1, 2, . . . , converge to z, a fixed point of g? Fixed-point iteration does not always converge. Consider g(x) = x^2, whose fixed points are z = 0 and z = 1. If x_0 = 2, then x_{k+1} = x_k^2, so x_1 = 4, x_2 = 16, x_3 = 256, . . . . Although it is tempting to pose problems as fixed point iterations, the fixed point iterates do not always converge. We talk about convergence of fixed point iteration in terms of Lipschitz constants.

DEFINITION 2.2
g satisfies a Lipschitz condition on G if there is a Lipschitz constant L ≥ 0 such that

|g(x) − g(y)| ≤ L|x − y| for all x, y ∈ G.   (2.1)
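Before turning to Lipschitz constants, the two behaviors just seen, convergence for g(x) = −e^x and divergence for g(x) = x², are easy to reproduce in a few lines of Python (`fixed_point` is our helper name, not a library routine):

```python
import math

def fixed_point(g, x0, n):
    """Return the fixed point iterates x_0, x_1, ..., x_n."""
    xs = [x0]
    for _ in range(n):
        xs.append(g(xs[-1]))
    return xs

# Convergent case (Example 2.4): g(x) = -e^x; iterates alternate about z
xs = fixed_point(lambda x: -math.exp(x), -0.5, 60)

# Divergent case: g(x) = x^2 starting from x0 = 2
ys = fixed_point(lambda x: x * x, 2.0, 4)   # [2.0, 4.0, 16.0, 256.0, 65536.0]
```

After 60 iterations the first sequence agrees with z ≈ −0.567143 to many digits, while the second has already overflowed any reasonable range.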

If g satisfies (2.1) with 0 ≤ L < 1, g is said to be a contraction on the set G. For differentiable functions, a common way of thinking about Lipschitz constants is in terms of the derivative of g. For instance, it is not hard to show (using the mean value theorem) that, if g′ is continuous and |g′(x)| ≤ L for every x ∈ G, then g satisfies a Lipschitz condition on G with Lipschitz constant L. Basically, if L < 1 (or if |g′| < 1), then fixed point iteration converges. In

fact, in such instances,

|x_{k+1} − z| = |g(x_k) − g(z)| ≤ L|x_k − z|,


so fixed point iteration is linearly convergent with convergence factor C = L. (Later, we state conditions under which the convergence is faster than linear.) This is embodied in the following theorem.

THEOREM 2.3
(Contraction Mapping Theorem in one variable) Suppose that g maps G into itself (i.e., if x ∈ G then g(x) ∈ G) and g satisfies a Lipschitz condition with 0 ≤ L < 1 (i.e., g is a contraction on G). Then there is a unique z ∈ G such that z = g(z), and the sequence determined by x_0 ∈ G, x_{k+1} = g(x_k), k = 0, 1, 2, . . . , converges to z, with error estimates

|x_k − z| ≤ (L^k/(1 − L)) |x_1 − x_0|, k = 1, 2, . . . ,   (2.2)

|x_k − z| ≤ (L/(1 − L)) |x_k − x_{k−1}|, k = 1, 2, . . . .   (2.3)

Example 2.5
Suppose

g(x) = −x^3/6 + x^5/120,

and suppose we wish to find a Lipschitz constant for g over the interval [−1/2, 1/2]. We will proceed by an interval evaluation of g′ over [−1/2, 1/2]. Since g′(x) = −x^2/2 + x^4/24, we have

g′([−1/2, 1/2]) ⊆ −(1/2)[−1/2, 1/2]^2 + (1/24)[−1/2, 1/2]^4
= −(1/2)[0, 1/4] + (1/24)[0, 1/16] = [−1/8, 0] + [0, 1/384]
⊆ [−0.125, 0] + [0, 0.002605] ⊆ [−0.125, 0.00261].

Thus, since |g′(x)| ≤ max over y ∈ [−0.125, 0.00261] of |y| = 0.125, g satisfies a Lipschitz condition with Lipschitz constant 0.125.

If g is a contraction for all real numbers x, then the hypotheses of the contraction mapping theorem are automatically satisfied, and fixed point iteration converges for any x_0. (That is, the domain G can be taken to be the set of all real numbers.) On the other hand, if G must be restricted (such as if g is not a contraction everywhere or if g is not defined everywhere), then, to be assured that fixed point iteration converges, we need to know that g maps G into itself. Two possibilities are given in the following two theorems.


THEOREM 2.4
Let δ > 0 and G = [c − δ, c + δ]. Suppose that g is a contraction on G with Lipschitz constant L, 0 ≤ L < 1, and |g(c) − c| ≤ (1 − L)δ. Then g maps G into itself.

THEOREM 2.5
Assume that z is a solution of x = g(x), g′(x) is continuous in an interval about z, and |g′(z)| < 1. Then g is a contraction in a sufficiently small interval about z, and g maps this interval into itself. Thus, provided x_0 is picked sufficiently close to z, the iterates will converge.

Example 2.6
Let

g(x) = x/2 + 1/x.

Can we show that the fixed point iteration x_{k+1} = g(x_k) converges for any starting point x_0 ∈ [1, 2]? We will use Theorem 2.4 and Theorem 2.3 to show convergence. In particular, g′(x) = 1/2 − 1/x^2. Evaluating g′(x) over [1, 2] with interval arithmetic, we obtain

g′([1, 2]) ⊆ 1/2 − 1/[1, 2]^2 ⊆ 1/2 − 1/[1, 4]
= 1/2 − [1/4, 1] = [1/2 − 1, 1/2 − 1/4]
= [−1/2, 1/4].

Thus, since g′(x) ∈ g′([1, 2]) ⊆ [−1/2, 1/4] for every x ∈ [1, 2],

|g′(x)| ≤ max over y ∈ [−1/2, 1/4] of |y| = 1/2

for every x ∈ [1, 2]. Thus, g is a contraction on [1, 2] with L = 1/2. Furthermore, letting δ = 1/2 and c = 3/2, |g(3/2) − 3/2| = 1/12 ≤ 1/4 = (1 − L)δ. Thus, by Theorem 2.4, g maps [1, 2] into [1, 2]. Therefore, we can conclude from Theorem 2.3 that the fixed point iteration converges for any starting point x_0 ∈ [1, 2] to the unique fixed point z = g(z).

Of course, it may be relatively easy to verify that |g′| < 1, after which we may actually try fixed point iteration to see if it stays in the domain and converges. In fact, in Theorem 2.4, we essentially do one iteration of fixed point iteration and compare the change to the size of the region.
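The claims of Example 2.6 can be checked numerically by brute force; the following Python sketch (a numerical illustration, not a proof) verifies at each step that the iterates stay in G = [1, 2], that each step at least halves the distance to the fixed point z = √2, and that the iteration converges from several starting points:

```python
import math

def g(x):
    return x / 2 + 1 / x        # Example 2.6; fixed point z = sqrt(2)

z = math.sqrt(2.0)              # z satisfies z/2 + 1/z = z
for x0 in (1.0, 1.5, 2.0):
    x = x0
    for _ in range(40):
        err_before = abs(x - z)
        x = g(x)
        assert 1.0 <= x <= 2.0                        # g maps [1, 2] into [1, 2]
        assert abs(x - z) <= 0.5 * err_before + 1e-15 # Lipschitz constant 1/2
```

(The 1e-15 slack absorbs floating point rounding once the iterates reach machine precision.)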

Example 2.7
Let g(x) = 4 + (1/3) sin 2x and x_{k+1} = 4 + (1/3) sin 2x_k. Observing that

|g′(x)| = |(2/3) cos 2x| ≤ 2/3

for all x shows that g is a contraction on all of R, so we can take G = R. Then g : G → G and g is a contraction on R. Thus, for any x_0 ∈ R, the iterations x_{k+1} = g(x_k) will converge to z, where z = 4 + (1/3) sin 2z. For x_0 = 4, the following values are obtained.

k     x_k
0     4
1     4.3298
2     4.2309
. . .
14    4.2615
15    4.2615

It is not hard to show that, if −1 < −L ≤ g′(x) ≤ 0 and the fixed point iterates stay within G, then fixed point iteration converges, with the iterates x_k alternately less than and greater than the fixed point z = g(z). On the other hand, if 0 ≤ g′(x) ≤ L < 1 and the fixed point iterates stay within the domain G, then the fixed point iterates x_k converge monotonically to z. This latter situation is illustrated in Figure 2.4.

FIGURE 2.4: Example of monotonic convergence of fixed point iteration.


There are conditions under which the convergence of fixed point iteration is faster than linear. Recall, if lim as k → ∞ of x_k = z and |x_{k+1} − z| ≤ c|x_k − z|^α, we say {x_k} converges to z with rate of convergence α. (We specify that c < 1 for α = 1.)

THEOREM 2.6
Assume that the iterations x_{k+1} = g(x_k) converge to a fixed point z. Furthermore, assume that q is the first positive integer for which g^(q)(z) ≠ 0, and, if q = 1, assume |g′(z)| < 1. Then the sequence {x_k} converges to z with order q. (It is assumed that g ∈ C^q(G), where G contains z.)

Example 2.8
Let

g(x) = (x^2 + 6)/5

and G = [1, 2.3]. Since g′(x) = 2x/5, the range of g′ over G is (2/5)[1, 2.3] > 0, so g is monotonically increasing. Furthermore, g(1) = 7/5 and g(2.3) = 2.258, so the exact range of g over [1, 2.3] is the interval [1.4, 2.258] ⊆ [1, 2.3]; that is, g maps G into G. Also,

|g′(x)| = |2x/5| ≤ 0.92 < 1

for x ∈ G. (Indeed, in this case, an interval evaluation gives 2[1, 2.3]/5 = [0.4, 0.92], the exact range of g′ in this case, since x occurs only once in the expression for g′.) Theorem 2.3 then implies that there is a unique fixed point z ∈ G. It is easy to see that the fixed point is z = 2. In addition, since g′(z) = 4/5 ≠ 0, there is a linear rate of convergence. Inspecting the values in the following table, notice that the convergence is not fast.

k    x_k
0    2.2
1    2.168
2    2.140
3    2.116
4    2.095

Example 2.9
Let

g(x) = (x^2 + 4)/(2x) = x/2 + 2/x


be similar to the g of Example 2.6. It can be shown that if 0 < x_0 < 2, then x_1 > 2. Also, x_k > x_{k+1} > 2 when x_k > 2. Thus, {x_k} is a monotonically decreasing sequence bounded below by 2, and hence is convergent. Thus, for any x_0 ∈ (0, ∞), the sequence x_{k+1} = g(x_k) converges to z = 2. Now consider the convergence rate. We have that

g′(x) = 1/2 − 2/x^2,

so g′(2) = 0, and

g″(x) = 4/x^3,

so g″(2) ≠ 0. By Theorem 2.6, the convergence is quadratic, and, as indicated in the following table, the convergence is rapid.

k    x_k
0    2.2
1    2.00909
2    2.00002
3    2.00000000

Example 2.10
Let

g(x) = (3/8)x^4 − 4.

There is a fixed point z = 2. However, g′(x) = (3/2)x^3, so g′(2) = 12, and we cannot conclude linear convergence. Indeed, the fixed point iterations converge only if x_0 = 2. If x_0 > 2, then x_1 > x_0 > 2, x_2 > x_1 > x_0 > 2, . . . . Similarly, if x_0 < 2, it can be verified that, for some k, x_k < 0, after which x_{k+1} > 2, and we are in the same situation as if x_0 > 2. That is, the fixed point iterations diverge unless x_0 = 2.

Example 2.11
Consider again g from Example 2.8. Starting with x_0 = 2.2, how many iterations would be required to obtain the fixed point z with |x_k − z| < 10^{-16}? Can this number of iterations be computed before actually doing the iterations? We can use the bound

|x_k − z| ≤ (L^k/(1 − L)) |x_1 − x_0|

from the Contraction Mapping Theorem (on page 49). The mean value theorem gives

x_{k+1} − z = g′(c_k)(x_k − z),


but the smallest bound we know on |g′(c_k)| (and hence the smallest L in the formula) is L = 0.92. We also compute x_1 = ((2.2)^2 + 6)/5 = 2.168, and |x_1 − x_0| = 0.032. Therefore,

|x_k − z| ≤ (0.92^k/(1 − 0.92)) (0.032) = 0.4 (0.92)^k.

Solving

0.4 (0.92)^k < 10^{-16}

for k gives k > log(2.5 × 10^{-16})/log(0.92) ≈ 430.9. Thus, 431 iterations would be required to achieve, roughly, IEEE double precision accuracy.
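The smallest k satisfying the displayed inequality can be checked directly in Python:

```python
import math

L = 0.92        # Lipschitz constant from Example 2.8
d1 = 0.032      # |x_1 - x_0|

def bound(k):
    """A priori error bound (L**k / (1 - L)) * |x1 - x0| = 0.4 * 0.92**k."""
    return (L ** k / (1 - L)) * d1

# smallest integer k making the bound smaller than 10^{-16}
k = math.ceil(math.log(1e-16 / 0.4) / math.log(L))   # k == 431
```

Note that this counts iterations needed to make the *a priori bound* small; the actual error |x_k − z| may reach the tolerance somewhat sooner.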

2.3

Newton's Method (Newton-Raphson Method)

We now return to the problem: given f(x), find z such that f(z) = 0. Newton's iteration for finding approximate solutions to this problem has the form

x_{k+1} = x_k − f(x_k)/f′(x_k) for k = 0, 1, 2, . . . .   (2.4)

REMARK 2.2
Newton's method is a special fixed-point method, with g(x) = x − f(x)/f′(x).

Figure 2.5 illustrates the geometric interpretation of Newton's method. To find x_{k+1}, the tangent line to the curve at the point (x_k, f(x_k)) is followed to the x-axis. The tangent line is y − f(x_k) = f′(x_k)(x − x_k). Thus, at y = 0, x = x_k − f(x_k)/f′(x_k) = x_{k+1}. Newton's method is quadratically convergent, and is therefore fast when compared to a typical linearly convergent fixed point method. However, Newton's method may diverge if x_0 is not sufficiently close to a root z at which f′(z) ≠ 0. To see this, study Figure 2.6. Another conceptually useful way of deriving Newton's method is using Taylor's formula. We have

0 = f(z) = f(x_k) + f′(x_k)(z − x_k) + ((z − x_k)^2/2) f″(ξ_k),


FIGURE 2.5: Illustration of two iterations of Newton's method.

FIGURE 2.6: Examples of divergence of Newton's method. On the left, the sequence diverges; on the right, the sequence oscillates.

where ξ_k is between z and x_k. Thus, assuming that (z − x_k)^2 is small,

z ≈ x_k − f(x_k)/f′(x_k).

Hence, when x_{k+1} = x_k − f(x_k)/f′(x_k), we would expect x_{k+1} to be closer to z than x_k. The quadratic convergence rate of Newton's method can be inferred from Theorem 2.6 by analyzing Newton's method as a fixed point iteration. Consider

x_{k+1} = x_k − f(x_k)/f′(x_k) = g(x_k).

Observe that g(z) = z and

g′(z) = 1 − f′(z)/f′(z) + f(z)f″(z)/(f′(z))^2 = 0,

and, usually, g″(z) ≠ 0. Thus, the quadratic convergence follows from Theorem 2.6.


Example 2.12
Let f(x) = x + e^x. Compare bisection, simple fixed-point iteration, and Newton's method.

Newton's method:

x_{k+1} = x_k − f(x_k)/f′(x_k) = x_k − (x_k + e^{x_k})/(1 + e^{x_k}) = (x_k − 1)e^{x_k}/(1 + e^{x_k}).

Fixed-point (one form): x_{k+1} = −e^{x_k} = g(x_k).

k     x_k (Bisection, a = −1, b = 0)    x_k (Fixed-Point)    x_k (Newton's)
0     −0.5          −1.0          −1.0
1     −0.75         −0.367879     −0.537883
2     −0.625        −0.692201     −0.566987
3     −0.5625       −0.500474     −0.567143
4     −0.59375      −0.606244     −0.567143
5     −0.578125     −0.545396     −0.567143
10    −0.566895     −0.568429     −0.567143
20    −0.567143     −0.567148     −0.567143
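The Newton column of this table can be regenerated with a short Python sketch (`newton` is our helper name; the bisection and fixed-point columns follow similarly from the earlier sketches):

```python
import math

def newton(f, fprime, x0, n):
    """Newton iterates x_0, ..., x_n from equation (2.4)."""
    xs = [x0]
    for _ in range(n):
        x = xs[-1]
        xs.append(x - f(x) / fprime(x))
    return xs

xs = newton(lambda x: x + math.exp(x),   # f
            lambda x: 1 + math.exp(x),   # f'
            -1.0, 4)
# xs[1], xs[2], xs[3] agree with the table: -0.537883, -0.566987, -0.567143
```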

2.4

The Univariate Interval Newton Method

A simple application of the ideas behind Newton's method and the Mean Value Theorem leads to a mathematically rigorous computation of the zeros of a function f. In particular, suppose x = [x̲, x̄] is an interval, and suppose that there is a z ∈ x with f(z) = 0. Let x̌ ∈ x be any point (such as the midpoint of x). Then the Mean Value Theorem (page 5) gives

0 = f(x̌) + f′(ξ)(z − x̌).   (2.5)

Solving (2.5) for z, then applying the fundamental theorem of interval arithmetic (page 27), gives

z = x̌ − f(x̌)/f′(ξ) ∈ x̌ − f(x̌)/f′(x) ≡ N(f; x, x̌).   (2.6)

We thus have the following.

THEOREM 2.7
Any solution z ∈ x of f(x) = 0 must also be in N(f; x, x̌).


We call N(f; x, x̌) the univariate interval Newton operator. The interval Newton operator forms the basis of a fixed-point type of iteration of the form

x_{k+1} ← N(f; x_k, x̌_k) for k = 1, 2, . . . .

The interval Newton method is similar in many ways to the traditional Newton-Raphson method of Section 2.3 (page 54), but provides a way to use floating point arithmetic (with upward and downward roundings) to provide rigorous upper and lower bounds on exact solutions. We now discuss existence and uniqueness properties of the interval Newton method. In addition to providing bounds on any solutions within a given region, the interval Newton method has the following property.

THEOREM 2.8
Suppose f ∈ C¹(x) = C¹([x̲, x̄]), x̌ ∈ x, and N(f; x, x̌) ⊆ x. Then there is an x* ∈ x such that f(x*) = 0. Furthermore, this x* is unique.

A formal algorithm for the interval Newton method is as follows.

ALGORITHM 2.2
(The univariate interval Newton method)

INPUT: x = [x̲, x̄], f : x ⊆ R → R, a maximum number of iterations N, and a stopping tolerance ε.

OUTPUT: Either

1. "solution does not exist within the original x," or

2. a new interval x such that any x* with f(x*) = 0 in the original x lies in the new x, and one of:
(a) "existence and uniqueness verified and tolerance met,"
(b) "existence and uniqueness not verified,"
(c) "solution does not exist," or
(d) "existence and uniqueness verified but tolerance not met."

1. k ← 1.

2. "existence and uniqueness verified" ← false.

3. "solution does not exist" ← false.

4. DO WHILE k <= N.
(a) x̌ ← (x̲ + x̄)/2.
(b) IF x̌ ∉ x THEN RETURN.
(c) x̃ ← N(f; x, x̌).
(d) IF x̃ ⊆ x (that is, if both endpoints of x̃ lie within x) THEN "existence and uniqueness verified" ← true.
(e) IF x̃ ∩ x = ∅ (that is, if x̃ and x have no points in common) THEN
i. "solution does not exist" ← true.
ii. RETURN.
END IF
(f) IF w(x̃) < ε THEN
i. x ← x ∩ x̃.
ii. "tolerance met" ← true.
iii. RETURN.
END IF
(g) x ← x ∩ x̃.
(h) k ← k + 1.
END DO

5. "tolerance met" ← false.

6. RETURN.

END ALGORITHM 2.2.

Notes:

1. The interval Newton method generally becomes stationary. (That is, the end points of x can be proven to not change, under certain assumptions on the machine arithmetic.) However, it is good general programming practice to enforce an upper limit on the total number of iterations of any iterative process, to avoid problems arising from slow convergence, etc.

2. In Step 4a of Algorithm 2.2, the midpoint is computed approximately, and it occasionally occurs (when the interval is very narrow) that the machine approximation lies outside the interval. Thus, we need to check for this possibility.

3. Although f is evaluated at a point in the expression

x̌ − f(x̌)/f′(x)

for N(f; x, x̌), the machine must evaluate f with interval arithmetic to take account of rounding error. (That is, we start the computations with the degenerate interval [x̌, x̌].) Otherwise, the results are not mathematically rigorous.


Similar to the traditional Newton-Raphson method, the interval Newton method exhibits quadratic convergence. An example of a specific theorem along these lines is

THEOREM 2.9
(Quadratic convergence of the interval Newton method) Suppose f : x → R, suppose f ∈ C(x) and f′ ∈ C(x), and suppose there is an x* ∈ x such that f(x*) = 0. Suppose further that f′ is a first order or higher order interval extension of f′ in the sense of Theorem 1.9 (on page 28). Then, for the initial width w(x) sufficiently small,

w(N(f; x, x̌)) = O(w(x)^2).

We will not give a proof of Theorem 2.9 here, although Theorem 2.9 is a special case of Theorem 6.3, page 222 in [20]. We will illustrate this quadratic convergence with

Example 2.13
(Taken from [22].) Apply the interval Newton method x ← N(f; x, x̌), x̌ ← (x̲ + x̄)/2, to f(x) = x^2 − 2, starting with x = [1, 2] and x̌ = 1.5.

The results for Example 2.13 appear in Table 2.1. Here,

δ_k = w(x_k)/max{max over y ∈ x_k of |y|, 1}

is a scaled version of the width w(x_k), and φ_k = max over y ∈ f(x_k) of |y|. The displayed decimal intervals have been rounded out from the corresponding binary intervals.

TABLE 2.1: Convergence of the interval Newton method with f(x) = x^2 − 2.

k    δ_k            φ_k            x_k
0    5.00 × 10^-1   2.00 × 10^0    [1.00000000000000, 2.00000000000000]
1    4.35 × 10^-2   1.09 × 10^-1   [1.37499999999999, 1.43750000000001]
2    2.51 × 10^-4   5.77 × 10^-4   [1.41406249999999, 1.41441761363637]
3    4.70 × 10^-9   1.01 × 10^-8   [1.41421355929452, 1.41421356594718]
4    4.71 × 10^-16  1.33 × 10^-15  [1.41421356237309, 1.41421356237310]
5    4.71 × 10^-16  1.33 × 10^-15  [1.41421356237309, 1.41421356237310]
6    4.71 × 10^-16  1.33 × 10^-15  [1.41421356237309, 1.41421356237310]
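The first rows of Table 2.1 can be imitated with a short Python sketch of the interval Newton step for f(x) = x^2 − 2. This sketch uses ordinary floating point arithmetic without directed rounding, so, unlike a true interval implementation (e.g., with intlab), its bounds are *not* mathematically rigorous; it only illustrates the quadratic shrinking of the widths:

```python
def interval_newton(lo, hi, n):
    """n steps of x <- x intersect N(f; x, xcheck) for f(x) = x**2 - 2,
    using the derivative enclosure f'([lo, hi]) = [2*lo, 2*hi]
    (valid here since lo > 0)."""
    for _ in range(n):
        xc = lo + (hi - lo) / 2              # xcheck: approximate midpoint
        fxc = xc * xc - 2.0                  # f(xcheck)
        # N = xc - fxc / [2*lo, 2*hi]; endpoints depend on the sign of fxc
        q1, q2 = fxc / (2.0 * lo), fxc / (2.0 * hi)
        n_lo, n_hi = xc - max(q1, q2), xc - min(q1, q2)
        lo, hi = max(lo, n_lo), min(hi, n_hi)   # intersect N with x
    return lo, hi

lo, hi = interval_newton(1.0, 2.0, 6)
```

One step from [1, 2] gives exactly [1.375, 1.4375], matching the k = 1 row of Table 2.1.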


2.5

The Secant Method

Under certain circumstances, f may have a continuous derivative, but it may not be possible to compute it explicitly. This is less true now than in the past, because techniques of automatic differentiation (or computational differentiation), such as we explain in Section 6.2, page 215, have been developed, have become more widely available, and are used in practice. However, there are still various situations involving "black box" functions f. In black box functions, f is evaluated by some external procedure (such as a software system provided by someone other than its user), in which one supplies the input x, and the output f(x) is returned, but the user (or the designer of the method for finding points x* with f(x*) = 0) does not have access to the internal workings, so that f′ cannot be easily computed. In such cases, methods that converge more rapidly than the method of bisection, but that do not require evaluation of f′, are useful.

Example 2.14
Suppose we wish to find a zero of

f(x) = e^{ax} g((cos x + sin x + ln x)/x),

where

g(x) = (1 + 3x^2)/(1 + x^2) + 5x + h(5 + e^x + cos x),

h(x) = e^{2x^2}/(1 + x + x^2),

and a is a constant. Problems as complicated as this are not uncommon. Prior to widespread use of automatic differentiation, applying Newton's method to this problem was quite difficult, because it would have been difficult and time-consuming to calculate f′(x_k) at each step. Automatic differentiation is now an option for many problems of this type. However, in certain situations, such as applying the shooting method to the solution of boundary-value problems (see the discussion in Chapter 10), f′ cannot be directly computed, and the secant method is useful.

In this section, we will assume that f′ cannot be computed, and we will treat f as a black-box function. In the secant method, f′(x_k) is approximated by

f′(x_k) ≈ (f(x_k) − f(x_{k−1}))/(x_k − x_{k−1}).

The secant method thus has the form

x_{k+1} = x_k − f(x_k) (x_k − x_{k−1})/(f(x_k) − f(x_{k−1})).   (2.7)

If f(x_k) and f(x_{k+1}) have opposite signs, then, as with the bisection method, there must be an x* between x_k and x_{k+1} for which f(x*) = 0. For the secant method, we need starting values x_0 and x_1. However, only one evaluation of the function f is required at each iteration, since f(x_{k−1}) is known from the previous iteration. Geometrically (see Figure 2.7), to obtain x_{k+1}, the secant line to the curve through (x_{k−1}, f(x_{k−1})) and (x_k, f(x_k)) is followed to the x-axis.

FIGURE 2.7: Geometric interpretation of the secant method.

Interestingly, the convergence rate of the secant method is faster than linear but slower than quadratic.

THEOREM 2.10 (Convergence of the secant method)
Let G be a subset of R containing a zero z of f(x). Assume f ∈ C^2(G), and suppose there exists an M ≥ 0 such that

M = max_{x∈G} |f''(x)| / (2 min_{x∈G} |f'(x)|).

Let x_0 and x_1 be two initial guesses to z, let K_δ(z) = (z − δ, z + δ) ⊆ G, and let ε = Mδ < 1. If x_0, x_1 ∈ K_δ(z), then the iterates x_2, x_3, x_4, ... remain in K_δ(z) and converge to z, with error

|x_k − z| ≤ (1/M) ε^{((1+√5)/2)^k}.

Note that (1 + √5)/2 ≈ 1.618, a fractional order of convergence between 1 and 2. For Newton's method, |x_k − z| ≤ q^{2^k} with q < 1.
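The fractional order (1 + √5)/2 can be observed numerically: if e_k = |x_k − z|, the ratios log e_{k+1} / log e_k should drift toward roughly 1.618 as the secant iterates close in on z. A small Python experiment of our own, with f(x) = x^2 − 2 and z = √2:

```python
import math

def f(x):
    return x * x - 2.0

# Secant iterates for f(x) = x^2 - 2, converging to z = sqrt(2).
z = math.sqrt(2.0)
x0, x1 = 1.0, 2.0
errors = [abs(x0 - z), abs(x1 - z)]
for _ in range(5):
    x0, x1 = x1, x1 - f(x1) * (x1 - x0) / (f(x1) - f(x0))
    errors.append(abs(x1 - z))

# Observed orders log(e_{k+1}) / log(e_k); for this problem they hover
# between the linear rate 1 and the quadratic rate 2.
orders = [math.log(errors[k + 1]) / math.log(errors[k])
          for k in range(2, len(errors) - 1)]
```

The observed ratios approach the asymptotic order only gradually, since the bound in Theorem 2.10 also involves the constant M; still, after a few steps they sit visibly between 1 and 2.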

2.6 Software

The matlab function bisect_method we presented, as well as a matlab function for Newton's method, are available from the web page for the graduate version of this book, namely at

http://interval.louisiana.edu/Classical-and-Modern-NA/

A step of the interval Newton method is implemented with the matlab function i_newton_step_no_fp.m, explained in [26], and available from the web page for that book, at

http://www.siam.org/books/ot110

This function uses intlab (see page 35) for the interval arithmetic and for automatically computing derivatives of f. Additional techniques for root-finding, such as finding complex roots and finding all roots of polynomials, appear in the graduate version of this book [1].

One common computation is finding all of the roots of a polynomial equation p(x) = 0. The matlab function roots accepts an array containing the coefficients of the polynomial, and returns the roots of that polynomial. For example, we might have the following matlab dialog.
>> c = [1,1,1]
c =
     1     1     1
>> r = roots(c)
r =
  -0.5000 + 0.8660i
  -0.5000 - 0.8660i
>> c = [1 5 4]
c =
     1     5     4
>> r = roots(c)
r =
    -4
    -1
>>

The first computation computes approximations to the roots of the polynomial p(x) = x^2 + x + 1, namely, approximations to −1/2 ± (√3/2)i, while the second computation computes the roots of the polynomial p(x) = x^2 + 5x + 4, namely x = −4 and x = −1. matlab also contains a function fzero for finding zeros of more general equations f(x) = 0. A matlab dialog with its use is
>> x = fzero('exp(x)+x',-0.5)
x =
   -0.5671
>>

(Compare this with Example 2.1.) Various examples, as well as explanations of the underlying algorithms, are available within the matlab help system for roots and fzero.

NETLIB (at http://www.netlib.org/) contains various software packages in Fortran, C, etc. for computing roots of polynomial equations and other equations. Software for finding verified bounds on all solutions is also available. See [26] for an introduction to some of the techniques. intlab has the function verifypoly for finding certified bounds on the roots of polynomials. There is also a more general function verifynlss in intlab for finding certified bounds on solutions of nonlinear systems of equations.

Generally, finding a root of an equation is a computation done as part of an overall modeling or simulation process. It is usually advantageous to use polished programs within the chosen system (matlab, a programming language such as Fortran or C++, etc.). However, in developing specialized packages, if the function f has special properties, one can take advantage of these. It may also be efficient in certain cases to program directly the simple methods described in this chapter, if one is certain of their convergence in the context of their use.

2.7 Applications

The problem of finding x such that f(x) = 0 arises frequently when trying to solve for equilibrium solutions (constant solutions) of differential equation models in many fields, including biology, engineering, and physics. Here, we focus on a model from population biology. To this end, consider the general population model given by

dx/dt = f(x)x = (b(x) − d(x))x.   (2.8)

Here, x(t) is the population density at time t. The function b(x) is the density-dependent birth rate and d(x) is the density-dependent death rate. Thus, f(x) is the density-dependent growth rate of the population. An important problem in population biology is analyzing the solution behavior of such dynamical


models. A first step in such analysis is often finding the equilibrium solutions. Clearly, these solutions satisfy dx/dt = 0, which holds when either x = 0 (the trivial solution is an equilibrium solution of this population model, usually referred to as the extinction equilibrium) or f(x) = 0, i.e., at values x which make the growth rate equal zero. To focus on a concrete example, assume the birth rate is of Ricker type, b(x) = e^{−x}, and the mortality rate is the linear function d(x) = 2x. This implies that the growth rate is given by f(x) = e^{−x} − 2x. To find the unique positive equilibrium, we need to solve the equation e^{−x} − 2x = 0. We use Newton's method, implemented by the following matlab program.
function [x_star,success] = newton(x0, f, f_prime, eps, maxitr)
% [x_star,success] = newton(x0,f,f_prime,eps,maxitr)
% does iterations of Newton's method for a single variable,
% using x0 as initial guess, f (a character string giving
% an m-file name) as function, and f_prime (also a character
% string giving an m-file name) as the derivative of f.
% For example, suppose an m-file xsqm2.m is available in Matlab's working
% directory, with the following contents:
%    function [y] = xsqm2(x)
%    y = x^2-2;
%    return
% and an m-file xsqm2_prime.m is also available, with the following
% contents:
%    function [y] = xsqm2_prime(x)
%    y = 2*x;
%    return
% Then, issuing
%    [x_star,success] = newton(1.5, 'xsqm2', 'xsqm2_prime', 1e-10, 20)
% from Matlab's command window will cause an approximation to the square
% root of 2 to be stored in x_star.
% Iteration stops successfully if |f(x)| < eps, and iteration
% stops unsuccessfully if maxitr iterations have been done
% without stopping successfully or if a zero derivative
% is encountered.
% On return:
%    success = 1 if iteration stopped successfully, and
%    success = 0 if iteration stopped unsuccessfully.
%    x_star is set to the approximate solution to f(x) = 0
%    if iteration stopped successfully, and x_star
%    is set to x0 otherwise.
success = 0;
x = x0;
for i=1:maxitr
   fval = feval(f,x);
   if abs(fval) < eps
      success = 1;
      disp(sprintf('%10.0f %15.9f %15.9f', i, x, fval));
      x_star = x;
      return;
   end
   fpval = feval(f_prime,x);
   if fpval == 0
      % Zero derivative: stop unsuccessfully.
      x_star = x0;
      return;
   end
   disp(sprintf('%10.0f %15.9f %15.9f', i, x, fval));
   x = x - fval / fpval;
end
x_star = x0;


and the following matlab dialog


>> y = inline('exp(-x)-2*x')
>> yp = inline('-exp(-x)-2')
>> [x_star,success] = newton(0,y,yp,1e-10,40)

we obtain the following table of iterations for the solution


         1     0.000000000
         2     0.333333333
         3     0.351689332
         4     0.351733711
         5     0.351733711

Thus, x* ≈ 0.351733711 is the unique positive equilibrium of this model.
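The same equilibrium can be reproduced outside matlab; the following Python sketch is our own translation of the newton m-file above (the names and defaults are ours), applied to f(x) = e^{−x} − 2x:

```python
import math

def newton(f, fprime, x0, eps=1e-10, maxitr=40):
    """Newton iteration x <- x - f(x)/f'(x); mirrors the matlab routine above."""
    x = x0
    for _ in range(maxitr):
        fval = f(x)
        if abs(fval) < eps:
            return x          # success: |f(x)| below tolerance
        fpval = fprime(x)
        if fpval == 0.0:
            break             # zero derivative: give up, as the matlab code does
        x = x - fval / fpval
    return x0                 # unsuccessful: return the initial guess

x_star = newton(lambda x: math.exp(-x) - 2.0 * x,
                lambda x: -math.exp(-x) - 2.0,
                0.0)
# x_star is approximately 0.351733711, matching the table above.
```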

2.8 Exercises

1. Consider the method of bisection applied to f(x) = arctan(x), with initial interval x = [−4.9, 5.1].

(a) Are the hypotheses under which the method of bisection converges valid? If so, then how many iterations would it take to obtain the solution to within an absolute error of 10^{−2}?

(b) Apply Algorithm 2.1 with pencil and paper, until k = 5, arranging your computations carefully so you gain some intuition into the process.

2. Let f and x be as in Problem 1.

(a) Modify bisect_method so it prints a_k, b_k, f(a_k), f(b_k), and f(x_k) for each step, so you can see what is happening. Hint: Deleting the semicolon from the end of a matlab statement causes the value assigned on the left of the statement to be printed, while a statement consisting only of a variable name causes that variable's value to be printed. If you want more neatly printed quantities, study the matlab functions disp and sprintf.

(b) Try to solve f(x) = 0 with ε = 10^{−2}, ε = 10^{−4}, ε = 10^{−8}, ε = 10^{−16}, ε = 10^{−32}, ε = 10^{−64}, and ε = 10^{−128}.

i. For each ε, compute the k at which the algorithm should stop.
ii. What behavior do you actually observe in the algorithm? Can you explain this behavior?

3. Repeat Problem 2, but with f(x) = x^2 − 2 and initial interval x = [1, 2].

4. Use the program for the bisection method in Problem 2 to find an approximation to 1000^{1/4} which is correct to within 10^{−5}.

5. Consider g(x) = x − arctan(x).

(a) Perform 10 iterations of the fixed point method x_{k+1} = g(x_k), starting with x = 5, x = −5, x = 1, x = −1, and x = 0.1.

(b) What do you observe for the different starting points? What is |g'(x)| at each starting point, and how might this relate to the behavior you observe?

6. It is desired to find the positive real root of the equation x^3 + x^2 − 1 = 0.

(a) Find an interval x = [x̲, x̄] and a suitable fixed point iteration function g(x) to accomplish this. Verify all conditions of the contraction mapping theorem.

(b) Find the minimum number of iterations n needed so that the absolute error in the n-th approximation to the root is correct to 10^{−4}. Also, use the fixed-point iteration method (with the g you determined in part (a)) to determine this positive real root accurate to within 10^{−4}.

7. Find an approximation to 1000^{1/4} correct to within 10^{−5} using the fixed point iteration method.

8. Consider f(x) = arctan(x). This function has a unique zero z = 0.

(a) Use a digital computer with double precision arithmetic to do iterations of Newton's method, starting with x0 = 0.5, 1.0, 1.3, 1.4, 1.35, 1.375, 1.3875, 1.39375, 1.390625, 1.3921875. Iterate until one of the following occurs:

|f(x)| ≤ 10^{−10},
an operation exception occurs, or
20 iterations are completed.

(i) Describe the behavior you observe.
(ii) Explain the behavior you observe in terms of the graph of f.
(iii) Evidently, there is a point p such that, if x0 > p, then Newton's method diverges, and if x0 < p, then Newton's method converges.
(α) What would happen if x0 = p exactly? Illustrate what would happen on a graph of f.
(β) Do you think we could choose x0 = p exactly in practice?

9. Let f(x) = x^2 − a.

(a) Write down and simplify the Newton's method iteration equation for f(x) = 0.
(b) For a = 2, form a table of 15 iterations of Newton's method, starting with x0 = 2, x0 = 4, x0 = 8, x0 = 16, x0 = 32, and x0 = 64.
(c) Explain your results in terms of the shape of the graph of f and in terms of the convergence theory in this section.
(d) Compare your analysis here to the analysis in Example 2.9 on page 52.

10. Hint: The free intlab toolbox, mentioned on page 35, is recommended for this problem.

(a) Let f be as in Problem 8 of this set. Experiment with the interval Newton method for this problem, and with various intervals that contain zero. Try some intervals of the form [−a, a] (with midpoint x* = 0) and other intervals of the form [−a, b], a > 0, b > 0, and a ≠ b. Explain what you have found.
(b) Use the interval Newton method to prove that there exists a unique solution to f(x) = 0 for x ∈ [−1, 0], where f(x) = x + e^x.
(c) Iterate the interval Newton method to find as narrow bounds as possible on the solution proven to exist in part 10b.

11. Repeat Exercise 8a, page 66, but with the secant method instead of Newton's method. (Use pairs of starting points {0.5, 1.0}, {1.0, 1.3}, etc.)

12. Do three steps of Newton's method, using complex arithmetic, for the function f(z) = z^2 + 1, with starting guess z0 = 0.2 + 0.7i. Although you may use a computer program, you should show intermediate results, including z_k, f(z_k), and f'(z_k). (Note: Newton's method with

complex arithmetic can be viewed as a multivariate Newton method in two variables; see Exercise 5 on page 322, in Section 8.2.)
