Numerical differentiation
3.1 Introduction
Numerical integration and differentiation are some of the most frequently needed methods in
computational physics. Quite often we are confronted with the need of evaluating either $f'$ or an
integral $\int f(x)dx$. The aim of this chapter is to introduce some of these methods with a critical
eye on numerical accuracy, following the discussion in the previous chapter.
The next section deals with numerical differentiation. There we also present the most commonly used
formulae for computing first and second derivatives, formulae which in turn find their most important
applications in the numerical solution of ordinary and partial differential equations. The section also
serves to introduce some more advanced C++ programming concepts, such as call by reference and by
value, reading from and writing to a file, and dynamic memory allocation.
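3.2 Numerical differentiation
The mathematical definition of the derivative of a function $f(x)$ is
\[ \frac{df(x)}{dx} = \lim_{h\to 0} \frac{f(x+h) - f(x)}{h}, \]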
where h is the step size. If we use a Taylor expansion for f (x) we can write
\[ f(x+h) = f(x) + hf'(x) + \frac{h^2 f''(x)}{2} + \dots \]
We can then set the computed derivative $f_c'(x)$ as
\[ f_c'(x) \approx \frac{f(x+h) - f(x)}{h} \approx f'(x) + \frac{hf''(x)}{2} + \dots \]
Assume now that we will employ two points to represent the function f by way of a straight line between
x and x + h. Fig. 3.1 illustrates this subdivision.
This means that we can represent the derivative with
\[ f_2'(x) = \frac{f(x+h) - f(x)}{h} + O(h), \]
where the suffix 2 refers to the fact that we are using two points to define the derivative and the dominating
error goes like O(h). This is the forward derivative formula. Alternatively, we could use the backward
derivative formula
\[ f_2'(x) = \frac{f(x) - f(x-h)}{h} + O(h). \]
If the second derivative is close to zero, this simple two-point formula can be used to approximate the
derivative. If we however have a function like $f(x) = a + bx^2$, we see that the approximated derivative
becomes
\[ f_2'(x) = 2bx + bh, \]
while the exact answer is $2bx$. Unless $b$ is small, the error $bh$ is acceptable only if $h$ is made
very small, and we approach the exact answer by choosing smaller and smaller values for $h$. However, in
this case, the subtraction in the numerator, $f(x+h) - f(x)$, can give rise to roundoff errors and
eventually a loss of precision.
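As a minimal sketch of the two-point forward formula applied to the quadratic above, the following C++ fragment computes the error for a sequence of halved steps; it should track the term $bh$. The function names and the values $a = 1$, $b = 3$ are illustrative assumptions, not taken from the text.

```cpp
#include <cassert>
#include <cstdio>

// Two-point forward difference f2'(x) = (f(x+h) - f(x))/h.
template <typename F>
double forward_difference(F f, double x, double h)
{
  assert(h > 0.0);  // the step length must be positive
  return (f(x + h) - f(x)) / h;
}

// Illustrative quadratic f(x) = a + b x^2 with assumed values a = 1, b = 3.
double quadratic(double x) { return 1.0 + 3.0 * x * x; }

// Prints the error of the forward difference at x for a sequence of halved steps.
void print_forward_errors(double x, double h, int number_of_steps)
{
  for (int i = 0; i < number_of_steps; i++) {
    double error = forward_difference(quadratic, x, h) - 6.0 * x;  // exact value is 2bx
    std::printf("h = %12.5e   error = %12.5e   b*h = %12.5e\n", h, error, 3.0 * h);
    h *= 0.5;
  }
}
```

Calling print_forward_errors(2.0, 0.1, 10) lists the error next to $bh$; for very small steps the two columns are expected to drift apart as roundoff sets in.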
A better approach in case of a quadratic expression for $f(x)$ is to use a three-point formula where we
evaluate the derivative on both sides of a chosen point $x_0$ using the above forward and backward
two-point formulae and take the average afterward. We perform again a Taylor expansion around $x_0$, but
now evaluated at $x_0 \pm h$, namely
\[ f(x = x_0 \pm h) = f(x_0) \pm hf' + \frac{h^2 f''}{2} \pm \frac{h^3 f'''}{6} + O(h^4), \tag{3.1} \]
which we rewrite as
\[ f_{\pm h} = f_0 \pm hf' + \frac{h^2 f''}{2} \pm \frac{h^3 f'''}{6} + O(h^4). \]
Calculating both $f_{\pm h}$ and subtracting we obtain that
\[ f_3' = \frac{f_h - f_{-h}}{2h} - \frac{h^2 f'''}{6} + O(h^3), \]
and we see now that the dominating error goes like $h^2$ if we truncate at the second derivative. We call
the term $h^2 f'''/6$ the truncation error. It is the error that arises because at some stage in the
derivation, a Taylor series has been truncated. As we will see below, truncation errors and roundoff
errors play an important role in the numerical determination of derivatives.
For our expression with a quadratic function $f(x) = a + bx^2$ we see that the three-point formula
$f_3'$ for the derivative gives the exact answer $2bx$. Thus, if our function has a quadratic behavior
in $x$ in a certain region of space, the three-point formula will result in reliable first derivatives
in the interval $[x_0 - h, x_0 + h]$. Using the relation
\[ f_h - 2f_0 + f_{-h} = h^2 f'' + O(h^4), \]
we can define the second derivative as
\[ f'' = \frac{f_h - 2f_0 + f_{-h}}{h^2} + O(h^2). \]
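The three-point formulae for the first and second derivatives can be sketched as two small C++ functions. The names and the test function (again $f(x) = a + bx^2$ with the assumed values $a = 1$, $b = 3$) are illustrative choices only.

```cpp
#include <cassert>

// Three-point (central) first derivative, (f_h - f_{-h})/(2h).
template <typename F>
double first_derivative_3pt(F f, double x0, double h)
{
  assert(h > 0.0);
  return (f(x0 + h) - f(x0 - h)) / (2.0 * h);
}

// Three-point second derivative, (f_h - 2 f_0 + f_{-h})/h^2.
template <typename F>
double second_derivative_3pt(F f, double x0, double h)
{
  assert(h > 0.0);
  return (f(x0 + h) - 2.0 * f(x0) + f(x0 - h)) / (h * h);
}

// Illustrative quadratic f(x) = a + b x^2 with assumed values a = 1, b = 3.
double quadratic(double x) { return 1.0 + 3.0 * x * x; }
```

For the quadratic both formulae reproduce $2bx$ and $2b$ up to roundoff, independently of the step length, since all derivatives beyond the second vanish.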
We could also define five-point formulae by expanding to two steps on each side of $x_0$. Using a
Taylor expansion around $x_0$ in the region $[x_0 - 2h, x_0 + 2h]$ we have
\[ f_{\pm 2h} = f_0 \pm 2hf' + 2h^2 f'' \pm \frac{4h^3 f'''}{3} + O(h^4). \tag{3.2} \]
Using Eqs. (3.1) and (3.2), multiplying $f_h$ and $f_{-h}$ by a factor of 8 and forming the difference
$(8f_h - f_{2h}) - (8f_{-h} - f_{-2h})$, we arrive at a first derivative given by
\[ f_{5c}' = \frac{f_{-2h} - 8f_{-h} + 8f_h - f_{2h}}{12h} + O(h^4), \]
with a dominating error of the order of $h^4$ at the price of only two additional function evaluations.
This formula can be useful in case our function is represented by a fourth-order polynomial in $x$ in the
region $[x_0 - 2h, x_0 + 2h]$. Note however that the two additional function evaluations imply a more
time-consuming algorithm. Furthermore, the two additional subtractions can lead to a larger risk of loss
of numerical precision when $h$ becomes small. When solving, for example, a differential equation which
involves the first derivative, one always needs to strike a balance between numerical accuracy and the
time needed to achieve a given result.
Figure 3.1: Demonstration of the subdivision of the $x$-axis into small steps $h$, with the points
$x_0 - 2h$, $x_0 - h$, $x_0$, $x_0 + h$ and $x_0 + 2h$ marked on the horizontal axis and $f(x)$ on the
vertical axis. Each point corresponds to a set of values $x$, $f(x)$. The value of $x$ is incremented by
the step length $h$. If we use the points $x_0$ and $x_0 + h$ we can draw a straight line and use the
slope of this line to determine an approximation to the first derivative. See text for further
discussion.
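A minimal C++ sketch of the five-point formula, next to the three-point formula for comparison, is given here. The function names and the choice of $e^x$ as test function are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Five-point first derivative, (f_{-2h} - 8 f_{-h} + 8 f_h - f_{2h})/(12h).
template <typename F>
double first_derivative_5pt(F f, double x0, double h)
{
  assert(h > 0.0);
  return (f(x0 - 2.0 * h) - 8.0 * f(x0 - h) + 8.0 * f(x0 + h) - f(x0 + 2.0 * h))
         / (12.0 * h);
}

// Three-point first derivative, (f_h - f_{-h})/(2h), for comparison.
template <typename F>
double first_derivative_3pt(F f, double x0, double h)
{
  assert(h > 0.0);
  return (f(x0 + h) - f(x0 - h)) / (2.0 * h);
}

// Test functions: exp(x) and a fourth-order polynomial.
double exp_function(double x) { return std::exp(x); }
double quartic(double x) { return x * x * x * x; }
```

With a moderate step such as $h = 0.1$ the five-point error for $e^x$ should be markedly smaller than the three-point error, and for the fourth-order polynomial the five-point result is exact up to roundoff. Each call costs four function evaluations instead of two.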
It is possible to show that the widely used formulae for the first and second derivatives of a function
can be written as
\[ \frac{f_h - f_{-h}}{2h} = f_0' + \sum_{j=1}^{\infty} \frac{f_0^{(2j+1)}}{(2j+1)!} h^{2j}, \tag{3.3} \]
and
\[ \frac{f_h - 2f_0 + f_{-h}}{h^2} = f_0'' + 2\sum_{j=1}^{\infty} \frac{f_0^{(2j+2)}}{(2j+2)!} h^{2j}, \tag{3.4} \]
and we note that in both cases the error contains only even powers $h^{2j}$ of the step, with a leading
term of order $O(h^2)$. These expressions will also be used when we evaluate integrals.
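To show this for the first and second derivatives starting with the three points $f_{-h} = f(x_0 - h)$,
$f_0 = f(x_0)$ and $f_h = f(x_0 + h)$, we have that the Taylor expansion around $x = x_0$ gives
\[ a_{-h} f_{-h} + a_0 f_0 + a_h f_h = a_{-h} \sum_{j=0}^{\infty} \frac{f_0^{(j)}}{j!} (-h)^j + a_0 f_0
   + a_h \sum_{j=0}^{\infty} \frac{f_0^{(j)}}{j!} (h)^j, \tag{3.5} \]
where $a_{-h}$, $a_0$ and $a_h$ are unknown constants to be chosen so that
$a_{-h} f_{-h} + a_0 f_0 + a_h f_h$ is the best possible approximation for $f_0'$ and $f_0''$.
Eq. (3.5) can be rewritten as
\[ a_{-h} f_{-h} + a_0 f_0 + a_h f_h = \left[a_{-h} + a_0 + a_h\right] f_0 + \left[a_h - a_{-h}\right] h f_0'
   + \left[a_{-h} + a_h\right] \frac{h^2 f_0''}{2}
   + \sum_{j=3}^{\infty} \frac{f_0^{(j)}}{j!} h^j \left[(-1)^j a_{-h} + a_h\right]. \]
To determine $f_0'$, we require that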
\[ a_{-h} + a_0 + a_h = 0, \]
\[ -a_{-h} + a_h = \frac{1}{h}, \]
and
\[ a_{-h} + a_h = 0. \]
These equations have the solution
\[ a_{-h} = -a_h = -\frac{1}{2h}, \]
and
\[ a_0 = 0, \]
yielding
\[ \frac{f_h - f_{-h}}{2h} = f_0' + \sum_{j=1}^{\infty} \frac{f_0^{(2j+1)}}{(2j+1)!} h^{2j}. \]
To determine $f_0''$, we require instead that
\[ a_{-h} + a_0 + a_h = 0, \]
\[ -a_{-h} + a_h = 0, \]
and
\[ a_{-h} + a_h = \frac{2}{h^2}. \]
These equations have the solution
\[ a_{-h} = a_h = \frac{1}{h^2}, \]
and
\[ a_0 = -\frac{2}{h^2}, \]
yielding
\[ \frac{f_h - 2f_0 + f_{-h}}{h^2} = f_0'' + 2\sum_{j=1}^{\infty} \frac{f_0^{(2j+2)}}{(2j+2)!} h^{2j}. \]
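void func(int x, int *y);      // prototype of func()

int main()
{
   int a;                      // line 1
   int *b;                     // line 2
   a = 10;                     // line 3
   b = new int[10];            // line 4
   for (int i = 0; i < 10; i++) {
      b[i] = i;                // line 5
   }
   func(a, b);                 // line 6
   delete [] b;                // free the memory reserved in line 4
   return 0;
}  // End: function main()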
– Lines 1,2: Declaration of two variables a and b. The compiler reserves two locations in memory.
The size of the location depends on the type of variable. Two properties are important for these
locations – the address in memory and the content in the
– Line 4: Memory to store 10 integers is reserved. The address to the first location is stored in b. The
address of element number 6 is given by the expression (b + 6).
– Line 5: All 10 elements of b are given values: b[0] = 0, b[1] = 1, ....., b[9] = 9;
– Line 6: The main() function calls the function func() and the program counter transfers to the first
statement in func(). With respect to data the following happens. The content of a (= 10) and the
content of b (a memory address) are copied to a stack (new memory location) associated with the
function func()
– Line 7: The variables x and y are local variables in func(). They have the values x = 10 and y =
address of the first element in b in the main() program.
– Line 8: The local variable x stored in the stack memory is changed to 17. Nothing happens with
the value a in main().
– Line 9: The value of y is an address and the symbol *y stands for the position in memory which
has this address. The value in this location is now increased by 10. This means that the value of
b[0] in the main program is equal to 10. Thus func() has modified a value in main().
– Line 10: This statement has the same effect as line 9 except that it modifies element b[6] in main()
by adding a value of 10 to what was there originally, namely 6.
– Line 11: The program counter returns to main(), the next expression after func(a,b);. All data on
the stack associated with func() are destroyed.
– The value of a is transferred to func() and stored in a new memory location called x. Any
modification of x in func() does not affect in any way the value of a in main(). This is called
transfer of data by value. On the other hand the next argument in func() is an address which is
transferred to func(). This address can be used to modify the corresponding value in main(). In the
programming language C it is expressed as a modification of the value which y points to, namely the
first element of b. This is called transfer of data by reference and is a method to transfer data
back to the calling function, in this case main().
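The behavior described in the list can be checked with a short C++ sketch. The body of func() below is reconstructed from the descriptions of lines 7 to 11 and is therefore an inference, and the struct Result and the function run_example() are hypothetical helpers that only collect the values of interest.

```cpp
#include <cassert>

// Reconstructed from the description of lines 7-11; the body is an inference.
void func(int x, int *y)   // line 7: x and y are local variables
{
  x += 7;       // line 8: the local copy becomes 17, a in the caller is untouched
  *y += 10;     // line 9: b[0] in the caller becomes 10
  y[6] += 10;   // line 10: b[6] in the caller becomes 16
  return;       // line 11
}

// Hypothetical helper type holding the values seen by the caller after the call.
struct Result { int a, b0, b6; };

// Repeats the steps of main() and reports a, b[0] and b[6] after func() returns.
Result run_example()
{
  int a = 10;
  int *b = new int[10];
  for (int i = 0; i < 10; i++) b[i] = i;
  func(a, b);
  assert(a == 10);            // transfer by value: a is unchanged
  Result r = {a, b[0], b[6]};
  delete [] b;
  return r;
}
```

The returned values show that a keeps its value while b[0] and b[6] have been modified through the address held in y.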
C++ however allows the programmer to use call by reference without explicit pointers (note that
references are usually implemented as pointers). To see the difference between C and C++, consider the
following simple examples. In C we would write
int n; n = 8;
func(&n);   /* &n is a pointer to n */
....
void func(int *i)
{
   *i = 10;   /* n is changed to 10 */
   ....
}
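The C++ counterpart with a reference argument, func(int& i), can be sketched as follows. The wrapper reference_example() is a hypothetical helper added only to show the call site.

```cpp
#include <cassert>

// The argument is a reference: i is another name for the caller's variable.
void func(int &i)
{
  i = 10;   // n in the caller is changed to 10
}

// Hypothetical helper showing the call: no & is needed at the call site.
int reference_example()
{
  int n;
  n = 8;
  func(n);
  assert(n == 10);
  return n;
}
```

Compared with the C version, neither the address-of operator at the call nor the dereferencing inside the function is needed.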
Note well that the way we have defined the input to the function, func(int& i) or func(int *i), decides
how we transfer variables to a specific function. The reason why we emphasize the difference between
call by value and call by reference is that it allows the programmer to avoid pitfalls like unwanted
changes of variables. However, many people feel that this reduces the readability of the code. It is
common in C++ to use call by reference, since it gives much cleaner code. Recall also that behind the
curtain references are usually implemented as pointers. When we transfer large objects such as matrices
and vectors one should always use call by reference, since copying such objects to a called function
slows down the execution considerably. If a function must not change the value of an object passed by
reference, you should use the const declaration.
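A short sketch of the const declaration for a large object is shown below. It uses std::vector from the standard library as the large object, and the function names mean() and scale() are illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// const reference: no copy is made and the function cannot modify v.
double mean(const std::vector<double> &v)
{
  assert(!v.empty());   // avoid dividing by zero
  double sum = 0.0;
  for (std::size_t i = 0; i < v.size(); i++) sum += v[i];
  return sum / v.size();
}

// Non-const reference: the function changes the caller's vector in place.
void scale(std::vector<double> &v, double factor)
{
  for (std::size_t i = 0; i < v.size(); i++) v[i] *= factor;
}
```

An attempt to assign to v[i] inside mean() is rejected by the compiler, which is what protects the caller's data.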
In programming languages like Fortran one uses only call by reference, but you can flag whether a called
function or subroutine is allowed to change the value by declaring, for example, an integer value as
INTEGER, INTENT(IN) :: i. The local function cannot change the value of i. Declaring a transferred value
as INTEGER, INTENT(OUT) :: i allows the local function to change the variable i.
void initialise(double *, double *, int *);
void second_derivative(int, double, double, double *, double *);
void output(double *, double *, double, int);

int main()
{
   // declarations of variables
   int number_of_steps;
   double x, initial_step;
   double *h_step, *computed_derivative;
   // read in input data from screen
   initialise(&initial_step, &x, &number_of_steps);
   // allocate space in memory for the one-dimensional arrays
   // h_step and computed_derivative
   h_step = new double[number_of_steps];
   computed_derivative = new double[number_of_steps];
   // compute the second derivative of exp(x)
   second_derivative(number_of_steps, x, initial_step, h_step,
                     computed_derivative);
   // Then we print the results to file
   output(h_step, computed_derivative, x, number_of_steps);
   // free memory
   delete [] h_step;
   delete [] computed_derivative;
   return 0;
}  // end main program
We have defined three additional functions, one which reads in from screen the value of x, the initial step
length h and the number of divisions by 2 of h. This function is called initialise . To calculate the second
derivatives we define the function second_derivative . Finally, we have a function which writes our results
together with a comparison with the exact value to a given file. The results are stored in two arrays, one
which contains the given step length h and another one which contains the computed derivative.
These arrays are defined as pointers through the statement
double *h_step, *computed_derivative;
A call in the main function to the function second_derivative then looks like this
second_derivative(number_of_steps, x, initial_step, h_step,
                  computed_derivative);
indicating that double *h_step and double *computed_derivative are pointers and that we transfer the
address of the first elements. The other variables, int number_of_steps, double x and double
initial_step, are transferred by value and are not changed in the called function.
Another aspect to observe is the possibility of dynamic allocation of memory through the new
operator. In the included program we reserve space in memory for these two arrays in the following way:
h_step = new double[number_of_steps]; and computed_derivative = new double[number_of_steps];. When we
no longer need the space occupied by these arrays, we free memory through the statements delete []
h_step; and delete [] computed_derivative;.
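As an alternative to explicit new and delete [], the standard container std::vector releases its memory automatically when it goes out of scope. The sketch below is a variant, not the program of this section, and the function name step_lengths is an assumption.

```cpp
#include <cassert>
#include <vector>

// Returns the step lengths h, h/2, h/4, ...; the vector frees its own memory.
std::vector<double> step_lengths(double initial_step, int number_of_steps)
{
  assert(number_of_steps >= 0);
  std::vector<double> h_step(number_of_steps);
  double h = initial_step;
  for (int i = 0; i < number_of_steps; i++) {
    h_step[i] = h;
    h *= 0.5;   // halve the step for the next entry
  }
  return h_step;
}
```

The address of the first element, &h_step[0], can still be handed to functions that expect a double pointer, provided the vector is not empty.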
void initialise(double *initial_step, double *x, int *number_of_steps)
{
   printf("Read in from screen initial step, x and number of steps\n");
   scanf("%lf %lf %d", initial_step, x, number_of_steps);
   return;
}  // end of function initialise
This function receives the addresses of the three variables double *initial_step, double *x and int
*number_of_steps, and returns updated values by reading from screen.
void second_derivative(int number_of_steps, double x,
                       double initial_step, double *h_step,
                       double *computed_derivative)
{
   int counter;
   double h;
   // initialise the step size
   h = initial_step;
   // start computing for different step sizes
   for (counter = 0; counter < number_of_steps; counter++)
   {
      // setup arrays with derivatives and step sizes
      h_step[counter] = h;
      computed_derivative[counter] =
         (exp(x+h) - 2.*exp(x) + exp(x-h))/(h*h);
      h = h*0.5;
   }  // end of for loop
   return;
}  // end of function second_derivative
The loop over the number of steps serves to compute the second derivative for different values of h.
In this function the step is halved for every iteration (you could obviously change this to larger or
smaller step variations). The step values and the derivatives are stored in the arrays h_step and
computed_derivative.
      exit(1);
   }
   in = fopen(argv[1], "r");    // returns pointer to the in_file
   if (in == NULL) {            // can't find in_file
      printf("Can't find the input file %s\n", argv[1]);
      exit(1);
   }
   out = fopen(argv[2], "w");   // returns a pointer to the out_file
   if (out == NULL) {           // can't open out_file
      printf("Can't find the output file %s\n", argv[2]);
      exit(1);
   }
   . . . program statements
   fclose(in);
   fclose(out);
   return 0;
}
The above represents a standard procedure in C for reading file names. C++ has its own class for
such operations.
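The same checks can be sketched with the C++ stream classes ifstream and ofstream. The function open_files and its error messages are illustrative assumptions, not part of the program discussed in this section.

```cpp
#include <cassert>
#include <fstream>
#include <iostream>

// Opens in_name for reading and out_name for writing; returns false on failure.
bool open_files(const char *in_name, const char *out_name,
                std::ifstream &ifile, std::ofstream &ofile)
{
  assert(in_name != 0 && out_name != 0);
  ifile.open(in_name);
  if (!ifile.is_open()) {          // can't find the input file
    std::cerr << "Can't find the input file " << in_name << std::endl;
    return false;
  }
  ofile.open(out_name);
  if (!ofile.is_open()) {          // can't open the output file
    std::cerr << "Can't open the output file " << out_name << std::endl;
    ifile.close();
    return false;
  }
  return true;
}
```

In a main(int argc, char *argv[]) one would first check that argc is at least 3 and then call open_files(argv[1], argv[2], ifile, ofile).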
#include <iostream>
#include <fstream>
#include <cstdlib>
using namespace std;

void initialise(double *, double *, int *);
void second_derivative(int, double, double, double *, double *);
void output(double *, double *, double, int);

// output file as global variable
ofstream ofile;

int main(int argc, char *argv[])
{
   // declarations of variables
   char *outfilename;
   int number_of_steps;
   double x, initial_step;
   double *h_step, *computed_derivative;
   // Read in output file, abort if there are too few command-line arguments
   if (argc <= 1) {
      cout << "Bad Usage: " << argv[0] <<
         " read also output file on same line" << endl;
      exit(1);
   }
   else {
      outfilename = argv[1];
   }
   ofile.open(outfilename);
   // read in input data from screen
   initialise(&initial_step, &x, &number_of_steps);
   // allocate space in memory for the one-dimensional arrays
   // h_step and computed_derivative
   h_step = new double[number_of_steps];
   computed_derivative = new double[number_of_steps];
   // compute the second derivative of exp(x)
   second_derivative(number_of_steps, x, initial_step, h_step,
                     computed_derivative);
   // Then we print the results to file
   output(h_step, computed_derivative, x, number_of_steps);
   // free memory
   delete [] h_step;
   delete [] computed_derivative;
   // close output file
   ofile.close();
   return 0;
}  // end main program
The main part of the code now includes an object declaration ofstream ofile, which is provided by C++
and allows the programmer to declare and open files. The file is opened via the statement
ofile.open(outfilename);. We close the file at the end of the main program by writing ofile.close();.
There is a corresponding object for reading input files. In this case we declare prior to the main
function, or in an eventual header file, ifstream ifile and use the corresponding statements
ifile.open(infilename); and ifile.close(); for opening and closing an input file. Note that the file
names are held in character pointers declared as char *outfilename; and char *infilename;. In order to
use these options we need to include the corresponding library with #include <fstream>.
One of the problems with C++ is that formatted output is not as easy to use as the printf and scanf
functions in C. The output function using the C++ style is included below.
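A minimal sketch of what such an output function might look like is given here. It is an assumption based on the Fortran version later in this section, which writes $\log_{10}(h)$ and the logarithm of the relative error. Unlike the call in the main program, this sketch takes the output stream as an explicit argument, and the wrapper output_to_string is a hypothetical convenience.

```cpp
#include <cassert>
#include <cmath>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

// Writes log10(h) and log10 of the relative error, one pair per line.
void output(std::ostream &os, const double *h_step,
            const double *computed_derivative, double x, int number_of_steps)
{
  assert(number_of_steps >= 0);
  os << std::setiosflags(std::ios::showpoint);
  for (int i = 0; i < number_of_steps; i++) {
    os << std::setw(15) << std::setprecision(8) << std::log10(h_step[i]);
    os << std::setw(15) << std::setprecision(8)
       << std::log10(std::fabs(computed_derivative[i] - std::exp(x)) / std::exp(x))
       << std::endl;
  }
}

// Hypothetical wrapper returning the formatted lines as a string.
std::string output_to_string(const double *h_step, const double *computed_derivative,
                             double x, int number_of_steps)
{
  std::ostringstream os;
  output(os, h_step, computed_derivative, x, number_of_steps);
  return os.str();
}
```

Passing the global ofile as the first argument writes the same lines to the file opened in the main program.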
The function setw(15) reserves an output field of 15 characters for a given variable, while
setprecision(8) yields eight significant digits. To use these options you have to include the
corresponding header with #include <iomanip>.
Before we discuss the results of our calculations we list here the corresponding Fortran program:
MODULE functions
  USE constants
  IMPLICIT NONE
CONTAINS
  SUBROUTINE derivative(number_of_steps, x, initial_step, h_step, &
       computed_derivative)
    USE constants
    INTEGER, INTENT(IN) :: number_of_steps
    INTEGER :: loop
    REAL(DP), DIMENSION(number_of_steps), INTENT(INOUT) :: &
         computed_derivative, h_step
    REAL(DP), INTENT(IN) :: initial_step, x
    REAL(DP) :: h
    ! initialise the step size
    h = initial_step
    ! start computing for different step sizes
    DO loop = 1, number_of_steps
       ! setup arrays with derivatives and step sizes
       h_step(loop) = h
       computed_derivative(loop) = (EXP(x+h)-2.*EXP(x)+EXP(x-h))/(h*h)
       h = h*0.5
    ENDDO
  END SUBROUTINE derivative
END MODULE functions

PROGRAM second_derivative
  USE constants
  USE functions
  IMPLICIT NONE
  ! declarations of variables
  INTEGER :: number_of_steps, loop
  REAL(DP) :: x, initial_step
  REAL(DP), ALLOCATABLE, DIMENSION(:) :: h_step, computed_derivative
  ! read in input data from screen
  WRITE(*,*) 'Read in initial step, x value and number of steps'
  READ(*,*) initial_step, x, number_of_steps
  ! open file to write results on
  OPEN(UNIT=7, FILE='out.dat')
  ! allocate space in memory for the one-dimensional arrays
  ! h_step and computed_derivative
  ALLOCATE(h_step(number_of_steps), computed_derivative(number_of_steps))
  ! initialize the arrays
  h_step = 0.0_dp; computed_derivative = 0.0_dp
  ! compute the second derivative of exp(x)
  CALL derivative(number_of_steps, x, initial_step, h_step, computed_derivative)
  ! Then we print the results to file
  DO loop = 1, number_of_steps
     WRITE(7,'(E16.10,2X,E16.10)') LOG10(h_step(loop)), &
          LOG10(ABS((computed_derivative(loop)-EXP(x))/EXP(x)))
  ENDDO
  ! free memory
  DEALLOCATE(h_step, computed_derivative)
  ! close the output file
  CLOSE(7)
END PROGRAM second_derivative
The MODULE declaration in Fortran allows one to place functions like the one which calculates second
derivatives in a module. Since this is a general method, one could extend its functionality by simply
transferring the name of the function to differentiate. In our case we use the exponential function
explicitly, but nothing prevents us from defining other functions. Note the usage of the module
constants where we define double and complex variables. If one wishes to switch to another precision,
one needs to change the declaration in one part of the program only. This prevents the errors which may
arise if one has to change variable declarations in every function and subroutine. Finally, dynamic
memory allocation and deallocation is done in Fortran with the keywords ALLOCATE(array(size)) and
DEALLOCATE(array). Although most compilers deallocate and thereby free space in memory when leaving a
function, you should always deallocate an array when it is no longer needed. If your arrays are very
large, keeping them allocated may block unnecessarily large fractions of the memory. Furthermore, you
should always initialise arrays. In the example above, we note that Fortran allows us to simply write
h_step = 0.0_dp; computed_derivative = 0.0_dp, which means that all elements of these two arrays are
set to zero. Coding arrays in this manner brings us much closer to the way we deal with mathematics.
In Fortran it is irrelevant whether this is a one-dimensional or multi-dimensional array. In the next
chapter, where we deal with allocation of matrices, we will introduce the numerical library Blitz++
which allows for similar treatments of arrays in C++. By default however, these features are not
included in the ANSI C++ standard.
Results
In Table 3.1 we present the results of a numerical evaluation for various step sizes for the second
derivative of $\exp(x)$ using the approximation $f_0'' = (f_h - 2f_0 + f_{-h})/h^2$. The results are
compared with the exact ones for various $x$ values. Note well that as the step is decreased we get
closer to the exact value. However, if it is further decreased, we run into problems of loss of
precision. This is clearly seen for $h = 0.0000001$. This means that even though we could let the
computer run with smaller and smaller values of the step, there is a limit for how small the step can
be made before we lose precision.
Table 3.1: Result for numerically calculated second derivatives of $\exp(x)$ as functions of the chosen
step size $h$. A comparison is made with the exact value.
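Fig. 3.2 shows the relative error
$\epsilon = \log_{10}\left(\left|(f''_{\mathrm{computed}} - f''_{\mathrm{exact}})/f''_{\mathrm{exact}}\right|\right)$
as function of $\log_{10}(h)$. We used an initial step length of $h = 0.01$ and fixed $x = 10$. For large
values of $h$, that is $-4 < \log_{10}(h) < -2$, we see a straight line with a slope close to 2. Close to
$\log_{10}(h) \approx -4$ the relative error starts increasing, and our computed derivative with a step
size $\log_{10}(h) < -4$ may no longer be reliable.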
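The slope quoted above can be estimated numerically. The following sketch, with assumed function names, evaluates the logarithm of the relative error of the three-point second derivative of $e^x$ and the slope of the error curve between two step lengths.

```cpp
#include <cassert>
#include <cmath>

// log10 of the relative error of the three-point second derivative of exp(x).
double log10_relative_error(double x, double h)
{
  assert(h > 0.0);
  double computed = (std::exp(x + h) - 2.0 * std::exp(x) + std::exp(x - h)) / (h * h);
  return std::log10(std::fabs((computed - std::exp(x)) / std::exp(x)));
}

// Slope of the error curve between the step lengths h1 and h2 in a log-log plot.
double loglog_slope(double x, double h1, double h2)
{
  return (log10_relative_error(x, h1) - log10_relative_error(x, h2))
         / (std::log10(h1) - std::log10(h2));
}
```

For $x = 10$ the slope between $h = 10^{-2}$ and $h = 10^{-3}$ should come out close to 2, the value expected from a truncation error proportional to $h^2$. For much smaller steps the estimate is dominated by roundoff and fluctuates.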
Figure 3.2: Log-log plot of the relative error of the second derivative of $e^x$ as function of
decreasing step lengths $h$. The horizontal axis shows $\log_{10}(h)$ and the vertical axis the relative
error $\epsilon$. The second derivative was computed for $x = 10$ in the program discussed above. See
text for further details.
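Can we understand this behavior in terms of the discussion from the previous chapter? In chapter 2
we assumed that the total error could be approximated with one term arising from the loss of numerical
precision and another due to the truncation or approximation made, that is
\[ \epsilon_{\mathrm{tot}} = \epsilon_{\mathrm{approx}} + \epsilon_{\mathrm{ro}}, \]
where $\epsilon_{\mathrm{approx}}$ is the truncation (approximation) error and
$\epsilon_{\mathrm{ro}}$ the roundoff error.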
$h$           $e^h + e^{-h}$          $e^h + e^{-h} - 2$
$10^{-1}$     2.0100083361116070      $1.0008336111607230 \times 10^{-2}$
$10^{-2}$     2.0001000008333358      $1.0000083333605581 \times 10^{-4}$
$10^{-3}$     2.0000010000000836      $1.0000000834065048 \times 10^{-6}$
$10^{-4}$     2.0000000099999999      $1.0000000050247593 \times 10^{-8}$
$10^{-5}$     2.0000000001000000      $9.9999897251734637 \times 10^{-11}$
$10^{-6}$     2.0000000000010001      $9.9997787827987850 \times 10^{-13}$
$10^{-7}$     2.0000000000000098      $9.9920072216264089 \times 10^{-15}$
$10^{-8}$     2.0000000000000000      $0.0000000000000000 \times 10^{0}$
$10^{-9}$     2.0000000000000000      $1.1102230246251565 \times 10^{-16}$
$10^{-10}$    2.0000000000000000      $0.0000000000000000 \times 10^{0}$
Table 3.2: Result for the numerically calculated numerator of the second derivative as function of the
step size $h$. The calculations have been made with double precision.
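The loss of precision in the numerator can be explored with a few lines of C++. The sketch below uses assumed function names and evaluates $e^h + e^{-h} - 2$ in double precision for a sequence of steps $h = 10^{-1}, 10^{-2}, \dots$; the trailing digits printed may differ between compilers and math libraries.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Numerator of the three-point second derivative of exp(x) at x = 0.
double numerator(double h)
{
  assert(h > 0.0);
  return std::exp(h) + std::exp(-h) - 2.0;
}

// Prints h, exp(h) + exp(-h) and the numerator for h = 0.1, 0.01, ...
void print_numerator_table(int number_of_rows)
{
  double h = 0.1;
  for (int i = 0; i < number_of_rows; i++) {
    std::printf("%10.1e %20.16f %24.16e\n",
                h, std::exp(h) + std::exp(-h), numerator(h));
    h *= 0.1;
  }
}
```

For the smallest steps the sum $e^h + e^{-h}$ is no longer distinguishable from 2 in double precision, so the subtraction returns zero or a value of the order of the machine precision.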
digits.
From Fig. 3.2 we can read off the slope of the curve and thereby determine empirically how truncation
errors and roundoff errors propagate. We saw that for $-4 < \log_{10}(h) < -2$, we could extract a slope
close to 2, in agreement with the mathematical expression for the truncation error.
We can repeat this for $-10 < \log_{10}(h) < -4$ and extract a slope $\approx -2$. This agrees again with
our simple expression in Eq. (3.6).
(a) Find mathematical expressions for the total error due to loss of precision and due to the numerical
approximation made. Find the step length which gives the smallest value. Perform the analysis
with both double and single precision.
(b) Make thereafter a program which computes the first derivative using Eqs. (3.7) and (3.8) as function
of various step lengths h and let h → 0. Compare with the exact answer.
Your program should contain the following elements:
– A vector (array) which contains the step lengths. Use dynamic memory allocation.
– Vectors for the computed derivatives of Eqs. (3.7) and (3.8) for both single and double
precision.
– A function which computes the derivative and contains call by value and reference (for C++
users only).
– Add a function which writes the results to file.
(c) Compute the relative error as function of $\log_{10}(h)$ for Eqs. (3.7) and (3.8) for both single
and double precision. Plot the results and see if you can determine empirically the behavior of the
total error as function of $h$.