0 valutazioniIl 0% ha trovato utile questo documento (0 voti)

21 visualizzazioni99 pagineA numerical analysis solutions to problems in chapters 1 to 3

Jan 26, 2020

© © All Rights Reserved

A numerical analysis solutions to problems in chapters 1 to 3

© All Rights Reserved

0 valutazioniIl 0% ha trovato utile questo documento (0 voti)

21 visualizzazioni99 pagineA numerical analysis solutions to problems in chapters 1 to 3

© All Rights Reserved

Sei sulla pagina 1di 99

Fundamentals Of Numerical Methods

Principles Of Computer Operations

Number Representations

Fortran Rules And Vocabulary

PROBLEM 01 – 0001: The digital computer is a basic tool used in arriving at the

solution of numerical problems that would otherwise be

extremely long and laborious. Give a brief explanation of

how the digital computer operates.

CONTENTS CHAPTER PREVIOUS NEXT PREP FIND

PROBLEM 01 – 0001: The digital computer is a basic tool used in arriving at the

Input:

Magnetic tape, paper tape, and punched cards are the common

media for carrying information (data and program) from the outside world

Figure 2 displays two punched cards and a piece of punched paper tape.

The input mechanism of the computer is able to receive information from

cards and tape by reading the punched holes. A particular computer may

have only the mechanism for reading cards; another computer may be

able to read from both cards and paper tape. Card (a) Fig. 2 is typical of a

data card containing two pieces of data. On the other hand, card (b)

Output:

Output information can be punched in cards and in paper tape. It is

paper.

Memory:

numerical form. Of course, the sign and the decimal associated with the

Control Unit:

In this very brief look into how the digital computer operates, many

PROBLEM 01 – 0002:

In Algebra the expression N = N + 1 is meaningless, however, it is meaningful in Fortran. Explain.

Solution:

It means "take the value stored under the name N, add 1 to it, and store the result under the name N

again."

PROBLEM 01 – 0003:

Given a sequence of n numbers, x1, x2,…, xn, find the largest and the smallest number of this sequence and

write out the results.

Solution:

The figure shown is a flow chart for the solution of this problem. The READ statement reads the number n

into the location N and the numbers x1, x2,…, xn into the vector X. Locations XMAX and XMIN are next set

equal to X(1). The iteration statement sets up control for doing the operations that follow up to α for 1 = 2,

3,…,N. The first time through the loop XMAX is compared with X(2) (I = 2). If X(2) is greater than XMAX, then

X(2) is placed in XMAX; if it is equal to XMAX, control is sent directly to the end of the iteration loop. If X(2)

is less than XMAX, it is compared with XMIN, which contains X(1). If X(2) is less than XMIN, it is placed in

XMIN. If it is greater than or equal to XMIN , control is sent to the end of the iteration loop. The same

sequence of operations is then carried out using X(3). This continues for each I up to and including I = N.

Clearly when the iterations are completed, the contents of XMAX will have been compared with every

number and the largest retained in XMAX.

XMIN will likewise contain the smallest of the sequence. The

contents of XMAX and XMIN are then written out.

PROBLEM 01 – 0004: Construct a flow chart that will find the sum of a sequence of

numbers represented by a1, a2,…, an.

Solution:

This addition operation on the computer is done by finding first the partial sum of a1 + a2, then a1 + a2 + a3,

and so on, until each number has been added to the partial sum to obtain the total. Either of the two flow

charts in the accompanying figure will accomplish the necessary operations, the two solutions being the

same except for the control statements.

The first symbol in every flow chart is START. It indicates that appropriate control statements are to be

placed ahead of the program for proper initiation of the machine. The READ statement means that the

number n is read into a location labeled N and that the sequence of numbers a1, a2,…, an is read into the

locations labeled A(1), A(2),..., A(N).

The shorthand notation in this box will be used frequently. The

number in location N indicates the number of items in the set. A(1),…,

A(N) contain the numbers a1,…, an, respectively. The next statement is the

first arithmetic statement and places a floating-point zero in the contents of

a location called SUM.

The iteration statement in Figure (b) says to do the statements

between the box and the circle (in this case there is only one statement) N

times, the first time for I = 1, the second time for I = 2, etc., and the last

time for I = N. The box within the range of this iteration says to take the

current value of SUM, add the number in the location A(1), using the

current value of I, and store the result back into SUM. The first time

through the iteration this adds zero (SUM was initially set to zero) and A(1)

and stores the result, A(1), in SUM. The second time through the iteration,

this statement adds A(1), the current value of SUM, and A(2) , storing the

result, A(1) + A(2) , in SUM. The third time through the iteration, it adds

SUM [currently A(1) + A(2)] to A(3) and stores the result,

A(1) + A(2) + A(3) , back into SUM. Each time through the iteration, the

statement updates SUM by the next number in order from the set of

numbers A(1), A(2),…, A(N). At the conclusion of the iterations the

location SUM contains the sum of all the numbers A(1) through A(N).

In Fig. (a) the iteration is controlled by a logic statement and

illustrates the functions performed by the iteration statement. The integer 1

is placed into location I ahead of the loop. Then A(1) is added to the

contents of SUM, which is zero prior to this operation. SUM now contains

A(1). The logic statement tests the sign of I – N. If it is negative or equal to

zero, I is updated by 1, and control is returned to the arithmetic statement.

This sequence of operations is continued until I – N is greater than zero, at

which point SUM is written out.

The last box in each chart says to display the number in SUM on

the output. This is the conclusion to the problem.

The flow chart is completed with a STOP.

PROBLEM 01 – 0005: Find the sum of the first n terms in the Taylor-series

expansion for ex.

CONTENTS CHAPTER PREVIOUS NEXT PREP FIND

PROBLEM 01 – 0005: Find the sum of the first n terms in the Taylor-series

expansion for ex.

Solution: ex = 1 + x + (1/ 2!) x2 + (1/ 3!) x3 + … + [{xn–1} / {(n – 1)!}]

The arithmetic operations needed for evaluating this sum are simplified by

noting that the (i + 1)st term in the series is obtained from the ith term by

multiplying by (x/i). The second term is equal to 1 ∙ (x/1), and the (i + 1)st

term is equal to the product of

[{xi–1} / {(i – 1)!}] and (x/i)

The flow chart for the problem is shown in the figure. Numbers for

N and X are read into the machine. Locations TERM and EXPX, set to

contain floating point 1.0, will be used to store the results of the

computations. The iteration loop sets up control to repeat the operations to

a for I = 1, 2,..., N – 1. The first statement in the loop gives the next term in

the series by updating the previous term. The first time through the loop

TERM contains 1.0, which is multiplied by (x/1), and then stored back in

TERM, so that TERM now contains the second term of the series. Each

time through the loop TERM is updated, so that it contains the (i + 1)st

term in the series. The last statement in the loop adds the contents of

TERM to the contents of EXPX to give the partial sum of the first i + 1

terms of the series. The first time through the loop EXPX contains 1.0, and

TERM contains (x / 1.0) when this statement is executed. The updated

contents of EXPX are 1.0 + (x / 1.0), which is the sum of the first two terms

of the series. When the iteration has been completed n – 1 times, EXPX

contains the sum of the first n terms of the series. Note that the integer I is

used in a floating-point computation. Such statements may not be

acceptable in computer programs but are permitted here for convenience.

sin(x) = x – (x3 / 3!) + (x5 / 5!) – … + (– 1)n–1

[{x2n–1} / {(2n – 1)!}] + (– 1)n [{x2n+1} / {(2n + 1)!}] cos {ξx},

solve f(x) = 1∫0 sin (xt)dt.

CONTENTS CHAPTER PREVIOUS NEXT PREP FIND

sin(x) = x – (x3 / 3!) + (x5 / 5!) – … + (– 1)n–1

[{x2n–1} / {(2n – 1)!}] + (– 1)n [{x2n+1} / {(2n + 1)!}] cos {ξx},

solve f(x) = 1∫0 sin (xt)dt.

Solution: f(x) = 1∫0 [xt – {(x3t3) / (3!)} + … + (– 1)n–1 [{(xt)2n–1} / {(2n – 1)!}]

+ (– 1)n [{(xt)2n–1} / {(2n + 1)!}] cos {ξxt}] dt

= n∑j=1 (– 1)j–1 [{x2j–1} / {(2j – 1)! (2j)}]

+ (– 1)n [{x2n–1} / {(2n + 1)!}] 1∫0 t2n–1 cos {ξxt} dt

with {ξxt} between 0 and xt. The integral in the remainder is easily bounded

by [1 / (2n + 2)]; but one can also convert it to a simpler form. Although it

wasn’t proven, it can be shown that cos(ξxt) is a continuous function of t.

Then applying the integral mean value theorem,

∫0 sin(xt)dt = n∑j=1 (– 1)j–1 [{x2j–1} / {(2j – 1)! (2j)}]

1

for some ζx between 0 and x.

PROBLEM 01 – 0007: Consider a routine for evaluating F(x); and, let F*(x) denote

the computed result obtained by the program as a

approximation to F(x). Evaluate by the experimental method

a probable error in the computed result.

PROBLEM 01 – 0007: Consider a routine for evaluating F(x); and, let F*(x) denote

the computed result obtained by the program as a

approximation to F(x). Evaluate by the experimental method

a probable error in the computed result.

of computed results obtained with a function evaluation routine. The

simplest type of experimental testing consists of computing F*(x) for

selected values of x and checking against known values of F(x). Such

testing, which involves personal inspection of results, cannot be very

thorough. Fortunately the process of experimental testing can be made

more automatic and more thorough by writing a test program in the

following way.

With the aid of a random number generator compute a sequence

of, say, n test arguments, x1, x2,…, xn. For each xk compute (1) F*(xk), the

approximation to F(xk) produced by the function evaluation routine

being tested, and (2) F**(xk), another approximation to F(xk) that is

sufficiently accurate to be used as a check value for F*(xk). Compute

statistics that will provide a ready indication of the magnitude of absolute

error or relative error in the values of F*(xk). Commonly used test statistics

are the maximum relative error

max1≤k≤n |[{F*(xk) – F**(xk)} / {F**(xk)}]|

and the root-mean-square relative error

√[(1/n) n∑k=1 [{F*(xk) – F**(xk)} / {F**(xk)}]2]

The value of n, that is, the number of test arguments, can be very

large. Some programs have been tested with as many as n = 100,000

random arguments; but smaller values of n are normally used, such as

n = 5,000. A uniform random number generator can be used to generate

test arguments, with the range of values adapted to the argument range of

F(x). Thus, if the argument range for F(x) were a finite interval {a, b} , then

the distribution of xk could be uniform in {a, b}. An exponential random

number generator is sometimes useful when the argument range of F(x) is

infinite. For example, the distribution of xk might be exponential with mean

1 for testing of a In x routine for which the argument range was (0,∞ ).

Sometimes it is desirable to make several tests and generate test

arguments in different parts of the argument range of F(x) for different test

runs. In this way, for example, regions where F(x) is unstable can be given

special attention.

F**(xk) must be computed in such a way that it is a sufficiently

accurate approximation to F(xk) to be used as a check value for F*(xk).

Normally this means that F**(xk) must be computed in higher precision

than F*(xk) is. Thus, if F*(xk) is computed in single-precision arithmetic,

then F**(xk) might be computed in double-precision. A special- purpose,

high-precision routine for evaluating F(x) might have to be written just to

obtain F**(xk) with sufficient accuracy. The higher the precision in which

F*(xk) is obtained, of course, the more difficult it is to compute a suitably

accurate check value in still higher precision.

If computing F**(xk) does present difficulties on the computer in

which tests are being made, it may be desirable to compute check values

on another computer with which the necessary accuracy can be obtained

more readily. This is obviously more trouble than using just one computer

since test arguments and check values have to be transferred from one

computer to another in some fashion. Variable-precision computers with

which high-precision calculations can be performed with ease are

especially useful for the computation of check values in high precision.

(For example, Fortran programs can be run on the IBM 1620 with a

precision of 28 decimal digits.)

Principles Of Computer Operations

PROBLEM 01 – 0008: Computers are often imagined as giant electronic "brains"

which never make an error. This notion can be wrong.

Interpret this simple program to demonstrate this concept.

SUM = 0

DO 10 I = 1,10000

10 SUM = SUM + 0.0001

WRITE (3, 20) SUM

20 FORMAT (F15.8)

END

CONTENTS CHAPTER PREVIOUS NEXT PREP FIND

which never make an error. This notion can be wrong.

Interpret this simple program to demonstrate this concept.

SUM = 0

DO 10 I = 1,10000

10 SUM = SUM + 0.0001

WRITE (3, 20) SUM

20 FORMAT (F15.8)

END

Solution: Take a quantity SUM, make it equal to zero, and then add the

quantity 0.0001 to it 10,000 times, and the answer is 1,00000000, exactly.

If one runs this program on a typical computer, the answer is likely

to be

0.99935269

Granted that the error may be small– –only 0.00065 or so, or about 0.065

percent– –still, the answer is wrong. It certainly destroys the notion of a

computer's infallibility, and creates concern about the results of other,

more complicated calculations.

PROBLEM 01 – 0009: Solve the following two simultaneous equations for x1 and

x2, and formulate the equations for the computer.

A11 X1 + A12 X2 = C1

A21 X1 + A22 X2 = C2

CONTENTS CHAPTER PREVIOUS NEXT PREP FIND

PROBLEM 01 – 0009: Solve the following two simultaneous equations for x1 and

x2, and formulate the equations for the computer.

A11 X1 + A12 X2 = C1

A21 X1 + A22 X2 = C2

A11 X1 + A12 X2 = C1 (1)

A21 X1 + A22 X2 = C2 (2)

as Fortran substitution statements into a program. But, the substitution

statement is not an algebraic equation. One method of employing the

computer in Solving these two equations is to prepare algebraic

solutions– –using determinants, substitution method, or some other– –to

find solutions for X1 and X2 as shown in Eqs. (3) and (4)

X1 = [{A22 C1 – A12 C2} / {A11 A22 – A12 A21}] (3)

X2 = [{A11 C2 – A21 C1} / {A11 A22 – A12 A21}] (4)

By this method of solution, the computer is used only in performing the

numerical calculations shown in Eqs. (3) and (4) in order to ultimately find

the numerical value of X1 and X2. This might be a practical application of a

digital computer if one wishes to solve this pair of equations five hundred

times, each time for one set of values for the coefficients A11, A12, A21, A22,

C1, and C2.

Now to proceed with the problem one recognizes that Eqs. (3) and

(4) can be used as the basis for preparing the necessary Fortran program.

Obviously, in order to find one pair of solutions– –X1 and X2– –the

appropriate values of the six coefficients must be available in MEMORY.

After X1 and X2 have been calculated, then, it will be necessary either to

store these solutions in MEMORY or to output the appropriate values.

Because in Fortran alphabetic or numeric characters cannot be

used in a sub or super position, it is necessary to give Fortran variables

different names from those indicated in the above algebraic equations.

See the table for one acceptable list of variable names.

TABLE

Equivalent Symbols

Algebraic Fortran Algebraic Fortran

A11 A11 C1 C1

A12 A12 C2 C2

A21 A21 X1 X1

A22 A22 X2 X2

C TWO EQUATIONS

1 READ 2, A11, A12, A21, A22, C1, C2

2 FORMAT (6F10.5)

D = A11 * A22 – A12 * A21

X1 = [(A22 * C1 – A12 * C2) / D]

X2 = [(A11 * C2 – A21 * C1) / D]

PUNCH 3, X1, X2

3 FORMAT (2E15.7)

GO TO 1

END

Fortran is the arithmetic IF statement; show how the

programmer uses the arithmetic IF statement.

Fortran is the arithmetic IF statement; show how the

programmer uses the arithmetic IF statement.

of operations, depending upon whether the value of an expression is

negative, zero, or positive. For example, the factorial of an integer N may

be computed by means of the following program which contains an IF

statement.

•••

•••

I=0

NFACT = 1

10 I=I+1

NFACT = NFACT * I

IF (I – N) 10, 20, 20

20 •••

•••

The first two statements in the program set the initial values of I equal to 0

and of NFACT equal to 1. Statement 10 increases I by 1. The next

statement computes NFACT for the new value of I. The IF statement

checks whether I – N is less than, greater than, or equal to zero, i.e.,

whether I is less than, greater than, or equal to N.

mathematical expressions:

(a) X = A + B ∙ C (b) I = (J/2)4

(c) Z = √[sin x] (d) X ∙ (BD)

(e) [{DE} / F] – G (f) DE/(F–G)

Solution:

Mathematics FORTRAN

(a) X=A+B•C X=A+B*C

(b) I = (J/2)4 I = (J/2)** 4

(c) Z = √[sin x] Z = SQRT(SIN{X})

(d) X ∙ (B )

D

X * (B ** D)

(e) [{DE} / F] – G [{D**E} / {F–G}]

(f) DE/(F–G) D**(E / (F–G))

mathematical expressions:

(a) X = A + B ∙ C (b) I = (J/2)4

(c) Z = √[sin x] (d) X ∙ (BD)

(e) [{DE} / F] – G (f) DE/(F–G)

PROBLEM 01 – 0011: Write the corresponding Fortran statements to the following

mathematical expressions:

(a) X = A + B ∙ C (b) I = (J/2)4

(c) Z = √[sin x] (d) X ∙ (BD)

(e) [{DE} / F] – G (f) DE/(F–G)

Solution:

Mathematics FORTRAN

(a) X=A+B•C X=A+B*C

(b) I = (J/2)4 I = (J/2)** 4

(c) Z = √[sin x] Z = SQRT(SIN{X})

(d) X ∙ (BD) X * (B ** D)

(e) [{D } / F] – G

E

[{D**E} / {F–G}]

(f) DE/(F–G) D**(E / (F–G))

Number Representations

PROBLEM 01 – 0012: In the computer the binary number system is used as

opposed to the decimal number system used by humans to

execute arithmetic operations. However, in both systems,

there are some similarities and differences. Contrast and

compare the two systems.

opposed to the decimal number system used by humans to

execute arithmetic operations. However, in both systems,

there are some similarities and differences. Contrast and

compare the two systems.

system, a system that has "ten" as its base; but when it comes to the

computer, the story is quite different. In the computer, internal calculations

are done in the binary system, a system that has "two" as its base.

The decimal number system has ten different digits, 0 through 9.

The binary number system has only two different digits, 0 and 1. As an

abbreviation, the two binary digits are usually called bits.

However, the binary number system is similar to the decimal

system in that the position of a particular digit in a number has a great

importance in both systems. Thus in both number systems the left digits of

a number are more important than the right digits since they have a

greater value.

In a decimal integer, each digit has 10 times the value of the digit to

its right. Thus the last digit on the right of an integer is the "units" digit, the

next is the "tens" digit, the next is the "hundreds" digit, and so on.

Therefore, the decimal number 532 can be interpreted as

5 hundreds

+ 3 tens

+ 2 units.

A more concise notation expressing the same thing is

532 = (5 × 102) + (3 × 101) + 2 × 100.

The same principle applies to non-integer numbers such as 14.37 below

14.37 = (1 × 101) + (4 × 100) + (3 × 10–1) + (7 × 10–2)

= 10 + 4 + 0.3 + 0.07.

Binary numbers can be expressed in exactly the same way, except that

they only contain the digits 0 and 1, and that adjacent digits differ in value

by a power of 2, rather than by a power of 10. For example, the binary

integer 1011 can be expressed as

1011 = (1 × 23) + (0 × 22) + (1 × 21) + (1 × 20).

Changing into decimal numbers, we see that the binary 1011 is equal to

8 + 2 + 1, or a decimal number 11. Because each digit of a binary number

can only be a 0 or a 1 and therefore can carry less information than the 10

digits of the decimal number system, binary numbers are generally much

longer than their decimal equivalents. For example, the decimal number

4094 is 111111111110 in binary.

Non-integer binary numbers can be expanded into powers of 2 and

converted into decimal almost as easily as integers. For example, the

binary number 110.11 can be written as

110.11 = (1 × 22) + (1 × 21) + (0 × 20) + (1 × 2–1) + (1 × 2–2).

where the exponent of 2 starts with 0 to the left of the decimal point and

increases to the left, and starts with – 1 to the right of the decimal point

and becomes more negative to the right. In the case of the binary 110.11,

the decimal equivalent is

4 + 2 + (1/2) + (1/4) = 6.75.

A decimal integer can always be exactly represented by a binary integer, for every integer can be

expressed as a sum of powers of 2. But this is not true of fractional numbers.

In general, it can easily be shown that a rational fraction can be exactly expressed with a finite

number of binary digits only if it can be expressed as the quotient of two integers p/q, where q is a power

of 2; that is, q = 2n for some integer n. Obviously only a small proportion of rational fractions will satisfy this

requirement.

Thus even some simple decimal fractions cannot be exactly expressed in the binary number system.

For example, the simple decimal fraction 0.1 is an infinitely long binary number 0.000110011 ... with two 1's

and two 0's repeating themselves forever. Every repeating decimal (such as 0.33333333...) is also repeating

in binary, but other numbers which are not repeating decimals in the decimal number system may become

repeating in binary.

The decimal number 0.0001 is a repeating binary fraction which begins with

0.000000000000011010001101101110001 ... and lasts for 104 bits before starting to repeat with the part

...11010001101101110001.... Every 104 bits from now on, this binary fraction starts to repeat itself.

Obviously an infinitely long binary fraction cannot be stored or used in a digital computer, and so some

finite number of bits must be used and the rest discarded. This automatically leads to a small error which,

by being repeated many times, can lead to a large error in the final answer.

a) 1 1 0 1 1 1 0 1 12 into base 10

b) 4 5 78 into base 2

c) 7 B 316 into base 8

d) 1 2 4 2 510 into base 16

a) 1 1 0 1 1 1 0 1 12 into base 10

b) 4 5 78 into base 2

c) 7 B 316 into base 8

d) 1 2 4 2 510 into base 16

Solution:

a) 1 1 0 1 1 1 0 1 12 may be converted into base 10 by considering this fact: Each digit in a base two number

may be thought of as a switch, a zero indicating "off", and a one indicating "on". Also note that each digit

corresponds, in base 10, to a power of two. To clarify, look at the procedure:

28 27 26 25 24 23 22 21 20

1 1 0 1 1 1 0 1 12

If the switch is "on" , then you add the corresponding power of two. If not, you add a zero. Follow the next

conversion closely.

b) 4578 may be converted into base two by using the notion of triads.

Triads are three-bit groups of zeros and ones which correspond to octal (base 8) and decimal (base 10)

numbers. This table can, with some practice, be committed to memory:

000 0 0 0

001 1 1 1

010 2 2 2

011 3 3 3

100 4 4 4

101 5 5 5

110 6 6 6

111 7 7 7

48 = 1002

58 = 1012

78 = 1112

which becomes 1001011112 = 4578

c) 7B316 is a hexadecimal number. Letters are needed to replace numbers, since the base here is 16. The

following chart will help:

1000 10 8 8

1001 11 9 9

1010 12 10 A

1011 13 11 B

1100 14 12 C

1101 15 13 D

1110 16 14 E

1111 17 15 F

10000 20 16 10

7 B 316

(7 × 256) + (11 × 16) + (3 × 1) = 197110

Now, convert to octal using this procedure: Divide 8, the base to be used, into 1971. Find the remainder

of this division and save it as the first digit of the new octal number. Then divide 8 into the quotient of

the previous division, and repeat the procedure. The following is an illustration:

8 1971

– 1968

3 → 3 246 × 8 = 1968

246

– 240

6 → 6 30 ×8 = 240

30

– 24

6 → 6 3 × 8 = 24

3 → 3 3 ÷ 8 ≠ an integer

Take the highest power of 16 contained in 12423. Then, subtract that value and try the next lowest

power. Continue the process until all the digits are accounted for.

12425

16 × 3 =

3

12288

137

161 × 8 = 128

9

160 × 9 = 9

0

a) 1 0 0 1 1 0 1 0

b) 1 0 1 . 1 0 1

c) 0 . 0 0 1 1

base 2 to base 8, 10, and 16.

a) 1 0 0 1 1 0 1 0

b) 1 0 1 . 1 0 1

c) 0 . 0 0 1 1

base 2 to base 8, 10, and 16.

Solution: a) To convert base 2 to base 8, use the triad method where three

digits are taken together and converted to equivalent decimal value; this

results in base 8 conversion

|0 1 0| |0 1 1| |0 1 0|2 = 2328

Similarly for hexadecimal conversion 4 bits are taken together. Zeros ar

assumed for most significant grouping.

|0 0 0 0| |1 0 0 1| |1 0 1 0| = 9A16.

For binary to decimal conversion

1 0 0 1 1 0 1 0

1×27 + 0×26 + 0×25 + 1×24 + 1×23 + 0×22 + 1×21 + 0×20

128 + 0 + 0 + 16 + 8 + 0 + 2 + 0 = 15410

b) i) 101 . 1012 = 5 . 58

ii) [{0101} / 5] . [{10102} / A] = 5 . A16

iii) 101 . 1012

(1 × 22 + 0 × 21 + 1 × 20) . (1 × 2–1 + 0 × 2–2 + 1 × 2–3)

= 4 + 0 + 1 + 0.5 + 0 + 0.125 = 5 . 62510

c) i) 0 . 0011 = |000| . |001| |100| = 0 . 148

ii) 0 . 0011 = |0000| . |0011| = 0 . 316

iii) 0 . 0011 = . (0 × 20 + 0 × 2–1 + 1 × 2–3 + 1 × 2–4)

= 0 + 0 + 0 + 0.125 + 0.0625 = 0.187510

hexadecimal numbers.

hexadecimal numbers.

is almost identical to that of adding and subtracting Arabic digits. Actually,

there are a few ways in which these calculations may be performed. The

most efficient one is the method used in the decimal calculations. The

operations will be done from right to left. In using this method, each Hex

symbol will be translated to a decimal digit before the calculations, then,

upon completion of the calculations, each resulting decimal digit will be

retranslated to a corresponding Hex symbol.

Addition: In decimal calculations, as one goes from one column to the

next, one carries units of tens. For example:

1 2 8

8 8

2 1 6

In Hex addition, instead of carrying units of ten, carry units of sixteen.

Example 1

6 E 8 8

+ 1 D 3

For col. O:

8 8 8

3 3 3

11 → B

Note in this column the decimal sum is less than 16, and no units of

sixteen are carried.

For col. 1:

Hex → Decimal → Decimal Sum → Hex Sum

8 8 8

D 13 13

21 → 16 + 5

Here the decimal s-um is greater than 16. After factoring out the units of

sixteen (here only one), the remaining digit should be placed in this

column.

For col. 2:

Hex → Decimal → Decimal Sum → Hex Sum

E 14 14

1 1 1

15

+1 (form col.1)

16 → 16 + O

One unit of sixteen can be factored out, leaving zero as the remaining digit

for col. 2.

For col. 3:

Hex → Decimal → Decimal Sum → Hex Sum

6 6 (6/6)

+1 (form col.2)

7 → 7

Thus, the final result is:

6E88

+1D3

705B

Subtraction: In the subtraction procedure the units of sixteen are borrowed instead of being carried.

Example 2:

3 5 E

– 2 B 8

For col. O

Hex → Decimal → Decimal → Hex Difference

Difference

E 14 14

8 8 –8

6 → 6

For col. 1:

Hex → Decimal → Decimal → Hex Difference

Difference

5 5 (5) + 16 (Borrowed unit from col.2)

B 11 11

10 → A

The difference in col.2 is zero since one unit was borrowed from 3. The

final result is:

2 + 16

3 5 E

– 2 B 8

A 6

b) Convert 11011 . 101112 to octal.

b) Convert 11011 . 101112 to octal.

Solution:

a) The binary groups are used to replace individual digits, as indicated:

7 6 4 3 0 1

↓ ↓ ↓ ↓ ↓ ↓

111 110 100 011 000 001

TABLE OP CONVERSION

Decimal Octal Binary

1 1 1 (or 001)

2 2 10(or 010)

3 3 11(or 011)

4 4 100

5 5 101

6 6 110

7 7 111

8 10 1000

9 11 1001

10 12 1010

11 13 1011

12 14 1100

13 15 1101

14 16 1110

15 17 1111

16 20 10000

The resulting number is the required binary number. Some additional insight into the process can be

obtained from the following diagram, which illustrates for the integral part the values of the numbers in

decimal form:

7 6 4

↓ ↓ ↓

7 × 82 6 × 81 4 × 80

↓ ↓ ↓

7 × 26 6 × 23 4 × 20

↓ ↓ ↓

(1×22+1×21+1×20)×26 (1×22+1×21+0×20)×23 (1×22+0×21+0×20)×20

↓ ↓ ↓

1×28+1×27+1×26 1×25+1×24+0×23 1×22+0×21+0×20

↓ ↓ ↓

111 110 100

b) Grouping the digits by threes, starting from the binary point, we have

11 011 . 101 11

It is necessary to introduce an additional zero in the first group and last group in order to have three

binary bits in each group, thus:

011 011 . 101 110

Each of these groups is now replaced by its octal equivalent, giving 33.568

954 = 1 (3)6 + 0 (3)5 + 2(3)4 + 2 (3)3 + 1 (3)2 + 0 (3) + 0;

the successive divisions required are usually recorded in the following set

up.

3 |954

3 |318 + 0

3 |106 + 0

3 |35 + 1

3 |11 + 2

3 |3 + 2

1 +0

Thus the integral part of the given number is 1022100, in the scale of

three.

On the other hand, successive multiplications by 3 give

(4/10) = (12 / 30) = (1/3) + (2 / 30),

(2 / 30) = (6 / 90) = (0/9) + (6 / 90),

(6 / 90) = (18 / 270) = (1 / 27) + (8 / 270),

(8 / 270) = (24 / 810) = (2 / 81) + (4 / 810),

(4 / 810) = (12 / 2430) = (1 / 243) + (2 / 2430),

(2 / 2430) = (6 / 7290) = (0 / 729) + (6 / 7290),

…………………………………………………….

The process is now repeating and, collecting terms, one finds

954.4 (scale 10) = 1022100 . 101210121012 ... (scale 3).

a/b.

CONTENTS CHAPTER PREVIOUS NEXT PREP FIND

a/b.

Solution:

Fortran Rules And Vocabulary

PROBLEM 01 – 0019: Using FORTRAN rules what would be the value of sum in

each case ?

a) SUM = 3 – 1 + 5 * 2/2

b) SUM = (4 – 2) ** 2

c) SUM = 9 ** 2 + 9/2 ** 2

d) SUM = [[{(1 + 2) / (2 * 5)} * (10/2) + 7] / 3]

PROBLEM 01 – 0019: Using FORTRAN rules what would be the value of sum in

each case ?

a) SUM = 3 – 1 + 5 * 2/2

b) SUM = (4 – 2) ** 2

c) SUM = 9 ** 2 + 9/2 ** 2

d) SUM = [[{(1 + 2) / (2 * 5)} * (10/2) + 7] / 3]

evaluation of arithmetic operations is conducted from left to right, except

when the succeeding operation has a higher "binding power" than the one

currently being considered.

The binding power of "+" and "–" is the lowest in strength; operators

"*" and "/" are of middle strength, and raising to a power (**) has the

highest priority.

The concept of parenthesizing subexpressions was added to the

rules to achieve a sequence in operations different from the rules of

binding power of operators.

Therefore, using this knowledge the solutions are found to be:

a) First division 2/2, then multiplication by 5, followed by adding to – 1 and

to 3. The result is 7.

b) First, solving the expression inside the brackets, then using exponent 2.

The answer is 4. Note that the operation was done from left to right and

exponent binding power wasn't taken into consideration because of the

priority of parenthesis.

c) Using the binding power rule, 9**2 is 81 and 2**2 is 4. The third

operation division: (9/4) – which is 2.25. The last operation is addition. The

result is 83.25.

d) The result is 14.833.

Give reasons in each case.

a) READ (2,1) K, L, M + 1

b) K + 5 = 37 * I – J * 13.

c) WRITE (3,13) I, J, K, L.

d) K = 2 × I

e) Z ** Y ** W

Give reasons in each case.

a) READ (2,1) K, L, M + 1

b) K + 5 = 37 * I – J * 13.

c) WRITE (3,13) I, J, K, L.

d) K = 2 × I

e) Z ** Y ** W

FORTRAN statement are the twenty-six letters of the alphabet and the ten

digits 0 to 9 (some systems also accept the $ included in the variable

name). Algebraic signs cannot be included, and therefore M + 1 is an

invalid variable.

b) For FORTRAN statements involving arithmetic operations, use

the following symbols:

– for subtraction

+ for addition

** for exponentiation

* for multiplication

/ for division

No arithmetic operations are allowed on the left side of the "=" sign.

c) In the WRITE statement the first number (3) indicates the device

that is to be used to print the values obtained at the end of the program.

(Typewriter, pi*inter) . The second number (13) is the number of the

corresponding FORMAT statement. The listing of the variables should not

be followed by a period. This is an error.

d) FORTRAN has no symbol "x" (presumably we mean

multiplication, in which case the correct FORTRAN statement is K = 2 * I).

e) The expression is ambiguous. It could mean

(ZY)W or (Z)[(Y)W].

These are not always equal. For example, if Z = 2, Y = 3 W = 4:

(23)4 ≠ 2[(3)4]

a) (A – (B – (C + D (4. 7))

b) ((A/B)

d) V = 1.63 * / D

CONTENTS CHAPTER PREVIOUS NEXT PREP FIND

a) (A – (B – (C + D (4. 7))

b) ((A/B)

d) V = 1.63 * / D

is the first constant or variable of that expression, must be preceded

immediately by one of the following:

a left parenthesis

one of the operators +, –, *, /, or ** .

Any constant or variable in an expression, unless it is the last

constant or variable of that expression, must be followed by one of the

following:

a right parenthesis

one of the operators +, – , *, /, or **.

The number of opened and closed parentheses must be equal. Therefore

the correct expression is (A – (B – (C + D(4.7)))).

b) By the same argument that was used in part (a), the correct

expressions are (A/B) or (A) / (B).

c) The statement is incorrectly formed, and has no meaning. Except

for the unary minus, each operator +, *, /, or ** must have a term or facto

both to its left and to its right in order for an expression to be correctly

formed.

PROBLEM 01 – 0022: Pick out the errors in the following FORTRAN statements

and explain briefly why they are incorrect:

a) RESULT = SUM / FLOAT (NUM) + 2ERR

b) IF (M* (N/M) = N) GO TO 20

c) INTEGER CAPITAL, COST, INCOME

d) AREA = LENGTH * WIDTH

e) TAU = BETA / – 3.0

PROBLEM 01 – 0022: Pick out the errors in the following FORTRAN statements

and explain briefly why they are incorrect:

a) RESULT = SUM / FLOAT (NUM) + 2ERR

b) IF (M* (N/M) = N) GO TO 20

c) INTEGER CAPITAL, COST, INCOME

d) AREA = LENGTH * WIDTH

e) TAU = BETA / – 3.0

is an invalid variable.

b) In an IF statement, comparisons are made by using relational

operators. An equality sign, which is used only in FORTRAN assignment

statements, should be replaced by .EQ. TERM.

c) CAPITAL is a seven-letter word. FORTRAN allows a maximum

of six characters in each variable name.

d) In FORTRAN, variables beginning with letters I through N are

integer unless specified otherwise. Since WIDTH and AREA are real

variable names, LENGTH cannot be used in order to avoid mixed mode

multiplication.

e) Two arithmetic operators may not be juxtaposed in a FORTRAN

assignment statement.

Significant Figures, Errors

Absolute And Relative Errors

Truncation And Round-Off Errors

Methods Of Approximation

PROBLEM 02 – 0023: The number 31.546824 is known to have a relative error no

worse than 1 part in 100,000. How many of the digits are

known to be correct?

[{ΔQ} / {|Q|}] ≤ [1 / {100,000}]

or

ΔQ ≤ [1 / {100,000}] |Q|

ΔQ is sought. In the below equation, Q is unknown and Q1 is known

(31.546824).

|Q| ≤ |Q1| + ΔQ

Hence

ΔQ ≤ [1 / {100,000}] (|Q| + ΔQ)

or

ΔQ ≤ .00031546824 + [1 / {100,000}] ΔQ

or

[{99,999} / {100,000}] ΔQ ≤ .00031546824

or

ΔQ ≤ .00032

Since ΔQ is less than half a unit in the thousandths place, the significant

digit in the thousandths place (the digit is correct, so the number has at

least five correct significant digits.

cos 7° 10' ÷ log10 242.7, assuming that the angle may be in

error by 1' and that the number 242.7 may be in error by a

unit in its last figure.

Solution: Since this is a quotient of two functions, it is better to compute the

relative error from the formula Er ≤ Δu1/u1 + Δu2/u2 and then find the

absolute error from the relation Ea = NEr. Now write

N = [{cos 7° 10'} / {log10 242.7}] = [{cos x} / {log10 y}] = u1 / u2

and

Δu1 = Δ cos x = – sin xΔx,

Δu2 = Δ log10 y = 0.43429 (Δy / y).

∴ Er ≤ [{sin x} / {cos x}] Δx + [{0.43429} / {y log y}] Δy,

or

Er ≤ tan xΔx + [{0.435} / {y log y}] Δy.

Now taking x = 7° 10', Δx = 1' = 0.000291 radian, y = 242, Δy = 0.1,

and using a slide rule for the computation,

Er < 0.126 × 0.000291 + [{0.435 × 0.1] / {242 × 2.38}] = 0.00011.

Since N = cos 7° 10'/log 242.7 = 0.41599,

Ea = 0.00011 × 0.416 = 0.000046,

or

Ea < 0.00005.

The value of the fraction is therefore between 0.41604 and

0.41594, and the mean of these numbers is taken to four figures as the

best value of the fraction, or

N = 0.4160.

obtained by telescoping the Taylor series for ex:

g1(x) = 0.994571 + 1.130318x + 0.542990x2, – 1 ≤ x ≤ 1

g2(x) = 1.008129 + 0.860198x + 0.839882x2, 0 ≤ x ≤ 1

In each case a sufficient number of terms was employed in

the original Taylor series to ensure that the coefficients of

these approximations have essentially "converged." That is,

if additional terms were taken in the Taylor series, the

coefficients of g1(x) and g2(x) would not change significantly.

Note that gi(x) is a valid approximation over the interval

– 1 ≤ x ≤ 1 while g2(x) applies only over the interval

0 ≤ x ≤1. Compare the accuracy of these two approximations

on 0 ≤ x ≤ 1 where both are valid.

Solution: The errors in each of the approximations are plotted in the figure.

The errors in both approximations have nearly equal positive and negative

peaks. However, the error in g2 (x) is distributed much more uniformly over

the interval. (The error in g1(x) is distributed quite uniformly over the

interval – 1 ≤ x ≤ 1 for which it was con true ted, but g1(x) tends to

overestimate ex on much of 0 ≤ x ≤ 1). The magnitude of the maximum

error of g1(x) is approximately 0.054, while that of g2(x) is approximately

0.0101, or only about 1/5 that of g1(x). It is almost inevitable that the

quadratic g1(x) which must provide a good approximation over the interval

– 1 ≤ x ≤ 1 will have a larger maximum error than will an approximation of

the same type (g2(x)) which is constructed to serve as a good

approximation over an interval only half as large (0 ≤ x ≤ 1) . It would be

necessary to employ an economized approximation of higher degree than

2 in order to obtain the same accuracy on – 1 ≤ x ≤ 1 that g2(x) provides

on 0 ≤ x ≤ 1. It should be apparent that the simplest, most effective

approximations can be obtained by restricting the interval of approximation

to the absolute minimum size required.

PROBLEM 02 – 0026: The following definitions apply to errors in a calculation:

The absolute error is defined as:

Errorabs = (Calculated value) – (True value)

The relative error is defined as:

Errorrel = [{(Calculated value) – (True value)}

/ {True value}]

The percentage error is defined as

Errorpct = Errorrel × 100

Assume the true value for a calculation should be 5.0, but

the calculated value is 4.0: Calculate the absolute error,

relative error and the percentage error.

Errorrel = [{4.0 – 5.0} / {5.0}] = [{– 1.0} / {5.0}] = – 0.2,

Errorpct = – 0.2 × 100 = – 20%.

equal to the absolute error in its natural logarithm, since

Δ(In y) ≈ d(ln y) = (dy / y) ≈ (Δy / y).

Solution: Using an exact formula, one has

Δ(In y) = In (y + Δy) – ln(y) = ln(1 + Δy / y). A comparison follows:

Δ(In y) = 0.00100 0.00995 0.095 0.182 – 0.223 – 0.105

Δ(log10 y) = log10 e ∙ Δ(ln y) ≈ 0.434 (Δy / y).

representation. For example, if x = (2/9) and k = 4, then

F(2/9) = .2222. The operations of addition, subtraction,

multiplication and division are defined as follows:

x (+) y = F[F(x) + F(y)]

x (–) y = F[F(x) – F(y)] (1)

x (×) y = F[F(x) × F(y)]

x (÷) y = F[{F(x)} / {F(y)}]

Compute the absolute error and relative error for the

operations defined in (1) for x = (1/3), y = (4/7). For

arithmetic calculations apply 4–digit chopping. Suppose

a = .5711, b = 4271, c = .1001 × 10–3,

compute

a (+) b

(a (–) y) (÷) c (2)

and the corresponding absolute errors and relative errors.

x = (1/3) F(x) = .3333

y = (4/7) F(y) = .5714

Then x (+) y = (1/3) (+) (4/7)

= F[F(x) + F(y)]

= F[. 3333 + .5714]

= F[.9047]

= .9047

The results of the operations are listed in Table 1.

Table 1

Operation Result Actual Value

x (+) y .9047 (19 / 21)

x (–) Y – .2381 – (5 / 21)

x (×) y .1904 (4 / 21)

x (÷) y .5833 (7 / 12)

|ω – ω̃| (3)

and the relative error is

[{|ω – ω̃|} / {|ω|}] (4)

assuming ω ≠ 0.

From Table 1 and (3), (4) we compute

Table 2

Operation Result Actual Value

x (+) y .6190 × 10–4 .6842 × 10–4

x (–) Y .4762 × 10–5 .2 × 10–4

x (×) y .7619 × 10–4 .4 × 10–3

x (÷) y .3333 × 10–4 .5714 × 10–4

If we apply, for example, 7-digit chopping then

F(x) = .3333333 F(y) = .5714286

and

x (+) y = .1904762.

The absolute error is .191 × 10–7, and the relative error is .131 × 10–6. The

accuracy has improved.

For a, b, c given in (2), we obtain

Table 3

Operation Result Actual Value Absolute Relative Error

Error

a (+) b 4271 4271.5711 .5711 .1123 × 10-3

(a (–) y) (÷) c – .003 – .000328 .00267 8.1445

Note that the absolute error in a (+) b is large, while the relative error is

small because we divide by a large actual value.

One should exercise caution in arranging the arithmetic operations

and avoiding multiplication by large numbers (or division by small

numbers) and subtraction of two numbers nearly equal.

PROBLEM 02 – 0028:

Let x be a real number and F(x)it's floating–point k–digit representation. For example, if x = (2/9)

and k = 4, then F(2/9) = .2222. The operations of addition, subtraction, multiplication and division

are defined as follows:

x (+) y = F[F(x) + F(y)]

x (–) y = F[F(x) – F(y)] (1)

x (×) y = F[F(x) × F(y)]

x (÷) y = F[{F(x)} / {F(y)}]

Compute the absolute error and relative error for the operations defined in (1) for x = (1/3), y =

(4/7). For arithmetic calculations apply 4–digit chopping. Suppose

a = .5711, b = 4271, c = .1001 × 10–3, compute

a (+) b

(a (–) y) (÷) c (2)

and the corresponding absolute errors and relative errors.

x = (1/3) F(x) = .3333

y = (4/7) F(y) = .5714

Then x (+) y = (1/3) (+) (4/7)

= F[F(x) + F(y)]

= F[. 3333 + .5714]

= F[.9047]

= .9047

The results of the operations are listed in Table 1.

Table 1

Operation Result Actual Value

x (+) y .9047 (19 / 21)

x (–) Y – .2381 – (5 / 21)

x (×) y .1904 (4 / 21)

x (÷) y .5833 (7 / 12)

|ω – ω̃| (3)

and the relative error is

[{|ω – ω̃|} / {|ω|}] (4)

assuming ω ≠ 0.

From Table 1 and (3), (4) we compute

Table 2

Operation Result Actual Value

x (+) y .6190 × 10–4 .6842 × 10–4

x (–) Y .4762 × 10–5 .2 × 10–4

x (×) y .7619 × 10–4 .4 × 10–3

x (÷) y .3333 × 10–4 .5714 × 10–4

If we apply, for example, 7-digit chopping then

F(x) = .3333333 F(y) = .5714286

and

x (+) y = .1904762.

The absolute error is .191 × 10–7, and the relative error is .131 × 10–6. The

accuracy has improved.

For a, b, c given in (2), we obtain

Table 3

Operation Result Actual Value Absolute Relative Error

Error

a (+) b 4271 4271.5711 .5711 .1123 × 10-3

(a (–) y) (÷) c – .003 – .000328 .00267 8.1445

Note that the absolute error in a (+) b is large, while the relative error is

small because we divide by a large actual value.

One should exercise caution in arranging the arithmetic operations

and avoiding multiplication by large numbers (or division by small

numbers) and subtraction of two numbers nearly equal.

Solution: Mathematical truncation error refers to the error of approximation in

numerically solving a mathematical problem, and it is the error generally

associated with the subject of numerical analysis. It involves the

approximation of infinite process by finite ones, replacing non-computable

problems with computable ones. Following are some examples to make

the idea more precise.

(a) Using the first two terms of the Taylor series from

(1 + x)α = 1 + (α1)x + (α2)x2 + … + (αn)xn + (α n+1) [{xn+1} / {(1 + ξx)n+1–α}]

with

(αk) = [{α(α – 1)…(α – k + 1)} / {k!}] k = 1, 2, 3 …

for any real number a(for all cases the unknown ξx is located between x

and 0 ), one can write

√(1 + x) ≈ 1 + (1/2)x

which is a good approximation when x is small.

(b) For the differential equation problem

Y'(t) = f(t, Y(t)) Y(t0) = Y0

use the approximation of the derivative

Y'(t) ≈ [{Y(t + h) – Y(t)} / {h}]

for some small h. Let tj = t0 + jh for j ≥ 0, and define an approximate

solution function y(tj)by

[{y(tj +1) – y(tj)} / {h}] = f(tj, y(tj))

so one gets

y(tj + 1) = y(tj) + hf (tj, y(tj)) j≥0

This is Euler's method of solving a differential equation initial value

problem.

PROBLEM 02 – 0030: Use exact, chopping and rounding to evaluate the following:

f(x) = x3 – 4x2 + 2x – 2.2 at x = 2.41.

For chopping and rounding, apply three–digit arithmetic.

Compute the relative errors. To decrease their values, try to

carry out the same calculations for the same polynomial, but

written in a different form.

Solution: The numerical data necessary for the solution of the problem are

summarized in the following table.

x x2 x3 4x2 2x

exact 2.41 5.8081 13.997521 23.2324 4.82

3–digit 2.41 5.80 13.9 23.2 4.82

chopping

3–digit 2.41 5.81 14.0 23.2 4.82

rounding

Note that the three-digit chopped numbers retain the leading three digits

and may differ from the three-digit rounded numbers. For example, if

x = 14.7124, then Xchopped = 14.7, and Xrounded = 14.7. On the other hand, if

x = 14.7824, then Xchopped = 14.7. Xrounded = 14.8.

We obtain

f(2.41)exact = 13.997521 – 23.2324 + 4.82 – 2.2

= – 6.614879

f(2.41)chopped = 13.9 – 23.2 + 4.82 – 2.2

= – 6.68

f(2.41) rounded = 14.0 – 23.2 + 4.82 – 2.2

= – 6.58

The corresponding relative errors are:

relative error, 3-digit chopping = [{– 6.614879 + 6.68} / {– 6.614879}]

= .0098

relative error, 3-digit rounding = [{– 6.614879 + 6.58} / {– 6.614879}]

= .0052

In order to obtain smaller values for the relative errors we can write f(x) in

an equivalent form

f(x) = x[x(x – 4) + 2] – 2.2

Then, the exact value of f(x) at x = 2.41 is, of course, the same, but

f{2.41)chopped = 2.41[2.41(2.41 – 4) + 2] – 2.2

= 2.41[– 3.83 + 2] – 2.2

= – 4.41 – 2.2

= – 6.61

f(2.41)rounded = 2.41[2.41(2.41 – 4) + 2] – 2.2

= – 6.61

The relative error for both values is now

[{– 6.614879 + 6.61} / {– 6.614879}] = .00073

The relative error is now much smaller. The reason for this is that there is

a decrease in the number of error-producing computations. Note that in

x3 – 4x2 + 2x – 2.2, we had five multiplications and three additions, while in

x[x(x – 4) + 2] – 2.2, we had two multiplications and three additions.

PROBLEM 02 – 0031: When one gives the number of digits in a numerical value

one should not include zeros in the beginning of the number,

as these zeros only help to denote where the decimal point

should be. If one is counting the number of decimals, one

should of course include leading zeros to the right of the

decimal point.

For example, 0.001234 ± 0.000004 has five correct

decimals and three significant digits, while 0.001234

± 0.000006 has four correct decimals and two significant

digits.

The number of correct decimals gives one an idea of the

magnitude of the absolute error, while the number of

significant digits gives a rough idea of the magnitude of the

relative error.

Consider the following decimal numbers: 0.2397, – 0.2397,

0.23750, 0.23650, 0.23652. Shorten to three decimals by the

Round–off and Chopping methods.

– 0.2397 rounds to – 0.240 (is chopped to – 0.239),

0.23750 rounds to 0.238 (is chopped to 0.237),

0.23650 rounds to 0.236 (is chopped to 0.236),

0.23652 rounds to 0.237 (is chopped to 0.236).

Observe that when one rounds off a numerical value one produces

an error; thus it is occasionally wise to give more decimals than those

which are correct.

One consequence of these rounding conventions is that numerical

results which are not followed by any error estimations should often,

though not always, be considered as having an uncertainty of (1/2) unit in

the last decimal place.

digits is used on the system

x + 400y = 801,

200x + 200y = 600.

Solve it by rounding off the errors.

Solution: Multiplying the first equation by – 200 and adding the result to the

second equation, one gets

– 7.98 × 104 y = – 15.9 × 104.

This gives the approximate value of 1.99 for y. Substitution of this

value in the first equation yields the approximate value x = 5. The relative

error in the value for x is 400% since the correct solution is (1, 2). The

problem here lies in the equation x = 801 – 400y. Any rounding error in y

(in this case 0.01) gets magnified by a factor of 400, which can be quite

significant because x is small.

PROBLEM 02 – 0033: Scaling is one way to get around errors that can be caused

by improportionate computations. Scale the system

x + 230y + 34602 = 20,000

30x + 5y + 0.1z = 300 (1)

0.001x + 0.002y + 0.003z = 7,

by columns.

Hint: Column scaling is similar to row scaling except that it

alters the solution. For example, if the first column is divided

by 10, then the new solution will be in terms of 10x1 instead

of x1.

y = y* / 103, and z = z* / 104. This results in the system

0.01x* + 0.230y* + 0.3460z* = 20,000,

0.3x* + 0.005y* + 0.00001z* = 300, (2)

0.00001x* + 0.000002y* + 0.0000003z* = 7.

Of course, one must remember to replace x*, y*, and z* by their values in

terms of x, y, and z after solving the system (2).

x + 400y = 801

200x + 200y = 600,

use the technique of partial pivoting to round off the

solution's error to three significant digits.

0.001x + 0.4y = 0.801,

0.2x + 0.2y = 0.6.

The technique of partial pivoting requires that the second equation be

used as the pivot since 0.2 > 0.001. Carrying three significant digits, one

obtains the equation

0.399y = 0.798.

Thus y = 2.00 and x = 1.00. Note that a small round–off error in y would

have produced only a small error in x, since the second equation is less

sensitive than the first to an error in y.

Methods Of Approximation

max-degrees 1, 2, 3, in the neighborhood of the point (1, 0).

f(1) = 0, f'(1) = 1, f"(1) = – 1, f"'(1) = 2,

and

y = x – 1,

y = (x–1) – (1/2) (x – 1)2,

y = (x – 1) – (1/2) (x – 1)2 + y (x – 1)3,

are the required approximating polynomials.

n – 1, and

n–1∫ f(x) dx = In n.

0

index n.) Given the above, use the Euler Summation formula

n∑ yi = n∫0 f(x)dx + (1/2) (y0 + yn) + n∫0 P1(x)f'(x)dx

i=0

where

Pn(x) = a + a1(x – x0) + a2(x – x0)2 + ... + an(x – x0)

to derive the formula for the Euler's Constant.

Solution: n–1∑

k=0 [1 / {k + 1}] = In n + (1/2) [1 + (1/n)]

– n–1∫0 [{P1(x)} / {x + 1}2]dx

Here P1(x) is a polynomial used to approximate the function of x.

Since

|P1(x)| ≤ (1/2)

for every x and

k+1∫ [{P1(x)} / {x + 1}2]dx < 0,

k

then

0 > n–1∫0 [{P1(x)} / {x + 1}2]dx ≥ – (1/2) n–1∫0 [1 / {x + 1}2]dx

= – (1/2) [1 – (1/n)].

It follows that

∞∫ [{P1(x)} / {x + 1}2]dx = limn→∞ n–1∫0 [{P1(x)} / {x + 1}2]dx

0

C = limn→∞ [n–1∑k=0 (1 / {k + 1}) – In n] = (1/2) – ∞∫0 [{P1(x)} / {x + 1}2]dx

exists. The number C is called Euler's constant; it can be shown that

C = 0.577216.

y = a0 + a1x + a2x2

which minimizes the integral Is given by

Is = b∫a [f(x) – Pn(x)]2 dx

where

Pn(x) = y

in the interval [–1, 1].

I = 1∫–1 [{(1/2) (x + |x|) – (a0 + a1x + a2x2)]2 dx

= 0∫–1 (a0 + a1x + a2x2)2 dx + 1∫0 [a0 + (a1 – 1) x + a2x2]2 dx

= (1/3) + 2a02 – a0 + (4/3) a0a2 + (2/3) a12 – (2/3)a1 + (2/5) a22 – (1/2)a2.

Now

I(a)0 = 4a0 – 1 + (4/3)a2,

I(a)1 = (4/3)a1 – (2/3),

I(a)2 = (4/3)a0 + (4/5)a2 – (1/2),

where

I(a)i = (∂I / ∂ai),

and

I(a)0 (a)0 = 4, I(a)0 (a)1 = I(a)1 (a)0 = 0,

I(a)1 (a)1 = (4/3), I(a)0 (a)2 = I(a)2 (a)0 = (4/3),

I(a)2 (a)2 = (4/3), I(a)1 (a)2 = I(a)2 (a)1 = 0,

where

I(a)i (a)j = [(∂2I) / (∂ai ∂aj)].

Sufficient conditions for a minimum are

I(a)0 = I(a)1 = I(a)2 = 0,

| I(a)0 (a)0 I(a)0 (a)1 I(a)0 (a)2|

| I(a)1 (a)0 I(a)1 (a)1 I(a)1 (a)2| > 0.

| I(a)2 (a)0 I(a)2 (a)1 I(a)2 (a)2|

These conditions are satisfied if

a0 = (3 / 32),

a1 = (16 / 32),

a2 = (15 / 32),

and therefore

y = (1 / 32) (3 + 16x + 15x2)

is the required polynomial. It is suggested to graph this parabola and

y = (1/2) (x + |x|)

on the same set of axes.

PROBLEM 02 – 0038: Approximate

y = esin x

at x = 0 by a function of type

C(x) = a0 + a1 sin (2π / p) (x – x0)

+ ... + an sin n (2π / p) (x – x0) + b1 cos (2π / p) (x – x0)

+ ... + bn cos m (2π / p) (x – x0),

where

a0, ... , an, b0, ... , bm and x0

are constants; C(x) is everywhere continuous and periodic

with period p and hence is suitable for approximating the

function y = f (x) at a point (x0, y0) if the function f(x) is itself

continuous and periodic with period p. As a rule, m is chosen

to be n, n – 1, or n + 1. The function C(x) has n + m + 1

arbitrary constants and hence can be determined by the

imposition of an equivalent number of conditions. Solve for

n = m = 1 and for n = m = 3.

C1(x) = a0 + a1 sin x + b1 cos x.

(The function to be approximated has the period p = 2π.) Determine the

three constants a0, a1, and b1 by equating the values of esin x and C1(x) and

the values of their first and second derivatives at x = x0. Obtain

C1(x) = 2 + sin x – cos x.

If the graphs of y = C1(x) and the given function were drawn, they would

show the periodicity of the two curves and indicate that C1(x) is a good

approximation to esin x from about – 2π / 9 to about 7π / 18 (roughly from

– 40° to 70°) in the period interval from – π to π.

For n = m = 3, start with

C3(x) = a0 + a1 sin x + a2 sin 2x + a3 sin 3x + b1 cos x + b2 cos 2x

+ b3 cos 3x.

Since six derivatives are needed it is convenient to write esin x as a power

series in x, namely,

y = esin x = 1 + x + (x2 / 2!) – (3x4 / 4!) – (8x5 / 5!) – (3x6 / 6!)

+ (56x7 / 7!) + ... .

Obtain on substituting 0 for x, equating the values of corresponding derivatives, and solving,

C3(x) = (1 / 180) (200 + 210 sin x – 6 sin 2x – 6 sin 3x + 45 cos x

– 72 cos 2x + 7 cos 3x) .

The graphs of y = C3(x) and the given function would show that this time

the approximation is good from about – 4π / 9 to about 11π / 18 (– 80° to

110°) in the period interval from – π to π.

C(x) = a0 + a1 sin (2π / p) (x – x0)

+ ... + an sin n (2π / p) (x – x0) + b1 cos (2π / p) (x – x0)

+ ... + bn cos m (2π / p) (x – x0),

where

a0, ..., an, b0,..., bm and x0 are constants.

C(x) coincides with y = esin x at x = 0, π/3, π/2, π, 3π / 2.

C2(x) = a0 + a1 sin x + a2 sin 2x + b1 cos x + b2 cos 2x

so that the graph of this equation passes through the five points

(0,1), (π/3, e(1/2)√3), (π/2, e), (π, 1), (3π/2, e–1) .

Substituting the coordinates of the points into C2(x), equating

corresponding values, and solving,

C2(x) = 1.272 + 1.175 sin x – 0.057 sin 2x – 0.272 cos 2x.

Its graph and the graph of y = esin x indicate that C2(x) is a good

approximation to y = esin x for all values of x.

Note that because of the periodicity of. the trigonometric functions, one is

more likely to run into a system of inconsistent linear equations (in the a's

and b's) than in polynomial approximation.

approximates F(x) = 16–x with minimax absolute error in

[0, a], and let Qn*(x) denote the polynomial of degree ≤ n that

approximates F(x) = 16–x with minimax relative error in [0, a].

Use the theorem:

Let Pn*(x) denote the polynomial of degree ≤ n that

approximates F(x) with minimax absolute error in [a, b].

Suppose that F(x) possesses an (n + 1) st derivative F(n+1)

(x) for x in [a, b] . If for two nonnegative numbers m and M

and for a ≤ x ≤ b either

m ≤ F(n+1)(x) ≤ M

or

m ≤ F(n+1)(x) ≤ M,

then

[{m(b – a)n+1} / {22n+1(n + 1)!}] ≤ max[a, b] |Pn*(x) – F(x)|

≤ [{M(b – a)n+1} / {22n+1(n + 1)!}];

to find bounds for both

max[0, a] |Pn*(x) – 16–x|

and

max[0, a] |{Qn*(x) – 16–x} / {16–x}|.

Solution: Since

F(n+1)(x) = (– 1)n+1(In 16)n+1 16–x,

one gets m ≤ ± F(n+1)(x) ≤ M for all x in [0, a],

where

m = 16–a(In 16)n+1

and

M = (In 16)n+1,

thus satisfying the condition of the theorem, it follows that

[{16–a(In 16)n+1 an+1} / {22n+1(n + 1)!}] ≤ max[0, a] |Pn*(x) – 16–x|

≤ [{(In 16)n+1 an+1} / {22n+1(n + 1)!}]

Applied directly, the theorem cannot give bounds for

max[0, a] |{Qn*(x) – 16–x} / {16–x}|.

With the additional analysis shown below, however, one can also obtain

such bounds.

max[0, a] |{Qn*(x) – 16–x} / {16–x}| ≥ max[0, a] |Qn*(x) – 16–x|

≥ max[0, a] |Pn*(x) – 16–x|

≥ [{16–a(In 16)n+1 an+1} / {22n+1(n + 1)!}]

Likewise,

max[0, a] |{Qn*(x) – 16–x} / {16–x}| ≤ max[0, a] |{Pn*(x) – 16–x} / {16–x}|

≤ 16a max[0, a] |Pn*(x) – 16–x|

≤ [{16a (In 16)n+1 an+1} / {22n+1(n + 1)!}]

Therefore

[{16–a(in 16)n+1 an+1} / {22n+1(n + 1)!}] ≤ max[0, a] |{Qn*(x) – 16–x} / {16–x}|

≤ [{16a (In 16)n+1 an+1} / {22n+1(n + 1)!}].

F(x) = √x with minimax relative error in [(1 / 16), 1]. Use

Remez' method.

As initial estimates for the critical points of

[{P2*(x) – √x} / {√x}],

use

x1 = (1 / 16), x2 = .4, x3 = .8, and x4 = 1

Solution: Applying Reme z' method:

Step (1) Choose the initial estimates for the critical points.

Step (2) The system of linear equations

a0 + a1xk +…+ anxkn – (– 1)k μF (xk) = F(xk),

k = 1, 2,…, n + 2,

becomes

a0 + a1x1 + a2x12 + μ√x1 = √x1

a0 + a1x2 + a2x22 + μ√x2 = √x2

a0 + a1x3 + a2x32 + μ√x3 = √x3

a0 + a1x4 + a2x42 + μ√x4 = √x4

On each iteration this system of equations is solved numerically by means

of Gaussian elimination.

Step (3) On each iteration it is necessary to locate the extreme points in

[(1 / 16), 1]

of the relative-error function

[{P2(x) – √x} / {√x}],

where the coefficients of

P2(x) = a0 + a1x + a2x2

are obtained from step (2) . Two of the extreme points are

y1 = (1 / 16)

and

y4 = 1

on every iteration. The other two, y2 and y3, are the two roots of the

quadratic equation

3a2x2 + a1x – a0 = 0

and are computed with the quadratic formula on every integration.

The quadratic equation in step (3) was derived as follows:

Let

E(x) = [{P2(x) – √x} / {√x}].

If one replaces P2(x) by a1 + a1x + a2x2, differentiates E(x), and set the

derivative equal to zero, then one obtains after some simplifications the

condition

3a2x2 + a1x – a0 = 0.

Since E(x) must have at least four extreme points in

[{1 / 16), 1],

whereas the quadratic equation has only two roots, it follows that y1 and

y4 must be the endpoints of the interval and that the two interior extreme

points, y2 and y3, must be the roots of the quadratic equation. Note,

incidentally, that this argument also establishes that

[{P2*(x) – √x} / {√x}]

must be a standard error function.

TABLE 1

After x2 x1 a0 a1 a2 μ

iteration

0 .400000 .800000 — — — —

1 .137582 .692428 .171896 1.339080 – .525123 – .014147

2 .146345 .588965 .170415 1.453820 – .659052 – .034816

3 .148921 .590636 .171509 1.442061 – .649966 – .036396

4 .148917 .590599 .171509 1.442106 – .650022 – .036407

Table 1 shows the results of four iterations with Remez' method. The

numbers shown in this table have been rounded to six decimal places.

Values of the iterates converged to six decimals in these four iterations.

cos (1/4) πx

with minimax absolute error in [–1, 1] . Use the Remez'

method. The initial values are prescribed by

xk = (1/2) (b – a) cos [{n – k + 2} / {n + 1}] + (1/2) (b + a),

k = 1, 2, ..., n + 2. (1)

Solution: Since

cos (1/4) πx

is an even function and since *the approximation interval is symmetric

about the origin, P7*(x) contains only even-degree terms.

Let

P7*(x) = a0* + a2*x2 + a4*x4 + a6*x6.

The absolute-err or function P7*(x) – cos (1/4) πx is a standard error

function. It is also an even function, and its nine critical points are

therefore symmetrically located about the origin. That is,

– x1* = a9*, – x2* = x8*, – x3* = x7* – x4*, = x6*

and

x3* = 0.

If the starting values in step 1 of Remez' method are chosen so that

analogous conditions hold, then this symmetry be preserved throughout

subsequent iterations, and the odd- degree coefficients of P7(x) in step 2

will be zero on every iteration. Taking advantage of these facts, apply

Remez method in the following way:

Step 1) Because of the symmetries mentioned above, it is necessary to

use only x5, x6, x7 x8, and x9. For initial values use (1), that is,

xk = cos [{(9 – k)π} / {8}],

k = 5, 6, 7, 8, 9.

Step 2) The system of linear equations

a0 + a1xk + ... + anxkn – (– 1)kμ = F(xk),

k = 1, 2,..., n + 2,

for the n + 2 unknowns a0, a1,..., an, and μ,

simplifies to the followings

a0 + a2x52 + a4x54 + a6x56 + μ = cos (1/4) πx5

a0 + a2x62 + a4x64 + a6x66 – μ = cos (1/4) πx6

a0 + a2x72 + a4x74 + a6x76 + μ = cos (1/4) πx7

a0 + a2x82 + a4x84 + a6x86 – μ = cos (1/4) πx8

a0 + a2x92 + a4x94 + a6x96 + μ = cos (1/4) πx9.

On each iteration this system of equations is solved numerically ,by

means of Gaussian elimination.

Step 3) Since P7(x) – cos (1/4) πx is an even function, its extreme points

are distributed symmetrically about the origin. Therefore, as was said

earlier, it is necessary to locate only y5, y6, y7, y8, and y9. On every

iteration y5 = 0 and y9 = 1. To obtain y6, y7, and y8, differentiate the

absolute-error function P7(x) – cos (1/4) πx and compute the zeros of the

derivative. This can be done by isolating each yk by searching and then

computing it accurately by bisection. Since the starting values chosen for

step 1 are very good, yk is always close to xk and can be isolated easily by

searching near xk.

Because the initial choices for the xk's happen to be very accurate, after

only one iteration a0, a2, a4, a6, and μ become stabilized to ten decimal

places. The xk's do not converge quite so rapidly. The results after one

iteration are shown below, rounded to ten decimal places.

a0 = .9999999724 a6 = – .0003188805

a2 = – .3084242536 μ = .0000000276

a4 = .0158499153

PROBLEM 02 – 0043: Develop the power series expansion of cos (1/4) πx.

Solution: cos (1/4) πx = ∞∑k=0 [{(– 1)k {(1/4) π}2k} / {(2k)!}] x2k

= 1 – .3084251375x + .0158543442x4*

– .0003259919x6 + .0000035909x8 – .0000000246x10

+ ... ,

where the coefficients have been rounded to ten decimal places. Let P6(x)

denote the polynomial of degree six obtained by truncating this series after

the fourth term, and consider this polynomial as an approximation to cos

(1/4) πx for – 1 ≤ x ≤ 1. Graphs of the absolute-error function P6(x) – cos

(1/4) πx and the relative-error function

[{P6(x) – cos (1/4) πx} / {cos (1/4) πx}]

for 0 ≤ x ≤ 1 are shown in the figure. For both absolute error and relative

error, the magnitude of the error is greatest when x = 1. Simple

calculations show that the maximum absolute error is approximately .357

(10–5) and that the maximum relative error is approximately .504(10–5)

R5,2(x) in V5, 2[– 1, 1] that approximates cos (1/4)πx with

minimax absolute error in [– 1, 1]. Hint: Since cos (1/4)πx is

an even function and the approximation interval is symmetric

about the origin, the odd-degree coefficients in

Rm,n*(x) = [{p0* + p1*x + p2*x2 + ... + pm*xm}

/ {q0* + q1*x + q2*x2 + ... + qn*xn}], (1)

vanish so that it becomes just

R5,2* (X) = [{p0* + p2*x2 + p4*x4} / {1 + q2*x2}]

Solution: Approximate numerical values for p0*, p2*, p4*, and. Q2* need to be

computed. Since the absolute-error function for this approximation is an

even function, too, the nine critical points are symmetrically located about

the origin. That is,

– x1* = x9*, – x2* = x8*, – x3* = x7*, – x4* = x6*, and x5* = 0.

Choose the starting values xk, k = 1, 2, ..., 9, so that similar conditions hold

for them; then this will continue to be true on every iteration of Remez'

method, and the odd–degree coefficients of R5,2(x) in step 2 will vanish on

every iteration. Now apply the steps in Remez' method in the following

way:

Step 1) Because of the symmetries mentioned above, one needs to us

only x5, x6, x7, x8, and x9. Using

xk = (1/2) (b – a) cos [{(m + n + 2 – k)π} / {m + n + 1}] + (1/2) (b – a),

k = 1, 2, ..., m + n + 2,

to determine starting values, one gets

xk = cos [{(9 – k)π} / {8}], k = 5, 6, 7, 8, 9.

Step 2) The system of nonlinear equations

p0 + p1xk + ... + pmxkm + q1xk [– F(xk) – (–1)kμ] + q2xk2 [– F(xk)

– (–1)kμ] +...+ qnxkn [– F(xk) – (–1)kμ] – (– 1)k μ = F(xk),

k = 1, 2, 3,..., m + n + 2,

for the m + n + 2 unknowns p0, p1,..., pm, q1, q2,..., qn, and μ; becomes

p0 + p2x52 + p4x54 + q2x52 (– cos (1/4) πx5 + μ) + μ = cos (1/4)πx5

p0 + p2x62 + p4x64 + q2x62 (– cos (1/4) πx6 + μ) + μ = cos (1/4)πx6

p0 + p2x72 + p4x74 + q2x72 (– cos (1/4) πx7 + μ) + μ = cos (1/4)πx7

p0 + p2x82 + p4x84 + q2x82 (– cos (1/4) πx8 + μ) + μ = cos (1/4)πx8

p0 + p2x92 + p4x94 + q2x92 (– cos (1/4) πx9 + μ) + μ = cos (1/4)πx9

On each iteration with Remez' method, this system is solved for p0, p2,

p4, q2, and μ by the iterative method.

Step 3) Since the absolute–error function R5,2(x) – cos (1/4) πx is an even

function, it is sufficient to locate its extreme points in [0, 1], instead of

[– 1, 1]. The coefficients of R5,2(x) are the numbers obtained in step 2.

Always have y5 = 0 and y9 = 1. To obtain y6, y7, and y8, differentiate the

error function and compute the zeros of the derivative. This can be done

by isolating each yk by searching and then computing it accurately by

bisection. Since the starting values for the xk's chosen in step 1 happen to

be very good, yk is close to xk on every iteration and can easily be isolated

by searching in the neighborhood of xk.

Results after two iterations with Remez' method are shown below,

with constants rounded to ten decimal places. After two iterations the

numbers have stabilized to ten decimals.

p0 = 1.0000000241 q2 = .0209610796

p2 = – .2874648358 y = – .0000000241

p4 = .0093933390

for

– 1 ≤ x ≤ 1,

where

μ = (1/2) In 3.

Solution: First, find a rational approximation

R2,4(x) to

[{tan μx} / {x}]

by Maehly's method and use xR2,4(x) as an approximation to tanh μx. The object of using this

procedure (instead of applying Maehly's method to tanh μx itself) is to obtain an approximation to

tanh μx with low relative error around the origin, at which tanh μx is zero. Consider the equations

p0 = q0c0 + (1/2) n∑r=1 qrcr,

pk = q0ck + (1/2) qkc0 + (1/2) n∑r=1 qr(cr+k + c|r–k|),

k = 1,2, ..., m + n. (1)

In this case use m = 2 and n = 4. The ck's are the coefficients of the Chebyshev series for

[{tanh μx} / x];

all the odd–numbered ck's are zero because

[{tanh μx} / x]

is an even function. It is not difficult to show that the unknowns p1, q1, and q3 equal zero. If all the

zero terms are omitted and one simplifies the equations, then (1) becomes

p0 = q0c0 + (1/2) q2c2 + (1/2) q4c4

p2 = q0c2 + q2[c0 + (1/2)c4] + q4[(1/2)c2 + (1/2)c6]

0 = q0c4 + q2[(1/2)c2 + (1/2)c6] + q4[c0 + (1/2)c8]

0 = q0c6 + q2[(1/2)c4 + (1/2)c8] + q4[(1/2)c2 + (1/2)c10].

Approximate numerical values for the ck's appearing in the equations above can be computed by

the method of Chebyshev series expansion. In order to obtain a particular solution of the

equations, set q0 = 1 and solve for p0, p2, q2, and q4, computing approximate numerical values for

these unknowns. Then

P2(x) = .52320016427 T0(x) + .00739002644 T2(x)

= .51581013784 + .01478005287x2

and

Q4(x) = T0(x) + .06107956977 T2(x) + .00010081226 T4(x)

= .93902124249 + .12135264149x2 + .00080649804x4,

where the coefficients are rounded to 11 decimal places. Then

R2,4(x) = P2(x) / Q4(x) can be expressed in the form

R2,4(x) = [{.54930614399 + .01573984933x2}

/ {1 + .12923311636x2 + .00085887093x4}],

where the numerator and denominator have been scaled so that in thedenominator equals one

and the other coefficients are rounded to 11decimal places. The approximation to tanh μx for – 1 ≤

x ≤ 1 is xR2, 4(x).

CHAPTER 03: SERIES

Infinite Series (Test For Convergence And Divergence)

Taylor’s Series Expansion

Maclaurin’s Series Expansion

Power Series Expansion

Laurent Series Expansion

PROBLEM 03 – 0046: Define the term series with computational considerations.

Solution:

A series is a group of terms, often infinite, where the terms are formed according to some general

rule. Series are very useful for numerical computations of such constants as e, π, etc.; and for the

computations of terms such as log x, sin x, etc.

Note that a series can be used for computational purposes only if it converges, that is the only

kind of series that can be used for computation is a convergent series. The reason for this is that if

a series were not convergent but divergent, as one would add more and more terms, a continually

different result would be obtained. With a convergent series, however, the series approaches a

limit, and gives an ever more accurate result, the more terms added. Adding more terms does not

change the result substantially after the first several terms, it only makes the result more accurate.

Assuming, then, that a series converges, it can be differentiated, integrated, added to other series,

subtracted from other series, multiplied by a constant, etc.

For example, if the series for sin x is known, the series for cos x can be found by differentiating that

series term by term, since it is known that

(d / dx) (sin x) = cos x

In a computation, once convergence is established, the number of terms to be used in the

computation depends only on the accuracy desired.

a) ∞∑n=2 [(log n)2 / n3] is a convergent series.

b) ∞∑n=2 [1 / {n(log n)p}] is convergent if p > 1,

divergent for p ≤ 1.

Solution: a) To show

∞∑ [(log n)2 / n3]

n=2

is a convergent series apply the integral test, Hence, it must be shown that

limb→∞ b∫2 [(log x)2 / x3] dx < ∞.

To do this, use the method of integration by parts, Therefore, take

dv = x–3 dx;

v = [(– 1) / 2x2], u = (log x)2, du = [{2(log x)} / x]dx.

Then,

b∫ [(log x)2 / x3] dx = [{– (log x)2} / 2x2] |b 2

2

Now, applying integration by parts again, let

dv = x–3 dx;

v = [(– 1) / 2x2], u = log x, du = (dx / x).

Then (1) becomes

b∫ [(log x)2 / x3] dx = [{– (log x)2} / 2x2] |b 2 – [(log x) / 2x2] |b 2

2

Hence

limb→∞ b∫2 [(log x)2 / x3] dx

equals

limb→∞ ( [{– (log b)2} / 2b2] – [(log b) / 2b2] – (1 / 4b2) + [(log 2)2 / 8]

+ [(log 2) / 8] + [1 / 16])

Then, by L'Hospital's Rule the first three terms tend to 0 as b→∞.

Therefore

limb→∞ b∫2 [(log x)2 / x3] dx = [(log 2)2 / 8] + [(log 2) / 8] = [1 / 16] < ∞

Thus, the series is convergent.

b) To show

∞∑ [1 / {n(log n)p}]

n=2

Therefore set up the integral

b∫ [dx / {x(log x)p}].

2

This yields

log b∫ (du / up) = [u–p+1 / (– p + 1)] |log b log 2

log 2

Therefore,

limb→∞ b∫2 [dx / {x(log x)p}] = limb→∞ ( [(log b)–p+1 / (– p + 1)]

– [(log 2)–p+1 / (– p + 1)] )(p ≠ 1).

If p > 1,

the first term goes to 0 as b→∞ and the integral equals

[(log 2)–p+1 / (p – 1)] < ∞.

Thus, the series is convergent.

If p < 1, the first terra grows without bound as b→∞ and the integral is unbounded. Thus, the

series is divergent. For p = 1,

limb→∞ b∫2 [dx / {x(log x)p}] = limb→∞ b∫2 [dx / {x(log x)}]

= limb→∞ log(log b) – [log(log 2)]

Thus, the series is divergent. This shows that the series is convergent for p > 1;

and divergent for p ≤ 1.

(1/2) + (1/3) + (1 / 22) + (1 / 32) + (1 / 23) + (1 / 33) + …

is convergent or divergent, using the ratio test and the root test.

Solution:

The series can be rewritten as the sum of the sequence of numbers given by

an = [1 / 2{(n+1)/2}] if n is odd (n > 0)

and = [1 / 3(n/2)] if n is even (n > 0)

Now the ratio test states:

If ak > 0

and

limk→∞ (ak+1 / ak) = ℓ < 1,

then

∞∑ ak converges.

k=1

Similarly, if

limk→∞ (ak+1 / ak) = ℓ(1 < ℓ ≤ ∞)

then

∞∑ ak diverges.

k=1

If n is odd,

limn→∞ (an+1 / an) = limn→∞ [(1 / 3(n/2)) / (1 / 2{(n+1)/2})]

= limn→∞ (2{(n+1)/2} / 3(n/2))

= limn→∞ (2/3)(n/2) 2(1/2) = 0.

If n is even,

limn→∞ (an+1 / an) = limn→∞ [(1 / 2{(n+1)/2}) / (1 / 3(n/2))]

= limn→∞ (3/2)(n/2) 2(1/2)

and no limit exists.

Hence, the ratio test gives two different values, one < 1 and the other > 1, therefore the test fails to

determine if the series is convergent. Thus, another test, known as the root test is now applied.

This test states:

Let

∞∑ ak

k=1

limn→∞ (n√an) = S,

where

0 ≤ S ≤ ∞.

If:

1) 0 ≤ S < 1, the series converges

2) 1 < S ≤ ∞, the series diverges

3) S = 1, the series may converge or diverges.

Applying this test yields, if n is odd

limn→∞ n√an = limn→∞ n√[(1 / 2{(n+1)/2})

= limn→∞ n√(1 / 2(n/2)) n√[1 / {2(1/2)}]

= limn→∞ (1 / √2) (1 / 2(1/2n))

= (1 / √2) < 1.

If n is even

limn→∞ n√an = limn→∞ n√(1 / 3(n/2))

= (1 / √3) < 1.

Thus, since for both cases, the

limn→∞ n√an < 1,

the series converges.

PROBLEM 03 – 0049:

Determine if the series.

a) ∞∑k=1 [(k + 1)(1/2) / (k5 + k3 – 1)(1/3)]

converges or diverges. Use the limit test for convergence.

b) ∞∑k=1 [(k log k) / (7 + 11k – k2)]

converges or diverges. Use the limit test for divergence.

Solution:

a) To determine if the given series converges or diverges, the following test called the limit test for

convergence is used. This test states:

If

limk→∞ kp Uk = A for p > 1,

then

∞∑ Uk

k=1

converges absolutely.

To apply this test let p = (7/6) > 1 . Then since

Uk = [(k + 1)(1/2) / (k5 + k3 – 1)(1/3)],

limk→∞ kp Uk = limk→∞ [{k(7/6) (k + 1)(1/2)} / (k5 + k3 – 1)(1/3)]

= limk→∞ [{k(7/6) k(1/2)[1 + (1/k)](1/2)} / (k5 + k3 – 1)(1/3)]

= limk→∞ [{k(5/3) [1 + (1/k)](1/2)} / (k5 + k3 – 1)(1/3)]

= limk→∞ [{1 + (1/k)}(1/2) / {k–5(k5 + k3 – 1)}(1/3)]

= limk→∞ [{1 + (1/k)}(1/2) / {1 + (1/k2) – (1/k5)}(1/3)]

= [(1)(1/2) / (1)(1/3)]

=1

Therefore, the series converges absolutely.

b) For the series

∞∑ [(k log k) / (7 + 11k – k2)],

k=1

the following test called the limit test for divergence is used. This test states, if

limk→∞ k Uk = A ≠ 0 (or ± ∞)

then

∞∑ Uk

k=1

Since

Uk = [(k log k) / (7 + 11k – k2)]

limk→∞ k Uk = limk→∞ [(k2 log k) / (7 + 11k – k2)]

= limk→∞ [(k2 log k) / {k2[(7/k2) + (11/k) – 1]}]

= limk→∞ [(log k) / {(7/k2) + (11/k) – 1}].

As k → ∞, log k → ∞

while

(7/k2) + (11/k) – 1 → – 1.

Thus

∞∑ [(k log k) / (7 + 11k – k2)] diverges.

k=1

PROBLEM 03 – 0050:

Find the Taylor series expansion of f(x) about x = x0 and show that a complete knowledge of the

behavior of f(x) at a point x0 determines the value of the function at any point x at which the series

converges.

Solution:

Let f(x) be a function with derivatives of all orders at x = x0, and assume that in a neighborhood of

x0, f(x) may be represented by the power series in x – x0:

f(x) ≡ c0 + c1(x – x0) + c2(x – x0)2 + ...

= ∞∑n=0 cn(x – x0)n. (a)

The derivatives of f(x) may be obtained by differentiating the series (a) term by term:

f'(x) = ∞∑n=1 ncn(x – x0)n–1

= c1 + 2c2(x – x0) + 3c3(x – x0)2 + ..., (b)

f"(x) = ∞∑n=2 n(n – 1) cn(x – x0)n–2

= 2 ∙ 1c2 + 3 ∙ 2c3(x – x0) + ..., (c)

.....................................................................................................

f(m)(x) = ∞∑n=m n(n – 1)(n – 2) ... (n – m + 1)cn(x – x0)n–m

= ∞∑n=m [n! / (n – m)!] cn(x – x0)n–m (d)

= m! cm + (m + 1)! cm+1(x – x0) + ...

Setting x = x0 on both sides of (a) , (b) , (c), and (d), one finds

f(x0) = c0 or c0 = f(x0),

f'(x0) = c1 or c1 = f'(x0),

f"(x0) = c2 or c2 = [f"(x0) / 2],

.....................................................................

f(m)(x0) = m! cm or cm = [{f(m)(x0)} / m!],

by means of which (a) becomes

f(x) = ∞∑n=0 [{f(n)(x0)} / n!] (x – x0)n, (e)

where

f(n)(x0) ≡ [(dnf) / (dxn)] |x=(x)0,

f(0)(x0) ≡ f(x0)

and 0! = 1.

Equation (e) is the Taylor series expansion. Note also that by setting x0 = 0 in (e), one gets the

Maclaurin series expansion:

f(x) = ∞∑n=0 [{f(n)(0)} / n!] xn.

PROBLEM 03 – 0051:

Evaluate √(1 + x) to two decimal places at x = .4 by Taylor series expansion. Consider the error

expression:

|Rm| ≤ |[{f(m)(x0)} / m!](x – x0)m|,

the mth derivative is computed at x0. |Rm| is the truncation error after the mth term.

= 1 + (1/2)x – (1/8)x2 + (1 / 16)x3 – (5 / 128)x4 + ... .

To evaluate √(1 + .4) to two decimal places, at x = .4

1 + (1/2)(.4) – (1/8)(.4)2 = 1.18 [1.183];

|R3| < [(.4)3 / 16] = .004 [.003].

PROBLEM 03 – 0052:

Using the Taylor series expansion for ex about x = 0, find e0.5 to (e) (0.5)3. Bound the error using

the error expression

|[(dn+1f) / dxn+1]|max ∙ [(|x – a|)n+1 / (n + 1)!] (1)

(The subscript "max" denotes the maximum magnitude of the derivative on the interval from a to x

and (e) means error order), and compare the result with the actual error.

f(x) = f(a) + (x – a)f'(a) + [(x – a)2 / 2!] f"(a) + [(x – a)3 / 3!] f'"(a) + ... + [(x – a)n / n!] f(n)(a) + ...

ex = e(0) + xe(0) + (x2 / 2!) e(0) + (x3 / 3!) e(0) + (x4 / 4!) e(0) + ...

or

ex = 1 + x + (x2 / 2!) + (x3 / 3!) + (x4 / 4!) + ...

and if x = 0.5,

e0.5 = 1 + 0.5 + [(0.5)2 / 2!] + [(0.5)3 / 3!] + [(0.5)4 / 4!] + ...

or, to (e)(0.5)3,

e0.5 = 1 + 0.5 + [(0.5)2 / 2!] = 1.625

Now according to (1), the error in this quantity should not be greater than

|[{d3 (ex)} / dt3]|max [(0.5)3 / 3!] = |ex|max (0.0208333)

where max denotes the maximum magnitude on 0 ≤ x ≤ 0.5.

|ex|max = e0.5 = 1.6487213,

so the error is no greater in magnitude than

(1.6487213) (0.0208333) = 0.0343831

The actual error is

e0.5 – 1.625 = 1.6487213 – 1.6250000 = 0.0237213

which lies within the error bound. Notice that in this case the error which was (e) (0.5) 3 was actually

0.0237213 or about 0.19(0.5)3.

PROBLEM 03 – 0053:

Show that f(x) = (x – 1)(1/2) cannot be expanded in a Taylor series about x = 0 or x = 1, but can be

expanded about x = 2. Carry out the expansion about x = 2.

f'(x) = (1/2) (x – 1)–(1/2)

f"(x) = – (1/4) (x – 1)–(3/2)

f"'(x) = (3/8) (x – 1)–(5/2)

For an expansion about x = 0, the quantities f(0), f'(0), f"(0), etc. are needed. These involve

noninteger powers of (– 1), e.g. (– 1)(1/2). These cannot be evaluated to give real values and thus this

expansion is impossible.

For an expansion about x = 1, one needs quantities such as

f(1) = (0)(1/2)

f'(1) = (1/2) (0)–(1/2)

f"(1) = – (1/4) (0)–(3/2)

While f(1) is bounded, all of the derivatives f'(1), f"(1), etc. are unbounded. Thus the expansion

about x = 1 is impossible.

The Taylor series expansion about x = 2 is

f(x) = (2 – 1)(1/2) + (x – 2)[(1/2)(2 – 1)–(1/2)] + [(x – 2)2 / 2][– (1/4)(2 – 1)–(3/2)]

+ [(x – 2)3 / 3!] [(3/8) (2 – 1)–(5/2)] + ...

All derivatives are bounded and finite and the series is

f(x) = 1 + [(x – 2) / 2] – [(x – 2)2 / 8] + [(x – 2)3 / 16] – ...

This series is convergent for |x – 2| < 1.

PROBLEM 03 – 0054: Write the Taylor expansion for ln x about the point x = 1. Use n terms.

Solution: f(x) = ln x f(1) = 0

f'(x) = (1/x) f'(1) = 1

f"(x) = [(– 1) / x2] f"(1) = – 1

f"'(x) = [(+ 2) / x3] f"'(1) = 2

fiv(x) = – 2 × (3 / x4) fiv(1) = – 2 × 3

∙ ∙

∙ ∙

∙ ∙

f(n–1)(x) = [{(– 1)n(n – 2)!} / (xn–1)] f(n–1)(1) = (– 1)n(n – 2)!

f(n)(x) = [{(– 1)n+1(n – 1)!} / xn] f(n) [1 + θ(x – 1)]

= [{(– 1)n+1(n – 1)!} / {1 + θ(x – 1)}]

Substituting these values into the expression

f(x) = f(a) + (x – a)f'(a) + [(x – a)2 / 2!]f"(a) + .... + [(x – a)n–1 / (n – 1)!]f(n–1)(x)

+ [(x – a)n / n!] f(n)[a + θ (x – a)],

0 < θ < 1,

and remembering that a = 1 in this case, one obtains

ln x = (x – 1) – (1 / 2!)(x – 1)2 + (2 / 3!)(x – 1)3

– [(2 × 3) / 4!] (x – 1)4 + ... + [{(– 1)n(n – 2)!(x – 1)n–1} / (n – 1)!]

+ [{(– 1)n+1(n – 1)! (x – 1)n} / {n![1 + θ (x – 1)]n}],

0<θ<1

or, simplifying the expression,

ln x = (x – 1) – [(x – 1)2 / 2] + [(x – 1)3 / 3] – [(x – 1)4 / 4] + ...

+ [{(– 1)n(x – 1)n–1} / (n – 1)] + [{(– 1)n+1(x – 1)n} / {n(1 – θ + θx)n}],

0<θ<1

PROBLEM 03 – 0055:

(a) Derive the first three (non-zero) terms of Taylor's series expansion for the function f(x) = sin x,

about the origin. Since it is about the origin, this series is also called a Maclaurin series.

(b) Use the above result to compute the value of sin x for

x = 0.2 rad, correctly to five decimal places.

Solution: (a) With the expansion about the origin the Taylor's series is:

f(x) = f(0) + xf'(0) + (x2 / 2!)f"(0) + (x3 / 3!)f(3)(0) + (x4 / 4!)f(4)(0)

+ (x5 / 5!)f(5)(0) + ...

For

f(x) = sin x f(0) = 0

f'(x) = cos x f'(0) = 1

f"(x) = – sin x f"(0) = 0

f(3)(x) = – cos x f(3)(0) = – 1

f(4)(x) = sin x f(4)(0) = 0

f(5)(x) = cos x f(5)(0) = 1

Therefore,

sin x = [(x – x3) / 3!] + (x5 / 5!) + ...

(b) For x = 0.2 rad,

sin 0.2 = 0.2 – [(0.2)3 / 6] + [(0.2)5 / 120] + ...

sin 0.2 = 0.2 – 0.0013333 + 0.0000027 + ...

To five decimal places,

sin 0.2 = 0.19867

It is noted that this is the correct answer for sin 0.2, to five decimal places. Actually, one can tell

from the above computations that this answer is correct to five places by using the theorem which

says that in a converging alternating series the error committed in stopping with any term is

always less than the first term neglected. The first term omitted in the calculation above for sin 0.2

is the third term, and it contributes only 3 in the sixth place.

f(x) = (1 + x)(1/2)

by means of the Taylor expansion about x0 = 0.

Solution:

f'(x) = [1 / {2(1 + x)(1/2)}]

f"(x) = [(– 1) / {4[(1 + x)(1/2)]3}]

f(0) = 1

f'(0) = (1/2)

Thus the Taylor expansion with remainder term is

(1 + x)(1/2) = 1 + (x/2) – [x2 / {8[(1 + ξ)(1/2)]3}]

(Here ξ lies between 0 and x.).

Thus the first-degree polynomial approximation to (1 + x)(1/2) is [(1 + x) / 2].

The accuracy of this approximation depends on what set of values x is allowed to assume. If x ε [0,

.1] , for example, one gets

|(1 + x)(1/2) – [1 + (x/2)]| ≤ [.12 / {8[(1 + ξ)(1/2)]3}] ≤ (.12 / 8) = .00125

PROBLEM 03 – 0057:

Obtain a second-degree polynomial approximation to f(x) = e–(x)2 over [0, .1] by means of the

Taylor expansion about x0 = 0. Use the expansion to approximate f(.05), and bound the error.

f"(x) = (– 2 + 4x2)e–(x)2 f"'(x) = (12x – 8x3)e–(x)2

f(0) = 1 f'(0) = 0 f"(0) = – 2

Thus f(x) = 1 – x2 + [{f"'(t)x3} / 6]

where

0 < t < .1

Using f(x) ≅ 1 – x2, one gets f(.05) ≅ .9975. The truncation error is bounded by

|f(.05) – .9975| ≤ [(.05)3 / 6]maxtεI |f"'(t)|

where I = [0, .05]. To obtain a bound on |f'"(t)|, use the result that, for t ε I,

|f"'(t)| = |12 – 8t2| ∙ |t| ∙ |e–(t)2| < maxtεI |12 – 8t2| ∙ maxtεI |t| ∙ maxtεI |e–(t)2|

Hence

|f(0.5) – .9975| < [(.05)3 / 6] (12) (.05) (1.0) = 1.25 × 10–5

PROBLEM 03 – 0057:

Obtain a second-degree polynomial approximation to f(x) = e–(x)2 over [0, .1] by means of the

Taylor expansion about x0 = 0. Use the expansion to approximate f(.05), and

bound the error.

f"(x) = (– 2 + 4x2)e–(x)2 f"'(x) = (12x – 8x3)e–(x)2

f(0) = 1 f'(0) = 0 f"(0) = – 2

Thus f(x) = 1 – x2 + [{f"'(t)x3} / 6]

where

0 < t < .1

Using f(x) ≅ 1 – x2, one gets f(.05) ≅ .9975. The truncation error is bounded by

|f(.05) – .9975| ≤ [(.05)3 / 6]maxtεI |f"'(t)|

where I = [0, .05]. To obtain a bound on |f'"(t)|, use the result that, for t ε I,

|f"'(t)| = |12 – 8t2| ∙ |t| ∙ |e–(t)2| < maxtεI |12 – 8t2| ∙ maxtεI |t| ∙ maxtεI |e–(t)2|

Hence

|f(0.5) – .9975| < [(.05)3 / 6] (12) (.05) (1.0) = 1.25 × 10–5

(1/2) 1∫–1 [(sin x) / x] dx

by means of the Taylor expansion.

Solution: The expansion of sin x in a Taylor series about x0 = 0 through terms of degree 6 is

sin x = x – (x3 / 3!) + (x5 / 5!) – (x7 / 7!) cos ξ

[(sin x) / x] = 1 – (x2 / 6) + (x4 / 120) – (x6 / 7!) cos ξ

Thus

(1/2) 1∫–1 [(sin x) / x] dx = (1/2) [x – (x3 / 18) + (x5 / 600)] |1 –1

– (1/2) 1∫–1 (x6 / 7!) cos ξ dx

This simplifies to

(1/2) 1∫–1 [(sin x) / x] dx = (1,703 / 1,800) – (1/2) 1∫–1 (x6 / 7!) cos ξ dx

To bound the error, compute

|(1/2) 1∫–1 (x6 / 7!) cos ξ dx| ≤ (1/2) 1∫–1 (x6 / 7!) |cos ξ| dx

≤ (1/2) 1∫–1 (x6 / 7!) dx = [1 / {(7)(7!)}]

Hence

|(1/2) 1∫–1 [(sin x) / x] dx – (1,703 / 1,800)| ≤ [1 / {(7)(7!)}] < 3.0 × 10–5

Taylor's series.

Solution: The general idea here is that the equation x = cos x is to be replaced by an

equation of the form

x = 1 – (x2 / 2!) + (x4 / 4!) – (x6 / 6!) ... + (– 1)n [x2n / (2n)!] (1)

Taking n = 1, for example, solve

x = 1 – (x2 / 2)

to obtain x = – 1 ±3(1/2). Since the only solution to x = cos x lies in the interval [0, 1], one obtains

– 1 + 3(1/2)

as an approximate solution. This value can be. used as an initial guess in a fixed-point iteration to

solve the equation. The iteration might be performed on the original equation x = cos x or on the

approximate equation (1) for some n > 1.

PROBLEM 03 – 0060: Find sin .1, using Taylor's series and t = 2 floating-point arithmetic.

Hence

sin .1 = .1 – (.13 / 6) + (.15 / 120) – (.17 / 5,040) ...

Now in floating-point arithmetic

(.13 / 6) = [{(.10 × 100)(.10 × 100)(.10 × 100)} / (.60 × 101)]

= [(.10 × 10–2) / (.60 × 101)]

= .17 × 10–3

The sum of the first two terms of the series is

.10 × 100 – .17 × 10–3 = .10 × 100

Thus, the second term and all subsequent terms of the series do not affect the result.

a = – (π / 4),

and determine the interval of convergence.

Taylor Series for the function. To find the Taylor Series determine f(x), f(a), f'(x), f'(a), f"(x), f"(a) , etc.

One finds:

f(x) = cos x; f(a) = f [– (π / 4)] = (√2 / 2).

f'(x) = – sin x; f'(a) = f' [– (π / 4)] = (√2 / 2).

This is a positive value because the value of sin x in the 2nd quadrant is negative, therefore – sin x

is positive.

f"(x) = – cos x; f"(a) = f" [– (π / 4)] = [(– √2) / 2].

f"'(x) = sin x; f"'(a) = f"' [– (π / 4)] = [(– √2) / 2].

Develop the series as follows:

f(x) = f(a) + f'(a)[x – a] + [{f"(a)} / 2!][x – a]2 + [{f"'(a)} / 3!][x – a]3 + ....

By substitution:

cos x = (√2 / 2) + (√2 / 2) [x + (π / 4)] – [(√2 / 2) / 2!] [x + (π / 4)]2

– [(√2 / 2) / 3!] [x + (π / 4)]3 + .....

To determine the law of formation, examine the terms of this series. The nth term of the series is

found to be:

[(√2 / 2) / (n – 1)!] [x + (π / 4)]n–1;

Then, the (n + 1)th term is:

[(√2 / 2) / n!] [x + (π / 4)]n.

Therefore the Taylor series is:

cos x = (√2 / 2) + (√2 / 2) [x + (π / 4)] – [(√2 / 2) / 2!] [x + (π / 4)]2

– [(√2 / 2) / 3!] [x + (π / 4)]3 + ... + [(√2 / 2) / (n – 1)!] [x + (π / 4)]n–1

+ [(√2 / 2) / n!] [x + (π / 4)]n + ...

To find the interval of convergence, use the Ratio Test. Set up the ratio

(Un+1 / Un),

obtaining:

[{√2[x + (π / 4)]n} / {2(n!)}] × [{2(n – 1)!} / {√2[x + (π / 4)]n–1}]

= [{x + (π / 4)}n / {n(n – 1)!}] × [(n – 1)! / {x + (π / 4)}n–1]

= [{x + (π / 4)} / n].

Now, one finds

limn→∞ |[{x + (π / 4)} / n]| = |0| = 0.

By the ratio test it is known that if

limn→∞ |(Un+1 / Un)| < 1

the series converges. Since 0 < 1, the series converges for all values of x.

compact convex domain E containing (0, 0).

Solution: Assume f(x, y) = ɸ (x) Ψ(y) where ɸ (x) = ex and Ψ(y) = cos y. Also assume E =

Ex × Ey where Ex and Ey are compact convex subsets of R, i.e., closed and bounded intervals. Note

that

Supxε(E)x ||[{dkɸ (x)} / dxk]|| = emax(E)x

and

Supyε(E)y ||[{dkΨ(y)} / dxk]|| ≤ 1.

Hence ɸ (x) and Ψ(y) are real analytic on E. Furthermore, note that

ex = ∞∑i=0 (xi / i!) [(diex) / (dxi)] |x=0 = ∞∑i=0 (xi / i!) (1)

and

cos y = ∞∑i=0 (yi / i!)[(di cos y) / dyi] |y=0 = ∞∑i=0 (– 1)i [y2i / (2i)!] (2)

Since f = ɸ Ψ and ɸ and Ψ are real analytic on E, f is real analytic on E:

Supx, yεE ||[{∂kf(x, y)} / (∂xi∂yk–1)]|| ≤ emax(E)x.

hence, from (1) and (2)

ex cos y = [∞∑i=0 (xi / i!)] [∞∑j=0 (– 1)j {y2j / (2j)!}]

= ∞∑k=0 [∑(i+j=k)(i,j≥0) (– 1)j {(xiy2j) / [i!(2j)!]}].

The first three terms corresponding to k = 0, k = 1, and k = 2 give the approximation:

ex cos y ≈ 1 + x – (1/2)y2 + (1/2)x2 – (1/2)xy2 + (1 / 24)y4

which is known to be accurate near (0, 0).

n∑ αiti

i=0

1 + t + 3t4.

f(t) = n∑i=0 αiti

as a polynomial in x = t – 1, i.e., as

g(x) = m∑i=1 bixi,

use Taylor's Theorem. Note that f is C∞ and that

f'(t) = n∑i=0 iαiti–1,

f"(t) = n∑i=0 i(i – 1)αiti–2,...,

f(n)(t) = n∑i=0 i(i – 1) ... (i – n + 1)αiti–n,

and

0 = f(n+1)(t) = f(n+2)(t) = ....

To get a polynomial in t – 1, expand about 1. Hence

f(1) = n∑i=0 αi, f'(1) = n∑i=1 iαi, f"(1) = n∑i=2 i(i – 1)αi,...,

f(k)(1) = n∑i=k i(i – 1) ... (i – k + 1)αi, ...,

f(n)(1) = n! αn.

Therefore, the Taylor expansion

f(t) = n∑j=0 [{f(j)(1)} / j!] (t – 1)j

= n∑j=0 (1 / j!) n∑i=j i(i – 1) ... (i – j + 1) αi(t – 1)j

= n∑j=0 (1 / j!) n∑i=j [i! / (i – j)!] αi(t – 1)j

= n∑j=0 [n∑i=j {i! / [(i – j)!j!]} αi](t – 1)j

= n∑j=0 [n∑i=j (i j) αi](t – 1)j

= m∑j=0 bjxj = g(x).

Note that the degree of the new polynomial (1) is also n and the j-th coefficient is in terms of the

last n – j + 1 of the αi. For f(t) = 1 + t + 3t4, according to this method,

α0 = α1 = 1, α2 = α3 = 0

and

α4 = 3.

Hence

f(t) = 4∑j=0 [4∑i=j {i! / [(i – j)! j!]} αi](t – 1)j

= [{0! / (0!0!)}α0 + {1! / (1!0!)}α1 + {2! / (2!0!)}α2 + {3! / (3!0!)}α3

+ {4! / (4!0!)}α4](t – 1)0

+ [{1! / (0!1!)}α1 + {2! / (1!1!)}α2 + {3! / (2!1!)}α3 + {4! / (3!1!)}α4](t – 1)1

+ [{2! / (0!2!)}α2 + {3! / (1!2!)}α3 + {4! / (2!2!)}α4](t – 1)2

+ [{3! / (0!3!)}α3 + {4! / (1!3!)}α4](t – 1)3 + {4! / (0!4!)}α4 (t – 1)4

= (1 + 1 + 0 + 0 + 3) + (1 + 0 + 0 + 12)(t – 1) + (0 + 0 + 18)(t – 1)2

+ (0 + 12)(t – 1)3 + 3(t – 1)4

= 5 + 13(t – 1) + 18(t – 1)2 + 12(t – 1)3 + 3(t – 1)4. (2)

To show that (2) is actually equal to f(t); expand:

f(t) = 5 + 13(t – 1) + 18(t2 – 2t + 1) + 12(t3 – 3t2 + 3t – 1)

+ 3(t4 – 4t3 + 6t2 – 4t + 1)

= (5 – 13 + 18 – 12 + 3) + t(13 – 36 + 36 – 12) + t2(18 – 36 + 18)

+ t3(12 – 12) + t4(3)

= 1 + t + 3t4.

So, indeed,

1 + t + 3t4 = 5 + 13x + 18x2 + 12x3 + 3x4

where

x = t – 1.

PROBLEM 03 – 0064: A. Find the Taylor series expansion of f(x, y) about x0, y0.

B. Expand f(x, y) = cos xy about x0 = (π / 2), y0 = (1/2) up to

second-order terms.

about the point (x0, y0) is obtained from

f(x0 + x) = ∞∑n=0 [{f(n)(x0)} / n!] xn (a)

by incrementing one of the variables at a time. Indicating the partial derivatives of z(x, y) by

fx = (∂z / ∂x), fy = (∂z / ∂y), fxx = (∂2z / ∂x2), fxy = [(∂2z / (∂x∂y)]

fyy = (∂2z / ∂y2),.., (b)

(a) applied to f(x0 + x, y0), i.e., to the function f considered as a function of x only while y0 is kept

constant, gives

f(x0 + x, y0) = f(x0, y0) + fx(x0, y0) (x / 1!) + fxx(x0, y0) (x2 / 2!) + ...,

from which, changing y0 into y0 + y on both sides of the equation,

f(x0 + x, y0 + y) = f(x0, y0 + y) + fx(x0, y0 + y) (x / 1!)

+ fxx(x0, y0 + y) (x2 / 2!) + ... . (c)

Expanding by means of (a) the coefficient of each power of x in (c) into a

power series in y about y0, while x0 is kept constant, yields

f(x0, y0 + y) = f(x0, y0) + fy(x0, y0)(y / 1!) + fyy(x0, y0)(y2 / 2!) + ...,

fx(x0, y0 + y) = fx(x0, y0) + fxy(x0, y0)(y / 1!) + fxyy(x0, y0)(y2 / 2!) + ... , (d)

fxx(x0, y0 + y) = fxx(x0, y0) + fxxy(x0, y0)(y / 1!) + fxxyy(x0, y0)(y2 / 2!) + ...

.................................................................................................................

Substituting (d) in (c), the Taylor series expansion of f(x, y) about x0, y0 becomes

f(x0 + x, y0 + y)

= f(x0, y0) + [fx(x0, y0)x + fy(x0, y0)y]

+ (1 / 2!)[fxx(x0, y0)x2 + 2fxy(x0, y0)xy + fyy(x0, y0)y2]

+ (1 / 3!) [fxxx(x0, y0)x3 + 3fxxy(x0, y0)x2y + 3fxyy(x0, y0)xy2

+ fyyy(x0, y0)y3] + ... .

B. Here

fx = – y sin xy; fy = – x sin xy; fxx = – y2 cos xy;

fxy = – sin xy – xy cos xy; fyy = – x2 cos xy;

f[(π / 2), (1/2)] = (√2 / 2);

fx[(π / 2), (1/2)] = – (1/2)(√2 / 2); fy[(π / 2), (1/2)] = – (π / 2)(√2 / 2);

fxx[(π / 2), (1/2)] = – (1/4)(√2 / 2);

fxy[(π / 2), (1/2)] = – (√2 / 2) [1 + (π / 4)];

fyy[(π / 2), (1/2)] = – (π2 / 4)(√2 / 2);

f[(π / 2) + x, (1/2) + y] = (√2 / 2){1 – [(1/2)x + (π / 2)y]

– [(1/8)x2 + {1 + (π / 4)}xy + (π2 / 8)y2]}.

f'(x) = ex f'(0) = 1

f"(x) = ex f"(0) = 1

f"'(x) = ex f"'(0) = 1

fiv(x) = ex fiv(0) = 1

∙

∙

∙

f(n–1)(x) = ex f(n–1)(0) = 1

f(n)(x) = ex f(n)(θx) = eθx

Substituting these quantities in the expression for the Maclaurin series:

f(x) = f(0) + xf'(0) + (x2 / 2!) f"(0) + ... + [xn–1 / (n – 1)!] f(n – 1)(0)

+ (xn / n!) f(n)(θx),

0<θ<1

obtain

ex = 1 + x + (x2 / 2!) + (x3 / 3!) + ... + [xn–1 / (n – 1)!] + (xn / n!)eθx,

0<θ<1

the Maclaurin series for xe–(x)2.

Solution: The successive derivatives of this function soon become very unwieldy; hence

it is desirable to compute the coefficients of the power series and to estimate the magnitude of the

error committed by stopping at a certain term by other means. If in series

ex = 1 + (x / 1!) + (x2 / 2!) + ... + (xn / n!) + ... ,

R = ∞.

– x2 replaces x and then multiplying through by x, yields

xe–(x)2 = x – (x3 / 1!) + (x5 / 2!) – (x7 / 3!) ± ... + (– 1)n[x2n+1 / n!] + ... .

This is an alternating series and it can be shown that six terms are sufficient to yield, for

x = (1/2), [1 / (2e0.25)] = 0.38940,

a value which is correct to five decimal places.

(1/2)(ex + e–x),

and the interval of convergence.

Solution: Let

f(x) = (1/2)(ex + e–x).

To find the Maclaurin series, determine f(0), f'(x), f'(0), f"(x), f"(0), etc. One finds:

f(x) = (1/2)(ex + e–x) f(0) = 1

f'(x) = (1/2)(ex – e–x) f'(0) = 0

f"(x) = (1/2)(ex + e–x) f"(0) = 1

f"'(x) = (1/2)(ex – e–x) f"'(0) = 0

f4(x) = (1/2)(ex + e–x) f4(0) = 1, etc.

Now, the series is developed as follows:

f(x) = f(0) + f'(0)x + [{f"(0)} / 2!]x2 + [{f"'(0)} / 3!]x3 + ... .

By substitution:

(1/2)(ex + e–x) = 1 + 0 + (x2 / 2!) + 0 + (x4 / 4!) + ...

= 1 + (x2 / 2!) + (x4 / 4!) + ... .

To determine the law of formation examine the terms of this series. The nth term of the series is

found to be:

[x2n–2 / (2n – 2)!];

Then, the (n + 1)th term is:

[x2n / (2n)!].

Therefore the Maclaurin series is:

(1/2)(ex + e–x) = 1 + (x2 / 2!) + (x4 / 4!) + ... + [x2n–2 / (2n – 2)!] + [x2n / (2n)!].

To find the interval of convergence use the ratio test. Set up the ratio

(un+1 / un),

obtaining:

[x2n / (2n)!] ∙ [(2n – 2)! / x2n–2] = [x2n / {(2n)(2n – 2)!}] ∙ [(2n – 2)! / x2n–2]

= (x2 / 2n).

One finds:

limn→∞ |(x2 / 2n)| = |0| = 0.

By the ratio test it is known that, if

limn→∞ |(un+1 / un)| < 1,

the series converges. Since 0 is always less than 1, the series converges for all values of x.s

PROBLEM 03 – 0068: Find the Maclaurin series and the interval of convergence for

the function f(x) = cos x.

Solution: To find the Maclaurin series for the given function, determine f(0),

f'(x), f'(0), f"(x), f"(0), etc.

One finds:

f(x) = cos x f(0) = 1

f'(x) = – sin x f'(0) = 0

f"(x) = – cos x f"(0) = – 1

f"'(x) = sin x f"'(0) = 0

f4(x) = cos x f4(0) = 1

f5(x) = – sin x f5(0) = 0

f6(x) = – cos x f6(0) = – 1.

Develop the series as follows:

f(x) = f(0) + f'(0)x + [{f"(0)} / 2!]x2 + [{f"'(0)} / 3!]x3 + [{f4(0)} / 4!]x4

+ [{f5(0)} / 5!]x5 + [{f6(0)} / 6!]x6 + ...

By substitution:

cos x = 1 + 0 – (x2 / 2!) + 0 + (x4 / 4!) + 0 – (x6 / 6!) + ...

= 1 – (x2 / 2!) + (x4 / 4!) – (x6 / 6!) + ... .

To determine the law of formation, examine the terms of this series. The nth term of the series is

found to be

[x2n–2 / (2n – 2)!];

Then, the (n + 1)th term is:

[x2n / (2n)!].

Therefore, the Maclaurin series is:

cos x = 1 – (x2 / 2!) + (x4 / 4!) – (x6 / 6!) + ... ± [x2n–2 / (2n – 2)!] ± [x2n / (2n)!] ... .

To find the interval of convergence use the ratio test. Set up the ratio

(un+1 / un),

obtaining:

[x2n / (2n)!] × [(2n – 2)! / x2n–2] = [x2n / {(2n)(2n – 2)!}] × [(2n – 2)! / x2n–2] = (x2 / 2n).

Now, one finds

limn→∞ |(x2 / 2n)| = |0| = 0.

By the ratio test it is known that, if

limn→∞ |(un+1 / un)| < 1

the series converges.

Since 0 is always less than 1, the series converges for all values of x.

PROBLEM 03 – 0069: Using the Maclaurin expansion, find the value of the sine

function to eight significant figures.

Solution: The remainder term in the Maclaurin expansion of the sine function

sin x = x – (x3 / 3!) + (x5 / 5!) + ... , 0<θ<1

is

[{(– 1)p x2p+1 cos θx} / (2p + 1)!],

where p is the number of terms required. If one desires to compute the sine function to eight significant

figures, the relative error would be less than [1 / (2 × 108)]. Then, note that cosθx < 1, so it is required that

[1 / (sin x)] [x2p+1 / (2p + 1)!] < [1 / (2 × 108)]

The value of p, the number of terms required, depends on how large a value of x must be accommodated.

Because of the periodicity of the sine function, it is certainly sufficient to consider only x < 2π. Also, since

sin(π + x) = – sin x, values of x between π and 2π can be replaced by values between 0 and π. Further,

since sin(π – x) = sin x, angles between (π / 2) and π can be replaced by angles between 0 and (π / 2). For

values of x between 0 and (π / 2), the sin x in the denominator is no problem, since 1 ≤ [x / (sin x)] ≤ (π / 2)

for x in this range. The largest value of the left-hand side of the inequality occurs when x = (π / 2). For this

value of x, the inequality is satisfied by 2p + 1 = 15, or p = 7. Thus the value of the sine function will be

given in eight significant figures by the expression

sin x = x – (x3 / 3!) + (x5 / 5!) – (x7 / 7!) + (x9 / 9!) – (x11 / 11!) + (x13 / 13!)

U = X ∗Y

– A(3)} ∗ U – A(1)] ∗ X

PROBLEM 03 – 0070: Compute ln 2 correct to five decimal places by putting

x = (1/2) in the power series

ln [1 / (1 – x)] = x + (x2 / 2) + (x3 / 3) + (x4 / 4) + ...

+ (xn / n) +...,

R = 1.

f(n+1)(x) = [n! / (1 – x)n+1].

Hence, the expression

E(x) = [(x – x0)n+1 / (n + 1)!] f(n+1)(X),

for the error term becomes (x0 is again 0),

[xn+1 / (n + 1)!] [n! / (1 – X)n+1] = [xn+1 / {(n + 1)(1 – X)n+1}].

We have x = (1/2); further, X must be chosen between 0 and (1/2) so that the magnitude of the error is as

large as possible, hence X is also (1/2).

The error term consequently reduces to [1 / (n + 1)] and the first n for which [1 / (n + 1)] < 0.000005 must be

found. The first n is n = 200,000.

On the other hand,

En = ln [1 / (1 – x)] – n∑i=1 (xi / i)

= [xn+1 / (n + 1)] + [xn+2 / (n + 2)] + ... < [xn+1 / (n + 1)]

+ [xn+2 / (n + 1)] + ...

= [xn+1 / (n + 1)] (1 + x + x2 + ...)

= [xn+1 / (n + 1)] [1 / (1 – x)].

If x = (1/2), the last expression becomes [1 / {2n(n + 1)}] and the first n for which this is less than 0.000005

must now be found. This time, n = 14. This estimate of the error gives a far better result in the shape of a

much

smaller n than the previous one. Taking the first fourteen terms of the series, one finds ln 2 = 0.693143+. If

the error committed in neglecting terms from (x15 / 15) onward is 0.000002–, or less, the value of ln 2 is

0.69314; but if the error is between 0.000002 and 0.000005, then ln 2 is 0.69315. Then one or two terms

more of the series must be computed. Since (1 / 15) ∙ 215 = 0.00002+,

ln 2 = 0.69315, correct to five decimal places.

PROBLEM 03 – 0071:

Compute sin 40° correct to five significant figures from

sin x = x – (x / 3!) + (x5 / 5!) ± ... + (– 1)n [x2n+1 / (2n + 1)!] + ...,

3

R = ∞. (1)

Solution: Equation (1) yields 40° = (2π / 9) = 0.6981317 rad. Thus, using

The first positive integral n for which this inequality holds (obtained by trial and error) is n = 7. Hence, a

polynomial of max-degree 7 is necessary to attain the desired precision. On the other hand, using

whence n = 4. That is, the first four terms of (1) are sufficient to yield sin 40° correct to five significant

figures; a result equivalent to the preceding one. We have

sin 40° 0.6981317 – [(0.6981317)3 / 3!] + [(0.6981317)5 / 5!] – [(0.6981317)7 / 7!] = 0.6427875,

sin 40° = 0.64279.

f(x) = (x3 + 1)–(1/2).

f(x) = (x3 + 1)–(1/2)

= x–(3/2) (1 + x–3)–(1/2)

= x–1.5 (1 – 0.5 ∙ x–3 + [(0.5 ∙ 1.5) / 2]x–6 – ...)

= x–1.5 – (1/2)x–4.5 + (3/8)x–7.5 – ...

Differentiate three times:

f"'(x) = – x–4.5 ([105 / 8] – [1,287 / 16]x–3 + ...).

For x = 10 the second term is less than 1 percent of the first; the terms after the second decrease quickly

and are negligible. One can show that the magnitude of each term is less than 8 ∙ x–3 of the previous term.

Hence one gets f"'(10) = – 4.14 ∙ 10–4 to the desired accuracy.

y" = x + y2 which passes through the point (0, 1).

y = a0 + a1x + a2x2 + ... + anxn + ... ,

which converges in some interval about x0 = 0. Then

y' = a1 + 2a2x + ... + nanxn+1 + (n + 1) an+1xn + ... ,

and

y2 = a02 + (a0a1 + a1a0)x + (a0a2 + a1a1 + a2a0)x2

+ ... + (a0an + a1an–1 + ana0)xn + ... .

Therefore,

a1 + 2a2x + ... + (n + 1) an+1xn + ... = a02 + (a0a1 + a1a0 + 1)x

+ (a0a2 + a1a1 + a2a0)x2

+ ... + (a0an + ... + ana0)xn + ... .

Equate the coefficients of like powers of x and obtain the following system of simultaneous equations:

a1 = a02

2a2 = 2a0a1 + 1

3a3 = 2a0a2 + a12

..................................................................

(n + 1)an+1 = a0an + a1an–1 + ... + ana0.

Consequently,

a1 = a02

a2 = a03 + (1/2)

a3 = a04 + (1/3)a0

a4 = a05 + (5 / 12)a02

a5 = a06 + (1/2)a03 + (1 / 20)

...........................................................

Substituting these values for a1, a2, a3, ..., into the power series for y gives a general solution of the

differential equation. This might have been anticipated because no use has been made as yet of the point

(0, 1) through which the graph of the solution must pass. If this point is used, one finds, since x = 0, and y =

a0 = 1, that a1 = 1, a2 = (3/2), a3 = (4/3),

a4 = (17 / 12), a5 = (31 / 20), ... Consequently,

y = 1 + x + (3/2)x2 + (4/3)x3 + (17 / 12)x4 + (31 / 20)x5 + ... .

One can obtain the same result somewhat differently, and perhaps more briefly, recalling that

n! an = y0(n) = y(n)(x0) = [{dnf(x0)} / dxn],

where, to repeat, [{dnf(x0)} / dxn] is the nth derivative of f(x) evaluated at x0.

By successive differentiation (with respect to x) of the given differential

equation, y' = x + y2,

y" = 1 + 2yy',

y(3) = 2(yy" + y'2),

y(4) = 2(yy(3) + 3y'y"),

y(5) = 2(yy(4) + 4y'y(3) + 3y"2),

................................................................ ;

whence, at

(0, 1), y' = 1

and

y" = 3, y(3) = 8, y(4) = 34, y(5) = 186, ... .

Hence,

a0 = 1, a1 = 1, a2 = (3/2), a3 = (4/3), a4 = (17 / 12), a5 = (31 / 20) ... ,

as before.

n

∑n=1 (xn / n2).

Then determine if the convergence is uniform for

– R ≤ x ≤ R.

so that R = 1. That is, the series converges absolutely for – 1 < x < 1 and diverges for |x| > 1. Note that this

result could have also been found by the relation,

R = (1 / α)

where

α = limn→∞ sup n√(|an|)

where

an = (1 / n2)

which yields

α = limn→∞ sup n√[|(1 / n2)|]

= limn→∞ sup n√(1 / n2)

= limn→∞ sup (1 / n(2/n))

= limn→∞ sup [1 / (e(2/n) log n)]

= [1 / {e limn→∞ sup (2/n) log n}]

= (1 / e0)

= 1,

therefore,

(1 / R) = 1,

so that as before R = 1. For x = ±1 the series converges by comparison with the harmonic series of order 2,

that is

|[(±1)n / n2]| ≤ (1 / n2)

Hence the series converges for – 1 ≤ x ≤ 1. To determine if the convergence is uniform in this interval, the

Weierstrass M-test for uniform convergence is needed. That is, one must find a convergent series of

constants

∞

∑n=1 Mn

such that

|(xn / n2)| ≤ Mn

for all x in – 1 ≤ x ≤ 1 if one is to determine that

∞

∑n=1 (xn / n2)

is uniformly convergent in this interval.

Since – 1 ≤ x ≤ 1,

|(xn / n2)| ≤ (1 / n2)

for all x in the range.

Therefore since

∞

∑n=1 Mn

converges, this shows the given power series converges uniformly on – 1 ≤ x ≤ 1.

a) tan x

b) [(sin x) / (sin 2x)] (x ≠ 0)

Solution: To find the power series of the given functions, one needs the following theorem:

Given the two power series

∞

∑n=0 anxn = a0 + a1x + a2x2 + ... + anxn + ...

and

∞

∑n=0 bnxn = b0 + b1x + b2x2 + ... + bnxn + ... ,

where b0 ≠ 0, and where both of the series are convergent in some interval |x| < R, let f be a function

defined by

f(x) = [(a0 + a1x + a2x2 + ... + anxn + ...) / (b0 + b1x + b2x2 + ... + bnxn + ...)].

Then for sufficiently small values of x the function f can be represented by the power series

f(x) = c0 + c1x + c2x2 + ... + cnxn + ... ,

where the coefficients c0, c1, c2, ... , cn, ... are found by long division or equivalently by solving the

following relations successively for each ci (i = 0 to ∞):

b0 c0 = a0

b0 c1 + b1 c0 = a1

∙

∙

∙

b0 cn + b1cn–1 + ... + bnc0 = an

∙

∙

∙

a) To find the power series expansion of tan x, Taylor's series for sin x and cos x are needed. That is

sin x = x – (x3 / 3!) + (x5 / 5!) – ...

and

cos x = 1 – (x2 / 2!) + (x4 / 4!) – ...

Then,

tan x = [(sin x) / (cos x)]

= [{x – (x3 / 3!) + (x5 / 5!) – ...} / {1 – (x2 / 2!) + (x4 / 4!) – ...}] (1)

Therefore, by the theorem the power series expansion of tan x can be found by dividing the numerator by

the denominator on the right side of (1). Hence, using long division one gets

x + (1/3)x3 + (2 / 15)x5 + ...

1 – (1/2)x2 + (1 / 24)x4 – ... √x – (1/6)x3 + (1 / 120)x5 – ...

x – (1/2)x3 + (1 / 24)x5 – ...

(1/3)x3 – (1 / 30)x5 + ...

(1/3)x3 – (1 / 6)x5 + ...

(2 / 15)x5 – ...

(2 / 15)x5 – ...

Thus

tan x = x + (1/3)x3 + (2 / 15)x5 + ...

b) Since

sin x = x – (x3 / 3!) + (x5 / 5!) – ...

one gets

sin (2x) = 2x – [(2x)3 / 3!] + [(2x)5 / 5!] – ...

so that

[(sin x) / (sin 2x)] = [{x – (x3 / 3!) + (x5 / 5!) – ...}

/ {2x – [(2x)3 / 3!] + [(2x)5 / 5!] – ...}]. (2)

Now multiplying the numerator and denominator on the right side of (2) by (1/x) yields

[(sin x) / (sin 2x)] = [{1 – (x2 / 6) + (x4 / 120) – ...}

/ {2 – (4/3)x2 + (4 / 15)x4 – ...}].

Now by long division

(1/2) + (1/4)x2 + (5 / 48)x4 + ...

2 – (4/3)x2 + (4 / 15)x4 √ 1 – (1/6)x2 + (1 / 120)x4 – ...

(1/2) – (4/6)x2 + (4 / 30)x4 – ...

(1/2)x2 – (15 / 120)x4 + ...

(1/2)x2 – (1 / 3)x4 + ...

(25 / 120)x4 – ...

(25 / 120)x4 – ...

Thus, for x ≠ 0 ,

[(sin x) / (sin 2x)] = (1/2) + (1/4)x2 + (5 / 48)x4 + ... .

Solution: To do this problem make use of Taylor's formula. That is if

exist and are continuous in the interval a ≤ x ≤ b and if f(n+1)(x) exists in the interval a < x < b, then

f(x) = f(a) + f'(a)(x – a) + [{f"(a)(x – a)2} / 2!] + ... + [{f(n)(a)(x – a)n} / n!] + Rn (1)

where a < ξ < x. Note that as n changes, ξ also changes in general. In addition, if for all x and ξ in [a, b],

limn→∞ Rn = 0,

f(x) = f(a) + f'(a)(x – a) + [{f"(a)} / 2!](x – a)2 + [{f"'(a)(x – a)3} / 3!] + ... (2)

For the given problem f(x) = ex, so that f(n)(x) = ex for all orders n. Now taking a = 0 in (1),

f(x) = f(0) + f'(0)x + [{f"(0)x2} / 2!] + ... + [{f(n)(0)xn} / n!] + Rn

which yields upon substitution of f(x) = ex,

ex = 1 + x + (x2 / 2!) + ... + (xn / n!) + Rn

where

Rn = [{f(n+1)(ξ)} / (n + 1)!] xn+1 = [eξ / (n + 1)!] xn+1

where

0 < ξ < x.

Now we must prove that

limn→∞ Rn = 0,

so that we will have the Taylor expansion for ex. However,

– |x| < ξ < |x|

and

0 < eξ < e|x|;

hence

Thus by (3) to prove

limn→∞ Rn = 0,

To do this choose an integer N such that N ≥ 2 |x|. Then if n > N,

Therefore

[(|x|n) / n!] ≤ [(|x|N) / N!] (1/2)n–N. (4)

Now keeping N fixed we have

limn→∞ (1/2)n–N = 0.

Therefore by (4)

limn→∞ [(|x|n) / n!] = 0.

This means that the series representation

ex = 1 + x + (x2 / 2!) + ... + (xn / n!) + ...

is valid for all values of x.

[{d(sin x)} / dx] = cos x

and

[{d(cos x)} / dx] = – sin x.

b) Then show that sin a cos b + cos a sin b = sin (a + b) and

cos a cos b – sin a sin b = cos (a + b).

Solution: The power series expansion for sin x and cos x are for all values x

sin x = x – (x3 / 3!) + (x5 / 5!) – (x7 / 7!) + ... (1)

Since, by theorem, a power series can be differentiated term-by-term in any interval lying entirely within its

radius of convergence, one gets by (1) and (2), for all values x

[{d(sin x)} / dx] = 1 – (3x2 / 3!) + (5x4 / 5!) – (7x6 / 7!) + ...

= 1 – (x2 / 2!) + (x4 / 4!) – (x6 / 6!) + ... = cos x.

and

[{d(cos x)} / dx] = [(– 2x) / 2!] + (4x3 / 4!) – (6x5 / 5!) + ...

= – x + (x3 / 3!) – (x5 / 5!) + ... = – sin x

b) Using the result from part (a)

{sin x cos(h – x) + cos x sin(h – x)}'

= cos x cos(h – x) + sin x sin(h – x) – sin x sin(h – x) – cos x cos(h – x) = 0

Thus, sin x cos(h – x) + cos x sin(h – x) is constant and this constant equals the value sin h (this was found by

letting x = 0).

Hence

Now replacing x by a and h by a + b yields

sin a cos b + cos a sin b = sin(a + b).

From which differentiation with respect to a yields

cos a cos b – sin a sin b = cos(a + b).

Note that from this result one obtains

cos2x – sin2x = cos 2x.

PROBLEM 03 – 0078: Write the function f(z) = [z / (ez – 1)] in terms of a power series.

f(z) = [z / (ez – 1)] (1)

is analytic everywhere except at those points where ez – 1 vanishes and z does not, i.e., except at the points

±2πi, ±4πi, .... Substituting the power series

ex – 1 = (z / 1!) + (z2 / 2!) + ... + (zn / n!) +... .

into (1), and dividing both numerator and denominator by z, obtain

f(z) = [1 / {1 + (z / 2!) + ... + [zn / (n + 1)!] +...}]

(note that this expression is meaningful for z = 0) . The series in the denominator converges for all z and

does not vanish for z = 0, but otherwise has the same zeros as ez – 1. The two zeros closest to the origin are

at ±2πi, and therefore f(z) has a power series representation of the form

∞

∑n=0 cn(z – zn)n

for z0 = 0

on the disk K: |z| < 2π. To find this representation, use the technique of division of power series. That is,

given an expression of the form

f(z) = [{g(z)} / {h(z)}]

= ∞∑n=0 cn zn,

the coefficient cn may be found by using the coefficients c0, c1, c2, ..., cn–1 in formula

cn = [(an – c0bn – c1bn–1 – ... – cn–1b1) / b0] (2)

In the present case, the coefficients an, bn appearing in this problem are just

a0 = 1, an = 0 (n = 1, 2,...),

bn = [1 / (n + 1)!] (n = 0, 1, 2,...).

Therefore the first of the equations (2) gives

c0 = 1,

and the rest reduce to the recurrence relation

c0 [1 / (n + 1)!] + c1(1 / n!) + ... + cn = 0 (n = 1, 2,...), (3)

relating cn to the values eg, c0, c1, ... , cn–1.

The numbers cnn! are called the Bernoulli numbers and are denoted by Bn. To calculate Bn, use the

recurrence relation (3), which now takes the form

B0 [1 / {0!(n + 1)!}] + B1 [1 / (1! n!)] + ... + Bn [1 / (n! 1!)] = 0

(n = 1,2,...) (4)

(obviously B0 = c00! = 1) . Multiplying (4) by (n + 1)! and introducing the notation

[(n + 1)! / {k!(n + 1 – k)!}] = (n+1 k)

(the familiar binomial coefficient), find that

B0(n+1 0) + B1(n+1 1) + ... + Bn(n+1 n) = 0

(n = 1,2,...). (5)

Equation (5) can be written symbolically as

(1 + B)n+1 – Bn+1 = 0, (6)

where after raising 1 + B to the (n + 1)th power, every Bk is changed to Bk (k = 1, 2,...,n + 1). Using (6) and the

fact that B0 = 1, deduce step by step that

B0 + 2B1 = 0,

B0 + 3B1 + 3B2 = 0,

B0 + 4B1 + 6B2 + 4B3 = 0,

B0 + 5B1 + 10B2 + 10B3 + 5B4 = 0,

B1 = – (1/2)B0 = – (1/2),

B2 = – (1/3)B0 – B1 = (1/6),

B3 = – (1/4)B0 – B1 – (3/2)B2 = 0,

B4 = – (1/5)B0 – B1 – 2B2 – 2B3 = – (1 / 30)

B0 + 6B1 + 15B2 + 20B3 + 15B4 + 6B5 = 0,

B5 = – (1/6)B0 – B1 – (5/2)B2 – (10 / 3)B3 – (5/2)B4 = 0,

B0 + 7B1 + 21B2 + 35B3 + 35B4 + 21B5 + 7B6 = 0,

B6 = – (1/7)B0 – B1 – 3B2 – 5B3 – 5B4 – 3B5 = (1 / 42),

Collecting these results,

B0 = 1, B1 = – (1/2), B2 = (1/6) , B3 = 0, B4 = – (1 / 30), B5 = 0,

B6 = (1 / 42),... (7)

As suggested by (7), the Bernoulli numbers with odd indices greater than 1 vanish. To see this, write

f(z) = [z / (ez – 1)] = c0 + c1z + c2z2 + c3z3 + ... + cnzn + ... (8)

= B0 + (B1 / 1!)z + (B2 / 2!)z + (B3 / 3!)z + ... + (Bn / n!)zn + ...

2 3

f(– z) = [(– z) / (e–z – 1)] = [(– zez) / {(e–z – 1)ez}] = [zez / (ez – 1)] (9)

= B0 – (B1 / 1!)z + (B2 / 2!)z2 – (B3 / 3!)z3 + ... + (Bn / n!)(– 1)nzn

Subtraction of (9) from (8) gives

f(z) – f(– z) = [z / (ez – 1)] – [zez / (ez – 1)] = – z

= 2(B1 / 1!)z + 2(B3 / 3!)z3 + ... + 2[B2m+1 / (2m + 1)!]z2m+1 + ...,

and hence, by the uniqueness of power series expansions,

2B1 = – 1, B3 = B5 = ... = B2m+1 = ... = 0,

as asserted. Using this fact, write the expansion (8) in the form

f(z) = [z / (ez – 1)] = 1 – (z/2) + ∞∑m=1 [B2m / (2m)!] z2m, (10)

where the series converges on the disk K: |z| < 2π. Moreover, the fact that f(z) becomes infinite for z = ±2πi

implies that K is the largest disk on which (10) converges.

PROBLEM 03 – 0079: Obtain series expansions of the function

f(z) = [(– 1) / {(z – 1)(z – 2)}]

about the point z0 = 0 in the regions

|z| < 1, 1 < |z| < 2, |z| > 2.

Solution: The function f has singular points, or points at which it is not analytic, at z 1 = 1, z2 = 2.

Taylor's theorem tells that f has a valid Taylor series expansion about z0 = 0 within the circle |z| = 1 since it is

analytic there. Hence

f(z) = ∞∑n=0 [{f(n)(z0)} / n!] (z – z0)n |z| < 1.

It is convenient, however, to use partial fractions to write

f(z) = [(– 1) / {(z – 1)(z – 2)}]

= [1 / (z – 1)] – [1 / (z – 2)]

= (1/2) [1 / (1 – z/2)] – [1 / (1 – z)]. (1)

For |z| < 1, we have |(z/2)| < 1 and note that these are the conditions under which each of the terms in (1)

can be represented by a geometric series. Hence, since

[1 / (1 – w)] = ∞∑n=0 wn, (|w| < 1),

f(z) = (1/2) ∞∑n=0 (z/2)n – ∞∑n=0 zn

= ∞∑n=0 [(1/2)(z/2)n – zn].

or

f(z) = ∞∑n=0 (2–n–1 – 1)zn (|z| < 1). (2)

Since the Taylor coefficients are unique this must be the Taylor expansion for f(z) in the region |z| < 1 and as

a by-product of the calculations, it has been deduced that

[{f(n)(0)} / n!]

must be equal to the coefficients in (2). Thus

f(n)(0) = n! (2–n–1 – 1).

In the region 1 < |z| < 2 one must use Laurent' s theorem which states that if f is analytic in the region

bounded by two concentric circles C1 and C2 centered at z0. then at each point z in that region f(z) is

represented by its Laurent series expansion

f(z) = ∞∑n=0 an (z – z0)n + ∞∑n=1 [bn / (z – z0)n] (3)

where the radius of C1 is assumed to be less than that of C2, and

an = (1 / 2πi) ∫(C)1 [{f(s)ds} / (s – z0)n+1] n = 0, 1, 2, ... (4)

bn = (1 / 2πi) ∫(C)2 [{f(s)ds} / (s – z0)–n+1] n = 1, 2, ... (5)

Formulas (4) and (5) are not very useful since the integrals are usually difficult to do. Therefore, note that in

the region 1 < |z| < 2, |(1/z)| < 1 and |(z/2)| < 1 so that one may use the geometric series expansion to write

f(z) = (1/z) [1 / (1 – 1/z)] + (1/2) [1 / (1 – z/2)]

= ∞∑n=0 (1 / zn+1) + (1/2) ∞∑n=0 (zn / 2n+1), (6)

which is valid for 1 < |z| < 2. Now the Laurent coefficients in (4) and (5) are unique, i.e., any representation

such as that in (6) must be the Laurent expansion regardless of how it is arrived at. Therefore, the

coefficients in the expansion must be equal to those of (4) and (5) so that as a by-product of the

calculations we have been able to find the values of the integrals there. E.g., for n = 1 in (5) see from (6) that

b1 = 1 so

∫|z|=2 f(z)dz = 2πi

Finally, for |z| > 2 one may write

f(z) = (1/z) [{1 / (1 – 1/z)} – {1 / (1 – 2/z)}] (7)

and |(1/z)| < 1 as well as |(2/z)| < 1 in the region |z| > 2.

Therefore, one may use the geometric series in this region to find from (7) that

f(z) = ∞∑n=0 (1 / zn+1) – ∞∑n=0 (2n / zn+1)

or

f(z) = ∞∑n=0 [(1 – 2n) / zn+1] (8)

Again, since the Laurent coefficients are unique it is deduced that the coefficients in (8) are the Laurent

coefficients. In particular since b1 = 0 in

(8) one finds from (4) that

∫C f(s)ds = 0,

where in this case, C can be any circle centered at z0 = 0 with radius r > 2.

PROBLEM 03 – 0080: Find the first three nonzero terms in the Laurent series

expansion of csc z about z0 = 0.

Solution: The Laurent series expansion of a function f(z) which is analytic in an annulus

bounded by the concentric circles C1 and C2 with centers at z0 is given by

f(z) = ∞∑n=0 an (z – z0)n + ∞∑n=1 [bn / (z – z0)n] (1)

where

an = (1 / 2πi) ∫(C)1 [{f(s)ds} / (s – z0)n+1] (n = 0, 1, 2, ...) (2)

bn = (1 / 2πi) ∫(C)2 [{f(s)ds} / (s – z0)–n+1] (n = 1, 2, ...) . (3)

Here the radius of C2 is assumed to be larger than the radius of C1. The integrals in (2) and (3) are usually

difficult and sometimes impossible to do by elementary methods so that a simpler way to proceed is to find

the Taylor series expansion for sin z and use long division to obtain a series representing csc z = (1 / sin z).

Then observing that the Laurent expansion is unique, it can be stated that the resulting series is indeed the

Laurent series.

Now since sin z is analytic Vz with |z| < ∞ its Taylor expansion can be written as

sin z = ∞∑n=0 [{f(n)(0)zn} / n!] = ∞∑n=1 [(– 1)n–1 / (2n – 1)!] z2n–1 (4)

Now sin z = 0 only for z = nπ, n = 0, ±1, ±2, ... .

Hence,

csc z = (1 / sin z)

is analytic in the annulus 0 < |z| < π can be represented by a series expansion there. Thus

csc z = (1 / sin z)

= [1 / {z – (z3 / 3!) + (z5 / 5!) – (z7 / 7!) + ...}]. (5)

Thus, by long division

z – (z3 / 3!) + (z5 / 5!) – (z7 / 7!) + ... [{(1/z) + (1/3!) + z3} / {3! ∙ 3! – 5!}] + …

√1 + 0 + 0 + 0 + 0

– [1 – (z / 3!) + (z / 5!) – (z7 / 7!) + ...]

3 5

– [(z2/3!) – {z4/(3! ∙ 3!)} + {z6/(3! ∙ 5!)+ ...]

[z4 / {3! ∙ 3! – 5!}] – … .

Therefore, from equation (5),

csc z = (1/z) + (1 / 3!)z + [{1 / (3! ∙ 3!)} – (1 / 5!)]z3 + ...

or

csc z = (1/z) + (1/6)z + (7 / 360)z3 + ... (0 < |z| < π). (6)

As noted before, this must be the Laurent series for csc z since that series is unique. As a by-product of this

calculation, one may therefore equate the coefficients in (6) with the corresponding Laurent coefficients

given by (2) and (3) with C2 being the circle |z| = π, and C1 the degenerate circle |z| = 0. This provides a

useful way to calculate the integrals of equations (2) and (3).

PROBLEM 03 – 0081:

Find the principal part of the function f(z) = [(ez cos z) / z3] at its singular point and determine the type of

singular point it is.

Solution: The given function has an isolated singular point at z = 0. Such a function may be

represented by a Laurent series

f(z) = ∞∑n=0 an (z – z0)n + ∞∑n=1 [bn / (z – z0)n] (1)

where an and bn are the Laurent coefficients. Since a Laurent expansion is unique, one needn't calculate

these coefficients directly but may proceed as follows, noting that any series of the form (1) that are

obtained must be the Laurent series. First, expanding the numerator of f in a Taylor series about z0 = 0

yields

ez cos z = [1 + z + (z2 / 2!) + ...] [1 – (z2 / 2!) + ...]

= 1 + z – (z3 / 3!) + ... . |z| < ∞ (2)

This is valid by Taylor's theorem since ez cos z is analytic for all z. Hence

[(ez cos z) / z3] = (1 / z3) + (1 / z2) – (1 / 3) + ... 0 < |z| < ∞. (3)

As noted above, this must be the Laurent series for f(z). The principal part of a function f at a point z0 is

defined as the portion of its Laurent series involving negative powers of z – z0. Hence, the principal part of

f(z) = [(ez cos z) / z3]

at 0, its only singular point, can be seen from (3) to be

(1 / z3) + (1 / z2).

If the principal part of f at z0 contains at least one non-zero term but the number of such terms is finite, the

isolated singular point z0 is then called a pole of order m, where m is the largest of the powers

[bj / (z – z0)j]

in the principal part of f. Hence, in this case, it is said that f has a pole of order 3 at z0 = 0.

Solution: Since the singular point z0 = – 1 is isolated, the function has a Laurent series

expansion about – 1 which is valid at every point except – 1 in the circular domain centered at – 1 with

radius r equal to the distance between – 1 and the next closest singularity, i.e.,

r = |2(1/3)|.

To find this expansion first expand

f1(z) = [z / (z3 + 2)]

in a Taylor series about z0 = – 1 to obtain

f1(z) = ∞∑n=0 [{f(0)(z0)} / n!] (z – z0)n. (1)

Now

f1(z) = [z / (z3 + 2)]

so

f1(1)(z) = [(2 – 2z3) / (z3 + 2)2]

and

f1(2) = [{6(z5 – 4z2)} / (z3 + 2)3]

etc. Using these results in (1) one finds

[z / (z3 + 2)] = – 1 + 4(z + 1) – 15(z + 1)2 + ... . (2)

f(z)[z / {(z + 1)2 (z3 + 2)}] = [(– 1) / (z + 1)2] + [4 / (z + 1)] – 15 + ... . (3)

The principal part of a function at a point z0 is defined as that part of its Laurent series involving negative

powers of (z – z0). Since (3) is the Laurent series representing

f(z) = [z / {(z + 1)2 (z3 + 2)}]

for 0 < |z + 1| < |2(1/3)|, it is concluded that the principal part of f at z0 = – 1 is

[(– 1) / (z + 1)2] + [4 / (z + 1)].

If the principal part of f at z0 contains at least one nonzero term but the number of such terms is

finite, then the isolated singular point z0 is called a pole of order m where m is the largest of

[bj / (z – z0)j]

in the principal part of f. Hence, in this case,

f(z) = [z / {(z + 1)2 (z3 + 2)}]

is said to have a pole of order 2 at z0 = – 1.

## Molto più che documenti.

Scopri tutto ciò che Scribd ha da offrire, inclusi libri e audiolibri dei maggiori editori.

Annulla in qualsiasi momento.