Chapter 1

Numerical methods

Colombia

March 9, 2017

www.hdspgroup.com

henarfu@uis.edu.co

LP 304

Outline

1 Introduction

2 Binary numbers

3 Error Analysis

Introduction: numerical methods applications

(a) Model the probable evolution of a pathology

(b) Model and simulate the growth of a tumor

Contents

1 Introduction

2 Binary numbers

Base 2 numbers

Base 2 representation of the integer N

Sequences and Series

Binary Fractions

Binary shifting

Scientific Notation

Machine Numbers

3 Error Analysis

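Base 2 numbers

Let $N$ denote a positive integer; then digits $a_0, a_1, \ldots, a_k$ exist so that $N$ has the base 10 expansion

Base 10 expansion

$$N = (a_k \times 10^k) + (a_{k-1} \times 10^{k-1}) + \cdots + (a_1 \times 10^1) + (a_0 \times 10^0). \tag{1}$$

Powers of 2 can be used in the same way. For example,

$$(1 \times 2^4) + (1 \times 2^3) + (0 \times 2^2) + (1 \times 2^1) + (1 \times 2^0) = 16 + 8 + 0 + 2 + 1 = 27,$$

so that $27 = 11011_{\text{two}}$.

In general, $N$ has the base 2 expansion

Base 2 expansion

$$N = (b_J \times 2^J) + (b_{J-1} \times 2^{J-1}) + \cdots + (b_1 \times 2^1) + (b_0 \times 2^0), \tag{2}$$

where each digit $b_j$ is 0 or 1, and is written in base 2 notation as

$$N = b_J b_{J-1} \cdots b_2 b_1 b_{0\,\text{two}}. \tag{3}$$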

Base 2 representation of the integer N

Process: Generate sequences $Q_k$ and $R_k$ of quotients and remainders, respectively, by repeated division by 2. End the process when $Q_k = 0$, for some integer $k = J$. The remainders are the binary digits: $b_k = R_k$.

Example: $N = 1563$

k    Division   Q_k   R_k
0    1563/2     781   1
1    781/2      390   1
2    390/2      195   0
3    195/2      97    1
4    97/2       48    1
5    48/2       24    0
6    24/2       12    0
7    12/2       6     0
8    6/2        3     0
9    3/2        1     1
10   1/2        0     1

Reading the remainders from $R_{10}$ down to $R_0$:

b10 b9 b8 b7 b6 b5 b4 b3 b2 b1 b0
1   1  0  0  0  0  1  1  0  1  1

Therefore $1563 = 11000011011_{\text{two}}$.
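The repeated-division process can be sketched in a few lines of Python. This is only an illustration; the function name `to_binary_digits` is an assumption made for this example and does not come from the slides.

```python
def to_binary_digits(n):
    """Return the base 2 digits of a positive integer n, most significant first."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    remainders = []
    q = n
    while q > 0:
        # Q_k = floor(Q_{k-1} / 2) and R_k = Q_{k-1} mod 2
        q, r = divmod(q, 2)
        remainders.append(r)  # R_0 is b_0, R_1 is b_1, and so on
    # The last remainder produced is the leading digit b_J.
    return "".join(str(b) for b in reversed(remainders))


print(to_binary_digits(1563))  # 11000011011
```

The loop stops exactly when the quotient reaches 0, which mirrors the stopping rule $Q_J = 0$.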

Professor PhD Henry Arguello Fuentes Numerical methods March 9, 2017 9 / 78

Base 2 representation of the integer N

Exercise 1: Find the base 2 representation of 697.

Start by dividing the integer $N$ by 2 to calculate $Q_0$ and $R_0$:

$697/2 = 348.5 \rightarrow Q_0 = 348$ and $R_0 = 1$

Continue the process, taking $Q_k$ as the integer part of $Q_{k-1}/2$ and $R_k$ as the remainder, until finding $Q_k = 0$ for some integer $k = J$.

Solution

k   Division   Q_k   R_k
0   697/2      348   1
1   348/2      174   0
2   174/2      87    0
3   87/2       43    1
4   43/2       21    1
5   21/2       10    1
6   10/2       5     0
7   5/2        2     1
8   2/2        1     0
9   1/2        0     1

b9 b8 b7 b6 b5 b4 b3 b2 b1 b0
1  0  1  0  1  1  1  0  0  1

Therefore $697 = 1010111001_{\text{two}}$.

Sequences and Series

Rational numbers expressed in decimal form often require infinitely many digits.

For example, in $\frac{1}{3} = 0.\overline{3}$, the symbol $\overline{3}$ means that the digit 3 is repeated forever to form an infinite repeating decimal.

The number $\frac{1}{3}$ is therefore shorthand notation for the infinite series $S$:

$$S = \sum_{k=1}^{\infty} 3(10)^{-k} = \frac{1}{3}.$$

Sequences and Series

Definition 1.

The infinite series $S$

$$S = \sum_{n=0}^{\infty} c r^n = c + cr + cr^2 + \cdots + cr^n + \cdots, \tag{4}$$

is called a geometric series with ratio $r$.

The geometric series has the following properties:

If $|r| < 1$, then
$$\sum_{n=0}^{\infty} c r^n = \frac{c}{1-r}. \tag{5}$$

If $|r| \ge 1$ and $c \neq 0$, then the series diverges.

Sequences and Series

Example: Consider the series

$$S = 7\left(\frac{1}{7}\right)^1 + 7\left(\frac{1}{7}\right)^2 + \cdots = \sum_{n=1}^{\infty} 7\left(\frac{1}{7}\right)^n,$$

which is equal to

$$-7 + \sum_{n=0}^{\infty} 7\left(\frac{1}{7}\right)^n,$$

and according to (5),

$$S = -7 + \frac{7}{1 - \frac{1}{7}} = \frac{7}{6} = 1.1\overline{6}.$$

Then $\frac{7}{6}$ is the shorthand notation for the infinite series $S$.
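As a quick check of formula (5) on this example, the following Python sketch compares a partial sum with the closed form using exact rational arithmetic. The helper name `geometric_partial_sum` is an assumption introduced for this illustration.

```python
from fractions import Fraction


def geometric_partial_sum(c, r, terms):
    """Sum of c * r**n for n = 0, 1, ..., terms - 1."""
    return sum(c * r**n for n in range(terms))


c, r = Fraction(7), Fraction(1, 7)
closed_form = c / (1 - r)    # formula (5), series starting at n = 0
s = closed_form - c          # drop the n = 0 term to start at n = 1

print(closed_form, s)        # 49/6 7/6
print(float(closed_form - geometric_partial_sum(c, r, 10)))  # small remainder
```

The remainder after `terms` terms is $c\,r^{\text{terms}}/(1-r)$, which shrinks quickly because $|r| < 1$.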

Binary Fractions

Binary fractions are sums of negative powers of 2, used to express a real number $R$ that lies in the range $0 < R < 1$.

Binary fractions

$$R = (d_1 \times 2^{-1}) + (d_2 \times 2^{-2}) + \cdots + (d_n \times 2^{-n}) + \cdots, \tag{6}$$

where each digit $d_j$ is 0 or 1. Equivalently,

$$R = 0.d_1 d_2 \cdots d_n \cdots_{\text{two}}, \qquad R = \sum_{j=1}^{\infty} d_j\, 2^{-j}.$$

Binary Fractions-Decimal to binary

Process: Generate sequences $d_j$ and $F_j$ by repeatedly multiplying by two. Starting from $F_0 = R$, the digit $d_j$ is the integer part of $2F_{j-1}$ and $F_j$ is its fractional part.

Example: $R = 0.7$

j   2 F_(j-1)          d_j   F_j (frac)
1   (0.7)(2) = 1.4     1     0.4
2   (0.4)(2) = 0.8     0     0.8
3   (0.8)(2) = 1.6     1     0.6
4   (0.6)(2) = 1.2     1     0.2
5   (0.2)(2) = 0.4     0     0.4
6   (0.4)(2) = 0.8     0     0.8
7   (0.8)(2) = 1.6     1     0.6
8   (0.6)(2) = 1.2     1     0.2
9   (0.2)(2) = 0.4     0     0.4
…

d1 d2 d3 d4 d5 d6 d7 d8 d9
1  0  1  1  0  0  1  1  0  …

The digits 0110 repeat, so $0.7 = 0.1\overline{0110}_{\text{two}}$.

Binary Fractions-Decimal to binary

Exercise 2: Calculate the binary fraction for 0.6.

Start by multiplying 0.6 by 2 to generate the sequences $d_j$ and $F_j$.

Solution

j   2 F_(j-1)          d_j   F_j (frac)
1   (0.6)(2) = 1.2     1     0.2
2   (0.2)(2) = 0.4     0     0.4
3   (0.4)(2) = 0.8     0     0.8
4   (0.8)(2) = 1.6     1     0.6
5   (0.6)(2) = 1.2     1     0.2
6   (0.2)(2) = 0.4     0     0.4
7   (0.4)(2) = 0.8     0     0.8
8   (0.8)(2) = 1.6     1     0.6
9   (0.6)(2) = 1.2     1     0.2
…

d1 d2 d3 d4 d5 d6 d7 d8 d9
1  0  0  1  1  0  0  1  1  …

$0.6 = 0.\overline{1001}_{\text{two}}$
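The multiply-by-two process can be sketched in Python as follows. Exact rational arithmetic (`fractions.Fraction`) is used here so that the digits are not disturbed by binary round-off; that choice, and the function name `binary_fraction_digits`, are assumptions made for this illustration.

```python
from fractions import Fraction


def binary_fraction_digits(value, count):
    """First `count` binary digits d_1, d_2, ... of a number 0 < value < 1.

    Pass the number as a string such as "0.7" so it is converted exactly.
    """
    f = Fraction(value)
    if not 0 < f < 1:
        raise ValueError("value must lie strictly between 0 and 1")
    digits = []
    for _ in range(count):
        f *= 2
        d = int(f)   # integer part of 2 * F_{j-1}
        digits.append(d)
        f -= d       # fractional part becomes F_j
    return digits


print(binary_fraction_digits("0.7", 9))  # [1, 0, 1, 1, 0, 0, 1, 1, 0]
print(binary_fraction_digits("0.6", 9))  # [1, 0, 0, 1, 1, 0, 0, 1, 1]
```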

Binary Fractions-Binary to decimal

The base 10 rational number $R_{10}$ associated with a base 2 binary fraction $R_2$ can be found using geometric series.

Example: Find the base 10 rational number associated with $0.\overline{01}_{\text{two}}$.

In expanded form, $0.\overline{01}_{\text{two}} = 2^{-2} + 2^{-4} + 2^{-6} + \cdots$. The expression above is written as

$$\sum_{k=1}^{\infty} (2^{-2})^k = -1 + \sum_{k=0}^{\infty} (2^{-2})^k = -1 + \frac{1}{1 - \frac{1}{4}} = -1 + \frac{4}{3} = \frac{1}{3}.$$

Then $\frac{1}{3}$ is the base 10 rational number associated with $0.\overline{01}_{\text{two}}$.

Binary shifting

Let $R$ be

$$R = 0.00000\overline{11000}_{\text{two}}. \tag{7}$$

Multiplying by $2^5 = 32$ will shift the binary point 5 places to the right, and multiplying by $2^{10} = 1024$ will shift the binary point 10 places to the right:

$$32R = 0.\overline{11000}_{\text{two}}, \qquad 1024R = 11000.\overline{11000}_{\text{two}}.$$

Subtracting the first equation from the second cancels the repeating part, and we obtain $992R = 11000_{\text{two}}$. Given that $11000_{\text{two}} = 24_{10}$, we find that $992R = 24$. Therefore

$$R = \frac{24}{992} = \frac{3}{124}.$$
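The shifting argument generalizes to any repeating binary fraction. The sketch below is one possible way to code it; the function `repeating_binary_to_fraction` and its `prefix`/`block` parameters are assumptions introduced here, not part of the slides.

```python
from fractions import Fraction


def repeating_binary_to_fraction(prefix, block):
    """Rational value of 0.<prefix><block><block>... written in base 2.

    prefix: non-repeating digits after the binary point (may be "")
    block:  the digits that repeat forever (must be non-empty)
    """
    shift = len(prefix)
    period = len(block)
    head = int(prefix, 2) if prefix else 0
    full = int(prefix + block, 2)
    # 2^(shift+period) R and 2^shift R share the same repeating tail,
    # so their difference is the integer full - head.
    return Fraction(full - head, 2 ** (shift + period) - 2 ** shift)


print(repeating_binary_to_fraction("00000", "11000"))  # 3/124
print(repeating_binary_to_fraction("", "01"))          # 1/3
```

For the example above, `shift = 5` and `period = 5`, which reproduces $1024R - 32R = 992R = 24$.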

Scientific Notation

Scientific notation is a standard way to present a real number, obtained by properly shifting the decimal point and supplying the matching power of 10.

Examples

$0.0000747 = 7.47 \times 10^{-5}$

$31.4159265 = 3.14159265 \times 10$

$9{,}700{,}000{,}000 = 9.7 \times 10^{9}$

Avogadro's constant, used in chemistry: $6.02252 \times 10^{23}$.

The quantity $1\text{K} = 1.024 \times 10^{3}$, used in computer science.

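Machine Numbers

A computer does not store a real number $x$ exactly; it stores a binary approximation given by

$$x \approx \pm q \times 2^{n}. \tag{8}$$

The number $q$ is the mantissa and the integer $n$ is the exponent.

Floating-point format

The floating-point format stores a number by expressing:

The sign
The exponent
The mantissa

The sign is always one bit, where $S = 0$ if $x > 0$ and $S = 1$ if $x < 0$.

The number of bits for the exponent and the mantissa depends on the precision of the machine.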

Floating-point format: IEEE 754 standard

Precision   Total     Sign    Exponent   Mantissa   Bias
Single      32 bits   1 bit   8 bits     23 bits    127
Double      64 bits   1 bit   11 bits    52 bits    1023

The exponent must be signed to be able to represent both tiny and huge values, but two's complement is not used to store it. Instead, the exponent is biased by adjusting its value.

The exponent bias is calculated as $\text{bias} = 2^{\text{exp}-1} - 1$, where exp indicates the number of bits for the exponent.

Example:

if exp = 15 bits, then $\text{bias} = 2^{15-1} - 1 = 16383$.

Floating-point format: IEEE 754 standard

Possible cases:

Sign S   Exponent E           Mantissa M   Value
0 or 1   all 0 < E < all 1    any M        $(-1)^S (1.M)\, 2^{E-\text{bias}}$
0        E = all 1            M = 0        $+\infty$
1        E = all 1            M = 0        $-\infty$
0 or 1   E = all 1            M ≠ 0        NaN
0 or 1   E = all 0            M = 0        0
0 or 1   E = all 0            M ≠ 0        $(-1)^S (0.M)\, 2^{1-\text{bias}}$

Floating-point format

Example: Determine the floating-point format used to store the number $59.1875_{10}$ in a computer with 32 bits of precision.

1. Convert the integer part and the fractional part to binary

Integer part, $59_{10}$ (repeated division by 2):

Division   Quotient   Remainder
59/2       29         1  (LSB)
29/2       14         1
14/2       7          0
7/2        3          1
3/2        1          1
1/2        0          1  (MSB)

Fractional part, $0.1875_{10}$ (repeated multiplication by 2):

Product               Digit
0.1875 × 2 = 0.375    0  (MSB)
0.375 × 2 = 0.75      0
0.75 × 2 = 1.5        1
0.5 × 2 = 1.0         1  (LSB)

$59.1875_{10} = 111011.0011_2$

2. Do the proper binary shifting

$111011.0011_2 = 1.110110011_2 \times 2^5$

$\text{bias} = 2^{8-1} - 1 = 127$

Mantissa $= 110110011_2$

$\text{exp} = 5 + \text{bias} = 5 + 127 = 132_{10} = 10000100_2$

S   E          M
0   10000100   11011001100000000000000
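The hand computation can be cross-checked with Python's standard `struct` module, which packs a number into IEEE 754 single precision. This is a sketch for verification only; the helper name `float32_fields` is an assumption made for this example.

```python
import struct


def float32_fields(x):
    """Return the (sign, exponent, mantissa) bit strings of x as a 32-bit float."""
    # Pack as big-endian single precision, then reinterpret as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    text = f"{bits:032b}"
    return text[0], text[1:9], text[9:]


print(float32_fields(59.1875))
# ('0', '10000100', '11011001100000000000000')
```

Because 59.1875 has a finite binary expansion that fits in 23 mantissa bits, it is stored without any rounding.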

Floating-point format

Example: Determine the floating-point format used to store the number $132.28125_{10}$ in a computer with 32 bits of precision.

1. Convert the integer part and the fractional part to binary

Integer part, $132_{10}$ (repeated division by 2):

Division   Quotient   Remainder
132/2      66         0  (LSB)
66/2       33         0
33/2       16         1
16/2       8          0
8/2        4          0
4/2        2          0
2/2        1          0
1/2        0          1  (MSB)

Fractional part, $0.28125_{10}$ (repeated multiplication by 2):

Product                  Digit
0.28125 × 2 = 0.5625     0  (MSB)
0.5625 × 2 = 1.125       1
0.125 × 2 = 0.25         0
0.25 × 2 = 0.5           0
0.5 × 2 = 1.0            1  (LSB)

$132.28125_{10} = 10000100.01001_2$

2. Do the proper binary shifting

$10000100.01001_2 = 1.000010001001_2 \times 2^7$

$\text{bias} = 2^{8-1} - 1 = 127$

Mantissa $= 000010001001_2$

$\text{exp} = 7 + \text{bias} = 7 + 127 = 134_{10} = 10000110_2$

S   E          M
0   10000110   00001000100100000000000

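Floating-point format

To recover the real value from the stored bits (single precision, normalized case):

$$\text{value} = (-1)^S \left(1 + \sum_{i=1}^{23} d_{(23-i)}\, 2^{-i}\right) \times 2^{(E-127)}$$

Where,

$S$ = the sign

$E$ = the exponent

127 = the bias

$d_j$ = the bits of the mantissa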

Floating-point format

Exercise: Find the real value for the binary data:

S   E          M
0   01010010   01101000000100100000000

$$\text{value} = (-1)^S \left(1 + \sum_{i=1}^{23} d_{(23-i)}\, 2^{-i}\right) \times 2^{(E-127)}$$

In this example:

$S = 0$

$1 + \sum_{i=1}^{23} d_{(23-i)}\, 2^{-i} = 1 + 2^{-2} + 2^{-3} + 2^{-5} + 2^{-12} + 2^{-15} = 1.4065246582$

$2^{(E-127)} = 2^{((2^1 + 2^4 + 2^6) - 127)} = 2^{82-127} = 2^{-45}$

Thus

$\text{value} = 1.4065246582 \times 2^{-45}$

Floating-point format

Example: Find the real value for the binary data:

S          E           M
1          10000100    01000000000000000000000
(bit 31)   (bits 30-23)   (bits 22-0)

In this example:

$S = 1$

$1 + \sum_{i=1}^{23} d_{(23-i)}\, 2^{-i} = 1 + 2^{-2} = 1.25$

$2^{(E-127)} = 2^{(132-127)} = 2^{5}$

Thus

$\text{value} = (-1)^1 \times 1.25 \times 2^5 = -40.$
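The decoding formula can be sketched in Python for the normalized case only. The function name `decode_float32` is an assumption for this example, and the mantissa bits are indexed left to right (the first bit after the binary point has weight $2^{-1}$), which is how the worked examples read them.

```python
def decode_float32(sign, exponent, mantissa):
    """Value of a normalized single-precision number from its three bit strings."""
    s = int(sign, 2)
    e = int(exponent, 2)
    if not 0 < e < 255:
        raise ValueError("only the normalized case 0 < E < 255 is handled here")
    # 1.M: the leading 1 is implicit; the i-th mantissa bit weighs 2**-i.
    significand = 1 + sum(
        int(bit) * 2.0 ** -i for i, bit in enumerate(mantissa, start=1)
    )
    return (-1) ** s * significand * 2.0 ** (e - 127)


print(decode_float32("1", "10000100", "01000000000000000000000"))  # -40.0
```

Infinities, NaN, zero and denormalized numbers (the other rows of the "Possible cases" table) are deliberately left out of this sketch.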

Contents

1 Introduction

2 Binary numbers

3 Error Analysis

Absolute and relative error

Truncation Error

Round-off Error

Loss of Significance

Order of Approximation

Propagation of Error

Absolute and relative error

Definition 2.

Suppose that $\hat{p}$ is an approximation to $p$. The absolute error is $E_p = |p - \hat{p}|$, and the relative error is $R_p = |p - \hat{p}|/|p|$, provided that $p \neq 0$.

The absolute error is the magnitude of the difference between the true value and the approximate value.

The relative error expresses the error as a fraction of the true value (often quoted as a percentage).

Absolute and relative error

Example: Find the absolute and relative error in the following three cases:

                      Case x                 Case y                 Case z
True value p          x = 3.141592           y = 1,000,000          z = 0.000012
Approximation p̂       x̂ = 3.14               ŷ = 999,996            ẑ = 0.000009
Absolute error E_p    E_x = |x − x̂|          E_y = |y − ŷ|          E_z = |z − ẑ|
                      = 0.001592             = 4                    = 0.000003
Relative error R_p    R_x = E_x/|x|          R_y = E_y/|y|          R_z = E_z/|z|
                      = 5.067×10^-4          = 0.000004             = 0.25

Observe that as $|p|$ moves away from 1 (greater than or less than), the relative error $R_p$ is a better indicator than $E_p$ of the accuracy of the approximation.
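The two error measures can be computed exactly with rational arithmetic. In the sketch below the function name `absolute_and_relative_error` is an assumption, and the true values for the second and third cases (1,000,000 and 0.000012) are inferred from the tabulated errors rather than quoted from the slides.

```python
from fractions import Fraction


def absolute_and_relative_error(true_value, approximation):
    """Return (E_p, R_p) as exact fractions; inputs may be strings like "3.14"."""
    p = Fraction(true_value)
    p_hat = Fraction(approximation)
    if p == 0:
        raise ValueError("relative error is undefined for p = 0")
    absolute = abs(p - p_hat)
    return absolute, absolute / abs(p)


for p, p_hat in [("3.141592", "3.14"), ("1000000", "999996"), ("0.000012", "0.000009")]:
    e, r = absolute_and_relative_error(p, p_hat)
    print(p, float(e), float(r))
```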

Absolute and relative error

Definition 3.

The number $\hat{p}$ is said to approximate $p$ to $d$ significant digits if $d$ is the largest nonnegative integer for which

$$\frac{|p - \hat{p}|}{|p|} < \frac{10^{1-d}}{2}.$$

Absolute and relative error

Example:

Let $\hat{w} = 2.16$ be the approximation for $w = 2.1645$; then

$$\frac{|2.1645 - 2.16|}{|2.1645|} = 2.07900 \times 10^{-3}.$$

if $d = 0$: $2.07900 \times 10^{-3} < \frac{10^{1-0}}{2} = 5$, satisfied. However, as we need to find the largest integer $d$, we need to continue.

if $d = 1$: $2.07900 \times 10^{-3} < \frac{10^{1-1}}{2} = 0.5$, satisfied.

if $d = 2$: $2.07900 \times 10^{-3} < \frac{10^{1-2}}{2} = 0.05$, satisfied.

if $d = 3$: $2.07900 \times 10^{-3} < \frac{10^{1-3}}{2} = 0.005$, satisfied.

if $d = 4$: $2.07900 \times 10^{-3} < \frac{10^{1-4}}{2} = 0.0005$, not satisfied.

Then $\hat{w}$ approximates $w$ to 3 significant digits.
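The search for the largest $d$ can be automated. The following sketch uses exact fractions so that borderline comparisons are not affected by round-off; the function name `significant_digits` and the cap `max_digits` are assumptions added for this illustration.

```python
from fractions import Fraction


def significant_digits(true_value, approximation, max_digits=30):
    """Largest d >= 0 with |p - p_hat| / |p| < 10**(1 - d) / 2, or None if d = 0 fails."""
    p = Fraction(true_value)
    p_hat = Fraction(approximation)
    rel = abs(p - p_hat) / abs(p)
    d = None
    for candidate in range(max_digits + 1):
        if rel < Fraction(10) ** (1 - candidate) / 2:
            d = candidate   # inequality still holds, try the next d
        else:
            break           # the bound only shrinks from here on
    return d


print(significant_digits("2.1645", "2.16"))  # 3
```

The cap `max_digits` only matters when the approximation is exact, in which case the inequality holds for every $d$.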

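Absolute and relative error

Other examples:

If $x = 3.141592$ and $\hat{x} = 3.14$, then $|x - \hat{x}|/|x| = 0.000507 < 10^{-2}/2$. Therefore, $\hat{x}$ approximates $x$ to three significant digits.

If $y = 1{,}000{,}000$ and $\hat{y} = 999{,}996$, then $|y - \hat{y}|/|y| = 0.000004 < 10^{-5}/2$. Therefore, $\hat{y}$ approximates $y$ to six significant digits.

If $z = 0.000012$ and $\hat{z} = 0.000009$, then $|z - \hat{z}|/|z| = 0.25 < 10^{0}/2$. Therefore, $\hat{z}$ approximates $z$ to one significant digit.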

Truncation Error

Truncation error is the error introduced when a more complicated mathematical expression is "replaced" with a more elementary formula.

For example, the infinite Taylor series

$$e^{x^2} = 1 + x^2 + \frac{x^4}{2!} + \frac{x^6}{3!} + \frac{x^8}{4!} + \cdots + \frac{x^{2n}}{n!} + \cdots$$

might be replaced with just the first five terms $1 + x^2 + \frac{x^4}{2!} + \frac{x^6}{3!} + \frac{x^8}{4!}$.

Then a truncation error appears.

Truncation Error

Example: Given $p = \int_0^{1/2} e^{x^2}\,dx = 0.544987104184$, determine the accuracy of the approximation obtained by replacing the integrand $f(x) = e^{x^2}$ with the truncated Taylor series $P_8(x) = 1 + x^2 + \frac{x^4}{2!} + \frac{x^6}{3!} + \frac{x^8}{4!}$.

Determine $\int_0^{1/2} P_8(x)\,dx$:

$$\int_0^{1/2} \left(1 + x^2 + \frac{x^4}{2!} + \frac{x^6}{3!} + \frac{x^8}{4!}\right) dx = \left[x + \frac{x^3}{3} + \frac{x^5}{5(2!)} + \frac{x^7}{7(3!)} + \frac{x^9}{9(4!)}\right]_{x=0}^{x=1/2}$$

$$= \frac{1}{2} + \frac{1}{24} + \frac{1}{320} + \frac{1}{5376} + \frac{1}{110592} = \frac{2109491}{3870720} = 0.544986720817 = \hat{p}.$$

Since

$$\frac{|p - \hat{p}|}{|p|} = 7.03442 \times 10^{-7} < \frac{10^{1-6}}{2} = 5 \times 10^{-6},$$

the approximation $\hat{p}$ agrees with the true value to 6 significant digits.
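The termwise integration can be verified with exact rational arithmetic. In this sketch the function name `integral_of_truncated_series` is an assumption, and the reference value of $p$ is simply the 12-digit figure quoted in the example.

```python
from fractions import Fraction
from math import factorial


def integral_of_truncated_series(upper, terms):
    """Integral from 0 to `upper` of sum_{k < terms} x**(2k) / k!.

    Each term integrates to upper**(2k+1) / ((2k+1) * k!).
    """
    x = Fraction(upper)
    return sum(x ** (2 * k + 1) / ((2 * k + 1) * factorial(k)) for k in range(terms))


p_hat = integral_of_truncated_series(Fraction(1, 2), 5)
p = Fraction("0.544987104184")           # value quoted in the example
relative_error = abs(p - p_hat) / p

print(p_hat)                  # 2109491/3870720
print(float(relative_error))  # about 7.03e-07
```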

Round-off Error

The representation of real numbers in a computer is determined by the precision of the mantissa.

True values are sometimes not stored exactly by this representation. This is called round-off error.

The number actually stored in the computer may undergo chopping or rounding of the last digit.

Because the hardware works with only a limited number of digits in machine numbers, errors are introduced and propagated in successive computations.

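Chopping Off versus Rounding Off

Consider $p$ expressed in normalized decimal form:

$$p = \pm 0.d_1 d_2 d_3 \cdots d_k d_{k+1} \cdots \times 10^n, \qquad 1 \le d_1 \le 9,\; 0 \le d_j \le 9 \text{ for } j > 1.$$

If $k$ is the maximum number of decimal digits carried, then $p$ is represented by $fl_{chop}(p)$, which is given by

$$fl_{chop}(p) = \pm 0.d_1 d_2 d_3 \cdots d_k \times 10^n,$$

called the chopped floating-point representation of $p$.

The rounded floating-point representation $fl_{round}(p)$ is given by

$$fl_{round}(p) = \pm 0.d_1 d_2 d_3 \cdots d_{k-1} r_k \times 10^n,$$

where $r_k$ is obtained by rounding the number $d_k.d_{k+1} d_{k+2} \cdots$ to the nearest integer.

Example:

The real number $p = \frac{22}{7} = 3.142857142857142857\ldots$ has the following six-digit representations:

$fl_{chop}(p) = 0.314285 \times 10^1$,

$fl_{round}(p) = 0.314286 \times 10^1$.

In ordinary notation these are written 3.14285 and 3.14286, respectively.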
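Python's standard `decimal` module can imitate both behaviours for decimal digits. The sketch below is an illustration under that assumption; the helper names `fl_chop` and `fl_round` mirror the notation of the slides but are not library functions.

```python
from decimal import Context, Decimal, ROUND_DOWN, ROUND_HALF_UP


def fl_chop(p, k):
    """Keep the first k significant digits of p and discard the rest."""
    return Context(prec=k, rounding=ROUND_DOWN).plus(p)


def fl_round(p, k):
    """Round p to k significant digits (ties away from zero)."""
    return Context(prec=k, rounding=ROUND_HALF_UP).plus(p)


p = Decimal(22) / Decimal(7)   # 28 significant digits under the default context
print(fl_chop(p, 6))    # 3.14285
print(fl_round(p, 6))   # 3.14286
```

`Context.plus` applies the precision and rounding mode of that context to its argument, which is what makes it usable as a digit limiter here.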

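Loss of Significance

Consider two numbers $p$ and $q$ that are nearly equal and both carry 11 decimal digits of precision.

If the first six digits of $p$ and $q$ are the same, their difference $p - q$ contains only five decimal digits of precision. This phenomenon is called loss of significance.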

Loss of Significance

Example:

Compare the results of calculating $f(500)$ and $g(500)$ using six digits and rounding, where

$$f(x) = x\left(\sqrt{x+1} - \sqrt{x}\right) \quad \text{and} \quad g(x) = \frac{x}{\sqrt{x+1} + \sqrt{x}}.$$

For the first function,

$$f(500) = 500\left(\sqrt{501} - \sqrt{500}\right) = 500(22.3830 - 22.3607) = 500(0.0223) = 11.1500.$$

For $g(x)$,

$$g(500) = \frac{500}{\sqrt{501} + \sqrt{500}} = \frac{500}{22.3830 + 22.3607} = \frac{500}{44.7437} = 11.1748.$$

The second function, $g(x)$, is algebraically equivalent to $f(x)$, but the answer, $g(500) = 11.1748$, involves less error and is the same as that obtained by rounding the true answer $11.174755300747198\ldots$ to six digits.
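Six-digit arithmetic can be simulated with the `decimal` module by setting the context precision to 6, so every intermediate result is rounded to six significant digits. This is a sketch of that idea; the names `f`, `g`, `f_six` and `g_six` are chosen only for this illustration.

```python
from decimal import Decimal, localcontext


def f(x):
    # Subtracts two nearly equal square roots: prone to loss of significance.
    return x * ((x + 1).sqrt() - x.sqrt())


def g(x):
    # Algebraically the same value, but with an addition instead.
    return x / ((x + 1).sqrt() + x.sqrt())


with localcontext() as ctx:
    ctx.prec = 6                 # every operation rounds to six digits
    x = Decimal(500)
    f_six = f(x)
    g_six = g(x)

print(f_six, g_six)  # 11.1500 11.1748
```

The subtraction in `f` leaves only three significant digits (0.0223), and multiplying by 500 cannot bring the lost digits back.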

Loss of Significance

Example: Compare the results of calculating $f(0.01)$ and $P(0.01)$ using six digits and rounding, where

$$f(x) = \frac{e^x - 1 - x}{x^2} \quad \text{and} \quad P(x) = \frac{1}{2} + \frac{x}{6} + \frac{x^2}{24}.$$

The function $P(x)$ is the Taylor polynomial of degree $n = 2$ for $f(x)$ expanded about $x = 0$.

For the first function,

$$f(0.01) = \frac{e^{0.01} - 1 - 0.01}{(0.01)^2} = \frac{1.010050 - 1 - 0.01}{0.0001} = 0.5.$$

For the second function,

$$P(0.01) = \frac{1}{2} + \frac{0.01}{6} + \frac{0.0001}{24} = 0.5 + 0.001667 + 0.000004 = 0.501671.$$

The answer $P(0.01) = 0.501671$ contains less error and is the same as that obtained by rounding the true answer $0.5016708416805\ldots$ to six digits.
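The same comparison can be reproduced with six-digit decimal arithmetic. As before, this is a sketch that assumes the `decimal` module with precision 6 is an acceptable stand-in for "six digits and rounding"; the names `f`, `P`, `f_six` and `P_six` are illustrative.

```python
from decimal import Decimal, localcontext


def f(x):
    # e**x - 1 - x cancels almost all significant digits for small x.
    return (x.exp() - 1 - x) / (x * x)


def P(x):
    # Degree 2 Taylor polynomial of f about x = 0: no cancellation.
    return Decimal(1) / 2 + x / 6 + x * x / 24


with localcontext() as ctx:
    ctx.prec = 6
    x = Decimal("0.01")
    f_six = f(x)
    P_six = P(x)

print(f_six, P_six)  # 0.5 0.501671
```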

$O(h^n)$ Order of Approximation

For functions

Definition 4.

The function $f(h)$ is said to be big Oh of $g(h)$, denoted $f(h) = O(g(h))$, if there exist constants $C$ and $c$ such that

$$|f(h)| \le C\,|g(h)| \quad \text{whenever } h \le c.$$

The big Oh notation provides a useful way of describing the rate of growth of a function in terms of well-known elementary functions ($x^n$, $x^{1/n}$, $a^x$, $\log_a(x)$, etc.).

$O(h^n)$ Order of Approximation

For sequences

Definition 5.

Let $\{x_n\}_{n=1}^{\infty}$ and $\{y_n\}_{n=1}^{\infty}$ be two sequences. The sequence $\{x_n\}$ is said to be of order big Oh of $\{y_n\}$, denoted $x_n = O(y_n)$, if there exist constants $C$ and $N$ such that

$$|x_n| \le C\,|y_n| \quad \text{whenever } n \ge N.$$

Example:

$$\frac{n^2 - 1}{n^3} = O\!\left(\frac{1}{n}\right), \quad \text{since} \quad \frac{n^2 - 1}{n^3} \le \frac{n^2}{n^3} = \frac{1}{n} \quad \text{whenever } n \ge 1.$$

$O(h^n)$ Order of Approximation

Definition 6.

Assume that $f(h)$ is approximated by the function $p(h)$ and there exist a real constant $M > 0$ and a positive integer $n$ so that

$$\frac{|f(h) - p(h)|}{|h^n|} \le M \quad \text{for sufficiently small } h. \tag{13}$$

We say that $p(h)$ approximates $f(h)$ with order of approximation $O(h^n)$ and write

$$f(h) = p(h) + O(h^n). \tag{14}$$

When relation (13) is rewritten in the form $|f(h) - p(h)| \le M|h^n|$, we see that the notation $O(h^n)$ stands in place of the error bound $M|h^n|$.

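$O(h^n)$ Order of Approximation

Assume that $f(h) = p(h) + O(h^n)$, $g(h) = q(h) + O(h^m)$, and $r = \min(m, n)$. Then

$$f(h) + g(h) = p(h) + q(h) + O(h^r), \tag{15}$$

$$f(h)\,g(h) = p(h)\,q(h) + O(h^r), \tag{16}$$

and

$$\frac{f(h)}{g(h)} = \frac{p(h)}{q(h)} + O(h^r) \quad \text{provided that } g(h) \neq 0 \text{ and } q(h) \neq 0. \tag{17}$$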

$O(h^n)$ Order of Approximation

Assume $f \in C^{n+1}[a, b]$. If both $x_0$ and $x = x_0 + h$ lie in $[a, b]$, then

$$f(x_0 + h) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}\, h^k + O(h^{n+1}). \tag{18}$$

Additional properties:

(i) $O(h^p) + O(h^p) = O(h^p)$,

(ii) $O(h^p) + O(h^q) = O(h^r)$, where $r = \min(p, q)$, and

(iii) $O(h^p)\,O(h^q) = O(h^s)$, where $s = p + q$.

$O(h^n)$ Order of Approximation

Example:

Consider the Taylor polynomial expansions

$$e^h = 1 + h + \frac{h^2}{2!} + \frac{h^3}{3!} + O(h^4) \quad \text{and} \quad \cos(h) = 1 - \frac{h^2}{2!} + \frac{h^4}{4!} + O(h^6).$$

For the sum we have

$$e^h + \cos(h) = 1 + h + \frac{h^2}{2!} + \frac{h^3}{3!} + O(h^4) + 1 - \frac{h^2}{2!} + \frac{h^4}{4!} + O(h^6)$$

$$= 2 + h + \frac{h^3}{3!} + O(h^4) + \frac{h^4}{4!} + O(h^6).$$

Since $O(h^4) + \frac{h^4}{4!} = O(h^4)$ and $O(h^4) + O(h^6) = O(h^4)$, this reduces to

$$e^h + \cos(h) = 2 + h + \frac{h^3}{3!} + O(h^4),$$

and the order of approximation is $O(h^4)$.

For the product we have

$$e^h \cos(h) = \left(1 + h + \frac{h^2}{2!} + \frac{h^3}{3!} + O(h^4)\right)\left(1 - \frac{h^2}{2!} + \frac{h^4}{4!} + O(h^6)\right)$$

$$= \left(1 + h + \frac{h^2}{2!} + \frac{h^3}{3!}\right)\left(1 - \frac{h^2}{2!} + \frac{h^4}{4!}\right) + \left(1 + h + \frac{h^2}{2!} + \frac{h^3}{3!}\right)O(h^6) + \left(1 - \frac{h^2}{2!} + \frac{h^4}{4!}\right)O(h^4) + O(h^4)\,O(h^6)$$

$$= 1 + h - \frac{h^3}{3} - \frac{5h^4}{24} - \frac{h^5}{24} + \frac{h^6}{48} + \frac{h^7}{144} + O(h^6) + O(h^4) + O(h^4)\,O(h^6).$$

Since $O(h^4)\,O(h^6) = O(h^{10})$ and

$$-\frac{5h^4}{24} - \frac{h^5}{24} + \frac{h^6}{48} + \frac{h^7}{144} + O(h^6) + O(h^4) + O(h^{10}) = O(h^4),$$

the preceding equation is simplified to yield

$$e^h \cos(h) = 1 + h - \frac{h^3}{3} + O(h^4),$$

and the order of approximation is $O(h^4)$.
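A numerical experiment can make the $O(h^4)$ claim for the product tangible: if the error of the cubic $1 + h - h^3/3$ is divided by $h^4$, the ratio should stay bounded as $h$ shrinks. The sketch below assumes ordinary double-precision floats are accurate enough for the step sizes shown; the function name `scaled_error` is illustrative.

```python
import math


def scaled_error(h):
    """|e^h cos(h) - (1 + h - h^3/3)| divided by h^4."""
    approximation = 1 + h - h**3 / 3
    return abs(math.exp(h) * math.cos(h) - approximation) / h**4


for h in (0.2, 0.1, 0.05):
    print(h, scaled_error(h))
```

The printed ratios stay below a fixed constant $M$, which is exactly what relation (13) requires; very small values of $h$ are avoided here because floating-point cancellation would then dominate the tiny error.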

Order of Convergence of a Sequence

Convergence of a sequence

Definition 7.

Suppose that $\lim_{n \to \infty} x_n = x$ and $\{r_n\}_{n=1}^{\infty}$ is a sequence with $\lim_{n \to \infty} r_n = 0$. We say that $\{x_n\}_{n=1}^{\infty}$ converges to $x$ with the order of convergence $O(r_n)$ if there exists a constant $K \ge 0$ such that

$$\frac{|x_n - x|}{|r_n|} \le K \quad \text{for } n \text{ sufficiently large.} \tag{19}$$

This is indicated by writing $x_n = x + O(r_n)$, or $x_n \to x$ with order of convergence $O(r_n)$.

Example:

Let $x_n = \cos(n)/n^2$ and $r_n = 1/n^2$; then $\lim_{n \to \infty} x_n = 0$ with order of convergence $O(1/n^2)$. This follows from the relation

$$\frac{|\cos(n)/n^2|}{|1/n^2|} = |\cos(n)| \le 1 \quad \text{for all } n.$$

Propagation of Error

Addition: consider two numbers $p$ and $q$ (the true values) with the approximate values $\hat{p}$ and $\hat{q}$, which contain errors $\epsilon_p$ and $\epsilon_q$, respectively. Starting with $p = \hat{p} + \epsilon_p$ and $q = \hat{q} + \epsilon_q$, the sum is

$$p + q = (\hat{p} + \epsilon_p) + (\hat{q} + \epsilon_q) = (\hat{p} + \hat{q}) + (\epsilon_p + \epsilon_q). \tag{20}$$

Hence, for addition, the error in the sum is the sum of the errors in the addends:

$$\epsilon_s = \epsilon_p + \epsilon_q.$$

Propagation of Error

Multiplication: the propagation of error is more complicated. The product is

$$pq = (\hat{p} + \epsilon_p)(\hat{q} + \epsilon_q) = \hat{p}\hat{q} + \hat{p}\epsilon_q + \hat{q}\epsilon_p + \epsilon_p \epsilon_q. \tag{21}$$

Hence, if $\hat{p}$ and $\hat{q}$ are larger than 1 in absolute value, the terms $\hat{p}\epsilon_q$ and $\hat{q}\epsilon_p$ show that there is a possibility of magnification of the original errors $\epsilon_p$ and $\epsilon_q$. Insights are gained if we look at the relative error. Rearrange the terms in (21) to get

$$pq - \hat{p}\hat{q} = \hat{p}\epsilon_q + \hat{q}\epsilon_p + \epsilon_p \epsilon_q. \tag{22}$$

Suppose that $p \neq 0$ and $q \neq 0$; then we can divide (22) by $pq$ to obtain the relative error in the product $\hat{p}\hat{q}$:

$$R_{pq} = \frac{pq - \hat{p}\hat{q}}{pq} = \frac{\hat{p}\epsilon_q + \hat{q}\epsilon_p + \epsilon_p \epsilon_q}{pq} = \frac{\hat{p}\epsilon_q}{pq} + \frac{\hat{q}\epsilon_p}{pq} + \frac{\epsilon_p \epsilon_q}{pq}. \tag{23}$$

Propagation of Error

p and

p/p ≈ 1, b

q; then b

b q/q ≈ 1, and Rp Rq = (p /p)(q /q) ≈ 0 (Rp and Rq are

the relative errors in the approximations b p and b

q). Then making these

substitutions yields the simplified relationship

pq − b

pb

q

Rpq = ≈ q /q + p /p + 0 = Rq + Rp . (24)

pq

Propagation of Error

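Furthermore, suppose that $\hat{p}$ and $\hat{q}$ are good approximations for $p$ and $q$; then $\hat{p}/p \approx 1$, $\hat{q}/q \approx 1$, and $R_p R_q = (\epsilon_p/p)(\epsilon_q/q) \approx 0$ ($R_p$ and $R_q$ are the relative errors in the approximations $\hat{p}$ and $\hat{q}$). Then making these substitutions yields the simplified relationship

$$R_{pq} = \frac{pq - \hat{p}\hat{q}}{pq} \approx \frac{\epsilon_q}{q} + \frac{\epsilon_p}{p} + 0 = R_q + R_p. \tag{24}$$

This shows that the relative error in the product $pq$ is approximately the sum of the relative errors in the approximations $\hat{p}$ and $\hat{q}$.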
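The two propagation rules can be checked numerically. The following is a minimal sketch; the values of `p`, `q`, `p_hat`, and `q_hat` are illustrative assumptions chosen for this example and do not come from the slides.

```python
# True values and their approximations (illustrative numbers, not from the slides).
p, q = 3.0, 7.0
p_hat, q_hat = 2.9997, 7.0002

eps_p = p - p_hat  # absolute error in p_hat
eps_q = q - q_hat  # absolute error in q_hat

# Addition: the error in the sum equals eps_p + eps_q, as in (20).
sum_error = (p + q) - (p_hat + q_hat)

# Multiplication: exact relative error versus the estimate R_p + R_q from (24).
R_p = eps_p / p
R_q = eps_q / q
R_pq = (p * q - p_hat * q_hat) / (p * q)

print(f"sum error    = {sum_error:.6e}  (eps_p + eps_q = {eps_p + eps_q:.6e})")
print(f"R_pq (exact) = {R_pq:.6e}")
print(f"R_p + R_q    = {R_p + R_q:.6e}")
```

The two relative-error figures differ only by the neglected second-order term, whose magnitude is about $|R_p R_q|$.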

A quality that is desirable for any numerical process is that a small error

in the initial conditions will produce small changes in the final result.

An algorithm with this feature is called stable; otherwise, it is called

unstable.

Definition 8.

Suppose that $\epsilon$ represents an initial error and $\epsilon(n)$ represents the growth of the error after $n$ steps. If $|\epsilon(n)| \approx n\epsilon$, the growth of error is said to be linear. If $|\epsilon(n)| \approx K^n \epsilon$, the growth of error is called exponential. If $K > 1$, the exponential error grows without bound as $n \to \infty$, and if $0 < K < 1$, the exponential error diminishes to zero as $n \to \infty$.

Example: Show that the following three schemes can be used with infinite-precision arithmetic to recursively generate the terms in the sequence $\{1/3^n\}_{n=0}^{\infty}$.

$$r_0 = 1 \text{ and } r_n = \frac{1}{3} r_{n-1} \quad \text{for } n = 1, 2, \ldots, \tag{25}$$

$$p_0 = 1,\ p_1 = \frac{1}{3}, \text{ and } p_n = \frac{4}{3} p_{n-1} - \frac{1}{3} p_{n-2} \quad \text{for } n = 2, 3, \ldots, \tag{26}$$

$$q_0 = 1,\ q_1 = \frac{1}{3}, \text{ and } q_n = \frac{10}{3} q_{n-1} - q_{n-2} \quad \text{for } n = 2, 3, \ldots \tag{27}$$

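Formula (25) is obvious. In (26) the difference equation has the general solution $p_n = A(1/3^n) + B$. This can be verified by direct substitution:

$$\frac{4}{3} p_{n-1} - \frac{1}{3} p_{n-2} = \frac{4}{3}\left(\frac{A}{3^{n-1}} + B\right) - \frac{1}{3}\left(\frac{A}{3^{n-2}} + B\right) = \left(\frac{4}{3^n} - \frac{3}{3^n}\right) A + \left(\frac{4}{3} - \frac{1}{3}\right) B = A\frac{1}{3^n} + B = p_n.$$

Setting $A = 1$ and $B = 0$ will generate the sequence desired. In (27) the difference equation has the general solution $q_n = A(1/3^n) + B\,3^n$. This too is verified by substitution:

$$\frac{10}{3} q_{n-1} - q_{n-2} = \frac{10}{3}\left(\frac{A}{3^{n-1}} + B\,3^{n-1}\right) - \left(\frac{A}{3^{n-2}} + B\,3^{n-2}\right) = \left(\frac{10}{3^n} - \frac{9}{3^n}\right) A + (10 - 1)\,3^{n-2} B = A\frac{1}{3^n} + B\,3^n = q_n.$$

Setting $A = 1$ and $B = 0$ again yields $q_n = 1/3^n$.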
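The claim that all three schemes generate $1/3^n$ when no rounding occurs can be checked with exact rational arithmetic. The sketch below uses Python's `fractions.Fraction`. The helper name `generate` is an assumption introduced for this example, and the recurrences (26) and (27) are applied from $n = 2$, the first index at which both earlier terms exist.

```python
from fractions import Fraction


def generate(n_terms):
    """Return the sequences r, p, q from (25)-(27) in exact rational arithmetic."""
    third = Fraction(1, 3)
    r = [Fraction(1)]
    p = [Fraction(1), third]
    q = [Fraction(1), third]
    for n in range(1, n_terms):
        r.append(third * r[n - 1])  # scheme (25)
    for n in range(2, n_terms):
        p.append(Fraction(4, 3) * p[n - 1] - third * p[n - 2])  # scheme (26)
        q.append(Fraction(10, 3) * q[n - 1] - q[n - 2])  # scheme (27)
    return r, p, q


r, p, q = generate(11)
exact = [Fraction(1, 3**n) for n in range(11)]
print(r == exact, p == exact, q == exact)  # prints: True True True
```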

Example:

Generate approximations to the sequence $\{x_n\} = \{1/3^n\}$ using the schemes

$$r_0 = 0.99996 \text{ and } r_n = \frac{1}{3} r_{n-1} \quad \text{for } n = 1, 2, \ldots, \tag{28}$$

$$p_0 = 1,\ p_1 = 0.33332, \text{ and } p_n = \frac{4}{3} p_{n-1} - \frac{1}{3} p_{n-2} \quad \text{for } n = 2, 3, \ldots, \tag{29}$$

$$q_0 = 1,\ q_1 = 0.33332, \text{ and } q_n = \frac{10}{3} q_{n-1} - q_{n-2} \quad \text{for } n = 2, 3, \ldots \tag{30}$$

In (28) the initial error in $r_0$ is 0.00004, and in (29) and (30) the initial errors in $p_1$ and $q_1$ are 0.000013. Investigate the propagation of error for each scheme.

Professor PhD Henry Arguello Fuentes Numerical methods March 9, 2017 75 / 78

 n   x_n            r_n            p_n            q_n
 0   1.0000000000   0.9999600000   1.0000000000   1.0000000000
 1   0.3333333333   0.3333200000   0.3333200000   0.3333200000
 2   0.1111111111   0.1111066667   0.1110933333   0.1110666667
 3   0.0370370370   0.0370355556   0.0370177778   0.0369022222
 4   0.0123456790   0.0123451852   0.0123259259   0.0119407407
 5   0.0041152263   0.0041150617   0.0040953086   0.0029002469
 6   0.0013717421   0.0013716872   0.0013517695  -0.0022732510
 7   0.0004572474   0.0004572291   0.0004372565  -0.0104777503
 8   0.0001524158   0.0001524097   0.0001324188  -0.0326525834
 9   0.0000508053   0.0000508032   0.0000308063  -0.0983641945
10   0.0000169351   0.0000169344  -0.0000030646  -0.2952280648

 n   x_n − r_n      x_n − p_n      x_n − q_n
 0   0.0000400000   0.0000000000   0.0000000000
 1   0.0000133333   0.0000133333   0.0000133333
 2   0.0000044444   0.0000177778   0.0000444444
 3   0.0000014815   0.0000192593   0.0001348148
 4   0.0000004938   0.0000197531   0.0004049383
 5   0.0000001646   0.0000199177   0.0012149794
 6   0.0000000549   0.0000199726   0.0036449931
 7   0.0000000183   0.0000199909   0.0109349977
 8   0.0000000061   0.0000199970   0.0328049992
 9   0.0000000020   0.0000199990   0.0984149997
10   0.0000000007   0.0000199997   0.2952449999
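The two tables above can be reproduced with a few lines of double-precision Python. This is a minimal sketch: the function name `run_schemes` and its parameters are assumptions introduced here, and the recurrences for $p_n$ and $q_n$ are applied from $n = 2$.

```python
def run_schemes(r0=0.99996, p1=0.33332, q1=0.33332, n_max=10):
    """Iterate the three schemes from the perturbed starting values."""
    x = [1.0 / 3**n for n in range(n_max + 1)]  # exact sequence 1/3^n
    r = [r0]
    p = [1.0, p1]
    q = [1.0, q1]
    for n in range(1, n_max + 1):
        r.append(r[n - 1] / 3.0)
    for n in range(2, n_max + 1):
        p.append(4.0 / 3.0 * p[n - 1] - p[n - 2] / 3.0)
        q.append(10.0 / 3.0 * q[n - 1] - q[n - 2])
    return x, r, p, q


x, r, p, q = run_schemes()
print(" n   x_n - r_n     x_n - p_n     x_n - q_n")
for n in range(11):
    print(f"{n:2d}  {x[n] - r[n]:.10f}  {x[n] - p[n]:.10f}  {x[n] - q[n]:.10f}")
```

Changing `r0`, `p1`, or `q1` shows how the size of the initial error scales each error column while leaving its growth pattern unchanged.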

[Figure: three panels plotting the errors x_n − r_n (vertical scale ×10^−5), x_n − p_n (vertical scale ×10^−5), and x_n − q_n (vertical scale 0 to 0.4) against n for n = 0 to 10.]

The error for $\{r_n\}$ is stable and decreases in an exponential manner. The error for $\{p_n\}$ is stable. The error for $\{q_n\}$ is unstable and grows at an exponential rate. Although the error for $\{p_n\}$ is stable, the terms $p_n \to 0$ as $n \to \infty$, so that the error eventually dominates and the terms past $p_8$ have no significant digits.
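A short worked derivation, offered as a sketch, makes these growth rates explicit. It assumes exact arithmetic after the initial perturbation and introduces $\epsilon = 1/3 - 0.33332 \approx 1.33 \times 10^{-5}$ for the initial error in $p_1$ and $q_1$. Because each recurrence is linear and $x_n = 1/3^n$ satisfies it exactly, the error obeys the same recurrence as the scheme, with starting errors $0$ and $\epsilon$ for $\{p_n\}$ and $\{q_n\}$. Fitting the general solutions $A(1/3^n) + B$ and $A(1/3^n) + B\,3^n$ to those starting errors gives

$$
\begin{aligned}
x_n - r_n &= \frac{0.00004}{3^n}, \\
x_n - p_n &= \frac{3\epsilon}{2}\left(1 - \frac{1}{3^n}\right) \to \frac{3\epsilon}{2} = 2 \times 10^{-5}, \\
x_n - q_n &= \frac{3\epsilon}{8}\left(3^n - \frac{1}{3^n}\right).
\end{aligned}
$$

For $n = 10$ the last expression is about $5 \times 10^{-6} \cdot 59049 \approx 0.295$, consistent with the tabulated error for $q_{10}$. The limiting error $2 \times 10^{-5}$ for $\{p_n\}$ already exceeds $x_{10} \approx 1.69 \times 10^{-5}$, which is why the computed $p_{10}$ is negative.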
