
Matrix norms

February 7, 2020

1 Induced Norms
Theorem 1.1. If $\|\cdot\|$ is an induced norm on $\mathbb{R}^{n\times n}$, then $\|Ax\| \le \|A\|\,\|x\|$ for all $x \in \mathbb{R}^n$, and this inequality is sharp.

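Proof. We know that
\[ \|A\| = \max_{x \neq 0} \frac{\|Ax\|}{\|x\|}. \]
Therefore $\|A\| \ge \frac{\|Ax\|}{\|x\|}$ for all $x \in \mathbb{R}^n$ with $\|x\| \neq 0$, which gives $\|Ax\| \le \|A\|\,\|x\|$ (as $\|x\|$ is positive valued).

Let $x_{\max} \in \mathbb{R}^n$ be such that $\|A\| = \frac{\|Ax_{\max}\|}{\|x_{\max}\|}$. Then $\|A\|\,\|x_{\max}\| = \|Ax_{\max}\|$, so the inequality is sharp.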

Theorem 1.2. Induced norms are matrix norms.

Proof. 1. If $\|A\| = 0$, then $A = 0$.

$\|A\| = 0 \implies \max_{x \neq 0} \frac{\|Ax\|}{\|x\|} = 0$. Since the maximum of a set of non-negative values is zero, all elements in the set must be zero, so $\frac{\|Ax\|}{\|x\|} = 0$ for all $x \neq 0$, that is, $\|Ax\| = 0$. Hence $Ax = 0$ for all $x \neq 0$, which is only possible when $A = 0$. Hence proved.

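2. $\|\lambda A\| = |\lambda|\,\|A\|$

\[ \|\lambda A\| = \max_{x \neq 0} \frac{\|\lambda Ax\|}{\|x\|} = |\lambda| \max_{x \neq 0} \frac{\|Ax\|}{\|x\|} = |\lambda|\,\|A\|. \]
Hence proved.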

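3. $\|A + B\| \le \|A\| + \|B\|$

\[ \|A + B\| = \max_{x \neq 0} \frac{\|(A+B)x\|}{\|x\|} \le \max_{x \neq 0} \frac{\|Ax\| + \|Bx\|}{\|x\|} \le \max_{x \neq 0} \frac{\|Ax\|}{\|x\|} + \max_{x \neq 0} \frac{\|Bx\|}{\|x\|} = \|A\| + \|B\| \]

$\implies \|A + B\| \le \|A\| + \|B\|$. Hence proved. Therefore, an induced norm is a norm.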

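Now we need to show that

4. $\|AB\| \le \|A\|\,\|B\|$

Applying $\|A(Bx)\| \le \|A\|\,\|Bx\|$ (Theorem 1.1),
\[ \|AB\| = \max_{x \neq 0} \frac{\|ABx\|}{\|x\|} \le \max_{x \neq 0} \frac{\|A\| \cdot \|Bx\|}{\|x\|} = \|A\| \cdot \max_{x \neq 0} \frac{\|Bx\|}{\|x\|} = \|A\|\,\|B\| \]

$\implies \|AB\| \le \|A\|\,\|B\|$. Hence proved. All induced norms are matrix norms.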

Theorem 1.3. $\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^{n} |a_{ij}|$ (maximum absolute column sum)

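Proof. For any $x \in \mathbb{R}^n$,
\[ \|Ax\|_1 = \Big\| \Big( \sum_{j=1}^{n} a_{1j}x_j,\ \sum_{j=1}^{n} a_{2j}x_j,\ \dots,\ \sum_{j=1}^{n} a_{nj}x_j \Big)^T \Big\|_1 = \sum_{i=1}^{n} \Big| \sum_{j=1}^{n} a_{ij}x_j \Big| \le \sum_{i=1}^{n} \sum_{j=1}^{n} |a_{ij}|\,|x_j| \]
\[ \implies \|Ax\|_1 \le \sum_{j=1}^{n} |x_j| \sum_{i=1}^{n} |a_{ij}| \le \|x\|_1 \max_{1 \le j \le n} \Big( \sum_{i=1}^{n} |a_{ij}| \Big) \]
Therefore, for $x \neq 0$,
\[ \frac{\|Ax\|_1}{\|x\|_1} \le \max_{1 \le j \le n} \sum_{i=1}^{n} |a_{ij}| \]
\[ \implies \|A\|_1 \le \max_{1 \le j \le n} \sum_{i=1}^{n} |a_{ij}| \tag{1} \]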

Now we need to show the converse inequality to prove that this is indeed an equality. Assume the matrix $A$ attains its maximum absolute column sum at the $k$-th column. Let $x = e_k$, where $e_k$ is the standard basis vector with one at index $k$ and zero elsewhere. Then
\[ \|Ae_k\|_1 = \|(a_{1k}, a_{2k}, a_{3k}, \dots, a_{nk})^T\|_1 = \sum_{i=1}^{n} |a_{ik}| \]
\[ \implies \sum_{i=1}^{n} |a_{ik}| = \|Ae_k\|_1 \le \|A\|_1 \cdot 1 \]
\[ \implies \|A\|_1 \ge \max_{1 \le j \le n} \sum_{i=1}^{n} |a_{ij}| \tag{2} \]
By inequalities (1) and (2), $\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^{n} |a_{ij}|$. Hence proved.
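The column-sum formula can be checked numerically. The following is a minimal NumPy sketch, assuming a random 4×4 test matrix; the variable names are illustrative and not part of the notes. It compares the maximum absolute column sum, the value $\|Ae_k\|_1$ attained at the basis vector used in the proof, and NumPy's built-in induced 1-norm.

```python
import numpy as np

# Hypothetical test matrix (any real square matrix would do).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

# Absolute column sums and the index k of the largest one.
col_sums = np.abs(A).sum(axis=0)
k = int(np.argmax(col_sums))

# Standard basis vector e_k, as used in the converse inequality.
e_k = np.zeros(A.shape[1])
e_k[k] = 1.0

norm1_formula = col_sums.max()               # max absolute column sum
norm1_attained = np.linalg.norm(A @ e_k, 1)  # ||A e_k||_1 with ||e_k||_1 = 1
norm1_numpy = np.linalg.norm(A, 1)           # NumPy's induced 1-norm

print(norm1_formula, norm1_attained, norm1_numpy)
```

All three printed values should agree up to rounding, which mirrors inequalities (1) and (2).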

Theorem 1.4. $\|A\|_\infty = \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|$ (maximum absolute row sum)

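Proof. For any $x \in \mathbb{R}^n$,
\[ \|Ax\|_\infty = \Big\| \Big( \sum_{j=1}^{n} a_{1j}x_j,\ \sum_{j=1}^{n} a_{2j}x_j,\ \dots,\ \sum_{j=1}^{n} a_{nj}x_j \Big)^T \Big\|_\infty = \max_{1 \le i \le n} \Big| \sum_{j=1}^{n} a_{ij}x_j \Big| \le \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|\,|x_j| \]
(using the triangle inequality)
\[ \le \|x\|_\infty \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}| \implies \frac{\|Ax\|_\infty}{\|x\|_\infty} \le \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}| \quad (x \neq 0) \]
\[ \implies \|A\|_\infty \le \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}| \tag{3} \]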

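Now we need to show the converse inequality to prove equality. Let the maximum absolute row sum be attained at the $k$-th row of the matrix $A$. Construct the vector $x$ such that
\[ x = (\operatorname{sgn}(a_{k1}), \operatorname{sgn}(a_{k2}), \operatorname{sgn}(a_{k3}), \dots, \operatorname{sgn}(a_{kn}))^T \tag{4} \]
where
\[ \operatorname{sgn}(t) = \begin{cases} 1 & t > 0 \\ -1 & t < 0 \\ 0 & t = 0 \end{cases} \]
Now
\[ \|Ax\|_\infty = \Big\| \Big( \sum_{j=1}^{n} a_{1j}\operatorname{sgn}(a_{kj}),\ \sum_{j=1}^{n} a_{2j}\operatorname{sgn}(a_{kj}),\ \dots,\ \sum_{j=1}^{n} a_{nj}\operatorname{sgn}(a_{kj}) \Big)^T \Big\|_\infty \]
The $k$-th component of $Ax$ is $\sum_{j=1}^{n} a_{kj}\operatorname{sgn}(a_{kj}) = \sum_{j=1}^{n} |a_{kj}|$, so
\[ \sum_{j=1}^{n} |a_{kj}| \le \|Ax\|_\infty \le \|A\|_\infty \cdot 1 \]
($\|x\|_\infty = 1$ whenever $A \neq 0$; the case $A = 0$ is trivial.)
\[ \implies \|A\|_\infty \ge \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}| \tag{5} \]

By inequalities (3) and (5), $\|A\|_\infty = \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|$. Hence proved.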
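The sign-vector construction in the converse inequality can be sketched in NumPy. This is only an illustration under the assumption of a random 4×4 matrix with no zero entries in its dominant row; the names are hypothetical.

```python
import numpy as np

# Hypothetical test matrix; its entries are nonzero with probability one.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))

# Absolute row sums and the index k of the largest one.
row_sums = np.abs(A).sum(axis=1)
k = int(np.argmax(row_sums))

# x = (sgn(a_k1), ..., sgn(a_kn)), so that (Ax)_k = sum_j |a_kj|.
x = np.sign(A[k])

norm_inf_formula = row_sums.max()                  # max absolute row sum
norm_inf_attained = np.linalg.norm(A @ x, np.inf)  # ||Ax||_inf with ||x||_inf = 1
norm_inf_numpy = np.linalg.norm(A, np.inf)         # NumPy's induced inf-norm

print(norm_inf_formula, norm_inf_attained, norm_inf_numpy)
```

The three values should coincide up to rounding, so the sign vector attains the maximum in the definition of the induced norm.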

Theorem 1.5. $\|A\|_2 = [\lambda_{\max}(A^TA)]^{1/2}$, where $\lambda_{\max}(A^TA)$ is the largest eigenvalue of $A^TA$ [spectral norm].

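Proof. Let $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \dots \ge \lambda_n \ge 0$ be the eigenvalues of $A^TA$. The eigenvectors of $A^TA$ can be chosen to form an orthonormal basis for $\mathbb{R}^n$ ($A^TA$ is symmetric).

Let $Z = \{Z_1, Z_2, Z_3, \dots, Z_n\}$ be such an orthonormal basis, with $A^TAZ_i = \lambda_i Z_i$. For any $x \in \mathbb{R}^n$,
\[ x = \alpha_1 Z_1 + \alpha_2 Z_2 + \alpha_3 Z_3 + \dots + \alpha_n Z_n \]
\[ \|Ax\|_2^2 = x^TA^TAx = \Big( \sum_{i=1}^{n} \alpha_i Z_i \Big)^T A^TA \Big( \sum_{i=1}^{n} \alpha_i Z_i \Big) = \Big( \sum_{i=1}^{n} \alpha_i Z_i \Big)^T \Big( \sum_{i=1}^{n} \alpha_i \lambda_i Z_i \Big) \]
\[ = \alpha_1^2\lambda_1 + \alpha_2^2\lambda_2 + \alpha_3^2\lambda_3 + \dots + \alpha_n^2\lambda_n \le \lambda_{\max}\,(\alpha_1^2 + \alpha_2^2 + \alpha_3^2 + \dots + \alpha_n^2) = \lambda_{\max}\|x\|_2^2 \]
\[ \implies \frac{\|Ax\|_2^2}{\|x\|_2^2} \le \lambda_{\max} \quad (x \neq 0) \]
\[ \implies \|A\|_2 \le \lambda_{\max}^{1/2} \tag{6} \]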

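Converse inequality:

Let $x = Z_1$, where $\lambda_1 = \lambda_{\max}$ is the maximum eigenvalue of $A^TA$, so that $A^TAZ_1 = \lambda_1 Z_1$. Then
\[ \|AZ_1\|_2^2 = Z_1^TA^TAZ_1 = \lambda_1 Z_1^TZ_1 \]
\[ \implies \frac{\|AZ_1\|_2^2}{\|Z_1\|_2^2} = \lambda_1 \implies \lambda_{\max} \le \|A\|_2^2 \]
\[ \implies \|A\|_2 \ge \lambda_{\max}^{1/2} \tag{7} \]

By inequalities (6) and (7), $\|A\|_2 = \lambda_{\max}^{1/2}$. Hence proved.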
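As a numerical illustration of the spectral norm formula, the sketch below (assuming a random 5×5 matrix and illustrative variable names) computes the largest eigenvalue of $A^TA$ with `np.linalg.eigh`, evaluates $\|AZ_1\|_2$ at the corresponding unit eigenvector, and compares both with NumPy's 2-norm.

```python
import numpy as np

# Hypothetical test matrix.
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))

# eigh is meant for symmetric matrices; it returns eigenvalues in
# ascending order and orthonormal eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(A.T @ A)
lam_max = eigvals[-1]
z1 = eigvecs[:, -1]  # unit eigenvector for the largest eigenvalue

spectral_formula = np.sqrt(lam_max)         # lambda_max(A^T A)^(1/2)
spectral_attained = np.linalg.norm(A @ z1)  # ||A z1||_2 with ||z1||_2 = 1
spectral_numpy = np.linalg.norm(A, 2)       # NumPy's induced 2-norm

print(spectral_formula, spectral_attained, spectral_numpy)
```

The agreement of the three values reflects inequalities (6) and (7): the bound is reached at the eigenvector of the largest eigenvalue.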

Corollary 1.1. If $A$ is a symmetric positive semi-definite matrix such that $A = C^TC$, then $\|A\|_2 = \|C\|_2^2$.

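Proof. Using Theorem 1.5,
\[ \|C\|_2 = [\lambda_{\max}(C^TC)]^{1/2} = [\lambda_{\max}(A)]^{1/2} \tag{8} \]
\[ \|A\|_2 = [\lambda_{\max}(A^TA)]^{1/2} = [\lambda_{\max}(A^2)]^{1/2} = [\lambda_{\max}(A)^2]^{1/2} \]
and $[\lambda_{\max}(A)^2]^{1/2} = \lambda_{\max}(A)$ (as $A$ is p.s.d.). Using equation (8),
\[ \|A\|_2 = \|C\|_2^2. \]
Hence proved.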


Theorem 1.6. $\|A\|_F = [\operatorname{trace}(A^TA)]^{1/2}$

Proof. Exercise.

Theorem 1.7. $\|A\|_2 \le \|A\|_F \le \sqrt{n}\,\|A\|_2$

Proof. Exercise.
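Although the proofs of the two Frobenius norm statements are left as exercises, they can be sanity-checked numerically. The sketch below assumes a random 6×6 matrix (names are illustrative) and tests the upper bound with the factor $\sqrt{n}$, which is at least as strong as a factor of $n$.

```python
import numpy as np

# Hypothetical test matrix of size n x n.
rng = np.random.default_rng(4)
n = 6
A = rng.standard_normal((n, n))

fro_formula = np.sqrt(np.trace(A.T @ A))  # [trace(A^T A)]^(1/2)
fro_numpy = np.linalg.norm(A, "fro")      # NumPy's Frobenius norm
spec = np.linalg.norm(A, 2)               # spectral norm ||A||_2

print(fro_formula, fro_numpy)
print(spec <= fro_numpy <= np.sqrt(n) * spec)
```

A numerical check on one matrix is of course not a proof; it only guards against misreading the statements.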

Theorem 1.8. Let $A \in \mathbb{R}^{n\times n}$ be a symmetric matrix. Then
\[ \|A\|_2 = \max_{\|x\|_2 = 1} |\langle Ax, x \rangle| \]

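Proof. As $A$ is symmetric, $A = UDU^T$ by the spectral theorem for symmetric matrices, and the eigenvectors of $A$ form an orthonormal basis for $\mathbb{R}^n$. Let $\lambda_{\max}$ denote an eigenvalue of $A$ of largest absolute value.

Since $A^TA = A^2$ has eigenvalues $\lambda_i^2$, Theorem 1.5 gives $\|A\|_2 = |\lambda_{\max}|$.

By the Cauchy-Schwarz inequality and Theorem 1.1, for $\|x\|_2 = 1$,
\[ |\langle Ax, x \rangle| \le \|Ax\|_2\,\|x\|_2 \le \|A\|_2 \cdot 1 \]
\[ \implies \max_{\|x\|_2 = 1} |\langle Ax, x \rangle| \le \|A\|_2 \tag{9} \]

Converse inequality: let $x = v$, where $v$ is a unit vector from the orthonormal basis formed by eigenvectors of $A$ such that $Av = \lambda_{\max} v$. Then
\[ |\langle Ax, x \rangle| = |\langle \lambda_{\max} v, v \rangle| = |\lambda_{\max}| \cdot 1 = \|A\|_2 \]
\[ \therefore \|A\|_2 = \max_{\|x\|_2 = 1} |\langle Ax, x \rangle|. \]
Hence proved.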

Theorem 1.9. Let $A \in \mathbb{R}^{n\times n}$ be a symmetric positive semi-definite matrix. Then

1. $\lambda_{\max}(A) = \max_{\|x\|_2 = 1} \langle Ax, x \rangle$

2. $\lambda_{\min}(A) = \min_{\|x\|_2 = 1} \langle Ax, x \rangle$

Proof. 1. $\lambda_{\max}(A) = \|A\|_2 = \max_{\|x\|_2 = 1} \langle Ax, x \rangle$ by Corollary 1.1 and Theorem 1.8.

2. $\langle Ax, x \rangle = x^TAx = x^TUDU^Tx$ (by the spectral theorem for symmetric matrices)
\[ \implies \langle Ax, x \rangle = (U^Tx)^T D\, U^Tx \ge \lambda_{\min} \|U^Tx\|_2^2 \]

Proposition 1.1. $\|U^Tx\|_2^2 = \|x\|_2^2$ (orthogonal matrices preserve length)

Proof. $\|U^Tx\|_2^2 = x^TUU^Tx = x^Tx = \|x\|_2^2$ ($U^T = U^{-1}$ by definition for orthogonal matrices).

\[ \implies \langle Ax, x \rangle \ge \lambda_{\min} \tag{10} \]
for $\|x\|_2 = 1$.

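Converse inequality:

Take $y$ such that $Ay = \lambda_{\min} y$ and $\|y\|_2 = 1$. Now,
\[ \langle Ay, y \rangle = y^TAy = \lambda_{\min}\, y^Ty = \lambda_{\min} \|y\|_2^2 = \lambda_{\min} \]
\[ \implies \lambda_{\min}(A) = \min_{\|x\|_2 = 1} \langle Ax, x \rangle \tag{11} \]
Hence proved.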
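The two extremal characterisations can be illustrated by sampling the quadratic form $\langle Ax, x \rangle$ on the unit sphere. The following NumPy sketch assumes a positive semi-definite matrix built as $C^TC$ from a random $C$; all names are illustrative.

```python
import numpy as np

# Hypothetical symmetric p.s.d. matrix A = C^T C.
rng = np.random.default_rng(3)
C = rng.standard_normal((4, 4))
A = C.T @ C

# Eigenvalues in ascending order, orthonormal eigenvectors as columns.
eigvals, U = np.linalg.eigh(A)
lam_min, lam_max = eigvals[0], eigvals[-1]

# Random unit vectors (one per column) and their values <Ax, x>.
X = rng.standard_normal((4, 1000))
X /= np.linalg.norm(X, axis=0)
rayleigh = np.einsum("ij,ij->j", X, A @ X)

# The extremes are attained at the corresponding unit eigenvectors.
q_min = U[:, 0] @ A @ U[:, 0]
q_max = U[:, -1] @ A @ U[:, -1]

print(lam_min, rayleigh.min(), rayleigh.max(), lam_max)
```

Every sampled value should lie between $\lambda_{\min}$ and $\lambda_{\max}$, while `q_min` and `q_max` reproduce the two eigenvalues.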

Definition 1.1. Absolute and Relative Error:

If the scalar $\hat{x}$ is an approximation of the scalar $x$, then the Absolute Error is given by $|\hat{x} - x|$ and the Relative Error is given by $\frac{|\hat{x} - x|}{|x|}$ (for $x \neq 0$). If $|\hat{x}| \neq 0$, then the Relative Error is also given by $\frac{|\hat{x} - x|}{|\hat{x}|}$. If we use $\|\cdot\|$ instead of $|\cdot|$, it is known as normwise relative/absolute error.
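A small sketch of these definitions in NumPy follows. The function names `absolute_error` and `relative_error` are hypothetical helpers, and the 2-norm is an assumed choice for the normwise variant; for scalars it reduces to the absolute value.

```python
import numpy as np

def absolute_error(x_hat, x):
    """Normwise absolute error ||x_hat - x|| (2-norm; |x_hat - x| for scalars)."""
    diff = np.atleast_1d(np.asarray(x_hat, dtype=float)) - np.atleast_1d(
        np.asarray(x, dtype=float)
    )
    return np.linalg.norm(diff)

def relative_error(x_hat, x):
    """Normwise relative error ||x_hat - x|| / ||x||, assuming x != 0."""
    return absolute_error(x_hat, x) / np.linalg.norm(
        np.atleast_1d(np.asarray(x, dtype=float))
    )

# Scalar case: approximating x = 4.0 by x_hat = 4.5.
abs_err = absolute_error(4.5, 4.0)
rel_err = relative_error(4.5, 4.0)
print(abs_err, rel_err)

# Vector case: the same functions give the normwise errors.
vec_rel_err = relative_error([3.0, 4.5], [3.0, 4.0])
print(vec_rel_err)
```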

2 Sensitivity of Linear Systems


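Consider a linear system
\[ Ax = b \]
where
\[ A = \begin{pmatrix} 1000 & 999 \\ 999 & 998 \end{pmatrix} \quad \text{and} \quad b = \begin{pmatrix} 1999 \\ 1997 \end{pmatrix}. \]
Solving the linear system gives $x = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$.

A slight perturbation is given to $b$:
\[ b = \begin{pmatrix} 1998.99 \\ 1997.01 \end{pmatrix}, \quad A = \begin{pmatrix} 1000 & 999 \\ 999 & 998 \end{pmatrix}. \]
The solution of this new linear system is $x = \begin{pmatrix} 20.97 \\ -18.99 \end{pmatrix}$, a drastic change!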
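The example can be reproduced with a few lines of NumPy. This is a minimal sketch using `np.linalg.solve`; the variable names are illustrative.

```python
import numpy as np

A = np.array([[1000.0, 999.0],
              [999.0, 998.0]])
b = np.array([1999.0, 1997.0])
b_pert = np.array([1998.99, 1997.01])  # b perturbed by 0.01 in each entry

x = np.linalg.solve(A, b)            # solution of the original system
x_pert = np.linalg.solve(A, b_pert)  # solution of the perturbed system

print(x)       # approximately [1, 1]
print(x_pert)  # approximately [20.97, -18.99]
```

A change of about $5 \times 10^{-6}$ in relative terms on the right-hand side moves the solution by a factor of roughly 20.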

2.1 Geometric Intuition


The initial system of linear equations is given by
\[ 1000x_1 + 999x_2 = 1999 \]
\[ 999x_1 + 998x_2 = 1997 \]
and after perturbation,
\[ 1000x_1 + 999x_2 = 1998.99 \]
\[ 999x_1 + 998x_2 = 1997.01 \]
Plotting the initial system of equations on a graph, we get figure 1. We can see that the lines almost completely overlap each other and intersect at (1, 1), as expected.

figure 1

Now let us plot the perturbed system of equations to get figure 2. We can see that the intersection point has changed drastically and is now at (20.97, −18.99).

figure 2

The drastic change in the intersection point is a result of the lines being extremely close to each other, close enough that a minute change in the value of $b$ results in a completely different solution. This is a way to understand this behavior geometrically; we shall now see how to measure the sensitivity of linear systems formally.

Definition 2.1. Condition Number

For an invertible matrix $A$, the condition number of $A$ with respect to a norm $\|\cdot\|$ is denoted by $\kappa(A)$ and is defined as
\[ \kappa(A) = \|A\|\,\|A^{-1}\| \]
\[ 1 \le \kappa(A) \le \infty \]

Theorem 2.1. Properties of Condition Number

1. $\kappa(A) = \kappa(A^{-1})$

2. $\kappa(A) = \kappa(cA)$ for all $c \neq 0$, $c \in \mathbb{R}$

3. $\kappa(A) \ge 1$

Proof. Exercise.

Remark 2.1. Let $A \in \mathbb{R}^{n\times n}$. Then,

1. If $A$ is singular, its condition number is defined to be $\infty$.

2. In general there is no relationship between the condition number and the determinant.

 
Example 2.1. For $A = \begin{pmatrix} \alpha & 0 \\ 0 & \alpha \end{pmatrix}$ with $\alpha \neq 0$, $\kappa(A) = 1$, but $\det(A) = \alpha^2$.
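This example is easy to confirm numerically. The sketch below assumes three illustrative values of $\alpha$ and uses `np.linalg.cond`, which defaults to the 2-norm condition number.

```python
import numpy as np

alphas = [1e-3, 1.0, 1e3]  # illustrative scalings, all nonzero

# Condition number and determinant of alpha * I for each alpha.
conds = [np.linalg.cond(alpha * np.eye(2)) for alpha in alphas]
dets = [np.linalg.det(alpha * np.eye(2)) for alpha in alphas]

for alpha, kappa, det in zip(alphas, conds, dets):
    print(alpha, kappa, det)
```

The determinant ranges over twelve orders of magnitude while the condition number stays at 1, so a small determinant by itself does not indicate an ill-conditioned matrix.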

Theorem 2.2. Let $A$ be non-singular and let $x$ and $\hat{x} = x + \Delta x$ be the solutions of $Ax = b$ and $A\hat{x} = b + \delta b$. Then,
\[ \frac{\|\Delta x\|}{\|x\|} \le \kappa(A)\, \frac{\|\delta b\|}{\|b\|} \]

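Proof.
\[ Ax = b \tag{12} \]
\[ A(x + \Delta x) = b + \delta b \tag{13} \]
Subtracting equation (12) from (13), we get $A\Delta x = \delta b$, and as $A$ is invertible,
\[ \Delta x = A^{-1}\delta b \implies \|\Delta x\| = \|A^{-1}\delta b\| \le \|A^{-1}\|\,\|\delta b\| \tag{14} \]
Now,
\[ \|Ax\| = \|b\| \implies \|b\| \le \|A\|\,\|x\| \]
\[ \frac{1}{\|b\|} \ge \frac{1}{\|A\|} \cdot \frac{1}{\|x\|}, \quad \text{i.e.} \quad \frac{1}{\|x\|} \le \frac{\|A\|}{\|b\|} \tag{15} \]
Multiplying inequalities (14) and (15), we get
\[ \frac{\|\Delta x\|}{\|x\|} \le \|A\|\,\|A^{-1}\|\, \frac{\|\delta b\|}{\|b\|} = \kappa(A)\, \frac{\|\delta b\|}{\|b\|} \]
Hence proved.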
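The bound can be evaluated on the earlier 2×2 example. The following sketch assumes the infinity norm (any induced norm would do) and illustrative variable names; `np.linalg.cond(A, np.inf)` computes $\|A\|_\infty \|A^{-1}\|_\infty$.

```python
import numpy as np

A = np.array([[1000.0, 999.0],
              [999.0, 998.0]])
b = np.array([1999.0, 1997.0])
delta_b = np.array([-0.01, 0.01])  # perturbation of the right-hand side

x = np.linalg.solve(A, b)
x_hat = np.linalg.solve(A, b + delta_b)

# Condition number in the infinity norm: ||A||_inf * ||A^{-1}||_inf.
kappa = np.linalg.cond(A, np.inf)

# Left and right sides of the perturbation bound.
lhs = np.linalg.norm(x_hat - x, np.inf) / np.linalg.norm(x, np.inf)
rhs = kappa * np.linalg.norm(delta_b, np.inf) / np.linalg.norm(b, np.inf)

print(kappa)     # about 1999**2
print(lhs, rhs)  # both about 19.99
```

For this particular perturbation the two sides agree up to rounding, so the bound appears to be essentially attained here; the large condition number explains the drastic change in the solution.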

Remark 2.2. If we perturb the coefficient matrix $A$ instead, we can still bound the error in the solution. Note that the perturbed matrix need not be invertible.

Theorem 2.3. Let $A$ be an invertible matrix. If $\frac{\|\Delta A\|}{\|A\|} < \frac{1}{\kappa(A)}$, then $A + \Delta A$ is invertible.
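Proof. If $A + \Delta A$ is singular, then $(A + \Delta A)x = 0$ for some $x \neq 0$ [as the dimension of the null space is greater than 0 for singular matrices].
\[ \implies Ax = -\Delta A\,x \implies x = -A^{-1}\Delta A\,x \]
\[ \implies \|x\| \le \|A^{-1}\|\,\|\Delta A\|\,\|x\| \implies 1 \le \|A^{-1}\|\,\|\Delta A\| \]
\[ \implies 1 \le \kappa(A)\, \frac{\|\Delta A\|}{\|A\|} \implies \frac{1}{\kappa(A)} \le \frac{\|\Delta A\|}{\|A\|} \]
$\therefore$ by contraposition, if $\frac{\|\Delta A\|}{\|A\|} < \frac{1}{\kappa(A)}$, then $A + \Delta A$ is invertible. Hence proved.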

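Theorem 2.4. Let $A$ be an invertible matrix. If $x$ and $\hat{x} = x + \Delta x$ are the solutions to $Ax = b$ and $(A + \Delta A)\hat{x} = b$, and $\frac{\|\Delta A\|}{\|A\|} < \frac{1}{\kappa(A)}$, then
\[ \frac{\|\Delta x\|}{\|x\|} \le \frac{\kappa(A)\, \frac{\|\Delta A\|}{\|A\|}}{1 - \kappa(A)\, \frac{\|\Delta A\|}{\|A\|}} \]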
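Proof.
\[ (A + \Delta A)\hat{x} = b \]
Since $\hat{x} = x + \Delta x$ and $Ax = b$,
\[ \implies A\Delta x + \Delta A\,\hat{x} = 0 \implies A\Delta x = -\Delta A\,\hat{x} \]
\[ \Delta x = -A^{-1}\Delta A\,\hat{x} \implies \|\Delta x\| \le \|A^{-1}\|\,\|\Delta A\|\,\|\hat{x}\| \]
\[ \implies \|\Delta x\| \le \|A^{-1}\|\,\|\Delta A\|\,(\|x\| + \|\Delta x\|) \]
\[ \implies (1 - \|A^{-1}\|\,\|\Delta A\|)\,\|\Delta x\| \le \|A^{-1}\|\,\|\Delta A\|\,\|x\| \]
Since $\|A^{-1}\|\,\|\Delta A\| = \kappa(A)\, \frac{\|\Delta A\|}{\|A\|} < 1$,
\[ \frac{\|\Delta x\|}{\|x\|} \le \frac{\kappa(A)\, \frac{\|\Delta A\|}{\|A\|}}{1 - \kappa(A)\, \frac{\|\Delta A\|}{\|A\|}} \]
Hence proved.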
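A numerical check of this bound for a perturbed coefficient matrix might look as follows. The matrix, right-hand side and perturbation are hypothetical values chosen so that the hypothesis $\frac{\|\Delta A\|}{\|A\|} < \frac{1}{\kappa(A)}$ holds in the infinity norm.

```python
import numpy as np

# Hypothetical well-conditioned system and a small perturbation of A.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
b = np.array([1.0, 2.0])
delta_A = 1e-3 * np.array([[1.0, -2.0],
                           [0.5, 1.0]])

kappa = np.linalg.cond(A, np.inf)
rel_pert = np.linalg.norm(delta_A, np.inf) / np.linalg.norm(A, np.inf)

# Hypothesis of the theorem; it also guarantees A + delta_A is invertible.
hypothesis_holds = rel_pert < 1.0 / kappa

x = np.linalg.solve(A, b)
x_hat = np.linalg.solve(A + delta_A, b)

lhs = np.linalg.norm(x_hat - x, np.inf) / np.linalg.norm(x, np.inf)
rhs = kappa * rel_pert / (1.0 - kappa * rel_pert)

print(hypothesis_holds, lhs, rhs)
```

The observed relative error `lhs` should not exceed `rhs`; how close the two are depends on the direction of the perturbation.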

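Theorem 2.5. Let $A$ be an invertible matrix. If $Ax = b$ and
\[ (A + \Delta A)(x + \Delta x) = b + \Delta b; \quad b + \Delta b \neq 0 \]
then, with $\hat{x} = x + \Delta x$,
\[ \frac{\|\Delta x\|}{\|\hat{x}\|} \le \kappa(A) \left( \frac{\|\Delta A\|}{\|A\|} + \frac{\|\Delta b\|}{\|b + \Delta b\|} + \frac{\|\Delta A\|}{\|A\|} \cdot \frac{\|\Delta b\|}{\|b + \Delta b\|} \right) \]
Proof. Exercise.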

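Theorem 2.6. Let $A$ be an invertible matrix and $\frac{\|\Delta A\|}{\|A\|} < \frac{1}{\kappa(A)}$. If $Ax = b$ and
\[ (A + \Delta A)(x + \Delta x) = b + \Delta b; \quad b \neq 0 \]
then,
\[ \frac{\|\Delta x\|}{\|x\|} \le \frac{\kappa(A) \left( \frac{\|\Delta A\|}{\|A\|} + \frac{\|\Delta b\|}{\|b\|} \right)}{1 - \kappa(A)\, \frac{\|\Delta A\|}{\|A\|}} \]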

Proof. Exercise.

3 Geometric meaning of condition number


Definition 3.1. The maximum and minimum magnification are defined by

• $\operatorname{maxmag}(A) = \max_{\|x\| = 1} \|Ax\|$

• $\operatorname{minmag}(A) = \min_{\|x\| = 1} \|Ax\|$

Theorem 3.1. Let $A$ be an invertible matrix. Then,

• $\operatorname{maxmag}(A) = \frac{1}{\operatorname{minmag}(A^{-1})}$

• $\operatorname{minmag}(A) = \frac{1}{\operatorname{maxmag}(A^{-1})}$

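Proof. By definition, $\operatorname{maxmag}(A) = \max_{\|x\| = 1} \|Ax\|$.

Let $Ax = y$, so that $A^{-1}y = x$. For $x \neq 0$,
\[ \Big\| A \frac{x}{\|x\|} \Big\| = \frac{\|A A^{-1} y\|}{\|x\|} = \frac{\|y\|}{\|A^{-1}y\|} = \frac{1}{\big\| A^{-1} \frac{y}{\|y\|} \big\|} \]
As $A$ is invertible, $y$ ranges over all nonzero vectors when $x$ does. Taking the maximum and the minimum over $y \neq 0$,
\[ \operatorname{maxmag}(A) = \max_{y \neq 0} \frac{1}{\big\| A^{-1} \frac{y}{\|y\|} \big\|} = \frac{1}{\operatorname{minmag}(A^{-1})} \tag{16} \]
\[ \operatorname{minmag}(A) = \min_{y \neq 0} \frac{1}{\big\| A^{-1} \frac{y}{\|y\|} \big\|} = \frac{1}{\operatorname{maxmag}(A^{-1})} \tag{17} \]
Equations (16) and (17) are the two claimed identities. Hence proved.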

Remark 3.1. Let $A$ be an invertible matrix. Then, by the previous theorem,
\[ \kappa(A) = \frac{\operatorname{maxmag}(A)}{\operatorname{minmag}(A)} \]
The condition number captures the behavior of the unit ball under transformation by the matrix $A$.
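The magnification picture can be illustrated in two dimensions by pushing points of the unit circle through $A$. The sketch below assumes the 2-norm, a hypothetical 2×2 matrix, and a fine sampling of the circle, so the extremes are only approximate.

```python
import numpy as np

# Hypothetical invertible matrix.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# Points on the unit circle, one per column.
theta = np.linspace(0.0, 2.0 * np.pi, 10001)
X = np.vstack([np.cos(theta), np.sin(theta)])

# Magnification ||Ax||_2 of every sampled unit vector.
mags = np.linalg.norm(A @ X, axis=0)
maxmag, minmag = mags.max(), mags.min()

# Same for the inverse, to compare with maxmag(A) = 1 / minmag(A^{-1}).
mags_inv = np.linalg.norm(np.linalg.inv(A) @ X, axis=0)

kappa_sampled = maxmag / minmag
kappa_numpy = np.linalg.cond(A, 2)

print(maxmag, 1.0 / mags_inv.min())
print(kappa_sampled, kappa_numpy)
```

The image of the unit circle is an ellipse; the ratio of its longest to its shortest semi-axis is the 2-norm condition number, which is why nearly flat ellipses correspond to sensitive systems.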
