
Computation of Plane Unitary Rotations Transforming a General Matrix to Triangular Form
Author(s): Wallace Givens
Source: Journal of the Society for Industrial and Applied Mathematics, Vol. 6, No. 1 (Mar., 1958), pp. 26-50
Published by: Society for Industrial and Applied Mathematics
Stable URL: http://www.jstor.org/stable/2098861

J. SOC. INDUST. APPL. MATH.
Vol. 6, No. 1, March, 1958
Printed in U.S.A.

COMPUTATION OF PLANE UNITARY ROTATIONS TRANSFORMING A GENERAL MATRIX TO TRIANGULAR FORM*t


WALLACE GIVENS

1. Introduction and assumptions. Given an arbitrary square matrix $A$ with complex elements, there exists a unitary matrix $U$ such that

(1.1)  $U^{-1}AU = T$

is triangular ($t_{ij} = 0$ for $j > i$). Since this well known theorem does not assume that the matrix $A$ is similar to a diagonal matrix (i.e., has simple elementary divisors), it seems possible that a reduction based on this theorem will possess greater stability than methods which theoretically should fail for matrices with non-simple elementary divisors.

The following method is an extension of that published in [1] and [2]. In each of these the arithmetic used was real number fixed point¹ but the extension to complex numbers was indicated. Except in §14, where we consider some specializations to real numbers, we suppose that all machine numbers are complex and that $A$ has been normalized² so that

(1.2)  $N^2(A) = \sum_{i,j=1}^{n} |a_{ij}|^2 < \tfrac14.$

Hence, $|a_{ij}| < \tfrac12$ and the original data are digital. Moreover, since in equation (1.1), $U^{-1} = U^*$ (= $U$ conjugate transpose) and

(1.3)  $N^2(A) = \operatorname{trace} A^*A$

* Received by the editors November 22, 1957. t Presented at the Wayne State University Conference on Matrix Computations, Sept. 3-6, 1957. An abstract was published in the January 1958 issue of the Journal of the Association for Computing Machinery and a draft version appeared as a Numerical Analysis Note at The Ramo-Wooldridge Corporation on August 5, 1957 (NN-77). 1 Floating point was also used in [21 but we shall not have occasion to use this

part of [2]. 10-6 formatrices of order ?100 but 1 2 In [1] and [21we required only N2(A) _ the more stringent normalization N2(A) < 1 is more convenient here. Where there is a great disparity between the sizes of the elements or of the characteristic values of A, it may be supposed that the required normalization is undesirable. (It is never necessary if fixed point operation is not desired.) This is not in general true since it will always sufficeto require that maxi, I aij I < (2n)-1 and for n < 50 this loses fewer significant figures than are customarily allocated to the exponent in packed floating point. Of course, if the data really justify an answer of great accuracy, a double precision computation may be considered. 26


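we have

(1.4)  $N^2(T) = \operatorname{trace} U^*A^*UU^*AU = \operatorname{trace} U^*A^*AU = \operatorname{trace} A^*AUU^* = \operatorname{trace} A^*A = N^2(A).$

If $A$ has characteristic values

(1.5)  $\lambda_i = \alpha_i + i\beta_i \quad (i = 1, 2, \ldots, n)$

then, for some ordering of the $\lambda_i$,

(1.6)  $t_{ii} = \lambda_i.$

Since we already know that

(1.7)  $N^2(T) = \sum_{i=1}^{n} |t_{ii}|^2 + \sum_{i>j} |t_{ij}|^2 = N^2(A) < \tfrac14,$

we have

(1.8)  $\sum_{i=1}^{n} |\lambda_i|^2 \le N^2(A) < \tfrac14$

so that certainly

(1.9)  $|\lambda_i|^2 = \alpha_i^2 + \beta_i^2 < \tfrac14$ and $|\alpha_i| < \tfrac12$, $|\beta_i| < \tfrac12.$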
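The normalization (1.2) can be illustrated by a short sketch. The helper names `frobenius_norm` and `normalize`, the sample matrix, and the choice of scaling by a power of two are assumptions made for illustration only; the text requires only that $N^2(A) < \tfrac14$.

```python
import math

def frobenius_norm(a):
    """N(A): square root of the sum of |a_ij|^2, cf. (1.2)."""
    return math.sqrt(sum(abs(x) ** 2 for row in a for x in row))

def normalize(a, bound=0.5):
    """Scale A by a power of two until N(A) < bound (assumed scaling rule)."""
    norm = frobenius_norm(a)
    scale = 1.0
    while norm * scale >= bound:
        scale /= 2.0
    return [[x * scale for x in row] for row in a], scale

# Hypothetical sample data with complex elements.
A = [[3 + 4j, 1.0], [0.5j, -2.0]]
B, scale = normalize(A)
```

After scaling, every element satisfies $|b_{ij}| < \tfrac12$, as stated after (1.2), since no single element can exceed the norm.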

In §6 we use these results to guarantee that fixed point operations may be used without danger of overflow in a certain part of the reduction of $A$ to $T$. In another part (§7), economy seems to dictate a limited "scaling and unscaling" in cases which should occur infrequently. If optimum use of the word length of the machine is not deemed necessary, the normalization $N(A) < (1 + n^{1/2})^{-1}$ will prevent overflow throughout the computation (cf. (13.5)).

2. Semitriangular matrices. Matrices with all elements zero either above some fixed diagonal or below some fixed diagonal will occur with such frequency that we introduce a special notation for them and collect some easily proved results.

DEFINITION 2.1. A (rectangular) matrix $M = (m_{ij};\ i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n)$ will be called p-subtriangular if $m_{ij} = 0$ for $i > j + p$ and p-supertriangular if $m_{ij} = 0$ for $j > i + p$. A matrix is p-triangular if it is either p-subtriangular or p-supertriangular.

LEMMA 2.1. If $M$ is p-subtriangular, its transpose $M'$ is p-supertriangular and conversely.

This permits the following lemmas, stated only for subtriangular matrices, to be carried over to the supertriangular case.
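Definition 2.1 and Lemma 2.1 translate directly into a predicate on the index pattern. The sketch below uses hypothetical helper names and small integer matrices chosen for illustration; it also checks the product rule stated as Lemma 2.3 in this section, including the formula (2.1) for the elements on the extreme subdiagonal.

```python
def is_subtriangular(m, p):
    """True if m_ij = 0 whenever i > j + p (Definition 2.1)."""
    return all(m[i][j] == 0
               for i in range(len(m)) for j in range(len(m[0])) if i > j + p)

def is_supertriangular(m, p):
    """True if m_ij = 0 whenever j > i + p."""
    return all(m[i][j] == 0
               for i in range(len(m)) for j in range(len(m[0])) if j > i + p)

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[1, 2, 3], [4, 5, 6], [0, 7, 8]]   # 1-subtriangular, not 0-subtriangular
B = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]   # 0-subtriangular
C = matmul(A, B)                        # expected to be (1 + 0)-subtriangular
```

The index test is unchanged by the shift from the 1-based indices of the text to the 0-based indices of the code, since only the difference $i - j$ enters.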


LEMMA 2.2. If $M$ is p-subtriangular, it is q-subtriangular for every $q \ge p$. If $M$ is m by n, then it is m-subtriangular and n-supertriangular.

LEMMA 2.3. If $A$ is p-subtriangular and $B$ is q-subtriangular, then if $C = AB$ is defined, it is (p + q)-subtriangular. Also,

(2.1)  $c_{ij} = a_{ik}b_{kj}$ (for $k = j + q$ when $i = j + p + q$).

Clearly, each term in $c_{ij} = \sum_k a_{ik}b_{kj}$ is zero unless $i \le k + p$ and $k \le j + q$, so $c_{ij} = 0$ unless $i \le (j + q) + p$. Moreover, when $i = j + p + q$ the only term in the sum which cannot be guaranteed zero is that for which $k = j + q = i - p$ so $a_{ik}b_{kj} = c_{ij}$ as claimed.

The last lemma is of course a consequence of the second part of Lemma 2.2 if $p + q$ is equal to or greater than the number of rows in $C$ (or, $A$). Moreover, $C$ may be (p + q - 1)-subtriangular even though $A$ is not (p - 1)-subtriangular and $B$ is not (q - 1)-subtriangular. In an important special case, however, the last lemma can be strengthened.

DEFINITION 2.2. A p-subtriangular matrix $A$ is called complete if $a_{ij} \ne 0$ when $i = j + p$.

LEMMA 2.4. In the last lemma if either factor is complete, if $A$ is not (p - 1)-subtriangular and $B$ is not (q - 1)-subtriangular and $p + q <$ the number of rows in $A$, then the product is not (p + q - 1)-subtriangular and is complete if both factors are complete.

Allowing $p$ to be zero or negative permits well known results to be stated in the form:

LEMMA 2.5. If $A$ is square and 0-subtriangular, $A^{-1}$ exists if and only if $A$ is complete and then $A^{-1}$ is 0-subtriangular. If $A$ is n by n and p-subtriangular for negative $p$, then $A^k = 0$ for any $k$ such that $k(-p) > n$, so that $A$ is nilpotent.

3. Coordinate plane unitary rotations. A linear transformation

$x \to y = Ux$

takes place (effectively) in a coordinate plane if for some $j$ and $k \ne j$

(3.1)  $y_i = x_i$ (for $i \ne j$, $i \ne k$), $\quad y_j = u_{jj}x_j + u_{jk}x_k, \quad y_k = u_{kj}x_j + u_{kk}x_k.$

That is, $U$ differs from the identity matrix only in rows and columns $j$ and $k$ and there has only the four indicated elements (possibly) different from zero.


The condition, $U^*U = 1_n$, that $U$ be unitary will be satisfied if and only if the 2 by 2 matrix

(3.2)  $U_2 = \begin{pmatrix} u_{jj} & u_{jk} \\ u_{kj} & u_{kk} \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$

satisfies $U_2^*U_2 = 1_2$. This of course implies the equivalent (redundant) condition $U_2U_2^* = 1_2$ and hence $U$ is unitary if and only if the following conditions are satisfied:

(3.3)  $|\alpha|^2 = |\delta|^2 = 1 - |\beta|^2 = 1 - |\gamma|^2$ and $\bar\alpha\beta + \bar\gamma\delta = 0,$

(3.4)  $0 \le |\alpha| \le 1.$

In general, four real parameters remain to be specified.³ (For example, if $0 < |\alpha| < 1$, $|\alpha|$ and the amplitudes of $\alpha$, $\beta$ and $\gamma$ may be used; special cases then occur for $|\alpha| = 0$ or 1.)

For our use of plane unitary transformations, it will be sufficient and convenient to restrict consideration to those of the special form

(3.5)  $R = \begin{pmatrix} \bar c & -s \\ \bar s & c \end{pmatrix}$

where now we require only

(3.6)  $|c|^2 + |s|^2 = 1.$

We shall refer to this special type of unitary matrix as a (coordinate) plane unitary rotation. Clearly if $R$ is real it is the usual 2 by 2 plane rotation matrix and $c$ and $s$ are respectively the cosine and sine of some real angle.

The restriction in the form of (3.5) still permits the condition

(3.7)  $-sx + cy = 0$

on the $(c, s)$-pair for any given $x$ and $y$. Indeed,

(3.8)  $c = r^{-1}x, \quad s = r^{-1}y \quad (r = [\,|x|^2 + |y|^2\,]^{1/2})$

is a solution, unique to within an arbitrary common factor of absolute value one, of (3.6) and (3.7). We shall suppose that this is the solution chosen. In a frequently occurring case $x$ will be real and $y$ complex: then $c$ is real.

While the digital computation of $c$ and $s$ from $x$ and $y$ presents no essential difficulty, avoidance of overflow and considerations of accuracy dictate a distinction of cases as in [1], cf. §2.3 and the flow diagram at the

³ See [3] or [4] for another use of these plane unitary transformations.


bottom of page 91. In particular, in (3.8) we take $c = x|x|^{-1}$ and $s = 0$ when $y = 0$, $x \ne 0$. For $x = 0$, $y = 0$, taking $c = 1$, $s = 0$ is effective.

Since a unitary transformation preserves the "length" (or, norm)

(3.9)  $|v| = (vv^*)^{1/2}$

of a vector to which it is applied,

(3.10)  $(x \;\; y)R = (r \;\; 0),$

and

(3.11)  $r = [\,|x|^2 + |y|^2\,]^{1/2} \ge |x|$ and $r \ge |y|,$

so that the effect of $R$ is to "accumulate" on one component while transforming the other to zero. Interchanging the columns of $R$ would produce the zero in the first component of $vR$.

Basic to the following considerations is this transformation of an n-component vector so as to produce a zero in any selected component. Using a sequence of plane unitary rotations any vector can be carried into one with only a single nonzero component and this will (if desired) be real, nonnegative and equal to the norm of the vector:

(3.12)  $vR_{12}R_{13} \cdots R_{1n} = (|v|, 0, 0, \ldots, 0),$

where $R_{ij}$ is a unitary rotation in the plane of the i-th and j-th coordinate axes. The same considerations of course apply to column vectors by taking transposes.

For an arbitrary $v$, the values of $c_{12}$ and $s_{12}$ may both be complex but a proper arrangement of the code will make the first component of $vR_{12}$ real and non-negative. Then $c_{13}, c_{14}, \ldots, c_{1n}$ will all be chosen real and non-negative.

4. The 1-subtriangular form. In [2] it was shown that by transforming an n by n matrix $A$ by a sequence of $\tfrac12(n - 1)(n - 2)$ coordinate plane rotations⁴ the same number of zeros could be created and $A$ reduced to 1-subtriangular form:

(4.1)  $S = R^*AR, \quad S = \|s_{ij}\|$ (with $s_{ij} = 0$ for $i > j + 1$).
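The $(c, s)$ pair of (3.8) and the accumulation (3.12), on which the reduction (4.1) rests, can be sketched as follows. This is a floating point illustration under assumptions: the function names are hypothetical, the rotation is taken in the form (3.5) with columns $(\bar c, \bar s)'$ and $(-s, c)'$, and none of the fixed point scaling or case distinctions of [1] is reproduced.

```python
import math

def rotation(x, y):
    """Return (c, s, r) with c = x/r, s = y/r, r = sqrt(|x|^2 + |y|^2), cf. (3.8)."""
    r = math.hypot(abs(x), abs(y))
    if r == 0.0:
        return 1.0 + 0j, 0j, 0.0          # x = y = 0: take c = 1, s = 0
    return complex(x) / r, complex(y) / r, r

def apply_right(v, i, j, c, s):
    """Replace (v_i, v_j) by (v_i, v_j) R, R = [[conj(c), -s], [conj(s), c]]."""
    vi, vj = v[i], v[j]
    v[i] = vi * c.conjugate() + vj * s.conjugate()
    v[j] = -vi * s + vj * c

def accumulate(v):
    """Carry the row vector v into (|v|, 0, ..., 0) as in (3.12)."""
    w = [complex(t) for t in v]
    for j in range(1, len(w)):
        c, s, _ = rotation(w[0], w[j])
        apply_right(w, 0, j, c, s)
    return w

# Hypothetical sample vector; its norm is sqrt(25 + 5 + 4) = sqrt(34).
w = accumulate([3 + 4j, 1 - 2j, 2j])
```

With $y = 0$ and $x \ne 0$ the same formula gives $c = x/|x|$ and $s = 0$, in agreement with the special case noted in §3.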

The rotations are carried out successively in the coordinate planes (2, 3), (2, 4), ..., (2, n), (3, 4), ..., (3, n), (4, 5), ..., (n - 1, n) and produce zeros at the respective positions (3, 1), (4, 1), ..., (n, 1), (4, 2), ...,

⁴ Real number arithmetic was used in [2] but the change to matrices of type (3.5) is trivial.


(n, 2), (5, 3), ..., (n, n - 2), no zero once created being afterward destroyed. (Cf. equations (3) and (4) in [2].) Thus, in (4.1)

(4.2)  $R = R_{23}R_{24} \cdots R_{2n}R_{34} \cdots R_{n-1,n}.$

A count of the multiplications needed to reduce $A$ to $S$ shows that less than $\tfrac{10}{3}n^3$ are required ([2], equation (28)) if $R^*AR$ is calculated using the factored form of $R$ so that $R$ itself is never formed.⁵

The elements on the first subdiagonal of $S$ are of special interest and we denote them by

(4.3)  $r_i = s_{i+1,i} \quad (i = 1, 2, \ldots, n - 1).$

If all $r_i$ are different from zero, $S$ is complete (Definition 2.2). Crossing off the first row and last column of the matrix

(4.4)  $S_\lambda = S - \lambda 1_n = \begin{pmatrix} s_{11}-\lambda & s_{12} & s_{13} & \cdots & s_{1,n-1} & s_{1n} \\ r_1 & s_{22}-\lambda & s_{23} & \cdots & s_{2,n-1} & s_{2n} \\ 0 & r_2 & s_{33}-\lambda & \cdots & s_{3,n-1} & s_{3n} \\ 0 & 0 & r_3 & \cdots & s_{4,n-1} & s_{4n} \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & r_{n-1} & s_{nn}-\lambda \end{pmatrix}$

FIG. 1. $S_\lambda = S - \lambda 1_n$

exhibits an n - 1 by n - 1 triangular submatrix which has rank at least equal to the number of nonzero $r_i$ on its diagonal. Hence, independent of the choice of $\lambda$,

(4.5)  $\operatorname{rank} S_\lambda \ge n - 1 - z$

where exactly $z$ of the $n - 1$ numbers $r_i$ are zero. The number of linearly independent solutions of

(4.6)  $(S - \lambda 1_n)x = 0$

is the common value of

$n - \operatorname{rank} S_\lambda$ = number of elementary divisors with root $\lambda$,

which gives

⁵ Here "multiplication" refers to complex numbers but instead of allowing four real multiplications for each complex, the specialization at the end of the last paragraph reduces the factor of four to about three. Hence $10n^3$ real number multiplications will suffice.


THEOREM 4.1. If $z$ of the subdiagonal elements of a 1-subtriangular matrix are zero, there are at most $z + 1$ linearly independent characteristic vectors associated with any characteristic value. In particular, a complete 1-subtriangular matrix has a unique (to within a factor) characteristic vector associated with each characteristic value.

Since the Jordan canonical form of a matrix is triangular (and hence 1-subtriangular with all $r_i = 0$) the non-existence of zero values among the $r_i$ is a sufficient condition for placing a limitation on the permissible elementary divisors of $S$ but knowledge of the presence of at least $z$ (and perhaps more) zeros among the $r_i$ can never imply any limitation on the characteristic values or their elementary divisors.

The obvious extension of the last theorem to p-subtriangular matrices states that modifying the diagonal elements of such a matrix cannot reduce its rank below the number of nonzero elements on its pth subdiagonal and hence we have

THEOREM 4.2. If $S$ is p-subtriangular for $p > 0$ and $s_{i+p,i}$ is zero for exactly $z_p$ values of $i = 1, 2, \ldots, n - p$, then no characteristic value is the root of more than $p + z_p$ elementary divisors. (Equivalently, $Sx = \lambda x$ for at most $p + z_p$ linearly independent vectors $x$.)

From Lemma 2.4 and equation (2.1) we easily conclude

LEMMA 4.1. If $S$ is 1-subtriangular with subdiagonal elements $r_i$ (cf. (4.3)), and $S^p = \|m_{ij}\|$, then

(4.7)  $m_{i+p,i} = r_{i+p-1}r_{i+p-2} \cdots r_i \quad (i = 1, 2, \ldots, n - p),$

and the number $n - p - z_p$ of nonzero $m_{i+p,i}$ is

(4.8)  $n - p - z_p = \sum_{j=p}^{n-1} k_j(j - p + 1),$

where in the sequence $r_1, r_2, \ldots, r_{n-1}$ there are $k_j$ occurrences of a maximal consecutive subsequence of $j$ nonzero terms.⁶

Applying the lemma to the powers of $(S - \lambda 1_n)$, we have

THEOREM 4.3. If $\lambda_0$ is the root of $\nu_i$ elementary divisors $(\lambda - \lambda_0)^i$ of the 1-subtriangular matrix $S$, then

(4.9)  $\nu_1 + 2\nu_2 + 3\nu_3 + \cdots + (p - 1)\nu_{p-1} + p\sum_{j \ge p} \nu_j \le p + z_p = n - \sum_{j=p}^{n-1} k_j(j - p + 1),$

the notation being that of the last lemma.

⁶ For example, if the $r_i$ form the sequence **00****0****0*0*0 with a * designating an element $\ne 0$, $k_1 = 2$, $k_2 = 1$, $k_3 = 0$, $k_4 = 2$, $k_5 = \cdots = k_{18} = 0$.
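The count (4.8) can be checked on the sequence of footnote 6. In the sketch below the function names are hypothetical; `run_counts` tabulates the $k_j$, `count_by_formula` evaluates the right member of (4.8), and `count_directly` counts the nonzero products (4.7) one by one.

```python
def run_counts(r):
    """k[j] = number of maximal runs of exactly j consecutive nonzero terms of r."""
    k, run = {}, 0
    for t in list(r) + [0]:          # a trailing zero closes the last run
        if t != 0:
            run += 1
        elif run:
            k[run] = k.get(run, 0) + 1
            run = 0
    return k

def count_by_formula(r, p):
    """Right member of (4.8): sum of k_j (j - p + 1) over j >= p."""
    return sum(kj * (j - p + 1) for j, kj in run_counts(r).items() if j >= p)

def count_directly(r, p):
    """Number of i for which the product r_{i+p-1} ... r_i of (4.7) is nonzero."""
    return sum(all(r[i + t] != 0 for t in range(p))
               for i in range(len(r) - p + 1))

# The sequence of footnote 6, with 1 standing for a nonzero element.
r = [1 if ch == "*" else 0 for ch in "**00****0****0*0*0"]
k = run_counts(r)
```

Since $n - 1 = 18$ here, $n - p$ windows of length $p$ are examined in `count_directly`, one for each $i = 1, 2, \ldots, n - p$.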


5. Reduced or decomposed form of S. We now assume that the similarity transformation (4.1) has been applied to $A$ and that we henceforth deal with the 1-subtriangular matrix $S$ with characteristic values $\lambda_i$. The special form of $S$ permits various economies of machine time in finding a single $\lambda_i$ by any of the standard methods such as those described in [5]. While the main reliance should perhaps be placed on these general methods, we suggest in §12 some techniques which are economical only because of the special form of $S$.

A matrix $M = \|m_{ij}\|$ is called reduced⁷ if, for some $n_1$, $m_{ij} = 0$ for $i > n_1 > 0$ and $j < n_1 + 1 \le n$. Clearly, if $r_k = 0$, $S$ is reduced with $n_1 = k$. The Laplace expansion of $f(\lambda) = \det(S - \lambda 1_n)$ in terms of its first $k$ columns gives $f(\lambda) = f_1(\lambda)f_2(\lambda)$, where $f_1(\lambda)$ is the characteristic polynomial of the k by k submatrix $\|s_{ij};\ i, j = 1, \ldots, k\|$. Repetition of this process gives the following general result.

If $r_a, r_{a+b}, r_{a+b+c}, \ldots$ are zero, then

(5.1)  $S = \begin{pmatrix} S_1 & S_{12} & S_{13} & \cdots \\ 0 & S_2 & S_{23} & \cdots \\ 0 & 0 & S_3 & \cdots \\ \vdots & & & \ddots \end{pmatrix}$

where $S_1, S_2, S_3, \ldots$ are square 1-subtriangular matrices of orders $a, b, c, \ldots$, respectively. The characteristic values of $S$ are those of $S_1, S_2, S_3, \ldots$ and do not depend on the matrices $S_{ij}$ ($i < j$). If only the characteristic values are to be calculated, the original problem should be replaced by the same problem for two or more smaller matrices.

Since small values of the $r_i$ may be different from zero only because of roundoff error and are also usually uncertain in practice because of uncertainty in the original data, a code for finding characteristic values (but not the corresponding vectors) by the method of this paper should provide for the replacement by zero of any $r_i$ numerically⁸ less than a parameter $\epsilon$. The value of $\epsilon$ would vary with the demands of the problem but an adequate criterion for deciding how large $\epsilon$ can be chosen is lacking. A desirable precaution would be to reinstate any $r_i$ replaced by zero and check and improve the $\lambda_i$ using $S$ or, even better but less economically, the original matrix $A$.

For the calculation of the vectors it is much less clear that small $r_i$ should be replaced by zero or even that the vectors should be computed for the submatrices and then built up to have $n$ components. If $S_{ij} = 0$

⁷ Also if its transpose satisfies this condition.
⁸ The reduction to $S$ can make the $r_i$ real and nonnegative (cf. the proof of (6.9)).


for all $i \le l$ and $j > l$, the matrix decomposes:

(5.2)  $S = \begin{pmatrix} S^{(1)} & 0 \\ 0 & S^{(2)} \end{pmatrix}$, with $S^{(1)}$ of order $n_1$ and $S^{(2)}$ of order $n_2$.

Then each characteristic vector can be required either to begin with $n_1$ zeros or to end with $n_2$ zeros and indeed must have this property unless $S^{(1)}$ and $S^{(2)}$ have a characteristic value in common. A further decomposition of $S^{(1)}$ or $S^{(2)}$ could of course occur, with corresponding specialization in the form of the vectors. Since "smallness" of at least one $r_i$ seems to be necessary for the characteristic vector to be non-unique (= 0 is required theoretically), it might appear that instability in the computation should not be expected unless some of the $r_i$ are "small".

This is not true, however, as the following example (suggested by G. Forsythe) shows. Let all $r_i = \rho \ne 0$ and the other elements of $S$ be zero. Then $S$ has the single elementary divisor $\lambda^n$ and zero is an n-fold characteristic value. If, however, $s_{1n}$ is replaced by $\sigma$, the characteristic equation becomes $\lambda^n - \sigma\rho^{n-1} = 0$ with distinct roots, all of absolute value $|\sigma\rho^{n-1}|^{1/n}$. For $n = 10$, $\rho = .1$ and $\sigma = 10^{-11}$ (satisfying the normalization of §1 for both versions of $S$), we have $|\sigma\rho^{n-1}|^{1/n} = .01$. Thus the change of a single unit in the eleventh decimal place of a single element of a matrix of order 10 may alter the "last nine decimal places" of all its characteristic values.

Since this instability grows worse for larger $n$, it appears to be impossible to give an a priori guarantee of the stability of the computation of the characteristic values of a general matrix of large order by a code employing the roundoff multiplication⁹ circuits of a digital computer. This does not exclude a posteriori guarantees of accuracy depending on a knowledge of the final results of the computation, on intermediate results or on some limitation on the generality of the given matrix. For example, it can be shown that if no element of a real symmetric matrix is changed by more than $\delta$, no characteristic value can be altered by more than $n|\delta|$.

Since nonzero values for the $r_i$ exclude the (theoretical) indeterminacy of characteristic vectors but permit a possible gross instability in the determination of the $\lambda_i$, there seems to be little advantage in making the reduction of (5.1) at an early stage of the computation and hence we do not assume all $r_i \ne 0$, although this is theoretically an attractive possibility.¹⁰

⁹ Exact methods which treat the elements as integers must of course be excluded.
¹⁰ Some remarks by R. L. Causey suggested the possibility of avoiding the assumption that all $r_i \ne 0$.
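The example attributed to G. Forsythe in §5 is easy to reproduce. The sketch below (hypothetical names; plain floating point) builds the matrix with all $r_i = \rho$ and $s_{1n} = \sigma$, confirms on a small exactly representable case that $S^n = \sigma\rho^{n-1}1_n$, so that every characteristic value satisfies $\lambda^n = \sigma\rho^{n-1}$, and evaluates the common absolute value of the roots for the data of the text.

```python
def forsythe_matrix(n, rho, sigma):
    """All subdiagonal elements equal rho, s_1n = sigma, every other element zero."""
    s = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        s[i + 1][i] = rho
    s[0][n - 1] = sigma
    return s

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(a, k):
    """a raised to the power k (k >= 1) by repeated multiplication."""
    result = a
    for _ in range(k - 1):
        result = matmul(result, a)
    return result

def root_modulus(n, rho, sigma):
    """Common absolute value |sigma rho^(n-1)|^(1/n) of the n characteristic values."""
    return abs(sigma * rho ** (n - 1)) ** (1.0 / n)

# Small case with powers of two, so that the arithmetic is exact.
P = matpow(forsythe_matrix(4, 0.5, 0.25), 4)

# The data of the text: n = 10, rho = .1, sigma = 10^-11.
modulus = root_modulus(10, 0.1, 1e-11)
```

For the data of the text the sum of squares of the elements is $9\rho^2 + \sigma^2 \approx 0.09$, within the normalization (1.2), while a perturbation of size $10^{-11}$ moves every characteristic value from 0 to absolute value about $10^{-2}$.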


6. Reduction from order n to order n - 1. If $v_j$ is the jth row vector of $S_\lambda$, $v_j = s_j - \lambda e_j$, where $s_j$ is the jth row of $S$ and

(6.1)  $e_j = (0, 0, \ldots, 0, 1, 0, \ldots, 0)$

with the one in the jth place. Hence, by the triangle inequality,

(6.2)  $|v_j| \le |s_j| + |\lambda|.$

Assuming $N(A) < \tfrac12$ and $|\lambda| < \tfrac12$ gives

(6.3)  $|v_j| < 1,$

since $N(S) = N(A)$ and certainly $\sum_k |s_{jk}|^2 \le \sum_{j,k} |s_{jk}|^2$. It follows that if we restrict consideration to those $\lambda$ for which $|\lambda| < \tfrac12$, as we may do by (1.9), there is no danger of overflow in the elements of $S_\lambda$ nor indeed in the elements of $S_\lambda R$ for any unitary matrix $R$. If columns instead of rows are considered, the same result is obtained for $R^*S_\lambda$ but if $R^*S_\lambda R$ is calculated, overflow might occur since $S_\lambda R$ continues to have the property (6.3) of $S_\lambda$ for its rows but may have lost it for its columns. No such difficulty exists if $S$ rather than $S_\lambda$ is used since

$N^2(R^*SR) = N^2(SR) = N^2(S) < \tfrac14$

so intermediate as well as initial and final results remain scaled.

The central technique proposed in this paper is the reduction of $S$ implied by the following theorem.

THEOREM 6.1. If $S_\lambda = S - \lambda 1_n$ is 1-subtriangular, unitary rotations $R_{nj}$ in the planes (n, n - 1), (n, n - 2), ..., (n, 2), (n, 1) can be found such that the last $n - k$ elements in the last column of $S_\lambda^{(k)}$ are zero, where

(6.4)  $S_\lambda^{(n)} = S_\lambda$ and $S_\lambda^{(k)} = S_\lambda^{(k+1)}R_{nk}$

for $k = n - 1, n - 2, \ldots, 2, 1$, and each of the $S_\lambda^{(k)}$ is 1-subtriangular. In particular, the last column of $S_\lambda^{(1)}$ is zero except (possibly) for its first element.

With $S_\lambda^{(k)} = \|s_{ij}^{(k)}\|$ and the subdiagonal elements of $S_\lambda^{(n)}$ and $S_\lambda^{(1)}$ given the special designations (cf. (4.4))

(6.5)  $r_i = s_{i+1,i}^{(n)}, \quad \rho_i = s_{i+1,i}^{(1)}$  for $i = 1, 2, \ldots, n - 1$

and

(6.6)  $g = s_{1n}^{(1)},$

then

(6.7)  $f(\lambda) = \det(S - \lambda 1_n) = (-1)^{n-1}\rho_1\rho_2 \cdots \rho_{n-1}g,$

(6.8)  $|\rho_i| \ge |r_i| \quad (i = 1, \ldots, n - 1)$

and, if desired, the $R_{nj}$ can be selected so that

(6.9)  $\rho_i \ge 0$

with the equality impossible for $r_i \ne 0$.

The sequence of operations prescribed in the theorem may be clearer if we exhibit in Fig. 2 the final form $S_\lambda^{(1)}$ of $S_\lambda^{(n)} = S_\lambda$, where subscripts on the zeros show the order in which they are created.

(6.10)  $S_\lambda^{(1)} = \begin{pmatrix} * & * & * & \cdots & * & * & g \\ \rho_1 & * & * & \cdots & * & * & 0_{n-1} \\ 0 & \rho_2 & * & \cdots & * & * & 0_{n-2} \\ 0 & 0 & \rho_3 & \cdots & * & * & 0_{n-3} \\ \vdots & & & \ddots & & & \vdots \\ 0 & 0 & 0 & \cdots & \rho_{n-2} & * & 0_2 \\ 0 & 0 & 0 & \cdots & 0 & \rho_{n-1} & 0_1 \end{pmatrix}$

FIG. 2. $S_\lambda^{(1)}$: zeros are created in the order $0_1, 0_2, \ldots, 0_{n-1}$.

The theorem follows from the results of §3 since the decisive operation $S_\lambda^{(k)} = S_\lambda^{(k+1)}R_{nk}$ replaces the columns $k$ and $n$ of $S_\lambda^{(k+1)}$ by linear combinations of them. Each of the affected columns has its last $n - (k + 1)$ elements zero since the kth column is unchanged from the same column of $S_\lambda$ and the nth column has this property by the inductive assumption. The 1-subtriangular property is clearly preserved throughout.

When $R = R_{nk}$ its elements $c_k$ and $s_k$ as exhibited in (3.5) will now be chosen as in (3.8) with

(6.11)  $x = s_{k+1,k}^{(k+1)} = s_{k+1,k} = r_k$ and $y = s_{k+1,n}^{(k+1)}.$

Then (6.8) and (6.9) follow from

(6.12)  $\rho_k = [\,|r_k|^2 + |s_{k+1,n}^{(k+1)}|^2\,]^{1/2}.$
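The reduction of Theorem 6.1 and the evaluation of $f(\lambda)$ by (6.7) can be sketched as follows. The code is an illustration under assumptions: names are hypothetical, arithmetic is floating point with no fixed point scaling, and the rotation in the plane $(k, n)$ is taken in the form (3.5) with $(c, s)$ from (3.8) and (6.11).

```python
import math

def rotation(x, y):
    """(c, s) of (3.8): c = x/r, s = y/r with r = sqrt(|x|^2 + |y|^2)."""
    r = math.hypot(abs(x), abs(y))
    if r == 0.0:
        return 1.0 + 0j, 0j
    return complex(x) / r, complex(y) / r

def reduce_last_column(s_mat, lam):
    """Form S_lambda = S - lam*1 and apply R_{n,n-1}, ..., R_{n1} on the right.

    Returns (f, g, rho): f by (6.7), g of (6.6) and the list of rho_i of (6.5).
    s_mat must be 1-subtriangular (zero for i > j + 1).
    """
    n = len(s_mat)
    m = [[complex(s_mat[i][j]) - (lam if i == j else 0) for j in range(n)]
         for i in range(n)]
    last = n - 1
    for k in range(n - 2, -1, -1):         # 0-based column k paired with column n
        c, s = rotation(m[k + 1][k], m[k + 1][last])   # x and y of (6.11)
        for row in m[:k + 2]:              # lower rows are zero in both columns
            a, b = row[k], row[last]
            row[k] = a * c.conjugate() + b * s.conjugate()
            row[last] = -a * s + b * c
    rho = [m[i + 1][i] for i in range(n - 1)]
    g = m[0][last]
    f = (-1) ** (n - 1) * g
    for p in rho:
        f *= p
    return f, g, rho

# Hypothetical test matrix: characteristic values 2 and 2 +/- sqrt(2), det S = 4.
S = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
f0, g0, rho0 = reduce_last_column(S, 0.0)
f2, g2, rho2 = reduce_last_column(S, 2.0)
```

Because this $S$ is complete, $g$ vanishes (to rounding) at the characteristic value $\lambda = 2$, and each $\rho_i$ is at least $|r_i| = 1$, as (6.8) requires.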


Since the determinant of a plane unitary rotation is one (and not merely of absolute value one), $\det S_\lambda = \det S_\lambda^{(1)}$ and a cyclic permutation of the columns of the latter matrix gives a triangular matrix with $\rho_1, \rho_2, \ldots, \rho_{n-1}$ and $g$ on the diagonal, verifying (6.7). Any other solution of (3.6) and (3.7) would be obtained by multiplying $c$ and $s$ by a common factor of absolute value one. This would not invalidate (6.7) or (6.8) but would permit the $\rho_i$ to be complex so that (6.9) would not hold.

The usefulness of this reduction as a means of calculating the value of the characteristic polynomial by (6.7) depends on the small number of multiplications required. We show in §10 below that $n - 1$ square roots and $n^2 + 2n - 3$ (ordinary, fixed point) multiplications suffice to calculate a value of $f(\lambda)$. If only $g = g(\lambda)$ and not the $\rho_i = \rho_i(\lambda)$ are wanted, the number of multiplications is reduced to less than $n^2$.

Where $S$ is complete, (6.8) guarantees $|\rho_i| > 0$ and (6.7) gives:

LEMMA 6.1. If $S$ is complete, $g(\lambda) = 0$ is a necessary and sufficient condition that $\lambda$ be a characteristic value of $S$.

If $r_i = 0$ for $i = k_1, k_2, \ldots, k_z$, then it may happen that for some $\lambda$ (necessarily a characteristic value) $\rho_i(\lambda) = 0$ for a subset $k_a, k_b, \ldots$ of the indices $k_1, \ldots, k_z$. Indeed, if $\lambda$ is a root of more than one elementary divisor, this must happen¹¹ since otherwise rank $S_\lambda^{(1)}$ = rank $S_\lambda$ is $n - 1$. This causes no difficulty in the contemplated reduction from order $n$ to order $n - 1$ provided $g(\lambda) = 0$. Since this may not be the case, we now show how, at the cost of some additional computation, $g(\lambda) = 0$ can be assured whenever $\lambda$ is a characteristic value even though some of the $\rho_i$ are zero.

Suppose $g(\lambda) \ne 0$ but $f(\lambda) = 0$. Then $\rho_i(\lambda) = 0$ for at least one value of $i$, say $i = q$ and it may be convenient to choose the minimum such $q$. Then the reduction of Theorem 6.1 is to be continued as follows.

THEOREM 6.2. If $\rho_q = 0$, unitary rotations $R_{qj}$ in the planes (q, q - 1), (q, q - 2), ..., (q, 1) and (n, q) can be found such that: the last $n - q + j$ elements in the qth column of $S_\lambda^{(-j)}$ are zero for $j = 0, 1, 2, \ldots, q - 2$, where

(6.13)  $S_\lambda^{(0)} = S_\lambda^{(1)}$ and $S_\lambda^{(-j-1)} = S_\lambda^{(-j)}R_{q,q-j-1}$

and $R_{nq}$ can be found so that

(6.14)  $S_\lambda^{(-q)} = S_\lambda^{(-q+1)}R_{nq}$

has zero for all the elements in its nth column.

¹¹ A code should contain a parameter so a sufficiently small value of $\rho_i$ would be treated as a zero.


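The construction will be clear if we exhibit, with numbered zeros as in (6.10), the form of $S_\lambda^{(-q+1)}$ in Fig. 3. The columns shown are $1, 2, \ldots, q - 1, q, q + 1, \ldots, n - 1, n$ and the rows are $1, 2, \ldots, q, q + 1, q + 2, \ldots, n$.

$S_\lambda^{(-q+1)} = \begin{pmatrix} * & * & \cdots & * & h & * & \cdots & * & g \\ \rho_1 & * & \cdots & * & 0_{-q+1} & * & \cdots & * & 0 \\ 0 & \rho_2 & \cdots & * & 0_{-q+2} & * & \cdots & * & 0 \\ \vdots & & \ddots & \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & \rho_{q-1} & 0_{-1} & * & \cdots & * & 0 \\ 0 & 0 & \cdots & 0 & 0 & * & \cdots & * & 0 \\ 0 & 0 & \cdots & 0 & 0 & \rho_{q+1} & \cdots & * & 0 \\ \vdots & & & \vdots & \vdots & & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 0 & 0 & 0 & \cdots & \rho_{n-1} & 0 \end{pmatrix}$

FIG. 3. $S_\lambda^{(-q+1)}$: zeros created in the order $-1, -2, \ldots, -q + 1$.

The final multiplication by $R_{nq}$ needs only be chosen so that $(h, g)R_{nq} = (*, 0)$ and this is possible whether or not $h$ is zero.

7. Reduction to order n - 1 (continued). To complete the similarity reduction of $S_\lambda$, where now we assume that $\lambda$ is a characteristic value, let

(7.1)  $T_\lambda^{(n)} = S_\lambda^{(1)}$ or $S_\lambda^{(-q)}$

if the latter choice is necessary to guarantee that the last column of $T_\lambda^{(n)}$ has all its components zero. Define inductively

(7.2)  $T_\lambda^{(k)} = R_{nk}^*T_\lambda^{(k+1)} \quad (k = n - 1, n - 2, \ldots, 1)$

and, if necessary,

(7.3)  $T_\lambda^{(0)} = T_\lambda^{(1)}$ and $T_\lambda^{(-j-1)} = R_{q,q-j-1}^*T_\lambda^{(-j)}$

for $j = 0, 1, \ldots, q - 2$, and

(7.4)  $T_\lambda^{(-q)} = R_{nq}^*T_\lambda^{(-q+1)}.$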


If

(7.5)  $T_\lambda = T_\lambda^{(1)}$ or $T_\lambda^{(-q)}$

we have

(7.6)  $T_\lambda = R^*S_\lambda R$

for

(7.7)  $R = R_{n,n-1} \cdots R_{n1}$ or $R = R_{n,n-1} \cdots R_{n1}R_{q,q-1} \cdots R_{q1}R_{nq}.$

We postpone until §10 a count of the number of multiplications necessary and further discussion of the relatively minor scaling difficulties in coding these equations. Also, we do not treat in full the relatively infrequently occurring case $\rho_q = 0$, $g \ne 0$. The effective use of (7.6) as a computing device depends on the form of the matrices in the two sequences $S_\lambda^{(k)}$ and $T_\lambda^{(k)}$.

THEOREM 7.1. All the $S_\lambda^{(k)}$ defined above are 1-subtriangular. For $k = n, n - 1, \ldots, 1, 0$ the $T_\lambda^{(k)}$ have last column zero and are 1-subtriangular except for the last row which, however, has its first $k - 1$ elements zero. For $k = -1, -2, \ldots, -q$ the $T_\lambda^{(k)}$, if defined, continue to have the same form (with no assurance of zeros at the beginning of the last row) except that a q by q minor in the upper left corner may fail to be in 1-subtriangular form because of nonzero elements in its qth row.

The theorem follows readily by observing that: (7.2) affects only the nth and kth rows of $T_\lambda^{(k+1)}$ and so destroys only a single zero in the nth row; (7.3) has a similar effect on the qth row; and, (7.4) combines two rows in which only a final zero is guaranteed.

If $\rho_q = 0$ so that the q by q minor has its 1-subtriangular form destroyed, it is only necessary to restore it by operations on the first $q$ columns and rows of $T_\lambda$ (i.e., carrying out the reduction of §4 as calculated from the q by q minor but applying the matrix multiplications to $T_\lambda$ rather than to the minor only). Of course $r_q = 0$ in $S_\lambda$ and the corresponding element of $T_\lambda$ can be seen to continue to be zero.

Since $R^*R = 1_n$ for either choice of $R$ in (7.7),

(7.8)  $T = R^*SR = R^*(S - \lambda 1_n)R + \lambda R^*R = T_\lambda + \lambda 1_n.$

An important economy of machine time results from the fact that the similarity transformation of $S$, namely $S \to R^{-1}SR = R^*SR$, requires only adding $\lambda$ back to the diagonal of $T_\lambda$ after completing the left multiplications by the $R_{ij}$, the right multiplications being necessary to find the $c$ and $s$ determining the various $R_{ij}$. The final form of $T$ is shown in (7.9),


where $T$ is 1-subtriangular if the need to carry out (7.3) and (7.4) either does not occur or if the q by q minor is again reduced to 1-subtriangular form.

(7.9)  $R^*SR = T = \begin{pmatrix} * & \cdots & * & 0 \\ \vdots & & \vdots & \vdots \\ * & \cdots & * & 0 \\ \tau_1 & \cdots & \tau_{n-1} & \lambda \end{pmatrix}$

The inductive triangularization of $A$ is established by observing that the necessary operations on the first $n - 1$ rows and columns of $T$ will alter the first $n - 1$ elements in the last row of $T$ but will not destroy the form of $T$.

8. The characteristic vector. The equation $Tx = \lambda x$ is clearly satisfied by $x = e_n'$, where $e_n = (0, 0, \ldots, 0, 1)$ and $e_n'$ is the transpose of $e_n$. Equivalently,

(8.1)  $S(Re_n') = \lambda(Re_n')$

and the last column of $R$ is a characteristic vector of $S$.

Inductively, after $n - p$ steps

(8.2)  $R^*SR = \begin{pmatrix} V_1 & 0 \\ W & V_2 \end{pmatrix}$

where $V_2$ is triangular and $R$ is a product of the plane unitary rotations used in the successive reductions. A characteristic vector $v'$ (with $p$ components) of $V_1$ is now found: $V_1v' = \lambda_0v'$. From

(8.3)  $\begin{pmatrix} V_1 - \lambda_0 1_p & 0 \\ W & V_2 - \lambda_0 1_{n-p} \end{pmatrix}\begin{pmatrix} v' \\ w' \end{pmatrix} = 0$

we get, for $x = (v, w)$,

(8.4)  $(V_2 - \lambda_0 1_{n-p})w' = -Wv'$

and this triangular system of equations may be solved for $w'$ if (1) either $\lambda_0$ is not a diagonal element of $V_2$, or (2) there is a non-unique solution for $w'$ and $(v, \tau_1w_1 + \tau_2w_2 + \cdots + \tau_mw_m)'$ is a characteristic vector for all choices of the parameters $\tau_j$. Failure to solve (8.4) is theoretically due to the existence of a non-simple elementary divisor with root $\lambda_0$.

Where there are distinct characteristic values, the difficulty associated with $\rho_q = 0$ cannot arise and so the solution of (8.4) is possible theoretically. In this case the reduction to lower order with its attendant economy in time can be done but once a value $\lambda$ is found, it would be a reasonable


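option to go back to $S$, of order $n$, and use the formulas of the next section to obtain a characteristic vector of unit length (hence unique to within a factor of absolute value one).

9. The form of $R = R_{n,n-1}R_{n,n-2} \cdots R_{n1}$. Considering only the rotations of Theorem 6.1 the form of their product is as exhibited in Figure 4 where $R_{nj}$ has $c_j$ as $(n, n)$ component and $-s_j$ as $(j, n)$ component. This is proved inductively as follows.

LEMMA 9.1. The product $R_{n,n-1}R_{n,n-2} \cdots R_{nk} = R^{(k)}$ is a direct sum:

$R^{(k)} = 1_{k-1} \oplus M^{(k)}.$

For, $R_{ij}e_p' = e_p'$ and $e_pR_{ij} = e_p$ for all $p$ different from $i$ and $j$. Hence $R^{(k)}e_p' = e_p'$ and $e_pR^{(k)} = e_p$ for $p = 1, 2, \ldots, k - 1$, establishing the stated decomposition.

It is convenient to number the rows and columns of $M^{(k)}$ from $k$ to $n$ so that if

(9.1)  $R^{(k)} = \|r_{ij}^{(k)};\ i, j = 1, 2, \ldots, n\|$

and

(9.2)  $M^{(k)} = \|m_{ij}^{(k)}\| \quad (i, j = k, k + 1, \ldots, n),$

then

(9.3)  $m_{ij}^{(k)} = r_{ij}^{(k)}$ (for $i, j = k, \ldots, n$).

For example, when $n = 6$,

$R = R_{65}R_{64}R_{63}R_{62}R_{61} = \begin{pmatrix} \bar c_1 & 0 & 0 & 0 & 0 & -s_1 \\ -s_2\bar s_1 & \bar c_2 & 0 & 0 & 0 & -s_2c_1 \\ -s_3c_2\bar s_1 & -s_3\bar s_2 & \bar c_3 & 0 & 0 & -s_3c_2c_1 \\ -s_4c_3c_2\bar s_1 & -s_4c_3\bar s_2 & -s_4\bar s_3 & \bar c_4 & 0 & -s_4c_3c_2c_1 \\ -s_5c_4c_3c_2\bar s_1 & -s_5c_4c_3\bar s_2 & -s_5c_4\bar s_3 & -s_5\bar s_4 & \bar c_5 & -s_5c_4c_3c_2c_1 \\ c_5c_4c_3c_2\bar s_1 & c_5c_4c_3\bar s_2 & c_5c_4\bar s_3 & c_5\bar s_4 & \bar s_5 & c_5c_4c_3c_2c_1 \end{pmatrix}$

FIG. 4

LEMMA 9.2. The elements of $M^{(k)}$ are as follows. For $j < n$,

(9.4)  $m_{ij}^{(k)} = \begin{cases} 0 & (\text{if } i < j) \\ \bar c_j & (\text{if } i = j) \\ -s_iC_{i-1,j+1}\bar s_j & (\text{if } j < i < n) \\ C_{n-1,j+1}\bar s_j & (\text{if } j < i = n). \end{cases}$

For $j = n$,

(9.5)  $m_{in}^{(k)} = \begin{cases} -s_k & (\text{if } i = k) \\ -s_iC_{i-1,k} & (\text{if } k < i < n) \\ C_{n-1,k} & (\text{if } i = n), \end{cases}$

where

(9.6)  $C_{pq} = c_pc_{p-1} \cdots c_q$ (for $p \ge q$), $\quad C_{pq} = 1$ (for $p < q$).

For $k = n - 1$ the formulas reduce to the following (since $i, j \ge n - 1$). For $j = n - 1$,

(9.7)  $m_{i,n-1}^{(n-1)} = \begin{cases} \bar c_{n-1} & (\text{if } i = n - 1) \\ \bar s_{n-1} & (\text{if } i = n). \end{cases}$

For $j = n$,

(9.8)  $m_{in}^{(n-1)} = \begin{cases} -s_{n-1} & (\text{if } i = n - 1) \\ c_{n-1} & (\text{if } i = n). \end{cases}$

These are correct since

(9.9)  $R^{(n-1)} = 1_{n-2} \oplus \begin{pmatrix} \bar c_{n-1} & -s_{n-1} \\ \bar s_{n-1} & c_{n-1} \end{pmatrix}.$

To establish the (backward) induction from $k$ to $k - 1$, we note that columns $k, k + 1, \ldots, n - 1$ of $M^{(k-1)}$ are the same as those of $M^{(k)}$ except for the adjunction of an additional zero as the first element of the column and that this is in agreement with the formulas for the elements (there is an additional value of $i$ in the case $i < j < n$). The proof will be complete if we verify that columns $k - 1$ and $n$ of $M^{(k-1)}$ are correctly given as in Figure 5. For $j = k - 1 < n$, the lemma asserts that

(9.10)  $m_{k-1,k-1}^{(k-1)} = \bar c_{k-1},$

(9.11)  $m_{i,k-1}^{(k-1)} = -s_iC_{i-1,k}\bar s_{k-1} \quad (k - 1 < i < n),$

and

(9.12)  $m_{n,k-1}^{(k-1)} = C_{n-1,k}\bar s_{k-1},$

in agreement with the calculated values. For $j = n$, the formulas are checked by observing that

$C_{i-1,k-1} = C_{i-1,k}c_{k-1}.$

[FIG. 5. Columns $k - 1$ and $n$ of $M^{(k-1)}$, obtained from the product $R^{(k)}R_{n,k-1}$.]

We already know that the last column of $R^{(1)}$ will be the characteristic vector sought. Its elements $x_1, \ldots, x_n$ are obtained from $c_1, \ldots, c_{n-1}$, $s_1, \ldots, s_{n-1}$ by $2n - 1$ multiplications:

(9.13)  $C_0 = 1, \quad s_n = -1,$

(9.14)  $C_i = c_iC_{i-1} \quad (i = 1, 2, \ldots, n - 1),$

and

(9.15)  $x_i = -s_iC_{i-1} \quad (i = 1, 2, \ldots, n).$

If, however, the reduction of $A$ to 1-subtriangular form $S$ has been done with a choice of $(c, s)$ pairs as in (3.8) and (3.10), the subdiagonal elements $r_i$ of $S$ will be real and hence continuing this method of choosing $(c, s)$ pairs will give all the $c_i$ real. Hence, (9.14) involves only real numbers and, taking full account of (9.13), only $3(n - 2)$ multiplications of real numbers are required in the complex case.

10. Count of multiplications. To calculate the entire matrix $R^{(1)}$ of Lemma 9.1 can be shown to require no more than $\sum_{j=2}^{n-1} 3j + (2n - 1) = \tfrac12(3n^2 + n - 8) \approx \tfrac32 n^2$ multiplications, once the $n - 1$ $(c, s)$ pairs have been found. Since the number of square roots needed for the latter depends only on the first power of $n$, they do not require a significant fraction of the computing time for large $n$. If, however, we were to avoid some scaling problems and calculate $R^*SR$ by the matrix multiplication $(SR)$ and $R^*(SR)$ rather than by (6.4), (7.2) and (7.8),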
If, however, the reduction of A to 1-subtriangular form S has been done with a choice of (c, s) pairs as in (3.8) and (3.10), the subdiagonal elements r_i of S will be real and hence continuing this method of choosing (c, s) pairs will give all the c_i real. Hence, (9.14) involves only real numbers and, taking full account of (9.13), only 3(n - 2) multiplications of real numbers are required in the complex case.

10. Count of multiplications. Calculating the entire matrix R^(1) of Lemma 9.1 can be shown to require no more than

          \sum_{j=2}^{n-1} 3j + (2n - 1) = \tfrac{1}{2}(3n^2 + n - 8)

multiplications, once the n - 1 (c, s) pairs have been found. Since the number of square roots needed for the latter depends only on the first power of n, they do not require a significant fraction of the computing time for large n. If, however, we were to avoid some scaling problems and calculate R*SR by the matrix multiplications (SR) and R*(SR) rather than by (6.4), (7.2) and (7.8),

          [coefficient illegible] n(n - 1)(n + 4) + [coefficient illegible] n(n - 1)(2n + 11)

multiplications would be required and this avoids both the multiplication by factors known to be zero and the calculation of answers known to be zero. Even neglecting to calculate the transform of W in (8.2) in the inductive reduction of order, we would still require at least

          \sum_{k=2}^{n} ([coefficient illegible] k^3) \approx [coefficient illegible] n^4

multiplications and an exponent four cannot be tolerated for large n.

Fortunately for the usefulness of the method here proposed, keeping R in factored form reduces the whole computation to order n^3, assuming that this is true for a sufficiently accurate computation of the individual characteristic values by conventional iterative techniques^12 or by using values of det S_λ as given by (6.7). Indeed 4k + 2 multiplications will produce S_λ^(k) from S_λ^(k+1) so

(10.1)    \sum_{k=1}^{n-1} (4k + 2) = 2(n^2 - 1) < 2n^2

multiplications will produce S_λ^(1) from S_λ.
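The two closed forms in this count can be checked with exact integer arithmetic. The sketch below assumes that the first sum runs from j = 2 to n - 1, which is the reading under which the stated total (3n^2 + n - 8)/2 holds; the function names are hypothetical.

```python
def count_R1(n):
    """Multiplications to form R^(1): sum_{j=2}^{n-1} 3j, plus (2n - 1)."""
    return sum(3 * j for j in range(2, n)) + (2 * n - 1)


def count_reduction(n):
    """Multiplications to produce S_lambda^(1) from S_lambda: sum_{k=1}^{n-1} (4k + 2)."""
    return sum(4 * k + 2 for k in range(1, n))


# Compare the sums with their closed forms over a range of orders n.
for n in range(3, 200):
    assert 2 * count_R1(n) == 3 * n * n + n - 8
    assert count_reduction(n) == 2 * (n * n - 1)
    assert count_reduction(n) < 2 * n * n
```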


12 But the operation x → Ax requires n^2 multiplications while x → Sx requires ½(n^2 + 3n - 2) so about half the time is saved. Calculating powers of S is also much shorter than for A since sums contain fewer terms and S^p is p-subtriangular.


If only the value of g(λ) is wanted (cf. Fig. 2), the new values of the first n - 1 columns of S_λ^(k) need not be found, so that only \sum_{k=1}^{n-1} 2k = n(n - 1) multiplications are needed. Calculating the p_i requires 2(n - 1) multiplications and f(λ) = p_1 p_2 ... p_{n-1} g requires (n - 1) more. Hence, we have

THEOREM 10.1. A value of f(λ) = det (S - λ1) can be found with only n^2 + 2n - 3 multiplications of complex numbers and n - 1 solutions of (3.6) and (3.7). A vector x such that (S - λ1_n)x = (g, 0, 0, ..., 0)' can then be obtained by 2n - 1 additional multiplications. If the subdiagonal elements r_i of S are real, taking the c_i and p_i real reduces these counts to 3n^2 + n - 3 and 3n - 6 multiplications of real numbers, respectively (i.e., to approximately three fourths as many).

11. An alternative reduction. Another method of reducing S to the form (7.9) is discussed in [5], and from a very general approach in [6]. Writing

(11.1)    S = \begin{pmatrix} B & b \\ r & \beta \end{pmatrix},

where b is a column vector, r is a row vector and β a scalar, supposing that a characteristic vector of S is normalized to have last component one so that

(11.2)    S \begin{pmatrix} u \\ 1 \end{pmatrix} = \lambda \begin{pmatrix} u \\ 1 \end{pmatrix},

and taking

(11.3)    K = \begin{pmatrix} 1_{n-1} & u \\ 0 & 1 \end{pmatrix}

gives

(11.4)    K^{-1} S K = \begin{pmatrix} B - ur & 0 \\ r & \lambda \end{pmatrix}.
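A small numerical sketch of the similarity (11.4) may help. It rests on several assumptions that are not in the text: "1-subtriangular" is read as upper Hessenberg (zeros below the first subdiagonal), the test matrix is random, and the characteristic pair is taken from NumPy's general eigenvalue routine purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
# Upper Hessenberg test matrix (assumed meaning of "1-subtriangular").
S = np.triu(rng.standard_normal((n, n)), k=-1)

# Any characteristic pair will do; pick the one whose unit eigenvector has the
# largest last component, so the normalization below is well conditioned.
vals, vecs = np.linalg.eig(S)
idx = int(np.argmax(np.abs(vecs[-1, :])))
lam, v = vals[idx], vecs[:, idx]
v = v / v[-1]                      # last component becomes one, as in (11.2)
u = v[:-1]

K = np.eye(n, dtype=complex)
K[:-1, -1] = u                     # K = [[1, u], [0, 1]] as in (11.3)
K_inv = np.eye(n, dtype=complex)
K_inv[:-1, -1] = -u                # closed-form inverse of K

D = K_inv @ S @ K                  # should equal [[B - u r, 0], [r, lam]]
r = S[-1, :-1]                     # last row of S without its final entry
```

In this sketch the last column of `D` is zero above the entry λ, and only the last column of the leading block differs from that of S.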

Since r = (0, 0, ..., 0, s_{n,n-1}), ur is an (n - 1) by (n - 1) matrix with only its last column different from zero. Hence, K^{-1}SK is still in 1-subtriangular form.

The similarity S → K^{-1}SK requires only the n - 1 multiplications needed to get ur and would therefore be much faster than the method we propose. Several difficulties appear to prevent this from being effective. Dividing a characteristic vector by its last component to achieve the required normalization would be impossible if this component were zero^13
13 If r_i ≠ 0 for i = 1, ..., n - 1, this could not happen in theory but even then roundoff error might produce a zero where the true value was very small.


and would probably produce gross inaccuracy if the component were small. Normalizing as in [5] to make the largest component one would destroy the 1-subtriangular property since in (11.4) r would in general come from a row other than the last and so would have more nonzero components. Also, (11.4) does not preserve the norm of S and so all control of the size of elements is lost with attendant danger of inaccuracy. A corollary difficulty would be the necessity for going to floating point operations. Finally, the method here proposed produces a characteristic vector from a given λ and checks the accuracy of λ as a byproduct of the reduction.^14

12. Improvement of an estimate of λ. The determination of a characteristic value can be regarded as the finding of a minimum of the function

(12.1)    F(\lambda, x) = \| (S - \lambda 1) x \|

subject to the condition that \|x\| = 1. For given λ, and x the unit vector calculated in (9.13-14-15),

(12.2)    F(\lambda, x) = |g| = |g(\lambda)|.

Keeping x fixed but minimizing F for all choices of λ results in choosing for λ the Rayleigh quotient

(12.3)    \mu = x^* S x.
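The following sketch illustrates (12.1)-(12.3) numerically under stated assumptions. S is a random upper Hessenberg matrix (the assumed meaning of "1-subtriangular"), the trial value `lam` is arbitrary, and x is found by plain back substitution on the last n - 1 equations of (S - λ1)x = 0. That back substitution is a stand-in for the rotation formulas (9.13)-(9.15), not the procedure of this paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
S = np.triu(rng.standard_normal((n, n)), k=-1)    # upper Hessenberg (assumed)
lam = 0.4 + 0.2j                                   # hypothetical trial value

# Solve rows 2..n of (S - lam*1) x = 0, starting from x_n = 1.
A = S - lam * np.eye(n)
x = np.zeros(n, dtype=complex)
x[-1] = 1.0
for i in range(n - 1, 0, -1):                      # row i has subdiagonal entry A[i, i-1]
    x[i - 1] = -(A[i, i:] @ x[i:]) / A[i, i - 1]
x /= np.linalg.norm(x)                             # unit vector, as (12.1) requires


def F(mu):
    """Residual norm ||(S - mu*1) x|| for the fixed unit vector x."""
    return np.linalg.norm((S - mu * np.eye(n)) @ x)


g = (A @ x)[0]                                     # the only surviving residual component
mu = np.vdot(x, S @ x)                             # Rayleigh quotient x* S x
```

With x held fixed, `F(mu)` is never larger than `F(lam)`, and `F(lam)` equals `abs(g)` up to rounding.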

Calculating F(μ, x) directly by using the known form of x,

(12.4)    [F(\mu, x)]^2 = \| [S_\lambda + (\lambda - \mu) 1] x \|^2 = |g|^2 + |\lambda - \mu|^2 - 2 \mathrm{Re} \{ (\lambda - \mu) s_1 \bar{g} \}.

Hence, since x' = (-s_1, c_1 u) for u a unit vector with n - 1 components, the minimum value of F(μ, x) for the fixed choice of x is attained for

(12.5)    \mu = \lambda - \bar{s}_1 g

and is |c_1 g|. If g = 0 the correction is zero as expected. For s_1 to be zero, the rotation R_{n1} in the sequence of (6.4) must have been unnecessary; that is, the (2, n) component of S^(2) must have been zero. An interchange of the first and second rows of S^(2) would give a new s_1 and a new g, producing a nonzero change in λ unless the new s_1 is also zero and for this the
14 That (11.4) does retain the 1-subtriangular form was noted by G. E. Forsythe in conversation with Werner Frank and the author. This and other discussions at the Ramo-Wooldridge Corporation are acknowledged gratefully.


(1, n) component of S^(2) would have to be zero. Then S^(2) already has last column zero and λ is a characteristic value.

It would appear to be reasonable to try to approach a characteristic value by: (1) choose an estimate of λ; (2) calculate x by (9.13-14-15); (3) "improve" λ by (12.5); and, (4) using the new λ, calculate x again. Unfortunately, we cannot guarantee the convergence to a characteristic value since at (2) we are finding x as a solution of the last n - 1 of the equations S_λ x = 0 rather than by minimizing F(λ, x). Indeed the improved value might not even satisfy a known bound for the |λ_i|.

Since the coefficients in the first row of S_λ are used only to calculate g(λ) and not to determine x, it would be interesting to try the above iteration with the modification that in (4) a row vector be calculated by the sequence indicated by

(12.6)    S_\lambda \to R_{12} S_\lambda \to R_{13} R_{12} S_\lambda \to \cdots

thus creating zeros in the first row instead of in the last column. The same code could of course be used for this as for (6.4) since only the references to row and column indices would need to be interchanged and the sequence of labels 1, 2, ..., n reversed.

13. Scaling. While the scaling of A required in §1 was adequate for the operations leading from S_λ to S_λ^(1), the following left multiplications by the R*_{nk} could lead to overflow. For fully efficient use of the word length of a machine, it would be desirable to carry along an exponent for each column of T_λ^(k) (cf. (7.2)) and when overflow occurs on any addition, scale down the two columns in use, record the change and repeat the current multiplication by R*_{nk}. When the process is finished and λ is added back to the diagonal elements of T_λ (cf. (7.8)), the down scaling can be reversed by a corresponding scaling up since R*SR, unlike R*S_λR, will have norm equal to that of A; hence, < 4.

Alternatively, A can be normalized more strictly so no overflow can occur in the entire process. Taking

(13.1)    N(A) = N(S) < \beta_0,

we have |λ_i| < β_0 so we can limit our choice of λ by

(13.2)    |\lambda| < \beta_0.

Then

(13.3)    N(\lambda 1_n) < n^{1/2} \beta_0

so that by the triangle inequality,

(13.4)    N(S - \lambda 1_n) < (1 + n^{1/2}) \beta_0.
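The bound (13.4) can be illustrated numerically. The sketch assumes that N(·) is the Euclidean (Frobenius) norm, which is consistent with N(λ1_n) = n^{1/2}|λ| in (13.3). The name `beta0` for the bound in (13.1), its particular value, and the random test matrix are all choices made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8
S = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

beta0 = 1.0 / (1.0 + np.sqrt(n))                   # hypothetical bound, chosen so (13.4) gives one
S *= beta0 / np.linalg.norm(S, "fro")              # scale so that N(S) = beta0

lam = beta0 * np.exp(0.7j)                         # a trial value with |lam| <= beta0
N_shift = np.linalg.norm(S - lam * np.eye(n), "fro")
bound = (1.0 + np.sqrt(n)) * beta0                 # right side of (13.4)
largest_root = np.max(np.abs(np.linalg.eigvals(S)))
```

With this scaling the shifted matrix has norm at most one, so none of its elements can exceed one in modulus.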


Hence, normalizing A by requiring

(13.5)    N(A) < (1 + n^{1/2})^{-1}

will guarantee no overflow in the entire process. Even if this requirement is satisfied by the excessively crude limitation

          \max_{i,j} |a_{ij}| < [n(1 + n^{1/2})]^{-1},

for n < 19 no more than two significant digits are lost at the beginning of each word and these are usually allocated to the exponent in packed floating point representation.

14. Real arithmetic. For a matrix with real elements and λ a real number, the plane unitary rotations employed will be real and no complex arithmetic will be encountered. It may sometimes be known that a nonsymmetric real matrix has only real characteristic values^15 and in this case real arithmetic can be used throughout the solution.

Where a pair of conjugate complex roots λ = α + iβ and λ̄ = α - iβ are known, a little more can be done to stay within real arithmetic. Unfortunately, it seems difficult to do this without increasing the number of multiplications more than is reasonable. Letting
(14.1)    Q_\lambda = S^2 - 2\alpha S + (\alpha^2 + \beta^2) 1

we note that Q_λ is 2-subtriangular and real. Multiplying Q_λ on the right successively by the R_{ij} affecting the planes (n - 1, n - 2), (n - 1, n - 3), ..., (n - 1, 1), (n, n - 2), (n, n - 3), ..., (n, 1), a suitable choice of (c, s) pairs will yield zeros in the order exhibited in Figure 6. Since we are assuming that
(14.2)    q(\lambda) = (\lambda - \alpha)^2 + \beta^2 = \lambda^2 - 2\alpha\lambda + (\alpha^2 + \beta^2)

is a factor of the characteristic equation, then for β ≠ 0, the rank of Q_λ is equal to or less than n - 2. Using (2.1), and observing that a relation like (6.8) holds between the p_i and the μ_i in the corresponding components of S^2,

(14.3)    |\mu_i| \ge |r_i r_{i+1}|    (i = 1, 2, ..., n - 2),

15 Frequently such cases arise because A = GH with G and H symmetric and at least one of them positive definite. An alternative then is to consider f(λ) = det (λG^{-1} - H), factor G^{-1} = R*DR (where G^{-1} is positive definite because G is) with R*R = 1 and solve the real symmetric characteristic value problem for K = D^{-1/2}RHR*D^{-1/2} since f(λ) = (det D)·[det (λ1 - K)].

[FIG. 6. The matrix Q_λ R_{n-1,n-2} R_{n-1,n-3} ... R_{n,1}. The two by two block (a b; c d) stands in the upper right corner; the numbered zeros 0_1, 0_2, ..., 0_{2n-4} show the order in which the zeros are produced.]

and hence if all r_i are different from zero, the same is true of the μ_i. Consequently, we have the lemma:

LEMMA 14.1. If the subdiagonal elements r_i in S are all different from zero, a necessary and sufficient condition that q(λ) be a factor of the minimum equation of S is that

(14.4)    M_2(\lambda) = \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix},

where M_2(λ) occurs in the upper right corner of the matrix in Figure 6.

Since this assertion neither requires β ≠ 0 nor α real it offers a possible way to deal with multiple roots.^16 (There is an obvious generalization to roots of multiplicity greater than two.)

It would also seem very worthwhile to investigate the use of M_2(λ) as a matrix function of the real variables α and β as a means of obtaining quadratic factors of the characteristic equation.

Multiplying Q_λ on the left successively by the inverses of the matrices used as right factors would produce a matrix R*Q_λR which would still have its last two columns zero, assuming the roots of q(λ) = 0 to be characteristic values. The easy device of (7.8) now fails, however, and it would seem necessary to calculate
(14.5)    T = R^* S R

(using the factored form of R) without using Q_λR as an intermediate step. Worse, the subtriangular form of S is destroyed since rotations computed for Q_λ cannot be expected to create any zeros in S. However,

(14.6)    (T - \alpha 1)^2 + \beta^2 1 = R^* [ (S - \alpha 1)^2 + \beta^2 1 ] R

16 Note that for (λ - α)^2 to be a factor of the minimum (as opposed to "characteristic") equation, rank Q_α ≤ n - 2.


and each of these matrices annihilates a vector x if and only if all but the last two components of x are zero, that is

(14.7)    x = \xi e_{n-1} + \eta e_n

provided we assume all μ_i ≠ 0. Hence, with polynomials in T commuting with one another,

(14.8)    (T - \alpha 1) e_{n-1} = \xi_1 e_{n-1} + \eta_1 e_n,

and

(14.9)    (T - \alpha 1) e_n = \xi_2 e_{n-1} + \eta_2 e_n,

so T has the form

(14.10)   T = \begin{pmatrix} * & \cdots & * & 0 & 0 \\ \vdots & & \vdots & \vdots & \vdots \\ * & \cdots & * & 0 & 0 \\ * & \cdots & * & \xi_1 + \alpha & \xi_2 \\ * & \cdots & * & \eta_1 & \eta_2 + \alpha \end{pmatrix}

and the two by two matrix in the lower right corner has the characteristic equation q(λ) = 0.
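The properties of Q_λ used in this section can be checked on a small example. The sketch below is illustrative only: it takes a companion matrix as a convenient real 1-subtriangular matrix (read here as upper Hessenberg) with known roots, and the roots themselves are arbitrary choices, none of which come from the text.

```python
import numpy as np

alpha, beta = 1.0, 2.0
# Hypothetical spectrum: one conjugate pair alpha +/- i*beta plus three real roots.
roots = [alpha + 1j * beta, alpha - 1j * beta, 3.0, -1.0, 0.5]
p = np.poly(roots).real              # monic coefficients, highest power first
n = len(roots)

# Companion matrix: ones on the subdiagonal, negated coefficients in the last column.
S = np.zeros((n, n))
S[np.arange(1, n), np.arange(n - 1)] = 1.0
S[:, -1] = -p[1:][::-1]

# (14.1): Q = S^2 - 2*alpha*S + (alpha^2 + beta^2) * 1, formed in real arithmetic.
Q = S @ S - 2 * alpha * S + (alpha**2 + beta**2) * np.eye(n)
rank_Q = np.linalg.matrix_rank(Q, tol=1e-8)
second_subdiagonal = np.diag(Q, -2)  # products r_i * r_{i+1} of subdiagonal elements of S
```

Here Q is real, vanishes below its second subdiagonal, and has rank n - 2 because q(λ) divides the characteristic polynomial.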
REFERENCES

1. GIVENS, W., Numerical computation of the characteristic values of a real symmetric matrix, Oak Ridge National Laboratory Report No. 1574 (1954), available from Office of Technical Services, Washington 25, D. C. ($0.70).
2. GIVENS, W., The characteristic value-vector problem, J. Assoc. Comput. Mach., 4 (1957), pp. 298-307.
3. KOGBETLIANTZ, E. G., Solution of linear equations by diagonalization of coefficients matrix, Quart. Appl. Math., 13 (1955), pp. 123-132.
4. LOTKIN, M., Characteristic values of arbitrary matrices, Quart. Appl. Math., 14 (1956), pp. 267-275.
5. WILKINSON, J. H., The calculation of the latent roots and vectors of matrices on the Pilot Model of the A.C.E., Proc. Cambridge Philos. Soc., 50 (1954), pp. 536-566. The use of iterative methods for finding the latent roots and vectors of matrices, Math. Tables Aids Comput., 9 (1955), pp. 184-191.
6. FELLER, W. AND FORSYTHE, G. E., New matrix transformations for obtaining characteristic vectors, Quart. Appl. Math., 8 (1951), pp. 325-331.

WAYNE STATE UNIVERSITY AND RAMO-WOOLDRIDGE CORPORATION
