$$x_1 = \frac{b_1}{a_{11}}, \quad x_2 = \frac{b_2}{a_{22}}, \quad \ldots, \quad x_n = \frac{b_n}{a_{nn}}$$
Easy to solve system (Cont.)
Lower triangular matrix:
Solution: This system is solved using forward substitution
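Forward substitution can be sketched in R as follows. This is a minimal illustration, assuming a square lower triangular matrix with nonzero diagonal entries; the function name `forward_sub` and the small test system are hypothetical and not part of the original slides.

```r
# Forward substitution for a lower triangular system L x = b.
# Assumes L is square, lower triangular, with nonzero diagonal entries.
forward_sub <- function(L, b) {
  n <- length(b)
  x <- numeric(n)
  for (i in seq_len(n)) {
    # contribution of the already computed unknowns x[1], ..., x[i-1]
    s <- if (i > 1) sum(L[i, 1:(i - 1)] * x[1:(i - 1)]) else 0
    x[i] <- (b[i] - s) / L[i, i]
  }
  x
}

# Illustrative system (matrix filled column by column)
L <- matrix(c(2, 1, 4,
              0, 3, 5,
              0, 0, 6), nrow = 3)
b <- c(2, 7, 32)
x <- forward_sub(L, b)
```

Base R also provides `forwardsolve()`, which should give the same result for such a system.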
Upper Triangular Matrix:
Solution: This system is solved using backward substitution.
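Backward substitution can be sketched in the same way. Again this is only an illustration, assuming a square upper triangular matrix with nonzero diagonal entries; `backward_sub` and the example system are hypothetical names and values.

```r
# Backward substitution for an upper triangular system U x = b.
# Assumes U is square, upper triangular, with nonzero diagonal entries.
backward_sub <- function(U, b) {
  n <- length(b)
  x <- numeric(n)
  for (i in n:1) {
    # contribution of the already computed unknowns x[i+1], ..., x[n]
    s <- if (i < n) sum(U[i, (i + 1):n] * x[(i + 1):n]) else 0
    x[i] <- (b[i] - s) / U[i, i]
  }
  x
}

# Illustrative system (matrix filled column by column)
U <- matrix(c(2, 0, 0,
              1, 3, 0,
              4, 5, 6), nrow = 3)
b <- c(16, 21, 18)
x <- backward_sub(U, b)
```

Base R's `backsolve()` performs the same computation.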
LU Decomposition
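$$A = LU$$
Where,
$$U = \begin{pmatrix} u_{11} & u_{12} & \cdots & u_{1m} \\ 0 & u_{22} & \cdots & u_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & u_{mm} \end{pmatrix}$$
and
$$L = \begin{pmatrix} l_{11} & 0 & \cdots & 0 \\ l_{21} & l_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ l_{m1} & l_{m2} & \cdots & l_{mm} \end{pmatrix}$$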
LU decomposition was originally derived as a decomposition of quadratic and bilinear forms. Lagrange, in the very first paper in his collected works (1759), derives the algorithm we call Gaussian elimination. Later, Turing introduced the LU decomposition of a matrix in 1948; it is used to solve systems of linear equations.
Let A be an m × m nonsingular square matrix that can be reduced to upper triangular form without row interchanges. Then there exist two matrices L and U such that A = LU, where L is a lower triangular matrix and U is an upper triangular matrix.
J. L. Lagrange (1736-1813)
A. M. Turing (1912-1954)
$$A \sim \cdots \sim U \quad \text{(upper triangular)}$$
$$U = E_k \cdots E_1 A$$
$$A = (E_1)^{-1} \cdots (E_k)^{-1} U$$
If each such elementary matrix $E_i$ is a lower triangular matrix, it can be proved that $(E_1)^{-1}, \ldots, (E_k)^{-1}$ are lower triangular, and $(E_1)^{-1} \cdots (E_k)^{-1}$ is a lower triangular matrix.
Let $L = (E_1)^{-1} \cdots (E_k)^{-1}$; then $A = LU$.
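How to decompose A=LU?
$$A = \begin{pmatrix} 6 & -2 & 2 \\ 12 & -8 & 6 \\ 3 & -13 & 2 \end{pmatrix}$$
Now,
$$E_1 A = \begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -1/2 & 0 & 1 \end{pmatrix} \begin{pmatrix} 6 & -2 & 2 \\ 12 & -8 & 6 \\ 3 & -13 & 2 \end{pmatrix} = \begin{pmatrix} 6 & -2 & 2 \\ 0 & -4 & 2 \\ 0 & -12 & 1 \end{pmatrix}$$
$$U = E_2 E_1 A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -3 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -1/2 & 0 & 1 \end{pmatrix} \begin{pmatrix} 6 & -2 & 2 \\ 12 & -8 & 6 \\ 3 & -13 & 2 \end{pmatrix} = \begin{pmatrix} 6 & -2 & 2 \\ 0 & -4 & 2 \\ 0 & 0 & -5 \end{pmatrix}$$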
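Calculation of L and U (cont.)
$$A = \begin{pmatrix} 6 & -2 & 2 \\ 12 & -8 & 6 \\ 3 & -13 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 6 & -2 & 2 \\ 12 & -8 & 6 \\ 3 & -13 & 2 \end{pmatrix}$$
Now reducing the first column we have
$$\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -1/2 & 0 & 1 \end{pmatrix} \begin{pmatrix} 6 & -2 & 2 \\ 12 & -8 & 6 \\ 3 & -13 & 2 \end{pmatrix} = \begin{pmatrix} 6 & -2 & 2 \\ 0 & -4 & 2 \\ 0 & -12 & 1 \end{pmatrix}$$
and reducing the second column,
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -3 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -1/2 & 0 & 1 \end{pmatrix} \begin{pmatrix} 6 & -2 & 2 \\ 12 & -8 & 6 \\ 3 & -13 & 2 \end{pmatrix} = \begin{pmatrix} 6 & -2 & 2 \\ 0 & -4 & 2 \\ 0 & 0 & -5 \end{pmatrix}$$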
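The elimination by elementary matrices can be checked numerically. The sketch below takes a 3 × 3 example matrix and the two elementary matrices that clear its first and second columns; the exact entries (in particular the signs) are an assumption of this sketch.

```r
# Example matrix and the elementary (row-operation) matrices; entries are assumed
A  <- matrix(c( 6,  -2, 2,
               12,  -8, 6,
                3, -13, 2), nrow = 3, byrow = TRUE)
E1 <- matrix(c(   1, 0, 0,
                 -2, 1, 0,
               -1/2, 0, 1), nrow = 3, byrow = TRUE)  # clears column 1 below the pivot
E2 <- matrix(c(1,  0, 0,
               0,  1, 0,
               0, -3, 1), nrow = 3, byrow = TRUE)    # clears column 2 below the pivot

U <- E2 %*% E1 %*% A            # upper triangular factor
L <- solve(E1) %*% solve(E2)    # L = E1^{-1} E2^{-1}, lower triangular
```

Multiplying `L %*% U` should reproduce `A`, which is the statement A = LU.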
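If A is a nonsingular matrix, then for each lower triangular matrix L the upper triangular matrix U is unique, but an LU decomposition is not unique. There can be more than one such LU decomposition for a matrix.
Now
$$(E_1)^{-1}(E_2)^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1/2 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 3 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1/2 & 3 & 1 \end{pmatrix}$$
Therefore,
$$A = \begin{pmatrix} 6 & -2 & 2 \\ 12 & -8 & 6 \\ 3 & -13 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1/2 & 3 & 1 \end{pmatrix} \begin{pmatrix} 6 & -2 & 2 \\ 0 & -4 & 2 \\ 0 & 0 & -5 \end{pmatrix} = LU$$
and also, for example,
$$A = \begin{pmatrix} 6 & -2 & 2 \\ 12 & -8 & 6 \\ 3 & -13 & 2 \end{pmatrix} = \begin{pmatrix} 6 & 0 & 0 \\ 12 & 1 & 0 \\ 3 & 3 & 1 \end{pmatrix} \begin{pmatrix} 1 & -2/6 & 2/6 \\ 0 & -4 & 2 \\ 0 & 0 & -5 \end{pmatrix} = LU$$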
Thus the LU decomposition is not unique. Since we compute the LU decomposition by elementary transformations, if we change L then U changes accordingly so that A = LU still holds.
To obtain a unique LU decomposition, it is necessary to put some restriction on the L and U matrices. For example, we can require the lower triangular matrix L to be a unit one (i.e., set all the entries of its main diagonal to ones).
LU Decomposition in R:
library(Matrix)
x <- matrix(c(3, 2, 1, 9, 3, 4, 4, 2, 5), ncol = 3, nrow = 3)
expand(lu(x))  # returns the factors L and U together with the permutation matrix P
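For comparison, a unit lower triangular L can also be computed by hand. The following is a minimal sketch of elimination without pivoting (often called the Doolittle scheme), assuming every pivot is nonzero; `lu_doolittle` is a hypothetical helper, not a function of the Matrix package. Because `lu()` in the Matrix package uses row pivoting, its factors will generally differ from the ones produced here.

```r
# LU factorisation with unit lower triangular L, no pivoting.
# Assumes all pivots U[k, k] are nonzero.
lu_doolittle <- function(A) {
  n <- nrow(A)
  L <- diag(n)
  U <- A
  for (k in seq_len(n - 1)) {
    for (i in (k + 1):n) {
      L[i, k] <- U[i, k] / U[k, k]          # multiplier for row i
      U[i, ] <- U[i, ] - L[i, k] * U[k, ]   # eliminate entry (i, k)
    }
  }
  list(L = L, U = U)
}

x <- matrix(c(3, 2, 1, 9, 3, 4, 4, 2, 5), ncol = 3, nrow = 3)
f <- lu_doolittle(x)
```

Here `f$L %*% f$U` should reproduce `x`, and `f$L` has ones on its diagonal.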
Note: there are also generalizations of LU to nonsquare and singular matrices, such as rank revealing LU factorization.
[Pan, C.T. (2000). On the existence and computation of rank revealing LU factorizations. Linear Algebra and its Applications, 316: 199-222.
Miranian, L. and Gu, M. (2003). Strong rank revealing LU factorizations. Linear Algebra and its Applications, 367: 1-16.]
Uses: The LU decomposition is most commonly used in the solution of systems of simultaneous linear equations. We can also find the determinant easily by using the LU decomposition (the product of the diagonal elements of the upper and lower triangular matrices).
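The determinant rule can be illustrated with a small check. The factors below are example values chosen for this sketch (a unit lower triangular L and an upper triangular U); with a pivoted factorisation the sign of the permutation would also have to be taken into account.

```r
# Example factors (assumed values for illustration)
L <- matrix(c(  1, 0, 0,
                2, 1, 0,
              1/2, 3, 1), nrow = 3, byrow = TRUE)
U <- matrix(c(6, -2,  2,
              0, -4,  2,
              0,  0, -5), nrow = 3, byrow = TRUE)
A <- L %*% U

# det(A) = det(L) * det(U) = product of the diagonal entries of L and U
det_lu <- prod(diag(L)) * prod(diag(U))
```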
Solving system of linear equations using LU decomposition
Suppose we would like to solve an m × m system AX = b. If we can find an LU decomposition for A, then to solve AX = b it is enough to solve the systems
$$LY = b, \qquad UX = Y$$
Thus the system LY = b can be solved by the method of forward substitution and the system UX = Y can be solved by the method of backward substitution. To illustrate, we give an example.
Consider the given system AX = b, where
$$A = \begin{pmatrix} 6 & -2 & 2 \\ 12 & -8 & 6 \\ 3 & -13 & 2 \end{pmatrix} \quad \text{and} \quad b = \begin{pmatrix} 8 \\ 14 \\ -17 \end{pmatrix}$$
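We have seen A = LU, where
$$L = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1/2 & 3 & 1 \end{pmatrix}, \qquad U = \begin{pmatrix} 6 & -2 & 2 \\ 0 & -4 & 2 \\ 0 & 0 & -5 \end{pmatrix}$$
Thus, to solve AX = b, we first solve LY = b by forward substitution:
$$\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1/2 & 3 & 1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 8 \\ 14 \\ -17 \end{pmatrix}$$
Then
$$Y = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 8 \\ -2 \\ -15 \end{pmatrix}$$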
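Now, we solve UX = Y by backward substitution:
$$\begin{pmatrix} 6 & -2 & 2 \\ 0 & -4 & 2 \\ 0 & 0 & -5 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 8 \\ -2 \\ -15 \end{pmatrix}$$
then
$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$$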
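The two triangular solves can be reproduced with base R's `forwardsolve()` and `backsolve()`. The numerical values of L, U and b below are assumptions of this sketch (an example with a unit lower triangular L).

```r
# Example factors and right-hand side (assumed values)
L <- matrix(c(  1, 0, 0,
                2, 1, 0,
              1/2, 3, 1), nrow = 3, byrow = TRUE)
U <- matrix(c(6, -2,  2,
              0, -4,  2,
              0,  0, -5), nrow = 3, byrow = TRUE)
b <- c(8, 14, -17)

y <- forwardsolve(L, b)   # solve L y = b by forward substitution
x <- backsolve(U, y)      # solve U x = y by backward substitution
```

The result can be compared with `solve(L %*% U, b)`, which solves the full system directly.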
QR Decomposition
If A is an m × n matrix with linearly independent columns, then A can be decomposed as $A = QR$, where Q is an m × n matrix whose columns form an orthonormal basis for the column space of A and R is an n × n nonsingular upper triangular matrix.
Jørgen Pedersen Gram (1850-1916)
Erhard Schmidt (1876-1959)
The QR decomposition originated with Gram (1883). Later, Erhard Schmidt (1907) proved the QR decomposition theorem.
QR Decomposition (Cont.)
Theorem: If A is an m × n matrix with linearly independent columns, then A can be decomposed as $A = QR$, where Q is an m × n matrix whose columns form an orthonormal basis for the column space of A and R is an n × n nonsingular upper triangular matrix.
Proof: Suppose $A = [u_1 \mid u_2 \mid \cdots \mid u_n]$ and rank(A) = n. Apply the Gram-Schmidt process to $\{u_1, u_2, \ldots, u_n\}$; the orthogonal vectors $v_1, v_2, \ldots, v_n$ are
$$v_i = u_i - \frac{\langle u_i, v_1 \rangle}{\|v_1\|^2} v_1 - \frac{\langle u_i, v_2 \rangle}{\|v_2\|^2} v_2 - \cdots - \frac{\langle u_i, v_{i-1} \rangle}{\|v_{i-1}\|^2} v_{i-1}$$
Let $q_i = \dfrac{v_i}{\|v_i\|}$ for i = 1, 2, ..., n. Thus $q_1, q_2, \ldots, q_n$ form an orthonormal basis for the column space of A.
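Now,
$$u_i = v_i + \frac{\langle u_i, v_1 \rangle}{\|v_1\|^2} v_1 + \frac{\langle u_i, v_2 \rangle}{\|v_2\|^2} v_2 + \cdots + \frac{\langle u_i, v_{i-1} \rangle}{\|v_{i-1}\|^2} v_{i-1}$$
i.e.,
$$u_i = \|v_i\| q_i + \langle u_i, q_1 \rangle q_1 + \langle u_i, q_2 \rangle q_2 + \cdots + \langle u_i, q_{i-1} \rangle q_{i-1}$$
$$u_i \in \mathrm{span}\{v_1, v_2, \ldots, v_i\} = \mathrm{span}\{q_1, q_2, \ldots, q_i\}$$
Thus $u_i$ is orthogonal to $q_j$ for j > i;
$$\begin{aligned} u_1 &= \|v_1\| q_1 \\ u_2 &= \|v_2\| q_2 + \langle u_2, q_1 \rangle q_1 \\ u_3 &= \|v_3\| q_3 + \langle u_3, q_1 \rangle q_1 + \langle u_3, q_2 \rangle q_2 \\ &\;\;\vdots \\ u_n &= \|v_n\| q_n + \langle u_n, q_1 \rangle q_1 + \langle u_n, q_2 \rangle q_2 + \cdots + \langle u_n, q_{n-1} \rangle q_{n-1} \end{aligned}$$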
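Let $Q = [q_1 \; q_2 \; \cdots \; q_n]$, so Q is an m × n matrix whose columns form an orthonormal basis for the column space of A.
Now,
$$A = [u_1 \; u_2 \; \cdots \; u_n] = [q_1 \; q_2 \; \cdots \; q_n] \begin{pmatrix} \|v_1\| & \langle u_2, q_1 \rangle & \langle u_3, q_1 \rangle & \cdots & \langle u_n, q_1 \rangle \\ 0 & \|v_2\| & \langle u_3, q_2 \rangle & \cdots & \langle u_n, q_2 \rangle \\ 0 & 0 & \|v_3\| & \cdots & \langle u_n, q_3 \rangle \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \|v_n\| \end{pmatrix}$$
i.e., A = QR, where
$$R = \begin{pmatrix} \|v_1\| & \langle u_2, q_1 \rangle & \langle u_3, q_1 \rangle & \cdots & \langle u_n, q_1 \rangle \\ 0 & \|v_2\| & \langle u_3, q_2 \rangle & \cdots & \langle u_n, q_2 \rangle \\ 0 & 0 & \|v_3\| & \cdots & \langle u_n, q_3 \rangle \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \|v_n\| \end{pmatrix}$$
Thus A can be decomposed as A = QR, where R is an upper triangular and nonsingular matrix.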
Example: Find the QR decomposition of
$$A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
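Calculation of QR Decomposition
Applying the Gram-Schmidt process of computing the QR decomposition (with $a_1, a_2, a_3$ the columns of A):
1st Step:
$$r_{11} = \|a_1\| = \sqrt{3}, \qquad q_1 = \frac{a_1}{\|a_1\|} = \begin{pmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \\ 0 \end{pmatrix}$$
2nd Step:
$$r_{12} = q_1^T a_2 = 2/\sqrt{3}$$
3rd Step:
$$\hat{q}_2 = a_2 - q_1 q_1^T a_2 = a_2 - r_{12} q_1 = \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix} - (2/\sqrt{3}) \begin{pmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \\ 0 \end{pmatrix} = \begin{pmatrix} 1/3 \\ -2/3 \\ 1/3 \\ 0 \end{pmatrix}$$
$$r_{22} = \|\hat{q}_2\| = \sqrt{2/3}, \qquad q_2 = \frac{\hat{q}_2}{\|\hat{q}_2\|} = \begin{pmatrix} 1/\sqrt{6} \\ -2/\sqrt{6} \\ 1/\sqrt{6} \\ 0 \end{pmatrix}$$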
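4th Step:
$$r_{13} = q_1^T a_3 = 1/\sqrt{3}$$
5th Step:
$$r_{23} = q_2^T a_3 = 1/\sqrt{6}$$
6th Step:
$$\hat{q}_3 = a_3 - q_1 q_1^T a_3 - q_2 q_2^T a_3 = a_3 - r_{13} q_1 - r_{23} q_2 = \begin{pmatrix} 1/2 \\ 0 \\ -1/2 \\ 1 \end{pmatrix}$$
$$r_{33} = \|\hat{q}_3\| = \sqrt{6}/2, \qquad q_3 = \frac{\hat{q}_3}{\|\hat{q}_3\|} = \begin{pmatrix} 1/\sqrt{6} \\ 0 \\ -1/\sqrt{6} \\ 2/\sqrt{6} \end{pmatrix}$$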
Therefore, A = QR:
$$\begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1/\sqrt{3} & 1/\sqrt{6} & 1/\sqrt{6} \\ 1/\sqrt{3} & -2/\sqrt{6} & 0 \\ 1/\sqrt{3} & 1/\sqrt{6} & -1/\sqrt{6} \\ 0 & 0 & 2/\sqrt{6} \end{pmatrix} \begin{pmatrix} \sqrt{3} & 2/\sqrt{3} & 1/\sqrt{3} \\ 0 & 2/\sqrt{6} & 1/\sqrt{6} \\ 0 & 0 & \sqrt{6}/2 \end{pmatrix}$$
R code for QR Decomposition:
x <- matrix(c(1, 2, 3, 2, 5, 4, 3, 4, 9), ncol = 3, nrow = 3)
qrstr <- qr(x)
Q <- qr.Q(qrstr)
R <- qr.R(qrstr)
Uses: QR decomposition is widely used in computer codes to find the eigenvalues of a matrix, to solve linear systems, and to find least squares approximations.
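The Gram-Schmidt construction of Q and R can be sketched directly in R. This is an illustrative implementation of the classical Gram-Schmidt process, assuming the columns of the input are linearly independent; `gram_schmidt_qr` is a hypothetical name, and the 4 × 3 matrix is an example chosen for this sketch. Base R's `qr()` uses Householder reflections instead, so its factors may differ in the signs of some columns and rows.

```r
# Classical Gram-Schmidt QR: A = Q R with orthonormal columns in Q.
# Assumes the columns of A are linearly independent.
gram_schmidt_qr <- function(A) {
  n <- ncol(A)
  Q <- matrix(0, nrow(A), n)
  R <- matrix(0, n, n)
  for (j in seq_len(n)) {
    v <- A[, j]
    for (i in seq_len(j - 1)) {
      R[i, j] <- sum(Q[, i] * A[, j])   # r_ij = q_i^T a_j
      v <- v - R[i, j] * Q[, i]         # remove the component along q_i
    }
    R[j, j] <- sqrt(sum(v^2))           # r_jj = norm of the remainder
    Q[, j] <- v / R[j, j]
  }
  list(Q = Q, R = R)
}

A <- matrix(c(1, 1, 1,
              1, 0, 0,
              1, 1, 0,
              0, 0, 1), nrow = 4, byrow = TRUE)
f <- gram_schmidt_qr(A)
```

For larger or ill-conditioned problems the modified Gram-Schmidt variant or `qr()` is usually preferred for numerical stability.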
Least square solution using QR Decomposition
The least squares solution b satisfies
$$(X^t X) b = X^t Y$$
Let X = QR. Then
$$X^t X b = (QR)^t (QR) b = R^t Q^t Q R b = R^t R b, \qquad X^t Y = (QR)^t Y = R^t Q^t Y$$
Therefore,
$$R^t R b = R^t Q^t Y \;\Rightarrow\; (R^t)^{-1} R^t R b = (R^t)^{-1} R^t Q^t Y \;\Rightarrow\; R b = Q^t Y = Z$$
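The reduction to the triangular system R b = Q^t Y can be illustrated with base R. The simulated data below are purely hypothetical and serve only to show the computation.

```r
set.seed(1)
# Hypothetical design matrix and response
X <- cbind(1, rnorm(20), rnorm(20))
Y <- X %*% c(2, -1, 0.5) + rnorm(20, sd = 0.1)

qrX <- qr(X)
Q <- qr.Q(qrX)
R <- qr.R(qrX)

Z <- crossprod(Q, Y)     # Z = Q^t Y
b <- backsolve(R, Z)     # solve the triangular system R b = Z

# For comparison: solution of the normal equations (X^t X) b = X^t Y
b_normal <- solve(crossprod(X), crossprod(X, Y))
```

Solving the triangular system avoids forming X^t X explicitly, which is generally better conditioned than working with the normal equations.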
Cholesky Decomposition
Cholesky died from wounds received on the battlefield on 31 August 1918 at 5 o'clock in the morning in the north of France. After his death one of his fellow officers, Commandant Benoit, published Cholesky's method of computing solutions to the normal equations for some least squares data fitting problems in the Bulletin géodésique in 1924. This method is known as the Cholesky decomposition.
Cholesky Decomposition: If A is a real, symmetric and positive definite matrix, then there exists a unique lower triangular matrix L with positive diagonal elements such that $A = LL^T$.
André-Louis Cholesky (1875-1918)
Theorem: If A is an n × n real, symmetric and positive definite matrix, then there exists a unique lower triangular matrix G with positive diagonal elements such that $A = GG^T$.
Proof: Since A is an n × n real and positive definite matrix, it has an LU decomposition, A = LU. Let the lower triangular matrix L be a unit one (i.e., set all the entries of its main diagonal to ones). In that case the LU decomposition is unique. Let us suppose
$$D = \mathrm{diag}(u_{11}, u_{22}, \ldots, u_{nn})$$
and observe that $M^T = D^{-1} U$ is a unit upper triangular matrix.
Thus, $A = LDM^T$. Since A is symmetric, $A = A^T$, i.e., $LDM^T = MDL^T$. From the uniqueness we have L = M. So, $A = LDL^T$. Since A is positive definite, all diagonal elements of D are positive. Let
$$G = L \, \mathrm{diag}(\sqrt{d_{11}}, \sqrt{d_{22}}, \ldots, \sqrt{d_{nn}})$$
then we can write $A = GG^T$.
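Cholesky Decomposition (Cont.)
Procedure to find the Cholesky decomposition
Suppose
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}$$
We need to solve the equation
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} = \underbrace{\begin{pmatrix} l_{11} & 0 & \cdots & 0 \\ l_{21} & l_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ l_{n1} & l_{n2} & \cdots & l_{nn} \end{pmatrix}}_{L} \underbrace{\begin{pmatrix} l_{11} & l_{21} & \cdots & l_{n1} \\ 0 & l_{22} & \cdots & l_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & l_{nn} \end{pmatrix}}_{L^T}$$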
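Example of Cholesky Decomposition
Suppose
$$A = \begin{pmatrix} 4 & 2 & -2 \\ 2 & 10 & 2 \\ -2 & 2 & 5 \end{pmatrix}$$
Then the Cholesky decomposition gives
$$L = \begin{pmatrix} 2 & 0 & 0 \\ 1 & 3 & 0 \\ -1 & 1 & \sqrt{3} \end{pmatrix}$$
Now,
For k from 1 to n
$$l_{kk} = \left( a_{kk} - \sum_{s=1}^{k-1} l_{ks}^2 \right)^{1/2}$$
For j from k+1 to n
$$l_{jk} = \left( a_{jk} - \sum_{s=1}^{k-1} l_{js} l_{ks} \right) \Big/ l_{kk}$$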
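The column-by-column recursion for the entries of L can be written out as a short R function. This is a sketch assuming a symmetric positive definite input; `cholesky_lower` is a hypothetical name and the 3 × 3 matrix is an example chosen for this illustration.

```r
# Cholesky factor L (lower triangular) with A = L L^T.
# Assumes A is symmetric and positive definite.
cholesky_lower <- function(A) {
  n <- nrow(A)
  L <- matrix(0, n, n)
  for (k in seq_len(n)) {
    s <- seq_len(k - 1)                                  # indices 1, ..., k-1 (empty when k = 1)
    L[k, k] <- sqrt(A[k, k] - sum(L[k, s]^2))            # diagonal entry l_kk
    if (k < n) for (j in (k + 1):n) {
      L[j, k] <- (A[j, k] - sum(L[j, s] * L[k, s])) / L[k, k]   # entries below the diagonal
    }
  }
  L
}

A <- matrix(c( 4,  2, -2,
               2, 10,  2,
              -2,  2,  5), nrow = 3, byrow = TRUE)
L <- cholesky_lower(A)
```

Base R's `chol()` returns the upper triangular factor, so `t(chol(A))` should agree with `L`.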
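R code for Cholesky Decomposition
x <- matrix(c(4, 2, -2, 2, 10, 2, -2, 2, 5), ncol = 3, nrow = 3)
cl <- chol(x)  # upper triangular factor; the lower triangular L is t(cl)
If we decompose A as $LDL^T$, then
$$L = \begin{pmatrix} 1 & 0 & 0 \\ 1/2 & 1 & 0 \\ -1/2 & 1/3 & 1 \end{pmatrix} \quad \text{and} \quad D = \begin{pmatrix} 4 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & 3 \end{pmatrix}$$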
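One way to obtain the LDL^T factors in R is to rescale the Cholesky factor. The sketch below assumes a symmetric positive definite example matrix (its entries are chosen for illustration) and uses only base R.

```r
A <- matrix(c( 4,  2, -2,
               2, 10,  2,
              -2,  2,  5), nrow = 3, byrow = TRUE)

G <- t(chol(A))            # lower triangular Cholesky factor, A = G G^T
d <- diag(G)
L <- G %*% diag(1 / d)     # unit lower triangular factor
D <- diag(d^2)             # diagonal factor, so that A = L D L^T
```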
Application of Cholesky Decomposition
Cholesky decomposition is used to solve the system of linear equations Ax = b, where A is real, symmetric and positive definite.
In regression analysis it can be used to estimate the parameters if $X^T X$ is positive definite.
In kernel principal component analysis, Cholesky decomposition is also used (Weiya Shi; Yue-Fei Guo; 2010).
Characteristic Roots and Characteristic Vectors
Any nonzero vector x is said to be a characteristic vector of a square matrix A if there exists a number λ such that Ax = λx; λ is then said to be a characteristic root of the matrix A corresponding to the characteristic vector x.
The characteristic root corresponding to a given characteristic vector is unique, but the characteristic vector corresponding to a root is not unique (any nonzero scalar multiple of x is also a characteristic vector).
We calculate the characteristic roots from the characteristic equation |A − λI| = 0.
For λ = λ_i, the characteristic vector is the solution x of the following homogeneous system of linear equations: (A − λ_i I)x = 0.
Theorem: If A is a real symmetric matrix and λ_i and λ_j are two distinct latent roots of A, then the corresponding latent vectors x_i and x_j are orthogonal.
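These definitions can be checked numerically with `eigen()`. The symmetric matrix below is a hypothetical example; the sketch verifies Ax = λx for each pair, the characteristic equation for one root, and the orthogonality of the characteristic vectors.

```r
# Hypothetical real symmetric matrix
A <- matrix(c(2, 1, 0,
              1, 3, 1,
              0, 1, 4), nrow = 3, byrow = TRUE)

e <- eigen(A)
lambda <- e$values      # characteristic roots
P <- e$vectors          # characteristic vectors (one per column)

resid <- A %*% P - P %*% diag(lambda)   # A x_i - lambda_i x_i for every i
G <- crossprod(P)                       # inner products x_i^T x_j
char_eq <- det(A - lambda[1] * diag(3)) # |A - lambda I| at the largest root
```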
Multiplicity
Algebraic Multiplicity: The number of repetitions of a certain eigenvalue. If, for a certain matrix, λ = {3, 3, 4}, then the algebraic multiplicity of 3 would be 2 (as it appears twice) and the algebraic multiplicity of 4 would be 1 (as it appears once). This type of multiplicity is normally represented by the Greek letter α, where α(λ_i) represents the algebraic multiplicity of λ_i.
Geometric Multiplicity: The geometric multiplicity of an eigenvalue is the number of linearly independent eigenvectors associated with it.
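The difference between the two multiplicities can be seen on a small example. The upper triangular matrix below is hypothetical; because it is triangular, its eigenvalues are its diagonal entries {3, 3, 4}. The geometric multiplicity is computed as n minus the rank of A − λI, with the rank taken from the singular values.

```r
# Hypothetical matrix with eigenvalues 3, 3, 4 (the diagonal entries)
A <- matrix(c(3, 1, 0,
              0, 3, 0,
              0, 0, 4), nrow = 3, byrow = TRUE)
lambda <- 3

# Algebraic multiplicity: how often lambda occurs among the eigenvalues
alg_mult <- sum(diag(A) == lambda)

# Geometric multiplicity: dimension of the null space of (A - lambda I)
M <- A - lambda * diag(3)
geo_mult <- ncol(A) - sum(svd(M)$d > 1e-10)
```

Here the algebraic multiplicity of 3 exceeds its geometric multiplicity, so this matrix cannot be diagonalized.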
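Jordan Decomposition
Camille Jordan (1870)
Let A be any n × n matrix. Then there exist a nonsingular matrix P and matrices $J_k(\lambda)$, each a k × k matrix of the form
$$J_k(\lambda) = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda \end{pmatrix}$$
such that
$$P^{-1} A P = \begin{pmatrix} J_{k_1}(\lambda_1) & 0 & \cdots & 0 \\ 0 & J_{k_2}(\lambda_2) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & J_{k_r}(\lambda_r) \end{pmatrix}$$
where $k_1 + k_2 + \cdots + k_r = n$. Also $\lambda_i$, i = 1, 2, ..., r, are the characteristic roots (not necessarily distinct), and the block sizes $k_i$ belonging to the same root add up to its algebraic multiplicity.
Jordan decomposition is used in differential equations and time series analysis.
Camille Jordan (1838-1921)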
Spectral Decomposition
Let A be an m × m real symmetric matrix. Then there exists an orthogonal matrix P such that $P^T A P = \Lambda$ or $A = P \Lambda P^T$, where Λ is a diagonal matrix.
A. L. Cauchy (1789-1857)
A. L. Cauchy established the spectral decomposition in 1829.
Spectral Decomposition and Principal Component Analysis (Cont.)
By using spectral decomposition we can write $A = P \Lambda P^T$.
In multivariate analysis our data is a matrix. Suppose our data is an X matrix. Suppose X is mean centered, i.e., $X \to (X - \mu)$, and the variance-covariance matrix is Σ. The variance-covariance matrix is real and symmetric.
Using spectral decomposition we can write $\Sigma = P \Lambda P^T$, where Λ is a diagonal matrix, $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$.
Also $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ and
tr(Σ) = total variation of the data = tr(Λ)
The principal component transformation is the transformation
$$Y = (X - \mu) P$$
where
$E(Y_i) = 0$
$V(Y_i) = \lambda_i$
$Cov(Y_i, Y_j) = 0$ if $i \ne j$
$V(Y_1) \ge V(Y_2) \ge \cdots \ge V(Y_n)$
$$\sum_{i=1}^{n} V(Y_i) = \mathrm{tr}(\Sigma), \qquad \prod_{i=1}^{n} V(Y_i) = |\Sigma|$$
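The properties of the principal component transformation can be verified on simulated data. The data-generating step below is purely hypothetical; the sketch uses the sample covariance matrix in place of Σ and `eigen()` for its spectral decomposition.

```r
set.seed(42)
# Hypothetical data: 50 observations on 3 correlated variables
z <- matrix(rnorm(150), ncol = 3)
X <- z %*% matrix(c(2, 1, 0, 0, 1, 1, 0, 0, 0.5), nrow = 3)

mu <- colMeans(X)
Sigma <- cov(X)                       # sample variance-covariance matrix
e <- eigen(Sigma, symmetric = TRUE)   # Sigma = P Lambda P^T
P <- e$vectors
lambda <- e$values                    # sorted in decreasing order

Y <- sweep(X, 2, mu) %*% P            # Y = (X - mu) P
```

The column variances of `Y` should equal `lambda`, their sum should equal the trace of `Sigma`, and their product its determinant.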
R code for Spectral Decomposition
x <- matrix(c(1, 2, 3, 2, 5, 4, 3, 4, 9), ncol = 3, nrow = 3)
eigen(x)
Application:
For data reduction.
Image processing and compression.
K-selection for K-means clustering.
Multivariate outlier detection.
Noise filtering.
Trend detection in the observations.
Historical background of SVD
There are five mathematicians who were responsible for establishing the existence of the singular value decomposition and developing its theory.
Eugenio Beltrami (1835-1899)
Camille Jordan (1838-1921)
James Joseph Sylvester (1814-1897)
Erhard Schmidt (1876-1959)
Hermann Weyl (1885-1955)
The singular value decomposition was originally developed by two mathematicians in the mid to late 1800s:
1. Eugenio Beltrami, 2. Camille Jordan
Several other mathematicians took part in the final developments of the SVD, including James Joseph Sylvester, Erhard Schmidt and Hermann Weyl, who studied the SVD into the mid-1900s.
C. Eckart and G. Young proved the low-rank approximation property of the SVD (1936).
What is SVD?
Any real (m × n) matrix X, where (n ≤ m), can be decomposed as
$$X = U \Lambda V^T$$
U is a (m × n) column-orthonormal matrix ($U^T U = I$), containing the eigenvectors of the symmetric matrix $XX^T$.
Λ is a (n × n) diagonal matrix, containing the singular values of matrix X. The number of nonzero diagonal elements of Λ corresponds to the rank of X.
$V^T$ is a (n × n) row-orthonormal matrix ($V^T V = I$), containing the eigenvectors of the symmetric matrix $X^T X$.
Singular Value Decomposition (Cont.)
Theorem (Singular Value Decomposition): Let X be m × n of rank r, r ≤ n ≤ m. Then there exist matrices U, V and a diagonal matrix Λ, with positive diagonal elements, such that
$$X = U \Lambda V^T$$
Proof: Since X is m × n of rank r, r ≤ n ≤ m, $XX^T$ and $X^T X$ are both of rank r (by using the concept of the Grammian matrix) and of dimension m × m and n × n respectively. Since $XX^T$ is a real symmetric matrix, we can write by spectral decomposition
$$XX^T = QDQ^T$$
where Q and D are, respectively, the matrices of characteristic vectors and corresponding characteristic roots of $XX^T$.
Again, since $X^T X$ is a real symmetric matrix, we can write by spectral decomposition
$$X^T X = RMR^T$$
where R is the (orthogonal) matrix of characteristic vectors and M is the diagonal matrix of the corresponding characteristic roots.
Since $XX^T$ and $X^T X$ are both of rank r, only r of their characteristic roots are positive, the remaining being zero. Hence we can write
$$D = \begin{pmatrix} D_r & 0 \\ 0 & 0 \end{pmatrix}$$
Also we can write
$$M = \begin{pmatrix} M_r & 0 \\ 0 & 0 \end{pmatrix}$$
We know that the nonzero characteristic roots of $XX^T$ and $X^T X$ are equal, so
$$D_r = M_r$$
Partition Q, R conformably with D and M, respectively, i.e., $Q = (Q_r, Q_*)$; $R = (R_r, R_*)$, such that $Q_r$ is m × r, $R_r$ is n × r and they correspond respectively to the nonzero characteristic roots of $XX^T$ and $X^T X$. Now take
$$U = Q_r, \qquad V = R_r, \qquad \Lambda = D_r^{1/2} = \mathrm{diag}(d_1^{1/2}, d_2^{1/2}, \ldots, d_r^{1/2})$$
where $d_i$, i = 1, 2, ..., r, are the positive characteristic roots of $XX^T$ and hence those of $X^T X$ as well (by using the concept of the Grammian matrix).
Now define
$$S = Q_r D_r^{1/2} R_r^T$$
Now we shall show that S = X, thus completing the proof.
$$\begin{aligned} S^T S &= (Q_r D_r^{1/2} R_r^T)^T (Q_r D_r^{1/2} R_r^T) \\ &= R_r D_r^{1/2} Q_r^T Q_r D_r^{1/2} R_r^T \\ &= R_r D_r R_r^T \\ &= R_r M_r R_r^T \\ &= RMR^T \\ &= X^T X \end{aligned}$$
Similarly,
$$SS^T = XX^T$$
From the first relation above we conclude that, for an arbitrary orthogonal matrix, say $P_1$,
$$S = P_1 X$$
while from the second we conclude that, for an arbitrary orthogonal matrix, say $P_2$, we must have
$$S = X P_2$$
The preceding, however, implies that for arbitrary orthogonal matrices $P_1$, $P_2$ the matrix X satisfies
$$XX^T = P_1 XX^T P_1^T, \qquad X^T X = P_2^T X^T X P_2$$
which in turn implies that
$$P_1 = I_m, \qquad P_2 = I_n$$
Thus
$$X = S = Q_r D_r^{1/2} R_r^T = U \Lambda V^T$$
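R Code for Singular Value Decomposition
x <- matrix(c(1, 2, 3, 2, 5, 4, 3, 4, 9), ncol = 3, nrow = 3)
sv <- svd(x)
D <- sv$d  # singular values
U <- sv$u  # left singular vectors (columns)
V <- sv$v  # right singular vectors (columns)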
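The output of `svd()` can be checked against the statements of the theorem. The sketch below reuses the 3 × 3 example matrix and verifies the reconstruction, the link between the singular values and the characteristic roots of X^T X, and the rank as the number of nonzero singular values; the tolerance 1e-10 is an arbitrary choice.

```r
x <- matrix(c(1, 2, 3, 2, 5, 4, 3, 4, 9), ncol = 3, nrow = 3)
sv <- svd(x)

x_rebuilt <- sv$u %*% diag(sv$d) %*% t(sv$v)              # U Lambda V^T
roots <- eigen(crossprod(x), symmetric = TRUE)$values     # characteristic roots of X^T X
rank_x <- sum(sv$d > 1e-10)                               # number of nonzero singular values
```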
Decomposition in Diagram
Matrix A
  LU decomposition (not always unique)
  QR decomposition (full column rank)
  Rectangular: SVD
  Square:
    Symmetric:
      PD: Cholesky decomposition
      Spectral decomposition
    Asymmetric:
      AM > GM: Jordan decomposition
      AM = GM: similar diagonalization, $P^{-1} A P = \Lambda$
Properties of SVD
Rewriting the SVD
$$A = U \Lambda V^T = \sum_{i=1}^{r} \lambda_i u_i v_i^T$$
where
r = rank of A
$\lambda_i$ = the ith diagonal element of Λ.
$u_i$ and $v_i$ are the ith columns of U and V respectively.
Low rank Approximation
Theorem: If $A = U \Lambda V^T$ is the SVD of A and the singular values are sorted as $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$, then for any l < r, the best rank-l approximation to A is
$$\tilde{A} = \sum_{i=1}^{l} \lambda_i u_i v_i^T; \qquad \|A - \tilde{A}\|_F^2 = \sum_{i=l+1}^{r} \lambda_i^2$$
The low rank approximation technique is very important for data compression.
Low-rank Approximation
SVD can be used to compute optimal low-rank approximations.
The approximation $\tilde{A}$ of A is of rank k such that
$$\tilde{A} = \arg\min_{X:\, \mathrm{rank}(X) = k} \|A - X\|_F \qquad \text{(Frobenius norm)}$$
$\tilde{A}$ and X are both m × n matrices.
$$\|A\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2}$$
If $d_1, d_2, \ldots, d_n$ are the characteristic roots of $A^T A$ then
$$\|A\|_F^2 = \sum_{i=1}^{n} d_i$$
Solution via SVD:
$$\tilde{A} = U \, \mathrm{diag}(\lambda_1, \ldots, \lambda_k, 0, \ldots, 0) \, V^T$$
(set the smallest r − k singular values to zero)
[Figure: schematic of the truncated product $\tilde{A} = U \Lambda V^T$ for k = 2]
Column notation: sum of rank 1 matrices
$$\tilde{A} = \sum_{i=1}^{k} \lambda_i u_i v_i^T$$
Approximation error
How good (bad) is this approximation?
It's the best possible, measured by the Frobenius norm of the error:
$$\min_{X:\, \mathrm{rank}(X) = k} \|A - X\|_F^2 = \|A - \tilde{A}\|_F^2 = \sum_{i=k+1}^{r} \lambda_i^2$$
where the $\lambda_i$ are ordered such that $\lambda_i \ge \lambda_{i+1}$.
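The error formula for the truncated SVD can be checked numerically. The random 6 × 5 matrix and the choice k = 2 below are hypothetical; the sketch builds the rank-k approximation from the leading singular triplets and compares the squared Frobenius error with the sum of the discarded squared singular values.

```r
set.seed(7)
A <- matrix(rnorm(30), nrow = 6)   # hypothetical 6 x 5 matrix
s <- svd(A)
k <- 2

# Rank-k approximation: keep the k largest singular values
A_k <- s$u[, 1:k] %*% diag(s$d[1:k], k) %*% t(s$v[, 1:k])

err2 <- sum((A - A_k)^2)           # squared Frobenius norm of the error
tail2 <- sum(s$d[-(1:k)]^2)        # sum of the discarded squared singular values
```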
Row approximation and column approximation
Suppose $R_i$ and $C_j$ represent the ith row and jth column of A. The SVD of A and $\tilde{A}$ is
$$A = U \Lambda V^T = \sum_{k=1}^{r} \lambda_k u_k v_k^T, \qquad \tilde{A} = U_l \Lambda_l V_l^T = \sum_{k=1}^{l} \lambda_k u_k v_k^T$$
The SVD equation for $R_i$ is
$$R_i = \sum_{k=1}^{r} u_{ik} \lambda_k v_k^T$$
where i = 1, ..., m.
We can approximate $R_i$ by
$$R_i^{l} = \sum_{k=1}^{l} u_{ik} \lambda_k v_k^T; \quad l < r$$
Also the SVD equation for $C_j$ is
$$C_j = \sum_{k=1}^{r} v_{jk} \lambda_k u_k$$
where j = 1, 2, ..., n.
We can also approximate $C_j$ by
$$C_j^{l} = \sum_{k=1}^{l} v_{jk} \lambda_k u_k; \quad l < r$$
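Least square solution in inconsistent system
By using SVD we can solve an inconsistent system. This gives the least squares solution
$$\min_x \|Ax - b\|^2$$
The least squares solution is $x = A^g b$, where $A^g$ is the MP (Moore-Penrose) inverse of A.
The SVD of $A^g$ is
$$A^g = V \Lambda^{-1} U^T$$
This can be written as
$$A^g = \sum_{i=1}^{r} \frac{1}{\lambda_i} v_i u_i^T$$
where r is the rank of A, $\lambda_i$ are its positive singular values, and $u_i$, $v_i$ are the corresponding columns of U and V.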
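A least squares solution through the SVD-based generalized inverse can be sketched as follows. The overdetermined system is simulated and purely hypothetical, and the relative tolerance used to decide which singular values count as nonzero is an assumption of this sketch.

```r
set.seed(3)
A <- matrix(rnorm(24), nrow = 8)   # 8 equations, 3 unknowns (inconsistent in general)
b <- rnorm(8)

s <- svd(A)
r <- sum(s$d > 1e-10 * s$d[1])     # numerical rank

# Generalized inverse built from the r positive singular values
A_g <- s$v[, 1:r, drop = FALSE] %*% diag(1 / s$d[1:r], r) %*% t(s$u[, 1:r, drop = FALSE])
x <- A_g %*% b                     # least squares solution

# For comparison: solution of the normal equations
x_normal <- solve(crossprod(A), crossprod(A, b))
```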
Basic Results of SVD
SVD based PCA
If we reduce the variables by using SVD, it performs like PCA.
Suppose X is a mean-centered data matrix. Then, using the SVD $X = U \Lambda V^T$, we can write $XV = U \Lambda$.
Suppose $Y = XV = U \Lambda$.
Then the first column of Y represents the first principal component score, and so on.
o SVD-based PCA is more numerically stable.
o If the number of variables is greater than the number of observations, then SVD-based PCA will give an efficient result (Antti Niemistö, Statistical Analysis of Gene Expression Microarray Data, 2005).
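The relation Y = XV = UΛ can be illustrated on simulated data. The data matrix below is hypothetical; the sketch also compares the variances of the scores with the characteristic roots of the sample covariance matrix, which is how the SVD route connects to the spectral route.

```r
set.seed(11)
# Hypothetical mean-centered data matrix: 20 observations, 3 variables
X <- scale(matrix(rnorm(60), ncol = 3), center = TRUE, scale = FALSE)

s <- svd(X)
Y <- X %*% s$v                    # principal component scores, Y = X V
Y_alt <- s$u %*% diag(s$d)        # the same scores as U Lambda

pc_var <- s$d^2 / (nrow(X) - 1)   # variances of the scores
```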
Application of SVD
Data reduction of both variables and observations.
Solving linear least squares problems.
Image processing and compression.
K-selection for K-means clustering.
Multivariate outlier detection.
Noise filtering.
Trend detection in the observations and the variables.
Origin of biplot
Gabriel (1971)
One of the most
important advances in
data analysis in recent
decades
Currently
> 50,000 web pages
Numerous academic
publications
Included in most
statistical analysis
packages
Still a very new
technique to most
scientists
Prof. Ruben Gabriel, the founder of the biplot
Courtesy of Prof. Purificación Galindo, University of Salamanca, Spain
What is a biplot?
"Biplot" = "bi" + "plot"
"plot": scatter plot of two rows OR of two columns, or scatter plot summarizing the rows OR the columns
"bi": BOTH rows AND columns
1 biplot >> 2 plots
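Practical definition of a biplot
"Any two-way table can be analyzed using a 2D-biplot as soon as it can be sufficiently approximated by a rank-2 matrix." (Gabriel, 1971)
(Now 3D-biplots are also possible)
G-by-E table and its matrix decomposition: P(4, 3) = G(4, 2) × E(2, 3)
$$\underbrace{\begin{pmatrix} 20 & 9 & 6 \\ -6 & 12 & 15 \\ 10 & -6 & -9 \\ 8 & 12 & 12 \end{pmatrix}}_{P:\ \text{rows } g_1, \ldots, g_4;\ \text{columns } e_1, e_2, e_3} = \underbrace{\begin{pmatrix} 4 & 3 \\ 3 & -3 \\ -1 & 3 \\ 4 & 0 \end{pmatrix}}_{G:\ \text{columns } x, y} \underbrace{\begin{pmatrix} 2 & 3 & 3 \\ 4 & -1 & -2 \end{pmatrix}}_{E:\ \text{rows } x, y}$$
[Figure: biplot showing the points G1-G4 and E1-E3 plotted against the X and Y axes, with origin O]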
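The rank-2 structure behind such a biplot can be checked with a few lines of R. The coordinates in G and E below are example values assumed for this sketch (including their signs); the product is a 4 × 3 two-way table whose rank is at most 2.

```r
# Assumed row (genotype) and column (environment) coordinates
G <- matrix(c( 4,  3,
               3, -3,
              -1,  3,
               4,  0), ncol = 2, byrow = TRUE,
            dimnames = list(paste0("g", 1:4), c("x", "y")))
E <- matrix(c(2,  3,  3,
              4, -1, -2), nrow = 2, byrow = TRUE,
            dimnames = list(c("x", "y"), paste0("e", 1:3)))

P <- G %*% E                          # the 4 x 3 two-way table
rank_P <- sum(svd(P)$d > 1e-8)        # numerical rank of the table
```

Plotting the rows of `G` and the columns of `E` in the same x-y plane would give the biplot of this table.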
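Singular Value Decomposition (SVD) & Singular Value Partitioning (SVP)
$$X_{ij} = \underbrace{\sum_{k=1}^{r} u_{ik} \lambda_k v_{kj}}_{\text{SVD}} = \underbrace{\sum_{k=1}^{r} (u_{ik} \lambda_k^{f})(\lambda_k^{1-f} v_{kj})}_{\text{SVP}}$$
Biplot: the two factors of the SVP give the two plots (one for the rows, one for the columns) that together form the biplot.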
The Alternating L1 Regression Algorithm for Robust Singular Value Decomposition
Fit the L1 regression coefficient $c_j$ by minimizing $\sum_{i=1}^{n} |x_{ij} - c_j u_{i1}|$; j = 1, 2, ..., p.
Calculate the right singular vector $v_1 = c / \|c\|$, where $\|\cdot\|$ refers to the Euclidean norm.
Again fit the L1 regression coefficient $d_i$ by minimizing $\sum_{j=1}^{p} |x_{ij} - d_i v_{j1}|$; i = 1, 2, ..., n.
Calculate the resulting estimate of the left singular vector $u_1 = d / \|d\|$.
Iterate this process until it converges.
For the second and subsequent terms of the SVD, we replace X by a deflated matrix obtained by subtracting the most recently found term in the SVD: $X \leftarrow X - \lambda_k u_k v_k^T$.
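The alternating L1 fits for the first pair of singular vectors can be sketched in base R. Several choices here are assumptions of this illustration rather than part of the slides: each one-parameter L1 regression is solved as a weighted median, the starting vector is taken from the row medians, and a fixed number of iterations replaces a convergence test. The helper names `wmedian`, `l1_slope` and `robust_rank1` are hypothetical.

```r
# Weighted median: a minimiser of sum(w * abs(r - m)) over m
wmedian <- function(r, w) {
  o <- order(r)
  cw <- cumsum(w[o])
  r[o][which(cw >= sum(w) / 2)[1]]
}

# L1 regression through the origin: minimise sum(abs(x - s * z)) over the scalar s
l1_slope <- function(x, z) {
  keep <- z != 0
  wmedian(x[keep] / z[keep], abs(z[keep]))
}

# First robust singular triplet by alternating L1 regressions
robust_rank1 <- function(X, iters = 20) {
  u <- apply(X, 1, median)                  # starting value (an assumption)
  u <- u / sqrt(sum(u^2))
  for (it in seq_len(iters)) {
    cvec <- apply(X, 2, l1_slope, z = u)    # one L1 fit per column
    v <- cvec / sqrt(sum(cvec^2))           # right singular vector
    d <- apply(X, 1, l1_slope, z = v)       # one L1 fit per row
    u <- d / sqrt(sum(d^2))                 # left singular vector
  }
  list(u = u, v = v, lambda = sqrt(sum(d^2)))
}

# Rank-1 table with a single gross outlier
X <- outer(1:5, c(1, 2, 3))
X[1, 1] <- 100
fit <- robust_rank1(X)
```

In this example the fitted vectors should follow the underlying rank-1 pattern rather than the outlying cell, which is the motivation for using L1 fits in place of least squares.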
Clustering weather stations on a map using RSVD
References
Brown, B. W., Jr. (1980). Prediction analysis for binary data. In Biostatistics Casebook, R. G. Miller, Jr., B. Efron, B. W. Brown, Jr., L. E. Moses (Eds.), New York: Wiley.
Dhrymes, Phoebus J. (1984). Mathematics for Econometrics, 2nd ed. Springer-Verlag, New York.
Hawkins, D. M., Bradu, D. and Kass, G. V. (1984). Location of several outliers in multiple regression data using elemental sets. Technometrics, 26, 197-208.
Imon, A. H. M. R. (2005). Identifying multiple influential observations in linear regression. Journal of Applied Statistics, 32, 73-90.
Kumar, N., Nasser, M. and Sarker, S. C. (2011). A new singular value decomposition based robust graphical clustering technique and its application in climatic data. Journal of Geography and Geology, Canadian Center of Science and Education, Vol. 3, No. 1, 227-238.
Ryan, T. P. (1997). Modern Regression Methods. Wiley, New York.
Stewart, G. W. (1998). Matrix Algorithms, Vol. 1: Basic Decompositions. SIAM, Philadelphia.
Matrix Decomposition. http://fedc.wiwi.hu-berlin.de/xplore/ebooks/html/csa/node36.html