
Matrix Decomposition and Its Application in Statistics

Nishith Kumar

Lecturer

Department of Statistics

Begum Rokeya University, Rangpur.

Email: nk.bru09@gmail.com


Overview

Introduction

LU decomposition

QR decomposition

Cholesky decomposition

Jordan Decomposition

Spectral decomposition

Singular value decomposition

Applications


Introduction

This lecture covers relevant matrix decompositions, basic numerical methods, their computation and some of their applications. Decompositions provide a numerically stable way to solve a system of linear equations, as shown already in [Wampler, 1970], and to invert a matrix. Additionally, they provide an important tool for analyzing the numerical stability of a system. Here we discuss the LU, QR, Cholesky, Jordan, spectral and singular value decompositions.

Easy to solve system

Some linear systems can be solved easily. For a diagonal system $AX = b$ with $A = \mathrm{diag}(a_{11}, a_{22}, \dots, a_{nn})$, the solution is simply

$$X = \begin{pmatrix} b_1/a_{11} \\ b_2/a_{22} \\ \vdots \\ b_n/a_{nn} \end{pmatrix}$$

Easy to solve system (Cont.)

Lower triangular matrix: a system $LX = b$ with $L$ lower triangular is solved by forward substitution,

$$x_i = \Big(b_i - \sum_{j=1}^{i-1} l_{ij}x_j\Big)\Big/\, l_{ii}, \qquad i = 1, 2, \dots, n.$$

Upper triangular matrix: a system $UX = b$ with $U$ upper triangular is solved by backward substitution,

$$x_i = \Big(b_i - \sum_{j=i+1}^{n} u_{ij}x_j\Big)\Big/\, u_{ii}, \qquad i = n, n-1, \dots, 1.$$
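A minimal R sketch of these two substitutions (base R also provides forwardsolve() and backsolve() for exactly this task):

# Forward substitution for a lower triangular system L x = b
forward_sub <- function(L, b) {
  n <- length(b); x <- numeric(n)
  for (i in 1:n)
    x[i] <- (b[i] - sum(L[i, seq_len(i - 1)] * x[seq_len(i - 1)])) / L[i, i]
  x
}
# Backward substitution for an upper triangular system U x = b
backward_sub <- function(U, b) {
  n <- length(b); x <- numeric(n)
  for (i in n:1) {
    j <- if (i < n) (i + 1):n else integer(0)
    x[i] <- (b[i] - sum(U[i, j] * x[j])) / U[i, i]
  }
  x
}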

LU Decomposition

LU decomposition was originally derived as a decomposition of quadratic and bilinear forms. Lagrange (1736-1813), in the very first paper in his collected works (1759), derives the algorithm we call Gaussian elimination. Later Turing (1912-1954) introduced the LU decomposition of a matrix in 1948, which is used to solve systems of linear equations.

Let A be a square matrix. Then there exist matrices L and U such that A = LU, where L is a lower triangular matrix and U is an upper triangular matrix:

$$U = \begin{pmatrix} u_{11} & u_{12} & \cdots & u_{1m} \\ 0 & u_{22} & \cdots & u_{2m} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & u_{mm} \end{pmatrix} \quad\text{and}\quad L = \begin{pmatrix} l_{11} & 0 & \cdots & 0 \\ l_{21} & l_{22} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ l_{m1} & l_{m2} & \cdots & l_{mm} \end{pmatrix}$$

How to decompose A=LU?

$$A = \begin{pmatrix} 6 & 2 & 2 \\ 12 & 8 & 6 \\ 3 & 13 & 2 \end{pmatrix}$$

Now reduce A to an upper triangular matrix U by elementary row operations: $U = E_k \cdots E_1 A$, so $A = (E_1)^{-1}\cdots(E_k)^{-1}U$. Here

$$E_1 A = \begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -1/2 & 0 & 1 \end{pmatrix}\begin{pmatrix} 6 & 2 & 2 \\ 12 & 8 & 6 \\ 3 & 13 & 2 \end{pmatrix} = \begin{pmatrix} 6 & 2 & 2 \\ 0 & 4 & 2 \\ 0 & 12 & 1 \end{pmatrix}$$

$$U = E_2 E_1 A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -3 & 1 \end{pmatrix}\begin{pmatrix} 6 & 2 & 2 \\ 0 & 4 & 2 \\ 0 & 12 & 1 \end{pmatrix} = \begin{pmatrix} 6 & 2 & 2 \\ 0 & 4 & 2 \\ 0 & 0 & -5 \end{pmatrix}$$

It can be proved that $(E_1)^{-1}, \dots, (E_k)^{-1}$ are lower triangular, and $(E_1)^{-1}\cdots(E_k)^{-1}$ is a lower triangular matrix. Let $L = (E_1)^{-1}\cdots(E_k)^{-1}$; then A = LU.

Calculation of L and U (cont.)

$$A = \begin{pmatrix} 6 & 2 & 2 \\ 12 & 8 & 6 \\ 3 & 13 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 6 & 2 & 2 \\ 12 & 8 & 6 \\ 3 & 13 & 2 \end{pmatrix}$$

$$\begin{pmatrix} 6 & 2 & 2 \\ 0 & 4 & 2 \\ 0 & 12 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -1/2 & 0 & 1 \end{pmatrix}\begin{pmatrix} 6 & 2 & 2 \\ 12 & 8 & 6 \\ 3 & 13 & 2 \end{pmatrix}$$

$$\begin{pmatrix} 6 & 2 & 2 \\ 0 & 4 & 2 \\ 0 & 0 & -5 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -3 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -1/2 & 0 & 1 \end{pmatrix}\begin{pmatrix} 6 & 2 & 2 \\ 12 & 8 & 6 \\ 3 & 13 & 2 \end{pmatrix}$$

Calculation of L and U (cont.)

Now,

$$L = (E_1)^{-1}(E_2)^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -1/2 & 0 & 1 \end{pmatrix}^{-1}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -3 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1/2 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 3 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1/2 & 3 & 1 \end{pmatrix}$$

Therefore,

$$A = \begin{pmatrix} 6 & 2 & 2 \\ 12 & 8 & 6 \\ 3 & 13 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1/2 & 3 & 1 \end{pmatrix}\begin{pmatrix} 6 & 2 & 2 \\ 0 & 4 & 2 \\ 0 & 0 & -5 \end{pmatrix} = LU$$

If A is a nonsingular matrix, then for each L (lower triangular matrix) the upper triangular matrix U is unique, but an LU decomposition itself is not unique: there can be more than one such LU decomposition for a matrix. For example,

$$A = \begin{pmatrix} 6 & 2 & 2 \\ 12 & 8 & 6 \\ 3 & 13 & 2 \end{pmatrix} = \begin{pmatrix} 6 & 0 & 0 \\ 12 & 1 & 0 \\ 3 & 3 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2/6 & 2/6 \\ 0 & 4 & 2 \\ 0 & 0 & -5 \end{pmatrix} = LU$$

Calculation of L and U (cont.)

Thus the LU decomposition is not unique. Since we compute the LU decomposition by elementary transformations, if we change L then U will change as well, so that A = LU still holds. To make the decomposition unique we can put some restrictions on the L and U matrices; for example, we can require the lower triangular matrix L to be a unit one (i.e. set all the entries of its main diagonal to ones).

LU Decomposition in R:

library(Matrix)                      # provides lu() and expand()
x <- matrix(c(3,2,1, 9,3,4, 4,2,5), ncol = 3, nrow = 3)
expand(lu(x))                        # factors P, L and U of the pivoted decomposition

Calculation of L and U (cont.)

Note: there are also generalizations of LU to non-square and singular matrices, such as rank revealing LU factorization.

[Pan, C.T. (2000). On the existence and computation of rank revealing LU factorizations. Linear Algebra and its Applications, 316: 199-222.
Miranian, L. and Gu, M. (2003). Strong rank revealing LU factorizations. Linear Algebra and its Applications, 367: 1-16.]

The LU decomposition is mainly used to solve systems of simultaneous linear equations. We can also find the determinant easily by using LU decomposition: det(A) is the product of the diagonal elements of the upper and lower triangular matrices.
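To see the determinant property in practice, a small sketch (assuming the Matrix package loaded above; det() is used only for comparison, and the two values agree up to the sign of the permutation):

f <- expand(lu(matrix(c(3,2,1, 9,3,4, 4,2,5), 3, 3)))
prod(diag(f$L)) * prod(diag(f$U))            # product of the diagonal elements of L and U
det(matrix(c(3,2,1, 9,3,4, 4,2,5), 3, 3))    # equal up to the sign of the permutation P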

Solving systems of linear equations using LU decomposition

Suppose we would like to solve an m×m system AX = b. We first find an LU decomposition of A; then, to solve AX = b, it is enough to solve the two triangular systems LY = b and UX = Y. The system LY = b can be solved by the method of forward substitution and the system UX = Y by the method of backward substitution. To illustrate, we give an example.

Consider the given system AX = b, where

$$A = \begin{pmatrix} 6 & 2 & 2 \\ 12 & 8 & 6 \\ 3 & 13 & 2 \end{pmatrix} \quad\text{and}\quad b = \begin{pmatrix} 8 \\ 14 \\ -17 \end{pmatrix}$$

Solving systems of linear equations using LU decomposition

We have seen A = LU, where

$$L = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1/2 & 3 & 1 \end{pmatrix}, \qquad U = \begin{pmatrix} 6 & 2 & 2 \\ 0 & 4 & 2 \\ 0 & 0 & -5 \end{pmatrix}$$

First we solve LY = b by forward substitution:

$$\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1/2 & 3 & 1 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 8 \\ 14 \\ -17 \end{pmatrix}$$

Then

$$Y = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 8 \\ -2 \\ -15 \end{pmatrix}$$

Solving systems of linear equations using LU decomposition

Now we solve UX = Y by backward substitution:

$$\begin{pmatrix} 6 & 2 & 2 \\ 0 & 4 & 2 \\ 0 & 0 & -5 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 8 \\ -2 \\ -15 \end{pmatrix}, \quad\text{then}\quad \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 \\ -2 \\ 3 \end{pmatrix}$$
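A minimal R sketch of the two-stage solve, using the base functions forwardsolve() and backsolve() with the L, U and b of this example:

L <- matrix(c(1, 2, 1/2, 0, 1, 3, 0, 0, 1), 3, 3)   # lower triangular factor (filled column-wise)
U <- matrix(c(6, 0, 0, 2, 4, 0, 2, 2, -5), 3, 3)    # upper triangular factor (filled column-wise)
b <- c(8, 14, -17)
y <- forwardsolve(L, b)   # LY = b  ->  y = (8, -2, -15)
x <- backsolve(U, y)      # UX = Y  ->  x = (1, -2, 3)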

QR Decomposition

The QR decomposition originated with Jørgen Pedersen Gram (1850-1916) in 1883; later Erhard Schmidt (1876-1959) proved the QR decomposition theorem in 1907.

Theorem: If A is an m×n matrix with linearly independent columns, then A can be decomposed as A = QR, where Q is an m×n matrix whose columns form an orthonormal basis for the column space of A and R is a nonsingular upper triangular matrix.

QR-Decomposition (Cont.)

Theorem: If A is an m×n matrix with linearly independent columns u1, u2, ..., un, then A can be decomposed as A = QR, where Q is an m×n matrix whose columns form an orthonormal basis for the column space of A and R is a nonsingular upper triangular matrix.

Proof: Apply the Gram-Schmidt process to {u1, u2, ..., un}; the orthogonal vectors v1, v2, ..., vn are

$$v_i = u_i - \frac{\langle u_i, v_1 \rangle}{\|v_1\|^2}v_1 - \frac{\langle u_i, v_2 \rangle}{\|v_2\|^2}v_2 - \cdots - \frac{\langle u_i, v_{i-1} \rangle}{\|v_{i-1}\|^2}v_{i-1}$$

Let $q_i = v_i/\|v_i\|$ for i = 1, 2, ..., n. Thus q1, q2, ..., qn form an orthonormal basis for the column space of A.

QR-Decomposition (Cont.)

Now,

$$u_i = v_i + \frac{\langle u_i, v_1 \rangle}{\|v_1\|^2}v_1 + \frac{\langle u_i, v_2 \rangle}{\|v_2\|^2}v_2 + \cdots + \frac{\langle u_i, v_{i-1} \rangle}{\|v_{i-1}\|^2}v_{i-1}$$

i.e.,

$$u_i = \|v_i\| q_i + \langle u_i, q_1 \rangle q_1 + \langle u_i, q_2 \rangle q_2 + \cdots + \langle u_i, q_{i-1} \rangle q_{i-1}$$

and $u_i \in \mathrm{span}\{v_1, v_2, \dots, v_i\} = \mathrm{span}\{q_1, q_2, \dots, q_i\}$. Thus $u_i$ is orthogonal to $q_j$ for j > i, and

$$u_1 = \|v_1\| q_1$$
$$u_2 = \|v_2\| q_2 + \langle u_2, q_1 \rangle q_1$$
$$u_3 = \|v_3\| q_3 + \langle u_3, q_1 \rangle q_1 + \langle u_3, q_2 \rangle q_2$$
$$\vdots$$
$$u_n = \|v_n\| q_n + \langle u_n, q_1 \rangle q_1 + \langle u_n, q_2 \rangle q_2 + \cdots + \langle u_n, q_{n-1} \rangle q_{n-1}$$

QR-Decomposition (Cont.)

Let Q = [q1 q2 ... qn], so Q is an m×n matrix whose columns form an orthonormal basis for the column space of A. Now,

$$A = \begin{pmatrix} u_1 & u_2 & \cdots & u_n \end{pmatrix} = \begin{pmatrix} q_1 & q_2 & \cdots & q_n \end{pmatrix}\begin{pmatrix} \|v_1\| & \langle u_2, q_1 \rangle & \langle u_3, q_1 \rangle & \cdots & \langle u_n, q_1 \rangle \\ 0 & \|v_2\| & \langle u_3, q_2 \rangle & \cdots & \langle u_n, q_2 \rangle \\ 0 & 0 & \|v_3\| & \cdots & \langle u_n, q_3 \rangle \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \|v_n\| \end{pmatrix}$$

i.e., A = QR, where

$$R = \begin{pmatrix} \|v_1\| & \langle u_2, q_1 \rangle & \cdots & \langle u_n, q_1 \rangle \\ 0 & \|v_2\| & \cdots & \langle u_n, q_2 \rangle \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \|v_n\| \end{pmatrix}$$

Since the diagonal elements $\|v_i\|$ are all nonzero, R is a nonsingular matrix.

QR Decomposition (Example)

Example: Find the QR decomposition of

$$A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Calculation of QR Decomposition

Applying the Gram-Schmidt process to the columns a1, a2, a3 of A:

1st step: $r_{11} = \|a_1\| = \sqrt{3}$

2nd step: $q_1 = a_1/\|a_1\| = (1/\sqrt{3},\ 1/\sqrt{3},\ 1/\sqrt{3},\ 0)^T$

3rd step:

$$\tilde q_2 = a_2 - q_1 q_1^T a_2 = a_2 - r_{12}\, q_1, \qquad r_{12} = q_1^T a_2 = 2/\sqrt{3}$$
$$\tilde q_2 = (1,\ 0,\ 1,\ 0)^T - \tfrac{2}{3}(1,\ 1,\ 1,\ 0)^T = (1/3,\ -2/3,\ 1/3,\ 0)^T$$
$$r_{22} = \|\tilde q_2\| = \sqrt{2/3}, \qquad q_2 = \tilde q_2/\|\tilde q_2\| = (1/\sqrt{6},\ -2/\sqrt{6},\ 1/\sqrt{6},\ 0)^T$$

Calculation of QR Decomposition (Cont.)

4th step: $r_{13} = q_1^T a_3 = 1/\sqrt{3}$

5th step: $r_{23} = q_2^T a_3 = 1/\sqrt{6}$

6th step:

$$\tilde q_3 = a_3 - q_1 q_1^T a_3 - q_2 q_2^T a_3 = a_3 - r_{13}\, q_1 - r_{23}\, q_2 = (1/2,\ 0,\ -1/2,\ 1)^T$$
$$r_{33} = \|\tilde q_3\| = \sqrt{6}/2, \qquad q_3 = \tilde q_3/\|\tilde q_3\| = (1/\sqrt{6},\ 0,\ -1/\sqrt{6},\ 2/\sqrt{6})^T$$

Calculation of QR Decomposition (Cont.)

Therefore, A = QR:

$$\begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1/\sqrt{3} & 1/\sqrt{6} & 1/\sqrt{6} \\ 1/\sqrt{3} & -2/\sqrt{6} & 0 \\ 1/\sqrt{3} & 1/\sqrt{6} & -1/\sqrt{6} \\ 0 & 0 & 2/\sqrt{6} \end{pmatrix}\begin{pmatrix} \sqrt{3} & 2/\sqrt{3} & 1/\sqrt{3} \\ 0 & 2/\sqrt{6} & 1/\sqrt{6} \\ 0 & 0 & \sqrt{6}/2 \end{pmatrix}$$

R code for QR Decomposition:

x <- matrix(c(1,2,3, 2,5,4, 3,4,9), ncol = 3, nrow = 3)
qrstr <- qr(x)      # compact QR object
Q <- qr.Q(qrstr)    # orthonormal factor Q
R <- qr.R(qrstr)    # upper triangular factor R

The QR decomposition is used to compute the eigenvalues of a matrix, to solve linear systems, and to find least squares approximations.

Least square solution using QR Decomposition

The least square solution of b satisfies the normal equations

$$X^T X b = X^T Y$$

Substituting X = QR,

$$X^T X b = (QR)^T (QR) b = R^T Q^T Q R b = R^T R b$$
$$X^T Y = (QR)^T Y = R^T Q^T Y$$

Therefore,

$$R^T R b = R^T Q^T Y \;\Rightarrow\; (R^T)^{-1} R^T R b = (R^T)^{-1} R^T Q^T Y \;\Rightarrow\; R b = Q^T Y = Z$$

and b is found from Rb = Z by backward substitution, since R is upper triangular.
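A minimal R sketch of this least squares solve (the design matrix X and response y here are made up for illustration):

set.seed(1)
X <- cbind(1, rnorm(20), rnorm(20))             # hypothetical design matrix with intercept
y <- rnorm(20)                                  # hypothetical response
qx <- qr(X)
b  <- backsolve(qr.R(qx), t(qr.Q(qx)) %*% y)    # solve Rb = Q'y by backward substitution
# compare with coef(lm(y ~ X - 1)) or qr.solve(X, y)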

Cholesky Decomposition

Cholesky died from wounds received on the battlefield on 31 August 1918 at 5 o'clock in the morning in the north of France. After his death one of his fellow officers, Commandant Benoit, published Cholesky's method of computing solutions to the normal equations for some least squares data fitting problems in the Bulletin géodésique in 1924, which is known as the Cholesky decomposition.

Theorem: If A is an n×n real, symmetric and positive definite matrix, then there exists a unique lower triangular matrix L with positive diagonal elements such that $A = LL^T$.

André-Louis Cholesky (1875-1918)

Cholesky Decomposition

Theorem: If A is an n×n real, symmetric and positive definite matrix, then there exists a unique lower triangular matrix G with positive diagonal elements such that $A = GG^T$.

Proof: Since A is positive definite it is nonsingular, so it has an LU decomposition A = LU. Take the lower triangular matrix L to be a unit one (i.e. set all the entries of its main diagonal to ones); in that case the LU decomposition is unique. Let $D = \mathrm{diag}(u_{11}, u_{22}, \dots, u_{nn})$ and observe that $M^T = D^{-1}U$ is a unit upper triangular matrix. Thus $A = LDM^T$. Since A is symmetric, $A = A^T$, i.e., $LDM^T = MDL^T$. From the uniqueness we have L = M, so $A = LDL^T$. Since A is positive definite, all diagonal elements of D are positive. Let $G = L\,\mathrm{diag}(\sqrt{d_{11}}, \sqrt{d_{22}}, \dots, \sqrt{d_{nn}})$, where $d_{ii} = u_{ii}$; then $A = GG^T$.

Cholesky Decomposition (Cont.)

Procedure to find the Cholesky decomposition: write $A = LL^T$ element-wise,

$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} = \begin{pmatrix} l_{11} & 0 & \cdots & 0 \\ l_{21} & l_{22} & \cdots & 0 \\ \vdots & & & \vdots \\ l_{n1} & l_{n2} & \cdots & l_{nn} \end{pmatrix}\begin{pmatrix} l_{11} & l_{21} & \cdots & l_{n1} \\ 0 & l_{22} & \cdots & l_{n2} \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & l_{nn} \end{pmatrix}$$

and solve the resulting equations for the $l_{jk}$ column by column.

Example of Cholesky Decomposition

Suppose

$$A = \begin{pmatrix} 4 & 2 & -2 \\ 2 & 10 & 2 \\ -2 & 2 & 5 \end{pmatrix}$$

For k from 1 to n:

$$l_{kk} = \Big(a_{kk} - \sum_{s=1}^{k-1} l_{ks}^2\Big)^{1/2}$$

For j from k+1 to n:

$$l_{jk} = \Big(a_{jk} - \sum_{s=1}^{k-1} l_{js} l_{ks}\Big)\Big/\, l_{kk}$$

Then the Cholesky factor is

$$L = \begin{pmatrix} 2 & 0 & 0 \\ 1 & 3 & 0 \\ -1 & 1 & \sqrt{3} \end{pmatrix}$$

R code for Cholesky Decomposition:

x <- matrix(c(4,2,-2, 2,10,2, -2,2,5), ncol = 3, nrow = 3)
cl <- chol(x)    # chol() returns the upper triangular factor; t(cl) is L

The related $LDL^T$ factorization of the same matrix has

$$L = \begin{pmatrix} 1 & 0 & 0 \\ 1/2 & 1 & 0 \\ -1/2 & 1/3 & 1 \end{pmatrix} \quad\text{and}\quad D = \begin{pmatrix} 4 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & 3 \end{pmatrix}$$
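A small sketch of how the factor is used to solve Ax = b for this positive definite A (the right-hand side b is made up for illustration):

b <- c(2, 14, 5)             # hypothetical right-hand side
R <- chol(x)                 # x = t(R) %*% R
y <- forwardsolve(t(R), b)   # solve L y = b with L = t(R)
sol <- backsolve(R, y)       # solve R sol = y
# compare with solve(x, b)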

Application of Cholesky Decomposition

Cholesky decomposition is used to solve the system of linear equations Ax = b, where A is real symmetric and positive definite.

In regression analysis it could be used to estimate the parameters if $X^TX$ is positive definite.

In kernel principal component analysis, (incomplete) Cholesky decomposition is also used (Weiya Shi; Yue-Fei Guo; 2010).

Characteristic Roots and Characteristic Vectors

Any nonzero vector x is said to be a characteristic vector of a matrix A if there exists a number λ such that Ax = λx; λ is then called a characteristic root of the matrix A corresponding to the characteristic vector x.

For λ = λi, the characteristic vector is the solution x of the following homogeneous system of linear equations: (A - λiI)x = 0.

Theorem: If A is a real symmetric matrix and λi and λj are two distinct latent roots of A, then the corresponding latent vectors xi and xj are orthogonal.

Multiplicity

Algebraic multiplicity: the number of repetitions of a certain eigenvalue. If, for a certain matrix, λ = {3, 3, 4}, then the algebraic multiplicity of 3 is 2 (as it appears twice) and the algebraic multiplicity of 4 is 1 (as it appears once). This type of multiplicity is normally represented by the Greek letter α, where α(λi) represents the algebraic multiplicity of λi.

Geometric multiplicity: the geometric multiplicity of an eigenvalue is the number of linearly independent eigenvectors associated with it.

Jordan Decomposition

Camille Jordan (1838-1921) introduced this decomposition in 1870.

Theorem: Let A be any n×n matrix. Then there exist a nonsingular matrix P and k×k Jordan blocks of the form

$$J_k(\lambda) = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & \cdots & 0 & \lambda \end{pmatrix}$$

such that

$$P^{-1}AP = \begin{pmatrix} J_{k_1}(\lambda_1) & 0 & \cdots & 0 \\ 0 & J_{k_2}(\lambda_2) & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & J_{k_r}(\lambda_r) \end{pmatrix}$$

where k1 + k2 + ... + kr = n. Here λi, i = 1, 2, ..., r, are the characteristic roots and ki are the algebraic multiplicities of λi.

Jordan decomposition is used in differential equations and time series analysis.

Spectral Decomposition

A.L. Cauchy (1789-1857) established the spectral decomposition in 1829.

Theorem: If A is an n×n real symmetric matrix, then there exists an orthogonal matrix P such that $P^TAP = \Lambda$ or $A = P\Lambda P^T$, where $\Lambda$ is a diagonal matrix.

Spectral Decomposition and Principal component Analysis (Cont.)

By using spectral decomposition we can write $A = P\Lambda P^T$.

Suppose X is a mean-centered data matrix, i.e., $X \to (X - \mu)$, and its variance-covariance matrix is $\Sigma$. The variance-covariance matrix $\Sigma$ is real and symmetric, so by spectral decomposition we can write $\Sigma = P\Lambda P^T$, where $\Lambda$ is a diagonal matrix, $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$, with

$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$$

Spectral Decomposition and Principal component Analysis (Cont.)

The principal component transformation is the transformation

Y = (X - μ)P

where
E(Yi) = 0
V(Yi) = λi
Cov(Yi, Yj) = 0 if i ≠ j
V(Y1) ≥ V(Y2) ≥ ... ≥ V(Yn)

$$\sum_{i=1}^{n} V(Y_i) = \mathrm{tr}(\Lambda) = \sum_{i=1}^{n} \lambda_i$$
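A compact R sketch of this transformation on a generic data matrix (the iris measurements are used here only as a stand-in data set):

X <- scale(iris[, 1:4], center = TRUE, scale = FALSE)   # mean-centered data
S <- cov(X)                                             # variance-covariance matrix
P <- eigen(S)$vectors                                   # orthogonal matrix of eigenvectors
Y <- X %*% P                                            # principal component scores
diag(cov(Y))                                            # ~ eigen(S)$values, in decreasing order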

R code for Spectral Decomposition:

x <- matrix(c(1,2,3, 2,5,4, 3,4,9), ncol = 3, nrow = 3)
eigen(x)    # returns the characteristic roots ($values) and vectors ($vectors)

Application:
Data reduction.
Image processing and compression.
K-selection for K-means clustering.
Multivariate outliers detection.
Noise filtering.
Trend detection in the observations.

Historical background of SVD

There are five mathematicians who were responsible for establishing the existence of the singular value decomposition and developing its theory: Eugenio Beltrami (1835-1899), Camille Jordan (1838-1921), James Joseph Sylvester (1814-1897), Erhard Schmidt (1876-1959) and Hermann Weyl (1885-1955).

The singular value decomposition was originally developed by two mathematicians in the mid to late 1800s: 1. Eugenio Beltrami, 2. Camille Jordan. Several other mathematicians took part in the final developments of the SVD, including James Joseph Sylvester, Erhard Schmidt and Hermann Weyl, who studied the SVD into the mid-1900s; C. Eckart (with G. Young) later extended it to rectangular matrices.

What is SVD?

Any real m×n matrix X, where n ≤ m, can be decomposed as

$$X = U\Lambda V^T$$

U is an m×n column-orthonormal matrix ($U^TU = I$), containing the eigenvectors of the symmetric matrix $XX^T$.

$\Lambda$ is an n×n diagonal matrix, containing the singular values of matrix X. The number of nonzero diagonal elements of $\Lambda$ corresponds to the rank of X.

$V^T$ is an n×n row-orthonormal matrix ($V^TV = I$), containing the eigenvectors of the symmetric matrix $X^TX$.

Singular Value Decomposition (Cont.)

Theorem (Singular Value Decomposition): Let X be m×n of rank r, r ≤ n ≤ m. Then there exist matrices U, V and a diagonal matrix $\Lambda$ with positive diagonal elements such that $X = U\Lambda V^T$.

Proof: $XX^T$ and $X^TX$ are both symmetric positive semidefinite matrices of rank r (by using the concept of the Grammian matrix) and of dimension m×m and n×n respectively. Since $XX^T$ is a real symmetric matrix, we can write, by spectral decomposition,

$$XX^T = QDQ^T$$

where Q and D are, respectively, the matrices of characteristic vectors and corresponding characteristic roots of $XX^T$. Again, since $X^TX$ is a real symmetric matrix, we can write, by spectral decomposition,

$$X^TX = RMR^T$$

Singular Value Decomposition (Cont.)

where R is the (orthogonal) matrix of characteristic vectors and M is the diagonal matrix of the corresponding characteristic roots. Since $XX^T$ and $X^TX$ are both of rank r, only r of their characteristic roots are positive, the remaining being zero. Hence we can write

$$D = \begin{pmatrix} D_r & 0 \\ 0 & 0 \end{pmatrix}$$

Also we can write

$$M = \begin{pmatrix} M_r & 0 \\ 0 & 0 \end{pmatrix}$$

Singular Value Decomposition (Cont.)

We know that the nonzero characteristic roots of $XX^T$ and $X^TX$ are equal, so $D_r = M_r$. Partition Q and R conformably with D and M, respectively, i.e., $Q = (Q_r, Q_*)$, $R = (R_r, R_*)$, such that $Q_r$ is m×r, $R_r$ is n×r, and they correspond respectively to the nonzero characteristic roots of $XX^T$ and $X^TX$. Now take

$$U = Q_r, \qquad V = R_r, \qquad \Lambda = D_r^{1/2} = \mathrm{diag}(d_1^{1/2}, d_2^{1/2}, \dots, d_r^{1/2})$$

where $d_i$, i = 1, 2, ..., r, are the positive characteristic roots of $XX^T$ and hence those of $X^TX$ as well (by using the concept of the Grammian matrix).

Singular Value Decomposition (Cont.)

Now define $S = Q_r D_r^{1/2} R_r^T$. Then

$$S^TS = (Q_r D_r^{1/2} R_r^T)^T\, Q_r D_r^{1/2} R_r^T = R_r D_r^{1/2} Q_r^T Q_r D_r^{1/2} R_r^T = R_r D_r R_r^T = R_r M_r R_r^T = RMR^T = X^TX$$

Similarly,

$$SS^T = XX^T$$

From the first relation above we conclude that for an arbitrary orthogonal matrix, say $P_1$, $S = P_1 X$, while from the second we conclude that for an arbitrary orthogonal matrix, say $P_2$, we must have $S = XP_2$.

Singular Value Decomposition (Cont.)

The preceding relations imply that for the orthogonal matrices $P_1$, $P_2$ the matrix X satisfies

$$XX^T = P_1^T XX^T P_1, \qquad X^TX = P_2^T X^TX P_2$$

so that S can be identified with X, giving $X = Q_r D_r^{1/2} R_r^T = U\Lambda V^T$, which completes the proof.

R Code for Singular Value Decomposition

sv <- svd(x)
D <- sv$d    # vector of singular values
U <- sv$u    # matrix of left singular vectors
V <- sv$v    # matrix of right singular vectors

Decomposition in Diagram

(Flow chart.) A matrix A is first classified as rectangular or square. A rectangular matrix with full column rank admits a QR decomposition, and any rectangular matrix admits an SVD. A square matrix admits an LU decomposition (not always unique). A square symmetric matrix that is positive definite (PD) admits a Cholesky decomposition, and any symmetric matrix admits a spectral decomposition. For the asymmetric case, if the algebraic multiplicity exceeds the geometric multiplicity (AM > GM) we use the Jordan decomposition, and if AM = GM the matrix is similar to a diagonal matrix, i.e., diagonalization $P^{-1}AP = \Lambda$.

Properties Of SVD

$$A = U\Lambda V^T = \sum_{i=1}^{r} \lambda_i u_i v_i^T$$

where
r = rank of A,
$\lambda_i$ = the i-th diagonal element of $\Lambda$,
$u_i$ and $v_i$ are the i-th columns of U and V respectively.

Properties of SVD: Low rank Approximation

Theorem: If $A = U\Lambda V^T$ is the SVD of A and the singular values are sorted as $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$, then for any l < r, the best rank-l approximation to A is

$$\tilde A = \sum_{i=1}^{l} \lambda_i u_i v_i^T, \qquad \|A - \tilde A\|^2 = \sum_{i=l+1}^{r} \lambda_i^2$$

This property is important for data compression.

Low-rank Approximation

SVD can be used to compute optimal low-rank approximations. The best rank-k approximation of A is

$$\tilde A = \arg\min_{X:\,\mathrm{rank}(X)=k} \|A - X\|_F$$

where $\|\cdot\|_F$ is the Frobenius norm,

$$\|A\|_F = \sqrt{\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}^2}$$

If $d_1, d_2, \dots, d_n$ are the characteristic roots of $A^TA$, then $\|A\|_F = \sqrt{\sum_{i=1}^{n} d_i}$.

Low-rank Approximation

Solution via SVD: set the smallest r - k singular values to zero,

$$\tilde A = U\, \mathrm{diag}(\lambda_1, \dots, \lambda_k, 0, \dots, 0)\, V^T$$

(The slide illustrates this with a schematic $X = U\Lambda V^T$ diagram for k = 2.)

In column notation this is a sum of k rank-1 matrices:

$$\tilde A = \sum_{i=1}^{k} \lambda_i u_i v_i^T$$
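A short R sketch of this truncation for a generic matrix A and rank k (the function name is illustrative):

low_rank <- function(A, k) {
  sv <- svd(A)
  # keep the k leading singular triplets; drop = FALSE keeps matrix shape when k = 1
  sv$u[, 1:k, drop = FALSE] %*% diag(sv$d[1:k], k, k) %*% t(sv$v[, 1:k, drop = FALSE])
}
A2 <- low_rank(matrix(rnorm(30), 5, 6), k = 2)   # best rank-2 approximation in Frobenius norm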

Approximation error

How good (bad) is this approximation? It is the best possible, measured by the Frobenius norm of the error:

$$\min_{X:\,\mathrm{rank}(X)=k} \|A - X\|_F = \|A - \tilde A\|_F = \sqrt{\sum_{i=k+1}^{r} \lambda_i^2}$$

where the $\lambda_i$ are ordered such that $\lambda_i \ge \lambda_{i+1}$. In the spectral norm, the error is $\|A - \tilde A\|_2 = \lambda_{k+1}$.

Row approximation and column approximation

Suppose $R_i$ and $C_j$ represent the i-th row and j-th column of A. The SVD of A and of $\tilde A$ is

$$A = U\Lambda V^T = \sum_{k=1}^{r} u_k \lambda_k v_k^T, \qquad \tilde A = U_l \Lambda_l V_l^T = \sum_{k=1}^{l} u_k \lambda_k v_k^T$$

The SVD equation for the rows is

$$R_i = \sum_{k=1}^{r} u_{ik}\lambda_k v_k$$

and we can approximate $R_i$ by

$$R_i^l = \sum_{k=1}^{l} u_{ik}\lambda_k v_k; \qquad l < r,\ i = 1, \dots, m.$$

Also, the SVD equation for $C_j$ is

$$C_j = \sum_{k=1}^{r} v_{jk}\lambda_k u_k, \qquad j = 1, 2, \dots, n$$

and we can also approximate $C_j$ by

$$C_j^l = \sum_{k=1}^{l} v_{jk}\lambda_k u_k; \qquad l < r.$$

Least square solution in inconsistent system

By using SVD we can solve an inconsistent system; this gives the least square solution of

$$\min_x \|Ax - b\|^2$$

The solution can be written as $x = A^g b$, where $A^g$ is the generalized inverse of A. If $A = U\Lambda V^T$, the SVD of $A^g$ is

$$A^g = V\Lambda^{-1}U^T$$

where $\Lambda^{-1} = \mathrm{diag}(1/\lambda_1, \dots, 1/\lambda_r)$ inverts the positive singular values.
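A sketch of this least squares solve in R (A and b are made up for illustration; the construction assumes all singular values are nonzero):

A  <- matrix(rnorm(12), 4, 3)                 # hypothetical 4x3 design matrix
b  <- rnorm(4)                                # hypothetical right-hand side
sv <- svd(A)
Ag <- sv$v %*% diag(1 / sv$d) %*% t(sv$u)     # generalized inverse V diag(1/lambda) U'
x  <- Ag %*% b                                # least square solution; compare qr.solve(A, b)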

Basic Results of SVD

SVD based PCA

Decompose the mean-centered data matrix X using SVD, $X = U\Lambda V^T$. Then we can write $XV = U\Lambda$. Suppose $Y = XV = U\Lambda$. Then the first column of Y represents the first principal component score, and so on.

If the number of variables is greater than the number of observations, then SVD-based PCA will give an efficient result (Antti Niemistö, Statistical Analysis of Gene Expression Microarray Data, 2005).
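A brief R sketch of SVD-based PC scores, mirroring the eigen-based version shown in the spectral decomposition section (iris again serves only as a stand-in data set):

X  <- scale(iris[, 1:4], center = TRUE, scale = FALSE)  # mean-centered data
sv <- svd(X)
Y  <- X %*% sv$v        # PC scores; identical to sv$u %*% diag(sv$d)
head(Y[, 1])            # first principal component scores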

Application of SVD

Solving linear least squares problems.
Image processing and compression.
K-selection for K-means clustering.
Multivariate outliers detection.
Noise filtering.
Trend detection in the observations and the variables.

Origin of biplot

The biplot was introduced by Gabriel (1971). It is one of the most important advances in data analysis in recent decades. Currently there are more than 50,000 web pages and numerous academic publications on it, and it is included in most statistical analysis packages; yet it is still a very new technique to most scientists.

Prof. K. Ruben Gabriel, the founder of the biplot. Courtesy of Prof. Purificación Galindo, University of Salamanca, Spain.

What is a biplot?

Biplot = bi + plot

plot: a scatter plot of two rows OR of two columns, or a scatter plot summarizing the rows OR the columns.
bi: BOTH rows AND columns on the same plot.

1 biplot >> 2 plots

Practical definition of a biplot

"Any two-way table can be analyzed using a 2D-biplot as soon as it can be sufficiently approximated by a rank-2 matrix." (Gabriel, 1971) (Now 3D-biplots are also possible.)

Matrix decomposition example: a genotype-by-environment (G-by-E) two-way table P(4×3), with genotypes g1-g4 in the rows and environments e1-e3 in the columns, is decomposed as P = GE, where G holds the genotype scores (x, y) and E the environment scores; plotting the G rows (G1-G4) and E columns (E1-E3) together gives the biplot. (The original slide shows the numerical table with its factors G and E.)

Singular Value Decomposition (SVD) & Singular Value Partitioning (SVP)

SVD:

$$x_{ij} = \sum_{k=1}^{r} u_{ik}\lambda_k v_{kj}$$

where r is the rank of X, i.e., the minimum number of PCs required to fully represent X; the $u_{ik}$ characterize the rows, the $\lambda_k$ are the singular values, and the $v_{kj}$ characterize the columns.

SVP:

$$x_{ij} = \sum_{k=1}^{r} (u_{ik}\lambda_k^{f})(\lambda_k^{1-f} v_{kj})$$

where f is the partitioning factor; common values are f = 1, f = 0 and f = 1/2. The row scores $u_{ik}\lambda_k^{f}$ and column scores $\lambda_k^{1-f}v_{kj}$ are then shown in a plot, or together in a biplot.

Biplot

The simplest biplot shows the first two PCs together with the projections of the axes of the original variables:
the x-axis represents the scores for the first principal component;
the y-axis represents the scores for the second principal component;
the original variables are represented by arrows which graphically indicate the proportion of the original variance explained by the first two principal components;
the direction of the arrows indicates the relative loadings on the first and second principal components.

A biplot thus represents the data i) graphically, ii) effectively, iii) conveniently.
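In R such a biplot is a one-liner; a quick illustration on the iris measurements (the plot appears on the next slide):

pc <- prcomp(iris[, 1:4])   # PCA of the four iris measurements
biplot(pc)                  # PC1/PC2 scores plus arrows for the variable loadings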

Biplot of Iris Data

(Figure: biplot of the iris data. The axes are the first two principal components (Comp. 1, Comp. 2), with arrows for Sepal L., Sepal W., Petal L. and Petal W. Points are labelled 1 = Setosa, 2 = Versicolor, 3 = Virginica.)

Image Compression Example

Pansy flower image, collected from http://www.ats.ucla.edu/stat/r/code/pansy.jpg

Singular values of flowers image

(Figure: the singular values of the flower image, plotted in decreasing order.)
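A sketch of how such an image can be decomposed in R (assumes the jpeg package and a local copy pansy.jpg of the image above):

library(jpeg)
img <- readJPEG("pansy.jpg")                     # array: height x width x 3 (RGB)
g   <- apply(img, c(1, 2), mean)                 # collapse to a grayscale matrix
sv  <- svd(g)
plot(sv$d, type = "h", ylab = "singular value")  # scree of the singular values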

Low rank Approximation to flowers image

(Figures: rank-1, rank-5 and successively higher-rank approximations of the flower image, reconstructed from the leading singular triplets; the image quality improves as the rank increases.)

Outlier Detection Using SVD

Nishith and Nasser (2007, M.Sc. thesis) propose a graphical method of outlier detection using SVD. It is suitable for both general multivariate data and regression data. For this we construct the scatter plots of the first two PCs, and of the first and third PCs. We also make a box in each scatter plot whose range lies within

median(1st PC) ± 3 mad(1st PC) on the x-axis and
median(2nd PC / 3rd PC) ± 3 mad(2nd PC / 3rd PC) on the y-axis,

where mad = median absolute deviation. The points outside the box can be considered as extreme outliers, and the points outside one side of the box are termed outliers. Along with this box we may construct another, smaller box bounded by the ±2.5 (or ±2) mad lines.
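A rough R sketch of the box construction (the cutoff of 3 follows the slide; the iris scores are used only as an example):

X  <- scale(iris[, 1:4], center = TRUE, scale = FALSE)
pc <- prcomp(X)$x                                  # PC scores
plot(pc[, 1], pc[, 2], xlab = "PC1", ylab = "PC2")
bx <- median(pc[, 1]) + c(-3, 3) * mad(pc[, 1])    # box range on the x-axis
by <- median(pc[, 2]) + c(-3, 3) * mad(pc[, 2])    # box range on the y-axis
rect(bx[1], by[1], bx[2], by[2], border = "red")   # points outside are flagged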

Outlier Detection Using SVD (Cont.)

HAWKINS-BRADU-KASS (1984) DATA: an artificial data set with 14 influential observations. Among them there are ten high leverage outliers (cases 1-10) and four high leverage points (cases 11-14) — Imon (2005).

(Figure: scatter plots of the Hawkins, Bradu and Kass data: (a) first two PCs and (b) first and third PCs.)

Outlier Detection Using SVD (Cont.)

MODIFIED BROWN DATA: data set given by Brown (1980). Ryan (1997) pointed out that the original data on the 53 patients contain 1 outlier (observation number 24). Later researchers modified this data set by putting in two more outliers, as cases 54 and 55, and showed that observations 24, 54 and 55 are outliers by using the generalized standardized Pearson residual (GSPR).

(Figure: scatter plots of the modified Brown data: (a) first two PCs and (b) first and third PCs.)

Cluster Detection Using SVD

Singular value decomposition is also used for cluster detection (Nishith, Nasser and Suboron, 2011). The boxes drawn on the scatter plots of the PCs are given below:

median(1st PC) ± k mad(1st PC) on the x-axis and
median(2nd PC / 3rd PC) ± k mad(2nd PC / 3rd PC) on the y-axis,

where mad = median absolute deviation and the value of k = 1, 2, 3.

(Figure: principal stations in the climate data.)

Climatic Variables

The climatic variables are:
1. Rainfall (RF) mm
2. Daily mean temperature (T-MEAN) °C
3. Maximum temperature (T-MAX) °C
4. Minimum temperature (T-MIN) °C
5. Day-time temperature (T-DAY) °C
6. Night-time temperature (T-NIGHT) °C
7. Daily mean water vapor pressure (VP) mbar
8. Daily mean wind speed (WS) m/sec
9. Hours of bright sunshine as percentage of maximum possible sunshine hours (MPS) %
10. Solar radiation (SR) cal/cm²/day

Consequences of SVD

Generally many missing values may be present in the data, and the data may also contain unusual observations. Classical singular value decomposition cannot handle either type of problem. To overcome them, a robust singular value decomposition can be obtained by an alternating L1 regression approach (Douglas M. Hawkins, Li Liu, and S. Stanley Young, 2001).

The Alternating L1 Regression Algorithm for Robust Singular Value Decomposition

Choose initial values for the left singular vector u1.

Given u1, find the coefficients $c_j$ minimizing $\sum_{i=1}^{n} |x_{ij} - c_j u_{i1}|$, j = 1, 2, ..., p, and calculate the right singular vector $v_1 = c/\|c\|$, where $\|\cdot\|$ refers to the Euclidean norm.

Given v1, find the coefficients $d_i$ minimizing $\sum_{j=1}^{p} |x_{ij} - d_i v_{j1}|$, i = 1, 2, ..., n, and calculate the resulting estimate of the left singular vector $u_1 = d/\|d\|$.

Iterate these two steps until convergence. For the second and subsequent terms of the SVD, we replace X by a deflated matrix obtained by subtracting the most recently found term: $X \leftarrow X - \lambda_k u_k v_k^T$.
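A compact R sketch of one alternation of this algorithm. It relies on the fact that each one-parameter L1 fit, min over c of sum |x_i - c*u_i|, is a weighted median of the ratios x_i/u_i with weights |u_i| (function names are illustrative; exact zeros in u and v are assumed away):

wmedian <- function(r, w) {                 # weighted median of r with weights w
  o <- order(r); r <- r[o]; w <- w[o]
  r[which(cumsum(w) >= sum(w) / 2)[1]]
}
l1_fit <- function(y, u) wmedian(y / u, abs(u))   # c minimizing sum |y - c*u|
rsvd_step <- function(X, u) {
  v <- apply(X, 2, l1_fit, u = u)           # L1 fit down each column
  v <- v / sqrt(sum(v^2))                   # normalized right singular vector
  u <- apply(X, 1, l1_fit, u = v)           # L1 fit along each row
  list(u = u / sqrt(sum(u^2)), v = v)       # iterate rsvd_step until convergence
}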

Clustering weather stations on Map Using RSVD

(Figure: clustering of the weather stations on the map, obtained using the robust SVD.)

References

Brown B.W., Jr. (1980). Prediction analysis for binary data. In Biostatistics Casebook, R.G. Miller, Jr., B. Efron, B.W. Brown, Jr., L.E. Moses (Eds.), New York: Wiley.

Dhrymes, Phoebus J. (1984). Mathematics for Econometrics, 2nd ed. Springer-Verlag, New York.

Hawkins D.M., Bradu D. and Kass G.V. (1984). Location of several outliers in multiple regression data using elemental sets. Technometrics, 26, 197-208.

Imon A.H.M.R. (2005). Identifying multiple influential observations in linear regression. Journal of Applied Statistics, 32, 73-90.

Kumar, N., Nasser, M., and Sarker, S.C. (2011). A new singular value decomposition based robust graphical clustering technique and its application in climatic data. Journal of Geography and Geology, Canadian Center of Science and Education, Vol. 3, No. 1, 227-238.

Ryan T.P. (1997). Modern Regression Methods, Wiley, New York.

Stewart, G.W. (1998). Matrix Algorithms, Vol. 1: Basic Decompositions, SIAM, Philadelphia.

Matrix Decomposition. http://fedc.wiwi.hu-berlin.de/xplore/ebooks/html/csa/node36.html