
Sep 18, 2019


Application in Statistics

Nishith Kumar

Lecturer

Department of Statistics

Begum Rokeya University, Rangpur.

Email: nk.bru09@gmail.com


Overview

• Introduction

• LU decomposition

• QR decomposition

• Cholesky decomposition

• Jordan Decomposition

• Spectral decomposition

• Singular value decomposition

• Applications


Introduction

This lecture covers the relevant matrix decompositions, the basic numerical methods for computing them, and some of their applications.

Decompositions provide a numerically stable way to solve a system of linear equations, as shown already in [Wampler, 1970], and to invert a matrix. Additionally, they provide an important tool for analyzing the numerical stability of a system.

The decompositions covered are the LU, QR, Cholesky, Jordan, spectral and singular value decompositions.

Easy to solve system (Cont.)

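Some linear systems can be solved easily. For a diagonal system, in which each equation has the form $a_{ii} x_i = b_i$, the solution is:

\[ x = \left( b_1/a_{11},\; b_2/a_{22},\; \dots,\; b_n/a_{nn} \right)^T \]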

Easy to solve system (Cont.)

Lower triangular matrix:


Easy to solve system (Cont.)

Upper Triangular Matrix:


LU Decomposition

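LU decomposition was originally derived as a decomposition of quadratic and bilinear forms. Lagrange, in the very first paper in his collected works (1759), derives the algorithm we call Gaussian elimination. Later, in 1948, Turing introduced the LU decomposition of a matrix, which is used to solve systems of linear equations.

An LU decomposition of an m×m matrix A is a pair of matrices L and U such that $A = LU$, where L is a lower triangular matrix and U is an upper triangular matrix:

\[
U = \begin{pmatrix} u_{11} & u_{12} & \cdots & u_{1m} \\ 0 & u_{22} & \cdots & u_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & u_{mm} \end{pmatrix}
\quad\text{and}\quad
L = \begin{pmatrix} l_{11} & 0 & \cdots & 0 \\ l_{21} & l_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ l_{m1} & l_{m2} & \cdots & l_{mm} \end{pmatrix}
\]

J-L Lagrange (1736–1813), A. M. Turing (1912–1954)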

How to decompose A=LU?

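\[ A = \begin{pmatrix} 6 & -2 & 2 \\ 12 & -8 & 6 \\ 3 & -13 & 2 \end{pmatrix} \]

Now reduce A to upper triangular form by elementary row operations, $A \to \dots \to U$, so that $U = E_k \cdots E_1 A$ and $A = (E_1)^{-1} \cdots (E_k)^{-1} U$:

\[
\begin{pmatrix} 6 & -2 & 2 \\ 0 & -4 & 2 \\ 0 & -12 & 1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -1/2 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 6 & -2 & 2 \\ 12 & -8 & 6 \\ 3 & -13 & 2 \end{pmatrix}
\]

\[
\begin{pmatrix} 6 & -2 & 2 \\ 0 & -4 & 2 \\ 0 & 0 & -5 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -3 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -1/2 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 6 & -2 & 2 \\ 12 & -8 & 6 \\ 3 & -13 & 2 \end{pmatrix},
\qquad U = E_2 E_1 A
\]

If each elementary matrix $E_i$ is lower triangular, it can be proved that $(E_1)^{-1}, \dots, (E_k)^{-1}$ are lower triangular, and that $(E_1)^{-1} \cdots (E_k)^{-1}$ is a lower triangular matrix.

Let $L = (E_1)^{-1} \cdots (E_k)^{-1}$; then $A = LU$.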
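The elimination steps above can be sketched in R. This is a minimal illustration, not the lecture's own code: the function name `lu_nopivot` is hypothetical, and it assumes that no pivot is zero, so no row exchanges are needed.

```r
# Doolittle-style LU decomposition without pivoting.
# Assumption: every pivot U[k, k] is nonzero.
lu_nopivot <- function(A) {
  n <- nrow(A)
  L <- diag(n)   # unit lower triangular factor
  U <- A         # reduced step by step to upper triangular form
  for (k in seq_len(n - 1)) {
    for (i in (k + 1):n) {
      L[i, k] <- U[i, k] / U[k, k]          # multiplier of the row operation
      U[i, ] <- U[i, ] - L[i, k] * U[k, ]   # eliminate the entry (i, k)
    }
  }
  list(L = L, U = U)
}

A <- matrix(c(6, -2, 2,
              12, -8, 6,
              3, -13, 2), nrow = 3, byrow = TRUE)
f <- lu_nopivot(A)
f$L
f$U
```

The multipliers stored in L are the negatives of the off-diagonal entries of the elementary matrices, which is why L equals the product of their inverses.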

Calculation of L and U (cont.)

6 - 2 2 1 0 0 6 - 2 2

A 12 - 8 6 = 0 1 0 12 - 8 6

3 - 13 2 0 0 1 3 - 13 2

6 - 2 2 1 0 0 6 - 2 2

0 - 4 2 - 2 1 0 12 - 8 6

0 - 12 1 - 1 / 2 0 1 3 - 13 2

6 - 2 2 1 0 0 1 0 0 6 - 2 2

0 - 4 2 0 1 0 - 2 1 0 12 - 8 6

0 0 - 5 0 - 3 1 - 1 / 2 0 1 3 - 13 2

9

Calculation of L and U (cont.)

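Now

\[
\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -1/2 & 0 & 1 \end{pmatrix}^{-1}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -3 & 1 \end{pmatrix}^{-1}
= \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1/2 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 3 & 1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1/2 & 3 & 1 \end{pmatrix}
\]

Therefore,

\[
A = \begin{pmatrix} 6 & -2 & 2 \\ 12 & -8 & 6 \\ 3 & -13 & 2 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1/2 & 3 & 1 \end{pmatrix}
\begin{pmatrix} 6 & -2 & 2 \\ 0 & -4 & 2 \\ 0 & 0 & -5 \end{pmatrix} = LU
\]

If A is a nonsingular matrix, then for each lower triangular matrix L the upper triangular matrix U is unique, but an LU decomposition is not unique. There can be more than one LU decomposition of a matrix, such as

\[
A = \begin{pmatrix} 6 & -2 & 2 \\ 12 & -8 & 6 \\ 3 & -13 & 2 \end{pmatrix}
= \begin{pmatrix} 6 & 0 & 0 \\ 12 & 1 & 0 \\ 3 & 3 & 1 \end{pmatrix}
\begin{pmatrix} 1 & -2/6 & 2/6 \\ 0 & -4 & 2 \\ 0 & 0 & -5 \end{pmatrix} = LU
\]

Calculation of L and U (cont.)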

Thus LU decomposition is not unique. Since we compute the LU decomposition by elementary transformations, if we change L then U changes so that A = LU still holds.

To obtain a unique LU decomposition, it is necessary to put some restriction on the L and U matrices. For example, we can require the lower triangular matrix L to be a unit one (i.e. set all the entries of its main diagonal to ones).

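LU Decomposition in R:

library(Matrix)
# 3 x 3 example matrix, filled column by column
x <- matrix(c(3, 2, 1, 9, 3, 4, 4, 2, 5), ncol = 3, nrow = 3)
# lu() uses partial pivoting; expand() returns the factors L and U and the permutation P
expand(lu(x))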

Calculation of L and U (cont.)

• Note: there are also generalizations of LU to non-square and singular matrices, such as rank revealing LU factorization.

• [Pan, C.T. (2000). On the existence and computation of rank revealing LU factorizations. Linear Algebra and its Applications, 316: 199-222.

• Miranian, L. and Gu, M. (2003). Strong rank revealing LU factorizations. Linear Algebra and its Applications, 367: 1-16.]

• LU decomposition is used to solve systems of simultaneous linear equations. We can also find the determinant easily by using LU decomposition: it is the product of the diagonal elements of the upper and lower triangular matrices.
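As a small check of the determinant rule, the sketch below reuses the factors L and U found for the example matrix A earlier in the lecture. The variable names are illustrative only.

```r
# Factors of the lecture's example matrix A = LU
L <- matrix(c(1,   0, 0,
              2,   1, 0,
              0.5, 3, 1), nrow = 3, byrow = TRUE)
U <- matrix(c(6, -2,  2,
              0, -4,  2,
              0,  0, -5), nrow = 3, byrow = TRUE)

# Determinant as the product of the diagonal elements of L and U
det_lu <- prod(diag(L)) * prod(diag(U))
det_lu

# Comparison with the determinant computed directly from A = LU
det(L %*% U)
```

Because L is a unit lower triangular matrix here, only the diagonal of U contributes to the product.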

Solving system of linear equation

using LU decomposition

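Suppose we would like to solve an m×m system AX = b. If we can find an LU decomposition of A, then to solve AX = b it is enough to solve the systems

\[ LY = b, \qquad UX = Y. \]

The system LY = b can be solved by the method of forward substitution and the system UX = Y can be solved by the method of backward substitution. To illustrate, we give an example.

Consider the given system AX = b, where

\[
A = \begin{pmatrix} 6 & -2 & 2 \\ 12 & -8 & 6 \\ 3 & -13 & 2 \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} 8 \\ 14 \\ -17 \end{pmatrix}
\]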

Solving system of linear equation

using LU decomposition

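We have seen A = LU, where

\[
L = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1/2 & 3 & 1 \end{pmatrix},
\qquad
U = \begin{pmatrix} 6 & -2 & 2 \\ 0 & -4 & 2 \\ 0 & 0 & -5 \end{pmatrix}
\]

We first solve LY = b by forward substitution:

\[
\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1/2 & 3 & 1 \end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}
= \begin{pmatrix} 8 \\ 14 \\ -17 \end{pmatrix}
\]

Then

\[ Y = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 8 \\ -2 \\ -15 \end{pmatrix} \]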

Solving system of linear equation

using LU decomposition

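Now, we solve UX = Y by backward substitution:

\[
\begin{pmatrix} 6 & -2 & 2 \\ 0 & -4 & 2 \\ 0 & 0 & -5 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
= \begin{pmatrix} 8 \\ -2 \\ -15 \end{pmatrix},
\quad\text{then}\quad
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
= \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}
\]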
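The two substitution steps can be reproduced with base R's `forwardsolve()` and `backsolve()`. This is a short sketch using the L, U and b of the worked example; it is one convenient way to check the arithmetic, not necessarily how the lecture computed it.

```r
L <- matrix(c(1,   0, 0,
              2,   1, 0,
              0.5, 3, 1), nrow = 3, byrow = TRUE)
U <- matrix(c(6, -2,  2,
              0, -4,  2,
              0,  0, -5), nrow = 3, byrow = TRUE)
b <- c(8, 14, -17)

Y <- forwardsolve(L, b)   # forward substitution: solves L Y = b
X <- backsolve(U, Y)      # backward substitution: solves U X = Y
Y
X
```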

QR Decomposition

QR decomposition originated with Gram (1883). Later, Erhard Schmidt (1907) proved the QR decomposition theorem.

Jørgen Pedersen Gram (1850–1916), Erhard Schmidt (1876–1959)


QR-Decomposition (Cont.)

Theorem: If A is an m×n matrix with linearly independent columns, then A can be decomposed as $A = QR$, where Q is an m×n matrix whose columns form an orthonormal basis for the column space of A and R is a nonsingular upper triangular matrix.

Proof: Write $A = [u_1 \mid u_2 \mid \dots \mid u_n]$. Apply the Gram-Schmidt process to $\{u_1, u_2, \dots, u_n\}$; the orthogonal vectors $v_1, v_2, \dots, v_n$ are

\[
v_i = u_i - \frac{\langle u_i, v_1 \rangle}{\|v_1\|^2} v_1 - \frac{\langle u_i, v_2 \rangle}{\|v_2\|^2} v_2 - \dots - \frac{\langle u_i, v_{i-1} \rangle}{\|v_{i-1}\|^2} v_{i-1}
\]

Let $q_i = v_i / \|v_i\|$ for i = 1, 2, ..., n. Thus $q_1, q_2, \dots, q_n$ form an orthonormal basis for the column space of A.

QR-Decomposition (Cont.)

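Now,

\[
u_i = v_i + \frac{\langle u_i, v_1 \rangle}{\|v_1\|^2} v_1 + \frac{\langle u_i, v_2 \rangle}{\|v_2\|^2} v_2 + \dots + \frac{\langle u_i, v_{i-1} \rangle}{\|v_{i-1}\|^2} v_{i-1}
\]

i.e.,

\[
u_i = \|v_i\| q_i + \langle u_i, q_1 \rangle q_1 + \langle u_i, q_2 \rangle q_2 + \dots + \langle u_i, q_{i-1} \rangle q_{i-1}
\]

so that $u_i \in \operatorname{span}\{v_1, v_2, \dots, v_i\} = \operatorname{span}\{q_1, q_2, \dots, q_i\}$.

Thus $u_i$ is orthogonal to $q_j$ for j > i;

\[
\begin{aligned}
u_1 &= \|v_1\| q_1 \\
u_2 &= \|v_2\| q_2 + \langle u_2, q_1 \rangle q_1 \\
u_3 &= \|v_3\| q_3 + \langle u_3, q_1 \rangle q_1 + \langle u_3, q_2 \rangle q_2 \\
&\;\;\vdots \\
u_n &= \|v_n\| q_n + \langle u_n, q_1 \rangle q_1 + \langle u_n, q_2 \rangle q_2 + \dots + \langle u_n, q_{n-1} \rangle q_{n-1}
\end{aligned}
\]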

QR-Decomposition (Cont.)

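Let $Q = [q_1 \; q_2 \; \dots \; q_n]$, so Q is an m×n matrix whose columns form an orthonormal basis for the column space of A.

Now,

\[
A = [u_1 \; u_2 \; \dots \; u_n] = [q_1 \; q_2 \; \dots \; q_n]
\begin{pmatrix}
\|v_1\| & \langle u_2, q_1 \rangle & \langle u_3, q_1 \rangle & \cdots & \langle u_n, q_1 \rangle \\
0 & \|v_2\| & \langle u_3, q_2 \rangle & \cdots & \langle u_n, q_2 \rangle \\
0 & 0 & \|v_3\| & \cdots & \langle u_n, q_3 \rangle \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & \|v_n\|
\end{pmatrix}
\]

i.e., A = QR, where R is the upper triangular matrix on the right. Since every diagonal element $\|v_i\|$ is positive, R is a nonsingular matrix.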

QR Decomposition

Example: find the QR decomposition of

\[
A = \begin{pmatrix} 1 & -1 & -1 \\ 1 & 0 & 0 \\ 1 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}
\]

Calculation of QR Decomposition

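Applying the Gram-Schmidt process to compute the QR decomposition, with $a_1, a_2, a_3$ the columns of A:

1st Step:

\[
r_{11} = \|a_1\| = \sqrt{3}, \qquad
q_1 = \frac{a_1}{\|a_1\|} = \begin{pmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \\ 0 \end{pmatrix}
\]

2nd Step:

\[ r_{12} = q_1^T a_2 = -2/\sqrt{3} \]

3rd Step:

\[
\hat{q}_2 = a_2 - q_1 q_1^T a_2 = a_2 - q_1 r_{12}
= \begin{pmatrix} -1 \\ 0 \\ -1 \\ 0 \end{pmatrix} - (-2/\sqrt{3}) \begin{pmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \\ 0 \end{pmatrix}
= \begin{pmatrix} -1/3 \\ 2/3 \\ -1/3 \\ 0 \end{pmatrix}
\]

\[
r_{22} = \|\hat{q}_2\| = \sqrt{2/3}, \qquad
q_2 = \frac{\hat{q}_2}{\|\hat{q}_2\|} = \begin{pmatrix} -1/\sqrt{6} \\ \sqrt{2/3} \\ -1/\sqrt{6} \\ 0 \end{pmatrix}
\]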

Calculation of QR Decomposition

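4th Step:

\[ r_{13} = q_1^T a_3 = -1/\sqrt{3} \]

5th Step:

\[ r_{23} = q_2^T a_3 = 1/\sqrt{6} \]

6th Step:

\[
\hat{q}_3 = a_3 - q_1 q_1^T a_3 - q_2 q_2^T a_3 = a_3 - r_{13} q_1 - r_{23} q_2
= \begin{pmatrix} -1/2 \\ 0 \\ 1/2 \\ -1 \end{pmatrix}
\]

\[
r_{33} = \|\hat{q}_3\| = \sqrt{6}/2, \qquad
q_3 = \frac{\hat{q}_3}{\|\hat{q}_3\|} = \begin{pmatrix} -1/\sqrt{6} \\ 0 \\ 1/\sqrt{6} \\ -2/\sqrt{6} \end{pmatrix}
\]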

Calculation of QR Decomposition

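Therefore, A = QR:

\[
\begin{pmatrix} 1 & -1 & -1 \\ 1 & 0 & 0 \\ 1 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}
= \begin{pmatrix}
1/\sqrt{3} & -1/\sqrt{6} & -1/\sqrt{6} \\
1/\sqrt{3} & 2/\sqrt{6} & 0 \\
1/\sqrt{3} & -1/\sqrt{6} & 1/\sqrt{6} \\
0 & 0 & -2/\sqrt{6}
\end{pmatrix}
\begin{pmatrix}
\sqrt{3} & -2/\sqrt{3} & -1/\sqrt{3} \\
0 & \sqrt{2/3} & 1/\sqrt{6} \\
0 & 0 & \sqrt{6}/2
\end{pmatrix}
\]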
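The Gram-Schmidt steps of this example can be sketched in R as follows. The function name `gram_schmidt_qr` is hypothetical, and the sketch uses the classical Gram-Schmidt process exactly as in the worked steps. It assumes linearly independent columns and is meant for illustration, since production code would normally call `qr()`.

```r
# Classical Gram-Schmidt QR.
# Assumption: the columns of A are linearly independent.
gram_schmidt_qr <- function(A) {
  n <- ncol(A)
  Q <- matrix(0, nrow(A), n)
  R <- matrix(0, n, n)
  for (j in seq_len(n)) {
    v <- A[, j]
    for (i in seq_len(j - 1)) {
      R[i, j] <- sum(Q[, i] * A[, j])   # r_ij = q_i^T a_j
      v <- v - R[i, j] * Q[, i]         # remove the component along q_i
    }
    R[j, j] <- sqrt(sum(v^2))           # r_jj = norm of the remainder
    Q[, j] <- v / R[j, j]               # normalise to get q_j
  }
  list(Q = Q, R = R)
}

A <- matrix(c(1, -1, -1,
              1,  0,  0,
              1, -1,  0,
              0,  0, -1), nrow = 4, byrow = TRUE)
f <- gram_schmidt_qr(A)
f$Q
f$R
```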

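R code for QR Decomposition:

# 3 x 3 example matrix, filled column by column
x <- matrix(c(1, 2, 3, 2, 5, 4, 3, 4, 9), ncol = 3, nrow = 3)
qrstr <- qr(x)      # QR factorization in compact form
Q <- qr.Q(qrstr)    # orthogonal factor Q
R <- qr.R(qrstr)    # upper triangular factor R

Uses: QR decomposition is used to compute the eigenvalues of a matrix, to solve linear systems, and to find least squares approximations.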

Least square solution using QR

Decomposition

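The least squares solution b satisfies the normal equations

\[ X^{t} X b = X^{t} Y. \]

Let $X = QR$. Then

\[ X^{t} X b = R^{t} Q^{t} Q R b = R^{t} R b, \qquad X^{t} Y = R^{t} Q^{t} Y. \]

Therefore,

\[
R^{t} R b = R^{t} Q^{t} Y
\;\Rightarrow\; (R^{t})^{-1} R^{t} R b = (R^{t})^{-1} R^{t} Q^{t} Y
\;\Rightarrow\; R b = Q^{t} Y = Z.
\]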
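The reduction to the triangular system Rb = Z can be sketched in R. The design matrix and response below are hypothetical numbers chosen only for illustration, and the result is compared with a direct solution of the normal equations.

```r
# Hypothetical data: intercept plus one regressor
X <- cbind(1, c(1, 2, 3, 4, 5))
Y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

qrX <- qr(X)
Q <- qr.Q(qrX)
R <- qr.R(qrX)

Z <- crossprod(Q, Y)   # Z = Q^t Y
b <- backsolve(R, Z)   # solve the triangular system R b = Z

# Same estimate from the normal equations X^t X b = X^t Y
b_normal <- solve(crossprod(X), crossprod(X, Y))
b
b_normal
```

Solving Rb = Z needs only a backward substitution, and it avoids forming $X^{t}X$ explicitly.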

Cholesky Decomposition

Cholesky died from wounds received on the battlefield on 31 August 1918 at 5 o'clock in the morning in the north of France. After his death one of his fellow officers, Commandant Benoit, published Cholesky's method of computing solutions to the normal equations for some least squares data fitting problems in the Bulletin géodésique in 1924. The method is known as the Cholesky decomposition.

Cholesky decomposition: If A is a real, symmetric and positive definite matrix, then there exists a unique lower triangular matrix L with positive diagonal elements such that $A = LL^T$.

André-Louis Cholesky (1875–1918)

Cholesky Decomposition

Theorem: If A is an n×n real, symmetric and positive definite matrix, then there exists a unique lower triangular matrix G with positive diagonal elements such that $A = GG^T$.

Proof: Since A is an n×n real and positive definite matrix, it has an LU decomposition, A = LU. Let the lower triangular matrix L be a unit one (i.e. set all the entries of its main diagonal to ones), so that the LU decomposition is unique. Let $D = \operatorname{diag}(u_{11}, u_{22}, \dots, u_{nn})$ and observe that $M^T = D^{-1}U$ is a unit upper triangular matrix, so $A = LDM^T$. Since A is symmetric, $A = A^T = MDL^T$, and from the uniqueness we have L = M. So $A = LDL^T$. Since A is positive definite, all diagonal elements of D are positive. Let $G = L \operatorname{diag}(\sqrt{d_{11}}, \sqrt{d_{22}}, \dots, \sqrt{d_{nn}})$; then $A = GG^T$.

Cholesky Decomposition (Cont.)

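Procedure to find the Cholesky decomposition

Suppose

\[
A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}
\]

We need to solve the equation

\[
A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}
= \begin{pmatrix} l_{11} & 0 & \cdots & 0 \\ l_{21} & l_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ l_{n1} & l_{n2} & \cdots & l_{nn} \end{pmatrix}
\begin{pmatrix} l_{11} & l_{21} & \cdots & l_{n1} \\ 0 & l_{22} & \cdots & l_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & l_{nn} \end{pmatrix}
= L L^T
\]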

Example of Cholesky Decomposition

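Suppose

\[ A = \begin{pmatrix} 4 & 2 & -2 \\ 2 & 10 & 2 \\ -2 & 2 & 5 \end{pmatrix} \]

The entries of L are computed column by column:

For k from 1 to n

\[ l_{kk} = \left( a_{kk} - \sum_{s=1}^{k-1} l_{ks}^2 \right)^{1/2} \]

For j from k+1 to n

\[ l_{jk} = \left( a_{jk} - \sum_{s=1}^{k-1} l_{js} l_{ks} \right) \Big/ \, l_{kk} \]

Then the Cholesky decomposition gives

\[ L = \begin{pmatrix} 2 & 0 & 0 \\ 1 & 3 & 0 \\ -1 & 1 & \sqrt{3} \end{pmatrix} \]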

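R code for Cholesky Decomposition

# symmetric positive definite example matrix
x <- matrix(c(4, 2, -2, 2, 10, 2, -2, 2, 5), ncol = 3, nrow = 3)
# chol() returns the upper triangular factor, so L = t(cl) and x = t(cl) %*% cl
cl <- chol(x)

If we decompose A as $A = LDL^T$, then

\[
L = \begin{pmatrix} 1 & 0 & 0 \\ 1/2 & 1 & 0 \\ -1/2 & 1/3 & 1 \end{pmatrix}
\quad\text{and}\quad
D = \begin{pmatrix} 4 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & 3 \end{pmatrix}
\]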
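The column-by-column procedure for the Cholesky factor can be sketched directly in R. The function name `chol_lower` is hypothetical, and the sketch assumes a symmetric positive definite input, so every square root is taken of a positive number.

```r
# Lower triangular Cholesky factor, computed column by column.
# Assumption: A is symmetric and positive definite.
chol_lower <- function(A) {
  n <- nrow(A)
  L <- matrix(0, n, n)
  for (k in seq_len(n)) {
    s <- seq_len(k - 1)                              # columns already computed
    L[k, k] <- sqrt(A[k, k] - sum(L[k, s]^2))
    if (k < n) {
      for (j in (k + 1):n) {
        L[j, k] <- (A[j, k] - sum(L[j, s] * L[k, s])) / L[k, k]
      }
    }
  }
  L
}

A <- matrix(c(4, 2, -2,
              2, 10, 2,
              -2, 2, 5), nrow = 3, byrow = TRUE)
L <- chol_lower(A)
L

# Factors of A = L1 D L1^T obtained by rescaling the columns of L
D  <- diag(diag(L)^2)
L1 <- L %*% diag(1 / diag(L))
```

Here `L` agrees with the transpose of R's `chol(A)`, which returns the upper triangular factor.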

Application of Cholesky

Decomposition

Cholesky decomposition is used to solve the system of linear equations Ax = b, where A is real, symmetric and positive definite.

In regression analysis it can be used to estimate the parameters if $X^TX$ is positive definite.

Cholesky decomposition is also used in other applications (Weiya Shi; Yue-Fei Guo; 2010).

Characteristic Roots and

Characteristics Vectors

Any nonzero vector x is said to be a characteristic vector of a matrix A if there exists a number λ such that Ax = λx; λ is then called a characteristic root of the matrix A corresponding to the characteristic vector x. Characteristic roots and vectors are also called latent roots and vectors, or eigenvalues and eigenvectors.

For λ = λi, the characteristic vector is the solution x of the following homogeneous system of linear equations: (A − λiI)x = 0.

Theorem: If A is a real symmetric matrix and λi and λj are two distinct latent roots of A, then the corresponding latent vectors xi and xj are orthogonal.

Multiplicity

Algebraic Multiplicity: The number of repetitions of a certain eigenvalue. If, for a certain matrix, λ = {3, 3, 4}, then the algebraic multiplicity of 3 would be 2 (as it appears twice) and the algebraic multiplicity of 4 would be 1 (as it appears once). This type of multiplicity is normally represented by the Greek letter α, where α(λi) represents the algebraic multiplicity of λi.

Geometric Multiplicity: The geometric multiplicity of an eigenvalue is the number of linearly independent eigenvectors associated with it.
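The difference between the two multiplicities can be seen on a small example. The matrix below is a hypothetical choice with eigenvalues {3, 3, 4}; the geometric multiplicity of 3 is computed as n minus the rank of A − 3I, which is the dimension of the solution space of (A − 3I)x = 0.

```r
# Hypothetical matrix with eigenvalues 3, 3 and 4
A <- matrix(c(3, 1, 0,
              0, 3, 0,
              0, 0, 4), nrow = 3, byrow = TRUE)

lambda <- eigen(A)$values

# Algebraic multiplicity of 3: how often it appears among the eigenvalues
am <- sum(abs(lambda - 3) < 1e-6)

# Geometric multiplicity of 3: dimension of the null space of (A - 3I)
gm <- nrow(A) - qr(A - 3 * diag(3))$rank

am
gm
```

For this matrix the algebraic multiplicity exceeds the geometric multiplicity, so the matrix cannot be diagonalized.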

Jordan Decomposition

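Camille Jordan (1870)

• Let A be any n×n matrix. Then there exist a nonsingular matrix P and k×k matrices $J_k(\lambda)$ of the form

\[
J_k(\lambda) = \begin{pmatrix}
\lambda & 1 & 0 & \cdots & 0 \\
0 & \lambda & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & \lambda
\end{pmatrix}
\]

such that

\[
P^{-1} A P = \begin{pmatrix}
J_{k_1}(\lambda_1) & 0 & \cdots & 0 \\
0 & J_{k_2}(\lambda_2) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & J_{k_r}(\lambda_r)
\end{pmatrix}
\]

where $k_1 + k_2 + \dots + k_r = n$. Also $\lambda_i$, i = 1, 2, ..., r, are the characteristic roots of A, and the sizes $k_i$ of the blocks belonging to the same characteristic root add up to its algebraic multiplicity.

Camille Jordan (1838–1921)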

Spectral Decomposition

The spectral decomposition dates back to Cauchy in 1829.

A. L. Cauchy (1789–1857)

Let A be a real symmetric matrix. Then there exists an orthogonal matrix P such that $P^T A P = \Lambda$, or $A = P \Lambda P^T$, where Λ is a diagonal matrix.

Spectral Decomposition and

Principal component Analysis (Cont.)

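By using spectral decomposition we can write $A = P \Lambda P^T$.

Suppose the data form a matrix X that is mean centered, i.e., $X = (X - \mu)$, and the variance covariance matrix is ∑. The variance covariance matrix ∑ is real and symmetric.

Using spectral decomposition we can write $\Sigma = P \Lambda P^T$, where Λ is a diagonal matrix, $\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$.

Also $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n$ and

tr(∑) = Total variation of Data = tr(Λ)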

Spectral Decomposition and

Principal component Analysis (Cont.)

The principal component transformation is the transformation

Y = (X − µ)P

where

E(Yi) = 0

V(Yi) = λi

Cov(Yi, Yj) = 0 if i ≠ j

V(Y1) ≥ V(Y2) ≥ . . . ≥ V(Yn)

\[ \sum_{i=1}^{n} V(Y_i) = \operatorname{tr}(\Sigma) \]

\[ \prod_{i=1}^{n} V(Y_i) = |\Sigma| \]
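These properties of the principal component transformation can be checked numerically. The sketch below uses R's built-in `iris` measurements as example data (an assumption made for illustration; any numeric data matrix would do) and the sample covariance matrix in place of ∑.

```r
X  <- as.matrix(iris[, 1:4])
Xc <- scale(X, center = TRUE, scale = FALSE)   # mean centered data, X - mu
S  <- cov(X)                                   # sample variance covariance matrix

e <- eigen(S)        # spectral decomposition S = P Lambda P^T
P <- e$vectors
lambda <- e$values   # returned in decreasing order

Y <- Xc %*% P        # principal component transformation

round(colMeans(Y), 10)   # E(Y_i) = 0
apply(Y, 2, var)         # V(Y_i) = lambda_i
round(cov(Y), 10)        # Cov(Y_i, Y_j) = 0 for i != j
sum(lambda)              # equals tr(S)
prod(lambda)             # equals det(S)
```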

R code for Spectral Decomposition

x<-matrix(c(1,2,3, 2,5,4, 3,4,9),ncol=3,nrow=3)

eigen(x)

Application:

For Data Reduction.

Image Processing and Compression.

K-Selection for K-means clustering

Multivariate Outliers Detection

Noise Filtering

Trend detection in the observations.


Historical background of SVD

There are five mathematicians who were responsible for establishing the existence of the

singular value decomposition and developing its theory.

Eugenio Beltrami (1835–1899), Camille Jordan (1838–1921), James Joseph Sylvester (1814–1897), Erhard Schmidt (1876–1959), Hermann Weyl (1885–1955)

The singular value decomposition was originally developed by two mathematicians in the mid to late 1800s:

1. Eugenio Beltrami, 2. Camille Jordan

Several other mathematicians took part in the final developments of the SVD, including James Joseph Sylvester, Erhard Schmidt and Hermann Weyl, who studied the SVD into the mid-1900s.

C.Eckart

What is SVD?

Any real (m×n) matrix X, where (n ≤ m), can be decomposed as

\[ X = U \Lambda V^T \]

U is an (m×n) column orthonormal matrix ($U^T U = I$), containing the eigenvectors of the symmetric matrix $XX^T$.

Λ is an (n×n) diagonal matrix, containing the singular values of matrix X. The number of nonzero diagonal elements of Λ corresponds to the rank of X.

$V^T$ is an (n×n) row orthonormal matrix ($V^T V = I$), containing the eigenvectors of the symmetric matrix $X^T X$.

Singular Value Decomposition (Cont.)

Theorem (Singular Value Decomposition): Let X be m×n of rank r, r ≤ n ≤ m. Then there exist matrices U, V and a diagonal matrix Λ, with positive diagonal elements, such that $X = U \Lambda V^T$.

Proof: Since X is m×n of rank r, r ≤ n ≤ m, $XX^T$ and $X^TX$ are both of rank r (by using the concept of the Grammian matrix) and of dimension m×m and n×n respectively. Since $XX^T$ is a real symmetric matrix, we can write by spectral decomposition

\[ XX^T = Q D Q^T \]

where Q and D are, respectively, the matrices of characteristic vectors and corresponding characteristic roots of $XX^T$.

Again, since $X^TX$ is a real symmetric matrix, we can write by spectral decomposition

\[ X^T X = R M R^T \]

Singular Value Decomposition (Cont.)

where R is the (orthogonal) matrix of characteristic vectors and M is the diagonal matrix of the corresponding characteristic roots.

Since $XX^T$ and $X^TX$ are both of rank r, only r of their characteristic roots are positive, the remaining being zero. Hence we can write

\[ D = \begin{pmatrix} D_r & 0 \\ 0 & 0 \end{pmatrix} \]

Also we can write

\[ M = \begin{pmatrix} M_r & 0 \\ 0 & 0 \end{pmatrix} \]

Singular Value Decomposition (Cont.)

We know that the nonzero characteristic roots of $XX^T$ and $X^TX$ are equal, so $D_r = M_r$.

Partition Q, R conformably with D and M, respectively, i.e., $Q = (Q_r, Q_*)$ and $R = (R_r, R_*)$, such that $Q_r$ is m×r, $R_r$ is n×r, and they correspond respectively to the nonzero characteristic roots of $XX^T$ and $X^TX$. Now take

\[ U = Q_r, \qquad V = R_r, \qquad \Lambda = D_r^{1/2} = \operatorname{diag}(d_1^{1/2}, d_2^{1/2}, \dots, d_r^{1/2}) \]

where $d_i$, i = 1, 2, ..., r, are the positive characteristic roots of $XX^T$ and hence those of $X^TX$ as well (by using the concept of the Grammian matrix).

Singular Value Decomposition (Cont.)

Now define $S = Q_r D_r^{1/2} R_r^T$. Then

\[
\begin{aligned}
S^T S &= (Q_r D_r^{1/2} R_r^T)^T \, Q_r D_r^{1/2} R_r^T \\
&= R_r D_r^{1/2} Q_r^T Q_r D_r^{1/2} R_r^T \\
&= R_r D_r R_r^T \\
&= R_r M_r R_r^T \\
&= R M R^T \\
&= X^T X
\end{aligned}
\]

Similarly, $S S^T = X X^T$.

From the first relation above we conclude that for an arbitrary orthogonal matrix, say $P_1$, $S = P_1 X$.

While from the second we conclude that for an arbitrary orthogonal matrix, say $P_2$, we must have $S = X P_2$.

Singular Value Decomposition (Cont.)

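The preceding, however, implies that for arbitrary orthogonal matrices $P_1$, $P_2$ the matrix X satisfies

\[ XX^T = P_1 XX^T P_1^T, \qquad X^TX = P_2^T X^TX P_2 \]

which in turn implies that $P_1 = I_m$ and $P_2 = I_n$. Thus $X = S = Q_r D_r^{1/2} R_r^T = U \Lambda V^T$.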

R Code for Singular Value Decomposition

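# x is not defined on this slide; the matrix from the earlier R examples is assumed
x <- matrix(c(1, 2, 3, 2, 5, 4, 3, 4, 9), ncol = 3, nrow = 3)
sv <- svd(x)
D <- sv$d    # singular values
U <- sv$u    # left singular vectors
V <- sv$v    # right singular vectors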
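The relations stated for the SVD can be verified numerically. The sketch below uses a hypothetical 4×3 matrix and checks that $U \Lambda V^T$ rebuilds X, that U has orthonormal columns, and that the squared singular values are the characteristic roots of $X^TX$.

```r
# Hypothetical 4 x 3 matrix (m >= n)
X <- matrix(c(1, 2, 3,
              2, 5, 4,
              3, 4, 9,
              1, 0, 2), ncol = 3, byrow = TRUE)

sv <- svd(X)

# Rebuild X from its factors: X = U Lambda V^T
X_rebuilt <- sv$u %*% diag(sv$d) %*% t(sv$v)

# Squared singular values against the characteristic roots of X^T X
roots <- eigen(crossprod(X))$values

X_rebuilt
sv$d^2
roots
```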

Decomposition in Diagram

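[Diagram: choosing a decomposition for a matrix A. Rectangular: QR decomposition (full column rank) and SVD. Square: LU decomposition (not always unique). Square and asymmetric: Jordan decomposition when AM > GM, similar diagonalization P-1AP = Λ when AM = GM. Square and symmetric: spectral decomposition, and Cholesky decomposition when the matrix is PD.]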

Properties Of SVD

\[ A = U \Lambda V^T = \sum_{i=1}^{r} u_i \lambda_i v_i^T \]

where

r = rank of A

λi = the i-th diagonal element of Λ.

ui and vi are the i-th columns of U and V respectively.

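Properties of SVD

Low rank approximation

Theorem: If $A = U \Lambda V^T$ is the SVD of A and the singular values are sorted as $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n$, then for any l < r, the best rank-l approximation to A is

\[
\tilde{A} = \sum_{i=1}^{l} u_i \lambda_i v_i^T; \qquad \| A - \tilde{A} \|_F^2 = \sum_{i=l+1}^{r} \lambda_i^2
\]

Low rank approximation is important for data compression.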

Low-rank Approximation

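• SVD can be used to compute optimal low-rank approximations.

• The approximation of A is Ã of rank k such that

\[ \tilde{A} = \arg\min_{X : \operatorname{rank}(X) = k} \| A - X \|_F \]

where $\| \cdot \|_F$ is the Frobenius norm

\[ \| A \|_F = \sqrt{ \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2 } \]

If $d_1, d_2, \dots, d_n$ are the characteristic roots of $A^TA$ then $\| A \|_F^2 = \sum_{i=1}^{n} d_i$.

Ã and X are both m×n matrices.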

Low-rank Approximation

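• Solution via SVD

\[ \tilde{A} = U \operatorname{diag}(\lambda_1, \dots, \lambda_k, 0, \dots, 0) V^T \]

i.e. set the smallest r − k singular values to zero.

[Figure: schematic of the truncated product of U, the diagonal matrix and VT for k = 2.]

In column notation, Ã is a sum of rank 1 matrices:

\[ \tilde{A} = \sum_{i=1}^{k} \lambda_i u_i v_i^T \]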

Approximation error

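• How good (bad) is this approximation?

• It’s the best possible, measured by the Frobenius norm of the error:

\[
\min_{X : \operatorname{rank}(X) = k} \| A - X \|_F^2 = \| A - \tilde{A} \|_F^2 = \sum_{i=k+1}^{r} \lambda_i^2
\]

Now $A - \tilde{A} = \sum_{i=k+1}^{r} \lambda_i u_i v_i^T$, so $\| A - \tilde{A} \|_F^2$ is the sum of the squared singular values that were set to zero.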
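A rank-k approximation and its error can be sketched in R. The matrix below is random and purely illustrative, and k = 2 is an arbitrary choice.

```r
set.seed(1)
A <- matrix(rnorm(30), nrow = 6)   # hypothetical 6 x 5 matrix
sv <- svd(A)
k <- 2

# Keep the k largest singular values and the matching singular vectors
A_k <- sv$u[, 1:k, drop = FALSE] %*%
  diag(sv$d[1:k], nrow = k) %*%
  t(sv$v[, 1:k, drop = FALSE])

err <- sum((A - A_k)^2)               # squared Frobenius norm of the error
tail_sq <- sum(sv$d[(k + 1):5]^2)     # sum of the discarded squared singular values

err
tail_sq
```

The two printed numbers agree, which is the error formula stated above. Passing `nrow = k` to `diag()` keeps the code correct for k = 1 as well.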

Row approximation and column

approximation

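Suppose $R_i$ and $C_j$ represent the i-th row and j-th column of A. The SVDs of A and Ã are

\[
A = U \Lambda V^T = \sum_{k=1}^{r} u_k \lambda_k v_k^T, \qquad
\tilde{A} = U_l \Lambda_l V_l^T = \sum_{k=1}^{l} u_k \lambda_k v_k^T
\]

The SVD equation for $R_i$ is

\[ R_i = \sum_{k=1}^{r} u_{ik} \lambda_k v_k \]

We can approximate $R_i$ by

\[ R_i^{l} = \sum_{k=1}^{l} u_{ik} \lambda_k v_k; \quad l < r \]

where i = 1, …, m.

Similarly, the SVD equation for $C_j$ is

\[ C_j = \sum_{k=1}^{r} v_{jk} \lambda_k u_k \]

where j = 1, 2, …, n.

We can also approximate $C_j$ by

\[ C_j^{l} = \sum_{k=1}^{l} v_{jk} \lambda_k u_k; \quad l < r \]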

Least square solution in inconsistent

system

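By using SVD we can solve an inconsistent system. This gives the least squares solution

\[ \min_x \| Ax - b \|_2 \]

The least squares solution is $x = A_g b$, where $A_g$ is the generalized (Moore-Penrose) inverse of A. The SVD of $A_g$ is

\[ A_g = V \Lambda^{-1} U^T \]

where $\Lambda^{-1} = \operatorname{diag}(1/\lambda_1, \dots, 1/\lambda_r)$.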
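A least squares solution of an inconsistent system through the SVD can be sketched as follows. The system below is hypothetical. The sketch assumes that A has full column rank, so that all singular values are positive, and it uses the standard identity that the Moore-Penrose inverse of $A = U \Lambda V^T$ is then $V \Lambda^{-1} U^T$.

```r
# Hypothetical inconsistent system: 4 equations, 2 unknowns
A <- cbind(1, c(1, 2, 3, 4))
b <- c(1, 3, 2, 5)

sv <- svd(A)

# Generalized inverse from the SVD (assumes all singular values are positive)
A_g <- sv$v %*% diag(1 / sv$d) %*% t(sv$u)
x <- A_g %*% b   # least squares solution

# Same solution from the normal equations
x_normal <- solve(crossprod(A), crossprod(A, b))
x
x_normal
```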

Basic Results of SVD

SVD based PCA

If we decompose X using SVD, $X = U \Lambda V^T$, we can write $XV = U \Lambda$.

Suppose $Y = XV = U \Lambda$.

Then the first column of Y represents the first principal component score, and so on.

o If the number of variables is greater than the number of observations, then SVD based PCA will give efficient results (Antti Niemistö, Statistical Analysis of Gene Expression Microarray Data, 2005).
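SVD based PCA can be sketched in R on the built-in `iris` measurements (an illustrative choice of data). The data are mean centered first, which is an assumption taken from the earlier principal component slides. The scores are compared with those returned by `prcomp()`, up to the sign of each component.

```r
X  <- as.matrix(iris[, 1:4])
Xc <- scale(X, center = TRUE, scale = FALSE)   # mean centered data

sv <- svd(Xc)
Y  <- Xc %*% sv$v                    # principal component scores, Y = XV
Y2 <- sv$u %*% diag(sv$d)            # the same scores written as U Lambda

# Variances of the components from the singular values
pc_var <- sv$d^2 / (nrow(Xc) - 1)

head(Y, 3)
pc_var
```

The component variances `pc_var` equal the characteristic roots of the sample covariance matrix, which links this route to the spectral decomposition route.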

Application of SVD

Solving linear least square Problems

Image Processing and Compression.

K-Selection for K-means clustering

Multivariate Outliers Detection

Noise Filtering

Trend detection in the observations and the variables.


Origin of biplot

Gabriel (1971)

• One of the most important advances in data analysis in recent decades

• Currently…
– > 50,000 web pages
– Numerous academic publications
– Included in most statistical analysis packages

• Still a very new technique to most scientists

Prof. Ruben Gabriel, “The founder of biplot”. Courtesy of Prof. Purificación Galindo, University of Salamanca, Spain.

What is a biplot?

• “Biplot” = “bi” + “plot”

– “plot”

• scatter plot of two rows OR of two columns, or

• scatter plot summarizing the rows OR the columns

– “bi”

• BOTH rows AND columns

• 1 biplot >> 2 plots


Practical definition of a biplot

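“Any two-way table can be analyzed using a 2D-biplot as soon as it can be sufficiently approximated by a rank-2 matrix.” (Gabriel, 1971)

(Now 3D-biplots are also possible…)

Matrix decomposition: P(4, 3) = G(4, 2) E(2, 3)

\[
\underbrace{\begin{pmatrix} 20 & -9 & 6 \\ 6 & 12 & -15 \\ -10 & -6 & 9 \\ 8 & -12 & 12 \end{pmatrix}}_{\text{G-by-E table } P}
= \underbrace{\begin{pmatrix} 4 & 3 \\ -3 & 3 \\ 1 & -3 \\ 4 & 0 \end{pmatrix}}_{G}
\underbrace{\begin{pmatrix} 2 & -3 & 3 \\ 4 & 1 & -2 \end{pmatrix}}_{E}
\]

The rows of P and G are g1, g2, g3, g4, and the columns of P and E are e1, e2, e3. The two columns of G and the two rows of E are the x and y coordinates.

[Figure: biplot of the G-by-E table, with the points G1 to G4 and E1 to E3 plotted against the X and Y axes through the origin O.]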

Singular Value Decomposition (SVD) &

Singular Value Partitioning (SVP)

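Singular value decomposition (SVD):

\[ X_{ij} = \sum_{k=1}^{r} u_{ik} \lambda_k v_{kj} \]

where r is the ‘rank’ of Y, i.e., the minimum number of PC required to fully represent Y, $u_{ik}$ is the matrix characterising the rows, $\lambda_k$ are the “singular values”, and $v_{kj}$ is the matrix characterising the columns.

Singular value partitioning (SVP):

\[ X_{ij} = \sum_{k=1}^{r} \left( u_{ik} \lambda_k^{f} \right) \left( \lambda_k^{1-f} v_{kj} \right) \]

where $u_{ik} \lambda_k^{f}$ are the row scores and $\lambda_k^{1-f} v_{kj}$ are the column scores.

Common uses value of f: f = 1, f = 0, f = 1/2.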

Biplot

The simplest biplot shows the first two PCs together with the projections of the axes of the original variables.

The x-axis represents the scores for the first principal component.

The y-axis represents the scores for the second principal component.

The original variables are represented by arrows, which graphically indicate the proportion of the original variance explained by the first two principal components.

The direction of the arrows indicates the relative loadings on the first and second principal components.

i) Graphically

ii) Effectively

iii) Conveniently.


Biplot of Iris Data

[Figure: biplot of the iris data on Comp. 1 (x-axis) and Comp. 2 (y-axis), with arrows for Sepal L., Sepal W., Petal L. and Petal W.; points are labelled 1 = Setosa, 2 = Versicolor, 3 = Virginica.]

Image Compression Example

Pansy Flower image, collected from

http://www.ats.ucla.edu/stat/r/code/pansy.jpg

Singular values of flowers image

Low rank approximation to flowers image

Rank-1 approximation and rank-5 approximation

Outlier Detection Using SVD

Nishith and Nasser (2007, MSc. thesis) propose a graphical method of outlier detection using SVD.

It is suitable for both general multivariate data and regression data. For this we construct the scatter plots of the first two PCs, and of the first PC and the third PC. We also draw a box in each scatter plot whose range is

median(1stPC) ± 3 × mad(1stPC) in the X-axis and
median(2ndPC/3rdPC) ± 3 × mad(2ndPC/3rdPC) in the Y-axis,

where mad = median absolute deviation.

The points that are outside the box can be considered extreme outliers. The points outside one side of the box are termed outliers. Along with the box we may construct another, smaller box bounded by the 2.5 or 2 MAD lines.
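The box rule can be sketched in R. Everything below is an illustration under stated assumptions: the data are simulated with three planted outliers, the PCs are taken from the SVD of the mean centered data, and R's `mad()` is used with its default scaling constant, which the slides do not specify.

```r
set.seed(1)
X <- matrix(rnorm(200), ncol = 4)   # hypothetical clean data: 50 observations, 4 variables
X[1:3, ] <- X[1:3, ] + 10           # three planted outliers (hypothetical)

sv <- svd(scale(X, center = TRUE, scale = FALSE))
pc <- sv$u %*% diag(sv$d)           # principal component scores

# TRUE where a score falls outside median +/- k * mad
outside <- function(z, k = 3) abs(z - median(z)) > k * mad(z)

out_pc1 <- outside(pc[, 1])
out_pc2 <- outside(pc[, 2])

# Observations outside the box in the plot of the first two PCs
flagged <- which(out_pc1 | out_pc2)
flagged
```

The same function with `pc[, 3]` gives the box for the plot of the first and third PC, and smaller values of `k` give the inner boxes.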

Outlier Detection Using SVD (Cont.)

HAWKINS-BRADU-KASS (1984) DATA

A data set with 14 influential observations. Among them there are ten high leverage outliers (cases 1-10) and four high leverage points (cases 11-14) (Imon, 2005).

Scatter plot of Hawkins, Bradu and Kass data: (a) scatter plot of the first two PCs and (b) scatter plot of the first and third PC.

Outlier Detection Using SVD (Cont.)

MODIFIED BROWN DATA

Data set given by Brown (1980). Ryan (1997) pointed out that the original data on the 53 patients contain 1 outlier (observation number 24). The data set was then modified by putting in two more outliers as cases 54 and 55.

It was also shown that observations 24, 54 and 55 are outliers by using the generalized standardized Pearson residual (GSPR).

Scatter plot of modified Brown data: (a) scatter plot of the first two PCs and (b) scatter plot of the first and third PC.

Cluster Detection Using SVD

Singular value decomposition is also used for cluster detection (Nishith, Nasser and Suboron, 2011).

The box ranges used in the scatter plots of the PCs are given below:

median(1st PC) ± k × mad(1st PC) in the X-axis
and median(2nd PC/3rd PC) ± k × mad(2nd PC/3rd PC) in the Y-axis,

where mad = median absolute deviation. The value of k = 1, 2, 3.

Principal stations in climate data

Climatic Variables

The climatic variables are:

1. Rainfall (RF), mm
2. Daily mean temperature (T-MEAN), °C
3. Maximum temperature (T-MAX), °C
4. Minimum temperature (T-MIN), °C
5. Day-time temperature (T-DAY), °C
6. Night-time temperature (T-NIGHT), °C
7. Daily mean water vapor pressure (VP), mbar
8. Daily mean wind speed (WS), m/sec
9. Hours of bright sunshine as percentage of maximum possible sunshine hours (MPS), %
10. Solar radiation (SR), cal/cm²/day

Consequences of SVD

Data often contain many missing values and may also contain unusual observations. Classical singular value decomposition cannot handle either type of problem.

A robust singular value decomposition can be obtained with an alternating L1 regression approach (Douglas M. Hawkins, Li Liu, and S. Stanley Young, 2001).

The Alternating L1 Regression Algorithm for Robust Singular Value

Decomposition.

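1. Start with initial values of the leading left singular vector $u_1$.

2. Fit the L1 regression coefficients $c_j$ by minimizing $\sum_{i=1}^{n} | x_{ij} - c_j u_{i1} |$; j = 1, 2, …, p.

3. Calculate the right singular vector $v_1 = c / \|c\|$, where ║.║ refers to the Euclidean norm.

4. Fit the L1 regression coefficients $d_i$ by minimizing $\sum_{j=1}^{p} | x_{ij} - d_i v_{j1} |$; i = 1, 2, …, n.

5. Calculate the resulting estimate of the left eigenvector $u_1 = d / \|d\|$.

The two regression steps alternate. For the second and subsequent terms of the SVD, we replace X by a deflated matrix obtained by subtracting the most recently found term in the SVD: $X \leftarrow X - \lambda_k u_k v_k^T$.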
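The alternating steps can be sketched in R for the first singular triple. This is an illustration under several assumptions, not the published implementation: the function names are hypothetical, the starting value of u is taken from the classical SVD, the number of iterations is fixed, and λ is taken as the norm of the last coefficient vector d. Each L1 regression through the origin is solved with a weighted median, since minimizing $\sum_i |x_i - c\,u_i|$ over c is the same as minimizing $\sum_i |u_i| \, |x_i/u_i - c|$.

```r
# Weighted median: a minimiser of sum(w * abs(z - m)) over m
weighted_median <- function(z, w) {
  o <- order(z)
  z <- z[o]
  w <- w[o]
  z[which(cumsum(w) >= sum(w) / 2)[1]]
}

# L1 regression through the origin: a minimiser of sum(abs(x - c * u)) over c
l1_slope <- function(x, u) {
  keep <- u != 0
  weighted_median(x[keep] / u[keep], abs(u[keep]))
}

# First singular triple by alternating L1 regressions
rsvd_first <- function(X, n_iter = 20) {
  u <- svd(X)$u[, 1]                       # starting value (an assumption)
  for (it in seq_len(n_iter)) {
    cvec <- apply(X, 2, l1_slope, u = u)   # one L1 fit per column j
    v <- cvec / sqrt(sum(cvec^2))          # right singular vector
    dvec <- apply(X, 1, l1_slope, u = v)   # one L1 fit per row i
    u <- dvec / sqrt(sum(dvec^2))          # left singular vector
  }
  list(lambda = sqrt(sum(dvec^2)), u = u, v = v)
}

# Check on a hypothetical matrix of exact rank one with singular value 5
u0 <- c(1, 2, 2, 4) / 5
v0 <- c(2, 1, 2) / 3
X <- 5 * u0 %*% t(v0)
fit <- rsvd_first(X)
fit$lambda
```

The example only confirms that the sketch recovers an exact rank one matrix. It does not demonstrate robustness, which depends on the starting value and on how large the contamination is relative to the rest of the data.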

Clustering weather stations on Map

Using RSVD


References

• Brown, B.W., Jr. (1980). Prediction analysis for binary data. In Biostatistics Casebook, R.G. Miller, Jr., B. Efron, B.W. Brown, Jr., L.E. Moses (Eds.), New York: Wiley.

• Dhrymes, P.J. (1984). Mathematics for Econometrics, 2nd ed. Springer Verlag, New York.

• Hawkins, D.M., Bradu, D. and Kass, G.V. (1984). Location of several outliers in multiple regression data using elemental sets. Technometrics, 26, 197-208.

• Imon, A.H.M.R. (2005). Identifying multiple influential observations in linear regression. Journal of Applied Statistics, 32, 73-90.

• Kumar, N., Nasser, M. and Sarker, S.C. (2011). A new singular value decomposition based robust graphical clustering technique and its application in climatic data. Journal of Geography and Geology, Canadian Center of Science and Education, Vol. 3, No. 1, 227-238.

• Ryan, T.P. (1997). Modern Regression Methods. Wiley, New York.

• Stewart, G.W. (1998). Matrix Algorithms, Vol. 1: Basic Decompositions. SIAM, Philadelphia.

• Matrix Decomposition. http://fedc.wiwi.hu-berlin.de/xplore/ebooks/html/csa/node36.html