4. Eigenvalues and Eigenvectors


4.1 Trace and Determinant
The covariance matrix:
$$\Sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{bmatrix}$$

Remember this matrix is symmetric so that $\sigma_{ij} = \sigma_{ji}$.


Definition 4.1: The trace of a square matrix $\Sigma$ is defined as
$$\mathrm{tr}(\Sigma) = \sum_{i=1}^{p} \sigma_{ii} = \sigma_{11} + \sigma_{22} + \cdots + \sigma_{pp};$$
i.e., the sum of a square matrix's diagonal elements.


Example: Bivariate normal distribution (bi_norm.sas)

The bivariate normal distribution examined in Chapters 1 and 3 is
$$N_2\left( \begin{bmatrix} 15 \\ 20 \end{bmatrix}, \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1.25 \end{bmatrix} \right).$$

Note that tr($\Sigma$) = 1 + 1.25 = 2.25.

2005 Christopher R. Bilder


The trace of a covariance matrix is sometimes referred to as the total variance, since all of the variances are being summed.

Using PROC IML, the trace can be found with the following code:
proc iml;
  mu = {15, 20};
  sigma = {1    0.5,
           0.5  1.25};
  tr_sigma = trace(sigma);
  print 'The trace of sigma' tr_sigma;
quit;
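The same computation is easy to check by hand; below is a minimal sketch in Python (illustrative only, since the course itself uses SAS PROC IML and R), summing the diagonal elements directly:

```python
# Example covariance matrix from the bivariate normal example
sigma = [[1.0, 0.5],
         [0.5, 1.25]]

# tr(Sigma) = sum of the diagonal elements sigma_ii
tr_sigma = sum(sigma[i][i] for i in range(len(sigma)))
print(tr_sigma)  # 2.25
```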

Definition 4.2: The determinant of a square matrix $\Sigma$ is
$$|\Sigma| = \sum_{j=1}^{p} \sigma_{1j} A_{1j}, \quad \text{where } A_{1j} = (-1)^{1+j} |\Sigma_{1j}|$$
and $\Sigma_{1j}$ is obtained from $\Sigma$ by deleting its first row and its jth column.


From Chapter 1:

The determinant of a 2×2 matrix is defined as
$$\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21},$$
and the determinant of a 3×3 matrix A can be defined as
$$\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}.$$
Example: Bivariate normal distribution (bi_norm.sas)

$$|\Sigma| = \begin{vmatrix} 1 & 0.5 \\ 0.5 & 1.25 \end{vmatrix} = 1(1.25) - 0.5(0.5) = 1.$$

The determinant of a matrix can be found in PROC IML using det(sigma), where sigma is the matrix of interest.
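Definition 4.2's cofactor expansion can also be sketched directly in Python (illustrative only; in the course, PROC IML's det() does this work):

```python
def det(a):
    """Determinant by cofactor expansion along the first row (Definition 4.2)."""
    n = len(a)
    if n == 1:
        return a[0][0]
    total = 0.0
    for j in range(n):
        # Minor: delete the first row and column j
        # (0-based j, so (-1)**j matches the (-1)^(1+j) in Definition 4.2)
        minor = [row[:j] + row[j + 1:] for row in a[1:]]
        total += (-1) ** j * a[0][j] * det(minor)
    return total

sigma = [[1.0, 0.5],
         [0.5, 1.25]]
print(det(sigma))  # 1.0 = 1(1.25) - 0.5(0.5)
```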
4.2 Eigenvalues

Eigenvalues are also known as characteristic roots.

Definition 4.3: The eigenvalues of $\Sigma$ are the roots $\lambda$ of the polynomial equation defined by $|\Sigma - \lambda I| = 0$, where I is an identity matrix.

If p = 2, then the eigenvalues are the roots of
$$|\Sigma - \lambda I| = \begin{vmatrix} \sigma_{11} - \lambda & \sigma_{12} \\ \sigma_{21} & \sigma_{22} - \lambda \end{vmatrix} = 0$$

$$(\sigma_{11} - \lambda)(\sigma_{22} - \lambda) - \sigma_{21}\sigma_{12} = \lambda^2 - (\sigma_{11} + \sigma_{22})\lambda - (\sigma_{21}\sigma_{12} - \sigma_{11}\sigma_{22}) = 0$$
$$\Rightarrow \lambda = \frac{\sigma_{11} + \sigma_{22} \pm \sqrt{(\sigma_{11} + \sigma_{22})^2 + 4(\sigma_{21}\sigma_{12} - \sigma_{11}\sigma_{22})}}{2}$$
In general, the eigenvalues are the p roots of $c_1\lambda^p + c_2\lambda^{p-1} + c_3\lambda^{p-2} + \cdots + c_p\lambda + c_{p+1} = 0$, where the $c_j$ denote constants.

Since $\Sigma$ is a symmetric matrix, the eigenvalues are real numbers and can be ordered from largest to smallest as $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p$, where $\lambda_1$ is the largest.
Example: Bivariate normal distribution (bi_norm.sas)

$$\lambda = \frac{2.25 \pm \sqrt{2.25^2 + 4(0.25 - 1.25)}}{2} = 1.6404 \text{ and } 0.6096$$

The eigenvalues of a matrix can be found in PROC IML using eigval(sigma), where sigma is the matrix of interest.
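The quadratic-formula calculation above can be reproduced with a short Python sketch (illustrative only; the course uses PROC IML's eigval()):

```python
import math

# Entries of the 2x2 symmetric covariance matrix from the example
s11, s12, s22 = 1.0, 0.5, 1.25

tr = s11 + s22                        # 2.25
det = s11 * s22 - s12 * s12           # 1.0
disc = math.sqrt(tr**2 - 4 * det)     # sqrt(1.0625)

lam1 = (tr + disc) / 2
lam2 = (tr - disc) / 2
print(round(lam1, 4), round(lam2, 4))  # 1.6404 0.6096
```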
4.3 Eigenvectors


Definition 4.4: Each eigenvalue $\lambda$ of $\Sigma$ has a corresponding nonzero vector a, called an eigenvector, that satisfies $\Sigma a = \lambda a$.

Eigenvectors for a particular eigenvalue are not unique.

When two eigenvalues are not equal, their corresponding eigenvectors are orthogonal (i.e., $a_i' a_j = 0$).
Also,
$$\mathrm{tr}(\Sigma) = \sum_{i=1}^{p} \lambda_i \quad \text{and} \quad |\Sigma| = \prod_{i=1}^{p} \lambda_i = \lambda_1 \lambda_2 \cdots \lambda_p.$$
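A quick numerical check of these two identities for the example $\Sigma$ (a Python sketch, not course code):

```python
import math

# tr(Sigma) should equal lambda1 + lambda2, and |Sigma| should
# equal lambda1 * lambda2, for Sigma = [[1, 0.5], [0.5, 1.25]].
s11, s12, s22 = 1.0, 0.5, 1.25
tr = s11 + s22
det = s11 * s22 - s12**2
d = math.sqrt(tr**2 - 4 * det)
lam1, lam2 = (tr + d) / 2, (tr - d) / 2

print(lam1 + lam2)  # 2.25 = tr(Sigma)
print(lam1 * lam2)  # 1.0  = |Sigma|
```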
Example: Bivariate normal distribution (bi_norm.sas)

The eigenvectors of $\Sigma$ satisfy
$$\begin{bmatrix} 1 & 0.5 \\ 0.5 & 1.25 \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \lambda \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}$$
for $\lambda_1$ = 1.6404 and $\lambda_2$ = 0.6096.

Two possible vectors are:
$$\begin{bmatrix} 0.6154 \\ 0.7882 \end{bmatrix} \text{ for } \lambda_1 = 1.6404 \quad \text{and} \quad \begin{bmatrix} 0.7882 \\ -0.6154 \end{bmatrix} \text{ for } \lambda_2 = 0.6096.$$

The eigenvectors of a matrix can be found in PROC IML using eigvec(sigma), where sigma is the matrix of interest.

Note that
$$\begin{bmatrix} 1 & 0.5 \\ 0.5 & 1.25 \end{bmatrix} \begin{bmatrix} 0.6154 \\ 0.7882 \end{bmatrix} = 1.6404 \begin{bmatrix} 0.6154 \\ 0.7882 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1.25 \end{bmatrix} \begin{bmatrix} 0.7882 \\ -0.6154 \end{bmatrix} = 0.6096 \begin{bmatrix} 0.7882 \\ -0.6154 \end{bmatrix}.$$
Below is part of the PROC IML code and output:

eig = eigval(sigma);
eigenvec = eigvec(sigma);
print 'The eigenvectors of sigma' eigenvec;
*verify eigenvectors;
check1_left = sigma*eigenvec[,1];
check1_right = eig[1,1]*eigenvec[,1];
check2_left = sigma*eigenvec[,2];
check2_right = eig[2,1]*eigenvec[,2];
print 'lambda1' check1_left check1_right;
print 'lambda2' check2_left check2_right;

                                   EIGENVEC
The eigenvectors of sigma    0.6154122  0.7882054
                             0.7882054  -0.615412

           CHECK1_LEFT  CHECK1_RIGHT
lambda1      1.0095149     1.0095149
             1.2929629     1.2929629

           CHECK2_LEFT  CHECK2_RIGHT
lambda2      0.4804993     0.4804993
             -0.375163     -0.375163

Note:
SAS gives the eigenvector for the largest eigenvalue
first.
These eigenvectors have a length of one.
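As an illustrative cross-check in Python (not part of the course code), the SAS output can be verified against the definition $\Sigma a = \lambda a$, along with the unit length of the first eigenvector:

```python
# Verify Sigma * a1 = lambda1 * a1 for the eigenvector reported by SAS
sigma = [[1.0, 0.5],
         [0.5, 1.25]]
a1 = [0.6154122, 0.7882054]   # eigenvector for lambda1 from the SAS output
lam1 = 1.6404

left  = [sigma[0][0] * a1[0] + sigma[0][1] * a1[1],
         sigma[1][0] * a1[0] + sigma[1][1] * a1[1]]
right = [lam1 * a1[0], lam1 * a1[1]]
print([round(v, 4) for v in left])   # [1.0095, 1.293]
print([round(v, 4) for v in right])  # [1.0095, 1.293]

# The eigenvector has length one
length_sq = a1[0]**2 + a1[1]**2
print(round(length_sq, 4))  # 1.0
```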
Positive Definite and Positive Semidefinite Matrices
Definition 4.5: If a symmetric matrix has all of its eigenvalues
positive, the matrix is called a positive definite matrix.
Definition 4.6: If a symmetric matrix has all nonnegative
eigenvalues and if at least one of the eigenvalues is actually
0, then the matrix is called a positive semidefinite matrix.
Definition 4.7: If a matrix is either positive definite or positive semidefinite, the matrix is defined to be a nonnegative definite matrix.

Covariance and correlation matrices are always nonnegative definite!


This will be important when dealing with quadratic forms.

Example: Bivariate normal distribution (bi_norm.sas)

All eigenvalues are positive, so the matrix is positive definite.
4.4 Geometric Descriptions (p=2)
Vectors
Two-dimensional vectors can be plotted like points.
Example: Bivariate normal distribution (chapter4_plots.R)
Plot the eigenvectors for the covariance matrix.


[Figure: "Eigenvectors of $\Sigma$" - a square plot with axes $a_1$ and $a_2$ running from -1 to 1, showing arrows for $(0.6154, 0.7882)'$ corresponding to $\lambda_1$ = 1.6404 and $(0.7882, -0.6154)'$ corresponding to $\lambda_2$ = 0.6096.]

#Input mu, sigma
mu<-c(15,20)
sigma<-matrix(c(1, 0.5, 0.5, 1.25), nrow=2, ncol=2)

#Find the eigenvalues with eigenvectors
eig<-eigen(sigma)
eig

#Note how individual components of eig can be accessed
eig$values
eig$vectors
eig$vectors[1,1]

############################################################
# Eigenvector plot

#Square plot (default is "m", which means maximal region)
par(pty = "s")

#Set up some dummy values for plot
a1<-c(-1,1)
a2<-c(-1,1)
plot(x = a1, y = a2, type = "n",
     main = expression(paste("Eigenvectors of ", Sigma)),
     xlab = expression(a[1]), ylab = expression(a[2]),
     panel.first = grid(col = "gray", lty = "dotted"))
#Run demo(plotmath) for help on Greek letters and other
# characters

#Draw line on plot - h specifies a horizontal line, lwd is
# line width
abline(h = 0, lty = 1, lwd = 2)
#v specifies a vertical line
abline(v = 0, lty = 1, lwd = 2)
arrows(x0 = 0, y0 = 0, x1 = 0.6154, y1 = 0.7882, col = 2,
       lty = 1)
arrows(x0 = 0, y0 = 0, x1 = 0.7882, y1 = -0.6154, col = 2,
       lty = 1)
#Note: the first arrows statement could also have been
# done by:
# arrows(x0 = 0, y0 = 0, x1 = eig$vectors[1,1],
#        y1 = eig$vectors[2,1], col = 2, lty = 1)


Bivariate Normal Distributions

The spread and direction of contours on a contour plot are related to the direction of the eigenvectors.

The first eigenvector (corresponding to $\lambda_1$) points in the direction of the major axis of the ellipse created by the contours. This is the direction of the largest variability.

The second eigenvector (corresponding to $\lambda_2$) points in the direction of the minor axis of the ellipse created by the contours. This is the direction of the smallest variability.
Example: Bivariate normal distribution (chapter4_plots.R)
Plotted below are the f(x) = 0.01 and 0.001 contours and the eigenvectors, scaled to a length of 10.


[Figure: "Contour plot for bivariate normal distribution with eigenvectors plotted" - axes $X_1$ and $X_2$ running from -10 to 25; subtitle: mu = [15, 20]', sigma = [(1, 0.5)', (0.5, 1.25)'].]

Notes:
The eigenvectors for $\lambda_1$ and $\lambda_2$ point in the direction of the ellipse's major and minor axes.
The $X_1$ and $X_2$ axes are switched from the contour plot shown in Chapter 1.
Below is the R code:

library(mvtnorm)  #Need for dmvnorm() function

#Sequence of numbers from 10-25 by 0.1
x1<-seq(from = 10, to = 25, by = 0.1)
x2<-seq(from = 10, to = 25, by = 0.1)

#Finds all possible combinations of x1 and x2
all<-expand.grid(x1, x2)

#Finds f(x) - dmvnorm() is in the mvtnorm package
f.x<-dmvnorm(all, mean = mu, sigma = sigma)

#Puts f(x) values in a matrix - need for contour(), but
# not for contourplot()
f.x2<-matrix(f.x, nrow=length(x1), ncol=length(x2))

par(pty = "s")
#The f.x2 values need to be in a matrix. Think of x1 as
# denoting the rows and x2 as denoting the columns
# corresponding to the values of x1 and x2 used to
# produce a f(x) in a cell of the matrix
contour(x = x1, y = x2, z = f.x2, xlim = c(-10,25),
        ylim = c(-10,25), levels = c(0.01, 0.001),
        xlab = expression(X[1]), ylab = expression(X[2]),
        panel.first = grid(col = "gray", lty = "dotted"),
        main = "Contour plot for bivariate normal distribution
          \n with eigenvectors plotted",
        sub = "mu = [15, 20]', sigma = [(1, 0.5)', (0.5, 1.25)']")
abline(h = 0, lty = 1, lwd = 2)
abline(v = 0, lty = 1, lwd = 2)

#Vector lengths are 10 in order to help see better on the
# plot
arrows(x0 = 0, y0 = 0, x1 = 10*0.6154, y1 = 10*0.7882,
       col = 2, lty = 1)
arrows(x0 = 0, y0 = 0, x1 = 10*0.7882, y1 = 10*(-0.6154),
       col = 2, lty = 1)


Ellipsoid of concentration: $(\mathbf{x} - \boldsymbol{\mu})' \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}) = c^2$

Note that this comes from the exponent in the multivariate normal distribution:
$$f(\mathbf{x} \mid \boldsymbol{\mu}, \Sigma) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})' \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right\}$$

Johnson (1998) plots the ellipsoid of concentration instead of the actual contours of the distribution.
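To make the quadratic form concrete, here is a small Python sketch (illustrative; the point x = (16, 21)' is a made-up example) that evaluates $(\mathbf{x} - \boldsymbol{\mu})' \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})$ for the running bivariate normal:

```python
# Evaluate the quadratic form for the running example
mu = [15.0, 20.0]
sigma = [[1.0, 0.5],
         [0.5, 1.25]]

# Inverse of a 2x2 matrix: (1/|Sigma|) * [[s22, -s12], [-s21, s11]]
det = sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]   # 1.0
inv = [[ sigma[1][1] / det, -sigma[0][1] / det],
       [-sigma[1][0] / det,  sigma[0][0] / det]]

x = [16.0, 21.0]                       # hypothetical point
d = [x[0] - mu[0], x[1] - mu[1]]       # (1, 1)'
q = (d[0] * (inv[0][0] * d[0] + inv[0][1] * d[1]) +
     d[1] * (inv[1][0] * d[0] + inv[1][1] * d[1]))
print(q)  # 1.25: all x with the same q lie on one elliptical contour
```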
Example: Bivariate normal distribution (bi_norm.sas)

What happens if Var(X₂) is very small? Let
$$\mathbf{X} \sim N_2\left( \begin{bmatrix} 15 \\ 20 \end{bmatrix}, \begin{bmatrix} 1 & 0.5 \\ 0.5 & 0.26 \end{bmatrix} \right)$$
and note that $|\Sigma|$ = 0.01 > 0, $\lambda_1$ = 1.25, and $\lambda_2$ = 0.01. The correlation is $0.5 / \sqrt{1 \times 0.26}$ = 0.98.

Below is the contour plot. Notice that there is little variability along the minor axis.
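The quantities quoted above can be reproduced with a short Python sketch (illustrative only):

```python
import math

# Near-singular covariance matrix: Sigma = [[1, 0.5], [0.5, 0.26]]
s11, s12, s22 = 1.0, 0.5, 0.26

det = s11 * s22 - s12**2            # 0.01 > 0, so Sigma is still positive definite
corr = s12 / math.sqrt(s11 * s22)   # correlation between X1 and X2

tr = s11 + s22
d = math.sqrt(tr**2 - 4 * det)
lam1, lam2 = (tr + d) / 2, (tr - d) / 2
print(round(det, 2), round(corr, 2))    # 0.01 0.98
print(round(lam1, 2), round(lam2, 2))   # 1.25 0.01
```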

Note:
As $\lambda_2$ goes to 0, the contours become a straight line. Thus, the data can be viewed in ONE dimension!
4.5 Geometric Descriptions (p=3)
Vectors

Plot 3×1 vectors in 3 dimensions, with each axis corresponding to one of the three variable values.


Trivariate Normal Distributions

3D scatter plots of data from this type of distribution can be constructed. The data points will appear to be enclosed within a 3D ellipsoid.

The sizes of the 3D ellipsoid's axes correspond to the sizes of the eigenvalues. A small eigenvalue corresponds to a small axis, and vice versa for a large eigenvalue. See Johnson's (1998) comparison to a football. Pay special attention to examples where an eigenvalue is 0 (the dimension of the football is reduced).

Below is a 3D contour plot that a group of students did for a project.


This is for a multivariate normal distribution with
$$\boldsymbol{\mu} = \begin{bmatrix} 2 \\ 2 \\ 2 \end{bmatrix} \quad \text{and} \quad \Sigma = \begin{bmatrix} 1 & 1 & 0.5 \\ 1 & 2 & 1 \\ 0.5 & 1 & 1.5 \end{bmatrix}.$$
The eigenvalues of the covariance matrix are 3.31, 0.83, and 0.36, and the eigenvectors are plotted.


4.6 Geometric Descriptions (p>3)

Johnson (1998) presents a great lead into principal components analysis on p. 90-91! Below is a summary.

Suppose there are 15 variables in a data set. A plot of these variables in 15 dimensions would be VERY difficult to do! If possible, we would prefer to view data values in 2 or 3 dimensions. Suppose that all of the estimated covariance matrix's eigenvalues were approximately 0 except for 2 or 3. This implies that the data can be viewed in 2 or 3 dimensions without losing much information.

Principal components analysis helps to determine the dimensionality of the data set.
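The idea can be sketched numerically; the 15 eigenvalues below are hypothetical, not from any data set in the course:

```python
# If most eigenvalues of the covariance matrix are near zero, the data
# effectively live in fewer dimensions. Hypothetical eigenvalues for a
# 15-variable data set, ordered from largest to smallest:
eigenvalues = [6.2, 4.9, 2.7] + [0.01] * 12

# Since tr(Sigma) = sum of the eigenvalues (total variance), the share of
# total variance captured by the 3 largest eigenvalues is:
total = sum(eigenvalues)
prop = sum(eigenvalues[:3]) / total
print(round(prop, 3))  # 0.991: 3 dimensions retain about 99% of the variance
```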
For more information about matrix algebra with statistical
applications, see Graybill (1981).
