
Econometrics - II

Indira Gandhi Institute of Development Research


January - May Semester 2013
© Subrata Sarkar

Elements Of Matrix Algebra


Start with an example
$$Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + \cdots + \beta_k X_{kt} + U_t, \qquad t = 1, 2, \ldots, n$$

Writing the model for each observation:
$$\begin{aligned}
Y_1 &= \beta_1 + \beta_2 X_{21} + \beta_3 X_{31} + \cdots + \beta_k X_{k1} + U_1 \\
Y_2 &= \beta_1 + \beta_2 X_{22} + \beta_3 X_{32} + \cdots + \beta_k X_{k2} + U_2 \\
&\;\;\vdots \\
Y_n &= \beta_1 + \beta_2 X_{2n} + \beta_3 X_{3n} + \cdots + \beta_k X_{kn} + U_n
\end{aligned}$$
We can summarize these $n$ equations in a convenient form:

$$\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}
=
\begin{bmatrix}
1 & X_{21} & \cdots & X_{k1} \\
1 & X_{22} & \cdots & X_{k2} \\
\vdots & \vdots & & \vdots \\
1 & X_{2n} & \cdots & X_{kn}
\end{bmatrix}
\begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix}
+
\begin{bmatrix} U_1 \\ U_2 \\ \vdots \\ U_n \end{bmatrix}$$

or, $Y = X\beta + U$, where $Y$, $X$, $\beta$, $U$ are vectors and matrices.
Vector: an ordered sequence of numbers arranged in a row or a column.
$Y$, $\beta$, $U$ are arranged in columns: $U$ and $Y$ are $n$-element column vectors, and $\beta$ is a $k$-element column vector.

$Y' = [Y_1, Y_2, \ldots, Y_n]$ is the transpose of $Y$.
$U' = [U_1, U_2, \ldots, U_n]$ is the transpose of $U$.
$\beta' = [\beta_1, \beta_2, \ldots, \beta_k]$ is the transpose of $\beta$.

$$X = \begin{bmatrix}
1 & X_{21} & \cdots & X_{k1} \\
1 & X_{22} & \cdots & X_{k2} \\
\vdots & \vdots & & \vdots \\
1 & X_{2n} & \cdots & X_{kn}
\end{bmatrix}$$

$X$ is a matrix.

Matrix: a rectangular array of elements.

Order of a matrix = number of rows $\times$ number of columns = $n \times k$ (the number of rows is always written first).
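As a quick numerical companion to $Y = X\beta + U$ (a minimal sketch in NumPy; the sample size, coefficient values, and distributions are made up purely for illustration):

```python
import numpy as np

n, k = 5, 3                          # 5 observations, 3 regressors (incl. intercept)
rng = np.random.default_rng(0)

# X: n x k matrix with a column of ones for the intercept
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta = np.array([1.0, 2.0, -0.5])    # k-element coefficient vector
U = rng.normal(size=n)               # n-element disturbance vector

Y = X @ beta + U                     # all n equations in one line
print(Y.shape)                       # (5,) -- an n-element vector
```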
Observations:
1. A column vector of $n$ elements, i.e.
$$Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}$$
is a matrix of order $n \times 1$.
2. A row vector of $k$ elements, i.e. $Z' = [Z_1\;\; Z_2\;\; \cdots\;\; Z_k]$, is a matrix of order $1 \times k$.
3. Representing the $X$ matrix by its columns,
$$X = [X_1\;\; X_2\;\; \cdots\;\; X_k], \qquad \text{each } X_j \text{ of order } n \times 1,$$
or by its rows,
$$X = \begin{bmatrix} S_1 \\ S_2 \\ \vdots \\ S_n \end{bmatrix}, \qquad \text{each } S_i \text{ of order } 1 \times k.$$

4. Transpose of a matrix: if $X_{n\times k} = \{X_{ij}\}$, then $X'_{k\times n} = \{X_{ji}\}$.
Example:
$$X = \begin{bmatrix} 1 & 3 & 4 & 5 \\ 6 & 2 & 1 & 3 \\ 4 & 2 & 1 & 5 \end{bmatrix}, \qquad
X' = \begin{bmatrix} 1 & 6 & 4 \\ 3 & 2 & 2 \\ 4 & 1 & 1 \\ 5 & 3 & 5 \end{bmatrix}$$

1. Operations on Vectors
(a) Multiplication by a scalar:
$$2 \begin{bmatrix} 2 \\ 3 \\ 4 \end{bmatrix} = \begin{bmatrix} 2\cdot 2 \\ 2\cdot 3 \\ 2\cdot 4 \end{bmatrix} = \begin{bmatrix} 4 \\ 6 \\ 8 \end{bmatrix}$$
(b) Addition of two vectors: $U + V$ = sum of corresponding elements; the orders have to be the same.
(c) Linear combination: $K_1 U + K_2 V$ where $K_1$ and $K_2$ are constants.
(d) Vector multiplication:
$$a'b = [1\;\; 2\;\; 3] \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix} = 1\cdot 4 + 2\cdot 5 + 3\cdot 6$$
Here $a'$ is $1 \times 3$ and $b$ is $3 \times 1$: the numbers of elements have to be the same.

A special vector: $S$, the sum vector,
$$S = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}_{n \times 1}, \qquad \text{therefore } a'S = \sum_{i=1}^{n} a_i$$
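A short NumPy sketch of these vector operations (the values are illustrative):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(2 * a)          # scalar multiplication: [2 4 6]
print(a + b)          # element-wise addition: [5 7 9]
print(a @ b)          # inner product a'b = 1*4 + 2*5 + 3*6 = 32

S = np.ones(3)        # the sum vector
print(a @ S)          # a'S = sum of the elements of a = 6.0
```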
2. Operations on Matrices
(a) Multiplication by a scalar: $KA = \{Ka_{ij}\}$.
(b) Addition of two matrices: sum of corresponding elements.
(c) Equality of matrices: the orders have to be the same.
(d) Matrix multiplication: $A_{n\times k} B_{k\times m}$,
$$\underset{n\times k}{A}\;\underset{k\times m}{B} =
\begin{bmatrix} a_{11} & \cdots & a_{1k} \\ a_{21} & \cdots & a_{2k} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nk} \end{bmatrix}
\begin{bmatrix} b_{11} & \cdots & b_{1m} \\ b_{21} & \cdots & b_{2m} \\ \vdots & & \vdots \\ b_{k1} & \cdots & b_{km} \end{bmatrix}
= \underset{n\times m}{\begin{bmatrix} a_1'b_1 & \cdots & a_1'b_m \\ a_2'b_1 & \cdots & a_2'b_m \\ \vdots & & \vdots \\ a_n'b_1 & \cdots & a_n'b_m \end{bmatrix}}$$
where $a_i'$ is the $i$-th row of $A$ and $b_j$ is the $j$-th column of $B$. The two matrices have to be conformable.


An example:
$$\begin{bmatrix} 2 & 3 \\ 3 & 1 \\ 4 & 2 \end{bmatrix}_{3\times 2}
\begin{bmatrix} 6 & 3 \\ 2 & 2 \end{bmatrix}_{2\times 2}
=
\begin{bmatrix}
2\cdot 6 + 3\cdot 2 & 2\cdot 3 + 3\cdot 2 \\
3\cdot 6 + 1\cdot 2 & 3\cdot 3 + 1\cdot 2 \\
4\cdot 6 + 2\cdot 2 & 4\cdot 3 + 2\cdot 2
\end{bmatrix}
=
\begin{bmatrix} 18 & 12 \\ 20 & 11 \\ 28 & 16 \end{bmatrix}$$
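The same product in NumPy, reproducing the arithmetic above:

```python
import numpy as np

A = np.array([[2, 3],
              [3, 1],
              [4, 2]])          # 3 x 2
B = np.array([[6, 3],
              [2, 2]])          # 2 x 2

print(A @ B)                    # (3 x 2)(2 x 2) gives 3 x 2:
# [[18 12]
#  [20 11]
#  [28 16]]
```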

A special case: multiplying a matrix by a column vector,
$$\begin{bmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nm} \end{bmatrix}
\begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_m \end{bmatrix}
=
\begin{bmatrix} a_1'\lambda \\ a_2'\lambda \\ \vdots \\ a_n'\lambda \end{bmatrix}$$

3. Some Special Matrices

(a) Diagonal matrix (has to be square):
$$A_{n\times n} = \begin{bmatrix} a_{11} & & & \\ & a_{22} & & \\ & & \ddots & \\ & & & a_{nn} \end{bmatrix}$$
(b) The identity matrix:
$$I_{n\times n} = \begin{bmatrix} 1 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{bmatrix}$$
(c) Symmetric matrix: $A = A'$.
(d) A scalar matrix:
$$\begin{bmatrix} \lambda & & \\ & \ddots & \\ & & \lambda \end{bmatrix} = \lambda I$$
(e) Idempotent matrix (has to be square): $A = A^2$, and hence $A = A^2 = A^3 = \cdots$

4. Some Properties of Matrices

(a) $(AB)' = B'A'$, $(ABC)' = C'B'A'$
(b) $(A + B) + C = A + (B + C)$
(c) $(AB)C = A(BC)$
(d) $A(B + C) = AB + AC$
(e) $AI = A$
(f) $(A + B)' = A' + B'$

5. Trace of a Square Matrix
$$\operatorname{tr}(A) = \sum_{i=1}^{n} a_{ii}$$
Properties of trace: $\operatorname{tr}(ABC) = \operatorname{tr}(BCA) = \operatorname{tr}(CAB)$, i.e. the trace is invariant under cyclic permutations.
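A quick numerical check of the cyclic property (the matrices are drawn at random, so this is an illustration rather than a proof):

```python
import numpy as np

rng = np.random.default_rng(1)
A, B, C = (rng.normal(size=(3, 3)) for _ in range(3))

t1 = np.trace(A @ B @ C)
t2 = np.trace(B @ C @ A)
t3 = np.trace(C @ A @ B)
print(np.isclose(t1, t2) and np.isclose(t2, t3))   # True
```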
6. Matrix Inverse
In algebra we have $ab = 1 \Rightarrow b = \frac{1}{a}$. In matrix algebra we ask: given $A_{n\times n}$, does there exist $B$ such that $AB = I_n$?
Answer: if the columns of $A$ are linearly independent, then there exists $B$ such that $AB = I$. In that case $B$ is denoted $A^{-1}$, i.e. $AA^{-1} = I$.
Linear independence: $a_1, a_2, \ldots, a_n$ are linearly independent if none of them can be written as a linear combination of the others; if they are not, then some $a_i$ can be written as a linear combination of the other $a_i$'s.
Theorem: if all columns of $A$ are linearly independent, then so are all the rows. Then there exists $C$ such that $CA = I$.
Now
$$C = CI = C(AB) = (CA)B = IB = B$$
Therefore $C = B = A^{-1}$.
Therefore, if $A$ is a square matrix with all columns (rows) linearly independent, then there exists a unique matrix, called the inverse of $A$ and denoted $A^{-1}$, such that $AA^{-1} = A^{-1}A = I$. Such an $A$ is non-singular.

7. Properties of Inverse
(a) $[A^{-1}]^{-1} = A$
(b) $[A']^{-1} = [A^{-1}]'$
(c) $[AB]^{-1} = B^{-1}A^{-1}$
8. Calculation of the Inverse
$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$
Replace each element by its minor:
$$\begin{bmatrix} a_{22} & a_{21} \\ a_{12} & a_{11} \end{bmatrix}$$
Sign the minors with $(-1)^{i+j}$, i.e. get the cofactors:
$$\begin{bmatrix} a_{22} & -a_{21} \\ -a_{12} & a_{11} \end{bmatrix}$$
Transpose to get the adjoint:
$$\operatorname{Adj}(A) = \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}$$
Get the determinant: $|A| = a_{11}a_{22} - a_{12}a_{21}$.
Divide each element of $\operatorname{Adj}(A)$ by $|A|$. Therefore
$$A^{-1} = \frac{1}{|A|}\operatorname{Adj}(A)$$

For the $3 \times 3$ case,
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$
Step 1: Minors. The minors of the first row are
$$(a_{22}a_{33} - a_{23}a_{32}), \quad (a_{21}a_{33} - a_{23}a_{31}), \quad (a_{21}a_{32} - a_{22}a_{31})$$
and similarly for the other rows.
Step 2: Cofactors. Apply the sign pattern
$$\begin{bmatrix} + & - & + \\ - & + & - \\ + & - & + \end{bmatrix}$$
Step 3: Transpose the cofactor matrix to get the adjoint.
Step 4: Determinant. Expanding along the first row,
$$|A| = a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31})$$
Step 5: Inverse. Divide every element of the adjoint (Step 3) by the determinant (Step 4).
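A minimal sketch of the adjoint (cofactor) method in NumPy, checked against `np.linalg.inv`; the 3×3 matrix is an arbitrary non-singular example:

```python
import numpy as np

def inverse_via_adjoint(A):
    """Invert a square matrix by the cofactor/adjoint method (Steps 1-5)."""
    n = A.shape[0]
    C = np.empty_like(A, dtype=float)            # cofactor matrix
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T / np.linalg.det(A)                # Adj(A) / |A|

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
print(np.allclose(inverse_via_adjoint(A), np.linalg.inv(A)))   # True
```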
9. The Rank of a Matrix
The rank of a matrix $A$, not necessarily square, is the maximum number of linearly independent columns (or rows). The maximum number of linearly independent columns of $A$ equals the maximum number of linearly independent rows of $A$. The rank is unique and is denoted by $\rho(A)$.
$$\rho(A) \le \min[m, n]$$
When $\rho(A) = m < n$, $A$ has full row rank.
When $\rho(A) = n < m$, $A$ has full column rank.
If $A$ is a square matrix of order $n$ with full row (column) rank, then $A$ is non-singular.

Example:
$$A = \begin{bmatrix} 1 & 1 & 2 & 3 \\ 2 & 0 & 2 & 6 \\ 3 & 1 & 4 & 7 \\ 4 & 1 & 5 & 4 \end{bmatrix}$$
Here the third column equals the sum of the first two, so the columns are linearly dependent and $\rho(A) = 3 < 4$: $A$ is singular.
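Checking the rank numerically (using the reconstructed matrix above):

```python
import numpy as np

A = np.array([[1, 1, 2, 3],
              [2, 0, 2, 6],
              [3, 1, 4, 7],
              [4, 1, 5, 4]])

print(np.linalg.matrix_rank(A))        # 3: the columns are linearly dependent
print(A[:, 0] + A[:, 1] == A[:, 2])    # [True True True True]: col3 = col1 + col2
```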

Summary of Basic Matrix Algebra

1. Matrix: a rectangular array of elements.
$$A = \begin{bmatrix} 1 & 2 & 3 & 2 \\ 4 & 5 & 6 & 7 \\ 7 & 8 & 9 & 2 \end{bmatrix}_{3\times 4} = \{a_{ij}\}$$
$A$ is a 3 (rows) $\times$ 4 (columns) matrix.

2. Row vector: $x = [1\;\; 2\;\; 3\;\; 4]_{1\times 4}$

3. Column vector: $y = \begin{bmatrix} 8 \\ 9 \\ \vdots \end{bmatrix}_{3\times 1}$

4. Diagonal matrix: $D = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}$

5. Symmetric matrix: $\{a_{ij}\} = \{a_{ji}\}$, e.g. $C = \begin{bmatrix} 1 & 2 \\ 2 & 7 \end{bmatrix}$

6. Transpose of a matrix:
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}_{2\times 3}, \qquad
A' = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}_{3\times 2}$$
A symmetric matrix satisfies $A = A'$.

7. Rank of a matrix: the number of linearly independent rows (columns),
$$\rho(A_{m\times n}) \le \min[m, n]$$

8. Square matrix $A_{n\times n}$: if $\rho(A) = n$, then $A$ has an inverse:
$$AA^{-1} = I = A^{-1}A, \qquad \text{where } I_{n\times n} = \begin{bmatrix} 1 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{bmatrix}$$

9. Addition of matrices: for $A_{n\times m}$ and $B_{n\times m}$ (same order),
$$A + B = \{a_{ij}\} + \{b_{ij}\} = \{a_{ij} + b_{ij}\}$$
10. Multiplication: $A_{n\times m} B_{m\times p} = (AB)_{n\times p}$; the matrices must be conformable.
11. $(AB)' = B'A'$
12. $(AB)^{-1} = B^{-1}A^{-1}$, assuming $A$ and $B$ are square matrices with full rank.


Quadratic Form and Matrix Derivatives

1. Quadratic Form
Consider the expression
$$q_1 = 2X_1^2 + X_1X_2 + X_3^2$$
Calling $X$ the column vector of $X$'s, i.e. $X' = [X_1, X_2, \ldots, X_n]$, a quadratic form can be put in the form $q = X'AX$ with $A$ symmetric. $A$ is unique once the order of $X$ is chosen. $A$ has
- in the diagonal, $a_{ii}$, the coefficient attached to $X_i^2$;
- in the off-diagonal, $a_{ij}$, $\frac{1}{2}$ the coefficient attached to $X_iX_j$.
In our example:
$$A = \begin{bmatrix} 2 & 1/2 & 0 \\ 1/2 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
"

Example 1: A =

2 1
1 1

X 0 AX = 2X12 + 2X1 X2 + X22 0 X


= (X1 + X2 )2 + X12 > 0 X =
6 0

"

Example 2: A =

but X 6= 0

1 1
1 1

X 0 AX = X12 + 2X1 X2 + X22


= (X1 + X2 )2 0 X
such that X 0 AX = 0

Definition:
- A quadratic form is said to be positive definite (P.D.) if $X'AX > 0$ for all $X \ne 0$.
- A quadratic form is said to be positive semi-definite (P.S.D.) if $X'AX \ge 0$ for all $X$, and there exists $X \ne 0$ such that $X'AX = 0$.
Remarks:
(a) A matrix is said to be n.n.d. (non-negative definite) if it is either P.D. or P.S.D.
(b) The concepts of n.d. and n.s.d. can be defined similarly (by reversing the sign).
(c) A symmetric matrix $A$ is said to be P.D. (P.S.D.) if the associated quadratic form is P.D. (P.S.D.).
There are three equivalent conditions for a symmetric matrix $A$ to be P.D. These are iff conditions:
(a) The matrix $A$ is non-singular.
(b) There exists a non-singular matrix $P$ such that $P'P = A$.
(c) There exists a non-singular matrix $Q$ such that $QAQ' = I$.
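For symmetric matrices, definiteness can also be checked numerically through the eigenvalues (all positive ⟺ P.D.; all non-negative with at least one zero ⟺ P.S.D.). A sketch using the two examples above:

```python
import numpy as np

A1 = np.array([[2.0, 1.0], [1.0, 1.0]])   # Example 1
A2 = np.array([[1.0, 1.0], [1.0, 1.0]])   # Example 2

print(np.linalg.eigvalsh(A1))   # both eigenvalues > 0  -> positive definite
print(np.linalg.eigvalsh(A2))   # eigenvalues 0 and 2   -> positive semi-definite
```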

Some more properties related to quadratic forms:

(a) Let $B$ be any $n \times k$ matrix. Then
 i. $B'B$ (of order $k \times k$) is n.n.d.
 ii. $B'B$ is p.d. if $\operatorname{rank}(B) = k$.
 iii. $B'B$ is p.s.d. if $\operatorname{rank}(B) < k$.
(b) If $A$ is p.d. and $B$ is n.n.d., then $A + B$ is p.d.
(c) Let $A_{n\times n}$ be p.d. and $B_{n\times k}$ be any matrix. Then
 i. $B'AB$ (of order $k \times k$) is n.n.d.
 ii. $B'AB$ is p.d. if $\operatorname{rank}(B) = k$.
 iii. $B'AB$ is p.s.d. if $\operatorname{rank}(B) < k$.

2. Matrix Derivatives
(a) Scalar function: $Y = f(X)$, where $X$ is a vector. For example, with $X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$,
$$Y = A X_1^{\alpha} X_2^{\beta} = f(X_1, X_2)$$
Definition:
$$\frac{\partial Y}{\partial X} = \begin{bmatrix} \partial Y/\partial X_1 \\ \partial Y/\partial X_2 \\ \vdots \\ \partial Y/\partial X_n \end{bmatrix} \quad \text{(the gradient vector)}, \qquad
\frac{\partial Y}{\partial X'} = \begin{bmatrix} \dfrac{\partial Y}{\partial X_1} & \cdots & \dfrac{\partial Y}{\partial X_n} \end{bmatrix}$$
In our example:
$$\frac{\partial Y}{\partial X} = \begin{bmatrix} \alpha A X_1^{\alpha-1} X_2^{\beta} \\ \beta A X_1^{\alpha} X_2^{\beta-1} \end{bmatrix}$$

i. Special case: linear function
$$Y = P_1X_1 + P_2X_2 + \cdots + P_nX_n = P'X = X'P$$
$$\frac{\partial Y}{\partial X'} = [P_1\;\; P_2\;\; \cdots\;\; P_n] = P', \qquad \frac{\partial Y}{\partial X} = P$$
ii. Special case: quadratic form
$$Y = X'AX, \quad A \text{ symmetric} \qquad\Rightarrow\qquad \frac{\partial Y}{\partial X} = 2AX$$
Example: $A = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}$, so $Y = 2X_1^2 + 2X_1X_2 + X_2^2$ and
$$\frac{\partial Y}{\partial X} = \begin{bmatrix} 4X_1 + 2X_2 \\ 2X_1 + 2X_2 \end{bmatrix} = 2AX$$
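A numerical check of the rule $\partial(X'AX)/\partial X = 2AX$ using central finite differences (the evaluation point is arbitrary):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 1.0]])
f = lambda x: x @ A @ x                 # the quadratic form X'AX

x0 = np.array([1.0, -2.0])
h = 1e-6
numgrad = np.array([
    (f(x0 + h * e) - f(x0 - h * e)) / (2 * h)
    for e in np.eye(2)                  # perturb one coordinate at a time
])

print(numgrad)        # approx [ 0. -2.]
print(2 * A @ x0)     # exact  [ 0. -2.]
```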

(b) Vector function: $Y_{m\times 1} = F_{m\times 1}(X_{n\times 1})$, i.e.
$$Y_1 = F_1(X), \quad Y_2 = F_2(X), \quad \ldots, \quad Y_m = F_m(X)$$
Then
$$\frac{\partial Y}{\partial X'} =
\begin{bmatrix}
\dfrac{\partial Y_1}{\partial X_1} & \dfrac{\partial Y_1}{\partial X_2} & \cdots & \dfrac{\partial Y_1}{\partial X_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial Y_m}{\partial X_1} & \dfrac{\partial Y_m}{\partial X_2} & \cdots & \dfrac{\partial Y_m}{\partial X_n}
\end{bmatrix}$$
This is the Jacobian matrix.

Special case: linear vector functions
$$Y_1 = P_1'X, \quad Y_2 = P_2'X, \quad \ldots, \quad Y_m = P_m'X$$
i.e. $Y = PX$ where
$$P = \begin{bmatrix} P_1' \\ P_2' \\ \vdots \\ P_m' \end{bmatrix}$$
Then
$$\frac{\partial Y}{\partial X'} = P$$
In particular, $X = IX$, so $\dfrac{\partial X}{\partial X'} = I$.
(c) Application of derivatives: minimize
$$q = \frac{1}{2}X'AX + b'X + c, \qquad A \text{ p.d.}$$
$$\frac{\partial q}{\partial X} = \frac{1}{2}\,2AX + b = AX + b = 0 \quad \text{(F.O.C.)}$$
Therefore $AX = -b$, so
$$X^* = -A^{-1}b \qquad [A^{-1} \text{ exists since } A \text{ is p.d.}]$$
$$\frac{\partial^2 q}{\partial X \partial X'} = A \quad \text{is p.d. (the Hessian matrix)}$$

Also, $X^*$ defines a minimum of $q$.
Proof: let $X = X^* + Z$. Then
$$X'AX = (X^* + Z)'A(X^* + Z) = X^{*\prime}AX^* + 2X^{*\prime}AZ + Z'AZ$$
Therefore
$$\begin{aligned}
q &= \frac{1}{2}X^{*\prime}AX^* + X^{*\prime}AZ + \frac{1}{2}Z'AZ + b'(X^* + Z) + c \\
&= \frac{1}{2}X^{*\prime}AX^* + b'X^* + c + X^{*\prime}AZ + \frac{1}{2}Z'AZ + b'Z \\
&= q^* + (-b'A^{-1})AZ + \frac{1}{2}Z'AZ + b'Z \\
&= q^* - b'Z + \frac{1}{2}Z'AZ + b'Z \\
&= q^* + \frac{1}{2}\underbrace{Z'AZ}_{>0 \text{ for } Z \ne 0}
\end{aligned}$$
Therefore $q > q^*$ for $X \ne X^*$.
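A small sketch of this result: solve the F.O.C. for $X^* = -A^{-1}b$ and confirm that nearby points give a larger $q$ ($A$, $b$, $c$ are arbitrary illustrative values, with $A$ p.d.):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 1.0]])   # p.d. (see the eigenvalue check earlier)
b = np.array([1.0, -1.0])
c = 3.0

q = lambda x: 0.5 * x @ A @ x + b @ x + c

x_star = -np.linalg.solve(A, b)          # X* = -A^{-1} b
rng = np.random.default_rng(2)
trials = [q(x_star + rng.normal(size=2)) for _ in range(1000)]
print(all(t > q(x_star) for t in trials))   # True: X* is the minimum
```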

Matrix Statistics

(a) Random vectors and matrices
If $X_1, X_2, \ldots, X_n$ are random variables, then
$$X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix}$$
is a random vector; its elements are random variables. Likewise, $W = \{W_{ij}\}$ is a random matrix when the $W_{ij}$'s are all random variables.

(b) Expectation
$$E(X) = [E(X_i)] = \begin{bmatrix} E(X_1) \\ E(X_2) \\ \vdots \\ E(X_n) \end{bmatrix}$$
Let $E(X_i) = \mu$ for all $i$. Then
$$E(X) = \begin{bmatrix} E(X_1) \\ \vdots \\ E(X_n) \end{bmatrix} = \mu S_n$$
where $S_n$ is the sum vector.
Properties of expectation (let $A$, $B$, $C$, $U$ be constants):
i. $E(U) = U$
ii. $E(AX) = AE(X)$
iii. $E(X + Y) = E(X) + E(Y)$
iv. $E(BXC) = BE(X)C$
v. $E(W_1W_2) = E(W_1)E(W_2)$ when $W_1$ and $W_2$ are independent

(c) Variance and covariance matrix
$$X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix}, \qquad
E(X) = \begin{bmatrix} E(X_1) \\ \vdots \\ E(X_n) \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \vdots \\ \mu_n \end{bmatrix}, \qquad E(X_i) = \mu_i$$
i. There are $n$ expectations.
ii. There are variances and covariances:
- there are $n$ variances, $E[X_i - \mu_i]^2 = \sigma_{ii} > 0$;
- there are $\frac{n(n-1)}{2}$ covariances, $E[X_i - \mu_i][X_j - \mu_j] = \sigma_{ij}$, $i \ne j$, with $\sigma_{ij} = \sigma_{ji}$.
The variance-covariance matrix of $X$ can be written as
$$V(X) = E[(X - \mu)(X - \mu)'] = \{E(X_i - \mu_i)(X_j - \mu_j)\} = \{\sigma_{ij}\} = V$$
where $E(X) = \mu = [\mu_1, \ldots, \mu_n]'$. In the diagonal we have variances and in the off-diagonal we have covariances. The matrix is symmetric.
Remarks:
i. If the $X_i$'s are uncorrelated, then $\sigma_{ij} = 0$ for all $i \ne j$, so $V = \operatorname{diag}\{\sigma_{ii}\}$.
ii. In addition, if there is homoskedasticity, i.e. $\sigma_{ii} = \sigma^2$ for all $i$, then $V = \sigma^2 I_n$.
iii. If $E(X) = 0$, then $V(X) = E(XX')$.
(d) Linear transformation
Consider $X$ with $E(X) = \mu$, $V(X) = V$. Define $Y = AX$; $Y$ is a linear transformation (L.T.) of $X$. Then
$$E(Y) = A\mu, \qquad V(Y) = AVA'$$
Proof: $Y = AX$, so $E(Y) = E(AX) = AE(X) = A\mu$, and
$$\begin{aligned}
V(Y) &= E(Y - E(Y))(Y - E(Y))' \\
&= E[AX - A\mu][AX - A\mu]' \\
&= E[A(X - \mu)][A(X - \mu)]' \\
&= E[A(X - \mu)(X - \mu)'A'] \\
&= A\,E(X - \mu)(X - \mu)'\,A' \\
&= AVA'
\end{aligned}$$
Now consider the scalar linear transformation $Y = Z'X$.
$V(Y) > 0$ if $Y$ is not a constant, i.e. if $Z'X$ is not a constant.
$V(Y) = 0$ if $Y$ is a constant, i.e. if $Z'X$ is a constant, i.e. the $X_i$ are linearly dependent.
$$V(Y) = Z'VZ \;\begin{cases} > 0 \;\;\forall\, Z \ne 0 & \text{if the } X_i \text{ are linearly independent} \\ = 0 \text{ for some } Z \ne 0 & \text{if the } X_i \text{ are linearly dependent} \end{cases}$$
Conclusion: the variance-covariance matrix is always p.d., except in cases where the $X_i$'s are linearly dependent, in which case it is a p.s.d. matrix.
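A Monte Carlo sketch of $V(AX) = AVA'$ (the distribution and the matrices are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
V = np.array([[2.0, 0.5], [0.5, 1.0]])     # a p.d. covariance matrix
A = np.array([[1.0, 2.0], [0.0, 1.0]])

X = rng.multivariate_normal(mean=[0, 0], cov=V, size=200_000)
Y = X @ A.T                                 # each row is AX for one draw

print(np.cov(Y, rowvar=False).round(2))     # approx AVA'
print(A @ V @ A.T)                          # exact: [[8. 2.5], [2.5 1.]]
```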
Corollary: let $E(X) = \mu$ and $V(X) = V$ with $V$ positive definite. Then it is possible to get a standard vector through a linear transformation, i.e. one with $E(Y) = 0$ and $V(Y) = I$.
Define $Y = Q[X - \mu]$. Then
$$E(Y) = QE(X - \mu) = Q[E(X) - \mu] = 0$$
$$V(Y) = QVQ' = I \quad \text{for some } Q$$

(e) Expectation of a quadratic form
Let $E(X) = 0$, $V(X) = V$, and $q = X'AX$ where $A$ is a symmetric matrix of constants. Then
$$E(q) = E(X'AX) = \operatorname{tr}(AV)$$
(Recall: the trace of a square matrix is the sum of its diagonal elements.)
Proof: $X'AX$ is a scalar, and so equal to its trace:
$$X'AX = \operatorname{tr}(X'AX) \quad\Rightarrow\quad E(X'AX) = E(\operatorname{tr} X'AX)$$
The trace is invariant under cyclic permutations:
$$E(X'AX) = E(\operatorname{tr} AXX')$$
The trace is a linear operator:
$$E(X'AX) = \operatorname{tr} E(AXX') = \operatorname{tr} AE(XX') = \operatorname{tr} AV$$
Example: $V = \sigma^2 I_n$ and $A$ idempotent of rank $K$. Then
$$E(X'AX) = \operatorname{tr}(A\sigma^2 I) = \sigma^2 \operatorname{tr}(A) = \sigma^2 K$$
[$\operatorname{rank}(A) = \operatorname{tr}(A)$ since $A$ is idempotent.]
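A simulation sketch of $E(X'AX) = \operatorname{tr}(AV)$ (the values are chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(4)
V = np.array([[2.0, 0.5], [0.5, 1.0]])
A = np.array([[1.0, 2.0], [2.0, 3.0]])       # symmetric

X = rng.multivariate_normal(mean=[0, 0], cov=V, size=500_000)
q = np.einsum('ni,ij,nj->n', X, A, X)        # X'AX for every draw

print(q.mean().round(2))                     # approx tr(AV)
print(np.trace(A @ V))                       # exact: 7.0
```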

(f) Multivariate Normal Distribution and Related Distributions

i. Introduction
Let $X_i$, $i = 1, 2, \ldots, n$, be $n$ independent normal random variables with $E(X_i) = \mu_i$ and $V(X_i) = \sigma_i^2$, i.e. $X_i \sim N(\mu_i, \sigma_i^2)$.
A. Density of $X_i$:
$$f(X_i) = \frac{1}{\sqrt{2\pi\sigma_i^2}}\, e^{-\frac{1}{2\sigma_i^2}(X_i - \mu_i)^2}$$
B. The joint density of $X_1, X_2, \ldots, X_n$, when the $X_i$'s are independent, is the product of the individual densities:
$$f(X_1, X_2, \ldots, X_n) = (2\pi)^{-\frac{n}{2}} \left(\prod_{i=1}^{n} \sigma_i^2\right)^{-\frac{1}{2}} e^{-\frac{1}{2}\sum_i \frac{(X_i - \mu_i)^2}{\sigma_i^2}}$$

Let's write the above in vector-matrix notation:
$$X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix}, \qquad
E(X) = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_n \end{bmatrix} = \mu, \qquad
V(X) = \begin{bmatrix} \sigma_1^2 & & \\ & \ddots & \\ & & \sigma_n^2 \end{bmatrix} = V$$
Note:
$$|V| = \sigma_1^2\sigma_2^2\cdots\sigma_n^2, \qquad
V^{-1} = \begin{bmatrix} 1/\sigma_1^2 & & \\ & \ddots & \\ & & 1/\sigma_n^2 \end{bmatrix}$$
$$\sum_i \frac{1}{\sigma_i^2}(X_i - \mu_i)^2 = \frac{1}{\sigma_1^2}(X_1 - \mu_1)^2 + \frac{1}{\sigma_2^2}(X_2 - \mu_2)^2 + \cdots + \frac{1}{\sigma_n^2}(X_n - \mu_n)^2$$
is a quadratic form, $(X - \mu)'V^{-1}(X - \mu)$. Therefore
$$f(X_1, \ldots, X_n) = (2\pi)^{-\frac{n}{2}}\, |V|^{-\frac{1}{2}}\, e^{-\frac{1}{2}(X - \mu)'V^{-1}(X - \mu)}$$

ii. Formal definition: the random vector $X$, with $E(X) = \mu$ and $V(X) = V$, is said to be normally distributed iff
$$f(X) = (2\pi)^{-\frac{n}{2}}\, |V|^{-\frac{1}{2}}\, e^{-\frac{1}{2}(X - \mu)'V^{-1}(X - \mu)}$$
We then write $X \sim N(\mu, V)$. When $X \sim N(0, I_n)$, we say $X$ is a standard normal vector.
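This density can be evaluated directly, e.g. against SciPy's implementation (assuming SciPy is available; $\mu$, $V$ and the evaluation point are illustrative):

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([1.0, 2.0])
V = np.diag([4.0, 9.0])            # independent case: diagonal V
x = np.array([0.0, 0.0])

# Density from the formula above, with n = 2
quad = (x - mu) @ np.linalg.inv(V) @ (x - mu)
f = (2 * np.pi) ** (-1) * np.linalg.det(V) ** (-0.5) * np.exp(-0.5 * quad)

print(np.isclose(f, multivariate_normal(mean=mu, cov=V).pdf(x)))   # True
```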
iii. Properties of the normal distribution
A. If $X \sim N(\mu, V)$ and $Y_{m\times 1} = A_{m\times n}X_{n\times 1} + b_{m\times 1}$, then
$$Y \sim N(A\mu + b,\; AVA')$$
since $E(Y) = A\mu + b$ and $V(Y) = AVA'$, which is p.d. when $\rho(A) = m$.
B. The orthogonal transformation of a standard normal vector is also a standard normal vector. Let $Z_{n\times 1} = C_{n\times n}X_{n\times 1}$ be an orthogonal transformation, i.e. $C^{-1} = C'$, so $CC' = C'C = I$. Then
$$E(Z) = CE(X) = 0$$
$$V(Z) = CVC' = CIC' = CC' = I$$
Corollary: if $X \sim N(\mu, V)$, then by a suitable transformation we can get a standard normal vector (S.N.V.). Since $V$ is p.d., there exists $Q$ such that $QVQ' = I$. Define $Y = Q(X - \mu)$; then $E(Y) = 0$ and $V(Y) = QVQ' = I$, so
$$Y \sim N(0, I)$$

C. For normal variables, zero covariance $\Leftrightarrow$ independence.
Let $X \sim N(\mu, V)$ and partition
$$X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}\begin{matrix} {\scriptstyle S\times 1} \\ {\scriptstyle (n-S)\times 1} \end{matrix}, \qquad
E(X) = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \qquad
V(X) = \begin{bmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{bmatrix}$$
Zero covariance between $X_1$ and $X_2$ means $V_{12} = V_{21}' = 0$. Then $|V| = |V_{11}||V_{22}|$ and the quadratic form splits, so
$$\begin{aligned}
f(X) &= (2\pi)^{-\frac{n}{2}} (|V_{11}||V_{22}|)^{-\frac{1}{2}}\, e^{-\frac{1}{2}\left[(X_1 - \mu_1)'V_{11}^{-1}(X_1 - \mu_1) + (X_2 - \mu_2)'V_{22}^{-1}(X_2 - \mu_2)\right]} \\
&= \underbrace{(2\pi)^{-\frac{S}{2}} |V_{11}|^{-\frac{1}{2}}\, e^{-\frac{1}{2}(X_1 - \mu_1)'V_{11}^{-1}(X_1 - \mu_1)}}_{f(X_1)}
\cdot \underbrace{(2\pi)^{-\frac{n-S}{2}} |V_{22}|^{-\frac{1}{2}}\, e^{-\frac{1}{2}(X_2 - \mu_2)'V_{22}^{-1}(X_2 - \mu_2)}}_{f(X_2)}
\end{aligned}$$
The joint density factors, so $X_1$ and $X_2$ are independent.

iv. The chi-squared distribution
Characterization:
A. If $X \sim N(0, I_n)$, then
$$X_1^2 + X_2^2 + \cdots + X_n^2 = X'X \sim \chi^2(n)$$
B. If $Y \sim N(\mu, V)$, then
$$(Y - \mu)'V^{-1}(Y - \mu) \sim \chi^2(n)$$
Proof: since $V$ is p.d., there exists $Q$ such that $QVQ' = I$. Let $X = Q(Y - \mu)$. Then $X$ is normal with $E(X) = 0$ and $V(X) = QVQ' = I$, therefore $X \sim N(0, I)$ and
$$X'X = (Y - \mu)'Q'Q(Y - \mu) \sim \chi^2(n)$$
Moreover, from $QVQ' = I$:
$$Q^{-1}QVQ' = Q^{-1} \;\Rightarrow\; VQ' = Q^{-1} \;\Rightarrow\; VQ'(Q')^{-1} = Q^{-1}(Q')^{-1} \;\Rightarrow\; V = (Q'Q)^{-1} \;\Rightarrow\; V^{-1} = Q'Q$$
Therefore
$$(Y - \mu)'V^{-1}(Y - \mu) \sim \chi^2(n)$$
C. Let $Z \sim N(0, I_n)$ and partition
$$Z = \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix}\begin{matrix} {\scriptstyle S\times 1} \\ {\scriptstyle (n-S)\times 1} \end{matrix}$$
Then $Z_1 \sim N(0, I_S)$ and $Z_2 \sim N(0, I_{n-S})$ are independent, and
$$Z_1'Z_1 \sim \chi^2(S), \qquad Z_2'Z_2 \sim \chi^2(n-S), \qquad \text{independent.}$$
Now observe that $Z_1 = AZ$ with $A = [I_S \;\; 0]$ of order $S \times n$, so
$$Z_1'Z_1 = Z'A'AZ = Z'\begin{bmatrix} I_S \\ 0 \end{bmatrix}[I_S \;\; 0]\,Z = Z'\begin{bmatrix} I_S & 0 \\ 0 & 0 \end{bmatrix}Z$$
Therefore $Z_1'Z_1 = Z'MZ$ with $M$ idempotent and $\rho(M) = S$, and
$$Z'MZ \sim \chi^2(S)$$
Similarly, $Z_2 = [0 \;\; I_{n-S}]\begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix}$, so
$$Z_2'Z_2 = Z'\begin{bmatrix} 0 & 0 \\ 0 & I_{n-S} \end{bmatrix}Z = Z'\bar{M}Z, \qquad \bar{M} \text{ idempotent}, \;\rho(\bar{M}) = n - S$$
Therefore $Z'\bar{M}Z \sim \chi^2(n-S)$. Also
$$\bar{M} = I - M, \qquad M\bar{M} = M(I - M) = M - M^2 = M - M = 0$$

Theorem: if $Z \sim N(0, I_n)$ and $M$ is an idempotent matrix of rank $S$, then
$$Z'MZ \sim \chi^2(S), \qquad Z'(I - M)Z \sim \chi^2(n-S),$$
and the two $\chi^2$ variables are independent since $M(I - M) = 0$.
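A simulation sketch of this theorem ($n$, $S$, and the particular idempotent $M$ are arbitrary choices; any idempotent $M$ of rank $S$ would do):

```python
import numpy as np

n, S = 6, 2
rng = np.random.default_rng(5)

M = np.zeros((n, n))
M[:S, :S] = np.eye(S)            # idempotent, rank S (the [I_S 0; 0 0] case)

Z = rng.normal(size=(100_000, n))
q1 = np.einsum('ni,ij,nj->n', Z, M, Z)              # Z'MZ
q2 = np.einsum('ni,ij,nj->n', Z, np.eye(n) - M, Z)  # Z'(I-M)Z

# Compare simulated means with the chi-squared means (= degrees of freedom)
print(q1.mean().round(2), q2.mean().round(2))       # approx 2.0 and 4.0
print(np.corrcoef(q1, q2)[0, 1].round(3))           # approx 0: independent
```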
