
Review of Linear Algebra and Matrix Computations I

Week 6
Objective
Lesson Topics:
• Matrix addition, scalar multiplication
• Matrix multiplication
• Matrix inverses
• Elementary matrices
• Determinants and matrix inverses
• Diagonalization and Eigenvalues
Lesson Outcomes:
• Matrix addition, scalar multiplication, and transposition
• Matrix multiplication
• Matrix inverses
• Determinants and matrix inverses
• Diagonalization and eigenvalues
Linear algebra
• Linear algebra is the branch of mathematics concerning linear
equations and their representations through matrices and vector spaces.
• Linear algebra is central to almost all areas of mathematics:
– Operations on or between vectors and matrices
– Coordinate transformations
– Dimensionality reduction
– Linear regression
– Solution of linear systems of equations
– …
Linear algebra and machine learning

𝑿𝟏 𝑾𝟏 + 𝑿𝟐 𝑾𝟐 + 𝒃𝟏 = 𝒚
Linear algebra and machine learning
• Machine learning optimizes the network (a system of equations) and
• finds the sensitivity of the system to each feature (the change in
output for a change in each feature of the input).
𝐴1,1 𝑥1 + ⋯ + 𝐴1,𝑛 𝑥𝑛 = 𝑏1
𝐴2,1 𝑥1 + ⋯ + 𝐴2,𝑛 𝑥𝑛 = 𝑏2

𝐴𝑛,1 𝑥1 + ⋯ + 𝐴𝑛,𝑛 𝑥𝑛 = 𝑏𝑛

𝐴1,1 ⋯ 𝐴1,𝑛   𝑥1     𝑏1
 ⋮   ⋱   ⋮     ⋮  =   ⋮
𝐴𝑛,1 ⋯ 𝐴𝑛,𝑛   𝑥𝑛     𝑏𝑛
Vectors
• An array of numbers arranged in order:
𝑌1
𝑌𝑛×1 = ⋮
𝑌𝑛
where 𝑌𝑖 is the 𝑖-th element and 𝑌 ∈ ℝ𝑛

A vector expresses a magnitude and a direction in n-dimensional space.


Matrix

A matrix is a 2-D array of numbers, written 𝑌𝑛×𝑚 with elements:

𝑌1,1 ⋯ 𝑌1,𝑚
𝑌= ⋮ ⋱ ⋮
𝑌𝑛,1 ⋯ 𝑌𝑛,𝑚
For 𝑌𝑖,𝑗, 𝑖 is the row index, 𝑗 is the column index, and 𝑌 ∈ ℝ𝑛×𝑚
Tensor
• A tensor is an n-D array
where n > 2.

For example, for a 3-D tensor

we have 𝑌𝑖,𝑗,𝑘 where
𝑖 is the row,
𝑗 is the column,
𝑘 indexes the third dimension (depth), and
𝑌 ∈ ℝ𝑛×𝑚×𝑘
Vector operations
• Let u, v and w be vectors and c and d be scalars
Vector operations (cont.)
• Scalar (dot) product

• Vector length (norm)

• If a = ‹a1, a2, a3› and b = ‹b1, b2, b3›, then the cross product of a and b
is the vector
a × b = ‹a2b3 − a3b2, a3b1 − a1b3, a1b2 − a2b1›
and its magnitude is |a × b| = |a||b| sin θ
Inner product/Orthogonality
The inner product gives us information about the length of vectors and the
angle (and hence distance) between two vectors.

𝑢 · 𝑣 = 𝑢𝑇 𝑣 = [𝑢1 … 𝑢𝑛] [𝑣1; ⋮; 𝑣𝑛] = 𝑢1 𝑣1 + 𝑢2 𝑣2 + … + 𝑢𝑛 𝑣𝑛

(Figure: the sign of 𝑢 · 𝑣 indicates the angle 𝜃 between 𝑢 and 𝑣:
𝑢 · 𝑣 > 0 when 𝜃 < 90°, 𝑢 · 𝑣 = 0 when 𝑢 ⊥ 𝑣, and 𝑢 · 𝑣 < 0 when 𝜃 > 90°.)
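The sign test above can be sketched in Python (a minimal illustration, assuming NumPy is available; the vectors are our own example):

```python
import numpy as np

u = np.array([1.0, 2.0])
v = np.array([2.0, -1.0])   # chosen so that u and v are orthogonal

dot = float(u @ v)          # u1*v1 + u2*v2
# The sign of u.v classifies the angle theta between the vectors:
# dot > 0 -> theta < 90 deg; dot == 0 -> orthogonal; dot < 0 -> theta > 90 deg
if dot > 0:
    relation = "acute"
elif dot == 0:
    relation = "orthogonal"
else:
    relation = "obtuse"
```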
Norm
• The Euclidean norm assigns to each vector the length of its arrow.
Because of this, the Euclidean norm is often known as the magnitude.
For an n-dimensional space ℝ𝑛 and a vector 𝐴 with components 𝐴1, 𝐴2, …, 𝐴𝑛 we
have:

‖𝐴‖2 := √(𝐴12 + 𝐴22 + ⋯ + 𝐴𝑛2)


‖𝐴‖2 gives the distance from the origin to the point 𝐴
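As a quick check of the formula, a sketch using only the Python standard library (the vector is our own example):

```python
import math

A = [3.0, 4.0]
# Euclidean norm: the square root of the sum of squared components,
# i.e. the distance from the origin to the point A
norm = math.sqrt(sum(a * a for a in A))   # sqrt(9 + 16)
```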
Matrix/Vector Transpose
• Mirror image across the principal diagonal.
• For a 4 × 3 matrix we have:

𝑌 =
𝑌1,1 𝑌1,2 𝑌1,3
𝑌2,1 𝑌2,2 𝑌2,3
𝑌3,1 𝑌3,2 𝑌3,3
𝑌4,1 𝑌4,2 𝑌4,3

⇒ 𝑌 𝑇 =
𝑌1,1 𝑌2,1 𝑌3,1 𝑌4,1
𝑌1,2 𝑌2,2 𝑌3,2 𝑌4,2
𝑌1,3 𝑌2,3 𝑌3,3 𝑌4,3
• For a 𝑛 × 1 array we have:
𝑌𝑛×1 = [𝑌1; ⋮; 𝑌𝑛] ⇒ 𝑌 𝑇 = [𝑌1 ⋯ 𝑌𝑛]
• For the scalar Y we have:
𝑌 = 𝑌𝑇

Question: What is the transpose of [1 6; 8 2], of [3 4], and of the scalar 6?
Symmetric matrix
A matrix is symmetric if 𝐴 = 𝐴𝑇 .
Matrix/Vector operation
• For two matrices with the same shape 𝑖 × 𝑗 we have:
𝑌 = 𝐴 + 𝐵 ⇒ 𝑌𝑖,𝑗 = 𝐴𝑖,𝑗 + 𝐵𝑖,𝑗
• Scalar multiplied to a matrix:
𝑌 = 𝑎𝐴 ⇒ 𝑌𝑖,𝑗 = 𝑎𝐴𝑖,𝑗
• Vector added to a matrix
𝑌 = 𝐴 + 𝑏 ⇒ 𝑌𝑖,𝑗 = 𝐴𝑖,𝑗 + 𝑏𝑗
• Here the vector b is added to every row of the matrix (broadcasting)

• Two matrices are equal when they have the same dimension (m x n)
and all of their corresponding entries are equal.
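The addition, scalar-multiplication, and row-broadcast rules above can be sketched with NumPy (an illustration with our own example matrices):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
B = np.array([[10, 10, 10],
              [20, 20, 20]])
b = np.array([100, 200, 300])   # one entry per column of A

Y_add = A + B          # elementwise: Y[i, j] = A[i, j] + B[i, j]
Y_scaled = 2 * A       # scalar multiple: Y[i, j] = 2 * A[i, j]
Y_bcast = A + b        # b is added to every row: Y[i, j] = A[i, j] + b[j]
```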
Basic matrix operations
Let A, B and C be m×n matrices and c and d be scalars
• 𝐴+𝐵 =𝐵+𝐴

• A+(B+C)=(A+B)+C

• (𝑐𝑑)𝐴 = 𝑐(𝑑𝐴)

• 1A=A

• A+0=A

• c(A+B)=cA+cB

• (c+d)A=cA+dA
Image processing input layers
• Input layers are where we load the raw input data of the image.
• This input data specifies the width, height, and number of channels.
• Typically, the number of channels is three, for the RGB values for each
pixel.
Convolution
• Type of linear operation where a small array of numbers,
called a kernel, is applied across the input (tensor).
• Two key hyperparameters:
– kernel size (e.g. 3×3, 5×5, or 7×7) and the number of kernels.
– The number of kernels determines the depth of the output
feature maps.

(Figure: a 3×3 convolutional filter applied across an input matrix.
Source: http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution)
Convolution (cont.)
• The output from each convolution layer is a set of objects called
feature maps, each generated by a single kernel filter.
• The feature maps can be used to
define a new input to the next
layer.

https://medium.com/@RaghavPrabhu/understanding-of-convolutional-neural-network-cnn-deep-learning-99760835f148
Solving a Matrix Equation
• Solve for X in the equation

3X + A = B
where
𝐴 = [3 −8; 4 5]    𝐵 = [−5 9; 7 3]
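One way to check the exercise (a NumPy sketch; the slide leaves the algebra to the reader): rearranging gives X = (B − A)/3.

```python
import numpy as np

A = np.array([[3.0, -8.0],
              [4.0,  5.0]])
B = np.array([[-5.0, 9.0],
              [ 7.0, 3.0]])

# 3X + A = B  =>  3X = B - A  =>  X = (B - A) / 3, applied entry-wise
X = (B - A) / 3.0
```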
More definitions
• Only matrices of the same size can be added.
• A matrix containing only zero elements is called a zero matrix.
• Given matrix A, the matrix −A has as elements the additive inverses of
the elements of A.
• Multiplication operations
– Scalar x matrix
– Matrix x matrix

Arthur Cayley (1821-1895)


Example
• Given 𝐴 = [1 2 3; 4 3 2] and 𝐵 = [2 3; 4 6; 7 0], calculate
– A×B
– B×A

• Conclusion: for two matrices A and B, in general


𝐴𝐵 ≠ 𝐵𝐴
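The example can be verified with NumPy; note that here the two products do not even have the same shape:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 3, 2]])          # 2x3
B = np.array([[2, 3],
              [4, 6],
              [7, 0]])             # 3x2

AB = A @ B                         # 2x2 result
BA = B @ A                         # 3x3 result
# AB and BA differ in shape (and in general in value), so AB != BA
```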
Example
1. 𝑌 = 𝑎𝐴 + 𝑏 => 𝑌𝑖,𝑗 = 𝑎𝐴𝑖,𝑗 + 𝑏𝑗 where b is a vector
2. 𝑌 = 𝑎𝐴 + 𝑏 => 𝑌𝑖,𝑗 = 𝑎𝐴𝑖,𝑗 + 𝑏 where b is a scalar

1. 𝑌 =
𝑎𝐴1,1 + 𝑏1  𝑎𝐴1,2 + 𝑏2  𝑎𝐴1,3 + 𝑏3
𝑎𝐴2,1 + 𝑏1  𝑎𝐴2,2 + 𝑏2  𝑎𝐴2,3 + 𝑏3
𝑎𝐴3,1 + 𝑏1  𝑎𝐴3,2 + 𝑏2  𝑎𝐴3,3 + 𝑏3

2. 𝑌 =
𝑎𝐴1,1 + 𝑏  𝑎𝐴1,2 + 𝑏  𝑎𝐴1,3 + 𝑏
𝑎𝐴2,1 + 𝑏  𝑎𝐴2,2 + 𝑏  𝑎𝐴2,3 + 𝑏
𝑎𝐴3,1 + 𝑏  𝑎𝐴3,2 + 𝑏  𝑎𝐴3,3 + 𝑏
Example
• Calculate the product of vector Y with shape 3× 1 and 𝑌 𝑇 :

𝑌 = [𝑌1,1; 𝑌2,1; 𝑌3,1]

• 𝑌𝑌 𝑇 = [𝑌1,1; 𝑌2,1; 𝑌3,1] [𝑌1,1 𝑌2,1 𝑌3,1] =
𝑌1,1 𝑌1,1  𝑌1,1 𝑌2,1  𝑌1,1 𝑌3,1
𝑌2,1 𝑌1,1  𝑌2,1 𝑌2,1  𝑌2,1 𝑌3,1
𝑌3,1 𝑌1,1  𝑌3,1 𝑌2,1  𝑌3,1 𝑌3,1

• 𝑌 𝑇 𝑌 = [𝑌1,1 𝑌2,1 𝑌3,1] [𝑌1,1; 𝑌2,1; 𝑌3,1] = 𝑌1,1 𝑌1,1 + 𝑌2,1 𝑌2,1 + 𝑌3,1 𝑌3,1
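The outer product YYᵀ (a 3×3 matrix) versus the inner product YᵀY (a scalar) can be checked with NumPy (our own numeric example):

```python
import numpy as np

Y = np.array([[1.0],
              [2.0],
              [3.0]])              # 3x1 column vector

outer = Y @ Y.T                    # 3x3: entry (i, j) equals Y[i] * Y[j]
inner = (Y.T @ Y).item()           # 1x1 matrix; .item() extracts the scalar
```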
Matrix operation Properties
• Distributivity
𝐴(𝐵 + 𝐶) = 𝐴𝐵 + 𝐴𝐶

• Associativity
𝐴(𝐵𝐶) = (𝐴𝐵)𝐶

• Transpose of matrix product


(𝐴𝐵)𝑇 = 𝐵𝑇 𝐴𝑇
Diagonal Matrix
• A diagonal matrix is a square matrix having nonzero elements only on the
principal diagonal, running from the upper left to the lower right.

𝑌1,1 ⋯ 0
𝑌= ⋮ ⋱ ⋮ where 𝑌 ∈ ℛ 𝑛×𝑛
0 ⋯ 𝑌𝑛,𝑛

The identity matrix is a square diagonal matrix in which all the elements of the
principal diagonal are ones

1 ⋯ 0
𝑌= ⋮ ⋱ ⋮
0 ⋯ 1
Property of Identity Matrix
𝐼𝐴 = 𝐴
Matrix Inverse
• Assume we have a matrix 𝐴; then 𝐴−1, called the inverse matrix, is the
matrix whose product with 𝐴 is the identity: 𝐴𝐴−1 = 𝐼.
• Assume 𝐴 = [2 3; 5 8]; then we find the matrix 𝐴−1 = [𝑎 𝑐; 𝑏 𝑑] where 𝐴𝐴−1 = 𝐼:

[2 3; 5 8] [𝑎 𝑐; 𝑏 𝑑] = [1 0; 0 1] ⇒
2𝑎 + 3𝑏 = 1 → 𝑎 = 8
5𝑎 + 8𝑏 = 0 → 𝑏 = −5
2𝑐 + 3𝑑 = 0 → 𝑐 = −3
5𝑐 + 8𝑑 = 1 → 𝑑 = 2

𝐴−1 = [8 −3; −5 2]
Finding the Inverse of a 2x2 matrix
• Given [𝑎 𝑏; 𝑐 𝑑]

• First find the determinant: 𝑎𝑑 − 𝑏𝑐

• Then swap the elements on the leading diagonal: [𝑑 𝑏; 𝑐 𝑎]

• Then negate the other elements: [𝑑 −𝑏; −𝑐 𝑎]

• Then multiply the matrix by 1/determinant:

1/(𝑎𝑑 − 𝑏𝑐) [𝑑 −𝑏; −𝑐 𝑎]
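The swap/negate/scale recipe translates directly into code (a sketch; the helper name is ours):

```python
def inverse_2x2(a, b, c, d):
    """Invert [[a, b], [c, d]] via the determinant formula above."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("determinant is zero: the matrix is singular")
    # swap the leading diagonal, negate the other elements, divide by det
    return [[ d / det, -b / det],
            [-c / det,  a / det]]

inv = inverse_2x2(2, 3, 5, 8)      # det = 2*8 - 3*5 = 1
```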
Example

 2 4
A   
 4  10 

1 1  10  4 
A  (2)(10)  ( 4)(4)  4 2 

5 1 
1 1  10  4 
A   4  4 2  =
2 
 1
 1  
 2
Singular Matrix
• If the inverse of a matrix does not exist, we call it a singular matrix.
• Assume 𝐴 = [1 2; 1 2]; then we try to find a matrix 𝐴−1 = [𝑎 𝑐; 𝑏 𝑑] where 𝐴𝐴−1 = 𝐼:

[1 2; 1 2] [𝑎 𝑐; 𝑏 𝑑] = [1 0; 0 1] ⇒
𝑎 + 2𝑏 = 1
𝑐 + 2𝑑 = 0
𝑎 + 2𝑏 = 0
𝑐 + 2𝑑 = 1    → No solution

A is a singular matrix
Determinant
• The concept “determinant” arose from early attempts to generalize the process of solving
systems of linear equations.

• It is a value computed from the elements of a square matrix that
characterizes the matrix.
𝑎 𝑏
= 𝑎𝑑 − 𝑏𝑐
𝑐 𝑑

det(𝐴) = Σ𝑗 (−1)^(𝑖+𝑗) 𝑎𝑖,𝑗 det(𝐴^(𝑖,𝑗))    (expansion along row 𝑖)

= Σ𝑖 (−1)^(𝑖+𝑗) 𝑎𝑖,𝑗 det(𝐴^(𝑖,𝑗))    (expansion along column 𝑗)

det([𝑎]) = 𝑎

where 𝐴^(𝑖,𝑗) denotes 𝐴 with row 𝑖 and column 𝑗 removed.

Note: the determinant of a singular matrix is zero.
Higher order matrix inversion
𝐴−1 = (1/|𝐴|) adj(𝐴)
• Where adj is Adjugate (also called Adjoint)
• The first step is to create a "Matrix of Minors". This step has the most
calculations.
• For each element of the matrix:
• ignore the values on the current row and column and calculate the
determinant of the remaining values
• Put those determinants into a matrix (the "Matrix of Minors")
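The minors/adjugate route can be sketched in NumPy (the helper names are ours; np.linalg.det computes the minor determinants, and the example matrix is our own):

```python
import numpy as np

def minor_det(A, i, j):
    """Determinant of A with row i and column j removed."""
    sub = np.delete(np.delete(A, i, axis=0), j, axis=1)
    return np.linalg.det(sub)

def inverse_via_adjugate(A):
    n = A.shape[0]
    # matrix of cofactors: the signed minors (-1)^(i+j) * M[i, j]
    cof = np.array([[(-1) ** (i + j) * minor_det(A, i, j)
                     for j in range(n)]
                    for i in range(n)])
    adj = cof.T                    # adjugate = transpose of the cofactor matrix
    return adj / np.linalg.det(A)

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 4.0],
              [5.0, 6.0, 0.0]])
A_inv = inverse_via_adjugate(A)
```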
Example

(The slide shows a worked example: the matrix of minors, the adjugate, and the determinant of A.)
Linear function
• Assume we have the following linear system, where A ∈ ℝ𝑛×𝑛 and b ∈ ℝ𝑛.
Then we have:
𝐴1,1 𝑥1 + ⋯ + 𝐴1,𝑛 𝑥𝑛 = 𝑏1
𝐴2,1 𝑥1 + ⋯ + 𝐴2,𝑛 𝑥𝑛 = 𝑏2
⋮
𝐴𝑛,1 𝑥1 + ⋯ + 𝐴𝑛,𝑛 𝑥𝑛 = 𝑏𝑛

𝐴𝑥 = 𝑏 where
𝐴 = [𝐴1,1 ⋯ 𝐴1,𝑛; ⋮ ⋱ ⋮; 𝐴𝑛,1 ⋯ 𝐴𝑛,𝑛]  (𝑛 × 𝑛)
𝑥 = [𝑥1; ⋮; 𝑥𝑛]  (𝑛 × 1)
𝑏 = [𝑏1; ⋮; 𝑏𝑛]  (𝑛 × 1)
Linear function
We solve the linear equation using the inverse matrix.
𝐴𝑥 = 𝑏
𝐴−1 𝐴𝑥 = 𝐴−1 𝑏
𝐼𝑥 = 𝐴−1 𝑏
𝑥 = 𝐴−1 𝑏
If 𝐴−1 exists, we say the matrix inverse gives a closed-form solution.

The conditions for a closed-form solution:


1. The matrix must be square.
2. The columns must be linearly independent, which means no column can be
generated as a linear combination of the other columns.
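Under those two conditions, the closed-form solution can be computed directly (a NumPy sketch; the matrix and b are our own example):

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [5.0, 8.0]])         # square, with linearly independent columns
b = np.array([5.0, 13.0])

x = np.linalg.inv(A) @ b           # closed form: x = A^{-1} b
# In practice np.linalg.solve(A, b) is preferred: it factorizes A
# instead of explicitly forming the inverse
x_solve = np.linalg.solve(A, b)
```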
Properties of inverse matrix
What about bigger matrices?
• Large matrices can be costly to use in terms of computational time,
and may have to be iterated hundreds or thousands of times in a
calculation.

• Solution:
Orthogonal matrices are useful tools: their inverse is computationally
cheap and numerically stable to calculate, since it is simply their transpose.

If A is an orthogonal matrix, then 𝐴−1 = 𝐴𝑇 .


Orthogonal matrix
• a collection of real m-vectors a1, a2, . . . , an is orthonormal if
– the vectors have unit norm: ||ai|| = 1
– they are mutually orthogonal: 𝑎𝑖𝑇 𝑎𝑗 = 0 𝑖𝑓 𝑖 ≠ 𝑗

• A square real matrix with orthonormal columns is called orthogonal


Eigenvalues and Eigenvectors
• If A is an nxn matrix, do there exist nonzero vectors x in Rn such that Ax
is a scalar multiple of x?

A: an n×n matrix
λ: a scalar (could be zero)
x: a nonzero vector in Rn

𝐴𝑥 = 𝜆𝑥

where λ is an eigenvalue and x is the corresponding eigenvector.
Verifying eigenvalues and eigenvectors
2 0  1  0 
A  x1    x2   
0  1 0 1 
Eigenvalue
 2 0  1   2  1 
Ax1          2    2x1
0 1 0 0  0
Eigenvector
 2 0  0   0  0
Ax2          1    (1)x2
0 1 1   1 1 
Finding eigenvalues and eigenvectors
• Let A be an n×n matrix.

𝐴𝑥 = 𝜆𝑥 ⇒ 𝐴𝑥 = 𝜆𝐼𝑥 ⇒ (𝜆𝐼 − 𝐴)𝑥 = 0

The eigenvectors of A corresponding to λ are the nonzero solutions of

(𝜆𝐼 − 𝐴)𝑥 = 0

A matrix is singular if and only if one of its eigenvalues is zero.


Example

2  12
A 
 1  5 
 2 12
det( I  A) 
1  5
  2  3  2  (  1)(  2)  0

1  1, 2  2
Eigenvalue/eigenvector definition
• Find the eigenvalues and eigenvectors of matrix A:
𝐴 = [0.8 0.3; 0.2 0.7]
Eigenvalues
Let 𝜆 and 𝑣 be an eigenvalue and its eigenvector; then we have:
(𝐴 − 𝜆𝐼)𝑣 = 0 where 𝐼 = [1 0; 0 1]
→ det(𝐴 − 𝜆𝐼) = 0 → |0.8 − 𝜆   0.3; 0.2   0.7 − 𝜆| = 0 → (0.8 − 𝜆)(0.7 − 𝜆) − 0.3 ∗ 0.2 = 0
This equation has two eigenvalues → 𝜆 = 1, 𝜆 = 1/2

𝐷 = [1 0; 0 1/2]
Eigenvalue/eigenvector definition
Eigenvector for 𝝀 = 𝟏 :
(𝐴 − 1 ∗ 𝐼)𝑣 = 0 → [0.8 − 1   0.3; 0.2   0.7 − 1] [𝑣1; 𝑣2] = 0
−0.2𝑣1 + 0.3𝑣2 = 0
0.2𝑣1 − 0.3𝑣2 = 0    → 𝑣1 = 1.5𝑣2
Now assume 𝑣2 = 2; then 𝑣1 = 3, and the eigenvector is [3; 2].
Note that an eigenvector can be scaled by any nonzero number.
Eigenvector for 𝝀 = 𝟏/𝟐 :
[0.8 − 0.5   0.3; 0.2   0.7 − 0.5] [𝑣1; 𝑣2] = 0
0.3𝑣1 + 0.3𝑣2 = 0
0.2𝑣1 + 0.2𝑣2 = 0    → 𝑣1 = −𝑣2
Assume 𝑣1 = 1; then 𝑣2 = −1, and the eigenvector is [1; −1].

𝑈 = [3 1; 2 −1] where U is a matrix whose columns are the eigenvectors.
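The hand computation can be cross-checked with np.linalg.eig (eigenvector signs and scaling may differ between runs, so we compare component ratios, which are scale-invariant):

```python
import numpy as np

A = np.array([[0.8, 0.3],
              [0.2, 0.7]])

vals, vecs = np.linalg.eig(A)      # columns of vecs are unit-norm eigenvectors
order = np.argsort(vals)[::-1]     # sort eigenvalues descending: 1.0, 0.5
vals = vals[order]
vecs = vecs[:, order]

ratio_1 = vecs[0, 0] / vecs[1, 0]  # eigenvector for lambda = 1:   v1/v2 = 1.5
ratio_2 = vecs[0, 1] / vecs[1, 1]  # eigenvector for lambda = 1/2: v1/v2 = -1
```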
Decomposition of Symmetric Matrix
• Every real symmetric matrix A can be decomposed into real-valued
eigenvectors and eigenvalues
𝐴 = 𝑄Λ𝑄𝑇
• where Q is an orthogonal matrix composed of eigenvectors of A
• Applications:
– generalizing matrix inversion
– Data Compression
– Solving Linear Equations
– Recommendation systems
• Collaborative filtering (CF)
Matrix Decomposition

• Step 1: Find the eigenvalues of matrix A:


(𝜆1 , 𝜆2 , … , 𝜆𝑛 )
• Step 2: Find the eigenvectors of matrix A:
U=(𝑢1 , 𝑢2 ,…, 𝑢𝑛 )
• Step 3: Convert the eigenvectors to orthonormal vectors:
Q=(𝑞1 , 𝑞2 ,…, 𝑞𝑛 ).

𝐴 = 𝑄 diag(𝜆1 , … , 𝜆𝑛 ) 𝑄𝑇        𝐴−1 = 𝑄 diag(1/𝜆1 , … , 1/𝜆𝑛 ) 𝑄𝑇
Orthogonal to orthonormal
We convert the eigenvectors of the last example to orthonormal vectors.
• 𝑈 = [3 1; 2 −1], so 𝑣1 = [3; 2] and 𝑣2 = [1; −1]

‖𝑣1 ‖ = √(32 + 22 ) = √13 ≈ 3.61
‖𝑣2 ‖ = √(12 + (−1)2 ) = √2 ≈ 1.41
• The matrix of normalized eigenvectors is:
Q = [3/3.61   1/1.41; 2/3.61   −1/1.41]
(Note: normalizing makes the columns unit vectors; the columns are mutually
orthogonal only when A is symmetric.)
Summary
We solve the linear equation using the inverse matrix. 𝐴𝑥 = 𝑏 → 𝑥 = 𝐴−1 𝑏
– Step 1: Find the eigenvalues of matrix A: (𝜆1 , 𝜆2 , … , 𝜆𝑛 )
– Step 2: Find the eigenvectors of matrix A: U=(𝑢1 , 𝑢2 ,…, 𝑢𝑛 )
– Step 3: Convert the eigenvectors to orthonormal vectors:
Q=(𝑞1 , 𝑞2 ,…, 𝑞𝑛 ).
– Step 4: Generate the diagonalized form of A:
A= 𝑄𝐷𝑄𝑇

– Step 5: 𝐴−1 = (𝑄𝐷𝑄𝑇 )−1 = 𝑄𝐷−1 𝑄𝑇   (using 𝑄−1 = 𝑄𝑇 )


– Step 6: x= 𝐴−1 𝑏 = 𝑄𝐷−1 𝑄𝑇 b
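For a symmetric matrix (where Q really is orthogonal), the summary's pipeline runs end to end; a sketch with our own example matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])         # symmetric => real eigenvalues, orthogonal Q
b = np.array([3.0, 3.0])

vals, Q = np.linalg.eigh(A)        # eigh is specialized for symmetric matrices
D_inv = np.diag(1.0 / vals)        # inverting a diagonal matrix is elementwise

# A = Q D Q^T  =>  A^{-1} = Q D^{-1} Q^T  =>  x = Q D^{-1} Q^T b
x = Q @ D_inv @ Q.T @ b
```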
© All rights reserved. All content within our courses, such as
this video, is protected by copyright and is owned by the
course author unless otherwise stated. Third party
copyrighted materials (for example, images and text) have
either been licensed for use in any given course, or have been
copied under an exception or limitation in Canadian Copyright
law. For further information, please contact the McMaster
University Centre for Continuing Education
ccecrsdv@mcmaster.ca.
