
Quick Tour of Basic Linear Algebra and Probability Theory

CS246: Mining Massive Data Sets
Winter 2011
Outline
1 Basic Linear Algebra
2 Basic Probability Theory
Basic Linear Algebra
Matrices and Vectors
Matrix: A rectangular array of numbers, e.g., $A \in \mathbb{R}^{m \times n}$:
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$
Vector: A matrix consisting of only one column (default) or one row, e.g., $x \in \mathbb{R}^n$:
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$
Matrix Multiplication
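If $A \in \mathbb{R}^{m \times n}$, $B \in \mathbb{R}^{n \times p}$, $C = AB$, then $C \in \mathbb{R}^{m \times p}$:
$$C_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}$$
Special cases: Matrix-vector product, inner product of two vectors, e.g., with $x, y \in \mathbb{R}^n$:
$$x^T y = \sum_{i=1}^{n} x_i y_i \in \mathbb{R}$$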
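The entrywise formula can be sketched directly in code. This is a minimal illustration, assuming matrices are stored as Python lists of rows; the helper names `matmul` and `inner` are hypothetical and not part of the slides.

```python
def matmul(A, B):
    """Return C = AB, where A is m x n and B is n x p (lists of rows)."""
    m, n, p = len(A), len(B), len(B[0])
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    # C[i][j] = sum over k of A[i][k] * B[k][j]
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]


def inner(x, y):
    """Inner product x^T y of two vectors of the same length."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(xi * yi for xi, yi in zip(x, y))


A = [[1, 2, 3],
     [4, 5, 6]]              # 2 x 3
B = [[1, 0],
     [0, 1],
     [2, 2]]                 # 3 x 2
C = matmul(A, B)             # 2 x 2
print(C)                     # [[7, 8], [16, 17]]
print(inner([1, 2, 3], [4, 5, 6]))   # 32
```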
Properties of Matrix Multiplication
Associative: $(AB)C = A(BC)$
Distributive: $A(B + C) = AB + AC$
Non-commutative: $AB \neq BA$ in general
Block multiplication: If $A = [A_{ik}]$, $B = [B_{kj}]$, where the $A_{ik}$'s and $B_{kj}$'s are matrix blocks, and the number of columns in $A_{ik}$ is equal to the number of rows in $B_{kj}$, then $C = AB = [C_{ij}]$ where $C_{ij} = \sum_k A_{ik} B_{kj}$
Example: If $\vec{x} \in \mathbb{R}^n$ and $A = [\vec{a}_1 | \vec{a}_2 | \ldots | \vec{a}_n] \in \mathbb{R}^{m \times n}$, $B = [\vec{b}_1 | \vec{b}_2 | \ldots | \vec{b}_p] \in \mathbb{R}^{n \times p}$:
$$A\vec{x} = \sum_{i=1}^{n} x_i \vec{a}_i$$
$$AB = [A\vec{b}_1 | A\vec{b}_2 | \ldots | A\vec{b}_p]$$
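A small numeric check of two of these facts, as a sketch with illustrative 2 x 2 matrices chosen here (not taken from the slides): the products $AB$ and $BA$ differ, and $A\vec{x}$ equals the combination of the columns of $A$ weighted by the entries of $\vec{x}$.

```python
def matmul(A, B):
    """C = AB for matrices stored as lists of rows."""
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]                 # permutation matrix

AB = matmul(A, B)            # right-multiplying by B swaps the columns of A
BA = matmul(B, A)            # left-multiplying by B swaps the rows of A
print(AB, BA)                # [[2, 1], [4, 3]] [[3, 4], [1, 2]]

# Column view: A x = x_1 * a_1 + x_2 * a_2, with a_i the columns of A.
x = [5, 6]
Ax = [sum(row[j] * x[j] for j in range(2)) for row in A]
cols = [[A[0][j], A[1][j]] for j in range(2)]
combo = [x[0] * cols[0][i] + x[1] * cols[1][i] for i in range(2)]
print(Ax, combo)             # [17, 39] [17, 39]
```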
Operators and properties
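Transpose: $A \in \mathbb{R}^{m \times n}$, then $A^T \in \mathbb{R}^{n \times m}$: $(A^T)_{ij} = A_{ji}$
Properties:
$(A^T)^T = A$
$(AB)^T = B^T A^T$
$(A + B)^T = A^T + B^T$
Trace: $A \in \mathbb{R}^{n \times n}$, then: $\mathrm{tr}(A) = \sum_{i=1}^{n} A_{ii}$
Properties:
$\mathrm{tr}(A) = \mathrm{tr}(A^T)$
$\mathrm{tr}(A + B) = \mathrm{tr}(A) + \mathrm{tr}(B)$
$\mathrm{tr}(\lambda A) = \lambda\, \mathrm{tr}(A)$
If $AB$ is a square matrix, $\mathrm{tr}(AB) = \mathrm{tr}(BA)$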
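The identities $(AB)^T = B^T A^T$ and $\mathrm{tr}(AB) = \mathrm{tr}(BA)$ can be checked on non-square factors. The sketch below uses hypothetical helpers (`matmul`, `transpose`, `trace`) and example matrices chosen for illustration; note that $AB$ is 2 x 2 while $BA$ is 3 x 3, yet the traces agree.

```python
def matmul(A, B):
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


A = [[1, 2, 3],
     [4, 5, 6]]              # 2 x 3
B = [[1, 0],
     [2, 1],
     [0, 3]]                 # 3 x 2

AB, BA = matmul(A, B), matmul(B, A)
print(trace(AB), trace(BA))                                   # 28 28
print(transpose(AB) == matmul(transpose(B), transpose(A)))    # True
```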
Special types of matrices
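Identity matrix: $I = I_n \in \mathbb{R}^{n \times n}$:
$$I_{ij} = \begin{cases} 1 & i = j, \\ 0 & \text{otherwise.} \end{cases}$$
$\forall A \in \mathbb{R}^{m \times n}$: $A I_n = I_m A = A$
Diagonal matrix: $D = \mathrm{diag}(d_1, d_2, \ldots, d_n)$:
$$D_{ij} = \begin{cases} d_i & j = i, \\ 0 & \text{otherwise.} \end{cases}$$
Symmetric matrices: $A \in \mathbb{R}^{n \times n}$ is symmetric if $A = A^T$.
Orthogonal matrices: $U \in \mathbb{R}^{n \times n}$ is orthogonal if $UU^T = I = U^T U$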
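A 2-D rotation matrix is a standard example of an orthogonal matrix. The sketch below (the angle and helper names are arbitrary choices for illustration) checks $UU^T = I = U^T U$ up to floating-point tolerance.

```python
import math


def matmul(A, B):
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))


theta = math.pi / 6
U = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]    # rotation by theta

I2 = [[1.0, 0.0], [0.0, 1.0]]
UUt = matmul(U, transpose(U))
UtU = matmul(transpose(U), U)
print(close(UUt, I2), close(UtU, I2))
```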
Linear Independence and Rank
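A set of vectors $\{x_1, \ldots, x_n\}$ is linearly independent if there are no scalars $\alpha_1, \ldots, \alpha_n$, not all zero, such that $\sum_{i=1}^{n} \alpha_i x_i = 0$
Rank: $A \in \mathbb{R}^{m \times n}$, then $\mathrm{rank}(A)$ is the maximum number of linearly independent columns (or equivalently, rows)
Properties:
$\mathrm{rank}(A) \le \min\{m, n\}$
$\mathrm{rank}(A) = \mathrm{rank}(A^T)$
$\mathrm{rank}(AB) \le \min\{\mathrm{rank}(A), \mathrm{rank}(B)\}$
$\mathrm{rank}(A + B) \le \mathrm{rank}(A) + \mathrm{rank}(B)$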
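One way to compute the rank by hand is Gaussian elimination, counting the pivots. The sketch below is one possible implementation (the slides do not prescribe a method); it uses exact rational arithmetic to avoid rounding issues, and the function name `rank` and the example matrix are illustrative.

```python
from fractions import Fraction


def rank(A):
    """Rank of a matrix (list of rows) via Gaussian elimination in exact arithmetic."""
    M = [[Fraction(v) for v in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0                                    # number of pivots found so far
    for c in range(cols):
        # Look for a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r


A = [[1, 2, 3],
     [2, 4, 6],      # twice the first row, so it adds nothing to the rank
     [1, 0, 1]]
At = [list(col) for col in zip(*A)]
print(rank(A), rank(At))     # 2 2
```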
Matrix Inversion
If $A \in \mathbb{R}^{n \times n}$, $\mathrm{rank}(A) = n$, then the inverse of $A$, denoted $A^{-1}$, is the matrix that satisfies: $AA^{-1} = A^{-1}A = I$
Properties:
$(A^{-1})^{-1} = A$
$(AB)^{-1} = B^{-1} A^{-1}$
$(A^{-1})^T = (A^T)^{-1}$
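For the 2 x 2 case the inverse has a closed form, which makes the properties easy to verify. This is a sketch under that restriction; `inverse_2x2` is a hypothetical helper and the matrices are chosen for illustration.

```python
from fractions import Fraction


def inverse_2x2(A):
    """Inverse of a 2 x 2 matrix using the adjugate formula, in exact arithmetic."""
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular (rank < 2), no inverse exists")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(A, B):
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[2, 1], [5, 3]]
B = [[1, 1], [0, 1]]
A_inv, B_inv = inverse_2x2(A), inverse_2x2(B)

I2 = [[1, 0], [0, 1]]
print(matmul(A, A_inv) == I2, matmul(A_inv, A) == I2)          # True True
print(inverse_2x2(matmul(A, B)) == matmul(B_inv, A_inv))       # True
```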
Range and Nullspace of a Matrix
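Span: $\mathrm{span}(\{x_1, \ldots, x_n\}) = \{\sum_{i=1}^{n} \alpha_i x_i \mid \alpha_i \in \mathbb{R}\}$
Projection: $\mathrm{Proj}(y; \{x_i\}_{1 \le i \le n}) = \mathrm{argmin}_{v \in \mathrm{span}(\{x_i\}_{1 \le i \le n})} \|y - v\|_2$
Range: $A \in \mathbb{R}^{m \times n}$, then $\mathcal{R}(A) = \{Ax \mid x \in \mathbb{R}^n\}$ is the span of the columns of $A$
$\mathrm{Proj}(y, A) = A(A^T A)^{-1} A^T y$ (assuming $A$ has full column rank, so that $A^T A$ is invertible)
Nullspace: $\mathrm{null}(A) = \{x \in \mathbb{R}^n \mid Ax = 0\}$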
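The projection formula $A(A^T A)^{-1} A^T y$ can be worked through for a matrix with two independent columns, where $A^T A$ is 2 x 2 and easy to invert. The sketch below assumes exactly that setting; the function name `project` and the data are illustrative.

```python
from fractions import Fraction


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def project(y, A):
    """Project y onto the column space of an m x 2 matrix A with independent columns."""
    a1 = [Fraction(row[0]) for row in A]
    a2 = [Fraction(row[1]) for row in A]
    # Normal equations (A^T A) w = A^T y, solved with the 2 x 2 inverse formula.
    g11, g12, g22 = dot(a1, a1), dot(a1, a2), dot(a2, a2)
    b1, b2 = dot(a1, y), dot(a2, y)
    det = g11 * g22 - g12 * g12
    w1 = (g22 * b1 - g12 * b2) / det
    w2 = (g11 * b2 - g12 * b1) / det
    return [w1 * u + w2 * v for u, v in zip(a1, a2)]     # A w


A = [[1, 0],
     [1, 1],
     [1, 2]]
y = [1, 0, 2]
p = project(y, A)
residual = [yi - pi for yi, pi in zip(y, p)]
cols = [[row[j] for row in A] for j in range(2)]
# The residual y - p is orthogonal to every column of A.
print(p, [dot(residual, c) for c in cols])
```

The orthogonality of the residual to the columns of $A$ is what characterizes the minimizer of $\|y - v\|_2$ over the range of $A$.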
Determinant
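$A \in \mathbb{R}^{n \times n}$, $a_1, \ldots, a_n$ the rows of $A$, $S = \{\sum_{i=1}^{n} \alpha_i a_i \mid 0 \le \alpha_i \le 1\}$, then $|\det(A)|$ is the volume of $S$ (the sign of $\det(A)$ records orientation).
Properties:
$\det(I) = 1$
$\det(\lambda A) = \lambda^n \det(A)$
$\det(A^T) = \det(A)$
$\det(AB) = \det(A)\det(B)$
$\det(A) \neq 0$ if and only if $A$ is invertible.
If $A$ is invertible, then $\det(A^{-1}) = \det(A)^{-1}$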
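These properties can be checked on small integer matrices. The sketch below computes the determinant by cofactor expansion, which is one of several equivalent definitions and is only practical for small $n$; the helper names and matrices are illustrative.

```python
def det(A):
    """Determinant by cofactor expansion along the first row (fine for small n)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total


def matmul(A, B):
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 2]]
B = [[1, 2, 0],
     [0, 1, 0],
     [3, 0, 2]]
At = [list(col) for col in zip(*A)]
twoA = [[2 * v for v in row] for row in A]

print(det(A), det(B))                        # 6 2
print(det(matmul(A, B)) == det(A) * det(B))  # True
print(det(At) == det(A))                     # True
print(det(twoA) == 2 ** 3 * det(A))          # True: scaling by 2 multiplies det by 2^n
```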
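Quadratic Forms and Positive Semidefinite Matrices
$A \in \mathbb{R}^{n \times n}$, $x \in \mathbb{R}^n$, $x^T A x$ is called a quadratic form:
$$x^T A x = \sum_{1 \le i, j \le n} A_{ij} x_i x_j$$
$A$ is positive definite if $\forall x \in \mathbb{R}^n$, $x \neq 0$: $x^T A x > 0$
$A$ is positive semidefinite if $\forall x \in \mathbb{R}^n$: $x^T A x \ge 0$
$A$ is negative definite if $\forall x \in \mathbb{R}^n$, $x \neq 0$: $x^T A x < 0$
$A$ is negative semidefinite if $\forall x \in \mathbb{R}^n$: $x^T A x \le 0$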
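A quadratic form is just a double sum, which the sketch below evaluates directly. The two matrices are illustrative choices: for the first, $x^T A x = x_1^2 + x_2^2 + (x_1 - x_2)^2$, which is positive for every nonzero $x$; for the second, $x^T B x = (x_1 + x_2)^2$, which is nonnegative but vanishes at $x = (1, -1)$. Checking a finite grid is only a spot check, not a proof of definiteness.

```python
def quad_form(A, x):
    """Evaluate x^T A x as the double sum over A[i][j] * x[i] * x[j]."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


A = [[2, -1],
     [-1, 2]]        # positive definite
B = [[1, 1],
     [1, 1]]         # positive semidefinite but not positive definite

grid = [(a, b) for a in range(-3, 4) for b in range(-3, 4) if (a, b) != (0, 0)]
print(all(quad_form(A, x) > 0 for x in grid))    # True
print(all(quad_form(B, x) >= 0 for x in grid))   # True
print(quad_form(B, (1, -1)))                     # 0
```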
Eigenvalues and Eigenvectors
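$A \in \mathbb{R}^{n \times n}$, $\lambda \in \mathbb{C}$ is an eigenvalue of $A$ with the corresponding eigenvector $x \in \mathbb{C}^n$ ($x \neq 0$) if:
$$Ax = \lambda x$$
eigenvalues: the $n$ possibly complex roots of the polynomial equation $\det(A - \lambda I) = 0$, denoted $\lambda_1, \ldots, \lambda_n$
Properties:
$\mathrm{tr}(A) = \sum_{i=1}^{n} \lambda_i$
$\det(A) = \prod_{i=1}^{n} \lambda_i$
$\mathrm{rank}(A) = |\{1 \le i \le n \mid \lambda_i \neq 0\}|$ (for diagonalizable $A$)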
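For a 2 x 2 matrix the characteristic equation $\det(A - \lambda I) = 0$ is the quadratic $\lambda^2 - \mathrm{tr}(A)\lambda + \det(A) = 0$, so the eigenvalues follow from the quadratic formula. The sketch below assumes this 2 x 2 setting; `eigenvalues_2x2` is a hypothetical helper and the matrices are illustrative.

```python
import cmath


def eigenvalues_2x2(A):
    """Roots of lambda^2 - tr(A) lambda + det(A) = 0 (possibly complex)."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2


A = [[2, 1],
     [1, 2]]                 # symmetric: real eigenvalues
l1, l2 = eigenvalues_2x2(A)
print(l1, l2)                # (3+0j) (1+0j)
print(l1 + l2, l1 * l2)      # trace 4 and determinant 3, as complex numbers

R = [[0, -1],
     [1, 0]]                 # rotation by 90 degrees: no real eigenvalues
m1, m2 = eigenvalues_2x2(R)
print(m1, m2)                # 1j (-0-1j)
```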
Matrix Eigendecomposition
$A \in \mathbb{R}^{n \times n}$, $\lambda_1, \ldots, \lambda_n$ the eigenvalues, and $x_1, \ldots, x_n$ the eigenvectors. $X = [x_1 | x_2 | \ldots | x_n]$, $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, then $AX = X\Lambda$.
$A$ is called diagonalizable if $X$ is invertible: $A = X \Lambda X^{-1}$
If $A$ is symmetric, then all eigenvalues are real, and $X$ can be chosen orthogonal (hence denoted by $U = [u_1 | u_2 | \ldots | u_n]$):
$$A = U \Lambda U^T = \sum_{i=1}^{n} \lambda_i u_i u_i^T$$
A special case of Singular Value Decomposition
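As a sketch of the symmetric case, take the illustrative matrix $A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$, whose eigenvalues are 3 and 1 with orthonormal eigenvectors $(1, 1)/\sqrt{2}$ and $(1, -1)/\sqrt{2}$ (worked out by hand for this example). Summing $\lambda_i u_i u_i^T$ recovers $A$ up to rounding.

```python
import math

A = [[2.0, 1.0],
     [1.0, 2.0]]
s = 1 / math.sqrt(2)
lams = [3.0, 1.0]                    # eigenvalues of A
us = [[s, s], [s, -s]]               # matching orthonormal eigenvectors

# Rebuild A as the sum of rank-one terms lambda_i * u_i u_i^T.
recon = [[sum(lam * u[i] * u[j] for lam, u in zip(lams, us)) for j in range(2)]
         for i in range(2)]
print(recon)

# Each pair satisfies A u = lambda u.
for lam, u in zip(lams, us):
    Au = [sum(A[i][k] * u[k] for k in range(2)) for i in range(2)]
    print([abs(Au[i] - lam * u[i]) < 1e-12 for i in range(2)])
```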
Basic Probability Theory
Elements of Probability
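Sample Space $\Omega$: Set of all possible outcomes
Event Space $\mathcal{F}$: A family of subsets of $\Omega$
Probability Measure: Function $P : \mathcal{F} \to \mathbb{R}$ with properties:
1 $P(A) \ge 0$ ($\forall A \in \mathcal{F}$)
2 $P(\Omega) = 1$
3 If the $A_i$'s are disjoint, then $P(\bigcup_i A_i) = \sum_i P(A_i)$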
Conditional Probability and Independence
For events $A$, $B$ with $P(B) > 0$:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
$A$, $B$ independent if $P(A \mid B) = P(A)$ or equivalently: $P(A \cap B) = P(A)P(B)$
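A finite sample space makes these definitions concrete. The sketch below assumes two fair six-sided dice (an example chosen here, not from the slides), with events written as predicates on outcomes; the helper names `P` and `P_given` are hypothetical.

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))     # 36 equally likely outcomes


def P(event):
    """Probability of an event, given as a predicate on outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))


def P_given(a, b):
    """Conditional probability P(a | b) = P(a and b) / P(b)."""
    return P(lambda w: a(w) and b(w)) / P(b)


def sum7(w):
    return w[0] + w[1] == 7


def sum8(w):
    return w[0] + w[1] == 8


def first3(w):
    return w[0] == 3


# "Sum is 7" is independent of the first die; "sum is 8" is not.
print(P(sum7), P_given(sum7, first3))    # 1/6 1/6
print(P(sum8), P_given(sum8, first3))    # 5/36 1/6
```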
Random Variables and Distributions
A random variable $X$ is a function $X : \Omega \to \mathbb{R}$
Example: Number of heads in 20 tosses of a coin
Probabilities of events associated with random variables are defined based on the original probability function, e.g., $P(X = k) = P(\{\omega \in \Omega \mid X(\omega) = k\})$
Cumulative Distribution Function (CDF) $F_X : \mathbb{R} \to [0, 1]$: $F_X(x) = P(X \le x)$
Probability Mass Function (pmf): $X$ discrete, then $p_X(x) = P(X = x)$
Probability Density Function (pdf): $f_X(x) = dF_X(x)/dx$
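For the coin example, the pmf and CDF can be tabulated exactly. The sketch below assumes the coin is fair (the slide does not say so), in which case each of the $2^{20}$ toss sequences is equally likely and $P(X = k)$ is the number of sequences with $k$ heads divided by $2^{20}$.

```python
from fractions import Fraction
from math import comb

n = 20                                           # number of tosses
# P(X = k) = (number of sequences with k heads) / 2^n, assuming a fair coin.
pmf = {k: Fraction(comb(n, k), 2 ** n) for k in range(n + 1)}


def cdf(x):
    """F_X(x) = P(X <= x), obtained by summing the pmf."""
    return sum((p for k, p in pmf.items() if k <= x), Fraction(0))


print(sum(pmf.values()))                 # 1
print(float(pmf[10]), float(cdf(10)))
```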
Properties of Distribution Functions
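CDF:
$0 \le F_X(x) \le 1$
$F_X$ monotone increasing, with $\lim_{x \to -\infty} F_X(x) = 0$, $\lim_{x \to \infty} F_X(x) = 1$
pmf:
$0 \le p_X(x) \le 1$
$\sum_x p_X(x) = 1$
$\sum_{x \in A} p_X(x) = p_X(A)$
pdf:
$f_X(x) \ge 0$
$\int_{-\infty}^{\infty} f_X(x)\,dx = 1$
$\int_{x \in A} f_X(x)\,dx = P(X \in A)$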
Expectation and Variance
Assume random variable $X$ has pdf $f_X(x)$, and $g : \mathbb{R} \to \mathbb{R}$. Then
$$E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx$$
for discrete $X$, $E[g(X)] = \sum_x g(x) p_X(x)$
Properties:
for any constant $a \in \mathbb{R}$, $E[a] = a$
$E[a g(X)] = a E[g(X)]$
Linearity of Expectation: $E[g(X) + h(X)] = E[g(X)] + E[h(X)]$
$\mathrm{Var}[X] = E[(X - E[X])^2]$
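In the discrete case the expectation is a finite weighted sum, so the properties can be verified exactly. The sketch below assumes a fair six-sided die as the random variable (an example chosen for illustration); `E` is a hypothetical helper.

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}   # fair die


def E(g):
    """E[g(X)] = sum over x of g(x) * p_X(x)."""
    return sum(g(x) * p for x, p in pmf.items())


mean = E(lambda x: x)
var = E(lambda x: (x - mean) ** 2)
print(mean, var)                                 # 7/2 35/12

# Linearity: E[g(X) + h(X)] = E[g(X)] + E[h(X)]
lhs = E(lambda x: x + x ** 2)
rhs = E(lambda x: x) + E(lambda x: x ** 2)
print(lhs == rhs)                                # True
```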
Some Common Random Variables
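$X \sim \mathrm{Bernoulli}(p)$ ($0 \le p \le 1$):
$$p_X(x) = \begin{cases} p & x = 1, \\ 1 - p & x = 0. \end{cases}$$
$X \sim \mathrm{Geometric}(p)$ ($0 < p \le 1$): $p_X(x) = p(1 - p)^{x - 1}$ for $x = 1, 2, \ldots$
$X \sim \mathrm{Uniform}(a, b)$ ($a < b$):
$$f_X(x) = \begin{cases} \frac{1}{b - a} & a \le x \le b, \\ 0 & \text{otherwise.} \end{cases}$$
$X \sim \mathrm{Normal}(\mu, \sigma^2)$:
$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{1}{2\sigma^2}(x - \mu)^2}$$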
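A quick numerical sanity check that these formulas define valid distributions. This is a sketch with arbitrary parameter values: the normal density is integrated with a midpoint rule over $\mu \pm 10\sigma$, and the geometric pmf is summed over a long but finite range, so both results are approximations.

```python
import math


def normal_pdf(x, mu, sigma):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)


def geometric_pmf(x, p):
    return p * (1 - p) ** (x - 1)


# Midpoint rule for the integral of the normal pdf over [mu - 10 sigma, mu + 10 sigma].
mu, sigma, steps = 1.0, 2.0, 20000
lo, hi = mu - 10 * sigma, mu + 10 * sigma
h = (hi - lo) / steps
area = sum(normal_pdf(lo + (i + 0.5) * h, mu, sigma) for i in range(steps)) * h

# Truncated sums for the geometric distribution; the neglected tail is negligible here.
p = 0.3
total = sum(geometric_pmf(x, p) for x in range(1, 500))
mean = sum(x * geometric_pmf(x, p) for x in range(1, 500))

print(area, total, mean, 1 / p)
```

The geometric mean comes out close to $1/p$, the standard closed form for this parametrization.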
Multiple Random Variables and Joint Distributions
$X_1, \ldots, X_n$ random variables
Joint CDF: $F_{X_1, \ldots, X_n}(x_1, \ldots, x_n) = P(X_1 \le x_1, \ldots, X_n \le x_n)$
Joint pdf: $f_{X_1, \ldots, X_n}(x_1, \ldots, x_n) = \frac{\partial^n F_{X_1, \ldots, X_n}(x_1, \ldots, x_n)}{\partial x_1 \ldots \partial x_n}$
Marginalization: $f_{X_1}(x_1) = \int_{-\infty}^{\infty} \ldots \int_{-\infty}^{\infty} f_{X_1, \ldots, X_n}(x_1, \ldots, x_n)\,dx_2 \ldots dx_n$
Conditioning: $f_{X_1 | X_2, \ldots, X_n}(x_1 | x_2, \ldots, x_n) = \frac{f_{X_1, \ldots, X_n}(x_1, \ldots, x_n)}{f_{X_2, \ldots, X_n}(x_2, \ldots, x_n)}$
Chain Rule: $f(x_1, \ldots, x_n) = f(x_1) \prod_{i=2}^{n} f(x_i | x_1, \ldots, x_{i-1})$
Independence: $f(x_1, \ldots, x_n) = \prod_{i=1}^{n} f(x_i)$.
More generally, events $A_1, \ldots, A_n$ independent if $P(\bigcap_{i \in S} A_i) = \prod_{i \in S} P(A_i)$ ($\forall S \subseteq \{1, \ldots, n\}$).
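The same operations have discrete analogues in which integrals become sums. The sketch below assumes a made-up joint pmf over two binary variables $X$ and $Y$ (the numbers are illustrative only) and computes marginals, a conditional, and an independence check.

```python
from fractions import Fraction as F

# Hypothetical joint pmf p(x, y) over x, y in {0, 1}.
joint = {(0, 0): F(1, 8), (0, 1): F(3, 8),
         (1, 0): F(1, 4), (1, 1): F(1, 4)}


def marginal(axis):
    """Sum the joint pmf over the other variable (axis 0 -> p_X, axis 1 -> p_Y)."""
    out = {}
    for xy, p in joint.items():
        out[xy[axis]] = out.get(xy[axis], 0) + p
    return out


pX, pY = marginal(0), marginal(1)

# Conditioning: p(x | y) = p(x, y) / p_Y(y)
cond = {(x, y): p / pY[y] for (x, y), p in joint.items()}

# Independence would require p(x, y) = p_X(x) p_Y(y) for every pair.
independent = all(joint[x, y] == pX[x] * pY[y] for (x, y) in joint)

print(pX, pY)
print(cond[0, 0], independent)       # 1/3 False
```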
Random Vectors
$X_1, \ldots, X_n$ random variables. $X = [X_1\ X_2\ \ldots\ X_n]^T$ random vector.
If $g : \mathbb{R}^n \to \mathbb{R}$, then
$$E[g(X)] = \int_{\mathbb{R}^n} g(x_1, \ldots, x_n) f_{X_1, \ldots, X_n}(x_1, \ldots, x_n)\,dx_1 \ldots dx_n$$
if $g : \mathbb{R}^n \to \mathbb{R}^m$, $g = [g_1 \ldots g_m]^T$, then
$$E[g(X)] = \big[E[g_1(X)] \ldots E[g_m(X)]\big]^T$$
Covariance Matrix:
$$\Sigma = \mathrm{Cov}(X) = E\big[(X - E[X])(X - E[X])^T\big]$$
Properties of Covariance Matrix:
$\Sigma_{ij} = \mathrm{Cov}[X_i, X_j] = E\big[(X_i - E[X_i])(X_j - E[X_j])\big]$
$\Sigma$ symmetric, positive semidefinite
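The covariance matrix can be computed exactly for a small discrete random vector. The sketch below assumes $X = [X_1\ X_2]^T$ takes four made-up values with equal probability; it then spot-checks symmetry and nonnegativity of the quadratic form $v^T \Sigma v$ on a few vectors.

```python
from fractions import Fraction as F

# Hypothetical equally likely values of the random vector X = [X1 X2]^T.
points = [(1, 2), (2, 1), (3, 5), (4, 4)]
n = len(points)

mean = [F(sum(pt[i] for pt in points), n) for i in range(2)]

# Sigma[i][j] = E[(X_i - E[X_i]) (X_j - E[X_j])]
Sigma = [[sum((pt[i] - mean[i]) * (pt[j] - mean[j]) for pt in points) / n
          for j in range(2)]
         for i in range(2)]

print(mean)
print(Sigma)


def quad_form(S, v):
    return sum(S[i][j] * v[i] * v[j] for i in range(2) for j in range(2))


# v^T Sigma v is the variance of v^T X, so it cannot be negative.
print(all(quad_form(Sigma, (a, b)) >= 0 for a in range(-3, 4) for b in range(-3, 4)))
```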


Multivariate Gaussian Distribution
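$\mu \in \mathbb{R}^n$, $\Sigma \in \mathbb{R}^{n \times n}$ symmetric, positive semidefinite (the density below additionally requires $\Sigma$ to be invertible, i.e., positive definite)
$X \sim N(\mu, \Sigma)$ n-dimensional Gaussian distribution:
$$f_X(x) = \frac{1}{(2\pi)^{n/2} \det(\Sigma)^{1/2}} \exp\left(-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right)$$
$E[X] = \mu$
$\mathrm{Cov}(X) = \Sigma$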
Parameter Estimation: Maximum Likelihood
Parametrized distribution $f_X(x; \theta)$ with parameter(s) $\theta$ unknown.
IID samples $x_1, \ldots, x_n$ observed.
Goal: Estimate $\theta$
MLE: $\hat{\theta} = \mathrm{argmax}_{\theta} \{f(x_1, \ldots, x_n; \theta)\}$
MLE Example
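$X \sim \mathrm{Gaussian}(\mu, \sigma^2)$. $\theta = (\mu, \sigma^2)$ unknown. Samples $x_1, \ldots, x_n$. Then:
$$f(x_1, \ldots, x_n; \mu, \sigma^2) = \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} \exp\left(-\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{2\sigma^2}\right)$$
Setting: $\frac{\partial \log f}{\partial \mu} = 0$ and $\frac{\partial \log f}{\partial \sigma} = 0$
Gives:
$$\hat{\mu}_{MLE} = \frac{\sum_{i=1}^{n} x_i}{n}, \quad \hat{\sigma}^2_{MLE} = \frac{\sum_{i=1}^{n} (x_i - \hat{\mu})^2}{n}$$
If it is not possible to find the optimal point in closed form, iterative methods such as gradient descent can be used.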
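The closed-form estimates are easy to compute and to sanity-check against the log-likelihood. The sketch below uses a small made-up sample; `gaussian_mle` and `log_likelihood` are hypothetical helper names. Note that the variance estimate divides by $n$, not $n - 1$.

```python
import math


def gaussian_mle(xs):
    """Closed-form MLE (mu_hat, sigma2_hat) for iid Gaussian samples."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n     # divides by n, as in the MLE
    return mu, var


def log_likelihood(xs, mu, var):
    """log f(x_1, ..., x_n; mu, sigma^2) for the Gaussian model."""
    n = len(xs)
    return -0.5 * n * math.log(2 * math.pi * var) - sum((x - mu) ** 2 for x in xs) / (2 * var)


xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]    # made-up sample
mu_hat, var_hat = gaussian_mle(xs)
print(mu_hat, var_hat)                           # 5.0 4.0

# Perturbing either parameter away from the MLE lowers the log-likelihood.
best = log_likelihood(xs, mu_hat, var_hat)
print(all(best >= log_likelihood(xs, mu_hat + dm, var_hat + dv)
          for dm in (-0.5, 0.0, 0.5) for dv in (-1.0, 0.0, 1.0)))   # True
```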
Some Useful Inequalities
Markov's Inequality: $X$ random variable, and $a > 0$. Then:
$$P(|X| \ge a) \le \frac{E[|X|]}{a}$$
Chebyshev's Inequality: If $E[X] = \mu$, $\mathrm{Var}(X) = \sigma^2$, $k > 0$, then:
$$\Pr(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}$$
Chernoff bound: $X_1, \ldots, X_n$ iid random variables, with $E[X_i] = \mu$, $X_i \in \{0, 1\}$ ($\forall 1 \le i \le n$). Then:
$$P\left(\left|\frac{1}{n}\sum_{i=1}^{n} X_i - \mu\right| \ge \epsilon\right) \le 2\exp(-2n\epsilon^2)$$
Multiple variants of Chernoff-type bounds exist, which can be useful in different settings
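For Bernoulli variables the left-hand side of the Chernoff bound can be computed exactly from the binomial distribution and compared with $2\exp(-2n\epsilon^2)$. The sketch below assumes $\mu = 1/2$ and $\epsilon = 1/10$ (arbitrary illustrative values) and uses exact rationals so that boundary cases such as $|k/n - \mu| = \epsilon$ are not lost to rounding.

```python
from fractions import Fraction
from math import comb, exp


def exact_tail(n, p, eps):
    """Exact P(|(1/n) sum X_i - p| >= eps) for iid Bernoulli(p) variables X_i."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(n + 1)
               if abs(Fraction(k, n) - p) >= eps)


def chernoff_bound(n, eps):
    """Right-hand side 2 exp(-2 n eps^2) of the bound."""
    return 2 * exp(-2 * n * float(eps) ** 2)


p, eps = Fraction(1, 2), Fraction(1, 10)
for n in (10, 100, 1000):
    print(n, float(exact_tail(n, p, eps)), chernoff_bound(n, eps))
```

For small $n$ the bound exceeds 1 and says nothing; it becomes informative as $n$ grows, since it decays exponentially in $n$.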
