
Applied Probability and Stochastic Processes: MGT-484
Introduction to Markov Chains

Jianzhe (Trevor) Zhen

École Polytechnique Fédérale de Lausanne


Stochastic Processes

Definition: A discrete-time stochastic process is a family of random
variables X0 , X1 , X2 , . . .

Examples:
• The daily closing price of a stock
• The yearly gross domestic product of a country
• The daily sales of a retail store
• The weekly temperature of Lake Geneva
• The yearly rate of unemployment
• The number of hits of a website every minute
• etc.
Markov Chains

Informal Definition: A Markov chain is a stochastic process whose
"future depends on the past only through the present."

Put differently, the only part of the history of the process that affects
its future evolution is the current state.

Example: Let Xt be the closing price of a stock on day t, and assume
that the rates of return Rt of this stock from day t to day t + 1 are
independent and identically distributed (iid) and normal. Thus

       Xt+1 = Xt (1 + Rt ),   where Rt ∼ N(µ, σ²) iid.

Note that Xt+1 depends on the stock price history X1 , X2 , . . . , Xt only
through the current price Xt . This is a Markov chain with a continuous
state space (as the stock price can take any real number).
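A minimal simulation sketch of this price process; the drift µ, volatility σ, horizon and initial price below are illustrative assumptions, not values from the course:

```python
import numpy as np

# Minimal sketch: simulate X_{t+1} = X_t (1 + R_t) with R_t ~ N(mu, sigma^2) iid.
# All parameter values are illustrative assumptions.
rng = np.random.default_rng(0)
mu, sigma = 0.0005, 0.01   # assumed daily mean return and volatility
T, X0 = 250, 100.0         # assumed horizon (days) and initial price

X = np.empty(T + 1)
X[0] = X0
for t in range(T):
    R_t = rng.normal(mu, sigma)      # today's return
    X[t + 1] = X[t] * (1 + R_t)      # next price depends on the past only through X[t]

print(X[:5])
```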
Markov Chains

Definition: The state space of a stochastic process is the set of
possible values taken by the random variables X0 , X1 , X2 , . . .

In this course we will concentrate mainly on Markov chains with finite
or countably infinite state spaces.

Example: Consider a sequence of coin flips and define Xt as the
number of heads observed up to time t. If

       Zt = 1   if the outcome at time t is heads,
            0   otherwise,

then Xt = Z0 + Z1 + · · · + Zt = Xt−1 + Zt . Thus, Xt depends on the
past coin flips only through the most recent count Xt−1 .
The (S, s) Inventory Model
Let Xt denote the inventory level at the end of period t and assume
that we face a demand Dt during period t.

Simple ordering policy: Order nothing as long as the inventory
exceeds a level s. Otherwise, increase the inventory to a level S > s.

Dynamics:¹

       Xt+1 = (S − Dt+1 )+     if Xt ≤ s,
              (Xt − Dt+1 )+    if Xt > s.

If the demands are independent, Xt+1 thus depends on the past
demands only through the current inventory level Xt .

[Figure: sample path of the inventory level over time; whenever the level drops to s or below, it is reordered up to S.]

¹ For any c ∈ R we define c+ = max{c, 0}.
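A minimal simulation sketch of this inventory recursion; the demand distribution and the levels s, S below are illustrative assumptions:

```python
import numpy as np

# Minimal sketch of the (S, s) inventory recursion; parameter values are assumptions.
rng = np.random.default_rng(1)
S, s = 10, 3          # order-up-to level and reorder threshold (assumed)
T = 20                # number of periods (assumed)

X = np.empty(T + 1, dtype=int)
X[0] = S
for t in range(T):
    D = rng.poisson(4)                  # assumed demand in period t+1
    if X[t] <= s:
        X[t + 1] = max(S - D, 0)        # reorder up to S, then face the demand
    else:
        X[t + 1] = max(X[t] - D, 0)     # no order placed
print(X)
```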


Markov Property

Common property of all the examples:

If you want to know the distribution of the next state given the
current state, you gain no additional information if I tell you the
entire past history of the process.

Mathematically, this translates to

P(Xt+1 = xt+1 |Xt = xt ) = P(Xt+1 = xt+1 |X0 = x0 , X1 = x1 , . . . , Xt = xt ),

where x0 , x1 , . . . , xt is the sequence of observed states (i.e., these are
numbers and not random variables).

The above relationship is called the Markov property. It is the defining
characteristic of a Markov chain.
Transition Probabilities
We make the assumption that the state space X is a subset of N.

Definition: We refer to Pt (i, j) = P(Xt+1 = j|Xt = i) for i, j ∈ X as the
transition probabilities. A Markov chain is time-homogeneous if the
transition probabilities are independent of time, that is,

       Pt (i, j) = Pt′ (i, j)    ∀ t, t′ = 0, 1, 2, . . .

From now on we focus only on time-homogeneous Markov chains.


[Figure: a sample path X0 , X1 , X2 , X3 , X4 over states 1–4, illustrating transition probabilities such as P0 (3, 1) and P3 (2, 4).]
Markov Chains: Formal Definition

Definition: A process X0 , X1 , X2 , . . . is a (time-homogeneous)
discrete-time Markov chain with state space X, initial distribution γ
and transition matrix P if:
1 Xt is a random variable with values in X for all t.
2 X0 ∼ γ, that is, P(X0 = i) = γi .
3 The process {Xt } satisfies the Markov property.
4 P(Xt+1 = j|Xt = i) = Pij for all t.

Remark: P is called a "matrix" even though X might be infinite. If X is
infinite, then P also has infinitely many entries.

Remark: The relation Σj∈X P(Xt+1 = j|Xt = i) = 1 implies that
Σj∈X Pij = 1. Thus, all rows of P sum to 1. A matrix with this property
is called a stochastic matrix.
Joint State Distribution

The Markov chain is completely described by X, γ and P:

P(Xt = xt , Xt−1 = xt−1 , . . . , X0 = x0 )

  (i)  = P(X0 = x0 ) P(X1 = x1 |X0 = x0 ) P(X2 = x2 |X1 = x1 , X0 = x0 ) · · ·
             · · · P(Xt = xt |Xt−1 = xt−1 , . . . , X0 = x0 )
  (ii) = P(X0 = x0 ) P(X1 = x1 |X0 = x0 ) P(X2 = x2 |X1 = x1 ) · · ·
             · · · P(Xt = xt |Xt−1 = xt−1 )
  (iii) = γx0 Px0 x1 Px1 x2 · · · Pxt−1 xt

(i) as P(A ∩ B ∩ C) = P(A|B ∩ C) P(B ∩ C) = P(A|B ∩ C) P(B|C) P(C)
(ii) due to the Markov property
(iii) by the definition of the initial distribution and the transition matrix
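A minimal sketch that evaluates this product γx0 Px0 x1 · · · Pxt−1 xt for a given path; the two-state transition matrix, initial distribution and path below are arbitrary assumptions for illustration:

```python
import numpy as np

# Minimal sketch: probability of observing a given path x0, x1, ..., xt.
# The chain, initial distribution and path are illustrative assumptions.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
gamma = np.array([0.5, 0.5])

def path_probability(path, gamma, P):
    """Return gamma[x0] * P[x0, x1] * ... * P[x_{t-1}, x_t]."""
    prob = gamma[path[0]]
    for i, j in zip(path[:-1], path[1:]):
        prob *= P[i, j]
    return prob

print(path_probability([0, 0, 1, 1, 0], gamma, P))
```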
Sequence of Coin Flips
Let Xt be the number of heads in t coin flips and set X0 = 0.

State space: X = {0, 1, 2, . . .}

Initial distribution: γ0 = 1, γi = 0 for all i > 0

Transition matrix: If Xt−1 = xt−1 , then Xt can take the values xt−1 or
xt−1 + 1 with probability 1/2 each. Thus,

       P(Xt = xt |Xt−1 = xt−1 ) = 1/2   if xt = xt−1 + 1,
                                  1/2   if xt = xt−1 ,
                                  0     otherwise.

We can write P as a matrix:

             1/2  1/2   0    0   ···
              0   1/2  1/2   0   ···
       P =    0    0   1/2  1/2  ···
              ..   ..   ..   ..  ..
Graphical Representation

Any Markov chain has a graphical representation:


1 The nodes of the graph are the elements of X.
2 Draw an arc from i to j if Pij > 0.
3 Label the arc with the value Pij .

The sequence of coin flips:

[Figure: states 0, 1, 2, 3, 4, . . . in a row; each state has a self-loop with probability 1/2 and an arc to the next state with probability 1/2.]
Describing Markov Chains

We have seen three ways of describing a Markov chain:

1 Specify the recursion that expresses Xt in terms of Xt−1 .


2 Specify the state space X, the initial distribution γ and the
transition matrix P.
3 Use the graphical representation of the Markov chain.
A Queuing Model
Example:

• Customers arrive at a waiting room according to the stochastic
process A0 , A1 , . . ., where At represents the number of
customers arriving in period t.
• The server (e.g., a receptionist) can process Dt people in
period t.

Denote by Xt the total number of customers in the waiting room at the
end of period t:

       Xt+1 = 0                        if Xt + At+1 ≤ Dt+1 ,
              Xt + At+1 − Dt+1         otherwise,

that is, Xt+1 = [Xt + At+1 − Dt+1 ]+ .

=⇒ {Xt } follows a Markov chain if the arrivals are independent.
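A minimal simulation sketch of this queue recursion; the arrival and service distributions below are illustrative assumptions:

```python
import numpy as np

# Minimal sketch of the queue recursion X_{t+1} = [X_t + A_{t+1} - D_{t+1}]^+.
# Arrival and service distributions are illustrative assumptions.
rng = np.random.default_rng(2)
T = 20
X = np.zeros(T + 1, dtype=int)
for t in range(T):
    A = rng.poisson(3)          # assumed arrivals in period t+1
    D = rng.poisson(3)          # assumed service capacity in period t+1
    X[t + 1] = max(X[t] + A - D, 0)
print(X)
```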


A Queuing Model
Example (extended):

• A security guard controls access to the building.
• If more than K people are in the waiting room, then any new
arrivals are turned away.
• The guard observes the number of people in the room with a
delay of 1 period.

The number of people in the room now satisfies the recursion

       Xt+1 = [Xt + At+1 − Dt+1 ]+     if Xt−1 ≤ K ,
              [Xt − Dt+1 ]+            if Xt−1 > K .

=⇒ {Xt } is not a Markov chain.

Increase the state dimension: If we define X′t = (Xt , Xt−1 ) for all t,
then {X′t } is again a Markov chain!
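A minimal simulation sketch of the augmented chain, tracking the pair (Xt , Xt−1 ) so that the next state depends only on the current augmented state; the distributions and the threshold K are illustrative assumptions:

```python
import numpy as np

# Minimal sketch: the augmented state (X_t, X_{t-1}) makes the delayed-control
# queue Markovian again. Distributions and K are illustrative assumptions.
rng = np.random.default_rng(3)
K, T = 5, 20
X = np.zeros(T + 1, dtype=int)
prev = 0                                 # X_{t-1}, the second component of the state
for t in range(T):
    A = rng.poisson(3)
    D = rng.poisson(3)
    if prev <= K:                        # guard admits arrivals (based on delayed info)
        nxt = max(X[t] + A - D, 0)
    else:                                # arrivals turned away
        nxt = max(X[t] - D, 0)
    prev, X[t + 1] = X[t], nxt           # new augmented state is (X_{t+1}, X_t)
print(X)
```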
State Space Explosion

This example shows that many (if not most) stochastic processes can
be converted to Markov chains if the state space is enlarged.

A key to good Markov chain models is to


control the state space explosion!
State Distributions
If {Xt } is a Markov chain with initial distribution γ and transition
matrix P, then the law of total probability implies:

       P(X1 = j) = Σi∈X P(X1 = j|X0 = i) P(X0 = i) = Σi∈X γi Pij

Denote the distribution of Xt by pt , i.e., pt (j) = P(Xt = j) for all j ∈ X.
Thus, p0 = γ, and p1 = γP (distributions = row vectors).

More generally:

       pt = pt−1 P = pt−2 PP = · · · = γP^t

The (i, j) entry of P^t is the probability of transitioning from i to j in t
periods, and the ith row of P^t is the distribution of Xt given X0 = i.

=⇒ Computing state distributions is tantamount to


computing powers of the transition matrix.
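A minimal numpy sketch of the relation pt = γ P^t; the small two-state chain below is an illustrative assumption:

```python
import numpy as np

# Minimal sketch: the distribution of X_t is p_t = gamma @ P^t.
# The two-state chain is an illustrative assumption.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
gamma = np.array([0.5, 0.5])

t = 10
p_t = gamma @ np.linalg.matrix_power(P, t)   # distribution of X_t as a row vector
print(p_t, p_t.sum())                        # entries sum to 1
```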
Sequence of Coin Flips
Let Xt be the number of heads in t coin flips and set X0 = 0.

Question: What is the probability P(Xt = j) of j heads in t tosses?

Answer: Binomial distribution:

       P(Xt = j) = (t choose j) (1/2)^j (1/2)^(t−j) = (t choose j) (1/2)^t    for 0 ≤ j ≤ t

Alternatively, we can compute the powers of P by hand, e.g.,

              1/4  1/2  1/4   0    0   ···
               0   1/4  1/2  1/4   0   ···
       P² =    0    0   1/4  1/2  1/4  ···
               ..   ..   ..   ..   ..  ..

The first row of P^t contains the numbers P(Xt = j|X0 = 0) for 0 ≤ j < ∞.
This is consistent with the binomial formula above.
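A minimal numerical check of this consistency; truncating the infinite matrix to a finite block is an implementation assumption (exact for the first row as long as the truncation size exceeds t):

```python
import numpy as np
from math import comb

# Minimal sketch: compare the first row of P^t (truncated chain) with Binomial(t, 1/2).
N, t = 20, 10                                   # assumed truncation size and horizon
P = np.zeros((N + 1, N + 1))
for i in range(N):
    P[i, i] = 0.5                               # stay (tails)
    P[i, i + 1] = 0.5                           # move up (heads)
P[N, N] = 1.0                                   # absorb at the truncation boundary

Pt = np.linalg.matrix_power(P, t)
expected = np.array([comb(t, j) * 0.5 ** t for j in range(t + 1)])
print(np.allclose(Pt[0, :t + 1], expected))     # True
```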
Recipe for Computing Powers of P
Input: P ∈ R^(N×N) diagonalizable, t ∈ N; Output: P^t .

1 Find a diagonal matrix Λ and an invertible matrix R with

       P = R Λ R⁻¹ = R diag(λ1 , . . . , λN ) R⁻¹ ,

where λ1 , . . . , λN denote the eigenvalues of P, and the columns
of R represent the corresponding eigenvectors.

2 P^t = (R Λ R⁻¹)^t = R Λ (R⁻¹ R) Λ (R⁻¹ R) · · · Λ R⁻¹ = R Λ^t R⁻¹
      = R diag(λ1^t , . . . , λN^t) R⁻¹ , since R⁻¹ R = I.

(These calculations are easy to do in MATLAB.)
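For instance, a minimal numpy sketch of this recipe (numpy used here in place of MATLAB); the matrix P is an illustrative assumption and must be diagonalizable:

```python
import numpy as np

# Minimal sketch of P^t = R diag(lambda_i^t) R^{-1}; P is an illustrative assumption.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
t = 5

lam, R = np.linalg.eig(P)                       # eigenvalues and eigenvectors (columns of R)
Pt = R @ np.diag(lam ** t) @ np.linalg.inv(R)   # P^t via the eigendecomposition
print(np.allclose(Pt, np.linalg.matrix_power(P, t)))   # True
```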


The Simplest Possible Chain

Given: α > 0, β > 0. Then,

       P = ( 1−α    α  )
           (  β    1−β )

has eigenvalues 1 and 1 − α − β.

[Figure: two states 1 and 2; an arc from 1 to 2 with probability α, an arc from 2 to 1 with probability β, and self-loops with probabilities 1−α and 1−β.]

Task: Find R = ( a  b ) with P = R ( 1     0    ) R⁻¹,
               ( c  d )            ( 0  1−α−β  )

recalling that R⁻¹ = (1/(ad−bc)) (  d  −b ). A direct calculation yields:
                                 ( −c   a )

       R = ( 1   −α/(α+β) )
           ( 1    β/(α+β) )

       P^t = ( β/(α+β) + (α/(α+β))(1−α−β)^t     α/(α+β) − (α/(α+β))(1−α−β)^t )
             ( β/(α+β) − (β/(α+β))(1−α−β)^t     α/(α+β) + (β/(α+β))(1−α−β)^t )
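A minimal check of this closed form against a direct matrix power; the values of α, β and t are illustrative assumptions:

```python
import numpy as np

# Minimal sketch: verify the closed form for P^t of the two-state chain.
# alpha, beta and t are illustrative assumptions.
alpha, beta, t = 0.3, 0.2, 7
P = np.array([[1 - alpha, alpha],
              [beta, 1 - beta]])

s = alpha + beta
rho = (1 - alpha - beta) ** t
Pt_closed = np.array([[beta / s + alpha / s * rho, alpha / s - alpha / s * rho],
                      [beta / s - beta / s * rho,  alpha / s + beta / s * rho]])

print(np.allclose(Pt_closed, np.linalg.matrix_power(P, t)))   # True
```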
Virus Mutation

Consider a virus with n > 1 possible strains. In each period, the virus
mutates with probability α, in which case it changes randomly to any
of the remaining n − 1 strains. What is P(Xt = i|X0 = i)?

Method 1: Construct a Markov chain with X = {1, 2, . . . , n}.

Method 2: Use the previous example with the two lumped states "strain i" and
"all other strains" and with β = α/(n−1):

[Figure: two lumped states "strain i" and "all other strains"; the arc from "strain i" to "all other strains" has probability α, the arc back has probability α/(n−1), and the self-loops have probabilities 1−α and 1−α/(n−1).]

       =⇒ P(Xt = i|X0 = i) = 1/n + ((n−1)/n) (1 − (n/(n−1)) α)^t
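A minimal sketch comparing the full n-state chain (Method 1) with the closed form (Method 2); the values of n, α and t are illustrative assumptions:

```python
import numpy as np

# Minimal sketch: virus mutation chain. n, alpha and t are illustrative assumptions.
n, alpha, t = 4, 0.3, 6

# Method 1: full n-state transition matrix.
P = np.full((n, n), alpha / (n - 1))
np.fill_diagonal(P, 1 - alpha)
prob_method1 = np.linalg.matrix_power(P, t)[0, 0]   # P(X_t = i | X_0 = i)

# Method 2: two-state closed form with beta = alpha/(n-1).
prob_method2 = 1 / n + (n - 1) / n * (1 - n / (n - 1) * alpha) ** t

print(np.isclose(prob_method1, prob_method2))       # True
```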
A Trick for Computing P^t

Assume that we computed the eigenvalues λ1 , . . . , λn of P ∈ R^(n×n)
and that they are all different (the case of degenerate eigenvalues is
more intricate but can be handled using the Jordan normal form).

The computation of R and R⁻¹ , which may be hard, can be avoided!

Note that P^t = R Λ^t R⁻¹ is a linear combination of λ1^t , . . . , λn^t , that is,
there must be n × n matrices A1 , . . . , An with

       P^t = A1 λ1^t + · · · + An λn^t    ∀ t = 0, 1, 2, . . .    (⋆)

Explicitly calculating P^t for t = 0, . . . , n − 1 allows us to interpret (⋆)
as a system of n³ linear equations for the n³ entries of the matrices
A1 , . . . , An .
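A minimal sketch of this trick: the matrices A1 , . . . , An are recovered by solving the linear system obtained from t = 0, . . . , n − 1 (entrywise, this is a Vandermonde system in the eigenvalues); the matrix P below is an illustrative assumption with distinct eigenvalues:

```python
import numpy as np

# Minimal sketch: find A_1, ..., A_n with P^t = sum_k A_k * lambda_k^t by solving
# the linear system from t = 0, ..., n-1. P is an illustrative assumption.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
n = P.shape[0]
lam = np.linalg.eigvals(P)                      # assumed to be distinct

V = np.vander(lam, n, increasing=True).T        # V[t, k] = lambda_k^t
powers = np.stack([np.linalg.matrix_power(P, t) for t in range(n)])   # P^0, ..., P^{n-1}

# Solve entrywise: for each (i, j), V @ a = (P^0[i,j], ..., P^{n-1}[i,j]).
A = np.linalg.solve(V, powers.reshape(n, -1)).reshape(n, n, n)        # A[k] = A_{k+1}

t = 10
Pt = sum(A[k] * lam[k] ** t for k in range(n))
print(np.allclose(Pt, np.linalg.matrix_power(P, t)))                  # True
```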
Expected Reward

Assume that we receive a reward r (i) in state i.

Example: Let Xt be the number of customers in a queue at time t. To
calculate the fraction of time the server is busy, assign a reward 1 to
states where Xt > 0 and a reward 0 to states where Xt = 0.

The expected reward at time t is

       E[r(Xt )] = Σj∈X E[r(Xt )|Xt = j] P(Xt = j)
                 = Σj∈X r(j) P(Xt = j) = γ P^t r ,

where r = [r(1), r(2), . . .]ᵀ is the (column) vector of rewards. Recall
that γP^t is the distribution of Xt . Thus, we take the expectation of r
w.r.t. this distribution.
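A minimal sketch of E[r(Xt )] = γ P^t r for a small finite chain; the chain, reward vector and horizon below are illustrative assumptions:

```python
import numpy as np

# Minimal sketch: expected reward E[r(X_t)] = gamma @ P^t @ r.
# The chain, rewards and horizon are illustrative assumptions.
P = np.array([[0.6, 0.4, 0.0],     # a 3-state "toy queue": 0, 1 or 2 customers
              [0.3, 0.4, 0.3],
              [0.0, 0.4, 0.6]])
gamma = np.array([1.0, 0.0, 0.0])  # start with an empty queue
r = np.array([0.0, 1.0, 1.0])      # reward 1 whenever the server is busy (X_t > 0)

t = 8
expected_reward = gamma @ np.linalg.matrix_power(P, t) @ r   # = P(X_t > 0)
print(expected_reward)
```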
