
Markov Chain

Applications
Example – Flight cancellation concerns: Daily Flight
Cancellations Data
State   Description
0       No cancellations
1       One cancellation
2       Two cancellations
3       More than two cancellations

Revenue loss (Rs. Mn) due to cancellations

State   0     1     2     3
Loss    0     4.5   10    16

One-Step Transition Probabilities

From \ To   0      1      2      3
0           0.45   0.3    0.2    0.05
1           0.15   0.6    0.15   0.1
2           0.1    0.3    0.4    0.2
3           0      0.1    0.7    0.2
Example – Flight cancellation concerns: Daily
Flight Cancellations Data
• Questions (see the sketch after this list):
• If there are no cancellations initially, what is the probability that there will be at least one cancellation after 2 days?
• Calculate the steady-state expected loss due to flight cancellations
• Through simultaneous equations
• Through simulation
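A minimal Python/NumPy sketch of both approaches (the code is not part of the original slides; the transition matrix and loss figures are taken from the tables above):

```python
import numpy as np

# One-step transition matrix for daily cancellations (states 0, 1, 2, 3)
P = np.array([[0.45, 0.30, 0.20, 0.05],
              [0.15, 0.60, 0.15, 0.10],
              [0.10, 0.30, 0.40, 0.20],
              [0.00, 0.10, 0.70, 0.20]])
loss = np.array([0.0, 4.5, 10.0, 16.0])   # revenue loss (Rs. Mn) per state

# Q1: starting in state 0, probability of at least one cancellation after 2 days
P2 = np.linalg.matrix_power(P, 2)
print("P(at least one cancellation after 2 days):", 1 - P2[0, 0])

# Q2a: steady state via the simultaneous equations pi P = pi, sum(pi) = 1
A = np.vstack([P.T - np.eye(4), np.ones(4)])
b = np.append(np.zeros(4), 1.0)
pi = np.linalg.lstsq(A, b, rcond=None)[0]
print("Steady-state expected loss (equations):", pi @ loss)

# Q2b: the same quantity estimated by simulating the chain for many days
rng = np.random.default_rng(0)
state, total, n_days = 0, 0.0, 100_000
for _ in range(n_days):
    state = rng.choice(4, p=P[state])
    total += loss[state]
print("Steady-state expected loss (simulation):", total / n_days)
```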
Retention Probability and Customer
Lifetime Value using Markov Chains
• Let {0, 1, 2, …, s} be the states of the Markov chain, in which {1, 2, …, s} represent different customer segments and state 0 represents the non-customer segment
• Retention Probability: probability of retaining the customer

$$R_t = 1 - \frac{\pi_0 (1 - P_{00})}{1 - \pi_0}$$

• $R_t$ → steady-state retention probability
Retention Probability and Customer
Lifetime Value using Markov Chains
• Customer Lifetime Value (CLV): Net present value of
the future margin generated from a customer or
customer segment
• Customer Lifetime Value for N periods:

$$CLV = \sum_{t=0}^{N} \frac{\mathbf{P_I}\,\mathbf{P}^t\,\mathbf{R}}{(1+i)^t}$$

• $\mathbf{P_I}$ → initial distribution vector
• $\mathbf{R}$ → reward vector
• $i$ → interest rate
Example – Retention of customers
in Data services
State
0        non-customers
1 to 4   different customer segments

Discount factor (1/(1+i)): 0.95

Margin generated in different states

State             0      1     2     3     4
Average Margin    0      120   300   450   620
Customers (mn)    55.8   6.5   4.1   2.3   1.6

One-Step Transition Probabilities

From \ To   0      1      2      3      4
0           0.8    0.1    0.1    0      0
1           0.1    0.6    0.2    0.1    0
2           0.15   0.05   0.75   0.05   0
3           0.2    0      0.1    0.6    0.1
4           0.3    0      0      0.05   0.65
Example – Retention of customers
in Data services
• Questions (see the sketch after this list)
• Calculate the steady-state retention probability
• Calculate the CLV for 6 periods (N = 5)
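A minimal Python/NumPy sketch of both questions. The initial distribution $\mathbf{P_I}$ is assumed to be the proportion of customers in each state from the table above; that is an assumption, not something stated on the slide:

```python
import numpy as np

# One-step transition matrix for the data-services example (states 0..4)
P = np.array([[0.80, 0.10, 0.10, 0.00, 0.00],
              [0.10, 0.60, 0.20, 0.10, 0.00],
              [0.15, 0.05, 0.75, 0.05, 0.00],
              [0.20, 0.00, 0.10, 0.60, 0.10],
              [0.30, 0.00, 0.00, 0.05, 0.65]])
margin = np.array([0.0, 120.0, 300.0, 450.0, 620.0])   # reward vector R
customers = np.array([55.8, 6.5, 4.1, 2.3, 1.6])       # millions of customers
P_I = customers / customers.sum()                      # assumed initial distribution
gamma = 0.95                                           # discount factor 1/(1+i)

# Steady-state distribution: solve pi P = pi with sum(pi) = 1
A = np.vstack([P.T - np.eye(5), np.ones(5)])
b = np.append(np.zeros(5), 1.0)
pi = np.linalg.lstsq(A, b, rcond=None)[0]

# Steady-state retention probability R_t = 1 - pi_0 (1 - P_00) / (1 - pi_0)
R_t = 1 - pi[0] * (1 - P[0, 0]) / (1 - pi[0])
print("Steady-state retention probability:", R_t)

# CLV over 6 periods (t = 0..5): sum of discounted expected margins
clv = sum(gamma**t * P_I @ np.linalg.matrix_power(P, t) @ margin for t in range(6))
print("CLV (N = 5):", clv)
```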
Regular Matrix
• If $P_{ij}^n > 0$ for all $i, j$ for some $n$, then $P$ is a regular matrix

• A regular matrix has a stationary distribution


Various States in a Markov Chain
• Accessible state j: state j is accessible from state i if there exists an n such that $P_{ij}^n > 0$
• i.e. a path exists from i to j
• Communicating states i and j: if there exist m and n such that $P_{ij}^n > 0$ and $P_{ji}^m > 0$
• If all states communicate, the Markov chain is irreducible
• Irreducible → the chain can go from any state to any state in a finite number of steps
Example – an irreducible chain (every state can be reached from every other state):

      0.2   0.7   0.1
P =   0.5   0.5   0
      0.3   0.7   0
Various States in a Markov Chain
• Recurrent State: a state i is said to be recurrent if

$$\sum_{n=1}^{\infty} P_{ii}^n = \infty$$

i.e. the Markov chain will visit i infinitely many times in the long run
(if states i and j communicate and i is recurrent, then state j is also recurrent)
• Transient State: a state k is said to be transient if

$$\sum_{n=1}^{\infty} P_{kk}^n < \infty$$

i.e. the Markov chain will visit k only a finite number of times and may not return to k in the long run
Transition graphs for recurrent
and transient Markov chains
First Passage Time
• First passage time: probability that the Markov chain enters state i for the first time exactly n steps after leaving state i

$$f_{ii}^n = P\left(X_n = i,\; X_k \neq i \text{ for } k = 1, 2, \ldots, n-1 \mid X_0 = i\right)$$

• For a recurrent state:
$$F_{ii} = \sum_{n=1}^{\infty} f_{ii}^n = 1$$
• For a transient state:
$$F_{ii} = \sum_{n=1}^{\infty} f_{ii}^n < 1$$
Mean Recurrence Time
• Mean recurrence time: average time taken to return to state i after leaving state i

$$\mu_{ii} = \sum_{n=1}^{\infty} n \, f_{ii}^n$$

• Positive recurrent state: $\mu_{ii} < \infty$ (i.e. finite mean return time)
• Null-recurrent state: $\mu_{ii} = \infty$ (i.e. infinite mean return time)
Periodic State
• A periodic state is a special case of a recurrent state
• Let $d(i)$ be the greatest common divisor of all n such that $P_{ii}^n > 0$
• Aperiodic state: $d(i) = 1$
• Periodic state: $d(i) \geq 2$

          1   2   3
      1   0   1   0
P =   2   0   0   1
      3   1   0   0

$P_{11}^2 = 0$, but for n a multiple of 3, $P_{11}^n = 1 > 0$, so $d(1) = 3$ and state 1 is periodic
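A quick NumPy check (not from the slides) that the chain above returns to state 1 only at multiples of 3 steps:

```python
import numpy as np

# Cyclic transition matrix from the slide: 1 -> 2 -> 3 -> 1
P = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]], dtype=float)

# P_11^(n) is 1 only when n is a multiple of 3, so d(1) = 3 (periodic state)
for n in range(1, 7):
    print(n, np.linalg.matrix_power(P, n)[0, 0])   # prints 0, 0, 1, 0, 0, 1
```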
Ergodic Markov Chain
• A state i of a Markov chain is ergodic when it is positive recurrent and aperiodic
• A Markov chain in which all states are ergodic is an ergodic Markov chain
• An ergodic Markov chain has a stationary distribution that satisfies

$$\pi_j = \sum_{k=1}^{m} \pi_k P_{kj}, \qquad \sum_{k=1}^{m} \pi_k = 1$$
Limiting Probability
• The limiting probability is
$$\lim_{n \to \infty} P_{ij}^n$$
• The limiting probability may depend on the initial state and is not necessarily unique
• The stationary distribution is unique and does not depend on the initial state
Markov Chains with Absorbing
States
• Absorbing state: $P_{ii} = 1$
• An absorbing-state Markov chain is a Markov chain in which there is at least one state k such that $P_{kk} = 1$
• Absorbing-state Markov chains are not ergodic, since the other states are transient (i.e. $\sum_{n=1}^{\infty} P_{kk}^n < \infty$ for a transient state k)
• The transition matrix corresponding to an absorbing-state Markov chain is not a regular matrix and thus does not have a stationary distribution
• i.e. $\mathbf{P_I}\,\mathbf{P}^n$ may not converge to a unique value and depends on the initial distribution $\mathbf{P_I}$
• The long-run probability of finding the system in a transient state is zero
Canonical Form of the Transition
Matrix of an Absorbing State Markov
Chain
• I = identity matrix (transitions between absorbing states)
• 0 = matrix in which all elements are zero (i.e. from absorbing states to transient states)
• R = matrix whose elements represent the probability of absorption from a transient state into an absorbing state
• Q = matrix whose elements represent transitions between transient states

With the states ordered as (absorbing A, transient T):

$$P = \begin{pmatrix} I & 0 \\ R & Q \end{pmatrix}, \qquad
P^n = \begin{pmatrix} I & 0 \\ \sum_{k=0}^{n-1} Q^k R & Q^n \end{pmatrix}$$
Fundamental Matrix
• For large values of n, $\sum_{k=0}^{n-1} Q^k R$ gives the probability of eventual absorption into an absorbing state
• Hence, as $n \to \infty$, $\sum_{k=0}^{\infty} Q^k = F = (I - Q)^{-1}$, the fundamental matrix
• The probability of eventual absorption from a transient state into an absorbing state is $FR$
• Expected time to absorption $= Fc$
• where c is a column vector of ones
Example – Customer churn for a telecom service provider

State 1   Customer state that generated no revenue or profit
State 2   Customer state that generated INR 200 profit per month on average (incoming and data only)
State 3   Customer state that generated INR 300 profit per month on average
State 4   Customer state that generated INR 400 profit per month on average
State 5   Customer state that generated INR 600 profit per month on average
State 6   Customer state that generated INR 800 profit per month on average

Transition probability matrix (based on monthly data)

States   State1   State2   State3   State4   State5   State6
1        1        0        0        0        0        0
2        0        1        0        0        0        0
3        0.05     0.05     0.9      0        0        0
4        0.1      0.05     0        0.8      0.05     0
5        0.2      0.1      0        0.05     0.6      0.05
6        0.1      0.2      0        0        0        0.7
Questions (see the sketch below)
• If a customer is in state 6, calculate the probability of eventual absorption into state 2
• Calculate the expected time to absorption if the current state is state 4
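A minimal NumPy sketch (not part of the original slides) of the fundamental-matrix calculation for this example:

```python
import numpy as np

# Full transition matrix; states 1 and 2 are absorbing, states 3-6 are transient
P = np.array([[1,    0,    0,    0,    0,    0   ],
              [0,    1,    0,    0,    0,    0   ],
              [0.05, 0.05, 0.9,  0,    0,    0   ],
              [0.1,  0.05, 0,    0.8,  0.05, 0   ],
              [0.2,  0.1,  0,    0.05, 0.6,  0.05],
              [0.1,  0.2,  0,    0,    0,    0.7 ]])

# Canonical blocks: R (transient -> absorbing) and Q (transient -> transient)
R = P[2:, :2]          # rows: states 3,4,5,6; columns: states 1,2
Q = P[2:, 2:]

F = np.linalg.inv(np.eye(4) - Q)   # fundamental matrix F = (I - Q)^{-1}
FR = F @ R                         # eventual absorption probabilities
t = F @ np.ones(4)                 # expected number of steps to absorption

print("P(absorbed into state 2 | start in state 6):", FR[3, 1])
print("Expected months to absorption from state 4:", t[1])
```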
Expected Duration to reach a
state from other states
• Let $E_{ij}$ be the expected duration (number of steps) to reach state j from state i. Then it satisfies the following set of equations:

$$E_{ij} = 1 + \sum_{k} P_{ik} E_{kj} \quad \forall i,\; i \neq j$$
$$E_{jj} = 0$$
Example – how long does it take for the NPA problem to become worse?

State 1   NPA is less than 1%
State 2   NPA is between 1% and 2%
State 3   NPA is between 2% and 3%
State 4   NPA is between 3% and 4%
State 5   NPA is between 4% and 5%
State 6   NPA is between 5% and 6%
State 7   NPA is greater than 6%

Transition probability matrix (based on monthly data)

States   State1  State2  State3  State4  State5  State6  State7
1        0.95    0.05    0       0       0       0       0
2        0.1     0.85    0.05    0       0       0       0
3        0       0.1     0.8     0.1     0       0       0
4        0       0       0.15    0.7     0.15    0       0
5        0       0       0       0.15    0.65    0.2     0
6        0       0       0       0       0.2     0.6     0.2
7        0       0       0       0       0       0.1     0.9
Question
• Calculate the expected duration (in months) for the process to reach state 7 from state 4 (see the sketch below)

$$E_{i7} = 1 + \sum_{k} P_{ik} E_{k7} \quad \forall i,\; i \neq 7$$
$$E_{77} = 0$$
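A minimal NumPy sketch (not from the slides) that solves this system of equations for the matrix above:

```python
import numpy as np

# Monthly NPA transition matrix (states 1..7)
P = np.array([[0.95, 0.05, 0,    0,    0,    0,    0  ],
              [0.1,  0.85, 0.05, 0,    0,    0,    0  ],
              [0,    0.1,  0.8,  0.1,  0,    0,    0  ],
              [0,    0,    0.15, 0.7,  0.15, 0,    0  ],
              [0,    0,    0,    0.15, 0.65, 0.2,  0  ],
              [0,    0,    0,    0,    0.2,  0.6,  0.2],
              [0,    0,    0,    0,    0,    0.1,  0.9]])

# E_i7 = 1 + sum_k P_ik E_k7 for i = 1..6, with E_77 = 0.
# Rearranged for states i != 7:  E_i7 - sum_{k != 7} P_ik E_k7 = 1
A = np.eye(6) - P[:6, :6]
E = np.linalg.solve(A, np.ones(6))
print("Expected months to reach state 7 from state 4:", E[3])
```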
Markov Reward Processes
Markov Reward Process
NPV of Rewards for each state
Use of NPV
State Value Function
Bellman Equations - Obtaining the
expected value of rewards
Bellman Equations in Matrix Form
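As a sketch of the matrix form: the Bellman equation for a Markov reward process, $v = R + \gamma P v$, can be solved directly as $v = (I - \gamma P)^{-1} R$ when $\gamma < 1$. A minimal NumPy illustration with a hypothetical 3-state example (the numbers below are illustrative and not from the slides):

```python
import numpy as np

def mrp_value(P, R, gamma):
    """Solve the matrix-form Bellman equation v = R + gamma * P @ v,
    i.e. v = (I - gamma * P)^{-1} R.  Requires gamma < 1, since at
    gamma = 1 the matrix (I - P) is singular for a stochastic P."""
    n = len(R)
    return np.linalg.solve(np.eye(n) - gamma * P, R)

# Hypothetical 3-state Markov reward process (illustrative numbers only)
P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.5, 0.2],
              [0.0, 0.4, 0.6]])
R = np.array([10.0, 5.0, -2.0])

for gamma in (0.0, 0.75):          # gamma = 1 is excluded (singular system)
    print(gamma, mrp_value(P, R, gamma))
```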
Exercise
• Obtain the value functions for each state for
• γ = 0
• γ = 1
• γ = 0.75
Markov Decision Processes
• Used for analyzing sequential decision making over a planning horizon
• Decisions are made in every state of the system, leading to outcomes over a period of time along with state changes – what are the best decisions?
• Substitutions during a 90-minute football match
• Whether to promote a product or not
• When to buy and sell shares
• Movement of robots in a given context
• When to stop or change a television serial, with the objective of maximizing television ratings
Two algorithms
• Objective is to find the optimal sequence of actions $\{a_0, a_1, \ldots\}$ that maximizes total rewards
• Policy iteration algorithm
• Value iteration algorithm
Example – evaluating policies for maintenance of mining equipment

States: 1 (excellent condition), 2, 3, 4 (bad condition)
Discount factor: 0.95

State      1       2       3       4
Revenue    20000   16000   12000   5000

Actions
1   Do nothing
2   Carry out preventive maintenance. Applicable when in state 3 or state 4; converts either state to state 2. Preventive maintenance costs Rs. 2000.
3   Replace the equipment. Applicable to states 2, 3 and 4; converts any of these states to state 1. Cost of replacement is Rs. 10000.

Transition probability matrix
States   1     2     3     4
1        0.8   0.1   0.1   0
2        0     0.7   0.2   0.1
3        0     0     0.7   0.3
4        0     0     0     1

Find the policy values for {1,1,2,2} and {1,1,2,3} (see the sketch below)
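A sketch of evaluating the two policies, assuming the transition matrix above applies under action 1 ("do nothing"), that actions 2 and 3 move the equipment deterministically to state 2 and state 1 respectively, and that the per-period reward is the current state's revenue minus the cost of the chosen action (these timing assumptions are mine, not stated on the slide):

```python
import numpy as np

# "Do nothing" transition matrix and per-state revenue from the slide
P0 = np.array([[0.8, 0.1, 0.1, 0  ],
               [0,   0.7, 0.2, 0.1],
               [0,   0,   0.7, 0.3],
               [0,   0,   0,   1  ]])
revenue = np.array([20000.0, 16000.0, 12000.0, 5000.0])
cost = {1: 0.0, 2: 2000.0, 3: 10000.0}     # cost of each action
gamma = 0.95

def policy_value(policy):
    """Evaluate a stationary policy (one action per state) via
    v = (I - gamma * P_pi)^{-1} r_pi."""
    P_pi = np.zeros((4, 4))
    r_pi = np.zeros(4)
    for s, a in enumerate(policy):
        r_pi[s] = revenue[s] - cost[a]
        if a == 1:                 # do nothing: use the given matrix
            P_pi[s] = P0[s]
        elif a == 2:               # preventive maintenance: go to state 2
            P_pi[s, 1] = 1.0
        else:                      # replacement: go to state 1
            P_pi[s, 0] = 1.0
    return np.linalg.solve(np.eye(4) - gamma * P_pi, r_pi)

print("Value of policy {1,1,2,2}:", policy_value([1, 1, 2, 2]))
print("Value of policy {1,1,2,3}:", policy_value([1, 1, 2, 3]))
```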
Policy Iteration algorithm
• For any policy $\pi = \{a_0, a_1, \ldots\}$
• Based on an infinite planning horizon
Improvement and Optimal Policy
• We can test other policies, but this may not help in identifying the best or optimal policy quickly
• Use a linear programming problem (LPP); see the sketch after this list
• Represent the Bellman equations for the value of each state under each policy as an inequality, assuming that the policy is optimal (the LP constraints)
• Minimize the total optimal value function to identify the optimal policy
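A sketch of the LP approach using SciPy's linprog: minimize the sum of the state values subject to one Bellman inequality per state-action pair. The admissible actions per state and the effect of each action follow the same assumptions as the policy-evaluation sketch above; the optimal action in each state is the one whose inequality is tight at the optimum.

```python
import numpy as np
from scipy.optimize import linprog

# Mining-equipment data from the example slide
P0 = np.array([[0.8, 0.1, 0.1, 0  ],
               [0,   0.7, 0.2, 0.1],
               [0,   0,   0.7, 0.3],
               [0,   0,   0,   1  ]])
revenue = np.array([20000.0, 16000.0, 12000.0, 5000.0])
cost = {1: 0.0, 2: 2000.0, 3: 10000.0}
actions = {0: [1], 1: [1, 3], 2: [1, 2, 3], 3: [1, 2, 3]}   # admissible actions per state
gamma, n = 0.95, 4

def next_dist(s, a):
    # Next-state distribution: action 2 -> state 2, action 3 -> state 1 (assumed deterministic)
    if a == 1:
        return P0[s]
    return np.eye(n)[1] if a == 2 else np.eye(n)[0]

# One constraint per (state, action):  v_s >= r(s,a) + gamma * P_a(s,:) v,
# rewritten for linprog as  (gamma * P_a(s,:) - e_s) v <= -r(s,a)
A_ub, b_ub = [], []
for s, acts in actions.items():
    for a in acts:
        row = gamma * next_dist(s, a)
        row[s] -= 1.0
        A_ub.append(row)
        b_ub.append(-(revenue[s] - cost[a]))

# Minimize the total value function subject to the Bellman inequalities
res = linprog(c=np.ones(n), A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * n)
print("Optimal value function:", res.x)
```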
Value Iteration Algorithm
• Based on a finite planning horizon
• A dynamic programming algorithm:
• Identify the planning time horizon
• Start from the last time period, n
• Obtain the best action for each state in that time period based on the immediate rewards generated
• Move backwards to the previous time period, n-1
• Obtain the best action for each state in that time period based on the immediate rewards plus the expected future value
• Repeat these steps until you reach the first time period
• The total optimal value for each state is obtained → it corresponds to the optimal action taken at each stage (i.e. the action profile over the planning horizon can also be mapped); see the sketch below
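A sketch of finite-horizon value iteration (backward induction) for the mining-equipment example, under the same assumptions about action effects and rewards as above; the 12-period horizon is an arbitrary illustrative choice:

```python
import numpy as np

P0 = np.array([[0.8, 0.1, 0.1, 0  ],
               [0,   0.7, 0.2, 0.1],
               [0,   0,   0.7, 0.3],
               [0,   0,   0,   1  ]])
revenue = np.array([20000.0, 16000.0, 12000.0, 5000.0])
cost = {1: 0.0, 2: 2000.0, 3: 10000.0}
actions = {0: [1], 1: [1, 3], 2: [1, 2, 3], 3: [1, 2, 3]}
gamma, horizon = 0.95, 12

def next_dist(s, a):
    # Action 1 uses the given matrix; action 2 -> state 2, action 3 -> state 1 (assumed)
    if a == 1:
        return P0[s]
    return np.eye(4)[1] if a == 2 else np.eye(4)[0]

v = np.zeros(4)                     # value after the final period
plan = []
for t in range(horizon, 0, -1):     # work backwards from the last period
    new_v, best = np.zeros(4), []
    for s in range(4):
        q = {a: revenue[s] - cost[a] + gamma * next_dist(s, a) @ v for a in actions[s]}
        best.append(max(q, key=q.get))   # best action in state s at period t
        new_v[s] = q[best[-1]]
    v = new_v
    plan.append((t, best))

print("Value of each state at period 1:", v)
print("Optimal action per state at period 1:", plan[-1][1])
```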
