
(PARTIALLY OBSERVABLE)

MARKOV DECISION PROCESSES


Frederike Petzschner & Lionel Rigoux
What’s the fastest way out?
[Grid-world figure: two terminal states with rewards r = +1 and r = -1; every other step costs r_step = -0.04]

Plan: a sequence of actions
Nondeterministic Action Rule

[Figure: from state s, the chosen action leads to the intended state s'_1 with probability 0.8 and slips sideways to s'_2 or s'_3 with probability 0.1 each]
What is the reliability of the action sequence up, up, right, right, right?

[Same grid world: terminal rewards r = +1 and r = -1, step cost r_step = -0.04]

Answer: $0.8^5 + 0.1^4 \cdot 0.8 = 0.32776$
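The number combines the probability of the direct path (all five actions succeed) with the small chance of slipping all the way around the obstacle. As a sanity check, the sketch below estimates the same reliability by Monte Carlo simulation. It assumes the classic 4x3 grid-world layout (start at (1,1), wall at (2,2), +1 terminal at (4,3), -1 terminal at (4,2)); the layout, function names, and representation are illustrative assumptions, not taken from the slides.

```python
import random

# Assumed 4x3 grid world: wall at (2, 2), terminals +1 at (4, 3) and -1 at (4, 2).
WALL = {(2, 2)}
GOOD, BAD = (4, 3), (4, 2)

MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
PERP = {"up": ("left", "right"), "down": ("left", "right"),
        "left": ("up", "down"), "right": ("up", "down")}

def step(state, action):
    """Apply one noisy action: intended direction w.p. 0.8, each perpendicular w.p. 0.1."""
    r = random.random()
    actual = action if r < 0.8 else (PERP[action][0] if r < 0.9 else PERP[action][1])
    x, y = state[0] + MOVES[actual][0], state[1] + MOVES[actual][1]
    # Bumping into the wall or the grid boundary leaves the state unchanged.
    if (x, y) in WALL or not (1 <= x <= 4 and 1 <= y <= 3):
        return state
    return (x, y)

def run_plan(plan, start=(1, 1)):
    """Execute a fixed action sequence, stopping early in a terminal state."""
    s = start
    for a in plan:
        if s in (GOOD, BAD):
            break
        s = step(s, a)
    return s

plan = ["up", "up", "right", "right", "right"]
n = 100_000
hits = sum(run_plan(plan) == GOOD for _ in range(n))
print(f"estimated reliability: {hits / n:.4f}")  # analytic: 0.8**5 + 0.1**4 * 0.8 = 0.32776
```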
The MDP is defined by:

• States s ∈ S (state space), including a start state and possibly terminal states
• Actions a ∈ A (action space)
• Transition function T(s, a, s') = P(s' | s, a)
• Reward function R(s, a, s'), R(s, a), or R(s)

An MDP is a non-deterministic search problem.
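As a concrete illustration of this tuple, here is a minimal sketch of an MDP written out as plain Python data. The tiny two-state weather example, its state and action names, and the dictionary representation are all illustrative assumptions, not part of the slides.

```python
from typing import Dict, Tuple

State, Action = str, str

S = ["sunny", "rainy"]            # state space
A = ["walk", "drive"]             # action space

# Transition function T(s, a, s') = P(s' | s, a)
T: Dict[Tuple[State, Action, State], float] = {
    ("sunny", "walk",  "sunny"): 0.9, ("sunny", "walk",  "rainy"): 0.1,
    ("sunny", "drive", "sunny"): 0.8, ("sunny", "drive", "rainy"): 0.2,
    ("rainy", "walk",  "sunny"): 0.3, ("rainy", "walk",  "rainy"): 0.7,
    ("rainy", "drive", "sunny"): 0.5, ("rainy", "drive", "rainy"): 0.5,
}

# Reward function in its simplest form, R(s)
R: Dict[State, float] = {"sunny": 1.0, "rainy": -1.0}

# Sanity check: for every (s, a) the outgoing probabilities must sum to 1
for s in S:
    for a in A:
        assert abs(sum(T[(s, a, s2)] for s2 in S) - 1.0) < 1e-9
```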


What's Markovian about an MDP?

Given the current state, the future is independent of the past:
action outcomes depend only on the current state.

$P(S_{t+1} = s' \mid S_t = s_t, A_t = a_t, S_{t-1} = s_{t-1}, A_{t-1} = a_{t-1}, \dots) = P(S_{t+1} = s' \mid S_t = s_t, A_t = a_t)$
Not every process is an MDP!


What if the reward structure of the world changes?

[Same grid world, but with step cost r_step = -2; terminals still r = +1 and r = -1]

Quiz: what's the best strategy in the four white fields?
We have been looking at sequences of rewards.

[Figure: from the start, option a yields +1 and option b yields -1: [-1] < [+1]]

$U(s_0, s_1, s_2, \dots) = R(s_0) + R(s_1) + R(s_2) + \dots = \sum_{t=0}^{\infty} R(s_t)$
Sequences of rewards

Utility depends on all successor states (long-term)!

[Figure: a yields +1 and then c = -7; b yields -1 and then d = +7]

$[+1 - 7] = [-6] < [-1 + 7] = [+6]$

$U(s_0, s_1, s_2, \dots) = \sum_{t=0}^{\infty} R(s_t)$
Utility of a sequence of states in non-deterministic worlds

[Figure: a yields +1, then -7 with probability 0.1 or +2 with probability 0.9;
 b yields -1, then -10 with probability 0.9 or +6 with probability 0.1]

$E[U(a)] = +1 + 0.1 \cdot (-7) + 0.9 \cdot (+2) = +2.1$
$E[U(b)] = -1 + 0.1 \cdot (+6) + 0.9 \cdot (-10) = -9.4$

$[+2.1] > [-9.4]$

$U(s_0, s_1, s_2, \dots) = E\left[\sum_{t=0}^{\infty} R(s_t)\right]$
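A quick numerical check of these expected utilities, with the probabilities and rewards read off the figure (the helper function name is just for illustration):

```python
def expected_utility(first_reward, outcomes):
    """first_reward plus the probability-weighted second-step rewards."""
    return first_reward + sum(p * r for p, r in outcomes)

U_a = expected_utility(+1, [(0.1, -7), (0.9, +2)])
U_b = expected_utility(-1, [(0.1, +6), (0.9, -10)])
print(round(U_a, 2), round(U_b, 2))  # 2.1 -9.4
```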
Sequences of rewards
[+1, +20, +1, +20, +1, ...]  vs.  [+1, +1, +1, +1, +1, ...]

Discounting: without it, both infinite sums diverge and cannot be compared. Discounting weights the reward received at time t by $\gamma^t$ (i.e. 1, γ, γ², ...):

$U(s_0, s_1, s_2, \dots) = E\left[\sum_{t=0}^{\infty} \gamma^t R(s_t)\right], \quad 0 < \gamma < 1$
Discounting

[Figure: a tree of states (a, b at the root, then c, d, then e, h, then g, f); rewards one step ahead are weighted by γ, two steps ahead by γ², three steps ahead by γ³]
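The sketch below makes the discounted sum concrete for the two reward sequences from the previous slide, truncated to a finite horizon (the value of γ and the helper name are illustrative choices):

```python
def discounted_utility(rewards, gamma):
    """Sum of gamma**t * r_t over a (finite) reward sequence."""
    return sum(gamma**t * r for t, r in enumerate(rewards))

gamma = 0.5
seq_a = [1, 20] * 50   # +1, +20, +1, +20, ...
seq_b = [1] * 100      # +1, +1, +1, ...

# Undiscounted, both sums grow without bound; discounted, they are finite and comparable.
print(discounted_utility(seq_a, gamma))  # ~14.67  (= (1 + 20*gamma) / (1 - gamma**2))
print(discounted_utility(seq_b, gamma))  # ~ 2.0   (= 1 / (1 - gamma))
```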
Policies
We want a plan! But this is not a deterministic world!
Instead of a fixed sequence of actions, we need a policy: a mapping from states to actions.

Policy π: states → actions
• like an if-then plan
• a look-up table

$U^\pi(s) = E\left[\sum_{t=0}^{\infty} \gamma^t R(s_t) \mid \pi\right]$

Optimal policy π*: states → actions
• maximizes expected utility

$\pi^* = \arg\max_\pi \, E\left[\sum_{t=0}^{\infty} \gamma^t R(s_t) \mid \pi\right]$
$U^\pi(s) = E\left[\sum_{t=0}^{\infty} \gamma^t R(s_t) \mid \pi\right]$

For the optimal utilities, this expectation can be written recursively:

$U(s) = R(s) + \gamma \, \max_a \sum_{s'} T(s, a, s') \cdot U(s')$

BELLMAN EQUATION
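A minimal sketch of the corresponding one-state Bellman backup in code. The representation of T as a map from (state, action) to a list of (probability, next_state) pairs is an assumption made for illustration:

```python
def bellman_backup(s, U, R, T, actions, gamma):
    """U(s) = R(s) + gamma * max_a sum_{s'} T(s, a, s') * U(s')."""
    best = max(sum(p * U[s2] for p, s2 in T[(s, a)]) for a in actions)
    return R[s] + gamma * best
```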
Let's find the optimal policy...
- n equations
- n unknowns
- but the max makes the equations non-linear, so we cannot simply solve the system directly...
How to act optimally? – value iteration

Step 1: Start with arbitrary utilities.
Step 2: Update utilities based on their neighbours.

Keep going... until convergence ("adding truth to wrong"). After convergence, the utilities tell us the correct first action to take in every state.

"An optimal policy has the property that whatever the initial state and initial
decision are, the remaining decisions must constitute an optimal policy with
regard to the state resulting from the first decision."

– Bellman, 1957
Let's do it – value iteration

Bellman update:

$U_{t+1}(s) = R(s) + \gamma \, \max_a \sum_{s'} T(s, a, s') \cdot U_t(s')$

[Grid-world figure: step cost r = -0.04, γ = 0.5, U_0(s) = 0 for non-terminal states, terminal utilities fixed at +1 and -1. The cell next to the +1 terminal ("white") is updated to 0.36 and then 0.376, while the other non-terminal cells are still at -0.04.]

$U_{t=1}(\text{white}) = -0.04 + 0.5 \cdot (0 + 0 + 0.8 \cdot 1) = 0.36$

$U_{t=2}(\text{white}) = -0.04 + 0.5 \cdot (0.1 \cdot 0.36 + 0.8 \cdot 1 + 0.1 \cdot (-0.04)) = 0.376$
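Below is a sketch of full value iteration on the (assumed) classic 4x3 grid world, reproducing the two numbers above for the cell next to the +1 terminal, taken here to be (3, 3). The layout, coordinates, and names are assumptions for illustration; the slides only specify the step cost, γ, and the update rule.

```python
GAMMA, STEP_REWARD = 0.5, -0.04
WALL = {(2, 2)}
TERMINALS = {(4, 3): 1.0, (4, 2): -1.0}          # utilities held fixed at the terminal rewards
STATES = [(x, y) for x in range(1, 5) for y in range(1, 4) if (x, y) not in WALL]
ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
PERP = {"up": ["left", "right"], "down": ["left", "right"],
        "left": ["up", "down"], "right": ["up", "down"]}

def move(s, a):
    """Deterministic move; bumping into the wall or the border leaves s unchanged."""
    x, y = s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1]
    return s if (x, y) in WALL or not (1 <= x <= 4 and 1 <= y <= 3) else (x, y)

def transitions(s, a):
    """(probability, next state) pairs under the 0.8 / 0.1 / 0.1 action rule."""
    return [(0.8, move(s, a))] + [(0.1, move(s, b)) for b in PERP[a]]

def backup(s, U):
    """One Bellman update: U(s) = R(s) + gamma * max_a sum_s' T(s, a, s') * U(s')."""
    if s in TERMINALS:
        return TERMINALS[s]
    best = max(sum(p * U[s2] for p, s2 in transitions(s, a)) for a in ACTIONS)
    return STEP_REWARD + GAMMA * best

U = {s: TERMINALS.get(s, 0.0) for s in STATES}   # U_0 = 0 for non-terminal states
for t in (1, 2):
    U = {s: backup(s, U) for s in STATES}
    print(t, round(U[(3, 3)], 3))                # prints 1 0.36 and 2 0.376
```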
SUM

MDP: a non-deterministic search problem

• States s ∈ S (state space), including a start state and possibly terminal states
• Actions a ∈ A (action space)
• Transition function T(s, a, s') = P(s' | s, a)
• Reward function R(s, a, s'), R(s, a), or R(s)

→ POMDP: additionally, uncertainty about the current state


What if the reward structure of the world changes?

[Same grid world, but with step cost r_step = +2; terminals still r = +1 and r = -1]

Quiz: what's the best strategy in the four white fields?
