[Figure: 1-D Kalman filter update, showing the prior P(x0), the observation z1, and the posterior P(x1) as Gaussians over the position X]
Chapter 15, Sections 1–5
General Kalman update

Transition and sensor models:
    P(xt+1|xt) = N(Fxt, Σx)(xt+1)
    P(zt|xt) = N(Hxt, Σz)(zt)
F is the matrix for the transition; Σx the transition noise covariance
H is the matrix for the sensors; Σz the sensor noise covariance

Filter computes the following update:
    µt+1 = Fµt + Kt+1(zt+1 − HFµt)
    Σt+1 = (I − Kt+1H)(FΣtF⊤ + Σx)
where Kt+1 = (FΣtF⊤ + Σx)H⊤(H(FΣtF⊤ + Σx)H⊤ + Σz)⁻¹
is the Kalman gain matrix

Σt and Kt are independent of the observation sequence, so can be computed offline

Where it breaks

Cannot be applied if the transition model is nonlinear

Extended Kalman Filter models the transition as locally linear around xt = µt

Fails if the system is locally unsmooth
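As a minimal sketch, the linear-Gaussian update reduces in one dimension to a few lines. The scalar values of F, H, and the noise variances below are illustrative assumptions, not from the slides, as is the observation sequence.

```python
# A minimal 1-D Kalman filter update step (illustrative parameters).

def kalman_update(mu, sigma, z, F=1.0, H=1.0, sigma_x=1.0, sigma_z=1.0):
    """One filtering step: predict with the transition model,
    then correct with the observation z via the Kalman gain."""
    # Predicted mean and variance: F*mu, F*sigma*F + sigma_x
    mu_pred = F * mu
    sigma_pred = F * sigma * F + sigma_x
    # Kalman gain K = sigma_pred * H / (H * sigma_pred * H + sigma_z)
    K = sigma_pred * H / (H * sigma_pred * H + sigma_z)
    # Corrected estimate: mu + K*(innovation), (1 - K*H)*sigma_pred
    mu_new = mu_pred + K * (z - H * mu_pred)
    sigma_new = (1 - K * H) * sigma_pred
    return mu_new, sigma_new

mu, sigma = 0.0, 1.0
for z in [2.5, 2.4, 2.7]:      # an invented observation sequence
    mu, sigma = kalman_update(mu, sigma, z)
```

Note that `sigma` (and hence `K`) never looks at the observations, illustrating why the gain sequence can be computed offline.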
Dynamic Bayesian networks

Xt, Et contain arbitrarily many variables in a replicated Bayes net

[Figure: two-slice DBN with Rain0 → Rain1 → Umbrella1 and
Battery0 → Battery1 → BMeter1, plus X0 → X1 → Z1;
CPTs: P(R0) = 0.7; P(R1|R0) = 0.7 (R0 = t), 0.3 (R0 = f);
P(U1|R1) = 0.9 (R1 = t), 0.2 (R1 = f)]

Exact inference in DBNs

Naive method: unroll the network and run any exact algorithm

[Figure: the umbrella DBN unrolled over slices Rain0 … Rain7 and
Umbrella1 … Umbrella7, with the same CPTs replicated in every slice]

Problem: inference cost for each update grows with t

Rollup filtering: add slice t + 1, "sum out" slice t using variable elimination

Largest factor is O(d^(n+1)), update cost O(d^(n+2))
(cf. HMM update cost O(d^(2n)))
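As a minimal illustration of why rollup filtering has constant cost per step, the sketch below runs it on the umbrella model, using the CPTs from the figures (P(R0) = 0.7, transition 0.7/0.3, sensor 0.9/0.2); the observation sequence is invented.

```python
# Rollup filtering on the umbrella model: summing out slice t keeps
# the belief state at a fixed size, so each update costs the same
# no matter how large t gets.

def rollup_step(belief, umbrella):
    """belief = [P(rain), P(not rain)] given evidence so far."""
    p_rain, p_dry = belief
    # Sum out Rain_t through the transition model P(Rain_t+1 | Rain_t)
    pred_rain = 0.7 * p_rain + 0.3 * p_dry
    pred_dry = 1.0 - pred_rain
    # Fold in the evidence via the sensor model P(Umbrella | Rain)
    like_rain = 0.9 if umbrella else 0.1
    like_dry = 0.2 if umbrella else 0.8
    new_rain = like_rain * pred_rain
    new_dry = like_dry * pred_dry
    norm = new_rain + new_dry
    return [new_rain / norm, new_dry / norm]

belief = [0.7, 0.3]                  # prior P(Rain_0) from the figure
for u in [True, True, False]:        # invented umbrella observations
    belief = rollup_step(belief, u)
```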
DBNs vs. HMMs

Every HMM is a single-variable DBN; every discrete DBN is an HMM

[Figure: a three-variable DBN Xt → Xt+1, Yt → Yt+1, Zt → Zt+1
viewed as a single-variable HMM over the joint state]

Sparse dependencies ⇒ exponentially fewer parameters;
e.g., with 20 state variables, three parents each,
the DBN has 20 × 2^3 = 160 parameters, the HMM has 2^20 × 2^20 ≈ 10^12

Likelihood weighting for DBNs

Set of weighted samples approximates the belief state

[Figure: the umbrella DBN over slices Rain0 … Rain5, Umbrella1 … Umbrella5]

LW samples pay no attention to the evidence!
⇒ fraction "agreeing" falls exponentially with t
⇒ number of samples required grows exponentially with t

[Figure: RMS error vs. time step for LW(10), LW(100), LW(1000), LW(10000)]
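The exponential decay of agreement can be seen by running likelihood weighting on the umbrella model. The CPTs follow the earlier figures; the sample counts, seed, and constant-umbrella evidence sequence are illustrative choices.

```python
import random

# Likelihood weighting on the umbrella model: state trajectories are
# sampled WITHOUT looking at the evidence, so each sample's weight
# shrinks multiplicatively and the average weight collapses with t.

random.seed(0)

def lw_weight(evidence):
    """Sample one trajectory from the prior; return the weight it
    accumulates for the given umbrella evidence."""
    rain = random.random() < 0.7              # sample Rain_0
    w = 1.0
    for u in evidence:
        # Sample the next state, ignoring the evidence
        rain = random.random() < (0.7 if rain else 0.3)
        # Multiply in the observation likelihood P(Umbrella | Rain)
        if rain:
            w *= 0.9 if u else 0.1
        else:
            w *= 0.2 if u else 0.8
    return w

# Average weight after t steps of constant "umbrella" evidence
avgs = {}
for t in (5, 20):
    avgs[t] = sum(lw_weight([True] * t) for _ in range(2000)) / 2000
```

With four times the evidence length, the average weight falls by orders of magnitude, which is why the number of samples required grows exponentially with t.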
DBNs vs Kalman filters

Every Kalman filter model is a DBN, but few DBNs are KFs;
real world requires non-Gaussian posteriors

E.g., where are bin Laden and my keys? What's the battery charge?

[Figure: battery DBN with BMBroken0 → BMBroken1 → BMeter1,
Battery0 → Battery1, and X0 → X1 → Z1; plot of
E(Battery|...5555005555...), E(Battery|...5555000000...), E(Battery),
P(BMBroken|...5555005555...), and P(BMBroken|...5555000000...)
vs. time step]

Particle filtering

Basic idea: ensure that the population of samples ("particles")
tracks the high-likelihood regions of the state-space

Replicate particles proportional to likelihood for et

[Figure: particle populations for Raint → Raint+1 through the three
stages: (a) Propagate, (b) Weight, (c) Resample]

Widely used for tracking nonlinear systems, esp. in vision

Also used for simultaneous localization and mapping in mobile robots:
10^5-dimensional state space
Particle filtering contd.

Assume consistent at time t: N(xt|e1:t)/N = P(xt|e1:t)

Propagate forward: populations of xt+1 are
    N(xt+1|e1:t) = Σ_xt P(xt+1|xt) N(xt|e1:t)

Weight samples by their likelihood for et+1:
    W(xt+1|e1:t+1) = P(et+1|xt+1) N(xt+1|e1:t)

Resample to obtain populations proportional to W:
    N(xt+1|e1:t+1)/N = α W(xt+1|e1:t+1) = α P(et+1|xt+1) N(xt+1|e1:t)
        = α P(et+1|xt+1) Σ_xt P(xt+1|xt) N(xt|e1:t)
        = α′ P(et+1|xt+1) Σ_xt P(xt+1|xt) P(xt|e1:t)
        = P(xt+1|e1:t+1)
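The propagate/weight/resample cycle above can be sketched for the umbrella model; the CPTs come from the earlier figures, while the particle count, seed, and evidence sequence are illustrative.

```python
import random

# One particle filtering update per observation on the umbrella model:
# (a) propagate, (b) weight, (c) resample.

random.seed(1)
N = 1000

def particle_step(particles, umbrella):
    # (a) Propagate each particle through P(Rain_t+1 | Rain_t)
    particles = [random.random() < (0.7 if r else 0.3) for r in particles]
    # (b) Weight each particle by the likelihood P(e_t+1 | x_t+1)
    weights = [(0.9 if r else 0.2) if umbrella else (0.1 if r else 0.8)
               for r in particles]
    # (c) Resample N particles proportional to weight
    return random.choices(particles, weights=weights, k=N)

particles = [random.random() < 0.7 for _ in range(N)]   # sample the prior
for u in [True, True]:
    particles = particle_step(particles, u)
p_rain = sum(particles) / N
```

By the consistency argument above, the fraction of "rain" particles approximates the exact filtered probability P(xt+1|e1:t+1), with error shrinking as N grows.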
Particle filtering performance
Approximation error of particle filtering remains bounded over time,
at least empirically—theoretical analysis is difficult
[Figure: avg absolute error vs. time step for LW(25), LW(100), LW(1000),
LW(10000), and ER/SOF(25)]
Summary
Temporal models use state and sensor variables replicated over time
Markov assumptions and stationarity assumption, so we need
– transition model P(Xt|Xt−1)
– sensor model P(Et|Xt)
Tasks are filtering, prediction, smoothing, most likely sequence;
all done recursively with constant cost per time step
Hidden Markov models have a single discrete state variable; used
for speech recognition
Kalman filters allow n state variables, linear Gaussian, O(n^3) update
Dynamic Bayes nets subsume HMMs, Kalman filters; exact update intractable
Particle filtering is a good approximate filtering algorithm for DBNs