Project
• Project details have been uploaded to Piazza, and are in the handout.
[Figure: the data generating process produces the observed data; inference reasons back from the observed data to the process. Figure based on one by Larry Wasserman, "All of Statistics".]
Approximate Inference
• In principle, Bayesian inference is a simple application of Bayes' rule. This has been easy to do for most of the simple models we've studied so far.
Approximate Inference
• Optimization approaches
  – EM
  – Variational inference
    • Variational Bayes, mean field
    • Message passing: loopy BP, TRW, expectation propagation
  – Laplace approximation
Approximate Inference
• Simulation approaches (Monte Carlo methods)
Monte Carlo Methods
• Suppose we want to approximately compute an expectation E_p[f(x)] under a distribution p(x).
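The Monte Carlo estimator is a plain sample average of f over draws from p. A minimal sketch; the target density (standard normal), test function (x²), and sample size below are illustrative choices, not from the slides.

```python
import numpy as np

# Monte Carlo estimate of E_p[f(x)]: average f over samples drawn from p.
# Here p = N(0, 1) and f(x) = x**2, so the true expectation is 1
# (illustrative choices).
rng = np.random.default_rng(0)

def mc_estimate(f, sampler, n=100_000):
    """Average f over n draws from the sampler."""
    x = sampler(n)
    return f(x).mean()

est = mc_estimate(lambda x: x**2, lambda n: rng.standard_normal(n))
```

By the law of large numbers the average converges to the expectation, with error shrinking like 1/sqrt(n).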
Monte Carlo Methods: Example
Learning outcomes
By the end of the lesson, you should be able to:
Bayesian Inference: One Computer Scientist's Perspective
• In theory, the posterior is simply given by Bayes' rule.
• So, what do we actually mean when we say we are doing Bayesian inference?
  – Answering specific queries with respect to the distribution? (MAP, marginals, posterior predictive, …)
Sampling: An analogy
Sampling: Challenges
Uniform sampling
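A minimal sketch of the uniform-sampling idea: draw points uniformly over a region and average the unnormalized density there, scaled by the region's size. The density `p_tilde` and the interval [-6, 6] are illustrative assumptions, not from the slides.

```python
import numpy as np

# Uniform sampling: estimate Z = integral of p~(x) over [a, b] by drawing
# points uniformly in [a, b] and averaging p~ at those points, scaled by
# the interval length (b - a).
rng = np.random.default_rng(0)

def p_tilde(x):
    # Unnormalized Gaussian bump; its integral over the real line is
    # sqrt(2*pi) ~ 2.5066, and [-6, 6] contains almost all of that mass.
    return np.exp(-0.5 * x**2)

a, b = -6.0, 6.0
u = rng.uniform(a, b, size=200_000)
Z_hat = (b - a) * p_tilde(u).mean()
```

This works, but is wasteful whenever p~ concentrates in a small part of the region, which motivates the importance-sampling slides that follow.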
Importance sampling
• Same idea, but pick from a better "proposal" distribution than uniform.
• Reweight samples to correct for sampling from the wrong distribution.
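The two bullets above can be sketched directly: sample from the proposal q, then multiply each sample's contribution by the weight w(x) = p(x)/q(x). The target N(0, 1), the wider proposal N(0, 4), and the test function x² are illustrative choices.

```python
import numpy as np

# Importance sampling: estimate E_p[f(x)] with samples from a proposal q,
# reweighting each sample by w(x) = p(x) / q(x).
rng = np.random.default_rng(0)

def norm_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Target p = N(0, 1); proposal q = N(0, 4) is wider than p (safe tails).
x = rng.normal(0.0, 2.0, size=200_000)        # draws from q
w = norm_pdf(x, 0.0, 1.0) / norm_pdf(x, 0.0, 2.0)
est = np.mean(w * x**2)                       # estimates E_p[x^2] = 1
```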
Importance sampling without normalization constants
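The equations on these slides did not survive extraction, but the idea can be sketched as self-normalized importance sampling: when p and q are known only up to constants, use unnormalized weights w_i = p~(x_i)/q~(x_i) and divide by their sum, so the unknown normalizers cancel. The densities below are illustrative assumptions.

```python
import numpy as np

# Self-normalized importance sampling: the unknown normalizers of p~ and
# q~ cancel when we divide the weighted sum by the sum of the weights.
rng = np.random.default_rng(0)

def p_tilde(x):
    return np.exp(-0.5 * x**2)            # N(0, 1) without its 1/sqrt(2*pi)

def q_tilde(x):
    return np.exp(-0.5 * (x / 2.0)**2)    # N(0, 4) without its normalizer

x = rng.normal(0.0, 2.0, size=200_000)    # samples from q
w = p_tilde(x) / q_tilde(x)               # unnormalized weights
est = np.sum(w * x**2) / np.sum(w)        # estimates E_p[x^2] = 1
```

The price of self-normalization is a small bias that vanishes as the number of samples grows.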
Importance sampling
• Can be used to estimate the ratio of partition functions between p(x) and q(x).
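A sketch of the partition-function-ratio trick under assumed densities: with exact samples from q, the average of p~(x)/q~(x) estimates Z_p / Z_q, since E_q[p~/q~] = (1/Z_q) ∫ p~(x) dx = Z_p / Z_q. The two Gaussians below are illustrative choices.

```python
import numpy as np

# Ratio of partition functions: average p~(x)/q~(x) over samples from q.
rng = np.random.default_rng(0)

def p_tilde(x):
    return np.exp(-0.5 * x**2)            # Z_p = sqrt(2*pi)   ~ 2.5066

def q_tilde(x):
    return np.exp(-0.5 * (x / 2.0)**2)    # Z_q = 2*sqrt(2*pi) ~ 5.0133

x = rng.normal(0.0, 2.0, size=200_000)    # exact samples from q
ratio = np.mean(p_tilde(x) / q_tilde(x))  # estimates Z_p / Z_q = 0.5
```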
Heavy tails
• If q(x) goes towards zero faster than p(x), importance weights of rare events will become extremely large.
[Figure: spherical Gaussian example.]
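The failure mode can be demonstrated numerically (illustrative setup, not from the slides): with a proposal narrower than the target, the weight p(x)/q(x) grows like exp(+const·x²) in the tails, so a rare draw dominates everything; a proposal wider than the target keeps the weights bounded. Effective sample size, (Σw)²/Σw², is a standard diagnostic.

```python
import numpy as np

# Heavy-tail weight blow-up: compare a too-narrow proposal against a
# wide one, measured by effective sample size and the largest weight.
rng = np.random.default_rng(0)

def norm_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def eff_sample_size(w):
    # Effective sample size: (sum w)^2 / sum w^2, out of len(w) samples.
    return w.sum() ** 2 / np.sum(w ** 2)

n = 100_000                                # target p = N(0, 1) throughout
x_narrow = rng.normal(0.0, 0.5, size=n)    # q narrower than p: bad
x_wide = rng.normal(0.0, 2.0, size=n)      # q wider than p: safe
w_narrow = norm_pdf(x_narrow, 0, 1) / norm_pdf(x_narrow, 0, 0.5)
w_wide = norm_pdf(x_wide, 0, 1) / norm_pdf(x_wide, 0, 2.0)
```

With the narrow proposal, w_narrow even has infinite variance, so its few largest weights swallow the whole estimate.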
Sampling Importance Resampling
• We can convert a set of importance-weighted samples to a set of unweighted samples.
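The conversion can be sketched as resampling with replacement in proportion to the weights, which yields an unweighted, approximately p-distributed sample set. The densities below are illustrative choices.

```python
import numpy as np

# Sampling importance resampling (SIR): turn weighted samples into an
# unweighted sample set by resampling proportionally to the weights.
rng = np.random.default_rng(0)

def norm_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Weighted samples: draws from q = N(0, 4), weighted toward p = N(0, 1).
x = rng.normal(0.0, 2.0, size=100_000)
w = norm_pdf(x, 0.0, 1.0) / norm_pdf(x, 0.0, 2.0)
probs = w / w.sum()

# Resample with replacement in proportion to the weights.
idx = rng.choice(len(x), size=len(x), p=probs)
resampled = x[idx]                        # now roughly distributed as p
```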
Rejection Sampling
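The content of these slides was lost in extraction, but the standard algorithm can be sketched: draw x from a proposal q, accept it with probability p~(x) / (c·q(x)), where c is chosen so that c·q(x) ≥ p~(x) everywhere; accepted draws are exact samples from p. The target and the uniform envelope below are illustrative choices.

```python
import numpy as np

# Rejection sampling: accept a proposal x with probability
# p~(x) / (c * q(x)); accepted draws are exact samples from p.
rng = np.random.default_rng(0)

def p_tilde(x):
    return np.exp(-0.5 * x**2)                  # unnormalized N(0, 1)

def q_pdf(x):
    return np.full_like(x, 1.0 / 12.0)          # uniform on [-6, 6]

c = 12.0  # c * q(x) = 1 >= p~(x) everywhere on [-6, 6]

x = rng.uniform(-6.0, 6.0, size=500_000)        # proposals from q
u = rng.uniform(0.0, 1.0, size=x.shape[0])
accepted = x[u < p_tilde(x) / (c * q_pdf(x))]   # N(0, 1) draws (up to the
                                                # negligible truncation)
```

Note the acceptance rate here is only about Z_p / c ≈ 0.21, foreshadowing the next slide: a loose envelope wastes most proposals.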
Rejection sampling in high dimensions
• As the dimensionality of the space increases, the constant c generally grows exponentially.
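A small worked illustration of that claim (assumed setup, not from the slides): take the target p = N(0, I_d) and proposal q = N(0, σ²·I_d) with σ > 1. The tightest valid constant is c = sup_x p(x)/q(x) = σ^d, attained at x = 0, so the acceptance rate 1/c = σ^(−d) decays exponentially in the dimension d.

```python
# Acceptance rate of rejection sampling with target N(0, I_d) and
# proposal N(0, sigma^2 I_d): the optimal constant is c = sigma**d,
# so the acceptance rate is sigma**(-d), exponentially small in d.
sigma = 1.2
accept_rate = {d: sigma ** (-d) for d in (1, 10, 100, 1000)}
```

Even a mildly mismatched proposal (σ = 1.2) is hopeless by d = 1000.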
[Figure: state-space model with hidden states and observations Y1 … Y5; query: Zt = ?]
• Basic idea:
  – Perform importance sampling to estimate z, one timestep at a time.
Updating importance weights
Degeneracy
• As we add more timesteps, the z vector becomes higher dimensional.
  – Importance weights select only a few samples.
Illustration of particle filtering
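The propagate–weight–resample loop can be sketched as a bootstrap particle filter; resampling at every step is what combats the degeneracy described above. The linear-Gaussian random-walk model and all noise levels below are illustrative assumptions, not from the slides.

```python
import numpy as np

# Bootstrap particle filter: push particles through the transition model,
# weight them by the observation likelihood, then resample.
rng = np.random.default_rng(0)

T, n = 50, 2_000
trans_sd, obs_sd = 0.3, 0.5

# Simulate a hidden random walk z and noisy observations y of it.
z = np.cumsum(rng.normal(0.0, trans_sd, size=T))
y = z + rng.normal(0.0, obs_sd, size=T)

particles = np.zeros(n)
estimates = []
for t in range(T):
    # Propose: push each particle through the transition model.
    particles = particles + rng.normal(0.0, trans_sd, size=n)
    # Weight: likelihood of the new observation under each particle.
    w = np.exp(-0.5 * ((y[t] - particles) / obs_sd) ** 2)
    w /= w.sum()
    estimates.append(np.sum(w * particles))   # posterior-mean estimate
    # Resample: draw an unweighted particle set (combats degeneracy).
    particles = particles[rng.choice(n, size=n, p=w)]

err = np.mean(np.abs(np.array(estimates) - z))
```

The filtered estimate tracks the hidden state more closely than the raw observations do, because it fuses each observation with the transition model.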
Application: visual object tracking
• Goal: track an object (in this case, a remote-controlled helicopter) in a video sequence.