Chapter 14
Sampling
Why sampling?
Looking at samples from a model can help us understand the model's properties.
Sampling methods are built on random number generation, so we will first look at how random numbers are generated and then at how to turn them into samples from different distributions.
Random Numbers
A random number is a number generated by a process whose outcome is unpredictable, and which cannot subsequently be reliably reproduced. Truly random numbers do not exist in software; instead, there are plenty of algorithms that produce pseudo-random numbers. Uniform pseudo-random numbers can then be transformed into other distributions: the Box-Muller transform, for example, converts two independent uniform random numbers into two independent variables with a zero-mean, unit-variance Gaussian distribution.
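As a sketch of this idea (the helper name is mine, not from the chapter), the Box-Muller transform can be written with nothing but the standard library:

```python
import math
import random

def box_muller():
    """Turn two uniform samples into two independent N(0, 1) samples."""
    u1 = 1.0 - random.random()  # in (0, 1], so log(u1) is defined
    u2 = random.random()
    r = math.sqrt(-2.0 * math.log(u1))  # radius
    theta = 2.0 * math.pi * u2          # uniform angle
    return r * math.cos(theta), r * math.sin(theta)
```

Averaging many such samples should give mean near 0 and variance near 1.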
Monte Carlo
If you take independent and identically distributed samples from an unknown high-dimensional distribution p(x), then as the number of samples N gets larger the sample distribution converges to the true distribution. Mathematically, for a function f,

E[f] = ∫ f(x) p(x) dx ≈ (1/N) Σ_{i=1..N} f(x(i))
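A minimal sketch of this estimator (the function name and the Uniform(0, 1) example are mine, chosen because the true expectation is known in closed form):

```python
import random

def monte_carlo_expectation(f, sampler, n):
    """Approximate E[f(x)] by averaging f over n i.i.d. samples."""
    return sum(f(sampler()) for _ in range(n)) / n

random.seed(0)
# E[x^2] under Uniform(0, 1) is exactly 1/3
est = monte_carlo_expectation(lambda x: x * x, random.random, 100_000)
```

With 100,000 samples the estimate lands very close to 1/3, and the error shrinks like 1/sqrt(N).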
Rejection Sampling
More generally, we would like to sample from p(x), but it is easier to sample from a proposal distribution q(x) that satisfies p(x) ≤ M q(x) for some M < ∞.

1. Sample x* from q(x)
2. Sample u from the uniform distribution on [0, M q(x*)]
3. If u < p(x*): accept x*
4. Else: reject x* and pick another sample
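The steps above can be sketched as follows (the target density, a Beta(2, 2), and its bound M = 1.5 under a uniform proposal are my illustrative choices, not from the chapter):

```python
import random

def rejection_sample(p, q_sample, q_pdf, M):
    """Draw one sample from p(x) using a proposal q with p(x) <= M*q(x)."""
    while True:
        x = q_sample()                       # 1. sample from the proposal
        u = random.uniform(0, M * q_pdf(x))  # 2. uniform height under M*q(x)
        if u < p(x):                         # 3. accept if the point falls under p
            return x
        # 4. else: reject and try again

random.seed(0)
p = lambda x: 6 * x * (1 - x)   # Beta(2, 2) density on [0, 1], max 1.5 at x=0.5
samples = [rejection_sample(p, random.random, lambda x: 1.0, 1.5)
           for _ in range(20000)]
```

Beta(2, 2) has mean 0.5, which the sample mean should recover.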
If you don't choose M properly, you will have to reject a lot of samples. The curse of dimensionality makes the problem even worse. To avoid these problems we need to:

I. develop more sophisticated methods of understanding the space that we are sampling, or
II. try to ensure that samples are taken from areas of the space that have high probability.

Importance sampling addresses the second point: instead of rejecting samples, it attaches a weight to each sample that says how important it is.
Importance Sampling
Suppose we want to compute the expectation E[f] of a function f of a continuous random variable x distributed according to a distribution p(x) that we cannot sample from directly. Drawing samples x(i) from a proposal q(x) instead,

E[f] = ∫ f(x) p(x) dx = ∫ f(x) [p(x)/q(x)] q(x) dx ≈ (1/N) Σ_{i=1..N} f(x(i)) p(x(i))/q(x(i))

where the ratio p(x(i))/q(x(i)) is the importance weight. Using the importance weights we can resample the data (the Sampling-Importance-Resampling algorithm).
Sampling-Importance-Resampling Algorithm
1. Produce N samples x(i), i = 1, ..., N from q(x)
2. Compute normalized importance weights

   w(i) = [p(x(i))/q(x(i))] / Σ_{j=1..N} [p(x(j))/q(x(j))]

3. Resample from the set {x(i)} with probabilities given by the weights w(i)
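A compact sketch of the three steps (the Beta(2, 2) target and uniform proposal are my illustrative choices):

```python
import random

def sir(p, q_sample, q_pdf, n):
    """Sampling-Importance-Resampling: approximate samples from p via q."""
    xs = [q_sample() for _ in range(n)]         # 1. sample from the proposal q
    w = [p(x) / q_pdf(x) for x in xs]           # importance weights p/q
    total = sum(w)
    w = [wi / total for wi in w]                # 2. normalize the weights
    return random.choices(xs, weights=w, k=n)   # 3. resample by weight

random.seed(0)
p = lambda x: 6 * x * (1 - x)   # Beta(2, 2) target density on [0, 1]
resampled = sir(p, random.random, lambda x: 1.0, 20000)
```

The resampled set behaves like draws from p, so its mean should be close to 0.5.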
Markov Chain
A Markov chain on a space X with transition probabilities T is a random process (an infinite sequence of random variables x(0), x(1), ...) satisfying

p(x(t) | x(0), ..., x(t-1)) = p(x(t) | x(t-1)) = T(x(t-1), x(t))

That is, the probability of being in a particular state at time t, given the whole state history, depends only on the state at time t-1.
Reversible: we can move backwards and forwards along the chain with equal probability (the detailed balance condition). The probability of being in an unlikely state s and heading for a likely state s' should be the same as being in the likely state s' and heading for the unlikely state s, so that

p(s) T(s, s') = p(s') T(s', s)
Metropolis-Hastings
Assume that we have a proposal distribution of the form q(x(i) | x(i-1)) that we can sample from. The idea of Metropolis-Hastings is similar to that of rejection sampling: we take a sample x* and choose whether or not to keep it. Except that, unlike rejection sampling, rather than picking another sample if we reject the current one, we instead add another copy of the previously accepted sample. The probability of keeping the sample is

u(x* | x(i-1)) = min( 1, [p(x*) q(x(i-1) | x*)] / [p(x(i-1)) q(x* | x(i-1))] )
Metropolis-Hastings Algorithm
1. Given an initial value x(0)
2. Repeat (until you have enough samples):
   - Sample x* from q(x(i) | x(i-1))
   - Sample u from the uniform distribution on [0, 1]
   - If u < u(x* | x(i-1)) (the acceptance probability defined above): set x(i) = x*
   - Otherwise: set x(i) = x(i-1)
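A minimal sketch of the algorithm, using a symmetric Gaussian random-walk proposal so the q terms cancel and the acceptance probability reduces to min(1, p(x*)/p(x(i-1))) (the function names and the standard-normal target are mine):

```python
import math
import random

def metropolis_hastings(log_p, x0, n, step=1.0):
    """Random-walk Metropolis sampler for an unnormalized target p."""
    x, chain = x0, []
    for _ in range(n):
        x_star = x + random.gauss(0, step)   # symmetric proposal q(x*|x)
        # accept with probability min(1, p(x*)/p(x)), done in log space
        if math.log(1.0 - random.random()) < log_p(x_star) - log_p(x):
            x = x_star                        # accept the new sample
        chain.append(x)                       # on rejection, keep a copy of x
    return chain

random.seed(0)
# target: standard normal, log p(x) = -x^2/2 up to a constant
chain = metropolis_hastings(lambda x: -0.5 * x * x, 0.0, 50000)
```

The chain's sample mean and variance should be close to 0 and 1, the moments of the target.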
Simulated Annealing
There are lots of times when we just want to find the maximum of a distribution rather than approximate the distribution itself; simulated annealing does this. The method changes the Markov chain so that its invariant distribution is not p(x) but p^(1/T_i)(x), where T_i → 0 as i → ∞.
We need an annealing schedule that cools the system down over time, so that we become progressively less likely to accept solutions that are worse.
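One way to sketch this (the geometric cooling schedule, step size, and quadratic example target are my illustrative choices, not from the chapter):

```python
import math
import random

def anneal(log_p, x0, n, T0=10.0, alpha=0.999, step=1.0):
    """Simulated annealing: Metropolis moves on p(x)^(1/T) with T shrinking."""
    x, best, T = x0, x0, T0
    for _ in range(n):
        x_star = x + random.uniform(-step, step)
        # accept with probability min(1, (p(x*)/p(x))^(1/T)), in log space
        if math.log(1.0 - random.random()) * T < log_p(x_star) - log_p(x):
            x = x_star
        if log_p(x) > log_p(best):
            best = x
        T *= alpha   # geometric cooling: worse moves get ever less likely
    return best

random.seed(0)
# unnormalized log-density with its maximum at x = 3
best = anneal(lambda x: -(x - 3.0) ** 2, 0.0, 5000)
```

At high T almost any move is accepted (free exploration); as T → 0 the chain settles onto the maximum.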
Gibbs Sampling
Gibbs sampling is a Metropolis-Hastings algorithm whose proposals are always accepted. At each step, we replace the value of one variable by sampling from its distribution conditioned on the values of all the remaining variables. This makes it a perfect fit for Bayesian networks, where those conditional distributions are available directly.
More formally, the proposal distribution is

q(x* | x(t)) = p(x*_j | x(t)_-j)   if x*_-j = x(t)_-j
             = 0                    otherwise

where x_-j denotes all the variables except x_j. The Metropolis-Hastings acceptance ratio is then

r = [p(x*) q(x(t) | x*)] / [p(x(t)) q(x* | x(t))]
  = [p(x*) p(x(t)_j | x*_-j)] / [p(x(t)) p(x*_j | x(t)_-j)]
  = [p(x*_j | x_-j) p(x_-j) p(x(t)_j | x_-j)] / [p(x(t)_j | x_-j) p(x_-j) p(x*_j | x_-j)]
  = 1

using x*_-j = x(t)_-j (written simply as x_-j) and factoring each joint as p(x) = p(x_j | x_-j) p(x_-j). So we always accept!
Gibbs Sampler
1. For each variable x_j: initialize x_j(0)
2. Repeat (until you have enough samples):
   For each variable x_j in turn, sample x_j(t+1) from

   p(x_j | x_1(t+1), ..., x_{j-1}(t+1), x_{j+1}(t), ..., x_n(t))
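The sweep above can be sketched for a case where the full conditionals are known exactly (the bivariate Gaussian target with correlation rho is my illustrative choice):

```python
import random

def gibbs_bivariate_gaussian(n, rho=0.8):
    """Gibbs sampler for a zero-mean, unit-variance bivariate Gaussian."""
    x, y, chain = 0.0, 0.0, []
    sd = (1.0 - rho * rho) ** 0.5   # conditional standard deviation
    for _ in range(n):
        # each full conditional is itself Gaussian, so sampling it is easy
        x = random.gauss(rho * y, sd)   # sample from p(x | y)
        y = random.gauss(rho * x, sd)   # sample from p(y | x)
        chain.append((x, y))
    return chain

random.seed(0)
chain = gibbs_bivariate_gaussian(50000)
```

Every proposal is kept, and the empirical correlation of the chain recovers rho.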