Xiaoning Qian
ECEN689613 Probability Models
Part I Papers
M. S. Crouse, R. D. Nowak, R. G. Baraniuk, "Wavelet-Based Statistical Signal Processing Using Hidden Markov Models," IEEE Transactions on Signal Processing, 46(4), 1998.
H. Choi, R. G. Baraniuk, "Multiscale Image Segmentation Using Wavelet-Domain Hidden Markov Models," IEEE Transactions on Image Processing, 10(9), 2001.
J. Romberg, M. Wakin, H. Choi, R. G. Baraniuk, "A Geometric Hidden Markov Tree Wavelet Model," dsp.rice.edu, 2003.
Wavelet
What is a wavelet?
Wikipedia: a wavelet series represents a square-integrable function with respect to either a complete orthonormal set of basis functions, or an overcomplete set, a frame of a vector space (also known as a Riesz basis), for the Hilbert space of square-integrable functions.
The main idea of wavelets comes from function representations. Wavelets are closely related to multiscale/multiresolution analysis:
Decompose functions into different scales/frequencies and study each component with a resolution that matches its scale.
Wavelets are a class of functions used to localize a given function in both space and scale/frequency. For more information: http://www.amara.com/current/wavelet.html
Example (Haar wavelet): the wavelet function (mother wavelet) $\psi(t)$ and the scaling function (father wavelet) $\phi(t)$:
$$\psi(t) = \begin{cases} 1 & 0 \le t < 1/2 \\ -1 & 1/2 \le t < 1 \\ 0 & \text{otherwise} \end{cases}, \qquad \phi(t) = \begin{cases} 1 & 0 \le t < 1 \\ 0 & \text{otherwise} \end{cases}.$$
Daughter wavelets: $\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\, \psi\!\left(\frac{t-b}{a}\right)$, with $a$ the scale and $b$ the shift; in the dyadic case,
$$\psi_{J,K}(t) = 2^{J/2}\, \psi(2^J t - K).$$
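As a quick check of these definitions, here is a minimal NumPy sketch of the Haar pair and the daughter-wavelet construction; the function names are illustrative, not from the papers:

```python
import numpy as np

def haar_psi(t):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    t = np.asarray(t, dtype=float)
    return np.where((0.0 <= t) & (t < 0.5), 1.0,
                    np.where((0.5 <= t) & (t < 1.0), -1.0, 0.0))

def haar_phi(t):
    """Haar father (scaling) function: +1 on [0, 1), 0 elsewhere."""
    t = np.asarray(t, dtype=float)
    return np.where((0.0 <= t) & (t < 1.0), 1.0, 0.0)

def daughter(psi, a, b):
    """psi_{a,b}(t) = |a|^{-1/2} psi((t - b) / a): a = scale, b = shift."""
    return lambda t: psi((np.asarray(t, dtype=float) - b) / a) / np.sqrt(abs(a))

# Dyadic case psi_{J,K}(t) = 2^{J/2} psi(2^J t - K) is a = 2^{-J}, b = K * 2^{-J}
psi_21 = daughter(haar_psi, a=2.0 ** -2, b=1 * 2.0 ** -2)
print(psi_21(np.array([0.25, 0.3, 0.45])))   # -> [ 2.  2. -2.]
```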
Why wavelet?
Wavelets are localized in both space and frequency, whereas the standard Fourier transform is localized only in frequency.
Multiscale analysis.
Lower computational complexity (the fast wavelet transform is $O(N)$, versus $O(N \log N)$ for the FFT).
...
Wavelet transform
$$W\{z\}(a, b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} z(t)\, \psi^*\!\left(\frac{t - b}{a}\right) dt$$
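A minimal numerical sketch of this transform at a single $(a, b)$ pair, approximating the integral with the trapezoid rule; the Haar wavelet is real-valued, so no conjugation is needed, and the test signal is illustrative:

```python
import numpy as np

def haar_psi(t):
    # Haar mother wavelet from the example above
    return np.where((0.0 <= t) & (t < 0.5), 1.0,
                    np.where((0.5 <= t) & (t < 1.0), -1.0, 0.0))

def cwt_point(z, t, a, b, psi=haar_psi):
    """W{z}(a, b) ~ |a|^{-1/2} * integral of z(t) psi((t - b)/a) dt."""
    return np.trapz(z * psi((t - b) / a), t) / np.sqrt(abs(a))

t = np.linspace(0.0, 1.0, 4096)
z = np.where(t < 0.5, 1.0, -1.0)          # a step edge at t = 1/2
print(cwt_point(z, t, a=0.5, b=0.25))     # large response straddling the edge
```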
Any square-integrable $z$ can be expanded as
$$z(t) = \sum_{K} u_K\, \phi_{J_0,K}(t) + \sum_{J = J_0}^{\infty} \sum_{K} w_{J,K}\, \psi_{J,K}(t),$$
with wavelet coefficients
$$w_{J,K} = \int z(t)\, \psi_{J,K}(t)\, dt, \qquad u_K = \int z(t)\, \phi_{J_0,K}(t)\, dt,$$
by orthonormality:
$$\int \psi_{J,K}(t)\, \psi_{J',K'}(t)\, dt = \delta_{JJ'}\, \delta_{KK'}.$$
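In practice the dyadic coefficients are computed with the fast discrete wavelet transform rather than by explicit integrals. A sketch of the orthonormal Haar analysis, assuming the signal length is divisible by 2^levels:

```python
import numpy as np

def haar_dwt_level(x):
    """One level of the orthonormal Haar DWT: (scaling coeffs u, detail coeffs w)."""
    x = np.asarray(x, dtype=float)
    u = (x[0::2] + x[1::2]) / np.sqrt(2)   # coarse approximation
    w = (x[0::2] - x[1::2]) / np.sqrt(2)   # detail (wavelet) coefficients
    return u, w

def haar_dwt(x, levels):
    """Full decomposition: coarsest scaling coeffs plus details, coarse to fine."""
    details = []
    u = np.asarray(x, dtype=float)
    for _ in range(levels):
        u, w = haar_dwt_level(u)
        details.append(w)
    return u, details[::-1]

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
u, details = haar_dwt(x, levels=3)
print(u, details)
```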
Locality: each wavelet is localized simultaneously in space and frequency.
Multiresolution: wavelets are compressed and dilated to analyze at a nested set of scales.
Compression: the wavelet transforms of real-world signals tend to be sparse.
Clustering: if a particular wavelet coefficient is large/small, adjacent coefficients are very likely to also be large/small.
Persistence: large/small values of wavelet coefficients tend to propagate across scales.
Application
Note: the signal model is in the wavelet domain. Signal model: $w_i^k = y_i^k + n_i^k$, where $w_i^k$ is the $i$th wavelet coefficient obtained by transforming the $k$th sample, and the task of denoising or detection is to estimate $y_i^k$. The traditional assumption is that the coefficients follow independent Gaussian distributions. If $n_i$ is white noise, adaptive thresholding is enough for denoising, based on the compression property.
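A sketch of that idea for one Haar level, using the Donoho-Johnstone universal threshold $\sigma\sqrt{2 \log N}$; the threshold rule and test signal are illustrative choices, not the papers' exact setup:

```python
import numpy as np

def soft_threshold(w, t):
    """Soft thresholding: shrink coefficients toward zero by t."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def denoise_one_level(x, sigma_n):
    """One-level Haar denoising sketch: w = y + n with i.i.d. Gaussian n.
    Real signals are sparse in the wavelet domain, so small details ~ pure noise."""
    x = np.asarray(x, dtype=float)
    u = (x[0::2] + x[1::2]) / np.sqrt(2)          # coarse coefficients (kept)
    w = (x[0::2] - x[1::2]) / np.sqrt(2)          # detail coefficients (thresholded)
    t = sigma_n * np.sqrt(2 * np.log(x.size))     # universal threshold
    w = soft_threshold(w, t)
    y = np.empty_like(x)                          # inverse one-level Haar transform
    y[0::2] = (u + w) / np.sqrt(2)
    y[1::2] = (u - w) / np.sqrt(2)
    return y

rng = np.random.default_rng(0)
clean = np.repeat([1.0, 5.0, 2.0, 4.0], 64)       # piecewise-constant test signal
noisy = clean + 0.3 * rng.standard_normal(clean.size)
print(np.mean((denoise_one_level(noisy, 0.3) - clean) ** 2))
```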
Image segmentation
Modeling the statistical dependency in images. Image model: $f(x_r \mid c_i)$, where the $c_i$ are the labels for different objects in an image and the $x_r$ are image regions with the same label; $c = \{c_i, \forall i\}$ can be considered a random field while $x_r$ is the observation. The model for $c$ can be considered prior knowledge.
Maximum likelihood segmentation: $\max_c \prod_r f(x_r \mid c)$
Maximum a posteriori segmentation: $\max_c \prod_r f(x_r \mid c)\, f(c)$
Note: the model can be in either the image domain or the wavelet domain.
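A toy sketch of the ML and MAP rules for one region under a Gaussian class model; the two-class parameters and priors below are invented for illustration, and the MAP variant simply folds a per-label prior into the log-likelihood:

```python
import numpy as np
from scipy.stats import norm

# Hypothetical two-class setup: label -> (mean, std) of pixel intensities
class_params = {0: (0.2, 0.1), 1: (0.8, 0.15)}

def ml_label(region_pixels):
    """Maximum likelihood label: argmax_c prod_r f(x_r | c), in the log domain."""
    ll = {c: norm.logpdf(region_pixels, mu, sd).sum()
          for c, (mu, sd) in class_params.items()}
    return max(ll, key=ll.get)

def map_label(region_pixels, log_prior):
    """MAP label: add log f(c) from prior knowledge to the log-likelihood."""
    ll = {c: norm.logpdf(region_pixels, mu, sd).sum() + log_prior[c]
          for c, (mu, sd) in class_params.items()}
    return max(ll, key=ll.get)

region = np.array([0.75, 0.9, 0.8, 0.7])
print(ml_label(region), map_label(region, {0: np.log(0.7), 1: np.log(0.3)}))
```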
Multiscale image segmentation: the choice of window size. Note: the model in multiscale segmentation is again in the wavelet domain; the label random field has a quadtree structure. Different statistical properties of the wavelet coefficients correspond to different image regions. Singularity structures (edges) have large wavelet coefficients (useful for heterogeneous regions).
HMT
General settings: $c$ is a random field (latent/hidden variables); $x$ the observations.
Independent $c$: $f(c_i)$ and $f(x \mid c) = \prod_i f(x_i \mid c_i)$
Markov random field (hidden Markov model): $f(c_i \mid x, c_{\setminus i}) = f(c_i \mid c_{N_i})$ and $f(x \mid c) = \prod_i f(x_i \mid c_i)$
Conditional random field: $f(c_i \mid x, c_{\setminus i}) = f(c_i \mid x, c_{N_i})$
Independent c
Simplest assumption: the $c_i$ are all independent: $f(c_i)$ and $f(x \mid c) = \prod_i f(x_i \mid c_i)$. This yields standard classification algorithms, as in the sketch below.
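With independent labels the posterior factorizes, so classification reduces to a per-pixel Bayes rule; the means, variance, and prior below are illustrative:

```python
import numpy as np

# Independent-label sketch: f(x|c) = prod_i f(x_i | c_i) with independent f(c_i).
means = np.array([0.2, 0.8])          # f(x_i | c_i = m) = N(means[m], 0.1^2)
prior = np.array([0.6, 0.4])          # f(c_i = m)

def classify_pixels(x, sd=0.1):
    """Per-pixel posterior argmax: c_i = argmax_m f(x_i | m) f(m)."""
    x = np.asarray(x, dtype=float)[:, None]
    log_post = -0.5 * ((x - means) / sd) ** 2 + np.log(prior)
    return np.argmax(log_post, axis=1)

print(classify_pixels([0.1, 0.5, 0.9]))   # -> [0 0 1]
```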
[Figures: graphical model structures relating image pixels to hidden states]
Tradeoff between accuracy and complexity: small sample sizes, overfitting, ...
Residual dependency structure (nested structure) and secondary properties: a model that reflects these properties would be appropriate and flexible but not too complicated. This motivates a nested multiscale graph (tree, to be specific) model.
For a single wavelet coefficient, since real-world signals are non-Gaussian, we model it with a mixture model:
$$f(w) = \sum_m f(w \mid c = m)\, f(c = m)$$
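A sketch of the usual two-state, zero-mean Gaussian mixture for a wavelet coefficient, where a low-variance "small" state and a high-variance "large" state capture sparsity; the probabilities and variances are illustrative:

```python
import numpy as np

def mixture_pdf(w, probs, sigmas):
    """f(w) = sum_m f(w | c = m) f(c = m) with zero-mean Gaussian components."""
    w = np.asarray(w, dtype=float)[..., None]
    comp = np.exp(-0.5 * (w / sigmas) ** 2) / (sigmas * np.sqrt(2 * np.pi))
    return (comp * probs).sum(axis=-1)

# Most coefficients small, a few large (compression property)
probs = np.array([0.9, 0.1])
sigmas = np.array([0.1, 2.0])
print(mixture_pdf([0.05, 1.5], probs, sigmas))
```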
Options for modeling the hidden-state dependency: independent mixture model; hidden Markov chain model; hidden Markov tree model.
For the tree root $c_0$: $f(c_0)$; for tree nodes other than the root, a transition probability $f(c_i \mid c_{\rho(i)})$, where $\rho(i)$ denotes the parent node of $i$.
Parameters in HMT
$f(c_0)$, $f(c_i \mid c_{\rho(i)})$; mixture means and variances: $\mu_{i,m}$, $\sigma^2_{i,m}$. Notice the conditional independence properties of the model.
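One possible container for these parameters, as a sketch; the field names and two-state values are illustrative, not the paper's notation:

```python
import numpy as np
from dataclasses import dataclass, field

@dataclass
class HMTParams:
    """Parameters of an M-state hidden Markov tree: root pmf f(c_0),
    per-node transitions f(c_i | c_parent), and per-node mixture means/variances
    (means are typically zero for wavelet coefficients)."""
    root_pmf: np.ndarray                              # shape (M,)
    transitions: dict = field(default_factory=dict)   # node -> (M, M), rows = parent state
    means: dict = field(default_factory=dict)         # node -> (M,)
    variances: dict = field(default_factory=dict)     # node -> (M,)

params = HMTParams(root_pmf=np.array([0.5, 0.5]))
params.transitions[1] = np.array([[0.9, 0.1],        # persistence: small stays small,
                                  [0.2, 0.8]])       # large parents -> large children
params.means[1] = np.zeros(2)
params.variances[1] = np.array([0.01, 4.0])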
Problems of HMT
Given the observed wavelet coefficients, we need to train the model parameters and compute the hidden-state posteriors; the EM algorithm below addresses both.
Expectation-Maximization algorithm
General settings for estimation: $\max_\theta f(x \mid \theta)$ (ML) or $\max_\theta f(\theta \mid x) \propto \max_\theta f(x \mid \theta)\, f(\theta)$ (MAP). The EM algorithm provides a greedy, iterative way to solve the general estimation problem based on the hidden/latent variables $c$:
$$\log f(x \mid \theta) = \log f(x, c \mid \theta) - \log f(c \mid x, \theta).$$
Since the algorithm is iterative, we take the expectation with respect to $c$ under the previously estimated parameters $\theta^{k-1}$:
$$\int \log f(x \mid \theta)\, f(c \mid x, \theta^{k-1})\, dc = \int \log f(x, c \mid \theta)\, f(c \mid x, \theta^{k-1})\, dc - \int \log f(c \mid x, \theta)\, f(c \mid x, \theta^{k-1})\, dc,$$
where the left-hand side is just $\log f(x \mid \theta)$, since it does not depend on $c$.
To guarantee an increase of the likelihood $\log f(x \mid \theta)$, we only need to solve
$$\theta^k = \arg\max_\theta \int \log f(x, c \mid \theta)\, f(c \mid x, \theta^{k-1})\, dc,$$
since the second term above is maximized over $\theta$ at $\theta = \theta^{k-1}$ (Gibbs' inequality), so increasing the first term cannot decrease the likelihood.
Hence, the E-step computes $f(c \mid x, \theta^{k-1})$; the M-step solves the above optimization problem. A generic instance is sketched below.
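As a concrete instance, here is EM for a two-component one-dimensional Gaussian mixture, the independent-$c$ special case; the initialization and iteration count are arbitrary choices:

```python
import numpy as np

def em_gmm(x, iters=100):
    """EM for a 2-component 1-D Gaussian mixture. E-step: responsibilities
    f(c_i | x_i, theta^{k-1}); M-step: weighted maximum-likelihood updates."""
    x = np.asarray(x, dtype=float)
    pi, mu, var = np.array([0.5, 0.5]), np.array([x.min(), x.max()]), np.array([1.0, 1.0])
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point
        dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        resp = dens * pi
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: maximize the expected complete-data log-likelihood
        nk = resp.sum(axis=0)
        pi = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var

rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(0, 1, 300), rng.normal(5, 0.5, 200)])
print(em_gmm(data))
```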
In HMT, $\theta = \{f(c_0),\, f(c_i \mid c_{\rho(i)}),\, \mu_{i,m},\, \sigma^2_{i,m}\}$, where $i$ indexes the wavelet coefficients and $m$ the components of the mixture. We update via the analogous equation:
$$\theta^k = \arg\max_\theta \sum_c \log f(w, c \mid \theta)\, f(c \mid w, \theta^{k-1}).$$
We need several tricks to complete the EM algorithm here, since we do not have an easy form for $f(w, c \mid \theta)$.
The main task is to estimate the marginal state distribution $f(c_i = m \mid w, \theta)$ and the parent-child joint distribution $f(c_i = m, c_{\rho(i)} = n \mid w, \theta)$. Based on the conditional independence we have for HMT, we can write
$$f(c_i = m, w \mid \theta) = f(w_{T_i} \mid w_{T \setminus i}, c_i = m, \theta)\, f(c_i = m, w_{T \setminus i} \mid \theta) = f(w_{T_i} \mid c_i = m, \theta)\, f(c_i = m, w_{T \setminus i} \mid \theta) = \beta_i(m)\, \alpha_i(m),$$
where $T_i$ is the subtree of coefficients rooted at node $i$ and $T \setminus i$ is the rest of the tree. Similarly,
$$f(c_i = m, c_{\rho(i)} = n, w \mid \theta) = \beta_i(m)\, f(c_i = m \mid c_{\rho(i)} = n)\, \alpha_{\rho(i)}(n)\, \beta_{\rho(i) \setminus i}(n).$$
Since $f(w \mid \theta) = \sum_m f(c_i = m, w \mid \theta) = \sum_m \alpha_i(m)\, \beta_i(m)$, all the required distributions are expressed in terms of the $\alpha$s and $\beta$s; a compact sketch of the two passes follows.
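This sketch runs the two recursions on a toy two-state tree. It follows the quantities above ($\beta$ messages upward, $\alpha$ downward), but the tree, transition matrices, and observed coefficients are invented for illustration:

```python
import numpy as np

# Node 0 is the root; `parent` encodes the tree; A[i] is f(c_i = n | c_parent = m),
# rows indexed by the parent state m.  g[i][m] = f(w_i | c_i = m), zero-mean Gaussians.
parent = {1: 0, 2: 0, 3: 1, 4: 1}
children = {0: [1, 2], 1: [3, 4], 2: [], 3: [], 4: []}
root_pmf = np.array([0.5, 0.5])
A = {i: np.array([[0.9, 0.1], [0.2, 0.8]]) for i in parent}
sig = np.array([0.1, 2.0])                        # per-state std devs
w = {0: 1.4, 1: 0.9, 2: 0.05, 3: 0.7, 4: -0.02}   # observed wavelet coefficients
g = {i: np.exp(-0.5 * (w[i] / sig) ** 2) / (sig * np.sqrt(2 * np.pi)) for i in w}

beta, beta_msg = {}, {}                           # upward pass: fine -> coarse
for i in [4, 3, 2, 1, 0]:                         # any child-before-parent order
    beta[i] = g[i].copy()
    for j in children[i]:
        beta[i] *= beta_msg[j]
    if i in parent:                               # message to parent: sum over c_i
        beta_msg[i] = A[i] @ beta[i]

alpha = {0: root_pmf.copy()}                      # downward pass: coarse -> fine
for i in [1, 2, 3, 4]:                            # any parent-before-child order
    p = parent[i]
    # beta[p] / beta_msg[i] is beta_{p \ i}: the parent's beta excluding subtree i
    alpha[i] = A[i].T @ (alpha[p] * beta[p] / beta_msg[i])

post = {i: alpha[i] * beta[i] / (alpha[i] * beta[i]).sum() for i in w}
print(post[3])                                    # f(c_3 = m | w): state posteriors
```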
For the computation, we follow the downward algorithm from coarse to fine levels to estimate the $\alpha$s, and the upward algorithm from fine to coarse levels to estimate the $\beta$s, as described in the paper. The M-step reduces to conditional means due to the Gaussian assumption. Note the tricks for handling the $K$ trees and parameter tying.
With the EM-trained parameters, including $f(c_i = m \mid w, \theta)$, the $\sigma^2_{i,m}$, and the noise variance $\sigma^2_n$, estimating the signal is as simple as solving for the conditional mean:
$$E[y_i \mid w, \theta] = \sum_m f(c_i = m \mid w, \theta)\, \frac{\sigma^2_{i,m}}{\sigma^2_{i,m} + \sigma^2_n}\, w_i.$$
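A minimal sketch of this shrinkage rule for a single coefficient; the posteriors and variances plugged in are illustrative (in practice the posteriors come from the upward-downward pass above):

```python
import numpy as np

def posterior_mean_estimate(w_i, post, var_m, var_n):
    """E[y_i | w, theta] = sum_m f(c_i=m | w) * var_m / (var_m + var_n) * w_i:
    a per-state empirical Wiener shrinkage, averaged over the state posterior."""
    return w_i * np.sum(post * var_m / (var_m + var_n))

post = np.array([0.85, 0.15])          # state posteriors f(c_i = m | w, theta)
var_m = np.array([0.01, 4.0])          # per-state signal variances sigma^2_{i,m}
print(posterior_mean_estimate(0.3, post, var_m, var_n=0.04))
```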
Image segmentation: [result figures omitted]
Know the available tools. Do not force one tool on every problem. Have the right model and appropriate assumptions. Work hard to find the simplest (elegant) solution.