
Hidden Markov Trees for Statistical Signal/Image Processing

Xiaoning Qian
ECEN689613 Probability Models

Texas A&M University

Part I Papers

M. S. Crouse, R. D. Nowak, R. G. Baraniuk, "Wavelet-Based Statistical Signal Processing Using Hidden Markov Models," IEEE Trans. Signal Processing, 46(4), 1998.
H. Choi, R. G. Baraniuk, "Multiscale Image Segmentation Using Wavelet-Domain Hidden Markov Models," IEEE Trans. Image Processing, 10(9), 2001.
J. Romberg, M. Wakin, H. Choi, R. G. Baraniuk, "A Geometric Hidden Markov Tree Wavelet Model," dsp.rice.edu, 2003.


Part II Wavelet Transform


What is a wavelet?

Wikipedia: A wavelet series is a representation of a square-integrable function with respect to either a complete, orthonormal set of basis functions or an overcomplete set (a frame of a vector space, also known as a Riesz basis) for the Hilbert space of square-integrable functions.


What is a wavelet?

The main idea of wavelets comes from function representations. Wavelets are closely related to multiscale/multiresolution analysis:
decompose functions into different scales/frequencies and study each component with a resolution that matches its scale.

Wavelets are a class of functions used to localize a given function in both space and scale/frequency. For more information: http://www.amara.com/current/wavelet.html


An example Haar basis

Example (Haar wavelet): the wavelet function (mother wavelet) $\psi(t)$ and the scaling function (father wavelet) $\phi(t)$:
$$\psi(t) = \begin{cases} 1 & 0 \le t < 1/2 \\ -1 & 1/2 \le t < 1 \\ 0 & \text{otherwise} \end{cases} \qquad \phi(t) = \begin{cases} 1 & 0 \le t < 1 \\ 0 & \text{otherwise} \end{cases}$$

Daughter wavelets: $\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\,\psi\!\left(\frac{t-b}{a}\right)$, where $a$ is the scale and $b$ the shift. For dyadic scales and shifts,
$$\psi_{J,K}(t) = 2^{J/2}\,\psi(2^{J} t - K).$$

Multi-dimensional wavelets are built as tensor products of 1-dimensional wavelets.
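As a quick numerical check of these definitions, here is a minimal sketch assuming NumPy (the function names haar_psi, haar_phi, haar_psi_JK are ours, for illustration only), including a quadrature check of orthonormality of the dyadic daughters:

```python
import numpy as np

def haar_psi(t):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    t = np.asarray(t, dtype=float)
    return (np.where((t >= 0) & (t < 0.5), 1.0, 0.0)
            - np.where((t >= 0.5) & (t < 1.0), 1.0, 0.0))

def haar_phi(t):
    """Haar scaling function: +1 on [0, 1), 0 elsewhere."""
    t = np.asarray(t, dtype=float)
    return np.where((t >= 0) & (t < 1.0), 1.0, 0.0)

def haar_psi_JK(t, J, K):
    """Dyadic daughter wavelet 2^(J/2) * psi(2^J t - K)."""
    return 2.0 ** (J / 2) * haar_psi(2.0 ** J * np.asarray(t, dtype=float) - K)

# Numerical quadrature on a fine grid: orthonormality of two daughters.
t = np.linspace(0.0, 1.0, 2 ** 14, endpoint=False)
dt = t[1] - t[0]
print(np.sum(haar_psi_JK(t, 1, 0) * haar_psi_JK(t, 1, 1)) * dt)  # ~ 0 (orthogonal)
print(np.sum(haar_psi_JK(t, 1, 0) ** 2) * dt)                    # ~ 1 (unit norm)
```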


Why wavelet?

Wavelets are localized in both space and frequency, whereas the standard Fourier transform is localized only in frequency.
Multiscale analysis.
Lower computational complexity.
...


Wavelet transform

Continuous wavelet transform (CWT):
$$W\{z\}(a,b) = \int_{\mathbb{R}} z(t)\,\psi_{a,b}(t)\,dt$$
$$z(t) = \int\!\!\int W\{z\}(a,b)\,\psi_{a,b}(t)\,da\,db$$
$$\int_{\mathbb{R}} \psi_{a,b}(t)\,\psi_{c,d}(t)\,dt = \delta_{ac}\,\delta_{bd}$$


Wavelet transform

Discrete wavelet transform (DWT):


$$z(t) = \sum_{K} u_{K}\,\phi_{J_0,K}(t) \;+\; \sum_{J \ge J_0}\sum_{K} w_{J,K}\,\psi_{J,K}(t)$$
$$w_{J,K} = \int z(t)\,\psi_{J,K}(t)\,dt$$
$$\int \psi_{J,K}(t)\,\psi_{J',K'}(t)\,dt = \delta_{JJ'}\,\delta_{KK'}$$
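A minimal sketch of the Haar DWT in its filter-bank form, assuming NumPy (haar_dwt/haar_idwt are illustrative names; sign and ordering conventions for the coefficients vary between references):

```python
import numpy as np

def haar_dwt(x, levels):
    """Orthonormal Haar DWT of a signal whose length is divisible by 2^levels.

    Returns the coarsest scaling coefficients u_K and a list of detail
    (wavelet) coefficient arrays w_{J,K}, finest level first.
    """
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))   # wavelet coefficients
        approx = (even + odd) / np.sqrt(2.0)          # scaling coefficients
    return approx, details

def haar_idwt(approx, details):
    """Invert haar_dwt (perfect reconstruction, since the basis is orthonormal)."""
    x = np.asarray(approx, dtype=float)
    for w in reversed(details):
        even = (x + w) / np.sqrt(2.0)
        odd = (x - w) / np.sqrt(2.0)
        out = np.empty(2 * x.size)
        out[0::2], out[1::2] = even, odd
        x = out
    return x

z = np.random.randn(8)
u, w = haar_dwt(z, levels=3)
print(np.allclose(haar_idwt(u, w), z))  # True
```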


Properties of the wavelet transform

Locality: each wavelet is localized simultaneously in space and frequency.
Multiresolution: wavelets are compressed and dilated to analyze at a nested set of scales.
Compression: the wavelet transforms of real-world signals tend to be sparse.


Secondary properties may be useful.

Clustering: if a particular wavelet coefficient is large/small, adjacent coefficients are very likely to also be large/small.
Persistence: large/small values of wavelet coefficients tend to propagate across scales.


Part III Signal processing problems with wavelet applications


Denoising or signal detection

Note: the signal model is in the wavelet domain. Signal model: $w_i^k = y_i^k + n_i^k$, where $w_i^k$ is the $i$-th wavelet coefficient obtained by transforming the $k$-th observed sample; the task of denoising or detection is to estimate $y_i^k$. The traditional assumption is that the coefficients follow independent Gaussian distributions. When $n_i$ is white noise, adaptive thresholding is enough for denoising, based on the compression property.
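As one concrete instance of such thresholding (a sketch of the classical Donoho-Johnstone rule, not necessarily the exact rule used in the papers), soft thresholding of the wavelet coefficients with the universal threshold $\sigma\sqrt{2\log N}$:

```python
import numpy as np

def soft_threshold(w, thr):
    """Soft thresholding: shrink wavelet coefficients toward zero."""
    return np.sign(w) * np.maximum(np.abs(w) - thr, 0.0)

def universal_threshold(sigma, n):
    """Donoho-Johnstone universal threshold sigma * sqrt(2 log n)."""
    return sigma * np.sqrt(2.0 * np.log(n))

# Toy usage on one subband of noisy coefficients w = y + n with known noise std:
rng = np.random.default_rng(0)
y = np.zeros(64)
y[::16] = 5.0                                   # sparse "signal" coefficients
w = y + rng.normal(0.0, 0.5, size=y.shape)
y_hat = soft_threshold(w, universal_threshold(0.5, w.size))
print(np.round(y_hat[:8], 2))                   # large coefficients survive, noise is zeroed
```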


Image segmentation

Modeling the statistical dependency in images. Image model: $f(x_r \mid c_i)$, where the $c_i$ are labels for the different objects in an image and the $x_r$ are image regions with the same label; $c = \{c_i\}$ can be considered a random field, while the $x_r$ are the observations. The model for $c$ can be considered prior knowledge.
Maximum likelihood segmentation: $\max_c \prod_r f(x_r \mid c)$
Maximum a posteriori (MAP) segmentation: $\max_c \prod_r f(x_r \mid c)\,f(c)$
Note: the model can be either in the image domain or in the wavelet domain.
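A toy numerical illustration of the ML and MAP rules above, under the simplifying (and here hypothetical) assumption that the label prior factorizes per region so the maximization separates; with a random-field prior on $c$ the joint maximization no longer decomposes this way:

```python
import numpy as np

# Two classes with Gaussian region likelihoods f(x_r | c) and a prior f(c); all numbers made up.
means, stds = np.array([0.0, 2.0]), np.array([1.0, 1.0])
prior = np.array([0.8, 0.2])

x = np.array([0.3, 1.6, 2.4, 0.9])              # one observed feature per region r

# log f(x_r | c) for every region r and class c
loglik = (-0.5 * ((x[:, None] - means) / stds) ** 2
          - np.log(stds) - 0.5 * np.log(2 * np.pi))

ml_labels = np.argmax(loglik, axis=1)                    # maximize f(x_r | c)
map_labels = np.argmax(loglik + np.log(prior), axis=1)   # maximize f(x_r | c) f(c)
print(ml_labels, map_labels)                             # the prior pulls borderline regions to class 0
```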


Multiscale image segmentation

Multiscale image segmentation: choice of window size. Note: the model in multiscale segmentation is again in the wavelet domain; the label random field now has a quadtree structure. Different statistical properties of the wavelet coefficients correspond to different image regions. Singularity structures (edges) have large wavelet coefficients (useful for heterogeneous regions).


Basic assumptions in these applications

Independent Gaussian model for wavelet coefficients.
Better assumptions? Secondary properties?


Part IV Hidden Markov Trees


Graphical models as probability models

General settings: $c$ is a random field (latent/hidden variables); $x$ are the observations.
Independent $c$: $f(c) = \prod_i f(c_i)$ and $f(x \mid c) = \prod_i f(x_i \mid c_i)$.
Markov random field (hidden Markov model): $f(c_i \mid c_{-i}) = f(c_i \mid c_{N_i})$ and $f(x \mid c) = \prod_i f(x_i \mid c_i)$.
Conditional random field: $f(c_i \mid x, c_{-i}) = f(c_i \mid x, c_{N_i})$.


Independent c

Simplest assumption: the $c_i$ are all independent: $f(c) = \prod_i f(c_i)$ and $f(x \mid c) = \prod_i f(x_i \mid c_i)$. Classification algorithms apply directly.


Hidden Markov chain model

$c$ follows a Markov chain structure: $f(c_i \mid c_{-i}) = f(c_i \mid c_{i-1}, c_{i+1})$. Training/inference: EM algorithms.


More general hidden Markov model

$c$ has a larger neighborhood structure: $f(c_i \mid c_{-i}) = f(c_i \mid c_{i-2}, c_{i-1}, c_{i+1}, c_{i+2})$. Training/inference: EM algorithms.


Conditional random field

$c$ has a Markov structure globally conditioned on $x$: $f(c_i \mid x, c_{-i}) = f(c_i \mid x, c_{N_i})$. We usually assume that the potentials (transition and state functions) take some special form. Inference: belief propagation algorithms.


Graphical model in the image domain


Independent model for homogeneous image regions: simple classifiers for pixel intensities.

(Diagram: image pixels, each with its own hidden state.)


Graphical model in the image domain


Markov random field for noisy images or texture images: adds a prior coupling the hidden states of neighboring pixels.

(Diagram: image pixels and their hidden states.)


Graphical model in the image domain


Conditional random field for more complicated appearance.

(Diagram: image pixels and their hidden states.)


Graphical model in the image domain


Hidden random field for image regions with different parts.

(Diagram: image pixels with hidden states for parts and hidden states for regions.)


What is an appropriate model?

Trade-off between accuracy and complexity. Small sample size. Overfitting. ...


Hidden Markov trees for wavelet coefficients

Residual dependency structure (nested structure): the secondary properties. A model that reflects these properties would be appropriate: flexible, but not too complicated. A nested multiscale graph model (a tree, to be specific):


Independent mixture model for wavelet coefficients


A mixture model provides an appropriate approximation for non-Gaussian real-world signals.


Hidden Markov chain model for wavelet coefficients


Hidden Markov chain at the same scale:


Hidden Markov tree for wavelet coefficients


Dependence across scales according to the secondary properties of wavelet coefficients:


Hidden Markov tree for images


Probabilities in hidden Markov trees

For a single wavelet coefficient, since real-world signals are generally non-Gaussian, we model it with a mixture model:
$$f(w) = \sum_m f(w \mid c = m)\, f(c = m).$$
The hidden states can be tied together as an independent mixture model, a hidden Markov chain model, or a hidden Markov tree model.
For the tree root $c_0$: $f(c_0)$; for tree nodes other than the root, a transition probability $f(c_i \mid c_{\rho(i)})$, where $\rho(i)$ denotes the parent node of $i$.
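A minimal generative sketch of such a two-state hidden Markov tree on a perfect binary tree (assuming NumPy; the parameter values are hypothetical, and the mixture means are set to zero for simplicity), showing the root distribution, the parent-to-child transitions, and the per-state Gaussians:

```python
import numpy as np

rng = np.random.default_rng(0)

# State 0 = "small" (low variance), state 1 = "large" (high variance).
root_prob = np.array([0.7, 0.3])              # f(c_0)
trans = np.array([[0.9, 0.1],                 # f(c_i = n | c_parent = m), rows indexed by parent state
                  [0.2, 0.8]])                # persistence: large parents tend to have large children
sigmas = np.array([0.3, 2.0])                 # std of the zero-mean Gaussian for each state

def sample_hmt(depth):
    """Sample hidden states and wavelet coefficients on a perfect binary tree.

    Nodes are stored level by level in an array of length 2^depth - 1;
    the children of node i are 2*i + 1 and 2*i + 2 (root is node 0).
    """
    n = 2 ** depth - 1
    states = np.empty(n, dtype=int)
    states[0] = rng.choice(2, p=root_prob)
    for i in range(1, n):
        states[i] = rng.choice(2, p=trans[states[(i - 1) // 2]])
    coeffs = rng.normal(0.0, sigmas[states])  # w_i | c_i ~ N(0, sigma_{c_i}^2)
    return states, coeffs

states, w = sample_hmt(depth=4)
print(states)
print(np.round(w, 2))
```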


Parameters in HMT

$f(c_0)$, $f(c_i \mid c_{\rho(i)})$; mixture means and variances: $\mu_{i,m}$, $\sigma^2_{i,m}$. Notice the conditional independence properties of the model.


Problems of HMT

As with all graphical models, we need to solve:


Training the model; Computing the likelihood with the given observations; Estimating the latent/hidden states.


Expectation-Maximization algorithm

General setting for estimation: $\max_\theta f(x \mid \theta)$ (ML) or $\max_\theta f(\theta \mid x) \propto \max_\theta f(x \mid \theta)\,f(\theta)$ (MAP). The EM algorithm provides a greedy, iterative way to solve this estimation problem by means of the hidden/latent variables $c$:
$$\log f(x \mid \theta) = \log f(x, c \mid \theta) - \log f(c \mid x, \theta).$$
Since the algorithm is iterative, we take the expectation with respect to $c$ under the previously estimated parameters $\theta^{k-1}$:
$$\int \log f(x \mid \theta)\, f(c \mid x, \theta^{k-1})\, dc = \int \log f(x, c \mid \theta)\, f(c \mid x, \theta^{k-1})\, dc - \int \log f(c \mid x, \theta)\, f(c \mid x, \theta^{k-1})\, dc$$


Expectation-Maximization algorithm

By Jensen's inequality,
$$\int \log f(c \mid x, \theta)\, f(c \mid x, \theta^{k-1})\, dc \;\le\; \int \log f(c \mid x, \theta^{k-1})\, f(c \mid x, \theta^{k-1})\, dc.$$

To guarantee an increase of the likelihood $\log f(x \mid \theta)$, we therefore only need to solve
$$\theta^{k} = \arg\max_{\theta} \int \log f(x, c \mid \theta)\, f(c \mid x, \theta^{k-1})\, dc.$$

Hence the E-step computes $f(c \mid x, \theta^{k-1})$, and the M-step solves the above optimization problem.
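A minimal sketch of these E- and M-steps for the simplest case in this talk, the independent two-state, zero-mean Gaussian mixture for wavelet coefficients (no tree coupling), assuming NumPy and using synthetic data:

```python
import numpy as np

def gaussian(w, var):
    return np.exp(-0.5 * w ** 2 / var) / np.sqrt(2 * np.pi * var)

def em_indep_mixture(w, n_iter=50):
    """EM for f(w) = sum_m f(c = m) N(w; 0, sigma_m^2) with two zero-mean states."""
    p = np.array([0.5, 0.5])                              # f(c = m)
    var = np.array([0.5 * np.var(w), 2.0 * np.var(w)])    # initial small/large variances
    for _ in range(n_iter):
        # E-step: posteriors f(c_i = m | w_i, theta^{k-1})
        lik = np.stack([p[m] * gaussian(w, var[m]) for m in range(2)], axis=1)
        post = lik / lik.sum(axis=1, keepdims=True)
        # M-step: maximize the expected complete-data log-likelihood
        p = post.mean(axis=0)
        var = (post * w[:, None] ** 2).sum(axis=0) / post.sum(axis=0)
    return p, var

rng = np.random.default_rng(1)
large = rng.random(5000) < 0.3
w = rng.normal(0.0, np.where(large, 2.0, 0.3))
print(em_indep_mixture(w))   # roughly recovers p ~ (0.7, 0.3) and var ~ (0.09, 4.0)
```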


Training hidden Markov trees with EM

In the HMT, $\theta = \{f(c_0),\ f(c_i \mid c_{\rho(i)}),\ \mu_{i,m},\ \sigma^2_{i,m}\}$, where $i$ indexes the wavelet coefficients and $m$ the mixture components. We update via the analogous equation
$$\theta^{k} = \arg\max_{\theta} \int \log f(w, c \mid \theta)\, f(c \mid w, \theta^{k-1})\, dc.$$
We need several tricks to complete the EM algorithm here, since the posterior $f(c \mid w, \theta^{k-1})$ has no easy direct form.


Training hidden Markov trees with EM

The main task is to estimate the marginal state distribution $f(c_i = m \mid w, \theta)$ and the parent-child joint distribution $f(c_i = m, c_{\rho(i)} = n \mid w, \theta)$. Based on the conditional independence properties of the HMT (writing $w_{T_i}$ for the coefficients in the subtree rooted at $i$), we can write
$$f(c_i = m, w \mid \theta) = f(w_{T_i} \mid w_{T \setminus T_i}, c_i = m, \theta)\, f(c_i = m, w_{T \setminus T_i} \mid \theta) = f(w_{T_i} \mid c_i = m, \theta)\, f(c_i = m, w_{T \setminus T_i} \mid \theta) = \beta_i(m)\,\alpha_i(m),$$
and similarly
$$f(c_i = m, c_{\rho(i)} = n, w \mid \theta) = \beta_i(m)\, f(c_i = m \mid c_{\rho(i)} = n)\, \alpha_{\rho(i)}(n)\, \beta_{\rho(i) \setminus i}(n).$$
Since $f(w \mid \theta) = \sum_m f(c_i = m, w \mid \theta) = \sum_m \beta_i(m)\,\alpha_i(m)$, all of these distributions are expressed in terms of the $\alpha$'s and $\beta$'s.


Training hidden Markov trees with EM

For the computation, we follow the downward algorithm (from coarse to fine levels) to compute the $\alpha$'s and the upward algorithm (from fine to coarse levels) to compute the $\beta$'s, as described in the paper. The M-step reduces to conditional means thanks to the Gaussian assumption. Note the tricks for handling $K$ observed trees and parameter tying.
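Not the paper's upward-downward recursions, but a brute-force sketch (assuming NumPy; all parameter values hypothetical) that computes the same E-step posteriors $f(c_i = m \mid w, \theta)$ by enumerating every state configuration of a tiny three-node tree; this is useful as a correctness check for an upward-downward implementation:

```python
import numpy as np
from itertools import product

# Tiny tree: root 0 with children 1 and 2; two zero-mean Gaussian states per node.
root_prob = np.array([0.7, 0.3])                 # f(c_0)
trans = np.array([[0.9, 0.1], [0.2, 0.8]])       # f(c_child = n | c_parent = m)
var = np.array([0.3 ** 2, 2.0 ** 2])             # per-state variances
parent = {1: 0, 2: 0}
w = np.array([0.1, -2.5, 0.2])                   # observed wavelet coefficients

def gauss(x, v):
    return np.exp(-0.5 * x ** 2 / v) / np.sqrt(2 * np.pi * v)

# Enumerate all state configurations c and accumulate f(c, w | theta).
joint = {}
for c in product(range(2), repeat=3):
    p = root_prob[c[0]]
    for i in (1, 2):
        p *= trans[c[parent[i]], c[i]]
    for i in range(3):
        p *= gauss(w[i], var[c[i]])
    joint[c] = p

evidence = sum(joint.values())                    # f(w | theta)

# Posterior marginals f(c_i = m | w, theta): exactly the E-step quantities.
for i in range(3):
    post = [sum(p for c, p in joint.items() if c[i] == m) / evidence for m in range(2)]
    print(f"node {i}: P(small) = {post[0]:.3f}, P(large) = {post[1]:.3f}")
```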


Coming back to the denoising problem ...

With the EM-trained parameters, including $f(c_i = m \mid w, \theta)$, the state variances $\sigma^2_{i,m}$, and the noise variance $\sigma^2_n$, estimating the signal reduces to the conditional mean estimate
$$E(y_i \mid w, \theta) = \sum_m f(c_i = m \mid w, \theta)\, \frac{\sigma^2_{i,m}}{\sigma^2_{i,m} + \sigma^2_n}\, w_i.$$
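A direct sketch of this conditional-mean (Wiener-like) shrinkage, assuming NumPy and that the state posteriors and variances come from a trained model; the numbers below are made up:

```python
import numpy as np

def hmt_denoise(w, post, signal_var, noise_var):
    """Conditional-mean estimate E(y_i | w, theta) from per-coefficient state posteriors.

    w          : (N,)   noisy wavelet coefficients
    post       : (N, M) posteriors f(c_i = m | w, theta)
    signal_var : (N, M) signal variances sigma^2_{i,m}
    noise_var  : scalar noise variance sigma^2_n
    """
    shrink = signal_var / (signal_var + noise_var)     # per-state Wiener gain
    return (post * shrink).sum(axis=1) * w

# Toy usage: coefficient 0 is probably "small" (shrunk hard), coefficient 1 "large" (mostly kept).
w = np.array([0.05, -3.1])
post = np.array([[0.95, 0.05], [0.02, 0.98]])
signal_var = np.tile([0.1, 4.0], (2, 1))
print(hmt_denoise(w, post, signal_var, noise_var=0.25))
```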


Image segmentation

2D hidden Markov trees: a similar setting to the 1D signal model. Differences:

Subband independence: $f(w \mid \theta) = f(w^{LH} \mid \theta^{LH})\, f(w^{HL} \mid \theta^{HL})\, f(w^{HH} \mid \theta^{HH})$ (scaling); leads to a different expansion of the $\alpha$'s and $\beta$'s.
Context-based interscale fusion: the prior $f(c)$ uses a context vector.
A different EM algorithm.

(Example figures: image segmentation results.)

Extended hidden Markov trees

Geometric hidden Markov trees:


Modeling contours explicitly. Hidden state space: $c_i = \{d_m, \theta_m\}$.
New conditional distribution of the wavelet coefficients: $f(w_i \mid c_i) \propto \exp\!\left(-\mathrm{dist}(w_i, e_m)^2 / (2\sigma_g^2)\right)$, where $e_m$ is the response for an edge with fixed distance $d_m$ and angle $\theta_m$ (filter banks).
New transition probability: $f(n \mid m) \propto \exp\!\left(-HD(l_m, l_n)\right)$, where $HD(l_m, l_n)$ is the Hausdorff distance between the lines determined by distance and angle, restricted to a square in the plane.


Take home message

Know the available tools; do not force one tool onto every problem; have the right model and appropriate assumptions; work hard to find the simplest (most elegant) solution.
