
Hidden Markov Trees for Statistical Signal/Image Processing

Xiaoning Qian
ECEN689613 Probability Models

Texas A&M University

Part I Papers

M. S. Crouse, R. D. Nowak, R. G. Baraniuk, "Wavelet-Based Statistical Signal Processing Using Hidden Markov Models," IEEE Trans. Signal Processing, 46(4), 1998.
H. Choi, R. G. Baraniuk, "Multiscale Image Segmentation Using Wavelet-Domain Hidden Markov Models," IEEE Trans. Image Processing, 10(9), 2001.
J. Romberg, M. Wakin, H. Choi, R. G. Baraniuk, "A Geometric Hidden Markov Tree Wavelet Model," dsp.rice.edu, 2003.


Part II Wavelet Transform


What is a wavelet?

Wikipedia: A wavelet series is a representation of a square-integrable function with respect to either a complete, orthonormal set of basis functions or an overcomplete set (a frame of a vector space, also known as a Riesz basis) for the Hilbert space of square-integrable functions.


What is a wavelet?

The main idea of wavelets comes from function representations. Wavelets are closely related to multiscale/multiresolution analysis:
decompose functions into different scales/frequencies and study each component with a resolution that matches its scale.

Wavelets are a class of functions used to localize a given function in both space and scale/frequency. For more information: http://www.amara.com/current/wavelet.html


An example Haar basis

Example (Haar wavelet): the wavelet function (mother wavelet) $\psi(t)$ and the scaling function (father wavelet) $\phi(t)$:
$$\psi(t) = \begin{cases} 1 & 0 \le t < 1/2 \\ -1 & 1/2 \le t < 1 \\ 0 & \text{otherwise} \end{cases} \qquad \phi(t) = \begin{cases} 1 & 0 \le t < 1 \\ 0 & \text{otherwise} \end{cases}$$

Daughter wavelets: $\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\,\psi\!\left(\frac{t-b}{a}\right)$, where $a$ is the scale and $b$ the shift. For dyadic scales and shifts,
$$\psi_{J,K}(t) = 2^{J/2}\,\psi(2^{J} t - K).$$

Multi-dimensional wavelets are built as tensor products of 1-dimensional wavelets.
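As a quick numerical check of these definitions, here is a minimal sketch assuming NumPy (the function names haar_psi, haar_phi, haar_psi_JK are ours, for illustration only), including a quadrature check of orthonormality of the dyadic daughters:

```python
import numpy as np

def haar_psi(t):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    t = np.asarray(t, dtype=float)
    return (np.where((t >= 0) & (t < 0.5), 1.0, 0.0)
            - np.where((t >= 0.5) & (t < 1.0), 1.0, 0.0))

def haar_phi(t):
    """Haar scaling function: +1 on [0, 1), 0 elsewhere."""
    t = np.asarray(t, dtype=float)
    return np.where((t >= 0) & (t < 1.0), 1.0, 0.0)

def haar_psi_JK(t, J, K):
    """Dyadic daughter wavelet 2^(J/2) * psi(2^J t - K)."""
    return 2.0 ** (J / 2) * haar_psi(2.0 ** J * np.asarray(t, dtype=float) - K)

# Numerical quadrature on a fine grid: orthonormality of two daughters.
t = np.linspace(0.0, 1.0, 2 ** 14, endpoint=False)
dt = t[1] - t[0]
print(np.sum(haar_psi_JK(t, 1, 0) * haar_psi_JK(t, 1, 1)) * dt)  # ~ 0 (orthogonal)
print(np.sum(haar_psi_JK(t, 1, 0) ** 2) * dt)                    # ~ 1 (unit norm)
```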


Why wavelet?

Wavelets are localized in both space and frequency, whereas the standard Fourier transform is localized only in frequency.
Multiscale analysis.
Lower computational complexity.
...


Wavelet transform

Continuous wavelet transform (CWT):
$$W\{z\}(a,b) = \int_{\mathbb{R}} z(t)\,\psi_{a,b}(t)\,dt$$
$$z(t) = \int\!\!\int W\{z\}(a,b)\,\psi_{a,b}(t)\,da\,db$$
$$\int_{\mathbb{R}} \psi_{a,b}(t)\,\psi_{c,d}(t)\,dt = \delta_{ac}\,\delta_{bd}$$


Wavelet transform

Discrete wavelet transform (DWT):


$$z(t) = \sum_{K} u_{K}\,\phi_{J_0,K}(t) \;+\; \sum_{J \ge J_0}\sum_{K} w_{J,K}\,\psi_{J,K}(t)$$
$$w_{J,K} = \int z(t)\,\psi_{J,K}(t)\,dt$$
$$\int \psi_{J,K}(t)\,\psi_{J',K'}(t)\,dt = \delta_{JJ'}\,\delta_{KK'}$$
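A minimal sketch of the Haar DWT in its filter-bank form, assuming NumPy (haar_dwt/haar_idwt are illustrative names; sign and ordering conventions for the coefficients vary between references):

```python
import numpy as np

def haar_dwt(x, levels):
    """Orthonormal Haar DWT of a signal whose length is divisible by 2^levels.

    Returns the coarsest scaling coefficients u_K and a list of detail
    (wavelet) coefficient arrays w_{J,K}, finest level first.
    """
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))   # wavelet coefficients
        approx = (even + odd) / np.sqrt(2.0)          # scaling coefficients
    return approx, details

def haar_idwt(approx, details):
    """Invert haar_dwt (perfect reconstruction, since the basis is orthonormal)."""
    x = np.asarray(approx, dtype=float)
    for w in reversed(details):
        even = (x + w) / np.sqrt(2.0)
        odd = (x - w) / np.sqrt(2.0)
        out = np.empty(2 * x.size)
        out[0::2], out[1::2] = even, odd
        x = out
    return x

z = np.random.randn(8)
u, w = haar_dwt(z, levels=3)
print(np.allclose(haar_idwt(u, w), z))  # True
```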


Properties of the wavelet transform

Locality: each wavelet is localized simultaneously in space and frequency.
Multiresolution: wavelets are compressed and dilated to analyze at a nested set of scales.
Compression: the wavelet transforms of real-world signals tend to be sparse.


Secondary properties may be useful.

Clustering: if a particular wavelet coefficient is large/small, adjacent coefficients are very likely to also be large/small.
Persistence: large/small values of wavelet coefficients tend to propagate across scales.


Part III Signal processing problems with wavelet applications


Denoising or signal detection

Note: the signal model is in the wavelet domain. Signal model: $w_i^k = y_i^k + n_i^k$, where $w_i^k$ is the $i$-th wavelet coefficient obtained by transforming the $k$-th observed sample; the task of denoising or detection is to estimate $y_i^k$. The traditional assumption is that the coefficients follow independent Gaussian distributions. When $n_i$ is white noise, adaptive thresholding is enough for denoising, based on the compression property.
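As one concrete instance of such thresholding (a sketch of the classical Donoho-Johnstone rule, not necessarily the exact rule used in the papers), soft thresholding of the wavelet coefficients with the universal threshold $\sigma\sqrt{2\log N}$:

```python
import numpy as np

def soft_threshold(w, thr):
    """Soft thresholding: shrink wavelet coefficients toward zero."""
    return np.sign(w) * np.maximum(np.abs(w) - thr, 0.0)

def universal_threshold(sigma, n):
    """Donoho-Johnstone universal threshold sigma * sqrt(2 log n)."""
    return sigma * np.sqrt(2.0 * np.log(n))

# Toy usage on one subband of noisy coefficients w = y + n with known noise std:
rng = np.random.default_rng(0)
y = np.zeros(64)
y[::16] = 5.0                                   # sparse "signal" coefficients
w = y + rng.normal(0.0, 0.5, size=y.shape)
y_hat = soft_threshold(w, universal_threshold(0.5, w.size))
print(np.round(y_hat[:8], 2))                   # large coefficients survive, noise is zeroed
```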


Image segmentation

Modeling the statistical dependency in images. Image model: $f(x_r \mid c_i)$, where the $c_i$ are labels for the different objects in an image and the $x_r$ are image regions with the same label; $c = \{c_i\}$ can be considered a random field, while the $x_r$ are the observations. The model for $c$ can be considered prior knowledge.
Maximum likelihood segmentation: $\max_c \prod_r f(x_r \mid c)$
Maximum a posteriori (MAP) segmentation: $\max_c \prod_r f(x_r \mid c)\,f(c)$
Note: the model can be either in the image domain or in the wavelet domain.
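A toy numerical illustration of the ML and MAP rules above, under the simplifying (and here hypothetical) assumption that the label prior factorizes per region so the maximization separates; with a random-field prior on $c$ the joint maximization no longer decomposes this way:

```python
import numpy as np

# Two classes with Gaussian region likelihoods f(x_r | c) and a prior f(c); all numbers made up.
means, stds = np.array([0.0, 2.0]), np.array([1.0, 1.0])
prior = np.array([0.8, 0.2])

x = np.array([0.3, 1.6, 2.4, 0.9])              # one observed feature per region r

# log f(x_r | c) for every region r and class c
loglik = (-0.5 * ((x[:, None] - means) / stds) ** 2
          - np.log(stds) - 0.5 * np.log(2 * np.pi))

ml_labels = np.argmax(loglik, axis=1)                    # maximize f(x_r | c)
map_labels = np.argmax(loglik + np.log(prior), axis=1)   # maximize f(x_r | c) f(c)
print(ml_labels, map_labels)                             # the prior pulls borderline regions to class 0
```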


Multiscale image segmentation

Multiscale image segmentation: choice of window size. Note: the model in multiscale segmentation is again in the wavelet domain; the label random field now has a quadtree structure. Different statistical properties of the wavelet coefficients correspond to different image regions. Singularity structures (edges) have large wavelet coefficients (useful for heterogeneous regions).


Basic assumptions in these applications

Independent Gaussian model for wavelet coefficients.
Better assumptions? Secondary properties?


Part IV Hidden Markov Trees


Graphical models as probability models

General settings: $c$ is a random field (latent/hidden variables); $x$ are the observations.
Independent $c$: $f(c) = \prod_i f(c_i)$ and $f(x \mid c) = \prod_i f(x_i \mid c_i)$.
Markov random field (hidden Markov model): $f(c_i \mid c_{-i}) = f(c_i \mid c_{N_i})$ and $f(x \mid c) = \prod_i f(x_i \mid c_i)$.
Conditional random field: $f(c_i \mid x, c_{-i}) = f(c_i \mid x, c_{N_i})$.


Independent c

Simplest assumption: the $c_i$ are all independent: $f(c) = \prod_i f(c_i)$ and $f(x \mid c) = \prod_i f(x_i \mid c_i)$. Classification algorithms apply directly.


Hidden Markov chain model

$c$ follows a Markov chain structure: $f(c_i \mid c_{-i}) = f(c_i \mid c_{i-1}, c_{i+1})$. Training/inference: EM algorithms.


More general hidden Markov model

$c$ has a larger neighborhood structure: $f(c_i \mid c_{-i}) = f(c_i \mid c_{i-2}, c_{i-1}, c_{i+1}, c_{i+2})$. Training/inference: EM algorithms.


Conditional random field

$c$ has a Markov structure globally conditioned on $x$: $f(c_i \mid x, c_{-i}) = f(c_i \mid x, c_{N_i})$. We usually assume that the potentials (transition and state functions) take some special form. Inference: belief propagation algorithms.


Graphical model in the image domain


Independent model for homogeneous image regions: simple classifiers for pixel intensities.

(Diagram: image pixels, each with its own hidden state.)


Graphical model in the image domain


Markov random field for noisy images or texture images: adds a prior coupling the hidden states of neighboring pixels.

(Diagram: image pixels and their hidden states.)


Graphical model in the image domain


Conditional random field for more complicated appearance.

(Diagram: image pixels and their hidden states.)


Graphical model in the image domain


Hidden random field for image regions with different parts.

(Diagram: image pixels with hidden states for parts and hidden states for regions.)


What is an appropriate model?

Trade-off between accuracy and complexity. Small sample size. Overfitting. ...


Hidden Markov trees for wavelet coefficients

Residual dependency structure (nested structure): the secondary properties. A model that reflects these properties would be appropriate: flexible, but not too complicated. A nested multiscale graph model (a tree, to be specific):


Independent mixture model for wavelet coefficients


A mixture model provides an appropriate approximation for non-Gaussian real-world signals.


Hidden Markov chain model for wavelet coefficients


Hidden Markov chain at the same scale:


Hidden Markov tree for wavelet coefficients


Dependence across scales according to the secondary properties of wavelet coefficients:


Hidden Markov tree for images


Probabilities in hidden Markov trees

For a single wavelet coefficient, since real-world signals are generally non-Gaussian, we model it with a mixture model:
$$f(w) = \sum_m f(w \mid c = m)\, f(c = m).$$
The hidden states can be tied together as an independent mixture model, a hidden Markov chain model, or a hidden Markov tree model.
For the tree root $c_0$: $f(c_0)$; for tree nodes other than the root, a transition probability $f(c_i \mid c_{\rho(i)})$, where $\rho(i)$ denotes the parent node of $i$.
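A minimal generative sketch of such a two-state hidden Markov tree on a perfect binary tree (assuming NumPy; the parameter values are hypothetical, and the mixture means are set to zero for simplicity), showing the root distribution, the parent-to-child transitions, and the per-state Gaussians:

```python
import numpy as np

rng = np.random.default_rng(0)

# State 0 = "small" (low variance), state 1 = "large" (high variance).
root_prob = np.array([0.7, 0.3])              # f(c_0)
trans = np.array([[0.9, 0.1],                 # f(c_i = n | c_parent = m), rows indexed by parent state
                  [0.2, 0.8]])                # persistence: large parents tend to have large children
sigmas = np.array([0.3, 2.0])                 # std of the zero-mean Gaussian for each state

def sample_hmt(depth):
    """Sample hidden states and wavelet coefficients on a perfect binary tree.

    Nodes are stored level by level in an array of length 2^depth - 1;
    the children of node i are 2*i + 1 and 2*i + 2 (root is node 0).
    """
    n = 2 ** depth - 1
    states = np.empty(n, dtype=int)
    states[0] = rng.choice(2, p=root_prob)
    for i in range(1, n):
        states[i] = rng.choice(2, p=trans[states[(i - 1) // 2]])
    coeffs = rng.normal(0.0, sigmas[states])  # w_i | c_i ~ N(0, sigma_{c_i}^2)
    return states, coeffs

states, w = sample_hmt(depth=4)
print(states)
print(np.round(w, 2))
```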


Parameters in HMT

$f(c_0)$, $f(c_i \mid c_{\rho(i)})$; mixture means and variances: $\mu_{i,m}$, $\sigma^2_{i,m}$. Notice the conditional independence properties of the model.


Problems of HMT

As with all graphical models, we need to solve:


Training the model; Computing the likelihood with the given observations; Estimating the latent/hidden states.


Expectation-Maximization algorithm

General setting for estimation: $\max_\theta f(x \mid \theta)$ (ML) or $\max_\theta f(\theta \mid x) \propto \max_\theta f(x \mid \theta)\,f(\theta)$ (MAP). The EM algorithm provides a greedy, iterative way to solve this estimation problem by means of the hidden/latent variables $c$:
$$\log f(x \mid \theta) = \log f(x, c \mid \theta) - \log f(c \mid x, \theta).$$
Since the algorithm is iterative, we take the expectation with respect to $c$ under the previously estimated parameters $\theta^{k-1}$:
$$\int \log f(x \mid \theta)\, f(c \mid x, \theta^{k-1})\, dc = \int \log f(x, c \mid \theta)\, f(c \mid x, \theta^{k-1})\, dc - \int \log f(c \mid x, \theta)\, f(c \mid x, \theta^{k-1})\, dc$$


Expectation-Maximization algorithm

By Jensen's inequality,
$$\int \log f(c \mid x, \theta)\, f(c \mid x, \theta^{k-1})\, dc \;\le\; \int \log f(c \mid x, \theta^{k-1})\, f(c \mid x, \theta^{k-1})\, dc.$$

To guarantee an increase of the likelihood $\log f(x \mid \theta)$, we therefore only need to solve
$$\theta^{k} = \arg\max_{\theta} \int \log f(x, c \mid \theta)\, f(c \mid x, \theta^{k-1})\, dc.$$

Hence the E-step computes $f(c \mid x, \theta^{k-1})$, and the M-step solves the above optimization problem.
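A minimal sketch of these E- and M-steps for the simplest case in this talk, the independent two-state, zero-mean Gaussian mixture for wavelet coefficients (no tree coupling), assuming NumPy and using synthetic data:

```python
import numpy as np

def gaussian(w, var):
    return np.exp(-0.5 * w ** 2 / var) / np.sqrt(2 * np.pi * var)

def em_indep_mixture(w, n_iter=50):
    """EM for f(w) = sum_m f(c = m) N(w; 0, sigma_m^2) with two zero-mean states."""
    p = np.array([0.5, 0.5])                              # f(c = m)
    var = np.array([0.5 * np.var(w), 2.0 * np.var(w)])    # initial small/large variances
    for _ in range(n_iter):
        # E-step: posteriors f(c_i = m | w_i, theta^{k-1})
        lik = np.stack([p[m] * gaussian(w, var[m]) for m in range(2)], axis=1)
        post = lik / lik.sum(axis=1, keepdims=True)
        # M-step: maximize the expected complete-data log-likelihood
        p = post.mean(axis=0)
        var = (post * w[:, None] ** 2).sum(axis=0) / post.sum(axis=0)
    return p, var

rng = np.random.default_rng(1)
large = rng.random(5000) < 0.3
w = rng.normal(0.0, np.where(large, 2.0, 0.3))
print(em_indep_mixture(w))   # roughly recovers p ~ (0.7, 0.3) and var ~ (0.09, 4.0)
```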


Training hidden Markov trees with EM

In the HMT, $\theta = \{f(c_0),\ f(c_i \mid c_{\rho(i)}),\ \mu_{i,m},\ \sigma^2_{i,m}\}$, where $i$ indexes the wavelet coefficients and $m$ the mixture components. We update via the analogous equation
$$\theta^{k} = \arg\max_{\theta} \int \log f(w, c \mid \theta)\, f(c \mid w, \theta^{k-1})\, dc.$$
We need several tricks to complete the EM algorithm here, since the posterior $f(c \mid w, \theta^{k-1})$ has no easy direct form.


Training hidden Markov trees with EM

The main task is to estimate the marginal state distribution $f(c_i = m \mid w, \theta)$ and the parent-child joint distribution $f(c_i = m, c_{\rho(i)} = n \mid w, \theta)$. Based on the conditional independence properties of the HMT (writing $w_{T_i}$ for the coefficients in the subtree rooted at $i$), we can write
$$f(c_i = m, w \mid \theta) = f(w_{T_i} \mid w_{T \setminus T_i}, c_i = m, \theta)\, f(c_i = m, w_{T \setminus T_i} \mid \theta) = f(w_{T_i} \mid c_i = m, \theta)\, f(c_i = m, w_{T \setminus T_i} \mid \theta) = \beta_i(m)\,\alpha_i(m),$$
and similarly
$$f(c_i = m, c_{\rho(i)} = n, w \mid \theta) = \beta_i(m)\, f(c_i = m \mid c_{\rho(i)} = n)\, \alpha_{\rho(i)}(n)\, \beta_{\rho(i) \setminus i}(n).$$
Since $f(w \mid \theta) = \sum_m f(c_i = m, w \mid \theta) = \sum_m \beta_i(m)\,\alpha_i(m)$, all of these distributions are expressed in terms of the $\alpha$'s and $\beta$'s.


Training hidden Markov trees with EM

For the computation, we follow the downward algorithm (from coarse to fine levels) to compute the $\alpha$'s and the upward algorithm (from fine to coarse levels) to compute the $\beta$'s, as described in the paper. The M-step reduces to conditional means thanks to the Gaussian assumption. Note the tricks for handling $K$ observed trees and parameter tying.
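Not the paper's upward-downward recursions, but a brute-force sketch (assuming NumPy; all parameter values hypothetical) that computes the same E-step posteriors $f(c_i = m \mid w, \theta)$ by enumerating every state configuration of a tiny three-node tree; this is useful as a correctness check for an upward-downward implementation:

```python
import numpy as np
from itertools import product

# Tiny tree: root 0 with children 1 and 2; two zero-mean Gaussian states per node.
root_prob = np.array([0.7, 0.3])                 # f(c_0)
trans = np.array([[0.9, 0.1], [0.2, 0.8]])       # f(c_child = n | c_parent = m)
var = np.array([0.3 ** 2, 2.0 ** 2])             # per-state variances
parent = {1: 0, 2: 0}
w = np.array([0.1, -2.5, 0.2])                   # observed wavelet coefficients

def gauss(x, v):
    return np.exp(-0.5 * x ** 2 / v) / np.sqrt(2 * np.pi * v)

# Enumerate all state configurations c and accumulate f(c, w | theta).
joint = {}
for c in product(range(2), repeat=3):
    p = root_prob[c[0]]
    for i in (1, 2):
        p *= trans[c[parent[i]], c[i]]
    for i in range(3):
        p *= gauss(w[i], var[c[i]])
    joint[c] = p

evidence = sum(joint.values())                    # f(w | theta)

# Posterior marginals f(c_i = m | w, theta): exactly the E-step quantities.
for i in range(3):
    post = [sum(p for c, p in joint.items() if c[i] == m) / evidence for m in range(2)]
    print(f"node {i}: P(small) = {post[0]:.3f}, P(large) = {post[1]:.3f}")
```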


Coming back to the denoising problem ...

With the EM-trained parameters, including $f(c_i = m \mid w, \theta)$, the state variances $\sigma^2_{i,m}$, and the noise variance $\sigma^2_n$, estimating the signal reduces to the conditional mean estimate
$$E(y_i \mid w, \theta) = \sum_m f(c_i = m \mid w, \theta)\, \frac{\sigma^2_{i,m}}{\sigma^2_{i,m} + \sigma^2_n}\, w_i.$$
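A direct sketch of this conditional-mean (Wiener-like) shrinkage, assuming NumPy and that the state posteriors and variances come from a trained model; the numbers below are made up:

```python
import numpy as np

def hmt_denoise(w, post, signal_var, noise_var):
    """Conditional-mean estimate E(y_i | w, theta) from per-coefficient state posteriors.

    w          : (N,)   noisy wavelet coefficients
    post       : (N, M) posteriors f(c_i = m | w, theta)
    signal_var : (N, M) signal variances sigma^2_{i,m}
    noise_var  : scalar noise variance sigma^2_n
    """
    shrink = signal_var / (signal_var + noise_var)     # per-state Wiener gain
    return (post * shrink).sum(axis=1) * w

# Toy usage: coefficient 0 is probably "small" (shrunk hard), coefficient 1 "large" (mostly kept).
w = np.array([0.05, -3.1])
post = np.array([[0.95, 0.05], [0.02, 0.98]])
signal_var = np.tile([0.1, 4.0], (2, 1))
print(hmt_denoise(w, post, signal_var, noise_var=0.25))
```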


Image segmentation

2D hidden Markov trees: a similar setting to the 1D signal model. Differences:

Subband independence: $f(w \mid \theta) = f(w^{LH} \mid \theta^{LH})\, f(w^{HL} \mid \theta^{HL})\, f(w^{HH} \mid \theta^{HH})$ (scaling); leads to a different expansion of the $\alpha$'s and $\beta$'s.
Context-based interscale fusion: the prior $f(c)$ uses a context vector.
A different EM algorithm.

(Example figures: image segmentation results.)

Extended hidden Markov trees

Geometric hidden Markov trees:


Modeling contours explicitly. Hidden state space: $c_i = \{d_m, \theta_m\}$.
New conditional distribution of the wavelet coefficients: $f(w_i \mid c_i) \propto \exp\!\left(-\mathrm{dist}(w_i, e_m)^2 / (2\sigma_g^2)\right)$, where $e_m$ is the response for an edge with fixed distance $d_m$ and angle $\theta_m$ (filter banks).
New transition probability: $f(n \mid m) \propto \exp\!\left(-HD(l_m, l_n)\right)$, where $HD(l_m, l_n)$ is the Hausdorff distance between the lines determined by distance and angle, restricted to a square in the plane.


Take home message

Know the available tools; do not force one tool onto every problem; have the right model and appropriate assumptions; work hard to find the simplest (most elegant) solution.
