Compression
4. Introduction to Wavelet
Topics
- Introduction to Image Compression
- Transform Coding
- Subband Coding, Filter Banks
- Introduction to Wavelet Transform
- Haar, SPIHT, EZW
- Motion Compensation
- Wireless Video Compression
#2
Contents
- History of Wavelet
- From Fourier Transform to Wavelet Transform
- Haar Wavelet
- Multiresolution Analysis
- General Wavelet Transform
- EZW
- SPIHT
#3
Wavelet Definition
The wavelet transform is a tool that cuts up data,
functions or operators into different frequency
components, and then studies each component
with a resolution matched to its scale
---- Dr. Ingrid Daubechies, Lucent, Princeton U
#4
#5
#6
Introduction to Wavelets
#7
First wavelet.
#8
- 1985 Meyer
- 1987 Mallat
- 1988 I. Daubechies
Compression is PREDICTION.
There are many decomposition approaches to modeling the signal.
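[Figure: model-based compression. Encoder side: the source messages and a model (probability distribution, probability estimates) feed the encoder. The compressed bit stream passes through the transmission system. Decoder side: the decoder uses the same model (probability distribution, probability estimates) to recover the original source messages.]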
#10
- Sequence of samples
  - Pyramid (hierarchical)
  - Polynomial
  - Piecewise polynomials
- Fourier Transform
  - Time domain
  - Frequency domain
  - Sinusoids of various frequencies
- Wavelet Transform
  - Time/frequency domain
#11
tones, vibrations.
#13
#16
Uncertainty Principle
--- Preliminaries for the STFT
#17
Uncertainty Principle
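Uncertainty Principle
-- In Signal Processing
Let g(t) be a function with the property $\int |g(t)|^2\, dt = 1$. Then
$$\left( \int (t - t_m)^2\, |g(t)|^2\, dt \right) \left( \int (f - f_m)^2\, |G(f)|^2\, df \right) \ge \frac{1}{16\pi^2}$$
where G(f) is the Fourier transform of g(t), and $t_m$ and $f_m$ are the mean time and mean frequency.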
Gabor's Proposal: Short-time Fourier Transform (STFT)
The STFT takes a non-stationary signal, breaks it down into windows of a specified short duration, and applies the Fourier transform to each window, treating the signal as if it consisted of that window repeated over time.
[Figure: STFT of a signal; axes: time, frequency, amplitude]
#20
#21
Analysis:
$$STFT(\tau, u) = \int f(t)\, w^*(t - \tau)\, e^{-j 2\pi u t}\, dt$$
Synthesis:
$$f(t) = \iint STFT(\tau, u)\, w(t - \tau)\, e^{j 2\pi u t}\, d\tau\, du$$
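The analysis formula can be sketched in discrete form as one DFT per window position. This is only an illustration: the rectangular window, the hop size, and the two-tone test signal are assumptions chosen for the example, not taken from the slides.

```python
import cmath
import math

def stft(signal, window, hop):
    """Discrete STFT sketch: one DFT per window position.

    Returns a list of frames; frame[k] is the coefficient at frequency bin k.
    """
    n = len(window)
    frames = []
    for start in range(0, len(signal) - n + 1, hop):
        # Multiply the signal by the window shifted to `start`.
        segment = [signal[start + i] * window[i] for i in range(n)]
        frame = [
            sum(segment[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
            for k in range(n)
        ]
        frames.append(frame)
    return frames

def peak_bin(frame):
    """Index of the strongest bin among the non-negative frequencies."""
    half = frame[: len(frame) // 2]
    return max(range(len(half)), key=lambda k: abs(half[k]))

# Hypothetical non-stationary signal: a low tone followed by a higher tone.
N = 16
low = [math.cos(2 * math.pi * 2 * i / N) for i in range(64)]
high = [math.cos(2 * math.pi * 5 * i / N) for i in range(64)]
window = [1.0] * N  # rectangular window (an assumption)

frames = stft(low + high, window, hop=N)
peaks = [peak_bin(f) for f in frames]
```

Each frame reports which frequency dominates inside its window, so `peaks` shows the change from bin 2 to bin 5 halfway through the signal, which a single Fourier transform of the whole signal would not localize in time.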
Basic idea:
#24
Step #2 (resolution 2)
Step #3 (resolution 1)
L: {4.5} - average
H: {-2} - detail
Transmit {4.5, -2, -1, -1, -0.5, -0.5, -0.5, -0.5}. No information has been lost or gained by this process. We can reconstruct the original sequence from this vector by adding and subtracting the detail coefficients. The vector has 8 values, as in the original sequence, but except for the first coefficient, all have small magnitudes. This vector is called the wavelet transform or the wavelet decomposition of the original sequence. As we will see soon, we have used the Haar basis for this one-dimensional transform.
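A minimal sketch of the recursive averaging and differencing follows. The input sequence 1, 2, ..., 8 is an assumption: it is what the transmitted vector reconstructs to when the steps are inverted, since the original sequence is not shown in the text. The function names are hypothetical.

```python
def haar_average_difference(seq):
    """Unnormalized Haar decomposition by recursive averaging and differencing.

    The length of seq is assumed to be a power of two.
    """
    approx = list(seq)
    details = []
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        level_details = [(a - b) / 2 for a, b in pairs]
        approx = [(a + b) / 2 for a, b in pairs]
        details = level_details + details  # coarser details go in front
    return approx + details

def haar_reconstruct(coeffs):
    """Invert the decomposition by adding and subtracting the details."""
    approx = coeffs[:1]
    pos = 1
    while pos < len(coeffs):
        level_details = coeffs[pos:pos + len(approx)]
        pos += len(approx)
        finer = []
        for a, d in zip(approx, level_details):
            finer += [a + d, a - d]
        approx = finer
    return approx

original = [1, 2, 3, 4, 5, 6, 7, 8]  # assumed input, inferred from the result
transformed = haar_average_difference(original)
restored = haar_reconstruct(transformed)
```

Here `transformed` equals the transmitted vector and `restored` equals the input, which illustrates that nothing is lost or gained.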
#25
The computation of the wavelet transform used recursive averaging and differencing of coefficients. It behaves like a filter bank.
Recursive application of a two-band filter bank to the lowpass band of the previous stage.
#26
#27
#28
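$V^0 \subset V^1 \subset V^2 \subset \cdots$
[Figure: the function f(x) = 2 on [0, 1), drawn once as a single piece and once as two pieces on [0, 0.5) and [0.5, 1). Same function!]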
#29
Scaling Function
[Figure: plots of the scaling function; axis ticks at 0, 0.5 and 1]
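Scaling Function
$$\phi(x) = \begin{cases} 1 & 0 \le x < 1 \\ 0 & \text{else} \end{cases}$$
- With $\phi_i^j(x) = \phi(2^j x - i)$, the support of $\phi_1^2(x)$ is [1/4, 1/2).
- Note that $\phi_0^0(x) = \phi(x)$.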
#31
Scaling Function
$$\phi_i^j(x) = \phi(2^j x - i), \quad i = 0, 1, \ldots, 2^j - 1$$
[Figure: plots of the scaling functions $\phi_i^j$]
#32
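Scaling Functions
$$f(x) = c_0^2 \phi_0^2(x) + c_1^2 \phi_1^2(x) + c_2^2 \phi_2^2(x) + c_3^2 \phi_3^2(x)$$
where $c_0^2 = 9$, $c_1^2 = 7$, $c_2^2 = 3$, $c_3^2 = 5$.
[Figure: f(x) drawn as 9 * $\phi_0^2$ + 7 * $\phi_1^2$ + 3 * $\phi_2^2$ + 5 * $\phi_3^2$]
In general, any function with resolution $2^j$ can be written as:
$$f(x) = \sum_{i=0}^{2^j - 1} c_i^j \phi_i^j(x)$$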
#33
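$$f(x) = 9\, \phi_0^2(x) + 7\, \phi_1^2(x) + 3\, \phi_2^2(x) + 5\, \phi_3^2(x)$$
$$f(x) = 8\, \phi_0^1(x) + 4\, \phi_1^1(x) + 1\, \psi_0^1(x) - 1\, \psi_1^1(x)$$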
#34
#35
Haar Wavelet
$$\psi_i^j(x) = \psi(2^j x - i), \quad i = 0, 1, \ldots, 2^j - 1$$
where
$$\psi(x) = \begin{cases} 1 & 0 \le x < 0.5 \\ -1 & 0.5 \le x < 1 \\ 0 & \text{else} \end{cases}$$
#36
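1D Haar Wavelet
$$\psi_{0,0}(t) = \begin{cases} 1 & 0 \le t < \tfrac{1}{2} \\ -1 & \tfrac{1}{2} \le t < 1 \end{cases}$$
$$\psi_{1,0}(t) = \psi(2t), \quad \psi_{1,1}(t) = \psi(2t - 1)$$
$$\psi_{2,0}(t) = \psi(4t), \quad \psi_{2,1}(t) = \psi(4t - 1), \quad \psi_{2,2}(t) = \psi(4t - 2)$$
[Figure: plot of each wavelet, taking the values 1 and -1 on its support]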
#37
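$$f(x) = c_0^1 \phi_0^1(x) + c_1^1 \phi_1^1(x) + d_0^1 \psi_0^1(x) + d_1^1 \psi_1^1(x)$$
where $c_0^1 = 8$, $c_1^1 = 4$, $d_0^1 = 1$, $d_1^1 = -1$.
[Figure: f(x) drawn as 8 * $\phi_0^1$ + 4 * $\phi_1^1$ + 1 * $\psi_0^1$ - 1 * $\psi_1^1$]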
#38
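$$f(x) = c_0^0 \phi_0^0(x) + d_0^0 \psi_0^0(x) + d_0^1 \psi_0^1(x) + d_1^1 \psi_1^1(x)$$
where $c_0^0 = 6$, $d_0^0 = 2$, $d_0^1 = 1$, $d_1^1 = -1$.
[Figure: f(x) drawn as 6 * $\phi_0^0$ + 2 * $\psi_0^0$ + 1 * $\psi_0^1$ - 1 * $\psi_1^1$]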
#40
For example, [6, 2, 1, -1] becomes $\left[6,\ 2,\ \tfrac{1}{\sqrt{2}},\ -\tfrac{1}{\sqrt{2}}\right]$.
#41
Procedure DecompositionStep(c: array[1..2^j])
    for i = 1 to 2^(j-1) do
        c'[i] = (c[2i-1] + c[2i]) / sqrt(2)
        c'[2^(j-1) + i] = (c[2i-1] - c[2i]) / sqrt(2)
    end for
    c = c'
End procedure
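The procedure can be sketched in Python as below. The outer loop `decomposition`, including its initial division by the square root of the length, is an assumption about how the step is driven; the slide shows only the single step. Indices are zero-based here.

```python
import math

def decomposition_step(c, h):
    """One normalized Haar step on the first h entries of c (h even).

    Writes into a copy, which plays the role of the temporary array c'.
    """
    half = h // 2
    out = list(c)
    for i in range(half):
        out[i] = (c[2 * i] + c[2 * i + 1]) / math.sqrt(2)
        out[half + i] = (c[2 * i] - c[2 * i + 1]) / math.sqrt(2)
    return out

def decomposition(c):
    """Assumed driver: normalize, then repeat the step on the shrinking low-pass part."""
    h = len(c)  # assumed to be a power of two
    c = [x / math.sqrt(h) for x in c]
    while h > 1:
        c = decomposition_step(c, h)
        h //= 2
    return c

coeffs = decomposition([9, 7, 3, 5])
```

For the input [9, 7, 3, 5] this yields approximately [6, 2, 0.707, -0.707], the normalized counterpart of the coefficients [6, 2, 1, -1].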
#43
Try it yourself: Work out the reconstruction procedure.
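One possible answer to the exercise, as a sketch: each step undoes a normalized averaging and differencing step. The function names and the final rescaling by the square root of the length are assumptions that mirror a decomposition which first divides the input by that factor.

```python
import math

def reconstruction_step(c, h):
    """Invert one normalized Haar step on the first h entries of c."""
    half = h // 2
    out = list(c)
    for i in range(half):
        out[2 * i] = (c[i] + c[half + i]) / math.sqrt(2)
        out[2 * i + 1] = (c[i] - c[half + i]) / math.sqrt(2)
    return out

def reconstruction(c):
    """Assumed driver: apply the inverse step from the coarsest level up, then undo the normalization."""
    c = list(c)
    h = 2
    while h <= len(c):
        c = reconstruction_step(c, h)
        h *= 2
    return [x * math.sqrt(len(c)) for x in c]

signal = reconstruction([6, 2, 1 / math.sqrt(2), -1 / math.sqrt(2)])
```

With the normalized coefficients [6, 2, 1/sqrt(2), -1/sqrt(2)] the sketch returns [9, 7, 3, 5] up to floating-point rounding.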
#44
Standard Decomposition
Nonstandard Decomposition
Reconstruction
#46
$$\phi\phi_{0,0}^0(x, y) := \phi\phi(x, y)$$
$$\phi\psi_{kl}^j(x, y) := 2^j\, \phi(2^j x - k)\, \psi(2^j y - l)$$
$$\psi\phi_{kl}^j(x, y) := 2^j\, \psi(2^j x - k)\, \phi(2^j y - l)$$
$$\psi\psi_{kl}^j(x, y) := 2^j\, \psi(2^j x - k)\, \psi(2^j y - l)$$
#48
#49
O(bn)
$= (1 - a^{k+1})/(1 - a)$
#51
Multi-resolution Analysis
[Figure: a signal and its approximations at h = 1, 2, 4 and 8]
#54
$$f^j(x) = \sum_{i=0}^{2^j - 1} c_i^j \phi_i^j(x)$$
where the $\phi_i^j$ form a basis for $V^j$.
#55
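$$f^j(x) = \sum_{i=0}^{2^{j-1}-1} c_i^{j-1} \phi_i^{j-1}(x) + \sum_{i=0}^{2^{j-1}-1} d_i^{j-1} \psi_i^{j-1}(x)$$
where
$$c_i^{j-1} = \frac{c_{2i}^j + c_{2i+1}^j}{2} \quad \text{and} \quad d_i^{j-1} = \frac{c_{2i}^j - c_{2i+1}^j}{2}$$
Applying the same step to the first sum:
$$\sum_{i=0}^{2^{j-1}-1} c_i^{j-1} \phi_i^{j-1}(x) = \sum_{i=0}^{2^{j-2}-1} c_i^{j-2} \phi_i^{j-2}(x) + \sum_{i=0}^{2^{j-2}-1} d_i^{j-2} \psi_i^{j-2}(x)$$
where
$$c_i^{j-2} = \frac{c_{2i}^{j-1} + c_{2i+1}^{j-1}}{2} \quad \text{and} \quad d_i^{j-2} = \frac{c_{2i}^{j-1} - c_{2i+1}^{j-1}}{2}$$
So, we have:
$$V^j = V^{j-2} \oplus W^{j-2} \oplus W^{j-1}$$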
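As a worked instance of the averaging and differencing formulas, take the four coefficients 9, 7, 3, 5 used in the earlier slides (so j = 2):

```latex
\begin{align*}
c_0^1 &= \tfrac{9 + 7}{2} = 8, & c_1^1 &= \tfrac{3 + 5}{2} = 4, \\
d_0^1 &= \tfrac{9 - 7}{2} = 1, & d_1^1 &= \tfrac{3 - 5}{2} = -1, \\
c_0^0 &= \tfrac{8 + 4}{2} = 6, & d_0^0 &= \tfrac{8 - 4}{2} = 2.
\end{align*}
```

Collecting $c_0^0$, $d_0^0$, $d_0^1$, $d_1^1$ gives the coefficient vector [6, 2, 1, -1].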
#57
$$V^2 = V^0 \oplus W^0 \oplus W^1$$
#58
$$f(x) = c_0^0 \phi_0^0(x) + \sum_{k=0}^{j-1} \sum_{i=0}^{2^k - 1} d_i^k \psi_i^k(x)$$
#59
$$V^j = V^s \oplus W^s \oplus W^{s+1} \oplus \cdots \oplus W^{j-1}$$
[Figure: four views of a signal. Time Domain: amplitude vs. time. Frequency Domain: amplitude vs. frequency. STFT (Gabor): frequency vs. time. Wavelet Analysis: scale vs. time.]
#61
Linear transform.
All analysis functions $\psi_{s,\tau} = \psi\left(\frac{t - \tau}{s}\right)$ are shifts and dilations of the mother wavelet $\psi(t)$.
Time resolution and frequency resolution vary as a function of scale.
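Continuous wavelet transform (CWT)
Analysis
$$\psi_{s,\tau}(t) = \frac{1}{\sqrt{s}}\, \psi\left(\frac{t - \tau}{s}\right)$$
is a scaled and translated mother wavelet, where $s$ and $\tau$ are the scaling and translation parameters.
Let
$$WT(s, \tau) = \langle \psi_{s,\tau}, f(t) \rangle = \int \psi_{s,\tau}(t)\, f(t)\, dt = \frac{1}{\sqrt{s}} \int f(t)\, \psi\left(\frac{t - \tau}{s}\right) dt$$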
#63
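Synthesis
The inverse wavelet transform is given by
$$f(t) = \frac{1}{C_\psi} \iint WT(s, \tau)\, \psi_{s,\tau}(t)\, \frac{ds\, d\tau}{s^2}$$
where
$$C_\psi = \int \frac{|\Psi(w)|^2}{|w|}\, dw$$
and $\Psi(w)$ is the Fourier transform of the mother wavelet, subject to:
1) Wave condition: $\int \psi_{s,\tau}(t)\, dt = 0$ and $\int |\psi_{s,\tau}(t)|^2\, dt < \infty$
2) Admissibility condition: $C_\psi < \infty$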
Scale
Small scale
- Rapidly changing details
- Like high frequency
Large scale
- Slowly changing details
- Like low frequency
#65
More on Scale
#66
Shifting
Shifting a wavelet simply means delaying
(or hastening) its onset.
Mathematically, shifting a function f(t) by
k is represented by f(t-k).
#67
Haar
Battle-Lemarie
Meyer
Daubechies
Chui-Wang
#68
$$WT(s, \tau) = \frac{1}{\sqrt{s}} \int f(t)\, \psi\left(\frac{t - \tau}{s}\right) dt$$
#70
2D
3D
#71
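$$s = \frac{1}{s_0^m}, \qquad \tau = \frac{n \tau_0}{s_0^m}$$
So, with $s_0 = 2$, $\tau_0 = 1$:
$$s = \frac{1}{2^m} = \left\langle 1, \tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{8}, \ldots \right\rangle \quad (m\text{: integer}), \qquad \tau = \frac{n}{2^m} \quad (n, m\text{: integer})$$
$$\psi_{s,\tau}(t) = \frac{1}{\sqrt{s}}\, \psi\left(\frac{t - \tau}{s}\right) = 2^{m/2}\, \psi\left(\frac{t - n/2^m}{1/2^m}\right) = \psi_{m,n}(t) = 2^{m/2}\, \psi(2^m t - n)$$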
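To make the dyadic family concrete, the sketch below evaluates $\psi_{m,n}(t) = 2^{m/2} \psi(2^m t - n)$ with the Haar mother wavelet from the earlier slides and checks numerically that different members are orthogonal and each has unit energy. The helper `inner_product` and its midpoint-rule sampling are assumptions made for the illustration.

```python
def haar_mother(t):
    """Haar mother wavelet psi(t)."""
    if 0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1:
        return -1.0
    return 0.0

def psi_mn(m, n, t):
    """Dyadic wavelet psi_{m,n}(t) = 2^(m/2) * psi(2^m t - n)."""
    return 2 ** (m / 2) * haar_mother(2 ** m * t - n)

def inner_product(f, g, samples=1024):
    """Midpoint Riemann sum of f(t) * g(t) over [0, 1) (hypothetical helper)."""
    dt = 1.0 / samples
    return sum(f((k + 0.5) * dt) * g((k + 0.5) * dt) for k in range(samples)) * dt

energy = inner_product(lambda t: psi_mn(1, 0, t), lambda t: psi_mn(1, 0, t))
shifted = inner_product(lambda t: psi_mn(1, 0, t), lambda t: psi_mn(1, 1, t))
scaled = inner_product(lambda t: psi_mn(0, 0, t), lambda t: psi_mn(1, 0, t))
```

The factor $2^{m/2}$ is what keeps `energy` at 1 for every scale m; without it, narrower wavelets would carry less energy.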
#73
Wavelet Transform:
$$DWT_{m,n} = 2^{m/2} \int f(t)\, \psi(2^m t - n)\, dt$$
$$f(t) = \sum_m \sum_n DWT_{m,n}\, \psi_{m,n}(t) + c\, \phi(t)$$
#75
#76
#77
Conversion
- Convert gray scale to floating point.
- Convert color to Y U V and then convert each band to floating point. Compress separately.
#79
Example
Image Sample
139 144 149 153 155 155 155 155
144 151 153 156 159 156 156 156
150 155 160 163 158 156 156 156
159 161 162 160 160 159 159 159
159 160 161 162 162 155 155 155
161 161 161 161 160 157 157 157
162 162 161 163 162 157 157 157
162 162 161 161 163 158 158 158
#81
Example
Transformed coefficients
1259.6    0.1  -13.2    1.5   -6.0   -3.5    1.5    0.0
 -17.4  -12.9   -0.5    5.0   -3.5   -0.5    1.5    0.0
 -20.2    4.0   -3.2    0.0   -0.5   -0.5    5.0    0.0
  -2.0   -3.0    1.5    0.0    0.0   -1.0    5.0    0.0
  -6.0   -3.5   -2.5   -1.0    1.0   -0.5   -1.5    0.0
  -7.5    0.5   -2.5   -3.0   -1.5   -2.5    0.5    0.0
  -1.5    0.5    0.0   -2.0   -0.5   -0.5    2.0    0.0
   0.0    1.0    1.0   -1.0    0.0   -1.0    0.0    0.0
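The coefficients appear consistent with a nonstandard 2D Haar decomposition using orthonormal steps (one row step, one column step, then recursion on the low-pass quadrant). That reading is inferred from the numbers, not stated on the slide, so the sketch below is an assumption checked against a few entries only.

```python
IMAGE = [
    [139, 144, 149, 153, 155, 155, 155, 155],
    [144, 151, 153, 156, 159, 156, 156, 156],
    [150, 155, 160, 163, 158, 156, 156, 156],
    [159, 161, 162, 160, 160, 159, 159, 159],
    [159, 160, 161, 162, 162, 155, 155, 155],
    [161, 161, 161, 161, 160, 157, 157, 157],
    [162, 162, 161, 163, 162, 157, 157, 157],
    [162, 162, 161, 161, 163, 158, 158, 158],
]

def haar_step_rows(a, size):
    """One normalized Haar step along the rows of the top-left size x size block."""
    s = 2 ** 0.5
    half = size // 2
    for r in range(size):
        row = a[r][:size]
        for i in range(half):
            a[r][i] = (row[2 * i] + row[2 * i + 1]) / s         # average
            a[r][half + i] = (row[2 * i] - row[2 * i + 1]) / s  # detail

def transpose(a):
    return [list(col) for col in zip(*a)]

def nonstandard_haar_2d(image):
    """Alternate row and column steps, recursing on the low-pass quadrant."""
    a = [[float(v) for v in row] for row in image]
    size = len(a)  # assumed square with power-of-two side
    while size > 1:
        haar_step_rows(a, size)
        a = transpose(a)
        haar_step_rows(a, size)  # acts on the columns of the original layout
        a = transpose(a)
        size //= 2
    return a

coeffs = nonstandard_haar_2d(IMAGE)
```

The top-left entry comes out as the pixel sum divided by 8 (1259.625), and the last column is zero because the last two image columns are identical.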
#82
Quantization (Threshold 9)
[Matrix: 9 * (quantized coefficients); entries shown: 140, -1, -1, -2, -1, -1, -1]
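A minimal sketch of uniform quantization with step 9 follows. The rounding rule (nearest integer, halves away from zero) is an assumption: it reproduces the leading value 140 from 1259.6, but the slide does not state the rule. The sample coefficients are taken from the transformed example.

```python
import math

def quantize(coeff, step):
    """Map a coefficient to an integer index: nearest multiple of step (assumed rule)."""
    q = math.floor(abs(coeff) / step + 0.5)
    return int(math.copysign(q, coeff)) if q else 0

def dequantize(index, step):
    """Decoder side: scale the index back by the step."""
    return index * step

step = 9
samples = [1259.6, -17.4, -20.2, 0.1, -12.9, 5.0]
indices = [quantize(c, step) for c in samples]
restored = [dequantize(q, step) for q in indices]
```

Small coefficients such as 0.1 collapse to zero, which is where the compression comes from; the price is the difference between `samples` and `restored`.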
#83
Recovered Image
138 147 152 152 156 156 156 156
147 156 152 152 156 156 156 156
152 152 161 161 156 156 156 156
161 161 161 161 156 156 156 156
161 161 161 161 165 156 156 156
161 161 161 161 165 156 156 156
161 161 161 161 165 156 156 156
161 161 161 161 165 156 156 156
#84
#85
#86
E= all below it 0s
#87
http://perso.wanadoo.fr/polyvalens/clemens/ezw/ezw.html
#88
#90
#92
#93
#94
The initial threshold is $t_0 = 2^{\lfloor \log_2(\mathrm{MAX}(|\gamma(x,y)|)) \rfloor}$. Here MAX() means the maximum coefficient value in the image and $\gamma(x,y)$ denotes the coefficient. With this threshold we enter the main coding loop.
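As a sketch, the initial threshold can be computed as the largest power of two not exceeding the maximum coefficient magnitude. This rule is an assumption; it agrees with the example later in the slides, where a maximum of 63 gives a threshold of 32. The function name is hypothetical.

```python
import math

def initial_threshold(coefficients):
    """Largest power of two that does not exceed the maximum |coefficient|.

    Assumes at least one coefficient has magnitude >= 1.
    """
    largest = max(abs(c) for row in coefficients for c in row)
    return 2 ** int(math.floor(math.log2(largest)))

# A few coefficients from the example (the full array is not needed here).
t0 = initial_threshold([[63, -34], [-31, 23]])
```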
#95
In the dominant_pass
- All the coefficients are scanned in a special order.
- If the coefficient is a zero tree root, it is encoded as ZTR. Its descendants don't need to be encoded; they will be reconstructed as zero at this threshold level.
- If the coefficient itself is insignificant but one of its descendants is significant, it is encoded as IZ (isolated zero).
- If the coefficient is significant, it is encoded as POS (positive) or NEG (negative), depending on its sign.
This encoding of the zero tree produces significant compression because gray level images resulting from natural sources typically result in DWTs with many ZTR symbols. Each ZTR indicates that no more bits are needed for encoding the descendants of the corresponding coefficient.
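The symbol rules above can be sketched as a small classifier. It assumes a coefficient counts as significant when its magnitude is at least the threshold, and it takes the descendants as a plain list instead of walking a real subband tree; the descendant lists in the usage lines are illustrative groupings of values from the example, not a confirmed tree layout.

```python
def dominant_symbol(value, descendants, threshold):
    """Return POS, NEG, IZ or ZTR for one coefficient.

    descendants: magnitudes (with sign) of every coefficient below this one
    in the tree; a flat list is an assumption made for this sketch.
    """
    if abs(value) >= threshold:
        return "POS" if value > 0 else "NEG"
    if any(abs(d) >= threshold for d in descendants):
        return "IZ"   # insignificant itself, but something below is significant
    return "ZTR"      # the whole tree below is insignificant

# Illustrative calls at threshold 32.
s1 = dominant_symbol(63, [], 32)
s2 = dominant_symbol(-34, [], 32)
s3 = dominant_symbol(-31, [15, 14, -9, -7, -1, 47, -3, 2], 32)
s4 = dominant_symbol(23, [3, -12, -14, 8], 32)
```

A single ZTR stands in for every coefficient in `descendants`, which is why long runs of small coefficients cost so few symbols.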
#96
At the end of the dominant pass, all the coefficients that are in absolute value larger than the current threshold are extracted and placed without their sign on the subordinate list, and their positions in the image are filled with zeroes. This prevents them from being coded again.
In the subordinate_pass
- All the values in the subordinate list are refined. This gives rise to some juggling with uncertainty intervals, and it outputs the next most significant bit of all the coefficients in the subordinate list.
#97
EZW An example(1)
EZW An example(2)
The initial threshold is 32 and the result from the
dominant_pass is shown in the figure
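[Figure: coefficient array with the symbols of the first dominant pass, listed in reading order; a symbol follows each coded coefficient]
63 POS, -34 NEG, 49 POS, 10 ZTR, 7 ZTR, 13 ZTR, -12, -31 IZ, 23 ZTR, 14 ZTR, -13 ZTR, 3 ZTR, 4 ZTR, -1, 15 ZTR, 14 IZ, -12, -7, -9 ZTR, -7 ZTR, -14, -2, -5, -1 ZTR, 47 POS, -2, -3 ZTR, 2 ZTR, -2, -3, -4, 11, -4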
#99
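/* Subordinate pass: output the next most significant bit of each listed coefficient. */
/* Each list entry holds the magnitude that remains after the bits already sent. */
subordinate_threshold = current_threshold/2;
for all elements on subordinate list do {
    if (coefficient >= subordinate_threshold) {
        output a one;
        coefficient = coefficient - subordinate_threshold;
    }
    else output a zero;
}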
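The subordinate pass can be sketched in runnable form as below. The bookkeeping is an assumption made for illustration: each list entry keeps the residual magnitude still to be sent and the lower bound the decoder already knows, and the decoder reconstructs at the midpoint of the remaining uncertainty interval. The magnitudes are the significant coefficients of the example.

```python
def new_entry(magnitude, threshold):
    """Entry for a coefficient found significant at `threshold` (hypothetical layout)."""
    return {"residual": magnitude - threshold, "lower": threshold}

def subordinate_pass(entries, threshold):
    """Refine every entry by one bit; return the bits and the midpoint reconstructions."""
    half = threshold // 2
    bits, recon = [], []
    for e in entries:
        if e["residual"] >= half:
            bits.append(1)
            e["residual"] -= half
            e["lower"] += half
        else:
            bits.append(0)
        # The value now lies in [lower, lower + half); reconstruct at the midpoint.
        recon.append(e["lower"] + half // 2)
    return bits, recon

# First pass: threshold 32, four significant magnitudes.
entries = [new_entry(m, 32) for m in (63, 34, 49, 47)]
bits1, recon1 = subordinate_pass(entries, 32)

# Second pass: threshold 16, two newly significant magnitudes join the list.
entries += [new_entry(m, 16) for m in (31, 23)]
bits2, recon2 = subordinate_pass(entries, 16)
```

Each pass halves the uncertainty interval of every listed coefficient, so the reconstructions move closer to the true magnitudes with every bit sent.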
#100
EZW An example(3)
The result from the dominant_pass is output as the following:
POS, NEG, IZ, ZTR, POS, ZTR, ZTR, ZTR, ZTR, IZ, ZTR, ZTR, ZTR, ZTR, ZTR, ZTR, ZTR, POS, ZTR, ZTR
ABS(Original data)        63  34  49  47
Output symbol              1   0   1   0
ABS(Reconstructed data)   56  40  56  40
#101
S Bit: If Data - threshold (T) is greater than or equal to 0.5T, the S bit is set to 1; otherwise, if it is less than 0.5T, it is set to 0.
Reconstruction Value: Create a binary number starting from the most significant bits whose values can be predicted with certainty using T and 0.5T only. For example, 63 is greater than T=32, so the most significant value bit is 1. Then 63-32=31 is greater than 16, so its S bit is 1 and the second value bit is also 1. Now, 32+16=48, so we can say with certainty that the number is at least 48. Then pick the midpoint between 48 and the next power of 2, which is 64. This gives 56 as the reconstructed value.
sign  32  16  8  4  2  1
0     1   1   ?  ?  ?  ?
The number 34 is greater than 32 but 34-32 is less than 16, so its S bit is 0; pick the midpoint between 32 and 48. Thus 34 is reconstructed as 40. Similarly, 49 and 47 are reconstructed as 56 and 40, with S bits 1 and 0, respectively. The actual signs of the numbers are captured in the POS and NEG symbols, so S is not a sign bit.
#102
EZW An example(4)
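[Figure: coefficient array with the symbols of the second dominant pass, listed in reading order; * marks a coefficient already found significant]
* IZ, * ZTR, 10, 13, -12, -31 NEG, 23 POS, 14, -13, -1, 15 ZTR, 14 ZTR, 3 ZTR, -12 ZTR, -7, -9 ZTR, -7 ZTR, -14 ZTR, 8 ZTR, -2, -5, -1, -2, -3, -2, -3, -4, 11, -4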
#103
EZW An example(5)
The result from the second dominant_pass is output as the following:
IZ, ZTR, NEG, POS, ZTR, ZTR, ZTR, ZTR, ZTR, ZTR, ZTR, ZTR
Data                      63-32  34-32  49-32  47-32  -31  23
ABS(Reconstructed data)      60     36     52     44   28  20
#104
EZW An example(6)
The process continues until the threshold is 1. The final output is:
D1: pnzt pttt tztt tttt tptt
S1: 1010
D2: ztnptttttttt
S2: 100110
D3: zzzzzppnppnttnnptpttnttttttttptttptttttttttptttttttttttt
S3: 10011101111011011000
D4: zzzzzzztztznzzzzpttptpptpnptntttttptpnpppptttttptptttpnp
S4: 11011111011001000001110110100010010101100
D5: zzzzztzzzzztpzzzttpttttnptppttptttnppnttttpnnpttpttppttt
S5: 10111100110100010111110101101100100000000110110110011000111
D6: zzzttztttztttttnnttt
Here p=POS, n=NEG, z=IZ, t=ZTR. The encoding used is: POS = 01, NEG = 11, ZTR = 00, IZ = 10.
For example, the output for 63 is:
sign  32  16  8  4  2  1
0     1   1   1  1  1  1
So 63 will be reconstructed as 32+16+8+4+2+1=63!
Note how progressive transmission can be done.
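The two-bit symbol encoding can be sketched as a lookup table applied to a dominant-pass string. The function names are hypothetical; the codes and the D1 string are the ones listed above.

```python
SYMBOL_BITS = {"p": "01", "n": "11", "t": "00", "z": "10"}
BITS_SYMBOL = {bits: sym for sym, bits in SYMBOL_BITS.items()}

def encode_dominant(symbols):
    """Turn a string of p/n/z/t symbols into a bit string, ignoring spaces."""
    return "".join(SYMBOL_BITS[s] for s in symbols if not s.isspace())

def decode_dominant(bits):
    """Inverse mapping: read the bit string two bits at a time."""
    return "".join(BITS_SYMBOL[bits[i:i + 2]] for i in range(0, len(bits), 2))

d1 = "pnzt pttt tztt tttt tptt"
d1_bits = encode_dominant(d1)
d1_back = decode_dominant(d1_bits)
```

Because every symbol costs exactly two bits, a decoder can stop after any complete pass and still reconstruct an approximation, which is what makes the transmission progressive.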
#105