Note: Part of the materials in the slides are from Gonzalez's Digital Image Processing and Prof. Yao Wang's lecture slides.
Lecture Outline
Introduction
Binary encoding
  Fixed length coding
  Variable length coding
    Huffman coding
    Arithmetic coding
Goal of compression
Given a bit rate, achieve the best quality.
Given an allowed distortion, minimize the data amount.
[Block diagram: encoder structure — Transformation, Quantization (Scalar Q / Vector Q), Binary Encoding.]
Motivation for transformation: to yield a more efficient representation of the original samples.
Binary Encoding
Binary encoding
To represent a finite set of symbols using binary codewords.
The minimum number of bits required to represent a source is bounded below by its entropy.
Entropy of a Source
Consider a source of $N$ symbols $r_n$, $n = 1, 2, \ldots, N$. Suppose the probability of symbol $r_n$ is $p_n$. The self-information of symbol $r_n$ is defined as:
$H_n = -\log_2 p_n$ (bits), $\quad H_n = -\ln p_n$ (nats), $\quad H_n = -\log_{10} p_n$ (dits)
The entropy of this source, which represents the average information, is defined as:
$H = -\sum_{n=1}^{N} p_n \log_2 p_n$ (bits)
with the convention $0 \log_2 0 = 0$, since $\lim_{x \to 0^+} x \log_2 x = 0$.
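As a quick illustration (not part of the original slides), a short Python sketch computes this entropy; the three-symbol alphabet and probabilities used here are the ones that appear later in the arithmetic decoding example (P(a) = 1/4, P(b) = 1/2, P(c) = 1/4):

```python
import math

def entropy(probs):
    """Entropy in bits/symbol, using the convention 0*log2(0) = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Probabilities of the three-symbol source used in the decoding example.
print(entropy([1/4, 1/2, 1/4]))  # 1.5 bits/symbol
```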
If the codeword lengths can be chosen as $l_n = -\log_2 p_n$ (i.e., all $p_n$ are powers of 1/2), the average codeword length equals the entropy:
$\bar{l} = \sum_n p_n l_n = -\sum_n p_n \log_2 p_n = H$
For an arbitrary source, a code can be designed so that $l_n = \lceil -\log_2 p_n \rceil$, and the average length then satisfies
$\bar{l} = \sum_n p_n l_n \ge -\sum_n p_n \log_2 p_n = H$, and since $l_n < -\log_2 p_n + 1$, also $\bar{l} < H + 1$.
$H \le \bar{l} < H + 1$
This is Shannon's source coding theorem, which gives the lower and upper bounds on the average codeword length of a variable length code. The theorem only gives the bounds; it does not say how to construct a code that achieves them.
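As a hedged illustration (the probabilities below are made up for the example, not taken from the slides), the bound can be checked numerically with the lengths $l_n = \lceil -\log_2 p_n \rceil$:

```python
import math

probs = [0.5, 0.25, 0.125, 0.125]            # an illustrative dyadic source
H = -sum(p * math.log2(p) for p in probs)    # entropy
lengths = [math.ceil(-math.log2(p)) for p in probs]
avg_len = sum(p * l for p, l in zip(probs, lengths))
print(H, avg_len)  # 1.75 1.75 -- equal here because all p_n are powers of 1/2
```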
Huffman Coding
Procedure of Huffman coding
Step 1: Arrange the symbol probabilities $p_n$ in decreasing order and consider them as leaf nodes of a tree.
Step 2: While there is more than one node:
  Find the two nodes with the smallest probabilities and arbitrarily assign 1 and 0 to these two nodes.
  Merge the two nodes to form a new node whose probability is the sum of the two merged nodes.
The codeword of each symbol is obtained by reading the assigned bits from the root down to the corresponding leaf.
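A minimal Python sketch of this procedure (not part of the slides); it uses a heap to repeatedly merge the two least probable nodes, and the three-symbol alphabet in the usage lines is only illustrative. Because ties are broken arbitrarily, individual codewords may differ from the trees on the following slides, but the average length is the same.

```python
import heapq

def huffman_code(probs):
    """Huffman code for a dict symbol -> probability; returns symbol -> bits."""
    # Heap entries: (probability, tie-breaker, partial codeword table).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)      # smallest probability -> prefix 0
        p1, i, c1 = heapq.heappop(heap)      # next smallest        -> prefix 1
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, i, merged))
    return heap[0][2]

probs = {"a": 0.5, "b": 0.25, "c": 0.25}     # illustrative source
code = huffman_code(probs)
avg_len = sum(probs[s] * len(w) for s, w in code.items())
print(code, avg_len)   # e.g. {'a': '0', 'b': '10', 'c': '11'}, 1.5 bits/symbol
```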
[Figure: Huffman coding tree for eight symbols, Section I through Section VIII; the merged node probabilities along the tree include .06, .08, .12, .15, .20, .27, .43, .57 and 1.00 at the root.]
[Figure: Huffman tree for a three-symbol source with probabilities 2/3, 1/6, 1/6 and codewords of lengths 1, 2, 2.]
$\bar{l} = \frac{2}{3}\cdot 1 + \frac{1}{6}\cdot 2 + \frac{1}{6}\cdot 2 = \frac{4}{3} \approx 1.33$ bits/symbol
[Figure: Huffman tree for pairs of symbols of the same source (second-order Huffman coding).]
$\bar{l} = \frac{1}{2}\left(\frac{4}{9}\cdot 1 + \frac{1}{9}\cdot 3 + \frac{1}{9}\cdot 3 + \frac{1}{9}\cdot 4 + \frac{1}{9}\cdot 4 + 4\cdot\frac{1}{36}\cdot 5\right) = \frac{1}{2}\cdot\frac{46}{18} \approx 1.27$ bits/symbol
Conditional Entropy
If (X,Y) ~ p(x,y), then the conditional entropy H(Y|X) is defined as
$H(Y \mid X) = \sum_x p(x) H(Y \mid X = x) = -\sum_x p(x) \sum_y p(y \mid x) \log_2 p(y \mid x) = -\sum_x \sum_y p(x,y) \log_2 p(y \mid x)$
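A short Python sketch (not from the slides) of this formula, applied to the Markov source of the example below; the uniform state probabilities p(x) = 1/3 are an assumption consistent with the result on the slide:

```python
import math

def conditional_entropy(p_x, p_y_given_x):
    """H(Y|X) = -sum_{x,y} p(x) p(y|x) log2 p(y|x)."""
    return -sum(px * pyx * math.log2(pyx)
                for x, px in p_x.items()
                for pyx in p_y_given_x[x].values() if pyx > 0)

symbols = "abc"
p_x = {s: 1/3 for s in symbols}                     # assumed uniform states
p_y_given_x = {x: {y: (1/2 if y == x else 1/4) for y in symbols}
               for x in symbols}
print(conditional_entropy(p_x, p_y_given_x))        # 1.5 bits/symbol
```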
$p\{a_i \mid a_j\} = \begin{cases} 1/2 & \text{if } i = j \\ 1/4 & \text{otherwise} \end{cases}$
[Figure: conditional Huffman trees, one per context $a_j$; e.g. given $a_1$, the conditional probabilities 1/2, 1/4, 1/4 of $a_1, a_2, a_3$ yield codewords 1, 01, 00, and similarly for contexts $a_2$ and $a_3$.]
Conditional entropy: $H = 3 \cdot \frac{1}{3} \cdot 1.5 = 1.5$ bits/symbol.
Average length of the conditional Huffman code: $\bar{l} = \frac{1}{2}\cdot 1 + 2\cdot\frac{1}{4}\cdot 2 = 1.5$ bits/symbol.
Arithmetic Coding
Represent each string x of length n by a unique interval [L, R) in [0, 1).
The width R − L of the interval [L, R) represents the probability of x occurring.
The interval [L, R) can itself be represented by any number, called a tag, within the half-open interval.
Find some k such that the k most significant bits of the tag, followed by zeros, lie in [L, R); that is, .t1t2t3...tk000... is in [L, R). Then t1t2t3...tk is the code for x.
[Figure: subdivision of [0, 1) at 1/3 and 2/3 into sub-intervals for a and b, refined to bb and then bba.]
1. The tag must be in the half-open interval [L, R).
2. The tag can be chosen to be (L+R)/2.
3. The code is the significant bits of the tag.
Example of Codes
P(a) = 1/3, P(b) = 2/3.
tag = (L+R)/2

string   interval          code
aaa      [0, 1/27)         0
aab      [1/27, 3/27)      0001
aba      [3/27, 5/27)      001
abb      [5/27, 9/27)      01
baa      [9/27, 11/27)     01011
bab      [11/27, 15/27)    0111
bba      [15/27, 19/27)    101
bbb      [19/27, 27/27)    11
Initialize L := 0 and R := 1;
for i = 1 to n do
    W := R - L;
    L := L + W * C(x_i);
    R := L + W * P(x_i);
t := (L + R) / 2;
choose the code bits for the tag t
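A Python sketch of this loop (not from the slides), using exact fractions; the alphabet P(a) = 1/4, P(b) = 1/2, P(c) = 1/4 with cumulative probabilities C matches the worked example that follows, and the bit-emission loop implements "choose the shortest code whose value falls in [L, R)":

```python
from fractions import Fraction

def arithmetic_encode(symbols, P, C):
    """Return the tag (L+R)/2 and the shortest bit string t1..tk with
    .t1..tk000... inside the final interval [L, R)."""
    L, R = Fraction(0), Fraction(1)
    for s in symbols:
        W = R - L
        L = L + W * C[s]
        R = L + W * P[s]
    tag = (L + R) / 2
    bits, value, step = "", Fraction(0), Fraction(1, 2)
    while not (L <= value < R):              # add tag bits until inside [L, R)
        bit = 1 if tag >= value + step else 0
        value += bit * step
        bits += str(bit)
        step /= 2
    return tag, bits

P = {"a": Fraction(1, 4), "b": Fraction(1, 2), "c": Fraction(1, 4)}
C = {"a": Fraction(0), "b": Fraction(1, 4), "c": Fraction(3, 4)}
print(arithmetic_encode("abca", P, C))       # (Fraction(41, 256), '00101')
```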
tag = (5/32 + 21/128)/2 = 41/256 = .001010010...
L = 5/32 = .001010000...
R = 21/128 = .001010100...
code = 00101
Decoding
Two symbols a and b, with p(a) = 1/3, p(b) = 2/3. Assume the string length is known to be 3. The code 0001 converts to the tag .0001000...
[Figure: decoding by successive refinement of [0, 1). The tag .0001000... lies in [0, .010101... = 1/3), so the first symbol is a; within that interval it lies in [.000010010... = 1/27, .000111000... = 1/9), so the decoded string is aab.]
Decoding Example
P(a) = 1/4, P(b) = 1/2, P(c) = 1/4; C(a) = 0, C(b) = 1/4, C(c) = 3/4. The received code is 00101 and the number of symbols is 4.
tag = .00101000... = 5/32

L        R        W       output
0        1        1
0        1/4      1/4     a
1/16     3/16     1/8     b
5/32     6/32     1/32    c
5/32     21/128           a
W := R - L; find j such that L + W * C(a_j) <= t < L + W * (C(a_j) + P(a_j)); output a_j and update L := L + W * C(a_j), R := L + W * P(a_j).
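A matching Python sketch of this search (not from the slides), reusing the fractional alphabet from the encoder sketch above:

```python
from fractions import Fraction

def arithmetic_decode(bits, n, P, C):
    """Decode n symbols from the tag .bits000..., using the rule above."""
    tag = sum(Fraction(int(b), 2 ** (i + 1)) for i, b in enumerate(bits))
    L, R = Fraction(0), Fraction(1)
    out = []
    for _ in range(n):
        W = R - L
        for s in P:                  # find the sub-interval holding the tag
            lo = L + W * C[s]
            hi = lo + W * P[s]
            if lo <= tag < hi:
                out.append(s)
                L, R = lo, hi
                break
    return "".join(out)

P = {"a": Fraction(1, 4), "b": Fraction(1, 2), "c": Fraction(1, 4)}
C = {"a": Fraction(0), "b": Fraction(1, 4), "c": Fraction(3, 4)}
print(arithmetic_decode("00101", 4, P, C))   # 'abca'
```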
Significance of Compression for Facsimile
For an A4-size document (digitized to 1728 x 2376 pixels):
Without compression
Group 1 Fax: 6 min/page Group 2 Fax: 3 min/page
With compression
Group 3 Fax: 1 min/page
Fax became popular only after the transmission time was cut to below 1 min/page.
2D Runlength Coding:
Use the relative address from the last transition in the line above.
Used in facsimile coding (G3, G4).
[Figure: example of relative address coding on a bi-level image; transitions are marked and each is coded by its distance to the reference transition in the current or previous line.]
c is the current transition, e is the last transition in the same line, and c' is the first similar transition past e in the previous line. If the distance ec <= cc', code d = ec; if cc' < ec, code d = cc'.
CCITT Group 3 and Group 4 Facsimile Coding Standard: the READ (Relative Element Address Designate) Code
The first line in every K lines is coded using 1D runlength coding, and the following K-1 lines are coded using 2D runlength coding.
The reason 1D RLC is used for every Kth line is to suppress propagation of transmission errors.
The Group 4 method is designed for transmission over more reliable channels, such as leased data lines where the bit error rate is very low.
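As an aside (not part of the slides), a minimal Python sketch of the 1D run-length representation of a single bi-level scan line, following the white-first / EOL convention used in the homework:

```python
def run_lengths(line):
    """1D run-length representation of a bi-level line (0 = white, 1 = black).

    The line is assumed to start with a white run, so a zero-length white
    run is emitted if the first pixel is black.
    """
    runs, current, count = [], 0, 0
    for pixel in line:
        if pixel == current:
            count += 1
        else:
            runs.append(count)
            current, count = pixel, 1
    runs.append(count)
    return runs + ["EOL"]

print(run_lengths([0, 0, 1, 1, 1, 0, 1]))  # [2, 3, 1, 1, 'EOL']
```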
Predictive Coding
Motivation
The value of the current pixel usually does not change rapidly from those of adjacent pixels, so it can be predicted quite accurately from the previous samples. The prediction error then has a non-uniform distribution concentrated near zero, and can be specified with fewer bits than the original sample values, which usually have a nearly uniform distribution.
[Block diagram: predictive decoder — binary decoding of the coded prediction error, an adder, and a predictor producing the prediction f_p used to reconstruct f.]
Linear Predictor
Let $f_0$ represent the current pixel, and $f_k$, $k = 1, 2, \ldots, K$, the previous pixels that are used to predict $f_0$. For example, if $f_0 = f(m,n)$, then $f_k = f(m-i, n-j)$ for certain $i, j \ge 0$. A linear predictor is
$\hat{f}_0 = \sum_{k=1}^{K} a_k f_k$
The $a_k$ are called linear prediction coefficients, or simply prediction coefficients. The key problem is how to determine the $a_k$ so that a certain criterion is satisfied.
Choosing the $a_k$ to minimize the mean squared prediction error $E\{(f_0 - \hat{f}_0)^2\}$ leads to the normal equations
$\sum_{k=1}^{K} a_k E\{f_k f_l\} = E\{f_0 f_l\}, \quad l = 1, 2, \ldots, K.$
In terms of the correlations $R(k,l) = E\{f_k f_l\}$:
$\sum_{k=1}^{K} a_k R(k,l) = R(0,l), \quad l = 1, 2, \ldots, K.$
In matrix form, $\mathbf{R}\mathbf{a} = \mathbf{r}$, with $[\mathbf{R}]_{lk} = R(k,l)$, $[\mathbf{a}]_k = a_k$, and $[\mathbf{r}]_l = R(0,l)$.
Example with two previous pixels: $f_0 = f(m,n)$, $f_1 = f(m,n-1)$, $f_2 = f(m-1,n)$.
$R(0,0) = E\{f(m,n)f(m,n)\}$, $R(1,1) = E\{f(m,n-1)f(m,n-1)\}$, $R(2,2) = E\{f(m-1,n)f(m-1,n)\}$,
$R(0,1) = E\{f(m,n)f(m,n-1)\}$, $R(0,2) = E\{f(m,n)f(m-1,n)\}$,
$R(1,2) = E\{f(m,n-1)f(m-1,n)\}$, $R(2,1) = E\{f(m-1,n)f(m,n-1)\}$.
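A small numerical sketch (not from the slides) of solving the normal equations for this two-pixel predictor; the correlation values are an assumption (a separable model with unit variance and correlation 0.95 between horizontally or vertically adjacent pixels):

```python
import numpy as np

# Matrix form R a = r for f0 = f(m,n), f1 = f(m,n-1), f2 = f(m-1,n).
R = np.array([[1.0,    0.9025],   # [R(1,1), R(1,2)]
              [0.9025, 1.0   ]])  # [R(2,1), R(2,2)]  (0.9025 = 0.95 * 0.95)
r = np.array([0.95, 0.95])        # [R(0,1), R(0,2)]

a = np.linalg.solve(R, r)         # prediction coefficients a1, a2
print(a)                          # approximately [0.4993, 0.4993]
```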
[Slide: optimal predictor coefficient $a_1$ and the resulting prediction error variance $\sigma_p^2$ in terms of the signal variance $\sigma_f^2$.]
Homework (1)
1. Consider a discrete source with an alphabet A = {a1, a2, ..., aL}. Compute the entropy of the source for the following two cases:
(a) the source is uniformly distributed, with p(al) = 1/L, l = 1, 2, ..., L;
(b) for a particular k, p(ak) = 1 and p(al) = 0 for l ≠ k.
2. A source with three symbols A = {a1, a2, a3} has the following conditional probability distribution:
$p\{a_i \mid a_j\} = \begin{cases} 2/3 & \text{if } i = j \\ 1/6 & \text{otherwise} \end{cases}$
(a) Calculate the 1st order entropy, 2nd order entropy, and 1st order conditional entropy. Hint: the probability of the symbol pair ai aj is P(ai)P(aj|ai).
(b) Design the 1st order, 2nd order, and 1st order conditional Huffman codes for this source. Calculate the resulting bit rate for each case and compare it to the corresponding lower and upper bounds defined by the entropy. Which method has the lowest bit rate per symbol? How do they compare in complexity?
Homework (2)
3. For the following image, which consists of 8 symbols:
(a) determine the probability of each symbol based on its occurrence frequency;
(b) find its entropy;
(c) design a codebook for these symbols using the Huffman coding method; calculate the average bit rate and compare it to the entropy.
0 2 2 4
1 3 5 4
2 5 5 6
3 5 6 7
4. For the following bi-level image:
(a) give its run-length representation using one-dimensional RLC; assume that each line starts with W and mark the end with EOL;
(b) suppose we want to use the same codewords for the black and white run-lengths; determine the probability of each possible run-length (including the symbol EOL) and calculate the entropy of this source;
(c) determine the Huffman code for the source containing all possible run-lengths, calculate the average bit rate of this code, and compare it to the entropy.
Reading
R. Gonzalez, Digital Image Processing, Sections 8.1, 8.2.1, 8.2.3, 8.2.4, 8.2.5, and 8.2.9.
A.K. Jain, Fundamentals of Digital Image Processing, Sections 11.1-11.3.