
Huffman Coding

Vida Movahedi

October 2006
Contents
• A simple example
• Definitions
• Huffman Coding Algorithm
• Image Compression
A simple example
• Suppose we have a message of 10 symbols drawn from an alphabet of 5 distinct symbols,
e.g. [►♣♣♠☻►♣☼►☻]
• How can we code this message using 0/1 so that the coded
message has minimum length (for transmission or
storage)?

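• 5 symbols → at least 3 bits per symbol for a fixed-length code
(2 bits give only 4 distinct codewords)

• For a simple fixed-length encoding,
length of the coded message is 10*3=30 bits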
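The fixed-length count above can be checked with a minimal sketch. The ASCII letters A to E are stand-ins chosen here for the five slide symbols (A=►, B=♣, C=♠, D=☻, E=☼); they are an assumption of this sketch, not part of the slides.

```python
import math

# Stand-in for the slide message [►♣♣♠☻►♣☼►☻]:
# A=►, B=♣, C=♠, D=☻, E=☼ (hypothetical ASCII labels).
message = "ABBCDABEAD"

alphabet = set(message)

# A fixed-length binary code needs ceil(log2(N)) bits for N distinct symbols.
bits_per_symbol = math.ceil(math.log2(len(alphabet)))
total_bits = bits_per_symbol * len(message)

print(f"{len(alphabet)} symbols, {bits_per_symbol} bits each, "
      f"{total_bits} bits in total")
```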
A simple example – cont.
• Intuition: symbols that are more frequent should
have shorter codewords; but since the codewords then differ in
length, there must be a way to tell where each codeword ends

• For the Huffman code, the length of the encoded message
►♣♣♠☻►♣☼►☻ will be
3*2+3*2+2*2+1*3+1*3=22 bits
(► and ♣ occur 3 times each, ☻ twice, ♠ and ☼ once each)
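As a rough check of that sum, the sketch below counts each symbol and multiplies by its codeword length. The ASCII stand-ins (A=►, B=♣, C=♠, D=☻, E=☼) are labels assumed here, and the codeword lengths are read off from the terms of the sum on the slide (2 bits for the three most frequent symbols, 3 bits for the two rarest).

```python
from collections import Counter

# Stand-in for the slide message [►♣♣♠☻►♣☼►☻]:
# A=►, B=♣, C=♠, D=☻, E=☼ (hypothetical ASCII labels).
message = "ABBCDABEAD"

# Codeword lengths implied by the sum on the slide.
code_lengths = {"A": 2, "B": 2, "D": 2, "C": 3, "E": 3}

counts = Counter(message)

# Total length = sum over symbols of (occurrences * codeword length).
encoded_bits = sum(counts[s] * code_lengths[s] for s in counts)

for s in sorted(counts):
    print(s, counts[s], "x", code_lengths[s], "bits")
print("total:", encoded_bits, "bits")
```

Evaluating the terms one by one gives 6+6+4+3+3 = 22 bits, compared with 30 bits for the fixed-length code.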
Definitions
• An ensemble X is a triple (x, Ax, Px)
– x: value of a random variable
– Ax: set of possible values for x, Ax={a1, a2, …, aI}
– Px: probability of each value, Px={p1, p2, …, pI},
where P(x)=P(x=ai)=pi, $p_i > 0$, $\sum_{i} p_i = 1$

• Shannon information content of x
– $h(x) = \log_2 \frac{1}{P(x)}$

• Entropy of x
– $H(x) = \sum_{x \in A_x} P(x) \log_2 \frac{1}{P(x)}$

i    ai   pi      h(pi)
1    a    .0575   4.1
2    b    .0128   6.3
3    c    .0263   5.2
..   ..   ..      ..
26   z    .0007   10.4
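The two definitions above can be sketched in a few lines. The function names `information_content` and `entropy` are chosen here for illustration; the probabilities are the ones listed in the table, and the computed values may differ from the table in the last digit because the listed probabilities are rounded.

```python
import math

def information_content(p):
    """Shannon information content h = log2(1/p), in bits."""
    return math.log2(1.0 / p)

def entropy(probs):
    """Entropy H = sum of p * log2(1/p) over outcomes with p > 0, in bits."""
    return sum(p * information_content(p) for p in probs if p > 0)

# Probabilities taken from the table rows for a, b, c and z.
for symbol, p in [("a", 0.0575), ("b", 0.0128), ("c", 0.0263), ("z", 0.0007)]:
    print(f"{symbol}: p={p:.4f}  h={information_content(p):.2f} bits")

# A fair coin carries exactly one bit per outcome.
print("fair coin entropy:", entropy([0.5, 0.5]))
```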
Huffman Coding Algorithm
1. Take the two least probable symbols in the
alphabet
(they will get the longest codewords, of equal length, differing only in the last digit)

2. Combine these two symbols into a single
symbol, and repeat until only one symbol remains.
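The two steps above can be sketched with a priority queue. This is one possible implementation, not necessarily the one behind the slides: the function name `huffman_code`, the use of Python's `heapq`, and the tie-breaking counter are all choices made here. Ties between equal probabilities may be broken differently from a hand-drawn tree, so individual 0/1 labels can differ while the codeword lengths stay optimal.

```python
import heapq
import itertools

def huffman_code(probabilities):
    """Return {symbol: codeword} for a dict {symbol: probability}."""
    if len(probabilities) == 1:
        # Degenerate alphabet: give the single symbol a one-bit codeword.
        return {next(iter(probabilities)): "0"}

    # The counter breaks ties so the heap never has to compare dicts.
    counter = itertools.count()
    heap = [(p, next(counter), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)

    while len(heap) > 1:
        # Step 1: take the two least probable (possibly merged) symbols.
        p0, _, codes0 = heapq.heappop(heap)
        p1, _, codes1 = heapq.heappop(heap)
        # Step 2: combine them, prefixing 0 to one group and 1 to the other.
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(counter), merged))

    return heap[0][2]

codes = huffman_code({"a": 0.25, "b": 0.25, "c": 0.2, "d": 0.15, "e": 0.15})
for symbol in sorted(codes):
    print(symbol, codes[symbol])
```

For this five-symbol ensemble the sketch assigns 2-bit codewords to a, b and c and 3-bit codewords to d and e.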
Example
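• Ax={ a , b , c , d , e }
• Px={0.25, 0.25, 0.2, 0.15, 0.15}
• Merge steps (at each merge one branch is labelled 0, the other 1):
– d (0.15) + e (0.15) → 0.3
– b (0.25) + c (0.2) → 0.45
– a (0.25) + 0.3 → 0.55
– 0.55 + 0.45 → 1.0 (root)

symbol       a     b     c     d     e
probability  0.25  0.25  0.2   0.15  0.15
codeword     00    10    11    010   011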
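Because no codeword in this example is a prefix of another, a coded bit string can be split back into symbols by reading left to right. The sketch below uses the codewords and probabilities from the example; the helper names `encode` and `decode` and the test message "badce" are introduced here for illustration only.

```python
# Codewords and probabilities from the example.
code = {"a": "00", "b": "10", "c": "11", "d": "010", "e": "011"}
probs = {"a": 0.25, "b": 0.25, "c": 0.2, "d": 0.15, "e": 0.15}

def encode(message, code):
    """Concatenate the codeword of each symbol."""
    return "".join(code[s] for s in message)

def decode(bits, code):
    """Read bits left to right, emitting a symbol whenever a codeword matches."""
    inverse = {cw: s for s, cw in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)

bits = encode("badce", code)
decoded = decode(bits, code)

# Expected codeword length: sum of p_i * l_i.
average_length = sum(probs[s] * len(code[s]) for s in code)

print(bits, "->", decoded)
print("average length:", round(average_length, 2), "bits/symbol")
```

The average length comes to 2.3 bits per symbol, against the 3 bits per symbol a fixed-length code would need for five symbols.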
Disadvantages of the Huffman Code
• Changing ensemble
– If the ensemble changes, the frequencies and probabilities
change → the optimal code changes
– e.g. in text compression symbol frequencies vary with context
– Re-compute the Huffman code by running through the entire
file in advance?!
– Save/transmit the code too?!

• Does not consider ‘blocks of symbols’
– After ‘strings_of_ch’ the next nine symbols ‘aracters_’
are predictable, but bits are still spent on them without
conveying any new information
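One way to see the cost of coding symbol by symbol is a very skewed source: a Huffman code must spend at least one whole bit on every symbol, even when the symbol is almost certain. The two-symbol source below, with probabilities 0.99 and 0.01, is a hypothetical illustration chosen here, not data from the slides.

```python
import math

# Hypothetical, highly skewed two-symbol source (not from the slides).
probs = {"x": 0.99, "y": 0.01}

# Entropy: the average information actually carried per symbol.
entropy_bits = sum(p * math.log2(1.0 / p) for p in probs.values())

# With only two symbols, a Huffman code assigns one bit to each.
two_symbol_code = {"x": "0", "y": "1"}
average_bits = sum(probs[s] * len(two_symbol_code[s]) for s in probs)

print(f"entropy      : {entropy_bits:.3f} bits/symbol")
print(f"Huffman code : {average_bits:.3f} bits/symbol")
```

Under these assumed probabilities the code spends one bit per symbol while the entropy is below 0.1 bit per symbol, which is the kind of waste the predictable ‘aracters_’ example points at.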
