Session 19
Huffman Code
DATA COMPRESSION
How does a compression program really work?
Why can the size of a file be reduced without losing its
contents?
Suppose we want to store the letter A. The computer represents the
letter by its character code, 65, so the letter is
stored on disk as 1000001 (the binary form of decimal 65).
Storing the letter A therefore takes 7 binary digits (bits).
Data compression tries to store the same data using as few
binary digits as possible.
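The conversion from a letter to its stored bits can be checked with a minimal Python sketch. The helper name `to_binary` is hypothetical and only serves this illustration.

```python
def to_binary(ch: str, width: int = 7) -> str:
    """Return the character code of one character as a fixed-width binary string."""
    return format(ord(ch), f"0{width}b")


print(ord("A"))           # 65
print(to_binary("A"))     # 1000001
print(to_binary("A", 8))  # 01000001 (the same code padded to one 8-bit byte)
```

In practice most systems store each character in a full 8-bit byte, which is why the 8-bit form is shown as well.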
Bina Nusantara
CHARACTER ENCODING
ASCII (American Standard Code for Information Interchange)
The ASCII character encoding defines 128 characters, each a 7-bit binary number:
95 printable characters and 33 control characters.
ISO 8859-1 is an 8-bit character encoding standard, so it has 256 possible code values.
This standard is also known as the Latin-1 character encoding.
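The difference between the 7-bit and 8-bit encodings can be sketched with Python's built-in `ascii` and `latin-1` codecs. The helper names `fits_ascii` and `fits_latin1` are assumptions made for this example.

```python
def fits_ascii(text: str) -> bool:
    """True if every character has a code below 128 (7 bits)."""
    return all(ord(ch) < 128 for ch in text)


def fits_latin1(text: str) -> bool:
    """True if every character has a code below 256 (8 bits)."""
    return all(ord(ch) < 256 for ch in text)


print(list("A".encode("ascii")))    # [65]
print(list("é".encode("latin-1")))  # [233]
print(fits_ascii("é"))              # False: 233 needs the eighth bit
print(fits_latin1("é"))             # True
```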
CHARACTER TABLE
Binary     Dec  Char    Binary     Dec  Char    Binary     Dec  Char
0010 0000   32  sp      0100 0000   64  @       0110 0000   96  `
0010 0001   33  !       0100 0001   65  A       0110 0001   97  a
0010 0010   34  "       0100 0010   66  B       0110 0010   98  b
0010 0011   35  #       0100 0011   67  C       0110 0011   99  c
0010 0100   36  $       0100 0100   68  D       0110 0100  100  d
0010 0101   37  %       0100 0101   69  E       0110 0101  101  e
0010 0110   38  &       0100 0110   70  F       0110 0110  102  f
0010 0111   39  '       0100 0111   71  G       0110 0111  103  g
0010 1000   40  (       0100 1000   72  H       0110 1000  104  h
0010 1001   41  )       0100 1001   73  I       0110 1001  105  i
0010 1010   42  *       0100 1010   74  J       0110 1010  106  j
0010 1011   43  +       0100 1011   75  K       0110 1011  107  k
0010 1100   44  ,       0100 1100   76  L       0110 1100  108  l
0010 1101   45  -       0100 1101   77  M       0110 1101  109  m
0010 1110   46  .       0100 1110   78  N       0110 1110  110  n
0010 1111   47  /       0100 1111   79  O       0110 1111  111  o
0011 0000   48  0       0101 0000   80  P       0111 0000  112  p
0011 0001   49  1       0101 0001   81  Q       0111 0001  113  q
0011 0010   50  2       0101 0010   82  R       0111 0010  114  r
0011 0011   51  3       0101 0011   83  S       0111 0011  115  s
0011 0100   52  4       0101 0100   84  T       0111 0100  116  t
0011 0101   53  5       0101 0101   85  U       0111 0101  117  u
0011 0110   54  6       0101 0110   86  V       0111 0110  118  v
0011 0111   55  7       0101 0111   87  W       0111 0111  119  w
0011 1000   56  8       0101 1000   88  X       0111 1000  120  x
0011 1001   57  9       0101 1001   89  Y       0111 1001  121  y
0011 1010   58  :       0101 1010   90  Z       0111 1010  122  z
0011 1011   59  ;       0101 1011   91  [       0111 1011  123  {
0011 1100   60  <       0101 1100   92  \       0111 1100  124  |
0011 1101   61  =       0101 1101   93  ]       0111 1101  125  }
0011 1110   62  >       0101 1110   94  ^       0111 1110  126  ~
0011 1111   63  ?       0101 1111   95  _
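The rows of the character table can be regenerated rather than memorised. The following is a small sketch; the function name `ascii_rows` and the "sp" label for the space character are choices made for this example.

```python
def ascii_rows(start: int = 32, stop: int = 127):
    """Return (binary, decimal, character) rows for the printable ASCII range."""
    rows = []
    for code in range(start, stop):
        bits = format(code, "08b")            # 8 binary digits, zero padded
        grouped = f"{bits[:4]} {bits[4:]}"    # split into two groups of four
        label = "sp" if code == 32 else chr(code)
        rows.append((grouped, code, label))
    return rows


for binary, decimal, char in ascii_rows(64, 68):
    print(binary, decimal, char)
# 0100 0000 64 @
# 0100 0001 65 A
# 0100 0010 66 B
# 0100 0011 67 C
```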
HUFFMAN ALGORITHM
Invented by David A. Huffman in 1951 while he was
a Ph.D. student at the Massachusetts Institute of
Technology (MIT).
He discovered a method for building a binary tree based
on symbol frequencies: frequent symbols end up near the root and
get short codes, rare symbols get longer codes. This binary tree,
called a Huffman tree, is one of the foundations of data compression
in the ZIP format.
The technique is also used as a coding step in the JPEG
image format and the MP3 audio format.
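The tree construction can be sketched in Python with a priority queue: repeatedly merge the two least frequent nodes until a single tree remains, then read each code off the path from the root. This is a minimal illustration, not the slides' own implementation. The function name `huffman_codes`, the tuple representation of internal nodes, and the tie-breaking counter are assumptions of this sketch.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(text: str) -> dict:
    """Build a Huffman tree for `text` and return a {symbol: bit string} table."""
    freq = Counter(text)
    if not freq:
        return {}
    tiebreak = count()  # keeps heap entries comparable when frequencies are equal
    heap = [(n, next(tiebreak), sym) for sym, n in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single distinct symbol still needs one bit
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        n1, _, left = heapq.heappop(heap)    # least frequent node
        n2, _, right = heapq.heappop(heap)   # second least frequent node
        heapq.heappush(heap, (n1 + n2, next(tiebreak), (left, right)))
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: a single character
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes


codes = huffman_codes("LOGIKA ALGORITMA")
print(sum(len(codes[ch]) for ch in "LOGIKA ALGORITMA"))  # 52
```

The individual codes depend on how ties between equal frequencies are broken, so they may differ from a hand-drawn tree. The total encoded length is the same for every valid Huffman tree of the same text.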
EXAMPLE CASE
Suppose we want to store the words:
LOGIKA ALGORITMA
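The first step is counting how often each character occurs. This is a short sketch using `collections.Counter`; the "sp" label for the space character follows the character table's convention.

```python
from collections import Counter

text = "LOGIKA ALGORITMA"
freq = Counter(text)

# Most frequent characters first, ties in alphabetical order.
for sym, n in sorted(freq.items(), key=lambda item: (-item[1], item[0])):
    label = "sp" if sym == " " else sym
    print(f"{label:>2}  {n}")
```

A occurs 3 times; L, O, G and I occur twice each; K, R, T, M and the space occur once each, for 16 characters in total.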
HUFFMAN TREE
[Figure: Huffman tree for LOGIKA ALGORITMA. Leaf codes:]
Char  Code
K     11110
R     11111
T     0100
M     0101
sp    1110
L     011
O     100
G     101
I     110
A     00
Char  Code   Bits
L     011    3 bit
O     100    3 bit
G     101    3 bit
I     110    3 bit
K     11110  5 bit
A     00     2 bit
sp    1110   4 bit
A     00     2 bit
L     011    3 bit
G     101    3 bit
O     100    3 bit
R     11111  5 bit
I     110    3 bit
T     0100   4 bit
M     0101   4 bit
A     00     2 bit
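The encoding above can be replayed in Python using the slide's code table. This is a sketch: the names `CODES`, `encode` and `decode` are chosen for this example. Decoding works bit by bit because no code is a prefix of another.

```python
# Code table taken from the example above (the space is written "sp" on the slide).
CODES = {
    "L": "011", "O": "100", "G": "101", "I": "110", "K": "11110",
    "A": "00", " ": "1110", "R": "11111", "T": "0100", "M": "0101",
}


def encode(text: str, codes: dict) -> str:
    """Concatenate the code of every character."""
    return "".join(codes[ch] for ch in text)


def decode(bits: str, codes: dict) -> str:
    """Read bits left to right, emitting a character whenever a full code is seen."""
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("bit string ends in the middle of a code")
    return "".join(out)


text = "LOGIKA ALGORITMA"
bits = encode(text, CODES)
print(len(bits))       # 52 bits with the Huffman code
print(len(text) * 8)   # 128 bits at 8 bits per character
print(decode(bits, CODES) == text)  # True
```

The 16 characters take 52 bits instead of 128, a reduction of about 59 percent, not counting the space needed to store the code table itself.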
EXERCISE
Create a frequency table, a Huffman tree, and the
Huffman codes to compress:
DESIGN AND ANALYSIS OF ALGORITHMS
REVIEW
DATA COMPRESSION
CHARACTER ENCODING
CHARACTER TABLE
END