Huffman Coding, also called Huffman Encoding, is a famous greedy algorithm used
for the lossless compression of data.
It uses variable-length encoding: codes of different lengths are assigned to the characters
depending on how frequently they occur in the given text.
The character that occurs most frequently gets the shortest code, and the character that
occurs least frequently gets the longest code.
Prefix Rule-
To prevent ambiguities while decoding, Huffman coding enforces a rule known as the prefix rule, which
ensures that the code assigned to any character is not a prefix of the code assigned to any other character.
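To illustrate why the prefix rule matters, here is a minimal sketch (the three-symbol code table is hypothetical, not from this article) showing that a prefix-free code can be decoded greedily, bit by bit, with no ambiguity:

```python
# A hypothetical prefix-free code table: no code is a prefix of another.
codes = {'a': '0', 'b': '10', 'c': '11'}
inverse = {code: symbol for symbol, code in codes.items()}

def decode(bits):
    """Greedily decode a bit string; the prefix rule guarantees that
    the first matching code is the only possible match."""
    out, buf = [], ''
    for b in bits:
        buf += b
        if buf in inverse:
            out.append(inverse[buf])
            buf = ''
    return ''.join(out)
```

For example, decode('010110') splits unambiguously into 0|10|11|0, giving 'abca'.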
Step-01:
Create a leaf node for each of the given characters, storing that character's frequency of occurrence.
Step-02:
Arrange all the nodes in the increasing order of frequency value contained in the nodes.
Step-03:
Take the two nodes with minimum frequency and create a new internal node whose
frequency equals the sum of the two nodes' frequencies. Make the first node the left child and the
other node the right child of the newly created node.
Step-04:
Keep repeating Step-02 and Step-03 until all the nodes form a single tree.
After following all the above steps, our desired Huffman tree will be constructed.
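The steps above can be sketched with a min-heap; this is an illustrative implementation rather than code from the article (the tuple layout and tie-breaking counter are my own choices):

```python
import heapq

def build_huffman_tree(freqs):
    """Build a Huffman tree from a {symbol: frequency} map.

    Leaves are (frequency, symbol) tuples; internal nodes are
    (frequency, left_child, right_child) tuples."""
    # The counter breaks frequency ties so the heap never has to
    # compare the node payloads themselves.
    heap = [(f, i, (f, s)) for i, (s, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    tick = len(heap)
    while len(heap) > 1:
        # Step-02/03: extract the two minimum-frequency nodes and
        # merge them under a new internal node.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, tick, (f1 + f2, left, right)))
        tick += 1
    return heap[0][2]  # Step-04: the single remaining node is the root
```

The heap makes "arrange all the nodes in increasing order" implicit: each extraction always yields a current minimum, so no explicit re-sorting is needed.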
Important Formulas-
Formula-01:
Average code length per character
= ∑ ( frequencyi x Code lengthi ) / ∑ ( frequencyi )
Formula-02:
Total number of bits in Huffman encoded message
= Total number of characters in the message x Average code length per character
= ∑ ( frequencyi x Code lengthi )
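A quick arithmetic sketch of these formulas, using hypothetical frequencies and code lengths (not the ones from Problem-01):

```python
# Hypothetical frequencies and code lengths, purely for illustration.
freqs   = {'a': 5, 'b': 9, 'c': 12, 'd': 13}
lengths = {'a': 3, 'b': 3, 'c': 2, 'd': 2}

# Formula-02: total bits = sum of frequency_i * code_length_i.
total_bits = sum(freqs[s] * lengths[s] for s in freqs)

# Formula-01: average code length = total bits / total characters.
avg_length = total_bits / sum(freqs.values())
```

Here total_bits works out to 5·3 + 9·3 + 12·2 + 13·2 = 92.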
Time Complexity-
Using a min-heap (priority queue), each of the O(n) merge steps performs a constant number of
O(log n) heap operations, so the Huffman tree for n characters is constructed in O(n log n) time.
Problem-01:
A file contains the following characters with the frequencies as shown. If Huffman coding is used for
data compression, determine-
Character   Frequency
a           10
e           15
i           12
o           3
u           4
s           13
t           1
Solution-
First let us construct the Huffman tree using the steps we have learnt above-
Step-01:
The leaf nodes in increasing order of frequency are t(1), o(3), u(4), a(10), i(12), s(13), e(15).
Step-02:
Merge the two minimum nodes t(1) and o(3) into a new node of frequency 4.
Step-03:
Merge u(4) and the node of frequency 4 into a new node of frequency 8.
Step-04:
Merge the node of frequency 8 and a(10) into a new node of frequency 18.
Step-05:
Merge i(12) and s(13) into a new node of frequency 25.
Step-06:
Merge e(15) and the node of frequency 18 into a new node of frequency 33.
Step-07:
Merge the nodes of frequency 25 and 33 into the root node of frequency 58.
After we have constructed the Huffman tree, we will assign weights to all the edges. Let us assign
weight ‘0’ to the left edges and weight ‘1’ to the right edges.
Note
We can also assign weight ‘1’ to the left edges and weight ‘0’ to the right edges.
The only thing to keep in mind is that we must follow the same convention at the
time of decoding which we adopted at the time of encoding.
After assigning weight ‘0’ to the left edges and weight ‘1’ to the right edges, each character’s code
is read off the tree by concatenating the edge weights on the path from the root to that character’s leaf.
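The construction and edge labelling for Problem-01 can be reproduced programmatically. This is a sketch under my own assumptions (heap layout, tie-breaking), so individual codes may differ from a hand-drawn tree, although the total encoded length is the same for any optimal Huffman code:

```python
import heapq

# Frequencies from Problem-01.
freqs = {'a': 10, 'e': 15, 'i': 12, 'o': 3, 'u': 4, 's': 13, 't': 1}

# Build the Huffman tree; leaves are symbols, internal nodes are
# (left, right) pairs, and the counter breaks frequency ties.
heap = [(f, i, s) for i, (s, f) in enumerate(sorted(freqs.items()))]
heapq.heapify(heap)
tick = len(heap)
while len(heap) > 1:
    f1, _, left = heapq.heappop(heap)
    f2, _, right = heapq.heappop(heap)
    heapq.heappush(heap, (f1 + f2, tick, (left, right)))
    tick += 1
root = heap[0][2]

# Assign weight '0' to left edges and '1' to right edges.
codes = {}
def assign(node, prefix):
    if isinstance(node, str):          # leaf: record the accumulated code
        codes[node] = prefix
    else:
        assign(node[0], prefix + '0')  # left edge
        assign(node[1], prefix + '1')  # right edge
assign(root, '')
```

For these frequencies every optimal Huffman code uses ∑ frequencyi × code lengthi = 146 bits in total, i.e. 146 / 58 ≈ 2.52 bits per character on average, versus 3 bits per character for a fixed-length code over 7 symbols.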
1. The R1 bit is calculated using a parity check at all the bit positions whose binary representation
includes a 1 in the least significant position.
R1: bits 1, 3, 5, 7, 9, 11
To find the redundant bit R1, we check for even parity. Since the total number of 1’s in all the bit
positions corresponding to R1 is an even number, the value of R1 (parity bit’s value) = 0.
2. The R2 bit is calculated using a parity check at all the bit positions whose binary representation
includes a 1 in the second position from the least significant bit.
R2: bits 2, 3, 6, 7, 10, 11
To find the redundant bit R2, we check for even parity. Since the total number of 1’s in all the bit
positions corresponding to R2 is an odd number, the value of R2 (parity bit’s value) = 1.
3. The R4 bit is calculated using a parity check at all the bit positions whose binary representation
includes a 1 in the third position from the least significant bit.
R4: bits 4, 5, 6, 7
To find the redundant bit R4, we check for even parity. Since the total number of 1’s in all the bit
positions corresponding to R4 is an odd number, the value of R4 (parity bit’s value) = 1.
4. The R8 bit is calculated using a parity check at all the bit positions whose binary representation
includes a 1 in the fourth position from the least significant bit.
R8: bits 8, 9, 10, 11
To find the redundant bit R8, we check for even parity. Since the total number of 1’s in all the bit
positions corresponding to R8 is an even number, the value of R8 (parity bit’s value) = 0.
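The four parity computations above can be sketched as follows, assuming an 11-bit codeword with the 7 data bits in positions 3, 5, 6, 7, 9, 10, 11 and even-parity bits R1, R2, R4, R8 in positions 1, 2, 4, 8 (the helper name and the sample data bits are my own, not from the problem):

```python
def hamming_redundant_bits(data_bits):
    """Place 7 data bits into positions 3, 5, 6, 7, 9, 10, 11 of an
    11-bit codeword and fill even-parity bits R1, R2, R4, R8 into
    positions 1, 2, 4, 8.  data_bits is a string such as '1011001'."""
    word = [0] * 12                      # index 0 unused; positions 1..11
    for pos, bit in zip((3, 5, 6, 7, 9, 10, 11), data_bits):
        word[pos] = int(bit)
    for r in (1, 2, 4, 8):
        # Rr covers every position whose binary index has bit r set,
        # e.g. R2 covers positions 2, 3, 6, 7, 10, 11.
        ones = sum(word[p] for p in range(1, 12) if p & r and p != r)
        word[r] = ones % 2               # even parity: make the total even
    return ''.join(map(str, word[1:]))
```

After the loop runs, the number of 1’s over each group of covered positions (including the parity bit itself) is even, which is exactly the condition checked for R1 through R8 above.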
Thus, the data transferred is:
Solution:
Let P(x) be the probability of occurrence of symbol x:
3. In {A, C, E} group,