
PROJECT REPORT

ON

“SHANNON–FANO CODING”

A Project Report
Submitted in Partial Fulfillment of the Requirements
For the award of the Degree of

Bachelor of Technology in
Electronics & Communication Engineering (ECE)
By
K.SAI SANDEEP 14311A0464
K.SAI RAM 14311A0465
G. SAI KRISHNA 14311A0466
B.Tech III year I semester

Under the Guidance / Supervision of


P.LAVANYA

Department of Electronics & Communication Engineering


Sreenidhi Institute of Science & Technology (Autonomous)

SEPTEMBER 2016
ABSTRACT:
In the field of data compression, Shannon–Fano coding, named after Claude
Shannon and Robert Fano, is a technique for constructing a prefix code based on a
set of symbols and their probabilities (estimated or measured). It is suboptimal in
the sense that, unlike Huffman coding, it does not always achieve the lowest
possible expected code word length; however, it does guarantee that all code word
lengths are within one bit of their theoretical ideal.
The technique was proposed in Shannon's "A Mathematical Theory of
Communication", his 1948 article introducing the field of information theory. The
method was attributed to Fano, who later published it as a technical report.[1]
Shannon–Fano coding should not be confused with Shannon coding, the coding
method used to prove Shannon's noiseless coding theorem, or with Shannon–Fano–
Elias coding (also known as Elias coding), the precursor to arithmetic coding.
CONTENTS

Chapter 1 INTRODUCTION

1.1 Aim of the project
1.2 Introduction to Shannon–Fano Coding
1.3 Shannon–Fano Algorithm
1.4 Example

Chapter 2 MATLAB

2.1 Introduction

Chapter 3 RESULT

3.1 Source code
3.2 Output
3.3 Output Waveforms
CHAPTER 1

INTRODUCTION

1.1 Aim of the project:


In Shannon–Fano coding, the symbols are arranged in order from most
probable to least probable, and then divided into two sets whose total
probabilities are as close as possible to being equal. All symbols then have
the first digits of their codes assigned: symbols in the first set receive "0"
and symbols in the second set receive "1". As long as any set with more
than one member remains, the same process is repeated on that set to
determine successive digits of its codes. When a set has been reduced to
one symbol, that symbol's code is complete and will not form the
prefix of any other symbol's code.

The algorithm produces fairly efficient variable-length encodings; when the
two smaller sets produced by a partitioning are in fact of equal probability,
the one bit of information used to distinguish them is used most efficiently.
Unfortunately, Shannon–Fano coding does not always produce optimal prefix codes;
the set of probabilities {0.35, 0.17, 0.17, 0.16, 0.15} is an example of one that
will be assigned non-optimal codes, as the quick check below shows.
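To make the check concrete (the code lengths here follow the splitting rule just
described): Shannon–Fano first splits the list into {0.35, 0.17} and
{0.17, 0.16, 0.15}, and ends up assigning code lengths 2, 2, 2, 3, 3, for an
expected length of 2·(0.35 + 0.17 + 0.17) + 3·(0.16 + 0.15) = 2.31 bits per
symbol. A Huffman code for the same source assigns lengths 1, 3, 3, 3, 3, for an
expected length of 1·0.35 + 3·(0.17 + 0.17 + 0.16 + 0.15) = 2.30 bits per symbol.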
For this reason, Shannon–Fano coding is almost never used; Huffman coding is
almost as computationally simple and produces prefix codes that always achieve
the lowest expected code word length, under the constraint that each symbol is
represented by a code formed of an integral number of bits. This constraint is
often unneeded, since the codes will be packed end-to-end in long sequences. If we
consider groups of codes at a time, symbol-by-symbol Huffman coding is only
optimal if the probabilities of the symbols are independent and are each some
power of a half, i.e., of the form 1/2^n. In most situations, arithmetic coding
can produce greater overall compression than either Huffman or Shannon–Fano
coding, since it can encode in fractional numbers of bits, which more closely
approximate the actual information content of each symbol. However, arithmetic
coding has not superseded Huffman coding the way that Huffman coding superseded
Shannon–Fano coding, both because arithmetic coding is more computationally
expensive and because it is covered by multiple patents. Shannon–Fano coding is
used in the IMPLODE compression method, which is part of the ZIP file format.

1.2 Introduction to Shannon–Fano Coding:

Around 1948, Claude E. Shannon (Bell Laboratories) and Robert M. Fano (MIT)
developed a coding procedure that generates a binary code tree. The procedure
evaluates each symbol's probability and assigns a code word of corresponding
length.

Compared to other methods, Shannon–Fano coding is easy to implement. In
practice, however, it is of limited importance, chiefly because of its lower code
efficiency in comparison to Huffman coding, as demonstrated later.

Shannon–Fano coding is mainly useful when a simple algorithm with high
performance and minimal programming requirements is wanted. An example is the
compression method IMPLODE, as specified in the ZIP file format.
1.3 SHANNON–FANO ALGORITHM:


A Shannon–Fano tree is built according to a specification designed to define an
effective code table. The actual algorithm is simple (a short MATLAB sketch of it
follows the steps below):

1. For a given list of symbols, develop a corresponding list of probabilities
or frequency counts so that each symbol's relative frequency of occurrence
is known.
2. Sort the list of symbols according to frequency, with the most frequently
occurring symbols at the left and the least common at the right.
3. Divide the list into two parts, with the total frequency counts of the left
part being as close to the total of the right as possible.
4. Assign the left part of the list the binary digit 0 and the right part the
digit 1. This means that the codes for the symbols in the first part will
all start with 0, and the codes in the second part will all start with 1.
5. Recursively apply steps 3 and 4 to each of the two halves, subdividing
groups and adding bits to the codes until each symbol has become a
corresponding code leaf on the tree.
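
The following is a minimal MATLAB sketch of this recursive procedure, given
purely as an illustration; the function names shannon_fano and sf_split and the
string representation of the codes are choices made for this sketch, not part of
any MATLAB toolbox. It assumes the frequency vector is already sorted in
descending order (step 2).

% shannon_fano.m - illustrative sketch of the recursive splitting
% freq  : row vector of frequency counts, sorted in descending order
% codes : cell array of binary code strings, one per symbol
function codes = shannon_fano(freq)
    codes = sf_split(freq, repmat({''}, 1, numel(freq)));
end

function codes = sf_split(freq, codes)
    n = numel(freq);
    if n < 2
        return;   % a single symbol: its code is complete
    end
    % step 3: pick the split point that makes the two partial sums
    % as close to equal as possible
    csum = cumsum(freq);
    [~, k] = min(abs(2*csum(1:n-1) - csum(n)));
    % step 4: append 0 to the left group and 1 to the right group
    for i = 1:k
        codes{i} = [codes{i} '0'];
    end
    for i = k+1:n
        codes{i} = [codes{i} '1'];
    end
    % step 5: recurse on each half
    codes(1:k)   = sf_split(freq(1:k),   codes(1:k));
    codes(k+1:n) = sf_split(freq(k+1:n), codes(k+1:n));
end

For instance, shannon_fano([24 12 10 8 8]) produces the codes 00, 01, 10, 110
and 111, which match the example in Section 1.4 below.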

In the example below, the data can be coded with an average length of
2.26 bits per symbol, whereas a fixed-length code for 5 symbols would
require 3 bits per symbol. Note, however, that before a Shannon–Fano
code tree can be generated, the frequency table must be known, or it
must be derived from preceding data.

1.4 EXAMPLE:

Symbol | Frequency | Step 1 | Step 2 | Step 3 | Code
-------+-----------+--------+--------+--------+------
A      | 24        |   0    |   0    |        | 00
B      | 12        |   0    |   1    |        | 01
C      | 10        |   1    |   0    |        | 10
D      |  8        |   1    |   1    |   0    | 110
E      |  8        |   1    |   1    |   1    | 111

Step 1 splits the list into {A, B} (sum 36) and {C, D, E} (sum 26); step 2
splits {A, B} into A and B, and {C, D, E} into C and {D, E} (sum 16); step 3
splits {D, E} into D and E.
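
From the table, the average code length is
(24·2 + 12·2 + 10·2 + 8·3 + 8·3) / 62 = 140/62 ≈ 2.26 bits per symbol, the
figure quoted above, compared with 3 bits per symbol for a fixed-length code.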
CHAPTER 2

MATLAB
2.1 INTRODUCTION

The name MATLAB stands for MATrix LABoratory. MATLAB was originally written to
provide easy access to matrix software developed by the LINPACK (linear system
package) and EISPACK (eigensystem package) projects. MATLAB [1] is a
high-performance language for technical computing. It integrates computation,
visualization, and programming in an easy-to-use environment. Furthermore,
MATLAB is a modern programming language environment: it has sophisticated data
structures, contains built-in editing and debugging tools, and supports
object-oriented programming. These factors make MATLAB an excellent tool for
teaching and research. MATLAB has many advantages compared to conventional
computer languages (e.g., C, FORTRAN) for solving technical problems. MATLAB is
an interactive system whose basic data element is an array that does not require
dimensioning. The software package has been commercially available since 1984
and is now considered a standard tool at most universities and industries
worldwide. It has powerful built-in routines that enable a very wide variety of
computations. It also has easy-to-use graphics commands that make the
visualization of results immediately available. Specific applications are
collected in packages referred to as toolboxes. There are toolboxes for signal
processing, symbolic computation, control theory, simulation, optimization, and
several other fields of applied science and engineering.
CHAPTER 3

RESULT

3.1 SOURCE CODE


clc;
clear all;
close all;

m = input('Enter the no. of message ensembles : ');

z = [];          % bit buffer for the codeword being built
h = 0; l = 0;    % accumulators for entropy and average code length

display('Enter the probabilities in descending order');
for i = 1:m
    fprintf('Ensemble %d\n', i);
    p(i) = input('');
end

% Finding each alpha value: a(i) is the cumulative probability
% of all symbols preceding symbol i
a(1) = 0;
for j = 2:m
    a(j) = a(j-1) + p(j-1);
end
fprintf('\n Alpha Matrix');
display(a);

% Finding each code length: n(i) = ceil(-log2(p(i)))
for i = 1:m
    n(i) = ceil(-1*(log2(p(i))));
end
fprintf('\n Code length matrix');
display(n);

% Computing each codeword: the first n(i) bits of the binary
% expansion of a(i)
for i = 1:m
    int = a(i);
    for j = 1:n(i)
        frac = int*2;
        c = floor(frac);
        frac = frac - c;
        z = [z c];
        int = frac;
    end
    fprintf('Codeword %d', i);
    display(z);
    z = [];
end

% Computing average code length L = sum(p.*n) and
% entropy H = sum(p.*log2(1./p))
for i = 1:m
    x = p(i)*n(i);
    l = l + x;
    x = p(i)*log2(1/p(i));
    h = h + x;
end
fprintf('Avg. Code Length');
display(l);
fprintf('Entropy');
display(h);

% Computing efficiency (100*H/L) and redundancy, in percent
fprintf('Efficiency');
display(100*h/l);
fprintf('Redundancy');
display(100-(100*h/l));
3.2 OUTPUT
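As an illustrative check (values obtained by tracing the script, not an actual
run), entering m = 4 with the probabilities 0.5, 0.25, 0.125 and 0.125 would
give the alpha matrix [0 0.5000 0.7500 0.8750], the code length matrix
[1 2 3 3], the codewords 0, 10, 110 and 111, an average code length of 1.75, an
entropy of 1.75, an efficiency of 100, and a redundancy of 0: a dyadic
distribution is coded with no loss of efficiency.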
3.3 OUTPUT WAVEFORMS
