
Unit 3: Information Theory and Coding
Definition of information

We define the amount of information gained after observing the event sk, which occurs with a defined probability pk, as the logarithmic function

I(sk) = log2(1/pk) = -log2 pk

where pk is the probability of occurrence of the event sk.

Remember: Conditional Probability and Joint Probability.
Important properties

If we are absolutely certain of the outcome of an event, even before it occurs, there is no information gained.

The occurrence of an event either provides some or no information, but never brings about a loss of information.

The less probable an event is, the more information we gain when it occurs.


If sk and sl are statistically independent, then I(sk sl) = I(sk) + I(sl): the information gained from observing both events is the sum of the individual informations.

Standard Practice for defining information

It is the standard practice today to use a logarithm to base 2. The resulting unit of information is called the bit.

When pk = 1/2, we have I(sk) = 1 bit.

Hence, one bit is the amount of information that we gain when one of two possible and equally likely events occurs.
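As a quick illustration (not from the slides; the function name and sample probabilities are chosen here), this definition can be computed directly in Python:

import math

def information_content(pk: float) -> float:
    """Amount of information I(sk) = log2(1/pk), in bits, gained by
    observing an event sk that occurs with probability pk (0 < pk <= 1)."""
    return math.log2(1.0 / pk)

# A certain event carries no information; an equally likely binary
# outcome carries exactly one bit; rarer events carry more bits.
print(information_content(1.0))    # 0.0 bits
print(information_content(0.5))    # 1.0 bit
print(information_content(0.125))  # 3.0 bits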
Entropy of a discrete memoryless source

The entropy of a discrete memoryless source with source alphabet S is a measure of the average information content per source symbol:

H(s) = Σ pk log2(1/pk)

Properties of Entropy

1. Entropy is a measure of the uncertainty of the random variable.

2. H(s) = 0 if, and only if, the probability pk = 1 for some k and the remaining probabilities in the set are all zero; this lower bound on entropy corresponds to no uncertainty.

3. H(s) = log2 K if, and only if, pk = 1/K for all k (i.e., all the symbols in the alphabet are equiprobable); this upper bound on entropy corresponds to maximum uncertainty.
Proof of these properties of H(s)

2nd Property

Since each probability pk is less than or equal to unity, each term pk log2(1/pk) is always nonnegative, and so H(s) >= 0.

Next, we note that the product term pk log2(1/pk) is zero if, and only if, pk = 0 or 1.

We therefore deduce that H(s) = 0 if, and only if, pk = 0 or 1 for every k, that is, pk = 1 for some k and all the rest are zero.
Example: Entropy of a Binary Memoryless Source

We consider a binary source for which symbol 0 occurs with probability P(0) and symbol 1 with probability P(1) = 1 - P(0). We assume that the source is memoryless.

The entropy of such a source equals

H(s) = -P(0) log2 P(0) - P(1) log2 P(1)
     = -P(0) log2 P(0) - {1 - P(0)} log2{1 - P(0)} bits

For P(0) = 0, P(1) = 1 and thus H(s) = 0.
For P(0) = 1, P(1) = 0 and thus H(s) = 0.
For P(0) = P(1) = 1/2, the entropy is maximum and equals 1 bit.


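A minimal sketch of this binary entropy function, reproducing the three cases above (names are illustrative):

import math

def binary_entropy(p0: float) -> float:
    """Entropy in bits of a binary memoryless source with P(0) = p0."""
    if p0 in (0.0, 1.0):
        return 0.0  # a certain outcome carries no uncertainty
    p1 = 1.0 - p0
    return -p0 * math.log2(p0) - p1 * math.log2(p1)

print(binary_entropy(0.0))  # 0.0
print(binary_entropy(1.0))  # 0.0
print(binary_entropy(0.5))  # 1.0, the maximum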
Proof of 3rd statement: Condition for Maximum Entropy

We know that the entropy can achieve a maximum value of log2 M, where M is the number of symbols. If we assume that all symbols are equiprobable, then the probability of each symbol occurring is 1/M.

The associated entropy is therefore

H(s) = Σ (k = 1 to M) pk log2(1/pk) = M . (1/M) log2 M = log2 M

This maximum value of entropy thus occurs when all symbols have equal probability of occurrence.
EXAMPLE: Entropy of a Source

Six messages with probabilities 0.30, 0.25, 0.15, 0.12, 0.10, and 0.08, respectively, are transmitted. Find the entropy.

Using logarithms to base 10:

H(x) = -(0.30 log10 0.30 + 0.25 log10 0.25 + 0.15 log10 0.15 + 0.12 log10 0.12 + 0.10 log10 0.10 + 0.08 log10 0.08) ≈ 0.7293

Converting to base 2:

H(x) = -(0.30 log2 0.30 + 0.25 log2 0.25 + 0.15 log2 0.15 + 0.12 log2 0.12 + 0.10 log2 0.10 + 0.08 log2 0.08) ≈ 2.4224 bits
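A brief sketch verifying this result and comparing it with the log2 6 upper bound that an equiprobable six-symbol source would reach (the function name is illustrative):

import math

def entropy_bits(probs):
    """H = sum of pk log2(1/pk), in bits per symbol."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

probs = [0.30, 0.25, 0.15, 0.12, 0.10, 0.08]

h_bits = entropy_bits(probs)
print(round(h_bits, 4))                  # 2.4224 bits per message
print(round(h_bits * math.log10(2), 4))  # 0.7293 in base-10 units
print(round(math.log2(6), 4))            # 2.585, the equiprobable upper bound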
Discrete Memoryless Channel

A discrete memoryless channel is a statistical model with an input X and an output Y that is a noisy version of X; both X and Y are random variables.

Channel matrix, or transition matrix

A convenient way of describing a discrete memoryless channel is to arrange the various transition probabilities of the channel in the form of a matrix P(Y/X), whose (j, k) entry is the transition probability p(yk/xj) of receiving yk given that xj was sent.

Capacity of a Discrete Memoryless Channel

The channel capacity of a discrete memoryless channel is defined as the maximum mutual information I(X; Y) in any single use of the channel, where the maximization is over all possible input probability distributions {p(xj)} on X.

The channel capacity is commonly denoted by C. We thus write

C = max over {p(xj)} of I(X; Y)

The channel capacity C is measured in bits per channel use, or bits per transmission.
Binary symmetric channel

Correct bit transmitted with probability 1 - p
Wrong bit transmitted with probability p (sometimes called the cross-over probability)
Capacity C = 1 - H(p, 1 - p)
Binary erasure channel

Correct bit transmitted with probability 1 - p
Erasure transmitted with probability p
Capacity C = 1 - p
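A small sketch of these two closed-form capacities (function names and the example value p = 0.1 are illustrative):

import math

def bsc_capacity(p: float) -> float:
    """Capacity of a binary symmetric channel with cross-over probability p:
    C = 1 - H(p, 1-p), in bits per channel use."""
    if p in (0.0, 1.0):
        return 1.0  # output determines the input exactly
    h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return 1.0 - h

def bec_capacity(p: float) -> float:
    """Capacity of a binary erasure channel with erasure probability p."""
    return 1.0 - p

print(bsc_capacity(0.1))  # about 0.531 bits per channel use
print(bec_capacity(0.1))  # 0.9 bits per channel use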
Coding theory

Information theory only gives us an upper bound on the communication rate. We need to use coding theory to find practical methods that approach this rate.

2 types:

Source coding - Compresses the source data to a smaller size.

Channel coding - Adds redundancy bits to make transmission across a noisy channel more robust (a toy repetition-code sketch follows).
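The slides do not specify a particular code; as a purely illustrative toy example of channel coding, a rate-1/3 repetition code adds redundancy so that a single bit error per group can be corrected by majority vote:

def repetition_encode(bits):
    """Channel coding toy example: repeat every bit three times."""
    return [b for b in bits for _ in range(3)]

def repetition_decode(received):
    """Decode by majority vote over each group of three received bits."""
    decoded = []
    for i in range(0, len(received), 3):
        group = received[i:i + 3]
        decoded.append(1 if sum(group) >= 2 else 0)
    return decoded

coded = repetition_encode([1, 0, 1])  # [1,1,1, 0,0,0, 1,1,1]
corrupted = coded[:]
corrupted[1] = 0                      # a single channel error
print(repetition_decode(corrupted))   # [1, 0, 1] - error corrected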
Example: Find the Mutual Information for the channel shown below

Channel diagram: P(x1) = 0.6, P(x2) = 0.4; x1 -> y1 with probability 0.8, x1 -> y2 with 0.2, x2 -> y1 with 0.3, x2 -> y2 with 0.7.

           [ 0.8  0.2 ]
P(y/x) =
           [ 0.3  0.7 ]
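A sketch of the computation I(X; Y) = H(Y) - H(Y/X) for this channel, using the numbers from the example (variable and helper names are illustrative):

import math

def h(probs):
    """Entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

px = [0.6, 0.4]                 # input distribution P(X)
pyx = [[0.8, 0.2], [0.3, 0.7]]  # transition matrix P(Y/X)

# Output distribution P(Y): p(yk) = sum over j of p(xj) p(yk/xj)
py = [sum(px[j] * pyx[j][k] for j in range(2)) for k in range(2)]

# Conditional entropy H(Y/X) = sum over j of p(xj) H(Y | X = xj)
h_y_given_x = sum(px[j] * h(pyx[j]) for j in range(2))

print([round(p, 6) for p in py])           # [0.6, 0.4]
print(round(h(py) - h_y_given_x, 4))       # about 0.1853 bits per channel use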
Types of channels and associated Entropy

Lossless channel
Deterministic channel
Noiseless channel
Binary symmetric channel

Lossless channel

For a lossless channel no source information is lost in transmission. Its channel matrix has only one non-zero element in each column.
Deterministic channel

The channel matrix has only one non-zero element in each row, for example

             [ 1  0  0 ]
[P(Y / X)] = [ 1  0  0 ]
             [ 0  1  0 ]
             [ 0  0  1 ]

In case of a deterministic channel p(y/x) = 0 or 1, as the probability of y given that x has occurred is either 0 or 1. Putting this in eq. 3 we get

H(y/x) = 0

Thus from eq. 1 we get

I(x, y) = H(y)

Also C = max H(y)
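A quick numerical check of this property, using the example matrix above and an assumed uniform input distribution (helper names are illustrative):

import math

# Deterministic channel: exactly one non-zero (unit) entry per row.
pyx = [
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
px = [0.25, 0.25, 0.25, 0.25]  # any input distribution works here

def h(probs):
    """Entropy in bits, ignoring zero-probability terms."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# H(Y/X) = sum over j of p(xj) H(Y | X = xj); each row has entropy 0.
h_y_given_x = sum(px[j] * h(pyx[j]) for j in range(len(px)))

# Output distribution P(Y) and mutual information I(X;Y) = H(Y) - H(Y/X).
py = [sum(px[j] * pyx[j][k] for j in range(len(px))) for k in range(3)]
print(h_y_given_x)                    # 0.0
print(h(py) - h_y_given_x == h(py))   # True: I(x, y) = H(y)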
CHANNEL CAPACITY OF A CONTINUOUS CHANNEL

For a discrete random variable x the entropy H(x) was defined as

H(x) = Σ p(x) log2(1/p(x))

H(x) for continuous random variables is obtained by using an integral instead of the discrete summation, thus

H(x) = ∫ f(x) log2(1/f(x)) dx

where f(x) is the probability density function of x.

Information Capacity Theorem for band-limited, power-limited Gaussian channels

Consider a process X(t) that is band-limited to B hertz. We also assume that uniform sampling of X(t) at the transmitter, at the Nyquist rate, produces 2B samples per second which are to be transmitted over the channel.

We also know that the Mutual Information for a channel is

I(X; Y) = H(y) - H(y/x) = H(x) - H(x/y)   (already derived)


With noise spectral density N0, the total noise in a bandwidth B is the spectral density multiplied by the bandwidth, i.e. N0B. Thus we can write

C = B log2(1 + S/(N0B)) bits per second

where S is the average received signal power. This is the Shannon theorem for channel capacity and is widely used in communication computations.

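A minimal sketch of this formula (the function name and the example bandwidth, noise density, and SNR are illustrative, not from the slides):

import math

def shannon_capacity(bandwidth_hz: float, signal_power: float,
                     noise_density_n0: float) -> float:
    """Shannon capacity C = B log2(1 + S / (N0 * B)) in bits per second."""
    snr = signal_power / (noise_density_n0 * bandwidth_hz)
    return bandwidth_hz * math.log2(1.0 + snr)

# Example: a 3 kHz channel with S/(N0*B) = 1000 (i.e., 30 dB SNR).
b = 3000.0
n0 = 1e-9
s = 1000.0 * n0 * b                 # choose S so that the SNR is 1000
print(shannon_capacity(b, s, n0))   # about 29,902 bits per second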
