
Fundamentals of Information Theory

Wei Liu
Ph.D., Associate Professor
Dept. of Electronics and Information Engineering
Huazhong University of Science and Technology
Web: http://itec.hust.edu.cn/ liuwei
Email: liuwei@hust.edu.cn
Phone: 027-87540745

About myself

Wei Liu (liuwei@hust.edu.cn)
Associate Professor, ITEC, EIE, HUST

Research interests
  - Content Centric Network
  - Software Defined Network
  - Traffic Estimation
  - Traffic Engineering
  - Internet Applications

Education
  - Ph.D., HUST
  - B.Eng., HUST

Lecture01

Overview
Chapter 1: Introduction to Information Theory

What is information theory?

- Theoretical stuff
  - Mathematics
  - Engineering insights

"The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point."
C.E. Shannon, 1948

- Not only for communication, but also for sensing, classification, economics, computer science, mathematics, networking, ...

What you will learn in this course!

- Knowledge:
  - Information theory framework
  - Coding theorems (source, channel, rate-distortion theorems)
  - Applications (source coding, channel coding, ...)
- Skill:
  - Matlab programming
  - Designing and implementing coding algorithms
- Insight: key concepts in information theory
  - Entropy
  - Channel capacity
  - Rate distortion
  - ...

About the course

- Understand the basic concepts in information theory
- Learn about information and its transmission using stochastic/statistical theory and techniques
- Learn commonly used approaches and methods in information theory to solve practical problems
- Develop analytical capability from an information-theoretic perspective

Course grading

- Prerequisite courses: Probability Theory and Stochastic Processes
- Course materials
  - Thomas M. Cover and Joy A. Thomas, Elements of Information Theory, 2nd ed., John Wiley & Sons, 2006
  - Lecture notes
  - Reference books and papers
- Grading
  - Homework (30%)
  - Class attendance (10%)
  - Quiz (10%)
  - Final (50%)
  - Course project (optional 5%)

Textbook: Elements of Information Theory

Thomas M. Cover and Joy A. Thomas, Elements of Information Theory, 2nd ed., John Wiley & Sons, 2006.

Course materials

- Landmark paper in information theory:
  - Claude E. Shannon, "A Mathematical Theory of Communication," Bell System Technical Journal, July & October 1948.
- Reference books:
  - Thomas M. Cover and Joy A. Thomas, Elements of Information Theory, 2nd ed., John Wiley & Sons, 2006.
  - R. G. Gallager, Information Theory and Reliable Communication, John Wiley & Sons, 1968.
  - David Tse and Pramod Viswanath, Fundamentals of Wireless Communication, Cambridge University Press, 2005.

Course organization: schedule (40 hours)

- Chapter 1: Course introduction (2 hours)
- Chapter 2: Entropy, Relative Entropy, and Mutual Information (8 hours)
- Chapter 3: Asymptotic Equipartition Property (2 hours)
- Chapter 4: Entropy Rate of a Stochastic Process (2 hours)
- Chapter 5: Data Compression (8 hours)
- Chapter 7: Channel Capacity (6 hours)
- Chapter 8: Differential Entropy (2 hours)
- Chapter 9: Gaussian Channel (2 hours)
- Chapter 10: Rate Distortion Theory (4 hours)
- Chapter 15: Network Information Theory (4 hours)

Lecture01

Overview
Chapter 1: Introduction to Information Theory
  What is information?
  What is information theory?
  Information theory in communication

What is information?

Eliminate redundancy?

What is information?

Extract essential characteristics?

What is information?

Symbolic presentation?

What is information?

[drawing omitted]
What do you see in the above drawing?

What is information?

Inspiration?

Shannon's perspective on information

- Causality is complex: inevitability vs. probability
- Uncertainty is inevitable.
- The past has happened; it cannot be controlled but can be observed. The future is yet to come; it cannot be observed but can be predicted.

Current status on the information definition

- No universal definition
- Lack of a complete, clear, universally recognized concept of information
- Reason:
  - We do not completely understand the nature of information
- Characteristics of information
  - Syntax: presentation
  - Semantics: meaning
  - Pragmatics: utility

Understand information at different levels

- Broad view
  - Syntax + Semantics + Utility
- Technical view
  - Syntax + Semantics
- Statistical view
  - Formulated by mathematics
  - Statistical characteristics in presentation
  - Syntax

A snapshot of Shannon's information theory

- Statistical information
- Advantages:
  - Clear definition
  - Statistical properties independent of presentation
- Entropy

    H(X) = -∑ p(x) log p(x)
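To illustrate the "independent of presentation" point, here is a minimal Matlab sketch (the distributions are made up): relabeling or reordering the symbols of a source leaves its entropy unchanged, because H(X) depends only on the probabilities.

    % Two sources with different symbol sets but the same probabilities
    p_letters = [0.5 0.3 0.2];   % source over {a, b, c}
    p_numbers = [0.2 0.5 0.3];   % source over {1, 2, 3}, same values reordered

    H1 = -sum(p_letters .* log2(p_letters));
    H2 = -sum(p_numbers .* log2(p_numbers));
    % H1 == H2: entropy is a statistical property of the distribution,
    % not of the symbols used to present it.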

Long before Shannon's information theory

- 1775 B.C., Greek letters
- 1400 B.C., Chinese oracle bone script
- 800 B.C., fooling the seigneurs by lighting false beacon fires

Development of modern communication technologies

- In 1838, Morse code
- In 1875, telephone invented by Bell
- In 1877, gramophone invented by Edison
- In 1901, wireless telegraph invented by Marconi
- In 1927, two radio programs broadcast by NBC
- In 1938, the radio drama "The War of the Worlds" caused panic
- In 1939, television broadcast
- In 1944, first computer built at Harvard University

Technical preparation before Shannon's information theory

- Telegraph (Morse, 1830s)
- Telephone (Bell, 1876)
- Wireless telegraph (Marconi, 1887)
- AM radio (early 1900s)
- SSB modulation (Carson, 1922)
- Television (1925-1927)
- Telex (1931)
- FM radio (Armstrong, 1936)
- Pulse-code modulation (PCM) (Reeves, 1937-1939)
- Vocoder (Dudley, 1939)
- Spread spectrum (1940s)

Long before Shannon's information theory

- In 1924, Nyquist found that the maximum transmission rate is proportional to log N [2]
  - Are Morse codes optimal?
  - What is the gain with the optimal Morse codes?
  - How to design the optimal codes?
- In 1928, Nyquist sampling theorem [3]

Technical preparation before Shannon's information theory

- In 1928, Hartley introduced the rate of communication, inter-symbol interference, and the capacity of a system to transmit information [4]
  - "The point of view developed is useful in that it provides a ready means of checking whether or not claims made for the transmission possibilities of a complicated system lie within the range of physical possibility."
- Hartley: capacity requires a quantitative metric of information
  - H = n log s, where n is the number of selections and s is the number of possible symbols
  - Information arises from a selection among a limited set of possibilities
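A tiny worked example of Hartley's measure in Matlab (a sketch; the numbers are made up): selecting n = 3 symbols from an alphabet of s = 26 letters.

    % Hartley's measure: H = n log s
    n = 3;              % number of selections
    s = 26;             % alphabet size (possible symbols)
    H = n * log2(s);    % ≈ 14.1 bits, using the base-2 logarithm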

Comments on Nyquist's and Hartley's work

- Contribution:
  - Definition of information
  - Metric of information quantity
- Limitation:
  - Ignores noise
  - Ignores the randomness of source symbols

Major challenges for establishing a communication theory

- Quantifying the information carried by symbols
- Transmission efficiency in communication systems
- Accuracy of information transmission
- Noise interference

Core problems: efficiency vs. reliability

Pioneering work by Shannon and Wiener

Essence of information in Shannon's information theory

- Information vs. material vs. energy
- Information vs. message vs. signal
- The essence of information: when you receive information, uncertainty is eliminated.

How to measure information?

- Uncertainty is eliminated by information.
- Goal: define a quantitative measure of information
- Mathematical tools:
  - Uncertainty can be described by probability theory, stochastic processes, and so on.
- Modeling steps:
  1. Investigate the properties of information
  2. Model them in probabilities

Step 1: Investigate the properties of information

Property 1: Information contained in events ought to be defined in terms of some measure of the uncertainty of the events.
Property 2: Less certain events ought to contain more information than more certain events.
Property 3: The information of unrelated/independent events taken as a single event should equal the sum of the information of the unrelated events.

Step 2: Model the properties of information in probabilities

- Property 1
  - A natural measure of the uncertainty of an event a is its probability, P(a).
  - Define the information of a in terms of P(a).
- Properties 2 and 3
  - Properties 2 and 3 are satisfied if the information in a is defined as

        I(a) = -log P(a)

Information property 2

    I(a) = -log P(a)

Less certain events ought to contain more information than more certain events.

- Suppose P(a) < P(b). Then

    1/P(a) > 1/P(b)
    log(1/P(a)) > log(1/P(b))
    -log P(a) > -log P(b)
    I(a) > I(b)

Information property 3

    I(a) = -log P(a)

The information of unrelated/independent events taken as a single event should equal the sum of the information of the unrelated events.

- Suppose a and b are two unrelated/independent events.
- (a, b) is considered together as a single event.
- Hence P(a, b) = P(a)P(b), and

    I(a, b) = -log P(a, b) = -log[P(a)P(b)]
            = -log P(a) - log P(b) = I(a) + I(b)
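A quick numerical sanity check of properties 2 and 3 in Matlab (a sketch; the probabilities are made up):

    % Self-information I(a) = -log2 P(a), in bits
    I = @(p) -log2(p);

    % Property 2: the rarer event carries more information
    Pa = 0.1; Pb = 0.5;
    assert(I(Pa) > I(Pb));      % I(a) ≈ 3.32 bits > I(b) = 1 bit

    % Property 3: information of independent events adds up
    Pab = Pa * Pb;              % joint probability of independent a, b
    assert(abs(I(Pab) - (I(Pa) + I(Pb))) < 1e-12);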

Information of event and source

- Information of an event
  - For an event A with probability P(A), its self-information is defined by

        I(A) = -log P(A)

- Information of a source
  - Model the source as a random variable X with probability mass function p(x).
  - The average information, or entropy, is defined by

        H(X) = -∑ p(x) log p(x)

- More details of these concepts come later, in source entropy.
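A minimal Matlab sketch of both definitions (the distribution below is made up): self-information and entropy computed directly from the formulas above.

    % Source with probability mass function p(x)
    p = [0.5 0.25 0.125 0.125];

    % Self-information of each outcome, in bits
    I = -log2(p);               % [1 2 3 3]

    % Entropy: the average self-information
    H = sum(p .* I);            % 1.75 bits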

Lecture01

Overview
Chapter 1: Introduction to Information Theory
  What is information?
  What is information theory?
  Information theory in communication

What is information theory?

- The major issue in information theory is to discover mathematical laws governing the communication and manipulation of information.
- Information theory sets up quantitative measures of information and of the capacity of various systems to transmit, store, and otherwise process information.
  - Information of an event: I(A)
  - Information of a source: H(X)

What is information theory?

- Classical information theory was published in the landmark paper "A Mathematical Theory of Communication" in 1948 by Claude E. Shannon (1916-2001).
- It originated in analyzing the limits of communication.

"The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point."
C.E. Shannon

- Q1: What is the ultimate data compression?
  Answer: the entropy H
- Q2: What is the ultimate transmission rate of communication?
  Answer: the channel capacity C
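To give Q2 a concrete flavor, here is a hedged Matlab sketch (channel capacity is covered later, in Chapter 7): for a binary symmetric channel with crossover probability p, the capacity is C = 1 - H2(p), where H2 is the binary entropy function.

    % Capacity of a binary symmetric channel (BSC), bits per channel use
    p  = 0.1;                              % crossover (bit-flip) probability
    H2 = -p*log2(p) - (1-p)*log2(1-p);     % binary entropy of p
    C  = 1 - H2;                           % ≈ 0.531 bits per channel use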

Goals of this course

- An understanding of the intrinsic properties of the transmission of information
- The relation between coding and the fundamental limits of information transmission in the context of communications
- NOT a comprehensive introduction to the field of information theory
- Does NOT touch in a significant manner on important topics such as modern coding methods and complexity

Applications in communication

- Main application area: coding
- Three coding theorems established by Shannon:
  - Source coding theorem
  - Channel coding theorem
  - Rate distortion theorem
- Practical methods invented and implemented after Shannon:
  - Source coding: Huffman codes (compact), Lempel-Ziv (compress, gzip)
  - Channel coding: error-correcting codes (Hamming, Reed-Solomon, convolutional, trellis, turbo)
  - Rate-distortion coding: vocoders, MiniDiscs, MP3, JPEG, MPEG
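A small illustration of why variable-length source codes pay off (a Matlab sketch; the source distribution and codeword lengths are made up, chosen to form a valid prefix code): frequent symbols get short codewords, and the average code length approaches the entropy.

    % Source probabilities and the codeword lengths of a prefix code
    % that assigns short codewords to frequent symbols
    p   = [0.5 0.25 0.125 0.125];  % symbol probabilities
    len = [1 2 3 3];               % lengths of codewords 0, 10, 110, 111

    L = sum(p .* len);             % average code length = 1.75 bits/symbol
    H = -sum(p .* log2(p));        % entropy = 1.75 bits/symbol
    % Here L == H: this code meets Shannon's source coding limit exactly,
    % whereas a fixed-length code would need 2 bits/symbol for 4 symbols.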

Relationship with other fields

[figure omitted]

Comments on Shannon's theory

- Limitations of Shannon's information theory
  - It models the information source by a sample space and describes the outcomes by probabilities.
  - What about otherwise? Uncountable spaces, unknown probabilities, ...
  - It does not involve subjective ideas.
- Other approaches in information theory
  - Noise theory, signal filtering and detection, statistical detection and prediction, modulation, ...
- In this course, we focus only on Shannon's theory.

Lecture01

Overview
Chapter 1: Introduction to Information Theory
  What is information?
  What is information theory?
  Information theory in communication

Information theory in communication

Typical model of a communication system

[figure omitted]

Block diagram of communication systems

The transmission and processing of information in communication systems

[figure omitted]

Source coding vs. channel coding

- Source coding
  - Core problem: efficiency
  - Efficiency: achieving an average code length that is as small as possible
  - Example: use shorter codewords for the English letters that appear frequently, so as to reduce the average code length
- Channel coding
  - Core problem: reliability
  - Reliability: coping with errors in transmission
  - Example: send the same sequence multiple times, so as to recover from channel errors (see the sketch below)
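A minimal Matlab sketch of the repetition idea mentioned above (the message bits and error positions are made up): each bit is sent three times, and the receiver takes a majority vote.

    % Rate-1/3 repetition code: send each bit 3 times, decode by majority vote
    bits = [1 0 1 1 0];                  % message bits
    tx   = repelem(bits, 3);             % transmitted: each bit repeated 3x

    rx = tx;
    rx([2 7 13]) = 1 - rx([2 7 13]);     % channel flips a few bits

    % Majority vote over each group of 3 received bits
    groups  = reshape(rx, 3, []);
    decoded = sum(groups, 1) >= 2;       % recovers [1 0 1 1 0]

Reliability is bought with efficiency here: three channel symbols per message bit, which is exactly the efficiency-vs-reliability trade-off discussed next.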

Reliability vs. efficiency

- The eternal issues of information theory
  - Lose reliability to achieve higher efficiency
  - Lose efficiency to achieve higher reliability
- Balance between efficiency and reliability
  - Efficiency:
    - Digital case: send as few symbols as possible
    - Analog case: reduce the time the channel is used, or the bandwidth
  - Reliability:
    - Digital case: make the error probability as small as possible
    - Analog case: reduce the noise as much as possible

Summary

Overview
Chapter 1: Introduction to Information Theory
  What is information?
  What is information theory?
  Information theory in communication

References

[1] T. M. Cover and J. A. Thomas, Elements of Information Theory, 2nd ed. Hoboken, NJ: J. Wiley, 2006.
[2] H. Nyquist, "Certain factors affecting telegraph speed," Bell Syst. Tech. J., vol. 3, pp. 324-352, Apr. 1924.
[3] H. Nyquist, "Certain topics in telegraph transmission theory," AIEE Trans., vol. 47, pp. 617-644, Apr. 1928.
[4] R. V. L. Hartley, "Transmission of information," Bell Syst. Tech. J., vol. 7, pp. 535-563, July 1928.
