Shannon Coding for the Discrete Noiseless Channel and Related Problems

Man DU    Mordecai GOLIN    Qin ZHANG
HKUST

Barcelona, Sept 16, 2009

Overview

This talk's punchline: Shannon Coding can be algorithmically useful.

Shannon Coding was introduced by Shannon as a proof technique in his
noiseless coding theorem. Shannon-Fano coding is what's primarily used
for algorithm design.

Outline

Huffman Coding and Generalizations
Previous Work & Background
New Work
A Counterexample
Open Problems

Prefix-free coding

Let Σ = {σ_1, σ_2, ..., σ_r} be an encoding alphabet.
Word w is a prefix of word w′ if w′ = wu where u is a non-empty word.
A code over Σ is a collection of words C = {w_1, ..., w_n}.

Code C is prefix-free if for all i ≠ j, w_i is not a prefix of w_j.
{0, 10, 11} is prefix-free. {0, 00, 11} isn't.

A prefix-free code can be modelled as (the leaves of) a tree.
[Figure: a binary tree with 0/1 edge labels whose leaves are
w_1 = 00, w_2 = 010, w_3 = 011, w_4 = 10, w_5 = 110, w_6 = 111.]

The prefix coding problem

Let cost(w) be the length, or number of characters, in w. Let
P = p_1, p_2, ..., p_n be a fixed discrete probability distribution (P.D.).

Define cost(C) = ∑_{i=1}^n cost(w_i) p_i.

The prefix coding problem, sometimes known as the Huffman encoding
problem, is to find a prefix-free code over Σ of minimum cost.

This is equivalent to finding a tree with minimum external path-length.
For the example tree with leaf probabilities 1/4, 1/4 at depth 2 and
1/8, 1/8, 1/8, 1/8 at depth 3, the cost is
2 (1/4 + 1/4) + 3 (1/8 + 1/8 + 1/8 + 1/8) = 5/2.

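As a quick illustration of this cost function, here is a minimal Python
sketch (function and variable names are ours, not the talk's):

```python
# Expected cost of a prefix-free code: codeword lengths weighted by
# the symbol probabilities -- cost(C) = sum_i cost(w_i) * p_i.

def code_cost(codewords, probs):
    return sum(len(w) * p for w, p in zip(codewords, probs))

# The example tree: two words of length 2, four of length 3.
C = ["00", "10", "010", "011", "110", "111"]
P = [1/4, 1/4, 1/8, 1/8, 1/8, 1/8]
print(code_cost(C, P))   # 2*(1/4 + 1/4) + 3*(4 * 1/8) = 2.5
```
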
The prefix coding problem

Useful for data transmission/storage and for modelling search problems.
Very well studied.

What's known

Sub-optimal codes:
Shannon coding (from the noiseless coding theorem): there exists a
prefix-free code with word lengths ℓ_i = ⌈−log_r p_i⌉, i = 1, 2, ..., n.
Shannon-Fano coding (probability splitting): try to put 1/r of the
probability in each node.
Both methods have cost within 1 of optimal.

Optimal codes:
Huffman 1952: a well-known O(r n log n)-time greedy algorithm
(O(rn)-time if the p_i are sorted in non-decreasing order).

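A short sketch of Shannon coding's length computation (our own code,
assuming a binary alphabet r = 2):

```python
import math

# Shannon coding word lengths: l_i = ceil(-log_r p_i).

def shannon_lengths(probs, r=2):
    return [math.ceil(-math.log(p, r)) for p in probs]

P = [0.4, 0.3, 0.2, 0.1]
L = shannon_lengths(P)
print(L)                             # [2, 2, 3, 4]
# Kraft inequality: a prefix-free code with these lengths exists.
print(sum(2 ** -l for l in L) <= 1)  # True (0.6875 <= 1)
```
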
What's not as well known

The fact that the greedy Huffman algorithm works is quite amazing.
Almost any possible modification or generalization of the original
problem causes greedy to fail. For some simple modifications, we don't
even have polynomial-time algorithms.

Generalizations: min-cost prefix coding

Unequal-cost coding:
Allow letters to have different costs, say, c(σ_j) = c_j.

Discrete Noiseless Channels (in Shannon's original paper):
This can be viewed as a strongly connected aperiodic directed graph
with k vertices (states).
1. Each edge leaving a vertex is labelled by an encoding letter σ, with
   at most one σ-edge leaving each vertex.
2. An edge labelled by σ leaving vertex i has cost c_{i,σ}.

Language restrictions:
Require all codewords to be contained in some given language L.

Generalizations: prefix-free coding with unequal-cost letters

Example: c_1 = 1, c_2 = 2 for letters a and b, giving, as
(p_i, w_i, c(w_i)):
(2/6, aaa, 3), (1/6, aab, 4), (1/6, ab, 3), (2/6, b, 2).

Corresponds to different letter transmission/storage costs, e.g., the
Telegraph Channel. Also to different costs for evaluating test outcomes
in, e.g., group testing.

The size of the encoding alphabet, Σ, could be countably infinite!

Generalizations: prefix-free coding in a Discrete Noiseless Channel

[Figure: a three-state channel S_1, S_2, S_3 with labelled edge costs,
and the code tree rooted at start state S_1 giving, as (p_i, w_i, c(w_i)):
(1/6, aaa, 4), (1/6, aab, 5), (1/6, aba, 4), (1/6, abb, 5), (2/6, b, 3).]

The cost of a letter depends upon the current state.
In Shannon's original paper, k = # states and |Σ| are both finite.

A codeword has both start and end states. In a coded message, a new
codeword must start from the final state of the preceding one.
We need k code trees, each one rooted at a different state.

Generalizations: prefix-free coding with language restrictions

Find a min-cost prefix code in which all words belong to a given
language L.

Example: L = (0 + 1)*1, all binary words ending in 1.
Used in constructing self-synchronizing codes.

One of the problems that motivated this research:
let L be the set of all binary words that do not contain a given
pattern, e.g., 010. There was no previous good way of finding a
min-cost prefix code with such restrictions.

Generalizations: prefix-free coding with regular language restrictions

In this case, there is a DFA M accepting the language L.

[Figure: a four-state DFA S_1, ..., S_4 over {0, 1} for
L = ((0 + 1)* 000)^C = binary strings not ending in 000.]

Erasing the non-accepting states, M can be drawn with a finite # of
states but a countably infinite encoding alphabet (edges labelled by
words such as 1, 0, 01, 0*1, ...).

Note: the graph doesn't need to be strongly connected. It might even
have sinks!

Generalizations: prefix-free coding with regular language restrictions

This can still be rewritten as a min-cost tree problem: unroll the
reduced automaton into an infinite tree whose nodes are labelled by
states and whose edges are labelled by the words 0, 1, 01, 001, ...

[Figure: the reduced automaton and the first levels of the
corresponding tree, rooted at S_1.]

Outline

Huffman Coding and Generalizations
Previous Work & Background
New Work
A Counterexample
Open Problems

Previous Work: Unequal Cost Coding

Letters in Σ have different costs c_1 ≤ c_2 ≤ c_3 ≤ ... ≤ c_r.
Models different transmission/storage costs.

Karp (1961): Integer Linear Programming solution
Blachman (1954), Marcus (1957), Gilbert (1995): heuristics
G., Rote (1998): O(n^{c_r + 2}) DP solution
Bradford et al. (2002), Dumitrescu (2006): O(n^{c_r})
G., Kenyon, Young (2002): a PTAS

Big Open Question: we still don't know if it's NP-hard, in P, or
something in between.

Most practical solutions are additive-error approximations.

Previous Work: Unequal Cost Coding

Efficient algorithms (O(n log n) or O(n)) that create codes which are
within an additive error of optimal: COST ≤ OPT + K.

Krause (1962)
Csiszar (1969)
Cott (1977)
Altenkamp and Mehlhorn (1980)
Mehlhorn (1980)
G. and Li (2007)

K is a function of the letter costs c_1, c_2, c_3, ...
The K(c_1, c_2, c_3, ...) are incomparable between different algorithms.
K is often a function of the longest letter length c_r, a problem when
r = ∞.

All the algorithms above are Shannon-Fano type codes; they differ in
how they define the approximate split.

Previous Work

The Discrete Noiseless Channel: the only previous result seems to be
Csiszar (1969), who gives an additive approximation to the optimal
code, again using a generalization of Shannon-Fano splitting.

Language constraints:

1-ended codes:
Capocelli et al. (1994), Berger, Yeung (1990): exponential search
Chan, G. (2000): O(n^3) DP algorithm

"Sound of Silence" (binary codes with at most k zeros):
Dolev et al. (1999): n^{O(k)} DP algorithm

General regular language constraint:
Folk theorem: given a DFA with m states accepting L, an optimal code
can be built in n^{O(m)} time. (O(m) ≈ 3m.)

No good efficient algorithm known.

Previous Work

Pre-Huffman, there were two sub-optimal constructions for the basic case.

Shannon coding (from the noiseless coding theorem): there exists a
prefix-free code with word lengths ℓ_i = ⌈−log_r p_i⌉, i = 1, 2, ..., n.

Shannon-Fano coding (probability splitting): try to put 1/r of the
probability in each node.

Shannon Coding vs. Shannon-Fano Coding

Shannon Coding: set ℓ_i = ⌈−log_r p_i⌉.
Given the depths ℓ_i, we can build the tree via a top-down linear scan:
when moving down a level, expand all non-used leaves to be parents.

Shannon-Fano Coding: recursively split p_1, p_2, ..., p_n into r
consecutive groups
p_1, ..., p_{i_1} | p_{i_1+1}, ..., p_{i_2} | ... | p_{i_{r−1}+1}, ..., p_n,
putting approximately 1/r of the probability into each group.

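A minimal sketch of this top-down scan for r = 2 (our own names; the
depths must satisfy the Kraft inequality):

```python
# Build codewords from prescribed depths by a top-down linear scan:
# keep the frontier of unused nodes at the current level; a node either
# becomes a codeword leaf or is expanded into its two children.

def code_from_lengths(lengths):
    frontier, depth, words = [""], 0, []
    for l in sorted(lengths):
        while depth < l:                  # move down a level
            frontier = [w + b for w in frontier for b in "01"]
            depth += 1
        if not frontier:
            raise ValueError("lengths violate the Kraft inequality")
        words.append(frontier.pop(0))     # this node becomes a leaf
    return words

print(code_from_lengths([2, 2, 4, 4, 4, 4]))
# ['00', '01', '1000', '1001', '1010', '1011']
```
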
Shannon Coding vs. Shannon-Fano Coding

Example: p_1 = p_2 = 1/3, p_3 = p_4 = p_5 = p_6 = 1/12.

Shannon coding:
ℓ_1 = ℓ_2 = 2 = ⌈log_2 3⌉ and ℓ_3 = ℓ_4 = ℓ_5 = ℓ_6 = 4 = ⌈log_2 12⌉.
The resulting tree has empty slots, so it can be improved.

[Figure: the Shannon tree (leaf depths 2, 2, 4, 4, 4, 4) next to the
Shannon-Fano tree for the same distribution.]

Shannon Coding vs. Shannon-Fano Coding

Example: p_1 = p_2 = 1/3, p_3 = p_4 = p_5 = p_6 = 1/12.

Shannon-Fano: first, sort the items and insert them at the root.
While a node contains more than 1 item, split its items' weights as
evenly as possible, with at most 1/2 of the node's weight in the left
child.

[Figure: the root {1/3, 1/3, 1/12, 1/12, 1/12, 1/12} splits into {1/3}
and {1/3, 1/12, 1/12, 1/12, 1/12}; the right child splits into {1/3}
and {1/12, 1/12, 1/12, 1/12}; the latter splits into {1/12, 1/12} and
{1/12, 1/12}, and those into single leaves.]

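A sketch of this splitting rule in Python (binary alphabet; our own
code, not from the talk):

```python
# Recursive Shannon-Fano splitting: divide the sorted probabilities into
# two consecutive groups, with at most half of the weight going left.

def shannon_fano(probs, prefix=""):
    if len(probs) == 1:
        return [prefix]
    total, acc, split = sum(probs), 0.0, 0
    while split < len(probs) - 1 and acc + probs[split] <= total / 2:
        acc += probs[split]
        split += 1
    split = max(split, 1)                 # left group must be non-empty
    return (shannon_fano(probs[:split], prefix + "0") +
            shannon_fano(probs[split:], prefix + "1"))

P = sorted([1/3, 1/3] + [1/12] * 4, reverse=True)
print(shannon_fano(P))
# ['0', '10', '1100', '1101', '1110', '1111']  (cost 7/3 vs. 8/3 for Shannon)
```
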
Previous Work: Unequal Cost Codes
Shannon-Fano coding for unequal-cost codes

α: the unique positive root of ∑_i α^{−c_i} = 1.

Recursively split p_1, p_2, ..., p_n into consecutive groups
p_1, ..., p_{i_1} | p_{i_1+1}, ..., p_{i_2} | ... | p_{i_{k−1}+1}, ..., p_n
so that approximately an α^{−c_i} fraction of the probability in a node
is put into its i-th child (reached by an edge of cost c_i).

Note: this can work for infinite alphabets, as long as α exists.

All previous algorithms were Shannon-Fano like; they differed in how
they implemented the approximate split.

Shannon-Fano coding for unequal-cost codes

α: the unique positive root of ∑_i α^{−c_i} = 1. Split the
probabilities so that approximately α^{−c_i} of the probability in a
node is put into its i-th child.

Example, the Telegraph Channel: c_1 = 1, c_2 = 2, so
α^{−1} = (√5 − 1)/2. Put an α^{−1} fraction of the root's weight W in
the left subtree (W/α) and an α^{−2} fraction in the right (W/α²).

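A small sketch for computing α numerically (the talk doesn't prescribe
a method; bisection on x = 1/α is one simple choice):

```python
# Find the capacity alpha: the unique positive root of
# sum_i alpha**(-c_i) = 1.  Bisect on x = 1/alpha in (0, 1), where
# f(x) = sum_i x**c_i - 1 is increasing (assumes >= 2 letters).

def capacity(costs, iters=60):
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(mid ** c for c in costs) < 1.0:
            lo = mid          # root is at a larger x
        else:
            hi = mid
    return 1.0 / hi

print(capacity([1, 2]))       # telegraph channel: the golden ratio 1.618...
```
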
Shannon-Fano coding for unequal-cost codes

α: the unique positive root of ∑_i α^{−c_i} = 1. Split the
probabilities so that approximately α^{−c_i} of the probability in a
node is put into its i-th child.

Example, 1-ended coding: for all i > 0, c_i = i.
∑_i α^{−c_i} = 1 gives α^{−1} = 1/2. Put 2^{−i} of a node's weight into
its i-th subtree; the i-th encoding letter denotes the string 0^{i−1}1
(e.g., 1, 01, 001, 0001, ...).

Previous Work: a well-known lower bound

Given coding letter lengths c = c_1, c_2, c_3, ... with gcd(c_i) = 1,
let α be the unique positive root of g(z) = 1 − ∑_j z^{−c_j}.
Note: α is sometimes called the capacity.

For a given P.D., set H_α = −∑_i p_i log_α p_i.
Note: if c_1 = c_2 = 1 then α = 2 and H_α is the standard entropy.

Theorem: let OPT be the cost of a min-cost code for the given P.D. and
letter costs. Then H_α ≤ OPT.

Note: if c_1 = c_2 = 1 then α = 2 and this is the classic Shannon
information-theoretic lower bound.

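For concreteness, a sketch of this lower bound computation (our own
code):

```python
import math

# Generalized entropy H_alpha = -sum_i p_i * log_alpha(p_i),
# a lower bound on the cost of any prefix-free code.

def h_alpha(probs, alpha):
    return -sum(p * math.log(p, alpha) for p in probs)

alpha = (1 + 5 ** 0.5) / 2            # telegraph channel, c = (1, 2)
P = [1/3, 1/3, 1/12, 1/12, 1/12, 1/12]
print(h_alpha(P, alpha))              # ~3.24 <= OPT
# With alpha = 2 this is the standard Shannon entropy:
print(h_alpha(P, 2))                  # ~2.25
```
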
Outline

Huffman Coding and Generalizations
Previous Work & Background
New Work
A Counterexample
Open Problems

New Work

Shannon coding only seems to have been used in the proof of the
noiseless coding theorem. It never seems to have actually been used as
an algorithmic tool.

All of the (additive-error) approximation algorithms for unequal-cost
coding, and Csiszar's (1969) approximation algorithm for coding in a
Discrete Noiseless Channel, were variations of Shannon-Fano coding.

The main idea behind our new results is that Shannon-Fano splitting is
not necessary; Shannon coding suffices.

This yields efficient additive-error approximation algorithms for
unequal-cost coding and the Discrete Noiseless Channel, as well as for
regular language constraints.

New Results for Unequal Cost Coding

Given coding letter lengths c, let α be the capacity.
Then there exists K > 0, depending only upon c, such that if
1. P = p_1, p_2, ..., p_n is any P.D., and
2. ℓ_1, ℓ_2, ..., ℓ_n is any set of integers such that for all i,
   ℓ_i ≥ K + ⌈−log_α p_i⌉,
then there exists a prefix-free code for which the ℓ_i are the word
lengths.

⇒ ∑_i p_i ℓ_i ≤ K + 1 + H_α(P) ≤ OPT + K + 1.

This gives an additive approximation of the same type as Shannon-Fano
splitting, without the splitting (same time complexity, but many fewer
operations on reals).

The same result holds for DNC and regular language restrictions; there
α is a function of the DNC or L-accepting automaton graph.

Proof of the Theorem

We first prove the following lemma.

Lemma: given c and the corresponding α, there exists ε > 0, depending
only upon c, such that if ∑_{i=1}^n α^{−ℓ_i} ≤ ε, then there exists a
prefix-free code with word lengths ℓ_1, ℓ_2, ..., ℓ_n.

Note: if c_1 = c_2 = 1 then α = 2. Let ε = 1 and the condition becomes
∑_i 2^{−ℓ_i} ≤ 1; the lemma then becomes one direction of the Kraft
inequality.

Proof of the Lemma

Let L(n) be the number of nodes on level n of the infinite tree
corresponding to c. One can show there exist t_1, t_2 s.t.
t_1 α^n ≤ L(n) ≤ t_2 α^n.

[Figure: the infinite tree with marked levels ℓ_1, ℓ_2, ..., ℓ_i. The
grey regions are the parts of the infinite tree that are erased when a
node on level ℓ_k becomes a leaf.]

A node on level ℓ_k has L(ℓ_i − ℓ_k) descendants on level ℓ_i.
A node on level ℓ_i can become a leaf iff the grey regions do not cover
all nodes on level ℓ_i, i.e., iff
∑_{k=1}^{i−1} L(ℓ_i − ℓ_k) < L(ℓ_i).

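The level counts L(n) and this leaf-placement condition are easy to
compute; here is a sketch (our own names, reading the slide's condition
as a test applied to the sorted lengths):

```python
# L(n) = number of nodes on level n of the infinite tree for letter
# costs c: L(0) = 1 and L(n) = sum_j L(n - c_j).

def level_counts(costs, max_n):
    L = [1] + [0] * max_n
    for n in range(1, max_n + 1):
        L[n] = sum(L[n - c] for c in costs if c <= n)
    return L

# The slide's condition for sorted lengths l_1 <= ... <= l_n: each new
# leaf needs an uncovered node on its level, i.e.
# sum_{k < i} L(l_i - l_k) < L(l_i) for every i.
def lengths_feasible(costs, lengths):
    ls = sorted(lengths)
    L = level_counts(costs, ls[-1])
    return all(sum(L[ls[i] - ls[k]] for k in range(i)) < L[ls[i]]
               for i in range(len(ls)))

print(level_counts([1, 2], 6))   # [1, 1, 2, 3, 5, 8, 13]: Fibonacci ~ alpha^n
```
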
Proof of the Lemma

We just need to show that 0 < L(ℓ_i) − ∑_{k=1}^{i−1} L(ℓ_i − ℓ_k):

L(ℓ_i) − ∑_{k=1}^{i−1} L(ℓ_i − ℓ_k)
    ≥ t_1 α^{ℓ_i} − t_2 ∑_{k=1}^{i−1} α^{ℓ_i − ℓ_k}
    = α^{ℓ_i} ( t_1 − t_2 ∑_{k=1}^{i−1} α^{−ℓ_k} )
    ≥ α^{ℓ_i} ( t_1 − t_2 ε ).

Choosing ε < t_1 / t_2 makes this > 0.

Proof of the Main Theorem

Set K = −log_α ε. (Recall ℓ_i ≥ K + ⌈−log_α p_i⌉.) Then

∑_{i=1}^n α^{−ℓ_i} ≤ ∑_{i=1}^n α^{−K − ⌈−log_α p_i⌉}
                   ≤ ε ∑_{i=1}^n α^{log_α p_i}
                   = ε ∑_{i=1}^n p_i = ε.

From the previous lemma, a prefix-free code with those word lengths
ℓ_1, ℓ_2, ..., ℓ_n exists, and we are done.

Example: c_1 = 1, c_2 = 2

α = (√5 + 1)/2, K = 1.

Consider p_1 = p_2 = p_3 = p_4 = 1/4.
Note that ⌈−log_α p_i⌉ = 3.

No tree with all ℓ_i = 3 exists.
But a tree with ℓ_i = ⌈−log_α p_i⌉ + 1 = 4 does!

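Re-using the lengths_feasible sketch from the lemma, we can check this
example mechanically:

```python
import math

# Assumes level_counts / lengths_feasible from the earlier sketch.
alpha = (5 ** 0.5 + 1) / 2
print(math.ceil(-math.log(1 / 4, alpha)))       # 3
print(lengths_feasible([1, 2], [3, 3, 3, 3]))   # False: L(3) = 3 < 4 leaves
print(lengths_feasible([1, 2], [4, 4, 4, 4]))   # True:  L(4) = 5 >= 4 leaves
```
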
The Algorithm

A valid K could be found by working through the proof of the Theorem.
Technically this is O(1) but, practically, it would require some
complicated operations on reals.

Alternatively, perform a doubling search for K*, the smallest K for
which the theorem is valid:
Set K = 1, 2, 2², 2³, ...
Test whether ℓ_i = K + ⌈−log_α p_i⌉ has a valid code (this can be done
efficiently), until K is good but K/2 is not.
Note that K/2 < K* ≤ K.
Now set a = K/2, b = K, and binary search for K* in [a, b].

Subtle point: the search will find some K′ ≤ K* for which a code exists.

Time complexity: O(n log K*).

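A sketch of the search (our own code, with the lengths_feasible sketch
from before standing in for the efficient validity test):

```python
import math

# Doubling-plus-binary search for an additive constant K that admits a
# code; validity need not be monotone in K, so, as the slide notes, we
# may find some good K' <= K*.

def find_K(costs, probs, alpha):
    base = [math.ceil(-math.log(p, alpha)) for p in probs]
    good = lambda K: lengths_feasible(costs, [K + b for b in base])
    K = 1
    while not good(K):                 # doubling phase: 1, 2, 4, 8, ...
        K *= 2
    lo, hi = K // 2, K                 # hi is good; lo (= K/2) was not
    while lo + 1 < hi:                 # binary search in (lo, hi]
        mid = (lo + hi) // 2
        if good(mid):
            hi = mid
        else:
            lo = mid
    return hi

alpha = (5 ** 0.5 + 1) / 2
print(find_K([1, 2], [1 / 4] * 4, alpha))   # 1, matching the example above
```
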
The Algorithm for infinite encoding alphabets

The proof assumed two things:
(i) a root α of ∑_i α^{−c_i} = 1 exists;
(ii) there exist t_1, t_2 s.t. t_1 α^n ≤ L(n) ≤ t_2 α^n, where L(n) is
the number of nodes on level n of the infinite tree.

This is always true for a finite encoding alphabet, but not necessarily
true for infinite encoding alphabets; we will see a simple example in
the next section.

But if (i) and (ii) are true for an infinite alphabet, the theorem and
algorithm hold.

Example: 1-ended codes, c_i = i. Here α^{−1} = 1/2 and (ii) is true, so
the theorem and algorithm hold.

Extensions to DNC and Regular Language Restrictions

Discrete Noiseless Channels

[Figure: the three-state channel and its code trees from before.]

Let L(n) be the number of nodes on level n of the infinite tree.
The fact that the graph is biconnected and aperiodic implies that there
exist α, t_1, t_2 s.t. t_1 α^n ≤ L(n) ≤ t_2 α^n.

The algorithm will still work for ℓ_i ≥ K + ⌈−log_α p_i⌉.

Note: the algorithm must construct k different coding trees, one for
each state (tree root).

A subtle point is that any node on level ℓ_i can be chosen for p_i,
independent of its state! The algorithm still works.

Extensions to DNC and Regular Language Restrictions

Regular Language Restrictions

Assumption: the language is aperiodic, i.e., there exists N such that
for all n > N there is at least one word of length n.

[Figure: the DFA for binary strings not ending in 000 and its reduced
automaton, as before.]

Let L(n) be the number of nodes on level n of the infinite tree.
The fact that the language is aperiodic implies that there exist
α, t_1, t_2 s.t. t_1 α^n ≤ L(n) ≤ t_2 α^n, where α is the largest
dominant eigenvalue of a connected component of the DFA.

The algorithm will still work for ℓ_i ≥ K + ⌈−log_α p_i⌉.
Again, any node at level ℓ_i can be labelled with p_i, independent of
its state.

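As an illustration of where α comes from, here is a sketch computing the
dominant eigenvalue of a transition-count matrix (the matrix below is
our reconstruction of the 000-avoiding DFA, not taken from the talk):

```python
import numpy as np

# alpha from the DFA: dominant eigenvalue of the transition-count matrix
# M[i][j] = number of letters moving state i to state j.  States track
# the length of the trailing run of 0s (capped at 3).

M = np.array([[1, 1, 0, 0],   # on 1 -> s0, on 0 -> s1
              [1, 0, 1, 0],   # on 1 -> s0, on 0 -> s2
              [1, 0, 0, 1],   # on 1 -> s0, on 0 -> s3
              [1, 0, 0, 1]])  # on 1 -> s0, on 0 -> s3

alpha = max(abs(np.linalg.eigvals(M)))
print(alpha)   # ~2.0: the language keeps a constant fraction of all strings
```
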
Outline

Huffman Coding and Generalizations
Previous Work & Background
New Work
A Counterexample
Conclusion and Open Problems

A Counterexample

Let c be the countably infinite cost set defined by
|{j | c_j = i}| = 2 C_{i−1},
where C_i = (1/(i+1)) (2i choose i) is the i-th Catalan number.

Constructing prefix-free codes with these c can be shown to be
equivalent to constructing balanced binary prefix-free codes, in which,
for every word, the number of 0s equals the number of 1s.

For this problem, the length of a balanced word = # of 0s in the word,
e.g., |10| = 1, |001110| = 3.

No efficient additive-error approximation is known.

A Counterexample

Let L be the set of all balanced binary words.
Set Q = {01, 10, 0011, 1100, 000111, ...}, the language of all balanced
binary words without a proper balanced prefix.

Then L = Q*, and every word in L can be uniquely decomposed into a
concatenation of words in Q.

The # of words of length i in Q is 2 C_{i−1}.

Prefix coding in L is equivalent to prefix coding with the infinite
alphabet Q.

A Counterexample

Note: the characteristic equation is
g(z) = 1 − ∑_j z^{−c_j} = 1 − ∑_i 2 C_{i−1} z^{−i} = √(1 − 4/z),
for which the root α does not exist (α = 4 is an algebraic
singularity).

One can prove that for α ≥ 4 and any K, we can always find
p_1, p_2, ..., p_n s.t. there is no prefix code with lengths
ℓ_i = K + ⌈−log_α p_i⌉.

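A quick numeric check of this series (our own sketch; using the term
ratio C_i / C_{i−1} = 2(2i−1)/(i+1) avoids huge integers):

```python
# Partial sums of sum_i 2*C_{i-1} * z**(-i) = 1 - sqrt(1 - 4/z):
# the sum stays below 1 for z > 4 and only reaches 1 in the limit at
# the singularity z = 4, so no usable root exists.

def char_sum(z, terms=100_000):
    total, t = 0.0, 2.0 / z            # t = 2*C_{i-1} * z**(-i) at i = 1
    for i in range(1, terms):
        total += t
        t *= 2 * (2 * i - 1) / ((i + 1) * z)
    return total

for z in (8.0, 5.0, 4.5, 4.0):
    print(z, round(char_sum(z), 4))
# 8.0 0.2929   5.0 0.5528   4.5 0.6667   4.0 0.998... (-> 1 very slowly)
```
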
A Counterexample

α = 4 is an algebraic singularity of the characteristic equation.

Can prove that for α ≥ 4 and any K, we can always find
p_1, p_2, ..., p_n s.t. there is no prefix code with lengths
ℓ_i = K + ⌈−log_α p_i⌉.

Can also prove that for α < 4 and any K and Δ, we can always find
p_1, p_2, ..., p_n s.t. if a prefix code with lengths
ℓ_i ≥ K + ⌈−log_α p_i⌉ exists, then ∑_i ℓ_i p_i − OPT > Δ.

⇒ No Shannon-Coding type algorithm can guarantee an additive-error
approximation for a balanced prefix code.

Outline

Huffman Coding and Generalizations
Previous Work & Background
New Work
A Counterexample
Conclusion and Open Problems

Conclusion and Open Problems

We saw how to use Shannon Coding to develop efficient approximation
algorithms for prefix-coding variants, e.g., unequal-cost coding,
coding in the Discrete Noiseless Channel, and coding with regular
language constraints.

Old Open Question: is unequal-cost coding NP-complete?

New Open Question: is there an additive-error approximation algorithm
for prefix coding using balanced strings?
We just saw that Shannon Coding doesn't work.
G. & Li (2007) proved that (a variant of) Shannon-Fano doesn't work.
Perhaps no such algorithm exists.

The End

Thank You

Q and A