J. Appl. Prob. 12, 845-851 (1975)
Printed in Israel
© Applied Probability Trust 1975

INACCURACY AND A CODING THEOREM

RAM AUTAR* AND RAMINDER SINGH SONI, University of Delhi

Received 15 January 1975.

* Present address: National Council of Educational Research and Training, New Delhi.

Abstract

Kerridge introduced a measure known as inaccuracy for complete probability distributions, which is a generalisation of Shannon's entropy. In this paper we study a grouping property of the inaccuracy. We also establish a coding theorem for personal codes by considering the inaccuracy of order $\alpha$ and the generalised mean length of order $t$ under the condition $\sum_{i=1}^{N} p_i q_i^{-1} D^{-n_i} \le 1$.

ENTROPY; INACCURACY; GENERALISED CODE LENGTH; PERSONAL CODES; CODING THEOREM

1. Introduction

For the probability distributions $P = (p_1, \dots, p_N)$ with $p_i \ge 0$, $\sum_{i=1}^{N} p_i = 1$, and $Q = (q_1, \dots, q_N)$ with $q_i \ge 0$, $\sum_{i=1}^{N} q_i = 1$, Kerridge [4] has introduced a quantity known as inaccuracy, given by

(1.1)  $H(P; Q) = -\sum_{i=1}^{N} p_i \log_D q_i,$

which is a generalisation of Shannon's [6] entropy.
In this paper we take the inaccuracy of generalised distributions as

(1.2)  $H(P; Q) = \sum_{i=1}^{N} h(p_i; q_i) \Big/ \sum_{i=1}^{N} p_i, \qquad 0 < \sum_{i=1}^{N} p_i \le 1, \quad \sum_{i=1}^{N} q_i \le 1,$

where $h(p; q)$ is a function defined as

(1.3)  $h(p; q) = \begin{cases} p \log_D q^{-1} & \text{for } p, q > 0 \\ 0 & \text{for } p = 0,\ q \ge 0 \\ +\infty & \text{for } p > 0,\ q = 0 \end{cases}$

and study some of its properties (we assume that $0 \log_D 0 = 0$).
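As a numerical illustration (ours, not part of the original paper), the following Python sketch evaluates $h(p; q)$ and the inaccuracy (1.2); the function names, the base D = 2, and the example distributions are our own choices.

import math

def h(p, q, D=2):
    """Pointwise inaccuracy h(p; q) of (1.3), with base-D logarithms."""
    if p == 0:
        return 0.0           # covers p = 0, q >= 0 (convention 0 log 0 = 0)
    if q == 0:
        return math.inf      # p > 0 but q = 0: infinite inaccuracy
    return p * math.log(1.0 / q, D)

def inaccuracy(P, Q, D=2):
    """H(P; Q) of (1.2) for generalised distributions (0 < sum P <= 1)."""
    return sum(h(p, q, D) for p, q in zip(P, Q)) / sum(P)

P = [0.5, 0.3, 0.2]
Q = [0.4, 0.4, 0.2]
print(inaccuracy(P, Q))   # Kerridge inaccuracy (1.1), since P and Q are complete
print(inaccuracy(P, P))   # with Q = P this reduces to Shannon's entropy of P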
Further, following Rényi [5], the inaccuracy of order $\alpha$, studied by Sharma [7], is

(1.4)  $H_\alpha(P; Q) = (1 - \alpha)^{-1} \log_D \sum_{i=1}^{N} p_i q_i^{\alpha - 1}, \qquad \alpha > 0,\ \alpha \ne 1, \quad \sum_{i=1}^{N} p_i = \sum_{i=1}^{N} q_i = 1.$

For convenience the base of the logarithm is here taken as D. Clearly $H_\alpha(P; Q)$ reduces to $H(P; Q)$ as $\alpha \to 1$.
The idea of personal codes was given earlier by Kerridge [4] for (1.1). We give a coding theorem for personal codes by considering the inaccuracy of order $\alpha$ and the generalised mean length of order $t$ given by Campbell [1],

(1.5)  $L(t) = t^{-1} \log_D \sum_{i=1}^{N} p_i D^{t n_i}, \qquad (0 < t < \infty)$

where D represents the size of the code alphabet and $n_i$ is the length of the code word for the ith message under a given condition.
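For concreteness, a small Python sketch of (1.4) and (1.5) follows (again our own illustration; D = 2 and the test data are arbitrary). It also checks numerically that $H_\alpha(P; Q) \to H(P; Q)$ as $\alpha \to 1$.

import math

def H_alpha(P, Q, alpha, D=2):
    """Inaccuracy of order alpha, eq. (1.4)."""
    s = sum(p * q ** (alpha - 1.0) for p, q in zip(P, Q))
    return math.log(s, D) / (1.0 - alpha)

def L_t(P, n, t, D=2):
    """Campbell's mean code length of order t, eq. (1.5)."""
    s = sum(p * D ** (t * ni) for p, ni in zip(P, n))
    return math.log(s, D) / t

P = [0.5, 0.3, 0.2]
Q = [0.4, 0.4, 0.2]
kerridge = -sum(p * math.log(q, 2) for p, q in zip(P, Q))   # eq. (1.1)
print(H_alpha(P, Q, 1.0001), kerridge)   # nearly equal: H_alpha -> H as alpha -> 1
print(L_t(P, [1, 2, 2], t=0.5))          # mean length for code lengths (1, 2, 2)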
Ordinary communication theory, as discussed for example by Shannon [6],
shows how to choose the most efficient codes when the frequencies with which
messages will be sent are exactly known. In practice, these frequencies are rarely
known exactly, and personal codes have to be used instead.

2. Properties of inaccuracy
We prove below two results regarding the function h (p; q) defined in (1.3).
Lemma 1. Let $r_1, r_2, \dots, r_N$ and $u_1, u_2, \dots, u_N$ be non-negative real numbers; then

(2.1)  $\sum_{i=1}^{N} h(r_i; u_i) \ge h\!\left(\sum_{i=1}^{N} r_i;\ \sum_{i=1}^{N} u_i\right).$

Proof. Since $r_i$ and $u_i$ $(i = 1, 2, \dots, N)$ are all non-negative numbers, we have

(2.2)  $u_1^{r_1} u_2^{r_2} \cdots u_N^{r_N} \le (u_1 + u_2 + \cdots + u_N)^{r_1 + r_2 + \cdots + r_N}$

with equality if either $r_i = 0$ for each i or $u_i = 0$ with $r_i \ne 0$ for each i. The result now follows on taking logarithms of both sides and using the definition of $h(p; q)$.

If $\sum_{i=1}^{N} u_i = 0$, the inequality (2.1) reduces to $0 = 0$ for $\sum_{i=1}^{N} r_i = 0$, and to $\infty = \infty$ for $\sum_{i=1}^{N} r_i > 0$ provided $r_i > 0$ for each i. Also if $\sum_{i=1}^{N} u_i > 0$ and $u_i = 0$ for some i, then the left-hand side of (2.1) becomes infinite, the inequality holding in that case also.
Lemma 2. Let $r_1, r_2, \dots, r_N$ and $u_1, u_2, \dots, u_N$ be non-negative real numbers with $\sum_{i=1}^{N} u_i \le 1$; we have

(2.3)  $0 \le \sum_{i=1}^{N} h(r_i; u_i) \le \infty$

with $\sum_{i=1}^{N} h(r_i; u_i) = \infty$ if and only if $u_k = 0$ and $r_k > 0$ for some k, and $\sum_{i=1}^{N} h(r_i; u_i) = 0$ if and only if $r_i = 0$ for all i.

Proof. The inequalities follow from Lemma 1. If $\sum_{i=1}^{N} h(r_i; u_i) = \infty$ then at least one $u_i$ must vanish for $r_i > 0$, since we are dealing with a finite sum. Next, $r_i = 0$ for each i implies trivially that $\sum_{i=1}^{N} h(r_i; u_i) = 0$.
Now from these lemmas and the definition of inaccuracy in (1.2) we have the
following theorem.
Theorem 1. Let the generalised probability distributions $P = (p_1, p_2, \dots, p_N)$ and $Q = (q_1, q_2, \dots, q_N)$ be partitioned into M disjoint subsets $P_1, P_2, \dots, P_M$ and $Q_1, Q_2, \dots, Q_M$ respectively of $N_1, N_2, \dots, N_M$ elements such that $\sum_{k=1}^{M} N_k = N$ and $P(k) = \sum_{i=N_{k-1}+1}^{N_k} p_i = W(P_k)$ for each k, with $N_0 = 0$ (and $Q(k)$ defined analogously); then

(2.4)  $\sum_{k=1}^{M} W(P_k) H(P_k; Q_k) \ge W(P)\, H(P(1), P(2), \dots, P(M);\ Q(1), Q(2), \dots, Q(M))$

where

$W(P) = \sum_{k=1}^{M} W(P_k).$

Proof. To prove the above theorem, it is enough to establish the result for the two partitions $P_1$, $P_2$ of P and $Q_1$, $Q_2$ of Q. Let $P_1 = (p_1, \dots, p_{N_1})$, $P_2 = (p_{N_1+1}, \dots, p_N)$ and $Q_1 = (q_1, \dots, q_{N_1})$, $Q_2 = (q_{N_1+1}, \dots, q_N)$. Now from Lemma 1 we have

(2.5)  $\sum_{i=1}^{N_1} h(p_i; q_i) \ge h\!\left(\sum_{i=1}^{N_1} p_i;\ \sum_{i=1}^{N_1} q_i\right) = h(P(1); Q(1))$

and

(2.6)  $\sum_{i=N_1+1}^{N} h(p_i; q_i) \ge h\!\left(\sum_{i=N_1+1}^{N} p_i;\ \sum_{i=N_1+1}^{N} q_i\right) = h(P(2); Q(2)).$

Using (1.2) and adding (2.5) and (2.6) we get

$W(P_1) H(P_1; Q_1) + W(P_2) H(P_2; Q_2) \ge W(P)\, H(P(1), P(2);\ Q(1), Q(2)).$
There is equality in (2.4) if either pi = 0 for each i or at least one qi is zero with
corresponding non-zero pi and there is at least one partition of Q consisting of
all zero elements with corresponding partition of P consisting of some non-zero
elements.

Corollary 1. The inaccuracy of distributions P and Q which are unions of $P_1, P_2, \dots, P_M$ and $Q_1, Q_2, \dots, Q_M$ respectively is given by

$H(P; Q) = H(P_1 \cup P_2 \cup \cdots \cup P_M;\ Q_1 \cup Q_2 \cup \cdots \cup Q_M)$
$\qquad = \sum_{k=1}^{M} W(P_k) H(P_k; Q_k) \Big/ W(P)$
$\qquad \ge H(P(1), P(2), \dots, P(M);\ Q(1), Q(2), \dots, Q(M)).$

This shows that the inaccuracy of the distributions P and Q usually decreases if the events in the distributions are grouped into M classes, M < N.
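A numerical check of the grouping property (2.4) may be helpful (our own sketch; the five-point distributions and the two-block partition are arbitrary, D = 2):

import math

def h(p, q, D=2):
    # pointwise inaccuracy (1.3)
    if p == 0:
        return 0.0
    if q == 0:
        return math.inf
    return p * math.log(1.0 / q, D)

def H(P, Q, D=2):
    # generalised inaccuracy (1.2)
    return sum(h(p, q, D) for p, q in zip(P, Q)) / sum(P)

P = [0.3, 0.2, 0.25, 0.15, 0.1]
Q = [0.25, 0.25, 0.2, 0.2, 0.1]
blocks = [(0, 2), (2, 5)]                    # disjoint subsets P1, P2 and Q1, Q2

lhs = sum(sum(P[a:b]) * H(P[a:b], Q[a:b]) for a, b in blocks)  # sum W(Pk) H(Pk; Qk)
grouped_P = [sum(P[a:b]) for a, b in blocks]                   # P(1), P(2)
grouped_Q = [sum(Q[a:b]) for a, b in blocks]                   # Q(1), Q(2)
rhs = sum(P) * H(grouped_P, grouped_Q)       # W(P) H(P(1), P(2); Q(1), Q(2))
print(lhs >= rhs, lhs, rhs)                  # inequality (2.4) holds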
Remarks. From Lemma 1 and Lemma 2 it can be seen that $0 \le H(P; Q) \le \infty$, with $H(P; Q) = 0$ if and only if $p_i = q_i = 1$ for one value of i and consequently $p_i = q_i = 0$ for all other i. Again $H(P; Q) = \infty$ if for any i, $q_i = 0$ and $p_i$ is not zero. These are the properties of the inaccuracy function studied by Kerridge [4].

3. Coding theorem of order α for personal codes


Let a finite set of N input symbols $X = (x_1, x_2, \dots, x_N)$ be encoded using an alphabet of D symbols. It has been shown (Feinstein [2]) that there is a uniquely decipherable code with lengths $n_1, n_2, \dots, n_N$ if and only if

(3.1)  $\sum_{i=1}^{N} D^{-n_i} \le 1.$
Suppose that the person constructing the code believes that the probability of the ith event is $q_i$, and that the code with word lengths $n_1, n_2, \dots, n_N$ has been constructed accordingly, while in fact the true probability is $p_i$. This code is then a 'personal probability code'.
We shall take the average code length for personal probability codes of order $t$ as given in (1.5), which possesses the properties of a mean length. In this section we prove the coding theorem for personal codes by considering the inaccuracy of order $\alpha$ given in (1.4) and the average code length of order $t$. This gives a characterisation of $H_\alpha(P; Q)$ through the average length of order $t$, under the condition

(3.2)  $\sum_{i=1}^{N} p_i q_i^{-1} D^{-n_i} \le 1$

which is a generalisation of (3.1). When $q_i = p_i$ for each i, (3.2) reduces to (3.1). A personal code satisfying (3.2) will be termed suitable.
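The condition (3.2) is easy to test numerically. In the Python sketch below (ours; D = 2 and the distributions are arbitrary) the lengths $n_i = -\log_2 q_i$ are built from the asserted distribution Q, and the resulting personal code turns out to be suitable even though the true distribution is P:

import math

def kraft_sum(n, D=2):
    """Left-hand side of (3.1)."""
    return sum(D ** (-ni) for ni in n)

def suitability_sum(P, Q, n, D=2):
    """Left-hand side of (3.2); the personal code is 'suitable' if this is <= 1."""
    return sum(p / q * D ** (-ni) for p, q, ni in zip(P, Q, n))

P = [0.5, 0.25, 0.25]      # true probabilities
Q = [0.25, 0.5, 0.25]      # asserted probabilities
n = [2, 1, 2]              # n_i = -log2 q_i, integers here by construction
print(kraft_sum(n))              # 1.0: uniquely decipherable by (3.1)
print(suitability_sum(P, Q, n))  # 1.0 <= 1: the personal code is suitable
# With Q = P, suitability_sum coincides with kraft_sum, i.e. (3.2) becomes (3.1).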
Lemma 3. Let $\{p_i\}_{i=1}^{N}$, $\{q_i\}_{i=1}^{N}$ and $\{n_i\}_{i=1}^{N}$ satisfy (3.2), with $\sum_{i=1}^{N} p_i = \sum_{i=1}^{N} q_i = 1$; then

(3.3)  $L(t) \ge H_\alpha(P; Q),$

where $\alpha = 1/(t + 1)$.

Proof. Let $0 < t < \infty$. By Hölder's inequality [3]

(3.4)  $\left(\sum_{i=1}^{N} x_i^{p}\right)^{1/p} \left(\sum_{i=1}^{N} y_i^{q}\right)^{1/q} \le \sum_{i=1}^{N} x_i y_i$

where $p^{-1} + q^{-1} = 1$, $p < 1$ $(p \ne 0)$ and $x_i, y_i > 0$. Setting $p = -t$, $q = 1 - \alpha$, $x_i = p_i^{-1/t} D^{-n_i}$, $y_i = p_i^{1/(1-\alpha)} q_i^{-1}$ in (3.4) we get

(3.5)  $\left(\sum_{i=1}^{N} p_i D^{t n_i}\right)^{-1/t} \left(\sum_{i=1}^{N} p_i q_i^{\alpha-1}\right)^{1/(1-\alpha)} \le \sum_{i=1}^{N} p_i q_i^{-1} D^{-n_i}.$

Now, taking logarithms of both sides and using (3.2), we get (3.3). It may be shown that there is equality in (3.3) and (3.5) if

$D^{-n_i} = q_i^{\alpha} \Big/ \sum_{i=1}^{N} p_i q_i^{\alpha-1}$

or

(3.6)  $n_i = -\alpha \log_D q_i + \log_D \left(\sum_{i=1}^{N} p_i q_i^{\alpha-1}\right).$

When $q_i = p_i$ for each i, the result agrees with one derived in [1]. Thus if we do not require $n_i$ to be an integer, it is seen that the minimum possible value of $L(t)$ is $H_\alpha(P; Q)$ where $\alpha = 1/(1 + t)$. Also when $t = 0$, $\alpha = 1$, the result is obtained for the Kerridge inaccuracy [4] and equality then holds for $n_i = -\log_D q_i$ for each i.
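To illustrate Lemma 3 (our own sketch; D = 2, t = 0.5 and the distributions are arbitrary): the real-valued lengths (3.6) attain $L(t) = H_\alpha(P; Q)$ exactly, and rounding them up to integers preserves suitability, in the same way as (3.12)-(3.14) below do for the product space.

import math

D, t = 2, 0.5
alpha = 1.0 / (t + 1.0)

P = [0.5, 0.25, 0.125, 0.125]
Q = [0.4, 0.3, 0.2, 0.1]

S = sum(p * q ** (alpha - 1) for p, q in zip(P, Q))   # sum p_i q_i^(alpha-1)
H_a = math.log(S, D) / (1 - alpha)                    # H_alpha(P; Q), eq. (1.4)

def L(n):
    """Mean length of order t, eq. (1.5)."""
    return math.log(sum(p * D ** (t * ni) for p, ni in zip(P, n)), D) / t

n_real = [-alpha * math.log(q, D) + math.log(S, D) for q in Q]   # eq. (3.6)
n_int = [math.ceil(x) for x in n_real]                           # integer lengths

print(L(n_real), H_a)    # equal up to floating-point rounding: equality in (3.3)
print(L(n_int) >= H_a)   # True: no suitable code beats H_alpha
print(sum(p / q * D ** (-ni) for p, q, ni in zip(P, Q, n_int)) <= 1)  # (3.2) holds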
Now, let a sequence of input symbols be generated independently, each symbol being governed by the probability distribution $(p_1, p_2, \dots, p_N)$. Also let $s = (a_1, a_2, \dots, a_M)$ be a typical input sequence of length M. The probability of s is

(3.7)  $P(s) = p_{i_1} p_{i_2} \cdots p_{i_M}$

where $a_1 = x_{i_1}, \dots, a_M = x_{i_M}$. Now let $n(s)$ be the length of the code sequence for s in a suitable personal code and let the code length of order $t$ for the M-sequence be

(3.8)  $L_M(t) = t^{-1} \log_D \sum_s P(s) D^{t\, n(s)}, \qquad (0 < t < \infty)$

where the summation $\sum_s$ extends over the $N^M$ sequences s. The inaccuracy of order $\alpha$ for this product space is

(3.9)  $H_\alpha^M(P; Q) = (1 - \alpha)^{-1} \log_D R, \qquad R = \sum_s P(s) [Q(s)]^{\alpha-1}$

where

(3.10)  $Q(s) = q_{i_1} q_{i_2} \cdots q_{i_M}$

is the probability of s asserted by an experimenter when the true probability is $P(s)$. It follows from (3.7) and (3.10) that $R = \left(\sum_{i=1}^{N} p_i q_i^{\alpha-1}\right)^M$, so that

(3.11)  $H_\alpha^M(P; Q) = M H_\alpha(P; Q).$
Let us suppose that $n(s)$ is the integer satisfying the inequality

(3.12)  $-\alpha \log_D Q(s) + \log_D R \le n(s) < -\alpha \log_D Q(s) + \log_D R + 1.$

Now if each $n(s)$ equals $-\alpha \log_D Q(s) + \log_D R$ then

(3.13)  $L_M(t) = H_\alpha^M(P; Q).$

If $n(s)$ satisfies (3.12) for each sequence s, then the numbers $n(s)$ satisfy

(3.14)  $\sum_s P(s) [Q(s)]^{-1} D^{-n(s)} \le 1$

so that there is a suitable personal code with lengths $n(s)$. From (3.12), we have

(3.15)  $[Q(s)]^{-t\alpha} R^{t} \le D^{t\, n(s)} < D^{t} [Q(s)]^{-t\alpha} R^{t}.$

Multiplying each member of (3.15) by $P(s)$, summing over all s, taking logarithms, dividing by $t$ and using the relation $\alpha = 1/(t + 1)$, we get

(3.16)  $H_\alpha^M(P; Q) \le L_M(t) < H_\alpha^M(P; Q) + 1.$

Now (3.16) with (3.11) gives

(3.17)  $H_\alpha(P; Q) \le \frac{L_M(t)}{M} < H_\alpha(P; Q) + \frac{1}{M}.$
The quantity $L_M(t)/M$ may be called the average code length of order $t$ per input symbol. The average length may be made as close to $H_\alpha(P; Q)$ as desired by taking M sufficiently large. Thus we have proved our second theorem.
Theorem 2. The average code length of order $t$ per input symbol can be made as close to $H_\alpha(P; Q)$ as desired by encoding sufficiently long sequences of input symbols, but it is not possible to construct a suitable personal code whose average code length of order $t$ is less than $H_\alpha(P; Q)$, where $\alpha = 1/(t + 1)$, $0 < t < \infty$, under the condition $\sum_{i=1}^{N} p_i q_i^{-1} D^{-n_i} \le 1$.
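A short sketch of the block-coding argument (ours; a binary source with arbitrary P and Q, D = 2, t = 0.5): for each block length M, integer lengths $n(s)$ are chosen by (3.12), and $L_M(t)/M$ is seen to approach $H_\alpha(P; Q)$ from above within $1/M$, as (3.17) requires.

import math
from itertools import product

D, t = 2, 0.5
alpha = 1.0 / (t + 1.0)
P = [0.6, 0.4]      # true source probabilities
Q = [0.7, 0.3]      # asserted probabilities

S = sum(p * q ** (alpha - 1) for p, q in zip(P, Q))
H_a = math.log(S, D) / (1 - alpha)

for M in (1, 2, 4, 8):
    R = S ** M                                   # eq. (3.11): R = S^M
    total = 0.0
    for s in product(range(len(P)), repeat=M):   # all N^M sequences
        Ps = math.prod(P[i] for i in s)          # eq. (3.7)
        Qs = math.prod(Q[i] for i in s)          # eq. (3.10)
        ns = math.ceil(-alpha * math.log(Qs, D) + math.log(R, D))   # eq. (3.12)
        total += Ps * D ** (t * ns)
    L_M = math.log(total, D) / t                 # eq. (3.8)
    print(M, L_M / M)                            # within 1/M of H_alpha, per (3.17)
print("H_alpha:", H_a)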

Acknowledgements
The authors wish to express their sincere thanks to Dr. Bhu Dev Sharma, Reader in Mathematics, University of Delhi, for guidance in carrying out this research work, and to Professor U. N. Singh, Pro-Vice-Chancellor, University of Delhi, and Professor Manmohan Singh Arora, N.C.E.R.T., New Delhi, for encouragement.

References

[1] CAMPBELL, L. L. (1965) A coding theorem and Rényi's entropy. Inf. and Control 8, 423-429.
[2] FEINSTEIN, A. (1958) Foundations of Information Theory. McGraw-Hill, New York.
[3] HARDY, G. H., LITTLEWOOD, J. E. AND PÓLYA, G. (1952) Inequalities. Cambridge University Press.
[4] KERRIDGE, D. F. (1961) Inaccuracy and inference. J. R. Statist. Soc. B 23, 184-194.
[5] RÉNYI, A. (1961) On measures of entropy and information. Proc. Fourth Berkeley Symp. Math. Statist. Prob. 1, 547-561.
[6] SHANNON, C. E. (1948) A mathematical theory of communication. Bell System Tech. J. 27, 379-423.
[7] SHARMA, B. D. (1970) The mean value study of quantities in information theory. Ph.D. Thesis, University of Delhi.
