
Joint events and joint probability

Theorem:
If one experiment has the possible outcomes $x_i$, $i = \overline{1,n}$, and a second experiment has the possible outcomes $y_j$, $j = \overline{1,m}$, then the combined experiment has the possible outcomes $(x_i, y_j)$, $i = \overline{1,n}$, $j = \overline{1,m}$.

The joint probability $P(x_i, y_j)$ of the combined experiment satisfies the condition:

$$0 \le P(x_i, y_j) \le 1$$
Theorem:
If the outcomes $y_j$, $j = \overline{1,m}$, are mutually exclusive, then

$$\sum_{j=1}^{m} P(x_i, y_j) = P(x_i)$$

Similarly, if the outcomes $x_i$, $i = \overline{1,n}$, are mutually exclusive events, then

$$\sum_{i=1}^{n} P(x_i, y_j) = P(y_j)$$
If all the outcomes of the two experiments are mutually exclusive, then:

$$\sum_{i=1}^{n} \sum_{j=1}^{m} P(x_i, y_j) = 1$$
Conditional probabilities
$$P(x_i \mid y_j) = \frac{P(x_i, y_j)}{P(y_j)}, \qquad P(y_j) > 0$$

- the conditional probability of the event $x_i$ given the occurrence of the event $y_j$.

$$P(y_j \mid x_i) = \frac{P(x_i, y_j)}{P(x_i)}, \qquad P(x_i) > 0$$

Hence:

$$P(x_i, y_j) = P(y_j)\,P(x_i \mid y_j) = P(x_i)\,P(y_j \mid x_i)$$
Notes
1. If $x_i \cap y_j = \emptyset$ ($x_i$ and $y_j$ are mutually exclusive events), then $P(x_i \mid y_j) = 0$.
2. If $x_i$ is a subset of $y_j$ ($x_i \cap y_j = x_i$), then
$$P(x_i \mid y_j) = \frac{P(x_i)}{P(y_j)}$$
3. If $y_j$ is a subset of $x_i$ ($x_i \cap y_j = y_j$), then
$$P(x_i \mid y_j) = \frac{P(y_j)}{P(y_j)} = 1$$
Bayes Theorem
If $x_i$, $i = \overline{1,n}$, are mutually exclusive events such that $\bigcup_{i=1}^{n} x_i = X$, and $Y$ is an arbitrary event with $P(Y) > 0$, then

$$P(x_i \mid Y) = \frac{P(x_i, Y)}{P(Y)} = \frac{P(Y \mid x_i)\,P(x_i)}{\sum_{j=1}^{n} P(Y \mid x_j)\,P(x_j)}$$
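As a quick numerical check, the sketch below (plain Python; the priors and likelihoods are made-up illustrative numbers, not values from the notes) evaluates the formula for a three-event partition:

```python
# Bayes' theorem on a toy 3-event partition (illustrative numbers only).
p_x = [0.5, 0.3, 0.2]          # priors P(x_i); mutually exclusive, sum to 1
p_y_given_x = [0.9, 0.5, 0.1]  # likelihoods P(Y | x_i)

# Total probability: P(Y) = sum_j P(Y | x_j) P(x_j)
p_y = sum(pyx * px for pyx, px in zip(p_y_given_x, p_x))

# Posterior: P(x_i | Y) = P(Y | x_i) P(x_i) / P(Y)
p_x_given_y = [pyx * px / p_y for pyx, px in zip(p_y_given_x, p_x)]

print(p_y)          # 0.62
print(p_x_given_y)  # [0.7258..., 0.2419..., 0.0322...]; posteriors sum to 1
```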
Average information and entropy (+ conditional information)

Logarithmic measure of information

$$I(x_i, y_j) = \log_b \frac{P(x_i \mid y_j)}{P(x_i)} = \log_b \frac{P(x_i, y_j)}{P(x_i)\,P(y_j)}$$

- the mutual information of $x_i$ and $y_j$.

If $b = 2$, $I(x_i, y_j)$ is measured in [bits]; if $b = e$, in [nats]. Conversion between the two bases:

$$\log_2 a = \frac{\ln a}{\ln 2} \approx 1.4427 \ln a, \qquad \ln a \approx 0.69315 \log_2 a$$
Observations
1. If the random variables X and Y are statistically independent, then $P(x_i \mid y_j) = P(x_i)$, and therefore $I(x_i, y_j) = 0$.
2. If the random variables are fully dependent ($P(x_i \mid y_j) = 1$), then

$$I(x_i, y_j) = \log_b \frac{1}{P(x_i)} = -\log_b P(x_i) = I(x_i)$$

- the self-information of the event $x_i$.

Show that:

$$I(x_i, y_j) = \log_b \frac{P(x_i \mid y_j)}{P(x_i)} = \log_b \frac{P(x_i, y_j)}{P(x_i)\,P(y_j)}, \qquad I(x_i) = -\log_b P(x_i)$$

Since:

$$\frac{P(x_i \mid y_j)}{P(x_i)} = \frac{P(x_i, y_j)}{P(x_i)\,P(y_j)} = \frac{P(y_j \mid x_i)\,P(x_i)}{P(x_i)\,P(y_j)} = \frac{P(y_j \mid x_i)}{P(y_j)}$$

it follows that $I(x_i; y_j) = I(y_j; x_i)$, i.e. the mutual information is symmetric.
Conditional self-information

$$I(x_i \mid y_j) = \log_b \frac{1}{P(x_i \mid y_j)} = -\log_b P(x_i \mid y_j)$$

- the information about the event $x_i$ remaining after the event $Y = y_j$ has been observed.

$$I(x_i; y_j) = I(x_i) - I(x_i \mid y_j)$$

Since $I(x_i) \ge 0$ and $I(x_i \mid y_j) \ge 0$, the mutual information $I(x_i; y_j)$ can be positive, negative, or zero.

Average mutual information

$$I(X; Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} P(x_i, y_j)\, I(x_i, y_j) = \sum_{i=1}^{n} \sum_{j=1}^{m} P(x_i, y_j) \log_b \frac{P(x_i, y_j)}{P(x_i)\,P(y_j)} \ge 0$$

with $I(X; Y) = 0$ if X and Y are statistically independent.

Average self-information:

$$H(X) = \sum_{i=1}^{n} P(x_i)\, I(x_i) = -\sum_{i=1}^{n} P(x_i) \log_b P(x_i)$$

The entropy of the source is the average self-information per source letter, when X represents the alphabet of possible output letters from a source.

Special case: if $P(x_i) = \frac{1}{n}$ for all $i$, then

$$H(X) = -\sum_{i=1}^{n} \frac{1}{n} \log_2 \frac{1}{n} = \log_2 n$$

In general $H(X) \le \log_2 n$, with equality only for equiprobable events.
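A minimal numerical sketch of both definitions (Python; the 2x2 joint distribution is an arbitrary choice for illustration):

```python
import math

# Arbitrary joint distribution P(x_i, y_j) for illustration; entries sum to 1.
P = [[0.30, 0.20],
     [0.10, 0.40]]

px = [sum(row) for row in P]                              # marginals P(x_i)
py = [sum(P[i][j] for i in range(2)) for j in range(2)]   # marginals P(y_j)

# H(X) = -sum_i P(x_i) log2 P(x_i)
H_X = -sum(p * math.log2(p) for p in px if p > 0)

# I(X;Y) = sum_ij P(x_i,y_j) log2( P(x_i,y_j) / (P(x_i) P(y_j)) )
I_XY = sum(P[i][j] * math.log2(P[i][j] / (px[i] * py[j]))
           for i in range(2) for j in range(2) if P[i][j] > 0)

print(H_X)   # 1.0 bit: the marginal of X is (0.5, 0.5), i.e. equiprobable
print(I_XY)  # ~0.12 bits; always >= 0, and 0 iff X and Y are independent
```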

Variable Length Codes
Kraft inequality

Condition of existence of a code satisfying the prefix condition:

$$\sum_{k=1}^{L} 2^{-n_k} \le 1$$

(L is the number of code words). This is a necessary and sufficient condition for the existence of a binary prefix code with code words of lengths $n_1 \le n_2 \le \dots \le n_L$.
Proof of sufficiency:
Algorithm for constructing such a code (see the sketch after this list):
- start with a full binary tree of order $n = n_L$ (the length of the longest code word);
- the tree has $2^{n_L}$ terminal nodes, and each node of order $k-1$ has two nodes of order $k$ connected to it;
- select any node of order $n_1$ as the code word $c_1$; this eliminates the $2^{n_L - n_1}$ terminal nodes it dominates;
- from the remaining nodes select one of order $n_2$, assign it the code word $c_2$, and eliminate the $2^{n_L - n_2}$ terminal nodes it dominates;
- the process continues until order $n_L$ is reached, which yields $c_L$.
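The construction can be sketched with binary arithmetic: assigning each code word the smallest codepoint still available reproduces the pruning of dominated terminal nodes. A minimal sketch, assuming the lengths are already sorted (this is the standard canonical construction, not code from the notes):

```python
def kraft_ok(lengths):
    # Kraft inequality: sum_k 2^(-n_k) <= 1
    return sum(2.0 ** -n for n in lengths) <= 1.0

def prefix_code(lengths):
    """Build binary code words for lengths n_1 <= n_2 <= ... <= n_L."""
    assert lengths == sorted(lengths) and kraft_ok(lengths)
    code, value, prev = [], 0, 0
    for n in lengths:
        value <<= (n - prev)                 # descend to order n in the tree
        code.append(format(value, f"0{n}b")) # n-bit code word
        value += 1                           # skip the 2^(n_L - n) dominated leaves
        prev = n
    return code

print(prefix_code([1, 2, 3, 3]))  # ['0', '10', '110', '111'], prefix-free
```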
Proof of necessity:
In the code tree of order $n = n_L$, the total number of terminal nodes eliminated by assigning all the code words cannot exceed the number of terminal nodes of the tree:

$$\sum_{k=1}^{L} 2^{n_L - n_k} \le 2^{n_L} \;\Rightarrow\; \sum_{k=1}^{L} 2^{-n_k} \le 1$$
Noiseless Source Coding Theorem + Statistical Average

Noiseless source coding theorem

Let X be a set of letters (symbols, words) from a DMS (discrete memoryless source) and H(X) the entropy of the source. The output letters $x_k \in X$, with probabilities of occurrence $p(x_k)$, are encoded into code words of $n_k$ symbols. It is possible to construct a code that satisfies the prefix condition and whose average length lies in the range $[H(X), H(X)+1)$:

$$H(X) \le \bar{R} < H(X) + 1, \qquad \bar{R} = \sum_{k=1}^{L} n_k\, p(x_k) \; \text{(the statistical average length)}$$
Proof
Lower bound: $H(X) \le \bar{R}$.

$$H(X) - \bar{R} = \sum_{k=1}^{L} p(x_k) \log_2 \frac{1}{p(x_k)} - \sum_{k=1}^{L} p(x_k)\, n_k = \sum_{k=1}^{L} p(x_k) \log_2 \frac{2^{-n_k}}{p(x_k)}$$

Using $\ln x \le x - 1$, i.e. $\log_2 x \le (x-1)\log_2 e$:

$$H(X) - \bar{R} \le \log_2 e \sum_{k=1}^{L} p(x_k)\left(\frac{2^{-n_k}}{p(x_k)} - 1\right) = \log_2 e\left(\sum_{k=1}^{L} 2^{-n_k} - 1\right) \le 0$$

where the last step follows from the Kraft inequality. Hence $H(X) \le \bar{R}$, with equality $\bar{R} = H(X)$ only when $p(x_k) = 2^{-n_k}$ for all $k$.
Upper bound: $\bar{R} < H(X) + 1$.

Let $n_k$ be the length of the code word for $x_k$, $k = \overline{1, L}$, $n_k \in \mathbb{N}^*$. Select each $n_k$ as the unique integer such that, for a word $x_k$ with probability $p(x_k)$:

$$2^{-n_k} \le p(x_k) < 2^{-n_k + 1}$$

Summing the left-hand inequality over $k$:

$$\sum_{k=1}^{L} 2^{-n_k} \le \sum_{k=1}^{L} p(x_k) = 1$$

so the Kraft inequality is satisfied and a prefix code with these lengths exists. From the right-hand inequality, $\log_2 p(x_k) < 1 - n_k$, i.e. $n_k < 1 - \log_2 p(x_k)$. Multiplying by $p(x_k)$ and summing over $k$:

$$\bar{R} = \sum_{k=1}^{L} n_k\, p(x_k) < \sum_{k=1}^{L} p(x_k)\left(1 - \log_2 p(x_k)\right) = 1 + H(X)$$
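A small numerical sketch of this construction (Python; the source probabilities are illustrative): choosing $n_k = \lceil \log_2(1/p(x_k)) \rceil$ satisfies both the Kraft inequality and the two bounds.

```python
import math

p = [0.45, 0.25, 0.15, 0.10, 0.05]           # illustrative DMS probabilities

n = [math.ceil(-math.log2(pk)) for pk in p]  # 2^-n_k <= p_k < 2^(-n_k + 1)
H = -sum(pk * math.log2(pk) for pk in p)     # source entropy H(X)
R = sum(nk * pk for nk, pk in zip(n, p))     # average length R = sum n_k p_k

print(n)                                 # [2, 2, 3, 4, 5]
print(sum(2.0 ** -nk for nk in n) <= 1)  # Kraft inequality holds -> True
print(H <= R < H + 1)                    # True (H ~ 1.98, R = 2.5)
```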
DM & ADM

Delta Modulation (DM)
Adaptive Delta Modulation (ADM)

Delta Modulation is a simplified form of DPCM in which a 2-level (1-bit) quantizer is used in conjunction with a fixed first-order predictor:

$$\hat{x}_n = \tilde{x}_{n-1}, \qquad e_n = x_n - \hat{x}_n = x_n - \tilde{x}_{n-1}$$

$$\tilde{x}_n = \hat{x}_n + \tilde{e}_n, \qquad q_n = \tilde{e}_n - e_n, \qquad \tilde{x}_{n-1} = x_{n-1} + q_{n-1}$$

The estimated value of $x_n$ is the previous sample $x_{n-1}$ modified by the quantization noise $q_{n-1}$. The difference equation $\tilde{x}_n = \tilde{x}_{n-1} + \tilde{e}_n$ represents an integrator with input $\tilde{e}_n$. An equivalent realization of the one-step predictor is an accumulator with an input equal to the quantized error $\tilde{e}_n$. In general the quantized error signal is scaled by some value, say $\Delta_1$, called the step size.

The encoder shown in the figure below approximates the waveform $x(t)$ by a linear staircase function; the waveform must change slowly relative to the sampling rate, so the sampling rate must be greater than or about equal to 5 times the Nyquist rate.

[Figure: DM encoder - sampler, 2-level quantizer, and unit-delay ($z^{-1}$) feedback loop implementing $\hat{x}_n = \tilde{x}_{n-1}$; decoder - unit delay $z^{-1}$ followed by a lowpass filter (LPF).]

[Figure: equivalent DM realization with $\tilde{e}_n = \pm 1$ scaled by step size $\Delta_1$: encoder - sampler, quantizer, and accumulator in the feedback path; decoder - accumulator followed by an LPF.]
Jayant (1970): adaptive step size. The step size is adapted according to the signs of successive quantized errors $\tilde{e}_n = \pm 1$:

$$\Delta_n = \Delta_{n-1} K^{\tilde{e}_n \tilde{e}_{n-1}}, \qquad K \ge 1$$

so the step grows when two successive errors have the same sign (slope overload) and shrinks when they alternate (granular noise).

[Figure: ADM encoder - sampler, quantizer, accumulator, with $z^{-1}$ in the step-size adaptation loop.]

Continuous variable slope:

$$\Delta_n = \alpha \Delta_{n-1} + K_1, \quad \text{if } \tilde{e}_n, \tilde{e}_{n-1}, \tilde{e}_{n-2} \text{ have the same sign}$$

$$\Delta_n = \alpha \Delta_{n-1} + K_2, \quad \text{otherwise}$$

with $K_1 \gg K_2 > 0$ and $0 < \alpha < 1$.
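A compact simulation sketch of both schemes (NumPy; the test signal, the step size, and the constant K = 1.5 are arbitrary choices, and the adaptation follows the Jayant rule as reconstructed above):

```python
import numpy as np

def delta_mod(x, step=0.1, adaptive=False, K=1.5):
    """1-bit DM/ADM encoder with local decoder; returns bits and reconstruction."""
    x_rec = np.zeros(len(x))                # x~_n: accumulator (integrator) output
    bits = np.zeros(len(x), dtype=int)
    prev_rec, prev_e, d = 0.0, 1, step
    for n, xn in enumerate(x):
        e = 1 if xn >= prev_rec else -1     # 2-level quantizer: e~_n = +/-1
        if adaptive:                        # Jayant: d_n = d_{n-1} * K^(e_n e_{n-1})
            d = d * (K if e == prev_e else 1.0 / K)
        prev_rec += e * d                   # accumulate the scaled step
        x_rec[n], bits[n], prev_e = prev_rec, e, e
    return bits, x_rec

t = np.linspace(0, 1, 400)                  # heavily oversampled test sine
x = np.sin(2 * np.pi * 3 * t)
_, dm = delta_mod(x)
_, adm = delta_mod(x, adaptive=True)
print(np.mean((x - dm) ** 2), np.mean((x - adm) ** 2))  # compare reconstruction MSE
```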
PCM & DPCM

PCM - Pulse Code Modulation

$x(t)$ - a sample function emitted by the source
$x_n$ - samples taken at a sampling rate $f_s \ge 2 f_{max}$

In PCM, each sample of the signal is quantized to one of $2^R$ amplitude levels, so the rate of the source is $R f_s$ [bits/s].

$$\tilde{x}_n = x_n + q_n \; \text{ - mathematical model of quantization}$$

$\tilde{x}_n$ - quantized value of $x_n$
$q_n$ - quantization noise

Uniform Quantizer
Example: input-output characteristic of a uniform 7-bit quantizer.

pdf - probability density function of the quantization error:

$$p(q) = \frac{1}{\Delta}, \quad -\frac{\Delta}{2} \le q \le \frac{\Delta}{2}, \qquad \Delta = 2^{-R} \; \text{ - quantization step size}$$

MSE, mean square error $\epsilon(q^2)$:

$$\epsilon(q^2) = \int_{-\Delta/2}^{\Delta/2} q^2\, p(q)\, dq = \frac{1}{\Delta}\int_{-\Delta/2}^{\Delta/2} q^2\, dq = \frac{\Delta^2}{12} = \frac{2^{-2R}}{12}$$

$$\epsilon(q^2)\big|_{dB} = 10\log_{10}\frac{2^{-2R}}{12} = -20R\log_{10}2 - 10\log_{10}12 \approx -6R - 10.8 \;\text{dB}$$

Example: for R = 7 bits, $\epsilon(q^2) \approx -52.8$ dB.
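The $\Delta^2/12$ result is easy to confirm by simulation; a minimal sketch (NumPy), assuming a unit dynamic range so that $\Delta = 2^{-R}$ as in the notes:

```python
import numpy as np

R = 7                                   # bits per sample
delta = 2.0 ** -R                       # step size for a unit dynamic range
x = np.random.uniform(0, 1, 1_000_000)  # test signal uniform over one unit range

xq = delta * np.round(x / delta)        # uniform (mid-tread) quantizer
q = xq - x                              # error, ~uniform on [-delta/2, delta/2]

print(10 * np.log10(np.mean(q ** 2)))   # ~ -6R - 10.8 dB ~ -52.9 dB for R = 7
print(10 * np.log10(delta ** 2 / 12))   # theoretical value, same to ~0.01 dB
```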
[Figure: uniform vs. non-uniform quantization characteristics.]

A uniform quantizer provides the same spacing between successive levels throughout the entire dynamic range of the signal. A better approach is to space the levels more closely at small signal amplitudes and more widely at large signal amplitudes => non-uniform quantizer.

A classic non-uniform quantizer is the logarithmic compressor (Jayant 1974, for speech processing):

$$|y| = \frac{\log(1 + \mu|x|)}{\log(1 + \mu)}$$

where $y$ is the magnitude of the output, $x$ the magnitude of the input, and $\mu$ the compression factor.

Example: $\mu = 255$, R = 7 => $\epsilon(q^2) \approx -77$ dB.

A non-uniform quantizer is made from a non-linear device that compresses the signal, followed by a uniform quantizer.
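A short sketch of the compressor characteristic and its inverse (NumPy; $\mu = 255$ as in the example, with the sign handling an implementation assumption since the formula is stated for magnitudes):

```python
import numpy as np

def mu_law_compress(x, mu=255.0):
    # |y| = log(1 + mu|x|) / log(1 + mu), sign preserved; assumes |x| <= 1
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def mu_law_expand(y, mu=255.0):
    # inverse characteristic, applied at the receiver after uniform decoding
    return np.sign(y) * ((1.0 + mu) ** np.abs(y) - 1.0) / mu

x = np.array([-0.5, -0.01, 0.0, 0.01, 0.5])
y = mu_law_compress(x)
print(y)                                  # small inputs are stretched toward +/-1
print(np.allclose(mu_law_expand(y), x))   # True: compressor and expander invert
```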
DPCM
Differential Pulse Code Modulation
In PCM each sample is encoded independently. However, most sources sampled at the Nyquist rate or faster exhibit significant correlation between successive samples (the average change between successive samples is small). Exploiting this redundancy allows a lower rate for the output.

Simple solution: encode the differences between successive samples.
Refinement: predict the current sample from the previous p samples:

$$\hat{x}_n = \sum_{i=1}^{p} a_i x_{n-i}$$

$\{a_i\}$ - prediction coefficients
$x_n$ - current sample
$\hat{x}_n$ - predicted sample

MSE for the computation of the $\{a_i\}$ coefficients:

$$\epsilon_p = E\left(e_n^2\right) = E\left[\left(x_n - \hat{x}_n\right)^2\right] = E\left[\left(x_n - \sum_{i=1}^{p} a_i x_{n-i}\right)^2\right]$$

$$= E\left(x_n^2\right) - 2\sum_{i=1}^{p} a_i E\left(x_n x_{n-i}\right) + \sum_{i=1}^{p}\sum_{j=1}^{p} a_i a_j E\left(x_{n-i} x_{n-j}\right)$$
The source is assumed stationary with autocorrelation function $\phi(n)$. If the source is wide-sense stationary, this results in:

$$\epsilon_p = \phi(0) - 2\sum_{i=1}^{p} a_i \phi(i) + \sum_{i=1}^{p}\sum_{j=1}^{p} a_i a_j \phi(i - j)$$
Minimization of $\epsilon_p$ with respect to the prediction coefficients $\{a_i\}$ results in a set of linear equations:

$$\sum_{i=1}^{p} a_i \phi(i - j) = \phi(j), \qquad j = 1, 2, \dots, p \qquad \text{(Yule-Walker equations)}$$
If $\phi(n)$ is unknown a priori, it can be estimated by the relation:

$$\hat{\phi}(n) = \frac{1}{N}\sum_{i=1}^{N} x_i x_{i+n}, \qquad n = 0, 1, 2, \dots, p$$
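A minimal sketch of estimating $\phi(n)$ and solving the Yule-Walker system (NumPy; the AR(2) test source and its coefficients are synthetic choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 10_000, 2
x = np.zeros(N)                      # synthesize an AR(2) source for the test
for n in range(2, N):                # true coefficients a = (0.75, -0.5)
    x[n] = 0.75 * x[n-1] - 0.5 * x[n-2] + rng.standard_normal()

# phi_hat(n) = (1/N) sum_i x_i x_{i+n}, n = 0..p  (biased estimate)
phi = np.array([x[:N - n] @ x[n:] / N for n in range(p + 1)])

# Yule-Walker: sum_i a_i phi(i-j) = phi(j), j = 1..p  (Toeplitz system)
Phi = np.array([[phi[abs(i - j)] for i in range(p)] for j in range(p)])
a = np.linalg.solve(Phi, phi[1:p + 1])
print(a)                             # ~ [0.75, -0.5], the true coefficients
```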
Notation:
$x_n$ - sampled value
$e_n$ - prediction error
$\tilde{x}_n$ - quantized signal
$\tilde{e}_n$ - quantized prediction error
$\hat{\tilde{x}}_n$ - predicted quantized signal
$q_n$ - quantization error
Encoder:

$$e_n = x_n - \hat{\tilde{x}}_n, \qquad \hat{\tilde{x}}_n = \sum_{i=1}^{p} a_i \tilde{x}_{n-i}, \qquad \tilde{x}_n = \hat{\tilde{x}}_n + \tilde{e}_n$$

Decoder: reconstructs $\tilde{x}_n = \hat{\tilde{x}}_n + \tilde{e}_n$ from the received quantized errors. The quantization error is:

$$q_n = \tilde{e}_n - e_n = \tilde{e}_n - \left(x_n - \hat{\tilde{x}}_n\right) = \tilde{x}_n - x_n$$
The quantized value $\tilde{x}_n$ differs from the input $x_n$ by the quantization error $q_n$, independently of the predictor used. Therefore the quantization errors do not accumulate.

An improvement in the quality of the estimate is obtained by also including linearly filtered past values of the quantized error:

$$\hat{\tilde{x}}_n = \sum_{i=1}^{p} a_i \tilde{x}_{n-i} + \sum_{i=1}^{m} b_i \tilde{e}_{n-i}$$
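A sketch of the encoder recursion above (NumPy; the fixed predictor coefficient and the crude uniform quantizer for $e_n$ are stand-in assumptions, not a designed system). The final check confirms that $q_n$ stays bounded by $\Delta/2$, i.e. the errors do not accumulate:

```python
import numpy as np

def dpcm(x, a, delta=0.05):
    """DPCM with fixed predictor coefficients a and uniform quantization of e_n."""
    p = len(a)
    xq = np.zeros(len(x) + p)           # x~_n, quantized signal (zero-padded past)
    eq = np.zeros(len(x))               # e~_n, quantized prediction error
    for n, xn in enumerate(x):
        pred = a @ xq[n:n + p][::-1]    # x^~_n = sum_i a_i x~_{n-i}
        e = xn - pred                   # prediction error e_n
        eq[n] = delta * np.round(e / delta)
        xq[n + p] = pred + eq[n]        # x~_n = x^~_n + e~_n
    return eq, xq[p:]

t = np.linspace(0, 1, 2000)
x = np.sin(2 * np.pi * 5 * t)
eq, xrec = dpcm(x, a=np.array([0.9]))
print(np.max(np.abs(xrec - x)) <= 0.5 * 0.05 + 1e-12)  # True: |q_n| <= delta/2
```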
Adaptive PCM & DPCM
Many real sources are quasistationary: the variance and autocorrelation functions of the source output vary slowly with time. The PCM and DPCM encoders are designed on the basis that the source output is stationary. Their efficiency and performance can be improved by having them adapt to the slowly time-variant statistics of the source. The quantization error $q_n$ then has a time-variant variance.

The dynamic range of $q_n$ can be reduced by using an adaptive quantizer:

$$\Delta_{n+1} = \Delta_n M(n)$$

where $\Delta_n$ is the step size of the quantizer for processing sample $x(n)$, and $M(n)$ is a multiplication factor that depends on the quantizer level occupied by the sample $x(n)$.

Step-size multipliers M(k), by number of quantizer bits:

              PCM                 DPCM
          2     3     4       2     3     4
M(1)    0.60  0.83  0.80    0.80  0.90  0.90
M(2)    2.20  1.00  0.80    1.60  0.90  0.90
M(3)          1.00  0.80          1.25  0.90
M(4)          1.50  0.80          1.70  0.90
M(5)                1.20                1.20
M(6)                1.60                1.60
M(7)                2.00                2.00
M(8)                2.40                2.40
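A small sketch of the step-size recursion using the 2-bit PCM column of the table (the clipping bounds d_min and d_max are an added assumption, to keep the step in a sensible range):

```python
# Adaptive step size: delta_{n+1} = delta_n * M(n), M chosen by quantizer level.
M_PCM_2BIT = {1: 0.60, 2: 2.20}      # multipliers M(1), M(2) from the table

def adapt_step(delta, level, d_min=1e-4, d_max=1.0):
    # level = 1 for the inner quantizer level, 2 for the outer (overload) one
    return min(max(delta * M_PCM_2BIT[level], d_min), d_max)

delta = 0.1
for level in [2, 2, 1, 1, 1]:        # outer levels grow the step, inner shrink it
    delta = adapt_step(delta, level)
    print(round(delta, 5))           # 0.22, 0.484, 0.2904, 0.17424, 0.10454
```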
In DPCM the predictor can also be made adaptive when the source output is nonstationary. The predictor coefficients are changed periodically to reflect the changing signal statistics of the source, with the short-term estimate of the autocorrelation function of $x_n$ replacing the ensemble correlation function. The predictor coefficients must then be transmitted to the receiver along with the quantized error $\tilde{e}_n$, so a higher bit rate results, and the advantage of the decreased bit rate produced by using a quantizer with fewer bits is lost.

An alternative is to use a predictor at the receiver that computes its own prediction coefficients from $\tilde{e}_n$ and $\tilde{x}_n$.
Linear Predictive Coding (LPC)

The source is modeled as a linear system (filter), which when excited by an appropriate signal gives the
observed output.

Instead of sending the samples of the source waveform, the parameters of the linear system are
transmitted along with the appropriate excitation signal.

- The source output is sampled at a rate greater than the Nyquist rate, giving the sequence $\{x_n,\; n = 0, 1, \dots, N-1\}$.
- The sample sequence is assumed to have been generated by an all-pole filter with transfer function:

$$H(z) = \frac{G}{1 - \sum_{k=1}^{p} a_k z^{-k}}$$
Excitation functions: - impulse
- sequence of impulses
- white noise with unit variance
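A minimal synthesis sketch of this model (SciPy; the gain, the $a_k$ values, and the white-noise excitation are illustrative choices):

```python
import numpy as np
from scipy.signal import lfilter

G = 1.0
a = np.array([1.3, -0.8])       # illustrative all-pole coefficients a_k (stable)

# H(z) = G / (1 - sum_k a_k z^-k) -> denominator polynomial [1, -a_1, ..., -a_p]
den = np.concatenate(([1.0], -a))

rng = np.random.default_rng(0)
excitation = rng.standard_normal(8000)   # white noise with unit variance
x = lfilter([G], den, excitation)        # synthesized source output x_n

print(x[:5])  # the receiver regenerates x_n from {G, a_k} plus the excitation
```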
