
Chapter 9

MAXIMUM LIKELIHOOD DECODING OF CONVOLUTIONAL CODES

1. Maximum Likelihood Decoding


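• For a convolutional code, each code sequence is a path in the trellis diagram of the code.
• Suppose each message sequence $\mathbf{m}$ consists of $L$ message blocks of $k$ bits each, $\mathbf{m} = (\mathbf{m}_0, \mathbf{m}_1, \ldots, \mathbf{m}_l, \ldots, \mathbf{m}_{L-1})$.
• Then each code sequence $\mathbf{c}$ is a path $L+m$ branches long in the trellis diagram, where $m$ is the memory order of the code: $\mathbf{c} = (\mathbf{c}_0, \mathbf{c}_1, \ldots, \mathbf{c}_l, \ldots, \mathbf{c}_{L+m-1})$, where the $l$-th branch (or code block) is $\mathbf{c}_l = (c_l^{(1)}, c_l^{(2)}, \ldots, c_l^{(n)})$.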

• Suppose a code sequence is transmitted.
• Let $\mathbf{r} = (\mathbf{r}_0, \mathbf{r}_1, \ldots, \mathbf{r}_l, \ldots, \mathbf{r}_{L+m-1})$ be the received sequence, where the $l$-th received block is $\mathbf{r}_l = (r_l^{(1)}, r_l^{(2)}, \ldots, r_l^{(n)})$.
• MLD: Find the path $\mathbf{c}$ through the trellis diagram such that the conditional probability $P(\mathbf{r} \mid \mathbf{c})$ is the largest.
• For a binary-input, Q-ary output discrete memoryless channel (DMC), $\mathbf{c}$ is a binary sequence and $\mathbf{r}$ is a Q-ary sequence.

• The conditional probability can be computed as follows:

$$P(\mathbf{r} \mid \mathbf{c}) = \prod_{l=0}^{L+m-1} P(\mathbf{r}_l \mid \mathbf{c}_l) \tag{1}$$

where $P(\mathbf{r}_l \mid \mathbf{c}_l)$ is the branch conditional probability.
• The branch conditional probability is given by

$$P(\mathbf{r}_l \mid \mathbf{c}_l) = \prod_{i=1}^{n} P(r_l^{(i)} \mid c_l^{(i)}) \tag{2}$$

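where $P(r_l^{(i)} \mid c_l^{(i)})$ is the channel transition probability.
• Define the log-likelihood function of a path $\mathbf{c}$ as follows:

$$M(\mathbf{r} \mid \mathbf{c}) \triangleq \log P(\mathbf{r} \mid \mathbf{c}) \tag{3}$$

which is called the metric of path $\mathbf{c}$.
• From (1) and (3), we have

$$M(\mathbf{r} \mid \mathbf{c}) = \sum_{l=0}^{L+m-1} \log P(\mathbf{r}_l \mid \mathbf{c}_l) = \sum_{l=0}^{L+m-1} M(\mathbf{r}_l \mid \mathbf{c}_l) \tag{4}$$

where

$$M(\mathbf{r}_l \mid \mathbf{c}_l) \triangleq \log P(\mathbf{r}_l \mid \mathbf{c}_l) \tag{5}$$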

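is called the branch metric.
• From (2) and (5), we have the branch metric

$$M(\mathbf{r}_l \mid \mathbf{c}_l) = \sum_{i=1}^{n} \log P(r_l^{(i)} \mid c_l^{(i)}) = \sum_{i=1}^{n} M(r_l^{(i)} \mid c_l^{(i)}) \tag{6}$$

where

$$M(r_l^{(i)} \mid c_l^{(i)}) = \log P(r_l^{(i)} \mid c_l^{(i)}) \tag{7}$$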

is called the bit metric.
• MLD: Find the path $\mathbf{c}$ in the trellis diagram such that $M(\mathbf{r} \mid \mathbf{c})$ is maximized. Then $\mathbf{c}$ is the estimate of the transmitted code sequence.
• For the first $j$ branches of a path $\mathbf{c}$ through the trellis, the partial path metric is

$$M([\mathbf{r} \mid \mathbf{c}]_j) = \sum_{l=0}^{j-1} M(\mathbf{r}_l \mid \mathbf{c}_l) \tag{8}$$
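The metric definitions above can be checked numerically. The following is a minimal sketch, assuming a hypothetical binary-input, 4-ary output DMC whose transition probabilities (the `TRANSITION` table) are made up for illustration; the function names are also assumptions. It shows that the sum of bit metrics equals the logarithm of the path probability.

```python
import math

# Hypothetical binary-input, 4-ary-output DMC (illustrative values only).
# TRANSITION[c][r] = P(r | c) for input bit c and output symbol r in {0, 1, 2, 3}.
TRANSITION = {
    0: [0.60, 0.25, 0.10, 0.05],
    1: [0.05, 0.10, 0.25, 0.60],
}

def bit_metric(r, c):
    """Bit metric: log of the channel transition probability."""
    return math.log(TRANSITION[c][r])

def branch_metric(r_block, c_block):
    """Branch metric: sum of the n bit metrics of one branch."""
    return sum(bit_metric(r, c) for r, c in zip(r_block, c_block))

def path_metric(r_seq, c_seq):
    """Path metric: sum of the branch metrics along the path."""
    return sum(branch_metric(rb, cb) for rb, cb in zip(r_seq, c_seq))

def path_probability(r_seq, c_seq):
    """P(r | c) as a direct product of transition probabilities."""
    prob = 1.0
    for rb, cb in zip(r_seq, c_seq):
        for r, c in zip(rb, cb):
            prob *= TRANSITION[c][r]
    return prob
```

Because the logarithm is monotonic, the path with the largest metric is also the path with the largest conditional probability, which is why the decoder can work with sums instead of products.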

Maximum Likelihood Decoding for a BSC


• For a BSC (Q = 2) with transition probability $p < 1/2$, the log-likelihood function becomes

$$\log P(\mathbf{r} \mid \mathbf{c}) = d(\mathbf{r}, \mathbf{c}) \log \frac{p}{1-p} + (L+m)\, n \log(1-p) \tag{9}$$

where $d(\mathbf{r}, \mathbf{c})$ is the Hamming distance between $\mathbf{r}$ and $\mathbf{c}$.
• Since $\log[p/(1-p)] < 0$ and $(L+m)\, n \log(1-p)$ is a constant for all code sequences $\mathbf{c}$, $\log P(\mathbf{r} \mid \mathbf{c})$ is maximized if and only if $d(\mathbf{r}, \mathbf{c})$ is minimized.
• MLD: The received sequence $\mathbf{r}$ is decoded into the code sequence $\mathbf{c}$ for which $d(\mathbf{r}, \mathbf{c})$ is minimized.
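As a quick numerical check of the BSC log-likelihood, the sketch below compares a bit-by-bit evaluation of $\log P(\mathbf{r} \mid \mathbf{c})$ with the closed form in terms of the Hamming distance. The function names are illustrative, and the sequences are passed as flat bit lists of length $(L+m)n$.

```python
import math

def hamming_distance(r_bits, c_bits):
    """Number of positions in which the two bit sequences differ."""
    return sum(rb != cb for rb, cb in zip(r_bits, c_bits))

def bsc_log_likelihood(r_bits, c_bits, p):
    """log P(r | c) on a BSC, accumulated one bit at a time."""
    return sum(math.log(p if rb != cb else 1.0 - p)
               for rb, cb in zip(r_bits, c_bits))

def bsc_log_likelihood_closed_form(r_bits, c_bits, p):
    """Closed form: d * log(p / (1 - p)) + N * log(1 - p), N = total bits."""
    d = hamming_distance(r_bits, c_bits)
    return d * math.log(p / (1.0 - p)) + len(r_bits) * math.log(1.0 - p)
```

For example, with `r_bits = [0, 1, 1, 1]`, the candidate `[0, 0, 1, 1]` (distance 1) has a larger log-likelihood than `[0, 0, 0, 0]` (distance 3) for any `p < 0.5`.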

2. The Viterbi Decoding Algorithm


• The Viterbi algorithm performs maximum likelihood decoding but reduces the computational complexity by taking advantage of the special structure of the code trellis.
• It was first introduced by A. Viterbi in 1967.
• D. Forney first recognized in 1973 that it is an MLD algorithm for convolutional codes.

Basic Concepts
• Generate the code trellis at the decoder.
• The decoder moves through the code trellis level by level in search of the transmitted code sequence.
• At each level of the trellis, the decoder computes and compares the metrics of all the partial paths entering a node.
• The decoder stores the partial path with the largest metric and eliminates all the other partial paths. The stored partial path is called the survivor.
• For $m \le l \le L$, there are $2^{km}$ nodes at the $l$-th level of the code trellis. Hence there are $2^{km}$ survivors.
• When the code trellis begins to terminate, the number of survivors decreases.
• At the end, the $(L+m)$-th level, there is only one node (the all-zero state) and hence only one survivor.
• This last survivor is the maximum likelihood path (or code sequence).

The Viterbi Algorithm
Step 1. Starting at level $l = m$ in the trellis, compute the partial metric for the single path entering each $m$-th order node. Store the path (the survivor) and its metric for each node.
Step 2. Increase $l$ by 1. Compute the partial metric for all the paths entering each $(l+1)$-th order node by adding the metric of the branch entering that node to the metric of the connecting survivor at the previous $l$-th order node. For each $(l+1)$-th order node, store the path with the largest metric (the survivor), together with its metric, and eliminate all the other paths.
Step 3. If $l < L+m$, repeat Step 2. Otherwise, stop.
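The steps above can be sketched as a short hard-decision decoder for a terminated rate-1/2 code on a BSC. This is only an illustration under stated assumptions: the generator sequences `G` (octal 7 and 5) are assumed, since this section does not list them; Hamming distance is used as the metric, so the survivor is the path with the smallest metric; and the loop starts at level 0 rather than level $m$, which gives the same survivors because only one path enters each node before level $m$. All names are hypothetical.

```python
# Assumed generator sequences for a (2,1,2) code (octal 7, 5); not given in this section.
G = ((1, 1, 1), (1, 0, 1))
M = 2  # memory order

def branch_output(bit, state):
    """Code block for input `bit` leaving `state` = (u[t-1], u[t-2])."""
    window = (bit,) + state
    return tuple(sum(g * u for g, u in zip(gen, window)) % 2 for gen in G)

def encode(message):
    """Encode L message bits followed by m zeros that terminate the trellis."""
    state, blocks = (0,) * M, []
    for bit in list(message) + [0] * M:
        blocks.append(branch_output(bit, state))
        state = (bit,) + state[:-1]
    return blocks

def viterbi_decode(received, msg_len):
    """Return (decoded message bits, Hamming distance of the final survivor)."""
    survivors = {(0,) * M: (0, [])}  # state -> (metric, input bits so far)
    for level, r_block in enumerate(received):
        inputs = (0, 1) if level < msg_len else (0,)  # tail bits are forced to 0
        updated = {}
        for state, (metric, path) in survivors.items():
            for bit in inputs:
                c_block = branch_output(bit, state)
                candidate = metric + sum(a != b for a, b in zip(r_block, c_block))
                nxt = (bit,) + state[:-1]
                # Compare and eliminate: keep only the best path into each node.
                if nxt not in updated or candidate < updated[nxt][0]:
                    updated[nxt] = (candidate, path + [bit])
        survivors = updated
    metric, path = survivors[(0,) * M]  # single survivor at the terminal node
    return path[:msg_len], metric
```

A typical use is `viterbi_decode(received_blocks, 5)` with seven received 2-bit blocks. Storing the full input history per state keeps the sketch short; a practical decoder would use a bounded path memory.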

Example 9.1: Consider the (2,1,2) convolutional code given in Example 8.2, whose trellis diagram is shown in Fig. 9.1. Suppose the code is used for a BSC. In this case, we may use the Hamming distance as the path metric. The survivor at each node is the path with the smallest Hamming distance from the received sequence.
• The message length is L = 5.
• There are 7 levels in the trellis.
• The decoding process is shown in Figures 9.2 to 9.8.

Figure 9.1 The trellis diagram of a (2,1,2) code with L = 5 (initial node on the left, terminal node on the right).

Figure 9.2 Decoding process at level 2

r = (01, 11)
c1 = (00, 00), d(c1, r) = 3
c2 = (00, 11), d(c2, r) = 1
c3 = (11, 01), d(c3, r) = 2
c4 = (11, 10), d(c4, r) = 2

Figure 9.3 Decoding process at level 3 (comparison and elimination), r = (01, 11, 10)

Figure 9.4 Decoding process at level 4 (comparison and elimination), r = (01, 11, 10, 10)

Figure 9.5 Decoding process at level 5 (comparison and elimination), r = (01, 11, 10, 10, 00)

Figure 9.6 Decoding process at level 6 (comparison and elimination), r = (01, 11, 10, 10, 00, 11)

Figure 9.7 Decoding process at level 7 (comparison and elimination), r = (01, 11, 10, 10, 00, 11, 10)

Figure 9.8 Decoding termination

3. Error Performance
• Without loss of generality, we assume that the all-zero code sequence (the all-zero path in the code trellis) is transmitted.
• In the process of decoding, we say that a first-event error is committed if the all-zero path (the correct path) is eliminated for the first time at an arbitrary node in the trellis.
• A measure of the error performance of a convolutional code is the first-event error probability, denoted P(E).
• Another measure of the error performance is the probability that a decoded message bit is in error. This probability is normally called the decoded bit-error rate (BER), denoted Pb(E).
• Upper bounds on these two error probabilities can be derived from the generating functions of the code.
• Two special classes of channels are considered here.

For a BSC with Hard Decision Decoding


• Let $p$ be the channel transition probability.
• Then the first-event error probability P(E) is upper bounded as follows:

$$P(E) < T(X)\Big|_{X = 2\sqrt{p(1-p)}} \tag{9}$$

• Let

$$T(X, Y) = T(X, Y, Z)\Big|_{Z=1}$$

• Then the decoded BER Pb(E) is upper bounded as follows:

$$P_b(E) < \frac{1}{k} \frac{\partial T(X, Y)}{\partial Y}\Big|_{Y=1,\; X = 2\sqrt{p(1-p)}} \tag{10}$$

• For small $p$, P(E) and Pb(E) can be approximated as follows:

$$P(E) \approx A_{d_{free}}\, 2^{d_{free}}\, p^{d_{free}/2} \tag{11}$$

$$P_b(E) \approx \frac{1}{k} B_{d_{free}}\, 2^{d_{free}}\, p^{d_{free}/2} \tag{12}$$

where $A_{d_{free}}$ is the number of code sequences of weight $d_{free}$, and $B_{d_{free}}$ is the total number of nonzero message bits on all the code sequences of weight $d_{free}$.

Example 9.2: Consider the (2,1,2) code of Example 9.1. We have

$$T(X) = X^5 + 2X^6 + 4X^7 + \cdots$$

$$T(X, Y) = X^5 Y + 2X^6 Y^2 + 4X^7 Y^3 + \cdots$$

• We find that $d_{free} = 5$, $A_{d_{free}} = 1$ and $B_{d_{free}} = 1$.
• For $p = 10^{-3}$, we find that

$$P(E) \approx 2^5 (10^{-3})^{5/2} \approx 1.01 \times 10^{-6}$$

$$P_b(E) \approx 2^5 (10^{-3})^{5/2} \approx 1.01 \times 10^{-6}$$
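The arithmetic of Example 9.2 can be reproduced with a few lines. This is a small sketch of the small-$p$ approximations (11) and (12); the function names are illustrative.

```python
def first_event_error_approx(a_dfree, d_free, p):
    """Approximation (11): A_dfree * 2^dfree * p^(dfree/2)."""
    return a_dfree * 2 ** d_free * p ** (d_free / 2)

def bit_error_approx(b_dfree, d_free, p, k=1):
    """Approximation (12): (1/k) * B_dfree * 2^dfree * p^(dfree/2)."""
    return b_dfree * 2 ** d_free * p ** (d_free / 2) / k

# Example 9.2: d_free = 5, A = B = 1, k = 1, p = 1e-3
p_e = first_event_error_approx(1, 5, 1e-3)   # about 1.01e-6
p_b = bit_error_approx(1, 5, 1e-3)           # about 1.01e-6
```

Both values come out at roughly $1.01 \times 10^{-6}$, since $2^5 \times 10^{-7.5} = 32 \times 3.16 \times 10^{-8}$.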

• If the BSC is derived from an additive white Gaussian noise (AWGN) channel with BPSK modulation, optimum coherent detection and binary output quantization, then

$$p = Q\!\left(\sqrt{\frac{2E}{N_0}}\right) \approx \frac{1}{2} e^{-E/N_0} \tag{13}$$

where $E$ is the energy per transmitted symbol and $N_0$ is the one-sided noise power spectral density.
• For a code of rate $R = k/n$, the energy per message bit is

$$E_b \triangleq \frac{E}{R}$$

• For large $E_b/N_0$, substituting (13) with $E = R E_b$ into (12) gives the bit-error probability with coding:

$$P_b(E) \approx \frac{1}{k} B_{d_{free}}\, 2^{d_{free}/2}\, e^{-(R d_{free}/2)(E_b/N_0)} \tag{14}$$

For a Binary Input AWGN Channel With No Output Quantization (Soft Decision)
• P(E) and Pb(E) are upper bounded as follows:

$$P(E) < T(X)\Big|_{X = e^{-R E_b/N_0}} \tag{15}$$

$$P_b(E) < \frac{1}{k} \frac{\partial T(X, Y)}{\partial Y}\Big|_{Y=1,\; X = e^{-R E_b/N_0}} \tag{16}$$

• For large $E_b/N_0$,

$$P_b(E) \approx \frac{1}{k} B_{d_{free}}\, e^{-(R d_{free})(E_b/N_0)} \tag{17}$$
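To get a feel for the hard-decision and soft-decision expressions, the sketch below evaluates the BSC transition probability (13) and the large-$E_b/N_0$ approximations (14) and (17). It assumes the $Q$-function is computed from the standard library's `math.erfc`, and it uses the factor $2^{d_{free}/2}$ that results from substituting (13) into (12). Function names and the dB-based interface are illustrative choices.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def bsc_transition_probability(ebn0_db, rate):
    """p = Q(sqrt(2 E / N0)) with E = R * Eb, as in (13)."""
    e_n0 = rate * 10 ** (ebn0_db / 10)
    return q_function(math.sqrt(2.0 * e_n0))

def pb_hard_asymptotic(ebn0_db, rate, d_free, b_dfree, k=1):
    """Large-Eb/N0 approximation (14) for hard-decision decoding."""
    ebn0 = 10 ** (ebn0_db / 10)
    return b_dfree * 2 ** (d_free / 2) * math.exp(-(rate * d_free / 2) * ebn0) / k

def pb_soft_asymptotic(ebn0_db, rate, d_free, b_dfree, k=1):
    """Large-Eb/N0 approximation (17) for unquantized soft-decision decoding."""
    ebn0 = 10 ** (ebn0_db / 10)
    return b_dfree * math.exp(-rate * d_free * ebn0) / k
```

For instance, `pb_soft_asymptotic(6, 0.5, 5, 1)` is smaller than `pb_hard_asymptotic(6, 0.5, 5, 1)` by several orders of magnitude, because the soft-decision exponent is twice as large.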

Soft-Decision Decoding vs. Hard-Decision Decoding


• Comparing the exponent of (17) with that of (14), we see that the exponent of (17) is larger by a factor of 2.
• This is equivalent to a 3 dB asymptotic energy (or power) advantage for the AWGN channel with soft-decision decoding over the BSC with hard-decision decoding: to achieve the same error probability on the BSC, the transmitter must generate an additional 3 dB of signal energy (or power).
• A soft-decision decoder for the unquantized demodulator output is more complex than a hard-decision decoder, because it must accept analog inputs.

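• Hard decision:

$$P_b(E) \approx \frac{1}{k} B_{d_{free}}\, 2^{d_{free}/2} \exp\!\left[-\left(\frac{R d_{free}}{2}\right)\left(\frac{E_b}{N_0}\right)\right]$$

• Soft decision:

$$P_b(E) \approx \frac{1}{k} B_{d_{free}} \exp\!\left[-R d_{free}\left(\frac{E_b}{N_0}\right)\right]$$

• For fixed $E_b/N_0$,

$$P_b(E)_{soft} < P_b(E)_{hard}$$

• For the same Pb(E), soft-decision decoding requires less $E_b/N_0$ than hard-decision decoding:

$$(E_b/N_0)_{soft} < (E_b/N_0)_{hard}$$

Equating the two exponents gives

$$(E_b/N_0)_{hard} = 2\,(E_b/N_0)_{soft}$$

$$\text{Gain} = 10 \log_{10} \frac{(E_b/N_0)_{hard}}{(E_b/N_0)_{soft}} = 10 \log_{10} 2 = 3 \text{ dB}.$$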

• If the demodulator output is quantized to 8 levels, the performance of soft-decision decoding is within about 1/4 dB of the optimum performance achievable with an unquantized demodulator output, while avoiding the need for an analog decoder.
• For small $E_b/N_0$, the energy gain of soft-decision decoding over hard-decision decoding is less than 3 dB, about 2 dB.
• Over the entire range of $E_b/N_0$ ratios, the gain of soft-decision decoding over hard-decision decoding runs between 2 and 3 dB.

4. Coding Gain
• Coding gain is defined as the reduction in the required $E_b/N_0$ (usually expressed in decibels) to achieve a specified error probability of a coded system over an uncoded system with the same modulation and channel characteristics.
• For an uncoded coherent BPSK system with an AWGN channel, the bit-error rate is simply the transition probability with $E = E_b$,

$$P_b(E) = Q\!\left(\sqrt{\frac{2E_b}{N_0}}\right)$$

• For large $E_b/N_0$, this error rate (without coding) is approximated by

$$P_b(E) \approx 0.282\, e^{-E_b/N_0} \tag{18}$$

Coding Gain with Hard-Decision Decoding
• Comparing (14) to (18), we see that for a fixed $E_b/N_0$, the exponent with coding is larger than the exponent without coding by a factor of $R d_{free}/2$.
• For large values of $E_b/N_0$, the exponential term dominates the error probability expressions.
• Therefore, to achieve the same output bit-error rate Pb(E), the coded system requires

$$G = 10 \log_{10}\!\left(\frac{R d_{free}}{2}\right) \text{ dB} \tag{19}$$

less power than the uncoded BPSK system.
• $G$ is called the asymptotic coding gain of the coded system over the uncoded BPSK system.

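• Uncoded coherent BPSK:

$$P_b(E) \approx 0.282 \exp(-E_b/N_0)$$

• Coded BPSK with hard decision:

$$P_b(E) \approx \frac{1}{k} B_{d_{free}}\, 2^{d_{free}/2} \exp\!\left[-\left(\frac{R d_{free}}{2}\right)\left(\frac{E_b}{N_0}\right)\right]$$

• For fixed $E_b/N_0$,

$$P_b(E)_{\text{coded BPSK}} < P_b(E)_{\text{uncoded BPSK}}$$

• For the same Pb(E),

$$\frac{(E_b/N_0)_{\text{uncoded BPSK}}}{(E_b/N_0)_{\text{coded BPSK}}} = \frac{R d_{free}}{2}$$

$$\text{Coding gain} = 10 \log_{10} \frac{R d_{free}}{2} \text{ dB}$$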

Coding gain with Soft-Decision Decoding (No Output Quantization)


• Comparing (17) to (18), we find that the asymptotic coding gain of a coded system with soft-decision decoding over an uncoded BPSK system is

$$G = 10 \log_{10}(R d_{free}) \text{ dB} \tag{20}$$

5. Construction of Good Convolutional Codes


• The error performance of a convolutional code with Viterbi decoding is determined by its minimum free distance $d_{free}$ and weight structure.
• For a given rate $R = k/n$ and memory order $m$, $d_{free}$ should be maximized, while $A_{d_{free}}$ and $B_{d_{free}}$ should be minimized.
• Analytic construction of good convolutional codes has not been successful.
• Most code construction has been done by computer search. As a result, only codes of relatively short constraint length (small memory order) which have maximal $d_{free}$ have been found.

6. Catastrophic Error Propagation


• A catastrophic error is an event in which a finite number of channel errors results in an infinite number of decoded data bit errors.
• This is a very undesirable event.
• A code that exhibits catastrophic error propagation is called a catastrophic code.
• Necessary and sufficient conditions for a code to be non-catastrophic have been derived by Massey and Sain.
• In code construction, catastrophic codes must always be avoided.
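For rate-1/n codes, the Massey and Sain condition is commonly stated as follows: the code is non-catastrophic if and only if the greatest common divisor of its generator polynomials over GF(2) is a power of $D$. The sketch below checks this for the rate-1/n case only. Representing a polynomial as a Python integer (bit $i$ is the coefficient of $D^i$) and the function names are choices made for this illustration.

```python
def gf2_mod(a, b):
    """Remainder of a divided by b for GF(2) polynomials stored as ints (b != 0)."""
    while a and a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a

def gf2_gcd(a, b):
    """Euclidean algorithm over GF(2)[D]."""
    while b:
        a, b = b, gf2_mod(a, b)
    return a

def is_catastrophic(generators):
    """Rate-1/n test: catastrophic unless the gcd has a single nonzero term."""
    g = 0
    for poly in generators:
        g = gf2_gcd(g, poly)
    return g & (g - 1) != 0  # more than one bit set, so not a power of D

# 1 + D and 1 + D^2 share the factor 1 + D, so this pair is catastrophic.
catastrophic_pair = (0b11, 0b101)
# 1 + D + D^2 and 1 + D^2 have gcd 1, so this pair is not.
good_pair = (0b111, 0b101)
```

Reading a generator sequence from either end gives the same verdict when both end coefficients are 1, since reversing the bits maps each polynomial to its reciprocal.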

7. The Most Widely Used Convolutional Codes


• The most widely used convolutional code is the (2,1,6) Odenwalder code, generated by the following generator sequences:

$$g^{(1)} = (1101101), \quad g^{(2)} = (1001111).$$

• This code has $d_{free} = 10$.
• With hard-decision decoding, it provides a 3.98 dB asymptotic coding gain over the uncoded BPSK modulation system.
• With soft-decision decoding, the asymptotic coding gain is 6.99 dB.
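The two coding-gain figures quoted for this code follow from the asymptotic formulas $10 \log_{10}(R d_{free}/2)$ for hard decisions and $10 \log_{10}(R d_{free})$ for soft decisions. A minimal sketch, with an illustrative function name:

```python
import math

def asymptotic_coding_gain_db(rate, d_free, soft_decision):
    """Asymptotic coding gain over uncoded BPSK, in dB."""
    factor = rate * d_free if soft_decision else rate * d_free / 2
    return 10 * math.log10(factor)

# (2,1,6) code: R = 1/2, d_free = 10
hard_gain = asymptotic_coding_gain_db(0.5, 10, soft_decision=False)  # 10*log10(2.5)
soft_gain = asymptotic_coding_gain_db(0.5, 10, soft_decision=True)   # 10*log10(5)
```

The difference between the two values is $10 \log_{10} 2 \approx 3$ dB, the asymptotic advantage of soft-decision over hard-decision decoding.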

8. Implementation Considerations of the Viterbi Decoder

• The complexity of a Viterbi decoder mostly depends on the number of states in the trellis (or state diagram) and the decoding span (the length of the survivors that must be stored for making a decoding decision).

Storage Consideration
• Consider an (n,k,m) convolutional code.
• The encoder for this code consists of $k$ shift registers to store message bits from $k$ input terminals.
• Let $m_i$ be the length of the $i$-th shift register. Then the memory order of the code is

$$m = \max_{1 \le i \le k} m_i$$

• Let $K = m_1 + m_2 + \cdots + m_k$.
• Then the encoder stores a total of $K$ message bits at a time. Therefore, the encoder has $2^K$ possible states.
• This implies that, in the decoding process, there are $2^K$ survivors.
• The decoder must reserve $2^K$ words of storage (or buffer registers) for the survivors.
• Each word must be capable of storing the surviving path along with its metric.
• Since the storage size increases exponentially with $K$, in practice it is not feasible to use codes with large $K$.

• $K = 8$ is normally considered the practical limit for the Viterbi decoding algorithm.
• This constraint on $K$ limits the available minimum free distance of a code.
• As a result, the achievable error probability cannot be made arbitrarily small. Bit-error probabilities in the range $10^{-5}$ to $10^{-6}$ and (soft-decision) coding gains of around 7 dB are considered the practical limit for the Viterbi decoding algorithm in most cases.

Path Memory
• Suppose a message sequence is $kL$ bits (or $L$ blocks) long.
• To terminate the trellis for making a decoding decision, $m$ blocks of 0s must be inserted into the input stream after every $L$ blocks of message bits.
• By doing this, the effective rate of information transmission is reduced from $R$ to

$$R_{eff} = \frac{L}{L+m} R.$$

• Since energy per message bit is inversely proportional to rate, a lower effective rate means a larger required $E_b/N_0$ to achieve a given performance.
• Hence, it is desirable to have $L$ as large as possible so that $R_{eff}$ is nearly $R$.
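The rate loss caused by trellis termination is easy to quantify. The sketch below computes $R_{eff}$ and the corresponding increase in required $E_b/N_0$, which is $10 \log_{10}(R/R_{eff}) = 10 \log_{10}((L+m)/L)$ dB because the energy per message bit is inversely proportional to rate. Function names are illustrative.

```python
import math
from fractions import Fraction

def effective_rate(rate, L, m):
    """R_eff = L / (L + m) * R for a trellis terminated with m zero blocks."""
    return Fraction(L, L + m) * rate

def rate_loss_db(L, m):
    """Extra Eb/N0 (in dB) needed because of the m terminating blocks."""
    return 10 * math.log10((L + m) / L)

# The short example used earlier: R = 1/2, L = 5, m = 2
r_eff_short = effective_rate(Fraction(1, 2), 5, 2)   # Fraction(5, 14)
loss_short = rate_loss_db(5, 2)                      # about 1.46 dB
# A long message with m = 6 loses very little
loss_long = rate_loss_db(1000, 6)                    # about 0.03 dB
```

The contrast between the two cases illustrates why a large $L$ is desirable.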

• The difficulty with large $L$ is that each of the $2^K$ words of storage must be capable of storing a $kL$-bit (hypothesized) message sequence (corresponding to a surviving path) plus its metric.
• For very large $L$, this is practically impossible, and some trade-offs must be made.
• One approach to this problem is to truncate the path memory of the decoder by storing only the most recent $r$ blocks of message bits for each survivor, where $r \ll L$.
• After the first $r$ blocks of the received sequence have been processed by the decoder, the decoder memory is full.

• As soon as the next received block is processed, a decoding decision must be made on the first block of $k$ message bits, since they can no longer be stored in the decoder memory.
• The optimum strategy for this decision is to select the survivor with the best metric; the first block of $k$ message bits of this survivor is chosen as the decoded message block and released to the user.
• After the first decoding decision is made, subsequent decoding decisions are made in the same manner for each new received block processed.
• Note that decoding decisions made in this way are no longer maximum likelihood, but can be almost as good if $r$ is large enough.

• Experience and analysis have shown that if $r$ is on the order of 5 times the encoder memory $K$ or more, then with probability approaching 1 all the $2^K$ survivors stem from the same branch $r$ levels back, as shown in Figure 9.9.
• Hence there is no ambiguity in making the decoding decision.
• The parameter $r$ is called the decoding span (or depth).
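A rough storage estimate shows why truncating the path memory matters. The sketch below assumes each of the $2^K$ words holds $k$ bits per stored block plus a fixed-width metric; the metric width `metric_bits` is a hypothetical parameter, not a value from the text, and the function names are illustrative.

```python
def decoding_span(K, factor=5):
    """Rule of thumb from the text: about 5 times the encoder memory K."""
    return factor * K

def path_memory_bits(K, k, blocks_stored, metric_bits):
    """Total survivor storage: 2^K words of (k * blocks_stored + metric_bits) bits."""
    return 2 ** K * (k * blocks_stored + metric_bits)

# (2,1,6) code: K = 6, k = 1; an 8-bit metric is assumed for illustration.
span = decoding_span(6)                          # 30 blocks
truncated = path_memory_bits(6, 1, span, 8)      # 64 * (30 + 8) bits
full = path_memory_bits(6, 1, 1000, 8)           # 64 * (1000 + 8) bits for L = 1000
```

With truncation the storage no longer grows with the message length $L$; it depends only on $K$, $k$, and the chosen span.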

Figure 9.9 Decoding decision with a finite path memory r

Decoder Organization
• A functional block diagram for a general Viterbi decoder is shown in Figure 9.10.
• It consists of
(1) A synchronizer for determining the beginning of a branch in the received bit stream.
(2) A branch metric computer.
(3) A path metric updating, comparison and storage device.
(4) A device for updating and storing the survivors.
(5) An output decision device.
• A complete decoder can be built on a single chip.

Figure 9.10 Functional block diagram for a Viterbi decoder (blocks: input, synchronizer, branch metric computer, path metric updating and storage, timing and control, information sequence updating and storage, output decision device, output).

9. Concatenated Coding with a Convolutional Code as the Inner Code


• For moderate values of coding gain (or reliability), we may use short constraint-length convolutional codes with soft-decision Viterbi decoding.
• To achieve large coding gains (or high reliability) with reduced decoding complexity, we may use concatenated coding with an RS outer code and a short constraint-length convolutional code as the inner code, decoded by the Viterbi algorithm. Such a concatenated coding scheme is called RS/Viterbi concatenated coding.
• This scheme can achieve extremely high reliability with modest complexity.

• One such scheme is being considered for error control in the NASA TDRS system. In this RS/Viterbi concatenated coding scheme, the outer code is the NASA standard (255,223) RS code over GF($2^8$) and the inner code is the (2,1,6) Odenwalder convolutional code with Viterbi decoding.