
83050E/1

TLT-5406 Digital Transmission


Lecture Notes, Spring 2006
Markku Renfors
Institute of Communications Engineering
Tampere University of Technology
Contents
Introduction 2
Brief Introduction to Information Theory 5
Information theory, Lossless source coding; Channel capacity
Transmission Channels 36
Baseband Digital Transmission 46
Baseband transmission techniques and line coding principles;
Nyquist pulse shaping principles; Eye diagram.
Modulation in Digital Transmission 87
Linear digital modulation, QAM, PSK; BER calculation and
performance evaluation; Bit mapping principles; VSB, CAP, D-PSK
FSK-Type Modulation Methods 141
FSK, MSK, CPFSK, GMSK
Basics of Detection and Estimation Theory 158
ML and MAP principles in BSC and AWGN cases,
ML sequence detection, Viterbi algorithm
Signal Space Concepts 199
Maximum Likelihood Detection for Continuous-Time Channels 208
Optimum receiver principles, Sufficient statistics, Correlation receiver,
Matched filter receiver, Sampled matched filter
Channel Equalizer Structures 232
Zero forcing and MSE principles; LE, FSE, DFE, MLSD/Viterbi
Adaptive Channel Equalization 258
General concepts, LMS algorithm in LE and DFE
Synchronization in Digital Receivers 284
Partial Response Signalling 320
Scrambling Techniques 329
Error Control Coding 334
Basics of error control coding, Coding gain, Block codes,
Reed-Solomon codes
Convolutional Codes 368
Convolutional codes, Concatenated coding, Interleaving
Trellis Coding, Coded Modulation 388
83050E/2

INTRODUCTION
_______________________________________
A digital transmission system may or may not include conversions
between analog and digital signals (sampling, A/D- and D/A-
conversion)
The transmitter end of the transmission chain converts a digital
bit-stream into an analog waveform which is sent to the physical
channel, which is in practice analog. The receiving end converts
the received analog waveform back to digital format.
The transmission chain includes:
Source coding/decoding: Reducing the bit-rate of the
information signal by reducing the redundancy;
Compression.
Channel coding/decoding: Error control coding,
compensating the effect of bit errors that inevitably take
place in a practical transmission channel.
In any 'sensible' channel, it is possible to get arbitrarily
low bit-error-rate by increasing redundancy in the
transmitted signal and using error control coding.
One of the central results of information theory is that
source coding and channel coding can, in principle, be
carried out independently of each other.
Modulation/demodulation: converting a digital signal into
analog waveform.
Channel, that distorts the transmitted signal and adds noise
and interference to it.
When designing the system, the primary target is to minimize the
used bandwidth and/or the transmitted signal power.
83050E/3

Block Diagram of a Digital Transmission System


__________________________________________

(Block diagram of a digital transmission system:
Transmitter: analog input -> sampling and quantization -> source coding,
or digital input -> source coding; then channel coding -> modulation -> channel.
Channel: adds noise and interferences.
Receiver: demodulation -> channel equalization -> detection -> channel decoding
-> source decoding -> reconstruction and D/A conversion -> analog output,
or digital output. A synchronization block controls the receiver.)

In the following, the digital transmission system is considered to


include the chain between (and including) channel coding and
decoding.

With this definition, the transmission chain can be designed and


analyzed independently of the nature of the transmitted signal.
83050E/4

The Parameters of a Digital Transmission System


____________________________________________________

The properties of a transmission chain are characterized


by the following parameters:

Transmission rate, bit rate (bits/s, bps)

Bit-error-rate

Delay (due to propagation and signal processing)

'Timing jitter' in the bit-stream obtained from the channel


decoder.
83050E/5

BRIEF INTRODUCTION TO INFORMATION THEORY


_____________________________________________________

In this section we study the concepts of information, entropy,


and channel capacity.

With this theory, it is possible to calculate the largest


possible information transmission rate through a given
channel. This is called the channel capacity.

Even though it is usually not possible to achieve the channel


capacity in a practical system, it is a good reference point
when evaluating the system performance.

In fact, the Shannon-Hartley law is an important


fundamental law of nature in the field of communication
theory, and it is quite useful also in practical engineering
work.

Source: Lee&Messerschmitt, Chapter 4.

Source coding part: Benedetto&Biglieri, Chapter 3;


Sklar, Chapter 3;
Gitlin, Chapter 2.
83050E/6

Definition of Information
__________________________________________________

Here the message signal is modeled as a random process.

We begin by considering observations of a random variable:


Each observation gives a certain amount of information.
Rare observations give more information than usual
ones.

Example: The statement "The sun rose this morning" gives
very little information (high probability).

The statement "San Francisco was destroyed this
morning by an earthquake" gives a lot of
information (low probability).

Definition:

Observing a random variable X that takes its values from the
set X = {a_1, a_2, ..., a_K}, the (self-)information of observation
a_m is

h(a_m) = -log2( p_X(a_m) )
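
As a quick numerical illustration (not part of the original notes), the
self-information of the two example statements above could be computed
along these lines; the probabilities are made-up values used only for
illustration:

    import math

    def self_information(p):
        # Self-information in bits: h = -log2(p)
        return -math.log2(p)

    # Hypothetical probabilities, chosen only to illustrate the idea
    print(self_information(0.999))   # "the sun rose"  -> about 0.0014 bits
    print(self_information(1e-6))    # "earthquake"    -> about 19.9 bits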


83050E/7

Definition of Information - Interpretation


__________________________________________________

It is easy to see that

0 <= h(a_m)

For a rare event the probability p_X(a_m) is small and the
information is large.

For a usual event the probability p_X(a_m) ≈ 1 and the
information is small.

Why logarithm?

In case of two independent random variables X and Y,
with Y taking values from the set Y = {b_1, b_2, ..., b_N},
the information of the joint event (a_m, b_n) becomes

h(a_m, b_n) = -log2( p_X,Y(a_m, b_n) ) = -log2( p_X(a_m) ) - log2( p_Y(b_n) )
            = h(a_m) + h(b_n)

In case of independent events, the information is


additive, which is intuitively sensible.

Using the base 2 logarithm => the unit of information is the bit.

(base 10 logarithm => the unit is the Hartley or dit)
(base e (natural) logarithm => the unit is the nat)
83050E/8

Entropy
__________________________________________________

The average information of a random variable is

H(X) = E[ -log2( p_X(x) ) ] = -sum_{x in X} p_X(x) log2( p_X(x) )

This is called the entropy.

Entropy has the following interpretations:


Average information obtained from an observation.
Average uncertainty about X before the observation.
Example
Binary random variable X, X = {0, 1}, p_X(1) = q
The entropy is
H(X) = -q log2(q) - (1-q) log2(1-q)
The maximum entropy is 1 and it is obtained when q = 1/2.
The entropy becomes zero for q = 0 or q = 1.
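
A short sketch (assuming Python) that evaluates the binary entropy
function and confirms the maximum of 1 bit at q = 1/2:

    import math

    def binary_entropy(q):
        # H(X) = -q*log2(q) - (1-q)*log2(1-q), with H = 0 at q = 0 or q = 1
        if q in (0.0, 1.0):
            return 0.0
        return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

    for q in (0.0, 0.1, 0.5, 0.9, 1.0):
        print(q, round(binary_entropy(q), 3))   # 0.0, 0.469, 1.0, 0.469, 0.0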
83050E/9

Example: The Entropy of an English Text


Memoryless model of English text:

Source Symbol a_i    Probability p_i

Space 0.186
A 0.064
B 0.013
C 0.022
D 0.032
E 0.103
F 0.021
G 0.015
H 0.047
I 0.058
J 0.001
K 0.005
L 0.032
M 0.020
N 0.057
O 0.063
P 0.015
Q 0.001
R 0.048
S 0.051
T 0.080
U 0.023
V 0.008
W 0.017
X 0.001
Y 0.016
Z 0.001

Entropy: H(X) = -sum_i p_i log2(p_i) = 4.03 bits
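
The entropy value above can be checked with a few lines of Python
(the probability list simply repeats the table):

    import math

    probs = [0.186, 0.064, 0.013, 0.022, 0.032, 0.103, 0.021, 0.015, 0.047,
             0.058, 0.001, 0.005, 0.032, 0.020, 0.057, 0.063, 0.015, 0.001,
             0.048, 0.051, 0.080, 0.023, 0.008, 0.017, 0.001, 0.016, 0.001]

    H = -sum(p * math.log2(p) for p in probs)
    print(round(H, 2))   # roughly 4 bits, close to the 4.03 bits quoted above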
83050E/10

Source Coding Theorem


__________________________________________________

Let us consider a discrete-time and discrete-amplitude


source that creates independent observations of a random
variable X at rate r samples per second.

The rate of the source is defined as R = rH ( X ) .

Such a source can be coded using a source coder
into a bit-stream, the rate of which is less than R + ε,
for any ε > 0.

It is worth noting that it is often very difficult to construct


codes that provide a rate arbitrarily close to the rate R. But
often it is easy to get rather close to the rate R.
83050E/11

An Example of Source Coding


__________________________________________________

Let us consider again the case of binary random variable.

(1) q = 1/2

Now the entropy is H ( X ) = 1, so 1 bit per sample is


needed. In this case it is best to send the samples as
they are.

(2) q=0.1

H(X) = -0.1 log2(0.1) - 0.9 log2(0.9) = 0.47


On the average, less than one bit per sample is
sufficient. Now it is possible to construct codes where
the average number of bits per sample is in the range
0.47 ... 1.
One very simple but rather efficient code is obtained by
coding two consecutive source bits as follows:

samples code word


0,0 0
0,1 10
1,0 110
1,1 111

It is also easy to decode a bit-stream constructed in this


way since no code word is a prefix of another.
In this code, 0.645 bits per sample are used on the
average.
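
The 0.645 bits/sample figure is easy to verify: with q = 0.1, the four
pair probabilities and the codeword lengths give the average length per
pair, which is then halved. A minimal check in Python:

    # Pair probabilities for q = 0.1: P(0,0)=0.81, P(0,1)=0.09, P(1,0)=0.09, P(1,1)=0.01
    pair_probs   = [0.81, 0.09, 0.09, 0.01]
    word_lengths = [1, 2, 3, 3]            # codewords 0, 10, 110, 111

    avg_bits_per_pair = sum(p * n for p, n in zip(pair_probs, word_lengths))
    print(avg_bits_per_pair / 2)           # 0.645 bits per source sample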
83050E/12

More Examples
__________________________________________________

(1) When throwing a coin, if the result is always heads (it
might be an unfair coin :-) the entropy becomes:

H(X) = -0·log2(0) - 1·log2(1) = 0   (using the convention 0·log2(0) = 0)

(2) It is easy to show that the entropy satisfies

H(X) <= log2(K)

where K is the number of possible outcomes (sample
values, symbols, ...). log2(K) bits is, of course, always
enough. It can also be shown that H(X) = log2(K) only if
all possible outcomes are equally probable.
83050E/13

About Source Coding


_________________________________________________

In the following we consider briefly binary coding methods


that aim at minimizing the number of used bits as close to
the source entropy as possible. These methods are lossless
in the sense that the decoder is able to reproduce exactly
the original source bit stream.
Many coding principles applied in this context utilize unequal
code wordlengths (i.e., different number of bits for different
symbols or symbol blocks). Naturally, the shortest word-
lengths are used for the most common symbols/symbol
blocks.
An important requirement for a code is that any coded bit-
stream can be uniquely decoded. One common principle to
achieve this goal is the prefix property: none of the code-
words appears as the beginning of another codeword. This
guarantees that, when scanning the bit-stream, a recognized
codeword (belonging to the code) determines also the
boundary between consecutive code-words.
The compression methods discussed below (entropy coding)
are widely used, e.g., in telefax systems and file
compression routines. They are also one ingredient in most
of the current speech, audio, and video compression
systems. However, in these applications much better
compression ratio can be achieved by using lossy coding
methods that remove some of the original source information
with small/negligible effect on the perceived quality.
83050E/14

Huffman Code
__________________________________________________

In Huffman codes, variable code wordlength is utilized


together with the prefix property.

Coding algorithm (from Proakis & Salehi) and example (from
Sklar's book):
The algorithm:
 1. Sort the symbols in decreasing order of probability.
 2. Merge the two least probable elements into one element whose
    probability is the sum of the two.
 3. Repeat the merging until only two elements remain.
 4. Assign 0 and 1 as the codewords of the two remaining elements.
 5. Whenever an element is the result of a merger of two elements,
    append 0 and 1 to its codeword to obtain the codewords of the
    merged elements; continue until all original symbols have codewords.

Example (input alphabet and probabilities a: 0.4, b: 0.2, c: 0.1,
d: 0.1, e: 0.1, f: 0.1):

    Input symbol   Code
    a              1 1
    b              0 0
    c              1 0 1
    d              1 0 0
    e              0 1 1
    f              0 1 0
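
The algorithm maps naturally onto a priority queue. A minimal sketch in
Python (a standard textbook construction, not from the notes) reproduces
codeword lengths consistent with the example above; the exact bit
patterns depend on tie-breaking:

    import heapq

    def huffman_code(probabilities):
        # Each heap entry: (probability, tie-breaker, {symbol: partial codeword})
        heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probabilities.items())]
        heapq.heapify(heap)
        count = len(heap)
        while len(heap) > 1:
            p1, _, code1 = heapq.heappop(heap)   # two least probable elements
            p2, _, code2 = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in code1.items()}
            merged.update({s: "1" + c for s, c in code2.items()})
            heapq.heappush(heap, (p1 + p2, count, merged))
            count += 1
        return heap[0][2]

    probs = {"a": 0.4, "b": 0.2, "c": 0.1, "d": 0.1, "e": 0.1, "f": 0.1}
    code = huffman_code(probs)
    print(code)
    print(sum(probs[s] * len(code[s]) for s in probs))   # average length: 2.4 bits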
83050E/15

Huffman Code (continued)


__________________________________________________

As an example, in symbol-by-symbol compression of English
text, about 43 % compression rate has been achieved.
However, this doesn't take into account the fact that certain
combinations of letters ("the", "in", "on", ...) appear in the text
quite often.

For the Huffman code, it can be shown that the average number
of bits to code a source symbol, L(X), satisfies

H(X) <= L(X) <= H(X) + 1

When coding n symbols at a time, we obtain correspondingly

H(X) <= L(X) <= H(X) + 1/n .

So using the Huffman code blockwise, with sufficiently long


block length, it is possible to get arbitrarily close to entropy
limit. This is not a practical way, but this development
basically constitutes one proof of the source coding theorem.

One fundamental limitation of Huffman codes is that the


source symbol statistics have to be known (or estimated).
83050E/16

Run-Length Codes
__________________________________________________

Many sources produce long sequences (runs) of the same


source symbol (e.g., think about the operation principle of
telefax).

In such cases, it is more efficient to send, instead of the


symbol sequence, one of the repeated symbols and
information about the length of the sequence.
83050E/17

Lempel-Ziv Codes
__________________________________________________

Principle
The method uses a codebook consisting of a number of
source symbol sequences.
The source symbol stream is scanned one symbol at a
time. This is continued until the beginning of the
uncoded symbol sequence is not in the codebook
anymore.
This sequence can be represented as the concatenation
of one of the words in the codebook and one additional
symbol. This new symbol sequence will be added to the
codebook.
The same process is repeated starting from the
beginning of the uncoded source symbol stream.
83050E/18

Lempel-Ziv Codes (continued)


__________________________________________________

Example (from Proakis&Salehi):


Let us assume that we want to parse and encode the following sequence:
0100001100001010000010100000110000010100001001001
Parsing the sequence by the rules explained before results in the following
phrases:
0, 1, 00, 001, 10, 000, 101, 0000, 01, 010, 00001, 100, 0001, 0100, 0010,
01001,
It is seen that all the phrases are different and each phrase is a previous
phrase concatenated with a new source output. The number of phrases is
16. This means that for each phrase we need 4 bits, plus an extra bit to
represent the new source output. The above sequence is encoded by

0000 0, 0000 1, 0001 0, 0011 1, 0010 0, 0011 0, 0101 1, 0110 0,


0001 1, 1001 0, 1000 1, 0101 0, 0110 1, 1010 0, 0100 0, 1110 1,

Dictionary   Dictionary
Location     Contents   Codeword
 1  0001     0          0000 0
 2  0010     1          0000 1
 3  0011     00         0001 0
 4  0100     001        0011 1
 5  0101     10         0010 0
 6  0110     000        0011 0
 7  0111     101        0101 1
 8  1000     0000       0110 0
 9  1001     01         0001 1
10  1010     010        1001 0
11  1011     00001      1000 1
12  1100     100        0101 0
13  1101     0001       0110 1
14  1110     0100       1010 0
15  1111     0010       0100 0
16  ----     01001      1110 1
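
A compact sketch of the parsing step (LZ78-style, assuming Python) that
reproduces the phrase list of this example; the formatting of the
codewords into fixed 4+1-bit fields is left out for brevity:

    def lz78_parse(bits):
        # Dictionary maps a phrase to its location; location 0 is the empty phrase.
        dictionary = {"": 0}
        phrases, current = [], ""
        for b in bits:
            current += b
            if current not in dictionary:
                dictionary[current] = len(dictionary)     # new phrase gets the next location
                prefix, new_bit = current[:-1], current[-1]
                phrases.append((dictionary[prefix], new_bit))
                current = ""
        return phrases

    seq = "0100001100001010000010100000110000010100001001001"
    for location, bit in lz78_parse(seq):
        print(location, bit)     # e.g. (0,'0'), (0,'1'), (1,'0'), (3,'1'), ...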
83050E/19

Lempel-Ziv Codes (continued)


__________________________________________________

In this example, one can hardly talk about compression. The


efficiency of the method is realized only when coding
considerably longer source symbol sequences.

When compressing English text, about 55 % compression


rate has been achieved.

One key parameter of the method is the size of the


codebook, which could be, for example, 4096,
corresponding to 12-bit sequences.

At some point, the codebook becomes full, and the oldest


(according to a proper criterion) code-word is removed from
the codebook to make room for new ones.

Lempel-Ziv Codes are commonly used, e.g., in file packing


routines, like Zip.
83050E/20

The Capacity of a Discrete-Time Channel


(1) Discrete-Valued Input and Output
_____________________________________________________

In the following, we consider the channel capacity, starting


from the case of discrete-time channel with discrete-valued
input and output, and moving stepwise towards the case of
continuous-time channel with continuous-valued input and
output.

The input is represented by the random process {X_k}.
The output is represented by the random process {Y_k}.
The channel is assumed to be memoryless, i.e., Y_k depends
only on X_k but not on any other input symbol (neither earlier
nor later ones).
Such a channel is completely determined by the conditional
probabilities
p_{Y|X}(y|x),  x in X,  y in Y

Example: Binary symmetric channel (BSC)

X = Y = {0, 1}. The conditional probabilities between input and
output (transition probabilities) are shown by a graph:
p_{Y|X}(0|0) = p_{Y|X}(1|1) = 1 - p (correct reception) and
p_{Y|X}(1|0) = p_{Y|X}(0|1) = p, where p is the bit-error probability.

This is the simplest possible channel. Yet it is quite useful as a


model of many practical channels, and it is used often in the
continuation.
83050E/21

The Capacity of a Discrete-Time Channel


(1) Discrete-Valued Input and Output (continued)
_____________________________________________________

If the input symbols are independent, the information per


symbol at the input is H ( X ) .
The question is: How much of this information gets to the
destination through the channel? The answer will be seen
after a few intermediate steps.
(1) Uncertainty about X when the output symbol is Y = y:

H(X|y) = E[ -log2 p_{X|Y}(X|y) ] = -sum_{x in X} p_{X|Y}(x|y) log2 p_{X|Y}(x|y)

(2) Conditional entropy: Average uncertainty about X when
the output has been observed:

H(X|Y) = sum_{y in Y} H(X|y) p_Y(y)
       = -sum_{y in Y} sum_{x in X} p_Y(y) p_{X|Y}(x|y) log2 p_{X|Y}(x|y)

(3) Average mutual information: Information about X
obtained by observing Y:

I(X,Y) = H(X) - H(X|Y)
       = H(Y) - H(Y|X)

(The proof of the latter form is an exercise problem.)
The mutual information, i.e., the information transmitted over
the channel depends on the input probability distribution
(that depends on the source coder) and on the transition
probabilities (that depend on the channel).
83050E/22

The Capacity of a Discrete-Time Channel


(1) Discrete-Valued Input and Output (continued)
_____________________________________________________
(4) It is sensible to choose the input probability distribution in
such a way that the mutual information is maximized. The
channel capacity is defined as the maximum average mutual
information over the input (X) probability distribution:
C_s = max over p_X(x) of I(X, Y)

The unit here is bits/symbol.

(5) The channel capacity can also be expressed in units of


bits/second (bps) as follows:

C = sCs

where s is the symbol rate (symbols/second).


83050E/23

Example of Discrete-Time Channel Capacity


_____________________________________________________

Binary symmetric channel, where the input symbol
probabilities are q and 1 - q.

The mutual information is

I(X,Y) = H(Y) + p log2(p) + (1-p) log2(1-p) .

This is maximized by choosing q = 0.5, which gives H(Y) = 1.
The resulting channel capacity per symbol is:

C_s = 1 + p log2(p) + (1-p) log2(1-p)

p = 1/2        =>  C_s = 0  (input and output are independent of each other)
p = 0 or p = 1 =>  C_s = 1  (error-free binary channel)
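
A small numerical check of this formula (assuming Python):

    import math

    def bsc_capacity(p):
        # C_s = 1 + p*log2(p) + (1-p)*log2(1-p), bits per symbol
        if p in (0.0, 1.0):
            return 1.0
        return 1.0 + p * math.log2(p) + (1 - p) * math.log2(1 - p)

    for p in (0.0, 0.01, 0.1, 0.5):
        print(p, round(bsc_capacity(p), 3))   # 1.0, 0.919, 0.531, 0.0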
83050E/24

Channel Capacity Theorem


_____________________________________________________

Let us consider a case with

Source rate R = rH ( X ) bps

Channel capacity C = sCs

R<C

Then there is a combination of source coding and


channel coding providing error-free/distortion-free
transmission.

If the source signal is a bit-stream and if the source bit-rate


is lower than the channel capacity, then the channel coding
can be designed to provide arbitrarily low bit error rate
(BER).

In practice, very low error rates may not be possible in case


of noisy channels because the processing delay and
implementation complexity might become very high.
83050E/25

The Capacity of a Discrete-Time Channel


(2) Discrete-Valued Input, Continuous-Valued Output
_____________________________________________________
Example: A channel with additive noise.
Input: Discrete-valued random process X
Output: Y = X + N where N is a continuous-valued random
process modeling the noise. (Notice that here the channel
attenuation has been scaled away.)
The noise is often assumed to be Gaussian.
The entropy of a continuous-valued random variable Y is defined as:

H(Y) = E[ -log2 f_Y(y) ] = -∫_Y f_Y(y) log2 f_Y(y) dy
The following important property is used in the continuation (proof
as an exercise):

If the variance σ² of a random variable N is known and
bounded, there is an upper bound for the entropy:

H(N) <= (1/2) log2( 2πe σ² )

The upper bound is achieved if and only if N is Gaussian
distributed.

In this case the conditional entropy is calculated as:

H(Y|X) = -sum_{x in X} p_X(x) ∫_Y f_{Y|X}(y|x) log2 f_{Y|X}(y|x) dy
Mutual information and channel capacity are defined according to
the earlier models based on the definitions of entropy and
conditional entropy. The channel capacity depends naturally on the
channel alphabet X and the noise level. Examples later.
83050E/26

The Capacity of a Discrete-Time Channel


(3) Continuous-Valued Input and Output
_____________________________________________________

We consider here the case where the channel alphabet is


not limited in any way, i.e., the channel input is assumed to
be continuous-valued.

This is the most general case when considering the channel


capacity (when carrying out the maximization with respect to
the probability distribution of X). Discrete channel alphabet is
a special case of this.

The conditional entropy is now defined as:

H(Y|X) = -∫_X ∫_Y f_X(x) f_{Y|X}(y|x) log2 f_{Y|X}(y|x) dy dx

In other respects, the derivations are similar to the earlier


ones.
83050E/27

The Capacity of a Discrete-Time Channel


(3) Continuous-Valued Input and Output; Example
_____________________________________________________

Consider a channel with additive Gaussian noise:

Y = X + N

where N is a zero-mean Gaussian noise process that is
statistically independent of X and has variance σ². The variance
of X is σ_X².

The mutual information is

I(X,Y) = H(Y) - H(Y|X) = H(Y) - H(N)

Based on earlier results it is easy to see that

H(N)  =  (1/2) log2( 2πe σ² )
H(Y) <=  (1/2) log2( 2πe (σ_X² + σ²) )

Equality applies here when X is Gaussian, in which case
also Y is Gaussian (as the sum of two Gaussian processes).
Clearly, the mutual information is maximised with this
choice. Thus, the channel capacity becomes:

C_s = (1/2) log2( 2πe (σ_X² + σ²) ) - (1/2) log2( 2πe σ² ) = (1/2) log2( 1 + σ_X²/σ² )

83050E/28

Discrete Channel Alphabet = Constellation


_____________________________________________________

In the continuation we commonly use both real- and
complex-valued discrete channel alphabets, referred to as
constellations. Here are some examples (figure):
 - real constellations: 2-AM, 4-AM, 8-AM, 16-AM
 - complex constellations: 4-PSK, 8-PSK, 16-QAM, 64-QAM, 8-AMPM, 32-AMPM
83050E/29

Capacity of AWGN Channel and


Discrete Constellations
_____________________________________________________

The following plots show the maximum information


transmission rates for some common real and complex
constellations as functions of the signal-to-noise ratio (SNR)
in case of an AWGN (Additive White Gaussian Noise)
channel. Notice that these are not the channel capacities in
the strict sense, because the mutual information is not
maximized with respect to the input distribution. Instead, it is
assumed that all input symbols are equally probable, which
is the common starting point in digital transmission systems.

The figures show also the true channel capacity for the case
of continuous-valued input and output.

We can see that the continuous-valued channel capacity


gives an upper bound for information transmission rates of
the discrete-valued cases. By increasing the size of the
discrete-valued constellation, it is possible to get arbitrarily
close to the channel capacity. In other words, there is no
essential loss in channel capacity when using discrete
channel alphabets, if the size of the constellation is
sufficiently large.
83050E/30

Capacity of AWGN Channel and


Discrete Constellations (continued)
_____________________________________________________

The uppermost plot shows the continuous-valued channel
capacity. Reliable information transmission is possible
below this curve.
Real Constellations (from Lee&Messerschmitt):

Complex Constellations:
83050E/31

The Capacity of a Continuous-Time Channel


_____________________________________________________

Most physical channels are continuous-time channels by


nature.

On the other hand, looking at the digital transmission chain,
the part of the channel that is essential from the channel
capacity point of view has a discrete-time input signal, in
addition to having a discrete-valued channel alphabet.
Now the question arises: how much do we lose in channel
capacity in discretizing the channel in this way?
Let us consider one important (but idealized) example:
Tightly band-limited baseband channel. It has the transfer
function

1 for f W
B( f ) =
0 for f > W -W W
f

Here W is the bandwidth in units of Hz.

The channel output signal is

Y(t) = [ X(t) + N(t) ] * b(t) = X~(t) + N~(t)

where X(t)  is the input signal
      N(t)  is white Gaussian noise
      b(t)  is the channel impulse response
      X~(t) is the filtered, bandlimited input signal
      N~(t) is the noise signal filtered to bandwidth W.
83050E/32

The Capacity of a Continuous-Time Channel (continued)


_____________________________________________________
According to the sampling theorem, Y(t), X~(t), and N~(t) can
be represented completely in the discrete-time domain by using
T = 1/(2W) -spaced samples. The samples define the discrete-
time random processes N~(kT), X~(kT), and Y(kT), which
define the discrete-time channel capacity. This must be the
same as the continuous-time channel capacity.

With the earlier assumptions, we get the capacity of the
continuous-time channel as follows:

C = W log2( 1 + σ_X²/σ² )

This is known as the Shannon-Hartley law.

This example demonstrates that it is possible to convert a
continuous-time channel into a discrete-time channel without
losing anything in the capacity.

Earlier we have already seen that it is possible to use
discrete-valued channel alphabets without losing significantly in the capacity.

It is now evident that digital transmission techniques utilizing


discrete-time channel inputs and outputs and discrete-
valued channel alphabet are able to fully utilize the capacity
of any continuous-time continuous-valued channel.
83050E/33

The Capacity of a Voice-Band Telephone


Channel
_____________________________________________________

Assuming that the bandwidth is W = 3.3 kHz and SNR = 40 dB,
the capacity of a telephone channel can be calculated as

C = W log2(1 + SNR) = 3300 · log2(1 + 10000)
  = 43.9 kbps

With different values of the SNR, the channel capacity


behaves as follows:

SNR/dB C/kbps
20 22.0
30 32.9
40 43.9
50 54.8
60 65.8
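
The table can be reproduced with a few lines of Python:

    import math

    W = 3300.0                         # bandwidth in Hz
    for snr_db in (20, 30, 40, 50, 60):
        snr = 10 ** (snr_db / 10)      # SNR as a power ratio
        C = W * math.log2(1 + snr)     # Shannon-Hartley law, bits per second
        print(snr_db, round(C / 1000, 1), "kbps")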

The data rate of 56 kbps is commonly achieved with voice-
band modems with good telephone connections. This is
actually quite close to the channel capacity, since the SNR
cannot be assumed to be more than 60 dB even in the best
connections.
83050E/34

The Capacity of Frequency-Selective and


Fading Channels
_____________________________________________________

We have seen how to calculate the capacity of an AWGN


channel. This basic result can be applied also to various
other cases by splitting the channel in time and/or frequency
domain to smaller parts that have stationary AWGN
characteristics.
In case of stationary frequency-selective channel, the
frequency band can be divided into smaller (non-
overlapping) parts and the overall capacity is the sum of the
capacities of the sub-channels. If the sub-channel bandwidth
is considerably smaller than the coherence bandwidth, the
sub-channels have AWGN characteristics, and the
Shannon-Hartley law can be used for determining the
capacities of the sub-channels.
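
As a sketch of this idea (assuming Python and made-up sub-channel SNR
values), the overall capacity of a frequency-selective channel is the
sum of the Shannon-Hartley capacities of its narrow sub-channels:

    import math

    def capacity_frequency_selective(subband_width_hz, snrs):
        # Sum of Shannon-Hartley capacities of narrow AWGN sub-channels
        return sum(subband_width_hz * math.log2(1 + snr) for snr in snrs)

    # Hypothetical example: 64 sub-channels of 4.3125 kHz with decreasing SNR
    snrs = [10 ** (sdb / 10) for sdb in range(40, 40 - 64, -1)]   # 40 dB down to -23 dB
    print(capacity_frequency_selective(4312.5, snrs) / 1e6, "Mbps")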
In multicarrier systems (OFDM, DMT) this idea is also used
as a modulation technique to overcome the channel
equalization problem in heavily frequency selective
channels. Furthermore, in point-to-point links with sufficiently
fast feedback, the constellations and transmission powers
used in each sub-channel can be optimized based on the
measured sub-channel SNRs.
In case of temporal variations, i.e., a fading channel, the
instantaneous capacity can be calculated for the channel (or
the sub-channels). Within the coherence time, the
instantaneous channel capacity is practically constant, and
the capacity over a longer time interval can be obtained by
properly averaging the instantaneous capacities.
83050E/35

The Capacity in Case of Interferences


_____________________________________________________

In many applications, various interference sources limit the


channel capacity instead of the thermal noise. The
interference sources include other base-station/mobile
station signals at the same frequency (in case of a cellular
network), multi-access interference of the same cell in case
of CDMA based cellular systems, narrowband RF
interference in case of xDSL systems, spurious frequencies
generated due to non-idealities of transmitter/receiver
hardware, etc.

In most practical cases, the desired signal, thermal noise,


and the different interference sources are statistically
independent of each other. In the capacity calculations, the
sum of thermal noise power and the interference powers
should be used instead of the noise power. The capacity of
each frequency increment can be calculated from the signal-
to-noise+interference value and the overall capacity is
obtained by summing/integrating over the increments.
83050E/36

TRANSMISSION CHANNELS

In this part we have a very brief look at different types of


physical transmission channels.

In the following (and in communication theoretic literature in


general), the term channel is assumed to include
- the physical transmission medium
- antennas/other transducers used to connect to the
physical medium
- high-frequency/radio frequency (RF) parts of the
transmitter and receiver.

In other words, the channel models to be discussed include


here all parts of the transmission chain between the
modulator and demodulator. Rather simple channel models
are used (and are in most cases sufficient) when
considering, e.g, the modulation and detection methods.

Source: Lee&Messerschmitt, Chapter 5.


83050E/37

Transmission Media

The commonly used transmission channels in digital transmission include:


(Shielded/unshielded) twisted pair cable (STP/UTP), used as access
cables in the telephone network and LANs.
- Plain Old Telephone Service (POTS): 3 kHz/up to 56 kbps (as
composite channel, including also PCM-links, exchange
equipment, etc.)
- ISDN: 144 kbps
- ADSL: up to ~8 Mbps (in downlink) at several km distances
- VDSL: up to ~50 Mbps at few hundred meter distances
- LANs: up to ~1 Gbps (Gigabit Ethernet)
Coaxial cables with up to ~1 GHz bandwidth
- Cable TV network (CATV): hundreds of analog/digital TV
channels, with up to 40 Mbps data rate in each 8 MHz BW.
- Cable modems: up to ~1 Mbps data rate also in uplink.
- Earlier in LANs (Ethernet).
Optical cables
- Core of the broadband fixed/mobile telecommunication networks,
CATV networks, and LANs. Terabps data rates possible in a
single fiber.
- Possibly coming also to the access network (Fibre to the Home,
FTTH)
Radio Waves
- Satellite communications.
- Digital audio and TV broadcasting systems (DAB, DVB-T)
- Microwave links.
- Fixed access technologies.
- Mobile communication systems (GSM, 3G/WCDMA, ...) with
increasing data rates (e.g., 128 kbps everywhere, 2 Mbps in
urban areas/indoors)
- WLANs: up to ~54 Mbps mostly indoors
- Low-power RF (e.g., Bluetooth): ~1 Mbps at few meter distances.
Magnetic & optical storage
83050E/38

Twisted Pair Cable Transmission

The twisted pair channel can be modeled as a linear system with


more or less exponentially decaying impulse response (dispersive
channel).
The attenuation increases heavily with frequency, limiting the
useful bandwidth to ~30 MHz in VDSL environment and ~1
MHz in ADSL environment.
Also termination impedances affect the frequency response
due to reflections.
Branches (taps, e.g., having two telephone lines to different
rooms of a household) may cause echoes, resulting in notches
in the frequency response (like in multipath channels).
There are also various interference sources:
Leakage of signals (Ingress, RFI=radio frequency interference)
from radio communication systems
Impulsive and other types of noise signals from various
electrical appliances.
Crosstalk from other signals transmitted in other lines of the
cable bundle. The near-end crosstalk (NEXT) comes from a
transmitter at the same end of the cable as the receiver under
consideration. For example, in VDSL systems, frequency
domain duplexing (FDD) is used to avoid NEXT problems (time
domain duplexing, TDD, could be used as well). Far-end
crosstalk (FEXT) comes from a transmitter at the other end of
the cable in a different pair, and always has to be tolerated.
(Figure: NEXT coupling between a transmitter and a receiver at the same
cable end, and FEXT coupling from a transmitter at the far end of the cable.)
83050E/39

Coaxial Cable Transmission

The CATV network has a much wider bandwidth, up to ~1 GHz.
Reflections (echoes) from branching points or imperfect
terminations are the main imperfection in the linear system
model.

RF ingress may have some significance, but in many


respects, coaxial cables are rather ideal transmission media.
83050E/40

Radio Channel

The main impairments of the radio channel are due to


multipath propagation, where the signal is received through
a number of propagation paths with different delays.

Multipath effects may be caused by inhomogeneity of the
atmosphere, etc., but the main reasons are reflections from
natural (hills, cliffs) and man-made (buildings) obstacles.

Sufficiently long delay differences are observed as echoes in


the received signal.

In case of a two-path channel, the received signal can be
written as

y(t) = x(t) + A·x(t - τ)

where τ is the delay difference of the two rays and A is the
(complex) gain of the delayed path. In the model, it can usually
be assumed that the delay of the line-of-sight (LOS) path is
zero and its gain is unity.

(Figure: a direct path and a reflected path between transmitter and receiver.)
83050E/41

Radio Channel: Frequency Selectivity

The channel frequency response is in this case

H_C(f) = 1 + A e^{-j2πfτ}

The corresponding amplitude response is in the range
(1 - |A|) ... (1 + |A|). If the value of |A| is close to unity, the frequency
response has deep notches, periodically at the angular frequencies
π/τ, 3π/τ, 5π/τ, ...
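
A short sketch (assuming Python/NumPy, with made-up values A = 0.9 and
τ = 1 µs) that evaluates this magnitude response and shows the periodic
notches:

    import numpy as np

    A, tau = 0.9, 1e-6                  # reflected-path gain and delay difference (assumed values)
    f = np.linspace(0, 2e6, 9)          # frequencies from 0 to 2 MHz
    H = 1 + A * np.exp(-1j * 2 * np.pi * f * tau)
    print(np.abs(H).round(2))           # notches (|H| = 0.1) at f = 0.5 MHz, 1.5 MHz, ...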

Channel delay spread is a measure of the length of the impulse


response, i.e., the delay difference of the shortest and longest
significant paths. The inverse of the delay spread is the coherence
bandwidth, which is a measure of frequency domain variability of
the channel.
Depending on the channel bandwidth in relation to the channel
coherence bandwidth, the channel model may be frequency non-
selective, mildly frequency selective, or heavily frequency selective.
83050E/42

Radio Channel: Fading

Another aspect of the radio channel is fading, i.e., the


variation of the channel gain (or actually, the frequency
response) with time. This may also be due to atmospheric
effects, but the main reason for strong fading is the
movement of the transmitter or receiver, causing the multipath
channel model to change.

It is important to note that a movement of half wavelength


(=15 cm at 1 GHz carrier frequency) may cause the
propagation conditions to change greatly.

The following figure shows a typical behaviour of a mobile


channel gain and phase with time. It has been observed
experimentally, and also derived theoretically that the
probability distribution of the magnitude of the channel gain
follows Rayleigh distribution.
83050E/43

Radio Channel: Fading (Continued)

The relative motion of the transmitter and receiver with respect
to each other causes a Doppler shift. The Doppler shift
reaches its highest value when the movement is straight
towards each other, and its most negative value when
the movement is straight away from each other. In a mobile
channel, there are, with some probability, rays from both
directions, and also from all the intermediate directions/Doppler
shift values. A typical Doppler spectrum is shown in the
following figure. This would be the received signal spectrum
if the transmitted signal were an unmodulated carrier.
(Figure: the Doppler spectrum S_R(f) of the received carrier, extending
from f_c - v/λ to f_c + v/λ.)

The following figure shows a model for a two-path fading
channel, which includes the delays and complex gains of the
two paths, as well as the modulation by the Doppler
spectrum.
(Figure: two-path fading channel model: the input u(t) feeds two branches
with delays τ1 and τ2 and time-varying complex gains A1·r1(t) and A2·r2(t),
which are summed to form the output.)
83050E/44

Radio Channel: Example of Fading

Example

If the vehicle velocity is 100 km/h and the RF carrier
frequency is 1 GHz, the wavelength is

λ = c/f = (3·10^8 m/s) / (10^9 Hz) = 0.3 m ,

the velocity is

v = 27.8 m/s

and the maximum Doppler shift is

f_D = v/λ = (27.8 m/s) / (0.3 m) = 92.6 Hz .
The bandwidth of the received carrier is then about 185 Hz,
since the Doppler shift can be in either direction,
depending on whether the reflected beam arrives from the front
or the back.

The time it takes for the vehicle to travel half a wavelength is

t = (0.15 m) / (27.8 m/s) = 5.4 ms .

Thus we expect a significant fade about every 5.4 ms, i.e., at
a rate of 185 Hz, which is equal to the bandwidth of the
Doppler shift.
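
The numbers above are easy to check programmatically (a minimal sketch
in Python):

    c  = 3e8            # speed of light, m/s
    fc = 1e9            # carrier frequency, Hz
    v  = 100 / 3.6      # vehicle speed, 100 km/h in m/s

    wavelength = c / fc                 # 0.3 m
    f_doppler  = v / wavelength         # about 92.6 Hz maximum Doppler shift
    fade_every = (wavelength / 2) / v   # about 5.4 ms between fades
    print(wavelength, round(f_doppler, 1), round(fade_every * 1000, 1))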
83050E/45

Power and Bandwidth Limitations

The main constraints in digital transmission system design


are imposed on the signal bandwidth and transmission
power.

Bandwidth is limited by
Regulation (especially in case of radio communications)
Bandwidth of the medium.

Transmission power is limited by


Regulations
Keeping the interferences to other users of the same
medium at a reasonable level.
Keeping the power consumption at a reasonable level
(especially in handheld equipment and in satellite
communications.)
83050E/46

BASEBAND DIGITAL TRANSMISSION

Bits and symbols


The idea of digital transmission is to transmit bit sequences
or more generally multilevel symbol sequences using PAM-
modulation (pulse amplitude modulation).
Multilevel symbols are obtained when, e.g., 4-bit blocks are
combined into symbols. In this case, the number of bit
combinations is 2^4 = 16. In general, B bits can be represented
as 2^B different symbols, which in baseband transmission are
usually coded as 2^B equally-spaced signal levels.
In practice, the number of levels or bit combinations
depends on the application and channel requirements. The
main requirement is that the levels can be reliably separated
from each other after a noisy channel.
Combining several bits into one symbol, the symbol rate
(baud rate) is reduced. This affects the transmitted signal
bandwidth. Example:
Bit sequence:    0 1 0 0 1 1 0 0 ...   (R_b bits/s)
Symbol sequence: -3A  9A ...           (R_b/4 symbols/s)

(Figure: the binary signal and the corresponding 16-level signal.)

T is the symbol period or symbol interval.
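
A minimal sketch of grouping B bits into one of 2^B equally spaced levels
(assuming Python and a plain natural binary mapping; practical systems
often use other mappings, e.g. Gray mapping, so the resulting levels need
not match the example values above):

    def bits_to_levels(bits, B):
        # Map consecutive B-bit blocks to 2**B equally spaced levels ..., -3, -1, +1, +3, ...
        levels = []
        for i in range(0, len(bits), B):
            value = int("".join(str(b) for b in bits[i:i + B]), 2)   # natural binary mapping
            levels.append(2 * value - (2 ** B - 1))
        return levels

    print(bits_to_levels([0, 1, 0, 0, 1, 1, 0, 0], B=4))   # two 16-level symbols: [-7, 9]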


83050E/47

Pulse Waveforms

Digital PAM-signal is transmitted to the continuous-time


channel as the following waveform

x(t) = sum_k a_k p(t - kT)

Here p(t) is a basic pulse shape whose amplitude is scaled


by the transmitted symbols ak .

It is important that adjacent pulses do not interfere with each


other in the reception.

It is assumed that the received continuous-time waveform is
observed at time instants kT for determining the
transmitted symbol values. The ideal condition is:

p(t) = 1 when t = 0
       0 when t = ±T, ±2T, ...
This is possible to implement in two different ways:

(1) Using short pulses not overlapping in time domain


The bandwidth is not the smallest possible, but easy
to implement.

(2) Using pulses overlapping in time domain, but forcing the


condition to be satisfied anyway
Signal bandwidth can be minimized, more
complicated.
83050E/48

Spectrum of Baseband PAM-Signal

The spectrum of the transmitted digital baseband signal can


be obtained from the Fourier Transform of the equation of
the previous slide,

G_x(f) = (1/T) |P(f)|² G_a(e^{j2πfT})

Here P(f) is the Fourier transform of the pulse shape and
G_a(e^{j2πfT}) is the power spectral density function of the
transmitted discrete-time symbol sequence.
Here, we combine both continuous-time and discrete-time
signals. Therefore, it is not trivial to prove the previous
result, but it is intuitively quite clear.
The transmitted signal spectrum should be matched with the
channel properties.
In baseband systems, e.g., in cables, the attenuation is not
constant within the used frequency band. It increases in high
frequencies. So, the signal power should be concentrated on
low frequencies, where the cable attenuation is smallest.
This reduces also crosstalk and radio frequency
interferences.
Naturally, in AC-coupled systems, there should be no DC-
component in the transmitted signal. (Or at least the removal
of the possible DC-component should not distort the
waveform.)
83050E/49

Line Coding vs. Nyquist Pulse Shaping

Two different approaches to shape spectrum:


(1) Line coding
The basic pulse shape is a square pulse.
The spectrum is a sinc-type wide spectrum.
DC-component can be removed by constructing the signal
properly.
In general, the symbol sequence is generated to have
some correlations between consecutive symbols in order
to shape the transmitted spectrum.
Used mostly for binary source signal.

(2) Nyquist-pulse shaping


It is assumed that transmitted symbols are uncorrelated.
The transmitted spectrum has the shape of the
Fourier transform of the used pulse shape.
The pulse shape is optimised so that the needed
bandwidth is small.
Adjacent pulses are overlapped in time domain.
The methods can also be combined, but in the following
discussions, the focus is usually on one of these
approaches.
Note on terminology: In some areas, like xDSL modem literature,
the term line coding is used in a wider sense to cover all signal
processing techniques related to the modem.
83050E/50

Goals of Line Coding

Spectrum management and shaping: Keeping the


spectrum reasonably narrow.

To remove the drifting of the DC-component, baseline


wander, in AC-coupled systems.

To avoid synchronization problems when the transmitted
symbol sequence contains long sequences with
constant level (e.g., 000000... or 111111...).

System monitoring during the normal operation is


possible by using suitable line codes: If sequences not
belonging to the used code are received repeatedly, it
can be determined that something is wrong in the
transmission link.

Example: Effects of lowpass filtering (RC-filter with


exponential step response) and AC-coupling (baseline
wander) in case of a binary signal:
83050E/51

Redundancy

Using an L-element channel symbol set (= alphabet) and symbol
rate f_b, the channel capacity (assuming no errors) is

R = f_b log2(L) bps.
Let the source/user bit rate be B. Then if B=R, there is no
redundancy in the code and the information bits must be mapped
deterministically to the channel symbols.

Usually in line coding, B<R and the difference of these values


corresponds to the redundancy of the code. When there is some
redundancy, it is possible to create correlations to the symbol
sequence and the power spectrum of the signal can be shaped.
83050E/52

Running Digital Sum

Let Ak denote the symbol sequence using the line code


under consideration. Then the transmitted signal can be
expressed as

x(t) = sum_{k=-∞}^{∞} A_k g(t - kT)

where g(t) is the unit square pulse.

The baseline wander effects depend heavily on a certain


characteristic of the code, the running digital sum, RDS. It is
defined as

S_k = sum_{m=-∞}^{k} A_m

In the following, the maximum (absolute) RDS value of the


code is of great interest.
The RDS value of a good code is expected to be bounded to
a small value.
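
The running digital sum of a coded symbol sequence is straightforward to
compute; a small sketch (assuming Python, with the symbols represented as
+1, 0, -1):

    from itertools import accumulate

    def running_digital_sum(symbols):
        # Cumulative sums S_k of the transmitted symbol values
        return list(accumulate(symbols))

    symbols_example = [+1, -1, +1, -1, +1, -1, 0, 0, 0, 0]   # an example three-level sequence
    rds = running_digital_sum(symbols_example)
    print(rds, max(abs(s) for s in rds))   # RDS stays within 0...1 for this sequence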
Example (figure): baseline wander effects with a code with a small
maximum RDS and with a large RDS. Upper traces with DC
coupling, lower ones with RC high-pass filtering.
83050E/53

Classification of Line Coding Methods

There are many (at least tens of) different line coding
methods, often based on ad-hoc principles.
In the following, we consider mostly the case of binary data.
Codes can be classified, e.g., by the used signal levels, as
follows:
unipolar: +a, 0
polar (antipodal): +a, -a
bipolar (pseudoternary): +a, 0, -a.

Examples of line codes


83050E/54

Binary Antipodal Codes

During one symbol interval, one of the following pulse waveforms


or its inverse is transmitted, depending on whether the transmitted
bit is 1 or 0. (Notice that there are two alternative symbol
mappings.)
(Figure: RZ, NRZ, and biphase (Manchester) pulse shapes on the
interval -T/2 ... T/2.)

When using RZ (=return-to-zero) or NRZ (=non-return-to-


zero) waveforms in an AC-coupled channel, the average
number of positive and negative pulses should be made to
be equal, by some means.
In Biphase codes (=Manchester codes), the DC-level of the
basic pulse is zero, and consequently, the DC-level of the
transmitted symbol sequence is also zero.
o There is a zero crossing in the middle of the symbol time
interval that makes the synchronization easier
o The required bandwidth is higher, about twice that of the
NRZ line code
- As a theoretical model, biphase signalling can be
constructed by using NRZ-pulse and (a) using double
symbol rate and sending a complement bit after each bit
or (b) modulating by using square wave, which has
double frequency. Both of these ideas show ways to get
the spectrum of the signal from the NRZ spectrum.
o This code is simple, and good enough if the bandwidth of
the system can be tolerated to be somewhat higher than
the minimum.
83050E/55

AMI-Line Code

Alternate Mark Inversion or AMI line code is an example of


bipolar line code. Its coding rule is

0 => 0
1 => +/- alternatingly

Example: Incoming    0 1 1 1 1 1 1 0 0 0 0
         AMI-coded   0 + - + - + - 0 0 0 0

A long 1-bit sequence is seen as a square wave

AMI-code removes the long sequences of 1-bits, but it has


the following problem: the coder may produce a long
sequence of 0-bits which makes synchronization more
difficult. This is due to the linearity of AMI-code. In more
advanced codes, this problem can be avoided.
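
A minimal AMI encoder sketch (assuming Python; 0 maps to 0, and each 1
maps to +1/-1 alternately):

    def ami_encode(bits):
        symbols, last_mark = [], -1
        for b in bits:
            if b == 0:
                symbols.append(0)
            else:
                last_mark = -last_mark        # alternate the polarity of each mark
                symbols.append(last_mark)
        return symbols

    print(ami_encode([0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0]))
    # [0, 1, -1, 1, -1, 1, -1, 0, 0, 0, 0] -- matches the example above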
83050E/56

AMI-Line Code (continued)

AMI-decoding can be done easily using the following simple


memoryless function:

Received Decoded
+ 1
0 0
- 1

However, using a more advanced decoding method, like the


Viterbi algorithm, better system performance can be
achieved.
It is easy to see that the RDS value for an AMI code is
always 0 or 1.
The power spectrum of an AMI code sequence can be
shown to be

G_A(e^{jωT}) = 2p(1-p) · (1 - cos(ωT)) / (1 + (1-2p)² - 2(1-2p)cos(ωT))

where p is the probability of bit 1. The overall spectrum
depends on the used pulse shape, as explained on page 48.
83050E/57

Improved Line Codes Based on AMI

The starting point is AMI, which is modified. Synchronization


information is added to the long sequences of 0-bits.

Basic idea: If an AMI-coded block includes just 0-bits, it is
replaced by a three-level sequence which
- Includes one or more + and/or - symbols, i.e.,
information for synchronization.
- Is not a valid sequence in the AMI code and can thus be
recognized in the decoder and changed back to the 0-
sequence.
The HDB3 code is one example of this idea, which is widely
used in PCM systems.
Drawbacks of the idea:
- The implementation is somewhat more complicated.
- Monitoring the system performance becomes more
difficult.
- RDS grows.
In spite of these drawbacks, these methods are used
commonly.
83050E/58

HDBk Codes

We use the following notation:

0   a transmitted 0-symbol
B   a valid AMI + or - symbol
V   a + or - symbol violating the AMI principle, i.e., it
    has the same polarity as the most recent +/- symbol

HDB (High-Density Bipolar) Codes


When scanning the bit stream, a 0-sequence of length k+1 is
replaced by either of the following two sequences:
B00...0V or 00...0V.
The choice is made in such a way that there is always an
odd number of B-symbols between two consecutive V-
symbols. This means that the polarity of the V-symbols is
alternating and the RDS remains small. Notice that the
polarity of the AMI sequence may be changed after such a
replacement sequence.
For this code, -1 <= RDS <= 1.
HDB3-code is used in the PCM-transmission systems.
Example: HDB3
Input:         0 1 1 0 0 0 0 1 0 1 0 0 0 0 1
AMI:           0 + - 0 0 0 0 + 0 - 0 0 0 0 +
HDB3:          0 + - 0 0 0 - + 0 - + 0 0 + -
HDB-notation:  0 B B 0 0 0 V B 0 B B 0 0 V B
83050E/59

Block Codes

Here, we consider incoming signal blocks of k bits, which
are coded into blocks of n symbols. The alphabet size is
L. The basic requirement is:

2^k <= L^n

A natural approach would be to choose (if possible) the code


words to have zero RDS values over the whole codeword,
and try to minimize also the RDS values within the code
words.
83050E/60

kBnT Codes

The previously discussed pseudoternary codes transmit 1 bit
per symbol. In principle, 3-level codes are able to transmit
log2(3) ≈ 1.58 bits/symbol. So there is some room for
improvement.
kBnT is a rather wide class of block codes. If we choose the
smallest possible n for each k, we obtain the following table:
k n code efficiency
1 1 1B1T 63% = AMI
3 2 3B2T 95%
4 3 4B3T 84%
6 4 6B4T 95%
7 5 7B5T 89%
Better efficiency also reduces the sensitivity to noise,
because the bandwidth, and consequently noise power, can
be reduced.
On the other hand, the redundancy is reduced with
increasing efficiency, and the goals of line coding become
more difficult to achieve.
As a suitable compromise, the 4B3T code has received a lot
of attention.
In these efficient codes, a major problem is that it is not
possible to find enough code words with zero RDS. One
solution is to use a number of alternative modes (typically
2...4) for each bit combination and choose the one that
minimizes the RDS.
83050E/61

Example: Bimode 4B3T Code

Binary         Ternary Output Block    Block Digital
Input Block    Mode A       Mode B     Sum
0000 +0- +0- 0
0001 -+0 -+0 0
0010 0-+ 0-+ 0
0011 +-0 +-0 0
0100 ++0 --0 2
0101 0++ 0-- 2
0110 +0+ -0- 2
0111 +++ --- 3
1000 ++- --+ 1
1001 -++ +-- 1
1010 +-+ -+- 1
1011 +00 -00 1
1100 0+0 0-0 1
1101 00+ 00- 1
1110 0+- 0+- 0
1111 -0+ -0+ 0
Mode A is chosen if -3 <= RDS <= -1.
Mode B is chosen if  0 <= RDS <=  2.
In this code, -4 <= RDS <= 3. Improving the efficiency has thus
increased the range of RDS significantly, and increased the
sensitivity to baseline wander (ISI).
The decoding can be done easily by using a slicer and a
table to find the bit combination based on the received 3-
level block. However, better performance can be achieved
using, e.g., the Viterbi algorithm to be discussed later on.
83050E/62

Binary Block Codes

These codes are suitable for cases where the channel


alphabet is binary (such as optical transmission/storage) or
when there are other reasons to avoid more than two signal
levels in the signaling.

Zero Disparity Codes

A coded n-bit block includes n/2 1-bits and n/2 0-bits. So at
the end of each block, RDS = 0. Within a block,
-n/2 <= RDS <= n/2.

Code properties with different block lengths:

n N Log2N k Efficiency
2 2 1 1 50%
4 6 2.58 2 50%
6 20 4.32 4 67%
8 70 6.13 6 75%
10 252 7.97 7 70%
83050E/63

Variable Rate Codes

In some cases it is possible to use codes where the


transmission rate is not constant, but a variable number of
channel symbols are transmitted for each input bit. The goal
is, of course, to minimize the average symbol rate.

A very simple but useful principle is bit-stuffing, where


sequences of 0 bits exceeding certain length are broken by
adding an extra 1-bit.

Example: (0, 2) runlength limited binary code

Input: 1 0 0 0 1 1 0 0 1
Coded: 1 0 0 1 0 1 1 0 0 1 1
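
A small sketch of the (0, 2) stuffing rule above (assuming Python; a 1
is inserted after every run of two 0-bits):

    def stuff_02(bits, max_zero_run=2):
        out, run = [], 0
        for b in bits:
            out.append(b)
            run = run + 1 if b == 0 else 0
            if run == max_zero_run:        # break the run by stuffing a 1-bit
                out.append(1)
                run = 0
        return out

    print(stuff_02([1, 0, 0, 0, 1, 1, 0, 0, 1]))
    # [1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1] -- matches the coded sequence above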
83050E/64

Properties of Some Line Codes


83050E/65

Digital Transmission System Based on


Baseband Pulse Shaping
(Block diagram:
Transmitter: bits B_n -> coder -> symbols A_k -> transmit filter g(t) -> signal S(t) -> channel b(t).
Channel: adds noise N(t) and interferences.
Receiver: R(t) -> receive filter f(t) -> Q(t) -> sampler (controlled by timing recovery) -> Q_k
-> slicer / decision device -> estimated symbols -> decoder -> bits B_n.)

Channel is modeled as a linear time invariant transfer


function with additive Gaussian noise.

The coder maps the incoming bit sequence to a symbol
sequence according to the chosen constellation/alphabet.
The symbol sequence is here assumed to be uncorrelated,
i.e., the consecutive symbols are independent of each other.
Consequently, the spectrum of the symbol sequence, as a
discrete-time signal, is white. Furthermore, the symbols of
the alphabet are assumed to be used with equal probability.

Notice that the system between coder and sampler is


basically a cascade of three continuous-time filters.
83050E/66

Transmitter Blocks
The transmit filter forms a continuous-time signal from the
symbol sequence A_m. The impulse response of the filter is
g(t). In the following this is also called the transmitted
pulse shape.

The channel waveform is


S(t) = sum_{m=-∞}^{∞} A_m g(t - mT)

Here T is the symbol interval and 1/T is the symbol rate.

So, the waveform is formed from pulses scaled by symbol


values. The pulses may overlap in time domain.

Example (figure): a symbol sequence A_k (values such as +3, +2, +1, -1 at
intervals of T) is fed to the transmit filter g(t), producing the waveform
S(t) as a sum of overlapping, scaled pulses.
83050E/67

Channel

The received waveform is

R(t) = b(t) * S(t) + N(t) = ∫ b(τ) S(t - τ) dτ + N(t)

     = ∫ b(τ) sum_{m=-∞}^{∞} A_m g(t - τ - mT) dτ + N(t)

     = sum_{m=-∞}^{∞} A_m h(t - mT) + N(t)

where h(t) is the received pulse shape

h(t) = b(t) * g(t) = ∫ b(τ) g(t - τ) dτ

Example

If we consider a strictly bandlimited channel

B(f) = 1 for |f| < W
       0 for |f| >= W

then a square pulse waveform of the previous page is not


very good, because it would be (more or less) distorted in
the channel.
83050E/68

Receiver Blocks
In general, the receiver design is more critical than the
transmitter, because the channel attenuates and distorts the
signal and it is important to recover the signal as well as
possible to minimize bit error rate.

Receive filter
1. Filters out the adjacent channels and out-of-band noise &
interference.
2. Affects the pulse shape.
3. As an equalizer, compensates the linear distortion of the
channel, e.g., by using the inverse transfer function. The
transfer function of the channel is usually unknown, so
adaptive methods are important.

Synchronization (timing recovery) defines the right symbol


timings for different blocks and the correct sampling
moments. The transmitted message often includes
components which make the synchronization easier.
(However, this is not always necessary.) Here, it is assumed
that synchronization has been done in some way. The
synchronization concepts are considered in more detail at
the end of the course.

In the sampling block, samples are taken from the


continuous time signal. In the ideal case, the samples are
taken at time instants that correspond to the transmitted
symbol and when intersymbol interference is at minimum.
83050E/69

Receiver Blocks (continued)


In the decision device, an estimate of the transmitted symbol
sequence Ak is formed. In the simplest case, this is based
on decision threshold levels.

Example: Consider the alphabet {-1, 0, 1}. The decision
is made using the slicer characteristic shown in the figure:
the output is -1 for inputs below -0.5, 0 for inputs between
-0.5 and +0.5, and +1 for inputs above +0.5.
(Note: here the receiver amplification is adjusted so that the
overall gain of the link is 1.)

More advanced methods for making the decisions are


needed to achieve optimal performance in case of non-ideal
channels. The discussion of these methods, as well as
optimal choice of the receiver filter will constitute a major
part of this course.

The decoder generates a bit-sequence from the detected


symbols, which in the ideal case is a delayed version of the
transmitted bit-sequence.
83050E/70

Example: Simple and Cheap PAM System


This example illustrates a simple, non-optimized PAM
system, which works well if the needed bit rate is a lot
smaller than the capacity of the channel. However, the bit
rate should be much lower than what is possible with more
optimal pulse shaping.
A binary alphabet {-a, a} is used and a is chosen such that
transmission power satisfies the (regulatory or practical)
limitations. If the symbols are equally probable, DC is zero.
The transmit filter is a simple 1st-order RC lowpass filter.

(Figure: the bit signal, the coder output (coded signal), and the
transmit filter hardware, with the corresponding waveforms.)

In case of ideal channel, no channel noise, and wideband


receiver filter, the received signal could look like this:
83050E/71

Example: Simple and Cheap PAM System (contd)


Here the channel is modeled as a 2nd-order lowpass with 3 dB
bandwidth equal to the symbol rate. Then the received
signal looks as follows. The eye diagram is almost closed, so the
system becomes very sensitive to noise and timing errors.
83050E/72

Integrate and Dump -Principle


Integrate and Dump Principle is a widely used term for
simple PAM system with square pulse shaping: A square
pulse of length T is used as the impulse response for both
the transmit and receive filters. (In this case the transmit and
receive filters are a matched filter pair, which is optimal in
case of ideal channel, as will be discussed later).

As a convolution of two square pulses, the overall pulse


shape (assuming ideal channel) is a triangle pulse of length
2T.

In practical implementation, the receiver filter can be


implemented by integrating the received signal over the
symbol interval, keeping the resulting value as the sampled
symbol value, and resetting the integrator for new sampling
interval.
(Figure: square pulses g(t) and f(t) on [0, T], and the triangular
overall pulse p(t) = g(t) * f(t) of length 2T.)

The name is commonly used, e.g., in the spread-spectrum


context, for similar basic operation within a more advanced
system concept.
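
A discrete-time sketch of the integrate-and-dump receiver (assuming
Python/NumPy, an oversampled square-pulse PAM signal, and an ideal
noiseless channel):

    import numpy as np

    def integrate_and_dump(received, samples_per_symbol):
        # Integrate over each symbol interval, dump the result, reset the integrator
        n_symbols = len(received) // samples_per_symbol
        blocks = received[:n_symbols * samples_per_symbol].reshape(n_symbols, samples_per_symbol)
        return blocks.sum(axis=1) / samples_per_symbol

    symbols = np.array([+1, -1, -1, +1])
    sps = 8                                           # samples per symbol
    tx = np.repeat(symbols, sps)                      # square-pulse PAM waveform
    print(integrate_and_dump(tx, sps))                # recovers [ 1. -1. -1.  1.]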
83050E/73

About Pulse Shapes in Digital Transmission


Earlier, we have seen that the power spectrum of the
transmitted signal is determined directly by the Fourier
transform of the pulse waveform used (it is proportional to the
squared magnitude of the pulse spectrum when the symbols are
uncorrelated).

Usually, the channel is bandlimited:

   B(f) = 1,       |f| < W
   B(f) = 0,       |f| >= W

We could use a pulse whose spectrum is rectangular:

   G(f) = 1/(2W),  |f| < W
   G(f) = 0,       |f| >= W

The corresponding pulse shape would be

   g(t) = sin(2*pi*W*t) / (2*pi*W*t) = sinc(2Wt)

It has zero crossings at each multiple of T = 1/(2W), except at
t = 0. The pulses of adjacent symbols overlap, but, as shown on
the next page, they do not interfere with each other at the
sampling instants.

(Figure: the sinc pulse as a function of time in symbol intervals.)

This is not a practical solution, as will be discussed later,
but it illustrates the principle.
83050E/74

Intersymbol Interference
Let's consider two adjacent symbols, a0 = 1 and a1 = 2.
Corresponding pulses and their effects on the overall
waveform are shown in the following figure.
(Figure: the two scaled pulses and the resulting overall
waveform, time axis in symbol intervals.)

If we consider the ideal bandlimited channel presented earlier,
this is also the received pulse waveform.
In the receiver, samples are taken at the time instants t = mT.
Now the adjacent symbols do not interfere with each other, and
in a noise-free situation the original symbol values can be
recovered.
We say that the intersymbol interference (ISI) is zero.
Normally, this is the goal.
It should be noted that this is only possible when the
synchronization is done perfectly. Even a slight timing error
could cause severe ISI.
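A small numerical check of this (a sketch; the symbol values a0 = 1 and a1 = 2 follow the example above, and NumPy's normalized sinc is used with T = 1):

import numpy as np

T = 1.0
a0, a1 = 1.0, 2.0

def p(t):
    # Ideal bandlimited pulse with zero crossings at multiples of T
    return np.sinc(t / T)

# Overall waveform of the two pulses, sampled at t = mT
m = np.arange(-3, 4)
samples = a0 * p(m * T) + a1 * p(m * T - T)
print(samples)        # [0 0 0 1 2 0 0]: no intersymbol interference

# A small timing error shifts the sampling instants and creates ISI
eps = 0.1 * T
print(a0 * p(m * T + eps) + a1 * p(m * T + eps - T))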
83050E/75

About the Sinc Pulse


The bandlimited sinc pulse waveform is not practical because:
- It is not realizable.
- The pulse is too long in the time domain, because the
  oscillations around the main lobe die out slowly.
- The long tails of the pulse also make synchronization more
  difficult.
83050E/76

Requirements for Pulse Shape


In the time domain the requirement is: intersymbol interference
is zero. Mathematically, this can be written as

   p(0) = 1
   p(mT) = 0  when m = ±1, ±2, ...
(Figure: a Nyquist pulse with p(0) = 1 and zero crossings at all
nonzero multiples of T; time axis in symbol intervals.)

Otherwise the pulse shape can be arbitrary.


83050E/77

Nyquist Criterion in Frequency Domain


It follows from the previous criteria:

   sum_k p(t) δ(t - kT)  =  sum_k p(kT) δ(t - kT)  =  δ(t),
   where the sums run over all integers k,

and through the Fourier transform we obtain:

   (1/T) sum_m P(f - m/T) = 1
Discussion: This result is closely related to the sampling theorem.
The time domain criterion says that the sampled continuous-
time pulse should be a discrete time impulse.
The left side of the frequency domain criterion is the spectrum
of the sampled discrete-time signal, and the right side is the
Fourier transform of an impulse.
The latter form is the famous Nyquist Criterion. This is a
criterion in frequency domain for preventing ISI.
It follows that the smallest theoretical bandwidth to obtain
zero ISI is W = 1/(2T). This would be obtained by using the
ideal bandlimited pulse, i.e., the sinc pulse.
In the frequency domain, the transition band is always
symmetrical with respect to 1/2T. Some spectra satisfying
the Nyquist criterion:
(Figure: three example spectra P(f) satisfying the Nyquist
criterion, each shown on the frequency axis from -1/T to 1/T
with the transition band symmetric about 1/(2T).)
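As a quick numerical illustration of the frequency domain criterion (a sketch; it uses the triangular overall pulse of the integrate-and-dump example, normalized so that p(0) = 1, whose spectrum is P(f) = T sinc^2(fT), and a truncated sum over the spectral replicas):

import numpy as np

T = 1.0

def P(f):
    # Spectrum of the triangular Nyquist pulse of length 2T
    return T * np.sinc(f * T) ** 2

f = np.linspace(-0.5 / T, 0.5 / T, 101)   # one period of the folded spectrum
m = np.arange(-200, 201)                  # truncated sum over replicas
folded = np.sum(P(f[:, None] - m[None, :] / T), axis=1) / T
print(folded.min(), folded.max())         # both close to 1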
83050E/78

Nyquist Criterion in Frequency Domain (Cont.)


The pulses/filters with zero ISI are called Nyquist
pulses/filters.

In practice, the pulse shaping filter frequency response P ( f )


has a symmetrical transition band with respect to f = 1 / 2T .

The total signal bandwidth (to the stopband edge) is

   (1 + α) / (2T).

Here α is the roll-off factor, or excess bandwidth, and it is
usually between 0.1 and 1, i.e., 10-100%.

So the excess bandwidth defines the difference between


the actual and the minimum theoretical bandwidths.
(Figure: magnitude response |P(f)| with the passband edge f_p,
the point f = 1/(2T), and the stopband edge f_s indicated.)
83050E/79

Pulse Shaping in Baseband PAM-System


Intersymbol interference is a very important factor at the
receiver sampling. The pulse waveform at the sampling instant
depends on the transmit filter, the receive filter, and the
transfer function of the channel.
The requirement is that the cascade of the transfer functions
P( f ) = G ( f ) B( f ) F ( f )
satisfies the Nyquist criterion. The individual filter responses
are not important in this respect.
This is the so-called zero-forcing criterion: it forces the ISI
to zero. When we have a noisy channel, it is not necessarily an
optimal solution, as we will see later.
The channel transfer function is usually fixed, or it cannot
be affected.
Transmitter and receiver filters are designed together.
Here, the following solutions are possible:
1. Pulse shaping is done in the transmitter (or receiver); the
   receiver (transmitter) approximates an ideal lowpass filter
   whose bandwidth is (1 + α)/(2T).
2. A matched filter pair is theoretically the optimal solution
   in the case of an ideal channel. The impulse responses are
   mirror images of each other, and the amplitude responses are
   the same.
3. The transmit filter is designed as in the previous case. The
   receiver filter (channel equalizer) tries to adaptively
   minimize the ISI using a certain criterion.
83050E/80

Designing the Pulse Shaping Filter


In general, the filter design criteria are:

1. The transmitted signal spectrum must satisfy certain criteria
in the frequency domain, usually defined in the system
specifications. (Example frequency mask: attenuation levels of
0, -50, and -80 dB at frequency offsets of 15 and 26 MHz around
the carrier frequency fc.)

2. ISI should be minimized, or more generally, the overall


error (including ISI and various distortions) at the decision
device should be minimized to minimize the BER.
3. Out-of-band signals (adjacent channels, interferences,
channel noise) should be attenuated sufficiently in the
receiver.

The transmitter/receiver architecture and implementation
aspects (1 and 3) are more or less (but not completely in
practical implementations) separated from the
communication-theoretic aspect (2), and different filter stages
are used to satisfy the different criteria.

In the implementation, discrete-time transversal filters or
digital FIR filters are often used for the final pulse shaping.
Sampling rate conversion and digital multirate signal
processing are often applied in this context.

In the literature, the raised cosine pulses/filters are the


standard solution.
83050E/81

Raised Cosine Pulses


There are analytical expressions both for a (single-stage)
raised cosine Nyquist filter and for the square root raised
cosine filter. The latter type of filter can be used as both
the transmit and receive filter, resulting in zero ISI with an
ideal channel.

The raised cosine filter is constructed in the frequency domain
to have an ideal passband and stopband response. The transition
bands are formed using half a cycle of a sine wave, and the
transition bandwidth is controlled by the roll-off parameter α.

   P(f) = T                                            for 0 <= |f| <= (1 - α)/(2T)
   P(f) = (T/2) [ 1 - sin( (πT/α)(|f| - 1/(2T)) ) ]    for (1 - α)/(2T) <= |f| <= (1 + α)/(2T)
   P(f) = 0                                            for |f| > (1 + α)/(2T)

The impulse response can be shown to be:

   p(t) = [ sin(πt/T) / (πt/T) ] · [ cos(απt/T) / (1 - (2αt/T)^2) ]
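A small sketch that evaluates this impulse response and checks the zero-ISI property numerically (the roll-off value 0.25 and the handling of the singular points t = 0 and t = ±T/(2α) by their limiting values are illustrative choices):

import numpy as np

def raised_cosine(t, T=1.0, alpha=0.25):
    # Raised cosine impulse response p(t) for roll-off factor alpha
    t = np.asarray(t, dtype=float)
    p = np.empty_like(t)
    near_zero = np.isclose(t, 0.0)
    sing = np.isclose(np.abs(t), T / (2 * alpha))
    ok = ~(near_zero | sing)
    x = t[ok] / T
    p[ok] = (np.sin(np.pi * x) / (np.pi * x)) * \
            (np.cos(alpha * np.pi * x) / (1 - (2 * alpha * x) ** 2))
    p[near_zero] = 1.0
    p[sing] = (alpha / 2) * np.sin(np.pi / (2 * alpha))   # limiting value
    return p

# Nyquist property: p(0) = 1 and p(mT) = 0 for m = +/-1, +/-2, ...
m = np.arange(-5, 6)
print(np.round(raised_cosine(m * 1.0), 6))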
83050E/82

About Pulse Shaping Filter Design


The formulas for raised cosine filters give the continuous-time
impulse response, and by sampling it at N times the symbol
rate, symmetric (linear-phase) FIR filters are obtained for an
N-times oversampled input signal (i.e., for the sampling rate
N/T).

In practice, the filter has to be truncated to a finite length
symmetrically around the origin, and delayed to obtain a causal
impulse response.

The finite-length window causes the filters to have finite
stopband attenuation. The truncation effects can be reduced by
using a suitable window function, such as the Hanning window.

Special filter optimization techniques can be used to design
filters with minimum order that satisfy the different
constraints, such as the Nyquist criterion and the stopband
attenuation requirements.

Another important property in this context is the peak envelope
value of the transmitted signal, which is a very important
parameter from the transmitter power amplifier design point of
view. Experience has shown that raised cosine filters are
practically optimal in this respect, whereas some filter
optimization methods may perform poorly. In particular,
nonlinear-phase transmit filters (nonsymmetric FIR filters)
tend to result in increased peak envelope values.
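A sketch of the design procedure described above (it reuses the raised_cosine function from the earlier sketch; the oversampling factor N = 8, the filter span of 8 symbol intervals, and the Hanning window are illustrative choices):

import numpy as np

N = 8          # samples per symbol (output sampling rate N/T)
span = 8       # filter length in symbol intervals
T = 1.0
alpha = 0.25

# Sample the continuous-time impulse response symmetrically around t = 0
n = np.arange(-(span * N) // 2, (span * N) // 2 + 1)
h = raised_cosine(n * T / N, T, alpha)

# Reduce the truncation effects with a suitable window (here Hanning)
h = h * np.hanning(len(h))

# The impulse response is symmetric (linear phase); delaying it by
# half its length (span*N/2 samples) gives a causal FIR filter.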
83050E/83

Eye Diagram
An eye diagram consists of many synchronized, overlaid
traces of small sections (a few symbols) of a signal. It is
assumed that symbols are random and independent, so all
the possible symbol combinations are expected to have
occurred.

The eye diagram can be measured with an oscilloscope or
obtained from computer simulations.
Eye diagrams are used both for checking the system operation
and for evaluating system performance in research and
development work.

Intersymbol interference can easily be seen in the eye
diagram.

The eye diagram depends on the received pulse shape and


the used constellation.
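As an illustration, a minimal sketch of how an eye diagram can be formed from an oversampled simulated signal (assuming M samples per symbol and matplotlib for plotting; the two-symbol trace length is a common but arbitrary choice):

import numpy as np
import matplotlib.pyplot as plt

def plot_eye(signal, M, n_traces=100):
    # Overlay two-symbol-long segments of the oversampled signal
    seg = 2 * M
    t = np.arange(seg) / M                 # time axis in symbol intervals
    for k in range(n_traces):
        start = k * M                      # advance one symbol per trace
        if start + seg > len(signal):
            break
        plt.plot(t, signal[start:start + seg], 'b', linewidth=0.5)
    plt.xlabel('Time in symbol intervals')
    plt.show()

For example, plot_eye(rx, M=8) applied to the received signal rx of the earlier PAM simulation sketch would show the kind of nearly closed eye discussed above.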
83050E/84

Properties of Eye Diagram

(Figure: an eye diagram with the vertical opening, the
horizontal opening, and the sampling instant indicated.)

The wider the vertical opening, the greater the noise immunity.
ISI reduces the vertical opening.
The ideal sampling instant is at the point of maximum vertical
eye opening.
The smaller the horizontal opening, the greater the sensitivity
to errors in the timing phase.
83050E/85

Eye Diagram (continued)
The effect of the excess bandwidth (raised cosine pulses,
2-level PAM):
(Figures: eye diagrams for 25% and 100% excess bandwidth.)

We notice that by increasing the excess bandwidth, the


horizontal opening becomes wider.
In the extreme case of zero excess bandwidth, the horizontal
opening becomes zero, and the system becomes extremely
sensitive to timing errors. This is one indication of the fact
that sinc pulses cannot be used in digital transmission.
Another drawback is that the peak envelope value of the
transmitted signal grows heavily as the excess bandwidth is
reduced.
(Figure: eye diagram for 4-PAM with 25% excess bandwidth,
raised cosine pulses.)
83050E/86

Equivalent Discrete-Time System Model


If we look at the chain from the coder output to the sampler
output, it can be modelled as a single linear discrete-time
system block (filter). The impulse response of this filter is

   p_k = p(kT) = [ g(t) * b(t) * f(t) ]_{t = kT}   (* denotes convolution)

i.e., the sampled version of the continuous-time impulse


response.
(Block diagram: bits B_k -> coder -> symbols A_k -> discrete-time
equivalent channel p_k -> noise U_k added -> Q_k -> decoder ->
bits.)

In many cases we can use such a simplified discrete-time system
model in simulations. Of course, the possible discrete-time or
digital filter blocks in the transmitter and receiver can also
be included in the model.

It should be noted that the channel noise is filtered by the
receiver filter, which changes the noise characteristics. For
example, the noise source in this model is not necessarily
white even if the channel noise is. The discrete-time noise
source in the model and the actual channel noise N(t) are
related through

   U_k = [ N(t) * f(t) ]_{t = kT}.
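A minimal sketch of a symbol-rate simulation based on this equivalent model (the tap values of p_k, the noise level, and the use of white noise for U_k are illustrative simplifications; as noted above, the filtered noise is in general colored):

import numpy as np

# Discrete-time equivalent channel: symbol-spaced samples of p(t)
p = np.array([0.1, 1.0, 0.25, -0.05])   # illustrative taps, main tap at index 1

nsym = 1000
A = 2 * np.random.randint(0, 2, nsym) - 1       # binary symbols {-1, +1}
U = 0.1 * np.random.randn(nsym + len(p) - 1)    # noise at the sampler output

Q = np.convolve(A, p) + U                       # received symbol-rate samples
Q = Q[1:1 + nsym]                               # align with the main tap

decisions = np.where(Q >= 0, 1, -1)
print("Symbol error rate:", np.mean(decisions != A))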
