Coordinated Multi-Point in
Mobile Communications
From Theory to Practice
Edited by
GERHARD P. FETTWEIS
Technische Universität Dresden, Germany
Contents
List of Contributors
Acknowledgements
List of Abbreviations
Nomenclature and Notation
Introduction
1.1 Motivation
1.2 Aim of this Book
1.3 Classes of CoMP Considered
1.4 Outline of this Book
An
2.1
2.2
2.3
2.4
Information-Theoretic Basics
3.1 Observed Cellular Scenarios
3.2 Usage of OFDMA for Broadband Wireless Communications
3.3 Multi-Point Frequency-Flat Baseband Model Considered
3.4 Uplink Transmission
3.4.1 Basic Uplink Capacity Bounds
3.4.2 Full Cooperation in the Uplink
3.4.3 No Cooperation in the Uplink
3.4.4 Numerical Example
3.5 Downlink Transmission
3.5.1 Basic Downlink Capacity Bounds
3.5.2 Full Cooperation in the Downlink
3.5.3 No Cooperation in the Downlink
3.5.4 Numerical Example
3.6 Summary
Clustering
7.1 Static Clustering Concepts
7.1.1 Non-Overlapping Clusters
7.1.2 Overlapping Clusters
7.1.3 Resulting Geometries
7.2 Self-Organizing Clustering Concepts
7.2.1 Self-Organizing Network Concepts in 3GPP LTE
7.2.2 Adaptive Clustering Algorithms
7.2.3 Simulation Results
7.2.4 Signaling and Control Procedures
7.3 Summary
Synchronization
8.1 Synchronization Concepts
8.1.1 Synchronization Terminology
8.1.2 Network Synchronization
8.1.3 Satellite-Based Synchronization
8.1.4 Endogenous Distributed Wireless Carrier Synchronization
8.1.5 Summary
8.2 Imperfect Sync in Time: Perf. Degradation and Compensation
8.2.1 MIMO OFDM Transmission with Asynchronous Interference
8.2.2 Interf.-Aware Multi-User Joint Detection and Transmission
Channel Knowledge
9.1 Channel Estimation for CoMP
9.1.1 Channel Estimation - Single Link
9.1.2 Channel Estimation for CoMP
9.1.3 Multi-Cell Channel Estimation
9.1.4 Uplink Channel Estimation
9.1.5 Summary
9.2 Channel State Information Feedback to the Transmitter
9.2.1 Transmission Model
9.2.2 Sum-Rate Performance Measure
9.2.3 Channel Vector Quantization (CVQ)
9.2.4 Minimum Euclidean Distance Based CVQ
9.2.5 Maximum SINR Based CVQ
9.2.6 Pseudo-Maximum SINR based CVQ
9.2.7 Application to Zero-Forcing (ZF) Precoding
9.2.8 Resource Allocation
9.2.9 Simulation Results
9.2.10 Summary
10
11
12
Backhaul
12.1 Fund. Limits of Interf. Mitigation with Limited Backhaul Coop.
12.1.1 Introduction
12.1.2 Uplink Scenario: Receiver Cooperation
12.1.3 Downlink Scenario: Transmitter Cooperation
12.1.4 UL-DL Reciprocity and Generalized Degrees of Freedom
12.1.5 Summary
12.2 Backhaul Requirements of Practical CoMP Schemes
12.2.1 Types of Backhaul Data and Scaling Laws
12.2.2 Specific Backhaul Requirements of Exemplary CoMP Schemes
12.2.3 Backhaul Latency Requirements
12.2.4 Backhaul Topology Considerations
12.2.5 Summary
12.3 CoMP Backhaul Infrastructure Concepts
12.3.1 Ethernet
12.3.2 Passive Optical Network
12.3.3 Digital Subscriber Line
12.3.4 Microwave
12.3.5 The X2 Interface
12.3.6 Backhaul Topology Concepts
12.3.7 Summary
13
14
14.2.5 Summary
14.3 Uplink Simulation Results
14.3.1 Compared Schemes
14.3.2 Simulation Assumptions and Parameters
14.3.3 Backhaul Traffic
14.3.4 Simulation Results
14.3.5 Summary
14.4 Downlink Simulation Results
14.4.1 Compared Schemes
14.4.2 Simulation Assumptions and Parameters
14.4.3 Detailed Analysis of Coordinated Scheduling/Beamforming
14.4.4 Backhaul Traffic
14.4.5 Simulation Results
14.4.6 Summary
15
Outlook
15.1 Using CoMP for Terminal Localization
15.1.1 Localization based on the Signal Propagation Delay
15.1.2 Further Localization Methods
15.1.3 Localization in B3G Standards
15.1.4 Summary
15.2 Relay-Assisted Mobile Communication using CoMP
15.2.1 Introduction
15.2.2 Reference Scenario
15.2.3 System and Protocol Description
15.2.4 Trade-Offs in Relay Networks
15.2.5 Numerical Evaluation of CoMP and Relaying
15.2.6 Cost/Benefit Trade-Off
15.2.7 Energy/Benefit Trade-Off
15.2.8 Computation/Transmission Power Trade-Off
15.2.9 Summary
15.3 Next Generation Cellular Network Planning and Optimization
15.3.1 Introduction
15.3.2 Classical Cellular Network Planning and Optimization
15.3.3 Physical Characterization of Capacity Gains through CoMP
15.3.4 Summary
15.4 Energy-Efficiency Aspects of CoMP
15.4.1 System Model
15.4.2 Effective Transmission Rates
15.4.3 Backhauling
15.4.4 Energy Consumption of Cellular Base Stations
References
Index
List of Contributors
Amin, M. Awais
Bachl, Rainer
Bhagavatula, Ramya
Boccardi, Federico
Brown III, D. Richard
Brück, Stefan
Calin, Doru
Chae, Chan-Byoung
Dammann, Armin
Dekorsy, Armin
Dietl, Guido
Doll, Mark
dos Santos, Ricardo B.
Dotsch, Uwe
Droste, Heinz
Fahldieck, Torsten
Falconetti, Laetitia
Fehske, Albrecht
Fettweis, Gerhard
Fischer, Erik
Forck, Andreas
Frank, Philipp
Fritzsche, Richard
Garavaglia, Andrea
Gesbert, David
Giese, Jochen
Grieger, Michael
Haustein, Thomas
Heath Jr., Robert W.
Holfeld, Jorg
Hoymann, Christian
Irmer, Ralf
Jackel, Stephan
Jandura, Carsten
Jungnickel, Volker
Kadel, Gerhard
Klein, Andrew G.
Klein, Anja
Koppenborg, Johannes
Kotzsch, Vincent
Maciel, Tarcisio F.
Marsch, Patrick
Mayer, Hans-Peter
Mensing, Christian
Molisch, Andreas F.
Müller-Weinfurtner, Stefan
Müller, Andreas
Olbrich, Michael
Palleit, Nico
Rost, Peter
Sand, Stephan
Schellmann, Malte
Schneider, Christian
Schulist, Matthias
Thiele, Lars
Tian, Yafei
Tse, David
Utschick, Wolfgang
Voigt, Jens
Wachsmann, Udo
Wahls, Sander
Wang, I-Hsiang
Weber, Andreas
Weber, Ralf
Weber, Tobias
Wei, Xinning
Wild, Thorsten
Wirth, Thomas
Yang, Chenyang
Zakhour, Randa
Zirwas, Wolfgang
Acknowledgements
This book is based on the knowledge and effort of a large number of authors,
some of whom have been working in the field of CoMP for over a decade. The
editors would like to thank all contributors for their great cooperation in the last
months, their constructive discussions on contents, notation and nomenclature,
and their patience in fine-tuning contents up to the last minute of editing.
The request for searching for new limits of cellular beyond 3G came from
Vodafone Group R&D, initiating our research in the area of CoMP. As the
sponsor of the Vodafone Chair at Technische Universität Dresden, Vodafone
Group R&D has been instrumental in sharpening our view for CoMP schemes
with practical impact. In particular Mike Walker, Trevor Gill and Luke Ibbetson
among many others have been of great help in serving as a sounding board for
our ideas. As a result, we have focused our research on theoretical limits as
well as practical implementation challenges. The result of this view on CoMP
technology has provided the basis for what has finally led to this book.
Nothing would be possible without interaction with friends, colleagues, fellow
researchers and cooperation partners. The mindset and openness of our scientific
community is a platform for inspiration and motor for sharpening our minds. In
particular, the team at the Vodafone Chair has been of invaluable help in creating
scientific results and providing the framework for inspirations, discussions, and
many new insights. Thanks to the whole team for this major help!
While most parts of this book were mutually reviewed by the authors themselves, the editors would like to thank the following external reviewers for their
valuable feedback: Fabian Diehm, Alexandre Gouraud, Ines Kluge, Marco Krondorf, Eckhard Ohlmer, Simone Redana, Fred Richter, Hendrik Schoneich, Mikael
Sternad, Vinay Suryaprakash, Tommy Svensson, Stefan Valentin, Raphael Visoz,
Guillaume Vivier and Steffen Watzek. Also, the appearance of the book would
not be as it is without the significant work of Katharina Philipp, who adapted
the majority of figures in this book to the same look and feel.
Last but surely not least, the editors would like to thank Phil Meyler and
Sarah Finlay from Cambridge University Press for making this book possible,
and for the great and patient support during its creation.
Patrick Marsch and Gerhard Fettweis (Editors), January 2011
List of Abbreviations
ACK     acknowledgement
ADC     analog to digital conversion
AGC     automatic gain control
aGW     advanced gateway
ANR     automatic neighbor relation
AoA     angle of arrival
AWGN    additive white Gaussian noise
bpcu    bits per channel use
BC      broadcast channel
BER     bit error rate
BF      beamforming
BLER    block error rate
BPSK    binary phase shift keying
BS      base station
CAZAC   constant amplitude zero autocorrelation codes
CB      coordinated beamforming
CCU     CoMP central unit
CD      Cholesky decomposition
CDF     cumulative distribution function
CDI     channel direction indicator
CDM     code division multiplex
CDMA    code division multiple access
CFO     carrier frequency offset
CGI     cell global identifier
CIF     compressed interference forwarding
CIR     channel impulse response
CoMP    coordinated multi-point
CP      cyclic prefix
CPRI    common public radio interface
CQI     channel quality indicator
CRC     cyclic redundancy check
CRLB    Cramér-Rao lower bound
CRS     common reference signal
CS      coordinated scheduling
CS/CB
CSG
CSI
CSIR
CSI RS
CSIT
CSU
CT
CTF
CU
CVQ
DAS
DBA
DF
DFT
DIS
DL
DM
DPC
DRS
DSL
DSLAM
DSP
EASY-C
eNB
EOC
ERC
EPON
E-UTRAN
EVD
EvDO
FDD
FDM
FEC
FFT
FIR
FPGA
FTP
g.d.o.f.
GF
GSM
GPON
GPRS
GTC
GPS
GTP-U
HARQ
H-BLAST
HK
HPBW
HSPA
IAP
IC
ICI
ICIN
IDFT
IF
i.i.d.
IEEE
IFFT
INR
IP
IRC
ISD
ISI
JD
JT
LAN
LDC
LLR
LMMSE
LO
LOS
LSP
LSU
LTE
LTE-A
MAC
MAN
MCS
MET
MIESM
MIMO
MISO
MF
ML
MLE
MME
MMSE
MPC
MRC
MRM
MRT
MS
MSE
MUI
MU-MIMO
NGMN
NLOS
NMEA
NR
NRT
NTP
OAM
OC
OCXO
ODN
OFDM
OFDMA
OLT
ONU
PA
PAPR
PCI
PDF
PDH
PDCCH
PDSCH
PDP
PIC
PLL
PMI
ppb
ppm
PPS
PON
POTS
PRB
PRS
PTP
PUCCH
PUSCH
QAM
QoE
QoS
QPSK
RAN
RAP
RB
RE
RF
RI
RHS
RMS
RN
RNTI
RoF
RRH
RRM
RS
RSS
RSRP
RTOA
RTT
SC
SC-FDMA
SCM
SCME
SCTP
SDH
SDMA
SDIV
S-GW
SIC
SINR
SIR
SISO
SMUX
SON
SONET
SS
SSB
SSP
SNR
SU-MIMO
SVD
SynchE
TB
TCI
TDD
TDM
TDMA
TDOA
THP
TOA
TTI
UDP
UCA
UE
UL
ULA
UMTS
UTRAN
VDSL
VID
VLAN
VoIP
WAN
WCDMA
WCI
WiMAX
WF
WINNER
WSSUS
XGPON
ZF
Nomenclature
In this book, we generally consider the setup and involved nomenclature depicted
in Fig. 3.1 on page 13. Please note that we assume a site to consist of three sectors,
which are equivalent to cells. Each sector or cell is assumed to be served by one
dedicated base station (BS), even though in practice multiple such BSs may be
integrated into one physical device.
Notation
Unless stated otherwise, the following holds throughout most parts of the book:
Decentralized
  Interference-aware transmission/detection: DL multi-user beamforming with IRC (Sections 5.1, 13.3); IRC (Section 10.2)
  Interference coordination: UL cooperative interf. prediction (Sections 5.2.2, 14.3); DL coordinated sched. / beamforming (CS/CB) (Sections 5.3, 14.4.3)
  Multi-cell joint signal processing: UL decentralized joint detection (Sections 6.2, 13.1, 14.3); UL distr. interference subtraction (Sections 4.3.1, 13.2); DL distributed joint transmission (Sections 6.3, 13.3, 13.4)
Centralized
  Interference coordination: UL joint scheduling (Sections 5.2.1, 14.3); DL centralized joint scheduling (Section 11.1)
  Multi-cell joint signal processing: UL centralized joint detection (Sections 6.1, 13.2); DL centralized joint transmission (Sections 6.3, 13.3)
Part I
Motivation and Basics
Introduction
Patrick Marsch and Gerhard Fettweis
1.1
Motivation
Mobile communication has gained significant importance in today's society. As
of 2010, the number of mobile phone subscribers has surpassed 5 billion [ABI10],
and the global annual mobile revenue is soon expected to top $1 trillion [Inf10].
While these numbers appear promising for mobile operators at first sight, the
major game-changer that has come up recently is the fact that the market is
more and more driven by the demand for mobile data traffic [Cis10]. This is simply because Moore's law in semiconductors leads to continuously more powerful
mobile devices with larger storage capacity, which in the era of Web 2.0 require
regular synchronization with the Internet. Consequently, Moore's law can also
be found in the increase of data rates in wireless communications, as illustrated
in Fig. 1.1. The main challenge, however, is that mobile users tend to expect
the fast and cheap Internet access that they are used to from their fixed lines (e.g.
ADSL), but anytime and anywhere while being on the move. This puts mobile
operators under pressure to respond to the increasing traffic demand and
provide a more homogeneous quality of experience (QoE) over the area (often
referred to as improved fairness), while continuously decreasing cost per bit - and
addressing the more and more crucial issue of energy efficiency [FMBF10].
But how can mobile data rates and fairness be increased in general? We have
to be aware that current cellular systems are mainly limited by inter-cell interference [GK00] - especially in urban areas where the rate demand is largest and
hence base station deployment is dense. Here, each point-to-point communication link is characterized by a certain ratio of desired receive signal power over
interference and noise power, for which Shannon [Sha48] states a clear upper bound
on the capacity of the link. This then translates to a maximum spectral efficiency, i.e. the maximum data rate achievable for a given bandwidth. In fact,
the standard Long Term Evolution (LTE) Release 8 [McC07] uses modulation
and coding schemes and link adaptation in conjunction with hybrid automatic
repeat request (HARQ) that allow approaching Shannon capacity to within less
than a dB at reasonable complexity [LS06]. Hence, the increasing rate demand
can surely not be met by improving point-to-point links, but requires other innovations. But which further options do we have?
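
The Shannon bound referred to above can be made concrete with a minimal Python sketch. It evaluates the spectral efficiency bound log2(1 + SINR) and the resulting rate for a given bandwidth. The function names and the 10 MHz example bandwidth are illustrative assumptions that do not appear in the text, and the link is idealized as a single point-to-point channel with a scalar SINR.

```python
import math


def shannon_spectral_efficiency(sinr_db: float) -> float:
    """Upper bound on spectral efficiency in bit/s/Hz for an SINR given in dB."""
    sinr_linear = 10.0 ** (sinr_db / 10.0)
    return math.log2(1.0 + sinr_linear)


def shannon_rate(bandwidth_hz: float, sinr_db: float) -> float:
    """Upper bound on the data rate in bit/s for a given bandwidth and SINR."""
    return bandwidth_hz * shannon_spectral_efficiency(sinr_db)


if __name__ == "__main__":
    # Assumed example values: 10 MHz bandwidth, a few SINR operating points.
    for sinr_db in (-5.0, 0.0, 10.0, 20.0):
        efficiency = shannon_spectral_efficiency(sinr_db)
        rate_mbps = shannon_rate(10e6, sinr_db) / 1e6
        print(f"SINR {sinr_db:5.1f} dB: {efficiency:5.2f} bit/s/Hz, {rate_mbps:6.1f} Mbit/s")
```

Since the bound grows only logarithmically in the SINR, each additional bit/s/Hz at high SINR costs roughly a doubling of the signal-to-interference-and-noise ratio, which is one way to see why improving point-to-point links alone offers limited gains.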
1.2
1.3
1.4
2.1
2.2
Most of the requirements are already addressed with LTE, which is being
commercialized in 2010 in its first release. However, there is a need to develop
LTE beyond the first release, in order to address customer and operator requirements. The challenges faced by mobile communications in the second decade of
the 21st century are the following:
Exploding data volume - This is driven by attractive services, flat-rate pricing and user-friendly devices. The most prominent example is the iPhone, which
resulted in a 10x traffic increase. IPTV, 3D Internet, real-time web, and cloud
services will result in step changes in data consumption. IBM is predicting the
generation of 16 TB/person/year by 2020. The challenge is that networks need
to be structured to cope with data volume explosion without a cost or energy
explosion or constant need for equipment upgrades, as illustrated in Fig. 2.2.
[Fig. 2.2; labels: Cost, Traffic, Data, Revenue, Voice, Time]
Increased data rates - Driven by new services and the evolution from DSL
(2 Mbps) to variants of fibre technology (100 Mbps to 1 Gbps), the user expectation of acceptable Internet speed will rise substantially in line with the expectation set by fibre networks, thus posing a challenge to wireless technologies.
Ubiquitous indoor coverage - Many data services are important for indoor
users and people are usually within buildings. Indoor coverage is therefore important and can be provided either by copper/fibre with local radio distribution
(femto cell or WiFi) or from cellular networks.
Ubiquitous outdoor coverage - For voice calls, the user expectation has
moved from making calls along major roadways in the 1990s to being reachable
all the time in any building. Mobile Internet based on 3G or WiFi today can
only be characterized as best-effort, without continual connectivity whilst on-the-move and with patchy coverage in many places. In 2020, business customers
and consumers will rely more on data connectivity - they will need connectivity
anywhere, anytime. Coverage with a minimum guaranteed data rate and hence
reliability will be a key differentiator between operators as the world moves from
a nice-to-be-connected model to one that is essential-to-be-connected.
There are technical innovations on the horizon to address these challenges:
Gradual improvements of existing technologies, e.g. better MIMO modes
Active antennas, which may enable multi-element antennas
New deployment concepts like femto cells or MetroZone networks. They require innovative backhaul solutions such as in-band and out-band backhaul or mm-wave microwave, and self-organizing principles in order to be manageable
Miniaturized, flexible, energy-efficient base stations
Base station cooperation concepts.
2.3
ments in urban areas and capacity hotspots. As we will see later, this increase
in access capacity with CoMP concepts comes at the cost of more backhaul
capacity, i.e. more communication bandwidth between base stations. However,
for HSPA+ and LTE, base station sites need high-capacity backhaul (fibre or
microwave) anyway, and as the cost of backhaul increases less than linearly with
the backhaul capacity, this issue might not be as severe as often stated.
What are the alternatives to CoMP? Different frequency reuse, more spectrum, more sites, more antennas - all are very expensive options for an operator.
Thus investing into more intelligent baseband (i.e. CoMP algorithms) and backhaul with higher data rate and lower latency requirements seems to be more and
more attractive when compared to the other options.
The complicated issue about CoMP concepts is that they are only partially
understood from the academic perspective today, and that implementation in a
standard at reasonable complexity is difficult. However, let's draw an analogy to
MIMO technologies. They are commercially used today in WiFi and cellular communications, but ten years ago there was only limited understanding of MIMO,
and the technology was seen by many as too complex to be commercialized.
2.4
Information-Theoretic Basics
Patrick Marsch and David Tse
In this chapter, the reader is made familiar with a set of theoretical concepts to
analytically capture the variety of CoMP schemes considered in this book. The
reader will obtain a first understanding of the general capacity gains that can be expected
from multi-cell joint signal processing, and the many degrees of freedom involved.
The chapter introduces notation that will be reused in most parts of the book.
3.1
3.2
[Fig. 3.1; labels: cells = sectors, sites with 3 base stations each, CoMP cluster]
nicating entities. The first aspect implies that any transmission must be bandlimited in order not to disturb other transmissions on adjacent bands, which
requires the design of particular transmit and receive filters. The second aspect
implies that any receiver may observe a superposition of multiple differently
delayed and attenuated copies of originally transmitted signals, which in the
context of broadband transmission may lead to inter-symbol interference (ISI)
that has to be dealt with. The third aspect means that we need a low-cost and
efficient signal processing solution that can divide a mobile communications system into a large number of flexible bit pipes according to the needs of the many users or
applications.
The mobile communications standard LTE Release 8 [McC07] from 3GPP uses
an OFDMA approach to address all aspects stated above, where the baseband
signal processing chain for a downlink example is depicted in Fig. 3.2. Here, the
key concept is that the symbols to be transmitted from one BS towards multiple
UEs 1..U are modulated in frequency domain, mapped to different sub-carriers,
and then an inverse discrete Fourier transform (IDFT) is used to generate a
time domain signal. A cyclic prefix is inserted before each orthogonal frequency
division multiplex (OFDM) symbol (i.e. before each block of samples processed
in one IDFT), in order to assure that even a channel with a large delay spread
does not cause ISI, and that the transmission leads to a circular
convolution of the transmitted samples with the channel. Each receiving UE can
then discard the cyclic prefix, perform a discrete Fourier transform (DFT), and
obtain (scaled and noisy) transmitted symbols in frequency domain again. While
[Fig. 3.2: downlink baseband signal processing chain; blocks at the BS: Mod./Cod. of d1..dK, P/S, IFFT, +CP, D/A; blocks at each UE: A/D, Sync, -CP, FFT, Det./Dec. yielding d1..dK]
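
The role of the cyclic prefix in the OFDM processing chain described in this section can be illustrated with a small Python sketch. It sends one OFDM symbol over a noise-free multipath channel and checks that, after discarding the cyclic prefix and applying a DFT, every sub-carrier is just a scaled copy of the transmitted symbol. All names (`dft`, `ofdm_over_multipath`, `channel_frequency_response`), the QPSK-like example symbols and the three channel taps are assumptions made for illustration; perfect synchronization and a cyclic prefix at least as long as the channel memory are assumed as well.

```python
import cmath


def dft(samples, inverse=False):
    """Plain O(n^2) DFT; the inverse transform includes the 1/n scaling."""
    n = len(samples)
    sign = 1 if inverse else -1
    result = []
    for k in range(n):
        acc = sum(samples[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        result.append(acc / n if inverse else acc)
    return result


def ofdm_over_multipath(symbols, channel_taps, cp_len):
    """Transmit one OFDM symbol over a noise-free multipath channel.

    Hypothetical helper: requires cp_len >= len(channel_taps) - 1.
    """
    n = len(symbols)
    time_block = dft(symbols, inverse=True)          # IDFT at the transmitter
    tx = time_block[n - cp_len:] + time_block        # prepend the cyclic prefix
    rx = [0j] * (len(tx) + len(channel_taps) - 1)    # linear convolution with the channel
    for i, sample in enumerate(tx):
        for delay, tap in enumerate(channel_taps):
            rx[i + delay] += sample * tap
    rx_block = rx[cp_len:cp_len + n]                 # receiver discards the cyclic prefix
    return dft(rx_block)                             # DFT back to frequency domain


def channel_frequency_response(channel_taps, n):
    """Per-sub-carrier channel gain: DFT of the zero-padded impulse response."""
    padded = list(channel_taps) + [0j] * (n - len(channel_taps))
    return dft(padded)


symbols = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j, 1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j]
taps = [1.0, 0.5, 0.25j]
received = ofdm_over_multipath(symbols, taps, cp_len=2)
gains = channel_frequency_response(taps, len(symbols))
max_error = max(abs(r - g * s) for r, g, s in zip(received, gains, symbols))
print(f"largest deviation from per-sub-carrier scaling: {max_error:.2e}")
```

The sketch omits noise, pulse shaping and the parallel/serial conversion; its only purpose is to show that the cyclic prefix turns the multipath channel into one complex gain per sub-carrier.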
3.3
3.4
Uplink Transmission
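In the uplink, the precoding, transmission and equalization of each symbol on
an OFDMA sub-carrier is illustrated in Fig. 3.3 and stated as
$$\hat{\mathbf{x}} = \mathbf{W}^H \mathbf{y} = \mathbf{W}^H (\mathbf{H}\mathbf{s} + \mathbf{n}) = \mathbf{W}^H \Bigg( \mathbf{H} \underbrace{\begin{bmatrix} \mathbf{G}_1 & & \mathbf{0} \\ & \ddots & \\ \mathbf{0} & & \mathbf{G}_K \end{bmatrix}}_{\mathbf{G}} \mathbf{x} + \mathbf{n} \Bigg), \qquad (3.1)$$
where $\mathbf{x} = [\mathbf{x}_1^T .. \mathbf{x}_K^T]^T \in \mathbb{C}^{[N_{\mathrm{UE}} \times 1]}$ are the symbols to be transmitted by the UEs,
which we generally assume uncorrelated with $E\{\mathbf{x}\mathbf{x}^H\} = \mathbf{I}$. These may then be
subject to linear UE-side precoding via matrices $\forall k: \mathbf{G}_k \in \mathbb{C}^{[N_{\mathrm{ue}} \times N_{\mathrm{ue}}]}$, yielding
the finally transmitted signals $\mathbf{s} = [\mathbf{s}_1^T .. \mathbf{s}_K^T]^T \in \mathbb{C}^{[N_{\mathrm{UE}} \times 1]}$. As we do not consider
UE cooperation, the overall transmit covariance $E\{\mathbf{s}\mathbf{s}^H\} = \mathbf{\Phi}_{ss}$ has a block-diagonal structure, i.e. the signals originating from different UEs are uncorrelated. $\mathbf{\Phi}_{ss}$ is usually subject to a per-antenna or per-UE power constraint to be
stated later. $\mathbf{H} \in \mathbb{C}^{[N_{\mathrm{BS}} \times N_{\mathrm{UE}}]}$ is the instantaneous fast fading realization of the
channel on this sub-carrier. We also denote as $\mathbf{H}_m$, $\mathbf{H}_k$, $\mathbf{H}_k^m$ the parts of the
channel matrix $\mathbf{H}$ connected to BS $m$, UE $k$, or the link from BS $m$ to UE $k$,
respectively, where we use a lower-case $\mathbf{h}$ if the expression becomes a vector
(e.g. for $N_{\mathrm{ue}} = 1$). $\mathbf{y} = [\mathbf{y}_1^T .. \mathbf{y}_M^T]^T \in \mathbb{C}^{[N_{\mathrm{BS}} \times 1]}$ are the signals received by the BSs,
containing zero-mean Gaussian noise $\mathbf{n} \in \mathbb{C}^{[N_{\mathrm{BS}} \times 1]}$ with $E\{\mathbf{n}\mathbf{n}^H\} = \sigma^2 \mathbf{I}$, where
then equalization via a matrix $\mathbf{W} \in \mathbb{C}^{[N_{\mathrm{BS}} \times N_{\mathrm{UE}}]}$ is performed to yield estimates
$\hat{\mathbf{x}} \in \mathbb{C}^{[N_{\mathrm{UE}} \times 1]}$ of the originally generated symbols $\mathbf{x}$. The structure of $\mathbf{W}$ depends
on the particular CoMP strategy employed, as we will see later. We also write
$\mathbf{W}_m$, $\mathbf{W}_k$ or $\mathbf{W}_k^m$ for the part of $\mathbf{W}$ connected to a particular BS $m$, UE $k$, or
a specific link, respectively, and use a lower-case $\mathbf{w}$ if this yields a vector.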
[Fig. 3.3: uplink transmission model; UEs 1..K with symbols x_k = [x_{k,1}..x_{k,Nue}]^T, precoders G_k and transmit signals s_k = [s_{k,1}..s_{k,Nue}]^T; base stations 1..M with receive signals y_m = [y_{m,1}..y_{m,Nbs}]^T and noise n_m = [n_{m,1}..n_{m,Nbs}]^T, connected to the network via the backhaul infrastructure]
3.4.1
where the notation I(X; Y) denotes the mutual information between transmitter
and receiver side. The rate bound in (3.2) can be proven from both sides. On the one
hand, one can construct an example coding technique (often based on the idea of
typical sequences) with a rate equivalent to the right-hand side (RHS) of (3.2),
such that any arbitrarily low probability of error can be achieved by simply
choosing a sufficiently long codeword. On the other hand, one can prove that
regardless of codeword length a non-zero probability of error remains if R exceeds
the RHS of (3.2) [CT06], typically making use of Fano's inequality. Hence, in
this point-to-point case with Gaussian x, the capacity of the transmission has
been precisely established. In the case of Nue > 1, i.e. multiple antennas per UE,
(3.2) changes to
$$R \le I(X;Y) = \max_{\mathbf{\Phi}_{ss}} \log_2 \left| \mathbf{I} + \frac{1}{\sigma^2} \mathbf{H} \mathbf{\Phi}_{ss} \mathbf{H}^H \right|, \qquad (3.3)$$
where the max operation over the transmit covariance $\mathbf{\Phi}_{ss}$ implies that the UE
chooses the optimal precoding matrix $\mathbf{G}$ for the current channel realization $\mathbf{H}$,
hence requiring transmitter-side channel knowledge. Given such perfect knowledge, and assuming all transmit antennas of the UE to be subject to a sum power
constraint $P_{\mathrm{sum}}$, a capacity-achieving UE strategy is to perform a singular value
decomposition (SVD) of the channel, yielding $\mathbf{H} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^H$, and choose as precoding matrix $\mathbf{G}$ the right singular vectors $\mathbf{V}$ (the eigenvectors of $\mathbf{H}^H\mathbf{H}$), where the columns are scaled in power
such that $\mathrm{tr}\{\mathbf{V}\mathbf{V}^H\} = \mathrm{tr}\{\mathbf{\Phi}_{ss}\} \le P_{\mathrm{sum}}$. Assuming that $\mathbf{W} = \mathbf{U}$ is used as BS side
receive filter, the transmission from (3.1) can be re-stated as a transmission over
$\min(N_{\mathrm{bs}}, N_{\mathrm{ue}})$ independent single-input single-output (SISO) links, often referred
to as the eigenmodes of the channel. The capacity on each eigenmode can then
be proved as in the case of $N_{\mathrm{ue}} = 1$ before. Finding the power scaling for $\mathbf{V}$ that
maximizes the sum capacity over all eigenmodes is a convex optimization problem [BV04] that can be solved easily via a water-filling algorithm [CT06], but not
in closed form. Note that the gap between capacity and rates achievable without
channel knowledge at the UE-side (i.e. without precoding and power control,
for example $\mathbf{V} = \mathbf{I}$) may be marginal, but then significantly more complex signal
processing is required at the receiver side [HTB03].
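
The water-filling step mentioned above can be sketched in a few lines of Python. The sketch takes the power gains of the eigenmodes (the squared singular values of the channel, here simply assumed as given numbers) and distributes a sum power over them. The function names, the example gains and the unit noise variance are assumptions for illustration only; this is one common way to implement water-filling, not necessarily the formulation used in [CT06].

```python
import math


def water_filling(gains, total_power, noise_var=1.0):
    """Distribute total_power over eigenmodes with power gains `gains`.

    gains[i] is the squared singular value of eigenmode i (assumed input).
    Returns powers p_i maximizing sum_i log2(1 + gains[i] * p_i / noise_var).
    """
    # "Floor" of each usable eigenmode: inverse gain-to-noise ratio, best mode first.
    floors = sorted((noise_var / g, i) for i, g in enumerate(gains) if g > 0)
    powers = [0.0] * len(gains)
    active = len(floors)
    water_level = 0.0
    while active > 0:
        water_level = (total_power + sum(f for f, _ in floors[:active])) / active
        if water_level > floors[active - 1][0]:
            break          # every active eigenmode receives positive power
        active -= 1        # switch off the weakest active eigenmode and retry
    for floor, index in floors[:active]:
        powers[index] = water_level - floor
    return powers


def sum_rate(gains, powers, noise_var=1.0):
    """Sum rate in bit/channel use over parallel eigenmodes."""
    return sum(math.log2(1.0 + g * p / noise_var) for g, p in zip(gains, powers))


gains = [4.0, 1.0, 0.1]
powers = water_filling(gains, total_power=1.0)
equal = [1.0 / 3] * 3
print("water-filling powers:", powers)
print("water-filling rate  :", sum_rate(gains, powers))
print("equal-power rate    :", sum_rate(gains, equal))
```

In this example the weakest eigenmode is switched off entirely and its power is moved to the stronger ones, which is the typical behaviour at low transmit power; at high power the allocation approaches an equal split.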
If now multiple UEs 1..K are decoded by the same single BS, the setup resembles a multiple access channel (MAC) [Ahl71], where it becomes interesting to
observe the capacity region, hence all tuples of rates $R_1..R_K$ at which the UEs can
transmit, such that all can be decoded at a probability of error decreasing exponentially in the codeword length. The capacity region of the MAC [Ahl71, Lia72]
is simply based on the fact that the sum-rate of any subset of UEs is bounded
by the joint mutual information between these UEs and the BS, given that all
other UEs are turned off. Formally, we can state this as
$$\forall\, \mathcal{S} \subseteq \{1..K\}: \quad \sum_{k \in \mathcal{S}} R_k \le \max_{\mathbf{\Phi}_{ss}} \log_2 \left| \mathbf{I} + \frac{1}{\sigma^2} \mathbf{H} \mathbf{\Phi}_{ss}(\mathcal{S}) \mathbf{H}^H \right|, \qquad (3.4)$$
where $\mathbf{\Phi}_{ss}(\mathcal{S})$ is the transmit covariance connected to the subset of UEs in set
$\mathcal{S}$. Note that $\mathbf{\Phi}_{ss}$ is now block-diagonal, as it is connected to multiple UEs. For
K = 2 UEs, this leads to the well-known pentagon-shaped capacity region illustrated in Fig. 3.4 [TV05]. Interpreting (3.4) from a more practical perspective,
it becomes clear that any point on the capacity region can be achieved by applying successive interference cancelation (SIC), hence by successively decoding the
transmissions of certain UEs, subtracting the corresponding receive signals, and
then decoding other UEs. Each of those corner points of the capacity region
where all UEs' rates are non-zero corresponds to one particular SIC order. The
optimal $\mathbf{\Phi}_{ss}$ can be determined via a UE-wise successive SVD and water-filling
algorithm [YRBC01].
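
For the simplest special case of (3.4), two single-antenna UEs and one single-antenna BS, the relation between SIC orders and corner points can be checked numerically. The following Python sketch is an illustration under these assumptions; the function name and the example channel gains are hypothetical and not taken from the text.

```python
import math


def mac_sic_corner_points(h1, h2, p1, p2, noise_var):
    """Corner points of the two-user scalar MAC pentagon and the sum-rate bound.

    h1, h2: complex channel gains of UE 1 and UE 2 towards the BS (assumed scalars).
    p1, p2: transmit powers of the two UEs.
    """
    s1 = p1 * abs(h1) ** 2
    s2 = p2 * abs(h2) ** 2
    # SIC order "UE 2 first": UE 2 is decoded with UE 1 as noise, then subtracted,
    # so that UE 1 is decoded free of interference.
    corner_ue2_first = (math.log2(1 + s1 / noise_var),
                        math.log2(1 + s2 / (noise_var + s1)))
    # SIC order "UE 1 first": the roles are swapped.
    corner_ue1_first = (math.log2(1 + s1 / (noise_var + s2)),
                        math.log2(1 + s2 / noise_var))
    sum_rate_bound = math.log2(1 + (s1 + s2) / noise_var)
    return corner_ue2_first, corner_ue1_first, sum_rate_bound


corner_a, corner_b, bound = mac_sic_corner_points(1.0, 0.5j, 1.0, 1.0, 0.1)
print("corner (UE 2 decoded first):", corner_a)
print("corner (UE 1 decoded first):", corner_b)
print("sum-rate bound             :", bound)
```

Both corner points meet the sum-rate bound exactly, so they are the two end points of the diagonal edge of the pentagon; the UE decoded last always obtains its interference-free single-user rate.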
In the remainder of this chapter, we shift the focus to scenarios with multiple
BSs serving adjacent cells, and observe potential capacity gains through multi-cell joint signal processing.
3.4.2
3.4.3
[Plot: horizontal axis R1 [bit/channel use], vertical axis R2 [bit/channel use]; curves: no coop., no coop. (HK), full coop., FDM]
Figure 3.4 (Inner bounds on) capacity regions for no or full cooperation in the uplink.
Note that this inner bound implies that each BS m also has perfect channel
knowledge towards interfering UEs, and takes this into account when calculating
receive filter Wm , referred to as interference rejection combining (IRC).
3.4.4
Numerical Example
Fig. 3.4 shows inner bounds on capacity regions for no BS cooperation, and the
capacity region for full BS cooperation, for an example with $M = K = 2$, $N_{\mathrm{bs}} =
N_{\mathrm{ue}} = 1$, $\mathbf{H} = [1, 0.25, 0.5i, 1]$, $\sigma^2 = 0.1$ and unit transmit power limit per
UE. In the non-cooperative case, the bound is based on all possible assignments
of UEs to BSs, including the option of one BS decoding both UEs. This bound is
only marginally extended through HK schemes. For this channel, non-cooperative
performance can be improved through frequency division multiplex (FDM), as
also shown in the figure, where both UEs are placed on orthogonal resources and
hence mutual interference is avoided. Each UE then invests its transmit power
into a smaller portion of bandwidth, yielding an improved SNR. As FDM is
of little value in connection with BS cooperation [Mar10], however, we will not
further observe it in this work.
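
The order of magnitude of the cooperation gain in this example can be reproduced with a short Python sketch. It is a rough illustration only, under two assumptions that are not stated in the text: the four channel coefficients are read row by row into a 2x2 matrix whose entry in row m and column k is the link from UE k to BS m, and the non-cooperative point evaluated here is only the one where BS m decodes UE m and treats the other UE as noise (without HK schemes, FDM, or one BS decoding both UEs). All function names are hypothetical.

```python
import math

# Assumed arrangement of the example coefficients: H[m][k] is the link UE k -> BS m.
H = [[1 + 0j, 0.25 + 0j],
     [0.5j, 1 + 0j]]
NOISE_VAR = 0.1


def full_cooperation_sum_rate(h, noise_var, powers=(1.0, 1.0)):
    """Sum rate log2 det(I + (1/noise_var) * H diag(powers) H^H) for a 2x2 channel."""
    m = [[(1.0 if r == c else 0.0)
          + sum(h[r][k] * powers[k] * h[c][k].conjugate() for k in range(2)) / noise_var
          for c in range(2)] for r in range(2)]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return math.log2(det.real)


def non_cooperative_rates(h, noise_var, powers=(1.0, 1.0)):
    """BS m decodes UE m only and treats the other UE as additional noise."""
    rates = []
    for k in range(2):
        j = 1 - k
        signal = powers[k] * abs(h[k][k]) ** 2
        interference = powers[j] * abs(h[k][j]) ** 2
        rates.append(math.log2(1 + signal / (noise_var + interference)))
    return rates


coop = full_cooperation_sum_rate(H, NOISE_VAR)
no_coop = non_cooperative_rates(H, NOISE_VAR)
print(f"full cooperation sum rate : {coop:.3f} bit/channel use")
print(f"no cooperation (R1, R2)   : {no_coop[0]:.3f}, {no_coop[1]:.3f} bit/channel use")
```

With single-antenna UEs and a per-UE power limit, transmitting at full power is sum-rate optimal under full cooperation, which is why no optimization over the transmit covariance appears in the sketch.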
3.5
Downlink Transmission
In the downlink, the precoding, transmission and equalization of each OFDM
symbol on a single sub-carrier can be stated as
$$\hat{\mathbf{x}} = \mathbf{G}^H \mathbf{y} = \begin{bmatrix} \mathbf{G}_1^H & & \mathbf{0} \\ & \ddots & \\ \mathbf{0} & & \mathbf{G}_K^H \end{bmatrix} \left( \mathbf{H}^H \mathbf{W}\, d(\mathbf{x}) + \mathbf{n} \right), \qquad (3.6)$$
where $\mathbf{x} \in \mathbb{C}^{[N_{\mathrm{UE}} \times 1]}$ are the symbols to be transmitted to the UEs, and $d(\cdot)$ can
be any arbitrary manipulation of these symbols performed by the BSs. We will
see later that a non-linear operation $d(\cdot)$ is in fact required to achieve capacity in
the case of multiple UEs. $\mathbf{W} \in \mathbb{C}^{[N_{\mathrm{BS}} \times N_{\mathrm{UE}}]}$ is a precoding matrix applied at the
BS side. The transmit covariance is now given as $\mathbf{\Phi}_{ss} = E\{\mathbf{W} d(\mathbf{x}) (d(\mathbf{x}))^H \mathbf{W}^H\}$,
which is typically subject to either a sum, per-BS or per-antenna power constraint. The latter is often motivated through the fact that each BS transmit
antenna has a separate power amplifier with a limited linear range. In an OFDM
context, however, applying a per-antenna power constraint individually on each
sub-carrier is rather questionable, as the time-domain signal and its PAPR
appear more important. $\mathbf{H} \in \mathbb{C}^{[N_{\mathrm{BS}} \times N_{\mathrm{UE}}]}$ is the channel matrix, as defined for the
uplink. $\mathbf{G} \in \mathbb{C}^{[N_{\mathrm{UE}} \times N_{\mathrm{UE}}]}$ is a matrix containing the UE-side receive filters, which
is block-diagonal, as we again assume that no cooperation takes place between
UEs. $\mathbf{n} \in \mathbb{C}^{[N_{\mathrm{UE}} \times 1]}$ is the thermal noise and background interference present at
the receive antennas of the UEs, which we assume zero-mean Gaussian with
covariance $E\{\mathbf{n}\mathbf{n}^H\} = \sigma^2 \mathbf{I}$. Each UE finally obtains estimates $\hat{\mathbf{x}} \in \mathbb{C}^{[N_{\mathrm{UE}} \times 1]}$ of
the originally transmitted symbols $\mathbf{x}$. The same variable names have been used
in both (3.6) and for the uplink in (3.1) to emphasize duality: The receive filter
$\mathbf{W}$ in the uplink plays a dual role to the transmit filter in the downlink, and the
uplink transmit filters $\mathbf{G}$ a dual role to the downlink receive filters.
3.5.1
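$$\mathcal{R}_{\mathrm{BC}} = \mathrm{conv}\left( \bigcup_{\mathbf{W}, \pi} \left\{ R_1(\mathbf{W}, \pi), .., R_K(\mathbf{W}, \pi) \right\} \right), \qquad (3.7)$$
where $\bigcup$ states the union of multiple rate regions, and $\mathrm{conv}(\cdot)$ is a convex hull
operation [BV04], in this case over all choices of precoding matrix $\mathbf{W}$ and encoding order $\pi$, where for each fixed parameter choice the UE rates are bounded as
$$R_k(\mathbf{W}, \pi) \le \log_2 \left| \mathbf{I} + \Big( \sigma^2 \mathbf{I} + \sum_{\pi(j) > \pi(k)} \mathbf{H}_k^H \mathbf{W}_j \mathbf{W}_j^H \mathbf{H}_k \Big)^{-1} \mathbf{H}_k^H \mathbf{W}_k \mathbf{W}_k^H \mathbf{H}_k \right|. \qquad (3.8)$$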
Unfortunately, finding the optimal $\mathbf{W}$ for a certain point on the capacity
region (or, equivalently, the precoder maximizing a particularly weighted sum of
UE rates), is not trivial, as any sum of UE rates as given in (3.8) is typically non-convex in $\mathbf{W}$ [BS02]. It has been observed in [JVG04], however, that an interesting duality can be exploited between uplink and downlink that we want to briefly
illustrate in the sequel. Let us state $\mathbf{A}_k = \sigma^2 \mathbf{I} + \sum_{\pi(j) > \pi(k)} \mathbf{H}_k^H \mathbf{W}_j \mathbf{W}_j^H \mathbf{H}_k$ and
$\mathbf{B}_k = \sigma^2 \mathbf{I} + \sum_{\pi(j) < \pi(k)} \mathbf{H}_j \mathbf{G}_j \mathbf{G}_j^H \mathbf{H}_j^H$ as the interference terms in downlink and
uplink, respectively. We then re-state the rate bound in the BC from (3.8) as
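$$R_k(\mathbf{W}, \pi) \le \log_2 \left| \mathbf{I} + \mathbf{A}_k^{-1} \mathbf{H}_k^H \mathbf{W}_k \mathbf{W}_k^H \mathbf{H}_k \right| \qquad (3.9)$$
$$= \log_2 \left| \mathbf{I} + \mathbf{A}_k^{-\frac{1}{2}} \mathbf{H}_k^H \mathbf{B}_k^{-\frac{1}{2}} \mathbf{B}_k^{\frac{1}{2}} \mathbf{W}_k \mathbf{W}_k^H \mathbf{B}_k^{\frac{1}{2}} \mathbf{B}_k^{-\frac{1}{2}} \mathbf{H}_k \mathbf{A}_k^{-\frac{1}{2}} \right| \qquad (3.10)$$
$$= \log_2 \left| \mathbf{I} + \mathbf{B}_k^{-\frac{1}{2}} \mathbf{H}_k \underbrace{\mathbf{A}_k^{-\frac{1}{2}}\, \overline{\mathbf{B}_k^{\frac{1}{2}} \mathbf{W}_k}}_{= \mathbf{G}_k} \overline{\mathbf{W}_k^H \mathbf{B}_k^{\frac{1}{2}}}\, \mathbf{A}_k^{-\frac{1}{2}} \mathbf{H}_k^H \mathbf{B}_k^{-\frac{1}{2}} \right| \qquad (3.11)$$
$$= \log_2 \left| \mathbf{I} + \mathbf{B}_k^{-1} \mathbf{H}_k \mathbf{G}_k \mathbf{G}_k^H \mathbf{H}_k^H \right|, \qquad (3.12)$$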
which is equivalent to the uplink rate bound for a MAC, given fixed transmit filters $G_k$ and an opposite decoding order, as this can be derived from (3.4). The equality in (3.10) is based on the fact that $|I + AB| = |I + BA|$, and that in (3.11) on the idea of channel flipping [JVG04]. The authors in [JVG04] have furthermore shown that the above equalities hold for all UEs if and only if the sum power is the same in both cases, i.e. if $\mathrm{tr}\{\sum_k G_k G_k^H\} = \mathrm{tr}\{\sum_k W_k W_k^H\}$. Hence, we can conclude that the capacity region of the MIMO BC under a sum power constraint is equivalent to that of the MIMO MAC (obtained through the reciprocal channel H) under the same sum power constraint. As the standard uplink is typically subject to a per-UE power constraint, we can obtain the BC capacity region by taking the convex hull around many MAC regions with different per-UE powers summing up to the same overall power. This is illustrated in Fig. 3.5 for the same example channel as before. It was shown in [WSS06] that the obtained BC rate region corresponds to the Sato upper bound [Sat78], proving that there can indeed be no scheme that performs better. Hence, capacity has been established for the BC case of Gaussian noise.
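The determinant identity $|I + AB| = |I + BA|$ used for (3.10) can be checked numerically. The following is a minimal sketch for a rank-one case, where A is a 2x1 column vector and B a 1x2 row vector; the numerical values are arbitrary illustrations and are not taken from the example channel of this chapter.

```python
# Numerical check of |I + AB| = |I + BA| for a rank-one example.
# A is a 2x1 column vector and B a 1x2 row vector (illustrative values only).

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

a = [1 + 2j, 0.5 - 1j]      # column vector A (2x1)
b = [0.3 - 0.4j, 2 + 1j]    # row vector B (1x2)

# I + AB is 2x2: identity plus the outer product of a and b
i_plus_ab = [[(1 if r == c else 0) + a[r] * b[c] for c in range(2)]
             for r in range(2)]

# I + BA is 1x1: one plus the inner product of b and a
i_plus_ba = 1 + sum(b[n] * a[n] for n in range(2))

lhs = det2(i_plus_ab)
rhs = i_plus_ba
print(abs(lhs - rhs) < 1e-12)
```

The same identity allows a determinant over the larger of the two dimensions to be replaced by one over the smaller, which is what makes the re-ordering of terms between (3.9) and (3.12) possible.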
Equations (3.9)-(3.12) also suggest that we can calculate the optimal precoding matrix $W$ if the dual uplink transmit filters $G$ are known. This is possible by calculating $\forall k: B_k$ directly from $G_1..G_K$, and then determining $A_k$ and
[Figure 3.5 plot: uplink capacity regions and downlink capacity region; axes R1 and R2 in bit/channel use.]
3.5.2
3.5.3
[Figure 3.6 plot: R2 vs. R1 in bit/channel use; curves for no coop., no coop. (HK), full coop. (MMSE) and full coop. (DPC), each under a sum power and a per-antenna power constraint.]
Figure 3.6 (Inner bounds on) capacity regions for no or full cooperation in the
downlink.
3.5.4
Numerical Example
Fig. 3.6 shows inner bounds on capacity regions for no BS cooperation and capacity regions for full BS cooperation, for the same example channel as before, i.e. $M = K = 2$, $N_{bs} = N_{ue} = 1$, $H = [1, 0.25, 0.5i, 1]$, $\sigma^2 = 0.1$ and either a unit per-antenna power constraint, or a sum power constraint of 2 (i.e. in both cases $\mathrm{tr}\{\Phi_{ss}\} \le 2$). In the non-cooperative case, an inner bound on the capacity region is in principle based on all possible assignments of UEs to BSs, including the option of one BS transmitting to both UEs, though this is not beneficial for this particular channel. We again also observe HK schemes, but can see that these are only interesting under a per-antenna power constraint. In general, the difference between sum and per-antenna power constraint is only visible at the sides of the capacity regions, while the sum-rate remains largely unaffected, especially under non-linear precoding (DPC).
3.6
Summary
In this chapter, we have formalized the uplink and downlink transmissions
considered throughout the remainder of the book, and introduced the basic
information-theoretic concepts inherent in the many degrees of freedom of CoMP.
We have seen how capacity regions can be computed for uplink and downlink
under full base station cooperation, and inner bounds on these (or achievable rate
regions) can be computed for cases of no BS cooperation, as capacity remains
unknown here. While all computations are rather straight-forward for the uplink,
we have seen that uplink/downlink duality can be used to also make the downlink more mathematically amenable. The results in this chapter already suggest
substantial rate gains through multi-cell joint signal processing, but this will be
analyzed in more detail in the next chapter.
4.1
4.1.1
(4.1)
channel estimation related noise term
where we can see that the channel estimation error leads to an additional noise
term Es. Equation (4.1) implies that if the channel and its estimate are assumed
block-static (as defined for the former in Chapter 3), then the estimation error
is also block-static. If we now observe the average capacity of the transmission
over many transmission blocks, the impact of channel estimation noise can be
(4.3)
ss
Note that, different from the MAC capacity region under perfect channel state information (CSI), it is now not optimal anymore to let all UEs transmit at maximum power. This is because $\Phi_{hh}$ itself is a function of the transmit covariance $\Phi_{ss}$, hence increasing one UE's power also increases the residual channel estimation related noise that impairs successive interference cancelation (SIC) performance. This is the reason why the MAC capacity region under imperfect CSI is not a pentagon anymore, see [MF09b].
The question is now how $\Phi_{hh}$ and $\hat{H}$ can be modeled for a channel realization $H$ and a particular channel estimation scheme. One option is to observe the absolute (i.e. link-independent) channel estimation error variance, which can be obtained via the Cramer-Rao lower bound [Kay93] as
\[ \sigma_E^2 = \frac{\sigma_p^2}{N_p \, p_{pilots}}. \qquad (4.5) \]
Here, $\sigma_p^2$ denotes the variance of the noise the channel estimation is subject to, $N_p$ is the number of pilots used to obtain CSI, and $p_{pilots}$ is the pilot power. Note that $\sigma_p^2$ may deviate from $\sigma^2$ if, e.g., pilot sequences of multiple cells are designed to be orthogonal, while data transmission in these cells is subject to mutual interference not addressed by CoMP (hence leading to an increased background noise $\sigma^2 > \sigma_p^2$). With the definition of $E$ in (4.1), we can now state
\[ E\left\{ e_{i,j} (e_{i,j})^H \right\} = \frac{E\{|h_{i,j}|^2\} \, \sigma_E^2}{E\{|h_{i,j}|^2\} + \sigma_E^2}, \qquad (4.6) \]
from which the calculation of $\Phi_{hh}$ is straightforward. Note that some authors derive the impact of channel estimation noise using a different model where $\hat{H} = H + E$ [PSS04, MF09b], i.e. where the estimated channel and estimation error are assumed correlated, but obtain the same final result as in (4.4). Now we still have the problem that (4.4) states an inner bound on the capacity region for a given channel estimate $\hat{H}$, assuming that the actual channel $H$ is fluctuating around this. In most cases, however, we want to observe the opposite case, i.e. the capacity of the transmission over an actual channel $H$ under imperfect CSI. It is discussed in [Mar10] that this can be approximated by replacing $\hat{H}$ in (4.4) by the expectation value of the channel estimate, which under the assumption of an unbiased MMSE detector is simply an element-wise scaled version $\bar{H}$ of the actual channel $H$ with [MF09b]
\[ \forall i,j: \ \bar{h}_{i,j} = \frac{h_{i,j}}{\sqrt{1 + \sigma_E^2 / E\{|h_{i,j}|^2\}}}. \qquad (4.7) \]
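The chain from pilot configuration to effective channel in (4.5)-(4.7) can be illustrated with a few lines of code. This is a minimal sketch: the function names and the numerical values (pilot noise variance, link gains) are assumptions made for illustration only, and the formulas follow the equations as read above.

```python
import math

def estimation_error_variance(sigma_p2, n_pilots, pilot_power):
    """Link-independent estimation error variance, following (4.5)."""
    return sigma_p2 / (n_pilots * pilot_power)

def link_error_variance(mean_gain, sigma_e2):
    """Per-link error variance E{|e_ij|^2}, following (4.6)."""
    return mean_gain * sigma_e2 / (mean_gain + sigma_e2)

def effective_channel(h, mean_gain, sigma_e2):
    """Element-wise scaled channel coefficient, following (4.7)."""
    return h / math.sqrt(1.0 + sigma_e2 / mean_gain)

# Hypothetical setup: N_p = 2 pilots of unit power, pilot noise variance 0.1
sigma_e2 = estimation_error_variance(0.1, 2, 1.0)

# A strong link (mean gain 1) and a weak interference link (mean gain 0.0625)
scale_strong = abs(effective_channel(1.0, 1.0, sigma_e2))
scale_weak = abs(effective_channel(0.25, 0.0625, sigma_e2)) / 0.25
print(sigma_e2, scale_strong, scale_weak)
```

In this example the weak link is scaled down noticeably more than the strong one, which matches the observation that weak interference links are the first to become unusable under imperfect CSI.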
4.1.2
(4.8)
=H
b
hi,j and E ei,j
E hi,j . (4.9)
i, j : hi,j = 1 2
=2
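The downlink transmission equation from (3.6) can then be modified to
\[ y = (\hat{H}^{BS})^H s + v + u + n, \qquad (4.10) \]
\[ R_k(W,\pi) \le \log_2 \left| I + \left( \sigma^2 I + \Phi_{ii}^k + \Phi_{hh}^k + \Phi_{hh}^{BS,k} \right)^{-1} (\hat{H}_k^{BS})^H W_k W_k^H \hat{H}_k^{BS} \right| \]
\[ \text{with} \quad \Phi_{ii}^k = \sum_{\pi(j)>\pi(k)} (\hat{H}_k^{BS})^H W_j W_j^H \hat{H}_k^{BS}. \qquad (4.11) \]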
Here, $\Phi_{ii}^k$ is the residual inter-user interference, $\Phi_{hh}^k$ is the part of matrix $\Phi_{hh}$ that is connected to UE $k$, denoting noise due to imperfect CSI at transmitter and receiver side, and $\Phi_{hh}^{BS,k}$ is noise due to additional CSI imperfectness at the transmitter side. As in the uplink, the capacity region can again be approximated by replacing $\hat{H}^{BS}$ in (4.11) by the average BS-side channel estimate $\bar{H}^{BS} = E\{\hat{H}^{BS}\}$ with $\bar{h}_{i,j}^{BS} = h_{i,j} \sqrt{(1 - 2^{-N_b}) / (1 + \sigma_E^2 / E\{|h_{i,j}|^2\})}$. It has been shown in [MF09a, Mar10] that uplink/downlink duality is still applicable to the capacity region in (4.11), where the dual uplink is then subject to a particular extent of CSIR. Unfortunately, the calculation of dual uplink precoders $G_k$ and the power distribution among UEs is now no longer a convex problem, but it is still numerically more tractable (e.g. through a brute-force search) than trying to solve (4.11) directly. Duality can also be used to observe non-cooperative performance under imperfect CSI and various power constraints [Mar10].
[Figure: scenario sketches with two and three BSs (BS1, BS2, BS3) and UEs (UE1, UE2, UE3), showing the distances d1, d2, d3, the inter-site distance and the cell edge.]
In the rest of this chapter, we use values of $N_p = 2$ and $N_b = 6$, which have been shown in [MRF10, Mar10] to be representative for the performance in an OFDMA system with the pilot structure of LTE Release 8, a UE speed of 3 km/h, a maximum delay spread of 1 µs, and a CSI feedback delay of 3 ms.
4.2
[Figure 4.2 plot: x-axis d1 = d2 = d3 from 0.2 to 0.6; curves for full coop., no coop. (IRC+SIC), no coop. (IRC) and no coop. (MRC), each under perfect and imperfect CSI (Np = 2); cell-edge gains annotated as +28%, +20% and +90%.]
Figure 4.2 Uplink joint signal processing gains for scenarios with M = K = 3.
In the uplink, shown in Fig. 4.2, we assume multi-cell power control, where the transmit power of each UE is adjusted w.r.t. a certain target average receive power at all BSs. Target receive power and noise variance are chosen such that a single-input single-output (SISO) signal-to-noise ratio (SNR) of 10 dB is obtained at the cell-edge. We compare the following schemes:
- non-cooperative detection, based on maximum ratio combining (MRC) (i.e. considering interference as spatially white noise),
- non-cooperative detection, based on interference rejection combining (IRC) (i.e. taking the spatial properties of interference into account),
- non-cooperative detection, allowing a flexible assignment of UEs to BSs and the joint detection of multiple UEs at the same BS, and
- fully-cooperative joint detection of all UEs by all BSs.
The strongest rate gains can be obtained at the cell-edge [KRF07]. Here, using IRC to exploit the spatial color of interference (but without BS cooperation) already yields a 28% rate increase. A further 20% is possible if local SIC is used, i.e. interference subtraction that also does not require BS cooperation. The strongest gains, however, with an additional 90%, are visible if the BSs jointly process all UEs, profiting from array and spatial multiplexing gain and yielding MAC performance. The gain of local, non-cooperative interference subtraction disappears quickly as we move away from the cell-edge, as enabling the decoding and subtraction of interference poses constraints on the rates of interferers [HK81]. Han-Kobayashi (HK) techniques (superposition coding and partial interference decoding) would yield marginal rate improvements here, and are strongly sensitive to imperfect CSI. Towards the cell-centers, all stated gains strongly diminish, especially under imperfect CSI, as then the interference links cannot be estimated well enough to be exploited. At the cell-edge, however, the relative rate improvements due to full cooperation increase for decreasing CSI, as additional diversity alleviates the impact of channel estimation errors [Mar10].
[Figure 4.3 plot: x-axis d1 = d2 = d3 from 0.2 to 0.6; curves for the downlink schemes under a per-antenna power constraint; cell-edge gains annotated as +55%, +11% and +86%.]
Figure 4.3 Downlink joint signal processing gains for scenarios with M = K = 3.
In the downlink, shown in Fig. 4.3, we assume that the transmit power is fixed, such that we obtain a SISO SNR at the cell-edge of 10 dB. The compared schemes are analogous to those in the uplink, namely
- non-cooperative transmission, based on maximum ratio transmission (MRT),
- non-cooperative transmission, based on interference-aware precoding (IAP),
- non-cooperative transmission, allowing a flexible assignment of UEs to BSs and joint transmission from one BS to multiple UEs, possibly using local DPC,
- fully-cooperative joint transmission from all BSs to all UEs, possibly with DPC.
The difference between linear precoding and DPC is shown through empty and filled markers, respectively. We can see similar effects as in the uplink. The cooperation gain is again largest at the cell-edge, with 55% improvement due to interference-aware precoding, 11% additionally due to the option of local, non-cooperative multi-UE transmission with DPC, and another 86% if full joint transmission is employed. Under imperfect CSI, DPC is only (marginally) superior to linear precoding at the cell-edge, as interference links can otherwise not be estimated accurately enough. This is also the reason why local multi-UE transmission with DPC is less beneficial than its counterpart (local SIC) in the uplink. The small gap between linear and non-linear techniques under full cooperation, also under perfect CSI, is due to the fact that the compound channel already enables a fairly good spatial separation of the UEs without DPC. In general, the relative gain of cooperation remains more or less the same, regardless of the extent of CSIT [Mar10]. The reason is that both cooperative and non-cooperative transmission degrade equally for diminishing CSIT.
4.3
4.3.1
p1 h
1 p2 h
h
(4.12)
R1 log2 I + 2 I + 1hh + h
2
2
1
1
2 H 1 2 2 H
2 p2 h
p1 h
(4.13)
R1 + log2 I + 2 I + 2hh + h
h
2
2
1
1
zero if source coding is not considered
1 2 2 H
p2 h
R2 log2 I + 2 I + 2hh
h
.
2
2
(4.14)
[Figure: uplink cooperation schemes for two BSs and two UEs. The panels show the exchange of decoded data or quantized interference of UE 1 between the BSs, and the forwarding of quantized receive signals or soft bits, with the decoded data of each UE or of both UEs passed on to the network.]
1 m|l=m
log2 I + m
yy
,
qq
(4.17)
m=1
m|l
backhaul is optimally invested into the spatial dimensions of the received signals.
For decentralized JD, this is equivalent to letting each BS locally equalize the
interfering UE to obtain a scalar value which is then quantized. As source coding
might be regarded infeasible in practice, we also consider the case where this is
omitted, or a practical quantizer, where one quantization bit is lost per real
signal dimension [LBG80]. In these cases, (4.17) changes to [Mar10]
'
*1
2
m
m 1 m
max N
2,0
m
bs
log2 I + qq
yy or m : qq = 2
1 m
yy
m=1
(4.18)
with 1 + 2 . The rate/backhaul trade-off of decentralized JD can be improved if the backhaul is used successively, and not simultaneously. One BS could forward quantized receive signals, after which the other BS would decode its assigned UE and subtract the corresponding signals from its receive signals before quantizing and forwarding the remaining signals to the former BS. However, this only yields (marginal) gains in interference regimes where the following scheme is superior, anyway, while increasing latency [Mar10].
Centralized Multi-Cell Joint Detection
Let us nally consider the case where one BS quantizes its received signals,
and forwards these to the other BS, where both UEs are then jointly decoded.
Assuming that received signals are forwarded from BS 1 to BS 2, the UE rates
can be stated for a given quantization noise covariance as
'
( 1
)*1
0
H
qq
k
k pk h
(4.19)
Rk log2 I + 2 I + hh +
h
0 0
kS
kS
One benefit of centralized JD, becoming evident from (4.19), is the option of
SIC at the decoding BS. The quantization noise covariance $\Phi_{qq}^1$ has to fulfill
1 1|2
log2 I + 1qq
yy ,
'
*1
1 1
max N 2,0
bs
yy or 1qq = 2
1 1yy , (4.20)
log2 I + 1qq
with or without source coding, or based on practical quantization, respectively. Note that even under perfect CSIR, the rate region is not a polygon anymore, as we have the degree of freedom of assigning different portions of backhaul to the two UEs. This is treated analytically and illustrated in [dS08].
In [SSS07a], centralized JD has been investigated in conjunction with partial local decoding. For the above cooperation direction, this would mean that BS 1 decodes part of its assigned UE's transmission itself, and forwards the remaining received signals to the other BS for joint decoding of the remaining signals from both UEs. While a benefit regarding the rate/backhaul trade-off was reported in [SSS07a], this is only marginally superior to a simple time-share between a decentralized and centralized cooperation strategy [MF08c].
[Figure 4.5 plots: sum-rate vs. sum backhaul [bit/channel use] for Np = 2; curves for DIS, CIF, Dec. JD, Cen. JD and the cut-set bound. Panel (a): symmetrical, strong interference (d1 = 0.5, d2 = 0.5). Panel (b): d1 = 0.4, d2 = 0.2.]
Figure 4.5 Sum-rate vs. backhaul for different uplink cooperation strategies.
4.3.2
Numerical Results
Let us now compare the rate/backhaul trade-offs achievable with the cooperation concepts stated before, again focusing on M = K = 2. In Fig. 4.5, the achievable sum-rate of both UEs is plotted as a function of backhaul, under imperfect CSI with Np = 2. The left case shows a symmetrical interference scenario, where both UEs are at the cell-edge (i.e. d1 = d2 = 0.5), while the right case represents an asymmetrical scenario of weaker interference (d1 = 0.4, d2 = 0.2). For each scheme, multiple lines show the range between the best rate/backhaul trade-off achievable in theory (upper left) and under practical considerations (lower right). The dashed line indicates the cut-set bound [CT06], i.e. the sum-rate achieved if every bit of backhaul leads to an equivalent sum-rate increase until MAC performance is reached. Only centralized JD asymptotically achieves MAC performance for a large backhaul, due to the full extent of spatial multiplexing, array and interference cancelation gain. At the cell-edge (Fig. 4.5(a)), the scheme outperforms all others, and source coding is highly beneficial due to strong signal correlation. Decentralized JD also shows good asymptotic performance, but lacks the option of SIC. One could argue that each BS could also decode the interference as well, but then it would suffice to perform cooperation only in one direction, i.e. do centralized JD requiring less backhaul. However, such a strategy may still be interesting from a signaling perspective, see Section 11.2. For the cell-edge case, there is no benefit of using DIS or CIF, as both BSs can independently decode the interference and subtract this before decoding their UEs, without requiring backhaul at all. In the asymmetrical case of weaker interference (Fig. 4.5(b)), the story changes. Besides lacking array and spatial multiplexing gain, decentralized schemes can now offer an improved rate/backhaul trade-off in regimes of low backhaul. DIS appears especially attractive here, as BS 1 can decode its assigned UE at moderate interference, while the extent of interference cancelation enabled by the exchange of decoded bits over the backhaul is large.
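The cut-set bound described above is easy to state as a function of the backhaul. The following is a minimal sketch: the function name and the numerical rates are hypothetical, and taking the non-cooperative sum-rate as the starting point at zero backhaul is an assumption of this example.

```python
def cut_set_sum_rate(backhaul, rate_no_coop, rate_mac):
    """Sum-rate bound in bit/channel use: each backhaul bit adds one bit
    of sum-rate until the fully cooperative (MAC) sum-rate is reached."""
    return min(rate_no_coop + backhaul, rate_mac)

# Hypothetical rates in bit/channel use, not read from Fig. 4.5
rate_no_coop = 4.0
rate_mac = 7.5
for backhaul in (0, 2, 4, 6):
    print(backhaul, cut_set_sum_rate(backhaul, rate_no_coop, rate_mac))
```

A practical scheme can then be judged by two properties: whether its curve follows the unit slope of this bound at low backhaul, and whether it reaches the flat part at large backhaul.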
[Figure 4.6 plot: best cooperation concept (DIS, CIF, Dec. JD or Cen. JD) over UE positions d1 and d2, each from 0.3 to 0.6; darker areas mark rate advantages of more than 10%.]
Figure 4.6 Best uplink cooperation concept for a backhaul capacity of 4 bpcu.
This is confirmed in Fig. 4.6, where the best cooperation concept is shown as a function of UE location, for an exemplary backhaul of 4 bpcu. While centralized JD is best for strong, possibly asymmetric interference, DIS is superior for weaker interference and a constrained backhaul. CIF and decentralized JD are only interesting in regimes of very weak interference, where we know from Fig. 4.2 that expected CoMP gains are small, anyway. The results suggest that a practical system should switch between centralized JD and DIS depending on the interference situation. This is emphasized by the darker areas in Fig. 4.6, where one strategy yields more than 10% larger rates than the other.
One may wonder why all schemes presented before actually yield a rate/backhaul trade-off far away from the cut-set bound. This question enables an interesting insight into fundamental properties of the compared schemes:
- While centralized JD asymptotically achieves MAC performance, it fails to meet the slope of the cut-set bound, as a certain extent of backhaul is wasted on the quantization of noise [dS08]. In fact, the cut-set bound is approached if source coding is applied and the SNR approaches infinity [dS08, Mar10].
- DIS, however, usually does not meet the flat part of the cut-set bound, as it lacks spatial multiplexing and array gain, and the first UE decoded does not profit from cooperation at all. It also usually fails to meet the slope of the cut-set bound, as the entropy of the data handed over the backhaul is mostly larger than the rate gain due to interference cancelation. An exception is the Z-interference channel [ZY08] with d1 = 0.5, d2 = 0, where (with source coding) every backhaul bit indeed yields exactly one bit of sum-rate increase, until asymptotic DIS performance is reached [GMF09].
4.3.3
4.4
[Figure 4.7: downlink joint transmission concepts for two BSs and two UEs, differing in where user data and CSI are available. (a) Centralized. (b) Distributed. (c) Decentralized.]
data to be transmitted to all jointly served UEs (or the BSs distribute this data among each other), such that all BSs calculate their part of the precoding matrix W independently. A crucial aspect, however, is the fact that all BSs now require global CSI. This can be assured by either exchanging channel information between BSs, or by designing the CSI feedback from the terminal side such that all involved BSs can individually decode this. A distributed downlink JT scheme is observed in Sections 6.3, 13.3 and 13.4.
The aforementioned CSI requirement can be alleviated if decentralized downlink JT is performed, as shown in Fig. 4.7(c). In this case, the involved BSs may have strongly different extents and accuracies of CSI, and different extents of knowledge on UE data bits, but still contribute to the transmission through local precoding. Such a scheme is considered in Section 6.4.
4.5
Summary
In this chapter, we have extended the models from Chapter 3 to observe multi-cell joint signal processing under imperfect channel knowledge. In both uplink and downlink, we have seen that in representative scenarios of up to 3 cooperating base stations, spectral efficiency gains of more than 100% are conceivable at the cell-edge, while these gains strongly decrease towards the cell-center. Further considering a limited backhaul capacity between base stations has revealed a major trade-off in the uplink: Either backhaul is used efficiently, but only a limited extent of capacity gain is achieved, or backhaul is wasted on the quantization of noise, but the maximum gain is obtained. In the downlink, we have discussed three different joint transmission concepts that differ in how user data, channel knowledge and precoding are distributed among the base stations.
Part II
Practical CoMP Schemes
5.1
5.1.1
Introduction
Transmission with multiple antennas both at the transmitting and receiving ends
of a wireless link has become increasingly mature in recent years. From theory,
the fundamental capacity gain of the multiple-input multiple-output (MIMO)
radio link, being proportional to the minimum of the number of transmit and
receive antennas, is well understood for an isolated point-to-point link. Under
perfect channel knowledge at transmitter and receiver, a capacity-achieving strategy is to flexibly invest transmit power into the eigenmodes of the channel via water-filling (see Section 3.4.1). In practical systems, however, under imperfect channel knowledge and a limited granularity of power allocations and modulation and coding schemes (MCSs), one typically switches between the two fundamental transmission modes spatial multiplexing (SMUX) and spatial diversity (SDIV) [ZT03] depending on the current channel state, in order to improve the error rate performance for fixed data rate transmission [HP05b] or to increase the spectral efficiency [SAH+04].
To enable ubiquitous broadband wireless access, MIMO transmission must be made robust against multi-cell interference. However, it is not fully evident yet how the potential capacity gains of MIMO can be realized under these conditions. In fact, early results obtained for a small set of linear transceiver settings, i.e. number of antennas, equalization and precoding strategy, indicate only small gains for SMUX over SDIV systems [CDG00]. The achievable spectral efficiency may be enhanced by incorporating multi-user MIMO (MU-MIMO) into system design and thus turning the focus to multi-user links [GKH+07]. However, BSs would require coherent channel state information (CSI) to optimally serve their users in MU-MIMO, which is difficult to obtain in frequency division duplex (FDD) systems, as a high rate feedback link would be required from the terminals to the base stations.
Further, fair resource assignment is mandatory in cellular networks in order to guarantee radio access for all users. The multi-path structure of signal and interference channels may be used beneficially in this interference-aware scheduling process. Supplemental to the time-domain scheduling already used in today's radio systems, groups of frequency resources may be assigned to the users according to their frequency-selective signal-to-interference-and-noise ratio (SINR) conditions. In this case, users may beneficially be assigned to their best resources.
This section targets a practical solution for decentralized interference management. The key to success is a predictable interference scenario at the receiver side, which also helps to improve the link adaptation process. Thus, we consider using fixed beams (i.e., fixed sets of possible precoding vectors) for transmission as depicted in Fig. 5.1. In particular, terminals are assumed to report their preferred precoding matrix indicators (PMIs) in combination with corresponding post-equalization SINRs via a low-rate feedback channel. For the equalization at the UE, comprehensive channel knowledge on the radio system is required, which may be obtained by multi-cell channel estimation based on pilot symbols, as discussed in Section 9.1. Therefore, downlink transmission has to be synchronized [JWS+08]. With this approach, we demonstrate substantial throughput gains for MIMO systems in multi-cell environments, similar to those known for point-to-point links. We further indicate potential performance gains under the influence of imperfect channel estimation in systems with non-synchronized and synchronized BSs.
[Figure 5.1: two BSs using fixed-beam precoding to serve UE 1 and UE 2, which employ IRC receivers and report PMI / CQI feedback; each UE is subject to interference from other cells.]
5.1.2
sparse, as each column connected to one UE and one stream may only have non-zero entries connected to the antennas of one BS.
In the sequel, let us observe one UE $k$ which is served by BS $m = k$. While set $\mathcal{K}$ captures all $K$ UEs, we denote as $\mathcal{K}_m$ the set of all UEs served by BS $m$ simultaneously on the same resource, which is obviously limited to the number of BS transmit antennas, i.e. $|\mathcal{K}_m| \le N_{bs}$. All received signals of our observed UE $k$ can be expressed as
\[ y_k = \underbrace{(H_k^m)^H W_k^m}_{\bar{H}_k} x_k + \underbrace{\sum_{j \in \{\mathcal{K}_m \setminus k\}} (H_k^m)^H W_j^m x_j}_{\text{intra-cell interference } \zeta_k} + \underbrace{\sum_{j \in \{\mathcal{K} \setminus \mathcal{K}_m\}} (H_k)^H W_j x_j + n_k}_{z_k}, \qquad (5.1) \]
where $H_k$ is the channel between UE $k$ and all BSs, $W_k$ is the compound precoding vector used to serve UE $k$, and $H_k^m$ and $W_k^m$ are the sub-portions of these matrices or vectors connected to BS $m$, as introduced in Chapter 3. We write as $\bar{H}_k$ the effective channel between UE $k$ and its serving BS after precoding, which consists of one column for each of the $N_{ue}$ streams the UE may potentially receive, i.e. $\bar{H}_k = [\bar{h}_{k,1} \dots \bar{h}_{k,N_{bs}}]$. The corresponding potential data streams stacked in $x_k$ with $x \sim \mathcal{N}_C(0, I)$ are distorted by the intra-cell and inter-cell interference and noise aggregated in $\zeta_k$ and $z_k$, respectively. Each BS $m$ may select a limited number $Q_m \le N_{bs}$ of active beams to serve one user with multiple beams or multiple users simultaneously. This is done by choosing the corresponding columns of the BS $m$-related precoding matrix $W^m$ from the columns of a pre-defined beam set $\mathcal{C}_i^m$. In the case of $N_{bs} = 2$, a beam set size of 2 and discrete Fourier transform (DFT)-based precoding, this can be either
\[ \mathcal{C}_1^m = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix} \quad \text{or} \quad \mathcal{C}_2^m = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \qquad (5.2) \]
Columns in $W^m$ representing streams that are not used are simply filled with zeros. Note that $W^m$ has to be scaled depending on the choice of $Q_m$ in order to fulfill a per base station power constraint, i.e. $\mathrm{tr}\{W^m (W^m)^H\} \le P_m$. If only one beam is active, i.e. $Q_m = 1$, we name it single stream (SS) mode, while for $Q_m > 1$, we refer to it as multiple stream (MS) mode.
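The beam selection and power scaling described above can be sketched in a few lines. This is an illustration under stated assumptions: the two beam sets are taken as $\frac{1}{\sqrt{2}}[1\ 1;\ i\ {-i}]$ and $\frac{1}{\sqrt{2}}[1\ 1;\ 1\ {-1}]$, the power is split equally among active beams, and the function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)
# Two DFT-based beam sets for N_bs = 2; each column is one unit-norm beam.
BEAM_SET_1 = [[1 / SQRT2, 1 / SQRT2], [1j / SQRT2, -1j / SQRT2]]
BEAM_SET_2 = [[1 / SQRT2, 1 / SQRT2], [1 / SQRT2, -1 / SQRT2]]

def build_precoder(beam_set, active, power):
    """Return W^m with unused columns set to zero and tr{W W^H} = power.

    `active` lists the indices of the Q_m selected beams; the power is
    split equally among them (an assumption of this sketch).
    """
    n_rows, n_cols = len(beam_set), len(beam_set[0])
    scale = math.sqrt(power / len(active))  # every beam has unit norm
    return [[beam_set[r][c] * scale if c in active else 0.0
             for c in range(n_cols)] for r in range(n_rows)]

def total_power(w):
    """tr{W W^H}, i.e. the sum of squared magnitudes of all entries."""
    return sum(abs(x) ** 2 for row in w for x in row)

w_ss = build_precoder(BEAM_SET_1, active=[0], power=1.0)     # SS mode, Q_m = 1
w_ms = build_precoder(BEAM_SET_2, active=[0, 1], power=1.0)  # MS mode, Q_m = 2
print(total_power(w_ss), total_power(w_ms))
```

Both precoders use the full power budget, so switching from SS to MS mode halves the power per stream in this sketch.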
5.1.3
Linear Receivers
Assuming that a linear equalizer $g_{k,u}$ is employed to extract the useful signal from $y_k$ connected to stream $u$, this yields a post-equalization SINR given by
\[ \mathrm{SINR}_{k,u} = \frac{g_{k,u}^H \bar{h}_{k,u} \bar{h}_{k,u}^H g_{k,u}}{g_{k,u}^H Z_{k,u} g_{k,u}} \qquad (5.3) \]
(5.4)
where $R_{yy,k}$ denotes the covariance matrix of the received signal $y_k$, i.e.
\[ R_{yy,k} = E\left\{ y_k (y_k)^H \right\} = \bar{H}_k \bar{H}_k^H + E\left\{ (\zeta_k + z_k)(\zeta_k + z_k)^H \right\}. \qquad (5.5) \]
\[ \mathrm{SINR}_{k,u}^{MMSE} = \bar{h}_{k,u}^H Z_{k,u}^{-1} \bar{h}_{k,u}. \qquad (5.6) \]
Based on this SINR, the achievable spectral efficiency is evaluated in a downlink OFDMA multi-cellular simulation environment. For reference purposes, we compare these results with the performance achievable by using a maximum ratio combining (MRC) receiver
\[ g_{k,u}^{MRC} = \bar{h}_{k,u} \]
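The difference between the two receivers can be made concrete with a small numerical example for the SINR expression (5.3). This is a sketch under assumptions: the channel vector, the single interferer and the noise variance are arbitrary illustrative values, and the interference-aware equalizer is taken as $g = Z^{-1} \bar{h}$, one common form that attains (5.6), since (5.4) is not reproduced here.

```python
def inner(a, b):
    """Inner product a^H b of two complex vectors."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def matvec(m, v):
    """Matrix-vector product for a nested-list matrix."""
    return [sum(row[c] * v[c] for c in range(len(v))) for row in m]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def sinr(g, h, z):
    """Post-equalization SINR as in (5.3): |g^H h|^2 / (g^H Z g)."""
    useful = abs(inner(g, h)) ** 2
    disturbance = inner(g, matvec(z, g)).real
    return useful / disturbance

h = [1 + 0j, 0.5j]           # effective channel of the desired stream
q = [0.8 + 0j, 0.6 + 0j]     # effective channel of one interferer
noise_var = 0.1
# Z = q q^H + noise_var * I: interference plus noise covariance
z = [[q[r] * q[c].conjugate() + (noise_var if r == c else 0.0)
      for c in range(2)] for r in range(2)]

g_mrc = h                     # MRC ignores the structure of Z
g_mmse = matvec(inv2(z), h)   # interference-aware equalizer (assumed form)

sinr_mrc = sinr(g_mrc, h, z)
sinr_mmse = sinr(g_mmse, h, z)
print(sinr_mrc, sinr_mmse)
```

With these values the interference-aware receiver achieves a clearly higher SINR than MRC, because it places a partial null in the direction of the interferer.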
5.1.4
+
+2
+ H
+
+hk,u hk,u +
H
(5.7)
(5.8)
46
2
Frequency-at i.i.d. interference power IF
,
, k,u = Eq
Z
|hj,v (q)|2 |hk,u (q)|2 + 2 I
(5.9)
j,v
2
Frequency-selective i.i.d. interference power IF
, (q)|2 + 2 I
, k,u (q) =
|hj,v (q)|2 |h
Z
k,u
(5.10)
j,v
(5.11)
1
,
h
c (n)yk (n)
k,u =
N n=0 k,u
, ,H
,H
, h
, k,u =
Z
hj,v hj,v h
k,u k,u .
(5.12)
(5.13)
j,v
5.1.5
[Figure 5.3 panels: (a) U = 2 users, (b) U = 20 users.]
Figure 5.3 Rate allocation across two data streams, if the scheduler may choose from U = 2 or U = 20 users [TSWJ09]. © 2009 IEEE.
for each PRB individually: First, each beam available per transmission mode is
assigned to the user providing the minimum, i.e. best, score for that beam. Then,
the mode is selected which corresponds to the minimum overall user score.
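The two steps above can be sketched as follows. This is a minimal illustration and not the scheduler of [TSWJ09]: the data layout `scores[mode][beam][user]`, the function name and the example scores are hypothetical, and "minimum overall user score" is interpreted here as the smallest score among the users assigned in a mode.

```python
def allocate_prb(scores):
    """Pick mode and beam-to-user assignment for one PRB.

    scores[mode][beam][user] holds the score of each user on each beam;
    lower scores are better.
    """
    best_mode, best_assignment, best_score = None, None, None
    for mode, beams in scores.items():
        # Step 1: assign each beam of this mode to its minimum-score user
        assignment = {}
        mode_score = None
        for beam, user_scores in enumerate(beams):
            user = min(range(len(user_scores)), key=user_scores.__getitem__)
            assignment[beam] = user
            if mode_score is None or user_scores[user] < mode_score:
                mode_score = user_scores[user]
        # Step 2: keep the mode with the minimum overall user score
        if best_score is None or mode_score < best_score:
            best_mode, best_assignment, best_score = mode, assignment, mode_score
    return best_mode, best_assignment

# Hypothetical scores for three users: one beam in SS mode, two in MS mode
example_scores = {"SS": [[3, 1, 4]], "MS": [[2, 5, 6], [7, 2, 3]]}
result = allocate_prb(example_scores)
print(result)
```

Because the decision is taken per PRB from locally available scores, the same routine can be run independently for every resource in time, frequency or space.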
The objective of this score-based resource allocation process is to assign each user to its best resources, and the decision on the spatial mode is taken under the premise of achieving a high throughput for each user. Clearly, the process is of heuristic nature, and hence the global scheduling target of assigning each user an equal amount of resources is achieved only on average, or if the number of available resources tends to infinity. However, its convenient property for practical applications is its flexible utilization, as the set of resources can be defined over arbitrary dimensions (time/frequency/space). Thus, fairness w.r.t. an equal amount of resources for all active users can be established on a small time-scale, e.g. even for the scheduling of resources contained within a single orthogonal frequency division multiplex (OFDM) symbol.
An illustration of the performance achievable by the score-based scheduler is given in Fig. 5.3. It depicts the histogram of normalized achievable user rates in the rate region plane for two UEs which may be scheduled in each PRB. In particular, we assume two spatial layers to be available in each PRB (i.e. Nbs = 2), allowing two users to be served simultaneously in MU-MIMO mode. The rate allocated to each of these two users is normalized to the rate it would achieve if the PRB was assigned exclusively to it. Fig. 5.3(a) shows the distribution of normalized rates if the total number of users to select from is limited to U = 2, while Fig. 5.3(b) refers to the case U = 20. From both figures, it is clearly seen that the achievable rates lie beyond the time division multiple access (TDMA) rate region (dashed line in the rate region plane). For an increasing number of UEs, the histogram is more and more concentrated in the upper right corner of the rate region. This shows that the heuristic score-based scheduling approach significantly outperforms TDMA scheduling and conveniently achieves high user rates by properly utilizing MU-MIMO.
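With rates normalized as described, time-sharing a PRB between the two users can reach at most a normalized rate sum of one, so a rate pair can be tested against the TDMA region directly. The helper below is a small sketch; its name and the example rate pairs are hypothetical and not read from Fig. 5.3.

```python
def beyond_tdma(norm_rate_1, norm_rate_2):
    """True if a normalized rate pair lies outside the TDMA region
    r1 + r2 <= 1, i.e. simultaneous transmission beats time-sharing."""
    return norm_rate_1 + norm_rate_2 > 1.0

# Hypothetical normalized rate pairs
print(beyond_tdma(0.7, 0.6))  # both users keep most of their exclusive rate
print(beyond_tdma(0.5, 0.4))  # time-sharing would do at least as well
```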
5.1.6
Single-Cell Performance
Initial performance evaluation is carried out for a xed system setting in
an isolated cell (i.e., zk = nk in (5.1)), where K UEs, each equipped with
Nue = 2 receive antennas, communicate with a dual-antenna BS (Nbs = 2).
The evaluation environment is based on the spatial channel model extended
(SCME) [3GP07a], and full CSIR is assumed. We investigate the probabilities of
mode selection depending on the mean signal-to-noise ratio (SNR) conditions,
which are depicted in Fig. 5.4 for 2 or 10 users, respectively. Note that resources
where a rate cannot be supported by any user are not assigned by the scheduler.
For that reason, the selection probability of SS mode drops down to 75% at
Ps /N0 = 5 dB in the first case. Three different configurations of the adaptive
mode switching system are considered here:
Figure 5.4 Probability of the selection of multiple stream (MS) mode vs.
SNR [TSWJ09]. © 2009 IEEE. Panels: (a) U = 2 users, (b) U = 10 users.
Curves: SU-MIMO; MU-MIMO; MU-MIMO, 2 beam sets. Horizontal axis in dB,
vertical axis from 0 to 1.0.
Parameter                    Value
channel model                3GPP SCME
scenario                     urban-macro with scenario-mix (a)
traffic model                full buffer
carrier frequency fc         2 GHz, frequency reuse 1
system bandwidth             18 MHz, 100 PRBs
inter-site distance (ISD)    500 m
number of sites              19 having 3 cells each
Nbs; antenna spacing         1, 2, 4; 4λ
transmit power               46 dBm
sectorization                triple, with FWHM of 68°
BS height                    32 m
Nue; antenna spacing         1, 2, 4; λ/2
UE height                    2 m
CQI granularity              1 PRB
feedback delay               0 ms
channel estimation           as specified in text

(a) Note, a mobile terminal might experience different propagation scenarios, i.e. line-of-sight (LOS) and non line-of-sight (NLOS), to distinct BSs.
5.1.7
Figure 5.5 Idealistic system performance for the SISO, MIMO 2 × 2 (Nbs × Nue),
4 × 2, 2 × 4 and 4 × 4 system for 20 users per cell or sector. Dashed lines indicate the
performance achievable with {2, 4} beam sets [TSWJ09]. © 2009 IEEE.
Panels: (a) Spectral efficiency, CDF vs. spectral efficiency [bit/s/Hz/cell];
(b) CDF vs. throughput [Mbit/s/user]. Curves: SISO; adapt., 2x2; adapt., 4x2;
adapt., 2x4; adapt., 4x4.
= 2.88 and = 3.43 for the MIMO 2 × 2 (Nbs × Nue), 2 × 4 and 4 × 4 system.
We can observe only small additional capacity gains for systems with Nbs > Nue
compared to a system with Nbs = Nue. This is mainly caused by the constraint
of DFT-based precoding, where the total transmit power is distributed evenly
over all antennas. In contrast, the system with Nbs < Nue benefits from advanced
capabilities for interference suppression and higher receive diversity. This enables
the system to achieve larger scaling factors, e.g. 2.88 for MIMO 2 × 4. The
5th percentile of normalized user throughput, which may serve as a measure to
represent the throughput of cell-edge users, shows similar scaling.
Case 2: All BSs provide multiple fixed unitary beam sets. Fig. 5.5 (dashed
lines) further indicates the potential capacity gains for allowing the users to
choose from multiple beam sets. Here, the system may profit from an improved
channel quantization, yielding a capacity scaling factor of 2.11 for MIMO 2 × 2
with two beam sets. However, it has to be considered that the PMI
feedback overhead then also doubles from 1 bit to 2 bit.
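The feedback figures quoted above can be reproduced by counting how many beams a UE has to distinguish. The following is a minimal sketch, assuming that each unitary beam set for Nbs = 2 contains two beams and that the PMI indexes one beam across all configured beam sets; the function name `pmi_bits` is hypothetical.

```python
def pmi_bits(num_beam_sets: int, beams_per_set: int) -> int:
    """Bits needed to index one beam among all beams of all beam sets."""
    total_beams = num_beam_sets * beams_per_set
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without floats.
    return (total_beams - 1).bit_length()


# Assumed configuration: two beams per set (Nbs = 2).
bits_one_set = pmi_bits(num_beam_sets=1, beams_per_set=2)
bits_two_sets = pmi_bits(num_beam_sets=2, beams_per_set=2)
print(bits_one_set, bits_two_sets)
```

Under these assumptions, going from one to two beam sets doubles the number of addressable beams and therefore adds one bit to the PMI.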
Interference prediction: Note that considering independent adaptation of
beam sets for all BSs does not influence the received interference covariance
matrix Zk,u, since the Wishart product Wm (Wm)^H equals the scaled identity
matrix if we assume Wm to be unitary. However, changing the power allocation
for different MIMO transmission modes results in a multi-cell system where Zk,u
cannot be predicted at the receiver side. In order to support cell-edge terminals,
we suggest arranging e.g. SS with full base station power in an agreed access
scheme known to the users.
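The unitarity argument can be checked numerically. The sketch below assumes a normalized DFT matrix as one example of a unitary beam set (the helper names are hypothetical) and verifies that the product of the precoder with its Hermitian transpose is the identity, so the interference covariance seen by a UE does not depend on which unitary beam set an interfering BS selects.

```python
import cmath


def dft_matrix(n):
    """Unitary DFT matrix with entries exp(-2*pi*1j*r*c/n) / sqrt(n)."""
    scale = 1.0 / n ** 0.5
    return [[scale * cmath.exp(-2j * cmath.pi * r * c / n) for c in range(n)]
            for r in range(n)]


def gram(w):
    """Return W W^H for a square matrix W given as nested lists."""
    n = len(w)
    return [[sum(w[r][k] * w[c][k].conjugate() for k in range(n))
             for c in range(n)] for r in range(n)]


def is_identity(m, tol=1e-12):
    """True if m equals the identity matrix up to the tolerance."""
    n = len(m)
    return all(abs(m[r][c] - (1.0 if r == c else 0.0)) < tol
               for r in range(n) for c in range(n))


for num_antennas in (2, 4):
    print(num_antennas, is_identity(gram(dft_matrix(num_antennas))))
```

With equal power per beam, the identity is merely scaled by the per-beam power, which is the "scaled identity" referred to in the text.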
Figure 5.6 SINR estimation errors [TSWJ09]. © 2009 IEEE. CDF vs.
SINRest / SINRavail [dB] for MRC and MMSE receivers.
5.1.8
[Figure, caption truncated in the source ("... errors [TSWJ09]." © 2009 IEEE.):
CDF vs. spectral efficiency [bit/s/Hz/cell] for SISO, SU and SS modes.]
5.1.9 Summary
In this section, we have evaluated the gains from using interference-aware,
frequency-selective MU-MIMO scheduling in a cellular network with synchronized
base stations. Terminals were assumed to be able to estimate the coefficients of
their dedicated channel and of a certain number of interfering channels. Two important
observations were made: Efficient MU-MIMO transmission can be achieved by
using fixed unitary precoding, i.e. without the requirement of full channel knowledge.
Further, proper application of the MU-MIMO mode makes it possible to conveniently
serve multiple streams even to users who experience relatively poor SNR conditions.
Thus, the MU-MIMO mode establishes a win-win situation for both low
and high rate users. In addition, it was shown that knowledge of the interference
channels yields a more precise estimation of the achievable SINR compared to the
traditional approach, where interference is assumed white. Thus, CQI feedback
and supported modulation and coding scheme can be matched more accurately.
Acknowledgements
The authors are grateful for financial support from the German Ministry of
Education and Research (BMBF) in the national collaborative project EASY-C
under contract No. 01BU0631.
5.2
schemes are described for the uplink in the following, but in principle they may
be employed in the downlink as well.
Joint scheduling generally belongs to the class of so-called interference coordination techniques, which recently have attracted a lot of research attention due to their potential to efficiently mitigate inter-cell interference and
hence to realize significant performance gains compared to non-cooperative systems [BPG+ 09, ADF+ 09]. The basic idea of interference coordination in general is to let different BSs cooperate with each other in order to control and
account for the inter-cell interference originating from the corresponding cooperating cells. This may be done in either a static or dynamic manner. With
a static approach, there are usually some pre-configured restrictions regarding
the resource allocation, for example, that on some frequency resources no cell-edge users may be scheduled, as is the case for static fractional frequency
reuse [XSX07]. With a dynamic scheme, in contrast, such restrictions are determined on a much shorter time scale and usually by taking the instantaneous
channel conditions into account. In case of dynamic fractional frequency reuse,
for instance, there would be only restrictions on certain frequency resources
when high interference is expected, see for example [FKR+ 09, MMT08]. Clearly,
dynamic interference coordination generally should lead to a better performance
than static approaches, but this comes at the cost of a higher complexity and
possibly a higher backhaul load [BPG+ 09].
A general problem of interference coordination with independent, i.e., cell-specific scheduling is that the scheduling of one user in a certain cell may directly
impose certain restrictions on other cooperating cells and vice versa. Thus, finding the globally optimal solution becomes hardly feasible in practice. This is
also because the imposed restrictions cannot be changed arbitrarily fast due to
the inherent BS-BS signaling delay over the backhaul (see Section 12.2). However, this drawback can be overcome with a global scheduling algorithm that is
applied across all cooperating BSs, taking into account the channel state information (CSI) of all associated user equipments (UEs) in order to find the optimal or at least close-to-optimal allocation of radio resources. Below, we propose
such a centralized cooperative scheduling scheme in Section 5.2.1, with which the
resource allocation as well as the link adaptation is performed jointly by a central
scheduling unit (CSU) for a set of cooperating BSs [FMDS10]. However, since
this requires the signaling of multi-cell CSI from all cooperating BSs to the CSU,
it results in a possibly massive backhaul load. In order to reduce this backhaul
load as well as the signal processing complexity, we therefore propose in a second
step in Section 5.2.2 a novel multi-cell interference prediction scheme. In contrast
to the joint scheduling approach, the scheduling process of the interference prediction scheme is still done independently by each BS as in conventional systems,
and only the inter-cell interference that is expected to occur during a future data
transmission is predicted and then used for improving the link adaptation process [MF10]. This is accomplished by exchanging scheduling information between
[Figure, caption not recovered: central scheduling unit exchanging multi-cell CSI
and scheduling decisions over backhaul links with sites of 3 base stations each
(cells = sectors).]
5.2.1
Figure 5.9 Flow chart of the interference-aware joint scheduling algorithm [FMDS10].
© 2010 IEEE. Steps: HARQ management; ordering of the cooperating BSs; check
whether resource allocation has been performed for all cooperating BSs (yes:
interference-aware joint scheduling completed; no: determine joint scheduling
priorities for current BS, perform resource allocation, and repeat the check).
missions of all associated BSs, and then the actual joint scheduling process is carried out. Since the simultaneous allocation of radio resources to all UEs located
within the respective cooperation cluster would cause a tremendous increase in
computational complexity, we assume in the following that the joint scheduling
procedure is carried out stepwise for each set of UEs assigned to one of the cooperating BSs. This way, the computational effort can be significantly reduced.
However, this entails also that the BSs associated to a certain CSU have to be
ordered by means of a certain fairness criterion in order to sustain fairness among
the various UEs. For that purpose, the long-term cell throughput averaged over
the number of assigned UEs is considered as fairness criterion, which can be
expressed for the m-th BS by
\[
T_{\mathrm{avg},m}(t+1) = \beta\, T_{\mathrm{avg},m}(t) + (1-\beta)\, \frac{T_{\mathrm{inst},m}(t)}{|\mathcal{K}_m|}, \tag{5.14}
\]
where $T_{\mathrm{avg},m}(t)$ denotes the long-term throughput for the m-th BS at the time
interval t, $T_{\mathrm{inst},m}(t)$ the instantaneous throughput, $\beta$ the forgetting factor and
$\mathcal{K}_m$ the set of UEs assigned to BS m. The actual BS ordering is then done
in such a way that the corresponding average long-term throughputs according
to (5.14) are non-decreasing, i.e., the resource allocation always starts with the
BS associated with the lowest long-term throughput, then it is done for the one
with the second smallest one, etc.
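The ordering step can be sketched as follows. This is an illustration only: the forgetting factor is called `beta` here, and the BS identifiers, throughput values and function names are hypothetical.

```python
def update_avg_throughput(t_avg, t_inst, num_ues, beta):
    """One step of (5.14): exponential averaging of per-UE cell throughput."""
    return beta * t_avg + (1.0 - beta) * t_inst / num_ues


def order_base_stations(t_avg_by_bs):
    """Return BS ids sorted by non-decreasing long-term throughput."""
    return sorted(t_avg_by_bs, key=lambda m: t_avg_by_bs[m])


# Hypothetical values for three cooperating BSs (throughput in Mbit/s).
t_avg = {"bs0": 4.0, "bs1": 2.5, "bs2": 3.0}
t_inst = {"bs0": 30.0, "bs1": 10.0, "bs2": 24.0}
num_ues = {"bs0": 10, "bs1": 8, "bs2": 6}
beta = 0.9  # assumed forgetting factor

t_avg = {m: update_avg_throughput(t_avg[m], t_inst[m], num_ues[m], beta)
         for m in t_avg}
order = order_base_stations(t_avg)
print(order)  # resource allocation starts with the first BS in this list
```

In this toy setting, the BS with the lowest averaged per-UE throughput is served first, which is how the ordering sustains fairness across cells.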
Having determined the ordering of the cooperating BSs, the radio resources
are allocated to the various UEs based on the exchanged multi-cell CSI. To this
end, not only the current channel conditions between the UEs and their serving
BSs are taken into account, but also the expected inter-cell interference caused
by assigning these UEs to certain radio resources. Thus, the joint scheduling
priority for the b-th radio resource and k-th UE associated to its serving BS m
can be expressed by
\[
S_{k,b}(t) = G_{k,b,\mathcal{K}_b}(t) + \sum_{j \in \mathcal{K}_b} G_{j,b,\tilde{\mathcal{K}}_b}(t), \quad k \in \mathcal{K}_m, \tag{5.15}
\]
where $G_{k,b,\mathcal{K}_b}(t)$ denotes the scheduling priority for the k-th UE allocated to
the b-th radio resource on which the UEs in set $\mathcal{K}_b$ are already scheduled. Furthermore, $G_{j,b,\tilde{\mathcal{K}}_b}(t)$ indicates the updated scheduling priority for the already
scheduled UE j, taking into account that the k-th UE will be allocated to the
b-th radio resource. In this regard, the updated set of interfering UEs allocated
to the b-th radio resource for the j-th UE is given by
\[
\tilde{\mathcal{K}}_b = (\mathcal{K}_b \setminus j) \cup k. \tag{5.16}
\]
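The structure of (5.15) and (5.16) can be sketched as below. The callback `priority` and the toy interference model `toy_priority` are hypothetical stand-ins for whatever scheduling metric is used; only the way the candidate's own priority and the updated priorities of already scheduled UEs are combined follows the text.

```python
def joint_priority(k, b, scheduled, priority):
    """Joint priority of candidate UE k on resource b, following (5.15).

    scheduled: set of UEs already scheduled on b in the cooperating cells.
    priority(ue, b, interferers): assumed callback returning the scheduling
        priority of `ue` on `b` given the set of UEs interfering with it.
    """
    own = priority(k, b, frozenset(scheduled))
    # (5.16): for each scheduled UE j, replace j by the candidate k.
    others = sum(priority(j, b, frozenset((scheduled - {j}) | {k}))
                 for j in scheduled)
    return own + others


# Toy model: leak[(q, ue)] is the interference UE q causes at the BS of `ue`.
leak = {("u1", "u2"): 1.0, ("u2", "u1"): 1.0,
        ("u1", "u3"): 0.25, ("u3", "u1"): 0.0}


def toy_priority(ue, b, interferers):
    """Priority that shrinks with the total interference seen by `ue`."""
    return 1.0 / (1.0 + sum(leak.get((q, ue), 0.0) for q in interferers))


already_scheduled = {"u1"}
scores = {k: joint_priority(k, 0, already_scheduled, toy_priority)
          for k in ("u2", "u3")}
best_candidate = max(scores, key=scores.get)
print(scores, best_candidate)
```

In the toy data, the candidate that disturbs the already scheduled UE least obtains the higher joint priority, which is the intended interference-aware behavior.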
In the following, only the calculation of the scheduling priority $G_{j,b,\tilde{\mathcal{K}}_b}(t)$ is
explicitly outlined; the scheduling priority $G_{k,b,\mathcal{K}_b}(t)$ can be determined in
a similar way and is therefore not considered in more detail here. It
is assumed that the radio resources are shared between the various UEs by
means of the well-known proportional fair approach, but it should be noted
that any other scheduling metric may be used in conjunction with our joint
scheduling scheme as well. The basic idea of proportional fair scheduling is to
realize a reasonable trade-off between the maximal total throughput and cell-edge
throughput. Clearly, on the one hand, fair resource allocation among the UEs
will lower the overall throughput compared to the maximum possible one, but
in return it provides a higher throughput for UEs with relatively poor channel
conditions, thus improving the system fairness. In general, the proportional fair
metric is given by the ratio between instantaneously supportable and long-term
throughput of a certain UE [VTL02], i.e., $G_{j,b,\tilde{\mathcal{K}}_b}(t)$ can be determined by
\[
G_{j,b,\tilde{\mathcal{K}}_b}(t) = \frac{R_{j,b,\tilde{\mathcal{K}}_b}(t)}{T_j(t)^{\alpha}} \tag{5.17}
\]
with $R_{j,b,\tilde{\mathcal{K}}_b}(t)$ as the instantaneous supportable throughput and $\alpha$ as the fairness factor, which determines the trade-off between efficiency in terms of total
throughput and fairness. Furthermore, $T_j(t)$ denotes the long-term average
throughput given by
\[
T_j(t+1) = \begin{cases} \cdots \\ \cdots\, T_j(t), & j \notin \mathcal{K}_{\mathrm{total}}(t), \end{cases} \tag{5.18}
\]
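The behavior of the proportional fair metric can be illustrated with a small sketch. Two points are assumptions rather than statements from the text: the fairness factor (called `alpha` here) is applied as an exponent on the long-term throughput, and the long-term throughput is updated by a conventional exponential average with a forgetting factor `beta`, where a UE that was not scheduled contributes a served rate of zero. All names and values are hypothetical.

```python
def pf_metric(rate, avg_throughput, alpha=1.0):
    """Supportable rate divided by long-term throughput raised to alpha."""
    return rate / avg_throughput ** alpha


def update_long_term(avg_throughput, served_rate, beta=0.9):
    """Assumed exponential averaging; served_rate is 0 if not scheduled."""
    return beta * avg_throughput + (1.0 - beta) * served_rate


# Hypothetical UEs: (instantaneous supportable rate, long-term throughput).
centre_ue = (10.0, 8.0)
edge_ue = (2.0, 1.0)

# alpha = 1: classical proportional fair, the cell-edge UE is preferred.
pf_centre = pf_metric(*centre_ue)
pf_edge = pf_metric(*edge_ue)

# alpha = 0: pure rate maximization, the cell-centre UE is preferred.
max_rate_centre = pf_metric(*centre_ue, alpha=0.0)
max_rate_edge = pf_metric(*edge_ue, alpha=0.0)

print(pf_centre, pf_edge, max_rate_centre, max_rate_edge)
```

Sweeping `alpha` between 0 and 1 moves the scheduler between maximum total throughput and proportional fairness, which is the trade-off the fairness factor is meant to control.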
\[
\cdots \left( \frac{P_{j,b}\, \mathbf{w}_{j,b}^{H} \mathbf{h}_{j,b} \mathbf{h}_{j,b}^{H} \mathbf{w}_{j,b}}{\mathbf{w}_{j,b}^{H} \left( E\{\mathbf{i}_{j,b,\tilde{\mathcal{K}}_b} \mathbf{i}_{j,b,\tilde{\mathcal{K}}_b}^{H}\} + \sigma^{2} \mathbf{I} \right) \mathbf{w}_{j,b}} \right), \tag{5.20}
\]
with $P_{j,b}$ as the transmit power of UE j for the b-th radio resource, $\mathbf{h}_{j,b} \in \mathbb{C}^{[N_{\mathrm{bs}} \times 1]}$
as the channel vector from the j-th UE to its serving BS, $\mathbf{w}_{j,b} \in \mathbb{C}^{[N_{\mathrm{bs}} \times 1]}$ as the
corresponding weight vector for coherent detection, $\mathbf{i}_{j,b,\tilde{\mathcal{K}}_b} \in \mathbb{C}^{[N_{\mathrm{bs}} \times 1]}$ as inter-cell
interference caused by the set of UEs $\tilde{\mathcal{K}}_b$ and $\sigma^2$ as thermal noise variance.
Based on the exchanged multi-cell CSI, the CSU is able to predict the interference
covariance matrix $\boldsymbol{\Phi}_{ii} = E\{\mathbf{i}_{j,b,\tilde{\mathcal{K}}_b} \mathbf{i}_{j,b,\tilde{\mathcal{K}}_b}^{H}\} \in \mathbb{C}^{[N_{\mathrm{bs}} \times N_{\mathrm{bs}}]}$ in (5.20), which is given by
\[
\boldsymbol{\Phi}_{ii} = \sum_{q \in \tilde{\mathcal{K}}_b} P_{q,b}\, \mathbf{h}_{q,j,b} \mathbf{h}_{q,j,b}^{H}, \tag{5.21}
\]
where $P_{q,b}$ and $\mathbf{h}_{q,j,b} \in \mathbb{C}^{[N_{\mathrm{bs}} \times 1]}$ denote the transmit power of UE q for the b-th
radio resource and the channel vector from the q-th UE to the serving BS of
UE j on the considered radio resource b, respectively. Clearly, $\boldsymbol{\Phi}_{ii}$ in (5.21) contains both the inter-cell interference level caused by the already scheduled UEs
associated to the cooperating BSs as well as the one that will be generated by
assigning the k-th UE to the considered radio resource. As a result, the joint
scheduling priorities in (5.15) reflect the weighted sum-throughput taking the
current inter-cell interference situation into account. This consequently leads to
an interference-aware joint scheduling, aiming at reducing the inter-cell interference within the given cooperation cluster while still taking channel-dependent
scheduling as well as user fairness into account.
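A minimal numerical sketch of the interference covariance in (5.21) and of the post-combining SINR ratio appearing in (5.20) is given below. It uses plain Python lists of complex numbers; the function names, the maximum ratio combining weights and all channel values are assumptions chosen for illustration.

```python
def outer(h):
    """Return h h^H for a complex vector h."""
    return [[a * b.conjugate() for b in h] for a in h]


def interference_covariance(interferers, n):
    """(5.21): sum of P_q * h_q h_q^H over the predicted interferers.

    interferers: list of (transmit_power, channel_vector) pairs.
    """
    phi = [[0j] * n for _ in range(n)]
    for power, h in interferers:
        o = outer(h)
        for r in range(n):
            for c in range(n):
                phi[r][c] += power * o[r][c]
    return phi


def quad_form(w, m):
    """Return w^H M w, which is real for a Hermitian matrix M."""
    n = len(w)
    return sum(w[r].conjugate() * m[r][c] * w[c]
               for r in range(n) for c in range(n)).real


def sinr(power, h, w, phi, noise_var):
    """Signal power over interference plus noise after combining with w."""
    signal = power * abs(sum(wi.conjugate() * hi for wi, hi in zip(w, h))) ** 2
    noise = noise_var * sum(abs(wi) ** 2 for wi in w)
    return signal / (quad_form(w, phi) + noise)


h_own = [1 + 0j, 0j]
w_mrc = h_own  # assumed maximum ratio combining weights

# Interferer whose channel is orthogonal to the desired one.
phi_orth = interference_covariance([(1.0, [0j, 1 + 0j])], 2)
# Interferer whose channel is aligned with the desired one.
phi_aligned = interference_covariance([(0.4, [1 + 0j, 0j])], 2)

sinr_orth = sinr(1.0, h_own, w_mrc, phi_orth, noise_var=0.1)
sinr_aligned = sinr(1.0, h_own, w_mrc, phi_aligned, noise_var=0.1)
print(sinr_orth, sinr_aligned)
```

The example shows why the spatial structure matters: an interferer with an orthogonal channel leaves the SINR noise-limited, while a weaker but aligned interferer reduces it considerably.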
Having determined the joint scheduling priorities in (5.15) for all UEs associated to a certain BS, the central scheduler generally aims at maximizing the
priority for each radio resource. The complexity of the resource allocation process depends on the used access scheme. In case of single carrier frequency
division multiple access (SC-FDMA), for example, which is used in the 3GPP
LTE uplink, the allocated radio resources of each UE have to be either adjacent
or evenly spaced in frequency in order to achieve a low peak-to-average power
ratio (PAPR) [MLG06]. However, this leads to a significantly reduced allocation
flexibility and a higher complexity. To overcome this problem, a resource allocation algorithm presented in [CRA+ 08] may be applied after determining the joint
scheduling priorities in (5.15). The basic idea of this algorithm is that adjacent
radio resources are assigned to a certain UE until either a different UE has a
higher scheduling priority or the maximum transmit power is reached. This way,
the allocation constraints due to SC-FDMA can be met, while still exploiting
the multi-user diversity and the frequency selectivity of the uplink channel.
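The basic idea described above can be sketched as a greedy pass over the resources in frequency order. This is not the algorithm of [CRA+ 08] itself but a simplified illustration: the power limit is modeled by a hypothetical maximum block length per UE, and each UE receives at most one contiguous block.

```python
def allocate_contiguous(priorities, max_blocks):
    """Greedy contiguous allocation sketch.

    priorities[ue][b]: scheduling priority of `ue` on resource b.
    max_blocks[ue]: largest block the UE can fill at maximum transmit power
        (an assumed stand-in for the power limit).
    Returns the UE assigned to each resource (None if unassigned).
    """
    num_rb = len(next(iter(priorities.values())))
    allocation = [None] * num_rb
    finished = set()  # UEs whose contiguous block has been closed
    current, used = None, 0
    for b in range(num_rb):
        candidates = [u for u in priorities if u not in finished]
        if not candidates:
            break
        best = max(candidates, key=lambda u: priorities[u][b])
        # Close the current block if another UE wins or the limit is hit.
        if current is not None and (best != current
                                    or used >= max_blocks[current]):
            finished.add(current)
            current, used = None, 0
            candidates = [u for u in priorities if u not in finished]
            if not candidates:
                break
            best = max(candidates, key=lambda u: priorities[u][b])
        if current is None:
            current, used = best, 0
        allocation[b] = current
        used += 1
    return allocation


alloc_priority = allocate_contiguous(
    {"a": [5, 4, 1, 1], "b": [1, 2, 3, 4]}, {"a": 4, "b": 4})
alloc_power = allocate_contiguous(
    {"a": [5, 5, 5, 5], "b": [1, 1, 1, 1]}, {"a": 2, "b": 4})
print(alloc_priority, alloc_power)
```

Because a UE is never revisited once its block is closed, every UE ends up with adjacent resources only, which is the property required for a low PAPR.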
Figure 5.10 Relative uplink performance gains of the presented joint scheduling
scheme as well as of a dynamic interference coordination scheme compared to a 3GPP
LTE Release 8 system with 500 m inter-site-distance and six cooperating cells per BS.
Two bar-chart panels over bandwidth occupancy [%] (30, 60, 100) for the series
"interference coord." and "joint scheduling". Bar annotations in the first panel:
+102%, +71%, +37%, +25%, +14%, +3%. Bar annotations in panel (b) Gain in
cell-edge throughput: +105%, +67%, +58%, +32%, +26%, +21%.
Finally, after completing the resource allocation of all cooperating BSs, the
link adaptation selects for each UE the spectrally most efficient modulation and
coding scheme (MCS) that can be supported by its current uplink channel without exceeding a given target block error rate (BLER). To this end, the corresponding SINR is estimated by evaluating the available multi-cell CSI, resulting
in a more accurate link adaptation. This is because the knowledge of which UEs
are scheduled in the cooperating cells together with the available multi-cell CSI
facilitates an accurate prediction of the interference situation that will occur during the actual (future) data transmission. Especially in the uplink, this may lead
to significant additional performance gains since the interference situation there
is usually rather volatile. This is because from one transmit time interval (TTI)
to the other completely different sets of UEs may be scheduled in nearby cells.
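The link adaptation step can be sketched as a threshold lookup. The MCS table below is purely illustrative: the indices, spectral efficiencies and SINR thresholds are hypothetical values, not those of any 3GPP specification, and each threshold is assumed to be the lowest SINR at which the target BLER is still met.

```python
# Hypothetical MCS table: (index, spectral efficiency in bit/s/Hz,
# minimum SINR in dB at which the target BLER is met).
MCS_TABLE = [
    (0, 0.2, -5.0),
    (1, 0.6, 0.0),
    (2, 1.2, 4.0),
    (3, 2.4, 9.0),
    (4, 3.6, 14.0),
]


def select_mcs(sinr_db):
    """Return the most efficient MCS index supported at sinr_db, or None."""
    supported = [m for m in MCS_TABLE if m[2] <= sinr_db]
    if not supported:
        return None
    return max(supported, key=lambda m: m[1])[0]


# SINR measured before neighbours changed their allocation vs. the SINR
# predicted for the actual transmission (both values are made up).
mcs_from_stale_sinr = select_mcs(10.0)
mcs_from_predicted_sinr = select_mcs(3.0)
print(mcs_from_stale_sinr, mcs_from_predicted_sinr)
```

If the interference rises between measurement and transmission, the MCS chosen from the stale SINR lies above what the channel supports, which is the mismatch that an accurate interference prediction is meant to avoid.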
An example for the achievable uplink performance of the presented joint
scheduling scheme for different bandwidth utilizations is depicted in Fig. 5.10,
where the relative gains compared to an LTE Release 8 system in terms of average spectral efficiency as well as cell-edge throughput are shown. The detailed
simulation assumptions, parameter settings as well as further results will be introduced later in Section 14.3. In order to achieve a certain bandwidth occupancy,
the scheduling is performed until the intended degree of bandwidth utilization is
reached. In addition to the joint scheduling results, Fig. 5.10 shows for comparison also the performance of a state-of-the-art dynamic interference coordination
scheme based on high interference indicator signaling [3GP07c, FMDS10]. First
of all, it can be seen that the achievable performance is heavily dependent on the
bandwidth occupancy. The gains increase with decreasing bandwidth occupancy,
which indicates that the flexibility in assigning radio resources to the various UEs
is considerably increased at a low bandwidth occupancy. As a result, severe inter-cell
interference situations can be avoided by exploiting the whole bandwidth, i.e.
preventing the UEs associated to the cooperating BSs from being allocated to the
same radio resources. Furthermore, it is shown in Fig. 5.10 that the joint scheduling scheme outperforms the dynamic interference coordination scheme due to the
higher flexibility in jointly allocating radio resources to the various UEs, which
consequently leads to an improved avoidance of severe inter-cell interference. The
better system performance, however, comes at the cost of an increased backhaul
load due to the required exchange of multi-cell CSI [FMDS10].
5.2.2
[Figure, caption not recovered: sites with 3 base stations each (cells = sectors)
exchanging scheduling decisions over backhaul links.]
low to moderate user speeds, the impact of this effect should be only marginal.
On the other hand and more importantly, the interference situation may have
completely changed, since from one TTI to the other completely different sets
of users may be scheduled in nearby cells. As a consequence, the selected MCSs
are often over- or underestimated, thus leading to a very high BLER or a rather
low spectral efficiency, respectively. Without any appropriate countermeasures
as proposed in the following, the performance therefore would be often degraded
compared to the idealized case with perfect link adaptation based on the channel
conditions during the actual data transmission.
Proposed Interference Prediction Scheme
The fundamental idea of the considered interference prediction scheme is to perform the link adaptation not based upon the currently estimated SINR values,
but rather based upon predicted SINR values likely to occur during the associated
future data transmissions. For that purpose, it is necessary that a BS can accurately predict the interference level that it will experience during such future data
transmissions already a couple of TTIs in advance. This may be accomplished by
means of cooperation between different BSs as illustrated in Fig. 5.11. First of all,
every BS performs conventional scheduling and power control, i.e. it determines
which UEs should transmit on which radio resources and at which power levels.
If the employed scheduling algorithm is channel-aware (which is the case for a
proportional-fair scheduler, for example), the corresponding scheduling metrics
are calculated as in conventional systems, taking into account only the currently
observed channel and interference conditions, respectively.
Afterwards, every BS exchanges the resource allocation tables that have been
fixed during the scheduling process with a certain set of cooperating BSs via a
fast backhaul network. For the case of a 3GPP LTE system, for example, this
could be realized via the X2-interface [HT09]. Note that low-latency backhaul
links are a crucial prerequisite for the proposed approach since an additional
delay is introduced by exchanging and processing the scheduling information
as well as by performing the actual prediction of the interference. Without a
fast data exchange the overall latency may increase, resulting in a performance
degradation compared to the idealized case without any additional delay [MF10].
Provided that the various BSs have reasonably accurate CSI not only of the
channels from the UEs located in their own cell, but also from those associated
with any of their cooperating BSs, they can eventually accurately predict the
interference level that will be generated by these UEs when the actual data
transmission takes place. If, for example, the channel from the k-th interfering UE
to the various antenna elements of a particular BS sector m is denoted by $\mathbf{h}_k \in \mathbb{C}^{[N_{\mathrm{bs}} \times 1]}$,
the expected contribution of this interferer to the overall interference
covariance matrix simply would be given by
\[
\boldsymbol{\Phi}_{ii,k} = P_k\, \mathbf{h}_k \mathbf{h}_k^{H}, \tag{5.22}
\]
where Pk is the transmit power associated with UE k. The predicted interference is then used as an input to the link adaptation stage, and afterwards the
corresponding scheduling grants (including the assigned MCSs) are signaled to
all scheduled UEs, which finally transmit their data a couple of TTIs after the
reception of these grants. However, note that the scheduling decisions themselves
are not updated based on the predicted interference levels, since otherwise the
actual future interference situation would change again. Hence, in that case some
iterative procedure would be necessary, thus leading to an increased complexity
and backhaul load as well as a higher latency. An example for how the accuracy
of the link adaptation can be improved with the proposed approach is depicted
in Fig. 5.12, where the simulated distribution for certain deviations between the
ideal and the used MCSs are shown for the cases with and without interference
prediction. In this regard, the BSs may choose between several different MCSs
according to [3GP09f]. Furthermore, it is assumed that in case of interference prediction each BS always receives scheduling information with a delay of two TTIs
from its six cooperating sectors. It can be seen that with interference prediction
the probability that the ideal MCS is selected is almost twice as high as for the
case without interference prediction and also the variance of the deviations from
the ideal MCS can be considerably reduced. Note that further simulation results
can be either found in Section 14.3 or in [MF10]. Furthermore, the simulation
assumptions and parameter settings used for generating the results in Fig. 5.12
are the same as the ones that will be used later in Section 14.3.
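The prediction step can be sketched for the simplified case of a single receive antenna, where the contribution of each interferer in (5.22) reduces to its transmit power times a scalar channel gain. The table layout, UE identifiers and all numerical values are hypothetical.

```python
import math


def predict_interference(neighbour_tables, gains_to_me, num_rb):
    """Predicted interference power per resource block.

    neighbour_tables: allocation tables received over the backhaul, each
        mapping a resource index to (ue_id, transmit_power).
    gains_to_me: ue_id -> channel power gain |h_k|^2 from that UE to this
        BS (scalar simplification of (5.22)).
    """
    interference = [0.0] * num_rb
    for table in neighbour_tables:
        for b, (ue, power) in table.items():
            interference[b] += power * gains_to_me.get(ue, 0.0)
    return interference


def predicted_sinr_db(power, gain, interference, noise):
    """SINR in dB expected during the future data transmission."""
    return 10.0 * math.log10(power * gain / (interference + noise))


# Allocation tables of two cooperating sectors (made-up content).
tables = [{0: ("x", 1.0)}, {0: ("y", 2.0), 1: ("z", 1.0)}]
gains = {"x": 0.05, "y": 0.02, "z": 0.0}

interference = predict_interference(tables, gains, num_rb=2)
sinr_db = [predicted_sinr_db(1.0, 1.0, i, noise=0.01) for i in interference]
print(interference, sinr_db)
```

The predicted SINR per resource would then be passed to the link adaptation stage in place of the currently measured SINR; the scheduling decisions themselves stay untouched, as described above.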
Clearly, the performance of the approach strongly depends on the number of
cooperating BSs. While a BS generally should be able to predict the interfer-
5.2.3 Practical Considerations
A crucial prerequisite for both BS cooperation schemes presented in this section
is that accurate multi-cell CSI is obtained. For that reason, it is necessary that
the reference signals transmitted by different UEs within a certain cooperation
cluster can be separated again at the BS side, for example through orthogonal
reference signals as in Section 9.1. In any case, all BSs have to be aware of the
reference signals assigned to the various UEs. This consequently requires further signaling between cooperating BSs in addition to the necessary information
exchange via the backhaul network already outlined before. However, note that
this usually does not have to be done during every TTI since the utilized reference signals and hopping patterns are normally assigned in a semi-persistent
manner. Therefore, this additional backhaul load is expected to be comparatively
small.
In case of cooperative interference prediction as introduced in Section 5.2.2, it
is quite obvious that the requirements on the accuracy of the multi-cell channel
estimation between a BS and UEs located in other cells are generally much lower
than those for the estimation of the desired link between a certain UE and its
serving BS. On the one hand, this is because estimation errors made for different
interfering channels may compensate each other, particularly if the number of
cooperating BSs is relatively high. On the other hand, it may already be
sufficient for achieving a good performance to know whether on a certain
radio resource very high or very low interference has to be expected, whereas the
exact figures are only of secondary importance. In addition, if the channel from a
certain UE in one of the cooperating cells cannot be estimated reliably since it is
in a deep fade, this should also not represent a major problem since in such a case
this UE would cause only low interference anyway. By contrast, the requirements
on the accuracy of the multi-cell channel estimation are more stringent in the
case of interference-aware scheduling. This is due to the fact that the resource
allocation decisions heavily depend on the predicted inter-cell interference level
caused by single UEs, for which reason a high deviation between the predicted
and the actual interference levels during a data transmission would lead to rather
inaccurate resource allocation decisions.

Figure 5.12 Exemplary illustration of the improved link adaptation accuracy with
interference prediction for the uplink of a 3GPP LTE Release 8 system with 500 m
inter-site-distance and six cooperating cells per BS [MF10]. © 2010 IEEE.
Two panels (without and with interference prediction): Probability [%] vs.
deviation between selected and ideal MCS [MCS indices], distinguishing the cases
where the selected MCS deviates from or corresponds to the ideal MCS.
Another prerequisite for the proposed schemes is that cooperating BSs can
quickly exchange the required information, such as the multi-cell CSI or scheduling tables, via a fast backhaul network. However, it is quite clear that even if
cooperating BSs are interconnected by means of direct optical fiber links, in
general an additional delay is introduced because some time is always required
for the processing of the exchanged information. As a consequence, the overall
latency increases and the performance may degrade to some extent compared
to the idealized case without any additional delay. This is due to an increased
mismatch between the channels used as the basis for the scheduling and link
adaptation stages and those during the actual data transmission. Besides, the
increased delay between scheduling and actual data transmission clearly also
affects potential hybrid automatic repeat request (HARQ) retransmissions. In
the 3GPP LTE uplink, for example, a synchronous HARQ protocol is used. If
one of the previously discussed schemes is to be introduced here, it might therefore be necessary to switch to an asynchronous HARQ or adjust HARQ timing.
Finally, it goes without saying that in case of joint scheduling the generated
backhaul load generally should be much higher than for cooperative interference
prediction, particularly due to the required exchange of multi-cell CSI in that
case. As a concrete example, it will be shown later in Section 14.3 for a particular scenario that the average backhaul load per site can be reduced from
251 Mbit/s in the case of joint scheduling to 26 Mbit/s for cooperative interference prediction. However, it should also be noted that even for joint scheduling,
the backhaul capacity requirements are still considerably smaller than those for
joint signal processing CoMP schemes, which will be introduced in Chapter 6.
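To put the two quoted backhaul figures into proportion, the ratio and the relative saving follow directly from the numbers given above (no further assumptions are involved):

```latex
\[
\frac{251\ \text{Mbit/s}}{26\ \text{Mbit/s}} \approx 9.65,
\qquad
1 - \frac{26}{251} \approx 0.896 .
\]
```

In this scenario, cooperative interference prediction therefore needs roughly one tenth of the backhaul load of joint scheduling, a saving of about 90%.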
5.2.4
5.2.5 Summary
Two novel uplink CoMP schemes have been presented where different BSs cooperate with each other via a backhaul network in order to mitigate the effects
of inter-cell interference. While the interference-aware joint scheduling scheme
coordinates the allocation of radio resources to the various UEs by means of
periodically exchanged multi-cell CSI between the cooperating BSs and a central scheduling unit, the cooperative interference prediction scheme is a more
lightweight yet efficient approach with reduced backhaul load requirements.
The latter scheme only requires the exchange of scheduling information between
a set of cooperating BSs to predict the inter-cell interference level that will
occur during a future data transmission for improving the link adaptation process. Since both proposed schemes are transparent to the UEs and cause only
a minor to moderate backhaul load, they represent very attractive options for
future LTE-A systems.
5.3
5.3.1 Introduction
There are several distributed approaches for coordinated beamforming in the
literature. For example, [EC05] proposes an iterative algorithm to minimize
transmit power, which does not necessarily maximize sum-rates. The authors
of [LJSL09] propose a non-iterative distributed solution to design precoding
matrices for multi-cell systems, which will maximize the sum-rates for only a
two-cell system at high signal-to-noise ratio (SNR), using a per base station
power constraint. Another important partial cooperation-based transmit strategy is inter-cell interference nulling (ICIN) [ZA10, JLD08, LKL09, BH10], in
which each BS transmits in the null-space of the interference it is causing to
neighboring cells.
The performance of a cooperative transmission strategy is highly dependent on the quality of the CSI fed back by the users. Most of the literature
on multi-cell cooperation assumes that full CSI is available at the transmitters [SSPS09c, SZ01, JTS+ 08a, EC05, LJSL09]. The impact of imperfect CSI
was considered in [MF09a, GMF10a, PTW10]. In [MF09a], the authors consider
imperfect CSI at the BS due to limited feedback or estimation errors and show
that the performance gains from BS cooperation can be obtained even when
CSI is imperfect. Noisy CSI estimates were considered in [PTW10], where the
objective was to maximize the performance gains that can be obtained using the
worst-case CSI perturbations.
Since quantization and feedback are a major source of imperfect CSI, it is
important to consider CSI quantization in multi-cell cooperative systems. Limited feedback for multi-cell systems is a topic of ongoing research [BH10]. Unfortunately, results from the well-investigated single-cell limited feedback are not
directly applicable to the multi-cell scenario. While the CSI of only one channel
is fed back in the single-cell case, cooperative strategies require feedback of CSI
from multiple BSs using the same feedback link. Further, in single-cell transmission, quantized CSI reaches the BS after experiencing a delay in the feedback
channel [ZHKA09]. In the multi-cell cooperative framework, however, quantized
CSI is subject to an additional source of delay in the backhaul link. The impact
of delayed CSI on the performance of non-cooperative systems [ZHKA09] has
been investigated extensively. The effect of delayed limited feedback on the performance of cooperative systems has received comparatively less attention.

Figure 5.13 CSI feedback and backhauling concept considered for inter-cell
[caption truncated in the source]. Labels: channels hkk, hlk and hnk; CSI feedback
with delay; backhaul with delay.
In this section, two different cases are considered: i) a single receive antenna and ii) multiple receive antennas at the user equipment (UE). For the single receive antenna case, the BSs need to optimize their precoding matrices/beamforming vectors to maximize the sum-rate under given constraints, but for the multiple receive antenna case, the precoding/postcoding matrices/vectors should be jointly optimized. In Subsection 5.3.2, we describe ICIN, a low-complexity and non-iterative partial cooperative strategy that uses explicit per-base power constraints and yields reasonable gains in the sum-rate, while resulting in a small burden over the backhaul link. Note that ICIN requires that the total number of antennas per BS be larger than the number of single-antenna terminals considered in one transmission, an aspect we will discuss in detail later. We also describe some limited CSI feedback algorithms for ICIN. In Subsection 5.3.3, we further extend the cooperative strategies to the multiple antenna cases and show performance results.
5.3.2
$N_{bs} \ge K$, (5.25) ensures perfect interference nulling in most cases, i.e. $h_k^j w_j = 0$ for $k, j = 1, \ldots, K$, $j \ne k$. Note that while ICIN is a simple, non-iterative and distributed coordinated beamforming strategy, it suffers from the dimensionality constraint imposed by computing the pseudo-inverse, i.e. $N_{bs} \ge K$. For the more practical case where the number of users in the system is greater than $N_{bs}$, scheduling can be employed to enforce $N_{bs} \ge K$. This implies that there exists a trade-off between increasing the number of users for simultaneous transmission in the cells and perfect interference cancelation. Clustering can also be employed to group cells into clusters of size $N_{bs}$ each and use intra-cell time division multiple access (TDMA), or a comparable orthogonal transmission strategy, to make sure that $N_{bs} \ge K$.
Limited Feedback
Practical feedback channels are bandwidth-limited and have delays associated with them. Hence, it is important to investigate the performance of ICIN with delayed limited feedback [BH10]. The channel directions $\tilde{h}_k^m[t]$ are quantized to the unit-norm vectors given by $\hat{h}_k^m[t]$ at the k-th user, where we now introduce the variable t to capture the time instant, for example a transmit time interval (TTI). We assume that each user can utilize $B_{tot}$ bits for feedback, and that $B_k$ and $B_k^j$ bits are used to quantize $\tilde{h}_k^k[t]$ and $\tilde{h}_k^j[t]$, $j \ne k$, respectively, where $B_k + \sum_{j \ne k} B_k^j = B_{tot}$. The delay associated with quantizing $\tilde{h}_k^k[t]$ to $\hat{h}_k^k[t]$ and feeding back the latter to the k-th BS is denoted by $D_k$. The k-th user also quantizes the interfering channels, $\tilde{h}_k^j[t]$, $j \ne k$, to $\hat{h}_k^j[t]$, $j \ne k$, and feeds back the latter to the k-th BS, which then forwards this information to the j-th BS over the backhaul link, incurring an overall delay of $D_k^j$. The limited feedback model is also shown in Fig. 5.13.
At the time instant t, the k-th BS has knowledge of $\hat{h}_k^k[t - D_k]$ and $\hat{h}_j^k[t - D_j^k]$, for all $j \ne k$. The beamforming vector at the t-th time instant, $w_k[t]$, is designed using the delayed and quantized CSI of the desired channels and the interference caused to other cells [BH10]
$$w_k[t] = a_k, \qquad (5.26)$$
where $a_k$ is computed from the pseudo-inverse of
$$A = \left[ \hat{h}_1^k[t - D_1^k], \ldots, \hat{h}_{k-1}^k[t - D_{k-1}^k],\ \hat{h}_k^k[t - D_k],\ \hat{h}_{k+1}^k[t - D_{k+1}^k], \ldots, \hat{h}_K^k[t - D_K^k] \right]. \qquad (5.27)$$
(5.28)
where $e_{h_k^k}[t]$ and $e_{h_k^j}[t]$ denote the channel knowledge uncertainties, which are uncorrelated with $h_k^k[t - D_k]$ and $h_k^j[t - D_k^j]$, respectively. The entries of $e_{h_k^k}[t]$ and $e_{h_k^j}[t]$ are distributed as $\mathcal{N}_C(0, 1)$. The correlation coefficients for the desired and interfering channels are denoted by $\rho_k$ and $\rho_k^j$, respectively. Clarke's autocorrelation model is used to determine $\rho_k$ and $\rho_k^j$ as [ZHKA09]
$$\rho_k = J_0(2\pi D_k f_d T_s), \quad \text{and} \quad \rho_k^j = J_0(2\pi D_k^j f_d T_s), \qquad (5.29)$$
where $J_0$ is the zeroth-order Bessel function of the first kind, $f_d$ is the Doppler spread and $T_s$ is the symbol duration. The Doppler spread is given as $f_d = v f_c / c$, where $v$ is the relative velocity of the transmitter-receiver pair, $f_c$ the carrier frequency, and $c$ the speed of light. The mean loss in sum-rate due to delayed limited feedback is bounded in [BH10] as a function of the delays and signal strengths, and can be minimized by choosing $B_k$ and $B_k^j$ as per Theorems 5.1-5.3.
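The correlation coefficients of (5.29) are easy to evaluate numerically. The following is a minimal sketch, not taken from [BH10] or [ZHKA09]: the function names, the series-based Bessel routine, the pedestrian speed, and the 1 ms delay unit are all assumptions made for illustration; only the 1.9 GHz carrier appears in the text.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def bessel_j0(x: float, terms: int = 40) -> float:
    """Zeroth-order Bessel function of the first kind via its power series.

    J0(x) = sum_m (-1)^m (x^2/4)^m / (m!)^2, which is adequate for the small
    arguments that arise with delays of a few frames.
    """
    total = 0.0
    term = 1.0  # m = 0 term of the series
    for m in range(terms):
        total += term
        term *= -(x * x / 4.0) / ((m + 1) ** 2)
    return total


def doppler_spread(speed_mps: float, carrier_hz: float) -> float:
    """Doppler spread f_d = v * f_c / c."""
    return speed_mps * carrier_hz / SPEED_OF_LIGHT


def clarke_correlation(delay: float, doppler_hz: float, symbol_time_s: float) -> float:
    """Correlation coefficient rho = J0(2*pi*D*f_d*T_s), as in (5.29)."""
    return bessel_j0(2.0 * math.pi * delay * doppler_hz * symbol_time_s)


# Assumed values: 3 km/h user speed, 1.9 GHz carrier, 1 ms per delay unit.
f_d = doppler_spread(speed_mps=3.0 / 3.6, carrier_hz=1.9e9)
rho_feedback = clarke_correlation(1, f_d, 1e-3)   # feedback delay only
rho_backhaul = clarke_correlation(3, f_d, 1e-3)   # feedback plus backhaul delay
```

Under these assumptions the CSI that additionally crosses the backhaul is less correlated with the current channel than the CSI received directly over the feedback link, which is why the delays enter the bit allocation.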
Theorem 5.1. Given the total number of bits allocated to quantize all the channels seen by one UE, Btot , the optimum number of bits assigned to the desired
channel at the k-th user, Bk , at low SNR is given by
1
1
Nbs
(Nbs 1)|K|
Btot
log2 k
Bk =
(jk (kj )2 ) |K| ,
|K| + 1
|K| + 1
Nbs 1
jK
for $N_{bs} > K$, and $B_k = 0$ for $N_{bs} = K$. The optimum number of bits assigned to all the interfering channels at the k-th user is computed as $B_{k,int} = B_{tot} - B_k$ [BH10].
Theorem 5.2. Given the total number of bits allocated to quantize all the channels seen by one UE, Btot , the optimum number of bits assigned to the desired
channel at the k-th user, Bk , at high SNR is given by
**
'
'
Nbs
.
Bk = (Nbs 1) log2 (|K| 1)
Nbs 1
The optimum number of bits assigned to all the interfering channels at the k-th user is $B_{k,int} = B_{tot} - B_k$ [BH10].
Theorem 5.3. The optimum number of bits invested by UE k into the quantization of the channel to the j-th interfering BS, Bkj is given by
2
4
jk (kj )2
Bk,int
j
+ (Nbs 1) log2 3
Bk =
,
1
|K|
(j ( j )2 ) |K|
jK
Figure 5.14 Comparison of the mean data-rate at the cell-edge for different values of [horizontal axis: number of total feedback bits per UE $B_{tot}$, from 7 to 35; point labels: bit allocations 7,0,0,0,0,0,0; 6,0,4,4,0,0,0; 7,0,7,7,0,0,0; 6,0,11,11,0,0,0; 7,0,14,14,0,0,0]
for $j \in \mathcal{K}$, $\sum_{j \in \mathcal{K}, j \ne k} B_k^j = B_{k,int}$ and $B_k^j = 0$ for $j \notin \mathcal{K}$, where $\mathcal{K}$ is the largest set of interferers that satisfies [BH10]
23
1 4
j
j 2 |K|
Btot
jK (k (k ) )
<
.
log2
j
j 2
|K|(N
(k (k ) )
bs 1)
For numerical evaluation, we consider a seven-cell system, i.e. M = K = 7. Each BS has eight antennas ($N_{bs} = 8$) and each user has a single antenna. The system setup is based on the urban micro-cell propagation scenario in the 3GPP spatial channel model (SCM). The inter-site distance (ISD) is assumed to be 800 m. The pathloss between the BSs and the UE is modeled using the COST 231 Walfisch-Ikegami non line-of-sight (NLOS) model, adopted for urban micro-cells. Using a carrier frequency of 1.9 GHz, BS and UE heights of 12.5 m and 1.5 m, respectively, a building height of 12 m, a building-to-building distance of 50 m and a street width of 25 m, the path-loss in dB from BS m to a UE k at a height of 1.5 m is given as
$$PL_k^m[\mathrm{dB}] = 34.53 + 38 \log_{10}(d_k^m), \qquad (5.30)$$
where $d_k^m$ denotes the distance from UE k to BS m. The transmit power is $E_s = 33$ dBm for all BSs, and the noise power is given by -114 dBm. We also model the delay associated with the feedback and backhaul links to be one and two frames, respectively. Note that the $\alpha_k^m$ parameters are obtained as a difference of the path-losses from the desired and interfering BSs. For example, $\alpha_k^m[\mathrm{dB}] = PL_k^k[\mathrm{dB}] - PL_k^m[\mathrm{dB}]$.
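The link budget implied by (5.30) can be checked with a few lines of code. This is only a sketch under stated assumptions: the distance is taken to be in metres (the text does not state the unit), the function names are illustrative, and the 400 m / 500 m geometry is a made-up cell-edge example rather than a value from the simulation.

```python
import math


def path_loss_db(distance: float) -> float:
    """Path loss of (5.30): 34.53 + 38*log10(d); d is assumed to be in metres."""
    return 34.53 + 38.0 * math.log10(distance)


def received_power_dbm(tx_power_dbm: float, distance: float) -> float:
    """Received power in dBm, ignoring small-scale fading."""
    return tx_power_dbm - path_loss_db(distance)


def relative_gain_db(d_serving: float, d_interferer: float) -> float:
    """Path-loss difference PL(serving) - PL(interferer) in dB.

    The result is negative when the interfering BS is farther away.
    """
    return path_loss_db(d_serving) - path_loss_db(d_interferer)


# Assumed geometry: 400 m to the serving BS, 500 m to one interfering BS.
snr_db = received_power_dbm(33.0, 400.0) - (-114.0)  # 33 dBm transmit, -114 dBm noise
alpha_db = relative_gain_db(400.0, 500.0)
```

A strong interferer is one whose path-loss difference is close to 0 dB; these are the channels that receive most of the interfering-channel feedback bits in the allocation discussed here.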
It is seen from Fig. 5.14 that the limited feedback technique in this section outperforms equal-bit allocation for all $B_{tot}$; the improvement in data rate is about 40 % at $B_{tot} = 35$. At $B_{tot} = 7$, the desired channel is given all the 7 bits, while at $B_{tot} = 35$, the two strong interfering channels are assigned 14 bits each. Equal-bit allocation, in contrast, sees an increase in the feedback bits for the strong interfering channels from 1 to 5 bits per channel. Quantizing the strong interferers more finely at the cost of allocating zero bits to the weak channels leads to the significant improvement in data rates using the proposed algorithm.
5.3.3
H H
P g k Hk
wj xj + gkH nk
(5.31)
x
k = P gkH HH
k wk xk +
j=k
= H
,
W
(i)
(i)
(i)
(5.33)
where $R_1$ and $R_2$ are the $N_{bs} \times N_{bs}$ normalized matched channel matrices defined by $R_k = H_k H_k^H / \|H_k\|_F^2$, and $w_1$, $w_2$ are the transmit beamforming vectors of size $N_{bs} \times 1$.
Theorem 5.4. If $N_{bs} = 2$, $N_{ue} \ge 2$ and $R_1$ and $R_2$ are both invertible, then the following claim holds. If (non-zero) transmit beamforming vectors $w_1$ and $w_2$ satisfy the zero inter-user interference conditions, i.e.,
$$g_1^H R_1 w_2 = 0$$
$$g_2^H R_2 w_1 = 0$$
then $w_1$, $w_2$ are the generalized eigenvectors of $(R_1, R_2)$, which means, with scalars $\lambda_1$ and $\lambda_2$:
$$R_1 w_1 = \lambda_1 R_2 w_1$$
$$R_2 w_2 = \lambda_2 R_1 w_2$$
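The link between generalized eigenvectors and zero inter-user interference can be checked numerically for $N_{bs} = 2$. The sketch below is illustrative only: it assumes MRC receivers, so that the interference terms reduce to the Hermitian forms $w_1^H R_1 w_2$ and $w_2^H R_2 w_1$, and the two channel matrices and all function names are hypothetical. It obtains the generalized eigenvectors of $(R_1, R_2)$ as ordinary eigenvectors of $R_2^{-1} R_1$.

```python
import cmath


def gram_normalized(h):
    """R = H H^H / ||H||_F^2 for a 2 x n complex matrix H given as nested lists."""
    r = [[sum(h[i][n] * h[j][n].conjugate() for n in range(len(h[0])))
          for j in range(2)] for i in range(2)]
    frob_sq = (r[0][0] + r[1][1]).real  # trace(H H^H) equals ||H||_F^2
    return [[r[i][j] / frob_sq for j in range(2)] for i in range(2)]


def inv2(a):
    """Inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def generalized_eigenvectors(r1, r2):
    """Solve R1 w = lambda R2 w through the eigenvectors of R2^{-1} R1."""
    m = matmul2(inv2(r2), r1)
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    vectors = []
    for lam in ((tr + disc) / 2.0, (tr - disc) / 2.0):
        v = [m[0][1], lam - m[0][0]]          # null vector of (M - lam I), row 1
        if abs(v[0]) + abs(v[1]) < 1e-12:     # degenerate first row: use row 2
            v = [lam - m[1][1], m[1][0]]
        vectors.append(v)
    return vectors


def hermitian_form(x, r, y):
    """Evaluate x^H R y."""
    return sum(x[i].conjugate() * r[i][j] * y[j]
               for i in range(2) for j in range(2))


# Hypothetical 2x2 channels, chosen only for illustration.
H1 = [[1 + 1j, 0.5 + 0j], [0.2j, 1 + 0j]]
H2 = [[0.3 + 0j, 1 - 0.5j], [1 + 0j, 0.4 + 0.2j]]
R1, R2 = gram_normalized(H1), gram_normalized(H2)
w1, w2 = generalized_eigenvectors(R1, R2)
leak_1 = abs(hermitian_form(w1, R1, w2))  # interference seen by user 1
leak_2 = abs(hermitian_form(w2, R2, w1))  # interference seen by user 2
```

Both leakage terms vanish up to rounding error, because generalized eigenvectors of a Hermitian pair with distinct eigenvalues are orthogonal with respect to both matrices.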
(5.34)
w1H R1 w3
(5.35)
=0 =
w3H R3 w1 .
C,
C.
(5.36)
w2 R 2 w 3 = 0 = w 2 R 3 w 3 .
Again by Lemma 2 in [CKH10], (5.36) is equivalent to
w2 = (R2 w3 R3 w3 ),
as long as R2 w3 R3 w3 = 0.
C,
(5.37)
(5.38)
Here, $x \parallel y$ denotes that the complex vectors x and y are parallel.
Define two functions $\phi, \psi : \mathbb{C}^3 \to \mathbb{C}^3$ as
$$\phi(w) = R_1 w \times R_2 w$$
$$\psi(w) = R_2 (R_1 w \times R_3 w) \times R_3 (R_1 w \times R_3 w),$$
where the components of $\phi$ and $\psi$ are polynomials of degrees 2 and 4, respectively, in the components of w. By applying Lemmas 2 and 3 in [CKH10], we see that solving (5.38), with the restriction that $\phi(w_1) \ne 0$ and $\psi(w_1) \ne 0$, is equivalent to finding a solution $w_1$ of the following equation:
$$\phi(w_1) \times \psi(w_1) = 0.$$
Note that $\psi(w_1) \ne 0$ implies $R_1 w_1 \times R_3 w_1 \ne 0$. Therefore, we can find all possible $(w_1, w_2, w_3)$ satisfying the no inter-user interference condition in generic (non-singular) cases, as described below.
Theorem 5.6. Under the non-singular hypothesis $\phi(w_1) \ne 0$ and $\psi(w_1) \ne 0$, no inter-user interference is achieved by $(w_1, w_2, w_3)$ if and only if
$$\phi(w_1) \times \psi(w_1) = 0$$
$$w_2 = \alpha\, \phi(w_1)$$
$$w_3 = \beta\, (R_1 w_1 \times R_3 w_1)$$
for some $\alpha, \beta \in \mathbb{C}$.
Extension to the Two-Cell Case
While considering multiple antennas per UE, we have so far constrained ourselves to the case of one BS (hence observing a BC). We now extend this to the two-cell case¹. Note that the essential difference is that only the antennas connected to one BS may be used for the transmission towards one particular UE. Otherwise, the BSs would also have to exchange the data to be transmitted to the terminals, resembling a joint signal processing CoMP scheme, which will be investigated in Sections 6.3 and 6.4. Let us initially focus on a two-cell MIMO system as shown in Fig. 5.15, where two BSs serve two UEs equipped with more than one receive antenna. As usual, the channel between BS m and UE k is denoted as $H_k^m$.
¹ An optimal M-cell coordinated beamforming algorithm with a zero inter-cell interference constraint is still unknown; thus, in this section we mostly focus on a two-cell system.
The
Desired signal
Other-cell interference
where $j \ne k$, P is the total transmit power and MRC is also assumed at the UE, i.e. $g_k = \frac{(H_k^k)^H w_k}{\|(H_k^k)^H w_k\|}$. Then the design goal is to maximize the desired signal term and to remove the other-cell interference term found in (5.39). Thus we introduce an interference-aware coordinated beamforming with MRC algorithm that satisfies the following condition:
$$g_1^H (H_1^2)^H w_2 = 0 = g_2^H (H_2^1)^H w_1 \qquad (5.40)$$
$$w_1^H H_1^1 (H_1^2)^H w_2 = 0 = w_2^H H_2^2 (H_2^1)^H w_1,$$
which implies that the other-cell interference term in (5.39) is perfectly removed; at the same time, the proposed system maximizes the desired effective channel gain $|g_k^H (H_k^k)^H w_k|^2$ by using MRC. Note that it can be guaranteed that there is no inter-user interference thanks to the transmit beamforming vectors. Since we are considering a two-cell environment, this can be interpreted as the two-user MIMO interference channel (IC) illustrated in Fig. 5.15 [CHHT10].
Theorem 5.7. Under a zero other-cell interference constraint in (5.40), the sufficient and necessary beamforming vectors with MRC for UE k (where k, j are 1 or 2, $j \ne k$) are generalized eigenvectors of $H_k^k (H_k^j)^H$ and $H_j^k (H_j^j)^H$.
Theorem 5.8. Given $w_j$, where each BS has two transmit antennas, the sufficient and necessary beamforming vector (unique up to complex multiplications) for UE k, $w_k$, can be expressed as
' *
' *
z2
z2
wk =
or
w
,
(5.41)
=
k
z1
z1
Figure 5.16 Sum-rate comparisons as a function of SINR, where $N_{bs} = N_{ue} = 2$. Each BS has the same transmit power P/2, where P is the total transmit power. [Horizontal axis: SINR [dB], 0 to 20; curves: non-cooperative eigen-beamforming; ICIN, eigen-beamforming, perfect CSI; CBF, perfect CSI; CBF with LFB, Q = 6 bits per user.]
where
z=
H
Hkk
'
Hjk wj
z1
z2
*
,
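Therefore, only six scalar parts need to be quantized for computing the transmit beamforming and the receive combining vectors, i.e., $R_{H,11}$, $R_{G,11}$, $\mathrm{Re}\{R_{H,12}\}$, $\mathrm{Re}\{R_{G,12}\}$, $\mathrm{Im}\{R_{H,12}\}$, and $\mathrm{Im}\{R_{G,12}\}$, where $\mathrm{Re}\{\cdot\}$ and $\mathrm{Im}\{\cdot\}$ denote the real and imaginary part, respectively. On the other hand, $R_H$ and $R_G$ can be jointly quantized using vector quantization as follows:
$$v_H = \begin{bmatrix} R_{H,11} \\ \mathrm{Re}\{R_{H,12}\} \\ \mathrm{Im}\{R_{H,12}\} \end{bmatrix} \quad \text{and} \quad v_G = \begin{bmatrix} R_{G,11} \\ \mathrm{Re}\{R_{G,12}\} \\ \mathrm{Im}\{R_{G,12}\} \end{bmatrix}.$$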
Upon receiving the quantized values from the UEs over a control channel, the BSs can estimate $R_H$ and $R_G$ and compute the transmit beamforming vectors before transmitting the data.
Fig. 5.16 shows the achievable sum-rate results for i) coordinated beamforming, ii) non-cooperative eigen-beamforming, and iii) interference nulling algorithms introduced in [CHHT10]. For this figure, we model the elements of each UE's channel matrix as independent complex Gaussian random variables with zero mean and unit variance, $\mathcal{N}_C(0, 1)$. Note that the algorithm introduced is not directly related to the channel model. Once the BSs know all channel matrices, the transmit beamforming and receive combining vectors can be computed through Theorems 5.7 and 5.8. As can be seen from Fig. 5.16, the coordinated beamforming algorithm shows reasonably good sum-rate performance compared with the other solutions regardless of the SNR value. Note that the coordinated beamforming algorithm in the figure uses 6 bits of limited feedback per user, i.e., 3 bits each for $R_H$ and $R_G$.
5.3.4 Summary
In this section, we presented some recent results in downlink cooperative beamforming, which is important for interference management in upcoming cellular standards like 3GPP LTE-A. We described the details of several strategies, distinguishing whether each terminal is equipped with one or multiple receive antennas. We also presented simulation results for these strategies to illustrate the potential gains that can be obtained from this kind of CoMP. From an LTE-Advanced point of view, the most likely scenario is to have BSs with 2 or 4 transmit antennas and UEs with 2 receive antennas; thus, the solutions introduced in the chapter would be good candidates for LTE-A systems. A particular implementation of coordinated beamforming will be described and simulated in Section 14.4.3.
In this chapter, we focus on CoMP schemes where user data or received signals connected to multiple users are exchanged between base stations for joint signal processing. Such schemes promise larger spectral efficiency gains than pure interference coordination techniques, but typically come at the price of larger backhaul requirements and (particularly in the downlink) more severe synchronization requirements. After Sections 6.1 and 6.2 introduce centralized and decentralized uplink CoMP schemes, respectively, Sections 6.3 and 6.4 focus on the downlink.
6.1
6.1.1 Introduction
When multiple BSs are connected via perfect backhaul links with infinite capacity, uplink centralized joint detection resembles a multiple access channel (MAC) problem, where the CCU is a super-receiver, and the BSs form a distributed antenna system (DAS) [Mol01]. Consequently, various optimal or suboptimal multi-user detectors, such as the maximum likelihood (ML) detector, the linear minimum mean square error (LMMSE) detector, the MMSE detector with successive interference cancelation (SIC) or parallel interference cancelation (PIC), and iterative detectors based on the Turbo principle, can be used for joint detection. By removing the inter-cell interference to a certain extent and obtaining significant array gain and diversity gain as shown in Chapter 4, the system spectral efficiency can be significantly increased [Mol01, DP03].
When joint detection is implemented in practice, the received signals at the
cooperative BSs need to be quantized and then forwarded via the backhaul links
to the CCU for centralized processing. This entails requirements for large backhaul capacity, typically on the order of Mbps or even Gbps [MF07b, HFG09].
In existing or upcoming systems such as LTE, the backhaul capacity is typically
limited. Considering realistic backhaul constraints, we can transfer the locally
demapped signals or soft-decoded information to the CCU [FHG09], or forward
the locally compressed receive signals exploiting the correlation inherent in the
message of the UEs [MF09b].
One aspect of uplink CoMP lies in the fact that the signals received by multiple BSs are correlated, which leads to redundancy. This implies that each BS can exploit this signal correlation and compress its received signals, then transmit the compressed signals to the CCU, which reduces the information that needs to be exchanged via the backhaul links. The CCU then decompresses the signals with its own received signal as the side information (if it is a BS itself), and finally estimates the messages of the UEs [dCS09].
Another inherent feature of CoMP systems is their asymmetric channels, i.e., the average channel powers from one UE to its local BS and to other cooperative BSs are different, and the powers from multiple UEs in different cells to one BS also differ. The channel asymmetry cannot be compensated by power control [HYK+ 10], which is quite different from the near-far effect in single-cell multi-user systems. This does not change the structure of the joint detector at the CCU when infinite backhaul capacity is assumed, but it has a large impact on the system performance. Moreover, when the backhaul link constraint is taken into account, this feature introduces new degrees of freedom to design uplink CoMP schemes. Depending on the relative locations of multiple UEs (i.e., various user pairings), the CCU may jointly detect only some UEs, and the supporting BSs may quantize, compress, decode or even partially decode their received signals to reduce the information to be transmitted to the CCU [MF08a, SSPS09a].
6.1.2
$$y = Hs + n = H \underbrace{\begin{bmatrix} G_1 & & 0 \\ & \ddots & \\ 0 & & G_K \end{bmatrix}}_{G} x + n_o + n_p, \qquad (6.1)$$
where $y = [y_1^T, \ldots, y_M^T]^T$, and $y_m$ are the forwarded symbols from BS m with local processing errors, s and x are the transmitted symbols after and before precoding, H is the composite channel matrix, G is the (block-diagonal) precoding matrix, $n_o$ denotes the noise vector at all M BSs including the thermal noise at the receiver front-end and the inter-cluster interference from the cells outside of the cooperative cluster, and $n_p$ denotes the local processing errors at the BSs due to quantization or compression.
Assuming that the CCU knows the channel matrix H and the precoding matrix G perfectly, the system capacity region with unlimited backhaul is described in (3.4). We can use (3.4) to calculate the block error rate in fading channels, which can serve as an achievable lower bound for practical systems. Given the modulation and coding schemes, the transmission data rate is known. If, for one channel realization, the rate is larger than the sum capacity computed from (3.4), then the data block cannot be decoded correctly, i.e., an outage occurs.
The ML detection algorithm finds the most probable transmitted symbols, which, under the assumption of Gaussian noise, are those minimizing the Euclidean distance between the received signals and all possible received symbols,
$$\hat{x} = \arg\min_{x} \|y - HGx\|^2. \qquad (6.2)$$
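The search in (6.2) can be sketched as a brute-force enumeration over all candidate symbol vectors. The code below is a minimal illustration, assuming BPSK symbols and a small, hypothetical effective channel matrix standing in for the product HG; its cost grows exponentially with the number of streams, which is why suboptimal detectors are of practical interest.

```python
import itertools


def ml_detect(y, heff, alphabet=(-1.0, 1.0)):
    """Exhaustive ML detection: argmin over x of ||y - Heff x||^2.

    y is a list of received samples, heff a list of rows of the effective
    channel (playing the role of H G), alphabet the per-stream symbol set.
    """
    n_streams = len(heff[0])
    best, best_metric = None, float("inf")
    for candidate in itertools.product(alphabet, repeat=n_streams):
        metric = 0.0
        for row, y_i in zip(heff, y):
            predicted = sum(h * x for h, x in zip(row, candidate))
            metric += abs(y_i - predicted) ** 2
        if metric < best_metric:
            best, best_metric = candidate, metric
    return best


# Hypothetical 3x2 effective channel and a noiseless observation.
HEFF = [[1.0 + 0.2j, 0.3 - 0.1j], [0.4j, 0.9 + 0j], [0.2 + 0j, -0.5 + 0.3j]]
x_true = (1.0, -1.0)
y = [sum(h * x for h, x in zip(row, x_true)) for row in HEFF]
x_hat = ml_detect(y, HEFF)
```

In the noiseless case the detector recovers the transmitted vector exactly; with noise, it returns the candidate closest to the observation in Euclidean distance.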
(6.3)
(6.4)
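$$\begin{aligned}
\hat{x}_1 &= W_1^H y_1, \\
y_2 &= y_1 - H_1 G_1\, e(d(\hat{x}_1)), & \hat{x}_2 &= W_2^H y_2, \\
&\ \ \vdots \\
y_K &= y_{K-1} - H_{K-1} G_{K-1}\, e(d(\hat{x}_{K-1})), & \hat{x}_K &= W_K^H y_K,
\end{aligned} \qquad (6.5)$$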
where $c_{k,i}$ is the i-th coded bit of UE k, $X_{k,i}^{+} = \{[x_1^T, \ldots, x_K^T]^T : c_{k,i} = 1\}$ and $X_{k,i}^{-} = \{[x_1^T, \ldots, x_K^T]^T : c_{k,i} = -1\}$, $p(y|x)$ is a multivariate Gaussian distribution with the signal model of (6.1), $p(x) = \prod_{k=1}^{K} \prod_{i=1}^{N_k} p(c_{k,i})$, $N_k$ is the number of bits modulated on the $N_{ue}$ sub-streams of UE k, and $L_a(c_{k,i}) = \log(P(c_{k,i} = 1)/P(c_{k,i} = -1))$ denotes the a priori information from the decoding stage.
For the MMSE-PIC detector, we first need to construct the soft estimation of the transmitted symbols using the a priori information transferred from the decoding stage. Since $x_k$ is a mapping result of $c_{k,i}$, $i = 1, \ldots, N_k$, the soft estimation of $x_k$ is the weighted summation of all possible mapping results $x_k(c_{k,1}, \ldots, c_{k,N_k})$ with their corresponding probabilities,
$$\bar{x}_k = \sum x_k(c_{k,1}, \ldots, c_{k,N_k}) \prod_{i=1}^{N_k} p(c_{k,i}). \qquad (6.7)$$
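The soft estimate of (6.7) can be illustrated with a small enumeration over all bit patterns. This is a sketch under stated assumptions: bits take values in {+1, -1}, the a priori value is the log-ratio $L = \log(P(c = +1)/P(c = -1))$, and the BPSK and QPSK mappers are simple illustrative choices, not the mappings used in the simulations.

```python
import itertools
import math


def bit_probability(llr: float, bit: int) -> float:
    """P(c = bit) for bit in {+1, -1}, given L = log(P(c=+1)/P(c=-1))."""
    p_plus = 1.0 / (1.0 + math.exp(-llr))
    return p_plus if bit == 1 else 1.0 - p_plus


def soft_symbol(llrs, mapper):
    """Weighted sum of all mapping results with their probabilities, as in (6.7)."""
    estimate = 0.0
    for bits in itertools.product((1, -1), repeat=len(llrs)):
        weight = 1.0
        for llr, bit in zip(llrs, bits):
            weight *= bit_probability(llr, bit)
        estimate += mapper(bits) * weight
    return estimate


def bpsk(bits):
    """Illustrative BPSK mapper: one bit per symbol."""
    return float(bits[0])


def qpsk(bits):
    """Illustrative unit-energy QPSK mapper: two bits per symbol."""
    return complex(bits[0], bits[1]) / math.sqrt(2.0)


no_prior = soft_symbol([0.0], bpsk)          # no a priori knowledge
some_prior = soft_symbol([2.0], bpsk)        # equals tanh(L/2) for BPSK
strong_prior = soft_symbol([50.0, -50.0], qpsk)
```

With no a priori information the soft symbol is zero and nothing is cancelled; as the decoder becomes more confident, the soft symbol approaches a constellation point and the interference cancellation approaches the ideal case.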
Then, for each UE, we can subtract the estimated interference from other UEs,
$$\tilde{y}_k = y - HG\,\tilde{x}_k, \qquad (6.8)$$
where $\tilde{x}_k = (\bar{x}_1, \ldots, \bar{x}_{k-1}, 0, \bar{x}_{k+1}, \ldots, \bar{x}_K)^T$ is the estimated interference symbol vector. An MMSE filter is applied to $\tilde{y}_k$ to further suppress the residual interference plus noise, which is given by
kH }1 E{
yk y
yk xH
Wk = E{
k }
H H
kG
HH
k QG
H + nn 1 Hk Gk ,
= Hk Gk Gk Hk + H
k
k
(6.9)
(6.10)
E{zk,j xH
k,j }
(6.11)
H
wk,j
Hk gk,j ,
=
where wk,j is the j-th column of matrix Wk , k,j =
and k,j is well-approximated by a Gaussian variable with zero mean and variance
#
"
2
k,j
= E |zk,j k,j xk,j |2 = k,j |k,j |2 .
(6.12)
The extrinsic information is given in the same form as in (6.6), except that y is replaced by $z_{k,j}$, x is replaced by $x_{k,j}$, and (6.1) is replaced by (6.11). Since the vector operation is replaced by multiple scalar operations for calculating the extrinsic information, the complexity is significantly reduced.
Simulation Settings and Results
In this subsection, we first show the block error rate (BLER) of these detectors, and then show the impact of user positions and pairing on the system sum-rate. We will compare the sum-rate achieved by the centralized joint detection with that of the non-cooperative MMSE-SIC, with which each BS only locally decodes the message of its own user and treats the inter-cell interference as noise.
We consider two cells, each with a 4-antenna BS and a single-antenna user. The distance between the two BSs is 500 m, the pathloss in dB follows $35.3 + 37.6 \log_{10} d$, and shadowing is not considered. The cell-center user is located on the line connecting the two BSs and is 50 m from its serving BS, and the cell-edge user is 245 m from its serving BS. The small-scale fading is assumed to be i.i.d. Rayleigh fading. The uplink transmit power is 30 dBm for both users, and the noise power at each BS is -99 dBm. We choose BS 1 as the CCU. BS 2 forwards its received signal via an infinite-capacity backhaul link to BS 1, i.e., $n_p = 0$. The sum-rates achieved by the joint LMMSE and MMSE-SIC are computed using Shannon's capacity formula with their respective signal-to-interference-and-noise ratio (SINR) and are averaged over realizations of small-scale fading. Note that the sum-rate of the MMSE-SIC is the same as the sum capacity shown in (3.4) [SAH+ 04].
The BLER versus signal-to-noise ratio (SNR) of various joint detection algorithms is shown in Fig. 6.1, where the two users are assumed to have identical SNR. Each user employs binary phase shift keying (BPSK) modulation and a rate-1/3 convolutional code with generators $(155, 117, 123)_8$. The data packet length is 256 bits. For ML, LMMSE, and MMSE-SIC detection, the soft-decision Viterbi decoder is used. For Turbo detection with MMSE-PIC, the Max-Log-MAP algorithm is used in the soft-input-soft-output decoder [WH02] with 6 iterations. To compare with the BLER lower bound, which is calculated from the achievable sum-rate formula, the data block comprises the data packets of both users, and a block error is therefore counted if either of the two packets is in error.
From this figure, we can see that the LMMSE detector performs similarly to the MMSE-SIC detector when convolutional coding and the soft-decision Viterbi decoder are employed. At a BLER of $10^{-2}$, the ML detector performs 1 dB better. Even without iteration, (6.6) is used to send the soft decision values to the Viterbi decoder; otherwise, the ML detector with hard decisions cannot outperform the LMMSE detector with soft decisions. The Turbo detection algorithm with the MMSE-PIC detector and soft-input-soft-output decoder has a 2 dB SNR gain over the non-iterative MMSE-SIC detection algorithm, and the gap to the theoretical lower bound is about 1 dB.
To show the gain of joint detection and observe the impact of user pairing
on the performance, Fig. 6.2 gives the sum-rate cumulative distribution func-
[Figure 6.1: BLER (vertical axis, logarithmic) versus SNR [dB] (horizontal axis) for the joint detection algorithms.]
6.1.3
Figure 6.2 Sum-rate CDF under infinite backhaul capacity, which shows the performance gap between joint detection and non-cooperative detection. 9 dB inter-cluster interference is considered. [Axes: CDF versus sum-rate [bit/channel use]; user pairings: center-center, center-edge, edge-edge; curves: non-cooperative detection, cooperative LMMSE, cooperative MMSE-SIC.]
BS and those already known by the CCU. Both the quantization noise and the
compression distortion will deteriorate the joint detection performance.
Local BS Processing Methods
Consider that one of the M BSs serves as the CCU. Without loss of generality, we again consider BS 1 as the CCU. The backhaul link constraint from BS m to the CCU is denoted as $C_m$. If a per-link capacity constraint is applied, all the links have the same capacity, i.e., $C_m = C$. If the sum-capacity constraint is applied, i.e., $\sum_m C_m = C$, the capacity of each link would be allocated according to some optimization criterion, such as sum-rate maximization in the cluster.
We first consider the direct quantization scheme. The data symbols received by BS m are quantized with $B_m$ bits each for the real and imaginary dimensions separately, where $B_m = C_m/(2N_{bs})$. Assume that the signal level is within the range $[-A, +A]$; then, with uniform quantization, the quantization noise variance in each dimension is
$$\sigma_{q,m}^2 = \frac{A^2\, 2^{-2B_m}}{3}. \qquad (6.13)$$
The covariance matrix of the quantization noise vector is therefore given as $\Phi_m = \mathrm{diag}([\sigma_{q,m}^2, \ldots, \sigma_{q,m}^2])$, representing the processing errors of BS m. We next address the source coding scheme without exploiting the signal correlation between BS m and BS 1. The covariance of the received signal of BS m is
$$\Phi_{yy,m} = H^m \Phi_{ss} (H^m)^H + \sigma_{o,m}^2 I, \qquad (6.14)$$
transmitted signals of one user, and $\sigma_{o,m}^2$ is the variance of the observation noise including the receiver thermal noise and the inter-cluster interference at BS m.
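For the direct quantization scheme of (6.13), the relation between backhaul capacity and quantization noise can be sketched as follows. The function names are illustrative, and the capacity and amplitude values in the example are assumptions; the formula itself is the standard uniform-quantizer result (step size $2A/2^{B}$, noise variance equal to the squared step size divided by 12).

```python
def bits_per_dimension(link_capacity: float, n_bs_antennas: int) -> float:
    """B_m = C_m / (2 * N_bs): bits per real or imaginary part per antenna."""
    return link_capacity / (2.0 * n_bs_antennas)


def quantization_noise_variance(amplitude: float, bits: float) -> float:
    """sigma_q^2 = A^2 * 2^(-2B) / 3 for a uniform quantizer on [-A, +A]."""
    return amplitude ** 2 * 2.0 ** (-2.0 * bits) / 3.0


# Assumed example: C_m = 16 bit/channel use, N_bs = 4 antennas, A = 1.
b_m = bits_per_dimension(16.0, 4)                 # 2 bits per dimension
noise_var = quantization_noise_variance(1.0, b_m)
```

Each additional bit per dimension reduces the quantization noise variance by a factor of four, i.e. about 6 dB, so a tight backhaul constraint translates directly into a higher effective noise floor at the CCU.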
According to the backhaul constraint, the distortion matrix $\Phi_m$ should satisfy
$$\log_2 \det\left( I + (\Phi_m)^{-1} \left( H^m \Phi_{ss} (H^m)^H + \sigma_{o,m}^2 I \right) \right) \le C_m. \qquad (6.15)$$
(6.17)
This is a lossy decentralized multi-source compression with receiver-side information. In practice, we can first use the conditional Karhunen-Loève transform to decompose the vector signal into independent streams, and then compress the resulting scalar streams separately. Given the quantization noise or compression distortion matrix $\Phi_m$ in BS m, the sum-rate achievable with joint detection is
$$R_{sum} = \log_2 \det\left( I + (\Phi_{nn} + \Phi)^{-1} H \Phi_{ss} H^H \right), \qquad (6.18)$$
where $\Phi = \mathrm{diag}(\Phi_1, \ldots, \Phi_M)$, $\Phi_{nn} = \mathrm{diag}\{\sigma_{o1}^2 I_{N_{bs}}, \ldots, \sigma_{oM}^2 I_{N_{bs}}\}$, and $I_{N_{bs}}$ is the identity matrix of size $N_{bs}$.
Figure 6.3 The impact of quantization noise and compression distortion on the [two panels, each plotting CDF versus sum-rate [bit/channel use]; curves: quantization, source coding, compr. with corr., infinite backhaul]
6.1.4
Figure 6.4 The impact of user pairing on the performance of joint MMSE-SIC, [axes: CDF versus sum-rate [bit/channel use]; user pairings: center-edge, edge-center; curves: quantization, source coding, compression with correlation]
As shown in Fig. 6.2 for the case where both users are at the cell-center, local
decoding performs closely to or even the same as joint decoding. These decoded
bits can be forwarded to the CCU to facilitate interference cancelation for other
cell-edge users. Otherwise, it is shown in Fig. 6.4 that under limited backhaul
capacity the cell-center user of the supporting BS will degrade the sum-rate no
matter whether we quantize or compress its received signals.
Due to the mutual interference, it is not wise to decode the information of all users at one BS only. On the other hand, if each BS demodulates and transfers all the soft information of multiple users, the transfer rate may even exceed that of the direct quantization scheme. For example, assume that 3 BSs cooperate to serve 3 users, where each user transmits a data stream with 16-QAM modulation. This amounts to 12 information bits for each BS. If 4 bits are used to represent the log-likelihood ratio (LLR) of each information bit, then a total of 48 bits needs to be transferred to the CCU. However, if we use 8 bits to quantize each received sample, only 16 bits need to be transferred for both the real and imaginary parts of the superposed signals.
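The comparison in this example is simple arithmetic, reproduced below as a sketch. The function and parameter names are illustrative, and the single receive antenna per BS is an assumption implied by the figure of 16 bits quoted in the text.

```python
def llr_backhaul_bits(n_users: int, bits_per_symbol: int, llr_resolution_bits: int) -> int:
    """Bits per channel use if a BS forwards one LLR per information bit of every user."""
    return n_users * bits_per_symbol * llr_resolution_bits


def quantized_sample_bits(sample_resolution_bits: int, n_rx_antennas: int = 1) -> int:
    """Bits per channel use if a BS forwards quantized real and imaginary parts."""
    return 2 * sample_resolution_bits * n_rx_antennas


llr_bits = llr_backhaul_bits(n_users=3, bits_per_symbol=4, llr_resolution_bits=4)  # 16-QAM
sample_bits = quantized_sample_bits(sample_resolution_bits=8)
```

Forwarding soft information for all users therefore costs three times as much backhaul as forwarding the quantized superposed signal in this setting, and the gap widens with the number of users and the modulation order.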
Therefore, it is better that the BSs locally decode the information of some UEs
and forward to the CCU, but locally process and then transfer the received signals
from other UEs to the CCU for joint detection, as also stated in Section 4.3.2.
This is in fact a receiver mode switching, which is analogous to the downlink
CoMP mode switching, where the BSs serve some UEs without cooperation and
jointly transmit to other UEs (see Section 6.4).
Partial Decoding and Rate Splitting
For those users who are between cell-center and cell-edge, we can divide the data streams of each user into two parts. One is decoded by the local BS, and the other is decoded at the CCU. In particular, for each UE that is in the cells of supporting BSs, we use rate splitting to divide its data streams, which is implemented by superposition coding [MF08a, SSPS09a]. At each supporting BS, the received messages are partially decoded and the residual received signals are compressed and then forwarded to the CCU. At the CCU, MMSE-SIC is used for joint detection. By optimizing the power allocated to the two parts of the data streams, the maximum sum-rate can be achieved under a certain constraint on backhaul capacity. Partial decoding at the local BS can reduce the information to be transferred to the CCU, but may also reduce the users' rate since decoding is performed under interference. It is expected that such a multi-level decoding strategy may provide a smooth transition between the two extreme schemes: (i) each BS only compresses its received raw signals, or (ii) each BS only locally decodes the messages of its own users [dCS09], as also pointed out in Section 4.3.2.
However, numerical results show that the performance gain of rate splitting is marginal (e.g. compared to a time-share between different cooperation strategies). The same conclusion is drawn in [MF08a, SSPS09a] with slightly different forms of superposition coding.
This is because CoMP channels are asymmetric, which cannot be compensated by power allocation. Using rate splitting for a user, say UE k, is equivalent to dividing the user into two users with different powers under a sum power constraint, UE k1 and UE k2. In CoMP systems, from the viewpoint of its local BS, say BS 1, UE k1 allocated with more power looks like a cell-center user. Other BSs, however, also receive higher power from UE k1. In contrast, a true cell-center user in BS 1 leads to high receive power at its local BS but low receive power at other BSs, which makes local decoding more desirable. It is thus clear that rate splitting is not equivalent to dividing a user into two users with different positions, and hence does not improve the performance as expected.
6.1.5
BSs occurs on the same time-frequency resources (i.e., the same OFDM symbols and the same sub-carriers). This is a notable contrast to standard transmission in WiMax, where UEs in adjacent cells are transmitting on (mostly) different time-frequency resources, in order to reduce interference. Uplink sounding, i.e., determination of the channel information for the different UEs, is done in such a way that the sounding signals allocated to the different UEs are orthogonal to each other, as in Section 9.1. When macro-diversity is enabled (i.e., only one signal exists on a time-frequency resource), the BSs exchange log-likelihood ratios of the bit decisions over the backhaul network. For cooperative reception (multiple UEs on a time-frequency resource), quantized versions of the received signals are exchanged over the backhaul network.
LTE-A also considers CoMP concepts, where similar provisions are made. For
centralized uplink CoMP, there should be backhaul links to connect the eNodeBs
to the CCU. Consider the per-link capacity, which is simply C · BW. In an LTE-A
system, each eNodeB includes three BSs, the number of antennas
at each BS may be 2, 4 or 8, and the bandwidth BW is 20 MHz. Here we
ignore the various overheads and simply multiply the backhaul constraint in
bit/channel use by the system bandwidth to obtain Mbit/s. Therefore, when
C = 3 bit/channel use or C = 10 bit/channel use, the required per-link backhaul
is 180 Mbit/s or 600 Mbit/s, respectively (these figures equal 3 · C · BW, i.e., they
account for the three BSs of one eNodeB). When considering direct quantization,
this implies Bk = 0.375 bits and Bk = 1.25 bits per antenna when Nbs = 4.
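The arithmetic behind these figures can be checked with a short sketch. It rests on two assumptions that reproduce the quoted numbers but are not stated explicitly in the text: one backhaul link carries the traffic of the three BSs of an eNodeB, and direct quantization spreads the C bits over the real and imaginary parts of each of the Nbs antenna signals. The function names are illustrative only.

```python
def backhaul_rate_mbps(c_bits_per_use: float, bandwidth_mhz: float,
                       bs_per_enodeb: int = 3) -> float:
    """Per-link backhaul rate in Mbit/s, ignoring all overheads.

    Assumption: the link aggregates the BSs of one eNodeB, so the rate
    is bs_per_enodeb * C * BW (bit/channel use times MHz gives Mbit/s).
    """
    return bs_per_enodeb * c_bits_per_use * bandwidth_mhz


def bits_per_real_dimension(c_bits_per_use: float, n_bs_antennas: int) -> float:
    """Quantization bits per antenna and real dimension.

    Assumption: C is shared equally by the real and imaginary parts of
    the signals of all n_bs_antennas antennas.
    """
    return c_bits_per_use / (2 * n_bs_antennas)


rates = [backhaul_rate_mbps(c, 20.0) for c in (3.0, 10.0)]
bits = [bits_per_real_dimension(c, 4) for c in (3.0, 10.0)]
```

Under these assumptions, `rates` holds 180 and 600 Mbit/s and `bits` holds 0.375 and 1.25, matching the values above.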
6.1.6 Summary
In this section, uplink centralized joint detection was studied, where multiple
BSs forward partially processed receive signals to a CCU or another BS for a
joint and centralized decoding of multiple terminals.
Different joint detection concepts were first introduced under the assumption
of infinite backhaul capacity between cooperative base stations, and compared
in terms of performance and complexity. Then, constrained backhaul links were
considered and schemes introduced where local preprocessing and compression
are performed before signals are forwarded over the backhaul. It was shown that
the impact of compression distortion on the performance depends strongly on
user locations and pairing. Furthermore, results confirmed that it is beneficial to
let some terminals be decoded locally (and the data bits potentially forwarded
to the CCU for interference cancelation), while others are decoded by the CCU
itself, which can be seen as a hybrid of decentralized and centralized cooperation, as discussed in Section 4.3.1.
Finally, the provisions for uplink centralized joint processing in WiMAX and
LTE-A standardization were discussed.
6.2
6.2.1
The useful channels for each UE k are the channels between this UE and all
the BSs. The selection of significant useful channels for each UE k is performed
based on the k-th column vector of the channel matrix $\mathbf{H}$. Obviously, a single
matrix $\check{\mathbf{H}}_{\mathrm{U}}$ of dimensions $N_{\mathrm{BS}} \times K$ is sufficient to represent all the significant
useful channels for all the UEs.
The interfering channels for each UE k are the channels between other UEs
and all the BSs. A significant interfering channel for one UE could be considered
as an insignificant interfering channel for another UE. Therefore, it is reasonable
to separately represent the significant interfering channels for individual UEs k
by individual UE-specific significant interfering channel indicator matrices $\check{\mathbf{H}}_{\mathrm{I}}^{(k)}$.
Furthermore, for each UE k there are two kinds of channels irrelevant to the
interference considered in the proposed decentralized CoMP scheme, and they
are indicated as don't care elements in each $\check{\mathbf{H}}_{\mathrm{I}}^{(k)}$. Firstly, corresponding to the
useful channels for UE k, the elements in the k-th column of $\check{\mathbf{H}}_{\mathrm{I}}^{(k)}$ are certainly
don't care elements. Secondly, the elements in the rows of $\check{\mathbf{H}}_{\mathrm{I}}^{(k)}$ corresponding
to the insignificant useful channels for this UE k are also don't care elements.
The reason is that in the proposed JD algorithm the received signals at the BS
antennas corresponding to the insignificant useful channels for each UE k will
not be used in the data estimation of this UE.
Taking the channel group including all channels between all antennas of one
BS and one UE as a selection unit, a practical significant channel selection scheme
according to the following mathematical criteria is proposed. For each UE k,
we first select its significant useful channels. Let $\mathcal{A}_m$ denote the set of indices a
of the antennas belonging to BS m, m = 1, ..., M. The channel from UE k to BS
antenna a, characterized by the coefficient $h_k^a$, is selected as a significant useful
channel if the channel group gain $\sum_{a \in \mathcal{A}_m} |h_k^a|^2$ covers a significant portion of the
sum of all useful channel gains for this UE, $\sum_{m} \sum_{a \in \mathcal{A}_m} |h_k^a|^2$. Then, we select
the significant interfering channels for each UE k based on the channel coefficients excluding the don't care elements. Let $\mathcal{B}_k$ denote the set of indices of
the BSs corresponding to the selected significant useful channel groups for UE k.
For each UE k, a channel with the channel coefficient $h_{k'}^a$, $k' \neq k$, is selected as
a significant interfering channel if it fulfills the following condition: the channel
group weighting factor magnitude $\left|\sum_{a \in \mathcal{A}_m} (h_k^a)^{\mathrm{H}} h_{k'}^a\right|$, corresponding to the scaling
of the interference in the matched filtering estimate, covers a significant portion
of the sum of the channel group weighting factor magnitudes
$\sum_{k' \neq k} \sum_{m \in \mathcal{B}_k} \left|\sum_{a \in \mathcal{A}_m} (h_k^a)^{\mathrm{H}} h_{k'}^a\right|$ for all the interferences to UE k. In fact, we
implicitly select the relevant significant interfering channels based on the selected
significant useful channels. As can be seen from the above mathematical criteria,
the selection of $h_{k'}^a$ depends on the considered useful channel coefficient $h_k^a$.
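The two selection criteria can be sketched as follows. The text does not quantify what a "significant portion" is, so the thresholds `tau_useful` and `tau_interf` below are hypothetical parameters, and comparing each group against a fixed fraction of the total is only one possible reading. The function name and the 3-cell example matrix are illustrative as well. Indices are zero-based.

```python
from typing import Dict, List, Set, Tuple

Matrix = List[List[complex]]  # H[a][k]: coefficient from UE k to BS antenna a


def select_significant_channels(
    H: Matrix,
    antenna_groups: List[List[int]],  # antenna_groups[m] lists the antennas of BS m
    tau_useful: float = 0.2,          # hypothetical "significant portion" threshold
    tau_interf: float = 0.2,          # hypothetical "significant portion" threshold
) -> Tuple[Dict[int, Set[int]], Dict[int, Set[Tuple[int, int]]]]:
    """Return, per UE k, the significant useful BS groups and the
    significant interfering (BS m, UE k') channel groups."""
    num_ues = len(H[0])
    useful: Dict[int, Set[int]] = {}
    interfering: Dict[int, Set[Tuple[int, int]]] = {}
    for k in range(num_ues):
        # Channel group gains of UE k towards every BS.
        gains = [sum(abs(H[a][k]) ** 2 for a in group) for group in antenna_groups]
        total_gain = sum(gains)
        useful[k] = {m for m, g in enumerate(gains)
                     if total_gain > 0 and g >= tau_useful * total_gain}
        # Weighting factor magnitudes, restricted to the selected useful groups.
        weights: Dict[Tuple[int, int], float] = {}
        for m in useful[k]:
            for kp in range(num_ues):
                if kp != k:
                    corr = sum(H[a][k].conjugate() * H[a][kp]
                               for a in antenna_groups[m])
                    weights[(m, kp)] = abs(corr)
        total_weight = sum(weights.values())
        interfering[k] = {key for key, w in weights.items()
                          if total_weight > 0 and w >= tau_interf * total_weight}
    return useful, interfering


# Illustrative 3-cell example with one antenna per BS (values are made up).
H_example = [[1.0, 0.1, 0.8],
             [0.7, 1.0, 0.1],
             [0.1, 0.6, 1.0]]
useful_example, interfering_example = select_significant_channels(
    H_example, antenna_groups=[[0], [1], [2]])
```

Because the interfering channels are only evaluated over the BSs already selected as useful for UE k, the rows belonging to insignificant useful channels are never considered, which mirrors the don't care elements described above.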
Two UEs have compatible significant interfering channels if all the significant
interfering channels selected for one UE are never considered as insignificant
interfering channels for the other UE. If all the UEs have compatible significant
interfering channels, all the individual UE-specific significant interfering channel
Figure 6.5 Example for significant channel selection and indicator matrix formalism.
(6.19)
$$\mathbf{H}_{\mathrm{I}}^{(k)} = \mathbf{H} \odot \check{\mathbf{H}}_{\mathrm{I}}^{(k)}, \tag{6.20}$$
(6.21)
(6.22)
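Applying the iterative ZF JD algorithm with full CSI, i.e., the PIC algorithm,
the estimated data vector $\hat{\mathbf{x}}(i)$ in the i-th iteration can be described by [Ver98]
$$\hat{\mathbf{x}}(i) = \mathrm{diag}\left(\mathbf{H}^{\mathrm{H}} \mathbf{H}\right)^{-1} \left(\mathbf{H}^{\mathrm{H}} \mathbf{y} - \overline{\mathrm{diag}}\left(\mathbf{H}^{\mathrm{H}} \mathbf{H}\right) \tilde{\mathbf{x}}(i-1)\right), \tag{6.23}$$
where $\mathrm{diag}(\cdot)$ keeps only the diagonal elements of its argument and $\overline{\mathrm{diag}}(\cdot)$ sets all the elements on the diagonal of its argument to zero. At the
end of every iteration i, the estimated data vector $\hat{\mathbf{x}}(i)$ could be forwarded to a
data estimate refiner applying hard quantization or soft quantization techniques
to obtain the refined estimated data vector $\tilde{\mathbf{x}}(i)$. Now we will apply the significant
CSI described by $\mathbf{H}_{\mathrm{U}}$ given in (6.19) and by $\mathbf{H}_{\mathrm{I}}^{(k)}$ given in (6.20) instead of full
CSI in the PIC algorithm. According to the functionalities of the channels in
data transmission, firstly the significant useful channel coefficients in $\mathbf{H}_{\mathrm{U}}$ will be
considered in the matched filtering part. Then, these significant useful channel
coefficients and their corresponding significant interfering channel coefficients in
$\mathbf{H}_{\mathrm{I}}^{(k)}$ will be considered in the iterative interference cancelation part. In this way,
the proposed iterative ZF JD algorithm with significant CSI can be derived as
$$\hat{\mathbf{x}}(i) = \mathbf{\Lambda}^{-1} \left(\mathbf{H}_{\mathrm{U}}^{\mathrm{H}} \mathbf{y} - \overline{\mathrm{diag}}\left(\mathbf{\Phi}_{\mathrm{H}}\right) \tilde{\mathbf{x}}(i-1)\right). \tag{6.24}$$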
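In the above equation, the channel gain scaling matrix $\mathbf{\Lambda}$ is defined as
$$\mathbf{\Lambda} = \mathrm{diag}\left(\mathbf{H}_{\mathrm{U}}^{\mathrm{H}} \mathbf{H}_{\mathrm{U}}\right), \tag{6.25}$$
and the channel correlation matrix $\mathbf{\Phi}_{\mathrm{H}}$ is defined as
$$\mathbf{\Phi}_{\mathrm{H}} = \begin{pmatrix} [\mathbf{H}_{\mathrm{U}}]_{1}^{\mathrm{H}} \, \mathbf{H}_{\mathrm{I}}^{(1)} \\ \vdots \\ [\mathbf{H}_{\mathrm{U}}]_{K}^{\mathrm{H}} \, \mathbf{H}_{\mathrm{I}}^{(K)} \end{pmatrix}, \tag{6.26}$$
where the matrix operator $[\cdot]_k$ returns the k-th column vector of its argument.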
Knowing that a suitable data estimate refinement can further improve the system
performance, in this section the iterative algorithm applying the transparent data
estimate refinement, i.e., $\tilde{\mathbf{x}}(i) = \hat{\mathbf{x}}(i)$, is considered. Without loss of generality, it
can be treated as a benchmark for this kind of iterative algorithm with different
data estimate refinement techniques.
In the case that the linear iterative JD algorithm with significant CSI described
by (6.24) converges with $\hat{\mathbf{x}}(i) = \hat{\mathbf{x}}(i-1)$ and the matrix $\mathbf{\Lambda} + \overline{\mathrm{diag}}(\mathbf{\Phi}_{\mathrm{H}})$ has full
rank, the limiting value of $\hat{\mathbf{x}}(i)$ can be easily calculated from (6.24) as
$$\hat{\mathbf{x}}(\infty) = \left(\mathbf{\Lambda} + \overline{\mathrm{diag}}\left(\mathbf{\Phi}_{\mathrm{H}}\right)\right)^{-1} \mathbf{H}_{\mathrm{U}}^{\mathrm{H}} \mathbf{y}. \tag{6.27}$$
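As an illustration of the full-CSI iteration (6.23) with transparent refinement, the following sketch runs the recursion in plain Python and can be used to observe convergence towards the zero-forcing solution in the noise-free case. The helper names, the all-zero starting vector and the example channel are assumptions made for this sketch. Convergence is only guaranteed for suitable channels, for instance when $\mathbf{H}^{\mathrm{H}}\mathbf{H}$ is diagonally dominant as in the example.

```python
from typing import List

Vector = List[complex]
Matrix = List[List[complex]]  # H[a][k]: coefficient from UE k to BS antenna a


def hermitian_times(H: Matrix, v: Vector) -> Vector:
    """Return H^H v, i.e. the matched filtering estimate when v = y."""
    rows, cols = len(H), len(H[0])
    return [sum(H[a][k].conjugate() * v[a] for a in range(rows))
            for k in range(cols)]


def gram(H: Matrix) -> Matrix:
    """Return the Gram matrix H^H H."""
    rows, cols = len(H), len(H[0])
    return [[sum(H[a][k].conjugate() * H[a][j] for a in range(rows))
             for j in range(cols)] for k in range(cols)]


def pic_full_csi(H: Matrix, y: Vector, iterations: int) -> Vector:
    """Iterate (6.23) with transparent refinement, starting from x(0) = 0
    (the starting vector is an assumption of this sketch)."""
    G = gram(H)
    r = hermitian_times(H, y)
    K = len(r)
    x: Vector = [0j] * K
    for _ in range(iterations):
        # Subtract the reconstructed interference (off-diagonal part of G)
        # and scale by the channel gains (diagonal part of G).
        x = [(r[k] - sum(G[k][j] * x[j] for j in range(K) if j != k)) / G[k][k]
             for k in range(K)]
    return x


# Noise-free example: the iteration should approach the transmitted symbols.
H_demo = [[1.0, 0.2, 0.1],
          [0.1, 1.0, 0.2],
          [0.2, 0.1, 1.0]]
x_true = [1 + 0j, -1 + 0j, 1j]
y_demo = [sum(H_demo[a][k] * x_true[k] for k in range(3)) for a in range(3)]
x_hat = pic_full_csi(H_demo, y_demo, iterations=60)
```

A fixed point of this recursion satisfies $\mathbf{H}^{\mathrm{H}}\mathbf{H}\,\hat{\mathbf{x}} = \mathbf{H}^{\mathrm{H}}\mathbf{y}$, which is the full-CSI special case of the limiting value in (6.27).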
Under special conditions, this equation can be simplified step by step in different
cases as shown in the following:
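In the case that all the individual UE-specific significant interfering channel
matrices $\mathbf{H}_{\mathrm{I}}^{(k)}$ can be combined to one matrix $\mathbf{H}_{\mathrm{I}}$, one obtains
$$\mathbf{\Phi}_{\mathrm{H}} = \mathbf{H}_{\mathrm{U}}^{\mathrm{H}} \mathbf{H}_{\mathrm{I}}, \tag{6.28}$$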
(6.29)
(6.30)
(6.32)
(6.33)
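$$r_k^a = \left([\mathbf{h}_{\mathrm{U}}]_k^a\right)^{\mathrm{H}} y^a \tag{6.35}$$
for every UE k at all the BSs corresponding to its significant useful channels.
(b) Collect $r_k^a$ from the coordinated BSs through the backhaul links, and sum
them up at BS m = k to obtain the matched filtering estimate for each UE
k as
$$r_k = \sum_{a} r_k^a = \sum_{a} \left([\mathbf{h}_{\mathrm{U}}]_k^a\right)^{\mathrm{H}} y^a. \tag{6.36}$$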
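$$\left[f^{(k)}\right]_{k'}^{a} = \left([\mathbf{h}_{\mathrm{U}}]_k^a\right)^{\mathrm{H}} \left[\mathbf{h}_{\mathrm{I}}^{(k)}\right]_{k'}^{a} \tilde{x}_{k'}(i-1) \tag{6.37}$$
from different UEs $k' \neq k$ to UE k at the BSs with antenna indices a corresponding to the significant interfering channels for UE k.
(b) Collect the reconstructed interfering signals $\left[f^{(k)}\right]_{k'}^{a}$ from coordinated BSs
over the backhaul, and subtract them from $r_k$ at BS m = k to obtain the
estimated data symbol for every UE k as
$$\hat{x}_k(i) = \frac{1}{\lambda_k} \left(r_k - \sum_{k' \neq k} \sum_{a} \left[f^{(k)}\right]_{k'}^{a}\right) = \frac{1}{\lambda_k} \left(r_k - \sum_{k' \neq k} \sum_{a} \left([\mathbf{h}_{\mathrm{U}}]_k^a\right)^{\mathrm{H}} \left[\mathbf{h}_{\mathrm{I}}^{(k)}\right]_{k'}^{a} \tilde{x}_{k'}(i-1)\right). \tag{6.38}$$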
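A per-UE version of the iteration with significant CSI, following the matched filtering and interference cancelation steps of (6.36) and (6.38), might look as follows. The sketch assumes that the significant channel matrices are obtained by masking $\mathbf{H}$ element-wise with 0/1 indicator matrices, that refinement is transparent, and that the iteration starts from zero. All names are hypothetical, and the code runs in one process rather than across BSs, so no backhaul exchange is modeled.

```python
from typing import List

Vector = List[complex]
Matrix = List[List[complex]]   # H[a][k]: coefficient from UE k to BS antenna a
Mask = List[List[bool]]        # indicator matrix, same shape as H


def pic_significant_csi(H: Matrix, useful_mask: Mask, interf_masks: List[Mask],
                        y: Vector, iterations: int) -> Vector:
    """Iterate (6.38); interf_masks[k] is the indicator matrix of UE k.

    Assumption: every UE has at least one significant useful channel,
    so that the scaling factor lambda_k is non-zero.
    """
    A, K = len(H), len(H[0])
    # Significant useful channels H_U (assumed element-wise masking of H).
    HU = [[H[a][k] if useful_mask[a][k] else 0j for k in range(K)] for a in range(A)]
    # lambda_k: k-th diagonal element of diag(H_U^H H_U), see (6.25).
    lam = [sum(abs(HU[a][k]) ** 2 for a in range(A)) for k in range(K)]
    # Matched filtering estimates r_k, see (6.36).
    r = [sum(HU[a][k].conjugate() * y[a] for a in range(A)) for k in range(K)]
    x: Vector = [0j] * K
    for _ in range(iterations):
        new_x = []
        for k in range(K):
            # Sum of the reconstructed interfering signals for UE k.
            f_sum = sum(HU[a][k].conjugate() * H[a][kp] * x[kp]
                        for kp in range(K) if kp != k
                        for a in range(A) if interf_masks[k][a][kp])
            new_x.append((r[k] - f_sum) / lam[k])
        x = new_x
    return x
```

With all indicator entries set, this reduces to the full-CSI PIC iteration. With only the intra-cell entries of the useful mask set and no interfering channels selected, it reduces to conventional intra-cell matched filtering.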
Figure 6.6 Decentralized signal processing with significant CSI in a 3-cell system.
Figure 6.7 Backhaul communication steps for JD with partial CSI in a 3-cell system.
Messages $z_{m,l}$, $u_{m,l}$ and $v_{m,l}$ are each sent from BS l to BS m.
(NU 1)
matched ltering
upper
NBL
N L
and
(6.39)
interference cancellation
for 1 NI (K 1)
2 NI L
= (NU 1) + (NI + K 1) L for (K 1) < NI (K 1)2
.
2
(K
1)
L
for
(K
1)
<
N
(K
1)
K
I
matched ltering
interference cancellation
6.2.2 Performance Assessment
Analytical Calculations
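The system performance of the uplink decentralized CoMP scheme can be investigated based on the limiting value of the iterative ZF JD algorithm with significant CSI. Considering the transparent data estimate refinement, the limiting
value of the estimated data vector described by (6.27) can be rewritten as
$$\hat{\mathbf{x}}(\infty) = \underbrace{\mathrm{diag}\left(\left(\mathbf{\Lambda} + \overline{\mathrm{diag}}(\mathbf{\Phi}_{\mathrm{H}})\right)^{-1} \mathbf{H}_{\mathrm{U}}^{\mathrm{H}} \mathbf{H}\right) \mathbf{x}}_{\text{useful contribution}} + \underbrace{\overline{\mathrm{diag}}\left(\left(\mathbf{\Lambda} + \overline{\mathrm{diag}}(\mathbf{\Phi}_{\mathrm{H}})\right)^{-1} \mathbf{H}_{\mathrm{U}}^{\mathrm{H}} \mathbf{H}\right) \mathbf{x}}_{\text{interference}} + \underbrace{\left(\mathbf{\Lambda} + \overline{\mathrm{diag}}(\mathbf{\Phi}_{\mathrm{H}})\right)^{-1} \mathbf{H}_{\mathrm{U}}^{\mathrm{H}} \mathbf{n}}_{\text{noise}}. \tag{6.40}$$
The output SINR of UE k follows as
$$\gamma^{(k)} = \frac{S^{(k)}}{N^{(k)} + I^{(k)}}, \tag{6.41}$$
where
$$S^{(k)} = P \left[\mathrm{diag}\left(\left(\mathbf{\Lambda} + \overline{\mathrm{diag}}(\mathbf{\Phi}_{\mathrm{H}})\right)^{-1} \mathbf{H}_{\mathrm{U}}^{\mathrm{H}} \mathbf{H}\right) \mathrm{diag}\left(\mathbf{H}^{\mathrm{H}} \mathbf{H}_{\mathrm{U}} \left(\mathbf{\Lambda} + \overline{\mathrm{diag}}(\mathbf{\Phi}_{\mathrm{H}})\right)^{-\mathrm{H}}\right)\right]_{k,k},$$
$$I^{(k)} = P \left[\overline{\mathrm{diag}}\left(\left(\mathbf{\Lambda} + \overline{\mathrm{diag}}(\mathbf{\Phi}_{\mathrm{H}})\right)^{-1} \mathbf{H}_{\mathrm{U}}^{\mathrm{H}} \mathbf{H}\right) \overline{\mathrm{diag}}\left(\mathbf{H}^{\mathrm{H}} \mathbf{H}_{\mathrm{U}} \left(\mathbf{\Lambda} + \overline{\mathrm{diag}}(\mathbf{\Phi}_{\mathrm{H}})\right)^{-\mathrm{H}}\right)\right]_{k,k},$$
and
$$N^{(k)} = \sigma^2 \left[\left(\mathbf{\Lambda} + \overline{\mathrm{diag}}(\mathbf{\Phi}_{\mathrm{H}})\right)^{-1} \mathbf{H}_{\mathrm{U}}^{\mathrm{H}} \mathbf{H}_{\mathrm{U}} \left(\mathbf{\Lambda} + \overline{\mathrm{diag}}(\mathbf{\Phi}_{\mathrm{H}})\right)^{-\mathrm{H}}\right]_{k,k} \tag{6.42}$$
can be calculated from (6.40). Since only partial CSI instead of full CSI is applied
in the cooperative signal processing, the data estimates may contain slightly
rotated and scaled useful contributions. However, such a rotation or scaling can
be easily estimated and compensated at the receiver.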
Numerical Simulation Results
In the following, the system performance of the proposed CoMP scheme is
assessed with respect to numerical simulation results in Figs. 6.8 and 6.9. A
small cellular system including 3 cells with a frequency reuse factor of 1 as
shown in Fig. 6.5 is taken as the reference scenario. Some key pre-assumptions
for the simulations are listed as follows:
Figure 6.8 CDF of the output SINR in the uplink with N = 20 dB.
(a) MF-cell, JD-(2, 1), JD-(2, 3) and JD-(3, 6). (b) JD with L = 1, 2, 5 and an unlimited number of iterations. Horizontal axes: SINR [dB]; vertical axes: CDF.
Figure 6.9 Outage capacity and backhaul load vs. numbers of significant channel
groups considered in JD with L = 4 iterations, pout = 0.1, N = 20 dB, M = K = 2.
(a) Outage capacity Cout [bit/s/Hz] vs. NU. (b) Lower bound of the backhaul load NBL vs. NU.
considers only intra-cell useful channels but no inter-cell interfering channel in the
signal processing for each UE. The system performance is strongly limited by
the inter-cell interference. Applying the proposed decentralized CoMP scheme
considering significant CSI, the CDF curves of the SINRs for JD-(2, 1) and
for JD-(2, 3) are plotted. Obviously, the proposed decentralized CoMP
scheme considering a few appropriately selected UE-oriented significant channels
can strongly improve the system performance as compared to the conventional
intra-cell matched filtering scheme. The communication scheme denoted by JD-(3, 6) is simply the decentralized CoMP scheme applying the iterative
ZF JD algorithm with full CSI, considering all the useful and interfering channels for every UE. All the inter-cell interference is eliminated, and the system
performance is only limited by the Gaussian noise.
The SINR performance of the proposed decentralized CoMP scheme considering different numbers of iterations in the iterative JD algorithm is investigated
in Fig. 6.8(b). It is shown that after only a few iterations, the SINR performance
of the iterative JD converges to that of the corresponding JD with an unlimited number of iterations. The convergence behavior is well retained even if
significant CSI instead of full CSI is considered in JD.
In Fig. 6.9, the influence of different amounts of considered significant CSI
on the system performance and on the backhaul load of the proposed decentralized CoMP scheme is investigated. According to the proposed significant
channel selection scheme, for each number of significant useful channel groups
$N_{\mathrm{U}}$, the number of significant interfering channel groups $N_{\mathrm{I}}$ can range from
1 to $(K-1) N_{\mathrm{U}}$. In Fig. 6.9, $(K-1) N_{\mathrm{U}}$ bars are plotted for each number $N_{\mathrm{U}}$,
corresponding to all possible numbers of significant useful and interfering channels considered in JD-($N_{\mathrm{U}}$, $N_{\mathrm{I}}$). In Fig. 6.9(a), the outage capacity of UEs in
the 3 cells is plotted. In Fig. 6.9(b), the lower bound of the backhaul load as
described by (6.39) is plotted. Since i.i.d. Gaussian data symbols and i.i.d. noise
signals are considered, it is reasonable to assume that the remaining interfering
signals and the noise signals after the linear JD are uncorrelated and Gaussian
distributed. With $\gamma^{(k)}$ indicating the SINR for UE k, the corresponding instantaneous capacity is calculated as
$$C_{\mathrm{int}}^{(k)} = \log_2\left(1 + \gamma^{(k)}\right). \tag{6.43}$$
The outage capacity $C_{\mathrm{out}}$ is defined w.r.t. its outage probability $p_{\mathrm{out}}$ as
$$p_{\mathrm{out}} = \mathrm{Prob}\left(C_{\mathrm{int}}^{(k)} < C_{\mathrm{out}}\right), \quad k = 1, 2, 3, \tag{6.44}$$
where $p_{\mathrm{out}}$ describes the probability that the instantaneous capacity $C_{\mathrm{int}}^{(k)}$ of
one UE is smaller than the outage capacity $C_{\mathrm{out}}$. For every pair of ($N_{\mathrm{U}}$, $N_{\mathrm{I}}$),
an outage capacity $C_{\mathrm{out}}$ can be calculated based on a given $p_{\mathrm{out}}$. The most
important results derived from Fig. 6.9 are the following:
Generally, the more significant channels are considered in the decentralized
CoMP scheme, the better the system performance that can be achieved. The more
significant useful channels are considered, the more noise can be suppressed.
The more interfering channels are considered in JD, the more interference can
be eliminated, but the larger the noise enhancement will be.
Interestingly, it is shown that for a given number of significant interfering
channel groups $N_{\mathrm{I}}$, a larger outage capacity can be achieved when considering
a smaller number of significant useful channel groups $N_{\mathrm{U}}$. The reason is that
the more significant useful channel groups are considered, the more BSs are
involved in JD, and the more interference is included in the matched filtering
data estimate for each UE. The system performance of the proposed scheme is
mainly limited by the remaining interfering channels which are not considered
as significant interfering channels in JD. In fact, the outage capacity increases
with the number of significant useful channel groups $N_{\mathrm{U}}$ and the ratio
$$\rho = \frac{N_{\mathrm{I}}}{(K-1) N_{\mathrm{U}}}, \tag{6.45}$$
rather than directly with the number of significant interfering channel groups
$N_{\mathrm{I}}$. The variable $\rho$ indicates the ratio of the number of considered significant
interfering channels to the total number of interfering channels for each UE.
Considering only a few appropriately selected significant channels, a good
system performance with a moderate backhaul load can be achieved by the
proposed JD scheme. For example, JD with full CSI, i.e., JD-(3, 6), requires
a backhaul load of at least $N_{\mathrm{BL}} = 26$ exchanged messages and achieves an
outage capacity of $C_{\mathrm{out}} = 5.37$ bit/s/Hz, while JD-(2, 3), which requires
a backhaul load of only $N_{\mathrm{BL}} = 13$, already achieves an outage capacity of $C_{\mathrm{out}} =
4.96$ bit/s/Hz. Hence, a good compromise between backhaul load and data rate
can be made by considering significant CSI in JD.
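The outage capacity of (6.44) and the ratio of (6.45) can be estimated from simulated SINR samples along the following lines. This is a minimal sketch: the empirical-quantile rule used here is one of several reasonable conventions, and the function names and sample values are illustrative rather than taken from the simulations above.

```python
import math
from typing import Sequence


def instantaneous_capacity(sinr_linear: float) -> float:
    """Instantaneous capacity in bit/s/Hz for a linear-scale SINR, see (6.43)."""
    return math.log2(1.0 + sinr_linear)


def outage_capacity(sinr_samples: Sequence[float], p_out: float) -> float:
    """Empirical outage capacity: a capacity value such that roughly a
    fraction p_out of the samples lies strictly below it, see (6.44)."""
    caps = sorted(instantaneous_capacity(s) for s in sinr_samples)
    idx = min(int(math.floor(p_out * len(caps))), len(caps) - 1)
    return caps[idx]


def interference_ratio(n_i: int, n_u: int, num_ues: int) -> float:
    """Considered over total interfering channel groups per UE, see (6.45)."""
    return n_i / ((num_ues - 1) * n_u)


# Illustrative samples whose capacities are exactly 1, 2, ..., 10 bit/s/Hz.
samples = [2 ** c - 1 for c in range(1, 11)]
c_out = outage_capacity(samples, p_out=0.1)
```

For the two schemes compared above with K = 3, `interference_ratio(3, 2, 3)` gives 0.75 for JD-(2, 3) and `interference_ratio(6, 3, 3)` gives 1.0 for JD-(3, 6).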
6.2.3 Summary
A practical uplink decentralized CoMP scheme has been proposed in this section.
The decentralized CoMP scheme can be directly implemented at the coordinated
BSs in a flexible way without requiring a central unit. Distinguishing the significant useful channels from the significant interfering channels, only the channel
state information which plays a significant role in the system performance of
each UE is required in joint detection. A good compromise between data rate
and backhaul load can be made in the proposed decentralized CoMP scheme
considering significant CSI.
6.3
to-average power ratio (PAPR) for linear precoding [JHJvH02] as well as for
non-linear Tomlinson-Harashima precoding (THP) [NMK+ 07]. In general, by
limiting each user to report its strongest eigenmodes only, feedback may be
reduced.
Recent results applying generalized MIMO techniques in wireless networks
show huge gains [FHK+ 05, KFV06]. In addition, [ZD04] proposes a common
framework to study multi-user CoMP downlink transmission, considers practical
signal processing issues and emphasizes the advantage of array gain, enhanced
channel rank and macro diversity. In [JJT+ 09], the authors confirm these findings
based on channel measurements from a real cellular urban-macro deployment.
The work in [VHLV09] promises significant gains obtained from a simulator
including realistic operational conditions valid for a WiMAX system operated in
an indoor scenario.
However, higher complexity, growing data rates on the backhaul and the
additional overhead remain serious challenges for the introduction of CoMP in
next generation mobile networks. Note that these costs can be scaled down
with the size of the cooperation cluster. In the downlink, backhaul requirements increase at least linearly with the number of BSs belonging to the cluster [HS09] (in a centralized approach). See a detailed discussion on backhaul
requirements in both centralized and decentralized downlink CoMP in Section 12.2. Hence, a distributed implementation of CoMP is realistic where the
serving BS cooperates with a small subset of BSs (see Fig. 6.10) in its direct
vicinity [ZSK+ 06, MF07a, JTS+ 08b, NEHA08, PHG09, ZMS+ 09a, TWH+ 09].
Sections 13.3 and 13.4 present details on a real-time implementation [JTW+ 09]
of downlink CoMP demonstrating its feasibility.
The subsequent section is organized as follows: In Subsection 6.3.1, we introduce an extended system model which covers the algorithms described in this
work. Then we continue with a general description of CoMP joint transmission (JT) obtained in a fully centralized setup and determine the system capacity by use of DPC under a sum power constraint in Subsection 6.3.2. In the
next steps, we introduce concepts to alleviate the major drawbacks related to
joint transmission, e.g., its higher complexity and increased backhaul and signaling overhead. Those concepts cover linear precoding techniques, a greedy user
selection process and clustering solutions in Subsection 6.3.3. The clustering can
be carried out statically or dynamically and restricts joint processing techniques
to a limited number of base stations. Moreover, the cluster formation may be
performed and optimized by a central entity (network-centric), or in a per-user
way (user-centric). In Subsection 6.3.4, we introduce a concept for a unified
channel state information (CSI) feedback framework to cope with different vendor-specific types of channel feedback provided by mobile devices. This concept
is well aligned with the estimation of an effective channel described in Section 9.1. Finally, we summarize a system concept where each terminal provides
channel feedback to its serving base station only; the base stations in the same
cluster exchange the channel feedback and payload data in order to determine
the precoding weights and perform the spatial precoding, both in a distributed
manner.
6.3.1 System Model
We consider a cellular orthogonal frequency division multiplex (OFDM) downlink where a central site is surrounded by multiple tiers of sites. As in Chapter 3,
we assume each site to be partitioned into three 120° sectors or cells, yielding a set $\mathcal{M}$ consisting of $M = |\mathcal{M}|$ sectors in total. In our notation, each sector
constitutes a cell which is controlled by one BS, and frequency resources are fully
reused in all M cells. In joint transmission, the data to each user is simultaneously transmitted from multiple BSs. In order to mitigate the overhead related
to joint transmission techniques, BSs are grouped into C subsets or clusters, of
which one example is shown in Fig. 6.10. $\mathcal{M}_c$ represents the set of cells included
in a cluster c and $M_c = |\mathcal{M}_c|$ denotes its maximum dimension. Joint processing is only allowed between BSs belonging to the same cluster, whereas BSs
belonging to different clusters are not coordinated and thus produce residual
inter-cluster interference. As an extension, coordinated beamforming techniques
may be used to deal with the interference between clusters, i.e. to coordinate
the inter-cluster interference, as introduced in Section 5.3. Further, we assume
disjoint clusters, i.e. a given BS cannot belong to more than one cluster operated
at the same time/frequency resource, as for example created through clustering
j{Kc \k}
j{K\Kc }
Desired signal
Intra-cluster interference k
Inter-cluster interference zk
6.3.2
where $\mathbf{\Phi}_{ss,k}^{c} = E\{\mathbf{s}_k^c (\mathbf{s}_k^c)^{\mathrm{H}}\}$ and $\mathbf{\Phi}_{ss,k}^{m} = E\{\mathbf{s}_k^m (\mathbf{s}_k^m)^{\mathrm{H}}\}$, and $P_{\max}$ is the per-base
station power budget. We note that the original MIMO BC capacity ([CS03],
[JVG04]) was calculated under a sum-power constraint. The problem of finding
the sum-capacity region of a downlink system with a per-antenna power constraint was considered in [YL07], where a generalized uplink-downlink duality
was established.
BSs are considered as a huge distributed antenna system (DAS) (i.e. spanning
one huge cluster of cells), which jointly serve a set of users $\mathcal{U}$, of which $\mathcal{K} \subseteq \mathcal{U}$ are
served on the same resource in time and frequency. Note, in contrast to the typical
Rayleigh fading assumption, the MIMO channels in this evaluation do not have
the same average signal-to-noise ratio (SNR). This is caused by the different
pathloss coefficients to the different antenna arrays of the BSs in the cellular
deployment. The cellular channels are generated by use of the spatial channel
model extended (SCME) with a 3D antenna pattern. We are using an iterative
water-filling algorithm with a sum-power constraint [JRV+ 05] to determine the
maximum sum-rate of the system as a function of the size of the active set of
users $\mathcal{U}$. While in practice we would rather consider a per-BS or per-antenna
power constraint than a sum-power constraint, these results should provide an
overview on a well-known water-filling algorithm and its achievable sum-rates
in a cellular deployment. Note that Section 6.3.4 evaluates a linear precoding
scheme with per-antenna power constraint.
For the results shown in Fig. 6.11, we assume that the transmit power per
physical resource block (PRB) emitted by each BS is set to Pi = 400 mW (equivalent to the full transmit power in LTE systems of 40 W for 20 MHz of bandwidth), Pi = 40 mW or Pi = 4 mW. As noise, we assume thermal noise given
at 20 °C and an additional receiver noise figure of 9 dB. In particular, the high
transmission power of Pi = 400 mW yields a very high average SNR of 38 dB
for each user in $\mathcal{U}$ and its specific serving cell in $\mathcal{M}$. We are aware that such high
SNRs cannot be achieved in practice due to various impairments in the system
hardware, such as resolution of analog to digital conversion (ADC), phase noise
etc.
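The quoted power and noise assumptions can be put into numbers with a small sketch. It assumes the usual LTE numerology, which the text does not state explicitly: a PRB of 12 sub-carriers with 15 kHz spacing (180 kHz) and 100 PRBs in a 20 MHz carrier. The "implied loss" at the end is only the difference between per-PRB transmit power, per-PRB noise power and the stated 38 dB average SNR, not a value taken from the evaluation. All names are illustrative.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23

# Assumptions (standard LTE numerology, not stated in the text):
PRB_BANDWIDTH_HZ = 12 * 15e3   # 180 kHz per physical resource block
PRBS_IN_20_MHZ = 100           # resource blocks in a 20 MHz carrier


def noise_power_dbm(bandwidth_hz: float, temperature_c: float = 20.0,
                    noise_figure_db: float = 9.0) -> float:
    """Thermal noise power k*T*B in dBm plus the receiver noise figure."""
    thermal_watt = BOLTZMANN_J_PER_K * (temperature_c + 273.15) * bandwidth_hz
    return 10.0 * math.log10(thermal_watt / 1e-3) + noise_figure_db


def milliwatt_to_dbm(power_mw: float) -> float:
    return 10.0 * math.log10(power_mw)


total_tx_power_w = 0.4 * PRBS_IN_20_MHZ                # 400 mW per PRB
noise_per_prb_dbm = noise_power_dbm(PRB_BANDWIDTH_HZ)  # roughly -112.4 dBm
# Average loss that would map 400 mW per PRB to the stated 38 dB SNR.
implied_loss_db = milliwatt_to_dbm(400.0) - noise_per_prb_dbm - 38.0
```

Under these assumptions, 400 mW per PRB adds up to the 40 W full transmit power mentioned above, and the implied average loss is on the order of 100 dB.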
Fig. 6.11 shows the achievable BC capacity by use of an iterative water-filling
algorithm with sum power constraint [JRV+ 05]. The capacity is given for 1 to 5
active users per cell and for the low to high SNR regime. The capacity increases
with an increasing size of the set $\mathcal{U}$ of available users, hence multi-user diversity helps
to improve the capacity of the BC. For the low SNR regime, i.e. Pmax = 4 mW
or Pmax = 40 mW, the capacity increases by 78% when assuming 3 active users
instead of 1 user per cell. In contrast, we observe a slightly reduced slope in
the high SNR regime (Pmax = 400 mW), i.e. for an equivalent gain we need
to have 4 active users per cell. Peak capacities per cell, i.e. the 90th percentiles, approach
6.7 bit/s/Hz, 14.6 bit/s/Hz and 23.5 bit/s/Hz for 3 users per cell for the low to
high SNR regime, respectively. As a reference, we include the BC capacity for
Rayleigh fading channels and an average SNR of 38 dB per transmit antenna. It
turns out that the typical Rayleigh fading assumption with equivalent average
SNR overestimates the capacities by approx. 33%.
Figure 6.11 Cellular deployment with 21 cells, fully coordinated by use of iterative
WF [JRV+ 05] and sum power constraints. Rayleigh fading CDF is given for an
average per-antenna SNR of 38 dB. Error bars indicate the standard deviation of the
MIMO BC distributions.
Curves: 4 mW, 40 mW and 400 mW per PRB, and Rayleigh fading with mean per-antenna SNR = 38 dB. Horizontal axis: number of active users per cell.
6.3.3
spatial streams (or eigenmodes) are transmitted to each user with no inter-user
interference, resulting in a block diagonal (BD) covariance matrix.
An extension of the BD concept, called MET, was proposed in [BH07a] and
uses a linear transmission strategy based on zero-forcing beamforming for maximizing the weighted sum-rate. On a frame-by-frame basis, MET distributes up
to Mc Nbs spatially multiplexed streams for one or multiple users.
MET was initially proposed for multi-user MIMO (MU-MIMO) transmissions
and its extension to the CoMP case can be summarized as follows. Let us assume
that each user multiplies its channel matrix by the Hermitian of the dominant left
eigenvector. The effective channel after linear antenna combining is then
$$\mathbf{h}_{k,\mathrm{MET}}^{c} = (\mathbf{u}_k^c)^{\mathrm{H}} \mathbf{H}_k^c = (\mathbf{u}_k^c)^{\mathrm{H}} \mathbf{U}_k^c \mathbf{\Sigma}_k^c (\mathbf{V}_k^c)^{\mathrm{H}} = \sigma_{k,c} (\mathbf{v}_k^c)^{\mathrm{H}}. \tag{6.49}$$
for matrix
Pc as given in [ZD04]:
$
Pc =
min
m=1,...,M
;
Pmax
"Wm "2
5
I[KK] ,
(6.53)
where $\mathbf{W}_m$ are the rows of matrix $\mathbf{W}$ related to the antennas of BS m. Note that
this power allocation is suboptimal and typically results in only one BS antenna
transmitting with maximum power, and hence, the remaining $M_c N_{\mathrm{bs}} - 1$ antennas transmit with less than $P_{\max}/N_{\mathrm{bs}}$.
Clustering for Reducing Feedback and Backhaul Data Traffic
From a practical point of view, one of the major drawbacks related to joint
processing is its higher complexity, i.e. increasing backhaul and signaling overhead. To reduce these complexity requirements, clustering solutions that restrict
joint processing techniques to a limited number of BSs have been proposed. In
these approaches, the network is statically or dynamically divided into clusters
of cells [BH07b, PGH08, TWH+ 09]. Moreover, the cluster formation may be
performed and optimized by a central entity (network-centric), or in a per-user
way (user-centric). In [TBB+ 10], dynamic BS clustering was found
to be key to spectrally efficient CoMP transmission while keeping the
backhaul traffic at a moderate level.
The work described in [BHA08] considers that BS clusters are created in a
dynamic way (see Section 7.2), in other words at each time slot t the sets of coordinated BSs are generated in order to maximize a given objective function. This
work demonstrates a significant reduction of signaling overhead in the backhaul
due to data sharing between cooperating base stations, while achieving a high
fraction of the full coordination performance.
6.3.4
Figure 6.12 Cooperative transmission and CSI/CQI feedback and exchange concept,
as illustrated for a toy scenario with M = K = 2.
erates with a small subset of BSs in its direct vicinity (Fig. 6.10) are reported
in [ZSK+ 06, JTS+ 08b, NEHA08, PHG09, ZMS+ 09a, TWH+ 09].¹ Section 13.3
reports on a first real-time implementation of downlink CoMP demonstrating its feasibility, and [JTW+ 09, JFJ+ 10] summarize this work. Terminals are
assumed to estimate the multi-cell CSI in the downlink using CSI reference signals (CSI RSs). Subsequently, UEs deliver CSI feedback in combination with
channel quality indicator (CQI) values to their serving BS, as illustrated in
Fig. 6.12. Next, BSs in the cluster exchange the CSI as well as scheduled user data
over a low-latency signaling network denoted as X2 interface [3GP10i]. Precoding
weights for the joint beamforming are determined at each BS. The relevant set of
weights is applied to the data signals and in this way, the transmitted waveforms
are obtained locally. Similar to the centralized approach, the desired signals sum
up constructively while the mutual interference inside the cluster is canceled. We
emphasize that under the assumption of low Doppler shift, i.e., for low mobility
or even static users, the backhaul bandwidth required for sharing the user data
between cooperating BSs is much higher than the one required for updating the
channel estimates within the cluster, as discussed in Section 12.2. Let us assume
an average throughput per cell denoted as rate; hence, each BS has to receive
the scheduled user data for its own UEs according to that data rate. Further,
we consider that hybrid automatic repeat request (HARQ) processes for each
user in the active set of users $\mathcal{M}_k$ are running decentralized at each BS k. Thus,
each BS has to perform the channel coding with a given code rate, according
to the CQI feedback provided by the users in $\mathcal{M}_k$. For simplicity, we fix this
rate to 1/2 in the sequel. According to (6.54), all remaining K − 1 BSs in the
cluster $\mathcal{K}$ convey their coded user data over the backhaul to the k-th BS. Thus,
the backhaul overhead scales linearly with the number of BSs exchanging their
scheduled data in the cluster:
$$\text{traffic} = \text{rate} \cdot \left(1 + \frac{K - 1}{\text{code rate}}\right). \tag{6.54}$$
1 Note, the length of the cyclic prefix (CP) limits the tolerable backhaul latency in the centralized approach. For distributed downlink CoMP, latency is more related to the ongoing
aging process of the CSI while it is exchanged over the backhaul. A few ms may be tolerated for slowly moving UEs. Hence, capacity and latency requirements for the backhaul are
significantly relaxed compared to the centralized approach.
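To get a feeling for how (6.54) scales with the cluster size, the formula can be evaluated directly. The function name and the example rate of 100 Mbit/s are illustrative assumptions; the default code rate of 1/2 follows the simplification made in the text.

```python
def backhaul_traffic(rate: float, num_bs: int, code_rate: float = 0.5) -> float:
    """Backhaul traffic per BS according to (6.54).

    rate      : average throughput per cell (any unit, e.g. Mbit/s)
    num_bs    : number K of BSs in the cluster
    code_rate : channel code rate applied before the data is exchanged
    """
    return rate * (1.0 + (num_bs - 1) / code_rate)


# Traffic per BS for cluster sizes 1 to 5 at an assumed 100 Mbit/s per cell.
traffic_by_cluster_size = {k: backhaul_traffic(100.0, k) for k in range(1, 6)}
```

With a code rate of 1/2, every additional cooperating BS adds twice the cell rate to the traffic each BS has to receive. A cluster of three BSs at 100 Mbit/s per cell therefore needs 500 Mbit/s per BS.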
The process is split into three phases:
Phase I: Channel feedback.
Each user performs a cluster-wide channel estimation using reference signals
(see Section 9.1). Each UE generates multiple-input single-output (MISO)-CSI
according to [BH07a, TBH08, TWH+ 09]
hck = ( ck )H Hck ,
(6.55)
where the Euclidean norm equals " ck "2 = 1. Besides, ck is always used to
denote the linear combining scheme to generate CSI MISO feedback. In Section 2, we assume the combining metrics defined in (6.49) and (6.50) and
denote them as eigenmode-aware receive combining (ERC) and eigenmode-aware optimum combiner (EOC). This channel information is fed back in
conjunction with the expected post-equalization SINR
(I)
SINRk =
, kc |2
| (hck )H w
7
8 c,
c H
( k ) zk zH
k k
(6.56)
$
, kc = p,k hck /"hck "2 and
where each user assumes a precoder according to w
no intra-cluster interference, since this interference will be removed by the
joint precoder. In particular, the achievable SINR (6.56) together with the
CSI (6.55) is then conveyed to the serving BS.
Phase II: Distributed precoder calculation.
A scheduling instance in the cluster c combines a total number of
Mc Nbs = K Nbs MISO channels to a compound MIMO channel matrix². In
the following, each BS is responsible for a specific sub-band of the overall
bandwidth where CoMP JT is employed. Therefore, BSs partially exchange
their collected CSI and combine the channel feedback to a compound
virtual MIMO channel matrix of size Mc Nbs × K according to (6.51).
Subsequently, each BS determines the linear precoder for its specified
sub-bands but for all Mc Nbs antennas of the cluster according to (6.52).
2
With proper user selection, the full rank condition of the compound channel can be frequently
met in the multi-point-to-multi-point case with independent links [ZD04, JJT+ 09].
[Figure 6.13 plot. Horizontal axis: cluster size, i.e. number of cells involved in JT CoMP. Curves: LTE 1x1, round robin; LTE 2x2, round robin; LTE 2x2, score-based; CoMP MET, LTE map.; CoMP MET, Shan. map.; CoMP EOC, Shan. map.]
Figure 6.13 Performance results as a function of the cluster size Mc . Channel feedback
and assume 630 bit codeword length and 1% target block error rate (BLER). In
addition, some results are provided based on Shannon information rates.
Performance of Reference Cases
For the scheduling in one cell, UEs provide feedback on their SINRs in the form
of so-called CQI values for subgroups of sub-carriers denoted as PRBs. These
CQIs correspond to a specific spatial transmission mode, which is indicated by
the precoding matrix indicator (PMI). As a first extension towards multi-cell
processing, adjacent base stations are synchronized and multi-cell demodulation
reference signals (DRS) are introduced. They enable interference-aware equalization at the UE and improve the SINR estimation accuracy, leading to a more
precise link adaptation at the BS side [TSWJ09].
For reference purposes, we include the performance results for interference-limited single-input single-output (SISO) as well as MIMO 2 × 2 transmission
from Section 5.1. For Nbs = 2, two active fixed discrete Fourier transform (DFT)-based beams are sent to K = 2 different users in a round-robin manner or taking
CQI feedback into account. The CQI-aware score-based solution, described in
Section 5.1, outperforms both other reference cases, with relative throughput
gains at Mc = 1 of 1.27 and 2.2 compared to round-robin and SISO,
respectively. Note that, with K = Nbs, the MIMO setup benefits from an additional
user in conjunction with an increase of antennas, Nbs = Nue = 2. All results in
Fig. 6.13 are based on an equal per-beam power constraint with a per-antenna
power constraint according to LTE assumptions.
6.3.5 Summary
In this section, we investigated centralized joint transmission in the context of
CoMP transmission in the downlink of next generation mobile networks. Starting
from a general system model, we first determined the MIMO broadcast capacity in a cellular system for different SNR regimes. Second, we introduced the
concept of linear joint precoding for a subset of BSs, i.e. a cluster, in the system, whereas BSs belonging to different clusters are not coordinated. Hence,
each cluster is surrounded by multiple non-coordinated cells. For removing the
interference inside the cluster, the common multi-user eigenmode transmission
has been further developed towards optimum combining. The gains from receive
antenna combining have been included in the overall optimization. The performance has been studied in detail in a triple-sectored multi-cell scenario covering
57 cells. At first, we observed that median data rates per cell can be increased
by 81%, 112% and 157% assuming a cluster size of 3, 5 and 10 cells, respectively,
compared to a non-cooperative system with the same 2 × 2 antenna configuration. However, backhaul requirements per feeder link increase as well, i.e. to 5,
9 and 19 bits per bit-on-air-interface assuming a fixed code rate of 1/2, for the
coordination of 3, 5 and 10 cells, respectively. Second, as a function of the cluster size ranging from 1 to 5 and up to 10, the linear eigenmode-aware optimum
combiner scheme achieves 28%, 34%, 41%, 46%, 49% up to 62% of the capacities
provided by system-wide dirty paper coding. Altogether, significant gains from
coordination have already been realized by using small clusters.
Acknowledgements
The authors are grateful for financial support from the German Ministry of
Education and Research (BMBF) in the national collaborative project EASY-C
under contract No. 01BU0631.
6.4
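[Figure 6.14 diagram: the network delivers the data bits of all UEs to BS 1 and BS 2; each BS holds its own quantized estimates of the K user channels, with a (possibly) partial CSI exchange between the BSs; BS 1 and BS 2 transmit s1 and s2, and UE 1 and UE 2 receive y1 and y2 over channels h1 and h2.]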
Figure 6.14 Setup for decentralized beamforming with limited CSIT, for a toy example
with M = K = 2.
of shared CSI is limited [ZG10b]. In the second, the CSI is shared ideally across
the cooperating cells, while the user data is only partially shared to lift the burden
off the backhaul [ZG10a]. In the latter case, we use the notion of superposition
coding, as already introduced in an uplink context in Section 4.3, assuming now
that each terminal receives a superposition of conventionally and cooperatively
transmitted signals. We will see that the optimal ratio of these signals varies with
both the interference strength statistics and the backhaul capacity constraint.
6.4.1
Each BS having precise knowledge about a different subset of the user channels
(e.g. a BS may not wish or be able to decode the CSI feedback from a very
distant user).
Note that this is the essential difference to Section 6.3, where the same extent of
CSI is assumed to be fully distributed among all cooperating BSs. In this section,
we introduce a feedback model using the concept of hierarchical codebooks, which
allows us to incorporate additional structure into this problem and as a result
facilitate robust beamforming design.
Despite possible differences in their acquired CSIT, the different transmitters wish to conciliate their views so as to design a consistent set of precoding
vectors that maximizes the user rate. This problem can be categorized as a
so-called team-decision problem or a decentralized statistical decision making
problem [Ho80, Rad62].
System Model
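Consider a set of M BSs communicating with K UEs. Each BS has $N_\text{bs} \geq 1$
antennas, whereas each UE has a single antenna. $\mathbf{h}_k^m \in \mathbb{C}^{N_\text{bs} \times 1}$ is the channel
from BS m to UE k and $\mathbf{h}_k = [(\mathbf{h}_k^1)^T, \ldots, (\mathbf{h}_k^M)^T]^T \in \mathbb{C}^{N_\text{BS} \times 1}$ is UE k's whole
channel: $\mathbf{h}_k^m \sim \mathcal{N}_\mathbb{C}(\mathbf{0}, \sigma_{k,m}^2 \mathbf{I}_{[N_\text{bs}]})$ and different $\mathbf{h}_k^m$ are independent of each other.
The overall unquantized channel matrix H groups the channels to all users,
i.e. $\mathbf{H} = [\mathbf{h}_1 \ldots \mathbf{h}_K]$. Similarly, we let $\hat{\mathbf{h}}_k^{(m)}$ denote UE k's quantized channel as
perceived by BS m and group the whole of BS m's channel knowledge into
$\hat{\mathbf{H}}^{(m)} = [\hat{\mathbf{h}}_1^{(m)} \ldots \hat{\mathbf{h}}_K^{(m)}]$. The signal received by UE k is given by:
\[ y_k = \mathbf{h}_k^H \mathbf{s} + n_k, \tag{6.57} \]
with the transmit signal
\[ \mathbf{s} = \sum_{k=1}^{K} \mathbf{w}_k x_k, \tag{6.58} \]
where $\mathbf{x} \in \mathbb{C}^{K \times 1}$ is the vector of transmit symbols, its entries being independent
and with $\mathbf{x} \sim \mathcal{N}_\mathbb{C}(\mathbf{0}, \mathbf{I})$. The overall beamforming matrix W groups the precoding vectors $\mathbf{w}_k$ carrying the different users' symbols, so that $\mathbf{W} = [\mathbf{w}_1 \ldots \mathbf{w}_K] \in \mathbb{C}^{N_\text{BS} \times K}$,
where precoding vector $\mathbf{w}_k = [(\mathbf{w}_k^1)^T \ldots (\mathbf{w}_k^M)^T]^T$ carries user k's symbols, and $\mathbf{w}_k^m \in \mathbb{C}^{N_\text{bs} \times 1}$ is BS m's precoding contribution towards UE k. The rate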
achievable for user k is equal to
Rk = log2 (1 + k ),
(6.59)
[Figure 6.15 diagram: user k's channel h_k is passed through the quantization functions of the different levels, from the coarsest level 0 up to the maximum level, yielding one quantized channel per level.]
Figure 6.15 Distributed hierarchical CSI model: the quantization codebooks are
designed to be hierarchical to offer additional structure. $Q_k^l(\cdot)$ denotes the l-level
(specifying the accuracy) quantization function of user k's channel.
(6.60)
j=k m=1
K
!
(1) , . . . , WM H
(M)
E U k hk , W 1 H
.
(6.61)
k=1
m (H) =
(m) H
Wm H
W
that here all transmitters agree on maximizing a common utility (despite their
lack of shared CSI knowledge), as opposed to optimizing a selfish utility.
Person-by-Person Optimization
Person-by-person optimal strategies3 are such that for each team member, his
strategy is optimal given the other team members' strategies. Clearly, the globally optimal strategies are person-by-person optimal, but the converse is in general not true. In our particular setup of distributed CSIT, an optimal strategy
for transmitter m, given that the other transmitters' strategies are fixed, may
be characterized, for a local channel knowledge equal to $\hat{\mathbf{H}}^{(m)}$, as follows:
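\[ \mathbf{W}_m\big(\hat{\mathbf{H}}^{(m)}\big) = \arg\max_{\|\mathbf{W}_m\|_F^2 \leq P} \int \cdots \int d\mathbf{H} \, f_{\mathbf{H}|\hat{\mathbf{H}}^{(m)}}(\mathbf{H}) \, \bar{U}(\mathbf{H}, \mathbf{W}_m), \tag{6.63} \]
where
\[ \bar{U}(\mathbf{H}, \mathbf{W}_m) = U\big(\mathbf{H}, \tilde{\mathbf{W}}_1(\mathbf{H}), \ldots, \mathbf{W}_m, \ldots, \tilde{\mathbf{W}}_M(\mathbf{H})\big), \tag{6.64} \]
and $f_{\mathbf{H}|\hat{\mathbf{H}}^{(m)}}$ denotes the probability distribution function of the overall channel
matrix H conditioned on $\hat{\mathbf{H}}^{(m)}$, the quantized overall channel matrix at transmitter m. This is equivalent to
\[ \mathbf{W}_m\big(\hat{\mathbf{H}}^{(m)}\big) = \arg\max_{\|\mathbf{W}_m\|_F^2 \leq P} \int \cdots \int_{R(\hat{\mathbf{H}}^{(m)})} d\mathbf{H} \, f_{\mathbf{H}}(\mathbf{H}) \, \bar{U}(\mathbf{H}, \mathbf{W}_m), \tag{6.65} \]
where we define $R(\hat{\mathbf{H}}^{(m)})$ as the Voronoi region corresponding to this state of
knowledge at transmitter m. The Voronoi region indicates the set of all possible
values for the actual channel H given that channel estimate $\hat{\mathbf{H}}^{(m)}$ has been
observed at transmitter m.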
A Decentralized Beamforming Example for M = K = 2
To simplify the presentation of the solution to the problem, we focus on the
M = K = 2 case. The hierarchy in the knowledge at the two transmitters, and
as a result the beamforming strategies to follow, fall into one of three cases,
which may be characterized as follows:
Common Knowledge
In this case, L1 (1) = L1 (2) and L2 (1) = L2 (2). It corresponds to the traditional
assumption under limited CSIT, where both transmitters have the same knowledge. This is the case if the BSs mutually exchange their CSIT, as considered
in Sections 6.3 and 13.3, or if the CSI feedback is designed such that it can be
3
Note that in game theory, Nash equilibria, which correspond to strategies from which no user
has any incentive to deviate, are also person-by-person optimal, but there users in general
do not share a common objective and are often competing for resources.
decoded by both BSs individually, which is the approach pursued in Section 13.4.
Considering a hierarchical CSI structure, the assumption of common knowledge
can be regarded as reasonable if both users are located at the cell-edge. In terms of
performance, having the same global CSI available at each BS is equivalent to
having centralized beamforming decisions being made.
Degraded Knowledge
In this case, L1(1) ≥ L1(2) and L2(1) ≥ L2(2), or L1(1) ≤ L1(2) and L2(1) ≤
L2(2). In other words, one of the transmitters has a better representation of
both channels, and will adapt its beamforming on a finer scale than the other
transmitter. This is typical, for example, when both users lie in the same cell.
Symmetric Knowledge
Here, L1 (1) > L1 (2) and L2 (1) < L2 (2), or L1 (1) < L1 (2) and L2 (1) > L2 (2).
Hence, one of the transmitters has a better representation of the channel of a
given user, and a worse one for the other, with opposite knowledge at the other
transmitter. This corresponds, for instance, to the BSs serving users each within
their own cell.
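The three knowledge cases can be told apart from the four bit allocations alone. The following is a minimal sketch for M = K = 2; the function name `knowledge_case` and the tuple layout are illustrative assumptions, not notation from the text.

```python
def knowledge_case(l1: tuple[int, int], l2: tuple[int, int]) -> str:
    """Classify the CSIT hierarchy for M = K = 2.

    l1 = (L1(1), L1(2)): bits describing user 1's channel at BS 1 and BS 2.
    l2 = (L2(1), L2(2)): bits describing user 2's channel at BS 1 and BS 2.
    """
    d1 = l1[0] - l1[1]  # > 0: BS 1 knows user 1's channel better than BS 2
    d2 = l2[0] - l2[1]  # > 0: BS 1 knows user 2's channel better than BS 2
    if d1 == 0 and d2 == 0:
        return "common"
    if d1 * d2 < 0:
        # Each BS is better informed about a different user.
        return "symmetric"
    # One BS is at least as well informed about both users.
    return "degraded"


if __name__ == "__main__":
    print(knowledge_case((6, 2), (2, 6)))  # symmetric
```

The allocation L1(1) = L2(2) = 6, L1(2) = L2(1) = 2 falls into the symmetric case.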
We now focus on the symmetric case where L1(1) > L1(2) and L2(1) < L2(2):
this represents the most common setup among the ones described and is also the
most challenging to formulate; the remaining cases can be dealt with in a similar
manner. We characterize each user's quantized CSI by a pair i1 = (i1,2, i1,1) for
user 1, and another i2 = (i2,1, i2,2) for user 2. The first index in each pair corresponds to the coarse knowledge (hence is shared by both transmitters), i.e. the index
of the codeword in the coarsest codebook, to which the channel is quantized,
$Q_k^{L^{-1}(\min_m L_k(m))}(\mathbf{h}_k)$ (see Fig. 6.15), and the second index provides the missing
bits to locate the finer codeword around the coarsest one, $Q_k^{L^{-1}(\max_m L_k(m))}(\mathbf{h}_k)$.
Given the structure of the distributed CSI, the beamforming matrix decisions may be parameterized in terms of these indices, so that W1 varies with
(i1, i2,1), whereas W2 is a function of (i1,2, i2). Taking this into consideration,
we expand (6.62) to
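\[ \sum_{i_{1,2}=1}^{2^{L_1(2)}} \sum_{i_{2,1}=1}^{2^{L_2(1)}} S(i_{1,2}, i_{2,1}), \tag{6.66} \]
with
\[ S(i_{1,2}, i_{2,1}) = \sum_{i_{1,1}=1}^{I_1} \sum_{i_{2,2}=1}^{I_2} \int_{R_1(i_1)} \int_{R_2(i_2)} d\mathbf{h}_1 \, d\mathbf{h}_2 \, f_{\mathbf{H}}(\mathbf{H}) \, U\big(\mathbf{H}, \mathbf{W}_1(i_1, i_{2,1}), \mathbf{W}_2(i_{1,2}, i_2)\big), \tag{6.67} \]
where $I_1 = 2^{L_1(1) - L_1(2)}$, $I_2 = 2^{L_2(2) - L_2(1)}$, and $R_1(i_1)$ and $R_2(i_2)$ correspond to the
Voronoi regions associated with the indexed codewords.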
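To make the index bookkeeping concrete, the following minimal sketch computes the number of coarse index pairs and the refinement counts I1 and I2 for the symmetric case. The function name `index_counts` and the dictionary keys are illustrative assumptions.

```python
def index_counts(l1_bs1: int, l1_bs2: int, l2_bs1: int, l2_bs2: int) -> dict:
    """Index ranges for the symmetric case L1(1) > L1(2), L2(1) < L2(2)."""
    if not (l1_bs1 > l1_bs2 and l2_bs1 < l2_bs2):
        raise ValueError("bit allocation is not the symmetric case")
    coarse_1 = 2 ** l1_bs2           # range of the shared coarse index i_{1,2}
    coarse_2 = 2 ** l2_bs1           # range of the shared coarse index i_{2,1}
    i1 = 2 ** (l1_bs1 - l1_bs2)      # I1: refinements known only at BS 1
    i2 = 2 ** (l2_bs2 - l2_bs1)      # I2: refinements known only at BS 2
    return {
        "coarse_pairs": coarse_1 * coarse_2,   # number of separate S terms
        "I1": i1,
        "I2": i2,
        "regions_per_term": i1 * i2,           # Voronoi region pairs per S term
    }


if __name__ == "__main__":
    print(index_counts(6, 2, 2, 6))
```

With L1(1) = L2(2) = 6 and L1(2) = L2(1) = 2 bits, there are 16 coarse index pairs, and each of them covers 16 × 16 refined region pairs.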
It is easy to verify that the beamforming decisions for each S (i1,2 , i2,1 ) term
may be optimized separately. For given i1,2 and i2,1 , we optimize the corresponding S (i1,2 , i2,1 ). To simplify notation, we remove the dependence on i1,2 and i2,1
from the expressions. The problem is thus:
<
I2 <
I1
max.
dh1 dh2
i1,1 =1 i2,2 =1
7
s.t.
R1 (i1,1 )
R2 (i2,2 )
8
(6.68)
P, i1,1 = 1, . . . , I1
(6.69)
P, i2,2 = 1, . . . , I2 .
(6.70)
Recalling the separable nature of our utility function (refer to (6.61)), this
can be reformulated as:
<
I1
I2
2
7
8
Pr Rk (ik,
dhk
max.
)
k
Rk (ik,k )
8
(6.71)
<
Rk
(ik,
k
)
(6.72)
<
=
Rk (ik,k )
dhH
k
log2
(i
where Ck k,k
'
log2 1 +
2
|hH
k wk (i1,1 , i2,2 ) |
H
2 + |hk wk (i1,1 , i2,2 ) |2
(i
1+
*
4
,
(6.73)
(i
)
2 + wk (i1,1 , i2,2 )H Ck k,k wk (i1,1 , i2,2 )
!
= E hk hH
k hk Rk (ik,k ) , and wk (i1,1 , i2,2 ), k = 1, 2 is obtained
[Figure 6.16 plot: sum-rate [bit/s/Hz] versus SNR [dB]. Curves: fully-shared CSIT (upper bound), joint BF; symm. knowledge, proposed decentr. BF; symm. knowledge, myopic BF; symm. knowledge, coarse CSIT used, joint BF; knowledge at BS 1 or 2 shared, joint BF.]
Figure 6.16 Sum-rates for L1(2) = L2(1) = 2, L1(1) = L2(2) = 6 bits and an interference-link variance of 0.1,
from [ZG10b]. © 2010 IEEE.
example. The quality of this approximation increases and becomes asymptotically optimal with the size of the codebook.
Reference Schemes
Simple upper and lower bounds to the proposed schemes correspond to joint
beamforming based on the more accurate CSIT (unachievable in a distributed
system) and the least accurate (achievable) CSIT, respectively. In another decentralized scheme which uses the local channel knowledge, each BS designs its
transmission assuming all the other BSs share the same knowledge as itself. This
is simpler than the proposed decentralized scheme, and has similar complexity
to joint beamforming design based on the coarse CSIT.
Numerical Results
To show the gains from this decentralized scheme, we plot average rates achieved
for a symmetric M = K = 2, Nbs = 1 channel, where h1,1, h2,2 are NC(0, 1),
and h1,2, h2,1 are zero-mean complex Gaussian with a variance that models the strength of the interference links.
The hierarchical codebooks are designed using Lloyd's algorithm: first the coarse
codebook, then for each codeword in it, the corresponding finer codebook.
Fig. 6.16 compares the proposed decentralized scheme to the upper and lower
bounds stated before for L1(2) = L2(1) = 2 and L1(1) = L2(2) = 6. We label
the scheme which attempts to use local channel knowledge as if it were shared
"myopic BF", in the sense that each BS ignores some of the information it could
be using. Thus, the upper bound scheme would require 2(L1(1) + L2(2)) = 24
bits of CSIT being shared, whereas the schemes based on distributed, symmetric
CSIT would require L1(1) + L2(2) + L1(2) + L2(1) = 16 bits. The benefit of the
second layer of CSI over the coarser shared representation of the channel
depends on the signal-to-noise ratio (SNR) and on the interference-link variance. At low SNR and
for weak interference links, there is little use for the extra information. The performance of myopic
BF, even though it relies on more information than the joint beamforming relying
on coarse CSI, is significantly worse, highlighting the importance of coordinated
action. For reference, we also plot the performance that would be obtained if the
knowledge at transmitter i, i = 1, 2, were indeed common to both transmitters
and joint beamforming resulted; clearly this yields more gain than joint
beamforming based on coarse CSI.
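The two-stage codebook design described above (first the coarse codebook, then one finer codebook per coarse codeword) can be sketched as follows for scalar channels (Nbs = 1). This is a minimal illustration under stated assumptions: codewords are trained on random channel samples with a Euclidean distortion measure, and the function names `lloyd` and `hierarchical_codebooks`, the iteration count and the sample size are illustrative choices rather than the setup used for Fig. 6.16.

```python
import numpy as np


def lloyd(samples, num_codewords, iters=30, rng=None):
    """Plain Lloyd iteration on complex scalars with Euclidean distortion."""
    rng = np.random.default_rng(0) if rng is None else rng
    if samples.size < num_codewords:
        raise ValueError("need at least as many samples as codewords")
    # Initialise the codebook with distinct training samples.
    codebook = rng.choice(samples, size=num_codewords, replace=False)
    for _ in range(iters):
        # Nearest-codeword assignment (Voronoi regions).
        nearest = np.argmin(np.abs(samples[:, None] - codebook[None, :]), axis=1)
        # Centroid update; empty cells keep their previous codeword.
        for c in range(num_codewords):
            members = samples[nearest == c]
            if members.size > 0:
                codebook[c] = members.mean()
    nearest = np.argmin(np.abs(samples[:, None] - codebook[None, :]), axis=1)
    return codebook, nearest


def hierarchical_codebooks(samples, coarse_bits, fine_bits, rng=None):
    """Coarse codebook first, then one refinement codebook per coarse cell."""
    rng = np.random.default_rng(0) if rng is None else rng
    coarse, cell = lloyd(samples, 2 ** coarse_bits, rng=rng)
    extra = 2 ** (fine_bits - coarse_bits)  # refinements per coarse codeword
    fine = [lloyd(samples[cell == c], extra, rng=rng)[0]
            for c in range(coarse.size)]
    return coarse, fine


if __name__ == "__main__":
    gen = np.random.default_rng(1)
    h = (gen.standard_normal(4000) + 1j * gen.standard_normal(4000)) / np.sqrt(2)
    coarse, fine = hierarchical_codebooks(h, coarse_bits=2, fine_bits=6)
    print(coarse.size, [f.size for f in fine])
```

The coarse index identifies the cell that both transmitters agree on, while the index within the corresponding fine codebook carries the extra bits available to only one of them.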
6.4.2
[Figure 6.17 diagram: the network feeds BS 1 and BS 2 over backhaul links of capacity C1 and C2; both BSs have global CSI and transmit s1 and s2; UE 1 and UE 2 receive y1 and y2.]
Figure 6.17 Setup for multi-cell beamforming with limited data sharing, for a toy
setup with M = K = 2.
Thus user k's message rate rk is split into common and private rates, rk,c and
rk,p, respectively:
\[ r_k = r_{k,p} + r_{k,c}. \tag{6.74} \]
In the sequel, we assume full CSIT at both BSs, since we focus on the cost
of sharing data, and denote as $\bar{k}$ the other BS or UE, depending on the context.
Proposed Backhaul Usage
Backhaul link k with finite capacity Ck serves to carry both private and common
messages for user k as well as the common messages for user $\bar{k}$, so that:
\[ C_k \geq r_{k,p} + r_{k,c} + r_{\bar{k},c} = r_k + r_{\bar{k}} - r_{\bar{k},p}, \quad k = 1, 2. \tag{6.75} \]
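The backhaul constraint (6.75) is easy to check for a candidate rate tuple. The following minimal sketch does so for the two-cell case; the function names and the zero-based indexing of users and links are illustrative assumptions.

```python
def backhaul_load(r, r_p):
    """Load on each backhaul link according to (6.75).

    r   = (r1, r2):     total message rates of the two users
    r_p = (r1p, r2p):   private parts of these rates
    Link k carries all of user k's message plus the common part of the
    other user's message.
    """
    loads = []
    for k in (0, 1):
        other = 1 - k
        common_other = r[other] - r_p[other]  # common rate of the other user
        loads.append(r[k] + common_other)
    return tuple(loads)


def backhaul_feasible(r, r_p, capacity, tol=1e-12):
    """True if both links respect their capacities C1 and C2."""
    return all(load <= c + tol for load, c in zip(backhaul_load(r, r_p), capacity))


if __name__ == "__main__":
    print(backhaul_load((2.0, 1.5), (0.5, 1.0)))  # (2.5, 3.0)
```

Setting the private rates equal to the total rates reproduces the case without message sharing, where link k only carries rk.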
(6.77)
(6.78)
k H k 2
w
h
k
k,p
rk,p log2 1 +
,
2
k
rk = rk,p + rk,c
2
k H k 2 H
w
+
h
w
h
k,c
k
k,p
k
log2 1 +
.
k2
(6.79)
Proof. This follows from results obtained for the two-user MAC with a common
message by [SW73b].
Particular Cases
The transmission scheme introduced here covers the two particular cases of:
no message sharing (IC), obtained by forcing rk,p = rk, k = 1, 2, and
full message sharing (BC), obtained by forcing rk,p = 0, k = 1, 2.
Achievable Rate Region
An achievable rate region R is the set of (r1, r1,p, r2, r2,p), as specified above, that
satisfies the specified backhaul and power constraints. One way to get its boundary is to use the rate profile notion from [MZC06]. Points along the boundary
are thus obtained by solving the following optimization problem for α ∈ [0, 1];
α thus specifies how the sum-rate achieved, r, is split between the two users.
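\begin{align}
\max. \quad & r \nonumber \\
\text{s.t.} \quad & r_1 \geq \alpha r, \quad r_2 \geq (1 - \alpha) r, \nonumber \\
& r_1 + r_2 - r_{2,p} \leq C_1, \quad r_1 + r_2 - r_{1,p} \leq C_2, \nonumber \\
& r_k \leq \log_2 \left( 1 + \frac{\left| (\mathbf{h}_k^k)^H \mathbf{w}_{k,p}^k \right|^2 + \left| \mathbf{h}_k^H \mathbf{w}_{k,c} \right|^2}{\sigma^2 + \left| (\mathbf{h}_k^{\bar{k}})^H \mathbf{w}_{\bar{k},p}^{\bar{k}} \right|^2 + \left| \mathbf{h}_k^H \mathbf{w}_{\bar{k},c} \right|^2} \right), \quad k = 1, 2, \tag{6.80} \\
& r_{k,p} \leq \log_2 \left( 1 + \frac{\left| (\mathbf{h}_k^k)^H \mathbf{w}_{k,p}^k \right|^2}{\sigma^2 + \left| (\mathbf{h}_k^{\bar{k}})^H \mathbf{w}_{\bar{k},p}^{\bar{k}} \right|^2 + \left| \mathbf{h}_k^H \mathbf{w}_{\bar{k},c} \right|^2} \right), \quad k = 1, 2, \tag{6.81} \\
& \|\mathbf{w}_{k,p}^k\|^2 + \|\mathbf{w}_{k,c}^k\|^2 + \|\mathbf{w}_{\bar{k},c}^k\|^2 \leq P, \quad k = 1, 2. \tag{6.82}
\end{align}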
The above optimization may be solved using a bisection over the sum-rate r:
an essential part of the solution consists of establishing feasibility of a given rate.
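The bisection over the sum-rate can be sketched as follows. This is a minimal illustration that assumes feasibility is monotone in r (if r is feasible, so is any smaller value) and that r = 0 is feasible. The feasibility test is passed in as a callback: in the scheme described here it would be the power minimization discussed next, whereas the toy oracle in the usage example only applies the necessary conditions r1 ≤ C1 and r2 ≤ C2. The names `max_sum_rate`, `is_feasible` and `r_upper` are illustrative.

```python
from typing import Callable


def max_sum_rate(is_feasible: Callable[[float], bool],
                 r_upper: float,
                 tol: float = 1e-6) -> float:
    """Largest feasible sum-rate in [0, r_upper], found by bisection."""
    lo, hi = 0.0, r_upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_feasible(mid):
            lo = mid   # mid is achievable, search above it
        else:
            hi = mid   # mid is not achievable, search below it
    return lo


if __name__ == "__main__":
    alpha, c1, c2 = 0.5, 1.0, 2.0

    def toy_oracle(r: float) -> bool:
        # Backhaul-only check: r1 = alpha * r <= C1 and r2 = (1 - alpha) * r <= C2.
        return alpha * r <= c1 and (1.0 - alpha) * r <= c2

    print(round(max_sum_rate(toy_oracle, r_upper=10.0), 3))  # 2.0
```

Sweeping α over [0, 1] and repeating the bisection for each value traces out the boundary of the rate region.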
(6.83)
Thus the rate pair is only feasible if r1 ≤ C1, r2 ≤ C2, and the rate tuple
$(r_1, (r_{1,p})_{\min}, r_2, (r_{2,p})_{\min}) \in \mathbb{R}^4$ is achievable.
Feasibility of (r1 , r1,p , r2 , r2,p )
Assume r1, r2, r1,p and r2,p are fixed. Establishing their feasibility and obtaining
beamforming vectors to achieve them may be done by solving the total transmit
power minimization problem subject to constraints (6.80), (6.81) and (6.82):
feasibility of this problem implies feasibility of the set of rates, and its optimal
solution yields the most power-efficient beamforming strategies to attain it. This
problem can be formulated as:
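\begin{align*}
\min. \quad & \sum_{k=1}^{2} \left[ \|\mathbf{w}_{k,c}\|^2 + \|\mathbf{w}_{k,p}^k\|^2 \right] \\
\text{s.t.} \quad & 2^{r_k} - 1 \leq \frac{\left| (\mathbf{h}_k^k)^H \mathbf{w}_{k,p}^k \right|^2 + \left| \mathbf{h}_k^H \mathbf{w}_{k,c} \right|^2}{\sigma^2 + \left| (\mathbf{h}_k^{\bar{k}})^H \mathbf{w}_{\bar{k},p}^{\bar{k}} \right|^2 + \left| \mathbf{h}_k^H \mathbf{w}_{\bar{k},c} \right|^2}, \quad k = 1, 2, \\
& 2^{r_{k,p}} - 1 \leq \frac{\left| (\mathbf{h}_k^k)^H \mathbf{w}_{k,p}^k \right|^2}{\sigma^2 + \left| (\mathbf{h}_k^{\bar{k}})^H \mathbf{w}_{\bar{k},p}^{\bar{k}} \right|^2 + \left| \mathbf{h}_k^H \mathbf{w}_{\bar{k},c} \right|^2}, \quad k = 1, 2, \\
& \|\mathbf{w}_{k,c}^k\|^2 + \|\mathbf{w}_{\bar{k},c}^k\|^2 + \|\mathbf{w}_{k,p}^k\|^2 \leq P, \quad k = 1, 2.
\end{align*}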
Further noting that restricting $\mathbf{h}_k^H \mathbf{w}_{k,c}$ and $(\mathbf{h}_k^k)^H \mathbf{w}_{k,p}^k$ to be real does not restrict the
solution, we obtain the following equivalent convex problem:
min.
2
7
k
"wk,c "2 + "wk,p
"2
k=1
6
s.t.
2rk 2rk,p k H k
hk wk,p = hH
k wk,c , k = 1, 2
2rk,p 1
+(
)+
+
+ k H k
H
+
k
k
2rk,p 1 +
wk,p , k = 1, 2
h
w
h
w
+
k,c + hk
k
k
k,p
k
k
2
2
"wk,c
"2 + "wk,c
" + "wk,p " P, k = 1, 2.
Numerical Results
Figs. 6.18(a)-6.18(c) show the rate regions corresponding to the proposed scheme,
which we label hybrid IC/BC, the IC (rk,c = 0, k = 1, 2), the BC scheme (rk,p =
0, k = 1, 2), and the quantized use of backhaul for different values of backhaul
capacity (we let C1 = C2 = C), for a particular channel instance with Nbs = 1.
For low backhaul capacity, the rate regions corresponding to the hybrid scheme
and the IC almost overlap and are both larger than the BC region. As the
backhaul capacity increases, all three regions become larger (up to the point where
the system is no longer backhaul-constrained), the BC region becomes larger than
the IC region and closer to the hybrid scheme's region, until eventually these two
regions overlap. Moreover, depending on the strength of the interfering links and
on the backhaul constraints, one or the other scheme will be better.
Fig. 6.18(d) illustrates the average common to total rate ratio as a function of
C, when = 0.5, for a Rayleigh block-fading channel such that hkk NC (0, INbs )
and hkk NC (0, INbs ), when Nbs = 1: As for earlier simulation results, parameter controls the strength of the interference links. In general, the maximum
cannot be achieved without sharing some data. How much depends on .
[Figure 6.18 panels: rate regions plotted as R2 [bit/channel use] versus R1 [bit/channel use]; ratio rc/r versus C [bit/channel use], with curves for parameter values 0.1 and 0.5.]
6.4.3 Summary
In this section, decentralized downlink CoMP strategies suitable for reducing
the backhaul required for information exchange between cooperating BSs were
introduced. The backhaul related to channel state information exchange can be
mitigated by the use of transmit beamforming schemes which explicitly account
for the lack of CSI accuracy and the difference in CSI estimates at the involved
base stations. The backhaul related to user data exchange can be reduced by
limiting the exchange to only a fraction of the total traffic, and adjusting the
ratio of shared versus non-shared traffic with respect to key system parameters,
such as the number of antennas, the interference strength and the backhaul
capacity limits. Numerical evaluation showed the benefit of such approaches.
Part III
Challenges Connected
to CoMP
Clustering
Patrick Marsch, Stefan Brück, Andrea Garavaglia,
Matthias Schulist, Ralf Weber and Armin Dekorsy
(7.1)
where d is the distance between transmitter and receiver, and take into account
the impact of directive base station (BS) antennas with an azimuth-dependent
attenuation of
\[ A_L = \min\left\{ 12 \left( \frac{\theta}{70^\circ} \right)^2, 20 \right\} \ \text{[dB]} \tag{7.2} \]
(θ denoting the azimuth angle) and an antenna gain of 14 dBi. On the other hand, we observe a real-world
setup with M = 54 BSs, as it exists in downtown Dresden, Germany, and again
calculate pathlosses based on (7.1) and (7.2), but now also considering signal
reflection, diffraction and obstruction based on ray-tracing using a 3D-model of
the city. The two setups are illustrated in Fig. 7.1, where the pathloss to the best
serving cell is shown as a function of potential UE location. Clearly, the real-world setup has a larger average ISD than the hexagonal grid, but we use the
former as it corresponds to the setup of the test bed discussed in Sections 13.2,
13.4 and 13.5, and the latter as it corresponds to standard next generation mobile
networks (NGMN) simulation assumptions.
Figure 7.1 Setups considered in this chapter (pathloss to best serving cell in dB).
Before going into the details of static and dynamic clustering, let us introduce
the concept of ideal clustering as the case where each potential UE location is
served by exactly the set of cells to which it has the strongest links. This is clearly
infeasible in practice, as it will be unlikely to find other UEs that can be jointly
served through exactly the same set of optimal cells, and as this would involve
a substantial signalling overhead between BSs. However, this concept serves as
a good upper performance bound for any concrete clustering scheme usable in
practice. Assuming for simplification that we have only Nbs = 1 BS antenna per
cell, and that downlink joint signal processing CoMP is performed such that
interference between jointly transmitted streams is completely removed and the
maximum array gain is obtained (i.e. an idealistic assumption), we can state the
downlink signal-to-interference-and-noise ratio (SINR) obtained by a UE j on
an exemplary orthogonal frequency division multiplex (OFDM) sub-carrier if it
is served by a cluster of cells M' as
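\[ \text{SINR}_j^{\mathcal{M}'} = \frac{P \sum_{m \in \mathcal{M}'} \lambda_j^m}{\sigma^2 + P \sum_{m \in \{\mathcal{M} \setminus \mathcal{M}'\}} \lambda_j^m}, \tag{7.3} \]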
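The SINR under ideal clustering can be evaluated directly from the linear link gains of one UE location: the cluster consists of the strongest cells, and all remaining cells act as interferers. The following is a minimal sketch; the function name `ideal_cluster_sinr`, the unit transmit power and the unit noise power are illustrative assumptions.

```python
import numpy as np


def ideal_cluster_sinr(gains, cluster_size, tx_power=1.0, noise_power=1.0):
    """Linear SINR in the form of (7.3) when the strongest cells form the cluster.

    gains        -- linear link gains from all M cells to one UE location
    cluster_size -- number of cells in the ideal cluster
    """
    ordered = np.sort(np.asarray(gains, dtype=float))[::-1]  # strongest first
    signal = tx_power * ordered[:cluster_size].sum()
    interference = tx_power * ordered[cluster_size:].sum()
    return signal / (noise_power + interference)


if __name__ == "__main__":
    link_gains = [1.0, 4.0, 2.0, 1.0]
    for size in (1, 2, 4):
        print(size, ideal_cluster_sinr(link_gains, size))  # 0.8, 2.0, 8.0
```

Growing the cluster moves link gains from the interference term to the signal term, which is why the SINR increases with the cluster size.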
[Figure 7.2 plots: CDF versus geometry [dB] for 8 dB shadowing. Curves include no clustering and ideal clustering of 3 cells.]
Figure 7.2 SINRs achievable under perfect interference cancelation in ideal clusters of
different sizes.
7.1 Static Clustering Concepts
7.1.1 Non-Overlapping Clusters
Let us first consider the case where clusters may not overlap, i.e. where clusters
are disjoint w.r.t. the cells involved. The question is now how such a set of
clusters can be found in accordance with some performance metric at reasonable
complexity.
For this, we consider two different optimization criteria. On one hand, it can
be desirable to maximize the mean signal-to-interference-and-noise ratio (SINR)
that the points in J can achieve under a particular fixed clustering. On the other
hand, we can consider maximizing a certain outage measure, i.e. the number of
locations in J for which a certain minimum SINR can be achieved. For both
cases, let us assume that a set of C potential clusters has already been chosen
heuristically (for example based on a ranking of the most frequently desired
clusters for the locations in J ), and a matrix
\[ \mathbf{A} \in \{0, 1\}^{[M \times C]} \tag{7.4} \]
is given, where each non-zero element $a_{m,c}$ means that cell m is involved in
cluster c. We now state the SINR achievable at a location j if served by cluster c,
similarly to (7.3), as
\[ \text{SINR}_j^c = \frac{P \sum_{m \in \mathcal{M},\, a_{m,c} = 1} \lambda_j^m}{P \sum_{m \in \mathcal{M},\, a_{m,c} = 0} \lambda_j^m + \sigma^2} \tag{7.5} \]
The mean SINR that could be achieved over all locations in J for a particular
potential cluster c can then be stated as
\[ f_c = \frac{1}{|\mathcal{J}|} \sum_{j \in \mathcal{J}} \text{SINR}_j^c. \tag{7.6} \]
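Given a matrix of linear link gains and the cluster membership matrix A from (7.4), the per-cluster mean SINR of (7.6) reduces to two matrix products. The following is a minimal sketch; the function name `mean_sinr_per_cluster`, the gain matrix layout (locations by cells) and the unit power and noise values are illustrative assumptions.

```python
import numpy as np


def mean_sinr_per_cluster(gains, A, tx_power=1.0, noise_power=1.0):
    """Vector f with one mean linear SINR per candidate cluster, cf. (7.5)-(7.6).

    gains -- J x M array of linear link gains (location j, cell m)
    A     -- M x C binary membership matrix (cell m in cluster c)
    """
    gains = np.asarray(gains, dtype=float)
    A = np.asarray(A, dtype=float)
    signal = tx_power * (gains @ A)                # cells inside the cluster
    interference = tx_power * (gains @ (1.0 - A))  # cells outside the cluster
    sinr = signal / (interference + noise_power)   # J x C, as in (7.5)
    return sinr.mean(axis=0)                       # average over locations


if __name__ == "__main__":
    G = [[4.0, 1.0],
         [1.0, 4.0]]
    A = [[1, 0, 1],   # cell 0 is in clusters 0 and 2
         [0, 1, 1]]   # cell 1 is in clusters 1 and 2
    print(mean_sinr_per_cluster(G, A))  # approximately [1.1, 1.1, 5.0]
```

In this toy case the two-cell cluster removes all interference and therefore reaches the highest mean SINR.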
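\[ \max \ \mathbf{f}^T \mathbf{x} \tag{7.7} \]
\[ \text{s.t.} \ \mathbf{A} \mathbf{x} \leq \mathbf{1}_{[M \times 1]} \tag{7.8} \]
\[ \text{and} \ \mathbf{x} \in \{0, 1\}^{[C \times 1]} \ \text{binary}, \tag{7.9} \]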
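For a handful of candidate clusters, the binary program of choosing disjoint clusters that maximize the summed mean SINR can be solved by plain enumeration. The following minimal sketch does this; it is only meant to illustrate the structure of the problem, since the enumeration grows as 2^C and a dedicated integer programming solver would be needed for realistic sizes. The function name `best_disjoint_clusters` is an illustrative choice.

```python
from itertools import product

import numpy as np


def best_disjoint_clusters(f, A):
    """Maximize f^T x subject to A x <= 1 with binary x, by enumeration."""
    f = np.asarray(f, dtype=float)
    A = np.asarray(A, dtype=int)
    num_clusters = A.shape[1]
    best_x, best_value = None, -np.inf
    for bits in product((0, 1), repeat=num_clusters):
        x = np.array(bits)
        if np.all(A @ x <= 1):         # each cell is used by at most one cluster
            value = float(f @ x)
            if value > best_value:
                best_x, best_value = x, value
    return best_x, best_value


if __name__ == "__main__":
    A = [[1, 0, 1],
         [0, 1, 1]]
    x, value = best_disjoint_clusters([1.1, 1.1, 5.0], A)
    print(x, value)  # [0 0 1] 5.0
```

The all-zero selection is always feasible, so the function returns a valid choice for any input.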
(7.10)
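We then maximize the number of locations that can achieve the SINR target
through a particular clustering concept by re-stating the optimization problem
from (7.7) as
\[ \max \ \mathbf{1}^T \mathbf{z} \tag{7.11} \]
\[ \text{s.t.} \ \mathbf{B} \mathbf{x} \geq \mathbf{z}, \tag{7.12} \]
\[ \text{and} \ \mathbf{A} \mathbf{x} \leq \mathbf{1} \tag{7.13} \]
\[ \text{and} \ \mathbf{x} \in \{0, 1\}^{[C \times 1]}, \ \mathbf{z} \in \{0, 1\}^{[J \times 1]} \ \text{binary}, \tag{7.14} \]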
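\[ \text{s.t.} \ \begin{bmatrix} \mathbf{A} & \mathbf{0} \\ -\mathbf{B} & \mathbf{I}_{[J]} \end{bmatrix} \begin{bmatrix} \mathbf{x} \\ \mathbf{z} \end{bmatrix} \leq \begin{bmatrix} \mathbf{1}_{[M \times 1]} \\ \mathbf{0}_{[J \times 1]} \end{bmatrix} \tag{7.16} \]
\[ \text{and} \ \mathbf{x} \in \{0, 1\}^{[C \times 1]}, \ \mathbf{z} \in \{0, 1\}^{[J \times 1]} \ \text{binary}. \tag{7.17} \]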
[Figure 7.3 panels: maps over X-Distance [km] and Y-Distance [km], with the indices of the cells involved in each cluster.]
data based on ray-tracing, the clustering results are shown in Fig. 7.3. We can
here see the sets of locations that are assigned to particular clusters, where the
indices of the involved cells are given (except in cases where this is obvious). For
the hexagonal grid, the result is rather intuitive. In the case of a mean SINR
optimization, complete sites are declared as clusters, as users at sector borders
can benefit most from CoMP due to a large signal-to-noise ratio (SNR), while
for the outage optimization, cells of different sites are grouped to clusters, as
this can mainly increase the performance of the weak users. For real-world BS
locations, however, clusters may of course span co-located as well as distributed
cells. Clearly, the actual assignment of potential UE locations to the best serving
cluster here leads to a more scattered result due to shadowing, but we have here
averaged over these effects for illustration purposes. The performance obtainable
with these clustering schemes is shown in Fig. 7.5 and will be discussed later.
7.1.2 Overlapping Clusters
Clearly, the clustering schemes introduced in the last subsection inherit the problem that UEs at the border between clusters will always experience a low SINR.
Hence, one may consider using spatially overlapping clusters using different subsets of system resources [MF07a, Mar10]. This can be seen as a kind of fractional
frequency reuse, but employed in such a way that the overall reuse factor is 1.
Let us assume, for example, that the system resources are split into R = 3
equally-sized resource blocks (RBs). Each cell can then be involved in up to 3
clusters with different partnering cells. We can solve both the problem of finding
the optimal choice of overlapping clusters as well as the optimal assignment of
resources to clusters by a simple extension of the problem in (7.7) to
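\[ \max \ \begin{bmatrix} \mathbf{f}^T & \mathbf{f}^T & \mathbf{f}^T \end{bmatrix} \mathbf{x} \tag{7.18} \]
\[ \text{s.t.} \ \begin{bmatrix} \mathbf{A} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{A} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{A} \\ \mathbf{I}_{[C]} & \mathbf{I}_{[C]} & \mathbf{I}_{[C]} \end{bmatrix} \mathbf{x} \leq \mathbf{1}_{[RM + C \times 1]} \tag{7.19} \]
\[ \text{and} \ \mathbf{x} \in \{0, 1\}^{[RC \times 1]} \ \text{binary}, \tag{7.20} \]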
in the case that the mean SINR is to be optimized. We here simply observe R · C
virtual clusters, each connected to one of the three resource blocks. The constraint
in (7.19) assures that each cell is only involved in one cluster for each RB, and
that each potential cluster is only chosen on one RB. Equivalently, if an outage
measure is to be optimized, we can change (7.15) to
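\[ \max \ \begin{bmatrix} \mathbf{0}^T & \mathbf{1}^T \end{bmatrix} \begin{bmatrix} \mathbf{x} \\ \mathbf{z} \end{bmatrix} \tag{7.21} \]
\[ \text{s.t.} \ \begin{bmatrix} \mathbf{A} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{A} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{A} & \mathbf{0} \\ -\mathbf{B} & -\mathbf{B} & -\mathbf{B} & \mathbf{I}_{[J]} \end{bmatrix} \begin{bmatrix} \mathbf{x} \\ \mathbf{z} \end{bmatrix} \leq \begin{bmatrix} \mathbf{1}_{[RM \times 1]} \\ \mathbf{0}_{[J \times 1]} \end{bmatrix} \tag{7.22} \]
\[ \text{and} \ \mathbf{x} \in \{0, 1\}^{[RC \times 1]}, \ \mathbf{z} \in \{0, 1\}^{[J \times 1]} \ \text{binary}. \tag{7.23} \]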
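The block structure of the constraint matrix in (7.19), i.e. one copy of A per resource block plus a row of identity blocks, can be assembled mechanically. The following minimal sketch builds it with NumPy; the function name `overlapping_constraints` is an illustrative choice, and the variable ordering (all clusters of RB 1, then RB 2, and so on) is an assumption about how x is stacked.

```python
import numpy as np


def overlapping_constraints(A, num_rbs=3):
    """Left- and right-hand side of the constraint in (7.19).

    A       -- M x C binary membership matrix (cell m in cluster c)
    num_rbs -- number R of resource blocks
    Returns (lhs, rhs) such that lhs @ x <= rhs for a stacked binary x of
    length R * C.
    """
    A = np.asarray(A, dtype=int)
    num_cells, num_clusters = A.shape
    # Block-diagonal part: each cell in at most one cluster per RB.
    per_rb = np.kron(np.eye(num_rbs, dtype=int), A)
    # Identity blocks: each candidate cluster chosen on at most one RB.
    once = np.tile(np.eye(num_clusters, dtype=int), (1, num_rbs))
    lhs = np.vstack([per_rb, once])
    rhs = np.ones(num_rbs * num_cells + num_clusters, dtype=int)
    return lhs, rhs


if __name__ == "__main__":
    A = [[1, 0, 1],
         [0, 1, 1]]
    lhs, rhs = overlapping_constraints(A, num_rbs=3)
    x = np.array([0, 0, 1, 1, 1, 0, 0, 0, 0])  # cluster 2 on RB 1, clusters 0 and 1 on RB 2
    print(lhs.shape, bool(np.all(lhs @ x <= rhs)))  # (9, 9) True
```

The resulting pair can be handed to any binary linear programming solver together with the stacked objective of (7.18).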
The clustering results for the before-mentioned setups, based on overlapping
clusters, are shown in Fig. 7.4. For a hexagonal setup, the result is interestingly
almost the same, independent of whether mean SINR or outage is optimized.
We can see that both co-located cells and those belonging to 3 different sites are
grouped to clusters, and that each cell is now involved in exactly one intra-site
and two inter-site clusters. The resulting clustering approach is similar to those
proposed intuitively in [MF07a, Mar10]. For the real-world setup, the chosen
clustering differs strongly depending on the optimization criterion, and a cell
need not necessarily be involved in 3 clusters.
[Figure 7.4 panels: maps over X-Distance [km] and Y-Distance [km], with the indices of the cells involved in each cluster.]
Figure 7.4 Optimization results for overlapping clusters. The differently hatched areas
7.1.3 Resulting Geometries
Finally, Fig. 7.5 shows the geometries that can be achieved with the different
proposed static clustering strategies. As before, the upper two plots refer to the
case where the mean SINR is optimized, and the lower two to the case of outage
optimization. We generally consider fixed clusters, but calculate geometries based
on many shadow fading realizations with a standard deviation of 2 dB
or 8 dB, respectively, where the shadowing from one UE to multiple co-located
cells is assumed to be fully correlated, and that to arbitrary cells has a correlation
coefficient of 0.5. While a standard deviation of 2 dB is mostly used to model
channels for indoor UE positions, a value of 8 dB reflects outdoor locations. As
mentioned before, the case of no cooperation at all and ideal clusters of size 3
are considered as lower and upper bounds, respectively. In Plots 7.5(a) and 7.5(c),
reflecting a hexagonal setup, we can see that static, non-overlapping clusters can
help to improve either the strong users (mean SINR optimization) or the outage by
about 3-5 dB. Significantly improved geometries can be obtained for overlapping
clusters, where for the hexagonal setup there is no difference between mean SINR
[Figure 7.5: four CDF plots of the geometry [dB]; curves for no clust., non-overlap., overlap., and ideal. clust., each for 2dB shad. and 8dB shad.]
7.2 Self-Organizing Clustering Concepts
7.2.1 Self-Organizing Network Concepts in 3GPP LTE
Some functions were specified in Release 8, such as the automatic neighbor
relation (ANR) function, the physical cell identifier (PCI) selection function,
and self-configuration functionalities, while others were added in Release 9
and later refined. Among them are mobility robustness optimization and mobility
load balancing; see [3GP10e], [3GP10i] for the latest updates. The specification work
relates both to detailed interfaces within the network, which is mainly done
in the radio access network (RAN) working groups, and to details related
to the management of the system (OAM), which is mainly done in the SA5 group (see
www.3gpp.org). Both centralized and distributed functions are considered from
an architecture point of view, where the particular choice is made case by case,
based on trade-offs between the complexity of implementing the function and the
overall benefits for operations.
The approach followed in this work for CoMP clustering also considers
the possible impact on standardization, and how the functionality could be
integrated into future 3GPP specifications. Looking at existing functions,
it turns out that the ANR function, which collects radio quality information
to automatically create and adjust neighbor relation tables (NRTs) in the system,
provides a good framework for integrating a clustering algorithm
with limited complexity. In fact, CoMP clustering could be regarded from this
perspective as an extension of the information included in the neighbor
relations, which additionally considers which cells are suited to cooperate.
7.2.2 Adaptive Clustering Algorithms
[Figure 7.6: UEs (UE-1, UE-2, ..., UE-i) report cell sets to their serving cells (Cell-1, Cell-2, ..., Cell-k), e.g. M1 reported 10 times, M2 reported 5 times, Mc reported Nc times; the CCU computes the clusters]
such sets from different terminals over the observation period $T$ and computes
relevant statistical properties. In each serving cell, the reported information can
be summarized as a list of pairs $[\mathcal{M}_c, N_c]$, where $N_c \ge 1$ is the
number of times the set $\mathcal{M}_c$ has been reported by all UEs to the serving
cell during the period $T$. The idea behind $[\mathcal{M}_c, N_c]$ is that cell
combinations that have been observed very often offer a higher potential to improve
the system performance for several users when a CoMP scheme is adopted.
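The aggregation of UE reports into pairs $[\mathcal{M}_c, N_c]$ can be sketched in a few lines of Python. This is only an illustration: the function name `summarize_reports` and the example reports are hypothetical, and how the reported sets are formed from the measurements is not modeled here.

```python
from collections import Counter


def summarize_reports(reports):
    """Aggregate the cell sets reported by UEs during one period T.

    reports: iterable of iterables of cell ids, one entry per UE report.
    Returns a list of (M_c, N_c) pairs, most frequently reported set first.
    """
    # A frozenset is hashable and ignores the order in which cells are listed.
    counts = Counter(frozenset(cells) for cells in reports)
    return counts.most_common()


# Hypothetical reports collected by one serving cell (assumed values).
reports = [(1, 2, 3), (3, 2, 1), (1, 2), (1, 2, 3), (4, 5)]
pairs = summarize_reports(reports)
for cell_set, n_c in pairs:
    print(sorted(cell_set), n_c)
```

Every set that appears in the result has been reported at least once, so $N_c \ge 1$ holds by construction.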
This information is eventually collected in a CoMP central unit (CCU) associated with the considered top cluster, which computes the cell clusters in an
adaptive manner by optimizing selected objectives. Fig. 7.6 illustrates the steps
of the entire process and the involved logical entities: At each period T , information is collected at the CCU and passed to an optimization algorithm that
adapts the cell clustering and redistributes back the new sets to all base stations
in the top cluster.
As in Section 7.1, the optimization problem to be solved by the CCU can
be written in classical linear programming notation. Let $\mathcal{C}$ denote a set of $C$
potential clusters chosen based on the aforementioned UE reports and according
to some heuristic. As before, the binary matrix $A \in \{0,1\}^{[M \times C]}$ states which cell
is involved in which potential cluster. Different from Section 7.1, however, clusters
may consist of varying numbers of cells within a heuristically chosen range
s.t. $A x \ge \mathbf{1}$ (7.25)
and $x \in \{0,1\}^{[C \times 1]}$ binary, (7.26)
where $x \in \{0,1\}^{[C \times 1]}$ is a binary vector stating which clusters have finally been
selected, and (7.25) ensures that each cell is involved in at least one cluster. If
this latter constraint is changed to an equality, disjoint clusters are enforced, as
in Section 7.1.1. Furthermore, equally sized clusters can be obtained by choosing
$K_{min} = K_{max}$. One of the key factors in defining an appropriate optimization
problem is the selection of the cost assigned to each potential cluster $c$. Looking
at the CoMP functionality, a trade-off between system complexity and performance
could, for example, be obtained by making the cost proportional to the cluster
cardinality $|\mathcal{M}_c|$ or to the number of required X2 interfaces (as large
clusters increase system complexity) and inversely proportional to the combined
radio conditions of the cluster cells (better radio conditions mean higher
performance). In order to account for the number of UEs that would benefit from a
certain cluster $c$, the cost can include a term inversely proportional to $N_c$,
i.e. to the number of UEs that have reported the cluster, leading to
[C1]
|Mc |
,
RSRPm 10Nc
(7.27)
mMc
to the active set size in wideband code division multiple access (WCDMA)
systems.
2. Associate with each potential cluster $c \in \mathcal{C}$ its cost according to (7.27).
3. Create an initial solution by adding the sets in order of increasing cost,
until all cells are included in the final solution or there are no more
potential clusters available.
A modified version is needed in the case of disjoint sets (set partitioning):
at each step, the sets overlapping with those already in the solution
are removed from the candidate list. The process stops when all cells
are covered or the candidate list is empty.
4. Improve the solution by step-wise replacing two (or more) sets with one set
not yet included, whose cost is lower than the sum of the costs of the replaced
sets.
This is in fact the only way to decrease the cost, as the initial solution was
built by selecting sets in order of increasing cost.
Cluster computations and simulation results following this scheme are detailed
in the next section.
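Steps 3 and 4 above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the results in this chapter: the costs are taken as given inputs, candidates that add no new cell are skipped in step 3 (an assumption not stated in the text), the improvement step only tries to replace pairs of sets and is written for the overlapping case, and all function names are hypothetical.

```python
from itertools import combinations


def greedy_clusters(candidates, costs, cells, disjoint=False):
    """Step 3: add candidate clusters in order of increasing cost.

    candidates: list of sets of cell ids (potential clusters)
    costs:      list of costs, one per candidate
    cells:      set of all cell ids to be covered
    disjoint:   if True, skip candidates overlapping the current solution
    """
    order = sorted(range(len(candidates)), key=lambda c: costs[c])
    selected, covered = [], set()
    for c in order:
        if covered >= cells:
            break  # all cells are covered
        cluster = candidates[c]
        if disjoint and cluster & covered:
            continue  # set partitioning: overlapping sets are dropped
        if not cluster - covered:
            continue  # assumption: skip sets that add no new cell
        selected.append(c)
        covered |= cluster
    return selected, covered


def improve(selected, candidates, costs, cells):
    """Step 4 (pairs only): replace two sets by one cheaper unused set."""
    selected = list(selected)
    improved = True
    while improved:
        improved = False
        for a, b in combinations(selected, 2):
            rest = [s for s in selected if s not in (a, b)]
            for c in range(len(candidates)):
                if c in selected or costs[c] >= costs[a] + costs[b]:
                    continue
                covered = set().union(*(candidates[s] for s in rest),
                                      candidates[c])
                if covered >= cells:  # coverage must be preserved
                    selected = rest + [c]
                    improved = True
                    break
            if improved:
                break
    return selected


# Hypothetical example: five cells, four potential clusters.
cells = {1, 2, 3, 4, 5}
candidates = [{1, 2}, {3, 4}, {1, 2, 3, 4}, {5}]
costs = [1.0, 1.5, 2.0, 3.0]
initial, covered = greedy_clusters(candidates, costs, cells)
final = improve(initial, candidates, costs, cells)
```

In this example the initial solution consists of clusters 0, 1, and 3, and the improvement step replaces clusters 0 and 1 by the cheaper cluster 2. Each replacement strictly lowers the total cost, so the loop terminates.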
7.2.3 Simulation Results
In order to evaluate the performance of the adaptive clustering principle,
system-level simulations were run using the hexagonal setup shown in Fig. 7.7 and
the simulation assumptions according to [3GP10d]. A 3GPP reference network
layout was configured with 19 three-sector sites and an inter-site distance (ISD)
of 500 m. Each of the 57 cells was equipped with $N_{bs} = 2$ antennas with a
downtilt of 15 degrees. The typical 3GPP urban macro spatial channel model extended
(SCME) in the 2 GHz band was used. 100 UEs were placed at
random locations within each of the 4 hotspot areas indicated in Fig. 7.7. UEs
were simulated with $N_{ue} = 2$ antennas, moving at a speed of 3 km/h. For each
UE, the 8 strongest interfering sectors were simulated as spatially correlated.
A signal bandwidth of 5 MHz was used, and the maximum transmit power per
sector was set to 20 W (43 dBm) per 5 MHz.
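As a quick sanity check of the figures quoted above, the conversion from watts to dBm and the cell count can be reproduced as follows. The helper name `watt_to_dbm` is illustrative only.

```python
import math


def watt_to_dbm(p_watt):
    """Convert a power in watts to dBm (decibels relative to 1 mW)."""
    return 10.0 * math.log10(p_watt * 1e3)


sites, sectors_per_site = 19, 3
num_cells = sites * sectors_per_site   # number of cells in the layout
tx_power_dbm = watt_to_dbm(20.0)       # maximum transmit power per sector
print(num_cells, round(tx_power_dbm, 2))
```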
Fig. 7.7 shows the result of the applied clustering algorithm, which was
configured to obtain the optimal solution for a disjoint set of clusters with up to
3 cells, using a shadow fading standard deviation of 2 dB. The clustering
algorithm took into account all RSRP measurements from UEs that were greater
than -120 dBm. It is apparent that for the two circular UE hotspots on
the upper right and lower left, the closest 3 sectors (4, 6, 26) and (14, 16, 42)
from three different sites were selected, respectively. For the other two line-type
hotspots on the upper left and lower right, the three geographically closest
sectors (10, 33, 35) and (18, 20, 52) in the middle of the area, as well as the
adjacent cells each belonging to a different site, were selected.
[Figure 7.7 panels: (a) No clustering, cell layout with X-Distance [m] and Y-Distance [m]; (b) cell layout with selected clusters; (c) CDF of geometry [dB], no CoMP compared with 2dB shad. and 8dB shad.; (d) CDF of geometry for Kmax = 3 vs. no CoMP [dB], 2dB shad. and 8dB shad.]
Figure 7.7 Cell layout with UE positions and selected clusters, and clustering gains.
In order to assess the performance of the adaptive clustering algorithm, network
simulations were run with the calculated clusters shown in Fig. 7.7(b). As
in the previous section, performance is measured via the interference geometry
introduced in (7.3). The cumulative distribution functions (CDFs) of the UE
geometries obtained for the calculated CoMP clusters using 2 dB and 8 dB shadow fading
standard deviation are depicted in Fig. 7.7(c) and compared to the corresponding
geometries if the UE is served by one cell only [1]. While a shadow fading
standard deviation of 8 dB represents the default value for outdoor scenarios, a
standard deviation of 2 dB is selected for indoor scenarios. It is seen in the figure
that for 50% of the observed geometries, the CoMP clustering algorithm results
in a 6 dB better geometry environment. Although the CDFs in Fig. 7.7(c)
show the geometry statistics of all UEs, the curves do not reflect the effective
improvement experienced by individual UEs at the very same position in the
network. These are instead represented in Fig. 7.7(d).
[1] Simulations without CoMP resulted in similar geometry curves for the selected
UE distributions in Fig. 7.7(a) for different shadow fading standard deviations, due
to the underlying 3GPP cross-correlation coefficient, which is set to 0.54 for
interfering cells from other sites and to 1.0 for interfering cells belonging to
the same site.
[Figure 7.8 panels: cell layouts with X-Distance [m] and Y-Distance [m]; CDF of geometry gain vs. static clust. [dB]; CDF of geometry loss w.r.t. ideal clust. [dB]; curves for 2dB shad. and 8dB shad.]
Figure 7.8 Cell layout with UE positions and selected clusters, cluster size Kmax = 3.
For the outdoor case with 8 dB shadow fading standard deviation, the CoMP
clustering in Fig. 7.7(b) achieves a median geometry improvement of 3.5 dB for
individual UEs, whereas for the indoor case, the reduced standard deviation of
2 dB leads to an even higher median gain of 5.7 dB.
One drawback of the adaptive clustering algorithm compared to pre-defined
(static) clusters is the need for an additional control entity in the network and
the increased signaling overhead to estimate a good set of clusters. Such addi-
[Figure 7.9 panels: cell layouts with X-Distance [m] and Y-Distance [m]]
Figure 7.9 Clustering results for different maximum cluster sizes Kmax.
[Figure: CDF of geometry gain vs. no CoMP [dB]; curves for SF=2dB and SF=8dB, each with CS=3, CS=4, CS=5, and CS=6]
ing. Considering more realistic antenna patterns is therefore of interest for future
evaluations. Another aspect worth mentioning is the modeling of shadow fading
in the 3GPP spatial channel model (SCM), where shadow fading is modeled as
spatially uncorrelated. A more realistic model, however, would take
the correlation of shadow fading over distance into account. This missing
correlation affects the presented results as well, since UEs located next to
each other can measure very different RSRP values from the same cells. When
spatial correlation of shadow fading is taken into account, it becomes more likely that
closely located UEs report similar sets of cells $S_j$, which should improve the
reliability of the adaptive clustering algorithm.
7.2.4 Signaling and Control Procedures
extended NRTs to the CCU via the DM and the Itf-N interface (LF2). The CCU
in the SON server receives the updated and extended NRTs (LF3) and computes
the updated clusters (LF4) within a top cluster as a result of the optimization
algorithm described in the previous subsections. Finally, the CCU transmits the
updated cluster information (LF3), which is then received by the base stations
(LF2). If a master/slave concept is applied for CoMP, the message may also
contain information on which cells act as master and slaves. Since in this architectural
framework the CCU is part of the OAM, the Itf-N interface is affected and needs
to be enhanced to support sending the extended NRTs to the CCU and
the updated cluster information back to the base stations. In principle, the CCU
could also be part of the serving gateway (S-GW) or the mobility management
entity (MME). In this case, the S1-U or the S1-C interface could be extended,
respectively. If the CCU is a completely separate entity, a new
interface needs to be introduced.
7.3 Summary
In this chapter, cell clustering techniques were discussed, which are an
essential prerequisite for using CoMP in practical cellular systems. One can
mainly differentiate between static clustering concepts, where certain sets of
cooperation-enabled base stations are defined once and then fixed, and adaptive,
or self-organizing, clustering concepts, where clusters are adapted over time to
user locations. Interference geometries have been introduced as a performance
metric for clustering, where all results can be compared to the theoretical
benchmark of UE-specific, ideal clustering, in which each terminal is served by
the best possible set of cells. We have seen that static clustering can already
obtain a geometry within a few dB of that of UE-specific clustering if
overlapping clusters are defined, i.e. if each cell is involved in multiple clusters
connected to different portions of the system resources. Performance can be
further improved at the expense of signaling overhead through adaptive clustering,
for which a concrete algorithm based on terminal-side radio channel measurements
already supported in LTE Release 8 was described. With adaptive clustering, up
to 70% of all terminals experienced geometry gains compared to the case
of static, non-overlapping clusters. It has to be noted, however, that the
performance of adaptive clustering is very sensitive to the choice of the cost function,
which is hence an important topic for future investigations.
Finally, it was shown that the proposed adaptive clustering scheme fits well
into the SON framework of LTE, with only slight extensions of the system
architecture and of the already existing SON ANR concept. In this case, the clustering
algorithm is run on the SON server of the OAM system.
Synchronization
This chapter deals with another major challenge connected to CoMP, namely
the synchronization of cooperating and cooperatively served devices in time and
frequency. On the one hand, the different local oscillators in each base station
and mobile terminal lead to deviations of the carrier frequency from
its nominal value. On the other hand, there are variations in the symbol
timing between each transmitter and receiver station. Both effects need to be
compensated by synchronization techniques.
In cellular networks, we can distinguish between network synchronization
among all involved base stations and the alignment of the user equipments to that
time and frequency reference. The basic definitions of the synchronization terms
as well as procedures for the reference network synchronization are described in
Section 8.1. The impact of symbol timing mismatches on CoMP is then treated
in Section 8.2, before Section 8.3 concludes this chapter with an analysis of the
impact of residual carrier frequency offsets on CoMP performance.
8.1 Synchronization Concepts
D. Richard Brown III and Andrew G. Klein
Synchronization is the process of establishing a common notion of time among
two or more entities. In the context of wired and wireless communication networks, synchronization enables coordination among the nodes in the network and
can facilitate applications such as distributed sensing. Precise synchronization
can also facilitate scheduling of communication resources as well as interference
avoidance in multi-access networks. This section provides an overview of some of
the synchronization concepts and techniques used in coordinated communication
networks.
8.1.1 Synchronization Terminology
In the context of wireless communication networks, each node in the network keeps
a local notion of time, i.e. a clock, by counting cycles of a local oscillator (LO).
Among other parameters, all oscillators are characterized in terms of their nominal
[Figure: clock B plotted against clock A, with instants t1, t2, t3 and interval T]
8.1.2 Network Synchronization
The Network Time Protocol
NTP is a protocol for synchronizing the clocks of nodes that are connected
through variable-latency networks [Mil91]. NTP is an application-layer protocol
that operates over the Internet protocol (IP), and can therefore be implemented
completely in software. The protocol has been in use since the 1980s, and today
it is responsible for synchronizing the clocks of the majority of computers
connected to the Internet. Nodes in the network are assigned to a class or stratum,
and those with the lowest stratum number are assumed to be perfectly
synchronized with Coordinated Universal Time (UTC). Nodes with higher stratum
numbers synchronize their clocks with nodes having lower stratum numbers. This
hierarchical structure makes NTP highly scalable.
[Figure 8.2: exchange of the timestamps $T_{i-3}$, $T_{i-2}$, $T_{i-1}$, $T_i$ between master and slave]
To estimate clock offsets, a master and slave exchange timestamps, which are
64-bit descriptions of their current local clock time. Figure 8.2 demonstrates the
exchange of timestamps between a master and slave. If $T_i$, $T_{i-1}$, $T_{i-2}$, and $T_{i-3}$ are
the four most recent timestamps, then the clock offset of the slave relative to the
master at time $T_i$ can be calculated via
$$\theta_i = \frac{T_{i-2} + T_{i-1} - T_{i-3} - T_i}{2}. \qquad (8.1)$$
Since each NTP message contains the last three timestamps $T_{i-1}$, $T_{i-2}$, $T_{i-3}$,
and the final timestamp $T_i$ is estimated upon arrival of the message, the clock
offset can be estimated from a single message exchange between slave and master.
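The offset computation in (8.1) can be illustrated with a short Python sketch. It assumes that $T_{i-3}$ and $T_i$ are read from the local clock (request sent, reply received) and that $T_{i-2}$ and $T_{i-1}$ are read from the remote clock (request received, reply sent). The round-trip delay returned alongside the offset is the usual NTP companion quantity and is not part of (8.1); the function name and the example numbers are hypothetical.

```python
def ntp_offset_and_delay(t_i3, t_i2, t_i1, t_i):
    """Clock offset as in (8.1) and round-trip delay from four timestamps.

    t_i3: request sent      (local clock)
    t_i2: request received  (remote clock)
    t_i1: reply sent        (remote clock)
    t_i:  reply received    (local clock)
    """
    offset = ((t_i2 - t_i3) + (t_i1 - t_i)) / 2.0
    # Time on the network: total elapsed time minus remote processing time.
    delay = (t_i - t_i3) - (t_i1 - t_i2)
    return offset, delay


# Assumed example: the remote clock is 5 units ahead of the local clock,
# the one-way delay is 2 units in each direction, processing takes 1 unit.
offset, delay = ntp_offset_and_delay(100.0, 107.0, 108.0, 105.0)
print(offset, delay)
```

With symmetric paths, as in this example, the computed offset equals the true clock difference. With asymmetric paths the estimate is biased by half the difference between the two one-way delays.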
Equation (8.1) implicitly assumes that the two transmission paths are
symmetric and have equal delay. In practice, however, network delays are stochastic
quantities. Consequently, NTP performs multiple offset estimates in combination
with a filtering and selection scheme to obtain a more accurate estimate
of the clock offset. The estimated clock offsets are fed to a Type-II
adaptive-parameter phase-locked loop (PLL), which corrects the LO phase and frequency.
An adaptive Type-II PLL has one integrator in the loop filter (or two poles
in the open-loop transfer function) and continuously adjusts the phase and
frequency [Smi86].
The accuracy of the protocol depends on a variety of factors, including the
update interval and network topology. Several studies (e.g. [Mil03, MTH97,
KZM07, Min99]) have investigated the performance of NTP under typical use,
showing that clock offsets have a standard deviation on the order of several
milliseconds, and residual frequency offsets on the order of 0.1 ppm.
The Precision Time Protocol
Also known as IEEE 1588 [IEE08a], PTP attains sub-microsecond accuracy,
which is necessary in applications such as networked control systems and
precision machinery in factories. The phase and frequency correction in PTP is quite
similar in principle to that in NTP: after a sequence of messages is exchanged between
slave and master, the clock offsets are estimated through filtering and selection,
and are used to adjust a PLL which corrects the LO phase and frequency. There
are, however, several fundamental differences between PTP and NTP. The
primary difference is that PTP is implemented in hardware rather than software.
By moving the clock synchronization as close to the physical layer as possible,
sources of jitter and processing delay introduced in network layers higher up the
stack can be mitigated. In addition, PTP is primarily intended to be used in
a local area network (LAN) setting, as opposed to NTP, which may synchronize
to an Internet clock reference located far away. While PTP
can achieve a higher accuracy than NTP, it does require the use of dedicated
hardware. The performance of PTP again depends on a variety of factors,
including the quality of the LO, as well as the network topology. Products already
available on the market today [Sem10] claim clock offsets within 1 µs and
frequency offsets better than 0.01 ppm. Similar results were achieved [Ton05] in
a test of PTP over a metropolitan area network.
The most recent version of the standard, referred to as IEEE 1588-2008, offers
a transparent clock mode which requires dedicated network switches that support
the standard. Such switches employ a transparent clock that further minimizes
delay by providing an alternate local clock for network nodes so that they need
not rely on the master clock. This mode permits maximum clock offset errors on
the order of tens of nanoseconds [HJ10].
8.1.3 Satellite-Based Synchronization
A Global Navigation Satellite System (GNSS) permits nodes to determine their
location to within a few meters using time signals received line-of-sight from
satellites. While the primary purpose of a GNSS is to determine position information,
such systems are also very useful as an accurate, common clock reference. In
contrast to NTP and PTP, clock synchronization using a GNSS is done wirelessly
using one-way communication links (i.e. by receiving signals broadcast from the
satellites). In order for a terrestrial node to be able to receive the relatively weak
signals from distant satellites, however, a line-of-sight link is typically necessary.
In the absence of precise location information, a node must be able to receive
signals from four satellites since there are four unknowns: latitude, longitude,
altitude, and time. If precise location information is available, only one satellite
is needed for clock synchronization since the propagation delay is known.
[Figure 8.3: two-transmitter distributed beamforming scenario with transmitters T1 and T2 and destination D]
Examples of GNSSs include the United States GPS, the Russian GLObal
NAvigation Satellite System (GLONASS), and the European Galileo system. As
stated in [LAK99], GPS provides clock synchronization to better than 100 ns
in time and $10^{-13}$ in frequency. Other satellite systems are expected to give
synchronization accuracy of a similar order, as they share many of the same
parameters as GPS [HP05a].
8.1.4 Endogenous Distributed Wireless Carrier Synchronization
To understand just how accurately the BSs must be synchronized to
facilitate retrodirective transmission, consider the two-transmitter distributed
beamforming scenario shown in Fig. 8.3, where both BSs simultaneously transmit
unmodulated carriers at radian frequency $\omega$ with the goal of having the carriers
arrive with identical phase, i.e. combine coherently, at the destination, i.e. the
mobile. Note that the BSs are implicitly syntonized in this scenario, but they
are unsynchronized such that transmitter 2 has a clock offset of $\delta$ with respect
to transmitter 1. After propagation through the unit-gain single-path channels,
the received signal at the destination can be written as
y(t) = exp{j(t )} + exp{j(t )},
where the baseband signals modulated by each carrier are omitted for clarity.
The received power can be computed as $|y(t)|^2 = 2 + 2\cos(\omega\delta) \le 4$. When the
transmitters are perfectly synchronized, i.e. $\delta = 0$, the carriers combine
coherently at the destination and the received power is $|y(t)|^2 = 4$. This corresponds to
the "ideal coherent" case in distributed beamforming. When the transmitters
are not synchronized, the received power will be less than in the ideal
coherent case. To illustrate the effect of unsynchronized transmitters, the clock offset
$\delta$ can be modeled as a zero-mean Gaussian distributed random variable with
standard deviation $\sigma_\delta$. Fig. 8.4 shows the received power at the destination as a
[Figure 8.4 plot: received power versus transmitter clock offset standard deviation [ns] for f0 = 800 MHz, 900 MHz, 1.4 GHz, 1.7 GHz, 2.1 GHz, and 2.6 GHz, with reference levels for ideal coherent, 90% of ideal coherent, and incoherent combining]
Figure 8.4 The effect of transmitter clock offset on distributed beamforming power for
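The dependence of the mean received power on the clock offset standard deviation can be sketched numerically. The sketch starts from the received power $2 + 2\cos(\omega\delta)$ given above and uses the standard result that $E[\cos(\omega\delta)] = \exp(-\omega^2\sigma_\delta^2/2)$ for a zero-mean Gaussian $\delta$ with standard deviation $\sigma_\delta$. The symbol names $\omega$, $\delta$, and $\sigma_\delta$ as well as the function names are assumptions made for illustration, and the sketch is not claimed to reproduce the curves of Fig. 8.4 exactly.

```python
import math


def mean_received_power(f0_hz, sigma_s):
    """Mean of 2 + 2*cos(omega*delta) for delta ~ N(0, sigma_s**2)."""
    omega = 2.0 * math.pi * f0_hz
    return 2.0 + 2.0 * math.exp(-0.5 * (omega * sigma_s) ** 2)


def sigma_for_power(target_power, f0_hz):
    """Clock offset standard deviation giving a mean power in (2, 4]."""
    omega = 2.0 * math.pi * f0_hz
    return math.sqrt(-2.0 * math.log((target_power - 2.0) / 2.0)) / omega


# Carrier frequencies as listed in the legend of Fig. 8.4.
for f0 in (800e6, 900e6, 1.4e9, 1.7e9, 2.1e9, 2.6e9):
    sigma_90 = sigma_for_power(0.9 * 4.0, f0)  # 90% of ideal coherent
    print(f"f0 = {f0 / 1e9:.1f} GHz: sigma = {sigma_90 * 1e9:.3f} ns")
```

Under these assumptions, the mean power falls from 4 (ideal coherent) towards 2 (incoherent) as the standard deviation grows, and the tolerable clock offset scales inversely with the carrier frequency.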
) 2 T63 T42 ,
cov
,
a
T2 T
where $T$ is the duration of the observation, and the notation $A \succeq B$ means that
$A - B$ is positive semi-definite.
As an example, Fig. 8.5 shows the clock and frequency offset standard
deviations for the two-way carrier synchronization protocol developed in [PD10]. This
example assumes seven transmitters serially exchange 1 GHz wireless beacons
[Figure 8.5 plots: clock and frequency offset standard deviations versus beacon SNR [dB] for T=1s resync, T=500ms resync, T=250ms resync, and T=125ms resync]
Figure 8.5 Clock and frequency offset standard deviations as a function of beacon
SNR and re-synchronization interval for seven transmitters synchronized via the
two-way carrier synchronization protocol.
8.1.5 Summary
In this section, we have introduced the concept of synchronization and described
several approaches to the problem of synchronizing nodes in a coordinated communication network. These techniques can be used separately or in conjunction
to facilitate the establishment of a common notion of both frequency and time
8.2 Imperfect Sync in Time: Perf. Degradation and Compensation
(8.3)
A possible timing scenario is shown on the right side of Fig. 8.6, where the
signals of three transmitters are received by one receiver. The desired transmitter
(Tx#1) is synchronized to the receiver, while the others are delayed such that
the TDOA of transmitter 2 (Tx#2) lies within the CP, but the ISI that is caused
by the channel decay already leaks into the discrete Fourier transform (DFT)
window. The third transmitter (Tx#3) even violates the CP limit, such that a
portion of the previous OFDM symbol leaks into the DFT span.
[Figure 8.6 panels: left, hexagonal cell structure with BS#1, BS#2, BS#3 and UE#1, UE#2, UE#3; right, timing of Tx #1 (desired), Tx #2, and Tx #3 relative to the Rx DFT window starting at t=0, showing the CP, the OFDM symbol, and inter-symbol interference (ISI) from the previous symbol]
Figure 8.6 Hexagonal cell structure and possible timing scenarios in CoMP systems.
As we will see in Subsection 8.2.1, ISI is introduced into the system on top
of multi-user interference (MUI) if the maximum TDOAs are not limited to lie
within the CP (see e.g. [WG00], [WXBD09], [ZMM+08], [Ham10]). The amount
of additional interference depends on the degree of timing mismatch. Fig. 8.7(a)
depicts the joint distribution of the TDOAs occurring after synchronization for
different inter-site distances (ISDs), for the case that three users are uniformly
distributed within a hexagonal cooperative cell that is served by three base
stations. As an example, we also indicate by vertical lines the bounds for the short
and the long CP lengths used in 3GPP LTE systems ($T_{CP}$ = 4.7 µs / 16.7 µs).
Besides the path delays, another important eect that needs to be considered is
the pathloss (d) with
' *
d
,
(8.4)
(d) =
d0
which also depends on the distance between the transmitters and receivers. Note
that is used as pathloss exponent, d0 as the reference distance and as an
attenuation factor that depends on the environment here. If we consider the
pathloss attenuations from all links, we can also form a coupling matrix d
similar to $T_d$. As is known from the literature, the attenuation of each link due to
pathloss leads to special structures of the channel matrix, e.g. diagonally or row
dominated matrices (depending on the relative position of the users). Likewise,
the possible interference power is also attenuated. To compensate for the pathloss,
the transmit power can be controlled, which is, however, limited to the maximum
transmit power $P_{max}$. Therefore, especially in large cells, we may not be able
to achieve a required target signal power level throughout the whole serving
area. A convenient metric to assess the decoupling of two links is the separation
factor (SF), which defines the ratio between maximum and minimum receive power
[Figure 8.7 panels: (a) TDOA distribution, CDF versus TDOA [µs] for ISD={1000:1000:10000}m; (b) SF distribution, CDF versus separation factor SF [dB], curve parameter ={2:0.5:4}]
(8.5)
r2
D(1 + rDC + DC2 )1/2 . We use D = dISD / 3 as cell diameter here. In order to fulfill
the cyclic prefix restriction, we can rewrite (8.3) as
d2 d1
(8.6)
TCP (L).
c
If we reorder (8.6), we obtain an expression for the cooperation radius of the
circle in which ISI-free CoMP is possible (see [KJRF10] for details), i.e.
rC,TDOA
(8.7)
If we only allow joint signal processing for users who are within a certain
cooperation range , we can define the second constraint as
' *
d2
.
(8.8)
d1
a
2
4
a2
1 + 22/
.
1 D with a =
4
1 2/
(8.9)
However, in systems with limited transmit power, we also have to ensure that
a minimum link signal-to-noise ratio (SNR) can be achieved even when only a low link
separation is available. The synchronization effects that need to be characterized
for CoMP systems can be summarized as follows:
• The TDOAs must not exceed the CP limitation, otherwise ISI is induced.
• The channel length increases above the CP constraint.
• The level of ISI power depends on the TDOA and the link separation.
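The first two effects in the list above can be illustrated with a small Python sketch that converts a distance difference into a TDOA and compares it with the CP lengths quoted earlier for 3GPP LTE (4.7 µs and 16.7 µs). The function names, the example distances, and the optional `channel_delay_s` parameter, which stands in for the channel excess delay, are assumptions made for illustration.

```python
C_LIGHT = 299_792_458.0   # speed of light in m/s
CP_SHORT_S = 4.7e-6       # short LTE cyclic prefix, as quoted in the text
CP_LONG_S = 16.7e-6       # long LTE cyclic prefix, as quoted in the text


def tdoa_s(d1_m, d2_m):
    """Time difference of arrival for two transmitter distances in metres."""
    return abs(d2_m - d1_m) / C_LIGHT


def within_cp(d1_m, d2_m, t_cp_s, channel_delay_s=0.0):
    """True if the TDOA plus the (assumed) channel excess delay fits the CP."""
    return tdoa_s(d1_m, d2_m) + channel_delay_s <= t_cp_s


# Assumed example: the desired BS is 500 m away, the cooperating BS 2000 m.
print(f"TDOA = {tdoa_s(500.0, 2000.0) * 1e6:.2f} us")
print("short CP ok:", within_cp(500.0, 2000.0, CP_SHORT_S))
print("long CP ok: ", within_cp(500.0, 2000.0, CP_LONG_S))
```

A distance difference of 1500 m already exceeds the short CP in this example, while a nonzero channel excess delay tightens the limit further, which mirrors the second item in the list.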
8.2.1 MIMO OFDM Transmission with Asynchronous Interference
After transmission over a channel specified by the taps $h_k^m(o, i')$ for the link
between transmitter $k$ and receiver $m$, the time domain signal is given as
ym (o, i) =
L
K
'
m
m (o, i' ) x
(o,
i
)
+n
m (o, i).
h
k
k
k
(8.11)
k=1 i' =1
Here, n
NC (0, n2 I) is the receive noise in time domain. The channel
'
taps of the link specic channel impulse responses are modeled as hm
k (o, i )
NC (0, h2 m (o,i' ) km (d)), where L = 5 (L)/TS 6 represents the discrete channel
k
length and h2 m (o,i' ) the tap variance given by the corresponding power delay
k
m
prole. The parameter m
k = 5k (d)/TS 6 expresses a timing oset (given in
samples). The received signal in frequency domain is obtained by the DFT operation applied to the received samples ym (o, i) as
$$y^m(o, q) = \frac{1}{\sqrt{N}} \sum_{i=0}^{N-1} \tilde{y}^m(o, i)\, e^{-j\frac{2\pi}{N} q i}, \quad q \in \mathcal{Q}. \qquad (8.12)$$
In [KF10], a closed-form solution of (8.12) is derived that gives us an expression for the frequency domain transmission with arbitrary symbol timing offsets. The transmission can then be summarized for the received signal at the m-th
receiver branch with
K
2 m '
ym (o, q) =
Ekm (o, q, q ' )Hkm (o, q ' , q ' )xk (o, q ' )ej N k q
k=1 q' D
K
k=1 q ' D
K
K
k=1 q ' D
q '' =0
k=1 q ' D
N
1
Ekm (o 1, q, q ' )Hkm (o, q' , q ' )xk (o 1, q ' )eo N (k NCP )q
2
N
1
'
q '' =0
(8.13)
m
[N N ]
which
where Ekm are elements of the matrices Em
k (o) , Ek (o 1) C
include the inter-carrier interference (ICI) due to the windowing of the current
and previous OFDM symbols in time domain in the case that the CP limit is
N
m
m
'
m
k
))
Ek (o, q, q ) = 1 j
o sin( N (NB
k
N
e
m
N
k
sin
Ekm (o
'
1, q, q ) =
m NCP
k
N
1
N e
m
k =0
otherwise
m
j
sin( Nk (m
N ))
km CP
N o1
k
sin
N
m
k =0
otherwise
'
'
with m
k = q q q, q = 1...N and
m
o = m
k (k + NB 2NCP 1)
m
o1 = m
k (k NCP 1).
In (8.13), Hkm are elements of the diagonal matrix with the channel trans[N N ]
. By
fer function (CTF) in frequency domain on a certain link Hm
k C
[N N ]
using the Fourier transform matrix F C
with the elements F (q, q ' ) =
2
'
H , where
ej N q q / N , this channel matrix can also be written as H = FHF
[NB NB ]
N 1 N 1
2
1 m
'
Ck (o, a, b)ej N (bq aq)
N a=0
(8.14)
b=0
Ckm (o
N 1 N 1
2
1 m
'
1, q, q ) =
Ck (o 1, a, b)ej N (bq aq) ,
N a=0
'
(8.15)
b=0
where Ckm (o) and Ckm (o 1) are elements of special Toeplitz matrices in time
domain that are explained in more detail in [KF10]. It should be noted that
m
m
for m
k NCP , Ek (o) becomes an identity matrix and Ek (o 1) changes to a
m
zero matrix. Within the range of NCP L + 1 k NCP , we dene an eective cyclic extension NCP,e = NCP L + 1 for the following which gives us the
interference free range within the CP. For the case that m
k NCP,e , C(o) and
C(o 1) also become zero matrices and we get the well known asynchronous
interference free transmission equation in frequency domain with
K
2 m
y (o, q) =
Hkm (o, q, q)xk (o, q)ej N k q ,
m
(8.16)
k=1
where we only have a phase slope caused by over all sub-carriers within one
OFDM symbol.
'
q D
q ' D\q
MUI
ICI
ISI
u(o,q)
(8.17)
As we can observe in this expression, we now have a coupling between adjacent sub-carriers (ICI) and between consecutive OFDM symbols (ISI) in addition to the
coupling between users, i.e. the multi-user interference (MUI). A characterization of the ISI
and ICI is done for example in [SDAD02], [SM03], [MC06] and [NK02].
8.2.2
Interference-Aware Multi-User Joint Detection and Transmission
+
G(o, q)Z(o, q ' )W(o, q ' ) P(o, q ' )x(o, q ' )
q ' D\q
q ' D
(8.18)
with
uu =
Z(o, q ' )W(o, q ' )P(o, q ' )W(o, q ' )H Z(o, q ' )H
q' D\q
Z(o 1, q ' )W(o 1, q ' )P(o 1, q ' )W(o 1, q ' )H Z(o 1, q ' )H .
q ' D
(8.20)
It should be noted that for frequency-selective channels we have to include a
separate precoding filter and power allocation vector for all adjacent sub-carriers
in order to get an exact expression for the MSE.
The interference-aware receive filter can then be obtained by minimizing the
sum mean squared error (SMSE), i.e.
!!
,"22
G = argmin {tr {ee }} = argmin E "x x
.
(8.21)
G
(8.22)
#
with $\mathbf{\Phi}_{yy} = E\{\mathbf{y}\mathbf{y}^H\}$. The interference-aware transmit filter can be derived by
solving the optimization criterion of (8.21) with respect to $\mathbf{W}$ such that
%
&
%+
+ &
+ !
+
+ +2
1 +2
+
, 2 | E +W Px+ = Pmax ,
(8.23)
W = argmin E x x
"
{W,}
tr(nn + uu )
W = ZH ZZH +
IK
Pmax
(8.24)
'nn
%+
+ &
"
#
+ +2
E +W Px+ = 2 tr WPWH = Pmax
2
>
?
Pmax
?
!.
=@
tr ZZH (ZZH + 'nn )2 P
(8.25)
(8.26)
If we assume an uplink transmission where the users have only one transmit antenna and are not able to communicate with each other, the transmit filter
becomes an identity matrix $\mathbf{W} = \mathbf{I}_K$. Under this constraint, we can derive an
expression for the post equalization signal-to-interference-and-noise ratio (SINR)
of the k-th user after the joint BS signal processing as the ratio of the desired
signal power and the portion of MUI, ISI and ICI plus noise as
SINRk =
gkH
pk gkH zk zH
k gk
K
r=1,r=k
.
pr zr zH
r + uu + vv gk
(8.27)
It is worth mentioning that the filter matrix used only aims at canceling
the multi-user interference for the desired sub-carrier with the knowledge of
the colored noise u (see (8.17)). If we use the filter defined in (8.22), the post
equalization SINR for the k-th user yields
SINRk =
(zk )H 1
yy zk
.
H
1 (zk ) 1
yy zk
(8.28)
SINRk = K
r=1,r=k
with
u2 =
H
2 ( 2 + 2 )
pr zH
u
v
k w r w r zk +
(8.29)
zk (o, q ' )H W(o, q ' )P(o, q ' )W(o, q ' )H zk (o, q ' )
q' D\q
q ' D
(8.30)
In [AA09], a joint optimization of the receive and transmit filter as well as
power control in systems with asynchronous interference has been presented
based on the results in [SB04] and [ZMM+08]. In order to simplify the analysis,
we introduce a simple power control scheme here. The transmit power of each
transmitter is controlled in order to achieve a target SNR for transmitter k on the strongest
link within one column of the channel matrix. The uplink transmit power values
in that case can be obtained by
pk =
k v2
max |zkm |2
!.
(8.31)
8.2.3
Value
256
120
18
3.84 MHz
15 kHz
800 MHz
20 dB
23 dBm
7 dB
1 dBi
43 dBm
4 dB
15 dBi
10 log10(1.38 · 10^-23 J/K · 290 K · B_SC) = -132 dBm
2 dB
3 dB
1.69 dB @800 MHz
3.86
(1)
L
L1
1
e = 1 , = ln(0.1)
=1
[Plots for Figure 8.8: (a) Average SINR loss over r_C / D; (b) Occurring TDOAs in µs over r_C / D. Curve annotations: T_CP = 4.7 µs, τ(L) = 2.3 µs, τ(L) = 0, SUD.]
Figure 8.8 Average SINR loss and occurring TDOAs in a symmetric user scenario.
we show results where we assume that we have a CP length that can cover all
possible TDOAs, i.e. the ISI-free part of the CP, $T_{CP} - \tau(L)$, exceeds every occurring TDOA. For that simulation, we now used the
pathloss model as described above. As can be observed, the signal attenuations due to the pathloss lead to a decoupling of the channel matrix between the
diagonal and off-diagonal entries with increasing $r_C$. In the cell center, the single-user power control scheme is not able to control the transmit powers to achieve
the target SNR, since the multi-user interference is not considered. That is the
reason why we can observe an initial SINR loss of 2 dB. In a next scenario,
we limit the CP as denoted in Table 8.1 with $T_{CP} = 4.7\,\mu$s. One can again see an
SINR degradation from the point where the CP limit is exceeded, but we
can also observe that the ISI power is attenuated due to pathloss with increasing $r_C$. The single user detection (SUD) performance is included as a general
bound where we assume that each base station only wants to detect the closest
user without the knowledge of the others. A key observation of the presented
results is that, as expected, the asynchronous interference
is the dominant degrading effect only in a small region. On the one hand, the TDOAs must exceed the
ISI-free range within the CP. It should be noted that in real systems, due to
timing estimation errors, the ISI-free range can be reduced in addition to the
reduction caused by the channel decay. On the other hand, the ISI power from
the interfering users, which is also attenuated by the pathloss, needs to exceed
a certain threshold such that it leads to additional interference. Among others,
this threshold mainly depends on inter-site distance, carrier frequency, pathloss
exponent and transmit power. Furthermore, it should be mentioned that for the
downlink case we can achieve similar results since the problem of asynchronous
interference is equivalent in both directions.
8.2.4
Summary
In this section, we investigated CoMP systems in the case that the time differences of arrival after single link synchronization exceed the limitation which
is given by the cyclic prefix in OFDM systems. We analytically described the
impact on the OFDM transmission model as well as on multi-user joint detection and transmission. In numerical simulations, we analyzed how the additional
asynchronous interference affects the SINR performance in hexagonal cells within
a simple symmetric user configuration setup. We could show that the SINR loss
increases if the residual symbol timing offsets violate the cyclic prefix limitation
until the pathloss leads to a decoupling of the users and consequently to an
attenuation of the asynchronous interference.
8.3
have decided to thoroughly analyze the simplest case of a CoMP system, with two
cooperating single-antenna base stations (BSs) serving two single-antenna user
equipments (UEs). The obtained closed-form expressions give us clear insights
into the exact relations between the frequency errors and the measures of interest. For systems of higher dimension, it can be expected that these relations are
basically sustained, and performance simply scales (to some extent) with the
number of antennas.
8.3.1
Downlink Analysis
Inter-stream Interference of Spatially Precoded Streams
We consider the downlink of an exemplary CoMP scheme, where M single-antenna BSs transmit to K single-antenna UEs. The K UEs are all assumed to
be assigned to the same physical resource block (PRB). In the sequel, we focus
on the signal conditions observed at a single sub-carrier of this PRB. According to the notation introduced in Section 3.5, the transmission equation for this
sub-carrier can be given as
$$\mathbf{y} = \mathbf{H}^H \mathbf{W}\mathbf{x} + \mathbf{n}. \qquad (8.32)$$
(8.33)
where fm is the CFO between BS m and the common reference. For notational
convenience, we introduce the angular frequencies of the CFOs m = 2fm .
The matrix (t) is incorporated into the transmission equation (8.32) according
to
y = HH (t)Wx + n.
(8.34)
In general, there are also CFOs observed at the side of the receiver, representing the offsets of the receivers' oscillators. These could be captured by another
CFO matrix, as it has been done in the equation for the uplink in (8.54). However, those CFOs have been neglected in (8.34), as they can easily be tracked
continuously and compensated by standard synchronization techniques for the
downlink [MKP07].
We now draw our attention to the joint precoding matrix $\mathbf{W}$. This matrix can
be separated into the product of two matrices
$$\mathbf{W} = \mathbf{C}\mathbf{P} = [p_1\mathbf{c}_1 \,\ldots\, p_K\mathbf{c}_K], \quad \mathbf{C} = [\mathbf{c}_1 \,\ldots\, \mathbf{c}_K], \quad \mathbf{P} = \mathrm{diag}(p_1, \ldots, p_K). \qquad (8.35)$$
The matrix $\mathbf{C}$ represents the algebraic function used to diagonalize the effective transmission channel $\mathbf{H}^H\mathbf{W}$, while $\mathbf{P}$ is a diagonal matrix, whose diagonal
elements $p_k$ represent the scaling of the column vectors $\mathbf{c}_k$ in $\mathbf{C}$ (i.e. the precoding beams for the transmit symbols $x_k$ in vector $\mathbf{x}$) according to the power
allocation. As we do not assume the columns $\mathbf{c}_k$ to be normalized, the relation
between $p_k$ and the transmit power $P_k$ allocated to beam $k$ can be characterized
by $p_k = \|\mathbf{c}_k\|^{-1}\sqrt{P_k}$.
To analyze the impact of CFOs on the signal conditions at the receivers, we
focus on the simplest case of a CoMP system, where two BSs, each with a
single antenna, cooperate to simultaneously serve two single-antenna terminals.
Then the channel matrix is given as
$$\mathbf{H}^H = \begin{pmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{pmatrix} = \begin{pmatrix} \mathbf{h}_1^T \\ \mathbf{h}_2^T \end{pmatrix}, \qquad (8.36)$$
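where $\mathbf{h}_k$ specify the channel components connected to UE $k$. To diagonalize
the effective channel $\mathbf{H}^H\mathbf{W}$, we use zero-forcing (ZF) precoding. Assuming that
both BSs have ideal channel knowledge at time instant $t = 0$, the ZF matrix $\mathbf{C}$ is
calculated according to
$$\mathbf{C} = (\mathbf{H}^H)^{-1} = \frac{1}{h_{11}h_{22} - h_{12}h_{21}} \begin{pmatrix} h_{22} & -h_{12} \\ -h_{21} & h_{11} \end{pmatrix} = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix}. \qquad (8.37)$$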
The constant scaling factor in front of the matrix will in the following be
denoted as . The eective channel matrix for time instant t = 0 then yields
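$$\mathbf{H}^H(t)\mathbf{W} = \begin{pmatrix} \mathbf{h}_1^T(t)\mathbf{c}_1 p_1 & \mathbf{h}_1^T(t)\mathbf{c}_2 p_2 \\ \mathbf{h}_2^T(t)\mathbf{c}_1 p_1 & \mathbf{h}_2^T(t)\mathbf{c}_2 p_2 \end{pmatrix}. \qquad (8.38)$$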
Here, the k-th diagonal element of the matrix represents the effective channel of the transmit symbol $x_k$ intended for the k-th receiver, while the off-diagonal
elements in row k at position m represent the interference from symbol $x_m$
on the signal seen at the k-th receiver. As we assume symmetric conditions for all
receivers, we focus exemplarily on the signal received at the first receiver,
i.e. $y_1 = \mathbf{h}_1^T(t)\mathbf{W}\mathbf{x}$. The signal $y_1$ will be separated into a desired and an interference part, $y_1 = y_{1,d} + y_{1,i}$, which will be analyzed separately in the following.
The desired signal $y_{1,d}$ contains the contribution from $x_1$; with (8.33), (8.36)
and (8.37), it amounts to
y1,d = hT1 (t)c1 p1 x1
21
(8.39)
In the above equation, we observe that the ZF pre-compensated channel is distorted by the complex term $(1 - \exp(j\Delta\omega_{21} t))\, h_{12} c_{21}$. It is reasonable to assume
that BSs and UEs are spaced sufficiently apart from each other, and hence the
single channel coefficients $h_{ij}$ can be considered to be mutually independent.
Consequently, the product $h_{12} c_{21}$ represents a random complex number with
zero mean. The amplitude of the ZF pre-compensated channel may therefore
increase or decrease with the same probability.
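In a similar manner, we can calculate the interference signal $y_{1,i}$, which reflects
the inter-stream interference from the unintended signal $x_2$:
$$\begin{aligned} y_{1,i} &= \mathbf{h}_1^T(t)\mathbf{c}_2 p_2 x_2 \\ &= \frac{p_2 \exp(j\omega_1 t)}{h_{11}h_{22} - h_{12}h_{21}} \left[(\exp(j\Delta\omega_{21} t) - 1)\, h_{11} h_{12}\right] x_2 \\ &= p_2 \exp(j\omega_1 t) \left[0 - (1 - \exp(j\Delta\omega_{21} t))\, h_{12} c_{22}\right] x_2. \end{aligned} \qquad (8.40)$$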
Note the similar structure in (8.40) compared to (8.39), which both exhibit
the same weighting factor $(1 - \exp(j\Delta\omega_{21} t))$ scaling the complex distortion. The
complex distortion is formed here by the product of $h_{12}$ and $c_{22}$, which is independent of the value altering the useful signal part in (8.39), as $c_{21}$ is independent
of $c_{22}$. We can therefore conclude that the inter-stream interference introduced
by the CFOs has a magnitude that is similar to that of the amplitude change of
the desired signal, as long as $c_{21}$ and $c_{22}$ can be assumed to have similar mean
power.
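The structure of the desired and interference terms can be checked numerically. The following minimal sketch assumes that the CFOs enter as a diagonal phase rotation matrix between the channel and the precoder (called `phi` here, a name chosen for this example), draws a hypothetical random 2x2 channel, and compares the entries of the effective channel with the closed-form terms discussed above. The CFO values and the elapsed time are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical channel realization: rows of Hh are h_1^T and h_2^T, cf. (8.36).
Hh = (rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))) / np.sqrt(2)
C = np.linalg.inv(Hh)             # ZF matrix, cf. (8.37)
p = np.array([1.0, 1.0])          # amplitude scalings p_1, p_2 (assumed equal)
W = C @ np.diag(p)


def effective_channel(t, omega):
    """Return H^H phi(t) W with phi(t) = diag(exp(j * omega_m * t))."""
    phi = np.diag(np.exp(1j * np.asarray(omega) * t))
    return Hh @ phi @ W


omega = 2 * np.pi * np.array([0.0, 2.0])   # assumed CFOs of 0 Hz and 2 Hz
t = 5e-3                                    # 5 ms after the precoder was computed
Heff = effective_channel(t, omega)

d_omega = omega[1] - omega[0]
rot = np.exp(1j * omega[0] * t)
dist = 1 - np.exp(1j * d_omega * t)         # common weighting factor
desired = p[0] * rot * (1 - dist * Hh[0, 1] * C[1, 0])
interference = p[1] * rot * (-dist * Hh[0, 1] * C[1, 1])

print(abs(Heff[0, 0] - desired), abs(Heff[0, 1] - interference))
```

At t = 0 the effective channel is diagonal; for t > 0 both the diagonal and the off-diagonal entry of the first row deviate by terms carrying the same weighting factor.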
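From the above results, we will now derive an expression for the SIR between
desired and interference signal. From (8.39) and (8.40), the instantaneous $\mathrm{SIR}_i$
can be given as
$$\mathrm{SIR}_i = \frac{|y_{1,d}|^2}{|y_{1,i}|^2} = \left| \frac{p_1 \left(h_{11}h_{22} - \exp(j\Delta\omega_{21} t)\, h_{12}h_{21}\right)}{p_2 \left(1 - \exp(j\Delta\omega_{21} t)\right) h_{11}h_{12}} \right|^2. \qquad (8.41)$$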
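To obtain an estimate for the mean SIR, we use Jensen's inequality, allowing
us to determine the mean value for numerator and denominator separately:
$$\overline{\mathrm{SIR}} \approx \frac{E\{|y_{1,d}|^2\}}{E\{|y_{1,i}|^2\}} = \frac{P_1}{P_2} \cdot \frac{E\{\|\mathbf{c}_2\|^2\, |h_{11}h_{22} - \exp(j\Delta\omega_{21} t)\, h_{12}h_{21}|^2\}}{|1 - \exp(j\Delta\omega_{21} t)|^2\; E\{\|\mathbf{c}_1\|^2\, |h_{11}h_{12}|^2\}}. \qquad (8.42)$$
$$\overline{\mathrm{SIR}} = \frac{P_1}{P_2 \left(1 - \cos(\Delta\omega_{21} t)\right)} \approx \frac{2 P_1}{(\Delta\omega_{21} t)^2 P_2}, \qquad (8.43)$$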
[Plot for Figure 8.9: SIR [dB] (0 to 35 dB) over $\Delta\omega_{21} t$ (0 to 1.0).]
Figure 8.9 SIR degradation due to CFO-induced inter-stream interference of two
spatially precoded streams according to (8.43) (P1 = P2 ).
where the Taylor expansion of cosine for small angles, $\cos(\varphi) \approx 1 - 0.5\varphi^2$, was
used to obtain the approximation on the right-hand side. An illustration of this
SIR relation for $P_1 = P_2$ is given in Fig. 8.9.
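To get a feeling for how tight the small-angle approximation is, the following sketch evaluates both sides of the SIR relation. It assumes that (8.43) reads SIR = P1 / (P2 (1 - cos(x))) and approximately 2 P1 / (x^2 P2), with x denoting the product of CFO difference and time; the function names are hypothetical.

```python
import math


def sir_exact_db(x, p1=1.0, p2=1.0):
    """Mean SIR in dB from the cosine expression; x = CFO difference times t."""
    return 10 * math.log10(p1 / (p2 * (1 - math.cos(x))))


def sir_approx_db(x, p1=1.0, p2=1.0):
    """Mean SIR in dB from the small-angle approximation cos(x) ~ 1 - x^2/2."""
    return 10 * math.log10(2 * p1 / (p2 * x ** 2))


for x in (0.05, 0.2, 0.5, 1.0):
    print(f"x = {x:4.2f}: exact {sir_exact_db(x):6.2f} dB, "
          f"approx {sir_approx_db(x):6.2f} dB")
```

For small x the two curves are practically indistinguishable; the gap only becomes visible as x approaches 1.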
Eq. (8.43) reveals that the mean SIR resulting from the CFO-induced inter-stream interference between the spatially precoded streams strongly depends on
the difference of the CFOs observed at the transmitters, $\Delta\omega_{21} = \omega_2 - \omega_1$, as well
as on the ratio of the powers $P_m$ allocated to the single transmission streams. As long
as the precoding matrix $\mathbf{C}$ is not updated, the interference grows continuously
over time with the factor $(\Delta\omega_{21} t)^2$. From the above result, we can thus deduce
a requirement for the maximum interval between updates of the precoding matrix $\mathbf{C}$ to
achieve a desired SIR constraint $\gamma$ (for $P_1 = P_2$):
$$\Delta t = (\Delta\omega_{21})^{-1} \sqrt{2\gamma^{-1}}. \qquad (8.44)$$
The CFO difference $\Delta\omega_{21}$ can be related to the accuracy $a$ achieved for the
oscillators used at the transmitters after synchronization. With the carrier frequency $f_c$, the relation $\Delta\omega_{21} = 2\pi a f_c$ holds. To give an example, assume that
a CoMP system operates at a carrier frequency $f_c = 2$ GHz, synchronization
of the transmitters has achieved an accuracy of $a = 1$ part per billion (ppb)
$= 10^{-3}$ parts per million (ppm), and the SIR target the system should
not fall below is 20 dB. Then the update interval should be around 11 ms,
which is on the order of the duration of only a few radio frames in modern
wireless communication systems. This example clearly indicates that synchronization requirements for downlink CoMP precoding are very strict; however, we
have seen in Section 8.1 that synchronization methods are readily available that
are capable of establishing these.
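The numbers of this example can be reproduced with a few lines. The sketch below assumes equal stream powers and the small-angle form of the SIR relation, so that the admissible time between precoder updates is sqrt(2 / SIR target) divided by the angular CFO difference; the function name is hypothetical.

```python
import math


def update_interval(sir_target_db, accuracy, carrier_hz):
    """Time (in s) after which the mean SIR drops to the target, for P1 = P2."""
    sir_target = 10 ** (sir_target_db / 10)
    delta_omega = 2 * math.pi * accuracy * carrier_hz  # angular CFO difference
    return math.sqrt(2 / sir_target) / delta_omega


# Example from the text: 20 dB SIR target, 1 ppb accuracy, 2 GHz carrier.
dt = update_interval(20.0, 1e-9, 2e9)
print(f"update interval: {dt * 1e3:.1f} ms")
```

Relaxing the oscillator accuracy by a factor of ten shortens the admissible interval by the same factor, which illustrates how quickly the requirement becomes impractical.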
L1
(8.45)
l=0
h(q)
with
q Q := {0, . . . , Q 1}
q D := {Q/2 + 1, . . . , Q/2}.
(8.47)
The first term in this expression represents the useful signal, while the sum
term represents the distortions from ICI.
[Figure: SIR [dB] (0 to 35 dB) over an abscissa ranging from 0 to 0.30.]
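As derived in detail in [Sch09], it can be shown that the mean power of the
useful signal, $P_u$, and the mean power of the ICI, $P_{ICI}$, amount to
$$P_u = P_s \sigma_h^2\, \mathrm{si}^2(\pi \Delta f\, Q T_s) \qquad (8.49)$$
$$P_{ICI} = P_s \sigma_h^2 \left(1 - \mathrm{si}^2(\pi \Delta f\, Q T_s)\right), \qquad (8.50)$$
so that their ratio, i.e. the SIR, is
$$\frac{P_u}{P_{ICI}} = \frac{\mathrm{si}^2(\pi \Delta f\, Q T_s)}{1 - \mathrm{si}^2(\pi \Delta f\, Q T_s)}. \qquad (8.51)$$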
the inter-stream interference may be neglected and the effective channel may be
understood as a set of independent, orthogonal SISO channels. The sub-carrier
channel $h(q)$ seen by the first receiving UE is thus the effective SISO channel
$$h(q) = \mathbf{h}_1^T(q)\mathbf{c}_1(q)p_1(q) \approx p_1(q). \qquad (8.52)$$
Before inserting this effective channel into (8.46) to obtain the useful channel and
ICI channels seen by the first receiving UE, some considerations on the CFOs
$\Delta f_m$ are necessary: As both CFOs $\Delta f_m$ have an influence on the exact value of
the ICI, we can use their maximum $\Delta f = \max_m \Delta f_m$ to upper bound it. Based
on this, (8.46) can be directly translated for the effective channel seen by the
first UE to
$$h(q, \Delta q) = p_1(q) \exp(j\pi \Delta f\, Q T_s)\, \mathrm{si}\left(\pi(\Delta f\, Q T_s - \Delta q)\right). \qquad (8.53)$$
If we assume that the mean power of $p_1(q)$ is identical for all sub-carriers, the
upper bound for the ICI from (8.51) can be found to be valid also for the CoMP
downlink.
By comparing the SIR for the inter-stream interference from (8.43) and for
the ICI from (8.51), we immediately see that the former expression depends on
the time $t$, while the latter only depends on the absolute value of the CFO
$\Delta f$ normalized to $(Q T_s)^{-1}$, which is, in fact, the sub-carrier spacing. Therefore,
the requirement on the synchronization accuracy derived from the SIR for the
inter-stream interference is orders of magnitude stricter than that derived from
the SIR for the ICI. Resorting to our example from the preceding subsection,
the CFO $\Delta f$ to achieve an SIR of 20 dB should be below 5 % of the sub-carrier
spacing. If the sub-carrier spacing is 15 kHz as in current OFDM-based
mobile radio systems, the maximum allowed CFO would be 750 Hz. Compared to
the maximum allowed CFO of only a few Hertz that is required to enable updates
of the precoding matrices within a few milliseconds, this difference amounts to a
factor larger than 100. From these considerations, it may be concluded that the
ICI in OFDM systems does not play a significant role for the synchronization of
the CoMP downlink, and therefore its influence may be neglected.
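The ICI-related figures quoted here can be verified with a short calculation. The sketch assumes that (8.51) reads SIR = si^2(x) / (1 - si^2(x)) with si(x) = sin(x)/x and x equal to pi times the CFO normalized to the sub-carrier spacing; the helper names are hypothetical. The 2 Hz value used for comparison corresponds to the earlier example of 1 ppb accuracy at a 2 GHz carrier.

```python
import math


def si(x):
    """si(x) = sin(x) / x with the removable singularity at x = 0 filled in."""
    return 1.0 if x == 0 else math.sin(x) / x


def ici_sir_db(cfo_hz, subcarrier_spacing_hz):
    """SIR in dB between useful signal power and CFO-induced ICI power."""
    s2 = si(math.pi * cfo_hz / subcarrier_spacing_hz) ** 2
    if s2 >= 1.0:
        return math.inf  # no CFO, hence no ICI
    return 10 * math.log10(s2 / (1 - s2))


sir_750 = ici_sir_db(750.0, 15e3)      # CFO of 5 % of a 15 kHz spacing
ratio = 750.0 / (1e-9 * 2e9)           # versus a 2 Hz CFO difference
print(f"SIR at 750 Hz CFO: {sir_750:.1f} dB, CFO ratio: {ratio:.0f}")
```

A CFO of 5 % of the sub-carrier spacing yields an SIR slightly above 20 dB, and the tolerable CFO is several hundred times larger than the one tolerated by the inter-stream interference requirement.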
Techniques for the Compensation of Degradation Effects
Unfortunately, there are no methods available to compensate for the degradation
effects due to inter-stream interference at the side of the receiver. The reason for
that is related to the fact that the dimension of the spatial receive space in the
downlink is usually much smaller than the dimension of the spatial transmit
space. For an illustration, consider the following: If $M$ BSs with $N_{bs}$ transmit
antennas each cooperate, they are able to form up to $M N_{bs}$ orthogonal transmit
beams. If the BSs are not properly synchronized, the beams will lose their
orthogonality and will thus interfere at each UE. Even if the UE has multiple
antennas, it will not be able to resolve the interfering beams, disabling suitable
approaches for proper compensation.
The only remaining solution is to compensate for CFO distortions at the side
where they appear, that is the transmitter. However, this can only be accomplished if knowledge on the CFOs is available there. In [Zar08], it has been
proposed to estimate the CFOs by all UEs that are involved in the CoMP transmission, who then feed back their estimates to the BSs. Techniques to estimate
these CFOs with high accuracy have also been proposed there. However, this
approach has the drawback that it requires computationally complex estimators
at the UEs and a continuous feedback of the estimates from all UEs. Moreover,
it is questionable whether the proposed technique can achieve a better synchronization accuracy in practice than techniques for the synchronization between
BSs do (see Section 8.1). To summarize, we can conclude here that tight synchronization between the BSs is of fundamental importance to enable reliable
CoMP transmission in the downlink, which calls for inter-BS synchronization
techniques achieving the best accuracy possible.
8.3.2
Uplink Analysis
Inter-Stream Interference of Spatially Precoded Streams
The transmission equation for the uplink can be given as
y = r (t)Ht (t)Px + n.
(8.54)
To point out the duality to the downlink, we consider the CFO distortion
matrices for both the transmitter and receiver, t (t) and$r (t), here. The diagonal matrix P now is constituted of the elements pk = Pj only, so it reduces
to a simple power allocation matrix here. To obtain the transmitted signals, the
cooperating BSs jointly equalize the received signal y
x
= WH y = WH (r (t)Ht (t) Px + n).
(8.55)
H(t)
As all CFO distortions can be continuously tracked at the receiving BSs, the
(8.56)
(8.57)
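We observe here that the two different CFOs $\Delta f_1$ and $\Delta f_2$ generate independent ICI from the channel vectors $\mathbf{h}_1(q)$ and $\mathbf{h}_2(q)$, respectively, which are
simply superimposed. The transmission equation given for the SISO case in (8.48)
is modified to match the CoMP uplink according to
$$\mathbf{y}(q) = \mathbf{H}(q, 0)\mathbf{x}(q) + \underbrace{\sum_{\Delta q \in \mathcal{D} \setminus 0} \mathbf{H}(q - \Delta q, \Delta q)\, \mathbf{x}(q - \Delta q)}_{\text{ICI}} + \mathbf{n}(q). \qquad (8.58)$$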
For the power of the useful signal, conditions remain the same as in the SISO
case, i.e. we achieve according to (8.51) for the SIR of the useful signal $\hat{x}_k(q)$
SIR (
xk ) =
P1
(8.60)
Compared to the SIR conditions valid for the CoMP downlink in (8.51), we
clearly see that the major difference for the uplink lies in the fact that the
ICI from all simultaneously transmitted data streams affects the SIR of each
desired signal. The ICI power in the uplink thus scales with the number $K$ of
simultaneously transmitting UEs. Considering that it is more difficult to establish
a tight synchronization between different UEs, compensation techniques for ICI
may become an issue in the uplink.
Remark: The brief analysis presented here points out the most important
aspects of the ICI effects in the OFDM-based CoMP uplink only. For a more
detailed investigation, the interested reader is referred to [SJ09].
Techniques for Compensation of Degradation Effects
As mentioned already in the corresponding subsection, compensation of the
degradation effects due to inter-stream interference is simple, as we only have to
update the equalization matrix (8.56) continuously. However, we have pointed
out above that in the uplink the ICI may be a more serious problem, as the
CFOs caused by the low-cost oscillators at the UEs may become much larger
than those caused by the oscillators of the BSs. Although compensation of the
ICI distortions on the side of the receiver is possible, it turns out to be a complex
task. For an overview on existing compensation techniques for CFO-induced ICI,
refer to [MKP07, SJ09].
In general, the OFDM channel including the CFO distortions can be represented as a large square matrix of dimension $Q N_{bs} \times Q N_{bs}$, which has a
band structure with all elements within this band being non-zero. By zero-forcing
this huge channel matrix, the channel including all CFO distortions can be fully
compensated. However, it is obvious that this would imply an enormous computational effort. To cut this effort down, several approaches have been proposed
that exploit the specific structure of the large channel matrix to divide it into
a set of smaller sub-matrices. The computational effort remaining is still considerable, though. Some solutions have also been proposed that try to maintain
the sub-carrier-wise processing that OFDM systems are favored for. Although they
are not capable of removing the CFO-induced ICI completely, they can compensate for a large amount of it, improving the SIR conditions significantly. One
such approach was presented in [SJ09]; the signal processing for the compensation process is depicted in Fig. 8.11: The received signals $r_m$ at antenna $m$
are individually transformed to the frequency domain, where the sub-carrier-wise channel equalization with matrix $\mathbf{G}^H = \mathbf{H}^{-1}$ is applied that separates the
simultaneously transmitted signal streams. After signal separation, the equalized symbols of each stream are convolved in frequency domain with a function derived from (8.46) to compensate for the CFO distortions. This method
fully compensates for the CFO-induced ICI of the transmit symbols; however, as
shown analytically in [SJ09], it cannot compensate for the ICI that results from
the CFO-induced violation of the periodic property of the cyclic prefix, which is
used for OFDM transmission.
Figure 8.11 Receiver processing for simplified signal reconstruction with CFO
compensation in uplink.
8.3.3
Summary
In this section, we analyzed the inter-stream interference and the ICI induced by
CFO distortions for the downlink and uplink of an OFDM-based CoMP system,
consisting of 2 BSs and 2 UEs, each equipped with a single antenna. For the downlink, it has been shown that the SIR conditions derived for the inter-stream interference set strict requirements for the synchronization accuracy, while the effect
of ICI can be neglected. As compensation for CFO distortions at the receivers is
not possible, tight synchronization between the simultaneously transmitting BSs
is mandatory. In the uplink, we can fully compensate for the inter-stream interference if the CFO distortions are continuously tracked. Therefore, requirements for
the synchronization accuracy are rather deduced from the SIR conditions resulting from the ICI. The ICI distortion may still be compensated at the receiver;
however, the additional computational complexity required for this purpose is
considerable.
Channel Knowledge
In this chapter, we address the issue of how channel knowledge, referring to both
desired channels and the channels towards interferers, needed for various CoMP
schemes can be made available where it is needed. We first investigate channel
estimation techniques at the receiver side in Section 9.1, and then discuss how
the obtained channel knowledge can be efficiently fed back to the transmitter
side in Section 9.2, which is for example a crucial requirement for the downlink
CoMP schemes investigated in Sections 6.3 and 6.4. The chapter shows that
standard channel estimation and feedback concepts can in principle be extended
to enable CoMP in general. However, it also becomes apparent that large CoMP
cooperation sizes may be considered questionable in practice, due to the fact that
weak links cannot be estimated accurately, and the involved pilot and channel
state information (CSI) feedback overhead may become prohibitive.
9.1
9.1.1
where n {1, . . . , Nm } are the relevant MPC indices, n (t) the amplitude of the
n-th MPC, and d0 (t) is the delay connected to the shortest MPC. Note that
we here use $\tilde{h}$ with a tilde, as we are observing a channel in time domain, as
opposed to $h$, which always refers to a channel coefficient in frequency domain
throughout this book. The length of the CIR $\tilde{h}(t, \tau)$ is $T_{CIR} = \tau_{N_m} - \tau_1$, which
is directly related to the path length difference $\Delta s$ of the shortest and longest
MPC. For example, a value of $T_{CIR}$ of 1 µs corresponds to $\Delta s = 1\,\mu\mathrm{s} \cdot c = 300$ m,
with $c$ being the velocity of light of $3 \cdot 10^8$ m/s. Fig. 9.1(c) illustrates the ideally
infinite¹ channel transfer function (CTF) $h(t, f) \in \mathbb{C}$, i.e. the frequency domain
representation of $\tilde{h}(t, \tau)$ over frequency $f$:
¹ In reality, one can observe a wideband similarity of the radio channel over several 100 MHz,
allowing downlink beamforming based on uplink covariance estimation in frequency division
duplex (FDD) systems.
n (t)( n (t))ej2f n (t) .
h(t, f ) =
(9.2)
(9.3)
(9.4)
a CIR $\tilde{h}_s(t)$ of length $T_{CIR} \le T_g$ can be fully reconstructed even for the above
mentioned undersampled frequency domain signal $h(t, q)$. This can be, and in the case
of LTE Release 8 is, used for an efficient design of so-called pilots or
reference signals (RSs), as will be explained later.
Receive Filter
In case of a rectangular bandpass RF (or baseband) filter BPF($f$), the Dirac
functions $\delta(\tau - \tau_n(t))$ of the MPCs will be convolved with the SI function
sinc($t/\Delta t$), so that each tap of the CIR $\tilde{h}(t, i)$ is a superposition of different
MPCs. Note that in the general case, the MPC delays $\tau_n(t)$ will not coincide with
the sampling instants $i \cdot \Delta t$. Hence, from the measured, quantized and potentially
shortened CIR $\tilde{h}_s(t)$, one cannot directly derive the real channel MPC delays
$\tau_n$ and amplitudes $\alpha_n$. Accurate knowledge would be desirable, e.g. for channel
prediction, as the superposition of MPCs affects the further evolution of a tap.
Time-varying Radio Channel
Mobile radio channels are time-variant due to movements of the user equipments
(UEs) themselves, moving objects in the environment, or time-variant scatterers.
From Eq. (9.1) and Fig. 9.1(b), the main effects on the CIR for a moving UE
are clearly visible, i.e. at time $t + \Delta t$, the delays of the MPCs will have changed
from $\tau_n(t)$ to $\tau_n(t + \Delta t)$. The delay values $\tau_n(t + \Delta t)$ compared to $\tau_n(t)$ are
determined by the variation of the corresponding path lengths between transmit
and receive antennas, which increase or decrease depending on the relative
incident angles (for the downlink (DL)) at the moving UE for particular MPCs.
Estimation in Time and Frequency - Two-Dimensional Wiener Filter
Mobile radio OFDM or single-carrier frequency division multiple access
(SC-FDMA) systems like LTE Release 8 have been designed for training based
channel estimation, i.e. they rely on predefined and standardized pilots or RSs.
To save overhead, RSs are placed only on every $n_{RS}$-th sub-carrier, where
$n_{RS} = \Delta f_{RS}/\Delta f$ as introduced above for undersampling of the radio CTF. In
general, RSs may be allocated to any sub-carrier, but in practical systems regular
sampling of the CTF $h(t, q)$ is most common.
The transmission of a zero-mean signal vector $\mathbf{s} \in \mathbb{C}^{[Q \times 1]}$
over a mobile radio channel $\mathbf{h}_{BB}(t)$ can be written in frequency domain as
$$\mathbf{y} = \mathbf{h}_{BB}(t) \odot \mathbf{s} + \mathbf{n}, \qquad (9.5)$$
Frequency-spaced RSs with undersampling factor $1/n_{RS}$ reduce processing overhead for channel estimation, and one can introduce
$$[\mathbf{F}_{RS}]_{i',q'} = e^{j2\pi i' q'/Q}, \quad i' \in \{1, \ldots, L, \ldots, Q/n_{RS}\}, \quad q' \in \{1, \ldots, Q/n_{RS}\}. \qquad (9.6)$$
Here, $\mathbf{F}_{RS} \in \mathbb{C}^{[Q/n_{RS} \times Q/n_{RS}]}$ is the row- and column-reduced inverse discrete
Fourier transform (IDFT) matrix $\mathbf{F}$, and $\mathbf{y}_{RS}(t)$ is the row-reduced receive vector
for the $Q/n_{RS}$ sub-carriers carrying RSs:
$$\hat{\tilde{\mathbf{h}}}(t) = \mathbf{F}_{RS}\, \mathbf{y}_{RS}(t), \qquad (9.7)$$
where $\hat{\tilde{\mathbf{h}}}(t) \in \mathbb{C}^{[Q/n_{RS} \times 1]}$ is the noisy estimate of the CIR $\tilde{\mathbf{h}}(t)$.
For $\hat{\tilde{\mathbf{h}}}_s(t)$, the
elements of $\hat{\tilde{\mathbf{h}}}$ with indices $L + 1$ to $Q/n_{RS}$ can be set to zero, based on the
assumption that the CIR is limited to $L$ taps and the taps $L + 1, \ldots, Q/n_{RS}$
carry only noise or at least very low power taps. This zero setting of elements
or taps is called denoising and improves the signal-to-interference ratio (SIR) of
the estimated CTF after conversion of the CIR into frequency domain.
$$\hat{\tilde{\mathbf{h}}}_s(t) = \Big[\hat{\tilde{h}}(t, 1), \ldots, \hat{\tilde{h}}(t, L), \underbrace{0, \ldots, 0}_{\text{excess delay}}\Big]^T. \qquad (9.8)$$
,
s (t) can be calculated based on the row-reduced matrix
The shortened CIR h
'
(9.9)
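It is also possible to generate $\hat{\mathbf{h}}(t)$ directly from $\mathbf{y}_{RS}$ by applying the interpolation matrix $\mathbf{F}_{int} \in \mathbb{C}^{[Q \times Q/n_{RS}]}$, which may be pre-calculated for real systems
and known RS positions and signals, i.e.
$$\hat{\mathbf{h}}(t) = \mathbf{F}^H \hat{\tilde{\mathbf{h}}}_s(t) = \mathbf{F}^H \mathbf{F}'_{RS}\, \mathbf{y}_{RS}(t) = \mathbf{F}_{int}\, \mathbf{y}_{RS}(t). \qquad (9.10)$$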
Interpolation Gain
Estimation accuracy and overhead for RSs in terms of resources and power
are important design parameters for an OFDM system. For a given
signal-to-interference-and-noise ratio (SINR), the achievable channel estimation
accuracy depends on the number of RSs or, equivalently, the undersampling factor
of the CTF. By doubling the number of RSs, the so-called interpolation or
processing gain is increased by 3 dB at the cost of two-fold pilot overhead. Note
that the interpolation gain is the improvement of channel estimation accuracy due
to the denoising effect described above, compared to a baseline channel estimation
performed individually for each sub-carrier and OFDM symbol.
The length of the guard interval (GI) has been designed for expected worst case
scenarios. In scenarios where $\tau_{N_m}$ is significantly smaller than
$L\,\Delta t$, further interpolation gains may be obtained by setting further
samples of the CIR within the GI to zero. For this purpose, the length of the CIR
$\hat{\mathbf{h}}(t)$ has to be estimated, which additionally requires an estimate
of the signal-to-noise ratio (SNR) of $\mathbf{y}(t)$ to find those taps that are
below the noise level.
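The 3 dB figure can be checked with a small worked example. The sketch below assumes that the estimation noise is white and spread evenly over all CIR taps, so that keeping only a subset of the taps reduces the noise power proportionally; this noise model, the function name and the numbers are illustrative assumptions, not taken from the text.

```python
import math

def interpolation_gain_db(num_rs: int, num_taps_kept: int) -> float:
    """Noise reduction from zeroing all but num_taps_kept of num_rs CIR taps.

    Assumes white noise spread evenly over all taps (illustrative model).
    """
    if not 0 < num_taps_kept <= num_rs:
        raise ValueError("need 0 < num_taps_kept <= num_rs")
    return 10.0 * math.log10(num_rs / num_taps_kept)

# Hypothetical numbers: 200 RSs per OFDM symbol, CIR limited to 20 taps.
gain = interpolation_gain_db(200, 20)
# Doubling the number of RSs while the CIR length stays the same.
gain_doubled = interpolation_gain_db(400, 20)
extra_gain = gain_doubled - gain   # about 3 dB
```

Under this model, shortening the set of retained taps has the same effect as adding RSs, which is why an estimate of the true CIR length can yield further gain.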
Wiener Filter
A well-known solution for exploiting potential interpolation gains is to apply
Wiener filtering [Hay02], which finds the optimum filter $\mathbf{F}_{int,opt}$ with
respect to a minimum mean square error (MMSE) criterion, i.e. minimizing
$\sigma_H^2 = 1/Q \cdot E\{\|\hat{\mathbf{h}}(t) - \mathbf{h}(t)\|^2\}$. Wiener
filters generally exploit estimated or known statistical properties of the signals,
i.e. the auto-covariance matrix
$\mathbf{\Phi}_{yy} = E\{\mathbf{y}\mathbf{y}^H\} \in \mathbb{C}^{[Q \times Q]}$ and
cross-covariance matrix
$\mathbf{\Phi}_{hy} = E\{\mathbf{h}\mathbf{y}^H\} \in \mathbb{C}^{[Q \times Q]}$ to
calculate
$$\mathbf{F}_{int,opt} = \mathbf{\Phi}_{hy} \mathbf{\Phi}_{yy}^{-1}. \tag{9.11}$$
Note that (9.11) states the general solution, while in the case of sub-sampled
RSs as explained above the dimensionality of $\mathbf{\Phi}_{yy}$ will have to be
changed accordingly. The equation leads to the optimal solution under the
assumption of fully uncorrelated channel $\mathbf{h}$ and observation noise
$\mathbf{n}$. The interpolation gain of Wiener filtering is due to noise
suppression for radio channels with large coherence bandwidth. Assume as an
illustrative example a fully correlated frequency-flat radio channel, where one
single complex value can be estimated from $Q/n_{RS}$ observations.
$\mathbf{\Phi}_{hy}$ is a matrix carrying the SNR values on the diagonal elements.
With decreasing SNR, the estimated covariance values from $\mathbf{\Phi}_{yy}$ will
be scaled down according to their reliability.
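The frequency-flat example can be made concrete with a minimal numerical sketch of (9.11). It assumes a unit-power channel that is identical on all observed sub-carriers and white noise that is uncorrelated with the channel; the variable names and the values of `n_obs` and `noise_var` are hypothetical.

```python
import numpy as np

def wiener_filter(phi_hy: np.ndarray, phi_yy: np.ndarray) -> np.ndarray:
    """Return F = phi_hy @ inv(phi_yy), computed via a linear solve."""
    # F @ phi_yy = phi_hy  is equivalent to  phi_yy^T @ F^T = phi_hy^T.
    return np.linalg.solve(phi_yy.T, phi_hy.T).T

n_obs = 8          # hypothetical number of RS observations
noise_var = 0.5    # hypothetical noise variance (channel power is one)

phi_hh = np.ones((n_obs, n_obs))             # fully correlated flat channel
phi_yy = phi_hh + noise_var * np.eye(n_obs)  # y = h + n, h and n uncorrelated
phi_hy = phi_hh                              # E{h y^H} = E{h h^H}

f_opt = wiener_filter(phi_hy, phi_yy)

# Error covariance of the MMSE estimate: phi_hh - F phi_yh.
err_cov = phi_hh - f_opt @ phi_hy.conj().T
mse_per_element = float(np.trace(err_cov).real) / n_obs
```

For this special case the per-element MSE equals `noise_var / (noise_var + n_obs)`, compared with `noise_var` for an unfiltered per-sub-carrier estimate, so the gain grows roughly in proportion to the number of observations.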
The interpolation matrix $\mathbf{F}_{int}$ as already introduced in (9.10) targets
the same interpolation gain as the Wiener filter. Note there is an inverse relation
between the coherence bandwidth in frequency domain affecting $\mathbf{\Phi}_{yy}$
and the length of
Figure 9.2 Pattern of LTE Release 8 reference signals (CRS) for 2 antenna ports.
Generally the smoothing effect is interesting for those REs carrying RSs which
are in a notch of the CTF, so that without smoothing, channel estimation accuracy
for these REs would be poor.
Interpolation allows, as long as the sampling theorem is fulfilled, the RS
overhead to be adapted to the intended estimation accuracy. With respect to CoMP,
channel prediction might be the most interesting aspect, as it seems to be a viable
option to overcome the issue of channel state information (CSI) outdating. The
current goal is to extend the prediction range for a single PRB for the last 2
OFDM symbols of 70 µs up to at least several milliseconds, as explained later.
The two-dimensional Wiener filtering solution is an extension of the
one-dimensional case. In the two-dimensional case it is necessary to stack all
$O = 14$ channel vectors within a PRB into one channel vector $\mathbf{h}_2(t)$. The
same has to be applied to the two-dimensional receive signal, which results in the
vector $\mathbf{y}_2$. With these vectors, it is possible to compute the
auto-covariance matrix $\mathbf{\Phi}_{2,yy} = E\{\mathbf{y}_2 \mathbf{y}_2^H\}$ and
the cross-covariance matrix $\mathbf{\Phi}_{2,hy} = E\{\mathbf{h}_2 \mathbf{y}_2^H\}$
to calculate the optimum filter $\mathbf{F}_{2,int,opt}$. More details can be found
in [HKR+ 97b].
Subspace Concept - Channel Prediction
The so-called subspace concept as proposed in [WMZ05b] is closely related to
the optimum Wiener filter solution, i.e. it exploits long-term channel statistics
to improve channel estimation quality. The subspace is spanned by the relevant
MPCs within the excess delay according to (9.8) that are unequal to zero or, more
precisely, above a certain threshold. Hence the subspace dimension might vary
between 1 and the maximum number of taps $L$ of the CIR. It is most easily
explained from Fig. 9.1(b), where the CIR $h(t, \tau)$ is depicted for $t_1$ and
$t_2 = t_1 + \Delta t$. For small $\Delta t$, the MPCs will change only marginally,
i.e. they will mainly change their phases, while the amplitudes remain almost
constant. In other words, the large-scale fading is almost stable, and only the
small-scale fading varies. If e.g. the main MPCs defining the subspace have been
properly identified by corresponding long-term observation of the auto-covariance
matrix $\mathbf{\Phi}_{yy}$, it will be sufficient to limit estimation (and
reporting) to the short-term variations of the main MPCs or, equivalently, of the
subspace defined by all elements in $\hat{\mathbf{h}}_s(t)$ that exceed a certain
power threshold.
In the case of low subspace dimensions of the radio channel, it could be shown
in [WMZ05a] that significant interpolation gains are possible. As an extreme
example, for one single relevant MPC, an interpolation gain of about 15 dB has
been reported, where some of the gain is due to proper identification of additional
irrelevant taps within the CIR, and not only of the maximum length of
$\mathbf{h}(t)$.
For low mobility below 3 km/h and a prediction horizon of less than 5 ms,
even simple linear prediction over two previous CSI estimates of the interpolated
CTF might yield a mean square error (MSE) of -20 dB, simulated based on
the spatial channel model extended (SCME). For LTE, a useful target is a
prediction horizon of 10 to 20 ms under real world radio channel conditions. An
9.1.2
(9.12)
$\mathbf{V}_{BS} \in \mathbb{C}^{[M_c N_{bs} \times M_c]}$ contains the precoding
vectors per BS forming the virtual transmit antenna ports and
$\mathbf{V}_{UE} \in \mathbb{C}^{[M_c \times M_c N_{ue}]}$ the UE-specific postcoders
for all UEs. Note that these postcoders might be cell-specific, i.e. in contrast to
(6.49), the UEs calculate the left dominant eigenvectors with respect to their
serving cells and not all cooperating cells.
This simplifies channel estimation, and can be motivated by the additional
precoders for cancellation of inter-cell interference within the cooperation area.
For $\mathbf{H}_e \in \mathbb{C}^{[M_c \times M_c]}$ this linear, potentially
zero-forcing (ZF), precoder is obtained by the Moore-Penrose pseudo-inverse
$$\mathbf{W}_e = \mathbf{H}_e (\mathbf{H}_e^H \mathbf{H}_e)^{-1}. \tag{9.13}$$
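As a minimal sketch of the zero-forcing property of (9.13), the following example builds a random effective channel and checks that the product of its conjugate transpose with the precoder is the identity. The i.i.d. complex Gaussian channel, the dimension `m_c` and all variable names are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
m_c = 3  # hypothetical number of cooperating cells / data streams

# Hypothetical effective channel H_e with i.i.d. complex Gaussian entries.
h_e = (rng.standard_normal((m_c, m_c))
       + 1j * rng.standard_normal((m_c, m_c))) / np.sqrt(2)

# W_e = H_e (H_e^H H_e)^{-1}, following the structure of (9.13).
gram = h_e.conj().T @ h_e
w_e = h_e @ np.linalg.inv(gram)

# Zero forcing: H_e^H W_e should be the identity, i.e. no inter-stream leakage.
residual = float(np.linalg.norm(h_e.conj().T @ w_e - np.eye(m_c)))
```

In practice one would typically call `np.linalg.pinv` or a linear solver rather than forming the inverse of the Gram matrix explicitly; the explicit form is kept here only to mirror the equation.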
Note that the cell-specific pre- and postcoders lead to an implicit per cell
channel estimation, while the estimation of $\mathbf{H}_e$ is done explicitly per
effective channel component. This scheme has already achieved some consensus in
3GPP, as it takes care of the different levels of correlation of antenna elements
within one cell and between different cells.
9.1.3
interference floor. Even in case of the effective channel concept, where the number
of relevant channel components is significantly reduced, channel estimation will
suffer from inter-RS interference. It is helpful to localize interference as far
as possible by applying strong antenna tilting [TWB+ 09, TWS+ 09]. In addition,
LTE Release 8 foresees different levels of inter-RS orthogonality:
• full orthogonality for antenna ports of the same cell (mutual muting of REs),
• cell-specific frequency shifting of antenna patterns (with and without muting
  in other cells) and
• so-called quasi orthogonality between cells based on cell-specific Zadoff-Chu
  sequences $s_{ZC}$ [HT09].
Note that these sequences $s_{ZC}$ are a variant of the well-known constant
amplitude zero autocorrelation (CAZAC) sequences. While they provide zero
auto- and cross-correlation, the amplitude is not really constant, but at least the
value of the cubic metric is comparable to that of QPSK modulation. Zadoff-Chu
sequences $s_{ZC}$ run over all CRS of one OFDM symbol and provide good
wide-band orthogonality. For CoMP, the challenge is that in case of shortened
sequence lengths, as required for frequency-selective CSI, the performance
degrades significantly. For example, the cross-correlation for a sequence length
of 50 PRBs achieves an MSE of about $\sigma_{CRS}^2 = -16$ dB for all sequence
shifts, while for a length of 10 PRBs this degrades to about
$\sigma_{CRS}^2 = -10$ dB. For one PRB, the effect will be more detrimental, making
improvements in the RS design mandatory for advanced CoMP.
Due to the required backward compatibility to LTE Release 8, where UEs rely
on a constant transmission of the CRS grid as illustrated in Fig. 9.2, it is obvious
that these existing reference signals have to be kept as they are. A possible
way forward is hence to design new CSI RS, specifically intended for channel
estimation for up to 8 antenna ports in a multi-cell environment with sufficiently
high accuracy. Based on the maximum MCS of 64-QAM with code rate 5/6, a
$\sigma_{CSI}^2$ in the range of -20 dB seems to be a reasonable, but challenging,
target.
For given mobile radio channels, the MSE of channel estimation $\sigma_{CSI}^2$
will be affected by the design parameters according to (9.14), i.e. the number of
channel components $N_{CC}$, the number of RSs $Q/n_{RS}$, the length of the pilot
sequence $L_{RS}$ and its relative power $P_{RS}$, where the last point is known as
power boosting.
$$\sigma_{CSI}^2 = \frac{N_{CC}\, n_{RS}}{L_{RS}\, P_{RS}} \cdot \text{const.} \tag{9.14}$$
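The proportionality in (9.14) can be explored with a short calculation. The sketch below only evaluates relative changes, since the constant is not specified; the function names and all parameter values are hypothetical.

```python
import math

def csi_mse(n_cc: int, n_rs: int, l_rs: int, p_rs: float, const: float = 1.0) -> float:
    """Evaluate the scaling law of (9.14): const * N_CC * n_RS / (L_RS * P_RS)."""
    return const * n_cc * n_rs / (l_rs * p_rs)

def to_db(ratio: float) -> float:
    return 10.0 * math.log10(ratio)

# Hypothetical design points: 8 channel components, RS on every 6th sub-carrier.
baseline = csi_mse(n_cc=8, n_rs=6, l_rs=1, p_rs=1.0)
longer_seq = csi_mse(n_cc=8, n_rs=6, l_rs=4, p_rs=1.0)   # 4x longer pilot sequence
boosted = csi_mse(n_cc=8, n_rs=6, l_rs=4, p_rs=2.0)      # plus 3 dB power boosting

delta_len_db = to_db(longer_seq / baseline)    # about -6 dB
delta_boost_db = to_db(boosted / longer_seq)   # about -3 dB
```

Because the constant cancels in every ratio, the same relative improvements hold regardless of the absolute MSE level of a given channel.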
mutually orthogonal RSs for different antenna ports as well as different cell IDs.
This can be done by frequency division multiplex (FDM), time division multiplex
(TDM), code division multiplex (CDM), or any hybrid solution, as has
been intensively investigated in 3GPP. Performance-wise there are only minor
differences, but there might be side effects like backward compatibility to LTE
Release 8, muting and corresponding power offset issues etc.
The parameter $N_{CC}$, the number of channel components to be estimated,
can be minimized by applying the effective channel concept as explained in
the previous subsection. In [TSS+ 08], a specific solution allows $L_{RS}$ to be
increased without adding extra overhead. For that purpose, CSI RS are multiplied
in time domain with cell-specific Hadamard sequences $s_H$ (or so-called orthogonal
cover codes) and a suitable regular allocation of $s_H$ to cells ensures that the
required length of $s_H$ for full orthogonality increases with increasing
inter-cell distance.
Depending on UE mobility, the longer sequences might more or less violate
the coherence time of the radio channel, leading to corresponding inter-code
interference. Statistically, the more distant cells will contribute less
interference, so that higher sensitivity to mobility due to long sequences is more
easily acceptable. The additional orthogonality comes basically for free, as the
Hadamard sequences are applied to the already available CSI RS. In Fig. 9.3, the
normalized MSE of the channel estimator for correlation over several TTIs for the
top 5 strongest cells is compared with Hadamard or random sequences on top of the
reference signals. With increasing length of the correlation time, Hadamard
sequences provide significantly lower MSE than the random sequences.
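The principle of orthogonal cover codes can be sketched with a toy model. The example below assumes a channel that is static over the sequence length, no noise, and one scalar gain per cell; the Sylvester construction, the function names and the gain values are illustrative assumptions rather than the allocation scheme of [TSS+ 08].

```python
def hadamard(order: int) -> list[list[int]]:
    """Sylvester construction of a Hadamard matrix; order must be a power of two."""
    if order < 1 or order & (order - 1):
        raise ValueError("order must be a power of two")
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

def correlate(a, b):
    return sum(x * y for x, y in zip(a, b))

codes = hadamard(4)               # hypothetical: four cells, sequence over four TTIs
gains = [1.0, 0.5, 0.25, 0.1]     # hypothetical per-cell channel gains (static)

# Received RS per TTI: superposition of all cells, each weighted by its cover code.
rx = [sum(g * c[t] for g, c in zip(gains, codes)) for t in range(4)]

# Correlating with each cell's code over the four TTIs separates the cells.
estimates = [correlate(rx, c) / 4 for c in codes]
```

If the channel changes within the four TTIs, the codes are no longer perfectly orthogonal at the receiver, which is the inter-code interference mentioned above.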
In Fig. 9.3 we can observe a further important aspect: the MSE of channel
estimation degrades significantly with decreasing receive power of the estimated
channel components, as already seen in Section 4.2. Fortunately, lower receive
power relates to lower interference and for JT, corresponding simulations verified
a self-scaling effect, i.e. that precoding sensitivity to channel estimation errors
decreases with decreasing receive power.
9.1.4
[Figure: two panels; horizontal axis: correlation time [TTIs or ms]; curves: top-1 to top-5 signal.]
Figure 9.3 Normalized MSE of the correlation estimator for the five strongest
cells [TSS+ 08]. © 2008 IEEE.
Just as CSI RS suffer in the DL from multi-cell interference, channel estimation
in the UL will be degraded by simultaneously transmitting UEs,
specifically from cells reusing the same Zadoff-Chu sequences for the DRSs.
Therefore, it is important to localize interference by strong BS antenna tilting
and to minimize inter-cluster interference by proper clustering of cells and
user grouping [TSS+ 08], which is still a field of extensive research.
9.1.5
Summary
Accurate channel estimation is the basis for any CoMP scheme like coordinated
beamforming / scheduling or, more importantly, joint precoding or detection.
Channel estimation affects precoding accuracy and defines an upper limit for
possible performance gains.
Specifically, joint transmission faces several further challenges compared to
more conventional single link systems, such as
• a high number of channel components to be estimated,
• strong multi-cell interference due to frequency reuse one, and
• a high sensitivity of joint precoding with respect to estimation errors combined
  with the need for frequency-selective channel information.
A sound understanding of the time-variant radio channel is important, motivating
well-known channel estimation techniques like two-dimensional Wiener
filtering or the so-called subspace concept. Discussed enhancements are the
effective channel concept, which limits the number of effective channel components
to be estimated to the number of supported data streams, and reference signals
carrying orthogonal time domain sequences, which reduce inter-cell interference for
low mobility users without extra overhead.
9.2
ited rate, mobiles can only provide imperfect CSI, i.e., a small finite number of
bits is used for the information fed back to BSs (cf., e.g., [MRF10]). The
consideration of systems with limited feedback assuming nonlinear transmit
processing can be found in, e.g., [DLZ06, CJCU07]. However, we are especially
interested in linear precoders because of their implementation advantages over
non-linear schemes, like being in general computationally more efficient, having
smaller processing delays by avoiding successive encoding, and imposing fewer
requirements on hardware like the dynamic range of amplifiers or analog-to-digital
converters. Compared to the investigation of linear precoders in [MSEA03, LHSH04]
(see also references therein) for a single-user system, or in [Jin06] (see also
references therein) for a multi-user system, we consider a multi-user system with
user scheduling in order to fully exploit multi-user diversity.
Note that in the standardization of 3GPP LTE-A, i.e., where non-cooperative
single-user MIMO (SU-MIMO) is considered, two major feedback schemes have
been discussed, viz., implicit and explicit feedback of CSI. Here, implicit feedback
means the feedback of a precoder index from each mobile to its assigned BS. The
corresponding codebook entry is used by the BS as the precoder in the downlink.
Although the index directly represents the precoder, the CSI is still included in an
implicit manner. Contrary to that, explicit feedback denotes the direct feedback
of CSI in terms of an index which represents the codebook entry which is closest
to the exact CSI with respect to some distance criterion. In this section, we focus
solely on explicit CSI feedback because it is the most promising feedback scheme
in the context of MU-MIMO and CoMP (cf., e.g., [DB07]).
Please note that, while we were concerned with a most efficient estimation and
representation of channel characteristics in the time and frequency domain in
Section 9.1, we are now exploiting spatial dimensions of channel matrices in order
to make CSI feedback as efficient as possible. In this section, we mainly focus on
channel vector quantization (CVQ) schemes for efficiently capturing these spatial
dimensions. Most of the explicit CSI feedback schemes are based on CVQ using
a finite channel codebook as proposed in [3GP06a, 3GP06b, DB07, DLU09].
Precisely speaking, each user quantizes a product of its channel matrix and
an estimate of its receive filter, in the following denoted as the composite
channel vector, and feeds back the corresponding codebook index together with
an approximate signal-to-interference-and-noise ratio (SINR) value. Note that
users need to estimate their receivers because the finally chosen receive filters
depend on the precoder at the BSs, which is determined after quantization.
In fact, the BSs use the quantized composite channel vectors to compute the
precoder based on the zero-forcing (ZF) criterion, and use the available SINR
values to schedule the users by maximizing the sum-rate.
Usually, CVQ is based on choosing the codebook entry with minimum
Euclidean distance to the composite channel vector. However, minimizing the
Euclidean distance is not necessarily related to the final goal of designing a
communications system, i.e., maximizing the sum-rate. Therefore, we propose
to estimate the receive filter and quantize the corresponding composite channel
vector by maximizing the approximate SINR, which is directly related to the
achievable rate of the corresponding user [DLU09]. Note that this idea is strongly
related to the method presented in [WSJ+ 10]; however, the latter approach is
based on two codebooks and a different type of sum-rate approximation.
Before reviewing the state-of-the-art CVQ methods and deriving the proposed
CVQ approaches in Subsections 9.2.3-9.2.6, we introduce our transmission model
in the next two subsections. Finally, we investigate the performance of the
proposed schemes when applied to a MU-MIMO system with linear ZF precoding
in Subsections 9.2.7-9.2.9.
9.2.1
Transmission Model
Here, we consider the downlink transmission from one BS with $N_{bs}$ antennas
to $U$ UEs with $N_{ue}$ receive antennas each, out of which $K \le U$ UEs have been
scheduled to be served on the same resources in time and frequency. Note that in
this section, $N_{BS} = N_{bs}$ since the number of BSs is assumed to be $M = 1$. In
the sequel, we will capture all $K$ selected users in the set
$\mathcal{K} \subseteq \{1, \ldots, U\}$, $|\mathcal{K}| = K$.
As introduced in Chapter 3, the transmission taking place on one exemplary
orthogonal frequency division multiplex (OFDM) sub-carrier of the system can
be stated as
$$\mathbf{y}_k = \mathbf{H}_k^T \mathbf{s} + \mathbf{n}_k, \tag{9.15}$$
where $\mathbf{W} \in \mathbb{C}^{[N_{BS} \times K]}$ is the precoding matrix, and
$\mathbf{x} \in \mathbb{C}^{[K \times 1]}$ are the symbols of the $K$ scheduled UEs
before precoding. As in Section 3.5, we assume that these symbols have unit power,
i.e., $\mathbf{x} \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \mathbf{I})$, and that the
transmit power assigned to different streams is inherently contained in the
precoding matrix $\mathbf{W}$.
Next, we describe the receivers at the mobile stations. We assume that each
user is applying a linear filter $\mathbf{g}_k \in \mathbb{C}^{[N_{ue} \times 1]}$ to
the receive vector $\mathbf{y}_k$ to get the estimate
$$\hat{x}_k = \mathbf{g}_k^H \mathbf{y}_k \in \mathbb{C}. \tag{9.17}$$
9.2.2
(9.19)
j=k
log2 (1 + k ) .
(9.20)
kK
9.2.3
resources and chooses the proper modulation and coding scheme (MCS) using the
CQIs of the different users. Again, since we are especially interested in schemes of
strongly constrained feedback, we restrict the maximum number of transmitted data
symbols per user to be one.
Assume first that the precoder $\mathbf{W}$ is known at the mobile receivers
such that the LMMSE filters $\mathbf{g}_k$ can be computed according to (9.18). In
order to compute the CDI, each user $k$ quantizes the composite channel vector
$\mathbf{c}_k = \mathbf{H}_k \mathbf{g}_k \in \mathbb{C}^{[N_{BS} \times 1]}$, being a
combination of the LMMSE filter and the physical channel matrix, by applying CVQ
based on the channel codebook
$$\mathcal{C} = \{\mathbf{u}_1, \ldots, \mathbf{u}_{2^B}\}, \tag{9.21}$$
where $B$ denotes the number of necessary bits for indexing the $2^B$ normalized
codebook vectors $\mathbf{u}_q \in \mathbb{C}^{[N_{BS} \times 1]}$,
$q \in \{1, \ldots, 2^B\}$. By doing so, only $N_{BS}$ entries in $\mathbf{c}_k$
instead of $N_{BS} N_{ue}$ entries in $\mathbf{H}_k$ need to be quantized at each UE,
leading to a smaller quantization error if one keeps the feedback amount constant
[DLU09].
However, in a real system, the finally chosen precoder $\mathbf{W}$, and therefore
the resulting receive filter $\mathbf{g}_k$, is unknown to UE $k$ at the time CVQ is
applied. This is because each user has no knowledge about channels of other users
due to the non-cooperative nature of the downlink channel. As a consequence, the
quantizer $Q_{\mathcal{C}}$ needs to compute the quantized composite channel vector
$\hat{\mathbf{c}}_k \in \mathbb{C}^{[N_{BS} \times 1]}$ based on an estimate of the
receive filter, in the following denoted as
$\hat{\mathbf{g}}_k \in \mathbb{C}^{[N_{ue} \times 1]}$, whose estimation quality
compared to the finally chosen LMMSE filter $\mathbf{g}_k$ depends mainly on the
chosen quantization method. In the following, we define the quantizer output as
$$\text{CDI:} \quad (\hat{\mathbf{c}}_k, \hat{\mathbf{g}}_k) = Q_{\mathcal{C}}(\mathbf{H}_k). \tag{9.22}$$
Moreover, due to the fact that the channels of other users and the finally
chosen precoder are not known when the feedback information is computed at the
mobile, CQI must be approximated as well. This is usually done by taking into
account a rough estimate of the multi-user interference caused by the imperfect
CSI at the base station due to quantization. As derived in [TBT08, 3GP06a],
the CQI of user $k$, which is here a scaled version of the SINR at the $k$th mobile
receiver, is approximated via
$$\text{CQI:} \quad \gamma_k(\hat{\mathbf{c}}_k, \hat{\mathbf{g}}_k, \mathbf{H}_k) = \frac{\|\mathbf{H}_k \hat{\mathbf{g}}_k\|_2^2 \cos^2\theta_k}{\sigma^2 + K \|\mathbf{H}_k \hat{\mathbf{g}}_k\|_2^2 \sin^2\theta_k / N_{BS}}, \qquad \cos\theta_k = \frac{|\hat{\mathbf{c}}_k^H \mathbf{H}_k \hat{\mathbf{g}}_k|}{\|\mathbf{H}_k \hat{\mathbf{g}}_k\|_2}, \tag{9.23}$$
where $\theta_k \in [0, \pi]$ denotes the angle between the normalized composite
channel vector and the quantized version thereof (quantization angle), and where,
without loss of generality, we set $\|\hat{\mathbf{g}}_k\|_2 = 1$. In the following,
we present two CVQ methods based on two different quantization criteria.
9.2.4
(9.25)
(9.26)
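Note that the optimization criterion of the resulting receive filter estimate
is no longer the MSE such as in the finally applied LMMSE receiver but the
Euclidean distance. This leads to a mismatch between the true SINR and the
one fed back as CQI. Finally, minimum Euclidean distance based CVQ can be
summarized as
$$\begin{aligned}
Q_{\mathcal{C}}^{Euclid}: \quad & \mathbf{H}_k \mapsto \left(\hat{\mathbf{c}}_k^{Euclid}, \hat{\mathbf{g}}_k^{Euclid}\right), \\
\hat{\mathbf{c}}_k^{Euclid} &= \arg\max_{\mathbf{u} \in \mathcal{C}} \left\|\mathbf{Q}_k^H \mathbf{u}\right\|_2^2, \qquad \mathbf{H}_k = \mathbf{Q}_k \mathbf{R}_k, \\
\hat{\mathbf{g}}_k^{Euclid} &= \frac{\mathbf{H}_k^{+} \mathbf{Q}_k \mathbf{Q}_k^H \hat{\mathbf{c}}_k^{Euclid}}{\left\|\mathbf{H}_k^{+} \mathbf{Q}_k \mathbf{Q}_k^H \hat{\mathbf{c}}_k^{Euclid}\right\|_2},
\end{aligned} \tag{9.27}$$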
and the corresponding CQI computes as
$\gamma_k(\hat{\mathbf{c}}_k^{Euclid}, \hat{\mathbf{g}}_k^{Euclid}, \mathbf{H}_k)$
according to (9.23).
Figure 9.5 Quantization of the composite channel vector, from [DLU09]. © 2010 IEEE.
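The minimum Euclidean distance quantizer can be sketched numerically as follows. The example assumes a random channel and a random unit-norm codebook, and it assumes that the receive filter estimate is obtained by mapping the projected codebook entry back through the pseudo-inverse of the channel matrix; all variable names and the helper `crandn` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_bs, n_ue, n_bits = 4, 2, 4   # dimensions as in Table 9.1

def crandn(*shape):
    """i.i.d. circularly symmetric complex Gaussian entries with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

h_k = crandn(n_bs, n_ue)                      # hypothetical channel matrix H_k
codebook = crandn(2 ** n_bits, n_bs)          # one codebook entry u_q per row
codebook /= np.linalg.norm(codebook, axis=1, keepdims=True)

q_k, _ = np.linalg.qr(h_k)                    # orthonormal basis of range(H_k)

# ||Q_k^H u||^2 for every entry; the maximizer has the smallest quantization angle.
proj_energy = np.linalg.norm(codebook @ q_k.conj(), axis=1) ** 2
best = int(np.argmax(proj_energy))
c_hat = codebook[best]

# Receive filter estimate (assumed form): pinv(H_k) applied to Q_k Q_k^H c_hat.
g_hat = np.linalg.pinv(h_k) @ (q_k @ (q_k.conj().T @ c_hat))
g_hat /= np.linalg.norm(g_hat)

eff = h_k @ g_hat                             # composite channel vector H_k g_hat
cos_theta = float(abs(np.vdot(c_hat, eff)) / np.linalg.norm(eff))
```

With this choice of filter, the cosine of the quantization angle equals the norm of the projection of the selected codebook entry onto the range of the channel matrix, which is why maximizing the projection energy minimizes the angle.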
9.2.5
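filters):
$$\gamma_k(\mathbf{u}, \mathbf{v}, \mathbf{H}_k) = \frac{\mathbf{v}^H \mathbf{H}_k^H \mathbf{u}\mathbf{u}^H \mathbf{H}_k \mathbf{v}}{\mathbf{v}^H \left(\sigma^2 \mathbf{I} + \mathbf{H}_k^H (\mathbf{I} - \mathbf{u}\mathbf{u}^H) \mathbf{H}_k\right) \mathbf{v}} = \frac{\mathbf{v}^H \mathbf{A}(\mathbf{u}) \mathbf{v}}{\mathbf{v}^H \mathbf{B}(\mathbf{u}) \mathbf{v}}. \tag{9.29}$$
It is well known that expressions in the form of (9.29) are maximized by setting
$\mathbf{v}$ to the eigenvector corresponding to the largest eigenvalue $\lambda_i$
solving the generalized eigenvalue problem
$\mathbf{A}(\mathbf{u})\mathbf{v}_i = \lambda_i \mathbf{B}(\mathbf{u})\mathbf{v}_i$
(e.g., [BX05]). Moreover, if $\mathbf{B}(\mathbf{u})$ is invertible, the eigenvalues
and eigenvectors are the same as for the regular eigenvalue decomposition of
$\mathbf{B}^{-1}(\mathbf{u})\mathbf{A}(\mathbf{u})$.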
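A minimal sketch of this maximization for a single codebook entry is given below. It assumes that the numerator matrix is the rank-one term built from the codebook entry and that the denominator matrix consists of a scaled identity plus the residual channel energy; the noise term `noise_var`, the random channel and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n_bs, n_ue = 4, 2
noise_var = 0.1   # hypothetical noise term in the denominator

h_k = (rng.standard_normal((n_bs, n_ue))
       + 1j * rng.standard_normal((n_bs, n_ue))) / np.sqrt(2)
u = rng.standard_normal(n_bs) + 1j * rng.standard_normal(n_bs)
u /= np.linalg.norm(u)                       # one normalized codebook entry

a_vec = h_k.conj().T @ u                     # H_k^H u
a_mat = np.outer(a_vec, a_vec.conj())        # A(u) = H_k^H u u^H H_k
b_mat = noise_var * np.eye(n_ue) + h_k.conj().T @ h_k - a_mat   # B(u)

# B(u) is invertible here, so the regular EVD of B^{-1}(u) A(u) can be used.
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(b_mat, a_mat))
idx = int(np.argmax(eigvals.real))
v_opt = eigvecs[:, idx] / np.linalg.norm(eigvecs[:, idx])

def quotient(v):
    """Generalized Rayleigh quotient v^H A v / v^H B v."""
    return float((v.conj() @ a_mat @ v).real / (v.conj() @ b_mat @ v).real)

gamma_max = quotient(v_opt)
```

Since the numerator matrix has rank one in this setup, the maximum could also be obtained in closed form from a single linear solve, which hints at why reduced-complexity alternatives to the full search are attractive.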
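Note that this maximization finds the best $\mathbf{v}$ given a specific codebook
entry $\mathbf{u}$. The optimal $\hat{\mathbf{c}}_k^{SINR}$ is the one yielding the
largest SINR value over all codebook entries, and $\hat{\mathbf{g}}_k^{SINR}$ is the
corresponding optimal weight vector for this $\hat{\mathbf{c}}_k^{SINR}$, i.e.,
$$\begin{aligned}
Q_{\mathcal{C}}^{SINR}: \quad & \mathbf{H}_k \mapsto \left(\hat{\mathbf{c}}_k^{SINR}, \hat{\mathbf{g}}_k^{SINR}\right), \\
\hat{\mathbf{c}}_k^{SINR} &= \arg\max_{\mathbf{u} \in \mathcal{C}} \; \max_{\mathbf{v} \in \{\mathbb{C}^{[N_{ue} \times 1]} : \|\mathbf{v}\|_2 = 1\}} \gamma_k(\mathbf{u}, \mathbf{v}, \mathbf{H}_k), \\
\hat{\mathbf{g}}_k^{SINR} &= \arg\max_{\mathbf{v} \in \{\mathbb{C}^{[N_{ue} \times 1]} : \|\mathbf{v}\|_2 = 1\}} \gamma_k(\hat{\mathbf{c}}_k^{SINR}, \mathbf{v}, \mathbf{H}_k).
\end{aligned} \tag{9.30}$$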
A drawback of this method applied directly is the computational complexity,
since the maximization over $\mathbf{v}$ needs to be performed for all entries of
the channel codebook, and thus requires $2^B$ generalized eigenvalue decompositions
per sub-carrier on which the optimization is performed.
9.2.6
arg max
v{C[Nue 1] :"v"
2 =1}
v H HH
k Hk v,
kCCM ,
= arg max uH Hk g
kEuclid)
(
cEuclid
,g
k
(9.31)
uC
9.2.7
TC
W = W' 1/2 , W' = C
,
(9.33)
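where the diagonal matrix $\mathbf{\Lambda} \in \mathbb{C}^{K \times K}$ represents
power loading. For the simulations in Section 9.2.9, we assume equal power loading
according to
$$\mathbf{\Lambda} = \mathrm{diag}\left(\frac{1}{\|\mathbf{W}' \mathbf{e}_k\|_2^2}\right)_{k=1}^{K}, \tag{9.34}$$
where $\mathbf{e}_k \in \{0, 1\}^{[K \times 1]}$ contains a one in element $k$, and
zeros otherwise.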
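The effect of the equal power loading in (9.34) can be illustrated with a short sketch: each column of the unnormalized precoder is scaled to unit norm, so that every stream receives the same transmit power. The example assumes that the unnormalized ZF directions are obtained from a pseudo-inverse of the quantized composite channels; this construction, the random channels and the names `c_hat`, `w_prime` and `lam` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n_bs, k_users = 4, 4   # dimensions as in Table 9.1

# Hypothetical quantized composite channels, one column per scheduled user.
c_hat = (rng.standard_normal((n_bs, k_users))
         + 1j * rng.standard_normal((n_bs, k_users))) / np.sqrt(2)

# Assumed unnormalized ZF directions W': pinv(C^H) satisfies C^H W' = I.
w_prime = np.linalg.pinv(c_hat.conj().T)

# Power loading: diagonal entries 1 / ||W' e_k||_2^2, one per stream.
lam = np.diag(1.0 / np.linalg.norm(w_prime, axis=0) ** 2)
w = w_prime @ np.sqrt(lam)          # W = W' Lambda^{1/2}

col_powers = np.linalg.norm(w, axis=0) ** 2   # transmit power per stream
cross_talk = c_hat.conj().T @ w               # diagonal if ZF holds
```

The scaling changes only the per-stream gains; the zero-forcing property with respect to the quantized channels is preserved because the loading matrix is diagonal.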
9.2.8
Resource Allocation
With the codebook indices and the scaled SINR values of all users, the base
station schedules the users and computes the ZF precoder as described in the
previous subsection. To do so, it calculates the SINR approximations based on
the scaled versions thereof. It holds [TBT08, 3GP06a]
= K diag (k' )kK .
(9.35)
Then, it uses these SINR approximations in order to schedule the users according to a greedy algorithm as described in [DLU09, TBT08, 3GP06a], see also
Section 9.1. Finally, the set K of scheduled users is used to compute the ZF
precoder according to (9.33) and (9.34).
9.2.9
Simulation Results
We investigate the proposed schemes in a MIMO OFDM system with the parameters as given in Table 9.1 and assuming the typical urban macro-cell channel
model of the WINNER project [HKK+ 07].
Table 9.1. Simulation parameters, see [DLU09]. © 2010 IEEE.

Parameter                           Variable   Value
Num. of BSs                         M          1
Num. of Tx ant. per BS              Nbs        4
Num. of overall users               U          10
Num. of users sched. to same res.   K          4
Num. of Rx ant.                     Nue        2
Speed of users                                 1 m/s
Carrier frequency                              2.0 GHz
Bandwidth                                      18 MHz
FFT size                                       2048
Num. of sub-carriers                           1200
Num. of feedback bits                          4
Feedback period                                1.0 ms
SINR quantization                              No
Channel model                                  Typical urban macro
Path loss used                                 No
Ant. spacing at Tx                             0.5 wavelength
Ant. spacing at Rx                             0.5 wavelength
Fig. 9.6 illustrates the performance difference between the CVQ schemes with
Pseudo-Maximization (PM) and full maximization of the CQI indicator
(approximate SINR), and compares them with the minimum Euclidean Distance
(minimum quantization error) approach. Here, we assume a random codebook with
$B = 4$, where the elements of $\mathcal{C}$ are chosen from an isotropic
distribution on the $N_{BS}$-dimensional unit sphere, i.e., normalized versions of
vectors with random entries that correspond to $\mathcal{N}_{\mathbb{C}}(0, 1)$.
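Such a random codebook can be generated as in the following sketch, which draws i.i.d. complex Gaussian vectors and normalizes them to unit length. The function name, the seed handling and the row-wise storage of codebook entries are illustrative choices.

```python
import numpy as np

def random_codebook(n_bits: int, n_bs: int, seed: int = 0) -> np.ndarray:
    """Return 2**n_bits unit-norm vectors, isotropic on the complex unit sphere."""
    rng = np.random.default_rng(seed)
    shape = (2 ** n_bits, n_bs)
    # i.i.d. N_C(0, 1) entries: real and imaginary parts each have variance 1/2.
    raw = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    return raw / np.linalg.norm(raw, axis=1, keepdims=True)

codebook = random_codebook(n_bits=4, n_bs=4)   # B = 4, N_BS = 4 as in Table 9.1
norms = np.linalg.norm(codebook, axis=1)
```

Normalizing Gaussian vectors yields an isotropic distribution because the complex Gaussian density depends only on the vector norm.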
The maximal gain of the pseudo and full maximization schemes over the
minimum Euclidean distance method seems to be about 1.2 bit/s/Hz at 0 dB
SNR. Moreover, one can see that the pseudo-maximization scheme acceptably
approaches the performance of the full maximization scheme. Indeed, the
performance gap between the two is never more than about 0.7 bit/s/Hz.
Surprisingly, pseudo-maximization and Euclidean distance minimization are both
slightly superior to full maximization in the mid and high SNR regions. This
is possible due to the fact that the SINR measure used as the cost criterion
for maximization does not represent the exact SINR, but is rather an
approximation of this quantity. It appears that Euclidean distance minimization is
the best option at mid and high SNR, and that in that range the SINR
pseudo-maximization achieves the same performance. As pseudo-maximization makes a
choice between the codebook entry that maximizes channel magnitude and the
one that minimizes the quantization angle (Euclidean distance), one can suspect
that at high SNR, the minimum quantization angle solution is chosen most of
the time, such that no performance difference is noticeable with the scheme that
only chooses the entry with the minimum Euclidean distance.
[Figure: horizontal axis: SNR in dB.]
Figure 9.6 Performance comparison in case of random codebook and typical urban
macro-cell, from [DLU09]. © 2010 IEEE.
Remember that the pseudo-maximization scheme requires much less computational
complexity than full maximization (see the discussion on that effect in
Section 9.2.5). Due to the above results, it also appears that pseudo-maximization
always performs better than or equivalently to quantization angle minimization,
and that not much is gained by using full maximization (it can even cause
slight performance degradation in certain SNR ranges). Therefore, the
pseudo-maximization scheme seems to be the preferred alternative to quantization
angle minimization and produces significant sum-rate gains with respect to this
scheme in the low SNR region. In [KDA+ 10], the same conclusion has been drawn on
the basis of system-level simulations.
9.2.10
Summary
In this section, we presented explicit CSI feedback schemes based on CVQ. Whereas
the minimum Euclidean distance method achieves a good performance in the high
SNR region, it degrades enormously in cases of small SNRs. In these cases, CVQ
based on maximizing an estimate of the SINR outperforms Euclidean distance based
schemes. However, the maximum SINR based CVQ method suffers from a very high
computational complexity. Here, the suboptimal solutions presented in this section
provide a good trade-off between complexity and performance. While this section
focused on a single-cell case, all discussed CSI feedback techniques can in
principle also be applied to multi-cell CSI (possibly capturing both desired and
interfering channels), and hence be used in the context of CoMP.
10.1
controllable error propagation. Initially, the transmission model is stated in Subsection 10.1.1, after which precoding architectures are introduced step by step
in Subsections 10.1.2 to 10.1.4. The section concludes with a numerical example
and a summary in Subsections 10.1.5 and 10.1.6, respectively.
10.1.1
System Model
The considered multi-user downlink system employs an orthogonal frequency
division multiple access (OFDMA) scheme, where $M$ BSs with a total of $N_{BS}$
transmit antennas jointly transmit data to $K$ UEs with $N_{UE}$ receive antennas
in total. The notation from Section 3.5 is reused for the baseband model of a
frequency-flat channel in frequency domain per sub-carrier as
+n ,
= GH HH Wd(x) + n = GH Hx
(10.1)
x
where $\mathbf{x}, \hat{\mathbf{x}} \in \mathbb{C}^{[N_{UE} \times 1]}$ are the stacked
symbols which are transmitted to the UEs and estimated after receive processing,
respectively.
On the one hand, the statistical quantities of this model are the transmit
symbols, whose transmit covariance matrix is given by
$\mathbf{\Phi}_{xx} = E\{\mathbf{x}\mathbf{x}^H\} = \sigma_x^2 \mathbf{I}_{N_{UE}}$. On
the other hand, the vector $\mathbf{n} \in \mathbb{C}^{[N_{UE} \times 1]}$ is assumed
to be additive white Gaussian noise (AWGN) with covariance matrix
$\mathbf{\Phi}_{nn} = E\{\mathbf{n}\mathbf{n}^H\}$.
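The transmit symbols are processed by the linear spatial precoder
$\mathbf{W} \in \mathbb{C}^{[N_{BS} \times N_{UE}]}$ which forms together with the
physical channel $\mathbf{H} \in \mathbb{C}^{[N_{BS} \times N_{UE}]}$ an effective
channel $\tilde{\mathbf{H}} \in \mathbb{C}^{[N_{UE} \times N_{UE}]}$. Additionally, the
transmit power is limited by the gain $\beta \in \mathbb{R}$ according to the
sum-power constraint
$$E\{\|\beta \mathbf{W} \mathbf{x}\|^2\} = \beta^2\, \mathrm{tr}\{\mathbf{W} \mathbf{\Phi}_{xx} \mathbf{W}^H\} = E_{Tx}. \tag{10.2}$$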
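The sum-power constraint (10.2) determines the scalar gain once the precoder is fixed. The sketch below, which writes the gain as `beta`, solves the constraint for a random precoder and checks it by Monte Carlo simulation; the precoder, the values of `e_tx` and `sigma_x2`, and the sample size are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
n_bs, n_ue = 4, 3
e_tx, sigma_x2 = 2.0, 1.0    # hypothetical transmit energy and symbol variance

# Hypothetical unnormalized precoder W.
w = (rng.standard_normal((n_bs, n_ue))
     + 1j * rng.standard_normal((n_bs, n_ue))) / np.sqrt(2)
phi_xx = sigma_x2 * np.eye(n_ue)

# Solve beta^2 * tr(W Phi_xx W^H) = E_Tx for the real gain beta.
tx_trace = float(np.trace(w @ phi_xx @ w.conj().T).real)
beta = np.sqrt(e_tx / tx_trace)

# Monte Carlo check of E{||beta W x||^2} with complex Gaussian symbols.
n_samples = 20000
x = (rng.standard_normal((n_ue, n_samples))
     + 1j * rng.standard_normal((n_ue, n_samples))) * np.sqrt(sigma_x2 / 2)
empirical = float(np.mean(np.sum(np.abs((beta * w) @ x) ** 2, axis=0)))
```

The empirical average transmit energy should approach the target as the number of samples grows, independent of how the power is distributed across antennas.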
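The aim of the linear precoder is to decouple the received data symbols from
each other already at the BS side. Each of the $K$ non-cooperative UEs spatially
filters its received signals independently. Consequently, the receive filter matrix
$\mathbf{G} \in \mathbb{C}^{[N_{UE} \times N_{UE}]}$ is a block matrix of the compound
UEs' receive filters. The equalized data signals at each UE $k$ can be decomposed
into
$$\hat{\mathbf{x}}_k = \mathbf{G}_k^H \Big( \mathbf{H}_k^H \mathbf{W}_k \mathbf{x}_k + \sum_{j=1, j \neq k}^{K} \mathbf{H}_k^H \mathbf{W}_j \mathbf{x}_j + \mathbf{n}_k \Big). \tag{10.3}$$
Here, $\mathbf{H}_k^H \mathbf{W}_k \mathbf{x}_k$ describes the desired symbol part. The
second term $\sum_{j=1, j \neq k}^{K} \mathbf{H}_k^H \mathbf{W}_j \mathbf{x}_j$
represents the spatial interference. The AWGN vector
$\mathbf{n}_k \in \mathbb{C}^{[N_{ue} \times 1]}$ per UE is defined with
$E\{\mathbf{n}_k \mathbf{n}_k^H\} = \sigma_{n_k}^2 \mathbf{I}_{N_{ue}}$ as local
covariance matrix.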
For the sake of simplicity, the compound receive filter is reduced to a scaled
diagonal matrix $\mathbf{G} = \beta^{-1} \mathbf{I}_{N_{UE}}$. This means that each
user stream is handled independently from other streams, as for single-antenna
UEs. Each UE should estimate this scalar based on the effective channel.
Admittedly, real handset implemen-
This expresses a power ratio based on the effective channel
$\tilde{\mathbf{H}} = \mathbf{H}^H \mathbf{W}$, which describes the relation of the
useful signal part l to interference terms.
10.1.2
Tx
with WF
>
?
?
=@
'
ETx /x2
tr
HH H (HH H + ' )2
!.
(10.6)
n
min
2k uk uH
k ,
(10.7)
k=1
with the unitary matrix U U[NUE NUE ] and R[NUE NUE ] . This decomposition bounds the system performance.
Spatial conditions are determined, e.g., through antenna correlations, path
loss or shadowing eects, so that reduced rank situations can occur. Consequently, in a multiple-input multiple-output (MIMO) multi-point scenario there
exist at least 21 22 2nmin dominant eigenvalues with uk as the corresponding eigenvector to eigenvalue 2k and nmin = min(M, K). In the full rank
case, there exist at most $n_{min} = \min(N_{BS}, N_{UE})$ non-zero eigenvalues.
Related to the EVD, each user symbol is transmitted on a different eigenmode with
the power allocated according to the corresponding eigenvalues. With the EVD, the
transmit filter can be written as
1
'
UH ,
(10.8)
W = WF H U + UH U
with
>
?
?
=@
lim
WF
2
n 0
tr{nn }
ETx INUE
ETx /x2
tr
( + UH ' U)
>
?
?
=?
@
ETx /x2
n
min
1
i=1
and i nmin .
(10.9)
(10.10)
2i
min(NBS ,NUE )
ranknum (H) =
i=1
(10.13)
Finally, the number of user streams cannot exceed the number of eigenmodes.
Hence, a spatial resource allocation may not include an additional stream which
would reduce the SINR of the already precoded streams.
10.1.3
one has to choose an appropriate algebraic decomposition [GVL96]. In this section, three basic approaches will be compared before the idea of order-recursions
will be presented.
The Cholesky decomposition (CD), which is similar to Gaussian elimination, separates symmetric, positive-definite matrices into a triangular matrix R.
By calculating the inverse of R with backward recursions and pivoting, CD is a
low-complexity ($O(\tfrac{1}{3} N_{UE}^3)$) and numerically stable algorithm.
'
(10.14)
Substitution: RH R = HH H + I
H
Computation:
W = H R1 R1
(10.15)
On the other hand, the ability to handle a multitude of MIMO setups requires
a further organizational overhead, because the assignment of sub-matrices
within H necessarily leads to an advanced exception handling within the algorithm.
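The substitution and computation steps of the Cholesky approach can be illustrated with a minimal plain-Python sketch. It assumes that the regularization is a scaled identity `xi * I` and it leaves out the power scaling of the filter; the function names and the parameter `xi` are hypothetical and not taken from the text.

```python
def cholesky_upper(a):
    """Upper-triangular R with R^H R = a, for a Hermitian positive-definite a."""
    n = len(a)
    r = [[0j] * n for _ in range(n)]
    for i in range(n):
        diag = a[i][i] - sum(abs(r[k][i]) ** 2 for k in range(i))
        r[i][i] = complex(diag.real ** 0.5)
        for j in range(i + 1, n):
            off = a[i][j] - sum(r[k][i].conjugate() * r[k][j] for k in range(i))
            r[i][j] = off / r[i][i]
    return r


def solve_with_factor(r, b):
    """Solve (R^H R) x = b by one forward and one backward substitution."""
    n = len(r)
    y = [0j] * n
    for i in range(n):  # forward substitution: R^H y = b
        acc = sum(r[k][i].conjugate() * y[k] for k in range(i))
        y[i] = (b[i] - acc) / r[i][i].conjugate()
    x = [0j] * n
    for i in reversed(range(n)):  # backward substitution: R x = y
        acc = sum(r[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - acc) / r[i][i]
    return x


def regularized_filter(h, xi):
    """Unscaled filter W = H (H^H H + xi I)^{-1}; h is a list of N_BS rows."""
    n_bs, n_ue = len(h), len(h[0])
    gram = [[sum(h[m][i].conjugate() * h[m][j] for m in range(n_bs))
             + (xi if i == j else 0.0)
             for j in range(n_ue)] for i in range(n_ue)]
    r = cholesky_upper(gram)
    w = []
    for row in h:
        # A row w of W satisfies w A = row, i.e. A w^H = row^H for Hermitian A.
        x = solve_with_factor(r, [v.conjugate() for v in row])
        w.append([v.conjugate() for v in x])
    return w
```

The matrix is factored once and every row of W then costs only two triangular solves, which is the reason the decomposition is attractive compared with forming an explicit inverse.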
Another possibility is to use the class of QR decompositions with its realizations, namely the modified Gram-Schmidt orthogonalization, the Householder transformation or the Givens rotation. A compound matrix is decomposed into a
unitary and a triangular matrix with
) (
) (
)
(
Q1
R
HH
=
and
(10.16)
Substitution:
'
1/2
Q2
0
Computation:
= R1 QH
1 .
(10.17)
(
)1
P
PK
AU
1
1
= SP
C1
KV) UC
+C1
V
(A
VC
(10.20)
The drawback is the high error propagation in limited precision arithmetics.
Numerical errors are accumulated with each processing stage from C1 to K
until the full matrix has been processed.
vC[1NUE ]
This principle denotes the transmit filter as a series of column- or row-wise matrix
extensions. Here, each column in H is the algebraic mapping of antenna links
which relate to one column in W for each user data stream.
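$$w_1 = \beta_{TxWF}\, h_1 \Big( h_1^H h_1 + \frac{\mathrm{tr}\{\Phi_{nn}\}}{E_{Tx}} \Big)^{-1} \tag{10.22}$$
$$[w_1\ w_2] = \beta_{TxWF}\, [h_1\ h_2] \Big( [h_1\ h_2]^H [h_1\ h_2] + \frac{\mathrm{tr}\{\Phi_{nn}\}}{E_{Tx}} I_2 \Big)^{-1} \tag{10.23}$$
$$\vdots \tag{10.24}$$
$$W = \beta_{TxWF}\, H \Big( H^H H + \frac{\mathrm{tr}\{\Phi_{nn}\}}{E_{Tx}} I_{N_{UE}} \Big)^{-1}. \tag{10.25}$$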
In the following, the algorithm is discussed in detail to explain the filter updates
by forward and backward recursion.
10.1.4
.
W[:,I[1:l]] = W[:,I[1:l1] ] b d[IT[1:l1] ]
/
.
W
HH
[:,I
]
[1:l1]
[:,I[l] ]
,
d[I[1:l] ] =
1
(10.27)
(10.28)
(10.29)
b
otherwise
2
= e / "e"2 + " ' 1/2
.
(10.31)
with b
d
"
[I[1:l] ]
[I[1:l] ,I[1:l ]
The updates are performed with subspace projections (see (10.28)
and (10.29)). Thus
'
T *
H
e = INBS W[:,I[1:l1] ] H[:,I[1:l1]]
H[:,I[l]]
(10.32)
PC (HH
[:,I[1:l1] ]
[1:l1] ]
mentary space which is spanned by the set of the already considered row space
of the channel matrix. With this approach, the precoding matrix is additively
extended with rank-one updates of regularized projections through the channel
column vector H[:,I[l] ] into the existing precoding solution.
The quantity of linear independence between the user stream I[l] and the
existing precoder W[:,I[1:l1] ] , corresponding to the EVD, is the angle
cos I[l] =
HT[:,I[l] ] e
"H[:,I[l]] " "e"
, 0 I[l]
(10.33)
2
21 HT[:,I[l]] PC (H[:,I
) H[:,I[l] ] nmin , "e" > 0 .
[1:l1] ]
(10.34)
"e"2
Table 10.1. Operation count comparison of matrix inversion algorithms [GVL96, HKF10].

Algorithm       O-notation                           Divisions   Sqr. roots
Cholesky dec.   1/3 N_UE^3                           N_UE        0
QR dec.         4 N_BS N_UE^2 - 1/3 N_UE^3           N_UE        N_UE
Order-rec.      3/2 N_BS N_UE^2                      N_UE        0
10.1.5
[Figure: two panels showing SINR [dB] versus the channel condition number of H (from 10^0 to 10^3), each comparing floating-point and fixed-point arithmetic for $\sigma_n^2 \in \{0.001, 0.01, 0.1, 1.0\}$.]
Figure 10.1 Floating and fixed-point SINR for i.i.d. Gaussian MIMO systems as
function of the channel condition number.
prominent. If eigenvalues (10.8) are smaller than the machine accuracy $\epsilon_{num}$ or
the projection norms (10.34) are observed as zeros, then a low desired signal power and
a high spatial interference power result. With increasing noise, the regularization
keeps the division operations in a defined range, prevents
numerical instabilities and improves the conditioning of the matrix inversion.
Nevertheless, the regularized (or approximated) division results in imperfect interference suppression.
10.1.6
Summary
This section described an order-recursive algorithm to calculate the Wiener
transmit filter for coherent CoMP transmissions between several decentralized
base stations to several decentralized mobile terminals in cellular radio networks.
State-of-the-art implementations do not offer the flexibility to handle multiple
transmission setups as required for multi-user precoding in conjunction with a
proposed reduced complexity. The discussion of all signal processing steps shows
how algebraic observations are mapped to physical parameters and how they
reflect the spatial eigenmodes. These observations can be integrated into spatial
resource allocation to extend the precoding coefficients recursively with additional spatial streams.
10.2
10.2.1
k=2
noise
desired
interference
(10.35)
for the UE receive filter weights g has been skipped for brevity since we deal
exclusively with UE 1 in this section. Please note that due to the single-layer
assumption introduced above, the transmit symbols towards the UEs, $x_k$, are
denoted as scalars. For simplified notation in the sequel, we define the effective
channel for single-layer transmission from BS k to UE 1 as follows:
$$h_k = [H_{k1}]^H w_k. \tag{10.36}$$
Then, the interference-and-noise part from (10.35) being effective at the receiver
can be formulated as
$$z_1 = \sum_{k=2}^{K} h_k x_k + n_1 \tag{10.37}$$
(10.38)
$$\Phi_{z_1 z_1} = \mathrm{E}\{z_1 z_1^H\} = \sigma^2 I + \sum_{k=2}^{K} h_k h_k^H\, \mathrm{E}\{|x_k|^2\}. \tag{10.39}$$
$$g_{IRC}^H = h_1^H \Phi_{z_1 z_1}^{-1} = \underbrace{h_1^H \Phi_{z_1 z_1}^{-1/2}}_{\text{MRC}}\ \underbrace{\Phi_{z_1 z_1}^{-1/2}}_{\text{pre-whitening}}. \tag{10.40}$$
filter. The motivation for this choice is that the SINR is appropriate to represent the effective signal-to-noise ratio (SNR) on an additive white Gaussian
noise (AWGN) channel. With the effective SNR approach, further performance
metrics like error probability or throughput can easily be derived from known
AWGN figures. As starting point, the generic expression for the instantaneous
SINR with a general UE receive filter is given by (see e.g. [HSP01])
$$\mathrm{SINR}(g) = \frac{|g^H h_1|^2}{g^H \Phi_{z_1 z_1} g}\, \mathrm{E}\{|x_1|^2\}. \tag{10.41}$$
The term instantaneous SINR relates to the fact that only one specific channel
realization of the desired and interfering links is considered. The generic expression in (10.41) is used later on when mismatched UE receive filters due to estimation errors are evaluated. The next step is to insert the UE receive filter weights
for IRC from (10.40) into (10.41), resulting in the simplified SINR expression
$$\mathrm{SINR}(g_{IRC}) = h_1^H \Phi_{z_1 z_1}^{-1} h_1\, \mathrm{E}\{|x_1|^2\} = \mu\, \mathrm{E}\{|x_1|^2\}. \tag{10.42}$$
Here, the IRC parameter
$$\mu = h_1^H \Phi_{z_1 z_1}^{-1} h_1 \tag{10.43}$$
is introduced, which can be interpreted as the SINR in case of ideal IRC for unit-power transmit symbols. In order to complete the framework on IRC, the UE
receive filter weights according to the minimum mean square error (MMSE)
approach [Ver98] are derived and compared to the IRC ones. The MMSE solution
works on the covariance of the receive signal, which reads
$$\Phi_{y_1 y_1} = \mathrm{E}_{x,n}\{y_1 y_1^H\} = \sigma^2 I + \sum_{k=1}^{K} h_k h_k^H\, \mathrm{E}\{|x_k|^2\}. \tag{10.44}$$
The relation between the covariance of the receive signal $y_1$ and the covariance
of the interference-and-noise signal $z_1$ can be observed as
$$\Phi_{y_1 y_1} = \Phi_{z_1 z_1} + h_1 h_1^H\, \mathrm{E}\{|x_1|^2\}. \tag{10.45}$$
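Conceptually, the MMSE weights are obtained by whitening the receive signal
and applying a matched filter afterwards. This results in the following weight
expression
$$g_{MMSE}^H = h_1^H \Phi_{y_1 y_1}^{-1} = h_1^H \big( \Phi_{z_1 z_1} + h_1 h_1^H\, \mathrm{E}\{|x_1|^2\} \big)^{-1}. \tag{10.46}$$
Applying the matrix inversion lemma to (10.46) gives
$$g_{MMSE}^H = h_1^H \Phi_{z_1 z_1}^{-1} - \frac{h_1^H \Phi_{z_1 z_1}^{-1} h_1\, \mathrm{E}\{|x_1|^2\}}{1 + h_1^H \Phi_{z_1 z_1}^{-1} h_1\, \mathrm{E}\{|x_1|^2\}}\, h_1^H \Phi_{z_1 z_1}^{-1}. \tag{10.47}$$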
Inserting the IRC expressions for the weights from (10.40) as well as for the SINR
from (10.42), the MMSE weights read
$$g_{MMSE}^H = \Big( 1 - \frac{\mathrm{SINR}(g_{IRC})}{1 + \mathrm{SINR}(g_{IRC})} \Big) g_{IRC}^H = \frac{1}{1 + \mathrm{SINR}(g_{IRC})}\, g_{IRC}^H. \tag{10.48}$$
This interesting result shows that the MMSE weights $g_{MMSE}$ are simply a scaled
version of the IRC weights $g_{IRC}$, hence yielding the same SINR
$$\mathrm{SINR}(g_{MMSE}) = \mathrm{SINR}(g_{IRC}). \tag{10.49}$$
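The scaling relation between MMSE and IRC weights can be checked numerically. The following is a minimal sketch for a UE with two receive antennas and one interferer, written in plain Python; the helper names and the default values for noise and symbol powers are illustrative assumptions, not taken from the text.

```python
def inv2(m):
    """Inverse of a 2x2 complex matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    """Product of a 2x2 matrix with a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def inner(u, v):
    """u^H v for length-2 complex vectors."""
    return u[0].conjugate() * v[0] + u[1].conjugate() * v[1]


def sinr(g, h1, phi_zz, p1):
    """Generic SINR: |g^H h1|^2 / (g^H Phi_zz g) * E{|x1|^2}."""
    return abs(inner(g, h1)) ** 2 / inner(g, matvec(phi_zz, g)).real * p1


def compare_irc_mmse(h1, h2, sigma2=0.1, p1=1.0, p2=1.0):
    """Return (g_irc, g_mmse, sinr_irc, sinr_mmse) for one interferer h2."""
    # Interference-and-noise covariance: sigma2 * I + p2 * h2 h2^H
    phi_zz = [[(sigma2 if i == j else 0.0) + p2 * h2[i] * h2[j].conjugate()
               for j in range(2)] for i in range(2)]
    # Receive-signal covariance: Phi_zz + p1 * h1 h1^H
    phi_yy = [[phi_zz[i][j] + p1 * h1[i] * h1[j].conjugate()
               for j in range(2)] for i in range(2)]
    g_irc = matvec(inv2(phi_zz), h1)    # g_IRC = Phi_zz^{-1} h1
    g_mmse = matvec(inv2(phi_yy), h1)   # g_MMSE = Phi_yy^{-1} h1
    return g_irc, g_mmse, sinr(g_irc, h1, phi_zz, p1), sinr(g_mmse, h1, phi_zz, p1)
```

For any channel pair the two SINR values agree up to rounding, and each MMSE weight equals the corresponding IRC weight divided by 1 + SINR(g_IRC), since the SINR is invariant to a common scaling of the weights.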
$$g_{MRC}^H = \sigma^{-2} h_1^H. \tag{10.50}$$
So far, the UE receive filter weights have been derived assuming ideal knowledge of channel and effective noise covariance, which is also the basis for the first
performance study in Section 10.2.2. In later subsections, we will incorporate
different kinds of estimation errors by introducing the following weights:
$$\hat{g}_{IRC,a}^H = g_{IRC}^H(\hat{h}, \Phi) = \hat{h}_1^H \Phi_{z_1 z_1}^{-1} \quad \text{with estimation of } h \tag{10.51}$$
$$\hat{g}_{IRC,b}^H = g_{IRC}^H(h, \hat{\Phi}) = h_1^H \hat{\Phi}_{z_1 z_1}^{-1} \quad \text{with estimation of } \Phi \tag{10.52}$$
$$\hat{g}_{IRC,c}^H = g_{IRC}^H(\hat{h}, \hat{\Phi}) = \hat{h}_1^H \hat{\Phi}_{z_1 z_1}^{-1}. \tag{10.53}$$
10.2.2
with correlation parameters $\alpha$ for transmit correlation and $\beta$ for receive correlation.
As described in the introduction of this section, the study focuses on scenarios where desired and interferer downlink are generated as dedicated beams. For
an appropriate modelling of this beamforming aspect, high correlation at the BS
side is assumed. Therefore, the value $\alpha = 0.9$ is chosen since it also models the
high correlation case in 3GPP [3GP10h]. At the UE side, we assume some polarization diversity and, therefore, low to medium correlation can be achieved. Here,
$\beta = 0.3$ is selected, matching the medium correlation case in 3GPP [3GP10h].
We consider one BS site with $N_{bs} = 2$ physical antennas and one UE with
$N_{ue} = 2$ physical antennas. Furthermore, we take K = 2, i.e. one desired beam
(layer) and one interferer beam (layer). Please note that a beam in this context is
equivalent to one layer in the LTE sense. For simplicity, no particular pre-coding
is adopted, meaning that one BS antenna serves one UE. Such a model gives
direct insight into the performance of multi-user spatial multiplexing.
The performance measure for the study is the mean SINR, where the expectation is taken over all channel samples. Study parameters are the SNR and
signal-to-interference ratio (SIR), respectively. They are defined as
$$\mathrm{SNR} = \frac{\mathrm{E}\{|x_1|^2\}}{\sigma^2} \tag{10.54}$$
$$\mathrm{SIR} = \frac{\mathrm{E}\{|x_1|^2\}}{\mathrm{E}\{|x_2|^2\}}. \tag{10.55}$$
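Starting from (10.42), this results in the following SINR expression (K = 2) for
IRC:
$$\mathrm{SINR}(g_{IRC}) = h_1^H \Big( \frac{\Phi_{z_1 z_1}}{\mathrm{E}\{|x_1|^2\}} \Big)^{-1} h_1 = h_1^H \big( \mathrm{SNR}^{-1} I + h_2 h_2^H\, \mathrm{SIR}^{-1} \big)^{-1} h_1. \tag{10.56}$$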
The curves are sketched as mean SINR versus SNR with SIR as additional parameter. Numerical results for IRC are depicted in Fig. 10.2 for the parameter
SIR $\in \{-10, 0, 10\}$ dB. Additionally, for reference, the performance of an MRC
receiver is shown.
It can be seen that the potential gains of IRC versus MRC strongly depend
on the SNR operating point. As soon as the SNR is larger than the SIR, the IRC
gain versus MRC gets substantial. This behavior is further affected by transmit
and receive correlation. For higher receive correlation values, the point where
IRC pays off is moved to slightly higher SNR values.
As a conclusion for interference-limited scenarios (SIR < SNR) with one dominating interferer, we see substantial gains for the discussed IRC approach. The
following sections elaborate further on the losses incurred when
estimation errors for channel and interference covariance are taken into account.
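To make the comparison between IRC and MRC concrete, the sketch below evaluates both post-combining SINRs for a single channel realization with two UE antennas and one interferer. It assumes the normalization E{|x1|^2} = 1, so that the noise power is 1/SNR and the interferer power is 1/SIR; the function name and the MRC choice g = h1 are assumptions made for this illustration.

```python
def irc_and_mrc_sinr(h1, h2, snr_db, sir_db):
    """Return (sinr_irc, sinr_mrc) in linear scale for length-2 channels h1, h2."""
    snr = 10 ** (snr_db / 10)
    sir = 10 ** (sir_db / 10)
    # Normalized covariance Phi = I / SNR + h2 h2^H / SIR, stored as
    # [[a, b], [conj(b), d]] with real a and d.
    a = 1 / snr + abs(h2[0]) ** 2 / sir
    d = 1 / snr + abs(h2[1]) ** 2 / sir
    b = h2[0] * h2[1].conjugate() / sir
    det = a * d - abs(b) ** 2
    cross = 2 * (h1[0].conjugate() * b * h1[1]).real
    # IRC: h1^H Phi^{-1} h1, using the closed-form 2x2 inverse
    sinr_irc = (d * abs(h1[0]) ** 2 + a * abs(h1[1]) ** 2 - cross) / det
    # MRC with g = h1: |h1^H h1|^2 / (h1^H Phi h1)
    energy = abs(h1[0]) ** 2 + abs(h1[1]) ** 2
    sinr_mrc = energy ** 2 / (a * abs(h1[0]) ** 2 + d * abs(h1[1]) ** 2 + cross)
    return sinr_irc, sinr_mrc


# Illustrative channels (not from the text): strong interferer, SNR above SIR
irc, mrc = irc_and_mrc_sinr([1 + 0j, 0.5j], [0.8 + 0j, -0.3 + 0.4j], 20.0, 0.0)
```

When the interferer is spatially orthogonal to the desired channel, both receivers reach the SNR; the gap opens when the interferer is neither orthogonal nor collinear and the SNR exceeds the SIR.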
[Figure: mean SINR [dB] versus SNR [dB]; curves for IRC and MRC, each at SIR = 10 dB, 0 dB and -10 dB.]
Figure 10.2 Mean SINR versus SNR for ideal IRC and MRC receive filter weights.
Please note that the underlying channel model assumes simply one antenna
array at BS site which is used by desired as well as interfering link. In the strict
sense, this model is only applicable for the intra-cell interference case. However,
the basic results on algorithm aspects and potential IRC gains can be easily
extended to the cases of inter-cell interference.
10.2.3
(10.57)
where the vector e consists of complex-valued Gaussian noise with zero mean
and covariance
$$\Phi_{ee} = \mathrm{E}\{e e^H\} = \frac{1}{G}\, \Phi_{z_1 z_1}, \tag{10.58}$$
and where we introduced the interpolation or processing gain G as a linear
power ratio (see Section 9.1). The amount of corresponding de-noising by means
of channel estimation filtering in time and/or frequency direction is given by
$10 \log_{10}(G)$ [dB].
Let us first assume the interference-and-noise covariance estimate to be ideal
and deal with its estimation in Section 10.2.4. With the additive error model, the
non-ideal weight vector suffering from channel estimation errors reads
$$\hat{g}_{IRC,a}^H = h_1^H \Phi_{z_1 z_1}^{-1} + e^H \Phi_{z_1 z_1}^{-1}. \tag{10.59}$$
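$$\mathrm{SINR}(\hat{g}_{IRC,a}) = \frac{\mu^2 + h_1^H \Phi_{z_1 z_1}^{-1} \Phi_{ee} \Phi_{z_1 z_1}^{-1} h_1}{\mu + \mathrm{tr}\{\Phi_{ee} \Phi_{z_1 z_1}^{-1}\}}\, \mathrm{E}\{|x_1|^2\}. \tag{10.63}$$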
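The intention now is to put this SINR into a relationship with $\mathrm{SINR}(g_{IRC})$
from (10.42), to obtain the general SINR ratio, with the IRC parameter $\mu$ from (10.43),
$$\frac{\mathrm{SINR}(\hat{g}_{IRC,a})}{\mathrm{SINR}(g_{IRC})} = \mu^{-1}\, \frac{\mu^2 + h_1^H \Phi_{z_1 z_1}^{-1} \Phi_{ee} \Phi_{z_1 z_1}^{-1} h_1}{\mu + \mathrm{tr}\{\Phi_{ee} \Phi_{z_1 z_1}^{-1}\}}. \tag{10.64}$$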
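In a final step, we make use of our channel estimation error model with the specific
definition of $\Phi_{ee}$ from (10.58), which allows for substituting $\Phi_{ee} \Phi_{z_1 z_1}^{-1} = I/G$ to
obtain the SINR ratio, with the IRC parameter $\mu$ from (10.43),
$$\frac{\mathrm{SINR}(\hat{g}_{IRC,a})}{\mathrm{SINR}(g_{IRC})} = \mu^{-1}\, \frac{\mu^2 + \mu/G}{\mu + N_{ue}/G} = \frac{\mu + 1/G}{\mu + N_{ue}/G}. \tag{10.65}$$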
This is valid for the specific model in which the channel estimation error exhibits the
same spatial correlation as the data, only scaled by a scalar processing gain.
It needs to be understood that this is purely the ratio of post-combining SINRs.
The actual losses measurable in error rate or throughput will usually be higher,
and they will also depend on the specific type of signal constellation and the
post-combining demodulation strategy. As such, the ratio in (10.65) manifests
a fundamental lower bound for the non-recoverable losses caused during combining by weight mismatch due to imperfect channel estimation. We will treat
the additional demodulation loss from phase or amplitude mismatch in the next
paragraph.
Further, we can read from (10.65) that the SINR ratio is equal to 1 for $N_{ue} = 1$
UE receive antenna, which is reasonable, since the post-combining SINR does
not suffer from an amplitude or phase error. It is only the demodulation process
itself which later causes losses. Another sanity check is that the SINR ratio
converges to 1 for $G \to \infty$ irrespective of a finite $N_{ue}$.
Demodulation Loss
In order to consider the impact of the demodulation loss, the channel estimation
error is introduced into the system model established in (10.38):
$$\hat{x}_1 = g^H (\hat{h}_1 - e)\, x_1 + g^H z_1 = g^H \hat{h}_1 x_1 + g^H (z_1 - e x_1). \tag{10.66}$$
With the specific channel estimation error model introduced in (10.58), this
SINR expression can be further simplified to
$$\mathrm{SINR}(g) = \frac{|g^H \hat{h}_1|^2}{g^H \Phi_{z_1 z_1} g} \cdot \frac{1}{1/\mathrm{E}\{|x_1|^2\} + 1/G}. \tag{10.68}$$
With the chosen model being valid for not too low SNRs, the demodulation loss
is independent of the SNR that data transmission is subject to (but of course
dependent on the SNR channel estimation is subject to) and of the actually chosen filter weights. So, when comparing the achievable SINR for different weight
selection approaches, the demodulation loss has no impact since it affects each
approach equally. It is a significant contribution to the overall loss, though,
as we will see in Fig. 10.3 in the next paragraph.
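The size of the demodulation loss can be sketched numerically. Comparing the power factor in (10.68) with the ideal factor E{|x1|^2} gives a ratio of 1 / (1 + E{|x1|^2} / G); this reading of the loss, and the function name below, are assumptions made for illustration.

```python
import math


def demodulation_loss_db(gain_db, symbol_power=1.0):
    """Loss [dB] of the factor 1 / (1/P + 1/G) relative to the ideal factor P."""
    g = 10 ** (gain_db / 10)                 # processing gain as linear power ratio
    factor = 1.0 / (1.0 + symbol_power / g)  # ratio of the two power factors
    return -10 * math.log10(factor)


# Loss for a few processing gains with unit symbol power
losses = {gain_db: demodulation_loss_db(gain_db) for gain_db in (5, 6, 10, 15)}
```

Under this reading, a processing gain of 6 dB costs a little under 1 dB, and the loss shrinks monotonically as the processing gain grows.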
Numerical Evaluation of SINR Loss with Weight Mismatch
The channel estimation interpolation gain achievable in a general OFDMA system depends on the topology of reference signals in the time-frequency grid and
the two-dimensional correlations of channel coefficients parameterized by time
dispersion properties (e.g., delay spread) and time variation (e.g., Doppler frequency) of the channel (see Section 9.1). It can be shown by analysis of the
Wiener solution [HKR+97b] that for the specific reference signal situation and
channel model parameters in LTE, the interpolation gain by means of UE-side
implementable filtering with limited span in frequency direction ranges roughly
between 3 dB in a fairly dispersive channel still compliant with a normal cyclic
prefix and 12 dB in a non-dispersive channel. Further interpolation gain may be
achieved by means of filtering in time direction.
Fig. 10.3 shows the SINR ratio for exemplary parameters. With a processing
gain of 6 dB for $N_{ue} = 2$ UE receive antennas, we have an acceptable combining
loss of 0.11 dB when targeting $10 \log_{10}(\mu) = 10$ dB. For $N_{ue} = 4$, the combining loss increases to 0.31 dB. To guarantee the same post-combining SINR with
$N_{ue} = 4$ instead of $N_{ue} = 2$, the processing gain needs to be improved by approximately 5 dB. This result is related to [RCP09], and it shows that the cost for
channel estimation rises with the number of antennas when the combining
loss with respect to ideal performance shall be limited.
The combining loss to be expected with realistic channel estimation is small
compared to the significant gains offered by ideal IRC, as shown in Fig. 10.2.
It should also be noted that the requirements on processing gain are dominated
by the demodulation loss, if we consider $N_{ue} \leq 4$ and the higher SINR range,
e.g. above 8 dB. Hence, irrespective of combining loss due to channel estimation
errors, IRC implementation still yields substantial improvements.
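The combining-loss figures quoted above can be reproduced with a few lines of Python. The sketch evaluates the ratio (mu + 1/G) / (mu + N_ue/G) from (10.65), where `mu` stands for the IRC parameter from (10.43); the function names are hypothetical, and the second function simply inverts the same ratio for the processing gain.

```python
import math


def combining_loss_db(mu_db, gain_db, n_ue):
    """Combining loss [dB] from the SINR ratio (mu + 1/G) / (mu + N_ue/G)."""
    mu = 10 ** (mu_db / 10)
    g = 10 ** (gain_db / 10)
    ratio = (mu + 1 / g) / (mu + n_ue / g)
    return -10 * math.log10(ratio)


def required_gain_db(mu_db, loss_db, n_ue):
    """Processing gain [dB] for which the combining loss equals loss_db."""
    mu = 10 ** (mu_db / 10)
    r = 10 ** (-loss_db / 10)
    # Solve (mu + x) / (mu + n_ue * x) = r for x = 1/G (needs n_ue * r > 1)
    x = mu * (1 - r) / (n_ue * r - 1)
    return -10 * math.log10(x)


loss_2 = combining_loss_db(10.0, 6.0, 2)   # about 0.11 dB
loss_4 = combining_loss_db(10.0, 6.0, 4)   # about 0.31 dB
extra_gain = required_gain_db(10.0, loss_2, 4) - 6.0   # close to 5 dB
```

The three values match the 0.11 dB, 0.31 dB and roughly 5 dB stated in the text, and the loss vanishes for a single receive antenna.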
[Figure: SINR ratio versus processing gain [dB] (5 to 15 dB); combining-loss curves for $N_{ue} = 2$ and $N_{ue} = 4$ with $10 \log_{10}(\mu)$ = 16, 12, 10, 8 and 4 dB, plus the demodulation loss.]
Figure 10.3 SINR ratio versus processing gain G. Combining losses from (10.65) for
$N_{ue} \in \{2, 4\}$ and a selection of typical values for $\mu$ are shown together with the
demodulation loss from (10.68) for $\mathrm{E}\{|x_1|^2\} = 1$.
10.2.4
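$$\hat{\Phi}_{z_1 z_1}(q, o) = \frac{1}{N_{scm}} \sum_{\tilde{q} \in Q,\, \tilde{o} \in O} \big( y_1(\tilde{q}, \tilde{o}) - x_1(\tilde{q}, \tilde{o})\, \hat{h}_1(\tilde{q}, \tilde{o}) \big) \big( y_1(\tilde{q}, \tilde{o}) - x_1(\tilde{q}, \tilde{o})\, \hat{h}_1(\tilde{q}, \tilde{o}) \big)^H$$
$$= \frac{1}{N_{scm}} \sum_{\tilde{q} \in Q,\, \tilde{o} \in O} \Big( \sum_{k=2}^{K} x_k(\tilde{q}, \tilde{o})\, h_k(\tilde{q}, \tilde{o}) + n(\tilde{q}, \tilde{o}) - x_1(\tilde{q}, \tilde{o})\, e(\tilde{q}, \tilde{o}) \Big) \Big( \sum_{k=2}^{K} x_k(\tilde{q}, \tilde{o})\, h_k(\tilde{q}, \tilde{o}) + n(\tilde{q}, \tilde{o}) - x_1(\tilde{q}, \tilde{o})\, e(\tilde{q}, \tilde{o}) \Big)^H,$$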
[Figure: SINR ratio versus number of samples $N_{scm}$; curves for $N_{ue} = 2$ and $N_{ue} = 4$.]
Figure 10.4 SINR ratio due to covariance matrix mismatch for varying number of
scheduling interval in time direction and within the resource block allocation size
in frequency direction. Some latency constraints may further reduce the number of samples that can be used in estimating the spatial interference-and-noise
covariance. Note that interference can also be vastly different between the time-frequency resources used for control signaling and those used for data transmission and, hence, estimation of the spatial interference-and-noise covariance also
needs to be restricted to the appropriate time-frequency resources. Furthermore,
the computation of the sample covariance matrix requires knowledge of some
symbols $x_1$, either from transmitted reference symbols or from decision feedback
of the desired received signal, which may also limit the number of samples that
can be used for estimation. For IRC, the sample covariance matrix needs to be
invertible and, hence, $N_{scm} \geq N_{ue}$. Note that there are typically only between 4
and 12 samples available for spatial interference-and-noise covariance estimation
when using UE-specific RSs and without decision feedback in the LTE Release 8
downlink [3GP07b].
The set Q of sub-carriers is ideally centered at the sub-carrier q where the
sample covariance matrix is computed. Similarly, the set O of OFDM symbols is ideally centered at the OFDM symbol o where the sample covariance
matrix is computed. However, centering the sets Q, O at q, o is not possible
at the boundaries of the time-frequency resource allocations where the spatial
interference-and-noise covariance may change. Moreover, it is desirable to use
the same covariance matrix over an entire range of sub-carriers q and OFDM
symbols o. The partitioning of the time-frequency resources into tiles with identical sample covariance matrix determines the main computational complexity
as given by the number of sample covariance computations and matrix inversions
of size $N_{ue} \times N_{ue}$ required for IRC.
h1 En
h
1
1
z1 z1
z1 z1 h1
!
SINR (
gIRC,b ) = H
1 h1
1 z z
h1 En
z1 z1
z1 z1
1 1
(10.71)
where $\mathrm{E}_n\{\cdot\}$ denotes expectation with respect to noise as part of the sample covariance matrix. Note that we have dropped the explicit dependency of
the sample covariance with respect to the sub-carriers q and OFDM symbols o
because the sample covariance matrix can be assumed to be constant over an
entire tile in time-frequency resources. For simplicity of the notation, we have
also dropped the dependency of the channel for the desired UE with respect to
the sub-carriers q and OFDM symbols o, which corresponds to a block-fading
approach in frequency and time directions. Equation (10.71) cannot be simplified
in general and the performance loss from imperfect covariance matrices needs to
be evaluated by simulations. However, there exists a simple and elegant solution for the SINR loss from covariance estimation in case the sample covariance
matrix is complex non-central Wishart distributed, which is the case when the
interference-and-noise samples can be assumed to be Gaussian distributed with
zero-mean [TC94]. This assumption holds (i) for the case of no interferers and
(ii) for the case of interferers with complex Gaussian signal alphabet, i.e. when
K = 1 or $x_k(\tilde{q}, \tilde{o})$ is Gaussian distributed in (10.70). Furthermore, the assumption of zero-mean Gaussian distributed interference-and-noise samples is a good
approximation for the case of multiple interferers and interferers with higher
order modulation. For complex non-central Wishart distributed sample covariance matrices, the SINR ratio can then be derived [TC94] as
$$\frac{\mathrm{SINR}(\hat{g}_{IRC,b})}{\mathrm{SINR}(g_{IRC})} = \frac{N_{scm} - N_{ue} + 1}{N_{scm}}. \tag{10.72}$$
Note that the SINR ratio from estimating the spatial interference-and-noise
covariance matrix is independent of the SNR and SIR for zero-mean Gaussian
distributed interference-and-noise samples. Equation (10.71) is evaluated by simulations for the case that the interference-and-noise samples are zero-mean Gaussian distributed without interferer, i.e. SIR = $\infty$, and for the case that there is
one interferer with, e.g., MPSK modulation, such that the samples are no longer
Gaussian distributed. Fig. 10.4 illustrates the theoretical and simulated SINR
losses for the cases of $N_{ue} = 2$ and $N_{ue} = 4$ dependent on the number of samples
$N_{scm}$, and for SNR = 0 dB. The theoretical results from (10.72) are shown as
solid lines, and the simulated results are depicted with crosses for $N_{ue} = 2$ and
circles for $N_{ue} = 4$, respectively. From the simulation results with one interferer,
the loss observed from sample covariance estimation is upper bounded by the
loss due to zero-mean Gaussian distributed interference-and-noise samples.
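The theoretical curves can be tabulated directly from the SINR ratio in (10.72), read here as (N_scm - N_ue + 1) / N_scm. The sketch below is only an illustration of that closed form; the function name and the chosen sample counts are assumptions.

```python
import math


def covariance_loss_db(n_scm, n_ue):
    """SINR loss [dB] from sample covariance estimation with n_scm samples."""
    if n_scm < n_ue:
        raise ValueError("sample covariance is not invertible for n_scm < n_ue")
    ratio = (n_scm - n_ue + 1) / n_scm
    return -10 * math.log10(ratio)


# Loss table for the two antenna configurations discussed in the text
table = {n_ue: [(n_scm, covariance_loss_db(n_scm, n_ue)) for n_scm in range(4, 25, 2)]
         for n_ue in (2, 4)}
```

With 6 samples the loss under this formula is about 0.8 dB for two antennas and about 3 dB for four antennas, which illustrates why the small number of samples available in practice matters more as the antenna count grows.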
10.2.5
En,e h1 z1 z1 h1
% H
& E{|x1 |2 }.
(10.73)
SINR(
gIRC,c ) =
,
,
1 h
1
En,e h1
z1 z1 z1 z1 z1 z1 1
and we evaluate the performance loss with the SINR ratio given by
SINR(
gIRC,c )/SINR(
gIRC ) by randomizing all relevant variables.
Fig. 10.5 shows the results for the SINR ratio versus SNR for one interferer with MPSK modulation and SIR = 0 dB, $N_{ue}$ = 2, 4 and processing gain
G = 6 dB and 15 dB for the channel scenario outlined in Section 10.2.2. The number of samples for covariance estimation is chosen as $N_{scm} = 6$ and $N_{scm} = 18$
to cover a range typical for LTE as well as a scenario with reduced losses from
covariance mismatch. The combined losses from channel estimation and spatial interference-and-noise covariance estimation are smaller than the sum of the
individual losses. From comparing Fig. 10.5 with Fig. 10.4, it can be seen that
at low SNR the channel estimation errors even compensate for some covariance
matrix mismatch such that the combined losses become smaller than the losses
from covariance matrix mismatch alone. The losses are not negligible but rather
small as compared to the gains from IRC versus MRC outlined in Section 10.2.2.
In particular, in the low SIR regime, the benefits of IRC versus MRC outweigh
the implementation losses by far. Note that the implementation losses are substantially increased with a larger number of antennas $N_{ue}$, which may also need
to be considered in overall performance and throughput evaluations for MIMO
systems.
10.2.6
Summary
This section has focused on the link-level assessment of IRC in the presence of
estimation errors due to low-complexity implementation. First, SINR expressions
for IRC and MRC were derived, showing that the MMSE approach yields exactly
the same SINR value as the studied IRC method. A basic performance analysis
in terms of achievable SINR after combining was conducted comparing IRC
[Figure: SINR loss versus SNR [dB]; curves for $N_{ue} \in \{2, 4\}$, G $\in$ {6 dB, 15 dB} and $N_{scm} \in \{6, 18\}$.]
Figure 10.5 SINR loss versus SNR for one interferer with MPSK modulation and SIR
In this chapter, we address how CoMP can be applied selectively and adaptively
to well-chosen sets of terminals in a mobile communications system. While Section 11.1 focuses on scheduling approaches, where a central scheduling unit performs multi-cell resource allocation in the context of non-cooperative or joint
transmission in a cellular downlink, Section 11.2 looks into radio link control
and signalling aspects connected to establishing CoMP on-demand. Finally, Section 11.3 ventures into the field of ad-hoc CoMP, where cooperation is established
flexibly after uplink transmission has already taken place.
11.1
11.1.1
Introduction
In previous chapters, we have typically observed transmissions between multiple
base stations (BSs) and user equipments (UEs) on a single orthogonal frequency
division multiplex (OFDM) sub-carrier, assuming that the assignment of system resources to the communicating entities has already taken place. While the
question of which BSs should in principle be clustered, and hence enabled to
cooperate, was already addressed in Chapter 7, we now want to look into the
question of how UEs can be assigned to system resources efficiently, such that
the performance under a particular transmission scheme is maximized. More
specifically, we will investigate
[Figure: cellular layout with sites of 3 base stations each (cells = sectors), an exemplary CoMP cluster, and the central scheduling unit (CSU).]
which UEs should simultaneously use the same physical resource block (PRB)
in different cells in the case of conventional, non-cooperative transmission, or
which UEs can be efficiently served on the same resources if downlink joint
transmission (JT) (see Section 6.3) is used.
In this respect, scheduling may explore the degrees of freedom of choosing
tuples of UEs whose links will be lightly affected by the mutual co-channel interference in the conventional, non-cooperative transmission case, or who can be
served efficiently on the same resources in the JT case. In the following, some
heuristic algorithms are described, and their results will show that, by making
use of the information fed back by the UEs and made available to a central
scheduling unit (CSU) through backhaul links, centralized scheduling provides
considerable gains compared to conventional, individual scheduling by the BSs.
11.1.2
System Model
In this subsection, the system model considered for the study of scheduling
algorithms is described. The downlink of a CoMP-enabled system with M BSs
is considered. These BSs are grouped into C clusters, where the sets of BSs
included are denoted as $\mathcal{M}_c$, c = 1, ..., C. All BSs within one cluster have a
backhaul link to a dedicated CSU. A setup where one exemplary CoMP cluster
is highlighted is shown in Fig. 11.1.
In the sequel, we shift our focus to this one cluster c, and assume it has a set
of UEs $\mathcal{U}_c$ that are served by the BSs in $\mathcal{M}_c$ and may be assigned to R available
PRBs, which are indicated by r = 1, 2, ..., R. For simplicity, in this section only
single-antenna BSs and UEs are considered, hence $N_{bs} = N_{ue} = 1$ according to
the notation used in previous chapters. Let us denote as $\mathcal{K}_r$ the set of all UEs
in the system that are assigned to resource r, and $\mathcal{K}_{c,r} \subseteq \mathcal{K}_r$ as the subset of
these UEs that are served by cluster c. Vector $h_k^c(r) \in \mathbb{C}^{[|\mathcal{M}_c| \times 1]}$ denotes the
channel coefficients connected to resource r, representing the links from UE k to
each BS m belonging to cluster c. Similarly to the derivation in Section 5.1, the
downlink signal-to-interference-and-noise ratio (SINR) experienced by a UE k
belonging to cluster c and assigned to resource r can be stated as
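$$\gamma_k(r) = \frac{\big|(h_k^c(r))^H w_k^c(r)\big|^2}{\underbrace{\sum_{j \in \{\mathcal{K}_{c,r} \setminus k\}} \big|(h_k^c(r))^H w_j^c(r)\big|^2}_{\text{Intra-cluster interference}} + \underbrace{\sum_{j \in \{\mathcal{K}_r \setminus \mathcal{K}_{c,r}\}} \big|(h_k(r))^H w_j(r)\big|^2}_{\text{Inter-cluster interference}} + \sigma^2}, \tag{11.1}$$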
(11.2)
(11.3)
which is used to model whether a given transmission has been successful. Here,
adaptive modulation takes into account 2-, 4-, 16- and 64-QAM as available modulation schemes. Since the focus here is on enhancing total throughput, adaptive modulation selects the QAM of order $Q = 2^q$ yielding the highest average
throughput, i.e.,
$$Q^\star = \arg\max_{Q \in 2^{\{1,2,4,6\}}} \big(1 - \mathrm{BLER}(\gamma)\big)\, L\, q. \tag{11.4}$$
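The selection rule in (11.4) can be sketched as a small search over the four modulation orders. The BLER curve used below is a purely hypothetical logistic model with made-up thresholds, and L is taken here as the number of symbols per block, which is an assumption; only the argmax structure follows the text.

```python
import math


def toy_bler(sinr_db, q):
    """Hypothetical BLER model: logistic curve around an illustrative threshold."""
    threshold_db = {1: 0.0, 2: 3.0, 4: 10.0, 6: 16.0}[q]
    return 1.0 / (1.0 + math.exp(sinr_db - threshold_db))


def select_modulation(sinr_db, bler=toy_bler, block_len=100):
    """Return (q, expected bits per block) maximising (1 - BLER) * L * q."""
    def expected_bits(q):
        return (1.0 - bler(sinr_db, q)) * block_len * q

    best_q = max((1, 2, 4, 6), key=expected_bits)
    return best_q, expected_bits(best_q)


# Selected bits per symbol over a coarse SINR grid
choices = [(sinr_db, select_modulation(sinr_db)[0]) for sinr_db in (-10, 0, 6, 12, 30)]
```

With a measured BLER table in place of `toy_bler`, the same search returns the modulation order with the highest expected throughput at the given SINR.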
11.1.3
PRB enhances the SINR of the UEs sharing this PRB, but reduces the SINR of
the UEs allocated on the other PRBs. Finally, the spatial compatibility among
UEs is PRB-dependent, i.e., UEs that are spatially compatible on a given PRB
might be incompatible on another [LZ06, MK10].
These relationships illustrate the strong interdependency among the above
subproblems, and the challenge of jointly solving them usually leads to computationally prohibitive solutions. All previously referred aspects affect signal
and/or interference levels and consequently the system performance. Moreover,
even for some subproblems, optimum solutions can already be very complex.
Thus, sub-optimal scheduling solutions are often preferred [FDH07, MK10].
Different objectives may be pursued by the scheduler (spectral efficiency, fairness, quality of service (QoS) requirements, etc.) and each objective results in its
own problem which may not have a known optimal solution. While conventional
schedulers share the same scheduling objectives, the additional information available to a centralized scheduler allows it to consider the impact of the resource
allocation at one BS on the other BSs within the same cluster. Moreover, the
additional control of joint scheduling introduces other degrees of freedom which
are exploited, e.g., by adapting the set of simultaneously transmitting BSs.
In this section, we focus on the resource allocation problem comprising PRB
assignment and precoding sub-problems, and we aim at maximizing system
throughput. If we assume a fixed and equal power distribution among PRBs, the
problem may be decoupled and solved separately for each PRB. This approach
is, in fact, used in the presentation of all algorithms in this section. We further
consider a fixed power control explained later and also fix the used CoMP scheme
and choice of precoders to the following two options:
Conventional transmission (CT): PRBs are reused by multiple BS-UE links within
a cluster, but each active UE is served exclusively by only one BS.
Joint transmission (JT): PRBs are reused by multiple BS-UE links within
a cluster, with all BSs sending linearly and jointly precoded data to all UEs
and, consequently, based on a zero-forcing (ZF) filter.
The remaining resource allocation problem is hence: For the case of
conventional transmission (CT), the CSU needs to determine which BS-UE links
can simultaneously use the same PRB. Assuming that the CSU has CSI on all links
within a cluster, it is able to estimate the impact of the intra-cluster interference
induced by the PRB reuse, and can dynamically determine which PRBs should
be assigned to which UEs served by which BSs. For the JT problem, the CSU
needs to find sets of UEs with good compound channel properties, which can be
efficiently served with spatial multiplexing on the same PRB. Maximizing system
throughput assuming CT or JT becomes a combinatorial problem whose optimal
solution may be computationally complex to find. Therefore, only sub-optimal,
but rather efficient solutions are considered herein.
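Table 11.1 Greedy scheduling algorithm for conventional transmission on PRB $r$.
...
Find the link $(k^\star, m^\star) = \arg\max_{k \in \mathcal{U}_c,\, m \in \mathcal{M}_c} |h_k^m(r)|^2$ with highest gain.
If $R(\mathcal{K}_{c,r} \cup \{k^\star\}, \mathcal{M}_{c,r} \cup \{m^\star\}) > R(\mathcal{K}_{c,r}, \mathcal{M}_{c,r})$, set $\mathcal{K}_{c,r} = \mathcal{K}_{c,r} \cup \{k^\star\}$, $\mathcal{M}_{c,r} = \mathcal{M}_{c,r} \cup \{m^\star\}$ and go to the previous step, otherwise finish.
...

Table 11.2 Scheduling algorithm for joint transmission on PRB $r$.
1. Find the UE $k^\star = \arg\max_{k \in \mathcal{U}_c} \|\mathbf{h}_k^c(r)\|$ with highest channel vector norm.
2. Assign initially the PRB $r$ to the UE $k^\star$ by defining the set $\mathcal{K}'_{c,r}$ of scheduled UEs of CoMP cluster $c$ on PRB $r$ as $\mathcal{K}'_{c,r} = \{k^\star\}$.
3. While $K' < K^\star$:
   a. Find $k^\star = \arg\max_{k \in \mathcal{U}_c \setminus \mathcal{K}'_{c,r}} \zeta(\mathcal{K}'_{c,r} \cup \{k\})$.
   b. Set $\mathcal{K}'_{c,r} = \mathcal{K}'_{c,r} \cup \{k^\star\}$.
4. Set $\mathcal{K}_{c,r} = \mathcal{K}'_{c,r}$.
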
link is the one which leads to the highest throughput and, in case of ties, the link
with highest channel gain is chosen. This procedure continues adding new links
as long as the cluster throughput increases. Otherwise, it finishes and goes to
the next PRB. The greedy algorithm for each PRB r in the case of conventional,
non-cooperative transmission can be stated as shown in Table 11.1.
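
The greedy procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: single-antenna BSs and UEs, equal transmit power on every active link, a Shannon-rate estimate that treats intra-cluster interference as noise, and at most one UE per BS on the PRB. The names `cluster_rate` and `greedy_ct_schedule` are hypothetical.

```python
import math

def cluster_rate(links, gain, noise=1.0, power=1.0):
    """Sum rate (bit/s/Hz) of the active (ue, bs) links on one PRB.

    gain[ue][bs] is the channel gain |h|^2 between a BS and a UE.
    Interference from the other active BSs is treated as noise.
    """
    total = 0.0
    for ue, bs in links:
        signal = power * gain[ue][bs]
        interference = sum(power * gain[ue][other_bs]
                           for _, other_bs in links if other_bs != bs)
        total += math.log2(1.0 + signal / (noise + interference))
    return total

def greedy_ct_schedule(gain, noise=1.0, power=1.0):
    """Add BS-UE links to one PRB as long as the cluster rate increases."""
    n_ue, n_bs = len(gain), len(gain[0])
    links, rate = [], 0.0
    while True:
        used_ues = {ue for ue, _ in links}
        used_bss = {bs for _, bs in links}
        candidates = [(ue, bs) for ue in range(n_ue) for bs in range(n_bs)
                      if ue not in used_ues and bs not in used_bss]
        if not candidates:
            break

        def score(link):
            # Primary criterion: resulting cluster rate; tie-break: link gain.
            return (cluster_rate(links + [link], gain, noise, power),
                    gain[link[0]][link[1]])

        best = max(candidates, key=score)
        best_rate = score(best)[0]
        if best_rate <= rate:
            break  # adding any further link would not increase throughput
        links.append(best)
        rate = best_rate
    return links, rate
```

With weakly coupled links, e.g. `gain = [[10.0, 0.01], [0.01, 8.0]]`, both links end up sharing the PRB, whereas with strongly coupled links, e.g. `[[10.0, 9.0], [9.0, 10.0]]`, only one link is scheduled.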
Heuristic Scheduling Algorithm for Joint Transmission
In this subsection, a heuristic, sub-optimal scheduling algorithm for the case of
JT is described, where the discussion is restricted to a single CoMP cluster $c$ and
PRB $r$, on which a group $\mathcal{K}^\star_{c,r}$ is built. For simplicity of notation, the indices $c$ and
$r$ are omitted in the sequel. For JT, the data symbols $x_k$ intended for all scheduled
UEs $k$ are made available to all BSs of the CoMP cluster and are precoded using
their associated precoding vectors $\mathbf{w}_k$ before transmission. For the spatial signal
separation, linear precoding can be employed [PNG03, JJT+09], while the efficiency of such separation strongly depends on the characteristics of the channel
vectors of the scheduled UEs. Therefore, a JT scheduler that only allows sharing PRBs among UEs with uncorrelated channels is usually employed [MK10].
Thus, the problem to be solved is choosing a group of $K^\star \leq M$ UEs that are
spatially compatible, i.e., that can efficiently share the same PRB. JT schedulers
for this problem are usually heuristic and composed of two elements: a spatial
compatibility metric and a user selection algorithm [MK10].
In the following, the spatial compatibility metric is discussed. It is employed
by the CSU to measure the spatial compatibility among UEs. In general, spatial
compatibility metrics are functions of the CSI (available at the CSU through
the backhaul) that try to map the characteristics of the spatial channels of the
UEs to a scalar value quantifying how efficiently these UEs can be separated in
space [MK10]. Such groups of UEs are termed a compatibility group.
When ZF precoding is employed, it has been shown that the sum of channel
gains with null-space successive projections represents an effective measure of
spatial compatibility, especially when aiming at maximizing the system throughput [MK10, TUBN06, YG05]. For a compatibility group $\mathcal{K}' = \{1, 2, \ldots, K'\}$,
the use of successive null-space projections imposes that the channel vector
$\mathbf{h}_{k'}$ of UE $k' \in \mathcal{K}'$ be projected onto the null-space of the channels of all UEs
$k \in \mathcal{K}'$, $k = 1, 2, \ldots, k'-1$ [TUBN06, YG05], i.e., a vector space orthogonal to
the channels of all UEs $k = 1, 2, \ldots, k'-1$.
Since signals conveyed through orthogonal channels do not interfere with each
other, the more orthogonal to $\mathbf{h}_k$ the channel vector $\mathbf{h}_{k'}$ of UE $k'$ is, the less its
squared norm (i.e. the channel gain) will be affected by the projection and the
more spatially compatible to the UEs $k = 1, 2, \ldots, k'-1$ the UE $k'$ will be.
Denoting as $\mathbf{T}_{k'} \in \mathbb{C}^{[M \times M]}$ the matrix that projects the channel vector $\mathbf{h}_{k'}$ of
UE $k'$ onto the null-space of the channels of UEs $1, 2, \ldots, k'-1$ [TUBN06], one
has
$$\mathbf{T}_{k'} = \begin{cases} \mathbf{I}_{[M]}, & \text{for } k' = 1, \\ \mathbf{T}_{k'-1} - \dfrac{\mathbf{T}_{k'-1} \mathbf{h}_{k'-1}^H \mathbf{h}_{k'-1} \mathbf{T}_{k'-1}}{\|\mathbf{h}_{k'-1} \mathbf{T}_{k'-1}\|_2^2}, & \text{for } k' > 1, \end{cases} \tag{11.5}$$
and the spatial compatibility metric of the group $\mathcal{K}'$ is
$$\zeta(\mathcal{K}') = \sum_{k'=1}^{K'} \|\mathbf{h}_{k'} \mathbf{T}_{k'}\|_2^2. \tag{11.6}$$
Note that according to (11.6), the higher the gain of the channels of the UEs
belonging to $\mathcal{K}'$ is, and the more orthogonal to each other they are, the larger
$\zeta(\mathcal{K}')$ becomes. Altogether, these high orthogonality degrees and high channel
gains result in an increased system throughput, rendering $\zeta(\mathcal{K}')$ a suitable spatial compatibility metric, especially for algorithms oriented towards throughput
maximization, as is the case herein.
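
As an illustration of the metric in (11.6), the following minimal sketch computes the sum of channel gains after successive null-space projection. It assumes that each channel is given as a plain Python list of real or complex coefficients, and it replaces the explicit projection matrices by an equivalent Gram-Schmidt orthogonalization; the function names `inner` and `compatibility_metric` are hypothetical.

```python
def inner(a, b):
    """Hermitian inner product <a, b> = sum(conj(a_i) * b_i)."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def compatibility_metric(channels):
    """Sum of squared norms of the successively projected channel vectors.

    channels: list of channel vectors, in the order in which the UEs
    were admitted to the compatibility group.
    """
    basis = []   # orthonormal basis spanning the channels seen so far
    total = 0.0
    for h in channels:
        residual = list(h)
        # Remove the components lying in the span of the earlier channels,
        # i.e. project onto their null-space.
        for q in basis:
            coeff = inner(q, residual)
            residual = [r - coeff * qi for r, qi in zip(residual, q)]
        gain = inner(residual, residual).real
        total += gain
        if gain > 1e-12:
            norm = gain ** 0.5
            basis.append([r / norm for r in residual])
    return total
```

Two orthogonal channels keep their full gains, so `compatibility_metric([[1, 0], [0, 2]])` evaluates to 5, while two parallel channels such as `[[1, 0], [2, 0]]` only yield the gain of the first one.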
In the following, the user selection algorithm considered in this subsection is
discussed. Its task is to arrange the UEs of the CoMP cell in a compatibility
group by using the spatial compatibility metric. Often, the optimum compatibility group can only be found through an exhaustive search over all possible
groups, so that sub-optimal, but rather efficient user selection algorithms are
desired. One such algorithm is the best fit algorithm [STKL01, DS04, Cal04],
which is also a greedy algorithm.
Starting from a compatibility group containing only an initial UE, the best
fit algorithm extends the group by sequentially admitting the most spatially
compatible UE with respect to those UEs already admitted to the compatibility
group. Let $\mathcal{K}' = \{k'\}$ be the initial compatibility group containing only the UE $k'$,
which is chosen as the UE with the highest channel norm because this leads to
the highest throughput for single-user transmission. Let $K'$ be the size of the
compatibility group $\mathcal{K}'$. Then, the best fit algorithm computes the spatial compatibility metric $\zeta(\mathcal{K}' \cup \{k\})$ for each UE $k \in \mathcal{K}_c \setminus \mathcal{K}'$. Then, the UE $k^\star$ which
leads to the highest value for the spatial compatibility metric $\zeta(\cdot)$ is admitted
to the group $\mathcal{K}'$. After that, the same procedure is repeated with the remaining
UEs and an additional UE is admitted to the group, and so on until the group
size $K'$ reaches the target compatibility group size $K^\star$. When using ZF precoding, the choice of $K^\star$ is fundamentally limited by the number of transmitting
antennas, which is the maximum number of UEs that can be multiplexed in this
case. However, additional restrictions such as a maximum number of scheduled
UEs per PRB may be applied to limit the amount of control information to be
exchanged at each TTI.
Combining the spatial compatibility metric described by (11.6) and the best
fit algorithm, an overall scheduling algorithm for JT can be derived, which is
presented in Table 11.2. Similarly to the scheduling algorithm for the CT case,
PRBs are processed sequentially. It should be noted that the scheduling algorithm does not compute precoding vectors $\mathbf{w}_k$ for any UE, thus avoiding a considerable amount of computations [MK10]. Additionally, it also does not involve
power allocation. This allows the algorithm to be more easily combined with different precoding and power allocation schemes. However, precoding and power
allocation should also be oriented to the same objective as the algorithm, namely
throughput maximization. Beyond the single-antenna case, the algorithm can be
straightforwardly adapted to cases considering multiple antennas at the communicating nodes by extending the channel vectors $\mathbf{h}_k(r)$ accordingly [TUBN06].
The effective channel (including the effect of transmit precoding) might be estimated at the UEs using pilot symbols, as discussed in Section 9.1. Alternatively,
fixed receive filters at the UEs might be considered at the transmitter using, e.g.,
a receiver-oriented design of the BSs' precoding vectors [MBQ04].
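
The best fit user selection for one PRB can be sketched as follows. The sketch relies on the observation that admitting a UE increases the metric in (11.6) exactly by that UE's channel gain after projection onto the null-space of the already admitted UEs, so the UE maximizing the metric is the one with the largest projected gain. Channels are assumed to be plain Python lists of real or complex coefficients, the function name `best_fit_group` is hypothetical, and the early stop for UEs that cannot be separated at all is an added assumption.

```python
def best_fit_group(channels, group_size):
    """Return the indices of the UEs admitted by the best fit rule."""
    def inner(a, b):
        return sum(x.conjugate() * y for x, y in zip(a, b))

    # Channel of every candidate UE, projected onto the null-space of the
    # channels of the UEs admitted so far (initially: unprojected).
    residual = {k: list(h) for k, h in enumerate(channels)}
    group = []
    while residual and len(group) < group_size:
        # First pass: UE with the highest channel norm.
        # Later passes: UE with the highest projected channel gain.
        k_best = max(residual,
                     key=lambda k: inner(residual[k], residual[k]).real)
        r = residual.pop(k_best)
        gain = inner(r, r).real
        if gain <= 1e-12:
            break  # no remaining UE can be separated from the group
        group.append(k_best)
        q = [x / gain ** 0.5 for x in r]
        # Update the projections of the remaining candidates.
        for k, v in residual.items():
            c = inner(q, v)
            residual[k] = [vi - c * qi for vi, qi in zip(v, q)]
    return group
```

For example, with `channels = [[3, 0], [2.9, 0.1], [0, 1]]` and `group_size = 2`, UE 0 is admitted first, and the weaker but orthogonal UE 2 is preferred over the stronger but almost parallel UE 1.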
11.1.4
[Table 11.3: simulation parameters. Only the value column survives: 7; 7 (i.e. 21 BSs and cells per cluster); 1 km; 50 m; 2 GHz; 15 kHz; 15 (with 12 sub-carriers each); 14; 1 ms; 35.3 + 37.6 log10(d) in dB; 8 dB; 3 km/h; TU [3GP08b]; ZF; 6 dB; 1 s; 3 to 12; 21.]
The path loss and shadowing are modeled according to [PDF+08]; these and
other simulation parameters are listed in Table 11.3. BS antenna patterns are
modeled according to [3GP07a], i.e.,
$$G^{(a)}(\theta_{k,m,c}) = -\min\left\{ 12 \left( \frac{18\,\theta_{k,m,c}}{7\pi} \right)^2,\ 20 \right\}\ \text{[dB]}. \tag{11.7}$$
Fast fading considers an average UE speed of 3 km/h and employs the typical
urban power-delay profile to model frequency selectivity [3GP08b]. When considering JT, linear ZF precoding is adopted as precoding technique due to its
simplicity and its ability to suppress intra-cell interference [MK10, YG05]. The
precoding vectors $\mathbf{w}_k^c(r)$ computed for the scheduled UEs according to the ZF
criterion are first scaled to become unit-norm vectors. After that, because each
BS has a limited transmit power available per PRB, all precoding vectors within
a CoMP cluster are scaled so that no BS spends more than this total transmit
power per PRB. This is easily accomplished as follows. First, the precoding vectors $\mathbf{w}_k^c(r)$ for all UEs $k \in \mathcal{K}_{c,r}$ scheduled to receive on PRB $r$ within the CoMP
cluster $c$ are organized in a precoding matrix $\mathbf{W}^c(r)$. Then, the total power spent
by a BS $m$ corresponds to the sum of the squared absolute values of the weights
it applies to each transmit signal, i.e., $\mathrm{tr}\{\mathbf{W}^m(r) (\mathbf{W}^m(r))^H\}$. Since the power
ratio among elements of each column may not be changed in order to preserve
the properties of ZF, the precoding matrix $\mathbf{W}^c(r)$ is simply scaled down so that
power constraints are fulfilled. Note that this means that the transmit power of
some BSs might not be fully used, which is sub-optimal.
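
The two scaling steps can be illustrated with a short sketch. It assumes single-antenna BSs, so that row `m` of the precoding matrix holds the weights applied by BS `m` and column `k` is the precoding vector of UE `k`, and it applies one common factor chosen such that the most heavily loaded BS exactly meets the per-PRB power limit. The function name `scale_precoders` and the parameter `p_max` are hypothetical.

```python
def scale_precoders(W, p_max):
    """Scale a precoding matrix so that no BS exceeds p_max on the PRB.

    W[m][k]: weight applied by BS m to the signal of UE k.
    Returns the scaled matrix and the resulting per-BS powers.
    """
    n_bs, n_ue = len(W), len(W[0])
    # Step 1: normalize every precoding vector (column) to unit norm.
    col_norm = [sum(abs(W[m][k]) ** 2 for m in range(n_bs)) ** 0.5
                for k in range(n_ue)]
    W = [[W[m][k] / col_norm[k] for k in range(n_ue)] for m in range(n_bs)]
    # Step 2: the power spent by a BS is the squared norm of its row.
    bs_power = [sum(abs(w) ** 2 for w in row) for row in W]
    # One common factor keeps the ratios within each column unchanged,
    # which preserves the zero-forcing property.
    scale = (p_max / max(bs_power)) ** 0.5
    scaled = [[scale * w for w in row] for row in W]
    return scaled, [scale ** 2 * p for p in bs_power]
```

For `W = [[1.0, 0.0], [1.0, 1.0]]` and `p_max = 1.0`, the second BS ends up at the power limit while the first one uses only a third of it, which is the sub-optimality noted in the text.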
For each BS, a number of UEs is uniformly distributed over the coverage
area of the cell. The transmit power of the BSs is set so as to grant an SNR of
at least 6 dB at the cell-edge considering the effects of pathloss, antenna pattern and shadowing (95% of reliability). It is assumed that BSs always have
data to transmit to the UEs, which make use of a best-effort service. Several
snapshots are considered in each simulation. During each snapshot, large-scale
fading is assumed constant while small-scale fading variation is modeled using
Jakes' model [Jak74]. A sufficient number of snapshots is simulated in order to
get reliable statistics about the system throughput.
[Figure 11.2: plot of system throughput; horizontal axis: load [UEs/sector].]
Fig. 11.2 shows the total throughput of the system averaged over all simulated snapshots as a function of the system load in UEs per cell. The following
transmission, scheduling and link adaptation schemes are compared:
Conventional transmission, scheduling and link adaptation. This corresponds to a conventional cellular system in which there is no coordination
or communication among sites. Each BS uses a local, greedy scheduler. For
a given PRB, it schedules the UE with the highest channel gain at each BS.
Thus, a full reuse of frequency resources is observed. In this scenario, link
adaptation is based on the interference perceived during the last transmission
to a UE on a PRB.
Conventional transmission and scheduling, but interference-aware
link adaptation. Here, the same local schedulers are used, but link adaptation is based on the assumption that each BS knows the scheduling decisions of
the other BSs and can precisely predict intra-cluster interference, as proposed
in Section 5.2.2 for the uplink.
Conventional transmission, centralized scheduling and interference-aware link adaptation, using the greedy scheduler for CT proposed before.
Joint transmission, conventional scheduling and interference-aware
link adaptation, where the UEs with the highest channel gains in the cluster
are scheduled for JT, and two UEs may not be scheduled to the same BS on
the same resource.
Joint transmission, centralized scheduling and interference-aware
link adaptation, using the greedy scheduler for JT proposed before.
Note that for all schemes, inter-cluster interference is estimated as interference
perceived during the last transmission to the UE on the PRB.
Regarding non-cooperative transmission, we can see that a large performance
gain of 100% to 120% can already be achieved if knowledge on intra-cluster
interference is used for link adaptation, as also observed in Section 5.2 for the
uplink. Performance can further be increased by around 20% if the proposed
centralized scheduling algorithm for CT is used. Joint transmission in general
performs significantly better than non-cooperative transmission, as also observed
in Sections 6.3 and 6.4, but we can see that a throughput improvement of about
10% can additionally be obtained if the greedy, centralized scheduling scheme
proposed in this section is used, as opposed to classical scheduling at each BS.
11.1.5
Summary
In this section, multi-cell centralized scheduling algorithms oriented towards the
maximization of system throughput have been presented, for a downlink system
based on non-cooperative or multi-cell joint transmission. While both algorithms
are heuristic and have low complexity, the results presented in this section have
illustrated the huge potential of intelligent scheduling to provide high data rates
in CoMP systems.
This section has concentrated on the downlink. However, similar relative performances are expected in the uplink if a sufficient amount of CSI is available
at each cluster. The studies considered here have assumed relatively idealized
conditions. Real systems have to deal with further challenges such as backhaul
constraints, signaling overhead, limited or outdated CSI and synchronization
issues which degrade system performance, as discussed in various other parts of
the book.
11.2
This section focuses on decentralized radio link control¹, where each BS typically controls the UEs of its cell, though some aspects also apply to centralized
control, where a central node controls all UEs of one or more CoMP clusters. In
general, uplink (UL) and downlink (DL) transmissions need radio link control,
but some control loops only refer to UL transmissions, e.g., UL power control, UL
timing advance, etc. In 3GPP LTE, frequency division duplex (FDD) and time
division duplex (TDD) use the same radio link control and they face basically
the same problems; only the radio link measurements may differ if the channel
is reciprocal. Signaling refers to the direct communication of cooperating BSs
when using decentralized radio link control and to the indirect communication
via a central node when centralized radio link control is applied.
11.2.1
Resource Allocation
Resource allocation, as considered from a physical layer point of view in Section 11.1, is the process where BSs allocate radio resources in time and frequency to certain UEs. The BSs need to answer the question of which and how
many resources to allocate. The decision on which resources to allocate is based
on information about the (predicted) radio link quality, e.g., the current radio
channel state including slow and fast fading as well as the interference situation.
The decision on how many resources to allocate is based on information about the
traffic demand, e.g., buffer fill levels and quality of service (QoS) requirements.
All information needs to be available at the serving BS performing resource
allocation.
In cooperative transmission and reception schemes as introduced in Chapters 5
and 6, the resulting radio link quality changes compared to a non-cooperative
transmission, and the resource allocation should be based on the quality of the
cooperative radio link between the UE and the BS antennas in multiple cells,
see also Chapter 9. Inter-BS communication could be used to exchange channel
information between cooperating BSs so that they know the quality of the cooperative radio link. However, such an information exchange between BSs might
not be possible, e.g., due to the lack of an appropriate interface (limited capacity and/or long delays). In that case, the serving BS could try to estimate the
improvement of the radio link quality due to cooperation, e.g., an offset could
be added to the estimated signal-to-interference-and-noise ratio (SINR).
Having estimated the quality of the cooperative radio link, each BS can allocate resources independently of the other BSs, or a set of BSs can do some form of
joint resource allocation in order to better take interference into account, as proposed in Section 5.2. In the former case, the process does not differ from regular
¹ Note that the distinction between centralized and decentralized radio link control does not correspond to the classification of centralized and decentralized CoMP schemes, which refers to
the place where decoding (uplink) and encoding (downlink) are performed, see Section 4.
11.2.2
Link Adaptation
The selection of modulation and coding schemes (MCSs) is carried out by the
serving BS. Based on the estimated SINR, the BS selects the MCS that maximizes the user throughput. Since CoMP can increase the SINR perceived at the
receiver, link adaptation in CoMP-enabled networks should not be based on the
SINR of the BS-UE link, but on the increased SINR after cooperation. Thereby,
the BS can select a more aggressive MCS resulting in a higher throughput.
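
The effect of basing link adaptation on the SINR after cooperation can be illustrated with a toy MCS selection. The MCS table, its SINR thresholds and the CoMP offset below are purely hypothetical values chosen for illustration; they are not taken from LTE or from this chapter.

```python
# Hypothetical MCS table:
# (name, spectral efficiency in bit/s/Hz, minimum required SINR in dB).
MCS_TABLE = [
    ("QPSK 1/2", 1.0, 2.0),
    ("16QAM 1/2", 2.0, 8.0),
    ("16QAM 3/4", 3.0, 12.0),
    ("64QAM 3/4", 4.5, 18.0),
]

def select_mcs(sinr_db, comp_offset_db=0.0):
    """Pick the feasible MCS with the highest spectral efficiency.

    comp_offset_db models the estimated SINR improvement due to
    cooperation; 0 dB corresponds to non-cooperative link adaptation.
    """
    effective_sinr = sinr_db + comp_offset_db
    feasible = [mcs for mcs in MCS_TABLE if mcs[2] <= effective_sinr]
    if not feasible:
        return None  # no MCS can be supported at this SINR
    return max(feasible, key=lambda mcs: mcs[1])
```

With an estimated SINR of 9 dB on the BS-UE link, `select_mcs(9.0)` returns the 16QAM 1/2 entry, whereas an assumed cooperation gain of 4 dB, `select_mcs(9.0, 4.0)`, allows the more aggressive 16QAM 3/4 entry.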
One way to estimate the channel quality after cooperation is to measure the
radio links involved in the cooperation and to gather and combine the measurements at the serving BS, see also Section 9.1 and details provided later in
Section 11.2.3. Besides channel quality, the SINR is also determined by interference, which can either be estimated from previous transmission attempts, or be
more accurately predicted if cooperating BSs exchange their schedules prior to
link adaptation (see Section 5.2). Exchanging schedules between BSs of course
requires additional signaling and increases delay.
Alternatively, the serving BS could estimate the SINR increase due to cooperation without additional inter-BS or UE to BS signaling. For instance, mobility measurements, which are anyway reported by a UE, give an indication on
the pathloss to candidate BSs. Such a technique was for instance described
in Section 7.2. For CoMP transmissions, which last over several hybrid automatic repeat request (HARQ) round-trip times (RTTs), the number of HARQ
re-transmissions that indicates the actual block error rate (BLER) can be considered when setting the MCS. During such a transmission period, the MCS could
be adapted to better meet the BLER target, which maximizes throughput. If
the MCS selection is not adapted to the increased SINR obtained with cooperation, CoMP only reduces the BLER leading to fewer re-transmission requests.
Although this slightly reduces the packet delay, it is desirable to operate the
HARQ at a more spectrally efficient BLER.
11.2.3
much less signaling traffic. Examples of such radio link measurements in LTE
are rank indicator (RI), precoding matrix indicator (PMI), and channel quality
indicator (CQI). One could even further reduce the signaling load by reporting
only long-term measurements such as pathloss coefficients.
In general, the information exchange between the nodes can be on-demand or
periodic. In the former case, the serving BS can request the required information
for a specific cooperation attempt with specific BSs on-demand. BSs not involved
in the cooperation and UEs attached to those BSs are not required to measure or
report anything. With periodic signaling, all candidate links have to be measured
and reported independently of the actual need for cooperation, and UEs and/or
BSs continuously exchange CoMP-related information.
11.2.4
(11.8)
Here, $P_{\max}$ is the maximum UE transmit power, which is of course the upper
limit of the actual UE transmit power. $P_0$ can be seen as the (cell-specific) desired
receive power, which is transmitted by the BS as part of the LTE system information. The term $10 \log_{10}(R)$ reflects the fact that for a larger number of assigned
resource blocks $R$ a higher received power and thus a correspondingly higher
transmit power is needed. The parameter $\Delta_{\mathrm{MCS}}$, configured by the BS, adds an
MCS-dependent power offset, which reflects the different SINR requirements per
MCS. $\alpha \cdot PL_{\mathrm{DL}}$ is part of the open-loop power control component, where each UE
selects an appropriate transmit power to compensate a fraction $\alpha$ of the estimated DL pathloss to the serving cell, $PL_{\mathrm{DL}}$. The DL pathloss is derived from
the signal strength of the DL reference signals. Without cooperation, a UE only
detects the DL reference signals of its serving BS to estimate the pathloss.
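
The interplay of the terms described above can be illustrated with a small sketch. It is assembled only from the quantities named in the text (maximum power, desired receive power, number of resource blocks, MCS-dependent offset and fractional pathloss compensation) and deliberately omits any closed-loop correction; it is an assumption-laden illustration and not a reproduction of (11.8). The function and parameter names are hypothetical.

```python
import math

def ul_tx_power_dbm(p_max, p0, n_rb, delta_mcs, alpha, pl_dl):
    """Open-loop UE transmit power in dBm (closed-loop terms omitted).

    p_max:     maximum UE transmit power in dBm
    p0:        desired receive power in dBm
    n_rb:      number of assigned resource blocks
    delta_mcs: MCS-dependent power offset in dB
    alpha:     fraction of the DL pathloss that is compensated (0..1)
    pl_dl:     estimated DL pathloss to the serving cell in dB
    """
    open_loop = p0 + 10.0 * math.log10(n_rb) + delta_mcs + alpha * pl_dl
    return min(p_max, open_loop)
```

For instance, with `p0 = -80`, 10 resource blocks, no MCS offset, `alpha = 0.8` and a pathloss of 100 dB, the UE would transmit at about 10 dBm, well below a limit of 23 dBm; for a cell-edge UE with full compensation and 120 dB pathloss the power limit becomes active.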
With cooperation, this component should also take the pathloss to supporting
BSs into account. This is tricky because UEs would then need to know which
cells actually cooperate, which would require additional BS-UE signaling to preconfigure supporting BSs. This signaling would reduce the serving BS's ability to
react quickly to changing transmission conditions by selecting supporting cells
on-demand on a subframe basis. Furthermore, UL CoMP can be designed to be
transparent to the UE, which would allow the support of legacy UEs. Including
the pathloss to supporting BSs in the open-loop power control component would
11.2.5
11.2.6
Figure 11.3 Suspension of HARQ process for uplink CoMP in 3GPP LTE.
Figure 11.4 Usage of two transport blocks per HARQ process for uplink CoMP.
11.2.7
Handover
In cellular systems, such as LTE, a given terminal is associated to one serving cell.
In general, the serving cell is chosen based on the BS-to-UE radio link quality.
UEs regularly measure the signal strength of neighbor cells and report to their
serving BS. As soon as the signal of the serving cell is received with lower signal
strength compared to the signal of a neighbor cell, a handover is triggered by the
serving BS. In general, thresholds and a hysteresis are used to avoid ping-pong
effects. After the handover, the target cell with the best radio link is in charge
of the UE and becomes its new serving cell.
By means of CoMP, the effective signal quality and thereby user and cell
throughput increases. In order to cooperate, data and control information have to be exchanged
via the transport network connecting the sites. If the transport network, especially the serving BS's backhaul link, is highly loaded, the serving BS is not able
to perform CoMP. Therefore, improved user and cell throughput due to CoMP
may be limited by the backhaul link capacity. Different BSs may have different
limitations on their backhaul link, e.g. some may be connected via fibre, others
via leased telephone lines (E1/T1).
The handover algorithm could mitigate this limitation by considering the
backhaul capacity and the current backhaul load. For instance, a UE that is not
supported by CoMP due to the limited (copper) backhaul of its serving BS could
be handed over to a different serving BS with free (fibre) backhaul resources.
With CoMP support at the target BS, the performance could be enhanced. An
adapted algorithm could trigger a handover for active UEs with a backhaul-limited serving BS such that the new (target) serving BS has free backhaul
capacity to cooperate. Accordingly, the cell-selection criterion of UEs in idle
mode could be modified. The cell selected in this way may not have the best (non-cooperative) radio link quality, which is typically used as a criterion to select
a serving cell, but due to BS cooperation the resulting user and cell throughput may be
optimized.
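
A backhaul-aware handover decision of the kind outlined above could look as follows. This is a sketch under several assumptions that are not part of the text: signal strength is reported in dBm, the backhaul headroom of each BS is known as a single number, and the thresholds `hysteresis_db`, `comp_need_mbps` and `max_loss_db` as well as the class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Cell:
    name: str
    signal_dbm: float          # signal strength reported by the UE
    free_backhaul_mbps: float  # unused backhaul capacity of the BS

def pick_target(serving, neighbors, hysteresis_db=3.0,
                comp_need_mbps=50.0, max_loss_db=6.0):
    """Return the handover target cell, or None to stay in the serving cell."""
    # Conventional rule: a neighbor must exceed the serving cell by a
    # hysteresis margin, which avoids ping-pong effects.
    best = max(neighbors, key=lambda c: c.signal_dbm, default=None)
    if best is not None and best.signal_dbm > serving.signal_dbm + hysteresis_db:
        return best
    # Backhaul-aware extension: if the serving BS cannot support CoMP,
    # accept a slightly weaker cell whose BS has free backhaul capacity.
    if serving.free_backhaul_mbps < comp_need_mbps:
        candidates = [c for c in neighbors
                      if c.free_backhaul_mbps >= comp_need_mbps
                      and c.signal_dbm >= serving.signal_dbm - max_loss_db]
        if candidates:
            return max(candidates, key=lambda c: c.signal_dbm)
    return None
```

A UE served at -90 dBm by a BS with almost no backhaul headroom would thus be moved to a neighbor received at -93 dBm with ample free backhaul, although the conventional criterion alone would keep it in its current cell.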
11.2.8
Inter-BS Signaling
According to LTE operation, each UE is associated to one serving cell, and the
corresponding BS controls the radio link, e.g., resource allocation, link adaptation, etc. With CoMP, transmissions in multiple cells are coordinated, and the
radio link control can be done in a centralized or distributed manner.
With centralized radio link control, a master entity controls and coordinates
the transmissions in multiple cells. In order to perform resource allocation, link
adaptation, power control etc., the master requires the above mentioned radio
link measures, such as CQI, RI, CSI, buffer fill levels, etc. Such information has
to be centrally collected by the master, while control commands have to be distributed to the sites. In such a setup, where the radio link control is centralized,
the baseband data processing could also be centralized. In that case, UL signals
are forwarded from the antenna sites to the master, which jointly processes them;
DL signals are fed from the master to the sites. The sites are then equipped with
simple nodes, such as remote radio heads (RRHs), and all complex processing
is performed in powerful BSs, see also Section 11.1. With such a scheme, fixed
CoMP clusters are determined by the cells coordinated by one master, also refer
to Section 7.1. Within that cluster, all transmissions are coordinated, but coordination across cluster boundaries is not possible. Since there is a logical tree
topology between the master and each site of the cluster, UL and DL signals are
transmitted only once between a site and the master (in contrast to the following
distributed radio link control). However, since the network control is located in
the master, data (and control) needs to be transferred between the antenna sites
and the master for every UL and DL transmission irrespective of the CoMP gain
(in contrast to the following distributed radio link control).
With distributed radio link control, the communication with the UE is still
controlled by the serving cell, although the UE signal can be jointly received
or cooperatively transmitted by several cooperating cells. As a result, there is a
logical mesh topology between cooperating cells, which is composed of multiple
individual tree topologies between each serving cell and its supporting cells.
Since a given cell can support multiple serving cells at the same time, UL and
DL signals may be exchanged among multiple cells. However, network control
remains in the BSs, and each BS is capable of handling UL and DL transmissions
on its own, i.e., CoMP can be disabled. This is especially useful for UEs with
very good channel conditions to the serving cell, which would not benefit much
from CoMP. A distributed control scheme allows adapting (i.e. decreasing and
increasing) the cluster size: the selection of supporting cells can be done in a UE- or cell-specific manner.
In a cell-specific selection, cells which benefit most from the CoMP scheme,
e.g., cells with large overlapping areas, would be clustered. This could be done by
the operation and maintenance system based on network planning and the result
would be fixed clusters, which corresponds to the centralized scheme above. The
cell-specific selection can be made more dynamic by re-configuring the clusters
during operation based on measurements, such as UE location and signal quality.
Here, cells would be clustered so that certain hotspots or certain areas with bad
link quality benefit from cooperation, see also Section 7.2. However, with cell-specific clustering the supporting cells are not optimal for all UEs of a cell.
In a UE-specific cell selection, the serving BS could request cooperation from
one or more supporting BSs for certain UEs. Here, each UE always has the optimal cluster of supporting cells for the given cooperation mode. Consequently,
there are no cluster boundaries anymore where transmissions cannot be coordinated; each UE is always in the middle of its own cluster.
An example of UE-specific signaling for decentralized UL joint detection based
on IQ sample exchange is shown in Fig. 11.5. Basics on UL joint detection were
provided in Sections 6.1 and 6.2, and a simulative performance evaluation of
the particular scheme considered here can be found in Section 14.3. First, the
serving BS does the scheduling. As described above, resource allocation, link
adaptation and power control can be adapted to the mode of cooperation. Then,
the serving BS requests support from one or more supporting cells for a particular UE transmitting on certain resources. The corresponding message to request
IQ samples for certain RBs is named IQ_req(RBs) in Fig. 11.5. UEs requiring
cooperation and the corresponding supporting cells are selected on-demand. As
will be described in the sequel, UE and BS selection can be based on different parameters, such as location, pathloss, actual channel realization, etc.
Having received the UE's data (physical uplink shared channel (PUSCH)) or
control channel (physical uplink control channel (PUCCH)) on the indicated
resources, the supporting BS transfers the requested IQ samples to the serving
BS. The corresponding response carrying IQ samples and other optional parameters is named IQ_rsp(IQ, opt. params) in Fig. 11.5. The serving BS performs
joint reception (using maximum ratio combining (MRC) or interference rejection combining (IRC)) based on the IQ samples received from cooperating BSs
in conjunction with its own received signal. Finally, it checks if the reception was
correct and prepares the transmission of HARQ feedback.
[Figure 11.5: message sequence between UE, serving BS and supporting BS: scheduling, IQ_req(RBs), PUxCH, Rx (at both BSs), IQ_rsp(IQ, opt. params).]
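
The request/response exchange and the subsequent joint reception can be sketched as follows. The sketch assumes one IQ sample per resource block, flat channels with equal noise power at both BSs, and channel estimates for both links being available at the serving BS (e.g. conveyed as optional parameters); MRC is used as the combiner. The dictionary layout of the messages and all function names are hypothetical and only loosely mirror the message names of Fig. 11.5.

```python
def handle_iq_req(iq_req, rx_buffer):
    """Supporting BS: answer an IQ request with the samples of the requested RBs."""
    return {"iq": {rb: rx_buffer[rb] for rb in iq_req["rbs"]}}

def mrc_combine(samples, channels):
    """Maximum ratio combining of several observations of the same symbol."""
    num = sum(h.conjugate() * y for h, y in zip(channels, samples))
    den = sum(abs(h) ** 2 for h in channels)
    return num / den

def joint_receive(own_rx, own_h, support_rx, support_h, rbs):
    """Serving BS: request IQ samples for the given RBs and combine them."""
    iq_rsp = handle_iq_req({"rbs": rbs}, support_rx)
    return {rb: mrc_combine([own_rx[rb], iq_rsp["iq"][rb]],
                            [own_h[rb], support_h[rb]])
            for rb in rbs}
```

In the noise-free case, where both BSs receive the channel-weighted transmit symbol, `joint_receive` returns the transmitted symbol for every requested RB; samples of RBs that were not requested are never transferred.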
A hybrid approach combining a centralized control scheme with a decentralized CoMP scheme has the advantage that the cluster size can be adapted in
the sense that the actual CoMP cluster can be a subset of the control cluster.
However, it still has the drawback of a new network entity requiring a continuous
exchange of control information with the sites.
UE Selection
A UE-specific CoMP scheme with distributed control requires a proper selection
of UEs. In CoMP schemes where the backhaul traffic and the computational
complexity scale with the number of supported UEs, it could be beneficial not
to select all UEs of a cell for cooperation. If too many UEs are selected for
CoMP, the resulting backhaul load might be overwhelming, or processing time
may become a critical resource. If the wrong UEs or if too few UEs are selected,
then the potential gain of CoMP cannot be fully exploited. Various different
methods could be used aiming at optimizing different parameters:
Relative channel quality: This method aims at selecting UEs for which the
quality of the channels, e.g., pathloss, towards cooperating cells is relatively
close to the quality of the channel towards the serving cell. This method allows
selecting UEs that are close to the cell-edge. These UEs are usually suffering
from high co-channel interference.
Current radio link performance: This method aims at selecting UEs that
have bad radio link performance, e.g., bad absolute channel quality to the
serving cell or very active co-channel cells generating lots of interference.
Data rate improvement: This method aims at selecting UEs that would
experience the largest throughput increase due to CoMP. The selection would
be based on the difference of the (estimated) user throughput with and without
CoMP support. Depending on the expression used to measure the throughput
increase, this method can lead to maximum cell capacity.
Geographic location: This method uses the geographic location to select UEs
for CoMP mode, which can be obtained by means of different techniques,
e.g., cellular location methods (see Section 15.1), or global positioning system
(GPS) measurements reported by outdoor UEs.
Type of application: This method uses service or application-specific parameters to select UEs to operate under CoMP mode. UEs in CoMP mode perceive
increased data rates, but they might see slightly higher packet delays. Furthermore, the setup of the CoMP mode may take time. CoMP is therefore well
suited for all kinds of services requiring large bit rates over a certain period of
time. Such services are, e.g., file download, file sharing, (high definition) video
streaming, software updates, mailbox synchronization, harddisk backup, etc.
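
As an example of the first criterion in the list above, the following sketch selects UEs by relative channel quality. It assumes that pathloss values in dB towards the serving and the candidate cells are available per UE; the selection window `window_db`, the optional cap `max_ues` (reflecting limited backhaul or processing resources) and the function name are hypothetical.

```python
def select_comp_ues(pathloss_db, serving, window_db=6.0, max_ues=None):
    """Select UEs whose best neighbor pathloss is close to the serving one.

    pathloss_db: {ue: {cell: pathloss in dB}}
    serving:     {ue: name of the serving cell}
    Returns the selected UEs, most cell-edge-like first.
    """
    margin = {}
    for ue, pl in pathloss_db.items():
        others = [v for cell, v in pl.items() if cell != serving[ue]]
        if others:
            # Small margin: a neighbor is almost as strong as the serving cell.
            margin[ue] = min(others) - pl[serving[ue]]
    chosen = sorted((ue for ue, m in margin.items() if m <= window_db),
                    key=margin.get)
    return chosen if max_ues is None else chosen[:max_ues]
```

A UE with 100 dB pathloss to its serving cell and 103 dB to a neighbor is selected, whereas a UE with 90 dB and 120 dB, respectively, is served conventionally.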
Supporting BS Selection
A UE-specific CoMP scheme requires the proper selection of supporting BSs for
each selected UE. Even a cell-specific CoMP scheme requires the selection of
supporting BSs for a given serving BS. Again, the careful selection of supporting
BSs is important in order not to overload the backhaul network and the BS processors.
This holds for CoMP schemes where the backhaul traffic and the computational
complexity scale with the number of supported BSs. BSs of the same site are
less critical to select, since the information can be exchanged at no backhaul
expense, whereas information between BSs of different sites is exchanged via
11.2.9
Summary
In this section, the potential modifications of existing radio link control loops
and signalling aspects of practical implementations of CoMP schemes in cellular
systems were discussed using the example of 3GPP LTE. Most radio link control
loops are affected when introducing CoMP. The biggest challenge is to get the
right radio link measures, i.e., the measures of the radio channel between antennas of multiple BSs and the UE, at the right place, i.e., the place where the radio
link is controlled. Only the HARQ and UL timing advance procedures impose
stricter constraints.
The signalling corresponding to the communication between cooperating BSs
mainly depends on whether control is centralized or not. When using decentralized radio link control, the corresponding signalling can trade off complexity,
backhaul load and latency. An example signalling scheme for a backhaul-efficient,
UE-specific, on-demand decentralized UL CoMP scheme has been introduced.
11.3 Ad-hoc CoMP
Michael Grieger, Patrick Marsch and Gerhard Fettweis
In this section, the concept of ad-hoc CoMP is introduced for the cellular uplink.
In this concept, a certain cooperation strategy is decided upon after uplink transmission has already taken place. Furthermore, the extent of cooperation may
be progressively adapted until successful decoding is possible. Both concepts
make use of the fact that the channel knowledge during detection and decoding is more accurate than at the time of scheduling. The topic is motivated in
Subsection 11.3.1, after which concrete schemes are proposed for two particular
scenarios in Subsections 11.3.2 and 11.3.3, respectively. Subsection 11.3.4 discusses to what extent the concept of ad-hoc CoMP may shed a different
light on hybrid automatic repeat request (HARQ), followed by a summary in
Subsection 11.3.5.
11.3.1 Introduction
Opportunistic Communication and Scheduling
The volatile nature of the wireless channel has long been seen as a burden,
complicating the life of wireless system engineers. In recent years, however, the
perspective has changed, bringing wireless channels into a more favorable light
as summarized by David Tse and Pramod Viswanath in [TV05]. In a cellular system, available time and frequency resources are shared among multiple users. Provided the channel state can be tracked with sufficient accuracy, channel
fluctuations can be exploited by scheduling users on those time and frequency
resources for which channel conditions are the best. A particular channel condition can then be exploited by matching modulation and coding schemes (MCSs)
as well as signal parameters such as transmit power to the channel state, a concept referred to as link adaptation. Centralized scheduling schemes that exploit
channel fluctuations for improved CoMP system performance were already presented in Section 11.1.
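As a minimal sketch of link adaptation, the following example picks the most efficient MCS whose SNR requirement is met by the estimated channel. The MCS table, its SNR thresholds, and the names `MCS_TABLE` and `pick_mcs` are hypothetical and chosen purely for illustration; they are not taken from any standard. The optional back-off models a safety margin against imperfect channel knowledge.

```python
from typing import List, Optional, Tuple

# Hypothetical MCS table: (name, minimum SNR in dB, spectral efficiency in bit/s/Hz).
MCS_TABLE: List[Tuple[str, float, float]] = [
    ("QPSK 1/2", 2.0, 1.0),
    ("16QAM 1/2", 8.0, 2.0),
    ("16QAM 3/4", 12.0, 3.0),
    ("64QAM 3/4", 18.0, 4.5),
]


def pick_mcs(estimated_snr_db: float,
             backoff_db: float = 0.0) -> Optional[Tuple[str, float, float]]:
    """Return the most efficient MCS supported at the estimated SNR minus a back-off."""
    usable = [mcs for mcs in MCS_TABLE if mcs[1] <= estimated_snr_db - backoff_db]
    if not usable:
        return None  # no MCS is supported: the UE should not be scheduled
    return max(usable, key=lambda mcs: mcs[2])


print(pick_mcs(13.0))
print(pick_mcs(13.0, backoff_db=2.0))
```

A larger back-off lowers the probability that the assigned rate exceeds what the channel supports, at the price of a lower scheduled rate.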
Imperfect CSI at the Scheduler
It is obvious that the performance of these scheduling concepts depends strongly
on the amount and quality of channel information that is available at the scheduler, which should be as up to date, accurate and extensive as possible, i.e.,
ideally including information on interference, distortion, radio frequency (RF)
imperfections, etc. In reality, however, the picture that is available at the scheduler is fairly noisy, and is akin to an image seen through a narrow lens. For CoMP
systems in particular:
Channel links to different base stations (BSs) have diverse gains and are, therefore, hard to estimate. Additionally, there might be interference between pilot
signals (see Section 9.1) which further impedes accurate channel estimation.
Joint scheduling, which promises huge gains in terms of total throughput and
fairness by exploiting multi-user diversity, requires that channel state information (CSI) be forwarded to a central scheduling node and the scheduling
decision be forwarded to the user equipments (UEs). Due to this scheduling
delay, the scheduler bases its decision on outdated information.
Hence, while making its decision, the scheduler relies on imperfect CSI. Consequently, transmission errors are inevitable, because there is a probability that
the scheduler assigns a transmission rate which is not supported by the channel.
As described in Section 11.2, in contemporary standards of cellular systems like
LTE, the impact of transmission errors is reduced by using HARQ techniques.
[Figure: flow chart of the scheduling process. Steps: estimate channel at BSs; forward CSI to scheduler; scheduling (resource allocation, link adaptation, CoMP mode, rate adaptation); send uplink grants to UEs; Ad Hoc CoMP (refine CoMP mode, take decoding success into account).]
11.3.2
[Figure: timeline of scheduling, transmission, and decoding. Steps: BSs forward channel estimates to global scheduler; delay; scheduler determines compression code, decoding order, inst. achievable rates, and scheduled rates; communication of the scheduling decision; delay; channel state has changed; transmission; joint decoding with adaptive compression.]
CSI that does not describe the channel that is used for the transmission of user
data perfectly. In cellular systems, the channel varies for two main reasons:
fast fading,
time-varying interference, particularly in systems with little interference averaging such as LTE.
Assuming that these are the only kinds of CSI impairments that could occur,
the CoMP mode can be adapted such that the available backhaul capacity is
used optimally. For all users that can be decoded locally, CoMP would not be
used at all. The same is true for users that could not be decoded even with the
maximum CoMP support available. At the same time, different uplink CoMP
schemes such as distributed interference subtraction (DIS) or DAS as introduced
in Section 4.3.1 would be used whenever they are deemed most effective. In the
sequel, we study the effect of outdated CSI in a distributed antenna system with
centralized decoding (see Section 4.3.1).
Example: Adaptive Compression in a Distributed Antenna System
As introduced in Section 4.3.1, in a DAS with centralized decoding, one of the BSs
functions as a joint decoder of codewords transmitted by all UEs in the CoMP
cluster, and all other BSs function as remote radio heads (RRHs), forwarding
their receive signal. We consider that the backhaul connecting the BSs is limited
in its capacity. Thus, the signals received at all RRHs have to be compressed
prior to their exchange over the backhaul.
The scheduling, transmission, and decoding process is depicted in Fig. 11.7.
Since the problem of resource allocation was described in detail in Section 11.1,
we here assume that resources have already been allocated to UEs by an arbitrary algorithm. For this reason, we consider simplified scenarios of few BSs
and UEs as sketched in Fig. 4.4(c). The influence of inter-cluster interference is
neglected in this model, and we only observe outdated CSI due to fast fading
effects.
Since we consider the effect of outdated CSI at the scheduler only, we assume
that the channel is perfectly estimated for every transmission block. As depicted
in Fig. 11.7, in a mobile time-variant environment, the scheduler has access only
to CSI that is outdated by $n_d$ transmission blocks, because certain delays for the
exchange of channel estimates and for the communication of uplink grants are
inevitable. Based on this outdated channel information, the scheduler estimates
achievable transmission rates and assigns appropriate MCSs. For simplicity, in
the remainder of this section, we assume that the number of possible MCSs is
unlimited, ignoring the fact that in real systems only a certain granularity of
MCSs is available. Transmission errors occur if the rate of the assigned MCS is
too high to be successfully decoded. Since achievable rates in a DAS depend on
the compression accuracy, the scheduler has to find a trade-off between throughput and the required backhaul capacity. Note that in the multi-user case with
successive interference cancellation (SIC) at the decoder, the
rate of each UE also depends on the decoding order.
In the following paragraphs, we investigate the benefit of adaptive compression. In particular, two strategies are compared:
1. fixed compression: a fixed backhaul rate is used for the exchange of the compressed signals from the RRH to the decoder.
2. adaptive ad-hoc compression: the updated CSI after transmission is taken into
account to decide which backhaul rate (and therefore compression accuracy)
is sufficient for successful decoding.
If the adaptive scheme is employed, we exploit the fact that the RRHs have
full knowledge of the current channel state after the transmission. They are
therefore able to adapt the compression appropriately and to enable successful decoding with as little information exchange as possible. The gain of the
adaptive scheme is indeed two-fold: besides achieving higher throughput due to
the reduced probability of transmission errors, backhaul consumption is reduced
because the adaptive scheme exploits all cases where decoding is possible with a
backhaul rate that is lower than the fixed rate. Additionally, in the case where
successful decoding could not be achieved even under full cooperation, the backhaul is not used at all, enabling its potential usage for other terminals.
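The adaptive decision described above can be sketched as a search for the smallest backhaul rate that supports the scheduled rate. The model below is a deliberate simplification and not the one used for the results in this section: it assumes a single single-antenna UE, two single-antenna BSs, unit noise variance, and plain Gaussian quantization without Wyner-Ziv side information. The function names and the numerical values are hypothetical.

```python
import math
from typing import Optional


def achievable_rate(snr_local: float, snr_remote: float, backhaul: float) -> float:
    """Rate in bit/channel use when the RRH signal is compressed to `backhaul` bits."""
    if backhaul <= 0.0:
        return math.log2(1.0 + snr_local)  # local decoding only
    # Gaussian test channel: quantization noise relative to unit receiver noise.
    quant_noise = (1.0 + snr_remote) / (2.0 ** backhaul - 1.0)
    snr_forwarded = snr_remote / (1.0 + quant_noise)
    return math.log2(1.0 + snr_local + snr_forwarded)


def min_backhaul(scheduled_rate: float, snr_local: float, snr_remote: float,
                 max_backhaul: float, tol: float = 1e-6) -> Optional[float]:
    """Smallest backhaul rate that supports `scheduled_rate`, or None in outage."""
    if achievable_rate(snr_local, snr_remote, 0.0) >= scheduled_rate:
        return 0.0  # the UE can be decoded locally, no cooperation needed
    if achievable_rate(snr_local, snr_remote, max_backhaul) < scheduled_rate:
        return None  # not decodable even with maximum support: leave backhaul unused
    low, high = 0.0, max_backhaul
    while high - low > tol:  # bisection works because the rate grows with the backhaul
        mid = 0.5 * (low + high)
        if achievable_rate(snr_local, snr_remote, mid) >= scheduled_rate:
            high = mid
        else:
            low = mid
    return high


print(min_backhaul(2.8, snr_local=4.0, snr_remote=4.0, max_backhaul=8.0))
```

The three branches mirror the three cases in the text: local decoding, outage despite full cooperation, and cooperation with just enough backhaul.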
A comparison of the maximum sum-rate that can be achieved for a certain
average backhaul rate is shown in Fig. 11.8, where we consider a scenario with
M = 2 BSs with $N_{bs}$ = 1 receive antenna each, and either K = 1 double-antenna
UE (Fig. 11.8(a)) or K = 2 single-antenna UEs (Fig. 11.8(b)). The UEs use a fixed
per-antenna transmit power P, and the received signal is distorted by additive
white Gaussian noise (AWGN) with variance $\sigma_v^2$. In general, a rich scattering
environment leading to spatially independent complex Gaussian channel realizations (Rayleigh channel) is assumed. The UEs are assumed to be
located at the cell-edge. In order to model the time-variance of the channel, it is
assumed that the 1 or 2 UEs are moving at a constant speed v. We employ the
widely used Jakes spectrum to model the effects of the Doppler spread [JC94].
Furthermore, we assume coding over a complete transmission block of 1 ms and
a scheduling delay of 3 ms. The results are based on information-theoretic models that include the use of best-known compression techniques that also utilize
side-information by Wyner-Ziv coding [dCS09], and a rather simple scheduler
that makes a decision based on the channel that was observed $n_d$ codewords
earlier and considers a backoff factor which is chosen such that throughput is
maximized. For further details, we refer to [GMF10a].
Figure 11.8 Comparison of sum-rate vs. backhaul for the adaptive and fixed schemes
for different time-varying Rayleigh channels ($f_c$ = 2.68 GHz, $\sigma_v^2$ = 0.1, P = 1). Panels: (a) Setup 1 (K = 1, M = 2); (b) Setup 2 (K = 2, M = 2). Horizontal axis: average backhaul rate [bit/channel use]. Curves are shown for v = 5, 15, 30 and 45 km/h.
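To get a feeling for how outdated the CSI is at the speeds considered, the sketch below evaluates the temporal channel correlation of the classical Jakes model, $J_0(2\pi f_D \tau)$, with maximum Doppler frequency $f_D = v f_c / c$. It assumes the carrier frequency of 2.68 GHz and the scheduling delay of 3 ms stated in the text; the function names are hypothetical, and the Bessel function is evaluated by simple numerical integration because the Python standard library does not provide it.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def bessel_j0(x: float, steps: int = 2000) -> float:
    """J0(x) from the integral (1/pi) * int_0^pi cos(x sin t) dt (midpoint rule)."""
    h = math.pi / steps
    return sum(math.cos(x * math.sin((k + 0.5) * h)) for k in range(steps)) / steps


def csi_correlation(speed_kmh: float, carrier_hz: float = 2.68e9,
                    delay_s: float = 3e-3) -> float:
    """Correlation between the scheduled and the actual channel under the Jakes model."""
    doppler_hz = (speed_kmh / 3.6) * carrier_hz / SPEED_OF_LIGHT
    return bessel_j0(2.0 * math.pi * doppler_hz * delay_s)


for v in (5, 15, 30, 45):
    print(f"v = {v:2d} km/h: correlation = {csi_correlation(v):.3f}")
```

The correlation drops quickly with increasing speed, which is consistent with the growing throughput loss of the fixed scheme for faster UEs.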
As expected, Fig. 11.8 shows a throughput loss that increases with the
time-variance of the channel. However, when the ad-hoc cooperation scheme
is employed, we see strong gains in terms of the throughput/backhaul trade-off.
Indeed, ad-hoc cooperation allows us to achieve almost maximum throughput
for much lower backhaul rates than with fixed cooperation, which in the low
backhaul regime mitigates the negative impact of time-varying channels on the
achievable throughput. As expected, the backhaul savings of the ad-hoc scheme
increase with increasing mobility.
When the achievable gains for the one-UE case (Fig. 11.8(a)) and the two-UE
case (Fig. 11.8(b)) are compared, we see that the possible gains of Ad-Hoc
CoMP are smaller in the latter. The reason for this observation is that, in the two-user case,
it is not possible to adapt the backhaul rate such that the rates of both users
are equal to the scheduled rate separately, because the backhaul rate is increased
until both users can be decoded successfully. Hence, the gains from the proposed
adaptive cooperation scheme decrease with an increase in the number of UEs
that are decoded jointly. However, this occurs when only single-antenna BSs are
employed. By using multiple BS antennas, the backhaul rate can be distributed
over the spatial dimensions of the receive signal in such a way that less backhaul rate
is spent on compressing user signals beyond the accuracy required for
successful decoding, as shown in [GMF10c].
11.3.3
[Figure: flow chart of progressive cooperation. Decision nodes: "Decoding successful?" (yes: end of transmission; no: next decision) and "Request refined digital representation?" (yes / no: (H)ARQ).]
successful decoding. A downside of the proposed scheme is that it requires a feedback mechanism between the decoder and the forwarding BSs, which introduces
an additional delay.
For the rest of this subsection, we will compare two approaches:
1. fixed compression: the same backhaul rate $c_{fix}$ is always used for the exchange
of compressed signals from BS 2 (the RRH) to BS 1 (the decoder). The
throughput is maximized (in this case) by choosing an optimal signal-to-noise
ratio (SNR) gap at the scheduler.
2. progressive cooperation: a progressive refinement of the exchanged signal is
used to achieve the lowest backhaul rate $c_{pro}$ that enables successful decoding.
When latency and complexity are not constrained, the most backhaul-efficient
scheme would be to refine the accuracy of the forwarded information in very
small successive steps. However, in real-world systems, a good trade-off between
throughput, backhaul rate, and latency is desired. Therefore, we need to find
other methods that limit the number of iterations. A straightforward approach
is a simple three-step scheme. The signal is first quantized with the rate $c_{fix}/2$. If
decoding is unsuccessful, the exchanged signal is refined to a total rate of $c_{fix}$. If
decoding is still not successful, in the last step, further refinement to a rate of
$2c_{fix}$ is used. The transmission is in outage if even this rate is not sufficient for
decoding. Further information is given in [GMF10b].
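The three-step scheme can be written down directly. In this sketch, the decoder is abstracted into a caller-supplied predicate `decodes_at`, which is a hypothetical stand-in for an actual decoding attempt at a given total backhaul rate; the function name and return convention are likewise assumptions made for illustration.

```python
from typing import Callable, Tuple


def three_step_backhaul(c_fix: float,
                        decodes_at: Callable[[float], bool]) -> Tuple[bool, float, int]:
    """Refine the exchanged signal in up to three steps.

    Returns (decoded, total backhaul rate used, number of iterations).
    """
    total_rate = 0.0
    iterations = 0
    for total_rate in (c_fix / 2.0, c_fix, 2.0 * c_fix):
        iterations += 1
        if decodes_at(total_rate):
            return True, total_rate, iterations
    # Even the finest representation was not sufficient: outage.
    return False, total_rate, iterations


# Hypothetical decoder that succeeds once at least 3 bit/channel use are available.
print(three_step_backhaul(4.0, lambda rate: rate >= 3.0))
```

The iteration count returned alongside the backhaul rate makes the latency cost of the progressive approach explicit.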
As mentioned earlier, we assume a block fading channel, such that the channel
(as well as the channel estimate and the channel estimation error) is constant for the transmission of one codeword, and successive channel realizations
are assumed to be uncorrelated. Fig. 11.10(a) shows the Monte-Carlo simulation
results for the setup that was already observed in Section 11.3.2. In addition to
the case of perfect channel estimation ($\sigma_{est}^2 = 0$), we consider channel estimation
errors with variance $\sigma_{est}^2 \in \{0.02, 0.05, 0.1\}$. The relatively large gap between
the throughput for perfect channel estimation and for imperfect CSI is a consequence of the fact that the variance of the estimation distortion is unknown,
resulting either in transmission errors or scheduled rates that are far from the
achievable rates.
Figure 11.10 Comparison of sum-rate vs. backhaul for the optimal and the heuristic
three-step progressive scheme as well as the fixed scheme ($\sigma_v^2$ = 0.1, P = 1). Panels: (a) Setup 1 (K = 1, M = 2); (b) Setup 2 (K = 2, M = 2). Horizontal axis: average backhaul rate [bit/channel use]. Curves are shown for $\sigma_{est}^2$ = 0, 0.02, 0.05 and 0.1.
Fig. 11.10(b) shows that the gain of the progressive ad-hoc cooperation scheme
decreases with the number of UEs that are decoded jointly, because the backhaul
rate is increased until both users can be decoded successfully. The reasons and
potential countermeasures are the same as in the case of imperfect CSI due to a
scheduling delay.
11.3.4
and 11.3.3 show that the number of retransmissions can be reduced by using
Ad-Hoc CoMP. Future research will show if an ad-hoc use of CoMP along with
coordinated scheduling might have the same potential as that of reliable communication without HARQ on the first two layers.
11.3.5 Summary
In this section, the concept of Ad-Hoc CoMP for the cellular uplink was introduced. The key concept is to adapt the CoMP strategy after transmission has
taken place in order to exploit channel information that is more recent than the
one available at the time of scheduling. In this way, a more efficient use of backhaul can be achieved. The potential gains of using Ad-Hoc CoMP were shown for
the example of a distributed antenna system where base stations exchange quantized receive signals for centralized decoding, and where two particular scenarios
were considered.
In the first scenario, assuming that perfect CSI is available to the BSs after
transmission, while only inaccurate CSI was available at the time of scheduling,
it was shown that the employment of an adaptive backhaul compression rate can
greatly increase backhaul efficiency.
In the other scenario, now assuming that the CSI available at the time of
decoding is also subject to estimation errors, it could be shown that a progressive
ad-hoc cooperation scheme is highly beneficial in terms of backhaul savings. Here,
successively refined information is passed over the backhaul in multiple iterations,
until successful decoding of the terminal transmissions is possible. Clearly, this
leads to a trade-off between latency (number of iterations) and sum backhaul
rate. A simple three-step approach was introduced, and its performance relative
to the optimal progressive scheme and a naive fixed cooperation scheme was
shown.
For the scenarios observed, the results indicate that an adaptive and progressive use of CoMP promises to reduce the required backhaul rate by about
50 %.
12 Backhaul
In this chapter, we address a last, but by no means least important, challenge
connected to CoMP, namely the fact that most base station cooperation schemes
require information exchange over a backhaul infrastructure. Depending on the
existing infrastructure of a mobile operator, both backhaul capacity and latency
requirements of some CoMP schemes may be the main cost drivers or potential
show stoppers on the roadmap towards CoMP. The chapter starts with addressing fundamental aspects of backhaul-constrained cooperation in Section 12.1,
after which concrete backhaul capacity and latency requirements of various
uplink and downlink CoMP schemes and their scaling behavior are derived in
Section 12.2. Finally, Section 12.3 gives an overview on existing and upcoming
backhaul technology options, and hence gives the reader a feeling of whether
particular CoMP schemes can be expected to be technically and commercially
feasible in the near future or not.
12.1
regardless of the rate at which BSs cooperate. The conclusion is drawn based on
the characterization of the capacity region to within a constant gap¹.
12.1.1 Introduction
Why is Backhaul Limited?
One of the common misconceptions about backhaul cooperation is that the backhaul provides near unlimited cooperation capability, so that base stations can
cooperate in an unlimited manner. To refute this, we use a simple example
to illustrate that in a wide-band cellular system, backhaul cooperation is usually
limited.
Consider a wide-band orthogonal frequency division multiplex (OFDM)-based
cellular system with a bandwidth of 20 MHz. To attain near unlimited cooperation, the received signal at a base station should be quantized finely enough so
that it can be recovered with a negligible distortion at other base stations. Let
us do a back-of-the-envelope calculation to get a sense of the rate that should be
used to convey these quantization outputs. Suppose we use 8 bit/s/Hz to quantize the signal and use the backhaul to exchange the quantized samples. The total throughput
required in the backhaul is then 20 × 8 = 160 Mbit/s. Even for optical carriers
in synchronous optical network (SONET)/synchronous digital hierarchy (SDH),
such a high data rate is only supported from OC-12 upward, not to mention
other technologies such as digital subscriber line (DSL) that cannot support
it. Moreover, as wireless bandwidths grow, the backhaul link
capacity does not increase with the wireless spectrum. From the above calculation we
therefore conclude that backhaul cooperation should be considered limited, and understanding how to make use of backhaul cooperation efficiently for interference
mitigation becomes important.
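The back-of-the-envelope calculation can be reproduced as follows. The bandwidth and quantization resolution are the values from the text; the SONET line rates (OC-1, OC-3, OC-12) are the standard nominal rates and are included here only for comparison. The variable names are illustrative.

```python
BANDWIDTH_HZ = 20e6              # system bandwidth from the example
QUANT_BITS_PER_S_PER_HZ = 8      # quantization resolution from the example

required_bps = BANDWIDTH_HZ * QUANT_BITS_PER_S_PER_HZ

# Nominal SONET optical carrier line rates in bit/s.
SONET_RATES_BPS = {"OC-1": 51.84e6, "OC-3": 155.52e6, "OC-12": 622.08e6}

sufficient = [name for name, rate in SONET_RATES_BPS.items() if rate >= required_bps]

print(f"required backhaul rate: {required_bps / 1e6:.0f} Mbit/s")
print(f"carriers that can support it: {sufficient}")
```

Note that this counts the exchange of the quantized signal of a single base station in one direction only, so the total backhaul load of a cooperation cluster would be higher.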
Gaussian Interference Channel with Backhaul Cooperation
The simplest information-theoretic model for studying the fundamental limits of
a communication system in the presence of interference is the interference channel (IC). In its simplest form, an interference channel consists of two transmitter-receiver pairs, and each receiver is only interested in retrieving information from
its own transmitter. Therefore, one user's information-carrying signal becomes
interference for the other user. A Gaussian IC is one where the second user's
signal $x_2$ interferes with the first user's signal $x_1$ in an additive fashion and vice
versa, along with additive white Gaussian noise at both receivers. Mathematically, the Gaussian interference channel is defined as follows:
$$y_1 = h_{11} x_1 + h_{12} x_2 + z_1, \qquad y_2 = h_{21} x_1 + h_{22} x_2 + z_2, \tag{12.1}$$
where $y_1$ and $y_2$ are the received signals, and the two mutually independent additive noise processes
$\{z_i[k]\}_{k=1}^{N}$ ($i = 1, 2$) are independently and identically distributed (i.i.d.) as
$\mathcal{N}_{\mathbb{C}}(0, 1)$ over time. In this section, we use $[\cdot]$ to denote time indices. Transmitter
i intends to convey message $m_i$ to receiver i by encoding it into a block codeword
$\{x_i[k]\}_{k=1}^{N}$, with transmit power constraints
$$\frac{1}{N} \sum_{k=1}^{N} |x_i[k]|^2 \le 1, \quad i = 1, 2, \tag{12.2}$$
(12.3)
Figure 12.1 Channel model considered. Dashed lines denote interfering links.
¹ The difference between inner and outer bounds is within a constant number of bits, which
does not depend on channel parameters.
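The channel model (12.1) with the power constraint (12.2) can be simulated with a few lines of code. This is a sketch under assumptions: QPSK symbols stand in for the codewords, the channel gains are real and fixed, and we assume that the direct and cross link strengths are measured as $|h_{ii}|^2$ and $|h_{ij}|^2$ (here 20 dB and 15 dB). All function names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def complex_noise(rng: random.Random) -> complex:
    """One sample of circularly symmetric complex Gaussian noise with unit variance."""
    std = 0.5 ** 0.5  # real and imaginary parts each carry half the variance
    return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


def interference_channel(x1: Sequence[complex], x2: Sequence[complex],
                         h: Tuple[Tuple[float, float], Tuple[float, float]],
                         rng: random.Random) -> Tuple[List[complex], List[complex]]:
    """Apply y1 = h11 x1 + h12 x2 + z1 and y2 = h21 x1 + h22 x2 + z2 per time index."""
    (h11, h12), (h21, h22) = h
    y1 = [h11 * a + h12 * b + complex_noise(rng) for a, b in zip(x1, x2)]
    y2 = [h21 * a + h22 * b + complex_noise(rng) for a, b in zip(x1, x2)]
    return y1, y2


def average_power(x: Sequence[complex]) -> float:
    """Empirical average power (1/N) * sum |x[k]|^2."""
    return sum(abs(v) ** 2 for v in x) / len(x)


rng = random.Random(0)
n = 20000
qpsk = [complex(a, b) / 2 ** 0.5 for a in (1, -1) for b in (1, -1)]  # unit power
x1 = [rng.choice(qpsk) for _ in range(n)]
x2 = [rng.choice(qpsk) for _ in range(n)]
h = ((10.0, 10 ** 0.75), (10 ** 0.75, 10.0))  # assumed: 20 dB direct, 15 dB cross
y1, y2 = interference_channel(x1, x2, h, rng)
print(average_power(x1), average_power(y1))
```

With unit-power inputs and unit-variance noise, the received power is approximately the sum of the direct gain, the cross gain, and one.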
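$\mathbb{F}_2^q$, (12.5)
where additions are modulo-two component-wise, $q = \max\{n_{11}, n_{12}, n_{21}, n_{22}\}$,
and $S \in \mathbb{F}_2^{q \times q}$ is the shift matrix
$$S = \begin{pmatrix} 0 & 0 & 0 & \cdots & 0 \\ 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 1 & 0 \end{pmatrix}. \tag{12.6}$$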
An interpretation of this model considers the binary expansion of signals. The
effect of additive white Gaussian noise is modeled by truncation of the signal
below the noise level. The effect of superposition with interference is modeled
by the modulo-two component-wise addition of the bits, where the carry-over in
real addition is not captured for simplicity.
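The action of the shift matrix can be illustrated with a small sketch. It assumes the common form of the linear deterministic model, in which a link with n bit levels above the noise applies $S^{q-n}$ to the transmitted bit vector (most significant bit first) and the receiver observes the modulo-two sum of the shifted vectors. This exponent convention and all function names are assumptions made for illustration.

```python
from typing import List


def shift_matrix(q: int) -> List[List[int]]:
    """q x q matrix with ones on the first subdiagonal, as in (12.6)."""
    return [[1 if row == col + 1 else 0 for col in range(q)] for row in range(q)]


def mat_vec_mod2(matrix: List[List[int]], vector: List[int]) -> List[int]:
    """Matrix-vector product over the binary field."""
    return [sum(a * b for a, b in zip(row, vector)) % 2 for row in matrix]


def shift_down(bits: List[int], times: int) -> List[int]:
    """Apply the shift matrix `times` times: the lowest bits fall below the noise level."""
    shift = shift_matrix(len(bits))
    out = list(bits)
    for _ in range(times):
        out = mat_vec_mod2(shift, out)
    return out


def ldc_receive(x_own: List[int], x_int: List[int],
                n_direct: int, n_cross: int) -> List[int]:
    """Received bit vector: shifted desired bits plus shifted interfering bits (mod 2)."""
    q = len(x_own)
    direct = shift_down(x_own, q - n_direct)
    cross = shift_down(x_int, q - n_cross)
    return [(a + b) % 2 for a, b in zip(direct, cross)]


# Desired signal occupies all 3 levels, interference reaches only the lowest level.
print(ldc_receive([1, 1, 0], [1, 0, 1], n_direct=3, n_cross=1))
```

Each application of S moves all bits one level down and discards the least significant one, which is exactly the truncation below the noise level described above.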
Fundamental Gain from Limited Backhaul Cooperation
We identify two regions pertaining to the gain from limited backhaul cooperation:
linear and saturation regions, as illustrated by a numerical example in Fig. 12.2.
The example is symmetric with $\mathrm{SNR}_1 = \mathrm{SNR}_2 = \mathrm{SNR} = 20$ dB, $\mathrm{INR}_1 = \mathrm{INR}_2 =
\mathrm{INR} = 15$ dB, and $C^B_{12} = C^B_{21} = C^B$. In the linear region, backhaul cooperation is
efficient, in the sense that the growth of user data rate is roughly linear with
respect to the capacity of the backhaul links. The gain in this region is the
degrees-of-freedom gain that CoMP systems provide. On the other hand, in the
saturation region, backhaul cooperation is inefficient in the sense that the growth
of user data rate becomes saturated as one increases the rate in the backhaul
links. The gain is the power gain of a constant number of bits at best, and the
constant is independent of the channel strength and the backhaul cooperation
rate. We will focus on system performance in the linear region, not only because
the rate at which base stations can cooperate is limited in most scenarios, but
also because the gain from cooperation is more significant.
With the constant-gap-to-optimality result, we find the fundamental gain from
cooperation in the linear region as follows: either one cooperation bit buys one
more bit or two cooperation bits buy one more bit until saturation, depending
on channel parameters. This will be elaborated and explained in the last part of
this section.
Figure 12.2 The gain from limited backhaul cooperation. Horizontal axis: cooperation rate [bit / channel use]; vertical axis: user data rate [bit / channel use]. In the linear region, cooperation is efficient; in the saturation region, cooperation is inefficient.
The rest of this section is organized as follows. First, we describe the cooperation strategies between base stations that achieve the capacity regions to
within 2 bits and 6.5 bits in uplink and downlink scenarios respectively. Next,
we show that there is an uplink-downlink reciprocity between the two scenarios,
and hence there is no difference in the fundamental gains obtained from receiver
or transmitter cooperation. We then quantify the degree-of-freedom gain by characterizing the number of generalized degrees of freedom in the system. Finally, we
use a couple of linear deterministic examples to illustrate the high-level intuitive
reasons why there are two different kinds of behaviors of the gain from backhaul
cooperation in the linear region.
12.1.2
Figure 12.3 Example channels. $\{a_k\}$ denote user 1's bits, while $\{b_k\}$ denote user 2's.
Index k denotes the k-th level at the corresponding transmitter. The panels show the received and exchanged bit levels at both receivers. [WT09b] © 2009 IEEE.
level so that the undesired signal does not pollute the cooperative information.
In this example, as illustrated in Fig. 12.3(b), with one-bit cooperation in each
direction in the LDC, the optimal sum-rate is 5 bits, achieved by turning on one
more bit $a_2$. This causes collisions at the second level at receiver 1 and at the
third level at receiver 2, which can be resolved with cooperation: receiver 1 sends
$b_1 \oplus a_2$ to receiver 2, and receiver 2 sends $b_1$ to receiver 1. Now, receiver 1 can
solve $(a_1, a_2, a_3, b_1)$, and receiver 2 can solve $(b_1, b_3, a_1, a_2)$. In fact, the exchanged
linear combinations are not unique. For example, receiver 1 can send $(b_1 \oplus a_2) \oplus
a_1$ and receiver 2 can send $b_1 \oplus a_1$, and this again achieves the same rates. As
long as receiver 1 does not send a linear combination containing the private bit $a_3$
and the sent linear combination is linearly independent of the signals at receiver 2
(and vice versa for the linear combination sent from receiver 2 to receiver 1), the
scheme is optimal for this example channel. The above discussion regarding the
scheme in the LDC naturally leads to an implementable one-round scheme in the
Gaussian channel, where both receivers quantize-and-bin their received signals
at their own private signal level.
In the above example, it is optimal that each receiver sends to the other
linear combinations formed by its received signal above its private signal level.
Is this optimal in general? The answer is no. Consider the following asymmetric
example: $\mathrm{SNR}_2 = \mathrm{INR}_2$, $\mathrm{SNR}_1$ is 2/3 of $\mathrm{SNR}_2$ in dB, and $\mathrm{INR}_1$ is 1/3 of $\mathrm{SNR}_2$ in
dB. $C^B_{12} = \frac{2}{3} \log \mathrm{SNR}_2$ and $C^B_{21} = \frac{1}{3} \log \mathrm{SNR}_2$. The corresponding LDC is depicted
in Figs. 12.3(c) and 12.3(d), where one bit in the LDC corresponds to $\frac{1}{3} \log \mathrm{SNR}_2$
in the Gaussian channel. First consider the same scheme as in the previous
example. Note that if receiver 2 just forwards signals above its private signal
level, it can only forward $a_1$ to receiver 1 and achieves $R_1$ up to 2 bits. On the
other hand, if receiver 2 forwards $a_3$ to receiver 1, which is below user 2's private
signal level, it achieves $R_1 = 3$ bits. From this example, we see that whenever
there is useful information (which should not be polluted by the receiver's own
private bits) that lies at or below the private signal level (in this example, the
bit $a_3$), the one-round scheme described in the previous example is sub-optimal.
To extract the useful information at or below the private signal level, one of
the receivers (in this example, receiver 2) can first decode and then form linear
combinations using (decoded) common messages only.
It turns out that, without loss of generality, the above situation (there is useful
information for the other receiver that lies at or below the private signal level)
only occurs at one of the two receivers. In other words, there exists a receiver
where no useful information (for the other receiver) lies at or below the private
signal level. The reason is the following:
1. It is not difficult to see that the capacity region is convex, and hence if a scheme
can achieve $\max_{(R_1, R_2) \in \mathscr{C}} \{\mu_1 R_1 + \mu_2 R_2\}$ for all $\mu_1, \mu_2 \ge 0$, it is optimal. Here
$\mathscr{C}$ denotes the capacity region.
2. If $\mu_1 \ge \mu_2$, we weigh user 1's rate more. Since the private bits are cheaper to
support in the sense that they do not cause interference at receiver 2, user 1
should be transmitting at its full private rate, which is equal to the number
of levels at or below the private signal level at receiver 1. Therefore, all levels
at or below the private signal level are occupied by user 1's private bits and
there is no useful information for receiver 2 at receiver 1.
3. Similarly, if $\mu_1 \le \mu_2$, there is no useful information for receiver 1 at receiver 2,
at or below the private signal level.
Hence, the following two-round strategy is optimal in the LDC: if $\mu_1 \ge \mu_2$,
receiver 1 forms a certain number (no more than the cooperative link capacity)
of linear combinations composed of the signals above its private signal level and
sends them to receiver 2. After receiver 2 decodes, it forms a certain number of
linear combinations composed of the decoded common bits and sends them to
receiver 1. If $\mu_1 \le \mu_2$, the roles of receiver 1 and 2 are interchanged. Depending
on the operating point in the capacity region, we use different configurations,
implying that time-sharing is needed to achieve the full capacity region.
From the above discussion, a natural and implementable two-round strategy
for Gaussian channels emerges. For transmission, we use a superposition Gaussian random coding scheme with a simple power-split configuration, as described
in [ETW08]. For cooperation, one of the receivers quantizes-and-bins its received
signal at its private signal level and forwards the bin index; after the other
receiver decodes with the side information that helps it, it bins-and-forwards the
decoded common messages back to the rst receiver and helps it decode.
Coding Strategy
The scenario is depicted in Fig. 12.1(a). The strategy consists of two parts: (1)
the transmission scheme, describing how transmitters encode their messages, and
(2) the cooperation scheme, describing how receivers exchange information and
decode messages. We give an overview of the strategy below.
Transmission Scheme. We use a simple superposition coding scheme with
Gaussian random codebooks. Each transmitter splits its own message into
common and private (sub-)messages. Each common message is aimed at both
receivers, while each private message is aimed at its own receiver. Each message
is encoded into a Gaussian random codeword with a certain power. For transmitter i, the powers of its private and common codewords are $Q_{ip}$ and $Q_{ic} = 1 - Q_{ip}$,
respectively, for i = 1, 2. As [ETW08] points out, since the private signal is undesired at the unintended receiver, a reasonable configuration is to make the private
interference at or below the noise level so that it does not cause much damage
and can still convey additional information in the direct link if it is stronger than
the cross link. When the interference is stronger than the desired signal, simply
set the whole message to be common. In other words, for (i, j) = (1, 2) or (2, 1),
$$Q_{ip} = \min\left\{ \frac{1}{\mathrm{INR}_j},\, 1 \right\}$$
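The power split described above can be sketched numerically. The function below follows the two rules stated in the text: the private power is chosen such that the private signal arrives at or below the noise level at the unintended receiver, and the whole message is made common when the interference is stronger than the desired signal. The function names and the dB-valued interface are assumptions for illustration.

```python
from typing import Tuple


def db_to_linear(value_db: float) -> float:
    """Convert a power ratio from dB to linear scale."""
    return 10.0 ** (value_db / 10.0)


def power_split(snr_db: float, inr_db: float) -> Tuple[float, float]:
    """Return (private power, common power) for a transmitter with unit total power.

    snr_db: SNR of the transmitter's direct link.
    inr_db: INR that this transmitter causes at the unintended receiver.
    """
    snr, inr = db_to_linear(snr_db), db_to_linear(inr_db)
    if inr > snr:
        return 0.0, 1.0  # interference stronger than desired signal: all common
    q_private = min(1.0 / inr, 1.0)  # private interference at or below noise level
    return q_private, 1.0 - q_private


q_p, q_c = power_split(snr_db=20.0, inr_db=15.0)
print(q_p, q_c)
```

For the symmetric example with SNR = 20 dB and INR = 15 dB, only about 3 % of the transmit power is assigned to the private codeword.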
Cooperation Scheme. In the first round, receiver $j$
quantizes its received signal and sends out the bin index (described in detail
below). In the second round, receiver $i$ receives this side information, decodes
its desired messages (both users' common messages and its own private message)
with the decoder described in detail below, randomly bins the decoded common
messages, and sends the bin indices to receiver $j$. Finally, receiver $j$ decodes with
the help of the receiver-cooperative link. We call this a two-round strategy
$\mathrm{STG}_{j \to i \to j}$, meaning that the processing order is: receiver $j$ quantizes-and-bins,
receiver $i$ decodes-and-bins, and receiver $j$ decodes. Its achievable rate region is
denoted by $\mathcal{R}_{j \to i \to j}$. By time-sharing, we obtain the achievable rate region
$\mathcal{R} := \mathrm{conv}\{\mathcal{R}_{2 \to 1 \to 2} \cup \mathcal{R}_{1 \to 2 \to 1}\}$, the convex hull of the union of the two rate regions.
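Time-sharing between the two processing orders can be illustrated with a short Python sketch. It assumes that one rate pair achievable under each order is already known; the function name `time_share` and the example rate values are purely illustrative and do not come from the original text.

```python
def time_share(rate_a: tuple[float, float],
               rate_b: tuple[float, float],
               fraction_a: float) -> tuple[float, float]:
    """Rate pair obtained by using scheme A for a fraction of the time
    and scheme B for the remainder (a convex combination of the two pairs)."""
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must lie in [0, 1]")
    fraction_b = 1.0 - fraction_a
    return (fraction_a * rate_a[0] + fraction_b * rate_b[0],
            fraction_a * rate_a[1] + fraction_b * rate_b[1])
```

Sweeping `fraction_a` from 0 to 1 traces the line segment between the two rate pairs, which is how points in the convex hull that belong to neither individual region become achievable.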
There is a simple way to understand the strategy from an engineering perspective. To achieve $\max_{(R_1, R_2) \in \mathcal{R}} \{\mu_1 R_1 + \mu_2 R_2\}$ for some non-negative weights $(\mu_1, \mu_2)$, the
processing configuration is easily determined: strategy $\mathrm{STG}_{j \to i \to j}$ should be
used, where $i = \arg\min_{l=1,2} \{\mu_l\}$ and $j = \arg\max_{l=1,2} \{\mu_l\}$. To summarize, the
receiver that decodes last is the one we favor the most.
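The selection rule above can be written as a small Python helper. This is a sketch only: the function name, the argument names for the two non-negative rate weights, and the tie-breaking choice when both weights are equal are assumptions made for illustration.

```python
def choose_processing_order(weight_1: float, weight_2: float) -> tuple[int, int, int]:
    """Return (j, i, j): receiver j quantizes-and-bins, receiver i
    decodes-and-bins, and receiver j decodes last.

    weight_1 and weight_2 are the non-negative weights on R1 and R2.
    The receiver with the larger weight is the one that decodes last.
    """
    if weight_1 < 0 or weight_2 < 0:
        raise ValueError("weights must be non-negative")
    # Ties are broken in favour of receiver 2 (an arbitrary assumption).
    j = 1 if weight_1 > weight_2 else 2
    i = 3 - j  # the other receiver
    return (j, i, j)
```

With weights `(1.0, 3.0)` the helper returns `(2, 1, 2)`: receiver 2 carries the larger weight, so it quantizes first and decodes last.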
In the following paragraphs, we describe each component in detail, including
quantize-binning, decode-binning, and their corresponding decoders. For simplicity, we consider strategy $\mathrm{STG}_{2 \to 1 \to 2}$.
Quantize-binning: Upon receiving its signal from the transmitter-receiver
link, receiver 2 does not decode messages immediately. Instead, serving as a
relay, it first quantizes its signal using a pre-generated Gaussian quantization codebook with a certain distortion, and then sends out a bin index determined by a
pre-generated binning function. How should we set the distortion? As discussed
previously, neither its own private signal nor the noise it encounters
is of interest to receiver 1. Therefore, a natural configuration is to set the
distortion level equal to the aggregate noise plus the private signal power level.
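As a rough numerical illustration of this distortion setting, the following Python sketch adds the noise power to the received power of receiver 2's own private signal. It assumes that this received private power equals the direct-link SNR of user 2 times its private power fraction, scaled by the noise power; that modelling step, the function name, and the parameter names are assumptions and are not stated in the original text.

```python
def quantization_distortion(snr_2: float, q_2p: float,
                            noise_power: float = 1.0) -> float:
    """Distortion level for receiver 2's quantizer: noise power plus the
    received power of receiver 2's own private signal.

    snr_2: direct-link SNR of user 2 (linear scale).
    q_2p:  fraction of transmitter 2's power given to its private codeword.
    Assumes the received private power is snr_2 * q_2p * noise_power.
    """
    if snr_2 < 0 or noise_power <= 0 or not 0.0 <= q_2p <= 1.0:
        raise ValueError("invalid SNR, noise power, or private power fraction")
    private_received_power = snr_2 * q_2p * noise_power
    return noise_power + private_received_power
```

When the whole message is common (`q_2p = 0`), the distortion reduces to the noise power alone, so the quantizer describes the received signal down to the noise floor.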
Decoder at receiver 1: After retrieving the receiver-cooperative side information, that is, the bin index, receiver 1 decodes the two common messages and
its own private message by searching the transmitters' codebooks for a codeword triple (indexed by the two common messages and the user's own private
message) that is jointly typical [CT06] with its received signal and some quantization point (codeword) in the given bin. If there is no such unique codeword
triple, it declares an error.
Decode-binning: After receiver 1 decodes, it uses two pre-generated binning
functions to bin the two common messages and sends out these two bin indices
to receiver 2.
Decoder at receiver 2: After receiving these two bin indices, receiver 2
decodes the two common messages and its own private message by searching the
transmitters' codebooks for a codeword triple that is jointly typical [CT06]
with its received signal and whose two common messages both lie in the given bins.
Backhaul
12.1.3
[Equation (12.8): definition of the vectors, including $v_{2m}$, in terms of the channel gains $h_{11}$, $h_{21}$, $h_{12}$, $h_{22}$.]
There is only one cooperative common code carrying both cooperative common messages.
Hence, the overall cooperative signal transmitted by combining both transmitters is the following:
\[ x_{oh} = x_o + \underbrace{w_{1z} v_{1z} + w_{2z} v_{2z} + w_{1m} v_{1m} + w_{2m} v_{2m}}_{x_h}, \tag{12.9} \]
where $w_{iz}$ and $w_{im}$ are independent Gaussian random codes, each carrying a part
of the cooperative private message for user $i$. For the non-cooperative part, we
simply transmit the superposition of two independent Gaussian random codes $x_{ic}$
and $x_{ip}$ for the non-cooperative common and non-cooperative private messages
for user $i$, respectively. Hence, the overall non-cooperative signal transmitted by
transmitter $i$ is $x_{icp} = x_{ic} + x_{ip}$, for $i = 1, 2$. Overall, the transmit signal from
transmitter $i$ is the superposition of cooperative and non-cooperative signals, i.e.
\[ x_i = x_{oh}(i) + x_{icp}, \quad i = 1, 2. \tag{12.10} \]
For the power allocation, note that the interference caused by the other user's
cooperative private signal should be approximately nulled out, that is, its variance
should be at or below the noise level. Moreover, the interference caused by the other user's
non-cooperative private signal should also be at or below the noise level. With
this guideline, we can determine the power allocation policy. For more details we
point the reader to [WT10].
The decoding procedure, compared to the uplink scenario, is much simpler.
Each receiver decodes all common messages and its own private messages jointly.
12.1.4
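Based on reciprocity, we investigate performance in the linear region by characterizing the optimal generalized degrees of freedom (g.d.o.f.) available in the system,
and demonstrate the fundamental gain from limited backhaul cooperation in the
rest of this section. The notion of generalized degrees of freedom was originally
proposed in [ETW08]. For simplicity, we consider a symmetric set-up, where
\[ \mathrm{SNR} = \mathrm{SNR}_1 = \mathrm{SNR}_2, \quad \mathrm{INR} = \mathrm{INR}_1 = \mathrm{INR}_2; \tag{12.11} \]
\[ C^{B} = C^{B}_{12} = C^{B}_{21}, \tag{12.12} \]
and let
\[ \lim_{\mathrm{SNR} \to \infty} \frac{\log \mathrm{INR}}{\log \mathrm{SNR}} = \alpha, \qquad \lim_{\mathrm{SNR} \to \infty} \frac{C^{B}}{\log \mathrm{SNR}} = \kappa. \tag{12.13} \]
The number of generalized degrees of freedom per user is then
\[ d := \lim_{\substack{\text{fix } \alpha, \kappa \\ \mathrm{SNR} \to \infty}} \frac{C_{\mathrm{sym}}}{\log \mathrm{SNR}}, \tag{12.14} \]
where $C_{\mathrm{sym}}$ denotes the symmetric capacity.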
Numerical plots for the g.d.o.f. are given in Fig. 12.4. We observe that the gain
from cooperation varies with $\alpha$. By investigating the g.d.o.f., we
conclude that at high SNR, when the interference-to-noise ratio (INR) is below 50%
of the SNR (in dB), one-bit cooperation per direction buys roughly one-bit gain per
user until full receiver cooperation performance is reached, while when the INR is
between 67% and 200% of the SNR (in dB), one-bit cooperation per direction buys
roughly half-bit gain per user until saturation.
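The rule of thumb above can be captured in a small Python lookup. This is a sketch under stated assumptions: the function names, the name `alpha` for the ratio of INR to SNR in dB, the handling of the interval endpoints, and the choice to return `None` outside the two stated ranges are all illustrative; the text gives no rule for the remaining ranges.

```python
from typing import Optional


def alpha_from_db(snr_db: float, inr_db: float) -> float:
    """Ratio of INR to SNR, both expressed in dB (high-SNR regime)."""
    if snr_db <= 0:
        raise ValueError("snr_db must be positive in the high-SNR regime")
    return inr_db / snr_db


def gain_per_cooperation_bit(alpha: float) -> Optional[float]:
    """Approximate per-user rate gain (in bits) bought by one bit of
    cooperation per direction, before saturation.

    Returns None where the stated rule of thumb does not apply.
    """
    if 0.0 <= alpha < 0.5:
        return 1.0   # INR below 50% of SNR (in dB)
    if 2.0 / 3.0 <= alpha <= 2.0:
        return 0.5   # INR between 67% and 200% of SNR (in dB)
    return None      # no rule of thumb given for this range
```

For instance, an SNR of 30 dB with an INR of 12 dB gives a ratio of 0.4 and falls in the one-bit-per-bit range, while an SNR of 20 dB with an INR of 30 dB gives 1.5 and falls in the half-bit range.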
3
In fact, the limit does not exist when $\alpha = 1$, where the phases of the channel gains matter.
In particular, its value can depend on whether the system MIMO matrix is well-conditioned
or not. To overcome this issue, we impose a reasonable distribution, namely, an i.i.d. uniform
distribution, on the phases, show that the limit exists almost surely, and define the limit to
be the number of generalized degrees of freedom per user. See [WT09a] for more details.
Figure 12.4 Generalized degrees of freedom $d(\alpha, \kappa)$, with curves for $\kappa = 0$, $1/6$, $1/3$ and $1/2$. [WT09b] © 2009 IEEE.