
The implementation challenges of polar codes
Robert G. Maunder
CTO, AccelerComm
February 2018
Abstract
Although polar codes are a relatively immature channel coding technique with no previous standardised
applications, they have been selected by the 3rd Generation Partnership Project (3GPP) to provide error correction
in the New Radio (NR) standard for 5th Generation (5G) mobile communications. The hardware acceleration of
polar encoding and decoding will be necessary in order to meet the strict requirements in many applications of
5G. However, the processes of polar encoding and decoding are complicated and it is not trivial to translate them
into hardware. This white paper provides a tutorial of the polar encoding and decoding processes, before discussing
the challenges of their hardware implementation.
I. INTRODUCTION
In mobile communication, channel coding may be used to protect information against the effects of
transmission errors, which may be caused by noise, interference or poor signal strength. More specifically,
a channel encoder is used to encode the information in the transmitting device, which may be a basestation,
a handset or another user device. This allows a corresponding channel decoder to be used in the receiving
device, in order to mitigate the transmission errors and recover the transmitted information.
In recent decades, several high-performance channel codes have been developed, which allow information
to be reliably transmitted at rates that closely approach the theoretical limit that is imposed by
the channel capacity. Specifically, turbo codes have been used in 3rd Generation (3G) and 4th Generation
(4G) mobile communication standards, while Low Density Parity Check (LDPC) codes have been adopted
in WiFi and satellite standards. More recently, polar codes [1] have emerged, offering particularly strong
error correction performance for short messages. However, polar codes are much less mature than turbo
and LDPC codes, having no previous standardised applications.
At the time of writing, the 3rd Generation Partnership Project (3GPP) is defining the so-called New
Radio (NR) standard [2], as a candidate for 5th Generation (5G) mobile communication. Here, polar codes
have been selected to provide channel coding in the control channel of the enhanced Mobile BroadBand
(eMBB) applications of NR, as well as in the Physical Broadcast Channel (PBCH). Polar codes have also
been identified as a candidate to provide channel coding for the data and control channels of the Ultra
Reliable Low Latency Communication (URLLC) and massive Machine Type Communication (mMTC)
applications of NR.
In addition to setting a strict requirement for ultra-reliable error correction, 5G imposes a requirement
for the error correction to be completed quickly, with a lower latency than in 3G or 4G. Owing to
this, many 5G applications will require polar encoding and decoding to be implemented using high-performance
hardware acceleration, which must consume as few hardware resources and as little power as
possible.
Sections II and III of this white paper provide tutorials for the algorithms that underpin the processes
of polar encoding and decoding, respectively. Following this, Section IV discusses the challenges of
implementing these algorithms in hardware. Finally, we offer some concluding remarks in Section V.
AccelerComm
© 2018 www.accelercomm.com
ACCELERCOMM WHITE PAPER: THE IMPLEMENTATION CHALLENGES OF POLAR CODES
II. POLAR ENCODER

A polar encoder comprises three successive components, namely information block conditioning, the
polar encoder kernal and encoded block conditioning, as shown in Figure 1. These components are
discussed in the following paragraphs.
The input to the information block conditioning component may be referred to as an information block,
which comprises K number of information bits, where K may be referred to as the information block size.
The information block conditioning component interlaces the K information bits with N − K redundant
bits, which may be frozen bits [1], Cyclic Redundancy Check (CRC) bits [3] and/or Parity Check (PC)-
frozen bits [4] in the NR polar code. Here, frozen bits always adopt a value of 0, while CRC and PC-frozen
bits adopt values that are obtained as functions of the information bits. The information block conditioning
component generates the redundant bits and interlaces them into positions that are identified by a prescribed
method, which is also known to the polar decoder. Furthermore, the information block conditioning
component additionally performs code block segmentation, interleaving and scrambling operations in the
NR polar code, as shown in Figures 6 – 8. The output of the information block conditioning component
may be referred to as a kernal information block, which comprises N number of kernal information bits,
where N may be referred to as the kernal block size. Here, the information block conditioning must be
completed such that N is a power of 2 that is greater than K. In the NR polar code, N may adopt values
of up to Nmax = 1024.
[Figure 1 (schematic). Polar encoder in transmitter: information block (K bits) → information block conditioning → kernal information block (N bits) → polar encoder kernal → kernal encoded block (N bits) → encoded block conditioning → encoded block (M bits) → modulator → channel. Polar decoder in receiver: channel → demodulator → soft encoded block (M LLRs) → encoded block conditioning → soft kernal encoded block (N LLRs) → polar decoder kernal → recovered kernal information block (N bits) → information block conditioning → recovered information block (K bits).]
Fig. 1: Top-level schematic of a polar encoder and decoder.
The input to the polar encoder kernal is a kernal information block and its output may be referred to
as a kernal encoded block, which comprises N number of kernal encoded bits. The operation of the polar
encoder kernal may be illustrated by a polar code graph representation, which is exemplified in Figure 2.
Here, the symbol ⊕ represents a binary eXclusive-OR (XOR) operation. Note that the graph comprises N
inputs on its left edge and N outputs on its right edge, corresponding to the N kernal information bits and
the N kernal encoded bits, respectively. The graph comprises log2 (N ) stages, each of which comprises
N/2 vertically aligned XORs, giving a total of log2 (N )N/2 XORs. Note that there are data dependencies
between successive stages, which enforce a left-to-right processing schedule. More specifically, the data
dependencies prevent the computation of the XORs in a particular stage until after the XORs in the stage
to its left have been computed.
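The stage-by-stage schedule described above can be sketched in Python as follows. This is an illustrative software model rather than a hardware description. The function name `polar_kernal_encode_stages` is hypothetical, and the assumption that the XORs of Stages 0, 1, 2, ... connect bits that are 1, 2, 4, ... positions apart is read off the graphs of Figure 2.

```python
def polar_kernal_encode_stages(bits):
    """Process the polar code graph one stage at a time, from left to right.

    Returns the kernal encoded bits and the number of XORs performed.
    Assumes len(bits) is a power of 2.
    """
    x = list(bits)
    n = len(x)
    xor_count = 0
    span = 1  # distance between the two connections of each XOR in this stage
    while span < n:
        # The N/2 XORs within one stage are independent of each other,
        # but a stage cannot start until the stage to its left has finished.
        for start in range(0, n, 2 * span):
            for i in range(start, start + span):
                x[i] ^= x[i + span]
                xor_count += 1
        span *= 2
    return x, xor_count


encoded, xors = polar_kernal_encode_stages([0, 0, 0, 1, 0, 0, 0, 1])
print(encoded, xors)  # [0, 0, 0, 0, 1, 1, 1, 1] 12
```

For N = 8 the count of 12 XORs agrees with log2(N)N/2 = 3 × 4.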
Note that successive graph representations have recursive relationships. More specifically, the graph
representation for a polar encoding kernal operation having a kernal block size of N = 2 comprises a
single stage, containing a single XOR. The first of the N = 2 kernal encoded bits is obtained as the
[Figure 2 (schematic). Three polar code graphs: the N = 2 graph (Inputs 0 to 1, Outputs 0 to 1, Stage 0 only), the N = 4 graph (Inputs 0 to 3, Outputs 0 to 3, Stages 0 and 1) and the N = 8 graph (Inputs 0 to 7, Outputs 0 to 7, Stages 0 to 2).]
Fig. 2: Polar code graphs for N ∈ {2, 4, 8}.
XOR of the N = 2 kernal information bits, while the second kernal encoded bit is equal to the second
kernal information bit. For greater kernal block sizes N , the graph representation may be considered to
be a vertical concatenation of two graph representations for a kernal block size of N/2, followed by an
additional stage of XORs, as shown in Figure 2. In analogy with the N = 2 kernal described above, the
first N/2 of the N kernal encoded bits are obtained as XORs of corresponding bits from the outputs of
the two N/2 kernals, while the second N/2 of the kernal encoded bits are equal to the output of the
second N/2 kernal.
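The recursive relationship described above maps directly onto a recursive function. The following Python sketch is a minimal illustration, in which the function name `polar_kernal_encode` is hypothetical and the kernal block size is assumed to be a power of 2.

```python
def polar_kernal_encode(bits):
    """Recursive polar encoder kernal for a block whose length is a power of 2."""
    n = len(bits)
    if n == 1:
        return list(bits)
    half = n // 2
    # Two N/2 kernals, applied to the top and bottom halves of the input.
    top = polar_kernal_encode(bits[:half])
    bottom = polar_kernal_encode(bits[half:])
    # Additional stage: the first N/2 outputs are XORs of corresponding bits,
    # the second N/2 outputs are the outputs of the second N/2 kernal.
    return [a ^ b for a, b in zip(top, bottom)] + bottom


print(polar_kernal_encode([0, 1]))  # [1, 1]
```

With N = 2 this reduces to the single XOR described above: the first output is the XOR of the two inputs and the second output equals the second input.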
The input to the encoded block conditioning component of the polar encoder is a kernal encoded block
and its output may be referred to as an encoded block, which comprises M number of encoded bits, where
M may be referred to as the encoded block size. The resultant polar coding rate is given by R = K/M ,
where the encoded block conditioning must be completed such that M is greater than K, although M
may be higher or lower than N . The encoded block conditioning component may use various techniques
to generate the M encoded bits. More specifically, repetition [5] may be used to repeat some of the N
bits in the kernal encoded block, while shortening or puncturing techniques [5] may be used to remove
some of the N bits in the kernal encoded block. Note that shortening removes bits that are guaranteed to
have values of 0, while puncturing removes bits that may have a value of either 0 or 1. In addition to this
rate matching operation, the encoded block conditioning component also performs sub-block interleaving,
channel interleaving and code block concatenation operations in the NR polar code, as shown in Figures 6
– 8. Following polar encoding, the encoded block may be provided to a modulator, which transmits it
over a communication channel.
The complete polar encoding process is exemplified in Figure 3, for the case where a particular
arrangement of frozen bits is used to convert the K = 4 information bits [1001] into the M = 8 encoded
bits [00001111].
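This example can be reproduced with a few lines of Python. The sketch below is a simplification under stated assumptions: only frozen bits are used as redundant bits (no CRC or PC bits), no interleaving or scrambling is applied, and M = N so that no repetition, shortening or puncturing is needed. The information bit positions {3, 5, 6, 7} are an assumption that is consistent with Figure 3, and the function names are hypothetical.

```python
def interlace_frozen(info_bits, info_positions, n):
    """Place the information bits at info_positions; all other bits are frozen to 0."""
    kernal_info = [0] * n
    for bit, position in zip(info_bits, info_positions):
        kernal_info[position] = bit
    return kernal_info


def polar_kernal_encode(bits):
    """Recursive polar encoder kernal (block length must be a power of 2)."""
    n = len(bits)
    if n == 1:
        return list(bits)
    top = polar_kernal_encode(bits[:n // 2])
    bottom = polar_kernal_encode(bits[n // 2:])
    return [a ^ b for a, b in zip(top, bottom)] + bottom


INFO_POSITIONS = [3, 5, 6, 7]  # assumed arrangement, consistent with Figure 3

kernal_info = interlace_frozen([1, 0, 0, 1], INFO_POSITIONS, 8)
encoded = polar_kernal_encode(kernal_info)
print(kernal_info)  # [0, 0, 0, 1, 0, 0, 0, 1]
print(encoded)      # [0, 0, 0, 0, 1, 1, 1, 1]
```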
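[Figure 3 (schematic). Bit values on each connection of the N = 8 polar code graph, from left to right. Connection 0 is labelled as a frozen bit on the left and as encoded bit 0 on the right; connection 3 is labelled as info bit 0 on the left and as encoded bit 3 on the right.]

Connection   Input   After Stage 0   After Stage 1   Output
0            0       0               1               0
1            0       0               1               0
2            0       1               1               0
3            1       1               1               0
4            0       0               1               1
5            0       0               1               1
6            0       1               1               1
7            1       1               1               1

Fig. 3: Example polar encoding process, using the N = 8 polar code graph, illustrating the case where a
particular arrangement of frozen bits is used to convert the K = 4 information bits [1001] into the M = 8
encoded bits [00001111].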
III. POLAR DECODER

In the receiver, the demodulator’s role is to recover information pertaining to the encoded block.
However, the demodulator is typically unable to obtain absolute confidence about the value of the M
bits in the encoded block, owing to the random nature of the noise in the communication channel. The
demodulator may express its confidence about the values of the bits in the encoded block by generating
a soft encoded block, which comprises M number of encoded soft bits. Each soft bit may be represented
in the form of a Logarithmic Likelihood Ratio (LLR)

$$\mathrm{LLR} = \ln \frac{\Pr(\mathrm{bit} = 0)}{\Pr(\mathrm{bit} = 1)},$$
where Pr(bit = 0) and Pr(bit = 1) are the probabilities that the corresponding bit has the value 0
and 1, respectively. Here, a positive LLR indicates that the demodulator has greater confidence that the
corresponding bit has a value of 0, while a negative LLR indicates greater confidence in the bit value
1. The magnitude of the LLR expresses the degree of confidence, where an infinite magnitude corresponds
to absolute confidence in this bit value, while a magnitude of 0 indicates that the demodulator has no
information about whether the bit value of 0 or 1 is more likely.
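The LLR definition can be illustrated with a short Python sketch. The function names are hypothetical, and resolving an LLR of exactly 0 in favour of the bit value 0 is an arbitrary assumption, since the LLR carries no information in that case.

```python
import math


def llr_from_probability(p0):
    """LLR = ln(Pr(bit = 0) / Pr(bit = 1)), where Pr(bit = 1) = 1 - p0."""
    if p0 <= 0.0:
        return -math.inf  # absolute confidence in the bit value 1
    if p0 >= 1.0:
        return math.inf   # absolute confidence in the bit value 0
    return math.log(p0 / (1.0 - p0))


def hard_decision(llr):
    """A positive LLR favours 0 and a negative LLR favours 1 (ties go to 0 here)."""
    return 0 if llr >= 0 else 1


print(llr_from_probability(0.5))  # 0.0, no information either way
print(hard_decision(llr_from_probability(0.1)))  # 1
```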
A polar decoder comprises three successive components, namely encoded block conditioning, the polar
decoder kernal and information block conditioning, as shown in Figure 1. These components are discussed
in the following paragraphs.
The input to the encoded block conditioning component of the polar decoder is a soft encoded block
and its output may be referred to as a soft kernal encoded block, which comprises N number of kernal
encoded LLRs. In order to convert the M encoded LLRs into N kernal encoded LLRs, infinite-valued
LLRs may be interlaced with the soft encoded block, to occupy the positions that correspond to the
0-valued kernal encoded bits that were removed by shortening in the polar encoder. Likewise, 0-valued
LLRs may be interlaced with the soft encoded block, to occupy the positions where kernal encoded
bits were removed by puncturing. In the case of repetition, the LLRs that correspond to replicas of a
particular kernal encoded bit may be summed and placed in the corresponding position within the soft
kernal encoded block. Additionally, the encoded block conditioning component must perform the inverse
of the sub-block interleaving, channel interleaving and code block concatenation operations in the NR
polar code, as shown in Figures 6 – 8.
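The three rate recovery cases described above can be sketched in Python as follows. This is a simplified illustration under loudly stated assumptions: the function name `rate_recover` is hypothetical, interleaving is ignored, shortening is assumed to have removed the last N − M bits, puncturing is assumed to have removed the first N − M bits, and repetition is assumed to have wrapped around the kernal encoded block cyclically. The positions used in practice are prescribed by the standard and are not reproduced here.

```python
import math


def rate_recover(soft_encoded, n, mode):
    """Convert M encoded LLRs into N kernal encoded LLRs."""
    m = len(soft_encoded)
    if mode == "repetition":  # M >= N
        kernal_llrs = [0.0] * n
        for k, llr in enumerate(soft_encoded):
            # LLRs of replicas of the same kernal encoded bit are summed.
            kernal_llrs[k % n] += llr
        return kernal_llrs
    if mode == "puncturing":  # M < N
        # Nothing is known about punctured bits, so their LLRs are 0.
        return [0.0] * (n - m) + list(soft_encoded)
    if mode == "shortening":  # M < N
        # Shortened bits are known to be 0, so their LLRs are +infinity.
        return list(soft_encoded) + [math.inf] * (n - m)
    raise ValueError("unknown rate matching mode: " + mode)


print(rate_recover([1.0, -2.0, 0.5, 3.0, 0.5, 1.0], 4, "repetition"))
print(rate_recover([1.0, -1.0], 4, "puncturing"))
print(rate_recover([1.0, -1.0], 4, "shortening"))
```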
The input to the polar decoder kernal is a soft kernal encoded block and its output may be referred to
as a recovered kernal information block, which comprises N number of recovered kernal information bits.
The polar decoder kernal may operate on the basis of various different algorithms, including Successive
Cancellation (SC) decoding [1] and Successive Cancellation List (SCL) decoding [6], which are detailed
in Sections III-A and III-B, respectively.
The input to the information block conditioning component of the polar decoder is a recovered kernal
information block and its output may be referred to as a recovered information block, which comprises
K number of recovered information bits. The recovered information block may be obtained by removing
all redundant bits from the recovered kernal information block. Additionally, the information block
conditioning component must perform the inverse of the code block segmentation, interleaving and
scrambling operations in the NR polar code, as shown in Figures 6 – 8.
A. SC decoding
A polar decoder kernal that operates on the basis of SC decoding may be considered to have a similar
graph structure to a polar encoder, as illustrated in Figure 2. An SC decoder performs computations
pertaining to the XORs in the graph, according to a sequence that is dictated by data dependencies.
However, the functionality of each XOR in the graph varies, when performing operations on LLRs and at
different steps in the SC decoding process. More specifically, there are three types of computations that
can be performed by a particular XOR in the graph, depending on the availability of LLRs provided on
the connections on its right-hand side, as well as upon the availability of bits provided on the connections
on its left-hand side.
The first occasion when an XOR can contribute to the SC decoding process is when an LLR has been
provided by each of the connections on its right-hand side. As shown in Figure 4(a), we refer to the first
and second of these two LLRs as x̃a and x̃b , respectively. This enables the XOR to compute an LLR x̃c
for the first of the two connections on its left-hand side, according to the f function
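$$\begin{aligned}
\tilde{x}_c &= f(\tilde{x}_a, \tilde{x}_b) \\
&= 2\tanh^{-1}\big(\tanh(\tilde{x}_a/2)\,\tanh(\tilde{x}_b/2)\big) && (1) \\
&\approx \operatorname{sign}(\tilde{x}_a)\,\operatorname{sign}(\tilde{x}_b)\,\min(|\tilde{x}_a|, |\tilde{x}_b|), && (2)
\end{aligned}$$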
where sign(·) returns −1 if its argument is negative and +1 if its argument is positive. Here, (2) is referred
to as the min-sum approximation.
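The f function of (1) and its min-sum approximation of (2) can be compared using the following Python sketch. The function names are hypothetical, and treating sign(0) as +1 is an assumption, since the text defines sign(·) only for negative and positive arguments.

```python
import math


def sign(x):
    """Return -1 for negative arguments and +1 otherwise (sign(0) = +1 is assumed)."""
    return -1.0 if x < 0 else 1.0


def f_exact(xa, xb):
    """Equation (1). Note that math.atanh raises ValueError if LLRs of very
    large magnitude cause the product of tanh terms to round to +/-1."""
    return 2.0 * math.atanh(math.tanh(xa / 2.0) * math.tanh(xb / 2.0))


def f_min_sum(xa, xb):
    """Equation (2): the min-sum approximation, which needs no tanh evaluations."""
    return sign(xa) * sign(xb) * min(abs(xa), abs(xb))


print(f_exact(2.41, -3.12))
print(f_min_sum(2.41, -3.12))  # -2.41
```

Both versions agree on the sign of the result, while the min-sum approximation never yields a smaller magnitude than the exact expression.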
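[Figure 4 (schematic). (a) The LLRs x̃a and x̃b on the right-hand side of the XOR yield x̃c = f(x̃a, x̃b) on the first left-hand connection. (b) The LLRs x̃a and x̃b, together with the bit ûa on the first left-hand connection, yield x̃d = g(x̃a, x̃b, ûa) on the second left-hand connection. (c) The bits ûa and ûb on the left-hand side yield ûc = XOR(ûa, ûb) and ûd = ûb on the right-hand side.]
Fig. 4: The three computations that can be performed for an XOR in the polar code graph: (a) the f
function, (b) the g function and (c) partial sum calculation.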
Later in the SC decoding process, a bit ûa will be provided on the first of the connections on the left-
hand side of the XOR, as shown in Figure 4(b). Together with the LLRs x̃a and x̃b that were previously
provided using the connections on the right-hand side, this enables the XOR to compute an LLR x̃d for
the second of the two connections on its left-hand side, according to the g function
$$\begin{aligned}
\tilde{x}_d &= g(\tilde{x}_a, \tilde{x}_b, \hat{u}_a) \\
&= (-1)^{\hat{u}_a}\,\tilde{x}_a + \tilde{x}_b. && (3)
\end{aligned}$$
Later still, a bit ûb will be provided on the second of the connections on the left-hand side of the
XOR, as shown in Figure 4(c). Together with the bit ûa that was previously provided using the first of
the connections on the left-hand side, this enables the partial sum computation of bits ûc and ûd for the
first and second connections on the right-hand side of the XOR, where
$$\begin{aligned}
\hat{u}_c &= \mathrm{XOR}(\hat{u}_a, \hat{u}_b), && (4) \\
\hat{u}_d &= \hat{u}_b. && (5)
\end{aligned}$$
As may be appreciated from the discussions above, the f function of (1) or (2) may be used to propagate
LLRs from right-to-left within the graph, while the partial sum computations of (4) and (5) may be used
to propagate bits from left-to-right, and the g function of (3) may be used to switch from propagating
bits to propagating LLRs.
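The g function of (3) and the partial sum computations of (4) and (5) are simple enough to be written as one-line Python functions. The names `g` and `partial_sums` are hypothetical, and bits are assumed to be represented by the integers 0 and 1.

```python
def g(xa, xb, ua):
    """Equation (3): the sign of xa is flipped when the bit ua is 1."""
    return (-xa if ua else xa) + xb


def partial_sums(ua, ub):
    """Equations (4) and (5): bits for the two right-hand connections."""
    return ua ^ ub, ub


print(g(-2.41, -0.72, 0))
print(g(2.41, -3.12, 1))
print(partial_sums(0, 1))  # (1, 1)
```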
In order that LLRs can be propagated from right to left, it is necessary to provide LLRs on the
connections on the right-hand edge of the graph. This is performed at the start of the SC decoding
process, by providing successive LLRs from the soft kernal encoded block on successive connections on
the right-hand edge of the graph. Likewise, it is necessary to provide bits on the connections of the left-
hand edge of the graph, in order to facilitate the propagation of bits from left to right. Here, a further data
dependency beyond those described above is imposed. If the position of a particular connection on the left-
hand edge of the graph corresponds to the position of an information bit in the kernal information block,
then the bit that is input into that connection depends on the LLR that is output from that connection.
More specifically, if a positive LLR is output on the connection, then a value of 0 may be selected
for the corresponding bit of the recovered kernal information block and then input into the connection.
Meanwhile, a negative LLR allows a value of 1 to be selected for the corresponding bit of the recovered
kernal information block and then input into the connection. In the case of a connection corresponding
to a redundant bit within the kernal information block, the value of that redundant bit may be input into
the connection as soon as it is known. Here, frozen bits always adopt the value 0, but the value of CRC
and PC bits will not become available until related information bits have been recovered.
In combination, the data dependencies described above impose a requirement for the information bits
within the recovered kernal information block to be obtained one at a time on the connections on the
left edge of the graph, in order from top to bottom. More specifically, the SC decoding process begins
by using the f function of (1) or (2) to propagate LLRs from the right-hand edge of the graph to the top
connection on the left-hand edge of the graph, allowing the first bit to be recovered. Following this, each
successive bit from top to bottom is recovered by using the partial sum computations of (4) and (5) to
propagate bits from left to right, then using the g function of (3) for a particular XOR to switch from bit
propagation to LLR propagation, before using the f function to propagate LLRs to the next connection on
the left-hand edge of the graph, allowing the corresponding bit to be recovered. This process is illustrated
in the example of Figure 5.
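The complete SC decoding schedule can be expressed compactly as a recursion over the polar code graph. The following Python sketch is a software illustration under stated assumptions rather than a description of any particular implementation: it uses the min-sum f function of (2), handles only frozen bits as redundant bits (CRC and PC bits are not considered), resolves an LLR of exactly 0 in favour of the bit value 0, and uses the hypothetical names `sc_decode`, `f` and `g`. The LLRs are those on the right-hand edge of the graph in Figure 5, and the frozen bit arrangement is the one that is consistent with that figure.

```python
def f(xa, xb):
    """Min-sum approximation of the f function, equation (2)."""
    s = -1.0 if (xa < 0) != (xb < 0) else 1.0
    return s * min(abs(xa), abs(xb))


def g(xa, xb, ua):
    """The g function, equation (3)."""
    return (-xa if ua else xa) + xb


def sc_decode(llrs, frozen):
    """Return (recovered kernal information bits, partial sums for the right-hand side)."""
    n = len(llrs)
    if n == 1:
        # Left-hand edge of the graph: frozen bits are 0, other bits follow the LLR sign.
        bit = 0 if frozen[0] or llrs[0] >= 0 else 1
        return [bit], [bit]
    half = n // 2
    a, b = llrs[:half], llrs[half:]
    # f functions propagate LLRs leftwards into the top half of the graph.
    u_top, sums_top = sc_decode([f(x, y) for x, y in zip(a, b)], frozen[:half])
    # g functions combine the same LLRs with the partial sums from the top half.
    u_bottom, sums_bottom = sc_decode(
        [g(x, y, s) for x, y, s in zip(a, b, sums_top)], frozen[half:])
    # Partial sum computations of (4) and (5) propagate bits rightwards.
    sums = [p ^ q for p, q in zip(sums_top, sums_bottom)] + sums_bottom
    return u_top + u_bottom, sums


LLRS = [2.41, -0.87, 3.56, 0.09, -3.12, 1.15, -0.72, -2.66]
FROZEN = [True, True, True, False, True, False, False, False]

kernal_bits, _ = sc_decode(LLRS, FROZEN)
info_bits = [bit for bit, is_frozen in zip(kernal_bits, FROZEN) if not is_frozen]
print(kernal_bits)  # [0, 0, 0, 1, 0, 0, 0, 1]
print(info_bits)    # [1, 0, 0, 1]
```

The recursion visits the bits on the left-hand edge of the graph strictly from top to bottom, which mirrors the serial data dependencies described above.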
B. SCL decoding
In the SC decoding process described in Section III-A, the value selected for each bit in the recovered
information block depends on the sign of the corresponding LLR, which in turn depends on the values
selected for all previous recovered information bits. If this approach results in the selection of the incorrect
value for a particular bit, then this will often result in the cascading of errors in all subsequent bits. The
selection of an incorrect value for an information bit may be detected with consideration of the subsequent
frozen bits, since the decoder knows that these bits should have values of 0. More specifically, if the
corresponding LLR has a sign that would imply a value of 1 for a frozen bit, then this suggests that an
error may have been made during the decoding of one of the preceding information bits. However, in the
SC decoding process, there is no opportunity to consider alternative values for the preceding information
bits. Once a value has been selected for an information bit, the SC decoding process moves on and the
decision is final.
This motivates SCL decoding [6], which enables a list of alternative values for the information bits to be
considered. As the decoding process progresses, it considers both options for the value of each successive
information bit. More specifically, an SCL decoder maintains a list of candidate kernal information blocks,
where the list and the kernal information blocks are built up as the SCL decoding process proceeds. At
the start of the process, the list comprises only a single kernal information block having a length of zero
bits. Whenever the decoding process reaches a frozen bit, a bit value of 0 is appended to the end of
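[Figure 5 (schematic). Values on each connection of the N = 8 polar code graph, listed from the left-hand edge to the right-hand edge, with the step at which each value becomes available shown in parentheses. Connection 0 is labelled as a frozen bit on the left and as encoded LLR 0 on the right; connection 3 is labelled as info bit 0 on the left and as encoded LLR 3 on the right.]

Connection 0: LLRs (3) +0.09, (2) +0.72, (1) −2.41, (0) +2.41; bits (4) 0, (7) 0, (14) 1
Connection 1: LLRs (5) +0.81, (2) +0.09, (1) −0.87, (0) −0.87; bits (6) 0, (7) 0, (14) 1
Connection 2: LLRs (9) +0.96, (8) −3.13, (1) −0.72, (0) +3.56; bits (10) 0, (13) 1, (14) 1
Connection 3: LLRs (11) −4.09, (8) −0.96, (1) −0.09, (0) +0.09; bits (12) 1, (13) 1, (14) 1
Connection 4: LLRs (17) −2.02, (16) +4.28, (15) −5.53, (0) −3.12; bits (18) 0, (21) 0
Connection 5: LLRs (19) +2.26, (16) −2.02, (15) +2.02, (0) +1.15; bits (20) 0, (21) 0
Connection 6: LLRs (23) +0.73, (22) −9.81, (15) −4.28, (0) −0.72; bits (24) 0
Connection 7: LLRs (25) −10.5, (22) −0.73, (15) −2.75, (0) −2.66; bits (26) 1

Fig. 5: Example SC decoding process, using the N = 8 polar code graph, for the case where a particular
arrangement of frozen bits is used to convert a particular set of M = 8 encoded LLRs into the K = 4
recovered information bits [1001]. The LLRs obtained using the f and g functions of (2) and (3) are shown
above each connection. The bits obtained using the partial sum computations of (4) and (5) are shown
below each connection. The accompanying numbers in parentheses identify the step of the SC decoding
process where the corresponding LLR or bit becomes available.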
each candidate kernal information block in the list. However, whenever the decoding process reaches an
information bit, two replicas of the list of candidate kernal information blocks are created. Here, the bit
value of 0 is appended to each block in the first replica and the bit value of 1 is appended to each block
in the second replica. Following this, the two lists are merged to form a new list having a length which
is double that of the original list. This continues until the length of the list reaches a limit L, which is
typically chosen as a power of two. From this point onwards, each time the length of the list is doubled
when considering an information bit, the worst L among the 2L candidate kernal information blocks are
identified and pruned from the list. In this way, the length of the list is maintained at L until the SCL
decoding process completes.
Throughout this process, the worst candidate kernal information blocks are identified by comparing and
sorting metrics that are computed for each block [7], based on the LLRs obtained on the left-hand edge of
the polar code graph. These LLRs are obtained throughout the SCL decoding process by using separate
replicas of the partial sum computations of (4) and (5) to propagate the bits from each candidate kernal
information block into the polar code graph, from left to right. Following this, separate replicas of the g
and f computations of (1) – (3) may be used to propagate corresponding replicas of the LLRs from right
to left, as in the SC decoding process described in Section III-A. The metric associated with appending
the bit value ûl,j in the position j ∈ [0, N − 1] to the candidate kernal information block l is given by
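$$\begin{aligned}
\phi_{l,j}(\hat{u}_{l,j}) &= \phi_{l,j-1} + \ln\!\left(1 + e^{-(1 - 2\hat{u}_{l,j})\,\tilde{x}_{l,j}}\right) && (6) \\
&\approx \begin{cases}
\phi_{l,j-1} & \text{if } \hat{u}_{l,j} = \tfrac{1}{2}\big(1 - \operatorname{sign}(\tilde{x}_{l,j})\big) \\
\phi_{l,j-1} + |\tilde{x}_{l,j}| & \text{otherwise}
\end{cases}, && (7)
\end{aligned}$$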
where x̃l,j is the corresponding LLR and φl,j−1 is the metric that was calculated for the candidate kernal
information block in the previous step of the SCL decoding process. Here, (7) is referred to as the min-sum
approximation. Note that since the metrics accumulate across all bit positions j ∈ [0, N − 1], they must
be calculated for all L candidate kernal information blocks whenever a frozen bit value of 0 is appended,
as well as for all 2L candidates when both possible values of an information bit are considered. In the
latter case, the 2L metrics are sorted and L candidates having the highest values are identified as being
the worst and are pruned from the list.
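The metric update of (6) and its approximation of (7) can be sketched in Python as follows. The function names are hypothetical, and an LLR of exactly 0 is assumed to favour the bit value 0.

```python
import math


def metric_exact(previous_metric, llr, bit):
    """Equation (6). math.exp may overflow for LLRs of very large magnitude."""
    return previous_metric + math.log1p(math.exp(-(1 - 2 * bit) * llr))


def metric_min_sum(previous_metric, llr, bit):
    """Equation (7): a penalty of |llr| is added only when the appended bit
    disagrees with the hard decision implied by the sign of the LLR."""
    hard_decision = 1 if llr < 0 else 0
    if bit == hard_decision:
        return previous_metric
    return previous_metric + abs(llr)


# Appending a frozen bit of 0 when the LLR is -2.02 (as at connection 4 in Figure 5).
print(metric_min_sum(0.0, -2.02, 0))  # 2.02
print(metric_exact(0.0, -2.02, 0))
```

In both versions the metric can only grow, so a candidate that has been penalised heavily cannot later recover, which is what makes pruning the highest metrics reasonable.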
Following the completion of the SCL decoding process, the candidate kernal information block having
the lowest metric may be selected as the recovered kernal information block. Alternatively, in CRC-aided
SCL decoding [8], all candidates in the list that do not satisfy a CRC are pruned, before the candidate
having the lowest metric is selected and output. The error correction capability of the NR polar code is
characterised in Figures 9 – 11.
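The list management described in this section can be sketched in Python as follows. This is a deliberately naive software model under stated assumptions: it recomputes the LLR for each bit position from scratch for every candidate rather than storing and copying intermediate LLRs, it uses the min-sum approximations of (2) and (7), it handles only frozen bits as redundant bits, and it omits the CRC-aided selection step. All function names are hypothetical, and the LLRs and frozen bit arrangement are those that are consistent with Figure 5.

```python
def encode(bits):
    """Recursive polar encoder kernal, used here to form partial sums."""
    n = len(bits)
    if n == 1:
        return list(bits)
    top, bottom = encode(bits[:n // 2]), encode(bits[n // 2:])
    return [a ^ b for a, b in zip(top, bottom)] + bottom


def f(xa, xb):
    s = -1.0 if (xa < 0) != (xb < 0) else 1.0
    return s * min(abs(xa), abs(xb))


def g(xa, xb, ua):
    return (-xa if ua else xa) + xb


def next_llr(llrs, decided):
    """LLR of bit position len(decided), given the earlier bit decisions."""
    n = len(llrs)
    if n == 1:
        return llrs[0]
    half = n // 2
    a, b = llrs[:half], llrs[half:]
    if len(decided) < half:
        return next_llr([f(x, y) for x, y in zip(a, b)], decided)
    sums = encode(decided[:half])
    return next_llr([g(x, y, s) for x, y, s in zip(a, b, sums)], decided[half:])


def scl_decode(llrs, frozen, list_size):
    """Return the surviving (candidate block, metric) pairs, best first."""
    paths = [([], 0.0)]  # a single candidate having a length of zero bits
    for j in range(len(llrs)):
        candidates = []
        for bits, metric in paths:
            x = next_llr(llrs, bits)
            # Frozen bits extend each candidate with 0; information bits double the list.
            for bit in ((0,) if frozen[j] else (0, 1)):
                agrees = bit == (1 if x < 0 else 0)
                candidates.append((bits + [bit], metric if agrees else metric + abs(x)))
        # Candidates with the highest metrics are the worst and are pruned.
        candidates.sort(key=lambda candidate: candidate[1])
        paths = candidates[:list_size]
    return paths


LLRS = [2.41, -0.87, 3.56, 0.09, -3.12, 1.15, -0.72, -2.66]
FROZEN = [True, True, True, False, True, False, False, False]

best_block, best_metric = scl_decode(LLRS, FROZEN, 4)[0]
print(best_block)  # [0, 0, 0, 1, 0, 0, 0, 1]
```

For this particular set of LLRs the best candidate with L = 4 coincides with the SC decoding result, and its metric of approximately 2.02 arises solely from the frozen bit at connection 4, whose LLR is negative.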
IV. CHALLENGES OF HARDWARE IMPLEMENTATION

There are several challenges associated with the hardware implementation of polar encoders and,
in particular, polar decoders. This section begins by discussing challenges that are common to the
implementation of both polar encoders and polar decoders, before discussing additional challenges that
are specific to polar decoders.
Data dependencies. As described in Sections II and III, the polar encoding and decoding
processes are characterised by particular data dependencies, which require the various processing
operations to be completed in a particular sequence. This limits the degree of parallel processing
that can be achieved during the implementation of polar encoders and decoders. This is
particularly challenging in the case of polar decoders, owing to the serial nature of the SC
and SCL algorithms. More specifically, the corresponding data dependencies require the kernal
information bits to be recovered one after another, in order from top to bottom of the polar
code graph. During the polar decoding process, the data dependencies allow different numbers
of operations to be completed in parallel at different times, as illustrated in the example of
Figure 5. In order to minimise the number of steps required to complete the decoding process,
a large amount of hardware may be used so that a single processing step is sufficient to complete
the largest number of parallel operations that are supported by the decoder data dependencies.
However, the data dependencies will prevent much of this hardware from being used throughout
the rest of the decoding process, which may motivate the use of a smaller amount of hardware
and a greater number of steps. However, either way, the ratio of hardware resource usage to
the latency required to complete the decoding process may be unfavourable, unless sophisticated
alternative techniques can be developed and utilised.
Routing. A particular challenge in the implementation of polar encoders and decoders is routing
the correct information to the correct hardware components at the correct time. As illustrated by
the graph representations of Figure 2, the polar encoder and decoder include intricate networks of
internal connections, particularly as the kernal block size N becomes large. Unless sophisticated
techniques for routing information around the polar code graph are developed, large interconnection
networks are required to enable information to be routed between each pairing of hardware
components. This is a particular challenge in the polar decoder, where partial sum bits must be
routed from the left-hand edge of the graph to the computation of g functions that are distributed
all over the graph, for example.
Flexibility. The 5G NR polar code is required to support a wide variety of kernal block sizes
N , comprising up to a maximum of Nmax = 1024 bits. This requires a compromise to be struck
between providing enough hardware to complete the processing of the longest block lengths with
a low latency, and providing so much hardware that it cannot be fully exploited when completing
the processing of the shortest block lengths. Unless sophisticated techniques for managing this
challenge are developed, a poor ratio of hardware resource usage to the latency required to
complete the decoding process will result for either the short or the long block lengths.
Interlacing. As described in Sections II and III, the block conditioning components of the polar
encoder and decoder are required to insert or remove bits in the various blocks, in order to
transform between block sizes of K, N and M . Here, the specific positions of the inserted
or removed bits depend on the particular combination of K, N and M , requiring the use of
very flexible interlacer and deinterlacer circuits, which must be capable of inserting or removing
an arbitrary number of bits in arbitrary positions within the various blocks. Here, sophisticated
techniques are required in order to facilitate hardware efficient block conditioning with low
latency.
Complicated block conditioning. The information block conditioning and encoded block conditioning
employed in the NR polar code are very complicated, since they include code block
segmentation, CRC attachment, CRC interleaving, CRC scrambling, PC and frozen bit insertion,
sub-block interleaving, rate matching, channel interleaving and code block concatenation, as
shown in Figures 6 – 8. Furthermore, there are intricate interdependencies between these operations,
where the frozen bit insertion process in the information block conditioning is dependent
on the rate matching operation in the encoded block conditioning, for example. In contrast to
other channel codes, where the various information and encoded block conditioning operations
can be completed separately, using independent processing blocks, the NR polar code requires its
processing blocks to be tightly coupled together in order to maximise the achievable performance.
The following challenges are specific to the implementation of polar decoders.
Decoder complexity. The complexity of a polar decoder is much greater than that of a polar
encoder for three reasons. Firstly, while polar encoders operate on the basis of bits, polar decoders
operate on the basis of the probabilities of bits, which require more memory to store and more
complex computations. Secondly, while polar encoders only have to consider the particular
permutation of the information block that they are presented with, polar decoders must consider
all possible permutations of the information block and must select that which is most likely.
Finally, while polar encoders only process each information block once, an SCL polar decoder
must process each information block L number of times, in order to achieve sufficiently strong
error correction. For these reasons, the latency, hardware resource usage and power consumption
of polar decoders are typically orders of magnitude greater than those of polar encoders.
Copy. As described in Section III-B, the SCL decoding process creates replicas of the list
of candidate kernal information blocks, as well as all associated intermediate LLRs and bits.
However, copying this large amount of information within a hardware implementation imposes
particular challenges for the implementation of the memory architecture. One option is to employ
memory blocks having very large bandwidths, allowing the copy process to be completed within
a small number of steps. Alternatively, the copy process could be completed over many steps,
requiring only a moderate memory bandwidth. However, either way, the ratio of hardware
resource usage to the latency required to complete the decoding process is unfavourable, unless
sophisticated alternative techniques can be developed and utilised. This challenge is particularly
important, since the hardware resource usage of polar decoders is typically dominated by memory.
Sort. Another key challenge in the implementation of the SCL decoding process is imposed
by metric sorting. As described in Section III-B, this sort is required in order to identify
and prune the worst L candidate kernal information blocks, among the merged list of 2L
candidates. One option is to employ a large amount of hardware to simultaneously compare
every one of the 2L candidates with every other one of the candidates, so that the sorting can
be completed within a short latency. Alternatively, the hardware resource requirement can be
reduced by structuring successive comparisons to efficiently reuse intermediate results, at the
cost of increasing the latency required to rank the 2L candidates. However, either way, the
ratio of hardware resource usage to the latency required to complete the decoding process is
unfavourable, unless sophisticated alternative techniques can be developed and utilised.
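The first option, in which every candidate is compared with every other candidate, can be sketched in software as follows. This is a behavioural model only, under two assumptions that are not stated above: a smaller metric is taken to indicate a more likely candidate, and ties are broken by candidate index. The function name `prune_all_pairs` and the metric values are made up for illustration.

```python
def prune_all_pairs(metrics, list_size):
    """Keep the list_size best candidates using all-pairs comparison.

    A candidate's rank is the number of candidates that beat it. In
    hardware, all of these comparisons could be evaluated simultaneously.
    Assumes a smaller metric means a more likely candidate.
    """
    n = len(metrics)
    ranks = []
    for i in range(n):
        rank = sum(
            1
            for j in range(n)
            if metrics[j] < metrics[i] or (metrics[j] == metrics[i] and j < i)
        )
        ranks.append(rank)
    survivors = [i for i in range(n) if ranks[i] < list_size]
    comparators = n * (n - 1) // 2  # one comparator per unordered pair
    return survivors, comparators

metrics = [3.1, 0.4, 7.9, 0.4, 2.2, 5.0, 1.3, 9.6]  # 2L = 8 made-up metrics
survivors, comparators = prune_all_pairs(metrics, list_size=4)
print(survivors, comparators)
```

The number of pairwise comparisons grows quadratically with 2L, which is why this approach buys its short latency with a large amount of hardware.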
CRC integration. CRC bits are employed by the NR polar code in order to facilitate error
detection and also to improve the error correction capability of the polar decoder. However,
there is a tradeoff between the error detection capability and the error correction capability. In
order to meet the error detection reliability requirements of NR, the CRC bits must be handled
very carefully, in a manner which is not captured in the NR standards. In particular, the CRC
(and PC) bits must be decoded as an integral part of the polar decoding process, using an
unconventional decoding technique. This is in contrast to conventional CRCs, which may be
decoded separately from other channel codes, in independent processing blocks, leading to a
much simpler implementation.
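For comparison, the conventional approach of checking a CRC separately, after list decoding has finished, can be sketched as below. This is not the integrated technique referred to above; it is the simpler CRC-aided selection of [3], shown under stated assumptions. The helper names are hypothetical, the CRC register is zero-initialised, and the polynomial D^6 + D^5 + 1 is assumed to be the CRC6 generator of TS 38.212 [2] and should be checked against the standard. The false alarm estimate of L × 2^-P likewise assumes that each wrong candidate passes a P-bit CRC with probability 2^-P.

```python
def crc_remainder(bits, poly):
    """Remainder of bits * x^P modulo poly (MSB-first lists, poly has leading 1)."""
    p = len(poly) - 1
    reg = list(bits) + [0] * p
    for i in range(len(bits)):
        if reg[i]:
            for j, coeff in enumerate(poly):
                reg[i + j] ^= coeff
    return reg[-p:]

def crc_attach(bits, poly):
    """Append the CRC bits to the information bits."""
    return list(bits) + crc_remainder(bits, poly)

def crc_passes(block, poly):
    """True if the trailing CRC bits match the leading information bits."""
    p = len(poly) - 1
    return crc_remainder(block[:-p], poly) == list(block[-p:])

def select_candidate(candidates, poly):
    """Return the first candidate (best metric first) that passes the CRC."""
    for cand in candidates:
        if crc_passes(cand, poly):
            return cand
    return None

def false_alarm_estimate(crc_bits, list_size):
    """Rough chance that some wrong candidate passes: L * 2^-P, capped at 1."""
    return min(1.0, list_size * 2.0 ** -crc_bits)

CRC6 = [1, 1, 0, 0, 0, 0, 1]  # assumed generator D^6 + D^5 + 1

message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
sent = crc_attach(message, CRC6)
wrong = sent.copy()
wrong[3] ^= 1  # a candidate containing a single bit error

chosen = select_candidate([wrong, sent], CRC6)
print(chosen == sent, false_alarm_estimate(24, 8))
```

Under this estimate, checking L = 8 candidates against a 24-bit CRC leaves roughly the error detection capability of a 21-bit CRC, which illustrates the tradeoff between error correction and error detection described above.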
V. CONCLUSIONS
In this white paper, we have discussed the selection of polar codes in the 5G NR standard and have
provided tutorials on the polar encoding and decoding processes, paying particular attention to the SC and
SCL decoding algorithms. Furthermore, we have discussed the challenges associated with the hardware
implementation of polar encoders and decoders, noting that these challenges are particularly great in the
case of the polar decoder, since its complexity is orders of magnitude greater than that of the polar encoder.
At AccelerComm, we have been researching polar codes since they were first published in 2009. We
have drawn upon our expertise and intuition for polar codes in order to develop polar encoder and decoder
solutions that address all of the challenges described in this white paper. We offer patent-pending,
first-to-market polar encoder and decoder Intellectual Property (IP) that allows all of the 5G requirements to
be met in Field Programmable Gate Array (FPGA) and Application Specific Integrated Circuit (ASIC)
implementations. More specifically, we have developed sophisticated solutions that overcome all of the
challenges described in Section IV, offering much greater flexibility, error correction capability and
hardware efficiency than all previously published implementations of polar encoders and decoders.
REFERENCES
[1] E. Arikan, “Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels,”
IEEE Transactions on Information Theory, vol. 55, no. 7, pp. 3051–3073, July 2009.
[2] 3rd Generation Partnership Project; Technical Specification Group Radio Access Network; NR; Multiplexing and channel coding (Release
15), 3GPP Std. TS 38.212, Rev. 15.0.0, December 2017.
[3] K. Niu and K. Chen, “CRC-aided decoding of polar codes,” IEEE Communications Letters, vol. 16, no. 10, pp. 1668–1671, October
2012.
[4] Huawei, HiSilicon, “Polar code construction for NR,” in 3GPP TSG RAN WG1 Meeting #86bis, Lisbon, Portugal, October 2016,
R1-1608862.
[5] ZTE, ZTE Microelectronics, “Rate matching of polar codes for eMBB,” in 3GPP TSG RAN WG1 Meeting #88, Athens, Greece, February
2017, R1-1701602.
[6] I. Tal and A. Vardy, “List decoding of polar codes,” in 2011 IEEE International Symposium on Information Theory Proceedings, July
2011, pp. 1–5.
[7] A. Balatsoukas-Stimming, M. B. Parizi, and A. Burg, “LLR-based successive cancellation list decoding of polar codes,” IEEE Transactions
on Signal Processing, vol. 63, no. 19, pp. 5165–5179, Oct 2015.
[8] K. Niu and K. Chen, “CRC-aided decoding of polar codes,” IEEE Communications Letters, vol. 16, no. 10, pp. 1668–1671, October
2012.
[9] T. Erseghe, “Coding in the finite-blocklength regime: Bounds based on Laplace integrals and their asymptotic approximations,” IEEE
Transactions on Information Theory, vol. 62, no. 12, pp. 6854–6883, December 2016.
Prof. Robert G. Maunder is an industry authority on error correction and channel coding. As a professor at the
University of Southampton, he built a team of experts and published over 100 IEEE papers and resources on the joint
design of algorithms and hardware implementations for error correction, including turbo, LDPC and polar codes.
Prof. Maunder has drawn on this expertise in founding AccelerComm, a semiconductor IP-core company
specialising in patent-pending channel coding solutions.
[Block diagram. Key: (x.x.x.x) denotes the section of TS 38.212.
PBCH encoder: PBCH payload generation (7.1.1), PBCH payload interleaving (7.1.1), PBCH payload scrambling (7.1.2),
CRC24C attachment (7.1.3), CRC interleaving (7.1.4), frozen bit insertion (7.1.4), polar encoding (7.1.4),
sub-block interleaving (7.1.5), rate matching (7.1.5), multiplexing onto PBCH. A further block performs the
determination of higher layer parameters.
PBCH decoder: demultiplexing from PBCH, rate dematching, sub-block deinterleaving, distributed-CRC-aided SCL
polar decoding, PBCH payload descrambling, PBCH payload deinterleaving, PBCH payload extraction. Further blocks
perform the determination of higher layer parameters and the determination of known bits.]
Fig. 6: Block diagram of the polar encoder and decoder employed by the Physical Broadcast Channel (PBCH) of 3GPP New
Radio.
[Block diagram. Key: (x.x.x.x) denotes the section of TS 38.212.
PDCCH encoder: DCI bit sequence generation (7.3.1), zero padding (7.3.1), ones-initialised CRC24C attachment (7.3.2),
CRC scrambling (7.3.2), CRC interleaving (7.3.3), frozen bit insertion (7.3.3), polar encoding (7.3.3),
sub-block interleaving (7.3.4), rate matching (7.3.4), multiplexing onto PDCCH. Further blocks perform the
determination of the RNTI and of the encoded block length.
PDCCH decoder: demultiplexing from PDCCH, rate dematching, sub-block deinterleaving, distributed-CRC-aided SCL
polar decoding, DCI bit sequence extraction. Further blocks perform the determination of the RNTI and of the
information block length.]
Fig. 7: Block diagram of the polar encoder and decoder employed by the Physical Downlink Control Channel (PDCCH) of
3GPP New Radio.
[Block diagram. Key: (x.x.x.x/x.x.x.x) denotes the sections of TS 38.212.
PUCCH/PUSCH encoder: UCI bit sequence generation (6.3.1.1/6.3.2.1), followed by one of two branches.
For K ∈ [12, 1706]: code block segmentation (6.3.1.2.1/6.3.2.2.1), CRC6 or CRC11 attachment (6.3.1.2.1/6.3.2.2.1),
PC and frozen bit insertion (6.3.1.3.1/6.3.2.3.1), polar encoding (6.3.1.3.1/6.3.2.3.1),
sub-block interleaving (6.3.1.4.1/6.3.2.4.1), rate matching (6.3.1.4.1/6.3.2.4.1),
channel interleaving (6.3.1.4.1/6.3.2.4.1), code block concatenation (6.3.1.5/6.3.2.5).
For K ∈ [1, 11] (identical to LTE): short block encoding (6.3.1.3.2/6.3.2.3.2), rate matching (6.3.1.4.2/6.3.2.4.2).
Both branches feed multiplexing onto PUCCH/PUSCH (6.3.1.6/6.3.2.6). A further block performs the determination
of encoded block length (6.3.1.4.1/6.3.2.4.1).
PUCCH/PUSCH decoder: demultiplexing from PUCCH/PUSCH, followed by one of two branches.
For K ∈ [12, 1706]: code block segmentation, channel deinterleaving, rate dematching, sub-block deinterleaving,
PC/CRC-aided SCL polar decoding, code block concatenation.
For K ∈ [1, 11] (identical to LTE): rate dematching, short block decoding.
Both branches feed UCI bit sequence extraction. A further block performs the determination of information
block length.]
Fig. 8: Block diagram of the polar encoder and decoder employed by the Physical Uplink Control Channel (PUCCH) of 3GPP
New Radio.
[Plot: PBCH polar code, K = 32, M = 864, QPSK, AWGN. Vertical axis: BLER, from 10^0 down to 10^-3.
Horizontal axis: Es/N0 [dB], from -12 to -5. Curves: L = 1, 2, 4, 8, 16, 32 and capacity.]
Fig. 9: Plot of Block Error Rate (BLER) versus channel Signal to Noise Ratio (SNR) Es/N0 for the
Physical Broadcast Channel (PBCH) polar code of 3GPP New Radio, when using Quadrature Phase Shift
Keying (QPSK) for communication over an Additive White Gaussian Noise (AWGN) channel. Here, K
is the number of bits in each information block, M is the number of bits in each encoded block and L is
the list size used during min-sum Successive Cancellation List (SCL) decoding. The simulation of each
SNR was continued until 1000 block errors were observed. Capacity plots are provided by the O(n^-2)
metaconverse PPV upper bound [9].
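The stopping rule mentioned in this caption, where each SNR point is simulated until a fixed number of block errors has been observed, can be sketched as follows. The function `estimate_bler` and the stand-in trial are hypothetical: a real simulation would replace the trial with one pass of encoding, transmission over the channel and SCL decoding.

```python
from typing import Callable

def estimate_bler(block_fails: Callable[[], bool], target_errors: int) -> float:
    """Run trials until target_errors block errors are seen; return errors / blocks.

    Note: this loops forever if block_fails never returns True.
    """
    blocks = 0
    errors = 0
    while errors < target_errors:
        blocks += 1
        if block_fails():
            errors += 1
    return errors / blocks

def make_toy_trial(period: int) -> Callable[[], bool]:
    """Deterministic stand-in for one simulated block: fails every period-th call."""
    count = 0

    def trial() -> bool:
        nonlocal count
        count += 1
        return count % period == 0

    return trial

bler = estimate_bler(make_toy_trial(4), target_errors=1000)
print(bler)
```

Stopping on a fixed error count rather than a fixed block count means that low-BLER points are simulated for longer, so that every plotted point rests on the same number of observed errors.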
[Plot: PDCCH polar code, BLER = 0.001, QPSK, AWGN. Vertical axis: required Es/N0 [dB], from -15 to 10.
Horizontal axis: K, from 8 to 128. Curves: M = 108, 216, 432, 864 and 1728, each for L = 8, L = 16 and capacity.]
Fig. 10: Plot of Signal to Noise Ratio (SNR) Es/N0 required to achieve a Block Error Rate (BLER) of 10^-3
versus the number of bits in each information block K for the Physical Downlink Control Channel (PDCCH)
polar code of 3GPP New Radio, when using Quadrature Phase Shift Keying (QPSK) for communication
over an Additive White Gaussian Noise (AWGN) channel. Here, M is the number of bits in each encoded
block and L is the list size used during min-sum Successive Cancellation List (SCL) decoding. The
simulation of each SNR was continued until 100 block errors were observed. Capacity plots are provided
by the O(n^-2) metaconverse PPV upper bound [9].
[Plot: PUCCH polar code, BLER = 0.001, QPSK, AWGN. Vertical axis: required Es/N0 [dB], from -25 to 15.
Horizontal axis: K, from 8 to 2048. Curves: M = 54, 108, 216, 432, 864, 1728, 3456, 6912 and 13824, each for
L = 8, L = 16 and capacity.]
Fig. 11: Plot of Signal to Noise Ratio (SNR) Es/N0 required to achieve a Block Error Rate (BLER) of
10^-3 versus the number of bits in each information block K for the Physical Uplink Control Channel (PUCCH)
polar code of 3GPP New Radio, when using Quadrature Phase Shift Keying (QPSK) for communication
over an Additive White Gaussian Noise (AWGN) channel. Here, M is the number of bits in each encoded
block and L is the list size used during min-sum Successive Cancellation List (SCL) decoding. The
simulation of each SNR was continued until 100 block errors were observed. Capacity plots are provided
by the O(n^-2) metaconverse PPV upper bound [9].