
Residue Number System - 2

Dr. Arunachalam V
Associate Professor, SENSE
Choosing the RNS moduli
• The set of moduli chosen for RNS affects both the representational
efficiency and the complexity of arithmetic algorithms.
• In general, we try to make the moduli as small as possible, since it is the
magnitude of the largest modulus m_{k-1} that dictates the speed of arithmetic
operations.
• We also often try to make all the moduli comparable in magnitude to the
largest one: with the computation speed already dictated by m_{k-1},
there is usually no advantage in fragmenting the design of the arithmetic unit
through the use of very small moduli at the right end.
Choosing the RNS moduli – using prime numbers
• Let us assume that we want to represent unsigned integers in the range
[0, 1,00,000] (decimal), which requires 17 bits in standard binary representation.
• RNS (13|11|7|5|3|2) ; M = 30,030
• Range is not yet adequate
• RNS (17|13|11|7|5|3|2) ; M = 5,10,510
• M is 5.1 times the required range, so the modulus 5 can be removed:
• RNS (17|13|11|7|3|2) ; M = 1,02,102
• The number of bits needed for encoding each number is (5+4+4+3+2+1 =
19 bits)
• Since the speed of arithmetic operations is dictated by the 5-bit
residues modulo m_5 = 17, we can combine the pairs of moduli 2 and 13
(giving 26) and 3 and 7 (giving 21) with no speed penalty.
• RNS (26|21|17|11) ; M = 1,02,102 ; This still needs 19 bits
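The dynamic range and bit counts quoted for these moduli sets can be re-derived with a few lines of Python. This is only a minimal sketch: the helper name `rns_stats` is hypothetical (not from the slides), and it assumes each residue modulo m is stored in the smallest number of bits that can hold 0 to m-1.

```python
from math import prod

def rns_stats(moduli):
    """Return (dynamic range M, total bits, width of the widest residue)."""
    # A residue modulo m takes values 0..m-1, so it needs (m-1).bit_length() bits.
    widths = [(m - 1).bit_length() for m in moduli]
    return prod(moduli), sum(widths), max(widths)

for moduli in [(13, 11, 7, 5, 3, 2), (17, 13, 11, 7, 3, 2), (26, 21, 17, 11)]:
    M, total_bits, widest = rns_stats(moduli)
    print(moduli, "M =", M, "bits =", total_bits, "widest residue =", widest)
```

Combining 2 with 13 and 3 with 7 leaves both M and the total bit count unchanged; only the number of residue channels drops.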
Choosing the RNS moduli – using pairwise relatively prime numbers

• Strategy: include powers of smaller primes before moving on to larger primes.


• The chosen moduli will still be pairwise relatively prime, since powers of
any two prime numbers are relatively prime.
• For example, after including m_0 = 2 and m_1 = 3 in our list of moduli, we
note that 2^2 is smaller than the next prime 5.
• So we modify m_0 and m_1 to get: RNS(2^2|3) ; M = 12
• This strategy is consistent with our desire to minimise the magnitude of the
largest modulus.
Choosing the RNS moduli – using powers of smaller prime numbers
• Similarly, after we have included m_2 = 5 and m_3 = 7, we note that both 2^3
and 3^2 are smaller than the next prime 11.
• RNS(3^2|2^3|7|5) M = 2520
• RNS(11|3^2|2^3|7|5) M = 27,720
• RNS(13|11|3^2|2^3|7|5) M = 3,60,360
• M is now 3.6 times larger than needed, so replace 3^2 with 3 and then
combine the pair 5 and 3 to obtain:
• RNS(15|13|11|2^3|7) M = 1,20,120, which needs 4+4+4+3+3 = 18 bits,
fewer than the 19 bits needed before.
• The speed has also improved because the largest residue is now 4 bits wide
instead of 5.
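Whenever moduli are replaced by prime powers or merged into products, it is worth re-checking that the set is still pairwise relatively prime and still covers the target range. A small sketch of such a check follows; the function names `pairwise_coprime` and `covers` are illustrative assumptions, not part of the lecture material.

```python
from itertools import combinations
from math import gcd, prod

def pairwise_coprime(moduli):
    """True if every pair of moduli has greatest common divisor 1."""
    return all(gcd(a, b) == 1 for a, b in combinations(moduli, 2))

def covers(moduli, max_value):
    """True if the dynamic range M = product of moduli exceeds max_value."""
    return prod(moduli) > max_value

final_set = (15, 13, 11, 8, 7)          # RNS(15|13|11|2^3|7)
print(pairwise_coprime(final_set))      # 15 = 3*5 shares no factor with the rest
print(covers(final_set, 100000))
print(pairwise_coprime((15, 9, 8, 7)))  # fails: 15 and 9 share the factor 3
```

The last call shows why 3^2 had to be reduced to 3 before it was merged with 5: keeping both 15 and 9 would break pairwise relative primality.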
Choosing the RNS moduli – some other strategies
• Other variations are possible. For example, given the simplicity of
operations with power-of-2 moduli, we might want to backtrack and
maximise the size of our even modulus within the 4-bit residue limit:
• RNS(2^4|13|11|3^2|7|5) M = 7,20,720
• We can remove 5 or 7 from the list of moduli, but the resulting RNS is in
fact inferior to RNS(15|13|11|2^3|7).
• This might not be the case with other examples; thus, once we have
converged on a feasible set of moduli, we should experiment with other
sets that can be derived from it by increasing the power of the even
modulus at hand.
Choosing the RNS moduli
• The preceding strategy for selecting the RNS moduli is guaranteed to lead to
the smallest possible number of bits for the largest modulus, thus maximizing
the speed of RNS arithmetic.
• However, speed and cost do not just depend on the widths of residues but
also on the moduli chosen.
• For example, we have already noted that power-of-2 moduli simplify the
required arithmetic operations, so that the modulus 16 might be better than
the smaller modulus 13 (except, perhaps, with a table-lookup
implementation).
• Addition modulo 2^a - 1 can be performed using a standard a-bit binary
adder with end-around carry.
Choosing the RNS moduli – even and odd moduli

• Hence, we are motivated to restrict the moduli to a power of 2 and odd
numbers of the form 2^a - 1.
• We can prove that the numbers 2^a - 1 and 2^b - 1 are relatively prime iff a
and b are relatively prime.
• Thus, any list of pairwise relatively prime numbers a_{k-2} > ... > a_1 > a_0
can be the basis of the following k-modulus RNS:
RNS(2^{a_{k-2}} | 2^{a_{k-2}} - 1 | ... | 2^{a_1} - 1 | 2^{a_0} - 1),
for which the widest residues are a_{k-2}-bit numbers.
• Note that to maximize the dynamic range with a given residue width, the
even modulus is to be chosen as large as possible.
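The end-around-carry addition that makes moduli of the form 2^a - 1 attractive can be modelled in software as follows. This is a behavioural sketch only, not a hardware description; the function name is hypothetical, and it assumes both operands are already a-bit residues.

```python
def add_mod_2a_minus_1(x, y, a):
    """Add two a-bit residues modulo 2**a - 1 using end-around carry."""
    mask = (1 << a) - 1            # a ones; numerically equal to the modulus
    s = x + y                      # ordinary binary addition, carry-out lands in bit a
    s = (s & mask) + (s >> a)      # feed the carry-out back into the least significant bit
    # The all-ones pattern is a second encoding of zero; map it to 0.
    return 0 if s == mask else s

print(add_mod_2a_minus_1(9, 11, 4))   # (9 + 11) mod 15
print(add_mod_2a_minus_1(7, 8, 4))    # 15 is congruent to 0 mod 15
```

The carry-out of bit a has weight 2^a, which is congruent to 1 modulo 2^a - 1, so adding it back at the least significant position preserves the residue.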
• For our target range [0, 1,00,000], RNS(32|31|15|7), built on the basis
5, 4, 3, possesses adequate range (M = 1,04,160).
• The derived RNS requires 5+5+4+3 = 17 bits for representing each
number, with the largest residue being 5 bits wide.
• In this case, the representational efficiency is close to 100% and no bit
is wasted, since standard binary also needs 17 bits for this range.
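A low-cost moduli set can be generated directly from a basis of exponents. The sketch below assumes the basis is given as a list of integers greater than 1; the name `low_cost_moduli` is illustrative. It also relies on the identity gcd(2^a - 1, 2^b - 1) = 2^gcd(a,b) - 1, which is why checking the exponents is sufficient.

```python
from itertools import combinations
from math import gcd, prod

def low_cost_moduli(exponents):
    """Build [2^a_max, 2^a_max - 1, ..., 2^a_0 - 1] from pairwise coprime exponents."""
    exps = sorted(exponents, reverse=True)
    # 2^a - 1 and 2^b - 1 are relatively prime iff a and b are relatively prime.
    if any(gcd(a, b) != 1 for a, b in combinations(exps, 2)):
        raise ValueError("exponents must be pairwise relatively prime")
    # The even modulus is made as wide as the widest odd modulus.
    return [1 << exps[0]] + [(1 << a) - 1 for a in exps]

moduli = low_cost_moduli([5, 4, 3])
print(moduli, "M =", prod(moduli))
```

With the basis 5, 4, 3 this reproduces the moduli 32, 31, 15 and 7, whose product is 104160.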
Conclusions – Choice of moduli
• To compare the RNS above to our best result with unrestricted moduli, we
list the parameters of the two systems together:
a. RNS(15|13|11|2^3|7) 18 bits M = 1,20,120
b. RNS(2^5|2^5-1|2^4-1|2^3-1) 17 bits M = 1,04,160
• Both provide the desired range.
• System b has wider but fewer residues. However, the simplicity of arithmetic
with low-cost moduli makes it the more attractive choice.
• The representational efficiency of a low-cost RNS is provably better than
50%, leading to the waste of no more than 1 bit in number representation.
• In general, restricting the moduli tends to increase the width of the largest
residues, and the optimal choice depends on both the application and the
target implementation technology.
ENCODING AND DECODING OF NUMBERS
Conversion from binary/decimal to RNS
• Given a number y, find its residues with respect to the moduli m_i,
0 ≤ i ≤ k - 1.
• Here y is an unsigned binary number.
• Conversion of sign-magnitude or 2's-complement numbers can be
accomplished by converting the magnitude and then complementing the
RNS representation if needed.
• To avoid time-consuming divisions, we take advantage of the following
equality, where ⟨z⟩_m denotes z mod m:
$$\langle (y_{k-1} \cdots y_1 y_0)_{two} \rangle_{m_i} = \langle \langle 2^{k-1} y_{k-1} \rangle_{m_i} + \cdots + \langle 2 y_1 \rangle_{m_i} + \langle y_0 \rangle_{m_i} \rangle_{m_i}$$

• If we precompute and store ⟨2^j⟩_{m_i} for each i and j, then the residue x_i
of y (mod m_i) can be computed by modulo-m_i addition of these constants.
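• Example: converting 10-bit binary numbers in the range [0, 839] to
RNS(8|7|5|3).
• Only residues mod 7, mod 5 and mod 3 are given in the table, since the
residue mod 8 is directly available as the 3 least significant bits of the
binary number y.
Precomputed residues of the first 10 powers of 2
j    2^j    2^j mod 7    2^j mod 5    2^j mod 3
0    1      1            1            1
1    2      2            2            2
2    4      4            4            1
3    8      1            3            2
4    16     2            1            1
5    32     4            2            2
6    64     1            4            1
7    128    2            3            2
8    256    4            1            1
9    512    1            2            2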
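Represent y = (1010 0100)_2 = (164)_10 in RNS(8|7|5|3).

• mod 8 is (100)_2 = 4
• y = 2^7 + 2^5 + 2^2
• mod 7 is (2 + 4 + 4) mod 7 = 3
• mod 5 is (3 + 2 + 4) mod 5 = 4
• mod 3 is (2 + 2 + 1) mod 3 = 2

• Answer : (4|3|4|2)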
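The table-based conversion can be sketched in Python as below. The names `MODULI`, `POW2` and `to_rns` are illustrative assumptions. For uniformity the sketch also uses the table for modulus 8, whereas in practice that residue is simply the 3 least significant bits of y.

```python
MODULI = (8, 7, 5, 3)
BITS = 10   # inputs are 10-bit numbers in [0, 839]

# POW2[m][j] holds the precomputed constant 2**j mod m.
POW2 = {m: [pow(2, j, m) for j in range(BITS)] for m in MODULI}

def to_rns(y):
    """Convert an unsigned integer to RNS(8|7|5|3) using only modular additions."""
    residues = []
    for m in MODULI:
        r = 0
        for j in range(BITS):
            if (y >> j) & 1:                 # bit j of y is set
                r = (r + POW2[m][j]) % m     # modulo-m addition of the stored constant
        residues.append(r)
    return tuple(residues)

print(to_rns(164))   # → (4, 3, 4, 2)
```

No division by a modulus is applied to y itself; the only reductions are on small partial sums, each of which stays below 2m before the reduction.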
Optimizing the number of modulo additions
• In the worst case, k modular additions are required for computing each
residue of a k-bit number. To reduce the number of operations, one can
view the given input number as a number in a high radix.
• For example, if we use radix 4, then storing the residues of 4^i, 2×4^i and
3×4^i in a table would allow us to compute each of the required residues
using only k/2 modular additions.
• The conversion for each modulus can be done by repeatedly using a single
lookup table and modular adder or by several copies of each arranged into a
pipeline.
• For a low-cost modulus m = 2^a - 1, the residue can be determined by
dividing y into a-bit segments and adding them modulo 2^a - 1.
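The segment-summing idea for a low-cost modulus can be illustrated as follows. This is a sketch under the assumption that y is a non-negative integer; the function name is hypothetical. It works because 2^a is congruent to 1 modulo 2^a - 1, so every a-bit segment carries weight 1.

```python
def residue_mod_2a_minus_1(y, a):
    """Compute y mod (2**a - 1) by adding the a-bit segments of y."""
    m = (1 << a) - 1
    total = 0
    while y:
        total += y & m        # lowest a-bit segment
        y >>= a               # move on to the next segment
    # Fold any overflow back in, as an end-around carry would.
    while total > m:
        total = (total & m) + (total >> a)
    return 0 if total == m else total   # all-ones also represents zero

print(residue_mod_2a_minus_1(164, 3))   # segments 100, 100, 10 → 4 + 4 + 2 = 10 → 3
```

For y = 164 and a = 3 this agrees with the residue mod 7 obtained from the table of powers of 2.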
Conversion from RNS to binary/decimal
• One approach: first derive the mixed-radix representation of the RNS number,
then use the weights of the mixed-radix positions to complete the
conversion.
• Alternatively, derive position weights for the RNS directly, based on the
Chinese remainder theorem (CRT).
The Chinese remainder theorem (CRT)
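• The magnitude of an RNS number can be obtained from the CRT formula:
$$x = (x_{k-1} | \cdots | x_1 | x_0)_{RNS} = \left\langle \sum_{i=0}^{k-1} M_i \langle \alpha_i x_i \rangle_{m_i} \right\rangle_M$$
• Where, by definition, M_i = M/m_i, and α_i = ⟨M_i^{-1}⟩_{m_i} is the
multiplicative inverse of M_i with respect to m_i (⟨z⟩_m denotes z mod m).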
Values needed in applying the CRT to RNS(8|7|5|3)
i    m_i    x_i    ⟨M_i ⟨α_i x_i⟩_{m_i}⟩_M
3    8      0      0
            1      105
            2      210
            3      315
            4      420
            5      525
            6      630
            7      735
2    7      0      0
            1      120
            2      240
            3      360
            4      480
            5      600
            6      720
1    5      0      0
            1      336
            2      672
            3      168
            4      504
0    3      0      0
            1      280
            2      560
Example

(1|0|0|0)RNS = 105
(0|1|0|0)RNS = 120
(0|0|1|0)RNS = 336
(0|0|0|1)RNS = 280

(3|2|4|2)RNS = (3 × 105 + 2 × 120 + 4 × 336 + 2 × 280) mod 840 = 2459 mod 840 = 779
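The CRT formula translates almost line by line into Python. The sketch below is illustrative rather than an optimized decoder: `crt_decode` is a hypothetical name, and the modular inverse uses the three-argument `pow` with exponent -1, which assumes Python 3.8 or later.

```python
from math import prod

def crt_decode(residues, moduli):
    """Recover x from its residues using x = (sum of M_i * ((alpha_i * x_i) mod m_i)) mod M."""
    M = prod(moduli)
    total = 0
    for x_i, m_i in zip(residues, moduli):
        M_i = M // m_i
        alpha_i = pow(M_i, -1, m_i)            # multiplicative inverse of M_i modulo m_i
        total += M_i * ((alpha_i * x_i) % m_i)
    return total % M

moduli = (8, 7, 5, 3)
print(crt_decode((4, 3, 4, 2), moduli))   # → 164
print(crt_decode((0, 0, 1, 0), moduli))   # position weight of the mod-5 digit
```

Decoding (4|3|4|2) returns 164, the number that was encoded in the binary-to-RNS example. The second call shows that the weight of the mod-5 position is not simply M_i = 168, because the inverse of 168 modulo 5 is 2 rather than 1.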


Reference
1. Chapter 4.2 & 4.3 of Behrooz Parhami, “Computer Arithmetic:
Algorithms and Hardware Design”, (2/e) Oxford University Press 2015.
DA 1
• Try the problem numbers [3.1, 3.2, 3.4, 3.5, 3.6, 3.7] and [4.1, 4.2, 4.3, 4.4,
4.5, 4.6, 4.13] of Behrooz Parhami, “Computer Arithmetic: Algorithms and
Hardware Design”, (2/e) Oxford University Press 2015.

Next Class
ENCODING AND DECODING THE RNS
NUMBER
