Contents

Preface  7

1 Modular congruences  9
  1.1 Motivation  9
  1.2 Divisibility  10
    1.2.1 Divisors, quotients and remainders  11
    1.2.2 Euclid's algorithm  13
    1.2.3 Prime numbers  20
  1.3 Modular arithmetic  24
    1.3.1 Congruence modulo n  24
    1.3.2 The rings Zn  30
    1.3.3 The Chinese Remainder Theorem  37
  1.4 RSA revisited  42
  1.5 Exercises  46

2 Pseudo-random numbers  51
  2.1 Motivation: traffic simulation  52
  2.2 Linear congruential generators  54
  2.3 Blum-Blum-Shub generators  59
  2.4 Traffic simulation revisited  59
  2.5 Exercises  62

3 Polynomials  65
  3.1 Motivation  65
    3.1.1 Digital circuit equivalence  65
    3.1.2 Inverse kinematics of a robot  66
  3.2 Basic concepts  68
    3.2.1 Rings of polynomials  68
    3.2.2 Monomial orderings  73
    3.2.3 Division of terms and polynomials  77
    3.2.4 Reduction modulo a set of polynomials  84
  3.3 Gröbner bases  90
    3.3.1 Ring ideals  91
  3.4  95
  3.5  101

4 Euler-Maclaurin formula  111
  4.1 Motivation  118
  4.2 Expressions  118
  4.3 Main results  122
  4.4 Examples  123
    4.4.1 Gaussian elimination  129
    4.4.2 Insertion sort  130
  4.5 Exercises  131

5 The discrete Fourier transform

6 Generating functions

References  214

Subject Index  219

Table of Symbols  223
Preface
The material presented herein constitutes a self-contained text on several topics
of discrete mathematics for undergraduate students. An outline of this text is as
follows.
In Chapter 1 we address modular congruences and their applications. A
motivating example from public key cryptography is presented in Section 1.1.
In Section 1.2, we introduce several concepts and results related to divisibility.
Section 1.3 concentrates on modular arithmetic including basic properties of the
rings Zn . Then, in Section 1.4, we revisit the motivating example.
In Chapter 2 we discuss the generation of pseudo-random numbers. Random numbers are useful in several different fields ranging from simulation and sampling to cryptography. In Section 2.1 we present a motivating example related to traffic simulation. In Section 2.2 we introduce linear congruential generators. Blum-Blum-Shub generators are presented in Section 2.3. The traffic simulation example is revisited in Section 2.4.
Chapter 3 presents several key concepts and results related to polynomials. In Section 3.1 we first discuss a motivating example illustrating how polynomials can be used to verify the equivalence of digital circuits. Then we illustrate the relevance of polynomials in robotics. In Section 3.2 we introduce the notion of polynomial over a field as well as the sum and product of polynomials. We then introduce division of polynomials and several related results. Gröbner bases and their properties are presented in Section 3.3. In Section 3.4 we revisit our motivating examples and show how to use Gröbner bases for finding solutions of systems of nonlinear polynomial equations.
In Chapter 4, we introduce several techniques to compute summations. In Section 4.1 we present a motivating example in bioinformatics. In Section 4.2 we introduce summation expressions and some of their relevant properties. The Euler-Maclaurin formula is presented in Section 4.3. In Section 4.4, we illustrate the relevance of summations to the analysis of the Gaussian elimination technique and the insertion sort algorithm.
In Chapter 5 we present the discrete Fourier transform. The discrete Fourier transform is widely used in many fields, ranging from image processing to efficient multiplication of polynomials and large integers. In Section 5.1 we discuss a motivating example that illustrates the use of the discrete Fourier transform
(Figure: diagram of chapter dependencies — 1. Modular congruences; 2. Pseudo-random numbers; 3. Polynomials; 4. Euler-Maclaurin formula; 6. Generating functions)
Chapter 1
Modular congruences
In this chapter we address modular congruences and their applications. The
chapter is organized as follows. We start in Section 1.1 with a motivating example
from public key cryptography. Then, in Section 1.2, we introduce several concepts
and results related to divisibility. Section 1.3 concentrates on modular arithmetic
including basic properties of the rings Zn . In Section 1.4 we revisit the example.
In Section 1.5 we propose some exercises.
1.1 Motivation
Consider the problem of sending a secret message through a public channel between two parties, say from Alice to Bob, who have not previously agreed upon a key. The best-known solution to this problem consists in using a public
key protocol to exchange the message. In public key cryptography, each party
has a pair of keys: a public key and a corresponding private key. The public
key, as the name suggests, is made public whereas the private key is kept secret.
Messages are encrypted with the public key and can only be decrypted with the
corresponding private key. Moreover, it should be hard to get the private key
from the public one, even in the presence of an encrypted message.
Let K be the set of public keys and R the set of private keys, and let X, Y be sets. Encryption is described as a family of maps u = {u_k : X → Y}_{k∈K} and decryption as a family of maps v = {v_r : Y → X}_{r∈R} such that:

1. for each public key k ∈ K there is a unique private key r_k ∈ R such that v_{r_k} ∘ u_k = id;
2. u_k and r_k must be computable efficiently (in polynomial time);
3. it should be hard to invert u_k when r_k is not known.
To send a message x to Bob, Alice first uses Bob's public key k to obtain the ciphered text u_k(x). When Bob receives u_k(x) over the public channel, he uses his private key r_k to obtain the original message v_{r_k}(u_k(x)) = x.
Due to Property 3, it is hard for a third party, say Eve, who eavesdrops on the channel and knows u_k(x) and k, to obtain x.
Example 1.1.1 The RSA cryptosystem is due to R. Rivest, A. Shamir and L. Adleman [26]. It can be characterized as in Figure 1.1, where mod(n, m) is the remainder of the (integer) division of n by m.
For the RSA cryptosystem to work we need to show that a, b, u and v can be efficiently computed and that the equality

v_{r_k} ∘ u_k = id_X

indeed holds. The proof uses several notions and results that are presented in this chapter.
As we shall see, the security of RSA relies on the conjecture that it is not possible to factor integers efficiently (in polynomial time). This conjecture may well not be true. Indeed, it has been shown by Shor [28] that quantum computers can factor integers in polynomial time. Therefore, RSA must be abandoned if and when quantum computers become available.
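Figure 1.1 is not reproduced here, but the RSA maps it characterizes are the usual ones, u_{(n,a)}(x) = mod(x^a, n) and v_{(n,b)}(y) = mod(y^b, n). Although this chapter's own examples are in Mathematica, the following Python sketch of these maps may help fix ideas; the primes and exponent below are toy assumptions chosen for illustration (real RSA keys use primes hundreds of digits long):

```python
# Toy RSA sketch (illustrative only; not secure parameters).
p, q = 61, 53                  # two distinct primes, kept private
n = p * q                      # modulus, part of both keys
phi = (p - 1) * (q - 1)        # (p-1)(q-1), used to pair the exponents

a = 17                         # public exponent, coprime to phi
b = pow(a, -1, phi)            # private exponent: inverse of a modulo phi

def u(x):                      # encryption map u_(n,a)
    return pow(x, a, n)

def v(y):                      # decryption map v_(n,b)
    return pow(y, b, n)

x = 1234                       # a message, as an element of Z_n
assert v(u(x)) == x            # v ∘ u = id on Z_n
```

Here pow(x, a, n) computes mod(x^a, n) efficiently, and pow(a, -1, phi) (Python 3.8+) computes a multiplicative inverse modulo phi, exactly the operations studied later in this chapter.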
1.2 Divisibility
1.2.1 Divisors, quotients and remainders
dividesQ=Function[{m,n},IntegerQ[n/m]]
Figure 1.2: Divisor test in Mathematica
The remainder of the division of an integer n by an integer m plays an important role in the sequel. We first consider the case where m is positive.
Definition 1.2.2 Let m, n ∈ Z with m > 0. We say that q ∈ Z is the quotient and r ∈ Z is the remainder of the integer division of n by m whenever

n = q·m + r  and  0 ≤ r < m.
For simplicity we often only refer to the quotient and remainder of the division
of n by m. The following result establishes that they are unique.
Proposition 1.2.3 For each n, m ∈ Z with m > 0 there are unique integers q and r such that n = q·m + r and 0 ≤ r < m.
Proof: Let S = {n − k·m : k ∈ Z}. The set S ∩ N0 is not empty since, for instance, n + |n|·m ∈ S. Let r be the least element of S ∩ N0. Then r = n − q·m for some q ∈ Z, and therefore n = q·m + r with r ≥ 0. If r ≥ m, considering r′ = n − (q + 1)·m and recalling that m > 0, it holds that r′ ≠ r and 0 ≤ r′ = r − m < r, contradicting the fact that r is the least element in S ∩ N0. Hence, r < m.
We now prove the uniqueness. Assume there are integers q, q′, r and r′ such that n = q·m + r and n = q′·m + r′ with 0 ≤ r, r′ < m. Then

m·(q − q′) = r′ − r.

Assuming without loss of generality that q ≥ q′ and recalling that m > 0, we can conclude that either r′ − r = 0 or r′ − r ≥ m. But, since 0 ≤ r, r′ < m, it holds that r′ − r < m. Hence, r′ − r = 0 = m·(q − q′). Therefore r = r′. Moreover, given that m > 0, also q − q′ = 0, that is, q = q′.
QED
Let n = q·m + r be the equality in Proposition 1.2.3. Observe that

q = ⌊n/m⌋,

that is, q is the floor of n/m. Recall that the map floor associates with each x the integer ⌊x⌋, that is, the largest integer k such that k ≤ x. We denote the remainder of the division by

mod(n, m).

Clearly,

mod(n, m) = n − ⌊n/m⌋·m

for n, m ∈ Z, m > 0.
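Although the chapter's examples use Mathematica, the equality above is easy to check in a short Python sketch (an aside, not part of the original text):

```python
import math

def quot_rem(n, m):
    """Quotient and remainder of the integer division of n by m (m > 0)."""
    q = math.floor(n / m)      # q = floor(n/m); exact for small operands
    r = n - q * m              # hence mod(n, m) = n - floor(n/m)*m
    return q, r

assert quot_rem(7, 3) == (2, 1)
assert quot_rem(-7, 3) == (-3, 2)   # 0 <= r < m holds also for negative n
```

Python's built-in divmod(n, m) agrees with this definition whenever m > 0.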
We now present some useful facts about remainders.
Proposition 1.2.4 Let n, n′, m, k ∈ Z with m, k > 0.

1. If 0 ≤ n < m then mod(n, m) = n.
2. mod(n, m) = mod(n′, m) if and only if n − n′ is a multiple of m.
3. If k divides m then mod(mod(n, m), k) = mod(n, k).
Proof:
1. Since n = 0 m + n and 0 n < m, by Proposition 1.2.3, mod(n, m) = n.
2. Assume that mod(n, m) = mod(n′, m). Since

n = ⌊n/m⌋·m + mod(n, m)  and  n′ = ⌊n′/m⌋·m + mod(n′, m)

we conclude that

n − n′ = ⌊n/m⌋·m − ⌊n′/m⌋·m.

Hence, n − n′ = (⌊n/m⌋ − ⌊n′/m⌋)·m and therefore n − n′ is a multiple of m.
Conversely, assume that m divides n − n′. Hence, n − n′ = q·m for some q ∈ Z and therefore
n = q·m + n′ = q·m + ⌊n′/m⌋·m + mod(n′, m),

that is, n = (q + ⌊n′/m⌋)·m + mod(n′, m). Since 0 ≤ mod(n′, m) < m, by Proposition 1.2.3, mod(n′, m) = mod(n, m).
3. We have that n − mod(n, m) = ⌊n/m⌋·m. As a consequence, since k divides m, we also have

n − mod(n, m) = q·k

for some q ∈ Z, that is, n and mod(n, m) differ by a multiple of k. By 2, mod(mod(n, m), k) = mod(n, k).
QED
We can extend Definition 1.2.2 to the case where m < 0 by requiring that 0 ≤ r < |m|. Clearly, Propositions 1.2.3 and 1.2.4 also extend to this case (Exercise 3 in Section 1.5).
1.2.2 Euclid's algorithm
We now present Euclid's algorithm for computing the greatest common divisor of two integers, as well as an extension of the algorithm.
Definition 1.2.5 Let m, n ∈ Z, not simultaneously equal to 0. The greatest common divisor of m and n, gcd(m, n), is the greatest integer that divides both m and n.
The case m = n = 0 is excluded because every integer divides 0 and therefore
there is no greatest integer that divides 0.
Clearly, gcd(m, n) = gcd(n, m) and gcd(m, n) is always a positive integer.
The following properties of the greatest common divisor are useful in the sequel.
Proposition 1.2.6 Let m, n ∈ Z, not simultaneously equal to 0.

1. gcd(0, m) = |m| for m ≠ 0.
2. gcd(m, n) = gcd(−m, n) = gcd(m, −n) = gcd(−m, −n).
3. gcd(n, m) = gcd(mod(n, m), m).
Proof:
1. It is easy to conclude that |m| is the largest element of the set {k ∈ Z : k|m} for m ≠ 0. Noting that k|0 for all k ∈ Z, we conclude that gcd(0, m) = |m| holds.
2. Since m = k·d if and only if −m = (−k)·d, for every d, k ∈ Z, we have that

{k ∈ Z : k|m} = {k ∈ Z : k|(−m)}.

Therefore, gcd(m, n) = gcd(−m, n). The other equalities also follow easily.
3. Assume n = q·m + mod(n, m) and let d|m, that is, m = k·d for some k ∈ Z. If d|mod(n, m), that is, mod(n, m) = k′·d for some k′ ∈ Z, then

n = q·k·d + k′·d = (q·k + k′)·d,

that is, d also divides n. Conversely, if d|m and d|n then mod(n, m) = n − q·m is also a multiple of d. Hence the pairs n, m and mod(n, m), m have the same common divisors, and therefore the same greatest common divisor.
QED
The second statement of Proposition 1.2.6 shows that for computing the greatest common divisor we can concentrate only on nonnegative integers. The first and third statements of Proposition 1.2.6 play a crucial role in Euclid's algorithm for computing the greatest common divisor. Clearly, one way of finding gcd(m, n), when m and n are both different from 0, consists of listing all the divisors of n and all the divisors of m, and then picking the largest element included in both lists. However, this is not efficient, even if we only list the positive divisors. Euclid's algorithm uses the results above to compute the greatest common divisor of two nonnegative integers in a more efficient way.
Euclid's algorithm, or the Euclidean algorithm, dates from around 300 BC, being included in the 7th book of Euclid's Elements. It can be described recursively as in Figure 1.3.
euclid(24, 18) = euclid(18, 24)    (18 = mod(18, 24))
             = euclid(6, 18)       (6 = mod(24, 18))
             = euclid(0, 6)        (0 = mod(18, 6))
             = 6
which makes the recursive nature of the algorithm clear. To compute euclid(24, 18) we can also just make a simple table including the recursion steps:

m    n    mod(n, m)
24   18   18
18   24   6
6    18   0
0    6

Once 0 is reached in column m, the result is the value in column n, that is, 6.
The recursion steps of euclid(m, n) determine pairs (x_i, y_i), for 0 ≤ i ≤ p, with (x_0, y_0) = (m, n), x_p = 0 and

x_i = mod(y_{i−1}, x_{i−1})  and  y_i = x_{i−1}  (1.1)

for each 1 ≤ i ≤ p. On the other hand, using 3 of Proposition 1.2.6 and the fact that gcd(a, b) = gcd(b, a), we have

gcd(x_{i−1}, y_{i−1}) = gcd(x_i, y_i)

for all 1 ≤ i ≤ p, hence

gcd(x_0, y_0) = gcd(x_p, y_p).  (1.3)

Since gcd(x_p, y_p) = gcd(0, y_p) = y_p, the algorithm indeed returns gcd(m, n).
The recursive function euclid in Figure 1.4 implements Euclid's algorithm in the obvious way (see Figure 1.3).
euclid=Function[{m,n},
If[m==0,
n,
euclid[Mod[n,m],m]]];
Figure 1.4: Euclids algorithm in Mathematica
The time complexity of Euclid's algorithm is discussed in Section 6.3.2 of Chapter 6.
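As an aside, the euclid function of Figure 1.4 can be ported directly to Python (the book's own examples are in Mathematica):

```python
def euclid(m, n):
    """Port of the Mathematica euclid of Figure 1.4: gcd of nonnegative m, n."""
    if m == 0:
        return n
    return euclid(n % m, m)    # euclid(mod(n, m), m)

assert euclid(24, 18) == 6     # the worked example above
assert euclid(0, 7) == 7       # gcd(0, n) = n for nonnegative n
```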
Extended Euclid's algorithm
We start by stating an important property of the greatest common divisor.
Proposition 1.2.10 Let m, n ∈ Z, not both equal to 0. Then gcd(n, m) is the smallest positive integer of the form a·m + b·n for a, b ∈ Z.

Proof: Let us first consider the case n = 0. Then a·m + b·n becomes a·m. The smallest positive integer of this form is |m|, that is, gcd(n, m), recalling 1 of Proposition 1.2.6. The case m = 0 is clearly similar.
Assume now that m and n are both different from 0, and let S be the set of all positive integers of the form a·m + b·n for integers a and b. Then S is nonempty (for instance, m·m + n·n ∈ S) and therefore it has a least element x = a·m + b·n for some a and b. Let q and r be the quotient and remainder of the division of m by x, respectively. Then m = q·x + r with 0 ≤ r < x and therefore

r = m − q·(a·m + b·n) = (1 − q·a)·m + (−q·b)·n.

If r > 0 then r ∈ S. But, since r < x, this contradicts the fact that x is the least element of S. Hence, r = 0 and therefore x divides m. Reasoning in a similar way with respect to x and n, we conclude that x divides n. Thus, x is a common divisor of m and n. Let y ∈ Z be another common divisor of m and n. Then

x = a·m + b·n = a·k·y + b·k′·y = (a·k + b·k′)·y

for some k, k′ ∈ Z, and therefore y ≤ x. Hence, x = gcd(n, m).
QED
To compute exteuclid(24, 18), for instance, we can make a table like the one used for euclid(24, 18), now with two extra columns for the coefficients a and b:

m    n    mod(n, m)    a                    b
24   18   18           1   (= −1 − (−1)·0)  −1
18   24   6            −1  (= 0 − 1·1)      1
6    18   0            1   (= 1 − 0·3)      0
0    6                 0                    1

We start by filling in the first three columns of each line, top down. This corresponds to the table in Example 1.2.9. Once we get 0 in column m, we can fill in the last line of columns a and b with 0 and 1, respectively, since exteuclid(0, n) = (0, 1). Afterwards, we fill in the other lines of these columns, bottom up, following the second equality of the algorithm: in each line, a = b′ − a′·⌊n/m⌋ and b = a′, where (a′, b′) is the pair in the line below, as indicated. Hence,

exteuclid(24, 18) = (1, −1).

Observe that, indeed, gcd(24, 18) = 6 = 1·24 + (−1)·18.
We can now compute exteuclid(15, 100), for instance, in a similar way:

m    n    mod(n, m)    a                    b
15   100  10           7   (= 1 − (−1)·6)   −1
10   15   5            −1  (= 0 − 1·1)      1
5    10   0            1   (= 1 − 0·2)      0
0    5                 0                    1

We conclude that exteuclid(15, 100) = (7, −1). Note again that gcd(15, 100) = 5 = 7·15 + (−1)·100.
Proposition 1.2.12 The extended Euclid's algorithm is sound, that is,

if exteuclid(m, n) = (a, b) then gcd(m, n) = a·m + b·n

for m, n nonnegative integers not both equal to 0.
Proof: To compute the value of exteuclid(m, n), with m ≠ 0, we start by computing exteuclid(mod(n, m), m), and go on repeating this step until we are required to compute exteuclid(0, k) for some k. Hence, we are required to compute

exteuclid(x_i, y_i)

for 1 ≤ i ≤ p and p ≥ 0, where (x_0, y_0) = (m, n), x_p = 0 and

(x_i, y_i) = (mod(y_{i−1}, x_{i−1}), x_{i−1})

for each 1 ≤ i ≤ p. Note that there are indeed finitely many of these pairs (x_i, y_i), that is, after a finite number of steps we get x_p = 0 for some p, because these pairs are exactly as in (1.1) above. In fact, as remarked therein,

y_p = gcd(m, n).  (1.4)

Let (a_i, b_i) = exteuclid(x_i, y_i) for 0 ≤ i ≤ p. By the definition of the algorithm, (a_p, b_p) = exteuclid(0, y_p) = (0, 1) and

(a_{i−1}, b_{i−1}) = (b_i − a_i·⌊y_{i−1}/x_{i−1}⌋, a_i)  (1.5)

for each 1 ≤ i ≤ p. Observe that, for any integers u and v and x ≠ 0,

u·(y − ⌊y/x⌋·x) + v·x = u·y − u·⌊y/x⌋·x + v·x = (v − u·⌊y/x⌋)·x + u·y.  (1.6)

Since x_i = mod(y_{i−1}, x_{i−1}) = y_{i−1} − ⌊y_{i−1}/x_{i−1}⌋·x_{i−1}, taking u = a_i, v = b_i, x = x_{i−1} and y = y_{i−1} in (1.6) yields

a_i·x_i + b_i·y_i = (b_i − a_i·⌊y_{i−1}/x_{i−1}⌋)·x_{i−1} + a_i·y_{i−1} = a_{i−1}·x_{i−1} + b_{i−1}·y_{i−1}

for each 1 ≤ i ≤ p, hence

a_p·x_p + b_p·y_p = a_0·x_0 + b_0·y_0.

On the other hand, recalling (1.4),

gcd(m, n) = y_p = 0·0 + 1·y_p = a_p·x_p + b_p·y_p

and therefore

gcd(m, n) = a_0·x_0 + b_0·y_0.

Since (a, b) = exteuclid(m, n) = (a_0, b_0), we can finally conclude that, indeed, gcd(m, n) = a·m + b·n.
QED
The recursive function exteuclid in Figure 1.6 implements the extended Euclid's algorithm in Mathematica.
exteuclid=Function[{m,n},
  If[m==0,
    {0,1},
    {exteuclid[Mod[n,m],m][[2]]-exteuclid[Mod[n,m],m][[1]]*Floor[n/m],
     exteuclid[Mod[n,m],m][[1]]}]];

Figure 1.6: Extended Euclid's algorithm in Mathematica
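As an aside, the same function ports directly to Python (the book's own examples are in Mathematica), and the soundness property of Proposition 1.2.12 can be checked on the worked examples:

```python
def exteuclid(m, n):
    """Port of the Mathematica exteuclid of Figure 1.6.
    Returns (a, b) with a*m + b*n == gcd(m, n), for nonnegative m, n."""
    if m == 0:
        return (0, 1)
    a, b = exteuclid(n % m, m)          # exteuclid(mod(n, m), m)
    return (b - a * (n // m), a)        # the pair (1.5) of the soundness proof

a, b = exteuclid(24, 18)
assert (a, b) == (1, -1) and a * 24 + b * 18 == 6

a, b = exteuclid(15, 100)
assert a * 15 + b * 100 == 5            # gcd(15, 100) = 5
```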
1.2.3 Prime numbers
n = a·p·n + b·k·p = (a·n + b·k)·p

for some k ∈ Z, that is, p divides n.
QED
n = ∏_{i=1}^{s} p_i  and  n = ∏_{j=1}^{t} q_j

where p_i and q_j are prime numbers for 1 ≤ i ≤ s and 1 ≤ j ≤ t, and p_i ≤ p_{i+1} and q_j ≤ q_{j+1} for each 1 ≤ i < s and 1 ≤ j < t. Moreover, assuming without loss of generality that s ≤ t, there is 1 ≤ i ≤ s such that p_i ≠ q_i. Clearly, s, t > 1. Then, since p_1 divides n, by Proposition 1.2.15, p_1 divides q_1 or p_1 divides q_2·…·q_t.
On the one hand, if p_1 divides q_1, then p_1 = q_1 since they are both prime. Thus,

n′ = ∏_{i=2}^{s} p_i  and  n′ = ∏_{j=2}^{t} q_j,

that is, n′ < n has two distinct prime factorizations. But this contradicts the assumption that n is the smallest integer greater than 1 satisfying this property.
On the other hand, since q_2·…·q_t < n, then q_2·…·q_t has a unique prime factorization. Hence, if p_1 divides q_2·…·q_t then q_2·…·q_t = k·p_1 and therefore p_1 = q_j for some 2 ≤ j ≤ t. Removing p_1 from the first factorization of n and q_j from the second, we again end up with 1 < n′ < n with two distinct prime factorizations, thus contradicting once more the assumption regarding n.
As a consequence, we conclude that every integer n > 1 has a unique prime factorization.
QED
Example 1.2.17 The factorizations into primes of 15, 90 and 2205, for instance, are as follows:

15 = 3^1·5^1
90 = 2^1·3^2·5^1
2205 = 3^2·5^1·7^2
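A small Python aside (the book's examples use Mathematica): prime factorizations such as the ones above can be computed by simple trial division, which is perfectly adequate for small numbers.

```python
def prime_factorization(n):
    """Factor n > 1 into primes by trial division; fine for small n."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:              # divide out each prime completely
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:                          # leftover factor is itself prime
        factors[n] = factors.get(n, 0) + 1
    return factors

assert prime_factorization(15) == {3: 1, 5: 1}
assert prime_factorization(90) == {2: 1, 3: 2, 5: 1}
assert prime_factorization(2205) == {3: 2, 5: 1, 7: 2}
```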
1.3 Modular arithmetic
This section concentrates on modular arithmetic, that is, where the arithmetic
operations are defined modulo n. Modular arithmetic was first introduced by the
German mathematician C. Gauss in 1801.
1.3.1 Congruence modulo n
congrMod=Function[{a,b,n},Mod[a,n]==Mod[b,n]];
Figure 1.7: Congruence modulo n in Mathematica
The relation =_n is an equivalence relation, that is, it is reflexive, symmetric and transitive (Exercise 8 in Section 1.5). The next result relates the congruences modulo m and n with the congruence modulo m·n.
Proposition 1.3.3 Let a, b, m, n ∈ Z with m, n > 0. If m and n are coprime, then a =_{mn} b if and only if a =_m b and a =_n b.
Proof:
(→) Assume a =_{mn} b. Using 2 of Proposition 1.2.4, a − b = k·m·n for some k ∈ Z, and therefore both a =_m b and a =_n b.
(←) Assume that a =_m b and a =_n b. The result is straightforward if m or n is equal to 1, and, by reflexivity, it is also immediate if a = b. Let then m, n > 1 and, without loss of generality, a > b. By 2 of Proposition 1.2.4,

a − b = k′·∏_{i=1}^{r} p_i^{e_i}  and  a − b = k″·∏_{j=1}^{r′} (p′_j)^{e′_j}

for some k′, k″ ∈ Z, where ∏_{i=1}^{r} p_i^{e_i} and ∏_{j=1}^{r′} (p′_j)^{e′_j} are the factorizations of m and n into prime numbers, respectively. Since m and n are coprime, by 2 of Proposition 1.2.19, p_i ≠ p′_j for all 1 ≤ i ≤ r and 1 ≤ j ≤ r′. Hence, by the uniqueness of the prime factorization of a − b, every p_i^{e_i} and every (p′_j)^{e′_j} occurs in that factorization, and therefore

a − b = k·∏_{i=1}^{r} p_i^{e_i}·∏_{j=1}^{r′} (p′_j)^{e′_j} = k·m·n

for some k ∈ Z, that is, a =_{mn} b.
QED
The operations

[a]_n +_n [b]_n = [a + b]_n  and  [a]_n ·_n [b]_n = [a·b]_n

are well defined, that is, the classes [a]_n and [b]_n of two integers uniquely determine the classes [a]_n +_n [b]_n and [a]_n ·_n [b]_n. We can also consider the unary operation −_n on Zn:

−_n [a]_n = [−a]_n.

The set Zn equipped with these operations has several important algebraic properties that are studied in Section 1.3.2.
When no confusion arises we can refer to Zn as the set

{0, ..., n − 1}

for a simplified notation.
We end this section with some results involving modular congruences that will be useful later on. We begin with the following theorem, known as Euler's theorem, which we do not prove. The interested reader is referred to [15].
Hence, on the one hand, a^{k_2}·b^{k_2} =_n 1 and, on the other hand, a^{k_1}·b^{k_2} =_n a^{k_1−k_2}, assuming that k_1 > k_2. As a consequence, a^{k_1−k_2} =_n 1.
QED

Observe that if a is not coprime to n then the existence of k ∈ N such that a^k =_n 1 is not ensured. As an example, note that mod(2^k, 4) is either 2 or 0 for all k ∈ N (Exercise 11 in Section 1.5).
Assume, by contradiction, that k does not divide φ(n), that is, mod(φ(n), k) ≠ 0. Since φ(n) = ⌊φ(n)/k⌋·k + mod(φ(n), k),

a^{φ(n)} = (a^k)^{⌊φ(n)/k⌋}·a^{mod(φ(n),k)} =_n a^{mod(φ(n),k)}

and, since a^{φ(n)} =_n 1 by Euler's theorem, then

1 =_n a^{mod(φ(n),k)}.

But, as 0 < mod(φ(n), k) < k, the above congruence contradicts the assumption that k is the order modulo n of a. We can then conclude that k divides φ(n).
QED
Given that a^k =_n mod(a, n)^k for all k ∈ N0, the order of any a > n coprime to n is always equal to the order of mod(a, n) < n. Hence, there is always a primitive element modulo n less than n when n > 1.
Example 1.3.10 Let us consider n = 5. The order of 4 modulo 5 is 2 since 4 =_5 4 and 4^2 =_5 1, but 4 is not a primitive element modulo 5 since the order of 3 modulo 5 is 4. In fact, 3 =_5 3, 3^2 =_5 4, 3^3 =_5 2 and 3^4 =_5 1. Noting that the order of 2 modulo 5 is also 4, we conclude that 2 and 3 are primitive elements modulo 5.
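The orders in Example 1.3.10 are easy to recompute; as an aside to the Mathematica-based presentation, a brute-force Python sketch is:

```python
def order_mod(a, n):
    """Least k >= 1 with a**k congruent to 1 modulo n; assumes gcd(a, n) == 1."""
    k, power = 1, a % n
    while power != 1:
        power = (power * a) % n   # next power of a modulo n
        k += 1
    return k

# Example 1.3.10: 4 has order 2 modulo 5; 2 and 3 have order 4.
assert order_mod(4, 5) == 2
assert order_mod(3, 5) == 4
assert order_mod(2, 5) == 4
```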
The Carmichael function associates any positive integer n with the order of
the primitive elements modulo n.
In Mathematica, the order of a modulo n and λ(n) can be computed using the functions MultiplicativeOrder and CarmichaelLambda, respectively.
The following notion of quadratic residue modulo n is also useful later on in
Chapter 2.
Definition 1.3.13 Let n be an integer greater than 2. The integer a is a quadratic residue modulo n if a is coprime to n and there is an integer x such that x^2 =_n a.
Example 1.3.14 Recall Example 1.3.10 and let us consider n = 5. Since

1^2 =_5 4^2 =_5 1  and  2^2 =_5 3^2 =_5 4

we can conclude that 1 and 4 are quadratic residues modulo 5, but, for instance, 2 and 3 are not quadratic residues modulo 5.
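The quadratic residues modulo a small n can also be enumerated directly; a Python aside (the chapter itself uses Mathematica):

```python
from math import gcd

def quadratic_residues(n):
    """Quadratic residues modulo n: values a coprime to n with x*x = a (mod n) solvable."""
    squares = {x * x % n for x in range(n)}      # all squares modulo n
    return sorted(a for a in squares if gcd(a, n) == 1)

# Example 1.3.14: modulo 5 the quadratic residues are exactly 1 and 4.
assert quadratic_residues(5) == [1, 4]
```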
1.3.2 The rings Zn
In this section we endow the sets Zn with some algebraic structure. For simplicity, we will consider Zn = {0, 1, ..., n − 1}. Then the operations +_n, ·_n and −_n on Zn presented at the end of Section 1.3.1 become

a +_n b = mod(a + b, n)
a ·_n b = mod(a·b, n)
−_n a = mod(n − a, n)
(a + b) + c = a + (b + c)    (associativity of +)
a + b = b + a                (commutativity of +)
a + 0 = a                    (additive identity)
a + (−a) = 0                 (−a additive inverse of a)
(a·b)·c = a·(b·c)            (associativity of ·)
a·b = b·a                    (commutativity of ·)
a·1 = a                      (multiplicative identity)
for every a ∈ A. When A is a unitary ring we can refer to the inverse with respect to ·. Then, b ∈ A is a multiplicative inverse of a ∈ A if a·b = b·a = 1. Clearly, multiplicative inverses are unique (Exercise 13 in Section 1.5). We use a^{−1} to denote the multiplicative inverse of a, whenever it exists.
A unitary commutative ring where every nonzero element has a multiplicative inverse has a special name: such an algebraic structure is a field.
Definition 1.3.17 A field is a unitary commutative ring A = (A, +, 0, −, ·) where the multiplicative identity is distinct from 0 and every a ∈ A\{0} has a multiplicative inverse.
It is easy to conclude that (R, +, 0, −, ·) is a field, whereas (Z, +, 0, −, ·) is a unitary commutative ring but not a field (Exercise 15 in Section 1.5). In the following example we show that endowing Zn with the operations +_n and ·_n we obtain a unitary commutative ring.
Commutativity of +_n and the fact that 0 is the additive identity follow in the same way from the corresponding properties of addition of integers. For left distributivity, note that

(a·b) + (a·c) =_n mod(a·b, n) + mod(a·c, n).

From the distributivity of · over + and the transitivity of =_n it easily follows that mod(a·mod(b + c, n), n) = mod(mod(a·b, n) + mod(a·c, n), n).
The proofs of the associativity and commutativity of ·_n are similar to the ones for +_n. Right distributivity follows from left distributivity and the commutativity of +_n and ·_n. Proving that −_n a is the additive inverse of a is also easy, and it is left as an exercise to the reader.
For simplicity, when considering the ring (Zn, +_n, 0, −_n, ·_n) we often just refer to the ring Zn. An element a ∈ Zn with a multiplicative inverse is also said to be a unit of Zn, and the corresponding inverse is denoted by a_n^{−1}, or simply a^{−1}. The reference to n can be omitted if no ambiguity arises. Similarly, the reference to n can also be omitted in the additive inverse −_n a.
Example 1.3.19 Recall that Z5 = {0, 1, 2, 3, 4}. In the ring (Z5, +_5, 0, −_5, ·_5):

- −_5 2, the additive inverse of 2, is 3;
- −_5 4, the additive inverse of 4, is 1;
- 1 has a multiplicative inverse and 1_5^{−1} is 1, since 1 ·_5 1 = 1;
- 2 has a multiplicative inverse and 2_5^{−1} is 3, since 2 ·_5 3 = mod(2·3, 5) = 1;
- 4 has a multiplicative inverse and 4_5^{−1} is 4, since 4 ·_5 4 = mod(4·4, 5) = 1.

Clearly, 3 also has a multiplicative inverse and 3_5^{−1} is 2. Only 0 has no multiplicative inverse. Hence, (Z5, +_5, 0, −_5, ·_5) is also a field.
Example 1.3.20 Recall that Z6 = {0, 1, 2, 3, 4, 5}. In the ring (Z6, +_6, 0, −_6, ·_6):

- −_6 2, the additive inverse of 2, is 4;
- −_6 3, the additive inverse of 3, is 3;
- 1 has a multiplicative inverse and 1_6^{−1} is 1, since 1 ·_6 1 = 1;
- 5 has a multiplicative inverse and 5_6^{−1} is 5, since 5 ·_6 5 = mod(5·5, 6) = 1.

Only 1 and 5 have multiplicative inverses. Hence, the ring (Z6, +_6, 0, −_6, ·_6) is not a field.
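The units of Zn found in Examples 1.3.19 and 1.3.20 can be enumerated with a short Python sketch (an aside; the coprimality criterion is the one of Proposition 1.3.21 below):

```python
from math import gcd

def units(n):
    """Units of Z_n: the elements of Z_n coprime to n."""
    return [a for a in range(n) if gcd(a, n) == 1]

assert units(5) == [1, 2, 3, 4]   # every nonzero element: Z_5 is a field
assert units(6) == [1, 5]         # only 1 and 5: Z_6 is not a field
```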
The elements of Zn that have a multiplicative inverse are precisely the elements of Zn that are coprime to n. This is a consequence of Proposition 1.3.6.
The proofs of Propositions 1.3.6 and 1.3.21 suggest an algorithm for computing inverses in Zn using the extended Euclid's algorithm: if a ∈ Zn has a multiplicative inverse in Zn and exteuclid(a, n) = (c, d), then mod(c, n) is that multiplicative inverse (Exercise 17a in Section 1.5).
Example 1.3.22 Consider the ring Z61. Since 61 is prime, all nonzero elements of Z61 have a multiplicative inverse. Let us compute the multiplicative inverse of 16, for instance. Since

exteuclid(16, 61) = (−19, 5)

the multiplicative inverse of 16 in Z61 is mod(−19, 61) = 42. We can also get the multiplicative inverse of 16 in Z61 by looking at −19 as the additive inverse of 19 in Z61, that is, 42 (= 61 − 19).
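The inverse-via-exteuclid recipe just described can be sketched in Python (an aside to the Mathematica presentation), reproducing Example 1.3.22:

```python
def inverse_mod(a, n):
    """Multiplicative inverse of a in Z_n via the extended Euclid's algorithm."""
    def exteuclid(m, n):
        if m == 0:
            return (0, 1)
        c, d = exteuclid(n % m, m)
        return (d - c * (n // m), c)
    c, _ = exteuclid(a, n)
    return c % n                  # mod(c, n) is the inverse, when it exists

assert inverse_mod(16, 61) == 42  # as in Example 1.3.22
assert (16 * 42) % 61 == 1
```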
A simple corollary of Proposition 1.3.21 states that if n is prime then all the elements of Zn apart from 0 have a multiplicative inverse, that is, if n is prime then Zn is a field.

Corollary 1.3.23 Let n be a prime number. Then Zn is a field.

Proof: If a ∈ Zn\{0} then 0 < a < n. By 3 of Proposition 1.2.19 and Proposition 1.3.21, a has a multiplicative inverse in Zn.
QED
In some situations it is useful to consider an extension of the notion of multiplicative inverse in Zn. Given a positive integer n and a, b ∈ Z, we say that b is a multiplicative inverse of a modulo n whenever

a·b =_n 1.
By Proposition 1.3.6, such an integer b exists if and only if a and n are coprime, and from its proof it follows that the extended Euclid's algorithm computes a multiplicative inverse of a modulo n.
As expected, multiplicative inverses modulo n are not unique: if the integer b is a multiplicative inverse of a modulo n, then the integer c is a multiplicative inverse of a modulo n if and only if c =_n b. Furthermore, if a is coprime to n then the multiplicative inverse of mod(a, n) in Zn is a multiplicative inverse of a modulo n (Exercise 18 in Section 1.5). As an example, let us consider a = 20 and n = 9. Since mod(20, 9) = 2 and 2_9^{−1} = 5, then 5 is a multiplicative inverse of 20 modulo 9.
We can use a_n^{−1} to denote a multiplicative inverse of a modulo n.
To end this section we introduce the notions of ring product and ring homomorphism. A ring product is a binary operation that takes two rings and returns their product.
Definition 1.3.24 Let A′ = (A′, +′, 0′, −′, ·′) and A″ = (A″, +″, 0″, −″, ·″) be two rings. The product of A′ and A″, denoted by A′ × A″, is the ring

(A, +, 0, −, ·)

where

- A = A′ × A″, that is, A is the Cartesian product of the carrier sets of each ring;
- (a′, a″) + (b′, b″) = (a′ +′ b′, a″ +″ b″);
- 0 = (0′, 0″);
- −(a′, a″) = (−′a′, −″a″);
- (a′, a″) · (b′, b″) = (a′ ·′ b′, a″ ·″ b″);

for all a′, b′ ∈ A′ and a″, b″ ∈ A″.
h(0′) = 0″
h(−′a) = −″h(a)
h(a ·′ b) = h(a) ·″ h(b)
1.3.3 The Chinese Remainder Theorem
Herein we present the Chinese Remainder Theorem. This theorem is nearly 2000
years old and was established by Chinese scholars. In particular, this result is
useful for solving some systems of linear congruences.
Let r be a positive integer and let n_1, ..., n_r be pairwise coprime positive integers. Consider the map

h : Z_{n_1·…·n_r} → Z_{n_1} × … × Z_{n_r}  (1.7)

such that

h(x) = (mod(x, n_1), ..., mod(x, n_r)).

This map is a ring homomorphism between the rings Z_{n_1·…·n_r} and Z_{n_1} × … × Z_{n_r} (Exercise 26 in Section 1.5). We now state the Chinese Remainder Theorem in Proposition 1.3.29.
Proposition 1.3.29 The map (1.7) is an isomorphism.

Proof: As we have stated above, h is a ring homomorphism. Hence, we only have to prove that h is injective and surjective.
(1) h is surjective. We have to prove that, given (x_1, ..., x_r) ∈ Z_{n_1} × … × Z_{n_r}, there exists x ∈ Z_{n_1·…·n_r} such that h(x) = (x_1, ..., x_r).
Let

N = n_1·…·n_r  and  N_i = N/n_i  for all 1 ≤ i ≤ r.

Since the moduli are pairwise coprime, N_i is coprime to n_i and therefore has a multiplicative inverse (N_i)_{n_i}^{−1} modulo n_i. Then

x_i·(N_i)_{n_i}^{−1}·N_i =_{n_i} x_i  (1.8)

and, since n_j divides N_i whenever j ≠ i,

x_i·(N_i)_{n_i}^{−1}·N_i =_{n_j} 0  for j ≠ i.  (1.9)

Consider

x = mod(Σ_{i=1}^{r} x_i·(N_i)_{n_i}^{−1}·N_i, N).

For each 1 ≤ j ≤ r,

x =_{n_j} Σ_{i=1}^{r} mod(x_i·(N_i)_{n_i}^{−1}·N_i, n_j).

By (1.9), in the above summation only the term mod(x_j·(N_j)_{n_j}^{−1}·N_j, n_j) is not necessarily equal to 0. Thus, by (1.8),

x =_{n_j} x_j

for every 1 ≤ j ≤ r, that is, h(x) = (x_1, ..., x_r).
QED
Note that the injectivity of (1.7) can also be proved by observing that the sets Z_{n_1·…·n_r} and Z_{n_1} × … × Z_{n_r} are finite and have the same number of elements. Therefore, h is injective because every surjective map between finite sets with the same cardinality is necessarily injective. Hence, we can prove the Chinese Remainder Theorem without using the result stated in Exercise 10. Moreover, we can use the Chinese Remainder Theorem to prove this result. Let us briefly see how.
x =_{n_1} k_1
...
x =_{n_r} k_r
x =_{n_i} k_i  for all 1 ≤ i ≤ r.  (1.10)

Recalling the proof of the Chinese Remainder Theorem, we know that the modular equations (1.10) hold for

s = mod(Σ_{i=1}^{r} k_i·(N_i)_{n_i}^{−1}·N_i, N)

since, for each 1 ≤ j ≤ r,

s =_{n_j} mod(Σ_{i=1}^{r} k_i·(N_i)_{n_i}^{−1}·N_i, N) =_{n_j} k_j.
Example 1.3.31 Consider the following system of congruences (or modular equations)
13x + 1 =_7 4
4x + 2 =_9 5
We want to find all the integer solutions of this system using the Chinese Remainder Theorem (Corollary 1.3.30).
(i) We first transform the given system into an equivalent one where each congruence a·x + b =_k c is replaced by a congruence of the form x =_k c′.
Let us first consider the congruence 13x + 1 =_7 4. Since −1 =_7 −1, by Proposition 1.3.4,

13x + 1 − 1 =_7 4 − 1,

that is,

13x =_7 3.

Given that 13 =_7 6, then 13x =_7 6x, using the congruence properties of =_7. Hence,

6x =_7 3.

Since 6 is coprime to 7, by Proposition 1.3.21 it has a multiplicative inverse in Z7. We can use its inverse, 6_7^{−1}, to obtain an equivalent congruence in the intended form, taking again into account the congruence properties of =_7:

6_7^{−1}·6x =_7 6_7^{−1}·3,

that is,

1·x =_7 6_7^{−1}·3.

Since 6_7^{−1} = 6, we conclude that x =_7 6·3, that is, x =_7 18. Proceeding similarly with the congruence 4x + 2 =_9 5:

4x + 2 − 2 =_9 5 − 2
4x =_9 3
4_9^{−1}·4x =_9 4_9^{−1}·3
x =_9 7·3
x =_9 21
We thus obtain the equivalent system

x =7 18
x =9 21

Although we could already use Corollary 1.3.30 to find the solutions, we can further simplify this system observing that 18 =7 4 and 21 =9 3. Hence, we get the equivalent system

x =7 4
x =9 3
(ii) Since 7 and 9 are coprime we can use Corollary 1.3.30. Assuming n1 = 7 and n2 = 9, then

k1 = 4        k2 = 3        N = 7 · 9 = 63        N1 = 63/7 = 9        N2 = 63/9 = 7
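The construction used in the proof of the Chinese Remainder Theorem can be turned into a small program. The following sketch is in Python for illustration (the book's own code, such as isoCRT in Figure 1.8, is in Mathematica), and the names crt and phi are hypothetical:

```python
from math import gcd, prod

def phi(n):
    # Euler's function, by direct counting (fine for small n)
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def crt(ks, ns):
    # For pairwise coprime moduli ns, return the solution in Z_(n1...nr)
    # of the system x =ni ki, following the construction of the proof:
    # s = mod(k1*N1^(phi(n1)-1)*N1 + ... + kr*Nr^(phi(nr)-1)*Nr, N).
    N = prod(ns)
    s = 0
    for k, n in zip(ks, ns):
        Ni = N // n
        s += k * pow(Ni, phi(n) - 1, N) * Ni
    return s % N

print(crt([4, 3], [7, 9]))  # -> 39
```

For the system above it returns 39, and indeed mod(39, 7) = 4 and mod(39, 9) = 3.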
1.4 RSA revisited
In the light of the results presented in this chapter, we show that the RSA cryptosystem described in Section 1.1 is sound, that is, we prove that when Bob decrypts the ciphered text sent by Alice he obtains the original message.
Recall the RSA cryptosystem in Figure 1.9, where φ(n) = (p − 1)(q − 1) (see Proposition 1.2.22).
We now prove that v(n,b) ∘ u(n,a) = idZn. The proof applies Euler's theorem and the Chinese Remainder Theorem (Proposition 1.3.29).
isoCRT=Function[{n,w,a},Module[{i,j,coprime},
  If[Apply[Times,w]!=n,
    Print["Error"],
    coprime=True;
    i=1;
    While[i<Length[w]&&coprime,
      j=i+1;
      While[j<=Length[w]&&coprime,
        coprime=(GCD[w[[i]],w[[j]]]==1);
        j=j+1];
      i=i+1];
    If[!coprime,
      Print["Error"],
      Table[Mod[a,w[[k]]],{k,1,Length[w]}]]]]];
Figure 1.8: Chinese remainder theorem in Mathematica
Proposition 1.4.1 Consider the RSA cryptosystem with public key (n, a) and private key (n, b). Then v(n,b) ∘ u(n,a) = idZn.
Proof: Recall that n = p · q where p and q are distinct prime numbers and therefore, by 1 of Proposition 1.2.19, p and q are coprime. Let h be the map presented in (1.7) in Section 1.3.3.
(1) We first prove that h(v(n,b)(u(n,a)(x))) = (mod(x^(ab), p), mod(x^(ab), q)) for each x ∈ Zn. Since x^a =n mod(x^a, n), it is straightforward to conclude that
x^(ab) =n (mod(x^a, n))^b
using the congruence properties of =n (Exercise 9 in Section 1.5). Using the fact
that u(n,a)(x) = mod(x^a, n), we have

v(n,b)(u(n,a)(x)) = mod((u(n,a)(x))^b, n) = mod(x^(ab), n).

Therefore, by 3 of Proposition 1.2.4, we obtain

mod(v(n,b)(u(n,a)(x)), p) = mod(mod(x^(ab), n), p) = mod(x^(ab), p).

Similarly, by replacing p with q, we get

mod(v(n,b)(u(n,a)(x)), q) = mod(x^(ab), q).

Then, we conclude that

h(v(n,b)(u(n,a)(x))) = (mod(x^(ab), p), mod(x^(ab), q)).    (1.11)
(2) We now prove that h(x) = (mod(x^(ab), p), mod(x^(ab), q)) for each x ∈ Zn. We start by showing that mod(x, p) = mod(x^(ab), p). If p divides x then p also divides x^(ab) and so

mod(x, p) = 0 = mod(x^(ab), p).

If p does not divide x then, since p is prime, x and p are coprime. Hence, by Euler's theorem (Theorem 1.3.5), x^(φ(p)) =p 1. Using 1 of Proposition 1.2.22, we conclude that

x^(p−1) =p 1.    (1.12)

By definition of the RSA cryptosystem (see Figure 1.9) we have mod(ab, φ(n)) = 1, and therefore

ab = k·φ(n) + 1

for some k ∈ Z. From (1.12), using the congruence properties of =p, we get

(x^(p−1))^(k(q−1)) =p 1

and then

x · x^(k(p−1)(q−1)) =p x,

that is, x^(ab) =p x and so mod(x^(ab), p) = mod(x, p). Similarly, mod(x^(ab), q) = mod(x, q), and therefore h(x) = (mod(x^(ab), p), mod(x^(ab), q)). From this equality and (1.11), since h is injective, v(n,b)(u(n,a)(x)) = x for each x ∈ Zn.

QED
We now present an example that illustrates RSA encryption and decryption of messages.

Example 1.4.2 Consider the RSA cryptosystem and, just for illustration purposes, let us assume that Bob has chosen the primes

p = 13    and    q = 7.

Then n = 13 · 7 = 91 and

φ(91) = 12 · 6 = 72.

To choose the exponents a and b, Bob first picks an element a in Z72 that has a multiplicative inverse, that is, a coprime to 72. Let us consider a = 5. Then the extended Euclid's algorithm can be used to compute its inverse b.
Given that exteuclid(a, φ(n)) = exteuclid(5, 72) = (29, −2), then b = 29. Indeed, 5 · 29 = 145 = 2 · 72 + 1, that is, 5 · 29 =72 1. Hence, Bob's public key is (91, 5) and his private key is (91, 29).
Assume that Alice wants to send the message 2 ∈ Z91 to Bob. Then she uses Bob's public key and the corresponding encryption rule and obtains the encrypted message

u(91,5)(2) = mod(2^5, 91) = 32

that she sends through the channel.
When Bob receives the encrypted message 32 he decrypts it using the decryption rule associated with his private key, that is,

v(91,29)(32) = mod(32^29, 91)
= mod(32 · (32^2)^14, 91)
= mod(32 · 1024^14, 91)
= mod(32 · 23^14, 91)
= mod(32 · (23^2)^7, 91)
= mod(32 · 529^7, 91)
= mod(32 · 74^7, 91)
= mod(32 · 74 · (74^2)^3, 91)
= mod(32 · 74 · 5476^3, 91)
= mod(32 · 74 · 16^3, 91)
= mod(32 · 74 · 16 · 16^2, 91)
= mod(32 · 74 · 16 · 74, 91)
= mod(32 · 16 · 16, 91)
= mod(32 · 74, 91)
= mod(2, 91) = 2.

As expected, Bob gets the original message 2. Note that we repeatedly compute squares and reduce them to elements of Z91.
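The whole example can be checked mechanically. The following is a Python sketch (illustrative; the helper names are hypothetical). Python's built-in pow performs modular exponentiation by repeated squaring, and pow(a, -1, m) computes a modular inverse:

```python
p, q = 13, 7
n, phi_n = p * q, (p - 1) * (q - 1)   # n = 91, phi(91) = 72
a = 5                                  # public exponent, coprime to 72
b = pow(a, -1, phi_n)                  # b = 29, the private exponent

def encrypt(x):
    # the encryption rule u_(n,a)
    return pow(x, a, n)

def decrypt(y):
    # the decryption rule v_(n,b)
    return pow(y, b, n)

c = encrypt(2)
print(c, decrypt(c))  # -> 32 2
```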
1.5 Exercises
10. Let n1, . . . , nr be positive integers pairwise coprime. Prove that a =n1···nr b if and only if a =ni b for all 1 ≤ i ≤ r.
11. Prove that mod(2^k, 4) is either 2 or 0 for all k ∈ N.
12. Let a, b, n ∈ Z with n > 0. Prove that if b is a multiplicative inverse of a then a^j · b^i =n a^(j−i) for all i, j ∈ N0 such that j ≥ i.
(b) a has a multiplicative inverse in A if and only if h(a) has a multiplicative inverse in A'.
25. Consider the properties of ring isomorphisms stated in Exercise 24. Do
these properties also hold for ring homomorphisms?
26. Prove that the map (1.7) is a ring homomorphism.
27. Show that proving the surjectivity of the map (1.7) amounts to proving that, given any (x1, . . . , xr) ∈ Zn1 × · · · × Znr, the system of congruences

x =n1 x1
. . .
x =nr xr

has at least one integer solution.
28. Find all the integer solutions of the following systems of congruences

(a) 3x − 2 =7 4
    13x =9 2

(b) 2x + 4 =9 1
    12x − 2 =5 6

(c) 5x + 10 =9 1
    5x − 4 =7 6
    4x − 2 =5 6

(d) 3x + 1 =7 10
    4x − 2 =9 3
    x + 3 =4 1
29. Consider the cryptographic system RSA with prime numbers p = 3 and
q = 11, public key (33, 3) and private key (33, 7). Explain how it works
encrypting the message 2 and decrypting the resulting message.
30. Consider the cryptographic system RSA with prime number p = 13 and public key (143, 7), and let 9 be an encrypted message. Compute the corresponding private key and decrypt 9 to obtain the original message x. Confirm that u(143,7)(x) = 9.
31. Consider the cryptographic system RSA with prime numbers p = 7 and
q = 11. Choose an appropriate public key and a corresponding private
key, and explain how it works encrypting the message 2 and decrypting the
resulting message.
32. Consider the cryptographic system RSA with prime numbers p = 7 and
q = 13. Choose an appropriate public key and a corresponding private
key, and explain how it works encrypting the message 3 and decrypting the
resulting message.
33. The RSA cryptographic system requires a fast modular exponentiation.
Develop in Mathematica an efficient algorithm for modular exponentiation
using the binary representation of the exponent.
Hint: assuming that b is the exponent and that its binary representation is (bk, bk−1, . . . , b1, b0), start by assigning to the result variable the value a^(bk) and then do a cycle from k − 1 down to 0 such that in the i-th iteration, if b(k−i) = 1, then square the result variable, multiply it by a and reduce it modulo n; otherwise, just square the result variable and reduce it modulo n. This algorithm is known as the Repeated Squaring Algorithm.
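As an illustration of the hint — in Python rather than Mathematica, and with a hypothetical name powmod — the Repeated Squaring Algorithm can be sketched as follows:

```python
def powmod(a, b, n):
    # Scan the exponent's binary representation b_k ... b_0 from the
    # most significant bit down, squaring at every step and also
    # multiplying by a whenever the current bit is 1.
    bits = bin(b)[2:]
    result = a if bits[0] == "1" else 1
    for bit in bits[1:]:
        result = (result * result) % n
        if bit == "1":
            result = (result * a) % n
    return result

print(powmod(32, 29, 91))  # -> 2, as in Example 1.4.2
```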
34. Assume that the RSA cryptographic system is being used with Zn as the message space and (n, a) as the public key. Show that if a and the prime factors of n are known then it is possible to obtain b. (This explains why factoring is considered to be the Achilles heel of RSA.)

35. Assume that the RSA cryptographic system is being used with Zn as the message space and (n, a) as the public key. Show that it is feasible for an attacker to compute the private key (n, b) corresponding to the public key (n, a) used, if he knows φ(n).
36. Prove that if there exists an algorithm that in polynomial time, given u(x),
returns the last bit of x, then the private key can be found in polynomial
time, that is, u can be inverted in polynomial time.
Chapter 2
Pseudo-random numbers
In this chapter we discuss the generation of pseudo-random numbers. Random
numbers are useful in several different fields such as simulation, sampling and
cryptography. In simulation and sampling, random numbers are used to create
representative real world scenarios. In cryptography, they are used, for instance,
to generate strings of bits and to randomly generate keys in a given key space.
Physical methods based on entropy theory can be used to generate sequences
of numbers that can be considered close to truly random number sequences. But
they can be expensive and slow for many applications. For many purposes, it is
enough to use some suitable number sequence generating algorithms, the pseudorandom number generators. In this case, the number sequences are completely
determined by the initial value (the seed), but a careful choice of the appropriate
algorithms often yields useful number sequences for many applications.
There are several features of pseudo-random number generators that can be
measured. Suitable statistical pattern detection tests can then be used. Proving
a pseudo-random number generator secure is more difficult, but it is of utmost
relevance in cryptography. A pseudo-random number generator should be fast
and secure. But it is not easy to get both at the same time. Linear congruential
generators [17], for instance, are fast and therefore useful in simulation applications, but they are not secure enough for cryptographic applications. Others
generators, like, for example, Blum Blum Shub generators [4] are slow but their
security properties makes them suitable for cryptographic applications.
In Section 2.1 we present a motivating example related to traffic simulation.
In Section 2.2 we introduce linear congruential generators. Blum-Blum-Shub generators are presented in Section 2.3. In Section 2.4 we revisit the traffic
simulation example. In Section 2.5 we propose some exercises.
2.1 Motivation: traffic simulation
Many complex systems can be studied using computer simulation. Using suitable
models we can study their behaviour and predict their evolution. Computer
simulation is then an important tool in many different fields such as Physics,
Chemistry, Biology, Engineering, Economics and even Sociology.
In computer simulation we can use continuous or discrete models, depending on the particular application. Continuous models usually use differential
equations that describe the evolution of relevant continuous variables. Discrete
models can be used when the systems we want to study can be described by
events and their consequences. Simulation can also be classified as deterministic
or stochastic. In the latter case, the variables follow random laws.
Herein, we present a traffic problem simulation example. The problem can be
described as follows.
Vehicles randomly come into a given toll road. After arriving at the toll
booth (with only one toll gate), they stay in the queue until the payment
is done, and then they leave the road. The goal is to study the evolution
of the number of vehicles in the toll queue, in terms of some given random
laws for the intervals between arrivals, the arrival of a vehicle at the toll gate
and departure after payment.
This is an example of discrete event simulation. The system is represented
by a sequence of events and the occurrence of an event corresponds to a change
of state in the system. There is a list of simulation events listing the pending
events, that is, the events that will have to be simulated. This list is also known as
the pending event set.
According to this technique, the first step is the identification of the kinds
of relevant events in the system being considered. In this case there are the
following three kinds of events (named according to the usual designations in
queue simulation): arr (arrival), ess (end of self-service) and dep (departure).
The first one corresponds to the arrival of a vehicle to the toll road, the second
one to the arrival of a vehicle to the toll gate, and the third one to the departure
of a vehicle after the payment. Each event is characterized by several attributes.
The relevant ones herein are the following: time (time of the event occurrence)
and kind (the kind of the event). In more complex simulations more attributes
may be considered. For example, if we want to study not only the time spent
at the queue but also the time spent since the vehicle enters the road until it
reaches the toll queue, it is also necessary to indicate for each event the vehicle
with which it is associated.
The Mathematica package in Figure 2.1 offers an intuitive and user-friendly
collection of services that includes the creation of an event and the access to their
different attributes.
BeginPackage["trafficSim`des`eventsP`"]
eventsP::usage = "Operations on events."
evt::usage = "evt[t,k]: the event on time t of kind k."
time::usage = "time[e] returns the time of event e."
kind::usage = "kind[e] returns the kind of event e."
Begin["`Private`"]
evt = Function[{t,k},{t,k}]
time = Function[e,e[[1]]]
kind = Function[e,e[[2]]]
End[]
EndPackage[]
Figure 2.1: Mathematica package for events
The second step is the definition of the random laws followed by the events.
With respect to arrivals we assume herein that the interval of time between
consecutive arrivals is a random variable following an exponential distribution with
average value ba (between arrivals). With respect to self-service, we consider that
the time that a vehicle takes to cross the road is a random variable following an
exponential distribution with average value ss (self-service). Finally, with respect
to payments, it is assumed that the time that a vehicle takes to pay (since the
beginning of the payment to its departure) is a random variable following an
exponential distribution with average value st (service time).
Recall that a random variable following an exponential distribution with average value m has distribution function F(t) = 1 − e^(−t/m). For example, for m = 2 and t = 6, the value of 1 − e^(−t/m) is approximately 0.95. This value can be interpreted as follows: the probability that the observed value is less than or equal to 3 times the average value is a little bit more than 95%. That is, the probability that the observed value is greater than 3 times the average value is less than 5%.
The third step consists in defining procedures for simulating the observation
of the random variables of the system. The Mathematica package in Figure 2.2
includes a collection of services providing the procedure exprandom.
Observe that in order to define the function exprandom we use the Mathematica function Random that generates a pseudo-random number in the interval [0, 1] following a uniform distribution. Recall that all random variable distributions
BeginPackage["trafficSim`des`randomnumbersP`"]
randomnumbersP::usage = "Exponential random numbers."
exprandom::usage = "exprandom[m] returns an observation of
the exponential random variable with mean value m."
Begin["`Private`"]
exprandom = Function[{m}, -(m*Log[Random[]])]
End[]
EndPackage[]
Figure 2.2: Mathematica package for random numbers
can be obtained from the uniform random variable. So it is of utmost importance
to know how to obtain uniform pseudo-random generators in order to get other
types of generators.
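The same inverse-transform construction used by exprandom can be sketched in Python for illustration (hypothetical names):

```python
import math
import random

def exprandom(m):
    # Inverse-transform sampling: if U is uniform on (0, 1], then
    # -m*log(U) follows an exponential distribution with mean m
    # (the same construction as the Mathematica exprandom above).
    return -m * math.log(1.0 - random.random())

# The sample mean should be close to the chosen average value
samples = [exprandom(2.0) for _ in range(100_000)]
print(sum(samples) / len(samples))  # close to 2.0
```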
2.2 Linear congruential generators
(ii) s0 = 2, a = 7, c = 5 and m = 9
1, 3, 8, 7, 0, 5, 4, 6, 2, 1, 3, 8, 7, 0, 5, 4, 6, 2, 1, 3, 8, 7, 0, 5, 4, 6, 2, 1, . . .
(iii) s0 = 2, a = 2, c = 2 and m = 11 then we have
6, 3, 8, 7, 5, 1, 4, 10, 0, 2, 6, 3, 8, 7, 5, 1, 4, 10, 0, 2, 6, 3, 8, 7, 5, 1, 4, . . .
(iv) s0 = 2, a = 3, c = 2 and m = 12 then we have
8, 2, 8, 2, 8, 2, 8, 2, 8, 2, 8, 2, 8, 2, 8, 2, 8, 2, 8, 2 . . .
(v) s0 = 2, a = 5, c = 2 and m = 25 then we have
12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, . . .
Using the word random in a rather informal way, we can say that some sequences look more random than others. Note that the seed may or may not occur again in the sequence as a term of index greater than 0. Moreover, it may be the case that not all the elements of Zm occur in the sequence. The last sequence is even constant for n ≥ 1, since mod(5 · 12 + 2, 25) = mod(62, 25) = 12.
The following proposition states that the seed indeed occurs more than once
in a linear congruential sequence whenever the multiplier and the modulus are
coprime.
Proposition 2.2.3 Let {sn}n∈N0 be a linear congruential sequence such that its multiplier a and its modulus m are coprime. Then there is k ∈ N such that sk = s0.

Proof: First of all note that, for i, j > 0, if si = sj then s_{i−1} = s_{j−1}. In fact, recalling Definition 2.2.1, we have a·s_{i−1} + c =m a·s_{j−1} + c, that is, a·s_{i−1} =m a·s_{j−1}. Since a and m are coprime, a has an inverse a^(−1) in Zm. Hence, multiplying the left and the right hand sides by a^(−1) and noting that mod(s_{i−1}, m) = s_{i−1} and mod(s_{j−1}, m) = s_{j−1}, we get s_{i−1} = s_{j−1}.

Assume that k ∈ N is the least index such that sk = si for some 0 ≤ i < k. Such a k always exists because every term of s is in Zm and this set is finite. It cannot be the case that i ≥ 1 since then s_{i−1} = s_{k−1} and therefore k would not be the least index satisfying the conditions above. Hence, i = 0 and we conclude that sk = s0.
QED
In any linear congruential sequence {sn}n∈N0 there is a finite sequence of numbers that is repeated an infinite number of times. In fact, since the set Zm is finite and each term uniquely determines the next one, there are i, k ∈ N0 such that s_{i+k} = si and, once s_{i+k} = si, the terms following s_{i+k} are exactly the same as those following si.
The period length of a linear congruential sequence {sn}n∈N0 is the least k ∈ N such that there is i ∈ N0 such that s_{i+k} = si. Clearly, the period length is always less than or equal to the modulus m of s and therefore we can always determine the period of s after computing m + 1 terms at the most. The maximum period length of a linear congruential sequence is thus the modulus m of that sequence.
Example 2.2.4 Recall the linear congruential sequences presented in Example 2.2.2. The period length of the sequence presented in (i) is 11. It is the maximum period length. Similarly, the sequence presented in (ii) also has maximum period length (9 in this case). The sequence presented in (iii) has a period of length 10 (note that 9 does not occur in the sequence). Finally, the period length of the sequence presented in (iv) is 2 and the period length of the sequence presented in (v) is 1.
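The period length can also be found mechanically. The following Python sketch (illustrative; the book's own code is in Mathematica) exploits the fact that each term uniquely determines the next one, so the first value that occurs twice closes the cycle:

```python
def period_length(seed, a, c, m):
    # Record the index of each value's first occurrence; the period is
    # the gap between the two occurrences of the first repeated value.
    seen, s, n = {}, seed, 0
    while s not in seen:
        seen[s] = n
        s = (a * s + c) % m
        n += 1
    return n - seen[s]

# The sequences (ii), (iv) and (v) of Example 2.2.2:
print(period_length(2, 7, 5, 9))   # -> 9 (maximum period length)
print(period_length(2, 3, 2, 12))  # -> 2
print(period_length(2, 5, 2, 25))  # -> 1
```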
The choice of the modulus m determines an upper bound for the period length of a linear congruential sequence. The other parameters have to be carefully chosen in order to ensure that the period is as long as possible.
The following proposition, known as the Maximum Period Length Theorem,
establishes several conditions on the parameters that ensure maximum period
length.
Proposition 2.2.5 Let {sn}n∈N0 be a linear congruential sequence with multiplier a, increment c and modulus m. The sequence s has period length m if and only if all the following conditions hold:

(i) c and m are coprime;
(ii) a − 1 is a multiple of every prime divisor of m;
(iii) a − 1 is a multiple of 4 whenever m is a multiple of 4.
Note that the seed is not relevant to ensure maximum period length. For the
proof of Proposition 2.2.5 we refer the reader to [17], for instance.
Example 2.2.6 Recall again the linear congruential sequences presented in Example 2.2.2. The sequence presented in (i) satisfies the conditions of Proposition 2.2.5. Observe that a − 1 = 0 and 0 is a multiple of any integer. The sequence presented in (ii) also satisfies those conditions. Note that m is not a multiple of 4 and therefore a − 1 is not required to be a multiple of 4. In the sequence presented in (iii), although c and m are coprime, a − 1 = 1 and 1 is not a multiple of 11 (note that 11 is the only prime number that divides m in this case) and therefore the sequence does not verify the conditions of Proposition 2.2.5.
It is worth noticing that a long period is not the only requirement for a good choice of the parameters of a linear congruential sequence. The sequence (i) in Example 2.2.2, for instance, has maximum period length but, informally speaking, it is not very random, since even and odd numbers alternate. The difference between two consecutive terms is almost always equal to 2 because a = 1 and therefore in most cases s_{n+1} − sn = c = 2.
The Mathematica function MCL in Figure 2.3 computes the terms of linear congruential sequences, assuming that suitable integer values have already been assigned to the variables a, s, c and m, with s initially holding the seed.
MCL=Function[{},s=Mod[a*s+c,m]];
Figure 2.3: Linear congruential sequence in Mathematica
When computation time is an issue we often consider linear congruential sequences {sn}n∈N0 with c = 0. However, in this case, a period of length m is no longer attainable since

s_{n+1} = mod(a·sn, m)

and therefore once si = 0 for some i ∈ N0 then sj = 0 for all j > i.
Another relevant fact in this case is that, given a common divisor d ∈ N of si and m, writing si = d·k and m = d·k' for some integers 0 < k, k' ≤ m, we have

s_{i+1} = mod(a·si, m) = a·si − ⌊a·si/m⌋·m = d·(a·k − ⌊a·si/m⌋·k'),

that is, s_{i+1} is a multiple of d. Moreover, we can also conclude that sj is a multiple of d for all j ≥ i. It is then advisable to ensure that m and sn are coprime for all n ∈ N0. In this case the period length is at most the number of positive integers coprime to m, that is, the period length is at most φ(m), where φ is the Euler function (see Definition 1.2.20).
The following proposition establishes some conditions that ensure the maximum possible period length when the increment is 0. For the proof we refer the reader to [17]. Recall from Definition 1.3.11 that λ(m) is the order (modulo m) of the primitive elements modulo m.

Proposition 2.2.7 Let {sn}n∈N0 be a linear congruential sequence with multiplier a, modulus m and increment 0. If s0 and m are coprime and a is a primitive element modulo m then the period length of s is λ(m). Moreover, λ(m) is the maximum possible period length of any linear congruential sequence with modulus m and increment 0.
Example 2.2.8 Let us consider a linear congruential sequence {sn}n∈N0 such that m = 5, s0 = 4 and a = 2. Recall from Example 1.3.10 that 2 is a primitive element modulo 5. Since the seed and the modulus are coprime, the sequence s satisfies the conditions of Proposition 2.2.7 and therefore it has period length λ(5). Recall from Example 1.3.12 that λ(5) = 4.
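Generating the terms of this sequence confirms the claim. A Python sketch (illustrative), with s0 = 4, a = 2, c = 0 and m = 5, shows the block 3, 1, 2, 4 repeating, so the period length is indeed 4:

```python
def lcg(seed, a, c, m, count):
    # First `count` terms after the seed of the linear congruential
    # sequence s_{n+1} = mod(a*s_n + c, m).
    s, terms = seed, []
    for _ in range(count):
        s = (a * s + c) % m
        terms.append(s)
    return terms

print(lcg(4, 2, 0, 5, 8))  # -> [3, 1, 2, 4, 3, 1, 2, 4]
```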
As we have already remarked in Section 2.1 with respect to the generation of pseudo-random numbers for simulation purposes, it is of utmost importance to have uniform pseudo-random generators in order to get other types of generators, capitalizing on the fact that every distribution can be generated from the uniform distribution in the interval [0, 1]. Linear congruential sequences allow us to obtain such sequences of pseudo-random numbers in [0, 1] as follows [17]: given a linear congruential sequence (sn)n∈N0 with modulus m, we just consider the sequence u = (un)n∈N0 where

un = sn/m

for each n ∈ N0.
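For illustration, a Python sketch of this construction (the name lcg_uniform is hypothetical), applied to the small sequence with s0 = 4, a = 2, c = 0 and m = 5:

```python
def lcg_uniform(seed, a, c, m, count):
    # Map the terms of a linear congruential sequence into
    # pseudo-random numbers u_n = s_n / m in [0, 1).
    s, us = seed, []
    for _ in range(count):
        s = (a * s + c) % m
        us.append(s / m)
    return us

print(lcg_uniform(4, 2, 0, 5, 4))  # -> [0.6, 0.2, 0.4, 0.8]
```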
In cryptographic applications we often have to randomly generate strings of bits. We now describe how linear congruential sequences can be used as bit generators. First, we define bit generator functions.

Definition 2.2.9 Let j, k ∈ N such that k > j. A (j, k)-bit generator is a function

f : Z2^j → Z2^k

that can be computed in polynomial time with respect to j. For each r ∈ Z2^j, the string f(r) is the bit string generated by r.
In practice, given j, k is obtained as a polynomial function of j. Let s = (sn)n∈N0 be a linear congruential sequence with modulus m. We can then define a (j, k)-linear congruential generator as follows: given j = 1 + ⌊log2 m⌋ and j < k < m, then

f : Z2^j → Z2^k

is such that f(s0) = (z1, z2, . . . , zk) where

zi = mod(si, 2)

for each 1 ≤ i ≤ k. Note that in f(s0) we are assuming the binary representation of the seed of s that, by hypothesis, has a length less than or equal to j.
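Following the definition, such a generator can be sketched in Python (illustrative, with a toy modulus far too small for any real use):

```python
def lc_bit_generator(seed, a, c, m, k):
    # Bit string (z1, ..., zk) with z_i = mod(s_i, 2), where s_1, ..., s_k
    # are the terms of the linear congruential sequence after the seed.
    s, bits = seed, []
    for _ in range(k):
        s = (a * s + c) % m
        bits.append(s % 2)
    return bits

# The sequence (ii) of Example 2.2.2 starts 1, 3, 8, 7, 0, 5, 4, 6, 2, 1
print(lc_bit_generator(2, 7, 5, 9, 10))  # -> [1, 1, 0, 1, 0, 1, 0, 0, 0, 1]
```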
2.3 Blum-Blum-Shub generators

2.4 Traffic simulation revisited
BeginPackage["trafficSim`des`schedulesP`"]
Needs["trafficSim`des`eventsP`"]
schedulesP::usage="Operations on schedules."
empty::usage="The empty schedule."
next::usage="next[s] returns the next event of schedule s."
add::usage="add[e,s] inserts event e into schedule s."
delete::usage="delete[s] removes the next event from
schedule s."
Begin["`Private`"]
empty={}
next=Function[s, s[[1]]]
add=Function[{e,s},
If[s==empty,
{e},
If[time[e]<time[s[[1]]],
Prepend[s,e],
Prepend[ add[e,Rest[s]],s[[1]]]]]]
delete=Function[s,Rest[s]]
End[]
EndPackage[]
Figure 2.4: Mathematica package for scheduling
list. The function add is more complicated since it inserts the received event in
the right place in the list.
The simulator is presented in Figure 2.5. The simulator is no more than a
loop. In each step of the loop the next event in the pending event list is simulated.
In that simulation, the list and the state variables (length of the toll queue, state
of the toll gate, etc.) are changed according to the event being simulated, and
the variables under simulation (namely, the length of the toll queue) are observed.
The local variables of the function sim are the following: busy (it indicates
whether the toll gate is occupied or not), ce (the current event being simulated),
ct (the time of the event currently being simulated), nss (number of clients
in self-service, that is, the number of vehicles on the toll road), nwc (number of
waiting clients, that is, the number of vehicles in the toll queue), sch (schedule),
tnc (total number of clients, that is, the total number of vehicles in the toll),
trace (it records the simulated events and the values of some variables), ck (the
kind of the event currently under simulation).
BeginPackage["trafficSim`des`simulationP`"]
Needs["trafficSim`des`eventsP`"]
Needs["trafficSim`des`schedulesP`"]
Needs["trafficSim`des`randomnumbersP`"]
simulationP::usage = "Discrete event simulation."
sim::usage = "sim[ba,ss,st,ht] runs the simulation with:
average time between arrivals ba, average time of selfservice ss, average service time st, and halting time ht."
Begin["`Private`"]
sim=Function[{ba,ss,st,ht},
Module[{busy,ce,ct,nss,nwc,sch,tnc,trace,ck},
simArr=Function[{},
sch=add[evt[ct+exprandom[ba],"arr"],sch];
sch=add[evt[ct+exprandom[ss],"ess"],sch];
tnc=tnc+1; nss=nss+1];
simEss=Function[{},
nss=nss-1;
If[busy==1,
nwc=nwc+1,
sch=add[evt[ct+exprandom[st],"dep"],sch];
busy=1]];
simDep=Function[{},
If[nwc==0,
busy=0,
nwc=nwc-1;
sch=add[evt[ct+exprandom[st],"dep"],sch]]];
busy=0;
ce=evt[exprandom[ba],"arr"]; ct=time[ce]; ck=kind[ce];
nss=0; nwc=0; sch=empty; tnc=0; trace={};
While[ct<=ht,
Switch[ck,"arr",simArr[],"ess",simEss[],"dep",simDep[]];
trace=Append[trace,{ct,nwc}];
ce=next[sch]; ct=time[ce]; ck=kind[ce];
sch=delete[sch]];
ListPlot[trace,PlotJoined->True]]]
End[]
EndPackage[]
Figure 2.5: Mathematica package for the simulation
The output of the simulation is a graphic displaying the evolution of the length of the toll queue, that is, the value of the variable nwc. Figure 2.6 depicts the graphic corresponding to a simulation assuming that ba, the average interval of time between consecutive arrivals, is 1, that ss, the average time a vehicle takes to cross the road, is 50, and that st, the average time a vehicle takes to pay, is 0.9. Figure 2.7 depicts the graphic corresponding to a similar simulation but assuming that st is 2. In both cases, the starting time is 0 and the halting time is 1000.

Figure 2.6: Evolution of the toll queue length assuming ba=1, ss=50 and st=0.9

Figure 2.7: Evolution of the toll queue length assuming ba=1, ss=50 and st=2
2.5 Exercises
11. Determine all the primitive elements modulo 8 in Z8 and define a linear
congruential sequence with maximum possible period length with modulus
8 and increment 0.
12. Define linear congruential sequences with maximum possible period length
with increment 0 and
(a) modulus 15.
(b) modulus 162.
(c) modulus 402.
Determine the period length of each sequence. Note that 5 is a primitive
element modulo 162 and 7 is a primitive element modulo 402.
13. Develop an enriched version of the simulator presented in Section 2.4 that traces the average number of vehicles during a specific period of time and the maximum length of the toll queue.

14. Develop an enriched version of the simulator presented in Section 2.4 that distinguishes between two kinds of vehicles, the light vehicles and the heavy vehicles, such that each kind has a different average time between arrivals to the toll road. Make a histogram, for each kind of vehicle, of the number of vehicles in the toll queue during the simulation time.
Chapter 3
Polynomials
Polynomials and polynomial equations are widely used in science and engineering.
Herein, we present several key concepts and results related to polynomials.
In Section 3.1 we start by showing how polynomials can be used to verify the equivalence of digital circuits. Then we illustrate the relevance of polynomials
in robotics. In Section 3.2 we introduce the notion of polynomial over a field
as well as the sum and product of polynomials. We then introduce division of
polynomials and several related results. Grobner bases and their properties are
presented in Section 3.3. In Section 3.4 we revisit our motivating examples and
show how to use Grobner bases for checking equivalence of digital circuits and
for finding solutions of systems of nonlinear polynomial equations. In Section 3.5
we propose some exercises.
3.1 Motivation

3.1.1 Digital circuit equivalence

3.1.2 Inverse kinematics of a robot
Consider the robot arm depicted in Figure 3.1. It represents a robot consisting of three arm links a1, a2 and a3 with fixed lengths, two joints J1 and J2, a hand E and a base A that supports the robot.
A projection on the yz-plane of the robot arm is depicted in Figure 3.2 (the x-axis points toward the observer). A projection on the xy-plane of the arm link a2 is depicted in Figure 3.3. From these projections, the position (a, b, c) of the hand satisfies

a = (l2 cos θ1 + l3 cos θ2) cos θ
b = (l2 cos θ1 + l3 cos θ2) sin θ
c = l2 sin θ1 + l3 sin θ2
Hence, to determine the angles θ, θ1 and θ2 given the coordinates (a, b, c), we have to solve this system. One possibility consists in converting these equations into polynomial equations where the variables are the sines and cosines. We can consider, for instance, one of the following equivalent systems
a = (l2 v2 + l3 v3) v1
b = (l2 v2 + l3 v3) u1
c = l2 u2 + l3 u3
1 = u1^2 + v1^2
1 = u2^2 + v2^2
1 = u3^2 + v3^2

or

l2 v2 v1 + l3 v3 v1 − a = 0
l2 v2 u1 + l3 v3 u1 − b = 0
l2 u2 + l3 u3 − c = 0
u1^2 + v1^2 − 1 = 0
u2^2 + v2^2 − 1 = 0
u3^2 + v3^2 − 1 = 0
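A quick numeric sanity check of the second system, in Python for illustration (under the convention that v1, v2, v3 and u1, u2, u3 stand for the cosines and sines of θ, θ1 and θ2 respectively, and l2, l3 are the lengths of the arm links a2 and a3): picking angles and computing (a, b, c) from the trigonometric equations, all six polynomials vanish at the corresponding point.

```python
import math

l2, l3 = 2.0, 1.5
theta, theta1, theta2 = 0.3, 0.8, -0.4
u1, v1 = math.sin(theta), math.cos(theta)
u2, v2 = math.sin(theta1), math.cos(theta1)
u3, v3 = math.sin(theta2), math.cos(theta2)

# Position (a, b, c) of the hand for these angles
a = (l2 * v2 + l3 * v3) * v1
b = (l2 * v2 + l3 * v3) * u1
c = l2 * u2 + l3 * u3

# The six polynomials of the second system, evaluated at this point
polys = [
    l2 * v2 * v1 + l3 * v3 * v1 - a,
    l2 * v2 * u1 + l3 * v3 * u1 - b,
    l2 * u2 + l3 * u3 - c,
    u1**2 + v1**2 - 1,
    u2**2 + v2**2 - 1,
    u3**2 + v3**2 - 1,
]
print(all(abs(r) < 1e-12 for r in polys))  # -> True
```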
3.2 Basic concepts
In this section we first present the notion of polynomial (over a field) as well
as sums and products of polynomials [20, 10]. Polynomials over a field together
with the sum and product of polynomials constitute a ring (see Section 1.3 of
Chapter 1). Next we introduce the notion of ordered polynomial and then the
division of polynomials. At the end we refer to polynomial reduction modulo a
set of polynomials.
3.2.1 Rings of polynomials
Example 3.2.2

x1^2 is a monomial in the variable x1 with degree 2;

x1^2 x2^3 is a monomial in the variables x1, x2 with degree 5.
polynomial p in C[x1, . . . , xn] such that p(x1^0 · · · xn^0) = 1 and p(m) = 0 for all the other monomials in x1, . . . , xn is denoted by 1_{C[x1,...,xn]}. The subscript is again omitted when no confusion arises.
A polynomial p in C[x1, . . . , xn] is often presented as a sum of all monomials weighted with their nonzero coefficients. That is,

Σ_{m∈M : p(m)≠0} p(m)·m.
Note that using this notation the same polynomial can be referred to in different
ways. For simplicity, all the conventions introduced above for monomials can
also be used. Moreover, a monomial mi can be omitted when deg(mi ) = 0 and a
coefficient ci can be omitted whenever it is the multiplicative identity of C, unless
deg(mi ) = 0.
Given p in C[x1 , . . . , xn ] and m M , we say that p(m)m is a term of p. As
expected, the coefficient of the term is p(m) and its monomial is m. A monic
term is a term whose coefficient is the multiplicative identity of C and a zero
term is a term whose coefficient is the additive identity of C. The degree of a
term t, denoted by
deg(t)
is 0 if t is a zero term and is the degree of its monomial otherwise. When no
confusion arises, we can refer just to terms in x1 , . . . , xn over C without mention
any particular polynomial in C[x1 , . . . , xn ].
Example 3.2.4 Consider the polynomial p in Z5[x1, x2] such that
p(x1^3 x2^2) = 2;
p(x1^2 x2^2) = 3;
p(x1^0 x2^0) = 1;
p(m) = 0 for all the other monomials in x1, x2.
We can present p as
    2x1^3 x2^2 + 3x1^2 x2^2 + 1.
Observe that p has degree 5 and its terms are 2x1^3 x2^2, 3x1^2 x2^2 and 1x1^0 x2^0. The
coefficient of 2x1^3 x2^2 is 2 and its monomial is x1^3 x2^2.
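As a quick illustration (ours, not from the book, which uses Mathematica), the view of a polynomial as a map from monomials to coefficients can be encoded in Python as a dictionary from exponent tuples to nonzero coefficients; the names p and degree are illustrative:

```python
# Example 3.2.4: p = 2*x1^3*x2^2 + 3*x1^2*x2^2 + 1 in Z5[x1, x2],
# encoded as {exponent tuple: nonzero coefficient}
p = {(3, 2): 2, (2, 2): 3, (0, 0): 1}

def degree(poly):
    """Degree of a nonzero polynomial: the largest total degree
    of a monomial carrying a nonzero coefficient."""
    return max(sum(mono) for mono, coeff in poly.items() if coeff != 0)

print(degree(p))  # → 5
```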
The Mathematica function degmon in Figure 3.4 receives as input a term q
and a positive integer n, and returns the degree of q. Assuming that q is the term
c x1^{α1} · · · xm^{αm}, the function first creates the list {c, x1^{α1}, . . . , xm^{αm}}. Then it removes
the coefficient c and adds the exponent of each variable. The function degmon
uses the built-in Mathematica function PolynomialMod that, given a polynomial
ρ(x1^{α1} · · · xn^{αn}) = ρ(x1)^{α1} · · · ρ(xn)^{αn}
for each monomial x1^{α1} · · · xn^{αn} in M_{x1,...,xn}. Then, the ρ-evaluation of a polynomial
p in C[x1, . . . , xn] is
    eval_ρ(p) = Σ_{m ∈ M : p(m) ≠ 0} p(m) ρ(m).
For simplicity, we also write p(c1, . . . , cn) for eval_ρ(p) when ρ(xi) = ci for each
1 ≤ i ≤ n.
Example 3.2.5 Let p = 2x1^2 x2^2 + x1 x2^2 + 1 be a polynomial in Z5[x1, x2] and
consider the map ρ : {x1, x2} → Z5 such that ρ(x1) = 3 and ρ(x2) = 4. Then,
eval_ρ(p) = 2 · 3^2 · 4^2 + 3 · 4^2 + 1 = 2.
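The evaluation just defined can be sketched in Python over the dictionary encoding used above (an illustration under our own naming, not the book's code):

```python
def eval_poly(poly, values, modulus):
    """ρ-evaluation: sum coeff * prod(values[i]^exp[i]) over the
    nonzero terms, reduced modulo the field characteristic."""
    total = 0
    for mono, coeff in poly.items():
        term = coeff
        for var_value, exp in zip(values, mono):
            term *= var_value ** exp
        total += term
    return total % modulus

# Example 3.2.5: p = 2*x1^2*x2^2 + x1*x2^2 + 1 in Z5, with x1 -> 3, x2 -> 4
p = {(2, 2): 2, (1, 2): 1, (0, 0): 1}
print(eval_poly(p, (3, 4), 5))  # → 2
```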
We now define sum and multiplication of polynomials.
Definition 3.2.6 Let p1 and p2 be polynomials in C[x1 , . . . , xn ]. The sum of p1
and p2 is the polynomial p1 + p2 in C[x1 , . . . , xn ] such that
    (p1 + p2)(m) = p1(m) + p2(m)
for each monomial m in x1 , . . . , xn .
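In the dictionary encoding, the coefficient-wise sum of Definition 3.2.6 is a merge that drops monomials whose coefficients cancel; a minimal sketch (ours), assuming coefficients in Zn:

```python
def poly_add(p1, p2, modulus):
    """(p1 + p2)(m) = p1(m) + p2(m); monomials whose resulting
    coefficient is 0 are removed from the dictionary."""
    result = dict(p1)
    for mono, coeff in p2.items():
        c = (result.get(mono, 0) + coeff) % modulus
        if c == 0:
            result.pop(mono, None)
        else:
            result[mono] = c
    return result

# In Z5[x1]: (2*x1^2 + 3) + (3*x1^2 + 1) = 4, since 2 + 3 = 0 in Z5
print(poly_add({(2,): 2, (0,): 3}, {(2,): 3, (0,): 1}, 5))  # → {(0,): 4}
```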
3.2.2 Monomial orderings
x1^3 x2^2 x3 x4^2 >lx x1^2 x2^4 x3^2 x4^5 taking i = 1;
x1 x4^5 ≯lx x1 x2 x4^2 since the only exponent of a variable in x1 x4^5 that is greater
than the corresponding one in x1 x2 x4^2 is the exponent of x4, and the exponents of x2 in the two monomials are not equal.
Note that we assume that in a monomial the powers of the variables always
occur by the order of the variables in x1 , . . . , xn . For instance, by Definition 3.2.1,
x32 x21 is not a monomial in x1 , x2 . This property of monomials has been implicitly
used in the definition of >lx above.
As a side comment, we remark that it is not mandatory to impose a particular
order when defining monomials, that is, both x21 x32 and x32 x21 can be considered
monomials in x1 , x2 . Then, to introduce a definition of lexicographic order on
monomials as the one above, an ordering of the variables has to be previously
fixed and the monomials written accordingly. To obtain the order >lx above the
variable ordering is, of course, x1 > . . . > xn . But, different orderings on the
variables can also be considered, leading to different lexicographic orderings of
monomials.
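When monomials are encoded as exponent tuples (with a fixed variable ordering x1 > . . . > xn, as above), the lexicographic order is exactly Python's built-in tuple comparison; a small sketch with names of our choosing:

```python
def lex_greater(m1, m2):
    """m1 >lx m2: at the first position where the exponent tuples
    differ, m1 has the larger exponent. Python compares equal-length
    tuples in precisely this lexicographic way."""
    return m1 > m2

# x1^3*x2^2*x3*x4^2 >lx x1^2*x2^4*x3^2*x4^5 (first exponents: 3 > 2)
print(lex_greater((3, 2, 1, 2), (2, 4, 2, 5)))   # → True
# x1*x4^5 is NOT >lx x1*x2*x4^2 (second exponents: 0 < 1)
print(lex_greater((1, 0, 0, 5), (1, 1, 0, 2)))   # → False
```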
The order >lx on monomials can be used for defining division of polynomials,
since it satisfies the requirements above.
Proposition 3.2.12 The order >lx is a monomial order.
Proof: The order >lx is total and well founded (Exercise 7 in Section 3.5).
We now prove that >lx is preserved by the product of monomials. Assume
that mi = x1^{α_{1i}} · · · xn^{α_{ni}}, for i = 1, 2, are monomials such that m1 >lx m2. Let
m = x1^{β1} · · · xn^{βn}. Then, there is 1 ≤ j ≤ n such that α_{j1} > α_{j2} and α_{i1} = α_{i2}
for all 1 ≤ i < j. Since
    m · mi = x1^{α_{1i}+β1} · · · xn^{α_{ni}+βn}
for i = 1, 2, and α_{j1} + βj > α_{j2} + βj and α_{i1} + βi = α_{i2} + βi for all 1 ≤ i < j, we
also have m · m1 >lx m · m2.
QED
Next, we introduce the graded lexicographic order. This order also takes into
account the degree of monomials.
Definition 3.2.13 The graded lexicographic order on monomials in x1, . . . , xn
is denoted by >glx and defined as follows: given the monomials m and m′,
    m >glx m′
whenever one of the following conditions holds:
deg(m) > deg(m′);
deg(m) = deg(m′) and m >lx m′.
We again write m ≯glx m′ to denote the fact that it is not the case that
m >glx m′.
Example 3.2.14 Let us consider again monomials in x1, x2, x3, x4. Then
x1 x4^5 >glx x1 x2 x4^2 since deg(x1 x4^5) > deg(x1 x2 x4^2);
x1^2 x2^3 x3 >glx x1^2 x2^2 x4^2 since
(i) deg(x1^2 x2^3 x3) = deg(x1^2 x2^2 x4^2) = 6;
(ii) x1^2 x2^3 x3 >lx x1^2 x2^2 x4^2.
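The graded lexicographic comparison can likewise be sketched on exponent tuples by comparing total degree first and breaking ties lexicographically (names are ours):

```python
def grlex_key(mono):
    """Sort key for >glx: total degree first, ties broken by >lx."""
    return (sum(mono), mono)

def grlex_greater(m1, m2):
    return grlex_key(m1) > grlex_key(m2)

# x1*x4^5 >glx x1*x2*x4^2 since deg 6 > deg 4
print(grlex_greater((1, 0, 0, 5), (1, 1, 0, 2)))  # → True
# x1^2*x2^3*x3 >glx x1^2*x2^2*x4^2: equal degree 6, decided by >lx
print(grlex_greater((2, 3, 1, 0), (2, 2, 0, 2)))  # → True
```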
for i = 1, 2. In case (i), clearly deg(m · m1) > deg(m · m2). In case (ii), we have
deg(m · m1) = deg(m · m2) and, by Proposition 3.2.12, m · m1 >lx m · m2.
Hence, we conclude that m · m1 >glx m · m2.
QED
Observe that since >lx is a total order, given any finite set M of monomials
we can determine the maximum max(M) with respect to >lx . Moreover, since
>lx is also well founded we can determine the minimum min(M) of any set M of
monomials. Similarly with respect to >glx .
Any order > on monomials induces an order on the set of terms of a polynomial. Term coefficients are irrelevant and just the monomials are compared.
Hence, t > t′ whenever m_t > m_{t′} where m_t and m_{t′} are the monomials of t and
t′, respectively. The properties of monomial orders clearly extend to the orders
induced on the terms of a polynomial.
The Mathematica function monorderQ in Figure 3.6 receives as input a term
m1, a term m2 and a positive integer n, and returns a Boolean value. It returns
True if m1 >glx m2 and False otherwise. The function monorderQ uses the function degmon already presented in Figure 3.4 and the function index depicted in
Figure 3.5.
The function index receives as input a term q and a positive integer n. It
returns the index of the first variable in the monomial of q that has a nonzero
exponent. Assuming that q is c x1^{α1} · · · xm^{αm}, the function index first creates the
list {c, x1^{α1}, . . . , xm^{αm}} and then removes c. Afterwards, it creates the list with
the indices of the variables with a nonzero exponent and then returns its first
element.
index=Function[{q,n},Module[{p,w},
p=PolynomialMod[q,n];
If[Head[p]===Times,w=Apply[List,p],w={p}];
If[NumberQ[First[w]],w=Rest[w]];
First[
Map[Function[m,
If[Head[m]===Power,m[[1,2]],m[[2]]]],w]]]];
Figure 3.5: Index of the first variable with a nonzero exponent
The function monorderQ is recursively defined. Given the input terms m1 and
m2, it first tests if they are equal returning False if this is the case. Otherwise, it compares their degrees returning True if deg(m1) > deg(m2) and False
if deg(m1) < deg(m2). When deg(m1) = deg(m2), the function returns True if the
index of the first variable in m1 with a nonzero exponent is less than the index of
the first variable in m2 with a nonzero exponent. If it is greater it returns False.
Otherwise, the function decrements by 1 the exponents of these variables and
recursively checks the resulting terms.
monorderQ=Function[{m1,m2,n},
If[PolynomialMod[m1,n]===PolynomialMod[m2,n],False,
If[degmon[m1,n]>degmon[m2,n],True,
If[degmon[m1,n]<degmon[m2,n],False,
If[index[m1,n]<index[m2,n],True,
If[index[m1,n]>index[m2,n],False,
monorderQ[m1/Subscript[x,index[m1,n]],m2/Subscript[x,index[m1,n]],
n]]]]]]];
Figure 3.6: Checking whether m1 is greater than m2
In the sequel, when presenting a polynomial in C[x1, . . . , xn] as
    Σ_{i=1}^{s} ti    or    t1 + . . . + ts,
where t1, . . . , ts are the nonzero terms of the polynomial, we will often assume
that this presentation is ordered, that is, ti > t_{i+1} for all 1 ≤ i < s. Hence, terms
occur according to their ordering (induced by the monomial order > we are
considering).
Example 3.2.16 The presentation of the polynomial in R[x1, x2, x3]
    6x1^3 + 5x1 x2 + 3x1^2 x2 + 2x1 x2^2 + 4x1^2 x3 + x2 x3 + 2x1 x2 x3 + 6x2^2 x3
is not ordered. Its ordered presentation is
    6x1^3 + 3x1^2 x2 + 4x1^2 x3 + 2x1 x2^2 + 2x1 x2 x3 + 6x2^2 x3 + 5x1 x2 + x2 x3.
The next notions are useful in the sequel. Recall that each polynomial has a
finite number of nonzero terms and that all the monomial orders we are considering induce a total order on the terms of a polynomial.
Definition 3.2.17 Given a nonzero polynomial p in C[x1, . . . , xn], the leading
term of p, denoted by
    lt(p),
is the nonzero term t of p such that t > t′ for each nonzero term t′ of p distinct
from t. The polynomial p is said to be monic if lt(p) is a monic term.
Example 3.2.18 The leading term of the polynomial in R[x1, x2, x3]
    6x1^3 + 3x1^2 x2 + 4x1^2 x3 + 2x1 x2^2 + 2x1 x2 x3 + 6x2^2 x3 + 5x1 x2 + x2 x3
is 6x1^3. This polynomial is not monic.
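With the dictionary encoding and the grlex key shown earlier, the leading term is simply a maximum; a self-contained sketch (names ours) on the polynomial of Example 3.2.18:

```python
def leading_term(poly):
    """Leading term under >glx of a dict {exponent tuple: coefficient}."""
    mono = max(poly, key=lambda m: (sum(m), m))
    return mono, poly[mono]

# Example 3.2.18: 6*x1^3 + 5*x1*x2 + 3*x1^2*x2 + 2*x1*x2^2
#                 + 4*x1^2*x3 + x2*x3 + 2*x1*x2*x3 + 6*x2^2*x3
p = {(3, 0, 0): 6, (1, 1, 0): 5, (2, 1, 0): 3, (1, 2, 0): 2,
     (2, 0, 1): 4, (0, 1, 1): 1, (1, 1, 1): 2, (0, 2, 1): 6}
print(leading_term(p))  # → ((3, 0, 0), 6), i.e. the term 6*x1^3
```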
3.2.3 Division of terms and polynomials
polsort=Function[{q,n},Module[{p},
p=PolynomialMod[q,n];
If[Head[p]===Plus,
Sort[Apply[List,p],
Function[{h1,h2},monorderQ[h1,h2,n]]],
{p}]]];
Figure 3.7: Ordered list of the terms of q
lt=Function[{p,n},First[polsort[p,n]]];
Figure 3.8: Leading term of a polynomial
Terms
We start with the notion of divisibility of terms.
Definition 3.2.19 Let t1 and t2 be terms in x1, . . . , xn over C where t2 is a
nonzero term. We say that t1 is divisible by t2, or that t2 divides t1, whenever
there is a term t in x1, . . . , xn over C such that
    t1 = t · t2.
The term t is the quotient of the division of t1 by t2 and it is denoted by
    t1 / t2.
As we will see below, when t1 is divisible by t2 the term t is unique, hence the
above notion of quotient is well defined. We can also say that t1 is a multiple of
t2 whenever t1 is divisible by t2.
divisibility of nonzero terms only depends on the monomials of the terms. Recall
that nonzero term coefficients are nonzero elements of a field C and therefore have
multiplicative inverses.
Lemma 3.2.21 Let ti = ci x1^{α_{1i}} · · · xn^{α_{ni}}, for i = 1, 2, be two nonzero terms in
x1, . . . , xn over C.
1. t1 is divisible by t2 if and only if α_{j1} ≥ α_{j2} for every 1 ≤ j ≤ n.
2. If t1 is divisible by t2 then t = (c1 c2^{−1}) x1^{α_{11}−α_{12}} · · · xn^{α_{n1}−α_{n2}} is the only
term in x1, . . . , xn over C such that t1 = t · t2.
Indeed, if t′ = c′ x1^{α_{11}−α_{12}} · · · xn^{α_{n1}−α_{n2}} is also such that t1 = t′ · t2, then
c1 = c′ c2 and therefore c1 c2^{−1} = c′ c2 c2^{−1}, that is, c1 c2^{−1} = c′.
QED
The Mathematica function divisibleQ presented in Figure 3.9 checks whether
a term is divisible by another term in the ring of polynomials over Zn. It receives as input a term t1, a term t2 and a positive integer n, and returns a
Boolean value. It returns True if t1 is divisible by t2 and False otherwise.
The function first checks whether t1 = 0, returning True if this is the case.
Otherwise, it checks whether t2 = 0, returning False if this is the case. Otherwise, assuming that t1 = c1 x1^{α_{11}} · · · xm^{α_{1m}} and t2 = c2 x1^{α_{21}} · · · xm^{α_{2m}}, it creates the
list {c1/c2, x1^{α_{11}−α_{21}}, . . . , xm^{α_{1m}−α_{2m}}} and returns True if all the exponents α_{1i} − α_{2i}
are nonnegative and False otherwise.
divisibleQ=Function[{t1,t2,n},Module[{r1,r2},
r1=PolynomialMod[t1,n];
r2=PolynomialMod[t2,n];
If[r1===0,True,
If[r2===0,False,
If[Head[r1/r2]===Times,
Length[Select[Apply[List,r1/r2],
Function[p,If[Head[p]===Power,
p[[2]]<0,False]]]]==0,
If[Head[r1/r2]===Power,
(r1/r2)[[2]]>0,True]]]]]];
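The same divisibility test of Lemma 3.2.21 can be sketched directly on the exponent-tuple encoding (an illustration with names of our choosing, not a translation of the Mathematica code):

```python
from fractions import Fraction

def term_divides(t2, t1):
    """t2 | t1 iff every exponent of t2 is <= the matching exponent of
    t1 (Lemma 3.2.21); coefficients are irrelevant, being invertible."""
    (c2, m2), (c1, m1) = t2, t1
    return all(a2 <= a1 for a1, a2 in zip(m1, m2))

def term_quotient(t1, t2):
    """t1 / t2 = (c1 * c2^-1) * x^(m1 - m2), assuming t2 divides t1."""
    (c1, m1), (c2, m2) = t1, t2
    return (Fraction(c1, c2), tuple(a1 - a2 for a1, a2 in zip(m1, m2)))

t1 = (6, (3, 2))   # 6*x1^3*x2^2
t2 = (3, (1, 2))   # 3*x1*x2^2
print(term_divides(t2, t1))   # → True
print(term_quotient(t1, t2))  # → quotient 2*x1^2
```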
    q = 0                                        if k = 0,
    q = Σ_{i=0}^{k−1} gdt(pi, lt(d)) / lt(d)     if k > 0,
and
    r = pk.
We can prove that indeed p = q d + r and that q and r are unique in the above
sense. We do not present herein the details of the proof and refer the reader to
[10].
QED
The steps described above to get polynomials q and r such that p = q d + r
are the starting point for the division algorithm. The polynomial d is the divisor.
Proposition 3.2.22 ensures that such q and r are unique when we assume that r =
0 or r is a nonzero polynomial whose nonzero terms are not divisible by lt(d).
Then q is said to be the quotient of the division of p by d and r is said to be the
remainder of the division of p by d.
If r is a nonzero polynomial then p is not divisible by d. Otherwise, p is divisible by d. Clearly, the zero polynomial is divisible by any nonzero polynomial
d, since 0 = 0 · d.
Example 3.2.23 Let us consider the polynomials
    p = 3x1^3 x2^3 + 5x1^2 x2^3 − 6x1^3 x2 + 2x1^2 x2^2 − 10x1^2 x2 − 4x1^2
    d = x1 x2^2 − 2x1
in R[x1, x2]. We have that p = q · d + r where
    q = 3x1^2 x2 + 5x1 x2 + 2x1
and
    r = 0.
    p1 = p − (lt(p)/lt(d)) · d = 5x1^2 x2^3 + 2x1^2 x2^2 − 10x1^2 x2 − 4x1^2.
    p2 = p1 − (lt(p1)/lt(d)) · d = 2x1^2 x2^2 − 4x1^2.
    p3 = p2 − (lt(p2)/lt(d)) · d = 0.
The above computations can also be presented in the following way where, for
illustration purposes, we have also included the names of the polynomials:
    p    3x1^3 x2^3 + 5x1^2 x2^3 − 6x1^3 x2 + 2x1^2 x2^2 − 10x1^2 x2 − 4x1^2   |  x1 x2^2 − 2x1            d
         −3x1^3 x2^3 + 6x1^3 x2                                                |  3x1^2 x2 + 5x1 x2 + 2x1  q
    p1   5x1^2 x2^3 + 2x1^2 x2^2 − 10x1^2 x2 − 4x1^2
         −5x1^2 x2^3 + 10x1^2 x2
    p2   2x1^2 x2^2 − 4x1^2
         −2x1^2 x2^2 + 4x1^2
    p3   0
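The division procedure of Proposition 3.2.22 can be sketched in Python on the dictionary encoding: at each step the greatest term of the current polynomial divisible by lt(d) is cleared. This is an illustration under our own naming, reproducing Example 3.2.23:

```python
from fractions import Fraction

def lt(poly):
    mono = max(poly, key=lambda m: (sum(m), m))   # >glx leading term
    return mono, poly[mono]

def sub_times(p, coeff, mono, d):
    """Return p - coeff*x^mono*d, keeping only nonzero coefficients."""
    out = dict(p)
    for dm, dc in d.items():
        m = tuple(a + b for a, b in zip(mono, dm))
        c = out.get(m, Fraction(0)) - coeff * dc
        if c == 0:
            out.pop(m, None)
        else:
            out[m] = c
    return out

def divide(p, d):
    """Quotient and remainder: repeatedly clear the greatest term of
    the current polynomial that is divisible by lt(d)."""
    q, r = {}, {m: Fraction(c) for m, c in p.items()}
    dm, dc = lt(d)
    while True:
        cand = [m for m in r if all(a >= b for a, b in zip(m, dm))]
        if not cand:
            return q, r
        m = max(cand, key=lambda m: (sum(m), m))
        qm = tuple(a - b for a, b in zip(m, dm))
        qc = r[m] / dc
        q[qm] = q.get(qm, Fraction(0)) + qc
        r = sub_times(r, qc, qm, d)

# Example 3.2.23 over the rationals
p = {(3, 3): 3, (2, 3): 5, (3, 1): -6, (2, 2): 2, (2, 1): -10, (2, 0): -4}
d = {(1, 2): 1, (1, 0): -2}      # d = x1*x2^2 - 2*x1
q, r = divide(p, d)
print(q)  # q = 3*x1^2*x2 + 5*x1*x2 + 2*x1
print(r)  # → {}
```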
and
    r = 2x2^3 x3^5
    4x1^2 x3 + 2
    3x1^2 x2^3 x3 + 2x2^2 x3^2
Recall that in Z5 the equalities −2 = 3 and −1 = 4 hold and the equality 3 + 2 = 0
also holds.
It is worthwhile noticing that when we divide a nonzero polynomial p by a
nonzero polynomial d the uniqueness of the quotient and remainder polynomials
depends on the particular ordering of monomials we are considering. Different
orders may lead to different quotients and remainders, since the leading term of
d depends on the particular monomial order. The following example illustrates
this situation.
Example 3.2.25 Let us consider the polynomial p = 4x1^2 x2^2 + 2x2^3 and the polynomial d = x2^2 + x1 in R[x1, x2]:
assuming the order >glx we have p = q · d + r where
    q = 4x1^2 + 2x2,
    r = −4x1^3 − 2x1 x2,
since using the division algorithm we get the following
    4x1^2 x2^2 + 2x2^3        |  x2^2 + x1
    −4x1^2 x2^2 − 4x1^3       |  4x1^2 + 2x2
    −4x1^3 + 2x2^3
    −2x2^3 − 2x1 x2
    −4x1^3 − 2x1 x2;
assuming the order >lx, the ordered presentation of d is x1 + x2^2, and the first
step of the division algorithm subtracts 4x1 x2^2 · d = 4x1^2 x2^2 + 4x1 x2^4 from p.
Note that the ordered form of d is different in both cases, thus the leading
term is also different.
3.2.4 Reduction modulo a set of polynomials

Let p and d be polynomials in C[x1, . . . , xn] with d nonzero. We say that p reduces
to p′ modulo d in one step, written
    p −→_d p′,
if
    p′ = p − (t / lt(d)) · d
where t is a term of p divisible by lt(d). For instance, for the polynomials p and d
of Example 3.2.23, choosing t = lt(p) we have
    p −→_d p − (lt(p)/lt(d)) · d = 5x1^2 x2^3 + 2x1^2 x2^2 − 10x1^2 x2 − 4x1^2
and, choosing t = 5x1^2 x2^3, we have
    p −→_d p − (5x1^2 x2^3 / lt(d)) · d = 3x1^3 x2^3 − 6x1^3 x2 + 2x1^2 x2^2 − 4x1^2.
Another reduction of p modulo d is also possible given that the term 2x21 x22 is also
divisible by lt(d).
Note that the first reduction above is just the polynomial p1 we got in the
first step of the division algorithm in Example 3.2.23. It is easy to see that if we
reduce p1 modulo d in one step using the term lt(p1 ) we get the polynomial p2
therein. Similarly, we get 0 by reducing p2 modulo d.
When reducing p modulo d in one step, we always get rid of the term t divisible
by lt(d) that we choose to compute the reduction. This term is replaced in p by
a multiple of the polynomial that results from d by removing lt(d). It is also easy
to conclude that the reduction of p modulo d in one step may correspond to a
step of the division algorithm presented in Section 3.2.3. The only difference is
that in the division algorithm we always choose the greatest nonzero term that
is divisible by the leading term of d and herein we can choose any term divisible
by the leading term of d.
The following result is useful in the sequel.
Lemma 3.2.28 Let p and d be nonzero polynomials in C[x1, . . . , xn]. Assume
that t is a term of p divisible by lt(d) and that p′ = p − (t/lt(d)) · d is a nonzero
polynomial. Then, lt(p) >glx lt(p′) whenever t = lt(p), and lt(p) = lt(p′) whenever
t ≠ lt(p).
Proof: Let p = Σ_{i=1}^{rp} t^p_i and d = Σ_{i=1}^{rd} t^d_i. Hence,
    p′ = Σ_{i=1}^{rp} t^p_i − Σ_{i=1}^{rd} (t / lt(d)) t^d_i.
Since p and d are ordered polynomials, lt(p) = t^p_1, lt(d) = t^d_1. Moreover, given
1 < i ≤ rp and 1 < i ≤ rd we have that lt(p) >glx t^p_i and lt(d) >glx t^d_i. In particular, by Proposition 3.2.15,
    (t / lt(d)) · lt(d) >glx (t / lt(d)) · t^d_i,
that is,
    t >glx (t / lt(d)) · t^d_i.
(1) t = lt(p). Then the terms t^p_1 and (t/lt(d)) · lt(d) cancel each other and
    p′ = Σ_{i=2}^{rp} t^p_i − Σ_{i=2}^{rd} (t / lt(d)) t^d_i.
Since lt(p) >glx t^p_i for all 1 < i ≤ rp and lt(p) >glx (t/lt(d)) t^d_i for all 1 < i ≤ rd, we
conclude that lt(p) >glx lt(p′).
(2) t ≠ lt(p). Then, t = t^p_j for some j > 1 and, when computing p′, the terms t^p_j
and (t/lt(d)) · lt(d) cancel each other. Moreover, t^p_k >glx t for all 1 ≤ k < j and, as a
consequence,
    t^p_k >glx (t / lt(d)) · t^d_i
for all 1 < i ≤ rd. Hence, no term t^p_k, with 1 ≤ k < j, cancels with a term
(t / lt(d)) · t^d_i, with 1 < i ≤ rd, and therefore t^p_k is a term of p′ for all 1 ≤ k < j. We
can then conclude that lt(p′) = t^p_1 = lt(p).
QED
Note that Lemma 3.2.28 also holds when polynomials are ordered using the
order >lx instead of >glx. Clearly, in this case we have that
    lt(p) >lx lt(p′).
selterm=Function[{q,u,n},Module[{p,t,w,r,i},
p=PolynomialMod[q,n];
t=PolynomialMod[u,n];
If[Head[p]===Plus,w=Apply[List,p],w=p];
r=0;
i=1;
While[i<=Length[w]&&r===0,
If[divisibleQ[w[[i]],t,n],r=w[[i]]];
i=i+1];
r]];
Figure 3.10: Selecting a term of q divisible by u
The Mathematica function redone in Figure 3.11 receives as input two polynomials f and g and a positive integer n. It returns as output the polynomial that
results from a reduction of f modulo g in one step. The integer n indicates that
we are considering polynomials over Zn . Besides the function lt (see Figure 3.8),
it uses the Mathematica function selterm in Figure 3.10.
The function selterm receives as input a polynomial q, a term u and a positive
integer n. It returns a term of q that is divisible by u, if such a term exists, and
0 otherwise. The function first creates the list of the terms of q and then passes
through the list using the function divisibleQ (see Figure 3.9) to pick the first
term divisible by u. When there is no such term it returns 0.
redone=Function[{f,g,n},
Expand[PolynomialMod[f-(selterm[f,lt[g,n],n]/lt[g,n])*g,n]]];
Figure 3.11: One step reduction
The notion of reduction can now be introduced capitalizing on the one step
reduction.
Definition 3.2.29 Let p be a nonzero polynomial and D a finite set of nonzero
polynomials in C[x1, . . . , xn]. Then, p reduces to p′ modulo D, written
    p −→_D p′,
if there is a sequence
    p0, . . . , pm,
where m ∈ N0, such that
p0 = p and pm = p′;
for each 0 ≤ i < m there is d ∈ D such that pi −→_d p_{i+1}.
    d1 = 3x2^2 + 2
    d2 = x1^2
    p −→_D 3x2^2 + 5x1
since
    p −→_{d2} 6x2^4 + 5x1 −→_{d1} 3x2^2 + 5x1.
When p reduces to p′ modulo a finite set D of polynomials, the polynomial p − p′ can be expressed in terms of the polynomials in D. In particular, if p
reduces to 0, then we can express p in terms of the polynomials in the set. These
properties are useful later on.
Lemma 3.2.31 Let d1, . . . , dk, k ∈ N0, be nonzero polynomials in C[x1, . . . , xn].
Assume that p0, . . . , pm, m ∈ N0, are polynomials in C[x1, . . . , xn] such that, for
each 0 ≤ i < m, pi reduces to p_{i+1} modulo dj for some 1 ≤ j ≤ k. Then,
    p0 − pm = Σ_{i=1}^{k} ai di
for some polynomials a1, . . . , ak in C[x1, . . . , xn].
Proof: The proof follows by induction on m.
Basis: m = 0. Then p0 − pm = 0 = Σ_{i=1}^{k} 0 di.
Step: Let p0, . . . , p_{m+1} be such that, for each 0 ≤ i < m + 1, pi −→_{dj} p_{i+1} for some
1 ≤ j ≤ k. By the induction hypothesis,
    p0 − pm = Σ_{i=1}^{k} ai di.
Since pm −→_{dj} p_{m+1}, we have p_{m+1} = pm − (t/lt(dj)) dj for some term t of pm,
and therefore p0 − p_{m+1} = Σ_{i=1}^{k} a′i di where a′i = ai + t/lt(dj)
if i = j and a′i = ai otherwise, for all 1 ≤ i ≤ k.
QED
Proposition 3.2.32 Let p and p′ be polynomials and D = {d1, . . . , dk} a finite
set of nonzero polynomials in C[x1, . . . , xn]. If p −→_D p′ then
    p − p′ = Σ_{i=1}^{k} ai di
for some polynomials a1, . . . , ak in C[x1, . . . , xn].
Proof: Immediate from Lemma 3.2.31.
QED
Example 3.2.33 Let us consider the polynomials
    d2 = x2^3 + x1
    d3 = 4x1^3 + 2x1
and
    p = 4x1^3 x2 + 4x1^4 + x1 x2^3 + 3x1^2 + 2x1 x2.
Assuming D = {d1, d2, d3}, we have that
    p −→_D 0
since
    p −→_{d3} 4x1^4 + x1 x2^3 + 3x1^2 −→_{d3} x1 x2^3 + x1^2 −→_{d2} 0.
Indeed:
    p −→_{d3} p1 where p1 = p − (4x1^3 x2 / lt(d3)) · d3 = p − x2 · d3;
    p1 −→_{d3} p2 where p2 = p1 − (lt(p1) / lt(d3)) · d3 = p1 − x1 · d3;
    p2 −→_{d2} 0 where 0 = p2 − (lt(p2) / lt(d2)) · d2 = p2 − x1 · d2.
Therefore,
    p = p1 + x2 · d3 = p2 + x1 · d3 + x2 · d3 = x1 · d2 + x1 · d3 + x2 · d3,
that is,
    p = x1 · d2 + (x1 + x2) · d3,
or
    p = 0 · d1 + x1 · d2 + (x1 + x2) · d3.
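The identity p = x1 · d2 + (x1 + x2) · d3 obtained in the example can be checked numerically with a small dict-based polynomial arithmetic; this is our own sketch (the modulus 5 is an assumption suggested by the coefficients, though the identity holds over any ring):

```python
def poly_mul(p1, p2, mod):
    """Multiply two dict-encoded polynomials, coefficients mod `mod`."""
    out = {}
    for m1, c1 in p1.items():
        for m2, c2 in p2.items():
            m = tuple(a + b for a, b in zip(m1, m2))
            out[m] = (out.get(m, 0) + c1 * c2) % mod
    return {m: c for m, c in out.items() if c != 0}

def poly_add(p1, p2, mod):
    out = dict(p1)
    for m, c in p2.items():
        out[m] = (out.get(m, 0) + c) % mod
    return {m: c for m, c in out.items() if c != 0}

# d2 = x2^3 + x1, d3 = 4x1^3 + 2x1 and p as in Example 3.2.33
d2 = {(0, 3): 1, (1, 0): 1}
d3 = {(3, 0): 4, (1, 0): 2}
p  = {(3, 1): 4, (4, 0): 4, (1, 3): 1, (2, 0): 3, (1, 1): 2}
x1 = {(1, 0): 1}
x1_plus_x2 = {(1, 0): 1, (0, 1): 1}
combo = poly_add(poly_mul(x1, d2, 5), poly_mul(x1_plus_x2, d3, 5), 5)
print(combo == p)  # → True
```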
The Mathematica function red in Figure 3.12 receives as input two polynomials f and g and a positive integer n. It returns a polynomial irreducible modulo
{g}. The function repeatedly uses the function redone (see Figure 3.11) to reduce
f modulo g in one step until an irreducible polynomial is obtained.
red=Function[{f,g,n},FixedPoint[Function[h,redone[h,g,n]],f]];
Figure 3.12: Reduction of polynomial f modulo g
The Mathematica function redmod in Figure 3.13 receives as input a polynomial f, a list of polynomials G and a positive integer n. It returns a polynomial
G-irreducible that results from the reduction of f modulo G. This function extends
the function red to a set of polynomials G = {g1 , . . . , gm }. It is the fixed point of
the function that, given a polynomial h, obtains a polynomial h1 by reducing h
modulo g1 , a polynomial h2 by reducing h1 modulo g2 and so on, finally returning
hm .
redmod=Function[{f,G,n},
FixedPoint[
Function[h,Module[{i,r},
r=h;
i=1;
While[i<=Length[G],
r=red[r,G[[i]],n];
i=i+1];
r]],
f]];
Figure 3.13: Reduction of polynomial f modulo G
3.3 Grobner bases
In this section we introduce the notion of Grobner basis and some related concepts
and properties. Grobner bases were first introduced by B. Buchberger [7, 8] and
were originally proposed as an algorithmic solution to some important problems
in polynomial ring theory and algebraic geometry (for more technical details see
[1]). Since then many applications and generalizations have been proposed. The
interested reader can consult [30, 5, 32, 6, 27].
Grobner bases are often seen as a multivariate generalization of the Euclidean
algorithm for computing the greatest common divisor for univariate polynomials.
They are also seen as a nonlinear generalization of the Gaussian elimination algorithm for linear systems. In Section 3.4.2 we discuss how to use Grobner bases
to solve systems of nonlinear polynomial equations.
3.3.1
Ring ideals
In a nutshell, the idea behind the Grobner basis approach is as follows. Given
a (finite) set S of polynomials in some ring of polynomials (that depends on the
particular problem at hand), S can be transformed into a set G of polynomials
(the Grobner basis) which is equivalent to S in a sense to be discussed later on.
This new set G satisfies some good properties and, as a consequence, many problems and questions that were difficult to handle when considering the arbitrary
set S become easier when we work with G. Since the transformation from S
to G can be algorithmically performed, many problems involving finite sets of
polynomials become algorithmically solvable.
A typical example of this situation is the ideal membership problem. The
ideal membership problem consists in checking whether or not some polynomial
is a member of the ideal of polynomials generated by a given finite set S of polynomials. In general, this problem is not easy to solve. However, if we transform
S into a Grobner basis G that generates the same ideal the problem is easily
solved.
There are several concepts that have to be presented before defining Grobner
bases. In this section we introduce ideals, ideals of polynomials and some relevant
related notions and properties. In Section 3.3.2 we then discuss Grobner bases.
An ideal is a subset of a ring with some interesting properties. Using ideals
important properties of integer numbers can be generalized to other rings. For
example, prime ideals and coprime ideals can be defined as a generalization of
prime and coprime numbers and there is even a Chinese remainder theorem involving ideals. Herein we do not detail this subject. The interested reader is
referred to [20].
Definition 3.3.1 An ideal I over a commutative ring (A, +, 0, −, ·) is a nonempty
subset of A such that for all a1, a2 ∈ A:
if a1, a2 ∈ I then a1 + a2 ∈ I;
if a1 ∈ A and a2 ∈ I then a1 · a2 ∈ I.
Example 3.3.2 Consider the ring Z of integer numbers with the usual operations and n Z. The set of the multiples of n is an ideal over Z since (i) this set
is not empty, (ii) the sum of multiples of n is a multiple of n and (iii) the product
of an integer number by a multiple of n is again a multiple of n.
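The two closure conditions for the ideal nZ can be checked exhaustively on a small range in Python (an illustration of ours, not part of the text):

```python
# The set of multiples of n is closed under addition, and absorbs
# multiplication by arbitrary integers -- the two ideal conditions.
n = 6
multiples = [k * n for k in range(-10, 11)]
ring_elems = range(-10, 11)

closed_under_sum = all((a + b) % n == 0 for a in multiples for b in multiples)
absorbs_products = all((r * a) % n == 0 for r in ring_elems for a in multiples)
print(closed_under_sum, absorbs_products)  # → True True
```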
Example 3.3.3 Let (A, +, 0, −, ·) be a unitary commutative ring. It is easy to
conclude that:
{0} and A are ideals over A;
if 1 ∈ I for an ideal I over A, then I = A.
In the second case note that given any a ∈ A we have that a · 1 = a ∈ I.
Proof: Recall the ideal (g1, . . . , gk) from Example 3.3.4. Statement (i) follows
easily from Definition 3.3.5. With respect to statement (ii), note that if p ∈ I
then
    p = Σ_{j=1}^{m} a_{g′j} g′j
for some g′1, . . . , g′m ∈ {g1, . . . , gk}, where a_{g′j} ∈ A for every 1 ≤ j ≤ m. Hence,
    p = Σ_{i=1}^{k} ai gi
for some a1, . . . , ak ∈ A.
The following result states that every ideal over a ring of polynomials is generated by some finite set of polynomials.
Proposition 3.3.8 Every ideal I over C[x1 , . . . , xn ] is finitely generated.
In fact, if I = {0} then I = (0). Otherwise, letting J be the ideal generated
by the leading terms of the polynomials in I, there is a finite set {g1 , . . . , gr } I
such that J = (lt(g1 ), . . . , lt(gr )), that is, J is finitely generated. Moreover,
I = (g1 , . . . , gr ) and therefore I is finitely generated. For the details of the proof
we refer the reader to [10], for instance.
When G is a basis of an ideal I of polynomials and p −→_G p′, then, using
Proposition 3.2.32, it is easy to conclude that p − p′ ∈ I and that p ∈ I if and
only if p′ ∈ I. In particular, if p −→_G 0, then p ∈ I. These properties are useful
in the sequel.
Proposition 3.3.9 Let G be a basis of an ideal I over C[x1, . . . , xn] and let p be
a polynomial in C[x1, . . . , xn].
1. If p −→_G p′ then p ∈ I if and only if p′ ∈ I.
2. If 1 ∈ G then p −→_G 0.
Proof:
1. The fact that p − p′ ∈ I follows from Proposition 3.2.32 and Definition 3.3.5. If
p′ ∈ I then, since p − p′ ∈ I and I is an ideal, we also have p = p′ + (p − p′) ∈ I.
The converse is similar.
2. If p is 0 the result is immediate. Otherwise, note that lt(1) divides lt(p) and
therefore
    p −→_1 p − (lt(p) / lt(1)) · 1.
    g2 = x2^3 + x1
    g3 = 4x1^3 + 2x1,
we have that p −→_G 0.
Hence, by Proposition 3.3.9, we conclude that p − 0 ∈ I and therefore p ∈ I.
Moreover, recalling again Example 3.2.33, we have that
    p = x1 · g2 + (x1 + x2) · g3,
or even
    p = 0 · d1 + x1 · d2 + (x1 + x2) · d3
if we want to explicitly consider all the polynomials in the basis.
3.3.2 Buchberger criterion
p −→_G 0.
(⇐) We prove the result by contraposition. Assume that G is not a Grobner
basis. Then there is a nonzero polynomial p ∈ I such that lt(p) is not divisible
by lt(g) for all g ∈ G. If p is irreducible modulo G, since p ≠ 0, we conclude that
p cannot be reduced to 0 modulo G. If p reduces to p′ in one step modulo an
element of G, Lemma 3.2.28 ensures that
    lt(p′) = lt(p).
Hence p′ ≠ 0. Again, lt(p′) is not divisible by lt(g) for all g ∈ G and so we can
reason in a similar way. Therefore, it is easy to conclude that it is not possible
to reduce p to 0 modulo G.
QED
Although not obvious from Definition 3.3.11, it is the case that every ideal
over C[x1 , . . . , xn ] has a Grobner basis. In fact, it can have many Grobner bases.
Reduced Grobner bases play an important role since every polynomial ideal has
only one reduced Grobner basis (see Proposition 3.3.30).
Definition 3.3.13 A Grobner basis G of an ideal I over the ring C[x1, . . . , xn]
is said to be a reduced Grobner basis if for each g ∈ G:
lt(g) is monic;
lt(g) does not divide any nonzero term of g′ for all g′ ∈ G \ {g}.
basis. Using the Buchberger criterion we only have to consider a finite number
of polynomials. We first introduce the notion of least common multiple of two
terms.
Definition 3.3.14 Let t1 = k1 x1^{α1} · · · xn^{αn} and t2 = k2 x1^{β1} · · · xn^{βn} be nonzero terms
in x1, . . . , xn over C. The least common multiple of t1 and t2, denoted lcm(t1, t2),
is the monic term
    x1^{γ1} · · · xn^{γn}
where γi = max{αi, βi} for each 1 ≤ i ≤ n.
The Mathematica function polLCM in Figure 3.14 computes the least common
multiple of two terms using the built-in Mathematica function PolynomialLCM
and the auxiliary function coeflt. The function coeflt receives as input a
polynomial p and a positive integer n. It returns the coefficient of the leading
term of p. The function first checks whether p has only one nonzero term, in which
case it returns its coefficient. Otherwise, it uses the function lt (see Figure 3.8)
to obtain the leading term of p. The integer n indicates that we are considering
polynomials over Zn .
coeflt=Function[{p,n},Module[{q},
q=PolynomialMod[p,n];
If[Head[lt[q,n]]===Power,1,
If[NumberQ[q],q,
If[NumberQ[lt[q,n][[1]]],lt[q,n][[1]],1]]]]];
polLCM=Function[{t1,t2},
PolynomialLCM[t1/coeflt[t1],t2/coeflt[t2]]];
Figure 3.14: Least common multiple of terms t1 and t2
The function polLCM receives as input two terms t1 and t2, and returns
lcm(t1, t2). The terms t1 and t2 are first divided by the corresponding leading
term coefficient. Then the function PolynomialLCM is used to obtain their least
common multiple.
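In the exponent-tuple encoding the least common multiple of Definition 3.3.14 is just a componentwise maximum; a short sketch with names of our choosing:

```python
def term_lcm(t1, t2):
    """Monic least common multiple of two nonzero terms: coefficient 1
    and the componentwise maximum of the exponent tuples."""
    (c1, m1), (c2, m2) = t1, t2
    return (1, tuple(max(a, b) for a, b in zip(m1, m2)))

t1 = (3, (2, 3, 1))   # 3*x1^2*x2^3*x3
t2 = (5, (0, 3, 4))   # 5*x2^3*x3^4
print(term_lcm(t1, t2))  # → (1, (2, 3, 4)), i.e. x1^2*x2^3*x3^4
```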
We now present Buchberger polynomials, also called S-polynomials.
Definition 3.3.15 Let p1 and p2 be nonzero polynomials in C[x1, . . . , xn]. The
Buchberger polynomial of p1 and p2, denoted B(p1, p2), is the polynomial
    B(p1, p2) = (lcm(lt(p1), lt(p2)) / lt(p1)) · p1 − (lcm(lt(p1), lt(p2)) / lt(p2)) · p2
in C[x1, . . . , xn].
Example 3.3.16 Let us consider the polynomials
    p1 = 3x1^2 x2^3 x3 + 2x1^2 x2 + x2 x3 + 1
    p2 = 5x2^3 x3^4 + 3x2^2 x3^2
in R[x1, x2, x3]. Then lcm(lt(p1), lt(p2)) = x1^2 x2^3 x3^4 and
    B(p1, p2) = (1/3) x3^3 · p1 − (1/5) x1^2 · p2
              = (2/3) x1^2 x2 x3^3 + (1/3) x2 x3^4 + (1/3) x3^3 − (3/5) x1^2 x2^2 x3^2.
Observe that the leading terms of the polynomials p1 and p2 are canceled when
we compute their Buchberger polynomial. Moreover, B(p1, p2) = −B(p2, p1)
and, therefore, it is easy to conclude that if B(p1, p2) reduces to 0 modulo a set
of polynomials so does B(p2, p1) (Exercise 14 in Section 3.5). Note also that if
p1, p2 ∈ I then B(p1, p2) ∈ I, where I is an ideal of polynomials.
The Mathematica function polBuch in Figure 3.15 receives as input two polynomials f and g and a positive integer n. It returns the Buchberger polynomial of
f and g. The Buchberger polynomial is computed as expected using the function
polLCM (see Figure 3.14).
polBuch=Function[{f,g,n},
Expand[PolynomialMod[polLCM[lt[f,n],lt[g,n]]/lt[f,n]*f
-polLCM[lt[f,n],lt[g,n]]/lt[g,n]*g,n]]];
Figure 3.15: Buchberger polynomial of f and g
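The S-polynomial construction of Definition 3.3.15 can also be sketched on the dictionary encoding over the rationals (our own illustration, with the >glx leading term):

```python
from fractions import Fraction

def lt(poly):
    mono = max(poly, key=lambda m: (sum(m), m))   # >glx leading term
    return mono, poly[mono]

def s_poly(p1, p2):
    """B(p1, p2) = (lcm/lt(p1))*p1 - (lcm/lt(p2))*p2."""
    (m1, c1), (m2, c2) = lt(p1), lt(p2)
    lcm = tuple(max(a, b) for a, b in zip(m1, m2))
    out = {}
    for poly, mono, coeff, sign in ((p1, m1, c1, 1), (p2, m2, c2, -1)):
        cof_mono = tuple(a - b for a, b in zip(lcm, mono))
        cof_coeff = Fraction(sign, 1) / coeff
        for m, c in poly.items():
            mm = tuple(a + b for a, b in zip(cof_mono, m))
            cc = out.get(mm, Fraction(0)) + cof_coeff * c
            if cc == 0:
                out.pop(mm, None)
            else:
                out[mm] = cc
    return out

# p1 = x1^2 + x2, p2 = x1*x2 + 1: B(p1, p2) = x2*p1 - x1*p2 = x2^2 - x1
p1 = {(2, 0): 1, (0, 1): 1}
p2 = {(1, 1): 1, (0, 0): 1}
print(s_poly(p1, p2))  # the leading terms x1^2*x2 cancel
```

Note how the two leading terms both map to the lcm monomial and cancel, exactly as remarked after Definition 3.3.15.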
We can now introduce the Buchberger criterion for checking whether a given
basis of an ideal is a Grobner basis. The following theorem is known as the
Buchberger theorem. We only sketch the proof of the theorem. For the complete
proof we refer the interested reader to [10].
Theorem 3.3.17 Let G be a basis of an ideal I over the ring C[x1, . . . , xn]. Then
G is a Grobner basis if and only if
    B(g, g′) −→_G 0
for all distinct polynomials g, g′ ∈ G.
Proof (sketch):
(⇒) Since B(g, g′) ∈ I for all distinct polynomials g, g′ ∈ G, if G is a Grobner basis
then Proposition 3.3.12 ensures that B(g, g′) −→_G 0.
(⇐) Conversely, we have to prove that for each p ∈ I there is g ∈ G such that
lt(g) divides lt(p). Consider p ∈ I and assume that G consists of k polynomials
g1, . . . , gk. Without loss of generality, assume that all the polynomials in G are
monic. Let A be the set of all tuples (a1, . . . , ak) of polynomials in C[x1, . . . , xn]
such that p = Σ_{i=1}^{k} ai gi. For each (a1, . . . , ak) ∈ A let M_{(a1,...,ak)} be the set of
the monomials of lt(ai gi) for all 1 ≤ i ≤ k and let m_{(a1,...,ak)} = max(M_{(a1,...,ak)}).
Note that either m_{(a1,...,ak)} is the monomial of lt(p) or m_{(a1,...,ak)} >glx lt(p). The
monomial of lt(p) may differ from m_{(a1,...,ak)} because the lt(ai gi)'s whose monomial
is m_{(a1,...,ak)} may cancel each other. Let M be the set of all m_{(a1,...,ak)} with
(a1, . . . , ak) ∈ A and let m = min(M).
As remarked above, either m is the monomial of lt(p) or m >glx lt(p). Using
the hypothesis that B(g, g′) reduces to 0 modulo G for all distinct g, g′ ∈ G, if
m >glx lt(p) then it would be possible to get a′1, . . . , a′k ∈ C[x1, . . . , xn] such that
p = Σ_{i=1}^{k} a′i gi and m >glx lt(a′i gi) for each 1 ≤ i ≤ k. But this contradicts the
assumption m = min(M) and, as a consequence, we can conclude that m is the
monomial of lt(p). Then there is 1 ≤ i ≤ k such that lt(gi) divides lt(p), as required.
QED
QED
Taking into account Theorem 3.3.17, to check whether a set of polynomials
is a Grobner basis we just have to consider a finite number of polynomials: the
Buchberger polynomials of the pairs of distinct elements of G.
Example 3.3.18 Consider the polynomials
    g1 = 2x1^2 x2 + x2
    g2 = x2^3 + x1
    g3 = 4x1^3 + 2x1
in Z5[x1, x2] and let G = {g1, g2, g3}. Then
    B(g1, g2) = 4x1^3 + 3x2^3 −→_{g3} 3x2^3 + 3x1 −→_{g2} 0;
    B(g1, g3) = 0;
    B(g2, g3) = x1^4 + 2x1 x2^3 −→_{g3} 2x1 x2^3 + 2x1^2 −→_{g2} 0.
Consider now the set G = {g1, g2} consisting of the first two polynomials of
Example 3.3.18. We have lcm(lt(g1), lt(g2)) = x1^2 x2^3 and
    B(g1, g2) = (x1^2 x2^3 / (2x1^2 x2)) · g1 − (x1^2 x2^3 / x2^3) · g2 = 4x1^3 + 3x2^3
and
    B(g1, g2) −→_{g2} 4x1^3 + 2x1
since lt(g2) divides the term 3x2^3.
Note that the nonzero polynomial 4x1^3 + 2x1 cannot be reduced modulo g1 or
g2 since none of its terms is divisible by lt(g1) or lt(g2), respectively. Since
we have not proven that reductions modulo a set of polynomials are unique, in
the sense that p′ = p′′ whenever p −→_G p′ and p −→_G p′′ and p′, p′′ are irreducible
modulo G, we cannot yet use Theorem 3.3.17 to conclude that G is not a
Grobner basis. We have to ensure that there are no other possible reductions of
B(g1, g2) modulo G. It is easy to conclude that this is indeed the case, since lt(g1)
does not divide any term of 4x1^3 + 3x2^3 and lt(g2) only divides the term 3x2^3. Thus,
we can conclude that B(g1, g2) cannot be reduced to a zero polynomial modulo
G. By Theorem 3.3.17, G is not a Grobner basis.
The Mathematica function GrobnerQ in Figure 3.16 receives as input a list
of polynomials G and a positive integer n, and returns a Boolean value. It returns
True if G is a Grobner basis of the ideal generated by the polynomials in G, and
False otherwise. The predicate uses the functions redmod (see Figure 3.13) and
polBuch (see Figure 3.15) according to Theorem 3.3.17.
GrobnerQ=Function[{G,n},
Apply[And,
Flatten[Table[
redmod[polBuch[G[[i]],G[[j]],n],G,n]===0,
{i,1,Length[G]},{j,1,Length[G]}]]]];
Figure 3.16: Checking whether G is a Grobner basis
3.3.3 Buchberger algorithm
In this section we present the Buchberger algorithm. Given a finite set of polynomials G as input, the Buchberger algorithm outputs a finite set of polynomials
that constitutes a reduced Grobner basis of the ideal generated by the polynomials
in G. A possible implementation of the Buchberger algorithm in Mathematica is
also presented. Unless otherwise stated, we consider the order >glx on monomials,
but any other monomial order suitable for polynomial division will do (see Section 3.2.3).
We first address the problem of obtaining a Grobner basis of an ideal I of
polynomials from any given set of generators of I. Then, we show how to obtain
a reduced Grobner basis of I from a given Grobner basis of I.
The first step of the Buchberger algorithm consists of a technique for obtaining
a Grobner basis of an ideal I from a given basis of I. It is based on the Buchberger
theorem (Theorem 3.3.17).
In this first step, given any basis G of I, we look for pairs of polynomials
g, g′ whose Buchberger polynomial does not reduce to 0 and enlarge the basis with
the corresponding irreducible remainders, until reaching a set Gm such that
B(g, g′) →_{Gm} 0 for every pair of distinct polynomials g, g′ ∈ Gm.
divides lt(pi) for each i ∈ N0, and therefore J0 ⊊ J1 ⊊ J2 ⊊ · · ·, that is, the strict inclusion Ji ⊊ Ji+1 holds for each i ∈ N0. However, one can prove that no such
infinite increasing chain of ideals exists. Hence, there is some m ∈ N0 such that, for all
g, g′ ∈ Gm, if B(g, g′) reduces to p modulo Gm and p is Gm-irreducible then either
p = 0 or p ∈ Gm. Since p is Gm-irreducible, it cannot be the case that p ∈ Gm.
Hence, p = 0. Given that each B(g, g′) can always be reduced to a Gm-irreducible
polynomial, we conclude that such a finite sequence G0, . . . , Gm of sets of polynomials
indeed exists.
Since G ⊆ Gm and G is a basis of I, the set Gm is also a basis of I. Theorem
3.3.17 ensures that it is a Grobner basis.
QED
Proposition 3.3.20 sketches a technique for obtaining a Grobner basis of an
ideal I from any basis of I. The following example illustrates the construction
of a Grobner basis.
Example 3.3.21 Consider the polynomials
g1 = 2x1^2 x2 + x2
and
g2 = x2^3 + x1
in Z5[x1, x2]. To obtain a Grobner basis of the ideal (g1, g2) we can proceed as follows:
1. G0 = {g1, g2};
2. B(g1, g2) →_{G0} 4x1^3 + 2x1 and 4x1^3 + 2x1 is G0-irreducible, so we let
g3 = 4x1^3 + 2x1 and G1 = {g1, g2, g3};
3. B(g1, g2) →_{G1} 0, B(g1, g3) →_{G1} 0 and B(g2, g3) →_{G1} 0 (see Example 3.3.18),
and therefore G1 = {g1, g2, g3} is a Grobner basis of (g1, g2).
calcBuch=Function[{G,n}, Module[{K,H,g,c,h},
K=G;
g=True;
While[g,
c=Length[K];
H={};
Do[h[i,j]=redmod[polBuch[K[[i]],K[[j]],n],
Union[K,H],n];
If[h[i,j]=!=0,H=Union[Append[H,h[i,j]]]],
{i,1,c-1},{j,i,c}];
K=Union[Join[K,H]];
g=(Length[H]!=0)];
K]];
Figure 3.17: First step of the Buchberger algorithm in Mathematica
At the end of an iteration, if H is empty the loop ends and the function returns K. Otherwise
K is updated with the polynomials in H and H is reset to the empty list.
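For readers more comfortable with a conventional language, the same completion loop can be sketched in Python on top of SymPy. This is our own illustrative sketch over the rationals, not the text's implementation (which works with coefficients modulo n); the names spoly, remainder and complete are ours.

```python
from itertools import combinations

from sympy import LM, LT, expand, lcm, reduced, symbols

x1, x2 = symbols('x1 x2')
GENS = (x1, x2)
ORDER = 'grlex'  # the graded lexicographic order >glx used in the text

def spoly(f, g):
    # Buchberger (S-)polynomial: cancel the leading terms of f and g.
    m = lcm(LM(f, *GENS, order=ORDER), LM(g, *GENS, order=ORDER))
    return expand(m / LT(f, *GENS, order=ORDER) * f
                  - m / LT(g, *GENS, order=ORDER) * g)

def remainder(f, G):
    # Reduce f modulo the list G as much as possible.
    if f == 0:
        return f
    return reduced(f, G, *GENS, order=ORDER)[1]

def complete(G):
    # Step 1 of the algorithm: adjoin irreducible nonzero remainders of
    # Buchberger polynomials until all of them reduce to 0 (cf. calcBuch).
    K = list(G)
    while True:
        H = []
        for f, g in combinations(K, 2):
            r = remainder(spoly(f, g), K + H)
            if r != 0 and r not in H:
                H.append(r)
        if not H:
            return K
        K.extend(H)
```

For instance, complete([x1**2 - x2, x1*x2 - x1]) adjoins one new polynomial and then stops, since every Buchberger polynomial of the enlarged set reduces to 0.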
The next goal of the Buchberger algorithm is to obtain a reduced Grobner
basis of an ideal I from a given Grobner basis G of I. To this end, there are three
more steps.
First, we remove from G polynomials g such that lt(g) is divisible by lt(g′)
for some g′ ≠ g in the basis. Then, we make the remaining polynomials monic
by multiplying the terms of each polynomial by the inverse of its leading term
coefficient. Finally, we replace each monic polynomial h by h′, where h′ is obtained
from h by reducing it as much as possible. The next propositions ensure that we
indeed end up with a reduced Grobner basis after performing these steps.
Proposition 3.3.22 Let G be a Grobner basis of an ideal I over C[x1, . . . , xn]
and let g ∈ G be such that lt(g) is divisible by lt(g′) for some g′ ∈ G\{g}. The
set G\{g} is a Grobner basis of I.
Proof: Since lt(g) is divisible by lt(g′), the polynomial g can be reduced modulo
G to h = g − (lt(g)/lt(g′)) · g′. By Proposition 3.3.9, h ∈ I since g ∈ I and therefore
h reduces to 0 modulo G:
h →_{f0} h1 →_{f1} h2 →_{f2} · · · →_{fr} 0
where f0, f1, . . . , fr ∈ G and hr ≠ 0. By Lemma 3.2.28, we can conclude that
lt(g) >glx lt(h) and, moreover, that lt(hi) >glx lt(hi+1) or lt(hi) = lt(hi+1) for all
Σ_{i=2}^{k} ai gi
g →_{f1} h1 →_{f2} h2 →_{f3} · · · →_{fr} hr
where f1, f2, . . . , fr ∈ G\{g}, r ≥ 1 and hr = h. Since lt(g) is not divisible by
lt(g′) for all g′ ∈ G\{g}, some other term t of g is used to reduce g to h1 modulo
f1. Hence, by Lemma 3.2.28, lt(h1) = lt(g). Similarly, we can conclude that, in
fact, lt(hi) = lt(g) for all 1 ≤ i ≤ r. Since g is monic, so is hr, that is, h is monic.
Finally, consider any nonzero p ∈ I. Since G is a Grobner basis, lt(p) is divisible by lt(g′) for some g′ ∈ G. If lt(p) is divisible by lt(g) then it is also divisible
by lt(h), because lt(h) = lt(g). Therefore, the resulting set is a Grobner basis of I. QED
Note that, given a Grobner basis G of I obtained following the steps described
above, Proposition 3.3.24 ensures that if we replace each polynomial g by g′, where
g′ results from g by successive reductions modulo all the other polynomials until
no more reductions are possible, then we indeed get a reduced Grobner basis of
I.
The Buchberger algorithm can be sketched as in Figure 3.18. The input set
G is any basis of an ideal I ≠ {0} such that 0 ∉ G. The output set G4 obtained
in the last step is a reduced Grobner basis of I. In the sequel, given a finite set G
of nonzero polynomials, we use Buch(G) to denote the output of the algorithm
when it receives G as input. Observe that Buch(G) is well defined since reduced
Grobner bases are unique (see Proposition 3.3.30). The above propositions ensure
the correctness of the Buchberger algorithm.
Note that it is easy to obtain a Grobner basis for the ideal {0}. The set {0}
is, in fact, the only basis of this ideal and it is a Grobner basis. Moreover, it is
also easy to conclude that if G is a basis of an ideal I ≠ {0} and 0 ∈ G then
G\{0} is also a basis of I. Hence, there is no loss of generality in assuming that
the input set of the Buchberger algorithm does not include 0.
Buchberger algorithm
input: Finite nonempty set G of nonzero polynomials
step 1: Compute a sequence H0, . . . , Hm of sets of polynomials such that
H0 = G;
Hi+1 = Hi ∪ {p} where p ≠ 0 is Hi-irreducible and B(g, g′) →_{Hi} p for some distinct g, g′ ∈ Hi, for each 0 ≤ i < m;
and Hm is the first set such that B(g, g′) →_{Hm} 0 for any distinct polynomials
g, g′ ∈ Hm.
Let G1 = Hm.
step 2: Assume that G1 consists of k ∈ N polynomials g1, . . . , gk and compute a sequence H0, . . . , Hk of sets of polynomials such that
H0 = G1;
Hi+1 = Hi\{gi+1} if lt(gi+1) is divisible by lt(g) for some g ∈ Hi\{gi+1}, and Hi+1 = Hi otherwise, for each 0 ≤ i < k.
Let G2 = Hk.
step 3: Let G3 = {cg^(−1) · g : g ∈ G2} where cg is the coefficient of the
leading term of g for each g ∈ G2.
step 4: Assume that G3 consists of k ∈ N polynomials g1, . . . , gk and compute a sequence H0, . . . , Hk of sets of polynomials such that
H0 = G3;
Hi+1 = (Hi\{gi+1}) ∪ {h} where gi+1 →_{Hi\{gi+1}} h and h is (Hi\{gi+1})-irreducible, for each 0 ≤ i < k.
Let G4 = Hk.
output: G4
Figure 3.18: Buchberger algorithm
g2 = x1 x2 − x2
g3 = x2 + 1
step 1:
H0 = {g1, g2, g3}
B(g1, g2) →_{H0} x2 →_{H0} 1 and 1 is irreducible modulo H0, so we let g4 = 1 and
H1 = {g1, g2, g3, g4}.
Moreover B(g, g′) →_{H1} 0 for all polynomials g, g′ ∈ H1 such that g ≠ g′.
Hence, G1 = H1 = {g1, g2, g3, g4}.
step 2:
H0 = {g1, g2, g3, g4}
Since lt(g4) = 1 divides the leading terms of g1, g2 and g3, these three polynomials
are removed. Hence, G2 = {1}.
step 3: G3 = {1} since 1 is monic.
step 4: the polynomial 1 is H0\{1}-irreducible since H0\{1} = ∅.
Hence, G4 = H0 = {1}.
We conclude that Buch({g1, g2, g3}) = {1} and therefore {1} is a reduced Grobner
basis of the ideal (g1, g2, g3).
In Example 3.3.25 we added the polynomial 1 to the basis in the first step of
the algorithm and at the end we got {1} as the reduced Grobner basis. Note that
this is not particular to this example. In general, if we add
the polynomial 1 to the basis in step 1 of the Buchberger algorithm, then we
immediately have a Grobner basis, since lt(1) divides any term. Then, in step 2,
all the elements of the basis are removed except the polynomial 1, given that their
leading terms are divisible by lt(1). Step 3 does not modify anything because 1 is
monic. The same holds for step 4, since the basis now has only one element.
Note that, in general, whenever the input set G3 for step 4 is a singleton {g},
then g can never be reduced to a distinct polynomial modulo G3\{g} since this
set is empty, and therefore no changes occur in step 4.
Example 3.3.26 Consider the polynomials
g1 = 2x1^2 x2 + x2
and
g2 = x2^3 + x1
in Z5[x1, x2], and let us compute Buch({g1, g2}).
step 1:
H0 = {g1, g2}
B(g1, g2) →_{H0} 4x1^3 + 2x1 and 4x1^3 + 2x1 is H0-irreducible, so
H1 = {g1, g2, g3} where g3 = 4x1^3 + 2x1.
Moreover, B(g, g′) →_{H1} 0 for all polynomials g, g′ ∈ H1 such that g ≠ g′.
Hence, G1 = H1 = {g1, g2, g3}.
step 2:
H0 = {g1, g2, g3}
Since no leading term is divisible by the leading term of another polynomial in the set,
nothing is removed. Hence, G2 = H0 = {g1, g2, g3}.
step 3: G3 = {3·g1, 1·g2, 4·g3} = {x1^2 x2 + 3x2, x2^3 + x1, x1^3 + 3x1}.
step 4: Let g1′ = x1^2 x2 + 3x2, g2′ = x2^3 + x1 and g3′ = x1^3 + 3x1. Then,
H0 = {g1′, g2′, g3′}
rempol=Function[{G1,n},Module[{ntddivQ,H,j,G2},
ntddivQ=Function[{K,p},
Apply[And,
Map[Function[h,
Not[divisibleQ[lt[p,n],lt[h,n],n]]],K]]];
G2=Sort[G1,Function[{x,y},
degmon[lt[x,n],n]<degmon[lt[y,n],n]]];
H={}; j=1;
While[j<=Length[G2],
If[ntddivQ[H,G2[[j]]],
H=Append[H,G2[[j]]]];
j=j+1];
H]];
Figure 3.19: Second step of the Buchberger algorithm in Mathematica
The second step of the Buchberger algorithm can be implemented using the
function rempol in Figure 3.19. The function receives as input a list G1 of
polynomials and a positive integer n. It returns a list that results from G1 by
removing all the polynomials whose leading term is divisible by the leading term
of another polynomial in the list. The function rempol uses the auxiliary function
ntddivQ that, given a list K of polynomials and a polynomial p, returns True if
the leading term of no polynomial in K divides the leading term of
p. The function rempol first creates the list G2 that results from ordering the
polynomials in G1 according to the degree of their leading terms, and sets H to
the empty list. After that the function passes through G2 appending to H each
element of G2 whose leading term is not divisible by the leading terms of the
polynomials already in H.
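The pruning idea can be illustrated in Python on leading monomials alone, represented as exponent tuples (an encoding of our own; the real rempol keeps whole polynomials and uses lt and divisibleQ):

```python
# Sketch of rempol's filter: drop every leading monomial divisible by the
# leading monomial of another kept polynomial. Monomials are exponent tuples,
# e.g. (2, 1) stands for x1^2 x2.
def divides(m, m2):
    """Monomial m divides monomial m2 (componentwise exponents)."""
    return all(a <= b for a, b in zip(m, m2))

def rempol(lead_monomials):
    # Sort by total degree, then greedily keep monomials not divisible by an
    # already-kept one, mirroring the Mathematica version's degmon sort.
    kept = []
    for m in sorted(lead_monomials, key=sum):
        if not any(divides(h, m) for h in kept):
            kept.append(m)
    return kept
```

For example, among x1^3, x1^2 x2 and x1 only x1 survives, since x1 divides both of the others.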
The function makemonic in Figure 3.20 implements the third step of the Buchberger algorithm. It receives as input a list G of polynomials and a positive integer
n. The function returns a list of monic polynomials that results from G by multiplying each polynomial in the list by the inverse of its leading term coefficient.
It uses the function coeflt presented in Figure 3.14.
makemonic=Function[{G,n},Map[Function[p,
Expand[PolynomialMod[p*PowerMod[coeflt[p,n],-1,n],n]]],G]];
Figure 3.20: Third step of the Buchberger algorithm in Mathematica
The function reduce in Figure 3.21 implements the fourth step of the Buchberger algorithm. It receives as input a list G of polynomials and a positive integer
n. A copy K of the input list G is first created, and the output list U is set to the
empty list. Then there is a loop that, at each step, removes the first polynomial
in K and reduces it modulo the remaining polynomials in the list and the polynomials already in U. The resulting polynomial, if nonzero, is inserted in the output list
U. The loop stops when K becomes the empty list. The function reduce uses the
function redmod presented in Figure 3.13.
reduce=Function[{G,n},Module[{K,U,h},
K=G;
U={};
While[Length[K]!=0,
h=redmod[First[K],Join[Rest[K],U],n];
K=Rest[K];
If[h=!=0,U=Append[U,h]]];
U]];
Figure 3.21: Fourth step of the Buchberger algorithm in Mathematica
Finally, the function reducedGrobnerbase in Figure 3.22 is a possible implementation of the Buchberger algorithm. It receives as input a list G of polynomials
and a positive integer n, and returns the reduced Grobner basis of the ideal generated by the polynomials in G. The functions calcBuch, rempol, makemonic
and reduce that implement the four steps of the algorithm are used as expected.
reducedGrobnerbase=Function[{G,n},Module[{K},
K=calcBuch[G,n];
K=rempol[K,n];
K=makemonic[K,n];
K=reduce[K,n];
K]];
Figure 3.22: Buchberger algorithm in Mathematica
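As a cross-check in another system: SymPy's groebner function returns a reduced Grobner basis directly, so it plays the role of reducedGrobnerbase (over the rationals here, whereas the Mathematica code above works with coefficients modulo n). This is an illustrative sketch, not the text's implementation.

```python
from sympy import groebner, reduced, symbols

x1, x2 = symbols('x1 x2')

# The generators of Example 3.3.26; groebner() computes the reduced
# Groebner basis of the ideal they generate, here with the >glx order.
gb = groebner([2*x1**2*x2 + x2, x2**3 + x1], x1, x2, order='grlex')
basis = list(gb.exprs)
print(basis)
```

Both generators reduce to 0 modulo the returned basis, as they must, since they belong to the ideal.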
3.3.4 Properties of Grobner bases

g2 = x2^3 + x1
g3 = 4x1^3 + 2x1.
A consequence of the above result is that we can use ideals and ideal bases to
check whether two systems of polynomial equations have the same solutions.
p1 = 0
. . .
pr = 0
and
q1 = 0
. . .
qs = 0
QED
x1 x2 − x1 − x2 + 1 = 0
x1^2 x2 − 2x1^2 + 2x1 x2 − 4x1 + x2 − 2 = 0
(3.1)
and
x1 x2 + x2^2 − x1 − 4x2 + 3 = 0
x1^2 + x1 x2 + x1 − 5x2 + 6 = 0
x1^2 + x2^2 + 2x1 − 7x2 + 7 = 0
(3.2)
in R[x1, x2]. We first compute the reduced Grobner basis of the ideal generated by the polynomials involved in (3.1), that is, the ideal (g1, g2) where
g1 = x1 x2 − x1 − x2 + 1 and g2 = x1^2 x2 − 2x1^2 + 2x1 x2 − 4x1 + x2 − 2. We briefly
sketch the computation of Buch({g1, g2}):
step 1:
H0 = {g1, g2}
B(g1, g2) = x1^2 − 3x1 x2 + 5x1 − x2 + 2
B(g1, g2) →_{H0} x1^2 + 2x1 − 4x2 + 5 =: g3 and H1 = {g1, g2, g3}
B(g1, g3) →_{H1} 4x2^2 − 12x2 + 8 =: g4 and H2 = {g1, g2, g3, g4}
Moreover, B(g, g′) →_{H2} 0 for all polynomials g, g′ ∈ H2 such that g ≠ g′.
Hence, G1 = H2 = {g1, g2, g3, g4}.
step 2: G2 = {g1, g3, g4}
step 3: G3 = {g1, g3, g4′} where g4′ = x2^2 − 3x2 + 2
step 4: G4 = G3.
We conclude that
Buch({g1, g2}) = {x1 x2 − x1 − x2 + 1, x1^2 + 2x1 − 4x2 + 5, x2^2 − 3x2 + 2}
and therefore this set of polynomials is the reduced Grobner basis of (g1, g2).
We now consider the system (3.2). In this case we have to compute the
reduced Grobner basis of the ideal (r1, r2, r3), where r1 = x1 x2 + x2^2 − x1 − 4x2 + 3,
r2 = x1^2 + x1 x2 + x1 − 5x2 + 6 and r3 = x1^2 + x2^2 + 2x1 − 7x2 + 7. We briefly sketch
the computation of Buch({r1, r2, r3}):
step 1:
H0 = {r1, r2, r3}
B(r1, r2) = −x1^2 − 5x1 x2 + 5x2^2 + 3x1 − 6x2
B(r1, r2) →_{H0} 9x2^2 − 27x2 + 18 =: r4 and H1 = {r1, r2, r3, r4}
Moreover, B(g, g′) →_{H1} 0 for all polynomials g, g′ ∈ H1 such that g ≠ g′.
Hence, G1 = H1 = {r1, r2, r3, r4}.
step 2: G2 = {r1, r3, r4}
step 3: G3 = {r1, r3, r4′} where r4′ = x2^2 − 3x2 + 2
step 4: since r1 →_{r4′} x1 x2 − x1 − x2 + 1 and r3 →_{r4′} x1^2 + 2x1 − 4x2 + 5,
then G4 = {x1 x2 − x1 − x2 + 1, x1^2 + 2x1 − 4x2 + 5, x2^2 − 3x2 + 2}.
We conclude that
Buch({r1, r2, r3}) = {x1 x2 − x1 − x2 + 1, x1^2 + 2x1 − 4x2 + 5, x2^2 − 3x2 + 2}
and therefore this set is the reduced Grobner basis of (r1, r2, r3).
Since the reduced Grobner basis of the ideals (g1 , g2 ) and (r1 , r2 , r3 ) is the
same, the ideals are equal and the systems have the same solutions.
Another consequence of Proposition 3.3.32 is the following. Given polynomials
p1, . . . , pm in C[x1, . . . , xn], we can conclude that the system p1 = 0, . . . , pm = 0
has no solutions in C if 1 ∈ (p1, . . . , pm) (Exercise 31 in Section 3.5). We can also
conclude that the system has no solution whenever {1} is the reduced Grobner
basis of (p1, . . . , pm).
Proposition 3.3.35 Let p1, . . . , pm be polynomials in C[x1, . . . , xn], for m ∈ N.
If {1} is the reduced Grobner basis of the ideal (p1, . . . , pm) then the system
p1 = 0
. . .
pm = 0
has no solution in C.
Proof: Since {1} is the reduced Grobner basis of the ideal (p1, . . . , pm), then
(p1, . . . , pm) = C[x1, . . . , xn]. So, Z({p1, . . . , pm}) = Z(C[x1, . . . , xn]) by Proposition 3.3.32. But Z(C[x1, . . . , xn]) = ∅ since, for instance, p1 = x1 − 1 and p2 = x1
are both polynomials in C[x1, . . . , xn] and there is no c ∈ C such that p1(c) = 0
and p2(c) = 0, given that 1 ≠ 0 in any field.
QED
Let us present an illustrative example.
Example 3.3.36 Consider the system of polynomial equations
x1 x2^2 + x1 x2 + 2 = 0
x2^2 − x1 = 0
x1^2 + x1 x2 + 1 = 0
in R[x1, x2]. We have to compute the reduced Grobner basis of the ideal (p1, p2, p3)
where p1 = x1 x2^2 + x1 x2 + 2, p2 = x2^2 − x1 and p3 = x1^2 + x1 x2 + 1. We have that
B(p1, p2) = x1^2 + x1 x2 + 2
and
B(p1, p2) = x1^2 + x1 x2 + 2 →_{p3} 1.
Hence 1 belongs to the ideal, {1} is its reduced Grobner basis and, by Proposition 3.3.35, the system has no solution.
The system
x1 x2 − x1 − x2 + 1 = 0
x1^2 + 2x1 − 4x2 + 5 = 0
x2^2 − 3x2 + 2 = 0
(3.3)
and the system (3.1) have the same solutions. But note that the last equation of
the system (3.3) involves only one variable, x2, and thus it is easier to solve. With the
solutions for x2 we can transform the other two equations into equations involving
only the variable x1 that can also be easily solved.
For x2 = 1, the first two equations become
0·x1 = 0
x1^2 + 2x1 + 1 = 0
and therefore x1 = −1. For x2 = 2, they become
x1 − 1 = 0
x1^2 + 2x1 − 3 = 0
and therefore x1 = 1. Thus, the system (3.3), and therefore the system (3.1), has
two solutions:
x1 = −1, x2 = 1
and
x1 = 1, x2 = 2.
If a system has three or more variables, once we have eliminated one of the
variables as above, we can repeat the process with the remaining equations and
try to eliminate a new variable. Observe that this process of eliminating variables
is similar to Gauss elimination algorithm for systems of linear equations.
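The elimination step above can be reproduced with SymPy, solving the triangular system given by the reduced Grobner basis of (3.1) computed in the text (solve is SymPy's generic solver, used here only to confirm the two solutions):

```python
from sympy import solve, symbols

x1, x2 = symbols('x1 x2')

# The reduced Groebner basis of (3.1); the last polynomial involves only
# x2, so the system is triangular and easy to solve.
G = [x1*x2 - x1 - x2 + 1,
     x1**2 + 2*x1 - 4*x2 + 5,
     x2**2 - 3*x2 + 2]
solutions = solve(G, [x1, x2], dict=True)
print(solutions)
```

The solver returns exactly the two solutions derived above: (x1, x2) = (−1, 1) and (x1, x2) = (1, 2).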
Note that it may not be possible to obtain Grobner bases where all the variables but one have been eliminated from some equation. However, recall that the
uniqueness of reduced Grobner bases depends on the monomial order considered.
Hence, if in the reduced Grobner basis with respect to the order >glx, for instance,
no equation with only one variable exists, it may be the case that, considering
another order such as >lx, such an equation occurs. In fact,
whenever the system satisfies some suitable conditions, using the order >lx ensures that the corresponding reduced Grobner basis always contains an equation with only one
variable. We do not develop this subject further herein and refer
the reader to [10], for instance.
3.4

3.4.1 Digital circuit equivalence
We address the problem of how Grobner bases can be used for checking the
equivalence of two combinational circuits.
As we have discussed in Section 3.1.1, we can check the equivalence of two
combinational circuits by checking whether two suitable propositional formulas are
equivalent. Consider, for instance, the formula φ1 = x1 ∨ (¬x1). Then
conv(φ1) = conv(x1) · conv(¬x1)
= conv(x1) · (1 − conv(x1))
= x1 · (1 − x1)
= −x1^2 + x1.
Similarly, for φ2 = (¬x1) ∨ (x1 ∧ x2),
conv(φ2) = conv(¬x1) · conv(x1 ∧ x2)
= (1 − conv(x1)) · (conv(x1) + conv(x2) − conv(x1) · conv(x2))
= (1 − x1) · (x1 + x2 − x1 x2)
= x1^2 x2 − x1^2 + x1 + x2
(the term −2x1 x2 vanishes since the coefficients are in Z2). Given the properties of the field Z2, conv(φ1) also corresponds to x1^2 + x1 and conv(φ2)
to x1^2 x2 + x1^2 + x1 + x2.
The Mathematica rewriting rules in Figure 3.23 convert propositional formulas into polynomials.
conv[neg[x ]]:=1-conv[x];
conv[imp[x1 ,x2 ]]:=conv[x2]conv[x1]-conv[x2];
conv[or[x1 ,x2 ]]:=conv[x1]conv[x2];
conv[and[x1 ,x2 ]]:=conv[neg[or[neg[x1],neg[x2]]]];
conv[eqv[x1 ,x2 ]]:=conv[and[imp[x1, x2],imp[x2, x1]]];
Figure 3.23: Converting propositional formulas into polynomials
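The same rewriting rules can be transcribed in Python with SymPy; the tuple encoding of formulas below is our own choice (variables are SymPy symbols, connectives are tagged tuples):

```python
from sympy import expand, symbols

x1, x2 = symbols('x1 x2')

# Python transcription of the rules of Figure 3.23. A formula is either a
# SymPy symbol (a propositional variable) or a tuple such as ('imp', a, b).
def conv(phi):
    if not isinstance(phi, tuple):
        return phi                       # propositional variable
    op, *args = phi
    if op == 'neg':
        return 1 - conv(args[0])
    if op == 'imp':
        return conv(args[1]) * conv(args[0]) - conv(args[1])
    if op == 'or':
        return conv(args[0]) * conv(args[1])
    if op == 'and':
        return conv(('neg', ('or', ('neg', args[0]), ('neg', args[1]))))
    if op == 'eqv':
        return conv(('and', ('imp', args[0], args[1]),
                            ('imp', args[1], args[0])))
    raise ValueError(f'unknown connective {op}')
```

For instance, expand(conv(('or', x1, ('neg', x1)))) yields x1 - x1**2, which over Z2 is the polynomial x1^2 + x1 mentioned above.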
Let be a propositional formula and let p Z2 [x1 , . . . , xn ] be the polynomial
conv(). It is straightforward to conclude that is satisfiable if and only if there
are u1 , . . . , un Z2 such that p(u1, . . . , un ) = 0 (see Exercise 37 in Section 3.5).
The following proposition establishes how Grobner bases can be used to check
the validity of a propositional formula.
Proposition 3.4.3 Consider P = {x1, . . . , xn} and φ ∈ FP. Let p = conv(¬φ)
and pi = conv(xi ∨ (¬xi)) for each 1 ≤ i ≤ n. If 1 is in the reduced Grobner
basis of the ideal (p, p1, . . . , pn) over Z2[x1, . . . , xn] then φ is valid.
Proof: If 1 is in the reduced Grobner basis of (p, p1, . . . , pn), then this reduced
Grobner basis is {1} and therefore, by Proposition 3.3.35, the system
p = 0
p1 = 0
. . .
pn = 0
has no solution. Hence ¬φ is not satisfiable, that is, φ is valid.
QED
and
φ2 = (¬x1) ∨ x2.
Then
conv(¬(φ1 ↔ φ2)) = x1^6 x2^4 + x1^6 x2^2 + x1^5 x2^3 + x1^4 x2^4 + x1^5 x2 + x1^4 x2^2 + x1^2 x2^4 + x1^4 x2
+ x1^2 x2^2 + x1 x2^3 + x2^4 + x1^2 + x2^2 + x1 + 1.
Observe that
g2 = conv(x1 ∨ (¬x1))
g3 = conv(x2 ∨ (¬x2))
and
B(g1, g2) = x1^5 x2^4 + x1^6 x2^2 + x1^5 x2^3 + x1^4 x2^4 + x1^5 x2 + x1^4 x2^2 + x1^2 x2^4 + x1^4 x2
+ x1^2 x2^2 + x1 x2^3 + x2^4 + x1^2 + x2^2 + x1 + 1
and
B(g1, g2) →_{H0} 1.
3.4.2 Inverse kinematics
In Section 3.1.2 we have described the inverse kinematics task: given the intended
coordinates (a, b, c) of the end effector of a particular robot, we want to determine
the suitable angles θ, θ1 and θ2 at the base and joints for reaching that
position. To this end we have to find the solutions of the system of polynomial
equations
l2 v2 v1 + l3 v3 v1 − a = 0
l2 v2 u1 + l3 v3 u1 − b = 0
l2 u2 + l3 u3 − c = 0
u1^2 + v1^2 − 1 = 0
u2^2 + v2^2 − 1 = 0
u3^2 + v3^2 − 1 = 0
where l2 and l3 are the lengths of the arm links. The system has 6 variables, ui
and vi for i = 1, 2, 3, where u1 = sin θ, v1 = cos θ, u2 = sin θ1, v2 = cos θ1,
u3 = sin θ2 and v3 = cos θ2.
For simplicity, we are going to solve this system for particular values of a, b, c,
l2 and l3. We use Grobner bases as described in Section 3.3.4. Let us assume that
a = b = c = l2 = l3 = 1. We also make a change of variables: x1 = u1, x2 = v1,
x3 = u2, x4 = v2, x5 = u3 and x6 = v3. Thus, the system becomes
x2 x4 + x2 x6 − 1 = 0
x1 x4 + x1 x6 − 1 = 0
x3 + x5 − 1 = 0
x1^2 + x2^2 − 1 = 0
x3^2 + x4^2 − 1 = 0
x5^2 + x6^2 − 1 = 0
that is, we have to find the common zeros of the polynomials
p1 = x2 x4 + x2 x6 − 1    p4 = x1^2 + x2^2 − 1
p2 = x1 x4 + x1 x6 − 1    p5 = x3^2 + x4^2 − 1
p3 = x3 + x5 − 1          p6 = x5^2 + x6^2 − 1
considering the order >lx. Note that the polynomials are already ordered according to this order. Using the Buchberger algorithm we get
{x3 + x5, x1 − 2x6, x2 − 2x6, x4 − x6, x5^2 − 1/2, x6^2 − 1/2}.
x3 + x5 = 0
x1 − 2x6 = 0
x2 − 2x6 = 0
x4 − x6 = 0
x5^2 − 1/2 = 0
x6^2 − 1/2 = 0.
3.5 Exercises
(d) p = x1 x2 x3^3 + 4x2^2 x3^3 + 2x1^2 x3^3 + x1 x2 x3^2 and d = 2x1 x3 + x2 x3 are polynomials in Z7[x1, x2, x3].
3. Consider the polynomials p = 3x31 x32 + 5x21 x32 6x31 x2 + 2x21 x22 10x21 x2 4x21
{d}
9. Prove that the set Zeven of the integer numbers that are even is an ideal
over Z.
10. Prove that the ideal (6, 9) over Z is the set of multiples of 3.
11. Prove that the ideal (g1, g2) over Z is the set {gcd(g1, g2) · n : n ∈ Z}.
12. Let I and J be ideals over the same ring A = (A, +, 0, , ). Prove that
(a) I J
(b) I + J = {a + b : a I, b J}
(c) I J = {i j : i I, j J}
16. Consider the polynomials g1 = x1^3 + x3 and g2 = x2^3 + x3 + 2 in Z5[{x1, x2, x3}].
Prove that {g1, g2} is a Grobner basis of the ideal (g1, g2) and check whether
2x1^3 x2^3 + x1^3 x3 + x2^3 x3 ∈ (g1, g2).
17. Consider the polynomials g1 = x1 x3 + x2^2 + 2, g2 = 4x2^2 x3 + x3 and g3 = 2x2^2 + 3
in Z5[{x1, x2, x3}]. Prove that {g1, g2, g3} is a Grobner basis of the ideal
(g1, g2, g3) and check whether x1 x2^2 x3 + x1 x3 + 1 ∈ (g1, g2, g3).
18. Consider the polynomials g1 = 2x2^2 + 2x1, g2 = x1 x3 + x2^2 + x1 + 6x2 and
g3 = 4x1 x2^2 + 4x1^2 in Z7[{x1, x2, x3}].
(a) Prove that {g1, g2, g3} is a Grobner basis of the ideal (g1, g2, g3).
(b) Check whether 2x1^2 x3 + 6x1 x2^2 + 6x1^2 + 5x1 x2 ∈ (g1, g2, g3).
(c) Compute the reduced Grobner basis of (g1, g2, g3).
19. Let G be a finite set of polynomials in Z2[{x1, x2, x3, x4}] and let the set
G′ = {x1 x2, x2^2 + x3 x4, x3 + x1, x1 x4, x2 + x3, x2 + x4, x2 x3 x4 + x3 x4^2, x3 + x4}
be a Grobner basis of the ideal generated by G. Compute
the reduced Grobner basis of the ideal generated by G. Check whether
x1^2 + x2 x4 is in the ideal generated by G.
20. Consider the polynomials g1 = 3x1 x2 + 2 and g2 = 3x1 − 1 in Z5[x1, x2].
(a) Check whether {g1, g2} is a Grobner basis of the ideal (g1, g2).
21. Consider the polynomials g1 = 3x1 x2 − 2x1 and g2 = 2x2^2 + 3x1 in Z5[x1, x2].
(a) Check whether {g1, g2} is a Grobner basis of the ideal (g1, g2).
22. Consider the polynomials g1 = x1^3 and g2 = x1^4 + 4x2 in Z5[x1, x2]. Compute
the reduced Grobner basis of the ideal (g1, g2).
23. Consider the polynomials g1 = x1^2 + 2 and g2 = x1^2 + 4x2 in Z5[x1, x2].
Compute the reduced Grobner basis of the ideal (g1, g2).
24. Consider the polynomials g1 = x1, g2 = x1 x2 − x2 and g3 = x2 + 1 in
Z2[x1, x2]. Compute the reduced Grobner basis of the ideal (g1, g2, g3). Check
whether x1^2 + x2^3 ∈ (g1, g2, g3).
25. Consider the polynomials g1 = x2^2 + x3^2, g2 = x1^2 x2 + x2 x3 and g3 = x3^3 + x1 x2
in Z2[x1, x2, x3]. Compute the reduced Grobner basis of the ideal (g1, g2, g3).
Check whether x1^2 x3^2 + x2^2 x3 ∈ (g1, g2, g3).
26. Let G be a set of polynomials in C[x1, . . . , xn] and assume that for all
p ∈ C[x1, . . . , xn] we have that p′ = p″ whenever p →_{G} p′, p →_{G} p″ and p′
and p″ are G-irreducible. Prove that G is a Grobner basis.
27. Let G1 and G2 be Grobner bases of the ideals I1 and I2 over C[x1, . . . , xn],
respectively. Use the properties of Grobner bases to determine whether
I1 ⊆ I2.
28. Let I1 = (4x1 + 2x2, 3x2) and I2 = (x1^2, 3x1^2 x2, x1^2 + 2x2) be ideals over
Z7[x1, x2]. Prove that I1 ∩ I2 = I2.
29. Let I1 = (x1, 3x2) and I2 = (4x1, 3x1^2 − x1, x1^2 + 2x2) be ideals over
Z7[x1, x2]. Prove that I1 ∩ I2 = I1 · I2.
x1 + x2 + x3^2 − 1 = 0
x1 + 2x2 + x3^2 − 1 = 0
x1 + x2 + x3^2 − 3 = 0
35. Solve the following system of polynomial equations using Grobner bases:
x1 x2 x3 + x2 x3 + x2 = 0
x1 x3 + x2 x3 − x3 = 0
x2^2 + x2 x3 = 0
37. Let φ be a propositional formula and let p ∈ Z2[x1, . . . , xn] be the polynomial conv(φ). Prove that φ is satisfiable if and only if there are u1, . . . , un ∈ Z2
such that p(u1, . . . , un) = 0.
38. Consider the digital circuits corresponding to the formulas
(x1 (x2 )) (x1 (x1 ))
and
(x1 ) (x2 ).
(b) Check whether they are equivalent using Grobner bases, assuming that
the polynomial
1 + x1 + x1^3 + x1^4 + x1^5 + x1^7 + x1^8 + x1^3 x2 + x1^4 x2 + x1^5 x2 + x1^7 x2 + x1^2 x2^2 +
x1^6 x2^2 + x1^7 x2^2 + x1^4 x2^3 + x1^7 x2^3 + x1^4 x2^4 + x1^6 x2^4 + x1^8 x2^4
in Z2[{x1, x2}] corresponds to the formula
(((x1 (x2 )) (x1 (x1 ))) ((x1 ) (x2 ))).
39. Consider the digital circuits corresponding to the formulas (x1 ) x2 and
((x2 ) x1 ). Check whether they are equivalent using Grobner bases.
Chapter 4
Euler-Maclaurin formula
In this chapter, we present several techniques to compute summations. Given
a set of integers A, a set of numbers B, and a map f : A → B, the sum, or
summation, of f on A, is denoted by
Σ_{k∈A} f(k)    (4.1)
and is the sum of all images f(k) for every k ∈ A. When A is a finite set, say,
A = {a1, . . . , an}, then
Σ_{k∈A} f(k) = f(a1) + · · · + f(an).
Moreover, when A is an integer interval [a..b], that is, the set of all integers
between a and b, then the summation (4.1) can be written as
Σ_{k=a}^{b} f(k).
4.1 Motivation
SPatMatch=Function[{w,p},Module[{i,j,r,s},
r=False;
i=0;
While[i<=Length[w]-Length[p]&&!r,
j=1;
s=True;
While[j<=Length[p]&&s,
s=s&&w[[i+j]]===p[[j]];
j=j+1];
r=r||s;
i=i+1];
r]];
Figure 4.1: Naive pattern matching algorithm
The function SPatMatch in Figure 4.1 receives two lists, stored in w and p,
corresponding to the word and the pattern, respectively. It returns a Boolean
value stored in variable r. If the value of r is True then the pattern p is a
subsequence of the word w.
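A direct Python transcription of the algorithm (0-based indexing instead of Mathematica's 1-based lists; the function name is ours):

```python
# Naive pattern matching: slide the pattern p over the word w and compare
# element by element, exactly as SPatMatch does.
def s_pat_match(w, p):
    for i in range(len(w) - len(p) + 1):
        if all(w[i + j] == p[j] for j in range(len(p))):
            return True
    return False
```

For instance, s_pat_match("abcabd", "cab") returns True, while the worst-case inputs discussed below ("CCCCCCCCC" and "CCA") yield False.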
Let n be the length of the word w and let m be the length of the pattern
p. The first iteration of the outer loop consists in comparing the pattern p with
the first m elements of w. If they all match we are done, that is, the pattern p
is a subsequence of w (in fact, a prefix of w). Otherwise, we necessarily reach a
position 1 ≤ k ≤ m such that
w[[k]] ≠ p[[k]].
This means that the element in the k-th position of the list w differs from the
one in the k-th position of the list p. At this point the first iteration of the outer
loop ends and we start the next iteration. This step consists in comparing the
pattern p with
w[[2]], w[[3]], . . . , w[[m + 1]].
If no pattern is found, at the k-th step the algorithm compares the pattern p with
w[[k]], w[[k + 1]], . . . , w[[k + (m − 1)]].
It finishes when either the pattern is found or k + (m − 1) > n.
We now analyze the complexity of the algorithm in Figure 4.1 in terms of the
number of comparisons performed between the input word w and pattern p, that
is, the comparisons
w[[i+j]] === p[[j]].
This number should be expressed as a function of the input sizes of both w and
p.
We can consider, in particular, worst-case analysis. In worst-case analysis, we
concentrate on characterizing the worst possible situation, that is, on evaluating
the maximum number of comparisons needed to determine whether or not the
pattern p occurs in the word w. Observe that the maximum number of such
comparisons occurs when the algorithm always executes m comparisons at each
iteration of the outer loop: for instance, when the mismatch always occurs
in the last iteration of the inner loop (e.g. w = {C, C, C, C, C, C, C, C, C}
and p = {C, C, A}). In this case the number of comparisons executed in the inner
loop is
Σ_{j=1}^{m} 1.
Moreover, since the inner loop is executed when i ranges from 0 to n − m, the
total number of comparisons is given by the following summation:
Σ_{i=0}^{n−m} ( Σ_{j=1}^{m} 1 ).
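The count can also be checked empirically. The following Python sketch (our own) counts the comparisons of the naive matcher; on the worst-case inputs above it performs exactly m(n − m + 1) comparisons, the value of the summation:

```python
# Count the element comparisons performed by the naive matcher.
def count_comparisons(w, p):
    count = 0
    for i in range(len(w) - len(p) + 1):
        for j in range(len(p)):
            count += 1
            if w[i + j] != p[j]:
                break
    return count

n, m = 9, 3
w, p = "C" * n, "CCA"   # every attempt fails at the last position
print(count_comparisons(w, p), m * (n - m + 1))
```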
4.2 Expressions

Syntax
We introduce integer and real expressions. Let I be a nonempty set of integer
variables. The set of integer expressions EI is the smallest set containing all
integer numbers and variables that is closed under addition and multiplication.
That is, EI is inductively defined as follows:
Z ∪ I ⊆ EI;
(d1 + d2), (d1 · d2) ∈ EI for every d1, d2 ∈ EI.
In the sequel we may omit parentheses when no ambiguity occurs. The following
usual simplifications will be used:
d1 d2 for d1 · d2;
−d for (−1) · d.
Let X be a set of real variables such that I ⊆ X. The set of real expressions
EX is the smallest set containing all real numbers and variables that is closed
under addition, multiplication, exponentiation, logarithm and summation. That
is, EX is inductively defined as follows:
R ∪ X ⊆ EX;
(e1 + e2), (e1 · e2), (e1^e2), log_{e1}(e2) ∈ EX for every e1, e2 ∈ EX;
(Σ_{k=d1}^{d2} e) ∈ EX for every e ∈ EX, d1, d2 ∈ EI and k ∈ I.
The following usual simplifications will also be used:
−e for (−1) · e;
e1 − e2 for e1 + (−e2).
We will need to compare two expressions e1 and e2, namely, to check whether
e1 = e2 or e1 ≤ e2.
Unless otherwise stated, we use the letters i, j, k, m, n, i1, i2, j1, . . . for integer variables and the letters x, y, z, x1, x2, y1, . . . for real variables.
Semantics
We will interpret integer expressions as integers and real expressions as real numbers. For this purpose we need the notion of assignment.
An assignment is a map ρ that assigns a real number to each variable, that is,
ρ : X → R such that ρ(i) ∈ Z for all i ∈ I. Given an assignment, the denotation
is a function that associates a real number with an expression. However, some
expressions cannot be interpreted, e.g. 1/0. Hence, the denotation is a partial
function [[·]]ρ : EX ⇀ R defined as follows:
[[a]]ρ = a for each a ∈ R;
[[x]]ρ = ρ(x) for each x ∈ X;
[[e1 + e2]]ρ = [[e1]]ρ + [[e2]]ρ if [[e1]]ρ and [[e2]]ρ are defined, and [[e1 + e2]]ρ is undefined otherwise;
[[e1 · e2]]ρ = [[e1]]ρ · [[e2]]ρ if [[e1]]ρ and [[e2]]ρ are defined, and [[e1 · e2]]ρ is undefined otherwise;
[[e1^e2]]ρ = ([[e1]]ρ)^([[e2]]ρ) if [[e1]]ρ and [[e2]]ρ are defined and the power is a real number, and [[e1^e2]]ρ is undefined otherwise;
[[log_{e1}(e2)]]ρ = log_{[[e1]]ρ}([[e2]]ρ) if [[e1]]ρ and [[e2]]ρ are defined and the logarithm is a real number, and [[log_{e1}(e2)]]ρ is undefined otherwise;
[[Σ_{k=d1}^{d2} e]]ρ = 0 if [[d1]]ρ and [[d2]]ρ are defined and [[d1]]ρ > [[d2]]ρ;
[[Σ_{k=d1}^{d2} e]]ρ = [[e]]ρ̄ + [[Σ_{k=d1+1}^{d2} e]]ρ if [[d1]]ρ and [[d2]]ρ are defined and [[d1]]ρ ≤ [[d2]]ρ, and [[Σ_{k=d1}^{d2} e]]ρ is undefined otherwise,
where
ρ̄(x) = ρ(x) if x ≠ k and ρ̄(k) = [[d1]]ρ.
To check whether e1 = e2 amounts to verifying whether [[e1]]ρ = [[e2]]ρ for every assignment ρ such that both
denotations are defined. Similarly for e1 ≤ e2.
Example 4.2.1 In order to check that
Σ_{k=1}^{2} k^2 = 5
we have to verify that
[[Σ_{k=1}^{2} k^2]]ρ = [[5]]ρ
for every assignment ρ. Indeed,
[[Σ_{k=1}^{2} k^2]]ρ = 1 + [[Σ_{k=1+1}^{2} k^2]]ρ
= 1 + 2^2 + [[Σ_{k=1+1+1}^{2} k^2]]ρ
= 1 + 4 + 0 = 5 = [[5]]ρ.
[[Σ_{k=d1}^{d2} e]]ρ = 0 = [[Σ_{k=d1}^{d2} e]]ρ′
(2.2) [[d1]]ρ ≤ [[d2]]ρ. Then also [[d1]]ρ′ ≤ [[d2]]ρ′. We have to prove that
[[Σ_{k=d1}^{d2} e]]ρ = [[Σ_{k=d1}^{d2} e]]ρ′.    (4.2)
Let n = [[d2]]ρ − [[d1]]ρ (= [[d2]]ρ′ − [[d1]]ρ′). Note that proving (4.2) is equivalent to
proving
[[Σ_{k=d2−n}^{d2} e]]ρ = [[Σ_{k=d2−n}^{d2} e]]ρ′
and we proceed by induction on n.
Induction hypothesis:
[[Σ_{k=d2−n+1}^{d2} e]]ρ = [[Σ_{k=d2−n+1}^{d2} e]]ρ′.    (4.3)
Step: Let ρ̄ be the assignment such that ρ̄(x) = ρ(x) for x ≠ k and ρ̄(k) =
[[d2 − n]]ρ, and let ρ̄′ be the assignment such that ρ̄′(x) = ρ′(x) for x ≠ k and
ρ̄′(k) = [[d2 − n]]ρ′. Again ρ̄(x) = ρ̄′(x) for every x ∈ X occurring in e, and
therefore, by the induction hypothesis of the structural induction, [[e]]ρ̄ = [[e]]ρ̄′.
Hence, using also (4.3),
[[Σ_{k=d2−n}^{d2} e]]ρ = [[e]]ρ̄ + [[Σ_{k=d2−n+1}^{d2} e]]ρ = [[e]]ρ̄′ + [[Σ_{k=d2−n+1}^{d2} e]]ρ′ = [[Σ_{k=d2−n}^{d2} e]]ρ′.
QED
We now establish some properties of summation that we will use later on for
reasoning symbolically with summations.
Proposition 4.2.3 Let e, e′, c ∈ EX, d, d1, d2, d3 ∈ EI and k ∈ I such that k
does not occur in c. Then the following properties hold:
136
1. Distributivity

Σ_{k=d1}^{d2} c·e = c · Σ_{k=d1}^{d2} e.

2. Associativity

Σ_{k=d1}^{d2} (e + e′) = Σ_{k=d1}^{d2} e + Σ_{k=d1}^{d2} e′.

3. Constant

Σ_{k=d1}^{d2} c =
    0                 if d2 < d1
    c · (d2 − d1 + 1) otherwise.

4. Additivity of indices

Σ_{k=d1}^{d2} e + Σ_{k=d2+1}^{d3} e = Σ_{k=d1}^{d3} e    whenever d1 ≤ d2 and d2 + 1 ≤ d3.

5. Change of variable

Σ_{k=d1}^{d2} e = Σ_{k=d+d1}^{d+d2} e[k ↦ k−d] = Σ_{k=d−d2}^{d−d1} e[k ↦ d−k],

where e[k ↦ e′] denotes the expression obtained from e by replacing k by e′.

Proof: We prove distributivity; the other properties are proved similarly. We show that

[[Σ_{k=d1}^{d2} c·e]]_ρ = [[c]]_ρ · [[Σ_{k=d1}^{d2} e]]_ρ    (4.4)

for every assignment ρ where the denotation is defined. We consider two cases.
(1) [[d1]]_ρ > [[d2]]_ρ. Then

[[Σ_{k=d1}^{d2} c·e]]_ρ = 0 = [[c]]_ρ · 0 = [[c]]_ρ · [[Σ_{k=d1}^{d2} e]]_ρ.

(2) [[d1]]_ρ ≤ [[d2]]_ρ. Then, let n = [[d2]]_ρ − [[d1]]_ρ. Observe that showing (4.4) is equivalent to showing

[[Σ_{k=d2−n}^{d2} c·e]]_ρ = [[c]]_ρ · [[Σ_{k=d2−n}^{d2} e]]_ρ,

which we prove by induction on n.

Basis: n = 0. Letting ρ′ be the assignment such that ρ′(x) = ρ(x) for x ≠ k and ρ′(k) = [[d2]]_ρ,

[[Σ_{k=d2}^{d2} c·e]]_ρ = [[c·e]]_ρ′ = [[c]]_ρ′ · [[e]]_ρ′ = [[c]]_ρ · [[e]]_ρ′ = [[c]]_ρ · [[Σ_{k=d2}^{d2} e]]_ρ,

where [[c]]_ρ′ = [[c]]_ρ because k does not occur in c.

Induction hypothesis:

[[Σ_{k=d2−n+1}^{d2} c·e]]_ρ = [[c]]_ρ · [[Σ_{k=d2−n+1}^{d2} e]]_ρ.

Step: Take ρ′ to be the assignment such that ρ′(x) = ρ(x) for x ≠ k and ρ′(k) = [[d2 − n]]_ρ. Then

[[Σ_{k=d2−n}^{d2} c·e]]_ρ = [[c·e]]_ρ′ + [[Σ_{k=d2−n+1}^{d2} c·e]]_ρ
  = [[c]]_ρ′ · [[e]]_ρ′ + [[c]]_ρ · [[Σ_{k=d2−n+1}^{d2} e]]_ρ    (induction hypothesis)
  = [[c]]_ρ · ( [[e]]_ρ′ + [[Σ_{k=d2−n+1}^{d2} e]]_ρ )
  = [[c]]_ρ · [[Σ_{k=d2−n}^{d2} e]]_ρ.

QED
Symbolic reasoning

We will use the properties described in Proposition 4.2.3, together with the well-known properties of real expressions, to reason about summations in a symbolic way (that is, without invoking semantic arguments). We check the following property:
Σ_{j=3}^{5} e + Σ_{j=4}^{7} e = Σ_{j=3}^{7} e + Σ_{j=4}^{5} e.

Clearly,

Σ_{j=3}^{5} e + Σ_{j=4}^{7} e = Σ_{j=3}^{5} e + Σ_{j=4}^{5} e + Σ_{j=6}^{7} e    (additivity)
  = Σ_{j=3}^{7} e + Σ_{j=4}^{5} e    (additivity)
Using the above properties of summations we can now compute the worst-case number of comparisons carried out by the algorithm presented in Section 4.1. In fact, we only need the constant property.

Example 4.2.4 (motivating example revisited) In Section 4.1 we had to compute the summation

Σ_{i=0}^{n−m} ( Σ_{j=1}^{m} 1 ) = Σ_{i=0}^{n−m} m    (constant)
  = m(n − m + 1)    (constant)

In general, given a summation, the goal is to find an expression e′ such that

Σ_{k=d1}^{d2} e = e′

and e′ does not have summations. In this case, e′ is called a closed form for the summation.
We will consider closed forms for the summations of members of both arithmetic and geometric progressions.
Example 4.2.5 We start by finding a closed form for the sum of members of an arithmetic progression. Recall that an arithmetic progression is a sequence {u_k}_{k∈N0} such that u_{k+1} − u_k is the same for all k ∈ N0. All arithmetic progressions are of the form

u_k = c + rk.

We would like to consider the summation of the first n + 1 members of {u_k}_{k∈N0}, that is,

Σ_{k=0}^{n} (c + rk).

We prove that

Σ_{k=0}^{n} (c + rk) = (c + rn/2)(n + 1).

In fact:
(1)

Σ_{k=0}^{n} (c + rk) = Σ_{k=n−n}^{n−0} (c + r(n − k))    (change of variable)
  = Σ_{k=0}^{n} (c + rn − rk).

(2)

2 Σ_{k=0}^{n} (c + rk) = Σ_{k=0}^{n} (c + rk) + Σ_{k=0}^{n} (c + rn − rk)
  = Σ_{k=0}^{n} (c + rk + c + rn − rk)    (associativity)
  = Σ_{k=0}^{n} (2c + rn)
  = (2c + rn) Σ_{k=0}^{n} 1    (distributivity)
  = (2c + rn)(n + 1)    (constant)

(3) Hence, dividing by 2,

Σ_{k=0}^{n} (c + rk) = (c + rn/2)(n + 1).
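The closed form above is easy to cross-check numerically. The snippet below is a small Python sketch (the text's own examples use Mathematica; the function names here are ours), comparing the direct sum with (c + rn/2)(n + 1):

```python
def arith_sum(c, r, n):
    # direct summation of the first n+1 members c + r*k
    return sum(c + r * k for k in range(n + 1))

def arith_closed(c, r, n):
    # closed form (c + r*n/2)*(n + 1) from Example 4.2.5
    return (c + r * n / 2) * (n + 1)

for c, r, n in [(0, 1, 10), (2, 3, 7), (-1, 0.5, 100)]:
    assert abs(arith_sum(c, r, n) - arith_closed(c, r, n)) < 1e-9
```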
Example 4.2.6 We now find a closed form for the sum of members of a geometric progression. Recall that a geometric progression is a sequence {u_k}_{k∈N0} such that u_{k+1}/u_k is the same for every k ∈ N0. All geometric progressions are of the form

u_k = c r^k.

We would like to consider the summation of the first n + 1 members of {u_k}_{k∈N0}, that is,

Σ_{k=0}^{n} c r^k.

We prove that, for r ≠ 1,

Σ_{k=0}^{n} c r^k = (c − c r^{n+1}) / (1 − r).

In fact:

(1)

Σ_{k=0}^{n} c r^k + c r^{n+1} = Σ_{k=0}^{n+1} c r^k    (additivity)
  = c r^0 + Σ_{k=1}^{n+1} c r^k    (additivity)
  = c + Σ_{k=0}^{n} c r^{k+1}    (change of variable)
  = c + r Σ_{k=0}^{n} c r^k    (distributivity)

(2) Letting s = Σ_{k=0}^{n} c r^k, step (1) yields the equation s + c r^{n+1} = c + r s, whose solution is

Σ_{k=0}^{n} c r^k = (c − c r^{n+1}) / (1 − r).

As particular cases of Examples 4.2.5 and 4.2.6 we get

Σ_{k=0}^{n} k = n(n + 1)/2    and    Σ_{k=0}^{n} r^k = (1 − r^{n+1}) / (1 − r)

where n ≥ 0 and r ≠ 1.
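Both particular cases follow from the geometric closed form, which can be checked exactly with rational arithmetic. A Python sketch (names ours; `Fraction` avoids floating-point error so the equality test is exact):

```python
from fractions import Fraction

def geom_sum(c, r, n):
    # direct summation of c*r^k for k = 0..n
    return sum(c * r**k for k in range(n + 1))

def geom_closed(c, r, n):
    # closed form (c - c*r^(n+1)) / (1 - r), valid for r != 1
    return (c - c * r**(n + 1)) / (1 - r)

for c, r, n in [(1, Fraction(1, 2), 10), (3, Fraction(-2), 6), (5, Fraction(7), 4)]:
    assert geom_sum(c, r, n) == geom_closed(c, r, n)
```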
The perturbation technique is a useful way of finding closed forms of summations. It works as follows. Assume we want to find a closed form for the summation

Σ_{k=0}^{n} a_k

where n > 0. For simplicity, let s_n abbreviate Σ_{k=0}^{n} a_k. We can rewrite s_{n+1} in two different ways: one by splitting off its last term, a_{n+1}, and another by splitting off its first term, a_0. In many cases, when we rewrite both expressions for s_{n+1} in terms of s_n, we obtain an equation for s_n whose solution is a closed form of the summation.

In fact, using additivity, on one hand we have

s_{n+1} = Σ_{k=0}^{n+1} a_k = Σ_{k=0}^{n} a_k + a_{n+1}

and, on the other hand, using additivity and a change of variable,

s_{n+1} = Σ_{k=0}^{n+1} a_k = a_0 + Σ_{k=1}^{n+1} a_k = a_0 + Σ_{k=0}^{n} a_{k+1}.

Hence

Σ_{k=0}^{n} a_k + a_{n+1} = a_0 + Σ_{k=0}^{n} a_{k+1},

that is,

s_n + a_{n+1} = a_0 + Σ_{k=0}^{n} a_{k+1}.    (4.5)

The goal is now to express the right-hand side of (4.5) in terms of s_n in such a way that solving the equation for s_n results in a closed form for s_n. Note that this is not always possible. Let us illustrate this method with the following example.
Example 4.2.7 Consider the summation

Σ_{k=0}^{n} k a^k

with a ≠ 1. Using (4.5) with a_k = k a^k,

s_n + (n + 1) a^{n+1} = 0·a^0 + Σ_{k=0}^{n} (k + 1) a^{k+1}

and, using associativity and distributivity for rewriting the right-hand side in terms of s_n,

s_n + (n + 1) a^{n+1} = a Σ_{k=0}^{n} k a^k + a Σ_{k=0}^{n} a^k = a s_n + a · (1 − a^{n+1})/(1 − a).    (4.6)

Solving (4.6) for s_n we obtain

Σ_{k=0}^{n} k a^k = −(n + 1) a^{n+1}/(1 − a) + a (1 − a^{n+1})/(1 − a)².
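The closed form obtained by the perturbation technique can again be verified exactly. A Python sketch (names ours, exact rationals so the comparison is equality, not approximation):

```python
from fractions import Fraction

def ksum(a, n):
    # direct summation of k * a^k for k = 0..n
    return sum(k * a**k for k in range(n + 1))

def ksum_closed(a, n):
    # closed form from the perturbation technique (a != 1):
    # -(n+1)*a^(n+1)/(1-a) + a*(1 - a^(n+1))/(1-a)^2
    return -(n + 1) * a**(n + 1) / (1 - a) + a * (1 - a**(n + 1)) / (1 - a) ** 2

for a, n in [(Fraction(2), 8), (Fraction(1, 3), 12), (Fraction(-5, 7), 9)]:
    assert ksum(a, n) == ksum_closed(a, n)
```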
Summations over an arbitrary (possibly infinite) index set A can also be considered. For f : A → R with nonnegative values, we let

Σ_{a∈A} f(a) = L

where L = min({L′ ∈ R : Σ_{a∈C} f(a) ≤ L′ for all finite C ⊆ A}), whenever this real number L exists. Note that, in particular, Σ_{a∈N0} f(a) = lim_{n→∞} Σ_{a=0}^{n} f(a).
4.3 Main results

One of the most important techniques for obtaining a closed formula for a sum is approximation by an integral. The central result was discovered independently by Leonhard Euler and Colin Maclaurin in the 18th century.

Let f : [0, n] → R be p times differentiable, and denote by f^(p) the p-th derivative of f. Then the Euler-Maclaurin formula is as follows:

Σ_{k=0}^{n} f(k) = ∫_0^n f(x) dx + (f(n) + f(0))/2 + Σ_{k=2}^{p} (B_k/k!) ( f^(k−1)(n) − f^(k−1)(0) ) + R_p    (4.7)

where {B_k}_{k∈N0} is the sequence of Bernoulli numbers, inductively defined as follows:

B_0 = 1    and    B_{k+1} = −(1/(k+2)) Σ_{j=0}^{k} C(k+2, j) B_j    (4.8)
and where C(n, k) denotes the binomial coefficient

C(n, k) = n! / ((n − k)! k!)    if 0 ≤ k ≤ n,    and    C(n, k) = 0    otherwise.    (4.9)

Informally, it represents the number of ways that k objects can be chosen among n objects when order is irrelevant. Hence, we can see the binomial coefficient as representing the number of k-element subsets of an n-element set. Its name is derived from the fact that binomial coefficients constitute the coefficients of the series expansion of a power of a binomial, that is,

(1 + x)^n = Σ_{k=0}^{n} C(n, k) x^k.

Binomial coefficients also satisfy Pascal's rule: C(n, k) = C(n−1, k−1) + C(n−1, k).
The remainder R_p involves the integral ∫_0^n f^(p)(x) dx; in particular, R_p = 0 whenever f^(p) is identically 0. The proof of the validity of the Euler-Maclaurin formula can be seen in [24].
Example 4.3.1 Consider the map f such that f(k) = k² and take p = 3. Then f^(p) is such that f^(p)(x) = 0 and so R_p = 0. Using (4.8) we get that B_1 = −1/2, B_2 = 1/6 and B_3 = 0. Moreover f^(1)(n) = 2n and f^(2)(n) = 2, and so from formula (4.7) we get

Σ_{k=0}^{n} k² = (1/3) n³ + (1/2) n² + (1/6) n.

Observe that the Euler-Maclaurin formula gives an exact value for the summation whenever f(k) is a polynomial k^p, for any p ∈ N. In fact, in this case f^(p+1) is such that f^(p+1)(x) = 0 and therefore R_{p+1} is 0.
Proposition 4.3.2 For each n ∈ N and p ∈ N0,

Σ_{k=0}^{n} k^p = (1/(p+1)) Σ_{k=0}^{p} C(p+1, k) B_k (n+1)^{p+1−k}.

The Euler-Maclaurin formula is useful for obtaining a closed formula of a summation Σ_{k=0}^{n} f(k) where f^(p) is 0 for some natural number p.
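Both the Bernoulli recurrence (4.8) and the closed form of Proposition 4.3.2 can be checked mechanically. A Python sketch (function names ours; exact rational arithmetic throughout):

```python
from fractions import Fraction
from math import comb

def bernoulli(m):
    # Bernoulli numbers B_0..B_m via the recurrence (4.8):
    # B_0 = 1, B_{k+1} = -1/(k+2) * sum_{j=0}^{k} C(k+2, j) * B_j
    B = [Fraction(1)]
    for k in range(m):
        B.append(Fraction(-1, k + 2) * sum(comb(k + 2, j) * B[j] for j in range(k + 1)))
    return B

def power_sum(n, p):
    # closed form of Proposition 4.3.2 for sum_{k=0}^{n} k^p
    B = bernoulli(p)
    return Fraction(1, p + 1) * sum(
        comb(p + 1, k) * B[k] * (n + 1) ** (p + 1 - k) for k in range(p + 1))

for n in range(1, 20):
    for p in range(4):
        assert power_sum(n, p) == sum(k**p for k in range(n + 1))
```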
4.4 Examples
4.4.1 Gaussian elimination
At step k, for i, j = k + 1, ..., n, the entries of the matrix and of the vector are updated as follows:

a_ij^(k) = a_ij^(k−1) − ( a_ik^(k−1) / a_kk^(k−1) ) · a_kj^(k−1)
b_i^(k)  = b_i^(k−1)  − ( a_ik^(k−1) / a_kk^(k−1) ) · b_k^(k−1)    (4.10)

The Gaussian elimination technique described above can be easily implemented in Mathematica, as depicted in Figure 4.2. It is straightforward to prove that A^(k) x = b^(k) is equivalent to A x = b for k = 0, ..., n − 1.
GaussElimination=
Function[{A,b},Module[{k,i,j,m,n,Aux,baux},
Aux=A;
baux=b;
n=Length[b];
k=1;
While[k<=n-1,
i=k+1;
While[i<=n,
m=Aux[[i,k]]/Aux[[k,k]];
Aux[[i,k]]=0;
j=k+1;
While[j<=n,
Aux[[i,j]]=Aux[[i,j]]-m*Aux[[k,j]];
j=j+1];
baux[[i]]=baux[[i]]-m*baux[[k]];
i=i+1];
k=k+1];
{Aux,baux}]];
Figure 4.2: Gaussian elimination in Mathematica
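For readers without Mathematica, the following is a Python transliteration of the function in Figure 4.2 (a sketch; the function name is ours, and, like the original, it assumes every pivot a_kk is nonzero):

```python
def gauss_elimination(A, b):
    # Transliteration of the Mathematica GaussElimination of Figure 4.2.
    # A is an n x n matrix (list of lists), b a vector of length n.
    aux = [row[:] for row in A]
    baux = b[:]
    n = len(b)
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = aux[i][k] / aux[k][k]   # assumes a nonzero pivot
            aux[i][k] = 0
            for j in range(k + 1, n):
                aux[i][j] -= m * aux[k][j]
            baux[i] -= m * baux[k]
    return aux, baux

U, c = gauss_elimination([[2.0, 1.0], [4.0, 5.0]], [3.0, 6.0])
# U is upper triangular and U x = c has the same solutions as the original system
```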
subtractions. Looking now at the next loop, the execution of each of its iterations performs

1 + Σ_{j=k+1}^{n} 1

relevant operations.
We can use the techniques in Section 4.2 to compute this summation. More precisely, we prove that

Σ_{k=1}^{n−1} Σ_{i=k+1}^{n} ( 1 + Σ_{j=k+1}^{n} 1 ) = (n³ − n)/3.    (4.11)

In fact,

Σ_{k=1}^{n−1} Σ_{i=k+1}^{n} ( 1 + Σ_{j=k+1}^{n} 1 ) = Σ_{k=1}^{n−1} Σ_{i=k+1}^{n} (1 + (n − k))    (constant)
  = Σ_{k=1}^{n−1} ( (n − k)² + (n − k) )    (constant)
  = Σ_{k=1}^{n−1} (k² + k)    (change of variable)
  = Σ_{k=1}^{n−1} k² + Σ_{k=1}^{n−1} k    (associativity)

The equality (4.11) then follows since, from Example 4.3.1,

Σ_{k=1}^{n−1} k² = (n−1)³/3 + (n−1)²/2 + (n−1)/6

and

Σ_{k=1}^{n−1} k = n(n−1)/2.
(2) The number of remaining relevant operations is given by a similar triple summation with 2 in place of 1 in the innermost summand. The computation of this sum is similar to the one presented in (1) and we get

Σ_{k=1}^{n−1} Σ_{i=k+1}^{n} ( 2 + Σ_{j=k+1}^{n} 1 ) = Σ_{j=1}^{n−1} j² + 2 Σ_{j=1}^{n−1} j = (2n³ + 3n² − 5n)/6.

(3) From (1) and (2) we conclude that the map f is such that

f(n) = (n³ − n)/3 + (2n³ + 3n² − 5n)/6 = (4n³ + 3n² − 7n)/6.

QED
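The two operation counts above can be cross-checked by brute force, simply running the corresponding loops. A Python sketch (names ours):

```python
def count_inner(n):
    # sum_{k=1}^{n-1} sum_{i=k+1}^{n} (1 + sum_{j=k+1}^{n} 1)
    total = 0
    for k in range(1, n):
        for i in range(k + 1, n + 1):
            total += 1 + (n - k)
    return total

def count_outer(n):
    # sum_{k=1}^{n-1} sum_{i=k+1}^{n} (2 + sum_{j=k+1}^{n} 1)
    return sum(2 + (n - k) for k in range(1, n) for _ in range(k + 1, n + 1))

for n in range(1, 30):
    assert count_inner(n) == (n**3 - n) // 3
    assert count_outer(n) == (2 * n**3 + 3 * n**2 - 5 * n) // 6
```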
4.4.2 Insertion sort
The problem of sorting consists in obtaining, given a list w of numbers, an ordered list that is a permutation of w.

There are many sorting algorithms; among them, the insertion sort algorithm (for more details on sorting algorithms, the reader can consult [2]). At the k-th step of the insertion sort algorithm the list w is such that the elements w_1, ..., w_k are already ordered, and the goal is to put w_{k+1} in the proper position with respect to w_1, ..., w_k. An implementation in Mathematica of the insertion sort algorithm is given in Figure 4.3.

The Mathematica function InsertSort in Figure 4.3 receives as input a list w to be ordered and gives as output the corresponding ordered list v. At the beginning, v is set to w. The function consists of two nested loops. At the k-th
InsertSort=Function[w,Module[{v,i,j,m},
i=2;
v=w;
While[i<=Length[w],
j=i-1;
m=v[[i]];
While[j>0&&v[[j]]>m,
v[[j+1]]=v[[j]];
j=j-1];
v[[j+1]]=m;
i=i+1];
v]];
Figure 4.3: Insertion sort algorithm in Mathematica
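A Python transliteration of Figure 4.3 (a sketch; names ours) is given below. It additionally counts the comparisons v[j] > m between list elements, the quantity analyzed in the rest of this section:

```python
def insert_sort(w):
    # Transliteration of the Mathematica InsertSort of Figure 4.3,
    # also counting the comparisons v[j] > m between list elements.
    v = list(w)
    comparisons = 0
    for i in range(1, len(v)):
        m = v[i]
        j = i - 1
        while j >= 0:
            comparisons += 1          # comparison v[j] > m
            if v[j] <= m:
                break
            v[j + 1] = v[j]
            j -= 1
        v[j + 1] = m
    return v, comparisons

ordered, count = insert_sort([-2, -5, 43, -10, 8])
# ordered is [-10, -5, -2, 8, 43], as in the run of Figure 4.4
```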
iteration of the outer loop, the list v is considered to be divided into two parts. The first k elements of v are the first k elements of w, already ordered. The elements of v and w from position k + 1 to the last position are the same. The ordered sublist is extended to the (k+1)-th position by inserting the (k+1)-th element of w in the correct position. The loop ends when there are no more elements to sort. Then, v is the intended output list.
A run of the insertion sort algorithm for the input list w = {−2, −5, 43, −10, 8} is depicted in Figure 4.4. We denote by v′ the ordered part of the list and by v′′ the remaining part. This run has four steps. In the first step, −5 is compared with −2 and becomes the first element of v′. Then, 43 is compared with the last element of v′ and remains in its original position since it is greater than −2. In the last two steps, −10 and 8 are inserted in the correct positions.
In the analysis of a sorting algorithm the relevant operations are the comparisons between list elements carried out by the algorithm to sort the list. We now
analyze the insertion sort algorithm in terms of the number of comparisons
v[[j]] > m
performed by the algorithm to sort a list v of length n. This number depends on
the length n of the list and also on how unsorted the list is.
Worst-case analysis
In worst-case analysis we concentrate on finding the maximum number of comparisons needed to sort an input list of length n or, at least, an upper bound for
this value.
v = {−2 | −5, 43, −10, 8}
v = {−5, −2, 43 | −10, 8}
v = {−10, −5, −2, 43 | 8}
v = {−10, −5, −2, 8, 43}

Figure 4.4: A run of the insertion sort algorithm (the ordered part v′ is to the left of the bar)
In the worst case, placing the i-th element in its position requires i − 1 comparisons, for each i, and so the worst-case number of comparisons to sort a list of length n is

Σ_{i=2}^{n} (i − 1) = Σ_{j=1}^{n−1} j = (n − 1)n / 2.

QED
Average-case analysis
The average-case analysis is in general more elaborate than the worst-case analysis since it involves probabilistic hypotheses. Recall that the (i−1)-th step of the outer loop of the function in Figure 4.3 places the i-th element of the list in the right ordered position, ranging from 1 to i. In the sequel we assume that the probability of the element being placed in any given position is

1/i,

that is, the probability of the element being placed in any position is the same (uniform distribution).

As we will see in the proof of Proposition 4.4.4 below, the average-case analysis of the insertion sort algorithm involves sums such as

Σ_{i=1}^{n} 1/i.    (4.12)

The sum (4.12) is the n-th harmonic number, usually denoted by Hn. Harmonic numbers constitute a discrete analogue of the natural logarithm. There is no closed form for Hn, but it is easy to establish upper and lower bounds for its value.

Proposition 4.4.3 For each n ∈ N,

ln(n) ≤ Hn ≤ ln(n) + 1.
Proof: The result follows from the definition of the Riemann integral. On one hand, since 1/i ≤ 1/x on each interval [i−1, i],

Hn − 1 = Σ_{i=2}^{n} 1/i ≤ ∫_1^n (1/x) dx = ln(n).

On the other hand, since 1/i ≥ 1/x on each interval [i, i+1],

Hn = Σ_{i=1}^{n} 1/i ≥ ∫_1^{n+1} (1/x) dx = ln(n + 1) ≥ ln(n).

QED
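The bounds are easy to observe numerically. A quick Python sketch (names ours):

```python
from math import log

def harmonic(n):
    # H_n = sum_{i=1}^{n} 1/i
    return sum(1 / i for i in range(1, n + 1))

for n in [1, 2, 10, 100, 1000]:
    h = harmonic(n)
    assert log(n) <= h <= log(n) + 1
```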
We now present the average-case analysis of the insertion sort algorithm presented in Figure 4.3.

Proposition 4.4.4 The insertion sort algorithm presented in Figure 4.3 is, in the average case, quadratic in the length of the input list, that is, the map f : N → N0 such that f(n) is the average number of comparisons between list elements performed by the algorithm when it receives as input a list of length n ≥ 1 is in O(n²).
Proof: Consider an input list of length n ≥ 1.

(1) At the (i−1)-th step of the outer loop of the function in Figure 4.3, the i-th element of the list is placed in the right ordered position, ranging from 1 to i. Recall that we are assuming that the probability of this element being placed in any position is the same, that is, 1/i. If the i-th element is placed in position k ≥ 2, the number of required comparisons is i − k + 1. Moreover, if it is placed in the first position, the number of required comparisons is i − 1. Hence, the average number of comparisons needed to place the i-th element in the correct position is

(1/i)(i − 1) + Σ_{k=2}^{i} (1/i)(i − k + 1).

Therefore, the average number of comparisons required to sort the input list is given by the summation

Σ_{i=2}^{n} ( (1/i)(i − 1) + Σ_{k=2}^{i} (1/i)(i − k + 1) ).

Let us find a closed form for this summation. Since

Σ_{k=2}^{i} (1/i)(i − k + 1) = (1/i) Σ_{k=2}^{i} (i − k + 1)    (distributivity)
  = (1/i) Σ_{j=0}^{i−2} (j + 1)    (change of variable)
  = (1/i) · i(i − 1)/2    (Example 4.2.5)
  = (i − 1)/2,
we have that

Σ_{i=2}^{n} ( (1/i)(i − 1) + Σ_{k=2}^{i} (1/i)(i − k + 1) )
  = Σ_{i=2}^{n} ( (i − 1)/i + (i − 1)/2 )
  = Σ_{i=2}^{n} ( (i + 1)/2 − 1/i )    (associativity)
  = (1/2) Σ_{i=2}^{n} (i + 1) − Σ_{i=2}^{n} 1/i    (distributivity)
  = (n² + 3n − 4)/4 − Σ_{i=2}^{n} 1/i    (Example 4.2.5)
  = (n² + 3n)/4 − Σ_{i=1}^{n} 1/i.

(2) Hence, using Proposition 4.4.3,

f(n) = (n² + 3n)/4 − Hn ≥ (n² + 3n)/4 − ln(n) − 1

and, since also f(n) ≤ (n² + 3n)/4, we conclude that f(n) ∈ O(n²).

QED
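The closed form (n² + 3n)/4 − Hn can be validated against a direct simulation: averaging the comparison count of insertion sort over all permutations of a small list. A Python sketch (names ours; the counting follows the comparison v[j] > m of Figure 4.3):

```python
from itertools import permutations

def comparisons(w):
    # number of comparisons v[j] > m performed by insertion sort on w
    v, count = list(w), 0
    for i in range(1, len(v)):
        m, j = v[i], i - 1
        while j >= 0:
            count += 1
            if v[j] <= m:
                break
            v[j + 1] = v[j]
            j -= 1
        v[j + 1] = m
    return count

def average_exact(n):
    # (n^2 + 3n)/4 - H_n, the closed form derived above
    return (n * n + 3 * n) / 4 - sum(1 / i for i in range(1, n + 1))

for n in range(2, 7):
    perms = list(permutations(range(n)))
    avg = sum(comparisons(list(p)) for p in perms) / len(perms)
    assert abs(avg - average_exact(n)) < 1e-9
```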
4.5 Exercises
(b) Σ_{k=1}^{n} 2k(−2)^k = (4/9)((3n + 1)(−2)^n − 1)

(c) Σ_{k=1}^{n} k² = n³/3 + n²/2 + n/6

(d) Σ_{i=1}^{n} i³ = n²(n+1)²/4

(e) Σ_{i=1}^{n} 2/(i(i+1)) = 2n/(1 + n)

(f) Σ_{k=1}^{n} 1/(3(k + k²)) = n/(3n + 3)
Find closed forms for the following summations:

(a) Σ_{k=0}^{n} (n² + 3k)

(b) Σ_{k=3}^{n+1} k

(c) Σ_{k=0}^{n} (6k + n·2^k + 3^k)

(d) Σ_{k=1}^{n+1} (3k + k·3^k + 5^k)

(e) Σ_{k=0}^{n+2} (6(n − k)² + k·2^k)

(f) Σ_{k=1}^{n} ((n − k)·5^{n−k+2} + 2(k − 4))
5. Prove that

Σ_{k=0}^{n−1} (a_{k+1} − a_k) b_k = a_n b_n − a_0 b_0 − Σ_{k=0}^{n−1} (b_{k+1} − b_k) a_{k+1}

where n is a positive integer.
6. Consider the Mathematica function f that receives as input a matrix m of
integer numbers and computes an integer a using a function h.
f=Function[{m},Module[{i,j,a,nlin,ncol},
nlin=Length[m];
ncol=Length[First[m]];
a=1;
i=2;
While[i<nlin,
j=1;
While[j<=ncol,
a=a*h[m,i,j];
j=j+1];
i=i+1];
a]];
Chapter 5
Discrete Fourier transform
In this chapter we introduce the discrete Fourier transform. The discrete Fourier transform is widely used in many fields, ranging from image processing to efficient multiplication of polynomials and large integers. In Section 5.1 we present a motivating example that illustrates the use of the discrete Fourier transform for efficient polynomial multiplication. In Section 5.2 we introduce the discrete Fourier transform. In Section 5.3 we present the fast Fourier transform, an efficient method for computing the discrete Fourier transform. In Section 5.4 we revisit polynomial multiplication based on the discrete Fourier transform. Image processing using the fast Fourier transform is discussed in Section 5.5. Several exercises are proposed in Section 5.6.
5.1 Motivation
Consider two polynomials

p = Σ_{i=0}^{n−1} a_i x^i    and    q = Σ_{j=0}^{n−1} b_j x^j

in R[x], for instance, and recall from Chapter 3 that their product is the polynomial

p·q = Σ_{k=0}^{2n−2} c_k x^k

where

c_k = Σ_{i=0}^{k} a_i b_{k−i}          if 0 ≤ k ≤ n − 1
c_k = Σ_{i=k−n+1}^{n−1} a_i b_{k−i}    if n − 1 < k ≤ 2n − 2.
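The coefficients c_k are exactly the convolution of the coefficient tuples. A direct Python sketch (names ours; the example polynomials are just an illustration) computes them in the obvious O(n²) way that the DFT approach below will improve upon:

```python
def poly_mul(a, b):
    # c_k = sum over i of a_i * b_{k-i}; a and b have equal length n,
    # coefficients listed in increasing order of degree
    n = len(a)
    c = [0] * (2 * n - 1)
    for i in range(n):
        for j in range(n):
            c[i + j] += a[i] * b[j]
    return c

# (3 - 2x + x^2) * (-1 + 2x + 0x^2) = -3 + 8x - 5x^2 + 2x^3
assert poly_mul([3, -2, 1], [-1, 2, 0]) == [-3, 8, -5, 2, 0]
```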
Computing the coefficients c_k directly takes O(n²) multiplications and sums. The discrete Fourier transform supports the following faster scheme:

p, q (coefficient rep.)
    |  evaluation, O(n log n)
    v
p, q (point-value rep.)
    |  pointwise multiplication, O(n)
    v
p·q (point-value rep.)
    |  interpolation, O(n log n)
    v
p·q (coefficient rep.)
This technique can also be used for efficient multiplication of large integers. For this purpose, each integer number k is considered as the value of the evaluation of a suitable polynomial at a positive integer b, corresponding to the base we are considering. The coefficients of the polynomial correspond to the digits of the representation of k in base b.
5.2 The discrete Fourier transform

5.2.1 Complex numbers

The set of complex numbers C is the extension of the set of real numbers in which all polynomials with real coefficients of degree n have precisely n roots (counting multiplicity). For this reason, the set of complex numbers is called the algebraic closure of the set of real numbers.
Any complex number can be expressed as a + b i, where a and b are real numbers and i is called the imaginary unit. Addition and multiplication of complex numbers are defined in the usual way, taking into account that i² = −1, that is,

(a + b i) + (c + d i) = (a + c) + (b + d) i;
(a + b i) · (c + d i) = (ac − bd) + (ad + bc) i.

The set of complex numbers endowed with their addition and multiplication constitutes a field. A useful representation of complex numbers is their polar form, that is,

a + b i = r e^{iθ} = r (cos(θ) + i sin(θ)).

Every nonzero complex number r e^{iθ} has exactly n distinct n-th roots, namely

r^{1/n} e^{i(θ + 2kπ)/n}    for k ∈ {0, 1, ..., n − 1}.

Herein, we are particularly interested in the distinct n-th roots of unity

z_n^k = e^{i 2kπ/n}    for k ∈ {0, 1, ..., n − 1}.

The n-th root of unity z_n = z_n^1 = e^{i 2π/n} is the principal root. In the sequel we may also refer to z_n^k for k ∈ Z; as expected, also in this case z_n^k = cis(2kπ/n) = e^{i 2kπ/n}.
The n-th roots of unity enjoy several interesting properties. The following
properties are useful in the sequel (see Exercises 1 and 2 in Section 5.6).
Proposition 5.2.2 Let k ∈ N0 and n, d ∈ N where n is an even number. Then

z_n^{n/2 + d} = −z_n^d    and    z_n^{2k} = z_{n/2}^k.
Using the above properties of the complex roots of unity we can speed up the computation of the n-th roots of unity when n is even. On one hand, note that once we have computed the first n/2 roots of unity, the remaining ones are their symmetric values:

z_n^{n/2}   = z_n^{n/2 + 0}       = −z_n^0
z_n^{n/2+1} = z_n^{n/2 + 1}       = −z_n^1
...
z_n^{n−1}   = z_n^{n/2 + (n/2−1)} = −z_n^{n/2−1}.

On the other hand, when we have already computed the (n/2)-th roots of unity, we can use those values to obtain half of the n-th roots:

z_n^0   = z_n^{2·0}       = z_{n/2}^0
z_n^2   = z_n^{2·1}       = z_{n/2}^1
...
z_n^{n−2} = z_n^{2(n/2−1)} = z_{n/2}^{n/2−1}.

Moreover, when k is not divisible by n,

Σ_{i=0}^{n−1} (z_n^k)^i = 0.
5.2.2 The discrete Fourier transform

We now present the discrete Fourier transform and some related properties.

Definition 5.2.4 Let ā = (a_0, a_1, ..., a_{n−1}) ∈ C^n where n ∈ N. The discrete Fourier transform of ā is the tuple b̄ = (b_0, ..., b_{n−1}), denoted DFT_n(ā), where

b_k = Σ_{j=0}^{n−1} a_j z_n^{kj}    for each 0 ≤ k ≤ n − 1.
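Definition 5.2.4 translates directly into code. A naive Python sketch (name ours; it uses z_n = e^{2πi/n} exactly as in the definition, and so runs in O(n²), unlike the FFT presented later):

```python
import cmath

def dft(a):
    # b_k = sum_j a_j * z_n^(k*j), with z_n = e^(2*pi*i/n)  (Definition 5.2.4)
    n = len(a)
    z = cmath.exp(2j * cmath.pi / n)
    return [sum(a[j] * z ** (k * j) for j in range(n)) for k in range(n)]

out = dft([3, -2, 1, 0])
# close to (2, 2 - 2i, 6, 2 + 2i) up to floating-point rounding
```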
Observe that the discrete Fourier transform is a particular linear transformation, described by a Vandermonde matrix. An m × n Vandermonde matrix for (α_1, ..., α_m), denoted by V_n(α_1, ..., α_m), is such that its entry at row i, column j is α_i^j, and so each row i is a sequence of a geometric progression with ratio α_i. Indeed,

DFT_n(ā) = V ā

where ā is written as a column matrix and

V = | 1  z_n^0      (z_n^0)²      ...  (z_n^0)^{n−1}     |
    | 1  z_n^1      (z_n^1)²      ...  (z_n^1)^{n−1}     |
    | ...                                                |
    | 1  z_n^{n−1}  (z_n^{n−1})²  ...  (z_n^{n−1})^{n−1} |
Conversely, computing

[a_0, a_1, ..., a_{n−1}]^T = V^{−1} [y_0, y_1, ..., y_{n−1}]^T

corresponds to DFT_n^{−1}(y_0, y_1, ..., y_{n−1}) = (a_0, a_1, ..., a_{n−1}). The matrix V^{−1} can be characterized as follows: the inverse of

V = | 1  z_n^0      ...  (z_n^0)^{n−1}     |
    | 1  z_n^1      ...  (z_n^1)^{n−1}     |
    | ...                                  |
    | 1  z_n^{n−1}  ...  (z_n^{n−1})^{n−1} |

is the matrix V^{−1} whose entry at row j, column k is (1/n) z_n^{−kj}. Indeed, the entry at row j, column j′ of V^{−1} V is

(1/n) Σ_{k=0}^{n−1} z_n^{−kj} z_n^{kj′} = (1/n) Σ_{k=0}^{n−1} z_n^{k(j′−j)},

which equals 1 when j′ = j and 0 otherwise, by the properties of the roots of unity. Moreover,

DFT_n^{−1}(y_0, y_1, ..., y_{n−1}) = (1/n) DFT_n(y_0, y_{n−1}, y_{n−2}, ..., y_1)

where in the above equality we assume DFT_n(y_0, y_{n−1}, y_{n−2}, ..., y_1) written as a column matrix.
Proof: Considering

[a_0, a_1, ..., a_{n−1}]^T = V^{−1} [y_0, y_1, ..., y_{n−1}]^T,

it holds that

a_k = (1/n) Σ_{j=0}^{n−1} y_j z_n^{−kj}

and, since z_n^{−kj} = z_n^{(n−k)j},

a_k = (1/n) Σ_{j=0}^{n−1} y_j z_n^{(n−k)j}.

QED

Hence,

DFT_n^{−1}(y_0, y_1, ..., y_{n−1}) = (1/n) DFT_n(y_0, y_{n−1}, y_{n−2}, ..., y_1),    (5.1)

that is, the inverse of the discrete Fourier transform can be computed using the discrete Fourier transform itself.
Another relevant way of presenting the discrete Fourier transform is based on polynomials. In fact, we may associate to a tuple (a_0, a_1, ..., a_{n−1}) ∈ C^n the polynomial p = Σ_{j=0}^{n−1} a_j x^j in C[x]. Then

DFT_n(a_0, a_1, ..., a_{n−1}) = (p(z_n^0), p(z_n^1), ..., p(z_n^{n−1})).    (5.2)
For instance, consider the tuple (3, −2, 1, 0) and the associated polynomial p = x² − 2x + 3. From

p(1) = 2,    p(i) = i² − 2i + 3 = 2 − 2i,    p(−1) = 6,    p(−i) = (−i)² + 2i + 3 = 2 + 2i,

we conclude that DFT_4(3, −2, 1, 0) = (2, 2 − 2i, 6, 2 + 2i).
Taking into account equality (5.2), computing DFT_n reduces to evaluating a polynomial at the roots of unity. The naive way to evaluate the polynomial p used in (5.2) at some value u consists of computing a_i u^i for each 1 ≤ i ≤ n − 1 and then adding all these values to a_0. Moreover, if to compute u^i we always perform i − 1 multiplications (that is, we do not take advantage of the previously computed value of u^{i−1} to get u^i = u^{i−1} · u), this evaluation involves Σ_{i=1}^{n−1} i = n(n−1)/2 multiplications and n − 1 sums. Hence, we have an O(n²) number of multiplications and sums. If we used this naive method to evaluate DFT_n, taking into account equality (5.2), it would require O(n³) multiplications and sums. This bound can be improved using Horner's rule (see Proposition 5.2.9), which uses only an O(n) number of multiplications and sums to evaluate p at u.
Proposition 5.2.9 Let p = Σ_{i=0}^{n−1} a_i x^i be a polynomial in C[x]. Consider the sequence of polynomials q_j in C[x], with j = 0, ..., n − 1, defined as follows:

q_0 = a_{n−1}
q_j = q_{j−1} · x + a_{n−(j+1)}.

Then p = q_{n−1}, and the evaluation of q_{n−1} using the above sequence of polynomials involves n − 1 multiplications and n − 1 sums.

Proof: The proof follows by induction on the degree n − 1 of p.
Basis: the degree is 0. Then n − 1 = 0 and no multiplications or sums are performed.
Step: Assuming that the degree is n, then p(u) = q_n(u) = q_{n−1}(u) · u + a_0. By the induction hypothesis, computing q_{n−1}(u) takes n − 1 multiplications and n − 1 sums. So computing p(u) takes n multiplications and n sums.
QED
Example 5.2.10 Consider the polynomial x³ − 4x + 2 of degree 3. Let a_i be the coefficient of x^i for each 0 ≤ i ≤ 3. Then a_3 = 1, a_2 = 0, a_1 = −4 and a_0 = 2. Using Proposition 5.2.9,

q_0 = a_3 = 1;
q_1 = q_0 · x + a_2 = x + 0 = x;
q_2 = q_1 · x + a_1 = x² − 4;
q_3 = q_2 · x + a_0 = (x² − 4) · x + 2 = x³ − 4x + 2.
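Horner's rule is one line of code. A Python sketch (name ours) applied to the polynomial of Example 5.2.10:

```python
def horner(coeffs, u):
    # coeffs = [a_0, a_1, ..., a_{n-1}]; evaluates p(u) with n-1
    # multiplications and n-1 sums, following the sequence q_j of
    # Proposition 5.2.9 (processing coefficients from a_{n-1} down to a_0)
    result = coeffs[-1]
    for a in reversed(coeffs[:-1]):
        result = result * u + a
    return result

# x^3 - 4x + 2 at u = 3: 27 - 12 + 2 = 17
assert horner([2, -4, 0, 1], 3) == 17
```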
The evaluation of DFT_n needs O(n²) multiplications and sums using Horner's rule and equality (5.2). Note that this does not yet take into account that the polynomial p to be evaluated is the same throughout all components of DFT_n, and that p is evaluated at the roots of unity. These two facts combined allow the complexity of computing DFT_n to be further improved to O(n log(n)) multiplications and sums. In the following section we present the algorithm that achieves this complexity, known as the fast Fourier transform.
5.3 The fast Fourier transform

Let n be an even number and p = Σ_{j=0}^{n−1} a_j x^j a polynomial in C[x]. Consider the polynomials

p_0 = Σ_{j=0}^{n/2−1} a_{2j} x^j    and    p_1 = Σ_{j=0}^{n/2−1} a_{2j+1} x^j

built from the even-indexed and the odd-indexed coefficients of p, respectively. For every u ∈ C,

p_0(u²) = Σ_{j=0}^{n/2−1} a_{2j} u^{2j}    and    p_1(u²) = Σ_{j=0}^{n/2−1} a_{2j+1} u^{2j}

hold. As a consequence,

u · p_1(u²) = Σ_{j=0}^{n/2−1} a_{2j+1} u^{2j+1}

and therefore p(u) = p_0(u²) + u · p_1(u²).

QED

Hence, the evaluation at u of a polynomial p = Σ_{j=0}^{n−1} a_j x^j of degree n − 1, where n is an even number, can be computed using the evaluation of two polynomials of degree less than or equal to n/2 − 1 at u².
When considering the n-th roots of unity, we have that

p(z_n^k) = p_0(z_n^{2k}) + z_n^k p_1(z_n^{2k}) = p_0(z_{n/2}^k) + z_n^k p_1(z_{n/2}^k)

for each 0 ≤ k ≤ n/2 − 1. Moreover, using Proposition 5.2.2,

p(z_n^{n/2 + k}) = p_0(z_{n/2}^k) − z_n^k p_1(z_{n/2}^k)

for each 0 ≤ k ≤ n/2 − 1. Hence, evaluating p at the n n-th roots of unity reduces to evaluating p_0 and p_1 at the (n/2)-th roots of unity, plus O(n) additional multiplications and sums. Applying this idea recursively leads to the fast Fourier transform (FFT) algorithm, implemented in Mathematica in Figure 5.3.
FFT=Function[{w},Module[{n,z,zp,az,au,ptz,ptu,k,r},
n=Length[w];
If[n==1,
w,
zp=E^(2*Pi*I/n);
z=1;
az=Table[w[[2i+1]],{i,0,n/2-1}];
au=Table[w[[2i+2]],{i,0,n/2-1}];
ptz=FFT[az];
ptu=FFT[au];
r=Table[0,{i,1,n}];
For[k=0,k<=n/2-1,k=k+1,
r[[k+1]]=ptz[[k+1]]+z*ptu[[k+1]];
r[[k+n/2+1]]=ptz[[k+1]]-z*ptu[[k+1]];
z=z*zp];
r]]];
Figure 5.3: FFT in Mathematica
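A Python transliteration of Figure 5.3 (a sketch; names ours) may be easier to experiment with. It follows the same recursive structure, assuming the input length is a power of 2:

```python
import cmath

def fft(w):
    # Recursive FFT mirroring the Mathematica function of Figure 5.3;
    # the length of w is assumed to be a power of 2.
    n = len(w)
    if n == 1:
        return list(w)
    zp = cmath.exp(2j * cmath.pi / n)   # principal n-th root of unity
    ptz = fft(w[0::2])                  # even-indexed coefficients
    ptu = fft(w[1::2])                  # odd-indexed coefficients
    r, z = [0] * n, 1
    for k in range(n // 2):
        r[k] = ptz[k] + z * ptu[k]
        r[k + n // 2] = ptz[k] - z * ptu[k]
        z *= zp
    return r

out = fft([3, -2, 1, 0])
# close to (2, 2 - 2i, 6, 2 + 2i) up to floating-point rounding,
# matching Example 5.3.2
```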
The analysis of the FFT algorithm follows straightforwardly. Let oFFT(n) be the number of sums and multiplications used in FFT for an input of length n. For such an input, the FFT performs O(n) multiplications and sums, and makes two recursive calls of order n/2. So we have to find the solution of

oFFT(n) = 2 oFFT(n/2) + O(n),

that is, for some constant c,

oFFT(n) = 2 oFFT(n/2) + cn = 4 oFFT(n/4) + 2cn = ... = n oFFT(1) + cn log₂(n),

and therefore oFFT(n) ∈ O(n log(n)).
Example 5.3.2 Consider the tuple (3, −2, 1, 0). The goal is to compute the discrete Fourier transform DFT_4(3, −2, 1, 0) using the FFT algorithm. This transform can be computed from DFT_2(3, 1) and DFT_2(−2, 0), and these from DFT_1(3), DFT_1(1), DFT_1(−2) and DFT_1(0). The computation involves 2nd roots of unity and 4th roots of unity. Only two of them, z_2^0 and z_4^1, have to be computed, since

z_2^0 = z_4^0 = 1,    z_4^1 = i,    z_4^2 = −z_4^0 = −1,    z_4^3 = −z_4^1 = −i.

Since DFT_1(a) = a, we have DFT_1(3) = 3, DFT_1(1) = 1, DFT_1(−2) = −2 and DFT_1(0) = 0, and hence DFT_2(3, 1) = (4, 2) and DFT_2(−2, 0) = (−2, −2). Then

DFT_4(3, −2, 1, 0)_1 = DFT_2(3, 1)_1 + z_4^0 · DFT_2(−2, 0)_1 = 2;
DFT_4(3, −2, 1, 0)_2 = DFT_2(3, 1)_2 + z_4^1 · DFT_2(−2, 0)_2 = 2 − 2i;
DFT_4(3, −2, 1, 0)_3 = DFT_2(3, 1)_1 − z_4^0 · DFT_2(−2, 0)_1 = 6;
DFT_4(3, −2, 1, 0)_4 = DFT_2(3, 1)_2 − z_4^1 · DFT_2(−2, 0)_2 = 2 + 2i.

The computation can also be briefly sketched as follows:

DFT_4(3, −2, 1, 0) = (4 + z_4^0·(−2), 2 + z_4^1·(−2), 4 − z_4^0·(−2), 2 − z_4^1·(−2)) = (2, 2 − 2i, 6, 2 + 2i).
5.4 Polynomial multiplication revisited

5.4.1 Point-value representation
A polynomial p = a_0 + a_1 x + ... + a_n x^n is determined from n + 1 pairs (u_0, v_0), ..., (u_n, v_n) with distinct first components by solving the system

| 1  u_0  u_0²  ...  u_0^n | | a_0 |   | v_0 |
| 1  u_1  u_1²  ...  u_1^n | | a_1 | = | v_1 |
| ...                      | | ... |   | ... |
| 1  u_n  u_n²  ...  u_n^n | | a_n |   | v_n |

Recall that the coefficient matrix is the Vandermonde matrix V_n(u_0, u_1, ..., u_n). This matrix is invertible, since its determinant |V_n| is Π_{0≤i<j≤n} (u_j − u_i), and therefore |V_n| is different from 0 whenever u_i ≠ u_j for all i ≠ j. If a_n ≠ 0 then p has degree n; otherwise the degree is less than n.
Example 5.4.3 Consider the set {(0, 2), (1, −1), (2, 2), (3, 17)} of pairs of real numbers. There is one (and only one) polynomial p in R[x] with degree less than or equal to 3 such that

p(0) = 2,    p(1) = −1,    p(2) = 2,    p(3) = 17,

since, solving the system

| 1  0  0   0 | | a_0 |   |  2 |
| 1  1  1   1 | | a_1 | = | −1 |
| 1  2  4   8 | | a_2 |   |  2 |
| 1  3  9  27 | | a_3 |   | 17 |

we get

a_0 = 2,    a_1 = −4,    a_2 = 0,    a_3 = 1,

that is, p = x³ − 4x + 2.
Clearly, a point-value representation is not unique, in the sense that any set of n + 1 pairs of complex numbers {(v_0, p(v_0)), ..., (v_n, p(v_n))} with distinct first components is also a point-value representation of p.

In certain situations it is useful to consider extended point-value representations of a polynomial p with degree n: any set of m + 1 > n + 1 pairs

{(u_0, p(u_0)), (u_1, p(u_1)), ..., (u_m, p(u_m))}

with distinct first components.
Given a point-value representation {(z_n^0, y_0), ..., (z_n^{n−1}, y_{n−1})} of a polynomial p at the n-th roots of unity, getting the coefficients a_0, a_1, ..., a_{n−1} from y_0, y_1, ..., y_{n−1} corresponds to computing the inverse of the discrete Fourier transform, that is,

DFT_n^{−1}(y_0, y_1, ..., y_{n−1}) = (a_0, a_1, ..., a_{n−1}).

We now refer to the sum and multiplication of polynomials using only their point-value representations.
Given two polynomials p and q with degree n, from point-value representations of p and q at the same complex numbers u_0, ..., u_n we easily get a point-value representation of p + q at u_0, ..., u_n. Recall that deg(p + q) ≤ max{deg(p), deg(q)}.

Proposition 5.4.6 Consider the polynomials p and q in C[x] with degree n. If {(u_0, v_0), ..., (u_n, v_n)} and {(u_0, w_0), ..., (u_n, w_n)} are point-value representations of p and q, respectively, then the pointwise sum

{(u_0, v_0 + w_0), ..., (u_n, v_n + w_n)}

is a (possibly extended) point-value representation of p + q.

If the polynomials p and q do not have the same degree, say, for instance, deg(q) < deg(p), then deg(p + q) = deg(p), and it is easy to conclude that, taking a suitable extended point-value representation of q, we can also obtain a point-value representation of p + q as described in Proposition 5.4.6.

Example 5.4.7 Consider the polynomials p = x³ − 4x + 2 and q = x² − 2x + 1. Since {(0, 2), (1, −1), (2, 2), (3, 17)} and {(0, 1), (1, 0), (2, 1), (3, 4)} are a point-value representation of p and an extended point-value representation of q, respectively, then {(0, 3), (1, −1), (2, 3), (3, 21)} is a point-value representation of p + q.
Extended point-value representations are also useful when multiplying polynomials in point-value representation. Recall that deg(p·q) = deg(p) + deg(q). Hence, if p and q have degree n, any point-value representation of p·q has at least 2n + 1 elements. As a consequence, to obtain a point-value representation for p·q we always have to consider extended point-value representations of p and q whenever deg(p) and deg(q) are not both 0.

Proposition 5.4.8 Let p and q be polynomials in C[x] with degree n > 0. If {(u_0, v_0), ..., (u_{2n}, v_{2n})} and {(u_0, w_0), ..., (u_{2n}, w_{2n})} are extended point-value representations of p and q, respectively, then the pointwise multiplication

{(u_0, v_0 w_0), ..., (u_{2n}, v_{2n} w_{2n})}

is a point-value representation of p·q.
5.4.2 Multiplying polynomials using the discrete Fourier transform

The DFT-based multiplication of two polynomials p and q given in coefficient representation follows the scheme:

p, q (coefficient rep.)
    |  DFT_n for p, DFT_n for q
    v
p, q (point-value rep.)
    |  pointwise multiplication
    v
p·q (point-value rep.)
    |  DFT_n^{−1}
    v
p·q (coefficient rep.)
Let us illustrate this technique by computing the product of p = x² − 2x + 3 and q = 2x − 1.

(1) Since the product has degree 3, we represent the coefficients of p and q by the 4-tuples (3, −2, 1, 0) and (−1, 2, 0, 0).

(2) We first compute DFT_4(3, −2, 1, 0). Recalling Example 5.3.2, it holds that

DFT_4(3, −2, 1, 0) = (2, 2 − 2i, 6, 2 + 2i).

Proceeding in the same way, DFT_2(−1, 0) = (−1, −1) and DFT_2(2, 0) = (2, 2), and therefore

DFT_4(−1, 2, 0, 0) = (1, −1 + 2i, −3, −1 − 2i),    since

DFT_4(−1, 2, 0, 0)_1 = DFT_2(−1, 0)_1 + z_4^0 · DFT_2(2, 0)_1 = −1 + 1·2 = 1
DFT_4(−1, 2, 0, 0)_2 = DFT_2(−1, 0)_2 + z_4^1 · DFT_2(2, 0)_2 = −1 + 2i
DFT_4(−1, 2, 0, 0)_3 = DFT_2(−1, 0)_1 − z_4^0 · DFT_2(2, 0)_1 = −1 − 1·2 = −3
DFT_4(−1, 2, 0, 0)_4 = DFT_2(−1, 0)_2 − z_4^1 · DFT_2(2, 0)_2 = −1 − 2i

(3) Using DFT_4(3, −2, 1, 0), DFT_4(−1, 2, 0, 0) and pointwise multiplication (denoted by ⊙) we get a point-value representation for p·q (in fact, only the second components):

(2, 2 − 2i, 6, 2 + 2i) ⊙ (1, −1 + 2i, −3, −1 − 2i) = (2, 2 + 6i, −18, 2 − 6i)

(4) Finally, we compute

DFT_4^{−1}(2, 2 + 6i, −18, 2 − 6i) = (1/4) DFT_4(2, 2 − 6i, −18, 2 + 6i).

Here DFT_2(2, −18) = (−16, 20), since DFT_2(2, −18)_1 = DFT_1(2) + z_2^0 · DFT_1(−18) = −16, and DFT_2(2 − 6i, 2 + 6i) = (4, −12i). Hence

(1/4) DFT_4(2, 2 − 6i, −18, 2 + 6i) = (1/4)(−12, 32, −20, 8) = (−3, 8, −5, 2).

The product of p and q is therefore the polynomial 2x³ − 5x² + 8x − 3.
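The whole pipeline (evaluate, multiply pointwise, interpolate) fits in a few lines. A self-contained Python sketch (names ours; the inverse transform uses z_n^{−kj} with the 1/n factor, as characterized for V^{−1} above):

```python
import cmath

def dft(a, inverse=False):
    # naive DFT; the inverse uses z_n^(-k*j) and the 1/n factor
    n = len(a)
    sign = -1 if inverse else 1
    z = cmath.exp(sign * 2j * cmath.pi / n)
    out = [sum(a[j] * z ** (k * j) for j in range(n)) for k in range(n)]
    return [x / n for x in out] if inverse else out

# p = x^2 - 2x + 3 and q = 2x - 1, padded to length 4
P = dft([3, -2, 1, 0])
Q = dft([-1, 2, 0, 0])
prod = dft([x * y for x, y in zip(P, Q)], inverse=True)
coeffs = [round(c.real) for c in prod]
# coeffs is [-3, 8, -5, 2], i.e. 2x^3 - 5x^2 + 8x - 3
```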
5.5 Image processing

5.6 Exercises

Prove that, for n ∈ N even, k ∈ N0 and d ∈ N:

(a) z_n^{n/2} = −1
(b) z_n^{n/2 + d} = −z_n^d
(c) z_n^{2k} = z_{n/2}^k

3. Let k ∈ N0 and n ∈ N be such that k is not divisible by n. Prove that the equality Σ_{i=0}^{n−1} (z_n^k)^i = 0 holds.
4. Using the fast Fourier transform, compute:

(a) DFT_2(3, 4)
(b) DFT_4(0, 1, 1, 2)
(c) DFT_4(2, 1, 1, 0)
(d) DFT_4(4, 7, 3, 0)
(e) DFT_8(2, 0, 1, 1, 0, 1, 2, 1)
(f) DFT_2^{−1}(2, 1)

Compute, using the discrete Fourier transform, the product of the polynomials p and q where:

(b) p = 5x + 3 and q = 2x + 4
(c) p = 3x − 2 and q = 4x² − 5x + 3
(d) p = 5x and q = 2x² − 3x
Chapter 6
Generating functions
In this chapter we introduce generating functions. Given any sequence of real or
complex numbers, we can associate with it a generating function, that is, a series
involving all the terms of the sequence. There are several kinds of generating
functions for a sequence, such as ordinary generating functions, exponential generating functions and Poisson generating functions [19, 18, 31]. Herein, we only
consider ordinary generating functions.
In Section 6.1 we present motivating examples in algorithm analysis. Generating functions are introduced in Section 6.2. In Section 6.3 we revisit the motivating examples, and in Section 6.4 we propose some exercises.
6.1 Motivation
In this section, we refer to the use of generating functions in algorithm analysis. We first refer to the average-case analysis of a search algorithm involving a hash function. Then we refer to Euclid's algorithm. The second example also illustrates the relevance of generating functions for solving recurrence relations.
6.1.1 Search by hashing
To insert a new record, we assign its key value to Key[rstored + 1] and its data
to Data[rstored + 1] and then increment rstored. The task of searching for some
key K among the keys already stored can be accomplished going through the
table Key sequentially, comparing each Key[j] to the given key K. This can be
rather slow when a large number of records have already been stored.
In order to improve this situation we can use the hashing technique that
involves splitting the storing memory space for keys into m lists, and consider a
hash function h that transforms each key K into an integer h(K) {1, ..., m}.
The integer h(K) indicates the list where to search for the key K. We have to
consider also the tables First and Next. For each 1 ≤ i ≤ m,

First[i] ∈ {−1, 1, ..., n_r}

indicates the position in table Key of the first key of list i. The value −1 indicates that the list is empty. For each 1 ≤ j ≤ n_r,

Next[j] ∈ {0, 1, ..., n_r}

indicates the position in table Key of the key that follows Key[j] in the list h(Key[j]). The value 0 indicates that Key[j] is the last key in its list.
Example 6.1.1 Consider a university library maintaining a database storing relevant data about its readers, the university students. In order to borrow books, each student has to register first as a reader at the library. The university sequentially assigns an identification number to each of its students, let us say, a nonnegative integer less than 100000. The library database uses these identification numbers as record keys.
To keep things easy, assume in this example that we have only 10 lists, that is m = 10, and that the hash function h : {1, . . . , 99999} → {1, . . . , 10} is such that

h(K) = ⌊K/10000⌋ + 1.
Moreover, for simplicity, let us assume that only 6 students have registered at the library so far and that Key[1] = 35346, Key[2] = 15367, Key[3] = 43289, Key[4] = 32128, Key[5] = 38532 and Key[6] = 46238. We can sketch the key distribution as follows:

L2: 15367
L4: 35346, 32128, 38532
L5: 43289, 46238
that is, there are keys in the lists L2 , L4 and L5 and the other lists are empty.
This situation corresponds to

rstored = 6,

First[2] = 2, First[4] = 1, First[5] = 3 and First[i] = −1 for i ∈ {1, 3, 6, 7, 8, 9, 10}

and

Next[1] = 4, Next[2] = 0, Next[3] = 6, Next[4] = 5, Next[5] = 0 and Next[6] = 0.
Using the hashing technique described above, the task of searching for keys
can be performed faster, since when looking for a given key K we only have to
compare it with the stored keys K′ such that h(K′) = h(K). Clearly, if the hash function h is such that h(K) = h(K′) for all the keys K′ that have already been stored, then all the lists but one are empty and K has to be compared with all
the stored keys. Therefore, the worst case number of comparisons is equal to the
worst case number of comparisons when no hash function is involved. However,
the average case number of comparisons is smaller when the hashing technique
is used. Our goal is to determine this average case number of comparisons.
The function keySearch in Figure 6.1 determines whether a given key k has already been stored, assuming that the hash function h is already known and that the lists key, first and next record the tables Key, First and Next, respectively. Using the auxiliary lists first and next, function keySearch compares k with all the elements in key whose hash function value equals that of k. It returns the position of the key k in key if k has already been stored and prints the string "the key has not been stored" otherwise.
keySearch=Function[{k},Module[{i,j,r},
i=h[k];
j=first[[i]];
r=False;
While[j>0&&!r,
If[key[[j]]==k,r=True,j=next[[j]]]];
If[r,j,Print["the key has not been stored"]]]];
Figure 6.1: Key search function in Mathematica
If the key we are searching for has been already stored we say that the search
is successful. Otherwise, the search is unsuccessful.
6.1.2 Euclid's algorithm
Recall Euclid's algorithm for computing the greatest common divisor of two nonnegative integers presented in Figure 1.4. The analysis of this algorithm involves counting the number of recursive calls performed when euclid[m,n] is evaluated.
When evaluating euclid[m,n], there are no recursive calls if m = 0 and if m = n there is just one recursive call. The analysis of Euclid's algorithm often assumes that the first argument is less than the second. In fact, note that if m ≠ 0 and m < n, there is one recursive call to euclid[Mod[n,m],m] where, again, the first argument is less than the second, since mod(n, m) < m. Reasoning in a similar way, it is easy to conclude that the first argument is also going to be less than the second in all the following recursive calls. If m > n, there is again a first recursive call to euclid[Mod[n,m],m], but the first argument is now less than the second. Hence, we can reason as in the previous case and conclude that again the first argument is going to be less than the second in all the following recursive calls.
Lamé's theorem (see Theorem 6.3.16) establishes an upper bound for the number of recursive calls that are performed when euclid[m,n] is evaluated. It states that for k, m, n ∈ N, with m < n, the evaluation of euclid[m,n] involves less than k recursive calls whenever m < s_{k+1}, where s_{k+1} is the (k+2)th Fibonacci number. Recall that the sequence of Fibonacci numbers is the sequence s = {s_n}_{n∈N0} such that

s_0 = 0,  s_1 = 1  and  s_n = s_{n−1} + s_{n−2}  for n ≥ 2.   (6.1)

Moreover (see Proposition 6.3.17), the worst case number of recursive calls occurs when evaluating euclid[s_k,s_{k+1}]. In this case there are k − 1 recursive calls.
The Fibonacci number s_k, for some k ∈ N0, can be computed using (6.1), but the closed form

s_k = (1/√5) ( ((1 + √5)/2)^k − ((1 − √5)/2)^k )   (6.2)

is often useful. To get the equality (6.2) we have to solve the recurrence relation (6.1). In Section 6.3.2 we discuss how generating functions can be used for solving, in particular, the recurrence relation in (6.1). The analysis of recursive algorithms often involves recurrence relations.
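The bound of Lamé's theorem and the worst case for consecutive Fibonacci numbers can be checked experimentally. The sketch below is ours (in Python rather than the book's Mathematica, with function names of our choosing): it counts the recursive calls performed by the algorithm of Figure 1.4 and compares them with the Fibonacci numbers.

```python
def fib(k):
    # Fibonacci numbers with s_0 = 0 and s_1 = 1, as in (6.1)
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a

def euclid_calls(m, n):
    # number of recursive calls performed when evaluating euclid[m,n]
    return 0 if m == 0 else 1 + euclid_calls(n % m, m)

# worst case: two consecutive Fibonacci numbers s_k, s_{k+1} need k - 1 calls
print(euclid_calls(fib(10), fib(11)))  # 9

# Lamé's bound, contrapositive form: euclid_calls(m,n) >= k implies m >= s_{k+1}
for n in range(2, 80):
    for m in range(1, n):
        assert m >= fib(euclid_calls(m, n) + 1)
```

The inner assertion takes k equal to the observed number of calls: if that many calls occurred, m cannot be below the corresponding Fibonacci number.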
6.2 Generating functions
In this section we introduce the notion of generating function and some related
properties. A generating function associates a formal power series with each
sequence of elements of a field. Herein, we refer to sequences of real or complex
numbers.
Definition 6.2.1 Let s = {s_n}_{n∈N0} be a sequence of real or complex numbers. The generating function for s, denoted by G_s(z), is

Σ_{i=0}^{∞} s_i z^i.   (6.3)

If s is such that s_n = 0 for all n < k, for some k ∈ N0, we can write

Σ_{i=k}^{∞} s_i z^i   or   Σ_{i=0}^{∞} s_{i+k} z^{i+k}   (6.4)

to denote G_s(z). If s is such that s_n = 0 for all n > k, for some k ∈ N0, we can write

s_0 + s_1 z + s_2 z^2 + . . . + s_k z^k   (6.5)

to denote G_s(z).

Example 6.2.2 (i) The generating function for q = (q_n)_{n∈N0} where q_0 = 1, q_2 = −1 and q_n = 0 for all other n is 1 − z^2.
(ii) The generating function for v = (v_n)_{n∈N0} where v_n = 1 for each n ∈ N0 is

Σ_{i=0}^{∞} z^i.

(iii) The generating function for r = (r_n)_{n∈N0} where r_n = n for each n ∈ N0 is

Σ_{i=0}^{∞} i z^i

and, taking into account the observations above, we can also write

Σ_{i=1}^{∞} i z^i   or   Σ_{i=0}^{∞} (i + 1) z^{i+1}

to denote G_r(z).
(iv) The generating function for w = (w_n)_{n∈N0} where w_n = 1 if n is a multiple of 3 and w_n = 0 otherwise is

Σ_{i=0}^{∞} z^{3i}.
Definition 6.2.3 Consider two sequences s = {s_n}_{n∈N0} and t = {t_n}_{n∈N0}. The sum of G_s(z) and G_t(z), denoted by G_s(z) + G_t(z), is G_{s+t}(z), that is, the generating function for the sequence s + t.
Example 6.2.4 Let r and v be the sequences presented in Example 6.2.2. Then,

G_r(z) + G_v(z) = G_{r+v}(z) = Σ_{i=0}^{∞} (r_i + v_i) z^i = Σ_{i=0}^{∞} (i + 1) z^i.
Definition 6.2.5 Consider two sequences s = {s_n}_{n∈N0} and t = {t_n}_{n∈N0}. The product of G_s(z) and G_t(z), denoted by G_s(z) × G_t(z), is

Σ_{i=0}^{∞} ( Σ_{k=0}^{i} s_k t_{i−k} ) z^i.
For simplicity we often just write G_s(z)G_t(z) for G_s(z) × G_t(z).
Example 6.2.6 Let t = {t_n}_{n∈N0} be any sequence and let s be the sequence

a, 0, 0, 0, . . .

that is, s = (s_n)_{n∈N0} is such that s_0 = a and s_n = 0 for all n > 0, where a ∈ R. Since

Σ_{k=0}^{i} s_k t_{i−k} = s_0 t_i = a t_i

for each i ∈ N0, we have

G_s(z) × G_t(z) = Σ_{i=0}^{∞} a t_i z^i

and therefore G_s(z) × G_t(z) is the generating function for the sequence at. Note that we can also write

a G_t(z) = Σ_{i=0}^{∞} a t_i z^i.
Example 6.2.7 Let t = {t_n}_{n∈N0} be any sequence and let u be the sequence

0, 0, 0, 1, 0, 0, . . .

that is, u = {u_n}_{n∈N0} is such that u_3 = 1 and u_n = 0 for all n ∈ N0\{3}. Note that

Σ_{k=0}^{i} u_k t_{i−k} = Σ_{k=0}^{i} 0 t_{i−k} = 0

for i < 3, and

Σ_{k=0}^{i} u_k t_{i−k} = u_3 t_{i−3} = t_{i−3}

for i ≥ 3. Hence, G_u(z) × G_t(z) is the generating function for the sequence

0, 0, 0, t_0, t_1, t_2, t_3, t_4, . . .

that is, the sequence t′ = {t′_n}_{n∈N0} such that t′_0 = t′_1 = t′_2 = 0 and t′_n = t_{n−3} for n ≥ 3, and therefore

G_u(z) × G_t(z) = Σ_{i=0}^{∞} t′_i z^i = G_{t′}(z).   (6.7)
Taking into account the notation introduced above and the fact that t′_{i+3} = t_i for each i ≥ 0, we can also write

z^3 G_t(z) = Σ_{i=3}^{∞} t′_i z^i

or

z^3 G_t(z) = Σ_{i=0}^{∞} t_i z^{i+3}.   (6.8)
Equalities similar to (6.7) hold for sequences u such that u_m = 1 is the only nonzero term of the sequence, where m is any nonnegative integer. The product G_u(z) × G_t(z) can then also be denoted as in (6.8), with m in place of 3.
Observe that the sum and product of generating functions corresponding to polynomials indeed correspond to the sum and product of polynomials. Note also that the sum and product of generating functions coincide with the sum and product of real functions admitting a power series expansion, within their intervals of convergence.
Let G denote the set of generating functions for sequences of real numbers. It is easy to conclude that the operation + : G² → G that associates to each pair of generating functions their sum is a commutative and associative operation. Moreover, G_s(z) + 0 = G_s(z) and G_s(z) + G_{−s}(z) = 0, for all G_s(z) ∈ G (recall that 0 denotes the generating function for s = {s_n}_{n∈N0} such that s_n = 0 for each n ∈ N0).
Since Σ_{k=0}^{i} s_k t_{i−k} = Σ_{k=0}^{i} s_{i−k} t_k for all i ∈ N0, it is also easy to conclude that the operation × : G² → G that associates to each pair of generating functions their product is a commutative and associative operation. It also holds that G_s(z) × 1 = G_s(z) for all G_s(z) ∈ G (recall that 1 denotes the generating function for s = {s_n}_{n∈N0} such that s_0 = 1 and s_n = 0 for each n > 0).
Moreover, the product of generating functions is distributive with respect to their sum. Hence, the set G endowed with the operations defined above and − : G → G such that −G_s(z) = G_{−s}(z) constitutes a unitary commutative ring. The multiplicative identity is the generating function 1.
Proposition 6.2.8 The tuple (G, +, 0, ×, −) constitutes a unitary commutative ring.
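The ring operations can be experimented with by truncating each series to its first N coefficients. The following Python sketch is ours (the book's code is in Mathematica); it represents a generating function by the list of its first N coefficients:

```python
N = 10  # number of coefficients kept in every truncated series

def gf_add(s, t):
    # sum, as in Definition 6.2.3: coefficient-wise addition
    return [a + b for a, b in zip(s, t)]

def gf_mul(s, t):
    # product, as in Definition 6.2.5: i-th coefficient is sum_{k<=i} s_k t_{i-k}
    return [sum(s[k] * t[i - k] for k in range(i + 1)) for i in range(N)]

v = [1] * N            # the all-ones sequence of Example 6.2.2
r = list(range(N))     # the sequence r_n = n

print(gf_add(r, v))    # coefficients i + 1, as in Example 6.2.4
print(gf_mul(v, v))    # also i + 1: squaring G_v(z) generates 1, 2, 3, ...
```

Commutativity of the product, used in the proof sketched above, amounts to reversing the inner summation index.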
Not all generating functions have multiplicative inverses: G_s(z) has a multiplicative inverse precisely when s_0 ≠ 0.
Proposition 6.2.9 The generating function G_s(z) for the sequence s = {s_n}_{n∈N0} has a multiplicative inverse if and only if s_0 ≠ 0.
Proof:
(→) Assume that G_t(z) = Σ_{i=0}^{∞} t_i z^i is the multiplicative inverse of G_s(z). Then, G_s(z) × G_t(z) = 1. Therefore, Σ_{k=0}^{0} s_k t_{0−k} = s_0 t_0 = 1 and Σ_{k=0}^{i} s_k t_{i−k} = 0 for i > 0. In particular, s_0 t_0 = 1 and, as a consequence, s_0 ≠ 0.
(←) Assume that s_0 ≠ 0. Let t = {t_n}_{n∈N0} be such that

t_0 = 1/s_0   and   t_n = −(1/s_0) Σ_{k=1}^{n} s_k t_{n−k}   for n ≥ 1.   (6.9)

Then

Σ_{k=0}^{0} s_k t_{0−k} = s_0 t_0 = s_0 (1/s_0) = 1

and, for each n > 0,

Σ_{k=0}^{n} s_k t_{n−k} = s_0 t_n + Σ_{k=1}^{n} s_k t_{n−k} = s_0 (−(1/s_0) Σ_{k=1}^{n} s_k t_{n−k}) + Σ_{k=1}^{n} s_k t_{n−k} = 0.

Hence, G_s(z) × G_t(z) = 1, that is, G_t(z) is the multiplicative inverse of G_s(z). QED
Besides G_s(z)^{−1}, we may also use 1/G_s(z) to denote the multiplicative inverse of a generating function G_s(z), when it exists.
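The recurrence (6.9) in the proof is effective: it computes the coefficients of the inverse one by one. A minimal Python sketch (ours, not from the book):

```python
def gf_inverse(s, N):
    # first N coefficients of G_s(z)^(-1), via the recurrence (6.9); needs s[0] != 0
    s = (s + [0] * N)[:N]          # pad a finite coefficient list with zeros
    t = [0.0] * N
    t[0] = 1 / s[0]
    for n in range(1, N):
        t[n] = -sum(s[k] * t[n - k] for k in range(1, n + 1)) / s[0]
    return t

# the inverse of the all-ones series is 1 - z (see Example 6.2.10)
print(gf_inverse([1] * 8, 8))      # [1.0, -1.0, 0.0, 0.0, ...]

# conversely, the inverse of 1 - z is the all-ones series
print(gf_inverse([1, -1], 8))      # [1.0, 1.0, 1.0, ...]
```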
Example 6.2.10 Let v be the sequence 1, 1, 1, . . . in Example 6.2.2. Using (6.9), the coefficients of the multiplicative inverse G_t(z) of G_v(z) are

t_0 = 1/v_0 = 1/1 = 1
t_1 = −(1/v_0) Σ_{k=1}^{1} v_k t_{1−k} = −v_1 t_0 = −1
t_2 = −(1/v_0) Σ_{k=1}^{2} v_k t_{2−k} = −(v_1 t_1 + v_2 t_0) = −(−1 + 1) = 0
t_3 = −(1/v_0) Σ_{k=1}^{3} v_k t_{3−k} = −(v_1 t_2 + v_2 t_1 + v_3 t_0) = −(0 − 1 + 1) = 0
. . .

that is, G_v(z)^{−1} is the generating function for the sequence 1, −1, 0, 0, . . ., and therefore G_v(z)^{−1} = 1 − z.
Example 6.2.11 Let a ∈ R and let g_a be the geometric progression with ratio a and first term 1, that is, (g_a)_n = a^n for each n ∈ N0. Using (6.9), the coefficients of the multiplicative inverse G_t(z) of G_{g_a}(z) are

t_0 = 1/(g_a)_0 = 1/1 = 1
t_1 = −(1/(g_a)_0) Σ_{k=1}^{1} (g_a)_k t_{1−k} = −(g_a)_1 t_0 = −a
t_2 = −(1/(g_a)_0) Σ_{k=1}^{2} (g_a)_k t_{2−k} = −((g_a)_1 t_1 + (g_a)_2 t_0) = −(−a^2 + a^2) = 0
. . .

that is, G_{g_a}(z)^{−1} = 1 − az.
The derivative of a generating function is again a generating function: the derivative of G_s(z), denoted by G′_s(z), is the generating function for the sequence {(n + 1) s_{n+1}}_{n∈N0}, that is,

G′_s(z) = Σ_{i=0}^{∞} (i + 1) s_{i+1} z^i.
Example 6.2.13 Let v be the sequence 1, 1, 1, . . . in Example 6.2.2. The derivative of G_v(z) is the generating function for 1v_1, 2v_2, 3v_3, . . .. Hence, G′_v(z) is the generating function for 1, 2, 3, 4, . . ., the sequence x = (x_n)_{n∈N0} where x_n = n + 1 for each n ∈ N0, and therefore

G′_v(z) = Σ_{i=0}^{∞} (i + 1) z^i.
Example 6.2.14 Let t = (t_n)_{n∈N0} be such that t_n = 0 for all n > k for some k ∈ N0, and recall that we can use t_0 + t_1 z + t_2 z^2 + . . . + t_k z^k to denote G_t(z). The derivative G′_t(z) is the generating function for 1t_1, 2t_2, . . ., kt_k, 0, 0, . . ., the sequence x = (x_n)_{n∈N0} where x_n = (n + 1)t_{n+1} for 0 ≤ n ≤ k − 1 and x_n = 0 for n ≥ k, and therefore

G′_t(z) = t_1 + 2t_2 z + 3t_3 z^2 + . . . + kt_k z^{k−1}.

Note that the derivative of t_0 + t_1 z + t_2 z^2 + . . . + t_k z^k is just the derivative of the corresponding polynomial function. Considering the particular case of 1, 0, −1, 0, 0, . . ., the sequence q in Example 6.2.2, it holds that

G′_q(z) = (1 − z^2)′ = −2z.
The integral of a generating function is also a generating function: the integral of G_s(z), denoted by ∫₀^z G_s(z), is the generating function for the sequence 0, s_0, s_1/2, s_2/3, . . .. For instance,

∫₀^z G_v(z) = Σ_{i=0}^{∞} (1/(i + 1)) z^{i+1}.

Note that (∫₀^z G_v(z))′ = G_v(z), since the derivative of ∫₀^z G_v(z) is the generating function for 1 × 1, 2 × (1/2), 3 × (1/3), . . ., that is, the sequence whose terms are all equal to 1.
Example 6.2.17 Recall again the sequence t = (t_n)_{n∈N0} in Example 6.2.14 and the generating function G_t(z) = t_0 + t_1 z + t_2 z^2 + . . . + t_k z^k. The integral of G_t(z) is the generating function for

0, t_0/1, t_1/2, t_2/3, . . ., t_k/(k + 1), 0, 0, . . .
The usual properties regarding the derivative of the sum and of the product hold, that is, given generating functions G_s(z) and G_t(z),

(G_s(z) + G_t(z))′ = G′_s(z) + G′_t(z)   (6.10)

and

(G_s(z) × G_t(z))′ = G′_s(z) × G_t(z) + G_s(z) × G′_t(z).   (6.11)

Moreover, if G_s(z) has a multiplicative inverse then

(1/G_s(z))′ = −G′_s(z)/(G_s(z))^2.   (6.12)

Furthermore, the derivative of the integral of a generating function is the original function, that is, (∫₀^z G_s(z))′ = G_s(z). The proofs of these properties are left as an exercise to the reader.
Observe that the notion of derivative of a generating function coincides with the usual notion of derivative of a function admitting a power series expansion, within its domain. The same holds for the integral.
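On truncated coefficient lists, the derivative and the integral are simple index shifts. The sketch below is ours (using exact rationals); it checks that differentiating the integral gives back the original sequence:

```python
from fractions import Fraction

def gf_derivative(s):
    # the i-th coefficient of G'_s(z) is (i + 1) * s_{i+1}
    return [(i + 1) * s[i + 1] for i in range(len(s) - 1)]

def gf_integral(s):
    # the integral starts with 0 and then has s_i / (i + 1)
    return [Fraction(0)] + [Fraction(s[i], i + 1) for i in range(len(s))]

v = [1] * 6                                # the all-ones sequence
print(gf_derivative(v))                    # [1, 2, 3, 4, 5], as in Example 6.2.13
print(gf_integral(v))                      # [0, 1, 1/2, 1/3, 1/4, 1/5, 1/6]
print(gf_derivative(gf_integral(v)) == v)  # True
```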
Closed forms
It is often useful to get a closed form for a generating function Gs (z), that is,
to get an equality Gs (z) = e where the expression e does not explicitly involve
power series. The following examples illustrate how we can obtain closed forms
for some generating functions.
Example 6.2.18 The equalities

zG_v(z) = Σ_{i=0}^{∞} z^{i+1} = Σ_{i=1}^{∞} z^i = G_v(z) − 1

hold. Solving

zG_v(z) = G_v(z) − 1

for G_v(z), we get

G_v(z) = 1/(1 − z),   (6.13)
thus obtaining a closed form for G_v(z). This technique can be generalized to conclude that

G_s(z) = G_{av}(z) = a/(1 − z)

where s is a sequence whose terms are all equal to a real number a, that is, s = av.
Observe that we can also use the fact that G_v(z)^{−1} = 1 − z (see Example 6.2.10) to conclude that G_v(z) = (1 − z)^{−1} and, as a consequence, the equality (6.13).
Example 6.2.19 Let a be a real number and let g_a be the sequence presented in Example 6.2.11 (geometric progression with ratio a and first term 1). From

G_{g_a}(z) = Σ_{i=0}^{∞} a^i z^i

we get

azG_{g_a}(z) = Σ_{i=0}^{∞} a^{i+1} z^{i+1} = Σ_{i=1}^{∞} a^i z^i = G_{g_a}(z) − 1.

Solving azG_{g_a}(z) = G_{g_a}(z) − 1 for G_{g_a}(z), we get

G_{g_a}(z) = 1/(1 − az).   (6.14)
Example 6.2.20 Let w = (w_n)_{n∈N0} be such that w_n = 1 if n is a multiple of 3 and w_n = 0 otherwise, so that G_w(z) = Σ_{i=0}^{∞} z^{3i}. The equalities

z^3 G_w(z) = Σ_{i=0}^{∞} z^{3i+3} = Σ_{i=1}^{∞} z^{3i} = G_w(z) − 1

hold and therefore, solving z^3 G_w(z) = G_w(z) − 1 for G_w(z), we get the closed form

G_w(z) = 1/(1 − z^3).   (6.15)
We can reason as above when, for some k ∈ N, the sequence w = (w_n)_{n∈N0} is such that w_n = 1 if n is a multiple of k and w_n = 0 otherwise. The resulting closed form is analogous to (6.15), with k instead of 3.
1/(1 − z) + 1/(1 + z).

Example 6.2.22 Let x = (x_n)_{n∈N0} be the sequence such that x_n = n + 1 for each n ∈ N0. Recalling Example 6.2.13,

G_x(z) = G′_v(z) = (1/(1 − z))′ = 1/(1 − z)^2.   (6.16)
From the closed form obtained in Example 6.2.22 we can obtain closed forms for other generating functions.
Example 6.2.23 Let r be the sequence 0, 1, 2, 3, . . . in Example 6.2.2. Since r_n = x_{n−1} for n ≥ 1 and r_0 = 0, from (6.16) we conclude that

G_r(z) = zG_x(z) = z/(1 − z)^2.   (6.17)
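The closed forms (6.13), (6.14) and (6.17) can be checked by expanding p(z)/q(z) back into a power series; the expansion below uses the same recurrence idea as (6.9). The sketch is ours, not from the book:

```python
def series(p, q, N):
    # first N coefficients a_i of p(z)/q(z), where p and q are polynomial
    # coefficient lists with q[0] != 0: obtained by solving q * a = p
    p = (p + [0] * N)[:N]
    q = (q + [0] * N)[:N]
    a = []
    for n in range(N):
        a.append((p[n] - sum(q[k] * a[n - k] for k in range(1, n + 1))) / q[0])
    return a

print(series([1], [1, -1], 8))        # 1/(1 - z): all ones, as in (6.13)
print(series([1], [1, -3], 5))        # 1/(1 - 3z): powers of 3, as in (6.14)
print(series([0, 1], [1, -2, 1], 8))  # z/(1 - z)^2: 0, 1, 2, 3, ..., as in (6.17)
```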
6.3 The motivating examples revisited

We first describe how to use generating functions for computing the expected value and the variance of some discrete random variables. We then present the average case analysis of the key search algorithm discussed in Section 6.1.

6.3.1 Discrete random variables and probability generating functions

We briefly recall several basic notions concerning discrete random variables, beginning with the notion of discrete probability space.
Definition 6.3.1 A discrete probability space is a pair (Ω, p) where Ω is a countable set and p : Ω → [0, 1] is a map such that Σ_{ω∈Ω} p(ω) = 1. Each element of Ω is an elementary event.

Given a discrete probability space (Ω, p), the map p can be extended to subsets of Ω considering the map p̄ : 2^Ω → [0, 1] such that p̄(A) = Σ_{ω∈A} p(ω). When there is no ambiguity we just use p for p̄. We now introduce discrete random variables.
Definition 6.3.2 A random variable over a discrete probability space (Ω, p) is a map X : Ω → R.

A random variable over a discrete probability space is said to be a discrete random variable. In the sequel, X(Ω) denotes the set {X(ω) : ω ∈ Ω}. Moreover, we say that the random variable X takes only values in a set C whenever X(Ω) ⊆ C. For instance, if X(Ω) ⊆ N0 we say that X takes only nonnegative integer values. We can associate a probability function with each discrete random variable.

Definition 6.3.3 Let X : Ω → R be a discrete random variable over (Ω, p). The probability function associated with X is the map P_X : R → [0, 1] such that

P_X(x) = p({ω ∈ Ω : X(ω) = x}).
Some parameters are useful to characterize the probability function of a random variable, such as the expected value and the variance. Since in the sequel we only refer to discrete random variables taking only nonnegative integer values, we just introduce the expected value and variance of such random variables.

Definition 6.3.5 Let X be a discrete random variable over (Ω, p) taking only nonnegative integer values. For each m ∈ N, the m-th moment of X, denoted by E(X^m), is

Σ_{k=0}^{∞} k^m P_X(k)

whenever this summation is a real number. The first moment of X, that is, E(X) = Σ_{k=0}^{∞} k P_X(k), is the expected value, or mean, of X. The variance of X is V(X) = E((X − E(X))^2).
It can be easily proved that

V(X) = E((X − E(X))^2) = E(X^2) − (E(X))^2   (6.18)

and therefore the variance of X can be computed using the first and the second moments of X.
We now introduce probability generating functions. Let X be a discrete random variable taking only nonnegative integer values. We can consider the sequence P_X(0), P_X(1), P_X(2), . . . and therefore the corresponding generating function, the probability generating function associated with X, herein denoted by G_X(z).
The sets {ω ∈ Ω : X(ω) = k} and {ω ∈ Ω : X(ω) = k′} are disjoint for distinct k, k′ ∈ N0, and ∪_{k∈N0} {ω ∈ Ω : X(ω) = k} = Ω. As a consequence, recalling also Definition 6.3.1, the equalities

Σ_{k=0}^{∞} P_X(k) = Σ_{ω∈Ω} p(ω) = 1

hold, that is, G_X(1) = 1.
The expected value of a discrete random variable X taking only nonnegative
integer values can be computed using the derivative of the probability generating
function of X.
Proposition 6.3.7 Let X be a discrete random variable over (Ω, p) taking only nonnegative integer values. Then E(X) = G′_X(1).

Proof: The first derivative of the probability generating function of X is

G′_X(z) = Σ_{k=0}^{∞} (k + 1) P_X(k + 1) z^k.   (6.19)

Therefore,

G′_X(1) = Σ_{k=0}^{∞} (k + 1) P_X(k + 1) = Σ_{k=0}^{∞} k P_X(k) = E(X).

Note that E(X) is a real number if and only if G′_X(1) is a real number. QED
The second moment of X can be computed using the first and the second
derivatives of GX (z).
Proposition 6.3.8 Let X be a discrete random variable over (Ω, p) taking only nonnegative integer values. Then:

1. E(X^2) = G″_X(1) + G′_X(1)
2. V(X) = G″_X(1) + G′_X(1) − (G′_X(1))^2
Proof: From (6.19),

G″_X(z) = Σ_{k=0}^{∞} (k − 1) k P_X(k) z^{k−2}

and therefore

G″_X(1) + G′_X(1) = Σ_{k=0}^{∞} (k − 1) k P_X(k) + Σ_{k=0}^{∞} k P_X(k) = Σ_{k=0}^{∞} k^2 P_X(k) = E(X^2),

which proves 1. Statement 2 then follows from (6.18), since V(X) = E(X^2) − (E(X))^2. QED
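Propositions 6.3.7 and 6.3.8 give a mechanical recipe: differentiate the probability generating function and evaluate at 1. A Python sketch for a fair six-sided die (our example, using exact rationals):

```python
from fractions import Fraction

def deriv(p):
    # coefficients of the derivative of a pgf given as a coefficient list
    return [(k + 1) * p[k + 1] for k in range(len(p) - 1)]

def at_one(p):
    # evaluate a coefficient list at z = 1
    return sum(p)

# fair die: P_X(k) = 1/6 for k in {1, ..., 6}
p = [Fraction(0)] + [Fraction(1, 6)] * 6

mean = at_one(deriv(p))                         # G'_X(1)
second_moment = at_one(deriv(deriv(p))) + mean  # G''_X(1) + G'_X(1)
variance = second_moment - mean ** 2

print(mean)      # 7/2
print(variance)  # 35/12
```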
Moreover, if X and Y are independent discrete random variables over (Ω, p) taking only nonnegative integer values, then

G_{X+Y}(z) = Σ_{k=0}^{∞} P_{X+Y}(k) z^k.

Since, by independence, P_{X+Y}(k) = Σ_{i=0}^{k} P_{XY}(i, k − i) = Σ_{i=0}^{k} P_X(i) P_Y(k − i), we conclude that

G_{X+Y}(z) = Σ_{k=0}^{∞} ( Σ_{i=0}^{k} P_X(i) P_Y(k − i) ) z^k = ( Σ_{k=0}^{∞} P_X(k) z^k ) ( Σ_{k=0}^{∞} P_Y(k) z^k ) = G_X(z) G_Y(z).
The above result can be extended to the sum of a finite number of random variables, that is, the equality GX1 +X2 +...+Xn (z) = GX1 (z)GX2 (z) . . . GXn (z)
holds.
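The identity G_{X+Y}(z) = G_X(z)G_Y(z) says that the distribution of a sum of independent variables is the convolution of their distributions. A quick Python check with two independent fair coin flips (our example):

```python
def conv(p, q):
    # coefficient product of two pgfs = distribution of the sum
    return [sum(p[i] * q[k - i] for i in range(len(p)) if 0 <= k - i < len(q))
            for k in range(len(p) + len(q) - 1)]

coin = [0.5, 0.5]        # X takes values in {0, 1}, each with probability 1/2
print(conv(coin, coin))  # distribution of X + Y: [0.25, 0.5, 0.25]
```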
Average case analysis: unsuccessful search

In this section we return to the motivating example presented in Section 6.1.1 and show how to use probability generating functions in the average case analysis of the key search algorithm. The average case analysis depends on whether the key has already been stored (successful search) or not (unsuccessful search). We first consider unsuccessful searches.
Assume we are searching for a key K and assume that n ∈ N keys have already been stored in table Key. For simplicity, we denote by K_i the key Key[i]. Suppose we have a hash function h : K → {1, 2, . . . , m}, where m ∈ N and K is the key space, and therefore there are m different lists where to search for K.
Recall the function keySearch in Figure 6.1 that determines whether k has already been stored. We want to determine the average number of comparisons key[[j]]==k that are performed when searching for k. To this end, some probabilistic hypotheses have to be considered. Since we are analyzing unsuccessful searches, the key K has not yet been stored. Hence, K ≠ K_i for each 1 ≤ i ≤ n.
For illustration purposes suppose m = 2, that is, we have two lists L1 and L2, and suppose n = 3. There are 2^3 = 8 possible scenarios, corresponding to the 8 possible assignments of the keys K1, K2 and K3 to the lists L1 and L2: in scenario (1) all three keys are in L1, in scenario (2) all three are in L2, and scenarios (3) to (8) are the six assignments in which the keys are split between the two lists.
In general, we reason as follows.

(1) The discrete probability space is (Ω, p) where Ω = {1, . . . , m}^{n+1}. Each elementary event is a tuple ω = (r_1, . . . , r_{n+1}) where r_i = h(K_i) for each 1 ≤ i ≤ n and r_{n+1} = h(K).

(2) For each 1 ≤ i ≤ n we consider the discrete random variable X_i over (Ω, p) such that

X_i(ω) = 1 if r_{n+1} = r_i and X_i(ω) = 0 otherwise,

that is, X_i(ω) = 1 when K is compared with K_i. Letting A_i = {ω ∈ Ω : X_i(ω) = 1},

P_{X_i}(1) = Σ_{ω∈A_i} p(ω)   and   P_{X_i}(0) = 1 − P_{X_i}(1).

(3) The number of comparisons is given by the random variable

NC = Σ_{i=1}^{n} X_i

and, assuming that the random variables X_1, . . . , X_n are independent,

G_{NC}(z) = Π_{i=1}^{n} G_{X_i}(z).

Since

G_{X_i}(z) = Σ_{r=0}^{∞} P_{X_i}(r) z^r = P_{X_i}(0) + P_{X_i}(1) z,

we get

G_{NC}(z) = Π_{i=1}^{n} (P_{X_i}(0) + P_{X_i}(1) z).
Then we just compute E(NC) = G′_{NC}(1). We can also compute the variance V(NC) = G″_{NC}(1) + G′_{NC}(1) − (G′_{NC}(1))^2.
The following example illustrates the average case analysis assuming a uniform distribution, that is, every elementary event is equally likely.

Example 6.3.10 Assuming a uniform distribution, that is, a situation where every elementary event ω is equally likely:

- (Ω, p) is such that Ω = {1, . . . , m}^{n+1} and p(ω) = 1/m^{n+1} for each ω ∈ Ω, where m, n ∈ N.

- for 1 ≤ i ≤ n,

P_{X_i}(1) = m × m^{n−1} × (1/m^{n+1}) = 1/m
and therefore P_{X_i}(0) = 1 − 1/m = (m−1)/m. Hence,

G_{X_i}(z) = (m−1)/m + (1/m) z

and

G_{NC}(z) = Π_{i=1}^{n} G_{X_i}(z) = ((m−1)/m + (1/m) z)^n.

Differentiating,

G′_{NC}(z) = (n/m) ((m−1)/m + (1/m) z)^{n−1}

and

G″_{NC}(z) = (n(n−1)/m^2) ((m−1)/m + (1/m) z)^{n−2}.

The average case number of comparisons between the stored keys and the given key K is then

E(NC) = G′_{NC}(1) = n/m.

The variance of NC is V(NC) = G″_{NC}(1) + G′_{NC}(1) − (G′_{NC}(1))^2 = n(n−1)/m^2 + n/m − n^2/m^2 = n(m−1)/m^2.
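The values E(NC) = n/m and V(NC) = n(m−1)/m^2 can be confirmed by building G_NC(z) explicitly as a product of the n factors G_{X_i}(z). A Python sketch (ours, with exact rationals and sample values of m and n):

```python
from fractions import Fraction

def conv(p, q):
    # product of two coefficient lists
    return [sum(p[i] * q[k - i] for i in range(len(p)) if 0 <= k - i < len(q))
            for k in range(len(p) + len(q) - 1)]

m, n = 4, 6
gxi = [Fraction(m - 1, m), Fraction(1, m)]   # P_Xi(0) + P_Xi(1) z

g = [Fraction(1)]
for _ in range(n):
    g = conv(g, gxi)                          # coefficients of G_NC(z)

mean = sum(k * c for k, c in enumerate(g))
variance = sum(k * (k - 1) * c for k, c in enumerate(g)) + mean - mean ** 2

print(mean)      # n/m = 3/2
print(variance)  # n(m-1)/m^2 = 9/8
```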
To end this section we return to the characterization of the random variables X_i for 1 ≤ i ≤ n. An alternative way of characterizing X_i without explicitly involving the elementary events and their probabilities is as follows:

X_i = 1 if h(K) = h(K_i) and X_i = 0 otherwise

and

P_{X_i}(1) = Σ_{r=1}^{m} P(h(K) = r and h(K_i) = r) = Σ_{r=1}^{m} (1/m)(1/m) = m × (1/m^2) = 1/m.
Let us take a closer look at the relationship between the two definitions we have presented for P_{X_i}(1). Recall the set A_i defined above for each 1 ≤ i ≤ n. We have that A_i = A_i^1 ∪ . . . ∪ A_i^m where A_i^r = {(r_1, . . . , r_{n+1}) ∈ Ω : r_i = r_{n+1} = r} for each 1 ≤ r ≤ m. These sets are pairwise disjoint and therefore

P_{X_i}(1) = Σ_{ω∈A_i} p(ω) = Σ_{r=1}^{m} p(A_i^r).

Note that p(A_i^r) is the probability that h(K) = h(K_i) = r, which just corresponds to the probability P(h(K) = r and h(K_i) = r) mentioned above.
Average case analysis: successful search
We now address the successful search case. We again assume we are searching for a key K, that n ∈ N keys have already been stored, and that the hash function is h : K → {1, 2, . . . , m}, with m ∈ N. Recall that K_i denotes the key Key[i] and that K_1 was the first key to be stored, K_2 the second, and so on. Since we are analyzing successful searches, the key K we are looking for is one of the keys already stored.
This situation differs from the case of unsuccessful searches in several aspects. To begin with, note that the minimum number of comparisons in the case of an unsuccessful search is 0 but it is 1 in the case of a successful search, since list h(K) is never empty. Moreover, in an unsuccessful search, K is compared with all the keys in the list h(K), whereas in a successful search this may not be the case, since we need no more comparisons once we find K. Observe also that if K = K_j for some 1 ≤ j ≤ n then there are no comparisons between K and K_{j′} for any j′ > j, and therefore the maximum number of comparisons is j. This is the case because when j′ > j then K_{j′} occurs after K_j in table Key. Finally, note that in a successful search we also have to take into account the probability that K = K_j for each 1 ≤ j ≤ n.
To compute the average number of the intended comparisons we reason as follows.

(1) The discrete probability space is (Ω, p) where in this case

Ω = {1, . . . , m}^n × {1, . . . , n}.

Each possible scenario consists of a particular distribution of the keys K_1, K_2, . . ., K_n over the m lists together with K = K_i for some 1 ≤ i ≤ n. Thus, each elementary event is a tuple (r_1, r_2, . . . , r_{n+1}) where r_i = h(K_i) for each 1 ≤ i ≤ n and r_{n+1} ∈ {1, . . . , n} indicates which of the n keys is K.

(2) We consider a discrete random variable Y over the discrete probability space (Ω, p) such that Y(Ω) = {1, . . . , n}, with Y(r_1, . . . , r_{n+1}) = r_{n+1}. The value P_Y(j) is then the probability that K = K_j, for 1 ≤ j ≤ n.
(3) For each 1 ≤ i ≤ j ≤ n we consider the discrete random variable X_{ij} over (Ω, p) such that

X_{ij} = 1 if h(K_i) = h(K_j) and X_{ij} = 0 otherwise.

Note that X_{jj} is always equal to 1. Thus,

P_{X_{ij}}(1) = 1 when i = j   and   P_{X_{ij}}(1) = Σ_{r=1}^{m} P(h(K_i) = r and h(K_j) = r) when i ≠ j.

Assuming that K is K_j, the number of comparisons is given by the random variable

NC_j = Σ_{i=1}^{j} X_{ij}.

Note that NC_j(Ω) = {1, . . . , j}. The values of NC_j correspond to the possible numbers of comparisons between the stored keys and K, assuming that K is K_j.
(4) Finally, the number of comparisons is given by the random variable NC such that

P_{NC}(nc) = Σ_{j=1}^{n} P_Y(j) P_{NC_j}(nc).

Assuming that the random variables X_{1j}, . . . , X_{jj} are independent,

G_{NC_j}(z) = Σ_{r=0}^{∞} P_{NC_j}(r) z^r = Π_{i=1}^{j} G_{X_{ij}}(z) = Π_{i=1}^{j} (P_{X_{ij}}(0) + P_{X_{ij}}(1) z).

Finally, the average number of comparisons is E(NC) = G′_{NC}(1). The variance V(NC) can also be computed as expected.
The following example illustrates the average case analysis assuming that every elementary event is equally likely.

Example 6.3.11 Assuming a uniform distribution, that is, a situation where every elementary event ω is equally likely:

- (Ω, p) is such that Ω = {1, . . . , m}^n × {1, . . . , n} and p(ω) = 1/(m^n × n) for each ω ∈ Ω, with m, n ∈ N.

- for each 1 ≤ j ≤ n,

P_Y(j) = m^n × (1/(m^n × n)) = 1/n.

- for 1 ≤ j ≤ n and 1 ≤ i ≤ j,
P_{X_{ij}}(1) = 1 when i = j   and   P_{X_{ij}}(1) = Σ_{r=1}^{m} (1/m)(1/m) = 1/m when i ≠ j,

and therefore

P_{X_{ij}}(0) = 1 − P_{X_{ij}}(1) = 0 when i = j   and   P_{X_{ij}}(0) = (m−1)/m when i ≠ j.
Moreover,

G_{X_{ij}}(z) = z when i = j   and   G_{X_{ij}}(z) = (m−1)/m + (1/m) z when i ≠ j,

so that, for each 1 ≤ j ≤ n,

G_{NC_j}(z) = z ((m−1)/m + (1/m) z)^{j−1}

and

G_{NC}(z) = Σ_{j=1}^{n} (1/n) z ((m−1)/m + (1/m) z)^{j−1} = (z/n) Σ_{j=1}^{n} ((m−1)/m + (1/m) z)^{j−1}.
The average case number of comparisons between the stored keys and the given key K is then

E(NC) = G′_{NC}(1) = (n − 1)/(2m) + 1.
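The formula E(NC) = (n − 1)/(2m) + 1 can be confirmed by assembling the coefficients of G_NC(z) = (z/n) Σ_j ((m−1)/m + z/m)^{j−1}. A Python sketch (ours, with sample values of m and n):

```python
from fractions import Fraction

def conv(p, q):
    # product of two coefficient lists
    return [sum(p[i] * q[k - i] for i in range(len(p)) if 0 <= k - i < len(q))
            for k in range(len(p) + len(q) - 1)]

def padd(p, q):
    # sum of coefficient lists of possibly different lengths
    L = max(len(p), len(q))
    p = p + [Fraction(0)] * (L - len(p))
    q = q + [Fraction(0)] * (L - len(q))
    return [a + b for a, b in zip(p, q)]

m, n = 3, 5
u = [Fraction(m - 1, m), Fraction(1, m)]   # (m-1)/m + z/m

g = [Fraction(0)]
power = [Fraction(1)]                       # u^(j-1)
for j in range(1, n + 1):
    g = padd(g, conv([Fraction(0), Fraction(1, n)], power))  # add (z/n) u^(j-1)
    power = conv(power, u)

mean = sum(k * c for k, c in enumerate(g))
print(mean)   # (n-1)/(2m) + 1 = 5/3
```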
6.3.2 Euclid's algorithm

We first discuss how generating functions can be used for solving recurrence relations. After presenting a first example, we consider the case of the Fibonacci sequence. We then return to the analysis of Euclid's algorithm already introduced in Section 6.1.2.
Consider the sequence t = {t_n}_{n∈N0} such that

t_0 = 0   and   t_n = 3t_{n−1} − 2 for n ≥ 1.   (6.20)

The goal is to use the generating function

G(z) = Σ_{i=0}^{∞} t_i z^i

for t to solve the recurrence relation (6.20), in order to get an expression for t_n that does not depend on other elements of the sequence. In the sequel, recall the notations and closed forms presented in Section 6.2. Let us proceed as follows.

(i) First of all note that

G(z) = Σ_{i=0}^{∞} t_i z^i
     = Σ_{i=1}^{∞} t_i z^i   (since t_0 = 0)
     = Σ_{i=1}^{∞} (3t_{i−1} − 2) z^i
     = 3 Σ_{i=1}^{∞} t_{i−1} z^i − 2 Σ_{i=1}^{∞} z^i
     = 3z Σ_{i=0}^{∞} t_i z^i − 2 (−1 + Σ_{i=0}^{∞} z^i)
     = 3zG(z) − 2 (−1 + 1/(1 − z))
     = 3zG(z) − 2z/(1 − z).

(ii) Solving

G(z) = 3zG(z) − 2z/(1 − z)

for G(z), we get

G(z) = −2z/((1 − z)(1 − 3z)).   (6.21)
(iii) Now, the goal is to expand the right hand side of (6.21) into a power series Σ_{i=0}^{∞} a_i z^i, since then the coefficient a_i is an expression for t_i, for each i ∈ N0.

(iii.1) On one hand, from

−2z/((1 − z)(1 − 3z)) = A/(1 − z) + B/(1 − 3z)

we get A(1 − 3z) + B(1 − z) = −2z. For z = 0 and z = −1 we get A + B = 0 and 4A + 2B = 2, respectively, and therefore A = 1 and B = −1. Thus,

−2z/((1 − z)(1 − 3z)) = 1/(1 − z) − 1/(1 − 3z).

(iii.2) On the other hand,

1/(1 − z) − 1/(1 − 3z) = Σ_{i=0}^{∞} z^i − Σ_{i=0}^{∞} 3^i z^i = Σ_{i=0}^{∞} (1 − 3^i) z^i.

(iv) From (ii) and (iii) we conclude that G(z) = Σ_{i=0}^{∞} (1 − 3^i) z^i and therefore t_n = 1 − 3^n for each n ∈ N0.
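The solution just obtained can be checked directly against the recurrence (6.20). A Python sketch (ours):

```python
def t_rec(n):
    # iterate t_0 = 0, t_k = 3 t_{k-1} - 2, as in (6.20)
    t = 0
    for _ in range(n):
        t = 3 * t - 2
    return t

# closed form from step (iv): t_n = 1 - 3^n
print([t_rec(n) for n in range(6)])                    # [0, -2, -8, -26, -80, -242]
print(all(t_rec(n) == 1 - 3 ** n for n in range(12)))  # True
```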
Consider now the Fibonacci sequence s = {s_n}_{n∈N0}, where

s_0 = 0,  s_1 = 1  and  s_n = s_{n−1} + s_{n−2}  for n ≥ 2.   (6.22)

Using the generating function

G(z) = Σ_{i=0}^{∞} s_i z^i

of the Fibonacci sequence we are able to solve the recurrence relation (6.22), thus
obtaining an expression for sn that does not depend on other elements of the
sequence. The following steps are similar to the ones presented in the previous
example.
(i) Note that

G(z) = Σ_{i=0}^{∞} s_i z^i
     = z + Σ_{i=2}^{∞} s_i z^i
     = z + Σ_{i=2}^{∞} (s_{i−1} + s_{i−2}) z^i
     = z + Σ_{i=2}^{∞} s_{i−1} z^i + Σ_{i=2}^{∞} s_{i−2} z^i
     = z + Σ_{i=0}^{∞} s_i z^{i+1} + Σ_{i=0}^{∞} s_i z^{i+2}   (recall that s_0 = 0)
     = z + z Σ_{i=0}^{∞} s_i z^i + z^2 Σ_{i=0}^{∞} s_i z^i
     = z + zG(z) + z^2 G(z).

(ii) Solving G(z) = z + zG(z) + z^2 G(z) for G(z), we get

G(z) = z/(1 − z − z^2).   (6.23)
(iii) The goal is to expand the right hand side of (6.23) into a power series Σ_{i=0}^{∞} a_i z^i, since then the coefficient a_i is the Fibonacci number s_i, for each i ∈ N0.

(iii.1) To make things easier we first rewrite it as follows. Note that the roots of 1 − z − z^2 are α_1 = (−1 + √5)/2 and α_2 = (−1 − √5)/2, and note that α_1 α_2 = −1. Hence,
1 − z − z^2 = −(z − α_1)(z − α_2) = −α_1 α_2 (1 − z/α_1)(1 − z/α_2) = (1 − β_1 z)(1 − β_2 z)

where

β_1 = 1/α_1 = 2/(−1 + √5) = (1 + √5)/2   and   β_2 = 1/α_2 = 2/(−1 − √5) = (1 − √5)/2.

From

z/((1 − β_1 z)(1 − β_2 z)) = A/(1 − β_1 z) + B/(1 − β_2 z)

we get A = 1/√5 and B = −1/√5, that is,

z/(1 − z − z^2) = (1/√5) (1/(1 − β_1 z) − 1/(1 − β_2 z)).
(iii.2) We now expand the right hand side of (6.23) into a power series Σ_{i=0}^{∞} a_i z^i as follows:

z/(1 − z − z^2) = (1/√5) (1/(1 − β_1 z) − 1/(1 − β_2 z))
              = (1/√5) ( Σ_{i=0}^{∞} (β_1 z)^i − Σ_{i=0}^{∞} (β_2 z)^i )
              = Σ_{i=0}^{∞} ((β_1^i − β_2^i)/√5) z^i.
(iv) From (ii) and (iii) we conclude that the generating function for the Fibonacci sequence is

G(z) = Σ_{i=0}^{∞} ((β_1^i − β_2^i)/√5) z^i

and therefore the Fibonacci sequence s = {s_n}_{n∈N0} is such that

s_n = (β_1^n − β_2^n)/√5 = (1/√5) ( ((1 + √5)/2)^n − ((1 − √5)/2)^n )

for all n ∈ N0.
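The closed form can be checked against the recurrence (6.22); rounding compensates for the floating point error in √5. A Python sketch (ours):

```python
def fib(n):
    # Fibonacci numbers by the recurrence (6.22)
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

sqrt5 = 5 ** 0.5
beta1 = (1 + sqrt5) / 2
beta2 = (1 - sqrt5) / 2

def binet(n):
    # the closed form s_n = (beta1^n - beta2^n) / sqrt(5)
    return round((beta1 ** n - beta2 ** n) / sqrt5)

print([binet(n) for n in range(10)])               # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
print(all(binet(n) == fib(n) for n in range(40)))  # True
```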
Many recurrence relations can be solved reasoning as above for the Fibonacci sequence. We do not develop this subject further herein. The interested reader is referred, for instance, to [31]. We just note that step (iii) may not be as easy as in the case of the Fibonacci sequence, and the following theorem, the rational expansion theorem for distinct roots, may be helpful. For simplicity, in the sequel we also use p to denote the polynomial function associated with a polynomial p in R[z].
Proposition 6.3.12 Let p = Σ_{i=0}^{n} p_i z^i and q = Σ_{i=0}^{m} q_i z^i be polynomials in R[z] such that deg(p) < deg(q) and

q = q_0 (1 − ρ_1 z) . . . (1 − ρ_m z)

where ρ_i ≠ 0 and ρ_i ≠ ρ_j for all 1 ≤ i, j ≤ m with i ≠ j. The function r = p/q is such that

r(z) = Σ_{i=0}^{∞} a_i z^i   with   a_i = Σ_{j=1}^{m} b_j ρ_j^i,

where

b_j = −ρ_j p(1/ρ_j) / q′(1/ρ_j)

for each 1 ≤ j ≤ m.
Σ_{i=0}^{∞} s_i z^i = p/q

where p and q are polynomials in R[z] such that deg(p) < k and deg(q) ≤ k.
Step: We have to prove that n ≥ s_{k+3} and m ≥ s_{k+2} when the evaluation of euclid[m,n] involves k + 1 ≥ 2 recursive calls.
Since there are at least two recursive calls, m ≥ 1. Therefore, the first recursive call is euclid[Mod[n,m],m]. Moreover, there are k ≥ 1 recursive calls involved in the evaluation of euclid[Mod[n,m],m]. Hence, by the induction hypothesis, m ≥ s_{k+2} and Mod[n,m] ≥ s_{k+1}. But

n = ⌊n/m⌋ m + Mod[n,m]

and, since m < n, the inequality ⌊n/m⌋ ≥ 1 holds and, as a consequence, ⌊n/m⌋ m ≥ m. We can then conclude that n ≥ m + Mod[n,m] ≥ s_{k+2} + s_{k+1} = s_{k+3}. QED
Recall that s_n = (1/√5) ( ((1 + √5)/2)^n − ((1 − √5)/2)^n ). We can then conclude that k ∈ O(log m), since s_k is approximately (1/√5) ((1 + √5)/2)^k for large integers k, given that |(1 − √5)/2| < 1. It can be proved that the function Mod involves O(log^2(m)) bit operations and therefore the evaluation of euclid[m,n] involves O(log^3(m)) bit operations.
Lamé's theorem establishes an upper bound for the number of recursive calls that take place when evaluating euclid[m,n], but it does not tell us how close the actual number of calls is to this upper bound. The following proposition states that the bound is essentially reached when evaluating euclid[s_k,s_{k+1}], that is, when computing the greatest common divisor of two consecutive Fibonacci numbers. In this case there are k − 1 recursive calls.
Proposition 6.3.17 Let s = {s_n}_{n∈N0} be the Fibonacci sequence. For each k ∈ N0 such that k ≥ 2, the evaluation of euclid[s_k,s_{k+1}] involves exactly k − 1 recursive calls.

Proof: We use induction on k ≥ 2.
6.4 Exercises
1. Consider the sequences s = {s_n}_{n∈N0} and t = {t_n}_{n∈N0}. Prove that

(a) aG_s(z) + bG_t(z) = Σ_{i=0}^{∞} (a s_i + b t_i) z^i.

(b) (1/(1 − z)) G_s(z) = Σ_{i=0}^{∞} ( Σ_{j=0}^{i} s_j ) z^i.
2. Find a closed form for the generating function for the sequence s = {sn }nN0
where
(a) sn = 1 if n is a multiple of 3 and sn = 0 otherwise.
(b) sn = 2 if n is odd and sn = 1 otherwise.
(c) sn = 2n + 1.
(d) sn = 5n + 5.
(e) sn = 3n + 2n .
(f) sn =
3
2
1
6
1
n
4. Find a closed form for the probability generating function of the random variable X that describes a loaded die such that the probability of seeing an even number of spots in a roll is half the probability of seeing an odd number.
5. Find the mean and variance of the random variables X in Exercise 3.
6. Find the mean and variance of the random variable X in Exercise 4.
7. Consider the hashing technique for storing and retrieving information. Let h be the hash function and let NC be the random variable corresponding to the number of comparisons between keys involved in an unsuccessful search for a key K. Compute the mean and the variance of NC when

(a) there are 2 lists and the probability that h(K) = 1 is 1/4.

(b) there are 3 lists and the probabilities that h(K) = 1 and that h(K) = 2 are p_1 and p_2, respectively.
8. Consider the hashing technique for storing and retrieving information. Let h be the hash function and let NC be the random variable corresponding to the number of comparisons between keys involved in a successful search
9. Consider the hashing technique for storing and retrieving information. Let h be the hash function and let NC be the random variable corresponding to the number of comparisons between keys involved in a successful search for a key K. Compute the mean and the variance of NC when there are 2 lists and the probability that h(K) = 1 is p_1.
10. Solve the following recurrence relations using generating functions:

(a) s = {s_n}_{n∈N0} where s_0 = 0 and s_n = 3s_{n−1} − 2 for n ≥ 1.
Bibliography
[1] W. Adams and P. Loustaunau. An Introduction to Grobner Bases, volume 3
of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 1994.
[2] A. V. Aho, J. E. Hopcroft, and J. D. Ullman. Data Structures and Algorithms. Addison-Wesley Series in Computer Science and Information Processing. Addison-Wesley, 1983.
[3] J. Baillieul and D. P. Martin. Resolution of kinematic redundancy. In Robotics, volume 41 of Proc. Sympos. Appl. Math., pages 49–89. Amer. Math. Soc., Providence, RI, 1990.
[4] L. Blum, M. Blum, and M. Shub. A simple unpredictable pseudo-random number generator. SIAM Journal on Computing, 15:364–383, 1986.
[5] M. Borges-Quintana, M. A. Borges-Trenard, P. Fitzpatrick, and E. Martínez-Moro. Gröbner bases and combinatorics for binary codes. Appl. Algebra Engrg. Comm. Comput., 19(5):393–411, 2008.
[6] M. Brickenstein, A. Dreyer, G.-M. Greuel, M. Wedler, and O. Wienand. New developments in the theory of Gröbner bases and applications to formal verification. J. Pure Appl. Algebra, 213(8):1612–1635, 2009.
[7] B. Buchberger. Ein Algorithmus zum Auffinden der Basiselemente des Restklassenringes nach einem nulldimensionalen Polynomideal. PhD thesis, University of Innsbruck, 1965.
[8] B. Buchberger. Ein algorithmisches Kriterium für die Lösbarkeit eines algebraischen Gleichungssystems. Aequationes Mathematicae, 4(3):374–383, 1970.
[9] R. Cori and D. Lascar. Mathematical logic. Oxford University Press, 2000.
A course with exercises. Part I.
[10] D. Cox, J. Little, and D. O'Shea. Ideals, Varieties and Algorithms. Springer,
third edition, 2007.
[11] B. A. Davey and H. A. Priestley. Introduction to Lattices and Order. Cambridge University Press, second edition, 2002.
[12] R. David and H. Alla. Discrete, Continuous, and Hybrid Petri Nets. Springer, 2005.
Subject Index
additive
inverse, 30
unity, 30
Bernoulli number, 142
binomial coefficient, 142
Buchberger
algorithm, 106
polynomial, 97
theorem, 98
Carmichael function, 29
Chinese remainder theorem, 36, 42
closed form
for generating function, 186
for summation, 137
congruence
modulo n, 24
relation, 26
coprime numbers, 22
degree
of monomial, 69
of polynomial, 69
of term, 70
discrete Fourier transform, 160
inverse, 163
division
of polynomials, 80
of terms, 78
divisor, 11
greatest common, 13
Euclid
algorithm, 14, 204
extended algorithm, 17, 33
lemma, 20
theorem, 22
Euler
phi function, 23
theorem, 27, 42
Euler-Maclaurin formula, 142
evaluation of polynomial, 71
ideal, 91
basis of, 92
finitely generated, 92
Gröbner basis, 95
proper, 92
set of generators, 92
insertion sort, 147
average-case analysis, 150
worst-case analysis, 148
integer division, 11
isomorphism of rings, 36
linear congruential sequence, 54
increment, 54
modulus, 54
seed, 54
modular congruence
application to
pseudo-random numbers, 54
public key cryptography, 10
monic
polynomial, 77
term, 70
monomial, 69
degree of, 69
graded lexicographic order, 75
lexicographic order, 73
order, 73
product, 72
multiple, 11
multiplicative
inverse, 31
unity, 31
multiplicative order, 28
number
Bernoulli, 142
coprime to, 22
harmonic, 150
prime, 20
pseudo-random, 51
order
graded lexicographic, 75
lexicographic, 73
perturbation technique, 140
polynomial, 69
Buchberger, 97
coefficient of, 69
degree of, 69
division, 80
evaluation of, 71
leading term, 77
monic, 77
multivariate, 70
point-value representation, 170
product, 72, 157
reduction, 87
reduction in one step, 84
ring, 73
sum, 72
symmetric, 72
term of, 70
univariate, 70
zero polynomial, 69
prime number, 20
factorization, 21
probability generating function, 191
product
of monomials, 72
product of rings, 34
quadratic residue modulo n, 29
quotient, 12
reduction of polynomials, 87
remainder, 11
ring, 30
additive inverse, 30
additive unity, 30
homomorphism, 35
isomorphism, 36
multiplicative unity, 31
multiplicative inverse, 31
of generating functions, 184
of polynomials, 73
product, 34
unitary, 31
roots of unity, 159
RSA cryptosystem, 10, 42
summation, 127, 130
additivity of indices, 134
application to
analysis of Gaussian elimination technique, 144
analysis of insertion sort algorithm, 147
associativity, 134
change of variable, 134
closed form, 137
constant, 134
distributivity, 134
members of arithmetic progression, 138
members of geometric progression, 139
perturbation technique, 140
systems of linear congruences, 38
term, 70
coefficient of, 70
degree of, 70
division, 78
leading term, 77
least common multiple, 97
monic, 70
monomial of, 70
zero term, 70
theorem
Buchberger, 98
Chinese remainder theorem, 36, 42
Euclid, 22
Euler, 27, 42
fundamental theorem of arithmetic, 20
Vandermonde matrix, 161
Table of Symbols
+n, 27, 30
·n, 27, 30
−n, 27, 30
=n, 24
>glx, 75
>lx, 73
B(p1, p2), 97
Bk, 142
Gs(z), 181
Mx1,...,xn, 69
[a]n, 27
Buch(G), 105
deg(p), 69
Σ_{k=a}^{b} f(k), 127
Σ_{k=d1}^{d2} f(k), 127
Σ_{k∈A} f(k), 127
0, 70
C[x1, . . . , xn], 69
mod(n, m), 12
e, 130
gcd(m, n), 13
lcm(t1, t2), 97
⌊x⌋, 12
lt(p), 77
Zn, 26, 30, 32
euclid(m, n), 14
exteuclid(m, n), 17
DFTn(a0, a1, . . . , an−1), 160
DFTn^−1(a0, a1, . . . , an−1), 161
FFT, 165
m | n, 11
p →D p′, 87
p →d p′, 84
zn^k, 159