ECE750 F2008 Algorithms1

ECE750 Lecture 1: Asymptotics, Resources, & Complexity Classes
Todd Veldhuizen
tveldhui@acm.org
Electrical & Computer Engineering
University of Waterloo
Canada
Sept. 11, 2008
Part I
Asymptotics
Asymptotics: Motivation I
- We want to choose the best algorithm or data structure for the job.
- Need characterizations of resource use, e.g., time, space; for circuits: area, depth.
- Many, many approaches:
  - Worst Case Execution Time (WCET): for hard real-time applications
  - Exact measurements for a specific problem size, e.g., number of gates in a 64-bit addition circuit.
  - Performance models, e.g., $R_\infty$, $n_{1/2}$ for latency-throughput, HINT curves for linear algebra (characterize performance through different cache regimes), etc.
  - ...
Asymptotics: Motivation II
- We will focus on asymptotic analysis: a good first approximation of performance that describes behaviour on big problems.
- Reasonably independent of:
  - Machine details (e.g., 2 cycles for add+mult vs. 1 cycle)
  - Clock speed, programming language, compiler, etc.
Asymptotics: Brief history
- Basic ideas originated in Paul du Bois-Reymond's Infinitärcalcül (calculus of infinities), developed in the 1870s.
- G. H. Hardy greatly expanded on Paul du Bois-Reymond's ideas in his monograph Orders of Infinity (1910) [5].
- The big-O notation was first used by Bachmann (1894), and popularized by Landau (hence sometimes called Landau notation).
- Adopted by computer scientists [6] to characterize resource consumption, independent of small machine differences, languages, compilers, etc.
Basic asymptotic notations I
Asymptotic means behaviour as $n \to \infty$, where for our purposes $n$ is the problem size.
Three basic notations:
- $f \sim g$ ($f$ and $g$ are asymptotically equivalent)
- $f \preceq g$ ($f$ is asymptotically dominated by $g$)
- $f \asymp g$ ($f$ and $g$ are asymptotically bound to each other)
Basic asymptotic notations II
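- $f \sim g$ means $\lim_{n \to \infty} \frac{f(n)}{g(n)} = 1$
- Example: $3x^2 + 2x + 1 \sim 3x^2$.
- $\sim$ is an equivalence relation:
  - Transitive: $(x \sim y) \wedge (y \sim z) \Rightarrow (x \sim z)$
  - Reflexive: $x \sim x$
  - Symmetric: $(x \sim y) \Rightarrow (y \sim x)$.
- Basic idea: We only care about the leading term, disregarding less quickly-growing terms.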
Basic asymptotic notations III
- $f \preceq g$ means $\limsup_{n \to \infty} \frac{f(n)}{g(n)} < \infty$, i.e., $\frac{f(n)}{g(n)}$ is eventually bounded by a finite value.
  (limsup is the largest value a function comes arbitrarily close to infinitely often. For example, $\lim_{x \to \infty} \cos x$ does not exist, but $\limsup_{x \to \infty} \cos x = 1$.)
- Basic idea: $f$ grows more slowly than $g$, or just as quickly as $g$.
- $\preceq$ is a preorder (or quasiorder):
  - Transitive: $(f \preceq g) \wedge (g \preceq h) \Rightarrow (f \preceq h)$.
  - Reflexive: $f \preceq f$
Basic asymptotic notations IV
- $\preceq$ fails to be a partial order because it is not antisymmetric: there are functions $f$, $g$ where $f \preceq g$ and $g \preceq f$ but $f \neq g$. Example:
  $f(n) = 2^{n + \cos n}$
  $g(n) = 2^n$
  Here $\frac{f(n)}{g(n)} = 2^{\cos n}$, which oscillates between $\frac{1}{2}$ and $2$.
- Variant: $g \succeq f$ means $f \preceq g$.
Basic asymptotic notations V
- Write $f \asymp g$ when there are positive constants $c_1$, $c_2$ such that
  $c_1 \le \frac{f(n)}{g(n)} \le c_2$
  for sufficiently large $n$.
- Examples:
  - $n \asymp 2n$
  - $n \asymp (2 + \sin n)\, n$
- $\asymp$ is an equivalence relation.
Strict forms
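- Write $f \prec g$ when $f \preceq g$ but $f \not\asymp g$.
- Basic idea: $f$ grows strictly less quickly than $g$.
- Equivalent, whenever the limit exists: $f \prec g$ exactly when $\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$.
- Examples:
  - $x^2 \prec x^3$
  - $\log x \prec x$
- Variant: $f \succ g$ means $g \prec f$.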
Orders of growth I
We can use $\prec$ as a ruler by which to judge the growth of functions. Some common tick marks on this ruler are:
$\log \log n \prec \log n \prec \log^k n \prec \sqrt{n} \prec n \prec n^2 \prec 2^n \prec n! \prec n^n \prec 2^{2^n}$
We can always find in $\prec$ a dense total order without endpoints, i.e.,
- There is no slowest-growing function;
- There is no fastest-growing function;
- If $f \prec h$ we can always find a $g$ such that $f \prec g \prec h$.
(The canonical example of a dense total order without endpoints is $\mathbb{Q}$, the rationals.)
- This fact allows us to sketch graphs in which points on the axes are asymptotes.
- Cantor proved that any countable, dense, total order without endpoints is isomorphic to $(\mathbb{Q}, \le)$, i.e., the rationals under the usual ordering.
Orders of growth II
- This means we can put suitably defined sets of asymptotes in a one-to-one correspondence with the rationals.
- Hence we are justified in drawing graphs where axes are asymptotes; this gives us an intuitive way to visualize how, e.g., the behaviour of an algorithm changes as we vary parameters.
Big-O notations I
Big-O is a convenient family of notations for asymptotics:
$O(g) \equiv \{ f : f \preceq g \}$
i.e., $O(g)$ is the set of functions $f$ such that $f \preceq g$.
- $O(n^2)$ contains $n^2$, $7n^2$, $n$, $\log n$, $n^{3/2}$, $5$, ...
- Note that $f \in O(g)$ means exactly $f \preceq g$.
- A standard abuse of notation is to treat a big-O expression as if it were a term:
  $x^2 + \underbrace{2x^{1/2} + 1}_{\preceq\, x^{1/2}} = x^2 + O(x^{1/2})$
  The above equation can be read as "there exists a function $f \in O(x^{1/2})$ such that $x^2 + 2x^{1/2} + 1 = x^2 + f(x)$."
Big-O notations II
- Big-O notation is an excellent tool for expressing machine/compiler/language-independent complexity properties.
  - On one machine a sorting algorithm might take $\approx 5.73\, n \log n$ seconds, on another it might take $\approx 9.42\, n \log n + 3.2\, n$ seconds.
  - We can wave these differences aside by saying the algorithm runs in $O(n \log n)$ seconds.
- $O(f(n))$ means something that behaves asymptotically like $f(n)$:
  - Disregarding any initial transient behaviour;
  - Disregarding any multiplicative constants $c \cdot f(n)$;
  - Disregarding any additive terms that grow less quickly than $f(n)$.
Basic properties of big-O notation I
Given a choice between a sorting algorithm that runs in $O(n^2)$ time and one that runs in $O(n \log n)$ time, which should we choose?
1. Gut instinct: the $O(n \log n)$ one, of course!
2. But: note that the class of functions $O(n^2)$ also contains $n \log n$. Just because we say an algorithm is $O(n^2)$ does not mean it takes $n^2$ time!
3. It could be that the $O(n^2)$ algorithm is faster than the $O(n \log n)$ one.
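A small sketch of point 3, under made-up cost models: the constants 2 and 100 below are hypothetical and chosen only to illustrate that constant factors hidden by big-O can make a quadratic algorithm the cheaper choice on small inputs.

```python
import math


def cost_quadratic(n: int) -> float:
    """Hypothetical cost model of an O(n^2) algorithm with a small constant."""
    return 2.0 * n * n


def cost_nlogn(n: int) -> float:
    """Hypothetical cost model of an O(n log n) algorithm with a large constant."""
    return 100.0 * n * math.log2(n)


def crossover(limit: int = 10**6) -> int:
    """Smallest n >= 2 at which the n log n model becomes the cheaper one."""
    for n in range(2, limit):
        if cost_nlogn(n) < cost_quadratic(n):
            return n
    raise ValueError("no crossover below limit")


if __name__ == "__main__":
    n0 = crossover()
    print(f"the quadratic model is cheaper for 2 <= n < {n0}")
```

With these assumed constants the quadratic model wins for every input size below the crossover, which is in the hundreds; the asymptotic ranking only takes over beyond that point.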
Additional notations I
To distinguish between "at most this fast," "at least this fast," etc., there are additional big-O-like notations:
    $f \in O(g)$         $f \preceq g$    upper bound
    $f \in o(g)$         $f \prec g$      strict upper bound
    $f \in \Theta(g)$    $f \asymp g$     tight bound
    $f \in \Omega(g)$    $f \succeq g$    lower bound
    $f \in \omega(g)$    $f \succ g$      strict lower bound
Tricks for a bad remembering day
- Lower case means strict:
  - $o(n)$ is the strict version of $O(n)$
  - $\omega(n)$ is the strict version of $\Omega(n)$
- $\omega$, $\Omega$ (omega) is the last letter of the Greek alphabet: if $f \in \omega(g)$ then $f$ comes after $g$ in asymptotic ordering.
- $f \in \Theta(g)$: the line through the middle of the theta; the asymptotes converge to one another.
Notation: o()
- $f \in o(g)$ means $f \prec g$
- $o(\cdot)$ expresses a strict upper bound.
- If $f(n)$ is $o(g(n))$, then $f$ grows strictly slower than $g$.
- Example: $\sum_{k=0}^{n} 2^{-k} = 2 - \frac{1}{2^n} = 2 + o(1)$
  - $o(1)$ indicates the class of functions $g$ for which $\lim_{n \to \infty} \frac{g(n)}{1} = 0$, which means $\lim_{n \to \infty} g(n) = 0$.
  - $2 + o(1)$ means "2 plus something that vanishes as $n \to \infty$."
- If $f$ is $o(g)$, it is also $O(g)$.
- $n! = o(n^n)$.
Notation: ω()
- $f \in \omega(g)$ means $f \succ g$
- $\omega(\cdot)$ expresses a strict lower bound.
- If $f(n)$ is $\omega(g(n))$, then $f$ grows strictly faster than $g$.
- $f \in \omega(g)$ is equivalent to $g \in o(f)$.
- Example: Harmonic series
  $h_n = \sum_{k=1}^{n} \frac{1}{k} = \ln n + \gamma + O(n^{-1})$
  - $h_n \in \omega(1)$ (It is unbounded.)
  - $h_n \in \omega(\ln \ln n)$
- $n! = \omega(2^n)$ (grows faster than $2^n$)
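As a quick numerical sanity check of the harmonic-series expansion, the sketch below scales the error h_n - ln n - γ by n; if the error term really is O(1/n), the scaled value should stay bounded. The function names are illustrative, and the value of the Euler-Mascheroni constant γ is hard-coded.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant (the gamma in the expansion)


def harmonic(n: int) -> float:
    """h_n = 1/1 + 1/2 + ... + 1/n."""
    return math.fsum(1.0 / k for k in range(1, n + 1))


def scaled_error(n: int) -> float:
    """n * (h_n - ln n - gamma); stays bounded if the error term is O(1/n)."""
    return n * (harmonic(n) - math.log(n) - EULER_GAMMA)


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, harmonic(n), scaled_error(n))
```

The printed h_n keeps growing (it is ω(1)), while the scaled error settles just below one half.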
Notation: Ω()
- $f \in \Omega(g)$ means $f \succeq g$
- $\Omega(\cdot)$ expresses a lower bound, not necessarily strict.
- If $f(n)$ is $\Omega(g(n))$, then $f$ grows at least as fast as $g$.
- $f \in \Omega(g)$ is equivalent to $g \in O(f)$.
- Example: Matrix multiplication requires $\Omega(n^2)$ time. (At least enough time to look at each of the $n^2$ entries in the matrices.)
Notation: Θ()
- $f \in \Theta(g)$ means $f \asymp g$
- $\Theta(\cdot)$ expresses a tight asymptotic bound.
- If $f(n)$ is $\Theta(g(n))$, then $f(n)/g(n)$ is eventually contained in a finite positive interval $[c_1, c_2]$.
- $\Theta(\cdot)$ bounds are very precise, but often hard to obtain.
- Example: QuickSort runs in time $\Theta(n \log n)$ on average. (Tight! Not much faster or slower!)
- Example: Stirling's approximation $\ln n! = n \ln n - n + O(\ln n)$ implies that $\ln n!$ is $\Theta(n \ln n)$.
- Don't make the mistake of thinking that $f \in \Theta(g)$ means $\lim_{n \to \infty} \frac{f(n)}{g(n)} = k$ for some constant $k$. The limit need not exist: $(2 + \sin n)\, n \in \Theta(n)$, but the ratio keeps oscillating.
Algebraic manipulations of big-O
- Manipulating big-O terms requires some thought: always keep in mind what the symbols mean!
- An additive $O(f(n))$ term swallows any terms that are $\preceq f(n)$:
  $n^2 + n^{1/2} + O(n) + 3 = n^2 + O(n)$
  The $n^{1/2}$ and $3$ on the l.h.s. are meaningless in the presence of an $O(n)$ term.
- $O(f(n)) - O(f(n)) = O(f(n))$, not 0!
- $O(f(n)) \cdot O(g(n)) = O(f(n) g(n))$.
- Example: What is $\ln n + \gamma + O(n^{-1})$ times $n + O(n^{1/2})$?
  $\left( \ln n + \gamma + O(n^{-1}) \right) \left( n + O(n^{1/2}) \right) = n \ln n + \gamma n + O(n^{1/2} \ln n)$
  The terms $\gamma O(n^{1/2})$, $O(n^{-1/2})$, $O(1)$, etc. get swallowed by $O(n^{1/2} \ln n)$.
Sharpness of estimates
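Example: for a constant $c \neq 0$,
$\ln(n + c) = \ln\left( n \left( 1 + \frac{c}{n} \right) \right)$
$= \ln n + \ln\left( 1 + \frac{c}{n} \right)$
$= \ln n + \frac{c}{n} - \frac{c^2}{2n^2} + \cdots$ (Maclaurin series)
$= \ln n + \Theta\left( \frac{1}{n} \right)$
It is also correct to write
$\ln(n + c) = \ln n + O(n^{-1})$
$\ln(n + c) = \ln n + o(1)$
since $\Theta(n^{-1}) \subseteq O(\frac{1}{n}) \subseteq o(1)$. However, the $\Theta(\frac{1}{n})$ error term is sharper: a better estimate of the error.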
Sharpness of estimates & The Riemann Hypothesis
Example: let $\pi(n)$ be the number of prime numbers $\le n$. The Prime Number Theorem is that
$\pi(n) \sim \mathrm{Li}(n)$    (1)
where $\mathrm{Li}(n) = \int_{x=2}^{n} \frac{1}{\ln x}\, dx$ is the logarithmic integral, and $\mathrm{Li}(n) \sim \frac{n}{\ln n}$.
Note that (1) is equivalent to:
$\pi(n) = \mathrm{Li}(n) + o(\mathrm{Li}(n))$
It is known that the error term can be improved, for example to
$\pi(n) = \mathrm{Li}(n) + O\left( \frac{n}{\ln n} e^{-a \sqrt{\ln n}} \right)$
Sharpness of estimates & The Riemann Hypothesis
The famous Riemann hypothesis is the conjecture that a sharper error estimate is true:
$\pi(n) = \mathrm{Li}(n) + O(n^{1/2} \ln n)$
This is one of the Clay Institute millennium problems, with a $1,000,000 reward for a positive proof. Sharp estimates matter!
To maintain sharpness of asymptotic estimates during analysis, some caution is required.
E.g., if $f(n) = 2^n + O(n)$, what is $\log f(n)$? (Logarithms here are base 2.)
- Bad answer: $\log f(n) = n + o(n)$.
- More careful answer:
  $\log f(n) = \log(2^n + O(n))$
  $= \log(2^n (1 + O(n 2^{-n})))$
  $= \log(2^n) + \log(1 + O(n 2^{-n}))$
  Since $\log(1 + \delta(n)) \in O(\delta(n))$ if $\delta \in o(1)$,
  $\log f(n) = n + O(n 2^{-n})$
  i.e., $\log f(n)$ is equal to $n$ plus some value converging exponentially fast to 0.
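The careful answer can be checked numerically. The sketch below picks one concrete, hypothetical member of 2^n + O(n), namely f(n) = 2^n + 5n, and compares log2 f(n) - n against n·2^(-n); the choice of 5n is an assumption made only for illustration.

```python
import math


def f(n: int) -> int:
    """A hypothetical member of 2^n + O(n): here f(n) = 2^n + 5n."""
    return 2**n + 5 * n


def log_error(n: int) -> float:
    """log2 f(n) - n, the quantity claimed to be O(n * 2^-n)."""
    return math.log2(f(n)) - n


if __name__ == "__main__":
    for n in (10, 20, 30, 40):
        bound = n * 2.0**-n
        print(n, log_error(n), log_error(n) / bound)
```

The ratio in the last column stays bounded as n grows, which is what the O(n 2^-n) error term asserts for this particular f.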
$\log f(n) = n + O(n 2^{-n})$
is a reasonably sharp estimate (but, what happens if we take $2^{\log f(n)}$ with this estimate?)
If we don't care about the rate of convergence we can write
$\log f(n) = n + o(1)$
where $o(1)$ represents some function converging to zero. This is less sharp since we have lost the rate of convergence.
Even less sharp is
$\log f(n) \sim n$
which loses the idea that $\log f(n) - n \to 0$, and doesn't rule out things like $\log f(n) = n + n^{3/4}$.
Asymptotic expansions I
An asymptotic expansion of a function describes how that function behaves for large values. Often it is used when an explicit description of the function is too messy or hard to derive.
E.g., if I choose a string of $n$ bits uniformly at random (i.e., each of the $2^n$ possible strings has probability $2^{-n}$), what is the probability of getting $\ge \frac{3}{4} n$ 1s?
Easy to write the answer: there are $\binom{n}{k}$ ways of arranging $k$ 1s, so the probability of getting $\ge \frac{3}{4} n$ 1s is:
$P(n) = \sum_{k = \frac{3}{4} n}^{n} 2^{-n} \binom{n}{k}$
This equation is both exact and wholly uninformative.
Can we do better? Yes!
Asymptotic expansions II
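The number of 1s in a random bit string follows a binomial distribution and is well-approximated by the normal distribution as $n \to \infty$:
$\sum_{k = \frac{1}{2} n + \alpha \sqrt{n}}^{n} 2^{-n} \binom{n}{k} \sim \int_{x = \alpha}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}\, dx = 1 - F(\alpha)$
where $F(x) = \frac{1}{2} \left( 1 + \mathrm{erf}\left( \frac{x}{\sqrt{2}} \right) \right)$ is the cumulative normal distribution.
Maple's asympt command yields the asymptotic expansion:
$F(x) \sim 1 - O\left( \frac{1}{x e^{x^2/2}} \right)$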
Asymptotic expansions III
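We want to estimate the probability of $\ge \frac{3}{4} n$ 1s:
$\frac{1}{2} n + \alpha \sqrt{n} = \frac{3}{4} n$
gives $\alpha = \frac{\sqrt{n}}{4}$. Therefore the probability is
$P(n) \sim 1 - F\left( \frac{\sqrt{n}}{4} \right) \sim 1 - 1 + O\left( \frac{1}{\sqrt{n}\, e^{n/32}} \right) = O\left( \frac{1}{\sqrt{n}\, e^{n/32}} \right)$
So, the probability of having more than $\frac{3}{4} n$ 1s converges to 0 exponentially fast.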
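For small n the exact sum P(n) is cheap to evaluate, so the exponential decay can be observed directly. The sketch below computes P(n) with exact rational arithmetic and prints it next to e^(-n/32), the exponential factor from the estimate above; rounding 3n/4 up to an integer is an assumption for values of n that are not multiples of 4.

```python
from fractions import Fraction
from math import comb, exp


def tail_probability(n: int) -> Fraction:
    """Exact P(n): probability that a uniform random n-bit string has >= 3n/4 ones."""
    k_min = -(-3 * n // 4)  # ceil(3n/4) using integer arithmetic
    favourable = sum(comb(n, k) for k in range(k_min, n + 1))
    return Fraction(favourable, 2**n)


if __name__ == "__main__":
    for n in (8, 16, 32, 64, 128):
        print(n, float(tail_probability(n)), exp(-n / 32))
```

In this range the exact probability stays below e^(-n/32) and shrinks rapidly with n, consistent with the conclusion that P(n) vanishes exponentially fast.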
Asymptotic Expansions
- When taking an asymptotic expansion, one writes
  $\ln n! \sim n \ln n - n + O(\ln n)$
  rather than
  $\ln n! = n \ln n - n + O(\ln n)$
  Writing $\sim$ is a clue to the reader that an asymptotic expansion is being taken, rather than just carrying an error term around.
- Asymptotic expansions are very important in average-case analysis, where we are interested in characterizing how an algorithm performs for most inputs.
- To prove an algorithm runs in $O(f(n))$ on average, one technique is to obtain an asymptotic estimate of the probability of running in time $\succ f(n)$, and show it converges to zero very quickly.
Another asymptotic expansion example I
How does $\log_2 \binom{n}{n/2}$ behave as $n \to \infty$?
    % maple
    ...
    > asympt(log[2](n!/(n/2)!/(n/2)!),n);
    bytes used=4000224, alloc=3472772, time=1.35

                1/2
               2
           ln(-----) - 1/2 ln(n)
                1/2
              Pi                         1                 1           1
      n + --------------------- - 1/4 ------- + 1/24 -------- + O(----)
                  ln(2)               ln(2) n                3       5
                                                      ln(2) n       n

So, $\log_2 \binom{n}{n/2} \sim n - O(\ln n)$.
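The leading terms of the Maple expansion can be compared with the exact value for moderate n. The sketch below uses Python's exact integer binomial; the helper names are illustrative, and only the first terms of the expansion (n, the constant, and the -1/2 log2 n term) are included.

```python
import math


def log2_central_binomial(n: int) -> float:
    """log2 of C(n, n/2) for even n, from the exact integer binomial coefficient."""
    return math.log2(math.comb(n, n // 2))


def leading_terms(n: int) -> float:
    """n + log2(sqrt(2/pi)) - (1/2) log2 n: the first terms of the expansion."""
    return n + math.log2(math.sqrt(2 / math.pi)) - 0.5 * math.log2(n)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        exact = log2_central_binomial(n)
        print(n, exact, leading_terms(n), exact - leading_terms(n))
```

The difference in the last column shrinks roughly like 1/n, matching the next term of the expansion.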
Example: Asymptotic Expansions for Average-Case Analysis I
- The time required to add two $n$-bit integers by a no carry adder is proportional to the longest carry sequence.
- It can be shown that the probability of having a carry sequence of length $\ge t(n)$ satisfies
  $\Pr(\text{carry sequence} \ge t(n)) \le 2^{-t(n) + \log n + O(1)}$
- If $t(n) \succ \log n$, the probability converges to 0. We can conclude that the average running time is $O(\log n)$.
- In fact we can make a stronger statement:
  $\Pr(\text{carry sequence} \ge \log n + \omega(1)) \to 0$
  Translation: "The probability of having a carry sequence longer than $\log n + \delta(n)$, where $\delta(n)$ is any unbounded function, converges to zero."
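A small simulation can make the O(log n) average plausible. The sketch below assumes one common reading of "carry sequence": a carry is generated where both input bits are 1 and then travels through consecutive positions where exactly one input bit is 1. The function names, the trial count, and the fixed seed are all illustrative choices.

```python
import random


def longest_carry_chain(a: int, b: int, n: int) -> int:
    """Longest carry chain when adding the n-bit integers a and b."""
    longest = run = 0
    for i in range(n):
        x, y = (a >> i) & 1, (b >> i) & 1
        if x & y:
            run = 1            # generate: a new carry starts at this position
        elif (x ^ y) and run:
            run += 1           # propagate: the carry travels one more position
        else:
            run = 0            # both bits 0, or no incoming carry to propagate
        longest = max(longest, run)
    return longest


def average_longest_chain(n: int, trials: int = 2000, seed: int = 0) -> float:
    """Average longest chain over random pairs of n-bit operands."""
    rng = random.Random(seed)
    total = sum(
        longest_carry_chain(rng.getrandbits(n), rng.getrandbits(n), n)
        for _ in range(trials)
    )
    return total / trials


if __name__ == "__main__":
    for n in (16, 64, 256, 1024):
        print(n, average_longest_chain(n))
```

Each time n is multiplied by 4, the printed average should grow by roughly a constant amount rather than by a constant factor, which is the signature of logarithmic growth.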
The Taylor series method of asymptotic expansion I
- This is a very simple method for asymptotic expansion that works for simple cases; it is one technique Maple's asympt function uses.
- Recall that the Taylor series of a $C^\infty$ function [continuous derivatives of all orders exist] about $x = 0$ is given by:
  $f(x) = f(0) + x f'(0) + \frac{x^2}{2!} f''(0) + \frac{x^3}{3!} f'''(0) + \cdots$
- To obtain an asymptotic expansion of some function $F(n)$ as $n \to \infty$,
  1. Substitute $n = x^{-1}$ into $F(n)$. (Then $n \to \infty$ as $x \to 0$.)
  2. Take a Taylor series about $x = 0$.
  3. Substitute $x = n^{-1}$.
The Taylor series method of asymptotic expansion II
  4. Use the dominating term(s) as the expansion, and the next term as the error term.
Example expansion: $F(n) = e^{1 + \frac{1}{n}}$.
Obviously $\lim_{n \to \infty} F(n) = e$, so we expect something of the form $F(n) \sim e + o(1)$.
1. Substitute $n = x^{-1}$ into $F(n)$: obtain $F(x^{-1}) = e^{1 + x}$
2. Taylor series about $x = 0$:
   $e^{1 + x} = e + x e + \frac{x^2}{2} e + \frac{x^3}{6} e + \cdots$
3. Substitute $x = n^{-1}$:
   $= e + \frac{e}{n} + \frac{1}{2n^2} e + \frac{1}{6n^3} e + \cdots$
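The Taylor series method of asymptotic expansion III
4. Since $e \succ \frac{1}{n} e \succ \frac{1}{2n^2} e \succ \cdots$,
   $F(n) \sim e + \Theta\left( \frac{1}{n} \right)$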
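The result of the example can be checked numerically: if F(n) = e + e/n + O(1/n^2), then n·(F(n) - e) should approach e. The sketch below is only a floating-point sanity check of that claim, not a symbolic derivation, and the helper name scaled_remainder is an illustrative choice.

```python
import math


def F(n: float) -> float:
    """The example function F(n) = e^(1 + 1/n)."""
    return math.exp(1.0 + 1.0 / n)


def scaled_remainder(n: float) -> float:
    """n * (F(n) - e); tends to e if F(n) = e + e/n + O(1/n^2)."""
    return n * (F(n) - math.e)


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, F(n), scaled_remainder(n))
```

The last column converges to e, so the error term is of order exactly 1/n, in line with the Θ(1/n) in step 4.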
Asymptotics of algorithms
Asymptotics is a key tool for algorithms and data structures:
- Analyze algorithms/data structures to obtain sharp estimates of asymptotic resource consumption (e.g., time, space)
- Possibly use asymptotic expansions in the analysis to estimate, e.g., probabilities
- Use these resource estimates to
  - Decide which algorithm/data structure is best according to design criteria
  - Reason about the performance of compositions (combinations) of algorithms and data structures.
References on asymptotics
- Course text: [1], "Asymptotic notations"
- Concrete Mathematics, Ronald L. Graham, Donald E. Knuth and Oren Patashnik, Ch. 9, "Asymptotics" [4]
- Advanced:
  - Shackell, Symbolic Asymptotics [9]
  - Hardy, Orders of Infinity [5]
  - Lightstone + Robinson, Nonarchimedean fields and asymptotic expansions [7]
Part II
Resource Consumption
Resource Consumption
To decide which algorithm or data structure to use, we are interested in their resource consumption. Depending on the problem context, we might be concerned with:
- Time and space consumption
- For logic circuits:
  - Number of gates
  - Depth
  - Area
  - Heat production
- For parallel/distributed computing:
  - Number of processors
  - Amount of communication required
  - Parallel running time
- For randomized algorithms:
  - Number of random bits used
  - Error probability
Machine models I
- The performance of an algorithm must always be analyzed with reference to some machine model that defines:
  - The basic operations supported (e.g., random-access memory; arithmetic; obtaining a random bit; etc.)
  - The resource cost of each operation.
- Some common machine models:
  - Turing machine (TM): very primitive, tape-based, used for theoretical arguments only;
  - Nondeterministic Turing machine: a TM that can effectively fork its execution at each step, so that after $t$ steps it can behave as if it were a parallel machine with, e.g., $2^t$ processors;
  - RAM (Random Access Machine) is a model that corresponds more-or-less to an everyday single-CPU desktop machine, but with infinite memory;
Machine models II
  - PRAM and LogP [2, 3] are popular models for parallel computing.
- The performance of an algorithm can change drastically when you change machine models. E.g., many problems believed to take exponential time (assuming $P \neq NP$) on a RAM can be solved in polynomial time on a nondeterministic TM.
- Often there are generic results that let you translate resource bounds on one machine model to another:
  - An algorithm taking time $T(n)$ and space $S(n)$ on a Turing machine can be simulated in $O(T(n) \log \log S(n))$ time by a RAM;
  - An algorithm taking time $T(n)$ and space $S(n)$ on a RAM can be simulated in $O(T^3(n) (S(n) + T(n))^2)$ time by a Turing machine.
- Unless otherwise stated, people are usually referring to a RAM or similar machine model.
Machine models III
- When you are analyzing an algorithm, know your machine model.
- There are embarrassing papers in the literature in which nonspecialists have "proven" outlandish complexity results by making basic mistakes,
  - e.g., assuming that arbitrary precision real numbers can be stored in $O(1)$ space and multiplied, added, etc. in $O(1)$ time. On realistic sequential (nonparallel) machine models, $d$-digit real numbers take:
    - $O(d)$ space
    - $O(d)$ time to add
    - $O(d \log d)$ time to multiply
Example of time and space complexity
- Let's compare three containers for storing values: list, tree, sorted array. Let $n$ be the number of elements stored.
- Average-case complexity (on a RAM) is:

                       Space          Search time          Insert time
      List             $\Theta(n)$    $\Theta(n)$          $\Theta(1)$
      Balanced tree    $\Theta(n)$    $\Theta(\log n)$     $\Theta(\log n)$
      Sorted array     $\Theta(n)$    $\Theta(\log n)$     $\Theta(n)$

- If search time is important: since $\log n \prec n$, a balanced tree or sorted array will be faster than a list for sufficiently large $n$.
- If insert time is important: use a list or balanced tree.
- Caveat: asymptotic performance says nothing about performance for small cases.
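The search-time column can be illustrated by counting probes instead of measuring wall-clock time. The sketch below uses a plain Python list both as the unsorted "list" container and as the "sorted array"; the step-counting helpers are illustrative and not part of any library.

```python
def linear_search_steps(items, target):
    """Number of elements examined by a left-to-right scan (list container)."""
    for steps, value in enumerate(items, start=1):
        if value == target:
            return steps
    return len(items)


def binary_search_steps(sorted_items, target):
    """Number of probes made by binary search (sorted-array container)."""
    lo, hi, steps = 0, len(sorted_items), 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        if sorted_items[mid] == target:
            return steps
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return steps


if __name__ == "__main__":
    for n in (16, 1024, 65536):
        data = list(range(n))
        print(n, linear_search_steps(data, n - 1), binary_search_steps(data, n - 1))
```

Searching for the last element costs n steps with the scan but only about log2 n probes with binary search, which is the Θ(n) versus Θ(log n) gap in the table.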
Example: Circuit complexity
- In circuit complexity, we do not analyze programs per se, but a family of circuits, one for each problem size (e.g., addition circuits for $n$-bit integers).
- Circuits are built from basic gates. The most realistic model is gates that have finite fan-in and fan-out, i.e., gates have 2 inputs and output signals can be fed into at most $k$ inputs.
- Common resource measures are:
  - time (i.e., delay, circuit depth)
  - number of gates (or cells, for VLSI)
  - fan-out
  - area
- E.g., addition circuits:

      Adder type               Gates           Depth
      Ripple-carry adder       $7n$            $2n$
      Carry-skip (1L)          $8n$            $4\sqrt{n}$
      Carry lookahead          $14n$           $4 \log n$
      Conditional-sum adder    $3n \log n$     $2 \log n$
Resource consumption tradeoffs I
Often there are tradeoffs between consumption of resources.
Example: Testing whether a number is prime. The Miller-Rabin test takes time $\Theta(k \log^3 n)$ and has probability of error $4^{-k}$.
- Choosing $k = 20$ yields time $\Theta(\log^3 n)$ and probability of error $2^{-40}$.
- Choosing $k = \frac{1}{2} \log n$ yields time $\Theta(\log^4 n)$ and probability of error $\frac{1}{n}$.
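A minimal sketch of the Miller-Rabin test with k random bases is given below, to make the role of the parameter k concrete. The trial division by 2, 3, 5, 7 and the optional rng argument (used here only so that runs are reproducible) are choices made for this sketch rather than part of the test itself.

```python
import random


def is_probable_prime(n, k=20, rng=None):
    """Miller-Rabin test with k random bases.

    A prime is always accepted; a composite is wrongly accepted with
    probability at most 4**-k.
    """
    if n < 2:
        return False
    for p in (2, 3, 5, 7):
        if n % p == 0:
            return n == p
    rng = rng or random.Random()
    # Write n - 1 = 2**s * d with d odd.
    s, d = 0, n - 1
    while d % 2 == 0:
        s, d = s + 1, d // 2
    for _ in range(k):
        a = rng.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


if __name__ == "__main__":
    rng = random.Random(0)
    print(is_probable_prime(2**61 - 1, k=20, rng=rng))
    print(is_probable_prime(2047, k=20, rng=rng))
```

Each extra round costs one more modular exponentiation and cuts the error bound by a factor of 4, which is the time versus error tradeoff described above.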
Cracking passwords has a time-space tradeoff:
- Passwords are stored encrypted to make them hard to recover: e.g., htpasswd (web passwords) turns "foobar" into "AjsRaSQk32S6s"
Resource consumption tradeoffs II
- Brute force approach: if there are $n$ possible passwords, precompute a database of size $O(n)$ containing every possible encrypted password and its plaintext. Crack passwords in $O(\log n)$ time by looking them up in the database.
- Prohibitively expensive in space: e.g., $n \approx 2^{64}$.
- Hellman: can recover plaintext in $O(n^{2/3})$ time using a database of size $O(n^{2/3})$.
- MS-Windows LanManager passwords are 14 characters; they are stored hashed ("encrypted"). With a precomputed database of size 1.4 GB (two CD-ROMs), 99.9% of all alphanumerical password hashes can be cracked in 13.6 seconds [8].
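The brute-force end of this tradeoff can be sketched in a few lines. The example below is a toy: it assumes unsalted SHA-256 as a stand-in for the real password hash and a tiny password space (three lowercase letters), and it stores the whole table sorted by hash so that a lookup is a binary search.

```python
import bisect
import hashlib
from itertools import product


def digest(password):
    """Stand-in for the real password hash (assumption: plain SHA-256, no salt)."""
    return hashlib.sha256(password.encode()).hexdigest()


def build_table(alphabet, length):
    """Precompute (hash, plaintext) for every password: O(n) space, sorted by hash."""
    words = ("".join(chars) for chars in product(alphabet, repeat=length))
    return sorted((digest(word), word) for word in words)


def crack(table, target_hash):
    """Binary search the sorted table: O(log n) probes per lookup."""
    i = bisect.bisect_left(table, (target_hash, ""))
    if i < len(table) and table[i][0] == target_hash:
        return table[i][1]
    return None


if __name__ == "__main__":
    table = build_table("abcdefghijklmnopqrstuvwxyz", 3)
    print(len(table), crack(table, digest("cat")))
```

The table here has 26^3 entries; the same approach for a realistic password space is what becomes prohibitively expensive in space, which motivates time-memory tradeoffs such as Hellman's.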
Part III
Complexity classes
Kinds of problems
- We write algorithms to solve problems.
- Some special classes of problems:
  - Decision problems: require a yes/no answer. Example: Does this file contain a valid Java program?
  - Optimization problems: require choosing a solution that minimizes (maximizes) some objective function. Example: Find a circuit made out of AND, OR, and NOT gates that computes the sum of two 8-bit integers, and has the fewest gates.
  - Counting problems: count the number of objects that satisfy some criterion. Example: For how many inputs will this circuit output zero?
Complexity classes
- A complexity class is defined as
  - a style of problem
  - that can be solved with a specified amount of resources
  - on a specified machine model
- Example: P (a.k.a. PTIME) is the class of decision problems that can be solved in polynomial time (i.e., time $O(n^d)$ for some $d \in \mathbb{N}$) on a Turing machine.
- Complexity classes:
  - Let us lump together problems according to how hard they are
  - Are usually defined so as to be invariant under non-radical changes of machine model (e.g., the class P on a TM is the same as the class P on a RAM).
Some basic distinctions
- At the coarsest level of structure, decision problems come in three varieties:
  - Problems we can write computer programs to solve. What this course is about! (Program will always stop and say "yes" or "no", and be right!)
  - Problems we can define, but not write computer programs to solve (e.g., deciding whether a Java program runs in polynomial time)
  - Problems we cannot even define.
- Consider deciding whether $x \in A$ for some set $A \subseteq \mathbb{N}$ of natural numbers, e.g., prime numbers.
- In any (effective) notation system we care to choose, there are $\aleph_0$ (countably many) problem definitions. (They can be put into 1-1 correspondence with the natural numbers.)
- There are $2^{\aleph_0}$ (uncountably many) problems: subsets $A \subseteq \mathbb{N}$. (They can be put into 1-1 correspondence with the reals.)
- Most problems cannot even be defined.
An aside: Hasse diagrams I
- Complexity classes are sets of problems.
- Some complexity classes are contained inside other complexity classes.
- E.g., every problem in class P (polynomial time on TM) is also in class PSPACE (polynomial space on TM).
- We can write $P \subseteq PSPACE$ to mean: the class P is contained in the class PSPACE.
- $\subseteq$ is a partial order: reflexive, transitive, anti-symmetric.
- Hasse diagrams are intuitive ways of drawing partial orders.
- Example: I am a professor and a programmer. Professors are people; programmers are people (are too!)
  - me $\subseteq$ professors
  - me $\subseteq$ programmers
  - professors $\subseteq$ people
  - programmers $\subseteq$ people
An aside: Hasse diagrams II
              people
             /      \
    professors      programmers
             \      /
                me
Whirlwind tour of major complexity classes I
- There are 462 classes in the Complexity Zoo.
- We'll see... slightly fewer than that. (Most complexity classes are interesting primarily to structural complexity theorists; they capture fine distinctions that we're not concerned with day-to-day.)
- For every class we shall see, there are many classes above, beside, and below it that are not shown;
- The Hasse diagrams do not imply that the containment is strict: e.g., when the diagram shows NP above P, this means $P \subseteq NP$, not $P \subset NP$.
Whirlwind tour of major complexity classes II
         Decidable
             |
            EXP
             |
          PSPACE
          /     \
      coNP       NP
          \     /
             P

- EXP = decision problems, exponential time on TM (aka EXPTIME)
- PSPACE = decision problems, polynomial space on TM
- P = decision problems, polynomial time on TM (aka PTIME)
- NP, coNP: we'll get to these...
Randomness-related classes
ZPP, RP, coRP, BPP: probabilistic classes (machine has access to random bits)

           EXPTIME
          /   |   \
        NP   BPP   coNP
        |   /   \   |
        RP        coRP
          \      /
            ZPP
             |
           PTIME

- BPP = problems that can be solved in polynomial time with access to a random number source, with probability of error bounded below $\frac{1}{2}$ (e.g., at most $\frac{1}{3}$). (Run many times and vote: get error as low as you like.)
- ZPP = problems that can be solved in expected polynomial time with access to a random number source, with zero probability of error.
Polynomial-time and below
    PTIME       Polynomial time
      |
    NC          "Nick's class"
      |
    LOGSPACE    Logarithmic space
      |
    NC^1        Logarithmic depth circuits, bounded fan in/out
      |
    AC^0        Constant depth circuits, unbounded fan in
Structural complexity theory
- Structural complexity theory = the study of complexity classes and their interrelationships
- Many fundamental relationships are not known:
  - Is P=NP? (Lots of industrially important problems are NP-hard, like placement & routing for VLSI, designing communication networks, etc.)
  - Is ZPP=P? (Is randomness really necessary?)
  - Is $BPP \supseteq NP$? If so, we can solve those hard problems in NP by flipping coins, with some error so tiny we don't care.
- Lots of conditional results are known, e.g.: "If BPP contains NP, then RP=NP and PH is contained in BPP; any proof of BPP=P would require showing either NEXP is not in P/poly or that #P requires superpolynomial sized circuits."
- This is not a course in complexity theory. We will work with basic, practical complexity classes only.
Part IV
Bibliography
Bibliography I
[1] Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest. Introduction to Algorithms. McGraw Hill, 1991.
[2] David Culler, Richard Karp, David Patterson, Abhijit Sahay, Klaus Erik Schauser, Eunice Santos, Ramesh Subramonian, and Thorsten von Eicken. LogP: Towards a realistic model of parallel computation. In Marina Chen, editor, Proceedings of the 4th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 1-12, San Diego, CA, May 1993. ACM Press.
Bibliography II
[3] David E. Culler, Richard M. Karp, David Patterson, Abhijit Sahay, Eunice E. Santos, Klaus Erik Schauser, Ramesh Subramonian, and Thorsten von Eicken. LogP: a practical model of parallel computation. Commun. ACM, 39(11):78-85, 1996.
[4] Ronald L. Graham, Donald E. Knuth, and Oren Patashnik. Concrete Mathematics: A Foundation for Computer Science. Addison-Wesley, Reading, MA, USA, second edition, 1994.
Bibliography III
[5] G. H. Hardy.
Orders of innity. The Innitarcalc ul of Paul du
Bois-Reymond.
Hafner Publishing Co., New York, 1971.
Reprint of the 1910 edition, Cambridge Tracts in
Mathematics and Mathematical Physics, No. 12.
[6] Donald E. Knuth.
Big omicron and big omega and big theta.
SIGACT News, 8(2):18-24, 1976.
[7] A. H. Lightstone and Abraham Robinson.
Nonarchimedean fields and asymptotic expansions.
North-Holland Publishing Co., Amsterdam, 1975.
North-Holland Mathematical Library, Vol. 13.
Bibliography IV
[8] Philippe Oechslin.
Making a faster cryptanalytic time-memory trade-off.
In Dan Boneh, editor, CRYPTO, volume 2729 of Lecture
Notes in Computer Science, pages 617-630. Springer,
2003.
[9] John R. Shackell.
Symbolic asymptotics, volume 12 of Algorithms and
Computation in Mathematics.
Springer-Verlag, Berlin, 2004.
