
ECE750 Lecture 8: Amortized analysis & Online algorithms



Todd Veldhuizen tveldhui@acm.org
Electrical & Computer Engineering, University of Waterloo, Canada


Nov 10, 2007


Part I Amortization


Amortized Analysis I
In accounting, amortization is a method for spreading a large lump-sum amount over a period of time by breaking it into smaller payments; for example, when one purchases a house, the mortgage payments are arranged according to an amortization schedule. In algorithm analysis, amortization looks at the total cost of a sequence of operations, averaged over the number of operations. If a single operation is very expensive, that expense can be amortized over a long sequence of operations. If a sequence of m operations takes O(f(m)) time, we say the amortized cost per operation is O(f(m)/m).


Amortized Analysis II
If the worst-case time per operation is O(g(m)), this implies the amortized time is O(g(m)) also, but the converse is not true: in amortized analysis we can allow a small number of very expensive operations, so long as that expense is averaged out (amortized) over a sufficiently long sequence of operations. Example: recall the binary tree iterator:
import java.util.Iterator;
import java.util.Stack;

class BSTIterator implements Iterator {
    Stack stack;

    public BSTIterator(BSTNode t) {
        stack = new Stack();
        fathom(t);
    }

    public boolean hasNext() {
        return !stack.empty();
    }

    public Object next() {
        BSTNode t = (BSTNode) stack.pop();
        if (t.right_child != null)
            fathom(t.right_child);
        return t;
    }

    // Push t and every node on its leftmost path onto the stack.
    void fathom(BSTNode t) {
        do {
            stack.push(t);
            t = t.left_child;
        } while (t != null);
    }

    // Required by java.util.Iterator; removal is not supported.
    public void remove() {
        throw new UnsupportedOperationException();
    }
}


If the binary tree contains n elements and is balanced, then the fathom() operation takes no more than O(log n) time, and there must be some nodes of depth ≥ c log n (for a constant c > 0). Therefore an invocation of next() requires Θ(log n) time in the worst case. However, iterating through the entire tree by a sequence of next() operations requires only O(1) amortized time per operation:
1. The iterator visits each node at most twice: once when it is pushed onto the stack by fathom(), and once when it is popped from the stack by next().
2. The total time spent in the next() and fathom() methods is linear in the number of pushes and pops onto the stack.

3. Assuming push and pop operations take O(1) time (e.g., a linked-list implementation of the stack), the total cost of iterating through the tree is O(n).
4. Therefore the amortized cost of the iteration is O(n/n) = O(1).
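As a usage sketch (assuming a tree of n nodes whose root is held in a variable root; the variable name is illustrative), a complete traversal makes n calls to next() but only O(n) pushes and pops in total, so each call costs O(1) amortized:

// Full in-order traversal: O(n) total work, O(1) amortized per next(),
// even though an individual call may take Theta(log n) time.
Iterator it = new BSTIterator(root);
while (it.hasNext()) {
    BSTNode node = (BSTNode) it.next();
    // ... process node ...
}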


Dynamic Arrays I
Arrays have several advantages over more complex data structures:
- They are very fast to access, both for random access and for iterating through the contents of the array.
- They are very efficient in memory use. For example, to store a set of n single-precision floating-point values (4 bytes apiece):
  1. An array requires 4n + O(1) bytes;
  2. A binary search tree requires 16n + O(1) bytes: for each tree node, we need 4 bytes for the float and 2 × 4 bytes for the left/right child pointers, and typically small objects such as this are padded up to an alignment boundary, e.g., 16 bytes.
  So a tree can take 3-4 times as much memory as an array when storing small objects.
- They are simple to use and implement.


Dynamic Arrays II
However, appending items to an array can be very inefficient: if the array is full, one must usually allocate a new, larger array and copy the elements over, for a cost of O(n). A dynamic array is one that keeps room for extra elements, resizing itself according to a schedule that yields O(1) amortized time for append operations, despite the occasional operation taking O(n) time. The array maintains:
1. A size (the number of elements in the array);
2. A capacity (the allocated size of the array);
3. The array itself.


When an append operation is performed, size is incremented; if size exceeds capacity then:
1. Allocate a new array of size f(capacity), where f is to be determined;
2. Copy the elements 1..size to the new array;
3. Set capacity to the new capacity.
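A minimal sketch of such a dynamic array in Java (the class name DynamicArray, the growth factor of 2, and the use of Object[] are illustrative choices, not prescribed by the lecture):

import java.util.Arrays;

// A dynamic array with geometric growth: append() runs in O(1) amortized time.
public class DynamicArray {
    private Object[] data = new Object[1];   // the array itself (capacity = data.length)
    private int size = 0;                     // number of elements currently stored

    public void append(Object x) {
        if (size == data.length) {
            // Out of room: allocate a new array of size f(capacity) = 2 * capacity
            // and copy the existing elements over.
            data = Arrays.copyOf(data, 2 * data.length);
        }
        data[size++] = x;
    }

    public Object get(int i) {
        if (i < 0 || i >= size)
            throw new IndexOutOfBoundsException();
        return data[i];
    }

    public int size() {
        return size;
    }
}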


Now analyze the amortized time complexity. Consider a sequence of n insert operations, starting from an empty array. Each time we hit the array capacity, we incur a cost of O (n); other append operations incur only an O (1) cost.
[Figure: time required for each append operation vs. operation number; most appends are cheap, with occasional expensive spikes when the array must be resized.]


Dynamic Arrays IV


Suppose the array has an initial capacity of 1. The cost of the resizings will be

    Σ_{k : f^(k)(1) ≤ n} f^(k)(1)

where f^(0)(x) = x, and f^(i+1)(x) = f(f^(i)(x)). If we take f(k) = k + 16, i.e., we increase the capacity of the array by 16 elements each time we run out of room, then the total cost is O(n^2).
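To see where the O(n^2) comes from (a quick calculation, not spelled out on the slide): with f(k) = k + 16, a resize happens roughly every 16 appends, and the i-th resize copies about 16i elements, so the total copying cost over n appends is

\[
\sum_{i=1}^{\lceil n/16 \rceil} 16\, i \;\approx\; \frac{n^2}{32} \;=\; \Theta(n^2).
\]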


Dynamic Arrays V
If however we choose f(k) = αk, with α > 1, then using the geometric series formula, we have a total cost of

    Σ_{k : α^k ≤ n} α^k = (α^(m+1) − 1) / (α − 1) ≤ (αn − 1) / (α − 1) = O(n),   where m = ⌊log_α n⌋

So, the amortized cost of appends into the dynamic array is O(n/n) = O(1). Increasing the capacity of the array by, say, 5% each time we run out of space leads to an O(1) amortized time for appends.
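A small experiment makes the contrast with additive growth concrete. The sketch below (the class and method names are illustrative, not from the lecture) counts the total number of element copies performed over n appends under three growth rules:

import java.util.function.IntUnaryOperator;

// Counts how many element copies a dynamic array performs over n appends,
// for a given capacity-growth rule f.
public class GrowthCostDemo {
    static long copies(int n, IntUnaryOperator grow) {
        int capacity = 1;
        long total = 0;
        for (int size = 1; size <= n; size++) {
            if (size > capacity) {
                total += size - 1;                 // copy the existing elements over
                while (capacity < size)
                    capacity = grow.applyAsInt(capacity);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println("f(k) = k + 16 : " + copies(n, k -> k + 16));                   // ~n^2/32 copies
        System.out.println("f(k) = 1.01 k : " + copies(n, k -> k + Math.max(1, k / 100))); // ~100 n copies
        System.out.println("f(k) = 2 k    : " + copies(n, k -> 2 * k));                    // ~2 n copies
    }
}

The 1% rule foreshadows the next slide: it is still O(1) amortized, but with a large constant (roughly 1/ε copies per append).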


How to choose α?
If α is large, then we can waste a lot of memory. Just after a resizing, the proportion of memory unused (out of the αn memory allocated) is (αn − n)/(αn) = 1 − 1/α. For example, if α = 3, then this proportion is 2/3. If α is too small, then we resize the array very frequently, which is costly. Suppose for example that α = 1 + ε. Then the cost of the resizings is

    Σ_{k : α^k ≤ n} α^k = (α^(m+1) − 1) / (α − 1) ≈ (1 + ε)(n − 1) / ε,   where m = ⌊log_α n⌋

Keeping the constant factors, we have that the amortized cost is 1 + 1/ε + o(1). For example, if ε = 0.01 (i.e., we grow the array by 1% each time),


then the amortized cost is 101. On the other hand, if α = 2, the cost is 2.



Part II Online algorithms


Online Algorithms I
Consider the problem of assigning restaurant patrons to tables. If you know in advance who wants to eat dinner at your restaurant, when they want to arrive, how long they will stay, and how much they will spend, you can figure out in advance what subset of patrons to accept to maximize your revenue. In the real world, people just show up at restaurants expecting to be fed: when each party arrives, you must decide whether or not you can seat them; and once given a table, you can't evict them before they are finished eating to make room for someone else. This is the difference between an offline and an online problem.
In an offline scenario, we know the entire sequence of requests that will be made in advance; in principle we can use this knowledge to find an optimal solution.


Online Algorithms II
In an online scenario, we are presented with requests one at a time, and we must commit to a decision without knowing what the subsequent requests might be.


Many realistic problems are online, for example:


- Page replacement policies in operating systems and caches;
- Call routing in networks;
- Memory allocation;
- Data structure operations.

The field of online algorithms studies such problems, in particular how online solutions compare to their optimal offline versions. In an online problem, one is presented with a request sequence I = (σ1, σ2, . . . , σn), and each σi must be handled with no knowledge of σi+1, σi+2, . . . , σn. Once a decision of how to handle request σi is made, it cannot be altered.

Online Algorithms III


We characterize performance by assigning a cost to a sequence of decisions. Write OPT(I) for the cost of the optimal offline solution. If ALG is an online algorithm, we say ALG is an (asymptotic) c-approximation algorithm if, for all legal request sequences I,

    ALG(I) ≤ c · OPT(I) + β

for some constant β not depending on I. If β = 0, ALG is a c-approximation algorithm, and we have

    ALG(I) ≤ c · OPT(I)

The online algorithm yields a cost that is at most c times the optimal cost; c is called the competitive ratio.


Example: Load Balancing I


Consider the following load balancing problem: we have n jobs to complete. Each job j ∈ {1, . . . , n} requires time T(j). We have m machines, each equally capable. We want to assign the jobs to machines so that all jobs are finished as quickly as possible. In an offline version, we would know T(j) in advance. In an online version, we must assign job j to a machine knowing only the jobs 1..j − 1 and which machines they were assigned to. An assignment is a function A : {1, . . . , n} → {1, . . . , m}, where A(j) = i means job j is assigned to machine i.


Example: Load Balancing II


The makespan is how long we must wait until all jobs are finished:

    Makespan(A) = max_i Σ_{j : A(j) = i} T(j)

Here, i ranges over machines, and j ranges over jobs.

There is an online greedy algorithm that provides a competitive ratio of 2 (i.e., the schedule it chooses takes at most twice as long as the optimal offline schedule). The algorithm is simple:
Always assign job j to a machine with the earliest finishing time.
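A minimal sketch of this rule in Java (the class name, the use of a priority queue of machine loads, and the sample jobs are illustrative choices, not from the lecture):

import java.util.PriorityQueue;

// Greedy online scheduler: assign each arriving job to a machine that
// currently finishes earliest. Returns the resulting makespan.
public class GreedyLoadBalancer {
    static double greedyMakespan(double[] T, int m) {
        // Each entry is the current finishing time (load) of one machine.
        PriorityQueue<Double> loads = new PriorityQueue<>();
        for (int i = 0; i < m; i++) loads.add(0.0);

        double makespan = 0.0;
        for (double t : T) {                 // jobs arrive one at a time
            double earliest = loads.poll();  // machine with the earliest finishing time
            double finish = earliest + t;
            loads.add(finish);
            makespan = Math.max(makespan, finish);
        }
        return makespan;
    }

    public static void main(String[] args) {
        double[] jobs = {3, 1, 4, 1, 5, 9, 2, 6};
        // The result is guaranteed to be at most twice the optimal makespan.
        System.out.println(greedyMakespan(jobs, 3));
    }
}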

The proof of the competitive ratio stems from two observations:


Example: Load Balancing III


1. If the sum of the times of all jobs is, say, 60 minutes, and we have 3 machines, an optimal schedule can't possibly require less than 60/3 = 20 minutes. In general:

    OPT ≥ (1/m) Σ_j T(j)

2. The optimal schedule has to be at least as long as the longest job:

    OPT ≥ max_j T(j)


Example: Load Balancing IV


Suppose machine i has the longest running time, and j is the last job assigned to machine i .


[Figure: a schedule on 3 machines. Machine i is the one that finishes last; region (A) is the period before its last job j begins, and region (B) is the period during which job j runs.]

The above diagram shows a schedule with 3 machines. Let tA be the time when job j begins (i.e., the end of region (A) in the above diagram). Up until tA, all the machines are in full use: otherwise, there would be a machine finishing earlier than the one to which j has been assigned, and we would have assigned job j to that machine.


Example: Load Balancing V


The sum of all the times in region (A) is at most Σ_j T(j) (the total time required for all jobs). Therefore

    m · tA ≤ Σ_j T(j)

or,

    tA ≤ (1/m) Σ_j T(j)

Since OPT ≥ (1/m) Σ_j T(j), we have that

    tA ≤ (1/m) Σ_j T(j) ≤ OPT

Example: Load Balancing VI


Let tB be the duration of region (B) in the above figure; this is the duration of the job j. We know that tB ≤ max_j T(j), i.e., the duration of the job j is at most the duration of the longest job. Since OPT ≥ max_j T(j), we have:

    tB ≤ max_j T(j) ≤ OPT

Therefore the time until all the jobs are finished is

    tA + tB ≤ 2 · OPT

The greedy algorithm yields a schedule at most twice as long as the optimal offline solution; we say it is 2-competitive.
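Putting the two bounds together in a single chain (a restatement of the argument above, in LaTeX notation):

\[
\mathrm{Makespan}(A) \;=\; t_A + t_B \;\le\; \frac{1}{m}\sum_j T(j) \;+\; \max_j T(j) \;\le\; \mathrm{OPT} + \mathrm{OPT} \;=\; 2\,\mathrm{OPT}.
\]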


Example: Load Balancing VII


The current best known online algorithm is 1.9201-competitive; it is known that no deterministic algorithm can have a competitive ratio better than 1.88. There is a randomized algorithm achieving a competitive ratio of 1.916. There are two things to observe here:
- The simple greedy algorithm performs almost as well as the best-known algorithm for this problem.
- Foreknowledge is a very powerful thing: if we know in advance the jobs to be scheduled, we can be almost twice as efficient. For this reason, it can be advantageous to convert online problems into offline problems whenever possible. For example, certain software architectures force us into using online algorithms. A database application may know in advance what pages it will need brought in from disk, but it may not have a way to communicate this information to the operating system, because of the system architecture.

The operating system needs to treat the page requests as an online problem. If the operating system could be told in advance what pages were needed, the problem could be treated as an offline problem.


