
ECE750 Lecture 8: Amortized analysis & Online algorithms



Todd Veldhuizen tveldhui@acm.org
Electrical & Computer Engineering, University of Waterloo, Canada


Nov 10, 2007


Part I Amortization


Amortized Analysis I
In accounting, amortization is a method for spreading a large lump-sum amount over a period of time by breaking it into smaller payments; for example, when one purchases a house, the mortgage payments are arranged according to an amortization schedule. In algorithm analysis, amortization looks at the total cost of a sequence of operations, averaged over the number of operations. If a single operation is very expensive, that expense can be amortized over a long sequence of operations. If a sequence of m operations takes O(f(m)) time, we say the amortized cost per operation is O(f(m)/m).


Amortized Analysis II
If the worst-case time per operation is O(g(m)), this implies the amortized time is O(g(m)) also, but the converse is not true: in amortized analysis we can allow a small number of very expensive operations, so long as that expense is averaged out (amortized) over a sufficiently long sequence of operations. Example: recall the binary tree iterator:
import java.util.Iterator;
import java.util.Stack;

class BSTIterator implements Iterator {
    Stack stack;

    public BSTIterator(BSTNode t) {
        stack = new Stack();
        fathom(t);
    }

    public boolean hasNext() {
        return !stack.empty();
    }

    public Object next() {
        BSTNode t = (BSTNode) stack.pop();
        if (t.right_child != null)
            fathom(t.right_child);
        return t;
    }

    // Push t and every node on its leftmost path onto the stack.
    void fathom(BSTNode t) {
        do {
            stack.push(t);
            t = t.left_child;
        } while (t != null);
    }

    // Required by java.util.Iterator; removal is not supported.
    public void remove() {
        throw new UnsupportedOperationException();
    }
}


If the binary tree contains n elements and is balanced, then the fathom() operation takes no more than O(log n) time, and there must be some nodes of depth ≥ c log n (for a constant c > 0). Therefore an invocation of next() requires Θ(log n) time in the worst case. However, iterating through the entire tree by a sequence of next() operations requires only O(1) amortized time per operation:
1. The iterator visits each node at most twice: once when it is pushed onto the stack by fathom(), and once when it is popped from the stack by next().
2. The total time spent in the next() and fathom() methods is linear in the number of pushes and pops onto the stack.

3. Assuming push and pop operations take O(1) time (e.g., a linked-list implementation of the stack), the total cost of iterating through the tree is O(n).
4. Therefore the amortized cost of the iteration is O(n/n) = O(1).
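As a usage sketch (assuming a tree of n nodes whose root is held in a variable root; the variable name is illustrative), a complete traversal makes n calls to next() but only O(n) pushes and pops in total, so each call costs O(1) amortized:

// Full in-order traversal: O(n) total work, O(1) amortized per next(),
// even though an individual call may take Theta(log n) time.
Iterator it = new BSTIterator(root);
while (it.hasNext()) {
    BSTNode node = (BSTNode) it.next();
    // ... process node ...
}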


Dynamic Arrays I
Arrays have several advantages over more complex data structures:
- They are very fast to access, both for random access and for iterating through the contents of the array.
- They are very efficient in memory use. For example, to store a set of n single-precision floating-point values (4 bytes apiece):
  1. An array requires 4n + O(1) bytes;
  2. A binary search tree requires 16n + O(1) bytes: for each tree node, we need 4 bytes for the float and 2 × 4 bytes for the left/right child pointers, and typically small objects such as this are padded up to an alignment boundary, e.g., 16 bytes.
  So a tree can take 3-4 times as much memory as an array when storing small objects.
- They are simple to use and implement.


Dynamic Arrays II
However, appending items to an array can be very inefficient: if the array is full, one must usually allocate a new, larger array and copy the elements over, for a cost of O(n). A dynamic array is one that keeps room for extra elements, resizing itself according to a schedule that yields O(1) amortized time for append operations, despite the occasional operation taking O(n) time. The array maintains:
1. A size (the number of elements in the array);
2. A capacity (the allocated size of the array);
3. The array itself.


When an append operation is performed, size is incremented; if size exceeds capacity then:
1. Allocate a new array of size f(capacity), where f is to be determined;
2. Copy the elements 1..size to the new array;
3. Set capacity to the new capacity.
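A minimal sketch of such a dynamic array in Java (the class name DynamicArray, the growth factor of 2, and the use of Object[] are illustrative choices, not prescribed by the lecture):

import java.util.Arrays;

// A dynamic array with geometric growth: append() runs in O(1) amortized time.
public class DynamicArray {
    private Object[] data = new Object[1];   // the array itself (capacity = data.length)
    private int size = 0;                     // number of elements currently stored

    public void append(Object x) {
        if (size == data.length) {
            // Out of room: allocate a new array of size f(capacity) = 2 * capacity
            // and copy the existing elements over.
            data = Arrays.copyOf(data, 2 * data.length);
        }
        data[size++] = x;
    }

    public Object get(int i) {
        if (i < 0 || i >= size)
            throw new IndexOutOfBoundsException();
        return data[i];
    }

    public int size() {
        return size;
    }
}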


Now analyze the amortized time complexity. Consider a sequence of n insert operations, starting from an empty array. Each time we hit the array capacity, we incur a cost of O (n); other append operations incur only an O (1) cost.
[Figure: time required for each append operation vs. operation number; most appends are cheap, with occasional expensive spikes when the array must be resized.]


Dynamic Arrays IV


Suppose the array has an initial capacity of 1. The cost of the resizings will be

    Σ_{k : f^(k)(1) ≤ n} f^(k)(1)

where f^(0)(x) = x, and f^(i+1)(x) = f(f^(i)(x)). If we take f(k) = k + 16, i.e., we increase the capacity of the array by 16 elements each time we run out of room, then the total cost is O(n^2).
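To see where the O(n^2) comes from (a quick calculation, not spelled out on the slide): with f(k) = k + 16, a resize happens roughly every 16 appends, and the i-th resize copies about 16i elements, so the total copying cost over n appends is

\[
\sum_{i=1}^{\lceil n/16 \rceil} 16\, i \;\approx\; \frac{n^2}{32} \;=\; \Theta(n^2).
\]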


Dynamic Arrays V
If however we choose f(k) = αk, with α > 1, then using the geometric series formula, we have a total cost of

    Σ_{k : α^k ≤ n} α^k = (α^(m+1) − 1) / (α − 1) ≤ (αn − 1) / (α − 1) = O(n),   where m = ⌊log_α n⌋

So, the amortized cost of appends into the dynamic array is O(n/n) = O(1). Increasing the capacity of the array by, say, 5% each time we run out of space leads to an O(1) amortized time for appends.
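A small experiment makes the contrast with additive growth concrete. The sketch below (the class and method names are illustrative, not from the lecture) counts the total number of element copies performed over n appends under three growth rules:

import java.util.function.IntUnaryOperator;

// Counts how many element copies a dynamic array performs over n appends,
// for a given capacity-growth rule f.
public class GrowthCostDemo {
    static long copies(int n, IntUnaryOperator grow) {
        int capacity = 1;
        long total = 0;
        for (int size = 1; size <= n; size++) {
            if (size > capacity) {
                total += size - 1;                 // copy the existing elements over
                while (capacity < size)
                    capacity = grow.applyAsInt(capacity);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println("f(k) = k + 16 : " + copies(n, k -> k + 16));                   // ~n^2/32 copies
        System.out.println("f(k) = 1.01 k : " + copies(n, k -> k + Math.max(1, k / 100))); // ~100 n copies
        System.out.println("f(k) = 2 k    : " + copies(n, k -> 2 * k));                    // ~2 n copies
    }
}

The 1% rule foreshadows the next slide: it is still O(1) amortized, but with a large constant (roughly 1/ε copies per append).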


How to choose α?
If α is large, then we can waste a lot of memory. Just after a resizing, the proportion of memory unused (out of the αn memory allocated) is (αn − n)/(αn) = 1 − 1/α. For example, if α = 3, then this proportion is 2/3. If α is too small, then we resize the array very frequently, which is costly. Suppose for example that α = 1 + ε. Then the cost of the resizings is

    Σ_{k : α^k ≤ n} α^k = (α^(m+1) − 1) / (α − 1) ≈ (1 + ε)(n − 1) / ε,   where m = ⌊log_α n⌋

Keeping the constant factors, we have that the amortized cost is 1 + 1/ε + o(1). For example, if ε = 0.01 (i.e., we grow the array by 1% each time),


then the amortized cost is 101. On the other hand, if α = 2, the cost is 2.



Part II Online algorithms


Online Algorithms I
Consider the problem of assigning restaurant patrons to tables. If you know in advance who wants to eat dinner at your restaurant, when they want to arrive, how long they will stay, and how much they will spend, you can figure out in advance what subset of patrons to accept to maximize your revenue. In the real world, people just show up at restaurants expecting to be fed: when each party arrives, you must decide whether or not you can seat them; and once given a table, you can't evict them before they are finished eating to make room for someone else. This is the difference between an offline and an online problem.
In an offline scenario, we know the entire sequence of requests that will be made in advance; in principle we can use this knowledge to find an optimal solution.


Online Algorithms II
In an online scenario, we are presented with requests one at a time, and we must commit to a decision without knowing what the subsequent requests might be.


Many realistic problems are online, for example:


- Page replacement policies in operating systems and caches;
- Call routing in networks;
- Memory allocation;
- Data structure operations.

The field of online algorithms studies such problems, in particular how online solutions compare to their optimal offline versions. In an online problem, one is presented with a request sequence I = (σ1, σ2, . . . , σn), and each σi must be handled with no knowledge of σi+1, σi+2, . . . , σn. Once a decision of how to handle request σi is made, it cannot be altered.

Online Algorithms III


We characterize performance by assigning a cost to a sequence of decisions. Write OPT(I) for the cost of the optimal offline solution. If ALG is an online algorithm, we say ALG is an (asymptotic) c-approximation algorithm if, for all legal request sequences I,

    ALG(I) ≤ c · OPT(I) + β

for some constant β not depending on I. If β = 0, ALG is a c-approximation algorithm, and we have

    ALG(I) ≤ c · OPT(I)

The online algorithm yields a cost that is at most c times the optimal cost; c is called the competitive ratio.


Example: Load Balancing I


Consider the following load balancing problem: we have n jobs to complete. Each job j ∈ {1, . . . , n} requires time T(j). We have m machines, each equally capable. We want to assign the jobs to machines so that all jobs are finished as quickly as possible. In an offline version, we would know T(j) in advance. In an online version, we must assign job j to a machine knowing only the jobs 1..j − 1 and which machines they were assigned to. An assignment is a function A : {1, . . . , n} → {1, . . . , m}, where A(j) = i means job j is assigned to machine i.


Example: Load Balancing II


The makespan is how long we must wait until all jobs are finished:

    Makespan(A) = max_i Σ_{j : A(j) = i} T(j)

Here, i ranges over machines, and j ranges over jobs.

There is an online greedy algorithm that provides a competitive ratio of 2 (i.e., the schedule it chooses takes at most twice as long as the optimal offline schedule). The algorithm is simple:
Always assign job j to a machine with the earliest finishing time.
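A minimal sketch of this rule in Java (the class name, the use of a priority queue of machine loads, and the sample jobs are illustrative choices, not from the lecture):

import java.util.PriorityQueue;

// Greedy online scheduler: assign each arriving job to a machine that
// currently finishes earliest. Returns the resulting makespan.
public class GreedyLoadBalancer {
    static double greedyMakespan(double[] T, int m) {
        // Each entry is the current finishing time (load) of one machine.
        PriorityQueue<Double> loads = new PriorityQueue<>();
        for (int i = 0; i < m; i++) loads.add(0.0);

        double makespan = 0.0;
        for (double t : T) {                 // jobs arrive one at a time
            double earliest = loads.poll();  // machine with the earliest finishing time
            double finish = earliest + t;
            loads.add(finish);
            makespan = Math.max(makespan, finish);
        }
        return makespan;
    }

    public static void main(String[] args) {
        double[] jobs = {3, 1, 4, 1, 5, 9, 2, 6};
        // The result is guaranteed to be at most twice the optimal makespan.
        System.out.println(greedyMakespan(jobs, 3));
    }
}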

The proof of the competitive ratio stems from two observations:


Example: Load Balancing III


1. If the sum of the times of all jobs is, say, 60 minutes, and we have 3 machines, an optimal schedule can't possibly require less than 60/3 = 20 minutes. In general:

    OPT ≥ (1/m) Σ_j T(j)

2. The optimal schedule has to be at least as long as the longest job:

    OPT ≥ max_j T(j)


Example: Load Balancing IV


Suppose machine i has the longest running time, and j is the last job assigned to machine i .


[Figure: a schedule on 3 machines. Machine i is the one that finishes last; region (A) is the period before its last job j begins, and region (B) is the period during which job j runs.]

The above diagram shows a schedule with 3 machines. Let tA be the time when job j begins (i.e., the end of region (A) in the above diagram). Up until tA, all the machines are in full use: otherwise, there would be a machine finishing earlier than the one to which j has been assigned, and we would have assigned job j to that machine.


Example: Load Balancing V


The sum of all the times in region (A) is at most Σ_j T(j) (the total time required for all jobs). Therefore

    m · tA ≤ Σ_j T(j)

or,

    tA ≤ (1/m) Σ_j T(j)

Since OPT ≥ (1/m) Σ_j T(j), we have that

    tA ≤ (1/m) Σ_j T(j) ≤ OPT

Example: Load Balancing VI


Let tB be the duration of region (B) in the above figure; this is the duration of the job j. We know that tB ≤ max_j T(j), i.e., the duration of the job j is at most the duration of the longest job. Since OPT ≥ max_j T(j), we have:

    tB ≤ max_j T(j) ≤ OPT

Therefore the time until all the jobs are finished is

    tA + tB ≤ 2 · OPT

The greedy algorithm yields a schedule at most twice as long as the optimal offline solution; we say it is 2-competitive.
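Putting the two bounds together in a single chain (a restatement of the argument above, in LaTeX notation):

\[
\mathrm{Makespan}(A) \;=\; t_A + t_B \;\le\; \frac{1}{m}\sum_j T(j) \;+\; \max_j T(j) \;\le\; \mathrm{OPT} + \mathrm{OPT} \;=\; 2\,\mathrm{OPT}.
\]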


Example: Load Balancing VII


The current best known online algorithm is 1.9201-competitive; it is known that no deterministic algorithm can have a competitive ratio better than 1.88. There is a randomized algorithm achieving a competitive ratio of 1.916. There are two things to observe here:
- The simple greedy algorithm performs almost as well as the best-known algorithm for this problem.
- Foreknowledge is a very powerful thing: if we know in advance the jobs to be scheduled, we can be almost twice as efficient. For this reason, it can be advantageous to convert online problems into offline problems whenever possible. For example, certain software architectures force us into using online algorithms. A database application may know in advance what pages it will need brought in from disk, but it may not have a way to communicate this information to the operating system, because of the system architecture.

The operating system needs to treat the page requests as an online problem. If the operating system could be told in advance what pages were needed, the problem could be treated as an offline problem.


