
QEEE DSA05

DATA STRUCTURES AND ALGORITHMS

G VENKATESH AND MADHAVAN MUKUND
LECTURE 2, 5 AUGUST 2014

Analysis of algorithms
Measuring efficiency of an algorithm

Time: How long the algorithm takes (running time)

Space: Memory requirement

Example 1: Sorting
Sorting an array with n elements

Naïve algorithms: time proportional to n²

Best algorithms: time proportional to n log n

How important is this distinction?

Typical CPUs process up to 10¹⁰ operations per second

Probably an overestimate, but useful for approximate calculations

Example 1: Sorting
Telephone directory for mobile phone users in India

India has about 1 billion = 10⁹ phones

Naïve n² algorithm requires 10¹⁸ operations

10¹⁰ operations per second → 10⁸ seconds

≈ 27780 hours

≈ 1157 days

≈ 3 years!

Smart n log n algorithm takes only about 3 × 10¹⁰ operations

About 3 seconds
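A quick back-of-the-envelope check of these figures, as a minimal Python sketch (the 10¹⁰ operations per second rate is the slide's own assumption):

  import math

  n = 10**9                    # phones in India
  ops_per_sec = 10**10         # assumed processing speed from the slides

  naive = n**2                 # ~10^18 operations
  smart = n * math.log2(n)     # ~3 x 10^10 operations

  print(naive / ops_per_sec)   # 1e8 seconds, roughly 3 years
  print(smart / ops_per_sec)   # about 3 seconds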

Example 2: Video game

Several objects on screen

Basic step: find closest pair of objects

Given n objects, naïve algorithm is again n²

For each pair of objects, compute their distance

Report minimum distance over all such pairs

There is a clever algorithm that takes time n log n
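The naïve step can be sketched as follows (a minimal Python illustration; the function name closest_pair_naive is ours, and only the minimum distance is reported, as the slide describes):

  import math

  def closest_pair_naive(points):
      # Compare every pair of objects: n(n-1)/2 distance computations, so O(n^2)
      best = math.inf
      for i in range(len(points)):
          for j in range(i + 1, len(points)):
              best = min(best, math.dist(points[i], points[j]))
      return best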

Example 2: Video game

High resolution monitor has 2500 × 1500 pixels

3.75 million points

Suppose we have 500,000 = 5 × 10⁵ objects

Naïve algorithm takes 25 × 10¹⁰ steps = 25 seconds

25 second response time is unacceptable!

Smart n log n algorithm takes a thousandth of a second
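Checking these numbers, again under the assumed 10¹⁰ operations per second:

  import math

  n = 5 * 10**5
  print(n**2 / 10**10)              # 25.0 seconds for the naive method
  print(n * math.log2(n) / 10**10)  # ~0.001 seconds, a thousandth of a second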

Time and space

Time depends on processing speed

Impossible to change for given hardware

Space is a function of available memory

Easier to reconfigure, augment

Traditionally, algorithm analysis concentrates on time, not space

Input size
Running time depends on input size

Larger arrays will take longer to sort

Measure time efficiency as a function of input size

Input size n

Running time t(n)

Different inputs of size n may each take a different amount of time

Typically t(n) is a worst case estimate

Input size
How do we fix input size?

Typically a natural parameter

For sorting and other problems on arrays: array size

For combinatorial problems: number of objects

For graphs, two parameters: number of vertices and number of edges

Measuring running time

Analysis independent of underlying hardware

Don't use actual time

Measure in terms of basic operations

Typical basic operations

Compare two values

Assign a value to a variable

Other operations may be basic, depending on context

Exchange values of a pair of variables

Orders of magnitude
When comparing t(n) across problems, focus on orders of magnitude

Ignore constants

f(n) = n³ eventually grows faster than g(n) = 5000n²

For small values of n, f(n) is smaller than g(n)

At n = 5000, f(n) overtakes g(n)

What happens in the limit, as n increases: asymptotic complexity
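A quick numeric check of the crossover at n = 5000 (a minimal sketch):

  f = lambda n: n**3
  g = lambda n: 5000 * n**2

  print(f(4999) < g(4999))   # True: just below n = 5000, g is still larger
  print(f(5001) > g(5001))   # True: just beyond n = 5000, f has overtaken g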

Choice of basic operations

Flexibility in identifying basic operations

Swapping two variables involves three assignments

  tmp ← x
  x ← y
  y ← tmp

Number of assignments is 3 times the number of swaps

If we ignore constants, t(n) is of the same order of magnitude even if swapping values is treated as a basic operation
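A small illustration of this (the counting scheme is ours): whichever operation we count, the totals differ only by the constant factor 3.

  def swap(a, i, j, counts):
      # One swap is exactly three assignments
      tmp = a[i]; a[i] = a[j]; a[j] = tmp
      counts["swaps"] += 1
      counts["assignments"] += 3

  counts = {"swaps": 0, "assignments": 0}
  swap([3, 1, 2], 0, 1, counts)
  print(counts)   # {'swaps': 1, 'assignments': 3}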

Typical functions

We are interested in orders of magnitude

Is t(n) proportional to log n, …, n², n³, …, 2ⁿ?

Logarithmic, polynomial, exponential

Typical functions t(n)
[chart of typical growth rates, not reproduced]

Feasibility limit
Even n² is infeasible for inputs of size 1 million (10 lakhs): at 10¹⁰ operations per second, 10¹² steps already take 100 seconds

Worst case complexity

Running time on input of size n varies across inputs

Search for K in an unsorted array A

  i ← 0
  while i < n and A[i] != K do
    i ← i+1
  if i < n return i
  else return -1
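The same scan in runnable form, as a minimal Python sketch of the pseudocode above:

  def search(A, K):
      # Scan left to right; the worst case examines all n elements
      i = 0
      while i < len(A) and A[i] != K:
          i = i + 1
      return i if i < len(A) else -1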

Worst case complexity

For each n, worst case input forces algorithm to take the maximum amount of time

If K not in A, search scans all elements

Upper bound for the overall running time

Here worst case is O(n) for array size n

Can construct worst case inputs by examining the algorithm

Average case complexity

Worst case may be very rare: pessimistic

Compute average time taken over all inputs

Difficult to compute

Average over what?

Are all inputs equally likely?

Need probability distribution over inputs

Comparing time efficiency

We measure time efficiency only up to an order of magnitude

Ignore constants

How do we compare functions with respect to orders of magnitude?

Upper bounds, big O

t(n) is said to be O(g(n)) if we can find suitable constants c and n₀ so that c·g(n) is an upper bound for t(n) beyond n₀:

  t(n) ≤ c·g(n) for every n ≥ n₀

Examples: Big O
100n + 5 is O(n²)

100n + 5 ≤ 100n + n, for n ≥ 5
         = 101n ≤ 101n², so n₀ = 5, c = 101

Alternatively

100n + 5 ≤ 100n + 5n, for n ≥ 1
         = 105n ≤ 105n², so n₀ = 1, c = 105

n₀ and c are not unique!

Of course, by the same argument, 100n + 5 is also O(n)
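A numeric spot-check of the first pair of witnesses (a minimal sketch; sampling a finite range only illustrates the bound, it does not prove it):

  t = lambda n: 100 * n + 5
  g = lambda n: n * n

  # c = 101, n0 = 5: t(n) <= c*g(n) holds for every sampled n >= n0
  print(all(t(n) <= 101 * g(n) for n in range(5, 10**5)))   # True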

Examples: Big O
100n² + 20n + 5 is O(n²)

100n² + 20n + 5 ≤ 100n² + 20n² + 5n², for n ≥ 1
                = 125n²

n₀ = 1, c = 125

What matters is the highest term

20n + 5 is dominated by 100n²

Examples: Big O

n³ is not O(n²)

No matter what c we choose, c·n² will be dominated by n³ for n > c

Useful properties
If

f₁(n) is O(g₁(n))

f₂(n) is O(g₂(n))

then f₁(n) + f₂(n) is O(max(g₁(n), g₂(n)))

Why is this important?

Algorithm has two phases

Phase A takes time O(gA(n))

Phase B takes time O(gB(n))

Algorithm as a whole takes time O(max(gA(n), gB(n)))

For an algorithm with many phases, the least efficient phase is an upper bound for the whole algorithm
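As a concrete instance (a minimal sketch; the duplicate-finding task is our own illustration, not from the slides):

  def has_duplicate(a):
      a = sorted(a)                  # Phase A: sorting, O(n log n)
      for i in range(len(a) - 1):    # Phase B: linear scan, O(n)
          if a[i] == a[i + 1]:
              return True
      return False

  # Overall: O(max(n log n, n)) = O(n log n); the least efficient phase dominates.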

Binary search
Searching for K in unsorted list A takes time O(n)

What if A is sorted?

Compare K with midpoint of A

If midpoint is K, the value is found

If K < midpoint, search left half of A

If K > midpoint, search right half of A

Binary Search
How long does this take?

Each step halves the interval to search

Initially, interval is size n

For interval of size 0 or 1, answer is immediate

After j steps, interval is size n/2ʲ

After j = log₂ n steps, the interval has size n/2ʲ = 1, so we are done: O(log n)

Worst case is when K is not found in A

Binary search
bsearch(K, A, left, right)

// A sorted; search for K from A[left] to A[right-1]

  if (right - left == 0) return(false)
  if (right - left == 1) return(K == A[left])
  mid = (left + right) div 2   // integer division
  if (K == A[mid]) return(true)
  if (K < A[mid]) return(bsearch(K, A, left, mid))
  else return(bsearch(K, A, mid+1, right))
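The same procedure in runnable form (a minimal Python sketch of the pseudocode above, searching A[left:right]):

  def bsearch(K, A, left, right):
      # A sorted; search for K from A[left] to A[right-1]
      if right - left == 0:
          return False
      if right - left == 1:
          return K == A[left]
      mid = (left + right) // 2      # integer division
      if K == A[mid]:
          return True
      if K < A[mid]:
          return bsearch(K, A, left, mid)
      return bsearch(K, A, mid + 1, right)

  print(bsearch(7, [1, 3, 5, 7, 9], 0, 5))   # True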

Summary
Measure worst case time complexity

Asymptotic: t(n) as n becomes large

Only orders of magnitude are important

O( ) notation compares orders of magnitude

Complexity of an algorithm limits its range of operation

Search in an unsorted list is O(n)

Binary search in a sorted list is O(log n)
