
CSE 2032Y – Data Structures and Algorithms
Week 2 Slides – Arrays and Algorithm Analysis
What will we cover in this lecture?
• Object-oriented programming
– A revision of what you have already done in
CSE 1243 (Programming Paradigms)
• Arrays
– A simple Data Structure
• Introduction to Algorithm Analysis
– Empirical Analysis
– Asymptotic Analysis
OBJECT ORIENTED PROGRAMMING
A revision
Introduction to Object Oriented Programming [1]
Assuming that you have already covered object-oriented programming, this is just
a revision so that you can write more interesting programs with arrays.
• The idea of objects arose in the programming community
as a solution to the problems with procedural languages
whereby there was a poor correspondence between a
program and the real world.
– i.e. to improve abstraction
• An object contains both methods and variables.
– A thermostat object, for example, would contain not only
furnace_on() and furnace_off() methods, but also variables
called currentTemp and desiredTemp.
• In Java, an object’s variables such as these are called fields.
Introduction to Object Oriented Programming [2]

• This new entity, the object, solves several problems simultaneously.
– an object in a program corresponds more closely to an
object in the real world
– The object also solves the problem engendered by
global data in the procedural model.
• the furnace_on() and furnace_off() methods can access
currentTemp and desiredTemp.
• these variables are hidden from methods that are not part of
thermostat
• so they are less likely to be accidentally changed by a rogue
method.
Introduction to Object Oriented Programming [3]
• A class is a specification—a blueprint—for one or more objects.
– A thermostat class, for example, might look like this in Java:
class thermostat
{
private float currentTemp; // field, not a method: no parentheses
private float desiredTemp;
public void furnace_on()
{
// method body goes here
}
public void furnace_off()
{
// method body goes here
}
} // end class thermostat
– The Java keyword class introduces the class specification, followed by
the name you want to give the class; here it’s thermostat.
• Enclosed in curly brackets are the fields and methods that make up the class.
We’ve left out the bodies of the methods;
• normally, each would have many lines of program code.
Introduction to Object Oriented Programming [4]
Creating Objects
– Specifying a class doesn’t create any objects of that class.
– To actually create objects in Java, one must use the keyword
new.
– At the same time an object is created, you need to store a
reference to it in a variable of suitable type—that is, the same
type as the class.
– A reference can be thought of as a name for an object. (It’s actually
the object’s address, but you don’t need to know that.)
– Here’s how we would create two references to type thermostat,
create two new thermostat objects, and store references to
them in these variables:
thermostat therm1, therm2; // create two references
therm1 = new thermostat(); // create two objects and
therm2 = new thermostat(); // store references to them
– Incidentally, creating an object is also called instantiating it, and
an object is often referred to as an instance of a class.
Introduction to Object Oriented Programming [5]
Accessing Object Methods
• After you specify a class and create some objects of that class, other
parts of your program need to interact with these objects.
• Typically, other parts of the program interact with an object’s
methods, not with its data (fields). For example, to tell the therm2
object to turn on the furnace, we would say
therm2.furnace_on();
• The dot operator (.) associates an object with one of its methods (or
occasionally with one of its fields).
• To summarize:
– Objects contain both methods and fields (data).
– A class is a specification for any number of objects.
– To create an object, you use the keyword new in conjunction with the
class name.
– To invoke a method for a particular object, you use the dot operator.
Introduction to Object Oriented Programming [6]
A runnable object-oriented program
// bank.java
// demonstrates basic OOP syntax
////////////////////////////////////////////////////////////////
class BankAccount
{
private double balance; // account balance
public BankAccount(double openingBalance) // constructor
{
balance = openingBalance;
}
public void deposit(double amount) // makes deposit
{
balance = balance + amount;
}
public void withdraw(double amount) // makes withdrawal
{
balance = balance - amount;
}
public void display() // displays balance
{
System.out.println("balance=" + balance);
}
} // end class BankAccount
////////////////////////////////////////////////////////////////
Introduction to Object Oriented Programming [7]
• A runnable object-oriented program (cont.)
class BankApp
{
public static void main(String[] args)
{
BankAccount ba1 = new BankAccount(100.00); // create acct
System.out.print("Before transactions, ");
ba1.display(); // display balance
ba1.deposit(74.35); // make deposit
ba1.withdraw(20.00); // make withdrawal
System.out.print("After transactions, ");
ba1.display(); // display balance
} // end main()
} // end class BankApp

• The output from this program:
Before transactions, balance=100.0
After transactions, balance=154.35
Introduction to Object Oriented Programming [8]
• The BankAccount Class
– The only data field in the BankAccount class is the amount of
money in the account, called balance.
– There are three methods.
• deposit() method adds an amount to the balance,
• withdraw() subtracts an amount, and
• display() displays the balance.
– No main() method in this class
• The BankApp class
– Every Java application must have a main() method; execution of
the program starts at the beginning of main(), as shown.
– You don’t need to worry yet about the String[] args argument in main().
– The main() method creates an object of class BankAccount,
initialized to a value of 100.00, which is the opening balance.
ARRAYS
Introduction to arrays (Using Java) [1]
• An array is
– One of the most basic data structures
– built into most programming languages
– a number of items of the same type, stored in linear order, one
after another
• Arrays have a set limit on their size (in most programming
languages); they can't grow beyond that limit
• For example, let's say you wanted to have 10 integers.
int[] myArray; // defines a reference to an array
myArray = new int[10]; // creates the array, and
// sets myArray to refer to it

OR you can use the equivalent single-statement approach:


int[] myArray=new int[10];
Introduction to arrays (Using Java) [2]
• The [] operator is the sign to the compiler that
we’re naming an array object and not an
ordinary variable.
• Arrays have a length field, which you can use
to find the size (the number of elements) of
an array:
int arrayLength = myArray.length; // find array size
Accessing Array Elements
• Array elements are accessed using an index number in square brackets.
• It is similar in many languages:
– temp = myArray[3]; // get contents of fourth element of array
– myArray[7] = 66; // insert 66 into the eighth cell
• Remember that in Java, as in C and C++, the first element is numbered 0, so that the
indices in an array of 10 elements run from 0 to 9 (myArray[0] to myArray[9]).
– Note: If you use an index that’s less than 0 or greater than the size of the array less 1,
you’ll get an ArrayIndexOutOfBoundsException at run time.
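The indexing rules above can be tried out with a small sketch. The class name IndexDemo and the helper safeGet are illustrative only and are not part of the lecture code.

```java
// IndexDemo.java
// illustrates zero-based indexing and the out-of-bounds error
class IndexDemo
{
    // hypothetical helper: returns a[index], or fallback when the
    // index lies outside the valid range 0 .. a.length-1
    static int safeGet(int[] a, int index, int fallback)
    {
        try
        {
            return a[index];
        }
        catch (ArrayIndexOutOfBoundsException e)
        {
            return fallback;
        }
    }

    public static void main(String[] args)
    {
        int[] myArray = new int[10];    // valid indices are 0 to 9
        myArray[7] = 66;                // insert 66 into the eighth cell
        int temp = myArray[3];          // fourth element, still 0
        System.out.println(temp + " " + myArray[7]);
        System.out.println(safeGet(myArray, 10, -1)); // index 10 is out of bounds
    }
}
```

In ordinary code it is usually better to keep the index within range than to catch the exception; the catch is used here only to make the error visible.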
Array Initialization
• Unless you specify otherwise, an array of integers
is automatically initialized to 0 when it’s created.
• When we create an array of objects, e.g.
BankAccount[] B= new BankAccount[10];
– Until the array elements are given explicit values, they
contain the special value null.
– If you attempt to call a method on an array element that
contains null, you’ll get a NullPointerException at run time.
– The idea is to make sure you assign something to an
element before attempting to access it.
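A minimal sketch of the point about null elements follows. The nested class Account is a stand-in for BankAccount so that the example is self-contained, and countNulls is a hypothetical helper.

```java
// NullElementDemo.java
// shows that a new array of objects holds only null references
class NullElementDemo
{
    // stand-in for the lecture's BankAccount class, kept minimal here
    static class Account
    {
        private double balance;
        Account(double openingBalance) { balance = openingBalance; }
        double getBalance() { return balance; }
    }

    // counts the cells that have not yet been given an object
    static int countNulls(Account[] accounts)
    {
        int nulls = 0;
        for (Account a : accounts)
            if (a == null)
                nulls++;
        return nulls;
    }

    public static void main(String[] args)
    {
        Account[] b = new Account[10];  // ten references, all null
        b[0] = new Account(100.00);     // assign before use
        System.out.println(b[0].getBalance());
        System.out.println(countNulls(b) + " cells are still null");
        // b[1].getBalance() would throw NullPointerException here
    }
}
```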
An Array application [1]
class ArrayApp
{
public static void main(String[] args)
{
long[] arr; // reference to array
arr = new long[100]; // make array
int nElems = 0; // number of items
int j; // loop counter
long searchKey; // key of item to search for

//--------------------------------------------------------------
arr[0] = 77; // insert 10 items
arr[1] = 99;
arr[2] = 44;
arr[3] = 55;
arr[4] = 22;
arr[5] = 88;
arr[6] = 11;
arr[7] = 00;
arr[8] = 66;
arr[9] = 33;
nElems = 10; // now 10 items in array
An Array application [2]
//--------------------------------------------------------------
for(j=0; j<nElems; j++) // display items
System.out.print(arr[j] + " ");
System.out.println("");

//--------------------------------------------------------------
searchKey = 66; // find item with key 66
for(j=0; j<nElems; j++) // for each element,
if(arr[j] == searchKey) // found item?
break; // yes, exit before end
if(j == nElems) // at the end?
System.out.println("Can't find " + searchKey); // yes
else
System.out.println("Found " + searchKey); // no
An Array application [3]
//--------------------------------------------------------------
searchKey = 55; // delete item with key 55
for(j=0; j<nElems; j++) // look for it
if(arr[j] == searchKey)
break;
for(int k=j; k<nElems-1; k++) // move higher ones down
arr[k] = arr[k+1];
nElems--; // decrement size
//--------------------------------------------------------------
for(j=0; j<nElems; j++) // display items
System.out.print(arr[j] + " ");
System.out.println("");
} // end main()
} // end class ArrayApp

• The output of the program looks like this:


77 99 44 55 22 88 11 0 66 33
Found 66
77 99 44 22 88 11 0 66 33
An Array application [4]
• Insertion
– Inserting an item into the array is easy; we use the normal array syntax:
arr[0] = 77;
– We also keep track of how many items we’ve inserted into the array with the nElems
variable.
• Searching
– The searchKey variable holds the value we’re looking for.
– To search for an item, we step through the array, comparing searchKey with each element.
– If the loop variable j reaches nElems (one past the last occupied cell) with no match being found, the value
isn’t in the array.
– Appropriate messages are displayed: Found 66 or Can’t find 27.
• Deletion
– Deletion begins with a search for the specified item.
– For simplicity, we assume (perhaps rashly) that the item is present.
– When we find it, we move all the items with higher index values down one element to fill in
the “hole” left by the deleted element, and we decrement nElems.
– In a real program, we would also take appropriate action if the item to be deleted could not
be found.
• Display
– Displaying all the elements is straightforward: We step through the array, accessing each
one with arr[j] and displaying it.
Activity
• Rewrite the previous Java application so that
it now contains two classes, namely:
– ArrayData
• To maintain the array of 100 long values
• To keep track of the number of elements in the array
(using field nElems)
• To provide methods for inserting, searching, deleting and
displaying the contents of the array
– ArrayApp
• To create the array of 10 elements
• To carry out the same operations as the previous
application but now calling the methods in the ArrayData
class
ALGORITHM ANALYSIS
Analysis of Algorithms

Input → Processing → Output

An algorithm is a step-by-step procedure for solving a problem in a finite amount of time.
Why Algorithm Analysis
• Generally, we use a computer because we
need to process a large amount of data. When
we run a program on large amounts of input,
besides making sure the program is correct,
we must be certain that the program
terminates within a reasonable amount of
time.
• Algorithm Analysis: a process of determining
the amount of time, resource, etc. required
when executing an algorithm.
Running Time
• Most algorithms transform input objects into
output objects.
• The running time of an algorithm typically
grows with the input size.
• Average case time is often difficult to
determine.
• We focus on the worst case running time.
– Easier to analyze
– Crucial to applications such as Games,
Finance and Robotics
Experimental Studies – Empirical Analysis

1. Write a program implementing the algorithm


2. Run the program with inputs of varying size
and composition
3. Use a standard function to get an accurate
measure of the actual running time
4. Plot the results
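The four steps above can be sketched as follows, assuming that the standard function is Java's System.nanoTime() and that the algorithm under test is a simple summation. The class and method names are illustrative, and the measured numbers will differ from run to run and from machine to machine.

```java
// TimingDemo.java
// times one algorithm on inputs of doubling size
class TimingDemo
{
    // the algorithm under test: one pass over the array
    static long sum(long[] a)
    {
        long total = 0;
        for (int j = 0; j < a.length; j++)
            total += a[j];
        return total;
    }

    // returns the elapsed wall-clock time, in nanoseconds, of one run on n items
    static long timeOnce(int n)
    {
        long[] a = new long[n];
        for (int j = 0; j < n; j++)
            a[j] = j;
        long start = System.nanoTime();
        long result = sum(a);
        long elapsed = System.nanoTime() - start;
        if (result < 0)                 // use the result so the call is not skipped
            System.out.println(result);
        return elapsed;
    }

    public static void main(String[] args)
    {
        // print one (input size, time) pair per line, ready for plotting
        for (int n = 100000; n <= 1600000; n *= 2)
            System.out.println(n + "\t" + timeOnce(n) + " ns");
    }
}
```

A single run per size is noisy; repeating each measurement and averaging would give steadier figures.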
Experimental Studies (cont.)

[Figure: plot of Time (ms), from 0 to 10000, against Input size]
Experimental Studies (cont.)
• Not as easy as it may at first appear…
– What is the question?
• Running time of average case?
• Which of two algorithms is faster?
• The values of parameters that optimize
performance?
• How close does an optimizing algorithm
come to an optimum value?
(Approximation algorithms)
Experimental Studies (cont.)
• Not as easy as it may at first appear…
– What do we measure?
• Actual running time of algorithm?
–What about context switches and the
like?
–Poor memory use (i.e. cache misses,
page faults) can give misleading results
–Platform dependence?
Experimental Studies (cont.)

• Count Primitive Operations
– Number of memory references
– Number of comparisons (in a sorting
algorithm)
– Number of arithmetic operations
Experimental Studies (cont.)
• Not as easy as it may at first appear…
– Where do we get the test data from?
• Need to generate enough samples to give statistically
significant results
• Need to generate samples of various sizes to give
performance numbers over various sample sizes
• Need test data that represents realistic situation in which
algorithm is used.
– Generating data uniformly and at random is not often
the correct choice
– Ex. Network algorithms on randomly generated graphs
– Ex. Word searches on random words not effective since
word distributions in natural languages are not uniform
Limitations of Experiments
• It is necessary to implement the algorithm,
which may be difficult
• Results may not be indicative of the running
time on other inputs not included in the
experiment.
• In order to compare two algorithms, the same
hardware and software environments must
be used
• In the end you may discard the algorithm!
Theoretical Analysis

• Uses a high-level description of the
algorithm instead of an implementation
• Characterizes running time as a function of
the input size, n.
• Takes into account all possible inputs
• Allows us to evaluate the speed of an
algorithm independent of the
hardware/software environment
Pseudocode

• High-level description of an algorithm


• More structured than English prose
• Less detailed than a program
• Preferred notation for describing algorithms
• Hides many program design issues
Pseudocode (Example)
Example: find max element of an array

Algorithm arrayMax(A, n)
Input array A of n integers
Output maximum element of A
currentMax ← A[0]
for i ← 1 to n − 1 do
if A[i] > currentMax then
currentMax ← A[i]
return currentMax
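For comparison, the same algorithm can be written in Java. This is a sketch that assumes a non-empty array and uses a.length in place of the separate parameter n; the class name is illustrative.

```java
// ArrayMaxDemo.java
// Java version of the arrayMax pseudocode
class ArrayMaxDemo
{
    // returns the maximum element of a; assumes a.length >= 1
    static int arrayMax(int[] a)
    {
        int currentMax = a[0];
        for (int i = 1; i <= a.length - 1; i++)
            if (a[i] > currentMax)
                currentMax = a[i];
        return currentMax;
    }

    public static void main(String[] args)
    {
        int[] a = {77, 99, 44, 55, 22};
        System.out.println("max=" + arrayMax(a));
    }
}
```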
Pseudocode Details
• Control flow
– if … then … [else …]
– while … do …
– repeat … until …
– for … do …
– Indentation replaces braces
• Method declaration
Algorithm method (arg [, arg…])
Input …
Output …
Pseudocode Details (Cont.)
• Method call
var.method (arg [, arg…])
• Return value
return expression
• Expressions
← Assignment
(like = in C++/Java)
= Equality testing
(like == in C++/Java)
n² Superscripts and other mathematical formatting allowed
The Random Access Machine (RAM) Model

• A CPU
• A potentially unbounded bank of memory cells, each of which can hold an
arbitrary number or character
• Memory cells are numbered (0, 1, 2, …) and accessing any cell in memory takes unit
time.
The RAM Model (Cont.)
• Each “simple” operation (+, *, -, =, if) takes exactly 1
time step.
• Loops and subroutines are not considered simple
operations.
– Instead, they are the composition of many single-step
operations.
– The time it takes to run through a loop or execute a
subprogram depends upon the number of loop iterations or
the specific nature of the function.
• Each memory access takes exactly one time step, and
we have as much memory as we need.
– The RAM model takes no notice of whether an item is in
cache or on the disk, which simplifies the analysis.
Primitive Operations
• Basic computations performed by an
algorithm
• Identifiable in pseudocode
• Largely independent from the
programming language
• Examples:
– Evaluating an expression
– Assigning a value to a variable
– Indexing into an array
Primitive Operations (Cont.)
• Exact definition not important (we will see
why later)
• Assumed to take a constant amount of
time in the RAM model
• Further examples:
– Calling a method
– Returning from a method
Counting Primitive Operations
• By inspecting the pseudocode, we can
determine the maximum number of
primitive operations executed by an
algorithm, as a function of the input size
Counting Primitive Operations (Cont.)

Algorithm arrayMax(A, n)              # operations
currentMax ← A[0]                     1 op
for i ← 1 to n − 1 do                 loop runs n − 1 times
    if A[i] > currentMax then         1 op (index, comparison)
        currentMax ← A[i]             0 or 1 op (index, assignment)
    { increment loop counter i }      1 op
return currentMax                     1 op

2n ≤ Total ≤ 3n − 1


Where Did That Come From?
• Incrementing the loop counter costs (n − 1) ops, plus a
check each time for (n − 1) more ops.
• Each time through, the body of the loop costs either 0
ops or 1 op.
• Add 1 op for the initial assignment and 1 op for the return.

1 + (n − 1) + (n − 1) + 1 ≤ Total ≤ 1 + 2(n − 1) + (n − 1) + 1

2n ≤ Total ≤ 3n − 1
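The two bounds can be checked with an instrumented version of arrayMax. This is a sketch that follows the accounting used on this slide (one op for the initial assignment, one increment and one comparison per iteration, one more when currentMax is updated, and one for the return); the class and method names are illustrative.

```java
// OpCountDemo.java
// counts primitive operations of arrayMax under the slide's accounting
class OpCountDemo
{
    // returns the number of primitive operations performed on a;
    // assumes a.length >= 1
    static int countOps(int[] a)
    {
        int ops = 1;                    // currentMax <- A[0]
        int currentMax = a[0];
        for (int i = 1; i < a.length; i++)
        {
            ops += 2;                   // loop increment and the if comparison
            if (a[i] > currentMax)
            {
                currentMax = a[i];
                ops++;                  // assignment in the loop body
            }
        }
        ops++;                          // return currentMax
        return ops;
    }

    public static void main(String[] args)
    {
        int[] descending = {5, 4, 3, 2, 1};  // body never assigns: 2n ops
        int[] ascending  = {1, 2, 3, 4, 5};  // body always assigns: 3n - 1 ops
        System.out.println(countOps(descending)); // 10
        System.out.println(countOps(ascending));  // 14
    }
}
```

With n = 5 the best-case input gives 2n = 10 and the worst-case input gives 3n − 1 = 14, so both bounds are attained.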
Estimating Running Time

• Algorithm arrayMax executes 3n − 1 primitive
operations in the worst case. Define:
a = Time taken by the fastest primitive
operation
b = Time taken by the slowest primitive
operation
Estimating Running Time (Cont.)

• Let T(n) be the worst-case time of arrayMax. Then
T(n) ≤ b(3n − 1)
• Hence, the running time T(n) is bounded by a
linear function
• Actual running time can be between a(2n) and
b(3n − 1)
Growth Rate of Running Time
• Changing the hardware/software
environment
– Affects T(n) by a constant factor, but
– Does not alter the growth rate of T(n)
• The linear growth rate of the running time
T(n) is an intrinsic property of algorithm
arrayMax
• T(n) is also known as the time complexity of
an algorithm
– Also written as C(n) or Cₙ.
Growth Rates
• Growth rates of functions:
– Linear ≈ n
– Quadratic ≈ n²
– Cubic ≈ n³

• In a log-log chart, the slope of the line
corresponds to the growth rate of the
function
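The statement about the slope can be illustrated numerically. The sketch below estimates the slope between two points of a log-log chart; logLogSlope is a hypothetical helper, and the sample points n = 10 and n = 1000 are arbitrary choices.

```java
// GrowthRateDemo.java
// estimates the slope of T(n) on a log-log chart
class GrowthRateDemo
{
    // slope of the straight line through (n1, t1) and (n2, t2)
    // after taking logarithms of both coordinates
    static double logLogSlope(double n1, double t1, double n2, double t2)
    {
        return (Math.log(t2) - Math.log(t1)) / (Math.log(n2) - Math.log(n1));
    }

    public static void main(String[] args)
    {
        double n1 = 10, n2 = 1000;
        // linear, quadratic and cubic growth give slopes close to 1, 2 and 3
        System.out.println(logLogSlope(n1, n1, n2, n2));
        System.out.println(logLogSlope(n1, n1 * n1, n2, n2 * n2));
        System.out.println(logLogSlope(n1, n1 * n1 * n1, n2, n2 * n2 * n2));
        // a constant factor shifts the line but leaves the slope unchanged
        System.out.println(logLogSlope(n1, 5 * n1 * n1, n2, 5 * n2 * n2));
    }
}
```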
Growth Rates (Cont.)
[Figure: log-log chart of T(n) against n, with n from 1E-1 to 1E+9 and T(n) from 1E-1 to 1E+29, showing the Linear, Quadratic and Cubic curves]
Constant Factors

• The growth rate is not affected by
– constant factors or
– lower-order terms
• Examples
– 10²n + 10⁵ is a linear function
– 10⁵n² + 10⁸n is a quadratic function
Constant Factors
[Figure: log-log chart of T(n) against n, with n from 1E-1 to 1E+9 and T(n) from 1E-1 to 1E+25, showing two Linear and two Quadratic curves]
Exercise 1

• Suppose that a particular algorithm has
time complexity T(n) = 3 × 2ⁿ, and executing
an implementation of it on a particular
machine takes T seconds for n inputs. Now
suppose that we are presented with a
machine that is 64 times as fast. How many
inputs would we process on the new
machine in T seconds?
Exercise 2
• Suppose that another algorithm has time
complexity T(n) = n², and that executing an
implementation of it on a particular machine
takes T seconds for n inputs. Now suppose
that we are presented with a machine that is
64 times as fast. How many inputs would we
process on the new machine in T seconds?
Exercise 3
• Suppose that another algorithm has time
complexity T(n)=8n, and that executing an
implementation of it on a particular machine
takes T seconds for n inputs. Now suppose that
we are presented with a machine that is 64 times
as fast. How many inputs would we process on
the new machine in T seconds?
Best, Worst and Average Cases
• Based on input, the running time of an
algorithm varies accordingly.
• For example, a sequential search algorithm
begins searching at the first position in the
array and looks at each value in turn until the
item is found.
• There is a wide range of running times:
– Best case, item is found in first position
– Worst case, item is found in the last position
– Average case, item is found half-way
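The three cases can be made concrete by counting how many elements a sequential search examines. This is a sketch using the ten values from the earlier array application; the class and method names are illustrative, and the average assumes that every stored key is equally likely to be searched for.

```java
// SearchCostDemo.java
// counts the elements examined by a sequential search
class SearchCostDemo
{
    // returns how many elements are examined when searching a for key
    // (a.length if the key is absent)
    static int comparisons(long[] a, long key)
    {
        int count = 0;
        for (int j = 0; j < a.length; j++)
        {
            count++;
            if (a[j] == key)
                break;
        }
        return count;
    }

    public static void main(String[] args)
    {
        long[] arr = {77, 99, 44, 55, 22, 88, 11, 0, 66, 33};
        System.out.println("best:  " + comparisons(arr, 77));  // first position
        System.out.println("worst: " + comparisons(arr, 33));  // last position
        int total = 0;
        for (long key : arr)            // search once for every stored key
            total += comparisons(arr, key);
        System.out.println("average: " + (double) total / arr.length);
    }
}
```

For n = 10 the best case examines 1 element, the worst case 10, and the average over all stored keys is 5.5, that is (n + 1)/2, or roughly half-way.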
Best, Worst and Average Cases (Cont.)
• When analyzing an algorithm, should we
analyze the Best, Worst or Average Case?
– Normally, we are not interested in the Best
case as this would happen rarely
– Average case would be ideal as this gives the
realistic situation but unfortunately it is not
always possible.
Best, Worst and Average Cases (Cont.)

– How about Worst case?
• It may happen rarely
• It has the advantage of telling us that
the algorithm will never get worse
than this case!
• Also suitable for real-time apps like
Air-traffic Control system
–Good to know that there will be at
most ‘so many’ planes at one time
A Faster Computer or a Faster Algorithm?

• Suppose we have an algorithm that has
running time proportional to n²
– Given a computer that is 10 times faster,
will the running time n² become
acceptable?
– Maybe
– But the funny thing is that when we get
faster computers, we run bigger programs!
A Faster Computer or a Faster Algorithm? (Cont.)

– If on a slower machine we were sorting n
inputs, on a machine that is 10 times faster
we will try to sort 10n inputs
– If the running time of the sorting algorithm is linear,
then the running time for 10n inputs on the faster
machine will be the same as before,
– BUT if the running time is quadratic then the time
taken to run the 10n inputs will be 10 times
longer than before!
– So we had better opt for a FASTER algorithm
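The arithmetic behind this comparison can be sketched as follows, assuming a running time proportional to n raised to some degree. The helper relativeTime is hypothetical.

```java
// SpeedupDemo.java
// compares new and old running times after a hardware upgrade
class SpeedupDemo
{
    // ratio of the new running time to the old one when the input grows
    // by the factor growth, the machine is speedup times faster, and
    // T(n) is proportional to n^degree
    static double relativeTime(double growth, double speedup, int degree)
    {
        return Math.pow(growth, degree) / speedup;
    }

    public static void main(String[] args)
    {
        // 10 times the input on a machine that is 10 times faster
        System.out.println("linear:    " + relativeTime(10, 10, 1)); // same time
        System.out.println("quadratic: " + relativeTime(10, 10, 2)); // 10 times longer
    }
}
```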
Asymptotic Analysis

• It refers to the study of an algorithm as the
input size “gets big” or reaches a limit.
• i.e. we study the growth rate by ignoring
the constants and the lower-order terms
• Several terms are used to describe the
running-time of an algorithm, along with
their associated symbols
Asymptotic Analysis –
Upper Bound
• It refers to the upper or highest growth rate
an algorithm can have
• NOTE: Upper bound is not the same as the
worst case for a given input of size n.
• We speak of the upper bound for
some class of inputs of size n, which may be
– best-case, average-case or worst-case
• E.g. We can say “This algorithm has an
upper bound to its growth of n² in the
worst case”.
Upper Bound - Notation
• The phrase “has an upper bound to its growth“
is long and thus we adopt a special notation
– “Big-Oh” notation, written as “O”
• “This algorithm has an upper bound to its
growth of n² in the worst case” can be
rewritten as
– This algorithm is in O(n²) in the worst case
Big-Oh Definition
• Definition
– T(n) is in the set O(f(n)) if there exist two
positive constants c and n₀ such that
|T(n)| ≤ c|f(n)| for all n > n₀
– Constant n₀ is the smallest value of n for
which the claim of an upper bound holds
true.
• Usually n₀ is small e.g. 1
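The definition can be explored numerically. The sketch below tests the inequality for every n in a finite range, using T(n) = 7n − 2 purely as an illustrative function; the type Fn and the method holds are hypothetical helpers. A finite check can refute a claimed bound but cannot prove it, since the definition requires the inequality for all n > n₀.

```java
// BigOhCheck.java
// checks |T(n)| <= c|f(n)| over a finite range of n
class BigOhCheck
{
    // a function of the input size n (hypothetical helper type)
    interface Fn { double at(long n); }

    // returns true if |T(n)| <= c|f(n)| for every n with n0 < n <= limit
    static boolean holds(Fn t, Fn f, double c, long n0, long limit)
    {
        for (long n = n0 + 1; n <= limit; n++)
            if (Math.abs(t.at(n)) > c * Math.abs(f.at(n)))
                return false;
        return true;
    }

    public static void main(String[] args)
    {
        Fn t = n -> 7.0 * n - 2;        // T(n) = 7n - 2
        Fn linear = n -> n;             // f(n) = n
        Fn constant = n -> 1.0;         // f(n) = 1
        // consistent with T(n) being in O(n) for c = 7 and n0 = 1
        System.out.println(holds(t, linear, 7.0, 1, 100000));
        // refutes the constant bound for c = 1000: fails once 7n - 2 > 1000
        System.out.println(holds(t, constant, 1000.0, 1, 100000));
    }
}
```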
Big-Oh Example1
• Consider the sequential search algorithm for
finding a specified value in an array.
• If visiting and testing one value in the array
requires cₛ steps where cₛ is a positive number,
then in the average case
T(n) = cₛn/2.
• For all values of n > 1, |cₛn/2| ≤ |cₛn|
• Therefore T(n) is in O(n) for n₀ = 1 and c = cₛ.
Big-Oh Example2

• For a particular algorithm, T(n) = c₁n² + c₂n in the
average case where c₁ and c₂ are positive
numbers.
• Then |c₁n² + c₂n| ≤ |c₁n² + c₂n²| ≤ (c₁ + c₂)|n²| for
all n > 1.
• So T(n) ≤ c|n²| for c = c₁ + c₂ and n₀ = 1
• Therefore T(n) is in O(n²) by definition
Big-Oh Example3
• Assigning the first position of an array to a
variable takes a constant time regardless of
the size of the array.
• Thus T(n)=c (for the best, worst and the
average cases).
• Therefore we could say T(n) is in O(c)
• But it is traditional to say that an algorithm
whose running time has a constant upper
bound is in O(1)
Big-Oh Exercise 1
T(n) = 7n − 2
• For all values of n ≥ 1, |7n − 2| ≤ |7n|
• Therefore T(n) is in O(n) for n₀ = 1 and c = 7
Big-Oh Exercise 2
T(n) = 3n³ + 20n² + 5
For all values of n > 1,
|3n³ + 20n² + 5| ≤ |3n³ + 20n³|
|3n³ + 20n² + 5| ≤ 23|n³|
• Therefore T(n) is in O(n³) for n₀ = 2 and c = 23
Big-Oh Exercise 3
• T(n)=3 log n + log log n
• For all values of n>1,
|3 log n + log log n|≤ |3 log n + log n|
• |3 log n + log log n|≤ 4|log n|
• Therefore T(n) is in O(log n) for n0=2 and c=4
Big-Oh Rules
• If f(n) is a polynomial of degree d, then f(n) is
O(nᵈ), i.e.,
1. Drop lower-order terms
2. Drop constant factors
• Use the smallest possible class of functions (i.e.
tightest upper bound)
– Say “2n is in O(n)” instead of “2n is in O(n²)”
• Use the simplest expression of the class
– Say “3n + 5 is in O(n)” instead of “3n + 5 is in
O(3n)”
Asymptotic Analysis –
Lower Bound

• A lower bound defines the least
amount of some resource (usually time) that is
required by an algorithm for some class of inputs
of size n (i.e. worst-, best- or average-case)
• It is denoted by symbol Ω, pronounced as “big-
Omega” or just “Omega”
Lower Bound - Definition
• Definition
– T(n) is in the set Ω(g(n)) if there exist two
positive constants c and n₀ such that
|T(n)| ≥ c|g(n)| for all n > n₀
– Constant n₀ is the smallest value of n for
which the claim of a lower bound holds
true.
• Usually n₀ is small e.g. 1
Lower Bound - Example
• For a particular algorithm, assume
T(n) = c₁n² + c₂n in the average case where c₁
and c₂ are positive numbers.
• Then |c₁n² + c₂n| ≥ |c₁n²| = c₁|n²| for all
n > 1.
• So T(n) ≥ c|n²| for c = c₁ and n₀ = 1
• Therefore T(n) is in Ω(n²) by definition
– Here also we get the tightest bound,
though we could also say T(n) is in Ω(n)
θ Notation
• When the lower and upper bounds for an
algorithm (in a specific input class) are the same,
then we indicate that by the θ notation.
• An algorithm is said to be θ(h(n)) if it is in
O(h(n)) and in Ω(h(n)).
• We drop the word “in” for θ notation because
there is a strict equality for two equations with
the same θ.
– i.e. if f(n) is θ(g(n)) then g(n) is θ(f(n)).
Activity
1. A program is O(N). It takes the program 8
seconds to complete when working with a
data set of 100,000 items. (N = 100,000) What
is the predicted time for the program to
complete when working with a data set of
200,000 items?
Homework (Cont.)
2. A program is O(N²). It takes the program 5
seconds to complete when working with a
data set of 10,000 items. (N = 10,000) What is
the predicted time for the program to
complete when working with a data set of
20,000 items?
3. A program is O(N³). It takes the program 5
minutes to complete when working with a
data set of 1,000,000 items. What is the
predicted time for the program to complete
when working with a data set of 3,000,000
items?
