03 Algorithm Analysis
• 3.1 Experimental Studies
• 3.2 Seven Important Functions
• 3.3 Asymptotic Analysis
• 3.4 Simple Justification Techniques
Running Time
• Most algorithms transform input objects into output objects.
• The running time of an algorithm typically grows with the input size.
• Average-case time is often difficult to determine.
• We focus on the worst-case running time.
  • Easier to analyze
  • Crucial to applications such as games, finance, and robotics
[Chart: best-case, average-case, and worst-case running time versus input size]
3.1 Experimental Studies
• Write a program implementing the algorithm
• Run the program with inputs of varying size and composition, noting the time needed
• Plot the results
[Chart: measured running time in ms versus input size]
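A minimal sketch of this experimental setup in Python, using the standard time module (the choice of find_max as the algorithm under test, the input sizes, and the random inputs are all illustrative assumptions):

    import time
    from random import randint

    def find_max(data):
        """Algorithm under test: return the maximum of a nonempty list."""
        biggest = data[0]
        for val in data:
            if val > biggest:
                biggest = val
        return biggest

    # Run the algorithm on inputs of varying size, noting the time needed.
    for n in (1000, 2000, 3000, 4000):
        data = [randint(0, n) for _ in range(n)]       # random input of size n
        start = time.time()                            # record the starting time
        find_max(data)
        elapsed_ms = (time.time() - start) * 1000      # elapsed wall-clock time in ms
        print(n, elapsed_ms)                           # (input size, time) pairs to plot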
Limitations of Experiments
• It is necessary to implement the algorithm, which may be difficult
• Results may not be indicative of the running time on other inputs not
included in the experiment.
• In order to compare two algorithms, the same hardware and software
environments must be used
3.1.1 Moving Beyond Experimental Analysis
• Uses a high-level description of the algorithm instead of an
implementation
• Characterizes running time as a function of the input size, n.
• Takes into account all possible inputs
• Allows us to evaluate the speed of an algorithm independent of
the hardware/software environment
Pseudocode
• High-level description of an algorithm
• More structured than English prose
• Less detailed than a program
• Preferred notation for describing algorithms
• Hides program design issues
Pseudocode Details
• Control flow
  • if … then … [else …]
  • while … do …
  • repeat … until …
  • for … do …
  • Indentation replaces braces
• Method declaration
  Algorithm method (arg [, arg …])
    Input …
    Output …
• Method call
  method (arg [, arg …])
• Return value
  return expression
• Expressions:
  ← Assignment
  = Equality testing
  n² Superscripts and other mathematical formatting allowed
The Random Access Machine (RAM) Model
• A CPU
• A potentially unbounded bank of memory cells, each of which can hold an arbitrary number or character
• Memory cells are numbered, and accessing any cell in memory takes unit time.
[Figure: memory cells numbered 0, 1, 2, …]
3.2 Seven Important Functions
• Seven functions that often appear in algorithm analysis:
  • Constant: 1
  • Logarithmic: log n
  • Linear: n
  • N-Log-N: n log n
  • Quadratic: n²
  • Cubic: n³
  • Exponential: 2ⁿ
• In a log-log chart, the slope of the line corresponds to the growth rate
http://net.pku.edu.cn/~course/cs202/2015/resource/other/alganal.py
3.3 Asymptotic Analysis
Primitive Operations
• Basic computations performed by an algorithm
• Identifiable in pseudocode
• Largely independent from the programming language
• Exact definition not important
• Assumed to take a constant amount of time in the RAM model
• Examples:
• Evaluating an expression
• Assigning a value to a variable
• Indexing into an array
• Calling a method
• Returning from a method
Counting Primitive Operations
• By inspecting the pseudocode, we can determine the maximum number of primitive operations executed by an algorithm, as a function of the input size
• For find_max (see the sketch below): step 1: 2 ops; step 3: 2 ops; step 4: 2n ops; step 5: 2n ops; step 6: 0 to n ops; step 7: 1 op
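The pseudocode figure is not reproduced in these notes; the following Python sketch of find_max is consistent with the counts above (the mapping of steps to lines is an assumption):

    def find_max(data):
        """Return the maximum element of a nonempty list."""
        biggest = data[0]          # step 1: index into the list and assign (2 ops)
        for val in data:           # step 3: initialize loop (2 ops); step 4: per-iteration bookkeeping (2n ops)
            if val > biggest:      # step 5: fetch val and compare (2n ops)
                biggest = val      # step 6: executed 0 to n times (0 to n ops)
        return biggest             # step 7: return the result (1 op)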
Estimating Running Time
• Algorithm find_max executes 5n + 5 primitive operations in the worst
case, 4n + 5 in the best case. Define:
a = Time taken by the fastest primitive operation
b = Time taken by the slowest primitive operation
• Let T(n) be worst-case time of find_max. Then
a(4n + 5) ≤ T(n) ≤ b(5n + 5)
• Hence, the running time T(n) is bounded by two linear functions.
Growth Rate of Running Time
• Changing the hardware/software environment
• Affects T(n) by a constant factor, but
• Does not alter the growth rate of T(n)
• The linear growth rate of the running time T(n) is an
intrinsic property of algorithm find_max
Slide by Matt Stallmann
Why Growth Rate Matters

if runtime is…   time for n + 1      time for 2n         time for 4n
c lg n           c lg(n + 1)         c(lg n + 1)         c(lg n + 2)
c n              c(n + 1)            2c n                4c n
c n lg n         ~ c n lg n + c n    2c n lg n + 2c n    4c n lg n + 4c n
c n²             ~ c n² + 2c n       4c n²               16c n²
c n³             ~ c n³ + 3c n²      8c n³               64c n³
c 2ⁿ             c 2ⁿ⁺¹              c 2²ⁿ               c 2⁴ⁿ

For a quadratic algorithm (c n²), the runtime quadruples when the problem size doubles.
Slide by Matt Stallmann
Comparison of Two Algorithms
• insertion sort is n²/4
• merge sort is 2n lg n
• Sort a million items? Insertion sort takes roughly 70 hours, while merge sort takes roughly 40 seconds.
• This is a slow machine, but even on one 100× as fast, it is still 40 minutes versus less than 0.5 seconds.
Constant Factors
• The growth rate is not affected by
  • constant factors or
  • lower-order terms
• Examples
  • 10²n + 10⁵ is a linear function
  • 10⁵n² + 10⁸n is a quadratic function
[Chart: log-log plot of T(n) versus n for a linear and a quadratic function, with and without constant factors]
3.3.1 Big-Oh Notation
• Given functions f(n) and g(n), we say that f(n) is O(g(n)) if there are positive constants c and n₀ such that f(n) ≤ c·g(n) for n ≥ n₀
• Example: 2n + 10 is O(n)
  • 2n + 10 ≤ cn
  • (c − 2)n ≥ 10
  • n ≥ 10/(c − 2)
  • Pick c = 3 and n₀ = 10
[Chart: n, 2n + 10, and 3n on a log-log scale]
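A quick numerical spot check of these constants in Python (illustrative; a finite test supports, but does not prove, the bound):

    # f(n) = 2n + 10 should satisfy f(n) <= 3n for every tested n >= n0 = 10
    assert all(2 * n + 10 <= 3 * n for n in range(10, 100001))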
Big-Oh Example
• Example: the function n² is not O(n)
  • n² ≤ cn
  • n ≤ c
  • The above inequality cannot be satisfied, since c must be a constant
[Chart: n, 10n, 100n, and n² on a log-log scale]
More Big-Oh Examples
• 7n − 2 is O(n)
  need c > 0 and n₀ ≥ 1 such that 7n − 2 ≤ c·n for n ≥ n₀
  this is true for c = 7 and n₀ = 1
• 3n³ + 20n² + 5 is O(n³)
  need c > 0 and n₀ ≥ 1 such that 3n³ + 20n² + 5 ≤ c·n³ for n ≥ n₀
  this is true for c = 4 and n₀ = 21
• 3 log n + 5 is O(log n)
  need c > 0 and n₀ ≥ 1 such that 3 log n + 5 ≤ c·log n for n ≥ n₀
  this is true for c = 8 and n₀ = 2
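The same kind of finite spot check for the three examples above, in Python:

    import math

    # 7n - 2 <= 7n for n >= 1
    assert all(7 * n - 2 <= 7 * n for n in range(1, 10001))
    # 3n^3 + 20n^2 + 5 <= 4n^3 for n >= 21
    assert all(3 * n**3 + 20 * n**2 + 5 <= 4 * n**3 for n in range(21, 10001))
    # 3 log n + 5 <= 8 log n for n >= 2 (logs base 2)
    assert all(3 * math.log2(n) + 5 <= 8 * math.log2(n) for n in range(2, 10001))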
Big-Oh and Growth Rate
• The big-Oh notation gives an upper bound on the growth rate of a function
• The statement “f(n) is O(g(n))” means that the growth rate of f(n) is no more
than the growth rate of g(n)
• We can use the big-Oh notation to rank functions according to their growth rate
                     f(n) is O(g(n))    g(n) is O(f(n))
g(n) grows more      Yes                No
f(n) grows more      No                 Yes
Same growth          Yes                Yes
Big-Oh Rules
• If f(n) is a polynomial of degree d, then f(n) is O(nᵈ), i.e.,
  1. Drop lower-order terms
  2. Drop constant factors
• Use the smallest possible class of functions
  • Say “2n is O(n)” instead of “2n is O(n²)”
• Use the simplest expression of the class
  • Say “3n + 5 is O(n)” instead of “3n + 5 is O(3n)”
Asymptotic Algorithm Analysis
• The asymptotic analysis of an algorithm determines the running time in big-Oh
notation
• To perform the asymptotic analysis
• We find the worst-case number of primitive operations executed as a function of the input size
• We express this function with big-Oh notation
• Example:
• We say that algorithm find_max “runs in O(n) time”
• Since constant factors and lower-order terms are eventually dropped anyhow, we can
disregard them when counting primitive operations
3.3.3 Example: Computing Prefix Averages
• We further illustrate asymptotic analysis with three algorithms for prefix averages
• The i-th prefix average of an array X is the average of the first (i + 1) elements of X:
  A[i] = (X[0] + X[1] + … + X[i]) / (i + 1)
• Computing the array A of prefix averages of another array X has applications to financial analysis
[Chart: an array X alongside its prefix-average array A]
Prefix Averages 1 (Quadratic)
The following algorithm computes prefix averages in quadratic time by applying the definition directly.
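The code figure from the original slide is not reproduced; a Python sketch consistent with its description (nested loops applying the definition) is:

    def prefixAverage1(S):
        """Return a list A such that A[i] is the average of S[0], ..., S[i]."""
        n = len(S)
        A = [0] * n
        for i in range(n):
            total = 0                  # recompute S[0] + ... + S[i] from scratch
            for j in range(i + 1):
                total += S[j]
            A[i] = total / (i + 1)     # average of the first i + 1 elements
        return A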
Arithmetic Progression
• The running time of prefixAverage1 is O(1 + 2 + … + n)
• The sum of the first n integers is n(n + 1)/2
  • There is a simple visual proof of this fact
• Thus, algorithm prefixAverage1 runs in O(n²) time
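Spelling out that bound: the inner loop performs 1 + 2 + … + n = n(n + 1)/2 additions in total, so T(n) ≤ c·n(n + 1)/2 = (c/2)n² + (c/2)n for some constant c, which is O(n²).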
Prefix Averages 2 (Looks Better)
The following algorithm uses a built-in Python function to simplify the code.
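A sketch matching this description, using the built-in sum function:

    def prefixAverage2(S):
        """Return a list A such that A[i] is the average of S[0], ..., S[i]."""
        n = len(S)
        A = [0] * n
        for i in range(n):
            A[i] = sum(S[0:i + 1]) / (i + 1)   # slicing and sum each take O(i + 1) time
        return A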
Algorithm prefixAverage2 still runs in O(n²) time!
Prefix Averages 3 (Linear Time)
The following algorithm computes prefix averages in linear time by keeping a running sum.
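A sketch matching this description, maintaining a running sum:

    def prefixAverage3(S):
        """Return a list A such that A[i] is the average of S[0], ..., S[i]."""
        n = len(S)
        A = [0] * n
        total = 0                 # running sum of S[0], ..., S[i]
        for i in range(n):
            total += S[i]         # O(1) update instead of re-summing the prefix
            A[i] = total / (i + 1)
        return A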
Algorithm prefixAverage3 runs in O(n) time
Math You Need to Review
• Summations
• Logarithms and exponents
• Proof techniques
• Basic probability
• Properties of logarithms:
  log_b(xy) = log_b x + log_b y
  log_b(x/y) = log_b x − log_b y
  log_b x^a = a log_b x
  log_b a = log_x a / log_x b
• Properties of exponentials:
  a^(b+c) = a^b · a^c
  a^(bc) = (a^b)^c
  a^b / a^c = a^(b−c)
  b = a^(log_a b)
  b^c = a^(c·log_a b)
Relatives of Big-Oh
• big-Omega
  f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n₀ ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n₀
• big-Theta
  f(n) is Θ(g(n)) if there are constants c′ > 0 and c″ > 0 and an integer constant n₀ ≥ 1 such that c′·g(n) ≤ f(n) ≤ c″·g(n) for n ≥ n₀
Intuition for Asymptotic Notation
• Big-Oh
  f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n)
• big-Omega
  f(n) is Ω(g(n)) if f(n) is asymptotically greater than or equal to g(n)
• big-Theta
  f(n) is Θ(g(n)) if f(n) is asymptotically equal to g(n)
Example Uses of the Relatives of Big-Oh
• 5n² is Ω(n²)
  f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n₀ ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n₀
  let c = 5 and n₀ = 1
• 5n² is Ω(n)
  f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n₀ ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n₀
  let c = 1 and n₀ = 1
• 5n² is Θ(n²)
  f(n) is Θ(g(n)) if it is Ω(n²) and O(n²). We have already seen the former; for the latter, recall that f(n) is O(g(n)) if there is a constant c > 0 and an integer constant n₀ ≥ 1 such that f(n) ≤ c·g(n) for n ≥ n₀.
  Let c = 5 and n₀ = 1