Chapter 1-Introduction (2)
Algorithms (CoSc2041 )
Amsalu Dinote(MSc)
Chapter Outline
• Introduction
• Abstract data types
• Data structures
• Algorithms
• Properties of algorithms
• Algorithm analysis
• Measures of times
Introduction
• A program is written in order to solve a problem.
• A solution to a problem actually consists of two things:
A way to organize the data
Sequence of steps to solve the problem
• The way data are organized in a computer's memory is
called a data structure, and the sequence of
computational steps to solve a problem is called an
algorithm.
• Therefore, a program is nothing but data structures plus
algorithms.
• Given a problem, the first step in solving it is to obtain
one's own abstract view, or model, of the problem.
• This process of modeling is called abstraction.
Abstract Data Type(cont..)
• A data structure is a language construct that the
programmer has defined in order to implement an
abstract data type.
• There are lots of formalized and standard Abstract
data types such as Stacks, Queues, Trees, etc.
• Do all characteristics need to be modeled?
• Not at all!!
It depends on the scope of the model
It depends on the reason for developing the model
Abstraction
• Abstraction is a process of classifying characteristics as
relevant and irrelevant for the particular purpose at hand and
ignoring the irrelevant ones.
• Applying abstraction correctly is the essence of successful
programming
• How do data structures model the world or some part of the
world?
The value held by a data structure represents some specific
characteristic of the world
The characteristic being modeled restricts the possible
values held by a data structure
The characteristic being modeled restricts the possible
operations to be performed on the data structure.
• Note: Notice the relation between characteristic, value, and data
structures
• Where are algorithms, then?
Example 1: Employee ADT
• Suppose we want to model the employees of an
organization:
• This ADT stores employees with their relevant
attributes and discards the irrelevant ones
(relevant attributes might include name, sex, id,
salary, etc.).
• This ADT supports operations such as hiring,
firing, and retiring.
Example 2: List ADT
• An ADT for a list of integers might specify the following
operations:
Insert a new integer at a particular position in the list.
Return true if the list is empty.
Reinitialize the list.
Return the number of integers currently in the list.
Delete the integer at a particular position in the list.
• From this description, the input and output of each
operation should be clear, but the implementation for
lists has not been specified.
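• The List ADT above can be sketched in C++ as an interface plus one possible implementation. This is only an illustration: the names IntList, VectorIntList and the extra read operation get are assumptions made here, and std::vector is just one of several storage choices that would satisfy the same ADT.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Interface only: the operations named by the List ADT, with no storage decisions.
class IntList {
public:
    virtual ~IntList() = default;
    virtual void insert(std::size_t pos, int value) = 0;  // insert at position pos
    virtual bool isEmpty() const = 0;                      // true if the list is empty
    virtual void clear() = 0;                              // reinitialize the list
    virtual std::size_t size() const = 0;                  // number of integers stored
    virtual void remove(std::size_t pos) = 0;              // delete the integer at pos
    virtual int get(std::size_t pos) const = 0;            // assumed extra: read access
};

// One possible implementation (an assumption): contiguous storage via std::vector.
class VectorIntList : public IntList {
public:
    void insert(std::size_t pos, int value) override {
        assert(pos <= items_.size());
        items_.insert(items_.begin() + static_cast<std::ptrdiff_t>(pos), value);
    }
    bool isEmpty() const override { return items_.empty(); }
    void clear() override { items_.clear(); }
    std::size_t size() const override { return items_.size(); }
    void remove(std::size_t pos) override {
        assert(pos < items_.size());
        items_.erase(items_.begin() + static_cast<std::ptrdiff_t>(pos));
    }
    int get(std::size_t pos) const override {
        assert(pos < items_.size());
        return items_[pos];
    }

private:
    std::vector<int> items_;  // hidden from callers; could be swapped for a linked list
};
```

• Code that uses only the IntList operations would not need to change if VectorIntList were replaced by, say, a linked-list implementation.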
Example 3: Other ADTs
• Objects such as lists, sets, and graphs, along with their operations, can
be viewed as ADTs, just as integers, reals, and booleans are data
types.
• Integers, reals, and booleans have operations associated with them, and
so do ADTs.
• For the set ADT, we might have such operations as add, remove, size,
and contains.
• Alternatively, we might only want the two operations union and find,
which would define a different ADT on the set.
• An abstract data type can also be viewed as the realization of a data
type as a software component.
• The interface of the ADT is defined in terms of a type and a set of
operations on that type.
• The behavior of each operation is determined by its inputs and outputs.
• An ADT does not specify how the data type is implemented.
• There are lots of formalized and standard Abstract data types such as
Stacks, Queues, Trees, Graphs etc.
Data Structures
• A data structure is the implementation for an ADT.
• With abstraction you create a well-defined entity that can be
properly handled. These entities define the data structure of the
program.
• A data structure is a language construct that the programmer has
defined in order to implement an abstract data type.
• The C++ class (struct too) allows for the implementation of
ADTs, with appropriate hiding of implementation details.
• Thus, any other part of the program that needs to perform an
operation on the ADT can do so by calling the appropriate
method.
• If for some reason implementation details need to be changed, it
should be easy to do so by merely changing the routines that
perform the ADT operations.
• This change, in a perfect world, would be completely transparent
to the rest of the program.
Algorithms
• An algorithm is a well-defined computational procedure that
takes some value or a set of values as input and produces
some value or a set of values as output.
• Data structures model the static part of the world. They are
unchanging while the world is changing.
• In order to model the dynamic part of the world we need to
work with algorithms.
• Algorithms are the dynamic part of a program’s world model.
• An algorithm transforms data structures from one state to
another state in two ways:
An algorithm may change the value held by a data
structure
An algorithm may change the data structure itself
• The quality of a data structure is related to its ability
to successfully model the characteristics of the world.
• Similarly, the quality of an algorithm is related to its
ability to successfully simulate the changes in the
world.
• However, independent of any particular world model,
the quality of data structure and algorithms is
determined by their ability to work together well.
• Generally speaking, correct data structures lead to
simple and efficient algorithms and correct algorithms
lead to accurate and efficient data structures.
Properties of an Algorithm
• Finiteness: Algorithm must complete after a finite number of steps.
• Definiteness: Each step must be clearly defined, having one and only
one interpretation. At each point in computation, one should be able to
tell exactly what happens next.
• Sequence: Each step must have a unique defined preceding and
succeeding step. The first step (start step) and last step (halt step) must
be clearly noted.
• Feasibility: It must be possible to perform each instruction.
• Correctness: It must compute the correct answer for all possible legal inputs.
• Language Independence: It must not depend on any one programming
language.
• Completeness: It must solve the problem completely.
• Effectiveness: It must be possible to perform each step exactly and in a
finite amount of time.
• Efficiency: It must solve the problem using as few computational
resources, such as time and space, as possible.
• Generality: Algorithm should be valid on all possible inputs.
• Input/Output: There must be a specified number of input values, and
one or more result values.
Algorithm Analysis Concepts
• Algorithm analysis refers to the process of determining how
much computing time and storage an algorithm will require.
• In other words, it’s a process of predicting the resource
requirement of algorithms in a given environment.
• In order to solve a problem, there are many possible algorithms.
• One has to be able to choose the best algorithm for the problem at
hand using some scientific method.
• To classify some data structures and algorithms as good, we need
precise ways of analyzing them in terms of resource requirement.
• The main resources are:
Running Time
Memory Usage
Communication Bandwidth
• Running time is usually treated as the most important since
computational time is the most precious resource in most problem
domains.
• There are two approaches to measure the efficiency of algorithms:
1. Empirical: Programming competing algorithms and trying them on different
instances.
2. Theoretical: Mathematically determining the quantity of resources
(execution time, memory space, etc.) needed by each algorithm.
• However, it is difficult to use actual clock-time as a consistent measure of an
algorithm’s efficiency, because clock-time can vary based on many things.
For example,
Specific processor speed
Current processor load
Specific data for a particular run of the program
o Input Size
o Input Properties
Operating Environment
• Accordingly, we can analyze an algorithm according to the number of
operations required, rather than according to an absolute amount of time
involved. This can show how an algorithm’s efficiency changes according to
the size of the input.
Complexity Analysis
• Complexity Analysis is the systematic study of the cost of computation,
measured either in time units or in operations performed, or in the amount of
storage space required.
• The goal is to have a meaningful measure that permits comparison of
algorithms independent of operating platform.
• There are two things to consider:
Time Complexity: Determine the approximate number of operations
required to solve a problem of size n.
Space Complexity: Determine the approximate memory required to solve a
problem of size n.
• Complexity analysis involves two distinct phases:
Algorithm Analysis: Analysis of the algorithm or data structure to produce a
function T(n) that describes the algorithm in terms of the operations
performed in order to measure the complexity of the algorithm.
Order of Magnitude Analysis: Analysis of the function T(n) to determine the
general complexity category to which it belongs.
• There is no generally accepted set of rules for algorithm analysis. However,
an exact count of operations is commonly used.
Analysis Rules
1. We assume an arbitrary time unit.
2. Execution of one of the following operations takes time 1:
Assignment Operation
Single Input Operation
Single Output Operation
Single Boolean Operations
Single Arithmetic Operations
Function Return
3. Running time of a selection statement (if, switch) is the time for the condition
evaluation + the maximum of the running times for the individual clauses in the
selection.
4. Loops:
• Running time for a loop is equal to the running time for the statements inside the
loop * number of iterations.
• The total running time of a statement inside a group of nested loops is the running
time of the statements multiplied by the product of the sizes of all the loops.
• NB: For nested loops, analyze inside out. Always assume that the loop executes the
maximum number of iterations possible.
5. Running time of a function call is 1 for setup + the time for any parameter calculations
+ the time required for the execution of the function body.
Algorithm Analysis Examples
Example 1:
int count()
{
    int n;
    int k = 0;
    cout << "Enter an integer";
    cin >> n;
    for (int i = 0; i < n; i++)
        k = k + 1;
    return 0;
}
Time Units to Compute
-------------------------------------------------
1 for the assignment statement: int k=0
1 for the output statement.
1 for the input statement.
In the for loop:
1 assignment, n+1 tests, and n increments.
n loops of 2 units for an assignment, and an addition.
1 for the return statement.
-------------------------------------------------------------------
T (n)= 1+1+1+(1+n+1+n)+2n+1 = 4n+6 = O(n)
Example 2:
int total(int n)
{
int sum=0;
for (int i=1;i<=n;i++)
sum=sum+1;
return sum;
}
Time Units to Compute
-------------------------------------------------
1 for the assignment statement: int sum=0
In the for loop:
1 assignment, n+1 tests, and n increments.
n loops of 2 units for an assignment, and an addition.
1 for the return statement.
-------------------------------------------------------------------
T (n)= 1+ (1+n+1+n)+2n+1 = 4n+4 = O(n)
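• The count for Example 2 can be checked by instrumenting the loop and charging one time unit per operation, following the analysis rules given earlier. The sketch below is only a check of the arithmetic; the name countTotalOps is an assumption introduced here, and the for loop is unrolled into a while loop so that each test and increment can be charged explicitly.

```cpp
#include <cstdint>

// Replays total(n) while charging one time unit per operation.
// Returns the total number of time units, T(n).
std::int64_t countTotalOps(std::int64_t n) {
    std::int64_t ops = 0;
    std::int64_t sum = 0;
    ops += 1;                    // assignment: sum = 0
    std::int64_t i = 1;
    ops += 1;                    // loop initialization: i = 1
    while (true) {
        ops += 1;                // loop test: i <= n (runs n+1 times)
        if (!(i <= n)) break;
        sum = sum + 1;
        ops += 2;                // one addition and one assignment
        i++;
        ops += 1;                // loop increment
    }
    ops += 1;                    // return statement
    (void)sum;                   // sum is not needed for the count itself
    return ops;
}
```

• For any n ≥ 0 the returned value should agree with the closed form 4n+4 derived above.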
Example 3:
void func()
{
    int n;
    int x = 0;
    int i = 0;
    int j = 1;
    cout << "Enter a value";
    cin >> n;
    while (i < n) {
        x++;
        i++;
    }
    while (j < n)
    {
        j++;
    }
}
Time Units to Compute
-------------------------------------------------
1 for the first assignment statement: x=0;
1 for the second assignment statement: i=0;
1 for the third assignment statement: j=1;
1 for the output statement.
1 for the input statement.
In the first while loop:
n+1 tests
n loops of 2 units for the two increment (addition) operations
In the second while loop:
n tests
n-1 increments
-------------------------------------------------------------------
T (n)= 1+1+1+1+1+(n+1)+2n+n+(n-1) = 5n+5 = O(n)
Example 4:
int sum (int n)
{
int partial_sum = 0;
for (int i = 1; i <= n; i++)
partial_sum = partial_sum +(i * i * i);
return partial_sum;
}
Time Units to Compute
-------------------------------------------------
1 for the assignment.
1 assignment, n+1 tests, and n increments.
n loops of 4 units for an assignment, an addition, and two
multiplications.
1 for the return statement.
-------------------------------------------------------------------
T (n)= 1+(1+n+1+n)+4n+1 = 6n+4 = O(n)
Formal Approach to Analysis
• In the above examples we have seen that an exact analysis
is quite involved.
• However, it can be simplified by using a formal
approach in which we ignore initializations,
loop control, and bookkeeping (assignment
operations).
for (int i = 1; i <= N; i++) {
    sum = sum+i;
}
$\sum_{i=1}^{N} 1 = N$
• Suppose we count the number of additions that are
done. There is 1 addition per iteration of the loop,
hence N additions in total.
Nested Loops: Formally
• Nested for loops translate into multiple summations,
one for each for loop.
for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= M; j++) {
        sum = sum+i+j;
    }
}
$\sum_{i=1}^{N} \sum_{j=1}^{M} 2 = \sum_{i=1}^{N} 2M = 2MN$
• Again, count the number of additions. The outer
summation is for the outer for loop.
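• The double summation can be checked by counting the additions directly. This is a small sketch under the same counting convention (two additions per execution of sum = sum+i+j); the function name countNestedAdditions is an assumption introduced here.

```cpp
#include <cstdint>

// Counts the additions performed by the nested loop body sum = sum + i + j.
std::int64_t countNestedAdditions(std::int64_t N, std::int64_t M) {
    std::int64_t additions = 0;
    std::int64_t sum = 0;
    for (std::int64_t i = 1; i <= N; i++) {        // outer summation, i = 1..N
        for (std::int64_t j = 1; j <= M; j++) {    // inner summation, j = 1..M
            sum = sum + i + j;
            additions += 2;                        // two additions per inner iteration
        }
    }
    (void)sum;  // only the count matters here
    return additions;
}
```

• The result should equal 2MN for any non-negative N and M, matching the closed form of the summation.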
Conditionals: Formally
• if (test) s1 else s2: Compute the maximum of the
running times of s1 and s2.
if (test == 1) {
    for (int i = 1; i <= N; i++) {
        sum = sum+i;
    }
}
else {
    for (int i = 1; i <= N; i++) {
        for (int j = 1; j <= N; j++) {
            sum = sum+i+j;
        }
    }
}
$\max\left(\sum_{i=1}^{N} 1,\ \sum_{i=1}^{N} \sum_{j=1}^{N} 2\right) = \max(N, 2N^2) = 2N^2$
Example
• Suppose we have hardware capable of executing $10^6$
instructions per second. How long would it take to execute
an algorithm whose complexity function is $T(n) = 2n^2$
on an input of size $n = 10^8$?
• The total number of operations to be performed would be
$T(10^8)$:
$T(10^8) = 2 \cdot (10^8)^2 = 2 \cdot 10^{16}$
• The required number of seconds is $T(10^8)/10^6$, so the
running time is $2 \cdot 10^{16} / 10^6 = 2 \cdot 10^{10}$ seconds.
• There are 86,400 seconds per day, so this is about
231,481 days (about 634 years).
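• The same arithmetic can be written as a short calculation. This is only a sketch of the example above; the function names estimatedSeconds and secondsToDays are assumptions, and the cost function $T(n) = 2n^2$ is hard-coded for this one example.

```cpp
// Estimated running time in seconds for T(n) = 2n^2 on a machine that
// executes opsPerSecond instructions per second.
double estimatedSeconds(double n, double opsPerSecond) {
    double operations = 2.0 * n * n;   // T(n) = 2n^2
    return operations / opsPerSecond;
}

// Converts seconds to days using 86,400 seconds per day.
double secondsToDays(double seconds) {
    return seconds / 86400.0;
}
```

• Calling secondsToDays(estimatedSeconds(1e8, 1e6)) reproduces the figure of roughly 231,481 days.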
Measures of Times
• To describe the running time of an algorithm, we can
define three functions $T_{best}(n)$, $T_{avg}(n)$ and
$T_{worst}(n)$ as the best, the average and the worst case
running time of the algorithm respectively.
• Average Case ($T_{avg}$): The amount of time the algorithm
takes on an "average" set of inputs.
• Worst Case ($T_{worst}$): The amount of time the algorithm
takes on the worst possible set of inputs.
• Best Case ($T_{best}$): The amount of time the algorithm
takes on the most favorable possible set of inputs of a given
size (not on the smallest input).
• We are usually interested in the worst-case time, since it
provides an upper bound for all inputs; this bound is usually
stated as a "Big-Oh" estimate.
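• Linear search is a common way to see the difference between the cases. The sketch below is an illustration chosen here, not an example from the slides; the function name linearSearchComparisons is an assumption. It returns how many element comparisons the search performs for a given input.

```cpp
#include <cstddef>
#include <vector>

// Returns the number of element comparisons a linear search performs
// while looking for target in data (stops at the first match).
std::size_t linearSearchComparisons(const std::vector<int>& data, int target) {
    std::size_t comparisons = 0;
    for (int value : data) {
        ++comparisons;
        if (value == target) break;   // found: no further comparisons needed
    }
    return comparisons;
}
```

• For a list of n elements, the best case is a target in the first position (1 comparison), and the worst case is a target in the last position or not present at all (n comparisons).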
Asymptotic Analysis
• Asymptotic analysis is concerned with how the
running time of an algorithm increases with the size of
the input in the limit, as the size of the input increases
without bound.
• There are five notations used to describe a running
time function. These are:
Big-Oh Notation (O)
Big-Omega Notation ($\Omega$)
Theta Notation ($\Theta$)
Little-o Notation (o)
Little-Omega Notation ($\omega$)
The Big-Oh Notation
• Big-Oh notation is a way of comparing algorithms and is
used for expressing their complexity, i.e., how the amount of
time a program takes to run grows with the input size.
• It is only concerned with what happens for very large
values of n.
• Therefore only the largest term in the expression
(function) is needed.
• For example, if the number of operations in an algorithm is
$n^2 - n$, then n is insignificant compared to $n^2$ for large
values of n.
• Hence the n term is ignored. Of course, for small values of
n, it may be important.
• However, Big-Oh is mainly concerned with large values of
n.
• Formal Definition: $f(n) = O(g(n))$ if there exist $c, k \in \mathbb{R}^+$
such that $f(n) \le c \cdot g(n)$ for all $n \ge k$.
The Big-Oh Notation-Examples
Examples: The following facts are useful for
Big-Oh problems:
$1 \le n$ for all $n \ge 1$
$n \le n^2$ for all $n \ge 1$
$2^n \le n!$ for all $n \ge 4$
$\log_2 n \le n$ for all $n \ge 2$
$n \le n \log_2 n$ for all $n \ge 2$
Example: $f(n) = 10n + 5$ and $g(n) = n$. Show that f(n) is
O(g(n)).
To show that f(n) is O(g(n)) we must find
constants c and k such that $f(n) \le c \cdot g(n)$ for all $n \ge k$,
i.e. $10n + 5 \le c \cdot n$ for all $n \ge k$.
Try c = 15. Then we need to show that $10n + 5 \le 15n$.
Solving for n we get $5 \le 5n$, or $1 \le n$.
So $f(n) = 10n + 5 \le 15 \cdot g(n)$ for all $n \ge 1$
(c = 15, k = 1).
Example: $f(n) = 3n^2 + 4n + 1$ and $g(n) = n^2$. Show that f(n) is
O(g(n)).
$4n \le 4n^2$ for all $n \ge 1$ and
$1 \le n^2$ for all $n \ge 1$
$3n^2 + 4n + 1 \le 3n^2 + 4n^2 + n^2$ for all $n \ge 1$
$3n^2 + 4n + 1 \le 8n^2$ for all $n \ge 1$
So we have shown that $f(n) \le 8n^2$ for all $n \ge 1$.
Therefore, f(n) is O(g(n)) where c = 8, k = 1.
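• A candidate pair (c, k) can be sanity-checked numerically before attempting a proof. The sketch below tests the inequality only over a finite range, so it can refute a bad choice of constants but cannot prove a bound; the names f, g and boundHolds are assumptions introduced here.

```cpp
#include <cstdint>

// f(n) = 3n^2 + 4n + 1 and g(n) = n^2, as in the example above.
std::int64_t f(std::int64_t n) { return 3 * n * n + 4 * n + 1; }
std::int64_t g(std::int64_t n) { return n * n; }

// Returns true if f(n) <= c * g(n) for every integer n in [k, limit].
// A finite check only: passing it is evidence, not a proof.
bool boundHolds(std::int64_t c, std::int64_t k, std::int64_t limit) {
    for (std::int64_t n = k; n <= limit; n++) {
        if (f(n) > c * g(n)) return false;
    }
    return true;
}
```

• The constants are not unique: c = 8 with k = 1 passes the check, and so does the smaller constant c = 4 provided k is raised to 5, while c = 4 with k = 1 fails already at n = 1.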
Typical Orders
• Here is a table of some typical cases. It uses logarithms to base 2, but
these are simply proportional to logarithms in other bases.
n       O(1)   O(log n)   O(n)    O(n log n)   O(n^2)      O(n^3)
1       1      0          1       0            1           1
2       1      1          2       2            4           8
4       1      2          4       8            16          64
8       1      3          8       24           64          512
16      1      4          16      64           256         4,096
1024    1      10         1,024   10,240       1,048,576   1,073,741,824
• Demonstrating that a function f(n) is big-O of a function g(n) requires that
we find specific constants c and k for which the inequality holds (and show
that the inequality does in fact hold).
• Big-O expresses an upper bound on the growth rate of a function, for
sufficiently large values of n.
• For a problem, the upper bound is set by the best algorithmic solution
that has been found so far: "What is the best that we know we can do?"
Big-O Theorems
• Theorem 1: Any constant k is O(1).
• Theorem 2: A polynomial is O(the term containing the
highest power of n). A polynomial's growth rate is
determined by its leading term.
If f(n) is a polynomial of degree d, then f(n) is $O(n^d)$.
• In general, f(n) is big-O of the dominant term of f(n).
• Theorem 3: $k \cdot f(n)$ is O(f(n)).
Constant factors may be ignored.
E.g. $f(n) = 7n^4 + 3n^2 + 5n + 1000$ is $O(n^4)$.
• Theorem 4 (Transitivity): If f(n) is O(g(n)) and g(n) is
O(h(n)), then f(n) is O(h(n)).
• Theorem 5: For any base b > 1, $\log_b n$ is $O(\log n)$.
All logarithms grow at the same rate.
E.g. $\log_b n$ is $O(\log_d n)$ for b, d > 1.
• Theorem 6: Each of the following functions is big-O of its
successors:
$k < \log_b n < n < n \log_b n < n^2 <$ n to higher powers $< 2^n < 3^n$
$<$ larger constants to the nth power $< n! < n^n$
Properties of the Big-O Notation
• Higher powers grow faster
$n^r$ is $O(n^s)$ if $0 \le r \le s$
Example: $n^2 = O(n^3)$, since $n^2 \le n^3$ for $n \ge 1$
• Fastest growing term dominates a sum
If f(n) is O(g(n)), then f(n) + g(n) is O(g(n))
Example: $5n^4 + 6n^3$ is $O(n^4)$
• Exponential functions grow faster than powers, i.e. $n^k$ is $O(b^n)$
for b > 1 and k >= 0
Example: $n^{20}$ is $O(1.05^n)$
• Logarithms grow more slowly than powers
$\log_b n$ is $O(n^k)$ for b > 1 and k > 0
Example: $\log_2 n$ is $O(n^{0.5})$
Big-Omega Notation
• Just as O-notation provides an asymptotic upper
bound on a function, $\Omega$-notation provides an
asymptotic lower bound.
• Formal Definition: A function f(n) is $\Omega(g(n))$ if
there exist constants $c, k \in \mathbb{R}^+$ such that
$f(n) \ge c \cdot g(n)$ for all $n \ge k$.
• $f(n) = \Omega(g(n))$ means that f(n) is greater than or equal
to some constant multiple of g(n) for all values of n
greater than or equal to some k.
Example: If $f(n) = n^2$, then $f(n) = \Omega(n)$
• In simple terms, $f(n) = \Omega(g(n))$ means that the growth
rate of f(n) is greater than or equal to that of g(n).
Theta Notation
• A function f(n) belongs to the set $\Theta(g(n))$ if there exist positive constants
$c_1$ and $c_2$ such that it can be sandwiched between $c_1 \cdot g(n)$ and $c_2 \cdot g(n)$ for
sufficiently large values of n.
• Formal Definition: A function f(n) is $\Theta(g(n))$ if it is both O(g(n)) and
$\Omega(g(n))$. In other words, there exist constants $c_1$, $c_2$, and k > 0 such that
$c_1 \cdot g(n) \le f(n) \le c_2 \cdot g(n)$ for all $n \ge k$
• If $f(n) = \Theta(g(n))$, then g(n) is an asymptotically tight bound for f(n).
• In simple terms, $f(n) = \Theta(g(n))$ means that f(n) and g(n) have the same rate of
growth.
• Example:
1. If $f(n) = 2n + 1$, then $f(n) = \Theta(n)$, for $c_1 = 2$, $c_2 = 3$ and k = 1
2. If $f(n) = 2n^2$, then
$f(n) = O(n^4)$
$f(n) = O(n^3)$
$f(n) = O(n^2)$
• All these are technically correct, but the last expression is the best and tightest
one. Since $2n^2$ and $n^2$ have the same growth rate, it can be written as
$f(n) = \Theta(n^2)$.
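• This means that, when $T(n) = \Theta(f(n))$:
as n increases, T(n) grows as fast as f(n)
f(n) is a tight bound on T(n), from above and from below
a $\Theta$ bound can be stated for the best, average or worst case; it is not
tied to average case analysis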
Little-o Notation
• A Big-Oh bound may or may not be asymptotically
tight, for example:
$2n^2 = O(n^2)$ (tight)
$2n^2 = O(n^3)$ (not tight)
• $f(n) = o(g(n))$ means that for all c > 0 there exists some k > 0
such that $f(n) < c \cdot g(n)$ for all $n \ge k$.
• Informally, $f(n) = o(g(n))$ means f(n) becomes
insignificant relative to g(n) as n approaches infinity.
Example: $f(n) = 3n + 4$ is $o(n^2)$
• In simple terms, f(n) has a strictly lower growth rate than
g(n).
Example: if $g(n) = 2n^2$, then g(n) is $o(n^3)$ and $O(n^2)$, but g(n) is not $o(n^2)$.
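• One informal way to see little-o is to watch the ratio f(n)/g(n): it shrinks toward 0 when $f(n) = o(g(n))$ and does not when the two functions grow at the same rate. The sketch below only illustrates this numerically for the two examples above; the function names are assumptions introduced here.

```cpp
// Ratio f(n)/g(n) for f(n) = 3n + 4 and g(n) = n^2: tends to 0 as n grows,
// which is what 3n + 4 = o(n^2) expresses.
double ratioLinearOverSquare(double n) {
    return (3.0 * n + 4.0) / (n * n);
}

// Ratio for f(n) = 2n^2 and g(n) = n^2: stays at the constant 2, so
// 2n^2 is O(n^2) but not o(n^2).
double ratioSquareOverSquare(double n) {
    return (2.0 * n * n) / (n * n);
}
```

• Evaluating both functions at n = 10, 100, 1000, ... shows the first ratio falling toward 0 while the second stays fixed.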
Little-Omega Notation ($\omega$)
Relational Properties of the Asymptotic Notations
• Transitivity
• if $f(n) = \Theta(g(n))$ and $g(n) = \Theta(h(n))$ then $f(n) = \Theta(h(n))$,
• if $f(n) = O(g(n))$ and $g(n) = O(h(n))$ then $f(n) = O(h(n))$,
• if $f(n) = \Omega(g(n))$ and $g(n) = \Omega(h(n))$ then $f(n) = \Omega(h(n))$,
• if $f(n) = o(g(n))$ and $g(n) = o(h(n))$ then $f(n) = o(h(n))$, and
• if $f(n) = \omega(g(n))$ and $g(n) = \omega(h(n))$ then $f(n) = \omega(h(n))$.
• Symmetry
• $f(n) = \Theta(g(n))$ if and only if $g(n) = \Theta(f(n))$.
• Transpose symmetry
• $f(n) = O(g(n))$ if and only if $g(n) = \Omega(f(n))$,
• $f(n) = o(g(n))$ if and only if $g(n) = \omega(f(n))$.
• Reflexivity
• $f(n) = \Theta(f(n))$,
• $f(n) = O(f(n))$,
• $f(n) = \Omega(f(n))$.
Amortized Complexity