Chapter 1
The way data are organized in a computer's memory is called a data structure,
and the sequence of computational steps used to solve a problem is called an
algorithm. Therefore, a program is nothing but data structures plus algorithms.
Given a problem, the first step in solving it is to obtain one's own abstract
view, or model, of the problem. This process of modeling is called abstraction.
The model defines an abstract view of the problem. This implies that the model
focuses only on problem-related details and that the programmer tries to define
the properties of the problem.
An entity with the properties just described is called an abstract data type (ADT).
An ADT consists of an abstract data structure and operations. Put in other terms,
an ADT is an abstraction of a data structure.
A data structure is a language construct that the programmer has defined in order
to implement an abstract data type.
There are many standard, formalized abstract data types, such as stacks,
queues, and trees.
1.1.2. Abstraction
How do data structures model the world or some part of the world?
• The value held by a data structure represents some specific characteristic of
the world.
• The characteristic being modeled restricts the possible values held by the
data structure.
• The characteristic being modeled restricts the possible operations that can
be performed on the data structure.
Note: Notice the relation between characteristic, value, and data structure.
1.2. Algorithms
An algorithm transforms data structures from one state to another state in two
ways:
• An algorithm may change the value held by a data structure.
• An algorithm may change the data structure itself.
The quality of a data structure is related to its ability to successfully model the
characteristics of the world. Similarly, the quality of an algorithm is related to its
ability to successfully simulate the changes in the world.
However, independent of any particular world model, the quality of data
structures and algorithms is determined by their ability to work together well.
Generally speaking, correct data structures lead to simple and efficient
algorithms, and correct algorithms lead to accurate and efficient data structures.
Algorithm analysis refers to the process of determining how much computing time
and storage an algorithm will require. In other words, it is a process of
predicting the resource requirements of an algorithm in a given environment.
In order to solve a problem, there are many possible algorithms. One has to be able
to choose the best algorithm for the problem at hand using some scientific method.
To classify some data structures and algorithms as good, we need precise ways of
analyzing them in terms of resource requirement. The main resources are:
Running Time
Memory Usage
Communication Bandwidth
Running time is usually treated as the most important since computational time is
the most precious resource in most problem domains.
3. void func()   // assumes #include <iostream> and using namespace std;
{
    int n;       // n was used without being declared
    int x = 0;
    int i = 0;
    int j = 1;
    cout << "Enter an integer value: ";
    cin >> n;
    while (i < n) {
        x++;
        i++;
    }
    while (j < n) {
        j++;
    }
}
In the above examples we have seen that such analysis is complex. However, it
can be simplified by using a formal approach in which we ignore
initializations, loop control, and bookkeeping.
• In general, a for loop translates to a summation. The index and bounds of the
summation are the same as the index and bounds of the for loop.
for (int i = 1; i <= N; i++) {
    sum = sum + i;
}

This translates to the summation Σ (i = 1 to N) 1 = N.
• Suppose we count the number of additions that are done. There is 1 addition
per iteration of the loop, hence N additions in total.
for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= M; j++) {
        sum = sum + i + j;
    }
}

With 2 additions per iteration of the inner loop, this translates to
Σ (i = 1 to N) Σ (j = 1 to M) 2 = Σ (i = 1 to N) 2M = 2MN.
• Again, count the number of additions. The outer summation is for the outer
for loop.
Conditionals: Formally
• If (test) s1 else s2: Compute the maximum of the running time for s1 and s2.

if (test == 1) {
    for (int i = 1; i <= N; i++) {
        sum = sum + i;
    }
}
else {
    for (int i = 1; i <= N; i++) {
        for (int j = 1; j <= N; j++) {
            sum = sum + i + j;
        }
    }
}

This gives max( Σ (i = 1 to N) 1, Σ (i = 1 to N) Σ (j = 1 to N) 2 )
= max(N, 2N^2) = 2N^2.
Example:
Suppose we have hardware capable of executing 10^6 instructions per second. How
long would it take to execute an algorithm whose complexity function is
T(n) = 2n^2 on an input of size n = 10^8?
The total number of operations to be performed would be T(10^8):
T(10^8) = 2*(10^8)^2 = 2*10^16
The number of seconds required would be given by T(10^8)/10^6, so:
Running time = 2*10^16 / 10^6 = 2*10^10 seconds (roughly 634 years).
Exercises
Determine the run time equation and complexity of each of the following code
segments.
1. for (i=0;i<n;i++)
for (j=0;j<n; j++)
sum=sum+i+j;
What is the value of sum if n=100?
3. int k=0;
for (int i=0; i<n; i++)
for (int j=i; j<n; j++)
k++;
What is the value of k when n is equal to 20?
4. int k=0;
for (int i=1; i<n; i*=2)
for(int j=1; j<n; j++)
k++;
What is the value of k when n is equal to 20?
5. int x=0;
for(int i=1;i<n;i=i+5)
x++;
What is the value of x when n=25?
6. int x=0;
for(int k=n;k>=n/3;k=k-5)
x++;
What is the value of x when n=25?
7. int x=0;
for (int i=1; i<n;i=i+5)
for (int k=n;k>=n/3;k=k-5)
x++;
What is the value of x when n=25?
8. int x=0;
for(int i=1;i<n;i=i+5)
for(int j=0;j<i;j++)
for(int k=n;k>=n/2;k=k-3)
x++;
What is the correct big-Oh Notation for the above code segment?
Average Case (Tavg): The amount of time the algorithm takes on an "average" set of
inputs.
Worst Case (Tworst): The amount of time the algorithm takes on the worst possible
set of inputs.
Best Case (Tbest): The amount of time the algorithm takes on the best possible
set of inputs.
We are interested in the worst-case time, since it provides a bound for all input –
this is called the “Big-Oh” estimate.
There are five notations used to describe a running time function. These are:
• Big-Oh Notation (O)
• Big-Omega Notation (Ω)
• Theta Notation (Θ)
• Little-o Notation (o)
• Little-Omega Notation (ω)
Big-Oh notation is a way of comparing algorithms and is used for computing the
complexity of algorithms; i.e., the amount of time that it takes for a computer
program to run. It is only concerned with what happens for very large values of
n. Therefore, only the largest term in the expression (function) is needed. For
example, if the number of operations in an algorithm is n^2 - n, then n is
insignificant compared to n^2 for large values of n, and the n term can be
ignored. Of course, for small values of n it may be important; however, Big-Oh
is mainly concerned with large values of n.
Formal Definition: f(n) = O(g(n)) if there exist c, k ∊ ℛ+ such that for all
n ≥ k, f(n) ≤ c.g(n).
Example: Show that f(n) = 10n + 5 is O(n).
We must find constants c and k such that f(n) ≤ c.n for all n ≥ k. Since
10n + 5 ≤ 10n + 5n = 15n for all n ≥ 1, we can take c = 15 and k = 1
(c=15, k=1).
Typical Orders
Here is a table of some typical cases. This uses logarithms to base 2, but
these are simply proportional to logarithms in other bases.

Function    Name
1           Constant
log n       Logarithmic
n           Linear
n log n     n log n
n^2         Quadratic
n^3         Cubic
2^n         Exponential
Demonstrating that a function f(n) is big-O of a function g(n) requires that we find
specific constants c and k for which the inequality holds (and show that the
inequality does in fact hold).
Big-O expresses an upper bound on the growth rate of a function, for sufficiently
large values of n.
An upper bound is the best algorithmic solution that has been found for a problem.
“ What is the best that we know we can do?”
Exercise:
f(n) = (3/2)n2+(5/2)n-3
Show that f(n)= O(n2)
In simple words, f(n) = O(g(n)) means that the growth rate of f(n) is less than
or equal to that of g(n).
For all the following theorems, assume that f(n) is a function of n and that k is an
arbitrary constant.
Theorem 1: k is O(1)
Theorem 2: A polynomial is O(the term containing the highest power of n).
Exponential functions grow faster than powers, i.e. n^k is O(b^n) for all
b > 1 and k >= 0.
E.g. n^20 is O(1.05^n).
Logarithms grow more slowly than powers, i.e. log_b n is O(n^k) for all
b > 1 and k > 0.
f(n) = Ω(g(n)) means that f(n) is greater than or equal to some constant
multiple of g(n) for all values of n greater than or equal to some k.
In simple terms, f(n) = Ω(g(n)) means that the growth rate of f(n) is greater
than or equal to that of g(n).
A function f(n) belongs to the set Θ(g(n)) if there exist positive constants c1
and c2 such that it can be sandwiched between c1.g(n) and c2.g(n) for
sufficiently large values of n.
In simple terms, f(n) = Θ(g(n)) means that f(n) and g(n) have the same rate of
growth.
Example: If f(n) = 2n^2, then
f(n) = O(n^4)
f(n) = O(n^3)
f(n) = O(n^2)
All these are technically correct, but the last expression is the best and
tightest one.
Since 2n^2 and n^2 have the same growth rate, we can also write f(n) = Θ(n^2):
2n^2 = O(n^2) is the tight bound, while
2n^2 = O(n^3) is correct but loose.
f(n)=o(g(n)) means for all c>0 there exists some k>0 such that f(n)<c.g(n) for all
n>=k. Informally, f(n)=o(g(n)) means f(n) becomes insignificant relative to g(n) as
n approaches infinity.
Formal Definition: f(n) = ω(g(n)) if for every constant c > 0 there exists a
constant k > 0 such that 0 ≤ c.g(n) < f(n) for all n ≥ k.
Symmetry
• f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n)).
Transpose symmetry
• f(n) = O(g(n)) if and only if g(n) = Ω(f(n)).
Reflexivity
• f(n) = Θ(f(n)),
• f(n) = O(f(n)),
• f(n) = Ω(f(n)).