
MANONMANIAM SUNDARANAR UNIVERSITY

DIRECTORATE OF DISTANCE & CONTINUING EDUCATION


TIRUNELVELI 627012, TAMIL NADU

M.C.A./P.G.D.C.A. - I YEAR

DKA17/DAA17 - ADVANCED DATA STRUCTURE


(From the academic year 2016-17)

Most Student friendly University - Strive to Study and Learn to Excel

For more information visit: http://www.msuniv.ac.in


ADVANCED DATA STRUCTURE

SYLLABUS

UNIT I
Introduction: Mathematics Review - A brief introduction to recursion. Algorithm analysis: Mathematic
background - Model - What to analyze - Running time calculations.
Lists, Stacks, Queues: Abstract data types (ADT) - The List ADT - The Stack ADT - The Queue ADT

UNIT II
Trees: Implementation of trees - Tree traversals with an application - Binary trees - The search tree ADT -
Binary Search Trees
Hashing: General Idea - Hash function - Separate Chaining.
Priority Queues (Heaps): Model - Simple implementations - Binary Heap.

UNIT III
Sorting: Preliminaries - Insertion Sort - Shell Sort - Heap Sort - Merge Sort - Quick Sort.

UNIT IV
Graph Algorithms: Definition - Topological sort - Shortest-Path Algorithms - Network Flow Problems
- Minimum Spanning Tree - Applications of Depth-First Search.

UNIT V
Algorithm Design Techniques: Greedy Algorithms - Divide and Conquer - Running Time of Divide
and Conquer Algorithms - Closest-Points Problem - The Selection Problem - Theoretical Improvements
for Arithmetic Problems.
UNIT I

LESSON

1
FUNDAMENTAL OF DATA STRUCTURES
CONTENTS
1.0 Aims and Objectives
1.1 Introduction
1.2 Recursion
1.3 Algorithm Analysis
1.3.1 Problem Solution using Pseudocode
1.3.2 Problem Solution using Flow Chart Diagram
1.3.3 Mathematic Background
1.3.4 Model: Abstract Data Type (ADT)
1.3.5 What to Analyze?
1.3.6 Running Time Calculations
1.4 Let us Sum up
1.5 Keywords
1.6 Questions for Discussion
1.7 Suggested Readings

1.0 AIMS AND OBJECTIVES


After studying this lesson, you should be able to:
• Discuss the concept of recursion
• Describe algorithm analysis, which includes mathematic background, model, what to analyze and
running time calculations

1.1 INTRODUCTION
Semantically, data can exist in either of two forms - atomic or structured. In most programming
problems, the data to be read, processed and written are related to each other. Data items are
related in a variety of different ways. Whereas the basic data types such as integers, characters
etc. can be directly created and manipulated in a programming language, the responsibility of creating
the structured type data items remains with the programmers themselves. Accordingly, programming
languages provide mechanisms to create and manipulate structured data items.
8 Advanced Data Structure
M.S. University - D.D.C.E.

In this lesson, we begin with an explanation of what an algorithm and a data structure are. We
introduce the concept of abstract data types. We then describe the pseudo-code, which will be used
for writing the algorithms throughout. Finally, the important aspect of analysis of algorithms is briefly
touched upon with an introduction to the big 'O' notation.
We will also discuss recursion in this lesson. Iteration (looping) in functional languages is usually
accomplished via recursion. Recursive functions invoke themselves, allowing an operation to be
performed over and over. Recursion may require maintaining a stack, but tail recursion can be
recognized and optimized by a compiler into the same code used to implement iteration in imperative
languages. The Scheme programming language standard requires implementations to recognize and
optimize tail recursion.

1.2 RECURSION
Recursion is a wonderful, powerful way to solve problems. It is an important concept in computer
science. Many algorithms can be best described in terms of recursion. Recursion defines a function in
terms of itself. That is, in the course of the function definition there is a call to that very same
function. At first this may seem like a never ending loop, or like a dog chasing its tail. It can never
catch it. So too it seems our method will never finish. This might be true in some cases, but in practice
we can check to see if a certain condition is true and in that case exit (return from) our method. The
case in which we end our recursion is called a base case. Additionally, just as in a loop, we must change
some value and incrementally advance closer to our base case.
Consider this function:
void myMethod(int counter)
{
    if (counter == 0)
        return;
    else
    {
        System.out.println("" + counter);
        myMethod(--counter);
        return;
    }
}
This recursion is not infinite, assuming the method is passed
a positive integer value.
Consider this method:
void myMethod(int counter)
{
    if (counter == 0)
        return;
    else
    {
        System.out.println("hello" + counter);
        myMethod(--counter);
        System.out.println("" + counter);
        return;
    }
}

The above recursion is essentially a loop like a for loop or a while loop. When do we prefer recursion
to an iterative loop? We use recursion when we can see that our problem can be reduced to a simpler
problem that can be solved after further reduction.
Every recursion should have the following characteristics:
• A simple base case which we have a solution for and a return value.
• A way of getting our problem closer to the base case, i.e., a way to chop out part of the problem
to get a somewhat simpler problem.
• A recursive call which passes the simpler problem back into the method.
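The three characteristics above can be seen together in a small sketch. The class `RecursiveSum` and the method `sumFrom` are illustrative names chosen here, not part of the lesson; the example simply adds up the elements of an array.

```java
// Sketch only: RecursiveSum and sumFrom are illustrative names, not from the lesson.
class RecursiveSum {
    // Returns the sum of values[start], values[start + 1], ..., up to the last element.
    static int sumFrom(int[] values, int start) {
        // 1. Base case: an empty range has a known answer, 0.
        if (start >= values.length) {
            return 0;
        }
        // 2. Reduction: set aside the first element, leaving a smaller range.
        // 3. Recursive call: pass the smaller problem back into the same method.
        return values[start] + sumFrom(values, start + 1);
    }

    public static void main(String[] args) {
        int[] data = {4, 8, 15, 16, 23, 42};
        System.out.println(sumFrom(data, 0)); // prints 108
    }
}
```

Each call moves `start` one step closer to `values.length`, so the base case is always reached.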
The key to thinking recursively is to see the solution to the problem as a smaller version of the same
problem. The key to solving recursive programming requirements is to imagine that your method
does what its name says it does even before you have actually finished writing it. You must pretend the
method does its job and then use it to solve the more complex cases. Here is how.
Identify the base case(s) and what the base case(s) do. A base case is the simplest possible problem (or
case) your method could be passed. Return the correct value for the base case. Your recursive method
will then be comprised of an if-else statement where the base case returns one value and the non-base
case(s) recursively call(s) the same method with a smaller parameter or set of data.

Thus you decompose your problem into two parts:

1. The simplest possible case which you can answer (and return for)
2. All other more complex cases which you will solve by returning the result of a second calling of
your method.
This second calling of your method (recursion) will pass on the complex problem but reduced by one
increment. This decomposition of the problem will actually be a complete, accurate solution for the
problem for all cases other than the base case.
Thus, the code of the method actually has the solution on the first recursion.
Let's consider writing a method to find the factorial of an integer. For example 7! equals
7*6*5*4*3*2*1. But we are also correct if we say 7! equals 7*6!. In seeing the factorial of 7 in this
second way we have gained a valuable insight. We now can see our problem in terms of a simpler
version of our problem and we even know how to make our problem progressively more simple. We
have also defined our problem in terms of itself, i.e., we defined 7! in terms of 6!. This is the essence of
recursive problem solving. Now all we have left to do is decide what the base case is. What is the
simplest factorial? 1!. 1! equals 1.

Let's write the factorial function recursively.


int myFactorial(int integer)
{
    if (integer == 1)
        return 1;
    else
    {
        return (integer * myFactorial(integer - 1));
    }
}

Note that the base case (the factorial of 1) is solved and the return value is given. Now let us imagine
that our method actually works. If it works we can use it to give the result of more complex cases. If
our number is 7 we will simply return 7 * the result of factorial of 6. So we actually have the exact
answer for all cases in the top level recursion. Our problem is getting smaller on each recursive call
because each time we call the method we give it a smaller number. Try to run this program in your
mind with the number 2. Does it give the right value? If it works for 1 then it must work for two since
2 merely returns 2 * factorial of 1. Now will it work for 3? Well, 3 must return 3 * factorial of 2. Now
since we know that factorial of 2 works, factorial of 3 also works. We can prove that 4 works in the
same way, and so on and so on.

However, if the base case is never reached, your code won't in fact run forever like an infinite loop.
Instead, you will eventually run out of stack space (memory) and get a run-time error or exception
called a stack overflow. There are several significant problems with recursion.

Mostly it is hard (especially for inexperienced programmers) to think recursively, though many AI
specialists claim that in reality recursion is closer to basic human thought processes than other
programming methods (such as iteration). There also exists the problem of stack overflow when using
some forms of recursion (head recursion). The other main problem with recursion is that it can be
slower to run than simple iteration. Then why use it? It seems that there is always an iterative solution
to any problem that can be solved recursively. Is there a difference in computational complexity? No.
Is there a difference in the efficiency of execution? Yes, in fact, the recursive version is usually less
efficient because of having to push and pop recursions on and off the run-time stack, so iteration is
quicker. On the other hand, you might notice that the recursive versions use fewer or no local
variables.
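To make the comparison concrete, here is a minimal sketch that places a recursive and an iterative factorial side by side. The class and method names are illustrative, and `long` is assumed as the result type so that values up to 20! fit. Both versions perform the same number of multiplications; the iterative one needs the local variables `result` and `i`, while the recursive one needs a stack frame per call.

```java
// Sketch only: names are illustrative. Results are exact for 0 <= n <= 20.
class FactorialComparison {
    // Recursive version: no local variables, but one stack frame per call.
    static long factorialRecursive(int n) {
        if (n <= 1) {
            return 1;
        }
        return n * factorialRecursive(n - 1);
    }

    // Iterative version: two local variables, a single stack frame.
    static long factorialIterative(int n) {
        long result = 1;
        for (int i = 2; i <= n; i++) {
            result *= i;
        }
        return result;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 7; n++) {
            System.out.println(n + "! = " + factorialRecursive(n)
                    + " (recursive), " + factorialIterative(n) + " (iterative)");
        }
    }
}
```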

So why use recursion? The answer to our question is predominantly because it is easier to code a
recursive solution once one is able to identify that solution. The recursive code is usually smaller,
more concise, more elegant, possibly even easier to understand, though that depends on one's thinking
style. But also, there are some problems that are very difficult to solve without recursion. Those
problems that require backtracking, such as searching a maze for a path to an exit, or tree-based
operations are best solved recursively.
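As an illustration of backtracking, the following sketch searches a small maze for an exit. The grid encoding ('#' for a wall, '.' for open floor, 'E' for the exit) and all names are assumptions made for this example only.

```java
// Sketch only: the maze encoding and all names are assumptions for illustration.
class MazeSearch {
    // Turns rows of text into a grid that the search is allowed to mark up.
    static char[][] parse(String... rows) {
        char[][] grid = new char[rows.length][];
        for (int i = 0; i < rows.length; i++) {
            grid[i] = rows[i].toCharArray();
        }
        return grid;
    }

    // Returns true if the exit 'E' can be reached from (row, col).
    static boolean findExit(char[][] maze, int row, int col) {
        if (row < 0 || row >= maze.length || col < 0 || col >= maze[row].length) {
            return false;                 // base case: outside the maze
        }
        if (maze[row][col] == 'E') {
            return true;                  // base case: exit found
        }
        if (maze[row][col] != '.') {
            return false;                 // base case: wall, or a cell already tried
        }
        maze[row][col] = '+';             // tentatively put this cell on the path
        if (findExit(maze, row - 1, col) || findExit(maze, row + 1, col)
                || findExit(maze, row, col - 1) || findExit(maze, row, col + 1)) {
            return true;
        }
        maze[row][col] = 'x';             // backtrack: this cell leads to a dead end
        return false;
    }

    public static void main(String[] args) {
        char[][] maze = parse(
                "#######",
                "#.#...#",
                "#.#.#.#",
                "#...#E#",
                "#######");
        System.out.println(findExit(maze, 1, 1)); // prints true
        for (char[] line : maze) {
            System.out.println(new String(line)); // '+' marks the path that was found
        }
    }
}
```

Every visited cell is marked, so no cell is explored twice and the search always terminates.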

Tail Recursion
Tail recursion is defined as occurring when the recursive call is at the end of the recursive instruction.
This is not the case with my factorial solution above. It is useful to notice when one's algorithm uses
tail recursion because in such a case, the algorithm can usually be rewritten to use iteration instead. In
fact, the compiler will (or at least should) convert the recursive program into an iterative one. This
eliminates the potential problem of stack overflow.
This is not the case with head recursion, or when the function calls itself recursively in different places.
Of course, even in these cases we could also remove recursion by using our own stack and essentially
simulating how recursion would work.
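A minimal sketch of that idea, assuming factorial as the problem and `java.util.ArrayDeque` as the programmer-managed stack (the class name is illustrative): each push stands in for a recursive call that is still waiting for its result, and each pop stands in for a return.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch only: simulates the head-recursive factorial with an explicit stack.
class ExplicitStackFactorial {
    static long factorial(int n) {
        Deque<Integer> pending = new ArrayDeque<>();
        // "Descend": record every multiplication that a recursive call would leave waiting.
        while (n > 1) {
            pending.push(n);
            n--;
        }
        // Base case reached: the factorial of 1 (or 0) is 1.
        long result = 1;
        // "Return": perform the waiting multiplications in the order the call stack would.
        while (!pending.isEmpty()) {
            result *= pending.pop();
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(factorial(7)); // prints 5040
    }
}
```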
In the example of factorial above, the compiler will have to call the recursive function before doing the
multiplication because it has to resolve the (return) value of the function before it can complete the
multiplication. So the order of execution will be "head" recursion, i.e. recursion occurs before other
operations.

To convert this to tail recursion we need to get all the multiplication finished and resolved before
recursively calling the function. We need to force the order of operation so that we are not waiting on
multiplication before returning. If we do this the stack frame can be freed up.
The proper way to do a tail-recursive factorial is this:
int factorial(int number) {
    if (number == 0) {
        return 1;
    }
    return factorial_i(number, 1);
}

int factorial_i(int currentNumber, int sum) {
    if (currentNumber == 1) {
        return sum;
    } else {
        return factorial_i(currentNumber - 1, sum * currentNumber);
    }
}

Notice that in the call return factorial_i(currentNumber - 1, sum * currentNumber); both parameters
are immediately resolvable. We can compute what each parameter is without waiting for a recursive
function call to return. This is not the case with the previous version of factorial. This streamlining
enables the compiler to minimize stack use as explained above.
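The following sketch shows, by hand, the kind of loop a tail call can be turned into. Whether a particular compiler actually performs this rewrite depends on the language and implementation, so the loop version here is written out explicitly rather than assumed. The names are illustrative and `long` is assumed for the accumulator.

```java
// Sketch only: the loop is a hand-written equivalent of the tail-recursive method.
class TailCallAsLoop {
    // Tail-recursive form: nothing remains to be done after the recursive call returns.
    static long factorialTail(int currentNumber, long sum) {
        if (currentNumber <= 1) {
            return sum;
        }
        return factorialTail(currentNumber - 1, sum * currentNumber);
    }

    // Same computation: the tail call becomes "update the parameters and repeat".
    static long factorialLoop(int currentNumber, long sum) {
        while (currentNumber > 1) {
            sum = sum * currentNumber;
            currentNumber = currentNumber - 1;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(factorialTail(7, 1)); // prints 5040
        System.out.println(factorialLoop(7, 1)); // prints 5040
    }
}
```

The loop reuses one stack frame for the whole computation, which is why the stack cannot overflow.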

1.3 ALGORITHM ANALYSIS


Computer science is the study of methods for effectively using a computer to solve problems, or in
other words, determining exactly the problem to be solved.
This process entails:
1. Understanding of the problem.
2. Translating descriptions, goals, requests and unstated desires into a precisely formulated
conceptual solution.
3. Implementing the solution with a computer program.
This solution typically consists of two parts: data structures and algorithms.
An algorithm is a well-defined list of steps for solving a particular problem. A set of algorithms is
always used for performing operations on the data stored by means of a data structure. Thus algorithms
handle data through data structures. In constructing a solution to a problem, a data structure must be
chosen that allows the data to be operated upon easily in the manner required by the algorithm.
Data may be arranged and managed at many levels. An algorithm has to be designed in such a manner
that it can perform the desired operation on the stored data.
An algorithm may need to put new data into an existing collection of data, remove data from a
collection, or query a collection of data for a specific purpose.
In the design of many types of programs, the choice of data structures is a primary design
consideration. Experience in building large systems has shown that the difficulty of implementation
and the quality and performance of the final result depend heavily on choosing the best data
structure. After the data structures are chosen, the algorithms to be used often become relatively
obvious. Sometimes things work in the opposite direction - data structures are chosen because certain
key tasks have algorithms that work best with particular data structures. In either case, the choice of
appropriate data structures is crucial.
The formal algorithm consists of two parts. The first part is a paragraph describing the purpose of the
algorithm, identifying the variables used in the algorithm and listing the input data. The second part
of the algorithm consists of the list of steps that is to be executed.

1.3.1 Problem Solution using Pseudocode


Pseudocode is an outline of a program, written in a form that can easily be converted into real
programming statements.
Pseudocode cannot be compiled nor executed, and there are no real formatting or syntax rules. It is
simply one step - an important one - in producing the final code. The benefit of pseudocode is that it
enables the programmer to concentrate on the algorithms without worrying about all the syntactic
details of a particular programming language. In fact, you can write pseudocode without even knowing
what programming language you will use for the final implementation.

Problem

A shop has started a discount scheme. According to that scheme, if the purchased quantity is more than
10 then a 10% discount DISC is given to the customer. The operator has to enter the quantity QTY and
rate RATE of the item; the program will display the total value TOT_VAL. One way to solve the
problem is as follows:
Solution
First initialize DISC = 0 and accept the RATE and QTY from the user. Then check whether the
QTY is more than 10 or not. If QTY > 10 then set DISC = 10. Calculate the total value TOT_VAL
and display it.
A formal Algorithm of the stated problem:
Algorithm: (TotalValueCalculation)
This algorithm accepts the input for QTY and RATE from the user, then checks whether QTY > 10
or not. If QTY > 10 then it assigns DISC = 10. After that, VAL = (QTY * RATE) - (RATE * DISC/100)
is calculated and displayed.
Step 1. [Initialize] Set DISC := 0.
Step 2. [Accept the rate and quantity of the item]
Accept and assign INPUT for QTY and RATE.
Step 3. [Check QTY > 10]
If QTY > 10, then:
Set DISC := 10. [Allow discount]
[End of If structure.]
Step 4. [Calculation of total value]
Set VAL := (QTY * RATE) - (RATE * DISC/100).
Step 5. Print VAL.
Step 6. Exit.
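The steps above translate almost line for line into a program. The sketch below is one possible rendering, with assumptions stated plainly: the input step is replaced by method parameters, `double` is assumed for money values, and the formula in Step 4 is applied exactly as the algorithm writes it.

```java
// Sketch only: Step 2 (input) is replaced by parameters; types are assumptions.
class TotalValueCalculation {
    static double totalValue(int qty, double rate) {
        double disc = 0;                              // Step 1: initialize DISC
        if (qty > 10) {                               // Step 3: check QTY > 10
            disc = 10;                                //         allow discount
        }
        return (qty * rate) - (rate * disc / 100);    // Step 4: formula as written in the algorithm
    }

    public static void main(String[] args) {
        System.out.println(totalValue(5, 20.0));      // Step 5: no discount, prints 100.0
        System.out.println(totalValue(12, 20.0));     // Step 5: discount applied, prints 238.0
    }
}
```

Note that, as written, the discount term is computed from RATE alone; a scheme that discounts the whole purchase would multiply that term by QTY.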
1.3.2 Problem Solution using Flow Chart Diagram
A flow chart is a graphical or symbolic representation of a process. Each step in the process flow is
represented by a different symbol and contains a short text description of the process step in the flow chart
symbol. The flow chart symbols are linked together with arrow connectors (also known as flow lines).
Table 1.1: Program Flow Chart Symbols

Symbol                          Purpose
Terminal box                    Indicates the point at which the algorithm begins or terminates.
Processing box                  Indicates a straightforward computation or assignment operation.
Input-Output box                Used for taking input data and giving output results or messages.
Decision box                    Used when the algorithm has to choose between two or three branches
                                leading to other parts of the flowchart.
Predefined module (subroutine)  Represents a predefined module, the detailed internal steps of which
                                are defined elsewhere.
Flow lines                      Used to connect different boxes and indicate the direction of flow.
Connector                       Connects different parts of the flow chart.


A Flow chart of the previously stated Total Value Calculation problem is given below:

[Figure: flow chart of the Total Value Calculation algorithm; its processing step reads
VAL <- (QTY * RATE) - (RATE * DISC/100)]

1.3.3 Mathematic Background


As in C, we shall assume four basic data types - integer, float, boolean and char. Integer and float (also
called real in some languages) are available in every programming language. A variable of type boolean
can have either true or false as its value. An integer constant or a float constant will be written in the
usual form. For example, -78 is an integer constant while -78.5 and .78e-3 are float constants. A
character constant will be written by enclosing it within single quotation marks like 'a' or 'A' or '%'.
A boolean constant will be written as true or false. Among the constants, we shall also allow 'strings'
(a string is a sequence of characters), for example, 'xyz'.

Expressions will be made up of variables and constants connected by means of operators. We shall use
the usual arithmetic operators like +, -, * and /. In addition to these, operators like mod will be used
to mean remainder of integer division. Thus a mod b will mean the remainder of division of a by b.
The operators * and mod will have higher precedence than + and -. Float and integer can be mixed in
expressions but the result will be float. Boolean expressions can be obtained by using the relational
operators like the following:

== (equal to)
!= (not equal to)
< (less than)
> (greater than)
<= (less than or equal to)
>= (greater than or equal to)

Boolean variables or expressions can be connected with the logical operators not (represented by !), and
(represented by &&) and or (represented by ||), to obtain further compound boolean expressions. For
example,

(a >= 10) || (a <= 20)
would mean a should be greater than or equal to 10 or a should be less than or equal
to 20. Among the logical operators not will have a higher precedence than and which in turn will have
a precedence higher than or.
As regards associativity of operators, we shall use parentheses to avoid confusion. Implicitly, left to
right associativity will be assumed. Thus, a * b / c would mean (a * b) / c.
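These conventions can be tried out in a real language. The sketch below uses Java as a stand-in for the pseudo-code, with two assumptions to keep in mind: Java's % operator plays the role of mod (the two agree for non-negative operands), and the method names are illustrative.

```java
// Sketch only: Java stands in for the pseudo-code; % is used for mod.
class OperatorRules {
    static int remainder(int a, int b) { return a % b; }

    // Left-to-right associativity: a * b / c is evaluated as (a * b) / c.
    static int leftToRight(int a, int b, int c) { return a * b / c; }

    // Explicit parentheses force the other grouping.
    static int rightGrouped(int a, int b, int c) { return a * (b / c); }

    // The example from the text; it is true for every int a, because any value
    // satisfies at least one of the two comparisons.
    static boolean example(int a) { return (a >= 10) || (a <= 20); }

    public static void main(String[] args) {
        System.out.println(remainder(17, 5));             // prints 2
        System.out.println(2 + 3 * 4);                    // prints 14: * binds tighter than +
        System.out.println(leftToRight(8, 3, 5));         // prints 4: (8 * 3) / 5
        System.out.println(rightGrouped(8, 3, 5));        // prints 0: 8 * (3 / 5)
        System.out.println(7 / 2);                        // prints 3: integer division
        System.out.println(7 / 2.0);                      // prints 3.5: mixed operands give a float result
        System.out.println(!false && false || true);      // prints true: ((!false) && false) || true
    }
}
```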
The variables used in the programs should be declared before the executable statements in a section
beginning with the keyword var. The format of a variable declaration is as follows:
< data type > < list of variables > ;

For example,
int x, y, z;
means that x, y and z are variables of type integer. The data type can be standard type or user-defined.
The enumerated type as available in C is assumed to be included in this pseudo-code. We illustrate
this with the help of the following example:
enum colour (brown, red, green);
This declaration assigns 0 to brown, 1 to red and 2 to green. The enumerated type is defined with a list
of names as shown in the case of colour. These names are the values that a variable of this type can
assume. For example, the statement,

a = brown;
will assign the value of brown to a. Note that brown is a constant of type colour and not a variable.
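For comparison, here is how the same enumerated type might look in Java. This is only an analogy to the C-style pseudo-code: Java enum constants are objects rather than integers, and the position numbers 0, 1, 2 are obtained through `ordinal()`. The class name is illustrative.

```java
// Sketch only: a Java analogue of the pseudo-code enum colour (brown, red, green).
class ColourDemo {
    enum Colour { BROWN, RED, GREEN }

    public static void main(String[] args) {
        Colour a = Colour.BROWN;                  // a = brown;
        System.out.println(a + " has position " + a.ordinal());   // BROWN has position 0
        for (Colour c : Colour.values()) {
            System.out.println(c.ordinal() + " " + c);            // 0 BROWN, 1 RED, 2 GREEN
        }
    }
}
```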

As in every programming language, the implied sequencing of statements will be assumed in the
pseudo-code. This means that statements will be executed sequentially in a top-to-bottom manner
unless the flow of control is explicitly altered by a control construct such as a loop construct. The
statements will be separated from one another by means of a semicolon (;). A group of statements
placed within begin and end will be a compound statement and will be treated as a single unit.

There will be assignment statements in the usual format namely,

< variable > = < expression > ;

Thus,
a[i] = x - y;
is an example of the assignment statement.

We shall also use the usual if-then-else statement in the following format:
if < condition > then < statement block > else < statement block >
Here, < condition > is a boolean expression. Sometimes it will be expressed using English. Each of the
'then' or 'else' parts will contain one simple or compound statement. We further assume that the 'else'
part may be absent. The statements in the 'then' part will be executed if the condition is true,
otherwise the statements in the 'else' part will be executed. In either case, control will reach the next
statement in sequence after the execution of the said statements.

The following are examples of the if-then-else construct:

• if (a[j] < a[k]) {x = a[j]} else {x = a[k]};
• if (first) {x = 3; y = 5;} else {x = 7; y = 9;}
In the above example first is a boolean variable.
As for the loops, we shall use three statements - the while, the do-while and the for statements. The
while statement has the following format:
while < condition > do < statement >
The statement within the while construct will be executed repeatedly so long as the condition remains
true. Once the condition is false, the loop is exited. The statements within the loop are not executed at
all if the condition is false to start with. The format of the do-while statement is as follows:
do < statements > while < condition >
The statements within the do-while construct will be executed repeatedly as long as the condition
remains true. The loop will be executed at least once irrespective of the condition. Note that unlike
the while, the do-while allows more than one statement to be written within its scope and no curly
braces are necessary. The format of the for statement is shown below:
for (< initialization >; < condition >; < modification >) < statement >;

< initialization > assigns initial values to the loop-control variables. The statement within the loop
will be executed repeatedly, with different values of the control variable. Each time the loop is
repeated, the values of the control variables are modified in the < modification > section. The loop is
exited when the < condition > becomes false.
The following are some examples of the while, do-while and for statements:
while (j < 100) {sum = sum + a[j]; j = j + 1;}
for (j = 1; j < 100; j = j + 1) sum = sum + a[j];
do {sum = sum + a[j]; j = j + 1} while (j < 100)
for (j = 100; j > 1; j = j - 1) sum = sum + a[j];
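The loop examples above can be written as runnable Java in a few lines each. In this sketch the array length replaces the fixed bound of 100, indexing starts at 0 as Java requires, and the method names are illustrative. All four methods compute the same sum; the do-while version needs a non-empty array because its body runs once before the condition is tested.

```java
// Sketch only: the same summation written with each loop form.
class LoopForms {
    static int sumWhile(int[] a) {
        int sum = 0;
        int j = 0;
        while (j < a.length) { sum = sum + a[j]; j = j + 1; }
        return sum;
    }

    static int sumFor(int[] a) {
        int sum = 0;
        for (int j = 0; j < a.length; j = j + 1) sum = sum + a[j];
        return sum;
    }

    // Assumes a.length >= 1: the body executes before the first test.
    static int sumDoWhile(int[] a) {
        int sum = 0;
        int j = 0;
        do { sum = sum + a[j]; j = j + 1; } while (j < a.length);
        return sum;
    }

    // Counts downward, like the last example in the text.
    static int sumForDown(int[] a) {
        int sum = 0;
        for (int j = a.length - 1; j >= 0; j = j - 1) sum = sum + a[j];
        return sum;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5};
        System.out.println(sumWhile(a) + " " + sumFor(a) + " "
                + sumDoWhile(a) + " " + sumForDown(a)); // prints 15 15 15 15
    }
}
```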
To call a function (or a procedure) the following statement is used:
< function/procedure name > (< parameter list >);
The parameter list consists of expressions that stand for the actual parameters. The expressions will be
separated by commas (,). Thus the procedure statement,
find(x, y, z + 3);
means a call to some procedure named find, and x, y, z + 3 are the actual parameters (arguments). As
usual the correspondence between actual parameters and formal parameters (also known as dummy
arguments) will be positional.

1.3.4 Model: Abstract Data Type (ADT)


Abstract Data Type (ADT) is a mathematical model with a collection of operations defined on that
model. Sets of integers, together with the operations of union, intersection and set difference, form a
simple example of an ADT. The ADT encapsulates a data type in the sense that the definition of the
type and all operations on the type can be localized and are not visible to the users of the ADT. To the
users, just the declaration of the ADT and its operations are important.

Abstract Data Type


• A framework for an object interface
• What kind of stuff it'd be made of (no details)?
• What kind of messages it would receive and what kind of action it'll perform when properly triggered?

Conceptualization phase:

Do_this (p1, p2, p3)
Also_this (p1, p2, p3)
Do_that (p1, p2, p3)

From this we figure out:


• Object make-up (in terms of data)
• Object interface (what sort of messages it would handle)
• How and when it should act when triggered from outside (public trigger) and by another object
friendly to it?
These concerns lead to an ADT - a definition for the object.
An Abstract Data Type is a set of data items and the methods that work on them.
An implementation of an ADT is a translation into statements of a programming language, of the
declaration that defines a variable to be of that ADT, plus a procedure in that language for each
operation of the ADT. An implementation chooses a data structure to represent the ADT; each data
structure is built up from the basic data types of the underlying programming language. Thus, if we
wish to change the implementation of an ADT, only the procedures implementing the operations
would change. This change would not affect the users of the ADT.
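The separation between an ADT and its implementation can be sketched in Java with an interface and a class. Everything named here (`IntSet`, `TreeIntSet`, `IntSetDemo` and their methods) is an assumption made for illustration, following the set-of-integers example from the text; the representation chosen, a `java.util.TreeSet`, is just one of many possibilities.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Sketch only: a user of the ADT. It never touches the representation.
class IntSetDemo {
    static IntSet of(int... values) {
        IntSet s = new TreeIntSet();   // the only line that names an implementation
        for (int v : values) s.add(v);
        return s;
    }

    public static void main(String[] args) {
        IntSet a = of(1, 2, 3);
        IntSet b = of(3, 4);
        System.out.println(Arrays.toString(a.union(b).toArray()));        // prints [1, 2, 3, 4]
        System.out.println(Arrays.toString(a.intersection(b).toArray())); // prints [3]
        System.out.println(Arrays.toString(a.difference(b).toArray()));   // prints [1, 2]
    }
}

// The ADT as its users see it: a set of integers and its operations.
interface IntSet {
    void add(int x);
    boolean contains(int x);
    IntSet union(IntSet other);
    IntSet intersection(IntSet other);
    IntSet difference(IntSet other);
    int[] toArray();                   // elements in ascending order
}

// One possible implementation, hidden behind the interface.
class TreeIntSet implements IntSet {
    private final TreeSet<Integer> items = new TreeSet<>();

    public void add(int x) { items.add(x); }
    public boolean contains(int x) { return items.contains(x); }

    public int[] toArray() {
        int[] out = new int[items.size()];
        int i = 0;
        for (int x : items) out[i++] = x;
        return out;
    }

    public IntSet union(IntSet other) {
        IntSet result = new TreeIntSet();
        for (int x : toArray()) result.add(x);
        for (int x : other.toArray()) result.add(x);
        return result;
    }

    public IntSet intersection(IntSet other) {
        IntSet result = new TreeIntSet();
        for (int x : toArray()) if (other.contains(x)) result.add(x);
        return result;
    }

    public IntSet difference(IntSet other) {
        IntSet result = new TreeIntSet();
        for (int x : toArray()) if (!other.contains(x)) result.add(x);
        return result;
    }
}
```

Replacing `TreeIntSet` with, say, an array-based class would change only the implementing class and the one constructor call in `of`; code written against `IntSet` would stay the same.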
Although the terms 'data type', 'data structure' and 'abstract data type' sound alike, they have
different meanings. In a programming language, the data type of a variable is the set of values that the
variable may assume. For example, a variable of type boolean can assume either the value true or the
value false, but no other value. An abstract data type is a mathematical model, together with various
operations defined on the model. As we have indicated, we shall design algorithms in terms of ADTs,
but to implement an algorithm in a given programming language we must find some way of
representing the ADTs in terms of the data types and operators supported by the programming
language itself. To represent the mathematical model underlying an ADT, we use data structures,
which are a collection of variables, possibly of several data types, connected in various ways.
The cell is the basic building block of data structures. We can picture a cell as a box that is capable of
holding a value drawn from some basic or composite data type. Data structures are created by giving
names to aggregates of cells and (optionally) interpreting the values of some cells as representing
relationships or connections (e.g., pointers) among cells.

1.3.5 What to Analyze?


• Here correct algorithms are analyzed.
• An algorithm is said to be correct if it stops with the correct output for each input instance.
• Incorrect algorithms
  - may not stop at all on some input instances
  - may stop with other than the desired answer

1.3.6 Running Time Calculations


An algorithm is a method for solving a class of problems on a computer. The complexity of an
algorithm is the cost, measured in running time, or storage, or whatever units are relevant, of using the
algorithm to solve one of those problems.
Computing takes time. Some problems take a very long time, others can be done quickly. Some
problems seem to take a long time, and then someone discovers a faster way to do them (a 'faster
algorithm'). The study of the amount of computational effort that is needed in order to perform
certain kinds of computations is the study of computational complexity.
Naturally, we would expect that a computing problem for which millions of bits of input data are
required would probably take longer than another problem that needs only a few items of input. So
the time complexity of a calculation is measured by expressing the running time of the calculation as a
function f(n) of some measure of the amount of data that is needed to describe the problem to the
computer.
For instance, think about this statement: 'I just bought a matrix inversion program, and it can invert
an n x n matrix in just 1.2n^3 minutes.' We see here a typical description of the complexity of a certain
algorithm. The running time of the program is being given as a function of the size of the input
matrix.
A faster program for the same job might run in 0.8n^3 minutes for an n x n matrix. If someone were to
make a really important discovery, then maybe we could actually lower the exponent, instead of
merely shaving the multiplicative constant. Thus, a program that would invert an n x n matrix in
only 7n^2.8 minutes would represent a striking improvement of the state of the art.
The general rule is that if the running time is at most a polynomial function of the amount of input
data, then the calculation is an easy one, otherwise it's hard.
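Why a lower exponent matters more than a smaller constant can be checked numerically. Using the three running-time formulas quoted above, 7n^2.8 is smaller than 0.8n^3 exactly when n^0.2 > 8.75, that is, when n exceeds 8.75^5, roughly 51,300. For small matrices the program with the larger constant is slower; beyond that size it wins, and its advantage keeps growing. The sketch below (illustrative names, times in minutes) evaluates the formulas on both sides of the crossover.

```java
// Sketch only: evaluates the three running-time formulas quoted in the text.
class MatrixInversionTimes {
    static double original(double n)      { return 1.2 * Math.pow(n, 3); }
    static double smallerConstant(double n) { return 0.8 * Math.pow(n, 3); }
    static double lowerExponent(double n) { return 7.0 * Math.pow(n, 2.8); }

    public static void main(String[] args) {
        double[] sizes = {100, 10_000, 100_000, 1_000_000};
        for (double n : sizes) {
            System.out.printf("n = %.0f: 1.2n^3 = %.3e, 0.8n^3 = %.3e, 7n^2.8 = %.3e%n",
                    n, original(n), smallerConstant(n), lowerExponent(n));
        }
    }
}
```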
Suppose we are writing an algorithm for searching the occurrence of a particular letter from a given
word. If the letter occurs at the beginning of the word then the f(n) is small. On the other hand, if the
particular letter does not appear in the given word then the f(n) is big.
Generally the complexity of an algorithm is measured by three certain cases.

• Best Case: The minimum value of f(n) for any possible input.
• Average Case: The average value of f(n) for a certain probabilistic distribution for the input.
• Worst Case: The maximum value of f(n) for any possible input.
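The letter-search example can be made measurable by counting comparisons. In this sketch f(n) is taken to be the number of character comparisons made by a simple left-to-right scan; that choice of measure, and the names used, are assumptions for illustration.

```java
// Sketch only: counts the comparisons made by a left-to-right letter search.
class LetterSearch {
    // Returns how many characters of word were compared with target before stopping.
    static int comparisons(String word, char target) {
        int count = 0;
        for (int i = 0; i < word.length(); i++) {
            count++;
            if (word.charAt(i) == target) {
                break;                    // found: stop scanning
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String word = "recursion";                       // n = 9 letters
        System.out.println(comparisons(word, 'r'));      // best case: prints 1
        System.out.println(comparisons(word, 's'));      // somewhere in between: prints 6
        System.out.println(comparisons(word, 'z'));      // worst case (absent): prints 9
    }
}
```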

Big-O Notation
It is a theoretical measure of the execution of an algorithm, usually the time or memory needed, given
the problem size n, which is usually the number of items. Informally, saying some equation
f(n) = O(g(n)) means it is less than some constant multiple of g(n). The notation is read, "f of n is big
oh of g of n".
Formal Definition: f(n) = O(g(n)) means there are positive constants c and k, such that 0 ≤ f(n) ≤ c g(n)
for all n ≥ k. The values of c and k must be fixed for the function f and must not depend on n.
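As a worked instance of the definition, take f(n) = 3n^2 + 10n and g(n) = n^2 (both chosen here purely for illustration). With c = 4 and k = 10 we need 3n^2 + 10n ≤ 4n^2, which simplifies to 10n ≤ n^2, and that holds for every n ≥ 10. So f(n) = O(n^2). The sketch below checks the inequality over a finite range; such a check is supporting evidence only, the algebra is what proves it.

```java
// Sketch only: f and g are example functions chosen for illustration.
class BigOCheck {
    static long f(long n) { return 3 * n * n + 10 * n; }
    static long g(long n) { return n * n; }

    // Tests 0 <= f(n) <= c * g(n) for every n from k up to upTo (a finite spot check).
    static boolean boundHolds(long c, long k, long upTo) {
        for (long n = k; n <= upTo; n++) {
            if (f(n) < 0 || f(n) > c * g(n)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(boundHolds(4, 10, 100_000)); // prints true: c = 4, k = 10 work
        System.out.println(boundHolds(4, 1, 100));      // prints false: k = 1 is too small for c = 4
        System.out.println(boundHolds(3, 10, 100));     // prints false: c = 3 is too small for any k
    }
}
```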

Figure 1.1: Graphical Representation of Big O Notation (the curve c g(n) bounds f(n) from above)


Performance of an algorithm (and therefore the corresponding program) is often measured in terms of
the space and time required to execute it. Generally, the time and space required to execute a program
are inversely related. Accordingly, if the program is required to be executed quickly, the space it will
occupy will be more and vice-versa.
A time-critical system, such as a real time system that needs a very small response time, would achieve
its objective by allowing the program to occupy more space in the memory. On the other hand, a
space-critical program would take more time to run.

However, an algorithm can almost always be developed that balances the available space and execution
time in a given system.
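One common way to spend space in order to save time is to store results that would otherwise be recomputed. The sketch below uses Fibonacci numbers purely as an illustration (they are not part of this lesson): the plain version uses no extra storage but repeats work, while the table-based version keeps one array entry per value and makes far fewer calls. The call counters are there only to make the difference visible.

```java
// Sketch only: Fibonacci is an illustrative example of trading space for time.
class SpaceForTime {
    static long naiveCalls = 0;
    static long memoCalls = 0;

    // No extra storage, but the same subproblems are solved again and again.
    static long fibNaive(int n) {
        naiveCalls++;
        if (n < 2) {
            return n;
        }
        return fibNaive(n - 1) + fibNaive(n - 2);
    }

    // Uses a table of n + 1 entries so each value is computed only once.
    static long fibMemo(int n) {
        return fibMemo(n, new long[n + 1]);
    }

    private static long fibMemo(int n, long[] table) {
        memoCalls++;
        if (n < 2) {
            return n;
        }
        if (table[n] != 0) {
            return table[n];              // already computed: reuse the stored value
        }
        table[n] = fibMemo(n - 1, table) + fibMemo(n - 2, table);
        return table[n];
    }

    public static void main(String[] args) {
        System.out.println(fibNaive(30) + " computed with " + naiveCalls + " calls");
        System.out.println(fibMemo(30) + " computed with " + memoCalls + " calls");
    }
}
```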

Check Your Progress
1. Fill in the blanks:
   (a) Iteration (looping) in functional languages is usually accomplished via .............
   (b) Recursion defines a function in terms of ..................
   (c) The case in which we end our recursion is called a ............... case.
2. Define algorithm.
3. How many parts are there in a formal algorithm? Mention the parts.
4. Define pseudocode.

1.4 LET US SUM UP


Recursion is a powerful way to solve problems. Recursive functions invoke themselves,
allowing an operation to be performed over and over. Recursion may require maintaining a stack, but
tail recursion can be recognized and optimized by a compiler into the same code used to implement
iteration in imperative languages.
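The difference between ordinary recursion and tail recursion can be sketched with the factorial function. The names factorial and factorialTail and the accumulator parameter acc are illustrative assumptions; whether a particular C++ compiler actually turns the tail call into a loop depends on the compiler and its optimization settings.

```cpp
// Plain recursion: the multiplication happens after the recursive call
// returns, so every call has to wait on the stack for its result.
unsigned long long factorial(unsigned int n) {
    if (n <= 1) {
        return 1;  // base case ends the recursion
    }
    return n * factorial(n - 1);
}

// Tail recursion: the recursive call is the last action of the function.
// The running product is carried along in the accumulator `acc`.
unsigned long long factorialTail(unsigned int n, unsigned long long acc = 1) {
    if (n <= 1) {
        return acc;  // base case: the accumulator already holds the answer
    }
    return factorialTail(n - 1, acc * n);
}
```

Both functions compute the same value; only the second leaves nothing to do after the recursive call returns.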
An algorithm is a well-defined list of steps for solving a particular problem. The formal algorithm
consists of two parts. The first part is a paragraph that describes the purpose of the algorithm,
identifies the variables used in the algorithm and lists the input data. The second part of the
algorithm consists of the list of steps that are to be executed.

The objective of analyzing an algorithm is to obtain quantitative measures of the resources required
by the algorithm during its execution. Performance of an algorithm is often measured in terms of the
space and time required to execute it. Generally, the time and space required to execute a program are
inversely related: if the program is required to execute quickly, the space it occupies will usually
be larger, and vice versa.

1.5 KEYWORDS
Recursion: It is a method in which a function calls itself.

Tail Recursion: It is defined as occurring when the recursive call is the last action of the
recursive function.
Flow Chart: Diagrammatic representation of an algorithm.
Pseudocode: An outline of a program, written in a form that can easily be converted into real
programming statements.
Data Structure: A combination of one or more basic data types to form a single addressable data type
along with operations defined on it.

Algorithm: A finite set of instructions which, when followed, accomplishes a particular task, the
termination of which is guaranteed under all cases.
ADT (Abstract Data Type): A mathematical model with a collection of operations defined on that
model.
Space Complexity: A quantitative measure of the space required by the algorithm during its execution.
Time Complexity: A quantitative measure of the time required by the algorithm during its execution.

1.6 QUESTIONS FOR DISCUSSION
1. What is recursion? How does recursion work?
2. Describe tail recursion.
3. Can you express each of the following algebraic formulae in a recursive form?
   (a) y = (x1 + x2 + ... + xn)
   (b) y = 1 + 2x + 4x^2 + 8x^3 + ... + 2^n x^n
   (c) y = (1 + x)^n
4. Given S = 1 + 2^2 + 3^2 + 4^2 + ... + n^2, where S is the sum of the squares of n numbers, write
   an algorithm to compute S.
5. Write an algorithm to compute the sums for the first n terms of the following series, where n has
   to be input by the user:
   (a) S = 1 + 2 + 3 + ...
   (b) S = 1 + 3 + 5 + ...
   (c) S = 2 + 4 + 6 + ...
   (d) S = 1 + 1/2 + 1/3 + ...
6. Given a number n, write an algorithm to compute the factorial of n.
7. Write short notes on the following:
   (a) ADT
   (b) Mathematical notations

Check Your Progress: Model Answers
1. (a) Recursion
   (b) Itself
   (c) Base
2. An algorithm is a well-defined list of steps for solving a particular problem.
3. The formal algorithm consists of two parts. The first part is a paragraph that describes the
   purpose of the algorithm, identifies the variables used in the algorithm and lists the input
   data. The second part of the algorithm consists of the list of steps that are to be executed.
4. Pseudocode is an outline of a program, written in a form that can easily be converted into
   real programming statements.

1.7 SUGGESTED READINGS
John R. Hubbard, Schaum's Outline of Data Structures with Java, 2nd Edition, McGraw Hill.
William Collins, Data Structures and the Java Collections Framework, 2nd Edition, McGraw Hill.
Kyle Loudon, Mastering Algorithms with C, O'Reilly & Associates.
Donald E. Knuth, The Art of Computer Programming, Addison-Wesley Professional.
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein, Introduction to
Algorithms, The MIT Press.
Burkhard Monien, Thomas Ottmann, Data Structures and Efficient Algorithms, Springer.
Shi-Kuo Chang, Data Structures and Algorithms, World Scientific.
R.G. Dromey, How to Solve it by Computer, Cambridge University Press.
Timothy A. Budd, Classic Data Structures in C++, Addison Wesley.
LESSON 2
LISTS, STACKS AND QUEUES
CONTENTS
2.0 Aims and Objectives
2.1 Introduction
2.2 Singly Linked List
    2.2.1 ADT of Singly Linked Lists
    2.2.2 Implementation
2.3 Application: Polynomial Addition
2.4 ADT of Stacks
2.5 Implementation
    2.5.1 Implementing a Stack using an Array
    2.5.2 Implementing Stacks using Linked Lists
2.6 Analysis of Stack Implementations
2.7 ADT of Queues
2.8 Queue Implementations
    2.8.1 Array Implementation of Queues
    2.8.2 Linked Implementation of Queues
2.9 Analysis of Queue Implementations
2.10 Let us Sum up
2.11 Keywords
2.12 Questions for Discussion
2.13 Suggested Readings

2.0 AIMS AND OBJECTIVES


After studying this lesson, you should be able to:
• Describe the application of ADT in linked lists
• Explain the application of ADT in stacks
• Identify the application of ADT in queues

2.1 INTRODUCTION
Computers help us in solving many real-life problems. We know that computers can work with
greater speed and accuracy than human beings. What does a computer actually do? A very simple
answer to this question is that a computer stores data and reproduces it as information as and when
required. Data should be represented in a proper format so that accurate information can be
produced at high speed. In this lesson, we will study the various ways in which data can be
represented. Efficient storage and retrieval of data is important in computing. In this lesson, we
will study and implement various data structures.
Organized data is known as information. Let us consider a few examples and try to understand what
data structures are. You must all have seen a pile of plates in a restaurant. Whenever a plate is
required, the plate on the top is removed. Similarly, if a plate is added to the pile, it is kept on
the top of the pile. There is a definite process involved in the storage and retrieval of the plates.
If the rack is empty, there will be no plates available. Similarly, if the rack is full, there will
be no place for more plates.
A similar process is applied with stacks. A stack is a data structure which stores data at the top;
this operation is known as push. It retrieves data from the top; this operation is known as pop. If
the stack is empty, then the pop operation raises an error, while the push operation cannot be
performed if the stack is full. A stack is shown in Figure 2.1.

Figure 2.1: Stack (pile of plates)


You must have seen a queue at the bus stop. There is a well-defined method for a person to enter the
queue as well as leave the queue to get into the bus. To enter a queue, a person stands at the end of
the queue, and the person at the start of the queue leaves the queue to enter the bus. Similarly, we
have the queue as a data structure in which a data element is added at the rear end and removed from
the front end. Figure 2.2 shows the structure of a queue.

front (leave the queue)          rear (enter the queue)

Figure 2.2: Queues


In a train, separate compartments are joined together. A new compartment can be added at the end, at
the beginning or in between the existing train compartments. The same method is followed to remove
a compartment. Linked lists in data structures follow a similar approach. An element can be inserted
at and deleted from any position in the linked list. A list can be shown as in Figure 2.3.

Figure 2.3: Linked List (Train Compartments)

There are other data structures that we will study in this course, for example graphs, trees etc. As
we have seen in the above examples, data structures store a given set of data according to certain
rules. These rules help in organizing the data.
You have studied programming languages before. This example will use some concepts from them to
explain why data structures are essential. Data can be stored in various ways. Usually the
representation chosen depends on the nature of the problem and not on the data. Consider a program
which requires storing the marks of five students for a single subject. The simplest way to store
them will be to use five integer variables. We assume that the roll numbers are from one to five. Now
I wish to write a program which can give me the marks of any roll number given as input. Is it
possible to write an efficient program to do this task if the numbers are stored as variables?

Figure 2.4: Marks Stored in five different Variables

The data, in this case the marks, is stored, but it cannot be reproduced as information efficiently.
Now we take an array of integers with five elements. The above data is stored in the array. An array
will have a name 'a' and each individual element can be accessed using an index. Therefore, the marks
for the first roll number will be stored as a[0], for the second roll number as a[1] and so on.
There is a relation between the roll number and the array index. Now it is much easier to access the
marks according to the roll number. This could not be achieved using separate variables, as there
was no relation between the data. It was not possible to relate marks and roll number.

a[0]   a[1]   a[2]   a[3]   a[4]

Figure 2.5: Marks Stored as an Array
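The relation between roll number and array index can be sketched as follows. The mark values and the function name marksFor are made up for illustration; the only point taken from the text is that roll number r is stored at index r - 1.

```cpp
#include <array>

// Marks of roll numbers 1..5 for one subject (sample values, assumed).
const std::array<int, 5> marks = {72, 65, 88, 54, 91};

// Roll number r is stored at index r - 1.
// Returns -1 when the roll number is outside 1..5.
int marksFor(int rollNumber) {
    if (rollNumber < 1 || rollNumber > static_cast<int>(marks.size())) {
        return -1;
    }
    return marks[rollNumber - 1];
}
```

With five separate variables the same lookup would need a chain of comparisons, one per variable; with the array a single index computation is enough.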

The above example can be slightly modified so that now we will store the marks of three subjects per
roll number. We can take three independent arrays to store them. We can access the individual marks
for each roll number from the respective arrays. We find that the arrays contain marks of different
subjects for the same roll number. It would be easier to handle the marks if these three arrays were
grouped together.

a[0]   a[1]   a[2]   a[3]   a[4]
b[0]   b[1]   b[2]   b[3]   b[4]
c[0]   c[1]   c[2]   c[3]   c[4]

Figure 2.6: Three Arrays to Store the Marks of three Subjects
Now, we need a data structure that stores the above information in logical relation with each other
so that it can be retrieved with higher speed. Structures in C/C++ store various fields together, so
now we can group the marks of the three subjects, each held as an array of five elements, in a single
structure. The above example shows that the method in which data should be represented depends on
the problem to be solved. The structure definition will be as follows:
struct record
{
    int a[5];
    int b[5];
    int c[5];
};
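An alternative grouping, sketched here as an assumption rather than as the layout intended by the text, keeps one structure per student and makes an array of five such structures. The names StudentMarks, records and totalFor and all mark values are hypothetical.

```cpp
// One record per student: marks in three subjects (field names are illustrative).
struct StudentMarks {
    int subject1;
    int subject2;
    int subject3;
};

// Five students with roll numbers 1..5; the values are made-up sample data.
StudentMarks records[5] = {
    {70, 65, 80}, {55, 60, 58}, {90, 85, 88}, {40, 52, 47}, {76, 81, 69}
};

// Total of the three subject marks for a roll number.
// The caller must pass a roll number in the range 1..5.
int totalFor(int rollNumber) {
    const StudentMarks& r = records[rollNumber - 1];
    return r.subject1 + r.subject2 + r.subject3;
}
```

Either layout relates the marks of one roll number to each other; which one is more convenient depends on whether the program mostly works per subject or per student.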
In the above problem, we have used three different types of representations for the data. They are
all built using the basic data structures. There are certain rules associated with each data structure
that restrict the storage and retrieval of data to an ordered format. We have seen examples like
stacks and queues, which restrict the entry and exit of their elements according to certain rules.
Stacks allow entry and exit of elements at the top position only, while in queues an element is added
at the end and removed from the front. Arrays and structures are built-in data structures in most
programming languages. Data can be stored in many other ways using other types of data structures.
Thus we can define a data structure as a collection of data elements whose organization is
characterized by accessing functions that are used to store and retrieve individual data elements.
Data structures are represented at the logical level using a tool called Abstract Data Type and
actually implemented using algorithms.
An Abstract Data Type shows how exactly the data structure behaves and what operations are required
for the data structure. It can be called the design of the data structure. Using the Abstract Data
Type, the data structure can be implemented using various algorithms. The Abstract Data Type clearly
states the nature of the data to be stored and the rules of the various operations to be performed on
the data. For example, the Abstract Data Type of a stack will clearly state that a new element will
be inserted at the top and an element can be removed only from the top of the stack. It does not
specify how these rules should be implemented. It specifies what the requirements of a stack are.
The implementation of a data structure is done with the help of algorithms. An algorithm is a logical
sequence of discrete steps that describe a complete solution to a given problem in a finite amount of
time. A task can be carried out in various ways, hence there is more than one algorithm which can be
used to implement a given data structure. This means that we will have to analyze algorithms to select
the one best suited for the application.

2.2 SINGLY LINKED LIST


An array is an example of a list. Arrays have a fixed size, which is declared at the start of the
program and therefore cannot be changed while the program is running. For example, suppose an array of
size 5 has been declared at the start of the program.

Now, this size cannot be changed while running the program. This, as we all know, is static
allocation. When writing the program, we have to decide on the maximum amount of memory that would be
needed. If we run the program on a small collection of data, then much of the space will go to waste.
If the program is run on a bigger collection of data, then we may exhaust the space and encounter an
overflow. Consider the following example:
Example: Suppose we define an array of size 5. If we store 5 elements in it, it is said to be full
and no space is left in it. On the contrary, if we store 2 elements in it, then 3 positions are empty
and virtually useless, resulting in wastage of memory.

A full array          An array with more than half of its locations empty

Figure 2.7
Dynamic data structures can avoid these difficulties. The idea is to use dynamic memory allocation.
We allocate memory for individual elements as and when they are required. Each memory location
contains a pointer to the location where the successive element is stored. A pointer (also called a
link or a reference) is a variable which stores the memory address of some other variable. If we use
pointers to locate the data in which we are interested, then we need not worry about where the data is
actually stored, since by using a pointer, we can let the computer system itself locate the data when
required.

Linked lists use the concept of dynamic memory allocation. In this respect, they are different from
arrays. Every node in a linked list contains a 'link' to the next node as shown below. This link is
achieved by using pointers.

Structure 1 (item, next) -> Structure 2 (item, next) -> Structure 3 (item, next) -> end

Figure 2.8: A Linked List


This type of list is called a linked list because it is a list where order is given by links from one
item to the next.

2.2.1 ADT of Singly Linked Lists


There are various operations which can be performed on lists. The list Abstract Data Type definition
given here contains a small collection of basic operations which can be performed on lists.

List ADT Specification
Value Definition: The value definition of a linked list contains a data type for storing the value of
the node along with the pointer to the next node. The value can be represented using a simple data
type or a collection of basic data types. However, it must necessarily contain at least one pointer to
the next structure. This can be shown as follows:
struct datatype
{
    int item;
    struct datatype *next;
};

or,
struct datatype
{
    int item;
    float info;
    char str;
    struct datatype *next;
};

Definition clause: The nodes of the list are all of the same type, and have a key field called key.
The list is logically ordered from the smallest unique key to the largest, i.e. at any position the
key of the element is greater than that of its predecessor and smaller than that of its successor.
Operations:

1. Crlist:
   Function: creates a list and initializes it as empty.
   Preconditions: none.
   Postconditions: list is created and is initialized as empty.
2. Insert:
   Function: inserts a new element into the list either at the beginning, in the middle or at the end.
   Preconditions: a list already exists.
   Postconditions: list is returned with the new element inserted in it.
3. Delete:
   Function: searches a list for the element and removes the element from the list.
   Preconditions: the list already exists.
   Postconditions: the list is returned with the element removed from it.
4. Print:
   Function: traverses the list and prints each element.
   Preconditions: the list already exists.
   Postconditions: list elements are printed in the order they are present in the list. List remains
   unchanged.
5. Modify:
   Function: searches for an element and replaces it with a new value.
   Preconditions: the list already exists.
   Postconditions: the element, if present, is replaced by the new value.

These are the basic set of operations that might be needed to create and maintain a list of elements.
Other operations which can be performed on linked lists are:
1. Counting the elements in a list.
2. Concatenating two lists.

Users can think of more operations like comparing two lists, adding the elements of two lists, etc.,
depending on a specific problem and try building ADTs of their own.
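A minimal sketch of how some of these operations might look in C++ is given below. It assumes integer keys kept in increasing order, as in the definition clause, and ignores duplicate keys on insertion; the names Node, createList, insertKey, deleteKey, printList and countNodes are hypothetical and are not part of the ADT itself.

```cpp
#include <iostream>

struct Node {
    int key;
    Node* next;
};

// Crlist: an empty list is represented by a null head pointer.
Node* createList() { return nullptr; }

// Insert: places key at its sorted position and returns the (possibly new)
// head. A key that is already present is ignored, keeping keys unique.
Node* insertKey(Node* head, int key) {
    if (head == nullptr || key < head->key) {
        return new Node{key, head};  // new first node
    }
    Node* cur = head;
    while (cur->next != nullptr && cur->next->key < key) {
        cur = cur->next;
    }
    if (cur->key == key || (cur->next != nullptr && cur->next->key == key)) {
        return head;  // duplicate key: list unchanged
    }
    cur->next = new Node{key, cur->next};
    return head;
}

// Delete: removes the node holding key, if any, and returns the head.
Node* deleteKey(Node* head, int key) {
    if (head == nullptr) {
        return nullptr;
    }
    if (head->key == key) {
        Node* rest = head->next;
        delete head;
        return rest;
    }
    Node* cur = head;
    while (cur->next != nullptr && cur->next->key != key) {
        cur = cur->next;
    }
    if (cur->next != nullptr) {
        Node* victim = cur->next;
        cur->next = victim->next;
        delete victim;
    }
    return head;
}

// Print: traverses the list without changing it.
void printList(const Node* head) {
    for (const Node* p = head; p != nullptr; p = p->next) {
        std::cout << p->key << ' ';
    }
    std::cout << '\n';
}

// Counting the elements in a list.
int countNodes(const Node* head) {
    int n = 0;
    for (const Node* p = head; p != nullptr; p = p->next) {
        ++n;
    }
    return n;
}
```

Returning the head pointer from insertKey and deleteKey is one design choice among several; it lets the caller handle changes to the first node with a simple assignment such as head = insertKey(head, 10).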

2.2.2 Implementation
Each element of the list is called a node and consists of two or more members. Some members can
contain the information pertaining to that node and the others may be pointers to other nodes. In case
of a singly linked list, one member consists of such a pointer. A linked list is therefore a
collection of structures ordered not by their physical placement in memory but by logical links that
are stored as part of the data in the structure itself. The link is in the form of a pointer to
another structure of the same type.
Such a structure is represented in C/C++ as follows:
struct node
{
    int item;
    struct node *next;
};
The first member is an integer item and the second a pointer to the next node in the list as shown.
The item can be any complex data type, that is, it can contain a collection of basic data types.
Further, as we will see when we study doubly linked lists later, a node can have two pointers, one
pointing to the successor node and the other to the predecessor. The pointer type is the type of the
node itself. This node can be shown as follows:
struct node
{
    int item;
    float info;
    char str;
    struct node *next;
};
Right now, we will limit our discussion to singly linked lists with only two members, i.e. one
containing the data and the other a pointer to the next node.

node: item | next

Figure 2.9: Node
Such structures, which contain a member field that points to the same structure type, are called
self-referential structures. A node may be represented in general form as follows:
struct label-name
{
    type member1;
    type member2;
    type member3;

    struct label-name *next;
};

Figure 2.10: Structure Declaration for the Node

The node may contain more than one item with different data types. However, one of the items must
be a pointer of the type label-name. The above node with all its members can be depicted as follows:

[Figure: a node with its members]

Consider a simple example to understand the concept of linking. Suppose we define a structure as
follows:
struct list
{
    int value;
    struct list *next;
};
Assume that the list contains two nodes, viz. node1 and node2. They are of type struct list and are
defined as follows:
struct list node1, node2;
This statement creates space for two nodes, each containing two empty fields as shown below:

node1: node1.value | node1.next
node2: node2.value | node2.next

Figure 2.11: Creation of the two Nodes

The next pointer of node1 can be made to point to node2 by the statement
node1.next = &node2;

This statement stores the address of node2 into the field node1.next and thus establishes a "link"
between node1 and node2 as shown below:

node1: node1.value | address of node2  ->  node2: node2.value | node2.next

Figure 2.12: Node 1 pointing to Node 2

Now we can assign values to the field value.
node1.value = 30;
node2.value = 40;
The result is as follows:

node1: 30 | address of node2  ->  node2: 40 | node2.next

Figure 2.13: The Complete List of two Nodes


Assume that the address of node2 is 871800. As you can see, that address is now stored in the next
field of node1.
We may continue this process to create a linked list of any number of values. Each time you need to
store a value, allocate the node and use it in the list. For example,
node2.next = &node3; would add another link.
Also, every list must have an end. This is necessary for processing the list. C has a special pointer
value called NULL that can be stored in the next field of the last node.
In the above two-node list, the end of the list is marked as follows:
node2.next = NULL;
The value of the value member of node2 can be accessed using the next member of node1 as follows:
cout << "\n" << node1.next->value;
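The reason the end marker matters can be sketched with a small traversal routine. The function name sumValues is hypothetical, and nullptr is used here as the modern C++ spelling of the NULL end marker described above.

```cpp
struct list {
    int value;
    list* next;
};

// Follows the next pointers from `start` until the end marker is reached
// and returns the sum of the values met on the way.
int sumValues(const list* start) {
    int total = 0;
    for (const list* p = start; p != nullptr; p = p->next) {
        total += p->value;
    }
    return total;
}
```

For the two-node list built above (values 30 and 40, with node2.next set to the end marker), calling sumValues(&node1) visits node1, then node2, and stops, giving 70. Without the end marker the loop would have no way of knowing where the list stops.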

2.3 APPLICATION: POLYNOMIAL ADDITION


We will now try to add two polynomials using linked lists. In mathematics a polynomial is written as
follows:
P(x) = a1 x^n + a2 x^(n-1) + ... + an

Each term will be represented as a node. The node will be of fixed size having 3 fields, which
represent the coefficient and exponent of a term plus the pointer to the next term.
Assuming that all coefficients are integers, the required declarations in C are as follows:
struct link_list
{
    int coef;
    int expo;
    struct link_list *next;
};
Polynomial nodes will be drawn as below:
For example, the polynomial a : 3x1a + 2xtc + 3xa will be stored as:

In order to add two polynomials together, we examine their starting terms. Both polynomials are
assumed to be stored in decreasing order of exponent. Two pointers are used to move along the two
polynomials. If the exponents of two terms are equal, then the coefficients are added and a new term
is created for the result. If the exponents are not equal, then the term with the bigger exponent is
attached to the resultant polynomial and only the pointer into that polynomial is advanced.
Following is the algorithm and program for the above problem:
/* this program adds two polynomials using linked lists */
Algorithm
Step 1: Start.
Step 2: Ask the user to enter the first polynomial.
Step 3: Ask the user to enter the value of coefficient and exponent, or '0,0' to terminate the
polynomial.
Step 4: Ask the user to enter the second polynomial.
Step 5: Repeat step 3.
Step 6: If the user enters values greater than 0, then match the values of the different exponents
entered by the user and then perform addition of the coefficients with similar exponents.
Step 7: Display the results after the polynomial addition.
Step 8: End.
#include <cstddef>
#include <iostream>
using namespace std;

// One term of a polynomial: coefficient, exponent and a link to the next term.
struct link_list
{
    int coef;
    int expo;
    struct link_list *next;
};
typedef struct link_list node;

class polyadd
{
public:
    void crpoly(node *list);
    void padd(node *list1, node *list2, node *list3);
    void print(node *list);
};

// Create the polynomial: read terms until the pair 0,0 is entered.
// Each new term is attached after the node 'list'.
void polyadd::crpoly(node *list)
{
    int x, y;
    cout << endl << "input a pair of numbers as 'coef,expo':";
    cout << endl << "input '0,0' to stop entering,";
    cout << endl << "input the coefficient ";
    cin >> x;
    cout << endl << "input the exponent ";
    cin >> y;
    if (!cin || (x == 0 && y == 0))
    {
        list->next = NULL;              // end of the polynomial
    }
    else
    {
        list->next = new node();
        list->next->coef = x;
        list->next->expo = y;
        crpoly(list->next);
    }
}

// Add two polynomials whose terms are in decreasing order of exponent.
// The sum is built as a chain of new nodes attached after list3.
void polyadd::padd(node *list1, node *list2, node *list3)
{
    if (list1 == NULL && list2 == NULL)
    {
        list3->next = NULL;             // both polynomials are exhausted
        return;
    }
    list3->next = new node();
    list3->next->next = NULL;
    if (list1 == NULL)                  // only the second polynomial has terms left
    {
        list3->next->coef = list2->coef;
        list3->next->expo = list2->expo;
        padd(list1, list2->next, list3->next);
    }
    else if (list2 == NULL)             // only the first polynomial has terms left
    {
        list3->next->coef = list1->coef;
        list3->next->expo = list1->expo;
        padd(list1->next, list2, list3->next);
    }
    else if (list1->expo == list2->expo)
    {
        list3->next->coef = list1->coef + list2->coef;
        list3->next->expo = list1->expo;
        padd(list1->next, list2->next, list3->next);
    }
    else if (list1->expo > list2->expo)
    {
        list3->next->coef = list1->coef;
        list3->next->expo = list1->expo;
        padd(list1->next, list2, list3->next);
    }
    else
    {
        list3->next->coef = list2->coef;
        list3->next->expo = list2->expo;
        padd(list1, list2->next, list3->next);
    }
}

// Print every term that follows the header node 'list'.
void polyadd::print(node *list)
{
    if (list->next != NULL)
    {
        cout << endl << "Coefficient is " << list->next->coef;
        cout << endl << "Exponent is " << list->next->expo;
        print(list->next);
    }
}

int main()
{
    polyadd *pa = new polyadd();
    // Header nodes: they hold no term, the terms start at head->next.
    node *head1 = new node();
    node *head2 = new node();
    node *head3 = new node();

    cout << "Enter first polynomial";
    pa->crpoly(head1);
    cout << endl << "Printing first polynomial";
    pa->print(head1);
    cout << endl << "Enter second polynomial";
    pa->crpoly(head2);
    cout << endl << "Printing second polynomial";
    pa->print(head2);
    pa->padd(head1->next, head2->next, head3);
    cout << endl << "Printing after addition";
    pa->print(head3);
    cout << endl;
    return 0;
}
Sample Output:
Enter first polynomial
input a pair of numbers as 'coef,expo':
input '0,0' to stop entering,
input the coefficient 2
input the exponent 2
input a pair of numbers as 'coef,expo':
input '0,0' to stop entering,
input the coefficient 1
input the exponent 1
input a pair of numbers as 'coef,expo':
input '0,0' to stop entering,
input the coefficient 0
input the exponent 0
Printing first polynomial
Coefficient is 2
Exponent is 2
Coefficient is 1
Exponent is 1
Enter second polynomial
input a pair of numbers as 'coef,expo':
input '0,0' to stop entering,
input the coefficient 3
input the exponent 3
input a pair of numbers as 'coef,expo':
input '0,0' to stop entering,
input the coefficient 1
input the exponent 1
input a pair of numbers as 'coef,expo':
input '0,0' to stop entering,
input the coefficient 0
input the exponent 0
Printing after addition
Coefficient is 3
Exponent is 3
Coefficient is 2
Exponent is 2
Coefficient is 2
Exponent is 1

2.4 ADT OF STACKS


A stack is a dynamic structure, i.e. it changes as elements are added to and removed from it. The
operation that adds an element to the top of a stack is usually called PUSH and the operation that
takes the top element off the stack is called POP. When we start using the stack for the first time,
it must be empty, so we need to create an empty stack. We must also have an operation to check
whether a stack has something in it to pop or not. For particular implementations one may need to
check whether a stack is full or not. The ADT of such operations is given as follows:

ADT Specification for Stacks

Value definition: A stack can contain anything of the type for which its implementing data structure
is defined, i.e. integers, characters, complex records etc.

Definition clause: A stack, as explained, is a list of elements in which the item added most recently
is taken out first, i.e. the Last item In is the First one Out. Therefore, a stack can be defined as a
LIFO list of elements.
Operations:

1. Create:
   Function: initializes stack to an empty state.
   Precondition: none.
   Postcondition: stack is created and is empty.
2. Push:
   Function: adds new element to the top of the stack.
   Precondition: stack is created and is not full.
   Postcondition: original stack plus new element added on top.
3. Pop:
   Function: removes top element from the stack.
   Precondition: stack is created and is not empty.
   Postcondition: original stack with top element removed.
4. Empty:
   Function: tests whether stack is empty.
   Precondition: stack is created.
   Postcondition: answer as yes or no depending on the status of the stack; no change in stack
   contents.
5. Full:
   Function: tests whether stack is full.
   Precondition: stack is created.
   Postcondition: answer as yes or no depending on the status of the stack; no change in stack
   contents.
6. Destroy:
   Function: removes all elements from stack, leaving the stack empty.
   Precondition: stack is created.
   Postcondition: stack is empty.

These are the ADT specifications for some of the operations which are commonly performed on
stacks. The reader can think of more such operations depending on a particular situation and
requirement and can write ADTs for them along the same lines.
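The LIFO behaviour described by this ADT can be observed with the standard library container adaptor std::stack, used here only as a ready-made stand-in and not as the implementation developed in this lesson. The helper name pushThenPopAll is hypothetical.

```cpp
#include <stack>
#include <vector>

// Pushes the items in the given order, then pops them all and returns
// the order in which they came off the stack.
std::vector<int> pushThenPopAll(const std::vector<int>& items) {
    std::stack<int> s;               // Create: the stack starts empty
    for (int x : items) {
        s.push(x);                   // Push: add on top
    }
    std::vector<int> out;
    while (!s.empty()) {             // Empty: checked before every Pop
        out.push_back(s.top());      // read the top element
        s.pop();                     // Pop: remove the top element
    }
    return out;
}
```

Pushing 1, 2, 3 and then popping everything yields 3, 2, 1: the last item in is the first one out. std::stack has no Full operation because it grows on demand; that operation matters for fixed-size implementations such as an array.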

2.5 IMPLEMENTATION
A stack can be implemented using both static and dynamic implementations, i.e. the space can be
allocated at compile time itself or at the execution time of the program. Each implementation has its
own advantages and disadvantages, which we will consider later.

2.5.1 Implementing a Stack using an Array


Since all elements of the stack are of the same type, an array can be used to contain them. We can
store elements in sequential slots in the array, placing the first element pushed in the first array
position, the second element in the second array position and so on.
We also need to know how to find the top element when we want to pop and where to put the new
element when we push. Although we can access any element of an array directly, we have to conform
to the stack restriction of "LIFO". So, we will access the stack elements only through the top, not
through the bottom or through the middle.
Therefore, even though the representation of the stack may be a random-access structure such as an
array, the stack itself as a logical entity is not randomly accessed. We can use its top element
only. In the array representation, a variable 'top' keeps track of the top position of the stack. An
empty stack is signaled by stack top being zero and a full stack by top being greater than the last
storage location.
In the following programs, stack is an array of size 'MAX' and top is a variable which keeps track of
the top position of the stack. Before writing the actual code, let us try and write an algorithm for
it.
Algorithm
Step 1: Start.
Step 2: Declare a stack of fixed size.
Step 3: Insert elements into the stack and display the status of the stack after each insertion.
Step 4: Insert values into the stack until the stack is full.
Step 5: Start the pop operation and display the status of the stack after each pop operation.
Step 6: Continue with the pop operation until the stack is empty.
Step 7: End.
Program
To implement a stack using arrays.
#include <iostream>
using namespace std;

class IntStack
{
protected:
    int count;                  // number of elements currently on the stack
public:
    IntStack(int num)
    {
        top = 0;                // index of the next free slot; 0 means empty
        maxelem = num;
        s = new int[maxelem];
        count = 0;
    }
    ~IntStack()
    {
        delete[] s;
    }
    // Pushes t and returns the number of elements now on the stack.
    // If the stack is full, nothing is stored and maxelem is returned.
    int push(int t)
    {
        if (top == maxelem)
            return maxelem;
        s[top++] = t;
        count++;
        return count;
    }
    // Removes and returns the top element, or -1 if the stack is empty.
    int pop()
    {
        if (top <= 0)
        {
            return (-1);
        }
        top = top - 1;
        count--;
        cout << "top element is " << s[top] << "\n";
        return (s[top]);
    }
    // Shows the stack from the top element down to the bottom.
    void display_pop()
    {
        if (top <= 0)
        {
            cout << "(empty)\n";
            return;
        }
        for (int t = top - 1; t >= 0; t--)
            cout << s[t] << " ";
        cout << "\n";
    }
    // Shows the stack from the bottom element up to the top.
    void display_push()
    {
        if (top <= 0)
        {
            cout << "(empty)\n";
            return;
        }
        for (int t = 0; t < top; t++)
            cout << s[t] << " ";
        cout << "\n";
    }
    int empty()
    {
        return top == 0;
    }
private:
    int *s;
    int top;
    int maxelem;
};

int main()
{
    IntStack *s = new IntStack(100);
    int count;
    count = s->push(1);
    s->display_push();
    count = s->push(2);
    s->display_push();
    count = s->push(3);
    s->display_push();
    count = s->push(4);
    s->display_push();
    for (int i = 0; i < count; i++)
    {
        s->pop();
        s->display_pop();
    }
    delete s;
    return 0;
}
M.S. Universitv - D.D.C.E. Lists, Stacks and Queues 43

2.5.2 Implementing Stacks using Linked Lists


As we know, linked lists use the concept of dynamic memory allocation, i.e. memory is allocated as and when needed. Unlike the array version, this implementation has no program-imposed maximum stack size. Nodes are allocated dynamically, using the new operator, as and when needed. If the call to new fails and returns NULL (as the nothrow form of new does in standard C++), it is assumed that memory is full. In case of successful memory allocation, the new structure is logically added. An empty stack is represented by stack top (stack pointer) pointing to NULL. Before writing the actual code, let us write an algorithm for the program.

Algorithm
Step 1: Start.
Step 2: Declare the structure of the linked list.
Step 3: Insert elements through the top of the linked list and increment the top position by 1 after each insertion.
Step 4: Insert the elements into the list until it is full.
Step 5: Perform the pop operation by decrementing the top position by 1 after each pop.
Step 6: Display the list after every pop operation until the list is empty.
Step 7: End.
Program
To implement stacks using linked lists.
/* Program to show push and pop operations using linked lists */
// Implementation of stack using Linked Lists
#include <cstddef>
#include <iostream>
#include <new>
using std::cout;
using std::endl;

struct node
{
    int data;
    node *link;
};

class stack
{
private:
    node *top;
public:
    stack()
    {
        top = NULL;
    }
    void push(int item);
    int pop();
    ~stack()                             // free every remaining node
    {
        node *temp;
        while (top != NULL)
        {
            temp = top;
            top = top->link;
            delete temp;
        }
    }
};

void stack::push(int item)
{
    node *temp = new (std::nothrow) node;   // returns NULL on failure
    if (temp == NULL)
    {
        cout << endl << "Stack is Full";
        return;                          // do not touch a null pointer
    }
    temp->data = item;
    temp->link = top;
    top = temp;
}

int stack::pop()
{
    if (top == NULL)
    {
        cout << endl << "Stack is Empty";
        return -1;                       // sentinel for an empty stack
    }
    node *temp;
    int item;
    temp = top;
    item = temp->data;
    top = top->link;
    delete temp;
    return item;
}

int main()
{
    stack s;
    s.push(11);
    s.push(12);
    s.push(13);
    s.push(14);
    s.push(15);
    int i = s.pop();
    cout << endl;
    cout << "Item Popped=" << i << endl;
    i = s.pop();
    cout << "Item Popped=" << i << endl;
    i = s.pop();
    cout << "Item Popped=" << i << endl;
    i = s.pop();
    cout << "Item Popped=" << i << endl;
    i = s.pop();
    cout << "Item Popped=" << i << endl;
    i = s.pop();                         // sixth pop: the stack is already empty
    cout << "Item Popped=" << i << endl;
    return 0;
}

2.6 ANALYSIS OF STACK IMPLEMENTATIONS


As stated in an earlier lesson, an array variable of MAX stack size will take the same amount of memory, no matter how many array slots are actually used. Therefore, we need to reserve space for the maximum possible. If more elements need to be stored, then we may fall short of memory. Also, if the elements stored are very few, we may have a lot of unused space. On the other hand, the linked implementation using dynamically allocated storage space only requires space for the number of elements actually present in the stack at run time. But the elements are larger, since we must store the link (the next field) along with the user's data.
Apart from the space requirement, the two implementations can be compared on other criteria also, for example programming effort and program complexity. We can compare the efficiency of the two representations with respect to each other in terms of Big Oh notation.

1. Create operation: In both implementations, operations like creating a stack have measure O(1). In an array implementation, only the stack has to be declared using arrays. Also, in the linked implementation, only the memory for the first node has to be allocated. Therefore, a constant amount of work is involved at all times.
2. Empty or full operation: In both implementations, operations like checking whether a stack is full or empty have Big Oh measure O(1) because the algorithm has to check for the presence of just one element.
3. Push or pop operation: In push or pop operations, the number of elements in the stack does not affect the amount of work done by these operations, because in both operations we directly access the top of the stack, i.e. only one element. Therefore, push and pop have measure O(1).
4. Destroy operation: Probably, this is the only operation amongst the basic ones which differs from one implementation to other. In the array version, we just have to set the top field to zero, so it is an O(1) operation. But in the linked version, we must process every node in the stack to free the node space. Therefore, the operation is O(n), where n is the number of nodes in the stack.
In all, the two implementations are almost equivalent in terms of amount of work they do. Since the destroying operation is not widely used, the difference is not significant. Table 2.1 summarizes the Big Oh measures of various operations on the two implementations.
Table 2.1: Big Oh Measure of Common Stack Operations

Operations    Array Implementation    Linked Implementation
Create        O(1)                    O(1)
Empty         O(1)                    O(1)
Full          O(1)                    O(1)
Push          O(1)                    O(1)
Pop           O(1)                    O(1)
Destroy       O(1)                    O(n)
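The one row of the table where the two columns differ can be seen in a few lines of code. The following is only a sketch: the types ArrayStack and Node and the two function names are hypothetical, chosen for illustration, and are not part of the programs in this lesson.

```cpp
// Hypothetical types for illustration only.
struct ArrayStack
{
    int items[100];
    int top;                // index of the next free slot; 0 means empty
};

struct Node
{
    int data;
    Node *link;
};

// Array version: a single assignment, whatever the stack holds, so O(1).
void destroyArrayStack(ArrayStack &s)
{
    s.top = 0;
}

// Linked version: every node is visited and freed, so O(n).
// Returns the number of nodes that were freed.
int destroyLinkedStack(Node *&top)
{
    int freed = 0;
    while (top != nullptr)
    {
        Node *temp = top;
        top = top->link;
        delete temp;
        ++freed;
    }
    return freed;
}
```

The array version does the same amount of work for 1 element or 100, while the loop in the linked version runs once per node.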
The decision to use one of the two implementations depends on the situation.
Linked implementation is more flexible and is preferable where the number of stack elements varies greatly. It wastes less space when the stack is small. When stack size is unpredictable, linked implementation is preferable. Array implementation is short, simple and efficient. When we are sure that we will not need to exceed the declared stack size, the array implementation is a good choice.
For example, if a customer database for a bank is to be maintained, then the linked implementation would be preferable because the number of customers is expected to vary greatly. On the contrary, if a fixed list of students of a class is to be maintained, array implementation would be more convenient.

Examples: A stack is an appropriate data structure when information must be saved and then later retrieved in reverse order. Any situation requiring us to store a previous step and come back to it in future may be a good one to use a stack. Following are some examples using stack as their data structure.
Reversing an Input Text Line

Brief
As a simple example of using stacks, let us try to make a function that will read a line of input and will then write it out backward. We can accomplish this task by pushing each character onto the stack as it is read. When we come to the end of the input, we will pop characters off the stack and they will come off in the reverse order.
Program
To reverse an input text line.
#include <iostream>
using std::cin;
using std::cout;

#define MAX 10                           /* defining stack size */

int main()                               /* main starts here */
{
    int top = 0;
    char stack[MAX], c;                  /* declaring a character stack */
    cout << "\n enter the sequence of characters: ";
    /* accepting the text; characters beyond MAX are ignored */
    while (cin.get(c) && c != '\n')
    {
        if (top < MAX)
        {
            stack[top] = c;
            top++;
        }
    }
    cout << "\n the reversed string is:";   // printing the text in reverse order
    while (top != 0)
    {
        cout << stack[top - 1];
        top--;
    }
    return 0;
}
/* main ends here */
Validating an Expression by Parenthesis Matching
Brief
Consider a mathematical expression that includes several sets of nested parentheses, e.g.
(a-((b+c(d))))

We want to make sure that the parentheses are nested correctly, i.e.
1. We want to check that there are equal numbers of right and left parentheses.
2. Every right parenthesis is preceded by a matching left parenthesis.
Expressions such as

((a+b)
violate condition 1, and expressions such as

)a+b(c
violate condition 2.

Before writing the actual code for the program, let us write an algorithm for it and try to understand the logic.

Algorithm
Step 1: Start.
Step 2: Declare a character array to store opening braces.
Step 3: Start accepting the expression.
Step 4: If the character is an opening brace, e.g. '(' or '{' or '[', push it onto the stack. If there are successive opening braces, keep pushing them on the stack.
Step 5: If the character is a closing brace, e.g. ')' or '}' or ']',
1. Check if the stack is empty.
2. If the stack is empty, it means there was no corresponding opening brace for the closing brace. Therefore, the expression is invalid.
3. If the stack is not empty, pop an element from the stack.
4. If the popped opening brace corresponds to the closing brace, then the expression is valid so far.
5. Else the expression is invalid.
Step 6: When we come to the end of accepting the expression,
1. If there are still some opening braces left in the stack, it means that the expression is invalid.
2. If the stack is empty, the expression is valid.
Step 7: End.
/* Program: To validate an expression by matching left and right parentheses */
// A PROGRAM TO TEST FOR THE MATCHING PARENTHESIS IN AN EXPRESSION.
#include <iostream>
using std::cin;
using std::cout;

#define MAX 10

class stk
{
public:
    char stack[MAX];
    char ele;
    int top;
    stk()
    {
        top = 0;
    }
    int push(char c)
    {
        if (top == MAX)                  // no room left; report failure
            return -1;
        stack[top] = c;
        return ++top;
    }
    char pop()
    {
        ele = stack[top - 1];
        stack[top - 1] = 0;
        top--;
        return ele;
    }
};

int main()
{
    stk s;
    int flag = 1;                        // 1 = valid so far, 0 = invalid
    char ret;
    char c;
    cout << "Enter the Expression: ";
    while (cin.get(c) && c != '\n')
    {
        if (c == '(' || c == '{' || c == '[')
        {
            if (s.push(c) == -1)         // nesting deeper than MAX
            {
                flag = 0;
                break;
            }
        }
        else if (c == ')' || c == '}' || c == ']')
        {
            if (s.top == 0)              // closing brace with nothing open
            {
                flag = 0;
                break;
            }
            ret = s.pop();
            if (!((ret == '(' && c == ')') || (ret == '{' && c == '}') || (ret == '[' && c == ']')))
            {
                flag = 0;                // braces of different kinds
                break;
            }
        }
    }
    if (s.top != 0)                      // some opening braces were never closed
    {
        flag = 0;
    }
    if (flag == 1)
        cout << "Expression is valid.";
    else
        cout << "Expression is invalid.";
    return 0;
}
We have tried to keep the above program as simple as possible, because our aim was to illustrate the concept of stacks and not to make an extensive expression evaluator. This is why we have made some assumptions, which are as follows:

The program assumes that the whole expression is contained in brackets. For example, an expression should not be entered as:

a+(b+c) - d
Rather, it has to be in the following format:
(a+(b+c) - d)

The reason for using a stack in this problem should be clear. The last parenthesis to be opened should be the first one to be closed. This is simulated by a stack in which the last element to arrive is the first to leave. Each element on the stack represents a parenthesis that has been opened but has not yet been closed. Pushing an item onto the stack corresponds to an opening brace and popping an item from the stack corresponds to closing a parenthesis.

Figure 2.14

The Figure 2.14 depicts the contents of the stack at various stages of processing the expression:

{a+(b-c)}
Postfix Expression Evaluation

Brief
The sum of 2 and 3 is normally represented as 2+3. This is called infix notation. The same sum can be represented as +23, which is called prefix notation, and 23+, which is called postfix notation.

The prefixes "in", "pre" and "post" refer to the relative position of the operator with respect to the two operands. In prefix notation, the operator is before the two operands. In infix notation, the operator is in between the two operands. In postfix notation, the operator comes immediately after the two operands. The reader should gather more information on various notations in relevant literature.

Postfix notation has some obvious advantages over the most commonly used infix notation.
1. Need for parenthesis is eliminated.
2. Knowledge of operator precedence is not required.
We can try to develop a program for evaluating a postfix expression. Each operator encountered refers to the previous two operands. Each time we come across an operand we push it on to a stack. When we reach an operator, its operands will be the two top elements on the stack. We can pop these two elements, perform the operation on them and push back the result on to the stack.

It is then available for use with the next operator. The following program evaluates a postfix expression using this method. But let us write an algorithm first.

Algorithm
Step 1: Start.
Step 2: Declare the structure of the stack.
Step 3: Accept an expression from the user (e.g. 12+ or 22* etc.).
Step 4: Check if the operator is:
1. '+', then perform addition between the values and store as result.
2. '-', then perform subtraction between the values and store as result.
3. '*', then perform multiplication between the values and store as result.
4. '/', then perform division between the values and store as result.
Step 5: Push the result obtained from the expression and store it on the top position of the stack.
Step 6: Display the result on the screen.
Step 7: End.
Sample Output

Enter the expression: 49*5+

The result is 41.
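No listing accompanies the sample output, so the following is only a minimal sketch of the method described above, not the original program. It assumes single-digit operands (which is what the sample input 49*5+ suggests) and the four operators named in the algorithm. The function name evaluatePostfix is hypothetical, and std::stack from the standard library stands in for a hand-written stack.

```cpp
#include <cctype>
#include <stack>
#include <string>

// Hypothetical helper: evaluates a postfix expression made of single-digit
// operands and the operators + - * /. Returns false on malformed input
// or division by zero; otherwise stores the value in 'result'.
bool evaluatePostfix(const std::string &expr, int &result)
{
    std::stack<int> operands;
    for (char c : expr)
    {
        if (std::isdigit(static_cast<unsigned char>(c)))
        {
            operands.push(c - '0');          // operand: push it on the stack
        }
        else if (c == '+' || c == '-' || c == '*' || c == '/')
        {
            if (operands.size() < 2)
                return false;                // operator lacks two operands
            int right = operands.top();      // the top element is the second operand
            operands.pop();
            int left = operands.top();
            operands.pop();
            if (c == '/' && right == 0)
                return false;
            int value = 0;
            switch (c)
            {
            case '+': value = left + right; break;
            case '-': value = left - right; break;
            case '*': value = left * right; break;
            case '/': value = left / right; break;
            }
            operands.push(value);            // result is available to the next operator
        }
        else if (c != ' ')
        {
            return false;                    // unexpected character
        }
    }
    if (operands.size() != 1)
        return false;                        // operands left over or none at all
    result = operands.top();
    return true;
}
```

For the sample input, 4 and 9 are pushed, '*' replaces them with 36, 5 is pushed, and '+' leaves 41 as the only element on the stack.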
Stack for Sub-programs

This is one of the most important applications of stacks. What happens within the computer when sub-programs are called? The system (or the program) must remember the place where the call was made, so that it can return there after the sub-program is complete. It must also remember all the local variables, CPU registers, and other data, so that information is not lost while the sub-program is working. We can think of all this information as one large structure, a temporary storage area for each sub-program.
Suppose that we have 3 sub-programs called A, B, and C, and suppose that A invokes B and B invokes C. Then B has not completed its work until C has finished and returned. Similarly, A is the first to start work, but it is the last to be finished, not until after B has finished and returned. Thus the sequence in which this activity proceeds is summed up as the property last in, first out. If we consider the machine's task of assigning temporary storage areas for use by sub-programs, then those areas would be allocated in a list with this same property, that is, in a stack.
The example is represented in the Figure 2.15.

Figure 2.15: Stack for Sub-programs at Various Stages
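The order in which the storage areas of A, B and C are released can be imitated with an explicit stack. This is only an illustrative sketch: a real system keeps return addresses and local variables in each area, whereas here each area is reduced to the name of its sub-program, and the function name simulateCalls is hypothetical.

```cpp
#include <stack>
#include <string>
#include <vector>

// Imitates A invoking B and B invoking C. Each call pushes a storage
// area (here just a name); each return pops the most recent one.
// Returns the names in the order in which the sub-programs finish.
std::vector<std::string> simulateCalls()
{
    std::stack<std::string> areas;
    std::vector<std::string> finished;

    areas.push("A");                 // A starts work
    areas.push("B");                 // A invokes B
    areas.push("C");                 // B invokes C

    while (!areas.empty())
    {
        finished.push_back(areas.top());   // the most recent call returns first
        areas.pop();
    }
    return finished;
}
```

The returned order is C, B, A: the first sub-program to start is the last to finish, which is the last in, first out property.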

2.7 ADT OF QUEUES


A queue is an ordered collection of items from which items can be deleted from one end (called the front of the queue) and into which items can be inserted at one end only (called the rear of the queue).
As opposed to stacks, queues are FIFO lists. The element that was the First to be In will be the First to be Out. Put in other words, the element that has spent the longest time in the queue will be the first to be taken out. This is opposite to a stack, in which we know that the element that has spent the least time is the first to be out, i.e. LIFO.

There are various examples of queues in the real world. A line at a railway counter or a bus stop is a familiar example of a queue. The person first in the queue will be the first to get the ticket. Similarly, any new passenger will have to stand at the back of the queue.
To add elements to a queue we access the rear of the queue. To remove elements we access the front. The middle elements are logically inaccessible, even if we physically store the queue elements in a random access structure such as an array.

The essential property of the queue is its FIFO access.


As it is clear by now, there are two operations that can be applied to a queue. First, new elements are added to the rear of the queue. We will call this operation enterq. We can also take the elements off the front of the queue. We will call this operation exitq.
We are also required to check whether the queue contains anything or is empty. Theoretically, we can always enter in a queue, for in principle, a queue is not limited in size. But certain implementations, e.g. an array implementation, require us to check whether the data structure is full before entering a new element. Therefore, we can also have an operation for this purpose. Before doing anything, we also need to create a queue and initialize it to an empty state. Also, we might want to delete all elements of the queue, leaving an empty structure.
Following is the ADT representation of some of the common operations that can be performed on queues.

• Value definition: A queue can contain anything of the type its implementing data structure is defined for, i.e. integers, characters, complex records etc.
• Definition clause: A queue as explained is a list of elements in which the item added first is taken out first, i.e. the First item In is the First one Out. Therefore, a queue can be defined as a FIFO list of elements.
• Operations:

1. Create:
Function: initializes queue to empty state.
Precondition: none.
Postcondition: A queue is created and is empty.
2. Enterq:
Function: adds new element at the rear of the queue.
Precondition: queue is created and is not full.
Postcondition: original queue plus new element added at the rear.
3. Exitq:
Function: removes front element of the queue.
Precondition: queue is created and is not empty.
Postcondition: original queue minus the front element.
4. Empty:
Function: checks whether the queue is empty.
Precondition: queue is created.
Postcondition: answers 'yes' or 'no'; original queue unchanged.
5. Full:
Function: checks whether the queue is full.
Precondition: queue is created.
Postcondition: answers 'yes' or 'no'; original queue unchanged.
Students can try their own ADT specifications for any other operations that they may come across with various implementations.
The enterq operation can always be performed, since there is no limit to the number of elements a queue may contain. So, an overflow situation should never occur. The exitq operation, however, can be applied only if the queue is nonempty.
The result of an illegal attempt to remove an element from an empty data structure is called underflow.
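The specification above can be written down as a small class in which each precondition becomes a check. This is a sketch under assumptions, not an implementation from this lesson: the class name QueueSketch and the fixed capacity passed to the constructor are hypothetical, and std::deque from the standard library is used as the storage so that only the interface is in focus.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical class: one member function per operation of the queue ADT.
class QueueSketch
{
public:
    // Create: the queue starts in the empty state.
    explicit QueueSketch(std::size_t capacity) : capacity_(capacity) {}

    // Empty and Full: answer yes or no, queue unchanged.
    bool empty() const { return items_.empty(); }
    bool full() const { return items_.size() == capacity_; }

    // Enterq: precondition "not full"; returns false if it does not hold.
    bool enterq(int item)
    {
        if (full())
            return false;
        items_.push_back(item);          // new element goes to the rear
        return true;
    }

    // Exitq: precondition "not empty"; returns false on underflow.
    bool exitq(int &item)
    {
        if (empty())
            return false;
        item = items_.front();           // oldest element leaves from the front
        items_.pop_front();
        return true;
    }

private:
    std::size_t capacity_;
    std::deque<int> items_;
};
```

Returning false instead of printing a message leaves it to the caller to decide what to do about overflow or underflow.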

2.8 QUEUE IMPLEMENTATIONS


2.8.1 Array Implementation of Queues
Representation of a queue as arrays is somewhat different than a stack. In addition to a one-dimensional array, we need 2 variables, front and rear. Front points to the first element of the queue and rear to the last element of the queue. Thus, front = rear when there are no elements in the queue. The initial condition is

front = rear = 0.

(The program given below marks the empty queue with front = rear = -1 instead.)

[Figure panels: Initial Queue; Queue after Inserting two Elements; Queue after Deleting an Element. Each panel marks the front and rear positions.]
Figure 2.16: Various Stages in Array Implementation of Queues
A full queue is indicated by rear being equal to the last storage location.
The following program shows insertion and deletion operations on a queue using an array. Before writing the actual code, let us try to write an algorithm for the program.
Algorithm
Step 1: Start.
Step 2: Declare the structure of the queue.
Step 3: Insert the elements into the queue from the rear and display the status of the queue after each insertion.
Step 4: Continue insertion until the queue is full.
Step 5: Initiate the pop operation by popping the elements from the front.
Step 6: Display the queue after each pop operation.
Step 7: End.
Program
To delete and insert from a queue.
// CREATION OF QUEUES USING ARRAYS
#include <iostream>
using std::cout;
using std::endl;

#define MAX 10

class queue
{
private:
    int arr[MAX];
    int front, rear;
public:
    queue()
    {
        front = -1;
        rear = -1;
    }
    void addq(int item)
    {
        if (rear == MAX - 1)
        {
            cout << endl << "Queue is Full";
            return;
        }
        rear++;
        arr[rear] = item;
        cout << endl << "item added " << arr[rear] << endl;
        if (front == -1)
            front = 0;
    }
    int delq()
    {
        int data;
        if (front == -1)
        {
            cout << endl << "Queue is empty";
            return -1;                   // sentinel for an empty queue
        }
        data = arr[front];
        if (front == rear)
            front = rear = -1;
        else
            front++;
        return data;
    }
};

int main()
{
    queue a;
    a.addq(11);
    a.addq(21);
    a.addq(31);
    a.addq(41);
    a.addq(51);
    int i = a.delq();
    cout << endl << "Item Deleted=" << i << endl;
    i = a.delq();
    cout << endl << "Item Deleted=" << i << endl;
    i = a.delq();
    cout << endl << "Item Deleted=" << i << endl;
    i = a.delq();
    cout << endl << "Item Deleted=" << i << endl;
    i = a.delq();
    cout << endl << "Item Deleted=" << i << endl;
    return 0;
}

The above design has a shortcoming. As we enter and delete elements from the queue, the front and rear locations also shift forward, i.e. as we delete the first item from the queue, the second location becomes the front. Therefore, we lose the first storage location for future storage. As we continue entering and deleting elements, the total storage space available goes on decreasing. Since we are using arrays as our basic data structure for queues, this can be a serious drawback.

One solution to the above problem can be to shift all remaining elements up every time we remove an element from the queue. But this increases the overheads. To understand this, take a look at Figure 2.17.

[Figure panels: Initial Queue; Queue after Inserting two Elements; Queue after Deleting an Element. Each panel marks the front and rear positions.]

Figure 2.17: Losing a Storage Location


As you can see above, after taking out the front element of the queue, our front location has incremented to the next element. Thus we have lost the first memory location. Hereafter, we only have 5 locations to store data. Gradually, this space will go on decreasing.

One way of rectifying this problem is to shift the rest of the elements of the queue one space up each time the front element is deleted, as said above. But if a queue is large, this will require a lot of effort.
The decision to use this design depends on the final use to which the queue will be put. If the number of elements to be stored in the queue is large, there will be a lot of processing required to move up all the elements. On the other hand, if the queue generally contains only a few elements, this movement may not be much of an overhead. Thus, the complete evaluation of the design depends on the intended use of the program. We will see other implementations to rectify this shortcoming as we proceed.
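One common remedy, offered here only as a sketch and not necessarily the one this text goes on to present, is to treat the array as circular: when an index passes the last slot it wraps around to slot 0, so locations freed at the front are used again without shifting anything. The class name CircularQueue and the constant MAX_SLOTS are hypothetical, and a count of stored elements is kept instead of a rear index to make the full and empty tests simple.

```cpp
const int MAX_SLOTS = 5;                 // hypothetical capacity

class CircularQueue
{
public:
    CircularQueue() : front(0), count(0) {}

    // Adds at the rear; returns false if all slots are in use.
    bool addq(int item)
    {
        if (count == MAX_SLOTS)
            return false;
        arr[(front + count) % MAX_SLOTS] = item;   // rear slot, wrapping around
        count++;
        return true;
    }

    // Removes from the front; returns false if the queue is empty.
    bool delq(int &item)
    {
        if (count == 0)
            return false;
        item = arr[front];
        front = (front + 1) % MAX_SLOTS;           // freed slot becomes reusable
        count--;
        return true;
    }

private:
    int arr[MAX_SLOTS];
    int front;                           // index of the oldest element
    int count;                           // number of elements stored
};
```

With this arrangement, filling the queue, deleting two elements and adding two more succeeds, whereas the earlier design would report the queue as full.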

2.8.2 Linked Implementation of Queues


Unlike the array implementation, this version has no program imposed maximum queue size. Nodes
are allocated dynamically using new operator, as and when required. If the call to new fails, then it
returns NULL and is assumed that memory is full. In case of successful allocation, the new structure is
logically added.
To remove a node from a linked list, the node being pointed by the front pointer is removed and its
next node becomes the front node. Take a look at the following figure.

[Figure panels: Existing Queue; Inserting another Node at the Rear; Deleting a Node from the Front.]

Figure 2.18: Linked Implementation of Queues

The following program shows how to use linked lists to implement queues. Before writing the actual
code, let us write an algorithm for the program.

Algorithm
Step 1: Start.
Step 2: Declare the structure of the linked queue.
Step 3: Insert the elements in the linked queue and increment the rear by 1 after each insertion.
Step 4: Insert the elements until the queue is full.
Step 5: Perform the pop operation by deleting the elements from the front of the linked queue.
Step 6: Display the status of the queue after each pop operation.
Step 7: End.
Program
To implement queues using linked lists.
/* This program uses linked lists to implement queues */
// IMPLEMENTATION OF LINKED QUEUES
#include <cstddef>
#include <iostream>
#include <new>
using std::cout;
using std::endl;

struct node
{
    int data;
    node *link;
};

class queue
{
private:
    node *front, *rear;
public:
    queue()
    {
        front = rear = NULL;
    }
    void addq(int item)
    {
        node *temp = new (std::nothrow) node;   // returns NULL on failure
        if (temp == NULL)
        {
            cout << endl << "Queue is Full";
            return;                      // do not touch a null pointer
        }
        temp->data = item;
        temp->link = NULL;
        if (front == NULL)
        {
            rear = front = temp;
            return;
        }
        rear->link = temp;
        rear = rear->link;
    }
    int delq()
    {
        if (front == NULL)
        {
            cout << endl << "Queue is Empty";
            return -1;                   // sentinel for an empty queue
        }
        node *temp;
        int item;
        item = front->data;
        temp = front;
        front = front->link;
        if (front == NULL)
            rear = NULL;                 // queue became empty
        delete temp;
        return item;
    }
    ~queue()                             // free every remaining node
    {
        node *temp;
        while (front != NULL)
        {
            temp = front;
            front = front->link;
            delete temp;
        }
    }
};

int main()
{
    queue a;
    a.addq(11);
    a.addq(21);
    a.addq(31);
    int i = a.delq();
    cout << endl << "The Item deleted=" << i;
    i = a.delq();
    cout << endl << "The Item deleted=" << i;
    i = a.delq();
    cout << endl << "The Item deleted=" << i;
    return 0;
}

2.9 ANALYSIS OF QUEUE IMPLEMENTATIONS


An array of MAX queue size will take the same amount of memory, no matter how many array slots are actually used. The linked implementation using dynamically allocated storage space only requires space for the number of elements actually in the queue at run time, plus space for the pointers. However, the node elements are larger, since we must store the link as well as the user data.
We can also compare the relative "efficiency" of the two implementations, in terms of Big Oh. Following are the Big Oh measurements of some of the common operations that are performed on queues.

1. Create: In both implementations, create operation has a measure of O(1). It takes a fixed amount of work in any case.
2. Empty and full: In both implementations, these two operations have measure O(1) only, because the algorithm has to do only one operation, that of checking whether the queue contains anything or not.
3. Enterq and exitq: These operations are also O(1) in both implementations. The number of elements in the queue does not affect the work done by these two operations. In both implementations, we can directly access the front and rear of the queue. Therefore, the amount of work done by these two operations is independent of queue size.
4. Deleting the entire queue: This operation differs from one implementation to other. The array implementation just sets the front and rear indexes, so it is an O(1) operation. The linked implementation has to access each node and free it explicitly. Therefore, this operation has the measure O(n), where n is the number of elements in the queue at that time.
As with the array and linked implementation of stacks, these two implementations of queues are more or less equivalent in terms of amount of work they do, possibly differing in one operation. The following table summarizes the Big Oh measures of various operations on the two implementations of queues.
Table 2.2: Big Oh Comparison of Queue Operations

Operations    Array Implementation    Linked Implementation
Create        O(1)                    O(1)
Empty         O(1)                    O(1)
Full          O(1)                    O(1)
Enterq        O(1)                    O(1)
Exitq         O(1)                    O(1)
Destroy       O(1)                    O(n)

The answer to the question, which implementation is better, depends on the requirements of your application.

Set ADT
A set is a collection of bindings. Each binding consists of a key and a value. A key uniquely identifies its binding; a value is data that is somehow pertinent to its key. Programming systems use sets often. For example, compilers and assemblers use sets to relate symbols to their meanings.
Set Interface

The Set interface should contain these function declarations:

Set_T Set_new(int iEstimatedLength,
    int (*pfCompare)(const void *pvKey1, const void *pvKey2));
void Set_free(Set_T oSet);
void Set_clear(Set_T oSet);
int Set_getLength(Set_T oSet);
int Set_put(Set_T oSet, const void *pvKey, void *pvValue);
int Set_remove(Set_T oSet, const void *pvKey);
const void *Set_getKey(Set_T oSet, const void *pvKey);
void *Set_getValue(Set_T oSet, const void *pvKey);
void Set_map(Set_T oSet,
    void (*pfApply)(const void *pvKey, void **ppvValue, void *pvExtra),
    void *pvExtra);
Complex Number ADT

typedef struct {
    float real;
    float imag;
} COMPLEX;

COMPLEX makecomplex (float, float);
COMPLEX addc (COMPLEX, COMPLEX);
COMPLEX subc (COMPLEX, COMPLEX);
COMPLEX multc (COMPLEX, COMPLEX);
COMPLEX divc (COMPLEX, COMPLEX);
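Only the declarations are given above. One possible set of definitions, written as a sketch using the usual rules of complex arithmetic, could look as follows. The parameter names are our own, and divc assumes that the divisor is not zero.

```cpp
typedef struct {
    float real;
    float imag;
} COMPLEX;

COMPLEX makecomplex(float real, float imag)
{
    COMPLEX c;
    c.real = real;
    c.imag = imag;
    return c;
}

// (a + bi) + (c + di) = (a + c) + (b + d)i
COMPLEX addc(COMPLEX x, COMPLEX y)
{
    return makecomplex(x.real + y.real, x.imag + y.imag);
}

// (a + bi) - (c + di) = (a - c) + (b - d)i
COMPLEX subc(COMPLEX x, COMPLEX y)
{
    return makecomplex(x.real - y.real, x.imag - y.imag);
}

// (a + bi)(c + di) = (ac - bd) + (ad + bc)i
COMPLEX multc(COMPLEX x, COMPLEX y)
{
    return makecomplex(x.real * y.real - x.imag * y.imag,
                       x.real * y.imag + x.imag * y.real);
}

// (a + bi)/(c + di) = ((ac + bd) + (bc - ad)i) / (c*c + d*d)
// Assumes y is not 0 + 0i; no check is made here.
COMPLEX divc(COMPLEX x, COMPLEX y)
{
    float d = y.real * y.real + y.imag * y.imag;
    return makecomplex((x.real * y.real + x.imag * y.imag) / d,
                       (x.imag * y.real - x.real * y.imag) / d);
}
```

A user of the ADT works only through these five functions and never needs to know that the type is stored as two floats.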
A String ADT
Most languages either have a built-in string datatype or a standard library, so it is rare to create one's own string ADT.
A string datatype should have operations to:
• Return the nth character in a string.
• Set the nth character in a string to c.
• Find the length of a string.
• Concatenate two strings. ("Alison" + "Cawsey" = "Alison Cawsey")
• Copy a string.
• Delete part of a string. ("Alison Cawsey" -> "Alison")
• Modify and compare strings in other ways.
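In C++ the standard std::string class already provides all of these operations. The thin wrappers below are only a sketch that maps each operation in the list to the library call behind it; the wrapper names are hypothetical and positions are counted from 0.

```cpp
#include <cstddef>
#include <string>

// Return the nth character (at() throws std::out_of_range if n is invalid).
char nthChar(const std::string &s, std::size_t n)
{
    return s.at(n);
}

// Set the nth character to c.
void setNthChar(std::string &s, std::size_t n, char c)
{
    s.at(n) = c;
}

// Find the length of a string.
std::size_t lengthOf(const std::string &s)
{
    return s.length();
}

// Concatenate two strings.
std::string concatenate(const std::string &a, const std::string &b)
{
    return a + b;
}

// Copy a string (std::string copies its characters on construction).
std::string copyOf(const std::string &s)
{
    return std::string(s);
}

// Delete 'len' characters starting at position 'pos'; the original is untouched.
std::string deletePart(const std::string &s, std::size_t pos, std::size_t len)
{
    std::string result = s;
    result.erase(pos, len);
    return result;
}
```

For example, deletePart("Alison Cawsey", 6, 7) removes the space and the surname and leaves "Alison".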
String Processing: Algorithms
String processing algorithms are algorithms for processing sequences of characters or symbols, e.g.,
• File compression - take a sequence, encode it as a shorter sequence.
• Cryptography - take a sequence, encode it so enemies can't read it!
• String search - search for occurrences of one sequence within another.
• Pattern matching - find out if a sequence matches some pattern.
• Parsing - work out the structure of a sequence, in terms of a grammar.
• Applications of more complex algorithms might include genome sequencing and analysis.
We can start by looking at the relevant datatypes or classes for strings and sequences.
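As a small illustration of the string search item, the sketch below uses the simplest possible method: try every starting position in the text and compare. It is not an efficient algorithm and is not taken from this lesson; the function name findOccurrences is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns every position (counted from 0) at which 'pattern' occurs in
// 'text'. Overlapping occurrences are reported. An empty pattern gives
// an empty result.
std::vector<std::size_t> findOccurrences(const std::string &text,
                                         const std::string &pattern)
{
    std::vector<std::size_t> positions;
    if (pattern.empty() || pattern.size() > text.size())
        return positions;
    for (std::size_t i = 0; i + pattern.size() <= text.size(); ++i)
    {
        // compare the pattern against the text starting at position i
        if (text.compare(i, pattern.size(), pattern) == 0)
            positions.push_back(i);
    }
    return positions;
}
```

In the worst case this does about (length of text) x (length of pattern) character comparisons, which is why more refined search algorithms exist.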
String Implementations

There are two main ways that strings may be implemented:

• As a fixed length array, where the first element denotes the length of the string, e.g., [6,a,l,i,s,o,n,.....]. This is used as the standard string type in Pascal.
• As an array, but with the end of the string indicated using a special 'null' character (denoted '\0'), e.g., [a,l,i,s,o,n,\0,.....]. Memory can be dynamically allocated for the string once we know its length.

The first implementation has the disadvantages of all fixed length array implementations. But some operations are efficient (e.g., finding length).

The second implementation has the advantages of dynamic allocation of space; modifying the string also may be more efficient, as we needn't recalculate size.
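The remark about finding the length can be made concrete. In the sketch below, which uses hypothetical function names, the length-prefixed form answers in one step, while the null-terminated form has to scan the characters until it meets the terminator.

```cpp
#include <cstddef>

// Length-prefixed form: the first element holds the length, so the
// answer is read directly (constant time).
std::size_t lengthPrefixed(const unsigned char *s)
{
    return s[0];
}

// Null-terminated form: count characters until '\0' is found
// (time proportional to the length of the string).
std::size_t lengthTerminated(const char *s)
{
    std::size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

Note that storing the length in a single one-byte element, as in this sketch, limits such a string to 255 characters.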
Check Your Progress
Define the following:
1. Stack

2. Linked Implementation

2.IO LET US SUM UP


An ADT is the logical picture of a data type plus the specifications of the operations required in creating and manipulating objects of this data type. Basic operations performed on any type of data structure are insert, modify, delete, search, sort etc. The operations performed on a linked list are create, insert, delete, modify etc. A single node of a linked list consists mainly of two fields, (1) data and (2) a pointer to the next node, as shown below.

The other types of linked lists are:

• Circular linked lists.
• Doubly or two way linked lists.

In the computing paradigm linked lists can be put to use for a variety of purposes. They can also be used to implement data structures like stacks and queues. An array is a list of elements in which each element is accessible via an index. There are various algorithms available to implement the same task. Hence it is necessary to analyze algorithms. Performance of a program depends on two factors - space complexity and time complexity. Space complexity of a program is the amount of memory required to execute the program successfully. In static allocation the memory required by a program is known at compile time, whereas in dynamic allocation it can increase during the execution. Time complexity of a program is the amount of time required to execute the program successfully. The order of magnitude of an algorithm is the sum of the frequencies of all of its statements. Asymptotic notations express the approximate time and space requirements of an algorithm. There are various types of asymptotic notations viz. Big Oh, Omega, Theta etc. The major goals involved in the study of data structures are:
• To identify and develop useful mathematical entities and operations.
• To determine what classes of problems can be solved using these entities and operations.
• To determine representations for abstract entities and implement them.

A stack is an ordered list in which all insertions and deletions are done at one end, called the top. A queue is an ordered list in which all insertions take place at one end, the rear, while all deletions take place at the other end, the front. Unlike an array, the definition of stacks and queues provides for the insertion and deletion of items. So, stacks and queues are dynamic, constantly changing objects. Queues provide FIFO storage as opposed to the LIFO storage provided by stacks. A stack is a dynamic structure i.e. it changes as elements are added to and removed from it. The operation that adds an element to the top of a stack is usually called PUSH and the operation that takes the top element off the stack is called POP. Even though the representation of the stack may be a random-access structure such as an array, the stack itself as a logical entity is not randomly accessed.

Stacks and queues can be implemented using arrays as well as using linked lists. As stated in earlier lessons, an array variable of MAX stack size will take the same amount of memory, no matter how many array slots are actually used. Therefore, we need to reserve space for the maximum possible. On the other hand, the linked implementation using dynamically allocated storage space only requires space for the number of elements actually present in the stack at run time. But the elements are larger since we must store the link (the next field) along with the user's data.

A stack is an appropriate data structure when information must be saved and then later retrieved in reverse order. Any situation requiring backtracking to some earlier position may be a good one to use a stack. There are two operations that can be applied to a queue. First, new elements are added to the rear of the queue. We will call this operation enterq. We can also take the elements off the front of the queue. We will call this operation exitq. The result of an illegal attempt to remove an element from an empty data structure is called underflow.
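The operations summarised above can be sketched with arrays in C. The sketch assumes a small fixed capacity MAX and reports overflow and underflow through a boolean return value; the type names and the circular-buffer layout of the queue are choices made for this illustration only.

```c
#include <stdbool.h>

#define MAX 8   /* assumed maximum size, reserved whether it is used or not */

typedef struct { int item[MAX]; int top; } stack;            /* top == -1 when empty */
typedef struct { int item[MAX]; int front, count; } queue;   /* circular buffer */

void stack_init(stack *s) { s->top = -1; }
void queue_init(queue *q) { q->front = 0; q->count = 0; }

/* PUSH: add an element at the top; false signals overflow. */
bool push(stack *s, int value)
{
    if (s->top == MAX - 1)
        return false;
    s->item[++s->top] = value;
    return true;
}

/* POP: take the top element off; false signals underflow. */
bool pop(stack *s, int *value)
{
    if (s->top < 0)
        return false;
    *value = s->item[s->top--];
    return true;
}

/* enterq: add an element at the rear; false signals overflow. */
bool enterq(queue *q, int value)
{
    if (q->count == MAX)
        return false;
    q->item[(q->front + q->count) % MAX] = value;
    q->count++;
    return true;
}

/* exitq: remove the element at the front; false signals underflow. */
bool exitq(queue *q, int *value)
{
    if (q->count == 0)
        return false;
    *value = q->item[q->front];
    q->front = (q->front + 1) % MAX;
    q->count--;
    return true;
}
```

Pushing 1, 2, 3 and then popping returns 3 first (LIFO), whereas entering 1, 2, 3 into the queue and then calling exitq returns 1 first (FIFO).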

2.11 KEYWORDS
Stack: A data structure in which data is stored and removed at one end, called the top.

Data Structure: It is a collection of data elements.

2.12 QUESTIONS FOR DISCUSSION

1. What do you understand by the term Abstract Data Type?
2. Differentiate between stacks and queues on the basis of the storage of data.
3. What is the basic difference between stacks and queues?
4. What is a LIFO list and a FIFO list?
5. What are the various functions that can be performed on stacks and queues?

Check Your Progress: Model Answers


1. A stack is a dynamic structure i.e. it changes as elements are added to and removed from it. The operation that adds an element to the top of a stack is usually called PUSH and the operation that takes the top element off the stack is called POP.
2. Linked implementation is more flexible and is preferable where the number of stack elements varies greatly. It wastes less space when the stack is small. When the stack size is unpredictable, linked implementation is preferable.



UNIT II

LESSON 3

TREES
CONTENTS
3.0 Aims and Objectives
3.1 Introduction
3.2 Trees
3.2.1 Degree of Node of a Tree
3.2.2 Degree of a Tree
3.2.3 Level of a Node
3.3 N-ary Tree
3.3.1 Binary Tree
3.3.2 Full and Complete Binary Tree
3.3.3 Representations in Contiguous Memory
3.4 Linked Tree Representation
3.5 Binary Tree Traversal
3.5.1 Order of Traversal of Binary Tree
3.5.2 Procedure for Inorder Traversal
3.5.3 Preorder Traversal
3.5.4 Postorder Traversal
3.6 Binary Search Tree
3.6.1 Creating a Binary Search Tree
3.6.2 Deletion of a Node from Binary Search Tree
3.6.3 Deletion of a Node with two Children
3.6.4 Deletion of a Node with one Child
3.6.5 Deletion of a Node with no Child
3.6.6 Searching for a Target Key in a Binary Search Tree
3.6.7 An Application of a Binary Search Tree
3.7 AVL Trees
3.8 Let us Sum up
3.9 Keywords
3.10 Questions for Discussion
3.11 Suggested Readings

3.0 AIMS AND OBJECTIVES


After studying this lesson, you should be able to:
• Define trees
• Identify various characteristics of trees
• Describe the representations in contiguous memory
• Explain the linked tree representation
• Traverse a binary tree
• Search a binary tree
• Insert into a binary search tree
• Understand and use AVL trees

3.1 INTRODUCTION
While dealing with many problems in computer science, engineering and many other disciplines, it is necessary to impose a hierarchical structure on a collection of data items. For example, we need to impose a hierarchical structure on a collection of data items while preparing organisational charts and genealogies, or to represent the syntactic structure of source programs in compilers. A tree is a data structure that is used to model such a hierarchical structure of data items, hence the study of the tree as one of the data structures is important. This module discusses the tree as a data structure.

3.2 TREES
A tree is a set of one or more nodes T such that

1. There is a specially designated node called the root, and
2. The remaining nodes are partitioned into $n \ge 0$ disjoint sets of nodes $T_1, T_2, \ldots, T_n$, each of which is a tree.

Shown below in Figure 3.1 is a structure which is a tree.

Figure 3.1: A Tree Structure



This is a tree because it is a set of nodes {A, B, C, D, E, F, G, H, I}, with node A as a root node, and the remaining nodes are partitioned into three disjoint sets: {B, G, H, I}, {C, E, F} and {D} respectively. Each of these sets is a tree individually because each of these sets satisfies the above properties. Shown below in Figure 3.2 is a structure which is not a tree:

Figure 3.2: A Non-tree Structure

This is not a tree because it is a set of nodes {A, B, C, D, E, F, G, H, I}, with node A as a root node,
but the remaining nodes cannot be partitioned into disjoint sets, because the node I is shared.
Given below are some of the important definitions, which are used in connection with trees.

3.2.1 Degree of Node of a Tree

The degree of a node of a tree is the number of sub-trees having this node as a root; in other words, it is the number of children of the node. If the degree is zero then the node is called a terminal node or leaf node of the tree.

3.2.2 Degree of a Tree


It is defined as the maximum of the degrees of the nodes of the tree, i.e. degree of tree = max (degree (node i)) for i = 1 to n.
3.2.3 Level of a Node
We define the level of a node by taking the level of the root node to be 1, and incrementing it by 1 as we move from the root towards the sub-trees, i.e. the level of all the children of the root node will be 2. The level of their children will be 3, and so on. We then define the depth of the tree to be the maximum value of the level over the nodes of the tree.
Consider the tree given below:

The degree of each node of the tree:

Node    Degree
A       3
B       3
C       2
D       0
E       0
F       0
G       0
H       0
I       0

The degree of the tree: Maximum (Degree of all the nodes) = 3

The level of nodes of the tree:

Node    Level
A       1
B       2
C       2
D       2
E       3
F       3
G       3
H       3
I       3
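The two tables can be reproduced with a short C sketch. It assumes a general tree node that stores its children in a small array; the type gnode, the bound MAX_CHILDREN and the function names are made up for this illustration.

```c
#include <stddef.h>

#define MAX_CHILDREN 4   /* assumed upper bound on children per node */

typedef struct gnode {
    char label;
    int nchildren;                        /* degree of this node */
    struct gnode *child[MAX_CHILDREN];
} gnode;

/* Degree of a tree: the maximum degree over all of its nodes. */
int tree_degree(const gnode *t)
{
    if (t == NULL)
        return 0;
    int best = t->nchildren;
    for (int i = 0; i < t->nchildren; i++) {
        int d = tree_degree(t->child[i]);
        if (d > best)
            best = d;
    }
    return best;
}

/* Depth of a tree: the largest level reached, where the root of t
   sits on the level passed in (1 for the root of the whole tree). */
int tree_depth(const gnode *t, int level)
{
    if (t == NULL)
        return 0;
    int deepest = level;
    for (int i = 0; i < t->nchildren; i++) {
        int d = tree_depth(t->child[i], level + 1);
        if (d > deepest)
            deepest = d;
    }
    return deepest;
}
```

For the tree used in the tables (A with children B, C, D; B with children G, H, I; C with children E, F), tree_degree gives 3 and tree_depth started at level 1 gives 3.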

3.3 N-ARY TREE

A tree none of whose nodes has more than N children is known as an N-ary tree. In other words, an N-ary tree is a tree whose degree is at most N. Note, however, that the degree of the nodes may be less than N. Thus, a tree with degree 2 is called a binary tree, a tree with degree 3 is called a ternary tree, and so on.

3.3.1 Binary Tree


A binary tree is a special type of tree having degree 2. In a binary tree no node can have a degree greater than 2. Therefore, a binary tree is a set of zero or more nodes T such that, when T is not empty,
(i) There is a specially designated node called the root of the tree, and
(ii) The remaining nodes are partitioned into two disjoint sets $T_1$ and $T_2$, each of which is a binary tree. $T_1$ is called the left sub-tree and $T_2$ is called the right sub-tree.

A binary tree is shown below (Figure 3.3).

Figure 3.3: Binary Tree Structure


For a binary tree note that,
(i) the maximum number of nodes at level $i$ is $2^{i-1}$, and
(ii) if $k$ is the depth of the tree then the maximum number of nodes the tree can have is

$2^{k-1} + 2^{k-2} + \cdots + 2^{1} + 2^{0} = 2^{k} - 1$


Also there are skewed binary trees like the ones shown below in Figure 3.4.

Left Skewed Right Skewed

Figure 3.4: Skewed Trees

3.3.2 Full and Complete Binary Tree


A binary tree of depth $k$ can have a maximum of $2^k - 1$ nodes. If a binary tree of depth $k$ has fewer than $2^k - 1$ nodes, it is not a full binary tree.
For example, for $k = 3$, the number of nodes $= 2^k - 1 = 2^3 - 1 = 8 - 1 = 7$.
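The count can be checked by adding up the per-level maxima. The following C sketch does this; the function names are assumptions made for illustration, and levels are numbered from 1 as in Section 3.2.3.

```c
/* Maximum number of nodes on a level (the root is level 1): 2^(level-1). */
unsigned long max_nodes_at_level(unsigned level)
{
    return 1UL << (level - 1);
}

/* Maximum number of nodes in a binary tree of the given depth,
   obtained by summing the maxima level by level. */
unsigned long max_nodes_in_tree(unsigned depth)
{
    unsigned long total = 0;
    for (unsigned level = 1; level <= depth; level++)
        total += max_nodes_at_level(level);
    return total;
}
```

For a depth of 3 the levels contribute 1 + 2 + 4 nodes, which agrees with the value 7 computed above.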

A full binary tree with depth k = 3 is shown below in Figure 3.5.

Figure 3.5: A Full Binary Tree


If a binary tree is full then we can number the nodes of the binary tree sequentially from 1 to $2^k - 1$, starting from the root node and at every level numbering the nodes from left to right.
A complete binary tree of depth k is a tree with n nodes in which these n nodes can be numbered sequentially from 1 to n, as if they were the first n nodes in a full binary tree of depth k.
A complete binary tree with depth k = 3 is shown below in Figure 3.6.

Figure 3.6: A Complete Binary Tree

3.3.3 Representations in Contiguous Memory


If a binary tree is a complete binary tree, then it can be represented using an array capable of holding n elements, where n is the number of nodes in the complete binary tree. If tree is an array of n elements, then we can store the data value of the i-th node of a complete binary tree with n nodes at index i in the array tree. That means we can map node i to the i-th index in the array, and the parent of node i will get mapped at index i/2 (using integer division), whereas the left child of node i gets mapped at index 2i and the right child gets mapped at index 2i + 1.
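The index arithmetic can be written down directly in C. The sketch below assumes the 1-based numbering used in the text, so slot 0 of the array is left unused; the function names are chosen for this illustration.

```c
#include <stddef.h>

/* Node i is stored at tree[i]; tree[0] is unused (1-based numbering). */
size_t parent_index(size_t i) { return i / 2; }     /* 0 means "no parent": i is the root */
size_t left_index(size_t i)   { return 2 * i; }
size_t right_index(size_t i)  { return 2 * i + 1; }

/* A child exists only if its index does not run past the n stored nodes. */
int has_left(size_t i, size_t n)  { return left_index(i) <= n; }
int has_right(size_t i, size_t n) { return right_index(i) <= n; }
```

With the five nodes A, B, C, D, E stored at indices 1 to 5, node 2 (B) has its children at indices 4 and 5 (D and E), and node 3 (C) has no children because index 6 exceeds n = 5.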

For example, a complete binary tree with depth k = 3, having the number of nodes n = 5, can be represented using an array of 5 elements as shown below in Figure 3.7.

A B C D E

Figure 3.7: An Array Representation of a Complete Binary Tree having 5 Nodes and Depth 3

Shown below in Figure 3.8 is another example of an array representation of a complete binary tree with depth k = 3, having the number of nodes n = 4.

A B C D

Figure 3.8: An Array Representation of a Complete Binary Tree having 4 Nodes and Depth 3

In general any binary tree can be represented using an array. We see that an array representation of a complete binary tree does not lead to the wastage of any storage. But if we want to represent a binary tree which is not a complete binary tree using an array representation, then it leads to the wastage of storage as shown in Figure 3.9.

Figure 3.9: An Array Representation of Binary Tree

3.4 LINKED TREE REPRESENTATION


An array representation of a binary tree is not suitable for frequent insertions and deletions. Even though no storage is wasted if the binary tree is a complete one when the array representation is used, it makes insertion and deletion in a tree costly. Therefore, instead of using an array representation, we can use a linked representation, in which every node is represented as a structure having 3 fields: one for holding data, one for linking it with the left sub-tree and one for linking it with the right sub-tree, as shown below:

Left Child | Data | Right Child

We can create such a structure by using the following C declaration:

struct tnode
{
    int data;
    struct tnode *lchild;
    struct tnode *rchild;
} *tptr;
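Nodes of this shape are normally allocated one at a time as the tree grows. A minimal sketch follows; make_node and free_tree are hypothetical helper names that do not appear in the text.

```c
#include <stdlib.h>

struct tnode {
    int data;
    struct tnode *lchild;
    struct tnode *rchild;
};

/* Allocate a node with no children; returns NULL if memory runs out.
   make_node is a hypothetical helper, not taken from the text. */
struct tnode *make_node(int value)
{
    struct tnode *n = malloc(sizeof *n);
    if (n != NULL) {
        n->data = value;
        n->lchild = NULL;
        n->rchild = NULL;
    }
    return n;
}

/* Release a whole tree, children first. */
void free_tree(struct tnode *t)
{
    if (t == NULL)
        return;
    free_tree(t->lchild);
    free_tree(t->rchild);
    free(t);
}
```

Only the nodes actually present occupy storage, which is the advantage of the linked representation over the array representation for trees that are not complete.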

A tree representation using the above node structure is shown below in Figure 3.10.

Figure 3.10: Linked Representation of a Binary Tree

3.5 BINARY TREE TRAVERSAL


This section discusses the different orders in which a binary tree can be traversed. The algorithms for some commonly used orders of traversal are also presented. It also discusses the issue of constructing a unique binary tree given the orders of traversal.

3.5.1 Order of Traversal of Binary Tree


The following are the possible orders in which a binary tree can be traversed:
1) LDR
2) LRD
3) DLR
4) RDL
5) RLD
6) DRL
Where,
• L stands for traversing the left sub-tree,
• R stands for traversing the right sub-tree, and
• D stands for processing the data of the node.

Therefore, the order LDR is the order of traversal in which we start with the root node, visit the left sub-tree first, then process the data of the root node, and then go on to visit the right sub-tree. Since the left as well as the right sub-trees are also binary trees, the same procedure is used recursively while visiting the left and right sub-trees.
The order LDR is called inorder, the order LRD is called postorder, and the order DLR is called preorder. The remaining three orders are not used. If the processing that we do with the data in the node of the tree during the traversal is simply printing the data value, then the output generated for the tree given below in Figure 3.11, using the inorder, preorder and postorder, is the one shown below in Figure 3.11 itself.

Inorder: DBHEIAFCG
Preorder: ABDEHICFG
Postorder: DHIEBFGCA

Figure 3.11: A Binary Tree along with its Inorder, Preorder and Postorder
If an expression is represented as a binary tree then the inorder traversal of the tree gives us an infix expression, whereas the postorder traversal gives us a postfix expression as shown below in Figure 3.12.

Inorder:a+bx-c+d+
Postorder: abcx-+de'f +

Figure 3.12: A Binary Tree of an Expression along with its Inorder and Postorder
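The relation between an expression tree and its infix and postfix forms can be tried out on a small example of our own. The sketch below uses the expression a + b * c, which is not the expression of Figure 3.12; the type enode and the function names are assumptions, and the caller is assumed to supply a buffer large enough for the result.

```c
#include <stddef.h>

struct enode {
    char symbol;                         /* an operand or an operator */
    struct enode *lchild, *rchild;
};

/* Inorder (LDR): append the infix form to out, return the new length. */
size_t infix(const struct enode *t, char *out, size_t len)
{
    if (t == NULL)
        return len;
    len = infix(t->lchild, out, len);
    out[len++] = t->symbol;
    len = infix(t->rchild, out, len);
    out[len] = '\0';
    return len;
}

/* Postorder (LRD): append the postfix form to out, return the new length. */
size_t postfix(const struct enode *t, char *out, size_t len)
{
    if (t == NULL)
        return len;
    len = postfix(t->lchild, out, len);
    len = postfix(t->rchild, out, len);
    out[len++] = t->symbol;
    out[len] = '\0';
    return len;
}
```

For a tree with '+' at the root, operand a as its left child and a '*' node over b and c as its right child, infix produces a+b*c and postfix produces abc*+. Note that a plain inorder traversal prints no parentheses, so it matches the intended infix expression only when operator precedence already agrees with the shape of the tree.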
Given an order of traversal of a tree it is possible to construct a tree. For example, consider the following order:
Inorder: DBEAC

We can construct the binary trees shown below in Figure 3.13 using this order of traversal:

Figure 3.13: Binary Trees Constructed using Given Inorder


Therefore, we conclude that given only one order of traversal of a tree it is possible to construct a number of binary trees; a unique binary tree cannot be constructed. For the construction of a unique binary tree we require two orders, of which one has to be inorder; the other can be preorder or postorder.
For example, consider the following orders:
Inorder: DBEAC
Postorder: DEBCA

We can construct a unique binary tree shown in Figure 3.14 using these orders of traversal.

Figure 3.14: A Unique Binary Tree Constructed using the Inorder and Postorder
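One way to carry out this construction is sketched below in C. It relies on the fact that the last entry of the postorder is the root, and assumes that all node labels are distinct and that the two sequences describe the same tree; build_in_post and preorder_to are names invented for this sketch.

```c
#include <stdlib.h>

struct tnode { char data; struct tnode *lchild, *rchild; };

/* Build the tree whose inorder is in[0..n-1] and postorder is post[0..n-1].
   Assumes distinct labels and consistent sequences. */
struct tnode *build_in_post(const char *in, const char *post, int n)
{
    if (n <= 0)
        return NULL;
    struct tnode *root = malloc(sizeof *root);
    if (root == NULL)
        return NULL;
    root->data = post[n - 1];                 /* last postorder entry is the root */
    int k = 0;
    while (k < n - 1 && in[k] != root->data)  /* locate the root in the inorder */
        k++;
    /* k nodes lie to the left of the root, n - k - 1 to its right */
    root->lchild = build_in_post(in, post, k);
    root->rchild = build_in_post(in + k + 1, post + k, n - k - 1);
    return root;
}

/* Write the preorder of t into out (used here only to inspect the result). */
int preorder_to(const struct tnode *t, char *out, int len)
{
    if (t == NULL)
        return len;
    out[len++] = t->data;
    len = preorder_to(t->lchild, out, len);
    len = preorder_to(t->rchild, out, len);
    out[len] = '\0';
    return len;
}
```

For the orders given above (inorder DBEAC, postorder DEBCA) the root is A, its left child is B with children D and E, and its right child is C, so the preorder of the constructed tree is ABDEC.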

3.5.2 Procedure for Inorder Traversal


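void inorder(struct tnode *p)
{
    if (p != NULL)
    {
        inorder(p->lchild);          /* L: traverse the left sub-tree */
        printf("%d ", p->data);      /* D: process the node */
        inorder(p->rchild);          /* R: traverse the right sub-tree */
    }
}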

A non-recursive/iterative procedure for traversing a binary tree in inorder is given below for the purpose of doing the analysis.
void inorder(struct tnode *p)
{
    struct tnode *stack[100];        /* assumes the tree is fewer than 100 levels deep */
    int top = 0;

    while (p != NULL || top > 0)
    {
        while (p != NULL)
        {
            /* push the node onto the stack and move to its left child */
            top = top + 1;
            stack[top] = p;
            p = p->lchild;
        }
        /* pop the most recently pushed node, process it, then go right */
        p = stack[top];
        top = top - 1;
        printf("%d ", p->data);
        p = p->rchild;
    }
}

Analysis
Consider the iterative version of the inorder traversal given above. If the binary tree to be traversed has n nodes, then the number of nil links is n + 1. Since every node is placed on the stack once, the statements stack[top] = p and p = stack[top] are executed n times. The test for a nil link will be done exactly n + 1 times. So every step will be executed no more than some small constant times n, hence the order of the algorithm is O(n). A similar analysis can be done to obtain the estimate of the computation time for preorder and postorder.

3.5.3 Preorder Traversal


void preorder(struct tnode *p)
{
    if (p != NULL)
    {
        printf("%d ", p->data);      /* D: process the node first */
        preorder(p->lchild);
        preorder(p->rchild);
    }
}

3.5.4 Postorder Traversal


void postorder(struct tnode *p)
{
    if (p != NULL)
    {
        postorder(p->lchild);
        postorder(p->rchild);
        printf("%d ", p->data);      /* D: process the node last */
    }
}
Consider the following example.
Given the preorder and inorder traversals of a binary tree, draw the tree and write down its postorder traversal.

Inorder: Z, A, Q, P, Y, X, C, B
Preorder: Q, A, Z, Y, P, C, X, B

To obtain the binary tree, take the first node in the preorder; it is the root node. We then search for this node in the inorder traversal: all the nodes to the left of this node in the inorder traversal will be part of the left sub-tree, and all the nodes to the right of this node in the inorder traversal will be part of the right sub-tree. We then consider the next node in the preorder: if it is part of the left sub-tree, then we make it the left child of the root; otherwise, if it is part of the right sub-tree, we make it the right child of the root. This procedure is repeated recursively to get the tree shown below in Figure 3.15:

Figure 3.15: A Unique Binary Tree Constructed Using the Inorder and Preorder
The postorder for this tree is:
Z, A, P, X, B, C, Y, Q
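The procedure just described can be sketched in C as follows. The sketch assumes single-character labels that are all distinct; build_pre_in and postorder_to are names chosen for this illustration.

```c
#include <stdlib.h>

struct tnode { char data; struct tnode *lchild, *rchild; };

/* Build the tree whose preorder is pre[0..n-1] and inorder is in[0..n-1].
   Assumes distinct labels and consistent sequences. */
struct tnode *build_pre_in(const char *pre, const char *in, int n)
{
    if (n <= 0)
        return NULL;
    struct tnode *root = malloc(sizeof *root);
    if (root == NULL)
        return NULL;
    root->data = pre[0];                    /* first preorder entry is the root */
    int k = 0;
    while (k < n - 1 && in[k] != pre[0])    /* locate the root in the inorder */
        k++;
    /* the k nodes left of the root form the left sub-tree, the rest the right */
    root->lchild = build_pre_in(pre + 1, in, k);
    root->rchild = build_pre_in(pre + 1 + k, in + k + 1, n - k - 1);
    return root;
}

/* Write the postorder of t into out, return the new length. */
int postorder_to(const struct tnode *t, char *out, int len)
{
    if (t == NULL)
        return len;
    len = postorder_to(t->lchild, out, len);
    len = postorder_to(t->rchild, out, len);
    out[len++] = t->data;
    out[len] = '\0';
    return len;
}
```

Called with the preorder QAZYPCXB and the inorder ZAQPYXCB of the example, the constructed tree has the postorder ZAPXBCYQ, in agreement with the answer given above.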
The following function counts the number of leaf nodes in a binary tree.
int count(struct tnode *p)
{
    if (p == NULL)
        return 0;                    /* an empty tree has no leaves */
    else if ((p->lchild == NULL) && (p->rchild == NULL))
        return 1;                    /* a node without children is a leaf */
    else
        return count(p->lchild) + count(p->rchild);
}

The following procedure swaps the left and the right child of every node of a given binary tree.
void swaptree(struct tnode *p)
{
    struct tnode *temp;
    if (p != NULL)
    {
        swaptree(p->lchild);
        swaptree(p->rchild);
        /* exchange the two sub-trees of this node */
        temp = p->lchild;
        p->lchild = p->rchild;
        p->rchild = temp;
    }
}
The following function checks whether two binary trees are equal or not.
int equal(struct tnode *p1, struct tnode *p2)
{
    if ((p1 == NULL) && (p2 == NULL))
        return 1;                    /* two empty trees are equal */
    if ((p1 == NULL) || (p2 == NULL))
        return 0;                    /* only one of the trees is empty */
    if (p1->data != p2->data)
        return 0;                    /* the nodes hold different data */
    /* both sub-trees must be equal as well */
    return equal(p1->lchild, p2->lchild) && equal(p1->rchild, p2->rchild);
}

The following function creates an exact copy of a given binary tree.

struct tnode *copytree(struct tnode *p)
{
    struct tnode *q;
    if (p == NULL)
        return NULL;
    q = malloc(sizeof(struct tnode));    /* requires <stdlib.h> */
    if (q == NULL)
        return NULL;                     /* out of memory */
    q->data = p->data;
    q->lchild = copytree(p->lchild);
    q->rchild = copytree(p->rchild);
    return q;
}

3.6 BINARY SEARCH TREE


A binary search tree is a binary tree which may be empty; if it is not empty, every node contains an identifier and
1. the identifier of any node in the left sub-tree is less than the identifier of the root,
2. the identifier of any node in the right sub-tree is greater than the identifier of the root, and
3. the left sub-tree as well as the right sub-tree are both binary search trees.

The tree shown below in Figure 3.16 is a binary search tree:

Figure 3.16: A Binary Search Tree

A binary search tree is basically a binary tree, and therefore it can be traversed in inorder, preorder, and postorder. If we traverse a binary search tree in inorder and print the identifiers contained in the nodes of the tree, we get a sorted list of identifiers in ascending order.

A binary search tree is an important search structure. For example, consider the problem of searching a list. If a list is ordered then searching becomes faster if we use a contiguous list and binary search. But if we need to make changes in the list, like inserting new entries and deleting old entries, then it is much slower to use a contiguous list because insertion and deletion in a contiguous list requires moving many of the entries every time. So we may think of using a linked list because it permits insertions and deletions to be carried out by adjusting only a few pointers, but in a linked list there is no way to move through the list other than one node at a time, hence permitting only sequential access. Binary search trees provide an excellent solution to this problem. By making the entries of an ordered list into the nodes of a binary search tree, we find that we can search for a key in $O(\log n)$ steps, provided the tree stays reasonably balanced.

3.6.1 Creating a Binary Search Tree

We assume that every node of a binary search tree is capable of holding an integer data item and the links which can be made to point to the roots of the left and the right sub-trees respectively. Therefore, the structure of the node can be defined using the following declaration:
struct tnode
{
    int data;
    struct tnode *lchild;
    struct tnode *rchild;
};

To create a binary search tree we use a procedure named insert which creates a new node with the data value supplied as a parameter to it, and inserts it into an already existing tree whose root pointer is also passed as a parameter. The procedure accomplishes this by checking whether the tree whose root pointer is passed as a parameter is empty. If it is empty then the newly created node is inserted as the root node. If it is not empty then it copies the root pointer into a variable temp1. It then stores the value of temp1 in another variable temp2 and compares the data value of the node pointed to by temp1 with the data value supplied as a parameter. If the data value supplied as a parameter is smaller than the data value of the node pointed to by temp1, it copies the left link of the node pointed to by temp1 into temp1 (goes to the left); otherwise it copies the right link of the node pointed to by temp1 into temp1 (goes to the right). It repeats this process till temp1 becomes nil. When temp1 becomes nil, the new node is inserted as a left child of the node pointed to by temp2 if the data value of the node pointed to by temp2 is greater than the data value supplied as a parameter. Otherwise the new node is inserted as a right child of the node pointed to by temp2. Therefore the insert procedure is
/* In C the root pointer is passed by value, so the (possibly new)
   root is returned to the caller: root = insert(root, val); */
struct tnode *insert(struct tnode *p, int val)
{
    struct tnode *temp1, *temp2, *node;

    /* create the new node; requires <stdlib.h> */
    node = malloc(sizeof(struct tnode));
    if (node == NULL)
        return p;                    /* out of memory: tree left unchanged */
    node->data = val;
    node->lchild = NULL;
    node->rchild = NULL;

    if (p == NULL)
        return node;                 /* the new node becomes the root */

    temp1 = p;
    while (temp1 != NULL)
    {
        temp2 = temp1;               /* temp2 trails one step behind temp1 */
        if (temp1->data > val)
            temp1 = temp1->lchild;
        else
            temp1 = temp1->rchild;
    }
    if (temp2->data > val)
        temp2->lchild = node;
    else
        temp2->rchild = node;
    return p;
}

3.6.2 Deletion of a Node from Binary Search Tree


To delete a node from a binary search tree the method to be used depends on whether a node to be
deleted has one child, two children, or has no child.

3.6.3 Deletion of a Node with two Children


Consider the binary search tree shown below in Figure 3.17:

Figure 3.17: A Binary Tree before Deletion of a Node Pointed to by x

If we want to delete a node pointed to by x, then we can do it as follows:
Let y be a pointer to the node which is the parent of the node pointed to by x (in this example x is the left child of y). We store the pointer to the left child of the node pointed to by x in a temporary pointer temp. We then make the left child of the node pointed to by y to be the left child of the node pointed to by x. We then traverse the tree with the root as the node pointed to by temp to get its rightmost node, and make the right child of this rightmost node to be the right child of the node pointed to by x, as shown below in Figure 3.18:

Figure 3.18: A Binary Tree after Deletion of a Node Pointed to by x


temp = x->lchild;
y->lchild = x->lchild;
while (temp->rchild != NULL)
    temp = temp->rchild;
temp->rchild = x->rchild;
x->lchild = NULL;
x->rchild = NULL;
free(x);    /* release the deleted node */
Another method is to store the pointer to the right child of the node pointed to by x in a temporary pointer temp. We then make the left child of the node pointed to by y to be the right child of the node pointed to by x. We then traverse the tree with the root as the node pointed to by temp to get its leftmost node, and make the left child of this leftmost node to be the left child of the node pointed to by x, as shown in Figure 3.19:

Figure 3.19: A Binary Tree after Deletion of a Node Pointed to by x



temp = x->rchild;
y->lchild = x->rchild;
while (temp->lchild != NULL)
    temp = temp->lchild;
temp->lchild = x->lchild;
x->lchild = NULL;
x->rchild = NULL;
free(x);    /* release the deleted node */

3.6.4 Deletion of a Node with one Child


Consider the binary search tree shown below in Figure 3.20.

Figure 3.20: A Binary Tree before Deletion of a Node Pointed to by x


If we want to delete a node pointed to by x, then we can do it as follows:
Let y be a pointer to the node which is the parent of the node pointed to by x. Make the left child of the node pointed to by y to be the right child of the node pointed to by x, and dispose of the node pointed to by x as shown below in Figure 3.21:

Figure 3.21: A Binary Tree after Deletion of a Node Pointed to by x


y->lchild = x->rchild;
x->rchild = NULL;
free(x);    /* release the deleted node */

3.6.5 Deletion of a Node with no Child


Consider the binary search tree shown below in Figure 3.22:

Figure 3.22: A Binary Tree before Deletion of a Node Pointed to by x

Let the left child of the node pointed to by y be nil, and dispose of the node pointed to by x as shown below in Figure 3.23.

Figure 3.23: A Binary Tree after Deletion of a Node Pointed to by x
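The three cases can be combined into one routine. The sketch below is one possible arrangement, not the text's own procedure: it locates the node by key, uses a pointer to the link that leads to the node so that x may be a left child, a right child or the root, and follows the first method of Section 3.6.3 for a node with two children. The helper functions bst_insert and to_array exist only to make the sketch self-contained.

```c
#include <stdlib.h>

struct tnode { int data; struct tnode *lchild, *rchild; };

/* Remove the node holding val (if any) and return the new root. */
struct tnode *delete_node(struct tnode *root, int val)
{
    struct tnode **link = &root;          /* the pointer that leads to x */
    while (*link != NULL && (*link)->data != val)
        link = (val < (*link)->data) ? &(*link)->lchild : &(*link)->rchild;

    struct tnode *x = *link;
    if (x == NULL)
        return root;                      /* val is not in the tree */

    if (x->lchild == NULL) {
        *link = x->rchild;                /* no child, or only a right child */
    } else {
        /* hang x's right sub-tree below the rightmost node of its left sub-tree */
        struct tnode *temp = x->lchild;
        while (temp->rchild != NULL)
            temp = temp->rchild;
        temp->rchild = x->rchild;
        *link = x->lchild;
    }
    free(x);
    return root;
}

/* Helper for the sketch: recursive insertion that returns the root. */
struct tnode *bst_insert(struct tnode *t, int val)
{
    if (t == NULL) {
        t = malloc(sizeof *t);
        if (t != NULL) { t->data = val; t->lchild = t->rchild = NULL; }
        return t;
    }
    if (val < t->data) t->lchild = bst_insert(t->lchild, val);
    else               t->rchild = bst_insert(t->rchild, val);
    return t;
}

/* Helper for the sketch: store the inorder sequence in out, return its length. */
int to_array(const struct tnode *t, int *out, int n)
{
    if (t == NULL) return n;
    n = to_array(t->lchild, out, n);
    out[n++] = t->data;
    return to_array(t->rchild, out, n);
}
```

This method keeps the search tree ordering intact, but it can make the tree considerably deeper, because a whole sub-tree is re-attached at the bottom of another one.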

3.6.6 Searching for a Target Key in a Binary Search Tree
int search(struct tnode *p, int val)
{
    struct tnode *temp = p;
    int ans = 0;                     /* becomes 1 once val is found */
    while ((temp != NULL) && (!ans))
    {
        if (temp->data == val)
            ans = 1;
        else if (temp->data > val)
            temp = temp->lchild;     /* val can only be in the left sub-tree */
        else
            temp = temp->rchild;     /* val can only be in the right sub-tree */
    }
    return ans;
}

3.6.7 An Application of a Binary Search Tree


One of the applications of a binary search tree is the implementation of a dynamic dictionary. A dictionary is an ordered list which is required to be searched frequently, and is also required to be updated (insertions and deletions) frequently. Hence it can be very well implemented using a binary search tree, by making the entries of the dictionary into the nodes of a binary search tree. A more efficient implementation of a dynamic dictionary involves considering a key to be a sequence of characters, and instead of searching by comparison of entire keys, we use these characters to determine a multi-way branch at each step. This will allow us to make a 26-way branching according to the first letter, followed by another branch according to the second letter and so on.
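The multi-way branching idea can be sketched as follows. A structure of this kind is often called a trie, a term the text itself does not use. The sketch assumes that keys consist only of the lowercase letters a to z; all names in it are invented for this illustration.

```c
#include <stdlib.h>

#define LETTERS 26   /* assumes keys use only the lowercase letters a-z */

struct trie {
    int is_key;                        /* 1 if a key ends at this node */
    struct trie *next[LETTERS];        /* one branch per letter */
};

/* Create an empty node; returns NULL if memory runs out. */
struct trie *trie_new(void)
{
    struct trie *t = malloc(sizeof *t);
    if (t != NULL) {
        t->is_key = 0;
        for (int i = 0; i < LETTERS; i++)
            t->next[i] = NULL;
    }
    return t;
}

/* Follow one branch per letter of key, creating branches where missing.
   Returns 1 on success and 0 if memory runs out. */
int trie_insert(struct trie *root, const char *key)
{
    for (; *key != '\0'; key++) {
        int b = *key - 'a';
        if (root->next[b] == NULL) {
            root->next[b] = trie_new();
            if (root->next[b] == NULL)
                return 0;
        }
        root = root->next[b];
    }
    root->is_key = 1;
    return 1;
}

/* Returns 1 if key was inserted earlier, otherwise 0. */
int trie_search(const struct trie *root, const char *key)
{
    for (; root != NULL && *key != '\0'; key++)
        root = root->next[*key - 'a'];
    return root != NULL && root->is_key;
}
```

A search inspects one letter per step, so its cost depends on the length of the key rather than on the number of entries in the dictionary; the price is the storage for 26 links in every node.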
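A program to create a binary search tree, given a list of identifiers, is given below:
#define MAXLEN 32                      /* assumed maximum key length; the value is not given in the text */
typedef char key[MAXLEN];
struct tnode
{
    key name;
    struct tnode *lchild;
    struct tnode *rchild;
};
void btree()
{
    struct tnode *root;
    key item;
    int n;
    root = NULL;
    printf("Number of data values: ");
    scanf("%d", &n);
    while (n > 0)
    {
        printf("Enter the data value: ");
        scanf("%31s", item);           /* reads at most MAXLEN - 1 characters */
        /* insert is assumed here to be a variant for string keys that returns the root */
        root = insert(root, item);
        n = n - 1;
    }
    printtree(root);
}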

3.7 AVL TREES

An AVL tree is a balanced binary search tree. It takes its name from the initials of its inventors - Adelson-Velskii and Landis. An AVL tree has the following properties:
1. The sub-trees of every node differ in height by at most one level.
2. Every sub-tree is an AVL tree.

Here, the height of the tree is h. The height of one sub-tree is h-1 while that of another sub-tree of the same node is h-2, differing from each other by just 1. Therefore, it is an AVL tree.
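The height condition can be tested directly from the definition. The following C sketch is a simple, unoptimised check: it recomputes heights at every node and examines only the heights, not the ordering of the keys. The function names are assumptions made for this illustration.

```c
#include <stddef.h>

struct tnode { int data; struct tnode *lchild, *rchild; };

/* Height counted in levels: an empty tree has height 0, a single node 1. */
int height(const struct tnode *t)
{
    if (t == NULL)
        return 0;
    int hl = height(t->lchild);
    int hr = height(t->rchild);
    return 1 + (hl > hr ? hl : hr);
}

/* Returns 1 if the sub-trees of every node differ in height by at most 1. */
int is_height_balanced(const struct tnode *t)
{
    if (t == NULL)
        return 1;
    int diff = height(t->lchild) - height(t->rchild);
    return diff >= -1 && diff <= 1
        && is_height_balanced(t->lchild)
        && is_height_balanced(t->rchild);
}
```

A chain of three nodes, each being the left child of the previous one, fails the test (the root's sub-trees have heights 2 and 0), whereas a root with one leaf on each side passes.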

Inserting a Node into an AVL Tree

Inserting a node is somewhat complex and involves a number of cases. Implementations of AVL tree insertion rely on adding an extra attribute - the balance factor - to each node. This factor indicates whether the tree is left-heavy (the height of the left sub-tree is 1 greater than the right sub-tree), balanced (both sub-trees are the same height) or right-heavy (the height of the right sub-tree is 1 greater than the left sub-tree). If the balance would be destroyed by an insertion, a rotation is performed to correct the balance.
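A right rotation, which repairs a sub-tree that has become too heavy on the left, can be sketched in C as below. For simplicity the balance factor is computed from the heights on demand instead of being stored in each node, which differs from the stored attribute described above; the function names are assumptions made for this illustration.

```c
#include <stddef.h>

struct tnode { int data; struct tnode *lchild, *rchild; };

/* Height counted in levels: an empty tree has height 0. */
int height(const struct tnode *t)
{
    if (t == NULL)
        return 0;
    int hl = height(t->lchild);
    int hr = height(t->rchild);
    return 1 + (hl > hr ? hl : hr);
}

/* Positive: left-heavy; zero: balanced; negative: right-heavy. */
int balance_factor(const struct tnode *t)
{
    return height(t->lchild) - height(t->rchild);
}

/* Right rotation about y: the left child x of y becomes the root of
   this sub-tree and y becomes the right child of x.
   The caller must ensure that y has a left child. */
struct tnode *rotate_right(struct tnode *y)
{
    struct tnode *x = y->lchild;
    y->lchild = x->rchild;      /* x's right sub-tree moves under y */
    x->rchild = y;
    return x;                   /* new root of the rotated sub-tree */
}
```

Applied to a chain 3 - 2 - 1 of left children, whose root has balance factor 2, the rotation makes 2 the new root with 1 and 3 as its children, and the balance factor becomes 0. The ordering of the keys is preserved by the rotation.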
Let us consider the following AVL tree in which a node has been inserted in the left sub-tree of node 1.

This insertion causes its height to become 2 greater than node 2's right sub-tree. A right-rotation is performed to correct the imbalance, as shown below:

Check Your Progress
1. What are the characteristic properties of an AVL tree?
2. Define level of a node.

3.8 LET US SUM UP


• A tree is a set of one or more nodes T such that there is a specially designated node called the root, and the remaining nodes are partitioned into disjoint sets of nodes.
• The degree of a node of a tree is the number of sub-trees having this node as a root, i.e. the number of children of the node. If the degree is zero then the node is called a terminal node or leaf node of the tree.
• The degree of a tree is defined as the maximum of the degrees of the nodes of the tree.
• A tree none of whose nodes has more than N children is known as an N-ary tree. In other words, an N-ary tree is a tree whose degree is at most N.
• A binary tree is a special type of tree having degree 2.
• A binary tree of depth $k$ can have a maximum of $2^k - 1$ nodes. If a binary tree of depth $k$ has fewer than $2^k - 1$ nodes, it is not a full binary tree.
• An AVL tree is a balanced binary search tree. It takes its name from the initials of its inventors - Adelson-Velskii and Landis.
• In an AVL tree the sub-trees of every node differ in height by at most one level and every sub-tree is an AVL tree.

3.9 KEYWORDS
Tree: A two-dimensional data structure comprising nodes, where one node is the root and the rest of the
nodes form disjoint sets, each of which is a tree.

Node: A data structure that holds information and links to other nodes.
Root node: The node in a tree which does not have a parent node.
Degree of a tree: The highest degree of a node appearing in the tree.

Level of a node: The number of nodes that must be traversed to reach the node from the root.
N-ary tree: A tree whose degree is at most N.
Binary tree: A tree of degree 2.
Inorder: A tree traversing method in which the tree is traversed in the order of left-tree, node and then
right-tree.
Postorder: A tree traversing method in which the tree is traversed in the order of left-tree, right-tree
and then node.

Preorder: A tree traversing method in which the tree is traversed in the order of node, left-tree and
then right-tree.
Search tree: A tree constructed and used in searching algorithms.
AVL tree (Adelson, Velskii and Landis Tree): A balanced binary search tree in which the sub-trees of
every node differ in height by at most one level and every sub-tree is an AVL tree.

3.10 QUESTIONS FOR DISCUSSION


1. Consider the tree given below:
(i) Find the degree of each node of the tree.
(ii) The degree of the tree.
(iii) The level of each node of the tree.

2. Give the array representation of a complete binary tree with depth k = 3, having the number of
nodes n = 7.
3. How many binary trees are possible with three nodes?
4. Construct a binary tree whose in-order and pre-order traversals are given below:
In-order: 5, 1, 3, 11, 6, 8, 4, 2, 7

Pre-order: 6, 1, 5, 11, 3, 4, 8, 7, 2


5. Perform inorder, preorder and postorder traversal in the following tree.

6. If the preorder traversal of a tree gives the following sequence of nodes, draw the tree. Also
traverse it in inorder and postorder.

ABCDEFGH
7. Show the result of deleting node (60) from the following binary search tree.

8. Show the result of inserting node (45) into the above binary search tree.
9. Convert the following graph into a binary tree by removing necessary edges.

12. Where are AVL trees used?

Check Your Progress: Model Answers


1. An AVL tree is another balanced binary search tree. It takes its name from the initials of
its inventors - Adelson, Velskii and Landis. An AVL tree has the following properties:
i. The sub-trees of every node differ in height by at most one level.
ii. Every sub-tree is an AVL tree.
2. We define the level of a node by taking the level of the root node to be 1, and
incrementing it by 1 as we move from the root towards the sub-trees, i.e. the level of all the
children of the root node will be 2. The level of their children will be 3 and so on.

3.11 SUGGESTED READINGS


Data Structures and Efficient Algorithms; Burkhard Monien, Thomas Ottmann; Springer

Data Structures and Algorithms; Shi-Kuo Chang; World Scientific

How to Solve it by Computer; RG Dromey; Cambridge University Press

Classic Data Structures in C++; Timothy A. Budd; Addison Wesley


LESSON

HASHING AND PRIORITY QUEUES


CONTENTS
4.0 Aims and Objectives
4.1 Introduction
4.2 Hashing
4.2.1 Hashing Functions
4.2.2 Hash Collision
4.3 Priority Queues
4.4 Let us Sum up
4.5 Keywords
4.6 Questions for Discussion
4.7 Suggested Readings

4.0 AIMS AND OBJECTIVES


After studying this lesson, you will be able to:
• Define hashing and hashing functions
• Describe the collision handling techniques

4.1 INTRODUCTION
Hashing is a method of directly computing the index of the table by using some suitable mathematical
function called a hash function. The hash function operates on the name to be stored in the symbol
table, or whose attributes are to be retrieved from the symbol table. This concept is discussed in
this lesson in detail.
A priority queue is a collection of elements such that each element has been assigned a priority.
Priority queues and their implementation are also discussed in this lesson.

4.2 HASHING
In many applications we require a data object called a symbol table. A symbol table is nothing but
a set of pairs (name, value), where value represents the collection of attributes associated with the name,
and this collection of attributes depends upon the program element identified by the name. For
example, if a name x is used to identify an array in a program, then the attributes associated with x are
the number of dimensions, the lower bound and upper bound of each dimension, and the element type.
Therefore a symbol table can be thought of as a linear list of pairs (name, value), and hence we can use
a list data object for realizing a symbol table. A symbol table is referred to or accessed frequently,
either for adding a name, or for storing the attributes of a name, or for retrieving the attributes of
a name. Therefore accessing efficiency is a prime concern while designing a symbol table. Hence, the
most common way of implementing a symbol table is to use a hash table. Hashing is a method
of directly computing the index of the table by using some suitable mathematical function called a hash
function. The hash function operates on the name to be stored in the symbol table, or whose
attributes are to be retrieved from the symbol table. If h is a hash function and x is a name, then h(x)
gives the index of the table where x along with its attributes can be stored. If x is already stored in the
table, then h(x) gives the index of the table where it is stored, so that the attributes of x can be retrieved from the
table.

There are various methods of defining a hash function. One is the division method. In this method we
take the sum of the values of the characters, divide it by the size of the table, and take the remainder.
This gives us an integer value lying in the range of 0 to (n-1) if the size of the table is n. Another
method is the mid-square method. In this method, the identifier is first squared and then the appropriate
number of bits from the middle of the square is used as the hash value. Since the middle bits of the square
usually depend on all the characters in the identifier, it is expected that different identifiers will result
in different values. The number of middle bits that we select depends on the table size. Therefore if r
is the number of middle bits that we use to form the hash value, then the table size will be 2^r. Hence when
we use this method the table size is required to be a power of 2. Another method is folding, in which the
identifier is partitioned into several parts, all but the last part being of the same length. These parts are
then added together to obtain the hash value.
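The division method described above can be sketched in C as follows. The function name `hash_division` and its parameters are assumptions made for this illustration; the text itself does not give code for it.

```c
/* Division-method hash: sum the character codes of the name and take the
   remainder on division by the table size, giving an index in
   0..table_size-1. Assumes table_size > 0. */
unsigned int hash_division(const char *name, unsigned int table_size)
{
    unsigned int sum = 0;
    while (*name != '\0') {
        sum += (unsigned char) *name;   /* value of the current character */
        name++;
    }
    return sum % table_size;
}
```

Because only the sum of the character values is used, names made of the same letters in a different order (for example "ab" and "ba") always collide under this function.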

4.2.1 Hashing Functions


Some of the methods of defining a hash function are discussed below:

1. Modular arithmetic: In this method, first the key is converted to an integer, then it is divided by the
size of the index range, and the remainder is taken to be the hash value. The spread achieved depends
very much on the modulus. If the modulus is a power of a small integer like 2 or 10, then many keys
tend to map into the same index, while other indices remain unused. The best choice for the modulus
is often, but not always, a prime number, which usually has the effect of spreading the keys quite
uniformly.
2. Truncation: This method ignores part of the key, and uses the remaining part directly as the hash value
(considering non-numeric fields as their numerical code). If the keys, for example, are eight-digit
numbers and the hash table has 1000 entries, then the first, second, and fifth digits from the right
might make up the hash value. So 62538194 maps to 394. It is a fast method, but often fails to distribute
keys evenly.

3. Folding: In this method, the identifier is partitioned into several parts, all but the last part being of
the same length. These parts are then added together to obtain the hash value. For example, an
eight-digit integer can be divided into groups of three, three, and two digits. The groups are then
added together, and truncated if necessary to be in the proper range of indices. Hence 62538194
maps to 625 + 381 + 94 = 1100, truncated to 100. Since all information in the key can affect the
value of the function, folding often achieves a better spread of indices than truncation.
4. Mid-square method: In this method, the identifier is squared (considering non-numeric fields as
their numerical code), and then the appropriate number of bits from the middle of the square are
used to get the hash value. Since the middle bits of the square usually depend on all the characters
in the identifier, it is expected that different identifiers will result in different values. The number
of middle bits that we select depends on the table size. Therefore, if r is the number of middle bits
used to form the hash value, then the table size will be 2^r; hence when we use the mid-square method the
table size should be a power of 2.
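The truncation, folding and mid-square methods can be sketched in C for numeric keys as shown below. The function names are assumptions made for this illustration. The first two functions are tied to the eight-digit example used above, and the mid-square version assumes keys that fit in 32 bits, so that the square fits in 64 bits.

```c
/* Truncation: form the hash value from the fifth, second and first digits
   (counted from the right) of an eight-digit key. */
unsigned int hash_truncate(unsigned long key)
{
    unsigned int d1 = key % 10;              /* first digit from the right  */
    unsigned int d2 = (key / 10) % 10;       /* second digit from the right */
    unsigned int d5 = (key / 10000) % 10;    /* fifth digit from the right  */
    return d5 * 100 + d2 * 10 + d1;
}

/* Folding: split an eight-digit key into groups of three, three and two
   digits, add the groups, and truncate the sum to the range 0..999. */
unsigned int hash_fold(unsigned long key)
{
    unsigned int g1 = key / 100000;          /* leading three digits */
    unsigned int g2 = (key / 100) % 1000;    /* middle three digits  */
    unsigned int g3 = key % 100;             /* last two digits      */
    return (g1 + g2 + g3) % 1000;
}

/* Mid-square: square the key and take r bits from the middle of the
   64-bit square; the table size is then 2^r. Assumes 1 <= r <= 31 and
   uses only the low 32 bits of the key. */
unsigned int hash_midsquare(unsigned long key, unsigned int r)
{
    unsigned long long k = key & 0xFFFFFFFFUL;
    unsigned long long sq = k * k;
    unsigned int shift = (64 - r) / 2;       /* centre the r-bit window */
    return (unsigned int) ((sq >> shift) & ((1ULL << r) - 1));
}
```

With the key from the text, `hash_truncate(62538194)` gives 394 and `hash_fold(62538194)` gives 100. Taking the window from the middle of a fixed 64-bit word is one possible reading of "the middle of the square"; for small keys a narrower word would spread the values better.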

4.2.2 Hash Collision
To store the name or to add attributes of the name, we compute the hash value of the name, and place the
name or attributes, as the case may be, at that place in the table whose index is the hash value of the
name. For retrieving the attribute values of a name kept in the symbol table, we apply the hash
function to the name to obtain the index of the table where we get the attributes of the name. Hence we
find that no comparisons are required, and so the time required for the retrieval is
independent of the table size. Therefore retrieval is possible in a constant amount of time, which will
be the time taken for computing the hash function. The hash table therefore seems to be the best choice for
realization of the symbol table, but there is one problem associated with hashing, and it is that of
collisions. A hash collision occurs when two identifiers are mapped into the same hash value. This
happens because a hash function defines a mapping from the set of valid identifiers to the set of those
integers which are used as indices of the table. The domain of the mapping
defined by the hash function is much larger than the range of the mapping, and hence the mapping is
of many-to-one nature. Therefore, when we implement a hash table, a suitable collision handling
mechanism is to be provided which will be activated when there is a collision.
Collision handling involves finding an alternative location for one of the two colliding symbols.
For example, if x and y are different identifiers and if h(x) = h(y) = i, then x and y are the colliding
symbols. If x is encountered before y, then the i-th entry of the table will be used for accommodating symbol x,
but later on when y comes there is a hash collision, and therefore we have to find an alternative
location either for x or for y. This means we find a suitable alternative location and either
accommodate y in that location, or move x to that location and place y in the i-th location of the
table. There are various methods available to obtain an alternative location to handle the collision.
They differ from each other in the way the search is made for an alternative location. The following are
the commonly used collision handling techniques:
1. Linear probing or linear open addressing: In this method, if for an identifier x, h(x) = i, and if the i-th
location is already occupied, then we search for a location close to the i-th location by doing a linear
search starting from the (i+1)-th location to accommodate x. This means we start from the (i+1)-th
location and do the linear search till we get an empty location, and once we get an empty location
we accommodate x there.
2. Rehashing: This is another method of collision handling. In this method, we find an alternative
empty location by modifying the hash function, and applying the modified hash function to the
colliding symbol. For example, if x is a symbol and h(x) = i, and if the i-th location is already
occupied, then we modify the hash function h to h1, and find h1(x). If h1(x) = j, and the j-th location
is empty, then we accommodate x in the j-th location. Otherwise, we once again modify h1 to
some h2 and repeat the process till the collision gets handled. Once the collision gets handled we
revert back to the original hash function before considering the next symbol.
3. Separate chaining/overflow chaining: This is a method of implementing a hash table in which
collisions get handled automatically. In this method, we use two tables: a symbol table to
accommodate identifiers and their attributes, and a hash table which is an array of pointers pointing to symbol
table entries. Each symbol table entry is made of three fields: the first for holding the identifier, the second for
holding the attributes, and the third for holding the link or pointer which can be made to point to any
symbol table entry. The insertions into the symbol table are done as follows:
If x is the symbol to be inserted, then it will be added to the next available entry of the symbol table.
The hash value of x is then computed. If h(x) = i, then the i-th hash table pointer is made to point
to the symbol table entry in which x is stored, if the i-th hash table pointer is not pointing to any
symbol table entry. If the i-th hash table pointer is already pointing to some symbol table entry,
then the link field of the symbol table entry containing x is made to point to that symbol table entry
to which the i-th hash table pointer is pointing, and the i-th hash table pointer is made to point to the
symbol table entry containing x. This is equivalent to building a linked list on the i-th index of the hash
table. The retrieval of attributes is done as follows:
If x is a symbol, then we obtain h(x), use this value as the index of the hash table, and traverse
the list built on this index to get the entry which contains x. A typical hash table implemented
using this technique is shown below:
Let the symbols to be stored be x1, y1, z1, x2, y2, z2. The hash function that we use is:

h(symbol) = (value of first letter of the symbol) mod n,

where n is the size of the table.

If h(x1) = i
h(y1) = j
h(z1) = k
then
h(x2) = i
h(y2) = j
h(z2) = k
Therefore the contents of the symbol table will be the one shown in Figure 4.1.

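[Figure: the Hash Table, with entries i, j and k, shown beside the Symbol Table, whose entries each carry a Link field that is either NULL or points to another symbol table entry.]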

Figure 4.1: Hash Table Implementation using Overflow Chaining for Collision Handling
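The two-table arrangement described above can be sketched in C as follows. This is only an illustration under stated assumptions: the names (`struct sym_entry`, `chain_insert`, `chain_lookup`), the table sizes, and the use of a single `int` to stand in for the attributes are all choices made for this example.

```c
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 11            /* illustrative sizes, not taken from the text */
#define MAX_SYMBOLS 32

struct sym_entry {
    const char *name;            /* the identifier */
    int attribute;               /* stands in for the attribute fields */
    struct sym_entry *link;      /* next entry with the same hash value */
};

struct sym_entry symbol_table[MAX_SYMBOLS];
int next_free = 0;                           /* next available symbol table entry */
struct sym_entry *hash_table[TABLE_SIZE];    /* all NULL initially */

/* h(symbol) = (value of first letter of the symbol) mod n */
unsigned int first_letter_hash(const char *name)
{
    return (unsigned char) name[0] % TABLE_SIZE;
}

/* Store the symbol in the next free entry and link it at the head of the
   list built on its hash index. Returns NULL if the symbol table is full. */
struct sym_entry *chain_insert(const char *name, int attribute)
{
    struct sym_entry *e;
    unsigned int i = first_letter_hash(name);
    if (next_free == MAX_SYMBOLS)
        return NULL;
    e = &symbol_table[next_free++];
    e->name = name;
    e->attribute = attribute;
    e->link = hash_table[i];     /* NULL if the list was empty */
    hash_table[i] = e;
    return e;
}

/* Traverse the list built on index h(name) to find the entry for name.
   Returns NULL if the name is not stored. */
struct sym_entry *chain_lookup(const char *name)
{
    struct sym_entry *e = hash_table[first_letter_hash(name)];
    while (e != NULL && strcmp(e->name, name) != 0)
        e = e->link;
    return e;
}
```

After inserting "x1" and then "x2", the hash table pointer for their common index points to the entry for "x2", whose link field points to the entry for "x1"; the link field of "x1" is NULL.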
Consider using the division method of hashing to store the following values in a hash table of size 11:

25, 45, 96, 101, 102, 162, 197, 201

Use the sequential method for resolving the collisions.

Since the division method of hashing is to be used, the hash function h is:
h(key) = key mod 11, where key is the value to be stored.
We start with the value 25, and compute the hash value using 25 as key. The hash value is h(25) = 25
mod 11 = 3. Therefore store 25 at index 3 in the table.
For 45, h(45) = 45 mod 11 = 1, hence place 45 at index 1.

For 96, h(96) = 96 mod 11 = 8,
∴ store 96 at index 8.

For 101, h(101) = 101 mod 11 = 2,
∴ store 101 at index 2.
For 102, h(102) = 102 mod 11 = 3. There is a collision, therefore we find an empty location close to the
location at index 3 to accommodate 102. We see that the location at index 4 is empty,
∴ store 102 at index 4.
For 162, h(162) = 162 mod 11 = 8. Again there is a collision, therefore we find an empty location close to the
location at index 8 to accommodate 162. We see that the location at index 9 is empty,
∴ store 162 at index 9.
For 197, h(197) = 197 mod 11 = 10,
∴ store 197 at index 10.

For 201, h(201) = 201 mod 11 = 3. Again there is a collision, therefore we find an empty location close to the
location at index 3 to accommodate 201. The locations at indices 3 and 4 are occupied, and we see that the
location at index 5 is empty,
∴ store 201 at index 5.
The hash table therefore is the one shown below:

Index   Value
1       45
2       101
3       25
4       102
5       201
8       96
9       162
10      197
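The worked example can be reproduced with a short C sketch of the division method combined with linear probing. The names `table_init`, `probe_insert` and `EMPTY` are assumptions made for this illustration, as is the wrap-around from the last index back to index 0, which the text does not discuss.

```c
#define TABLE_SIZE 11
#define EMPTY (-1)       /* marks an unused slot; assumes keys are non-negative */

/* Mark every slot of the table as unused. */
void table_init(int table[])
{
    int i;
    for (i = 0; i < TABLE_SIZE; i++)
        table[i] = EMPTY;
}

/* Insert key using h(key) = key mod 11 and linear probing.
   Returns the index used, or -1 if the table is full. */
int probe_insert(int table[], int key)
{
    int home = key % TABLE_SIZE;
    int step;
    for (step = 0; step < TABLE_SIZE; step++) {
        int i = (home + step) % TABLE_SIZE;   /* wrap around at the end (an assumption) */
        if (table[i] == EMPTY) {
            table[i] = key;
            return i;
        }
    }
    return -1;
}
```

Inserting 25, 45, 96, 101, 102, 162, 197 and 201 in that order places them at indices 3, 1, 8, 2, 4, 9, 10 and 5, which matches the table above.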

4.3 PRIORITY QUEUES


A priority queue is a collection of elements such that each element has been assigned a priority and
such that the order in which elements are deleted and processed is defined by the following rules:

An element of higher priority is processed before any element of lower priority. Two elements with
the same priority are processed according to the order in which they were inserted into the queue.
We would use a singly linked list to implement the priority queue. Each node of the linked list would
have a type definition as follows.
struct qElement
{
    T item;                    /* T is the type of the stored data */
    int priority;
    struct qElement *next;
};
typedef struct qElement Pqueue;
Pqueue *front, *rear;
The algorithm for the insertion would change now. Insertion would insert the new element at the
correct position according to the priority of the element. The elements of the priority queue would be
sorted in non-ascending order of priority, with the front of the queue having the element with
the highest priority. The deletion procedure need not change since the element at the front is the one
with the highest priority and that is the one that should be deleted.
void insert(Pqueue **front, Pqueue **rear, T e, int p)
/* this inserts an element having data e and priority p into the priority queue */
/* the insertion maintains the sorted order of the priority queue */
/* front and rear are passed by address so that the caller's pointers are updated */
{
    Pqueue *f, *r;
    Pqueue *x;

    x = (Pqueue *) malloc(sizeof(Pqueue));    /* needs <stdlib.h> */
    x->item = e;
    x->priority = p;
    if (*front == NULL)
    {
        /* x is the first node being added to the priority queue */
        *front = x;
        x->next = NULL;
        *rear = x;
    }
    else if ((*front)->priority < p)
    {
        /* x has the highest priority hence should be at the front */
        x->next = *front;
        *front = x;
    }
    else if ((*rear)->priority >= p)
    {
        /* x has the least priority hence should be at the rear; an element */
        /* of equal priority goes behind those inserted earlier */
        x->next = NULL;
        (*rear)->next = x;
        *rear = x;
    }
    else
    {
        /* x has to be inserted in between according to its priority */
        r = *front;
        f = r->next;
        while (f->priority >= p)    /* advance through the queue till the proper position is reached */
        {
            r = f;
            f = f->next;
        }
        /* f now points to the node before which x has to be inserted and
           r points to the node which should be before x */
        r->next = x;
        x->next = f;
    }
}
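For experimentation, a compact self-contained variant of the linked-list priority queue is sketched below, together with the deletion operation. It is not the listing above: it drops the rear pointer and walks the list through a pointer-to-pointer, and the names `struct pq_node`, `pq_insert` and `pq_delete`, as well as the choice of `int` for the item type `T`, are assumptions made for this example.

```c
#include <stdlib.h>

typedef int T;                      /* assumption: items are plain ints */

struct pq_node {
    T item;
    int priority;
    struct pq_node *next;
};

/* Insert so that priorities stay in non-ascending order from the front.
   An element of equal priority goes behind those already queued.
   Returns 0 on success, -1 if memory could not be allocated. */
int pq_insert(struct pq_node **front, T e, int p)
{
    struct pq_node **link = front;
    struct pq_node *x = malloc(sizeof *x);
    if (x == NULL)
        return -1;
    x->item = e;
    x->priority = p;
    while (*link != NULL && (*link)->priority >= p)
        link = &(*link)->next;      /* skip nodes that must stay ahead of x */
    x->next = *link;
    *link = x;
    return 0;
}

/* Remove the front element, which has the highest priority.
   Returns 0 and stores the item in *out, or -1 if the queue is empty. */
int pq_delete(struct pq_node **front, T *out)
{
    struct pq_node *x = *front;
    if (x == NULL)
        return -1;
    *out = x->item;
    *front = x->next;
    free(x);
    return 0;
}
```

Starting from an empty queue (`struct pq_node *q = NULL;`), inserting items with priorities 2, 5, 2 and 1 and then deleting repeatedly returns the priority-5 item first, then the two priority-2 items in their insertion order, then the priority-1 item.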

Binary Heap
A binary tree that has the following properties (called heap properties) is called a heap tree or binary
heap.

1. Either it is empty
Or
2. The key in the root is larger than that in either child
And
3. Both subtrees have the heap properties.

Thus, a heap tree or binary heap can be used as a priority queue where the highest priority item is at
the root and is trivially extracted. But if the root is deleted, we are left with two sub-trees and we must
efficiently re-create a single tree with the heap property. Insertion and deletion in a heap tree are very
efficient - of the order of O(log n).

Adding a Node to Heap


Inserting a node into a heap-tree is relatively straightforward. Because we keep track of the largest position n
which has been filled so far, we can insert the new element at position n + 1 provided there is still room
in the array. This will once again give us a complete binary tree, but the heap-tree property might, of
course, be violated now. Hence we may have to 'bubble up' the new element. This is done by
comparing its priority with that of its parent, and if the new element has higher priority, then it
is exchanged with its parent. We may have to repeat this process, but once we reach a parent that has
bigger or equal priority, we can stop. Inserting a node takes at most log(n) steps because the number of
times that we may have to 'bubble up' the new element depends on the height of the tree. An
algorithm for inserting a new node is listed below:
insert(node v)
{
    if (n < MAX)
    {
        n = n + 1;
        heap[n] = v;
        bubble_up(n);
    }
    else
        report error: out of space;
}

bubble_up(int i)
{
    /* parent(i) is the position i/2 when the root is stored at position 1 */
    while (not isroot(i) and heap[i] > heap[parent(i)])
    {
        swap heap[i] and heap[parent(i)];
        i = parent(i);
    }
}

Boolean isroot(int i)
{
    if (i == 1)
        return TRUE;
    else
        return FALSE;
}

Consider the following heap tree.

Let us insert a node with data value 49. Since the next position is the left child of the node with value 43,
the new node will be added as shown below:

However, this makes the tree violate the heap property. Therefore, 43 must be swapped with 49.

Even now the heap property is not being fulfilled. Therefore, 45 must be swapped with 49. At this
point the tree possesses the heap property and thus, we stop.

Deleting a Node from Heap

In a heap-tree, only the node with the highest priority (the one at the root) is deleted. We're then left
with something which isn't a binary tree at all. To restore a complete binary tree, the last element of
the heap is moved into the vacant root. We can now 'trickle down' the new root by comparing
it to both its children and exchanging it with the larger of the two, if that child has higher priority.
This process is then repeated until this element has found its place.

Again, this takes at most log(n) steps. Note that this algorithm does not try to be fair in the sense that
if two nodes have the same priority, it is not necessarily the one that has been waiting longer that is
removed first. A solution to this is to keep some kind of time-stamp on arrivals, or to give them
numbers.
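Insertion with 'bubble up' and deletion with 'trickle down' can be put together in a runnable C sketch for an array-based heap of integers. The structure `struct heap`, the function names and the capacity `MAX` are assumptions made for this illustration; positions are numbered from 1, so the parent of position i is i / 2 and its children are 2i and 2i + 1.

```c
#define MAX 100               /* illustrative capacity */

struct heap {
    int key[MAX + 1];         /* key[1..n] used; key[0] is unused */
    int n;                    /* number of elements currently stored */
};

static void swap_keys(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Insert v at position n + 1 and bubble it up.
   Returns 0 on success, -1 if the heap is out of space. */
int heap_insert(struct heap *h, int v)
{
    int i;
    if (h->n >= MAX)
        return -1;
    i = ++h->n;
    h->key[i] = v;
    while (i > 1 && h->key[i] > h->key[i / 2]) {
        swap_keys(&h->key[i], &h->key[i / 2]);
        i /= 2;
    }
    return 0;
}

/* Remove the root, move the last element into the root and trickle it down.
   Returns 0 and stores the old root in *out, or -1 if the heap is empty. */
int heap_delete_max(struct heap *h, int *out)
{
    int i = 1;
    if (h->n == 0)
        return -1;
    *out = h->key[1];
    h->key[1] = h->key[h->n];
    h->n--;
    for (;;) {
        int largest = i, l = 2 * i, r = 2 * i + 1;
        if (l <= h->n && h->key[l] > h->key[largest]) largest = l;
        if (r <= h->n && h->key[r] > h->key[largest]) largest = r;
        if (largest == i)
            break;                         /* both children are smaller or absent */
        swap_keys(&h->key[i], &h->key[largest]);
        i = largest;
    }
    return 0;
}
```

A heap must be emptied by setting `n` to 0 before first use. Repeated calls to `heap_delete_max` return the stored keys in descending order.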

Consider the following heap tree, for illustrating the deletion operation.

Deleting a node will remove 50 from the root of the heap tree. The empty root must then be filled
with the last element of the heap tree (i.e., 43).

However, in doing so, the heap loses its heap property and therefore, it must be rearranged. 43 must
be swapped with 49.

Even this does not conform to the definition of a heap. One more swap is necessary - 43 with 45 -
resulting in the final heap tree.

1. Define rehashing.
2. What is mid-square method?

4.4 LET US SUM UP


• Hashing is a method of directly computing the index of the table by using some suitable
mathematical function called hash function.
• A priority queue is a collection of elements such that each element has been assigned a priority.
• An element of higher priority is processed before any element of lower priority. Two elements
with the same priority are processed according to the order in which they were inserted into the
queue.

4.5 KEYWORDS
Hashing: Hashing is a method of directly computing the index of the table by using some suitable
mathematical function called hash function.
Separate Chaining: This is a method of implementing a hash table, in which collisions get handled
automatically.
Priority queue: A queue in which elements are assigned priorities to determine the order in which they
can be retrieved.

4.6 QUESTIONS FOR DISCUSSION


1. Use the division method of hashing to store the following values in a hash table of size 12.

125, 745, 196, 201, 202, 145, 107, 201

Use the sequential method for resolving the collisions.


2. How is a heap tree different from a binary tree?
3. Show the step-wise results of inserting a node (36) into the following heap tree.

4. Give an example of a priority queue from everyday life.

Check Your Progress: Model Answers


1. Rehashing is a method of collision handling. In this method we find an alternative empty
location by modifying the hash function, and applying the modified hash function to the
colliding symbol.
2. In the mid-square method the identifier is squared (considering non-numeric
fields as their numerical code), and then the appropriate number of bits from the middle
of the square are used to get the hash value.

4.7 SUGGESTED READINGS


Data Structures and Efficient Algorithms; Burkhard Monien, Thomas Ottmann; Springer

Data Structures and Algorithms; Shi-Kuo Chang; World Scientific

How to Solve it by Computer; RG Dromey; Cambridge University Press

Classic Data Structures in C++; Timothy A. Budd; Addison Wesley
UNIT III
LESSON

SORTING
CONTENTS
5.0 Aims and Objectives
5.1 Introduction: Sorting Preliminaries
5.2 'O' Notation
5.3 Insertion Sort
5.4 Shell Sort
5.5 Heap Sort
5.5.1 Insertion in Heap
5.5.2 Deletion from Heap
5.6 Construction of Heap
5.6.1 Top-down Construction
5.6.2 Bottom-up Construction
5.7 Sorting using Heap
5.8 Merge Sort
5.9 Quick Sort
5.10 Let us Sum up
5.11 Keywords
5.12 Questions for Discussion
5.13 Suggested Reading

5.0 AIMS AND OBJECTIVES


After studying this lesson, you will be able to:
• Discuss the concept of sorting
• Discuss insertion sort, shell sort, heap sort, merge sort, and quick sort

5.1 INTRODUCTION: SORTING PRELIMINARIES


Sorting refers to the operation of arranging data in some given order, such as increasing or decreasing,
with numerical data or alphabetically, with character data.
Let P be a list of n elements P1, P2, P3, ..., Pn in memory. Sorting P refers to the operation of rearranging
the contents of P so that they are increasing in order (numerically or lexicographically), that is,
P1 ≤ P2 ≤ P3 ≤ ... ≤ Pn

Since P has n elements, there are n! ways that the contents can appear in P. These ways correspond
precisely to the n! permutations of 1, 2, ..., n. Accordingly, each sorting algorithm must take care of
these n! possibilities.

Classification of Sorting

Sorting can be classified in two types:


(i) Internal Sorting
(ii) External Sorting
• Internal Sorting: If the records that have to be sorted are in the internal memory.
• External Sorting: If the records that have to be sorted are in secondary memory.

Efficiency Considerations

The most important considerations are:


1. The amount of machine time necessary for running the program.
2. The amount of space necessary for the program.
In most computer applications one is optimized at the expense of the other. The actual time
to sort a file of size n varies from machine to machine, from one program to another, and from one set
of data to another.
So, we find out the corresponding change in the amount of time required to sort a file induced by a
change in the file size n. The time efficiency is not measured by the number of time units required but
rather by the number of critical operations performed.
Critical operations are those that take most of the execution time. For example, key comparisons (that
is, the comparisons of the keys of two records in the file to determine which is greater), movement of
records or pointers to records, or interchange of two records.

5.2 'O' NOTATION
Given two functions f(n) and g(n), we say that f(n) is of the order of g(n), or that f(n) is O(g(n)), if there
exist positive integers a and b such that

f(n) ≤ a * g(n) for n ≥ b.
For example, if f(n) = n^2 + 100n and g(n) = n^2, then
f(n) is O(g(n)), since n^2 + 100n is less than or equal to 2n^2 for all n greater than or equal to 100. In this
case a = 2 and b = 100.
The same f(n) is also O(n^3), since n^2 + 100n is less than or equal to 2n^3 for all n greater than or equal to
8. If f(n) is O(g(n)) and g(n) is O(h(n)), then f(n) is O(h(n)). For example, n^2 + 100n is O(n^2) and n^2 is
O(n^3) (for a = 1, b = 1), so n^2 + 100n is O(n^3). This is called the transitive property.
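The constants quoted in this example can be checked directly from the definition. The short derivation below is an added illustration, written in LaTeX, and uses only the functions n^2 + 100n, 2n^2 and 2n^3 from the example.

```latex
\begin{align*}
n^2 + 100n \le 2n^2
  &\iff 100n \le n^2
   \iff n \ge 100
  && (a = 2,\ b = 100) \\
n^2 + 100n \le 2n^3
  &\iff 2n^2 - n - 100 \ge 0 \quad (n > 0)
  && (a = 2,\ b = 8)
\end{align*}
% At n = 8:  2(64) - 8 - 100 = 20  >= 0, and 2n^2 - n - 100 increases for n >= 1.
% At n = 7:  2(49) - 7 - 100 = -9  <  0, so b = 8 is the smallest integer that works for a = 2.
```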
If the function is c * n then its order will be O(n^k) for any constants c and k ≥ 1, as c * n is less than or
equal to c * n^k for any n ≥ 1 (i.e. a = c, b = 1). If f(n) is n^k then its order will be O(n^(k+j)) for any j ≥ 0 (for
a = 1, b = 1).

If f(n) and g(n) are both O(h(n)), the new function f(n) + g(n) is also O(h(n)). If f(n) is any polynomial
whose leading power is k, i.e. f(n) = c1 * n^k + c2 * n^(k-1) + ... + ck * n + c(k+1), then

f(n) is O(n^k).
Algorithm Efficiency in Logarithmic Function
Let log_m n and log_k n be two functions. Let xm be log_m n and xk be log_k n; then

m^xm = n and k^xk = n

(since if m^x = n, then log_m n = x), so that

m^xm = k^xk

Taking log_m of both sides,

xm = log_m (k^xk)

Now it can easily be shown that log_z (x^y) equals y * log_z x for any x, y and z, so that the last equation
can be written as

log_m n = xk * log_m k   [since xm = log_m n]

or as

log_m n = (log_m k) * log_k n   [since xk = log_k n]

Thus log_m n and log_k n are constant multiples of each other.

If f(n) = c * g(n) then f(n) is O(g(n)); thus log_m n is O(log_k n) and log_k n is O(log_m n) for any m and k.

The following facts establish an order hierarchy of functions:


• c is O(1) for any constant c.
• c is O(log n), but log n is not O(1).
• c * log_k n is O(log n) for any constants c, k.
• c * log_k n is O(n), but n is not O(log n).
• c * n^k is O(n^k) for any constants c, k.
• c * n^k is O(n^(k+1)), but n^(k+1) is not O(n^k).
• c * n * log_k n is O(n log n) for any constants c, k.
• c * n * log_k n is O(n^2), but n^2 is not O(n log n).
• c * n^j * log_k n is O(n^j log n) for any constants c, j, k.
• c * n^j * log_k n is O(n^(j+1)), but n^(j+1) is not O(n^j log n).
• c * n^j * (log_k n)^l is O(n^j (log n)^l) for any constants c, j, k, l.
• c * n^j * (log_k n)^l is O(n^(j+1)), but n^(j+1) is not O(n^j (log n)^l).

5.3 INSERTION SORT


An insertion sort is one that sorts a set of records by inserting records into an existing sorted file.
Suppose an array A with n elements A[1], A[2], ..., A[N] is in memory. The insertion sort algorithm
scans A from A[1] to A[N], inserting each element A[K] into its proper position in the previously
sorted subarray A[1], A[2], ..., A[K-1].
Example: Sort the following list using the insertion sort method:
4, 1, 3, 2, 5

List before the pass    Insertion performed
4  1  3  2  5           1 < 4, therefore insert prior to 4
1  4  3  2  5           3 > 1, insert between 1 & 4
1  3  4  2  5           2 > 1, insert between 1 & 3
1  2  3  4  5           5 > 4, insert after 4

Thus, to find the correct position, search the list till an item just greater than the target is found; shift
all the items from this point one down the list, and insert the target in the vacated slot.
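Algorithm to Implement Insertion Sort
void insertsort(int x[], int n)
{
    int i, k, y;
    for (k = 1; k < n; k++)
    {
        /* insert x[k] into the sorted subarray x[0] ... x[k-1] */
        y = x[k];
        /* move down one position all elements greater than y */
        for (i = k - 1; i >= 0 && y < x[i]; i--)
            x[i + 1] = x[i];
        /* insert y at its proper position */
        x[i + 1] = y;
    }
}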
Analysis of Insertion Sort

If the initial file is sorted, only one comparison is made on each pass, so that the sort is O(n). If the file is
initially sorted in reverse order, the sort is O(n^2), since the total number of comparisons is:
(n-1) + (n-2) + ... + 3 + 2 + 1 = (n-1) * n/2
which is O(n^2).
The closer the file is to sorted order, the more efficient the simple insertion sort becomes. The space
requirements for the sort consist of only one temporary variable, y. The speed of the sort can be
improved somewhat by using a binary search to find the proper position for x[k] in the sorted file

x[0], x[1] ... x[k - 1]


This reduces the total number of comparisons from O(n^2) to O(n log n). However, even if the
correct position i for x[k] is found in O(log n) steps, each of the elements x[i+1] ... x[k - 1] must be
moved by one position. This latter operation, performed n times, requires O(n^2) replacements.
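The binary-search refinement mentioned above can be sketched in C as follows. The function name `binary_insertion_sort` is an assumption made for this illustration. The number of comparisons drops to O(n log n), while the element moves remain O(n^2) in the worst case, as the analysis states.

```c
/* Insertion sort that locates each insertion point by binary search. */
void binary_insertion_sort(int x[], int n)
{
    int k;
    for (k = 1; k < n; k++) {
        int y = x[k];
        int lo = 0, hi = k;          /* the insertion point lies in lo..k */
        int i;
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (x[mid] <= y)
                lo = mid + 1;        /* equal keys keep their original order */
            else
                hi = mid;
        }
        for (i = k; i > lo; i--)     /* shift x[lo] ... x[k-1] one position right */
            x[i] = x[i - 1];
        x[lo] = y;
    }
}
```

Calling it on the list 4, 1, 3, 2, 5 from the earlier example leaves the array as 1, 2, 3, 4, 5.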

5.4 SHELL SORT


A more significant improvement on simple insertion sort than binary or list insertion can be achieved
using the Shell Sort (Diminishing Increment Sort). This method sorts separate subfiles of the original file.
These subfiles contain every kth element of the original file. The value of k is called an increment. For
example, if k is 5, the subfile consisting of x[0], x[5], x[10], ... is first sorted. Five subfiles, each
containing one fifth of the elements of the original file, are sorted in this manner. These are:
Subfile 1: x[0] x[5] x[10] ...
Subfile 2: x[1] x[6] x[11] ...
Subfile 3: x[2] x[7] x[12] ...
Subfile 4: x[3] x[8] x[13] ...
Subfile 5: x[4] x[9] x[14] ...
The i-th element of the j-th subfile is x[(i - 1) * 5 + j - 1]. If a different increment k is chosen, the k subfiles
are divided so that the i-th element of the j-th subfile is x[(i - 1) * k + j - 1].

After the k subfiles are sorted (usually by simple insertion), a new smaller value of k is chosen and the
file is again partitioned into a new set of subfiles. Each of these larger subfiles is sorted and the
process is repeated yet again with an even smaller value of k.

Eventually, the value of k is set to 1, so that the subfile consisting of the entire file is sorted.
A decreasing sequence of increments is fixed at the start of the entire process. The last value in this
sequence must be 1.

For example, if the original file is:

25 57 48 37 12 92 86 33

and the sequence (5, 3, 1) is chosen, the following subfiles are sorted on each iteration:
First iteration (increment = 5)

(x[0], x[5])
(x[1], x[6])
(x[2], x[7])
(x[3])
(x[4])
Second iteration (increment = 3)

(x[0], x[3], x[6])
(x[1], x[4], x[7])
(x[2], x[5])

Third iteration (increment = 1)

(x[0], x[1], x[2], x[3], x[4], x[5], x[6], x[7])


Example: The following figure illustrates the shell sort on this sample file:

[Figure: shell sort of the original file 25 57 48 37 12 92 86 33 in three passes (span = 5, span = 3, span = 1), ending with the sorted file 12 25 33 37 48 57 86 92]
Algorithm to Implement Shell Sort
void shellsort(int x[], int n, int increments[], int numeric)
{
    /* numeric is the number of entries in increments[] */
    int incr, j, k, span, y;
    for (incr = 0; incr < numeric; incr++)
    {
        /* span is the size of the current increment */
        span = increments[incr];
        for (j = span; j < n; j++)
        {
            /* insert element x[j] into its proper position within its subfile */
            y = x[j];
            for (k = j - span; k >= 0 && y < x[k]; k -= span)
                x[k + span] = x[k];
            x[k + span] = y;
        } /* end for */
    } /* end for */
} /* end shellsort */
Analysis of Shell Sort
Since the first increment used by the shell sort is large, the individual subfiles are quite small, so that the simple insertion sorts on those subfiles are fairly fast. Each sort of a subfile causes the entire file to be more nearly sorted. Thus, although successive passes of the shell sort use smaller increments and therefore deal with larger subfiles, those subfiles are almost sorted due to the actions of previous passes. Thus the insertion sorts on these subfiles are also quite efficient. The actual time requirement for a specific sort depends on the number of elements in the array increments and on their actual values.

It has been shown that the order of the shell sort can be approximated by O(n (log n)^2) if an appropriate sequence of increments is used. For other series of increments, the running time can be as bad as O(n^2).

5.5 HEAP SORT


The algorithm which we now formulate is a combination of algorithms by Floyd and Williams. In general, a heap which represents a table of n records satisfies the property K[j] <= K[i] for 2 <= j <= n and i = floor(j/2). The binary tree is allocated sequentially such that the indices of the left and right sons (if they exist) of record i are 2i and 2i + 1 respectively.

A complete binary tree is said to satisfy the "Heap Condition" if the key of each node is greater than or equal to the keys in its children. Thus, the root node will have the largest key value. Trees can be represented as arrays, by first numbering the nodes (starting from the root) from left to right. The key values of the nodes are then assigned to array positions whose index is given by the number of the node.

Figure 5.1: Heap 1

The relationships of the nodes can also be determined from the array representation. If a node is at position j, its children will be at positions 2j and 2j + 1. Its parent will be at position j/2 (integer division). A heap is a complete binary tree in which each node satisfies the heap condition, represented as an array. An operation on a heap works in two steps:
(i) The required node is inserted, deleted, or replaced.
(ii) The first step may cause violation of the heap condition, so the heap is traversed and modified to rectify any such violations.
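
The index relationships above can be sketched in C. This is a minimal illustration, assuming integer keys stored in positions 1 to n of an array with position 0 left unused; the function names are assumptions, not part of the text.

```c
/* 1-based indexing as in the text: k[0] is unused. */
int parent(int j)      { return j / 2; }
int left_child(int j)  { return 2 * j; }
int right_child(int j) { return 2 * j + 1; }

/* Return 1 if k[1..n] satisfies the heap condition (every key is less
   than or equal to the key of its parent), and 0 otherwise. */
int is_heap(const int k[], int n)
{
    int j;
    for (j = 2; j <= n; j++)
        if (k[j] > k[parent(j)])
            return 0;
    return 1;
}
```

Checking each node against its parent is equivalent to checking each node against its children, and it avoids testing whether a child index falls outside the array.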

5.5.1 Insertion in Heap


Consider the insertion of node R into heap 1.

1. Initially R is added as the right child of J and given the number 13.
2. But R > J, so the heap condition is violated.
3. Move R up to position 6 and move J to position 13.
4. R > P, therefore the heap condition is still violated.
5. Swap R and P.
6. The heap condition is now satisfied by all nodes.

Figure 5.2:Heap 2

Figure 5.3: Heap 3

A general algorithm for creating a heap is given below:


1. Repeat through step 7 while there still is another record to be placed in the heap.
2. Obtain child to be placed at leaf level.
3. Obtain position of parent for this child.
4. Repeat through step 6 while the child has a parent and the key of the child is greater than that of its parent.
5. Move parent down to position of child.
6. Obtain position of new parent for the child.
7. Copy child record into its proper place.
A more formal procedure Create_Heap(K, N) is given below. Given a table K containing the keys of the N records of a table, this algorithm creates a heap as previously described. The index variable Q controls the number of insertions which are to be performed. The integer variable J denotes the index of the parent of key K[I]. KEY contains the key of the record being inserted into an existing heap.
1. [Build heap]
   Repeat through step 7 for Q = 2, 3, ..., N
2. [Initialize construction phase]
   I ← Q
   KEY ← K[Q]
3. [Obtain parent of new record]
   J ← Trunc(I/2)
4. [Place new record in existing heap]
   Repeat through step 6 while I > 1 and KEY > K[J]
5. [Interchange record]
   K[I] ← K[J]
6. [Obtain next parent]
   I ← J
   J ← Trunc(I/2)
   if J < 1
   then J ← 1
7. [Copy new record into its proper place]
   K[I] ← KEY
8. [Finished]
   Return
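
A C rendering of the procedure may make the steps easier to follow. The sketch below assumes integer keys held in k[1..n] with k[0] unused; the function name create_heap is an assumption chosen to mirror the procedure name.

```c
/* Build a heap in k[1..n] by inserting k[2], k[3], ..., k[n] one at a
   time into the heap formed by the keys before it. k[0] is unused. */
void create_heap(int k[], int n)
{
    int q, i, j, key;
    for (q = 2; q <= n; q++) {         /* step 1: next record to insert */
        i = q;                          /* step 2: start at the leaf level */
        key = k[q];
        j = i / 2;                      /* step 3: parent of position i */
        while (i > 1 && key > k[j]) {   /* step 4: heap condition violated */
            k[i] = k[j];                /* step 5: move the parent down */
            i = j;                      /* step 6: climb one level */
            j = i / 2;
            if (j < 1)
                j = 1;
        }
        k[i] = key;                     /* step 7: final place of the key */
    }
}
```

Each insertion climbs at most the height of the tree, so building the heap this way takes O(N log N) comparisons in the worst case.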

5.5.2 Deletion from Heap


When we delete any node from the tree, the child with the greater value will be promoted to replace that node.
Consider deletion of M from heap 2.
The larger of M's children is promoted to position 5.

An efficient sorting method is based on heap construction followed by removal of the nodes from the heap in order. This algorithm is guaranteed to sort N elements in O(N log N) steps.

5.6 CONSTRUCTION OF HEAP


Two methods of heap construction, each followed by removal in order from the heap to sort the list, are as follows:
Top-down Heap Construction

• Insert items into an initially empty heap, keeping the heap condition intact in all steps.

Bottom-up Heap Construction

• Build a heap with the items in the order presented.
• From the rightmost node, modify to satisfy the heap condition.
For example, we build a heap using both methods of construction.
Let the input string be "PROFESSIONAL".

5.6.1 Top-down Construction


Figure 5.4
5.6.2 Bottom-up Construction

Figure 5.5

5.7 SORTING USING HEAP


The sorted elements will be placed in x[ ], an array of size 12.

(i) Remove 'S' and store in x[12].
(ii) Remove 'S' and store in x[11].
(iii) Remove 'R' and store in x[10].
(iv) Remove 'P' and store in x[9].
(v) Remove 'O' and store in x[8].
(vi) Remove 'O' and store in x[7].
(vii) Remove 'N' and store in x[6].
(viii) Remove 'L' and store in x[5].
(ix) Similarly, the remaining 4 nodes are removed and the heap modified to get the sorted list
A E F I L N O O P R S S.
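
The removal steps above can be sketched as a complete heap sort. This is a sketch under stated assumptions: the keys are single characters stored in x[1..n] with x[0] unused, the heap is built bottom-up, and the names sift_down and heap_sort are illustrative rather than taken from the text.

```c
/* Restore the heap condition for the subtree rooted at position i of
   the heap x[1..n], assuming both of its subtrees are already heaps. */
static void sift_down(char x[], int i, int n)
{
    char key = x[i];
    int child = 2 * i;
    while (child <= n) {
        if (child < n && x[child + 1] > x[child])
            child++;                 /* pick the larger of the two children */
        if (key >= x[child])
            break;                   /* heap condition holds here */
        x[i] = x[child];             /* promote the larger child */
        i = child;
        child = 2 * i;
    }
    x[i] = key;
}

/* Sort x[1..n] into non-decreasing order; x[0] is unused. */
void heap_sort(char x[], int n)
{
    int i;
    char t;
    for (i = n / 2; i >= 1; i--)     /* bottom-up heap construction */
        sift_down(x, i, n);
    for (i = n; i >= 2; i--) {
        t = x[1];                    /* remove the root, the largest key */
        x[1] = x[i];
        x[i] = t;                    /* store it in x[i] */
        sift_down(x, 1, i - 1);      /* repair the remaining heap x[1..i-1] */
    }
}
```

Calling heap_sort on a buffer initialised as " PROFESSIONAL" (the leading blank occupies the unused position 0) with n = 12 leaves the letters in positions 1 to 12 in the order AEFILNOOPRSS.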

5.8 MERGE SORT


The operation of sorting is closely related to the process of merging. This sorting method uses merging of two ordered lists which can be combined to produce a single sorted list. This process can be accomplished easily by successively selecting the record with the smallest key occurring in either of the lists and placing this record in a new table, thereby creating an ordered list.
Merge sort is one of the divide and conquer class of algorithms. The basic idea is to:
• Divide the list into a number of sublists.
• Sort each of these sublists.
• Merge them to get a single sorted list.
Two-way merge sort divides the list into two, sorts the sublists and then merges them to get the sorted list; it is also called concatenate sort.
Multiple merging can also be accomplished by performing a simple merge repeatedly. For example, if we have 16 lists to merge, we can first merge them in pairs. The result of this first step yields eight tables which are again merged in pairs to give four tables. This process is repeated until a single table is obtained. In this example, four separate passes are required to yield a single list. In general, k separate passes are required to merge 2^k separate lists into a single list.

Example

[Figure: eight single-element lists merged in pairs over three passes into a single sorted list]

Algorithm for Merging Two Sorted Files to Get Third Sorted File

Let (x[l], ..., x[m]) and (x[m+1], ..., x[n]) be two sorted files. Let the merged file be (z[l], ..., z[n]).
Procedure:
void merge(int x[], int z[], int l, int m, int n)
{
    /* (x[l], ..., x[m]) and (x[m+1], ..., x[n]) are two sorted lists with
       keys such that x[l] <= ... <= x[m] and x[m+1] <= ... <= x[n] */
    int i, j, k, t;
    i = l;
    k = l;      /* i, j and k are positions in these files */
    j = m + 1;
    while ((i <= m) && (j <= n))
    {
        if (x[i] <= x[j])
        {
            z[k] = x[i];
            i++;
        }
        else
        {
            z[k] = x[j];
            j++;
        }
        k = k + 1;
    }
    if (i > m)
    {
        /* first list exhausted: copy the rest of the second list */
        for (t = j; t <= n; t++)
            z[k + t - j] = x[t];
    }
    else
    {
        /* second list exhausted: copy the rest of the first list */
        for (t = i; t <= m; t++)
            z[k + t - i] = x[t];
    }
}

The above algorithm for merge sort has one important property: after pass K, the array A will be partitioned into sorted subarrays of exactly L = 2^K elements (except possibly the last subarray).
By dividing N (the size of array A) by 2*L, we get the quotient Q, which is the number of pairs of sorted subarrays of size L, i.e.

Q = INT(N/(2*L))
S = 2*L*Q will be the total number of elements in the Q pairs of subarrays. R = N - S denotes the number of remaining elements.

Analysis of Algorithm 'MERGE'

The while loop is iterated at most n - l + 1 times. The if statement moves at most one record per iteration (considering sorting of records); the total time is therefore O(n - l + 1). If records are of length m then this time is O(m(n - l + 1)).

Analysis of 'MSORT'
On the ith pass the files being merged are of size 2^(i-1). Consequently, a total of ⌈log2 N⌉ passes are made over the data. Since two files can be merged in linear time (algorithm 'MERGE'), each pass of merge sort takes O(N) time. As there are ⌈log2 N⌉ passes, the total computing time is O(N log N).
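
The pass structure analysed above can be sketched as a bottom-up merge sort. This is a minimal sketch, assuming 0-based integer arrays and a temporary buffer of the same size; the names merge_runs and msort are illustrative and are not claimed to be the text's own MSORT.

```c
#include <stdlib.h>
#include <string.h>

/* Merge the sorted runs x[l..m] and x[m+1..n] into z[l..n]. */
static void merge_runs(const int x[], int z[], int l, int m, int n)
{
    int i = l, j = m + 1, k = l;
    while (i <= m && j <= n)
        z[k++] = (x[i] <= x[j]) ? x[i++] : x[j++];
    while (i <= m) z[k++] = x[i++];   /* rest of the first run */
    while (j <= n) z[k++] = x[j++];   /* rest of the second run */
}

/* Sort a[0..n-1]. Returns 0 on success, -1 if no memory is available. */
int msort(int a[], int n)
{
    int len, lo, mid, hi;
    int *z;
    if (n < 2)
        return 0;
    z = malloc((size_t)n * sizeof *z);
    if (z == NULL)
        return -1;
    /* Before each pass the array consists of sorted runs of length len
       (the last run may be shorter); the pass merges them in pairs. */
    for (len = 1; len < n; len *= 2) {
        for (lo = 0; lo < n; lo += 2 * len) {
            mid = lo + len - 1;
            hi = lo + 2 * len - 1;
            if (mid >= n) mid = n - 1;   /* lone run: second run is empty */
            if (hi >= n) hi = n - 1;     /* last pair may be short */
            merge_runs(a, z, lo, mid, hi);
        }
        memcpy(a, z, (size_t)n * sizeof *z);
    }
    free(z);
    return 0;
}
```

The outer loop runs ⌈log2 N⌉ times and each pass touches every element once, which matches the O(N log N) total derived above.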

5.9 QUICK SORT


Quick sort is also known as partition exchange sort. An element (a) is chosen from a specific position within the array x such that x is partitioned and a is placed at position j and the following conditions hold:
1. Each of the elements in positions 0 through j-1 is less than or equal to a.

2. Each of the elements in positions j+1 through n-1 is greater than or equal to a.

The purpose of the Quick Sort is to move a data item in the correct direction just enough for it to reach its final place in the array. The method, therefore, reduces unnecessary swaps and moves an item a great distance in one move. A pivotal item near the middle of the array is chosen, and then items on either side are moved so that the data items on one side of the pivot are smaller than the pivot, whereas those on the other side are larger; the middle (pivot) item is then in its correct position. The procedure is then applied recursively to the parts of the array on either side of the pivot, until the whole array is sorted.

Example:
If an initial array is given as:

25 57 48 37 12 92 86 33
and the first element (25) is placed in its proper position, the resulting array is:

12 25 57 48 37 92 86 33

At this point, 25 is in its proper position in the array (x[1]), each element below that position (12) is less than or equal to 25, and each element above that position (57, 48, 37, 92, 86 and 33) is greater than or equal to 25.
Since 25 is in its final position, the original problem has been decomposed into the problem of sorting the two subarrays
(12) and (57 48 37 92 86 33)
The first of these subarrays has one element, so there is no need to sort it. Repeating the process on the subarray x[2] through x[7] yields:

12 25 (48 37 33) 57 (92 86)

and further repetitions yield

12 25 (37 33) 48 57 (92 86)
12 25 (33) 37 48 57 (92 86)
12 25 33 37 48 57 (92 86)
12 25 33 37 48 57 (86) 92
12 25 33 37 48 57 86 92

Algorithm to Implement Quick Sort

void quicksort(int x[], int m, int n)
{
    /* m and n contain the lower and upper bounds of the array */
    /* array has to be sorted in non-decreasing order */
    int i, j, k, t;
    if (m < n)
    {
        i = m;
        j = n + 1;
        k = x[m];   /* key */
        while (1)
        {
            do
            {
                i = i + 1;
            }
            while (i <= n && x[i] < k);   /* i <= n keeps the scan inside the array */
            do
            {
                j = j - 1;
            }
            while (x[j] > k);
            if (i < j)
            {
                t = x[i];
                x[i] = x[j];
                x[j] = t;
            }
            else break;
        }
        /* place the key in its final position j */
        t = x[m];
        x[m] = x[j];
        x[j] = t;
        quicksort(x, m, j - 1);
        quicksort(x, j + 1, n);
    }
}

We shall illustrate the mechanics of this method by applying it to an array of numbers. Suppose the array A initially appears as:
(15, 20, 5, 8, 95, 12, 80, 17, 9, 55)
Figure 5.6 shows a quick sort applied to this array.
          A(1) A(2) A(3) A(4) A(5) A(6) A(7) A(8) A(9) A(10)
Line 1:    15   20    5    8   95   12   80   17    9   55
Line 2:     9   20    5    8   95   12   80   17  ( )   55
Line 3:     9  ( )    5    8   95   12   80   17   20   55
Line 4:     9   12    5    8   95  ( )   80   17   20   55
Line 5:     9   12    5    8  ( )   95   80   17   20   55
Line 6:     9   12    5    8   15   95   80   17   20   55

Figure 5.6: Quick Sort of an Array


The following steps are involved:

1. Remove the 1st data item, 15, mark its position and scan the array from right to left, comparing data item values with 15. When you find the 1st smaller value, remove it from its current position and put it in position A(1). This is shown in line 2 of Figure 5.7.

2. Scan line 2 from left to right beginning with position A(2), comparing data item values with 15. When you find the 1st value greater than 15, extract it and store it in the position marked by parentheses in line 2. This is shown in line 3 of Figure 5.7.

3. Begin the right to left scan of line 3 with position A(8), looking for a value smaller than 15. When you find it, extract it and store it in the position marked by the parentheses in line 3 of Figure 5.7.

Figure 5.7

4. Begin scanning line 4 from left to right at position A(3), find a value greater than 15, remove it, mark its position, and store it inside the parentheses in line 4. This is shown in line 5 of Figure 5.7.
5. Now, when you scan line 5 from right to left beginning at position A(7), you find no value smaller than 15. Moreover, you come to a parentheses position, position A(5). This is the location to put the 1st data item, 15, as shown in line 6 of Figure 5.7. At this stage 15 is in its correct place relative to the final sorted array.
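
The five steps above describe a partition that moves items into a vacated position (the parentheses in the table of Figure 5.6). A minimal C sketch of that scheme follows; it assumes a 0-based integer array, and the names partition_hole and quick_sort_hole are illustrative, not taken from the text.

```c
/* Partition a[lo..hi] around the pivot a[lo] and return the pivot's
   final index. Position i or j always marks the current vacant slot. */
int partition_hole(int a[], int lo, int hi)
{
    int pivot = a[lo];               /* remove the 1st item, leaving a hole at lo */
    int i = lo, j = hi;
    while (i < j) {
        while (i < j && a[j] >= pivot)
            j--;                     /* right-to-left scan for a smaller value */
        a[i] = a[j];                 /* move it into the hole; the hole is now at j */
        while (i < j && a[i] <= pivot)
            i++;                     /* left-to-right scan for a larger value */
        a[j] = a[i];                 /* move it into the hole; the hole is now at i */
    }
    a[i] = pivot;                    /* the scans met: the pivot belongs here */
    return i;
}

/* Sort a[lo..hi] by partitioning and recursing on both sides. */
void quick_sort_hole(int a[], int lo, int hi)
{
    if (lo < hi) {
        int p = partition_hole(a, lo, hi);
        quick_sort_hole(a, lo, p - 1);
        quick_sort_hole(a, p + 1, hi);
    }
}
```

Applied to the array 15, 20, 5, 8, 95, 12, 80, 17, 9, 55, one call to partition_hole produces 9, 12, 5, 8, 15, 95, 80, 17, 20, 55, the same arrangement as the last line of the table in Figure 5.6.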

Analysis of Quick Sort

Assumption: File is of size n where n is a power of 2.

Say n = 2^m, so that m = log2 n.

Average Case Behaviour

Assume that the proper position for the pivot always turns out to be the exact middle of the subarray. There will be approximately n comparisons on the first pass, after which the file will split into two subfiles, each of size n/2 approximately.
For each of these two files there will be approximately n/2 comparisons. So, after halving the subfiles m times, there are n files of size 1. Thus, the total number of comparisons for the entire sort is approximately:
n + 2*(n/2) + 4*(n/4) + 8*(n/8) + ... + n*(n/n)
or
n + n + n + n + ... + n (m terms)
There are m terms because the file is divided m times. Thus, the total number of comparisons is:
O(n * m) or O(n log n) [as m = log2 n]

Worst Case Behaviour

The worst case occurs when the pivot fails to split the list. This happens when the original file is already sorted. If, for example, x[lb] is in its correct position, the original file is split into subfiles of sizes 0 and n - 1.
If this process continues, a total of n - 1 subfiles are sorted, the first of size n, the second of size (n - 1) and so on. The total number of comparisons to sort the entire file is n + (n - 1) + (n - 2) + ... + 2, which is O(n^2).
Thus, the quick sort works best with completely unsorted files and worst for files that are completely sorted.

Check Your Progress
1. Define sorting. Name its various categories.
2. Fill in the blanks:
(a) In a heap, the ______ node has the largest key value.
(b) In merge sort, total passes are ______ and the total computing time is ______.

5.10 LET US SUM UP


• Sorting is an important operation associated with any data structure.
• Efficient and reliable data processing depends upon sorted data.
• The internal and external sorting methods have their relative efficiencies in different applications.
• In quick sort an element x is chosen from a specific position within the array such that each element in a position before that of x is less than or equal to x and each of the elements in a position after that of x is greater than or equal to x.
• With a large size n and long keys, heap sort and merge sort can be used. With a large size n and short keys, radix sort can be used.
• A heap is a binary tree with the condition that every node has a key value greater than or equal to those of its left and right children.
• While representing such a tree in an array, if a node is placed at the ith index then its left child will be at 2i and its right child will be at 2i + 1. Hence, any node's parent will be at i/2.
• In merge sort we split the array into subarrays of some size, sort each of them separately, and then merge pairs of sorted subarrays to get the sorted list, with the size of the subarrays doubling on each pass.

5.11 KEYWORDS
Sorting: The operation of arranging data in some given order, such as increasing or decreasing with numerical data or alphabetically with character data.
Quick Sort: A divide and conquer algorithm which works by creating two problems of half size, solving them recursively, then combining the solutions to the small problems to get a solution to the original problem.
Insertion Sort: A sorting technique that sorts a set of records by inserting records into an existing sorted file.

Merge Sort: A sorting method that uses merging of two ordered lists which can be combined to produce a single sorted list.

5.12 QUESTIONS FOR DISCUSSION

1. Why is there a need for sorting?
2. Write a short note on 'O' notation.
3. Sort the following list using the insertion sort method:
62831,9
4. Sort the following list using the shell sort method:
362839421910954
5. What is a heap? Construct a heap with the following data:

1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11

6. Explain the sorting method that uses a heap.
7. Write a 'C' code for construction of a heap.
8. Write a 'C' program for heap sort.
9. What is insertion sort? Explain with a suitable example. Also explain its time complexity.
10. Write a 'C' function for insertion sort. For searching the smallest element in the array use binary search. Explain your program's time complexity.
11. What is merging? Write a 'C' program to merge two sorted lists to get another sorted list. Explain its time complexity.
12. What is merge sort? Explain the method with a suitable example.
13. Write a 'C' program for merge sort, which uses the function Merge() to merge two sorted lists to get the third sorted list. Explain its time complexity.
14. Explain Quick Sort. Write a 'C' program for quick sort. Explain its time complexity.
15. Show that the algorithm for quick sort takes O(n^2) time when the input file is already in sorted order.

Check Your Progress: Model Answer


1. Sorting refers to the operation of arranging data in some given order, such as increasing or decreasing with numerical data or alphabetically with character data.
2. Sorting can be classified in two types:

(i) Internal Sorting
(ii) External Sorting
3. (a) root
(b) ⌈log2 N⌉, O(N log N)

5.13 SUGGESTED READING


Shi-Kuo Chang, Data Structures and Algorithms, World Scientific.
UNIT IV
LESSON

6
GRAPH ALGORITHMS
CONTENTS
6.0 Aims and Objectives
6.1 Introduction
6.2 Definitions
6.3 Topological Sort
6.4 Dijkstra Shortest Path Algorithm
6.5 Warshall Algorithm
6.6 Minimal Algorithm
6.7 Traversing a Graph
6.7.1 Depth-first Traversal
6.7.2 Breadth-first Traversal
6.8 Spanning Trees
6.9 Minimum-cost Spanning Tree
6.9.1 MST Property
6.9.2 Application of Minimum-cost Spanning Tree
6.10 Let us Sum up
6.11 Keywords
6.12 Questions for Discussion
6.13 Suggested Readings

6.0 AIMS AND OBJECTIVES


After studying this lesson, you should be able to:
• Discuss graph definitions
• Define topological sort
• Discuss shortest path algorithms
• Discuss minimum spanning tree

6.1 INTRODUCTION
Graphs are natural models used to represent arbitrary relationships among data objects. We often need to represent such arbitrary relationships among data objects while dealing with many problems in computer science, engineering, and many other disciplines. Therefore the study of graphs as one of the basic data structures is important.

This section presents the definition of a graph (both directed as well as undirected) and related terms. We will discuss various shortest path algorithms and the minimum spanning tree.

6.2 DEFINITIONS
A graph is a structure made of two components, a set of vertices V and a set of edges E. Therefore a graph is G = (V, E), where G is a graph. The graph may be directed or undirected. When the graph is directed, every edge of the graph is an ordered pair of vertices connected by the edge, whereas when the graph is undirected, every edge of the graph is an unordered pair of vertices connected by the edge. Given below in Figure 6.1 are structures which are graphs.

Undirected Graph G1, Directed Graph G2

Figure 6.1

Incident edge: If (vi, vj) is an edge, then edge (vi, vj) is said to be incident on vertices vi and vj. For example, in the graph G1 shown above in Figure 6.1 the edges incident on vertex 1 are (1,2), (1,4), and (1,3), whereas in G2 the edge incident on vertex 1 is (1,2).

Degree of vertex: It is the number of edges incident on the vertex. For example, in graph G1 shown above the degree of vertex 1 is 3, because 3 edges are incident on it. For a directed graph, we need to define indegree and outdegree.
Indegree of a vertex vi is the number of edges incident on vi, with vi as the head. Outdegree of vertex vi is the number of edges incident on vi, with vi as the tail. For the graph G2 shown, the indegree of vertex 2 is 1, whereas the outdegree of vertex 2 is 2.
Directed edge: A directed edge between the vertices vi and vj is an ordered pair, and is denoted as <vi, vj>.

Undirected edge: An undirected edge between the vertices vi and vj is an unordered pair, and is denoted as (vi, vj).

Path: A path between the vertices vp and vq is a sequence of vertices vp, vi1, vi2, ..., vin, vq such that there exists a sequence of edges (vp, vi1), (vi1, vi2), ..., (vin, vq). In the case of a directed graph, a path between the vertices vp and vq is a sequence of vertices vp, vi1, vi2, ..., vin, vq such that there exists a sequence of edges <vp, vi1>, <vi1, vi2>, ..., <vin, vq>. If there exists a path from vertex vp to vq in an undirected graph, then there always exists a path from vq to vp also. But in the case of a directed graph, if there exists a path from vertex vp to vq, it does not necessarily imply that there exists a path from vq to vp also.

Simple path: A simple path is a path given by a sequence of vertices in which all vertices are distinct, except possibly the first and the last. If the first and the last vertex are the same then the path is a cycle.

Maximum number of edges: The maximum number of edges in an undirected graph with n vertices is n(n - 1)/2, whereas in the case of a directed graph it is n(n - 1).

Subgraph

A subgraph of a graph G = (V, E) is a graph G1 where

1. V(G1) is a subset of V(G).
2. E(G1) consists of edges (vi, vj) in E(G) such that both vi and vj are in V(G1). (Note: if G = (V, E) is a graph, then V(G) is the set of vertices of G and E(G) is the set of edges of G.)

If E(G1) consists of all edges (vi, vj) in E(G) such that both vi and vj are in V(G1), then G1 is called an induced subgraph of G.
For example, the graph shown in Figure 6.2 is a subgraph of a graph shown in Figure 6.1.

Figure 6.2: Subgraph

For the graph shown below in Figure 6.3, one of the induced subgraphs is shown in Figure 6.4.

Figure 6.3: Graph G

Figure 6.4: Induced Subgraph of Graph G of Figure 6.3
In an undirected graph G, two vertices v1 and v2 are said to be connected if there exists a path in G from v1 to v2 (G being an undirected graph, there exists a path from v2 to v1 also).
Connected graph: A graph G is said to be connected if for every pair of distinct vertices (vi, vj) there is a path from vi to vj. Given below in Figure 6.5 is a graph which is connected.

Figure 6.5: Connected Graph

Completely connected graph: A graph G is completely connected if for every pair of distinct vertices (vi, vj) there exists an edge. Given below in Figure 6.6 is a graph which is completely connected.

Figure 6.6

6.3 TOPOLOGICAL SORT


An application: You have a set of tasks. You are also given a set of precedence relations: some tasks cannot be performed before others. How should you schedule the jobs without violating any precedence constraint?

Jobs -> nodes; precedence relations -> edges.

Evidently, if there is a cycle in the graph, there is no feasible plan.

When there is no cycle, "topological sorting" is an ordering of the vertices such that if there is a path from vi to vj, then vi occurs prior to vj in the plan.

Algorithm:
Find a vertex v with zero in-degree (one must exist when the graph has no cycle);

Print v, delete v and its outgoing edges;

Repeat.
This takes O(V^2) time.
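
The algorithm above can be sketched in C. The sketch assumes the graph is held in an adjacency matrix of at most MAXV vertices numbered from 0; MAXV and the name topological_sort are assumptions made for illustration. Instead of physically deleting a vertex, it marks the vertex as done and decrements the in-degree of its successors.

```c
#define MAXV 16

/* adj[u][v] != 0 means there is an edge u -> v (u must come before v).
   Stores a topological order in order[0..n-1] and returns 1, or
   returns 0 if the graph contains a cycle. Requires n <= MAXV. */
int topological_sort(int n, int adj[MAXV][MAXV], int order[])
{
    int indeg[MAXV], done[MAXV];
    int u, v, count;

    for (v = 0; v < n; v++) {
        indeg[v] = 0;
        done[v] = 0;
    }
    for (u = 0; u < n; u++)
        for (v = 0; v < n; v++)
            if (adj[u][v])
                indeg[v]++;

    for (count = 0; count < n; count++) {
        /* find a remaining vertex with zero in-degree */
        for (v = 0; v < n; v++)
            if (!done[v] && indeg[v] == 0)
                break;
        if (v == n)
            return 0;                /* none exists: there is a cycle */
        order[count] = v;            /* "print" v */
        done[v] = 1;                 /* "delete" v ... */
        for (u = 0; u < n; u++)
            if (adj[v][u])
                indeg[u]--;          /* ... and its outgoing edges */
    }
    return 1;
}
```

Each of the V rounds scans all V vertices and one row of the matrix, which gives the O(V^2) bound stated above.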

6.4 DIJKSTRA SHORTEST PATH ALGORITHM


E. W. Dijkstra developed an algorithm to determine the shortest path between two nodes in a graph. It is also possible to find the shortest paths from a given source node to all nodes in a graph at the same time, hence this problem is sometimes called the single-source shortest paths problem.
The shortest path problem may be expressed as follows:
Given a connected graph G = (V, E), with weighted edges and a fixed vertex s in V, find a shortest path from s to each vertex v in V. The weights assigned to the edges may represent distance, cost, effort or any other attribute that needs to be minimized in the graph.
A solution to this problem could be found by finding a spanning tree of the graph. The graph representing all the paths from one vertex to all the others must be a spanning tree: it must include all vertices. There will also be no cycles, as a cycle would define more than one path from the selected vertex to at least one other vertex.
The algorithm finds the routes in order of increasing cost. Let us assume that every cost is a positive number. The algorithm is equally applicable to a graph, a digraph, or even to a mixed graph with only some of its sides directed. If we consider a digraph, then every other case is fully covered as well, since an undirected side can be considered as two directed sides of equal cost, one for each direction.
The algorithm is based on the fact that every minimal path containing more than one side is the expansion of another minimal path containing one side less. This happens because all costs are considered as positive numbers. In this way, the first route D(1) found by the algorithm will be a one-arc route, that is, from the starting point to one of the vertices directly connected to this starting point. The next route D(2) will be a one-arc route itself, or a two-arc route, but in this case it will be an expansion of D(1).
Here is the algorithm.
1. Let V be the set of all the vertices of the graph and S be the set of all the vertices considered for the determination of the minimal path.
2. Set S = {}.
3. While there are still vertices in V - S:

i. Sort the vertices in V - S according to the current best estimate of their distance from the source.

ii. Add u, the closest vertex in V - S, to S.

iii. Re-compute the distances for the vertices in V - S.


Consider the following example for illustration. Find the shortest path from node X to node Y in the following graph. A label on an edge indicates the distance between the two nodes the edge connects.

Applying Dijkstra's algorithm:

1. S = {X}
2. Distances of all the nodes from the nodes in S:

XA = 8    XB = 3    XC = ∞
XD = ∞    XE = ∞    XY = ∞
3. Since the minimum distance from S to V - S is 3 (XB), S = {X, B} and E = {XB}.
4. Distances of all the nodes from the nodes in S:

XA = 8     XC = ∞     XD = ∞     XE = ∞     XY = ∞
XBA = ∞    XBC = 7    XBD = ∞    XBE = 8    XBY = 6
5. Since the minimum distance from S to V - S is 6 (XBY), S = {X, B, Y} and E = {XBY}.
6. Distances of all the nodes from the nodes in S:

XA = 8      XC = ∞      XD = ∞      XE = ∞
XBA = ∞     XBC = 7     XBD = ∞     XBE = 8
XBYA = ∞    XBYC = ∞    XBYD = ∞    XBYE = ∞
7. Continuing in a similar manner, we find that the shortest path between nodes X and Y is XBY with cost value 6.
Network Flow Problems
Find the shortest path from A to Z for the given graph:

Solution:
Initially P = {A} and
T = {B, C, D, E, Z}
The lengths of the different vertices (with respect to P) are:
L(B) = 1, L(C) = 4, L(D) = L(E) = L(Z) = ∞

So L(B) = 1 is the shortest value.

Now let P = {A, B} and T = {C, D, E, Z}
so L(C) = 3, L(D) = 8, L(E) = 6, L(Z) = ∞

So L(C) = 3 is the shortest value.

Now P = {A, B, C} and T' = {D, E, Z}
L'(D) = Min{8, ∞} = 8. L'(E) = Min{6, 3 + 1} = 4
L'(Z) = Min{∞, 3 + ∞} = ∞
Successive observations are shown in the following figures:

[Figures: successive labelings of the graph. E is labeled 4 via (a, b, c); D is labeled 7 via (a, b, c, e); Z is labeled 10 via (a, b, c, e) and finally reached via (a, b, c, e, d)]

Thus the shortest path from A to Z is (A, B, C, E, D, Z) of length 9.
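
The algorithm of this section can be sketched in C for a small graph held in a cost matrix. Instead of sorting V - S on every round, the sketch scans for the closest vertex, which has the same effect. The constant NV and the names dijkstra, dist and prev are assumptions made for illustration. The graph of the example is not reproduced in the text, so the edge weights suggested in the usage note below are hypothetical, chosen only to be consistent with the labels in the worked solution.

```c
#include <limits.h>

#define NV 6                 /* number of vertices in this sketch */
#define INF INT_MAX          /* stands for "no path found yet" */

/* cost[u][v] is the positive weight of edge (u, v), or 0 if there is
   no such edge. On return, dist[v] is the length of a shortest path
   from src to v (INF if v is unreachable) and prev[v] is the vertex
   that precedes v on that path (-1 for src and unreachable vertices). */
void dijkstra(int cost[NV][NV], int src, int dist[NV], int prev[NV])
{
    int in_s[NV];            /* in_s[v] != 0 once v has been added to S */
    int i, u, v;

    for (v = 0; v < NV; v++) {
        dist[v] = INF;
        prev[v] = -1;
        in_s[v] = 0;
    }
    dist[src] = 0;

    for (i = 0; i < NV; i++) {
        /* pick u, the closest vertex in V - S */
        u = -1;
        for (v = 0; v < NV; v++)
            if (!in_s[v] && dist[v] != INF && (u == -1 || dist[v] < dist[u]))
                u = v;
        if (u == -1)
            break;           /* the remaining vertices are unreachable */
        in_s[u] = 1;

        /* re-compute the distances of the vertices still in V - S */
        for (v = 0; v < NV; v++)
            if (cost[u][v] > 0 && !in_s[v] && dist[u] + cost[u][v] < dist[v]) {
                dist[v] = dist[u] + cost[u][v];
                prev[v] = u;
            }
    }
}
```

With vertices A to Z numbered 0 to 5 in the order A, B, C, D, E, Z and the hypothetical undirected weights AB = 1, AC = 4, BC = 2, BD = 7, BE = 5, CE = 1, ED = 3, EZ = 6, DZ = 2, the call dijkstra(cost, 0, dist, prev) gives dist[5] = 9, and following prev back from Z yields the path A, B, C, E, D, Z.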

6.5 WARSHALL ALGORITHM
Given the adjacency matrix A, this algorithm produces the path matrix P.
1. [Initialization]
   P ← A
2. [Perform a pass]
   Repeat thru step 4 for k = 1(1)n
3. [Process rows]
   Repeat step 4 for i = 1(1)n
4. [Across columns]
   Repeat for j = 1(1)n
   P[i][j] ← P[i][j] ∨ (P[i][k] ∧ P[k][j])
5. [Exit]
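
A direct C rendering of these steps is sketched below, assuming a fixed number of vertices N numbered from 0; N and the function name warshall are assumptions made for illustration.

```c
#define N 4                  /* number of vertices in this sketch */

/* a is the adjacency matrix (a[i][j] != 0 when there is an edge i -> j).
   On return p[i][j] is 1 if there is a path of one or more edges from
   i to j, and 0 otherwise. */
void warshall(int a[N][N], int p[N][N])
{
    int i, j, k;

    for (i = 0; i < N; i++)             /* step 1: P <- A */
        for (j = 0; j < N; j++)
            p[i][j] = (a[i][j] != 0);

    for (k = 0; k < N; k++)             /* step 2: allow k as an intermediate */
        for (i = 0; i < N; i++)         /* step 3: process rows */
            for (j = 0; j < N; j++)     /* step 4: across columns */
                p[i][j] = p[i][j] || (p[i][k] && p[k][j]);
}
```

The three nested loops give a running time of O(n^3) for n vertices.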

6.6 MINIMAL ALGORITHM


Given the Adjacency Matrix B in which the zero elements are replaced by infinity or by some large
number, the matrix produced by the following algorithm shows the minimum length of paths
between the nodes. MIN is a function that selects the algebraic minimum of its two arguments.

[1] C ← B
[2] Repeat thru step 4 for k = 1(1)n
[3] Repeat thru step 4 for i = 1(1)n
[4] Repeat for j = 1(1)n
    C_ij = MIN(C_ij, C_ik + C_kj)
[5] Exit
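
The same algorithm can be sketched in C as shown below. The constant BIG stands for "some large number" and is an assumption of this sketch: it must exceed every real path length, and twice its value must still fit in an int. The names NODES, BIG and min_paths are hypothetical.

```c
#define NODES 4
#define BIG   1000000   /* replaces the zero (no edge) entries of the matrix */

/* On entry c holds edge lengths, with BIG where there is no edge.
   On exit c[i][j] is the minimum length of a path from i to j,
   or BIG if no such path exists. */
void min_paths(int c[NODES][NODES])
{
    for (int k = 0; k < NODES; k++)
        for (int i = 0; i < NODES; i++)
            for (int j = 0; j < NODES; j++)
                if (c[i][k] + c[k][j] < c[i][j])   /* C_ij = MIN(C_ij, C_ik + C_kj) */
                    c[i][j] = c[i][k] + c[k][j];
}
```

Since the diagonal entries also start at BIG, c[i][i] stays BIG unless node i lies on a cycle.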

6.7 TRAVERSING A GRAPH


This section presents the methods of traversing a graph (directed as well as undirected). It also
describes algorithms for traversing graphs.

6.7.1 Depth-first Traversal

A graph can be traversed by using either the depth first traversal or the breadth first traversal. When a
graph is traversed by visiting the nodes in the forward (deeper) direction as long as possible, the
traversal is called depth first traversal. For example, for the graph shown in Figure 6.7, the depth first
traversal starting at the vertex v1 visits the nodes in the order shown in Figure 6.7 itself.

Figure 6.7: Graph G and its Depth First Traversals Starting at Vertex v1
Some of the depth first traversal orders are:
(, v,%Y,v,vr%%yrv,
(ir) vr vs v1 y6yev7v, v, %

The procedure for depth first traversal of a graph is given below. The procedure makes use of a global
array visited of n elements, where n is the number of vertices of the graph, and the elements are
boolean. If visited[i] = true then it means that the i-th vertex is visited. Initially we set
visited[i] = false for every vertex, therefore:
for (i = 1; i <= n; i++)
    visited[i] = false;
for (i = 1; i <= n; i++)
    if (visited[i] == false) dfs(i);

void dfs(node x)
{
    visited[x] = true;
    for every y adjacent to x do
        if (visited[y] == false)
            dfs(y);
}

If the graph G to which the dfs is applied is represented by using adjacency lists, then the vertices y
adjacent to x can be determined by following the list of adjacent vertices for each vertex. Therefore the
loop searching for adjacent vertices has the total cost of d1 + d2 + ... + dn, where di is the degree of
vertex vi, because the number of nodes in the adjacency list of vertex vi is di. If the graph G has n
vertices and e edges, then the sum of the degrees of the vertices, i.e., (d1 + d2 + ... + dn), is 2e.
Therefore there are a total of 2e list nodes in the adjacency lists of G (if G is a directed graph then
there are only e list nodes in total). The algorithm examines each node in the adjacency lists at most
once. Hence the time required to complete the search is O(e), provided n <= e. If an adjacency matrix
is used instead of adjacency lists to represent the graph G, then the time required to determine all
adjacent vertices of a vertex is O(n), and since at most n vertices are visited, the total time required
is O(n^2).
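
The dfs procedure can be written as a compilable C sketch as shown below. It assumes the adjacency matrix representation (hence O(n^2) time), vertices numbered 1..n as in the text, and it records the order of visits in an array. The type Graph, the capacity MAXV and the function names are hypothetical.

```c
#include <stdbool.h>

#define MAXV 16                       /* capacity chosen only for this sketch */

typedef struct {
    int  n;                           /* vertices are numbered 1..n */
    bool adj[MAXV + 1][MAXV + 1];     /* adj[u][v] is true if edge (u, v) exists */
} Graph;

/* Adds an undirected edge between u and v. */
void add_edge(Graph *g, int u, int v)
{
    g->adj[u][v] = true;
    g->adj[v][u] = true;
}

/* Visits x, then recursively every unvisited vertex adjacent to x. */
static void dfs(const Graph *g, int x, bool visited[], int order[], int *count)
{
    visited[x] = true;
    order[(*count)++] = x;
    for (int y = 1; y <= g->n; y++)
        if (g->adj[x][y] && !visited[y])
            dfs(g, y, visited, order, count);
}

/* Traverses every component; stores the visiting order in order[]
   and returns the number of vertices visited. */
int dfs_all(const Graph *g, int order[])
{
    bool visited[MAXV + 1] = { false };
    int count = 0;
    for (int i = 1; i <= g->n; i++)
        if (!visited[i])
            dfs(g, i, visited, order, &count);
    return count;
}
```

The visited array is local to dfs_all here rather than global, which lets the same routines be reused on several graphs.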
When this procedure is applied to the graph of Figure 6.7, one of the orders in which the vertices
get visited is shown below:

V1   false  true   true   true   true   true   true   true   true   true
V2   false  false  true   true   true   true   true   true   true   true
V3   false  false  false  true   true   true   true   true   true   true
V4   false  false  false  false  false  false  false  false  true   true
V5   false  false  false  false  false  false  false  false  false  true
V6   false  false  false  false  false  false  false  true   true   true
V7   false  false  false  false  true   true   true   true   true   true
V8   false  false  false  false  false  true   true   true   true   true
V9   false  false  false  false  false  false  true   true   true   true

6.7.2 Breadth-first Traversal

When a graph is traversed by visiting all the vertices adjacent to a node/vertex first, the traversal is
called breadth first traversal. For example, for the graph shown below, one of the breadth first
traversals starting at the vertex v1 visits the nodes in the order shown below in Figure 6.8.

Figure 6.8: Breadth First Traversal of Graph G Starting at Vertex v1

Breadth first traversal order: v1 v2 v5 v3 v4 v7 v6 v8 v9

The procedure for breadth first traversal of a graph is given below. The procedure makes use of a
global array visited of n elements, where n is the number of vertices of the graph, and the elements are
boolean. If visited[i] = true then it means that the i-th vertex is visited. The procedure also makes use
of a queue, and the procedures addqueue and deletequeue are assumed to be available for adding a
vertex to the queue, and for deleting a vertex from the queue. Initially we set visited[i] = false,
therefore:
for (i = 1; i <= n; i++)
    visited[i] = false;

void bfs(node x)
{
    node y;
    addqueue(x);
    while (queue is not empty)
    {
        deletequeue(y);
        if (visited[y] == false)
        {
            visited[y] = true;
            for every i adjacent to y do
                if (visited[i] == false)
                    addqueue(i);
        }
    }
}

If the graph G to which the bfs is applied is represented by using adjacency lists, then the vertices
adjacent to x can be determined by following the list of adjacent vertices for each vertex. Therefore,
the loop searching for adjacent vertices has the total cost of d1 + d2 + ... + dn, where di is the degree
of vertex vi, because the number of nodes in the adjacency list of vertex vi is di. If the graph G has n
vertices and e edges, then the sum of the degrees of the vertices, i.e., (d1 + d2 + ... + dn), is 2e.
Therefore there are 2e list nodes in the adjacency lists of G (if G is a directed graph then there are
only e list nodes). Each vertex gets added to the queue exactly once, hence the loop "while queue is not
empty" is iterated at most n times. Hence, the time required to complete the search is O(e), provided
n <= e. If an adjacency matrix is used instead of adjacency lists to represent the graph G, then the time
required to determine all adjacent vertices of a vertex is O(n), and since every vertex gets added to the
queue exactly once, the total time required is O(n^2).
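
A compilable C sketch of breadth first traversal is given below. It assumes the adjacency matrix representation and replaces addqueue and deletequeue with a simple array queue. One design choice differs from the pseudo-code: a vertex is marked visited when it is added to the queue, which guarantees that each vertex really is queued exactly once. The type Graph, the capacity MAXV and the function names are hypothetical.

```c
#include <stdbool.h>

#define MAXV 16                       /* capacity chosen only for this sketch */

typedef struct {
    int  n;                           /* vertices are numbered 1..n */
    bool adj[MAXV + 1][MAXV + 1];     /* adj[u][v] is true if edge (u, v) exists */
} Graph;

/* Adds an undirected edge between u and v. */
void add_edge(Graph *g, int u, int v)
{
    g->adj[u][v] = true;
    g->adj[v][u] = true;
}

/* Breadth first traversal from start; stores the visiting order in
   order[] and returns the number of vertices reached. */
int bfs(const Graph *g, int start, int order[])
{
    bool visited[MAXV + 1] = { false };
    int queue[MAXV];                  /* each vertex enters the queue at most once */
    int head = 0, tail = 0, count = 0;

    visited[start] = true;
    queue[tail++] = start;            /* addqueue(start) */

    while (head < tail) {             /* while queue is not empty */
        int y = queue[head++];        /* deletequeue(y) */
        order[count++] = y;
        for (int i = 1; i <= g->n; i++)
            if (g->adj[y][i] && !visited[i]) {
                visited[i] = true;
                queue[tail++] = i;
            }
    }
    return count;
}
```

Only the vertices reachable from start are visited; calling bfs once per unvisited vertex would cover a graph with several components.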
When this procedure is applied to the graph of Figure 6.8, one of the orders in which the vertices
get visited is shown below:

v1   false  true   true   true   true   true   true   true   true   true
v2   false  false  true   true   true   true   true   true   true   true
v3   false  false  false  false  true   true   true   true   true   true
v4   false  false  false  false  false  true   true   true   true   true
v5   false  false  false  true   true   true   true   true   true   true
v6   false  false  false  false  false  false  false  true   true   true
v7   false  false  false  false  false  false  true   true   true   true
v8   false  false  false  false  false  false  false  false  true   true
v9   false  false  false  false  false  false  false  false  false  true

Figure 6.9

6.8 SPANNING TREES


This section presents the concept of a spanning tree. It also presents the concept of a weighted graph
and the minimum cost spanning tree for a weighted graph. It also discusses the properties of the
Minimum Spanning Tree (MST).

Depth-first Spanning Tree and Breadth-first Spanning Tree

If a graph G is connected, then the edges of G can be partitioned into two disjoint sets. One is a set of
tree edges, which we denote as set T, and the other is a set of back edges, which we denote as B. The
tree edges are precisely those edges which are followed during the depth-first traversal or during the
breadth-first traversal of the graph G. If we consider only the tree edges, we get a subgraph of G
containing all the vertices of G, and this subgraph is a tree called a spanning tree of the graph G. For
example, consider the graph shown below in Figure 6.10.

Figure 6.10: Graph G


One of the depth-first traversal orders for this graph is 1-2-3-4; hence the tree edges are (1,2), (2,3) and
(3,4). Therefore one of the spanning trees obtained using depth-first traversal of the graph of Figure
6.10 is shown in Figure 6.11.

Figure 6.11: Depth-first Spanning Tree of the Graph of Figure 6.10

Similarly, one of the breadth-first traversal orders for this graph is 1-2-4-3; hence the tree edges are
(1,2), (1,4) and (4,3). Therefore one of the spanning trees obtained using breadth-first traversal of the
graph is shown in Figure 6.12.

Figure 6.12: Breadth-first Spanning Tree of the Graph of Figure 6.10


The procedure for obtaining the depth first spanning tree is given below:
T = ∅; /* initially the set of tree edges is empty */
void dfst(node v)
{
    if (visited[v] == false)
    {
        visited[v] = true;
        for every i adjacent to v do
        {
            if (visited[i] == false)
            {
                T = T ∪ {(v, i)};
                dfst(i);
            }
        }
    }
}
If a graph G is not connected, then the tree edges, which are precisely those edges followed during the
depth-first traversal of the graph G, constitute the depth-first spanning forest. The depth-first
spanning forest is made of trees, each of which spans one of the connected components of graph G.

When a graph G is directed, the tree edges, which are precisely those edges followed during the
depth-first traversal of the graph G, form a depth-first spanning forest for G. In addition to these, there
are three other types of edges. These are called back edges, forward edges, and cross edges. An edge
A → B is called a back edge if B is an ancestor of A in the spanning forest. A non-tree edge that goes from
a vertex to a proper descendant is called a forward edge. An edge which goes from a vertex to another
vertex that is neither an ancestor nor a descendant is called a cross edge. An edge from a vertex to itself
is a back edge.

For example, consider the directed graph G shown below in Figure 6.13.

Figure 6.13: A Directed Graph G


The depth-first spanning forest for the graph G of Figure 6.13 is shown in Figure 6.14.

Figure 6.14: Depth-first Spanning Forest for the Graph G of Figure 6.13

Consider the graph shown below in Figure 6.15.

Figure 6.15: A Graph G

If we apply the procedure dfst to this graph, one of the depth-first spanning trees that we get by
starting with vertex 1 is shown below in Figure 6.16.

Figure 6.16: Depth-first Spanning Tree of the Graph G of Figure 6.15

6.9 MINIMUM-COST SPANNING TREE

When the edges of the graph have weights representing the cost in some suitable terms, we can
obtain the spanning tree of the graph whose cost is minimum in terms of the weights of the edges. For
this, we start with the edge having the minimum cost/weight, add it to set T, and mark it visited. We
next consider the edge with minimum cost which is not yet visited, add it to T, and mark it visited.
While adding an edge to the set T, we first check whether both the vertices of the edge are already
connected by edges in T; if they are, we do not add the edge to the set T, because it would form a cycle.
For example, consider the graph shown below in Figure 6.17.

Figure 6.17: A Graph G

The minimum-cost spanning tree of the graph of Figure 6.17 is shown below in Figure 6.18.

Figure 6.18: The Minimum-cost Spanning Tree of Graph G of Figure 6.17

6.9.1 MST Property


Let G = (V, E) be a connected graph with a cost function defined on the edges. Let U be some proper
subset of the set of vertices V. If (u, v) is an edge of lowest cost such that u is in U and v is in V - U,
then there is a minimum cost spanning tree that includes edge (u, v). Many of the methods of
constructing a minimum cost spanning tree use this property.

Prim's Algorithm
Let G = (V, E) be a weighted graph, and suppose V = {1, 2, ..., n}. Prim's algorithm begins with a set
U initialized to {1}, and at each stage finds the shortest edge (u, v) that connects u in U and v in V - U,
and then adds v to U. It repeats this step until U = V.

void mcost(graph G, set of edges T)
{
    set of vertices U;
    vertex u, v;

    T = ∅;
    U = {1};
    while (U ≠ V) do
    {
        find the lowest cost edge (u, v)
            such that u is in U
            and v is in V - U;
        add (u, v) to T;
        add v to U;
    }
}
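
The procedure mcost can be sketched in C as follows. The sketch assumes a cost matrix in which 0 means "no edge", vertices numbered from 0 (so U starts as {0} rather than {1}), and a straightforward scan over all edges leaving U at each stage. The names NV and prim are hypothetical.

```c
#include <stdbool.h>

#define NV 5   /* number of vertices, fixed only for this sketch */

/* cost[u][v] is the weight of edge (u, v), or 0 if there is no edge.
   Stores the NV - 1 chosen edges in tree[][] and returns their total
   cost, or -1 if the graph is not connected. */
int prim(int cost[NV][NV], int tree[NV - 1][2])
{
    bool in_u[NV] = { false };
    int total = 0;

    in_u[0] = true;                               /* U = {first vertex} */
    for (int e = 0; e < NV - 1; e++) {
        int bu = -1, bv = -1;
        /* find the lowest cost edge (u, v) with u in U and v in V - U */
        for (int u = 0; u < NV; u++) {
            if (!in_u[u])
                continue;
            for (int v = 0; v < NV; v++)
                if (!in_u[v] && cost[u][v] != 0 &&
                    (bu == -1 || cost[u][v] < cost[bu][bv])) {
                    bu = u;
                    bv = v;
                }
        }
        if (bu == -1)
            return -1;                            /* no edge leaves U */
        tree[e][0] = bu;                          /* add (u, v) to T */
        tree[e][1] = bv;
        in_u[bv] = true;                          /* add v to U */
        total += cost[bu][bv];
    }
    return total;
}
```

This direct scan takes O(n^3) time in the worst case; keeping, for each vertex outside U, its cheapest edge into U reduces the work to O(n^2).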

6.9.2 Application of Minimum-cost Spanning Tree


A property of a spanning tree of a graph G is that a spanning tree is a minimal connected subgraph of
G (by minimal we mean the one with the fewest number of edges). Therefore, if the nodes of G
represent cities and the edges represent possible communication links connecting two cities, then the
spanning trees of the graph G represent all feasible choices of the communication network. If each
edge has a weight representing cost measured in some suitable terms (like cost of construction or
distance etc.), then the minimum-cost spanning tree of G is the selection of the required
communication network.

Check Your Progress

1. What is a minimum cost spanning tree?
2. What is depth-first traversal?

6.10 LET US SUM UP


• A graph G consists of two non-empty sets E(G) and V(G), where V(G) is a set of vertices and
  E(G) is a set of edges connecting those vertices.
• A graph is a superset of a tree. Every tree is a graph, but every graph is not necessarily a tree.
• A graph in which every edge is directed is called a directed graph or digraph. A graph in which
  every edge is undirected is called an undirected graph.
• There are two methods to traverse a graph:
  (i) Depth-first search
  (ii) Breadth-first search
• A spanning tree is a tree obtained from a graph which covers all its vertices.

6.11 KEYWORDS
Digraph: A graph in which every edge is directed.
Undirected Graph: A graph in which every edge is undirected.

Null Graph: A graph containing only isolated nodes.

Spanning Tree: A tree obtained from a graph which covers all its vertices.

Minimum Spanning Tree: The spanning tree, out of the set of spanning trees of a graph, which has minimum weight.

6.12 QUESTIONS FOR DISCUSSION

1. What is a graph? Compare graphs with trees.
2. Define these graphs:
   (i) Undirected graphs
   (ii) Directed graphs
3. Explain Warshall's algorithm for finding the path matrix of a graph given its adjacency
   matrix.
4. What do you mean by traversal of any graph?
5. Write a depth first search algorithm for the traversal of any graph. Write a "C" program for the
   same. Explain your algorithm's time complexity with the help of an example.
6. Explain the breadth first search algorithm for the traversal of any graph with suitable examples.
   Define the time complexity of the algorithm. Write a "C" program for the same.
7. Write Prim's algorithm for finding a minimal spanning tree of any graph. Find the minimal
   spanning trees of the graphs of the previous questions by Prim's algorithm.
8. By considering the complete graph with n vertices, show that the number of spanning trees is at
   least 2^(n-1) - 1.
9. Prove that when DFS and BFS are applied to a connected graph, the edges followed during the
   traversal form a tree.
10. What do you understand by the shortest path from one node to another in a weighted graph? Write
    Dijkstra's algorithm to find the shortest path in a weighted graph. Find the shortest path from 3
    to T using Dijkstra's algorithm in the following graphs:
    (i)

11. Find the minimum distance between the nodes A and F in the following graph.

A B C D E F
A 0 (50) 0 (s3) 0 gi
B 04\ 0 0 (13) (ri (40)

C (10) (15) 0 0 Q4) 0

D 0 (11 (1e) 0 11) 0

E (12) (20) 0 0 (42)

F 0 13) (2 r) (3 1) 0
12. Obtain a spanning tree for the following graph.

13. Obtain the minimum spanning tree for the following graph. The numbers in the parentheses are
the costs of the corresponding edges.

A B C D E F
A 0 (60) 0 (s3) 0 (41)

B (14) 0 (1 3) (3e) (40)

C (10) (1s) 0 0 Q4) 0

D 0 11 (1e) 0 0t 0

E 02\ 0 Q0) 0 0 @2)

F U (13) (2t\ (3 1) 0 0

Check Your Progress: Model Answers


1. When the edges of the graph have weights representing the cost in some suitable terms, then
we can obtain the spanning tree of the graph whose cost is minimum in terms of the weights
of the edges.
2. When a graph is traversed by visiting the nodes in the forward (deeper) direction as long as
possible, the traversal is called depth first traversal.

6.13 SUGGESTED READINGS


Shi-Kuo Chang, Data Structures and Algorithms, World Scientific
Birkhauser-Boston, An Introduction to Data Structures and Algorithms, Springer-New York
UNIT V
LESSON

7
ALGORITHM DESIGN TECHNIQUES
CONTENTS
7.0 Aims and Objectives
7.1 Introduction
7.2 Greedy Algorithms
7.2.1 A Simple Scheduling Problem
7.2.2 Huffman Codes
7.3 Divide and Conquer
7.3.1 Running Time of Divide and Conquer Algorithms
7.3.2 Closest-points Problem
7.3.3 Selection Problem
7.3.4 Theoretical Improvements for Arithmetic Problems
7.4 Let us Sum up
7.5 Keywords
7.6 Questions for Discussion
7.7 Suggested Readings

7.0 AIMS AND OBJECTIVES

After studying this lesson, you should be able to:
• Discuss greedy algorithms
• Describe the divide and conquer algorithms
• Explain the closest-points problem
• Define the selection problem
• Analyse the theoretical improvements for arithmetic problems

7.1 INTRODUCTION
In this lesson, we will discuss the design of algorithms. We will focus on some common types of
algorithms used to solve problems. For many problems, it is quite likely that at least one of these
methods will work.

7.2 GREEDY ALGORITHMS


First, we discuss greedy algorithms. Greedy algorithms work in phases. In each phase, a decision is
made that appears to be good, without regard for future consequences. Generally, this means that some
local optimum is chosen. This "take what you can get now" strategy is the source of the name for this
class of algorithms. When the algorithm terminates, we hope that the local optimum is equal to the
global optimum. If this is the case, then the algorithm is correct; otherwise, the algorithm has produced
a suboptimal solution. If the absolute best answer is not required, then simple greedy algorithms are
sometimes used to generate approximate answers, rather than using the more complicated algorithms
generally required to generate an exact answer.

The most obvious real-life example of a greedy algorithm is the coin-changing problem. To make
change in U.S. currency, we repeatedly dispense the largest denomination. Thus, to give out
seventeen dollars and sixty-one cents in change, we give out a ten-dollar bill, a five-dollar bill, two
one-dollar bills, two quarters, one dime, and one penny. By doing this, we are guaranteed to minimize
the number of bills and coins. This algorithm does not work in all monetary systems, but fortunately,
we can prove that it does work in the American monetary system. Indeed, it works even if
two-dollar bills and fifty-cent pieces are allowed.
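
The coin-changing rule can be sketched in C as follows. Amounts are kept in cents to avoid floating-point rounding. The denomination table (ten-dollar, five-dollar and one-dollar bills, quarter, dime, nickel, penny) and the names denom and make_change are assumptions of this sketch.

```c
enum { NDENOM = 7 };

/* Denominations in cents, largest first. */
static const int denom[NDENOM] = { 1000, 500, 100, 25, 10, 5, 1 };

/* Greedy change-making: repeatedly dispense the largest denomination.
   counts[i] receives how many pieces of denom[i] are given out;
   the return value is the total number of bills and coins. */
int make_change(int cents, int counts[NDENOM])
{
    int pieces = 0;
    for (int i = 0; i < NDENOM; i++) {
        counts[i] = cents / denom[i];   /* as many of this denomination as fit */
        cents %= denom[i];              /* what is still owed */
        pieces += counts[i];
    }
    return pieces;
}
```

For 1761 cents the routine gives one ten, one five, two ones, two quarters, one dime and one penny, eight pieces in all, which matches the example in the text.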
Another real-life example is traffic problems, where making locally optimal choices does not
always work. For instance, during certain rush hour times in Miami, it is best to stay off the prime
streets even if they look empty, because traffic will come to a standstill a mile down the road, and you
will be stuck. Even more surprising, it is better in some cases to make a temporary detour in the
direction opposite your destination in order to avoid all traffic bottlenecks.
Now, we will discuss some applications that use greedy algorithms. The first application that will be
discussed is a simple scheduling problem. Virtually all scheduling problems are either NP-complete
(or of similarly difficult complexity) or are solvable by a greedy algorithm. The second application
that we will discuss is file compression, and is one of the earliest results in computer science.
Finally, we will discuss an example of a greedy approximation algorithm.

7.2.1 A Simple Scheduling Problem


In the simple scheduling problem we are given jobs j1, j2, ..., jn, all with known running
times t1, t2, ..., tn respectively, and a single processor. We need to find the best way to
schedule these jobs so as to minimize the average completion time.

Job    Time
J1     16
J2     8
J3     3
J4     14

Scheduling

Job               J1   J2   J3   J4
Completion time   16   20   28   40

Average completion time = (16 + 20 + 28 + 40)/4 = 26

Job               J3   J2   J4   J1
Completion time    3   12   24   40

Average completion time = (3 + 12 + 24 + 40)/4 = 19.75
The simple scheduling problem has the following properties:

• Greedy-choice property: If the shortest job j3 does not go first, the y jobs before it will each finish
  3 time units sooner, but j3 will be delayed by the time needed to finish all the jobs before it.

• Optimal substructure: If the shortest job is removed from an optimal solution, the remaining
  solution for the n-1 jobs is optimal.
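
The effect of scheduling the shortest job first can be checked with a short C sketch. The running times used in the usage note (15, 8, 3 and 10) are illustrative values assumed for this sketch, not the values of the table above; the function names are hypothetical.

```c
#include <stdlib.h>

/* Comparison function for qsort: ascending order of running time. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Average completion time when the jobs run in the order given by t[]. */
double avg_completion(const int t[], int n)
{
    long finish = 0, total = 0;
    for (int i = 0; i < n; i++) {
        finish += t[i];      /* completion time of the i-th job in the schedule */
        total += finish;
    }
    return (double)total / n;
}

/* Greedy schedule: sort t[] so the shortest job runs first, then evaluate. */
double sjf_avg_completion(int t[], int n)
{
    qsort(t, (size_t)n, sizeof t[0], cmp_int);
    return avg_completion(t, n);
}
```

With running times {15, 8, 3, 10}, the given order has completion times 15, 23, 26, 36 and an average of 25, while the shortest-job-first order has completion times 3, 11, 21, 36 and an average of 17.75.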

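Optimality Proof
Let the jobs be scheduled in the order i_1, i_2, ..., i_N. The total cost of the schedule is

$$C = t_{i_1} + (t_{i_1} + t_{i_2}) + (t_{i_1} + t_{i_2} + t_{i_3}) + \cdots + (t_{i_1} + t_{i_2} + \cdots + t_{i_N}) = \sum_{k=1}^{N} (N - k + 1)\, t_{i_k}$$

$$C = (N + 1) \sum_{k=1}^{N} t_{i_k} - \sum_{k=1}^{N} k\, t_{i_k}$$

• The first term is independent of the ordering; as the second term increases, the total cost becomes
  smaller. Assume that there is a job ordering such that x > y and t_{i_x} < t_{i_y}. Swapping the two
  jobs (smaller first) increases the second term, decreasing the total cost.

Show: $x\, t_{i_x} + y\, t_{i_y} < y\, t_{i_x} + x\, t_{i_y}$

$$\begin{aligned}
x\, t_{i_x} + y\, t_{i_y} &= x\, t_{i_x} + y\, t_{i_x} + y\,(t_{i_y} - t_{i_x}) \\
&= y\, t_{i_x} + x\, t_{i_x} + y\,(t_{i_y} - t_{i_x}) \\
&< y\, t_{i_x} + x\, t_{i_x} + x\,(t_{i_y} - t_{i_x}) \\
&= y\, t_{i_x} + x\, t_{i_x} + x\, t_{i_y} - x\, t_{i_x} = y\, t_{i_x} + x\, t_{i_y}
\end{aligned}$$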
7.2.2 Huffman Codes
Now, we consider a second application of greedy algorithms, known as file compression.

The normal ASCII character set consists of roughly 100 "printable" characters. To distinguish these
characters, 7 bits are required. Seven bits allow the representation of 128 characters, so the ASCII
character set adds some other "nonprintable" characters. An eighth bit is added as a parity check.
Assume we have a file that contains only the characters a, e, i, s, t, plus blank spaces and newlines.
Assume further that the file has ten a's, fifteen e's, twelve i's, three s's, four t's, thirteen blanks, and
one newline. As the table in Figure 7.1 shows, this file requires 174 bits to represent, since there are 58
characters and each character requires three bits.

Character   Code   Frequency   Total Bits

a           000    10          30
e           001    15          45
i           010    12          36
s           011    3           9
t           100    4           12
space       101    13          39
newline     110    1           3

Total                          174

Figure 7.1: Using a Standard Coding Scheme


In real life, files can be quite large. Many of the very large files are the output of some program,
and there is usually a big disparity between the most frequent and least frequent characters. For
example, many large data files have an inordinately large number of digits, blanks, and newlines, but
few q's and x's. We may be interested in reducing the file size in the case where we are transmitting
it over a slow phone line. Also, since on virtually every machine disk space is precious, one might
wonder if it would be possible to provide a better code and reduce the total number of bits required.
The answer is that this is possible, and a simple strategy achieves 25 percent savings on typical large
files and as much as 50 to 60 percent savings on many large data files. The general strategy is to allow
the code length to vary from character to character and to ensure that the frequently occurring
characters have short codes. Notice that if all the characters occur with the same frequency, then there
are not likely to be any savings.
Huffman's Algorithm
Compressing data is an important technique in computing. Data sent over the Internet needs to be
sent as compactly as possible. There are many different schemes for compressing data, but one
particular scheme makes use of a greedy algorithm, namely Huffman coding. This algorithm is named
for the late David Huffman, an information theorist and computer scientist who invented the
technique in the 1950s. Data compressed using a Huffman code can achieve savings of 20% to 90%.
When data are compressed, the characters that make up the data are usually translated into some
other representation to save space. A typical compression scheme is to translate each character to a
binary character code, or bit string.
For instance, we can encode the character "a" as 000, the character "b" as 001, the character "c" as 010,
and so on. This is known as a fixed-length code. The Huffman code algorithm takes a string of
characters, translates them to a variable-length binary string, and creates a binary tree for the
purpose of decoding the binary strings. The path to each left child is assigned the binary character 0
and each right child is assigned the binary character 1. The algorithm works as follows:

Start with a string of characters you would like to compress. For each character in the string, compute
its frequency of occurrence in the string. Then sort the characters into order from lowest frequency
to highest frequency. Take the two characters with the smallest frequencies and create a node with
each character (and its frequency) as children of the node. The parent node's data element consists of
the sum of the frequencies of the two child nodes. Insert the node back into the list, keeping the list
ordered by frequency. Continue this process until every character is placed into the tree. When this
process is complete, you have a binary tree that can be used to decode the Huffman code. Decoding
involves following a path of 0s and 1s until you get to a leaf node, which will contain a character.

7.3 DIVIDE AND CONQUER

Divide and conquer is another common technique used to design algorithms. Divide and conquer
algorithms consist of two parts:
Divide: Smaller problems are solved recursively.
Conquer: The solution to the original problem is then formed from the solutions to the subproblems.

Traditionally, routines in which the text contains at least two recursive calls are called divide
and conquer algorithms, whereas routines whose text contains only one recursive call are not. We
generally insist that the subproblems be disjoint (that is, essentially nonoverlapping). Let us review
some of the recursive algorithms that have been covered in this text.
We have already seen several divide and conquer algorithms. In lesson 3, we saw tree traversal
strategies. In lesson 5, we saw the classic examples of divide and conquer, namely mergesort and
quicksort, which have O(n log n) worst-case and average-case bounds, respectively.
Lesson 6 showed routines to recover the shortest path in Dijkstra's algorithm and other procedures to
perform depth-first search in graphs. None of these algorithms are really divide and conquer
algorithms, because only one recursive call is performed.
Now, we will see more examples of the divide and conquer paradigm. Our first application is a problem
in computational geometry. Given n points in a plane, we will show that the closest pair of points
can be found in O(n log n) time. The rest of the discussion presents some extremely interesting, but
mostly theoretical, results. We provide an algorithm which solves the selection problem in O(n)
worst-case time. We also show that two n-bit numbers can be multiplied in o(n^2) operations and that
two n x n matrices can be multiplied in o(n^3) operations. Unfortunately, even though these algorithms
have better worst-case bounds than the conventional algorithms, none are practical except for very
large inputs.

7.3.1 Running Time of Divide and Conquer Algorithms


All the efficient divide and conquer algorithms divide the problem into subproblems, each of
which is some fraction of the original problem, and then perform some additional work to
compute the final answer. As an example, we have seen that merge sort operates on two problems,
each of which is half the size of the original, and then uses O(n) additional work. This yields
the running time equation (with appropriate initial conditions)
T(n) = 2T(n/2) + O(n)

7.3.2 Closest-points Problem


• Given n points in d dimensions, find the two whose mutual distance is smallest.
• It is a fundamental problem in many applications as well as a key step in many algorithms.
• A naive algorithm takes O(dn^2) time.
• Element uniqueness reduces to closest pair, so there is an Ω(n log n) lower bound.
• We will develop a divide-and-conquer based O(n log n) algorithm; the dimension d is assumed constant.

1-Dimension Problem
• The 1D problem can be solved in O(n log n) time by sorting.
• Sorting, however, does not generalize to higher dimensions. So, let us develop a divide-and-conquer
  algorithm for 1D.
• Divide the points S into two sets S1, S2 by some x-coordinate so that p < q for all p ∈ S1 and
  q ∈ S2.
• Recursively compute the closest pair (p1, p2) in S1 and (q1, q2) in S2.
• Let δ be the smallest separation found so far:

  δ = min(|p2 - p1|, |q2 - q1|)

1D Divide & Conquer

• The closest pair is {p1, p2}, or {q1, q2}, or some {p3, q3} where p3 ∈ S1 and q3 ∈ S2.
• If m is the dividing coordinate, then p3 and q3 must both be within δ of m.
• In 1D, p3 must be the rightmost point of S1 and q3 the leftmost point of S2, but these notions do
  not generalize to higher dimensions.
• How many points of S1 can lie in the interval (m - δ, m]?
• By definition of δ, at most one. The same holds for S2.

1D Divide & Conquer

• Closest-Pair(S).
• If |S| = 1, output δ = infinity. If |S| = 2, output δ = |p2 - p1|. Otherwise, perform the following
  steps:

  1. Let m = median(S).
  2. Divide S into S1, S2 at m.
  3. δ1 = Closest-Pair(S1).
  4. δ2 = Closest-Pair(S2).
  5. δ12 is the minimum distance across the cut.
  6. Return δ = min(δ1, δ2, δ12).

• Recurrence is T(n) = 2T(n/2) + O(n), which solves to T(n) = O(n log n).
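
The 1D Closest-Pair procedure can be sketched in C as follows. As a simplification assumed for this sketch, the points are sorted once at the start, so that "divide at the median" becomes "split the sorted array in the middle" and the distance across the cut is the gap between the two middle points. The function names are hypothetical.

```c
#include <stdlib.h>
#include <float.h>

/* Comparison function for qsort: ascending order of coordinate. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Closest-pair distance among the sorted points p[lo..hi]. */
static double closest_sorted(const double p[], int lo, int hi)
{
    int n = hi - lo + 1;
    if (n <= 1)
        return DBL_MAX;                              /* delta = infinity */
    if (n == 2)
        return p[hi] - p[lo];

    int mid = lo + (hi - lo) / 2;                    /* split at the median */
    double d1 = closest_sorted(p, lo, mid);          /* closest pair in S1 */
    double d2 = closest_sorted(p, mid + 1, hi);      /* closest pair in S2 */
    double d12 = p[mid + 1] - p[mid];                /* rightmost of S1 to leftmost of S2 */

    double d = d1 < d2 ? d1 : d2;
    return d12 < d ? d12 : d;
}

/* Sorts p[0..n-1] in place and returns the smallest distance between
   two of the points (DBL_MAX if there are fewer than two points). */
double closest_pair_1d(double p[], int n)
{
    qsort(p, (size_t)n, sizeof p[0], cmp_double);
    return closest_sorted(p, 0, n - 1);
}
```

For the points 7, 1, 4.5, 10 and 3 the routine returns 1.5, the gap between 3 and 4.5.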
2-D Closest Pair
• We partition S into S1, S2 by a vertical line l defined by the median x-coordinate in S.
• Recursively compute the closest pair distances δ1 and δ2. Set δ = min(δ1, δ2).
• Now compute the closest pair with one point each in S1 and S2.
• In each candidate pair (p, q), where p ∈ S1 and q ∈ S2, the points p, q must both lie within δ of l.
• At this point, complications arise which were not present in 1D. It is entirely possible that all n/2
  points of S1 (and S2) lie within δ of l.
• Naively, this would require n^2/4 calculations.
• We show that the points in P1, P2 (the δ strip around l) have a special structure, and solve the
  conquer step faster.

7.3.3 Selection Problem


In this problem, we are given an unordered list of elements and want to find the kth largest
element. An easy way of solving this problem is to first sort the list and then read off the kth
largest element. This takes O(n log n) time. Yet, presumably, finding only the kth largest element
should be easier than sorting the whole list. For instance, we could maintain a list of the k largest
elements and populate this list in O(n log k) time. When k is a small constant, this takes only linear
time. We will show that we can perform selection in linear time for an arbitrary k using a divide and
conquer method.
To get intuition for the solution, assume that we could find the median of a list in
linear time. We claim that we can then use this as a subroutine in a divide and conquer algorithm to
find the kth largest element. Specifically, we use the median to divide the list into two halves. Then
we recursively find the desired element in one of the halves (the first half if k ≤ n/2, and the
second half otherwise). This algorithm takes time cn at the first level of recursion for some constant c,
cn/2 at the next level (as we recurse on a list of size n/2), cn/4 at the third level, and so on. The total
time taken is cn + cn/2 + cn/4 + ... = 2cn = O(n).
Unfortunately, however, finding the median does not appear to be much easier than finding the kth
largest element. The main idea here is that for applying the recursion we do not need an exact median;
a near-median would do. In particular, assume we could find an element at every step such that at least
3/10 of the elements in the list are smaller than it and at least 3/10 of the elements are larger than it;
then we could still apply the same divide and conquer method as above. Assuming each divide step
takes linear time, our running time would become at most cn + (7/10)cn + (49/100)cn + ... = 3.33cn
= O(n).
Finally, it turns out that we can find a near-median in linear time by again applying recursion.
Specifically, we divide the list into groups of 5 elements each, find the median of each group in
constant time (as each group is of constant size), and then find the median of these medians
recursively. The main point to observe is that the final step of finding the median of medians applies
to a much smaller list, of size n/5, and so we still get a small enough running time.
This was just a rough description and analysis of the algorithm. A more formal analysis is given below.
For simplicity of analysis, we assume that all the list sizes we come across while running the
algorithm are divisible by 5.
Algorithm for Selection

1. Divide the list into n/5 lists of 5 elements each.
2. Find the median in each sublist of 5 elements.
3. Recursively find the median of all the medians; call it m.
4. Partition the list into elements larger than m (call this sublist L1) and those no larger than m (call
this sublist L2).
5. If k ≤ |L1|, return Selection(L1, k).
6. If k ≥ |L1| + 1, return Selection(L2, k - |L1|).
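The steps above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the names `select_kth_largest`, `sort_small` and `swap_int` are hypothetical, elements are plain `int` values, and the array is rearranged in place. Two details go beyond the steps as written: a final group of fewer than 5 elements is allowed, and elements equal to m are set aside in their own block so that the recursion always shrinks even when the list contains duplicates.

```c
#include <assert.h>
#include <stddef.h>

static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Insertion sort; only ever called on at most 5 elements (constant time). */
static void sort_small(int *a, size_t n) {
    for (size_t i = 1; i < n; i++)
        for (size_t j = i; j > 0 && a[j - 1] > a[j]; j--)
            swap_int(&a[j - 1], &a[j]);
}

/* Returns the kth largest element of a[0..n-1], where 1 <= k <= n.
   The array is permuted in place. */
int select_kth_largest(int *a, size_t n, size_t k) {
    assert(n > 0 && k >= 1 && k <= n);
    if (n <= 5) {                       /* small list: sort it directly */
        sort_small(a, n);
        return a[n - k];
    }
    /* Steps 1-2: find each group's median and move it to the front of a. */
    size_t groups = 0;
    for (size_t i = 0; i < n; i += 5) {
        size_t len = (n - i < 5) ? n - i : 5;
        sort_small(a + i, len);
        swap_int(&a[groups++], &a[i + len / 2]);
    }
    /* Step 3: m is the median of the medians, found recursively. */
    int m = select_kth_largest(a, groups, (groups + 1) / 2);

    /* Step 4: move elements larger than m (L1) to the front ... */
    size_t gt = 0;
    for (size_t i = 0; i < n; i++)
        if (a[i] > m)
            swap_int(&a[gt++], &a[i]);
    /* ... followed by the elements equal to m (assumption: handled
       separately so that duplicates cannot cause endless recursion). */
    size_t ge = gt;
    for (size_t i = gt; i < n; i++)
        if (a[i] == m)
            swap_int(&a[ge++], &a[i]);

    /* Steps 5-6: recurse into the part that contains the answer. */
    if (k <= gt)
        return select_kth_largest(a, gt, k);
    if (k <= ge)
        return m;
    return select_kth_largest(a + ge, n - ge, k - ge);
}
```

For example, calling `select_kth_largest(a, 12, 3)` on an array holding the values 1 through 12 in any order returns 10, the third largest value.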
The correctness of the algorithm is easy to argue, and we omit the argument. Let us analyse
the running time. Observe that we make two recursive calls. The first is on a list of size n/5. The
second is on either L1 or L2. We argue that these lists can be no larger than 7n/10 in size. This is
because there are n/10 medians at step 3 that are smaller than m, and there are three elements in
each of the sublists corresponding to these n/10 medians that are no larger than their medians, and
consequently no larger than m itself. As a result, L2 is of size at least 3n/10, and L1 is of size at most
n - |L2| ≤ 7n/10. Similarly, we can argue that L1 is of size at least 3n/10 and so L2 is of size at most
7n/10.
Thus we obtain the following recurrence for the running time of the algorithm:
T(n) = cn + T(n/5) + T(7n/10)
where cn is the time taken to build the list of medians and to partition the list into L1 and L2, for a
suitable constant c.

One way of solving this recurrence is to guess that the running time is T(n) = c'n and then check
whether the equation is satisfied for some value of c'. Substituting this into the equation we get
c'n = cn + (9/10)c'n
which implies c' = 10c, and hence T(n) = O(n).
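The coefficient 9/10 comes from adding the two recursive terms under the guess T(n) = c'n. A short worked derivation, using only the symbols already defined:

```latex
\begin{align*}
c'n &= cn + c'\cdot\frac{n}{5} + c'\cdot\frac{7n}{10}
     = cn + \left(\frac{2}{10} + \frac{7}{10}\right)c'n
     = cn + \frac{9}{10}\,c'n \\
\frac{1}{10}\,c'n &= cn
\;\Longrightarrow\; c' = 10c,
\qquad\text{so } T(n) = 10cn = O(n).
\end{align*}
```

The guess succeeds because the two subproblem fractions sum to less than one (1/5 + 7/10 = 9/10); if they summed to one or more, no linear bound of this form could satisfy the recurrence.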

7.3.4 Theoretical Improvements for Arithmetic Problems


Now we discuss a divide and conquer algorithm that multiplies two n-digit numbers. Our
earlier model of computation assumed that multiplication was performed in constant time, since the
numbers were small. For large numbers, this assumption is no longer valid. If we measure
multiplication in terms of the size of the numbers being multiplied, then the usual multiplication
algorithm takes quadratic time. The divide and conquer algorithm runs in subquadratic time. We
also present the classic divide and conquer algorithm that multiplies two n by n matrices in
subcubic time.
• Multiplying Integers
• Matrix Multiplication
Multiplying Integers
Let us consider multiplying two n-digit numbers x and y. If exactly one of x and y is negative, then
the answer is negative; otherwise it is positive.
If x = 61,438,521 and y = 94,736,407, then xy = 5,820,464,730,934,047. Let us divide x and y into two
halves, consisting of the most significant and least significant digits, respectively. Then x_l = 6,143, x_r
= 8,521, y_l = 9,473, and y_r = 6,407. We also have x = x_l 10^4 + x_r and y = y_l 10^4 + y_r. It follows that
xy = x_l y_l 10^8 + (x_l y_r + x_r y_l) 10^4 + x_r y_r

Observe that this equation consists of four multiplications, x_l y_l, x_l y_r, x_r y_l, and x_r y_r, which are each
half the size of the original problem (n/2 digits). The multiplications by 10^8 and 10^4 amount to the
placing of zeros. This and the subsequent additions add only O(n) additional work. If we perform
these four multiplications recursively using this algorithm, stopping at an appropriate base
case, then we obtain the recurrence

T(n) = 4T(n/2) + O(n)
We know that T(n) = O(n^2), so, unfortunately, we have not improved the algorithm. To achieve a
subquadratic algorithm, we must use fewer than four recursive calls. The key observation is that
x_l y_r + x_r y_l = (x_l - x_r)(y_r - y_l) + x_l y_l + x_r y_r
Therefore, rather than using two multiplications to compute the coefficient of 10^4, we can use one
multiplication, plus the results of two multiplications that have already been performed. It is easy to
see that the recurrence now satisfies

T(n) = 3T(n/2) + O(n),
and so we obtain T(n) = O(n^(log2 3)) = O(n^1.59). To complete the algorithm, we must have a base case,
which can be solved without recursion.
When both numbers are one-digit, we can do the multiplication by table lookup. If one number has
zero digits, then we return zero. In practice, if we were to use this algorithm, we would choose the
base case to be that which is most convenient for the machine.
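As a worked check on the example numbers given earlier (x_l = 6,143, x_r = 8,521, y_l = 9,473, y_r = 6,407), the three products and their recombination come out as follows:

```latex
\begin{align*}
x_l y_l &= 6143 \cdot 9473 = 58{,}192{,}639 \\
x_r y_r &= 8521 \cdot 6407 = 54{,}594{,}047 \\
(x_l - x_r)(y_r - y_l) &= (-2378)(-3066) = 7{,}290{,}948 \\
x_l y_r + x_r y_l &= 7{,}290{,}948 + 58{,}192{,}639 + 54{,}594{,}047 = 120{,}077{,}634 \\
xy &= 58{,}192{,}639 \cdot 10^{8} + 120{,}077{,}634 \cdot 10^{4} + 54{,}594{,}047
    = 5{,}820{,}464{,}730{,}934{,}047
\end{align*}
```

Only three multiplications of four-digit numbers were needed; the coefficient of 10^4 was obtained from them by additions alone.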

Although this algorithm has better asymptotic performance than the standard quadratic algorithm, it
is rarely used, because for small n the overhead is significant, and for larger n there are even better
algorithms. These algorithms also make extensive use of divide and conquer.
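The three-multiplication scheme of this subsection can be sketched in C as follows. This is only an illustration under stated assumptions: the names `karatsuba` and `pow10u` are hypothetical, the inputs are non-negative and have at most 8 decimal digits so that every intermediate value fits in a 64-bit unsigned integer, and the machine's own multiplication stands in for the table lookup of the base case. A real large-number routine would work on arrays of digits instead.

```c
#include <stdint.h>

/* Returns 10^e for a small exponent e. */
static uint64_t pow10u(unsigned e) {
    uint64_t p = 1;
    while (e-- > 0)
        p *= 10;
    return p;
}

/* Multiplies x and y, each assumed to have at most `digits` decimal
   digits (digits <= 8), using three recursive multiplications. */
uint64_t karatsuba(uint64_t x, uint64_t y, unsigned digits) {
    if (digits <= 1)
        return x * y;                 /* base case: one-digit product */

    unsigned half = digits / 2;
    uint64_t p = pow10u(half);
    uint64_t xl = x / p, xr = x % p;  /* x = xl * 10^half + xr */
    uint64_t yl = y / p, yr = y % p;  /* y = yl * 10^half + yr */

    uint64_t ll = karatsuba(xl, yl, half);
    uint64_t rr = karatsuba(xr, yr, half);

    /* (xl - xr)(yr - yl) may be negative, so multiply the magnitudes
       and keep track of the sign separately. */
    int negative = (xl < xr) != (yr < yl);
    uint64_t dx = (xl > xr) ? xl - xr : xr - xl;
    uint64_t dy = (yr > yl) ? yr - yl : yl - yr;
    uint64_t d = karatsuba(dx, dy, half);

    /* xl*yr + xr*yl = (xl - xr)(yr - yl) + xl*yl + xr*yr */
    uint64_t mid = negative ? ll + rr - d : ll + rr + d;

    return ll * p * p + mid * p + rr;
}
```

Calling `karatsuba(61438521, 94736407, 8)` reproduces the product 5,820,464,730,934,047 from the example above. Tracking the sign separately is one possible design choice; it keeps all arithmetic unsigned, at the cost of a few comparisons per call.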

Matrix Multiplication
A fundamental numerical problem is the multiplication of two matrices. Figure 7.2 gives a simple O(n^3)
algorithm to compute C = AB, where A, B, and C are n by n matrices. The algorithm follows
directly from the definition of matrix multiplication. To compute C_{i,j}, we compute the dot
product of the ith row in A with the jth column in B. As usual, arrays begin at index 0.

For a long time it was assumed that O(n^3) was required for matrix multiplication. However, in the late
sixties Strassen showed how to break the O(n^3) barrier. The basic idea of Strassen's
algorithm is to divide each matrix into four quadrants, as shown in Figure 7.3. Then it is easy to show
that
C_{1,1} = A_{1,1} B_{1,1} + A_{1,2} B_{2,1}
C_{1,2} = A_{1,1} B_{1,2} + A_{1,2} B_{2,2}
C_{2,1} = A_{2,1} B_{1,1} + A_{2,2} B_{2,1}
C_{2,2} = A_{2,1} B_{1,2} + A_{2,2} B_{2,2}


/* Standard matrix multiplication. Arrays start at 0. */
/* The type matrix is assumed to be declared elsewhere, */
/* for example as: typedef double **matrix;             */
void
matrix_multiply( matrix A, matrix B, matrix C, unsigned int n )
{
    unsigned int i, j, k;

    for( i=0; i<n; i++ )    /* Initialization */
        for( j=0; j<n; j++ )
            C[i][j] = 0.0;
    for( i=0; i<n; i++ )    /* Row i of A times column j of B */
        for( j=0; j<n; j++ )
            for( k=0; k<n; k++ )
                C[i][j] += A[i][k] * B[k][j];
}

Figure 7.2 Simple O(n^3) matrix multiplication


[ A_{1,1}  A_{1,2} ] [ B_{1,1}  B_{1,2} ]   [ C_{1,1}  C_{1,2} ]
[ A_{2,1}  A_{2,2} ] [ B_{2,1}  B_{2,2} ] = [ C_{2,1}  C_{2,2} ]

Figure 7.3 Decomposing AB = C into four quadrants


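Computing the four quadrants this way takes eight multiplications of n/2 by n/2 matrices plus O(n^2) work
for the matrix additions, so if the multiplications are done recursively the running time satisfies
T(n) = 8T(n/2) + O(n^2). The solution of this recurrence is T(n) = O(n^3), thus we do not have an improvement. As we have seen with integer
multiplication, we must reduce the number of subproblems below 8. Strassen used a strategy
similar to the integer multiplication divide and conquer algorithm and showed how to use only
seven recursive calls by carefully arranging the computations. The seven multiplications are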

M_1 = (A_{1,2} - A_{2,2})(B_{2,1} + B_{2,2})
M_2 = (A_{1,1} + A_{2,2})(B_{1,1} + B_{2,2})
M_3 = (A_{1,1} - A_{2,1})(B_{1,1} + B_{1,2})
M_4 = (A_{1,1} + A_{1,2}) B_{2,2}
M_5 = A_{1,1} (B_{1,2} - B_{2,2})
M_6 = A_{2,2} (B_{2,1} - B_{1,1})
M_7 = (A_{2,1} + A_{2,2}) B_{1,1}
After the multiplications are performed, the final answer can be obtained with eight more
additions.
C_{1,1} = M_1 + M_2 - M_4 + M_6
C_{1,2} = M_4 + M_5
C_{2,1} = M_6 + M_7
C_{2,2} = M_2 - M_3 + M_5 - M_7
It is easy to verify that this tricky ordering produces the desired values. The running time
now satisfies the recurrence
T(n) = 7T(n/2) + O(n^2).
The solution of this recurrence is T(n) = O(n^(log2 7)) = O(n^2.81).
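The seven products and the four recombination formulas can be checked on the smallest case, n = 2, where each quadrant is a single number. The sketch below is illustrative only: the function name `strassen_2x2` is hypothetical, and the subscripts 1 and 2 of the text correspond to array indices 0 and 1. In the full algorithm each `*` would be a recursive call on n/2 by n/2 matrices, and each `+` or `-` a matrix addition or subtraction.

```c
/* Multiplies two 2 x 2 matrices using Strassen's seven products.
   Each entry plays the role of one quadrant; A-factors are always
   kept on the left so the same order works for matrix blocks. */
void strassen_2x2(double A[2][2], double B[2][2], double C[2][2]) {
    double m1 = (A[0][1] - A[1][1]) * (B[1][0] + B[1][1]);
    double m2 = (A[0][0] + A[1][1]) * (B[0][0] + B[1][1]);
    double m3 = (A[0][0] - A[1][0]) * (B[0][0] + B[0][1]);
    double m4 = (A[0][0] + A[0][1]) * B[1][1];
    double m5 = A[0][0] * (B[0][1] - B[1][1]);
    double m6 = A[1][1] * (B[1][0] - B[0][0]);
    double m7 = (A[1][0] + A[1][1]) * B[0][0];

    C[0][0] = m1 + m2 - m4 + m6;   /* C_{1,1} */
    C[0][1] = m4 + m5;             /* C_{1,2} */
    C[1][0] = m6 + m7;             /* C_{2,1} */
    C[1][1] = m2 - m3 + m5 - m7;   /* C_{2,2} */
}
```

For A = [1 2; 3 4] and B = [5 6; 7 8] the seven products are -30, 65, -22, 24, -2, 8 and 35, which recombine to C = [19 22; 43 50], the same result as the standard algorithm.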
As usual, there are details to consider, such as the case when n is not a power of two, but these are
basically minor nuisances. Strassen's algorithm is worse than the simple algorithm until n is fairly
large. It does not generalize to the case where the matrices are sparse (contain many zero entries), and it
does not easily parallelize. When run with floating-point entries, it is less stable numerically than
the classic algorithm. Therefore, it has only limited applicability. Nevertheless, it represents an
important theoretical milestone and certainly shows that in computer science, as in many other fields,
even though a problem seems to have an intrinsic complexity, nothing is certain until
proven.

Check Your Progress
1. Define greedy-choice property.
2. What is the simple scheduling problem?

7.4 LET US SUM UP


This lesson illustrates some of the most common techniques found in algorithm design. When
confronted with a problem, it is worthwhile to see if any of these methods apply. A proper choice of
algorithm, combined with judicious use of data structures, can often lead quickly to efficient solutions.

7.5 KEYWORDS
Optimal Substructure: If the shortest job is removed from an optimal solution, the remaining solution for n-1 jobs is
optimal.
Divide: Smaller problems are solved recursively.
Conquer: The solution to the original problem is then formed from the solutions to the subproblems.

7.6 QUESTIONS FOR DISCUSSION


1. What are greedy algorithms? Discuss some real life examples.

2. A file contains only colons, spaces, newlines, commas, and digits, in the following frequency: colon
(100), space (605), newline (100), comma (705), 0 (431), 1 (242), 2 (176), 3 (59), 4 (185), 5 (250), 6
(174), 7 (199), 8 (205), 9 (217). Construct the Huffman code.

3. Complete the proof that Huffman's algorithm generates an optimal prefix code.
4. Write a program to implement file compression (and uncompression) using Huffman's
algorithm.
5. Write a program to implement the closest-pair algorithm.

Check Your Progress: Model Answers


1. Greedy-choice property: If the shortest job does not go first, the y jobs before it will finish
3 time units sooner, but the shortest job itself will be delayed by the time needed to finish all the jobs before it.

2. In the simple scheduling problem we are given jobs j_1, j_2, ..., j_n, all with known
running times t_1, t_2, ..., t_n, respectively, and a single processor.

7.7 SUGGESTED READINGS


Data Structures and Efficient Algorithms, Burkhard Monien, Thomas Ottmann, Springer

Data Structures and Algorithms, Shi-Kuo Chang, World Scientific
