DSA NOTES
Data:
A collection of facts, concepts, figures, observations, occurrences or instructions in a
formalized manner.
Information:
The meaning that is currently assigned to data by means of the conventions applied to
those data (i.e., processed data).
Record:
Collection of related fields.
Data type:
A set of elements that share a common set of properties, used to solve a problem.
Data Structures:
Data Structure is the way of organizing, storing, and retrieving data and their relationship
with each other.
Characteristics of data structures:
1. It depicts the logical representation of data in computer memory.
2. It represents the logical relationship between the various data elements.
3. It helps in efficient manipulation of stored data elements.
4. It allows the programs to process the data in an efficient manner.
Operations on Data Structures:
1. Traversal
2. Search
3. Insertion
4. Deletion
CLASSIFICATION OF DATA STRUCTURES
[Figure: classification of data structures]
Applications of data structures:
Compiler design
Operating system
Statistical analysis package
DBMS
Numerical analysis
Simulation
Artificial intelligence
Graphics
An Abstract Data Type (ADT) is defined as a mathematical model with a collection of operations
defined on that model. The set of integers, together with the operations of union, intersection and set
difference, forms an example of an ADT. An ADT consists of data together with functions that
operate on that data.
Advantages/Benefits of ADT:
1. Modularity
2. Reuse
3. Code is easier to understand
4. The implementation of an ADT can be changed without requiring changes to the program that uses
the ADT.
If the element at position i is Ai, then its successor is Ai+1 and its predecessor is Ai-1
Various operations performed on List
#define maxsize 10
int list[maxsize], n ;
Create Operation:
Create operation is used to create the list with 'n' number of elements. If 'n' exceeds the
array's maxsize, then elements cannot be inserted into the list. Otherwise the array elements are
stored in consecutive array locations, i.e. list[0], list[1] and so on.
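void Create ( )
{
int i;
printf("\nEnter the number of elements to be added in the list:\t");
scanf("%d",&n);
if(n < 0 || n > maxsize)   /* reject sizes the array cannot hold */
{
printf("\nThe list can hold at most %d elements", maxsize);
n = 0;
return;
}
printf("\nEnter the array elements:\t");
for(i=0;i<n;i++)
scanf("%d",&list[i]);
}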
Insert Operation:
Insert operation is used to insert an element at particular position in the existing list. Inserting the
element in the last position of an array is easy. But inserting the element at a particular position
in an array is quite difficult since it involves all the subsequent elements to be shifted one
position to the right.
Routine to insert an element in the array:
void Insert( )
{
int i,data,pos;
printf("\nEnter the data to be inserted:\t");
scanf("%d",&data);
printf("\nEnter the position at which element to be inserted:\t");
scanf("%d",&pos);
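if (n == maxsize)   /* no free slot left in the array */
{
printf("\nArray overflow");
return;
}
if (pos < 1 || pos > n+1)   /* positions are counted from 1 */
{
printf("\nInvalid position");
return;
}
for(i = n-1 ; i >= pos-1 ; i--)
list[i+1] = list[i];
list[pos-1] = data;
n=n+1;
Display();
}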
Consider an array with 5 elements [ max elements = 10 ]
10 20 30 40 50
If data 15 is to be inserted in the 2nd position, then 50 has to be moved to the next index position, 40
has to be moved to the position of 50, 30 has to be moved to the position of 40 and 20 has to be moved to
the position of 30.
After these four data movements, 15 is inserted in the 2nd position of the array.
10 15 20 30 40 50
Deletion Operation:
Deletion is the process of removing an element from the array at any position.
Deleting an element from the end is easy. If an element is to be deleted from any other
position, all subsequent elements after that position must be shifted one position towards the
left.
Routine to delete an element in the array:
void Delete( )
{
int i, pos ;
printf("\nEnter the position of the data to be deleted:\t");
scanf("%d",&pos);
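if(pos < 1 || pos > n)   /* position must refer to an existing element */
{
printf("\nInvalid position");
return;
}
printf("\nThe data deleted is:\t %d", list[pos-1]);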
for(i=pos-1;i<n-1;i++)
list[i]=list[i+1];
n=n-1;
Display();
}
Consider an array with 5 elements [ max elements = 10 ]
10 20 30 40 50
If data 20 is to be deleted from the array, then 30 has to be moved to the position of 20, 40 has to
be moved to the position of 30 and 50 has to be moved to the position of 40.
After these 3 data movements, data 20 is deleted from the 2nd position of the array.
10 30 40 50
Search Operation:
Search( ) operation is used to determine whether a particular element is present in the list or not.
Input the search element to be checked in the list.
Routine to search an element in the array:
void Search( )
{
int search,i,count = 0;
printf("\nEnter the element to be searched:\t");
scanf("%d",&search);
for(i=0;i<n;i++)
{
if(search == list[i])
count++;
}
if(count==0)
printf("\nElement not present in the list");
else
printf("\nElement present in the list");
}
A linked list is an ordered collection of elements. Each element in the list is referred to as a node.
Each node contains two fields, namely:
Data field - The data field contains the actual data of the element to be stored in the list.
Next field - The next field contains the address of the next node in the list.
[Figure: node with DATA and NEXT fields; list L -> 10 -> 40 -> 20 -> 30 -> NULL]
Advantages of Linked list:
1. Insertion and deletion of elements can be done efficiently
2. It uses dynamic memory allocation
3. Memory utilization is efficient compared to arrays
Disadvantages of linked list:
1. Linked list does not support random access
2. Memory is required to store the next field
3. Searching takes more time compared to arrays
Types of Linked List
1. Singly Linked List or One Way List
2. Doubly Linked List or Two-Way Linked List
3. Circular Linked List
Dynamic allocation
The process of allocating memory to variables during execution of the program, i.e. at run time,
is known as dynamic memory allocation. It is most useful in situations in which the memory
requirements are not known in advance. C provides four library routines for this purpose:
malloc(), calloc(), realloc() and free().
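As a minimal sketch of run-time allocation, the routine below requests an array whose size is known only when the program runs. The helper name make_int_array is hypothetical and is not part of these notes; malloc() and free() are the standard library calls.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates an array of n ints at run time and
   fills it with 1..n. Returns NULL if n is not positive or if
   malloc() cannot provide the memory. The caller must free() it. */
int *make_int_array(int n)
{
    int i;
    int *a;

    if (n <= 0)
        return NULL;
    a = (int *)malloc(n * sizeof(int));
    if (a == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        a[i] = i + 1;
    return a;
}
```

The block returned by make_int_array(4) holds 1, 2, 3, 4 and should be released with free() once it is no longer needed.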
SLL
SLL with a Header
Basic operations on a singly-linked list are:
1. Insert – Inserts a new node in the list.
2. Delete – Deletes any node from the list.
3. Find – Finds the position( address ) of any node in the list.
4. FindPrevious - Finds the position( address ) of the previous node in the list.
5. FindNext- Finds the position( address ) of the next node in the list.
6. Display - Displays the data in the list.
7. Search - Finds whether an element is present in the list or not.
Declaration of Linked List
struct node;
typedef struct node *position;
typedef struct node *List;
struct node
{
int data;
position next;
};
position L,p,newnode;
void insert(int X,List L,position P);
void find(List L,int X);
void delete(int x, List L);
Creation of the list:
This routine creates a list by getting the number of nodes from the user. Assume n=4 for this
example.
void create()
{
int i,n;
L=NULL;
newnode=(struct node*)malloc(sizeof(struct node));
printf("\n Enter the number of nodes to be inserted\n");
scanf("%d",&n);
printf("\n Enter the data\n");
scanf("%d",&newnode->data);
newnode->next=NULL;
L=newnode;
p=L;
for(i=2;i<=n;i++)
{
newnode=(struct node *)malloc(sizeof(struct node));
scanf("%d",&newnode->data);
newnode->next=NULL;
p->next=newnode;
p=newnode;
}
}
Initially the list is empty.
L -> NULL
Insert(10,L) - A new node with data 10 is inserted and the next field is updated to
NULL. The next field of previous node is updated to store the address of new node.
L -> 10 -> NULL (P at 10)
Insert(20,L) - A new node with data 20 is inserted and the next field is updated to NULL.
The next field of previous node is updated to store the address of new node.
L -> 10 -> 20 -> NULL (P at 20)
Insert(30,L) - A new node with data 30 is inserted and the next field is updated to NULL. The
next field of previous node is updated to store the address of new node.
L -> 10 -> 20 -> 30 -> NULL (P at 30)
Case 1: Routine to insert an element in list at the beginning
void insert(int X, List L, position p)
{
/* L is assumed to be the header node, so the new node is linked right after it */
newnode=(struct node*)malloc(sizeof(struct node));
if(newnode==NULL)
return;
newnode->data=X;
newnode->next=L->next;
L->next=newnode;
}
Case 2:Routine to insert an element in list at Position
This routine inserts an element X after the position P.
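void Insert(int X, List L, position p)
{
position newnode;
newnode = (struct node*) malloc( sizeof( struct node ));
if( newnode == NULL )
FatalError( "Out of Space" );
else
{
newnode -> data = X;
newnode -> next = p -> next;   /* new node takes over p's successor */
p -> next = newnode;
}
}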
Routine to check whether a list is Empty
This routine checks whether the list is empty. If the list is empty it returns 1, otherwise 0.
int IsEmpty( List L )
{
if ( L -> next == NULL )
return(1);
return(0);
}
Routine to check whether the current position is last in the List
This routine checks whether the current position p is the last position in the list. It returns 1 if
position p is the last position
int IsLast(List L , position p)
{
if( p -> next == NULL)
return(1);
return(0);
}
[Figure: list L -> 10 -> 40 -> 20 -> 30 -> NULL, with X and P marked]
Find Previous
It returns the position of its predecessor.
position FindPrevious (int X, List L)
{
position p;
p = L;
while( p -> next != NULL && p -> next -> data != X )
p = p -> next;
return p;
}
}
print count;
}
Routine to Delete an Element in the List:
It deletes the first occurrence of element X from the list L.
void Delete( int X , List L){
position p, temp;
p = FindPrevious( X, L);
if( ! IsLast (L, p)){   /* X was found: p -> next is the node to remove */
temp = p -> next;
p -> next = temp -> next;
free ( temp );
}}
Routine to Delete the List
This routine deletes the entire list.
void Delete_list(List L)
{
position P,temp;
P=L->next;
L->next=NULL;
while(P!=NULL)
{
temp=P->next;
free(P);
P=temp;
}
}
Doubly-Linked List
A doubly linked list is a linked list in which each node has three fields namely Data, Next, Prev.
Data-This field stores the value of the element
Next-This field points to the successor node in the list
Prev-This field points to the predecessor node in the list
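The three fields described above can be sketched in C as follows. This is an illustrative sketch, not the declaration used in these notes: the names struct dnode and dll_insert_after are assumptions.

```c
#include <stdlib.h>

/* Node layout assumed from the three fields described above. */
struct dnode
{
    int data;
    struct dnode *prev;
    struct dnode *next;
};

/* Hypothetical helper: inserts a new node holding x right after node p.
   Returns the new node, or NULL if memory cannot be allocated. */
struct dnode *dll_insert_after(struct dnode *p, int x)
{
    struct dnode *newnode = (struct dnode *)malloc(sizeof(struct dnode));

    if (newnode == NULL)
        return NULL;
    newnode->data = x;
    newnode->prev = p;
    newnode->next = p->next;
    if (p->next != NULL)
        p->next->prev = newnode;   /* old successor now points back to newnode */
    p->next = newnode;
    return newnode;
}
```

Four links are touched per insertion (two in the new node, one in each neighbour), which is the price paid for being able to walk the list in both directions.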
[Figure: DLL node with PREV, DATA and NEXT fields; doubly linked list]
list.
L->next=newnode;
newnode->prev=L;
void find()
{
int a,flag=0,count=0;
if(L==NULL)
printf("\nThe list is empty");
else
{
printf("\nEnter the element to be searched");
scanf("%d",&a);
for(P=L;P!=NULL;P=P->next)
{
count++;
if(P->data==a)
{
flag=1;
printf("\nThe element is found");
printf("\nThe position is %d",count);
break;
}
}
if(flag==0)
printf("\nThe element is not found");
}
}
Advantages of DLL:
Each node of a DLL has two pointer fields: a prev link field and a next link field.
Because of these two fields the list can be traversed in both directions, so the predecessor of a
node is reached directly, whereas in an SLL only the next link is stored and finding a predecessor
requires a traversal from the start of the list.
Disadvantages of DLL:
Because each node stores two pointer fields, a DLL uses more memory than an SLL.
CIRCULAR LINKED LIST:
Circular Linked list is a linked list in which the pointer of the last node points to the first node.
Types of CLL:
CLL can be implemented as circular singly linked list and circular doubly linked list.
Declaration of node:
typedef struct node *position;
struct node
{
int data;
position next;
};
Applications of List:
1. Polynomial ADT
2. Radix sort
3. Multilist
Polynomial Manipulation
Polynomial manipulations such as addition, subtraction and differentiation can be
performed using a linked list.
Declaration for Linked list implementation of Polynomial ADT
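typedef struct poly
{
int coeff;
int power;
struct poly *next;
} poly;
poly *list1, *list2, *list3;
/* Appends newnode1 to the end of the list head1 and returns the head */
poly *create(poly *head1, poly *newnode1)
{
poly *ptr;
if(head1==NULL)
{
head1=newnode1;
return (head1);
}
else
{
ptr=head1;
while(ptr->next!=NULL)
ptr=ptr->next;
ptr->next=newnode1;
}
return(head1);
}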
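Addition of two polynomials
void add()
{
poly *ptr1, *ptr2, *newnode;
ptr1 = list1;
ptr2 = list2;
/* Both lists are assumed to hold their terms in decreasing order of power */
while( ptr1 != NULL || ptr2 != NULL )
{
newnode = (struct poly*)malloc( sizeof ( struct poly ));
newnode -> next = NULL;
if( ptr2 == NULL || ( ptr1 != NULL && ptr1 -> power > ptr2 -> power ))
{
/* this power occurs only in list1 */
newnode -> coeff = ptr1 -> coeff;
newnode -> power = ptr1 -> power;
ptr1 = ptr1 -> next;
}
else if( ptr1 == NULL || ptr2 -> power > ptr1 -> power )
{
/* this power occurs only in list2 */
newnode -> coeff = ptr2 -> coeff;
newnode -> power = ptr2 -> power;
ptr2 = ptr2 -> next;
}
else
{
/* same power in both lists: add the coefficients */
newnode -> coeff = ptr1 -> coeff + ptr2 -> coeff;
newnode -> power = ptr1 -> power;
ptr1 = ptr1 -> next;
ptr2 = ptr2 -> next;
}
list3 = create( list3, newnode );
}
}
Subtraction of two polynomials
void sub()
{
/* list3 = list1 - list2; same traversal as add() */
poly *ptr1, *ptr2, *newnode;
ptr1 = list1;
ptr2 = list2;
while( ptr1 != NULL || ptr2 != NULL )
{
newnode = (struct poly*)malloc( sizeof ( struct poly ));
newnode -> next = NULL;
if( ptr2 == NULL || ( ptr1 != NULL && ptr1 -> power > ptr2 -> power ))
{
newnode -> coeff = ptr1 -> coeff;
newnode -> power = ptr1 -> power;
ptr1 = ptr1 -> next;
}
else if( ptr1 == NULL || ptr2 -> power > ptr1 -> power )
{
/* a term that occurs only in list2 is negated */
newnode -> coeff = -( ptr2 -> coeff );
newnode -> power = ptr2 -> power;
ptr2 = ptr2 -> next;
}
else
{
newnode -> coeff = ptr1 -> coeff - ptr2 -> coeff;
newnode -> power = ptr1 -> power;
ptr1 = ptr1 -> next;
ptr2 = ptr2 -> next;
}
list3 = create( list3, newnode );
}
}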
Polynomial Differentiation:
void diff()
{
poly *ptr1, *newnode;
ptr1 = list1;
while( ptr1 != NULL)
{
if( ptr1->power != 0 )   /* the derivative of a constant term is 0, so it is dropped */
{
newnode = (struct poly*)malloc( sizeof (struct poly));
newnode->coeff=(ptr1->coeff)*(ptr1->power);
newnode->power=ptr1->power-1;
newnode->next=NULL;
list3=create(list3,newnode);
}
ptr1=ptr1->next;
}
}
Polynomial Multiplication
void mul()
{
poly *ptr1, *ptr2, *newnode;
/* Every term of list1 is multiplied by every term of list2 */
for( ptr1 = list1; ptr1 != NULL; ptr1 = ptr1 -> next )
{
for( ptr2 = list2; ptr2 != NULL; ptr2 = ptr2 -> next )
{
newnode = (struct poly*)malloc( sizeof ( struct poly ));
newnode -> coeff = ptr1 -> coeff * ptr2 -> coeff;
newnode -> power = ptr1 -> power + ptr2 -> power;
newnode -> next = NULL;
list3 = create( list3, newnode );
}
}
/* Terms of equal power in list3 are not combined by this routine */
}
UNIT II LINEAR DATA STRUCTURES – STACKS, QUEUES
STACK
Stack is a Linear Data Structure that follows Last In First Out(LIFO) principle.
Insertion and deletion can be done at only one end of the stack called TOP of the stack.
Example: - Pile of coins, stack of trays
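STACK ADT:
[Figure: stack model with the TOP pointer]
1. Stack Overflow:
An attempt to insert an element X when the stack is full is said to be stack
overflow.
For every Push operation, we need to check this condition.
2. Stack Underflow:
An attempt to delete an element when the stack is empty is said to be stack
underflow.
For every Pop operation, we need to check this condition.
Implementation of Stack
Stack can be implemented in 2 ways:
1. Array
2. Linked list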
return(1);
}
(ii) Push Operation
int TopElement(Stack S)
{
if(Top==-1)
{
Error("Empty stack!!No elements");
return 0;
}
else
return S[Top];
}
Implementation of stack using Array
/* static implementation of stack */
#include<stdio.h>
#define size 5
int stack[size];
int top;
void push( )
{
int n;
printf("\n Enter item in stack");
scanf("%d", &n);
if(top == size - 1)
{
printf("\nStack is Full");
}
else
{
top = top + 1;
stack[top] = n;
}
}
void pop( )
{
int item;
if(top == -1)
{
printf("\n Stack is empty");
}
else
{
item = stack[top];
printf("\n item popped is = %d", item);
top--;
}
}
void display( )
{
int i;
printf("\n item in stack are");
for(i = top; i >= 0; i--)
printf("\n %d", stack[i]);
}
int main(void)
{
char ch, ch1;
ch = 'y';
ch1 = 'y';
top = -1;
while(ch != 'n')
{
push( );
printf("\n Do you want to push any item in stack y/n");
scanf(" %c", &ch);   /* standard replacement for the non-portable getch() */
}
display( );
while(ch1 != 'n')
{
printf("\n Do you want to delete any item in stack y/n");
scanf(" %c", &ch1);
pop( );
}
display( );
return 0;
}
OUTPUT:
Enter item in stack20
Do you want to push any item in stack y/n
Enter item in stack25
Do you want to push any item in stack y/n
Enter item in stack30
Stack is Full
Do you want to push any item in stack y/n
item in stack are
25
20
15
10
5
Do you want to delete any item in stack y/n
item popped is = 25
Do you want to delete any item in stack y/n
item popped is = 20
Do you want to delete any item in stack y/n
item popped is = 15
item in stack are
10
5
Linked list implementation of Stack
Stack elements are implemented using SLL (Singly Linked List) concept.
Dynamically, memory is allocated to each element of the stack as a node.
Type Declarations for Stack using SLL
struct node;
typedef struct node *Stack;
typedef struct node *position;
Stack S;
struct node
{
int data;
position next;
};
int IsEmpty(Stack S);
void Push(int x, Stack S);
void Pop(Stack S);
int TopElement(Stack S);
(i) Stack Empty Operation:
Initially Stack is Empty.
With Linked List implementation, Empty stack is represented as S -> next = NULL.
It is necessary to check for Empty Stack before deleting ( pop) an element from the stack.
Routine to check whether the stack is empty
[Figure: before insertion: S -> Header -> 30 -> 20 -> 10 -> NULL; newnode = 40]
Push routine /* Inserts element at front of the list */
void Push(int X, Stack S)
{
position newnode;
newnode = malloc (sizeof( struct node ) );
if( newnode == NULL )
Error("Out of space");
else
{
newnode -> data = X;
newnode -> next = S -> next;
S -> next = newnode;   /* the new node is now the top of the stack */
}
}
[Figure: after insertion: Header -> 40 -> 30 -> 20 -> 10 -> NULL; TOP at 40]
(iii) Pop Operation
It is the process of deleting the Top element of the stack.
With Linked List implementations, the element at the Front of the List
(i.e.) S -> next is always deleted.
It takes only one parameter, Pop(S): the stack S whose front element is to be deleted.
Before deleting the front element in the list, check for Empty Stack.
If the Stack is Empty, deletion is not possible.
Otherwise, make the front element in the list as “temp”.
Update the next field of header.
Using free ( ) function, Deallocate the memory allocated for temp node.
PANIMALA R
S
padamavani arts and science college
[Figure: before deletion: Header -> 40 -> 30 -> 20 -> 10 -> NULL; TOP at 40]
Pop routine /* Deletes the element at front of list */
void Pop( Stack S )
{
position temp;
if( S -> next == NULL)
Error("empty stack! Pop not possible");
else
{
temp = S -> next;
S -> next = temp -> next;
free(temp);
}}
[Figure: after deletion: Header -> 30 -> 20 -> 10 -> NULL]
(iv) Return Top Element
Pop routine deletes the Front element in the List.
If the user needs to know the last element inserted into the stack, then the user can
return the Top element of the stack.
To do this, first check for Empty Stack.
If the stack is empty, then there is no element in the stack.
Otherwise, return the element present in the S -> next -> data in the List.
int TopElement(Stack S)
{
if(S->next==NULL)
{
Error("Stack is empty");
return 0;
}
else
return S->next->data;
}
[Figure: S -> Header -> 40 -> 30 -> 20 -> 10 -> NULL; TOP at 40]
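The four operations described above can be collected into one compilable sketch. It follows the header-node convention of these notes (empty stack means S -> next is NULL), but it is only an illustration: the type name struct snode and the helper CreateStack are assumptions, and Push and Pop return a status code here instead of calling Error().

```c
#include <stdlib.h>

struct snode
{
    int data;
    struct snode *next;
};
typedef struct snode *Stack;

/* Hypothetical helper: allocates the header node of an empty stack. */
Stack CreateStack(void)
{
    Stack S = (Stack)malloc(sizeof(struct snode));
    if (S != NULL)
        S->next = NULL;
    return S;
}

/* Returns 1 if the header points to nothing, i.e. the stack is empty. */
int IsEmpty(Stack S)
{
    return S->next == NULL;
}

/* Inserts X at the front of the list; returns 0 if memory runs out. */
int Push(int X, Stack S)
{
    struct snode *newnode = (struct snode *)malloc(sizeof(struct snode));
    if (newnode == NULL)
        return 0;
    newnode->data = X;
    newnode->next = S->next;
    S->next = newnode;
    return 1;
}

/* Deletes the front element; returns 0 if the stack is empty. */
int Pop(Stack S)
{
    struct snode *temp;
    if (IsEmpty(S))
        return 0;
    temp = S->next;
    S->next = temp->next;
    free(temp);
    return 1;
}

/* Returns the top element, or 0 if the stack is empty. */
int TopElement(Stack S)
{
    return IsEmpty(S) ? 0 : S->next->data;
}
```

Pushing 10, 20, 30 and 40 in that order leaves 40 at the front of the list, matching the figures in this section.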
Applications of Stack
INFIX:
The arithmetic operator appears between the two operands to which it is being
applied.
POSTFIX:
The arithmetic operator appears directly after the two operands to which it applies.
Also called reverse polish notation.
PREFIX:
The arithmetic operator is placed before the two operands to which it applies. Also
called polish notation.
Step 4: If the character is a right parenthesis, pop the operators from the stack to the output until a
left parenthesis is encountered; discard both parentheses (they do not appear in the output).
E.g. Consider the following Infix expression: - A*B+(C-D/E)#
Read char   Stack (bottom to top)   Output
A           (empty)                 A
*           *                       A
B           *                       AB
+           +                       AB*
(           + (                     AB*
C           + (                     AB*C
-           + ( -                   AB*C
D           + ( -                   AB*CD
/           + ( - /                 AB*CD
E           + ( - /                 AB*CDE
)           +                       AB*CDE/-
#           (empty)                 AB*CDE/-+
Postfix expression: AB*CDE/-+
Operand   Value
A         2
B         3
C         4
D         4
E         2
Char Read   Stack (bottom to top)
A           2
B           2 3
*           6
C           6 4
D           6 4 4
E           6 4 4 2
/           6 4 2
-           6 2
+           8
OUTPUT = 8
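The evaluation traced above can be sketched in C. This is a minimal sketch under stated assumptions: operands are single digits written directly into the string, the expression is well formed, and the function name eval_postfix and the fixed stack size of 64 are illustrative choices, not part of these notes.

```c
#include <ctype.h>

/* Evaluates a postfix string made of single-digit operands and the
   operators + - * /. Assumes a well-formed expression that needs no
   more than 64 stack slots. */
int eval_postfix(const char *expr)
{
    int stack[64];
    int top = -1;
    int a, b;

    for (; *expr != '\0'; expr++)
    {
        if (isdigit((unsigned char)*expr))
        {
            stack[++top] = *expr - '0';   /* operand: push its value */
        }
        else if (*expr == '+' || *expr == '-' || *expr == '*' || *expr == '/')
        {
            b = stack[top--];             /* right operand is popped first */
            a = stack[top--];
            switch (*expr)
            {
            case '+': stack[++top] = a + b; break;
            case '-': stack[++top] = a - b; break;
            case '*': stack[++top] = a * b; break;
            default:  stack[++top] = a / b; break;
            }
        }
    }
    return stack[top];
}
```

With A = 2, B = 3, C = 4, D = 4 and E = 2, the expression AB*CDE/-+ becomes the string "23*442/-+".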
Read char   Stack (bottom to top)   Output
(           (
a           (                       a
+           ( +                     a
b           ( +                     ab
)           (empty)                 ab+
*           *                       ab+
c           *                       ab+c
/           /                       ab+c*
d           /                       ab+c*d
+           +                       ab+c*d/
e           +                       ab+c*d/e
/           + /                     ab+c*d/e
f           + /                     ab+c*d/ef
#           (empty)                 ab+c*d/ef/+
Operand   Value
a         1
b         2
c         4
d         2
e         6
f         3
Char Read   Stack (bottom to top)
a           1
b           1 2
+           3
c           3 4
*           12
d           12 2
/           6
e           6 6
f           6 6 3
/           6 2
+           8
Output = 8
Example for unbalanced symbols:
QUEUES:
Queue is a Linear Data Structure that follows First in First out (FIFO) principle.
Insertion of element is done at one end of the Queue called the "Rear" end of the Queue.
Deletion of element is done at the other end of the Queue called the "Front" end of the Queue.
Example: - Waiting line in the ticket counter.
[Figure: Queue model: deletion at the Front end, insertion at the Rear end of queue Q]
Front pointer and Rear pointer: initially, Front (F) = -1 and Rear (R) = -1.
Operations on Queue
1. EnQueue
2. DeQueue
(i) EnQueue operation:-
It is the process of inserting a new element at the rear end of the Queue.
For every EnQueue operation
- Check for Full Queue.
- If the Queue is full, insertion is not possible.
- Otherwise, increment the rear end by 1 and then insert the element in the rear end
of the Queue.
(ii) DeQueue operation:-
It is the process of deleting the element from the front end of the queue.
For every DeQueue operation
- Check for Empty queue.
- If the Queue is Empty, deletion is not possible.
- Otherwise, delete the first element inserted into the queue and then increment the
front by 1.
Queue Overflow
Queue Underflow
(i) Queue Overflow:
An attempt to insert an element X at the Rear end of the Queue when the
Queue is full is said to be Queue overflow.
For every Enqueue operation, we need to check this condition.
(ii) Queue Underflow:
An Attempt to delete an element from the Front end of the Queue when the
Queue is empty is said to be Queue underflow.
For every DeQueue operation, we need to check this condition.
Implementation of Queue
return ( 1 );
As we keep inserting the new elements at the Rear end of the Queue, the Queue becomes
full.
When the Queue is Full, Rear reaches its maximum Arraysize.
For every Enqueue Operation, we need to check for full Queue condition.
int IsFull(Queue Q)
{
if ( Rear == ArraySize - 1 )
return ( 1 );
return ( 0 );
}
(iii) Enqueue Operation
It is the process of inserting a new element at the Rear end of the Queue.
It takes two parameters, Enqueue(X, Q). The elements X to be inserted at the Rear end of
the Queue Q.
Before inserting a new Element into the Queue, check for Full Queue.
If the Queue is already Full, Insertion is not possible.
Otherwise, Increment the Rear pointer by 1 and then insert the element X at the Rear end
of the Queue.
If the Queue is Empty, Increment both Front and Rear pointer by 1 and then insert the
element X at the Rear end of the Queue.
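The enqueue steps above, together with the matching dequeue, can be sketched for an array of fixed size. This is a sketch under assumptions: the queue is kept in global variables rather than passed as Q, ARRAY_SIZE is an illustrative constant, and both routines return 1 on success and 0 on overflow or underflow.

```c
#define ARRAY_SIZE 5

int queue[ARRAY_SIZE];
int Front = -1, Rear = -1;   /* -1 marks the empty queue */

/* Full when Rear has reached the last array index. */
int IsFull(void)
{
    return Rear == ARRAY_SIZE - 1;
}

/* Inserts x at the Rear end; returns 0 on queue overflow. */
int Enqueue(int x)
{
    if (IsFull())
        return 0;
    if (Front == -1)
        Front = 0;           /* first element: Front moves from -1 to 0 */
    Rear = Rear + 1;
    queue[Rear] = x;
    return 1;
}

/* Removes the Front element into *x; returns 0 on queue underflow. */
int Dequeue(int *x)
{
    if (Front == -1)
        return 0;
    *x = queue[Front];
    if (Front == Rear)
        Front = Rear = -1;   /* last element removed: queue is empty again */
    else
        Front = Front + 1;
    return 1;
}
```

Note that after a few dequeues the slots before Front stay unused while IsFull() can still report a full queue, which is the drawback of the linear queue discussed later in these notes.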
[Figure: linked list implementation of Queue: Header -> 10 -> 20 -> 30 -> 40 -> NULL; Front at 10, Rear at 40]
struct node;
typedef struct node * Queue;
typedef struct node * position;
int IsEmpty (Queue Q);
Queue CreateQueue (void);
void MakeEmpty (Queue Q);
void Enqueue (int X, Queue Q);
void Dequeue (Queue Q);
struct node
{
int data ;
position next;
}* Front = NULL, *Rear = NULL;
(i) Queue Empty Operation:
Initially Queue is Empty.
With Linked List implementation, Empty Queue is represented as Q -> next = NULL.
It is necessary to check for Empty Queue before deleting the front element in the Queue.
(ii) EnQueue Operation:
It is the process of inserting a new element at the Rear end of the Queue.
It takes two parameters, EnQueue ( int X , Queue Q ). The elements X to be inserted into
the Queue Q.
Using malloc ( ) function allocate memory for the newnode to be inserted into the Queue.
If the Queue is Empty, the newnode to be inserted will become first and last node in the
list. Hence Front and Rear points to the newnode.
Otherwise insert the newnode in the Rear -> next and update the Rear pointer.
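A minimal sketch of these steps in C follows. To keep it short it uses only the global Front and Rear pointers and no header node, which is a simplification of the declarations above; the type name struct qnode and the status return values are assumptions.

```c
#include <stdlib.h>

struct qnode
{
    int data;
    struct qnode *next;
};

struct qnode *Front = NULL, *Rear = NULL;

/* Inserts x at the Rear end; returns 0 if memory cannot be allocated. */
int Enqueue(int x)
{
    struct qnode *newnode = (struct qnode *)malloc(sizeof(struct qnode));
    if (newnode == NULL)
        return 0;
    newnode->data = x;
    newnode->next = NULL;
    if (Rear == NULL)
        Front = Rear = newnode;   /* empty queue: first and last node */
    else
    {
        Rear->next = newnode;
        Rear = newnode;
    }
    return 1;
}

/* Removes the Front element into *x; returns 0 if the queue is empty. */
int Dequeue(int *x)
{
    struct qnode *temp;
    if (Front == NULL)
        return 0;
    temp = Front;
    *x = temp->data;
    Front = Front->next;
    if (Front == NULL)
        Rear = NULL;              /* queue became empty */
    free(temp);
    return 1;
}
```

Unlike the array version, this queue is never full as long as malloc() succeeds.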
[Figure: empty queue Q before insertion: Header -> NULL]
Applications of Queue
1. Serving requests on a single shared resource, like a printer, CPU task scheduling etc.
2. In real life, Call Center phone systems will use Queues, to hold people calling them in an
order, until a service representative is free.
3. Handling of interrupts in real-time systems. The interrupts are handled in the same order
as they arrive, First come first served.
4. Batch processing in operating system.
5. Job scheduling algorithms such as the Round Robin algorithm use a Queue.
Drawbacks of Queue (Linear Queue)
With the array implementation of Queue, the element can be deleted logically only by
moving Front = Front + 1.
Here the Queue space is not utilized fully.
In Circular Queue, the insertion of a new element is performed at the very first location of the
queue if the last location of the queue is full, in which the first element comes just after the last
element.
A circular queue is an abstract data type that contains a collection of data which allows
addition of data at the end of the queue and removal of data at the beginning of the
queue.
Circular queues have a fixed size.
Circular queue follows FIFO principle.
Queue items are added at the rear end and the items are deleted at front end of the circular
queue
Here the Queue space is utilized fully by inserting the element at the Front end if the rear
end is full.
It is same as Linear Queue EnQueue Operation (i.e) Inserting the element at the Rear end.
First check for full Queue.
If the circular queue is full, then insertion is not possible.
Otherwise check for the rear end.
If the Rear end is full, the elements start getting inserted from the Front end.
It is same as Linear Queue DeQueue operation (i.e) deleting the front element.
First check for Empty Queue.
If the Circular Queue is empty, then deletion is not possible.
If the Circular Queue has only one element, then the element is deleted and Front and Rear
pointer is initialized to - 1 to represent Empty Queue.
Otherwise, Front element is deleted and the Front pointer is made to point to next element in the
Circular Queue.
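The enqueue and dequeue rules above can be sketched with modulo arithmetic, which makes Rear and Front wrap around to index 0 after the last slot. The names CQ_SIZE, cq, F, R and the CQ_ functions are illustrative assumptions, not taken from these notes.

```c
#define CQ_SIZE 4

int cq[CQ_SIZE];
int F = -1, R = -1;   /* -1 marks the empty circular queue */

/* Full when the slot after Rear (with wrap-around) is Front. */
int CQ_IsFull(void)
{
    return (R + 1) % CQ_SIZE == F;
}

/* Inserts x at the Rear end; returns 0 if the queue is full. */
int CQ_Enqueue(int x)
{
    if (CQ_IsFull())
        return 0;
    if (F == -1)
        F = 0;
    R = (R + 1) % CQ_SIZE;   /* wraps to 0 after the last index */
    cq[R] = x;
    return 1;
}

/* Removes the Front element into *x; returns 0 if the queue is empty. */
int CQ_Dequeue(int *x)
{
    if (F == -1)
        return 0;
    *x = cq[F];
    if (F == R)
        F = R = -1;          /* only one element was left */
    else
        F = (F + 1) % CQ_SIZE;
    return 1;
}
```

After filling all four slots and removing one element, the next insertion lands in slot 0, so the space freed at the front is reused.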
[Figure: empty circular queue with F = -1 and R = -1]
In DEQUE, insertion and deletion operations are performed at both ends of the Queue.
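(i) Input Restricted DEQUE
Here insertion is allowed at one end and deletion is allowed at both ends.
[Figure: deletion at Front and Rear, insertion at Rear only]
(ii) Output Restricted DEQUE
Here insertion is allowed at both ends and deletion is allowed at one end.
[Figure: insertion at Front and Rear, deletion at Front only]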
Operations on DEQUE
Four cases for inserting and deleting the elements in DEQUE are
}
UNIT-III TREE ADT
Binary Tree using Array Representation
Each node contains info, left, right and father fields. The left, right and father
fields of a node point to the node’s left son, right son and father respectively.
Example: -
A
B
-
C
-
-
-
D
-
[Figure 2.5: complete binary tree A; B, C; D, E, F, G with the root numbered 0]
[Figure 2.6: the same tree with the root numbered 1]
Array for Figure 2.5: A B C D E F G at indices 0 1 2 3 4 5 6
Array for Figure 2.6: A B C D E F G at indices 1 2 3 4 5 6 7
                  For Figure 2.5     For Figure 2.6
Node position     i                  i
leftchild         2i+1               2i
rightchild        2i+2               2i+1
parent position   (i-1)/2            i/2
size of array     2^(n+1) - 1        2^(n+1) - 1
n => number of levels of a tree. The parent position uses integer division.
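The index rules for the two numbering schemes can be checked with a few small C functions. The function names (left0, left1 and so on) are illustrative assumptions; the suffix gives the index of the root.

```c
/* Root stored at index 0. */
int left0(int i)   { return 2 * i + 1; }
int right0(int i)  { return 2 * i + 2; }
int parent0(int i) { return (i - 1) / 2; }   /* valid for i > 0 */

/* Root stored at index 1 (index 0 of the array is left unused). */
int left1(int i)   { return 2 * i; }
int right1(int i)  { return 2 * i + 1; }
int parent1(int i) { return i / 2; }         /* valid for i > 1 */
```

For the tree A B C D E F G stored from index 0, left0(1) is 3 and right0(1) is 4, so the children of B are D and E.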
struct treenode
{
int data;
struct treenode *leftchild;
struct treenode *rightchild;
}*T;
[Figure: linked representation of a binary tree; T holds the address of the root node]
CONVERSION OF A GENERAL TREE TO BINARY TREE
General Tree:
A General Tree is a tree in which each node can have an unlimited out degree.
Each node may have as many children as is necessary to satisfy its
requirements. Example: Directory Structure
It is considered easier to represent binary trees in programs than it is to
represent general trees. So, general trees can be represented in binary
tree format.
The binary tree format can be adopted by changing the meaning of the left and
right pointers. There are two relationships in binary tree,
Parent to child
Sibling to sibling
Using these relationships, the general tree can be implemented as binary tree.
Algorithm
Identify the branch from the parent to its first or leftmost child. These
branches from each parent become left pointers in the binary tree
Connect siblings, starting with the leftmost child, using a branch for each
sibling to its right sibling.
Remove all unconnected branches from the parent to its children
A
B E F
C D G H I
BINARY TREE TRAVERSALS
Compared to linear data structures like linked lists and one dimensional array,
which have only one logical means of traversal, tree structures can be
traversed in many different ways. Starting at the root of a binary tree, there
are three main steps that can be performed and the order in which they are
performed defines the traversal type. These steps (in no particular order) are:
performing an action on the current node (referred to as "visiting" the node),
traversing to the left child node, and traversing to the right child node. Thus
the process is most easily described through recursion.
A binary tree traversal requires that each node of the tree be processed once
and only once in a predetermined sequence.
The two general approaches to the traversal sequence are,
Depth first traversal
Breadth first traversal
Depth-First Traversal
In depth first traversal, the processing proceeds along a path from the root
through one child to the most distant descendent of that first child before
processing a second child. In other words, in the depth first traversal, all the
descendants of a child are processed before going to the next child.
Inorder Traversal
Steps :
Traverse left subtree in inorder
Process root node
Traverse right subtree in inorder
B E
C D F
The Output is : C B D A E F
Algorithm
Algorithm inorder_traversal (BinTree T)
Begin
If ( not empty (T) ) then
Begin
Inorder_traversal ( left subtree ( T ) )
Print ( info ( T ) ) /* process node */
Inorder_traversal ( right subtree ( T ) )
End
End
Routines
void inorder_traversal ( NODE * T)
{
if( T != NULL)
{
inorder_traversal(T->lchild);
printf("%d \t ", T->info);
inorder_traversal(T->rchild);
}
}
Preorder Traversal
Steps :
Process root node
Traverse left subtree in preorder
Traverse right subtree in preorder
Algorithm
Algorithm preorder_traversal (BinTree T)
Begin
If ( not empty (T) ) then
Begin
Print ( info ( T ) ) /* process node */
Preorder_traversal ( left subtree ( T ) )
Preorder_traversal ( right subtree ( T ) )
End
End
Routines
void preorder_traversal ( NODE * T)
{
if( T != NULL)
{
printf("%d \t ", T->info);
preorder_traversal(T->lchild);
preorder_traversal(T->rchild);
}
}
Output is : A B C D E F
Postorder Traversal
Steps :
Traverse left subtree in postorder
Traverse right subtree in postorder
process root node
Algorithm
Algorithm postorder_traversal (BinTree T)
Begin
If ( not empty (T) ) then
Begin
Postorder_traversal ( left subtree ( T ) )
Postorder_traversal ( right subtree ( T ) )
Print ( Info ( T ) ) /* process node */
End
End
Routines
void postorder_traversal ( NODE * T)
{
if( T != NULL)
{
postorder_traversal(T->lchild);
postorder_traversal(T->rchild);
printf("%d \t", T->info);
}
}
B E
C D F
Output is : C D B F E A
A
Examples :
B
C
D E G
F
ANSWER : POSTORDER: DEBFGCA INORDER: DBEAFCG
PREORDER:ABDECFG
3.A BINARY TREE HAS 8 NODES. THE INORDER AND POSTORDER TRAVERSAL OF THE
TREE ARE GIVEN BELOW. DRAW THE TREE AND FIND PREORDER.
POSTORDER: F E C H G D B A
INORDER: FCEABHDG
A
Answer:
C B
F E D
H
G
PREORDER: ACFEBDHG
Example 4
Preorder traversal sequence: F, B, A, D, C, E, G, I, H (root, left, right)
Inorder traversal sequence: A, B, C, D, E, F, G, H, I (left, root, right)
Postorder traversal sequence: A, C, E, D, B, H, I, G, F (left, right, root)
APPLICATIONS
EXPRESSION TREES
a/b+(c-d)e
Tree representing the expression a/b+(c-d)e.
Converting Expression from Infix to Postfix using STACK
Algorithm
1) Examine the next element in the input.
2) If it is an operand, output it.
3) If it is opening parenthesis, push it on stack.
4) If it is an operator, then
i) If stack is empty, push operator on stack.
ii) If the top of the stack is opening parenthesis, push operator on stack.
iii) If it has higher priority than the top of stack, push operator on stack.
iv) Else pop the operator from the stack and output it, repeat step 4.
5) If it is a closing parenthesis, pop operators from the stack and output them
until an opening parenthesis is encountered. pop and discard the opening
parenthesis.
6) If there is more input go to step 1
7) If there is no more input, unstack the remaining operators to output.
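The seven steps above can be sketched in C as follows. This is a minimal sketch, assuming single-character operands, only the operators + - * /, a well-formed input and at most 64 pending operators; the names priority and infix_to_postfix are illustrative.

```c
#include <ctype.h>

/* Priority of an operator; '(' gets 0 so that nothing pops it
   except a closing parenthesis. */
static int priority(char op)
{
    if (op == '*' || op == '/')
        return 2;
    if (op == '+' || op == '-')
        return 1;
    return 0;
}

/* Converts the infix string in to postfix in out.
   out must be large enough to hold the result. */
void infix_to_postfix(const char *in, char *out)
{
    char stack[64];
    int top = -1, k = 0;

    for (; *in != '\0'; in++)
    {
        char c = *in;
        if (isalnum((unsigned char)c))
            out[k++] = c;                      /* step 2: operand */
        else if (c == '(')
            stack[++top] = c;                  /* step 3 */
        else if (c == ')')
        {
            while (top >= 0 && stack[top] != '(')
                out[k++] = stack[top--];       /* step 5 */
            if (top >= 0)
                top--;                         /* discard the '(' */
        }
        else if (priority(c) > 0)
        {
            /* step 4: pop while the top has the same or higher priority */
            while (top >= 0 && priority(stack[top]) >= priority(c))
                out[k++] = stack[top--];
            stack[++top] = c;
        }
    }
    while (top >= 0)
        out[k++] = stack[top--];               /* step 7 */
    out[k] = '\0';
}
```

Any other character, such as the end marker #, is simply skipped by this sketch.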
Example
Suppose we want to convert 2*3/(2-1)+5*(4-1) into Prefix form:
Reversed Expression: )1-4(*5+)1-2(/3*2
Algorithm
1) Reverse the input string.
2) Examine the next element in the input.
3) If it is operand, add it to output string.
4) If it is Closing parenthesis, push it on stack.
5) If it is an operator, then
i) If stack is empty, push operator on stack.
ii) If the top of stack is closing parenthesis, push operator on stack.
iii) If it has same or higher priority than the top of stack, push operator on
stack.
iv) Else pop the operator from the stack and add it to output string, repeat
step 5.
6) If it is a opening parenthesis, pop operators from stack and add them to
output string until a closing parenthesis is encountered. Pop and discard the
closing parenthesis.
7) If there is more input go to step 2
8) If there is no more input, unstack the remaining operators and add them to
output string.
9) Reverse the output string.
Example
Reverse the output string : +/*23-21*5-41 So, the final Prefix Expression is
+/*23-21*5-41
EVALUATION OF EXPRESSIONS
CONSTRUCTING AN EXPRESSION TREE
Let us consider the postfix expression given as the input, for constructing an
expression tree by performing the following steps :
1. Read one symbol at a time from the postfix expression.
2. Check whether the symbol is an operand or operator.
i. If the symbol is an operand, create a one-node tree and push a
pointer to it onto the stack.
ii. If the symbol is an operator, pop two pointers from the stack
namely, T1 and T2 and form a new tree with root as the operator,
and T2 as the left child and T1 as the right child.
iii. A pointer to this new tree is then pushed on to the stack.
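The construction steps above might look like this in C. This is a sketch under stated assumptions: operands are single letters or digits, every other character is treated as a binary operator, the input contains no spaces, and the stack is a fixed array of 128 pointers. TreeNode, make_node and build_expression_tree are hypothetical names.

```c
#include <ctype.h>
#include <stdlib.h>

typedef struct TreeNode {
    char symbol;
    struct TreeNode *left, *right;
} TreeNode;

static TreeNode *make_node(char symbol, TreeNode *left, TreeNode *right)
{
    TreeNode *t = malloc(sizeof *t);
    if (t != NULL) {
        t->symbol = symbol;
        t->left = left;
        t->right = right;
    }
    return t;
}

/* Returns the root of the expression tree, or NULL if the postfix
   string is malformed or memory runs out. For brevity this sketch
   does not free partially built trees on failure. */
TreeNode *build_expression_tree(const char *postfix)
{
    TreeNode *stack[128];
    int top = -1;

    for (; *postfix != '\0'; postfix++) {
        char c = *postfix;
        if (isalnum((unsigned char)c)) {
            if (top >= 127)
                return NULL;
            stack[++top] = make_node(c, NULL, NULL);  /* one-node tree */
        } else {
            if (top < 1)
                return NULL;                          /* needs two operands */
            TreeNode *t1 = stack[top--];              /* popped first: right child */
            TreeNode *t2 = stack[top--];              /* popped second: left child */
            stack[++top] = make_node(c, t2, t1);
        }
        if (stack[top] == NULL)
            return NULL;                              /* allocation failed */
    }
    return top == 0 ? stack[0] : NULL;
}
```

For instance, the postfix string ab+cde+** yields a tree whose root is *, whose left child is the subtree for a+b and whose right child is the subtree for c*(d+e).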
Next, c, d, and e are read, and for each a one-node tree is created and a pointer
to the corresponding tree is pushed onto the stack.
Finally, the last symbol is read, two trees are merged, and a pointer to the final
tree is left on the stack.
BINARY SEARCH TREE
Binary search tree (BST) is a node-based binary tree data structure which
has the following properties:
The left sub-tree of a node contains only nodes with keys less than the
node's key.
The right sub-tree of a node contains only nodes with keys greater than
the node's key.
Both the left and right sub-trees must also be binary search trees.
From the above properties it naturally follows that:
Each node (item in the tree) has a distinct key.
[Figure: worked example of building a BST by successive insertions. The last
step places 1 in the left subtree of 2, since 1 < 2, giving the final BST.]
OPERATIONS
Figure shows the code for the insertion routine. Since T points to the root of
the tree, and the root changes on the first insertion, insert is written as a
function that returns a pointer to the root of the new tree. Lines 8 and 10
recursively insert and attach x into the appropriate subtree.
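Since the figure itself is not reproduced here, the following is only a sketch of what such a recursive insertion routine could look like; it is not the original listing, and its line numbers do not correspond to the ones cited above. BstNode and bst_insert are hypothetical names, and duplicate keys are assumed to be ignored.

```c
#include <stdlib.h>

typedef struct BstNode {
    int key;
    struct BstNode *left, *right;
} BstNode;

/* Inserts x into the tree rooted at t and returns the (possibly new) root. */
BstNode *bst_insert(int x, BstNode *t)
{
    if (t == NULL) {
        /* Empty spot found: create a one-node tree. */
        t = malloc(sizeof *t);
        if (t != NULL) {
            t->key = x;
            t->left = t->right = NULL;
        }
    } else if (x < t->key) {
        t->left = bst_insert(x, t->left);     /* attach into left subtree */
    } else if (x > t->key) {
        t->right = bst_insert(x, t->right);   /* attach into right subtree */
    }
    /* x == t->key: already present, nothing to do. */
    return t;
}
```

A caller starts with an empty tree (a NULL root) and writes root = bst_insert(x, root) for each key.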
Thus 5 is inserted.
Delete
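Case 1:
[Figure: trees before and after the deletion, listed level by level. Nodes
recoverable in both: (6; 2, 8; 1, 4).]
EXAMPLE :
Case 2 :
[Figure: before (6; 2, 8; 1, 4; 3), after (6; 2, 8; 1, 3). Node 4 is deleted
and its only child 3 takes its place.]
Case 3 :
[Figure: before (6; 2, 8; 1, 4; 3, 5), after (6; 3, 8; 1, 4; 5). Node 2 is
deleted and replaced by 3, the smallest key in its right subtree.]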
Introduction
To count the number of nodes in a given binary tree, the tree is required to be
traversed recursively until a leaf node is encountered. When a leaf node is
encountered, a count of 1 is returned to its previous activation (which is an
activation for its parent), which takes the count returned from both the
children's activation, adds 1 to it, and returns this value to the activation of its
parent. This way, when the activation for the root of the tree returns, it
returns the count of the total number of the nodes in the tree.
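A minimal C sketch of this recursive count is given below. It treats an empty subtree as contributing 0, so a leaf returns 1 + 0 + 0 = 1, which matches the description above. BtNode and count_nodes are hypothetical names.

```c
#include <stddef.h>

typedef struct BtNode {
    int data;
    struct BtNode *left, *right;
} BtNode;

/* Returns the number of nodes in the tree rooted at t. */
int count_nodes(const BtNode *t)
{
    if (t == NULL)
        return 0;                     /* empty subtree */
    /* Counts from both children, plus 1 for this node. */
    return 1 + count_nodes(t->left) + count_nodes(t->right);
}
```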
SWAPPING OF LEFT & RIGHT SUBTREES OF A GIVEN BINARY TREE
Introduction
An elegant method of swapping the left and right subtrees of a given binary
tree makes use of a recursive algorithm, which recursively swaps the left and
right subtrees, starting from the root.
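The recursive swap could be written along these lines. It is a sketch only; BinNode and swap_subtrees are hypothetical names.

```c
#include <stddef.h>

typedef struct BinNode {
    int data;
    struct BinNode *left, *right;
} BinNode;

/* Mirrors the tree rooted at t by swapping children at every node. */
void swap_subtrees(BinNode *t)
{
    if (t == NULL)
        return;
    BinNode *tmp = t->left;           /* swap at this node */
    t->left = t->right;
    t->right = tmp;
    swap_subtrees(t->left);           /* then recurse into both subtrees */
    swap_subtrees(t->right);
}
```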
Applications of Trees
1. Compiler Design.
2. Unix / Linux.
3. Database Management.
4. Trees are very important data structures in computing.
5. They are suitable for:
a. Hierarchical structure representation, e.g.,
i. File directory.
ii. Organizational structure of an institution.
iii. Class inheritance tree.
b. Problem representation, e.g.,
i. Expression tree.
ii. Decision tree.
c. Efficient algorithmic solutions, e.g.,
i. Search trees.
ii. Efficient priority queues via heaps.
AVL TREE
The AVL tree is named after its two inventors, G.M. Adelson-Velsky and E.M.
Landis, who published it in their 1962 paper "An algorithm for the
organization of information."
An AVL tree is a self-balancing binary search tree. In an AVL tree, the heights of
the two child subtrees of any node differ by at most one; therefore, it is also
said to be height-balanced.
The balance factor of a node is the height of its left subtree minus the
height of its right subtree (some texts use right minus left; the sign convention
only has to be applied consistently). A node with balance factor 1, 0, or -1 is
considered balanced. A node with any other balance factor is considered
unbalanced and requires rebalancing the tree. This can be done by AVL tree
rotations.
Need for AVL tree
The disadvantage of a binary search tree is that its height can be as large
as N-1.
This means that the time needed to perform insertion, deletion and
many other operations can be O(N) in the worst case.
We want a tree with small height.
A binary tree with N nodes has height at least Ω(log N).
Thus, our goal is to keep the height of a binary search tree O(log N).
Such trees are called balanced binary search trees. Examples are the AVL
tree and the red-black tree.
Thus we go for the AVL tree.
AVL trees are identical to standard binary search trees except that for every
node in an AVL tree, the heights of the left and right subtrees can differ by at
most 1. AVL trees are HB-k trees (height-balanced trees of order k) with k = 1.
The following is the height differential formula:
|Height(Tl) - Height(Tr)| <= k
When storing an AVL tree, a field must be added to each node with one of
three values: 1, 0, or -1. A value of 1 in this field means that the left subtree
has a height one more than the right subtree. A value of -1 denotes the
opposite. A value of 0 indicates that the heights of both subtrees are the same.
EXAMPLE FOR HEIGHT OF AVL TREE
Rotation :
Rotation is the modification applied to the tree: if the AVL tree is imbalanced, proper rotations
have to be done.
A rotation is a process of switching children and parents among two or three
adjacent nodes to restore balance to a tree.
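As an illustration, one single rotation (the one that repairs a left-left imbalance) might be sketched as below. The sketch assumes each node stores its height, that an empty subtree has height -1, and that the balance factor is computed as left height minus right height. AvlNode, node_height, balance_factor and rotate_with_left_child are hypothetical names.

```c
#include <stddef.h>

typedef struct AvlNode {
    int key;
    int height;                       /* height of the subtree rooted here */
    struct AvlNode *left, *right;
} AvlNode;

static int node_height(const AvlNode *t)
{
    return t ? t->height : -1;        /* assumption: empty tree has height -1 */
}

static int max_int(int a, int b)
{
    return a > b ? a : b;
}

/* Balance factor as left height minus right height (assumed convention). */
int balance_factor(const AvlNode *t)
{
    return t ? node_height(t->left) - node_height(t->right) : 0;
}

/* Single rotation between k2 and its left child k1.
   k2 must have a left child. Returns the new subtree root. */
AvlNode *rotate_with_left_child(AvlNode *k2)
{
    AvlNode *k1 = k2->left;
    k2->left = k1->right;             /* k1's right subtree moves under k2 */
    k1->right = k2;                   /* k2 becomes k1's right child */
    k2->height = 1 + max_int(node_height(k2->left), node_height(k2->right));
    k1->height = 1 + max_int(node_height(k1->left), k2->height);
    return k1;
}
```

The mirror-image rotation with the right child follows by exchanging left and right; the double rotations (LR and RL) can be composed from two single rotations.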
Balance Factor :
[Figure: example tree with root 7, children 5 and 12, and leaves 2, 10 and 14,
with nodes annotated by their balance factor (BF).]
-> RL (Right-Left rotation): do a single right rotation, then a single left rotation.
-> LR (Left-Right rotation): do a single left rotation, then a single right rotation.
1. LL Rotation :
BINARY HEAPS
Structure Property :
COMPLETE TREE
A binary tree is completely full if it is of height h and has $2^{h+1} - 1$ nodes.
A binary tree of height h is complete if
it is empty, or
its left subtree is complete of height h-1 and its right subtree is
completely full of height h-2, or
its left subtree is completely full of height h-1 and its right subtree is
complete of height h-1.
PRIORITY QUEUE
[Figure: basic model of a priority queue H, with the two operations
Insertion(H) and Deletion(H).]
1. Structure Property :
The heap should be a complete binary tree, that is, a completely filled
binary tree with the possible exception of the
bottom level, which is filled from left to right.
A complete binary tree of height h has between $2^h$ and $2^{h+1} - 1$ nodes.
Sentinel Value :
The zeroth element is called the sentinel value. It is not a node of the tree.
It is needed because, when a new node is added, the new key is moved up
in a loop, and the sentinel value is what terminates that loop.
Index 0 holds the sentinel. It stores a value that is not part of the data (in a
min-heap, a value no larger than any key), so the loop stops at the root
without a separate test.
Structure Property : The tree always starts at index 1.
Max-Heap:
The largest Element is always in the root node.
Each node must have a key that is greater or equal to the key of each of its
children.
Examples
HEAP OPERATIONS:
There are 2 operations of heap
Insertion
Deletion
Insert:
Adding a new key to the heap
Example Problem :
Delete-max or Delete-min:
Removing the root node of a max- or min-heap, respectively
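The two operations can be sketched for an array-based min-heap as follows. The sketch assumes a fixed capacity, int keys, the root at index 1, and INT_MIN stored at index 0 as the sentinel. MinHeap, HEAP_CAPACITY, heap_init, heap_insert and heap_delete_min are hypothetical names.

```c
#include <limits.h>

#define HEAP_CAPACITY 64              /* assumed fixed capacity */

typedef struct {
    int size;
    int keys[HEAP_CAPACITY + 1];      /* keys[0] holds the sentinel */
} MinHeap;

void heap_init(MinHeap *h)
{
    h->size = 0;
    h->keys[0] = INT_MIN;             /* sentinel: never greater than any key */
}

/* Returns 1 on success, 0 if the heap is full. */
int heap_insert(MinHeap *h, int x)
{
    if (h->size >= HEAP_CAPACITY)
        return 0;
    int i = ++h->size;
    /* Percolate up; the sentinel at index 0 ends the loop at the root. */
    for (; h->keys[i / 2] > x; i /= 2)
        h->keys[i] = h->keys[i / 2];
    h->keys[i] = x;
    return 1;
}

/* Removes and returns the smallest key. The heap must not be empty. */
int heap_delete_min(MinHeap *h)
{
    int min = h->keys[1];
    int last = h->keys[h->size--];
    int i, child;

    /* Percolate the hole at the root down to where 'last' fits. */
    for (i = 1; i * 2 <= h->size; i = child) {
        child = i * 2;
        if (child != h->size && h->keys[child + 1] < h->keys[child])
            child++;                  /* pick the smaller child */
        if (last > h->keys[child])
            h->keys[i] = h->keys[child];
        else
            break;
    }
    h->keys[i] = last;
    return min;
}
```

Repeated calls to heap_delete_min return the keys in ascending order, which is the basis of heap sort.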
1. DELETE MIN
2. Delete Min -- 13
Other Heap Operations
1. Decrease Key.
2. Increase Key.
3. Delete.
4. Build Heap.
1. Decrease Key :
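[Figure: Decrease Key example, heaps listed level by level.
(10; 15, 12; 20, 30) -> key 15 is decreased to 8 -> (10; 8, 12; 20, 30)
-> 8 percolates up -> (8; 10, 12; 20, 30).]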
2. Increase Key :
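[Figure: Increase Key example, heaps listed level by level.
(10; 15, 12; 20, 30) -> key 15 is increased to 22 -> (10; 22, 12; 20, 30)
-> 22 percolates down -> (10; 20, 12; 22, 30).]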
3. Delete :
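The delete(P,H) operation removes the node at position P from the heap
H. This can be done in two steps:
Step 1 : Decreasekey(P, ∞, H), which lowers the key at P to -∞ so that it moves up to the root.
[Figure: heap (10; 20, 12; 22, 30) in which the key 20 is lowered to -∞ and
percolates up, giving (-∞; 10, 12; 22, 30).]
Step 2 : Deletemin(H)
[Figure: Deletemin then removes the root, which now holds -∞.]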
APPLICATIONS
The heap data structure has many applications
Heap sort
Selection algorithms
Graph algorithms
Heap sort :
One of the best sorting methods being in-place and with no quadratic
worst-case scenarios.
Selection algorithms:
Finding the min, max, both the min and max, median, or even the k-th
largest element can be done in linear time using heaps.
Graph algorithms:
By using heaps as internal traversal data structures, run time will be
reduced by an order of polynomial. Examples of such problems are Prim's
minimal spanning tree algorithm and Dijkstra's shortest path problem.
ADVANTAGE
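DISADVANTAGE
The points below concern heap memory (dynamic allocation), not the binary heap data structure.
Heap memory is expensive in terms of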
safety
maintenance
performance
Performance :
Allocating heap memory usually involves a long negotiation with the OS.
Maintenance:
Dynamic allocation may fail; extra code to handle such exception is
required.
Safety :
Object may be deleted more than once or not deleted at all .
B-TREES
Multi-way Tree
A multi-way (or m-way) search tree of order m is a tree in which
A B-tree of order m (or branching factor m), where m > 2, is either an empty
tree or a multiway search tree with the following properties:
The root is either a leaf or it has at least two non-empty subtrees
and at most m non-empty subtrees.
Each non-leaf node, other than the root, has at least ⌈m/2⌉ non-empty
subtrees and at most m non-empty subtrees. (Note: ⌈x⌉ is the
lowest integer ≥ x.)
The number of keys in each non-leaf node is one less than the
number of non-empty subtrees for that node.
All leaf nodes are at the same level; that is the tree is perfectly
balanced.
Insertion in B-Trees
OVERFLOW CONDITION:
A root-node or a non-root node of a B-tree of order m overflows if, after a
key insertion, it contains m keys.
Insertion algorithm:
If a node overflows, split it into two, propagate the "middle" key to the
parent of the node. If the parent overflows the process propagates upward. If
the node has no parent, create a new root node.
• Note: Insertion of a key always starts at a leaf node.
Application of graphs:
Coloring of maps
Representing networks
o Paths in a city
o Telephone networks
o Electrical circuits, etc.
Social networks, including
o LinkedIn
o Facebook
Types of Graphs:
Directed graph
Undirected Graph
Directed Graph:
If directions are shown on the edges of a graph, the graph is called a
directed graph.
That is, a graph G = (V, E) is a directed graph if each edge is an ordered
pair (v, w) of vertices, where v, w ∈ V.
Sub graph:
A sub-graph G' of a graph G is a graph such that the set of vertices and the set of edges
of G' are subsets of the set of vertices and the set of edges of G,
respectively.
Connected Graph:
A graph is connected if there is a path from any vertex to any other vertex in the graph.
A graph that is not connected is said to
be disconnected.
path:
A path in a graph is a finite or infinite sequence of edges which connect a sequence of
vertices. That is, a path from one vertex to another is represented by the
sequence of all vertices (including source and destination) visited between those two vertices.
Simple Cycle: a cycle that does not pass through other vertices more than once
Degree:
The degree of a graph vertex v of a graph G is the number of graph edges which touch v. The
vertex degree is also called the local degree or valency. Or
The degree (or valence) of a vertex is the number of edge ends at that vertex.
For example, in this graph all of the vertices have degree three.
In a digraph (directed graph) the degree is usually divided into the in-degree and the out-
degree
In-degree: The in-degree of a vertex v is the number of edges with v as their terminal
vertex.
Out-degree: The out-degree of a vertex v is the number of edges with v as their initial
vertex.
TOPOLOGICAL SORT
A topological sort is a linear ordering of vertices in a Directed Acyclic Graph
such that if there is a path from Vi to Vj, then Vj appears after Vi in the linear
ordering. Topological sort is not possible if the graph has a cycle.
INTRODUCTION
In graph theory, a topological sort or topological ordering of a directed
acyclic graph (DAG) is a linear ordering of its nodes in which each node
comes before all nodes to which it has outbound edges.
Every DAG has one or more topological sorts.
More formally, define the partial order relation R over the nodes of the
DAG such that xRy if and only if there is a directed path from x to y.
Then, a topological sort is a linear extension of this partial order, that is,
a total order compatible with the partial order.
PROCEDURE
Step – 1 : Find the indegree for every vertex.
Step – 2 : Place the vertices whose indegree is 0 on the empty queue.
Step – 3 : Dequeue the vertex V and decrement the indegrees of all its adjacent
vertices.
Step – 4 : Enqueue the vertex on the queue if its indegree falls to zero.
Step – 5 : Repeat from Step -3 until the queue becomes empty.
The topological ordering is the order in which the vertices dequeue.
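The procedure above can be sketched in C as shown below. The sketch assumes the graph is stored as an adjacency matrix with at most MAX_VERTICES vertices numbered from 0. MAX_VERTICES and topological_sort are hypothetical names.

```c
#define MAX_VERTICES 16               /* assumed upper bound on vertices */

/* adj[u][v] != 0 means there is an edge u -> v.
   Writes the ordering into order[] and returns how many vertices were
   placed; a result smaller than n means the graph has a cycle. */
int topological_sort(int n, int adj[MAX_VERTICES][MAX_VERTICES], int order[])
{
    int indegree[MAX_VERTICES] = {0};
    int queue[MAX_VERTICES];
    int front = 0, rear = 0, count = 0;

    /* Step 1: find the indegree of every vertex. */
    for (int u = 0; u < n; u++)
        for (int v = 0; v < n; v++)
            if (adj[u][v])
                indegree[v]++;

    /* Step 2: enqueue the vertices whose indegree is 0. */
    for (int v = 0; v < n; v++)
        if (indegree[v] == 0)
            queue[rear++] = v;

    /* Steps 3 to 5: dequeue, decrement neighbours, enqueue new zeros. */
    while (front < rear) {
        int u = queue[front++];
        order[count++] = u;
        for (int v = 0; v < n; v++)
            if (adj[u][v] && --indegree[v] == 0)
                queue[rear++] = v;
    }
    return count;
}
```

With an adjacency matrix this sketch takes O(V^2) time; an adjacency-list representation would bring it down to O(V + E).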
Vertices Indegree
1 0
2 0
GRAPH TRAVERSAL
Graph traversal means visiting all the nodes of a graph.
The traversals are :
1. DFS (Depth First Search)
2. BFS (Breadth First Search)
Procedure
Step -1 Select the start vertex/source vertex. Visit the vertex and mark it
as one (1) (1 represents visited vertex).
Step -2 Enqueue the vertex.
Step -3 Dequeue the vertex.
Step -4 Find the Adjacent vertices.
Step -5 Visit the unvisited adjacent vertices and mark each of them as 1 (visited).
Step -6 Enqueue the adjacent vertices that were just visited.
Step -7 Repeat from Step – 3 to Step – 6 until the queue becomes empty.
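A breadth-first traversal following this procedure might be sketched as below, again assuming an adjacency-matrix graph with vertices numbered from 0. MAX_VERTICES and bfs are hypothetical names.

```c
#define MAX_VERTICES 16               /* assumed upper bound on vertices */

/* Visits every vertex reachable from start in breadth-first order.
   Writes the visiting order into order[] and returns its length. */
int bfs(int n, int adj[MAX_VERTICES][MAX_VERTICES], int start, int order[])
{
    int visited[MAX_VERTICES] = {0};
    int queue[MAX_VERTICES];
    int front = 0, rear = 0, count = 0;

    visited[start] = 1;               /* steps 1 and 2 */
    queue[rear++] = start;

    while (front < rear) {
        int u = queue[front++];       /* step 3: dequeue */
        order[count++] = u;
        for (int v = 0; v < n; v++) { /* step 4: adjacent vertices */
            if (adj[u][v] && !visited[v]) {
                visited[v] = 1;       /* step 5: mark as visited */
                queue[rear++] = v;    /* step 6: enqueue */
            }
        }
    }
    return count;
}
```

Marking a vertex as visited at the moment it is enqueued ensures that no vertex enters the queue twice.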
Ex 1 : graph with vertices A, B, C, D; start vertex A.
Vertices Visited
A 1
B 0 -> 1
C 0 -> 1
D 0 -> 1
Enqueue A B C D
Dequeue A B C D
Ex 2 : graph with vertices A, B, C, D, E; start vertex C.
Vertices Visited
C 1
A 0 -> 1
B 0 -> 1
D 0 -> 1
E 0 -> 1
Enqueue C A B D E
Dequeue C A B D E
APPLICATIONS
Breadth-first search can be used to solve many problems in graph theory, for
example:
Finding all nodes within one connected component
Copying garbage collection (Cheney's algorithm)
Finding the shortest path between two nodes u and v (in an unweighted
graph)
Testing a graph for bipartiteness
(Reverse) Cuthill–McKee mesh numbering
Testing whether graph is connected.
Computing a spanning forest of graph.
Computing, for every vertex in graph, a path with the minimum number
of edges between start vertex and current vertex or reporting that no
such path exists.
Computing a cycle in graph or reporting that no such cycle exists.
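[Table: depth-first search on a graph with vertices A, B, C, D, starting at A.
A is marked visited (1) first; B, C and D change from 0 to 1 as they are
reached via the stack. Output order: A D C B.]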
I. Undirected Graph :
An undirected graph is connected if and only if a depth first search
starting from any node visits every node.
[Figure: example undirected graphs on vertices A to G and their depth-first
spanning trees. Legend: 1. Tree Edge, 2. Back Edge (drawn as a dashed arrow).]
II. Bi-Connectivity :
The vertices whose removal disconnects the graph are called
articulation points.
If, in a connected undirected graph, the removal of any single node does not
disconnect the rest of the graph, then the graph is said to be a biconnected graph.
Two values are computed for each vertex during a depth first search:
1. Num
2. Low
A non-root vertex v is an articulation point if it has a child w with Low(w) ≥ Num(v).
Calculation of Num and Low gives the articulation points.
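[Figure: an undirected graph on vertices A to G and its depth-first tree, each
vertex labelled (Num, Low): A (1,1), B (2,1), C (3,1), D (4,1), E (5,4),
F (6,4), G (7,7).]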
BICONNECTIVITY :
A connected undirected graph is biconnected if there are no vertices
whose removal disconnects the rest of the graph.
Articulation Point :
The vertices whose removal disconnects the graph are known as
Articulation Points.
Steps to find Articulation Points :
(i) Perform DFS, starting at any vertex.
(ii) Number the vertices as they are visited; this number is Num(V).
(iii) For every vertex V in the DFS tree, compute the lowest-numbered
vertex, which we call Low(V), that is reachable from V by taking zero
or more tree edges and then possibly one back edge.
[Figure: example graphs on vertices A to K with their depth-first trees, and the
Visited[], Parent[] and Num[] arrays for a four-vertex example whose
(Num, Low) labels are A (1,1), B (2,1), C (3,1), D (4,4).]
APPLICATION
Biconnectivity is an application of depth first search.
Used mainly in network concepts.
BICONNECTIVITY ADVANTAGES
Total time to perform traversal is minimum.
Adjacency lists are used.
Traversal takes O(|E| + |V|) time.
DISADVANTAGES
Have to be careful to avoid cycles
Vertices should be carefully removed as it affects the rest of the graph.
EULER CIRCUIT
EULERIAN PATH
An Eulerian path in an undirected graph is a path that uses each edge
exactly once. If such a path exists, the graph is called traversable or semi-
eulerian.
EULERIAN CIRCUIT
An Eulerian circuit or Euler tour in an undirected graph is a cycle that
uses each edge exactly once. If such a cycle exists, the graph is called
Unicursal. While such graphs are Eulerian graphs, not every Eulerian graph
possesses an Eulerian cycle.
EULER'S THEOREM
Euler's theorem 1
If a graph has any vertex of odd degree, then it cannot have an Euler
circuit.
If a graph is connected and every vertex is of even degree, then it has at
least one Euler circuit.
Euler's theorem 2
If a graph has more than two vertices of odd degree, then it cannot have
an Euler path.
If a graph is connected and has exactly two vertices of odd degree, then it has at
least one Euler path. Any such path must start at one of the odd
vertices and end at the other odd vertex.
ALGORITHM
Fleury's Algorithm for finding an Euler Circuit
1. Check to make sure that the graph is connected and all vertices are of
even degree
2. Start at any vertex
3. Travel through an edge:
o If it is not a bridge for the untraveled part, or
o there is no other alternative
4. Label the edges in the order in which you travel them.
5. When you cannot travel any more, stop.
Fleury's Algorithm
1. pick any vertex to start .
2. from that vertex pick an edge to traverse .
3. darken that edge, as a reminder that you can't traverse it again .
4. travel that edge, coming to the next vertex .
5. repeat 2-4 until all edges have been traversed, and you are back at the
starting vertex .
At each stage of the algorithm:
the original graph minus the darkened (already used) edges = reduced
graph
important rule: never cross a bridge of the reduced graph unless there is
no other choice
Note:
APPLICATION
Advantage of Bubble sort
• It is simple to write
• Easy to understand
• It only takes a few lines of code.
Disadvantage of Bubble sort
• The major drawback is the amount of time it takes to sort.
• The average time grows quadratically, O(n^2), as the number of table elements
increases.
Quick Sort
Quicksort is a divide and conquer algorithm.
The basic idea is to find a “pivot” item in the array and compare all other items with pivot
element.
Shift items such that all of the items before the pivot are less than the pivot value and all
the items after the pivot are greater than the pivot value.
After that, recursively perform the same operation on the items before and after the pivot.
Find a “pivot” item in the array. This item is the basis for comparison for a single round.
Start a pointer (the left pointer) at the first item after the pivot.
Start a pointer (the right pointer) at the last item in the array.
1. Take the leftmost element as the pivot, i.e. pivot = left, so A[0] is the pivot on the first call.
2. Set i = left + 1, i.e. A[1].
3. Set j = right, i.e. A[6] if there are 7 elements in the array.
4. While A[i] <= A[pivot], increment i; while A[j] > A[pivot], decrement j. If i < j, swap A[i]
and A[j] and repeat this step.
5. When i and j cross (i > j), swap A[pivot] and A[j]; the pivot is now in its final position.
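One way to code this partitioning scheme in C is sketched below. It uses the leftmost element as the pivot, as in the steps above, and adds bounds checks so that i and j cannot run past each other. quick_sort and swap_int are hypothetical names.

```c
static void swap_int(int *a, int *b)
{
    int t = *a;
    *a = *b;
    *b = t;
}

/* Sorts a[left..right] in ascending order. */
void quick_sort(int a[], int left, int right)
{
    if (left >= right)
        return;                       /* zero or one element */

    int pivot = left;                 /* pivot index: leftmost element */
    int i = left + 1;
    int j = right;

    while (i <= j) {
        while (i <= j && a[i] <= a[pivot])
            i++;                      /* skip items that belong on the left */
        while (i <= j && a[j] > a[pivot])
            j--;                      /* skip items that belong on the right */
        if (i < j)
            swap_int(&a[i], &a[j]);   /* both are on the wrong side */
    }
    swap_int(&a[pivot], &a[j]);       /* pivot goes to its final position */

    quick_sort(a, left, j - 1);       /* items before the pivot */
    quick_sort(a, j + 1, right);      /* items after the pivot */
}
```

The whole array of n elements is sorted with quick_sort(a, 0, n - 1). With this pivot choice an already sorted input is the worst case.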
Advantages of Quick sort
• Fast and efficient on average, so it deals well with a huge list of items.
• No auxiliary array is required; the only extra space is the recursion stack.
Disadvantages of Quick sort
• The difficulty of implementing the partitioning algorithm.
Merge Sort
Merge sort is a sorting algorithm that uses the divide, conquer, and combine algorithmic
paradigm.
Divide means partitioning the n-element array to be sorted into two sub-arrays of n/2 elements.
If there is more than one element in the array, divide A into two sub-arrays, A1 and A2, each containing
about half of the elements of A.
Conquer means sorting the two sub-arrays recursively using merge sort.
Combine means merging the two sorted sub-arrays of size n/2 to produce the sorted array of n
elements.
The basic steps of a merge sort algorithm are as follows:
If the array is of length 0 or 1, then it is already sorted.
Otherwise, divide the unsorted array into two sub-arrays of about half the size.
Use merge sort algorithm recursively to sort each sub-array.
Merge the two sub-arrays to form a single sorted list.
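The basic steps can be sketched in C as follows. The sketch assumes a top-down recursive version with one temporary array allocated up front. merge_sort, msort and merge are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Merges the sorted runs a[lo..mid] and a[mid+1..hi] using tmp. */
static void merge(int a[], int tmp[], int lo, int mid, int hi)
{
    int i = lo, j = mid + 1, k = lo;

    while (i <= mid && j <= hi)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid)
        tmp[k++] = a[i++];            /* rest of the left run */
    while (j <= hi)
        tmp[k++] = a[j++];            /* rest of the right run */
    memcpy(&a[lo], &tmp[lo], (size_t)(hi - lo + 1) * sizeof a[0]);
}

static void msort(int a[], int tmp[], int lo, int hi)
{
    if (lo >= hi)
        return;                       /* length 0 or 1: already sorted */
    int mid = lo + (hi - lo) / 2;     /* divide */
    msort(a, tmp, lo, mid);           /* conquer the left half */
    msort(a, tmp, mid + 1, hi);       /* conquer the right half */
    merge(a, tmp, lo, mid, hi);       /* combine */
}

/* Sorts a[0..n-1]; returns 1 on success, 0 if memory allocation fails. */
int merge_sort(int a[], int n)
{
    if (n < 2)
        return 1;
    int *tmp = malloc((size_t)n * sizeof *tmp);
    if (tmp == NULL)
        return 0;
    msort(a, tmp, 0, n - 1);
    free(tmp);
    return 1;
}
```

Taking the left element when the two are equal keeps the sort stable.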
Radix Sort
Radix sort is one of the linear sorting algorithms. It is generalized form of bucket sort. It
can be performed using buckets from 0 to 9.
It is also called binsort, card sort.
It works by sorting the input based on each digit. In first pass all the elements are stored
according to the least significant digit.
In second pass the elements are arranged according to the next least significant digit and
so on till the most significant digit.
The number of passes in a Radix sort depends upon the number of digits in the given
numbers.
Step2: Consider the least significant digit (LSD) of each number (the digit in the one's
place, e.g., in 43 the LSD is 3).
Step3: Place the elements in their respective buckets according to the LSD of each number.
Step5: Repeat the same process with the digits in the 10's place (e.g., in 43 this digit is 4).
Step6: Repeat the same step till all the digits of the given numbers are considered.
Consider the following numbers to be sorted using Radix sort.
#include <stdio.h>

#define MAX_ITEMS 5   /* capacity of each bucket; n must not exceed this */

void Radix_sort(int a[], int n);

int main(void)
{
    int a[5] = { 4, 5, 2, 3, 6 }, i;

    Radix_sort(a, 5);
    printf("After Sorting :");
    for (i = 0; i < 5; i++)
        printf(" %d ", a[i]);
    printf("\n");
    return 0;
}

/* Radix sort for non-negative integers, using ten buckets (digits 0 to 9). */
void Radix_sort(int a[], int n)
{
    int bucket[10][MAX_ITEMS], buck[10];
    int i, j, k, l, num, div, large, passes;

    div = 1;
    num = 0;

    /* Find the largest element. */
    large = a[0];
    for (i = 0; i < n; i++)
    {
        if (a[i] > large)
        {
            large = a[i];
        }
    }

    /* The number of digits in the largest element is the number of passes. */
    while (large > 0)
    {
        num++;
        large = large / 10;
    }

    for (passes = 0; passes < num; passes++)
    {
        /* Empty all buckets. */
        for (k = 0; k < 10; k++)
        {
            buck[k] = 0;
        }
        /* Distribute the elements into buckets by the current digit. */
        for (i = 0; i < n; i++)
        {
            l = (a[i] / div) % 10;
            bucket[l][buck[l]++] = a[i];
        }
        /* Collect the buckets back into the array, in bucket order. */
        i = 0;
        for (k = 0; k < 10; k++)
        {
            for (j = 0; j < buck[k]; j++)
            {
                a[i++] = bucket[k][j];
            }
        }
        div *= 10;
    }
}
SEARCHING
Searching is an algorithm, to check whether a particular element is present in the list.
Types of searching:-
Linear search
Binary Search
Linear Search
Linear search is used to search a data item in the given set in the sequential manner, starting from
the first element. It is also called as sequential search
Binary Search
Binary search is used to search an item in a sorted list. In this method, initialize the lower
limit and upper limit.
The middle position is computed as (first+last)/2 and the element in the middle
position is compared with the data item to be searched.
If the data item is greater than the middle value, then the lower limit is adjusted to one
greater than the middle position. If it is smaller, the upper limit is adjusted to one less than the
middle position. If it is equal, the search stops.
Working principle:
The algorithm is quite simple. It can be done either recursively or iteratively:
1. Get the middle element.
2. If the middle element equals the searched value, the algorithm stops.
3. Otherwise, two cases are possible:
o The searched value is less than the middle element. In this case, go to step 1 for the
part of the array before the middle element.
o The searched value is greater than the middle element. In this case, go to step 1
for the part of the array after the middle element.
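An iterative version of these steps might look like the sketch below. It assumes an int array sorted in ascending order and returns -1 when the value is absent. binary_search is a hypothetical name.

```c
/* Returns the index of key in the sorted array a[0..n-1], or -1. */
int binary_search(const int a[], int n, int key)
{
    int first = 0;
    int last = n - 1;

    while (first <= last) {
        /* Same value as (first + last) / 2, but cannot overflow. */
        int mid = first + (last - first) / 2;

        if (a[mid] == key)
            return mid;               /* step 2: found */
        if (key > a[mid])
            first = mid + 1;          /* continue in the part after mid */
        else
            last = mid - 1;           /* continue in the part before mid */
    }
    return -1;                        /* not present */
}
```

Each iteration halves the search range, so at most about log2(n) + 1 comparisons of the middle element are needed.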
HASHING :
Hashing is a technique that is used to store, retrieve and find data in the data structure
called Hash Table. It is used to overcome the drawback of Linear Search (Comparison) &
Binary Search (Sorted order list). It involves two important concepts-
Hash Table
Hash Function
Hash table
A hash table is a data structure that is used to store and retrieve data (keys) very
quickly.
It is an array of some fixed size, containing the keys.
Hash table run from 0 to Tablesize – 1.
Each key is mapped into some number in the range 0 to Tablesize – 1.
This mapping is called Hash function.
Insertion of the data in the hash table is based on the key value obtained from the
hash function.
Using the same hash key value, the data can be retrieved from the hash table with
only a few key comparisons.
The load factor of a hash table is calculated using the formula:
(Number of data elements in the hash table) / (Size of the hash table)
Factors affecting Hash Table Design
Hash function
Table size.
Collision handling scheme
[Figure: simple hash table with table size = 10 (cells indexed 0 to 9).]
Hash function:
It is a function, which distributes the keys evenly among the cells in the Hash
Table.
Using the same hash function we can retrieve data from the hash table.
Hash function is used to implement hash table.
The integer value returned by the hash function is called hash key.
If the input keys are integers, the commonly used hash function is
H(key) = key % Tablesize
The folding method for constructing hash functions begins by dividing the item into
equal-size pieces (the last piece may not be of equal size). These pieces are then added together
to give the resulting hash key value. For example, if our item was the phone number 436-555-
4601, we would take the digits and divide them into groups of 2 (43, 65, 55, 46, 01). After the
addition, 43+65+55+46+01, we get 210. If we assume our hash table has 11 slots, then we need
to perform the extra step of dividing by 11 and keeping the remainder. In this case 210 % 11 is 1,
so the phone number 436-555-4601 hashes to slot 1.
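The folding example can be checked with a short C sketch. It assumes the item is given as a string, that non-digit characters such as hyphens are skipped, and that the pieces are two digits long. fold_hash is a hypothetical name.

```c
#include <ctype.h>

/* Folding hash: splits the digits of item into two-digit pieces,
   adds the pieces, and reduces the sum modulo table_size. */
unsigned fold_hash(const char *item, unsigned table_size)
{
    unsigned sum = 0;
    unsigned piece = 0;
    int digits = 0;

    for (; *item != '\0'; item++) {
        if (!isdigit((unsigned char)*item))
            continue;                 /* skip separators such as '-' */
        piece = piece * 10 + (unsigned)(*item - '0');
        if (++digits == 2) {          /* a piece is complete */
            sum += piece;
            piece = 0;
            digits = 0;
        }
    }
    sum += piece;                     /* last piece may be shorter */
    return sum % table_size;
}
```

Calling fold_hash("436-555-4601", 11) adds 43 + 65 + 55 + 46 + 01 = 210 and returns 210 % 11 = 1, as in the example above.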
Collision:
If two or more keys hash to the same index, the corresponding records cannot be stored in the
same location. This condition is known as collision.
Characteristics of Good Hashing Function:
[Figure: empty separate chaining hash table with 10 slots (0 to 9); each slot
holds a pointer to a linked list, initially NULL.]
Insert the following four keys 22 84 35 62 into hash table of size 10 using separate chaining.
The hash function is
H(key) = key % 10
1. H(22) = 22 % 10 = 2
2. H(84) = 84 % 10 = 4
3. H(35) = 35 % 10 = 5
4. H(62) = 62 % 10 = 2
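In this example 62 hashes to the same slot as 22, so both end up in the list for slot 2. A minimal C sketch of separate chaining is given below; it assumes non-negative int keys, a table of 10 slots, new keys placed at the front of the list, and duplicates ignored. ChainNode, ChainTable, chain_insert and chain_find are hypothetical names.

```c
#include <stdlib.h>

#define TABLE_SIZE 10

typedef struct ChainNode {
    int key;
    struct ChainNode *next;
} ChainNode;

/* All slots must be NULL before first use. */
typedef struct {
    ChainNode *slots[TABLE_SIZE];
} ChainTable;

/* Returns 1 if key is in the table, 0 otherwise. */
int chain_find(const ChainTable *t, int key)
{
    for (const ChainNode *p = t->slots[key % TABLE_SIZE]; p != NULL; p = p->next)
        if (p->key == key)
            return 1;
    return 0;
}

/* Inserts key at the front of its list; returns 0 only if malloc fails. */
int chain_insert(ChainTable *t, int key)
{
    int h = key % TABLE_SIZE;         /* H(key) = key % 10 */

    if (chain_find(t, key))
        return 1;                     /* already present */
    ChainNode *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->key = key;
    n->next = t->slots[h];            /* new node goes to the front */
    t->slots[h] = n;
    return 1;
}
```

After inserting 22, 84, 35 and 62, slot 2 holds the list 62 -> 22, slot 4 holds 84 and slot 5 holds 35.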
Advantages
1. Any number of elements can be inserted, because each slot holds a linked list.
Disadvantages
1. It requires more pointers, which occupy more memory space.
2. Search takes time, since it takes time to evaluate the hash function and also to traverse the
list.
Open Addressing
Closed Hashing
Collision resolution technique
Uses Hi(X)=(Hash(X)+F(i))mod Tablesize
When collision occurs, alternative cells are tried until empty cells are found.
Types:-
Linear Probing
Quadratic Probing
Double Hashing
Hash function
H(key) = key % table size.
Insert Operation
To insert a key; Use the hash function to identify the list to which the
element should be inserted.
Then traverse the list to check whether the element is already present.
If exists, increment the count.
Else the new element is placed at the front of the list.
Linear Probing:
Easiest method to handle collision.
Apply the hash function H (key) = key % table size
Hi(X)=(Hash(X)+F(i))mod Tablesize,where F(i)=i.
How to probe:
first probe: given a key k, hash to H(key)
second probe: if H(key) is occupied, try H(key)+f(1)
third probe: if H(key)+f(1) is occupied, try H(key)+f(2)
And so forth.
Probing Properties:
We force f(0)=0
The ith probe is to (H(key) + f(i)) % Tablesize.
If i reaches Tablesize - 1 without finding an empty cell, the probe has failed.
Depending on f(i), the probe may fail sooner.
Long sequences of probes are costly.
Probe Sequence is:
H(key) % Tablesize
(H(key) + 1) % Tablesize
(H(key) + 2) % Tablesize
1. H(Key) = Key mod Tablesize
This is the common formula that you should apply for any hashing.
If a collision occurs, use Formula 2.
2. Hi(Key) = (H(Key) + i) % Tablesize
where i = 1, 2, 3, ... etc.
Example: - 89 18 49 58 69; Tablesize=10
1. H(89) =89%10
=9
2. H(18) =18%10
=8
3. H(49) =49%10
=9 ((collides with 89, so try the next free cell using formula 2))
i=1 h1(49) = (H(49)+1)%10
= (9+1)%10
=10%10
=0
4. H(58) =58%10
=8 ((collides with 18))
i=1 h1(58) = (H(58) +1)%10
= (8+1) %10
=9%10
=9 =>Again collision
i=2 h2(58) =(H(58)+2)%10
=(8+2)%10
=10%10
=0 =>Again collision
i=3 h3(58) =(H(58)+3)%10
=(8+3)%10
=11%10
=1
5. H(69) =69%10
=9 ((collides with 89; cells 0 and 1 are also occupied))
i=3 h3(69) =(H(69)+3)%10
=(9+3)%10
=12%10
=2
Index  Empty  After 89  After 18  After 49  After 58  After 69
0                                 49        49        49
1                                           58        58
2                                                     69
3
4
5
6
7
8                       18        18        18        18
9             89        89        89        89        89
Linear probing
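The insertions in this example can be reproduced with a small C sketch. It assumes non-negative int keys, a table of 10 cells, and -1 as the marker for an empty cell. EMPTY_CELL, probe_init and linear_probe_insert are hypothetical names.

```c
#define TABLE_SIZE 10
#define EMPTY_CELL (-1)               /* assumed marker for an unused cell */

void probe_init(int table[TABLE_SIZE])
{
    for (int i = 0; i < TABLE_SIZE; i++)
        table[i] = EMPTY_CELL;
}

/* Inserts key using Hi(key) = (H(key) + i) % TABLE_SIZE.
   Returns the cell used, or -1 if the table is full. */
int linear_probe_insert(int table[TABLE_SIZE], int key)
{
    int home = key % TABLE_SIZE;      /* H(key) */

    for (int i = 0; i < TABLE_SIZE; i++) {
        int cell = (home + i) % TABLE_SIZE;
        if (table[cell] == EMPTY_CELL) {
            table[cell] = key;
            return cell;
        }
    }
    return -1;                        /* every cell was occupied */
}
```

Inserting 89, 18, 49, 58 and 69 in that order places them in cells 9, 8, 0, 1 and 2 respectively.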
Quadratic Probing
To resolve the primary clustering problem, quadratic probing can be used. With quadratic
probing, rather than always moving one spot, move $i^2$ spots from the point of collision, where
i is the number of attempts to resolve the collision.
It is another collision resolution method, which distributes items more evenly.
From the original index H, if the slot is filled, try cells $H+1^2$, $H+2^2$, $H+3^2$, ..., $H+i^2$ with
wrap-around.
$H_i(X) = (Hash(X) + F(i)) \bmod Tablesize$, where $F(i) = i^2$
$H_i(X) = (Hash(X) + i^2) \bmod Tablesize$
Limitation: at most half of the table can be used as alternative locations to resolve collisions.
This means that once the table is more than half full, it is difficult to find an empty spot.
Quadratic probing also introduces a new problem known as secondary clustering: elements that
hash to the same hash key will always probe the same alternative cells.
Double Hashing
Double hashing uses the idea of applying a second hash function to the key when a
collision occurs. The result of the second hash function is the number of positions from
the point of collision at which to try the insertion.
There are a couple of requirements for the second function:
It must never evaluate to 0.
It must make sure that all cells can be probed.
Hi(X)=(Hash(X)+i*Hash2(X))mod Tablesize
A popular second hash function is:
Hash2 (key) = R - (key % R) where R is a prime number that is smaller than the size of the
table.
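A sketch of double hashing in C follows. It assumes non-negative int keys, a table of 10 cells, -1 as the empty marker, and R = 7 (a prime smaller than the table size); these values are chosen only for illustration. Because 10 is not prime, some step sizes cannot reach every cell, which is one reason a prime table size is normally preferred. R_PRIME, second_hash, double_hash_init and double_hash_insert are hypothetical names.

```c
#define TABLE_SIZE 10
#define R_PRIME 7                     /* assumed prime smaller than TABLE_SIZE */
#define EMPTY_CELL (-1)               /* assumed marker for an unused cell */

/* Hash2(key) = R - (key % R); the result is always in 1..R, never 0. */
static int second_hash(int key)
{
    return R_PRIME - (key % R_PRIME);
}

void double_hash_init(int table[TABLE_SIZE])
{
    for (int i = 0; i < TABLE_SIZE; i++)
        table[i] = EMPTY_CELL;
}

/* Inserts key using Hi(key) = (Hash(key) + i * Hash2(key)) % TABLE_SIZE.
   Returns the cell used, or -1 if no empty cell was found. */
int double_hash_insert(int table[TABLE_SIZE], int key)
{
    int home = key % TABLE_SIZE;
    int step = second_hash(key);

    for (int i = 0; i < TABLE_SIZE; i++) {
        int cell = (home + i * step) % TABLE_SIZE;
        if (table[cell] == EMPTY_CELL) {
            table[cell] = key;
            return cell;
        }
    }
    return -1;
}
```

With the keys 89, 18, 49, 58 and 69 inserted in that order, the sketch uses cells 9, 8, 6, 3 and 0: for example, 49 collides at cell 9 and Hash2(49) = 7 - 0 = 7 moves it to (9 + 7) % 10 = 6.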
Rehashing
Once the hash table gets too full, the running time for operations will start to take too
long and may fail. To solve this problem, a table at least twice the size of the original will be
built and the elements will be transferred to the new table.
Advantage:
The programmer does not have to worry about the table size.
Simple to implement
Can be used in other data structure as well
The new size of the hash table:
should also be prime
will be used to calculate the new insertion spot (hence the name rehashing)
This is a very expensive operation! O(N) since there are N elements to rehash and the
table size is roughly 2N. This is ok though since it doesn't happen that often.
The question becomes when should the rehashing be applied?
Some possible answers:
once the table becomes half full
once an insertion fails
once a specific load factor has been reached, where load factor is the ratio of the
number of elements in the hash table to the table size
Extendible Hashing
Extendible Hashing is a mechanism for altering the size of the hash table to accommodate
new entries when buckets overflow.
Common strategy in internal hashing is to double the hash table and rehash each entry.
However, this technique is slow, because writing all pages to disk is too expensive.
Therefore, instead of doubling the whole hash table, we use a directory of pointers to
buckets, and double the number of buckets by doubling the directory, splitting just the
bucket that overflows.
Since the directory is much smaller than the file, doubling it is much cheaper. Only one
page of keys and pointers is split.
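[Figure: extendible hashing example. A directory indexed first by one leading bit
(0, 1) and then by two leading bits (00, 01, 10, 11) points to buckets of 6-bit
keys, among them 000100, 010100, 100000, 111000, 001000, 011000, 101000, 111001,
001010, 101100 and 101110.]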