DSAL Lab Handout
TITLE: Create hash table for telephone book database and search data in it.
OBJECTIVES:
OUTCOMES:
PRE-REQUISITES:
QUESTIONS:
1. Define the SET ADT.
2. Explain iterators in the C++ STL.
Modern Education Society’s College of Engineering, Pune
OBJECTIVES:
1. To understand concept of tree data structure
2. To understand concept & features of object oriented programming.
Learning Outcome:
1. Define class for structures using Object Oriented features.
THEORY:
Introduction to Tree:
A tree T is a set of nodes storing elements such that the nodes have a parent-child
relationship that satisfies the following:
• if T is not empty, T has a special node called the root that has no parent
• each node v of T other than the root has a unique parent node w; each node with
parent w is a child of w
Recursive definition
• T is either empty
• or consists of a node r (the root) and a possibly empty set of trees whose roots are
the children of r
A tree is a widely used data structure that models a hierarchy with a set of linked
nodes. The tree is represented graphically as below.
The circles are the nodes and the edges are the links between them. Trees are usually
used to store and represent data in some hierarchical order. The data are stored in the
nodes of which the tree consists.
A subtree is a portion of a tree data structure that can be viewed as a complete tree in
itself. Any node in a tree T, together with all the nodes below it that are
reachable from it, comprises a subtree of T.
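The recursive definition above maps directly onto a node type that owns a list of child subtrees. The following is a minimal sketch only; the names TreeNode, addChild and subtreeSize are hypothetical and are not taken from the handout.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical general-tree node: one element plus a possibly empty set of subtrees.
struct TreeNode {
    std::string data;
    std::vector<std::unique_ptr<TreeNode>> children;
    explicit TreeNode(std::string d) : data(std::move(d)) {}
};

// Adds a child under `parent` and returns a non-owning pointer to it.
TreeNode* addChild(TreeNode& parent, const std::string& data) {
    parent.children.push_back(std::make_unique<TreeNode>(data));
    return parent.children.back().get();
}

// Number of nodes in the subtree rooted at `node`: the node itself plus all descendants.
int subtreeSize(const TreeNode& node) {
    int count = 1;
    for (const auto& child : node.children) count += subtreeSize(*child);
    return count;
}
```

Calling subtreeSize on any node treats that node as the root of a complete tree in itself, which is exactly the subtree idea described above.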
Important Terms
Following are the important terms with respect to tree.
Path − Path refers to the sequence of nodes along the edges of a tree.
Root − The node at the top of the tree is called root. There is only one root per tree
and one path from the root node to any node.
Parent − Any node except the root node has one edge upward to a node called parent.
Child − The node below a given node connected by its edge downward is called its
child node.
Leaf − The node which does not have any child node is called the leaf node.
Subtree − Subtree represents the descendants of a node.
Visiting − Visiting refers to checking the value of a node when control is on the node.
Traversing − Traversing means passing through nodes in a specific order.
Levels − Level of a node represents the generation of a node. If the root node is at
level 0, then its next child node is at level 1, its grandchild is at level 2, and so on.
Keys - Key represents a value of a node based on which a search operation is to be
carried out for a node.
There are two basic types of trees. In an unordered tree, a tree is a tree in a purely
structural sense — that is to say, given a node, there is no order for the children of that
node. A tree on which an order is imposed — for example, by assigning different
natural numbers to each child of each node — is called an ordered tree, and data
structures built on them are called ordered tree data structures. Ordered trees are by
far the most common form of tree data structure. Binary search trees are one kind of
ordered tree.
Advantages of trees
Trees are useful and frequently used because they have some significant
advantages:
· Trees reflect structural relationships in the data
· Trees are used to represent hierarchies
· Trees provide efficient insertion and searching
· Trees are very flexible, allowing subtrees to be moved around with minimum effort
QUESTIONS:
1. What is class, object and data structure?
OBJECTIVES:
1. To understand concept of Tree & Binary Tree.
OUTCOMES:
1. To analyze the working of various Tree operations.
THEORY:
Tree
A tree consists of nodes connected by edges. In graph terms, a tree is a connected graph
that contains no cycles. Trees are useful in describing any structure that involves
hierarchy. Familiar examples of such structures are family trees, the hierarchy of positions
in an organization, and so on.
Binary Tree
A binary tree is made of nodes, where each node contains a "left" reference, a "right"
reference, and a data element. The topmost node in the tree is called the root.
Every node (excluding the root) in a tree is connected by a directed edge from exactly one
other node. This node is called its parent. In a general tree, each node can be connected
to an arbitrary number of nodes, called children; in a binary tree, a node has at most two.
Nodes with no children are called leaves, or external nodes. Nodes which are not leaves are
called internal nodes. Nodes with the same parent are called siblings.
Insert Operation
The very first insertion creates the tree. Afterwards, whenever an element is to be inserted,
first locate its proper location. Start searching from the root node, then if the data is less
than the key value, search for the empty location in the left subtree and insert the data.
Otherwise, search for the empty location in the right subtree and insert the data.
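The insert rule above can be sketched as a short recursive function. This is an illustration under assumptions: the node layout and the names Node and insert are hypothetical, keys are plain integers, and a key equal to an existing one goes to the right subtree, following the "otherwise" branch of the description.

```cpp
#include <memory>

// Hypothetical node type for illustration; the handout does not fix a layout.
struct Node {
    int key;
    std::unique_ptr<Node> left, right;
    explicit Node(int k) : key(k) {}
};

// Inserts `key` into the subtree rooted at `root`.
// Smaller keys go left; all other keys go right.
void insert(std::unique_ptr<Node>& root, int key) {
    if (!root) {                       // empty location found: create the node here
        root = std::make_unique<Node>(key);
        return;
    }
    if (key < root->key)
        insert(root->left, key);
    else
        insert(root->right, key);
}
```

Starting from an empty pointer, the very first call creates the tree, and every later call walks down from the root until it reaches an empty location.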
Traversals
A traversal is a process that visits all the nodes in the tree. Since a tree is a nonlinear data
structure, there is no unique traversal. We will consider several traversal algorithms, which
we group into the following two kinds: depth-first traversal and breadth-first traversal.
There are three common depth-first traversals: pre-order, in-order and post-order.
There is only one kind of breadth-first traversal--the level order traversal. This traversal
visits nodes by levels from top to bottom and from left to right.
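The traversals can be sketched as follows. This is a minimal illustration with a hypothetical node type TNode holding a single character; each function appends the visited values to a vector so the visiting order can be inspected.

```cpp
#include <queue>
#include <vector>

// Hypothetical binary tree node used only for this illustration.
struct TNode {
    char data;
    TNode* left = nullptr;
    TNode* right = nullptr;
    explicit TNode(char c) : data(c) {}
};

// Depth-first: node, then left subtree, then right subtree.
void preorder(const TNode* n, std::vector<char>& out) {
    if (!n) return;
    out.push_back(n->data);
    preorder(n->left, out);
    preorder(n->right, out);
}

// Depth-first: left subtree, then node, then right subtree.
void inorder(const TNode* n, std::vector<char>& out) {
    if (!n) return;
    inorder(n->left, out);
    out.push_back(n->data);
    inorder(n->right, out);
}

// Depth-first: left subtree, then right subtree, then node.
void postorder(const TNode* n, std::vector<char>& out) {
    if (!n) return;
    postorder(n->left, out);
    postorder(n->right, out);
    out.push_back(n->data);
}

// Breadth-first: level by level, left to right, using a queue.
std::vector<char> levelOrder(const TNode* root) {
    std::vector<char> out;
    if (!root) return out;
    std::queue<const TNode*> q;
    q.push(root);
    while (!q.empty()) {
        const TNode* n = q.front();
        q.pop();
        out.push_back(n->data);
        if (n->left) q.push(n->left);
        if (n->right) q.push(n->right);
    }
    return out;
}
```

The three depth-first versions differ only in where the node itself is visited relative to its two subtrees; the level-order version replaces recursion with an explicit queue.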
As an example, consider the following tree and its four traversals:
3. (a) Draw the binary tree whose in-order traversal is DBEAFC and whose pre-order
traversal is ABDECF.
(b) What is the post-order traversal of this tree?
(c) Draw all binary search trees of height 2 that can be made from all the letters
ABCDEF, assuming the natural ordering.
OBJECTIVES:
OUTCOMES:
PRE-REQUISITES:
THEORY:
A Binary Search Tree is a node-based binary tree data structure which has the following
properties:
• The left subtree of a node contains only nodes with keys less than the node’s
key.
• The right subtree of a node contains only nodes with keys greater than the
node’s key.
• The left and right subtrees must each also be a binary search tree.
• There must be no duplicate keys.
The above properties of Binary Search Tree provide an ordering among keys so that
the operations like search, minimum and maximum can be done fast. If there is no
ordering, then we may have to compare every key to search a given key.
Searching a key:
To search for a given key in a Binary Search Tree, we first compare it with the root. If the
key is present at the root, we return the root. If the key is greater than the root’s key, we
recur on the right subtree of the root node. Otherwise, we recur on the left subtree. If we
reach an empty subtree, the key is not in the tree.
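The search procedure can be sketched in a few lines. The node type BstNode and the function name search are hypothetical; the function returns a pointer to the matching node, or a null pointer when the key is absent.

```cpp
// Hypothetical BST node with integer keys, for illustration only.
struct BstNode {
    int key;
    BstNode* left;
    BstNode* right;
};

// Returns the node holding `key`, or nullptr if the key is not in the tree.
const BstNode* search(const BstNode* root, int key) {
    if (root == nullptr || root->key == key) return root;  // not found, or found here
    if (key > root->key) return search(root->right, key);  // larger keys live on the right
    return search(root->left, key);                        // smaller keys live on the left
}
```

Each comparison discards one whole subtree, which is why the ordering property makes the search fast on a reasonably balanced tree.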
A simple implementation for the Dictionary ADT can be based on sorted or unsorted
lists. When implementing the dictionary with an unsorted list, inserting a new record
into the dictionary can be performed quickly by putting it at the end of the list.
However, searching an unsorted list for a particular record requires Θ(n) time in the
average case. For a large database, this is probably much too slow. Alternatively, the
records can be stored in a sorted list. If the list is implemented using a linked list, then
no speedup to the search operation will result from storing the records in sorted order.
On the other hand, if we use a sorted array-based list to implement the dictionary, then
binary search can be used to find a record in only Θ (log n) time. However, insertion
will now require Θ(n) time on average because, once the proper location for the new
record in the sorted list has been found, many records might be shifted to make room
for the new record.
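The trade-off of the sorted array-based list can be seen in a small sketch. The record type and the names Record, insertRecord and findRecord are assumptions made for this example; std::lower_bound performs the binary search, and std::vector::insert performs the shifting of later records.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record: a (key, value) pair; the vector is kept sorted by key.
using Record = std::pair<int, std::string>;

// Comparator for binary search: is the record's key less than the probe key?
static bool keyLess(const Record& r, int k) { return r.first < k; }

// Finding the slot takes Θ(log n), but vector::insert shifts every later record,
// so the whole operation is Θ(n) on average.
void insertRecord(std::vector<Record>& recs, int key, const std::string& value) {
    auto pos = std::lower_bound(recs.begin(), recs.end(), key, keyLess);
    recs.insert(pos, Record(key, value));
}

// Θ(log n) lookup by binary search; returns nullptr when the key is absent.
const std::string* findRecord(const std::vector<Record>& recs, int key) {
    auto it = std::lower_bound(recs.begin(), recs.end(), key, keyLess);
    return (it != recs.end() && it->first == key) ? &it->second : nullptr;
}
```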
One way to organize a collection of records so that inserting records and searching for
records can both be done quickly is to use a binary search tree (BST). The advantage
of using the BST is that all major operations (insert, search, and remove) are Θ(log n)
in the average case. Of course, if the tree is badly balanced, then the cost can be as
bad as Θ(n).
QUESTIONS:
OBJECTIVES:
1. To understand graph data structure using adjacency matrix/list
2. To perform graph traversal DFS using adjacency matrix with the help of a stack
3. To perform graph traversal BFS using adjacency list with the help of a
queue
OUTCOMES:
1. Apply and analyze linear data structures to solve non-linear data structure problems.
PRE-REQUISITE:
THEORY:
A graph is a collection of nodes or vertices (V) and edges (E) between them. We can
traverse these nodes using the edges. These edges might be weighted or
non-weighted.
There are two kinds of graphs:
• Undirected graph – you can traverse in either direction between two nodes.
• Directed graph – you can traverse only in the specified direction between
two nodes.
A graph is commonly represented in one of two ways:
• Adjacency Matrix
• Adjacency List
Adjacency Matrix:
An adjacency matrix is a 2-dimensional array of size V x V, where V is
the number of vertices in the graph. Entry (i, j) records whether there is an edge
from vertex i to vertex j. See the example below for the adjacency
matrix of the graph shown above.
The drawback is that it takes O(V²) space even when there are very few
edges in the graph.
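A depth-first traversal over an adjacency matrix, using an explicit stack as named in the objectives, might look like the sketch below. The names Matrix, addEdge and dfs are hypothetical, vertices are assumed to be numbered 0 to V-1, and the graph is assumed to be undirected and unweighted.

```cpp
#include <stack>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Records an undirected edge u-v in a V x V adjacency matrix.
void addEdge(Matrix& adj, int u, int v) {
    adj[u][v] = 1;
    adj[v][u] = 1;
}

// Iterative depth-first traversal from `start`; returns the order of visits.
std::vector<int> dfs(const Matrix& adj, int start) {
    const int n = static_cast<int>(adj.size());
    std::vector<bool> visited(n, false);
    std::vector<int> order;
    std::stack<int> st;
    st.push(start);
    while (!st.empty()) {
        int u = st.top();
        st.pop();
        if (visited[u]) continue;      // already reached through another edge
        visited[u] = true;
        order.push_back(u);
        // Push neighbours in reverse so the lowest-numbered one is explored first.
        for (int v = n - 1; v >= 0; --v)
            if (adj[u][v] && !visited[v]) st.push(v);
    }
    return order;
}
```

Scanning a full row for every vertex makes this traversal O(V²) regardless of how many edges exist, which mirrors the space drawback of the matrix itself.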
Adjacency List:
An adjacency list is an array of linked lists, where the array size is the same as the number
of vertices in the graph. Every vertex has a linked list. Each node in this
linked list represents a reference to another vertex which shares an edge
with the current vertex. The weights can also be stored in the linked list nodes.
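A breadth-first traversal over an adjacency list, using a queue as named in the objectives, can be sketched as follows. As an assumption for brevity, the sketch uses std::list from the standard library for each vertex's linked list rather than a hand-written one; the names AdjList, addEdge and bfs are hypothetical.

```cpp
#include <list>
#include <queue>
#include <vector>

// One linked list of neighbour indices per vertex.
using AdjList = std::vector<std::list<int>>;

// Records an undirected edge u-v.
void addEdge(AdjList& adj, int u, int v) {
    adj[u].push_back(v);
    adj[v].push_back(u);
}

// Breadth-first traversal from `start`; returns the order of visits.
std::vector<int> bfs(const AdjList& adj, int start) {
    std::vector<bool> visited(adj.size(), false);
    std::vector<int> order;
    std::queue<int> q;
    visited[start] = true;             // mark on enqueue so no vertex is queued twice
    q.push(start);
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        order.push_back(u);
        for (int v : adj[u]) {
            if (!visited[v]) {
                visited[v] = true;
                q.push(v);
            }
        }
    }
    return order;
}
```

Because each list holds only the edges that actually exist, the traversal touches every vertex and edge once, for O(V + E) work.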
QUESTIONS:
2) Given an undirected graph G with V vertices and E edges, What is the sum of the
degrees of all vertices?
AIM/PROBLEM STATEMENT: You have a business with several offices; you want to
lease phone lines to connect them up with each other; and the phone company
charges different amounts of money to connect different pairs of cities. You want a
set of lines that connects all your offices with a minimum total cost. Solve the problem
by suggesting appropriate data structures.
OBJECTIVES:
OUTCOMES:
PRE-REQUISITE:
THEORY:
Prim's algorithm for finding a minimum cost spanning tree uses the greedy approach, as
Kruskal's algorithm does. Prim's algorithm shares a similarity with the shortest path first
algorithms.
Prim's algorithm, in contrast with Kruskal's algorithm, treats the nodes as a single tree
and keeps on adding new nodes to the spanning tree from the given graph.
To contrast with Kruskal's algorithm and to understand Prim's algorithm better, we
shall use the same example –
Step 1 - Remove all loops and parallel edges
Remove all loops and parallel edges from the given graph. In case of parallel
edges, keep the one which has the least cost associated and remove all others.
Now, the tree S-7-A is treated as one node and we check for all edges going out
from it. We select the one which has the lowest cost and include it in the tree.
After this step, the tree S-7-A-3-C is formed. Now we again treat it as one node and
check all the edges going out from it, choosing only the least cost edge.
In this case, C-3-D is the new edge, whose cost is less than that of the other
candidate edges (8, 6, 4, etc.).
After adding node D to the spanning tree, we now have two edges going out of
it having the same cost, i.e., D-2-T and D-2-B. Thus, we can add either one. But
the next step will again yield edge 2 as the least cost. Hence, we are showing a
spanning tree with both edges included.
The output spanning tree of the same graph using the two different algorithms is
the same.
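The growth of a single tree by repeatedly taking the cheapest outgoing edge can be sketched with a min-heap. This is one common way to implement Prim's algorithm, not necessarily the one intended by the handout; the names Graph, addEdge and primMstCost are hypothetical, vertices are assumed to be numbered from 0, and the graph is assumed to be connected and undirected.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// g[u] holds (weight, v) pairs, one for each edge u-v.
using Edge = std::pair<int, int>;
using Graph = std::vector<std::vector<Edge>>;

void addEdge(Graph& g, int u, int v, int w) {
    g[u].push_back(Edge(w, v));
    g[v].push_back(Edge(w, u));
}

// Returns the total cost of a minimum spanning tree, grown outward from `start`.
int primMstCost(const Graph& g, int start) {
    std::vector<bool> inTree(g.size(), false);
    // Min-heap of (edge weight, vertex reached by that edge).
    std::priority_queue<Edge, std::vector<Edge>, std::greater<Edge>> pq;
    pq.push(Edge(0, start));
    int total = 0;
    while (!pq.empty()) {
        Edge top = pq.top();
        pq.pop();
        int w = top.first, u = top.second;
        if (inTree[u]) continue;       // edge leads back into the tree: skip it
        inTree[u] = true;              // cheapest edge leaving the tree: accept it
        total += w;
        for (const Edge& e : g[u])
            if (!inTree[e.second]) pq.push(e);
    }
    return total;
}
```

The heap plays the role of "check all edges going out from the tree and select the one with the lowest cost"; when two edges tie, either may be taken without changing the total.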
QUESTIONS:
1. Suppose we have an undirected graph with weights that can be either
positive or negative. Do Prim’s and Kruskal’s algorithm produce an MST for
such a graph?
2. Can a graph have more than one spanning tree?
3. State the difference between Prims and Kruskal’s Algorithm.
AIM/PROBLEM STATEMENT: Given a sequence K = k1 < k2 < … < kn of n sorted keys, with
a search probability pi for each key ki, build the binary search tree that has the least
search cost given the access probability for each key.
OBJECTIVES:
OUTCOMES:
1. Define class for Extended binary search tree using Object Oriented features.
2. Analyze working of functions.
PRE-REQUISITES:
THEORY:
An optimal binary search tree is a binary search tree for which the nodes are arranged on
levels such that the tree cost is minimum. For the purpose of a better presentation of
optimal binary search trees, we will consider “extended binary search trees”, which have
the keys stored at their internal nodes. Suppose “n” keys k1, k2, … kn are stored at the
internal nodes of a binary search tree. It is assumed that the keys are given in sorted order,
so that k1< k2 < … < kn.
An extended binary search tree is obtained from the binary search tree by adding
successor nodes to each of its terminal nodes as indicated in the following figure by
squares:
In the extended tree:
· The squares represent terminal nodes. These terminal nodes represent unsuccessful
searches of the tree for key values. The searches do not end successfully because
they are for key values that are not actually stored in the tree;
· The round nodes represent internal nodes; these are the actual keys stored in the
tree;
· Assuming that the relative frequency with which each key value is accessed is
known, weights can be assigned to each internal node of the extended tree (p1 … p6). They
represent the relative frequencies of searches terminating at each node, that is, they
mark the successful searches. Weights q0 … q6 are likewise assigned to the terminal
nodes and mark the unsuccessful searches;
· If the user searches for a particular key in the tree, 2 cases can occur:
· 1 – the key is found, so the corresponding weight 'p' is incremented;
· 2 – the key is not found, so the corresponding 'q' value is incremented.
GENERALIZATION:
The terminal node in the extended tree that is the left successor of k1 can be
interpreted as representing all key values that are not stored and are less than k1.
Similarly, the terminal node in the extended tree that is the right successor of kn
represents all key values not stored in the tree that are greater than kn. The terminal
node that falls between k(i-1) and ki in an inorder traversal represents all key
values not stored that lie between k(i-1) and ki.
COMPLEXITY ANALYSIS:
The algorithm requires O(n²) time and O(n²) storage. Therefore, as n increases it will
run out of storage even before it runs out of time. The storage needed can be reduced
by almost half by implementing the two-dimensional arrays as one-dimensional arrays,
because only the entries (i, j) with i ≤ j are used.
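The handout does not spell out the algorithm itself, so the sketch below uses the standard dynamic-programming recurrence for this problem: W(i, i) = qi, W(i, j) = W(i, j-1) + pj + qj, C(i, i) = 0, and C(i, j) = W(i, j) + min over i < k ≤ j of C(i, k-1) + C(k, j). The names ObstResult and optimalBst are hypothetical. For simplicity this version tries every candidate root and therefore takes O(n³) time; restricting k to the range R(i, j-1) … R(i+1, j) is the usual refinement that brings it down to O(n²).

```cpp
#include <limits>
#include <vector>

struct ObstResult {
    int cost;   // C(0, n): weighted cost of the best tree over all keys
    int root;   // R(0, n): 1-based index of the key at its root (0 if n == 0)
};

// p[1..n] are the success weights (p[0] is unused); q[0..n] are the failure weights.
ObstResult optimalBst(const std::vector<int>& p, const std::vector<int>& q) {
    const int n = static_cast<int>(q.size()) - 1;
    std::vector<std::vector<int>> W(n + 1, std::vector<int>(n + 1, 0)), C = W, R = W;
    for (int i = 0; i <= n; ++i) W[i][i] = q[i];           // empty subtrees
    for (int len = 1; len <= n; ++len) {
        for (int i = 0; i + len <= n; ++i) {
            const int j = i + len;
            W[i][j] = W[i][j - 1] + p[j] + q[j];
            int best = std::numeric_limits<int>::max();
            for (int k = i + 1; k <= j; ++k) {             // try each key k as the root
                const int c = C[i][k - 1] + C[k][j];
                if (c < best) {
                    best = c;
                    R[i][j] = k;
                }
            }
            C[i][j] = W[i][j] + best;
        }
    }
    return {C[0][n], R[0][n]};
}
```

Only the cells with i ≤ j are ever written, which is why the two-dimensional arrays can be packed into one-dimensional ones of roughly half the size.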
QUESTIONS:
Find the optimal binary search tree for N = 6, having keys k1 … k6 and weights p1 = 10,
p2 = 3, p3 = 9, p4 = 2, p5 = 0, p6 = 10; q0 = 5, q1 = 6, q2 = 4, q3 = 4, q4 = 3, q5 = 8, q6 = 0.
The following figure shows the arrays as they would appear after the initialization and
their final disposition.
W indicates weight
C indicates cost
R indicates root
OBJECTIVES:
OUTCOMES:
1. Apply and analyze non-linear data structures to solve real world complex problems.
PRE-REQUISITES:
THEORY:
In the second tree, the left subtree of C has height 2 and the right subtree has height 0,
so the difference is 2. In the third tree, the right subtree of A has height 2 and the left is
missing, so it is 0, and the difference is 2 again. An AVL tree permits the difference (balance
factor) to be at most 1 in absolute value.
BalanceFactor = height(left-subtree) − height(right-subtree)
If the difference in the height of left and right sub-trees is more than 1, the tree is
balanced using some rotation techniques.
AVL Rotations
To balance itself, an AVL tree may perform the following four kinds of rotations −
• Left rotation
• Right rotation
• Left-Right rotation
• Right-Left rotation
The first two rotations are single rotations and the next two rotations are double
rotations. To have an unbalanced tree, we at least need a tree of height 2. With this
simple tree, let's understand them one by one.
Left Rotation
If a tree becomes unbalanced when a node is inserted into the right subtree of the
right subtree, then we perform a single left rotation:
In our example, node A has become unbalanced as a node is inserted in the right
subtree of A's right subtree. We perform the left rotation by making A the left-subtree
of B.
Right Rotation
An AVL tree may become unbalanced if a node is inserted in the left subtree of the left
subtree. The tree then needs a right rotation.
As depicted, the unbalanced node becomes the right child of its left child by
performing a right rotation.
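The two single rotations can be sketched as pointer rearrangements. The node type AvlNode and the helper names are hypothetical. The sketch assumes the height convention used in the discussion of balance factors above: a missing subtree has height 0, so a single node has height 1.

```cpp
#include <algorithm>

// Hypothetical AVL node; a missing subtree counts as height 0, a single node as 1.
struct AvlNode {
    char key;
    AvlNode* left = nullptr;
    AvlNode* right = nullptr;
    int height = 1;
    explicit AvlNode(char k) : key(k) {}
};

int height(const AvlNode* n) { return n ? n->height : 0; }

void update(AvlNode* n) {
    n->height = 1 + std::max(height(n->left), height(n->right));
}

int balanceFactor(const AvlNode* n) { return height(n->left) - height(n->right); }

// Left rotation: the right child b becomes the subtree root and a becomes its left child.
AvlNode* rotateLeft(AvlNode* a) {
    AvlNode* b = a->right;
    a->right = b->left;    // b's old left subtree is re-attached under a
    b->left = a;
    update(a);             // a is now lower, so recompute it first
    update(b);
    return b;
}

// Right rotation: the left child b becomes the subtree root and c becomes its right child.
AvlNode* rotateRight(AvlNode* c) {
    AvlNode* b = c->left;
    c->left = b->right;    // b's old right subtree is re-attached under c
    b->right = c;
    update(c);
    update(b);
    return b;
}
```

Each function returns the new root of the rotated subtree, so the caller must store that result in the parent's child pointer (or in the tree's root pointer).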
Left-Right Rotation
A node has been inserted into the right subtree of the left subtree.
This makes C an unbalanced node. This scenario causes the AVL
tree to perform a left-right rotation.
We first perform a left rotation on the left subtree of C, which makes
A the left subtree of B.
We shall now right-rotate the tree, making B the new root node of
this subtree. C now becomes the right subtree of its own left
subtree.
Right-Left Rotation
A node has been inserted into the left subtree of the right subtree.
This makes A an unbalanced node with balance factor −2.
First, we perform the right rotation along C node, making C the right
subtree of its own left subtree B. Now, B becomes the right subtree
of A.
A left rotation then makes B the new root of this subtree, with A as its
left subtree.
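Both double rotations are compositions of the two single rotations. The sketch below repeats the single rotations in compact form so that it stands alone, and leaves out height bookkeeping for brevity; the node type DNode and the function names are hypothetical.

```cpp
// Hypothetical node without height bookkeeping, to keep the sketch short.
struct DNode {
    char key;
    DNode* left;
    DNode* right;
};

// Single rotations; each returns the new root of the rotated subtree.
DNode* rotateLeft(DNode* x) {
    DNode* y = x->right;
    x->right = y->left;
    y->left = x;
    return y;
}

DNode* rotateRight(DNode* x) {
    DNode* y = x->left;
    x->left = y->right;
    y->right = x;
    return y;
}

// Left-Right case: the new node went into the right subtree of the left child.
DNode* rotateLeftRight(DNode* c) {
    c->left = rotateLeft(c->left);    // step 1: left-rotate the left child
    return rotateRight(c);            // step 2: right-rotate the unbalanced node
}

// Right-Left case: the new node went into the left subtree of the right child.
DNode* rotateRightLeft(DNode* a) {
    a->right = rotateRight(a->right); // step 1: right-rotate the right child
    return rotateLeft(a);             // step 2: left-rotate the unbalanced node
}
```

In both cases the first step turns the zig-zag shape into a straight chain, which the second, single rotation can then balance.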
QUESTIONS:
1. What is an AVL tree? Explain with the help of an example. What are the applications of
AVL trees?
2. What is the difference between OBST, Huffman’s tree and AVL tree?
3. Explain single rotation and double rotation with examples.
4. Write down the time and space complexity of an AVL tree.
OBJECTIVES:
1. To understand priority queue data structure.
2. To understand practical implementation and usage of queue linear data structures
OUTCOMES:
1. Apply and analyze appropriate data structure to solve the real time problems using
priority queue
PRE-REQUISITE:
1. Knowledge of C++ programming
2. Knowledge of 2D-Array, structures
THEORY:
A Queue is a linear structure which follows a particular order in which the operations are
performed. The order is First In First Out (FIFO). The difference between stacks and
queues is in removing. In a stack we remove the item the most recently added; in a
queue, we remove the item the least recently added.
A queue in which we are able to insert and remove items from any position based on
some property (such as priority of the task to be processed) is often referred to as a
priority queue. Fig 1 represents a priority queue of jobs waiting to use a computer.
A priority queue is an extension of a queue in which every item has an associated priority
and an item with higher priority is removed before an item with lower priority.
A heap is generally preferred for priority queue implementation because heaps provide
better performance compared to arrays or linked lists.
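A heap-backed priority queue of jobs can be sketched with std::priority_queue from the C++ standard library, which is implemented on top of a heap. The record type Job, the comparator ByPriority and the helper nextJob are hypothetical, and the sketch assumes that a larger priority value means a more urgent job.

```cpp
#include <queue>
#include <string>
#include <vector>

// Hypothetical job record; a larger `priority` value means more urgent.
struct Job {
    int priority;
    std::string name;
};

// Orders the heap so that the job with the highest priority is on top.
struct ByPriority {
    bool operator()(const Job& a, const Job& b) const {
        return a.priority < b.priority;
    }
};

using JobQueue = std::priority_queue<Job, std::vector<Job>, ByPriority>;

// Removes and returns the most urgent job; the queue must not be empty.
Job nextJob(JobQueue& q) {
    Job j = q.top();
    q.pop();
    return j;
}
```

Both push and pop take O(log n) time on a heap, whereas a sorted array or linked list needs O(n) for insertion and an unsorted one needs O(n) to find the highest-priority item.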
Applications of Priority Queue:
1) CPU Scheduling
QUESTIONS:
OBJECTIVES:
1. To understand file handling.
2. To understand working of sequential file.
OUTCOMES:
1. To apply appropriate file handling techniques on given data
2. To use sequential file.
PRE-REQUISITE:
1. Knowledge of C++ programming
2. Basic knowledge of sequential file
THEORY:
A file is a collection of records related to each other. The file size is limited by the size
of memory and storage medium.
There are two important features of a file:
1. File Activity: File activity specifies the percentage of actual records which are processed in a
single run.
2. File Volatility: File volatility addresses the properties of record changes. Highly volatile
files are handled more efficiently on disk than on tape.
File Organization
File organization ensures that records are available for processing. It is used to
determine an efficient file organization for each base relation.
Types of File Organization-
1. Sequential access file organization
2. Indexed sequential access file organization
3. Direct access file organization
QUESTIONS:
1. Explain direct sequential file.
2. Explain advantages and disadvantages of the sequential file.
3. Explain advantages and disadvantages of direct access method.
OBJECTIVES:
1. To understand file handling.
2. To understand working of index sequential file.
OUTCOMES:
1. To apply appropriate file handling techniques on given data
2. To use index sequential file.
PRE-REQUISITE:
1. Knowledge of C++ programming
2. Basic knowledge of sequential file
THEORY:
A file is a collection of records related to each other. The file size is limited by the size
of memory and storage medium.
There are two important features of a file:
1. File Activity: File activity specifies the percentage of actual records which are processed in a
single run.
2. File Volatility: File volatility addresses the properties of record changes. Highly volatile
files are handled more efficiently on disk than on tape.
File Organization
File organization ensures that records are available for processing. It is used to
determine an efficient file organization for each base relation.
Types of File Organization-
1. Sequential access file organization
2. Indexed sequential access file organization
3. Direct access file organization
QUESTIONS:
1. Explain direct sequential file.
2. Explain advantages and disadvantages of the index sequential file.
TITLE: Mini-project
AIM/PROBLEM STATEMENT:
• Design a mini project using Java which will use different data structures
with or without the Java collection library and show the effect of the specific data
structure on the efficiency (performance) of the code.
• Design a mini project to implement Snake and Ladders Game using Python.
• Design a mini project to implement a Smart text editor.
• Design a mini project for automated term work assessment of students based
on parameters like daily attendance, Unit Test / Prelim performance, student
achievements if any, Mock Practical.