Unit 1 FDS
A data structure is a way of storing and organizing data on a computer so that it can be
accessed and updated efficiently.
A data structure is not only used for organizing the data. It is also used for processing, retrieving,
and storing data. Different basic and advanced types of data structures are used in almost every
program or software system that has been developed. So we must have good knowledge of data
structures.
Data structures are an integral part of computers used for the arrangement of data in memory.
They are essential and responsible for organizing, processing, accessing, and storing data
efficiently. But this is not all. Various types of data structures have their characteristics, features,
applications, advantages, and disadvantages. So how do you identify a data structure that is
suitable for a particular task? What is meant by the term ‘Data Structure’? How many types of
data structures are there and what are they used for?
We have got you covered. This unit gives a complete picture of what a data structure is, the
types of data structures, the classification of data structures, the applications of each data
structure, and so on.
We have already learned about data structures. People often confuse data types with data
structures, so let's look at a few differences between the two to make the distinction clear.
Data Type | Data Structure
The data type is the form of a variable to which a value can be assigned; it defines that the variable will hold values of the given data type only. | A data structure is a collection of different kinds of data; the entire collection can be represented using an object and used throughout the program.
It can hold a value but not data; therefore, it is dataless. | It can hold multiple types of data within a single object.
There is no notion of time complexity for a data type. | For data structure objects, time complexity plays an important role.
Data structure has many different uses in our daily life. There are many different data structures
that are used to solve different mathematical and logical problems. By using data structure, one
can organize and process a very large amount of data in a relatively short period. Let’s look at
different data structures that are used in different situations.
Linear data structure: Data structure in which data elements are arranged sequentially
or linearly, where each element is attached to its previous and next adjacent elements, is
called a linear data structure.
Examples of linear data structures are array, stack, queue, linked list, etc.
o Static data structure: Static data structure has a fixed memory size. It is easier to
access the elements in a static data structure.
An example of this data structure is an array.
o Dynamic data structure: In a dynamic data structure, the size is not fixed. It
can be updated at runtime, which may be considered efficient with respect to
the memory (space) complexity of the code.
Examples of this data structure are queue, stack, etc.
Non-linear data structure: Data structures where data elements are not placed
sequentially or linearly are called non-linear data structures. In a non-linear data
structure, we can’t traverse all the elements in a single run only.
Examples of non-linear data structures are trees and graphs.
The structure of the data and the design of the algorithm are closely related: data
representation must be easy to understand so that both the developer and the user can
implement the required operations efficiently.
Data structures provide an easy way of organizing, retrieving, managing, and storing data.
Here is a list of the needs for data structures.
1. Efficiency: The efficiency and organization of a program depend on the selection of the
right data structures. Suppose we want to search for a particular item in a collection of
data records. If the data is kept in a linear structure such as an array, we can perform a
sequential search, element by element. This works, but it is time-consuming for large
collections because, in the worst case, every element must be examined. Choosing a
better data structure, such as a binary search tree or a hash table, makes the search
process much more efficient.
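As an illustrative sketch of this difference (function names are our own), here is sequential search next to binary search on a sorted collection:

```python
def sequential_search(items, target):
    # Examine every element in turn: O(n) comparisons in the worst case.
    for i, value in enumerate(items):
        if value == target:
            return i
    return -1

def binary_search(sorted_items, target):
    # Halve the search range on each step: O(log n) comparisons.
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1

data = [3, 8, 15, 23, 42, 57, 91]
print(sequential_search(data, 42))  # 4
print(binary_search(data, 42))      # 4
```

Binary search requires the data to be kept sorted, which is exactly the kind of organizational choice this point is about.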
2. Reusability: Data structures make programs reusable. For example, once we implement
a particular data structure, we can use it from any place in the program (or from other
programs) and get the same results.
3. Abstraction: The data structure is maintained by the ADT, which provides different
levels of abstraction. The client can interact with data structures only through the
interface.
4. Simplification: Data structures simplify the process of collecting data in software
systems.
5. Storage: They provide a compact way of storing collections of data on a computer so
that the data can be used by various programs.
Disadvantages:
1. Only a user who has deep knowledge of the functionality of a data structure can safely
make changes to it.
2. If there is an error inside a data structure, an expert is needed to detect the bug; an
ordinary user cannot diagnose and fix the problem alone.
Problems
Programmers commonly deal with problems, algorithms, and computer programs. These are
three distinct concepts. A problem can be viewed as a mathematical function: a mapping from
a set of inputs to a set of outputs.
This concept of all problems behaving like mathematical functions might not match your
intuition for the behavior of computer programs. You might know of programs to which you can
give the same input value on two separate occasions, and two different outputs will result. For
example, if you type date to a typical Linux command line prompt, you will get the current date.
Naturally the date will be different on different days, even though the same command is given.
However, there is obviously more to the input for the date program than the command that you
type to run the program. The date program computes a function. In other words, on any
particular day there can only be a single answer returned by a properly running date program on
a completely specified input. For all computer programs, the output is completely determined by
the program’s full set of inputs. Even a “random number generator” is completely determined by
its inputs (although some random number generating systems appear to get around this by
accepting a random input from a physical process beyond the user’s control). The limits to what
functions can be implemented by programs is part of the domain of Computability.
Algorithms -
An algorithm is a method or process followed to solve a problem. A given problem can
usually be solved by many different algorithms, and these algorithms can differ greatly in
efficiency. Solution A might be more efficient than solution B for a specific variation of the
problem, or for a specific class of inputs to the problem, while solution B might be more
efficient than A for another variation or class of inputs. For example, one sorting algorithm
might be the best for
sorting a small collection of integers (which is important if you need to do this many times).
Another might be the best for sorting a large collection of integers. A third might be the best for
sorting a collection of variable-length strings.
By definition, something can only be called an algorithm if it has all of the following properties.
1. It must be correct. In other words, it must compute the desired function, converting each
input to the correct output. Note that every algorithm implements some function, because
every algorithm maps every input to some output (even if that output is a program crash).
At issue here is whether a given algorithm implements the intended function.
2. It is composed of a series of concrete steps. Concrete means that the action described by
that step is completely understood — and doable — by the person or machine that must
perform the algorithm. Each step must also be doable in a finite amount of time. Thus,
the algorithm gives us a “recipe” for solving the problem by performing a series of steps,
where each such step is within our capacity to perform. The ability to perform a step can
depend on who or what is intended to execute the recipe. For example, the steps of a
cookie recipe in a cookbook might be considered sufficiently concrete for instructing a
human cook, but not for programming an automated cookie-making factory.
3. There can be no ambiguity as to which step will be performed next. Often it is the next
step of the algorithm description. Selection (e.g., the if statement) is normally a part of
any language for describing algorithms. Selection allows a choice for which step will be
performed next, but the selection process is unambiguous at the time when the choice is
made.
4. It must be composed of a finite number of steps. If the description for the algorithm were
made up of an infinite number of steps, we could never hope to write it down, nor
implement it as a computer program. Most languages for describing algorithms
(including English and “pseudocode”) provide some way to perform repeated actions,
known as iteration. Examples of iteration in programming languages include the while
and for loop constructs. Iteration allows for short descriptions, with the number of steps
actually performed controlled by the input.
5. It must terminate. In other words, it may not go into an infinite loop.
Programs
The requirement that an algorithm must terminate means that not all computer programs meet the
technical definition of an algorithm. Your operating system is one such program. However, you
can think of the various tasks for an operating system (each with associated inputs and outputs)
as individual problems, each solved by specific algorithms implemented by a part of the
operating system program, and each one of which terminates once its output is produced.
2. Non-primitive data structure : “The data types that are derived from primary data types are
known as non-primitive data types. These data types are used to store groups of values.” e.g.
struct, array, linked list, stack, tree, graph, etc.
Linear data structures organize data elements sequentially, where each element is connected to
its previous and next neighbors in a linear order. The main characteristic of linear structures is
that each element (except the first and last) has a unique predecessor and successor.
Linear data structures are straightforward to implement and manipulate, and they excel in
scenarios where elements need to be accessed in a sequential manner or where operations like
insertion and deletion at specific positions are less frequent.
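As a sketch of the "each element is attached to its previous and next neighbor" idea, here is a minimal singly linked list (the class and function names are our own, for illustration only):

```python
class Node:
    # One element of a singly linked list: a value plus a link to its successor.
    def __init__(self, value):
        self.value = value
        self.next = None

def build_list(values):
    # Chain the values together in order and return the head node.
    head = None
    for value in reversed(values):
        node = Node(value)
        node.next = head
        head = node
    return head

def traverse(head):
    # Visit every element in a single sequential pass -- the defining
    # property of a linear data structure.
    out = []
    node = head
    while node is not None:
        out.append(node.value)
        node = node.next
    return out

head = build_list([10, 20, 30])
print(traverse(head))  # [10, 20, 30]
```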
Non-linear data structures do not organize elements in a sequential order. Instead, they allow
elements to be interconnected in a more complex manner, often forming hierarchical
relationships or arbitrary connections between elements.
1. Trees: Hierarchical structures consisting of nodes connected by edges, with a single root
node at the top. Nodes below the root are organized into levels, and each node can have
child nodes (subtrees).
o Binary Trees: Each node has at most two children (left and right).
o Binary Search Trees (BST): A binary tree where the left child of a node contains
only nodes with values less than the node's value, and the right child only nodes
with values greater.
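The BST ordering rule described above can be sketched in code (an illustrative minimal implementation, not from the source):

```python
class BSTNode:
    def __init__(self, value):
        self.value = value
        self.left = None   # subtree of smaller values
        self.right = None  # subtree of greater values

def insert(root, value):
    # Place the value so the BST property (left < node < right) is preserved.
    if root is None:
        return BSTNode(value)
    if value < root.value:
        root.left = insert(root.left, value)
    elif value > root.value:
        root.right = insert(root.right, value)
    return root

def contains(root, value):
    # The ordering rule lets us discard half of the remaining tree each step.
    while root is not None:
        if value == root.value:
            return True
        root = root.left if value < root.value else root.right
    return False

root = None
for v in [50, 30, 70, 20, 40]:
    root = insert(root, v)
print(contains(root, 40))  # True
print(contains(root, 99))  # False
```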
2. Graphs: A collection of nodes (vertices) and edges connecting these nodes. Unlike trees,
graphs may have cycles and can represent complex relationships between elements.
o Directed Graphs: Edges have a direction (from one node to another).
o Undirected Graphs: Edges have no direction.
Non-linear data structures are versatile and can represent a wide range of relationships and
connections between data elements. They are particularly useful in applications involving
networks, hierarchical relationships, and complex data modeling.
The choice between linear and non-linear data structures depends on the specific requirements of
the problem being solved:
Linear structures are efficient for operations that involve accessing elements
sequentially, such as iterating through a list or performing batch processing.
Non-linear structures are suitable when the relationships between elements are complex
or hierarchical, such as representing organizational structures, networks, or hierarchical
data like file systems.
Data is a set of facts and figures, or a value or group of values, in a particular format.
A data structure is a method of gathering and organizing data in such a manner that several
operations can be performed on it.
A problem is defined as a situation or condition which needs to be solved to achieve a goal.
An algorithm is a set of ordered instructions written in simple English.
ALGORITHM AND PROBLEM SOLVING
Computer :
“A computer is a multi-purpose electronic machine which is used for storing, organizing, and
processing data by a set of programs.”
Problem :
“A problem is defined as a situation or condition which needs to be solved to achieve a goal.”
CHARACTERISTICS OF AN ALGORITHM
1. Unambiguous − Algorithm should be clear and unambiguous. Each of its steps (or phases),
and their inputs/outputs should be clear and must lead to only one meaning.
2. Input − An algorithm should have 0 or more well-defined inputs.
3. Output − An algorithm should have 1 or more well-defined outputs, and should match the
desired output.
4. Finiteness − Algorithms must terminate after a finite number of steps.
5. Feasibility − Should be feasible with the available resources.
6. Independent − An algorithm should have step-by-step directions, which should be
independent of any programming code.
EXAMPLE OF ALGORITHM
Let's try to learn algorithm-writing by using an example.
Problem − Design an algorithm to add two numbers and display the result.
Step 1 − START
Step 2 − declare three integers a, b & c
Step 3 − define values of a & b
Step 4 − add values of a & b
Step 5 − store output of step 4 to c
Step 6 − print c
Step 7 – STOP
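The seven steps above translate directly into code; a minimal sketch:

```python
# Steps 1-2: start, and declare a, b, c
a = 5        # Step 3: define values of a & b
b = 7
c = a + b    # Steps 4-5: add a & b and store the result in c
print(c)     # Step 6: print c  -> 12
# Step 7: stop
```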
ALGORITHM DESIGN TOOL
• There can be two tools :
1. Flowchart
2. Pseudo Code
Flowchart :
“A flowchart is a graphical representation of an algorithm.”
Pseudo Code :
“Pseudocode is simply an implementation of an algorithm in the form of annotations and
informative text written in plain English.”
1. Flowchart Symbols
1. Terminal/Start/End Point:
o Represents the beginning or end of a process.
o Usually depicted as an oval shape with the word "Start" or "End".
2. Process:
o Represents a specific action or operation within the process.
o Displayed as a rectangle with rounded corners, containing a brief description of
the action.
3. Decision:
o Represents a decision point where the flow of the process can diverge based on a
condition.
o Shown as a diamond shape, with arrows indicating the different possible paths
(usually labeled with conditions like yes/no, true/false).
4. Input/Output:
o Represents where data enters (input) or exits (output) the process.
o Displayed as a parallelogram shape.
5. Flow Arrows:
o Arrows connect the various symbols to show the sequence and direction of the
process flow.
o Arrows typically point from one symbol to the next, indicating the order of
operations.
Here’s a basic example of a flowchart for a simple process of determining whether a number is
even or odd:
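The flowchart's decision step maps to a single modulus test; a minimal sketch:

```python
def even_or_odd(n):
    # Decision symbol: does n divide evenly by 2?
    if n % 2 == 0:
        return "even"   # yes branch
    return "odd"        # no branch

print(even_or_odd(10))  # even
print(even_or_odd(7))   # odd
```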
ALGORITHM ANALYSIS
• A Priori Analysis − This is a theoretical analysis of an algorithm. Efficiency of an algorithm is
measured by assuming that all other factors, for example, processor speed, are constant and have
no effect on the implementation.
1. Worst Case Analysis
Definition: Worst-case analysis determines the maximum amount of time an algorithm would
take to complete for a given input size, assuming the least favorable conditions.
Example: Consider a sorting algorithm like Bubble Sort. The worst case occurs when
the input array is in reverse order. Bubble Sort would then require O(n^2)
comparisons and swaps, where n is the number of elements in the array.
2. Best Case Analysis
Definition: Best case analysis determines the minimum amount of time an algorithm would take
to complete for a given input size, assuming the most favorable conditions.
Example: For Bubble Sort, the best case occurs when the input array is already sorted.
In this scenario, Bubble Sort requires only O(n) comparisons to confirm that the array
is sorted, resulting in a time complexity of O(n).
3. Average Case Analysis
Definition: Average-case analysis determines the expected amount of time an algorithm
takes to complete, averaged over all inputs of a given size.
Example: Continuing with Bubble Sort, the average-case time complexity is O(n^2).
This is because, on average, the algorithm makes approximately n^2/4 comparisons
and swaps when sorting a randomly ordered input array.
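The best-case O(n) behavior of Bubble Sort depends on an early-exit check; a sketch with that optimization (illustrative code):

```python
def bubble_sort(arr):
    # Worst case (reverse-sorted input): O(n^2) comparisons and swaps.
    # Best case (already sorted): one pass with no swaps, so O(n).
    a = list(arr)
    n = len(a)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:   # no swaps in a full pass: already sorted, stop early
            break
    return a

print(bubble_sort([5, 4, 3, 2, 1]))  # worst case input -> [1, 2, 3, 4, 5]
print(bubble_sort([1, 2, 3]))        # best case: a single pass suffices
```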
Decision Making: Understanding these cases helps in selecting the appropriate algorithm
based on the expected input characteristics.
Performance Optimization: Designing algorithms that perform well in worst and
average cases is crucial for efficient computation.
Resource Management: Ensuring that worst-case scenarios do not lead to unacceptable
performance degradation.
BIG-OH NOTATION
Big Oh Notation, Ο
The notation Ο(n) is the formal way to express the upper bound of an algorithm's running
time. It measures the worst-case time complexity, i.e. the longest amount of time an
algorithm can possibly take to complete.
Omega NOTATION
Omega Notation, Ω
The notation Ω(n) is the formal way to express the lower bound of an algorithm's running
time. It measures the best-case time complexity, i.e. the minimum amount of time an
algorithm can possibly take to complete.
Theta NOTATION
Theta Notation, θ
The notation θ(n) is the formal way to express both the lower bound and the upper bound of
an algorithm's running time.
ALGORITHMIC STRATEGIES
Algorithm design strategies are the general approaches used to develop efficient solutions to
problems.
Algorithm Strategies are :
1. Divide and conquer
2. Merge sort
3. Recursive algorithm
4. Backtracking algorithms
5. Heuristic algorithms
6. Dynamic Programming algorithm
Divide and Conquer Algorithm involves breaking a larger problem into smaller subproblems,
solving them independently, and then combining their solutions to solve the original problem.
The basic idea is to recursively divide the problem into smaller subproblems until they become
simple enough to be solved directly. Once the solutions to the subproblems are obtained, they are
then combined to produce the overall solution.
Divide and Conquer Algorithm can be divided into three steps: Divide, Conquer and Merge
1. Divide:
Break the given problem into smaller subproblems of the same type.
2. Conquer:
If a subproblem is small enough (often referred to as the “base case”), we solve it directly
without further recursion.
The goal is to find solutions for these subproblems independently.
3. Merge:
Combine the sub-problems to get the final solution of the whole problem.
Once the smaller subproblems are solved, we recursively combine their solutions to get
the solution of larger problem.
The goal is to formulate a solution for the original problem by merging the results from
the subproblems.
Divide and Conquer Algorithm involves breaking down a problem into smaller, more
manageable parts, solving each part individually, and then combining the solutions to solve the
original problem. The characteristics of Divide and Conquer Algorithm are:
Dividing the Problem: The first step is to break the problem into smaller, more
manageable subproblems. This division can be done recursively until the subproblems
become simple enough to solve directly.
Conquering Each Subproblem: Once divided, the subproblems are solved individually.
This may involve applying the same divide and conquer approach recursively until the
subproblems become simple enough to solve directly, or it may involve applying a
different algorithm or technique.
Combining Solutions: After solving the subproblems, their solutions are combined to
obtain the solution to the original problem. This combination step should be relatively
efficient and straightforward, as the solutions to the subproblems should be designed to
fit together seamlessly.
2. Merge sort is a sorting algorithm that follows the divide-and-conquer approach. It works
by recursively dividing the input array into smaller subarrays, sorting those subarrays, and
then merging them back together to obtain the sorted array.
In simple terms, we can say that the process of merge sort is to divide the array into two halves,
sort each half, and then merge the sorted halves back together. This process is repeated until the
entire array is sorted.
Merge sort is a popular sorting algorithm known for its efficiency and stability. It follows the
divide-and-conquer approach to sort a given array of elements.
1. Divide: Divide the list or array recursively into two halves until it can no longer be
divided.
2. Conquer: Each subarray is sorted individually using the merge sort algorithm.
3. Merge: The sorted subarrays are merged back together in sorted order. The process
continues until all elements from both subarrays have been merged.
Let’s sort the array or list [38, 27, 43, 10] using Merge Sort
Let’s look at the working of the above example:
Divide:
[38, 27, 43, 10] is divided into [38, 27] and [43, 10].
Conquer:
Each half is divided again into single elements ([38], [27], [43], [10]); a single element is
already sorted, and the pieces are merged pairwise to give [27, 38] and [10, 43].
Merge:
Merge [27, 38] and [10, 43] to get the final sorted list [10, 27, 38, 43].
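The divide, conquer, and merge steps for [38, 27, 43, 10] can be sketched as:

```python
def merge_sort(arr):
    # Divide: split until subarrays have at most one element.
    if len(arr) <= 1:
        return list(arr)
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])    # Conquer: sort each half recursively.
    right = merge_sort(arr[mid:])
    return merge(left, right)       # Merge: combine the sorted halves.

def merge(left, right):
    # Repeatedly take the smaller front element of the two sorted runs.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out

print(merge_sort([38, 27, 43, 10]))  # [10, 27, 38, 43]
```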
Greedy algorithm :
An algorithm is designed to achieve optimum solution for a given problem. In greedy algorithm
approach, decisions are made from the given solution domain. As being greedy, the closest
solution that seems to provide an optimum solution is chosen.
A minimum spanning tree (MST), or minimum weight spanning tree, for a weighted,
connected, undirected graph is a spanning tree whose weight is less than or equal to the
weight of every other spanning tree.
Here we will discuss Kruskal’s algorithm to find the MST of a given weighted graph.
In Kruskal’s algorithm, sort all edges of the given graph in increasing order. Then it keeps on
adding new edges and nodes in the MST if the newly added edge does not form a cycle. It picks
the minimum weighted edge at first and the maximum weighted edge at last. Thus we can say
that it makes a locally optimal choice in each step in order to find the optimal solution. Hence
this is a Greedy Algorithm.
How to find MST using Kruskal’s algorithm?
Below are the steps for finding the MST using Kruskal’s algorithm:
1. Sort all the edges in non-decreasing order of their weight.
2. Pick the smallest edge. Check if it forms a cycle with the spanning tree formed so far.
If a cycle is not formed, include this edge; else, discard it.
3. Repeat step 2 until there are (V-1) edges in the spanning tree.
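A compact sketch of these steps, using a simple union-find structure to detect cycles (the graph and edge weights below are made up for illustration):

```python
def kruskal(num_vertices, edges):
    # edges: list of (weight, u, v) tuples; returns the list of MST edges.
    parent = list(range(num_vertices))

    def find(x):
        # Find the representative of x's component (with path compression).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for weight, u, v in sorted(edges):    # Step 1: sort edges by weight.
        ru, rv = find(u), find(v)
        if ru != rv:                      # Step 2: include only if no cycle forms.
            parent[ru] = rv
            mst.append((weight, u, v))
        if len(mst) == num_vertices - 1:  # Step 3: stop at (V-1) edges.
            break
    return mst

edges = [(4, 0, 1), (1, 1, 2), (3, 0, 2), (2, 2, 3)]
print(kruskal(4, edges))  # [(1, 1, 2), (2, 2, 3), (3, 0, 2)]
```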
Kruskal’s algorithm to find the minimum cost spanning tree uses the greedy approach. The
Greedy Choice is to pick the smallest weight edge that does not cause a cycle in the MST
constructed so far.
We have discussed Kruskal’s algorithm for Minimum Spanning Tree. Like Kruskal’s algorithm,
Prim’s algorithm is also a Greedy algorithm. This algorithm always starts with a single node and
moves through several adjacent nodes, in order to explore all of the connected edges along the
way.
The algorithm starts with an empty spanning tree. The idea is to maintain two sets of vertices.
The first set contains the vertices already included in the MST, and the other set contains the
vertices not yet included. At every step, it considers all the edges that connect the two sets and
picks the minimum weight edge from these edges. After picking the edge, it moves the other
endpoint of the edge to the set containing MST.
A group of edges that connects two sets of vertices in a graph is called cut in graph theory. So, at
every step of Prim’s algorithm, find a cut, pick the minimum weight edge from the cut, and
include this vertex in MST Set (the set that contains already included vertices).
The working of Prim’s algorithm can be described by using the following steps:
1. Start with an arbitrary vertex and add it to the MST set.
2. Find the minimum weight edge that connects a vertex in the MST set to a vertex
outside it.
3. Add the selected edge and its other endpoint to the MST set.
4. Repeat steps 2 and 3 until all vertices are included.
Consider the following graph as an example for which we need to find the Minimum Spanning
Tree (MST).
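The two-set idea behind Prim’s algorithm can be sketched with a priority queue holding the candidate cut edges (the example graph below is made up for illustration):

```python
import heapq

def prim(num_vertices, adj):
    # adj: adjacency list {u: [(weight, v), ...]}; returns the total MST weight.
    in_mst = [False] * num_vertices
    heap = [(0, 0)]            # start from vertex 0 at zero cost
    total = 0
    while heap:
        weight, u = heapq.heappop(heap)
        if in_mst[u]:
            continue           # u was already moved into the MST set
        in_mst[u] = True       # move u from the "not yet included" set
        total += weight
        for w, v in adj[u]:    # consider edges crossing the current cut
            if not in_mst[v]:
                heapq.heappush(heap, (w, v))
    return total

adj = {
    0: [(4, 1), (3, 2)],
    1: [(4, 0), (1, 2)],
    2: [(3, 0), (1, 1), (2, 3)],
    3: [(2, 2)],
}
print(prim(4, adj))  # 6
```

At every pop, the cheapest edge crossing the cut between the included and not-yet-included vertex sets is chosen, which is exactly the greedy choice described above.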