Module1 DATA STRUCTURE
A data structure is a specialized format for organizing, processing, retrieving and storing data.
In computer programming, a data structure may be selected or designed to store data for the purpose of working on it with various algorithms.
What is Data Structure
A data structure can be defined as a group of data elements together with a way of storing and organizing them in the computer so that the data can be used efficiently.
Examples of data structures are Arrays, Linked Lists, Stacks, Queues, etc.
Data structures are widely used in almost every area of Computer Science, e.g. Operating Systems, Compiler Design, Artificial Intelligence, Graphics and many more.
Data structures are a main part of many computer science algorithms, as they enable programmers to handle data in an efficient way.
They play a vital role in the performance of a program, since much of a program's work is storing and retrieving the user's data as fast as possible.
Data Structure
◦ A data structure is a particular way of organizing data in a computer so that it can be used effectively.
◦ For example, we can store a list of items having the same data-type using the array data structure.
The representation of a particular data structure in the main memory of a computer is called a storage structure.
The representation of a storage structure in auxiliary memory is called a file structure.
A data structure is defined as a way of storing and manipulating data in an organized form so that it can be used efficiently.
Data Structure mainly specifies the following four things:
1) Organization of data
2) Accessing method
3) Degree of associativity
4) Processing alternatives for information
Algorithm + Data Structure = Program
Data Structure study Covers the following points
1) Amount of memory required to store the data
2) Amount of time required to process the data
3) Representation of data in memory
4) Operations performed on data
Types Of DS
Linear data structures
◦ Every item is related to its previous and next item.
◦ Data is arranged in linear sequence.
◦ Data items can be traversed in a single run.
◦ E.g. Array, Stacks, Linked list, Queue
◦ Implementation is easy.
Non-linear data structures
◦ Every item is attached with many other items.
◦ Data is not arranged in sequence.
◦ Data cannot be traversed in a single run.
◦ E.g. Tree, Graph
◦ Implementation is difficult.
Operation on Data Structures
The design of an efficient data structure must take into account the operations to be performed on it.
The most commonly used operations on data structures are broadly categorized into the following types:
1. Create: This operation reserves memory for program elements. It can be done by a declaration statement. Creation of a data structure may take place either at compile time or at run time.
2. Selection: This operation accesses a particular data item within a data structure.
3. Updation: This operation updates or modifies the data in the data structure.
4. Searching: This operation finds the presence of a desired data item in the list of data items; it may also find the locations of all elements that satisfy certain conditions.
5. Sorting: This is the process of arranging all data items in a data structure in a particular order, for example ascending or descending order.
6. Splitting: This is the process of partitioning a single list into multiple lists.
7. Merging: This is the process of combining the data items of two different sorted lists into a single sorted list.
8. Traversing: This is the process of visiting each and every node of a list in a systematic manner.
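As an illustration of the merging operation described above, here is a minimal C sketch. The function name `merge_sorted` and its parameters are assumptions made for this example; the caller is assumed to provide an output buffer large enough for both inputs.

```c
#include <stddef.h>

/* Hypothetical helper: merges two ascending arrays a (na items) and
   b (nb items) into out, which must have room for na + nb items. */
void merge_sorted(const int *a, size_t na, const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, k = 0;

    /* Repeatedly copy the smaller of the two front elements. */
    while (i < na && j < nb)
        out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];

    /* One input is exhausted; copy whatever remains of the other. */
    while (i < na)
        out[k++] = a[i++];
    while (j < nb)
        out[k++] = b[j++];
}
```

Each input element is copied exactly once, so the work grows in proportion to the total number of items.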
What are Arrays?
An array is a container which can hold a fixed number of items, and these items should be of the same type.
Most data structures make use of arrays to implement their algorithms.
• Following are the important terms to understand the concept of an array.
Element − Each item stored in an array is called an element.
Index − Each location of an element in an array has a numerical index, which is used to identify the element.
1. An array is a container of elements.
2. Elements have a specific value and data type, like "ABC", TRUE or FALSE, etc.
3. Each element also has its own index, which is used to access the element.
• Elements are stored at contiguous memory locations.
• An index is always less than the total number of array items.
• In terms of syntax, any variable that is declared as an array can store multiple values.
• Almost all languages have the same concept of arrays but have different ways of declaring and initializing them.
• However, three parts always remain common in all initializations: the array name, the elements, and the data type of the elements.
Syntax
arrayName[indexNum]
Example
balance[1]
Here, we access the second value of the array using its index, which is 1 (indices start at 0). The result is 200, which is the second value of the balance array.
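The indexing rule can be sketched in C as follows. The contents of `balance` are assumed for illustration (the text only states that the second value is 200), and `element_at` is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed contents: only the second value (200) comes from the text. */
static const int balance[] = {100, 200, 300, 400, 500};
static const size_t balance_len = sizeof balance / sizeof balance[0];

/* Hypothetical helper: returns arr[index] after checking that the
   index is smaller than the number of elements. */
int element_at(const int *arr, size_t len, size_t index)
{
    assert(index < len);   /* a valid index is always less than len */
    return arr[index];
}
```

With these assumed values, `element_at(balance, balance_len, 1)` yields the same value as `balance[1]`.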
◦ Array Representation
◦ Arrays can be declared in various ways in different languages. For illustration, let's take C array
declaration.
In an ADT, only the minimum required functionality is specified.
● Unambiguous − Algorithm should be clear and unambiguous. Each of its steps (or phases), and
their inputs/outputs should be clear and must lead to only one meaning.
● Input − An algorithm should have 0 or more well-defined inputs.
● Output − An algorithm should have 1 or more well-defined outputs, and should match the
desired output.
● Finiteness − Algorithms must terminate after a finite number of steps.
● Feasibility − Should be feasible with the available resources.
● Independent − An algorithm should have step-by-step directions, which should be independent
of any programming code.
How to Write an Algorithm?
There are no well-defined standards for writing algorithms. Rather, it is problem and resource dependent.
Algorithms are never written to support a particular programming code.
All programming languages share basic code constructs like loops (do, for, while), flow-control (if-else), etc. These common constructs can be used to write an algorithm.
We usually write algorithms in a step-by-step manner, but this is not always the case. Algorithm writing is a process and is carried out after the problem domain is well-defined. That is, we should know the problem domain for which we are designing a solution.
Problem − Design an algorithm to add two numbers and display the
result.
Step 1 − START
Step 2 − declare three integers a, b & c
Step 3 − define values of a & b
Step 4 − add values of a & b
Step 5 − store output of step 4 to c
Step 6 − print c
Step 7 − STOP
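The steps above translate almost line by line into C. This is only one possible rendering; the function names `add_two_numbers` and `print_sum` are chosen for this sketch.

```c
#include <stdio.h>

/* Steps 2 to 5: take a and b, add them, and store the result in c. */
int add_two_numbers(int a, int b)
{
    int c = a + b;
    return c;
}

/* Step 6: print c. */
void print_sum(int a, int b)
{
    printf("%d\n", add_two_numbers(a, b));
}
```

The algorithm itself stays independent of the language; the same seven steps could be written in any other programming language.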
We shall learn about a priori algorithm analysis. Algorithm analysis deals with the execution or running time of the various operations involved. The running time of an operation can be defined as the number of computer instructions executed per operation.
Asymptotic analysis of an algorithm refers to defining the mathematical bounds of its run-time performance. Using asymptotic analysis, we can describe the best case, average case, and worst case scenario of an algorithm.
Asymptotic analysis is input bound, i.e., if there is no input to the algorithm, it is concluded to work in constant time. Other than the input, all other factors are considered constant.
Asymptotic analysis refers to computing the running time of any operation in mathematical units of computation. For example, the running time of one operation may be computed as f(n) and that of another operation as g(n^2). This means the running time of the first operation will increase linearly with n, while the running time of the second operation will increase quadratically as n increases. Similarly, the running times of both operations will be nearly the same if n is very small.
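To make the difference between linear and quadratic growth concrete, the following C sketch counts loop iterations instead of measuring time. The function names are assumptions for this example, and counting one "step" per iteration is a simplification.

```c
/* One loop over n items: the step count grows linearly with n. */
unsigned long linear_steps(unsigned long n)
{
    unsigned long steps = 0;
    for (unsigned long i = 0; i < n; i++)
        steps++;
    return steps;
}

/* Two nested loops over n items: the step count grows as n * n. */
unsigned long quadratic_steps(unsigned long n)
{
    unsigned long steps = 0;
    for (unsigned long i = 0; i < n; i++)
        for (unsigned long j = 0; j < n; j++)
            steps++;
    return steps;
}
```

For n = 1 both functions perform a single step, while for n = 10 the first performs 10 steps and the second 100.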
Big Oh Notation, Ο
Ο(f(n)) = { g(n) : there exist c > 0 and n0 such that g(n) ≤ c·f(n) for all n > n0 }
Omega Notation, Ω
The notation Ω(n) is the formal way to express the lower bound of an algorithm's running time. It
measures the best case time complexity or the best amount of time an algorithm can possibly take
to complete.
Ω(f(n)) = { g(n) : there exist c > 0 and n0 such that g(n) ≥ c·f(n) for all n > n0 }
Theta Notation, θ
The notation θ(n) is the formal way to express both the lower bound and the upper
bound of an algorithm's running time. It is represented as follows −
θ(f(n)) = { g(n) : g(n) = Ο(f(n)) and g(n) = Ω(f(n)) }
Following is a list of some common asymptotic notations:
constant − Ο(1)
logarithmic − Ο(log n)
linear − Ο(n)
quadratic − Ο(n^2)
cubic − Ο(n^3)
polynomial − n^Ο(1)
exponential − 2^Ο(n)
Classification of Data Structures
Primitive Data Structures
Primitive Data Structures are the data structures consisting of the numbers and
the characters that come in-built into programs.
1. Non-Primitive Data Structures are those data structures derived from Primitive
Data Structures.
2. These data structures can't be operated directly by machine-level instructions.
3. The focus of these data structures is on forming a set of data elements that is
either homogeneous (same data type) or heterogeneous (different data types).
4. Based on the structure and arrangement of data, we can divide these data
structures into two sub-categories -
a. Linear Data Structures
b. Non-Linear Data Structures
Linear Data Structures
A data structure that preserves a linear connection among its data elements is known as a Linear Data Structure.
The data is arranged linearly: each element has a predecessor and a successor, except that the first element has no predecessor and the last element has no successor.
1. Static Data Structures: Data structures with a fixed size are known as Static Data Structures. Their memory is allocated at compile time, and their size cannot be changed by the user after compilation; however, the data stored in them can be altered.
The Array is the best example of a Static Data Structure: it has a fixed size, and its data can be modified later.
2. Dynamic Data Structures: Data structures with a dynamic size are known as Dynamic Data Structures. Their memory is allocated at run time, and their size can vary while the code runs. Moreover, the user can change the size as well as the data elements stored in these data structures at run time.
Linked Lists, Stacks, and Queues are common examples of dynamic data structures.
Linked Lists
A Linked List is another example of a linear data structure, used to store a collection of data elements dynamically.
Data elements are represented by nodes, connected using links or pointers.
Each node contains two fields: the information field holds the actual data, and the pointer field holds the address of the next node in the list.
The pointer field of the last node holds a null pointer, as it points to nothing. The user can dynamically adjust the size of a Linked List as per the requirements.
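The node layout described above can be sketched in C as follows. The names `Node`, `push_front`, `list_length` and `free_list` are assumptions chosen for this example; inserting at the front is just one simple way to grow the list.

```c
#include <stdlib.h>

/* One node: an information field and a pointer to the next node. */
struct Node {
    int data;
    struct Node *next;   /* NULL in the last node */
};

/* Inserts a new node at the front of the list. Returns the new head,
   or the unchanged head if memory allocation fails. */
struct Node *push_front(struct Node *head, int value)
{
    struct Node *node = malloc(sizeof *node);
    if (node == NULL)
        return head;
    node->data = value;
    node->next = head;
    return node;
}

/* Follows the links until the null pointer and counts the nodes. */
size_t list_length(const struct Node *head)
{
    size_t count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}

/* Releases every node of the list. */
void free_list(struct Node *head)
{
    while (head != NULL) {
        struct Node *next = head->next;
        free(head);
        head = next;
    }
}
```

Because every node is allocated separately at run time, the list can grow or shrink without a size fixed in advance.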
Linked Lists can be classified into different types:
a. Singly Linked List: A Singly Linked List is the most common type of Linked List. Each node has data
and a pointer field containing an address to the next node.
b. Doubly Linked List: A Doubly Linked List consists of an information field and two pointer fields. The
information field contains the data. The first pointer field contains an address of the previous node,
whereas another pointer field contains a reference to the next node. Thus, we can go in both directions
(backward as well as forward).
c. Circular Linked List: The Circular Linked List is similar to the Singly Linked List. The only key difference
is that the last node contains the address of the first node, forming a circular loop in the Circular Linked
List.
a. A Circular Linked List is also helpful in a slide show, where the user needs to return to the first slide after the last slide is presented.
b. A Doubly Linked List is used to implement the forward and backward buttons in a browser, to move forward and backward through the opened pages of a website.
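A doubly linked node, as used in the back/forward example, might look like this in C. The names `DNode` and `append` are assumptions for this sketch, and appending at the tail is only one of several possible insertion strategies.

```c
#include <stdlib.h>

/* One node: data plus pointers to the previous and the next node. */
struct DNode {
    int data;
    struct DNode *prev;
    struct DNode *next;
};

/* Appends a new node after tail (which may be NULL for an empty list).
   Returns the new node, or NULL if memory allocation fails. */
struct DNode *append(struct DNode *tail, int value)
{
    struct DNode *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->data = value;
    node->prev = tail;     /* link back to the old tail */
    node->next = NULL;     /* the new node is now the last one */
    if (tail != NULL)
        tail->next = node; /* link forward from the old tail */
    return node;
}
```

Following `next` moves forward through the list and following `prev` moves backward, which is what makes traversal in both directions possible.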
Stacks
A Stack is a linear data structure that follows the LIFO (Last In, First Out) principle: insertion and deletion are allowed at only one end of the Stack, called the Top.
Stacks can be implemented with contiguous memory (an Array) or with non-contiguous memory (a Linked List).
Real-life examples of Stacks are piles of books, a deck of cards, piles of money, and many more.
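A minimal array-backed stack in C illustrates the LIFO behaviour. The capacity and the names `Stack`, `stack_init`, `stack_push` and `stack_pop` are assumptions made for this sketch.

```c
#include <stdbool.h>

#define STACK_CAPACITY 8   /* assumed capacity for this sketch */

struct Stack {
    int items[STACK_CAPACITY];
    int top;               /* index of the top element, -1 when empty */
};

void stack_init(struct Stack *s)
{
    s->top = -1;
}

/* Inserts value at the top. Returns false if the stack is full. */
bool stack_push(struct Stack *s, int value)
{
    if (s->top == STACK_CAPACITY - 1)
        return false;
    s->items[++s->top] = value;
    return true;
}

/* Removes the top element into *out. Returns false if the stack is empty. */
bool stack_pop(struct Stack *s, int *out)
{
    if (s->top == -1)
        return false;
    *out = s->items[s->top--];
    return true;
}
```

Pushing 1, 2, 3 and then popping three times returns 3, 2, 1: the last element inserted is the first one removed.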
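The primary operations on a Stack are push (insert an element at the top), pop (remove the element at the top), and peek (read the top element without removing it).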
Queues
A Queue is a linear data structure similar to a Stack, but with different restrictions on the insertion and deletion of elements.
The insertion of an element in a Queue is done at one end, and the removal is done at the opposite end.
The Queue data structure follows the FIFO (First In, First Out) principle to manipulate the data elements.
Some real-life examples of Queues are a line at the ticket counter, an escalator, a car wash, and many more.
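The FIFO behaviour can be sketched with a fixed-size circular array in C. The capacity and the names `Queue`, `queue_init`, `enqueue` and `dequeue` are assumptions for this example; a linked list would work equally well.

```c
#include <stdbool.h>

#define QUEUE_CAPACITY 8   /* assumed capacity for this sketch */

struct Queue {
    int items[QUEUE_CAPACITY];
    int front;             /* index of the oldest element */
    int count;             /* number of stored elements */
};

void queue_init(struct Queue *q)
{
    q->front = 0;
    q->count = 0;
}

/* Inserts value at the rear. Returns false if the queue is full. */
bool enqueue(struct Queue *q, int value)
{
    if (q->count == QUEUE_CAPACITY)
        return false;
    int rear = (q->front + q->count) % QUEUE_CAPACITY;
    q->items[rear] = value;
    q->count++;
    return true;
}

/* Removes the front element into *out. Returns false if the queue is empty. */
bool dequeue(struct Queue *q, int *out)
{
    if (q->count == 0)
        return false;
    *out = q->items[q->front];
    q->front = (q->front + 1) % QUEUE_CAPACITY;
    q->count--;
    return true;
}
```

Enqueuing 1, 2, 3 and then dequeuing three times returns 1, 2, 3: the first element inserted is the first one removed.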
The primary operations of the Queue are enqueue (insert an element at the rear) and dequeue (remove the element at the front).
When a data structure is described only by such operations, it is known as an Abstract Data Type (ADT).
We can define an ADT as a set of data elements along with the operations on the data. The term "abstract" refers to the fact that the data and the fundamental operations defined on it are studied independently of their implementation. An ADT specifies what we can do with the data, not how we do it.
An ADT implementation contains a storage structure to store the data elements and algorithms for the fundamental operations. Data structures such as the array, linked list, queue, and stack are examples of ADTs.
Array in Data Structure
● Arrays are defined as a collection of similar types of data items stored at contiguous memory locations.
● In C programming, arrays are derived data types that can store primitive types of data such as int, char, double, float, etc.
● For example, if we want to store the marks of a student in 6 subjects, we can define an array that stores the marks of each subject at contiguous memory locations.
a. One-Dimensional Array: An array with only one row of data elements is known as a One-Dimensional Array. Its elements are stored in ascending storage locations.
b. Two-Dimensional Array: An array consisting of multiple rows and columns of data elements is called a Two-Dimensional Array. It is also known as a Matrix.
c. Multidimensional Array: We can define a Multidimensional Array as an array of arrays. Multidimensional Arrays are not limited to two indices or two dimensions; they can include as many indices as needed.
Properties of array
○ Each element in an array is of the same data type and has the same size (for example, 4 bytes for an int on many platforms).
○ Elements in the array are stored at contiguous memory locations, of which the first element is stored at the lowest memory address.
○ Elements of the array can be accessed randomly, since we can calculate the address of each element from the base address and the size of a data element.
Representation of an array
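All the data elements of an array are stored at contiguous locations in the main memory. The address of the first element is the base address. Each element of the array is identified by its index.
Address of the element at index i = base address + i x (size of one element)
For example, with base address 100 and 4-byte elements, the address of the element at index 2
= 100 + 2 x [4]
= 100 + 8
= 108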
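The same arithmetic can be written as a small C function. The name `element_address` is an assumption for this sketch; in real C code the compiler performs this calculation itself whenever an array is indexed.

```c
#include <stddef.h>
#include <stdint.h>

/* Address of element i = base address + i * size of one element. */
uintptr_t element_address(uintptr_t base, size_t index, size_t element_size)
{
    return base + index * element_size;
}
```

With base address 100, index 2 and element size 4, the function reproduces the value 108 computed above.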
Basic operations
#include <stdio.h>

int main(void)
{
    int Arr[5] = {18, 30, 15, 70, 12};
    int i;

    /* Traversal: visit and print every element in index order */
    printf("Elements of the array are:\n");
    for (i = 0; i < 5; i++)
    {
        printf(" %d, ", Arr[i]);
    }
    printf("\n");
    return 0;
}
Insertion operation
This operation is performed to insert one or more elements into the array. An element can be added at the beginning, end, or at any index of the array.

#include <stdio.h>

int main(void)
{
    int arr[20] = { 18, 30, 15, 70, 12 };
    int i, x, pos, n = 5;

    printf("Array elements before insertion\n");
    for (i = 0; i < n; i++)
        printf("%d ", arr[i]);
    printf("\n");

    x = 50;   /* element to be inserted */
    pos = 4;  /* 1-based position for the new element */

    /* shift elements from the end one place to the right to make room */
    for (i = n - 1; i >= pos - 1; i--)
        arr[i + 1] = arr[i];
    arr[pos - 1] = x;
    n++;

    printf("Array elements after insertion\n");
    for (i = 0; i < n; i++)
        printf("%d ", arr[i]);
    printf("\n");
    return 0;
}
Deletion operation
As the name implies, this operation removes an element from the array and then reorganizes all of the array elements.
#include <stdio.h>

int main(void) {
    int arr[] = {18, 30, 15, 70, 12};
    int delete = 12, n = 5;   /* value to delete and current element count */
    int i, j = -1;

    printf("Given array elements are :\n");
    for (i = 0; i < n; i++) {
        printf("%d, ", arr[i]);
    }

    /* find the index of the first element equal to the value to delete */
    for (i = 0; i < n; i++) {
        if (delete == arr[i]) {
            j = i;
            break;
        }
    }

    if (j != -1) {
        /* shift the following elements one place to the left */
        while (j < n - 1) {
            arr[j] = arr[j + 1];
            j++;
        }
        n = n - 1;
    }

    printf("\nElements of array after deletion:\n");
    for (i = 0; i < n; i++) {
        printf("arr[%d] = %d, ", i, arr[i]);
    }
    printf("\n");
    return 0;
}
Search operation
This operation is performed to search an element in the array based on the value or index.
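A search by value can be sketched as a linear search in C. The function name `linear_search` and the convention of returning -1 when the value is absent are assumptions made for this example.

```c
/* Returns the index of the first element equal to key,
   or -1 if key does not occur in the first n elements of arr. */
int linear_search(const int *arr, int n, int key)
{
    for (int i = 0; i < n; i++) {
        if (arr[i] == key)
            return i;
    }
    return -1;
}
```

For the array {18, 30, 15, 70, 12}, searching for 70 returns index 3, and searching for a value that is not stored returns -1.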
Update operation
This operation is performed to update an existing array element located at the given index.
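#include <stdio.h>

int main(void) {
    int arr[5] = {18, 30, 15, 70, 12};
    int item = 50, i, pos = 3;   /* pos is the 1-based position to update */

    printf("Given array elements are :\n");

    for (i = 0; i < 5; i++) {
        printf("arr[%d] = %d, ", i, arr[i]);
    }

    /* overwrite the element at position pos with the new value */
    arr[pos - 1] = item;
    printf("\nArray elements after updation :\n");

    for (i = 0; i < 5; i++) {
        printf("arr[%d] = %d, ", i, arr[i]);
    }
    printf("\n");
    return 0;
}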
Advantages of Array
○ An array provides a single name for a group of variables of the same type, so there is only one name to remember for all the elements.
○ Traversing an array is a very simple process; we just need to increment the index, starting from the base address, in order to visit each element one by one.
○ Any element in the array can be directly accessed by using the index.
Disadvantages of Array
○ An array is homogeneous: only elements of the same data type can be stored in it.
○ An array uses static memory allocation, that is, the size of an array cannot be altered.
○ Memory is wasted if we store fewer elements than the declared size.
2D Array
A 2D array can be defined as an array of arrays. A 2D array is organized as a matrix, which can be represented as a collection of rows and columns.
2D arrays are often used to implement a data structure that resembles a relational database table. They make it easy to hold a bulk of data at once, which can be passed to any number of functions wherever required.
int arr[max_rows][max_columns];
How do we access data in a 2D array
The elements of a 2D array can be accessed randomly. As with one-dimensional arrays, we can access the individual cells of a 2D array by using the indices of the cells. There are two indices attached to each cell: one is its row number and the other is its column number.
We can store the value of any particular cell of a 2D array in a variable x by using the following syntax.
int x = a[i][j];
where i and j are the row and column number of the cell respectively.
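Row and column indexing can be sketched in C as follows. The dimensions `MAX_ROWS` and `MAX_COLUMNS`, the fill pattern, and the function names are all assumptions chosen for this example.

```c
#define MAX_ROWS 3      /* assumed number of rows */
#define MAX_COLUMNS 4   /* assumed number of columns */

/* Fills the grid so that cell (i, j) holds i * MAX_COLUMNS + j. */
void fill_grid(int a[MAX_ROWS][MAX_COLUMNS])
{
    for (int i = 0; i < MAX_ROWS; i++)
        for (int j = 0; j < MAX_COLUMNS; j++)
            a[i][j] = i * MAX_COLUMNS + j;
}

/* Reads the cell at row i, column j. */
int cell_at(int a[MAX_ROWS][MAX_COLUMNS], int i, int j)
{
    int x = a[i][j];
    return x;
}
```

After `fill_grid`, the cell at row 1, column 2 holds 1 * 4 + 2 = 6, which shows how the two indices together select a single cell.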