
Data Structures

Graphs
Huffman Coding

The Huffman coding algorithm compresses the storage of data using variable
length codes.
The basic idea behind the Huffman coding algorithm is to use fewer bits for
more frequently occurring characters.

Given a set of n characters from the alphabet A, each character c ∈ A having an
associated frequency freq(c), we want to find a binary code for each character
c ∈ A such that

Σ_{c ∈ A} freq(c) · |binarycode(c)|

is minimum, where |binarycode(c)| represents the length of the binary code of
character c.
Prefix Code: A code is called a prefix (free) code if no codeword is a
prefix of another one.
Example: {a = 0, b = 110, c = 10, d = 111} is a prefix code.
Example: 01101100 = 0|110|110|0 = abba
Example:
Suppose in a message we have these letters and their frequencies:

a: 05 b: 48 c: 07 d: 17 e: 10 f: 13

Sort the nodes based on their frequencies. Further, consider the two nodes
having minimum frequencies:

a: 05 c: 07 e: 10 f: 13 d: 17 b: 48

Connect these two nodes at a newly created common node that will store
only the sum of frequencies of all the nodes connected below it.

Merging a: 05 and c: 07 gives a new node with frequency 12. The remaining
nodes are: 12, e: 10, f: 13, d: 17, b: 48.

Repeat the same process, each time merging the two nodes having minimum
frequencies:

Merge node 12 and e: 10 into a node with frequency 22 (remaining: 22, f: 13, d: 17, b: 48).
Merge f: 13 and d: 17 into a node with frequency 30 (remaining: 22, 30, b: 48).
Merge node 22 and node 30 into a node with frequency 52 (remaining: 52, b: 48).
Finally, merge node 52 and b: 48 into the root node with frequency 100.

Labelling every left edge of the resulting tree with 0 and every right edge with 1
gives the codes:

Character   Code
a           0000
b           1
c           0001
d           011
e           001
f           010
Character   Frequency   Code    Size
a           5           0000    5*4 = 20
b           48          1       48*1 = 48
c           7           0001    7*4 = 28
d           17          011     17*3 = 51
e           10          001     10*3 = 30
f           13          010     13*3 = 39

Characters: 6 * 8 = 48 bits    Codes: 18 bits    Encoded message: 216 bits

Without encoding, the total size of the string was 800 bits. After encoding,
the size is reduced to 48 + 18 + 216 = 282 bits.
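The construction above can be sketched in C. The following is a minimal illustration (a sketch, not an implementation from these slides): it stores the six example characters in an array, repeatedly merges the two lowest-frequency nodes using a simple linear scan instead of the usual min-heap, and prints the resulting code length of each character.

#include <stdio.h>
#include <stdlib.h>

/* Minimal Huffman sketch: repeatedly merge the two lowest-frequency nodes
   until one tree remains, then report each character's code length. */
typedef struct Node {
    char ch;                        /* '\0' marks an internal node */
    int freq;
    struct Node *left, *right;
} Node;

static Node *make(char ch, int freq, Node *l, Node *r) {
    Node *n = malloc(sizeof(Node));
    n->ch = ch; n->freq = freq; n->left = l; n->right = r;
    return n;
}

static void report(Node *n, int depth) {        /* depth of a leaf = code length */
    if (!n->left && !n->right) { printf("%c: %d bits\n", n->ch, depth); return; }
    report(n->left, depth + 1);
    report(n->right, depth + 1);
}

int main(void) {
    Node *pool[6] = { make('a', 5, 0, 0), make('b', 48, 0, 0), make('c', 7, 0, 0),
                      make('d', 17, 0, 0), make('e', 10, 0, 0), make('f', 13, 0, 0) };
    int n = 6;
    while (n > 1) {
        int lo1 = 0, lo2 = 1;                   /* indices of the two smallest frequencies */
        if (pool[lo2]->freq < pool[lo1]->freq) { int t = lo1; lo1 = lo2; lo2 = t; }
        for (int i = 2; i < n; i++) {
            if (pool[i]->freq < pool[lo1]->freq)      { lo2 = lo1; lo1 = i; }
            else if (pool[i]->freq < pool[lo2]->freq) { lo2 = i; }
        }
        Node *merged = make('\0', pool[lo1]->freq + pool[lo2]->freq,
                            pool[lo1], pool[lo2]);
        if (lo1 > lo2) { int t = lo1; lo1 = lo2; lo2 = t; }
        pool[lo1] = merged;                     /* replace one minimum with the merged node */
        pool[lo2] = pool[--n];                  /* and drop the other */
    }
    report(pool[0], 0);                         /* code lengths match the table: b=1, d=e=f=3, a=c=4 */
    return 0;
}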
Graphs

A graph is an abstract data type (ADT) which consists of a set of objects that
are connected to each other via links.

A graph G = (V, E)
o V is a set of vertices
o E is a set of edges
Terminology
Order: Order defines the total number of vertices present in the graph.
Size: Size defines the number of edges present in the graph.
Self-loop: An edge that connects a vertex to itself.
Isolated vertex: A vertex that is not connected to any other vertex in the
graph, i.e. a vertex of degree 0.
Vertex degree: It is defined as the number of edges incident to a vertex in a graph.
Weighted graph: A graph in which each edge is assigned a value or weight.
Unweighted graph: A graph whose edges have no associated values or weights.
Directed graph (Digraph): A graph in which every edge has a direction.
Undirected graph: A graph whose edges have no direction.
Pendant vertex: A vertex in a digraph is said to be pendant if its indegree
is equal to 1 and its outdegree is equal to 0.
Max. edges in a graph: If n is the total number of vertices in a graph, then
an undirected graph can have at most n(n-1)/2 edges and a digraph can
have at most n(n-1) edges.
Multigraph: A graph which contains loops or multiple edges between the
same pair of vertices is known as a multigraph.
Simple Graph: A graph which has no loops and no multiple edges is
known as a simple graph.
Regular Graph: A graph is regular if every vertex is adjacent to the same
number of vertices.
Planar Graph: A graph is called planar if it can be drawn in a
plane without any two edges intersecting.
NULL Graph: A graph which has only isolated vertices is called NULL
graph.
Strongly Connected Graph: A directed graph G is said to be connected or
strongly connected graph if for each pair (U, V) of nodes in G there is a
path from U to V and there is also a path from V to U.
Strongly Connected Components: A directed graph which is not strongly
connected may have different parts of the graph which are strongly
connected. These parts are called strongly connected components.
Unilaterally connected: A graph is said to be unilaterally connected if it
contains a directed path from u to v OR a directed path from v to u for every
pair of vertices u, v. Hence, for any pair of vertices, at least one vertex should
be reachable from the other.
Weakly Connected Graph: A directed graph is called weakly connected if,
when we remove the directions of all its edges, the resulting undirected
graph is connected; equivalently, every pair of vertices u and v is joined by
a path in the underlying undirected graph.

The directed graph G above is weakly connected since its underlying


undirected graph Ĝ is connected.
Articulation point: If on removing a
vertex from a connected graph, the
graph becomes disconnected then that
vertex is called the articulation point or
cut vertex.

Biconnected Graph: A connected


graph with no articulation points is
called a biconnected graph.

Forest: A forest is a disjoint union of


trees. In a forest there is at most one
path between any two vertices; this
means that there is either no path or a
single path between any two vertices.
Representation of Graph
To represent a graph, we just need
✓ Set of vertices
✓ For each vertex, the neighbors of the vertex (vertices which are directly
connected to it by an edge).
✓ If it is a weighted graph, then the weight will be associated with each
edge.
There are different ways to optimally represent a graph, depending on the
density of its edges, type of operations to be performed and ease of use.

1. Adjacency Matrix
2. Incidence Matrix
3. Adjacency List
1. Adjacency Matrix
An adjacency matrix is a matrix that maintains the information of adjacent
vertices.

A(m, n) = 1 iff (m, n) is an edge,
0 otherwise

For weighted graph: A( m, n) = w (weight of edge), or positive infinity


otherwise
Adjacency matrix

o Good for dense graphs: |E| ~ O(|V|²)
o Memory requirements: O(|V| + |E|) = O(|V|²)
o Connectivity between two vertices can be tested quickly
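As a small illustration of this representation, the following C sketch builds and queries an adjacency matrix for a hypothetical 4-vertex undirected, unweighted graph (not a graph from these slides); the constant-time lookup is the quick connectivity test mentioned above.

#include <stdio.h>

#define N 4                       /* number of vertices (assumption for the sketch) */

/* adj[u][v] == 1 iff (u, v) is an edge */
int adj[N][N];

void add_edge(int u, int v) {
    adj[u][v] = 1;
    adj[v][u] = 1;                /* mirror the entry because the graph is undirected */
}

int main(void) {
    add_edge(0, 1);
    add_edge(0, 2);
    add_edge(1, 2);
    add_edge(2, 3);

    /* connectivity between two vertices is an O(1) lookup */
    printf("edge (1,2)? %s\n", adj[1][2] ? "yes" : "no");
    printf("edge (0,3)? %s\n", adj[0][3] ? "yes" : "no");
    return 0;
}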

Adjacency list

o Good for sparse graphs: |E| ~ O(|V|)
o Memory requirements: O(|V| + |E|) = O(|V|)
o Vertices adjacent to a given vertex can be found quickly
2. Incidence Matrix
The incidence matrix of a directed graph is an n×m matrix B, where n and m are
the number of vertices and edges respectively, such that

B_ij = −1 if edge e_j leaves vertex v_i,
        1 if edge e_j enters vertex v_i,
        0 otherwise.
3. Adjacency List
If the graph is not dense, i.e. the number of edges is small, then it is efficient to
represent the graph through an adjacency list. An adjacency list is a linked
representation.
An adjacency list is a data structure used to represent a graph where each node
in the graph stores a list of its neighboring vertices.

How to build an Adjacency List?


❑ Create an array of linked lists of size N, where N is the number of vertices
in the graph.
❑ Create a linked list of adjacent vertices for each vertex in the graph.
❑ For each edge (u, v) in the graph, add v to the linked list of u, and add u to
the linked list of v if the graph is undirected otherwise add v to the list of u
if it is directed from u to v. (In case of weighted graphs store the weight
along with the connections).
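The build procedure above can be sketched in C. This is a minimal illustration for a small hypothetical undirected, unweighted graph (not a graph from these slides): an array of singly linked lists, one list per vertex, with each edge (u, v) inserted into both lists.

#include <stdio.h>
#include <stdlib.h>

#define N 5                       /* number of vertices (assumption for the sketch) */

typedef struct AdjNode {
    int vertex;
    struct AdjNode *next;
} AdjNode;

AdjNode *adj[N];                  /* adj[u] points to the list of u's neighbors */

static void push(int u, int v) {  /* insert v at the head of u's list */
    AdjNode *node = malloc(sizeof(AdjNode));
    node->vertex = v;
    node->next = adj[u];
    adj[u] = node;
}

void add_edge(int u, int v) {     /* undirected: store the edge in both lists */
    push(u, v);
    push(v, u);
}

int main(void) {
    add_edge(0, 1); add_edge(0, 2); add_edge(1, 3); add_edge(2, 4);

    for (int u = 0; u < N; u++) {             /* print each vertex's neighbor list */
        printf("%d:", u);
        for (AdjNode *p = adj[u]; p; p = p->next)
            printf(" %d", p->vertex);
        printf("\n");
    }
    return 0;
}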
Graph Searching

Problem: Find a path between two nodes of the graph.

Methods:
❑ Breadth First Search
❑ Depth First Search

Breadth First Search (BFS)

Breadth First Search (BFS) is a


graph traversal algorithm that
explores all the vertices in a graph at
the current depth before moving on
to the vertices at the next depth
level. It starts at a specified vertex
and visits all its neighbors before
moving on to the next level of
neighbors.
Breadth First Search (BFS) for a Graph Algorithm:

Initialization: Enqueue the starting node into a queue and mark it as visited.
Exploration: While the queue is not empty:
• Dequeue a node from the queue and visit it (e.g., print its value).
• For each unvisited neighbor of the dequeued node:
▪ Enqueue the neighbor into the queue.
▪ Mark the neighbor as visited.
Termination: Repeat step 2 until the queue is empty.

Time Complexity of BFS Algorithm: O(V + E)


BFS explores all the vertices and edges in the graph. In the worst case, it
visits every vertex and edge once. Therefore, the time complexity of BFS is
O(V + E), where V and E are the number of vertices and edges in the given
graph.
Space Complexity of BFS Algorithm: O(V)
BFS uses a queue to keep track of the vertices that need to be visited. In the
worst case, the queue can contain all the vertices in the graph. Therefore, the
space complexity of BFS is O(V), where V is the number of vertices in the
given graph.
Procedure BREADTH-FIRST (G)
1. Initialize all vertices as "unvisited".
2. Let Q be a queue.
3. Enqueue the root on Q.
4. While Q not empty, do
5. begin
6. Let n <- Dequeue Q.
7.  If n is marked as "unvisited", then
8. begin
9. Mark n as "visited", and output n to the terminal.
10. For each vertex v in Adj[n], do
11. If v is marked as "unvisited", then
12. enqueue v on Q.
13. end
14. end
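A C sketch of BFS, assuming the graph is stored as an adjacency matrix and using a simple array-based queue. It follows the numbered algorithm steps above (a vertex is marked visited when enqueued, rather than when dequeued as in the procedure); the 5-vertex graph is a hypothetical example, not the graph used in the walkthrough below.

#include <stdio.h>

#define N 5                        /* number of vertices (assumption for the sketch) */

int adj[N][N] = {                  /* hypothetical graph: edges 0-1, 0-2, 1-3, 2-4 */
    {0,1,1,0,0},
    {1,0,0,1,0},
    {1,0,0,0,1},
    {0,1,0,0,0},
    {0,0,1,0,0}
};

void bfs(int start) {
    int visited[N] = {0};
    int queue[N], front = 0, rear = 0;

    queue[rear++] = start;                    /* enqueue the starting node */
    visited[start] = 1;                       /* and mark it as visited */

    while (front < rear) {                    /* while the queue is not empty */
        int n = queue[front++];               /* dequeue a node */
        printf("%d ", n);                     /* visit it */
        for (int v = 0; v < N; v++)
            if (adj[n][v] && !visited[v]) {   /* each unvisited neighbor */
                visited[v] = 1;               /* mark it as visited */
                queue[rear++] = v;            /* and enqueue it */
            }
    }
    printf("\n");
}

int main(void) {
    bfs(0);                                   /* prints: 0 1 2 3 4 */
    return 0;
}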
Example:

Step 1: Initially, the queue and visited arrays are empty.

Step 2: Push node 0 into the queue and mark it visited.

Step 3: Remove node 0 from the front of the queue, visit its unvisited neighbors
and push them into the queue.

Step 4: Remove node 1 from the front of the queue, visit its unvisited neighbors
and push them into the queue.
Step 5: Remove node 2 from the front of the queue, visit its unvisited neighbors
and push them into the queue.

Step 6: Remove node 3 from the front of the queue, visit its unvisited neighbors
and push them into the queue. As every neighbor of node 3 is already visited,
move to the next node at the front of the queue.
Step 7: Remove node 4 from the front of the queue, visit its unvisited neighbors
and push them into the queue. As every neighbor of node 4 is already visited,
move to the next node at the front of the queue.

Now the queue is empty, so the traversal terminates.


Depth First Search (DFS)

Depth First Search (DFS) is a graph traversal algorithm that travels along a
path in the graph and, when a dead end is reached, backtracks; i.e. we
traverse along a path as deep as we can.
Depth First Search (DFS) for a Graph Algorithm:
Step 1 − Visit an adjacent unvisited vertex. Mark it as visited. Display it. Push
it onto a stack.

Step 2 − If no adjacent unvisited vertex is found, pop a vertex from the stack.
(This will pop all the vertices from the stack that have no unvisited adjacent
vertices.)

Step 3 − Repeat Step 1 and Step 2 until the stack is empty.

Time Complexity of DFS Algorithm: O(V + E)


DFS explores all the vertices and edges in the graph. In the worst case, it
visits every vertex and edge once. Therefore, the time complexity of DFS is
O(V + E), where V and E are the number of vertices and edges in the given
graph.
Space Complexity of DFS Algorithm: O(V)
DFS uses a stack to keep track of the vertices that need to be visited. In the
worst case, the stack can contain all the vertices in the graph. Therefore, the
space complexity of DFS is O(V), where V is the number of vertices in the
given graph.
Procedure DEPTH-FIRST (G)
1. Initialize all vertices as "unvisited".
2. Let S be a stack.
3. Push the root on S.
4. While S not empty, do
5. begin
6. Let n <- Pop S.
7.  If n is marked as "unvisited", then
8. begin
9. Mark n as "visited", and output n to the terminal.
10. For each vertex v in Adj[n], do
11. If v is marked as "unvisited", then // this test is actually redundant
12. push v on S.
13. end
14. end
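A C sketch of the iterative procedure above, again assuming an adjacency matrix and the same small hypothetical graph used in the BFS sketch. As in the pseudocode, a vertex is marked visited only when it is popped, so the stack may temporarily hold duplicates.

#include <stdio.h>

#define N 5                        /* number of vertices (assumption for the sketch) */

int adj[N][N] = {                  /* same hypothetical graph: edges 0-1, 0-2, 1-3, 2-4 */
    {0,1,1,0,0},
    {1,0,0,1,0},
    {1,0,0,0,1},
    {0,1,0,0,0},
    {0,0,1,0,0}
};

void dfs(int start) {
    int visited[N] = {0};
    int stack[N * N], top = 0;     /* generous bound: a vertex may be pushed once per incident edge */

    stack[top++] = start;                     /* push the root */
    while (top > 0) {                         /* while the stack is not empty */
        int n = stack[--top];                 /* pop */
        if (visited[n]) continue;             /* already visited via an earlier push */
        visited[n] = 1;
        printf("%d ", n);                     /* visit and output */
        for (int v = N - 1; v >= 0; v--)      /* push unvisited neighbors (reverse order
                                                 so the lowest-numbered is popped first) */
            if (adj[n][v] && !visited[v])
                stack[top++] = v;
    }
    printf("\n");
}

int main(void) {
    dfs(0);                                   /* prints: 0 1 3 2 4 */
    return 0;
}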
Example:

Step 1: Initially, the stack and visited arrays are empty.

Step 2: Visit 0 and push its adjacent nodes which are not visited yet onto the
stack.
Step 3: Node 1 is at the top of the stack; visit node 1, pop it from the stack and
push all of its adjacent nodes which are not visited onto the stack.

Step 4: Node 2 is at the top of the stack; visit node 2, pop it from the stack and
push all of its adjacent nodes which are not visited (i.e. 3, 4) onto the stack.
Step 5: Node 4 is at the top of the stack; visit node 4, pop it from the stack and
push all of its adjacent nodes which are not visited onto the stack.

Step 6: Node 3 is at the top of the stack; visit node 3, pop it from the stack and
push all of its adjacent nodes which are not visited onto the stack.
Applications of Depth-First-Search (DFS) :

▪ For an unweighted graph, a DFS traversal of the graph produces a
minimum spanning tree (since all edges have equal weight, any spanning
tree is a minimum spanning tree).
▪ Detecting a cycle in a graph:
o A graph has a cycle if and only if we see a back edge during DFS. So
we can run DFS on the graph and check for back edges.
▪ Path Finding
▪ Topological sorting
▪ Solving puzzles with only one solution, such as mazes.

Applications of Breadth-First-Search (BFS) :

▪ Shortest Path and Minimum Spanning Tree for unweighted graph


▪ Peer to Peer Networks:
o In Peer to Peer Networks like BitTorrent, BFS is used to find all
neighbor nodes.
▪ GPS Navigation systems
▪ Social Networking Websites:
o In social networks, we can find people within a given distance ‘k’
from a person using Breadth First Search till ‘k’ levels.
Spanning Tree: A spanning tree is defined as a subgraph of a connected
undirected graph that covers all the vertices with the minimum possible
number of edges.
In the spanning tree, the total number of edges is n-1. Here, n is the number of
vertices in the graph.

Remark: A complete undirected graph can have a maximum of n^(n−2)
spanning trees, where n is the number of nodes.
General Properties of Spanning Tree

❑ A connected graph G can have more than one spanning tree.

❑ All possible spanning trees of graph G, have the same number of edges
and vertices.

❑ The spanning tree does not have any cycle (loops).

❑ Removing one edge from the spanning tree will make the graph
disconnected, i.e. the spanning tree is minimally connected.

❑ Adding one edge to the spanning tree will create a circuit or loop, i.e.
the spanning tree is maximally acyclic.
Minimum Spanning Tree (MST)
A minimum spanning tree (or minimum cost spanning tree) is the spanning
tree which covers all the vertices of the graph and whose sum of edge
weights is minimum among all spanning trees of that graph.

Two main algorithms are used to find a minimum spanning tree:

❑ Kruskal’s Algorithm
❑ Prim’s Algorithm
❑ Kruskal’s Algorithm:

Kruskal’s algorithm builds a minimum cost spanning tree T by adding edges


to T one at a time. The algorithm selects the edges for inclusion in T in
nondecreasing order of their cost. An edge is added to T if it does not form a
cycle with the edges that are already in T.
T ={};
while(T contains less than n-1 edges && E is not empty)
{
choose a least cost edge (v,w) from E;
delete (v, w) from E;
If((v, w) does not create a cycle in T)
add (v, w) to T;
else
discard (v, w);
}
If (T contains fewer than n-1 edges)
printf(“ No Spanning tree”);
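The cycle test inside the loop is usually implemented with a disjoint-set (union-find) structure: if the two endpoints of an edge already share a representative, the edge would form a cycle. The following C sketch fills in that detail for a small hypothetical edge list (4 vertices, 5 edges), not the 9-vertex example that follows.

#include <stdio.h>
#include <stdlib.h>

#define MAXV 16

int parent[MAXV];                 /* disjoint-set forest: parent[x] == x means x is a root */

int find(int x) {                 /* find the set representative (with path halving) */
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

typedef struct { int w, u, v; } Edge;

int by_weight(const void *a, const void *b) {
    return ((const Edge *)a)->w - ((const Edge *)b)->w;
}

int main(void) {
    /* hypothetical graph: 4 vertices, 5 weighted edges */
    Edge e[] = { {1,0,1}, {4,1,2}, {3,0,2}, {2,2,3}, {5,1,3} };
    int m = 5, n = 4, taken = 0;

    for (int i = 0; i < n; i++) parent[i] = i;
    qsort(e, m, sizeof(Edge), by_weight);      /* nondecreasing order of cost */

    for (int i = 0; i < m && taken < n - 1; i++) {
        int ru = find(e[i].u), rv = find(e[i].v);
        if (ru != rv) {                        /* different components: no cycle, include the edge */
            parent[ru] = rv;                   /* union the two components */
            printf("take (%d,%d) w=%d\n", e[i].u, e[i].v, e[i].w);
            taken++;
        }                                      /* same component: the edge would form a cycle, discard it */
    }
    if (taken < n - 1) printf("No spanning tree\n");
    return 0;
}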
Example:

The graph contains 9 vertices and 14 edges. So, the minimum spanning tree
formed will have (9 – 1) = 8 edges.

After sorting the edges in nondecreasing order of weight:

Weight   Source   Destination
1        7        6
2        8        2
2        6        5
4        0        1
4        2        5
6        8        6
7        2        3
7        7        8
8        0        7
8        1        2
9        3        4
10       5        4
11       1        7
14       3        5
Step 1: Pick edge 7-6. No cycle is formed, include it.

Step 2: Pick edge 8-2. No cycle is formed, include it.


Step 3: Pick edge 6-5. No cycle is formed, include it.

Step 4: Pick edge 0-1. No cycle is formed, include it.


Step 5: Pick edge 2-5. No cycle is formed, include it.

Step 6: Pick edge 8-6. Since including this edge results in a cycle, discard it.
Pick edge 2-3: No cycle is formed, include it.
Step 7: Pick edge 7-8. Since including this edge results in a cycle, discard it.
Pick edge 0-7. No cycle is formed, include it.

Step 8: Pick edge 1-2. Since including this edge results in a cycle, discard it.
Pick edge 3-4. No cycle is formed, include it.
❑ Prim’s Algorithm:

Prim’s algorithm, like Kruskal’s algorithm, constructs a minimum cost
spanning tree T by adding edges to T one at a time. However, at each stage
of the algorithm, the set of selected edges forms a tree.

T ={};
TV = {0}; /* start with vertex 0 and no edges*/
while(T contains fewer than n-1 edges)
{
let (u, v) be a least cost edge such that u ∈ TV and v ∉ TV;
If(there is no such edge)
break;
add v to TV;
add (u, v) to T;
}
If (T contains fewer than n-1 edges)
printf(“ No Spanning tree”);
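A C sketch of the procedure, assuming the graph is stored as an adjacency matrix (0 = no edge) and scanning all tree/non-tree vertex pairs for the least-cost crossing edge; the 5-vertex graph is a hypothetical example, not the 9-vertex graph used in the walkthrough below.

#include <stdio.h>

#define N 5                        /* number of vertices (assumption for the sketch) */
#define INF 1000000

int w[N][N] = {                    /* hypothetical weighted graph, 0 = no edge */
    {0, 2, 0, 6, 0},
    {2, 0, 3, 8, 5},
    {0, 3, 0, 0, 7},
    {6, 8, 0, 0, 9},
    {0, 5, 7, 9, 0}
};

int main(void) {
    int inTV[N] = {1};             /* TV = {0}: start with vertex 0 and no edges */
    int total = 0;

    for (int k = 0; k < N - 1; k++) {          /* the tree needs n-1 edges */
        int bu = -1, bv = -1, best = INF;
        for (int u = 0; u < N; u++)            /* least-cost edge (u, v) with u in TV, v not in TV */
            if (inTV[u])
                for (int v = 0; v < N; v++)
                    if (!inTV[v] && w[u][v] && w[u][v] < best) {
                        best = w[u][v]; bu = u; bv = v;
                    }
        if (bv < 0) { printf("No spanning tree\n"); return 0; }
        inTV[bv] = 1;                          /* add v to TV */
        total += best;                         /* add (u, v) to T */
        printf("take (%d,%d) w=%d\n", bu, bv, best);
    }
    printf("total weight = %d\n", total);
    return 0;
}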
Example:

Step 1: Firstly, we select an arbitrary vertex that acts as the starting vertex of the
Minimum Spanning Tree. Here we have selected vertex 0 as the starting vertex.
Step 2: All the edges connecting the
incomplete MST and other vertices
are the edges {0, 1} and {0, 7}.
Between these two the edge with
minimum weight is {0, 1}. So
include the edge and vertex 1 in the
MST.

Step 3: The edges connecting the


incomplete MST to other vertices
are {0, 7}, {1, 7} and {1, 2}.
Among these edges the minimum
weight is 8 which is of the edges
{0, 7} and {1, 2}. Let us here
include the edge {0, 7} and the
vertex 7 in the MST. [We could
have also included edge {1, 2} and
vertex 2 in the MST].
Step 4: The edges that connect the
incomplete MST with the fringe
vertices are {1, 2}, {7, 6} and {7,
8}. Add the edge {7, 6} and the
vertex 6 in the MST as it has the
least weight (i.e., 1).

Step 5: The connecting edges now


are {7, 8}, {1, 2}, {6, 8} and {6,
5}. Include edge {6, 5} and vertex
5 in the MST as the edge has the
minimum weight (i.e., 2) among
them.
Step 6: Among the current
connecting edges, the edge {5, 2}
has the minimum weight. So include
that edge and the vertex 2 in the
MST.

Step 7: The connecting edges
between the incomplete MST and
the other vertices are {2, 8}, {2, 3},
{5, 3} and {5, 4}. The edge with
minimum weight is edge {2, 8}
which has weight 2. So include
this edge and the vertex 8 in the
MST.
Step 8: Here the edges {7, 8} and
{2, 3} both have the same minimum
weight. But both endpoints 7 and 8
are already part of the MST, so the
edge {7, 8} adds no new vertex. So
we consider the edge {2, 3} and
include that edge and vertex 3 in
the MST.

Step 9: Only the vertex 4 remains


to be included. The minimum
weighted edge from the
incomplete MST to 4 is {3, 4}.
The final structure of the MST is as follows and the weight of the edges of
the MST is (4 + 8 + 1 + 2 + 4 + 2 + 7 + 9) = 37.
Dijkstra’s Algorithm:

Dijkstra’s algorithm is a popular algorithm for solving single-source
shortest path problems on graphs with non-negative edge weights, i.e. it
finds the shortest distance from a given source vertex to every other vertex
in the graph.

Dijkstra's algorithm - Pseudocode


dist[s] ←0 (distance to source vertex is zero)
for all v ∈ V–{s}
do dist[v] ←∞ (set all other distances to infinity)
S←∅ (S, the set of visited vertices is initially empty)
Q←V (Q, the queue initially contains all vertices)
while Q ≠∅ (while the queue is not empty)
do u ← mindistance(Q, dist) (extract the element of Q with the min. distance and remove it from Q)
S←S∪{u} (add u to the set of visited vertices)
for all v ∈ neighbors[u]
do if dist[v] > dist[u] + w(u, v) (if a new shortest path is found)
then dist[v] ← dist[u] + w(u, v) (set the new value of the shortest path)
return dist
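A C sketch of the pseudocode, assuming an adjacency matrix (0 = no edge) and a linear scan in place of mindistance(Q, dist); the 6-vertex graph with non-negative weights is a hypothetical example, not the graph used in the walkthrough below.

#include <stdio.h>

#define N 6                        /* number of vertices (assumption for the sketch) */
#define INF 1000000

int w[N][N] = {                    /* hypothetical graph with non-negative weights, 0 = no edge */
    {0, 4, 2, 0, 0, 0},
    {4, 0, 1, 5, 0, 0},
    {2, 1, 0, 8, 10, 0},
    {0, 5, 8, 0, 2, 6},
    {0, 0, 10, 2, 0, 3},
    {0, 0, 0, 6, 3, 0}
};

int main(void) {
    int dist[N], visited[N] = {0};             /* visited[] plays the role of the set S */
    int s = 0;                                 /* source vertex */

    for (int v = 0; v < N; v++) dist[v] = INF; /* all distances start at infinity */
    dist[s] = 0;                               /* distance to the source is zero */

    for (int k = 0; k < N; k++) {
        int u = -1;
        for (int v = 0; v < N; v++)            /* pick the unvisited vertex with min. distance */
            if (!visited[v] && (u < 0 || dist[v] < dist[u]))
                u = v;
        if (dist[u] == INF) break;             /* remaining vertices are unreachable */
        visited[u] = 1;                        /* add u to S */
        for (int v = 0; v < N; v++)            /* relax the edges leaving u */
            if (w[u][v] && dist[u] + w[u][v] < dist[v])
                dist[v] = dist[u] + w[u][v];
    }

    for (int v = 0; v < N; v++)
        printf("dist[%d] = %d\n", v, dist[v]);
    return 0;
}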
Example:

Generate the shortest path from node 0 to all the other nodes in the graph.
Step 1: Start from Node 0 and mark it as visited (in the image below the visited
node is marked red).
Step 2: Check the adjacent nodes. We now have two choices (either choose
Node 1 with distance 2 or choose Node 2 with distance 6); choose the node
with the minimum distance. In this step Node 1 is the minimum-distance
adjacent node, so mark it as visited and add up the distance.
Distance: Node 0 -> Node 1 = 2
Step 3: Move forward and check the adjacent node, which is Node 3; mark it as
visited and add up the distance. Now the distance will be:
Distance: Node 0 -> Node 1 -> Node 3 = 2 + 5 = 7
Step 4: Again we have two choices for adjacent nodes (either choose Node 4
with distance 10 or choose Node 5 with distance 15); choose the node with the
minimum distance. In this step Node 4 is the minimum-distance adjacent node,
so mark it as visited and add up the distance.
Distance: Node 0 -> Node 1 -> Node 3 -> Node 4 = 2 + 5 + 10 = 17
Step 5: Again, move forward and check the adjacent node, which is Node 6; mark
it as visited and add up the distance. Now the distance will be:
Distance: Node 0 -> Node 1 -> Node 3 -> Node 4 -> Node 6 = 2 + 5 + 10 + 2 =
19
