ada unit 2

The document explains the greedy algorithm, which selects the best option available at each step without reconsidering previous choices, and outlines its advantages and drawbacks. It details the Optimal Merge Pattern (OMP) for merging sorted files efficiently, emphasizing its greedy nature and applications in various fields like Huffman coding and data compression. Additionally, it describes the Huffman coding process for lossless data compression, highlighting its use of a binary tree to assign variable-length codes based on character frequency.

A greedy algorithm is an approach for solving a problem by selecting the best option available at the moment. It doesn't worry about whether the current best result will bring the overall optimal result.

The algorithm never reverses an earlier decision, even if that choice turns out to be wrong. It works in a top-down fashion.

This algorithm may not produce the best result for all problems, because it always goes for the locally best choice in the hope of producing the globally best result.

However, a problem can be solved with the greedy approach if it has the following two properties:

1. Greedy Choice Property

If an optimal solution to the problem can be found by choosing the best choice at each step without
reconsidering the previous steps once chosen, the problem can be solved using a greedy approach.
This property is called greedy choice property.

2. Optimal Substructure

If the optimal overall solution to the problem corresponds to the optimal solution to its subproblems,
then the problem can be solved using a greedy approach. This property is called optimal
substructure.

Advantages of Greedy Approach

 The algorithm is easier to describe.

 This algorithm can perform better than other algorithms (but, not in all cases).

Drawback of Greedy Approach

As mentioned earlier, the greedy algorithm doesn't always produce the optimal solution. This is the
major disadvantage of the algorithm.

For example, suppose we want to find the longest root-to-leaf path in a weighted tree (the illustrating figure is not reproduced here). A greedy algorithm that always moves to the larger child can commit to a branch whose total weight turns out smaller than the true longest path.

Greedy algorithms are a class of algorithms that make locally optimal choices at each step with the
hope of finding a global optimum solution.

 At every step of the algorithm, we make a choice that looks the best at the moment. To make
the choice, we sometimes sort the array so that we can always get the next optimal choice
quickly. We sometimes also use a priority queue to get the next optimal item.

 After making a choice, we check for constraints (if there are any) and keep picking until we
find the solution.

 Greedy algorithms do not always give the best solution. For example, in coin change and 0/1
knapsack problems, we get the best solution using Dynamic Programming.

 Examples of popular algorithms where Greedy gives the best solution are Fractional Knapsack, Dijkstra's algorithm, Kruskal's algorithm, Huffman coding and Prim's algorithm.

Greedy algorithms build a solution part by part, choosing the next part in such a way that it gives an immediate benefit. This approach never reconsiders the choices taken previously, and is mainly used to solve optimization problems. The greedy method is easy to implement and quite efficient in most cases. Hence, we can say that a greedy algorithm is an algorithmic paradigm based on a heuristic that makes the locally optimal choice at each step with the hope of finding a globally optimal solution.

In many problems, it does not produce an optimal solution though it gives an approximate (near
optimal) solution in a reasonable time.
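As a quick illustration of this drawback (an example added here for concreteness, not from the original figures): with coin denominations {1, 3, 4} and a target of 6, the greedy strategy of always taking the largest coin yields 4 + 1 + 1 (three coins), whereas the optimal answer is 3 + 3 (two coins).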

Areas of Application

Greedy approach is used to solve many problems, such as

 Finding the shortest path between two vertices using Dijkstra's algorithm.

 Finding the minimal spanning tree in a graph using Prim's /Kruskal's algorithm, etc.

Merge a set of sorted files of different lengths into a single sorted file. We need to find an optimal solution, where the resultant file will be generated in minimum time.

Given a number of sorted files, there are many ways to merge them into a single sorted file. This merge can be performed pair-wise; hence, this type of merging is called a 2-way merge pattern.

As different pairings require different amounts of time, in this strategy we want to determine an optimal way of merging many files together. At each step, the two shortest sequences are merged.

Merging a p-record file and a q-record file requires possibly p + q record moves, the obvious choice being: merge the two smallest files together at each step.

Two-way merge patterns can be represented by binary merge trees. Let us consider a set of n sorted files {f1, f2, f3, …, fn}. Initially, each element of this set is considered as a single-node binary tree. To find the optimal solution, the following algorithm is used.

Pseudocode

Following is the pseudocode of the Optimal Merge Pattern Algorithm −

for i := 1 to n − 1 do

   declare new node

   node.leftchild := least (list)   // least removes and returns the tree of smallest weight

   node.rightchild := least (list)

   node.weight := (node.leftchild).weight + (node.rightchild).weight

   insert (list, node)

return least (list)

At the end of this algorithm, the weight of the root node represents the optimal cost.

Examples
Let us consider the given files, f1, f2, f3, f4 and f5 with 20, 30, 10, 5 and 30 number of elements
respectively.

If merge operations are performed according to the provided sequence, then

M1 = merge f1 and f2 => 20 + 30 = 50

M2 = merge M1 and f3 => 50 + 10 = 60

M3 = merge M2 and f4 => 60 + 5 = 65

M4 = merge M3 and f5 => 65 + 30 = 95

Hence, the total number of operations is

50 + 60 + 65 + 95 = 270

Now the question arises: is there any better solution?

Sorting the numbers according to their size in an ascending order, we get the following sequence −

f4, f3, f1, f2, f5

Hence, merge operations can be performed on this sequence

M1 = merge f4 and f3 => 5 + 10 = 15

M2 = merge M1 and f1 => 15 + 20 = 35

M3 = merge M2 and f2 => 35 + 30 = 65

M4 = merge M3 and f5 => 65 + 30 = 95

Therefore, the total number of operations is

15 + 35 + 65 + 95 = 210

Obviously, this is better than the previous one.

In this context, we are now going to solve the problem using this algorithm.

Initial set: {20, 30, 10, 5, 30}

Step 1 − merge the two smallest files, 5 and 10 ⇒ cost 15

Step 2 − merge 15 and 20 ⇒ cost 35

Step 3 − merge 30 and 30 ⇒ cost 60

Step 4 − merge 35 and 60 ⇒ cost 95

Hence, the solution takes 15 + 35 + 60 + 95 = 205 operations.

Let us now elaborate the theory of the Optimal Merge Pattern (OMP) with greater conceptual depth, across its background, principles, mathematical analysis, and extensions.

🔍 Optimal Merge Pattern — In-Depth Theory

🧱 1. Foundational Principle

The Optimal Merge Pattern problem arises when you must combine multiple sorted files (or
datasets) into a single one, minimizing the total cost of doing so.

 Cost of merging two files: Equal to the sum of their sizes.

 Objective: Minimize the total cost incurred by merging all files into one, step-by-step.

This is a greedy strategy where we always make the locally optimal choice (merge the two smallest
files) at each step, which also turns out to be globally optimal.
🧠 2. Why Greedy Works Here

To understand why the greedy approach is optimal, consider:

 Merging two large files early causes larger costs in later steps.

 Merging smaller files early keeps the intermediate merged files small, reducing cascading
cost.

This is similar to how Huffman encoding builds a binary tree by combining least-frequent symbols
first, minimizing total weighted path length.

3. Step-by-Step Merge Process

Given files: f₁, f₂, ..., fₙ


Let’s say we merge fᵢ and fⱼ (where fᵢ ≤ fⱼ ≤ others)

 Cost = fᵢ + fⱼ

 New file = size fᵢ + fⱼ → Reinsert into the list.

Repeat until one file remains.

Total cost = Sum of all merge costs

This process forms a binary merge tree, where:

 Leaves = original files

 Internal nodes = merged files

 Root = final single file

📊 4. Mathematical Formulation

Let the files be merged in n - 1 steps, producing merged sizes m₁, m₂, ..., mₙ₋₁.
Then:

Total Cost = ∑ᵢ₌₁ⁿ⁻¹ mᵢ

Where each m_i is the sum of the two smallest files (or merged files) at that step.

This cost is minimized when:

 We always combine the two smallest files

 Done using a priority queue (min-heap)

🔄 5. Comparison With Naive Methods


Strategy Description Total Cost

Naive Merge Merge arbitrarily or sequentially Higher

Optimal Merge (OMP) Merge smallest files first using greedy method Minimum cost

So, unlike naive merge (e.g., left-to-right), OMP ensures we avoid "expensive" large early merges.

🌲 6. Binary Tree Representation

 The merging process can be visualized as a binary tree:

o Each merge = internal node

o File sizes = weights at leaves

o Total cost = sum of internal node values

This is conceptually similar to a Huffman Tree, where:

 File sizes = symbol frequencies

 Total cost = weighted sum of depth × size

⚙️ 7. Algorithm Implementation Outline

Using Min-Heap:

1. Insert all file sizes into a min-heap.

2. Repeat:

o Extract two smallest files a and b.

o Merge them → cost = a + b

o Add cost to total

o Insert a + b back into the heap

3. Stop when only one element remains.
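A minimal Python sketch of this min-heap procedure, using the standard heapq module; the file sizes passed in at the end are the ones from the worked example above:

import heapq

def optimal_merge_cost(sizes):
    # Build a min-heap of file sizes.
    heap = list(sizes)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        # Extract the two smallest files and merge them.
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        merged = a + b
        total += merged              # cost of this merge step
        heapq.heappush(heap, merged)
    return total

print(optimal_merge_cost([20, 30, 10, 5, 30]))   # prints 205, matching the example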

📘 8. Applications in Real-World Systems

 Huffman Coding:

o OMP is the basis for Huffman's algorithm.

o Optimal encoding minimizes average code length.

 External Sorting:

o In large-scale systems (databases), merging sorted chunks from disk is optimized using OMP.
 Data Compression Tools:

o Algorithms like ZIP, JPEG, MP3 use similar tree-building methods.

 Tape Drive Optimization (old systems):

o Data merging from multiple tapes minimized mechanical operations.

 Compiler Design:

o Used in intermediate code optimization and tree restructuring.

 Job Scheduling:

o In cloud systems, optimal merging can reduce cumulative job completion times.

⏳ 9. Time and Space Complexity

 Time Complexity: O(n log n)

o Due to n - 1 heap operations, each taking O(log n)

 Space Complexity: O(n)

o Priority queue holds up to n file sizes at any point

🧠 10. Key Properties

Property Value/Explanation

Problem type Greedy, Optimization

Data structure used Min-Heap (Priority Queue)

Cost calculation Sum of intermediate merge sizes

Minimum number of merges Always n - 1 for n files

Tree structure Binary merge tree

Huffman Coding is an algorithm used for lossless data compression.

Huffman Coding is also used as a component in many different compression algorithms. It is used as
a component in lossless compressions such as zip, gzip, and png, and even as part of lossy
compression algorithms like mp3 and jpeg.

Huffman Coding is a technique of compressing data to reduce its size without losing any of the
details. It was first developed by David Huffman.

Huffman Coding is generally useful to compress the data in which there are frequently occurring
characters.

How does Huffman Coding work?


Suppose a string is to be sent over a network. (The original figure showing the initial string is not reproduced here; the character frequencies used below are A: 5, B: 1, C: 6, D: 3.)

Each character occupies 8 bits. There are a total of 15 characters in the string. Thus, a total of 8 * 15 = 120 bits are required to send this string.

Using the Huffman Coding technique, we can compress the string to a smaller size.

Huffman coding first creates a tree using the frequencies of the character and then generates code
for each character.

Once the data is encoded, it has to be decoded. Decoding is done using the same tree.

Huffman Coding prevents any ambiguity in the decoding process using the concept of prefix codes, i.e., the code assigned to one character must not be a prefix of the code assigned to any other character. The tree created above helps in maintaining this property.

Huffman coding is done with the help of the following steps.

1. Calculate the frequency of each character in the string.

2. Sort the characters in increasing order of frequency. These are stored in a priority queue Q.

3. Make each unique character a leaf node.

4. Create an empty node z. Assign the minimum frequency to the left child of z and the second minimum frequency to the right child of z. Set the value of z as the sum of these two minimum frequencies.

5. Remove these two minimum frequencies from Q and add the sum into the list of frequencies (* denotes the internal nodes in the original figures).

6. Insert node z into the tree.

7. Repeat steps 3 to 5 for all the characters.

8. For each non-leaf node, assign 0 to the left edge and 1 to the right edge.

For sending the above string over a network, we have to send the tree as well as the compressed code. The total size is given by the table below.

Character Frequency Code Size

A 5 11 5*2 = 10

B 1 100 1*3 = 3

C 6 0 6*1 = 6

D 3 101 3*3 = 9

Totals: characters 4 * 8 = 32 bits; frequencies 15 bits; encoded string 28 bits


Without encoding, the total size of the string was 120 bits. After encoding, the size is reduced to 32 + 15 + 28 = 75 bits.

Formulas Used in Huffman Tree

Average code length per character = ∑(frequencyᵢ × code lengthᵢ) / ∑ frequencyᵢ

Total number of bits in Huffman-encoded message

= Total number of characters in the message × Average code length per character

= ∑(frequencyᵢ × code lengthᵢ)
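Applying these formulas to the example above: average code length = (5×2 + 1×3 + 6×1 + 3×3) / (5 + 1 + 6 + 3) = 28 / 15 ≈ 1.87 bits per character, so the encoded message needs 15 × 28/15 = 28 bits in total, matching the Size column of the table.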

Here is a more detailed explanation of Huffman Coding, including its theory, process, example, and applications.

🧠 What is Huffman Coding?

Huffman Coding is a lossless data compression algorithm used to reduce the average size of data by
encoding frequently occurring characters with shorter binary codes and less frequent ones with
longer codes.

It's based on the idea that:

"Not all characters occur equally often—so don’t give them equal-length codes."

📌 Key Characteristics

Property Detail

Type Lossless Compression

Based on Frequency (or probability) of characters

Strategy Greedy Algorithm

Tree used Binary Tree (Huffman Tree)

Related to Optimal Merge Pattern

Output Prefix Code (no code is a prefix of another)

📚 Steps in Huffman Coding

Let’s assume you are given a list of characters with their frequencies:

🔧 Step 1: List Characters & Frequencies

Example:
Character Frequency

A 5

B 9

C 12

D 13

E 16

F 45

🌲 Step 2: Build Huffman Tree

1. Create leaf nodes for each character.

2. Insert them into a min-heap (priority queue) based on frequency.

3. Repeat until one node remains:

o Extract two nodes with the lowest frequencies.

o Create a new internal node with:

 Frequency = sum of both

 Left & Right children = the two nodes

o Insert this internal node back into the heap.

4. The final node is the root of the Huffman Tree.

🧵 Step 3: Assign Binary Codes

 Traverse the Huffman Tree:

o Left edge → assign 0

o Right edge → assign 1

 Continue until you reach leaf nodes (characters).

 The code for each character is the binary path from root to leaf.

✏️ Example: Huffman Codes from the Above Table

Character Huffman Code

F 0

C 100

D 101

A 1100

B 1101

E 111

The more frequent the character, the shorter its code.
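A minimal Python sketch of the tree-building and code-assignment steps above, using heapq as the min-heap. A counter is paired with each heap entry purely as a tie-breaker so tree nodes are never compared directly; with this tie-breaking, the sketch reproduces the code table above:

import heapq
from itertools import count

def huffman_codes(freq):
    # Heap entries: (frequency, tie-breaker, tree); a tree is either a
    # character (leaf) or a [left, right] pair (internal node).
    tiebreak = count()
    heap = [(f, next(tiebreak), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # two lowest-frequency nodes
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), [left, right]))
    codes = {}
    def walk(tree, code):
        if isinstance(tree, list):
            walk(tree[0], code + "0")        # left edge -> 0
            walk(tree[1], code + "1")        # right edge -> 1
        else:
            codes[tree] = code or "0"        # single-character edge case
    walk(heap[0][2], "")
    return codes

print(huffman_codes({"A": 5, "B": 9, "C": 12, "D": 13, "E": 16, "F": 45}))
# {'F': '0', 'C': '100', 'D': '101', 'A': '1100', 'B': '1101', 'E': '111'}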

💡 Why Huffman Coding Works

 Prefix Property: No code is a prefix of another, so decoding is unambiguous.

 Greedy Approach: Always merge the least frequent nodes first to keep total bit cost low.

 Optimality: Among all prefix codes, Huffman gives minimum average code length.

📈 Time and Space Complexity

Operation Complexity

Building Tree O(n log n)

Encoding O(n)

Decoding O(L), where L = total bits

🧰 Applications of Huffman Coding

 File compression (ZIP, RAR, 7z)

 Multimedia compression (JPEG, MP3, MPEG)

 Compiler Design: Intermediate code compression

 Telecommunication Systems: Efficient encoding

🔄 Relation to Optimal Merge Pattern

Huffman Coding is a special case of Optimal Merge Pattern:

 You merge two lowest-frequency nodes just like you merge the two smallest files.

 Instead of minimizing merge cost, you minimize total encoding length.

🔠 Huffman Coding vs Fixed Length


Say we want to represent 6 characters:

 Fixed-length code: Needs at least 3 bits for each → total = 3 * total chars

 Huffman code: Uses variable lengths (1–4 bits), reducing average bits per char.

✅ Summary

 Huffman Coding = Optimal Binary Prefix Code

 Built using Greedy Strategy and Min-Heap

 Efficient for compression where some items are more frequent than others

Prim's minimal spanning tree algorithm is one of the efficient methods to find the minimum spanning tree of a graph. A minimum spanning tree is a subgraph that connects all the vertices present in the main graph with the fewest possible edges and minimum cost (sum of the weights assigned to each edge).

The algorithm, much like a shortest-path algorithm, begins from a vertex that is set as the root and walks through all the vertices in the graph by repeatedly determining the least-cost adjacent edge.

Prim's Algorithm

To execute Prim's algorithm, the inputs taken by the algorithm are the graph G {V, E}, where V is the set of vertices and E is the set of edges, and the source vertex S. A minimum spanning tree of graph G is obtained as an output.

Algorithm

 Declare an array visited[] to store the visited vertices and firstly, add the arbitrary root, say S,
to the visited array.

 Check whether the adjacent vertices of the last visited vertex are present in the visited[]
array or not.

 If the vertices are not in the visited[] array, compare the cost of edges and add the least cost
edge to the output spanning tree.

 The adjacent unvisited vertex with the least cost edge is added into the visited[] array and
the least cost edge is added to the minimum spanning tree output.
 Steps 2 and 4 are repeated for all the unvisited vertices in the graph to obtain the full
minimum spanning tree output for the given graph.

 Calculate the cost of the minimum spanning tree obtained.
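A minimal Python sketch of these steps using a lazy min-heap of candidate edges; the adjacency list at the bottom is reconstructed from the edges named in the walkthrough that follows, so it is illustrative rather than the original figure:

import heapq

def prims_mst(graph, root):
    # graph: {vertex: [(weight, neighbour), ...]}
    visited = {root}
    heap = list(graph[root])            # candidate edges leaving the tree
    heapq.heapify(heap)
    cost = 0
    while heap and len(visited) < len(graph):
        w, v = heapq.heappop(heap)      # least-cost candidate edge
        if v in visited:
            continue                    # both ends already in the tree; skip
        visited.add(v)
        cost += w
        for edge in graph[v]:
            if edge[1] not in visited:
                heapq.heappush(heap, edge)
    return cost

graph = {
    "S": [(8, "B")],
    "B": [(8, "S"), (9, "A"), (16, "C"), (14, "E")],
    "A": [(9, "B"), (22, "C"), (11, "E")],
    "E": [(14, "B"), (11, "A"), (18, "C"), (3, "D")],
    "D": [(3, "E"), (15, "C")],
    "C": [(16, "B"), (22, "A"), (18, "E"), (15, "D")],
}
print(prims_mst(graph, "S"))            # prints 46, matching the walkthrough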

Examples

 Find the minimum spanning tree using Prim's method (greedy approach) for the graph given below with S as the arbitrary root.

Solution

Step 1

Create a visited array to store all the visited vertices into it.

V={}

The arbitrary root is mentioned to be S, so among all the edges that are connected to S we need to
find the least cost edge.

S→B=8

V = {S, B}

Step 2

Since B is the last visited, check for the least cost edge that is connected to the vertex B.

B→A=9

B → C = 16

B → E = 14

Hence, B → A is the edge added to the spanning tree.

V = {S, B, A}
Step 3

Since A is the last visited, check for the least cost edge that is connected to the vertex A.

A → C = 22

A→B=9

A → E = 11

But A → B is already in the spanning tree, check for the next least cost edge. Hence, A → E is added
to the spanning tree.

V = {S, B, A, E}

Step 4

Since E is the last visited, check for the least cost edge that is connected to the vertex E.

E → C = 18

E→D=3

Therefore, E → D is added to the spanning tree.

V = {S, B, A, E, D}

Step 5
Since D is the last visited, check for the least cost edge that is connected to the vertex D.

D → C = 15

(The edge E → D = 3 is already part of the spanning tree.) Therefore, D → C is added to the spanning tree.

V = {S, B, A, E, D, C}

The minimum spanning tree is obtained with the minimum cost = 46

The time complexity of Prim's algorithm depends on the data structures used for the graph:

🧮 1. Using an Adjacency Matrix + Linear Search

 For each of the V vertices, we find the minimum key value vertex not yet included in MST →
O(V)

 This process is repeated for V vertices

 For each selected vertex, we update its adjacent vertices (V times)

🔹 Time Complexity:

O(V²)
✅ Simple and works well for dense graphs (many edges)

🧮 2. Using an Adjacency List + Min-Heap (Priority Queue)

 Uses a Min-Heap to pick the next minimum weight edge

 We insert all vertices → O(V) insertions

 Extract-min and decrease-key operations → O(log V)

 For all edges (E), we might need to update adjacent vertices in the heap

🔹 Time Complexity:

O(E log V)
✅ Efficient for sparse graphs (E ≈ V)

📌 Summary Table:
Implementation Method Time Complexity Use Case

Adjacency Matrix + Linear Search O(V²) Dense Graphs

Adjacency List + Min Heap O(E log V) Sparse Graphs

💡 Applications of Prim’s Algorithm

Prim's algorithm is used in real-world scenarios where we need to minimize the total cost of
connecting all components, such as:

1. Network Design

 Telecommunication networks: Laying cables between network nodes (servers, routers) at minimum cost

 Internet: Constructing the shortest layout for LANs, WANs, or backbone infrastructure

2. Electrical Grid Design

 Designing power lines between substations while reducing the total wiring cost

3. Road or Railway Network Planning

 Connecting cities or stations with minimal total construction cost

4. Cluster Analysis in AI & ML

 Used in hierarchical clustering where you want to group points into clusters by connecting
them with minimum distance

5. Image Processing

 In segmentation, MST helps to group pixels together with minimal dissimilarity

Kruskal's minimal spanning tree algorithm is one of the efficient methods to find the minimum
spanning tree of a graph. A minimum spanning tree is a subgraph that connects all the vertices
present in the main graph with the least possible edges and minimum cost (sum of the weights
assigned to each edge).

The algorithm starts from a forest, defined as a subgraph containing only the vertices of the main graph, and then adds the least-cost edges one by one until the minimum spanning tree is created without forming cycles in the graph.

Kruskal's algorithm is often easier to implement than Prim's algorithm, but its up-front edge sort can make it costlier on dense graphs.
Kruskal's Algorithm

The input taken by Kruskal's algorithm is the graph G {V, E}, where V is the set of vertices and E is the set of edges; the minimum spanning tree of graph G is obtained as an output. (Unlike Prim's algorithm, no source vertex is required.)

Algorithm

 Sort all the edges in the graph in an ascending order and store it in an array edge[].

 Construct the forest of the graph on a plane with all the vertices in it.

 Select the least cost edge from the edge[] array and add it into the forest of the graph. Mark
the vertices visited by adding them into the visited[] array.

 Repeat the steps 2 and 3 until all the vertices are visited without having any cycles forming in
the graph

 When all the vertices are visited, the minimum spanning tree is formed.

 Calculate the minimum cost of the output spanning tree formed.

Examples

Construct a minimum spanning tree using Kruskal's algorithm for the graph given below −

Solution

As the first step, sort all the edges in the given graph in an ascending order and store the values in an
array.

Edge: B→D, A→B, C→F, F→E, B→C, G→F, A→G, C→D, D→E, C→… (last edge truncated in the source)

Cost: 5, 6, 9, 10, 11, 12, 15, 17, 22, 25

Then, construct a forest of the given graph on a single plane.


From the list of sorted edge costs, select the least cost edge and add it onto the forest in output
graph.

B→D=5

Minimum cost = 5

Visited array, v = {B, D}

Similarly, the next least cost edge is B → A = 6; so we add it onto the output graph.

Minimum cost = 5 + 6 = 11

Visited array, v = {B, D, A}

The next least cost edge is C → F = 9; add it onto the output graph.

Minimum Cost = 5 + 6 + 9 = 20

Visited array, v = {B, D, A, C, F}


The next edge to be added onto the output graph is F → E = 10.

Minimum Cost = 5 + 6 + 9 + 10 = 30

Visited array, v = {B, D, A, C, F, E}

The next edge from the least cost array is B → C = 11, hence we add it in the output graph.

Minimum cost = 5 + 6 + 9 + 10 + 11 = 41

Visited array, v = {B, D, A, C, F, E}

The last edge from the least cost array to be added in the output graph is F → G = 12.

Minimum cost = 5 + 6 + 9 + 10 + 11 + 12 = 53

Visited array, v = {B, D, A, C, F, E, G}


The obtained result is the minimum spanning tree of the given graph with cost = 53.

Time Complexity Analysis

Let’s analyze the time complexity step-by-step:

📌 1. Sorting the Edges

 Sorting all edges takes O(E log E) time.

📌 2. Union-Find Operations

 We use Disjoint Set Union (DSU) to detect cycles.

 With path compression and union by rank, DSU operations are nearly constant time:

o O(α(V)) per operation, where α is the inverse Ackermann function, which grows
extremely slowly (considered constant in practice).

 For E edges, we perform makeSet, find, and union → O(E α(V))
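A minimal Python sketch of Kruskal's method with this DSU (path compression plus union by rank); the edge list is taken from the worked example above, omitting the edge whose name is truncated in the source:

class DSU:
    def __init__(self, vertices):
        self.parent = {v: v for v in vertices}
        self.rank = {v: 0 for v in vertices}

    def find(self, v):
        # Path compression: point nodes closer to the root as we walk up.
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]
            v = self.parent[v]
        return v

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False                 # already connected: would form a cycle
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra             # union by rank
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True

def kruskal(vertices, edges):
    dsu = DSU(vertices)
    cost = 0
    for w, u, v in sorted(edges):        # ascending order of edge cost
        if dsu.union(u, v):              # accept the edge unless it forms a cycle
            cost += w
    return cost

edges = [(5, "B", "D"), (6, "A", "B"), (9, "C", "F"), (10, "F", "E"),
         (11, "B", "C"), (12, "G", "F"), (15, "A", "G"), (17, "C", "D"),
         (22, "D", "E")]
print(kruskal("ABCDEFG", edges))         # prints 53, matching the example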

Applications of Kruskal’s Algorithm

Kruskal’s algorithm is highly useful in systems where connection costs matter, and cycle formation
must be avoided:

1. Network Design

 Cable TV networks, LANs, optical fiber layouts, etc.

 Ensures minimal wiring/cost without redundancy

2. Road Construction Projects

 Connecting cities with the minimum length of roads, while avoiding circular paths (no
duplicate paths)

3. Electrical Grid Design

 Layout of power lines that minimize copper use, while ensuring all stations are connected

4. Clustering in Data Science

 Hierarchical clustering algorithms use MST concepts to group data efficiently

5. Computer Vision

 Segmentation and image clustering applications can use MST for grouping pixels
6. Social Network Analysis

 Building friend recommendation or community detection trees by connecting users with minimum "distance" or "similarity" score

The knapsack problem states that, given a set of items holding weights and profit values, one must determine the subset of items to be added to a knapsack such that the total weight of the items does not exceed the limit of the knapsack and the total profit value is maximum.

It is one of the most popular problems solved with the greedy approach, in the variant called the Fractional Knapsack Problem.

To make this problem a little easier to explain, consider a test with 12 questions, 10 marks each, out of which only 10 should be attempted to get the maximum mark of 100. The test taker must now pick the most profitable questions, the ones he is confident in, to achieve the maximum mark. However, he cannot attempt all 12 questions, since no extra marks are awarded for the additional attempted answers. This is the most basic real-world application of the knapsack problem.

Knapsack Algorithm

The weights (Wi) and profit values (Pi) of the items to be added in the knapsack are taken as an input
for the fractional knapsack algorithm and the subset of the items added in the knapsack without
exceeding the limit and with maximum profit is achieved as the output.

Algorithm

 Consider all the items with their weights and profits mentioned respectively.

 Calculate Pi/Wi of all the items and sort the items in descending order based on their
Pi/Wi values.

 Without exceeding the limit, add the items into the knapsack.

 If the knapsack can still store some weight, but the weight of the next item exceeds the limit, the fractional part of that item can be added, hence the name fractional knapsack problem.

Examples

 For the given set of items and the knapsack capacity of 10 kg, find the subset of the items to
be added in the knapsack such that the profit is maximum.

Items 1 2 3 4 5

Weights (in kg) 3 3 2 5 1

Profits 10 15 10 20 8

Solution

Step 1

Given, n = 5
Wi = {3, 3, 2, 5, 1}

Pi = {10, 15, 10, 20, 8}

Calculate Pi/Wi for all the items

Items 1 2 3 4 5

Weights (in kg) 3 3 2 5 1

Profits 10 15 10 20 8

Pi/Wi 3.3 5 5 4 8

Step 2

Arrange all the items in descending order based on Pi/Wi

Items 5 2 3 4 1

Weights (in kg) 1 3 2 5 3

Profits 8 15 10 20 10

Pi/Wi 8 5 5 4 3.3

Step 3

Without exceeding the knapsack capacity, insert the items in the knapsack with maximum profit.

Knapsack = {5, 2, 3}

However, the knapsack can still hold 4 kg weight, but the next item having 5 kg weight will exceed
the capacity. Therefore, only 4 kg weight of the 5 kg will be added in the knapsack.

Items 5 2 3 4 1

Weights (in kg) 1 3 2 5 3

Profits 8 15 10 20 10

Knapsack 1 1 1 4/5 0

Hence, the knapsack holds the weights = [(1 * 1) + (1 * 3) + (1 * 2) + (4/5 * 5)] = 10, with maximum profit of [(1 * 8) + (1 * 15) + (1 * 10) + (4/5 * 20)] = 49.

🧮 Time Complexity Analysis


To analyze the time complexity, let’s break down the algorithm:

1. Calculate Value-to-Weight Ratio

 For every item, calculate value/weight.

 Time: O(n) for n items.

2. Sort Items by Ratio (Descending)

 Use a sorting algorithm like Merge Sort or Heap Sort.

 Time: O(n log n).

3. Pick Items for Knapsack

 Traverse the sorted list and add items to the knapsack until full.

 If the item can’t be fully added, add a fraction to fill the bag.

 Time: O(n).

✅ Total Time Complexity: O(n log n)

The sorting step dominates, so this is the final complexity.
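A minimal Python sketch of the three steps above; the item data passed in at the end is from the worked example earlier (weights, profits, and a 10 kg capacity):

def fractional_knapsack(weights, profits, capacity):
    # Steps 1 and 2: sort items by profit/weight ratio, highest first.
    items = sorted(zip(weights, profits),
                   key=lambda wp: wp[1] / wp[0], reverse=True)
    total = 0.0
    # Step 3: take whole items while they fit, then a fraction of the next one.
    for w, p in items:
        if capacity >= w:
            capacity -= w
            total += p
        else:
            total += p * capacity / w    # fractional part fills the bag
            break
    return total

print(fractional_knapsack([3, 3, 2, 5, 1], [10, 15, 10, 20, 8], 10))   # 49.0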

💼 Applications of Fractional Knapsack Problem

This greedy algorithm has several practical applications, especially in resource allocation and
logistics:

1. Cargo Loading & Shipping

When you have a truck with limited capacity, and you want to load items with maximum profit. If
some items are too heavy, you can take a portion of them, like grains or liquids.

2. Investment Portfolios

Suppose you have limited capital, and you want to invest in assets. The assets can be fractionally
invested in (e.g., stocks or bonds), and each provides a different return per unit. Fractional knapsack
helps in maximizing ROI.

3. Bandwidth Allocation in Networks

You want to allocate limited bandwidth among different users or services. Some services might be
more important (higher value), so a fractional approach allows you to split resources wisely.

4. Resource Distribution in Crisis Situations


During disasters, limited supplies like food, water, or medicine need to be distributed for maximum
impact. Fractional knapsack helps in deciding what portion of which resource to send to each
location.

5. Machine Scheduling

Jobs with profits can be scheduled on a limited-time machine, and partial jobs can be allowed. This
maximizes profit per time slot used.

The job scheduling algorithm is applied to schedule jobs on a single processor to maximize profit.

The greedy approach to job scheduling states that, given n jobs, each with a deadline and a profit, they need to be scheduled in such a way that maximum profit is received within the maximum deadline.

Job Scheduling Algorithm

A set of jobs with deadlines and profits is taken as the input to the job scheduling algorithm, and a scheduled subset of jobs with maximum profit is obtained as the final output.

Algorithm

Step 1 − Find the maximum deadline value from the input set of jobs.

Step 2 − Once the deadline is decided, arrange the jobs in descending order of their profits.

Step 3 − Select the jobs with the highest profits whose time periods do not exceed the maximum deadline.

Step 4 − The selected set of jobs is the output.

Examples

Consider the following tasks with their deadlines and profits. Schedule the tasks in such a way that
they produce maximum profit after being executed −

S. No. 1 2 3 4 5

Jobs J1 J2 J3 J4 J5

Deadlines 2 2 1 3 4

Profits 20 60 40 100 80

Step 1

Find the maximum deadline value, dm, from the deadlines given.
dm = 4.

Step 2

Arrange the jobs in descending order of their profits.

S. No. 1 2 3 4 5

Jobs J4 J5 J2 J3 J1

Deadlines 3 4 2 1 2

Profits 100 80 60 40 20

The maximum deadline, dm, is 4. Therefore, all the tasks must end before 4.

Choose the job with the highest profit, J4. It takes up 3 units of the maximum deadline.

Therefore, the next job must fit in the remaining time period of 1.

Total Profit = 100.

Step 3

The next job with the highest profit is J5. But the time taken by J5 is 4, which exceeds the remaining time by 3. Therefore, it cannot be added to the output set.

Step 4

The next job with the highest profit is J2. The time taken by J2 is 2, which also exceeds the remaining time by 1. Therefore, it cannot be added to the output set.

Step 5

The next job with the highest profit is J3. The time taken by J3 is 1, which does not exceed the given deadline. Therefore, J3 is added to the output set.

Total Profit: 100 + 40 = 140

Step 6

Since the maximum deadline is met, the algorithm comes to an end. The output set of jobs scheduled within the deadline is {J4, J3}, with the maximum profit of 140.

What is Job Sequencing with Deadlines?

The prime objective of the Job Sequencing with Deadlines algorithm is to complete the given order
of jobs within respective deadlines, resulting in the highest possible profit. To achieve this, we are
given a number of jobs, each associated with a specific deadline, and completing a job before its
deadline earns us a profit. The challenge is to arrange these jobs in a way that maximizes our total
profit.

It is not always possible to complete all of the assigned jobs within the deadlines. For each job,
denoted as Ji, we have a deadline di and a profit pi associated with completing it on time. Our
objective is to find the best solution that maximizes profit while still ensuring that the jobs are
completed within their deadlines.

Here’s how Job Sequencing with Deadlines algorithm works:

Problem Setup

You’re given a list of jobs, where each job has a unique identifier (job_id), a deadline (by which the
job should be completed), and a profit value (the benefit you gain by completing the job).

Sort the Jobs by Profit

To ensure we consider jobs with higher profits first, sort the jobs in non-increasing order based on
their profit values.

Initialize the Schedule and Available Time Slots

Set up an array to represent the schedule. Initialize all elements to -1, indicating that no job has been
assigned to any time slot. Also, create a boolean array to represent the availability of time slots, with
all elements set to true initially.

Assign Jobs to Time Slots

Go through the sorted jobs one by one. For each job, find the latest available time slot just before its
deadline. If such a time slot is available, assign the job to that slot. If not, skip the job.

Calculate Total Profit and Scheduled Jobs

Sum up the profits of all the scheduled jobs to get the total profit. Additionally, keep track of which
job is assigned to each time slot.

Output the Results

Finally, display the total profit achieved and the list of jobs that have been scheduled.

Job Sequencing with Deadlines algorithm

Given jobs J(i) with deadline D(i) and profit P(i) for 1 ≤ i ≤ n, these jobs are arranged in descending order of profit: p1 ⩾ p2 ⩾ p3 ⩾ … ⩾ pn.

Job-Sequencing-With-Deadline (D, J, n, k)
   D(0) := J(0) := 0
   k := 1
   J(1) := 1   // the first (most profitable) job is selected
   for i = 2 … n do
      r := k
      // walk left past scheduled jobs with later deadlines
      while D(J(r)) > D(i) and D(J(r)) ≠ r do
         r := r − 1
      if D(J(r)) ≤ D(i) and D(i) > r then
         // shift jobs right and insert job i after position r
         for l = k … r + 1 by -1 do
            J(l + 1) := J(l)
         J(r + 1) := i
         k := k + 1
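A minimal Python sketch of the slot-based greedy procedure described above, where each job occupies one unit of time (the usual model for this pseudocode; note that the earlier worked example instead treats a job's duration as equal to its deadline, so its answer differs):

def job_sequencing(jobs):
    # jobs: list of (job_id, deadline, profit), one time unit per job.
    jobs = sorted(jobs, key=lambda j: j[2], reverse=True)   # highest profit first
    max_deadline = max(d for _, d, _ in jobs)
    slot = [None] * (max_deadline + 1)   # slot[t] = job run in interval (t-1, t]
    total = 0
    for job_id, deadline, profit in jobs:
        # Find the latest free slot at or before the deadline.
        for t in range(deadline, 0, -1):
            if slot[t] is None:
                slot[t] = job_id
                total += profit
                break                    # no free slot found -> job is skipped
    return [j for j in slot if j is not None], total

jobs = [("J1", 2, 20), ("J2", 2, 60), ("J3", 1, 40), ("J4", 3, 100), ("J5", 4, 80)]
print(job_sequencing(jobs))              # (['J3', 'J2', 'J4', 'J5'], 280)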

Job Sequencing with Deadlines using Greedy method

Job sequencing with deadlines is often solved using a Greedy algorithm approach, where jobs are
selected based on their profitability and deadline constraints. The goal is to maximize the total profit
by scheduling jobs in the most optimal manner.

Here’s a clear breakdown of the applications and time complexity of the Job Sequencing with
Deadlines problem:

✅ Applications of Job Sequencing with Deadlines

This algorithm is used in any scenario where jobs must be scheduled under deadline constraints
with the goal of maximizing profit or efficiency.

🔹 1. Operating Systems

 Scheduling jobs in CPU to maximize throughput within deadline constraints.

 Useful in real-time systems where tasks must be completed before certain time limits.

🔹 2. Project Management

 Assign tasks to time slots in order to complete them profitably and on time.

 Especially useful in freelancing platforms where each task has a deadline and payout.

🔹 3. Manufacturing & Production

 Machines are scheduled to perform high-value tasks first within deadlines to maximize
revenue.

🔹 4. Cloud Resource Allocation

 Allocating cloud compute resources (VMs, containers) to high-priority jobs with deadlines to
increase cost-efficiency.

🔹 5. Advertisement Scheduling

 Ads with different profits and time constraints are selected to show during fixed slots (like
during a match).

🔹 6. Exam or Interview Slot Scheduling

 Assigning slots for interviews/tests to maximize participation or accommodate top candidates within limited time.

🕒 Time Complexity

🧱 Basic Greedy Approach:


Steps:

1. Sort jobs by profit:
👉 O(n log n)

2. For each job, search for a free slot from its deadline down to 1:
👉 Worst-case O(n) per job
👉 Total: O(n²)

🔻 Total Time Complexity = O(n²)

🚀 Optimized Version Using DSU (Disjoint Set Union):

With a DSU over time slots, where find(t) returns the latest free slot at or before t, each slot search takes near-constant amortized time, so the overall complexity drops to roughly O(n log n), dominated by the initial sort.

Dijkstra's shortest path algorithm is similar to Prim's algorithm in that both rely on finding a local optimum at each step to reach the global solution. However, unlike Prim's algorithm, Dijkstra's algorithm does not find a minimum spanning tree; it is designed to find the shortest paths in the graph from one vertex to all the remaining vertices. Dijkstra's algorithm can be performed on both directed and undirected graphs.

Since the shortest path can be calculated from a single source vertex to all the other vertices in the graph, Dijkstra's algorithm is also called the single-source shortest path algorithm. The output obtained is called a shortest path spanning tree.

In this chapter, we will learn about the greedy approach of Dijkstra's algorithm.

Dijkstra's Algorithm

Dijkstra's algorithm is designed to find the shortest path between two vertices of a graph. These two vertices could be adjacent or the farthest points in the graph. The algorithm starts from the source. The inputs taken by the algorithm are the graph G {V, E}, where V is the set of vertices and E is the set of edges, and the source vertex S. The output is the shortest path spanning tree.

Algorithm

 Declare two arrays − distance[] to store the distances from the source vertex to the other
vertices in graph and visited[] to store the visited vertices.

 Set distance[S] to 0 and distance[v] = ∞, where v represents all the other vertices in the
graph.

 Add S to the visited[] array and find the adjacent vertices of S with the minimum distance.

 The adjacent vertex to S, say A, has the minimum distance and is not in the visited array yet.
A is picked and added to the visited array and the distance of A is changed from ∞ to the
assigned distance of A, say d1, where d1 < ∞.

 Repeat the process for the adjacent vertices of the visited vertices until the shortest path
spanning tree is formed.
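A minimal Python sketch of these steps using heapq; the adjacency list at the bottom is reconstructed from the edges named in the example that follows, so it is illustrative:

import heapq

def dijkstra(graph, source):
    # graph: {vertex: [(weight, neighbour), ...]}
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)       # closest unvisited vertex
        if u in visited:
            continue
        visited.add(u)
        for w, v in graph[u]:
            if d + w < dist[v]:          # relax the edge u -> v
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

graph = {
    "S": [(6, "A"), (8, "D"), (7, "E")],
    "A": [(6, "S"), (9, "B")],
    "B": [(9, "A"), (12, "C")],
    "C": [(12, "B"), (5, "E"), (3, "D")],
    "D": [(8, "S"), (3, "C")],
    "E": [(7, "S"), (5, "C")],
}
print(dijkstra(graph, "S"))
# {'S': 0, 'A': 6, 'B': 15, 'C': 11, 'D': 8, 'E': 7}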

Examples

To understand Dijkstra's algorithm better, let us analyze it with the help of an example graph −
Step 1

Initialize the distances of all the vertices as ∞, except the source node S.

Vertex S A B C D E

Distance 0 ∞ ∞ ∞ ∞ ∞

Now that the source vertex S is visited, add it into the visited array.

visited = {S}

Step 2

The vertex S has three adjacent vertices with various distances and the vertex with minimum
distance among them all is A. Hence, A is visited and the dist[A] is changed from ∞ to 6.

S→A=6

S→D=8

S→E=7

Vertex S A B C D E

Distance 0 6 ∞ ∞ 8 7

Visited = {S, A}

Step 3

There are two vertices visited in the visited array, therefore, the adjacent vertices must be checked
for both the visited vertices.

Vertex S has two more adjacent vertices to be visited yet: D and E. Vertex A has one adjacent vertex
B.
Calculate the distances from S to D, E, B and select the minimum distance −

S → D = 8 and S → E = 7.

S → B = S → A + A → B = 6 + 9 = 15

Vertex S A B C D E

Distance 0 6 15 ∞ 8 7

Visited = {S, A, E}

Step 4

Calculate the distances to the unvisited adjacent vertices of all the visited vertices (S, A, E) and select the vertex with minimum distance.

S→D=8

S → B = 15

S → C = S → E + E → C = 7 + 5 = 12

Vertex S A B C D E

Distance 0 6 15 12 8 7

Visited = {S, A, E, D}

Step 5

Recalculate the distances of the unvisited vertices, and if a distance smaller than the existing one is found, replace the value in the distance array.

S → C = S → E + E → C = 7 + 5 = 12
S → C = S → D + D → C = 8 + 3 = 11

dist[C] = minimum (12, 11) = 11

S → B = S → A + A → B = 6 + 9 = 15

S → B = S → D + D → C + C → B = 8 + 3 + 12 = 23

dist[B] = minimum (15,23) = 15

Vertex S A B C D E

Distance 0 6 15 11 8 7

Visited = { S, A, E, D, C}

Step 6

The remaining unvisited vertex in the graph is B, with the minimum distance 15; it is added to the output spanning tree.

Visited = {S, A, E, D, C, B}

The shortest path spanning tree is obtained as an output using Dijkstra's algorithm.

Let's analyze the time complexity of Dijkstra's Algorithm in detail based on different
implementations and data structures used.

🧮 Time Complexity of Dijkstra's Algorithm

Let:

 V = number of vertices (nodes)

 E = number of edges
✅ Case 1: Using an Adjacency Matrix + Linear Search for Min Distance

 Each time, you search all vertices to find the unvisited vertex with the smallest distance →
O(V)

 You do this for all V vertices → O(V²)

 Edge relaxation for each neighbor is constant per vertex

🔹 Time Complexity: O(V²)

✅ Suitable for dense graphs (when E ≈ V²)

✅ Case 2: Using Min-Heap (Priority Queue) + Adjacency List

 Insertion and extraction in the priority queue takes O(log V)

 You do this for all vertices → O(V log V)

 For each edge, you may update the priority queue → O(E log V)

🔹 Total Time Complexity: O((V + E) log V)

✅ Efficient for sparse graphs where E is much less than V²

✅ Case 3: Using Fibonacci Heap + Adjacency List (Advanced)

 Decrease-key operation: O(1) (amortized)

 Extract-min: O(log V)

 Edge relaxations: O(E)

 Vertex processing: O(V log V)

🔹 Time Complexity: O(E + V log V)

✅ This is the theoretically optimal time complexity but rarely used in practice due to high constant
overhead.

📊 Summary Table

Implementation Method Time Complexity Suitable for

Adjacency Matrix + Linear Search O(V²) Dense graphs

Adjacency List + Binary Heap (Min-Heap) O((V + E) log V) Sparse graphs (most common)

Adjacency List + Fibonacci Heap O(E + V log V) Theoretical optimization


🧠 Best, Worst, and Average Cases

 Best Case: Even in best case, Dijkstra's algorithm processes all vertices once. So, complexity
depends on the graph representation and remains O(V²) or O((V + E) log V)

 Worst Case: Worst case happens in dense graphs with E ≈ V², leading to:

o With min-heap: O(V² log V)

o With matrix: O(V²)

Here are detailed applications of Dijkstra's Algorithm, each explained in simple terms and with clear
examples. Dijkstra’s algorithm is a shortest path algorithm used to find the minimum distance
between a source node and all other nodes in a graph. It works only with non-negative edge
weights.

🌐 1. GPS Navigation and Route Planning

Dijkstra’s algorithm is widely used in Google Maps, GPS devices, and online travel planners. It helps
in finding the shortest or fastest route between two locations on a map.

 How it works: The road network is modeled as a graph, with intersections as nodes and
roads as weighted edges (distance or time). Dijkstra’s algorithm finds the most efficient route
from your current location to your destination.

 Example: You want to drive from Delhi to Jaipur. Dijkstra’s algorithm calculates the shortest
path based on road lengths or estimated driving time.

📡 2. Network Routing Protocols

Dijkstra's algorithm is used in network routing protocols like OSPF (Open Shortest Path First) to
determine the most efficient route for data packets.

 How it works: Routers use it to compute the shortest path to all other routers in the
network. This helps in optimizing bandwidth usage and reducing latency.

 Example: In the internet, when you send a message from your phone to a server, the routers
use Dijkstra’s algorithm to ensure your message follows the fastest available path.

🏭 3. Logistics and Supply Chain Optimization

Companies use Dijkstra’s algorithm to optimize product delivery, transport routes, and supply
chains.

 How it works: Warehouses and stores are treated as nodes, and transportation costs as edge
weights. Dijkstra’s algorithm finds the cheapest or quickest way to move goods.

 Example: Amazon or Flipkart uses this logic to figure out the fastest way to deliver your order
from the warehouse to your house.
🚊 4. Railway and Flight Scheduling

Dijkstra’s algorithm helps in determining the shortest travel path, including transfers and
connections, in public transportation networks like trains, metros, and airlines.

 How it works: Each station or airport is a node; each direct travel option is an edge with a
time or cost weight.

 Example: You want to go from Mumbai to Kolkata by train with minimum travel time.
Dijkstra helps determine the best possible route with minimum transfers.

💻 5. Operating Systems – CPU Scheduling

While not its main use, a variation of Dijkstra's algorithm can be used in shortest job next scheduling
algorithms to select the next process to execute, minimizing average wait time.

🎮 6. Game Development (Pathfinding for Characters)

Games with large maps use Dijkstra’s algorithm (or A* which is an optimized version) for AI
characters to find paths to targets or goals.

 How it works: The game map is a grid of nodes, and the characters find the shortest
walkable path to chase, escape, or explore.

 Example: In a maze-solving robot simulation or in games like Age of Empires, characters use
shortest path logic to move efficiently.

🏢 7. Urban Planning and Traffic Control

Urban planners use Dijkstra’s algorithm to simulate traffic flow, plan infrastructure, and optimize
public transportation.

 How it works: Cities are modeled as graphs, and the algorithm helps predict where to build
roads, bus routes, or reduce congestion.

 Example: Which new bus route from Sector 20 to Sector 80 in a city will reduce travel time
for most users?

🧠 8. Social Network Analysis

Used to find shortest connections between users in social networks (like Facebook or LinkedIn), to
measure closeness, reach, or influence.

 How it works: Each person is a node; a connection is an edge. Shortest path = fewest
connections to reach someone.

 Example: "You are 3 connections away from Elon Musk" – this type of result is based on
shortest path logic.
✈️ 9. Flight Fare Comparison Tools

Travel websites use Dijkstra’s algorithm to show the cheapest or fastest flights between cities.

 How it works: Cities are nodes, and flights are edges with weights based on fare or duration.

 Example: Find the cheapest route from Delhi to New York with 1 or 2 stopovers.

🧾 Summary Table

Application Area Use of Dijkstra's Algorithm

GPS & Maps Shortest/fastest route between locations

Network Routing Efficient data path in routers (OSPF protocol)

Logistics Optimal delivery routes and cost-effective shipping

Public Transport Minimum time routes with connections in trains, buses, flights

Operating Systems Task scheduling based on shortest next process (variation used)

Gaming AI Pathfinding for character movement

Urban Planning Traffic control and public service optimization

Social Networks Finding closest social connections

Travel Booking Sites Cheapest or shortest flights with connections
