CSA Lab 10

The document describes Huffman coding, a lossless data compression algorithm that uses variable-length codes. It assigns shorter codes to more frequent characters and longer codes to less frequent characters. The algorithm builds a Huffman tree from character frequencies and assigns codes by traversing the tree. An exercise asks the student to build a Huffman tree from sample character frequencies, print the codes, and calculate the number of bits required to encode a sample string.

Uploaded by

vol dam
Computer System Algorithm (MCS-205) SUET/QR/114

LAB # 10

HUFFMAN CODES

OBJECTIVE

To implement Huffman codes, based on a greedy algorithm, to encode data with a prefix code.

THEORY

HUFFMAN CODES:

Huffman coding is a well-known greedy algorithm for lossless data compression. It uses variable-length
encoding: each character is assigned a code whose length depends on how frequently the character occurs
in the given text. The most frequent character gets the shortest code and the least frequent character
gets the longest code. The technique is also known as Huffman encoding.

Prefix Rule:

Huffman coding follows a rule known as the prefix rule, which prevents ambiguity while decoding.
It ensures that the code assigned to any character is not a prefix of the code assigned to any other character.
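
The prefix rule can be checked mechanically. The following is a minimal sketch, not part of the original lab: the helper names is_prefix_free and decode are hypothetical, and the code table is the one derived for 'mississippi' in Exercise B. It shows that a prefix-free table lets a bit string be decoded left to right without any separators.

```python
def is_prefix_free(codes):
    """Return True if no code word is a prefix of another code word."""
    words = sorted(codes.values())
    # In sorted order, a word that is a prefix of any other word is also
    # a prefix of its immediate successor, so checking neighbours suffices.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


def decode(bits, codes):
    """Decode a bit string with a prefix-free code table."""
    inverse = {code: symbol for symbol, code in codes.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        # Because the table is prefix-free, the first match is the only match.
        if current in inverse:
            decoded.append(inverse[current])
            current = ""
    return "".join(decoded)


# Code table taken from Exercise B of this lab.
codes = {"m": "100", "p": "101", "s": "11", "i": "0"}
encoded = "".join(codes[ch] for ch in "mississippi")
print(is_prefix_free(codes))
print(decode(encoded, codes))
```

A table such as {'a': '0', 'b': '01'} fails the check, because after reading '0' a decoder cannot tell whether it has seen 'a' or the start of 'b'.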

Major Steps in Huffman Coding:

There are two major steps in Huffman coding:

• Building a Huffman Tree from the input characters.
• Assigning code to the characters by traversing the Huffman Tree.

Steps to build Huffman Tree:

The input is an array of unique characters along with their frequencies of occurrence; the output is the Huffman tree.

1. Create a leaf node for each unique character and build a min heap of all leaf nodes. (The min heap is used as a
priority queue. The value of the frequency field is used to compare two nodes in the min heap. Initially, the least
frequent character is at the root.)

2. Extract the two nodes with the minimum frequency from the min heap.

3. Create a new internal node with a frequency equal to the sum of the two nodes' frequencies. Make the first
extracted node its left child and the other extracted node its right child. Add this node to the min
heap.

4. Repeat steps 2 and 3 until the heap contains only one node. The remaining node is the root node and
the tree is complete.
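
The four steps above can be sketched with Python's standard heapq module. This is an illustrative sketch under stated assumptions, not the lab's reference solution: the names build_huffman_tree and assign_codes are hypothetical, a leaf is represented by its character, an internal node by a (left, right) tuple, and an insertion counter is added to each heap entry so that equal frequencies are resolved in favour of the node created first. At least one symbol is assumed.

```python
import heapq
from itertools import count


def build_huffman_tree(freqs):
    """Build a Huffman tree from a {symbol: frequency} mapping."""
    order = count()  # tie-breaker: earlier nodes win on equal frequency
    # Step 1: one leaf per character, organised as a min heap.
    heap = [(freq, next(order), symbol) for symbol, freq in freqs.items()]
    heapq.heapify(heap)
    # Step 4: repeat until a single node (the root) remains.
    while len(heap) > 1:
        # Step 2: extract the two nodes with minimum frequency.
        freq_left, _, left = heapq.heappop(heap)
        freq_right, _, right = heapq.heappop(heap)
        # Step 3: new internal node; first extracted node is the left child.
        heapq.heappush(heap, (freq_left + freq_right, next(order), (left, right)))
    return heap[0][2]


def assign_codes(tree, prefix=""):
    """Walk the tree in preorder: '0' for a left edge, '1' for a right edge."""
    if isinstance(tree, str):  # leaf
        return {tree: prefix or "0"}  # a lone symbol still needs one bit
    left, right = tree
    codes = assign_codes(left, prefix + "0")
    codes.update(assign_codes(right, prefix + "1"))
    return codes


freqs = {"a": 5, "b": 9, "c": 12, "d": 13, "e": 16, "f": 45}
codes = assign_codes(build_huffman_tree(freqs))
for symbol, code in codes.items():
    print(f"{symbol} -> {code}")
```

The unique counter in each heap entry also guarantees that Python never has to compare two tree nodes directly, which would otherwise fail when a string meets a tuple.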

Lab 10: Huffman Codes


Name: Tanzeel Ur Rehman        Roll no: BMCS22S-002

Time Complexity:
The time complexity of Huffman coding is analysed as follows. If there are n leaf nodes, steps 2 and 3
are repeated n-1 times, so extractMin( ) is called 2 x (n-1) times.
Each call to extractMin( ) invokes minHeapify( ), which takes O(log n) time; each of the n-1 insertions
of a new internal node also takes O(log n) time.
Thus, the overall time complexity of Huffman coding is O(n log n). Here, n is the number of
unique characters in the given text.
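
The call count used in this analysis can be confirmed empirically. The sketch below is an illustration only; count_extract_min_calls is a hypothetical helper that runs the merge loop on bare frequencies with heapq and counts how often the minimum is extracted.

```python
import heapq


def count_extract_min_calls(freqs):
    """Run the Huffman merge loop and count extract-min operations."""
    heap = list(freqs)
    heapq.heapify(heap)
    calls = 0
    while len(heap) > 1:
        first = heapq.heappop(heap)   # extract-min, O(log n)
        second = heapq.heappop(heap)  # extract-min, O(log n)
        calls += 2
        heapq.heappush(heap, first + second)  # insert, O(log n)
    return calls


n_calls = count_extract_min_calls([5, 9, 12, 13, 16, 45])
print(n_calls)  # 2 x (6 - 1) extractions for six characters
```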

EXERCISE

A. Given a string S of N distinct characters and their corresponding frequencies f[], i.e.
character S[i] has frequency f[i], build the Huffman tree and print all the Huffman
codes in a preorder traversal of the tree.
NOTE: If two elements have the same frequency, the element that occurs first is placed on
the left of the binary tree and the other one on the right.

Source Code:
# A Huffman Tree Node
class node:
    def __init__(self, freq, symbol, left=None, right=None):
        # frequency of symbol
        self.freq = freq

        # symbol name (character)
        self.symbol = symbol

        # node left of current node
        self.left = left

        # node right of current node
        self.right = right

        # tree direction (0/1)
        self.huff = ''

# utility function to print huffman codes for all symbols in the newly created Huffman tree
def printNodes(root, val=''):
    # huffman code for current node
    newVal = val + str(root.huff)

    # if node is not a leaf node then traverse inside it (left subtree first)
    if(root.left):
        printNodes(root.left, newVal)
    if(root.right):
        printNodes(root.right, newVal)

    # if node is a leaf node then display its huffman code
    if(not root.left and not root.right):
        print(f"{root.symbol} -> {newVal}")

# characters for huffman tree
chars = ['a', 'b', 'c', 'd', 'e', 'f']

# frequency of characters
freq = [ 5, 9, 12, 13, 16, 45]

# list containing unused nodes
nodes = []

# converting characters and frequencies into huffman tree nodes
for x in range(len(chars)):
    nodes.append(node(freq[x], chars[x]))

while len(nodes) > 1:
    # sort all the nodes in ascending order based on their frequency
    nodes = sorted(nodes, key=lambda x: x.freq)

    # pick 2 smallest nodes
    left = nodes[0]
    right = nodes[1]

    # assign directional value to these nodes
    left.huff = 0
    right.huff = 1

    # combine the 2 smallest nodes to create new node as their parent
    newNode = node(left.freq+right.freq, left.symbol+right.symbol, left, right)

    # remove the 2 nodes and add their parent as new node among others
    nodes.remove(left)
    nodes.remove(right)
    nodes.append(newNode)

# Huffman Tree is ready!
printNodes(nodes[0])

Output:

f -> 0
c -> 100
d -> 101
a -> 1100
b -> 1101
e -> 111

B. How many bits may be required for encoding the message ‘mississippi’?

Solution:

Following is the frequency table of characters in ‘mississippi’ in non-decreasing order of frequency:

Character   Frequency
m           1
p           2
s           4
i           4

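The generated Huffman tree is (internal nodes show the summed frequency, leaves show character:frequency):

              (11)
            0 /  \ 1
          [i:4]  (7)
               0 /  \ 1
              (3)  [s:4]
            0 /  \ 1
          [m:1]  [p:2]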

Character   Frequency   Code   Code Length
m           1           100    3
p           2           101    3
s           4           11     2
i           4           0      1

Total number of bits
= freq(m) * code_length(m) + freq(p) * code_length(p) + freq(s) * code_length(s) + freq(i) * code_length(i)
= 1*3 + 2*3 + 4*2 + 4*1 = 21

Also, the average number of bits per character can be found as:
Total number of bits required / total number of characters = 21/11 = 1.909
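
The total can be cross-checked without building the tree explicitly. The sketch below rests on one observation: every merge adds one bit to the code of each character underneath the new internal node, so the total number of bits equals the sum of the internal-node frequencies (3 + 7 + 11 for 'mississippi'). The helper name huffman_total_bits is hypothetical, and the message is assumed to contain at least two distinct characters.

```python
import heapq
from collections import Counter


def huffman_total_bits(message):
    """Total encoded length in bits, computed as the sum of merged frequencies."""
    heap = list(Counter(message).values())
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        # Merge the two least frequent nodes; the merged frequency is the
        # number of characters that gain one bit from this merge.
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total


message = "mississippi"
total_bits = huffman_total_bits(message)
print(total_bits)
print(round(total_bits / len(message), 3))
```

Since every Huffman tree for a given frequency table has the same total cost, the result does not depend on how ties between equal frequencies are broken.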
