Design and Analysis of Algorithms (COM336) : Huffman Coding

Huffman coding is a lossless data compression algorithm that assigns variable-length codes to characters based on their frequency, with more common characters getting shorter codes. It uses prefix codes where no code is a prefix of another to avoid ambiguity. The project involves implementing Huffman coding to compress a text file by building a Huffman tree from character frequencies, encoding the file using the tree, and decoding and uncompressing the file.

Uploaded by

ff

Design and Analysis of Algorithms (COM336)

Second Semester 2019/2020


Project # 2
Huffman Coding
Huffman coding is a lossless data compression algorithm. The idea is to assign
variable-length codes to input characters, where the length of each code depends on the
frequency of the corresponding character: the most frequent character gets one of the
shortest codes and the least frequent character gets one of the longest codes.
The variable-length codes assigned to input characters are prefix codes, meaning that
the codes (bit sequences) are assigned in such a way that the code assigned to one
character is never a prefix of the code assigned to any other character. This is how
Huffman coding ensures that there is no ambiguity when decoding the generated bit stream.
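
To make the prefix property concrete, here is a minimal Python sketch. The code table PREFIX_CODE and the helper names is_prefix_free and decode_with_table are assumptions made up for illustration; they are not part of the assignment. The sketch checks that a table is prefix-free and then decodes a bit string by emitting a character as soon as the bits read so far match a code.

```python
# Hypothetical code table for illustration; not taken from the assignment.
PREFIX_CODE = {"a": "0", "b": "10", "c": "110", "d": "111"}


def is_prefix_free(table):
    """Return True if no code in the table is a prefix of another code."""
    codes = sorted(table.values())
    # In sorted order, a code that is a prefix of some other code is always
    # immediately followed by a code that starts with it, so checking
    # adjacent pairs is sufficient.
    return not any(b.startswith(a) for a, b in zip(codes, codes[1:]))


def decode_with_table(bits, table):
    """Decode a string of '0'/'1' characters using a prefix-free table."""
    reverse = {code: ch for ch, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        # Because the table is prefix-free, the first match is the only match.
        if current in reverse:
            out.append(reverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(out)


encoded = "".join(PREFIX_CODE[ch] for ch in "badcab")
decoded = decode_with_table(encoded, PREFIX_CODE)
```

With a table such as {"a": "0", "b": "01"}, the bits "01" could be read as "b" or as "a" followed by the start of another code, which is exactly the ambiguity that prefix codes rule out.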

In this project, you will use a priority queue and a binary tree of your own design to
implement a file compression/uncompression algorithm called "Huffman Coding".
Your program will read a text file and compress it using your implementation of the
Huffman coding algorithm described in the explanation. The compressed text will be
written to a file. That compressed file will then be read back by your program and
uncompressed. The uncompressed text will then be written to a third file. The
uncompressed text file must match the original text file exactly.
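
One practical detail when writing the compressed file is that files hold whole bytes, while the encoded output is a bit sequence whose length is rarely a multiple of 8. The following Python sketch shows one possible way to handle this, assuming the encoded text is available as a string of '0' and '1' characters. The function names pack_bits and unpack_bits and the one-byte padding header are assumptions chosen for illustration; the assignment does not prescribe a file layout.

```python
def pack_bits(bits):
    """Pack a string of '0'/'1' characters into bytes.

    Assumed layout: the first byte stores how many padding bits (0-7)
    were appended to fill the final byte.
    """
    padding = (8 - len(bits) % 8) % 8
    padded = bits + "0" * padding
    body = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return bytes([padding]) + body


def unpack_bits(data):
    """Reverse pack_bits: recover the original bit string from bytes."""
    padding, body = data[0], data[1:]
    bits = "".join(f"{byte:08b}" for byte in body)
    # Drop the padding bits that were added when packing.
    return bits[:len(bits) - padding]
```

The resulting bytes can be written with open(path, "wb") and read back with open(path, "rb"). A standalone decoder would also need the code table or the character frequencies stored alongside the bits; how to store them is left open here.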

Summary of Processing
• Read the specified file and count the frequency of all characters in the file.
• Create the Huffman coding tree based on the frequencies.
• Create the table of encodings for each character from the Huffman coding tree.
• Encode the file and output the encoded/compressed file.
• Read the encoded/compressed file you just created, decode it and output the
decoded file.
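
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference solution: it works on in-memory strings instead of files, represents the encoded output as a string of '0' and '1' characters, and uses the standard heapq module as a stand-in for the priority queue you are expected to design yourself. The names Node, build_tree, build_codes, encode and decode are hypothetical.

```python
import heapq
from collections import Counter


class Node:
    """Tree node: leaves carry a character, internal nodes carry two children."""

    def __init__(self, freq, char=None, left=None, right=None):
        self.freq, self.char, self.left, self.right = freq, char, left, right


def build_tree(text):
    """Count character frequencies and build the Huffman tree."""
    freqs = Counter(text)
    # Each heap entry is (frequency, unique counter, node). The counter breaks
    # frequency ties so that Node objects are never compared directly.
    heap = [(f, i, Node(f, ch)) for i, (ch, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Repeatedly merge the two least frequent subtrees.
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, Node(f1 + f2, left=a, right=b)))
        counter += 1
    return heap[0][2] if heap else None


def build_codes(root):
    """Walk the tree: a left edge appends '0', a right edge appends '1'."""
    codes = {}
    if root is None:
        return codes
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        if node.char is not None:
            # A one-character alphabet still needs a code of at least one bit.
            codes[node.char] = prefix or "0"
        else:
            stack.append((node.left, prefix + "0"))
            stack.append((node.right, prefix + "1"))
    return codes


def encode(text, codes):
    """Replace every character with its code."""
    return "".join(codes[ch] for ch in text)


def decode(bits, root):
    """Follow the bits down the tree, emitting a character at each leaf."""
    if root is None:
        return ""
    if root.char is not None:  # single-symbol tree: each bit is one character
        return root.char * len(bits)
    out, node = [], root
    for bit in bits:
        node = node.left if bit == "0" else node.right
        if node.char is not None:
            out.append(node.char)
            node = root
    return "".join(out)


text = "abracadabra"
root = build_tree(text)
codes = build_codes(root)
bits = encode(text, codes)
restored = decode(bits, root)
```

In this sketch the decoder reuses the tree that is still in memory. The exact codes depend on how ties between equal frequencies are broken, but the total encoded length is the same for any valid Huffman tree.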
