3rd Sem Project Report

This document summarizes a project report on implementing Huffman coding. Huffman coding is a data compression algorithm that uses variable-length codewords to encode source symbols. The project aims to allow a user to enter symbol frequencies and display the corresponding Huffman codes. It was developed using Turbo C++ on Windows and involves sorting symbols by frequency, creating Huffman trees, and assigning codes. The project has advantages like a user-friendly interface and increased storage capacity.

Uploaded by

Jigar Chheda
Copyright
© Attribution Non-Commercial (BY-NC)

HUFFMAN CODING

A PROJECT REPORT
Submitted By
JIGAR CHHEDA & MANJINDER SINGH BHATIA

In partial fulfillment of the requirements for the award of the degree of

MASTER OF COMPUTER APPLICATIONS

S.Y. MCA (III Sem)

MAEER’S ARTS, COMMERCE AND SCIENCE COLLEGE, PUNE-38

2010-2011

Acknowledgement
Some feelings are beyond words and can only be felt by
the heart. Here we try to string our feelings into a few
pearls of words, though they mean far more to us than
these simple words can say. We express our thanks to
everyone who supported us, directly or indirectly, in
accomplishing this project. We are deeply indebted to
all those who were associated with the successful
completion of the “S.Y. MCA (Sci) Project”.
We are also thankful to Mr. S.M. Mali, HOD of our IT
department. Special thanks to our project guide and
subject teacher, Ms. Ujawala.

Prepared By:
Jigar Chheda
Manjinder Singh Bhatia

PLATFORM USED
Turbo C++: Turbo C++ was a C++ compiler and integrated
development environment (IDE) originally from Borland. Most
recently it was distributed by Embarcadero Technologies, which
acquired all of Borland's compiler tools with the purchase of its
CodeGear division in 2008. The original Turbo C++ product line
was put on hold after 1994, and was revived in 2006 as an
introductory-level IDE, essentially a stripped-down version of their
flagship C++Builder. Turbo C++ 2006 was released on September
5, 2006 and was available in 'Explorer' and 'Professional' editions.
The Explorer edition was free to download and distribute while the
Professional edition was a commercial product. In October 2009
Embarcadero Technologies discontinued support of its 2006 C++
editions. As such, the Explorer edition is no longer available for
download and the Professional edition is no longer available for
purchase from Embarcadero Technologies. Turbo C++ is
succeeded by C++Builder.

C++ (programming language): C++ is a statically typed,
free-form, multi-paradigm, compiled, general-purpose
programming language. It is regarded as a "middle-level" language,
as it comprises a combination of both high-level and low-level
language features. It was developed by Bjarne Stroustrup starting
in 1979 at Bell Labs as an enhancement to the C language and
originally named C with Classes. It was renamed C++ in 1983.

As one of the most popular programming languages ever created,
C++ is widely used in the software industry. Some of its
application domains include systems software, application
software, device drivers, embedded software, high-performance
server and client applications, and entertainment software such as
video games. Several groups provide both free and proprietary C++
compiler software, including the GNU Project, Microsoft, Intel and
Borland. C++ has greatly influenced many other popular
programming languages, most notably C# and Java.

C++ is also used for hardware design, where design is initially
described in C++, then analyzed, architecturally constrained, and
scheduled to create a register transfer level hardware description
language via high-level synthesis.

The language began as enhancements to C, first adding classes,
then virtual functions, operator overloading, multiple inheritance,
templates, and exception handling among other features. After
years of development, the C++ programming language standard
was ratified in 1998 as ISO/IEC 14882:1998. That standard is still
current, but is amended by the 2003 technical corrigendum,
ISO/IEC 14882:2003. The next standard version (known
informally as C++0x) is in development.

INTRODUCTION
“To design the system that will allow the user to enter the total
number of characters with their frequencies at the terminal and
then display the Huffman codes on the terminal in an interactive
manner.”

The main aim of the feasibility study is to determine whether it
would be financially and technically feasible to develop the
product. After thoroughly analyzing the problem definition and the
Huffman coding algorithm from various standard books on
information theory and from the internet, various strategies for
solving the problem were considered, and finally the algorithm
based on a priority queue (implemented as a singly linked list)
was chosen.
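
A priority queue kept as a singly linked list can be sketched as follows. This is only an illustration of the idea in standard C++11, not the report's own code: the names QNode, pq_insert and pq_pop_min are hypothetical, and the list is assumed to be kept sorted by ascending frequency so that the minimum is always at the head.

```cpp
#include <cstddef>

// Hypothetical queue entry for this sketch: one node per symbol.
struct QNode {
    char symbol;
    int freq;
    QNode* next;
};

// Insert a new entry so that the list stays sorted by ascending frequency.
// Returns the (possibly new) head of the list.
QNode* pq_insert(QNode* head, char symbol, int freq) {
    QNode* node = new QNode{symbol, freq, nullptr};
    if (head == nullptr || freq < head->freq) {
        node->next = head;          // new minimum goes to the front
        return node;
    }
    QNode* cur = head;
    while (cur->next != nullptr && cur->next->freq <= freq) {
        cur = cur->next;            // walk to the last entry not larger than freq
    }
    node->next = cur->next;
    cur->next = node;
    return head;
}

// Detach and return the lowest-frequency entry (the head), or nullptr if
// the queue is empty. The caller owns the returned node.
QNode* pq_pop_min(QNode*& head) {
    QNode* front = head;
    if (head != nullptr) {
        head = head->next;
        front->next = nullptr;
    }
    return front;
}
```

With this layout, removing the minimum takes constant time and insertion takes time proportional to the list length, which is acceptable for the small symbol counts considered here.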

Objective Of The Project
In computer science and information theory, Huffman coding is an
entropy encoding algorithm used for lossless data compression.
The term refers to the use of a variable-length code table for
encoding a source symbol (such as a character in a file) where the
variable-length code table has been derived in a particular way
based on the estimated probability of occurrence for each possible
value of the source symbol.
Huffman coding uses a specific method for choosing the
representation for each symbol, resulting in a prefix code
(sometimes called "prefix-free codes", that is, the bit string
representing some particular symbol is never a prefix of the bit
string representing any other symbol) that expresses the most
common source symbols using shorter strings of bits than are
used for less common source symbols. Huffman was able to design
the most efficient compression method of this type: no other
mapping of individual source symbols to unique strings of bits will
produce a smaller average output size when the actual symbol
frequencies agree with those used to create the code. A method
was later found to design a Huffman code in linear time if input
probabilities (also known as weights) are sorted.
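
The construction described above can be sketched in standard C++ as shown below. This is a minimal illustration under stated assumptions, not the project's listing: it uses std::priority_queue instead of a hand-written queue, the names HNode and huffman_codes are hypothetical, and the bit convention (0 for the left branch, 1 for the right) is chosen arbitrarily. A single-symbol input is given the code "0" by convention.

```cpp
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tree node: children are indices into a vector, -1 for a leaf.
struct HNode {
    int freq;
    char symbol;   // meaningful only for leaves
    int left;
    int right;
};

// Build a Huffman code table from symbol frequencies.
std::map<char, std::string> huffman_codes(const std::map<char, int>& freqs) {
    typedef std::pair<int, int> Entry;  // (frequency, node index)
    std::vector<HNode> nodes;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry> > heap;

    // One leaf per symbol, all placed in the min-heap.
    for (std::map<char, int>::const_iterator it = freqs.begin();
         it != freqs.end(); ++it) {
        HNode leaf = {it->second, it->first, -1, -1};
        nodes.push_back(leaf);
        heap.push(Entry(it->second, static_cast<int>(nodes.size()) - 1));
    }

    std::map<char, std::string> codes;
    if (nodes.empty()) return codes;
    if (nodes.size() == 1) {            // nothing to merge: fixed one-bit code
        codes[nodes[0].symbol] = "0";
        return codes;
    }

    // Repeatedly merge the two least frequent nodes into a new parent.
    while (heap.size() > 1) {
        Entry a = heap.top(); heap.pop();
        Entry b = heap.top(); heap.pop();
        HNode parent = {a.first + b.first, '\0', a.second, b.second};
        nodes.push_back(parent);
        heap.push(Entry(parent.freq, static_cast<int>(nodes.size()) - 1));
    }

    // Walk the tree from the root, appending one bit per branch taken.
    std::vector<std::pair<int, std::string> > pending;
    pending.push_back(std::make_pair(heap.top().second, std::string()));
    while (!pending.empty()) {
        std::pair<int, std::string> cur = pending.back();
        pending.pop_back();
        const HNode& n = nodes[cur.first];
        if (n.left < 0) {
            codes[n.symbol] = cur.second;
        } else {
            pending.push_back(std::make_pair(n.left, cur.second + "0"));
            pending.push_back(std::make_pair(n.right, cur.second + "1"));
        }
    }
    return codes;
}
```

Because every symbol sits at a leaf, no code in the returned table is a prefix of another, and more frequent symbols end up closer to the root and so receive shorter codes.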

Hardware Requirements

Memory: 16 MB or better.
Processor: Pentium IV or above.
Disk space required: 25 MB.
Keyboard: Standard keyboard.
Mouse: Standard mouse.
Monitor: SVGA monitor.

Software Requirements

Operating System: Windows 98, 2000 or above.
Platform: Turbo C++, Borland C++ or Visual C++.
Development Language: C++.

PROPOSED SYSTEM: If the user enters only one symbol, the
algorithm has nothing to merge and cannot produce a code. This
case is handled without running the algorithm: by convention, the
code 1 or 0 is assigned to the single symbol. Similarly, the number
of symbols cannot be greater than 94.
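
These two input rules could be checked before any tree is built. The sketch below is only an assumption about how such a check might look; the names kMaxSymbols, SymbolCountCheck and check_symbol_count are hypothetical, and only the limit of 94 is taken from the text above.

```cpp
// Limit stated in the report; all names in this sketch are hypothetical.
const int kMaxSymbols = 94;

enum SymbolCountCheck {
    kRejected,      // no symbols, or more than kMaxSymbols
    kSingleSymbol,  // skip the algorithm and assign a fixed one-bit code
    kBuildTree      // run the normal Huffman construction
};

// Decide how an input with n distinct symbols should be handled.
SymbolCountCheck check_symbol_count(int n) {
    if (n < 1 || n > kMaxSymbols) return kRejected;
    if (n == 1) return kSingleSymbol;
    return kBuildTree;
}
```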

Our proposed system has several ADVANTAGES:

• User-friendly interface
• Fewer errors
• More storage capacity
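
The storage advantage can be made concrete by comparing the number of bits a variable-length code needs with the number a fixed-length code needs. The following sketch assumes the code lengths are already known; the function names total_bits and saving are hypothetical, and any frequencies fed to them are illustrative rather than measurements from this project.

```cpp
#include <cstddef>
#include <vector>

// Total bits needed when symbol i occurs freq[i] times and is encoded
// with code_len[i] bits.
long total_bits(const std::vector<int>& freq, const std::vector<int>& code_len) {
    long bits = 0;
    for (std::size_t i = 0; i < freq.size() && i < code_len.size(); ++i) {
        bits += static_cast<long>(freq[i]) * code_len[i];
    }
    return bits;
}

// Fraction of storage saved relative to a fixed-length code of fixed_len
// bits per symbol (for example 8 for plain one-byte characters).
double saving(const std::vector<int>& freq, const std::vector<int>& code_len,
              int fixed_len) {
    long count = 0;
    for (std::size_t i = 0; i < freq.size(); ++i) {
        count += freq[i];
    }
    if (count == 0 || fixed_len <= 0) return 0.0;
    double fixed_bits = static_cast<double>(count) * fixed_len;
    return 1.0 - static_cast<double>(total_bits(freq, code_len)) / fixed_bits;
}
```

For example, six symbols with frequencies 45, 13, 12, 16, 9 and 5 and code lengths 1, 3, 3, 3, 4 and 4 need 224 bits in total, against 800 bits at 8 bits per symbol, a saving of 72%.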

DATA FLOW DIAGRAM
A Data Flow Diagram (DFD) can be used to describe the existing or
planned data processing of an organization. It is a pictorial
technique to indicate:

1) The entities, which are the sources or destinations of data.
2) The processes, where data is transformed.
3) The data stores, where data is held.

The flow of data between these three components is described
better by a DFD than by text, because the diagram is unambiguous
and precise and forces related descriptions to appear together.

An ideal DFD technique should:

1) Facilitate a top-down, successive refinement approach to
presenting details, so that neither the analyst nor the users
(who must understand and approve the proposals) are swamped
by details.
2) Be independent of the physical implementation technique, i.e. a
DFD should indicate only the logical requirements of the system
without making assumptions about, or implying, how data
processing will be implemented.
3) Be easily understood by all.
4) Make it easy to express tradeoffs, i.e. the user and the analyst
should be in a position to discuss and evaluate alternatives at
the level of the DFD itself.

The DFD methodology is quite effective. It does not depend on
hardware, software, data structure or file organization.

DFD
[Figure: data flow diagram of the system; image not recovered.]

Project Flow
[Figures: the given data, encoding steps 1) to 6), and the final output; images not recovered.]

Coding:
// Written for Turbo C++ (pre-standard headers and conio.h).
#include<iostream.h>
#include<fstream.h>
#include<string.h>
#include<conio.h>
#include<stdlib.h>

#define MAX 100      // buffer size for symbol strings and code bits
#define MAXSYM 94    // maximum number of distinct symbols

// A Huffman tree node. A leaf holds one symbol; an internal node
// holds the concatenated symbols of its subtree.
typedef struct node
{
    struct node* left;
    int freq;
    char ch[MAX];
    struct node* right;
}NODE;

// Sort the first n nodes in ascending order of frequency.
void sort(NODE* a[], int n)
{
    int i=0, j=0;
    NODE* temp;
    for(i = 0; i < n - 1; i++)
    {
        for(j = i + 1; j < n; j++)
        {
            if(a[i]->freq > a[j]->freq)
            {
                temp = a[i];
                a[i] = a[j];
                a[j] = temp;
            }
        }
    }
}

// Allocate a node holding the symbol string a with frequency x.
NODE* create(char a[], int x)
{
    NODE* ptr;
    ptr = (NODE *)malloc(sizeof(NODE));
    if(ptr == NULL)
    {
        cout<<"\nOut of memory";
        exit(1);
    }
    ptr->freq = x;
    strcpy(ptr->ch , a);
    ptr->right = ptr->left = NULL;
    return(ptr);
}

// Remove a[1] by shifting a[2..n-1] one place towards the front.
void sright(NODE* a[], int n)
{
    int i=0;
    for(i=1; i<n-1; i++)
    {
        a[i] = a[i+1];
    }
}

// Print the code of every leaf: 1 for a left branch, 0 for a right branch.
void Assign_Code(NODE* tree, int c[], int n)
{
    int i=0;
    if((tree->left == NULL) && (tree->right == NULL))
    {
        cout<<"\nEnCoding for :"<<tree->ch<<" = ";
        if(n == 0)
        {
            cout<<0;    // single-symbol input: assign 0 by convention
        }
        for(i=0; i<n; i++)
        {
            cout<<c[i];
        }
        cout<<"\n";
    }
    else
    {
        c[n] = 1;
        n++;
        Assign_Code(tree->left, c, n);
        c[n-1] = 0;
        Assign_Code(tree->right, c, n);
    }
}

// Free the whole tree in post-order.
void Delete_Tree(NODE * root)
{
    if(root!=NULL)
    {
        Delete_Tree(root->left);
        Delete_Tree(root->right);
        free(root);
    }
}

int main()
{
    NODE* ptr;
    int i, fchoice=0, u=0, c[MAX], count[256], tcount=0;
    char str[MAX], name, fname[50], data2[2];
    static NODE *a[MAXSYM];

    clrscr();
    for(i=0;i<256;i++) count[i]=0;

    cout<<"\nYou want to Encode a New file or an Existing One";
    cout<<"\n1.Create New File\n2.Existing File\nChoice 1 or 2...: ";
    cin>>fchoice;

    if(fchoice==1)
    {
        cout<<"\nEnter the new file name: ";
        cin>>fname;
        cin.ignore();    // skip the newline left after the file name
        ofstream fout(fname);
        cout<<"\nPress Ctrl + Z then Enter Key to Save File after Completion";
        cout<<"\nEnter text for File\n\n";
        while(cin.get(name))
        {
            fout<<name;
        }
        fout.close();
    }
    else if(fchoice==2)
    {
        cout<<"\n\t\t\tEnter the Name of File : ";
        cin>>fname;
        cout<<"\n";
    }
    else
    {
        cout<<"\nInvalid choice";
        getch();
        return 1;
    }

    // Show the file and count how often each character occurs.
    cout<<"Contents of the File:"<<fname<<"\n";
    ifstream fin(fname);
    if(!fin)
    {
        cout<<"\nCannot open file";
        getch();
        return 1;
    }
    while(fin.get(name))
    {
        cout<<name;
        count[(unsigned char)name]++;
    }
    fin.close();
    cout<<"\n";
    getch();

    // Create one leaf node for every character that occurs.
    data2[1]='\0';
    for(i=0;i<256;i++)
    {
        if(count[i]>=1)
        {
            if(tcount==MAXSYM)
            {
                cout<<"\nToo many distinct symbols (maximum "<<MAXSYM<<")";
                getch();
                return 1;
            }
            data2[0]=(char)i;
            a[tcount]=create(data2,count[i]);
            tcount++;
        }
    }
    if(tcount==0)
    {
        cout<<"\nThe file is empty";
        getch();
        return 1;
    }

    // Merge the two least frequent nodes until one tree remains.
    while (tcount > 1)
    {
        sort(a,tcount);
        u = a[0]->freq + a[1]->freq;
        strcpy(str,a[0]->ch);
        strcat(str,a[1]->ch);
        ptr = create(str, u);
        ptr->right = a[1];
        ptr->left = a[0];
        a[0] = ptr;
        sright(a,tcount);
        tcount--;
    }

    Assign_Code(a[0], c, 0);
    Delete_Tree(a[0]);
    getch();
    return 0;
}

CONCLUSION

Our project is only a humble venture to satisfy the needs of an
institution. Several user-friendly coding practices have also been
adopted.

Last but not least, it is not the work alone that paved the way to
success, but the ALMIGHTY.

BIBLIOGRAPHY

1) http://www.google.co.in/
2) Introduction to Algorithms – PHI Publication
3) http://en.wikipedia.org/wiki/Huffman_coding
4) Complete Reference C++
