TABLE OF CONTENTS
S.NO DATE EXPERIMENT TITLE MARKS/10 SIGN.
1 LEXICAL ANALYZER TO RECOGNIZE A FEW PATTERNS IN C
2 IMPLEMENT A LEXICAL ANALYZER USING LEX TOOL
3.a PROGRAM TO RECOGNIZE A VALID ARITHMETIC EXPRESSION
3.b PROGRAM THAT RECOGNIZES A VALID VARIABLE
3.c IMPLEMENTATION OF CALCULATOR USING LEX AND YACC
4 THREE ADDRESS CODE USING LEX AND YACC
5 TYPE CHECKING USING LEX AND YACC
6 IMPLEMENT SIMPLE CODE OPTIMIZATION TECHNIQUES
7 IMPLEMENT THE BACK END OF THE COMPILER
EXNO:1 LEXICAL ANALYZER TO RECOGNIZE A FEW
PATTERNS IN C
DATE:
AIM:
To develop a lexical analyser that identifies identifiers, constants, comments, operators, etc. using a
C program.
ALGORITHM:
Step1: Start the program.
Step2: Declare all the variables and file pointers.
Step3: Display the input program.
Step4: Separate the keyword in the program and display it.
Step5: Display the header files of the input program
Step6: Separate the operators of the input program and display it.
Step7: Print the punctuation marks.
Step8: Print the constants that are present in the input program.
Step9: Print the identifiers of the input program.
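The classification in Steps 4, 8 and 9 can be sketched as a single function that looks at one complete lexeme. This is only an illustrative sketch: the names token_class and classify_lexeme and the short keyword list are assumptions made for this example and do not appear in the program below.

```c
#include <ctype.h>
#include <string.h>

/* Token classes used in this sketch (hypothetical names). */
enum token_class { TOK_KEYWORD, TOK_IDENTIFIER, TOK_CONSTANT, TOK_OTHER };

/* Classify one complete lexeme. The keyword list is a small sample only. */
enum token_class classify_lexeme(const char *lexeme)
{
    static const char *keywords[] = {
        "for", "while", "do", "int", "float", "char", "double", "switch", "case"
    };
    size_t count = sizeof keywords / sizeof keywords[0];
    size_t i;

    if (lexeme[0] == '\0')
        return TOK_OTHER;

    /* A run of digits is an integer constant. */
    if (isdigit((unsigned char)lexeme[0])) {
        for (i = 1; lexeme[i] != '\0'; i++)
            if (!isdigit((unsigned char)lexeme[i]))
                return TOK_OTHER;
        return TOK_CONSTANT;
    }

    /* A letter or underscore starts either a keyword or an identifier. */
    if (isalpha((unsigned char)lexeme[0]) || lexeme[0] == '_') {
        for (i = 1; lexeme[i] != '\0'; i++)
            if (!isalnum((unsigned char)lexeme[i]) && lexeme[i] != '_')
                return TOK_OTHER;
        for (i = 0; i < count; i++)
            if (strcmp(lexeme, keywords[i]) == 0)
                return TOK_KEYWORD;
        return TOK_IDENTIFIER;
    }
    return TOK_OTHER;
}
```

For example, classify_lexeme("while") yields TOK_KEYWORD, classify_lexeme("count1") yields TOK_IDENTIFIER and classify_lexeme("123") yields TOK_CONSTANT.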
PROGRAM:
#include<string.h>
#include<ctype.h>
#include<stdio.h>
#include<stdlib.h>
void keyword(char str[10])
{
if(strcmp("for",str)==0||strcmp("while",str)==0||strcmp("do",str)==0||
strcmp("int",str)==0||strcmp("float",str)==0||strcmp("char",str)==0||
strcmp("double",str)==0||strcmp("printf",str)==0||strcmp("switch",str)==0||
strcmp("case",str)==0)
printf("\n%s is a keyword",str);
else
printf("\n%s is an identifier",str);
}
int main()
{
FILE *f1,*f2,*f3;
int c; /* int rather than char, so that EOF can be told apart from valid characters */
char str[10],st1[10];
int num[100],lineno=0,tokenvalue=0,i=0,j=0,k=0;
f1=fopen("input.txt","r");
f2=fopen("identifier","w");
f3=fopen("specialchar","w");
while((c=getc(f1))!=EOF)
{
if(isdigit(c))
{
tokenvalue=c-'0';
c=getc(f1);
while(isdigit(c))
{
tokenvalue=tokenvalue*10+c-'0';
c=getc(f1);
}
num[i++]=tokenvalue;
ungetc(c,f1);
}
else
if(isalpha(c))
{
putc(c,f2);
c=getc(f1);
while(isdigit(c)||isalpha(c)||c=='_'||c=='$')
{
putc(c,f2);
c=getc(f1);
}
putc(' ',f2);
ungetc(c,f1);
}
else
if(c==' '||c=='\t')
printf(" ");
else
if(c=='\n')
lineno++;
else
putc(c,f3);
}
fclose(f2);
fclose(f3);
fclose(f1);
printf("\n the no's in the program are:");
for(j=0;j<i;j++)
printf("\t%d",num[j]);
printf("\n");
f2=fopen("identifier","r");
k=0;
printf("the keywords and identifier are:");
while((c=getc(f2))!=EOF)
if(c!=' ')
str[k++]=c;
else
{
str[k]='\0';
keyword(str);
k=0;
}
fclose(f2);
f3=fopen("specialchar","r");
printf("\n Special Characters are");
while((c=getc(f3))!=EOF)
printf("\t%c",c);
printf("\n");
fclose(f3);
printf("Total no of lines are:%d",lineno);
}
input.txt
#include<stdio.h>
void main()
{
printf("Hello World");
}
OUTPUT:
D:\compiler>gcc fi.c
D:\compiler>a.exe
the no's in the program are:
the keywords and identifier are:
include is an identifier
stdio is an identifier
h is an identifier
void is an identifier
main is an identifier
printf is a keyword
Hello is an identifier
World is an identifier
Special Characters are # < . > ( ) { ( " " ) ;
}
Total no of lines are:4
D:\compiler>
RESULT:
Thus the program to develop a lexical analyser to identify identifiers, constants, comments,
operators etc using C program is executed and verified.
EXNO:2 IMPLEMENT A LEXICAL ANALYZER USING LEX TOOL
DATE:
AIM:
To write a program for implementing a Lexical analyser using LEX tool.
ALGORITHM:
Step1: Lex program contains three sections: definitions, rules, and user subroutines. Each section must be
separated from the others by a line containing only the delimiter, %%. The format is as follows: definitions
%% rules %% user_subroutines
Step2: In the definitions section, the names make up the left column and their definitions make up the right
column. Any C code must be enclosed between %{ and %}. An identifier is defined so that its first character
is a letter and the remaining characters are alphanumeric.
Step3: In rules section, the left column contains the pattern to be recognized in an input file to yylex(). The
right column contains the C program fragment executed when that pattern is recognized. The various patterns
are keywords, operators, new line character, number, string, identifier, beginning and end of block, comment
statements, preprocessor directive statements etc.
Step4: Each pattern may have a corresponding action, that is, a fragment of C source code to execute when
the pattern is matched.
Step5: When yylex() matches a string in the input stream, it copies the matched text to an external character
array, yytext, before it executes any actions in the rules section.
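Step6: In the user subroutines section, the main routine calls yylex(). yywrap() is called at the end of the input; it returns 1 when there is no more input and 0 when more input is to be processed.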
Step7: The lex command uses the rules and actions contained in file to generate a program, lex.yy.c, which
can be compiled with the cc command. That program can then receive input, break the input into the logical
pieces defined by the rules in file, and run program fragments contained in the actions in file.
PROGRAM :
%{
#include<stdio.h>
%}
%%
if|else|while|int|switch|for {printf("%s is a keyword",yytext);}
[a-zA-Z][a-zA-Z0-9]* {printf("%s is an identifier",yytext);}
[0-9]+ {printf("%s is a number",yytext);}
%%
int main()
{
yylex();
return 0;
}
int yywrap()
{
return 1;
}
OUTPUT:
RESULT:
Thus the program for implementing a Lexical analyser using LEX tool is executed and verified.
EXNO:3.a PROGRAM TO RECOGNIZE A VALID ARITHMETIC
EXPRESSION
DATE:
AIM:
To write a program to recognize a valid arithmetic expression that uses operator +, - , * and / using
YACC tool.
ALGORITHM:
LEX
1. Place the required header files and variable declarations within '%{' and '%}'.
2. Write regular expressions that recognize the tokens (lexemes) of a valid arithmetic expression.
3. LEX calls the yywrap() function when the input is over. It should return 1 when the work is done and
0 when more processing is required.
YACC
1. Place the required header files and variable declarations within '%{' and '%}'.
2. Define the tokens in the first section, along with the associativity of the operators.
3. Write the grammar productions and the action for each production.
4. $$ refers to the value of the left-hand side of a production, while $1, $2, ... refer to the values of the
first, second, ... symbols on its right-hand side.
5. Call yyparse() to start the parsing process.
6. The yyerror() function is called when the input does not match any production of the grammar given in
the second section.
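The grammar described above can also be checked by hand. The sketch below is a recursive-descent recognizer written directly in C. It is not the table-driven parser that YACC generates, and the function names (is_valid_expression, parse_expn, parse_term, parse_factor) are hypothetical. Operator precedence is encoded by splitting the grammar into expn, term and factor levels instead of using %left declarations.

```c
#include <ctype.h>

/* Cursor into the expression being checked (file scope for brevity). */
static const char *cursor;

static int parse_expn(void);

static void skip_blanks(void)
{
    while (*cursor == ' ' || *cursor == '\t')
        cursor++;
}

/* factor: '-' factor | '(' expn ')' | ID | DIG */
static int parse_factor(void)
{
    skip_blanks();
    if (*cursor == '-') {                      /* unary minus */
        cursor++;
        return parse_factor();
    }
    if (*cursor == '(') {
        cursor++;
        if (!parse_expn())
            return 0;
        skip_blanks();
        if (*cursor != ')')
            return 0;
        cursor++;
        return 1;
    }
    if (isalpha((unsigned char)*cursor)) {     /* ID: letter, then letters or digits */
        while (isalnum((unsigned char)*cursor))
            cursor++;
        return 1;
    }
    if (isdigit((unsigned char)*cursor)) {     /* DIG: one or more digits */
        while (isdigit((unsigned char)*cursor))
            cursor++;
        return 1;
    }
    return 0;
}

/* term: factor { ('*' | '/') factor } */
static int parse_term(void)
{
    if (!parse_factor())
        return 0;
    for (;;) {
        skip_blanks();
        if (*cursor != '*' && *cursor != '/')
            return 1;
        cursor++;
        if (!parse_factor())
            return 0;
    }
}

/* expn: term { ('+' | '-') term } */
static int parse_expn(void)
{
    if (!parse_term())
        return 0;
    for (;;) {
        skip_blanks();
        if (*cursor != '+' && *cursor != '-')
            return 1;
        cursor++;
        if (!parse_term())
            return 0;
    }
}

/* Returns 1 if the whole string is a valid arithmetic expression, else 0. */
int is_valid_expression(const char *text)
{
    cursor = text;
    if (!parse_expn())
        return 0;
    skip_blanks();
    return *cursor == '\0';
}
```

With this sketch, is_valid_expression("(a+b)*-c") returns 1, while is_valid_expression("a+") and is_valid_expression("a b") return 0.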
PROGRAM:
//art_expr.l
%{
#include<stdio.h>
#include "y.tab.h"
%}
%%
[a-zA-Z][0-9a-zA-Z]* {return ID;}
[0-9]+ {return DIG;}
[ \t]+ {;}
. {return yytext[0];}
\n {return 0;}
%%
int yywrap()
{
return 1;
}
//art_expr.y
%{
#include<stdio.h>
#include<stdlib.h>
int yylex(void);
int yyerror(const char *msg);
%}
%token ID DIG
%left '+' '-'
%left '*' '/'
%right UMINUS
%%
stmt:expn ;
expn:expn'+'expn
|expn'-'expn
|expn'*'expn
|expn'/'expn
|'-'expn %prec UMINUS
|'('expn')'
|DIG
|ID
;
%%
int main()
{
printf("Enter the Expression \n");
yyparse();
printf("valid Expression \n");
return 0;
}
int yyerror(const char *msg)
{
printf("Invalid Expression");
exit(0);
}
OUTPUT:
RESULT:
Thus the program to recognize a valid arithmetic expression that uses operator +, - , * and / using
YACC tool was executed and verified successfully.
EXNO:3.b PROGRAM THAT RECOGNIZES A VALID VARIABLE
DATE:
AIM:
To write a program to recognize a valid variable, which starts with a letter followed by any number of
letters or digits, using YACC tool.
ALGORITHM:
LEX
1. Place the required header files and variable declarations within '%{' and '%}'.
2. Write regular expressions or patterns that recognize the tokens (lexemes) of a valid
variable.
3. LEX calls the yywrap() function when the input is over. It should return 1 when the work is done and
0 when more processing is required.
YACC
1. Place the required header files and variable declarations within '%{' and '%}'.
2. Define the tokens in the first section, along with the associativity of the operators.
3. Write the grammar productions and the action for each production.
4. $$ refers to the value of the left-hand side of a production, while $1, $2, ... refer to the values of the
first, second, ... symbols on its right-hand side.
5. Call yyparse() to start the parsing process.
6. The yyerror() function is called when the input does not match any production of the grammar
given in the second section.
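The rule being recognized (a letter followed by any number of letters or digits) is small enough to check directly in C. The following is a minimal sketch of that rule only; the function name is_valid_variable is hypothetical and the function is not part of the LEX and YACC program.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if name is a letter followed by zero or more letters or digits. */
int is_valid_variable(const char *name)
{
    size_t i;

    /* The first character must be a letter (this also rejects the empty string). */
    if (!isalpha((unsigned char)name[0]))
        return 0;

    /* Every remaining character must be a letter or a digit. */
    for (i = 1; name[i] != '\0'; i++)
        if (!isalnum((unsigned char)name[i]))
            return 0;
    return 1;
}
```

For example, is_valid_variable("a1b2") returns 1 and is_valid_variable("1abc") returns 0. Like the grammar in this experiment, the sketch does not accept an underscore.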
PROGRAM:
//valvar.l
%{
#include "y.tab.h"
%}
%%
[a-zA-Z] {return LET;}
[0-9] {return DIG;}
. {return yytext[0];}
\n {return 0;}
%%
int yywrap()
{
return 1;
}
//valvar.y
%{
#include<stdio.h>
#include<stdlib.h>
int yylex(void);
int yyerror(const char *msg);
%}
%token LET DIG
%%
variable:var
;
var:var DIG
|var LET
|LET
;
%%
int main()
{
printf("Enter the variable:\n");
yyparse();
printf("Valid variable \n");
return 0;
}
int yyerror(const char *msg)
{
printf("Invalid variable \n");
exit(0);
}
OUTPUT:
RESULT:
Thus the program to recognize a valid variable, which starts with a letter followed by any number
of letters or digits, using YACC tool was executed and verified successfully.
EXNO:3.c IMPLEMENTATION OF CALCULATOR USING LEX AND
YACC
DATE:
AIM:
To write a program to implement Calculator using LEX and YACC.
ALGORITHM:
Step 1: Start the program.
Step 2: In the declaration part of the lex file, include the regular definitions, such as digit.
Step 3: In the translation rules part of the lex file cal.l, specify each pattern and the action to be executed
whenever a lexeme matched by the pattern is found in the input.
Step 4: In the Yacc program, perform the arithmetic operations +, -, *, / and %.
Step 5: Display an error message if the expression is invalid.
Step 6: Provide the input.
Step 7: Verify the output.
Step 8: End.
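The evaluation carried out by the grammar actions can be sketched without LEX and YACC as a recursive-descent evaluator. This is an illustration under assumptions, not the method used in the program below: the names evaluate, eval_expr, eval_term and eval_factor are hypothetical, the input is assumed to contain no blanks, and division or modulus by zero is reported as an invalid expression.

```c
#include <ctype.h>

static const char *pos;   /* cursor into the expression */
static int failed;        /* set to 1 on any syntax or arithmetic error */

static int eval_expr(void);

/* factor: NUMBER | '(' expr ')' */
static int eval_factor(void)
{
    int value = 0;

    if (*pos == '(') {
        pos++;
        value = eval_expr();
        if (*pos == ')')
            pos++;
        else
            failed = 1;
        return value;
    }
    if (!isdigit((unsigned char)*pos)) {
        failed = 1;
        return 0;
    }
    while (isdigit((unsigned char)*pos))
        value = value * 10 + (*pos++ - '0');
    return value;
}

/* term: factor { ('*' | '/' | '%') factor } */
static int eval_term(void)
{
    int value = eval_factor();

    while (!failed && (*pos == '*' || *pos == '/' || *pos == '%')) {
        char op = *pos++;
        int right = eval_factor();
        if (op == '*')
            value *= right;
        else if (right == 0)
            failed = 1;            /* division or modulus by zero */
        else if (op == '/')
            value /= right;
        else
            value %= right;
    }
    return value;
}

/* expr: term { ('+' | '-') term } */
static int eval_expr(void)
{
    int value = eval_term();

    while (!failed && (*pos == '+' || *pos == '-')) {
        char op = *pos++;
        int right = eval_term();
        value = (op == '+') ? value + right : value - right;
    }
    return value;
}

/* Returns 1 and stores the value in *result if text is valid, else returns 0. */
int evaluate(const char *text, int *result)
{
    pos = text;
    failed = 0;
    *result = eval_expr();
    return !failed && *pos == '\0';
}
```

Calling evaluate("2+3*4", &r) returns 1 and sets r to 14, since multiplication binds tighter than addition, while evaluate("10/0", &r) returns 0.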
PROGRAM:
Cal.l
%{
#include<stdio.h>
#include<stdlib.h>
#include "y.tab.h"
extern int yylval;
%}
%%
[0-9]+ {
yylval=atoi(yytext);
return NUMBER;
}
[ \t] ;
[\n] return 0;
. return yytext[0];
%%
int yywrap()
{
return 1;
}
Cal.y
%{
#include<stdio.h>
int yylex(void);
int yyerror(const char *msg);
int flag=0;
%}
%token NUMBER
%left '+' '-'
%left '*' '/' '%'
%left '(' ')'
%%
ArithmeticExpression: E{
printf("\nResult=%d\n",$1);
return 0;
};
E:E'+'E {$$=$1+$3;}
|E'-'E {$$=$1-$3;}
|E'*'E {$$=$1*$3;}
|E'/'E {$$=$1/$3;}
|E'%'E {$$=$1%$3;}
|'('E')' {$$=$2;}
| NUMBER {$$=$1;}
;
%%
int main()
{
printf("\nEnter Any Arithmetic Expression which can have operations Addition, Subtraction, Multiplication, Division, Modulus and Round brackets:\n");
yyparse();
if(flag==0)
printf("\nEntered arithmetic expression is Valid\n\n");
return 0;
}
int yyerror(const char *msg)
{
printf("\nEntered arithmetic expression is Invalid\n\n");
flag=1;
return 0;
}
OUTPUT:
RESULT:
Thus the program for implementing calculator using LEX and YACC is executed and verified.
EXNO:4 THREE ADDRESS CODE USING LEX AND YACC
DATE:
AIM:
To write a program to generate three address code for simple expression using LEX and YACC
tool.
ALGORITHM:
LEX
1. Place the required header files and variable declarations within '%{' and '%}'.
2. Write regular expressions or patterns that recognize the tokens (lexemes) of the input
expression.
3. LEX calls the yywrap() function when the input is over. It should return 1 when the work is done and
0 when more processing is required.
YACC
1. Place the required header files and variable declarations within '%{' and '%}'.
2. Define the tokens in the first section, along with the associativity of the operators.
3. Write the grammar productions and the action for each production.
4. $$ refers to the value of the left-hand side of a production, while $1, $2, ... refer to the values of the
first, second, ... symbols on its right-hand side.
5. Call yyparse() to start the parsing process.
6. The yyerror() function is called when the input does not match any production of the grammar
given in the second section.
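The idea behind three address code generation can be sketched in plain C: each operator produces one instruction whose result goes into a fresh temporary. The sketch below is a hand-written recursive-descent version, not the LEX and YACC program of this experiment, and it makes one deliberate change: literal constants are printed as plain numbers and only temporaries get the t prefix, whereas the sample output of this experiment prints both with a t prefix (t6 * t6). The names generate_tac, emit and struct operand are hypothetical, and the input is assumed to be well formed, without blanks, using only +, * and parentheses.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* An operand is either a literal constant or a temporary tN. */
struct operand { int is_temp; int value; };

static const char *src;     /* cursor into the input expression */
static char *out;           /* where the next instruction is written */
static size_t out_left;     /* bytes still free in the output buffer */
static int next_temp;       /* number of the next temporary */

static void format_operand(struct operand op, char *text, size_t size)
{
    if (op.is_temp)
        snprintf(text, size, "t%d", op.value);
    else
        snprintf(text, size, "%d", op.value);
}

/* Append "tN = left op right" to the buffer and return tN as an operand. */
static struct operand emit(char op, struct operand left, struct operand right)
{
    char l[16], r[16], line[64];
    struct operand result = { 1, next_temp++ };
    size_t len;

    format_operand(left, l, sizeof l);
    format_operand(right, r, sizeof r);
    snprintf(line, sizeof line, "t%d = %s %c %s\n", result.value, l, op, r);
    len = strlen(line);
    if (len < out_left) {              /* keep room for the terminating NUL */
        memcpy(out, line, len + 1);
        out += len;
        out_left -= len;
    }
    return result;
}

static struct operand parse_expr(void);

/* factor: '(' expr ')' | NUM */
static struct operand parse_factor(void)
{
    struct operand op = { 0, 0 };

    if (*src == '(') {
        src++;
        op = parse_expr();
        if (*src == ')')
            src++;
        return op;
    }
    while (isdigit((unsigned char)*src))
        op.value = op.value * 10 + (*src++ - '0');
    return op;
}

/* term: factor { '*' factor } */
static struct operand parse_term(void)
{
    struct operand left = parse_factor();

    while (*src == '*') {
        src++;
        left = emit('*', left, parse_factor());
    }
    return left;
}

/* expr: term { '+' term } */
static struct operand parse_expr(void)
{
    struct operand left = parse_term();

    while (*src == '+') {
        src++;
        left = emit('+', left, parse_term());
    }
    return left;
}

/* Write the three address code for expression into buffer. */
void generate_tac(const char *expression, char *buffer, size_t size)
{
    src = expression;
    out = buffer;
    out_left = size;
    next_temp = 1;
    if (size > 0)
        buffer[0] = '\0';
    parse_expr();
}
```

For the input 4+6*6 this sketch fills the buffer with the two lines "t1 = 6 * 6" and "t2 = 4 + t1".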
PROGRAM:
Addr.l
%{
#include <stdlib.h>
#include "y.tab.h"
%}
%%
[0-9]+ { yylval = atoi(yytext); return NUM; }
[ \t] ; // Skip whitespace
\n return '\n';
. return yytext[0];
%%
int yywrap()
{
return 1;
}
addr.y
%{
#include <stdio.h>
extern int yylex();
extern int yyerror(const char*);
int yylval;
int temp_count = 1;
void emit(char op, int arg1, int arg2, int result) {
printf("t%d = ", result);
if (op == '+') {
printf("t%d + t%d\n", arg1, arg2);
} else if (op == '*') {
printf("t%d * t%d\n", arg1, arg2);
}
}
%}
%token NUM
%%
program: stmt '\n' { printf("\n"); }
| program stmt '\n' { printf("\n"); }
;
stmt: expr { printf("Result: t%d\n", $1); }
;
expr: expr '+' term { emit('+', $1, $3, temp_count); $$ = temp_count++; }
| term { $$ = $1; }
;
term: term '*' factor { emit('*', $1, $3, temp_count); $$ = temp_count++; }
| factor { $$ = $1; }
;
factor: '(' expr ')' { $$ = $2; }
| NUM { $$ = $1; }
;
%%
int main() {
yyparse();
return 0;
}
int yyerror(const char *msg) {
fprintf(stderr, "Error: %s\n", msg);
return 1;
}
OUTPUT:
4+6*6
t1 = t6 * t6
t2 = t4 + t1
Result: t2
RESULT:
Thus the program to generate three address code for a simple expression using LEX and YACC is executed and the output is verified.
EXNO:5 TYPE CHECKING USING LEX AND YACC
DATE:
AIM:
To write a program to perform type checking of simple arithmetic expressions using LEX and YACC tool.
ALGORITHM:
LEX
1. Place the required header files and variable declarations within '%{' and '%}'.
2. Write regular expressions or patterns that recognize the tokens (lexemes) of the input
expression.
3. LEX calls the yywrap() function when the input is over. It should return 1 when the work is done and
0 when more processing is required.
YACC
1. Place the required header files and variable declarations within '%{' and '%}'.
2. Define the tokens in the first section, along with the associativity of the operators.
3. Write the grammar productions and the action for each production.
4. $$ refers to the value of the left-hand side of a production, while $1, $2, ... refer to the values of the
first, second, ... symbols on its right-hand side.
5. Call yyparse() to start the parsing process.
6. The yyerror() function is called when the input does not match any production of the grammar
given in the second section.
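The type rule itself can be sketched in plain C, separately from the LEX and YACC machinery. The sketch assumes a rule under which both operands of a binary operator must have the same type, so that mixing an integer with a floating-point literal is reported as a mismatch, as the message in the sample run of this experiment suggests. The names value_type, literal_type and check_binary are hypothetical.

```c
#include <ctype.h>

enum value_type { TYPE_INT, TYPE_FLOAT, TYPE_ERROR };

/* Type of a numeric literal: digits only is int, digits '.' digits is float. */
enum value_type literal_type(const char *text)
{
    int i = 0;

    if (!isdigit((unsigned char)text[i]))
        return TYPE_ERROR;
    while (isdigit((unsigned char)text[i]))
        i++;
    if (text[i] == '\0')
        return TYPE_INT;
    if (text[i] != '.')
        return TYPE_ERROR;
    i++;
    if (!isdigit((unsigned char)text[i]))      /* at least one digit after '.' */
        return TYPE_ERROR;
    while (isdigit((unsigned char)text[i]))
        i++;
    return text[i] == '\0' ? TYPE_FLOAT : TYPE_ERROR;
}

/* Result type of "left op right": the common type, or TYPE_ERROR on a mismatch. */
enum value_type check_binary(enum value_type left, enum value_type right)
{
    if (left == TYPE_ERROR || right == TYPE_ERROR)
        return TYPE_ERROR;
    return left == right ? left : TYPE_ERROR;
}
```

Under this assumed rule, check_binary(literal_type("8"), literal_type("7.0")) gives TYPE_ERROR, while check_binary(literal_type("8"), literal_type("7")) gives TYPE_INT.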
PROGRAM:
Type.l
%{
#include <stdlib.h>
#include "y.tab.h"
%}
%%
[0-9]+(\.[0-9]+)? {
yylval.dval = atof(yytext);
return NUMBER;
}
[-+*/] {
return *yytext;
}
[ \t\n]
. {
return *yytext;
}
%%
int yywrap()
{
return 1;
}
Type.y
%{
#include <stdio.h>
%}
%union {
double dval;
}
%token <dval> NUMBER
%left '+' '-'
%left '*' '/'
%type <dval> expr
%%
input: /* empty */
| input line
;
line: expr '\n' {
printf("Result: %lf\n", $1);
}
;
expr: NUMBER
| expr '+' expr {
$$ = $1 + $3;
}
| expr '-' expr {
$$ = $1 - $3;
}
| expr '*' expr {
$$ = $1 * $3;
}
| expr '/' expr {
if ($3 == 0) {
printf("Error: Division by zero\n");
}
$$ = $1 / $3;
}
;
%%
int main() {
yyparse();
return 0;
}
int yyerror()
{
printf("mismatch type \n");
return 0;
}
OUTPUT:
D:\compiler>lex type.l
D:\compiler>yacc -d type.y
D:\compiler>gcc lex.yy.c y.tab.c
D:\compiler>a.exe
8.0
8+7.0
mismatch type
RESULT :
Thus the output is executed for type checking using lex and yacc.
EXNO:6 IMPLEMENT SIMPLE CODE OPTIMIZATION
TECHNIQUES
DATE:
AIM:
To write a C program to implement simple code optimization techniques.
ALGORITHM:
Start the program.
Write the C program for simple code optimization.
Constant Folding: Arithmetic operations on constants are evaluated at compile time and replaced by
a single constant value.
Strength Reduction: Expensive operations are replaced with cheaper ones (for example,
multiplication or division by a power of two is replaced by a shift).
Algebraic Transformation: Expressions are simplified using algebraic identities to reduce the amount of computation.
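The three techniques can be sketched as small rewriting passes over a single three address instruction. This is an illustration only: the structures operand and instr, the operator codes '=' (copy of the left operand) and '<' (left shift), and the function names are assumptions made for this example, and overflow is ignored.

```c
/* An operand is a constant (value holds it) or a variable (value is its id). */
struct operand { int is_const; int value; };

/* op is '+', '-', '*', '<' (shift left) or '=' (copy of left). */
struct instr { char op; struct operand left, right; };

/* Constant folding: c1 op c2 becomes a copy of the computed constant. */
int fold_constants(struct instr *ins)
{
    int result;

    if (!ins->left.is_const || !ins->right.is_const)
        return 0;
    switch (ins->op) {
    case '+': result = ins->left.value + ins->right.value; break;
    case '-': result = ins->left.value - ins->right.value; break;
    case '*': result = ins->left.value * ins->right.value; break;
    default:  return 0;
    }
    ins->op = '=';
    ins->left.value = result;
    return 1;
}

/* Algebraic transformation: x + 0, x - 0 and x * 1 become a copy of x. */
int simplify_algebra(struct instr *ins)
{
    if (!ins->right.is_const)
        return 0;
    if ((ins->op == '+' || ins->op == '-') && ins->right.value == 0) {
        ins->op = '=';
        return 1;
    }
    if (ins->op == '*' && ins->right.value == 1) {
        ins->op = '=';
        return 1;
    }
    return 0;
}

/* Strength reduction: x * 2^k becomes x << k. */
int reduce_strength(struct instr *ins)
{
    int shift = 0;
    int factor;

    if (ins->op != '*' || !ins->right.is_const || ins->right.value < 2)
        return 0;
    factor = ins->right.value;
    while (factor % 2 == 0) {
        factor /= 2;
        shift++;
    }
    if (factor != 1)
        return 0;                  /* the multiplier is not a power of two */
    ins->op = '<';
    ins->right.value = shift;
    return 1;
}
```

Each function returns 1 when it rewrites the instruction and 0 when it leaves it unchanged. For example, the instruction 10 + 15 is folded into a copy of the constant 25, and x * 8 is reduced to x << 3.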
PROGRAM
#include <stdio.h>
int main() {
int a = 10;
int b = 5;
int c, d, e, f;
// Constant folding: Performing operations with constants at compile-time
c = a + 15; // Constant folding: 10 + 15 = 25
d = b * 4; // Constant folding: 5 * 4 = 20
// Strength reduction: Replacing expensive operations with cheaper ones
e = a * 2; // Strength reduction: 10 * 2 = 20 (a compiler can replace this multiplication with the shift a << 1)
f = b / 2; // Strength reduction: 5 / 2 = 2 (integer division; for non-negative b this equals b >> 1)
// Algebraic transformation: Simplifying expressions
int result1 = ((a * b) + (c - d)) / e; // ((10 * 5) + (25 - 20)) / 20 = (50 + 5) / 20 = 55 / 20 = 2
int result2 = ((a * b) + (c - d)) * f; // ((10 * 5) + (25 - 20)) * 2 = (50 + 5) * 2 = 55 * 2 = 110
printf("c = %d\n", c);
printf("d = %d\n", d);
printf("e = %d\n", e);
printf("f = %d\n", f);
printf("Result 1 = %d\n", result1);
printf("Result 2 = %d\n", result2);
return 0;
}
OUTPUT:
RESULT :
Thus the output is executed for implementation of code optimization.
EXNO:7 IMPLEMENT THE BACK END OF THE COMPILER
DATE:
AIM:
To implement the back end of the compiler, which takes the three address code and produces
8086 assembly language instructions that can be assembled and run using an 8086 assembler. The target
assembly instructions can be simple move, add, sub and jump instructions. Simple addressing modes are also used.
INTRODUCTION:
A compiler is a computer program that implements a programming language specification to “translate”
programs, usually as a set of files which constitute the source code written in source language, into their
equivalent machine readable instructions(the target language, often having a binary form known as object
code). This translation process is called compilation.
BACK END:
Some local optimization
Register allocation
Peep-hole optimization
Code generation
Instruction scheduling
The main phases of the back end include the following:
Analysis: This is the gathering of program information from the intermediate representation derived from
the input; data-flow analysis is used to build use-define chains, together with dependence analysis, alias
analysis, pointer analysis, escape analysis etc.
Optimization: The intermediate language representation is transformed into functionally equivalent but
faster (or smaller) forms. Popular optimizations are inline expansion, dead code elimination, constant
propagation, loop transformation, register allocation and even automatic parallelization.
Code generation: The transformed language is translated into the output language, usually the native
machine language of the system. This involves resource and storage decisions, such as deciding which
variables to fit into registers and memory and the selection and scheduling of appropriate machine
instructions along with their associated modes. Debug data may also need to be generated to facilitate
debugging.
ALGORITHM:
1. Start the program
2. Open the source file and store the contents as quadruples.
3. Check the operator in each quadruple: if it is an arithmetic operator or an assignment
operator, generate the corresponding instruction; otherwise perform unary minus on register C.
4. Write the generated code into the output file outp.c.
5. Print the output.
6. Stop the program.
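Step 3 can be sketched as a function that turns one quadruple into three pseudo-assembly instructions: load the first argument into a register, apply the operator with the second argument, and store the register into the result. The sketch is an illustration under assumptions: struct quad, mnemonic and generate_code are hypothetical names, operands are single characters, and the instructions use the source-first operand order of the program in this experiment, not exact 8086 assembler syntax.

```c
#include <stdio.h>

/* One quadruple: result = arg1 op arg2, with single-character operand names. */
struct quad { char op; char arg1; char arg2; char result; };

/* Map an operator to an instruction mnemonic; NULL if it is not supported. */
static const char *mnemonic(char op)
{
    switch (op) {
    case '+': return "ADD";
    case '-': return "SUB";
    case '*': return "MUL";
    case '/': return "DIV";
    default:  return NULL;
    }
}

/*
 * Write the code for one quadruple into buffer, using register R<reg>.
 * Returns the number of characters written, or -1 for an unknown operator.
 */
int generate_code(struct quad q, int reg, char *buffer, size_t size)
{
    const char *opcode = mnemonic(q.op);

    if (opcode == NULL)
        return -1;
    return snprintf(buffer, size,
                    "MOV %c,R%d\n%s %c,R%d\nMOV R%d,%c\n",
                    q.arg1, reg, opcode, q.arg2, reg, reg, q.result);
}
```

For the quadruple (+, b, c, a) and register 0, the buffer receives the three lines MOV b,R0, ADD c,R0 and MOV R0,a.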
PROGRAM:
#include<stdio.h>
//#include<conio.h>
#include<string.h>
int main()
{
    char icode[10][30],str[30],opr[10]="";
    int i=0;
    //clrscr();
    printf("\n Enter the set of intermediate code (terminated by exit):\n");
    /* Read statements of the form a=b+c, one per line, until "exit" */
    do
    {
        scanf("%29s",icode[i]);
    }
    while(strcmp(icode[i++],"exit")!=0);
    printf("\n target code generation");
    printf("\n************************");
    i=0;
    /* Generate three instructions for every statement before "exit" */
    while(strcmp(icode[i],"exit")!=0)
    {
        strcpy(str,icode[i]);
        /* str[3] is the operator in a statement such as a=b+c */
        switch(str[3])
        {
        case '+':
            strcpy(opr,"ADD");
            break;
        case '-':
            strcpy(opr,"SUB");
            break;
        case '*':
            strcpy(opr,"MUL");
            break;
        case '/':
            strcpy(opr,"DIV");
            break;
        }
        printf("\n\tMov %c,R%d",str[2],i);
        printf("\n\t%s %c,R%d",opr,str[4],i);
        printf("\n\tMov R%d,%c",i,str[0]);
        i++;
    }
    //getch();
    return 0;
}
OUTPUT:
RESULT:
Thus the program to implement the back end of the compiler, which generates target code from three address code, has been successfully executed.