Compiler and Interpreter Overview

The document discusses different types of translators including compilers, interpreters, assemblers, and describes their key differences such as compilers translating the entire source code at once into machine code while interpreters translate line by line and are generally slower. It also covers the different phases of a compiler including lexical analysis, syntax analysis, code generation, and describes how compilers are used to translate high-level programming languages into machine-readable object code.

Introduction

Translator: A translator is software that converts a program written in one language into a functionally equivalent program in another language.

Assembler: A program that translates a source program in assembly language into a functionally equivalent target program in machine language.

Interpreter: A program that translates a source program in a high-level language into a functionally equivalent machine-language form. An interpreter reads one instruction or line of the source code at a time, converts it into machine code, and executes it; the machine code is then discarded and the next line is read. The advantages are that it is simple, that you can interrupt the program while it is running, change it, and either continue or start again, and that errors are easier to find. The disadvantage is that every line has to be translated each time it is executed, even if it is executed many times as the program runs, so interpreters tend to be slow. Moreover, the source code of an interpreted language cannot run without the interpreter. Examples are BASIC on older home computers, script languages such as JavaScript, and languages such as Lisp and Forth.

Compiler: A program that translates a source program in a high-level language into a functionally equivalent target program in machine language. A compiler reads the whole source code and translates it into a complete machine-code program, which is output as a new file; execution begins only after translation of the whole program is finished. This completely separates the source code from the executable file, since execution is done on the intermediate object file.

SibaramaPanigrahi, Lecture in CSE


The biggest advantage of this is that the translation is done once only and as a separate process. The program that is run is already in machine code, so it executes much faster. Compilers also produce better optimized code, and compiled code is self-sufficient: it can run on its intended platform without the compiler present. The disadvantage is that you cannot change the program without going back to the original source code, editing it, and recompiling (though for a professional software developer this is more of an advantage, because it stops the source code from being copied), and errors are harder to find. Current examples of compiled languages are Visual Basic, C, C++, C#, FORTRAN, COBOL, Ada, and Pascal.

Types of Compiler: Compilers are of two kinds: native compilers and cross compilers.
1. Native compilers run on one machine and produce object code for that same machine. For example, a compiler for language S that is written in a language running on machine M and generates code that runs on machine M is a native compiler.
2. Cross compilers run on one machine and produce object code for another machine. For example, a compiler for language S that is written in a language running on machine N and generates code that runs on machine M is a cross compiler.
A compiler can be characterized by three languages: i. the source language (S), ii. the target language (T), and iii. the implementation language (I). The three languages S, I, and T can be quite different; when they are, the compiler is a cross compiler.


Bootstrapping: Bootstrapping is obtaining a compiler for a language L by writing the compiler in the language L itself.

De-compiler: A program that translates machine-level language into a functionally equivalent high-level language.

Source-to-source translator (language translator or language converter): A compiler that translates a high-level language into a functionally equivalent program in another high-level language.

How to translate? High-level languages and machine languages differ in their level of abstraction. At the machine level we deal with memory locations and registers, whereas these resources are never accessed directly in high-level languages. The level of abstraction also differs from language to language, and some languages are farther from machine code than others.
Goals of translation:
1. Good performance for the generated code: One metric for the quality of generated code is the ratio between the size of hand-written machine code and compiled machine code for the same program; a better compiler is one that generates smaller code, and for optimizing compilers this ratio is smaller. Hand-written machine code is generally more efficient than compiled code, i.e., a program hand-written in machine code will run faster than its compiled counterpart. If a compiler produces code that is 20-30% slower than hand-written code, it is considered acceptable.
2. Good compile-time performance: The compiler itself must run fast; compilation time should be proportional to program size.
3. Maintainable code and a high level of abstraction.
4. Correctness: This is a very important issue. A compiler's most important goal is correctness: all valid programs must compile correctly. How do we check whether a compiler is correct, i.e., whether a compiler for a programming language generates correct machine code for programs in the language? The complexity of writing a correct compiler is a major limitation on the amount of optimization that can be done. Can compilers be proven correct? This is very tedious; however, correctness has an implication on the development cost.


Phases of a Compiler: A compiler is composed of several components, called phases, each performing one specific task. The complete compilation process is divided into six phases, which can be grouped into two parts.
1. Analysis phase: The source program is broken into constituent pieces and an intermediate representation is created. Analysis is done in three sub-phases: i. Lexical analysis, ii. Syntax analysis, iii. Semantic analysis.
2. Synthesis phase: The desired target program is constructed from the intermediate representation. Synthesis is done in three sub-phases: i. Intermediate code generation, ii. Code optimization, iii. Code generation.

Many modern compilers share a common 'two stage' design.


The "front end" translates the source language or the high level program into an intermediate representation. It includes all analysis phases and intermediate code generation phase. It analyses the source program and produces intermediate code. The second stage is the "back end", which works with the internal representation to produce code in the output language which is a low level code. The higher the abstraction a compiler can support, the better it is. It includes code optimization and code generation phase of compiler. It synthesizes the target program from the intermediate code.

Lexical Analysis(Scanning)
In simple words, lexical analysis is the process of identifying the words in an input string of characters so that they may be handled more easily by the syntax analyzer. These words must be separated by some predefined delimiter, or there may be rules imposed by the language for breaking the sentence into tokens or words, which are then passed on to the next phase, syntax analysis. Recognizing words is not completely trivial. For example: is this a sentence? To answer that, we must know what the word separators are: the language must define rules for breaking a sentence into a sequence of words. Normally white space and punctuation are the word separators in languages; in programming languages, a character from a different character class may also be treated as a word separator.

Lexeme: A sequence of characters that forms a token of the language is known as a lexeme.

Token: A category of lexemes is called a token; it is the smallest individual unit of a program.

The first phase of a compiler is called lexical analysis or scanning. The lexical analyzer scans the source program character by character, groups the characters into meaningful sequences called lexemes, and checks whether each is a valid token of the language. If a lexeme is not a valid token, it generates an error, which is handled by the error handler. If the lexeme is a valid token, then for each lexeme the lexical analyzer produces as output a token of the form (token-name, attribute-value). To perform this, the lexical analyzer design must:
1. Specify the tokens of the language.
2. Suitably recognize those tokens.
To specify the tokens of the language, the regular-expression notation from automata theory is used, and recognition of the tokens is done by a deterministic finite automaton (DFA).
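The scanning loop described above can be sketched in C. This is a minimal illustration under simplifying assumptions, not a production lexer: it recognizes only identifiers, integer constants, and single-character specials, and the names next_token, TOK_ID, etc. are invented for the sketch.

```c
#include <ctype.h>
#include <string.h>

/* Token classes for this sketch (invented names). */
enum { TOK_ID, TOK_NUM, TOK_SPECIAL, TOK_EOF };

/* Scan one lexeme starting at *src, copy it into lexeme,
   advance *src past it, and return the lexeme's token class. */
int next_token(const char **src, char *lexeme) {
    const char *p = *src;
    while (isspace((unsigned char)*p)) p++;       /* skip word separators */
    int n = 0;
    if (*p == '\0') { lexeme[0] = '\0'; *src = p; return TOK_EOF; }
    if (isalpha((unsigned char)*p)) {             /* letter (letter|digit)* */
        while (isalnum((unsigned char)*p)) lexeme[n++] = *p++;
        lexeme[n] = '\0'; *src = p; return TOK_ID;
    }
    if (isdigit((unsigned char)*p)) {             /* digit digit* */
        while (isdigit((unsigned char)*p)) lexeme[n++] = *p++;
        lexeme[n] = '\0'; *src = p; return TOK_NUM;
    }
    lexeme[0] = *p++;                             /* single-char special */
    lexeme[1] = '\0';
    *src = p; return TOK_SPECIAL;
}
```

Scanning "int a=5;" with this sketch yields the lexemes int, a, =, 5 and ;. A real scanner would additionally look each identifier up in a keyword table so that int is reported as a keyword rather than an identifier.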


Pattern: A rule of description is a pattern. For example, letter (letter | digit)* is a pattern symbolizing the set of strings that consist of a letter followed by zero or more letters or digits.

Q. Find the tokens and lexemes of the following program.

main()
{
    int a=5;
    int b[11];
    while(a<=5)
        b[a]=3*a;
}

Solution:

Lexeme      Token
main        Identifier
(           Special Character
)           Special Character
{           Special Character
int         Keyword
a           Identifier
=           Assignment Operator
5           Constant
;           Delimiter
b           Identifier
[]          Subscript Operator
while       Keyword
<=          Relational Operator
3           Constant
*           Arithmetic Operator
}           Special Character

Q. What do you mean by porting a compiler?
Solution: The process of modifying an existing compiler to work on a new machine is known as porting the compiler. Porting a compiler to a new host requires that the back end of the compiler's source code be rewritten to generate code for the new machine. This modified source is then compiled using the old compiler to produce a working version of the compiler for the new machine.


Transition diagram for relational operators (relop)

(The transition-diagram figure is not reproduced here; its six accepting states correspond to the following outputs.)
token is relop, lexeme is <
token is relop, lexeme is <=
token is relop, lexeme is <> (not equal)
token is relop, lexeme is =
token is relop, lexeme is >
token is relop, lexeme is >=

In the case of < or >, we need one character of look-ahead to see whether the lexeme is <, <=, <>, =, > or >=. We also need a global data structure that stores the characters read; in Lex, the lexeme is stored in yytext. We can recognize the lexeme by using the transition diagram: depending on the number of characters a relational operator consumes, we end up in a different accepting state, e.g., >= and > lead to different states. From the transition diagram it is clear that we can end up in six kinds of relop.

Transition diagram for identifiers: to reach the final state, the input must begin with a letter, followed by zero or more letters or digits.

#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>

void error(void) { printf("not an identifier\n"); exit(1); }

int main(void)
{
    int ch = getchar();
    if (isalpha(ch))            /* first character must be a letter */
        ch = getchar();
    else
        error();
    while (isalpha(ch) || isdigit(ch))   /* letters or digits may follow */
        ch = getchar();
    return 0;
}

Transition diagram for white spaces: to reach the final state, the input must contain a delimiter (tab or white space), followed by zero or more further delimiters, and then some other symbol.

Transition diagram for unsigned numbers

Transition diagram for unsigned numbers: We can have two kinds of unsigned numbers and hence need two transition diagrams to distinguish them. The first recognizes real numbers; the second recognizes integers.
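The two diagrams can be simulated directly in C. This is a minimal sketch assuming integers are digit+ and real numbers are digit+ . digit+ (the exponent part found in some languages is omitted); the function and constant names are invented for the example.

```c
#include <ctype.h>

/* Classification results for this sketch (invented constants). */
enum { NUM_NONE, NUM_INT, NUM_REAL };

/* Simulate the two transition diagrams on the whole string s:
   digit+ is an integer, digit+ . digit+ is a real number. */
int classify_unsigned(const char *s) {
    const char *p = s;
    if (!isdigit((unsigned char)*p)) return NUM_NONE;
    while (isdigit((unsigned char)*p)) p++;       /* integer part */
    if (*p == '\0') return NUM_INT;               /* accepting state: integer */
    if (*p != '.') return NUM_NONE;
    p++;                                          /* consume the dot */
    if (!isdigit((unsigned char)*p)) return NUM_NONE;
    while (isdigit((unsigned char)*p)) p++;       /* fraction part */
    return (*p == '\0') ? NUM_REAL : NUM_NONE;    /* accepting state: real */
}
```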

LEX- A Lexical Analyzer Generator:


The function of the lexical analyzer is to scan the source program and produce a stream of tokens as output. Therefore, the first thing required is to identify the keywords, the operators, and the delimiters: these are the tokens of the language. After identifying the tokens of the language, we must use a suitable notation to specify them; this notation should be compact, precise, and easy to understand. Regular expressions can be used to specify a set of strings, and a set of strings that can be specified by regular-expression notation is called a "regular set." The tokens of a programming language constitute a regular set; hence, this regular set can be specified using regular-expression notation. Therefore, we write regular expressions for things like operators, keywords, and identifiers. For example,

the regular expressions specifying a subset of the tokens of a typical programming language are as follows:

operators  = + | - | * | / | %
keywords   = if | while | do | then
letter     = a | b | c | d | ... | z | A | B | C | ... | Z
digit      = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
identifier = letter (letter | digit)*

The next step is the construction of a DFA from the regular expressions that specify the tokens of the language. The DFA is a flow-chart (graphical) representation of the lexical analyzer, so after constructing the DFA, the next step is to write a program in a suitable programming language that simulates it. This program acts as the token recognizer or lexical analyzer. We therefore find that by using regular expressions to specify the tokens, designing a lexical analyzer becomes a simple mechanical process: transform the regular expressions into finite automata and generate the program that simulates those automata. It is thus possible to automate the procedure of obtaining the lexical analyzer from the regular expressions specifying the tokens, and this is precisely what the tool LEX does.

Q. What is LEX?
LEX is a program generator designed for lexical processing of character input streams. It accepts a high-level, problem-oriented specification for character-string matching and produces a program in a general-purpose language that recognizes regular expressions. The regular expressions are specified by the user in the source specification given to LEX. The LEX-generated code recognizes these expressions in an input stream and partitions the input stream into strings matching the regular expressions.

Errors Detected By Lexical Analyzer:


The lexical analyzer detects the following types of errors:
1. Characters that cannot appear in any token of the source language, such as @ or #.
2. Integer constants out of bounds (e.g., outside -32768 to 32767 for a 16-bit signed int in C).
3. Identifier names that are too long (e.g., a maximum length of 32 characters in C).
4. Text strings that are too long (e.g., a maximum length of 256 characters in C).
5. Text strings that span more than one line (in C).

Read the following topics: 1. DFA and NFA; 2. Removal of epsilon productions; 3. Removal of unit productions; 4. Removal of useless symbols; 5. Removal of left recursion.

SYNTAX ANALYSIS
A syntax analyzer or parser is a program that performs syntax analysis. A parser obtains a string of tokens from the lexical analyzer and verifies whether the grouping of tokens is a valid construct of the source language, that is, whether it is in accordance with the grammar of the source language. If the tokens in a string are grouped according to the language's rules of syntax, then the string of tokens generated by the lexical analyzer is accepted as a valid construct of the language and a parse tree is produced; otherwise, an error handler is called. Hence, two issues are involved in designing the syntax-analysis phase of a compilation process:
1. All valid constructs of a programming language must be specified; using these specifications, a valid program is formed.
2. A suitable recognizer must be designed to recognize whether a string of tokens generated by the lexical analyzer is a valid construct or not.
Therefore, suitable notation must be used to specify the constructs of a language; this notation should be compact, precise, and easy to understand. The syntax structure of a programming language (i.e., its valid constructs) is specified using a context-free grammar (CFG), because for certain classes of grammar we can automatically construct an efficient parser that determines whether a source program is syntactically correct.

Classification of Parsing: Parsing is the process of determining whether a string can be generated by a grammar. Parsing is of two types:
1. Top-down parsing
2. Bottom-up parsing

Top-down Parsing
1. An attempt to derive w from the grammar's start symbol S by using the grammar of the language is known as top-down parsing.
2. Top-down parsing attempts to find a left-most derivation for an input string w.
3. Backtracking is a problem found in top-down parsing.
4. Examples: recursive-descent and LL(1) parsers.

Bottom up Parsing
1. An attempt to reduce w to the grammar's start symbol S by using the grammar of the language is known as bottom-up parsing.
2. Bottom-up parsing attempts to find a right-most derivation (in reverse) for an input string w.
3. No backtracking is required.
4. Examples: LR(0), LR(1), and LALR parsers.


TOP DOWN PARSING


Top-down parsing attempts to find a left-most derivation for an input string w, which is equivalent to constructing a parse tree for w that starts from the root and creates the nodes of the parse tree in a predefined order.

Q. Why does top-down parsing seek the left-most derivation of the input string?
The reason that top-down parsing seeks the left-most derivation for an input string w, and not the right-most derivation, is that the input string w is scanned by the parser from left to right, one symbol/token at a time, and the left-most derivation generates the leaves of the parse tree in left-to-right order, which matches the input scan order.

Back Tracking:
In the attempt to obtain the left-most derivation of the input string w, a parser may encounter a situation in which a nonterminal A is to be derived next and there are multiple A-productions, such as A → α1 | α2 | ... | αn. In such a situation, deciding which A-production to use for the derivation of A is a problem. The parser therefore selects one of the A-productions to derive A:
1. If this derivation finally leads to the derivation of w, the parser announces the successful completion of parsing.
2. Otherwise, the parser resets the input pointer to where it was when the nonterminal A was derived, and tries another A-production.
The parser continues this until it either announces the successful completion of parsing or reports failure after trying all of the alternatives.

Example 1: Consider the top-down parser for the following grammar:

S → aAb
A → cd | c

Let the input string be w = acb. The parser initially creates a tree consisting of a single node, labeled S, and the input pointer points to a, the first symbol of w. The parser then uses the S-production S → aAb to expand the tree as below.


The parser uses the S-production to expand the parse tree. The left-most leaf, labeled a, matches the first input symbol of w. Hence, the parser advances the input pointer to c, the second symbol of w, and considers the next leaf, labeled A. It then expands A using the first alternative, A → cd, to obtain the tree shown below.

The parser uses the first alternative for A to expand the tree. The parser now has a match for the second input symbol, c. So it advances the pointer to b, the third symbol of w, and compares it to the label of the next leaf. Since the label d does not match, the parser reports failure and goes back (backtracks) to A, as shown in the figure above. The parser also resets the input pointer to the second input symbol, the position it had when it first encountered A, and tries the second alternative, A → c, to obtain the tree shown in the figure below.

As all the symbols now match the string w = acb, this derivation is accepted.
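The backtracking traced above can be sketched in C for this particular grammar (S → aAb, A → cd | c). The function names are invented for the sketch; the loop over alt is the backtracking step, which resets the input pointer to position 1 and retries the next A-alternative after a mismatch.

```c
#include <string.h>

/* Match A at position i. alt selects the alternative to try:
   0 = the first alternative "cd", 1 = the second alternative "c".
   Returns the position just after A, or -1 on mismatch. */
static int matchA(const char *s, int i, int alt) {
    if (alt == 0) return (s[i] == 'c' && s[i + 1] == 'd') ? i + 2 : -1;
    return (s[i] == 'c') ? i + 1 : -1;
}

/* Parse w against S -> aAb, backtracking over the A-alternatives. */
int parseS(const char *w) {
    int n = (int)strlen(w);
    if (w[0] != 'a') return 0;
    for (int alt = 0; alt < 2; alt++) {   /* try A -> cd first, then A -> c */
        int j = matchA(w, 1, alt);
        if (j >= 0 && w[j] == 'b' && j + 1 == n) return 1;
        /* mismatch: reset the input pointer to position 1 and retry */
    }
    return 0;
}
```

On w = acb the first alternative fails at d, the parser backtracks, and the second alternative succeeds, exactly as in the trace above.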

Example 2: Consider the grammar S → aa | aSa. If a top-down backtracking parser for this grammar tries S → aSa before S → aa, show that the parser succeeds on two occurrences of a. In the case of two occurrences of a, the parser will first expand S.

The first input symbol matches the left-most leaf. Therefore, the parser will advance the pointer to a second a and consider the nonterminal S for expansion in order to obtain the tree shown below.

The parser advances the pointer to a second occurrence of a. The second input symbol also matches. Therefore, the parser will consider the next leaf labeled S and expand it

The parser expands the next leaf labeled S


The parser now finds that there is no match. Therefore, it backtracks to S, as shown by the thick arrow in the figure below. The parser then continues matching and backtracking, as shown in the figures below, until it arrives at the required parse tree.


Question: Consider the grammar S → aa | aSa. If a top-down backtracking parser for this grammar tries S → aSa before S → aa, show that the parser succeeds on four occurrences of a (aaaa) but not on six occurrences of a (aaaaaa). Also check whether the parser succeeds on eight occurrences of a (aaaaaaaa).
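One way to reproduce the behaviour the question asks about is a naive recursive-descent parser in C in which each call to S commits to its first successful alternative and is never asked for a different parse (a common limitation of simple backtracking implementations). The function names are invented for the sketch.

```c
#include <string.h>

/* Match S at position i, trying S -> aSa before S -> aa.
   Each call returns only its *first* success (naive backtracking),
   so an alternative chosen deep inside is never reconsidered.
   Returns the position just after S, or -1 on failure. */
static int matchS(const char *s, int i) {
    if (s[i] != 'a') return -1;
    int j = matchS(s, i + 1);            /* try S -> aSa first */
    if (j >= 0 && s[j] == 'a') return j + 1;
    if (s[i + 1] == 'a') return i + 2;   /* then try S -> aa   */
    return -1;
}

/* Accept w iff the first parse found consumes the whole input. */
int parses(const char *w) {
    return matchS(w, 0) == (int)strlen(w);
}
```

Under this discipline the parser accepts aa, aaaa and (perhaps surprisingly) aaaaaaaa, but rejects aaaaaa, even though a^6 is derivable from the grammar; a parser that backtracked over all alternative parses of the inner S would accept every even-length string of a's.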

FIRST and FOLLOW:


Before studying top-down predictive parsing, a student must have knowledge of FIRST and FOLLOW. FIRST and FOLLOW are two functions associated with a grammar that help us fill in the entries of the predictive parsing table. FIRST and FOLLOW are computed for the nonterminals of the grammar.

FIRST(): This function gives the set of terminals that can begin the strings derived from a given symbol.
FOLLOW(): This function gives the set of terminals that can appear immediately to the right of a given nonterminal in some sentential form.

Algorithm/Rules to compute FIRST():


Consider a grammar G = {V, T, P, S}. For every production in P:
1. If the production is of the form V → t X, where t is a terminal and X ∈ (V+T)* (the right-hand side starts with a terminal), then FIRST(V) includes {t}.
2. If the production is of the form V → ε (an epsilon production), then FIRST(V) includes {ε}.
3. If the production is of the form V → V1 X, where V1 is a nonterminal and X ∈ (V+T)*, check whether FIRST(V1) contains ε or not:
   (i) If FIRST(V1) contains ε, then FIRST(V) includes (FIRST(V1) - {ε}) ∪ FIRST(X).
   (ii) If FIRST(V1) does not contain ε, then FIRST(V) includes FIRST(V1).

Note: While calculating FIRST, start from the bottom of the grammar and look at the left-hand sides of the productions to find the nonterminal whose FIRST you want to calculate.
SibaramaPanigrahi, Lecture in CSE Page 16

Example 1: Find the FIRST sets of the following grammar.
S → ACB | CbB | Ba
A → da | BC
B → g | ε
C → h | ε

HINT: To find FIRST(C), look at the productions with C on the left-hand side; here we use C → h | ε.

Solution:
FIRST(C) = FIRST(h) ∪ FIRST(ε) = {h} ∪ {ε} = {h, ε}
FIRST(B) = FIRST(g) ∪ FIRST(ε) = {g} ∪ {ε} = {g, ε}
FIRST(BC) = (FIRST(B) - {ε}) ∪ FIRST(C)   [since FIRST(B) contains ε]
          = {g} ∪ {h, ε} = {g, h, ε}
FIRST(A) = FIRST(da) ∪ FIRST(BC) = {d} ∪ {g, h, ε} = {d, g, h, ε}
FIRST(ACB) = (FIRST(A) - {ε}) ∪ FIRST(CB)   [since FIRST(A) contains ε]
           = {d, g, h} ∪ (FIRST(C) - {ε}) ∪ FIRST(B)   [since FIRST(C) contains ε]
           = {d, g, h} ∪ {h} ∪ {g, ε} = {d, g, h, ε}
FIRST(CbB) = (FIRST(C) - {ε}) ∪ FIRST(bB)   [since FIRST(C) contains ε]
           = {h} ∪ {b} = {h, b}
FIRST(Ba) = (FIRST(B) - {ε}) ∪ FIRST(a)   [since FIRST(B) contains ε]
          = {g} ∪ {a} = {g, a}
FIRST(S) = FIRST(ACB) ∪ FIRST(CbB) ∪ FIRST(Ba)
         = {d, g, h, ε} ∪ {h, b} ∪ {g, a} = {a, b, d, g, h, ε}


Example 2: Find the FIRST sets of the following grammar.
E → TA
A → +TA | ε
T → FB
B → *FB | ε
F → (E) | id

Solution:
FIRST(F) = FIRST((E)) ∪ FIRST(id) = {(} ∪ {id} = {(, id}
FIRST(B) = FIRST(*FB) ∪ FIRST(ε) = {*} ∪ {ε} = {*, ε}
FIRST(T) = FIRST(FB) = FIRST(F) = {(, id}   [since FIRST(F) does not contain ε]
FIRST(A) = FIRST(+TA) ∪ FIRST(ε) = {+, ε}
FIRST(E) = FIRST(TA) = FIRST(T) = {(, id}   [since FIRST(T) does not contain ε]

Algorithm/Rules to compute FOLLOW():


Consider a grammar G = {V, T, P, S}. For every production in P:
1. {$} is included in the FOLLOW of the start symbol S.
2. If a production is of the form V → αXY, where X is a nonterminal and Y ∈ (V+T)* is the part that follows X, check whether FIRST(Y) contains ε or not:
   (i) If FIRST(Y) contains ε, then FOLLOW(X) includes (FIRST(Y) - {ε}) ∪ FOLLOW(V).
   (ii) If FIRST(Y) does not contain ε, then FOLLOW(X) includes FIRST(Y).
3. If a production is of the form V → αX, where X is a nonterminal at the end of the right-hand side, then FOLLOW(X) includes FOLLOW(V).

Note: While calculating FOLLOW, start from the top of the grammar and look at the right-hand sides of the productions to find the nonterminal whose FOLLOW you want to calculate.

Example 1: Find the FOLLOW sets of the following grammar.
E → TA
A → +TA | ε
T → FB
B → *FB | ε
F → (E) | id

Note (refer to Example 2 of FIRST):
FIRST(E) = {(, id}   FIRST(A) = {+, ε}   FIRST(T) = {(, id}   FIRST(B) = {*, ε}   FIRST(F) = {(, id}

Solution:
FOLLOW(E) = FIRST()) ∪ {$} = {), $}   [E is followed by ) in F → (E); rule 1 adds $]
FOLLOW(A) = FOLLOW(E) ∪ FOLLOW(A) = {), $}   [applying rule 3 to E → TA and A → +TA]
FOLLOW(T) = (FIRST(A) - {ε}) ∪ FOLLOW(E) ∪ (FIRST(A) - {ε}) ∪ FOLLOW(A)   [applying rule 2, case (i), to E → TA and A → +TA]
          = {+} ∪ {), $} ∪ {+} ∪ {), $} = {+, ), $}
FOLLOW(B) = FOLLOW(T) ∪ FOLLOW(B) = {+, ), $}   [applying rule 3 to T → FB and B → *FB]
FOLLOW(F) = (FIRST(B) - {ε}) ∪ FOLLOW(T) ∪ (FIRST(B) - {ε}) ∪ FOLLOW(B)   [applying rule 2, case (i), to T → FB and B → *FB]
          = {*} ∪ {+, ), $} ∪ {*} ∪ {+, ), $} = {+, *, ), $}
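The FIRST and FOLLOW computations above can be automated as fixed-point iterations: keep applying the rules until no set changes. The following C sketch hard-codes the expression grammar, with 'i' standing for id, an empty right-hand side standing for ε, and '.' standing for ε inside a stored set; all names are invented for the sketch.

```c
#include <stdbool.h>
#include <string.h>

/* Grammar: E->TA  A->+TA|eps  T->FB  B->*FB|eps  F->(E)|i */
enum { NNT = 5, NPROD = 8 };
static const char NT[NNT + 1] = "EATBF";
static const char LHS[NPROD + 1] = "EAATBBFF";
static const char *RHS[NPROD] = { "TA", "+TA", "", "FB", "*FB", "", "(E)", "i" };

static char FIRST[NNT][16], FOLLOW[NNT][16];   /* sets stored as strings */

static int  idx(char c)  { return (int)(strchr(NT, c) - NT); }
static bool isNT(char c) { return c && strchr(NT, c) != NULL; }

/* Insert c into the set; report whether the set changed. */
static bool add(char *set, char c) {
    if (strchr(set, c)) return false;
    size_t n = strlen(set);
    set[n] = c; set[n + 1] = '\0';
    return true;
}

static void compute_first(void) {
    for (bool changed = true; changed; ) {
        changed = false;
        for (int p = 0; p < NPROD; p++) {
            char *f = FIRST[idx(LHS[p])];
            for (const char *q = RHS[p]; ; q++) {
                if (!*q) { changed |= add(f, '.'); break; }   /* all nullable */
                if (!isNT(*q)) { changed |= add(f, *q); break; }
                const char *sub = FIRST[idx(*q)];
                for (const char *c = sub; *c; c++)
                    if (*c != '.') changed |= add(f, *c);     /* FIRST - eps */
                if (!strchr(sub, '.')) break;   /* *q not nullable: stop */
            }
        }
    }
}

static void compute_follow(void) {
    add(FOLLOW[idx('E')], '$');                 /* rule 1: start symbol */
    for (bool changed = true; changed; ) {
        changed = false;
        for (int p = 0; p < NPROD; p++) {
            const char *fa = FOLLOW[idx(LHS[p])];
            for (const char *q = RHS[p]; *q; q++) {
                if (!isNT(*q)) continue;
                char *fx = FOLLOW[idx(*q)];
                bool nullable = true;           /* can the rest derive eps? */
                for (const char *t = q + 1; *t && nullable; t++) {
                    nullable = false;
                    if (!isNT(*t)) { changed |= add(fx, *t); break; }
                    const char *sub = FIRST[idx(*t)];
                    for (const char *c = sub; *c; c++)
                        if (*c != '.') changed |= add(fx, *c);
                    if (strchr(sub, '.')) nullable = true;
                }
                if (nullable)                   /* rules 2(i) and 3 */
                    for (const char *c = fa; *c; c++) changed |= add(fx, *c);
            }
        }
    }
}
```

Running compute_first() and then compute_follow() reproduces the sets derived by hand above, e.g. FOLLOW(F) = {+, *, ), $}.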

Top Down Predictive Parsers:


A predictive parser is an efficient way of implementing recursive-descent parsing: a stack is maintained for handling the activation records. In top-down predictive parsing the parser is able to predict the right alternative for the expansion of a nonterminal during the parsing process, and hence backtracking is not required.

The parse table is a two-dimensional array M[X, a], where X is a nonterminal and a is a terminal of the grammar.


In the two-dimensional parse table, each row corresponds to a nonterminal and each column corresponds to a terminal. It is possible to build a non-recursive predictive parser that maintains a stack explicitly, rather than implicitly via recursive calls. A table-driven predictive parser has an input buffer, a stack, a parsing table, and an output stream. The input buffer contains the string to be parsed, followed by $, a symbol used as a right-end marker to indicate the end of the input string. The stack contains a sequence of grammar symbols with $ at the bottom, indicating the bottom of the stack; initially the stack contains the start symbol of the grammar on top of $. The parsing table is a two-dimensional array M[X, a], where X is a nonterminal and a is a terminal or the symbol $. The key problem during predictive parsing is determining the production to be applied for a nonterminal; the non-recursive parser looks up the production to be applied in the parsing table.

Algorithm for Creating a Predictive Parsing Table:
Step 1: Compute FIRST and FOLLOW for every nonterminal of the grammar.
Step 2: For every production A → α, check whether it is an epsilon production or not:
1. If A → α is not an epsilon production, then for every terminal b in FIRST(α) do TABLE[A, b] = A → α.
2. If it is an epsilon production, i.e. A → ε, then for every b in FOLLOW(A) do TABLE[A, b] = A → ε.

Example: For the following grammar, draw the predictive parsing table and check whether id+id*id is acceptable by the grammar or not.
E → TA
A → +TA | ε
T → FB
B → *FB | ε
F → (E) | id

Solution:
Step 1: Find the FIRST and FOLLOW sets of the grammar. These can be taken from Example 2 of FIRST and Example 1 of FOLLOW above.


Note (refer to Example 2 of FIRST):
FIRST(E) = {(, id}   FIRST(A) = {+, ε}   FIRST(T) = {(, id}   FIRST(B) = {*, ε}   FIRST(F) = {(, id}

Note (refer to Example 1 of FOLLOW):
FOLLOW(E) = {), $}   FOLLOW(A) = {), $}   FOLLOW(T) = {+, ), $}   FOLLOW(B) = {+, ), $}   FOLLOW(F) = {+, *, ), $}

Step 2: Draw the predictive parse table.

         id        +         *         (         )         $
E        E → TA                        E → TA
A                  A → +TA                       A → ε     A → ε
T        T → FB                        T → FB
B                  B → ε     B → *FB             B → ε     B → ε
F        F → id                        F → (E)

Checking for the acceptance of id+id*id:

Stack      Input          Production Used
E$         id+id*id$      E → TA
TA$        id+id*id$      T → FB
FBA$       id+id*id$      F → id
idBA$      id+id*id$      id is cancelled from stack and input string
BA$        +id*id$        B → ε
A$         +id*id$        A → +TA
+TA$       +id*id$        + is cancelled from stack and input string
TA$        id*id$         T → FB
FBA$       id*id$         F → id
idBA$      id*id$         id is cancelled from stack and input string
BA$        *id$           B → *FB
*FBA$      *id$           * is cancelled from stack and input string
FBA$       id$            F → id
idBA$      id$            id is cancelled from stack and input string
BA$        $              B → ε
A$         $              A → ε
$          $              Accepted

As the string id+id*id is derived from the start symbol E, the string is acceptable by the above grammar.
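The stack-driven algorithm traced above can be sketched in C. The parse table is hard-coded as the function prod, with 'i' standing for the token id, an empty string standing for ε, and NULL marking an error entry; the input is assumed to end with '$'. The names are invented for the sketch.

```c
#include <string.h>

/* Hard-coded parse table for the grammar
   E -> TA   A -> +TA | eps   T -> FB   B -> *FB | eps   F -> (E) | id. */
static const char *prod(char nt, char a) {
    switch (nt) {
    case 'E': if (a == 'i' || a == '(') return "TA";  break;
    case 'A': if (a == '+') return "+TA";
              if (a == ')' || a == '$') return "";    break;
    case 'T': if (a == 'i' || a == '(') return "FB";  break;
    case 'B': if (a == '*') return "*FB";
              if (a == '+' || a == ')' || a == '$') return ""; break;
    case 'F': if (a == 'i') return "i";
              if (a == '(') return "(E)";             break;
    }
    return NULL;                              /* error entry */
}

/* Table-driven predictive parse; the input must end with '$'. */
int ll1_parse(const char *input) {
    char stack[128];
    int top = 0;
    stack[top++] = '$';
    stack[top++] = 'E';                       /* start symbol on top of $ */
    const char *ip = input;
    while (top > 0) {
        char X = stack[--top];
        if (X == 'i' || X == '$' || strchr("+*()", X)) {   /* terminal */
            if (X != *ip) return 0;
            ip++;                             /* cancel from stack and input */
        } else {                              /* nonterminal: consult table */
            const char *rhs = prod(X, *ip);
            if (rhs == NULL) return 0;
            for (int k = (int)strlen(rhs) - 1; k >= 0; k--)
                stack[top++] = rhs[k];        /* push the RHS in reverse */
        }
    }
    return *ip == '\0';                       /* accepted: input consumed */
}
```

Calling ll1_parse("i+i*i$") performs exactly the sequence of moves shown in the trace above and accepts the string.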


The heart of the table-driven predictive parser is the parsing table: the parser looks at the parsing table to decide which alternative is the right choice for the expansion of a nonterminal during the parsing of the input string. Hence, constructing a table-driven predictive parser can be considered equivalent to constructing the parsing table. While drawing the parsing table, we have to check whether there exist multiple entries for the same cell of the predictive parsing table.
Case 1: If there are no multiple entries in any cell of the predictive parsing table, then the parser is deterministic, and backtracking is not required while deriving a string using the table.
Case 2: If there are multiple entries in some cell of the predictive parsing table, then the parser is still non-deterministic in predicting which production to use, and backtracking is required while deriving a string using the table.

LL(1) PARSER
A given grammar is LL(1) if its parsing table contains no multiple entries; if the table contains multiple entries, then the grammar is not LL(1). In the acronym LL(1), the first L stands for the left-to-right scan of the input, the second L stands for the left-most derivation, and the (1) indicates that the next input symbol is used to decide the next parsing action (i.e., the length of the look-ahead is 1).

Algorithm For Checking LL(1) Grammar:


For every pair of productions of the form A → α | β, check:
Condition 1: FIRST(α) ∩ FIRST(β) = ∅.
Condition 2: If FIRST(β) contains ε (and FIRST(α) does not contain ε), then FIRST(α) ∩ FOLLOW(A) = ∅.
If these two conditions are satisfied for every such pair, the grammar is an LL(1) grammar; otherwise it is not.
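Both conditions reduce to set-intersection tests. A small C sketch, representing a set of terminals as a string of characters (the name disjoint is invented):

```c
#include <string.h>

/* Return 1 if the two terminal sets (written as strings) share no element. */
static int disjoint(const char *a, const char *b) {
    for (const char *p = a; *p; p++)
        if (strchr(b, *p)) return 0;
    return 1;
}
```

For A → +TA | ε, condition 2 is disjoint("+", ")$"), since FIRST(+TA) = {+} and FOLLOW(A) = {), $}; for a grammar like S → aa | aSa, both alternatives start with a, so disjoint("a", "a") fails and the grammar is not LL(1).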


Example 1: Check whether the following grammar is LL(1) or not.
E → TA
A → +TA | ε
T → FB
B → *FB | ε
F → (E) | id

Solution:
In the above grammar there are three productions of the form A → α | β:
A → +TA | ε
B → *FB | ε
F → (E) | id

For the production A → +TA | ε (here α = +TA and β = ε):
Condition 1: FIRST(+TA) ∩ FIRST(ε) = {+} ∩ {ε} = ∅   [Condition 1 satisfied]
Condition 2: FIRST(β) contains ε, so find FOLLOW(A) ∩ FIRST(+TA) = {), $} ∩ {+} = ∅   [refer to Example 1 of FOLLOW; Condition 2 satisfied]
This production satisfies the LL(1) conditions.

For the production B → *FB | ε (here α = *FB and β = ε):
Condition 1: FIRST(*FB) ∩ FIRST(ε) = {*} ∩ {ε} = ∅   [Condition 1 satisfied]
Condition 2: FIRST(β) contains ε, so find FOLLOW(B) ∩ FIRST(*FB) = {+, ), $} ∩ {*} = ∅   [refer to Example 1 of FOLLOW; Condition 2 satisfied]
This production satisfies the LL(1) conditions.

For the production F → (E) | id (here α = (E) and β = id):
Condition 1: FIRST((E)) ∩ FIRST(id) = {(} ∩ {id} = ∅   [Condition 1 satisfied]
Since neither FIRST(α) nor FIRST(β) contains ε, Condition 2 need not be checked; hence this production also satisfies the LL(1) conditions.

As all the productions of the form A → α | β satisfy the conditions, the grammar is LL(1).
