LR Parsing Techniques and Examples

The document discusses various types of LR parsers, including Simple LR, SLR, and Canonical LR(1) parsers, detailing their construction and functionality. It explains the importance of LR parsers in recognizing programming language constructs and their efficiency in detecting syntactic errors. Additionally, the document covers the construction of parsing tables and the use of items and automata in the parsing process.

Uploaded by

Pramod Shenoy
UNIT-3

TOPICS IN UNIT-3
• Simple LR Parser (LR(0) Parser, SLR Parser)
• More Powerful LR Parsers: Canonical LR(1) Items, Canonical LR(1) Parsing Table, Constructing LALR Parsing Tables, Parser Generators
• Syntax-Directed Translation: Syntax-Directed Definitions, Evaluation Order of SDDs, Applications of SDTs
LR PARSER: INTRODUCTION
• The most prevalent type of bottom-up parser today is based
on a concept called LR(k) parsing; the "L" is for left-to-right
scanning of the input, the "R" for constructing a rightmost
derivation in reverse, and the ‘k’ for the number of input
symbols of lookahead that are used in making parsing
decisions.
• The cases k = 0 and k = 1 are of practical interest, and we shall only consider LR parsers with k ≤ 1 here. When (k) is omitted, k is assumed to be 1.
• An LR parser is driven by a parsing table whose states are built from sets of items.
LR PARSER: WORKING
• An LR parser has an input, a stack and a parsing table.
• The stack contains a string of the form S0, S1, …, Sm, where Sm is on top of the stack.
• The input a1 a2 ... an $ is read from left to right, one symbol at a time.
• The parsing table has two parts: the ACTION and GOTO functions.
• Procedure: the LR parser program (driver routine) determines Sm (the state currently on top of the stack) and ai (the current input symbol). It then consults ACTION[Sm, ai], which can have one of the following values:
- Shift S
- Reduce A → β
- Accept
- Error
[Figure: model of an LR parser. The input a1 a2 ... an $ and the stack S0 ... Sm feed the driver routine, which consults the parsing table.]
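The procedure above can be sketched as a short driver loop. This is a minimal illustration, not a definitive implementation: the table is the one for the exercise grammar S → BB, B → cB | d that appears later in these slides, while the dictionary layout and the name `lr_parse` are assumptions made for the example.

```python
# Productions numbered as in the exercise: (1) S -> BB, (2) B -> cB, (3) B -> d.
# Each entry is (left-hand side, length of the right-hand side).
PRODUCTIONS = {1: ("S", 2), 2: ("B", 2), 3: ("B", 1)}

# ACTION[state][terminal] is ("s", next_state), ("r", production) or ("acc",).
ACTION = {
    0: {"c": ("s", 3), "d": ("s", 4)},
    1: {"$": ("acc",)},
    2: {"c": ("s", 3), "d": ("s", 4)},
    3: {"c": ("s", 3), "d": ("s", 4)},
    4: {"c": ("r", 3), "d": ("r", 3), "$": ("r", 3)},
    5: {"$": ("r", 1)},
    6: {"c": ("r", 2), "d": ("r", 2), "$": ("r", 2)},
}
GOTO = {0: {"S": 1, "B": 2}, 2: {"B": 5}, 3: {"B": 6}}


def lr_parse(tokens):
    """Return the production numbers used, in order, or None on an error."""
    stack = [0]                      # S0 is the initial state
    tokens = list(tokens) + ["$"]    # append the end marker
    pos = 0
    reductions = []
    while True:
        state, symbol = stack[-1], tokens[pos]
        action = ACTION[state].get(symbol)
        if action is None:           # blank table entry: error
            return None
        if action[0] == "s":         # shift: push the new state, advance input
            stack.append(action[1])
            pos += 1
        elif action[0] == "r":       # reduce A -> beta: pop |beta| states
            lhs, length = PRODUCTIONS[action[1]]
            del stack[len(stack) - length:]
            stack.append(GOTO[stack[-1]][lhs])
            reductions.append(action[1])
        else:                        # accept
            return reductions


print(lr_parse("cdd"))   # sequence of reductions for the input c d d
print(lr_parse("cd"))    # None: the input is rejected
```

The list of reductions is a rightmost derivation in reverse, which is what the "R" in LR refers to.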
IMPORTANCE OF LR PARSERS
• LR parsers can be constructed to recognize virtually all
programming language constructs for which context-
free grammars can be written. Non-LR context-free
grammars exist, but these can generally be avoided for
typical programming-language constructs.
• The LR-parsing method is the most general
nonbacktracking shift-reduce parsing method known,
yet it can be implemented as efficiently as other, more
primitive shift-reduce methods.
• An LR parser can detect a syntactic error as soon as
it is possible to do so on a left-to-right scan of the input.
• LR grammars can describe more languages than LL
grammars.
DRAWBACK OF LR PARSER
• The principal drawback of the LR method is that it is too much
work to construct an LR parser by hand for a typical
programming-language grammar.
• A specialized tool, an LR parser generator, is needed. Example:
YACC
• Such a generator takes a context-free grammar and
automatically produces a parser for that grammar.
• If the grammar contains ambiguities or other constructs that
are difficult to parse in a left-to-right scan of the input, then the
parser generator locates these constructs and provides
detailed diagnostic messages.
ITEMS and the LR(0) AUTOMATON
• How does an LR parser know when to shift and when to reduce?
• An LR parser makes shift-reduce decisions by maintaining states that keep track of where we are in a parse.
• States represent sets of "items."
• An LR(0) item (item for short) of a grammar G is a production of G with a dot at some position of the body.
• Thus, production A → XYZ yields the four items:
A → .XYZ (the parser hopes to see a string derivable from XYZ next on the input)
A → X.YZ (the parser has just seen on the input a string derivable from X and hopes to see a string derivable from YZ)
A → XY.Z (the parser has seen a string derivable from XY and hopes to see a string derivable from Z)
A → XYZ. (the parser has seen XYZ, and it may be time to reduce XYZ to A)
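As a small illustration, the items of a production can be enumerated by placing the dot at every position of the body. The triple representation `(head, body, dot)` and the function names are assumptions made for this sketch.

```python
def lr0_items(head, body):
    """Return every LR(0) item of head -> body as (head, body, dot) triples."""
    body = tuple(body)
    return [(head, body, dot) for dot in range(len(body) + 1)]


def show(item):
    """Render an item with an explicit dot, for example 'A -> X . Y Z'."""
    head, body, dot = item
    symbols = list(body[:dot]) + ["."] + list(body[dot:])
    return f"{head} -> {' '.join(symbols)}"


for item in lr0_items("A", ["X", "Y", "Z"]):
    print(show(item))
```

A production with n symbols in its body always yields n + 1 items, which is why A → XYZ yields four.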
CANONICAL LR(0) / LR(0) / SIMPLE LR (SLR)
• One collection of sets of LR(0) items, called the canonical LR(0) collection, provides the basis for constructing a deterministic finite automaton that is used to make parsing decisions.
• Such an automaton is called an LR(0) automaton.
• In particular, each state of the LR(0) automaton represents a set of items in the canonical LR(0) collection.
• To construct the canonical LR(0) collection for a grammar, we define an augmented grammar and two functions, CLOSURE and GOTO.
• If G is a grammar with start symbol S, then G', the augmented grammar for G, is G with a new start symbol S' and production S' → S.
• The purpose of this new starting production is to indicate to the parser when it should stop parsing and announce acceptance of the input.
• That is, acceptance occurs when and only when the parser is about to reduce by S' → S.
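Augmenting a grammar is mechanical, as the following sketch suggests. It assumes a grammar stored as a dictionary from nonterminals to lists of bodies; that layout and the name `augment` are illustrative choices, not something the slides prescribe.

```python
def augment(grammar, start):
    """Return (augmented grammar, new start symbol) with S' -> S added."""
    new_start = start + "'"
    while new_start in grammar:          # keep the new symbol fresh
        new_start += "'"
    augmented = {new_start: [(start,)]}  # the single production S' -> S
    augmented.update(grammar)
    return augmented, new_start


G = {"S": [("B", "B")], "B": [("c", "B"), ("d",)]}
G_aug, s_prime = augment(G, "S")
print(s_prime, G_aug[s_prime])
```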
CLOSURE OF ITEM SETS

• If I is a set of items for a grammar G, then CLOSURE(I) is the set of items constructed from I by the two rules:
• Initially, add every item in I to CLOSURE(I).
• If A → α.Bβ is in CLOSURE(I) and B → γ is a production, then add the item B → .γ to CLOSURE(I), if it is not already there.
• Apply the second rule until no more new items can be added to CLOSURE(I).
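The two rules translate into a worklist loop. The sketch below assumes items are `(head, body, dot)` triples and the grammar is a dictionary from nonterminals to bodies; both are illustrative choices. The grammar used is the augmented expression grammar from problem 1 of these slides.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar):
    """CLOSURE(I): repeatedly add B -> .gamma for items with the dot before B."""
    result = set(items)              # rule 1: every item of I is in CLOSURE(I)
    worklist = list(items)
    while worklist:
        _, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in grammar:   # dot before nonterminal B
            for gamma in grammar[body[dot]]:
                item = (body[dot], gamma, 0)           # the item B -> .gamma
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


I0 = closure({("E'", ("E",), 0)}, GRAMMAR)
print(len(I0))   # number of items in the initial state
```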
GOTO FUNCTION
• The second useful function is GOTO(I, X), where I is a set of items and X is a grammar symbol.
• GOTO(I, X) is defined to be the closure of the set of all items [A → αX.β] such that [A → α.Xβ] is in I.
• Intuitively, the GOTO function is used to define the transitions in the LR(0) automaton for a grammar.
• The states of the automaton correspond to sets of items, and GOTO(I, X) specifies the transition from the state for I under input X.
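A possible rendering of GOTO in code: move the dot over X in every item that allows it, then take the closure. The item and grammar representations are the same illustrative assumptions as before, and `closure` is repeated so the sketch runs on its own.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar):
    result, worklist = set(items), list(items)
    while worklist:
        _, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in grammar:
            for gamma in grammar[body[dot]]:
                item = (body[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(items, x, grammar):
    """GOTO(I, X): closure of all [A -> alpha X . beta] with [A -> alpha . X beta] in I."""
    moved = {(head, body, dot + 1)
             for head, body, dot in items
             if dot < len(body) and body[dot] == x}
    return closure(moved, grammar)


# The set {E' -> E., E -> E.+T} under "+" gives the state containing E -> E+.T
I1 = frozenset({("E'", ("E",), 1), ("E", ("E", "+", "T"), 1)})
I6 = goto(I1, "+", GRAMMAR)
print(len(I6))
```

An empty result means the automaton has no transition on X from that state.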
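1) Construct LR(0) / SLR / Canonical LR(0) parser for the following grammar:
E → E+T | T
T → T*F | F
F → (E) | id
Ans: The canonical LR(0) collection for the augmented grammar (with E' → E added) is:
I0: E' → .E; E → .E+T; E → .T; T → .T*F; T → .F; F → .(E); F → .id
I1: E' → E.; E → E.+T
I2: E → T.; T → T.*F
I3: T → F.
I4: F → (.E); E → .E+T; E → .T; T → .T*F; T → .F; F → .(E); F → .id
I5: F → id.
I6: E → E+.T; T → .T*F; T → .F; F → .(E); F → .id
I7: T → T*.F; F → .(E); F → .id
I8: F → (E.); E → E.+T
I9: E → E+T.; T → T.*F
I10: T → T*F.
I11: F → (E).
Transitions of the LR(0) automaton:
GOTO(I0, E) = I1, GOTO(I0, T) = I2, GOTO(I0, F) = I3, GOTO(I0, "(") = I4, GOTO(I0, id) = I5
GOTO(I1, +) = I6; on $ in I1 the parser accepts
GOTO(I2, *) = I7
GOTO(I4, E) = I8, GOTO(I4, T) = I2, GOTO(I4, F) = I3, GOTO(I4, "(") = I4, GOTO(I4, id) = I5
GOTO(I6, T) = I9, GOTO(I6, F) = I3, GOTO(I6, "(") = I4, GOTO(I6, id) = I5
GOTO(I7, F) = I10, GOTO(I7, "(") = I4, GOTO(I7, id) = I5
GOTO(I8, ")") = I11, GOTO(I8, +) = I6
GOTO(I9, *) = I7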
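The canonical collection for this grammar can be generated by starting from CLOSURE({E' → .E}) and applying GOTO until no new item sets appear. The sketch below is one way to do it under assumed representations (items as `(head, body, dot)` triples, a dictionary grammar, and the hypothetical name `canonical_collection`). The states it discovers are the same twelve sets I0 to I11, though its numbering may differ from the one used in the answer.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar):
    result, worklist = set(items), list(items)
    while worklist:
        _, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in grammar:
            for gamma in grammar[body[dot]]:
                item = (body[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(items, x, grammar):
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == x}
    return closure(moved, grammar)


def canonical_collection(grammar, start):
    """Return (list of item sets, transitions {(state, symbol): state})."""
    symbols = sorted({s for bodies in grammar.values() for b in bodies for s in b})
    states = [closure({(start, grammar[start][0], 0)}, grammar)]
    index = {states[0]: 0}
    transitions = {}
    for state in states:                 # the list grows while we iterate
        for x in symbols:
            target = goto(state, x, grammar)
            if not target:
                continue                 # no transition on x
            if target not in index:
                index[target] = len(states)
                states.append(target)
            transitions[(index[state], x)] = index[target]
    return states, transitions


states, transitions = canonical_collection(GRAMMAR, "E'")
print(len(states), len(transitions))
```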
Parsing table construction for Problem-1
Step-1) Separate the productions and number them:
E → E+T (1)
E → T (2)
T → T*F (3)
T → F (4)
F → (E) (5)
F → id (6)
Step-2) Compute FIRST and FOLLOW sets:
FIRST(E) = {(, id}
FIRST(T) = {(, id}
FIRST(F) = {(, id}
FOLLOW(E) = {$, +, )}
FOLLOW(T) = {$, +, ), *}
FOLLOW(F) = {$, +, ), *}
When "." is at the end of an item, add r(production-number) to the columns of FOLLOW(symbol), where symbol is the LHS of the production.
Example: For E → T. in I2, place r2 in the FOLLOW(E) = {$, +, )} columns.
Columns +, *, (, ), id, $ form the ACTION part; columns E, T, F form the GOTO part.
STATE  +    *    (    )    id   $    | E   T   F
0                s4        s5        | 1   2   3
1      s6                       acc  |
2      r2   s7        r2        r2   |
3      r4   r4        r4        r4   |
4                s4        s5        | 8   2   3
5      r6   r6        r6        r6   |
6                s4        s5        |     9   3
7                s4        s5        |         10
8      s6             s11            |
9      r1   s7        r1        r1   |
10     r3   r3        r3        r3   |
11     r5   r5        r5        r5   |
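The FIRST and FOLLOW sets used for the reduce entries can be checked with a small fixed-point computation. The sketch assumes the grammar has no ε-productions, which holds for this grammar but not in general, so it is a simplification of the usual algorithm. The function names are illustrative.

```python
GRAMMAR = [
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def first_sets():
    """FIRST for each nonterminal, assuming no epsilon-productions."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            x = body[0]                  # only the first symbol matters here
            new = first[x] if x in NONTERMINALS else {x}
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first


def follow_sets(start, first):
    """FOLLOW for each nonterminal, assuming no epsilon-productions."""
    follow = {n: set() for n in NONTERMINALS}
    follow[start].add("$")               # the end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            for i, sym in enumerate(body):
                if sym not in NONTERMINALS:
                    continue
                if i + 1 < len(body):    # something follows sym in the body
                    nxt = body[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                    # sym ends the body: inherit FOLLOW(head)
                    new = follow[head]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


FIRST = first_sets()
FOLLOW = follow_sets("E", FIRST)
print(FIRST)
print(FOLLOW)
```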
LR PARSING TABLE CONSTRUCTION ALGORITHM

• Input: C, the canonical collection of sets of LR(0) items for the augmented grammar G'.
• Output: the SLR parsing table, consisting of the functions ACTION and GOTO.
• Method: Let C = {I0, I1, …, In}. State i is constructed from Ii. The parsing actions are:
i. If [A → α.aβ] is in Ii, where a is a terminal, and GOTO(Ii, a) = Ij, then set ACTION[i, a] to "shift j".
ii. If [A → α.] is in Ii and A is not S', then set ACTION[i, a] to "reduce A → α" for all a in FOLLOW(A).
iii. If [S' → S.] is in Ii, then set ACTION[i, $] to "accept".
• For every nonterminal A, if GOTO(Ii, A) = Ij, then set GOTO[i, A] = j.
• All entries not defined by these rules are "error".
• If any box in the table has two or more entries, a conflict occurs and the grammar G is not SLR(1).
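The method can be sketched end to end as follows. This is an illustration under stated assumptions, not a reference implementation: items are `(head, body, dot)` triples, the FOLLOW sets are supplied by hand, and `build_slr_table` is a hypothetical name. The first run uses the exercise grammar S → BB, B → cB | d; the second uses S → L=R | R, L → *R | id, R → L with FOLLOW(R) = {=, $}.

```python
def closure(items, grammar):
    result, work = set(items), list(items)
    while work:
        _, body, dot = work.pop()
        if dot < len(body) and body[dot] in grammar:
            for gamma in grammar[body[dot]]:
                item = (body[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)


def goto(items, x, grammar):
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == x}
    return closure(moved, grammar)


def build_slr_table(productions, symbols, follow):
    """productions[0] must be the augmenting production S' -> S."""
    grammar = {}
    for head, body in productions:
        grammar.setdefault(head, []).append(body)
    start = productions[0][0]
    states = [closure({(start, productions[0][1], 0)}, grammar)]
    action, goto_table, conflicts = {}, {}, []

    def set_action(i, a, value):
        # A second, different entry for the same cell is a conflict.
        if action.setdefault((i, a), value) != value:
            conflicts.append((i, a))

    for i, state in enumerate(states):          # states grows as we iterate
        for x in symbols:
            target = goto(state, x, grammar)
            if not target:
                continue
            if target not in states:
                states.append(target)
            j = states.index(target)
            if x in grammar:
                goto_table[(i, x)] = j          # nonterminal: GOTO entry
            else:
                set_action(i, x, ("s", j))      # rule i: shift j
        for head, body, dot in state:
            if dot < len(body):
                continue
            if head == start:
                set_action(i, "$", ("acc",))    # rule iii: accept
            else:
                n = productions.index((head, body))
                for a in follow[head]:
                    set_action(i, a, ("r", n))  # rule ii: reduce by production n
    return action, goto_table, conflicts


# (1) S -> BB, (2) B -> cB, (3) B -> d
BB = [("S'", ("S",)), ("S", ("B", "B")), ("B", ("c", "B")), ("B", ("d",))]
action, goto_table, conflicts = build_slr_table(
    BB, ["S", "B", "c", "d"], {"S": {"$"}, "B": {"c", "d", "$"}})

# (1) S -> L=R, (2) S -> R, (3) L -> *R, (4) L -> id, (5) R -> L
LR = [("S'", ("S",)), ("S", ("L", "=", "R")), ("S", ("R",)),
      ("L", ("*", "R")), ("L", ("id",)), ("R", ("L",))]
_, _, lr_conflicts = build_slr_table(
    LR, ["S", "L", "R", "=", "*", "id"],
    {"S": {"$"}, "L": {"=", "$"}, "R": {"=", "$"}})
print(conflicts, lr_conflicts)
```

The first grammar produces no conflicts. The second reports a conflict in the state containing both S → L.=R and R → L., because "=" calls for a shift and is also in FOLLOW(R).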
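2) Construct LR(0) / SLR / Canonical LR(0) parser for the following grammar:
S → L=R | R
L → *R | id
R → L
Ans: Number the productions:
S → L=R (1)
S → R (2)
L → *R (3)
L → id (4)
R → L (5)
The canonical LR(0) collection for the augmented grammar (with S' → S added) is:
I0: S' → .S; S → .L=R; S → .R; L → .*R; L → .id; R → .L
I1: S' → S.
I2: S → L.=R; R → L.
I3: S → R.
I4: L → *.R; R → .L; L → .*R; L → .id
I5: L → id.
I6: S → L=.R; R → .L; L → .*R; L → .id
I7: L → *R.
I8: R → L.
I9: S → L=R.
Transitions of the LR(0) automaton:
GOTO(I0, S) = I1, GOTO(I0, L) = I2, GOTO(I0, R) = I3, GOTO(I0, *) = I4, GOTO(I0, id) = I5
On $ in I1 the parser accepts
GOTO(I2, =) = I6
GOTO(I4, R) = I7, GOTO(I4, L) = I8, GOTO(I4, *) = I4, GOTO(I4, id) = I5
GOTO(I6, R) = I9, GOTO(I6, L) = I8, GOTO(I6, *) = I4, GOTO(I6, id) = I5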
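Parsing table for problem-2
S → L=R (1)
S → R (2)
L → *R (3)
L → id (4)
R → L (5)
FIRST(S) = {*, id}
FIRST(L) = {*, id}
FIRST(R) = {*, id}
FOLLOW(S) = {$}
FOLLOW(L) = {=, $}
FOLLOW(R) = {=, $}
When "." is at the end of an item, add r(production-number) to the columns of FOLLOW(symbol), where symbol is the LHS of the production.
For S → R. in I3, place r2 in the FOLLOW(S) = {$} column.
For L → id. in I5, place r4 in the FOLLOW(L) = {=, $} columns.
Columns =, *, id, $ form the ACTION part; columns S, L, R form the GOTO part.
STATE  =      *      id     $      | S   L   R
0             s4     s5            | 1   2   3
1                           acc    |
2      s6/r5                r5     |
3                           r2     |
4             s4     s5            |     8   7
5      r4                   r4     |
6             s4     s5            |     8   9
7      r3                   r3     |
8      r5                   r5     |
9                           r1     |
State 2 contains both S → L.=R and R → L., and "=" is in FOLLOW(R), so ACTION[2, =] holds both s6 and r5. This shift/reduce conflict means the grammar is not SLR(1), even though it is unambiguous.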
SLR PROBLEMS
Construct SLR parsers for the following grammars:
1) S → BB
B → cB | d

2) S → a | ^ | (T)
T → T,S | S

3) S → AS | b
A → SA | a
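1) S → BB
B → cB | d
Number the productions:
S → BB (1)
B → cB (2)
B → d (3)
FOLLOW(S) = {$}
FOLLOW(B) = FIRST(B) U FOLLOW(S) = {c, d, $}
Canonical LR(0) collection:
I0: S' → .S; S → .BB; B → .cB; B → .d
I1: S' → S. (accept on $)
I2: S → B.B; B → .cB; B → .d
I3: B → c.B; B → .cB; B → .d
I4: B → d.
I5: S → BB.
I6: B → cB.
Transitions:
GOTO(I0, S) = I1, GOTO(I0, B) = I2, GOTO(I0, c) = I3, GOTO(I0, d) = I4
GOTO(I2, B) = I5, GOTO(I2, c) = I3, GOTO(I2, d) = I4
GOTO(I3, B) = I6, GOTO(I3, c) = I3, GOTO(I3, d) = I4
Columns c, d, $ form the ACTION part; columns S, B form the GOTO part.
STATE  c    d    $    | S   B
0      s3   s4        | 1   2
1                acc  |
2      s3   s4        |     5
3      s3   s4        |     6
4      r3   r3   r3   |
5                r1   |
6      r2   r2   r2   |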
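2) S → a | ^ | (T)
T → T,S | S
Number the productions:
S → a (1)
S → ^ (2)
S → (T) (3)
T → T,S (4)
T → S (5)
FOLLOW(S) = {$, ")", ","}
FOLLOW(T) = {")", ","}
Canonical LR(0) collection:
I0: S' → .S; S → .a; S → .^; S → .(T)
I1: S' → S. (accept on $)
I2: S → a.
I3: S → ^.
I4: S → (.T); T → .T,S; T → .S; S → .a; S → .^; S → .(T)
I5: S → (T.); T → T.,S
I6: T → S.
I7: S → (T).
I8: T → T,.S; S → .a; S → .^; S → .(T)
I9: T → T,S.
Transitions:
GOTO(I0, S) = I1, GOTO(I0, a) = I2, GOTO(I0, ^) = I3, GOTO(I0, "(") = I4
GOTO(I4, T) = I5, GOTO(I4, S) = I6, GOTO(I4, a) = I2, GOTO(I4, ^) = I3, GOTO(I4, "(") = I4
GOTO(I5, ")") = I7, GOTO(I5, ",") = I8
GOTO(I8, S) = I9, GOTO(I8, a) = I2, GOTO(I8, ^) = I3, GOTO(I8, "(") = I4
Columns a, ^, (, ), ",", $ form the ACTION part; columns S, T form the GOTO part.
STATE  a    ^    (    )    ,    $    | S   T
0      s2   s3   s4                  | 1
1                               acc  |
2                     r1   r1   r1   |
3                     r2   r2   r2   |
4      s2   s3   s4                  | 6   5
5                     s7   s8        |
6                     r5   r5        |
7                     r3   r3   r3   |
8      s2   s3   s4                  | 9
9                     r4   r4        |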
