Bottom-Up Parsing

The document discusses different parsing techniques including top-down and bottom-up parsing. It focuses on bottom-up parsing techniques including shift-reduce parsing, operator precedence parsing, and LR parsing. Key aspects covered include handles, handle pruning to derive rightmost derivations, and resolving shift-reduce and reduce-reduce conflicts that can occur in shift-reduce parsing.

Uploaded by

vishnugehlot
Copyright
© Attribution Non-Commercial (BY-NC)

Vishnu Kumar Gehlot

(Lecturer,CSE)
Parsing Techniques

Top-down parsers (LL(1), recursive descent)

• Start at the root of the parse tree (the start symbol) and grow toward the leaves, similar to a derivation
• Pick a production and try to match the input
• A bad “pick” may force the parser to backtrack
• Some grammars are backtrack-free (predictive parsing)

Bottom-up parsers (LR(1), operator precedence)

• Start at the leaves and grow toward the root
• The process can be viewed as reducing the input string to the start symbol
• At each reduction step, a substring matching the right side of a production is replaced by the symbol on the left side of that production
• Bottom-up parsers handle a large class of grammars
Bottom-up Parsing

A general style of bottom-up syntax analysis is known as shift-reduce parsing.
Two types of shift-reduce parsing:
1. Operator-precedence parsing
2. LR parsing
Bottom-Up Parsing

• “Shift-Reduce” parsing
• Reduce a string to the start symbol of the grammar.
• At every step, a particular substring is matched (in left-to-right fashion) to the right side of some production and is then replaced by the non-terminal on the left-hand side of that production.

Consider the grammar:
S → aABe
A → Abc | b
B → d

Reductions of the input abbcde:
abbcde
aAbcde
aAde
aABe
S

Read in reverse order, the reductions give the rightmost derivation:
S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde
Handles

• Handle of a string: a substring that matches the RHS of some production AND whose reduction to the non-terminal on the LHS is a step along the reverse of some rightmost derivation.
• Formally: a handle of a right-sentential form γ is a pair <A → β, location of β in γ> that satisfies the above property.
• That is, A → β is a handle of αβw at the location immediately after the end of α if:
  S ⇒*rm αAw ⇒rm αβw
  where ⇒rm denotes a rightmost derivation step and w contains only terminals.
• A sentential form may have several different handles if the grammar is ambiguous.
• Right-sentential forms of an unambiguous grammar have exactly one handle.
Example

Consider:
S → aABe
A → Abc | b
B → d

S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde

It follows that:
S → aABe is a handle of aABe at location 1.
B → d is a handle of aAde at location 3.
A → Abc is a handle of aAbcde at location 2.
A → b is a handle of abbcde at location 2.
Handle Pruning

• A rightmost derivation in reverse can be obtained by “handle pruning.”
• Apply this to the previous example:
S → aABe
A → Abc | b
B → d

abbcde
Find the handle: b at loc. 2
aAbcde
b at loc. 3 is not a handle: reducing it gives
aAAcde
... blocked.

Also consider:
E → E+E | E*E | ( E ) | id
Derive id+id*id by two different rightmost derivations.
Handle-pruning, Bottom-up Parsers

The process of discovering a handle and reducing it to the appropriate left-hand side is called handle pruning.
Handle pruning forms the basis for a bottom-up parsing method.

To construct a rightmost derivation
S = γ₀ ⇒ γ₁ ⇒ γ₂ ⇒ … ⇒ γₙ₋₁ ⇒ γₙ = w
in reverse, apply the following simple algorithm:
for i ← n to 1 by -1
    Find the handle Aᵢ → βᵢ in γᵢ
    Replace βᵢ with Aᵢ to generate γᵢ₋₁
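The loop above can be tried out on the small grammar S → aABe, A → Abc | b, B → d. The following is only a brute-force sketch, not how a real parser locates handles: it assumes uppercase letters are non-terminals and lowercase letters are terminals, and it assumes the grammar has no ε-productions or cyclic unit productions, so the search terminates. The names `PRODUCTIONS` and `prune` are made up for this illustration. A candidate reduction is accepted only if everything to its right is terminal and the reduced string can itself be pruned down to the start symbol.

```python
# Grammar from the slides: S -> aABe, A -> Abc | b, B -> d.
# Assumed convention: uppercase = non-terminal, lowercase = terminal.
PRODUCTIONS = [("S", "aABe"), ("A", "Abc"), ("A", "b"), ("B", "d")]
START = "S"


def prune(form):
    """Return the handles, as (lhs, rhs, 1-based location), that reduce
    `form` to START, or None if no sequence of handle reductions exists."""
    if form == START:
        return []
    for lhs, rhs in PRODUCTIONS:
        for pos in range(len(form) - len(rhs) + 1):
            if form[pos:pos + len(rhs)] != rhs:
                continue
            suffix = form[pos + len(rhs):]
            # A handle has only terminals to its right.
            if not all(ch.islower() for ch in suffix):
                continue
            rest = prune(form[:pos] + lhs + suffix)
            if rest is not None:  # this reduction leads to START
                return [(lhs, rhs, pos + 1)] + rest
    return None  # blocked: no reduction works


handles = prune("abbcde")
for lhs, rhs, loc in handles:
    print(f"{lhs} -> {rhs} at location {loc}")
```

Calling `prune("aAAcde")` returns `None`, which corresponds to the "blocked" case that arises when the b at location 3 is reduced by mistake.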
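Handle Pruning, II

• Consider the cut of a parse tree of a certain right-sentential form.

[Figure: parse tree rooted at S, cut into a left part, the handle, and the part to the right of the handle (only terminals there). The left part together with the handle is marked as a viable prefix.]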
Shift-Reduce Parsing with a Stack

• Two problems:
  • locate a handle, and
  • decide which production to use (if there is more than one candidate production).

• General construction, using a stack:
  • “shift” input symbols onto the stack until a handle is found on top of it.
  • “reduce” the handle to the corresponding non-terminal.
  • other operations: “accept” when the input is consumed and only the start symbol is on the stack; “error” otherwise.
Example

Grammar: E → E+E | E*E | ( E ) | id

STACK            INPUT             ACTION
$                id + id * id $    Shift
$ id             + id * id $       Reduce by E → id
$ E              + id * id $       Shift
$ E +            id * id $         Shift
$ E + id         * id $            Reduce by E → id
$ E + E          * id $            Shift
$ E + E *        id $              Shift
$ E + E * id     $                 Reduce by E → id
$ E + E * E      $                 Reduce by E → E * E
$ E + E          $                 Reduce by E → E + E
$ E              $                 Accept
Shift-reduce Parsing

Shift-reduce parsers are easily built and easily understood

A shift-reduce parser has just four actions:
• Shift: the next word is shifted onto the stack
• Reduce: the right end of the handle is at the top of the stack
    Locate the left end of the handle within the stack
    Pop the handle off the stack and push the appropriate lhs
• Accept: stop parsing and report success
• Error: call an error reporting/recovery routine

Accept and Error are simple
Shift is just a push and a call to the scanner
Reduce takes |rhs| pops and 1 push
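To make the cost of each action concrete, here is a minimal sketch that replays a scripted sequence of actions for id + id * id under the grammar E → E+E | E*E | ( E ) | id. It deliberately leaves out the hard part, which is deciding when to shift and when to reduce: the script is supplied by hand. The names `run`, `SCRIPT`, `E_ID`, `E_MUL` and `E_ADD` are invented for this illustration.

```python
def run(tokens, actions):
    """Execute scripted shift-reduce actions; return (accepted, trace).

    Each action is "shift", "accept", or a (lhs, rhs) pair meaning
    "reduce by lhs -> rhs". The trace records (stack, input) before each step.
    """
    stack, rest, trace = ["$"], list(tokens) + ["$"], []
    for action in actions:
        trace.append((" ".join(stack), " ".join(rest)))
        if action == "shift":
            stack.append(rest.pop(0))      # a push plus a call to the scanner
        elif action == "accept":
            return stack == ["$", "E"] and rest == ["$"], trace
        else:
            lhs, rhs = action
            if stack[-len(rhs):] != rhs:
                raise ValueError(f"{rhs} is not on top of the stack")
            del stack[-len(rhs):]          # |rhs| pops ...
            stack.append(lhs)              # ... and one push
    return False, trace


E_ID = ("E", ["id"])
E_MUL = ("E", ["E", "*", "E"])
E_ADD = ("E", ["E", "+", "E"])
SCRIPT = ["shift", E_ID, "shift", "shift", E_ID, "shift", "shift",
          E_ID, E_MUL, E_ADD, "accept"]

accepted, trace = run("id + id * id".split(), SCRIPT)
for stack_text, input_text in trace:
    print(f"{stack_text:14} {input_text}")
```

The printed stack and input columns should line up with the shift-reduce example table for id + id * id.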
More on Shift-Reduce Parsing

Viable prefixes:
The prefixes of right-sentential forms that can appear on the stack of a shift-reduce parser are called viable prefixes.

Conflicts:
“shift/reduce” or “reduce/reduce”

Example:
stmt → if expr then stmt
     | if expr then stmt else stmt
     | other (any other statement)

Stack                  Input
if … then stmt         else …        Shift/reduce conflict

We can’t tell whether the “if … then stmt” on the stack is a handle.
More Conflicts
stmt → id ( parameter-list )
stmt → expr := expr
parameter-list → parameter-list , parameter | parameter
parameter → id
expr-list → expr-list , expr | expr
expr → id | id ( expr-list )

Consider the string A(I,J)
The corresponding token stream is id(id, id)
After three shifts:

Stack = … id(id        Input = , id)…

Reduce/reduce conflict: what to do? The id on top of the stack can be reduced by parameter → id or by expr → id. The right choice depends on what A is: an array or a procedure?

If the lexical analyzer returns a distinct token for procedure names, the symbol third from the top of the stack determines the reduction to be made, even though it is not involved in the reduction. Shift-reduce parsing can use information far down in the stack to guide the parse.
Operator-Precedence Parser

• Operator grammar
  • a small but important class of grammars
  • we may have an efficient operator-precedence parser (a shift-reduce parser) for an operator grammar.
• In an operator grammar, no production rule can have:
  • ε on the right side
  • two adjacent non-terminals on the right side.
• Ex:
  E → AB, A → a, B → b                   (not an operator grammar)
  E → EOE, E → id, O → + | * | /         (not an operator grammar)
  E → E+E | E*E | E/E | id               (operator grammar)
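The two conditions can be checked mechanically. The sketch below assumes a grammar is given as a list of (lhs, rhs) pairs with the right side as a list of symbols, and that the set of non-terminals is passed in explicitly; the function name `is_operator_grammar` and the names `G1`, `G2`, `G3` are illustrative only.

```python
def is_operator_grammar(productions, nonterminals):
    """True if no right side is empty (epsilon) or has two adjacent non-terminals."""
    for _lhs, rhs in productions:
        if not rhs:                                   # epsilon production
            return False
        for left, right in zip(rhs, rhs[1:]):         # adjacent symbol pairs
            if left in nonterminals and right in nonterminals:
                return False
    return True


# The three example grammars from the slide.
G1 = [("E", ["A", "B"]), ("A", ["a"]), ("B", ["b"])]
G2 = [("E", ["E", "O", "E"]), ("E", ["id"]),
      ("O", ["+"]), ("O", ["*"]), ("O", ["/"])]
G3 = [("E", ["E", "+", "E"]), ("E", ["E", "*", "E"]),
      ("E", ["E", "/", "E"]), ("E", ["id"])]

print(is_operator_grammar(G1, {"E", "A", "B"}))   # adjacent A B
print(is_operator_grammar(G2, {"E", "O"}))        # adjacent E O
print(is_operator_grammar(G3, {"E"}))
```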


Precedence Relations

• In operator-precedence parsing, we define three disjoint precedence relations between certain pairs of terminals.
a <. b    b has higher precedence than a
a =· b    b has the same precedence as a
a .> b    b has lower precedence than a

• The correct precedence relations between terminals are determined from the traditional notions of associativity and precedence of operators. (Unary minus causes a problem.)
Using Operator-Precedence Relations

• The intention of the precedence relations is to delimit the handle of a right-sentential form, with
<. marking the left end,
=· appearing in the interior of the handle, and
.> marking the right end.

• In the input string $a1a2...an$, we insert between each pair of adjacent terminals the precedence relation that holds between the terminals in that pair.
Using Operator-Precedence Relations

E → E+E | E-E | E*E | E/E | E^E | (E) | -E | id

The partial operator-precedence table for this grammar:

      id    +     *     $
id          .>    .>    .>
+     <.    .>    <.    .>
*     <.    .>    .>    .>
$     <.    <.    <.

• Then the input string id+id*id with the precedence relations inserted will be:
$ <. id .> + <. id .> * <. id .> $
To Find The Handles

1. Scan the string from the left end until the first .> is encountered.
2. Then scan backwards (to the left) over any =· until a <. is encountered.
3. The handle contains everything to the left of the first .> and to the right of the <. found in step 2.

Relation string                          Reduction     Sentential form
$ <. id .> + <. id .> * <. id .> $       E → id        $ id + id * id $
$ <. + <. id .> * <. id .> $             E → id        $ E + id * id $
$ <. + <. * <. id .> $                   E → id        $ E + E * id $
$ <. + <. * .> $                         E → E*E       $ E + E * E $
$ <. + .> $                              E → E+E       $ E + E $
$ $                                                    $ E $
Operator-Precedence Parsing Algorithm

The input string is w$, the initial stack is $, and a table holds the precedence relations between certain terminals.
Algorithm:
set p to point to the first symbol of w$ ;
repeat forever
    if ( $ is on top of the stack and p points to $ ) then return
    else {
        let a be the topmost terminal symbol on the stack and let b be the symbol
        pointed to by p;
        if ( a <. b or a =· b ) then { /* SHIFT */
            push b onto the stack;
            advance p to the next input symbol;
        }
        else if ( a .> b ) then /* REDUCE */
            repeat pop the stack
            until ( the top-of-stack terminal is related by <. to the terminal most
                    recently popped );
        else error();
    }
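A direct transcription of this algorithm might look as follows. It is a sketch under several assumptions: only the partial table for id, +, * and $ is encoded, the stack holds terminals only (non-terminals are not tracked), and the relations <., =·, .> are written as the strings "<", "=", ">". The names `TABLE` and `parse` are chosen for this example. Only a missing relation is reported as an error; the popped handle is not checked against any production.

```python
# Partial precedence table: rows are the terminal on top of the stack,
# columns are the next input terminal. "<", "=", ">" stand for <., =., .>
TABLE = {
    "id": {"+": ">", "*": ">", "$": ">"},
    "+": {"id": "<", "+": ">", "*": "<", "$": ">"},
    "*": {"id": "<", "+": ">", "*": ">", "$": ">"},
    "$": {"id": "<", "+": "<", "*": "<"},
}


def parse(tokens):
    """Return the handles (terminals only) in the order they are reduced."""
    stack = ["$"]                      # terminals only
    rest = list(tokens) + ["$"]
    p = 0                              # index of the current input symbol
    handles = []
    while True:
        a, b = stack[-1], rest[p]
        if a == "$" and b == "$":
            return handles             # accept
        relation = TABLE[a].get(b)
        if relation in ("<", "="):     # SHIFT
            stack.append(b)
            p += 1
        elif relation == ">":          # REDUCE
            popped = [stack.pop()]
            # Pop until the top terminal is related by <. to the last one popped.
            while TABLE[stack[-1]].get(popped[-1]) != "<":
                popped.append(stack.pop())
            handles.append(popped[::-1])
        else:
            raise SyntaxError(f"no precedence relation between {a!r} and {b!r}")


print(parse("id + id * id".split()))
```

For id + id * id the handles come out as three id reductions followed by * and then +, so the multiplication is reduced before the addition.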
Operator-Precedence Parsing Algorithm -- Example

stack     input         relation    action
$         id+id*id$     $ <. id     shift
$id       +id*id$       id .> +     reduce E → id
$         +id*id$       $ <. +      shift
$+        id*id$        + <. id     shift
$+id      *id$          id .> *     reduce E → id
$+        *id$          + <. *      shift
$+*       id$           * <. id     shift
$+*id     $             id .> $     reduce E → id
$+*       $             * .> $      reduce E → E*E
$+        $             + .> $      reduce E → E+E
$         $                         accept

Precedence table used:

      id    +     *     $
id          .>    .>    .>
+     <.    .>    <.    .>
*     <.    .>    .>    .>
$     <.    <.    <.


How to Create Operator-Precedence Relations

• We use associativity and precedence relations among operators.

1. If operator θ1 has higher precedence than operator θ2, then
   θ1 .> θ2 and θ2 <. θ1

2. If operator θ1 and operator θ2 have equal precedence, then
   if they are left-associative: θ1 .> θ2 and θ2 .> θ1
   if they are right-associative: θ1 <. θ2 and θ2 <. θ1

3. For all operators θ: θ <. id, id .> θ, θ <. (, ( <. θ, θ .> ), ) .> θ, θ .> $, and $ <. θ

4. Also, let
   ( =· )     $ <. (     id .> )    ) .> $
   ( <. (     $ <. id    id .> $    ) .> )
   ( <. id
Operator-Precedence Relations

      +     -     *     /     ^     id    (     )     $
+     .>    .>    <.    <.    <.    <.    <.    .>    .>
-     .>    .>    <.    <.    <.    <.    <.    .>    .>
*     .>    .>    .>    .>    <.    <.    <.    .>    .>
/     .>    .>    .>    .>    <.    <.    <.    .>    .>
^     .>    .>    .>    .>    <.    <.    <.    .>    .>
id    .>    .>    .>    .>    .>                .>    .>
(     <.    <.    <.    <.    <.    <.    <.    =·
)     .>    .>    .>    .>    .>                .>    .>
$     <.    <.    <.    <.    <.    <.    <.
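The four rules for creating the relations can be applied mechanically to regenerate this table. The sketch below assumes the usual conventions, which the table is consistent with: + and - share the lowest precedence, * and / the next, ^ the highest, and only ^ is right-associative. The names `PRECEDENCE`, `RIGHT_ASSOCIATIVE` and `build_relations` are made up for this example, and "<", "=", ">" stand for <., =·, .>.

```python
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2, "^": 3}   # higher binds tighter
RIGHT_ASSOCIATIVE = {"^"}                               # assumed convention


def build_relations():
    """Map (stack terminal, input terminal) to '<', '=' or '>'."""
    table = {}
    ops = list(PRECEDENCE)
    for a in ops:                       # rules 1 and 2: operator vs operator
        for b in ops:
            if PRECEDENCE[a] > PRECEDENCE[b]:
                table[a, b] = ">"
            elif PRECEDENCE[a] < PRECEDENCE[b]:
                table[a, b] = "<"
            else:                       # equal precedence: use associativity
                table[a, b] = "<" if a in RIGHT_ASSOCIATIVE else ">"
    for op in ops:                      # rule 3: operators vs id, (, ), $
        table.update({
            (op, "id"): "<", ("id", op): ">",
            (op, "("): "<", ("(", op): "<",
            (op, ")"): ">", (")", op): ">",
            (op, "$"): ">", ("$", op): "<",
        })
    table.update({                      # rule 4: the remaining fixed entries
        ("(", ")"): "=", ("$", "("): "<", ("id", ")"): ">", (")", "$"): ">",
        ("(", "("): "<", ("$", "id"): "<", ("id", "$"): ">", (")", ")"): ">",
        ("(", "id"): "<",
    })
    return table


SYMBOLS = ["+", "-", "*", "/", "^", "id", "(", ")", "$"]
RELATIONS = build_relations()
for a in SYMBOLS:
    print(f"{a:3}", " ".join(f"{RELATIONS.get((a, b), ''):2}" for b in SYMBOLS))
```

Pairs that are absent from the dictionary, such as (id, id) or ($, $), correspond to the blank entries of the table.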


Handling Unary Minus

• Operator-precedence parsing cannot handle the unary minus when the grammar also contains the binary minus.

• The best approach is to let the lexical analyzer handle this problem.
  • The lexical analyzer returns two different tokens for the unary minus and the binary minus.
  • To distinguish the binary minus from the unary minus, the lexical analyzer needs context: it must remember the previous token, since lookahead alone does not settle the question.
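One way the lexical analyzer could make this distinction is sketched below. It assumes a minus is unary when it appears at the start of the input, after another operator, or after a left parenthesis, and binary otherwise; that rule, the token name `uminus`, and the function `tokenize` are assumptions made for this illustration rather than something the slides prescribe. Identifiers and numbers are both reported as `id`.

```python
import re

OPERATORS = {"+", "-", "*", "/", "^", "uminus"}
# One lexeme: a number, an identifier, or a single operator/parenthesis.
LEXEME = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[-+*/^()])")


def tokenize(text):
    """Split text into tokens, returning 'uminus' for a unary minus."""
    tokens = []
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        match = LEXEME.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        lexeme, pos = match.group(1), match.end()
        if lexeme == "-":
            # Decide from the previous token, not from lookahead.
            prev = tokens[-1] if tokens else None
            is_unary = prev is None or prev == "(" or prev in OPERATORS
            tokens.append("uminus" if is_unary else "-")
        elif lexeme in {"+", "*", "/", "^", "(", ")"}:
            tokens.append(lexeme)
        else:
            tokens.append("id")
    return tokens


print(tokenize("-a - -b"))
print(tokenize("a * (-b)"))
```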

• Then we set θ <. unary-minus for any operator θ.
Precedence Functions

• Compilers using operator-precedence parsers do not need to store the table of precedence relations.
• The table can be encoded by two precedence functions f and g that map terminal symbols to integers.
• For symbols a and b:
f(a) < g(b) whenever a <. b
f(a) = g(b) whenever a =· b
f(a) > g(b) whenever a .> b
Constructing precedence functions

Method:
1. Create symbols fa and ga for each a that is a terminal or $.
2. Partition the created symbols into as many groups as possible, in such a way that if a =· b, then fa and gb are in the same group.
3. Create a directed graph whose nodes are the groups found in (2). For any a and b, if a <. b, place an edge from the group of gb to the group of fa. If a .> b, place an edge from the group of fa to that of gb.
4. If the graph constructed in (3) has a cycle, then no precedence functions exist. If there are no cycles, let f(a) be the length of the longest path beginning at the group of fa, and let g(a) be the length of the longest path beginning at the group of ga.
Example

      +     *     id    $
f     2     4     4     0
g     1     3     5     0

[Figure: precedence graph with nodes gid, fid, f*, g*, g+, f+, f$, g$]
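The construction method can be checked against these values with a short sketch. It assumes the partial precedence table for id, +, * and $ is given as a dictionary keyed by (a, b) pairs, with "<", "=", ">" standing for <., =·, .>; the names `TABLE` and `precedence_functions` are illustrative. Groups are merged with a simple parent map, and the longest path is computed by a depth-first search that raises an error if it meets a cycle.

```python
TABLE = {
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
}


def precedence_functions(terminals, table):
    """Return (f, g) as dicts, or raise ValueError if the graph has a cycle."""
    # Steps 1 and 2: one node per f_a and g_a; merge nodes related by '='.
    nodes = [(kind, a) for kind in "fg" for a in terminals]
    group = {n: n for n in nodes}

    def find(n):
        while group[n] != n:
            n = group[n]
        return n

    for (a, b), rel in table.items():
        if rel == "=":
            group[find(("f", a))] = find(("g", b))

    # Step 3: a <. b gives an edge g_b -> f_a; a .> b gives f_a -> g_b.
    edges = {find(n): set() for n in nodes}
    for (a, b), rel in table.items():
        fa, gb = find(("f", a)), find(("g", b))
        if rel == "<":
            edges[gb].add(fa)
        elif rel == ">":
            edges[fa].add(gb)

    # Step 4: longest path from each group, failing on a cycle.
    longest, active = {}, set()

    def length(n):
        if n in active:
            raise ValueError("cycle: no precedence functions exist")
        if n not in longest:
            active.add(n)
            longest[n] = max((1 + length(m) for m in edges[n]), default=0)
            active.remove(n)
        return longest[n]

    f = {a: length(find(("f", a))) for a in terminals}
    g = {a: length(find(("g", a))) for a in terminals}
    return f, g


f, g = precedence_functions(["+", "*", "id", "$"], TABLE)
print("f:", f)
print("g:", g)
```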
Disadvantages of Operator Precedence Parsing

• Disadvantages:
  • It cannot handle the unary minus (the lexical analyzer should handle the unary minus).
  • It applies only to a small class of grammars.
  • It is difficult to be sure exactly which language the parser recognizes.
• Advantages:
  • simple
  • powerful enough for expressions in programming languages
Error Recovery in Operator-Precedence Parsing

Error Cases:
1. No relation holds between the terminal on the top of the stack and the next input symbol.
2. A handle is found (reduction step), but there is no production with this handle as a right side.

Error Recovery:
1. Each empty table entry is filled with a pointer to an error routine.
2. The error routine decides which right-hand side the popped handle “looks like” and tries to recover from that situation.
