
Compiler 8

The document discusses context-free grammars (CFGs) and their role in syntax analysis within compilers. It explains the formal definition of CFGs, the process of derivations, and the concept of ambiguity in grammars. Additionally, it compares CFGs with regular expressions, highlighting their differences in expressive power.

Uploaded by

tmtsmbdalbary

International University of Africa

Faculty of Computer Studies

Compiler Design
Course code: حسب (412)
Lecture 8:
Context-Free
Grammar

[Figure: structure of a compiler. Source code → Front-End (Lexical Analysis, Syntax Analysis) → IR → Back-End → Object code]
Introduction to Parsing
(Syntax Analysis)
Lexical Analysis:
Reads characters of the input
program and produces tokens.
But: Are they syntactically
correct? Are they valid
sentences of the input
language?
The Role of the Parser
• In our compiler model, the parser obtains a string of tokens from the lexical analyzer and verifies that the string of token names can be generated by the grammar for the source language.
• We expect the parser to report any syntax errors in an intelligible fashion and to recover from commonly occurring errors to continue processing the remainder of the program.
Context Free
Grammar
Context-Free
Grammars
• Grammars describe the syntax of programming language constructs like expressions and statements.
• Using a syntactic variable stmt to denote statements and a variable expr to denote expressions, the production
stmt → if ( expr ) stmt else stmt
specifies the structure of this form of conditional statement.

CatNoise → CatNoise miau | miau


The Formal Definition of a Context-Free Grammar
Definition. A context-free grammar (grammar for short) is a 4-tuple (Σ, NT, R, S), where:
• Σ is an alphabet (each character in Σ is called a terminal)
• NT is a set (each element in NT is called a nonterminal)
• R, the set of rules, is a subset of NT × (Σ ∪ NT)*
• If (α, β) ∈ R, we write the production α → β
• β is called a sentential form
• S, the start symbol, is one of the symbols in NT
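The 4-tuple can be written down directly as a data structure. The following is a minimal Python sketch, not part of the lecture: the class name Grammar, its field names, and the method is_well_formed are illustrative assumptions. It also assumes that terminals and nonterminals are disjoint, which the definition above does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    """A context-free grammar as the 4-tuple (Sigma, NT, R, S)."""
    terminals: frozenset      # Sigma, the alphabet
    nonterminals: frozenset   # NT
    rules: tuple              # R: pairs (head, body); body is a tuple of symbols
    start: str                # S

    def is_well_formed(self) -> bool:
        """Check that R is a subset of NT x (Sigma ∪ NT)* and that S is in NT."""
        if self.terminals & self.nonterminals:
            return False  # assumption: no symbol is both terminal and nonterminal
        if self.start not in self.nonterminals:
            return False
        symbols = self.terminals | self.nonterminals
        return all(
            head in self.nonterminals and all(sym in symbols for sym in body)
            for head, body in self.rules
        )


# CatNoise → CatNoise miau | miau
cat = Grammar(
    terminals=frozenset({"miau"}),
    nonterminals=frozenset({"CatNoise"}),
    rules=(("CatNoise", ("CatNoise", "miau")), ("CatNoise", ("miau",))),
    start="CatNoise",
)

# A rule whose head is a terminal violates R ⊆ NT × (Sigma ∪ NT)*.
broken = Grammar(
    terminals=frozenset({"a"}),
    nonterminals=frozenset({"S"}),
    rules=(("a", ("S",)),),
    start="S",
)

print(cat.is_well_formed())     # True
print(broken.is_well_formed())  # False
```

Restricting every rule head to a single nonterminal is what makes the grammar context-free: a nonterminal can be rewritten regardless of the symbols around it.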
CFGs: Alternate Definition
Many textbooks use different symbols and terms to describe CFGs:
G = (V, Σ, P, S)
V = variables, a finite set
Σ = alphabet or terminals, a finite set
P = productions, a finite set
S = start variable, S ∈ V

Productions have the form A → α, where A ∈ V and α ∈ (V ∪ Σ)*.
Example: Grammar for simple
arithmetic expressions
• The grammar defines simple arithmetic expressions.
• The terminal symbols are: id + - * / ( )
• The nonterminal symbols are: expression, term, factor.
• The start symbol is: expression.
Example : Grammar for simple
arithmetic expressions
expression → expression + term
expression → expression - term
expression → term
term → term * factor
term → term / factor
term → factor
factor → ( expression )
factor → id
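To see which sentences these productions generate, the grammar can be encoded as a table and expanded mechanically. The sketch below is an illustration only: the names GRAMMAR, START, and sentences are assumptions, and the search always rewrites the leftmost nonterminal and stops after a fixed number of steps.

```python
from collections import deque

# Each nonterminal maps to the list of bodies of its productions.
GRAMMAR = {
    "expression": [["expression", "+", "term"],
                   ["expression", "-", "term"],
                   ["term"]],
    "term": [["term", "*", "factor"],
             ["term", "/", "factor"],
             ["factor"]],
    "factor": [["(", "expression", ")"],
               ["id"]],
}
START = "expression"


def sentences(max_steps):
    """Return every sentence derivable from START in at most max_steps steps,
    always rewriting the leftmost nonterminal."""
    found = set()
    seen = {(START,)}
    queue = deque([((START,), 0)])
    while queue:
        form, steps = queue.popleft()
        # Index of the leftmost nonterminal; None means form is a sentence.
        pos = next((i for i, sym in enumerate(form) if sym in GRAMMAR), None)
        if pos is None:
            found.add(" ".join(form))
            continue
        if steps == max_steps:
            continue  # step budget exhausted for this sentential form
        for body in GRAMMAR[form[pos]]:
            new_form = form[:pos] + tuple(body) + form[pos + 1:]
            if new_form not in seen:
                seen.add(new_form)
                queue.append((new_form, steps + 1))
    return found


print(sentences(3))                # {'id'}
print("id + id" in sentences(6))   # True
```

The shortest sentence, id, needs three steps: expression ⇒ term ⇒ factor ⇒ id.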
Informally, a Context-Free Language (CFL) is a language generated by a Context-Free Grammar (CFG).
Informally, a CFG is a set of rules for deriving (or generating) strings (or sentences) in a language.
Derivations
Beginning with the start symbol,
each rewriting step replaces a
nonterminal by the body of one
of its productions.
Derivations
• For example, consider the following grammar:
E → E + E | E * E | - E | ( E ) | id
• A derivation of -(id) from E:
E ⇒ -E ⇒ -(E) ⇒ -(id)
• A derivation of -(id+id) from E:
E ⇒ -E ⇒ -(E) ⇒ -(E + E) ⇒ -(id + E) ⇒ -(id + id)
Derivations
• An alternative derivation of -(id+id) from E:
E ⇒ -E ⇒ -(E) ⇒ -(E + E) ⇒ -(E + id) ⇒ -(id + id)
• Each nonterminal is replaced by the same body in the two derivations, but the order of replacements is different.
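The two derivations of -(id+id) can be replayed step by step to confirm that only the order of replacements differs. This is a minimal sketch; the helper rewrite and the representation of a sentential form as a Python list of symbols are assumptions made for illustration.

```python
def rewrite(form, index, head, body):
    """Replace the nonterminal `head` found at position `index` by `body`."""
    if form[index] != head:
        raise ValueError("position does not hold the head of the production")
    return form[:index] + body + form[index + 1:]


E = "E"

# Shared prefix: E ⇒ -E ⇒ -(E) ⇒ -(E + E)
form = rewrite([E], 0, E, ["-", E])
form = rewrite(form, 1, E, ["(", E, ")"])
form = rewrite(form, 2, E, [E, "+", E])   # ['-', '(', 'E', '+', 'E', ')']

# First derivation: replace the left E, then the right E.
a = rewrite(form, 2, E, ["id"])
a = rewrite(a, 4, E, ["id"])

# Second derivation: replace the right E, then the left E.
b = rewrite(form, 4, E, ["id"])
b = rewrite(b, 2, E, ["id"])

print(a == b)        # True
print(" ".join(a))   # - ( id + id )
```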
Leftmost, Rightmost
Derivations
Definition. A leftmost derivation of a sentential form is one in which the leftmost nonterminal is always the one rewritten.

Definition. A rightmost derivation of a sentential form is one in which the rightmost nonterminal is always the one rewritten.
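Because a leftmost derivation fixes which nonterminal is rewritten at each step, it is fully described by the sequence of production bodies used. The sketch below replays such sequences for the sentence id + id * id under the grammar E → E + E | E * E | id. The function name leftmost_derivation and the constants PLUS, TIMES, and ID are illustrative assumptions.

```python
NONTERMINALS = {"E"}


def leftmost_derivation(start, bodies):
    """Apply each body in turn to the leftmost nonterminal.
    Returns the list of sentential forms, starting with the start symbol."""
    form = [start]
    steps = [" ".join(form)]
    for body in bodies:
        # Locate the leftmost nonterminal in the current sentential form.
        pos = next(i for i, sym in enumerate(form) if sym in NONTERMINALS)
        form = form[:pos] + body + form[pos + 1:]
        steps.append(" ".join(form))
    return steps


PLUS, TIMES, ID = ["E", "+", "E"], ["E", "*", "E"], ["id"]

first = leftmost_derivation("E", [PLUS, ID, TIMES, ID, ID])
second = leftmost_derivation("E", [TIMES, PLUS, ID, ID, ID])

print(" ⇒ ".join(first))
print(" ⇒ ".join(second))
```

Both sequences end in id + id * id, but they differ already in the first step (E + E versus E * E), so the sentence has two distinct leftmost derivations.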
A leftmost derivation for the
sentence
id + id * id

E ⇒ E + E
⇒ id + E
⇒ id + E * E
⇒ id + id * E
⇒ id + id * id
A second leftmost derivation for the
sentence
id + id * id

E ⇒ E * E
⇒ E + E * E
⇒ id + E * E
⇒ id + id * E
⇒ id + id * id
Parse Trees and Derivations
• A parse tree is a graphical representation of a derivation that filters out the order in which productions are applied to replace nonterminals.
• Each interior node of a parse tree represents the application of a production.
• The interior node is labeled with the nonterminal A in the head of the production.
• The children of the node are labeled, from left to right, by the symbols in the body of the production by which this A was replaced during the derivation.
[Figure: the parse tree for -(id + id)]
[Figure: two parse trees for id + id * id]
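The parse trees named in the captions above can also be written as nested data. In this sketch, which is an assumed encoding and not the lecture's notation, an interior node is a pair (nonterminal, children) and a leaf is a terminal string. Reading the leaves from left to right recovers the sentence.

```python
def leaves(node):
    """Return the leaves of a parse tree, read from left to right."""
    if isinstance(node, str):       # leaf: a terminal symbol
        return [node]
    _head, children = node          # interior node: (nonterminal, children)
    result = []
    for child in children:
        result.extend(leaves(child))
    return result


ID = ("E", ["id"])

# Parse tree for -(id + id)
neg = ("E", ["-", ("E", ["(", ("E", [ID, "+", ID]), ")"])])

# Two parse trees for id + id * id
plus_at_root = ("E", [ID, "+", ("E", [ID, "*", ID])])
times_at_root = ("E", [("E", [ID, "+", ID]), "*", ID])

print(" ".join(leaves(neg)))                          # - ( id + id )
print(leaves(plus_at_root) == leaves(times_at_root))  # True
print(plus_at_root == times_at_root)                  # False
```

The last two lines show two structurally different trees with the same leaves, which is the situation the next slide calls ambiguity.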
Ambiguity
• A grammar that produces more than one parse tree for some sentence is said to be ambiguous.
• Put another way, an ambiguous grammar is one that produces more than one leftmost derivation or more than one rightmost derivation for the same sentence.
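Ambiguity can be checked for a particular sentence by counting its parse trees. The sketch below does this for the fragment E → E + E | E * E | id only (the productions - E and ( E ) are left out to keep it short); the function name count_parse_trees is an assumption. A tree for a span of tokens is chosen by picking the operator at its root and then a tree for each side.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees for tokens under E → E + E | E * E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        """Number of parse trees deriving tokens[i:j] from E."""
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        # Try every operator position k as the root of the tree.
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees("id + id * id".split()))  # 2
print(count_parse_trees(["id"]))                  # 1
```

A count greater than 1 for any sentence is enough to show that the grammar is ambiguous.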
Verifying the Language
Generated by a Grammar
• Although compiler designers rarely do so for a complete programming-language grammar, it is useful to be able to reason that a given set of productions generates a particular language.
Verifying the Language Generated
by a Grammar
• Describe the language produced by the following grammar:
S → ( S ) S | ε
• Answer: the set of all strings of balanced parentheses, {ε, (), ()(), (()), ((())), …}.
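One way to test the claim on concrete strings is a small recognizer that follows the two productions of S directly. This is a sketch under assumptions: the names is_balanced and parse_s are illustrative, and the recognizer picks S → ( S ) S whenever the next character is an opening parenthesis and S → ε otherwise.

```python
def is_balanced(text):
    """Return True if text is generated by S → ( S ) S | ε."""

    def parse_s(pos):
        """Match S starting at pos; return the position after the match,
        or None if no match is possible."""
        if pos < len(text) and text[pos] == "(":
            # S → ( S ) S
            pos = parse_s(pos + 1)
            if pos is None or pos >= len(text) or text[pos] != ")":
                return None
            return parse_s(pos + 1)
        # S → ε
        return pos

    # The whole input must be consumed by a single S.
    return parse_s(0) == len(text)


print(is_balanced(""))       # True
print(is_balanced("()()"))   # True
print(is_balanced("(()"))    # False
print(is_balanced(")("))     # False
```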
Verifying the Language Generated
by a Grammar
• Describe the language produced by the following grammar:
S → 0 S 1
S → ε
• Answer: the set of all strings of 0s followed by an equal number of 1s, {ε, 01, 0011, 000111, ...}:

L(G) = {0^n 1^n : n ≥ 0}
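The shape of the derivations makes the answer easy to check: applying S → 0 S 1 exactly n times and then S → ε yields n 0s followed by n 1s. The following sketch replays that derivation and compares it with a direct membership test; the function names derive and in_language are illustrative assumptions.

```python
def derive(n):
    """Apply S → 0S1 n times, then S → ε, and return the resulting sentence."""
    form = "S"
    for _ in range(n):
        form = form.replace("S", "0S1")
    return form.replace("S", "")


def in_language(word):
    """Membership test for {0^n 1^n : n ≥ 0}."""
    n = len(word) // 2
    return word == "0" * n + "1" * n


print([derive(n) for n in range(4)])   # ['', '01', '0011', '000111']
print(in_language("0011"))             # True
print(in_language("0101"))             # False
```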
Context-Free Grammars Versus
Regular Expressions
• Grammars are a more powerful notation than regular expressions.
• Every construct that can be described by a regular expression can be described by a grammar, but not vice versa.
• Alternatively, every regular language is a context-free language, but not vice versa.
For example, the regular expression (a|b)*abb and the grammar:

A0 → aA0 | bA0 | aA1
A1 → bA2
A2 → bA3
A3 → ε

describe the same language, the set of strings of a's and b's ending in abb.
• The grammar was constructed from the NFA for (a|b)*abb using the following construction:
1. For each state i of the NFA, create a nonterminal Ai.
2. If state i has a transition to state j on input a, add the production Ai → aAj. If state i goes to state j on input ε, add the production Ai → Aj.
3. If i is an accepting state, add Ai → ε.
4. If i is the start state, make Ai be the start symbol of the grammar.
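The four steps of the construction can be sketched in code. The function nfa_to_grammar and the encoding of the NFA as a dictionary from (state, symbol) to a set of target states are assumptions made for this illustration, with None standing for an ε-transition. The NFA below is the usual four-state automaton for (a|b)*abb, which is assumed here because the slide's figure is not available.

```python
def nfa_to_grammar(transitions, start, accepting):
    """Build grammar productions from an NFA.
    transitions: dict mapping (state, symbol) to a set of target states;
                 symbol None denotes an ε-transition.
    Returns (start_symbol, productions); an empty body stands for ε."""
    # Step 1: one nonterminal Ai per state i.
    states = {start} | set(accepting)
    for (state, _symbol), targets in transitions.items():
        states.add(state)
        states |= set(targets)
    productions = {f"A{s}": [] for s in sorted(states)}

    # Step 2: Ai → a Aj for a transition on a, Ai → Aj for an ε-transition.
    for (state, symbol), targets in transitions.items():
        for target in sorted(targets):
            body = [f"A{target}"] if symbol is None else [symbol, f"A{target}"]
            productions[f"A{state}"].append(body)

    # Step 3: Ai → ε for every accepting state i.
    for state in sorted(accepting):
        productions[f"A{state}"].append([])

    # Step 4: the nonterminal of the start state is the start symbol.
    return f"A{start}", productions


# Assumed NFA for (a|b)*abb: state 0 is the start state, state 3 accepts.
NFA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}

start_symbol, productions = nfa_to_grammar(NFA, start=0, accepting={3})
for head, bodies in productions.items():
    print(head, "→", " | ".join(" ".join(body) or "ε" for body in bodies))
```

Up to the order of the alternatives, the printed productions match the grammar A0 → aA0 | bA0 | aA1, A1 → bA2, A2 → bA3, A3 → ε.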
On the other hand, the language L = {a^n b^n | n ≥ 1}, with an equal number of a's and b's, is a prototypical example of a language that can be described by a grammar but not by a regular expression.
