
pl10ch3

Chapter 3 discusses the syntax and semantics of programming languages, defining syntax as the structure of expressions and semantics as their meaning. It covers formal methods for describing syntax, including context-free grammars and Backus-Naur Form (BNF), as well as the role of attribute grammars in specifying static semantics. The chapter concludes with a summary of the equivalence of BNF and context-free grammars, and highlights three primary methods for describing semantics.

Uploaded by

vysl.genc01

Chapter 3

Describing Syntax and Semantics
Chapter 3 Topics

• Introduction
• The General Problem of Describing Syntax
• Formal Methods of Describing Syntax
• Attribute Grammars
• Describing the Meanings of Programs: Dynamic Semantics

Copyright © 2012 Addison-Wesley. All rights reserved.


Introduction

• Syntax: the form or structure of the expressions, statements, and program units
• Semantics: the meaning of the expressions, statements, and program units
• Syntax and semantics provide a language’s definition
– Users of a language definition
• Other language designers
• Programmers (the users of the language)



The General Problem of Describing Syntax: Terminology

• A sentence is a string of characters over some alphabet

• A language is a set of sentences

• A lexeme is the lowest level syntactic unit of a language (e.g., *, sum, begin)

• A token is a category of lexemes (e.g., identifier)
The General Problem of Describing Syntax: Example

index = 2 * count + 17;

The lexemes and tokens of this statement are

Lexemes    Tokens
index      identifier
=          equal_sign
2          int_literal
*          mult_op
count      identifier
+          plus_op
17         int_literal
;          semicolon
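
The lexeme-to-token mapping in the table can be sketched as a small scanner. This is only an illustration: the token names come from the table, while the regular expressions and the function name tokenize are assumptions.

```python
import re

# Token names are taken from the table; the patterns are assumptions.
TOKEN_SPEC = [
    ("int_literal", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("equal_sign", r"="),
    ("mult_op", r"\*"),
    ("plus_op", r"\+"),
    ("semicolon", r";"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (lexeme, token) pairs; raise on an unknown character."""
    pairs = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():          # whitespace separates lexemes
            pos += 1
            continue
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        # lastgroup is the name of the alternative that matched
        pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs


pairs = tokenize("index = 2 * count + 17;")
for lexeme, token in pairs:
    print(f"{lexeme:<8}{token}")
```

Each lexeme is the matched text and each token is the category it falls into, so several lexemes (index, count) share one token (identifier).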
Formal Definition of Languages

• Recognizers
– A recognition device reads input strings over the alphabet of the language and decides whether the input strings belong to the language
– Example: the syntax analysis part of a compiler
– Detailed discussion of syntax analysis appears in Chapter 4

• Generators
– A device that generates sentences of a language
– One can determine whether a particular sentence is syntactically correct by comparing it to the structure of the generator (a grammar, such as one written in BNF)



BNF and Context-Free Grammars

• Context-Free Grammars
– Developed by Noam Chomsky in the mid-1950s
– Language generators (classes), meant to describe the syntax of natural languages
– Define a class of languages called context-free languages

• Backus-Naur Form (1959)
– Invented by John Backus to describe the syntax of Algol 58
– BNF is equivalent to context-free grammars



BNF Fundamentals

• In BNF, abstractions are used to represent classes of syntactic structures: they act like syntactic variables (also called nonterminal symbols, or just nonterminals)

• Terminals are lexemes or tokens

• A rule has a left-hand side (LHS), which is a nonterminal, and a right-hand side (RHS), which is a string of terminals and/or nonterminals

<assign> → <var> = <expression>

A sentence described by this rule:

total = subtotal1 + subtotal2
BNF Fundamentals (continued)

• Nonterminals are often enclosed in angle brackets

– Examples of BNF rules:

<ident_list> → identifier | identifier, <ident_list>
<if_stmt> → if <logic_expr> then <stmt>

• Grammar: a finite non-empty set of rules

• A start symbol is a special element of the nonterminals of a grammar



C and Python Code Examples

Copyright © 2012 Pearson Education. All rights reserved. 1-10


Describing Lists

• Syntactic lists are described using recursion

<ident_list> → ident
| ident, <ident_list>

• A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols)
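
The recursive list rule maps directly onto a recursive function. The sketch below is a minimal recognizer, assuming the input is already split into a list of strings and that Python's str.isidentifier is an acceptable stand-in for the ident terminal; the name is_ident_list is hypothetical.

```python
def is_ident_list(tokens):
    """Recognize <ident_list> -> ident | ident , <ident_list>.

    Assumption: tokens is a list of strings, and str.isidentifier()
    approximates the 'ident' terminal.
    """
    if not tokens or not tokens[0].isidentifier():
        return False
    if len(tokens) == 1:                  # first alternative: a single ident
        return True
    # second alternative: ident , <ident_list> (the recursive case)
    return tokens[1] == "," and is_ident_list(tokens[2:])


print(is_ident_list(["a", ",", "b", ",", "c"]))
print(is_ident_list(["a", ","]))
```

The function calls itself exactly where the rule mentions <ident_list> on its right-hand side, which is how recursion in the grammar describes lists of any length.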



Example Grammar 1

<program> → <stmts>
<stmts> → <stmt> | <stmt> ; <stmts>
<stmt> → <var> = <expr>
<var> → a | b | c | d
<expr> → <term> + <term> | <term> - <term>
<term> → <var> | const



Example Derivation 1

<program> => <stmts>
=> <stmt>
=> <var> = <expr>
=> a = <expr>
=> a = <term> + <term>
=> a = <var> + <term>
=> a = b + <term>
=> a = b + const
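
The derivation above always rewrites the leftmost nonterminal. A minimal sketch of that process follows, assuming Example Grammar 1 is stored as a dictionary and that each step is given as the index of the alternative to apply; GRAMMAR and leftmost_derivation are hypothetical names.

```python
# Example Grammar 1, with each right-hand side as a list of symbols.
GRAMMAR = {
    "<program>": [["<stmts>"]],
    "<stmts>": [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>": [["<var>", "=", "<expr>"]],
    "<var>": [["a"], ["b"], ["c"], ["d"]],
    "<expr>": [["<term>", "+", "<term>"], ["<term>", "-", "<term>"]],
    "<term>": [["<var>"], ["const"]],
}


def leftmost_derivation(choices, start="<program>"):
    """Apply one rule per choice to the leftmost nonterminal.

    Each entry of choices is the index of the alternative to use.
    Returns every sentential form, starting with the start symbol.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # position of the leftmost nonterminal in the current form
        i = next(k for k, sym in enumerate(form) if sym in GRAMMAR)
        form[i:i + 1] = GRAMMAR[form[i]][choice]
        steps.append(" ".join(form))
    return steps


steps = leftmost_derivation([0, 0, 0, 0, 0, 0, 1, 1])
print("\n=> ".join(steps))
```

With these choices the function reproduces the sentential forms of Example Derivation 1, ending in the sentence a = b + const.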



Example Grammar 2

<program> → begin <stmt_list> end
<stmt_list> → <stmt>
| <stmt> ; <stmt_list>
<stmt> → <var> = <expression>
<var> → A | B | C
<expression> → <var> + <var>
| <var> - <var>
| <var>



Example Derivation 2

<program> => begin <stmt_list> end
=> begin <stmt> ; <stmt_list> end
=> begin <var> = <expression> ; <stmt_list> end
=> begin A = <expression> ; <stmt_list> end
=> begin A = <var> + <var> ; <stmt_list> end
=> begin A = B + <var> ; <stmt_list> end
=> begin A = B + C ; <stmt_list> end
=> begin A = B + C ; <stmt> end
=> begin A = B + C ; <var> = <expression> end
=> begin A = B + C ; B = <expression> end
=> begin A = B + C ; B = <var> end
=> begin A = B + C ; B = C end
Parse Tree

• A hierarchical representation of a derivation

<program>
  <stmts>
    <stmt>
      <var>
        a
      =
      <expr>
        <term>
          <var>
            b
        +
        <term>
          const
An Example Parse Tree

[Figure: an example grammar and the hierarchical representation of a derivation]



Ambiguity in Grammars

• A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees



An Ambiguous Expression Grammar

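<expr> → <expr> <op> <expr> | const
<op> → / | -

Two distinct parse trees for const - const / const:

Tree 1, grouping (const - const) / const:

<expr>
  <expr>
    <expr>
      const
    <op>
      -
    <expr>
      const
  <op>
    /
  <expr>
    const

Tree 2, grouping const - (const / const):

<expr>
  <expr>
    const
  <op>
    -
  <expr>
    <expr>
      const
    <op>
      /
    <expr>
      const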
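
One way to check the ambiguity of this grammar is to count the parse trees of a sentence. The sketch below assumes the sentence is given as a list of tokens and counts the trees by trying every operator as the top-level split; count_parse_trees is a hypothetical name, and this is an illustration rather than a general ambiguity test.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees under <expr> -> <expr> <op> <expr> | const,
    with <op> -> / | -."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways tokens[i:j] can be derived from <expr>.
        if j - i == 1:
            return 1 if tokens[i] == "const" else 0
        total = 0
        # Try each operator as the <op> of the topmost rule application.
        for k in range(i + 1, j - 1):
            if tokens[k] in ("/", "-"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["const", "-", "const", "/", "const"]))
```

A result greater than 1 means the sentence has more than one parse tree, so the grammar is ambiguous. For the sentence on this slide the count is 2.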



Ambiguity in Grammars contd.

A=B+C*A



Associativity of Operators

• Operator associativity can also be indicated by a grammar

<expr> → <expr> + <expr> | const (ambiguous)
<expr> → <expr> + const | const (unambiguous)

Parse tree for const + const + const under the unambiguous rule:

<expr>
  <expr>
    <expr>
      const
    +
    const
  +
  const
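
The left-recursive rule forces left associativity because each earlier expression becomes the left operand of the next +. A minimal sketch, assuming a token list as input and nested tuples as the tree representation (parse_left_assoc is a hypothetical name):

```python
def parse_left_assoc(tokens):
    """Build the single parse shape for <expr> -> <expr> + const | const.

    The tree is returned as nested tuples (left, "+", "const").
    """
    if not tokens or tokens[0] != "const":
        raise SyntaxError("expected const")
    tree = "const"
    pos = 1
    while pos < len(tokens):
        if (tokens[pos] != "+" or pos + 1 >= len(tokens)
                or tokens[pos + 1] != "const"):
            raise SyntaxError(f"unexpected input at position {pos}")
        # The expression built so far becomes the left operand.
        tree = (tree, "+", "const")
        pos += 2
    return tree


tree = parse_left_assoc(["const", "+", "const", "+", "const"])
print(tree)
```

The loop replaces the left recursion of the rule, a common implementation choice, while still producing the left-nested grouping that the grammar specifies.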
Extended BNF

• Optional parts are placed in brackets [ ]
<proc_call> → ident [(<expr_list>)]
• Alternative parts of RHSs are placed inside parentheses and separated via vertical bars
<term> → <term> (+|-) const
• Repetitions (0 or more) are placed inside braces { }
<ident> → letter {letter|digit}
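
The braces in the <ident> rule play the same role as the * operator in a regular expression. A minimal sketch, assuming letter and digit mean ASCII letters and digits (the slide does not define them):

```python
import re

# <ident> -> letter {letter|digit}
# Assumption: letters and digits are restricted to ASCII.
IDENT = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_ident(text):
    """Return True if the whole string matches the <ident> rule."""
    return IDENT.fullmatch(text) is not None


print(is_ident("sum1"), is_ident("1sum"))
```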



BNF and EBNF
• BNF
<expr> → <expr> + <term>
| <expr> - <term>
| <term>
<term> → <term> * <factor>
| <term> / <factor>
| <factor>
• EBNF
<expr> → <term> {(+ | -) <term>}
<term> → <factor> {(* | /) <factor>}
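
The EBNF form translates almost line for line into a recursive-descent parser: each { } repetition becomes a while loop. The sketch below evaluates expressions while parsing them. It assumes <factor> is an unsigned integer literal, which the slide leaves undefined, and the name evaluate is hypothetical.

```python
import re


def evaluate(source):
    """Parse and evaluate source using the EBNF rules for <expr> and <term>.

    Assumption: <factor> is an unsigned integer literal.
    Note: characters other than digits and + - * / are silently skipped
    by the simple scanner below.
    """
    tokens = re.findall(r"\d+|[-+*/]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def factor():
        nonlocal pos
        tok = peek()
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"expected a number, got {tok!r}")
        pos += 1
        return int(tok)

    def term():
        # <term> -> <factor> {(* | /) <factor>}
        nonlocal pos
        value = factor()
        while peek() in ("*", "/"):
            op = tokens[pos]
            pos += 1
            rhs = factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def expr():
        # <expr> -> <term> {(+ | -) <term>}
        nonlocal pos
        value = term()
        while peek() in ("+", "-"):
            op = tokens[pos]
            pos += 1
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result


print(evaluate("2 + 3 * 4"))
print(evaluate("10 - 4 - 3"))
```

Because <term> is parsed inside <expr>, multiplication and division bind tighter than addition and subtraction, and each loop accumulates from the left, which matches the left associativity of the BNF version.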



Static Semantics

• Context-free grammars (CFGs) cannot describe all of the syntax of programming languages
• Categories of constructs that are trouble:
- Context-free, but cumbersome (e.g., types of operands in expressions)
- Non-context-free (e.g., variables must be declared before they are used)



Attribute Grammars

• Attribute grammars (AGs) have additions to CFGs to carry some semantic info on parse tree nodes

• Primary value of AGs:
– Static semantics specification
– Compiler design (static semantics checking)



Attribute Grammars: Definition

• Def: An attribute grammar is a context-free grammar G = (S, N, T, P) with the following additions:
– For each grammar symbol x there is a set A(x) of attribute values
– Each rule has a set of functions that define certain attributes of the nonterminals in the rule
– Each rule has a (possibly empty) set of predicates to check for attribute consistency



Attribute Grammars: Definition

• Let X0 → X1 ... Xn be a rule
• Functions of the form S(X0) = f(A(X1), ... , A(Xn)) define synthesized attributes
• Functions of the form I(Xj) = f(A(X0), ... , A(Xn)), for 1 <= j <= n, define inherited attributes
• Initially, there are intrinsic attributes on the leaves



Attribute Grammars: An Example

• Syntax
<assign> → <var> = <expr>
<expr> → <var> + <var> | <var>
<var> → A | B | C
• actual_type: synthesized for <var> and <expr>
• expected_type: inherited for <expr>



Attribute Grammar (continued)

• Syntax rule: <expr> → <var>[1] + <var>[2]
Semantic rules:
<expr>.actual_type ← <var>[1].actual_type
Predicate:
<var>[1].actual_type == <var>[2].actual_type
<expr>.expected_type == <expr>.actual_type

• Syntax rule: <var> → id
Semantic rule:
<var>.actual_type ← lookup (<var>.string)



Attribute Grammars (continued)

• How are attribute values computed?
– If all attributes were inherited, the tree could be decorated in top-down order.
– If all attributes were synthesized, the tree could be decorated in bottom-up order.
– In many cases, both kinds of attributes are used, and it is some combination of top-down and bottom-up that must be used.



Attribute Grammars (continued)

<expr>.expected_type ← inherited from parent

<var>[1].actual_type ← lookup (A)
<var>[2].actual_type ← lookup (B)
<var>[1].actual_type =? <var>[2].actual_type

<expr>.actual_type ← <var>[1].actual_type
<expr>.actual_type =? <expr>.expected_type
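
The decoration order above can be sketched as a small type checker. The symbol table, its types, and the function names are assumptions for illustration. In particular, the slide only says that expected_type is inherited from the parent; the sketch assumes the parent supplies the type of the assignment target.

```python
# Hypothetical symbol table standing in for lookup(); the types are assumptions.
SYMBOL_TABLE = {"A": "int", "B": "int", "C": "real"}


def lookup(name):
    """Return the declared type of a variable name."""
    return SYMBOL_TABLE[name]


def check_assign(target, left, right):
    """Decorate target = left + right and evaluate both predicates."""
    # Inherited attribute (assumption: the parent passes the target's type).
    expected_type = lookup(target)
    # Synthesized attributes on the <var> leaves.
    var1_type = lookup(left)
    var2_type = lookup(right)
    # Predicate: <var>[1].actual_type == <var>[2].actual_type
    if var1_type != var2_type:
        return False
    # Semantic rule: <expr>.actual_type <- <var>[1].actual_type
    actual_type = var1_type
    # Predicate: <expr>.expected_type == <expr>.actual_type
    return actual_type == expected_type


print(check_assign("A", "A", "B"))
print(check_assign("A", "B", "C"))
```

The inherited attribute flows down from the parent, the synthesized ones flow up from the leaves, and the predicates are checked once both are available, which is the mixed top-down and bottom-up order described for attribute evaluation.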



Semantics

• There is no single widely acceptable notation or formalism for describing semantics
• Several needs for a methodology and notation for semantics:
– Programmers need to know what statements mean
– Compiler writers must know exactly what language constructs do
– Correctness proofs would be possible
– Designers could detect ambiguities and inconsistencies



Summary

• BNF and context-free grammars are equivalent meta-languages
– Well-suited for describing the syntax of programming languages
• An attribute grammar is a descriptive formalism that can describe both the syntax and the semantics of a language
• Three primary methods of semantics description
– Operational, axiomatic, denotational
