Chapter 5
Syntax-Directed
Translation
This chapter develops the theme of Section 2.3: the translation of languages
guided by context-free grammars. The translation techniques in this chapter
will be applied in Chapter 6 to type checking and intermediate-code generation.
The techniques are also useful for implementing little languages for specialized
tasks; this chapter includes an example from typesetting.
We associate information with a language construct by attaching attributes
to the grammar symbol(s) representing the construct, as discussed in Sec-
tion 2.3.2. A syntax-directed definition specifies the values of attributes by
associating semantic rules with the grammar productions. For example, an
infix-to-postfix translator might have a production and rule
This production has two nonterminals, E and T; the subscript in El distin-
guishes the occurrence of E in the production body from the occurrence of E
as the head. Both E and T have a string-valued attribute code. The semantic
rule specifies that the string E. code is formed by concatenating El. code, T.code,
and the character ' + I . While the rule makes it explicit that the translation of
E is built up from the translations of E l , T, and I + ' , it may be inefficient to
implement the translation directly by manipulating strings.
From Section 2.3.5, a syntax-directed translation scheme embeds program
fragments called semantic actions within production bodies, as in
E -+ El + T { print '+I }
By convention, semantic actions are enclosed within curly braces. (If curly
braces occur as grammar symbols, we enclose them within single quotes, as in
304 CHAPTER 5. SYNTAX-DIRECTED TRANSLATION
I { ' and I } ' . ) The position of a semantic action in a production body determines
the order in which the action is executed. In production (5.2), the action
occurs at the end, after all the grammar symbols; in general, semantic actions
may occur at any position in a production body.
Between the two notations, syntax-directed definitions can be more readable,
and hence more useful for specifications. However, translation schemes can be
more efficient, and hence more useful for implementations.
The most general approach to syntax-directed translation is to construct a
parse tree or a syntax tree, and then to compute the values of attributes at the
nodes of the tree by visiting the nodes of the tree. In many cases, translation
can be done during parsing, without building an explicit tree. We shall therefore
study a class of syntax-directed translations called "L-attributed translations"
(L for left-to-right), which encompass virtually all translations that can be
performed during parsing. We also study a smaller class, called "S-attributed
translations" (S for synthesized), which can be performed easily in connection
with a bottom-up parse.
5.1 Syntax-Directed Definitions
A syntax-directed definition (SDD) is a context-free grammar together with,
attributes and rules. Attributes are associated with grammar symbols and rules
are associated with productions. If X is a symbol and a is one of its attributes,
then we write X.a to denote the value of a at a particular parse-tree node
labeled X. If we implement the nodes of the parse tree by records or objects,
then the attributes of X can be implemented by data fields in the records that
represent the nodes for X. Attributes may be of any kind: numbers, types, table
references, or strings, for instance. The strings may even be long sequences of
code, say code in the intermediate language used by a compiler.
5.1.1 Inherited and Synthesized Attributes
We shall deal with two kinds of attributes for nonterminals:
1. A synthesized attribute for a nonterminal A at a parse-tree node N is
defined by a semantic rule associated with the production at N. Note
that the production must have A as its head. A synthesized attribute at
node N is defined only in terms of attribute values at the children of N
and at N itself.
2. An inherited attribute for a nonterminal B at a parse-tree node N is
defined by a semantic rule associated with the production at the parent
of N. Note that the production must have B as a symbol in its body. An
inherited attribute at node N is defined only in terms of attribute values
at N's parent, N itself, and N's siblings.
5.1. SYNTAX-DIRECTED DEFINITIONS 305
An Alternative Definition of Inherited Attributes
No additional translations are enabled if we allow an inherited attribute
B.c at a node N to be defined in terms of attribute values at the children
of N , as well as at N itself, at its parent, and at its siblings. Such rules can
be "simulated" by creating additional attributes of B , say B.cl ,B.c2, . . . .
These are synthesized attributes that copy the needed attributes of the
children of the node labeled B. We then compute B.c as an inherited
attribute, using the attributes B.cl, B.cz,.. . in place of attributes at the
children. Such attributes are rarely needed in practice.
While we do not allow an inherited attribute at node N to be defined in terms of
attribute values at the children of node N , we do allow a synthesized attribute
at node N to be defined in terms of inherited attribute values at node N itself.
Terminals can have synthesized attributes, but not inherited attributes. At-
tributes for terminals have lexical values that are supplied by the lexical ana-
lyzer; there are no semantic rules in the SDD itself for computing the value of
an attribute for a terminal.
Example 5.1 : The SDD in Fig. 5.1 is based on our familiar grammar for
arithmetic expressions with operators + and *. It evaluates expressions termi-
nated by an endmarker n. In the SDD, each of the nonterminals has a single
synthesized attribute, called val. We also suppose that the terminal digit has
a synthesized attribute lexval, which is an integer value returned by the lexical
analyzer.
PRODUCTION SEMANTICRULES
1) L + E n L.val = E.val
2) E + E l T + E.val=E1.val+T.val
3) E + T E.val = T.val
4) T + T l * F T.val=Tl.vaExF.val
5) T + F T.val = F.val
6) F + ( E ) F.val = E.val
7 ) F + digit F. val = digit .lexval
Figure 5.1: Syntax-directed definition of a simple desk calculator
The rule for production 1, L -+ E n , sets L.val to E.va1, which we shall see
is the numerical value of the entire expression.
Production 2, E -+ El +T , also has one rule, which computes the val
attribute for the head E as the sum of the values at El and T . At any parse-
306 CHAPTER 5. SYNTAX-DIRECTED TRANSLATION
tree node N labeled E , the value of val for E is the sum of the values of val at
the children of node N labeled E and T.
Production 3, E + T , has a single rule that defines the value of val for E
to be the same as the value of val at the child for T. Production 4 is similar to
the second production; its rule multiplies the values at the children instead of
adding them. The rules for productions 5 and 6 copy values at a child, like that
for the third production. Production 7 gives F.val the value of a digit, that is,
the numerical value of the token digit that the lexical analyzer returned.
An SDD that involves only synthesized attributes is called S-attributed; the
SDD in Fig. 5.1 has this property. In an S-attributed SDD, each rule computes
an attribute for the nonterminal at the head of a production from attributes
taken from the body of the production.
For simplicity, the examples in this section have semantic rules without
side effects. In practice, it is convenient to allow SDD's to have limited side
effects, such as printing the result computed by a desk calculator or interacting
with a symbol table. Once the order of evaluation of attributes is discussed
in Section 5.2, we shall allow semantic rules to compute arbitrary functions,
possibly involving side effects.
An S-attributed SDD can be implemented naturally in conjunction with an
LR parser. In fact, the SDD in Fig. 5.1 mirrors the Yacc program of Fig. 4.58,
which illustrates translation during LR parsing. The difference is that, in the
rule for production 1, the Yacc program prints the value E.val as a side effect,
instead of defining the attribute L.va1.
An SDD without side effects is sometimes called an attribute grammar. The
rules in an attribute grammar define the value of an attribute purely in terms
of the values of other attributes and constants.
5.1.2 Evaluating an SDD at the Nodes of a Parse Tree
To visualize the translation specified by an SDD, it helps to work with parse
trees, even though a translator need not actually build a parse tree. Imagine
therefore that the rules of an SDD are applied by first constructing a parse tree
and then using the rules to evaluate all of the attributes at each of the nodes
of the parse tree. A parse tree, showing the value(s) of its attribute(s) is called
an annotated parse tree.
How do we construct an annotated parse tree? In what order do we evaluate
attributes? Before we can evaluate an attribute at a node of a parse tree, we
must evaluate all the attributes upon which its value depends. For example,
if all attributes are synthesized, as in Example 5.1, then we must evaluate the
ual attributes at all of the children of a node before we can evaluate the val
attribute at the node itself.
With synthesized attributes, we can evaluate attributes in any bottom-up
order, such as that of a postorder traversal of the parse tree; the evaluation of
S-attributed definitions is discussed in Section 5.2.3.
5.1. SYNTAX-DIRECTED DEFINITIONS 307
For SDD's with both inherited and synthesized attributes, there is no guar-
antee that there is even one order in which to evaluate attributes at nodes.
For instance, consider nonterminals A and B, with synthesized and inherited
attributes A.s and B.i, respectively, along with the production and rules
These rules are circular; it is impossible to evaluate either A.s at a node N or B.i
at the child of N without first evaluating the other. The circular dependency
of A.s and B.i at some pair of nodes in a parse tree is suggested by Fig. 5.2.
Figure 5.2: The circular dependency of A.s and B.i on one another
It is computationally difficult to determine whether or not there exist any
circularities in any of the parse trees that a given SDD could have to translate.'
Fortunately, there are useful subclasses of SDD's that are sufficient to guarantee
that an order of evaluation exists, as we shall see in Section 5.2.
Example 5.2 : Figure 5.3 shows an annotated parse tree for the input string
+
3 * 5 4 n, constructed using the grammar and rules of Fig. 5.1. The values
of lexval are presumed supplied by the lexical analyzer. Each of the nodes for
the nonterminals has attribute val computed in a bottom-up order, and we see
the resulting values associated with each node. For instance, at the node with
a child labeled *, after computing T.val= 3 and F.val = 5 at its first and third
children, we apply the rule that says T.val is the product of these two values,
or 15.
Inherited attributes are useful when the structure of a parse tree does not
"match" the abstract syntax of the source code. The next example shows how
inherited attributes can be used to overcome such a mismatch due to a grammar
designed for parsing rat her than translation.
'without going into details, while the problem is decidable, it cannot be solved by a
polynomial-time algorithm, even if F = N'P, since it has exponential time complexity.
CHAPTER 5. SYNTAX-DIRECTED TRANSLATION
I I
2aj=\
F.val = 4
T.va1 = 3
I
F.val= 5 digit.lexval= 4
I
F.val = 3
I
digit. lexval = 5
I
digit. lexval = 3
Figure 5.3: Annotated parse tree for 3 * 5 +4 n
Example 5.3 : The SDD in Fig. 5.4 computes terms like 3 * 5 and 3 * 5 * 7.
The top-down parse of input 3 * 5 begins with the production T + F T'. Here,
F generates the digit 3, but the operator * is generated by TI. Thus, the left
operand 3 appears in a different subtree of the parse tree from *. An inherited
attribute will therefore be used to pass the operand to the operator.
The grammar in this example is an excerpt from a non-left-recursive version
of the familiar expression grammar; we used such a grammar as a running
example to illustrate top-down parsing in Section 4.4.
1) T+FT1 TI.inh = F.val
T.val = T1.syn
4) F -+ digit I F.val = digit .lexval
Figure 5.4: An SDD based on a grammar suitable for top-down parsing
Each of the nonterminals T and F has a synthesized attribute val; the
terminal digit has a synthesized attribute lexval. The nonterminal T' has two
attributes: an inherited attribute inh and a synthesized attribute syn.