Module 3
SEMANTICS ANALYSIS
Semantic analysis
• To interleave semantic analysis with the syntax
analysis phase of the compiler, Syntax Directed
Translation is used
• Grammar + Semantic rule / Semantic Action
= SDT (syntax directed translation)
• Evaluation of the semantic rules may generate
code, save information in a symbol table, issue
error messages, Datatype checking or perform
any other activities.
Syntax-Directed Translation
• The Syntax-Directed translation of languages guided by
context-free grammars.
– Syntax-Directed Translation is to construct a
parse tree or syntax tree and compute the
values of attributes at the nodes of the tree by
visiting them in some order.
Syntax-directed definitions Vs Syntax-
directed translation scheme
• The Syntax-Directed translation can be done by
– Syntax-Directed Definition (SDD)
– Syntax Directed Translation Scheme(SDT)
• Syntax-directed definitions(SDD) can be more
readable, and hence more useful for
specifications.
• Syntax-directed Translation schemes(SDT)
can be more efficient, and hence more useful
for implementations.
Syntax-Directed Definition(SDD)
• A syntax-directed definition (SDD) is a context-free
grammar together with attributes and rules.
– Each grammar symbol has attribute to hold information
– Attributes may be of any kind: numbers, types, table
references, or strings.
• A Syntax-Directed Definition specifies the values of
attributes by associating semantic rules with the
grammar productions.
• In short, Attributes are associated with grammar symbols
and rules are associated with productions.
Syntax-Directed Definitions(SDD)
• If X is a symbol and ‘a’ is one of its attributes, then
we write X.a to denote the value of ‘a’ at a
particular parse-tree node labeled X.
• Attributes may be of any kind: numbers, types,
code, table references, or strings.
• The semantic rule specifies that the string E. code
is formed by concatenating El. code, T. code, and
the character ' + ‘
Syntax-Directed Translation Scheme(SDT)
• A Syntax-directed translation scheme embeds
program fragments called semantic actions
within production bodies
• Semantic actions are enclosed within curly
braces.
• The position of a semantic action in a production
body determines the order in which the action is
executed
Inherited and Synthesized Attributes
• A synthesized attribute for a nonterminal A at a
parse-tree node N is defined by a semantic rule
associated with the production at N. A synthesized
attribute at node N is defined only in terms of
attribute values at the children of N and at N itself.
• An inherited attribute for a nonterminal B at a
parse-tree node N is defined by a semantic rule
associated with the production at the parent of N.
Note that the production must have B as a symbol
in its body. An inherited attribute at node N is
defined only in terms of attribute values at N's
parent, N itself, and N's siblings.
Inherited and Synthesized
Attributes (Examples)
• Synthesized Attributes
• Inherited Attributes
S-Attributed & L-Attributed Definitions
• The two classes can be implemented efficiently
in connection with top-down or bottom-up
parsing.
• The first class is defined as follows:
• An SDD is S-attributed if every attribute is
synthesized.
• The second class of SDD's is called L-attributed
definitions.
• A L-attributed definition consist of synthesized
and inherited attributes
S-attributed SDD
• An SDD that involves only synthesized
attributes is called S-attributed;
• In an S-attributed SDD, each rule computes an
attribute for the nonterminal at the head of a
production from attributes taken from the body
of the production.
• An S-attributed SDD can be implemented
naturally in conjunction with an LR parser
• Semantic rules allow SDD to print the result
computed by a desk calculator or interacting
with a symbol table
S-Attributed Definitions
• When an SDD is S-attributed, we can evaluate its
attributes in the bottomup order of the nodes of the
parse tree.
• Evaluate the attributes by performing a postorder
traversal of the parse tree and evaluating the attributes
at a node N when the traversal leaves N for the last
time.
• S-attributed definitions can be implemented during
bottom-up parsing, since a bottom-up parse
corresponds to a postorder traversal.
SDD with Synthesized Attributes
• In the SDD, each of the nonterminals has a single
synthesized attribute, called ‘val’. And the terminal
digit has a synthesized attribute lexval, which is an
integer value returned by the lexical analyzer.
Syntax-directed
definition of a
simple desk
calculator
ANNOTATED PARSE TREE
• A parse tree, showing the value(s) of its
attribute(s) is called an annotated parse tree.
• First the parse tree is Constructed
• Then SDD rules are used to evaluate all of the
attributes at each of the nodes of the parse
tree
How to evaluate attributes
• Before evaluating an attribute at a node of a
parse tree, first evaluate all the attributes upon
which its value depends.
• Hence dependency between attributes of the
grammar symbol should be assessed first.
• If all attributes are synthesized, then evaluate
the ‘val’ attributes at all of the children of a
node before evaluating the ‘val’ attribute at the
node itself.
• With synthesized attributes, attributes are
evaluated in bottom-up order
Annotated Parse Tree - Synthesized
Attributes
• An annotated parse tree for the input string
3 * 5 + 4 n, constructed using the grammar and
rules. The values of lexval are supplied by the lexical
analyzer. Each of the nodes for the nonterminals
has attribute ‘val’ computed in a bottom-up order
L-Attributed Definitions
Between the attributes associated with a production body (i.e
R.H.S of the Production), dependency-graph edges can go from
left to right, but not from right to left. Each attribute can be
Its value is calculated from the attributes of the right
side of its production
• The first of these rules defines the inherited
attribute T’.inh using only [Link], and F appears to
the left of T’ in the production body, as required.
The second rule defines T’[Link] using the inherited
attribute T’.inh associated with the head, and [Link],
where F appears to the left of T1' in the production
body.
• In each of these cases, the rules use information
"from above or from the left ," as required by the
class
• Any SDD containing the above productions and
rules cannot be a L-attributed:
• The first rule, A.s = B.b, is a legitimate rule of
either a S-attributed or L-attributed SDD.
• The second rule defines an inherited attribute
B.i, so the entire SDD cannot be S-attributed.
• The SDD cannot be L-attributed also, because
the attribute C.c is used to define B.i, and also
Non-terminal C is right of B, in the production
body.
An SDD with Synthesized Attributes &
Inherited Attributes ( L-Attributed SDD)
A Top Down Parser needs both Synthesized Attributes
& Inherited Attributes
Annotated Parse Tree - Synthesized Attributes &
Inherited Attributes
• Each of the nonterminals T and F has a synthesized
attribute ‘val’; the terminal digit has a synthesized
attribute ‘lexval’. The nonterminal T' has two attributes:
an inherited attribute ‘inh’ and a synthesized attribute
‘syn’
• The semantic rules are based on the idea that the left
operand of the operator * is inherited. That is, the head
T' of the production T’ * F T1’ inherits the left operand
of * in the production body. Given a term x * y * z, the
root of the subtree for * y * z inherits x. Then, the root of
the subtree for * z inherits the value of x * y, and so on,
Once all the factors have been accumulated, the result is
passed back up the tree using synthesized attributes.
Annotated parse tree for 3 * 5
• The leftmost leaf in the parse tree, labeled digit,
has attribute value lexval = 3, where the 3 is
supplied by the lexical analyzer. Its parent is
production 4, (ie) F digit. The only semantic
rule associated with this production is
F. val = digit. lexval, which equals 3.
• In the second child of the root, the inherited
attribute [Link] is defined by the semantic rule
[Link] = [Link] which is associated with production
1. Thus, the left operand, 3, for the * operator is
passed from left to right across the children of the
root.
• The production at the node for T’ is T’ * FT’1.
The inherited attribute T’[Link] is defined by the
semantic rule T’[Link] = T’.inh x [Link] which is
associated with production 2.
• With T’.inh = 3 and [Link] = 5, we get T’[Link] = 15.
• At the lower node for T1, the production is T’ .
• The semantic rule T’.syn = T’.inh defines T’.syn =
15. The syn attributes at the nodes for T' pass the
value 15 up the tree to the node for T, where [Link]
= 15.
Evaluation Orders for SDD's
• "Dependency graphs" are a useful tool for
determining an evaluation order for the
attribute instances in a given parse tree.
• An edge from one attribute instance to
another means that the value of the first is
needed to compute the second
RULES FOR Dependency graph
• The dependency graph has a node for each
attribute associated with X .
• Suppose that a semantic rule associated with a
production p (AX) defines the value of
synthesized attribute A.b in terms of the value of
X.c. Then, the dependency graph has an edge from
X.c to A.b.
• Suppose that a semantic rule associated with a
production p (X B) defines the value of inherited
attribute B.c in terms of the value of X.a. Then, the
dependency graph has an edge from X.a to B.c.
The X could be a parent for B or a sibling for B
RULES for Dependency graph(contd.)
Ordering the Evaluation of Attributes
• The dependency graph characterizes the
possible orders in which we can evaluate the
attributes at the various nodes of a parse tree.
• If the dependency graph has an edge from
node M to node N, then the attribute
corresponding to M must be evaluated before
the attribute of N.
Topological Sorting
• Allowable orders of evaluation are those
sequences of nodes Nl, N2,. . . , Nk such that if
there is an edge of the dependency graph from
Ni to Nj; then i < j.
• Such an ordering embeds a directed graph into a
linear order, and is called a topological sort of
the graph.
• If there is any cycle in the graph, then there are
no topological sorts; that is, there is no way to
evaluate the SDD on this parse tree.
Ordering the Evaluation using Topological
sorting
• If there are no cycles, then there is always at
least one topological sort
• Since there are no cycles, we can surely find a
node with no edge entering.
• Make this node the first in the topological
order, remove it from the dependency graph,
and repeat the process on the remaining
nodes.
Topological sort
1, 2, 3, 4, 5, 6, 7, 8, 9
Topological sort
1, 3, 5, 2, 4, 6, 7, 8, 9
Semantic Rules with Side Effects
Translations involve side effects:
• A desk calculator might print a result
• A code generator might enter the type of an
identifier into a symbol table.
Side effects - Entering the Data type of an
identifier into a symbol table.
• A simple declaration D consisting of a basic
type T followed by a list L of identifiers.
• T can be int or float.
• For each identifier on the list, the Data type is
entered into the symbol-table entry for the
identifier.
Syntax-directed definition for simple type
declarations
Non-terminal D represents a declaration, which, from
production 1, consists of a type T followed by a list L
of identifiers.
‘T’ has one attribute, [Link], which is the datatype in
the declaration D.
• Non-terminal L also has one attribute, ‘inh’ to
emphasize that it is an inherited attribute.
• The purpose of [Link] is to pass the declared
datatype down the list of identifiers, so that it can
be added to the appropriate symbol-table entries.
• Productions 2 and 3 each evaluate the synthesized
attribute [Link]
• Production 4 passes [Link] down the parse tree.
• A function addType has two arguments
1. [Link], a lexical value that points to a symbol-
table object, and
2. [Link], the type being assigned to every identifier
on the list.
A dependency graph for the input string
float id1, id2, id3
• Numbers 1 through 10 represent the nodes of
the dependency graph.
• Nodes 1, 2, and 3 represent the attribute entry
associated with each of the leaves labeled id.
• Nodes 6, 8, and 10 are the dummy attributes
that represent the application of the function
addType to a type and one of these entry values.
• Node 4 represents the attribute T. type, and is
actually where attribute evaluation begins.
• This type is then passed to nodes 5, 7, and 9
representing [Link] associated with each of the
occurrences of the nonterminal L.
Applications of Syntax-Directed Translation
Application : - construction of syntax trees. Since
some compilers use syntax trees as an
intermediate representation
• To complete the translation to intermediate code,
the compiler may then walk the syntax tree, using
another set of rules.
• Two SDD's used for constructing syntax trees for
expressions.
• The first, an S-attributed definition, is suitable for
use during bottom-up parsing.
• The second, L-attributed, is suitable for use during
top-down parsing.
Construction of Syntax Trees
• Each node in a syntax tree represents a construct;
• The children of the node represent the meaningful
components of the construct
• The nodes of a syntax tree implemented by objects
with a suitable number of fields.
• Each object will have an op field that is the label of
the node with additional field
Additional fields:
• If the node is a leaf, an additional field holds the
lexical value for the leaf. A constructor function
Leaf ( op, val) creates a leaf object.
• If the node is an interior node, there are as many
additional fields as the node has children in the
syntax tree. A constructor function for Node takes
two or more arguments:
Node(op, c1, c2, . . . , ck)
• It creates an object with first field op and k
additional fields for the k children c1, . . . , ck.
Constructing syntax trees for simple
expressions (Bottom Up Parsing)
Every time the first production E E1 + T is used, its
rule creates a node with ‘+’ for op and two children,
[Link] and [Link], for the sub expressions.
The second production has a similar rule.
Constructing syntax trees for simple
expressions (Bottom Up Parsing)
• For production 3, E T, no node is created, since
[Link] is the same as [Link].
• Similarly, no node is created for production 4,
T ( E ). The value of [Link] is the same as [Link]
• The last two T-productions have a single terminal
on the right.
• So the constructor for Leaf is used to create a
suitable node, which becomes the value of [Link].
Constructing syntax trees for simple
expressions (Bottom Up Parsing)
• The nodes of the syntax tree are shown as
records, with the op field first.
• Syntax-tree edges are now shown as solid lines.
• The underlying parse tree, which need not
actually be constructed, is shown with dotted
edges.
• The third type of line, shown dashed line,
represents the values of [Link] and [Link];
each line points to the appropriate syntax-tree
node.
Syntax tree for a - 4 + c
Constructing syntax trees for simple
expressions (Bottom Up Parsing)
• At the bottom, leaves for a, 4 and c, constructed by
Leaf constructor function.
• The lexical value [Link] points into the symbol
table, and the lexical value [Link] is the
numerical value of a constant.
• These leaves, become the value of [Link] at the
three parse-tree nodes labeled T, according to rules
5 and 6.
• By rule 3, the pointer to the leaf for ‘a’ is also the
value of [Link] for the leftmost E in the parse tree.
Constructing syntax trees for simple
expressions (Bottom Up Parsing)
• Rule 2 is used to create a node with op equal to
the minus sign and pointers to the first two
leaves.
• Then, rule 1 produces the root node of the
syntax tree by combining the node for ‘ - ’ with
the third leaf node.
• The rules are evaluated during a postorder
traversal of the parse tree, or with reductions
during a bottom-up parse,
• Finally pointer p5 points to the root of the
constructed syntax tree.
Syntax tree for a - 4 + c
Constructing Syntax Tree –
Top down parsing
a - 4 +c
Syntax-Directed Translation (SDT) Schemes
• Syntax-directed translation schemes are a
complementary notation to syntax directed
definitions.
• All of the applications of syntax-directed can be
implemented using syntax-directed translation
schemes.
• A syntax-directed translation scheme (SDT) is a
context free grammar with program fragments
embedded within production bodies.
• The program fragments are called semantic actions
and can appear at any position within a production
body. curly braces placed around actions
Syntax-directed translation scheme
• Syntax-directed translation is to construct a
parse tree or a syntax tree, and then to compute
the values of attributes at the nodes of the tree
by visiting the nodes of the tree.
• A class of syntax-directed translations called "L-
attributed translations“ (L for left-to-right),
which encompass virtually all translations that
can be performed during parsing.
• A smaller class, called "S-attributed
translations" (S for synthesized), which can be
performed easily in connection with a bottom-up
parse.
Syntax-Directed Translation
(SDT)Schemes
• SDT is used to implement two important classes of
SDDs:
1. The underlying grammar is LR-parsable, and the
SDD is S-attributed.
2. The underlying grammar is LL-parsable, and the SDD
is L-attributed.
• The semantic rules in an SDD can be converted into
an SDT with actions that are executed at the right
time.
• During parsing, an action in a production body is
executed as soon as all the grammar symbols to the
left of the action have been matched/parsed
SDT for S-Attributed Definition -
Postfix Translation Schemes
• If SDD is S-attributed, then in SDT, each
action is placed at the end of the production
and is executed along with the reduction of
the body to the head of that production.
• Such SDT’s are called postfix SDT's.
SDT for S-Attributed Definition
SDD for S-Attributed Definition
SDT's for L-Attributed Definitions
• With any grammar, the technique below can be
implemented by attaching actions to a parse tree and
executing them during preorder traversal of the tree.
• The rules for turning an L-attributed SDD into an SDT are as
follows:
• Embed the Action that computes the inherited attributes
for a nonterminal A immediately before that occurrence
of A in the body of the production.
• If several inherited attributes for A depend on one another
in an acyclic fashion, order the evaluation of attributes so
that those needed first are computed first.
• Place the Actions that compute a synthesized attribute for
the head of a production at the end of the body of that
production.
SDT's With Actions Inside Productions
• An action may be placed at any position within the
body of a production.
• It is performed immediately after all symbols to its
left are processed.
• Thus, if we have a production
B X {a} Y,
• The action ‘a’ is done after we have recognized X
(if X is a terminal) or all the terminals derived
from X (if X is a nonterminal) .
SDT's With Actions Inside Productions
B X {a} Y,
• If the parse is bottom-up, then we perform
action ‘a’ as soon as this occurrence of X
appears on the top of the parsing stack.
• If the parse is top-down, we perform action ‘a ‘
just before we attempt to expand the
occurrence of Y (if Y a nonterminal) or check
for Y on the input (if Y is a terminal).
Example
S while ( C ) S1
• The meaning of our while-statement is that
the conditional C is evaluated.
• If it is true, control goes to the beginning of
the code for S1.
• If false, then control goes to the code that
follows the while-statement's code.
• The code for S1 must be designed to jump to
the beginning of the code for the while-
statement when finished;
SDD for while-statements
• The inherited attribute [Link] labels the beginning of
the code that must be executed after S is finished.
• The inherited attribute [Link] labels the beginning of
the code that must be executed if C is true.
• The inherited attribute [Link] labels the beginning
of the code that must be executed if C is false.
• The inherited attribute [Link] is the label that directs
the control to the condition statement C
• The synthesized attribute [Link] is the sequence of
intermediate-code steps that implements a
statement S and ends with a jump to [Link].
• The synthesized attribute [Link] is the sequence of
intermediate-code steps that implements the
SDT for while-statements
Parser-Stack Implementation of Postfix
SDT's
• Postfix SDT's can be implemented during LR parsing by
executing the actions when reductions occur.
• The attribute(s) of each grammar symbol can be put on
the stack in a place where they can be found during the
reduction.
• The three grammar symbols XYZ are on top of the
stack; they can be reduced according to a production
like A XYZ. Here X.x is one attribute of X and so on
Desk-calculator SDT
Desk-calculator SDT
• Suppose that the stack is kept in an array of records
called stack, with ‘top’ as a index to the top of the
stack.
• Thus, stack[top] refers to the top record on the
stack, stack[top - I] to the record below that, and so
on.
• Also, we assume that each record has a field called
val, which holds the attribute of grammar symbol is
represented in that record. Thus, we may refer to
the attribute [Link] that appears at the third
position on the stack as stack[top – 2] .Val.