Module 3 – Semantic Analysis
Syntax directed definitions
• Syntax directed definition is a generalization of context free
grammar in which each grammar symbol has an associated set
of attributes.
• The attributes can be a number, type, memory location, return
type etc….
• Types of attributes are:
1. Synthesized attribute
2. Inherited attribute
Syntax Directed Translation Scheme
• The Syntax directed translation scheme is a context
free grammar
• It is used to evaluate the order of semantic rules.
• In translation scheme, the semantic rules are
embedded within the right side of the production.
• The position at which the action to be executed is
shown by enclosed between braces.
Synthesized attributes
• Value of synthesized attribute at a node can be computed from the value of attributes at the
children of that node in the parse tree. (Parent attributes taking value from child attribute
and itself)
• A syntax directed definition that uses synthesized attribute exclusively is said to be S-
attribute definition.
• Example: Syntax directed definition of simple desk
Production calculator
Semantic rules
L En Print ([Link])
E E1+T [Link] = [Link] + [Link]
ET [Link] = [Link]
T T1*F [Link] = [Link] * [Link]
TF [Link] = [Link]
F (E) [Link] = [Link]
F digit [Link] = [Link]
Applications of SDT
• Executing Arithmetic Expression
• Conversion from Infix to Postfix
• Conversion from Infix to Prefix
• Conversion from Binary to Decimal
• Counting No. of Reductions
• Creating Syntax Tree
• Generating Intermediate Code
• Type Checking
• Storing Type information into Symbol Table.
Example: Synthesized attributes
String: 3*5+4n; Production Semantic rules
L
L En Print ([Link])
n
[Link]= [Link] = [Link] + [Link]
E E1+T
19
ET [Link] = [Link]
+ [Link]=4
[Link]=1
5 T T1*F [Link] = [Link] * [Link]
The process of computing the
[Link]=15 [Link]= attribute values at the node is TF [Link] = [Link]
4 called annotating or
decorating the parse tree F (E) [Link] = [Link]
* [Link]=4
[Link]=3 [Link]=5 F digit [Link] = digit . lexval
[Link]=3 [Link]=
5 parse tree showing the value
[Link]=3 of the attributes at each node
Annotated parse tree is called Annotated parse tree
for 3*5+4n
Exercise
Draw Annotated Parse tree for following:
1. 7+3*2n
2. (3+4)*(5+6)n
Syntax directed definition to translates arithmetic expressions from infix
to prefix notation
Production Semantic rules
LE Print([Link])
EE+T [Link]=’+’ [Link] [Link]
EE-T [Link]=’-‘ [Link] [Link]
ET [Link]= [Link]
TT*F [Link]=’*’ [Link] [Link]
TT/F [Link]=’/’ [Link] [Link]
TF [Link]= [Link]
FF^P [Link]=’^’ [Link] [Link]
FP [Link]= [Link]
P(E) [Link]= [Link]
Pdigit [Link]=[Link]
Inherited attribute
• An inherited value at a node in a parse tree is computed from
the value of attributes at the parent and/or siblings of the node.
(attribute taking value from its parents itself, and its sibilings)
Production Semantic rules
D→TL [Link] = [Link]
T → int [Link] = integer
T → real [Link] = real
L → L1 , id [Link] = [Link], addtype([Link],[Link])
L → id addtype([Link],[Link])
Syntax directed definition with inherited attribute [Link]
• Symbol T is associated with a synthesized attribute type.
• Symbol L is associated with an inherited attribute in.
Example: Inherited attribute
Example: Pass data types to all
identifier real id1,id2,id3
D
Production Semantic rules
D→TL [Link] = [Link] L
[Link]=rea
T
[Link]=re
T → int [Link] = integer al l
T → real [Link] = real real ,
L1
[Link]=rea id
id
L → L1 , id [Link] = [Link], l 3
addtype([Link],[Link])
L → id addtype([Link],[Link]) ,
[Link]=rea
L1 id
id
l 2
id
id
1
DTL
L → Lid
1 , id
Evaluation order
• A topological sort of a directed acyclic graph is any ordering of
the nodes of the graph such that edges go from nodes earlier in
the ordering to later nodes.
• If is an edge from to then appears before in the ordering.
D
1 [Link]=re [Link]=rea 2
al l
real 3 ,
[Link]=rea id 4
l 3
,
5 [Link]=rea id 6
l 2
7 id
1
Construction of syntax tree
• Following functions are used to create the nodes of the syntax
tree.
1. Mknode (op,left,right): creates an operator node with label op and
two fields containing pointers to left and right.
2. Mkleaf (id, entry): creates an identifier node with label id and a
field containing entry, a pointer to the symbol table entry for the
identifier.
3. Mkleaf (num, val): creates a number node with label num and a
field containing val, the value of the number.
Construction of syntax tree for
expressions
Example: construct syntax
tree for a-4+c P5 +
P1: mkleaf(id, entry for a);
P2: mkleaf(num, 4);
P3: mknode(‘-‘,p1,p2); P3 - P4 id
P4: mkleaf(id, entry for c);
Entry for c
P5: mknode(‘+’,p3,p4);
P1 id P2 Num 4
Entry for a
Bottom up evaluation of S-attributed definitions
• S-attributed definition is one such class of syntax directed
definition with synthesized attributes only.
• Synthesized attributes can be evaluated using bottom up
parser only.
Synthesized attributes on the parser stack
• Consider the production AXYZ and associated semantic action
is A.a=f(X.x, Y.y, Z.z) State Value
State Value
top-2 top
top-1
top
Before reduction After reduction
Bottom up evaluation of S-attributed definitions
Production Semantic rules Input State Val Production Used
L En Print (val[top]) 3*5n - -
*5n 3 3
E E1+T val[top]=val[top-2] + val[top]
*5n F 3 Fdigit
ET
*5n T 3 TF
T T1*F val[top]=val[top-2] * val[top] 5n T* 3
TF n T*5 3,5
F (E) val[top]=val[top-2] - val[top] n T*F 3,5 Fdigit
n T 15 TT1*F
F digit
n E 15 ET
Implementation of a desk calculator En 15
with bottom up parser L 15 L En
Move made by translator
L-Attributed definitions
• A syntax directed definition is L-attributed if each inherited
attribute of , , on the right side of depends only on:
1. The attributes of the symbols j-1 to the left of in the production and
2. The inherited attribute of A.
Production Semantic Rules
• Example: A LM L.i:=l(A.i)
M.i=m(L.s)
A.s=f(M.s)
AXYZ A QR R.i=r(A.i)
Q.i=q(R.s)
A.s=f(Q.s)
Attributed
NotL-L-Attributed
• Above syntax directed definition is not L-attributed because the
Bottom up evaluation of S-attributed definitions
• Translation scheme is a context free grammar in which
attributes are associated with the grammar symbols and
semantic actions enclosed between braces { } are inserted
within the right sides of productions.
• Attributes are used to evaluate the expression along the
process of parsing.
• During the process of parsing the evaluation of attribute takes
place by consulting the semantic action enclosed in { }.
• A translation scheme generates the output by executing the
semantic actions in an ordered manner.
• This process uses the depth first traversal.
Example: Translation scheme (Infix to postfix
notation)
R addopR1 | 𝜖
String: 9-5+2 ETR
E
T num
T R
- R
9 {Print(9 T {Print(-
)} )}
5 {Print(5 + R
T {Print(
)}
+)}
2 {Print( 𝜖
2)}
Now, Perform Depth first traversal Postfix=95-2+
Types of errors
Types of Errors
Errors
Compile time Run time
Lexical Syntactic Semantic
Phase error Phase error Phase error
Lexical error
• Lexical errors can be detected during lexical analysis phase.
• Typical lexical phase errors are:
1. Spelling errors
2. Exceeding length of identifier or numeric constants
3. Appearance of illegal characters
• Example:
fi ( )
{
}
• In above code 'fi' cannot be recognized as a misspelling of keyword
if rather lexical analyzer will understand that it is an identifier and
will return it as valid identifier.
• Thus misspelling causes errors in token formation.
Syntax error
• Syntax error appear during syntax analysis phase of compiler.
• Typical syntax phase errors are:
1. Errors in structure
2. Missing operators
3. Unbalanced parenthesis
• The parser demands for tokens from lexical analyzer and if the
tokens do not satisfy the grammatical rules of programming
language then the syntactical errors get raised.
• Example:
printf(“Hello World !!!”) Error: Semicolon missing
Semantic error
• Semantic error detected during semantic analysis phase.
• Typical semantic phase errors are:
1. Incompatible types of operands
2. Undeclared variable
3. Not matching of actual argument with formal argument
• Example:
id1=id2+id3*60 (Note: id1, id2, id3 are real)
(Directly we can not perform multiplication due to incompatible types of
variables)
Error recovery strategies
(Ad-Hoc & systematic
methods)
Error recovery strategies (Ad-Hoc & systematic
methods)
• There are mainly four error recovery strategies:
1. Panic mode
2. Phrase level recovery
3. Error production
4. Global generation
Panic mode
• In this method on discovering error, the parser discards input
symbol one at a time. This process is continued until one of a
designated set of synchronizing tokens is found.
• Synchronizing tokens are delimiters such as semicolon or end.
• These tokens indicate an end of the statement.
• If there is less number of errors in the same statement then
this strategy is best choice.
fi ( ) Scan entire line otherwise scanner will return fi as valid identifier
{
}
Phrase level recovery
• In this method, on discovering an error parser performs local
correction on remaining input.
• The local correction can be replacing comma by semicolon,
deletion of semicolons or inserting missing semicolon.
• This type of local correction is decided by compiler designer.
• This method is used in many error-repairing compilers.
Error production
• If we have good knowledge of common errors that might be
encountered, then we can augment the grammar for the
corresponding language with error productions that generate
the erroneous constructs.
• Then we use the grammar augmented by these error
production to construct a parser.
• If error production is used then, during parsing we can
generate appropriate error message and parsing can be
continued.
Global correction
• Given an incorrect input string x and grammar G, the
algorithm will find a parse tree for a related string y, such that
number of insertions, deletions and changes of token require to
transform x into y is as small as possible.
• Such methods increase time and space requirements at parsing
time.
• Global correction is thus simply a theoretical concept.