0% found this document useful (0 votes)

8 views33 pages

Understanding Syntax Analysis in Compilers

Uploaded by

teshomeegashaww

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

8 views33 pages

Understanding Syntax Analysis in Compilers

Uploaded by

teshomeegashaww

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

CHAPTER THREE

COMPILER DESIGN

Prepared by: Mesay Yohannes (MSc.)

November, 2023
Jinka
Syntax Analysis (Parsing)
Outline
3.1. The Role of Parsing

3.2. Parsing

3.2.1. Derivation,

3.2.2. Parse Tree,

3.2.3. Ambiguity, and

3.2.4. Left Recursion

1
3.1. The Role of Parsing
 Syntax analyzer is the second phase of a compiler and also known as parser.
 As we see in chapter 2, a lexical analyzer can identify tokens with the help of regular
expressions and patterns/rules.
 As second phase, a syntax analyzer or parser takes the input from a lexical analyzer in the form
of token streams.
 A parser is a program that generates a parse tree for the given string, if the string is generated
from the underlying grammar.
 A parser analyzes the source code (token stream) against the production rules to detect any
errors in the code.

2
3.1. The Role of Parsing…(Cont.)
 In this way, the parser accomplishes two tasks,

 Parsing the code,

 Looking for errors and generating a parse tree as the output of the phase.

 In compilation, the parser obtains a string of tokens from the lexical analyzer, and
expected to parse the whole code even if some errors exist in the program.

 A lexical analyzer cannot check the syntax of a given sentence due to the limitations of
the regular expressions.

 Regular expressions cannot check balancing tokens, such as parenthesis.

 Therefore, the syntax analysis phase uses context-free grammar (CFG), which is
recognized by push-down automata.

 The syntax of a language is specified by a context free grammar (CFG).

3
3.1. The Role of Parsing…(Cont.)
Context Free Grammar (CFG)
 CFG is a helpful tool in describing the syntax of programming languages.
 CFG is a formal grammar which is used to generate all possible patterns of strings in a given form.
 The rules in a CFG are mostly recursive.
 A syntax analyzer checks whether a given program satisfy the rules implied by a CFG or not.
 If it satisfies, the syntax analyzer creates a parse tree for the given program.
Components or Tuples of Context Free Grammar (CFG)
 A context free grammar has four components; G = {V, T, P, S}
Grammar (G): is grammar, which consists of a set of production rules. Which also generate the
string of a language.
1. Non-terminals (V): are a set of syntactic variables that defines sets of strings.
 They are represented by upper case letters ([A-Z]) and they can be further derivate.
2. Terminal Symbols (T): are the basic symbols from which strings are formed, also set of
tokens.
 Terminals are represented by lower case letters ([a-z], [0-9], (, ), … ) and they do not
have further derivation.
4
3.1. The Role of Parsing…(Cont.)
3. Productions (P): is a set of production rules of a grammar specify the manner in
which the terminals and non-terminals can be combined to form strings.
 Each productions consists of three things.
 A non-terminal (left side of the production),
 An arrow, and
 A sequence of tokens with non-terminals (right side of the production),
EE+E
S  As
Sb|a

4. Start Symbol (S): is one of the non-terminals from where the production begins.
 The strings are derived from the start symbol by repeatedly replacing a non-
terminal.
 Initially a non-terminal is start symbol by the right side of a production, for that
non-terminal.

5
3.1. The Role of Parsing…(Cont.)
Example: Let we take the problem of palindrome language; L = {w| w = wR},
 This language cannot be described by means of regular expressions, because it is not a
regular language, but it can be described by means of CFG, as illustrated below;
 G = (V, Σ, P, S); where: G- is a grammar,
 V = {Q, Z, N}, which is set of non-terminals,
 Σ = {0, 1}, which is a set of terminals,
 P = {Q -> Z | Q -> N | Q -> | Z -> 0Q0 | N -> 1Q1}, which is a set of production
 S = {Q}, which is a start symbol
 This grammar describes palindrome language, such as: 1001, 11100111, 1010101, 11111, etc.
Check: 1001
QN
Q  1Q1
Q  10Q01
Q  10
Q  1001

6
3.1. The Role of Parsing…(Cont.)
Syntax Analyzer Vs Lexical Analyzer
 A program recognized by LA, and it is also by the syntax analyzer.
 Both of them do similar things;
 But the LA deals with simple non-recursive constructs of the language.
 The SA deals with recursive constructs (Recursive is a function which calls itself
again and again) of the language.
Q  N + T|T
T  T + P|P
Pa|b
 The lexical analyzer simplifies the jobs of the syntax analyzer.
 The lexical analyzer recognizes the smallest meaningful units (tokens) in a source
program.

7
3.2. Parsing
 Parsing is performed at the SA phase where a stream of tokens is taken as input from
LA and the parser produces the parser tree for the tokens against the syntax error.
 The concept of parsing is basically address various concepts such as:
 Derivation
 Parse Tree
 Ambiguity
 Left Recursion

A. Derivation

 A derivation is basically a sequence of production rules, in order to get the input string.

 During the production rule, we take two decisions for some sentential form of input.

 Deciding the non-terminal which is to be replaced.

 Deciding the production rule, by which the non-terminal will be replaced.

8
3.2. Parsing…(Cont.)
 To decide which non-terminal to be replaced with production rule, we can have two options;

i. Left-most Derivation

 Here, the sentential form of an input is scanned and replaced from left to right.

 Sentential form derived from left-most derivation is called a left-sentential form.

 Sentential Form: a sentence that contains variables and terminals.

i.e. S  0S1

Variables Combination of terminals

ii. Right-most Derivation
and variables

 Here, we scan and replace the input with production rules, from right to left.

 Sentential form derived from right-most derivation is called right-sentential form.

9
3.2. Parsing…(Cont.)
Example: consider the following production rules, Obtain left most and right most
derivation for the input string: id + id * id
EE+E
EE*E
E  id

The left-most derivation is: The right-most derivation is:

EE*E Since {E  E + E} EE+E Since {E  E + E}
EE+E*E Since {E  E * E} EE+E*E Since {E  E * E}
E  id + E * E Since {E  id} E  E + E * id Since {E  id}
E  id + id * E Since {E  id} E  E + id * id Since {E  id}
E  id + id * id Since {E  id} E  id + id * id Since {E  id}

 NB; in left-most derivation, the left-most side non-terminal is always processed first;
Whereas,
 In right-most derivation, the right-most side non-terminal is always processed first.
10
3.2. Parsing…(Cont.)
 Example: consider the following grammar
S  0A | 1B | 0 | 1
A  0S | 1B | 1
B  0A | 1S
Construct LMD and parse tree for the following string.
i) 0101 ii) 1100101
i) 0101
S  0A
S  01B
S  010A
S  0101

11
3.2. Parsing…(Cont.)
 Example: consider the following grammar
S  0A | 1B | 0 | 1
A  0S | 1B | 1
B  0A | 1S
Construct LMD and parse tree for the following string.
i) 0101 ii) 1100101
i) 1100101
S  1B
S  11B
S  110A
S  1100S
S  11001B
S  110010A
S  1100101
12
3.2. Parsing…(Cont.)
 Example: consider the grammar
S  (L)|a
L  L,S|S
a) What are the non-terminals, terminals, and start symbols
b) Find LMD, RMD and parse trees for the following
i) (a, a) ii) (a, (a,a))
c) What language does the grammar great
Solution
a) Non-terminals = {S, L}
Terminals = {(, ), a, ,}
Start Symbols = {S}
b) i) (a, a) derivate by using LMD
S  (L)
S  (L, S)
S  (S, S)
S  (a, S)
S  (a, a)
13
3.2. Parsing…(Cont.)
b(i) (a, a) derivate by using RMD
S  (L)
S  (L, S)
S  (L, a)
S  (S, a)
S  (a, a)

b(ii) (a, (a, a)) derivate by using RMD

S  (L)
S  (L, S)
S  (S, S)

S  (a, S)
S  (a, (L))
S  (a, (L, S))
S  (a, (S, S))
S  (a, (a, S))
S  (a, (a, a))

14
3.2. Parsing…(Cont.)
B. Parse Tree
 A parse tree is a graphical depiction of a derivation.
 It is convenient to see how strings are derived from the start symbol.
 The start symbol of the derivation becomes the root of the parse tree.
 All leaf nodes are terminals, and
 All the interior nodes are non-terminals.

Example: let we take the left-most derivation of the given input string; id + id * id and draw the

parse tree. Assume the production rule E  E + E| E * E | a| b| c.

Solution: The left-most derivation is a + b * c is:
EE*E

EE+E*E

E  id + E * E

E  id + id * E
14
E  id + id * id
3.2. Parsing…(Cont.)
Example 3: design LMD AND RMD for the string aaabaab.
Grammar is S  aS|aSbS|

LMD: aaabaab
S  aS

S  [Link]

S  aaaSbS

S  aaabS

S  aaabaS

S  aaabaaSbS

S  aaabaab

15
3.2. Parsing…(Cont.)
C. Ambiguity
 If a context free grammar G has more than one derivation tree for some string w ∈ L(G), it is
called an ambiguous grammar.
 There exist multiple right-most or leftmost derivations for some string generated from that
grammar.
Example: Check whether the grammar G with production rules is ambiguous or not.
X → X+X | X*X |X| a

Solution
 Let’s find out the derivation tree for the string "a+a*a". It has two leftmost derivations.
Derivation 1:
X → X+X
X → a +X
X → a+ X*X
X → a+a*X
X → a+a*a
16
3.2. Parsing…(Cont.)
Example: Check whether the grammar G with production rules is ambiguous or not.

X → X+X | X*X |X| a

Solution

 Let’s find out the derivation tree for the string "a+a*a". It has two leftmost derivations.

Derivation 2:

X → X*X

X → X +X*X

X → a+ X*X

X → a+a*X

X → a+a*a

 Since there are two parse trees for a single string "a+a*a", the grammar G is ambiguous.
17
3.2. Parsing…(Cont.)
Class Activity:
 Consider the following grammar:
G1: S  SbS|a
Design LMD AND RMD for the string ababa

18
3.2. Parsing…(Cont.)
 Solution:
G1: S  SbS|a
Design LMD AND RMD for the string ababa
Solution: LMD
S  SbS
S  abS
S  abSbS
S  ababa
Solution: RMD
S  SbS
S  Sba
S  SbSba
S  Sbaba
S  ababa

19
3.2. Parsing…(Cont.)
 There are three main reasons leads to Ambiguity
 Associativity
 Rewriting the whole grammar
 Precedence.

i) Associativity
 It represents the direction in which the expression to be evaluated
Two types of associativity
1. Left to Right
2. Right to Left
 If the operation is left to Right associative, then the operand will be taken by the left operator
 If the operation is right to Left associative, the right operator will take the operand.
 Operations like Addition, Multiplication, Subtraction, and division are left
associative.
 For example; if the expression contains: id op id op id
 It will be evaluated as; (id op id) op id i.e. (id + id) + id
 Operations like exponentiation are right associative.
20
3.2. Parsing…(Cont.)

21
3.2. Parsing…(Cont.)
 Example: a + b + c
 Two way to evaluate the above expression
(a +b) + c and a + (b + c)
 Example: a = b = c

Left Associative Grammar

Example:
EE+E
Ea|b|c
Design for the string a + b + c (The associativity of addition operator is left to right)
(a + b) + c a + (b + c)
EE+E EE+E
EE+E+E EE+E+E
Ea+E+E EE+E+c
Ea+b+E EE+b+c
Ea+b+c Ea+b+c

Left to Right Associative

22
3.2. Parsing…(Cont.)
 Example: a = b = c
 Two way to evaluate the above expression
a = (b = c) and (a = b) = c

Right Associative Grammar

Example:
EE=E
Ea|b|c
Design for the string a = b = c (The associativity of assignment operator is right to left)
Which means a = (b = c)

A = (b = c)
(a = b) = c
EE=E
EE=E
EE=E=E
EE=E=E
EE=E=c
EE=E=c
EE=b=c
EE=b=c
Ea=b=c
Ea=b=c

Right to Left Associative

23
3.2. Parsing…(Cont.)

24
3.2. Parsing…(Cont.)
ii) Rewrite the Grammar (Left Recursion)
Example:
E  E + T|T
EE+E
Rewrite the Grammar Ta|b|c
Ea|b|c
a+b+c
a+b+c

Solution
a+b+c
EE+T
EE+T+E
ET+T+E
Ea+T+E
Ea+b+E
Ea+b+T
Ea+b+c

25
3.2. Parsing…(Cont.)
ii) Rewrite the Grammar (Right Recursion)
Example:
E  T = E|T
EE=E
Rewrite the Grammar Ta|b|c
Ea|b|c
a=b= c
a=b=c

Solution
a=b=c
ET=E
EE=E=T
EE=E=c
EE=T=c
EE=b=c
ET=b=c
Ea+b+c

26
3.2. Parsing…(Cont.)

27
3.2. Parsing…(Cont.)
iii) Precedence
 When more than one operator presents in an expression which one to be evaluated
first is determined by Operator Precedence.
Operator Precedence Table

28
3.2. Parsing…(Cont.)
iii) Precedence
 Consider the Expression 4 + 3 * 6
4 + 3 * 6 = 7 * 6 = 42
4 + 3 * 6 = 4 + 18 = 22

Example: E + E | E * E | id, for the string id + id * id

29
3.2. Parsing…(Cont.)
Re-writing Grammar With Additional Rules
Example: E  E + E | E * E | id
For the string id + id * id
Create Levels of Non-Terminals
 E – First level of precedence
 T – Second Level of precedence
 F – Generating basic units (terminal) in expression
E  E + T|T
T  T * F|F
F  id
id + id * id
EE+T
EE+T*F
E  E + T * id
E  E + F * id
E  E + id * id
E  id + id * id 30
U!
YO
NK
H A
T

Parsing Theory and Compiler Design
No ratings yet
Parsing Theory and Compiler Design
151 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
28 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
60 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
14 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
60 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
16 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
17 pages
Syntax Analysis and Parsing Techniques
No ratings yet
Syntax Analysis and Parsing Techniques
29 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
67 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
76 pages
Understanding Syntax Analysis in Compilers
No ratings yet
Understanding Syntax Analysis in Compilers
3 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
31 pages
Understanding Syntax Analysis in Compilers
No ratings yet
Understanding Syntax Analysis in Compilers
93 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
36 pages
Chapter 4 Syntax Analysis
No ratings yet
Chapter 4 Syntax Analysis
11 pages
Unit 3
No ratings yet
Unit 3
128 pages
Syntax Analysis and Parsing Techniques
No ratings yet
Syntax Analysis and Parsing Techniques
69 pages
Understanding Compilers and Parsing Techniques
No ratings yet
Understanding Compilers and Parsing Techniques
96 pages
Parsing Techniques in Compiler Construction
No ratings yet
Parsing Techniques in Compiler Construction
50 pages
Understanding Syntax Analysis in Compilers
No ratings yet
Understanding Syntax Analysis in Compilers
11 pages
Syntax Analysis and Context-Free Grammars
No ratings yet
Syntax Analysis and Context-Free Grammars
90 pages
Parsing Techniques and CFG Overview
No ratings yet
Parsing Techniques and CFG Overview
180 pages
Understanding Parsing Techniques in Compilers
No ratings yet
Understanding Parsing Techniques in Compilers
29 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
27 pages
Understanding Syntax Analysis in Compilers
No ratings yet
Understanding Syntax Analysis in Compilers
20 pages
Understanding LR Parsers in Compilers
No ratings yet
Understanding LR Parsers in Compilers
46 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
180 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
50 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
100 pages
Understanding Compiler Syntax Analysis
No ratings yet
Understanding Compiler Syntax Analysis
77 pages
Syntax Analysis and Yacc in Compilers
No ratings yet
Syntax Analysis and Yacc in Compilers
21 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
36 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
12 pages
Context-Free Grammars in Compiler Design
No ratings yet
Context-Free Grammars in Compiler Design
136 pages
CD Mod 2
No ratings yet
CD Mod 2
47 pages
Top Down Parsing in Syntax Analysis
No ratings yet
Top Down Parsing in Syntax Analysis
15 pages
Syntax Analysis and Parsing Techniques
No ratings yet
Syntax Analysis and Parsing Techniques
63 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
72 pages
Unit 2
No ratings yet
Unit 2
89 pages
Syntax Analysis and Parsing Techniques
No ratings yet
Syntax Analysis and Parsing Techniques
32 pages
Syntax Analysis and Parsing Techniques
No ratings yet
Syntax Analysis and Parsing Techniques
10 pages
Syntax Analysis: Parsing Techniques Explained
No ratings yet
Syntax Analysis: Parsing Techniques Explained
36 pages
Chapter 3 Context-Free Grammars
No ratings yet
Chapter 3 Context-Free Grammars
43 pages
Understanding Syntax Analysis
No ratings yet
Understanding Syntax Analysis
91 pages
Module2 CAT1
No ratings yet
Module2 CAT1
71 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
180 pages
UNIT-II Compiler Design
No ratings yet
UNIT-II Compiler Design
23 pages
Syntax Analysis in Programming Languages
No ratings yet
Syntax Analysis in Programming Languages
126 pages
Syntax Analysis and Parsing Methods
No ratings yet
Syntax Analysis and Parsing Methods
23 pages
CD 2,3 Unit's Material
100% (1)
CD 2,3 Unit's Material
170 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
19 pages
Syntax Analysis and Parsing Techniques
No ratings yet
Syntax Analysis and Parsing Techniques
93 pages
Terminal and Nonterminal Symbols in Parsing
No ratings yet
Terminal and Nonterminal Symbols in Parsing
28 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
11 pages
Syntax Analysis in Programming Languages
No ratings yet
Syntax Analysis in Programming Languages
38 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
15 pages
Film Parsing Techniques Explained
No ratings yet
Film Parsing Techniques Explained
105 pages
Understanding Noun Inflection in Portuguese
No ratings yet
Understanding Noun Inflection in Portuguese
15 pages
Direct Fluency Instruction for Readers
100% (4)
Direct Fluency Instruction for Readers
42 pages
Skehan Modelsof Speakingandthe Assessmentof Second Language Proficiency
No ratings yet
Skehan Modelsof Speakingandthe Assessmentof Second Language Proficiency
24 pages
Language as Discourse in Education
No ratings yet
Language as Discourse in Education
31 pages
Advertising Essentials: Logos & Slogans
No ratings yet
Advertising Essentials: Logos & Slogans
24 pages
FCE Article Writing Guidelines
No ratings yet
FCE Article Writing Guidelines
6 pages
ESL Family Members Activities
No ratings yet
ESL Family Members Activities
7 pages
English Phonetics I Course Guide 2023
No ratings yet
English Phonetics I Course Guide 2023
31 pages
Understanding Street Performers
No ratings yet
Understanding Street Performers
2 pages
GEB 3213 Exam 1 Study Guide
No ratings yet
GEB 3213 Exam 1 Study Guide
6 pages
Answer Key - Language - Unit 7
63% (8)
Answer Key - Language - Unit 7
2 pages
Manuel L. Quezon: Language and Nationhood
No ratings yet
Manuel L. Quezon: Language and Nationhood
2 pages
Understanding Phrasal Verbs and Infinitives
No ratings yet
Understanding Phrasal Verbs and Infinitives
6 pages
English Talent Search Olympiad 2015-16
100% (1)
English Talent Search Olympiad 2015-16
4 pages
Writing Questions (Papers 1 and 2)
No ratings yet
Writing Questions (Papers 1 and 2)
4 pages
Arabic Communication Skills Course 31112
No ratings yet
Arabic Communication Skills Course 31112
9 pages
Form 5 English Language Scheme 2020
No ratings yet
Form 5 English Language Scheme 2020
17 pages
Master Spanish Listening Skills Guide
No ratings yet
Master Spanish Listening Skills Guide
15 pages
Translation as Understanding: Insights
No ratings yet
Translation as Understanding: Insights
11 pages
Comprehensive Class Syllabus Overview
No ratings yet
Comprehensive Class Syllabus Overview
1 page
English 11 Exam Review Guide
No ratings yet
English 11 Exam Review Guide
8 pages
Square Dancing: Grammar and Conjugation
No ratings yet
Square Dancing: Grammar and Conjugation
7 pages
Wonders of English Grammar Basics
No ratings yet
Wonders of English Grammar Basics
2 pages
Animal Park Visit and Conversations Guide
No ratings yet
Animal Park Visit and Conversations Guide
4 pages
Grade 8 English 1st Quarter Test
No ratings yet
Grade 8 English 1st Quarter Test
2 pages
Handbook of Bilingualism Psycholinguistic Approaches (Judith F. Kroll, Annette M. B. de Groot)
100% (3)
Handbook of Bilingualism Psycholinguistic Approaches (Judith F. Kroll, Annette M. B. de Groot)
603 pages
Guidelines for Effective Quotation in Research
No ratings yet
Guidelines for Effective Quotation in Research
11 pages
English Language Skills Practice Guide
No ratings yet
English Language Skills Practice Guide
6 pages
English Weekly Test Instructions
No ratings yet
English Weekly Test Instructions
4 pages
Actividades y Ejercicios de Inglés
No ratings yet
Actividades y Ejercicios de Inglés
3 pages

Understanding Syntax Analysis in Compilers

Uploaded by

Understanding Syntax Analysis in Compilers

Uploaded by

CHAPTER THREE

Prepared by: Mesay Yohannes (MSc.)

3.2.2. Parse Tree,

3.2.3. Ambiguity, and

3.2.4. Left Recursion

 Parsing the code,

 Regular expressions cannot check balancing tokens, such as parenthesis.

 The syntax of a language is specified by a context free grammar (CFG).

 Deciding the non-terminal which is to be replaced.

 Deciding the production rule, by which the non-terminal will be replaced.

 Sentential form derived from left-most derivation is called a left-sentential form.

 Sentential Form: a sentence that contains variables and terminals.

Variables Combination of terminals

 Sentential form derived from right-most derivation is called right-sentential form.

The left-most derivation is: The right-most derivation is:

b(ii) (a, (a, a)) derivate by using RMD

parse tree. Assume the production rule E  E + E| E * E | a| b| c.

X → X+X | X*X |X| a

Left Associative Grammar

Left to Right Associative

Right Associative Grammar

Right to Left Associative

Example: E + E | E * E | id, for the string id + id * id

You might also like