0% found this document useful (0 votes)

16 views37 pages

Compiler Design: Syntax and Parsing

The document discusses the design and construction of a simple one-pass compiler, focusing on lexical analysis and syntax-directed translation. It explains the use of context-free grammars to define programming language syntax and the process of parsing, including the concepts of tokens, parse trees, and operator precedence. Additionally, it outlines the role of the lexical analyzer in transforming character input streams into tokens for further processing by the parser.

Uploaded by

bevonomwenga17

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

16 views37 pages

Compiler Design: Syntax and Parsing

Uploaded by

bevonomwenga17

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

BCS 324: COMPILER DESIGN &

CONSTRUCTION

Introduction to Compiling – Part B

1. A simple one-pass compiler
2. Lexical analysis

Dr. D. K. Muyobo
I. A simple one-pass compiler
 A programming language can be defined
by describing
◦ What its program’s syntax looks like, and
◦ What its statements mean (i.e. semantics)
 We use BNF (Backus-Naur Form) or
context-free grammars to specify the
syntax of a language
 Then we use informal descriptions to
specify semantics of the language.
 In addition to specifying syntax,
◦ Context-free grammar can also be used to guide the translation
of programs
◦ Such grammar-oriented compiling is known as syntax-directed
translation.
 We use an example of a compiler to translate infix
expressions into polish postfix notation where:
◦ The lexical analyzer converts the stream of input characters into
a stream of tokens
◦ The token stream becomes input to the syntax-directed
translator.
◦ The syntax-directed translator then produces intermediate code
representation as output, as shown in Fig 1.

3
Syntax-
Character Lexical Token Intermediate
directed
stream analyzer stream representation
translator

Fig.1: Syntax-directed translator

4
Syntax definition
 A context-free grammar describes the
hierarchical structure of many programming
language constructs e.g.
◦ The branch statement in C language has the form:
 if(expression)statement else statement
◦ It is the concat of opening parenthesis, an expression, closing
parenthesis, statement, else keyword, statement.
◦ We can use variables as follows:
 expr – to denote expression, and
 stmt – to denote statement.

5
 Thus, we have:
◦ stmt  if(expr)stmt else stmt
 Such a rule is called a production.
 Individual lexical units of a production e.g.
if, parenthesis etc. are called tokens.
 Variables like expr and stmt represent
sequences of tokens and are called
non-terminals.

6
Components of a context-free grammar

1. A set of tokens known as terminal symbols

2. A set of non-terminals e.g. expr, stmt.
3. A set of productions where each production
consists of a non-terminal called leftside of
the production, an arrow, a sequence of
tokens and/or non-terminals called the
rightside of the production
4. A designation of one of the non-terminals as
the start symbol.

7
 The convention followed lists grammars
by their productions listing start symbol
first
 Digits and signs such as <= and while etc.
are terminals e.g.
◦ list  list + digit
◦ list  list - digit
◦ list  digit
◦ digit  0|1|2|3|4|5|6|7|8|9

8
 The right sides of the 3 productions with
non-terminal list on the leftside can be
grouped equivalently as:
◦ list  list+digit|list–digit|digit
 Tokens of the grammar are the symbols
+-0123456789

9
Parse trees
 Parse trees
◦ Pictorially show how the start symbol of a
grammar derives a string in the language
 If non-terminal A has a production
AXYZ such that A is the start symbol,
then its parse tree is
A

X Y Z

Fig.2: Parse tree for non-terminal A

10
 Properties of parse trees
1. Root is labeled by a start symbol
2. Each leaf is labeled by a terminal (token)
3. Each interior node is labeled by a non-terminal
4. The children of a node in a parse tree form the
rightside of its production while the statement
symbol forms the leftside e.g. node A with children
XYZ has the production
AXYZ
NB: Parsing is the process of finding a parse
tree for a given string of tokens.
11
Ambiguity
 Ambiguity is caused by a string with more
than one parse tree
 This is because it will have more than one
meaning
 See example in Fig.3 below.

12
(a) (b)
string string

string + string -
string string
- string 2 9
string string + string

9 5 5 2

Fig. 3:Two parse trees for 9-5+2

13
Associativity of operators
 9+5+2  (9+5)+2 and
 9-5-2  (9-5)-2 and
 The operand 5 given a plus operator to its right
and left will first take the left one i.e.
◦ (9+5) and thereafter take +2
 This means math operators are all left
associative e.g.
◦ In the second case, the number 5 with a minus
operator on both sides will first take (9-5) and then -
2.

14
 On the other hand, the assignment operator (=)
in C language is right associative e.g.
◦ a=b=c  a=(b=c)
 This statement has the following grammar:
◦ rightletter=right|letter
◦ lettera|b|…|z

NB: A left associative parse tree grows down towards the

left while a right associative parse tree grows down
towards the right as shown in Fig. 4.

15
(a) left associative (b) right associative
list right

list - digit =
letter right
- digit 2 a
list letter = right

digit 5 b letter

c
9
Fig. 4:Two parse trees for (a) 9-5-2 and (b) a=b=c

16
Precedence of operators
 When more than one operator is used,
we follow precedence where
◦ The (* and /) operators are higher in
precedence than the (+ and -) operators
meaning that they take their operands first.

17
Constructing a grammar for arithmetic
operators
◦ Constructed from a table showing
associativity and precedence of the operators
◦ We create two non-terminal expressions
expr and term for the two levels of
precedence (i.e. *,/ and +,-)
◦ Then we create a third non-terminal factor
that will be used for generating basic units
such as digits and parenthesized expressions.
factordigit|(expr)

20
Syntax-directed translation
 In order to translate a program, a
compiler needs to keep track of many
quantities of information e.g.
◦ Type of construct, location of first instruction
in the target code, number of instructions
generated etc.
◦ These are called attributes associated with
the constructs.

21
 Syntax-directed definitions
◦ Uses grammar to specify the structure of the input
◦ A translation is an input-output mapping in the
following manner:
 First construct a parse tree for X.
 Suppose a node n in the parse tree is labeled by the grammar
symbol X, then we write X.a to denote the value of attribute
a of X at that node.
◦ A parse tree showing the attribute values at
each node is called an annotated parse
tree e.g.
 In Fig.5, the 9-5+2 is translated from infix to postfix
notation, with attribute values attached to nodes.

22
expr.t=95-2+

expr.t=95- + term.t=2

- term.t=5 2
expr.t=9

term.t=9 5

Fig. 5: Annotated parse tree for 9-5+2

23
 Exercise
◦ Study Fig.5 carefully and then draw a
corresponding ordinary parse tree for it.

24
Parsing
 Definitions
◦ Parsing is the process of determining if a string of
tokens can be generated by a grammar
◦ Parsing may also be defined as the process of finding a
parse tree for a given string of tokens.
 NB:
◦ A parser must be able to construct the tree,
otherwise the translation cannot be guaranteed
correct.

25
 Parsing methods
◦ There are two main parsing methods,
distinguished by the order in which nodes in
the tree are constructed
 Top-down methods
 Construction here starts at the root and proceeds towards
the leaves
 Bottom-up methods
 Construction here starts at the leaves and proceeds
towards the root.

26
 Justification for choosing a parsing
method
◦ Top-down methods
 May be chosen when efficient parsers need to be
constructed easily
◦ Bottom-up methods
 May be chosen when a large class of grammars
need to be generated.

27
 Top-down parsing
◦ As an example, we construct a grammar for
subset of Pascal language.
◦ We construct an array, which uses dotdot
token in place of the actual “..” so as to
emphasize that the two dots are treated as
one token

28
type  simple
| id
| array [simple] of type
simple  integer
| char
| num dotdot num

29
 We start with the root and label it with
starting non-terminal, and then
repeatedly perform the following two
steps:
1. At node n, labeled with non-terminal A,
select one of the productions for A and
construct children at n for the symbols on
the right side of the production
2. Find the next node at which a sub-tree is to
be constructed.
30
 The current token being scanned in the input is
frequently referred to as the lookahead symbol
 Initially, lookahead symbol is the first (leftmost
token) of the string

 For example, in the Pascal array

array [num dotdot num] of integer
token array is initially the lookahead symbol
and the known part of the tree is only the root,
labeled with the starting non-terminal type.
NB: The generated tree should match the input string
31
(d) type
(a) type
array [ simple ] of type
(b) type
num dotdot num simple
array [ simple ] of type
(e) type

(c) type array [ simple ] of type

array [ simple ] of type num dotdot num simple

num dotdot num integer

Fig.6: Steps in top-down construction of a parse tree

32
2. Lexical analysis
 The role of the lexical analyzer
◦ Its task is to read a character input stream and
produce as output a sequence of tokens that the
parser uses for syntax analysis
◦ May perform the secondary role as user interface
 It is implemented as a subroutine of the lexical
analyzer as shown below.
Read
character
Lexical Pass token
Input and its attributes Parser
Pushback analyzer
character

Fig.7: Lexical analysis

33
 In Fig.7, if the lexical analyzer reads >
followed by =, then it passes to the parser
the token >=
 If it only reads > and no = symbol
following, then it pushes the other
character back to the input to act as the
beginning of next token and passes only >
to the parser.

34
Functions of the lexical analyzer
 The functions of the lexical analyzer are:
1. Removes white space e.g. tabs, newlines, blanks etc.
2. Removes comments from source program
3. Corrects errors from source program
4. Recognizes and takes constants as digits i.e. takes
digit tokens e.g. num accompanied by their constant
value, such as <num,21><+><num,9>
5. Recognizes identifiers and keywords e.g.
Totalmarks = test1+test2*3 becomes
id1 = id2+id3*3

35
Patterns, tokens, and lexemes
 Pattern
◦ Is a rule describing a set of input strings (lexemes)
associated with the token
 Tokens
◦ Terminal symbols generated from lexemes e.g.
keywords, operators, identifiers, constants, literal
strings, parentheses, punctuation symbols, commas
and semicolons
 Lexeme
◦ A sequence of characters in the source program that
is matched by the pattern for a token.

36
Lexical errors and error recovery actions

 Wrongly spelt words e.g. fi. Change to if

 Missing characters e.g. flot. Change to float
 Extra characters e.g. flooat. Remove extra
character
 Incorrect characters. Correct as appropriate.
 Panic mode recovery
◦ If lexer cannot correct a certain error, it panics and
deletes the remaining input until it can find a well-
formed token.

Simple One-Pass Compiler Overview
No ratings yet
Simple One-Pass Compiler Overview
22 pages
Syntax-Directed Compiler Overview
No ratings yet
Syntax-Directed Compiler Overview
50 pages
Syntax-Directed Translator Overview
No ratings yet
Syntax-Directed Translator Overview
45 pages
One-Pass Compiler Construction Guide
No ratings yet
One-Pass Compiler Construction Guide
12 pages
Syntax-Directed Translation Basics
100% (1)
Syntax-Directed Translation Basics
45 pages
Syntax-Directed Translation in CS 4300
No ratings yet
Syntax-Directed Translation in CS 4300
70 pages
Syntax-Directed Translation in Compilers
No ratings yet
Syntax-Directed Translation in Compilers
47 pages
Simple Compiler and Syntax Translation Guide
No ratings yet
Simple Compiler and Syntax Translation Guide
39 pages
One-Pass Compiler Overview and Structure
No ratings yet
One-Pass Compiler Overview and Structure
79 pages
One-Pass Compiler Overview and Syntax
No ratings yet
One-Pass Compiler Overview and Syntax
12 pages
Syntax-Directed Translation Overview
No ratings yet
Syntax-Directed Translation Overview
32 pages
CSC4021Z Compilers 4 Parsing 2024
No ratings yet
CSC4021Z Compilers 4 Parsing 2024
20 pages
Syntax-Directed Translation in Compilers
No ratings yet
Syntax-Directed Translation in Compilers
30 pages
Parse Trees in Syntax Analysis
No ratings yet
Parse Trees in Syntax Analysis
18 pages
One-Pass Compiler Overview and Syntax
No ratings yet
One-Pass Compiler Overview and Syntax
21 pages
Simple Compiler Syntax and Translation
No ratings yet
Simple Compiler Syntax and Translation
39 pages
Syntax-Directed Translation in Compilers
No ratings yet
Syntax-Directed Translation in Compilers
7 pages
Syntax Directed Translation in Compilers
No ratings yet
Syntax Directed Translation in Compilers
41 pages
Syntax Analysis and Parsing Techniques
No ratings yet
Syntax Analysis and Parsing Techniques
42 pages
Syntax Analysers Part B
No ratings yet
Syntax Analysers Part B
23 pages
Lec 4
No ratings yet
Lec 4
37 pages
Syntax Directed Translation Overview
No ratings yet
Syntax Directed Translation Overview
60 pages
APBN Compilation Mechanism Explained
No ratings yet
APBN Compilation Mechanism Explained
19 pages
Syntax Directed Translation in Compilers
No ratings yet
Syntax Directed Translation in Compilers
57 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
74 pages
Understanding Abstract Syntax Trees
No ratings yet
Understanding Abstract Syntax Trees
38 pages
Compiler Design Syllabus: Syntax Analysis
No ratings yet
Compiler Design Syllabus: Syntax Analysis
38 pages
Compiler Front End Overview and Syntax
No ratings yet
Compiler Front End Overview and Syntax
41 pages
A Simple Syntax-Directed Translator (Part1)
No ratings yet
A Simple Syntax-Directed Translator (Part1)
7 pages
Compiler Design and Complexity Theory
No ratings yet
Compiler Design and Complexity Theory
19 pages
Syntax-Directed Translation in Compilers
No ratings yet
Syntax-Directed Translation in Compilers
37 pages
Syntax-Directed Translation Overview
No ratings yet
Syntax-Directed Translation Overview
51 pages
Parsing Techniques in Compiler Design
No ratings yet
Parsing Techniques in Compiler Design
51 pages
Syntax-Directed Translation Overview
No ratings yet
Syntax-Directed Translation Overview
45 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
41 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
67 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
60 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
21 pages
Understanding Syntax Analysis in Compilers
No ratings yet
Understanding Syntax Analysis in Compilers
46 pages
(Week 3) Syntax Analysis (Derivation)
No ratings yet
(Week 3) Syntax Analysis (Derivation)
46 pages
Parsing and Context-Free Grammars
No ratings yet
Parsing and Context-Free Grammars
32 pages
Syntax Analysis and Parsing Techniques
No ratings yet
Syntax Analysis and Parsing Techniques
103 pages
Simple One-Pass Compiler Overview
No ratings yet
Simple One-Pass Compiler Overview
70 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
31 pages
One-Pass Compiler Overview and Concepts
No ratings yet
One-Pass Compiler Overview and Concepts
11 pages
Syntax Directed Translation Overview
No ratings yet
Syntax Directed Translation Overview
25 pages
Understanding Recursive Grammars
No ratings yet
Understanding Recursive Grammars
25 pages
Applications of SDT in Compiler Design
No ratings yet
Applications of SDT in Compiler Design
30 pages
Chapter 3
No ratings yet
Chapter 3
34 pages
Syntax Analysis in Compiler Design
No ratings yet
Syntax Analysis in Compiler Design
60 pages
Syntax Analysis and Parsing Techniques
No ratings yet
Syntax Analysis and Parsing Techniques
59 pages
Understanding Compiler Phases and Syntax
No ratings yet
Understanding Compiler Phases and Syntax
31 pages
Lecture 1
No ratings yet
Lecture 1
14 pages
Worship and Covenant Renewal in Joshua
No ratings yet
Worship and Covenant Renewal in Joshua
12 pages
Lessons from Caleb's Faith and Courage
No ratings yet
Lessons from Caleb's Faith and Courage
12 pages
Climate Change
No ratings yet
Climate Change
2 pages
Inheritance of the Promised Land
No ratings yet
Inheritance of the Promised Land
9 pages
Understanding Computer Threats and Security
No ratings yet
Understanding Computer Threats and Security
1 page
Software Process Models Overview
No ratings yet
Software Process Models Overview
9 pages
Top 7 Software Process Models Explained
No ratings yet
Top 7 Software Process Models Explained
6 pages
Introduction to Software Engineering Basics
No ratings yet
Introduction to Software Engineering Basics
10 pages
Software Engineering Course Outline 2024
No ratings yet
Software Engineering Course Outline 2024
2 pages
Software Project Management Essentials
No ratings yet
Software Project Management Essentials
4 pages
Time and Health in Bosnian Language
No ratings yet
Time and Health in Bosnian Language
20 pages
Class 8 English Exam Paper Overview
No ratings yet
Class 8 English Exam Paper Overview
3 pages
Class 6
No ratings yet
Class 6
8 pages
Describing Imagery in Literature
No ratings yet
Describing Imagery in Literature
11 pages
Jumbled Sentences Exercises for Class 10
100% (2)
Jumbled Sentences Exercises for Class 10
7 pages
Defining Theme in Portuguese Linguistics
100% (1)
Defining Theme in Portuguese Linguistics
20 pages
Identifying Common Grammar Errors
No ratings yet
Identifying Common Grammar Errors
11 pages
Countable vs Uncountable Nouns Guide
No ratings yet
Countable vs Uncountable Nouns Guide
28 pages
Lesson 5: Loving Your Parents
No ratings yet
Lesson 5: Loving Your Parents
17 pages
Understanding Phrasal Verbs in English
No ratings yet
Understanding Phrasal Verbs in English
7 pages
Understanding Present Perfect Tense
No ratings yet
Understanding Present Perfect Tense
9 pages
Understanding Simple Future Tense
No ratings yet
Understanding Simple Future Tense
12 pages
Understanding Deterministic Finite Automata
No ratings yet
Understanding Deterministic Finite Automata
33 pages
Duolingo English Test Vocabulary Guide
100% (8)
Duolingo English Test Vocabulary Guide
15 pages
Understanding Modal Verbs and Usage
No ratings yet
Understanding Modal Verbs and Usage
3 pages
Class 10 Communication Skills Guide
100% (3)
Class 10 Communication Skills Guide
4 pages
Present Simple Exercise Guide
No ratings yet
Present Simple Exercise Guide
5 pages
Adverbs of Frequency Exercises PDF
No ratings yet
Adverbs of Frequency Exercises PDF
2 pages
English Words
No ratings yet
English Words
91 pages
Useful Expressions and Vocabulary For CAE and FCE Speaking Part 1 - CISL English School Blog
No ratings yet
Useful Expressions and Vocabulary For CAE and FCE Speaking Part 1 - CISL English School Blog
1 page
Kagan Rally Coach Instructions for 3rd Grade
No ratings yet
Kagan Rally Coach Instructions for 3rd Grade
6 pages
English Grammar MCQs and Exercises
No ratings yet
English Grammar MCQs and Exercises
24 pages
Adverb Usage and Rules Explained
No ratings yet
Adverb Usage and Rules Explained
20 pages
Theory of Computation Study Guide
No ratings yet
Theory of Computation Study Guide
34 pages
Present Simple vs. Present Continuous
100% (1)
Present Simple vs. Present Continuous
1 page
Irregular and Regular Verbs Worksheet
50% (4)
Irregular and Regular Verbs Worksheet
5 pages
Present Perfect: Correr Conjugation
No ratings yet
Present Perfect: Correr Conjugation
13 pages
Learning Units for IX Grade English
No ratings yet
Learning Units for IX Grade English
6 pages
A2-B1 Writing Assessment Rubric
No ratings yet
A2-B1 Writing Assessment Rubric
2 pages
Irregular Plural Nouns in English
No ratings yet
Irregular Plural Nouns in English
2 pages

Compiler Design: Syntax and Parsing

Uploaded by

Compiler Design: Syntax and Parsing

Uploaded by

BCS 324: COMPILER DESIGN &

Introduction to Compiling – Part B

Fig.1: Syntax-directed translator

1. A set of tokens known as terminal symbols

Fig.2: Parse tree for non-terminal A

Fig. 3:Two parse trees for 9-5+2

NB: A left associative parse tree grows down towards the

Fig. 5: Annotated parse tree for 9-5+2

 For example, in the Pascal array

(c) type array [ simple ] of type

array [ simple ] of type num dotdot num simple

num dotdot num integer

Fig.6: Steps in top-down construction of a parse tree

Fig.7: Lexical analysis

 Wrongly spelt words e.g. fi. Change to if

You might also like