0% found this document useful (0 votes)
6 views26 pages

Top-Down Parsing Techniques Explained

The document discusses top-down parsing techniques, focusing on Recursive-Descent and Predictive Parsing. It explains the concepts of LL(1) grammars, parsing tables, and the construction of FIRST and FOLLOW sets, which are essential for creating LL(1) parsers. Additionally, it provides examples of LL(1) parsing and highlights the conditions under which grammars are considered LL(1).

Uploaded by

addisjunior85
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
6 views26 pages

Top-Down Parsing Techniques Explained

The document discusses top-down parsing techniques, focusing on Recursive-Descent and Predictive Parsing. It explains the concepts of LL(1) grammars, parsing tables, and the construction of FIRST and FOLLOW sets, which are essential for creating LL(1) parsers. Additionally, it provides examples of LL(1) parsing and highlights the conditions under which grammars are considered LL(1).

Uploaded by

addisjunior85
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd

Top-Down Parsing

• The parse tree is created top to bottom.


• Top-down parser
– Recursive-Descent Parsing
• Backtracking is needed (If a choice of a production rule does not work, we backtrack to try other
alternatives.)
• It is a general parsing technique, but not widely used.
• Not efficient
– Predictive Parsing
• no backtracking
• efficient
• needs a special form of grammars (LL(1) grammars).
• Recursive Predictive Parsing is a special form of Recursive Descent parsing without backtracking.
• Non-Recursive (Table Driven) Predictive Parser is also known as LL(1) parser.

1
Recursive-Descent Parsing (uses Backtracking)
• Backtracking is needed.
• It tries to find the left-most derivation.

S  aBc
B  bc | b
S S
input: abc
a B c a B c
fails, backtrack
b c b

2
Predictive Parser
a grammar   a grammar suitable for
predictive
eliminate left parsing (a LL(1) grammar)
left recursion factor no %100 guarantee.

• When re-writing a non-terminal in a derivation step, a predictive parser


can uniquely choose a production rule by just looking the current
symbol in the input string.

A  1 | ... | n input: ... a .......

current token

3
Predictive Parser (example)

stmt  if ...... |
while ...... |
begin ...... |
for .....

• When we are trying to write the non-terminal stmt, if the current token
is if we have to choose first production rule.
• When we are trying to write the non-terminal stmt, we can uniquely
choose the production rule by just looking the current token.
• We eliminate the left recursion in the grammar, and left factor it. But it
may not be suitable for predictive parsing (not LL(1) grammar).
4
Recursive Predictive Parsing
• Each non-terminal corresponds to a procedure.

Ex: A  aBb (This is only the production rule for A)

proc A {
- match the current token with a, and move to the next token;
- call ‘B’;
- match the current token with b, and move to the next token;
}

5
Recursive Predictive Parsing (cont.)
A  aBb | bAB

proc A {
case of the current token {
‘a’: - match the current token with a, and move to the next
token;
- call ‘B’;
- match the current token with b, and move to the next token;
‘b’: - match the current token with b, and move to the next token;
- call ‘A’;
- call ‘B’;
}
}
6
Recursive Predictive Parsing (cont.)
• When to apply -productions.

A  aA | bB | 

• If all other productions fail, we should apply an -production. For


example, if the current token is not a or b, we may apply the
-production.
• Most correct choice: We should apply an -production for a non-
terminal A when the current token is in the follow set of A (which
terminals can follow A in the sentential forms).

7
Recursive Predictive Parsing (Example)
A  aBe | cBd | C
B  bB | 
Cf
proc C { match the current token with f,
proc A { and move to the next token; }
case of the current token {
a: - match the current token with a,
and move to the next token; proc B {
- call B; case of the current token {
- match the current token with e, b: - match the current token with b,
and move to the next token; and move to the next token;
c: - match the current token with c, - call B
and move to the next token; e,d: do nothing
- call B; }
- match the current token with d, }
and move to the next token;
f: - call C
}
}
follow set of B

first set of C
8
Non-Recursive Predictive Parsing -- LL(1) Parser
• Non-Recursive predictive parsing is a table-driven parser.
• It is a top-down parser.
• It is also known as LL(1) Parser.

input buffer

stack Non-recursive output


Predictive Parser

Parsing Table

9
LL(1) Parser
input buffer
– our string to be parsed. We will assume that its end is marked with a special symbol $.

output
– a production rule representing a step of the derivation sequence (left-most derivation) of the string in the
input buffer.

stack
– contains the grammar symbols
– at the bottom of the stack, there is a special end marker symbol $.
– initially the stack contains only the symbol $ and the starting symbol S. $S  initial stack
– when the stack is emptied (ie. only $ left in the stack), the parsing is completed.

parsing table
– a two-dimensional array M[A,a]
– each row is a non-terminal symbol
– each column is a terminal symbol or the special symbol $
– each entry holds a production rule.

10
LL(1) Parser – Parser Actions
• The symbol at the top of the stack (say X) and the current symbol in the input string
(say a) determine the parser action.
• There are four possible parser actions.
1. If X and a are $  parser halts (successful completion)
2. If X and a are the same terminal symbol (different from $)
 parser pops X from the stack, and moves the next symbol in the input buffer.
3. If X is a non-terminal
 parser looks at the parsing table entry M[X,a]. If M[X,a] holds a production rule
XY1Y2...Yk, it pops X from the stack and pushes Yk,Yk-1,...,Y1 into the stack. The
parser also outputs the production rule XY1Y2...Yk to represent a step of the
derivation.
4. none of the above  error
– all empty entries in the parsing table are errors.
– If X is a terminal symbol different from a, this is also an error case.

11
LL(1) Parser – Example1
S  aBa a b $ LL(1) Parsing
B  bB |  S S  aBa Table
B B B  bB

stack input output


$S abba$ S  aBa
$aBa abba$
$aB bba$ B  bB
$aBb bba$
$aB ba$ B  bB
$aBb ba$
$aB a$ B
$a a$
$ $ accept, successful completion

12
LL(1) Parser – Example1 (cont.)

Outputs: S  aBa B  bB B  bB B

Derivation(left-most): SaBaabBaabbBaabba

S
parse tree
a B a

b B

b B


13
LL(1) Parser – Example2

E  TE’
E’  +TE’ | 
T  FT’
T’  *FT’ | 
F  (E) | id

id + * ( ) $
E E  TE’ E  TE’
E’ E’  +TE’ E’   E’  
T T  FT’ T  FT’
T’ T’   T’  *FT’ T’   T’  
F F  id F  (E)
14
LL(1) Parser – Example2
stack input output
$E id+id$ E  TE’
$E’T id+id$ T  FT’
$E’ T’F id+id$ F  id
$ E’ T’id id+id$
$ E ’ T’ +id$ T’  
$ E’ +id$ E’  +TE’
$ E’ T+ +id$
$ E’ T id$ T  FT’
$ E ’ T’ F id$ F  id
$ E’ T’id id$
$ E ’ T’ $ T’  
$ E’ $ E’  
$ $ accept

15
Constructing LL(1) Parsing Tables
• Two functions are used in the construction of LL(1) parsing tables:
– FIRST FOLLOW

• FIRST() is a set of the terminal symbols which occur as first symbols


in strings derived from  where  is any string of grammar symbols.
• if  derives to , then  is also in FIRST() .

• FOLLOW(A) is the set of the terminals which occur immediately after


(follow) the non-terminal A in the strings derived from the starting
symbol.
– a terminal a is in FOLLOW(A) if S  *
Aa
*
– $ is in FOLLOW(A) if S  A

16
Compute FIRST for Any String X
• If X is a terminal symbol  FIRST(X)={X}
• If X is a non-terminal symbol and X   is a production rule
  is in FIRST(X).
• If X is a non-terminal symbol and X  Y1Y2..Yn is a production rule
 if a terminal a in FIRST(Yi) and  is in all FIRST(Yj) for
j=1,...,i-1 then a is in FIRST(X).
 if  is in all FIRST(Yj) for j=1,...,n
then  is in FIRST(X).
• If X is   FIRST(X)={}
• If X is Y1Y2..Yn
 if a terminal a in FIRST(Yi) and  is in all FIRST(Yj) for
j=1,...,i-1 then a is in FIRST(X).
 if  is in all FIRST(Yj) for j=1,...,n
then  is in FIRST(X).
17
FIRST Example
E  TE’
E’  +TE’ | 
T  FT’
T’  *FT’ | 
F  (E) | id

FIRST(F) = {(,id} FIRST(TE’) = {(,id}


FIRST(T’) = {*, } FIRST(+TE’ ) = {+}
FIRST(T) = {(,id} FIRST() = {}
FIRST(E’) = {+, } FIRST(FT’) = {(,id}
FIRST(E) = {(,id} FIRST(*FT’) = {*}
FIRST() = {}
FIRST((E)) = {(}
FIRST(id) = {id}

18
Compute FOLLOW (for non-terminals)
• If S is the start symbol  $ is in FOLLOW(S)

• if A  B is a production rule


 everything in FIRST() is FOLLOW(B) except 

• If ( A  B is a production rule ) or
( A  B is a production rule and  is in FIRST() )
 everything in FOLLOW(A) is in FOLLOW(B).

We apply these rules until nothing more can be added to any follow set.

19
FOLLOW Example
E  TE’
E’  +TE’ | 
T  FT’
T’  *FT’ | 
F  (E) | id

FOLLOW(E) = { $, ) }
FOLLOW(E’) = { $, ) }
FOLLOW(T) = { +, ), $ }
FOLLOW(T’) = { +, ), $ }
FOLLOW(F) = {+, *, ), $ }

20
Constructing LL(1) Parsing Table -- Algorithm
• for each production rule A   of a grammar G
– for each terminal a in FIRST()
 add A   to M[A,a]
– If  in FIRST() 
for each terminal a in FOLLOW(A) add A   to M[A,a]
– If  in FIRST() and $ in FOLLOW(A) 
add A   to M[A,$]

• All other undefined entries of the parsing table are error entries.

21
Constructing LL(1) Parsing Table -- Example
E  TE’ FIRST(TE’)={(,id}  E  TE’ into M[E,(] and M[E,id]

E’  +TE’ FIRST(+TE’ )={+}  E’  +TE’ into M[E’,+]

E’   FIRST()={}  none
but since  in FIRST()
and FOLLOW(E’)={$,)}  E’   into M[E’,$] and M[E’,)]

T  FT’ FIRST(FT’)={(,id}  T  FT’ into M[T,(] and M[T,id]


T’  *FT’ FIRST(*FT’ )={*}  T’  *FT’ into M[T’,*]

T’   FIRST()={}  none
but since  in FIRST()
and FOLLOW(T’)={$,),+}  T’   into M[T’,$], M[T’,)] and
M[T’,+]

F  (E) FIRST((E) )={(}  F  (E) into M[F,(]

F  id FIRST(id)={id}  F  id into M[F,id]


22
LL(1) Grammars
• A grammar whose parsing table has no multiply-defined entries is said
to be LL(1) grammar.

one input symbol used as a look-head symbol do determine parser action

LL(1) left most derivation


input scanned from left to right

• The parsing table of a grammar may contain more than one production
rule. In this case, we say that it is not a LL(1) grammar.

23
A Grammar which is not LL(1)
SiCtSE | a FOLLOW(S) = { $,e }
EeS |  FOLLOW(E) = { $,e }
Cb FOLLOW(C) = { t }

FIRST(iCtSE) = {i}
a b e i t $
FIRST(a) = {a} S  iCtSE
S Sa
FIRST(eS) = {e}
E EeS E
FIRST() = {}
E
FIRST(b) = {b}
C Cb

two production rules for M[E,e]

Problem  ambiguity
24
A Grammar which is not LL(1) (cont.)
• What do we have to do it if the resulting parsing table contains multiply
defined entries?
– If we didn’t eliminate left recursion, eliminate the left recursion in the grammar.
– If the grammar is not left factored, we have to left factor the grammar.
– If its (new grammar’s) parsing table still contains multiply defined entries, that grammar is
ambiguous or it is inherently not a LL(1) grammar.
• A left recursive grammar cannot be a LL(1) grammar.
– A  A | 
 any terminal that appears in FIRST() also appears FIRST(A) because A  .
 If  is , any terminal that appears in FIRST() also appears in FIRST(A) and FOLLOW(A).

• A grammar is not left factored, it cannot be a LL(1) grammar


• A  1 | 2
 any terminal that appears in FIRST(1) also appears in FIRST(2).

• An ambiguous grammar cannot be a LL(1) grammar.


25
Properties of LL(1) Grammars
• A grammar G is LL(1) if and only if the following conditions hold for
two distinctive production rules A   and A  

1. Both  and  cannot derive strings starting with same terminals.

2. At most one of  and  can derive to .

3. If  can derive to , then  cannot derive to any string starting


with a terminal in FOLLOW(A).

26

You might also like