0% found this document useful (0 votes)
3 views53 pages

6 Codegen+Opti

The document discusses the role and complexities of code generation in compiler design, focusing on the transition from intermediate representation (IR) to target machine code while preserving semantics. It highlights challenges such as instruction selection, register allocation, and instruction ordering, which are often NP-complete problems. Additionally, it covers optimization techniques using control-flow graphs and common sub-expression elimination to enhance code efficiency.

Uploaded by

mineman270
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
3 views53 pages

6 Codegen+Opti

The document discusses the role and complexities of code generation in compiler design, focusing on the transition from intermediate representation (IR) to target machine code while preserving semantics. It highlights challenges such as instruction selection, register allocation, and instruction ordering, which are often NP-complete problems. Additionally, it covers optimization techniques using control-flow graphs and common sub-expression elimination to enhance code efficiency.

Uploaded by

mineman270
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Code Generation

Rupesh Nasre.

CS3300 Compiler Design


IIT Madras
August 2020
Character stream

Machine-Independent
Machine-Independent
Lexical
LexicalAnalyzer
Analyzer Code
CodeOptimizer
Optimizer

Intermediate representation

Backend
Token stream
Frontend

Syntax
SyntaxAnalyzer
Analyzer Code
CodeGenerator
Generator

Syntax tree Target machine code

Machine-Dependent
Machine-Dependent
Semantic
SemanticAnalyzer
Analyzer Code
CodeOptimizer
Optimizer

Syntax tree Target machine code

Intermediate
Intermediate Symbol
Code
CodeGenerator
Generator Table
2
Intermediate representation
Role of Code Generator

From IR to target program.

Must preserve the semantics of the source program.
– Meaning intended by the programmer in the original
source program should carry forward in each compilation
stage until code-generation.

Target code should be of high quality
– execution time or space or energy or …

Code generator itself should run efficiently.
11 instruction selection,
22 register allocation and 3
33 instruction ordering.
Code Generator in Reality

The problem of generating an optimal target
program is undecidable.

Several subproblems are NP-Hard (such as
register allocation).

Need to depend upon
– Approximation algorithms
– Heuristics
– Conservative estimates

4
Input + Output

3AC (Quadruples / Triples / Indirect triples)

VM instructions (bytecodes / stack machine codes)
Intermediate representation ●
Linear representations (postfix)

Graphical representation (syntax trees / DAGs)

Code
CodeGenerator
Generator ●
RISC (many registers, 3AC, simple addressing
modes, simple ISA)

CISC (few registers, 2AC, variety of addressing
Target machine code modes, several register classes, variable length
instructions, instructions with side-effects)

Stack machine (push / pop, stack top uses
registers, used in JVM, JIT compilation)

It helps to assume an assembler. Imagine if


in A3 you had to generate machine code and
manipulate bits rather than generating x86
assembly.

5
IR and Target Code

t0 = y
Intermediate representation t0 = t0 + z
x = t0

Code
CodeGenerator
Generator
LD R0, y
Target machine code ADD R0, R0, z
ST x, R0

What
Whatis
isthe
theissue
issuewith
withthis
thiskind
kindof
ofcode
codegeneration?
generation?
6
IR and Target Code

t0 = y Generate code for


Intermediate representation t0 = t0 + z a=b+c
x = t0 d=a+e

Code
CodeGenerator
Generator LD R0, b
LD R0, y ADD R0, R0, c
ADD R0, R0, z ST a, R0
Target machine code LD R0, a
ST x, R0
ADD R0, R0, e
ST d, R0

7
1 Instruction Selection

Complexity of instruction selection depends upon
– Level of the IR
Low-level IR can help generate more efficient code.
e.g., intsize versus 4.
– Nature of the ISA
Uniformity and completeness of ISA affects the code.
e.g., floats required to be loaded in special registers.
– Desired quality of the generated code
Context and amount of information to process affects
the code quality.
8
e.g., INC a versus LD R0, a; ADD R0, R0, #1; ST a, R0
2 Register Allocation

Register allocation involves
– Allocation: which variables to be put into registers
– Assignment: which register to use for a variable

Finding an optimal assignment of registers to
variables is NP-Complete.

Architectural conventions complicate matters.
– Combination of registers used for double-precision arithmetic.
– Result is stored in accumulator.
– Registers are reserved for special instructions.
– ...
9
Thought
Thoughtexercise:
exercise:How
Howto
touse
usegraph
graphcoloring
coloringfor
forregister
registerallocation?
allocation?
3 Instruction Ordering

Instruction order affects execution efficiency.

Picking the best order is NP-complete.

Optimizer / Code generator needs to look at multiple instructions at
a time.

Classwork: Create an example IR whose generated code results in
the same meaning but different efficiency for different orders.

1:
1:R0
R0==aa 1:
1:R0
R0==aa
2:
2:R1
R1==bb 2:
2:R1
R1==bb
3:
3:R2
R2==cc 4:
4:R2
R2==R0
R0++R1
R1
4:
4:R3
R3==R0
R0++R1
R1 3:
3:R0
R0==cc
5:
5:R4
R4==R2
R2++R3
R3 5:
5:R3
R3==R2
R2++R0
R0
6:
6:dd==R4
R4 6:
6:dd==R3
R3

10
A Typical Target Machine Model
Instruction Type Example
Load LD R1, x
Store ST R1, x
Computation SUB R1, R2, R3
Unconditional Jump BR main
Conditional Jump BLTZ R1, main

Addressing Mode Example


Direct LD R1, [100000]
Named / Variable LD R1, x
Variable Indexed LD R1, a(R2)
Immediate Indexed LD R1, 100(R2)
Indirect LD R1, *x
Immediate MOV R1, #100
11
... ...
Example Code Generation
using our Target Machine Model
Source ...
...
xx==yy--zz Optimization and Code
Generation are often run
together multiple-times.
...
...
t1
t1==yy
IR t2
t2==zz
t1
t1==t1
t1––t2
t2
xx==t1
t1 … …
; ;yyalready
alreadyininR1
R1
LDLDR2,
R2,zz
SUB
SUBR1, R1,R1,
R1,R2 R2
...
... ST x, R1
LD ST x, R1
LDR1,
R1,yy Optimized
Target LD
LDR2,
R2,zz target
code SUB
SUBR1,R1,R1,
R1,R2
R2 code …
ST …
STx,x,R1
R1 LDLDR1, R1,yy
LDLDR2, R2,zz
; ;xxisisnot
notused
used
12
Homework

Exercises 8.2.3 from ALSU book.

13
Basic Blocks and CFG

A basic block is a maximal sequence of
consecutive 3AC instructions such that
– Single-entry: Control-flow enters the basic-block
through only the first instruction in the block.
– Single-exit: Control leaves the block only after the
last instruction.

Thus, if control reaches a basic block, all
instructions in it are executed in sequence.
– No branching from in-between or no jumps to in-
between instructions.

Basic-blocks together form a control-flow graph.14
for
for(ii(ii==1;
1;iiii<<10;10;++ii)
++ii){{
for ENTRY i i==11 B1
for(jj(jj==1;
1;jjjj<<10;
10;++jj)
++jj){{

Source
a[ii][jj]
a[ii][jj]==0;
0;
}}
j j==11 B2
}}
for
for(ii(ii==1;
1;iiii<<10;10;++ii)
++ii)
a[ii][ii] = 1;
a[ii][ii] = 1;
t1t1==10 10**i i
i i==11 t2t2==t1 t1++j j

Control-flow graph (CFG)


L2: t3t3==44**t2 t2
L2: t4 B3
j j==11 t4 = t3 –44
= t3 – 44
L1: a[t4]
a[t4]==00
L1: j j==j j++11
t1t1==10 10**i i

Intermediate representation
t2 ififj j<<1010goto
gotoB3
B3
t2==t1 t1++j j
t3t3==44**t2 t2
t4t4==t3 t3––4444
a[t4]
a[t4]==00 i i==i i++11 B4
j j==j j++11 ififi i<<10
10goto
gotoB2
B2
ififj j<<10
10goto
gotoL1
L1
i i==i i++11
ififi i<<10
10goto
gotoL2
L2 i i==11 B5
i i==11
L3:
L3:
t5 t5t5==i i--11
t5==i i--11
t6 t6t6==44 44**t5
t6==44 44**t5t5 t5
a[t6] EXIT a[t6]
a[t6]==11 B6
a[t6]==11
i i==i i++11 i i==i i++11 15
ififi i<<10 ififi i<<10
10goto
gotoB6
10goto
gotoL3
L3 B6
Classwork

Build the CFG for the following program.
– Ideally, we should first convert it to IR, but we will
work with the source code for simplicity.
prod
prod==1;1;
prod
prod==1;
1; ENTRY
iiii==1;
1;
for
for(ii(ii==1;1;iiii<=
<=N; N;++ii)
++ii){{
ifif(ii(ii%
%2)2) iiii<=
prod <=NN result
result==prod;
prod;
prod*= *=ii;ii;
}}
result
result==prod; prod; iiii%
%22 EXIT

prod
prod*=
*=ii;ii;

++ii;
++ii; 16
Character stream

Machine-Independent
Machine-Independent
Lexical
LexicalAnalyzer
Analyzer Code
CodeOptimizer
Optimizer

Intermediate representation

Backend
Token stream
Frontend

Syntax
SyntaxAnalyzer
Analyzer Code
CodeGenerator
Generator

Syntax tree Target machine code

Machine-Dependent
Machine-Dependent
Semantic
SemanticAnalyzer
Analyzer Code
CodeOptimizer
Optimizer

Syntax tree Target machine code

Intermediate
Intermediate Symbol
Code
CodeGenerator
Generator Table
17
Intermediate representation
Optimizations using CFG

Local: within a basic-block
– Local common sub-expressions
– Dead-code elimination
– Use of algebraic identities

Global: across blocks
– Common sub-expression elimination
– Strength reduction
– Data-flow analysis

18
Local Common Sub-expressions
Elimination
a + a * (b – c) + (b – c) * d
aa==bb++cc
bb==aa––dd ++ ++ 9
cc==bb++cc ++ ** 8 **
++ 6
dd==aa--dd
aa ** -- dd ** 5 7 dd
aa -- bb cc 1 aa -- 4
bb cc bb cc
2 3
c ++

Does not distinguish properly between different
b, d --
variable instances.

It is unclear why certain variable should be used or a
a ++ dd new one should be formed.

We need use-def information.
bb cc
19
Local Common Sub-expressions
Elimination
aa==bb++cc a1
a1==b0
b0++c0
c0

Variables have initial DEFs.
bb==aa––dd b1
b1==a1
a1––d0
d0

Each DEF creates a new instance of
cc==bb++cc c1
c1==b1
b1++c0c0 the variable (recall SSA).
dd==aa--dd d1
d1==a1
a1--d0
d0

Each USE refers to the latest DEF.

c ++ c1 ++

b, d -- b1, d1 --

a ++ dd a1 ++ d0
d0

bb cc b0
b0 c0
c0
20
Local Common Sub-expressions
Elimination No common
expressions

aa==bb++cc a1
a1==b0
b0++c0
c0 aa==bb++cc a1
a1==b0
b0++c0
c0
bb==aa––dd b1
b1==a1
a1––d0
d0 bb==bb––dd b1
b1==b0
b0––d0
d0
cc==bb++cc c1
c1==b1
b1++c0c0 cc==cc++dd c1
c1==c0
c0++d0
d0
dd==aa--dd d1
d1==a1
a1--d0
d0 ee==bb++cc e1
e1==b1
b1++c1
c1
Classwork: Find the Basic Block DAG
(expression DAG) for the above Basic Block.

c ++ c1 ++

++ e1
b, d -- b1, d1 --

a ++ dd a1 ++ d0
d0 a1 ++ b1 -- ++ c1

bb cc b0
b0 c0
c0 b0
b0 c0
c0 d0
d0
21
Dead-code Elimination

Remove root from the DAG that have no live
variables attached.
– There could be multiple roots in the DAG.
– We may be able to apply this repeatedly.

++ e1
Assuming a and b are live (used later)
while c and e are not, then

We can remove e1. a1 ++ b1 -- ++ c1

Once e1 is removed, c1 can also be removed.
b0
b0 c0
c0 d0
d0 22
Algebraic Identities

Algebraic properties
– x+0=0+x=x x–0=x
– x*1=1*x=x x/1=x

Strength reduction
– x2 = x * x
– 2*x=x+x
– x / 2 = x * 0.5

Constant folding
– 2 * 3.14 = 6.28
23
Algebraic Identities

Commutativity and Associativity
– DAG construction can help us here.
– Apart from checking left op right, we could also
check right op left for commutativity.
e.g., (a + b) + (b + a).
e.g., a = b + c; e = c + d + b;

Some algebraic laws are not obvious.
– e.g., Can you optimize if (x > y) a = b + x + c – y?
However, we need to worry about underflows.
24
Array References

Array references cannot be treated like usual
variables.
xx==a1
a1 xx==a[ii]
a[ii] We represent a[ii] as a node with
a2
a2==yy a[jj]
a[jj]==yy two or three children depending
zz==a1 zz==a[ii] upon whether it is rvalue or lvalue.
a1 a[ii]

x, z a2 x, z a[jj]
x
x =[]
=[] =[]
=[] =[]
=[] z

a1
a1 yy a[ii]
a[ii] yy a0
a0 ii0
ii0 jj0
jj0 y0
y0

wrong correct
How do you decide the order in
which assignments are executed?
25
Array References

Array references cannot be treated like usual
variables.
xx==a1
a1 xx==a[ii]
a[ii] xx==a[ii]
a[ii]
a2
a2==yy a[jj]
a[jj]==yy b[jj]
b[jj]==yy
zz==a1
a1 zz==a[ii]
a[ii] zz==a[ii]
a[ii]

x, z
a1
a1
a2
yy
x =[]
=[]
x =[]
=[] =[]
=[] z
Depending upon how much time a
compiler can afford,

it would either analyze if a[ii] and
a0
a0 ii0 jj0 y0
b[jj] are referring to the same
ii0 jj0 y0
memory location OR

conservatively assume that they
MAY be referring to the same
location.
26
Aliasing

The issue with array references is called aliasing.

Two expressions may refer to the same memory
location at the execution time.
– a[ii] and a[jj]
– *p and *q
– union variables
– Pass by reference variables

Local processing may fail to identify aliasing
– Precise alias analysis is computationally difficult.
27
Classwork: Aliasing

Find all the aliases in this C++ program.
#include
#include<iostream>
<iostream>
int
intgg==1;
1;
void
voidfun(int
fun(int&p,
&p,int
int*q,
*q,int
intr,r,int
inta,
a,int
int*s)
*s){{
int
int&x
&x==g;g;
int
int*y
*y==&p;
&p;
std::cout
std::cout<<<<pp<<<<*q*q<<<<rr<< <<aa<<<<*s*s<<
<<xx<<
<<*y*y<<
<<gg<<
<<std::endl;
std::endl;
}}
int
intmain()
main(){{
int
intaa==g;
g;
int
int&b
&b==g;g; e.g.,
e.g.,<&b,
<&b,&g>
&g>
int *c = &g;
int *c = &g;
fun(b,
fun(b,&g,
&g,g,
g,a,
a,c);
c);
return
return0;
0; 28
}}
Peephole Optimization

Consider a sliding window of instructions and
optimize it.

Repeated passes are often helpful.
– Redundant load/store elimination
– Dead-code elimination
– Control-flow optimization
– Algebraic simplifications
– Use of machine idioms
– ...
29
Peephole Optimization

Consider a sliding window of instructions and
optimize it.
...

Repeated passes are often helpful. ...
LD
LDR0,
R0,aa
– Redundant load/store elimination ST
STa,a,R0
R0
...
...
– Dead-code elimination
– Control-flow optimization
– Algebraic simplifications
– Use of machine idioms
– ...
30
Peephole Optimization

Consider a sliding window of instructions and
optimize it.

Repeated passes are often helpful.
– Redundant load/store elimination
– Dead-code elimination
– Control-flow optimization
– Algebraic simplifications
– Use of machine idioms debug = 0
constant
propagation
– ...
31
Peephole Optimization

Consider a sliding window of instructions and
optimize it.

Repeated passes are often helpful.
– Redundant load/store elimination
– Dead-code elimination
– Control-flow optimization
– Algebraic simplifications
– Use of machine idioms
– ...
Remove L1: goto L2 if no jumps to it.
32
Can be generalized to conditional jump to L1.
Peephole Optimization

Consider a sliding window of instructions and
optimize it.

Repeated passes are often helpful.
– Redundant load/store elimination
– Dead-code elimination
– Control-flow optimization
– Algebraic simplifications
– Use of machine idioms
– ...
33
Peephole Optimization

Consider a sliding window of instructions and
optimize it.

Repeated passes are often helpful.
– Redundant load/store elimination
– Dead-code elimination
– Control-flow optimization
– Algebraic simplifications
– Use of machine idioms
– ...
34
Register Allocation

Memory hierarchy: Network, File system, Main
memory, L3 cache, L2, L1, Registers.
– Capacity reduces, access time reduces.

Critical to allocate and assign registers for
efficiency.
– Register versus Memory could be ~10x
performance difference.

C allows register variables.
– register int a; // not always a good idea.
– register int a asm(“r12”); // tries a specific register.
35
– gcc -ffixed-r12 … // reserve r12.
Register Allocation
Classwork: Allocate registers for the following code.


First-Come-First-Served way (linear
scan) is often not the best policy for
register allocation (used in JIT).

We need to perform some analysis to
find out the benefit of allocating registers
to variables.

We may have to assign cost / benefit to
various operations within a loop.

What if we say that K registers would be
allocated to the top K variables that have
the maximum number of uses?

By paying a small spilling cost, we may
be able to increase the benefit of K
registers to more than K variables.
benefit(x, B) = F(use(x, B), live(x, B))
Variable x, Basic block B
use returns the number of uses.
live returns 0 or 1 based on if x is live after
36
leaving B.
Liveness bcdf
benefit(x, B) =
F(use(x, B), live(x, B)) B1
use returns number of uses.
live returns 0 or 1 based on if x
is live after leaving B.
acdef acdf
use(a, B1) = 1, live(a, B1) = 1 acde
use(a, B2) = 1, live(a, B2) = 0
use(b, B3) = 1, live(b, B3) = 1 B2 B3
use(c, B2) = 0, live(c, B2) = 1
use(a, B4) = 0, live(a, B4) = 0 cdef
... bcdef
cdef
Overall benefit S(x) =
sum(benefit(x, B)) for all B bdef
B4 (obtained from
Say, S(a) = 4, S(b) = 5, S(c) = 3, analyzing fun)
S(d) = 6, S(e) = 4, S(f) = 4.
bcdef

Assign R0, R1, R2 to a, b and
d globally (global allocation). B5 37

Use remaining register R3
inside blocks (local allocation).
Allocation

R1 and R2 remain assigned
to b and d throughout. B1

R3 is loaded repeatedly
inside the loop as an auxiliary
register.

a is not live at the start,
hence it is not loaded initially.

At the end of the loop, the
register values are stored
back. B2 B3

Overall benefit S(x) =


sum(benefit(x, B)) for all B
B4
Say, S(a) = 4, S(b) = 5, S(c) = 3,
S(d) = 6, S(e) = 4, S(f) = 4.

Assign R0, R1, R2 to a, b and
d globally (global allocation).
38

Use remaining register R3
inside blocks (local allocation).
Register Allocation as Graph Coloring

Vertices? Edges?

Vertices: Variables (or their instances)

Edges: Co-Live information
– If x and y are live at the same program point, add an
(undirected) edge between x and y.

Vertex coloring colors neighbors differently.
– Thus, vertex coloring colors x and y differently, if they
are live at the same program point.
– This means, x and y should not use the same register.
– Corollary: if x and z have the same color, they can 39
reuse the register (at different program points).
bcdf
Live Ranges

acdef acdf
acde

cdef
bcdef
cdef

bdef
(obtained from
analyzing fun)

bcdef

40
Interference Graph

This means, in basic block B1, b and e could use the same register.

Classwork: Try it for .


41
Interference Graph


Coloring gave us the maximum number of registers required for the program.

However, in practice, the number of registers is fixed.

Therefore, we need to generate spill code for storing a variable into memory
(ST x, R) and then reload the register with the next variable (LD R, y)

Thus, if b and e use the same register, that means b is not globally assigned
to R1 (4 slides ago). 42
Spilling

Spilling stores a value from register to memory.
– This frees-up the register.

Local Register Allocation (within basic block)
– If the maximum number of variables live at any
program point is K and number of registers is R,

If K <= R, no spilling is required.

If K > R, then spill-code is generated (store instruction)

Prioritize max-use variables.

Register assignment is straightforward.

Alternatives exist. e.g., spill the variable whose next use
is the farthest from the current point. 43
Spilling

Spilling stores a value from register to memory.
– This frees-up the register.

Global Register Allocation (across basic blocks)
– Use graph coloring via interference graph
– If more than R colors are required,

Split live ranges (insert spill-code).

Perform cost-benefit analysis to find spill victims (e.g.,
two spills in an if statement are preferred over a single
spill in a loop).
– May use other algorithms:

Δ + 1 coloring where Δ is the maxdegree. 44

Chaitin’s algorithm and its enhancements (uses stack)
Data Flow Analysis

Flow-sensitive: Considers the control-flow in a
function

Operates on a flow-graph with nodes as basic-
blocks and edges as the control-flow

Examples
aa == 88
– Constant propagation
– Common subexpression elimination
aa == 33 aa == 22
– Dead code elimination
What is the
bb == aa
value of b?
45
Reaching Definitions

Every assignment is a definition.

A definition d reaches a program point p if there
exists a path from the point immediately
following d to p such that d is not killed along
D0:
D0: yy == 33
the path. D1:
D1: xx == 10
10 B0
D2:
D2: yy == 1111
ifif cc

B1 D3:
D3: xx == 11 D5:
D5: zz == xx B2
D4:
D4: yy == 22 D6:
D6: xx == 44

B3 What definitions reach B3?


46
DFA Equations

in(B) = set of data flow facts entering block B

out(B) = …

gen(B) = set of data flow facts generated in B

kill(B) = set of data flow facts from the other
blocks killed in B

47
DFA for Reaching Definitions

in(B) = U out(P) where P is a predecessor of B

out(B) = gen(B) U (in(B) – kill(B)) D0:
D0: yy == 33
D1:
D1: xx == 1010 B0
D2:
D2: yy == 1111
ifif cc

Initially, out(B) = { }
B1 D3:
D3: xx == 11 D5:
D5: zz == xx B2
gen(B0) = {D1, D2} kill(B0) = {D3, D4, D6} D4:
D4: yy == 22 D6:
D6: xx == 44
gen(B1) = {D3, D4} kill(B1) = {D0, D1, D2, D6}
gen(B2) = {D5, D6} kill(B2) = {D1, D3} B3
gen(B3) = { } kill(B3) = { }
in1 out1 in2 out2 in3 out3
B0 {} {D1, D2} {} {D1, D2} {} {D1, D2}
B1 {} {D3, D4} {D1, D2} {D3, D4} {D1, D2} {D3, D4}
B2 {} {D5, D6} {D1, D2} {D2, D5, D6} {D1, D2} {D2, D5, D6} 48
B3 {} {} {D3, D4, D5, D6} {D3, D4, D5, D6} {D2, D3, D4, D5, D6} {D2, D3, D4, D5, D6}
Algorithm for Reaching Definitions
for each basic block B
compute gen(B) and kill(B)
out(B) = {}

do {
for each basic block B
in(B) = U out(P) where P \in pred(B)
out(B) = gen(B) U (in(B) - kill(B))
} while in(B) changes for any basic block B 49
Classwork

in(B) = U out(P) where P is a predecessor of B

out(B) = gen(B) U (in(B) – kill(B)) D1:
D1: yy == 33 B0
D2:
D2: xx == 1010
ifif cc
B2

Initially, out(B) = { } B1 D3:
D3: xx == 11 D5:
D5: zz == xx
D4:
D4: yy == 22 D6:
D6: xx == 44
gen(B0) = {D1, D2} kill(B0) = {D3, D4, D6, D8}
gen(B1) = {D3, D4} kill(B1) = {D1, D2, D6, D8}
B3 D7:
D7: zz == yy
gen(B2) = {D5, D6} kill(B2) = {D2, D3, D7, D8}
D8:
D8: xx == zz
gen(B3) = {D7, D8} kill(B3) = {D2, D3, D5, D6}
in1 out1 in2 out2 in3 out3 in4 out4
B0 {} {D1,D2} {D7,D8} {D1,D2, {D4,D7,D8} {D1,D2,D7} {D1,4,7} {D1,2,7}
D7}
B1 {} {D3,D4} {D1,D2} {D3,D4} {D1,D2,D7} {D3,D4,D7} {D1,2,7} {D3,4,7}
B2 {} {D5,D6} {D1,D2} {D1,D5,D6} {D1,D2,D7} {D1,D5,D6} {D1,2,7} {D1,5,6}
50
B3 {} {D78} {D3456} {D478} {D13456} {D1478} {D134567} {D1478}
DFA for Reaching Definitions
Domain Sets of definitions
Transfer function in(B) = U out(P)
out(B) = gen(B) U (in(B) - kill(B))
Direction Forward
Meet / confluence U
operator
Initialization out(B) = { }

51
DFA for Live Variables
Domain Sets of variables
Transfer function in(B) = use(B) U (out(B) - def(B))
out(B) = U in(S) where S is a successor of B
Direction Backward
Meet / confluence U
operator
Initialization in(B) = { }

A variable v is live at a program point p if v is used along some path


in the flow graph starting at p.
Otherwise, the variable v is dead.

52
Character stream

Machine-Independent
Machine-Independent
Lexical
LexicalAnalyzer
Analyzer Code
CodeOptimizer
Optimizer

Intermediate representation

Backend
Token stream
Frontend

Syntax
SyntaxAnalyzer
Analyzer Code
CodeGenerator
Generator

Syntax tree Target machine code

Machine-Dependent
Machine-Dependent
Semantic
SemanticAnalyzer
Analyzer Code
CodeOptimizer
Optimizer

Syntax tree Target machine code

Intermediate
Intermediate Symbol
Code
CodeGenerator
Generator Table
53
Intermediate representation

You might also like