Unit-5 Compiler Design-Missing Topics
Unit-5 Compiler Design-Missing Topics
Unit-5
Run Time Environments: Storage Organization, Run Time Storage Allocation,
Activation Records, Procedure Calls, Displays
Code Optimization: The Principal Sources of Optimization, Basic Blocks,
Optimization of Basic Blocks, Structure Preserving Transformations, Flow
Graphs, Loop Optimization, Data-Flow Analysis, Peephole Optimization
Code Generation: Issues in the Design of a Code Generator, Object Code
Forms, Code Generation Algorithm, Register Allocation and Assignment.
1|P age
Compiler Design Unit-5
Subdivision of Run-time Memory:
Runtime storage comes into blocks, where
a byte is used to show the smallest unit of
addressable memory.
Using the four bytes a machine word can
form. Object of multibyte is stored in
consecutive bytes and gives the first byte
address.
Run-time storage can be subdivided to hold
the different components of an executing program:
[Link] executable code
[Link] data objects
[Link] data-object- heap
[Link] data objects- stack
Static Versus Dynamic Storage Allocation:
The layout and allocation of data to memory locations in the run-time
environment are key issues in storage management.
The two terms static and dynamic distinguish between compile time and run
time, respectively.
Difference between static and Dynamic:
Static Dynamic
Also known as compile time allocation Also known as
runtime allocation
It can be made by the compiler looking only at the text it can be decided
of the program, not at what the program does when it only while the
executes i.e., based on content of the program, doesn’t program is running.
depend on output
EX: Code and Static Area EX: Stack and Heap
areas
3|P age
Compiler Design Unit-5
• A stack top pointer manages allocation (increase) and deallocation
(decrease).
• This method supports recursion and allows variables to have different
storage and values per call.
Stack allocation of space:
1. Activation Trees
2. Activation Records
3. Calling Sequences
4. Variable-Length Data on the Stack
[Link] Trees:
• An activation tree is a hierarchical data structure that visually represents
the flow of execution in a program, particularly when dealing with nested
function calls.
• Each node in the tree represents a procedure or function call, and the
edges show the caller-callee relationships.
• This structure is commonly used in compiler design to manage memory
allocation and deallocation during program execution.
Example:
4|P age
Compiler Design Unit-5
2. Activation Records:
a. Procedure calls and returns are usually managed by a
run-time stack called the control stack.
b. Each live activation has an activation record
(sometimes called a frame)
c. The root of activation tree is at the bottom of the
stack
d. The current execution path specifies the content of
the stack with the last
e. Activation has record in the top of the stack.
• An activation record contains all the necessary information required to
call a procedure.
• An activation record may contain the following units (depending upon the
source language used).
5|P age
Compiler Design Unit-5
Temporaries: Stores temporary and intermediate values of an expression.
Local Data: Stores local data of the called procedure.
Machine Status: Stores machine status such as Registers, Program Counter etc.,
before the procedure is called.
Control Link: Stores the address of activation record of the caller procedure.
Access Link: Stores the information of data which is outside the local scope.
Actual Parameters: Stores actual parameters, i.e., parameters which are used to
send input to the called procedure.
Returned values: Hold value returned by called procedure
[Link] Sequences:
Designing calling sequences and the layout of activation records, the following
[Link] communicated between caller and callee are generally placed at the
beginning of callee’s activation record
[Link]-length items: are generally placed at the middle. such items typically
include the controlling, the access link, and the machine status fields.
[Link] whose size may not be known early enough: are placed at the end of
activation record
[Link] must locate the top-of-stack pointer judiciously: a common approach is to
have it point to the end of fixed length fields in the activation record.
• In compiler design, the caller is the function that calls another function,
which is called the callee.
• The caller initiates the function call by passing arguments and controlling
the transfer of execution to the callee.
• The callee then executes its code, potentially modifying arguments or
creating its own local data within its activation record.
6|P age
Compiler Design Unit-5
4. Variable-Length Data on the Stack:
• The run-time memory-management system must deal frequently with the
allocation of space for objects the sizes of which are not known at compile
time, but which are local to a procedure and thus may be allocated on the
stack.
• It is possible to allocate objects, arrays, or other structures of unknown
size on the stack.
• The reason to prefer placing objects on the stack if possible is that we
avoid the expense of garbage collecting their space.
• Note that the stack can be used only for an object if it is local to a
procedure and becomes inaccessible when the procedure returns.
• A common strategy for allocating variable-length arrays (i.e., arrays
whose size depends on the value of one or more parameters of the called
procedure) is shown in Fig below
The same scheme works for objects of any type if they are local to the
procedure called and have a size that depends on parameters of the call.
7|P age
Compiler Design Unit-5
• Java also supports heap-based allocation through the use of new
keywords.
• The heap is the portion of the store that is used for data that lives
indefinitely, or until the program explicitly deletes it.
1. The Memory Manager
2. The Memory Hierarchy of a Computer
3. Locality in Programs
4. Reducing Fragmentation
5. Manual Deallocation Requests
Advantages:
• Dynamic Data Structures can be handled by heap-based allocation.
• Heap-Based Allocation gives more flexibility to a data structure.
• It can also handle large amounts of memory as compared to stack-based
allocation.
Disadvantages:
• Heap-Based Allocation is slower than Stack-Based in terms of speed.
• Risk of memory leaks.
• It enables the allocation of memory in a non-nested design. Storage can
be allocated & freed arbitrarily from an area known as Heap.
• Heap Allocation is helpful for executing data whose size varies as the
program is running. Heap is maintained as a list of free space called free
space list.
Procedure Calls:
• Involves transferring control to a specific function or subroutine.
• Creates an activation record on the stack to store local variables,
parameters, and return addresses.
• Focuses on managing the execution flow and memory for the called
procedure.
• Procedure is an important and frequently used programming construct
for a compiler.
• It is used to generate good code for procedure calls and returns.
• The translation for a call includes a sequence of actions taken on entry
and exit from each procedure.
• Following actions take place in a calling sequence:
o When a procedure call occurs then space is allocated for activation
record.
8|P age
Compiler Design Unit-5
o Evaluate the argument of the called procedure.
o Establish the environment pointers to enable the called procedure
to access data in enclosing blocks.
o Save the state of the calling procedure so that it can resume
execution after the call.
o Also save the return address. It is the address of the location to
which the called routine must transfer after it is finished.
o Finally generate a jump to the beginning of the code for the called
procedure.
o Queue is used to store the list of parameters in the procedure call.
Example: Production rule Semantic action
S → call id (Elist) for each item p on QUEUE do
S →call id (Elist) GEN (param p)
Elist → Elist, E GEN (call id. PLACE)
Elist → E Elist → Elist, E append E. PLACE to the end of QUEUE
Elist → E initialize QUEUE to contain only
E. PLACE
Displays:
• One problem with the access-link approach to nonlocal data is that if the
nesting depth gets large
• We may have to follow long chains of links to reach the data we need. A
more efficient implementation uses an auxiliary array d, called the
display, which consists of one pointer for each nesting depth.
• We arrange that, at all times, d[i] is a pointer to the highest activation
record on the stack for any procedure at nesting depth i.
• Examples of a display are shown in Fig. In order to maintain the display
correctly, we need to save previous values of display entries in new
activation records.
9|P age
Compiler Design Unit-5
• These are as follows:
o Structure-Preserving Transformations
o Algebraic Transformations
[Link] preserving transformations:
The primary Structure-Preserving Transformation on basic blocks is as follows:
a. Common sub-expression elimination
b. Dead code elimination
c. Renaming of temporary variables
d. Interchange of two independent adjacent statements
a. Common sub-expression elimination:
In the common sub-expression, you don't need to be computed it over and over
again. Instead of this you can compute it once and kept in store from where it's
referenced when encountered again.
Example:
a: = b + c a:=b+c
b: = a - d In the expression on LHS, the second and forth b: = a - d
c: = b + c expression computed the same expression. So, the c: = b + c
d: = a - d block can be transformed as follows on RHS d: = b
10 | P a g e
Compiler Design Unit-5
2. Algebraic Transformations:
• In the algebraic transformation, we can change the set of expression into
an algebraically equivalent set.
• Thus, the expression x: = x + 0 or x: = x *1 can be eliminated from a basic
block without changing the set of expression.
• Constant folding is a class of related optimization. Here at compile time,
we evaluate constant expressions and replace the constant expression by
their values.
• Thus, the expression 5*2.7 would be replaced by13.5.
• Sometimes the unexpected common sub expression is generated by the
relational operators like <=, >=, <, >, +, = etc.
• Sometimes associative expression is applied to expose common sub
expression without changing the basic block value. if the source code has
the assignments
Example:
a: = b + c The following a: = b + c
e: = c +d +b intermediate code may t: = c +d
be generated is on RHS: e: = t + b
Example:
x: =x+0 can be removed
x: =y**2 can be replaced by a cheaper statement x: =y*y
11 | P a g e
Compiler Design Unit-5
12 | P a g e
Compiler Design Unit-5
• There are certain strategies adopted by the compiler for register allocation and
assignment.
• The most commonly used strategy to register allocation and assignment is to
assign specific values to specific registers.
• Some of the strategies used in register allocation and assignment are:
[Link] Register Allocation:
• While generating the code the registers are used to hold values for duration
of single block and all the live variables are stored at the end of each block.
• For the variables that are used again and again (consistently), we can allocate
specific set of registers. Hence allocation of variables to specific registers that is
consistent across the block boundaries is called as global register allocation.
• Following are the strategies adopted while doing the global register allocation.
1. The global register allocation has a strategy of storing the most frequently
used variables infixed registers throughout the loop.
2. Another strategy is to assign some fixed number of global registers to hold
the most active values in each inner loop.
3. The registers that are not already allocated may be used to hold values local
to one block.
[Link] Assignment for Outer Loop:
• Consider that there are two loops L1 is outer loop and L2 is an inner loop. And
allocation of variable a is to be done to some register.
• Following strategies should be adopted for register assignment for outer loop.
1. If the variable a is allocated in loop L2 then it should not be allocated in
between L1 and L2.
2. If the variable a is allocated in L1 and it is not allocated in L2 then store an at
the beginning to L2and load the variable a while leaving L2.
3. If the variable a is allocated in L2 and not in L1 then load a at the beginning of
L2 and store a onexit from L2.
[Link] Count:
• The usage count is the count for the use of some variable x in some register
used in any basic block.
• The usage count gives the idea about how many units of cost(time) can be
saved by selecting a specific variable for global register allocation.
• The approximate formula for usage count for the loop L in some basic block B
can be given as (use (x, B) + 2*live (x, B)) Σ Block B in L
• Where use (x, B) specifies the number of times x used in block B, and live (x, B)
=1if x is live on exit from B otherwise live (x, B) =0.
13 | P a g e