Module 4
ARM
Introduction to ARM – ARM family, ARM 7
register architecture. ARM programmer’s
model.
CISC & RISC
• CISC: Complex Instruction Set Computers
Instruction set design moved towards increasing complexity to
reduce the semantic gap that the compiler had to bridge.
A single instruction can perform a complex sequence of
operations that takes several clock cycles to complete.
In a CISC machine, instructions can have variable lengths,
which complicates decoding and increases the processing time.
CISC has a large number of instructions.
• RISC: Reduced Instruction Set Computers
A less complicated set of instructions makes designing a CPU
easier, cheaper and quicker.
Every instruction has a fixed size in memory, which makes
instructions easier to decode and execute.
Has a small number of fixed-length instructions.
Aims to execute most instructions in one clock cycle.
The Acorn RISC Machine
The first ARM processor was developed at Acorn Computers Limited,
of Cambridge, England, between October 1983 and April 1985.
It was the first RISC microprocessor developed for commercial use, and it
differs significantly from subsequent RISC architectures.
Before 1990, ARM stood for Acorn RISC Machine.
Later, ARM came to stand for Advanced RISC Machine.
The RISC concept was introduced in 1980 at Stanford and Berkeley.
ARM Limited (originally Advanced RISC Machines Ltd) was founded in 1990.
The Acorn RISC Machine
The 16-bit CISC microprocessors available in 1983 had certain disadvantages:
They were slower than standard memory parts.
Their instructions took many clock cycles to complete.
They had long interrupt latency.
The Berkeley RISC processor, designed by a few postgraduate students in
under a year, was competitive and inherently simple, and it had no complex
instructions to ruin the interrupt latency.
The ARM, then, was born through a combination of factors, and became the
core component in Acorn's product line. Later, after a judicious modification
of the acronym expansion to Advanced RISC Machine, it lent its name to
the company formed to broaden its market beyond Acorn's product range.
Despite the change of name, the architecture still remains close to the original
Acorn design.
ARM FAMILY
The ARM processor family comprises various cores designed for different
applications
ARM7: An older 32-bit ARM processor core.
ARM9: An incremental improvement over the ARM8, featuring a Harvard
architecture.
ARM11: Introduces 32-bit SIMD for media processing, TrustZone for
hardware-enforced security, and tightly coupled memories.
ARM7TDMI-S: the suffix letters stand for
• T: Thumb, the 16-bit instruction set
• D: on-chip Debug support, enabling the processor to halt in
response to a debug request
• M: enhanced Multiplier, a higher-performance multiplier that yields a full 64-bit result
• I: EmbeddedICE macrocell
• S: synthesizable, i.e. distributed as RTL
Other suffixes used in the family:
• E: Enhanced DSP instruction set support
• J: Java bytecode support
• F: Hardware floating-point support
The Cortex-A series for high-performance computing,
The Cortex-M series for low-power microcontrollers, and
The Cortex-R series for real-time applications.
ARM-Architectural inheritance
The ARM chip was designed based on Berkeley RISC I and II and the
Stanford MIPS (Microprocessor without Interlocked Pipeline Stages).
Features Used from Berkeley RISC design
a load-store architecture
fixed length 32-bit instructions
3-address instruction formats
Features Rejected
Register windows
Delayed Branches
Single Cycle execution of all instructions
• Based upon RISC architecture, with enhancements to meet the requirements
of embedded applications
A large uniform register file
Load-store architecture
Uniform and fixed length instructions
32-bit processor
Instructions are 32-bit long
Good speed/power consumption ratio
High Code Density
ARM7 INSTRUCTION FORMAT: EXAMPLES
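The figure for this slide is not available, so the following is only a minimal sketch of one ARM instruction format: the 32-bit data-processing encoding with a plain register as second operand (no shift, no immediate). The helper names encode_dp_reg and decode_dp_reg are illustrative assumptions, and only four opcodes are listed.

```python
# Sketch: register-operand data-processing format only
# (no shifted operands, no immediates). Helper names are illustrative.
COND_AL = 0b1110  # "always" condition code
OPCODES = {"AND": 0b0000, "SUB": 0b0010, "ADD": 0b0100, "ORR": 0b1100}


def encode_dp_reg(op, rd, rn, rm, set_flags=False, cond=COND_AL):
    """Encode '<op> rd, rn, rm' as a 32-bit ARM instruction word."""
    for r in (rd, rn, rm):
        if not 0 <= r <= 15:
            raise ValueError("register numbers must be 0..15")
    word = cond << 28              # bits 31-28: condition field
    word |= OPCODES[op] << 21      # bits 24-21: opcode (bits 27-25 stay 000)
    word |= int(set_flags) << 20   # bit 20: S, update the CPSR flags
    word |= rn << 16               # bits 19-16: first operand register
    word |= rd << 12               # bits 15-12: destination register
    word |= rm                     # bits 3-0: second operand register
    return word


def decode_dp_reg(word):
    """Split a word produced by encode_dp_reg back into its fields."""
    names = {code: name for name, code in OPCODES.items()}
    return {
        "cond": (word >> 28) & 0xF,
        "op": names[(word >> 21) & 0xF],
        "S": (word >> 20) & 1,
        "rn": (word >> 16) & 0xF,
        "rd": (word >> 12) & 0xF,
        "rm": word & 0xF,
    }


if __name__ == "__main__":
    print(hex(encode_dp_reg("ADD", 0, 1, 2)))  # ADD r0, r1, r2 -> 0xe0810002
```

Every field sits at a fixed position in a fixed 32-bit word, which is what makes the instructions easy to decode.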
ARM ORGANIZATION
Overview: Core data path
Data items are placed in the register file.
No data processing instructions directly manipulate data in
memory.
Instructions typically use two source registers and a single result
(destination) register.
A barrel shifter on the data path can preprocess data before it
enters the ALU.
Increment/decrement logic can update register contents for
sequential access independently of the ALU.
ARM architecture
• The principal components are
• The register bank, which stores the processor state. It has two read ports and
one write port which can each be used to access any register, plus an
additional read port and an additional write port that give special access to
r15, the program counter.
• The barrel shifter, which can shift or rotate one operand by any number of
bits.
• The ALU, which performs the arithmetic and logic functions required by the
instruction set.
• The address register and incrementer, which select and hold all memory
addresses and generate sequential addresses when required.
• The data registers, which hold data passing to and from memory.
• The instruction decoder and associated control logic.
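As a rough illustration of the barrel shifter sitting in front of the ALU, the sketch below models the four basic 32-bit shift and rotate operations and one shifted-operand addition. The function names are assumptions made for this example, and shift amounts are assumed to lie in the range 0 to 31.

```python
MASK32 = 0xFFFFFFFF  # keep every value within 32 bits


def lsl(value, n):
    """Logical shift left by n bits (0..31)."""
    return (value << n) & MASK32


def lsr(value, n):
    """Logical shift right by n bits (0..31), filling with zeros."""
    return (value & MASK32) >> n


def asr(value, n):
    """Arithmetic shift right by n bits (0..31), copying the sign bit."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32  # reinterpret as a negative Python int
    return (value >> n) & MASK32


def ror(value, n):
    """Rotate right by n bits."""
    n %= 32
    value &= MASK32
    return ((value >> n) | (value << (32 - n))) & MASK32


def add_with_lsl(rn, rm, shift):
    """Model of 'ADD rd, rn, rm, LSL #shift': shift first, then add in the ALU."""
    return (rn + lsl(rm, shift)) & MASK32
```

For example, add_with_lsl(100, 3, 2) computes 100 + 3*4 in what would be a single instruction on the ARM, because the shift happens on the way into the ALU.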
The ARM programmer's model
When used in relation to the ARM:
Halfword means 16 bits (two bytes)
Word means 32 bits (four bytes)
Doubleword means 64 bits (eight bytes)
Most ARMs implement two instruction sets
32-bit ARM Instruction Set
16-bit Thumb Instruction Set
Operating Modes
ARM has seven operating modes.
User mode is the simplest mode, with the fewest privileges, and is the
mode under which most application programs run.
System mode is a highly privileged mode. It is used by operating
systems to manipulate and control the activities of the processor.
The other five modes are exception modes: each is entered when the
corresponding exception or interrupt occurs.
List of the operating modes of ARM.
i) User: Unprivileged mode under which most tasks run
ii) FIQ (Fast Interrupt Request): Entered on a high priority (fast) interrupt request
iii) IRQ (Interrupt Request): Entered on a low priority interrupt request
iv) Supervisor: Entered on reset and when a software interrupt instruction (SWI) is
executed
v) Abort: Used to handle memory access violations
vi) Undefined: Used to handle undefined instructions
vii) System: Privileged mode using the same registers as user mode
The ARM Processor Modes of operation
The ARM architecture supports seven operating modes: 1 user mode
and 6 privileged modes.
Register Set
ARM has 37 registers each of which is 32 bits long. They are
listed as follows:
i) 1 dedicated program counter (PC)
ii) 1 dedicated current program status register (CPSR)
iii) 5 dedicated saved program status registers (SPSR)
iv) 30 general purpose registers
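The count of 37 can be checked with a small sketch of register banking. It assumes the standard ARM7 banking scheme (r8 to r12 banked only for FIQ, r13 and r14 banked for each exception mode, System mode sharing the User registers); the naming scheme such as r13_irq is used here for illustration.

```python
MODES = ["usr", "fiq", "irq", "svc", "abt", "und", "sys"]


def physical_register(mode, n):
    """Name the physical register that rn refers to in the given mode."""
    if n == 15:
        return "r15"                       # one program counter for all modes
    bank = "usr" if mode == "sys" else mode  # System shares the User registers
    if n <= 7:
        return f"r{n}"                     # r0-r7 are never banked
    if n <= 12:
        return f"r{n}_fiq" if bank == "fiq" else f"r{n}"
    return f"r{n}" if bank == "usr" else f"r{n}_{bank}"  # r13 and r14


def status_registers():
    """CPSR plus one SPSR for each exception mode."""
    return ["CPSR"] + [f"SPSR_{m}" for m in MODES if m not in ("usr", "sys")]


general_purpose = {physical_register(m, n) for m in MODES for n in range(15)}
every_register = general_purpose | {"r15"} | set(status_registers())

if __name__ == "__main__":
    print(len(general_purpose), len(status_registers()), len(every_register))
    # prints: 30 6 37
```

The 30 general-purpose registers break down as 8 unbanked (r0 to r7), 10 for r8 to r12 (User and FIQ copies), and 12 for r13 and r14 (User plus five exception modes).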
ARM Registers
General-purpose registers hold either data or addresses.
All registers are 32 bits wide.
There are 37 registers in total.
At any time, 16 data registers and up to 2 status registers are visible
(User mode has no SPSR, so only the CPSR is accessible there).
Data registers: r0 to r15
Three registers, r13, r14 and r15, perform special functions:
-r13: stack pointer
-r14: link register (where the return address is put whenever a
subroutine is called)
-r15: program counter
• Depending upon context, registers r13 and r14 can also be used as GPRs.
• Any instruction which uses r0 can equally be used with any other GPR (r1-r13).
• In addition, there are two status registers:
-CPSR: Current Program Status Register
-SPSR: Saved Program Status Register (available only in exception modes)
ARM's visible registers.
Current Program Status Register (CPSR)
• The CPSR is used in user-level programs to store the condition code
bits.
• It also monitors and controls internal operations.
N: Negative result from the ALU
Z: Zero result from the ALU
C: ALU operation produced a Carry out
V: ALU operation oVerflowed
The condition code flags are in the top four bits of the register and have the
following meanings:
• N: Negative; the last ALU operation which changed the flags produced a
negative result (the top bit of the 32-bit result was a one).
• Z: Zero; the last ALU operation which changed the flags produced a zero
result (every bit of the 32-bit result was zero).
• C: Carry; the last ALU operation which changed the flags generated a carry-
out, either as a result of an arithmetic operation in the ALU or from the
shifter.
• V: Overflow; the last arithmetic ALU operation which changed the flags
generated an overflow into the sign bit.
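The flag definitions above can be sketched for a 32-bit addition as follows. This is a simplified model, assuming an ADDS-style operation; the function names adds and pack_flags are illustrative, and the bit positions 31 to 28 follow from the flags occupying the top four bits of the CPSR.

```python
MASK32 = 0xFFFFFFFF


def adds(a, b):
    """Add two 32-bit values and return (result, flags) as ADDS would set them."""
    a &= MASK32
    b &= MASK32
    full = a + b                      # unbounded Python sum
    result = full & MASK32            # what fits in the 32-bit register
    flags = {
        "N": result >> 31,            # top bit of the result
        "Z": int(result == 0),        # every bit of the result is zero
        "C": int(full > MASK32),      # carry out of bit 31
        # Overflow: both operands have the same sign, the result has the other.
        "V": (((a ^ result) & (b ^ result)) >> 31) & 1,
    }
    return result, flags


def pack_flags(flags):
    """Place N, Z, C, V in bits 31, 30, 29, 28 of a CPSR-style word."""
    return (flags["N"] << 31) | (flags["Z"] << 30) | (flags["C"] << 29) | (flags["V"] << 28)
```

Adding 1 to 0x7FFFFFFF sets N and V (signed overflow into the sign bit), while adding 1 to 0xFFFFFFFF sets Z and C (the result wraps to zero with a carry out).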
ARM Pipelining
Pipelining is the mechanism used by RISC (Reduced Instruction Set
Computer) processors to speed up execution: the processor fetches one
instruction while earlier instructions are being decoded and executed.
Pipelining is a design technique that increases the efficiency of
instruction processing in a computer or microcontroller by keeping the
processor continuously fetching, decoding and executing (the
fetch-and-execute, or F&E, cycle).
ARM7: 3-stage pipelining
It has a 3-stage pipeline, as shown in the figure.
Each instruction takes 3 cycles to pass through the pipeline.
It implements the basic F&E cycle, giving good throughput for a simple design.
Later family members use deeper pipelines, so the ARM7 has the lowest
throughput of the family.
It processes 32-bit data.
Fetch loads an instruction from memory.
Decode identifies the instruction to be executed.
Execute processes the instruction and writes the result back to the
register.
By overlapping the above stages for different
instructions, the speed of execution is increased.
Once the pipeline is full, the core can complete an instruction every cycle,
which results in increased throughput.
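The overlap can be sketched as a cycle-by-cycle table. This is an idealised model, assuming every instruction spends exactly one cycle in each stage and that there are no branches or stalls; the function name pipeline_schedule is an assumption for this example.

```python
STAGES = ("fetch", "decode", "execute")


def pipeline_schedule(instructions):
    """Return one dict per clock cycle mapping each stage to its instruction."""
    n = len(instructions)
    total_cycles = n + len(STAGES) - 1 if n else 0
    schedule = []
    for cycle in range(total_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            index = cycle - depth  # instruction occupying this stage now
            row[stage] = instructions[index] if 0 <= index < n else None
        schedule.append(row)
    return schedule


if __name__ == "__main__":
    for cycle, row in enumerate(pipeline_schedule(["ADD", "SUB", "CMP", "MOV"]), 1):
        print(cycle, row)
```

Four instructions finish in 4 + 2 = 6 cycles instead of 12: the first takes three cycles to pass through, and after that one instruction completes per cycle.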
Load-store architecture
• The instruction set will only process (add, subtract, and so on) values
which are in registers (or specified directly within the instruction
itself), and will always place the results of such processing into a
register.
• The only operations which apply to memory state are ones which copy
memory values into registers (load instructions) or copy register
values into memory (store instructions).
• ARM does not support memory-to-memory operations.
Therefore all ARM instructions fall into three categories:
1) Data Processing Instructions
2) Data Transfer Instructions
3) Control Flow Instructions
Data processing instructions: These use and change only register values.
For example, an instruction can add two registers and place the result in a
register.
Data transfer instructions: These copy memory values into registers
(load instructions) or copy register values into memory (store instructions).
An additional form, useful only in systems code, exchanges a memory
value with a register value.
Control flow instructions: Normal instruction execution uses instructions
stored at consecutive memory addresses.
Control flow instructions cause execution to switch to a different
address, either permanently (branch instructions), or saving a return
address to resume the original sequence (branch and link instructions),
or trapping into system code (supervisor calls).
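A toy interpreter can make the three categories concrete. This is a sketch only, not real ARM semantics: instructions are tuples, the program counter is a list index rather than a byte address, and the names run and sum_in_memory are assumptions for this example.

```python
def run(program, registers, memory):
    """Execute a list of (op, *args) tuples on a toy load-store machine."""
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "ADD":                    # data processing: registers only
            rd, rn, rm = args
            registers[rd] = (registers[rn] + registers[rm]) & 0xFFFFFFFF
        elif op == "LDR":                  # data transfer: memory -> register
            rd, rn = args
            registers[rd] = memory[registers[rn]]
        elif op == "STR":                  # data transfer: register -> memory
            rd, rn = args
            memory[registers[rn]] = registers[rd]
        elif op == "B":                    # control flow: jump to a new index
            pc = args[0]
        else:
            raise ValueError(f"unknown operation {op!r}")
    return registers, memory


def sum_in_memory(a, b):
    """Compute mem[0x108] = mem[0x100] + mem[0x104] the load-store way."""
    memory = {0x100: a, 0x104: b, 0x108: 0}
    registers = {"r0": 0, "r1": 0x100, "r2": 0x104, "r3": 0x108, "r4": 0, "r5": 0}
    program = [
        ("LDR", "r4", "r1"),        # r4 = mem[r1]
        ("LDR", "r5", "r2"),        # r5 = mem[r2]
        ("B", 4),                   # skip the next instruction
        ("ADD", "r4", "r4", "r4"),  # never executed
        ("ADD", "r0", "r4", "r5"),  # r0 = r4 + r5
        ("STR", "r0", "r3"),        # mem[r3] = r0
    ]
    run(program, registers, memory)
    return memory[0x108]
```

Adding two values held in memory takes two loads, one add and one store, because the add itself can only see registers.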
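Exception handler
Exceptions:
– Interrupts
– Supervisor calls
– Traps
• When an exception takes place:
– The value of the PC is copied to r14_exc, the link register of the
exception mode, so that execution can resume afterwards.
– The CPSR is saved in SPSR_exc.
– The operating mode changes to the respective
exception mode.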
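The entry sequence can be sketched as below. The mode bit patterns and vector addresses are the standard ARM7 values; saving the CPSR in SPSR_exc and masking interrupts are also standard ARM7 behaviour included for completeness. The exact return address depends on the exception type, so it is passed in as a parameter here, and the function name enter_exception is an assumption.

```python
MODE_BITS = {"usr": 0b10000, "fiq": 0b10001, "irq": 0b10010, "svc": 0b10011,
             "abt": 0b10111, "und": 0b11011, "sys": 0b11111}

# exception name -> (mode entered, vector address)
EXCEPTIONS = {
    "reset": ("svc", 0x00),
    "undefined": ("und", 0x04),
    "swi": ("svc", 0x08),
    "prefetch_abort": ("abt", 0x0C),
    "data_abort": ("abt", 0x10),
    "irq": ("irq", 0x18),
    "fiq": ("fiq", 0x1C),
}

I_BIT, F_BIT, T_BIT = 1 << 7, 1 << 6, 1 << 5


def enter_exception(state, kind, return_address):
    """Update a dict-based processor state as ARM7 exception entry would."""
    mode, vector = EXCEPTIONS[kind]
    state[f"r14_{mode}"] = return_address      # PC-derived value into r14_exc
    state[f"SPSR_{mode}"] = state["CPSR"]      # save the old status register
    cpsr = (state["CPSR"] & ~0x1F) | MODE_BITS[mode]  # switch operating mode
    cpsr |= I_BIT                              # mask normal interrupts
    if kind in ("reset", "fiq"):
        cpsr |= F_BIT                          # also mask fast interrupts
    cpsr &= ~T_BIT                             # handlers start in ARM state
    state["CPSR"] = cpsr
    state["pc"] = vector                       # jump to the exception vector
    return state
```

For instance, an IRQ taken in User mode with CPSR 0x60000010 leaves that value in SPSR_irq, changes the CPSR to 0x60000092 and sets the PC to 0x18.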
The I/O System
• ARM handles peripherals as “memory mapped devices with
interrupt support”.
• Interrupts:
IRQ: normal interrupt
FIQ: fast interrupt
Both are level-sensitive and maskable.
• Normally most interrupt sources share the IRQ input.
• Some systems may include DMA hardware external to the processor
to handle high-bandwidth I/O traffic.
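Memory-mapped I/O means that a peripheral is accessed with the same load and store operations as ordinary memory. The sketch below is a toy model: the address 0x10000000, the UartTx device and the Bus class are all hypothetical and do not describe any real ARM system.

```python
MASK32 = 0xFFFFFFFF
UART_TX = 0x10000000  # hypothetical address of a device register


class UartTx:
    """Hypothetical transmit register: every store sends one character."""

    def __init__(self):
        self.sent = []

    def write(self, value):
        self.sent.append(chr(value & 0xFF))

    def read(self):
        return 1  # always reports "ready" in this toy model


class Bus:
    """Routes loads and stores either to RAM or to a mapped device."""

    def __init__(self):
        self.ram = {}
        self.devices = {}

    def map_device(self, address, device):
        self.devices[address] = device

    def store(self, address, value):
        if address in self.devices:
            self.devices[address].write(value)   # store reaches the peripheral
        else:
            self.ram[address] = value & MASK32   # ordinary memory write

    def load(self, address):
        if address in self.devices:
            return self.devices[address].read()
        return self.ram.get(address, 0)


def send_text(bus, text):
    """Send text by repeatedly storing to the device register."""
    for ch in text:
        bus.store(UART_TX, ord(ch))
```

The program needs no special I/O instructions: a store to UART_TX looks exactly like a store to RAM, and only the address decides where it goes.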
ARM development tools
• The tools are intended for cross-development (that is, they run
on a different architecture from the one for which they produce
code) from a platform such as a PC running Windows or a
suitable UNIX workstation.
ARM C Compiler
• It can be told to produce assembly source output instead of ARM object
format, so the code can be inspected, or even hand optimized, and then
assembled subsequently.
The compiler can also produce Thumb code.
ARM Assembler
• The ARM assembler is a full macro assembler which produces ARM object
format output that can be linked with output from the C compiler
Linker
• The linker takes one or more object files and combines them into an
executable program.
• It resolves symbolic references between the object files and extracts object
modules from libraries as needed by the program.
• It can assemble the various components of the program in a number of
different ways, depending on whether the code is to run in RAM or ROM,
whether overlays are required, and so on.
ARMsd
• The ARM symbolic debugger is a front-end interface to assist in debugging
programs running either under emulation (on the ARMulator) or remotely on a
target system such as the ARM development board.
• ARMsd allows an executable program to be loaded into the ARMulator or a
development board and run. It allows the setting of breakpoints, which are
addresses in the code that, if executed, cause execution to halt so that the
processor state can be examined.
• In the ARMulator, or when running on hardware with appropriate support, it
also allows the setting of watchpoints, which halt execution when a specified
data address is accessed.
• ARMsd supports full source level debugging, allowing the C programmer to
debug a program using the source file to specify breakpoints and using variable
names from the original program.
ARMulator
• The ARMulator (ARM emulator) is a suite of programs that models
the behaviour of various ARM processor cores in software on a host
system.
• It can operate at various levels of accuracy:
Instruction-accurate modelling gives the exact behaviour of the
system state without regard to the precise timing characteristics of
the processor.
Cycle-accurate modelling gives the exact behaviour of the
processor on a cycle-by-cycle basis, allowing the exact number of
clock cycles that a program requires to be established.
Timing-accurate modelling presents signals at the correct time
within a cycle, allowing logic delays to be accounted for.