
Overview of ARM Microcontroller Architecture

The document provides an overview of ARM microcontrollers, detailing the company's history, architecture, and various processor families. ARM, founded in 1990, designs RISC processor cores and licenses them to semiconductor partners, emphasizing power efficiency and performance in embedded applications. Key ARM architectures include the Cortex series, which cater to different application needs such as real-time processing, microcontrollers, and high-performance computing.

Uploaded by

doandc

20-Mar-22

Chapter 4

ARM MICROCONTROLLER

Introduction to ARM Ltd

• Founded in November 1990
  • Spun out of Acorn Computers
  • Initial funding from Apple, Acorn and VLSI
• Designs the ARM range of RISC processor cores
• Licenses ARM core designs to semiconductor partners, who fabricate and sell to their customers
  • ARM does not fabricate silicon itself
• Also develops technologies to assist with the design-in of the ARM architecture
  • Software tools, boards, debug hardware
  • Application software
  • Bus architectures
  • Peripherals, etc.

ARM = Advanced RISC Machine

• ARM (Advanced RISC Machine)
  • Is the industry's leading provider of 32-bit embedded microprocessors
  • Offers a wide range of processors that deliver high performance, industry-leading power efficiency and reduced system cost

ARM Partnership Model

ARM designs and licenses its core designs but does not fabricate them

ARM Processor Applications

Why ARM?

• One of the most licensed and thus most widespread processor cores in the world
  • Used in PDAs, cell phones, multimedia players, handheld game consoles, digital TVs and cameras
  • ARM7: GBA, iPod
  • ARM9: NDS, PSP, Sony Ericsson, BenQ
  • ARM11: Apple iPhone, Nokia N93, N800
  • 90% of 32-bit embedded RISC processors as of 2009
• Used especially in portable devices due to its low power consumption and reasonable performance

ARM processors

• A simple but powerful design
• A whole family of designs sharing similar design principles and a common instruction set

Naming ARM

• ARMxyzTDMIEJFS
  • x: series
  • y: MMU
  • z: cache
  • T: Thumb
  • D: debugger
  • M: multiplier
  • I: EmbeddedICE (built-in debugger hardware)
  • E: enhanced instructions
  • J: Jazelle (JVM)
  • F: floating-point
  • S: synthesizable version (source code version for EDA tools)

Definition of “Architecture”

• The architecture is the contract between the hardware and the software
  • Confers rights and responsibilities on both the hardware and the software
  • MUCH more than just the instruction set
• The architecture distinguishes between:
  • Architected behaviors:
    • Must be obeyed
    • May be just the limits of behavior rather than specific behaviors
  • Implementation-specific behaviors, which expose the micro-architecture
    • Certain areas are declared implementation specific, e.g.:
      • Power-down
      • Cache and TLB lockdown
      • Details of the performance monitors
• Code obeying the architected behaviors is portable across implementations
  • Reliance on implementation-specific behaviors gives no such guarantee
• Architecture is different from micro-architecture
  • What vs. how

History

• ARM has quite a lot of history
  • First ARM core (ARM1) ran code in April 1985
    • A very simple RISC-style processor with a 3-stage pipeline
  • Original processor was designed for the Acorn microcomputer
    • Replacing a 6502-based design
• ARM Ltd formed in 1990 as an “Intellectual Property” company
  • Taking the 3-stage pipeline as the main building block
• This 3-stage pipeline evolved into the ARM7TDMI
  • Still the mainstay of ARM’s volume
• Code compatibility with ARM7TDMI remains very important
  • Especially at the applications level

Evolution of the ARM Architecture

• Original ARM architecture:
  • 32-bit RISC architecture focused on the core instruction set
  • 16 registers, one being the program counter, generally accessible
  • Conditional execution on all instructions
  • Load/store multiple operations: good for code density
  • Shifts available on data processing and address generation
• Original architecture had a 26-bit address space
  • Augmented by a 32-bit address space early in the evolution

• Thumb instruction set was the next big step
  • ARMv4T architecture (ARM7TDMI)
  • Introduced a 16-bit instruction set alongside the 32-bit instruction set
  • Different execution states for different instruction sets
    • Switching ISA as part of a branch or exception
    • Not a full instruction set: ARM still essential
• ARMv4 architecture was still focused on the core instruction set only

Versions, cores and architectures?

ARM Instruction Sets

Popular ARM architectures

• ARM7TDMI
  • 3 pipeline stages (fetch/decode/execute)
  • High code density/low power consumption
  • One of the most used ARM versions (for low-end systems)
  • All ARM cores after ARM7TDMI include the TDMI features even if they do not include TDMI in their labels
• ARM9TDMI
  • Compatible with ARM7
  • 5 stages (fetch/decode/execute/memory/write)
  • Separate instruction and data cache
• ARM11

ARM family comparison

Processor family   Pipeline stages   Memory organization   Clock rate   MIPS/MHz
ARM6               3                 Von Neumann           25 MHz
ARM7               3                 Von Neumann           66 MHz       0.9
ARM8               5                 Von Neumann           72 MHz       1.2
ARM9               5                 Harvard               200 MHz      1.1
ARM10              6                 Harvard               400 MHz      1.25
StrongARM          5                 Harvard               233 MHz      1.15
ARM11              8                 Von Neumann/Harvard   550 MHz      1.2
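The two rightmost columns of the comparison can be combined into a rough throughput figure (clock rate times MIPS/MHz). The sketch below only multiplies the values from the table; the names FAMILIES and estimated_mips are made up for this illustration, and the results are estimates, not measured benchmarks.

```python
# (clock rate in MHz, MIPS per MHz) copied from the comparison table.
# ARM6 is left out because the table gives no MIPS/MHz figure for it.
FAMILIES = {
    "ARM7": (66, 0.9),
    "ARM8": (72, 1.2),
    "ARM9": (200, 1.1),
    "ARM10": (400, 1.25),
    "StrongARM": (233, 1.15),
    "ARM11": (550, 1.2),
}


def estimated_mips(clock_mhz, mips_per_mhz):
    """Rough throughput estimate: clock rate times MIPS per MHz."""
    return clock_mhz * mips_per_mhz


if __name__ == "__main__":
    for name, (clock, ratio) in FAMILIES.items():
        print(f"{name:10s} ~{estimated_mips(clock, ratio):6.1f} MIPS")
```

For example, ARM9 at 200 MHz and 1.1 MIPS/MHz comes out at roughly 220 MIPS.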

ARM is a RISC

• RISC: simple but powerful instructions that execute within a single cycle at high clock speed
• Four major design rules:
  • Instructions: reduced set/single cycle/fixed length
  • Pipeline: decode in one stage/no need for microcode
  • Registers: a large set of general-purpose registers
  • Load/store architecture: data processing instructions apply to registers only; load/store instructions transfer data to and from memory
• Results in simple design and fast clock rate
• The distinction blurs because CISC implementations adopt RISC concepts

ARM features

• Different from pure RISC in several ways:
  • Variable cycle execution for certain instructions: multiple-register load/store (faster/higher code density)
  • Inline barrel shifter leading to more complex instructions: improves performance and code density
  • Thumb 16-bit instruction set: 30% code density improvement
  • Conditional execution: improves performance and code density by reducing branches
  • Enhanced instructions: DSP instructions

ARM architecture

• 32-bit RISC processor core (32-bit instructions)
• 37 32-bit integer registers (16 available at any one time)
• Pipelined (ARM7: 3 stages)
• Cached (depending on the implementation)
• Von Neumann-type bus structure (ARM7), Harvard (ARM9)
• 8/16/32-bit data types
• 7 modes of operation (usr, fiq, irq, svc, abt, sys, und)
• Simple structure -> reasonably good speed/power consumption ratio

What is ARM Architecture

• ARM architecture is a family of RISC-based processor architectures
  • Well known for its power efficiency
  • Hence widely used in mobile devices, such as smartphones and tablets
  • Designed and licensed to a wide ecosystem by ARM
• ARM Holdings
  • The company designs ARM-based processors
  • Does not manufacture, but licenses designs to semiconductor partners, who add their own intellectual property (IP) on top of ARM’s IP, then fabricate and sell to customers
  • Also offers other IP apart from processors, such as physical IP, interconnect IP, graphics cores, and development tools

ARM7 Architecture

• Load/store architecture
  • Most instructions are RISC-like; some multi-register operations take multiple cycles
  • All instructions can be executed conditionally
• ARM7 is a small, low-power, 32-bit microprocessor with a three-stage pipeline; each stage takes one clock cycle
  • Instruction fetch from memory
  • Instruction decode
  • Instruction execution
    • Register read
    • A shift applied to one operand, then the ALU operation
    • Register write
• Doing all of the execute steps in one cycle limits the maximum CPU clock speed to around 80 MHz on a 0.35-micron silicon process
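The three-stage pipeline can be pictured with a small timing model. This is a minimal sketch that assumes the ideal case only: every instruction spends exactly one cycle in each stage, with no stalls for loads, stores or branches. The names STAGES, pipeline_cycles and schedule are invented for the illustration.

```python
STAGES = ("fetch", "decode", "execute")


def pipeline_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles needed to finish n instructions in an ideal pipeline.

    The first instruction needs n_stages cycles; after that one
    instruction completes per cycle.
    """
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


def schedule(n_instructions):
    """Return {cycle: {stage: instruction index}} for the ideal case."""
    table = {}
    for i in range(n_instructions):
        for offset, stage in enumerate(STAGES):
            # Instruction i enters fetch in cycle i + 1 (cycles count from 1).
            table.setdefault(i + offset + 1, {})[stage] = i
    return table


if __name__ == "__main__":
    for cycle, stages in sorted(schedule(4).items()):
        print(cycle, stages)
```

In cycle 3 of this model, instruction 0 is executing while instruction 1 is decoded and instruction 2 is fetched, which is the overlap the pipeline is built to achieve.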

ARM CPU Core Organization

ARM7 Features

• ARM7 uses a von Neumann memory architecture, where instructions and data occupy a single address space, which can limit performance
  • Instruction fetching (and execution) must stop for instructions that access memory
  • The reduced cost of a single memory outweighs the performance loss in many embedded applications
• The pipeline stalls during load and store operations; outside those stalls the ARM7 continues useful work

ARM Architectures

Example: ARM926EJ-S

• 5-stage pipeline, single-issue core
  • Fetch, Decode, Execute, Memory, Writeback
  • Most common instructions take 1 cycle in each pipeline stage
• Split instruction/data Level 1 caches, virtually tagged
• MMU: hardware page table walk based

ARM1176JZF-S

• 8-stage pipeline, single issue
• Split instruction/data Level 1 caches, physically tagged
  • Two-cycle memory latency
• MMU: hardware page table walk based
• Hardware branch prediction

ARM Processor Families

• Cortex-A series (Application)
  • High-performance processors capable of full operating system (OS) support
  • Applications include smartphones, digital TV, smart books, home gateways, etc.
• Cortex-R series (Real-time)
  • High performance for real-time applications
  • High reliability
  • Applications include automotive braking systems, powertrains, etc.
• Cortex-M series (Microcontroller)
  • Cost-sensitive solutions for deterministic microcontroller applications
  • Applications include microcontrollers, mixed-signal devices, smart sensors, automotive body electronics and airbags
• SecurCore series
  • High-security applications
• Previous classic processors: include the ARM7, ARM9 and ARM11 families

ARM Families and Architecture Over Time

ARM Cortex-M Series

• Cortex-M series: Cortex-M0, M0+, M1, M3, M4, M7, M33
• Energy efficiency
  • Lower energy cost, longer battery life
• Smaller code
  • Lower silicon costs
• Ease of use
  • Faster software development and reuse
• Embedded applications
  • Smart metering, human interface devices, automotive and industrial control systems, white goods, consumer products and medical instrumentation

ARM Processors vs. ARM Architectures

• ARM architecture
  • Describes the details of the instruction set, programmer’s model, exception model, and memory map
  • Documented in the Architecture Reference Manual
• ARM processor
  • Developed using one of the ARM architectures
  • Adds implementation details, such as timing information
  • Documented in the processor’s Technical Reference Manual

Cortex-M4 Block Diagram

ARM Cortex-M4

• A Cortex-M series CPU that combines efficient signal processing with low power

Cortex-M4 Block Diagram (cont.)

• Processor core
  • Contains internal registers, the ALU, data path, and some control logic
  • Registers include sixteen 32-bit registers for both general and special usage
• Processor pipeline stages
  • Three-stage pipeline: fetch, decode, and execute
  • Some instructions may take multiple cycles to execute, in which case the pipeline will be stalled
  • The pipeline will be flushed if a branch instruction is executed
  • Up to two instructions can be fetched in one transfer (16-bit instructions)

• Nested Vectored Interrupt Controller (NVIC)
  • Up to 240 interrupt request signals and a non-maskable interrupt (NMI)
  • Automatically handles nested interrupts, e.g. by comparing priorities between interrupt requests and the current priority level
• Wakeup Interrupt Controller (WIC)
  • For low-power applications, the microcontroller can enter sleep mode by shutting down most of the components
  • When an interrupt request is detected, the WIC can inform the power management unit to power up the system
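The priority comparison that the NVIC performs for nested interrupts can be sketched in a few lines. The sketch assumes the Cortex-M convention that a numerically lower priority value means a more urgent interrupt, and that ties between pending requests go to the lower interrupt number. The function names and the dictionary of pending requests are hypothetical; the real comparison is done in hardware.

```python
def preempts(pending_priority, current_priority):
    """True if a pending request may interrupt the running handler.

    Assumes the Cortex-M convention: a lower number is a higher priority.
    A request of equal priority does not preempt.
    """
    return pending_priority < current_priority


def next_to_service(pending, current_priority):
    """Pick the interrupt to take next, or None if nothing may preempt.

    pending maps interrupt number -> priority value.
    """
    candidates = [
        (priority, irq)
        for irq, priority in pending.items()
        if preempts(priority, current_priority)
    ]
    # min() orders by priority value first, then by interrupt number.
    return min(candidates)[1] if candidates else None


if __name__ == "__main__":
    pending = {5: 3, 2: 3, 9: 6}
    print(next_to_service(pending, current_priority=4))
```

With a handler running at priority 4, interrupts 5 and 2 (both priority 3) may preempt and interrupt 2 wins the tie, while interrupt 9 (priority 6) has to wait.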

• Memory Protection Unit (optional)
  • Used to protect memory content, e.g. making some memory regions read-only or preventing user applications from accessing privileged application data
• Bus interconnect
  • Allows data transfers to take place on different buses simultaneously
  • Provides data transfer management, e.g. a write buffer, bit-oriented operations (bit-band)
  • May include bus bridges (e.g. AHB-to-APB bus bridge) to connect different buses into a network using a single global memory space

Cortex-M4 Processor Overview

• Cortex-M4 processor
  • Introduced in 2010
  • Designed with a large variety of highly efficient signal processing features
  • Features extended single-cycle multiply-accumulate instructions, optimized SIMD arithmetic, saturating arithmetic and an optional floating-point unit
• High performance efficiency
• Low power consumption
  • Longer battery life, especially critical in mobile products

Cortex-M4 Processor Features

• 32-bit Reduced Instruction Set Computing (RISC) processor
• Harvard architecture
  • Separate data bus and instruction bus
• Instruction set
  • Includes the entire Thumb®-1 (16-bit) and Thumb®-2 (16/32-bit) instruction sets
• Supported interrupts
  • Non-maskable interrupt (NMI) + 1 to 240 physical interrupts
  • 8 to 256 interrupt priority levels

Cortex-M4 Registers

• R0 – R12: general-purpose registers
  • Low registers (R0 – R7) can be accessed by any instruction
  • High registers (R8 – R12) cannot be accessed by some instructions, e.g. some 16-bit Thumb instructions
• R13: Stack Pointer (SP)
  • Holds the address of the current top of the stack
  • Used for saving the context of a program while switching between tasks
  • Cortex-M4 has two SPs: the Main SP, used in applications that require privileged access (e.g. the OS kernel) and in exception handlers, and the Process SP, used in base-level application code (when not running an exception handler)

• R15: Program Counter (PC)
  • Holds the address of the current instruction
  • Automatically incremented by 4 after each instruction (for 32-bit instructions), except for branch operations
  • A branch operation, such as a function call, changes the PC to a specific address; a function call also saves the return address in the Link Register (LR)

• R14: Link Register (LR)
  • The LR stores the return address of a subroutine or function call
  • The program counter (PC) loads the value from the LR after a function finishes

• xPSR, combined Program Status Register
  • Provides information about program execution and the ALU flags
  • Application PSR (APSR)
  • Interrupt PSR (IPSR)
  • Execution PSR (EPSR)

• APSR
  • N: negative flag – set to one if the result from the ALU is negative
  • Z: zero flag – set to one if the result from the ALU is zero
  • C: carry flag – set to one if an unsigned overflow occurs
  • V: overflow flag – set to one if a signed overflow occurs
  • Q: sticky saturation flag – set to one if saturation has occurred in saturating arithmetic instructions, or overflow has occurred in certain multiply instructions
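How the N, Z, C and V flags follow from a 32-bit addition can be worked through in software. The sketch below is a plain model of an ADDS-style operation, not the hardware itself; the function name add_with_flags and the returned dictionary are assumptions made for the illustration.

```python
MASK32 = 0xFFFFFFFF


def add_with_flags(a, b):
    """Add two 32-bit values and derive the N, Z, C and V flags."""
    a &= MASK32
    b &= MASK32
    full = a + b                 # unbounded sum, may need 33 bits
    result = full & MASK32       # what fits in the 32-bit register
    flags = {
        "N": result >> 31,                   # bit 31 of the result
        "Z": int(result == 0),
        "C": int(full > MASK32),             # carry out: unsigned overflow
        # Signed overflow: both operands have the same sign and the
        # result has the opposite sign.
        "V": (((a ^ result) & (b ^ result)) >> 31) & 1,
    }
    return result, flags


if __name__ == "__main__":
    print(add_with_flags(0x7FFFFFFF, 1))
    print(add_with_flags(0xFFFFFFFF, 1))
```

Adding 1 to 0x7FFFFFFF sets N and V (the largest positive value wraps to a negative one) but not C, whereas adding 1 to 0xFFFFFFFF sets Z and C but not V.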

• IPSR
  • ISR number: the number of the currently executing interrupt service routine
• EPSR
  • T: Thumb state – always one since Cortex-M4 only supports the Thumb state (more on processor states in the next module)
  • ICI/IT: Interrupt-Continuable Instruction (ICI) bits and IF-THEN (IT) instruction status bits
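The three views of the xPSR can be separated with shifts and masks. The bit positions used here (N, Z, C, V, Q in bits 31 to 27, T in bit 24, the ISR number in bits 8 to 0) are taken from the ARMv7-M register layout as an assumption, since the slides do not list them; the ICI/IT bits are left out for brevity, and decode_xpsr is a made-up helper name.

```python
def decode_xpsr(xpsr):
    """Split a 32-bit xPSR value into the fields named on the slides.

    Bit positions are assumed from the ARMv7-M layout:
    APSR flags in bits 31..27, EPSR T bit in bit 24, IPSR in bits 8..0.
    """
    return {
        "N": (xpsr >> 31) & 1,
        "Z": (xpsr >> 30) & 1,
        "C": (xpsr >> 29) & 1,
        "V": (xpsr >> 28) & 1,
        "Q": (xpsr >> 27) & 1,
        "T": (xpsr >> 24) & 1,
        "ISR_NUMBER": xpsr & 0x1FF,
    }


if __name__ == "__main__":
    print(decode_xpsr(0x61000003))
```

For the value 0x61000003 this reports Z and C set, the Thumb bit set, and exception number 3 in the IPSR field.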

• Interrupt mask registers
  • 1-bit PRIMASK
    • Setting it to one blocks all interrupts apart from the non-maskable interrupt (NMI) and the hard fault exception
  • 1-bit FAULTMASK
    • Setting it to one blocks all interrupts apart from the NMI
  • BASEPRI (a priority field of up to 8 bits, not a single bit)
    • Setting it to a nonzero priority level blocks all interrupts of the same or lower priority (only interrupts with higher priority are allowed)
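The effect of the three mask registers can be modelled as a single check on an exception's priority value. This is a sketch under stated assumptions: lower numbers are more urgent, the NMI has the fixed priority -2 and the hard fault -1 (the ARMv7-M convention), and BASEPRI is treated as a priority threshold where zero means "disabled". The function is_blocked and its parameters are names invented here.

```python
NMI_PRIORITY = -2         # fixed priority, assumed from ARMv7-M
HARD_FAULT_PRIORITY = -1  # fixed priority, assumed from ARMv7-M


def is_blocked(priority, primask=0, faultmask=0, basepri=0):
    """Return True if an exception with this priority value is masked.

    Lower priority values are more urgent. basepri == 0 disables
    the BASEPRI threshold.
    """
    if faultmask and priority >= HARD_FAULT_PRIORITY:
        return True   # everything except the NMI is blocked
    if primask and priority >= 0:
        return True   # all configurable-priority interrupts are blocked
    if basepri and priority >= basepri:
        return True   # same or lower priority than the threshold
    return False


if __name__ == "__main__":
    print(is_blocked(5, basepri=4))   # same-or-lower priority: blocked
    print(is_blocked(3, basepri=4))   # higher priority: allowed
```

With BASEPRI set to 4, interrupts at priority values 4 and 5 are held off while one at priority 3 still gets through; with FAULTMASK set, only the NMI remains.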

• CONTROL: special register
  • 1-bit stack definition
    • Set to one: use the process stack pointer (PSP)
    • Cleared to zero: use the main stack pointer (MSP)

ARM Cortex-M3

• Introduced in 2004, the mainstream ARM processor developed specifically with microcontroller applications in mind
• Implements the Thumb-2 subset of the ARM instruction set
• Most Thumb-2 instructions are 16 bits wide and are expanded internally to full 32-bit ARM instructions
• ARM CPUs are capable of performing multiple low-level operations in parallel
• A hardware sign extender converts 8- and 16-bit operands to 32 bits
• Load/store architecture
• The barrel shifter allows operand Rm to be shifted first, after which the ALU can perform another operation (e.g. add, subtract, multiply)
• The barrel shifter can compute 5X = X + 2^2·X and -7X = X - 2^3·X
• MAC is the memory address calculator, used for addressing arrays and for repetitive address calculations
• R0-R12 are general-purpose registers; R13-R15 are special-purpose registers, i.e. SP, LR (which holds the return address when a subroutine is called) and PC
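The barrel-shifter identities (5X as X plus X shifted left by 2, and -7X as X minus X shifted left by 3) can be checked with 32-bit arithmetic. The sketch below models the shift-then-ALU step in software; the helper names are hypothetical, and the instruction forms in the comments are given as the assumed ARM equivalents.

```python
MASK32 = 0xFFFFFFFF


def lsl(value, amount):
    """Logical shift left within a 32-bit register."""
    return (value << amount) & MASK32


def times5(x):
    """X + (X << 2); assumed equivalent of ADD Rd, Rm, Rm, LSL #2."""
    return (x + lsl(x, 2)) & MASK32


def times_minus7(x):
    """X - (X << 3); assumed equivalent of SUB Rd, Rm, Rm, LSL #3."""
    return (x - lsl(x, 3)) & MASK32


def to_signed(value):
    """Interpret a 32-bit register value as a two's-complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value


if __name__ == "__main__":
    print(times5(3), to_signed(times_minus7(3)))
```

Both multiplications are done with one shift and one add or subtract, with no multiply instruction involved.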

ARM Cortex-M3 - Architecture

• 32-bit microprocessor
  • 32-bit data path
  • 32-bit register bank
  • 32-bit memory interface
• Harvard architecture
  • 3-stage pipeline
  • Separate instruction bus and data bus
  • Share the same memory space; different lengths of code and data
• Interrupts
  • 1 to 240 physical interrupts, plus NMI
  • 12-cycle interrupt latency
• Instruction set
  • Thumb (entire)
  • Thumb-2 (entire)

Processor Modes

• The ARM has seven basic operating modes
  • Each mode has access to its own stack space and a different subset of registers
  • Some operations can only be carried out in a privileged mode

ARM Registers

Processor Register Set

• Cortex-M3 core has 16 user-visible registers
  • All processing takes place in these registers
• Three of these registers have dedicated functions
  • Program counter (PC): holds the address of the next instruction to execute
  • Link register (LR): holds the address from which the current procedure was called
  • “The” stack pointer (SP): holds the address of the current stack top (CM3 supports multiple execution modes, each with its own private stack pointer)

The register set

The Registers

• ARM has 37 registers, all of which are 32 bits long
  • 1 dedicated program counter
  • 1 dedicated current program status register
  • 5 dedicated saved program status registers
  • 30 general-purpose registers
• The current processor mode governs which of several banks is accessible. Each mode can access
  • a particular set of r0-r12 registers
  • a particular r13 (the stack pointer, sp) and r14 (the link register, lr)
  • the program counter, r15 (pc)
  • the current program status register, cpsr
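The figure of 37 registers can be reproduced by counting the banked copies per mode. The banking table below is an assumption based on the classic ARM programmer's model (r13 and r14 banked in svc, abt, und and irq; r8 to r14 banked in fiq; usr and sys sharing one bank); the slides themselves only give the totals. All names in the code are invented for this count.

```python
# Which register numbers each mode has private (banked) copies of.
# Assumed from the classic ARM programmer's model.
BANKED = {
    "usr": (),
    "sys": (),
    "svc": (13, 14),
    "abt": (13, 14),
    "und": (13, 14),
    "irq": (13, 14),
    "fiq": (8, 9, 10, 11, 12, 13, 14),
}


def physical_register(mode, n):
    """Name of the physical register that rN refers to in a given mode."""
    if n == 15:
        return "pc"
    if n in BANKED[mode]:
        return f"r{n}_{mode}"
    return f"r{n}"


def count_registers():
    """Return (general-purpose, status, total) register counts."""
    names = {physical_register(m, n) for m in BANKED for n in range(16)}
    general = len(names) - 1  # everything except the pc
    # One cpsr plus one spsr for each mode that has banked registers.
    status = 1 + sum(1 for regs in BANKED.values() if regs)
    return general, status, general + 1 + status


if __name__ == "__main__":
    print(count_registers())
```

The count gives 30 general-purpose registers, 6 status registers (1 cpsr and 5 spsr) and, with the program counter, 37 in total, matching the breakdown above.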

Program Memory Model

• RAM for an executing program is divided into three regions
  • Data in RAM are allocated during the link process and initialized by startup code at reset
  • The (optional) heap is managed at runtime by library code implementing functions such as malloc and free, which are part of the standard C library
  • The stack is managed at runtime by compiler-generated code, which generates per-procedure-call stack frames containing local variables and saved registers

Cortex-M3 Memory Address Space

• ARM Cortex-M3 processor has a single 4 GB address space
• The SRAM and Peripheral areas are accessed through the System bus
• The “Code” region is accessed through the ICode (instructions) and DCode (constant data) buses
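Which region, and so which bus, an address falls into can be looked up from a small table. The region boundaries below are assumed from the standard ARMv7-M memory map (0.5 GB each for Code, SRAM and Peripheral, followed by external RAM, external device and system space); only the buses for the first three regions are stated on the slide, so the others are left as None. REGIONS and region_of are names made up for the sketch.

```python
# (first address, last address, region name, bus named on the slide)
# Boundaries assumed from the standard ARMv7-M memory map.
REGIONS = [
    (0x00000000, 0x1FFFFFFF, "Code", "ICode/DCode"),
    (0x20000000, 0x3FFFFFFF, "SRAM", "System"),
    (0x40000000, 0x5FFFFFFF, "Peripheral", "System"),
    (0x60000000, 0x9FFFFFFF, "External RAM", None),
    (0xA0000000, 0xDFFFFFFF, "External device", None),
    (0xE0000000, 0xFFFFFFFF, "System", None),
]


def region_of(address):
    """Return (region name, bus) for a 32-bit address."""
    for first, last, name, bus in REGIONS:
        if first <= address <= last:
            return name, bus
    raise ValueError("address is outside the 4 GB address space")


if __name__ == "__main__":
    for addr in (0x08000000, 0x20001000, 0x40021000):
        print(hex(addr), region_of(addr))
```

The six regions together cover the full 4 GB space, so every 32-bit address maps to exactly one of them.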

ARM Cortex-M3 Bus

Bit Banding

• Memory-mapped I/O; the 4 GB memory address space is organized in bytes
• 4 GB is very large for small embedded applications
• Bit-banding takes advantage of this large address space
• It uses two different regions of the address space to refer to the same physical data in memory
• In the primary bit-band region, each address corresponds to a single data byte
• In the bit-band alias region, each word-aligned address corresponds to 1 bit of the same data
• This allows a single bit of data to be read or written by a single instruction
• LDR can load a single bit and STR can write a single bit of data

• The two bit-band alias regions can be used to access individual status and control bits of I/O devices, or to implement a set of 1-bit Boolean flags, for example a set of mutex objects
• The bit-band hardware performs the read-modify-write atomically; it cannot be interrupted
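The mapping from a bit in the bit-band region to its alias address is a short calculation: alias = alias base + byte offset × 32 + bit number × 4. The sketch assumes the usual Cortex-M3 layout, with 1 MB bit-band regions at 0x20000000 (SRAM) and 0x40000000 (peripherals) and alias regions at 0x22000000 and 0x42000000; the function name and table are illustrative only.

```python
# (bit-band region base, alias region base); each bit-band region is 1 MB.
# Addresses assumed from the usual Cortex-M3 memory map.
BIT_BAND_REGIONS = (
    (0x20000000, 0x22000000),  # SRAM
    (0x40000000, 0x42000000),  # peripherals
)
REGION_SIZE = 0x100000


def bit_band_alias(byte_address, bit):
    """Alias word address that maps to one bit of a bit-band byte."""
    if not 0 <= bit <= 7:
        raise ValueError("bit must be in 0..7 for a byte address")
    for base, alias_base in BIT_BAND_REGIONS:
        if base <= byte_address < base + REGION_SIZE:
            byte_offset = byte_address - base
            # Each bit gets its own 32-bit word in the alias region.
            return alias_base + byte_offset * 32 + bit * 4
    raise ValueError("address is not in a bit-band region")


if __name__ == "__main__":
    print(hex(bit_band_alias(0x20000000, 0)))
    print(hex(bit_band_alias(0x200FFFFF, 7)))
```

Writing 1 or 0 to the returned alias address with a single STR sets or clears just that bit; the first and last bits of the SRAM bit-band region map to 0x22000000 and 0x23FFFFFC.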

Bit Banding Example

ARM7: Programming Model

• A word is 32 bits long
• A word can be divided into four 8-bit bytes
• ARM addresses can be 32 bits long
• An address refers to a byte
  • Address 4 starts at byte 4
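The relation between words and byte addresses can be made concrete. The sketch lists the byte addresses a word occupies and splits a 32-bit value into its four bytes; little-endian byte order is an assumption chosen for the example (the slides do not state the byte order), and the function names are hypothetical.

```python
def bytes_of_word(word_address):
    """Byte addresses covered by the 32-bit word starting at word_address."""
    if word_address % 4 != 0:
        raise ValueError("word addresses are assumed to be 4-byte aligned")
    return [word_address + i for i in range(4)]


def split_word(value, byteorder="little"):
    """Split a 32-bit value into four bytes, lowest address first.

    Little-endian order is an assumption made for this example.
    """
    return list(value.to_bytes(4, byteorder))


if __name__ == "__main__":
    print(bytes_of_word(4))
    print([hex(b) for b in split_word(0x11223344)])
```

The word at address 4 covers bytes 4 to 7, so the next word starts at address 8.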

ARM Cortex Status Registers (xPSR)


PSR: Program Status Register

• Divided into three bit fields
  • Application Program Status Register (APSR)
  • Interrupt Program Status Register (IPSR)
  • Execution Program Status Register (EPSR)
• The Q bit is the sticky saturation bit and supports two rarely used instructions (SSAT and USAT), e.g. SSAT{cond} Rd, #sat, Rm{, shift}
• IPSR holds the exception number during exception processing
• The ICI/IT bits hold state information for IT-block instructions or for instructions that are suspended during interrupt processing
• The T bit is always 1 to indicate Thumb instructions
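What SSAT does to a value, and when it would set the Q bit, can be modelled in a few lines. This is a software sketch of signed saturation to a #sat-bit range that ignores the optional shift operand; the function name ssat and the returned flag are assumptions for the illustration. Because Q is sticky, a real program would OR the returned flag into Q and never clear it this way.

```python
def ssat(value, sat):
    """Saturate a signed value to the range of a sat-bit signed integer.

    Returns (result, saturated). saturated is True when the value had
    to be clamped, which is the case where Q would be set.
    """
    if not 1 <= sat <= 32:
        raise ValueError("sat must be in 1..32")
    lowest = -(1 << (sat - 1))
    highest = (1 << (sat - 1)) - 1
    result = max(lowest, min(highest, value))
    return result, result != value


if __name__ == "__main__":
    print(ssat(200, 8))
    print(ssat(-5, 8))
```

Saturating 200 to 8 bits clamps it to 127 and reports saturation, while -5 already fits and passes through unchanged.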

Software Development Overview

• The software development flow
• At the file level, the build process for Keil MDK is:
• When using the GNU toolchain, compilation and linking are merged
• A debugging process that we will follow, given the limited capability of the STM32F4 on-board emulator, is the use of a UART

Software Development with MSP432 (ES-Lab)

Software Development (ES-Lab)

• Software development is nowadays usually done with the support of an IDE (Integrated Development Environment, sometimes expanded as Integrated Debugger and Editor)
  • Edit and build the code
  • Debug and validate