Lecture 4
Michele Magno
IIS Group - ETH Zurich
Outline
MCU Architecture
CPU
Power Consumption
MCU Peripherals
ARM Architecture
ARM Cortex
ARM Instruction Set
STM32 ARM Cortex-Mx Family
ISA
Data Sheet Examples
What is a microcontroller?
Example of MCU Architecture
[Figure: block diagram with Clock, Memory, ADC/DAC, and I/O Port blocks connected to the CPU over a bus]
Performance Metrics
How do we compare and classify microcontrollers?
Performance metrics are NOT easy to define and are mostly application dependent.
Electrical:
- Power consumption
- Voltage supply
- Noise immunity
- Sensitivity
Computation:
- Clock speed
- MIPS (million instructions per second)
- Latency: lateness of the response; lag between the beginning and the end of the computation
- Throughput: tasks per second; bytes per second
Goal: best tradeoff between power consumption and performance
Power as a Design Constraint
Dynamic Power Consumption
$P_{dyn} = A C V^2 f$
A – Activity of gates: how often on average do wires switch?
C – Total capacitance seen by the gate’s outputs: a function of wire lengths, transistor sizes, ...
V – Supply voltage. Trend: has been dropping with each successive fab
f – Clock frequency. Trend: increasing ...
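As a rough illustration of the A·C·V²·f relation above, the sketch below evaluates it at two supply voltages. The function name and all numeric values (activity 0.1, 1 nF switched capacitance, 16 MHz) are hypothetical and chosen only to show the quadratic dependence on V.

```python
def dynamic_power(activity, capacitance_f, vdd_v, freq_hz):
    """Dynamic power in watts: P = A * C * V^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


# Hypothetical operating point, not taken from any datasheet.
p_33 = dynamic_power(0.1, 1e-9, 3.3, 16e6)
p_18 = dynamic_power(0.1, 1e-9, 1.8, 16e6)

print(f"3.3 V: {p_33 * 1e3:.2f} mW")
print(f"1.8 V: {p_18 * 1e3:.2f} mW")
print(f"ratio: {p_33 / p_18:.2f}")  # equals (3.3 / 1.8) ** 2
```

Because V enters squared, lowering the supply voltage pays off more than lowering A, C, or f by the same factor.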
Short-circuit Power Consumption
$P_{sc} = \tau A V I_{short} f$
Finite slope of the input signal causes a direct current path between VDD and GND for a short period of time during switching, when both the NMOS and PMOS transistors are conducting
[Figure: CMOS inverter with input Vin, output Vout, load capacitance CL, and short-circuit current Ishort]
Reducing short-circuit power:
1) Lower the supply voltage V
2) Slope engineering – match the rise/fall time of the input and output signals
Leakage Power
$P_{leak} = V I_{leak}$
Sub-threshold current
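The three contributions from the preceding slides (dynamic, short-circuit, and leakage) can be put side by side in a small sketch. The function name and parameter names are illustrative only. The point it makes is that leakage is the only term that does not scale with the clock frequency, so it remains when the clock is stopped.

```python
def power_breakdown(a, c, v, f, tau, i_short, i_leak):
    """Return (dynamic, short_circuit, leakage) power in watts."""
    dynamic = a * c * v ** 2 * f              # A * C * V^2 * f
    short_circuit = tau * a * v * i_short * f  # tau * A * V * I_short * f
    leakage = v * i_leak                       # V * I_leak
    return dynamic, short_circuit, leakage


# Hypothetical values; with the clock stopped (f = 0) only leakage remains.
running = power_breakdown(0.1, 1e-9, 3.0, 8e6, 1e-10, 1e-3, 1e-6)
stopped = power_breakdown(0.1, 1e-9, 3.0, 0.0, 1e-10, 1e-3, 1e-6)
print(running)
print(stopped)
```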
How can we reduce power consumption?
Dynamic power consumption
Reduce the rate of charge/discharge of highly loaded nodes
Reduce spurious switching (glitches)
Reduce switching in idle states (clock gating)
Decrease frequency
Decrease voltage (and frequency)
Static power consumption
Smaller area (!)
Reduce device leakage through power gating
Reduce device leakage through body biasing
Use higher-threshold transistors when possible
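To see why "decrease voltage (and frequency)" is listed separately from "decrease frequency", a minimal sketch can compare the energy needed for a fixed number of clock cycles. It assumes only the dynamic term A·C·V²·f and ignores leakage and short-circuit power. The function name and all numbers are hypothetical.

```python
def task_cost(cycles, activity, cap_f, vdd_v, freq_hz):
    """Return (power_w, time_s, energy_j) for a task of `cycles` clock cycles."""
    power_w = activity * cap_f * vdd_v ** 2 * freq_hz
    time_s = cycles / freq_hz
    return power_w, time_s, power_w * time_s


base = task_cost(1_000_000, 0.1, 1e-9, 3.0, 8e6)
slow = task_cost(1_000_000, 0.1, 1e-9, 3.0, 4e6)    # half the frequency only
scaled = task_cost(1_000_000, 0.1, 1e-9, 1.5, 4e6)  # half the voltage as well

for name, (p, t, e) in [("base", base), ("slow", slow), ("scaled", scaled)]:
    print(f"{name}: {p * 1e3:.2f} mW, {t * 1e3:.1f} ms, {e * 1e6:.1f} uJ")
```

Under these assumptions, halving the frequency alone halves the power but doubles the run time, so the energy per task is unchanged. Halving the voltage as well cuts the energy by a factor of four.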
Why Ultra-low Power Is so Important
Clock Distribution
Key Features:
• MCLK: Main clock provided to the CPU
• SMCLK: Sub-Main clock provided to the peripherals
• ACLK: Auxiliary clock at low frequency provided to the peripherals
• Peripherals can work at high and low frequency
[Figure: System Clock Generator producing ACLK, SMCLK, and MCLK; MCLK feeds the CPU]
Clock System Generator
Memory
RAM (usually SRAM)
Volatile memory for runtime execution
Fastest access, small capacity (<100 KB)
Holds variables
Flash ROM
On-chip non-volatile memory used for code or data storage
8-512 KB, about 10k write cycles
Bootloader: protected section used to upload code into flash
FRAM (Ferroelectric Random Access Memory)
At the forefront of next-generation non-volatile memory technology
On-chip non-volatile memory, faster (50 ns) and lower power (250x less) than Flash.
External memory
Connected via a serial (I2C, SPI) or dedicated (FSMC) interface
Memory - Address Space
On-Chip FLASH/ROM and RAM memory
Everything is mapped into a single, contiguous address space:
All memory, including RAM, Flash/ROM, information memory, special function registers (SFRs), and peripheral registers.

Description                   Start               End               Access
Interrupt Vector Table        0FFE0h              0FFFFh            Word/Byte
Flash/ROM                     0F800h / 01100h *   0FFDFh            Word/Byte
RAM                           0200h               09FFh / 027Fh *   Word/Byte
16-bit Peripheral modules     0100h               01FFh             Word
8-bit Peripheral modules      0010h               00FFh             Byte
Special Function Registers    0000h               000Fh             Byte
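Because everything lives in one address space, finding out what an address refers to is a simple range lookup. The sketch below encodes the memory map above. The starred entries list two values each, presumably because they depend on the device; this sketch arbitrarily uses 0F800h for the Flash start and 027Fh for the RAM end, which is an assumption. `MEMORY_MAP` and `region_of` are illustrative names.

```python
# (start, end, description, access); bounds are inclusive.
# Flash start and RAM end are assumed values picked from the starred entries.
MEMORY_MAP = [
    (0x0000, 0x000F, "Special Function Registers", "Byte"),
    (0x0010, 0x00FF, "8-bit Peripheral modules", "Byte"),
    (0x0100, 0x01FF, "16-bit Peripheral modules", "Word"),
    (0x0200, 0x027F, "RAM", "Word/Byte"),
    (0xF800, 0xFFDF, "Flash/ROM", "Word/Byte"),
    (0xFFE0, 0xFFFF, "Interrupt Vector Table", "Word/Byte"),
]


def region_of(address):
    """Return the description of the region containing `address`."""
    for start, end, description, _access in MEMORY_MAP:
        if start <= address <= end:
            return description
    return "unmapped"


for addr in (0x0002, 0x0120, 0x0200, 0xF800, 0xFFFE, 0x0300):
    print(f"{addr:04X}h -> {region_of(addr)}")
```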
Interrupts
A way to respond to an external event (i.e., flag being set) without polling
How it works:
H/W senses flag being set
Automatically transfers control to s/w that “services” the interrupt
When done, H/W returns control to wherever it left off
Advantages:
Transparent to user
Cleaner code
μC doesn’t waste time polling
[Figure: main program flow interrupted by the ISR, which ends with RETI]
Interrupts: details
3 types
System reset
(Non)-maskable NMI
Maskable
Interrupt priorities could be fixed and defined by the arrangement of modules or set in the interrupt priority register
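The interplay of maskability and priority can be sketched as a selection among pending sources. This is a simplified model and not any particular MCU's arbitration logic. The `Source` class, the "higher number wins" convention, and the single global enable flag are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    priority: int   # assumed convention: higher number is served first
    maskable: bool  # maskable sources are ignored while interrupts are disabled


def next_interrupt(pending, global_enable):
    """Pick the pending source to service next, or None if none is eligible."""
    eligible = [s for s in pending if global_enable or not s.maskable]
    return max(eligible, key=lambda s: s.priority, default=None)


timer = Source("timer", priority=5, maskable=True)
uart = Source("uart", priority=3, maskable=True)
osc_fault = Source("oscillator fault (NMI)", priority=14, maskable=False)

print(next_interrupt([timer, uart], global_enable=True))
print(next_interrupt([timer, osc_fault], global_enable=False))
print(next_interrupt([timer, uart], global_enable=False))
```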
(Non)-Maskable Interrupts
Sources
An edge on the RESET pin when configured in NMI mode
An oscillator fault occurs
An access violation to the flash memory
NMI Interrupt Handler example
Maskable Interrupts
Interrupt acceptance
Return from Interrupt
Timers
Correct system timing is a fundamental requirement for the proper
operation of a real-time application;
If the timing is incorrect, the input data may be processed after the output
was updated
The timers may be driven from an internal or external clock;
Usually timers include multiple independent capture and compare
blocks, with interrupt capabilities;
Main applications:
Generate events of fixed-time period;
Allow periodic wake-up from sleep;
Count external signals/events;
Signal generation (Pulse Width Modulation – PWM);
Replacing delay loops with timer calls allows the CPU to sleep between operations,
thus consuming less power.
Timers
• The general-purpose timers consist of a 16-bit auto-reload counter driven by a programmable prescaler.
• They may be used for a variety of purposes, including measuring the pulse lengths of input signals (input
capture) or generating output waveforms (output compare and PWM).
• Pulse lengths and waveform periods can be modulated from a few microseconds to several milliseconds
using the timer prescaler and the RCC clock controller prescalers.
- 16-bit programmable prescaler used to divide (also “on the fly”) the counter clock frequency by any
‣ Input capture
‣ Output compare
‣ PWM generation (Edge- and Center-aligned modes)
‣ One-pulse mode output
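How the prescaler and the auto-reload value set the timer period can be sketched as follows. The sketch assumes the common STM32 convention that the counter clock is divided by (prescaler + 1) and that the counter wraps after (auto-reload + 1) ticks. That convention and the function name are assumptions, not something stated on the slide.

```python
def timer_update_hz(timer_clock_hz, prescaler, auto_reload):
    """Update (overflow) frequency of a 16-bit up-counting timer."""
    if not (0 <= prescaler <= 0xFFFF and 0 <= auto_reload <= 0xFFFF):
        raise ValueError("prescaler and auto-reload must fit in 16 bits")
    return timer_clock_hz / ((prescaler + 1) * (auto_reload + 1))


# Hypothetical setup: 24 MHz timer clock divided down to one update per second.
print(timer_update_hz(24_000_000, prescaler=23_999, auto_reload=999))
```

With a 16-bit prescaler and a 16-bit counter, the same clock can be divided by up to 65536 × 65536, which is what makes long periods reachable from a fast clock.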
Timers
[Figure: counter CNT driven by the counter clock CK_CNT]
Timers
[Figure: counter CNT driven by CK_CNT; labels: Period, Autoreload Register, Timer Interrupt]
Timers
[Figure: counter CNT driven by CK_CNT; labels: Autoreload Register, Compare Register, Timer Interrupt, CH1 Interrupt, output OC1]
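The labels on the timer figures (CNT, auto-reload register, compare register, OC1) suggest how a PWM signal is produced. The sketch below is a simplified model of an up-counting timer. It assumes the output is high while CNT is below the compare value, which is one common PWM mode but is not spelled out on the slides. The function name is illustrative.

```python
def pwm_waveform(auto_reload, compare, ticks):
    """Output level (0 or 1) for each CK_CNT tick of an up-counting timer."""
    levels = []
    cnt = 0
    for _ in range(ticks):
        # Assumed PWM mode: output is active while CNT < compare register.
        levels.append(1 if cnt < compare else 0)
        # The counter wraps to 0 after reaching the auto-reload value.
        cnt = 0 if cnt == auto_reload else cnt + 1
    return levels


wave = pwm_waveform(auto_reload=9, compare=3, ticks=20)
print(wave)
print("duty cycle:", sum(wave) / len(wave))
```

In this model the auto-reload value fixes the period (auto_reload + 1 ticks) and the compare value fixes the pulse width, so the duty cycle is compare / (auto_reload + 1).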
Watchdog timer (WDT)
The 16-bit WDT module can be used in:
Supervision mode:
- Ensure the correct working of the software application;
- Perform a PUC (power-up clear);
- Generate an interrupt request after the counter overflows.
Interval timer:
- Independent interval timer that periodically raises a “standard” interrupt upon counter overflow;
- Up-counter (WDTCNT) is not directly accessible by software;
- Control and interval-time selection are done through the WDTCTL register.
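Supervision mode can be pictured with a small software model: the application must clear the counter regularly, otherwise the counter expires and the device is reset. The `Watchdog` class and its method names are hypothetical. The `resets` counter merely stands in for the PUC.

```python
class Watchdog:
    """Toy model of a watchdog counter in supervision mode."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.counter = 0
        self.resets = 0  # stands in for the PUC performed by real hardware

    def kick(self):
        """Called by healthy software to clear the counter."""
        self.counter = 0

    def tick(self):
        """Advance the counter by one clock tick."""
        self.counter += 1
        if self.counter >= self.timeout_ticks:
            self.resets += 1
            self.counter = 0


wd = Watchdog(timeout_ticks=4)
for i in range(9):          # healthy loop: kicks every third tick
    wd.tick()
    if i % 3 == 2:
        wd.kick()
print("resets while healthy:", wd.resets)

for _ in range(4):          # software hangs: no more kicks
    wd.tick()
print("resets after hang:", wd.resets)
```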
Digital I/O
Independently programmable individual I/Os
Some pins can be configured
[Figure: ports Port1 to Port6 (e.g. P2., P3., P6.), each with pins 7 6 5 4 3 2 1 0]
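Since each port groups eight individually programmable pins (7 down to 0), single pins are usually handled with bit masks on the port register. The sketch below models a port register as an 8-bit integer. The helper names are illustrative and do not correspond to a vendor API.

```python
def set_pin(port, pin):
    """Return the port value with bit `pin` set to 1."""
    return port | (1 << pin)


def clear_pin(port, pin):
    """Return the port value with bit `pin` cleared to 0."""
    return port & ~(1 << pin) & 0xFF


def toggle_pin(port, pin):
    """Return the port value with bit `pin` inverted."""
    return port ^ (1 << pin)


def read_pin(port, pin):
    """Return the level (0 or 1) of bit `pin`."""
    return (port >> pin) & 1


p1 = 0b0000_0000
p1 = set_pin(p1, 3)
p1 = toggle_pin(p1, 7)
print(f"{p1:08b}", read_pin(p1, 3), read_pin(p1, 0))
```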
GPIO – General Purpose I/O
Button produces either Vcc or a floating input.
Adding a pull-down resistor fixes it.
Some ports have internal programmable resistors
[Figure: button between VCC and the port pin (to input logic), shown with a 5.6 kΩ resistor]
GPIO - Inside Inputs/Outputs
Output section
Input section
Interfaces
Several protocols for inter-chip communication
UART, I2C, SPI, USB,…
I2C
Shorthand for an “Inter-integrated circuit” bus
I2C devices include EEPROMs, thermal sensors, and real-time clocks
Used as a control interface to signal processing devices that have separate data interfaces, e.g. RF tuners, video decoders and encoders, and audio processors.
I2C bus has three speeds:
Slow (under 100 kbps)
Fast (400 kbps)
High-speed (3.4 Mbps) – I2C v.2.0
Limited to about 3 meters for moderate speeds
I2C (Inter-Integrated Circuit) protocol
Communication is always initiated and completed by the master, which is responsible for generating the clock signal;
In more complex applications, I2C can operate in multi-master mode;
The master selects a slave using the seven-bit address of the target slave;
The master (in transmit mode) sends:
Start bit;
7-bit address of the slave it wishes to communicate with;
A single bit representing whether it wishes to write (0) to or read (1) from the slave;
The target slave will acknowledge its address.
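The 7-bit address and the read/write bit described above travel together as the first byte after the start condition, with the address in the upper seven bits and the direction bit last. A minimal sketch follows; the function name and the example address 0x48 are hypothetical.

```python
def i2c_address_byte(address7, read):
    """Build the first byte sent by the master: 7-bit address plus R/W bit."""
    if not 0 <= address7 <= 0x7F:
        raise ValueError("I2C address must fit in 7 bits")
    # Write is 0 and read is 1, placed in the least significant bit.
    return (address7 << 1) | (1 if read else 0)


print(hex(i2c_address_byte(0x48, read=False)))
print(hex(i2c_address_byte(0x48, read=True)))
```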
I2C Bus Configuration
2-wire serial bus – Serial data (SDA) and Serial clock (SCL)
Half-duplex, synchronous, multi-master bus
No chip select or arbitration logic required
Lines pulled high via resistors, pulled down via open-drain drivers (wired-AND, avoids short circuits on the bus)
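The wired-AND behaviour of an open-drain line can be stated in one line of logic: the line is high only if no device is pulling it low. The sketch below is a purely logical model with an illustrative function name. It says nothing about electrical timing.

```python
def bus_level(pulling_low):
    """Level of an open-drain line given which devices are pulling it low.

    `pulling_low` is an iterable of booleans, one per device on the line.
    The pull-up resistor yields 1 unless at least one device pulls down.
    """
    return 0 if any(pulling_low) else 1


print(bus_level([False, False, False]))  # all devices released the line
print(bus_level([False, True, False]))   # one device pulls it low
```

Because no device ever drives the line high, two devices that disagree cannot short VDD to GND: the one pulling low simply wins.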
I2C Features
Example
I2C bridge
Sensors data acquisition example
Realization with digital sensor:
Data acquisition procedure:
SPI
Shorthand for “Serial Peripheral Interface”
Defined by Motorola on the MC68HCxx line of microcontrollers
Generally faster than I2C, capable of several Mbps
Applications:
Like I2C, used in EEPROM, Flash, and real-time clocks
Better suited for “data streams”, e.g. ADCs
Full-duplex capability, e.g. communication between a codec and a digital signal processor
Serial Peripheral Interface (SPI) protocol
Supports only one master;
Can support more than one slave;
Short distance between devices, e.g. on a printed circuit board (PCB);
Special attention must be paid to the polarity and phase of the clock signal;
The master sends data on one edge of the clock and reads data on the other edge. Therefore, it can send and receive at the same time.
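The simultaneous send and receive can be modelled as two shift registers exchanging one bit per clock. The sketch assumes 8-bit transfers, MSB first, and abstracts away clock polarity and phase. These choices and the function name are assumptions for illustration.

```python
def spi_transfer(master_byte, slave_byte):
    """Exchange one byte between master and slave shift registers.

    Returns (master_register, slave_register) after 8 clock cycles.
    """
    master, slave = master_byte & 0xFF, slave_byte & 0xFF
    for _ in range(8):
        mosi = (master >> 7) & 1  # master drives its MSB onto MOSI
        miso = (slave >> 7) & 1   # slave drives its MSB onto MISO
        # Both sides shift left and take in the bit sent by the other side.
        master = ((master << 1) & 0xFF) | miso
        slave = ((slave << 1) & 0xFF) | mosi
    return master, slave


m, s = spi_transfer(0xA5, 0x3C)
print(hex(m), hex(s))
```

After eight clocks the two registers have swapped contents, which is why every SPI write is also a read.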
SPI Bus Configuration
SPI structure
SPI vs. I2C
SPI
For multiple slaves, each slave
needs separate slave select signal
SPI requires more effort and more
hardware than I2C
I2C
UART
Shorthand for “Universal Asynchronous Receiver-Transmitter”
A UART’s transmitter is essentially just a parallel-to-serial converter with extra features.
The UART bus is a full-duplex bus.
The essence of the UART transmitter is a shift register that is loaded in parallel, and then each bit is sequentially shifted out of the device on each pulse of the serial clock.
Applications:
Communication between microprocessors and PCs
Used to interface the microcontroller with other transmission buses such as RS232, RS485, USB, CAN bus, KNX, LonWorks, etc.
Used to connect the microcontroller to modems and transceivers such as telephone modem, Bluetooth, Wi-Fi, GSM/GPRS/HSDPA
UART
Asynchronous serial devices, such as UARTs, do not share a common clock
Each device has its own local clock.
The devices must operate at the same nominal frequency.
Logic (within the UART) is required to detect the phase of the transmitted data and phase lock the receiver’s clock to it.
Bit rates: 2400, 19200, 57600, 115200, 921600, …
One of the problems associated with serial transmission is reconstructing the data at the receiving end, because the clock is not transmitted.
Difficulties arise in detecting boundaries between bits.
UART
The transmission format uses:
1 start bit at the beginning
Settable data length of 5, 6, 7, or 8 bits
Settable parity: 0 or 1 parity bit (even or odd)
Settable 1, 1.5, or 2 stop bits at the end of each frame.
Parity control
The parity bit is set to 0 or 1 so that the data bits plus the parity bit contain an odd number of 1 bits (odd parity) or an even number of 1 bits (even parity)
Parity can detect a single-bit error in the frame
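The frame layout and the parity rule can be made concrete with a short sketch. It assumes the usual conventions that the start bit is 0, stop bits are 1, and data is sent least significant bit first; the slides do not state these. It supports whole stop bits only, and the function names are illustrative.

```python
def uart_frame(data, data_bits=8, parity="even", stop_bits=1):
    """Return the frame as a list of bits in transmission order."""
    bits = [(data >> i) & 1 for i in range(data_bits)]  # LSB first (assumed)
    frame = [0] + bits                                   # start bit is 0
    if parity in ("even", "odd"):
        ones = sum(bits)
        # Choose the parity bit so the total count of 1s is even or odd.
        frame.append(ones % 2 if parity == "even" else (ones + 1) % 2)
    frame += [1] * stop_bits                             # stop bits are 1
    return frame


def parity_ok(frame, data_bits=8, parity="even"):
    """Check the parity of a received frame (data bits plus parity bit)."""
    payload = frame[1:1 + data_bits + 1]
    return sum(payload) % 2 == (0 if parity == "even" else 1)


frame = uart_frame(0x41)       # ASCII 'A', 8 data bits, even parity, 1 stop
print(frame, parity_ok(frame))

corrupted = frame.copy()
corrupted[3] ^= 1              # flip a single data bit
print(corrupted, parity_ok(corrupted))
```

Flipping two bits would leave the parity unchanged, which is why the check only guarantees detection of single-bit errors.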
UART transmission
UART can transmit with either 2 or 4 wires
2-wire mode has transmit and receive lines
4-wire mode has transmit and receive lines plus 2 handshake signals: RTS (request to send) and CTS (clear to send)
[Figure: two UARTs, with each Tx FIFO connected to the other's Rx FIFO]
Analog to Digital Converters
Most engineering applications require some form of data processing:
measurement, control, calculation, communication or data recording;
These operations, either grouped or isolated, are built into the measuring
instruments;
Direct Memory Access
Concept of DMA: move functionality to peripherals
Peripherals use less current than the CPU;
Delegating control to peripherals allows the CPU to shut down (saves
power) or perform other tasks (increase processing capabilities);
“Intelligent” peripherals are more capable, providing a better opportunity for
CPU shutoff;
DMA can be enabled for repetitive data handling, increasing the throughput
of peripheral modules;
Minimal software requirements and CPU cycles.
ARM Processors Families
STM32 ARM® Cortex™-M Family
Embedded ARM Cortex Processors
Cortex M0:
Ultra low gate count (fewer than 12 K gates).
Ultra low-power (3 µW/MHz).
32-bit processor.
Embedded ARM Cortex Processors (2)
Cortex M1:
The first ARM processor designed specifically for implementation in FPGAs.
Supports all major FPGA vendors.
Easy migration path from FPGA to ASIC.
Embedded ARM Cortex Processors (3)
Cortex M3:
The mainstream ARM processor for microcontroller applications.
High performance and energy efficiency.
Embedded ARM Cortex Processors (4)
Cortex M4:
The latest embedded processor for DSP.
STM32L1x - Block Diagram
[Figure: STM32L1x block diagram; annotations: 340 µA/MHz, 25°C, 105°C, wake-up time]
STM32F101 Product Lines
STM32F10x Product Lines (2)
All lines include:
- Multiple communication peripherals: up to 5 x USART, 3 x SPI, 2 x I²C
- ETM*
- FSMC**
- Dual 12-bit DAC***
Connectivity Line: STM32F107
- 72 MHz CPU; up to 256 KB Flash / 64 KB SRAM; USB 2.0 OTG (FS); 2 x Audio Class I2S; 2 x CAN; 2 x 12-bit ADC (1 µs); TempSensor; PWM timer; Ethernet IEEE 1588
Connectivity Line: STM32F105
- 72 MHz CPU; up to 256 KB Flash / 64 KB SRAM; USB 2.0 OTG (FS); 2 x Audio Class I2S; 2 x CAN; 2 x 12-bit ADC (1 µs); TempSensor; PWM timer
Which architecture is my processor?
Data Sizes and Instruction Sets
ARM is a 32-bit load / store RISC architecture
The only memory accesses allowed are loads and stores
Most internal registers are 32 bits wide
Most instructions execute in a single cycle
Datasheet example
Datasheet example: Timers
Questions?