
Y B T Sundari, et al International Journal of Computer and Electronics Research [Volume 2, Issue 2, April 2013]

IMPLEMENTING THE ARM7 SOFT CORE PROCESSOR IN FPGA

Y B T Sundari¹, T. Surender Reddy², Dr. Laxminarayana G³

¹ Assistant Professor, Dept of ECE, Aurora's Scientific & Technological Institute, Hyderabad, India.
² Associate Professor, Dept of ECE, Aurora's Scientific & Technological Institute, Hyderabad, India.
³ Director, Aurora's Scientific, Technological and Research Academy, Hyderabad, India.

Abstract - Today, soft processor cores are gaining importance for FPGA-based embedded applications, where the end user can configure the processor to suit the application and gain the benefits of an SOC by implementing the remaining logic in the FPGA fabric. SOC systems require an FPGA with a processor core. Processor cores are classified as either hard or soft. Hard processor cores added to an FPGA are a hybrid approach, offering performance tradeoffs that fall somewhere between a traditional ASIC and an FPGA; they are available from several manufacturers in a number of processor flavors. Soft cores, such as Altera's Nios II and Xilinx's MicroBlaze and PicoBlaze processors, use existing programmable logic elements of the FPGA to implement the processor logic. Building ARM soft processor cores is of great interest for FPGA-based multiprocessor SOC applications. The ARM architecture is considered market dominant in mobile phones and several other embedded applications. The ARM processor has been specifically designed to be small, to reduce power consumption and extend battery operation. In this paper a subset of the ARM7 (v4) instruction set is implemented to cater for such applications. A selected set of 32-bit instructions is implemented with a single-cycle data path and a random-logic-based instruction decoder. The core is implemented with UART and SPI communication capabilities. The major blocks are the register file, barrel shifter, ALU, multiplier, program counter updating logic and controller. A ROM stores the hex codes of a program used to test the implemented ARM soft processor core; hex codes generated by the GNU ARM assembler are used to validate the design. Results are analysed in ModelSim XE for simulation, in Xilinx XST for synthesis and in ChipScope for on-chip verification. The ARM processor embedded in the FPGA can be used for applications such as DSP and image processing.

Keywords - SOC; ARM7; UART; SPI; FPGA;

I. INTRODUCTION

Nowadays hardware systems need to be developed with low power, multiple functions and fast performance for better communication services. More embedded system developers and system-on-chip designers are using a microprocessor-based methodology [3]. The ARM processor has played a major role in embedded systems to date [1]. Today ARM is considered market dominant in the field of mobile phone chips, due to its power-saving features. Over the last 15 years, the ARM architecture has become the most pervasive architecture for 32-bit embedded processing applications. The most successful implementation has been the ARM7TDMI, with hundreds of millions sold in almost every kind of microcontroller-equipped product. ARM offers its popular microcontroller and microprocessor cores, which are manufactured by several leading chip manufacturers.
II. ARM DESIGN PHILOSOPHY

An ARM processor is any of several 32-bit RISC (reduced instruction set computer) microprocessors developed by ARM (Advanced RISC Machines; the name originally stood for Acorn RISC Machine) [3]. The ARM processor has been specifically designed to be small, to reduce power consumption and extend battery operation, which is essential for applications such as mobile phones and personal digital assistants (PDAs).
High code density is another major requirement, since embedded systems have limited memory due to cost and/or physical size restrictions. High code density is useful for applications that have limited on-board memory, such as mobile phones and mass storage devices.
In addition, embedded systems are price sensitive and use slow, low-cost memory devices. For high-volume applications like digital cameras, every cent has to be accounted for in the design [2]. The ability to use low-cost memory devices produces substantial savings. Another important requirement is to reduce the area of the die taken up by the embedded processor. For a single-chip solution, the smaller the area used by the embedded processor, the more space is available for specialized peripherals. This in turn reduces the cost of design and manufacturing, since fewer discrete chips are required for the end product [2].
ARM has incorporated hardware debug technology within the processor so that software engineers can view what is happening while the processor is executing code. With greater visibility, software engineers can resolve an issue faster, which has a direct effect on the time to market and reduces overall development costs. The ARM core is not a pure RISC architecture because of the constraints of its primary application, the embedded system. In today's systems the key is not raw processor speed but total effective system performance and power consumption [1], [2], [3].
ARM processors are used extensively in consumer electronics, including PDAs, mobile phones, digital media and music players, hand-held game consoles, calculators and computer peripherals such as hard drives and routers. Soft processor cores are gaining importance for FPGA-based embedded applications.
III. ARCHITECTURAL DESIGN

1) Overview of Design:

Building ARM soft processor cores is of great interest for FPGA-based multiprocessor SOC applications. A subset of the ARM7 (v4) instruction set is implemented to cater for such applications [4]. A selected set of 32-bit instructions is implemented with a single-cycle data path and a random-logic-based instruction decoder. Data processing, arithmetic, branch, logical and compare instructions are implemented. The data path is implemented with a multiplexer-based design, which is suitable for FPGA implementation. The major blocks are the register file, barrel shifter, ALU, multiplier, program counter updating logic and controller.
A ROM stores the hex codes of a program used to test the implemented ARM soft processor core. Hex codes generated by the GNU ARM assembler are used to validate the design. Figure 1 shows the top level diagram of the ARM7 processor with UART and SPI communication in our implementation.
Figure.1: Top Level Block Diagram of ARM7 (blocks shown: Controller, Control Signals, ROM, Data path with Register File, Barrel Shifter, MUX and ALU, UART, SPI)


The block diagram in Figure 1 contains the Data path, the Controller, the ROM and the UART and SPI communication modules. It comprises:

Data path
o Register File
o Multiplier
o Barrel Shifter
o ALU
Controller
ROM
UART Communication Module
SPI Communication Module

2) High level view of Datapath:

The data path is the heart of the ARM7 soft processor core. Its major blocks are the register file, barrel shifter, ALU, multiplier, program counter updating logic and controller. The data path is the module in which each instruction of the chosen set is carried out. In this project it is a MUX-based design, in which multiplexers are used to control all its sub-blocks. Figure 2 shows the high level view of the data path, which is built from the following modules.

Data path
o Register File
o Barrel Shifter
o ALU
o Multiplier

Figure.3: Register file


Rd, Rdh: Write Ports; Rm, Rs, Rn, Rn: Read Ports;
Rd_en, Rdh_en: Write Enables; Radh, Rad: Read Enables;
Pc_En: Enables the PC to update; PC_in: the input port from PC;
Ras, Ram, Ran, Ran: Address Ports.
2.2) Barrel Shifter

The barrel shifter shifts the data by any number of bits, as given by the shift size, in a single clock cycle. The data path uses the barrel shifter to shift an operand as required by the instruction, by the shift amount given by the controller. To bypass the barrel shifter, the controller asserts the signal Bp_bs. The two inputs are the shift amount and the data; the result, Alu_in2, is the shifted data.
Figure.2: High level view of Data path
2.1) Register file

The register file is a sub-block of the data path used to store data. The ARM register file consists of 16 registers, each 32 bits long, including the program counter and the Current Program Status Register (CPSR). Figure 3 shows the block diagram of the 6-port register file implemented for the ARM7 core. There are 4 read ports and 2 write ports. The number of ports is chosen according to the bit locations of the source and destination operands in the ARM7 instructions. Each port has a 4-bit address port and a 32-bit data port. In addition, each write port has a single-bit enable control signal. Only when this enable signal is 1 is the data on the input data port written into the register selected by the address port. As this implementation of the ARM core does not implement shadow registers, the register file holds only 16 32-bit registers, so a 4-bit address bus is sufficient.
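The port arrangement described above can be illustrated with a small behavioural model. This is only a sketch in Python, not the authors' HDL: the class name, the method names and the argument `ran2` are hypothetical, and treating `rad`/`radh` as the two write addresses is an assumption made for illustration.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


class RegisterFile:
    """Behavioural model: 16 x 32-bit registers, 4 read ports, 2 write ports."""

    def __init__(self):
        self.regs = [0] * 16

    def read(self, ram, ras, ran, ran2):
        """Return the four read-port values for four 4-bit addresses."""
        return tuple(self.regs[addr & 0xF] for addr in (ram, ras, ran, ran2))

    def write(self, rad, rd, rd_en, radh=0, rdh=0, rdh_en=0):
        """Write each port only when its single-bit enable is 1."""
        if rd_en == 1:
            self.regs[rad & 0xF] = rd & MASK32
        if rdh_en == 1:
            self.regs[radh & 0xF] = rdh & MASK32


rf = RegisterFile()
# Both write ports enabled: registers 1 and 2 are updated together.
rf.write(rad=1, rd=0x12345678, rd_en=1, radh=2, rdh=0xDEADBEEF, rdh_en=1)
# Enable low: register 3 keeps its old value.
rf.write(rad=3, rd=0xFFFFFFFF, rd_en=0)
print([hex(v) for v in rf.read(1, 2, 3, 0)])
```

Masking every address with `0xF` mirrors the point made in the text that a 4-bit address bus is enough when there are only 16 registers.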


Figure.4: Barrel shifter


Shift_size: Shift amount given by the mux, i.e. either Rs, or 0 & inst(11:8), or inst(11:7).
B_bus: Data coming from the above mux, i.e. Rm or Rn or Mul_L or Mul_H.
Alu_in2: Result from the shifter, used as input to the ALU.
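The barrel shifter's behaviour can be sketched as a pure function of its inputs. The paper does not list the shift types, so the four standard ARM shift operations (LSL, LSR, ASR, ROR) and the `shift_type` argument are assumptions here; the sketch also limits the shift amount to 0 to 31 and does not model ARM's special cases for larger amounts.

```python
MASK32 = 0xFFFFFFFF


def barrel_shift(b_bus, shift_size, shift_type="LSL", bp_bs=False):
    """Return Alu_in2: b_bus shifted by shift_size bits, or b_bus itself on bypass."""
    value = b_bus & MASK32
    n = shift_size & 0x1F  # assumption: 5-bit shift amount, 0..31
    if bp_bs or n == 0:
        return value
    if shift_type == "LSL":  # logical shift left
        return (value << n) & MASK32
    if shift_type == "LSR":  # logical shift right, zero fill
        return value >> n
    if shift_type == "ASR":  # arithmetic shift right, sign fill
        sign_fill = (MASK32 << (32 - n)) & MASK32 if value & 0x80000000 else 0
        return (value >> n) | sign_fill
    if shift_type == "ROR":  # rotate right
        return ((value >> n) | (value << (32 - n))) & MASK32
    raise ValueError("unknown shift type: " + shift_type)


print(hex(barrel_shift(0x1, 4)))                   # LSL by 4
print(hex(barrel_shift(0x80000000, 4, "ASR")))     # sign bits shifted in
print(hex(barrel_shift(0xABCD, 8, bp_bs=True)))    # bypass: data unchanged
```

Because the result depends only on the current inputs, the function corresponds to a combinational block that completes any shift within one clock cycle.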
2.3) Arithmetic & Logic Unit
The ALU performs all the arithmetic and logical operations in the data path. The results of the operations performed by the ALU are stored in the register file.
The data path drives the ALU with signals given by the controller: Alu_fun selects the function to be performed, and Bp_alu decides whether to use the ALU or bypass it. The two inputs, Alu_in1 and Alu_in2, are the operands; they come from the Op4 mux (Rn, Rn2, the Temp_reg data for a two-cycle instruction, or the constant K) and from the barrel shifter respectively.

Figure.5: Arithmetic & logic unit

The controller for the ARM7 core is implemented with the random logic method, in which the required control signals are generated by purely combinational logic. This method is chosen to enable the ARM7 core to work at higher clock speeds. The other reasons why this method is suitable here are given below.

Alu_in1, Alu_in2: Data coming from the Op4 mux and the barrel shifter.
Alu_fun: Decides the function.
Bp_alu: Signal to bypass the ALU.
The output goes to the register file to be written.
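A behavioural sketch of the ALU is given below. The encoding of Alu_fun is not stated in the paper, so mnemonic strings are used instead; the `carry_in` argument and the choice to pass Alu_in2 through on bypass are assumptions made for illustration. The arithmetic follows the usual ARM definitions (for example, SBC computes Rn - Op2 - NOT C).

```python
MASK32 = 0xFFFFFFFF


def alu(alu_in1, alu_in2, alu_fun, bp_alu=False, carry_in=0):
    """Return the 32-bit ALU result for the selected function."""
    a, b = alu_in1 & MASK32, alu_in2 & MASK32
    if bp_alu:
        return b  # assumption: bypass forwards the shifter output unchanged
    results = {
        "AND": a & b,
        "EOR": a ^ b,
        "ORR": a | b,
        "BIC": a & ~b,                     # bit clear
        "ADD": a + b,
        "ADC": a + b + carry_in,           # add with carry
        "SUB": a - b,
        "RSB": b - a,                      # reverse subtract
        "SBC": a - b - (1 - carry_in),     # subtract with carry
        "RSC": b - a - (1 - carry_in),     # reverse subtract with carry
    }
    return results[alu_fun] & MASK32       # wrap to 32 bits


print(hex(alu(5, 3, "SUB")))
print(hex(alu(3, 5, "SUB")))        # negative result wraps to two's complement
print(hex(alu(0xF0, 0x3C, "BIC")))
```

The final mask keeps every result within 32 bits, so a borrow or overflow wraps around as it would in a 32-bit hardware adder.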


2.4) Multiplier
The data path uses the multiplier for all multiplication operations, with signals given by the controller: Mul_type selects either 32-bit or 64-bit multiplication, and Unsigned_sign decides whether the inputs are treated as signed or unsigned. The two data inputs are Rs and the output of the Op2 mux. The outputs are Mul_L and Mul_H, which are sent to the Op3 mux.

Rs, data from Op2 mux: Two inputs for the multiplier.
Mul_type: Decides the type of operation, 32-bit or 64-bit.
Unsigned_sign: Decides whether the data is signed or unsigned.
Mul_H: For 64 bit Multiplication.
Mul_L: For 32 bit Multiplication.
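The difference between the 32-bit and 64-bit, signed and unsigned cases can be shown with a short sketch. The polarity and encoding of Mul_type and Unsigned_sign are not given in the paper, so the boolean arguments `long_result` and `signed` below are hypothetical stand-ins for them.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(x):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def multiply(rs, op2, long_result=False, signed=False):
    """Return (Mul_H, Mul_L); Mul_H is 0 for a 32-bit multiplication."""
    if signed:
        a, b = to_signed32(rs), to_signed32(op2)
    else:
        a, b = rs & MASK32, op2 & MASK32
    product = (a * b) & MASK64             # 64-bit two's-complement product
    mul_l = product & MASK32               # low word
    mul_h = (product >> 32) & MASK32 if long_result else 0
    return mul_h, mul_l


# 0xFFFFFFFF is 4294967295 when unsigned but -1 when signed.
print([hex(v) for v in multiply(0xFFFFFFFF, 2, long_result=True, signed=False)])
print([hex(v) for v in multiply(0xFFFFFFFF, 2, long_result=True, signed=True)])
```

The low word is the same in both cases; only the high word differs, which is why the signed/unsigned selection matters only for the 64-bit operations.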

o A random-logic-based controller results in lower instruction decoding delay and hence lets the ARM7 core work at higher clock frequencies.
o Since only a subset of instructions is chosen for implementation, a random-logic-based controller is not a big area overhead. For the full ARM7 instruction set, the controller itself would take a lot of area if the random logic method were used.
o The chosen implementation is a single-cycle implementation, for which a random-logic-based controller is very convenient.
o For a multi-cycle implementation, a microcoded instruction decoder is more suitable.

The instruction set chosen for implementation is divided into different types, based on the control signals that need to be issued to the various blocks of the data path. The following table shows the classification of the instructions.
Type    Instructions
T1      AND, EOR, SUB, RSB, ADD, ADC, SBC, RSC
T2      AND, EOR, SUB, RSB, ADD, ADC, SBC, RSC
T3      AND, SUB
T4      MUL
T5      MLA, UMULL, UMLAL, SMULL, SMLAL
T8      Branch instruction
T9      BL, BX
T11     ORR, BIC
T12     MOV, MVN
T14     MOV, MVN
T15     LDR
T17     TST, TEQ, CMP, CMN
T18     TST, TEQ, CMP, CMN
T19     TST, TEQ, CMP, CMN
T20     ORR, BIC

Figure.6: Multiplier
3) Controller of ARM7

The controller plays an important role in the ARM processor, as it controls all the operations performed by the blocks of the data path and the ROM. The controller generates the control signals to the blocks of the data path for the execution of the instruction given in the ROM.

Table 1: Classifications of Instructions
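As an illustration of purely combinational decoding, the sketch below extracts the fields of an ARM data-processing instruction using the standard ARM bit positions and opcode values. It is not the authors' controller: the function name and the returned field names are hypothetical, the T-type classification of Table 1 is not reproduced, and multiply and BX encodings, which share bits 27:26 = 00, are not distinguished.

```python
# Standard ARM data-processing opcodes, indexed by instruction bits 24:21.
DP_OPCODES = ["AND", "EOR", "SUB", "RSB", "ADD", "ADC", "SBC", "RSC",
              "TST", "TEQ", "CMP", "CMN", "ORR", "MOV", "BIC", "MVN"]


def decode_data_processing(inst):
    """Split a 32-bit data-processing instruction into its fields."""
    if ((inst >> 26) & 0x3) != 0:
        raise ValueError("not in the data-processing encoding space")
    return {
        "cond": (inst >> 28) & 0xF,             # condition code
        "imm": (inst >> 25) & 0x1,              # 1: operand 2 is an immediate
        "op": DP_OPCODES[(inst >> 21) & 0xF],   # operation mnemonic
        "s": (inst >> 20) & 0x1,                # 1: update the CPSR flags
        "rn": (inst >> 16) & 0xF,               # first operand register
        "rd": (inst >> 12) & 0xF,               # destination register
        "operand2": inst & 0xFFF,               # shifter operand field
    }


print(decode_data_processing(0xE0810002))  # ADD r0, r1, r2
print(decode_data_processing(0xE3A00005))  # MOV r0, #5
```

Every output is a fixed function of the instruction bits with no stored state, which is the property that lets a random-logic decoder finish within a single cycle.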


4) ROM:

The ROM holds the hex codes of a program used to test the implemented ARM7 soft core. The hex codes are generated using the GNU ARM assembler.
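How such a ROM might be filled and read can be sketched as follows. The three hex words are hand-encoded here for illustration and are not taken from the authors' test program; a word-organised ROM addressed by a byte-valued program counter is also an assumption.

```python
def load_rom(hex_words):
    """Convert a list of 8-digit hex strings into 32-bit instruction words."""
    return [int(word, 16) & 0xFFFFFFFF for word in hex_words]


def fetch(rom, pc):
    """Return the instruction at byte address pc (assumed word aligned)."""
    if pc % 4 != 0:
        raise ValueError("program counter must be word aligned")
    return rom[pc // 4]


# Illustrative program: MOV r0, #5 ; MOV r1, #3 ; ADD r2, r0, r1
program = ["E3A00005", "E3A01003", "E0802001"]
rom = load_rom(program)
for pc in range(0, 4 * len(rom), 4):
    print(hex(pc), hex(fetch(rom, pc)))
```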
5) UART:
Serial communication is essential to computers and allows them to communicate with low-speed peripheral devices such as the keyboard, the mouse and modems. The Universal Asynchronous Receiver/Transmitter (UART) controller is the key component of the serial communications subsystem of a computer. The UART takes bytes of data and transmits the individual bits sequentially.
Serial transmission is commonly used with modems and for non-networked communication between computers, terminals and other devices.
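The byte-to-bit conversion performed by a UART can be sketched as below. The paper does not state the frame format, so the common arrangement of one start bit, eight data bits sent least significant bit first, no parity and one stop bit is assumed; baud-rate timing is not modelled.

```python
def uart_frame(byte):
    """Serialise one byte: start bit (0), 8 data bits LSB first, stop bit (1)."""
    data_bits = [(byte >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]


def uart_deframe(bits):
    """Recover the byte from a 10-bit frame, checking start and stop bits."""
    if len(bits) != 10 or bits[0] != 0 or bits[9] != 1:
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(bits[1:9]))


frame = uart_frame(0x41)           # ASCII 'A'
print(frame)
print(hex(uart_deframe(frame)))
```

The start and stop bits let the receiver find the byte boundaries without a shared clock, which is what makes the link asynchronous.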
[Figure 7 block labels: CPU, CPU Bus Controller, Data Bus, Address Bus, Receiver (RxD), Baud rate Generator]
The SPI is most often employed in systems for communication between the central processing unit (CPU) and peripheral devices. It is also possible to connect two microprocessors by means of SPI.

Figure.8: Master-Slave Configuration


In the master SPI, the bits are sent out of the MOSI pin and received on the MISO pin. The bits to be shifted out are stored in the SPI data register, SP0DR, and are sent out most significant bit (bit 7) first. When bit 7 of the master is shifted out through the MOSI pin, bit 7 of the slave is shifted into bit 0 of the master via the MISO pin. After 8 clock pulses or shifts, this bit ends up in bit 7 of the master. An SPI transmission is always initiated by the master, and the peripheral device is called the slave.
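The exchange described above can be simulated with two 8-bit shift registers. This is a minimal sketch: the function name is hypothetical, and clock polarity, clock phase and slave select are not modelled.

```python
def spi_transfer(master_byte, slave_byte):
    """Simulate 8 SPI clock pulses; return the final (master, slave) registers."""
    master, slave = master_byte & 0xFF, slave_byte & 0xFF
    for _ in range(8):
        mosi = (master >> 7) & 1   # bit 7 of the master leaves on MOSI
        miso = (slave >> 7) & 1    # bit 7 of the slave leaves on MISO
        master = ((master << 1) & 0xFF) | miso   # MISO enters master bit 0
        slave = ((slave << 1) & 0xFF) | mosi     # MOSI enters slave bit 0
    return master, slave


m, s = spi_transfer(0xA5, 0x3C)
print(hex(m), hex(s))
```

After eight shifts the two registers have swapped contents, which is the full duplex behaviour of SPI: every transmitted byte is accompanied by a received byte.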
IV. SIMULATION AND CHIPSCOPE RESULTS

Each module was functionally simulated using ModelSim XE 6.2g, Xilinx tools were used for FPGA synthesis, and the on-chip results are shown using the ChipScope Pro Analyzer.


1. Multiplier
In this waveform we can observe the different kinds of multiplication that can be performed: 32-bit multiplication, 64-bit multiplication, and signed and unsigned multiplication.

Figure.7: Block Diagram of UART


6) Serial Peripheral Interface:

The SPI interface provides bi-directional, synchronous serial communication between microcontrollers and peripherals. It is based on a master-slave protocol in which the master is the device that drives the clock signal. Multiple masters and multiple slaves are allowed on the bus. The Serial Peripheral Interface (SPI) enables the serial (one bit at a time) exchange of data between two devices, one called the master and the other called the slave. An SPI operates in full duplex mode, which means that data can be transferred in both directions at the same time.

Figure.9: Simulation results of Multiplier


2. Arithmetic & logic unit
In this waveform we can observe the ALU functions. Depending on the alu_fun variable, many kinds of arithmetic and logical functions can be performed.

Figure.10: Simulation results of ALU

3. Barrel shifter
In this waveform we can observe that, according to the given inputs, different kinds of shifts with different sizes are produced.

Figure.11: Simulation results of Barrel shifter

4. Data path
In this waveform we can observe that, according to the instruction coming from the ROM and the inputs from the controller, we have address, data and CPSR register outputs.

Figure.12: Simulation results of Data path

5. Controller
In this waveform we can observe that, according to the data and CPSR inputs, the outputs are sent to the data path.

Figure.13: Simulation results of the Controller

6. Top level module
In this waveform we can observe the complete fetching, decoding and execution of an instruction.

Figure.14: Simulation results of the Top level module

7. Serial Peripheral Interface
In the waveform below, observe the simulation results of the SPI master-slave configuration.

Figure.15: Simulation results of Serial Peripheral Interface

8) ChipScope Results
a) In this waveform we can observe the complete functionality of the ARM7 processor on the chip.

Figure.16: Chip scope result of ARM7

b) When any character is transmitted by pressing a key in the HyperTerminal window, the core receives that character and it is shown in the ChipScope window. Figure 17 shows the ChipScope result for the UART receiver.

Figure.17: Chip scope result for UART receiver

V. CONCLUSION AND FUTURE SCOPE

An ARM soft processor core was implemented in the context of FPGA-based multiprocessor SOC applications. All the selected 32-bit instructions were implemented with a single-cycle data path and a random-logic-based instruction decoder. Data processing, arithmetic, branch, logical and compare instructions were implemented.
The data path was implemented with a multiplexer-based design, which is suitable for FPGA implementation. Hex codes generated by the GNU ARM assembler were used to validate the design.
This ARM processor embedded in an FPGA can be used for applications such as DSP and image processing (audio codec and video codec). Alternatively, the design can also be implemented on high-end FPGA devices like Virtex4 or Virtex5 for better performance. Such cores can be used for verification platforms in industry. The work can also be extended to ARM9 and ARM11 processors.
ACKNOWLEDGMENTS
This work is supported by the AURORA group of Institutions, Hyderabad.


REFERENCES
[1] Alex Heunhe Han, Young-Si Hwang, Young-Ho An, So-Jin Lee, Ki-Seok Chung, "Virtual ARM Platform for Embedded System Developers", IEEE, 2008, pp. 586-592.
[2] J. O. Hamblen, T. S. Hall, "Using System-on-a-Programmable-Chip Technology to Design Embedded Systems", IJCA, Vol. 13, No. 3, Sept. [Link] 1-11.
[3] Geun-young Jeong, "Design of 32-bit RISC processor and efficient verification", Proceedings of the 7th Korea-Russia International Symposium, KORUS 2003, pp. 222-227.
[4] Shebli Anvar, Olivier Gachelin, et al., "FPGA-based System-on-Chip Designs for Real-Time Applications in Particle Physics", 14th IEEE Real Time Conference, Stockholm, Sweden, June 6-10, 2005, pp. 1-5.
[5] Andrew N. Sloss, Dominic Symes, Chris Wright, "ARM System Developer's Guide: Designing and Optimizing System Software", ISBN: 9781558608740.
[6] ARM Architecture Reference Manual.
[7] Dave Auer & Mark Buer, "A Design Flow For Embedding The ARM Processor In An ASIC", IEEE, 1995, pp. 342-345.
[8] ARM Ltd, 1995, ARM7TDMI Data Sheet (ARM DDI 0029E), Advanced RISC Machines Ltd.
[9] [Link]
[10] [Link]

ISSN: 2278-5795
