Computer Organization
and Architecture
ITT 303
RISC-V ISA
Dr. Janibul Bashir
NIT Srinagar
Autumn 2024
ISA
■ The ISA is the interface between what the software commands
and what the hardware carries out.
■ The ISA specifies
❑ The memory organization
■ Address space (LC-3: 2^16, RISC-V: 2^32)
■ Addressability (LC-3: 16 bits, RISC-V: 8 bits)
❑ The register set
■ 8 registers (R0 to R7) in LC-3
■ 32 registers in RISC-V
❑ The instruction set
■ Opcodes
■ Data types
■ Addressing modes
■ Length and format of instructions
2
RISC-V
■ RISC-V: RISC-V (pronounced “risk-five”) is a new instruction set architecture
(ISA) that was originally designed to support computer architecture research and
education, but which we now hope will also become a standard free and open
architecture for industry implementations.
■ Load-store architecture.
■ A completely open ISA that is freely available to academia and industry.
■ An ISA supporting extensive ISA extensions and specialized variants.
■ There are two primary base integer variants, RV32I and RV64I.
■ RV32E is a subset variant of the RV32I base instruction set, added to support
small microcontrollers; it has half the number of integer registers.
■ The base integer instruction sets use a two’s-complement representation for
signed integer values.
3
RISC-V - II
■ RISC-V has been designed to support extensive customization and specialization.
Each base integer ISA can be extended with one or more optional instruction-set
extensions.
■ The base integer ISA is named “I” (prefixed by RV32 or RV64 depending on
integer register width), and contains integer computational instructions, integer
loads, integer stores, and control flow instructions. The standard integer
multiplication and division extension is named “M”, and adds instructions to
multiply and divide values held in the integer registers. The standard atomic
instruction extension, denoted by “A”, adds instructions that atomically read,
modify, and write memory for inter-processor synchronization. The standard
single-precision floating-point extension, denoted by “F”, adds floating-point
registers, single-precision computational instructions, and single-precision loads
and stores.
4
RISC-V - Memory
■ RISC-V has a single byte-addressable address space.
■ A word of memory is defined as 32 bits (4 bytes). Correspondingly, a halfword is
16 bits (2 bytes), a doubleword is 64 bits (8 bytes), and a quadword is 128 bits
(16 bytes).
■ Uses little-endian byte ordering (the least-significant byte is stored at the lowest address).
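To make the byte ordering concrete, here is a minimal sketch using Python's standard struct module. The value 0x12345678 and the variable names are illustrative assumptions, not part of the slides.

```python
import struct

word = 0x12345678  # arbitrary 32-bit example value (assumption)

# Pack the word the way a little-endian machine lays it out in memory.
little = struct.pack("<I", word)

# Lowest address first: the least-significant byte lands at offset 0.
layout = [f"addr+{i}: 0x{b:02X}" for i, b in enumerate(little)]
print(layout)

# Reading a halfword (2 bytes) at offset 0 yields the low 16 bits of the word.
halfword = struct.unpack_from("<H", little, 0)[0]
print(hex(halfword))
```

The byte 0x78 ends up at the lowest address, and the halfword read at that address is 0x5678.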
5
RV32I Base Integer Instruction Set
■ RV32I has 32 integer registers (named x0, x1, ..., x31), each 32 bits wide.
■ Register x0 is hardwired with all bits equal to 0. General purpose registers
x1–x31 hold values that various instructions interpret as a collection of Boolean
values, or as two’s complement signed binary integers or unsigned binary
integers.
■ There is one additional unprivileged register: the program counter pc holds the
address of the current instruction.
■ There is no dedicated stack pointer or subroutine return address link register in
the Base Integer ISA; the instruction encoding allows any x register to be used
for these purposes.
6
RV32I Base Integer Instruction Set - II
■ Software calling convention uses register x1 to hold the return address for a call,
with register x5 available as an alternate link register. The standard calling
convention uses register x2 as the stack pointer.
■ Each instruction is 32 bits wide.
■ A larger number of integer registers helps performance on high-performance
code, where there can be extensive use of loop unrolling and software pipelining.
■ For resource-constrained embedded applications, we have the RV32E subset,
which has only 16 registers.
7
Let us discuss instructions
■ An instruction is the most basic unit of computer processing
❑ Instructions are words in the language of a computer
❑ Instruction Set Architecture (ISA) is the vocabulary
■ The language of the computer can be written as
❑ Machine language: Computer-readable representation (that is, 0’s and 1’s)
❑ Assembly language: Human-readable representation
We will study RISC-V instructions and some LC-3 instructions
Principles are similar in all ISAs
8
Instruction: Opcode and Operand
■ An instruction is made up of two parts
❑ Opcode and Operands
■ Opcode specifies what the instruction does
■ Operands specify the data the instruction operates on
■ Both are specified in instruction format (or instr. encoding)
9
RISC-V Instruction Formats
10
Base Instruction Formats
■ In the base RV32I ISA, there are four core instruction formats (R/I/S/U).
■ All are a fixed 32 bits in length and must be aligned on a four-byte boundary in
memory.
11
Base Instruction Formats - II
■ In the base RV32I ISA, there are four core instruction formats (R/I/S/U) and 2
sub-formats (SB/UJ).
■ All are a fixed 32 bits in length and must be aligned on a four-byte boundary in
memory.
■ The RISC-V ISA keeps the source (rs1 and rs2) and destination (rd) registers at
the same position in all formats to simplify decoding.
12
Base Instruction Formats - III
■ The 6 Instruction Formats
• R-Format: instructions using 3 register operands (2 sources, 1 destination) – arithmetic/logical ops
• I-Format: instructions with immediates, loads – addi, lw, jalr, slli
• S-Format: store instructions: sw, sb, sh
• SB-Format: branch instructions: beq, bge
• U-Format: instructions with upper immediates – lui, auipc
• UJ-Format: jump instructions: jal
■ Type of operation determined by three parts in the instruction: opcode(7),
funct3(3), and funct7(7).
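Because the register and function fields sit at fixed bit positions, pulling them out of an instruction word is just shifting and masking. The sketch below illustrates that; the function name decode_fields is hypothetical, and not every format uses every field (for example, U-format has no rs1, rs2, or funct3).

```python
def decode_fields(instr):
    """Split a 32-bit instruction word into its fixed-position fields."""
    return {
        "opcode": instr & 0x7F,          # bits 6-0
        "rd": (instr >> 7) & 0x1F,       # bits 11-7
        "funct3": (instr >> 12) & 0x7,   # bits 14-12
        "rs1": (instr >> 15) & 0x1F,     # bits 19-15
        "rs2": (instr >> 20) & 0x1F,     # bits 24-20
        "funct7": (instr >> 25) & 0x7F,  # bits 31-25
    }


# 0x00310233 is the encoding of add x4, x2, x3.
fields = decode_fields(0x00310233)
print(fields)
```

For this word the decoder reports opcode 0110011, rd = 4, rs1 = 2, rs2 = 3, and funct3 = funct7 = 0, which identifies an add.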
13
R-format Instructions - I
■ 2 source registers (rs1, and rs2), and 1 destination register (rd).
■ Format is:
Bits:  31-25     | 24-20  | 19-15  | 14-12     | 11-7  | 6-0
Field: funct7(7) | rs2(5) | rs1(5) | funct3(3) | rd(5) | opcode(7)
■ Registers determined using 5 bits (0-31).
■ opcode, funct3, and funct7: specifies operation.
■ For R-types, opcode = 0110011.
■ rs1: first source operand, rs2: second source operand, rd: destination register.
■ 10 instructions: add (000), sub (000), sll (001), slt (010), sltu (011), xor (100), srl
(101), sra (101), or (110), and (111). -> value of funct3 in brackets.
■ funct7 is 0000000 for all except sub and sra, where it is 0100000.
14
R-format Instructions - II
■ SLT (set less than): places the value 1 in register rd if register rs1 is less
than register rs2 when both are treated as signed numbers; otherwise 0 is
written to rd.
■ Logical shift: all vacant bits are filled with 0.
■ Arithmetic shift: sign extension in vacant bits. Only right shift.
■ SLL, SRL, and SRA perform logical left, logical right, and arithmetic right
shifts on the value in register rs1 by the shift amount held in the lower 5
bits of register rs2.
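The comparison and shift semantics above can be sketched in a few lines of Python operating on 32-bit values. The helper names (to_signed, slt, sltu, sll, srl, sra) are illustrative, and register contents are modelled as plain integers.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def slt(rs1, rs2):
    return 1 if to_signed(rs1) < to_signed(rs2) else 0


def sltu(rs1, rs2):
    return 1 if (rs1 & MASK32) < (rs2 & MASK32) else 0


def sll(rs1, rs2):
    # Only the lower 5 bits of rs2 give the shift amount.
    return (rs1 << (rs2 & 0x1F)) & MASK32


def srl(rs1, rs2):
    # Logical right shift: vacant bits are filled with 0.
    return (rs1 & MASK32) >> (rs2 & 0x1F)


def sra(rs1, rs2):
    # Arithmetic right shift: vacant bits are filled with the sign bit.
    return (to_signed(rs1) >> (rs2 & 0x1F)) & MASK32


print(hex(srl(0x80000000, 4)), hex(sra(0x80000000, 4)))
print(slt(0xFFFFFFFF, 1), sltu(0xFFFFFFFF, 1))
```

Here 0xFFFFFFFF is -1 when signed, so slt returns 1 while sltu returns 0; likewise srl and sra differ only in what fills the vacated upper bits.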
15
Example of R Type instruction
■ Addition
a=b+c
■ add: mnemonic to indicate the operation to perform
b, c: source operands
a: destination operand
a←b+c
16
Mapping: Assembly to Machine instruction
■ Map variables to registers
a = x4 b = x2 c = x3
■ So in RISC-V assembly, the instruction becomes:
add x4,x2,x3
■ Opcode: 0110011, funct3: 000, funct7: 0000000
■ rs1 = x2 = 00010, rs2 = x3 = 00011, rd = x4 = 00100
Field: funct7(7) | rs2(5) | rs1(5) | funct3(3) | rd(5) | opcode(7)
Value: 0000000   | 00011  | 00010  | 000       | 00100 | 0110011
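The field packing for this example can be checked with a short sketch. The function name encode_r is hypothetical; the field values are the ones given for add x4,x2,x3.

```python
def encode_r(funct7, rs2, rs1, funct3, rd, opcode=0b0110011):
    """Pack R-format fields into a 32-bit instruction word."""
    return (
        (funct7 << 25)
        | (rs2 << 20)
        | (rs1 << 15)
        | (funct3 << 12)
        | (rd << 7)
        | opcode
    )


# add x4, x2, x3: rd = x4, rs1 = x2, rs2 = x3
add_word = encode_r(funct7=0b0000000, rs2=3, rs1=2, funct3=0b000, rd=4)
print(f"{add_word:032b}")
print(f"0x{add_word:08X}")

# sub uses the same opcode and funct3 but funct7 = 0100000.
sub_word = encode_r(funct7=0b0100000, rs2=3, rs1=2, funct3=0b000, rd=4)
print(f"0x{sub_word:08X}")
```

The add word comes out as 0x00310233, and the sub word differs only in bit 30.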
17
R-format Instructions - III
■ All RV32 R format instructions are:
Addressing mode used: Register Direct
18
I-format Instructions - I
■ 1 source register (rs1), a 12-bit immediate, and 1 destination register (rd).
■ Format is:
Bits:  31-20     | 19-15  | 14-12     | 11-7  | 6-0
Field: Imm[11:0] | rs1(5) | funct3(3) | rd(5) | opcode(7)
■ opcode and funct3 specify the operation.
■ For I-type arithmetic instructions, opcode = 0010011.
■ Imm is sign-extended to 32 bits. The notation Imm[y:x] means that these
instruction bits fill bits y:x of the final 32-bit immediate.
■ 9 arithmetic instructions: addi (000), slli (001), slti (010), sltiu (011), xori (100), srli
(101), srai (101), ori (110), andi (111). -> value of funct3 in brackets.
■ ADDI rd, rs1, 0 is used to implement the MV rd, rs1 assembler
pseudoinstruction.
19
I-format Instructions - II
■ For shift instructions imm[11:0] is divided in two parts: imm[11:5] and
imm[4:0].
■ Imm[4:0] determines the no. of bits to shift (0-31).
■ Imm[11:5] determines the operation: logical(0000000) or arithmetic(0100000).
■ Load instructions are also I-format instructions.
■ Opcode for load instructions: 0000011.
■ Effective address (EA) = [rs1] + Immediate.
■ rs1 is a base register and imm is offset.
■ Load data from memory(EA) to rd.
■ Types: LB(000), LH(001), LW(010), LBU(100), LHU(101). (B: byte, H: halfword,
and W: word).
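As a sketch of how the I-format fields fit together for arithmetic, shift, and load instructions, the snippet below packs three examples. The function name encode_i and the chosen instructions are illustrative assumptions.

```python
def encode_i(imm, rs1, funct3, rd, opcode):
    """Pack I-format fields; imm is a signed 12-bit value."""
    if not -2048 <= imm <= 2047:
        raise ValueError("immediate does not fit in 12 bits")
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


OP_IMM = 0b0010011  # arithmetic I-type opcode
LOAD = 0b0000011    # load opcode

# addi x1, x2, -1: the negative immediate is stored in two's complement.
addi_word = encode_i(-1, rs1=2, funct3=0b000, rd=1, opcode=OP_IMM)

# srai x5, x6, 3: imm[11:5] = 0100000 selects arithmetic, imm[4:0] is the shift amount.
srai_word = encode_i((0b0100000 << 5) | 3, rs1=6, funct3=0b101, rd=5, opcode=OP_IMM)

# lw x1, 4(x2): EA = [x2] + 4, loaded word goes to x1.
lw_word = encode_i(4, rs1=2, funct3=0b010, rd=1, opcode=LOAD)

print(f"0x{addi_word:08X} 0x{srai_word:08X} 0x{lw_word:08X}")
```

The same packing routine serves all three cases; only the opcode, funct3, and the interpretation of the immediate change.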
20
I-format Instructions - III
■ All arithmetic I-format instructions are:
Addressing mode used: Immediate
21
I-format Instructions - IV
■ All load I-format instructions are:
Addressing mode used: Base-offset
22
S-format Instructions - I
■ Used for store instructions. Two source registers (rs1 and rs2), no destination register.
■ Format is:
Bits:  31-25     | 24-20  | 19-15  | 14-12     | 11-7     | 6-0
Field: Imm[11:5] | rs2(5) | rs1(5) | funct3(3) | Imm[4:0] | opcode(7)
■ Effective address (EA) = [rs1] + Immediate.
■ Store data from rs2 to memory(EA).
■ Opcode = 0100011.
■ Types: sb(000), sh(001), sw(010).
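The split immediate is the only new wrinkle in S-format; a minimal sketch of the packing follows. The function name encode_s is hypothetical.

```python
def encode_s(imm, rs2, rs1, funct3, opcode=0b0100011):
    """Pack S-format fields; the 12-bit immediate is split across two fields."""
    if not -2048 <= imm <= 2047:
        raise ValueError("immediate does not fit in 12 bits")
    imm &= 0xFFF
    return (
        ((imm >> 5) << 25)      # Imm[11:5] in bits 31-25
        | (rs2 << 20)
        | (rs1 << 15)
        | (funct3 << 12)
        | ((imm & 0x1F) << 7)   # Imm[4:0] in bits 11-7
        | opcode
    )


# sw x1, 0(x2): store x1 to memory at [x2] + 0
sw0 = encode_s(0, rs2=1, rs1=2, funct3=0b010)
# sw x3, 4(x2): store x3 to memory at [x2] + 4
sw4 = encode_s(4, rs2=3, rs1=2, funct3=0b010)
print(f"0x{sw0:08X} 0x{sw4:08X}")
```

Keeping rs1 and rs2 in their usual positions is what forces the immediate to be split into the two remaining gaps.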
23
U-format Instructions - I
■ Dealing with large immediates.
■ Requires a destination register into which we load the large immediate.
■ Two instructions: LUI and AUIPC, with opcodes 0110111 and 0010111
respectively.
■ Format is:
Bits:  31-12      | 11-7  | 6-0
Field: Imm[31:12] | rd(5) | opcode(7)
■ LUI loads the upper immediate into the upper 20 bits of rd and fills the 12 lsbs with 0s.
■ AUIPC forms a 32-bit offset from the upper immediate (12 lsbs filled with 0s),
adds it to the PC, and writes the result to rd.
■ LUI together with ADDI is used to load a 32-bit constant into a register.
■ The pseudo-instruction li rd, 0x12345654 is implemented using LUI and ADDI.
24
U-format Instructions - II
■ lui writes the upper 20 bits of the destination with the immediate value and
clears the lower 12 bits.
■ Together with an addi to set low 12 bits, can create any 32-bit value in a
register using two instructions (lui/addi).
■ lui x10, 0x87654 # x10 = 0x87654000.
■ addi x10, x10, 0x321 # x10 = 0x87654321.
■ How to set 0xDEADBEEF?
■ lui x10, 0xDEADB # x10 = 0xDEADB000
■ addi x10, x10,0xEEF # x10 = 0xDEADAEEF
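■ The 12-bit addi immediate is always sign-extended! If the top bit of the 12-bit
immediate is 1, the immediate is negative, which effectively subtracts 1 from
the upper 20 bits.
■ Solution: pre-increment the value placed in the upper 20 bits if the sign bit of
the lower 12-bit immediate will be set.
■ lui x10, 0xDEADC # x10 = 0xDEADC000
■ addi x10, x10, 0xEEF # x10 = 0xDEADBEEF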
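The pre-increment rule can be sketched as a small helper that splits any 32-bit constant into a lui immediate and a sign-extended addi immediate. The names split_li and lui_addi are hypothetical; this is one way to do the split, not necessarily how a particular assembler implements li.

```python
def split_li(value):
    """Split a 32-bit constant into (upper 20 bits for lui, signed 12 bits for addi)."""
    value &= 0xFFFFFFFF
    lo = value & 0xFFF
    if lo & 0x800:
        lo -= 0x1000  # addi will sign-extend this to a negative number
    # Subtracting the (possibly negative) low part pre-increments the upper part.
    hi = ((value - lo) >> 12) & 0xFFFFF
    return hi, lo


def lui_addi(hi, lo):
    """Model lui followed by addi on a 32-bit register."""
    return ((hi << 12) + lo) & 0xFFFFFFFF


for constant in (0x87654321, 0xDEADBEEF):
    hi, lo = split_li(constant)
    print(f"0x{constant:08X}: lui 0x{hi:05X}, addi {lo} -> 0x{lui_addi(hi, lo):08X}")
```

For 0xDEADBEEF the helper yields an upper immediate of 0xDEADC and a low immediate of -273, which is the 12-bit pattern 0xEEF after sign extension.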
25
SB-format Instructions - I
■ Used for branch instructions. Same format as S with some minor changes.
■ Compares rs1 and rs2 and accordingly makes a branch.
■ We have to calculate the branch target.
■ Calculated PC-relative: branch target (BT) = PC + imm*2, where imm is the
12-bit field embedded in the instruction. One might expect a multiplier of 4
because base instructions are 4 bytes wide; 2 is used so that 16-bit
(compressed) instructions can be targeted as well.
■ To avoid the multiplication, we treat the immediate as 13 bits wide, with
12 bits embedded in the instruction and the lsb always 0.
■ The format is:
Bits:  31      | 30-25     | 24-20  | 19-15  | 14-12     | 11-8     | 7       | 6-0
Field: Imm[12] | Imm[10:5] | rs2(5) | rs1(5) | funct3(3) | Imm[4:1] | Imm[11] | opcode(7)
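The scrambled immediate is easiest to follow in code. The sketch below takes a byte offset (the 13-bit value with lsb 0) and places each bit group in its field; the function name encode_b is hypothetical, and the opcode 1100011 and funct3 values are the branch values used in these slides.

```python
def encode_b(offset, rs2, rs1, funct3, opcode=0b1100011):
    """Pack SB-format fields; offset is the signed byte offset from the PC."""
    if offset % 2 or not -4096 <= offset <= 4094:
        raise ValueError("offset must be even and fit in 13 bits")
    imm = offset & 0x1FFF
    return (
        (((imm >> 12) & 0x1) << 31)    # Imm[12]
        | (((imm >> 5) & 0x3F) << 25)  # Imm[10:5]
        | (rs2 << 20)
        | (rs1 << 15)
        | (funct3 << 12)
        | (((imm >> 1) & 0xF) << 8)    # Imm[4:1]
        | (((imm >> 11) & 0x1) << 7)   # Imm[11]
        | opcode
    )


# beq x1, x2, +8 bytes (skip the next instruction)
beq_word = encode_b(8, rs2=2, rs1=1, funct3=0b000)
# bne x1, x0, -4 bytes (branch back to the previous instruction)
bne_word = encode_b(-4, rs2=0, rs1=1, funct3=0b001)
print(f"0x{beq_word:08X} 0x{bne_word:08X}")
```

Bit 0 of the offset is never stored, which is why an odd offset is rejected.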
26
SB-format Instructions - II
■ Opcode: 1100011
■ Types: beq(000), bne(001), bge(101), blt(100), bltu(110), bgeu(111).
■ All branch instructions are:
27
UJ-format Instructions - I
■ For branches, we assumed that we won’t branch too far, so we can specify a
change in the PC.
■ For general jumps (jal), we may jump to anywhere in code memory – Ideally,
we would specify a 32-bit memory address to jump to.
■ Unfortunately, we can’t fit both a 7-bit opcode and a 32-bit address into a single
32-bit word.
■ Also, when linking we must write to an rd register.
■ For such cases we have jump instructions with a format similar to that of U
instructions.
28
UJ-format Instructions - II
■ Format is:
Bits:  31-12                 | 11-7  | 6-0
Field: Imm[20|10:1|11|19:12] | rd(5) | opcode(7)
■ Opcode: 1101111
■ The return address (PC+4) is stored in rd, which is generally register x1 (ra).
■ Example instruction: jal ra, 0x12345.
■ PC = PC + imm*2.
■ # j pseudo-instruction.
■ j Label = jal x0, Label # Discard return address.
■ # Call function within 2^18 instructions of PC.
■ jal ra, FuncName.
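A sketch of the UJ-format packing, again starting from a signed byte offset (the 21-bit value with lsb 0). The function name encode_j is hypothetical.

```python
def encode_j(offset, rd, opcode=0b1101111):
    """Pack UJ-format fields; offset is the signed byte offset from the PC."""
    if offset % 2 or not -(1 << 20) <= offset < (1 << 20):
        raise ValueError("offset must be even and fit in 21 bits")
    imm = offset & 0x1FFFFF
    return (
        (((imm >> 20) & 0x1) << 31)     # Imm[20]
        | (((imm >> 1) & 0x3FF) << 21)  # Imm[10:1]
        | (((imm >> 11) & 0x1) << 20)   # Imm[11]
        | (((imm >> 12) & 0xFF) << 12)  # Imm[19:12]
        | (rd << 7)
        | opcode
    )


# j +8 bytes  ==  jal x0, +8 (return address discarded)
j_word = encode_j(8, rd=0)
# jal ra, -4 bytes (link in x1)
jal_word = encode_j(-4, rd=1)
print(f"0x{j_word:08X} 0x{jal_word:08X}")
```

The range check reflects the reach of jal: about 2^20 bytes in either direction, that is 2^18 four-byte instructions.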
29
Jalr instruction (I-format)
■ Opcode: 1100111
■ Uses the same immediates as arithmetic and load instructions; the immediate is not multiplied by 2.
■ The target address is obtained by adding the sign-extended 12-bit I-immediate
to the register rs1, then setting the least-significant bit of the result to zero.
■ jalr rd, rs1, imm.
■ Writes PC+4 to rd (return address).
■ Sets PC = [rs1] + imm, with the least-significant bit cleared.
■ ret and jr pseudo-instructions
■ ret = jr ra = jalr x0, ra, 0
■ # Call function at any 32-bit absolute address: 1. lui x1, imm(20bits)
2. jalr ra, x1, imm(12 bits)
■ # Jump PC-relative with 32-bit offset: 1. auipc x1, imm(20 bits).
2. jalr x0, x1, imm(12 bits).
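The jalr behaviour described above can be modelled in a few lines. The function name jalr and the example addresses are illustrative assumptions; register and PC values are plain integers.

```python
MASK32 = 0xFFFFFFFF


def jalr(pc, rs1_value, imm):
    """Return (new_pc, link_value) for jalr rd, rs1, imm with a signed 12-bit imm."""
    if not -2048 <= imm <= 2047:
        raise ValueError("immediate does not fit in 12 bits")
    target = (rs1_value + imm) & MASK32
    target &= ~1 & MASK32        # least-significant bit is forced to zero
    link = (pc + 4) & MASK32     # return address written to rd
    return target, link


# ret == jalr x0, ra, 0: jump to the address held in ra; the link value is discarded.
new_pc, _ = jalr(pc=0x2000, rs1_value=0x1004, imm=0)
print(hex(new_pc))

# An odd sum still produces an even target.
target, link = jalr(pc=0x1000, rs1_value=0x2001, imm=4)
print(hex(target), hex(link))
```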
30
Levels of Representation/Interpretation
Higher-Level Language Program (e.g. C):
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
↓ Compiler
Assembly Language Program (e.g. RISC-V): (We are here)
    lw x3, 0(x2)
    lw x1, 4(x2)
    sw x1, 0(x2)
    sw x3, 4(x2)
↓ Assembler
Machine Language Program (RISC-V):
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
↓ Machine Interpretation
Hardware Architecture Description (e.g. block diagrams)
↓ Architecture Implementation
Logic Circuit Description (Circuit Schematic Diagrams)
31
Translation vs. Interpretation - I
▪ How do we run a program written in a source language?
– Interpreter: Directly executes a program in the source language
– Translator: Converts a program from the source language to an equivalent
program in another language
▪ Directly interpret a high-level language when efficiency is not critical
▪ Translate to a lower level language when increased performance
is desired.
▪ Generally easier to write an interpreter
▪ Interpreter closer to high-level, so can give better error messages
(e.g. Python, Venus)
▪ Interpreter is slower (~10x), but code is smaller (~2x)
32
Translation vs. Interpretation - II
▪ Translated/compiled code almost always more efficient and therefore
higher performance
– Important for many applications, particularly operating systems
▪ Translation/compilation helps “hide” the program “source” from the users
– One model for creating value in the marketplace (e.g. Microsoft keeps all
their source code secret)
– Alternative model, “open source”, creates value by publishing the source code and
fostering a community of developers
33
C Translation
Steps to Starting a Program:
1) Compiler
2) Assembler
3) Linker
4) Loader
C program: foo.c
↓ Compiler
Assembly program: foo.s
↓ Assembler
Object (mach lang module): foo.o
↓ Linker (also takes lib.o)
Executable (mach lang pgm): [Link]
↓ Loader
Memory
34
Compiler
▪ Input: Higher-level language (HLL) code (e.g. C, Java in files such as foo.c).
▪ Output: Assembly Language Code (e.g. foo.s for RISC-V).
▪ Note that the output may contain pseudo-instructions.
Assembler
▪ Input: Assembly language code (e.g. foo.s for RISC-V)
▪ Output: Object code (True Assembly), information tables (e.g.
foo.o for RISC-V) – Object file.
▪ Reads and uses directives. Replaces pseudo-instructions.
▪ Produces machine language.
35
Assembler Directives
▪ Give directions to assembler, but do not produce machine instructions
▪ .text: Subsequent items put in user text segment (machine code)
▪ .data: Subsequent items put in user data segment (binary rep of data in
source file)
Operation – 2 pass
▪ Pass 1:
– Expands pseudo instructions encountered
– Remember position of labels
– Take out comments, empty lines, etc
– Error checking
▪ Pass 2:
– Use label positions to generate relative addresses (for branches and jumps)
– Outputs the object file, a collection of instructions in binary code
36
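The two passes can be illustrated with a toy label resolver. This is a simplified sketch under stated assumptions: every instruction is 4 bytes, a label sits alone on its own line ending in a colon, pseudo-instruction expansion and binary encoding are left out, and the function name resolve_labels is hypothetical.

```python
def resolve_labels(lines):
    """Replace label operands with PC-relative byte offsets using two passes."""
    # Pass 1: strip comments and blank lines, remember where each label points.
    labels = {}
    instructions = []
    for line in lines:
        line = line.split("#")[0].strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = 4 * len(instructions)
            continue
        instructions.append(line)

    # Pass 2: use the recorded label positions to compute relative offsets.
    resolved = []
    for index, text in enumerate(instructions):
        pc = 4 * index
        parts = text.replace(",", " ").split()
        if parts[-1] in labels:
            parts[-1] = str(labels[parts[-1]] - pc)
        resolved.append(" ".join(parts))
    return resolved


program = [
    "loop:",
    "  addi x5, x5, -1   # decrement counter",
    "  bne x5, x0, loop",
    "  jal x0, done      # forward reference",
    "  addi x6, x6, 1",
    "done:",
    "  add x7, x5, x6",
]
for line in resolve_labels(program):
    print(line)
```

Because pass 1 has already seen every label, the forward reference to done in pass 2 is resolved just like the backward reference to loop.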
Producing Machine Language
▪ What about jumps to external labels?
– Requires knowing a final address
– Forward or not, can’t generate machine instruction without knowing the position
of instructions in memory
▪ What about references to data?
– la gets broken up into two instructions (lui and addi for an absolute address, or auipc and addi for a PC-relative one)
– These will require the full 32-bit address of the data
▪ These can’t be determined yet, so we create two tables- symbol table
and relocation table.
37
Symbol Table
▪ List of “items” that may be used by other files
– Each file has its own symbol table
▪ What are they?
– Labels: function calling
– Data: anything in the .data section; variables may be
accessed across files
▪ Keeping track of the labels fixes the forward reference problem
38
Relocation Table
▪ List of “items” this file will need the address of later (currently
undetermined)
▪ What are they?
– Any label jumped to: jal or jalr
• internal
• external (including library files)
– Any piece of data
• such as anything referenced in the data section
39
Object File Format
1) object file header: size and position of the other pieces of the object
file
2) text segment: the machine code
3) data segment: data in the source file (binary)
4) relocation table: identifies lines of code that need to be “handled”
5) symbol table: list of this file’s labels and data that can be referenced
6) debugging information
40
Linker
• Input: Object Code files, information tables (e.g. foo.o,lib.o for
RISC-V)
• Output: Executable Code (e.g. [Link] for RISC-V)
• Combines several object (.o) files into a single executable (“linking”)
• Enables separate compilation of files
– Changes to one file do not require recompilation of whole program
– Old name “Link Editor” from editing the “links” in jump and link instructions
6/28/2018 CS61C Su18 - Lecture 8 41
Linker - II
Inputs:
object file 1: text 1, data 1, info 1
object file 2: text 2, data 2, info 2
↓ Linker
Output ([Link]):
Relocated text 1
Relocated text 2
Relocated data 1
Relocated data 2
42
Linker - III
1) Take text segment from each .o file and put them together
2) Take data segment from each .o file, put them together, and
concatenate this onto end of text segments
3) Resolve References
– Go through Relocation Table; handle each entry
– i.e. fill in all absolute addresses
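The three linker steps can be sketched with a toy model. Everything here is an illustrative assumption rather than a real object-file format: objects are plain dictionaries, text sizes count 4-byte instructions, data sizes count bytes, and the function name link is hypothetical.

```python
def link(objects, text_base=0):
    """Lay out text then data, then resolve each relocation entry to an absolute address."""
    layout = {}
    addr = text_base
    # 1) Put the text segments together.
    for obj in objects:
        layout[(obj["name"], "text")] = addr
        addr += 4 * obj["text_words"]
    # 2) Concatenate the data segments onto the end of the text segments.
    for obj in objects:
        layout[(obj["name"], "data")] = addr
        addr += obj["data_bytes"]

    # Merge the per-file symbol tables into absolute addresses.
    symbols = {}
    for obj in objects:
        for name, (section, offset) in obj["symbols"].items():
            symbols[name] = layout[(obj["name"], section)] + offset

    # 3) Go through each relocation table and fill in absolute addresses.
    resolved = []
    for obj in objects:
        for word_index, name in obj["relocations"]:
            site = layout[(obj["name"], "text")] + 4 * word_index
            resolved.append((site, name, symbols[name]))
    return symbols, resolved


main_o = {
    "name": "main.o", "text_words": 3, "data_bytes": 4,
    "symbols": {"main": ("text", 0), "msg": ("data", 0)},
    "relocations": [(0, "msg"), (1, "printf")],
}
lib_o = {
    "name": "lib.o", "text_words": 2, "data_bytes": 0,
    "symbols": {"printf": ("text", 0)},
    "relocations": [],
}
symbols, resolved = link([main_o, lib_o])
print(symbols)
print(resolved)
```

A real linker would also patch the instruction words at each relocation site; the sketch stops at computing which address belongs where.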
43
Loader
• Input: Executable Code (e.g. [Link] for RISC-V)
• Output: <program is run>
• Executable files are stored on disk
• When one is run, loader’s job is to load it into memory and start it
running
• In reality, loader is the operating system (OS)
– loading is one of the OS tasks
Loader - II
1) Reads executable file’s header to determine size of text and data segments
2) Creates new address space for program large enough to hold text and data
segments, along with a stack segment
3) Copies instructions and data from executable file into the new address space
4) Copies arguments passed to the program onto the stack
5) Initializes machine registers
– Most registers cleared, but stack pointer assigned address of 1st free stack
location
6) Jumps to start-up routine that copies program’s arguments from stack to
registers and sets the PC
– If main routine returns, start-up routine terminates program with the exit system call
45
C.A.L.L. Example
#include <stdio.h>
int main()
{
printf("Hello, %s\n", "world");
return 0;
}
48
Compiled Hello.c: Hello.s
49
Assembled Hello.s: Linkable Hello.o
50
Linked Hello.o: [Link]
51
END
52