Virtual Memory
EE312
Memory Management Goal
We want to share main memory such that:
• Each process thinks it has a private memory of
2-4 GB (or more), even if it doesn’t use it all
• Real memory is allocated efficiently to parts of
process memory actually being used (locality)
• No process can interfere with or even see
memory belonging to another
– Unless we want that to happen
Program memory is the part of a computer’s address space used by a program (process) during its [Link] running process gets i
own private program memory layout — so even if multiple programs are running, each thinks it has the same memory map (thanks to vir
Layout of program memory
memory).
Reserved by OS made it unaccessible to the
7FFF FFFF reserved (4KB) users.
7FFF EFFF stack (grows down)
~1792 MB
Used for function calls
stores local variables If Stack & Heap grow too much and meet — → stack overflow / he
Function Parameters collision occurs.
Not to
Scale!
Dynamic Memory Allocation
1001 0000 heap (grows up)
1000 FFFF Holds Global & Static Variables that persists for
1000 0000 global data (64 KB) the lifetime of the program. Divided in to 2 parts
1-> Initialised data segment : int x = 5;
2-> Uninitialised data segment : int y;
0FFF FFFF Contains compiled code of the program
Also called text segment or code segment
program (252 MB)
0040 0000 Stores CPU instructions that implement program's funcitons
003F FFFF Reserved by OS or hardware
reserved (4 MB)
0000 0000
When the program is compiled, the compiler & linker assign memory to all the variables and the code segment. That memory is virtu
logical memory. That is not the real memory on the RAM . Indeed, during the program, they will interact using this memory only.
Program Memory Addresses
• Program addresses are fixed at the time the
source file is compiled and linked
• Small, simple systems can use program
addresses as the physical address in memory
In small systems without OS, the addresses assigned by the compiler/linker are directly mapped to physical memory.
• Modern systems usually much more complex
» program address space very large
» other programs running at the same time
» operating system is in memory too
Direct Physical Addressing
physical
addresses
program
addresses
stack
physical
heap memory
program
Physical Addressing
• Address generated by the program is the same as the
address of the actual memory location
• Simple approach, but lots of problems
» Only one process can easily be in memory at a time
» There is no way to protect the memory that the process
isn't supposed to change (i.e., the OS or other processes)
» A process can only use as much memory as is physically
in the computer
» A process occupies all the memory in its address space,
even if most of that space is never used
• 2 GB for the program and 2 GB for the system kernel
A kernel is the core part of the computer system that directly interacts with the computer's hardware.
Idea
• Separate program notion of memory addresses
from actual physical memory locations
» Program memory = virtual addresses
» Physical memory = real addresses
» Use hardware to map between the two
Memory Mapping
program memory physical
addresses mapping addresses
stack
heap
program
physical
stack
memory
heap
program
stack
heap
program disk
here disk helps in paging or swappint(when the physical memory is full , the OS moves some data from RAM to disk.)
Virtual Addresses
• The program addresses are now considered to
be “virtual addresses”
• The memory management unit (MMU)
translates the program addresses to the real
physical addresses of locations in memory
• This is another of the many interface layers
that let us work with abstractions, instead of
all details at all levels
Abstractions: Here we are hiding the unnecessary details of physical address RAM memory and showing the virtual memory
which is what required.
Address Translation
●
Requires some sort of register/table
to allow arbitrary mappings of logical
to physical addresses.
●
Two basic schemes:
●
segmented;
●
paged.
●
Segmentation and paging can be
combined.
Segments and Pages
page 1
page 2
segment 1
Physical memory
segment 2
Segments and Pages
●
Segmented: Large, arbitrary sized
●
Paged: Small (512 bytes- 4KB),
equally sized
●
Simple h/w
●
Allows fragmentation
●
Segmented paged: Each segment
divided into pages
●
Two step address translation
Virtual Physical
Paging Page # Page #
0x0000
0 0 0x0000
1 1
• Divide a process's virtual 2 2
address space into fixed- 3 3
size chunks (called pages) 4 4
• Divide physical memory 0x6000
5 5
into pages of the same size 6
• Any virtual page can be 0x0000 7
0
located at any physical 8
1
page 9
2
10
• Translation box converts 0x4000 3
11
from virtual pages to 12
physical pages Translation
13 0xE000
Multiple Processes Virtual Physical
Page # Page #
Share Memory 0x0000
0 0 0x0000
1 1
2 2
• Each process thinks it 3 3
starts at address 4 4
5 5
0x0000 and has all of 0x6000
6
memory 0x0000
0
7
8
• A process doesn't 1
9
know anything about 2
3
10
physical addresses 0x4000 11
12
and doesn't care Translation
13 0xE000
Protection
Virtual Physical
Page # Page #
• A process can only use 0x0000
0 0 0x0000
1 1
virtual addresses
2 2
• A process can't corrupt 3 3
another process's memory 4 4
5 5
» It has no address to refer to it 0x6000
6
• How can Blue write to 0x0000
0
7
8
Green's page 2? 1
9
2
» needs an address to refer to 10
3
physical page 7, but it doesn't 0x4000 11
have one 12
Translation
13 0xE000
The OS looks for pages that aren’t currently being used (for example, old or idle data). It copies those pages from RAM to disk (this proce
called swapping out). he OS marks them as “not in memory.”
Store Memory on Disk
Virtual Physical
Page # Page #
• Memory that isn't being 0x0000
0 0 0x0000
used can be saved on disk 1 1
» swapped back in when it is 2 2
referenced via page fault 3 3
4 4
• Programs can address 5 5
more memory than is 6 6
physically available 7 7
• This is an important 8 8
9 9 0xA000
reason for virtual memory 10
» too hard for programs to do 11
this on their own 12
The OS brings them back from disk to RAM (this is called
0xE000 13 Translation Disk
swapping in),
Not all parts of a program’s virtual address space actually need to exist in physical memory or even on disk.
Sparse Address
Spaces Virtual Physical
Page # Page #
0x0000
0 0 0x0000
• Memory addresses that 1 1
aren't being used at all 2 2
don't have to be in 0x4000 3 3
memory or on disk 4
Unused 5
» Code can start at a very
6
low logical address 7
0x0FFC000 997
» Stack can start at a very 8
998
high logical address 9 0xA000
999
» No physical pages 1000
0x1000000
allocated for unused Translation
addresses in between
Sharing Memory Virtual
Page #
Physical
Page #
0x0000
0 0 0x0000
• Two processes can share
1 1
memory by mapping two 2 2
virtual pages to the same 3 3
physical page 4 4
5 5
• The code for Word can be 0x6000
Word 6
shared for two Word 0x0000 7
0
processes 1 8
» code pages are read only 2 9
10
• Each process has its own 3
11
4
data pages 12
5 Translation
» possible to share data pages 0x6000
Word 13 0xE000
too, but less common
Virtual Address Translation
virtual physical
address address
Virtual Physical
VPN Translate PPN
Page # page #
Offset Offset
Offset remains the same & no need of translation.
program -> virtual -> physical
program address (32 bits)
virtual page number (20 bits) offset in page (12)
memory management unit
physical page number (n bits) offset in page (12)
physical address (n+12 bits)
Page Tables
for example
• Offset field is 12 bits
» so each page is 212 bytes = 4096 bytes = 4KB
• Virtual Page Number field is 20 bits
» so 220 = 1 million virtual pages
• Page table is an array with one entry for
each virtual page
» 1 million entries
» entry includes physical page number and flags
Issues with Space & Latency
• Each process has a page table with 1 Million
entries - big
» no memory left to store the actual programs
• Each page table must be referenced for every
address reference in a program - slow
» no time left to do any useful work
A page fault is an exception that occurs when a running program tries to access a piece of memory (a "page") that is not currently
located in the computer's physical RAM
Memory Hierarchy Revisited
• Once the translation hardware is there we have a
caching problem again
» Want size ≈ disk, performance ≈ memory
• Key issue: disk latency is 100,000 times memory,
so design motivation is to avoid accessing disk
• Minimizing miss rate (“page faults”):
» VM “pages” are much larger than cache blocks = size
of disk blocks, usually 4K or 8K or more
» Use fully associative lookup with approximate LRU
Disk transfer cost is dominated by seek + rotational latency, so transferring larger blocks make sense. This concept is Amortization
meaning for the given time taken it is always better to have more data rather less data. So, carrying more data weakly dominates
carrying less data.
Fully associative : Any smaller memory can be placed at any larger memory.
Finding the Right Page (frame)
• If fully associative, how do we find the right page without
scanning all of memory?
• Answer: index is called the page table
» Each process has a separate page table
• Processor “page table register” points to active one – part of
process state
» Page table indexed with virtual page number (VPN)
• The bits that aren’t part of the page offset
» Each entry contains a valid bit and a physical page number (PPN)
• PPN is concatenated with page offset to get physical address
» No index tag needed – full VPN is index Physical Address == PPN ||Offset
The offset remains the same as the size of each page remains same in vm & pm. Also, the offset is not used as the key in page table
as they are same.
Page Table
Here, Physical memory size = 1GB
Valid bit:
Virtual Memory size = 4GB
1 -> Page in memory
thus only a fraction of VM is mapped
0 --> page fault
to PM at the time.
Thus, VM is larger than PM.
Page Table Organizations
page
descriptor
i page descriptor
flat tree
Dealing with large Page Table
This may increase the time complexity. To tackle this, we cache the system using TLB(Transitional Lookaside Buffer).
This stores the PPN translations.
Hit rate is almost 95 - 98%.
2 states, TLB MISS || TLB HIT;
ARM Memory Management
●
Memory region types:
●
section: 1 Mbyte block; Maps a large 1-MB chunk of virtual memory directly
to physical memory.
●
large page: 64 kbytes; Used when you want more control than 1MB but don’t
need 4KB granularity.
●
small page: 4 [Link] when fine-grained protection or demand paging is
needed.
●
An address is marked as section-
mapped or page-mapped.
●
Two-level translation scheme.
ARM Address Translation
Virtual address
Translation table 1st index 2nd index offset
base register
descriptor Page mapped case only
concatenate
1st level table 1st entry to choose:
if(sectionMapped) {"No 2nd lookup table"; PA =Concatenate(PPN + OFFSET)}\
if(pageMapping){"Look for 2nd table"; Move to next step}
Tells whether {64kb || 4kb || invalid}
descriptor concatenate
2nd level table
physical address
Page Tables - size problem
• The page tables are addressed using virtual
addresses in the kernel
• Therefore they don’t need physical memory
except for the parts that are actually used
» see “Sparse Address Spaces” diagram
• Operating System manages these tables in its
own address space
» kernel address space
Page Tables - speed problem
• Use special memory cache for page table
entries - Translation Lookaside Buffer
• Each TLB entry contains
» address space ID number (part of the tag)
» virtual page number (rest of the tag)
» flags (read only, dirty, etc)
» associated physical page number (the data)
• TLB is a fully associative cache
Cpu generates program address:
Using the TLB
ASID | Virtual Page Number | Offset
ASID : Current process ID
TLB matches(ASID, VPN):
if(TLB hit ==true){retrieve ppn;concatenate(PPN, OFFSET); return;} // takes 1 CPU cycle
if(TLB MISS == true){PPL = pageTableWalk(); loads(PPL, TLB); //refill; retry();return;}
A Process
Page Table
Process Program address
...
ASID Virtual Page Number Offset
TLB
PPN
ASID VPN Physical Page Number
refill
Physical address
Physical Page Number Offset
...
TLB Miss
●
If we miss in the TLB, we need to “walk the page table”
» In MIPS, an exception is raised and software fills the TLB
» In x86, a “hardware page table walker” fills the TLB
●
What if the page is not in memory?
» This situation is called a page fault.
» The operating system will have to read the page from disk.
» It will need to select a page to replace.
The O/S tries to approximate LRU
» The replaced page will need to be written back if dirty.
Reference
• Computer Organization and Design, Patterson and Hennessy
(3e)