Memory Hierarchy and Technologies Explained
MEMORY HIERARCHY
The principle of locality states that programs access a relatively small portion of their
address space at any instant of time, just as you access a very small portion of a library's
collection at any one time. There are two types of locality:
■ Temporal locality (locality in time): if an item is referenced, it will tend to be referenced
again soon. If you recently brought a book to your desk to look at, you will probably need to
look at it again soon.
■ Spatial locality (locality in space): if an item is referenced, items whose addresses are
close by will tend to be referenced soon. For example, if you are referring to a book, you will
also tend to refer to the books shelved near it.
A memory hierarchy consists of multiple levels of memory with different speeds and sizes.
The faster memories are more expensive per bit than the slower memories and thus are
smaller.
A structure that uses multiple levels of memories; as the distance from the processor
increases, the size of the memories and the access time both increase.
Today, there are three primary technologies used in building memory hierarchies.
Main memory is implemented from DRAM (dynamic random access memory), while
levels closer to the processor (caches) use SRAM (static random access memory).
The third technology, used to implement the largest and slowest level in the hierarchy,
is usually magnetic disk. (Flash memory is used instead of disks in many embedded
devices)
DRAM is less costly per bit than SRAM, although it is substantially slower. The price
difference arises because DRAM uses significantly less area per bit of memory, and
memory as a hierarchy of levels. In fig 1 it shows the faster memory is close to the
The upper level— the one closer to the processor— is smaller and faster than the lower
level, since the upper level uses technology that is more expensive.
The below fig shows that the minimum unit of information that can be either present
Fig 2: Every pair of levels in the memory hierarchy can be thought of as having an upper and
lower level. We usually transfer an entire block when we copy something between levels.
Hit: If the data requested by the processor appears in some block in the upper level, this is
called a hit.
Miss: If the data is not found in the upper level, the request is called a miss.
The lower level in the hierarchy is then accessed to retrieve the block containing the
requested data.
The hit rate, or hit ratio, is the fraction of memory accesses found in the upper level; it is
Since performance is the major reason for having a memory hierarchy, the time to service
Hit time is the time to access the upper level of the memory hierarchy, which includes the
The miss penalty is the time to replace a block in the upper level with the corresponding
block from the lower level, plus the time to deliver this block to the processor.
Because the upper level is smaller and built using faster memory parts, the hit time will be
much smaller than the time to access the next level in the hierarchy, which is the major
Programs exhibit both temporal locality, the tendency to reuse recently accessed data
items, and spatial locality, the tendency to reference data items that are close to other
accessed data items closer to the processor. Memory hierarchies take advantage of
Fig.3 This diagram shows the structure of a memory hierarchy: as the distance from the
The above fig shows that a memory hierarchy uses smaller and faster memory
technologies close to the processor. Thus, accesses that hit in the highest level of the
hierarchy can be processed quickly. Accesses that miss go to lower levels of the
close to that of the highest (and fastest) level and a size equal to that of the lowest
level.
In most systems, the memory is a true hierarchy, meaning that data cannot be present
MEMORY TECHNOLOGY
SRAM Technology
The first letter of SRAM stands for static. SRAMs don't need to refresh and so the access
time is very close to the cycle time. SRAMs typically use six transistors per bit to prevent
The dynamic nature of the circuits in DRAM requires data to be written back after being
read— hence the difference between the access time and the cycle time as well as the
need to refresh.
SRAM needs only minimal power to retain the charge in standby mode. SRAM designs are
concerned with speed and capacity, while in DRAM designs the emphasis is on cost per bit
and capacity.
8 times that of SRAMs. The cycle time of SRAMs is 8– 16 times faster than DRAMs, but
DRAM Technology
1) As early DRAMs grew in capacity, the cost of a package with all the necessary address
lines was an issue. The solution was to multiplex the address lines, thereby cutting the
2) One-half of the address is sent first, called the row access strobe (RAS). The other half of
3) These names come from the internal chip organization, since the memory is organized as a
means that the memory system is occasionally unavailable because it is sending a signal
telling every chip to refresh. The time for a refresh is typically a full memory access (RAS
and CAS) for each row of the DRAM. Since the memory matrix in a DRAM is conceptually
square, the number of steps in a refresh is usually the square root of the DRAM capacity.
DRAM designers try to keep time spent refreshing to less than 5% of the total time. So far
we have presented main memory as if it operated like a Swiss train, consistently delivering
Although we have been talking about individual chips, DRAMs are commonly sold on small
and they are normally organized to be 8 bytes wide (+ ECC) for desktop systems.
To improve bandwidth, there has been a variety of evolutionary innovations over time.
The first was timing signals that allow repeated accesses to the row buffer without
another row access time, typically called fast page mode. Such a buffer comes naturally,
as each array will buffer 1024– 2048 bits for each access. Conventional DRAMs had an
asynchronous interface to the memory controller, and hence every transfer involved
The second major change was to add a clock signal to the DRAM interface, so that the
repeated transfers would not bear that overhead. Synchronous DRAM (SDRAM) is the
name of this optimization. SDRAMs typically also had a programmable register to hold the
number of bytes requested, and hence can send many bytes over several cycles per
request.
The third major DRAM innovation to increase bandwidth is to transfer data on both the
rising edge and falling edge of the DRAM clock signal, thereby doubling the peak data
FLASH MEMORY
products include a controller to spread the writes by remapping blocks that have been written
many times to less trodden blocks. This technique is called wear leveling. With wear leveling,
personal mobile devices are very unlikely to exceed the write limits in the flash. Such wear
leveling lowers the potential performance of flash, but it is needed unless higher-level
DISK MEMORY
A magnetic hard disk consists of a collection of platters, which rotate on a spindle at 5400
The metal platters are covered with magnetic recording material on both sides, similar to
The entire drive is permanently sealed to control the environment inside the drive, which, in
turn, allows the disk heads to be much closer to the drive surface.
Each disk surface is divided into concentric circles, called tracks. There are typically tens
Each track is in turn divided into sectors that contain the information; each track may have
Seek
The process of positioning a read/write head over the proper track on a disk.
Rotational latency
Also called rotational delay. The time required for the desired sector of a disk to rotate
under the read/write head; usually assumed to be half the rotation time. The average latency
to the desired information is halfway around the disk. Disks rotate at 5400 RPM to 15,000
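As a rough illustration of the half-rotation rule described above, the Python sketch below computes the average rotational latency for a given spindle speed. The function name is purely illustrative and does not come from these notes.

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in milliseconds: half of one full rotation."""
    seconds_per_rotation = 60.0 / rpm
    return 0.5 * seconds_per_rotation * 1000.0


# The two ends of the spindle-speed range mentioned in the text.
for rpm in (5400, 15000):
    print(f"{rpm} RPM -> {average_rotational_latency_ms(rpm):.2f} ms")
```

Under this rule a 5400 RPM disk waits about 5.56 ms on average, and a 15,000 RPM disk about 2.00 ms.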
CACHE BASICS
A cache is one of the fastest and smallest levels of the memory hierarchy, between the
Figure below shows such a simple cache, before and after requesting a data item that is
Before the request, the cache contains a collection of recent references X1, X2, …, Xn−1
The processor requests a word Xn that is not in the cache. This request results in a miss,
The simplest way to assign a location in the cache for each word in memory is to assign
This cache structure is called direct mapped, since each memory location is mapped
For example, almost all direct-mapped caches use this mapping to find a block:
(Block address) modulo (Number of blocks in the cache)
Thus, an 8-block cache uses the three lowest bits (8 = 2^3) of the block address.
For example, the figure below shows how the memory addresses between 1 (00001 in binary) and
29 (11101 in binary) map to locations 1 (001 in binary) and 5 (101 in binary) in a direct-mapped cache
of eight words.
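The mapping in this example can be sketched in a few lines of Python. This is a minimal sketch, assuming word-sized blocks and an eight-entry cache as in the text; the function names are illustrative only.

```python
def direct_mapped_index(block_address: int, num_blocks: int) -> int:
    """Cache index = (block address) modulo (number of blocks in the cache)."""
    return block_address % num_blocks


def direct_mapped_tag(block_address: int, num_blocks: int) -> int:
    """The remaining upper part of the block address, which is stored as the tag."""
    return block_address // num_blocks


NUM_BLOCKS = 8  # eight-entry cache, so the index is the 3 low-order bits

for address in (1, 9, 17, 25, 5, 13, 21, 29):
    index = direct_mapped_index(address, NUM_BLOCKS)
    tag = direct_mapped_tag(address, NUM_BLOCKS)
    # For a power-of-two cache size, the modulo equals masking the low-order bits.
    assert index == address & (NUM_BLOCKS - 1)
    print(f"address {address:2d} ({address:05b}) -> index {index:03b}, tag {tag:02b}")
```

Addresses 1, 9, 17, and 25 all land in entry 001, while 5, 13, 21, and 29 land in entry 101; only the two upper bits (the tag) tell them apart.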
A direct-mapped cache with eight entries showing the addresses of memory words
The tags contain the address information required to identify whether a word in the cache
The tag needs only to contain the upper portion of the address .Only have the upper 2 of
The most common method is to add a valid bit to indicate whether an entry contains a
valid address.
If the bit is not set, there cannot be a match for this block.
Accessing a Cache
A sequence of nine memory references to an empty eight-block cache, including the action
Figure below shows how the contents of the cache change on each miss.
• The index of a cache block, together with the tag contents of that block, uniquely specifies
size, because the cache includes both the storage for the data and the tags.
• Control unit deals with cache misses. The control unit must detect a miss and process the
• If the cache reports a hit, the computer continues using the data as if nothing happened.
• If the data is not present in the cache then it is a miss. The cache miss handling is done in
collaboration with the processor control unit and with a separate controller that initiates
• The processing of a cache miss creates a pipeline stall, as opposed to an interrupt, which
• To get the proper instruction into the cache, instruct the lower level in the memory
2. Instruct main memory to perform a read and wait for the memory to complete its access.
3. Write the cache entry, putting the data from memory in the data portion of the entry, writing
the upper bits of the address (from the ALU) into the tag field, and turning the valid bit on.
4. Restart the instruction execution at the first step, which will refetch the instruction, this time
Handling writes
• Suppose on a store instruction, we wrote the data into only the data cache (without
• Then, after the write into the cache, memory would have a different value from that in the
data into both the memory and the cache. This scheme is called write-through.
• write buffer- A queue that holds data while the data is waiting to be written to memory. A
execution.
• In a write back scheme, when a write occurs, the new value is written only to the block in
the cache. The modified block is written to the main memory only when it is replaced.
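To make the difference between the two write policies concrete, the Python sketch below counts how many writes reach main memory for the same sequence of stores. It is a deliberately tiny model, not a real cache: it assumes a single cache block, ignores the cost of fetching a block on a write miss, and uses hypothetical names (`TinyCache`, `count_memory_writes`).

```python
class TinyCache:
    """One-block cache, just enough to contrast write-through and write-back."""

    def __init__(self, write_back: bool):
        self.write_back = write_back
        self.block = None        # block address currently held in the cache
        self.dirty = False       # only meaningful for write-back
        self.memory_writes = 0   # number of writes that reached main memory

    def store(self, block_address: int) -> None:
        if self.block != block_address:
            self._replace(block_address)
        if self.write_back:
            self.dirty = True            # update the cache only
        else:
            self.memory_writes += 1      # write-through: update memory as well

    def _replace(self, new_block: int) -> None:
        # Write-back: a modified block goes to memory only when it is replaced.
        if self.write_back and self.dirty:
            self.memory_writes += 1
        self.block = new_block
        self.dirty = False


def count_memory_writes(stores, write_back: bool) -> int:
    cache = TinyCache(write_back)
    for block_address in stores:
        cache.store(block_address)
    return cache.memory_writes


stores = [3, 3, 3, 7, 7]
print("write-through:", count_memory_writes(stores, write_back=False))
print("write-back:   ", count_memory_writes(stores, write_back=True))
```

With this store sequence, write-through sends all five stores to memory, while write-back sends only one (block 3, when block 7 replaces it); block 7 is still dirty in the cache at the end.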
• This section describes how to measure and analyze cache performance, along with two
different techniques for improving cache performance.
• One focuses on reducing the miss rate by reducing the probability that two different
• The second technique reduces the miss penalty by adding an additional level to the
• Memory-stall clock cycles come primarily from cache misses. Stalls generated by reads
• Memory-stall clock cycles can be defined as the sum of the stall cycles coming from reads
• The read-stall cycles can be defined in terms of the number of read accesses per
• Writes are more complicated. For a write-through scheme, we have two sources of stalls:
• Write misses, which usually require that we fetch the block before continuing the write.
• Write buffer stalls, which occur when the write buffer is full when a write occurs.
Average memory access time (AMAT) is the average time to access memory considering both hits
and misses.
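Average memory access time is commonly computed as the hit time plus the miss rate times the miss penalty. The Python sketch below applies that formula; the numbers (a 1-cycle hit time, 5% miss rate, and 20-cycle miss penalty) are made-up illustrative values, not measurements from these notes.

```python
def average_memory_access_time(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """AMAT = hit time + miss rate * miss penalty (all in the same time unit)."""
    return hit_time + miss_rate * miss_penalty


amat = average_memory_access_time(hit_time=1.0, miss_rate=0.05, miss_penalty=20.0)
print(f"AMAT = {amat:.2f} clock cycles")
```

With these assumed values the average access costs 2 clock cycles, twice the hit time, even though only 1 access in 20 misses.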
• Direct-mapped cache: A block can go in exactly one place in the cache. There is a direct
mapping from any block address in memory to a single location in the upper level of the
hierarchy.
Such a scheme is called fully associative, because a block in memory may be associated
with any entry in the cache. To find a given block in a fully associative cache, all the entries
in the cache must be searched because a block can be placed in any one.
• The middle range of designs between direct mapped and fully associative is called set
associative.
• Set-associative cache - there are a fixed number of locations where each block can be
set-associative cache.
• An n-way set-associative cache consists of a number of sets, each of which consists of n blocks.
associative placement: a block is directly mapped into a set, and then all the blocks in the
Figure below shows where block 12 may be placed in a cache with eight blocks total,
according to the three block placement policies: direct mapped, set associative, and fully associative.
In direct-mapped placement, there is only one cache block where memory block 12 can be
• In a two-way set-associative cache, there would be four sets, and memory block 12 must
be in set (12 mod 4)=0; the memory block could be in either element of the set.
• In a fully associative placement, the memory block for block address 12 can appear in any
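The three placements of block 12 can be checked with a short Python sketch. It treats direct mapped as 1-way and fully associative as 8-way set associative. The slot numbering (set s owns slots s*ways through s*ways + ways - 1) is an assumption made for illustration, and `candidate_slots` is a hypothetical helper name.

```python
def candidate_slots(block_address: int, num_blocks: int, ways: int) -> tuple:
    """Return (set index, list of cache slots) where the block may be placed.

    ways == 1 is direct mapped; ways == num_blocks is fully associative.
    """
    num_sets = num_blocks // ways
    set_index = block_address % num_sets   # (block address) modulo (number of sets)
    first_slot = set_index * ways          # assumed layout: sets stored back to back
    return set_index, list(range(first_slot, first_slot + ways))


for name, ways in (("direct mapped", 1), ("two-way set associative", 2), ("fully associative", 8)):
    set_index, slots = candidate_slots(12, num_blocks=8, ways=ways)
    print(f"{name}: set {set_index}, candidate slots {slots}")
```

Block 12 maps to the single slot 12 mod 8 = 4 when direct mapped, to either element of set 12 mod 4 = 0 in the two-way case, and to any of the eight slots when fully associative.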
VIRTUAL MEMORY:
The main memory can act as a “cache” for the secondary storage, usually implemented
Techniques that automatically move program and data blocks into the physical main
memory when they are required for execution are called virtual memory techniques.
The binary address that the processor issues either for instruction or data are called the
The virtual address is translated into physical address by a combination of hardware and
Management Unit).
When the desired data are in the main memory, these data are fetched/accessed
immediately.
If the data are not in the main memory, the MMU causes the Operating system to bring the
data into memory from the disk. Transfer of data between disk and main memory is
In address translation, all programs and data are composed of fixed length units called
Pages.
The Page consists of a block of words that occupy contiguous locations in the main
memory.
Virtual memory bridges the speed gap between main memory and secondary storage and it is
Each virtual address generated by the processor contains a virtual page number (high-order
bits) and an offset (low-order bits). The offset specifies the location of a particular byte
(or word) within a page.
Page Table: It contains the information about the main memory address where the page is
Page Frame: An area in the main memory that holds one page is called the page frame.
Page Table Base Register: It contains the starting address of the page table.
Virtual Page Number + Page Table Base Register -> gives the address of the corresponding
entry in the page table, i.e., it gives the starting address of the page if that page currently
resides in memory.
Function:
The control bit indicates the validity of the page, i.e., it checks whether the page is actually
It also indicates whether the page has been modified during its residency in the
memory; this information is needed to determine whether the page should be written back
to the disk before it is removed from the main memory to make room for another page.
The Page table information is used by MMU for every read & write access.
The Page table is placed in the main memory but a copy of the small portion of the page
This portion consists of the page table entries that correspond to the most recently
accessed pages and also contains the virtual address of the entry.
The figure below shows the translation of the virtual page number to a
The physical page number constitutes the upper portion of the physical address, while the
The number of bits in the page offset field determines the page size.
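The translation of a virtual address into a physical address can be sketched in Python as follows. This is only a sketch under stated assumptions: a 12-bit page offset (4 KiB pages), a page table modeled as a plain dictionary from virtual page number to physical page number, and a hypothetical `PageFault` exception standing in for a cleared valid bit.

```python
class PageFault(Exception):
    """Raised when the page table has no valid entry for a virtual page."""


def translate(virtual_address: int, page_table: dict, offset_bits: int = 12) -> int:
    """Split the address into page number and offset, then swap in the physical page."""
    page_size = 1 << offset_bits                      # offset width fixes the page size
    virtual_page_number = virtual_address >> offset_bits
    page_offset = virtual_address & (page_size - 1)
    if virtual_page_number not in page_table:         # models valid bit = 0
        raise PageFault(f"virtual page {virtual_page_number} is not in main memory")
    physical_page_number = page_table[virtual_page_number]
    # The physical page number forms the upper bits; the offset is carried over unchanged.
    return (physical_page_number << offset_bits) | page_offset


page_table = {0: 7, 1: 5}   # assumed mapping: virtual page -> physical page
print(hex(translate(0x1234, page_table)))
```

Here virtual address 0x1234 is page 1 with offset 0x234, so it translates to physical address 0x5234.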
In virtual memory systems, we locate pages by using a table that indexes the memory; this
structure is called a page table, and it resides in memory. Each program has its own page
table, which maps the virtual address space of that program to main memory.
A valid bit is used in each page table entry. If the bit is 0, the page is not present in main
memory and a page fault occurs. If the bit is 1, the page is in memory and the entry
Page Faults
If the valid bit for a virtual page is 0, a page fault occurs. The operating system must be
given control.
The operating system gets control, and it must find the page in the next level of the
hierarchy (usually flash memory or magnetic disk) and decide where to place the
The operating system usually creates the space on flash memory or disk for all the pages
of a process when it creates the process. This space is called the swap space.
Fig: Indicates a page fault and the data is brought from the disk storage
TRANSLATION-LOOKASIDE BUFFER (TLB)
Modern processors include a special cache that keeps track of recently used translations.
physical page number is used to form the address, and the corresponding reference bit is
turned on.
If a miss in the TLB occurs, we must determine whether it is a page fault or merely a TLB
miss. If the page exists in memory, then the TLB miss indicates only that the translation is
missing.
The processor can handle the TLB miss by loading the translation from the page table into
If the page is not present in memory, the TLB miss indicates a true page fault. In this case,
After a TLB miss occurs and the missing translation has been retrieved from the page
Because the reference and dirty bits are contained in the TLB entry, we need to copy these
Some systems use other techniques to approximate the reference and dirty bits,
eliminating the need to write into the TLB except to load a new table entry on a miss.
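The three possible outcomes of a translation (TLB hit, TLB miss with the page in memory, and a true page fault) can be sketched in Python. This sketch assumes the TLB and the page table are plain dictionaries keyed by virtual page number and that the TLB never fills up, so no replacement policy or reference/dirty bits are modeled; `lookup` and `PageFault` are hypothetical names.

```python
class PageFault(Exception):
    """Raised when the page is not present in main memory."""


def lookup(virtual_page_number: int, tlb: dict, page_table: dict) -> tuple:
    """Return (physical page number, outcome) for one translation request."""
    if virtual_page_number in tlb:
        return tlb[virtual_page_number], "TLB hit"
    if virtual_page_number in page_table:
        # Merely a TLB miss: load the translation from the page table and retry.
        tlb[virtual_page_number] = page_table[virtual_page_number]
        return tlb[virtual_page_number], "TLB miss"
    # The page is not in memory at all: the operating system must take over.
    raise PageFault(f"virtual page {virtual_page_number} is not in main memory")


tlb = {}
page_table = {3: 9}
print(lookup(3, tlb, page_table))   # first access misses in the TLB
print(lookup(3, tlb, page_table))   # second access hits
```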
A special control unit may be provided to allow the transfer of large block of data at high
speed directly between the external device and main memory, without continuous
DMA transfers are performed by a control circuit called the DMA Controller.
DMA Controller.
i) Starting address
ii) Number of words in the block
iii) Direction of transfer.
When a block of data is transferred, the DMA controller increments the memory address for
successive words and keeps track of the number of words, and it also informs the processor by
While the DMA transfer is taking place, the program that requested the transfer cannot continue and
After DMA transfer is completed, the processor returns to the program that requested the
transfer.
When R/W = 1, the DMA controller reads data from memory and transfers it to the I/O device.
o Done Flag=1, the controller has completed transferring a block of data and is ready to
o IE=1, it causes the controller to raise an interrupt (interrupt Enabled) after it has completed
A DMA controller connects a high-speed network to the computer bus, and the disk
controller for two disks also has DMA capability and provides two DMA channels.
To start a DMA transfer of a block of data from main memory to one of the disks, the
program writes the address and the word count information into the registers of the
When DMA transfer is completed, it will be recorded in status and control registers of the
Cycle Stealing:
Requests by DMA devices for using the bus have higher priority than processor
requests.
Top priority is given to high speed peripherals such as, Disk, High speed Network Interface
said to steal the memory cycles from the processor. This interweaving technique is called
Cycle stealing.
Burst Mode: The DMA controller may be given exclusive access to the main memory to
Bus Master: The device that is allowed to initiate data transfers on the bus at any given time
Bus Arbitration:
It is the process by which the next device to become the bus master is selected and the bus
ii)Distributed arbitration (all devices participate in the selection of next bus master).
Centralized Arbitration:
Here the processor is the bus master, and it may grant bus mastership to one of its DMA
controllers.
A DMA controller indicates that it needs to become the bus master by activating the Bus
The signal on BR is the logical OR of the bus requests from all devices connected to it. When
BR is activated, the processor activates the Bus Grant signal (BGI), indicating to the DMA
controllers that they may use the bus when it becomes free.
If a DMA controller is requesting the bus, it blocks the propagation of the Grant signal to other devices and it
indicates to all devices that it is using the bus by activating the open-collector line, Bus Busy
(BBSY).
Sequence of signals during transfer of bus mastership for the devices
The timing diagram shows the sequence of events for the devices connected to the
processor.
DMA controller 2 requests and acquires bus mastership and later releases the bus.
During its tenure as bus master, it may perform one or more data transfers.
Distributed Arbitration:
It means that all devices waiting to use the bus have equal responsibility in carrying out the
arbitration process.
they assert the Start-Arbitration signal and place their 4-bit ID numbers on four open-collector
A winner is selected as a result of the interaction among the signals transmitted over
these lines.
The net outcome is that the code on the four lines represents the request that has the
highest ID number.
The drivers are of the open-collector type. Hence, if the input to one driver is equal to 1 and the input to
another driver connected to the same bus line is equal to 0, the bus line is in the low-voltage
state.
Eg: Assume two devices A and B have IDs 5 (0101) and 6 (0110); the code seen on the lines is then 0111.
Each device compares the pattern on the arbitration lines to its own ID, starting from the MSB.
If it detects a difference at any bit position, it disables its drivers at that bit position and for all lower-order bits. It
In our example, A detects a difference on line ARB1; hence it disables its drivers on lines
ARB1 and ARB0. This causes the pattern on the arbitration lines to change to 0110, which
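The arbitration example above can be simulated with a short Python sketch. It models the four open-collector lines as a logical OR of whatever each device currently drives, and it assumes that a device which sees a 1 on a line where its own ID has a 0 stops driving that bit and all lower-order bits. The function name and the round-by-round settling loop are illustrative choices, not part of the original notes.

```python
def arbitrate(ids, width: int = 4) -> int:
    """Return the pattern left on the arbitration lines once they settle."""
    driven = {dev: dev for dev in ids}      # each device starts by driving its full ID
    lines = 0
    for _ in range(width + 1):              # one extra round to confirm nothing changes
        lines = 0
        for pattern in driven.values():
            lines |= pattern                # wired-OR of all drivers
        updated = {}
        for dev in ids:
            pattern = dev
            for bit in range(width - 1, -1, -1):        # compare starting from the MSB
                mask = 1 << bit
                if (lines & mask) and not (dev & mask):
                    # Difference detected: disable drivers at this bit and below.
                    pattern = dev & ~((mask << 1) - 1)
                    break
            updated[dev] = pattern
        if updated == driven:
            break
        driven = updated
    return lines


print(f"{arbitrate([5, 6]):04b}")   # devices A (0101) and B (0110)
```

For devices 5 and 6 the lines first show 0111, device A backs off on ARB1 and ARB0, and the lines settle at 0110, the higher ID.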
INTERRUPTS:
In program‐controlled I/O, when the processor continuously monitors the status of the
An alternate approach would be for the I/O device to alert the processor when it becomes
ready. The interrupt-request line sends a hardware signal, called the interrupt signal,
to the processor. On receiving this signal, the processor will perform the useful function
Routine. An interrupt resembles a subroutine call. The interrupt request uses a line in
After the execution of ISR, the processor has to come back to instruction i + 1.
Therefore, when an interrupt occurs, the current contents of the PC, which point to i + 1, are put in
A return from interrupt instruction at the end of ISR reloads the PC from that temporary
When the processor is handling the interrupts, it must inform the device that its request
This may be accomplished by a special control signal called the interrupt acknowledge
signal.
The task of saving and restoring the information can be done automatically by the
processor.
The processor saves only the contents of program counter & status register (ie) it saves
only the minimal amount of information to maintain the integrity of the program execution.
Saving registers also increases the delay between the time an interrupt request is received
and the start of the execution of the ISR. This delay is called the Interrupt Latency.
Generally, a long interrupt latency is unacceptable. The concept of interrupts is used in
operating systems and in control applications, where processing of certain routines must
be accurately timed relative to external events. This kind of application is also called real-time
processing.
Interrupt Hardware:
Fig: An equivalent circuit for an open-drain bus used to implement a common interrupt-request
line.
A single interrupt request line may be used to serve n devices.
All devices are connected to the line via switches to ground. To request an interrupt, a
device closes its associated switch, and the voltage on the INTR line drops to 0 (zero).
If all the interrupt request signals (INTR1 to INTRn) are inactive, all switches are open and
When a device requests an interrupt, the value of INTR is the logical OR of the requests
INTR -> the name used for the interrupt-request signal on the common line, which is active in the low-voltage state.
Open-collector (bipolar circuits) or open-drain (MOS circuits) gates are used to drive the INTR line.
The output of an open-collector or open-drain gate is equivalent to a switch to ground
that is open when the gate's input is in the 0 state and closed when the gate's input is in the 1
state.
Resistor R is called a pull-up resistor because it pulls the line voltage up to the high
The arrival of an interrupt request from an external device causes the processor to suspend
the execution of one program & start the execution of another because the interrupt may
Interrupts are disabled by changing the control bits in the PS (Processor Status register).
The device is informed that its request has been recognized and, in response, it deactivates the
INTR signal.
Edge-triggered:
The processor has a special interrupt request line for which the interrupt handling circuit
responds only to the leading edge of the signal. Such a line is said to be edge-triggered.
When several devices request interrupts at the same time, some questions arise:
o Given that the different devices are likely to require different ISR, how can the processor
o Should a device be allowed to interrupt the processor while another interrupt is being
serviced?
Polling Scheme:
If two devices have activated the interrupt request line, the ISR for the selected device (first
device) will be completed & then the second request can be serviced.
The simplest way to identify the interrupting device is to have the ISR poll all the
IRQ (Interrupt Request) -> when a device raises an interrupt request, the IRQ bit in its status register
is set to 1.
Merit:
It is easy to implement.
Demerit:
Time is spent interrogating the IRQ bits of all the devices, including those that may not be requesting
any service.
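The polling scheme can be sketched in Python as a loop over the devices' status registers. The sketch assumes, purely for illustration, that the IRQ flag is bit 0 of each status register and that devices are polled in a fixed order; neither detail is specified in these notes, and `poll_devices` is a hypothetical name.

```python
IRQ_BIT = 0x01  # assumed position of the IRQ flag within a status register


def poll_devices(status_registers):
    """Return the index of the first device whose IRQ bit is set, or None."""
    for device_index, status in enumerate(status_registers):
        if status & IRQ_BIT:
            return device_index   # this device is serviced first
    return None                   # no device is requesting service


# Devices 2 and 3 both request service; polling order selects device 2.
print(poll_devices([0x00, 0x00, 0x01, 0x01]))
```

The loop also shows the demerit noted above: devices 0 and 1 are interrogated even though they are not requesting service.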
Vectored Interrupt:
Here the device requesting an interrupt may identify itself to the processor by sending a
special code over the bus, and then the processor starts executing the ISR.
The code supplied by the device indicates the starting address of the ISR for the device.
The code length ranges from 4 to 8 bits. The location pointed to by the interrupting device
The processor reads this address, called the interrupt vector, and loads it into the PC.
The interrupt vector also includes a new value for the Processor Status Register.
When the processor is ready to receive the interrupt vector code, it activates the interrupt
Interrupt Nesting:
In multiple level priority scheme, we assign a priority level to the processor that can be
The priority level of the processor is the priority of the program that is currently being
executed.
The processor accepts interrupts only from devices that have priorities higher than its own.
At the time the execution of an ISR for some device is started, the priority of the processor
The action disables interrupts from devices at the same level of priority or lower.
Privileged Instruction:
The processor priority is usually encoded in a few bits of the Processor Status word.
It can also be changed by program instructions that write into the PS. These instructions
The processor is in supervisor mode only when executing OS routines. It switches to the
A user program cannot accidentally or intentionally change the priority of the processor and
An attempt to execute a privileged instruction while in user mode, leads to a special type of
lines
Interrupt request received over these lines are sent to a priority arbitration circuit in the
processor.
A request is accepted only if it has a higher priority level than that currently assigned to the
processor.
Simultaneous Requests:
Daisy Chain:
The interrupt request line INTR is common to all devices.
The interrupt acknowledge line INTA is connected in a daisy chain fashion such that INTA
When several devices raise an interrupt request, INTR is activated and the processor
Device 1 passes the signal on to device 2 only if it does not require any service.
If device 1 has a pending interrupt request, it blocks the INTA signal and proceeds to put
its identification code on the data lines. Therefore, the device that is electrically closest to
Merits:
Here the devices are organized in groups & each group is connected at a different priority
At the devices end, an interrupt enable bit in a control register determines whether the
At the processor end, either an interrupt enable bit in the PS (Processor Status) or a priority
Exception of ISR:
Read the input characters from the keyboard input data register. This will cause the
Store the characters in a memory location pointed to by PNTR & increment PNTR.
When the end of line is reached, disable keyboard interrupt & inform program main.
A standard I/O Interface is required to fit the I/O device with an Interface circuit.
The processor bus is the bus defined by the signals on the processor chip itself.
The devices that require a very high speed connection to the processor such as the main
A bridge connects two buses and translates the signals and protocols of one bus into
those of the other.
The bridge circuit introduces a small delay in data transfer between processor and the
devices.
SCSI INTERFACE
SCSI is available in a variety of interfaces. The first, still very common, was parallel SCSI
SCSI interfaces have often been included on computers from various manufacturers for
use under Microsoft Windows, Mac OS, Unix, Commodore Amiga and Linux operating
Short for Small Computer System Interface, SCSI is pronounced "scuzzy" and is one of
the most commonly used interfaces for disk drives; it was first completed in 1982.
SCSI-1 is the original SCSI standard developed back in 1986 as ANSI X3.131-1986. SCSI-1 is
SCSI-2 was approved in 1990, added new features such as Fast and Wide SCSI, and support
SCSI is a standard for parallel interfaces that transfers information eight bits at a time
or wider, which makes it faster than the average parallel interface. SCSI-2 and above
support up to seven peripheral devices, such as a hard drive, CD-ROM, and scanner, that
can attach to a single SCSI port on a system's bus. SCSI ports were designed for Apple
Macintosh and Unix computers, but also can be used with PCs. Although SCSI has been
popular in the past, today many users are switching over to SATA drives.
SCSI connectors
The below illustrations are examples of some of the most commonly found and used SCSI
SCSI is used for connecting additional devices both inside and outside the computer box.
SCSI bus is a high speed parallel bus intended for devices such as disk and video display.
SCSI refers to the standard bus which is defined by ANSI (American National Standard
Institute).
Because of these various options, SCSI connector may have 50, 68 or 80 pins.
The data transfer rate ranges from 5 MB/s to 160 MB/s, 320 MB/s, and 640 MB/s.
To achieve a high transfer rate, the bus length should be at most 1.6 m for SE signaling and 12 m for
LVD signaling.
The SCSI bus is connected to the processor bus through the SCSI controller.
Each sector contains several hundreds of bytes. These data will not be stored in
SCSI protocol is designed to retrieve the data in the first sector or any other selected
sectors. Using SCSI protocol, the burst of data are transferred at high speed.
Initiator
Target
Initiator:
It has the ability to select a particular target & to send commands specifying the operation to
be performed.
Target:
It carries out the commands it receives from the initiator. The initiator establishes a logical
Steps:
Consider a disk read operation; it has the following sequence of events.
The SCSI controller, acting as initiator, contends for control of the bus. When it wins the arbitration process, it selects the target controller and
The target starts an output operation; in response to this, the initiator sends a command
The target, realizing that it needs to perform a disk seek operation, sends a message to the initiator
The target controller sends a command to disk drive to move the read head to the first
sector involved in the requested read in a data buffer. When it is ready to begin transferring
data to initiator, the target requests control of the bus. After it wins arbitration, it reselects
connection again. Data are transferred either 8 (or) 16 bits in parallel depending on the
As the initiator controller receives the data, it stores them into main memory using the DMA
approach.
The SCSI controller sends an interrupt to the processor to inform it that the requested
Bus Signals:-
Instead, it has data lines to identify the bus controllers involved in the selection /
Once a connection is established between two controllers, there is no further need for
- SEL Selection
- MSG Message
- ACK Acknowledge
- RST Reset.
PCI:
PCI has plug and play capability for connecting I/O devices.
To connect new devices, the user simply connects the device interface board to the bus.
Data Transfer:
Data are transferred between the cache and main memory in bursts of several words
When the processor specifies an address and requests a read operation from memory,
the memory responds by sending a sequence of data words starting at that address.
During a write operation, the processor sends the address followed by a sequence of data
A read / write operation involving a single word is treated as a burst of length one.
Configuration space → It is intended to give PCI its plug-and-play capability.
The master maintains the address information on the bus until data transfer is completed.
The addressed device that responds to read and write commands is called a target.
A complete transfer operation on the bus, involving an address and a burst of data, is called a
transaction.
USB is used for connecting additional devices both inside and outside the computer box.
USB uses a serial transmission to suit the needs of equipment ranging from keyboard-
The USB has been designed to meet several key objectives:
It provides a simple, low-cost, and easy-to-use interconnection system that overcomes the
It accommodates a wide range of data transfer characteristics for I/O devices including
Port Limitation:-
To add new ports, the user must open the computer box to gain access to the internal
The user may also need to know how to configure the device and the software.
Merits of USB:-
USB helps to add many devices to a computer system at any time without opening the
computer box.
Device Characteristics:-
The kinds of devices that may be connected to a computer cover a wide range of functionality.
The speed, volume, and timing constraints associated with data transfer to and from devices vary
significantly.
Eg:1 Keyboard ->Since the event of pressing a key is not synchronized to any other event in a
The rate of data generated by a keyboard depends upon the speed of the human operator, which
is about 100 bytes/sec.
The main objective of USB is that it provides a plug & play capability.
The plug & play feature allows a new device to be connected at any time, while the
system is operating.
USB Architecture:-
USB has a serial bus format which satisfies the low-cost & flexibility requirements.
Clock & data information are encoded together & transmitted as a single signal.
There are no limitations on clock frequency or distance arising from data skew, & hence it
is possible to provide a high data transfer bandwidth by using a high clock frequency.
To accommodate a large number of devices that can be added / removed at any time, the USB
At the root of the tree, the root hub connects the entire tree to the host computer.
The leaves of the tree are the I/O devices being served.