
Unit4 Memory Array Structures (Part1)

The document discusses various memory array structures in CMOS systems, categorizing them into types such as RAM, ROM, SRAM, DRAM, and non-volatile memories like EEPROM and Flash. It explains the characteristics, advantages, and disadvantages of each memory type, including their access methods and operational principles. Additionally, it covers the design of memory cells, read/write operations, and the role of sense amplifiers and content-addressable memory (CAM) in memory systems.

Uploaded by

madhum.lvs25
Copyright
© All Rights Reserved

Unit – IV: Memory Array Structures Design & Interconnects

Memory arrays account for the majority of transistors in a CMOS
system-on-chip. Arrays may be divided into the following categories:
· Random access memory is accessed with an address and has a latency
independent of the address.
· Serial access memories are accessed sequentially, so no address is
necessary.
· Content addressable memories determine which address(es) contain
data that matches a specified key.

RAM is volatile memory, which retains its data only as long as power is
applied.
ROM is nonvolatile memory, which holds data indefinitely.
Memory cells used in volatile memories can further be divided into static
structures and dynamic structures.
Static cells use some form of feedback to maintain their state,
while dynamic cells use charge stored on a floating capacitor through an
access transistor. Charge will leak away through the access transistor
even while the transistor is OFF,
so dynamic cells must be periodically read and rewritten to refresh their
state.
Static RAMs (SRAMs) are faster and less troublesome, but require
more area per bit than their dynamic counterparts (DRAMs).
Some nonvolatile memories are read-only. The contents of a mask
ROM are hardwired during fabrication and cannot be changed.
But many nonvolatile memories are programmable.
A programmable ROM (PROM) can be programmed once after
fabrication by blowing on-chip fuses with a special high programming
voltage.
An erasable programmable ROM (EPROM) is programmed by storing
charge on a floating gate. It can be erased by exposure to ultraviolet
(UV) light for several minutes to knock the charge off the gate. Then the
EPROM can be reprogrammed.
Electrically erasable programmable ROMs (EEPROMs) are similar, but
can be erased in microseconds with on-chip circuitry.
Flash memories are a variant of EEPROM that erases entire blocks
rather than individual bits.
Because of their good density and easy in-system reprogrammability,
Flash memories have replaced other nonvolatile memories in most
modern CMOS systems.
Read-only memory

· ROM is read-only memory: data at any memory location can only be read.
· It is non-volatile memory, i.e. the contents are retained after power
is switched off and are available again when it is switched on.
· Data access to ROM is slow compared to RAM.
· Traditionally written to ("programmed") before being inserted into the embedded
system.
Uses:
– Store software program for general-purpose processor
– Store constant data needed by system
– Implement combinational circuit
Fig: Different approaches for implementing 1 and 0 ROM cells (left to right: diode ROM, MOS ROM 1, MOS ROM 2; upper cells store 1, lower cells store 0)

The figure shows ROM cells. A cell should be designed such that a 0 or 1 is presented to the
bit line upon activation of its word line. In the diode ROM, BL is connected to GND through a
resistor, so it is low by default.
The presence or absence of a diode between WL and BL differentiates between ROM
cells storing 1 or 0, respectively.
The disadvantage of the diode cell is that it does not isolate the bit line from the word line.
Another approach is to use an active NMOS transistor whose drain is connected to the supply
voltage. Here all output driving current is provided by the MOS transistor in the cell.

To store logic 1, connect a transistor in the respective bit position.
To store logic 0, leave the transistor out of the respective bit position.
Initially all bit lines are 0, held low by the pull-down loads (biased by Vbias).
When a word line goes high, every column that has a transistor on that word line reads logic 1;
a column with no transistor on that word line reads 0.

Fig: MOS 4x4 OR ROM cell array (word lines WL[0] to WL[3], bit lines BL[0] to BL[3])
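The read rule for the OR ROM can be sketched behaviourally in Python. This is only a logic-level illustration, not a circuit simulation; the transistor placement in `TRANSISTORS` is made up for the example and is not taken from the figure, and the function name is hypothetical.

```python
# Behavioural sketch of a 4x4 MOS OR ROM (logic level only).
# TRANSISTORS[w][b] is True where a transistor connects WL[w] to BL[b].
# The placement below is illustrative; it is NOT read from the figure.
TRANSISTORS = [
    [True,  False, False, True],    # WL[0]
    [False, True,  True,  False],   # WL[1]
    [True,  False, True,  False],   # WL[2]
    [True,  True,  True,  True],    # WL[3]
]


def read_or_rom(word_lines):
    """Return the bit-line values for the given word-line levels (0 or 1)."""
    n_bits = len(TRANSISTORS[0])
    bit_lines = [0] * n_bits            # pull-down loads hold every BL at 0
    for w, active in enumerate(word_lines):
        if not active:
            continue
        for b in range(n_bits):
            if TRANSISTORS[w][b]:       # a transistor pulls its bit line high
                bit_lines[b] = 1
    return bit_lines


print(read_or_rom([1, 0, 0, 0]))        # only WL[0] is raised
```

Raising more than one word line at once returns the bitwise OR of the selected rows, which is where the "OR ROM" name comes from.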


Fig: 4x4 MOS NOR ROM (word lines WL[0] to WL[3], bit lines BL[0] to BL[3], pull-up devices to VDD)


Fig: Dot diagram representation of ROM
Fig: Pseudo-nMOS OR ROM
ROM contents: word0: 010101, word1: 011001, word2: 100101, word3: 101010
Fig: 4x4 MOS NAND ROM (word lines WL[0] to WL[3], bit lines BL[0] to BL[3], pull-up devices to VDD)

Here the word lines are active-low signals: all word lines are normally high, and the selected
word line is driven low. To store logic 1, connect a transistor; to store logic 0, no transistor is required.
Fig: Pseudo-nMOS AND ROM

When a word line is 0 (active low): if a transistor is connected, the output of the respective
inverter is 0; if there is no transistor, the output of the respective inverter is 1.
EPROM: Erasable programmable ROM
Erasable Programmable Read-Only Memory (EPROM) gives the flexibility to
reprogram the same chip.
• It can be erased and reprogrammed repeatedly, as the name suggests.
• The erase operation in an EPROM is performed by exposing the chip to a
source of ultraviolet light.
• The reprogramming ability makes EPROM an essential part of the software
development and testing process.
• Better write ability
– can be erased and reprogrammed thousands of times
• Typically used during design development
EEPROMs
• EEPROM stands for Electrically Erasable and Programmable ROM.
• It is the same as EPROM, but the erase operation is performed electrically.
Any byte in an EEPROM can be erased and rewritten as desired.
• Better write ability
– can be in-system programmable, with built-in circuitry to provide the higher-than-normal voltage
– writes are very slow because each one involves erasing and programming
– a "busy" pin indicates to the processor that the EEPROM is still writing
– can be erased and programmed tens of thousands of times
• Similar storage permanence to EPROM (about 10 years)
• Far more convenient than EPROMs, but more expensive
Flash Memory
• Flash memory is the most recent advancement in memory technology.
Flash memory devices are high density, low cost, nonvolatile, fast (to
read, but not to write), and electrically reprogrammable.
• Flash is much more popular than EEPROM and is rapidly displacing
many of the ROM devices.
• Flash devices can be erased one sector at a time, not byte by byte; i.e.,
large blocks of memory are erased at once, rather than one word at a time. The
blocks are typically several thousand bytes large.
Memory – RAM – SRAM Vs DRAM

SRAM Cell                                        | DRAM Cell
Made up of 6 CMOS transistors (MOSFETs)          | Made up of a MOSFET and a capacitor
Doesn't require refreshing                       | Requires refreshing
Low capacity (less dense)                        | High capacity (highly dense)
More expensive                                   | Less expensive
Fast in operation. Typical access time is 10 ns. | Slow in operation due to refresh requirements. Typical access time is 60 ns. Write operation is faster than read operation.
RAM: “Random-access” memory
• Typically volatile memory

– bits are not held without power supply

• Read and written to easily by embedded system during execution

• Internal structure more complex than ROM

– a word consists of several memory cells, each storing 1 bit

– each input and output data line connects to each cell in its
column

– rd/wr connected to every cell

– when row is enabled by decoder, each cell has logic that stores
input data bit or outputs stored bit depending on rd/wr control.
SRAM
The figure shows the 6-transistor (6T) SRAM cell commonly used in practice.
Such a cell uses a single wordline and both true and complementary
bitlines. The complementary bit-line is often called bit_b. The cell
contains a pair of cross-coupled inverters and an access transistor for
each bitline. True and complementary versions of the data are stored
on the cross-coupled inverters. If the data is disturbed slightly, positive
feedback around the loop will restore it to VDD or GND. The wordline
is asserted to read or write the cell.

Fig: 6-transistor SRAM cell


For reads, the bitlines are initially precharged high and one is pulled
down by the SRAM cell through the access transistor. For writes, the
bitline or its complement is actively driven low and this low value
overpowers the cell to write the new value. Careful choice of transistor
sizes is necessary for correct operation.

The 6T cell achieves its compactness at the expense of more complex
peripheral circuitry for reading and writing the cells. This is a good
tradeoff in large RAM arrays where the cell size dominates the area.
The small cell size also offers shorter wires and hence lower power
consumption.
SRAM operation is divided into two phases. The phases will be called
Φ1 and Φ2, but may actually be generated from clk and its
complement clkb.
Assume that in phase 2, the SRAM is precharged. In phase 1, the
SRAM is written or read by raising the appropriate wordline and
either driving the bitlines to the value that should be written or leaving
the bitlines floating and observing which one is pulled down.
Reading a large SRAM can be slow because the capacitance of all the
cells sharing the bitline is large. Sense amplifiers accelerate reads by
detecting small differences between the bitline and its complement.
The following sections discuss the role of each block in the SRAM
operation:
Memory Cell Read/Write Operation

Fig: Read operation for 6T SRAM cell
Fig x1: SRAM column read
The figure shows an SRAM cell being read. Both bitlines are initially
precharged high. Assume A is initially '0' and thus A_b is initially
'1.' A_b and bit_b both should remain '1'. When the wordline is raised,
bit should be pulled down through transistors N1 and N2. At the same
time bit is being pulled down, node A tends to rise. A is held low by N1,
but raised by current flowing in from N2. Hence, N1 must be stronger
than N2. Specifically, the transistors must be ratioed such that node A
remains below the switching threshold of the P2/N3 inverter. This
constraint is called read stability.
Waveforms for the read operation are shown in Figure (b) as a 0 is
read onto bit. Observe that A momentarily rises, but does not glitch
badly enough to flip the cell.
Figure x1 shows the same cell in the context of a full column from the
SRAM. During phase 2, the bitlines are precharged high.
Many SRAM cells share the same bitline pair, which acts as a
distributed dual-rail footless dynamic multiplexer.

Fig: Write operation for 6T SRAM cell
Fig: SRAM column write
The waveforms of the figure show the SRAM cell being written. Again,
assume A is initially '0' and that we wish to write a '1' into the cell. bit
is precharged high and left floating. bit_b is pulled low by a write
driver. We know on account of the read stability constraint that bit will
be unable to force A high through N2. Hence, the cell must be written
by forcing A_b low through N4. P2 opposes this operation; thus, P2
must be weaker than N4 so that A_b can be pulled low enough. This
constraint is called writeability.
Once A_b falls low, N1 turns OFF and P1 turns ON, pulling A high as
desired.
DRAM
Dynamic RAMs (DRAMs) store their contents as charge on a capacitor
rather than in a feedback loop. Thus, the basic cell is substantially
smaller than SRAM, but the cell must be periodically read and
refreshed so that its contents do not leak away.

Fig:1T DRAM cell read operation

A 1-transistor (1T) dynamic RAM cell consists of a transistor and a
capacitor, as shown in Figure (a). Like SRAM, the cell is accessed by
asserting the wordline to connect the capacitor to the bitline.
On a read, the bitline is first precharged to VDD/2. When the wordline
rises, the capacitor shares its charge with the bitline, causing a voltage
change ∆V that can be sensed, as shown in Figure(b).

The read disturbs the cell contents at x, so the cell must be rewritten
after each read.
On a write, the bitline is driven high or low and the voltage is forced onto
the capacitor. Some DRAMs drive the wordline to VDDP = VDD + Vt to
avoid a degraded level when writing a '1.'
The DRAM capacitor Ccell must be as physically small as possible to achieve good
density. However, the bitline is contacted to many DRAM cells and has
a relatively large capacitance Cbit.
Therefore, the cell capacitance is typically much smaller than the bitline
capacitance.
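Because Ccell is much smaller than Cbit, the read swing ∆V is only a small fraction of VDD. Charge conservation between the cell and a bitline precharged to VDD/2 gives ∆V = (Vcell − VDD/2) · Ccell / (Ccell + Cbit). The sketch below evaluates this; the supply voltage and capacitance values are assumed purely for illustration, and the function name is hypothetical.

```python
def bitline_swing(v_cell, vdd, c_cell, c_bit):
    """Voltage change on a bitline precharged to VDD/2 after charge sharing."""
    v_pre = vdd / 2
    # Total charge is conserved when the access transistor connects the two capacitors.
    v_final = (c_cell * v_cell + c_bit * v_pre) / (c_cell + c_bit)
    return v_final - v_pre


# Assumed example values (not from the slides): VDD = 1.0 V, Ccell = 30 fF, Cbit = 300 fF.
VDD, C_CELL, C_BIT = 1.0, 30e-15, 300e-15

dv_one = bitline_swing(VDD, VDD, C_CELL, C_BIT)    # cell stores a '1' (full VDD assumed)
dv_zero = bitline_swing(0.0, VDD, C_CELL, C_BIT)   # cell stores a '0'
print(f"dV for stored 1: {dv_one * 1e3:+.1f} mV")
print(f"dV for stored 0: {dv_zero * 1e3:+.1f} mV")
```

With these numbers the swing is only a few tens of millivolts in either direction, which is why DRAM reads need a sense amplifier.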
3-Transistor DRAM Cell
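Fig: 3-transistor DRAM cell (transistors M1, M2, M3; storage capacitor CS at node X; write word line WWL; read word line RWL; bit lines BL1 and BL2), with waveforms for WWL, RWL, X, BL1, and BL2 (levels VDD − VT and swing ∆V marked)
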

The 3-transistor DRAM cell gives the inverted value of the stored data.
Write: When the write word line (WWL) is high, the value on bit line 1 (BL1) is stored on Cs
(if BL1=1 then Cs=1; if BL1=0 then Cs=0).
When WWL is low, the capacitor retains its value.
Read: When the read word line (RWL) is high, M2 is on or off depending on the value stored on Cs.
The precharged BL2 is connected to GND if Cs=1, or retains its high value if Cs=0
(i.e. if RWL=1 and Cs=1, then M2=M3=on and BL2=0; if RWL=1 and Cs=0, then M2=off, M3=on,
and BL2=1 because it was precharged).
Sense Amplifiers
A differential amplifier takes small-signal differential inputs (the bit line
voltages) and amplifies them to a large-signal single-ended output.

Initially the sense amplifier enable (SE) is low (disabling the sense amplifier circuit), and BL
and BLb are precharged high.
The inputs are fed to the differential input devices (M1, M2), and transistors M3 and M4
act as an active current-mirror load. Once the read operation is initiated, one of the
bitlines drops and SE is enabled. When a differential signal has been established, the
amplifier evaluates and gives the single-ended output.
Fig: Differential sensing as applied to an SRAM memory column (input devices M1 and M2 driven by bit and its complement, current-mirror load M3 and M4, enable transistor M5 gated by SE, output Out)

The figure shows a two-stage sensing approach along with the SRAM bit column structure. The
bit lines are connected to the inputs x and xb of the amplifier.
Content-Addressable Memory (CAM)

Fig (a): Content-addressable memory
Fig (b): Translation Lookaside Buffer (TLB) using CAM

The CAM acts as an ordinary SRAM that can be read or written given adr and data,
but also performs matching operations. Matching asserts a matchline output for each
word of the CAM that contains a specified key.
Fig: 9T CAM cell (SRAM cell storing S and its complement, with transistors M1 to M9, internal node int, and a wired-NOR match line shared by the CAM cells of a word)

Figure (a) shows the symbol for a content-addressable memory (CAM).
A common application of CAMs is translation lookaside buffers (TLBs) in
microprocessors supporting virtual memory.
The virtual address is given as the key to the TLB CAM. If this address is in the
CAM, the corresponding matchline is asserted. This matchline can serve as the
wordline to access a RAM containing the associated physical address.
The figure shows a 9T CAM cell consisting of a normal SRAM cell with additional
transistors to perform the match. The matchline is either precharged or pulled
high as a distributed pseudo-nMOS gate. The key is placed on the bitlines. If
the key and the value stored in the cell differ, the matchline will be pulled
down. Only if all of the key bits match all of the bits stored in the word of
memory will the matchline for that word remain high.

For example, with the matchline initially precharged to Vdd:
· When S and the incoming key on BL match (S=1, BL=1, BLb=0), the internal
node int=0 (M2 on, M3 off), so M1 is turned off and the matchline remains high.
· When S and the incoming key on BL differ (S=0, BL=1, BLb=0), the internal
node int=1 (M2 off, M3 on), so M1 is turned on and the matchline is connected to GND.
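The matching behaviour, and its use in a TLB, can be sketched at the logic level in Python. The stored words, the RAM contents, and the function names below are hypothetical illustrations; the sketch models only the wired-NOR matchline rule, not the transistor circuit.

```python
def match_lines(stored_words, key):
    """Return one flag per stored word; True means its matchline stays high."""
    flags = []
    for word in stored_words:
        matchline = True                        # precharged high
        for stored_bit, key_bit in zip(word, key):
            if stored_bit != key_bit:           # a mismatching cell turns its pull-down on
                matchline = False               # wired-NOR: one cell is enough to pull it low
        flags.append(matchline)
    return flags


# Hypothetical TLB: virtual page numbers in the CAM, physical page numbers in a RAM.
CAM = ["1010", "0111", "1100"]
RAM = ["0001", "0010", "0011"]


def tlb_lookup(key):
    """Use each matchline as the wordline of the RAM row holding the translation."""
    for matched, physical in zip(match_lines(CAM, key), RAM):
        if matched:
            return physical
    return None                                 # no matchline stayed high: a miss


print(tlb_lookup("0111"))
print(tlb_lookup("0000"))
```

In hardware all words are compared in parallel; the loop here is only a sequential stand-in for that parallel comparison.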
10T CAM cell

Fig (c): 10T CAM cell implementation

Figure (c) shows a 10T CAM cell consisting of a normal SRAM cell with
additional transistors to perform the match. Multiple CAM cells in the
same word are tied to the same matchline.
Figure below shows a complete 4 × 4 CAM array. Like an SRAM, it consists
of an array of cells, a decoder, and column circuitry. However, each row also
produces a dynamic matchline. The matchlines are precharged with the
clocked pMOS transistors. The miss signal is produced with a distributed
pseudo-nMOS NOR.

Fig: 4 × 4 CAM array


Problem 1: Implement 2-word by 6-bit Read-Only Memory (ROM) cells using
pseudo-nMOS pull-ups with the following contents: word0: 010101, word1:
011001.

Problem 2: Implement a 4-word by 6-bit NOR ROM using pseudo-nMOS pull-ups
with the following contents:
word0: 011101
word1: 010001
word2: 100001
word3: 101110
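As a starting point for Problem 2, the transistor placement can be derived mechanically from the word contents. The sketch below assumes the bit lines are read directly, with no output inverters, so a pull-down transistor is placed wherever the stored bit is 0; if the design adds output inverters, the placement is complemented. Function names are hypothetical.

```python
WORDS = ["011101", "010001", "100001", "101110"]   # contents from Problem 2


def nor_rom_layout(words):
    """True marks a pull-down transistor (stored 0).

    Assumption: the bit line is read directly, with no output inverter.
    """
    return [[bit == "0" for bit in word] for word in words]


def read_nor_rom(layout, selected):
    """Raise word line `selected`; a transistor there pulls its bit line low."""
    return "".join("0" if has_transistor else "1" for has_transistor in layout[selected])


layout = nor_rom_layout(WORDS)
for w, row in enumerate(layout):
    # 'T' = transistor between bit line and GND, '.' = no transistor
    print(f"WL[{w}]:", " ".join("T" if t else "." for t in row))
```

Reading each row back with `read_nor_rom` should reproduce the original word, which is a quick check on the placement.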
Interconnects

Interconnect introduces three types of parasitic effects:
1. Capacitive,
2. Resistive, and
3. Inductive,
all of which influence the signal integrity and degrade the performance
of the circuit. Here we analyze how interconnect affects the circuit
operation.
(Signal integrity, or SI, is a set of measures of the quality of an
electrical signal.)
1. Capacitive Parasitics

Capacitance and Reliability—Cross Talk

Cross talk is noise induced by one signal that interferes with
another signal;
i.e., an unwanted coupling from a neighboring signal wire to a
network node introduces an interference that is generally called cross
talk.
In integrated circuits, this intersignal coupling can be both
capacitive and inductive.
The impact of capacitive crosstalk is influenced by the impedance of the line
under examination. If the line is floating, the disturbance caused by the
coupling persists and may be worsened by subsequent switching on adjacent
wires.

Floating Lines

Fig: Capacitive coupling to a floating line.
Consider the circuit configuration of the
figure. Line X is coupled to wire Y by a
parasitic capacitance CXY. Line Y sees a
total capacitance to ground equal to CY.
Assume that the voltage at node X
experiences a step change equal to ∆VX.
This step appears on node Y attenuated by
the capacitive voltage divider.
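For a floating victim, the capacitive divider gives ∆VY = ∆VX · CXY / (CY + CXY). A small sketch with assumed component values (the numbers and the function name are illustrative only):

```python
def floating_victim_step(delta_vx, c_xy, c_y):
    """Step seen on floating line Y when aggressor X steps by delta_vx."""
    return delta_vx * c_xy / (c_y + c_xy)


# Assumed values: a 2.5 V swing on X, CXY = 5 fF, CY = 45 fF.
dv_y = floating_victim_step(2.5, 5e-15, 45e-15)
print(f"Disturbance on Y: {dv_y:.2f} V")
```

The disturbance grows with the ratio of coupling capacitance to total capacitance, so lightly loaded floating nodes next to long parallel wires are the most exposed.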
Circuits that are particularly susceptible to capacitive cross talk are networks
with low-swing precharged nodes located adjacent to full-swing wires
(with ∆VX = VDD).
Examples are dynamic memories, low-swing on-chip busses, and some dynamic
logic families.

Driven Lines
If the line Y is driven with a resistance
RY, a step on line X results in a transient
on line Y. The transient decays with a
time constant tXY = RY(CXY+CY). The
actual impact on the "victim line" is a
strong function of the rise (fall) time of
the interfering signal.
Fig: Capacitive coupling to a driven line
If the rise time is larger than the time constant, the peak value of the disturbance
is reduced. This is illustrated in the waveforms of the figure.
Obviously, keeping the driving impedance of a wire—and hence tXY —low
goes a long way towards reducing the impact of capacitive cross talk.
The keeper transistor, added to a dynamic gate or precharged wire, is an
excellent example of how impedance reduction helps to control noise.

Fig: Voltage response for different rise times of Vx
Design Techniques—Dealing with Capacitive Cross Talk

There are several approaches to dealing with cross talk between capacitive
transmission lines:
1. Wherever possible, signals on adjacent wiring layers should be routed in
perpendicular directions to minimize the vertical coupling
(i.e., if the traces on one layer run "from north to south", the
traces on the adjacent layer should run "from east to west").

2. Floating signals should be avoided, and keeper devices should be placed
on dynamic signals to restore values after a disturbance.
3. Signal rise time should be made as long as possible, to minimize the
effect of cross talk on driven nodes.

4. If signals are sent differentially (e.g., like the bit lines in a SRAM), cross talk
can be made common-mode by routing the true and complement lines close to
each other and (optionally) periodically reversing their positions.
5. Sensitive signals (e.g., those that operate at a low voltage) should be well
separated from full-swing signals to minimize the capacitive coupling from
signals with large ΔV.
6. In extreme cases, a sensitive signal can be shielded by placing conductors
above, below, and on either side of it that are tied to the reference supply (Vp or
GND, depending on the signal) at a single point.
7. The interwire capacitance between signals on different layers can be further
reduced by the addition of extra routing layers.

Fig: Cross section of routing layers, illustrating the use of shielding to
reduce capacitive cross talk
Capacitance and Performance in CMOS

Cross Talk and Performance

The circuit schematic of the figure illustrates how capacitive cross talk may
result in a data-dependent variation of the propagation delay.

Fig: Impact of cross talk on propagation delay.
Assume that the inputs to the three parallel wires X, Y, and Z experience simultaneous
transitions. Wire Y (called the victim wire) switches in a direction that is opposite to the
transitions of its neighboring signals X and Z. The coupling capacitances experience a
voltage swing that is double the signal swing, and hence represent an effective capacitive
load that is twice as large as Cc— by the well-known Miller effect. Since the coupling
capacitance represents a large fraction of the overall capacitance in the deep-submicron
dense wire structures, this increase in capacitance is substantial, and has a major impact
on the propagation delay of the circuit.
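The data dependence can be made concrete with a common first-order model: each neighbor contributes k·Cc to the victim's effective load, with k = 0 when the neighbor switches in the same direction, k = 1 when it is quiet, and k = 2 when it switches in the opposite direction. The capacitance values and names below are assumptions for illustration.

```python
# Miller factor per neighbour, depending on what that neighbour does
# while the victim wire switches.
MILLER_FACTOR = {"same": 0, "quiet": 1, "opposite": 2}


def effective_capacitance(c_ground, c_couple, neighbours):
    """Effective load on the victim: ground capacitance plus k * Cc per neighbour."""
    return c_ground + sum(MILLER_FACTOR[n] * c_couple for n in neighbours)


# Assumed values in fF: 20 fF to ground, 40 fF of coupling to each of X and Z.
C_GND, C_C = 20, 40
for case in (["same", "same"], ["quiet", "quiet"], ["opposite", "opposite"]):
    print(case, effective_capacitance(C_GND, C_C, case), "fF")
```

With coupling dominating the total, the worst case (both neighbors switching opposite to Y) presents several times the load of the best case, so the wire delay varies by a similar factor.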
Some Interesting Driver Circuits.

When one device is sending data on the bus, all other sending devices should
be disconnected.
This can be achieved by putting the output buffers of those devices in a high
impedance state Z that effectively disconnects the gate from the output wire.
Such a buffer has three possible states—0, 1, and Z—and is therefore called a
tri-state buffer.
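A logic-level sketch of several tri-state buffers sharing one bus wire is shown below. The function names and the choice to raise an error on contention are illustrative assumptions; a real bus with two conflicting drivers would draw a large current rather than report an error.

```python
def tri_state(en, data):
    """Output of one tri-state buffer: data when enabled, 'Z' otherwise."""
    return data if en else "Z"


def bus_value(drivers):
    """Resolve a shared wire driven by several (en, data) tri-state buffers."""
    driven = [tri_state(en, data) for en, data in drivers]
    active = [v for v in driven if v != "Z"]
    if not active:
        return "Z"                      # nobody drives: the wire floats
    if len(set(active)) > 1:
        raise ValueError("bus contention: enabled drivers disagree")
    return active[0]


# Three devices on the bus, only the second one enabled.
print(bus_value([(0, 1), (1, 0), (0, 1)]))
```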

Fig: Two possible implementations of a tri-state buffer. En = 1 enables the buffer.
Resistive Parasitics
Resistance and Reliability—Ohmic Voltage Drop

Current flowing through a resistive wire results in an ohmic voltage drop that
degrades the signal levels. This is especially important in the power
distribution network.
This is demonstrated by the circuit in the figure, where an inverter placed far
from the power and ground pins connects to a device closer to the supply. The
difference in logic levels caused by the IR voltage drop over the supply rails
might partially turn on transistor M1. This can result in an accidental
discharging of the precharged, dynamic node X, or cause static power
consumption if the connecting gate is static.

Fig: Ohmic voltage drop on the supply rails reduces the noise margins.
In short, the current pulses from the on-chip logic, memories and I/O pins
cause voltage drops over the power-distribution network, and are the major
source for on-chip power supply noise. A small drop in the supply voltage may
cause a significant increase in delay.

The most obvious solution to this problem is to reduce the maximum distance
between the supply pins and the circuit supply connections. This is most easily
accomplished through a structured layout of the power distribution network.
A number of on-chip power-distribution networks with peripheral bonding are
shown in the figure.
Fig: On-chip power distribution networks: (a) single-layer power grid; (b) dual-layer grid; (c) dual power plane.
The power and ground are brought onto the chip via bonding pads located on
the four sides of the chip. In the first approach (a), power and ground are
routed vertically (or horizontally) on the same layer. Power is brought in from
two sides of the chip. Local power strips are strapped to this upper grid, and
then further routed on the lower metal levels.
Method (b) uses two coarse metal layers for the power distribution, and the
power is brought in from the four sides of the die (chip).
The other method is to use two solid metal planes for the distribution of Vdd
and GND (c).
This approach has the advantage of drastically reducing the resistance of the
network.
The metal planes also act as shields between data signalling layers, hence
reducing cross-talk.
They also help to reduce the on-chip inductance. Obviously, this approach is
only feasible when sufficient metal layers are available.
Resistance and Performance—RC Delay
The delay of a wire grows quadratically with its length L (i.e., delay ∝ L²).
Doubling the length of a wire increases its delay by a factor of four.
The signal delay of long wires therefore tends to be dominated by the RC
effect. This is becoming a larger problem in modern technologies, which
feature an increasing average length of the global wires.
In this section, we discuss a number of design techniques that may help to
cope with the delay imposed by the resistance of a wire.
1. Better Interconnect Materials
A first option for reducing RC delays is to use better
interconnect materials. The introduction of silicides
and copper has helped to reduce the resistance of
polysilicon and metal wires, respectively, while the
use of dielectric materials with a lower permittivity
lowers the capacitance.
Table: Resistivity of commonly-used conductors (at 20 °C)
However, as technology scales, new materials do not solve the fundamental problem of
the delay of long wires.
2. Innovative Design Techniques
Innovative design techniques are often the only way of coping with the delay of long wires.

Sometimes it is hard to avoid the use of long polysilicon wires. A good example of such a
circumstance is the address lines in memories, which must connect to a large number
of transistor gates. Keeping the wires in polysilicon increases the memory density by
avoiding the overhead of the extra metal contacts.
The polysilicon-only option unfortunately leads to an excessive propagation delay.
One possible solution is to drive the word line from both ends, as shown in the figure.
This effectively reduces the worst-case delay by a factor of four.
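The factor of four follows from the quadratic length dependence of RC delay. A short derivation, assuming a uniform distributed RC line with resistance r and capacitance c per unit length (symbols introduced here for illustration) and ideal drivers at both ends:

```latex
% One driver: the far end is a distance L from the driver.
t_{\text{one end}} \;\propto\; r\,c\,L^{2}

% Two drivers: by symmetry the worst-case point is the midpoint,
% a distance L/2 from the nearest driver.
t_{\text{both ends}} \;\propto\; r\,c\left(\frac{L}{2}\right)^{2}
  = \frac{1}{4}\,r\,c\,L^{2}
```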

Fig (a): Driving the word line from both sides

Another option is to provide an extra metal (bypass) wire, which runs parallel to the
polysilicon one and connects to it every k cells. The delay is now dominated by the much
shorter polysilicon segments between the contacts.

Fig (b): Using a metal bypass

Fig: Approaches to reduce the word-line delay
3. Better Interconnect Strategies
Interconnections can also be routed along diagonal directions (45° lines). This yields a
sizable reduction in wire length, of up to 29%.
This in turn results in higher performance, lower power dissipation, and smaller chip
area.
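The 29% figure can be checked with a little geometry. Assume the best case, in which the two endpoints are offset by the same distance d horizontally and vertically (d is an illustrative symbol, not from the slides):

```latex
L_{\text{Manhattan}} = 2d, \qquad L_{\text{diagonal}} = \sqrt{2}\,d

\frac{L_{\text{Manhattan}} - L_{\text{diagonal}}}{L_{\text{Manhattan}}}
  = 1 - \frac{\sqrt{2}}{2} \approx 0.293
```

For any other offset the saving is smaller, which is why the figure is quoted as an upper bound.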

Fig: Diagonal routing
Fig: Example of layout using 45° lines
4. Introducing Repeaters

Fig: Reducing RC interconnect delay by introducing repeaters

The most popular design approach to reduce the propagation delay of long wires is to
introduce intermediate buffers, also called repeaters, in the interconnect line.
Making an interconnect line m times shorter reduces its propagation delay quadratically.

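Assuming that the repeaters have a fixed delay tpbuf, the delay of the partitioned wire
is given by:
$t_p = m\left[0.38\,rc\,(L/m)^2\right] + m\,t_{pbuf} = 0.38\,rc\,L^2/m + m\,t_{pbuf}$
where r and c are the resistance and capacitance per unit length of the wire, L is its total
length, m is the number of segments, and 0.38 is the distributed-RC delay factor.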
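The trade-off can be explored numerically. The sketch below assumes the standard distributed-RC model, in which a wire of length L split into m segments has delay tp = 0.38·r·c·L²/m + m·tpbuf; setting the derivative with respect to m to zero gives m_opt = L·sqrt(0.38·r·c / tpbuf). The wire parameters and buffer delay are assumed values for illustration, and the function names are hypothetical.

```python
import math


def repeated_wire_delay(r, c, length, m, t_pbuf):
    """Delay of a wire of the given length split into m equal segments by repeaters."""
    return 0.38 * r * c * length**2 / m + m * t_pbuf


def optimal_segments(r, c, length, t_pbuf):
    """Real-valued m that minimises repeated_wire_delay (from d(tp)/dm = 0)."""
    return length * math.sqrt(0.38 * r * c / t_pbuf)


# Assumed values: r = 0.075 ohm/um, c = 0.11 fF/um, L = 10 mm, tpbuf = 50 ps.
R_PER_UM, C_PER_UM, LENGTH_UM, T_PBUF = 0.075, 0.11e-15, 10_000.0, 50e-12

m_opt = optimal_segments(R_PER_UM, C_PER_UM, LENGTH_UM, T_PBUF)
print(f"real-valued optimum: m = {m_opt:.2f}")
# m must be an integer in practice, so compare the two neighbouring choices.
for m in (max(1, math.floor(m_opt)), math.ceil(m_opt)):
    delay = repeated_wire_delay(R_PER_UM, C_PER_UM, LENGTH_UM, m, T_PBUF)
    print(f"m = {m}: delay = {delay * 1e12:.1f} ps")
```

At the optimum, the wire delay term and the total repeater delay are equal, so adding further repeaters beyond that point only increases the total delay.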
5. Optimizing the Interconnect Architecture

Even with buffer insertion, the delay of a resistive wire cannot be reduced below a certain
minimum. Long wires hence often exhibit a delay that is longer than the clock period
of the design.
Wire pipelining is a popular performance-improvement technique in this category.

Fig: Wire pipelining improves the throughput of a wire.

The wire is partitioned into k segments by inserting registers or latches. This does not
reduce the delay through the wire (it takes k clock cycles for a signal to proceed
through the wire), but it helps to increase its throughput, as the wire is handling k signals
simultaneously at any point in time. The delay of the individual wire segments can be further
optimized by repeater insertion, and should be below a single clock period.
Inductive Parasitics
Besides having a parasitic resistance and capacitance, interconnect wires also
exhibit an inductive parasitic.
An important source of parasitic inductance is introduced by the bonding wires
and chip packages.
The source of inductive parasitics and their quantitative values are discussed
here.
Inductance and Reliability— Ldi/dt Voltage Drop
During each switching action, a transient current is sourced from (or
sunk into) the supply rails to charge (or discharge) the circuit
capacitances, as shown in Figure. Both VDD and VSS connections are
routed to the external supplies through bonding wires and package
pins and possess a nonignorable series inductance.

Fig: Inductive coupling between external and internal supply voltages
Hence, a change in the transient current creates a voltage difference between
the external and internal (VDD’, GND’) supply voltages. This situation is
especially severe at the output pads. The deviations on the internal supply
voltages affect the logic levels and result in reduced noise margins.

In an actual circuit, a single supply pin serves a large number of gates or
output drivers. Simultaneous switching of those drivers causes even worse
current transients and voltage drops. As a result, the internal supply voltages
deviate in a substantial way from the external ones.
Simultaneous switching of many pins results in huge spikes on the supply
rails that are bound to disturb the operation of the internal circuits as well as
other external components connected to the same supplies.
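A rough estimate of the supply bounce follows from v = L · di/dt. The sketch below approximates the current change as a linear ramp and assumes that n identical drivers on one supply pin switch at the same instant; the inductance, current, and timing values are assumed for illustration, and the function name is hypothetical.

```python
def supply_bounce(l_pin, delta_i, delta_t, n_drivers=1):
    """Approximate v = L * di/dt for n drivers switching together on one supply pin.

    Assumes each driver's current changes by delta_i as a linear ramp over delta_t.
    """
    return l_pin * n_drivers * delta_i / delta_t


# Assumed values: 2.5 nH of bond-wire plus pin inductance,
# 5 mA of current change per driver over 1 ns.
L_PIN, DELTA_I, DELTA_T = 2.5e-9, 5e-3, 1e-9

for n in (1, 8):
    v = supply_bounce(L_PIN, DELTA_I, DELTA_T, n)
    print(f"{n} driver(s): {v * 1e3:.1f} mV")
```

The bounce scales linearly with the number of simultaneously switching drivers and inversely with the edge time, which motivates the design techniques that follow.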
Design Techniques:
A number of approaches are available to the designer to address the L(di/dt) problem:

1. Separate power pins for I/O pads and chip core—Since the I/O drivers
require the largest switching currents, they also cause the largest current
changes. It is wise to isolate the center of the chip, where most of the logic
action occurs, from the drivers by providing different power and ground pins.
2. Multiple power and ground pins— In order to reduce the di/dt per supply
pin, we can restrict the number of I/O drivers connected to a single supply pin.
Typical numbers are five to ten drivers per supply pin.
3. Careful selection of the positions of the power and ground pins on the
package— The inductance of pins located at the corners of the package is
substantially higher due to the longer bonding wire and pin,
as shown in the figure.
Fig: The inductance of a bonding-wire/pin
combination depends upon the pin position
4. Increase the rise and fall times of the off-chip signals to the maximum extent
allowable and distributed all over the chip.
5. Use advanced packaging technologies such as surface-mount or hybrids that
come with a substantially reduced capacitance and inductance per pin. For
instance, we can see from Table 2.2 that the bonding inductance of a chip
mounted in flip-chip style on a substrate using the solder-bump techniques is
reduced to 0.1nH, which is 50 to 100 times smaller than for standard packages.
6. Adding decoupling capacitances on the board— These capacitances, which
should be added for every supply pin, act as local supplies and stabilize the
supply voltage seen by the chip. They separate the bonding-wire inductance
from the inductance of the board interconnect (Figure). The bypass capacitor,
combined with the inductance, actually acts as a low-pass network that filters
away the high-frequency components of the transient voltage spikes on the
supply lines.

Fig: Decoupling capacitors isolate the board inductance from the bonding wire
7. Adding decoupling capacitances on the chip— In high-performance circuits
with high switching speeds, it is becoming common practice to integrate
decoupling capacitances on the chip, which ensures cleaner supply voltages.
On-chip bypass capacitors reduce the peak current demand to the average
value.
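
To make the effect of techniques 2, 5 and 6 concrete, the following minimal sketch evaluates the supply bounce v = L(di/dt) and the corner frequency of the low-pass network formed by a decoupling capacitor and an inductance. The function names, the 10 mA/ns current slope per driver, the 5 nH standard-package inductance (taken as 50 times the 0.1 nH flip-chip value quoted above) and the 100 nF capacitor are illustrative assumptions, not values from these notes.

```python
import math

def supply_bounce(l_pin_h, n_drivers, di_dt_per_driver):
    """Voltage across a supply-pin inductance: v = L * d(i_total)/dt.

    Assumes all n_drivers sharing the pin switch at the same time.
    """
    return l_pin_h * n_drivers * di_dt_per_driver

def lc_corner_frequency(l_h, c_f):
    """Corner (resonant) frequency in Hz of an L-C low-pass network."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_h * c_f))

# Assumed, illustrative numbers (not taken from the notes).
DI_DT = 1e7          # A/s per driver, i.e. 10 mA/ns
L_STANDARD = 5e-9    # 5 nH: assumed standard-package pin inductance
L_FLIP_CHIP = 0.1e-9 # 0.1 nH: flip-chip value quoted in the text

ten_drivers = supply_bounce(L_STANDARD, 10, DI_DT)    # about 0.5 V
five_drivers = supply_bounce(L_STANDARD, 5, DI_DT)    # about 0.25 V
flip_chip = supply_bounce(L_FLIP_CHIP, 10, DI_DT)     # about 0.01 V
f_corner = lc_corner_frequency(L_STANDARD, 100e-9)    # roughly 7 MHz

if __name__ == "__main__":
    print(f"10 drivers per pin, 5 nH  : {ten_drivers:.3f} V")
    print(f" 5 drivers per pin, 5 nH  : {five_drivers:.3f} V")
    print(f"10 drivers per pin, 0.1 nH: {flip_chip:.3f} V")
    print(f"L-C corner frequency      : {f_corner / 1e6:.2f} MHz")
```

Under these assumptions, halving the number of drivers per supply pin halves the bounce, while the lower flip-chip inductance reduces it by the same factor of 50 as the inductance itself.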

Finally, be aware that the mutual inductance between neighboring wires also
introduces cross talk. This effect is not yet a major concern in CMOS but
definitely is emerging as an issue at the highest switching speeds.
Inductance and Performance—Transmission Line Effects
When an interconnection wire becomes sufficiently long or when the circuits
become sufficiently fast, the inductance of the wire starts to dominate the
delay behavior, and transmission line effects must be considered.
In this section, we discuss some techniques to minimize the impact of the
transmission line behavior such as Termination and Shielding of wires.
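
As a rough way of judging when a wire is "sufficiently long" or a circuit "sufficiently fast", a commonly quoted rule of thumb compares the signal rise time with the time of flight along the wire. The sketch below assumes that rule (rise time shorter than 2.5 times the time of flight), which is not stated in this section; the function names, the factor and the example values are assumptions, and wire resistance is ignored.

```python
C_LIGHT = 3.0e8  # speed of light in vacuum, m/s (rounded)

def time_of_flight(length_m, eps_r):
    """Time for a wave to travel the wire; v = c / sqrt(eps_r)."""
    velocity = C_LIGHT / eps_r ** 0.5
    return length_m / velocity

def needs_transmission_line_model(rise_time_s, length_m, eps_r, factor=2.5):
    """Assumed rule of thumb: the wire behaves as a transmission line
    when the rise time is shorter than factor * time of flight."""
    return rise_time_s < factor * time_of_flight(length_m, eps_r)

# Illustrative example: a 10 cm board trace in a dielectric with eps_r = 4.
t_flight = time_of_flight(0.10, 4.0)                        # about 0.67 ns
fast_edge = needs_transmission_line_model(1e-9, 0.10, 4.0)  # 1 ns edge
slow_edge = needs_transmission_line_model(5e-9, 0.10, 4.0)  # 5 ns edge

if __name__ == "__main__":
    print(f"time of flight: {t_flight * 1e9:.2f} ns")
    print("1 ns edge needs line model:", fast_edge)
    print("5 ns edge needs line model:", slow_edge)
```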

Termination
Appropriate termination is the most effective way of minimizing the delay.
Matching the load impedance to the characteristic impedance of the line
results in the fastest response. This leads to the following design rule:
To avoid the negative effects of transmission-line behavior such as slow
propagation delays, the line should be terminated, either at the source (series
termination), or at the destination (parallel termination) with a resistance
matched to its characteristic impedance Z0.
The two scenarios, series and parallel termination, are shown in the figure.
Series termination requires that the impedance of the signal source is
matched to the connecting wire. The impedance of the driver inverter can be
matched to the line by careful transistor sizing.
Fig: Matched termination scenarios for wires behaving as transmission lines:
(a) series termination at the source;
(b) parallel termination at the destination.
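
The benefit of a matched termination can be quantified with the standard reflection coefficient of a transmission line, (R - Z0)/(R + Z0), where R is the terminating resistance. This formula is not given in the notes; the sketch below assumes it, and the function name and example values are illustrative only.

```python
import math

def reflection_coefficient(r_term, z0):
    """Fraction of an incident wave reflected at a termination r_term
    on a line of characteristic impedance z0 (standard formula)."""
    if math.isinf(r_term):
        return 1.0  # open circuit: full reflection, same polarity
    return (r_term - z0) / (r_term + z0)

Z0 = 50.0  # assumed characteristic impedance in ohms

matched = reflection_coefficient(50.0, Z0)       # 0.0, no reflection
ten_percent = reflection_coefficient(55.0, Z0)   # 5/105, about 0.048
open_end = reflection_coefficient(math.inf, Z0)  # 1.0
shorted = reflection_coefficient(0.0, Z0)        # -1.0

if __name__ == "__main__":
    for name, value in [("matched", matched), ("10% high", ten_percent),
                        ("open", open_end), ("short", shorted)]:
        print(f"{name:9s}: {value:+.3f}")
```

With these numbers, a termination that is 10% too large reflects under 5% of the incident wave, whereas an unterminated (open) end reflects all of it.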

The impedance of the driver must be closely matched to the line, typically to
within 10% or better, if excessive reflections of travelling waves are to be
avoided. Because the driver resistance varies with process and operating
conditions, the match can be maintained by making the resistance of the driver
transistors electrically tunable, as shown in the figure.
Fig: Tunable segmented driver providing matched series-termination to a transmission
line load.

Each of the driver transistors is replaced by a segmented driver, with the
segments being switched in and out by control lines c1 to cn, to match the
driver impedance as closely as possible to the line impedance.
Each of the segments has a different resistance (typically ratioed with factors
of 2). A fixed element (s0) is added in parallel with the adjustable ones.
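
The tuning of such a segmented driver can be sketched as a search over the control word. The model below is an assumption made for illustration: the segments are treated as resistors in parallel with the fixed element s0, segment k has half the resistance of segment k-1, and the control lines c1 to cn are packed into an integer code. The function names and the resistor values (80 Ω fixed element, 1600 Ω weakest segment, four segments) are hypothetical.

```python
def driver_resistance(code, r_fixed, r_lsb):
    """Resistance of the fixed element in parallel with enabled segments.

    Bit k of `code` enables a segment of resistance r_lsb / 2**k, so the
    enabled segments contribute a total conductance of code / r_lsb.
    """
    conductance = 1.0 / r_fixed + code / r_lsb
    return 1.0 / conductance

def best_code(z0, r_fixed, r_lsb, n_segments):
    """Try every control word and keep the one closest to z0."""
    return min(range(2 ** n_segments),
               key=lambda c: abs(driver_resistance(c, r_fixed, r_lsb) - z0))

# Hypothetical example: match a 50-ohm line.
Z0, R_FIXED, R_LSB, N = 50.0, 80.0, 1600.0, 4
code = best_code(Z0, R_FIXED, R_LSB, N)
r_driver = driver_resistance(code, R_FIXED, R_LSB)
mismatch = abs(r_driver - Z0) / Z0

if __name__ == "__main__":
    print(f"control word c4..c1 = {code:0{N}b}")
    print(f"driver resistance   = {r_driver:.2f} ohm")
    print(f"mismatch            = {mismatch:.2%}")
```

In this example the tunable range runs from 80 Ω (all segments off) down to about 45.7 Ω (all segments on). The fixed element sets the upper end of that range, and the binary weighting gives evenly spaced conductance steps from only four control lines.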
Fig: Parallel termination of transmission line using transistors as resistors:
(a) Grounded PMOS; (b) PMOS with negative gate bias;
(c) PMOS-NMOS combination; (d) Simulation.
Similar considerations are valid when the termination is provided at the
destination end, called parallel termination.
Since PMOS transistors typically display a larger region of linear operation
than their NMOS counterparts, they are the preferred way of implementing a
terminating resistance. Assume that the grounded PMOS transistor of Figure (a),
operating in the triode region, is used as a 50-Ω matched termination
connected to VDD.
From simulations (Figure d), it is observed that the resistance is fairly
constant for small values of VR, but increases rapidly once the transistor
saturates. The linear region can be extended by applying a negative gate bias
to the PMOS transistor (Figure b). This, however, requires an extra supply
voltage, which is not practical in most situations.
A better approach is to add a diode-connected NMOS transistor in parallel with
the PMOS device (Figure c). The combination of the two devices gives a
near-constant resistance over the complete voltage range (similar to the
transmission-gate resistance discussed in the earlier chapter).
Shielding

A good example of a well-defined transmission line is the coaxial cable, where
the signal wire is surrounded by a cylindrical ground plane.
To accomplish similar effects on a board or on a chip, designers often
surround the signal wire with ground (supply) planes and shielding wires.

Though expensive in real estate (area), adding shielding makes the behavior
and the delay of an interconnection a lot more predictable.
