VLSI Design and Technology
-DR.V. SUMITRA,
ASSISTANT PROFESSOR, DEPT. OF ECE, SRM IST-KTR
UNIT III – VLSI Subsystem Design and Introduction to CMOS Logic Styles
Decoders -Comparators -Adders: Standard adder cells -Ripple Carry Adder (RCA) -Carry Look-
Ahead Adder (CLA) -Carry Select /Save/skip Adder (CSL/CSA/ CSK), Multipliers: Overview of
multiplication- types of multiplier architectures -Braun multiplier -Baugh-Wooley multiplier -
Wallace Tree multiplier -Booth multiplier. CMOS Circuit Design Styles: Static CMOS logic styles -
CMOS circuits, pseudo-nMOS, tristate circuits, Clocked CMOS circuits -DCVSL, Pass Transistor
Logic (PTL) -Dynamic CMOS logic styles: NORA, TSPC.
CMOS Inverter – based on metrics
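Threshold voltages: $V_{Tn}$ is positive (+ve), $V_{Tp}$ is negative (-ve)
$V_{GSn} = (V_G - V_S)_n = V_{in}$
$V_{DSn} = (V_D - V_S)_n = V_{out}$
$V_{GSp} = (V_G - V_S)_p = V_{in} - V_{DD}$
$V_{DSp} = (V_D - V_S)_p = V_{out} - V_{DD}$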
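The terminal-voltage relations above can be checked numerically. The following is a minimal sketch; the function name and the threshold values `VTN` and `VTP` are illustrative assumptions, not values from the slides.

```python
def inverter_terminal_voltages(vin, vout, vdd):
    """Gate-source and drain-source voltages of both inverter transistors."""
    return {
        "VGSn": vin,         # NMOS source is tied to ground
        "VDSn": vout,
        "VGSp": vin - vdd,   # PMOS source is tied to VDD
        "VDSp": vout - vdd,
    }

# Hypothetical thresholds, chosen only for illustration.
VTN = 0.4    # NMOS threshold (positive)
VTP = -0.4   # PMOS threshold (negative)

v = inverter_terminal_voltages(vin=0.0, vout=1.2, vdd=1.2)
nmos_on = v["VGSn"] > VTN   # NMOS conducts when VGSn exceeds VTn
pmos_on = v["VGSp"] < VTP   # PMOS conducts when VGSp is below VTp
print(v, "NMOS on:", nmos_on, "PMOS on:", pmos_on)
```

With the input low, only the PMOS conducts, which is consistent with the output sitting at VDD.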
CMOS Inverter – Voltage Transfer Characteristic
CMOS Inverter – based on metrics
Area:
The area of a CMOS inverter is determined by the size of its constituent transistors:
the PMOS and NMOS transistors.
• Transistor size: The area depends on the width-to-length ratio (W/L) of
the transistors. Typically, the PMOS transistor is made larger than the NMOS
transistor to balance their drive strengths, because PMOS transistors have
lower mobility compared to NMOS transistors. This is why PMOS transistors
are generally wider than NMOS.
• Layout considerations: The layout design of the inverter (the physical
arrangement of the transistors and interconnections) affects the area.
Optimized layout strategies minimize area usage while avoiding performance
degradation.
• Scaling: As technology scales down (e.g., from 90nm to 7nm), the area
occupied by the transistors becomes smaller, allowing more inverters (and
other logic gates) to be packed into the same chip area.
CMOS Inverter – based on metrics
Speed (Delay):
The speed or delay of the CMOS inverter is critical for high-performance circuits and
is usually measured as propagation delay (the time it takes for a signal to travel
through the inverter).
▪ Propagation delay: This is defined by the time difference between the input
signal change and the corresponding output change. It depends on the
capacitance and resistance of the transistors.
▪ Capacitance: The main contributors to delay are the parasitic capacitances
(input and output capacitance) associated with the transistors and
interconnections. The larger the capacitance, the longer the delay.
▪ Resistance: The on-resistance of the NMOS and PMOS transistors also
contributes to delay. Faster transistors have lower resistance, reducing delay.
▪ Load capacitance: The speed is also affected by the load capacitance the
inverter has to drive. More load means slower performance.
▪ Voltage: Reducing the supply voltage typically slows down the inverter,
because lower voltage reduces the drive strength of the transistors.
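The dependence of delay on resistance and capacitance can be sketched with a first-order RC model. The 0.69 factor (approximately ln 2) is the usual estimate of the time for an RC node to cross half of its swing; this model and all component values below are assumptions for illustration and do not come from the slides.

```python
def propagation_delay(r_on_ohm, c_load_farad):
    """First-order RC estimate of the time to reach 50% of the output swing."""
    return 0.69 * r_on_ohm * c_load_farad

# Hypothetical device and load values.
R_N = 10e3       # NMOS on-resistance, ohms
R_P = 12e3       # PMOS on-resistance, ohms
C_LOAD = 10e-15  # load capacitance, farads

t_phl = propagation_delay(R_N, C_LOAD)  # output falls through the NMOS
t_plh = propagation_delay(R_P, C_LOAD)  # output rises through the PMOS
t_p = (t_phl + t_plh) / 2               # average propagation delay

print(f"tpHL = {t_phl * 1e12:.1f} ps, tpLH = {t_plh * 1e12:.1f} ps, tp = {t_p * 1e12:.1f} ps")
```

In this model, doubling either the load capacitance or the on-resistance doubles the delay, which matches the qualitative statements in the list above.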
CMOS Inverter – based on metrics
Energy:
Energy consumption in a CMOS inverter is important in low-power applications,
particularly for battery-powered devices.
▪ Dynamic energy: Most energy consumed by the CMOS inverter is due to dynamic
switching, which occurs when the inverter transitions from one logic state to another
(charging and discharging the load capacitance).
▪ The energy drawn from the supply for each low-to-high output transition is given by:
$E = C_{load} V_{DD}^2$
where $C_{load}$ is the load capacitance, and $V_{DD}$ is the supply voltage.
▪ Reducing voltage or load capacitance can help lower dynamic energy consumption.
▪ Short-circuit energy: During the switching process, both PMOS and NMOS may
conduct simultaneously for a brief moment, leading to short-circuit energy dissipation.
▪ Leakage energy: When the CMOS inverter is not switching, there is still a small
leakage current (due to sub-threshold leakage, gate leakage, and junction leakage),
which leads to static power dissipation. Leakage becomes more significant in advanced
technology nodes (e.g., sub-10nm).
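As a worked step behind the dynamic-energy expression in this list, the derivation below sketches where the $C_{load} V_{DD}^2$ term comes from. It assumes an ideal step input, a constant load capacitance, and negligible short-circuit and leakage currents; these assumptions are not stated in the slides.

```latex
\begin{align*}
E_{VDD} &= \int_0^{\infty} i_{VDD}(t)\, V_{DD}\, dt
         = V_{DD} \int_0^{V_{DD}} C_{load}\, dv_{out}
         = C_{load} V_{DD}^2 \\
E_{C}   &= \int_0^{\infty} i_{VDD}(t)\, v_{out}(t)\, dt
         = \int_0^{V_{DD}} C_{load}\, v_{out}\, dv_{out}
         = \tfrac{1}{2} C_{load} V_{DD}^2
\end{align*}
```

Under these assumptions, half of the energy drawn from the supply ends up stored on the load capacitance and the other half is dissipated in the PMOS. The stored half is dissipated in the NMOS on the following high-to-low transition.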
CMOS Inverter – based on metrics
Power:
Power dissipation in a CMOS inverter can be broken down into two main
components: dynamic power and static power.
▪ Dynamic power: This is the power consumed when the inverter is actively
switching. It is given by:
$P_{dynamic} = \alpha C_{load} V_{DD}^2 f$
where α is the switching activity factor, $C_{load}$ is the load capacitance, $V_{DD}$ is the
supply voltage, and f is the switching frequency. Dynamic power is directly
proportional to the switching frequency and to the square of the supply voltage.
▪ Static power: This is the power consumed when the inverter is idle (no
switching occurs). It is primarily due to leakage currents, which include sub-
threshold leakage, gate-oxide leakage, and junction leakage. As mentioned
earlier, leakage becomes a significant concern as transistor sizes shrink in
advanced technologies.
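The dynamic-power expression can be evaluated directly. The sketch below uses hypothetical operating values; the static-power helper uses the common approximation of leakage current times supply voltage, which is an assumption added here rather than a formula from the slides.

```python
def dynamic_power(alpha, c_load, vdd, freq):
    """Dynamic power: alpha * C_load * VDD^2 * f."""
    return alpha * c_load * vdd ** 2 * freq

def static_power(i_leak, vdd):
    """Static power approximated as total leakage current times VDD (assumption)."""
    return i_leak * vdd

# Hypothetical operating point.
ALPHA = 0.1      # switching activity factor
C_LOAD = 10e-15  # farads
FREQ = 1e9       # hertz

p_full = dynamic_power(ALPHA, C_LOAD, vdd=1.0, freq=FREQ)
p_half = dynamic_power(ALPHA, C_LOAD, vdd=0.5, freq=FREQ)
print(f"P at 1.0 V = {p_full * 1e6:.2f} uW, P at 0.5 V = {p_half * 1e6:.2f} uW")
```

Halving the supply voltage cuts the dynamic power to one quarter, while halving the frequency would only cut it in half.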
Metrics for selection of circuit styles
❑ Inverter Gate is evaluated based upon area, speed, energy and power
❑ High Performance Processor – Switching Speed
❑ Battery Operated Circuit – Energy Dissipation
❑ Other metrics – Power Dissipation, Robustness to noise and Reliability
CMOS structure Advantage: Robustness (i.e., low sensitivity to noise), good
performance, and low power consumption with negligible static power dissipation
Static CMOS Design
❑ Widely used logic style.
❑ Extension of CMOS inverter to multiple inputs.
❑ In static CMOS, each gate output is connected to either VDD or Vss via a low-
resistance path. Also, the outputs of the gates assume at all times the value of the
Boolean function implemented by the circuit (ignoring, once again, the transient
effects during switching periods).
❑ Types: Complementary CMOS, Ratioed logic (pseudo-NMOS and DCVSL), and
Pass Transistor logic.
Complementary MOS
❑ This is the most common and basic type of CMOS circuit design. In static CMOS,
for every logic operation, there are two types of transistors used:
❑ PMOS transistors, which are good at passing high (1) voltage signals, and
❑ NMOS transistors, which are good at passing low (0) voltage signals.
❑ Together, these transistors form a CMOS gate that performs logical operations
like AND, OR, and NOT. The static part means the circuit always has a valid
output (either 0 or 1) as long as it has power, even if the inputs are not changing.
❑ Example:
▪ Imagine a light switch (the input). In a static CMOS circuit, whenever the
switch is turned on or off, the light (output) will either turn on or off and will
stay that way until the switch changes again.
Pseudo-nMOS Logic Style
❑ This is a simplified version of CMOS logic. In pseudo-nMOS, the circuit uses only
one type of transistor (NMOS) for some operations, and a weak, always-on
PMOS transistor that simplifies the design. The pseudo means that it behaves
like CMOS, but it's not fully complementary like in static CMOS logic.
▪ Advantage: It can be faster and use fewer transistors, saving space.
▪ Disadvantage: It wastes some power because the PMOS transistor is always
on, even when it’s not needed.
❑ Example:
▪ Imagine a water tap that leaks a little water even when it’s off. In pseudo-
nMOS, there’s always a little bit of power being wasted, like a leaky tap.
Tristate Circuits
❑ A tristate circuit is like a regular CMOS circuit but with a third state beyond just
0 (off) and 1 (on). The third state is called high impedance or Z, which means
the output is effectively "disconnected" from the circuit.
▪ Uses: These circuits are useful when you want to allow multiple circuits to
share the same wire without interfering with each other. Only one circuit at a
time can output 0 or 1, while the others are in the Z state, not affecting the
signal.
❑ Example:
▪ Think of three people trying to talk on the same phone line. Only one can talk
at a time while the others are silent (Z-state), allowing clear communication.
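The bus-sharing rule described above can be sketched as a small resolution function. This is a behavioural illustration only; the name `resolve_bus` and the use of `None` for the Z state are assumptions made for this example.

```python
from typing import Optional, Sequence

def resolve_bus(drivers: Sequence[Optional[int]]) -> Optional[int]:
    """Resolve a shared wire driven by tristate outputs.

    Each driver contributes 0, 1, or None (high impedance, Z).
    Returns the bus value, or None when every driver is in the Z state.
    Raises ValueError when more than one driver is active at once.
    """
    active = [d for d in drivers if d is not None]
    if not active:
        return None                      # floating bus
    if len(active) > 1:
        raise ValueError("bus contention: more than one active driver")
    return active[0]

print(resolve_bus([None, 1, None]))   # only the second driver is enabled
print(resolve_bus([None, None, None]))
```

Treating any second active driver as an error, even if it agrees with the first, mirrors the rule that only one circuit at a time may drive the wire.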
Clocked CMOS Circuits
❑ In these circuits, the logic operation is controlled by a clock signal. This clock
tells the circuit when to change its state, making it work synchronously with
other parts of the system. DCVSL (Differential Cascode Voltage Switch Logic) is
listed alongside clocked CMOS in this unit; it is a differential style that uses
two complementary signals (instead of one) for every input and output.
▪ DCVSL (Differential Cascode Voltage Switch Logic): Each input is supplied in
true and complemented form, and the gate produces an output together with its
complement. Working on signal pairs helps make circuits more noise-resistant.
❑ Example:
▪ Imagine two workers who check each other’s work before submitting it. This
ensures that errors are less likely, and the work is done in sync with a timer
(clock).
Pass Transistor Logic
❑ In Pass Transistor Logic, the focus is on using transistors as switches to
directly pass signals from input to output. Instead of always combining PMOS
and NMOS transistors like in static CMOS, pass transistor logic only uses one type
(usually NMOS).
▪ Advantage: PTL circuits can be smaller and faster because they use fewer
transistors.
▪ Disadvantage: They might have trouble passing signals correctly, especially
if the signal weakens as it passes through multiple transistors.
❑ Example:
▪ Think of PTL like a series of gates in a fence. When each gate opens, it passes
something (a signal) to the next gate. However, if too many gates are opened
in a row, the person passing through might get tired (signal weakening), and
they might not make it to the end.
Complementary CMOS
❑ A static CMOS gate is a combination of two networks, called the pull-up network
(PUN) and the pull-down network (PDN).
❑ It has a generic N input logic gate where all inputs are distributed to both the
pull-up and pull-down networks.
❑ The function of the PUN is to provide a connection between the output and VDD
anytime the output of the logic gate is meant to be 1 (based on the inputs).
Similarly, the function of the PDN is to connect the output to VSS when the output
of the logic gate is meant to be 0.
❑ The PUN and PDN networks are constructed in a mutually exclusive fashion such
that one and only one of the networks is conducting in steady state.
❑ In this way, once the transients have settled, a path always exists between VDD
and the output F, realizing a high output (“one”), or, alternatively, between VSS
and F for a low output (“zero”). This is equivalent to stating that the output node
is always a low-impedance node in steady state.
Complementary CMOS
Complementary logic gate as a combination of a PUN (pull-up
network) and a PDN (pull-down network).
Complementary CMOS
Observations:
❑ The PDN is constructed using NMOS
devices, while PMOS transistors are used
in the PUN. The primary reason for this
choice is that NMOS transistors produce
“strong zeros,” and PMOS devices
generate “strong ones”.
❑ The output capacitance is initially
charged to VDD. Two possible discharge
scenarios are shown. An NMOS device
pulls the output all the way down to
GND, while a PMOS lowers the output no
further than |VTp| — the PMOS turns off
at that point, and stops contributing
discharge current. NMOS transistors are
hence the preferred devices in the PDN.
Complementary CMOS
Construction Rules:
❑ NMOS devices connected in series correspond to a NAND function.
❑ NMOS transistors connected in parallel represent a NOR function.
❑ PMOS devices connected in series correspond to a NOR function.
❑ PMOS transistors connected in parallel represent a NAND function.
❑ Complementary CMOS structures are dual networks. This means that a parallel
connection of transistors in the pull-up network corresponds to a series connection
of the corresponding devices in the pull-down network, and vice versa.
❑ The complementary gate is naturally inverting, implementing only functions such as
NAND, NOR, and XNOR.
❑ The number of transistors required to implement an N-input logic gate is 2N.
Note: The realization of a non-inverting Boolean function (such as AND, OR, or XOR) in a single stage is not possible, and
requires the addition of an extra inverter stage.
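The construction rules above can be illustrated with a switch-level sketch. The network representation (nested tuples) and all function names are assumptions made for this example; the code builds the PUN as the dual of the PDN and checks that exactly one network conducts for every input combination.

```python
from itertools import product

def var(name):
    return ("var", name)

def series(*nets):
    return ("series",) + nets

def parallel(*nets):
    return ("parallel",) + nets

def dual(net):
    """Swap series and parallel connections to obtain the dual network."""
    kind = net[0]
    if kind == "var":
        return net
    swapped = "parallel" if kind == "series" else "series"
    return (swapped,) + tuple(dual(n) for n in net[1:])

def conducts(net, inputs, on_level):
    """True when the network conducts; on_level is 1 for NMOS, 0 for PMOS."""
    kind = net[0]
    if kind == "var":
        return inputs[net[1]] == on_level
    branches = [conducts(n, inputs, on_level) for n in net[1:]]
    return all(branches) if kind == "series" else any(branches)

def gate_output(pdn, inputs):
    pun = dual(pdn)
    pulls_down = conducts(pdn, inputs, on_level=1)
    pulls_up = conducts(pun, inputs, on_level=0)
    if pulls_up == pulls_down:
        raise RuntimeError("PUN and PDN are not mutually exclusive")
    return 1 if pulls_up else 0

# Two NMOS devices in series in the PDN give a NAND gate.
nand_pdn = series(var("A"), var("B"))
truth_table = {
    (a, b): gate_output(nand_pdn, {"A": a, "B": b})
    for a, b in product((0, 1), repeat=2)
}
print(truth_table)
```

Replacing `series` with `parallel` in `nand_pdn` yields a NOR gate, and the exclusivity check inside `gate_output` continues to pass because the PUN is always derived as the dual.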
Complementary CMOS
Propagation Delay:
❑ Each transistor is modeled as a
resistor in series with an ideal switch.
❑ The value of the resistance depends
on the power supply voltage, so an
equivalent large-signal resistance,
scaled by the ratio of device width
over length, must be used.
❑ The logic is transformed into an
equivalent RC network that includes
the effect of internal node
capacitances.
Note: The equivalent RC network is only a model used to estimate the propagation
delay of the gate; it does not replace the gate, whose purpose remains digital logic operation.
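One common way to evaluate such an RC network is the Elmore delay, sketched below for the series NMOS stack of an N-input NAND gate. The Elmore approximation, the equal per-device resistance, and the component values are assumptions for illustration; the slides do not name a specific delay formula.

```python
def stack_discharge_delay(n_inputs, r_on, c_internal, c_load):
    """Elmore-style estimate for N series NMOS devices discharging the output.

    Internal node i (counted from ground, i = 1 .. N-1) carries c_internal
    and discharges through i series resistances. The output node carries
    c_load and discharges through all N resistances.
    """
    tau = sum(c_internal * i * r_on for i in range(1, n_inputs))
    tau += c_load * n_inputs * r_on
    return 0.69 * tau

# Hypothetical values.
R_ON = 10e3     # ohms per transistor
C_INT = 1e-15   # farads per internal node
C_LOAD = 10e-15 # farads at the output

for n in (1, 2, 4, 6):
    t = stack_discharge_delay(n, R_ON, C_INT, C_LOAD)
    print(f"fan-in {n}: {t * 1e12:.1f} ps")
```

The internal-node term grows as N(N-1)/2, which is one way to see why the delay of large fan-in gates rises faster than linearly.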
Complementary CMOS
Transistor Sizing:
❑ Several approaches may be used to reduce delays in large fan-in circuits.
❑ The most obvious solution is to increase the overall transistor size. This lowers
the resistance of devices in series and lowers the time constant.
❑ However, increasing the transistor size results in larger parasitic capacitors,
which not only affect the propagation delay of the gate, but also present a
larger load to the preceding gate. This technique should, therefore, be used with
caution.
❑ If the load capacitance is dominated by the intrinsic capacitance of the gate,
widening the device only creates a “self-loading” effect, and the propagation delay
is unaffected.
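The self-loading effect can be sketched with a simple sizing model: widening a device by a factor S divides its resistance by S and multiplies its intrinsic capacitance by S. The model and the numbers below are illustrative assumptions, not values from the slides.

```python
def sized_gate_delay(scale, r_unit, c_intrinsic_unit, c_external):
    """Delay of a gate whose transistors are widened by `scale`."""
    resistance = r_unit / scale                           # wider device, lower R
    capacitance = scale * c_intrinsic_unit + c_external   # wider device, more self-load
    return 0.69 * resistance * capacitance

# Hypothetical unit-size values.
R_UNIT = 10e3   # ohms
C_INT = 2e-15   # farads of intrinsic (self-load) capacitance

for c_ext in (0.0, 20e-15):
    delays = [sized_gate_delay(s, R_UNIT, C_INT, c_ext) for s in (1, 2, 4, 8)]
    print(f"C_ext = {c_ext * 1e15:.0f} fF:", [f"{d * 1e12:.1f} ps" for d in delays])
```

With no external load the delay does not change with size, while with a large external load the delay keeps falling as the device is widened, at the cost of a larger load on the preceding gate.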
Complementary CMOS
Input Reordering:
❑ Some signals in complex combinational logic blocks might be more critical than
others.
❑ Not all inputs of a gate arrive at the same time (due, for instance, to the
propagation delays of the preceding logical gates).
❑ An input signal to a gate is called critical if it is the last signal of all inputs to
assume a stable value. The path through the logic which determines the ultimate
speed of the structure is called the critical path.
❑ Putting the critical-path transistors closer to the output of the gate can result in a
speedup.
Complementary CMOS
Logic Restructuring:
❑ Manipulating the logic equations can
reduce the fan-in requirements and
hence reduce the gate delay.
❑ The quadratic dependency of the gate
delay on fan-in makes the six-input NOR
gate extremely slow. Partitioning the
NOR gate into two three-input gates
results in a significant speed-up, which
offsets by far the extra delay incurred by
turning the inverter into a two-input
NAND gate.
Figure: Logic Restructuring
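The restructuring described above can be checked for logical equivalence, and its benefit estimated with a rough cost model. The sketch assumes the original circuit is a six-input NOR followed by an inverter and uses a pure fan-in-squared delay in arbitrary units; both the helper names and this cost model are assumptions for illustration.

```python
from itertools import product

def nor(*xs):
    return int(not any(xs))

def nand(*xs):
    return int(not all(xs))

def inv(x):
    return int(not x)

def original(a, b, c, d, e, f):
    """Six-input NOR followed by an inverter."""
    return inv(nor(a, b, c, d, e, f))

def restructured(a, b, c, d, e, f):
    """Two three-input NORs feeding a two-input NAND."""
    return nand(nor(a, b, c), nor(d, e, f))

equivalent = all(
    original(*bits) == restructured(*bits)
    for bits in product((0, 1), repeat=6)
)

def relative_delay(fan_in):
    """Crude model: gate delay grows with the square of the fan-in."""
    return fan_in ** 2

delay_before = relative_delay(6) + relative_delay(1)  # NOR6 then inverter
delay_after = relative_delay(3) + relative_delay(2)   # NOR3 then NAND2
print(equivalent, delay_before, delay_after)
```

The two NOR3 gates work in parallel, so only one of them lies on the path to the output in the restructured estimate.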
Complementary CMOS
Power Consumption:
❑ The power dissipation is a strong function of transistor sizing (which affects
physical capacitance), input and output rise/fall times (which affects the short-
circuit power), device thresholds and temperature (which affect leakage power),
and switching activity.
❑ The dynamic power of a logic gate can be reduced by minimizing the physical
capacitance and the switching activity.
❑ The physical capacitance can be minimized in a number of ways, including circuit
style selection, transistor sizing, placement and routing, and architectural
optimizations.
❑ The switching activity, on the other hand, can be minimized at all levels of the
design abstraction.
Ratioed CMOS
❑ Ratioed logic is an attempt to reduce the
number of transistors required to implement a
given logic function, at the cost of reduced
robustness and extra power dissipation.
❑ The purpose of the PUN in complementary
CMOS is to provide a conditional path between
VDD and the output when the PDN is turned
off.
❑ In ratioed logic, the entire PUN is replaced
with a single unconditional load device that
pulls up the output for a high output.
❑ Instead of a combination of active pull-down
and pull-up networks, such a gate consists of
an NMOS pull-down network that realizes the
logic function, and a simple load device.
Ratioed CMOS – Pseudo-NMOS
❑ The clear advantage of pseudo-NMOS is the reduced number of transistors (N+1
versus 2N for complementary CMOS).
❑ The nominal high output voltage (VOH) for this gate is VDD since the pull-down
devices are turned off when the output is pulled high (assuming that VOL is
below VTn).
❑ On the other hand, the nominal low output voltage is not 0 V since there is a fight
between the devices in the PDN and the grounded PMOS load device.
❑ This results in reduced noise margins and more importantly static power
dissipation.
❑ The sizing of the load device relative to the pull-down devices can be used to
trade off parameters such as noise margin, propagation delay, and power
dissipation.
❑ Since the voltage swing on the output and the overall functionality of the gate
depends upon the ratio between the NMOS and PMOS sizes, the circuit is called
ratioed.
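The fight between the pull-down network and the load device can be approximated by treating both as linear resistors. This is a deliberately crude sketch: real devices are nonlinear, and the resistor model, function name, and values are assumptions for illustration only.

```python
def pseudo_nmos_low_state(vdd, r_load, r_pdn):
    """Estimate VOL and static power when the PDN conducts.

    The always-on PMOS load (r_load) and the conducting PDN (r_pdn)
    form a resistive divider between VDD and ground.
    """
    v_ol = vdd * r_pdn / (r_pdn + r_load)
    static_current = vdd / (r_pdn + r_load)
    return v_ol, vdd * static_current

# Hypothetical values: the load is four times weaker than the pull-down path.
v_ol, p_static = pseudo_nmos_low_state(vdd=1.2, r_load=40e3, r_pdn=10e3)
print(f"VOL = {v_ol:.2f} V, static power = {p_static * 1e6:.1f} uW")
```

Making the load weaker (larger `r_load`) lowers both VOL and the static power, but it also slows the low-to-high transition, which is the trade-off named in the list above.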
Ratioed CMOS - DCVSL
❑ It is possible to create a ratioed logic style that completely eliminates static
currents and provides rail-to-rail swing. Such a gate combines two concepts:
differential logic and positive feedback.
❑ A differential gate requires that each input is provided in complementary
format, and produces complementary outputs in turn.
❑ The feedback mechanism ensures that the load device is turned off when not
needed.
❑ An example of such a logic family is Differential Cascode Voltage Switch
Logic (DCVSL).
Ratioed CMOS - DCVSL
❑ The pull-down networks PDN1 and PDN2 use NMOS devices and are
mutually exclusive (that is, when PDN1 conducts, PDN2 is off, and when
PDN1 is off, PDN2 conducts), such that the required logic function and its
inverse are simultaneously implemented.
❑ The resulting circuit exhibits a rail-to-rail swing, and the static power
dissipation is eliminated: in steady state, none of the stacked pull-down
networks and load devices are simultaneously conducting.
❑ However, the circuit is still ratioed since the sizing of the PMOS devices
relative to the pull-down devices is critical to functionality, not just
performance.
Pass Transistor Logic
❑ A popular and widely-used alternative to complementary
CMOS is pass-transistor logic, which attempts to reduce the
number of transistors required to implement logic by allowing
the primary inputs to drive gate terminals as well as
source/drain terminals.
❑ In this gate, if the B input is high, the top transistor is turned on
and copies the input A to the output F. When B is low, the
bottom pass transistor is turned on and passes a 0.
❑ The switch driven by the complement of B seems to be redundant at first glance.
Its presence is essential to ensure that the gate is static, that is,
that a low-impedance path exists to the supply rails under all
circumstances, or, in this particular case, when B is low.
❑ The reduced number of devices has the additional advantage of
lower capacitance.
Figure: Pass Transistor Logic of AND Gate
Note:
• No. of transistors required is 4.
• In case of Com. CMOS it is 6.
Pass Transistor Logic
❑ Unfortunately, as discussed earlier, an NMOS device is effective at passing a 0 but
is poor at pulling a node to VDD.
❑ When the pass transistor pulls a node high, the output only charges up to
VDD -VTn.
❑ In fact, the situation is worsened by the fact that the devices experience body
effect, as there exists a significant source-to-body voltage when pulling high.
❑ Pass-transistors require lower switching energy to charge up a node due to the
reduced voltage swing.
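The degraded high level, including body effect, can be estimated by solving Vout = VDD - VTn(Vout) iteratively. The sketch below uses the standard body-effect expression for the threshold voltage; the parameter values are illustrative assumptions and are not taken from the slides.

```python
import math

def threshold_with_body_effect(vt0, gamma, two_phi_f, v_sb):
    """NMOS threshold voltage for a given source-to-body voltage."""
    return vt0 + gamma * (math.sqrt(two_phi_f + v_sb) - math.sqrt(two_phi_f))

def pass_high_level(vdd, vt0, gamma, two_phi_f, iterations=50):
    """Fixed-point solution of Vout = VDD - VTn(VSB = Vout)."""
    v_out = vdd - vt0  # starting guess without body effect
    for _ in range(iterations):
        v_tn = threshold_with_body_effect(vt0, gamma, two_phi_f, v_out)
        v_out = vdd - v_tn
    return v_out

# Hypothetical process parameters.
VDD, VT0, GAMMA, TWO_PHI_F = 2.5, 0.43, 0.4, 0.6
C_NODE = 10e-15  # farads, hypothetical node capacitance

v_high = pass_high_level(VDD, VT0, GAMMA, TWO_PHI_F)
energy_pass = C_NODE * VDD * v_high  # charge C*Vout drawn from VDD
energy_full = C_NODE * VDD * VDD     # full-swing reference
print(f"Vout(high) = {v_high:.3f} V, energy ratio = {energy_pass / energy_full:.2f}")
```

The result sits below VDD - VT0 because the threshold rises as the output (and hence the source-to-body voltage) rises, and the energy drawn from the supply falls in proportion to the reduced swing.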
Dynamic CMOS Design
❑ The basic construction of an (n-
type) dynamic logic gate.
❑ The PDN (pull-down network) is
constructed exactly as in
complementary CMOS.
❑ The operation of this circuit is
divided into two major phases:
precharge and evaluation, with
the mode of operation
determined by the clock signal
CLK.
Dynamic CMOS Design
Precharge:
❑ When CLK = 0, the output node Out is precharged to VDD by the PMOS transistor Mp.
❑ During that time, the evaluate NMOS transistor Me is off, so that the pull-down path is
disabled.
❑ The evaluation FET eliminates any static power that would be consumed during the
precharge period (that is, static current would flow between the supplies if both the
pulldown and the precharge device were turned on simultaneously).
Dynamic CMOS Design
Evaluation:
❑ For CLK = 1, the precharge transistor Mp is off, and the evaluation transistor Me is turned on.
❑ The output is conditionally discharged based on the input values and the pull-down topology.
❑ If the inputs are such that the PDN conducts, then a low resistance path exists between Out and GND
and the output is discharged to GND.
❑ If the PDN is turned off, the precharged value remains stored on the output capacitance CL, which is a
combination of junction capacitances, the wiring capacitance, and the input capacitance of the fan-out
gates.
❑ During the evaluation phase, the only possible path between the output node and a supply rail is to
GND. Consequently, once Out is discharged, it cannot be charged again until the next precharge
operation.
❑ The inputs to the gate can therefore make at most one transition during evaluation. Notice that the
output can be in the high-impedance state during the evaluation period if the pull-down network is
turned off.
❑ This behavior is fundamentally different from the static counterpart that always has a low resistance
path between the output and one of the power rails.
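The precharge and evaluate phases can be captured in a small behavioural model. This sketch ignores charge leakage and charge sharing; the class name and interface are assumptions made for this example.

```python
class DynamicGate:
    """Behavioural model of an n-type dynamic gate."""

    def __init__(self, pdn):
        self.pdn = pdn   # function(inputs) -> True when the PDN conducts
        self.out = 1     # assume the gate starts precharged

    def step(self, clk, inputs):
        if clk == 0:
            self.out = 1             # precharge: Mp on, Me off
        elif self.pdn(inputs):
            self.out = 0             # evaluate: conditional discharge
        # Otherwise the stored value is kept on the output capacitance.
        return self.out

# PDN of a two-input NAND: two NMOS devices in series.
gate = DynamicGate(lambda x: bool(x["A"] and x["B"]))

trace = [
    gate.step(0, {"A": 1, "B": 1}),  # precharge
    gate.step(1, {"A": 1, "B": 1}),  # evaluate, PDN conducts
    gate.step(1, {"A": 0, "B": 1}),  # still evaluating, PDN now off
    gate.step(0, {"A": 0, "B": 1}),  # precharge again
    gate.step(1, {"A": 0, "B": 1}),  # evaluate, PDN off
]
print(trace)
```

The third step shows why inputs may make at most one transition during evaluation: once the output has been discharged, turning the PDN off again cannot restore it until the next precharge.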
Dynamic CMOS Design - Properties
❑ The logic function is implemented by the NMOS pull-down network.
❑ The construction of the PDN proceeds just as it does for static CMOS.
❑ The number of transistors (for complex gates) is substantially lower than in the static case: N + 2
versus 2N.
❑ It is non-ratioed. The sizing of the PMOS precharge device is not important for realizing proper
functionality of the gate. The size of the precharge device can be made large to improve the low-to-high
transition time (of course, at a cost to the high-to-low transition time).
❑ There is however, a trade-off with power dissipation since a larger precharge device directly increases
clock-power dissipation.
❑ It only consumes dynamic power. Ideally, no static current path ever exists between VDD and GND. The
overall power dissipation, however, can be significantly higher compared to a static logic gate.
❑ The logic gates have faster switching speeds. There are two main reasons for this.
❑ The first (obvious) reason is due to the reduced load capacitance attributed to the lower number of
transistors per gate and the single-transistor load per fan-in.
❑ Second, the dynamic gate does not have short circuit current, and all the current provided by the pull-
down devices goes towards discharging the load capacitance.
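The transistor counts quoted for the different styles can be compared side by side. The function below simply encodes the 2N, N + 1, and N + 2 figures given in these slides; the function and style names are assumptions for illustration.

```python
def transistor_count(n_inputs, style):
    """Transistor count for an N-input gate in a given logic style."""
    counts = {
        "complementary": 2 * n_inputs,  # PUN and PDN
        "pseudo_nmos": n_inputs + 1,    # PDN plus one PMOS load
        "dynamic": n_inputs + 2,        # PDN plus precharge and evaluate devices
    }
    return counts[style]

for n in (2, 4, 8):
    row = {s: transistor_count(n, s) for s in ("complementary", "pseudo_nmos", "dynamic")}
    print(n, row)
```

For a two-input gate the dynamic style needs as many transistors as complementary CMOS, so the saving only appears for larger or more complex gates.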
True Single-Phase Clocked Register (TSPCR)
❑ In the two-phase clocking schemes described above, care must be taken in routing the two
clock signals to ensure that overlap is minimized.
❑ While the C2MOS provides a skew-tolerant solution, it is possible to design registers that
only use a single phase clock.
❑ The True Single-Phase Clocked Register (TSPCR), proposed by Yuan and Svensson, uses a
single clock.
❑ The basic single-phase positive and negative latches are shown in the below figure.
True Single-Phase Clocked Register (TSPCR)
Figure: True single-phase latches in transparent mode and hold mode
True Single-Phase Clocked Register (TSPCR)
❑ For the positive latch, when CLK is high, the latch is in the transparent mode and
corresponds to two cascaded inverters; the latch is non-inverting, and propagates the
input to the output.
❑ On the other hand, when CLK = 0, both inverters are disabled, and the latch is in hold-
mode.
❑ Only the pull-up networks are still active, while the pull-down circuits are deactivated.
❑ As a result of the dual-stage approach, no signal can ever propagate from the input of the
latch to the output in this mode.
❑ A register can be constructed by cascading positive and negative latches. The clock load is
similar to a conventional transmission gate register, or C2MOS register. The main
advantage is the use of a single clock phase.
❑ The disadvantage is the slight increase in the number of transistors — 12 transistors are
required.
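The way two opposite-polarity latches form a register can be sketched behaviourally. The model below does not represent the transistor-level TSPC circuit; it only shows the latch modes and the cascade. The ordering (negative latch first, positive latch second) is an assumption chosen to give a positive edge-triggered register.

```python
class Latch:
    """Level-sensitive latch: transparent when clk equals transparent_level."""

    def __init__(self, transparent_level):
        self.transparent_level = transparent_level
        self.q = 0

    def update(self, clk, d):
        if clk == self.transparent_level:
            self.q = d   # transparent mode: output follows input
        return self.q    # hold mode: output keeps its stored value

class Register:
    """Negative latch followed by a positive latch, sharing a single clock."""

    def __init__(self):
        self.first = Latch(transparent_level=0)
        self.second = Latch(transparent_level=1)

    def update(self, clk, d):
        return self.second.update(clk, self.first.update(clk, d))

reg = Register()
trace = [
    reg.update(0, 1),  # first latch samples 1, second holds 0
    reg.update(1, 1),  # rising edge: 1 appears at the output
    reg.update(1, 0),  # input change while clk is high is ignored
    reg.update(0, 0),  # first latch samples 0, output still holds 1
    reg.update(1, 0),  # next rising edge: 0 appears at the output
]
print(trace)
```

Because the two latches are never transparent at the same time, the input cannot propagate straight through, which is the behaviour a register requires.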
True Single-Phase Clocked Register (TSPCR)
❑ TSPC offers an additional advantage: the possibility of embedding logic functionality into
the latches. This reduces the delay overhead associated with the latches.
Adding logic to the TSPC approach
True Single-Phase Clocked Register (TSPCR)
❑ The above figure (a) outlines the basic approach for embedding logic, while figure (b)
shows an example of a positive latch that implements the AND of In1 and In2 in addition to
performing the latching function.
❑ While the set-up time of this latch has increased, the overall performance of the digital
circuit (that is, the clock period of a sequential circuit) has improved: the increase in set-
up time is typically smaller than the delay of an AND gate.
❑ This approach of embedding logic into latches has been used extensively in the design of
the EV4 DEC Alpha microprocessor and many other high performance processors.
True Single-Phase Clocked Register (TSPCR)
❑ The TSPC latch circuits can be further reduced in
complexity, where only the first inverter is controlled
by the clock.
❑ Besides the reduced number of transistors, these
circuits have the advantage that the clock load is
reduced by half.
❑ On the other hand, not all node voltages in the latch
experience the full logic swing.
❑ For instance, the voltage at node A (for Vin = 0 V) for
the positive latch maximally equals VDD – VTn, which
results in a reduced drive for the output NMOS
transistor and a loss in performance.
❑ Similarly, the voltage on node A (for Vin = VDD) for the
negative latch is only driven down to |VTp|. This also
limits the amount of VDD scaling possible on the latch.
Figure: Simplified TSPC latch (also called split-output)
NORA CMOS Design (NP-Domino Logic)
❑ The latch-based pipeline circuit can
also be implemented using C2MOS
latches.
❑ A C2MOS-based pipelined circuit is
race-free as long as all the logic
functions F between the latches are
non-inverting.
❑ During a (0-0) overlap between CLK
and $\overline{CLK}$, all C2MOS latches simplify
to pure pull-up networks.
❑ The only way a signal can race from
stage to stage under this condition is
when the logic function F is inverting,
for example where F is replaced by a
single, static CMOS inverter.
Figure: Pipelined datapath using C2MOS latches
NORA CMOS Design
❑ Based on this concept, a logic circuit style called NORA-CMOS was conceived.
❑ It combines C2MOS pipeline registers and NORA dynamic logic function blocks.
❑ Each module consists of a block of combinational logic that can be a mixture of static and
dynamic logic, followed by a C2MOS latch. Logic and latch are clocked in such a way that
both are simultaneously in either evaluation, or hold (precharge) mode.
❑ A block that is in evaluation during CLK = 1 is called a CLK-module, while the inverse is
called a $\overline{CLK}$-module.
Figure: Operation modes of NORA logic modules
NORA CMOS Design
❑ A NORA datapath consists of a chain of alternating CLK and $\overline{CLK}$ modules.
❑ While one class of modules is precharging with its output latch in hold mode, preserving
the previous output value, the other class is evaluating.
❑ Data is passed in a pipelined fashion from module to module.
❑ NORA offers designers a wide range of design choices.
❑ Dynamic and static logic can be mixed freely, and both CLKp and CLKn dynamic blocks can
be used in cascaded or in pipelined form.
❑ With this freedom of design, extra inverter stages, as required in DOMINO-CMOS, are most
often avoided.
Logic Style          | Type    | Key Feature                        | Pros                        | Cons
---------------------|---------|------------------------------------|-----------------------------|-----------------------------
CMOS Logic           | Static  | Complementary pMOS/nMOS            | Full swing, no static power | Large area
Pseudo-nMOS          | Static  | Always-on pMOS load                | Compact                     | Static power dissipation
Tri-State            | Static  | High-impedance output              | Useful in buses             | Needs control logic
Clocked CMOS (C²MOS) | Static  | Clocked control of PUN/PDN         | Low power                   | Complex timing
DCVSL                | Static  | Differential complementary outputs | High speed, noise immune    | High transistor count
PTL                  | Static  | Uses pass transistors              | Fewer devices               | Weak ‘1’, degraded logic levels
NORA                 | Dynamic | Nonoverlapping clocks              | High speed                  | Clock design complexity
TSPC                 | Dynamic | Single-phase dynamic logic         | Compact, fast               | Sensitive to leakage