0% found this document useful (0 votes)
3 views36 pages

QP 2021

The document contains a series of questions and answers related to computer architecture, including conversions between number systems, construction of logic gates using NOR gates, differentiation between CISC and RISC architectures, instruction formats, limitations of Instruction Level Parallelism, the priority of DMA over CPU in memory transfers, characteristics of multiprocessors, and types of ROM. Each section provides detailed explanations and examples to illustrate the concepts. The content is structured as a question-and-answer format, suitable for educational purposes.

Uploaded by

charancb807
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
3 views36 pages

QP 2021

The document contains a series of questions and answers related to computer architecture, including conversions between number systems, construction of logic gates using NOR gates, differentiation between CISC and RISC architectures, instruction formats, limitations of Instruction Level Parallelism, the priority of DMA over CPU in memory transfers, characteristics of multiprocessors, and types of ROM. Each section provides detailed explanations and examples to illustrate the concepts. The content is structured as a question-and-answer format, suitable for educational purposes.

Uploaded by

charancb807
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

COA -> question with answer 2021

Answer any five of the following. Each question carries six marks : (5×6=30)

1. Convert the following :


i) 7562(10)=7562(10)= ______(16)(16)
ii) 1110101(2)=1110101(2)= ______(10)(10)
iii) F3A7(16)=F3A7(16)= ______(8)(8)
ANS :-

1. 1D8A(16)
2. 117(10)
3. 171647(8)

2. Construct the basic logic gates using NOR logic gate.

The NOR gate is known as a "universal gate" because it can be used to construct any
other basic logic gate (NOT, AND, OR). This is done by combining the NOR gates in
specific configurations.

i) NOT Gate (Inverter) using NOR

A NOT gate outputs the opposite of the input. To make a NOT gate from a NOR, you
simply tie the two inputs of the NOR gate together.

 Circuit: Connect input AA to both inputs of the NOR gate.


 Operation: NOR Gate: A+A‾=A‾A+A=A.
 Result: Output = A‾A (NOT A).

ii) OR Gate using NOR

An OR gate outputs 1 if at least one input is 1. To make an OR gate, you use two NOR
gates.
 Circuit: Feed inputs AA and BB into the first NOR gate. Then, feed the output of
that NOR into a second NOR gate configured as a NOT gate (inputs tied together).
 Operation:

o First NOR: A+B‾A+B


o Second NOR (as NOT): A+B‾‾=A+BA+B=A+B
 Result: Output = A+BA+B (OR).

iii) AND Gate using NOR

An AND gate outputs 1 only if both inputs are 1. To make an AND gate, you need three
NOR gates.

 Circuit: First, pass AA and BB through NOT gates (NORs with tied inputs) to
get A‾A and B‾B. Then, feed these inverted signals into a NOR gate.
 Operation:

o NOR as NOT on A: A‾A


o NOR as NOT on B: B‾B
o Final NOR: A‾+B‾‾A+B
 Boolean Identity: Using De Morgan's Theorem, A‾+B‾‾=A⋅BA+B=A⋅B.
 Result: Output = A⋅BA⋅B (AND).

3. Differentiate CISC and RISC.

CISC (Complex Instruction Set Computer) and RISC (Reduced Instruction Set Computer)
are two different design philosophies for computer architecture. Here are the key
differences:

CISC (Complex Instruction


Feature RISC (Reduced Instruction Set Computer)
Set Computer)
CISC (Complex Instruction
Feature RISC (Reduced Instruction Set Computer)
Set Computer)

Large and complex.


Includes specialized, high-
Instruction Small and simple. Contains only basic,
level instructions (e.g., a
Set general-purpose instructions.
single instruction for a
complex math operation).

Instruction Variable instruction length Fixed instruction length (usually 4 bytes / 32


Size (e.g., 1 to 15 bytes). bits).

Multiple clock cycles per


Single clock cycle per instruction
Cycles per instruction. Complex
(pipelined). Most instructions execute in
Instruction instructions may take many
one cycle.
cycles to execute.

Addressing Supports many different Supports very few, simple addressing


Modes addressing modes. modes.

Instructions can operate Only Load and Store instructions access


Memory
directly on memory. memory. All other operations work on
Access
(e.g., ADD [MEM1], [MEM2]). registers. (Load-Store Architecture).

Micro-programmed (slow).
Complex instructions are Hard-wired (fast). Simple instructions are
Control Unit
broken down into executed directly by hardware.
microcode.

Fewer general-purpose
Register Large number of general-purpose registers
registers (due to memory
Usage (typically 32 to 128).
operands).

Compiler Compiler is simpler Compiler is more complex to optimize the


Complexity because hardware provides use of simple instructions and registers.
CISC (Complex Instruction
Feature RISC (Reduced Instruction Set Computer)
Set Computer)

complex instructions.

Intel x86, AMD processors,


Examples ARM, MIPS, RISC-V, PowerPC.
Motorola 68000.

4. Explain different instruction formats with an example for each.

An instruction format defines the layout of the bits of an instruction, specifying how
the opcode and operands are organized. The most common classification is based on
the number of addresses (operands) an instruction contains.

Here are the three main types of instruction formats:

1. Three-Address Instruction Format

 Structure: OPCODE A, B, C
 Meaning: This format performs an operation on two operands (B and C) and
stores the result in a third operand (A).
 Operation: A = B OP C
 Advantage: It is very flexible and requires the fewest number of instructions to
perform complex tasks.
 Disadvantage: It requires longer instruction words to accommodate three
addresses, leading to larger program size.
 Example (Assembly-like):

o ADD R1, R2, R3 This means: R1 = R2 + R3


o MUL X, Y, Z This means: X = Y * Z

2. Two-Address Instruction Format

 Structure: OPCODE A, B
 Meaning: This format performs an operation using two operands. Typically, the
result overwrites one of the operands (usually the first one).
 Operation: A = A OP B or A = B OP A (depending on architecture).
 Advantage: It reduces the instruction length compared to three-address formats.
 Disadvantage: It destroys the original value of one of the operands, which may
require a copy instruction if the original value is needed later.
 Example (Assembly-like):

o ADD R1, R2 This means: R1 = R1 + R2 (R1 is both a source and the destination).
o MOV R3, R4 This means: R3 = R4 (copy value).

3. One-Address Instruction Format

 Structure: OPCODE A
 Meaning: This format uses a single explicit address. It implicitly uses a special
register called the accumulator (ACC) for all operations.
 Operation: ACC = ACC OP A
 Advantage: Very short instruction length.
 Disadvantage: Program length increases significantly because many instructions
are needed to load and store values from the accumulator.
 Example (Accumulator-based Machine):

o LOAD X This means: ACC = X (load value from memory location X into accumulator).
o ADD Y This means: ACC = ACC + Y (add value at Y to accumulator).
o STORE Z This means: Z = ACC (store accumulator value to memory location Z).

4. Zero-Address Instruction Format

 Structure: OPCODE
 Meaning: This format has no address fields. It is used in Stack-based
architectures. All operations implicitly use the top of the stack.
 Operation: TOS = (TOS-1) OP TOS (Pop two values from stack, apply operation,
push result back).
 Advantage: Instructions are extremely short, allowing for very dense code.
 Disadvantage: It is inefficient for random data access and requires many
push/pop instructions.
 Example (Stack Machine):

o PUSH A (Place A on top of stack)


o PUSH B (Place B on top of stack)
o ADD (Pop top two values: A and B, add them, push result back: A+B)

5. Explain the limitation of Instruction level parallelism.

Instruction Level Parallelism (ILP) is a measure of how many of the instructions in a


computer program can be executed simultaneously. While techniques like pipelining,
superscalar, and VLIW aim to exploit ILP, they face several hard limits. These limitations
can be classified into three main categories:

1. Data Dependencies (True Data Hazards)

This is the most fundamental limitation. If instruction B needs the result of instruction A,
it cannot execute at the same time as A or before A. It must wait.

 Example:

1. ADD R1, R2, R3 (R1 = R2 + R3)


2. SUB R4, R1, R5 (R4 = R1 - R5)
 Limitation: The SUB instruction depends on the value of R1 produced by the ADD. It
cannot execute until the ADD is complete. This creates a "bubble" in the pipeline
and limits parallelism.

2. Control Dependencies (Branching)


Program flow changes due to conditional branches (e.g., if-else statements, loops). The
processor cannot be sure which instructions to execute next until the branch condition is
evaluated.

 Problem: When a processor encounters a branch, it may guess (speculate) which


path to take. If it guesses wrong, it must discard all the work done on the wrong
path (a pipeline flush), wasting clock cycles.
 Limitation: As pipelines get deeper and the number of execution units increases,
the cost of a mispredicted branch becomes higher, severely limiting the effective
ILP. This is known as the "branch penalty."

3. Resource Conflicts (Structural Hazards)

Even if instructions are independent, they may compete for the same hardware
resources at the same time.

 Example: A CPU might have only one memory port. If one instruction needs to
read data from memory and another needs to write data to memory
simultaneously, they cannot both proceed. One must stall.
 Limitation: Hardware designers try to mitigate this by adding more resources
(e.g., multiple ALUs, multiple memory ports), but there are physical and cost limits
(power consumption, chip area).

4. Other Fundamental Limits (Law of Diminishing Returns)

Beyond the classical hazards, there are theoretical limits described by researchers.

 Amdahl's Law: The speedup from parallel processing is limited by the sequential
portion of the code. Even in highly parallel programs, there is always some part
that must run in order.
 Memory Wall: The speed gap between the processor and main memory is huge.
If an instruction needs data from memory, the CPU often stalls waiting for it,
regardless of how many execution units are free. This is often the biggest
bottleneck in modern systems.
 Power Wall: Increasing ILP requires more complex hardware (more execution
units, complex schedulers), which consumes exponentially more power and
generates more heat, making further scaling impractical.

6. DMA has Priority over the CPU when both request a memory
transfer. Justify your answer.

DMA (Direct Memory Access) is a feature that allows hardware devices (like disk
drives, graphics cards, network cards) to transfer data directly to and from the main
memory without involving the CPU for every byte.

When both the CPU and a DMA controller request access to the memory bus at the
same time, the DMA controller is generally given priority. This is justified by the
following reasons:

1. Speed Matching and Preventing Data Loss

I/O devices that use DMA (e.g., a hard disk or network interface) operate at a fixed
speed determined by the hardware. They have a limited buffer to hold incoming or
outgoing data.

 The Risk: If the CPU is using the memory bus and the DMA controller has to wait,
the I/O device's buffer may overflow (during input) or underflow (during output).
This results in data loss.
 The Justification: The CPU can wait for a few cycles without losing its state (it is
just paused), but an I/O device cannot pause the physical flow of data. Therefore,
DMA must be serviced immediately to maintain data integrity.

2. Efficiency of Burst Transfers


DMA controllers typically operate in "burst mode," where they transfer a large block of
data (e.g., an entire sector from a disk) in one go.

 The Benefit: If the DMA gets priority, it can quickly grab the bus, transfer the
entire block at high speed, and then release the bus back to the CPU.
 Comparison: If the CPU held priority, it would intersperse its own memory
accesses, dragging out the DMA transfer time and keeping the I/O device busy for
longer, which degrades overall system performance.

3. CPU is Designed to be Interruptible

The CPU is a programmable unit that can save its current state (registers, program
counter) and resume later. When the CPU requests memory, it is usually to fetch an
instruction or data for the current process. If the CPU is forced to wait (insert "wait
states"), it simply stalls temporarily.

 The Trade-off: Giving priority to DMA causes a slight, temporary slowdown for
the CPU (known as cycle stealing). However, this small delay is a worthwhile
trade-off to prevent a much more catastrophic failure (data loss from an I/O
device).

Conclusion: DMA gets priority because I/O devices have real-time constraints and
physical buffers that cannot wait, whereas the CPU is a flexible, stateful machine that can
tolerate brief stalls without losing data.

7. Explain the characteristics of multiprocessors.

A multiprocessor is a computer system that contains two or more processors (CPUs)


that share access to a common memory (shared memory multiprocessor). They are used
to achieve higher throughput and reliability.
Here are the key characteristics:

1. Parallelism and Throughput

 Characteristic: Multiple processors can execute multiple tasks (or threads of the
same task) simultaneously.
 Explanation: This allows the system to handle more work in a given amount of
time. For example, one processor can handle a database query while another
renders a video, vastly improving overall system throughput.

2. Shared Memory

 Characteristic: All processors share a single, global memory space.


 Explanation: Any processor can communicate with any other processor by
reading or writing to a shared memory location. This makes inter-processor
communication fast and easy for the programmer, but it requires complex
mechanisms (cache coherence protocols) to ensure all processors have a
consistent view of memory.

3. Cache Coherence

 Characteristic: The need to maintain consistent data across multiple caches.


 Explanation: If Processor A modifies a data in its local cache, Processor B should
not read the old (stale) copy from its own cache or main memory. Multiprocessors
implement hardware protocols (like MESI protocol) to ensure that all caches are
"coherent" and reflect the most recent write to any shared data location.

4. Interconnection Network

 Characteristic: A dedicated network (bus, crossbar switch, or multi-stage


network) connects the processors to each other and to memory modules.
 Explanation: This allows processors to exchange data and coordinate activities.
For example, processors may use this network to send "interrupts" to each other.
5. Scalability

 Characteristic: The ability to add more processors to increase computing power.


 Explanation: While adding more processors increases computing power, shared
memory multiprocessors face scalability limits. As the number of processors
grows, the traffic on the interconnection network and the complexity of
maintaining cache coherence can become bottlenecks.

6. Fault Tolerance and Graceful Degradation

 Characteristic: The system can continue operating even if one processor fails.
 Explanation: In a multiprocessor system, if one CPU fails, the operating system
can typically reassign its tasks to another functioning CPU. The system might run
slower (graceful degradation), but it does not crash entirely, which is crucial for
servers and critical systems.

7. Load Balancing

 Characteristic: The operating system or hardware must distribute work evenly.


 Explanation: To maximize efficiency, the workload must be spread across all
processors so that no single processor is idle while others are overloaded.

8. Explain different types of ROM.

ROM (Read-Only Memory) is a type of non-volatile memory used in computers and


electronic devices to store firmware or software that is rarely changed during the life of
the system. Over time, different types of ROM have been developed, offering varying
levels of reprogrammability.

Here are the different types:


1. Mask ROM (MROM)

 Description: This is the original type of ROM. The data is physically "hard-wired"
into the chip during the manufacturing process using a photomask.
 Programming: Programmed at the factory. It cannot be altered or
reprogrammed after manufacturing.
 Advantages: Very inexpensive for mass production.
 Disadvantages: Inflexible. If there is an error in the data, the entire batch of chips
is useless. High initial cost for creating the mask.
 Usage: Used in devices with absolutely no need for updates (e.g., simple
calculators, legacy automotive controllers).

2. Programmable ROM (PROM)

 Description: A type of ROM that is purchased blank and then programmed by


the user once.
 Programming: Programming is done by "blowing" internal fuses using a special
device called a PROM programmer. This is a physical, irreversible process.
 Advantages: More flexible than Mask ROM; users can program their own data.
 Disadvantages: Cannot be erased or reused. If a mistake is made during
programming, the chip is wasted.
 Usage: Used in early game consoles, medical devices, or prototypes where only a
few units are needed.

3. Erasable Programmable ROM (EPROM)

 Description: A type of ROM that can be programmed, erased, and


reprogrammed multiple times.
 Programming: Programmed electronically using a programmer. Erasure is done
by exposing the chip's quartz window to strong ultraviolet (UV) light for about
20-30 minutes, which resets the memory cells.
 Advantages: Reusable and erasable.
 Disadvantages: The erasure process is slow and inconvenient. The chip must be
removed from the circuit for erasing/programming. The quartz window makes
packaging expensive.
 Usage: Used during system development and debugging in the 80s and 90s.

4. Electrically Erasable Programmable ROM (EEPROM)

 Description: An evolution of EPROM that can be erased and reprogrammed


electrically, without removing the chip from the circuit or using UV light.
 Programming: Erasure and writing are done using electrical signals (higher
voltage than normal). It can erase one byte at a time.
 Advantages: Very flexible; can be updated in-circuit. No UV light needed.
 Disadvantages: Slower than other ROM types. It is relatively expensive and has a
limited lifespan (typically 100,000 to 1,000,000 write cycles).
 Usage: Used to store BIOS settings on older computer motherboards.

5. Flash ROM (or Flash Memory)

 Description: A modern, high-density variant of EEPROM. It is currently the most


common type of ROM.
 Programming: Electrically erasable and programmable. Unlike EEPROM, which
erases byte-by-byte, Flash erases data in blocks (or entire sectors), making it much
faster.
 Advantages: Fast read and write speeds, high density, low power consumption,
and durable.
 Disadvantages: Must erase entire blocks to rewrite; finite number of write cycles
(though still in the hundreds of thousands).
 Usage: Used everywhere: USB drives, SSDs, memory cards (SD, MicroSD),
BIOS/UEFI firmware in modern computers, and smartphones.

SECTION – B
9. a) Minimize the following expression using K-Map

Y=A‾BC‾D‾+A‾BCD‾+ABC‾D‾+AB‾CD‾+ABCD‾+A‾BCD+ABCDY=ABCD+
ABCD+ABCD+ABCD+ABCD+ABCD+ABCD

Note: I have rewritten the expression as it appeared to have some typographical errors in
the original transcription (duplicate terms). This corrected version contains all 7 unique
minterms.

Step 1: Identify the Minterms

The expression is in Sum of Products (SOP) form. Let's convert each term to its minterm
number (assuming A is the MSB and D is the LSB):

1. A‾BC‾D‾ABCD = 0 1 0 0 = 4₁₀
2. A‾BCD‾ABCD = 0 1 1 0 = 6₁₀
3. ABC‾D‾ABCD = 1 0 1 0 = 10₁₀
4. AB‾CD‾ABCD = 1 0 1 0 = 10₁₀ (Wait, this seems to be a duplicate of term 3—likely a
transcription error. Let's check carefully.)

Actually, let's list each term with correct binary:

Term ABCD Decim

A‾BC‾D‾ABCD 0100 4

A‾BCD‾ABCD 0110 6

ABC‾D‾ABCD 1 0 0 0 Wait, means =1, =1 o, means =1, =1 ctually


together means =1 =1 n expression , it s
Term ABCD Decim

OT OT . So =1, =1, =0, =0 → 1100₂ = 12₁₀.

Let's list all systematically:

Term 1: A‾BC‾D‾ABCD = =0, =1, =0, =0 → 0100₂ = 4


Term 2: A‾BCD‾ABCD = =0, =1, =1, =0 → 0110₂ = 6
Term 3: ABC‾D‾ABCD = =1, =1, =0, =0 → 1100₂ = 12
Term 4: AB‾CD‾ABCD = =1, =0, =1, =0 → 1010₂ = 10
Term 5: ABCD‾ABCD = =1, =1, =1, =0 → 1110₂ = 14
Term 6: A‾BCDABCD = =0, =1, =1, =1 → 0111₂ = 7
Term 7: ABCDABCD = =1, =1, =1, =1 → 1111₂ = 15

So minterms: 4, 6, 7, 10, 12, 14, 15

Step 2: Plot on K-Map (4 variables)

K-Map for 4 variables (A, B, C, D):

A‾B‾A‾BABAB‾C‾D‾04128C‾D15139CD371511CD‾261410CDCDCDCD
AB0132AB4576AB12131514AB891110

Now place 1's at minterms 4, 6, 7, 10, 12, 14, 15:

 4 ( Wait, 4 is 0100 → =0, =1, =0, =0 → row 00 ( , column 01 ( →


correct row 00, col 01 → 4 ✓
 6 0110 → =0, =1, =1, =0 → row 10 ( , col 01 ( →6✓
 7 0111 → =0, =1, =1, =1 → row 11 ( , col 01 ( →7✓
 10 1010 → =1, =0, =1, =0 → row 10 ( , col 10 ( → 10 ✓
 12 1100 → =1, =1, =0, =0 → row 00 ( , col 11 ( → 12 ✓
 14 1110 → =1, =1, =1, =0 → row 10 ( , col 11 ( → 14 ✓
 15 1111 → =1, =1, =1, =1 → row 11 ( , col 11 ( → 15 ✓

K-Map with 1's:

0001111000011001000011011010011100011110000000011011111011100001

Step 3: Grouping

Looking at the K-Map:

Row 00 cols 01 and 11 have 1 s → 4 and 12 → both have = , =1, =0, =0 Wait, 4 is
0100 (A=0, B=1, C=0, D=0), 12 is 1100 (A=1, B=1, C=0, D=0). So group: B=1, C=0, D=0
→ term BC‾D‾BCD (A cancels)

Row 10 cols 01, 11, 10 have 1 s → 6 (0110 , 14 (1110 , 10 (1010 . ctually 10 is col 10,
row 10 → yes. So in row 10 ( =1, =0 , cols 01, 11, 10 have 1 s → that s 01 =1, 11 =1,
10: B=0. So in this row, all three columns with 1's share C=1, D=0, but B varies. So
group =1, =0 → term CD‾CD (A and B cancel? Wait, check:
6: A=0,B=1,C=1,D=0
14: A=1,B=1,C=1,D=0
10: A=1,B=0,C=1,D=0
So yes, only C=1 and D=0 common. So group: CD‾CD

Row 11 cols 01 and 11 have 1 s → 7 (0111 and 15 (1111 → share =1, =1, =1 Wait,
7 =0, =1, =1, =1; 15 =1, =1, =1, =1 → share =1, =1, =1 →
term: BCDBCD (A cancels)

No other groups possible.


Step 4: Minimized Expression

From the three prime implicants:

1. BC‾D‾BCD
2. CD‾CD
3. BCDBCD

So minimized expression:

Y=BC‾D‾+CD‾+BCDY=BCD+CD+BCD

Answer 9(a): Y=BC‾D‾+CD‾+BCDY=BCD+CD+BCD

9. b) Design a combination logic circuit that forms the


arithmetic sum of three input bits.

This is a Full Adder circuit. A full adder adds three bits (A, B, and Carry-in Cin) and
produces a Sum (S) and a Carry-out (Cout).

Step 1: Truth Table


A B Cin Sum Cout

0 0 0 0 0

0 0 1 1 0

0 1 0 1 0

0 1 1 0 1

1 0 0 1 0

1 0 1 0 1

1 1 0 0 1

1 1 1 1 1

Step 2: Boolean Expressions

From truth table:

Sum = A ⊕ B ⊕ Cin
(Sum is 1 when odd number of inputs are 1)

Cout = (A ∧ B) ∨ (Cin ∧ (A ⊕ B))


Alternatively: Cout = A·B + Cin·(A ⊕ B)
Or from K-Map: Cout = A·B + B·Cin + A·Cin
Step 3: Logic Circuit Design

Using basic gates:

Sum = A ⊕ B ⊕ Cin
This requires two XOR gates:
First XOR: A ⊕ B
Second XOR: (A ⊕ B) ⊕ Cin

Cout = (A·B) + (Cin·(A ⊕ B))


This requires:
1 AND gate for A·B
1 AND gate for Cin·(A ⊕ B)
1 OR gate to combine them

Step 4: Circuit Diagram (Text Representation)


text

A -----\
XOR -----\
B -----/ XOR ----- Sum
/
Cin ------------/

A -----\
AND \
B -----/ \
OR ----- Cout
A -----\ /
XOR --/
B -----/ \
AND
Cin ---------/

Or using the alternative expression:

Cout = A·B + B·Cin + A·Cin


This is just three AND gates and one OR gate.

Answer 9(b): The circuit is a Full Adder with:

 Sum = A ⊕ B ⊕ Cin
 Cout = A·B + B·Cin + A·Cin

10. a) Construct a 4-to-1 line multiplexer using logic gates.


Explain its working procedure.
A 4-to-1 multiplexer (MUX) selects one of 4 input lines (I0,I1,I2,I3I0,I1,I2,I3) and
forwards it to a single output line (YY) based on the values of two select lines (S1,S0S1
,S0).

Truth Table

S1S1 S0S0 YY

0 0 I0I0

0 1 I1I1
S1S1 S0S0 YY

1 0 I2I2

1 1 I3I3

Boolean Expression
From the truth table, the output can be written as:

Y=S1‾⋅S0‾⋅I0+S1‾⋅S0⋅I1+S1⋅S0‾⋅I2+S1⋅S0⋅I3Y=S1⋅S0⋅I0+S1⋅S0⋅I1+S1⋅S0⋅I2+S1⋅S0
⋅I3

Logic Circuit Diagram (using AND, OR, NOT gates)

1. NOT gates to generate complements of select lines: S1‾S1 and S0‾S0


2. Four AND gates to combine each input with its corresponding select condition:
o AND1: S1‾⋅S0‾⋅I0S1⋅S0⋅I0
o AND2: S1‾⋅S0⋅I1S1⋅S0⋅I1
o AND3: S1⋅S0‾⋅I2S1⋅S0⋅I2
o AND4: S1⋅S0⋅I3S1⋅S0⋅I3
3. One OR gate to combine all AND outputs into final output YY

Working Procedure

1. Select Lines (S1,S0S1,S0) determine which input is connected to the output.


o If S1S0=00S1S0=00, only 1 is enabled → Y=I0Y=I0
o If S1S0=01S1S0=01, only 2 is enabled → Y=I1Y=I1
o If S1S0=10S1S0=10, only 3 is enabled → Y=I2Y=I2
o If S1S0=11S1S0=11, only 4 is enabled → Y=I3Y=I3
2. Data Inputs (I0,I1,I2,I3I0,I1,I2,I3) are the actual signals to be routed.
3. The selected input passes through its AND gate and appears at the output YY via the
OR gate.

Result: The MUX acts as a digitally controlled switch, routing one of four inputs to the
output based on the binary value on the select lines.

Answer 10(a): A 4-to-1 MUX is constructed using 4 AND gates, 1 OR gate, and 2 NOT
gates. It works by enabling exactly one AND gate based on the select lines, passing the
corresponding input to the output.

10. b) Implement the Boolean function using an 8:1


multiplexer.
F(A,B,C,D)=∑m(1,3,5,6)F(A,B,C,D)=∑m(1,3,5,6)

Step 1: Understanding the function

 4 variables: A,B,C,DA,B,C,D
 Minterms: 1, 3, 5, 6
 8:1 MUX has 3 select lines and 8 data inputs

We can use A,B,CA,B,C as select lines, and DD as a variable input to the data lines.
Step 2: Truth table with A, B, C as select lines
List all combinations of A,B,CA,B,C (000 to 111) and for each, determine FF as a
function of DD:

A B C Minterms covered FF in terms of D

0 0 0 m0 (0000), m1 (0001) m1 is 1, m0 is 0 → F=DF=D

0 0 1 m2 (0010), m3 (0011) m3 is 1, m2 is 0 → F=DF=D

0 1 0 m4 (0100), m5 (0101) m5 is 1, m4 is 0 → F=DF=D

0 1 1 m6 (0110), m7 (0111) m6 is 1, m7 is 0 → F=D‾F=D

1 0 0 m8 (1000), m9 (1001) none are 1 → F=0F=0

1 0 1 m10 (1010), m11 (1011) none are 1 → F=0F=0

1 1 0 m12 (1100), m13 (1101) none are 1 → F=0F=0

1 1 1 m14 (1110), m15 (1111) none are 1 → F=0F=0

Step 3: Assign MUX data inputs


8:1 MUX has:

 Select lines: A,B,CA,B,C (A = MSB of select)


 Data inputs: I0I0 to I7I7
From the table:

 I0I0 ( =0, =0, =0 → F=DF=D


 I1I1 ( =0, =0, =1 → F=DF=D
 I2I2 ( =0, =1, =0 → F=DF=D
 I3I3 ( =0, =1, =1 → F=D‾F=D
 I4I4 ( =1, =0, =0 → F=0F=0
 I5I5 ( =1, =0, =1 → F=0F=0
 I6I6 ( =1, =1, =0 → F=0F=0
 I7I7 ( =1, =1, =1 → F=0F=0

Step 4: Implementation
Connect:

 I0,I1,I2I0,I1,I2 to DD
 I3I3 to D‾D (use a NOT gate on D)
 I4,I5,I6,I7I4,I5,I6,I7 to logic 0 (ground)
Select lines A,B,CA,B,C are connected directly to the MUX select inputs.

Output FF is the MUX output.

Answer 10(b): The function is implemented by an 8:1 MUX with A,B,CA,B,C as


selects. Data inputs: I0=I1=I2=DI0=I1=I2=D, I3=D‾I3=D, I4=I5=I6=I7=0I4=I5=I6
=I7=0.
11. Discuss any two addressing modes of any generic
microprocessor with an example for each.
Addressing modes are the techniques used by a microprocessor to access operands
(data) for instructions. Different modes provide flexibility and efficiency in programming.

Here are two common addressing modes with examples:

1. Immediate Addressing Mode


In this mode, the operand itself is specified in the instruction. The data to be operated
on is part of the instruction itself, located in the instruction byte(s) immediately
following the opcode.

 No memory reference is needed to fetch the operand (other than the instruction fetch
itself).
 Fast execution because the data is immediately available.
 Limitation: The operand value is fixed at compile/assembly time and cannot be
changed dynamically; only constants can be used.

Example:
text

MOV R1, #25

 Instruction: Move the value 25 into register R1.


 Operation: The value 25 is stored in the instruction itself. The processor fetches the
instruction, extracts the value 25, and loads it into R1. No memory cycle is needed to get
the data.

Another Example:
text

ADD R2, #10

 Adds the constant 10 to the contents of register R2.


2. Direct Addressing Mode
In this mode, the instruction contains the memory address of the operand. The
effective address (the actual location of the data) is directly given in the instruction.

 Requires one memory access to fetch the operand (after fetching the instruction).
 Simple to implement and understand.
 Limitation: The address is fixed, limiting the range of addressable memory and making
dynamic data structures harder to implement. The address space is limited by the
number of bits in the address field.

Example:
text

LOAD R1, 2000

 Instruction: Load register R1 with the contents of memory location 2000.


 Operation: The processor fetches the instruction, extracts the address 2000, then
performs a memory read cycle to location 2000, and loads the data found there into R1.

Another Example:
text

STORE R3, 3000

 Stores the contents of register R3 into memory location 3000.

Comparison Summary

Feature Immediate Mode Direct Mode

Operand location In the instruction itself In memory at specified address


Feature Immediate Mode Direct Mode

Speed Fastest (no extra memory access) Slower (requires memory access)

Flexibility Constants only Fixed memory locations

Example MOV R1, #25 LOAD R1, 2000

Answer 11: Two addressing modes are:

1. Immediate Addressing – operand is in instruction (e.g., MOV R1, #25)


2. Direct Addressing – operand's memory address is in instruction (e.g., LOAD R1, 2000)

12. Explain distributed memory MIMD architecture with


neat diagram.

MIMD (Multiple Instruction stream, Multiple Data stream) is a category of parallel


computer architecture where multiple processors execute different instructions on
different data simultaneously.

Distributed Memory MIMD is a type of MIMD architecture where each processor has
its own private memory and there is no shared global memory. Processors communicate
by passing messages over an interconnection network.

Key Characteristics
1. Private Memory: Each processor can access only its own local memory directly.
Accessing another processor's memory requires explicit communication.
2. Message Passing: Processors communicate and synchronize by sending and receiving
messages through the interconnection network.
3. Scalability: This architecture scales very well to hundreds or thousands of processors
because there is no shared memory bottleneck.
4. No Cache Coherence Problem: Since there is no shared memory, the hardware does
not need to maintain cache coherence across processors (this is handled by the
programmer or software).
5. Non-Uniform Memory Access (NUMA): Accessing local memory is fast, but accessing
remote memory (via messages) is very slow.

Diagram
text

+-------------+ +-------------+
| Processor 1 | | Processor 2 |
+-------------+ +-------------+
| |
+-------------+ +-------------+
| Memory 1 | | Memory 2 |
+-------------+ +-------------+
\ /
\ /
\ /
+-------------------------+
| Interconnection Network |
| (e.g., Mesh, Hypercube, |
| Ring, Torus) |
+-------------------------+
/ \
/ \
+-------------+ +-------------+
| Processor N | | Processor 3 |
+-------------+ +-------------+
| |
+-------------+ +-------------+
| Memory N | | Memory 3 |
+-------------+ +-------------+

Working Procedure

1. Local Computation: Each processor works independently on the data stored in its own
local memory.
2. Data Sharing: If Processor 1 needs data that resides in Memory 2, it must:

o Package the data request into a message.


o Send the message through the interconnection network to Processor 2.
o Processor 2 receives the message, retrieves the data from its memory, and sends it back
in a reply message.
3. Synchronization: Processors use message passing
(e.g., send() and receive() operations) to coordinate their activities.

Advantages

 High scalability (can build systems with thousands of CPUs).


 No memory bus contention.
 Cost-effective for large systems.

Disadvantages

 Programmer must explicitly manage data distribution and communication (complex


programming).
 Message passing overhead can be high.
 Not suitable for problems requiring frequent fine-grained communication.

13. Explain three state buffer with all supporting diagrams.

A Three-State Buffer (or Tri-state Buffer) is a digital logic device that has three possible
output states: logic 0, logic 1, and High-Impedance (Hi-Z) . The Hi-Z state effectively
disconnects the output from the circuit, making it appear as if the device is not present.

It is essential for bus-oriented systems where multiple devices share a common


communication line (bus), but only one device is allowed to drive the bus at any given
time.

Symbol and Truth Table

The buffer has:

 Input (A): The data signal.


 Output (Y): The data signal or Hi-Z.
 Enable (E): The control signal.

Logic Symbol:
text

A ------|>o----- Y
|
E
(The small circle (bubble) indicates active-low enable, though active-high enables are also
common).

Truth Table (for Active-High Enable):

Enable (E) Input (A) Output (Y)

0 0 Hi-Z (High Impedance)

0 1 Hi-Z (High Impedance)

1 0 0

1 1 1

 When E = 1: The buffer is enabled. Output Y = A (the buffer acts like a normal buffer).
 When E = 0: The buffer is disabled. Output Y is disconnected (Hi-Z). It acts like an open
circuit.

Internal Circuit Diagram (Conceptual)

A tri-state buffer is typically built using MOSFET transistors.


text

Vdd
|
|
+--o--+
| PMOS|
+--|--+
|
A ----+--------o|----+---- Y
| NMOS |
| | |
+-------o|----+
NMOS
|
GND

The enable signal controls the gates of the output transistors, either connecting the
output to the input logic or disconnecting both pull-up and pull-down paths, resulting
in Hi-Z.

Application: Bus System

Tri-state buffers allow multiple devices to share a common bus.

Diagram:
text

Device 1 Device 2 Device 3


Data Data Data
| | |
+--------+ +--------+ +--------+
| Tri- | | Tri- | | Tri- |
| State | | State | | State |
| Buffer | | Buffer | | Buffer |
+--------+ +--------+ +--------+
| | |
+---------------+---------------+------- Data Bus
| | |
+--------+ +--------+ +--------+
| Enable | | Enable | | Enable |
| 1 | | 0 | | 0 |
+--------+ +--------+ +--------+
 Only one device's buffer is enabled at a time.
 All other devices have their buffers in Hi-Z state, effectively removing them from the
bus.
 This prevents multiple devices from driving the bus simultaneously, which would cause
short circuits and data corruption.

14. What is multithreading with respect to any generic


microprocessor / OS? Discuss multithreaded processor
architecture in detail with neat diagram.

Part 1: What is Multithreading?

Multithreading is a programming and execution model that allows multiple threads


(lightweight processes) to exist within the context of a single process. These threads
share the process's resources but can execute independently.

 From an OS perspective: The operating system's scheduler manages multiple threads,


switching between them to give the illusion of parallelism and to keep the CPU busy
during long operations (like I/O waits).
 From a Microprocessor perspective: It is a technique where the processor hardware
supports the execution of multiple threads by maintaining duplicate sets of architectural
state (registers, program counter) for each thread, allowing for fast switching between
threads.

The primary goal is to hide latency (e.g., cache misses, pipeline stalls) by utilizing the
processor's resources with work from another thread when one thread is stalled.
Part 2: Multithreaded Processor Architecture

A multithreaded processor is designed to handle multiple threads simultaneously at the


hardware level. There are different types, but the core idea is to keep the execution units
as busy as possible.

Key Hardware Features

1. Multiple Register Sets: The processor has multiple copies of the architectural registers
(one set per hardware thread).
2. Multiple Program Counters (PCs): It maintains a separate PC for each thread.
3. Thread Selector Logic: Hardware logic that decides which thread's instructions to fetch
and execute at any given time.

Types of Multithreading

1. Fine-Grained Multithreading: The processor switches between threads on almost


every cycle (e.g., round-robin). It interleaves instruction issuance from different threads.
This hides long pipeline stalls effectively but can slow down an individual thread.
2. Coarse-Grained Multithreading: The processor switches to another thread only when
the current thread encounters a long-latency event (e.g., a cache miss). This avoids
thread switching overhead during short stalls but can still leave the pipeline empty
during the switch.
3. Simultaneous Multithreading (SMT): The most advanced type (e.g., Intel's Hyper-
Threading). In a single cycle, the processor can issue instructions from multiple
different threads to the multiple execution units (ALUs, FPUs, etc.) of a superscalar
processor. It exploits both instruction-level parallelism (ILP) and thread-level parallelism
(TLP).

Diagram: Simultaneous Multithreading (SMT) Architecture


text

Thread 1 Thread 2 Thread 3 Thread 4


(PC, Regs) (PC, Regs) (PC, Regs) (PC, Regs)
| | | |
+------------------+-----------------+-----------------+
|
v
+-------------------------+
| Instruction Fetch & |
| Decode Unit (Multi- |
| threaded Scheduler) |
+-------------------------+
|
Instructions from Thread 1, 2, 3, 4
|
v
+---------------------------------------------+
| Execution Pipeline |
| +--------+ +--------+ +--------+ +--------+ |
| | ALU 1 | | ALU 2 | | FPU | | Load/ | |
| | | | | | | | Store | |
| +--------+ +--------+ +--------+ +--------+ |
| (Executing Thread 1's instruction) |
| (Executing Thread 3's instruction) |
| (Executing Thread 2's instruction) |
| (Executing Thread 4's instruction) |
+---------------------------------------------+
|
v
+-------------------------+
| Commit / Writeback |
+-------------------------+

Working Procedure (SMT Example)

1. The Fetch unit, in a single cycle, may fetch instructions from multiple threads (e.g.,
Thread 1 and Thread 2).
2. The Decode unit prepares these instructions for execution.
3. The Scheduler, in the same cycle, dispatches these independent instructions to the
available functional units.

o Cycle N: Thread 1's ADD goes to ALU 1, Thread 3's MUL goes to FPU.
o Cycle N+1: Thread 2's LOAD goes to Load/Store unit, Thread 4's SUB goes to ALU 2.
4. By filling the execution units with instructions from different threads, the processor
maximizes the utilization of its hardware, significantly improving overall throughput
compared to a single-threaded processor.

You might also like