
Assignment 01 Solution

The document provides a tutorial solution for an assignment involving CPU performance calculations, comparing execution times, clock cycles, and CPI for different processors. It includes detailed equations and examples to illustrate how to determine the performance of various computer architectures based on their clock rates and instruction counts. Additionally, it discusses the impact of die area and yield on manufacturing costs.

Uploaded by

mohamed.kecir
Chapter A. Tutorial Solution

A.1 Assignment 01 (2025)


A.1.1 Exercise 01
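1. Let’s first find the number of clock cycles required for the program on A (see equation 8, slide 84):

\[
\begin{aligned}
\text{CPU execution time}_A &= \frac{\text{CPU clock cycles}_A}{\text{Clock rate}_A} \\
10\ \text{seconds} &= \frac{\text{CPU clock cycles}_A}{2 \times 10^{9}\ \frac{\text{cycles}}{\text{second}}} \\
\text{CPU clock cycles}_A &= 10\ \text{seconds} \times 2 \times 10^{9}\ \frac{\text{cycles}}{\text{second}} = 20 \times 10^{9}\ \text{cycles}
\end{aligned}
\tag{A.1}
\]

CPU time for B can be found using this equation:

\[
\begin{aligned}
\text{CPU execution time}_B &= \frac{1.2 \times \text{CPU clock cycles}_A}{\text{Clock rate}_B} \\
6\ \text{seconds} &= \frac{1.2 \times 20 \times 10^{9}\ \text{cycles}}{\text{Clock rate}_B} \\
\text{Clock rate}_B &= \frac{1.2 \times 20 \times 10^{9}\ \text{cycles}}{6\ \text{seconds}} = \frac{0.2 \times 20 \times 10^{9}\ \text{cycles}}{\text{second}} = \frac{4 \times 10^{9}\ \text{cycles}}{\text{second}} = 4\ \text{GHz}
\end{aligned}
\tag{A.2}
\]
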

To run the program in 6 seconds, B must have twice the clock rate of A.
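
As a quick numerical check of part 1, the same two steps can be sketched in Python. The variable names are illustrative choices, not taken from the slides; the numbers are the ones used in the equations above.

```python
# Figures from part 1: A runs the program in 10 s at 2 GHz;
# B needs 1.2 times as many cycles and must finish in 6 s.
time_a_s = 10.0
clock_rate_a_hz = 2e9
time_b_s = 6.0
cycle_factor_b = 1.2

cycles_a = time_a_s * clock_rate_a_hz      # cycles = time x clock rate
cycles_b = cycle_factor_b * cycles_a
clock_rate_b_hz = cycles_b / time_b_s      # clock rate = cycles / time

print(f"cycles on A: {cycles_a:.3g}")
print(f"clock rate of B: {clock_rate_b_hz / 1e9:.2f} GHz")
```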

2. We know that each computer executes the same number of instructions for the program; let’s
call this number I. First, find the number of processor clock cycles for each computer:

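\[
\begin{aligned}
\text{CPU clock cycles}_A &= I \times 2.0 \\
\text{CPU clock cycles}_B &= I \times 1.2
\end{aligned}
\tag{A.3}
\]

Now we can compute the CPU time for each computer (see equation 7, slide 84):

\[
\begin{aligned}
\text{CPU execution time}_A &= \text{CPU clock cycles}_A \times \text{Clock cycle time} \\
&= I \times 2.0 \times 250\ \text{ps} = 500 \times I\ \text{ps}
\end{aligned}
\tag{A.4}
\]

Likewise, for B:

\[
\text{CPU execution time}_B = I \times 1.2 \times 500\ \text{ps} = 600 \times I\ \text{ps}
\tag{A.5}
\]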

Clearly, computer A is faster. The amount faster is given by the ratio of the execution times:

\[
\frac{\text{CPU performance}_A}{\text{CPU performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = \frac{600 \times I\ \text{ps}}{500 \times I\ \text{ps}} = 1.2
\tag{A.6}
\]
We can conclude that computer A is 1.2 times as fast as computer B for this program.
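
The comparison in part 2 can be reproduced with a small sketch. The helper `cpu_time_ps` is a hypothetical name introduced here; since the instruction count I cancels in the ratio, any positive value can stand in for it.

```python
def cpu_time_ps(instruction_count, cpi, cycle_time_ps):
    """CPU time = instruction count x CPI x clock cycle time (in ps)."""
    return instruction_count * cpi * cycle_time_ps

instruction_count = 1  # placeholder for I; it cancels in the ratio

time_a = cpu_time_ps(instruction_count, cpi=2.0, cycle_time_ps=250)
time_b = cpu_time_ps(instruction_count, cpi=1.2, cycle_time_ps=500)

# Performance is the inverse of execution time, so A's advantage is
# time_b / time_a.
speedup_a_over_b = time_b / time_a
print(f"A: {time_a} ps per instruction, B: {time_b} ps per instruction")
print(f"A is {speedup_a_over_b:.2f} times as fast as B")
```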

3. Sequence 1 executes 2+1+2 = 5 instructions. Sequence 2 executes 4+1+1 = 6 instructions.


Therefore, sequence 1 executes fewer instructions.
We can use the equation for CPU clock cycles based on the instruction count and CPI to find
the total number of clock cycles for each sequence:

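\[
\text{CPU clock cycles} = \sum_{i=1}^{n} (\text{CPI}_i \times C_i)
\tag{A.7}
\]

This yields

\[
\begin{aligned}
\text{CPU clock cycles}_1 &= (2 \times 1) + (1 \times 2) + (2 \times 3) = 2 + 2 + 6 = 10\ \text{cycles} \\
\text{CPU clock cycles}_2 &= (4 \times 1) + (1 \times 2) + (1 \times 3) = 4 + 2 + 3 = 9\ \text{cycles}
\end{aligned}
\tag{A.8}
\]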
So code sequence 2 is faster, even though it executes one extra instruction. Since code
sequence 2 takes fewer overall clock cycles but has more instructions, it must have a lower
CPI. The CPI values can be computed by

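\[
\begin{aligned}
\text{CPI} &= \frac{\text{CPU clock cycles}}{\text{Instruction count}} \\
\text{CPI}_1 &= \frac{\text{CPU clock cycles}_1}{\text{Instruction count}_1} = \frac{10}{5} = 2.0 \\
\text{CPI}_2 &= \frac{\text{CPU clock cycles}_2}{\text{Instruction count}_2} = \frac{9}{6} = 1.5
\end{aligned}
\tag{A.9}
\]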
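
The per-class sum in part 3 can be sketched as follows. The list layout (one entry per instruction class, with CPIs of 1, 2 and 3) is read off the terms of the cycle-count equations; the function name is an illustrative choice.

```python
# CPI of each instruction class, as used in the cycle-count sums.
cpi_per_class = [1, 2, 3]

# Instruction counts per class for the two code sequences.
sequence_1 = [2, 1, 2]
sequence_2 = [4, 1, 1]

def total_cycles(counts, cpis):
    """Sum of CPI_i x C_i over all instruction classes."""
    return sum(cpi * count for cpi, count in zip(cpis, counts))

for name, counts in (("sequence 1", sequence_1), ("sequence 2", sequence_2)):
    cycles = total_cycles(counts, cpi_per_class)
    instructions = sum(counts)
    print(f"{name}: {instructions} instructions, {cycles} cycles, "
          f"CPI = {cycles / instructions:.1f}")
```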
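4. From equation 9, slide 86:

\[
\begin{aligned}
\text{CPU time}_{old} &= \text{IC}_{old} \times \text{CPI}_{old} \times \text{Clock cycle time} \\
\text{CPU time}_{new} &= \text{IC}_{new} \times \text{CPI}_{new} \times \text{Clock cycle time}
\end{aligned}
\tag{A.10}
\]

The clock cycle time is the same because it is the same computer:

\[
\begin{aligned}
\frac{\text{CPU time}_{old}}{\text{IC}_{old} \times \text{CPI}_{old}} &= \frac{\text{CPU time}_{new}}{\text{IC}_{old} \times 0.6 \times \text{CPI}_{old} \times 1.1} \\
15 &= \frac{\text{CPU time}_{new}}{0.6 \times 1.1} \\
\text{CPU time}_{new} &= 15 \times 0.6 \times 1.1 = 9.9\ \text{seconds}
\end{aligned}
\tag{A.11}
\]
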


Therefore, b is the correct answer.
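5. a. The clock rate is the inverse of the clock cycle time.

\[
\begin{aligned}
\text{P1} &= \frac{1}{0.33 \times 10^{-9}\ \text{seconds}} \approx 3\ \text{GHz} \\
\text{P2} &= \frac{1}{0.40 \times 10^{-9}\ \text{seconds}} = 2.5\ \text{GHz} \\
\text{P3} &= \frac{1}{0.25 \times 10^{-9}\ \text{seconds}} = 4\ \text{GHz}
\end{aligned}
\tag{A.12}
\]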
Thus, P3 has the highest clock rate.

b. Since all have the same instruction set architecture, all programs have the same instruction count, so we can measure performance as the product of the average clock cycles per instruction (CPI) and the clock cycle time, which is also the average time per instruction:
i. P1 = 1.5 × 0.33 ns = 0.495 ns (you could also calculate the average instruction time using CPI/clock rate, or 1.5/3.0 GHz ≈ 0.5 ns; the small difference comes from rounding the cycle time to 0.33 ns)

ii. P2 = 1.0 × 0.40 ns = 0.400 ns (or 1.0/2.5 GHz = 0.400 ns)

iii. P3 = 2.2 × 0.25 ns = 0.550 ns (or 2.2/4.0 GHz = 0.550 ns)

P2 is the fastest and P3 is the slowest. Despite having the highest clock rate, on average
P3 takes so many more clock cycles that it loses the benefit of a higher clock rate.

c. The CPI calculation was based on running some benchmarks. If they are representative of real workloads, the answers to these questions are correct. If the benchmarks are unrealistic, they may not be. The gap between figures that are easy to advertise, such as clock rate, and actual performance highlights the importance of developing good benchmarks.
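
For parts 5a and 5b, a minimal sketch of the two calculations is given below. The dictionary layout and names are illustrative; the CPI and cycle-time values are the ones quoted above.

```python
# (CPI, clock cycle time in ns) for each processor, as quoted in part 5.
processors = {"P1": (1.5, 0.33), "P2": (1.0, 0.40), "P3": (2.2, 0.25)}

# Part a: clock rate in GHz is the inverse of the cycle time in ns.
clock_rate_ghz = {name: 1 / cycle_ns for name, (_, cycle_ns) in processors.items()}

# Part b: average time per instruction = CPI x clock cycle time.
avg_instr_time_ns = {name: cpi * cycle_ns for name, (cpi, cycle_ns) in processors.items()}

highest_clock = max(clock_rate_ghz, key=clock_rate_ghz.get)
fastest = min(avg_instr_time_ns, key=avg_instr_time_ns.get)
slowest = max(avg_instr_time_ns, key=avg_instr_time_ns.get)

for name in processors:
    print(f"{name}: {clock_rate_ghz[name]:.2f} GHz, "
          f"{avg_instr_time_ns[name]:.3f} ns per instruction")
print(f"highest clock rate: {highest_clock}, fastest: {fastest}, slowest: {slowest}")
```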

A.1.2 Exercise 02
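1. For the three processors, we have the clock rate and the CPI:

\[
\begin{aligned}
\text{Clock rate} &= \frac{\text{Clock cycles}}{\text{Second}} \\
\text{CPI} &= \frac{\text{Clock cycles}}{\text{Instruction}} \\
\text{Clock rate} \times \text{Second} &= \text{CPI} \times \text{Instruction} \\
\frac{\text{Instruction}}{\text{Second}} &= \frac{\text{Clock rate}}{\text{CPI}}
\end{aligned}
\tag{A.13}
\]

\[
\begin{aligned}
\text{performance of P1 (instructions/sec)} &= \frac{3 \times 10^{9}}{1.5} = 2 \times 10^{9} \\
\text{performance of P2 (instructions/sec)} &= \frac{2.5 \times 10^{9}}{1.0} = 2.5 \times 10^{9} \\
\text{performance of P3 (instructions/sec)} &= \frac{4 \times 10^{9}}{2.2} \approx 1.82 \times 10^{9}
\end{aligned}
\tag{A.14}
\]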
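2. The number of cycles is the execution time (10 seconds) multiplied by the clock rate, and the instruction count is the number of cycles divided by the CPI:

\[
\begin{aligned}
\text{cycles(P1)} &= 10 \times 3 \times 10^{9} = 30 \times 10^{9}\ \text{cycles} \\
\text{cycles(P2)} &= 10 \times 2.5 \times 10^{9} = 25 \times 10^{9}\ \text{cycles} \\
\text{cycles(P3)} &= 10 \times 4 \times 10^{9} = 40 \times 10^{9}\ \text{cycles}
\end{aligned}
\tag{A.15}
\]

\[
\begin{aligned}
\text{IC(P1)} &= \frac{30 \times 10^{9}}{1.5} = 20 \times 10^{9} \\
\text{IC(P2)} &= \frac{25 \times 10^{9}}{1} = 25 \times 10^{9} \\
\text{IC(P3)} &= \frac{40 \times 10^{9}}{2.2} \approx 18.18 \times 10^{9}
\end{aligned}
\tag{A.16}
\]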
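
Parts 1 and 2 of this exercise use the same three inputs per processor, so they can be checked together. The sketch below assumes the 10-second run time that appears in the cycle counts; all names are illustrative.

```python
# (clock rate in Hz, CPI) for each processor.
processors = {"P1": (3e9, 1.5), "P2": (2.5e9, 1.0), "P3": (4e9, 2.2)}
run_time_s = 10  # execution time used in the cycle counts

results = {}
for name, (clock_rate_hz, cpi) in processors.items():
    instr_per_sec = clock_rate_hz / cpi    # part 1: clock rate / CPI
    cycles = run_time_s * clock_rate_hz    # part 2: time x clock rate
    instruction_count = cycles / cpi       # part 2: cycles / CPI
    results[name] = (instr_per_sec, cycles, instruction_count)
    print(f"{name}: {instr_per_sec:.3g} instr/s, {cycles:.3g} cycles, "
          f"{instruction_count:.4g} instructions")
```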

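3. The CPI increases by a factor of 1.2 and the target execution time is 7 seconds, with the instruction counts unchanged:

\[
\begin{aligned}
\text{CPI}_{new} &= \text{CPI}_{old} \times 1.2 \\
\text{CPI(P1)} &= 1.8 \\
\text{CPI(P2)} &= 1.2 \\
\text{CPI(P3)} &= 2.64
\end{aligned}
\tag{A.17}
\]

\[
\begin{aligned}
\text{Clock rate} &= \frac{\text{IC} \times \text{CPI}}{\text{time}} \\
\text{Clock rate(P1)} &= \frac{20 \times 10^{9} \times 1.8}{7} \approx 5.14\ \text{GHz} \\
\text{Clock rate(P2)} &= \frac{25 \times 10^{9} \times 1.2}{7} \approx 4.29\ \text{GHz} \\
\text{Clock rate(P3)} &= \frac{18.18 \times 10^{9} \times 2.64}{7} \approx 6.86\ \text{GHz}
\end{aligned}
\tag{A.18}
\]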
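
The required clock rates in part 3 can be sketched directly from the formula clock rate = IC × CPI / time. The code below carries no intermediate rounding, so its results may differ in the last digit from values obtained with hand-rounded CPIs; the names and the 10 s and 7 s run times are taken from the surrounding equations, not from a problem statement.

```python
# (clock rate in Hz, CPI) before the change.
processors = {"P1": (3e9, 1.5), "P2": (2.5e9, 1.0), "P3": (4e9, 2.2)}
old_time_s = 10.0
new_time_s = 7.0
cpi_factor = 1.2

new_clock_rate_hz = {}
for name, (clock_rate_hz, cpi) in processors.items():
    instruction_count = old_time_s * clock_rate_hz / cpi  # unchanged by the redesign
    new_cpi = cpi * cpi_factor
    new_clock_rate_hz[name] = instruction_count * new_cpi / new_time_s
    print(f"{name}: new CPI {new_cpi:.2f}, "
          f"required clock rate {new_clock_rate_hz[name] / 1e9:.2f} GHz")
```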

A.1.3 Exercise 03
1. Class A: $10^5$ instructions, Class B: $2 \times 10^5$ instructions, Class C: $5 \times 10^5$ instructions, Class D: $2 \times 10^5$ instructions.
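\[
\begin{aligned}
\text{Time} &= \frac{\text{IC} \times \text{CPI}}{\text{clock rate}} \\
\text{Total time P1} &= \frac{10^{5} \times 1 + 2 \times 10^{5} \times 2 + 5 \times 10^{5} \times 3 + 2 \times 10^{5} \times 3}{2.5 \times 10^{9}} = 10.4 \times 10^{-4}\ \text{s} \\
\text{Total time P2} &= \frac{10^{5} \times 2 + 2 \times 10^{5} \times 2 + 5 \times 10^{5} \times 2 + 2 \times 10^{5} \times 2}{3 \times 10^{9}} \approx 6.67 \times 10^{-4}\ \text{s}
\end{aligned}
\tag{A.19}
\]

2.

\[
\begin{aligned}
\text{CPI(P1)} &= \frac{10.4 \times 10^{-4} \times 2.5 \times 10^{9}}{10^{6}} = 2.6 \\
\text{CPI(P2)} &= \frac{6.67 \times 10^{-4} \times 3 \times 10^{9}}{10^{6}} \approx 2.0
\end{aligned}
\tag{A.20}
\]

3.

\[
\begin{aligned}
\text{clock cycles(P1)} &= 10^{5} \times 1 + 2 \times 10^{5} \times 2 + 5 \times 10^{5} \times 3 + 2 \times 10^{5} \times 3 = 26 \times 10^{5} \\
\text{clock cycles(P2)} &= 10^{5} \times 2 + 2 \times 10^{5} \times 2 + 5 \times 10^{5} \times 2 + 2 \times 10^{5} \times 2 = 20 \times 10^{5}
\end{aligned}
\]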
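
The three parts of this exercise share one computation: sum the cycles per class, then divide by the clock rate or by the instruction count. A minimal sketch follows, with the per-class CPIs read off the terms of the sums above and all names chosen for illustration.

```python
# Instruction counts per class (A, B, C, D).
instruction_counts = [100_000, 200_000, 500_000, 200_000]

# (clock rate in Hz, CPI per class A..D) for each processor.
processors = {
    "P1": (2.5e9, [1, 2, 3, 3]),
    "P2": (3e9, [2, 2, 2, 2]),
}

total_instructions = sum(instruction_counts)
summary = {}
for name, (clock_rate_hz, cpis) in processors.items():
    cycles = sum(n * cpi for n, cpi in zip(instruction_counts, cpis))
    time_s = cycles / clock_rate_hz
    average_cpi = cycles / total_instructions
    summary[name] = (cycles, time_s, average_cpi)
    print(f"{name}: {cycles} cycles, {time_s:.3e} s, average CPI {average_cpi:.1f}")
```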

A.1.4 Exercise 04
I forgot to mention that N = 2.
From equations 2 and 3 slide 78.
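1.

\[
\begin{aligned}
\text{die area}_{15\,\text{cm}} &= \frac{\text{wafer area}}{\text{dies per wafer}} = \frac{\pi \times 7.5^{2}}{84} = 2.10\ \text{cm}^2 \\
\text{yield}_{15\,\text{cm}} &= \frac{1}{\left(1 + \left(0.020 \times \frac{2.10}{2}\right)\right)^{2}} = 0.9593 \\
\text{die area}_{20\,\text{cm}} &= \frac{\text{wafer area}}{\text{dies per wafer}} = \frac{\pi \times 10^{2}}{100} = 3.14\ \text{cm}^2 \\
\text{yield}_{20\,\text{cm}} &= \frac{1}{\left(1 + \left(0.031 \times \frac{3.14}{2}\right)\right)^{2}} = 0.9093
\end{aligned}
\tag{A.21}
\]
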
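2. From equation 1, slide 78:

\[
\begin{aligned}
\text{cost/die}_{15\,\text{cm}} &= \frac{12}{84 \times 0.9593} = 0.1489 \\
\text{cost/die}_{20\,\text{cm}} &= \frac{15}{100 \times 0.9093} = 0.1650
\end{aligned}
\tag{A.22}
\]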
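
Parts 1 and 2 chain three formulas (die area, yield, cost per die), which the sketch below strings together. The function names are hypothetical, and the exponent is taken as 2, which is how the worked equations use it. Because the code does not round the die area before computing the yield, the last digit can differ slightly from the hand-computed values.

```python
import math

def die_area_cm2(wafer_diameter_cm, dies_per_wafer):
    """Wafer area divided evenly among the dies."""
    wafer_area = math.pi * (wafer_diameter_cm / 2) ** 2
    return wafer_area / dies_per_wafer

def die_yield(defects_per_cm2, area_cm2, n=2):
    """Yield = 1 / (1 + defects x area / 2)^n, with n = 2 as assumed here."""
    return 1 / (1 + defects_per_cm2 * area_cm2 / 2) ** n

def cost_per_die(wafer_cost, dies_per_wafer, yield_fraction):
    """Wafer cost spread over the good dies only."""
    return wafer_cost / (dies_per_wafer * yield_fraction)

# (diameter in cm, dies per wafer, defects per cm^2, wafer cost)
wafers = {"15 cm": (15, 84, 0.020, 12), "20 cm": (20, 100, 0.031, 15)}

results = {}
for name, (diameter, dies, defects, cost) in wafers.items():
    area = die_area_cm2(diameter, dies)
    y = die_yield(defects, area)
    results[name] = (area, y, cost_per_die(cost, dies, y))
    print(f"{name}: area {area:.2f} cm^2, yield {y:.4f}, "
          f"cost/die {results[name][2]:.4f}")
```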
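3. With 10% more dies per wafer and 15% more defects per unit area:

\[
\begin{aligned}
\text{die area}_{15\,\text{cm}} &= \frac{\text{wafer area}}{\text{dies per wafer}} = \frac{\pi \times 7.5^{2}}{84 \times 1.1} = 1.91\ \text{cm}^2 \\
\text{yield}_{15\,\text{cm}} &= \frac{1}{\left(1 + \left(0.020 \times 1.15 \times \frac{1.91}{2}\right)\right)^{2}} = 0.9575 \\
\text{die area}_{20\,\text{cm}} &= \frac{\text{wafer area}}{\text{dies per wafer}} = \frac{\pi \times 10^{2}}{100 \times 1.1} = 2.86\ \text{cm}^2 \\
\text{yield}_{20\,\text{cm}} &= \frac{1}{\left(1 + \left(0.031 \times 1.15 \times \frac{2.86}{2}\right)\right)^{2}} = 0.9053
\end{aligned}
\tag{A.23}
\]
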

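4. Solving the yield equation for the defect density:

\[
\begin{aligned}
\text{defects per area}_{0.92} &= \frac{1 - \sqrt{y}}{\sqrt{y} \times \frac{\text{die area}}{2}} = \frac{1 - \sqrt{0.92}}{\sqrt{0.92} \times \frac{2}{2}} = 0.043\ \text{defects/cm}^2 \\
\text{defects per area}_{0.95} &= \frac{1 - \sqrt{y}}{\sqrt{y} \times \frac{\text{die area}}{2}} = \frac{1 - \sqrt{0.95}}{\sqrt{0.95} \times \frac{2}{2}} = 0.026\ \text{defects/cm}^2
\end{aligned}
\tag{A.24}
\]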
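
The inversion used in part 4 can be checked by computing the defect density and then feeding it back into the yield formula. This is a sketch under the same assumption as the equations (exponent 2, die area of 2 cm² as it appears in the denominators); the function names are illustrative.

```python
import math

def defects_per_cm2(target_yield, die_area):
    """Invert yield = 1 / (1 + d * area / 2)^2 for the defect density d."""
    root = math.sqrt(target_yield)
    return (1 - root) / (root * die_area / 2)

def die_yield(defect_density, die_area):
    """Forward yield formula, used here only as a round-trip check."""
    return 1 / (1 + defect_density * die_area / 2) ** 2

die_area = 2.0  # cm^2, as in the denominators of the worked equations

for target in (0.92, 0.95):
    density = defects_per_cm2(target, die_area)
    print(f"yield {target}: {density:.3f} defects/cm^2 "
          f"(round trip gives {die_yield(density, die_area):.2f})")
```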
