Chapter 2:
Understanding Parallel Computers
Principles of Parallel Programming
First Edition
by
Calvin Lin
Lawrence Snyder
Copyright © 2009 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Figure 2.1 Logical organization of the Intel Core
Duo. The bus controller interfaces to the Front
Side Bus, which connects to RAM.
Figure 2.2 Logical organization of the AMD Dual Core Opteron.
Each processor addresses a private L2 cache; memory consistency is
provided by the System Request Interface; HyperTransport
technology connects to RAM and, possibly, other Opteron chips.
Figure 2.3
Figure 2.4 Sun Fire E25K. Eighteen boards are connected
with crossbars for address, data and response; each board
contains four UltraSPARC IV Cu processors; the snoopy
buses are shown as dashed lines.
Figure 2.5 Crossbar switch connecting four nodes. Notice the
output and input channels; crossing wires do not connect
unless a connection is shown. Each pair of nodes is directly
connected by setting one of the open circles.
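The crosspoint idea in this caption can be sketched in a few lines. This is a toy model, not the hardware in the figure: the class name Crossbar, the closed matrix, and the rule that a node drives at most one output and accepts at most one input at a time are all assumptions made for illustration.

```python
class Crossbar:
    """Toy n x n crossbar: closed[i][j] is True when node i's output
    channel is wired to node j's input channel (a "set" circle)."""

    def __init__(self, n):
        self.n = n
        self.closed = [[False] * n for _ in range(n)]

    def connect(self, src, dst):
        """Set the crosspoint (src, dst) if both channels are free."""
        if any(self.closed[src]):
            return False  # src's output channel is already in use
        if any(row[dst] for row in self.closed):
            return False  # dst's input channel is already in use
        self.closed[src][dst] = True
        return True

    def disconnect(self, src, dst):
        """Open the crosspoint again."""
        self.closed[src][dst] = False


xbar = Crossbar(4)
# 0 -> 2 and 1 -> 3 can coexist; 3 -> 2 conflicts with 0 -> 2.
results = [xbar.connect(0, 2), xbar.connect(1, 3), xbar.connect(3, 2)]
print(results)  # [True, True, False]
```

The n x n matrix also hints at the cost of the design: the number of crosspoints grows as the square of the number of nodes.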
Figure 2.6 Architecture of the Cell processor. The architecture is
designed to move data: the high-speed I/O controllers have a capacity of
76.8 GB/s; each of the two channels to RAM runs at 12.8 GB/s; the
EIB is theoretically capable of 204.8 GB/s.
Figure 2.7 Logical organization of a
BlueGene/L node.
Figure 2.8 BlueGene/L communication networks: (a) 3D
torus for standard interprocessor data transfer; (b) collective
network for fast evaluation of reductions.
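A rough sketch of the two ideas named in this caption: neighbor addressing in a 3D torus, and a tree-shaped reduction. The function names, the 4 x 4 x 4 dimensions, and the pairwise combining order are assumptions for illustration; they are not taken from the BlueGene/L design.

```python
import operator


def torus_neighbors(coord, dims):
    """Return the six neighbors of coord in a 3D torus.
    The modulo gives the wraparound links at the edges."""
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % dims[axis]
            neighbors.append(tuple(n))
    return neighbors


def tree_reduce(values, op):
    """Combine a non-empty sequence pairwise, level by level.
    Returns the result and the number of combining rounds."""
    level = list(values)
    rounds = 0
    while len(level) > 1:
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd element passes through unchanged
        level = nxt
        rounds += 1
    return level[0], rounds


corner = torus_neighbors((0, 0, 0), (4, 4, 4))
total, rounds = tree_reduce(range(1, 9), operator.add)
print(corner)
print(total, rounds)  # 36 3
```

Even a corner node has six neighbors because of the wraparound, and eight values are combined in three rounds rather than seven sequential steps.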
Figure 2.9 Two searching computations:
(a) linear search, (b) binary search.
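The two computations named in this caption can be sketched as follows. The figure itself is not reproduced here; the function names and the probe counter are assumptions added to make the difference in memory references visible.

```python
def linear_search(items, target):
    """Scan left to right. Returns (index or -1, elements examined)."""
    probes = 0
    for i, item in enumerate(items):
        probes += 1
        if item == target:
            return i, probes
    return -1, probes


def binary_search(sorted_items, target):
    """Halve the search range each step; requires sorted input.
    Returns (index or -1, elements examined)."""
    lo, hi = 0, len(sorted_items) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_items[mid] == target:
            return mid, probes
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, probes


data = list(range(0, 32, 2))  # 16 sorted even numbers: 0, 2, ..., 30
lin = linear_search(data, 30)
bis = binary_search(data, 30)
print(lin)  # (15, 16)
print(bis)  # (15, 5)
```

Linear search touches consecutive locations, while binary search jumps around the array, so the two differ in access pattern as well as in probe count.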
Figure 2.10
Figure 2.11 Common topologies used for
interconnection networks: (a) 2-D torus, (b) binary 3-cube
(see Exercise 8), (c) fat tree, (d) omega network.
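As one concrete case, the binary 3-cube in (b) can be described with bit operations. This sketch assumes the usual labeling in which each node is a 3-bit number and two nodes are linked when their labels differ in exactly one bit; the function names are hypothetical.

```python
def hypercube_neighbors(node, dim=3):
    """Neighbors of node in a binary dim-cube: flip one bit at a time."""
    return [node ^ (1 << bit) for bit in range(dim)]


def hop_distance(a, b):
    """Minimum number of links between a and b: the count of differing bits."""
    return bin(a ^ b).count("1")


nbrs = hypercube_neighbors(0b101)
far = hop_distance(0b000, 0b111)
print(nbrs)  # [4, 7, 1]
print(far)   # 3
```

Each node has exactly dim links, and the longest shortest path also has length dim, which here is 3 for a cube of 8 nodes.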
Table 2.1 Estimates for λ for common architectures;
speeds generally do not include congestion or other
traffic delays.