UltraScale Clocking Resources Guide
Overview
Introduction to the UltraScale Architecture
Clocking Overview
Clocking Architecture Overview
Clocking Differences from Previous FPGA Generations
Clocking Resources
Overview
Global Clock Inputs
Byte Clock Inputs
Clock Buffers and Clock Routing
Clock Management Tile
Overview
MMCMs
PLLs
Dynamic Reconfiguration Port
VHDL and Verilog Templates and the Clocking Wizard
Clocking Guidelines
Additional Resources and Legal Notices
Finding Additional Documentation
Support Resources
References
Revision History
Please Read: Important Legal Notices
Overview
Introduction to the UltraScale Architecture
The AMD UltraScale™ architecture is the first ASIC-class architecture to enable multi-hundred gigabit-
per-second levels of system performance with smart processing, while efficiently routing and
processing data on-chip. UltraScale architecture-based devices address a vast spectrum of high-
bandwidth, high-utilization system requirements by using industry-leading technical innovations,
including next-generation routing, ASIC-like clocking, 3D ICs, multiprocessor SoC (MPSoC)
technologies, and new power reduction features. The devices share many building blocks, providing
scalability across process nodes and product families to leverage system-level investment across
platforms.
AMD Spartan™ UltraScale+™ devices provide high I/O to logic ratio, integrated memory controllers, and
advanced I/O that help Industrial, Vision and Healthcare (IVH), Audio, Video and Broadcast (AVB),
Automotive, and other FPGA application developers looking to build cost-optimized solutions. Available
in a wide array of packaging options, this family delivers a balance of cost, power, performance, and
size.
AMD Artix™ UltraScale+™ devices provide high serial bandwidth and signal compute density in a cost-
optimized device for critical networking applications, vision and video processing, and secured
connectivity. Coupled with the innovative InFO packaging, which provides excellent thermal and power
distribution, Artix UltraScale+ FPGAs are perfectly suited to applications requiring high compute density
in a small footprint.
AMD Kintex™ UltraScale+™ devices provide the best price/performance/watt balance in a 16 nm FinFET
node, delivering the most cost-effective solution for high-end capabilities, including transceiver and
memory interface line rates as well as 100G connectivity cores. Our newest mid-range family is ideal
for both packet processing and DSP-intensive functions and is well suited for applications including
wireless MIMO technology, Nx100G networking, and data center.
AMD Kintex™ UltraScale™ devices provide the best price/performance/watt at 20 nm and include the
highest signal processing bandwidth in a mid-range device, next-generation transceivers, and low-cost
packaging for an optimum blend of capability and cost-effectiveness. The family is ideal for packet
processing in 100G networking and data center applications as well as DSP-intensive processing
needed in next-generation medical imaging, 8k4k video, and heterogeneous wireless infrastructure.
AMD Virtex™ UltraScale+™ devices provide the highest performance and integration capabilities in a 16
nm FinFET node, including both the highest serial I/O and signal processing bandwidth, as well as the
highest on-chip memory density. As the industry's most capable FPGA family, the Virtex UltraScale+
devices are ideal for applications including 1+Tb/s networking and data center and fully integrated
radar/early-warning systems.
Virtex UltraScale devices provide the greatest performance and integration at 20 nm, including serial
I/O bandwidth and logic capacity. As the industry's only high-end FPGA at the 20 nm process node, this
family is ideal for applications including 400G networking, large scale ASIC prototyping, and emulation.
AMD Zynq™ UltraScale+™ devices provide 64-bit processor scalability while combining real-time control
with soft and hard engines for graphics, video, waveform, and packet processing. Integrating an Arm®-based
system for advanced analytics and on-chip programmable logic for task acceleration creates
unlimited possibilities for applications including 5G Wireless, next generation ADAS, and industrial
Internet-of-Things.
The UltraScale architecture documentation suite is available at [Link].
Clocking Overview
This chapter provides an overview of clocking and a comparison between clocking in the UltraScale
architecture and previous FPGA generations. For detailed information on usage of clocking resources,
see Clocking Resources and Clock Management Tile. For more information refer to the Clocking
Guidelines section in the UltraFast Design Methodology Guide for FPGAs and SoCs (UG949).
Clocking Architecture Overview
The device is subdivided into columns and rows of segmented clock regions (CRs). CRs differ
from previous families because they are arranged in tiles and do not span half the width of a
device. A CR contains configurable logic blocks (CLBs), DSP slices, block RAMs, interconnect, and
associated clocking. The height of a CR is 60 CLBs, 24 DSP slices, and 12 block RAMs with a
horizontal clock spine (HCS) at its center. The HCS contains the horizontal routing and distribution
resources, leaf clock buffers, clock network interconnections, and the root of the clock network.
Clock buffers drive directly into the HCS. There are 42 to 66 I/Os per bank, depending on bank
type, and four gigabit transceivers (GTs) that are pitch matched to the CRs. A core column
contains configuration, System Monitor (SYSMON), and PCIe® blocks to complete a basic device.
Adjacent to the input/output block columns are the physical layer (PHY) blocks with CMTs, global
clock buffers, global clock multiplexing structures, and I/O logic management functions. The
clocking drives vertical and horizontal connectivity through separate clock routing and clock
distribution resources via HCS into the CRs and I/Os.
Horizontal clock routing and distribution tracks drive horizontally into the CRs. Vertical routing and
distribution tracks drive vertically adjacent CRs. The tracks are segmentable at the CR boundaries
in both the horizontal and vertical directions. This allows for the creation of device-wide global
clocks or local clocks of variable size.
The distribution tracks drive the clocking of synchronous elements across the device. Distribution
tracks are driven by routing tracks or directly by the clocking structures in the PHY.
I/Os are directly driven from the PHY clocking and/or an adjacent PHY via routing tracks.
A CMT contains one mixed-mode clock manager (MMCM) and two phase-locked loops (PLLs).
Each I/O bank contains global clock input pins to bring user clocks onto the device clock management
and routing resources. The global clock inputs bring user clocks onto:
Each device has three global clock buffers: BUFGCTRL, BUFGCE, and BUFGCE_DIV. In addition, there is
a local BUFCE_LEAF clock buffer for driving leaf clocks from horizontal distribution to various blocks in
the device. BUFGCTRL has derivative software representations of types BUFGMUX, BUFGMUX_1,
BUFGMUX_CTRL, and BUFGCE_1. BUFGCE is for glitchless clock gating and has software derivative
BUFG (BUFGCE with clock enable tied High). The global clock buffers drive routing and distribution
tracks into the device logic via HCS rows. There are 24 routing and 24 distribution tracks in each HCS
row. There is also a BUFG_GT that generates divided clocks for GT clocking. The clock buffers:
• Can be used as a clock enable circuit to enable or disable clocks either globally, locally, or within a
  CR for fine-grained power control.
• Can be used as a glitch-free multiplexer to:
  • Select between two clock sources.
  • Switch away from a failed clock source.
• Are often driven by a CMT to:
  • Eliminate the clock distribution delay.
  • Adjust clock delay relative to another clock.
Clocking Resources has further details on global clocks, I/O, and GT clocking. It also describes which
clock routing resources to use for various applications.
CMT Overview
Each device has a CMT as part of the PHY next to each of the I/O banks. A CMT consists of one
MMCM and two PLLs. The MMCM is the primary block for frequency synthesis over a wide range of
frequencies. It also serves as a jitter filter for external or internal clocks and deskews clocks, among
a wide range of other functions. The PLL’s primary purpose is to provide clocking to the PHY I/Os, but
can also be used for clocking other resources in the device in a limited fashion. The device clock input
connectivity allows multiple resources to provide the reference clock(s) to the MMCM and PLL.
MMCMs have infinite fine phase shift capability in either direction and can be used in dynamic phase
shift mode. MMCMs also have a fractional counter in either the feedback path or in one output path,
enabling further granularity of frequency synthesis capabilities.
The AMD LogiCORE™ IP clocking wizard is available to assist in utilizing MMCMs and PLLs to create
clock networks in UltraScale architecture designs. The GUI interface is used to collect clock network
parameters. The clocking wizard chooses the appropriate CMT resource and optimally configures the
CMT resource and associated clock routing resources.
Clock Management Tile has further details on the CMT block features and connectivity.
Clocking Differences from Previous FPGA Generations
BUFMRs, BUFRs, and BUFIOs, and the associated routing resources have been removed from this
architecture and are replaced by new clock buffers, clock routing, and a completely new I/O
clocking architecture.
The BUFGCTRL and its derivatives are still available. Two new global clock buffer resources
BUFGCE and BUFGCE_DIV have been introduced in the new architecture. At the local clocking
level, a new BUFCE_LEAF clock buffer provides local, vertical clocking with additional features.
A BUFG_GT buffer for clock division of GT clocks has been added.
A new and improved clock routing architecture is available. There are now two types of global
routing tracks called routing and distribution. Both types of routing provide a segmentable clock
network at the CR level. Both types can be driven by the global clock buffers. The distribution
tracks can be driven by routing tracks or directly by clock buffer resources. The distribution tracks
provide connectivity to all clocking points in UltraScale devices.
The CMTs now have two PLLs instead of one.
MMCMs are similar to the MMCM in the 7 series devices. PLLs have new features related to I/O
PHY clocking. However, other clocking related functionality and connectivity has been reduced as
compared to the 7 series FPGAs. For example, the PLLs do not support phase compensation or
external feedback, have fewer outputs, share a voltage-controlled oscillator (VCO) with the PHY
clocking, and have other features removed as compared to the 7 series devices. For this reason,
most customers should use the MMCM for general clocking. However, leftover PLLs are also
available for use.
The MMCM output clock frequencies can be dynamically changed without resetting the MMCM
when using the Clock Divide Dynamic Change (CDDC) feature or dynamic reconfiguration port.
Clocks affected by CDDC are not phase aligned with output clocks.
The definition of clock region has changed. A clock region no longer spans half a device width in
the horizontal direction. UltraScale architecture clock regions have a rectangular shape with a
fixed width and height and are organized in tiles. Horizontal and vertical clock tracks are
segmented at the clock region boundaries.
The clock capable pins (CC) have been replaced by global clock pins (GC). In addition, the
UltraScale+ architecture has high-density (HD) I/O banks. These banks contain four global clock
pins called HDGC which can connect to the BUFGCEs.
In Spartan UltraScale+ devices, the CMTs adjacent to the integrated memory controller (LPDDRMC) and
XP5IO have enhanced PLLs (PLLXP) with reserved ports for the LPDDRMC.
Clocking Resources
Overview
AMD UltraScale™ architecture-based devices have several clock routing resources to support various
clocking schemes and requirements, including high fanout, short propagation delay, and extremely low
skew. To best utilize the clock routing resources, the designer must understand how to get user clocks
from the PCB to the AMD UltraScale™ devices, decide which clock routing resources are optimal, and
then access those clock routing resources by utilizing the appropriate I/O and clock buffers.
Single-ended clock inputs must be assigned to the P (master) side of the GC input pin pair. If a single-
ended clock is connected to the P-side of a differential clock pin pair, the N-side cannot be used as
another single-ended clock pin—it can only be used as a user I/O. For pin naming conventions, refer to
the UltraScale and UltraScale+ FPGAs Packaging and Pinouts Product Specification (UG575).
GC inputs can be used as regular I/O if not used as clocks. When used as regular I/O, global clock input
pins can be configured as any single-ended or differential I/O standard. GC inputs can connect to the
PHY adjacent to the banks they reside in.
Clock Structure
The basic device architecture is composed of blocks of CRs. CRs are organized into tiles and thus build
columns and rows. Each CR contains slices (CLBs), DSPs, and 36K block RAMs. The mix of slice,
DSP, and block RAM columns can differ between CRs, but it is always identical for CRs stacked in the
vertical direction, thus building columns of those resources for the entire device. I/O and GT columns
are then inserted with columns of CRs. In addition, there is a single column that contains the
configuration logic, SYSMON, and PCIe blocks. An HCS runs horizontally through the device in the
center of each row of CRs, I/Os, and GTs. The HCS contains the horizontal routing and distribution
tracks as well as leaf clock buffers and clock network interconnects between horizontal/vertical
routing and distribution. Vertical tracks of routing and distribution connect all CRs in a column, while
vertical routing spans an entire I/O column. There are 24 horizontal routing and 24 distribution tracks
(Figure 1). Artix UltraScale+, Kintex UltraScale+, and Virtex UltraScale+ device routing tracks have 24
vertical routing and 24 vertical distribution tracks (Figure 2), and Spartan UltraScale+ devices have 13
to 24 vertical routing and 13 to 24 vertical distribution tracks depending on column and device. Vertical
clocking tracks within a CR might not be equally split on either side of horizontal routing buffers.
The purpose of the clock routing resources is to route a clock from the global clock buffers to a central
point from where it is connected to the loads via the distribution resources. This central point of the
clock network is called a clock root in the UltraScale architecture. The root can be in any CR in a device
from where it is routed to the loads via the clock distribution resources. This architecture optimizes
clock skew. Routing and distribution resources can either connect to adjacent CRs or be disconnected
(isolated) at the border of the CR as needed. This concept extends to SSI devices as well.
In Spartan UltraScale+ devices there are fewer than 24 vertical tracks, and these tracks are not equally
split. Refer to the following table for the detailed track information.
X1Y1, X1Y0 13
X1Y1, X1Y0 13
1. Refer to the Step 2: Performing Place and Route on the Design section of Vivado Design Suite
Tutorial: Implementation (UG986) or use the following Vivado Tcl console command to open
the device view:
where <part_name> is the Spartan UltraScale+ device number, for example xcsu35p.
The clocks can be distributed from their sources in one of two ways (Figure 3):
The clocks can go onto routing tracks that take the clocks to a central point in a CR without going
to any loads. The clocks can then drive the distribution tracks unidirectionally from which the
clock networks fan out. In this way, the clock buffers can drive to a specific point in the CRs from
which the clocks travel vertically and then horizontally on the distribution tracks to drive the
clocking points. The clocking points are driven via leaf clocks with clock enable (CE) in that CR
and adjacent CRs, if needed. Distribution tracks cannot drive routing tracks.
This distribution scheme is used to move the root for all the loads to be at a specific location for
improved, localized skew. Furthermore, both routing and distribution tracks can drive into
horizontally or vertically adjacent CRs in a segmented fashion. Routing tracks can drive both
routing and distribution tracks in the adjacent CRs while the distribution tracks can drive other
horizontal distribution tracks in adjacent CRs. The CR boundary segmentation allows construction
of either truly global, device-wide clock networks or more local clock networks of variable sizes by
reusing clocking tracks.
Alternatively, clock buffers can drive straight onto the distribution tracks and distribute the clock in
that manner. This reduces the clock insertion delay.
✎ Note: Some Spartan UltraScale+ devices have fewer than 24 vertical tracks.
Each of the four bytes in the XIPHY BITSLICE has six connections from the HCS to its global
clocking pins. Therefore, only six BUFGs can drive the BITSLICE clocking pins in either half of an
I/O bank (a maximum of six clocks can drive any half of an I/O bank).
Clock Buffers
The PHY global clocking contains several sets of BUFGCTRLs, BUFGCEs, and BUFGCE_DIVs. Each set
can be driven by four GC pins from the adjacent bank, MMCMs, PLLs in the same PHY, and
interconnect. The clock buffers then drive the routing and distribution resources across the entire
device. Each PHY contains 24 BUFGCEs, 8 BUFGCTRLs, and 4 BUFGCE_DIVs but only 24 of them can
be used at the same time.
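As a rough planning aid, the per-PHY limit described above can be written as a simple budget check. The sketch below is illustrative only: the function and constant names are hypothetical, it models only the counts stated in the text (24 BUFGCE, 8 BUFGCTRL, and 4 BUFGCE_DIV sites, with at most 24 buffers in use at once), and it ignores any placement or routing rules that the Vivado tools apply.

```python
# Hypothetical helper: checks a requested buffer mix against the per-PHY
# counts quoted in the text. It is not a substitute for Vivado placement.
PHY_SITES = {"BUFGCE": 24, "BUFGCTRL": 8, "BUFGCE_DIV": 4}
PHY_MAX_IN_USE = 24  # only 24 of the 36 buffers can be used at the same time


def phy_buffer_budget_ok(requested):
    """Return True if the requested buffer counts fit in one PHY.

    requested maps a primitive name to the number of instances wanted.
    BUFGCTRL derivatives (for example BUFGMUX) count as BUFGCTRL.
    """
    for name, count in requested.items():
        if name not in PHY_SITES:
            raise ValueError(f"unknown buffer type: {name}")
        if count < 0 or count > PHY_SITES[name]:
            return False
    return sum(requested.values()) <= PHY_MAX_IN_USE


# 16 + 4 + 4 = 24 buffers: fits.
fits = phy_buffer_budget_ok({"BUFGCE": 16, "BUFGCTRL": 4, "BUFGCE_DIV": 4})
# 24 + 1 = 25 buffers: exceeds the simultaneous-use limit.
too_many = phy_buffer_budget_ok({"BUFGCE": 24, "BUFGCTRL": 1})
```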
‼ Important: It is recommended to allow only the AMD Vivado™ placer to assign global clock buffers
to specific locations. Each CR contains 24 BUFGCEs, 8 BUFGCTRLs and 4 BUFGCE_DIVs. These clock
buffers share the 24 routing tracks and therefore collisions might occur resulting in unroutable designs.
If the design requires a number of global clock buffers to be in a certain CR then it is recommended to
attach the CLOCK_REGION property to these buffers instead of a specific LOCATION property.
In the clocking architecture, BUFGCTRL multiplexers and all derivatives can be cascaded to adjacent
clock buffers, effectively creating a ring of eight BUFGMUXes (BUFGCTRL multiplexers). The following
figure shows a simplified diagram of cascading BUFGCTRLs.
The following subsections detail the various configurations, primitives, and use models of the clock
buffers.
The primitives in the following table are different configurations of the clock BUFGCTRL buffers. The
Vivado tools manage the configuration of all these primitives, and the Vivado Design Suite User Guide:
Using Constraints (UG903) describes the LOC constraint.
Primitive      Input    Output  Control
BUFGCE_1       I        O       CE
BUFGMUX        I0, I1   O       S
BUFGMUX_1      I0, I1   O       S
BUFGMUX_CTRL   I0, I1   O       S
BUFGCTRL
The BUFGCTRL primitive shown in the following figure can switch between two asynchronous clocks.
All other global clock buffer primitives are derived from certain configurations of BUFGCTRL.
BUFGCTRL has four select lines, S0, S1, CE0, and CE1. It also has two additional control lines, IGNORE0
and IGNORE1. These six control lines are used to control the inputs I0 and I1.
BUFGCTRL is designed to switch between two clock inputs without the possibility of a glitch. When the
presently selected clock transitions from High to Low after S0 and S1 change, the output is kept Low
until the other (to-be-selected) clock transitions from High to Low. Then, the new clock starts driving
the output. The default configuration for BUFGCTRL is falling-edge sensitive and held at Low prior to
the input switching. BUFGCTRL can also be rising-edge sensitive and held at High prior to the input
switching by using the INIT_OUT attribute.
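The switching sequence described above can be illustrated with a small sample-based model. This is a simplified sketch, not a timing-accurate description of the hardware: the function name and the sampled-waveform representation are assumptions made for illustration, only the default configuration (falling-edge sensitive, output held Low) is modeled, and setup/hold behavior, the CE pins, and the IGNORE pins are left out.

```python
def bufgctrl_switch(i0, i1, sel):
    """Simplified, sample-based model of default BUFGCTRL switching.

    i0, i1: lists of 0/1 clock samples; sel: list of 0/1 requests
    (0 requests I0, 1 requests I1). Returns the list of output samples.
    """
    clocks = (i0, i1)
    cur = sel[0]                 # input currently driving the output
    state = "pass"               # "pass", "wait_old_fall", or "wait_new_fall"
    prev = (i0[0], i1[0])
    out = []
    for t in range(len(sel)):
        now = (i0[t], i1[t])
        fell = [prev[k] == 1 and now[k] == 0 for k in (0, 1)]
        if state == "pass" and sel[t] != cur:
            state = "wait_old_fall"      # select changed: wait for old clock to go Low
        if state == "wait_old_fall":
            if fell[cur]:
                state = "wait_new_fall"  # old clock went Low: hold the output Low
        elif state == "wait_new_fall":
            if fell[1 - cur]:
                cur = 1 - cur            # new clock went Low: hand over to it
                state = "pass"
        out.append(0 if state == "wait_new_fall" else clocks[cur][t])
        prev = now
    return out


# I0 has a period of 4 samples, I1 a period of 6; the select changes at sample 5.
i0 = [1, 1, 0, 0] * 6
i1 = [1, 1, 1, 0, 0, 0] * 4
sel = [0] * 5 + [1] * 19
out = bufgctrl_switch(i0, i1, sel)
```

In this run every High pulse on the output is a complete pulse of either I0 (two samples) or I1 (three samples), which is the glitch-free property the paragraph describes.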
In some applications, the conditions previously described are not desirable. Asserting the IGNORE pins
prevents the BUFGCTRL from detecting the conditions for switching between two clock inputs. In
other words, asserting IGNORE causes the MUX to switch the inputs at the instant the select pin
changes. IGNORE0 causes the output to switch away from the I0 input immediately when the select pin
changes, while IGNORE1 causes the output to switch away from the I1 input immediately when the
select pin changes.
Selection of an input clock requires a “select” pair (S0 and CE0, or S1 and CE1) to be asserted High. If
either S or clock enable (CE) is not asserted High, the desired input is not selected. In normal operation,
both S and CE pairs (all four select lines) are not expected to be asserted High simultaneously.
Typically, only one pin of a “select” pair is used as a select line, while the other pin is tied High. The truth
table is shown in the following table.
CE0  S0  CE1  S1  O
1    1   0    X   I0
1    1   X    0   I0
0    X   1    1   I1
X    0   1    1   I1
1    1   1    1   Old Input (1)
1. Old input refers to the valid input clock before this state is achieved.
2. For all other states, the output becomes the value of INIT_OUT and does not toggle.
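The truth table and its notes can be restated as a small lookup function. This is a minimal sketch under the assumption that only the steady-state selection matters; the function name, the `old_input` argument, and the string return values are hypothetical and chosen for illustration.

```python
def bufgctrl_select(ce0, s0, ce1, s1, old_input="I0", init_out=0):
    """Return what drives O for a given steady state of the select lines.

    Returns "I0" or "I1" when exactly one select pair is asserted,
    old_input when both pairs are asserted, and the constant INIT_OUT
    level (0 or 1) for all other states.
    """
    pair0 = bool(ce0) and bool(s0)   # I0 is selected only if CE0 and S0 are both High
    pair1 = bool(ce1) and bool(s1)   # I1 is selected only if CE1 and S1 are both High
    if pair0 and pair1:
        return old_input             # table note 1: the previously valid input is kept
    if pair0:
        return "I0"
    if pair1:
        return "I1"
    return init_out                  # table note 2: output holds INIT_OUT, no toggling


# S0 tied High, CE0 used as the select line, second pair deasserted.
selected = bufgctrl_select(ce0=1, s0=1, ce1=0, s1=1)
```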
Although both S and CE are used to select a desired output, only S is suggested for glitch-free
switching. This is because when using CE to switch clocks, the change in clock selection can be faster
than when using S. A violation in the setup/hold time of the CE pins causes a glitch at the clock output.
On the other hand, using the S pins allows the user to switch between the two clock inputs without
regard to setup/hold times. As a result, using S to switch clocks does not result in a glitch. See
BUFGMUX_CTRL.
The timing diagram in the following figure illustrates various clock switching conditions using the
BUFGCTRL primitives. Exact timing numbers are best found using the speed specification.
Pre-selection of the I0 and I1 inputs is made after configuration but before device operation.
The initial output after configuration can be selected as either High or Low.
Clock selection using CE0 and CE1 only (S0 and S1 tied High) can change the clock selection
without waiting for a High-to-Low transition on the previously selected clock.
The following table summarizes the attributes for the BUFGCTRL primitive.
Attribute      Description                                                          Possible Values
PRESELECT_I0   If TRUE, BUFGCTRL output uses the I0 input after configuration. (1)  FALSE (default), TRUE
PRESELECT_I1   If TRUE, BUFGCTRL output uses the I1 input after configuration. (1)  FALSE (default), TRUE
BUFGCE_1
BUFGCE_1 is a clock buffer with one clock input, one clock output, and a clock enable line. This
primitive is based on BUFGCTRL with some pins connected to logic High or Low. The following figure
illustrates the relationship of BUFGCE_1 and BUFGCTRL. The LOC constraint is available for manually
placing the BUFGCE_1. See the Vivado Design Suite User Guide: Using Constraints (UG903) for
more information.
The switching condition for BUFGCE_1 is similar to BUFGCTRL with INIT_OUT set to 1. If the CE input is
Low prior to the incoming falling clock edge, the following clock pulse does not pass through the clock
buffer, and the output stays High. Any level change of CE during the incoming clock Low pulse has no
effect until the clock transitions High. The output stays High when the clock is disabled. However, when
the clock is being disabled, it completes the clock Low pulse.
‼ Important: Because the clock enable line uses the CE pin of the BUFGCTRL, the select signal must
meet the setup time requirement. Violating this setup time can result in a glitch.
The following figure illustrates the timing diagram for BUFGCE_1.
BUFGMUX is a clock buffer with two clock inputs, one clock output, and a select line. This primitive is
based on BUFGCTRL with some pins connected to logic High or Low.
The following figure illustrates the relationship of BUFGMUX and BUFGCTRL. The LOC constraint is
available for manually placing the BUFGMUX and BUFGCTRL locations. See the Vivado Design Suite
User Guide: Using Constraints (UG903) for more information.
‼ Important: Because BUFGMUX uses the CE pins as select pins, when using the select, the setup time
requirement must be met. Violating this setup time can result in a glitch.
Switching conditions for BUFGMUX are the same as for the CE pins on BUFGCTRL. The following figure
illustrates the timing diagram for BUFGMUX.
BUFGMUX_1 is rising-edge sensitive and held at High prior to input switch. The following figure
illustrates the timing diagram for BUFGMUX_1. The LOC constraint is available for manually placing the
BUFGMUX and BUFGMUX_1 locations. See the Vivado Design Suite User Guide: Using Constraints
(UG903) for more information.
The following table summarizes the attributes for the BUFGMUX primitive.
BUFGMUX_CTRL
BUFGMUX_CTRL is a clock buffer with two clock inputs, one clock output, and a select line. This
primitive is based on BUFGCTRL with some pins connected to logic High or Low. The following figure
illustrates the relationship of BUFGMUX_CTRL and BUFGCTRL.
BUFGMUX_CTRL uses the S pins as select pins. S can switch anytime without causing a glitch. The
setup/hold time on S is for determining whether the output passes an extra pulse of the previously
selected clock before switching to the new clock. If S changes as shown in Figure 2 prior to the setup
time TBCCCK_S and before I0 transitions from High to Low, the output does not pass an extra pulse of
I0. If S changes following the hold time for S, the output passes an extra pulse. If S violates the
setup/hold requirements, the output might pass the extra pulse but it will not glitch. In any case, the
output changes to the new clock within three clock cycles of the slower clock.
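The three-cycle bound stated above translates directly into a worst-case switching time. The following is a small arithmetic sketch; the function name and the choice of MHz and ns units are assumptions for illustration, and the result is only the upper bound quoted in the text, not a device timing parameter.

```python
def worst_case_switch_time_ns(f_i0_mhz, f_i1_mhz):
    """Upper bound on BUFGMUX_CTRL switch time: three cycles of the slower clock."""
    if f_i0_mhz <= 0 or f_i1_mhz <= 0:
        raise ValueError("clock frequencies must be positive")
    slower_mhz = min(f_i0_mhz, f_i1_mhz)
    period_ns = 1000.0 / slower_mhz   # period of the slower clock in nanoseconds
    return 3 * period_ns


# Switching between a 100 MHz and a 25 MHz clock: bounded by 3 x 40 ns.
bound_ns = worst_case_switch_time_ns(100, 25)
```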
The setup/hold requirements for S0 and S1 are with respect to the falling clock edge, not the rising
edge as for CE0 and CE1.
Switching conditions for BUFGMUX_CTRL are the same as for the S pin of BUFGCTRL. The following figure
illustrates the timing diagram for BUFGMUX_CTRL.
In some cases an application requires immediate switching between clock inputs or bypassing the
edge sensitivity of BUFGCTRL. An example is when one of the clock inputs is no longer switching. If
this happens, the clock output would not have the proper switching conditions because the BUFGCTRL
never detected a clock edge. This case uses the asynchronous MUX. The following figure illustrates an
asynchronous MUX with BUFGCTRL design example.
A BUFGMUX_CTRL with a clock enable BUFGCTRL configuration allows you to choose between the
incoming clock inputs. If needed, the clock enable is used to disable the output. Figure 1 illustrates the
BUFGCTRL usage design example and Figure 2 shows the timing diagram.
BUFGCE is a clock buffer with one clock input, one clock output, and a clock enable line (Figure 1). This
buffer provides glitchless clock gating. BUFGCE can directly drive the routing resources and is a clock
buffer with a single gated input. Its O output is 0 when CE is Low (inactive). When CE is High, the I input
is transferred to the O output.
Attribute  Values       Default  Type    Description
CE_TYPE    SYNC, ASYNC  SYNC     STRING  Sets the clock enable behavior, where SYNC allows for glitchless transition while ASYNC allows immediate transition.
The following figure shows the BUFGCE timing diagram.
BUFG is a clock buffer with one clock input and one clock output. This primitive is based on BUFGCE
with the CE pin connected to High, as shown in the following figure.
BUFCE_LEAF is a clock buffer with CE for driving leaf clocks off the HCS row. This buffer is an
interconnect leaf clock buffer driving the clocking point of the various blocks with a single gated input.
Its O output is 0 when CE is Low (inactive). When CE is High, the I input is transferred to the O output.
The following table shows the BUFCE_LEAF attributes.
Attribute  Values       Default  Type    Description
CE_TYPE    SYNC, ASYNC  SYNC     STRING  Sets the clock enable behavior, where SYNC allows for glitchless transition while ASYNC allows immediate transition.
✎ Note: The BUFCE_LEAF is documented for informational purposes only and is not user accessible in
Vivado Design Suite (e.g., for instantiation, placement, etc.).
BUFGCE_DIV
BUFGCE_DIV is a clock buffer with one clock input (I), one clock output (O), one clear input (CLR) and a
clock enable (CE) input. BUFGCE_DIV can directly drive the routing and distribution resources and is a
clock buffer with a single gated input and a reset. Its O output is 0 when CLR is High (active). When CE
is High, the I input is transferred to the O output. CE is synchronous to the clock for glitch-free
operation. CLR asserts the reset asynchronously and deasserts it synchronously.
BUFGCE_DIV can also divide the input clock by 1 to 8.
When CLR (reset) is deasserted, the output clock transitions from Low to High on the first edge after
the CLR is deasserted, regardless of the divide value. Therefore, BUFGCE_DIV output clocks are always
aligned, regardless of the divide value. The output clock then toggles at the divided frequency. When
CLR is asserted, the clock stops toggling after some clock-to-out time. For an odd divide, the duty cycle
is not 50% because the clock is High one cycle less than it is Low. For example, for a divide value of 7,
the clock is High for 3 cycles and Low for 4 cycles.
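The duty-cycle rule above can be checked with a short calculation. This is a minimal sketch: the function name is hypothetical, and divide-by-1 is left out because the text gives no High/Low split for it.

```python
def bufgce_div_high_low(divide):
    """Return (high_cycles, low_cycles) of the divided output in input clock cycles."""
    if not 2 <= divide <= 8:
        raise ValueError("this sketch covers divide values 2 through 8")
    high = divide // 2     # for odd divides, High is one cycle shorter than Low
    low = divide - high
    return high, low


# Divide by 7: High for 3 input cycles, Low for 4.
high7, low7 = bufgce_div_high_low(7)
```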
When CE is deasserted, the output stops at its current state, High or Low. When CE is reasserted, the
internal counter restarts from where it stopped. For example, if the divide value is 8 and CE is
deasserted two input clock cycles after the last output High transition, the output stays High. Then
when CE is reasserted, the output transitions Low after two input clock cycles. If the reset input is used,
upon assertion the output transitions Low immediately if the current output is High, otherwise it stays
Low.
Since reset is synchronously deasserted, when reset is deasserted in the previous example, the output
transitions High at the next input clock edge and transitions Low four input clock cycles later.
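The CE and CLR examples above can be reproduced with a cycle-level counter model. This is a behavioral sketch under simplifying assumptions: the class and method names are hypothetical, each call to `step` represents one input clock cycle, synchronization latency on CE and CLR is ignored, and only divide values 2 through 8 are modeled.

```python
class BufgceDivModel:
    """Cycle-level sketch of the BUFGCE_DIV counter (divide values 2 to 8)."""

    def __init__(self, divide):
        if not 2 <= divide <= 8:
            raise ValueError("this sketch covers divide values 2 through 8")
        self.divide = divide
        self.high_cycles = divide // 2   # odd divides are High one cycle less
        self.count = 0                   # position within the divided period
        self.out = 0

    def step(self, ce=1, clr=0):
        """Advance one input clock cycle and return the output level."""
        if clr:
            self.count = 0               # reset: counter cleared, output Low
            self.out = 0
        elif ce:
            self.out = 1 if self.count < self.high_cycles else 0
            self.count = (self.count + 1) % self.divide
        # With CE Low the counter and the output simply hold their state.
        return self.out


# Divide by 8, CE deasserted two cycles after the output went High.
buf = BufgceDivModel(8)
trace = [buf.step() for _ in range(2)]        # High for two cycles
trace += [buf.step(ce=0) for _ in range(3)]   # CE Low: output stays High
trace += [buf.step() for _ in range(3)]       # two more High cycles, then Low

# Reset, then release: High at the next cycle, Low four cycles later.
buf.step(clr=1)
after_reset = [buf.step() for _ in range(5)]
```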
The following table shows the BUFGCE_DIV pins.
The BUFG_GTs are driven by the gigabit transceivers (GTs) and the ADC/DAC blocks in the RFSoC
devices. BUFG_GTs provide the only means for those blocks to drive the clock routing resources. Only
GTs, ADCs, and DACs can drive BUFG_GTs. BUFG_GT (Figure 1) is a clock buffer with one clock input
(I), one clock output (O), one clear input (CLR) with CLR mask input (CLRMASK), a clock enable (CE)
input with a CE mask input (CEMASK) and a 3-bit divide (DIV[2:0]) input. BUFG_GT_SYNC is the
synchronizer circuit for the BUFG_GTs and is shown here explicitly. The BUFG_GT_SYNC primitive is
automatically inserted by the Vivado tools, if not present in the design. This buffer can directly drive the
routing and distribution resources and is a clock buffer with a single gated input and a reset. When CE
is deasserted (Low) the output stops at its current state, High or Low. When CE is High, the I input is
transferred to the O output. Both edges of CE and the deassertion of CLR are automatically
synchronized to the clock for glitch-free operation. The Vivado tools do not support timing for the CE
pin; therefore, a deterministic latency cannot be achieved. CLR is an asynchronous reset assertion and
synchronous reset deassertion to the BUFG_GTs. The synchronizers have two stages, but the CLR pin
does not have a setup/hold timing arc assigned. Therefore, the latency is not deterministic. BUFG_GTs
can also divide the input clock by 1 to 8. The DIV[2:0] value is the actual divide minus 1 (i.e., 3'b000
corresponds to 1 while 3'b111 corresponds to 8). The divide value (DIV inputs), CEMASK, and
CLRMASK must be changed while the buffer is held in reset. The input clock is allowed to change while
CE is deasserted or reset is asserted. However, there is a minimum deassertion/assertion time for
those control signals.
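The DIV[2:0] encoding described above (actual divide minus 1) can be captured in a pair of helper functions. This is a minimal sketch; the function names are hypothetical, and the string form simply mirrors the 3'b notation used in the text.

```python
def bufg_gt_div_code(divide):
    """Return the DIV[2:0] value for a BUFG_GT divide of 1 to 8 (divide minus 1)."""
    if not 1 <= divide <= 8:
        raise ValueError("BUFG_GT divides the input clock by 1 to 8")
    return divide - 1


def bufg_gt_div_literal(divide):
    """Return the DIV[2:0] value as a 3-bit binary literal string, e.g. 3'b011."""
    return f"3'b{bufg_gt_div_code(divide):03b}"


# Divide by 4 is encoded as 3'b011.
div4 = bufg_gt_div_literal(4)
```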
‼ Important: In RFSoC devices, the ADC and DAC tiles replace the GTH transceivers that are present in
the MPSoC devices. Therefore, ADC and DAC utilize the existing BUFG_GT clock buffers to drive the
global clock trees in the device and then back into the ADC/DAC tiles from the fabric. However, the DIV
function cannot be used when connecting to the ADC/DAC clocks. Hence the BUFG_GT functions more
like a simple global clock buffer with CE and CLR.
‼ Important: For devices in Zynq UltraScale+ and selected devices in Kintex UltraScale+ families
(XCKU9P and above), assigning the clock root in the same region as the BUFG_GT driver (X0 column)
can cause an unroutable situation and prevent the output clocks from reaching loads that are placed in
the clock regions to the right of the Zynq UltraScale+ device PS or Kintex UltraScale+ empty PL regions
in the Y0, Y1, and Y2 rows. To avoid the issue, assign the clock root one clock region to the
right, in this case the X1 column.
UltraScale devices have 24 BUFG_GTs and 10 BUFG_GT_SYNCs per GT Quad. UltraScale+ devices also
have 24 BUFG_GTs but they have 14 BUFG_GT_SYNCs per GT Quad. Any of the GT output clocks in a
Quad can be multiplexed to any of the BUFG_GTs. In UltraScale devices, there are 10 CE and CLR pins
which correspond to the 10 BUFG_GT_SYNCs and that can drive the 24 BUFG_GTs. In UltraScale+
devices, there are 14 CE and CLR pins which correspond to the 14 BUFG_GT_SYNCs and that can drive
the 24 BUFG_GTs. Each BUFG_GT buffer has an individual mask for both CE and CLR (24). All
BUFG_GTs driven by the same clock source must also have a common CE and CLR signal. Tying off CE
and CLR to a constant signal in this case is not allowed, but a mask can be set to provide the same
functionality. The output clocks of the BUFG_GTs connected to the same input clock are synchronized
(phase aligned) to each other when coming out of reset (CLR) or on CE assertion. Individual mask pins
can be used to control which BUFG_GT(s) out of the group of 24 respond to CE and CLR and therefore
are synchronized to each other or retain their previous phase and divide value. These clock buffers are
located in the HCS and are directly driven by the GT output clocks. Their purpose is to directly drive
hard blocks and logic in the CRs via routing and distribution resources. GTs have no other direct,
dedicated connections to other clock resources. However, they can connect to the CMT via the
BUFG_GT and the clock routing resources.
When CLR (reset) is deasserted, the output transitions High at the next input clock edge and transitions
Low divide_value/2 input clock cycles later. Because reset is synchronously deasserted, two clock
cycles of synchronization latency need to be added to the output to transition it to High. The next
transition to Low then occurs four input clock cycles after that (divide by 8). The output transitions to
High a number of clock cycles later, determined by the divide value specified, after which the output
clock toggles at the divided frequency. When CLR is asserted, the clock stops toggling at Low after
some clock-to-out time. For an odd divide, the duty cycle is not 50% because the clock is High one cycle
less than it is Low. For example, for a divide value of 7, the clock is High for 3 cycles and Low for 4
cycles.
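The duty cycle rule for divided outputs can be sketched as follows. This is an illustrative model only, assuming a divide of 2 to 8 (a divide of 1 simply passes the input clock through); the function name is hypothetical.

```python
from typing import Tuple


def divided_clock_high_low(divide: int) -> Tuple[int, int]:
    """Return (high_cycles, low_cycles) of the divided output, in input clock cycles."""
    if not 2 <= divide <= 8:
        raise ValueError("this sketch models divides of 2 to 8 only")
    # The output goes Low divide/2 input cycles after it goes High; for an
    # odd divide the High time is one input cycle shorter than the Low time.
    high = divide // 2
    low = divide - high
    return high, low
```

A divide of 7 gives 3 cycles High and 4 cycles Low, while a divide of 8 gives 4 and 4.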
When CE is deasserted, the output stops at its current state, High or Low. When CE is reasserted, the
internal counter restarts from where it stopped. For example, if the divide value is 8 and CE is
deasserted two input clock cycles after the last output High transition, the output stays High. Then,
when CE is reasserted, the output transitions Low four input clock cycles later (two for synchronization
and two to complete the High time period of the output clock because of being a divide by 8). If the
reset input is used, upon assertion the output transitions Low immediately if the current output is High,
otherwise it stays Low. Because reset is synchronously deasserted, when reset is deasserted in the
previous example, the output transitions High two input clock cycles later due to synchronization and
transitions Low four input clock cycles after that (divide by 8).
The mask pins (CEMASK and CLRMASK) control how a specific, single BUFG_GT responds to the
CE/CLR control inputs. When a mask pin is deasserted, its respective control pin has its normal
function. When a mask pin is asserted, the respective control pin is ignored, in effect allowing the clock
to propagate through (i.e., CE is effectively High and reset is effectively Low). The internal
synchronizers phase align the clock outputs of the BUFG_GTs that are not masked. Both edges of CE
are synchronized while only the deassertion of reset is synchronized. Assertion of reset immediately
causes the output of the BUFG_GT to go Low if it was previously High. This can cause a potential glitch
or runt pulse. If this is not acceptable, CE should be used to stop the output. A reset should then be
asserted after two input clock cycles plus half the “divide value.” This ensures that the output clock
High time (if the output clock happened to be disabled High) is no less than normal.
‼ Important: While the synchronizers ensure that all BUFG_GTs driven by the same clock come out of
reset in phase, they might not be in phase with BUFG_GTs that have not been reset (i.e., that have their
reset mask asserted).
BUFG_PS
The BUFG_PS is a simple clock buffer with one clock input (I), one clock output (O). This clock buffer is
a resource for the Zynq UltraScale+ MPSoC processor system (PS) and provides access to the
programmable logic (PL) clock routing resources for clocks from the processor into the PL. Up to 18 PS
clocks can drive the BUFG_PS. This clock buffer resides next to the PS.
Spartan UltraScale+ devices have two bank structures—HDIOL and HDIOS. Spartan UltraScale+ devices
include XP5IO with CMTXP columns adjacent to the XP5IO, as shown in the following figure. The
CMTXP column contains one MMCM and two PLLE4XP. The CMT column extends down to the bottom
row. The XP5IO design does not match the same pinout as XIPHY. There is additional clock routing
which helps align the connection between XP5IO and CMTXP.
✎ Note: In the Spartan UltraScale+ SU50P, SU55P, SU65P, SU100P, SU150P, and SU200P devices, the
CCIO clocks from HDIOL are directly connected to the CMT column adjacent to HDIOL.
✎ Note: For more information about the previous figure, refer to Spartan UltraScale+ FPGAs SelectIO
Resources User Guide (UG861).
MMCMs
AMD UltraScale™ architecture-based devices contain one CMT per I/O bank. The MMCMs serve as
frequency synthesizers for a wide range of frequencies, as jitter filters for either external or internal
clocks, and as a means to deskew clocks.
Input multiplexers select the reference and feedback clocks from either the global clock I/Os or the
clock routing or distribution resources. Each clock input has a programmable counter divider (D). The
phase-frequency detector (PFD) compares both phase and frequency of the rising edges of both the
input (reference) clock and the feedback clock. If a minimum High/Low pulse is maintained, the duty
cycle is ancillary. The PFD generates a signal proportional to the phase and frequency difference
between the two clocks. This signal drives the charge pump (CP) and loop filter (LF) to generate a
reference voltage to the VCO. The PFD produces an up or down signal to the charge pump and loop
filter to determine whether the VCO should operate at a higher or lower frequency. When the VCO operates
at a frequency that is too high, the PFD activates a down signal causing the control voltage to be
reduced, thus decreasing the VCO operating frequency. When the VCO operates at a frequency that is
too low, an up signal increases voltage. The VCO produces eight output phases and one variable phase
for fine-phase shifting. Each output phase can be selected as the reference clock to the output
counters (Figure 1). Each counter can be independently programmed for a given customer design. A
special counter M is also provided. This counter controls the feedback clock of the MMCM, allowing a
wide range of frequency synthesis.
In addition to integer divide output counters, MMCMs add a fractional counter for CLKOUT0 and
CLKFBOUT.
MMCM Primitives
The UltraScale device MMCM primitives, MMCME3_BASE and MMCME3_ADV, are shown in the
following figure. The UltraScale+ devices have the same primitives with an E4 instead of an E3. In this
user guide, MMCME4_ADV is the same as the MMCME3_ADV, and MMCME4_BASE is the same as
MMCME3_BASE.
The MMCME#_BASE primitives provide access to the most frequently used features of a stand-alone
MMCM. Clock deskew, frequency synthesis, coarse phase shifting, and duty cycle programming are
available to use with the MMCME#_BASE. The ports are listed in the following table.
Description Ports
The MMCME#_ADV primitive provides access to all MMCME#_BASE features plus additional ports for
clock switching, access to the dynamic reconfiguration port (DRP), and dynamic fine-phase shifting.
The MMCME#_ADV ports are listed in the following table.
Description Ports
Control and data input RST, CLKINSEL, DWE, DEN, DADDR, DI, PSINCDEC, PSEN, CDDCREQ
The MMCM is a mixed-signal block designed to support clock network deskew, frequency synthesis,
and jitter reduction. These three modes of operation are discussed in more detail in this section. The
VCO operating frequency can be determined by using the following relationship:
Figure: FVCO
\[ F_{VCO} = F_{IN} \times \frac{M}{D} \]
Figure: FOUT
\[ F_{OUT} = F_{IN} \times \frac{M}{D \times O} \]
where the M, D, and O counters are shown in Figure 1. The value of M corresponds to the
CLKFBOUT_MULT_F setting, the value of D to the DIVCLK_DIVIDE, and O to the CLKOUT_DIVIDE.
The seven “O” counters can be independently programmed. For example, O0 can be programmed to do
a divide-by-two while O1 is programmed for a divide-by-three. The only constraint is that the VCO
operating frequency must be the same for all the output counters because a single VCO drives all the
counters.
In many cases, designers do not want to incur the delay on a clock network in their I/O timing budget.
Therefore, an MMCM is used to compensate for the clock network delay. This feature is supported in
UltraScale architecture-based devices. A clock output matching the reference clock CLKIN frequency
(always CLKFBOUT) is connected to a clock buffer of the same type driving the logic and fed back to
the CLKFBIN feedback pin of the MMCM. The remaining outputs can still be used to divide the clock
down for additionally synthesized frequencies. In this case, all output clocks have a defined phase
relationship to the input reference clock.
The MMCMs can also be used for stand-alone frequency synthesis. In this application, the MMCM is
not used to deskew a clock network. Rather, it generates an output clock frequency for other blocks. In
this mode, the MMCM feedback paths are internal, which keeps all the routing local, minimizing the
jitter. The following figure shows the MMCM configured as a frequency synthesizer. In this example, an
external 33 MHz reference clock is available. The reference clock can be a crystal oscillator or the
output of another MMCM. Setting the M counter to 32 makes the VCO oscillate at 1056 MHz (33 MHz x
32). The MMCM outputs are programmed to provide (for example) a 528 MHz processor clock, a 264
MHz gasket clock, a 176 MHz clock, a 132 MHz memory interface clock, a 66 MHz interface, and a 33
MHz interface. In this example, there are no required phase relationships between the reference clock
and the output clocks, but there are required relationships between the output clocks.
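The arithmetic of this example can be checked with a short sketch. It assumes the input divider D is 1 (the example does not state it) and that each output frequency is the VCO frequency divided by an integer output counter value; the function names are hypothetical.

```python
from fractions import Fraction

F_IN_MHZ = 33
M = 32
D = 1  # assumption: the example does not state the D counter value


def vco_mhz(f_in_mhz, m, d=1):
    """VCO frequency in MHz for a given input frequency, M, and D."""
    return Fraction(f_in_mhz) * m / d


def output_divider(f_vco_mhz, f_out_mhz):
    """Integer output counter value that yields f_out_mhz from f_vco_mhz."""
    ratio = Fraction(f_vco_mhz) / Fraction(f_out_mhz)
    if ratio.denominator != 1:
        raise ValueError("target frequency needs a non-integer divide")
    return int(ratio)


f_vco = vco_mhz(F_IN_MHZ, M, D)
targets_mhz = [528, 264, 176, 132, 66, 33]
dividers = [output_divider(f_vco, t) for t in targets_mhz]
print(f_vco, dividers)  # 1056 [2, 4, 6, 8, 16, 32]
```

Every target frequency in the example is an integer division of the 1056 MHz VCO frequency, so no fractional divide is needed.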
Devices support fractional (non-integer) divides in the CLKOUT0 output path. The resolution of the
fractional divide is 1/8 or 0.125, effectively increasing the number of synthesizable frequencies by a
factor of eight. For example, if the CLKIN frequency is 100 MHz and the M divide value is set to 8, the
VCO frequency is 800 MHz. CLKOUT0 can be used to further fractionally divide the 800 MHz VCO
frequency (for example, CLKOUT0_DIVIDE = 2.5, resulting in a 320 MHz output frequency).
When using the fractional divider, the duty cycle is not programmable for outputs used in the fractional
mode.
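A small sketch of the fractional divide arithmetic follows. It only checks the 0.125 resolution and does not model the legal minimum or maximum divide values, which are not given here; the function names are hypothetical.

```python
from fractions import Fraction

FRACTIONAL_STEP = Fraction(1, 8)  # resolution of the fractional divide


def is_valid_fractional_divide(value) -> bool:
    """True if value is an exact multiple of 0.125."""
    return (Fraction(str(value)) / FRACTIONAL_STEP).denominator == 1


def clkout0_mhz(f_in_mhz, m, clkout0_divide, d=1) -> float:
    """CLKOUT0 frequency in MHz: the VCO frequency divided by CLKOUT0_DIVIDE."""
    if not is_valid_fractional_divide(clkout0_divide):
        raise ValueError("CLKOUT0_DIVIDE must be a multiple of 0.125")
    f_vco = Fraction(str(f_in_mhz)) * m / d
    return float(f_vco / Fraction(str(clkout0_divide)))
```

With a 100 MHz input, M = 8, and CLKOUT0_DIVIDE = 2.5, the sketch returns 320.0, as in the example above.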
Jitter Filter
MMCMs reduce the jitter inherent on a reference clock. The MMCM can be instantiated as a stand-
alone function to only support filtering jitter from an external clock before it is driven into another block.
As a jitter filter, it is usually assumed that the MMCM acts as a buffer and regenerates the input frequency
on the output (for example, FIN = 100 MHz, FOUT = 100 MHz). In general, greater jitter filtering is
possible by using the MMCM attribute BANDWIDTH set to Low. Setting the BANDWIDTH to Low can
incur an increase in the static offset of the MMCM.
Limitations
The MMCM has some restrictions that must be adhered to. These are summarized in the MMCM
electrical specifications in the UltraScale and UltraScale+ device data sheets. In general, the major
limitations are VCO operation range, input frequency, duty cycle programmability, and phase shift. In
addition, there are connectivity limitations to other clocking elements (pins, GTs, and clock buffers).
Cascading MMCMs can only occur through the clock routing network.
The minimum and maximum VCO operating frequencies are defined in the electrical specification of
the UltraScale and UltraScale+ device data sheets. These values can also be extracted from the speed
specification.
The minimum and maximum CLKIN input frequencies are defined in the electrical specification of the
UltraScale and UltraScale+ device data sheets.
Only discrete duty cycles are possible given a VCO operating frequency. Depending on the
CLKOUT_DIVIDE value, a minimum and maximum range is possible with a step size that is also
dependent on the CLKOUT_DIVIDE value. The Clocking Wizard tool gives the possible values for a given
CLKOUT_DIVIDE.
Phase Shift
In many cases, there needs to be a phase shift between clocks. The MMCM has multiple options to
implement phase shifting. Static phase shifting can be achieved by selecting one of the eight VCO
output phases with additional fine phase shifting available in the CLKOUT output counters depending
on the CLKOUT divide value. There is also an interpolated phase shifting capability in either fixed or
dynamic mode. The MMCM phase shifting capabilities are very powerful, which can lead to complex
scenarios. By using the Clocking Wizard, the allowable phase shift values are determined based on the
MMCM configuration settings.
The static phase shift (SPS) resolution in time units is defined as:
Figure: SPS
\[ SPS = \frac{1}{8 \times F_{VCO}} = \frac{D}{8 \times M \times F_{IN}} \]
The factor of eight arises because the VCO provides eight phase-shifted clocks at 45° each, so settings
for 0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315° of phase shift are always available. The higher the VCO
frequency is, the smaller the phase shift resolution. Because the VCO has a distinct operating range, the
phase shift resolution is bounded between 1/(8 x FVCOMAX) and 1/(8 x FVCOMIN).
It is possible to phase shift the CLKFBOUT feedback clock. In that case, all CLKOUT output clocks are
negatively phase shifted with respect to CLKIN.
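As a rough illustration of the static phase shift resolution, the sketch below assumes that one static step is one eighth of the VCO period (one 45° VCO phase). The function name is hypothetical, and the frequencies used are examples rather than data sheet limits.

```python
def static_phase_step_ps(f_vco_mhz: float) -> float:
    """Static phase shift step in picoseconds, taken as 1/8 of the VCO period."""
    if f_vco_mhz <= 0:
        raise ValueError("VCO frequency must be positive")
    vco_period_ps = 1e6 / f_vco_mhz  # a 1 MHz clock has a 1e6 ps period
    return vco_period_ps / 8
```

Under this assumption, a 1600 MHz VCO gives a 78.125 ps step and a 600 MHz VCO gives a step of roughly 208 ps, which shows why a higher VCO frequency yields finer resolution.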
The two fractional counters (CLKFBOUT and CLKOUT0) also have static phase shift capability. A phase
shift step is defined as:
Figure: SPS(frac)
\[ SPS_{frac} = \frac{360^{\circ}}{\text{fractional divide value} \times 8} \]
For example, if the fractional divide value is 2.125, a static phase shift step is 360/(2.125 x 8) = 21.176
degrees.
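The worked example above can be reproduced with a one-line calculation. This sketch simply evaluates 360 divided by eight times the fractional divide value; the function name is hypothetical.

```python
def fractional_phase_step_deg(fractional_divide: float) -> float:
    """Static phase shift step, in degrees, for a fractional counter."""
    if fractional_divide <= 0:
        raise ValueError("divide value must be positive")
    return 360.0 / (fractional_divide * 8)
```

For a fractional divide of 2.125, the result rounds to 21.176 degrees.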
Interpolated fine phase shift (IFPS) mode in the MMCM has linear shift behavior independent of the
CLKOUT_DIVIDE value, and the phase shift resolution only depends on the VCO frequency. In this mode,
the output clocks can be rotated 360° round robin in linear increments of 1/(56 x FVCO).
If the VCO runs at 600 MHz, the phase resolution is approximately (rounded) 30 ps, and at 1.6 GHz is
approximately (rounded) 11 ps.
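The two figures quoted above follow from taking one fine phase shift step as 1/56th of the VCO period. A minimal sketch, with a hypothetical function name:

```python
def fine_phase_step_ps(f_vco_mhz: float) -> float:
    """Interpolated fine phase shift step in picoseconds (1/56 of the VCO period)."""
    if f_vco_mhz <= 0:
        raise ValueError("VCO frequency must be positive")
    return 1e6 / (56 * f_vco_mhz)
```

At 600 MHz the step is about 29.8 ps, and at 1600 MHz it is about 11.2 ps, which round to the 30 ps and 11 ps stated in the text.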
When using fine phase shift, no initial phase shift value can be programmed during configuration. The
phase always starts at zero and can then be dynamically incremented or decremented. The dynamic
phase shift is controlled by the PS interface of the MMCME#_ADV. This phase shift mode equally
affects all CLKOUT output clocks that are selected for this mode by setting the USE_FINE_PS attribute
to TRUE. Regardless of whether interpolated fine phase shift is used in fixed or dynamic mode, a clock
must always be connected to the PSCLK pin of the MMCM. Each individual CLKOUT counter can
independently either select the interpolated phase shift,
the previously described static phase shift mode, or none. Fractional divide is not allowed in either fixed
or dynamic interpolated fine phase shift mode. Fixed or dynamic phase shifting of the feedback path
results in a negative phase shift of all output clocks with respect to CLKIN. The dynamic phase shift
interface cannot be used when the phase shift mode is set to fixed.
The MMCME#_ADV primitive provides three inputs and one output for dynamic fine phase shifting.
Each CLKOUT and the CLKFBOUT divider can be individually selected for phase shifting. The attributes
CLKOUT[0:6]_USE_FINE_PS and CLKFBOUT_USE_FINE_PS select the output clocks to be dynamically
phase shifted. The dynamic phase shift amount is common to all the output clocks selected.
The variable phase shift is controlled by the PSEN, PSINCDEC, PSCLK, and PSDONE ports as shown in
the following figure. The phase of the MMCM output clock(s) increments/decrements according to the
interaction of PSEN, PSINCDEC, PSCLK, and PSDONE from the initial or previously performed dynamic
phase shift. PSEN, PSINCDEC, and PSDONE are synchronous to PSCLK. When PSEN is asserted for one
PSCLK clock period, a phase shift increment/decrement is initiated. When PSINCDEC is High, an
increment is initiated and when PSINCDEC is Low, a decrement is initiated. Each increment adds to the
phase shift of the MMCM clock outputs by 1/56th of the VCO period. Similarly, each decrement
decreases the phase shift by 1/56th of the VCO period. PSEN must be active for one PSCLK period.
PSDONE is High for exactly one clock period when the phase shift is complete. The number of PSCLK
cycles is deterministic (12 PSCLK cycles). After initiating the phase shift by asserting PSEN, the MMCM
output clocks gradually drift from their original phase shift to an increment/decrement phase shift in a
linear fashion. The completion of the increment or decrement is signaled when PSDONE asserts High.
After PSDONE has pulsed High, another increment/decrement can be initiated. There is no maximum
phase shift or phase shift overflow. An entire clock period (360°) can always be phase shifted
regardless of frequency. When the end of the period is reached, the phase shift wraps around round-
robin style.
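To plan a dynamic phase shift, the number of PSEN pulses can be estimated from the target shift and the VCO frequency. The sketch below is an illustration under stated assumptions: each pulse moves the phase by 1/56th of the VCO period, an integer output divide of O makes one output period equal to O VCO periods, and the function names are hypothetical.

```python
from typing import Tuple

STEPS_PER_VCO_PERIOD = 56


def plan_phase_shift(target_ps: float, f_vco_mhz: float) -> Tuple[int, bool]:
    """Return (number of PSEN pulses, PSINCDEC level) for a target shift in ps.

    PSINCDEC is True (High) for an increment and False (Low) for a decrement.
    """
    steps = round(target_ps * STEPS_PER_VCO_PERIOD * f_vco_mhz / 1e6)
    return abs(steps), steps >= 0


def steps_per_output_period(clkout_divide: int) -> int:
    """Pulses needed to wrap one full output period (360 degrees)."""
    if clkout_divide < 1:
        raise ValueError("divide value must be at least 1")
    return STEPS_PER_VCO_PERIOD * clkout_divide
```

For a 1000 MHz VCO, a +250 ps shift needs 14 increments, and an output with a divide of 8 wraps around after 448 steps. Each pulse should be issued only after PSDONE has pulsed High for the previous one.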
The Clock Divide Dynamic Change (CDDC) feature supports the dynamic change of the clock output
dividers (CLKOUT[6:0]_DIVIDE) in conjunction with the DRP interface without the need for resetting the
MMCM. Effectively, one or more of the MMCM output clock frequencies can be changed while leaving
other output clocks untouched and running continuously. Two pins (CDDCREQ and CDDCDONE) control
the handshaking. The application requests a change of output counter values (the CLKOUT_DIVIDE
value) by asserting the CDDCREQ signal. New values are written through the standard DRP interface
one port at a time and governed by standard DRP protocol (DEN, DWE, and DRDY). The DRP address of
the CLKOUT[6:0] counter written to determines which output clocks are affected. After a CLKOUTx
counter is written to, the associated clock output stops toggling. This can be followed by more changes
(DRP writes) to other CLKOUT counters in an identical fashion. When all DRP writes have been
completed (after the last DRDY), the CDDCREQ input must be deasserted, after which the affected
output counter(s) are synchronously restarted. The MMCM acknowledges that the changes have taken
place and the new output clocks (frequencies) are available for use by asserting CDDCDONE.
CLKOUT ports not affected by the CDDC change continue to function uninterrupted during this
operation and maintain their phase relationship to each other as shown in the following figure.
However, the output clocks (ports) that were changed via the CDDC procedure are not phase aligned
(synchronized) to the other output clocks not affected by CDDCREQ. Clocks affected by CDDCREQ
should not be used after the signal has been asserted because the output might glitch and the clocks
stop toggling. This feature is not available in fractional mode.
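The ordering of the CDDC handshake can be sketched in software. This is not a driver: the `RecordingPort` class is a hypothetical stand-in that only logs calls, and the addresses and data values are placeholders, not real MMCM register contents.

```python
from typing import Iterable, List, Tuple


class RecordingPort:
    """Hypothetical stand-in for hardware access; it only records calls."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def set_cddcreq(self, value: bool) -> None:
        self.log.append(f"CDDCREQ={int(value)}")

    def drp_write(self, addr: int, data: int) -> None:
        # A real implementation would pulse DEN and DWE, then wait for DRDY.
        self.log.append(f"DRP[0x{addr:02X}]=0x{data:04X}")

    def wait_cddcdone(self) -> None:
        self.log.append("wait CDDCDONE")


def change_output_dividers(port, writes: Iterable[Tuple[int, int]]) -> None:
    """Apply the CDDC sequence: request, DRP writes, release, acknowledge."""
    port.set_cddcreq(True)           # request the divider change
    for addr, data in writes:        # one DRP write at a time
        port.drp_write(addr, data)
    port.set_cddcreq(False)          # release after the last DRDY
    port.wait_cddcdone()             # new output clocks are valid after this


port = RecordingPort()
change_output_dividers(port, [(0x00, 0x1234), (0x01, 0x5678)])  # placeholder values
```

The point of the sketch is the order of operations: CDDCREQ stays asserted across all DRP writes, and the affected clocks are used again only after CDDCDONE.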
The CLKOUT6 divider (counter) can be cascaded with the CLKOUT4 divider. This provides the capability
of an output divider that is larger than 128. CLKOUT6 feeds the input of the CLKOUT4 divider. There is a
static phase offset between the output of the cascaded divider and all other output dividers.
MMCM Programming
Programming of the MMCM must follow a set flow to ensure configuration that guarantees stability
and performance. This section describes how to program the MMCM based on certain design
requirements. A design can be implemented in two ways: through the GUI (the Clocking Wizard) or by
instantiating the MMCM directly. Regardless of the method
selected, the following information is necessary to program the MMCM:
The first step is to determine the input frequency. This allows all possible output frequencies to be
determined by using the minimum and maximum input frequencies to define the D counter range, the
VCO operating range to determine the M counter range, and the output counter range. There can be a
very large number of frequencies. When using integer divides, in the worst case there are 106 x 64 x
136 = 922,624 possible combinations. In reality, the total number of different frequencies is less
because the entire range of the M and D counters cannot be realized, and there is overlap between the
various settings.
As an example, consider FIN = 100 MHz, FVCO = between 600 MHz and 1600 MHz, and FPFD = between
10 MHz and 550 MHz.
For a FPFDMIN of 10 MHz, the value of D can only be between 1 and 10.
DMIN (see Figure 1) = 1
DMAX (see Figure 2) = 10
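Figure: DMIN
\[ D_{MIN} = \mathrm{roundup}\left(\frac{F_{IN}}{F_{PFDMAX}}\right) \]
Figure: DMAX
\[ D_{MAX} = \mathrm{rounddown}\left(\frac{F_{IN}}{F_{PFDMIN}}\right) \]
Figure: MMIN
\[ M_{MIN} = \mathrm{roundup}\left(\frac{F_{VCOMIN}}{F_{IN}} \times D_{MIN}\right) \]
Figure: MMAX
\[ M_{MAX} = \mathrm{rounddown}\left(\frac{D_{MAX} \times F_{VCOMAX}}{F_{IN}}\right) \]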
Determining the input frequency can result in several possible M and D values. The next step is to
determine the optimum M and D values. The starting M value is first determined. This is based on the
VCO target frequency, the ideal operating frequency of the VCO.
Figure: MIDEAL
\[ M_{IDEAL} = \frac{D_{MIN} \times F_{VCOMAX}}{F_{IN}} \]
The goal is to find the M value closest to the ideal operating point of the VCO. The minimum D value is
used to start the process. The goal is to make D and M values as small as possible while keeping ƒVCO
as high as possible.
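The counter ranges for the example above can be worked through in a short sketch. The formulas are an assumption derived from the relations between the input, PFD, and VCO frequencies (PFD frequency = input frequency / D, VCO frequency = input frequency x M / D); the function names are hypothetical, and the results must still be clipped to the legal attribute ranges, which this sketch does not model.

```python
import math
from typing import Tuple


def d_range(f_in: float, f_pfd_min: float, f_pfd_max: float) -> Tuple[int, int]:
    """Smallest and largest D that keep the PFD frequency within its range."""
    return math.ceil(f_in / f_pfd_max), math.floor(f_in / f_pfd_min)


def m_range(f_in: float, f_vco_min: float, f_vco_max: float,
            d_min: int, d_max: int) -> Tuple[int, int]:
    """Smallest and largest M that keep the VCO frequency within its range."""
    return (math.ceil(f_vco_min * d_min / f_in),
            math.floor(f_vco_max * d_max / f_in))


def m_ideal(f_in: float, f_vco_max: float, d_min: int) -> float:
    """Starting M value that places the VCO at the top of its range."""
    return d_min * f_vco_max / f_in


# Example from the text: FIN = 100 MHz, FVCO 600 to 1600 MHz, FPFD 10 to 550 MHz.
d_min, d_max = d_range(100, 10, 550)
m_min, m_max = m_range(100, 600, 1600, d_min, d_max)
start_m = m_ideal(100, 1600, d_min)
```

For the example values this gives D between 1 and 10, matching the text, and a starting M of 16 with D = 1, which places the VCO at 1600 MHz.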
MMCM Ports
Pin Name I/O Pin Description
CLKIN1 Input General clock input. See CLKIN1 – Primary Reference Clock
Input.
CLKIN2 Input Secondary clock input for the MMCM reference clock. See CLKIN2
– Secondary Clock Input.
CLKFBIN Input Feedback clock input. See CLKFBIN – Feedback Clock Input.
CLKINSEL Input This signal controls the state of the clock input MUX, High =
CLKIN1, Low = CLKIN2. CLKINSEL dynamically switches the
MMCM reference clock. See CLKINSEL – Clock Input Select.
PWRDWN Input Powers down instantiated but unused MMCMs. See PWRDWN –
Power Down.
DADDR[6:0] Input The dynamic reconfiguration address (DADDR) input bus provides
a reconfiguration address for the dynamic reconfiguration. When
not used, all bits must be assigned zeros. See DADDR[6:0] –
Dynamic Reconfiguration Address.
DI[15:0] Input The dynamic reconfiguration data input (DI) bus provides
reconfiguration data. When not used, all bits must be set to zero.
See DI[15:0] – Dynamic Reconfiguration Data Input.
DWE Input The dynamic reconfiguration write enable (DWE) input pin
provides the write enable control signal to write the DI data into
the DADDR address. When not used, it must be tied Low. See
DWE – Dynamic Reconfiguration Write Enable.
DEN Input The dynamic reconfiguration enable (DEN) provides the enable
control signal to access the dynamic reconfiguration feature.
When the dynamic reconfiguration feature is not used, DEN must
be tied Low. See DEN – Dynamic Reconfiguration Enable Strobe.
DCLK Input The DCLK signal is the reference clock for the dynamic
reconfiguration port. See DCLK – Dynamic Reconfiguration
Reference Clock.
PSCLK Input Phase shift clock. See PSCLK – Phase Shift Clock.
PSEN Input Phase shift enable. See PSEN – Phase Shift Enable.
CLKOUT[0:6] Output User configurable clock outputs (0 through 6) that can be divided
versions of the VCO phase outputs (user controllable) from 1
(bypassed) to 128. The output clocks are phase aligned to each
other (unless phase shifted) and aligned to the input clock with a
proper feedback configuration.
CLKINSTOPPED Output Status pin indicating that the input clock has stopped. See
CLKINSTOPPED – Input Clock Status.
CLKFBSTOPPED Output Status pin indicating that the feedback clock has stopped. See
CLKFBSTOPPED – Feedback Clock Status.
LOCKED Output An output from the MMCM that indicates when the MMCM has
achieved phase alignment within a predefined window and
frequency matching within a predefined PPM range. The MMCM
automatically locks after power on. No extra reset is required.
LOCKED is deasserted if the input clock stops or the phase
alignment is violated (e.g., input clock phase shift). The MMCM
must be reset after LOCKED is deasserted.
DO[15:0] Output The dynamic reconfiguration output bus provides MMCM data
output when using dynamic reconfiguration. See DO[15:0] –
Dynamic Reconfiguration Output Bus.
DRDY Output The dynamic reconfiguration ready output (DRDY) provides the
response to the DEN signal for the MMCM’s dynamic
reconfiguration feature. See DRDY – Dynamic Reconfiguration
Ready.
PSDONE Output Phase shift done. See PSDONE – Phase Shift Done.
CDDCDONE Output Signals that the dynamic frequency change is completed. See
MMCM Clock Divide Dynamic Change.
★ Tip: The port names generated by the clocking wizard can differ from the port names used on the
primitive.
CLKIN1 can be driven by a global clock I/O directly when in the same bank adjacent to the PHY tile.
CLKIN2 can be driven by a global clock I/O directly when in the same bank adjacent to the PHY tile.
CLKFBIN must be connected in one of the following ways: directly to CLKFBOUT for internal feedback,
to CLKFBOUT via a BUFG for clock buffer feedback matching, to an IBUFG (through a global clock pin
for external deskew), or to the interconnect (not recommended). For clock alignment, the feedback
path clock buffer type should match the forward clock buffer type.
‼ Important: The internal compensation mode setting is determined by a direct connection (wire) from
the CLKFBOUT to the CLKFBIN port in the source. Synthesis optimizes this connection away, so the
CLKFBOUT to CLKFBIN connection is absent from all subsequent representations in
Vivado Design Suite. However, the INTERNAL compensation attribute attached to the MMCM/PLL
indicates that the compensation is still internal to the MMCM/PLL.
For possible configuration of CLKFBOUT, see MMCM Use Models. CLKFBOUT can also drive logic if the
feedback path contains a clock buffer.
CLKFBOUTB should not be used for feedback. It provides an additional, inverted CLKFBOUT output clock.
CLKFBOUTB can drive logic if the feedback path contains a clock buffer.
The CLKINSEL signal controls the state of the clock input multiplexers. High = CLKIN1, Low = CLKIN2
(see Reference Clock Switching). The MMCM must be held in RESET during clock switchover.
The RST signal is an asynchronous reset for the MMCM. The MMCM is synchronously re-enabled when
this signal is deasserted.
The PWRDWN signal powers down instantiated but currently unused MMCMs. This mode can be used to save
power for temporarily inactive portions of the design and/or MMCMs that are not active in certain
system configurations. No MMCM power is consumed in this mode.
The dynamic reconfiguration address (DADDR) input bus provides a reconfiguration address for
dynamic reconfiguration. The address value on this bus specifies the 16 configuration bits that are
written or read with the next DCLK cycle. When not used, all bits must be assigned zeros.
The dynamic reconfiguration data input (DI) bus provides reconfiguration data. The value of this bus is
written to the configuration cells. The data is presented in the cycle that DEN and DWE are active. The
data is captured in a shadow register and written at a later time. DRDY indicates when the DRP port is
ready to accept another write. When not used, all bits must be set to zero.
The dynamic reconfiguration write enable (DWE) input pin provides the write/read enable control signal
to write the DI data into or read the DO data from the DADDR address. When not used, it must be tied
Low.
The dynamic reconfiguration enable strobe (DEN) provides the enable control signal to access the
dynamic reconfiguration feature and enables all DRP port operations. When the dynamic
reconfiguration feature is not used, DEN must be tied Low.
The DCLK signal is the reference clock for the dynamic reconfiguration port. The rising edge of this
signal is the timing reference for all other port signals. The setup time is specified in the UltraScale and
UltraScale+ device data sheets. There is no hold time requirement for the other input signals relative to
the rising edge of the DCLK. The pin can be driven by an IBUF, IBUFG, BUFGCE, or BUFGCTRL. There are
no dedicated connections to this clock input.
The PSCLK input pin provides the source clock for the dynamic phase shift interface. All other inputs are
synchronous to the positive edge of this clock. The pin can be driven by an IBUF, IBUFG, BUFG, or
BUFGCE. There are no dedicated connections to this clock input.
A dynamic (variable) phase shift operation is initiated by synchronously asserting PSEN. PSEN
must be activated for one cycle of PSCLK. After initiating a phase shift, the phase is gradually shifted
until a High pulse on PSDONE indicates that the operation is complete. There are no glitches or
sporadic changes during the operation. From the start to the end of the operation, the phase is shifted
in a continuous analog manner.
The PSINCDEC input signal synchronously indicates if the dynamic phase shift is an increment or decrement
operation (positive or negative phase shift). PSINCDEC is asserted High for increment and Low for
decrement. There is no phase shift overflow associated with the dynamic phase shift operation. If 360°
or more are shifted, the phase wraps around, starting at the original phase.
These user-configurable clock outputs (CLKOUT0 through CLKOUT6) can be divided versions of the
VCO phase outputs (user controllable) from 1 (bypassed) to 128. The input clock and output clocks can
be phase aligned.
★ Tip: CLKOUT0 is first used to place the root clock point. In ZHOLD mode, it is used to set the
compensation. Therefore, AMD recommends using CLKOUT0 as the main clock.
For possible configurations, see MMCM Use Models. In the MMCM, CLKOUT0 and CLKFBOUT can be
used in fractional divide mode. All CLKOUT outputs can be used in non-fractional mode to provide a
static or dynamic phase shift. In fractional mode, only fixed phase shift is allowed. See Static Phase
Shift Mode (MMCM and PLL) for more information.
CLKINSTOPPED is a status pin indicating that the input clock has stopped. This signal is asserted within two
CLKFBOUT clock cycles of clock stoppage. The signal is deasserted after the clock has restarted and
LOCKED is achieved, or the clock is switched to the alternate clock input and the MMCM has re-locked.
CLKFBSTOPPED is a status pin indicating that the feedback clock has stopped. It is asserted within
one clock cycle of clock stoppage. The signal is deasserted after the feedback clock has restarted and
the MMCM has re-locked.
LOCKED
This is an output from the MMCM used to indicate when the MMCM has achieved phase and frequency
alignment of the reference clock and the feedback clock at the input pins. Phase alignment is within a
predefined window and frequency matching within a predefined PPM range. The MMCM automatically
locks after power on; no extra reset is required. LOCKED is deasserted within one PFD clock cycle if the
input clock stops, the phase alignment is violated (e.g., input clock phase shift), or the frequency has
changed. The MMCM must be reset when LOCKED is deasserted. The clock outputs should not be
used prior to the assertion of LOCKED.
The dynamic reconfiguration output bus (DO) provides MMCM data output when using dynamic
reconfiguration. If DWE is inactive while DEN is active at the rising edge of DCLK, this bus holds the
content of the configuration cells addressed by DADDR. The DO bus must be captured on the rising
edge of DCLK when DRDY is active. The DO bus value is held until the next DRP operation.
The dynamic reconfiguration ready output (DRDY) provides the response to the DEN signal for the
MMCM’s dynamic reconfiguration feature. This signal indicates that a DEN/DCLK operation has
completed.
The phase shift done output signal is synchronous to the PSCLK. When the current phase shift
operation is completed, the PSDONE signal is asserted for one clock cycle indicating that a new phase
shift cycle can be initiated.
CDDCREQ is a request signal for dynamically changing the output clock divide value and therefore the
frequency. When asserted High, a request is sent to all affected counters and must stay asserted until
the last change via the DRP has been completed.
CDDCDONE is an acknowledge signal from the MMCM that the output clock divide change is complete and the
output is valid.
MMCM Attributes
The following table lists the attributes for the MMCME#_BASE and MMCME#_ADV primitives.
CLKOUT[1:6]_USE_FINE_PS
Type: String. Allowed values: FALSE, TRUE. Default: FALSE.
CLKOUT[1:6] variable fine phase shift enable (see note 2).
1. The Vivado tools round up or down to the nearest multiple of 0.125 during bitstream
implementation if the value is not specified as an exact 1/8th fraction. However, the attribute
must be specified to the nearest multiple of 0.125 in the Verilog or VHDL code for proper clock
frequency calculation during timing analysis.
2. When using the variable fine phase shift, the initial phase shift value is always zero and cannot
be preset to a static, initial phase.
3. The COMPENSATION attribute values are documented for informational purposes only. The
Vivado tools automatically select the appropriate compensation based on circuit topology. Do
not manually select a compensation value; leave the attribute at the default value.
4. The specifications for the VCO frequencies MMCM_FVCOMIN/MMCM_FVCOMAX and
minimum out frequency MMCM_FOUTMIN are different for the UltraScale and UltraScale+
families. Consult the appropriate data sheets.
5. The direct source code connection (wire) from CLKFBOUT to CLKFBIN is optimized away
during synthesis.
6. When using an SEM-IP in UltraScale devices only, additional noise is coupled into the VCO of the
MMCM and PLL. This results in a higher TIE jitter value, as described in Answer Record 71314.
Refer to the answer record for guidance and mitigation techniques. To resolve the issue of TIE
jitter for SEM-IP, there are two new configurable properties: [Link]
and [Link]. If the properties are set to POSTCRC, each MMCM
instance that has the BANDWIDTH attribute set to OPTIMIZED, and each PLL instance with an
implicit BANDWIDTH=OPTIMIZED attribute, is configured with POSTCRC bandwidth settings.
Refer to the Vivado Design Suite User Guide: Programming and Debugging (UG908) for
BITSTREAM property information.
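Note 1 above asks for fractional attribute values to be written as exact multiples of 0.125. The following Python sketch shows one way to pre-round a value before placing it in HDL. The function names are hypothetical helpers, not part of any AMD tool, and the tie-breaking behavior of the Vivado tools is not described here, so this sketch simply uses Python's built-in rounding.

```python
def round_to_eighth(value: float) -> float:
    """Round a divide or multiply value to the nearest multiple of 0.125."""
    return round(value * 8) / 8


def is_exact_eighth(value: float) -> bool:
    """Return True when the value is already an exact 1/8th fraction."""
    scaled = value * 8
    return scaled == int(scaled)


requested = 5.3  # hypothetical requested fractional divide value
print(requested, "->", round_to_eighth(requested), is_exact_eighth(requested))
```

Writing the rounded value in the source keeps the frequency used by timing analysis consistent with the value the tools implement.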
IBUF – Global clock input buffer. The MMCM compensates the delay of this path. IBUF represents
a global clock pin in the same region. The IBUF must be located at a global clock pin location.
BUFGCTRL – Internal global clock buffer. The MMCM does not compensate the delay of this path.
BUFGCE – Global clock buffer. The MMCM does not compensate the delay of this path.
Counter Control
The MMCM output counters provide a wide variety of synthesized clocks using a combination of
DIVIDE, DUTY_CYCLE, and PHASE. The following figure illustrates how the counter settings impact the
counter output.
The top waveform represents the output from the VCO.
The following figure shows the eight VCO phase outputs and four different counter outputs. Each VCO
phase is shown with the appropriate start-up sequence. The phase relationship and start-up sequence
are guaranteed to ensure the correct phase is maintained. This means the rising edge of the 0° phase
happens before the rising edge of the 45° phase. The O0 counter is programmed to do a simple divide-
by-two with the 0° phase tap as the reference clock. The O1 counter is programmed to do a simple
divide-by-two but uses the 180° phase tap from the VCO. This counter setting can be used to generate a
clock for a DDR interface where the reference clock is edge aligned to the data transition. The O2
counter is programmed to do a divide-by-three. The O3 output has the same programming as the O2
output except the phase is set for a one cycle delay. Phase shifts greater than one VCO period are
possible.
If the MMCM is configured to provide a certain phase relationship and the input frequency is changed,
the absolute shift in picoseconds also changes, because the VCO frequency changes. This aspect must
be considered when designing with the MMCM. However, when an important aspect of the design is to
maintain a certain phase relationship in degrees among various clock outputs (e.g., CLK and CLK90),
that relationship is maintained regardless of the input frequency.
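As a rough illustration of the distinction between a phase relationship in degrees and the absolute shift in time, the following Python sketch converts a phase value into picoseconds for a given clock frequency. The function name and the example frequencies are assumptions chosen for illustration only.

```python
def phase_shift_ps(freq_mhz: float, phase_deg: float) -> float:
    """Convert a phase offset in degrees to an absolute delay in picoseconds."""
    period_ps = 1e6 / freq_mhz  # 1 / MHz expressed in picoseconds
    return period_ps * phase_deg / 360.0


# The same 90 degree relationship at two hypothetical clock frequencies.
for freq in (100.0, 200.0):
    print(f"{freq:.0f} MHz: 90 deg = {phase_shift_ps(freq, 90.0):.0f} ps")
```

The 90° relationship is unchanged in both cases, while the delay it represents halves when the frequency doubles.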
All O counters can be equivalent; anything O0 can do, O1 can do. The O0 counter has the additional
capability to be used in fractional divide mode. The MMCM outputs are flexible when connecting to the
global clock network because they are identical. In most cases, this level of detail is imperceptible
because the software and Clocking Wizard determine the proper settings through the MMCM attributes
and Wizard inputs.
The MMCM reference clock can be dynamically switched by using the CLKINSEL pin. The switching is
done asynchronously. After the clock switches, the MMCM is likely to lose LOCKED and automatically
lock onto the new clock. Therefore, after the clock switches, the MMCM must be reset. The MMCM
clock MUX switching is shown in the following figure. The CLKINSEL signal directly controls the MUX.
No synchronization logic is present.
When the input clock or feedback clock is lost, the CLKINSTOPPED or CLKFBSTOPPED status signal is
asserted. The MMCM deasserts the LOCKED signal. After the clock returns, the CLKINSTOPPED signal
is deasserted and a RESET must be applied.
The examples in this section show the MMCM. There are several methods to design with the MMCM.
The Clocking Wizard in the Vivado tools can assist with generating the various MMCM parameters.
Additionally, the MMCM can be manually instantiated as a component. It is also possible for the
MMCM to be merged with an IP core. The IP core would contain and manage the MMCM.
One of the predominant uses of the MMCM is for clock network deskew. The following figure shows
the MMCM in this mode. The clock output from one of the CLKOUT counters is used to drive logic
within the device and/or the I/Os. The feedback counter is used to control the exact phase relationship
between the input clock and the output clock (if, for example, a 90° phase shift is required). The
associated clock waveforms are shown to the right for the case where the input clock and output clock
need to be phase aligned. The configuration in the following figure is the most flexible, but it does
require two global clock networks.
There are certain restrictions on implementing the feedback. The CLKFBOUT output can be used to
provide the feedback clock signal. When an MMCM is driving both BUFGs and BUFGCTRL, only one of
the clock buffers that is also used in the feedback path is deskewed. The fundamental restriction is
that both input frequencies to the PFD must be identical. Therefore, this relationship must be met:
$$\frac{f_{IN}}{D} = f_{FB} = \frac{f_{VCO}}{M}$$
As an example, if ƒIN is 166 MHz, D = 1, M = 6, and O = 2, then VCO is 996 MHz and the clock output
frequency is 498 MHz. Because the M value in the feedback path is 6, both input frequencies at the PFD
are 166 MHz.
Another more complex scenario has an input frequency of 66.66 MHz and D = 2, M = 30, and O = 4. The
VCO frequency in this case is 1000 MHz and the CLKOUT output frequency is 250 MHz. Therefore, the
feedback frequency at the PFD is 1000/30 or 33.33 MHz, matching the 66.66 MHz/2 input clock
frequency at the PFD.
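The two examples above can be reproduced with a few lines of arithmetic. This is a minimal Python sketch; the function name and the returned dictionary keys are hypothetical, and the sketch does not check the result against the VCO limits in the device data sheets.

```python
def mmcm_frequencies(f_in_mhz: float, d: int, m: float, o: float) -> dict:
    """Compute PFD, VCO, and output frequencies for input f_in, divider D,
    feedback multiplier M, and output divider O."""
    pfd_ref = f_in_mhz / d      # reference input to the PFD
    f_vco = pfd_ref * m         # VCO frequency
    f_out = f_vco / o           # CLKOUT frequency
    pfd_fb = f_vco / m          # feedback input to the PFD
    return {"pfd_ref": pfd_ref, "vco": f_vco, "out": f_out, "pfd_fb": pfd_fb}


first = mmcm_frequencies(166.0, d=1, m=6, o=2)
second = mmcm_frequencies(66.66, d=2, m=30, o=4)
print(first)
print(second)
```

With the rounded 66.66 MHz input, the second case evaluates to slightly under 1000 MHz and 250 MHz; the figures in the text correspond to an input of 66.67 MHz (200/3 MHz).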
The MMCM feedback can be internal to the MMCM when the MMCM is used as a synthesizer or jitter
filter, and there is no required phase relationship between the MMCM input clock and the MMCM
output clock. The MMCM performance increases because the feedback clock never passes through a block
powered by the core supply and is therefore not subjected to noise on that supply. However, noise
introduced on the CLKIN signal and the BUFG is still present, as shown in the following figure.
The MMCM can also be used to generate a zero delay buffer clock. A zero delay buffer can be useful
for applications where there is a single clock signal fanout to multiple destinations with a low skew
between them. This configuration is shown in the following figure. Here, the feedback signal drives off
chip and the board trace feedback is designed to match the trace to the external components. In this
configuration, it is assumed that the clock edges are aligned at the input of the UltraScale device and
the input of the external component. The input clock buffers for CLKIN and CLKFBIN must be in the
same bank.
In some cases, precise alignment cannot occur because of the difference in loading between the input
capacitance of the external component and the feedback path capacitance of the UltraScale device.
For example, the external components can have an input capacitance of 1 pF to 4 pF while the part has
an input capacitance as specified in the UltraScale and UltraScale+ device data sheets. There is a
difference in the signal slope, which is basically skew. Designers should be aware of this effect to
ensure timing.
The MMCM can be cascaded through the routing resources only. There is no compensation for routing
delays.
Spread-spectrum clock generation (SSCG) spreads the electromagnetic energy over a large frequency band to effectively reduce the
electrical and magnetic field strengths measured within a narrow window of frequencies. The peak
electromagnetic energy at any one frequency is reduced by modulating the SSCG output.
The MMCME# can generate a spread-spectrum clock from a standard fixed frequency oscillator when
SS_EN is set to TRUE (see the following figure). Within the MMCME#, the VCO frequency is modulated
along with CLKFBOUT and CLKOUT[6:4,1,0]. Clock outputs CLKOUT[3:2] are used to control the
modulation period and are not available for general use. As long as the clock frequency is adjusted
slowly, the spread-spectrum does not affect the period jitter of the MMCME#.
Adjusting the modulation period SS_MOD_PERIOD allows you to direct the tools to select the closest
modulation period based on the MMCME# settings. The spread-spectrum modulation reduces EMI as
long as the modulation frequencies are higher than the audible frequency range of 30 kHz. Typically,
lower modulation frequencies are preferred to minimize the impact of the introduction of spread-
spectrum.
Increasing the frequency deviation with SS_MODE (CENTER_HIGH or DOWN_HIGH) increases the
overall EMI reduction, but care must be taken to ensure that the increased range of frequencies does
not affect the overall system operation (see the following figure). Because the spread-spectrum clock
and the input clock are operating at different frequencies, any data being transferred between the clock
domains should use an asynchronous FIFO to ensure that data is not lost. Increasing the frequency
deviation requires a larger FIFO.
Another design trade-off is the decision to use a center spread or down spread. Selecting SS_MODE
(DOWN_HIGH, DOWN_LOW) spreads the frequencies to lower frequencies as shown in the following
figure. DOWN_HIGH has similar frequency deviation to CENTER_LOW.
The decision to use down spread is often the result of considering the timing analysis impact of
spread-spectrum. When using a spread-spectrum clock, the design must meet timing at the highest
frequency in the frequency deviation. Therefore, if a 100 MHz clock with SS_MODE (CENTER_LOW)
produces a 3% (±1.5%) center spread, the 100 MHz clock with 3% center spread must pass timing
analysis as a 101.5 MHz clock. However, if SS_MODE (DOWN_HIGH) produces a 3% down spread, the
input frequency is the highest frequency within the frequency deviation. Consequently, for a 100 MHz
clock with 3% down spread, the down-spread clock would continue to be analyzed by timing analysis as
a 100 MHz clock.
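The timing consequence of center spread versus down spread can be sketched numerically. In this Python sketch the spread percentage is passed in as a parameter because the mapping from SS_MODE values to percentages is not given in this section; the function name is hypothetical.

```python
def worst_case_frequency_mhz(nominal_mhz: float, spread_percent: float,
                             center_spread: bool) -> float:
    """Highest frequency in the deviation range, which timing must meet."""
    if center_spread:
        # Center spread deviates by half of the total spread above nominal.
        return nominal_mhz * (1.0 + spread_percent / 2.0 / 100.0)
    # Down spread only moves below nominal, so nominal is the maximum.
    return nominal_mhz


center = worst_case_frequency_mhz(100.0, 3.0, center_spread=True)
down = worst_case_frequency_mhz(100.0, 3.0, center_spread=False)
print(f"center spread: {center:.1f} MHz, down spread: {down:.1f} MHz")
```

For the 100 MHz, 3% case this reproduces the 101.5 MHz and 100 MHz figures from the paragraph above.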
Table: Manual SS Timing Adjustment Using Input Frequency for UltraScale Devices
Table: Manual SS Timing Adjustment Using Input Frequency for UltraScale+ Devices
For a 25 MHz input clock, the new timing constraints would be:
For an 80 MHz input clock, the new timing constraints would be:
Table 1 and Table 2 provide information which allows the manual adjustment of timing constraints to
the frequency range of the spread-spectrum enabled clock. This is for the generation of timing
constraints in an XDC file used by the Vivado tools.
Table 1 and Table 2 show that timing constraints should be modified when spread-spectrum clocking
parameter SS_MODE is set to CENTER_LOW or CENTER_HIGH. When the SS_MODE attribute is set to
DOWN_LOW or DOWN_HIGH, timing constraint adjustment is not necessary.
Manual adjustment of the timing constraints is not needed because the Vivado tools detect when
spread-spectrum clocking is used in a design. Vivado tools (static timing analysis) automatically account for
any timing spread caused by the spread-spectrum enabled clocks. When spread-spectrum clocks are
used, Vivado static timing analysis adds a spread-spectrum (SS) uncertainty value to the total
uncertainty calculation formula. The formula used by the static analysis tools is as follows:
where:
When spread-spectrum clocking is used with SS_MODE set to DOWN_LOW or DOWN_HIGH, the
calculated F_IN_SS frequency (using data from Table 1 and/or Table 2) is lower than the original
clock frequency (refer to the examples after Table 2). If no precautions are taken, the FIFO in use can fill
up and overrun. Prevent this by using a FIFO with throttle control.
Parameter Value
CLKOUT[3:2]_DIVIDE N/A
CLKOUT[6:4,1,0]_DIVIDE 1 to 128
Bandwidth Low
Parameter Value
CLKOUT[3:2]_DIVIDE N/A
CLKOUT[6:4,1,0]_DIVIDE 1 to 128
Bandwidth Low
When using spread-spectrum generation, the VCO frequency is set by the Clocking Wizard based on the
input frequency and SS_MODE. As a result, AMD recommends using the Clocking Wizard to set the output
frequencies for CLKOUT[6:4,1,0].
Based on the VCO frequency and SS_MOD_PERIOD, the Clocking Wizard also determines the correct
modulation settings to set the modulation frequency within 10% of SS_MOD_PERIOD. Because the
modulation frequency is dependent on the VCO frequency, the modulation frequency scales as the
input frequency changes for a given compilation.
CLKOUT0_PHASE = 0;
CLKOUT0_DUTY_CYCLE = 0.5;
CLKOUT0_DIVIDE_F = 2;
CLKOUT1_PHASE = 90;
CLKOUT1_DUTY_CYCLE = 0.5;
CLKOUT1_DIVIDE = 2;
CLKOUT2_PHASE = 0;
CLKOUT2_DUTY_CYCLE = 0.25;
CLKOUT2_DIVIDE = 4;
CLKOUT3_PHASE = 90;
CLKOUT3_DUTY_CYCLE = 0.5;
CLKOUT3_DIVIDE = 8;
CLKOUT4_PHASE = 0;
CLKOUT4_DUTY_CYCLE = 0.5;
CLKOUT4_DIVIDE = 8;
CLKOUT5_PHASE = 135;
CLKOUT5_DUTY_CYCLE = 0.5;
CLKOUT5_DIVIDE = 8;
CLKFBOUT_PHASE = 0;
CLKFBOUT_MULT_F = 8;
DIVCLK_DIVIDE = 1;
CLKIN1_PERIOD = 10.0;
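The attribute settings listed above determine the VCO and output frequencies. The following Python sketch works them out, assuming CLKIN1_PERIOD is given in nanoseconds and that each output frequency is the VCO frequency divided by the corresponding DIVIDE value. The variable names are illustrative, and the resulting VCO frequency must still be checked against the device data sheet limits.

```python
# Attribute values copied from the listing above.
clkin1_period_ns = 10.0
divclk_divide = 1
clkfbout_mult_f = 8
clkout_divide = {
    "CLKOUT0": 2, "CLKOUT1": 2, "CLKOUT2": 4,
    "CLKOUT3": 8, "CLKOUT4": 8, "CLKOUT5": 8,
}

f_in_mhz = 1000.0 / clkin1_period_ns                    # 10 ns period -> 100 MHz
f_vco_mhz = f_in_mhz * clkfbout_mult_f / divclk_divide  # VCO frequency
f_out_mhz = {name: f_vco_mhz / div for name, div in clkout_divide.items()}

print(f"VCO: {f_vco_mhz:.0f} MHz")
for name, freq in f_out_mhz.items():
    print(f"{name}: {freq:.0f} MHz")
```

With these settings the VCO runs at 800 MHz, CLKOUT0 and CLKOUT1 at 400 MHz, CLKOUT2 at 200 MHz, and CLKOUT3 through CLKOUT5 at 100 MHz.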
PLLs
There are two PLLs per CMT that provide clocking to the PHY logic and I/Os. In addition, they can be
used as frequency synthesizers for a wide range of frequencies, serve as jitter filters, and provide basic
phase shift capabilities and duty cycle programming. The PLLs differ from the MMCM in the number of
outputs, cannot deskew clock nets, and do not have advanced phase shift capabilities. Their multipliers and
input dividers have a smaller value range, and they lack many of the other advanced features of the
MMCM. In Spartan UltraScale+ devices with XP5IO, in addition to the two PLLs per CMT, there are two
PLLXP2 per CMTXP (that are adjacent to the XP5IO banks).
PLL Primitives
The UltraScale device PLL primitives, PLLE3_BASE and PLLE3_ADV, are shown in the following figure.
UltraScale+ devices have the same primitives with an E4 instead of an E3. In this user guide,
PLLE4_ADV is the same as the PLLE3_ADV, and PLLE4_BASE is the same as PLLE3_BASE.
The Spartan UltraScale+ devices have PLLE4_BASE, PLLE4_ADV, PLLE4XP_BASE, and PLLE4XP_ADV
primitives. The PLLE4XP is only present in Spartan UltraScale+ device CMTXPs that are adjacent to
XP5IO banks.
✎ Note: In addition to the ports of the PLLE4, the PLLE4XP adds RST_DMC, CLKOUTPHY_DMCEN, and
LOCKED_DMC, which are used by the LPDDRMC. CLKOUTPHY_N has been removed from Spartan
UltraScale+ devices with XP5IO.
The PLLE#_BASE and PLLE4XP_BASE primitives provide access to the most frequently used features of
a stand-alone PLL. Clock deskew, frequency synthesis, and duty cycle programming are available to use
with the PLLE#_BASE. The ports are listed in the following table.
Description Ports
Description Ports
The PLLE#_ADV primitive provides access to all PLLE#_BASE features plus additional ports for access
to the DRP. The ports are listed in the following table.
Description Ports
PLLE4XP_ADV Primitive
Description Ports
Control and data input RST, RST_DMC, CLKOUTPHYEN, CLKOUTPHY_DMCEN, DWE, DEN, DADDR, DI
PLL Ports
LOCKED Output An output from the PLL that indicates when the PLL
has achieved phase alignment within a predefined
window and frequency matching within a predefined
PPM range. The PLL automatically locks after power on.
DCLK Input The DCLK signal is the reference clock for the dynamic
reconfiguration port.
The RST signal is an asynchronous reset for the PLL. The PLL is synchronously re-enabled when this
signal is deasserted.
The PWRDWN signal powers down instantiated but currently unused PLLs. This mode can be used to save power
for temporarily inactive portions of the design and/or PLLs that are not active in certain system
configurations. No PLL power is consumed in this mode.
These are user-configurable clock outputs and can be divided versions of the VCO phase outputs (user
controllable) from 1 (bypassed) to 128. The input clock and output clocks can be phase aligned.
For the possible configurations of CLKFBOUT, see the following figures. Unlike the MMCM, the
CLKFBOUT cannot drive logic.
✎ Note: The Spartan UltraScale+ device PLLE4XP does not have a CLKOUTPHY port.
CLKFBIN must be connected either directly to the CLKFBOUT for internal feedback, or to the CLKFBOUT
through a BUF_IN. Using BUF_IN in the feedback path compensates for the clock network delay in the
same XIPHY bank as shown in Figure 1 where the nodes 1 and 5 are phase aligned.
CLKOUTPHY is a dedicated clock output for use by the PHY byte logic and I/O. It can be 2X, 1X, or 0.5X of the
VCO frequency.
CLKOUTPHYEN enables the CLKOUTPHY clock outputs. The PLL employs enable logic to synchronize
the asynchronous CLKOUTPHYEN signal from your design and controls when the CLKOUTPHY clocks
are released. After the CLKOUTPHY clock is released, the rising edge is aligned to the rising edge of the
input clock CLKIN. Glitch-free enabling and disabling of the CLKOUTPHY output clock is assured for all
configurations.
However, phase alignment between multiple PLL CLKOUTPHY clocks is only assured when both the
CLKFBOUT_MULT and CLKOUT[0:1]_DIVIDE values are set to 1, 2, 4, or 8. Rising edges do not align for
CLKFBOUT_MULT = 3, 5, 6, 7, 9,...
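The alignment condition above is simple enough to express as a design-time check. The following Python sketch is a hypothetical helper, not an AMD tool function; it only encodes the rule stated in the text that the multiplier and the output dividers must each be 1, 2, 4, or 8.

```python
ALIGNING_VALUES = {1, 2, 4, 8}


def clkoutphy_edges_align(clkfbout_mult: int, clkout_divides: list) -> bool:
    """Return True when the settings fall inside the set for which rising-edge
    alignment between multiple PLL CLKOUTPHY clocks is assured."""
    return (clkfbout_mult in ALIGNING_VALUES
            and all(div in ALIGNING_VALUES for div in clkout_divides))


print(clkoutphy_edges_align(4, [2, 8]))  # multiplier and dividers all in the set
print(clkoutphy_edges_align(3, [1, 2]))  # multiplier of 3 is outside the set
```

A check like this could run in a build script before synthesis to flag PLL settings that share a CLKOUTPHY alignment requirement.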
LOCKED
This output from the PLL is used to indicate when the PLLs have achieved frequency alignment of the
reference clock and the internal feedback. Frequency alignment means frequency matching within a
predefined PPM range. The PLL automatically locks after power on; no
extra reset is required. LOCKED is deasserted within one PFD clock cycle if the input clock stops or the
frequency has changed. The PLL must be reset when LOCKED is deasserted. The clock outputs should
not be used prior to the assertion of LOCKED.
The dynamic reconfiguration address (DADDR) input bus provides a reconfiguration address for the
dynamic reconfiguration. The address value on this bus specifies the 16 configuration bits that are
written or read with the next DCLK cycle. When not used, all bits must be assigned zeros.
The dynamic reconfiguration data input (DI) bus provides reconfiguration data. The value of this bus is
written to the configuration cells. The data is presented in the cycle that DEN and DWE are active. The
data is captured in a shadow register and written at a later time. DRDY indicates when the DRP port is
ready to accept another write. When not used, all bits must be set to zero.
The dynamic reconfiguration write enable (DWE) input pin provides the write/read enable control signal
to write the DI data into or read the DO data from the DADDR address. When not used, DWE must be
tied Low.
The dynamic reconfiguration enable strobe (DEN) provides the enable control signal to access the
dynamic reconfiguration feature and enable all DRP port operations. When the dynamic reconfiguration
feature is not used, DEN must be tied Low.
DCLK is the reference clock for the dynamic reconfiguration port. The rising edge of this signal is the
timing reference for all other port signals. The setup time is specified in the UltraScale and UltraScale+
device data sheets. There is no hold time requirement for the other input signals relative to the rising
edge of DCLK. This signal can be driven by an IBUF, IBUFG, BUFGCE, or BUFGCTRL. There are no
dedicated connections to this clock input.
The dynamic reconfiguration output bus provides PLL data output when using dynamic reconfiguration.
If DWE is inactive while DEN is active at the rising edge of DCLK, this bus holds the content of the
configuration cells addressed by DADDR. The DO bus must be captured on the rising edge of DCLK
when DRDY is active. The DO bus value is held until the next DRP operation.
The dynamic reconfiguration ready output (DRDY) provides the response to the DEN signal for the PLL’s
dynamic reconfiguration feature. This signal indicates that a DEN/DCLK operation has completed.
RST_DMC
CLKOUTPHY_DMCEN
LOCKED_DMC
PLL Attributes
The following table lists the attributes for the PLLE#_BASE and PLLE#_ADV primitives.
CLKOUT[0:1]_DUTY_CYCLE
Type: Real. Allowed values: 0.01 to 0.99. Default: 0.50.
Specifies the duty cycle of the associated CLKOUT clock output as a fraction (i.e., 0.50 generates a
50% duty cycle).
STARTUP_WAIT
Type: String. Allowed values: FALSE, TRUE. Default: FALSE.
Wait during the configuration start-up cycle for the PLL to lock.
1. The specifications for the VCO frequencies PLL_FVCOMIN/PLL_FVCOMAX and minimum out
frequency PLL_FOUTMIN are different for the UltraScale and UltraScale+ families. Consult the
appropriate data sheets.
Spartan UltraScale+ devices with XP5IO have additional attributes for the PLLE4XP_BASE and
PLLE4XP_ADV primitives. These are listed in the following table.
VCO_RANGE
Type: String. Allowed values: Low, High. Default: High.
Determines the VCO frequency based on CLKIN_PERIOD (MHz) / DIVCLK_DIVIDE x CLKFBOUT_MULT.
DCLK 1 Input The DCLK signal is the reference clock for the
dynamic reconfiguration port. This clock is
normally about 100 MHz to 200 MHz. The newer
the technology of the FPGA family used, the faster
this clock can be.
1. The width of the DADDR bus depends on the primitive that the DRP port is a part of. For both the
MMCM and the PLL, the DRP address bus is 7 bits wide (DADDR(6:0)). The DADDR port of an
ADC/DAC in an RFSoC is 12 bits wide, while the DADDR port of a GTP is 10 bits wide.
1. Put the address to write to and the data that needs to be written on the buses, DADDR[n:0] and
DI[15:0] respectively.
2. Make the DWE (Data Write Enable) signal High.
3. Pulse the DEN signal High for one clock cycle. The DEN signal is the trigger that makes the DRP
port function. When this signal is captured on the rising edge of the clock the internals of the DRP
port capture address and data and fill the correct register in the DRP map.
4. When the DEN signal returns Low after its one-cycle pulse, drive the DWE signal Low.
5. The DRP port pulses DRDY High for one clock cycle to confirm that the provided data is written
into the provided address space. This also signals that a new write or read operation can start.
1. Put the address of the register to read from on the DADDR[n:0] bus.
2. Leave the DWE signal Low at all times when reading.
3. The value on the DI[15:0] bus does not matter.
4. Pulse the DEN signal High for one clock cycle. The DEN signal is the trigger that makes the DRP
port function. When this signal is captured on the rising edge of the clock the internals of the DRP
port capture the address to make sure that the contents of the correct register in the DRP map are
reflected on the DO[15:0] output.
5. The DRP port pulses DRDY High for one clock cycle to confirm that the contents of the addressed
register are valid on DO[15:0]. This also signals that a new write or read operation can start.
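The write and read sequences above can be sketched as a small behavioral model. This Python model is an illustration only: the class, its method names, and the one-cycle DRDY latency are assumptions made for the sketch, and real DRP latency should be taken from the DRDY signal itself rather than from a fixed cycle count.

```python
class DrpModel:
    """Toy model of a 7-bit address, 16-bit data DRP port (assumed latency)."""

    def __init__(self, size: int = 128):
        self.regs = [0] * size
        self.do = 0
        self.drdy = False
        self._pending = None

    def tick(self, daddr: int = 0, di: int = 0, dwe: bool = False,
             den: bool = False):
        """Model one rising edge of DCLK."""
        self.drdy = False
        if self._pending is not None:
            addr, data, write = self._pending
            if write:
                self.regs[addr] = data & 0xFFFF
            else:
                self.do = self.regs[addr]
            self.drdy = True          # one-cycle DRDY pulse
            self._pending = None
        elif den:
            self._pending = (daddr, di, dwe)  # captured on the DEN strobe
        return self.do, self.drdy


def drp_write(drp: DrpModel, addr: int, data: int) -> None:
    drp.tick(daddr=addr, di=data, dwe=True, den=True)  # address, data, DWE, DEN pulse
    while not drp.drdy:                                # DEN and DWE Low, wait for DRDY
        drp.tick()


def drp_read(drp: DrpModel, addr: int) -> int:
    drp.tick(daddr=addr, dwe=False, den=True)          # DWE stays Low for a read
    while not drp.drdy:
        drp.tick()
    return drp.do                                      # capture DO while DRDY is High


drp = DrpModel()
drp_write(drp, 0x08, 0x1041)
readback = drp_read(drp, 0x08)
```

Both helpers block until DRDY is observed, which mirrors the rule that a new operation must not start before the previous one has been acknowledged.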
Read—Write Operation
A read-write operation must always be executed with respect to the DRDY signal. A new read or write
operation can be initiated only after the DRDY signal pulses High. If the DRDY signal is not checked
after a read or write operation on the DRP port, there is no guarantee that the written bits have taken
effect or that the bits read back represent the value of the register.
The user-accessible DRP register set is described in this section. The DRP register map spans from
address 0x00 to address 0x7F (7-bit address bus). The following figure shows the layout of the
register map of user-accessible registers. Be aware that values of different counters overlap register
boundaries.
‼ Important: When operating a DRP port, it is recommended that the existing contents of the register
that is going to be changed are first read. Write back the register contents where only the required bits
are modified. Modify only the colored bits and always maintain the state of the gray bits.
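The read-modify-write recommendation above amounts to a masked update of a 16-bit word. The following Python sketch shows the bit manipulation only; the function name and the example mask are hypothetical, since the actual writable (colored) bit positions come from the register figures, which are not reproduced here.

```python
def modify_bits(current: int, new_value: int, mask: int) -> int:
    """Return a 16-bit word in which only the bits selected by mask take
    their value from new_value; all other bits keep their current state."""
    return ((current & ~mask) | (new_value & mask)) & 0xFFFF


current = 0xABCD          # value read back from the register (example)
writable_mask = 0x00FF    # hypothetical set of modifiable bits
updated = modify_bits(current, 0x0012, writable_mask)
print(hex(updated))
```

The upper byte of the example word is preserved while the lower byte is replaced, which is the behavior the recommendation asks for.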
MMCM Registers
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 0 0 0
12 mc_res(2)
11 mc_res(1)
8 mc_res(0)
4 mc_lfhf(0)
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 0
12 mc_cp(2)
11 mc_cp(1)
8 mc_cp(0)
Registers 0x4F and 0x4E define the values for the loop filters. Pick the appropriate values for these
filters from the MMCM and PLL Dynamic Reconfiguration Application Note (XAPP888).
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 1 0 0 0 0
12 mc_interp_en(6)
11 mc_interp_en(5)
8 mc_interp_en(4)
7 mc_interp_en(3)
4 mc_interp_en(2)
3 mc_interp_en(1)
0 mc_interp_en(0)
1. If any of the output counters is using fine phase shift, mc_interp_en[3:0] must be set to
1111; otherwise, mc_interp_en[3:0] must be set to 0000.
2. mc_interp_en(4) is always set to 1.
3. If any of the output counters uses a VCO phase other than 0 or 180, uses fractional
division, or uses spread-spectrum mode, mc_interp_en[7:5] must be set to
111; otherwise, mc_interp_en[7:5] must be set to 000.
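The three notes above can be combined into one small function that derives the 8-bit mc_interp_en field. This Python sketch computes the logical field value only; it does not place the bits into their register positions, and the function and parameter names are hypothetical.

```python
def mc_interp_en(uses_fine_phase_shift: bool, uses_interpolated_phase: bool) -> int:
    """Derive mc_interp_en[7:0] from the rules in the notes.

    uses_interpolated_phase should be True when any output counter uses a VCO
    phase other than 0 or 180, fractional division, or spread-spectrum mode.
    """
    value = 0b0001_0000              # note 2: bit 4 is always 1
    if uses_fine_phase_shift:
        value |= 0b0000_1111         # note 1: bits [3:0] = 1111
    if uses_interpolated_phase:
        value |= 0b1110_0000         # note 3: bits [7:5] = 111
    return value


print(format(mc_interp_en(True, False), "08b"))
```

Keeping the rule in one place makes it harder to update one group of bits and forget the other when a counter configuration changes.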
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 1 1 1 1 1 1 1 0 1 0 0 1
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
14:10 mc_lock_ref_dly[4:0] Window setting for the lock circuit of the reference
clock.
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
14:10 mc_lock_fb_dly[4:0] Window setting for the lock circuit of the feedback clock.
9:0 mc_lock_sat_high[9:0] Counter setting the number of clock cycles the MMCM
needs to have CLKREF and CLKFB misaligned within a
certain window before deasserting the LOCKED output.
Default value is 1.
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 1 1 1 1 1 0 1 0 0 0
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
9:0 mc_lock_cnt[9:0] Counter setting the number of clock cycles the MMCM
needs to have CLKREF and CLKFB aligned within a
certain window before the LOCKED output is asserted.
Default value is 1000.
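The Default rows in these tables list field values bit by bit, most significant bit first. As a quick cross-check, this Python sketch converts the stated decimal defaults into their 10-bit patterns; the helper name is hypothetical.

```python
def field_bits(value: int, width: int) -> str:
    """Format a register field value as a fixed-width binary string, MSB first."""
    return format(value, f"0{width}b")


lock_cnt_default = field_bits(1000, 10)     # mc_lock_cnt[9:0]
lock_sat_high_default = field_bits(1, 10)   # mc_lock_sat_high[9:0]
print(lock_cnt_default, lock_sat_high_default)
```

The pattern for 1000 is 1111101000, which matches the Default row shown for mc_lock_cnt[9:0].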
Refer to MMCM and PLL Dynamic Reconfiguration Application Note (XAPP888) to determine the values
for registers 0x1A, 0x19, and 0x18.
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 1 0 0 0 0 0 1 0 0 0 0 0 1
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 1 0 0 0 1 0 0 0 0 0 1
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
15:13 mc_ckfbout_pm_r[2:0] VCO phase selection mux and rising edge control.
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 0 0 0 1 0 0 0 0 0 0
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
15:13 mc_ckfbout_pm_f[2:0] VCO phase selection mux and falling edge control.
Registers 0x15, 0x14, and bits [15:12] of 0x13 control the fractional feedback (M counter) shown in
MMCMs.
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
Bits [10:0] of register 0x13 and register 0x12 control the CLKOUT6 counter.
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 1 0 0 0 0 0 0
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
Registers 0x11 and 0x10 control the output counter for CLKOUT4.
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 1 0 0 0 0 0 0
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
Registers 0x0F and 0x0E control the output counter for CLKOUT3.
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 1 0 0 0 0 0 0
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
Registers 0x0D and 0x0C control the output counter for CLKOUT2.
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 1 0 0 0 0 0 0
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
Registers 0x0B and 0x0A control the output counter for CLKOUT1.
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 0 0 0 1 0 0 0 0 0 0
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 0 0 0 1 0 0 0 0 0 0
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
The fractional output counter for CLKOUT0 is controlled by registers 0x09, 0x08, and bits [15:12] of
register 0x07.
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 0 0 0
15:10 mc_in_dly_set[5:0] Counter delay setting. Control how much delay is inserted
in the path.
9:4 mc_in_dly_mx_dvdd[5:0]
3 mc_direct_path_cntrl Reserved.
Type Description
CDDC Clock divide dynamic change. Allows the DRP registers to be changed
without the need for an MMCM reset. When this option is enabled, it
functions with the CDDCREQ and CDDCDONE handshake pins. For more
information, see MMCM Clock Divide Dynamic Change.
EDGE Clock edge identification. Identify the clock edge used for a high to low
transition of the counter.
PM VCO phase selection. Used to select one of the eight possible VCO
outputs.
EN Counter enable.
HT Counter high time. Set the delay the counter needs to output a high value.
LT Counter low time. Set the delay the counter needs to output a low value.
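As an illustration of how HT, LT, and EDGE relate to an integer divide value, the sketch below splits a divide value into a high time and a low time for a duty cycle near 50%. It assumes that HT and LT are counted in VCO cycles, that HT + LT equals the divide value, and that EDGE is set for odd divide values. These assumptions follow the convention used for earlier device families in the MMCM and PLL Dynamic Reconfiguration Application Note (XAPP888) and are not confirmed by the table above. The function name `split_divide` is hypothetical.

```python
def split_divide(divide: int) -> tuple[int, int, int]:
    """Split an integer divide value into (HT, LT, EDGE) for ~50% duty cycle.

    Assumption: HT + LT == divide, both in VCO cycles, and EDGE = 1 marks
    an odd divide whose falling edge needs a half-cycle adjustment.
    """
    if divide < 2:
        # A divide of 1 has no separate high and low time to program.
        raise ValueError("divide must be 2 or greater for this sketch")
    high_time = divide // 2
    low_time = divide - high_time
    edge = divide % 2
    return high_time, low_time, edge


ht, lt, edge = split_divide(5)
```

For a divide value of 5, the sketch yields a high time of 2, a low time of 3, and EDGE set.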
Two of the counters, CLKFBOUT and CLKOUT0, are fractional counters. A fractional counter uses two
non-fractional counters, an extra state, and adder logic. For this reason, a fractional counter has two
enables (one for each counter, to allow non-fractional use) and two VCO phase selection settings. For
the adder and state logic, the VCO phase selection is further split into rising and falling settings. Additional
register configuration options defining the fractional counters are listed in the following table.
Type Description
PM_R Select one of the eight VCO phased outputs as rising edge counter clock.
PM_L Select one of the eight VCO phased outputs as falling edge counter clock.
Fractional counter mode is enabled when both mc_ckout_en and mc_ckout_frac_en are set. The two
counters take different phases from the VCO outputs.
Example 1
The output starts High on a rising edge of counter A and goes Low on the second rising edge of counter B. It
goes High again on the second rising edge of counter A after that, and so on.
Example 2
The output starts High on a rising edge of counter A and goes Low on the second rising edge of counter B. It
goes High again on the next rising edge of counter B. Counter B then switches to VCO phase 90, and the output
goes Low again on the second rising edge of that phase, and so on.
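The phase selection fields choose one of eight VCO outputs, which are spaced 45 degrees apart (360 degrees divided by eight). The following Python sketch converts between a phase selection index and degrees, and shows the corresponding delay for a given VCO frequency. The function names and the 1000 MHz VCO frequency are assumptions used only for illustration.

```python
VCO_PHASES = 8                 # eight selectable VCO outputs
PHASE_STEP_DEG = 360.0 / VCO_PHASES


def phase_select_to_degrees(sel: int) -> float:
    """Convert a phase selection index (0 to 7) to degrees."""
    if not 0 <= sel < VCO_PHASES:
        raise ValueError("phase selection must be in the range 0 to 7")
    return sel * PHASE_STEP_DEG


def degrees_to_phase_select(degrees: float) -> int:
    """Convert a phase in degrees to a selection index; must be a 45-degree multiple."""
    steps, remainder = divmod(degrees % 360, PHASE_STEP_DEG)
    if remainder:
        raise ValueError("phase must be a multiple of 45 degrees")
    return int(steps)


def phase_delay_ps(sel: int, vco_mhz: float) -> float:
    """Delay of the selected phase relative to phase 0, in picoseconds."""
    vco_period_ps = 1e6 / vco_mhz
    return sel / VCO_PHASES * vco_period_ps


sel_90 = degrees_to_phase_select(90)        # the VCO phase 90 named in Example 2
delay = phase_delay_ps(sel_90, 1000.0)      # hypothetical 1000 MHz VCO
```

With these assumptions, VCO phase 90 corresponds to selection index 2, which is a quarter of the VCO period.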
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 1 0 0 1 1 1
The settings for register 4 are controlled by the SS_MODE attribute of the MMCM. Refer to the Spread-
Spectrum Clock Generation section for detailed information on spread-spectrum clocking setup and
behavior.
ss_steps_init ss_steps
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0
PLL Registers
The PLL DRP register set is similar to and parallels that of the MMCM. The PLL DRP register set has
fewer changeable registers than that of the MMCM because the PLL
has only two clock outputs and does not use a selectable VCO output multiplexer and interpolator.
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 1
15 mc_gts_wait Wait for the GTS_CFG_B signal before starting the LOCKED
process.
14 mc_startup_wait Wait during the configuration start-up cycle for the MMCM
to lock.
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 0 0 0
12 mc_res(2)
11 mc_res(1)
7 mc_lfhf(1)
4 mc_lfhf(0)
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 0 0 0 1
12 mc_cp(2)
11 mc_cp(1)
8 mc_cp(0)
3 mc_cp_res(0)
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 1 1 1 1 1 1 1 0 1 0 0 1
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
14:10 mc_lock_ref_dly[4:0] Window setting for the lock circuit of the reference
clock.
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
14:10 mc_lock_fb_dly[4:0] Window setting for the lock circuit on the feedback clock.
9:0 mc_unlock_cnt[9:0] Counter setting the number of clock cycles the PLL needs
to have CLKREF and CLKFB misaligned within a certain
window before deasserting the LOCKED output. Default
value is 1.
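The Default row can be cross-checked against the field descriptions by decoding it. The sketch below assumes that the 15 Default entries listed for this register map to bits 14 down to 0, because the Access row also has 15 entries; that mapping is an assumption about the flattened table, not something the table states. The helper name `bits_to_int` is hypothetical.

```python
def bits_to_int(bits: str) -> int:
    """Convert a space-separated string of 0/1 entries (MSB first) to an integer."""
    return int(bits.replace(" ", ""), 2)


# Default row for this register, assumed to cover bits 14..0.
default = bits_to_int("0 0 0 1 1 0 0 0 0 0 0 0 0 0 1")

mc_lock_fb_dly = (default >> 10) & 0x1F   # bits 14:10
mc_unlock_cnt = default & 0x3FF           # bits 9:0
```

Under that assumption, the decoded mc_unlock_cnt matches the default value of 1 given in the field description.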
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 0 0 0 0 0 0 1
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
9:0 mc_lock_cnt[9:0] Counter setting the number of clock cycles the PLL needs
to have CLKREF and CLKFB aligned within a certain window
before the LOCKED output is asserted. Default value is
1000.
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 1 0 0 0 0 0 1 0 0 0 0 0 1
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 1 0 0 0 0 0 0
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 1 0 0 0 0 0 1 0 0 0 0 0 1
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 1 0 0 0 0 0 0
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 1 0 0 0 0 0 1 0 0 0 0 0 0
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 1 0 0 0 0 0 0
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 1 0 0 0 0 0 1 0 0 0 0 0 1
Access R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W R/W
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0 0 0 0 0 0
15:10 mc_in_dly_set[5:0] Counter delay setting. Controls how much delay is inserted
in the path.
9:4 mc_in_dly_mx_dvdd[5:0]
3 mc_direct_path_cntrl Reserved.
Bit 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Default 0 0
Clocking Guidelines
Clocking a design involves more than adding clock buffers, instantiating an MMCM and/or PLL, and applying
one or two constraints in an XDC file. Clocking and the setup of a clocking network need
attention. To create a design that can be implemented (synthesized, placed, and routed) using all Vivado
Design Suite features and that makes the FPGA function under optimal conditions when downloaded, follow
the guidelines provided in the chapters Clocking Guidelines and Clock Domain Crossing of the UltraFast
Design Methodology Guide for FPGAs and SoCs (UG949).
UltraFast Design Methodology Guide for FPGAs and SoCs (UG949) offers a set of best practices
intended to help streamline the design process for new devices. The size and complexity of these
designs require specific steps and design tasks to ensure success at each stage of the design.
Following these steps and adhering to the best practices helps you achieve your desired design goals
as quickly and efficiently as possible. Two other documents that can be useful for designing are:
UltraFast Design Methodology Quick Reference Guide (UG1231)
UltraFast Design Methodology Checklist (XTP301)
Documentation Navigator
Documentation Navigator (DocNav) is an installed tool that provides access to AMD Adaptive
Computing documents, videos, and support resources, which you can filter and search to find
information. To open DocNav:
From the AMD Vivado™ IDE, select Help > Documentation and Tutorials.
On Windows, click the Start button and select Xilinx Design Tools > DocNav.
At the Linux command prompt, enter docnav.
✎ Note: For more information on DocNav, refer to the Documentation Navigator User Guide (UG968).
Design Hubs
AMD Design Hubs provide links to documentation organized by design tasks and other topics, which
you can use to learn key concepts and address frequently asked questions. To access the Design Hubs:
Support Resources
For support resources such as Answers, Documentation, Downloads, and Forums, see Support.
References
1. UltraFast Design Methodology Guide for FPGAs and SoCs (UG949)
2. UltraScale and UltraScale+ FPGAs Packaging and Pinouts Product Specification (UG575)
3. UltraScale Architecture SelectIO Resources User Guide (UG571)
4. Spartan UltraScale+ FPGAs SelectIO Resources User Guide (UG861)
5. Vivado Design Suite User Guide: Using Constraints (UG903)
6. UltraScale and UltraScale+ device data sheets:
UltraScale Architecture and Product Data Sheet: Overview (DS890)
Zynq UltraScale+ MPSoC Data Sheet: Overview (DS891)
Kintex UltraScale FPGAs Data Sheet: DC and AC Switching Characteristics (DS892)
Virtex UltraScale FPGAs Data Sheet: DC and AC Switching Characteristics (DS893)
Kintex UltraScale+ FPGAs Data Sheet: DC and AC Switching Characteristics (DS922)
Virtex UltraScale+ FPGA Data Sheet: DC and AC Switching Characteristics (DS923)
Zynq UltraScale+ MPSoC Data Sheet: DC and AC Switching Characteristics (DS925)
Spartan UltraScale+ FPGA Data Sheet: DC and AC Switching Characteristics (DS930)
Artix UltraScale+ FPGA Data Sheet: DC and AC Switching Characteristics (DS931)
7. MMCM and PLL Dynamic Reconfiguration Application Note (XAPP888)
8. UltraScale Architecture Libraries Guide (UG974)
9. Clocking Wizard LogiCORE IP Product Guide (PG065)
10. UltraFast Design Methodology Quick Reference Guide (UG1231)
11. UltraFast Design Methodology Checklist (XTP301)
12. Zynq UltraScale+ Device Packaging and Pinouts Product Specification User Guide (UG1075)
13. Zynq UltraScale+ Device Technical Reference Manual (UG1085)
14. Vivado Design Suite User Guide: Programming and Debugging (UG908)
15. Vivado Design Suite Tutorial: Implementation (UG986)
Revision History
Clock Structure
Updated description of routing tracks in first paragraph.
Added Table 1.
Added note after Figure 3.
Table 1 Updated 1.
PLL Primitives
Added paragraph about PLL primitives in Spartan UltraScale+
devices.
Added Figure 2.
PLLE3_BASE, PLLE4_BASE, and PLL4XP_BASE Primitive
Added PLL4XP_BASE
Added Table 2.
Table 1
Changed CLKOUTPHY to CLKOUTPHY_N, CLKOUTPHY_P.
Added PLLE4XP ports.
Clock Management Tile
Updated Figure 1.
Updated the table for register 15 in MMCM Registers.
Clock Management Tile
Updated Table 1 footnote.
Updated Spread-Spectrum Clock Generation section with new
content and equations.
Updated Table 1.
Clock Management Tile
Updated the example in Determine the Input Frequency section.
Added the new sections Dynamic Reconfiguration Port and Clocking
Guidelines.
Clock Management Tile
In Table 1, updated the description of BUF_IN for the COMPENSATION
attribute.
Clocking Resources
Updated the discussion on page 14. Added clarification to the BUFG_GT
and BUFG_GT_SYNC section.
Clock Management Tile
Updated the Dynamic Phase Shift Interface in the MMCM section.
Added Table 2 and Table 4.
In Table 1, updated the descriptions for CLKOUT[0:1]_PHASE and
CLKFBOUT_PHASE.
Overview
Updated the discussion in Key Differences from 7 Series FPGAs about the
differences between clock capable and global clock pins.
Clocking Resources
Added clarification to the Global Clock Inputs section.
Added further information following Figure 3.
Updated the BUFGCE_DIV section.
Revised the BUFG_GT_SYNC description after Figure 1 to include the
UltraScale+ devices.
Clock Management Tile
Added the UltraScale+ device MMCME4 and PLLE4 primitives to the
MMCM Primitives and PLL Primitives sections.
Updated the description of PSCLK cycles in the Dynamic Phase Shift
Interface in the MMCM section.
Added a Recommended note to CLKOUT[0:6] – Output Clocks.
Updated the CLKINSTOPPED – Input Clock Status section.
Added CLKFBOUT and CLKFBIN to Table 1 and their descriptions
below the table.
Updated the CLKOUTPHYEN – PHY Clock Enable description.
Added Figure 1 and Figure 2.
In Table 1, updated the DIVCLK_DIVIDE allowed values and added
PHY_ALIGN to the COMPENSATION attribute.
General updates
Updated the Please Read: Important Legal Notices section.
Overview
Under Introduction to the UltraScale Architecture, added new
introductory text for UltraScale+ devices.
Added ninth bullet under Key Differences from 7 Series FPGAs.
Clocking Resources
Updated first paragraph under Global Clock Inputs to include
information about HDGC pins.
Updated first paragraph under Clock Structure.
Added Important note under Clock Buffers.
Added second paragraph under BUFCE_LEAF Clock Buffer.
Added first two sentences under BUFG_GT and BUFG_GT_SYNC.
Added BUFG_PS section.
Clock Management Tile
Updated Frequency Synthesis Using Fractional Divide in the MMCM
by changing 0.125 degrees to 0.125.
Revised the heading Static Phase Shift Mode (MMCM and PLL) by
adding (MMCM and PLL).
Revised the heading MMCM Clock Divide Dynamic Change by adding
MMCM.
Added Important note under CLKFBIN – Feedback Clock Input.
In Table 1, added a row of UltraScale+ device MMCM attributes for
CLKFBOUT_MULT_F, changed default value for COMPENSATION
from ZHOLD to AUTO, revised the COMPENSATION description, and
added note 4 and note 5.
In Table 1, added a row of UltraScale+ device PLL attributes for
CLKFBOUT_MULT, revised the COMPENSATION description, and
added note 1.
Added Dynamic Reconfiguration Port section.
Clock Management Tile
In Table 1, changed the Allowed Values attribute for CLKIN1_PERIOD
and CLKIN2_PERIOD.
In Table 1, changed the Allowed Values attribute for CLKIN_PERIOD.
Clocking Resources
Replaced clock-capable with global clock in Global Clock Inputs.
Updated Byte Clock Inputs.
Added BUFG_GT_SYNC to BUFG_GT and BUFG_GT_SYNC.
Clock Management Tile
Updated Figure 1 and added tip for Table 1.
Updated Figure 1.
subject to change and may be rendered inaccurate for many reasons, including but not limited to
product and roadmap changes, component and motherboard version changes, new model and/or
product releases, product differences between differing manufacturers, software changes, BIOS
flashes, firmware upgrades, or the like. Any computer system has risks of security vulnerabilities that
cannot be completely prevented or mitigated. AMD assumes no obligation to update or otherwise
correct or revise this information. However, AMD reserves the right to revise this information and to
make changes from time to time to the content hereof without obligation of AMD to notify any person
of such revisions or changes. THIS INFORMATION IS PROVIDED "AS IS." AMD MAKES NO
REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO
RESPONSIBILITY FOR ANY INACCURACIES, ERRORS, OR OMISSIONS THAT MAY APPEAR IN THIS
INFORMATION. AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY, OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE
LIABLE TO ANY PERSON FOR ANY RELIANCE, DIRECT, INDIRECT, SPECIAL, OR OTHER
CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN,
EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
Copyright
© Copyright 2013-2025 Advanced Micro Devices, Inc. AMD, the AMD Arrow logo, Artix, ISE, Kintex,
Spartan, UltraScale, UltraScale+, Virtex, Vivado, Zynq, and combinations thereof are trademarks of
Advanced Micro Devices, Inc. Other product names used in this publication are for identification
purposes only and may be trademarks of their respective companies.