This document describes the steps to create standalone Compute Engine instances that use A4X Max accelerator-optimized machine types. To learn about compute instance and cluster creation options, see the Deployment options overview page.
A Compute Engine instance, or compute instance, is a computing resource hosted on Google's infrastructure that can be either a virtual machine (VM) or a bare metal instance. A4X Max instances are available as bare metal instances, which differ from VM instances by providing direct, non-virtualized access to the underlying physical hardware. To learn more about the A4X Max machine type, see A4X Max series in the Compute Engine documentation.
When you create a standalone A4X Max instance, the following limitations apply:
Before creating A4X Max instances, if you haven't already done so, complete the following steps:
Select the tab for how you plan to use the samples on this page:
When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.
In the Google Cloud console, activate Cloud Shell.
At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
Install the Google Cloud CLI.
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.
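As a minimal sketch of how those credentials reach a REST request, the gcloud CLI can print an access token that is then sent as a bearer token in the Authorization header. The helper name auth_header is hypothetical, and the commented-out curl call assumes that the gcloud CLI is installed and initialized.

```shell
#!/bin/bash
# Hypothetical helper: format an OAuth 2.0 access token as an HTTP header.
# $1 = access token.
auth_header() {
  printf 'Authorization: Bearer %s' "$1"
}

# Example with a placeholder token (prints the header only, sends nothing).
auth_header "EXAMPLE_TOKEN"; echo

# With the gcloud CLI installed, a real request could look like this:
# curl -H "$(auth_header "$(gcloud auth print-access-token)")" \
#   "https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances"
```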
To get the permissions that you need to create compute instances, ask your administrator to grant you the Compute Instance Admin (v1) (roles/compute.instanceAdmin.v1) IAM role on the project.
For more information about granting roles, see Manage access to projects, folders, and organizations.
This predefined role contains the permissions required to create compute instances. To see the exact permissions that are required, expand the Required permissions section:
The following permissions are required to create compute instances:
compute.instances.create on the project
compute.images.useReadOnly on the image
compute.snapshots.useReadOnly on the snapshot
compute.instanceTemplates.useReadOnly on the instance template
compute.subnetworks.use on the project or on the chosen subnet
compute.addresses.use on the project
compute.subnetworks.useExternalIp on the project or on the chosen subnet
compute.networks.use on the project
compute.networks.useExternalIp on the project
compute.instances.setMetadata on the project
compute.instances.setTags on the VM
compute.instances.setLabels on the VM
compute.instances.setServiceAccount on the VM
compute.disks.create on the project
compute.disks.use on the disk
compute.disks.useReadOnly on the disk
You might also be able to get these permissions with custom roles or other predefined roles.
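For the predefined role, the grant is typically made by an administrator. As a sketch, the following prints the gcloud projects add-iam-policy-binding command for the Compute Instance Admin (v1) role without running it. The function name grant_cmd and the example principal are hypothetical.

```shell
#!/bin/bash
# Print (do not run) the command an administrator could use to grant
# the Compute Instance Admin (v1) role on a project.
# $1 = project ID, $2 = principal such as user:EMAIL (both placeholders).
grant_cmd() {
  printf 'gcloud projects add-iam-policy-binding %s --member=%s --role=%s\n' \
    "$1" "$2" "roles/compute.instanceAdmin.v1"
}

grant_cmd "PROJECT_ID" "user:alice@example.com"
```

Printing the command first makes it easy to review the project and principal before anyone changes the IAM policy.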
An A4X Max cluster is organized into a hierarchy of blocks and sub-blocks to facilitate large-scale, non-blocking network performance. Understanding this topology is key when reserving capacity and deploying workloads.
The following table shows the supported topology options for A4X Max instances:
| Topology (gpuTopology) | Number of GPUs | Number of instances |
|---|---|---|
| 1x72 | 72 | 18 |
Tip: To use an NVLink domain with the 1x72 topology, you can run the single instance creation commands 18 times or use the instances.bulkInsert method, which is designed to create multiple instances with a single API request. To create A4X Max instances in bulk, see Create A4X Max instances in bulk.
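The instance count follows from simple arithmetic: the table implies 72 / 18 = 4 GPUs per instance. The sketch below computes the count and prints one name per instance in the NVLink domain. The function names and the a4x-max name prefix are hypothetical.

```shell
#!/bin/bash
# GPUs per instance follows from the table: 72 GPUs / 18 instances = 4.
GPUS_PER_INSTANCE=4

# Number of instances needed to fill a topology with the given GPU count.
# $1 = total number of GPUs in the topology.
instances_for() {
  echo $(( $1 / GPUS_PER_INSTANCE ))
}

# Print one hypothetical instance name per slot in the NVLink domain.
# $1 = name prefix, $2 = total number of GPUs in the topology.
instance_names() {
  prefix="$1"
  count="$(instances_for "$2")"
  i=1
  while [ "$i" -le "$count" ]; do
    printf '%s-%02d\n' "$prefix" "$i"
    i=$(( i + 1 ))
  done
}

instances_for 72            # prints 18
instance_names a4x-max 72   # prints a4x-max-01 through a4x-max-18
```

Each printed name could be passed to a single instance creation command, which is the "run the command 18 times" option from the tip.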
Creating an instance with the A4X Max machine type includes the following steps:
To set up the network for A4X Max machine types, create two VPC networks for the following network interfaces:
default-subnet-1-RDMA_NAME_PREFIX-net that is automatically provided, and all eight CX-8 NICs use this subnet. These NICs use RDMA over Converged Ethernet (RoCE), providing the high-bandwidth, low-latency communication that's essential for scaling out to multiple A4X Max subblocks. For a single A4X Max subblock, you can skip this VPC network because within a single subblock, direct GPU-to-GPU communication is handled by the multi-node NVLink.
For more information about NIC arrangement, see Review network bandwidth and NIC arrangement.
Create the networks either manually by following the instruction guides or automatically by using the provided script.
To create the networks, you can use the following instructions:
For these VPC networks, we recommend setting the
maximum transmission unit (MTU) to a larger value.
For A4X Max machine types, the recommended MTU is 8896 bytes.
To review the recommended MTU settings for other GPU machine types, see
MTU settings for GPU machine types.
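As a sketch of how the setting could be verified after a network exists, the following compares a reported MTU with the recommended value of 8896. The function name check_mtu is hypothetical, and the commented-out line assumes that the gcloud CLI is installed and that NETWORK_NAME is replaced with a real network name.

```shell
#!/bin/bash
RECOMMENDED_MTU=8896

# Compare a network's MTU with the value recommended for A4X Max.
# $1 = MTU reported for the network.
check_mtu() {
  if [ "$1" -eq "$RECOMMENDED_MTU" ]; then
    echo "ok"
  else
    echo "mismatch: got $1, expected $RECOMMENDED_MTU"
  fi
}

check_mtu 8896   # prints ok
check_mtu 1460   # prints mismatch: got 1460, expected 8896

# With the gcloud CLI installed, the live value could be read like this:
# check_mtu "$(gcloud compute networks describe NETWORK_NAME --format='value(mtu)')"
```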
To create the networks, follow these steps.
Use the following script to create regular VPC networks for the IDPF NICs.
#!/bin/bash
# Create regular VPC network for the IDPF NICs
gcloud compute networks create IDPF_NETWORK_PREFIX-net \
  --subnet-mode=custom \
  --mtu=8896 \
  --enable-ula-internal-ipv6
# Create subnets for the IDPF NICs. The names (IDPF_NETWORK_PREFIX-sub-0 and
# IDPF_NETWORK_PREFIX-sub-1) match the subnets used when creating the instance.
for N in $(seq 0 1); do
  gcloud compute networks subnets create IDPF_NETWORK_PREFIX-sub-$N \
    --network=IDPF_NETWORK_PREFIX-net \
    --region=REGION \
    --stack-type=IPV6_ONLY \
    --ipv6-access-type=INTERNAL
done
# Create a firewall rule that allows TCP, UDP, and ICMPv6 (protocol 58)
# traffic from IP_RANGE
gcloud compute firewall-rules create IDPF_NETWORK_PREFIX-internal \
  --network=IDPF_NETWORK_PREFIX-net \
  --action=ALLOW \
  --rules=tcp:0-65535,udp:0-65535,58 \
  --source-ranges=IP_RANGE
If you require multiple A4X Max subblocks, use the following script to create the RoCE VPC network for the eight CX-8 NICs on each A4X Max instance.
#!/bin/bash
# List and make sure network profiles exist in the machine type's zone
gcloud compute network-profiles list --filter="location.name=ZONE"
# Create network for RDMA NICs
gcloud compute networks create RDMA_NAME_PREFIX-net \
  --network-profile=ZONE-vpc-roce-metal \
  --subnet-mode=custom \
  --mtu=8896
# For RoCE VPC networks for bare metal instances, a single subnet named
# default-subnet-1-RDMA_NAME_PREFIX-net is automatically provided.
# For more details, see https://cloud.google.com/vpc/docs/rdma-network-profiles.
Replace the following:
IDPF_NETWORK_PREFIX: the custom name prefix to use for the regular VPC networks and subnets for the IDPF NICs.
RDMA_NAME_PREFIX: the custom name prefix to use for the RoCE VPC network and subnets for the CX-8 NICs.
ZONE: a zone in which the machine type that you want to use is available, such as us-central1-a. For information about regions, see GPU availability by regions and zones.
REGION: the region where you want to create the subnets. This region must correspond to the zone specified. For example, if your zone is us-central1-a, then your region is us-central1.
IP_RANGE: the IP range to use for the SSH firewall rules.
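Because the region must correspond to the zone, it can be derived instead of typed twice. The sketch below strips the trailing zone letter and then shows the resource names that the scripts produce for a given prefix. The function name region_from_zone and the demo-rdma prefix are hypothetical.

```shell
#!/bin/bash
# Derive the region by dropping the final "-<letter>" suffix from the zone.
# $1 = zone name.
region_from_zone() {
  echo "${1%-*}"
}

ZONE="us-central1-a"
REGION="$(region_from_zone "$ZONE")"
echo "$REGION"   # prints us-central1

# Hypothetical prefix, used to show the resource names the scripts produce.
RDMA_NAME_PREFIX="demo-rdma"
echo "${RDMA_NAME_PREFIX}-net"                    # RoCE VPC network
echo "default-subnet-1-${RDMA_NAME_PREFIX}-net"   # automatically provided subnet
echo "${ZONE}-vpc-roce-metal"                     # network profile name
```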
To create a compact placement policy, use the gcloud beta compute resource-policies create group-placement command:
gcloud beta compute resource-policies create group-placement POLICY_NAME \
  --collocation=collocated \
  --gpu-topology=1x72 \
  --region=REGION
Replace the following:
POLICY_NAME: the name of the compact placement policy.
REGION: the region where you want to create the compact placement policy. Specify a region in which the machine type that you want to use is available. For information about regions, see GPU availability by regions and zones.
To create a compact placement policy, make a POST request to the beta resourcePolicies.insert method.
POST https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/regions/REGION/resourcePolicies
{
  "name": "POLICY_NAME",
  "groupPlacementPolicy": {
    "collocation": "COLLOCATED",
    "gpuTopology": "1x72"
  }
}
Replace the following:
PROJECT_ID: your project ID.
POLICY_NAME: the name of the compact placement policy.
REGION: the region where you want to create the compact placement policy. Specify a region in which the machine type that you want to use is available. For information about regions, see GPU availability by regions and zones.
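As a sketch of how the REST request above could be assembled from a script, the following builds the URL and the JSON body from the placeholders. The function names policy_url and policy_body are hypothetical, and the commented-out curl call assumes that the gcloud CLI is installed to supply an access token.

```shell
#!/bin/bash
# Build the URL for the beta resourcePolicies.insert method.
# $1 = project ID, $2 = region.
policy_url() {
  printf 'https://compute.googleapis.com/compute/beta/projects/%s/regions/%s/resourcePolicies' "$1" "$2"
}

# Build the JSON request body for a compact placement policy.
# $1 = policy name.
policy_body() {
  printf '{"name": "%s", "groupPlacementPolicy": {"collocation": "COLLOCATED", "gpuTopology": "1x72"}}' "$1"
}

# Print both pieces for review; nothing is sent.
policy_url "PROJECT_ID" "REGION"; echo
policy_body "POLICY_NAME"; echo

# With the gcloud CLI installed, the request could be sent like this:
# curl -X POST \
#   -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#   -H "Content-Type: application/json" \
#   -d "$(policy_body "POLICY_NAME")" \
#   "$(policy_url "PROJECT_ID" "REGION")"
```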
To obtain a GPU topology of 1x72, create
18 A4X Max instances. When you create the instances, apply the compact placement policy that specifies the gpuTopology
field. Applying the policy ensures that Compute Engine creates all 18 A4X Max
instances in one sub-block to use an NVLink domain.
If a sub-block lacks capacity for an A4X Max instance, then the request to create the
instance fails.
To create an A4X Max instance, select one of the following options.
The following commands also set the access scope for your instances. To simplify permissions management, Google recommends that you set the access scope on an instance to cloud-platform access and then use IAM roles to define what services the instance can access. For more information, see Scopes best practice.
To create the A4X Max instance, use the gcloud compute instances create command.
gcloud compute instances create INSTANCE_NAME \
  --machine-type=a4x-maxgpu-4g-metal \
  --image-family=IMAGE_FAMILY \
  --image-project=IMAGE_PROJECT \
  --zone=ZONE \
  --boot-disk-type=hyperdisk-balanced \
  --boot-disk-size=DISK_SIZE \
  --scopes=cloud-platform \
  --network-interface=nic-type=IDPF,network=IDPF_NETWORK_PREFIX-net,stack-type=IPV6_ONLY,subnet=IDPF_NETWORK_PREFIX-sub-0 \
  --network-interface=nic-type=IDPF,network=IDPF_NETWORK_PREFIX-net,stack-type=IPV6_ONLY,subnet=IDPF_NETWORK_PREFIX-sub-1,no-address \
  --network-interface=subnet=default-subnet-1-RDMA_NAME_PREFIX-net,stack-type=IPV6_ONLY,nic-type=MRDMA \
  --network-interface=subnet=default-subnet-1-RDMA_NAME_PREFIX-net,stack-type=IPV6_ONLY,nic-type=MRDMA \
  --network-interface=subnet=default-subnet-1-RDMA_NAME_PREFIX-net,stack-type=IPV6_ONLY,nic-type=MRDMA \
  --network-interface=subnet=default-subnet-1-RDMA_NAME_PREFIX-net,stack-type=IPV6_ONLY,nic-type=MRDMA \
  --network-interface=subnet=default-subnet-1-RDMA_NAME_PREFIX-net,stack-type=IPV6_ONLY,nic-type=MRDMA \
  --network-interface=subnet=default-subnet-1-RDMA_NAME_PREFIX-net,stack-type=IPV6_ONLY,nic-type=MRDMA \
  --network-interface=subnet=default-subnet-1-RDMA_NAME_PREFIX-net,stack-type=IPV6_ONLY,nic-type=MRDMA \
  --network-interface=subnet=default-subnet-1-RDMA_NAME_PREFIX-net,stack-type=IPV6_ONLY,nic-type=MRDMA \
  --reservation-affinity=specific \
  --reservation=RESERVATION \
  --provisioning-model=RESERVATION_BOUND \
  --instance-termination-action=TERMINATION_ACTION \
  --maintenance-policy=TERMINATE \
  --restart-on-failure \
  --resource-policies=POLICY_NAME
Replace the following:
INSTANCE_NAME: the name of the A4X Max instance.
IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Operating system details.
IMAGE_PROJECT: the project ID of the OS image.
ZONE: the zone in which the machine type that you want to use is available. You must use a zone in the same region as the compact placement policy. For information about regions, see GPU availability by regions and zones.
DISK_SIZE: the size of the boot disk in GB.
IDPF_NETWORK_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use IDPF NICs.
RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
RESERVATION: the reservation name, a block, or a subblock within a reservation. To get the reservation name or the available blocks, see View reserved capacity. Based on your requirements for instance placement, choose one of the following:
projects/RESERVATION_OWNER_PROJECT_ID/reservations/RESERVATION_NAME
projects/RESERVATION_OWNER_PROJECT_ID/reservations/RESERVATION_NAME/reservationBlocks/RESERVATION_BLOCK_NAME
projects/RESERVATION_OWNER_PROJECT_ID/reservations/RESERVATION_NAME/reservationBlocks/RESERVATION_BLOCK_NAME/reservationSubBlocks/RESERVATION_SUBBLOCK_NAME
TERMINATION_ACTION: whether Compute Engine stops (STOP) or deletes (DELETE) the A4X Max instance at the end of the reservation period.
POLICY_NAME: the name of the compact placement policy.
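The command repeats the same MRDMA interface flag eight times, and the RESERVATION value takes one of three path forms. As a sketch, the two helpers below generate those pieces from their inputs. The function names rdma_nic_flags and reservation_path are hypothetical, and the example arguments are placeholders.

```shell
#!/bin/bash
# Print the eight MRDMA --network-interface flags, one per line.
# $1 = RDMA name prefix used when the RoCE VPC network was created.
rdma_nic_flags() {
  i=1
  while [ "$i" -le 8 ]; do
    printf '%s\n' "--network-interface=subnet=default-subnet-1-$1-net,stack-type=IPV6_ONLY,nic-type=MRDMA"
    i=$(( i + 1 ))
  done
}

# Build a reservation path at reservation, block, or sub-block granularity.
# $1 = owner project ID, $2 = reservation name,
# $3 = block name (optional), $4 = sub-block name (optional).
reservation_path() {
  path="projects/$1/reservations/$2"
  if [ -n "${3:-}" ]; then path="$path/reservationBlocks/$3"; fi
  if [ -n "${4:-}" ]; then path="$path/reservationSubBlocks/$4"; fi
  echo "$path"
}

rdma_nic_flags "RDMA_NAME_PREFIX"
reservation_path "RESERVATION_OWNER_PROJECT_ID" "RESERVATION_NAME" "RESERVATION_BLOCK_NAME"
```

Generating the repeated flags keeps the eight interfaces identical, which avoids copy-and-paste drift when the prefix changes.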
To create the A4X Max instance, make a POST request to the instances.insert method.
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances
{
  "machineType": "projects/PROJECT_ID/zones/ZONE/machineTypes/a4x-maxgpu-4g-metal",
  "name": "INSTANCE_NAME",
  "disks": [
    {
      "boot": true,
      "initializeParams": {
        "diskSizeGb": "DISK_SIZE",
        "diskType": "hyperdisk-balanced",
        "sourceImage": "projects/IMAGE_PROJECT/global/images/family/IMAGE_FAMILY"
      },
      "mode": "READ_WRITE",
      "type": "PERSISTENT"
    }
  ],
  "serviceAccounts": [
    {
      "email": "default",
      "scopes": [
        "https://www.googleapis.com/auth/cloud-platform"
      ]
    }
  ],
  "networkInterfaces": [
    {
      "accessConfigs": [
        {
          "name": "external-nat",
          "type": "ONE_TO_ONE_NAT"
        }
      ],
      "network": "projects/NETWORK_PROJECT_ID/global/networks/IDPF_NETWORK_PREFIX-net",
      "nicType": "IDPF",
      "stackType": "IPV6_ONLY",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/IDPF_NETWORK_PREFIX-sub-0"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/IDPF_NETWORK_PREFIX-net",
      "nicType": "IDPF",
      "stackType": "IPV6_ONLY",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/IDPF_NETWORK_PREFIX-sub-1"
    },
    {
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/default-subnet-1-RDMA_NAME_PREFIX-net",
      "nicType": "MRDMA",
      "stackType": "IPV6_ONLY"
    },
    {
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/default-subnet-1-RDMA_NAME_PREFIX-net",
      "nicType": "MRDMA",
      "stackType": "IPV6_ONLY"
    },
    {
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/default-subnet-1-RDMA_NAME_PREFIX-net",
      "nicType": "MRDMA",
      "stackType": "IPV6_ONLY"
    },
    {
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/default-subnet-1-RDMA_NAME_PREFIX-net",
      "nicType": "MRDMA",
      "stackType": "IPV6_ONLY"
    },
    {
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/default-subnet-1-RDMA_NAME_PREFIX-net",
      "nicType": "MRDMA",
      "stackType": "IPV6_ONLY"
    },
    {
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/default-subnet-1-RDMA_NAME_PREFIX-net",
      "nicType": "MRDMA",
      "stackType": "IPV6_ONLY"
    },
    {
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/default-subnet-1-RDMA_NAME_PREFIX-net",
      "nicType": "MRDMA",
      "stackType": "IPV6_ONLY"
    },
    {
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/default-subnet-1-RDMA_NAME_PREFIX-net",
      "nicType": "MRDMA",
      "stackType": "IPV6_ONLY"
    }
  ],
  "reservationAffinity": {
    "consumeReservationType": "SPECIFIC_RESERVATION",
    "key": "compute.googleapis.com/reservation-name",
    "values": [
      "RESERVATION"
    ]
  },
  "scheduling": {
    "provisioningModel": "RESERVATION_BOUND",
    "instanceTerminationAction": "TERMINATION_ACTION",
    "onHostMaintenance": "TERMINATE",
    "automaticRestart": true
  },
  "resourcePolicies": [
    "projects/PROJECT_ID/regions/REGION/resourcePolicies/POLICY_NAME"
  ]
}
Replace the following:
PROJECT_ID: the project ID of the project where you want to create the A4X Max instance.
ZONE: the zone in which the machine type that you want to use is available. You must use a zone in the same region as the compact placement policy. For information about regions, see GPU availability by regions and zones.
INSTANCE_NAME: the name of the A4X Max instance.
DISK_SIZE: the size of the boot disk in GB.
IMAGE_PROJECT: the project ID of the OS image.
IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Operating system details.
NETWORK_PROJECT_ID: the project ID of the network.
IDPF_NETWORK_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use IDPF NICs.
REGION: the region of the subnetwork.
RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
RESERVATION: the reservation name, a block, or a subblock within a reservation. To get the reservation name or the available blocks, see View reserved capacity. Based on your requirements for instance placement, choose one of the following:
projects/RESERVATION_OWNER_PROJECT_ID/reservations/RESERVATION_NAME
projects/RESERVATION_OWNER_PROJECT_ID/reservations/RESERVATION_NAME/reservationBlocks/RESERVATION_BLOCK_NAME
projects/RESERVATION_OWNER_PROJECT_ID/reservations/RESERVATION_NAME/reservationBlocks/RESERVATION_BLOCK_NAME/reservationSubBlocks/RESERVATION_SUBBLOCK_NAME
TERMINATION_ACTION: whether Compute Engine stops (STOP) or deletes (DELETE) the A4X Max instance at the end of the reservation period.
PROJECT_ID: the project ID of the compact placement policy.
REGION: the region of the compact placement policy.
POLICY_NAME: the name of the compact placement policy.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-06-11 UTC.