Bulk create HPC-optimized instances with H4D

This document explains how to create a large number of high performance computing (HPC) virtual machine (VM) instances in bulk that are identical and independent from each other. The instances use H4D machine types and run on reserved blocks of capacity.

For more information about creating VMs in bulk, see About bulk creation of VMs. To create instances in bulk that don't use reservations for enhanced cluster management capabilities, see Create VMs in bulk instead.

To learn about other ways to create large clusters of tightly-coupled H4D VMs, see the Overview of HPC cluster creation page.

Before you begin

  • Choose a consumption option: to create compute instances in bulk and enable enhanced cluster management capabilities, you can choose a Future Reservation in Calendar mode or Spot VMs.

    If you choose to use Spot VMs, the VMs might not be compactly collocated. Also, Spot VMs can be preempted as needed and they are not eligible for managing host maintenance events for groups of VMs.

  • Obtain capacity: the process to obtain capacity differs for each consumption option.

    To learn more, see Choose a consumption option and obtain capacity.

  • If you haven't already, set up authentication. Authentication verifies your identity for access to Google Cloud services and APIs. To run code or samples from a local development environment, authenticate to Compute Engine by using the option that matches how you plan to use the samples on this page:

    Console

    When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.

    gcloud

    1. Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:

      gcloud init

      If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

    2. Set a default region and zone.

    REST

    To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.

      Install the Google Cloud CLI.

      If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

    For more information, see Authenticate for using REST in the Google Cloud authentication documentation.

  • Required roles

    To get the permissions that you need to create VMs in bulk, ask your administrator to grant you the following IAM roles on the project:

    For more information about granting roles, see Manage access to projects, folders, and organizations.

    These predefined roles contain the permissions required to create VMs in bulk. The exact permissions that are required are listed in the following Required permissions section:

    Required permissions

    The following permissions are required to create VMs in bulk:

    You might also be able to get these permissions with custom roles or other predefined roles.

    Overview

    Creating HPC instances in bulk with the H4D machine type includes the following steps:

    1. Optional: Create Virtual Private Cloud networks.
    2. Optional: Create a placement policy if you aren't creating the compute instances on the same block or sub-block.
    3. Create H4D instances in bulk.

    Optional: Create Virtual Private Cloud networks

    When you create a compute instance, you can specify a VPC network and subnet. If you omit this configuration, the default network and subnet are used.

    To use Cloud RDMA with H4D instances, you must have at least two networks configured, one for each type of network interface (NIC):

    Instances that use Cloud RDMA can have only one IRDMA interface. In addition to the one required GVNIC interface, you can add up to eight more GVNIC network interfaces, for a total of 10 vNICs per instance.
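    The interface limits above can be checked before you build a request. The following is a minimal sketch, not part of any Google Cloud library; the function name validate_nic_types and the idea of passing NIC types as a plain list are assumptions made for illustration.

```python
from collections import Counter

MAX_VNICS = 10  # total vNIC limit per instance, as stated in the text above


def validate_nic_types(nic_types):
    """Return a list of problems found in a proposed list of NIC types.

    Hypothetical helper: it only encodes the limits described on this page
    (exactly one IRDMA interface, at least one GVNIC, at most 10 vNICs).
    """
    counts = Counter(nic_types)
    problems = []
    if counts["IRDMA"] != 1:
        problems.append("Cloud RDMA needs exactly one IRDMA interface")
    if counts["GVNIC"] < 1:
        problems.append("at least one GVNIC interface is required")
    if len(nic_types) > MAX_VNICS:
        problems.append(f"more than {MAX_VNICS} vNICs requested")
    return problems


if __name__ == "__main__":
    # Two GVNIC interfaces and one IRDMA interface, as in the examples on this page.
    print(validate_nic_types(["GVNIC", "GVNIC", "IRDMA"]))
```

    An empty list means that the proposed layout is within the limits described here. Compute Engine remains the authority on what is accepted.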

    To set up the Falcon VPC networks to use with your instances, you can either follow the documented instructions or use the provided script.

    Instruction guides

    To create the networks, you can use the following instructions:

    Script

    You can create up to nine GVNIC network interfaces and one IRDMA network interface per instance. Each network interface must attach to a separate network. To create the networks, you can use the following script, which creates one regular VPC network for each GVNIC interface that you request and one Falcon VPC network for the IRDMA interface.

    1. Optional: Before running the script, list the Falcon VPC network profiles to verify there is one available.
        gcloud compute network-profiles list
        
    2. Copy the following code and run it in a Linux shell window.

        #!/bin/bash
        # Set the number of GVNIC interfaces to create. You can create up to 9.
        NUM_GVNIC=NUMBER_OF_GVNIC
      
        # Create regular VPC networks and subnets for the GVNIC interfaces
        for N in $(seq 0 $(($NUM_GVNIC - 1))); do
            gcloud compute networks create GVNIC_NAME_PREFIX-net-$N \
                --subnet-mode=custom

            gcloud compute networks subnets create GVNIC_NAME_PREFIX-sub-$N \
                --network=GVNIC_NAME_PREFIX-net-$N \
                --region=REGION \
                --range=10.$N.0.0/16

            gcloud compute firewall-rules create GVNIC_NAME_PREFIX-internal-$N \
                --network=GVNIC_NAME_PREFIX-net-$N \
                --action=ALLOW \
                --rules=tcp:0-65535,udp:0-65535,icmp \
                --source-ranges=10.0.0.0/8
        done
      
        # Create SSH firewall rules
        gcloud compute firewall-rules create GVNIC_NAME_PREFIX-ssh \
            --network=GVNIC_NAME_PREFIX-net-0 \
            --action=ALLOW \
            --rules=tcp:22 \
            --source-ranges=IP_RANGE
      
        # Optional: Create a firewall rule for the external IP address for the
        #  first GVNIC network interface
        gcloud compute firewall-rules create GVNIC_NAME_PREFIX-allow-ping-net-0 \
            --network=GVNIC_NAME_PREFIX-net-0 \
            --action=ALLOW \
            --rules=icmp \
            --source-ranges=IP_RANGE
      
        # Create a Falcon VPC network for the Cloud RDMA network interface
        gcloud compute networks create RDMA_NAME_PREFIX-irdma \
            --network-profile=ZONE-vpc-falcon \
            --subnet-mode=custom

        # Create a subnet in the Falcon VPC network. The second octet is set to
        # NUM_GVNIC so that the range doesn't overlap the GVNIC subnet ranges,
        # which use 10.0.0.0/16 through 10.(NUM_GVNIC - 1).0.0/16.
        gcloud compute networks subnets create RDMA_NAME_PREFIX-irdma-sub \
            --network=RDMA_NAME_PREFIX-irdma \
            --region=REGION \
            --range=10.${NUM_GVNIC}.0.0/16
        

      Replace the following:

      • NUMBER_OF_GVNIC: the number of GVNIC interfaces to create. Specify a number from 1 to 9.
      • GVNIC_NAME_PREFIX: the name prefix to use for the regular VPC network and subnet that uses a GVNIC NIC type.
      • REGION: the region where you want to create the networks. This must correspond to the zone specified for the --network-profile flag, when creating the Falcon VPC network. For example, if you specify the zone as europe-west4-b, then your region is europe-west4.
      • IP_RANGE: the range of IP addresses outside of the VPC network to use for the SSH firewall rules. As a best practice, specify the specific IP address ranges that you need to allow access from, rather than all IPv4 or IPv6 sources. Don't use 0.0.0.0/0 or ::/0 as a source range because this allows traffic from all IPv4 or IPv6 sources, including sources outside of Google Cloud.
      • RDMA_NAME_PREFIX: the name prefix to use for the VPC network and subnet that uses the IRDMA NIC type.
      • ZONE: the zone where you want to create the networks and compute instances. Use either us-central1-a or europe-west4-b.
    3. Optional: To verify that the VPC network resources are created successfully, check the network settings in the Google Cloud console:

      1. In the Google Cloud console, go to the VPC networks page.

        Go to VPC networks

      2. Search the list for the networks that you created in the previous step.
      3. To view the subnets, firewall rules, and other network settings, click the name of the network.
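    Before running the script, it can help to preview the names and IP ranges that it would produce. The following sketch mirrors the script's naming scheme without calling gcloud. The helper names (region_from_zone, plan_networks, overlapping_ranges) are hypothetical, and choosing the next unused /16 block for the IRDMA subnet is an assumption of this sketch; for two GVNIC networks it yields 10.2.0.0/16.

```python
import ipaddress


def region_from_zone(zone):
    """Derive the region by dropping the zone suffix, for example '-b'."""
    return zone.rsplit("-", 1)[0]


def plan_networks(num_gvnic, gvnic_prefix, rdma_prefix, zone):
    """List the networks, subnets, and ranges that the script would create."""
    if not 1 <= num_gvnic <= 9:
        raise ValueError("num_gvnic must be between 1 and 9")
    plan = [
        {
            "network": f"{gvnic_prefix}-net-{n}",
            "subnet": f"{gvnic_prefix}-sub-{n}",
            "range": f"10.{n}.0.0/16",
        }
        for n in range(num_gvnic)
    ]
    # Assumption: place the IRDMA subnet in the next unused /16 block.
    plan.append(
        {
            "network": f"{rdma_prefix}-irdma",
            "subnet": f"{rdma_prefix}-irdma-sub",
            "range": f"10.{num_gvnic}.0.0/16",
            "profile": f"{zone}-vpc-falcon",
        }
    )
    return plan


def overlapping_ranges(plan):
    """Return every pair of planned subnet ranges that overlap."""
    nets = [ipaddress.ip_network(entry["range"]) for entry in plan]
    return [
        (str(a), str(b))
        for i, a in enumerate(nets)
        for b in nets[i + 1:]
        if a.overlaps(b)
    ]


if __name__ == "__main__":
    zone = "europe-west4-b"
    print(region_from_zone(zone))
    for entry in plan_networks(2, "my-gvnic", "my-rdma", zone):
        print(entry)
```

    The prefixes my-gvnic and my-rdma are placeholders. An empty result from overlapping_ranges indicates that the planned subnets don't collide with each other.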

    Optional: Create a placement policy

    You can specify VM placement by creating a compact placement policy. When you apply a compact placement policy to your VMs, Compute Engine makes best-effort attempts to create VMs that are as close to each other as possible. If your application is latency-sensitive and requires maximum compactness, then specify the maxDistance field (Preview) when you create a compact placement policy. A lower maxDistance value ensures closer VM placement, but it also increases the chance that some VMs won't be created.

    To create a compact placement policy, select one of the following options:

    gcloud

    To create a compact placement policy, use the gcloud beta compute resource-policies create group-placement command:

    gcloud beta compute resource-policies create group-placement POLICY_NAME \
        --collocation=collocated \
        --max-distance=MAX_DISTANCE \
        --region=REGION
    

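    Replace the following:

    • POLICY_NAME: the name of the compact placement policy.
    • MAX_DISTANCE: the maximum distance configuration for the VMs.
    • REGION: the region in which to create the placement policy. Create the instances in a zone in this region.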

    REST

    To create a compact placement policy, make a POST request to the beta resourcePolicies.insert method. In the request body, include the collocation field set to COLLOCATED, and the maxDistance field.

    POST https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/regions/REGION/resourcePolicies
      {
        "name": "POLICY_NAME",
        "groupPlacementPolicy": {
          "collocation": "COLLOCATED",
          "maxDistance": MAX_DISTANCE
        }
      }
    

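    Replace the following:

    • PROJECT_ID: the ID of the project in which to create the placement policy.
    • REGION: the region in which to create the placement policy. Create the instances in a zone in this region.
    • POLICY_NAME: the name of the compact placement policy.
    • MAX_DISTANCE: the maximum distance configuration for the VMs.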
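    As an illustration of the REST request above, the following sketch assembles the URL and JSON body in Python. It only builds the data and doesn't send anything. The function name placement_policy_request and the sample values are hypothetical; the URL and field names are taken from the request shown on this page.

```python
import json


def placement_policy_request(project_id, region, policy_name, max_distance=None):
    """Build the URL and body for the beta resourcePolicies.insert request."""
    url = (
        "https://compute.googleapis.com/compute/beta/"
        f"projects/{project_id}/regions/{region}/resourcePolicies"
    )
    group_placement = {"collocation": "COLLOCATED"}
    if max_distance is not None:
        # maxDistance is a number in the request body, not a string.
        group_placement["maxDistance"] = int(max_distance)
    body = {"name": policy_name, "groupPlacementPolicy": group_placement}
    return url, body


if __name__ == "__main__":
    # Sample values are placeholders for illustration only.
    url, body = placement_policy_request(
        "my-project", "europe-west4", "my-policy", max_distance=2
    )
    print("POST", url)
    print(json.dumps(body, indent=2))
```

    Keeping the body as a dictionary until the last moment avoids hand-editing JSON and the stray commas that come with it.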

    Create VM instances in bulk

    The instructions in this section describe how to create H4D VMs in bulk.

    Review the following limitations before creating H4D instances with Cloud RDMA:

    gcloud

    To create VMs in bulk, use the gcloud compute instances bulk create command.

    The parameters that you need to specify depend on the consumption option that you are using for this deployment. Select the tab that corresponds to your consumption option's provisioning model.

    Reservation-bound

    Start with the following gcloud compute instances bulk create command.

       gcloud compute instances bulk create \
           --name-pattern=NAME_PATTERN \
           --count=COUNT \
           --machine-type=MACHINE_TYPE \
           --image-family=IMAGE_FAMILY \
           --image-project=IMAGE_PROJECT \
           --provisioning-model=RESERVATION_BOUND \
           --instance-termination-action=DELETE \
           --maintenance-policy=TERMINATE \
           --region=REGION \
           --boot-disk-type=hyperdisk-balanced \
           --boot-disk-size=DISK_SIZE
       

    Complete the following steps:

    Replace the following:

  • NAME_PATTERN: the name pattern for the instances. For example, using vm-# for the name pattern generates instances with names such as vm-1 and vm-2, up to the number specified by the --count flag.
  • COUNT: the number of instances to create.
  • MACHINE_TYPE: the machine type to use for the instances. Use one of the H4D machine types, for example h4d-highmem-192-lssd.
  • IMAGE_FAMILY: the image family of the OS image that you want to use, for example rocky-linux-9-optimized-gcp.

    For a list of supported OS images, see Supported operating system. Choose an OS image version that supports the IRDMA interface.

  • IMAGE_PROJECT: the project ID for the OS image, for example, rocky-linux-cloud.
  • REGION: specify a region in which the machine type that you want to use is available, for example europe-west4. For information about available regions, see Available regions and zones.
  • DISK_SIZE: Optional: the size of the boot disk in GiB. The value must be a whole number.
  • Optional: If you chose to use a compact placement policy, include the --resource-policies flag:

             --resource-policies=POLICY_NAME
             

    Replace POLICY_NAME with the name of the compact placement policy.

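  • To specify the reservation, add the --reservation-affinity=specific flag and the --reservation flag to the command. Set the --reservation flag to one of the following:

    • The reservation name, for example h4d-highmem-exfr-prod, if you are using a compact placement policy or if VMs can be placed anywhere in your reservation block.
    • The name of a block in the reservation, for example h4d-highmem-exfr-prod/reservationBlocks/h4d-highmem-exfr-prod-block-1, if you aren't using a compact placement policy or you want the instances placed in a specific block.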

    To view the reservation name or the available reservation blocks, see View capacity.

  • Optional: To configure the instances to use Cloud RDMA, add flags similar to the following to the command. This example configures two GVNIC network interfaces and one IRDMA network interface. Write each --network-interface value as a single comma-separated string with no spaces:

            --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-0,subnet=GVNIC_NAME_PREFIX-sub-0,stack-type=STACK_TYPE,address=EXTERNAL_IPV4_ADDRESS \
            --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-1,subnet=GVNIC_NAME_PREFIX-sub-1,no-address \
            --network-interface=nic-type=IRDMA,network=RDMA_NAME_PREFIX-irdma,subnet=RDMA_NAME_PREFIX-irdma-sub,stack-type=IPV4_ONLY,no-address
            

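    Replace the following:

    • GVNIC_NAME_PREFIX: the name prefix that you used when creating the VPC network and subnet for the GVNIC interface.
    • STACK_TYPE: the stack type for the GVNIC interface, for example IPV4_ONLY.
    • EXTERNAL_IPV4_ADDRESS: Optional: a static external IPv4 address to use with the network interface. You must have previously reserved an external IPv4 address.
    • RDMA_NAME_PREFIX: the name prefix that you used when creating the VPC network and subnet for the IRDMA interface.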

  • Optional: Add additional flags to customize the rest of the instance properties, as needed.
  • Run the command.
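    When the command is generated by a script rather than typed by hand, building it as an argument list helps avoid quoting and spacing mistakes in the comma-separated --network-interface values. The following is a minimal sketch that only prints a command line. The helper names network_interface_flag and bulk_create_argv and the sample network names are hypothetical; the flags are the ones used on this page.

```python
import shlex


def network_interface_flag(settings, no_address=False):
    """Join key=value settings into one --network-interface flag, no spaces."""
    parts = [f"{key}={value}" for key, value in settings.items()]
    if no_address:
        parts.append("no-address")
    return "--network-interface=" + ",".join(parts)


def bulk_create_argv(name_pattern, count, machine_type, region, extra_flags=()):
    """Build the argument list for gcloud compute instances bulk create."""
    return [
        "gcloud", "compute", "instances", "bulk", "create",
        f"--name-pattern={name_pattern}",
        f"--count={count}",
        f"--machine-type={machine_type}",
        f"--region={region}",
        *extra_flags,
    ]


if __name__ == "__main__":
    # Placeholder network names, for illustration only.
    rdma_nic = network_interface_flag(
        {
            "nic-type": "IRDMA",
            "network": "my-rdma-irdma",
            "subnet": "my-rdma-irdma-sub",
            "stack-type": "IPV4_ONLY",
        },
        no_address=True,
    )
    argv = bulk_create_argv(
        "vm-#", 4, "h4d-highmem-192-lssd", "europe-west4", extra_flags=[rdma_nic]
    )
    # shlex.join quotes arguments where needed so the line is safe to paste.
    print(shlex.join(argv))
```

    The sketch covers only a few of the flags. Add the remaining flags, such as the image and boot disk settings, through extra_flags in the same way.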
    Spot

    Start with the following gcloud compute instances bulk create command.

       gcloud compute instances bulk create \
           --name-pattern=NAME_PATTERN \
           --count=COUNT \
           --machine-type=MACHINE_TYPE \
           --image-family=IMAGE_FAMILY \
           --image-project=IMAGE_PROJECT \
           --region=REGION \
           --boot-disk-type=hyperdisk-balanced \
           --boot-disk-size=DISK_SIZE \
           --provisioning-model=SPOT \
           --instance-termination-action=TERMINATION_ACTION
       

    Complete the following steps:

    Replace the following:

  • NAME_PATTERN: the name pattern for the instances. For example, using vm-# for the name pattern generates instances with names such as vm-1 and vm-2, up to the number specified by the --count flag.
  • COUNT: the number of instances to create.
  • MACHINE_TYPE: the machine type to use for the instances. Use one of the H4D machine types, for example h4d-highmem-192-lssd.
  • IMAGE_FAMILY: the image family of the OS image that you want to use, for example rocky-linux-9-optimized-gcp.

    For a list of supported OS images, see Supported operating system. Choose an OS image version that supports the IRDMA interface.

  • IMAGE_PROJECT: the project ID for the OS image, for example, rocky-linux-cloud.
  • REGION: specify a region in which the machine type that you want to use is available, for example europe-west4. For information about available regions, see Available regions and zones.
  • DISK_SIZE: Optional: the size of the boot disk in GiB. The value must be a whole number.
  • TERMINATION_ACTION: the action to take when Compute Engine preempts the instance, either STOP (default) or DELETE.

  • Optional: If you chose to use a compact placement policy, then add the following flag to the command:

          --resource-policies=POLICY_NAME
          

    Replace POLICY_NAME with the name of the compact placement policy.

  • Optional: To configure the instances to use Cloud RDMA, add flags similar to the following to the command. This example configures two GVNIC network interfaces and one IRDMA network interface. Write each --network-interface value as a single comma-separated string with no spaces:

          --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-0,subnet=GVNIC_NAME_PREFIX-sub-0,stack-type=STACK_TYPE,address=EXTERNAL_IPV4_ADDRESS \
          --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-1,subnet=GVNIC_NAME_PREFIX-sub-1,no-address \
          --network-interface=nic-type=IRDMA,network=RDMA_NAME_PREFIX-irdma,subnet=RDMA_NAME_PREFIX-irdma-sub,stack-type=IPV4_ONLY,no-address
          

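    Replace the following:

    • GVNIC_NAME_PREFIX: the name prefix that you used when creating the VPC network and subnet for the GVNIC interface.
    • STACK_TYPE: the stack type for the GVNIC interface, for example IPV4_ONLY.
    • EXTERNAL_IPV4_ADDRESS: Optional: a static external IPv4 address to use with the network interface. You must have previously reserved an external IPv4 address.
    • RDMA_NAME_PREFIX: the name prefix that you used when creating the VPC network and subnet for the IRDMA interface.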

  • Optional: Add additional flags to customize the rest of the instance properties, as needed.
  • Run the command.
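    To preview the instance names that a name pattern produces, you can expand the pattern locally. The following is a simplified sketch: the function expand_name_pattern is hypothetical, numbering always starts at 1 here, and zero-padding a run of several # characters to the width of the run is an assumption of this sketch rather than something stated on this page.

```python
import re


def expand_name_pattern(pattern, count, start=1):
    """Expand a pattern such as 'vm-#' into a list of instance names.

    Assumption: a run of N '#' characters is replaced by the sequence
    number, zero-padded to N digits. The pattern must contain one run.
    """
    runs = list(re.finditer(r"#+", pattern))
    if len(runs) != 1:
        raise ValueError("pattern must contain exactly one run of '#'")
    run = runs[0]
    width = len(run.group())
    prefix, suffix = pattern[: run.start()], pattern[run.end():]
    return [
        f"{prefix}{str(number).zfill(width)}{suffix}"
        for number in range(start, start + count)
    ]


if __name__ == "__main__":
    print(expand_name_pattern("vm-#", 3))
```

    A preview like this is useful for checking that generated names don't collide with existing instances before you submit the bulk request.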
    REST

    To create VM instances in bulk, make a POST request to the instances.bulkInsert method.

    The parameters that you need to specify depend on the consumption option that you are using for this deployment. Select the tab that corresponds to your consumption option's provisioning model.

    Reservation-bound

    Start with the following POST request to the instances.bulkInsert method.

        POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances/bulkInsert
        {
          "namePattern":"NAME_PATTERN",
          "count":"COUNT",
          "instanceProperties":{
            "machineType":"MACHINE_TYPE",
            "disks":[
              {
                "boot":true,
                "initializeParams":{
                  "diskSizeGb":"DISK_SIZE",
                  "diskType":"hyperdisk-balanced",
                  "sourceImage":"projects/IMAGE_PROJECT/global/images/family/IMAGE_FAMILY"
                },
                "mode":"READ_WRITE",
                "type":"PERSISTENT"
              }
            ],
            "scheduling":{
                "provisioningModel":"RESERVATION_BOUND",
                "instanceTerminationAction":"DELETE",
                "onHostMaintenance": "TERMINATE",
                "automaticRestart":true
            }
          }
        }
        

    Complete the following steps:

    1. Replace the following:

      • PROJECT_ID: the project ID of the project where you want to create the instances.
      • ZONE: specify a zone in which the machine type that you want to use is available. If you are using a compact placement policy, then use a zone in the same region as the compact placement policy. For information about the regions where H4D machine types are available, see Available regions and zones.
      • NAME_PATTERN: the name pattern for the instances. For example, using vm-# for the name pattern generates instances with names such as vm-1 and vm-2, up to the number specified by the count field.
      • COUNT: the number of instances to create.
      • MACHINE_TYPE: the machine type to use for the instances. Use one of the H4D machine types, for example h4d-highmem-192-lssd.
      • DISK_SIZE: the size of the boot disk in GiB.
      • IMAGE_PROJECT: the project ID for the OS image, for example, rocky-linux-cloud.
      • IMAGE_FAMILY: the image family of the OS image that you want to use, for example rocky-linux-9-optimized-gcp. For a list of supported OS images, see Supported operating system. Choose an OS image version that supports the IRDMA interface.
    2. Optional: If you chose to use a compact placement policy, include the resourcePolicies parameter in the request body as part of the "instanceProperties" parameter.

                "resourcePolicies": [
                  "projects/PROJECT_ID/regions/REGION/resourcePolicies/POLICY_NAME"
                ],
                

      Replace POLICY_NAME with the name of the compact placement policy, and replace REGION with the region where the policy is located.

    3. To specify the reservation, do one of the following:

      • If you are using a placement policy or if VMs can be placed anywhere in your reservation block, then add the following to the request body as part of the "instanceProperties" parameter:

                   "reservationAffinity":{
                     "consumeReservationType":"SPECIFIC_RESERVATION",
                     "key":"compute.googleapis.com/reservation-name",
                     "values":[
                       "RESERVATION_NAME"
                     ]
                   },
                   

        Replace RESERVATION_NAME with the name of the reservation, for example, h4d-highmem-exfr-prod.

      • If you aren't using a compact placement policy or you want the instances placed in a specific block, then add the following to the request body as part of the "instanceProperties" parameter:

                    "reservationAffinity":{
                      "consumeReservationType":"SPECIFIC_RESERVATION",
                      "key":"compute.googleapis.com/reservation-name",
                      "values":[
                        "RESERVATION_BLOCK_NAME"
                      ]
                    },
                   

        Replace RESERVATION_BLOCK_NAME with the name of a block in the reservation, for example, h4d-highmem-exfr-prod/reservationBlocks/h4d-highmem-exfr-prod-block-1.

      To view the reservation name or the available reservation blocks, see View capacity.

    4. If you want to configure the instances to use Cloud RDMA, then include a parameter block similar to the following in the request body as part of the "instanceProperties" parameter. This example configures two GVNIC network interfaces and one IRDMA network interface:

                "networkInterfaces": [
                  {
                    "network": "GVNIC_NAME_PREFIX-net-0",
                    "subnetwork": "GVNIC_NAME_PREFIX-sub-0",
                    "accessConfigs": [
                      {
                        "type": "ONE_TO_ONE_NAT",
                        "name": "External IP",
                        "natIP": "EXTERNAL_IPV4_ADDRESS"
                      }
                    ],
                    "stackType": "IPV4_ONLY",
                    "nicType": "GVNIC"
                  },
                  {
                    "network": "GVNIC_NAME_PREFIX-net-1",
                    "subnetwork": "GVNIC_NAME_PREFIX-sub-1",
                    "stackType": "IPV4_ONLY",
                    "nicType": "GVNIC"
                  },
                  {
                    "network": "RDMA_NAME_PREFIX-irdma",
                    "subnetwork": "RDMA_NAME_PREFIX-irdma-sub",
                    "stackType": "IPV4_ONLY",
                    "nicType": "IRDMA"
                  }
                ],
               

      Replace the following:

      • GVNIC_NAME_PREFIX: the name prefix that you used when creating the VPC network and subnet for the GVNIC interface.

        For the GVNIC network interface, you can omit the network and subnetwork fields to use the default network instead.

      • EXTERNAL_IPV4_ADDRESS: Optional: a static external IPv4 address to use with the network interface. You must have previously reserved an external IPv4 address.
      • RDMA_NAME_PREFIX: the name prefix you used when creating the VPC network and subnet for the IRDMA interface.
    5. Optional: Customize the rest of the instance properties, as needed.
    6. Submit the request.
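    The steps above can be sketched as a small Python helper that assembles the reservation-bound request body, including the reservation affinity for either a whole reservation or a specific block. The function names reservation_affinity and reservation_bound_body are hypothetical, and nothing is sent to the API; the field names and values come from the request shown on this page.

```python
import json

RESERVATION_KEY = "compute.googleapis.com/reservation-name"


def reservation_affinity(reservation_name, block_name=None):
    """Target a whole reservation, or one block in it when block_name is set."""
    value = reservation_name
    if block_name is not None:
        value = f"{reservation_name}/reservationBlocks/{block_name}"
    return {
        "consumeReservationType": "SPECIFIC_RESERVATION",
        "key": RESERVATION_KEY,
        "values": [value],
    }


def reservation_bound_body(name_pattern, count, machine_type, image_project,
                           image_family, disk_size_gb, affinity):
    """Assemble the instances.bulkInsert body for reservation-bound instances."""
    source_image = f"projects/{image_project}/global/images/family/{image_family}"
    return {
        "namePattern": name_pattern,
        "count": str(count),
        "instanceProperties": {
            "machineType": machine_type,
            "disks": [{
                "boot": True,
                "initializeParams": {
                    "diskSizeGb": str(disk_size_gb),
                    "diskType": "hyperdisk-balanced",
                    "sourceImage": source_image,
                },
                "mode": "READ_WRITE",
                "type": "PERSISTENT",
            }],
            "scheduling": {
                "provisioningModel": "RESERVATION_BOUND",
                "instanceTerminationAction": "DELETE",
                "onHostMaintenance": "TERMINATE",
                "automaticRestart": True,
            },
            "reservationAffinity": affinity,
        },
    }


if __name__ == "__main__":
    affinity = reservation_affinity(
        "h4d-highmem-exfr-prod", "h4d-highmem-exfr-prod-block-1"
    )
    body = reservation_bound_body(
        "vm-#", 4, "h4d-highmem-192-lssd",
        "rocky-linux-cloud", "rocky-linux-9-optimized-gcp", 100, affinity,
    )
    print(json.dumps(body, indent=2))
```

    The disk size of 100 GiB and the count of 4 are placeholder values. Serializing with json.dumps guarantees that the body is valid JSON, which hand-merged fragments don't.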

    Spot

    Start with the following POST request to the instances.bulkInsert method.

        POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances/bulkInsert
        {
          "namePattern":"NAME_PATTERN",
          "count":"COUNT",
          "instanceProperties":{
            "machineType":"MACHINE_TYPE",
            "disks":[
              {
                "boot":true,
                "initializeParams":{
                  "diskSizeGb":"DISK_SIZE",
                  "diskType":"hyperdisk-balanced",
                  "sourceImage":"projects/IMAGE_PROJECT/global/images/family/IMAGE_FAMILY"
                },
                "mode":"READ_WRITE",
                "type":"PERSISTENT"
              }
            ],
            "scheduling":{
                "provisioningModel":"SPOT",
                "instanceTerminationAction":"TERMINATION_ACTION"
            }
          }
        }
        

    Complete the following steps:

    1. Replace the following:

      • PROJECT_ID: the project ID of the project where you want to create the instances.
      • ZONE: specify a zone in which the machine type that you want to use is available. If you are using a compact placement policy, then use a zone in the same region as the compact placement policy. For information about the regions where H4D machine types are available, see Available regions and zones.
      • NAME_PATTERN: the name pattern for the instances. For example, using vm-# for the name pattern generates instances with names such as vm-1 and vm-2, up to the number specified by the count field.
      • COUNT: the number of instances to create.
      • MACHINE_TYPE: the machine type to use for the instances. Use one of the H4D machine types, for example h4d-highmem-192-lssd.
      • DISK_SIZE: the size of the boot disk in GiB.
      • IMAGE_PROJECT: the project ID for the OS image, for example, rocky-linux-cloud.
      • IMAGE_FAMILY: the image family of the OS image that you want to use, for example rocky-linux-9-optimized-gcp. For a list of supported OS images, see Supported operating system. Choose an OS image version that supports the IRDMA interface.
      • TERMINATION_ACTION: the action to take when Compute Engine preempts the instance, either STOP (default) or DELETE.

    2. Optional: If you chose to use a compact placement policy, include the resourcePolicies parameter as part of the "instanceProperties" parameter.

                "resourcePolicies": [
                  "projects/PROJECT_ID/regions/REGION/resourcePolicies/POLICY_NAME"
                ]
                
    3. If you want to configure the instances to use Cloud RDMA, then include a parameter block similar to the following in the request body as part of the "instanceProperties" parameter. This example configures two GVNIC network interfaces and one IRDMA network interface:

                "networkInterfaces": [
                  {
                    "network": "GVNIC_NAME_PREFIX-net-0",
                    "subnetwork": "GVNIC_NAME_PREFIX-sub-0",
                    "accessConfigs": [
                      {
                        "type": "ONE_TO_ONE_NAT",
                        "name": "External IP",
                        "natIP": "EXTERNAL_IPV4_ADDRESS"
                      }
                    ],
                    "stackType": "IPV4_ONLY",
                    "nicType": "GVNIC"
                  },
                  {
                    "network": "GVNIC_NAME_PREFIX-net-1",
                    "subnetwork": "GVNIC_NAME_PREFIX-sub-1",
                    "stackType": "IPV4_ONLY",
                    "nicType": "GVNIC"
                  },
                  {
                    "network": "RDMA_NAME_PREFIX-irdma",
                    "subnetwork": "RDMA_NAME_PREFIX-irdma-sub",
                    "stackType": "IPV4_ONLY",
                    "nicType": "IRDMA"
                  }
                ],
               

      Replace the following:

      • GVNIC_NAME_PREFIX: the name prefix that you used when creating the VPC network and subnet for the GVNIC interface.

        For the GVNIC network interface, you can omit the network and subnetwork fields to use the default network instead.

      • EXTERNAL_IPV4_ADDRESS: Optional: a static external IPv4 address to use with the network interface. You must have previously reserved an external IPv4 address.
      • RDMA_NAME_PREFIX: the name prefix you used when creating the VPC network and subnet for the IRDMA interface.
    4. Optional: Customize the rest of the instance properties, as needed.
    5. Submit the request.
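    For the Spot variant, the scheduling block and the network interface list are the parts that change most often, so it can be convenient to generate them. The following sketch uses hypothetical helper names (spot_scheduling, gvnic_interface, rdma_interface) and placeholder prefixes; the field names and values follow the request bodies shown on this page, and 203.0.113.10 is a documentation-only address standing in for a reserved external IPv4 address.

```python
import json


def spot_scheduling(termination_action="STOP"):
    """Scheduling block for Spot VMs; the action must be STOP or DELETE."""
    if termination_action not in ("STOP", "DELETE"):
        raise ValueError("termination_action must be STOP or DELETE")
    return {
        "provisioningModel": "SPOT",
        "instanceTerminationAction": termination_action,
    }


def gvnic_interface(prefix, index, external_ipv4=None):
    """One GVNIC interface on the network and subnet named after the prefix."""
    nic = {
        "network": f"{prefix}-net-{index}",
        "subnetwork": f"{prefix}-sub-{index}",
        "stackType": "IPV4_ONLY",
        "nicType": "GVNIC",
    }
    if external_ipv4 is not None:
        nic["accessConfigs"] = [{
            "type": "ONE_TO_ONE_NAT",
            "name": "External IP",
            "natIP": external_ipv4,
        }]
    return nic


def rdma_interface(prefix):
    """The single IRDMA interface on the Falcon VPC network."""
    return {
        "network": f"{prefix}-irdma",
        "subnetwork": f"{prefix}-irdma-sub",
        "stackType": "IPV4_ONLY",
        "nicType": "IRDMA",
    }


if __name__ == "__main__":
    fragment = {
        "scheduling": spot_scheduling("DELETE"),
        "networkInterfaces": [
            gvnic_interface("my-gvnic", 0, external_ipv4="203.0.113.10"),
            gvnic_interface("my-gvnic", 1),
            rdma_interface("my-rdma"),
        ],
    }
    print(json.dumps(fragment, indent=2))
```

    Merge the resulting dictionaries into the "instanceProperties" parameter of the request body before you serialize it.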

    What's next