Scaling based on CPU utilization

The simplest form of autoscaling is to scale a managed instance group (MIG) based on the CPU utilization of its instances.

Before you begin

You can autoscale based on the average CPU utilization of a managed instance group (MIG). With this policy, the autoscaler collects the CPU utilization of the instances in the group and determines whether it needs to scale. You set the target CPU utilization, and the autoscaler works to maintain that level.

The autoscaler treats the target CPU utilization level as a fraction of the average use of all vCPUs over time in the instance group. If the average utilization of your total vCPUs exceeds the target utilization, the autoscaler adds more VM instances. If the average utilization of your total vCPUs is less than the target utilization, the autoscaler removes instances. For example, setting a 0.75 target utilization tells the autoscaler to maintain an average utilization of 75% among all vCPUs in the instance group.
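The comparison described above can be illustrated with a minimal Python sketch. It is not the autoscaler's actual algorithm: the function names and the `(vcpu_count, utilization)` input shape are hypothetical, and the sketch only shows how an average across all vCPUs relates to the target.

```python
def average_vcpu_utilization(per_vm):
    """Average utilization across all vCPUs in the group.

    per_vm is a list of (vcpu_count, utilization) pairs, where utilization
    is a fraction between 0 and 1. This input shape is an assumption made
    for illustration only.
    """
    total_vcpus = sum(vcpus for vcpus, _ in per_vm)
    busy_vcpus = sum(vcpus * util for vcpus, util in per_vm)
    return busy_vcpus / total_vcpus


def scaling_direction(average_utilization, target_utilization):
    """Return the direction the text describes for a given average."""
    if average_utilization > target_utilization:
        return "scale out"  # average exceeds the target: add VMs
    if average_utilization < target_utilization:
        return "scale in"   # average is below the target: remove VMs
    return "hold"


# Two 4-vCPU VMs running at 90% and 70%, with a 0.75 target.
group = [(4, 0.9), (4, 0.7)]
average = average_vcpu_utilization(group)
print(scaling_direction(average, 0.75))
```

With these sample values the average is roughly 80%, which is above the 0.75 target, so the sketch reports that the group should scale out.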

You can also scale based on forecasted CPU utilization. For more information, and to see if this is suitable for your workload, see Scaling based on predictions.

Enable autoscaling based on CPU utilization

To enable autoscaling based on CPU utilization, use one of the following options. If you want to configure a stabilization period to control the pace of scaling in, then you must use either Google Cloud CLI or REST.

Permissions required for this task

To perform this task, you must have the following permissions:

- compute.autoscalers.create on the project
- compute.instanceGroupManagers.use on the project

Console

In the console, go to the Instance groups page.

Go to Instance groups
If you have an instance group, click the name of the instance group, and then click Edit. On the edit instance group page, do the following:
1. Click Group size & autoscaling to expand the section.
2. Click Configure autoscaling.
If you don't have an instance group, click Create instance group and do the following:
1. In the Name field, specify a name for the group.
2. In the Instance template list, select a template.
3. In the Location section, depending on whether you're creating a zonal or regional MIG, choose an option as follows:
  - For a zonal MIG, select Single zone, and then select a region and a zone.
  - For a regional MIG, select Multiple zones, and then select a region and zones.
In the Autoscaling section, a CPU utilization autoscaling signal is added by default. You can either use the default values for the signal or do the following:
1. Specify the minimum and the maximum numbers of instances that you want the autoscaler to create in this group.
2. To edit the target CPU utilization, click the CPU utilization signal to expand the section and specify the percentage.
  1. Under Predictive autoscaling, select Off. To learn more about predictive autoscaling, and whether it is suitable for your workload, see Scaling based on predictions.
3. Click Done.
You can use the Initialization period to tell the autoscaler how long it takes for your application to initialize. Specifying an accurate initialization period improves autoscaler decisions. For example, when scaling out, the autoscaler ignores data from VMs that are still initializing because those VMs might not yet represent normal usage of your application. The default initialization period is 60 seconds.
Click Save.

gcloud

Use the set-autoscaling sub-command to enable autoscaling for a managed instance group. For example, the following command creates an autoscaler that has a target CPU utilization of 60%. When you create an autoscaler, the --max-num-replicas parameter is required in addition to the --target-cpu-utilization parameter:

gcloud compute instance-groups managed set-autoscaling example-managed-instance-group \
    --max-num-replicas 20 \
    --target-cpu-utilization 0.60 \
    --cool-down-period 90 \
    --stabilization-period 240

Optionally, set the following flags to control autoscaling:

- Use the --cool-down-period flag to set the initialization period, which tells the autoscaler how long it takes for your application to initialize. Specifying an accurate initialization period improves autoscaler decisions. For example, when scaling out, the autoscaler ignores data from VMs that are still initializing because those VMs might not yet represent normal usage of your application. The default initialization period is 60 seconds.
- Use the --stabilization-period flag to set the stabilization period, which determines the duration for your MIG to scale in. A shorter stabilization period results in quicker VM deletion when scaling in. The value must be between `0` seconds and `3600` seconds. The default value is `600` seconds. For more information, see Best practices for stabilization period.
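To make the effect of the initialization period concrete, here is a minimal Python sketch of ignoring VMs that are still initializing when averaging CPU utilization for a scale-out decision. The `VM` class, the `age_sec` field, and the rule "a VM is initializing while its age is below the initialization period" are assumptions for illustration; the source does not describe how the autoscaler tracks this internally.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    age_sec: float          # seconds since the VM was created (assumed input)
    cpu_utilization: float  # fraction between 0 and 1


def utilization_for_scale_out(vms, initialization_period_sec=60):
    """Average utilization over VMs that have finished initializing.

    Returns None when every VM is still inside the initialization period.
    The 60-second default mirrors the default stated in the text.
    """
    ready = [vm for vm in vms if vm.age_sec >= initialization_period_sec]
    if not ready:
        return None
    return sum(vm.cpu_utilization for vm in ready) / len(ready)


vms = [
    VM("vm-a", age_sec=300, cpu_utilization=0.8),
    VM("vm-b", age_sec=300, cpu_utilization=0.6),
    VM("vm-c", age_sec=30, cpu_utilization=0.05),  # still starting up
]
print(utilization_for_scale_out(vms, initialization_period_sec=90))
```

In this sample, the newly created `vm-c` reports very low utilization because the application has not started serving yet. Excluding it keeps the average near 70% instead of pulling it down toward 48%, which is the kind of distortion an accurate initialization period is meant to avoid.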

Optionally, you can enable predictive autoscaling to scale out ahead of predicted load. To learn whether predictive autoscaling is suitable for your workload, see Scaling based on predictions.

You can verify that autoscaling is successfully enabled by using the instance-groups managed describe sub-command, which describes the corresponding managed instance group and provides information about any autoscaling features for that instance group:

gcloud compute instance-groups managed describe example-managed-instance-group

For a list of available gcloud commands and flags, see the gcloud reference.

REST

To create an autoscaler, use the autoscalers.insert method for a zonal MIG or the regionAutoscalers.insert method for a regional MIG.

The following example creates an autoscaler for a zonal MIG:

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/autoscalers

Your request body must contain the name, target, and autoscalingPolicy fields. autoscalingPolicy must define cpuUtilization and maxNumReplicas.

Optionally, set the following fields to control autoscaling:

- Use the coolDownPeriodSec field to set the initialization period, which tells the autoscaler how long it takes for your application to initialize. Specifying an accurate initialization period improves autoscaler decisions. For example, when scaling out, the autoscaler ignores data from VMs that are still initializing because those VMs might not yet represent normal usage of your application. The default initialization period is 60 seconds.
- Use the stabilizationPeriodSec field to set the stabilization period, which determines the duration for your MIG to scale in. A shorter stabilization period results in quicker VM deletion when scaling in. The value must be between `0` seconds and `3600` seconds. The default value is `600` seconds. For more information, see Best practices for stabilization period.

Optionally, you can enable predictive autoscaling to scale out ahead of predicted load. To learn whether predictive autoscaling is suitable for your workload, see Scaling based on predictions.

{
  "name": "example-autoscaler",
  "target": "https://www.googleapis.com/compute/v1/projects/myproject/zones/us-central1-f/instanceGroupManagers/example-managed-instance-group",
  "autoscalingPolicy": {
    "maxNumReplicas": 10,
    "cpuUtilization": {
      "utilizationTarget": 0.6
    },
    "coolDownPeriodSec": 90,
    "stabilizationPeriodSec": 240
  }
}
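As a rough illustration, the request URL and body shown above could be assembled programmatically as follows. This is a sketch using only the Python standard library: the helper name `build_autoscaler_request` and its parameters are hypothetical, and the code only builds the request; it does not send it or handle authentication.

```python
import json

API_ROOT = "https://compute.googleapis.com/compute/v1"
# Prefix used for the "target" URL in the example request body above.
TARGET_ROOT = "https://www.googleapis.com/compute/v1"


def build_autoscaler_request(project, zone, mig_name, autoscaler_name,
                             max_replicas, cpu_target,
                             cool_down_sec=60, stabilization_sec=600):
    """Return (url, body) for a zonal autoscalers.insert request.

    Hypothetical helper for illustration. The defaults mirror the default
    initialization and stabilization periods stated in the text.
    """
    # The text states that the stabilization period must be 0 to 3600 seconds.
    if not 0 <= stabilization_sec <= 3600:
        raise ValueError("stabilization_sec must be between 0 and 3600")

    url = f"{API_ROOT}/projects/{project}/zones/{zone}/autoscalers"
    body = {
        "name": autoscaler_name,
        "target": (f"{TARGET_ROOT}/projects/{project}/zones/{zone}"
                   f"/instanceGroupManagers/{mig_name}"),
        "autoscalingPolicy": {
            "maxNumReplicas": max_replicas,
            "cpuUtilization": {"utilizationTarget": cpu_target},
            "coolDownPeriodSec": cool_down_sec,
            "stabilizationPeriodSec": stabilization_sec,
        },
    }
    return url, body


url, body = build_autoscaler_request(
    "myproject", "us-central1-f", "example-managed-instance-group",
    "example-autoscaler", max_replicas=10, cpu_target=0.6,
    cool_down_sec=90, stabilization_sec=240)
print("POST", url)
print(json.dumps(body, indent=2))
```

Validating the stabilization period locally surfaces an out-of-range value before any request is made. For a regional MIG, the URL would instead need to follow the regionAutoscalers.insert method mentioned earlier.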

For more information about enabling autoscaling based on CPU utilization, complete the tutorial, Using autoscaling for highly scalable apps.

How autoscaler handles heavy CPU utilization

During periods of heavy CPU utilization, if utilization is close to 100%, the autoscaler estimates that the group might already be heavily overloaded. In these cases, the autoscaler increases the number of virtual machines by 50% at most.
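The 50% limit can be illustrated with a small Python sketch. The function name is hypothetical, and the rounding behavior (rounding the allowed growth down) is an assumption; the text states only that the increase is 50% at most.

```python
import math


def cap_scale_out(current_size, desired_size, max_growth=0.5):
    """Limit a scale-out so the group grows by at most max_growth.

    Illustrative only: rounding the allowed growth down is an assumption,
    not documented autoscaler behavior.
    """
    allowed_increase = math.floor(current_size * max_growth)
    return min(desired_size, current_size + allowed_increase)


# A group of 10 VMs that appears to need 30 grows to at most 15.
print(cap_scale_out(10, 30))
# A smaller request that is already within the limit is unchanged.
print(cap_scale_out(10, 12))
```

Under this model, a heavily overloaded group reaches a much larger size through several successive increases rather than a single jump.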

What's next

Learn how to enable predictive autoscaling.
Learn about managing autoscalers.
Learn how autoscalers make decisions.
Learn how to use multiple autoscaling signals to scale your group.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-07-17 UTC.
