gcloud compute backend-services add-backend BACKEND_SERVICE_NAME ([--instance-group=INSTANCE_GROUP : --instance-group-region=INSTANCE_GROUP_REGION | --instance-group-zone=INSTANCE_GROUP_ZONE] | [--network-endpoint-group=NETWORK_ENDPOINT_GROUP : --global-network-endpoint-group | --network-endpoint-group-region=NETWORK_ENDPOINT_GROUP_REGION | --network-endpoint-group-zone=NETWORK_ENDPOINT_GROUP_ZONE]) [--balancing-mode=BALANCING_MODE] [--capacity-scaler=CAPACITY_SCALER] [--description=DESCRIPTION] [--failover] [--max-utilization=MAX_UTILIZATION] [--preference=PREFERENCE] [--traffic-duration=TRAFFIC_DURATION] [--custom-metrics=[CUSTOM_METRICS,…] | --custom-metrics-file=[CUSTOM_METRICS,…]] [--global | --region=REGION] [--max-connections=MAX_CONNECTIONS | --max-connections-per-endpoint=MAX_CONNECTIONS_PER_ENDPOINT | --max-connections-per-instance=MAX_CONNECTIONS_PER_INSTANCE | --max-in-flight-requests=MAX_IN_FLIGHT_REQUESTS | --max-in-flight-requests-per-endpoint=MAX_IN_FLIGHT_REQUESTS_PER_ENDPOINT | --max-in-flight-requests-per-instance=MAX_IN_FLIGHT_REQUESTS_PER_INSTANCE | --max-rate=MAX_RATE | --max-rate-per-endpoint=MAX_RATE_PER_ENDPOINT | --max-rate-per-instance=MAX_RATE_PER_INSTANCE] [GCLOUD_WIDE_FLAG …]
gcloud compute backend-services add-backend adds a backend to a
Google Cloud load balancer or Traffic Director. Depending on the load balancing
scheme of the backend service, backends can be instance groups (managed or
unmanaged), zonal network endpoint groups (zonal NEGs), serverless NEGs, or an
internet NEG. For more information, see the backend
services overview.
For most load balancers, you can define how Google Cloud measures capacity by selecting a balancing mode. For more information, see traffic distribution.
To modify a backend, use the gcloud compute backend-services update-backend or
gcloud compute backend-services edit command.
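As a minimal sketch of how the synopsis fits together, the snippet below assembles an invocation that adds a zonal instance group to a global backend service. The resource names (my-backend-service, my-instance-group, us-central1-a) are hypothetical placeholders, and the script only prints the command so it can be reviewed before anything is changed.

```shell
# Hypothetical resource names; replace them with your own.
BACKEND_SERVICE="my-backend-service"
INSTANCE_GROUP="my-instance-group"
ZONE="us-central1-a"

# Print the command instead of running it, so nothing changes in the
# project until you run the printed line yourself.
add_backend_cmd() {
  printf 'gcloud compute backend-services add-backend %s --instance-group=%s --instance-group-zone=%s --global\n' \
    "$BACKEND_SERVICE" "$INSTANCE_GROUP" "$ZONE"
}

add_backend_cmd
```

For a regional backend service, --region=REGION would take the place of --global.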
BACKEND_SERVICE_NAME
--instance-group=INSTANCE_GROUP
--instance-group-region=INSTANCE_GROUP_REGION
To avoid prompting when this flag is omitted, you can set the compute/region property:
    gcloud config set compute/region REGION
A list of regions can be fetched by running:
    gcloud compute regions list
To unset the property, run:
    gcloud config unset compute/region
Alternatively, the region can be stored in the environment variable CLOUDSDK_COMPUTE_REGION.
--instance-group-zone=INSTANCE_GROUP_ZONE
If not specified and the compute/zone property isn't set, you might be prompted
to select a zone (interactive mode only).
To avoid prompting when this flag is omitted, you can set the compute/zone property:
    gcloud config set compute/zone ZONE
A list of zones can be fetched by running:
    gcloud compute zones list
To unset the property, run:
    gcloud config unset compute/zone
Alternatively, the zone can be stored in the environment variable CLOUDSDK_COMPUTE_ZONE.
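The CLOUDSDK_COMPUTE_ZONE environment variable can also be scoped to a single invocation instead of being stored as a property. The sketch below illustrates that scoping with a stand-in child process rather than a real gcloud call, so it runs without the Cloud SDK installed. The zone name is a hypothetical placeholder.

```shell
# Start from a known state for the demonstration.
unset CLOUDSDK_COMPUTE_ZONE

# Hypothetical zone; pick a real one from `gcloud compute zones list`.
ZONE="us-central1-a"

# A prefix assignment is visible only to that one child process. The child
# here is a stand-in for a gcloud invocation and just reports what it sees.
scoped=$(CLOUDSDK_COMPUTE_ZONE="$ZONE" sh -c 'printf "%s\n" "${CLOUDSDK_COMPUTE_ZONE:-unset}"')

# A later child process does not inherit the value.
after=$(sh -c 'printf "%s\n" "${CLOUDSDK_COMPUTE_ZONE:-unset}"')

printf 'during: %s\nafter:  %s\n' "$scoped" "$after"
```

Using `export CLOUDSDK_COMPUTE_ZONE=...` instead would keep the value for the rest of the shell session.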
Network Endpoint Group
--network-endpoint-group=NETWORK_ENDPOINT_GROUP
--global-network-endpoint-group
--network-endpoint-group-region=NETWORK_ENDPOINT_GROUP_REGION
To avoid prompting when this flag is omitted, you can set the compute/region property:
    gcloud config set compute/region REGION
A list of regions can be fetched by running:
    gcloud compute regions list
To unset the property, run:
    gcloud config unset compute/region
Alternatively, the region can be stored in the environment variable CLOUDSDK_COMPUTE_REGION.
--network-endpoint-group-zone=NETWORK_ENDPOINT_GROUP_ZONE
If not specified and the compute/zone property isn't set, you might be prompted
to select a zone (interactive mode only).
To avoid prompting when this flag is omitted, you can set the compute/zone property:
    gcloud config set compute/zone ZONE
A list of zones can be fetched by running:
    gcloud compute zones list
To unset the property, run:
    gcloud config unset compute/zone
Alternatively, the zone can be stored in the environment variable CLOUDSDK_COMPUTE_ZONE.
--balancing-mode=BALANCING_MODE
This cannot be used when the endpoint type of an attached network endpoint group is INTERNET_IP_PORT, INTERNET_FQDN_PORT, or SERVERLESS.
BALANCING_MODE must be one of:
CONNECTION
Available if the backend service's load balancing scheme is either INTERNAL or
EXTERNAL. Available if the backend service's protocol is one of SSL, TCP, or
UDP.
Spreads load based on how many concurrent connections the backend can handle.
For backend services with --load-balancing-scheme EXTERNAL, you
must specify exactly one of these additional parameters:
--max-connections, --max-connections-per-instance, or
--max-connections-per-endpoint.
For backend services where --load-balancing-scheme is
INTERNAL, you must omit all of these parameters.
CUSTOM_METRICS
IN_FLIGHT
Available if the backend service's load balancing scheme is one of
INTERNAL_MANAGED, INTERNAL_SELF_MANAGED, or EXTERNAL_MANAGED. Available if the
backend service's protocol is one of HTTP, HTTPS, or HTTP/2.
Spreads load based on how many in-flight requests the backend can handle.
You must specify exactly one of these additional parameters:
--max-in-flight-requests, --max-in-flight-requests-per-instance, or
--max-in-flight-requests-per-endpoint. You must also set
--traffic-duration=LONG.
RATE
Available if the backend service's load balancing scheme is one of
INTERNAL_MANAGED, INTERNAL_SELF_MANAGED, or EXTERNAL. Available if the backend
service's protocol is one of HTTP, HTTPS, or HTTP/2.
Spreads load based on how many HTTP requests per second (RPS) the backend can handle.
You must specify exactly one of these additional parameters: --max-rate,
--max-rate-per-instance, or --max-rate-per-endpoint.
UTILIZATION
Available if the backend service's load balancing scheme is one of
INTERNAL_MANAGED, INTERNAL_SELF_MANAGED, or EXTERNAL. Available only for
managed or unmanaged instance group backends.
Spreads load based on the backend utilization of instances in a backend instance group.
The following additional parameters may be specified: --max-utilization,
--max-rate, --max-rate-per-instance, --max-connections,
--max-connections-per-instance. For valid combinations, see
--max-utilization.
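The pairing between a balancing mode and its target-capacity flag can be sketched as a small helper. This is an illustration under stated assumptions, not part of gcloud: the function names are made up, the resource names are hypothetical, and for each mode it picks just one of the flags the list above permits for instance group backends. It prints commands rather than running them.

```shell
# Map a balancing mode to one target-capacity flag allowed by the list
# above. Other permitted flags (for example the per-endpoint variants used
# with NEGs) are equally valid; this helper simply picks one per mode.
capacity_flag() {
  case "$1" in
    CONNECTION)  printf '%s\n' '--max-connections-per-instance' ;;
    IN_FLIGHT)   printf '%s\n' '--max-in-flight-requests-per-instance' ;;
    RATE)        printf '%s\n' '--max-rate-per-instance' ;;
    UTILIZATION) printf '%s\n' '--max-utilization' ;;
    *)           printf 'unsupported mode: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Print (not run) an add-backend command for the given mode and target.
# Note: for CONNECTION with an INTERNAL scheme, the text above says the
# max-connections flags must be omitted; this sketch does not cover that.
print_add_backend() {
  mode=$1
  value=$2
  flag=$(capacity_flag "$mode") || return 1
  extra=""
  # IN_FLIGHT additionally requires --traffic-duration=LONG.
  if [ "$mode" = "IN_FLIGHT" ]; then
    extra=" --traffic-duration=LONG"
  fi
  printf 'gcloud compute backend-services add-backend my-backend-service --instance-group=my-instance-group --instance-group-zone=us-central1-a --global --balancing-mode=%s %s=%s%s\n' \
    "$mode" "$flag" "$value" "$extra"
}

print_add_backend RATE 100
print_add_backend UTILIZATION 0.8
print_add_backend IN_FLIGHT 50
```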
--capacity-scaler=CAPACITY_SCALER
--description=DESCRIPTION
--failover
--max-utilization=MAX_UTILIZATION
Valid values are 0.0 (0%) through 1.0 (100%). This is an optional parameter for
the UTILIZATION balancing mode.
You can use this parameter with other parameters for defining target capacity.
For usage guidelines, see Balancing
mode combinations.
--preference=PREFERENCE
PREFERENCE must be one of:
DEFAULT
PREFERRED
--traffic-duration=TRAFFIC_DURATION
TRAFFIC_DURATION must be one of:
LONG
SHORT
TRAFFIC_DURATION_UNSPECIFIED
--custom-metrics=[CUSTOM_METRICS,…]
Example:
    gcloud compute backend-services add-backend --custom-metrics='name=my-signal,maxUtilization=0.8,dryRun=true'
    gcloud compute backend-services add-backend --custom-metrics='name=my-signal,maxUtilization=0.8,dryRun=true' --custom-metrics='name=my-signal2,maxUtilization=0.2'
    gcloud compute backend-services add-backend --custom-metrics='[{"name" : "my-signal", "maxUtilization" : 0.8, "dryRun" : true}, {"name" : "my-signal2", "maxUtilization" : 0.1}]'
Sets custom_metrics value.
dryRun
Sets dryRun value.
maxUtilization
Sets maxUtilization value.
name
Sets name value.
Shorthand Example:
--custom-metrics=dryRun=boolean,maxUtilization=float,name=string --custom-metrics=dryRun=boolean,maxUtilization=float,name=string
JSON Example:
--custom-metrics='[{"dryRun": boolean, "maxUtilization": float, "name": "string"}]'
File Example:
--custom-metrics=path_to_file.(yaml|json)
--custom-metrics-file=[CUSTOM_METRICS,…]
Example:
gcloud compute backend-services add-backend --custom-metrics-file='customMetric.json'
Sets custom_metrics_file value.
dryRun
Sets dryRun value.
maxUtilization
Sets maxUtilization value.
name
Sets name value.
Shorthand Example:
--custom-metrics-file=dryRun=boolean,maxUtilization=float,name=string --custom-metrics-file=dryRun=boolean,maxUtilization=float,name=string
JSON Example:
--custom-metrics-file='[{"dryRun": boolean, "maxUtilization": float, "name": "string"}]'
File Example:
--custom-metrics-file=path_to_file.(yaml|json)
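A rough sketch of preparing a file for --custom-metrics-file follows. It assumes the file holds the same JSON array shown in the inline --custom-metrics example; that shape should be checked against your gcloud version. The metric names come from the examples above, the resource names are hypothetical, and the command is printed rather than run.

```shell
# Write the metrics definition to a temporary file. The array mirrors the
# inline JSON example; the exact file schema is an assumption.
metrics_file=$(mktemp)
cat > "$metrics_file" <<'EOF'
[
  {"name": "my-signal", "maxUtilization": 0.8, "dryRun": true},
  {"name": "my-signal2", "maxUtilization": 0.1}
]
EOF

# Print the command for review. my-backend-service, my-instance-group and
# us-central1-a are hypothetical placeholders.
printf 'gcloud compute backend-services add-backend my-backend-service --instance-group=my-instance-group --instance-group-zone=us-central1-a --global --custom-metrics-file=%s\n' \
  "$metrics_file"
```

Keeping the definitions in a file avoids shell-quoting the JSON on the command line and lets the same metrics be reused across backends.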
--global
--region=REGION
Overrides the default compute/region property value for this command invocation.
--max-connections=MAX_CONNECTIONS
--max-connections-per-endpoint=MAX_CONNECTIONS_PER_ENDPOINT
The effective maximum per healthy endpoint is calculated by multiplying
MAX_CONNECTIONS_PER_ENDPOINT by the number of endpoints in the
network endpoint group, and then dividing by the number of healthy endpoints.
This cannot be used when the endpoint type of an attached network endpoint group
is INTERNET_IP_PORT, INTERNET_FQDN_PORT, or SERVERLESS.
--max-connections-per-instance=MAX_CONNECTIONS_PER_INSTANCE
The effective maximum per healthy instance is calculated by multiplying
MAX_CONNECTIONS_PER_INSTANCE by the number of instances in the
instance group, and then dividing by the number of healthy instances.
--max-in-flight-requests=MAX_IN_FLIGHT_REQUESTS
--max-in-flight-requests-per-endpoint=MAX_IN_FLIGHT_REQUESTS_PER_ENDPOINT
--max-in-flight-requests-per-instance=MAX_IN_FLIGHT_REQUESTS_PER_INSTANCE
--max-rate=MAX_RATE
--max-rate-per-endpoint=MAX_RATE_PER_ENDPOINT
The effective maximum rate per healthy endpoint is calculated by multiplying
MAX_RATE_PER_ENDPOINT by the number of endpoints in the network endpoint group,
and then dividing by the number of healthy endpoints.
This cannot be used when the endpoint type of an attached network endpoint group
is INTERNET_IP_PORT, INTERNET_FQDN_PORT, or SERVERLESS.
--max-rate-per-instance=MAX_RATE_PER_INSTANCE
The effective maximum rate per healthy instance is calculated by multiplying
MAX_RATE_PER_INSTANCE by the number of instances in the instance
group, and then dividing by the number of healthy instances. This parameter is
compatible with managed instance group backends that use autoscaling based on
load balancing.
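The per-instance and per-endpoint descriptions above share one piece of arithmetic: take the configured per-member target, multiply by the group size, then divide by the number of healthy members. A minimal worked sketch, assuming a hypothetical group of 4 instances with --max-rate-per-instance=100, is shown below. The function name is made up for illustration, and integer division is a simplification, since this page does not state how the load balancer rounds.

```shell
# effective_target PER_MEMBER TOTAL HEALTHY
# Multiplies the per-instance (or per-endpoint) target by the group size
# and divides by the number of healthy members. Integer division truncates.
effective_target() {
  per_member=$1
  total=$2
  healthy=$3
  if [ "$healthy" -le 0 ]; then
    printf 'no healthy members\n' >&2
    return 1
  fi
  printf '%s\n' $(( per_member * total / healthy ))
}

# Hypothetical group: 4 instances, --max-rate-per-instance=100.
effective_target 100 4 4   # all healthy: prints 100
effective_target 100 4 3   # one instance unhealthy: prints 133
```

The second call shows that each remaining healthy instance is expected to absorb a larger share when a member drops out.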
These flags are available to all commands: --access-token-file, --account,
--billing-project, --configuration, --flags-file, --flatten, --format, --help,
--impersonate-service-account, --log-http, --project, --quiet, --trace-token,
--user-output-enabled, --verbosity.
Run $ gcloud help for details.
These variants are also available:
    gcloud alpha compute backend-services add-backend
    gcloud beta compute backend-services add-backend
    gcloud preview compute backend-services add-backend
Last updated 2026-05-27 UTC.