The Application Load Balancer is a proxy-based Layer 7 load balancer that lets you run and scale your services. The Application Load Balancer distributes HTTP and HTTPS traffic to backends hosted on a variety of Google Cloud platforms—such as Compute Engine, Google Kubernetes Engine (GKE), Cloud Storage, and Cloud Run—as well as external backends connected over the internet or by using hybrid connectivity.
Application Load Balancers are available in the following modes of deployment:
External Application Load Balancer: Load balances traffic coming from clients on the internet. For architecture details, see external Application Load Balancer architecture.
| Deployment mode | Network service tier | Load balancing scheme | IP address | Frontend ports |
|---|---|---|---|---|
| Global external | Premium Tier | EXTERNAL_MANAGED | IPv4, IPv6 | Can reference exactly one port from 1-65535. |
| Regional external | Premium or Standard Tier | EXTERNAL_MANAGED | IPv4, IPv6 (Preview) | Can reference exactly one port from 1-65535. |
| Classic | Global in Premium Tier, regional in Standard Tier | EXTERNAL* | IPv4, IPv6 (requires Premium Tier) | Can reference exactly one port from 1-65535. |
* EXTERNAL_MANAGED backend services can be attached to EXTERNAL forwarding rules. However, EXTERNAL backend services cannot be attached to EXTERNAL_MANAGED forwarding rules. To take advantage of new features available only with the global external Application Load Balancer, we recommend that you migrate your existing EXTERNAL resources to EXTERNAL_MANAGED by using the migration process described at Migrate resources from classic to global external Application Load Balancer.
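The attachment rule described above can be expressed as a small check. The following is a minimal sketch, not part of any Google Cloud API: the function name `can_attach` is hypothetical, and the sketch assumes that a backend service and a forwarding rule with the same scheme can always be paired.

```python
# Load balancing schemes used by external Application Load Balancers.
EXTERNAL = "EXTERNAL"
EXTERNAL_MANAGED = "EXTERNAL_MANAGED"


def can_attach(backend_service_scheme: str, forwarding_rule_scheme: str) -> bool:
    """Return True if the backend service can sit behind the forwarding rule.

    Hypothetical helper for illustration only.
    """
    known = {EXTERNAL, EXTERNAL_MANAGED}
    if backend_service_scheme not in known or forwarding_rule_scheme not in known:
        raise ValueError("unknown load balancing scheme")
    # Assumption: identical schemes are always compatible.
    if backend_service_scheme == forwarding_rule_scheme:
        return True
    # The only allowed mixed pairing: an EXTERNAL_MANAGED backend service
    # attached to an EXTERNAL forwarding rule.
    return (
        backend_service_scheme == EXTERNAL_MANAGED
        and forwarding_rule_scheme == EXTERNAL
    )
```

For example, `can_attach("EXTERNAL_MANAGED", "EXTERNAL")` is allowed, while `can_attach("EXTERNAL", "EXTERNAL_MANAGED")` is not.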
Internal Application Load Balancer: Load balances traffic within your VPC network or networks connected to your VPC network. For architecture details, see internal Application Load Balancer architecture.
| Deployment mode | Network service tier | Load balancing scheme | IP address | Frontend ports |
|---|---|---|---|---|
| Regional internal | Premium Tier | INTERNAL_MANAGED | IPv4, IPv6 (Preview) | Can reference exactly one port from 1-65535. |
| Cross-region internal* | Premium Tier | INTERNAL_MANAGED | IPv4, IPv6 (Preview) | Can reference exactly one port from 1-65535. |
* The load balancer uses global resources and can be deployed in one or multiple Google Cloud regions that you choose.
In a load balancing scheme whose name ends in _MANAGED, requests are routed either to the GFE or to the Envoy proxy.
External Application Load Balancers are implemented using Google Front Ends (GFEs) or managed proxies. Global external Application Load Balancers and classic Application Load Balancers use GFEs that are distributed globally, operating together by using Google's global network and control plane. GFEs offer multi-region load balancing in the Premium Tier, directing traffic to the closest healthy backend that has capacity and terminating HTTP(S) traffic as close as possible to your users. Global external Application Load Balancers and regional external Application Load Balancers use the open-source Envoy proxy software to enable advanced traffic management capabilities.
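To make "the closest healthy backend that has capacity" concrete, the following sketch filters out unhealthy and full backends and then picks the nearest remaining one. It is an illustration only and does not describe how GFEs are implemented. The `Backend` class, the `latency_ms` field (used here as a stand-in for closeness), and the in-flight request limit are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Backend:
    region: str
    latency_ms: float    # assumed proxy for distance from the client
    healthy: bool
    in_flight: int       # requests currently being served
    max_in_flight: int   # assumed capacity limit

    def has_capacity(self) -> bool:
        return self.in_flight < self.max_in_flight


def pick_backend(backends: List[Backend]) -> Optional[Backend]:
    """Return the closest backend that is healthy and has spare capacity."""
    candidates = [b for b in backends if b.healthy and b.has_capacity()]
    if not candidates:
        return None  # nothing can serve the request
    return min(candidates, key=lambda b: b.latency_ms)
```

In this sketch, a nearby backend that is unhealthy or already at its limit is skipped in favor of a more distant one that can accept the request.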
These load balancers can be deployed in one of the following modes: global, regional, or classic.
External Application Load Balancers support the following capabilities:
The following diagram shows a sample external Application Load Balancer architecture.
For a complete overview, see Architecture overview for External Application Load Balancers.
The internal Application Load Balancers are Envoy proxy-based regional Layer 7 load balancers that enable you to run and scale your HTTP application traffic behind an internal IP address. Internal Application Load Balancers support backends in one region, but can be configured to be globally accessible by clients from any Google Cloud region.
The load balancer distributes traffic to backends hosted on Google Cloud, on-premises, or in other cloud environments. Internal Application Load Balancers also support the following features:
For a complete overview, see Architecture overview for internal Application Load Balancers.
For more information, see Use regional network firewall policies to protect internal Application Load Balancers and internal proxy Network Load Balancers.
The following sections depict some common use cases for Application Load Balancers.
You can deploy a combination of Application Load Balancers and Network Load Balancers to support conventional three-tier web services. The following example shows how you can deploy each tier, depending on your traffic type:
If you enable global access for your regional internal Application Load Balancer, your web-tier client VMs can be in another region.
This multitiered application example shows the following:
An internal load-balanced database tier in the us-east1 region that is accessed by the global web tier.
A web-tier client VM in the europe-west1 region that accesses the internal load-balanced database tier located in us-east1.
Some workloads with regulatory or compliance requirements require that network configurations and traffic termination reside in a specific region. For these workloads, a regional external Application Load Balancer is often the preferred option to provide the jurisdictional controls these workloads require.
The Application Load Balancers support advanced traffic management features that give you fine-grained control over how your traffic is handled. These capabilities include the following:
Following is an example of path-based routing implemented by using an internal Application Load Balancer. Each path is handled by a different backend.
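The idea of sending each path to a different backend can be sketched as a longest-prefix lookup. This is a simplified illustration and not the matching logic of a real URL map. The function `route`, the rule format, and the backend names in the usage note are hypothetical.

```python
from typing import Dict


def route(path: str, path_rules: Dict[str, str], default_backend: str) -> str:
    """Return the backend for a request path using longest-prefix matching.

    A prefix matches only on whole path segments, so "/video" matches
    "/video" and "/video/clip.mp4" but not "/videos".
    """
    best_prefix = ""
    best_backend = default_backend
    for prefix, backend in path_rules.items():
        normalized = prefix.rstrip("/")
        matches = path == normalized or path.startswith(normalized + "/")
        if matches and len(normalized) > len(best_prefix):
            best_prefix, best_backend = normalized, backend
    return best_backend
```

With rules such as `{"/video": "video-backend", "/images": "images-backend"}`, a request for `/video/clip.mp4` goes to `video-backend`, and a request that matches no rule goes to the default backend.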
For more details, see the following:
The integration with Service Extensions lets you inject custom logic into the load balancing path of supported Application Load Balancers.
For more information, see Service Extensions overview.
Migrating an existing service to Google Cloud lets you free up on-premises capacity and reduce the cost and burden of maintaining an on-premises infrastructure. You can temporarily set up a hybrid deployment that lets you route traffic to both your current on-premises service and a corresponding Google Cloud service endpoint.
The following diagram demonstrates this setup with an internal Application Load Balancer. If you are using an internal load balancer, you can configure the Google Cloud load balancer to use weight-based traffic splitting to split traffic across the two services. Traffic splitting lets you start by sending 0% of the traffic to the Google Cloud service and 100% to the on-premises service. You can then gradually increase the proportion of traffic sent to the Google Cloud service. Eventually, you send 100% of the traffic to the Google Cloud service, and you can retire the on-premises service.
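The gradual shift described above can be simulated with a weighted random choice per request. This is a minimal sketch of weight-based splitting in general, not the load balancer's implementation. The service names, the intermediate weights in `MIGRATION_STAGES`, and the helper functions are assumptions chosen for illustration.

```python
import random
from collections import Counter
from typing import Dict


def pick_service(weights: Dict[str, int], rng: random.Random) -> str:
    """Pick one service with probability proportional to its weight."""
    services = list(weights)
    return rng.choices(services, weights=[weights[s] for s in services], k=1)[0]


def simulate(weights: Dict[str, int], requests: int = 10_000, seed: int = 0) -> Counter:
    """Count how many simulated requests each service receives."""
    rng = random.Random(seed)
    return Counter(pick_service(weights, rng) for _ in range(requests))


# Hypothetical migration plan: start at 0% on Google Cloud, end at 100%.
MIGRATION_STAGES = [
    {"on-prem": 100, "google-cloud": 0},
    {"on-prem": 90, "google-cloud": 10},
    {"on-prem": 50, "google-cloud": 50},
    {"on-prem": 0, "google-cloud": 100},
]

if __name__ == "__main__":
    for stage in MIGRATION_STAGES:
        print(stage, dict(simulate(stage)))
```

A service with weight 0 receives no requests, so the first and last stages correspond to "all on-premises" and "all Google Cloud" respectively.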
There are three ways to deploy Application Load Balancers for GKE clusters:
You can use an Application Load Balancer as the frontend for your Google Cloud serverless applications. This lets you configure your serverless applications to serve requests from a dedicated IP address that is not shared with any other services.
To set this up, you use a serverless NEG as the load balancer's backend. The following diagrams show how a serverless application is integrated with an Application Load Balancer.
This diagram shows how a serverless NEG fits into a global external Application Load Balancer architecture.
This diagram shows how a serverless NEG fits into a regional external Application Load Balancer architecture. This load balancer only supports Cloud Run backends.
This diagram shows how a serverless NEG fits into the regional internal Application Load Balancer model. This load balancer only supports Cloud Run backends.
This diagram shows how a serverless NEG fits into the cross-region internal Application Load Balancer model. This load balancer only supports Cloud Run backends.
Related documentation:
Application Load Balancers support load-balancing traffic to endpoints that extend beyond Google Cloud, such as on-premises data centers and other cloud environments. External backends are typically accessible in one of the following ways:
Accessible over the public internet. For these endpoints, you use an internet NEG as the load balancer's backend. The internet NEG is configured to point to a single FQDN:Port or IP:Port endpoint on the external backend. Internet NEGs can be global or regional.
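An internet NEG endpoint is written as either FQDN:Port or IP:Port. The following sketch shows one way to tell the two forms apart before you configure an endpoint. It is illustrative only: `parse_endpoint` and `Endpoint` are hypothetical names, and the sketch assumes IPv4 literals and the same 1-65535 port range used for frontend ports.

```python
import ipaddress
from typing import NamedTuple


class Endpoint(NamedTuple):
    kind: str  # "ip" or "fqdn"
    host: str
    port: int


def parse_endpoint(value: str) -> Endpoint:
    """Split a HOST:PORT string and classify the host as IP or FQDN."""
    host, sep, port_text = value.rpartition(":")
    if not sep or not host or not port_text.isdigit():
        raise ValueError(f"expected HOST:PORT, got {value!r}")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    try:
        ipaddress.IPv4Address(host)  # assumption: only IPv4 literals are handled
        kind = "ip"
    except ipaddress.AddressValueError:
        kind = "fqdn"
    return Endpoint(kind, host, port)
```

For example, `parse_endpoint("backend.example.com:443")` is classified as an FQDN endpoint, and `parse_endpoint("203.0.113.10:80")` as an IP endpoint.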
The following diagram demonstrates how to connect to external backends accessible over the public internet using a global internet NEG.
For more details, see Internet NEGs overview.
Accessible by using hybrid connectivity (Cloud Interconnect or Cloud VPN). For these endpoints, you use a hybrid NEG as the load balancer's backend. The hybrid NEG is configured to point to IP:Port endpoints on the external backend.
The following diagrams demonstrate how to connect to external backends accessible by using Cloud Interconnect or Cloud VPN.
For more details, see Hybrid NEGs overview.
Private Service Connect allows private consumption of services across VPC networks that belong to different groups, teams, projects, or organizations. You can use Private Service Connect to access Google APIs and services or managed services in another VPC network.
You can use a global external Application Load Balancer to access services that are published by using Private Service Connect. For more information, see About Private Service Connect backends.
You can use an internal Application Load Balancer to send requests to supported regional Google APIs and services. For more information, see Access Google APIs through backends.
Cross-region failover is only available with global external Application Load Balancers, classic Application Load Balancers, and cross-region internal Application Load Balancers. These load balancers let you improve service availability when you create global backend services with backends in multiple regions. If backends in a particular region are down, traffic fails over to another region gracefully.
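The failover behavior can be pictured as walking a list of regions in order of preference and using the first one that still has healthy backends. This is a conceptual sketch under simplifying assumptions, not the load balancer's algorithm: it ignores capacity and partial spillover, and the function name and inputs are hypothetical.

```python
from typing import Dict, List, Optional


def choose_region(
    preferred_order: List[str], healthy_backends: Dict[str, int]
) -> Optional[str]:
    """Return the first preferred region that has at least one healthy backend.

    preferred_order: regions sorted from most to least preferred for a client
        (for example, closest first).
    healthy_backends: number of healthy backends currently in each region.
    """
    for region in preferred_order:
        if healthy_backends.get(region, 0) > 0:
            return region
    return None  # no region can serve traffic
```

If every backend in the preferred region is down, the next region in the list takes the traffic, and the original region is used again once its backends recover.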
To learn more about how failover works, see the following topics:
Last updated 2026-06-09 UTC.