Application Load Balancer overview

The Application Load Balancer is a proxy-based Layer 7 load balancer that lets you run and scale your services. The Application Load Balancer distributes HTTP and HTTPS traffic to backends hosted on a variety of Google Cloud platforms—such as Compute Engine, Google Kubernetes Engine (GKE), Cloud Storage, and Cloud Run—as well as external backends connected over the internet or by using hybrid connectivity.

Application Load Balancers are available in the following modes of deployment:

The load balancing scheme is an attribute on the forwarding rule and the backend service of a load balancer and indicates whether the load balancer can be used for internal or external traffic. The term _MANAGED in the load balancing scheme indicates that the load balancer is implemented as a managed service either on Google Front Ends (GFEs) or on the open source Envoy proxy. In a load balancing scheme that is _MANAGED, requests are routed either to the GFE or to the Envoy proxy.
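As a rough illustration, the scheme used by each deployment mode can be written as a small lookup table. This is a sketch, not an official client. The scheme strings are the `loadBalancingScheme` values as generally documented for each mode, so verify them against the current API reference. The mode labels and helper functions are hypothetical names chosen for this example.

```python
# Assumed mapping from deployment mode to the loadBalancingScheme value set on
# the forwarding rule and backend service. Mode labels are illustrative only.
LOAD_BALANCING_SCHEMES = {
    "global external": "EXTERNAL_MANAGED",
    "classic": "EXTERNAL",
    "regional external": "EXTERNAL_MANAGED",
    "regional internal": "INTERNAL_MANAGED",
    "cross-region internal": "INTERNAL_MANAGED",
}


def scheme_for(mode: str) -> str:
    """Return the load balancing scheme for a deployment mode."""
    return LOAD_BALANCING_SCHEMES[mode]


def is_internal(mode: str) -> bool:
    """True if the scheme marks the load balancer for internal traffic."""
    return scheme_for(mode).startswith("INTERNAL")


def has_managed_suffix(mode: str) -> bool:
    """True if the scheme name carries the _MANAGED suffix."""
    return scheme_for(mode).endswith("_MANAGED")
```

For example, `is_internal("regional internal")` is true, while `has_managed_suffix("classic")` is false because the classic mode uses the plain `EXTERNAL` scheme.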

External Application Load Balancer

External Application Load Balancers are implemented using Google Front Ends (GFEs) or managed proxies. Global external Application Load Balancers and classic Application Load Balancers use GFEs that are distributed globally, operating together by using Google's global network and control plane. GFEs offer multi-region load balancing in the Premium tier, directing traffic to the closest healthy backend that has capacity and terminating HTTP(S) traffic as close as possible to your users. Global external Application Load Balancers and regional external Application Load Balancers use the open source Envoy proxy software to enable advanced traffic management capabilities.

These load balancers can be deployed in one of the following modes: global, regional, or classic.

External Application Load Balancers support the following capabilities:

The following diagram shows a sample external Application Load Balancer architecture.

External Application Load Balancer architecture.

For a complete overview, see Architecture overview for External Application Load Balancers.

Internal Application Load Balancer

The internal Application Load Balancers are Envoy proxy-based regional Layer 7 load balancers that enable you to run and scale your HTTP application traffic behind an internal IP address. Internal Application Load Balancers support backends in one region, but can be configured to be globally accessible by clients from any Google Cloud region.

The load balancer distributes traffic to backends hosted on Google Cloud, on-premises, or in other cloud environments. Internal Application Load Balancers also support the following features:

Internal Application Load Balancer architecture.

For a complete overview, see Architecture overview for internal Application Load Balancers.

Integration with Cloud NGFW

You can use rules in Cloud NGFW firewall policies to control access to the Envoy proxies used by regional internal Application Load Balancers and cross-region internal Application Load Balancers. Both Cloud NGFW Essentials and Cloud NGFW Standard support these features.

For more information, see Use regional network firewall policies to protect internal Application Load Balancers and internal proxy Network Load Balancers.

Use cases

The following sections describe some common use cases for Application Load Balancers.

Three-tier web services

You can deploy a combination of Application Load Balancers and Network Load Balancers to support conventional three-tier web services. The following example shows how you can deploy each tier, depending on your traffic type:

Layer 7-based routing in a three-tier web application.

Global access for regional internal Application Load Balancers

If you enable global access for your regional internal Application Load Balancer, your web-tier client VMs can be in another region.
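The effect of global access can be sketched with a simplified model. The `allowGlobalAccess` and `loadBalancingScheme` keys mirror fields on the forwarding rule resource. The rule name, the regions, and the `client_can_connect` helper are hypothetical, and the real resource identifies its region by URL rather than by a bare name.

```python
def forwarding_rule_body(name: str, region: str, allow_global_access: bool) -> dict:
    """Build a partial, illustrative forwarding rule body for a regional
    internal Application Load Balancer. Only a few fields are shown."""
    return {
        "name": name,
        "region": region,  # simplified: the real API uses a region URL
        "loadBalancingScheme": "INTERNAL_MANAGED",
        "allowGlobalAccess": allow_global_access,
    }


def client_can_connect(rule: dict, client_region: str) -> bool:
    """Simplified reachability check: without global access, only clients in
    the load balancer's own region can reach it."""
    return rule["allowGlobalAccess"] or client_region == rule["region"]


# Hypothetical rule in us-west1 with global access enabled.
rule = forwarding_rule_body("app-tier-rule", "us-west1", allow_global_access=True)
```

With global access enabled, `client_can_connect(rule, "europe-west1")` is true. With it disabled, the same call is false and only clients in `us-west1` qualify.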

This multitiered application example shows the following:

Three-tier web app with an external Application Load Balancer, global access, and an internal Application Load Balancer.

Workloads with jurisdictional compliance

Some workloads with regulatory or compliance requirements require that network configurations and traffic termination reside in a specific region. For these workloads, a regional external Application Load Balancer is often the preferred option to provide the jurisdictional controls these workloads require.

Advanced traffic management

The Application Load Balancers support advanced traffic management features that give you fine-grained control over how your traffic is handled. These capabilities include the following:

The following is an example of path-based routing implemented by using an internal Application Load Balancer. Each path is handled by a different backend.

Path-based routing with internal Application Load Balancers.
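The idea behind path-based routing can be sketched as a small lookup. This is a deliberately simplified model, not the load balancer's actual matching algorithm. It assumes patterns are either exact paths or prefixes ending in `/*`, and that the longest match wins. The paths and backend service names are hypothetical.

```python
def pick_backend(path: str, path_rules: dict, default_service: str) -> str:
    """Return the backend service for a request path.

    path_rules maps a pattern to a backend service name. A pattern ending in
    "/*" matches any path under that prefix; any other pattern must match the
    path exactly. The longest matching pattern wins; otherwise the default
    service handles the request.
    """
    best_len, best_service = -1, default_service
    for pattern, service in path_rules.items():
        if pattern.endswith("/*"):
            prefix = pattern[:-1]  # "/video/*" -> "/video/"
            matched = path.startswith(prefix)
            match_len = len(prefix)
        else:
            matched = path == pattern
            match_len = len(pattern)
        if matched and match_len > best_len:
            best_len, best_service = match_len, service
    return best_service


# Hypothetical rules: each path prefix is served by a different backend.
PATH_RULES = {
    "/video/*": "video-backend",
    "/images/*": "images-backend",
}
```

Under these assumptions, a request for `/video/hd/clip.mp4` goes to `video-backend`, and a request for `/checkout` falls through to whatever default service is passed in.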

For more details, see the following:

Extensibility with Service Extensions

The integration with Service Extensions lets you inject custom logic into the load balancing path of supported Application Load Balancers.

For more information, see Service Extensions overview.

Migrating legacy services to Google Cloud

Migrating an existing service to Google Cloud lets you free up on-premises capacity and reduce the cost and burden of maintaining an on-premises infrastructure. You can temporarily set up a hybrid deployment that lets you route traffic to both your current on-premises service and a corresponding Google Cloud service endpoint.

The following diagram demonstrates this setup with an internal Application Load Balancer. If you are using an internal Application Load Balancer, you can configure it to use weight-based traffic splitting to split traffic across the two services. Traffic splitting lets you start by sending 0% of the traffic to the Google Cloud service and 100% to the on-premises service. You can then gradually increase the proportion of traffic sent to the Google Cloud service. Eventually, you send 100% of the traffic to the Google Cloud service, and you can retire the on-premises service.

Migrate legacy services to Google Cloud.
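The gradual shift described above can be sketched as a series of route actions, one per migration stage. The `weightedBackendServices`, `backendService`, and `weight` keys follow the URL map resource. The backend service names, the stage percentages, and the helper function are assumptions made for illustration.

```python
def split_route_action(cloud_service: str, onprem_service: str, cloud_percent: int) -> dict:
    """Build an illustrative route action that sends cloud_percent of traffic
    to the Google Cloud service and the remainder to the on-premises service."""
    if not 0 <= cloud_percent <= 100:
        raise ValueError("cloud_percent must be between 0 and 100")
    return {
        "weightedBackendServices": [
            {"backendService": cloud_service, "weight": cloud_percent},
            {"backendService": onprem_service, "weight": 100 - cloud_percent},
        ]
    }


# Hypothetical migration stages: start at 0%, finish at 100%.
MIGRATION_STAGES = [0, 10, 50, 100]

actions = [
    split_route_action("cloud-backend-service", "onprem-backend-service", pct)
    for pct in MIGRATION_STAGES
]
```

Each stage would be applied to the URL map in turn, with time between stages to watch error rates and latency before raising the Google Cloud share further.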

Load balancing for GKE applications

There are three ways to deploy Application Load Balancers for GKE clusters:

Load balancing for Cloud Run, Cloud Run functions, and App Engine applications

You can use an Application Load Balancer as the frontend for your Google Cloud serverless applications. This lets you configure your serverless applications to serve requests from a dedicated IP address that is not shared with any other services.

To set this up, you use a serverless NEG as the load balancer's backend. The following diagrams show how a serverless application is integrated with an Application Load Balancer.

Global external

This diagram shows how a serverless NEG fits into a global external Application Load Balancer architecture.

Global external Application Load Balancer architecture for serverless apps.

Regional external

This diagram shows how a serverless NEG fits into a regional external Application Load Balancer architecture. This load balancer only supports Cloud Run backends.

Regional external Application Load Balancer architecture for serverless apps.

Regional internal

This diagram shows how a serverless NEG fits into the regional internal Application Load Balancer model. This load balancer only supports Cloud Run backends.

Regional internal Application Load Balancer architecture for serverless apps.

Cross-region internal

This diagram shows how a serverless NEG fits into the cross-region internal Application Load Balancer model. This load balancer only supports Cloud Run backends.

Cross-region internal Application Load Balancer architecture for serverless apps.
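The serverless backend support described in the preceding subsections can be summarized in a short sketch. The `networkEndpointType`, `cloudRun`, `cloudFunction`, and `appEngine` keys follow the network endpoint group resource. The support table encodes the restrictions stated above; the entry for the global external mode is inferred from this section's scope rather than stated explicitly, and the NEG and service names are hypothetical.

```python
# Serverless platforms accepted as backends, per load balancer mode.
# The "global external" entry is an inference, not an explicit statement.
SUPPORTED_SERVERLESS_BACKENDS = {
    "global external": {"cloudRun", "cloudFunction", "appEngine"},
    "regional external": {"cloudRun"},
    "regional internal": {"cloudRun"},
    "cross-region internal": {"cloudRun"},
}

# Key that names the target inside each platform-specific block.
_TARGET_KEY = {"cloudRun": "service", "cloudFunction": "function", "appEngine": "service"}


def serverless_neg_body(name: str, mode: str, platform: str, target: str) -> dict:
    """Build an illustrative serverless NEG body, rejecting platforms that the
    given load balancer mode does not support."""
    if platform not in SUPPORTED_SERVERLESS_BACKENDS[mode]:
        raise ValueError(f"{mode} does not support {platform} backends")
    return {
        "name": name,
        "networkEndpointType": "SERVERLESS",
        platform: {_TARGET_KEY[platform]: target},
    }


# Hypothetical Cloud Run service behind a regional internal load balancer.
neg = serverless_neg_body("orders-neg", "regional internal", "cloudRun", "orders")
```

Requesting an App Engine backend for the regional internal mode raises `ValueError` in this sketch, which mirrors the Cloud Run-only restriction noted for the regional and cross-region modes.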

Related documentation:

Load balancing to backends outside Google Cloud

Application Load Balancers support load-balancing traffic to endpoints that extend beyond Google Cloud, such as on-premises data centers and other cloud environments. External backends are typically accessible in one of the following ways:

Integration with Private Service Connect

Private Service Connect allows private consumption of services across VPC networks that belong to different groups, teams, projects, or organizations. You can use Private Service Connect to access Google APIs and services or managed services in another VPC network.

You can use a global external Application Load Balancer to access services that are published by using Private Service Connect. For more information, see About Private Service Connect backends.

You can use an internal Application Load Balancer to send requests to supported regional Google APIs and services. For more information, see Access Google APIs through backends.

High availability and cross-region failover

Cross-region failover is only available with global external Application Load Balancers, classic Application Load Balancers, and cross-region internal Application Load Balancers. These load balancers let you improve service availability when you create global backend services with backends in multiple regions. If backends in a particular region are down, traffic fails over to another region gracefully.
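A toy model can make the failover behavior concrete. This sketch assumes the load balancer prefers the lowest-latency region whose backends are healthy and have spare capacity; the real selection logic is more involved. The region names, latency figures, and dictionary keys are all hypothetical.

```python
from typing import Optional


def choose_region(backends: list) -> Optional[str]:
    """Return the closest region that is healthy and has capacity, or None if
    no region can serve traffic. 'Closest' is modeled as lowest latency."""
    candidates = [b for b in backends if b["healthy"] and b["has_capacity"]]
    if not candidates:
        return None
    return min(candidates, key=lambda b: b["latency_ms"])["region"]


# Hypothetical view from one client: us-east1 is nearest but currently down.
backends = [
    {"region": "us-east1", "latency_ms": 12, "healthy": False, "has_capacity": True},
    {"region": "us-central1", "latency_ms": 35, "healthy": True, "has_capacity": True},
    {"region": "europe-west1", "latency_ms": 95, "healthy": True, "has_capacity": True},
]
```

Here `choose_region(backends)` selects `us-central1`, because the nearest region is unhealthy. Once `us-east1` recovers, the same call would select it again.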

To learn more about how failover works, see the following topics: