This page shows how to enable session affinity for your Cloud Run service revision.
How session affinity works
By default, session affinity is not enabled, so requests from
the same client might be handled by different instances, as shown
here:
If you enable session affinity, Cloud Run routes sequential requests
from a given client to the same revision instance. Cloud Run
uses a session affinity cookie with a TTL of 30 days. It inspects the cookie's value to
identify requests from the same client and directs them to
the same instance, as shown here:
Key behaviors to be aware of
As shown in the diagram above, with session affinity enabled, a client will reach the same instance. However, note that the instance can receive requests from different clients. Session affinity does not mean that the instance is dedicated only to one client.
Because of Cloud Run's autoscaling behavior, session affinity is best-effort. If the instance is terminated for any reason, or reaches maximum request concurrency or maximum CPU utilization, session affinity is broken and further requests are routed to a different instance.
Although you can cache client session data in instance memory, you cannot assume that a client will always reconnect to the same instance, even when session affinity is enabled.
Cloud Load Balancing session affinity and Cloud Run session affinity are two separate and independent implementations of session affinity. You can enable Cloud Run's session affinity on a Cloud Run service, even if it's behind a load balancer. However, you shouldn't enable Cloud Load Balancing session affinity on a serverless network endpoint group, since it's not supported.
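The key behaviors above (cookie-based pinning, instances shared between clients, and best-effort fallback) can be illustrated with a small simulation. This is a minimal sketch and not Cloud Run's actual routing logic: the cookie name `affinity-id`, the `Instance` and `Router` classes, and the concurrency limit are all hypothetical names chosen for illustration.

```python
import uuid
from dataclasses import dataclass

AFFINITY_COOKIE = "affinity-id"  # hypothetical cookie name, for illustration only
MAX_CONCURRENCY = 2              # hypothetical per-instance concurrency limit


@dataclass
class Instance:
    name: str
    active_requests: int = 0
    alive: bool = True

    def can_accept(self) -> bool:
        # An instance is usable only if it is running and below its limit.
        return self.alive and self.active_requests < MAX_CONCURRENCY


class Router:
    """Toy router that pins a cookie value to an instance on a best-effort basis."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.pinned = {}  # cookie value -> Instance

    def route(self, cookies):
        token = cookies.get(AFFINITY_COOKIE)
        target = self.pinned.get(token)
        if target is None or not target.can_accept():
            # No cookie yet, or affinity is broken: fall back to any usable instance.
            candidates = [i for i in self.instances if i.can_accept()]
            if not candidates:
                raise RuntimeError("no instance available in this toy model")
            target = min(candidates, key=lambda i: i.active_requests)
            token = token or uuid.uuid4().hex
            self.pinned[token] = target
        # Return the chosen instance and the cookie the client should send next time.
        return target, {AFFINITY_COOKIE: token}
```

In this model, a client that replays its cookie reaches the same instance, a second client can land on that same instance, and a client is moved elsewhere once its pinned instance is terminated or saturated. That last case is why in-memory session data should be treated as a cache and not as the source of truth.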
Session affinity and traffic splitting
You enable or disable session affinity at the revision level. If you enable session affinity on a Cloud Run revision and also use traffic splitting, session affinity takes precedence over any traffic splitting. In extreme cases, if a single client using session affinity is responsible for the vast majority of all requests, all of those requests can be routed to a given revision regardless of the traffic splitting configuration.
If you split traffic between revisions where some have session affinity enabled and some do not, requests gradually shift toward the revisions that have session affinity enabled, even though you do not explicitly change the traffic split configuration. This happens because every request without a session affinity cookie is subject to a random split: some of these requests are eventually assigned to a revision with session affinity, and subsequent requests from that client then stay with that revision.
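The gradual shift can be made concrete with a simplified model. Assume, purely for illustration, that each cookie-less request is independently sent to the affinity-enabled revision with probability equal to that revision's configured share, and that a client stays there once it has been assigned. Under those assumptions, the probability that a client's n-th request is served by the affinity-enabled revision is 1 - (1 - p)^n. The function name `pinned_share` is hypothetical.

```python
def pinned_share(split_to_affinity: float, request_number: int) -> float:
    """Probability that a client's n-th request hits the affinity-enabled revision.

    Simplified model: every request without a cookie is an independent draw
    with probability `split_to_affinity`; once a draw succeeds, the client
    keeps its cookie and stays on that revision.
    """
    if not 0.0 <= split_to_affinity <= 1.0:
        raise ValueError("split_to_affinity must be between 0 and 1")
    if request_number < 1:
        raise ValueError("request_number must be at least 1")
    return 1.0 - (1.0 - split_to_affinity) ** request_number


# Effective share of the affinity-enabled revision for a configured 50% split.
for n in (1, 2, 3, 10):
    print(n, round(pinned_share(0.5, n), 4))
```

Even with a configured 50% split, the effective share in this model rises with every additional request a client makes, which matches the drift described above. Real traffic also depends on cookie expiry and on how often clients return.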
When you update the traffic splitting configuration for a Cloud Run service, subsequent requests with session affinity cookies attached might be assigned to a different revision. Cloud Run minimizes the number of clients that are redirected to a new revision.
For example, if a service splits traffic 90%/10% and you update the split to 80%/20%, then 10% of the traffic is redirected to the revision that now serves 20% of the traffic.
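The 10% figure in this example is the smallest share of traffic that has to move to satisfy the new split: only the revisions whose share increased need to gain traffic. The sketch below computes that lower bound for any pair of splits. It illustrates the arithmetic only and is not a description of how Cloud Run reassigns clients internally; the function and revision names are hypothetical.

```python
def min_redirected_percent(old_split: dict, new_split: dict) -> int:
    """Smallest percentage of traffic that must move between revisions.

    Both arguments map a revision name to an integer percentage.
    Only increases are summed, because every percentage point gained by
    one revision is a point lost by another.
    """
    for name, split in (("old_split", old_split), ("new_split", new_split)):
        if sum(split.values()) != 100:
            raise ValueError(f"{name} must add up to 100")
    revisions = set(old_split) | set(new_split)
    return sum(
        max(0, new_split.get(rev, 0) - old_split.get(rev, 0))
        for rev in revisions
    )


moved = min_redirected_percent(
    {"rev-a": 90, "rev-b": 10},
    {"rev-a": 80, "rev-b": 20},
)
print(moved)  # 10
```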
Set session affinity
Any configuration change leads to the creation of a new revision. Subsequent revisions will also automatically get this configuration setting unless you make explicit updates to change it.
You can set session affinity using the Google Cloud console, the Google Cloud CLI, or a YAML file when you create a new service or deploy a new revision.
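As a rough sketch of the command-line route, the helper below only assembles a gcloud command and prints it without running anything. It assumes the `--session-affinity` and `--no-session-affinity` flags of `gcloud run services update`; check the gcloud reference for your installed version before relying on them. The helper name and the service name `my-service` are hypothetical.

```python
import shlex


def session_affinity_command(service: str, enable: bool = True) -> list:
    """Build (but do not execute) a gcloud command that toggles session affinity.

    Assumes the --session-affinity / --no-session-affinity flags; verify
    them against the gcloud reference before use.
    """
    if not service:
        raise ValueError("service name must not be empty")
    flag = "--session-affinity" if enable else "--no-session-affinity"
    return ["gcloud", "run", "services", "update", service, flag]


# Print the command for review; pass the list to subprocess.run to execute it.
print(shlex.join(session_affinity_command("my-service")))
```

Because any configuration change creates a new revision, running such a command produces a new revision with the setting applied, and later revisions keep it until you change it explicitly.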