RAG Engine on Gemini Enterprise Agent Platform offers different deployment modes for operating your RAG instances. Your choice of deployment mode determines where your data is stored, how that storage scales as your data grows, and how much infrastructure you have to manage. Understanding how these modes operate helps you choose the right balance of simplicity, scalability, and cost for your project.
RAG Engine offers two deployment modes, Serverless and Spanner, each described in this section. You can switch between the two modes, but data within each mode remains isolated from the other.
Serverless mode is the most affordable and recommended way to get started with RAG Engine. It provides a fully managed, planet-scale, enterprise-ready database that abstracts away all database provisioning and scaling.
In Serverless mode, the RAG managed database handles RAG business operations and stores RAG resources. These resources include (but are not limited to) RagCorpus, RagFiles, RagMetadata, and DataSchema. The managed database is no longer used for embedding indexing or vector search.
Instead, you always need a separate vector database. By default, Serverless mode provisions a Vector Search 2.0 collection in your project for embedding indexing and vector search. Unlike Spanner mode, provisioning Vector Search 2.0 in your project gives you full visibility into and control over vector database usage and costs. See the Spanner Mode versus Serverless Mode section for a detailed comparison.
Spanner mode allocates dedicated Spanner infrastructure to serve as the foundation of your RAG Engine deployment. It is designed for workloads that require specific compliance features (like CMEK) or dedicated, isolated database instances. Spanner mode is the default if you don't explicitly select a mode.
When using Spanner mode, you must manage your infrastructure by selecting a performance tier:
RAG Engine lets you switch your project's deployment mode as long as there are no ongoing operations in your active deployment mode. You can have data under both modes. However, only one mode can be active at a time, and the data is strictly isolated between deployment modes.
As a mental model, imagine that your project has two completely separate backends. The resources you create (corpora, imported and uploaded files, and parsed embeddings) are permanently tied to the deployment mode that was active when they were created. Retrieval requests, whether made directly or through Gemini, are limited to the corpora and files that exist under your current deployment mode. Switching between the two modes neither moves your data nor deletes data from the other mode.
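
The isolation rules above can be illustrated with a small in-memory model. This is only a sketch for reasoning about the behavior, not RAG Engine code: `ProjectModel`, `Mode`, and every method name are hypothetical, and the only behaviors modeled are the ones stated in the text (resources tied to the mode active at creation, retrieval limited to the active mode, switching blocked during ongoing operations, Spanner as the default).

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    SERVERLESS = "serverless"
    SPANNER = "spanner"


@dataclass
class ProjectModel:
    """Hypothetical toy model: one isolated backend per deployment mode."""

    active: Mode = Mode.SPANNER  # Spanner is the default when no mode is chosen
    ongoing_operations: int = 0
    backends: dict = field(default_factory=lambda: {m: set() for m in Mode})

    def create_corpus(self, name: str) -> None:
        # A resource is tied to whichever mode is active when it is created.
        self.backends[self.active].add(name)

    def visible_corpora(self) -> set:
        # Retrieval only sees resources stored under the active mode.
        return set(self.backends[self.active])

    def switch(self, target: Mode) -> None:
        # Switching is refused while operations are still running.
        if self.ongoing_operations:
            raise RuntimeError("cannot switch modes during ongoing operations")
        # No data is moved or deleted; only the active mode changes.
        self.active = target


project = ProjectModel()
project.create_corpus("spanner-docs")
project.switch(Mode.SERVERLESS)
project.create_corpus("serverless-docs")
print(project.visible_corpora())  # only the Serverless corpus is visible
project.switch(Mode.SPANNER)
print(project.visible_corpora())  # the Spanner corpus is still there
```

Keeping one set per mode makes the key property easy to see: switching changes which set is consulted, and never touches the contents of either set.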

As illustrated in the diagram:
The deployment mode is a project-level setting. You can view or change your current mode using the GetRagEngineConfig and UpdateRagEngineConfig APIs. See the Switching between modes page for details on how to switch between deployment modes and how to choose an appropriate tier for Spanner mode.
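
A read-then-update flow against these two APIs might look like the following sketch. It does not use the real client library: `RagConfigClient`, `FakeClient`, `ensure_mode`, and the `"mode"` field name are all assumptions made for illustration, with the fake client standing in for calls to GetRagEngineConfig and UpdateRagEngineConfig so that the example runs without cloud access.

```python
from typing import Protocol


class RagConfigClient(Protocol):
    """Hypothetical wrapper around GetRagEngineConfig / UpdateRagEngineConfig."""

    def get_rag_engine_config(self, project: str) -> dict: ...

    def update_rag_engine_config(self, project: str, config: dict) -> dict: ...


def ensure_mode(client: RagConfigClient, project: str, mode: str) -> bool:
    """Set the project's deployment mode; return True if an update was sent."""
    current = client.get_rag_engine_config(project)
    if current.get("mode") == mode:  # "mode" is an assumed field name
        return False
    client.update_rag_engine_config(project, {**current, "mode": mode})
    return True


class FakeClient:
    """In-memory stand-in so the sketch runs without any cloud access."""

    def __init__(self) -> None:
        self.configs = {}

    def get_rag_engine_config(self, project: str) -> dict:
        # Report Spanner when nothing was set, matching the stated default.
        return dict(self.configs.get(project, {"mode": "spanner"}))

    def update_rag_engine_config(self, project: str, config: dict) -> dict:
        self.configs[project] = dict(config)
        return config


client = FakeClient()
print(ensure_mode(client, "my-project", "serverless"))  # True: update sent
print(ensure_mode(client, "my-project", "serverless"))  # False: already set
```

Reading the config first keeps the helper idempotent, so it avoids sending an update when the project is already in the requested mode.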
Because data is isolated between modes, the processes for cleaning up resources and halting billing differ slightly depending on where your data lives.
- Serverless mode: Use the ListRagCorpora API to view your resources, and then manually delete each corpus using the DeleteRagCorpus API.
- Spanner mode: Update the RagEngineConfig and set the Spanner tier to Unprovisioned. This immediately deletes your dedicated Spanner instance and all RAG data held within it, halting any associated billing for Spanner mode. Note: Data deleted using the Unprovisioned tier cannot be recovered.

| Feature | Serverless Mode | Spanner Mode |
|---|---|---|
| Cost | | |
| Scaling | Fully managed autoscaling | You must choose a tier; an autoscaling tier is available. |
| Isolation | Storage is not isolated | Provides storage and performance isolation. |
| CMEK | Not currently supported | Supported |
| VPC Security Controls | Supported | Supported |
| Supported Vector DBs | | |
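
The mode-specific cleanup steps described before the comparison table can be sketched as a small planning function. This is a hedged illustration rather than client code: `cleanup_plan` and its `list_corpora` callback (a stand-in for a ListRagCorpora call) are hypothetical, the action tuples are just labels, and the sketch assumes that per-corpus deletion is the Serverless path while unprovisioning the tier is the Spanner path.

```python
from typing import Callable, Iterable


def cleanup_plan(mode: str, list_corpora: Callable[[], Iterable[str]]) -> list:
    """Return the ordered cleanup actions for the given deployment mode."""
    if mode == "serverless":
        # Assumed Serverless path: every corpus is deleted individually.
        return [("DeleteRagCorpus", name) for name in list_corpora()]
    if mode == "spanner":
        # Spanner path: one config update removes the instance and its data.
        # This is irreversible, so a real tool should ask for confirmation.
        return [("UpdateRagEngineConfig", "tier=Unprovisioned")]
    raise ValueError(f"unknown deployment mode: {mode!r}")


plan = cleanup_plan("serverless", lambda: ["corpus-a", "corpus-b"])
for action, target in plan:
    print(action, target)
```

Returning a plan instead of executing it lets you review the actions first, which matters most for the Spanner path because data removed through the Unprovisioned tier cannot be recovered.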
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License.
Last updated 2026-06-10 UTC.