Cloud TPU resources in Compute Engine

You can create and manage Tensor Processing Units (TPUs) by using Compute Engine resources. This page provides a conceptual overview of using TPUs with Compute Engine. It maps TPU concepts to Compute Engine resources and outlines the high-level workflows for creating TPU resources.

Primary TPU concepts

To manage TPU resources within Compute Engine, it's helpful to understand these primary TPU concepts:

TPU and Compute Engine concept map

The following table describes how TPU concepts map to Compute Engine resources:

| Cloud TPU concept | Compute Engine resource | Resource details | Use case |
|---|---|---|---|
| TPU VM | VM instance | A Compute Engine VM that provides direct access to TPU hardware. | Individual VM tasks, SSH command execution, or debugging |
| TPU single-host slice | VM instance or MIG with a single VM | A configuration consisting of one physical host machine. | Inference with autoscaling |
| TPU multi-host slice | MIG with accelerator topology specified in workload policy | A group of TPU VMs interconnected using ICI, managed as a single logical unit. | Large-scale, distributed training requiring atomic provisioning |
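The concept map can also be read as a lookup from workload shape to Compute Engine resource. The following Python sketch is illustrative only: the dictionary keys, the `TpuMapping` class, and the `pick_concept` helper are hypothetical names, not part of any Google Cloud API. The selection rule is an assumption drawn from the table's use cases: more than one host implies a multi-host slice, and a single host with autoscaling implies a single-host slice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TpuMapping:
    """One row of the concept map (field names are illustrative)."""
    compute_engine_resource: str
    use_case: str


# Transcribed from the concept map; the keys are hypothetical labels.
CONCEPT_MAP = {
    "tpu_vm": TpuMapping(
        "VM instance",
        "Individual VM tasks, SSH command execution, or debugging",
    ),
    "single_host_slice": TpuMapping(
        "VM instance or MIG with a single VM",
        "Inference with autoscaling",
    ),
    "multi_host_slice": TpuMapping(
        "MIG with accelerator topology specified in workload policy",
        "Large-scale, distributed training requiring atomic provisioning",
    ),
}


def pick_concept(host_count: int, needs_autoscaling: bool = False) -> str:
    """Hypothetical helper: choose a TPU concept from the workload shape.

    Assumption: the rule below is inferred from the table's use cases
    and is not an official selection procedure.
    """
    if host_count < 1:
        raise ValueError("host_count must be at least 1")
    if host_count > 1:
        # Several interconnected hosts are managed as one logical unit.
        return "multi_host_slice"
    return "single_host_slice" if needs_autoscaling else "tpu_vm"


if __name__ == "__main__":
    concept = pick_concept(host_count=4)
    print(concept, "->", CONCEPT_MAP[concept].compute_engine_resource)
```

A lookup like this may help when you script resource selection, but treat the table itself as the authoritative mapping.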

Migrate from the Cloud TPU API

The Cloud TPU API is no longer under active development. This also applies to the Google Cloud CLI and the Cloud Client Libraries for the Cloud TPU API. The Cloud TPU API will receive only bug fixes and security updates. New hardware generations, starting with TPU7x (Ironwood), are supported only through Compute Engine or Google Kubernetes Engine (GKE). To get the latest features and support for the latest TPU versions, migrate by replacing your legacy Cloud TPU API calls with their equivalents in Compute Engine or GKE.

Depending on your orchestration and workload requirements, choose one of the following paths:

Existing TPU resources

TPU resources created using the Cloud TPU API (Node or QueuedResource REST objects) are incompatible with Compute Engine and GKE. To start using Compute Engine or GKE:

Limitations

TPUs in Compute Engine have the following limitations:

What's next

  • Try the quickstart: Create a single TPU instance
  • Create a single-host TPU slice
  • Create a multi-host TPU slice