This document describes how you can use the Ops Agent and the OpenTelemetry Protocol (OTLP) receiver to collect user-defined metrics and traces from applications that are instrumented by using OpenTelemetry and that run on Compute Engine.
This document is organized as follows:
- An overview that describes the use cases for the OTLP receiver.
- Prerequisites for using the OTLP receiver.
- Configuring the agent to use the OTLP receiver.
- Using the receiver to collect metrics. This section describes how to query your OpenTelemetry metrics in Cloud Monitoring.
- Using the receiver to collect traces. This section describes how to authorize a service account to write data to Cloud Trace.
Overview of using the OTLP receiver
With the Ops Agent OTLP receiver, you can do the following:
- Instrument your application by using one of the language-specific SDKs for OpenTelemetry. For information about the supported languages, see OpenTelemetry Instrumentation. The combination of OpenTelemetry SDKs and the Ops Agent does the following for you:
  - Collects OTLP metrics from your application and sends those metrics to Cloud Monitoring for analysis.
  - Collects OTLP spans (trace data) from your application and sends those spans to Cloud Trace for analysis.
- Collect traces from third-party applications, such as Nginx, that have built-in support for OTLP or plugins that provide it. For an example, see OpenTelemetry nginx module.
- Use OpenTelemetry custom instrumentation.
- Use OpenTelemetry automatic instrumentation.
You can use the receiver to collect metrics, traces, or both. After the Ops Agent has collected your metrics, you can use the features of Cloud Monitoring, including charts, dashboards, and alerting policies, to monitor your metrics. If your application also sends trace data, then you can use Cloud Trace to analyze that data.
Benefits
Before the availability of the OTLP plugin for the Ops Agent, the primary ways to instrument your applications to collect user-defined metrics and traces included the following:
- Using client libraries that implement the Monitoring API or the Trace API.
- Using the older OpenCensus libraries.
Using OpenTelemetry with the OTLP receiver has several benefits over these methods, including the following:
- OpenTelemetry is the replacement for OpenCensus. The OpenCensus project is being archived. For more information, see What is OpenTelemetry?.
- Ingestion is controlled at the agent level, so you don't have to redeploy your applications if the agent configuration changes.
- Your applications don't need to set up Google Cloud credentials; all authorization is handled at the agent level.
- Your application code contains no Google Cloud-specific monitoring or tracing code. You don't have to use the Monitoring API or the Trace API directly.
- Your application pushes data to the Ops Agent, and if your application crashes, any data that has been collected by the Ops Agent isn't lost.
Limitations
The OTLP listener exposed by the Ops Agent receiver supports the gRPC transport. HTTP, which is used primarily for JavaScript clients, isn't supported. For more information about the OpenTelemetry Protocol, see Protocol Details.
The OTLP receiver doesn't collect logs. You can collect logs by using the Ops Agent and other receivers and you can include log information in OTLP spans, but the OTLP receiver doesn't support the direct collection of logs. For information about using the Ops Agent to collect logs, see Logging configurations.
Prerequisites
To collect OTLP metrics and traces by using the OTLP receiver and the Ops Agent, you must install the Ops Agent version 2.37.0 or higher.
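The version requirement can be checked with a numeric comparison. The following is a minimal sketch, assuming the installed version is available as a plain dotted string such as 2.37.0; how you obtain that string is not shown, and the function names are hypothetical. Comparing integer tuples avoids the mistake of comparing versions as text, where "2.9.0" would sort after "2.37.0".

```python
from typing import Tuple

# Minimum Ops Agent version that includes the OTLP receiver (from this document).
MIN_OTLP_VERSION = (2, 37, 0)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '2.37.0' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def supports_otlp_receiver(installed: str) -> bool:
    """Return True if the installed version is 2.37.0 or higher."""
    return parse_version(installed) >= MIN_OTLP_VERSION
```

For example, supports_otlp_receiver("2.9.0") is False because (2, 9, 0) compares lower than (2, 37, 0).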
This document assumes that you already have an OpenTelemetry-based application written by using one of the OpenTelemetry SDKs. This document doesn't cover using OpenTelemetry SDKs. For information about SDKs and the supported languages, see OpenTelemetry Instrumentation.
Configure the Ops Agent
To configure the Ops Agent to use the OTLP receiver, do the following:
- Modify the user configuration file for the Ops Agent to include the OTLP receiver.
- Restart the Ops Agent.
The following sections describe each step.
Modify the Ops Agent user-configuration file
Add the configuration elements for the OTLP receiver to the user-configuration file for the Ops Agent:
- For Linux: /etc/google-cloud-ops-agent/config.yaml
- For Windows: C:\Program Files\Google\Cloud Operations\Ops Agent\config\config.yaml
For general information about configuring the agent, see Configuration model.
The OTLP receiver introduces the combined configuration section
for the Ops Agent. Using the receiver requires you to configure services for
metrics and traces, even if you aren't using both of them.
The following sections describe the configuration steps for the OTLP receiver.
Add the combined receiver section
You place the receiver for OTLP metrics and traces in the combined
section. No processors or services are permitted in the combined section.
You must not configure any other receiver with the
same name as a receiver in the combined section. The following example
uses otlp as the name of the receiver.
The minimal combined configuration for OTLP looks like the following:
combined:
  receivers:
    otlp:
      type: otlp
The otlp receiver has the following configuration options:
- type: Required. Must be otlp.
- grpc_endpoint: Optional. The gRPC endpoint on which the OTLP receiver listens. Defaults to 0.0.0.0:4317.
- metrics_mode: Optional. Defaults to googlemanagedprometheus, which means the receiver sends OTLP metrics as Prometheus-formatted metrics by using the Prometheus API also used by Managed Service for Prometheus.
  To send the metrics as Cloud Monitoring custom metrics by using the Monitoring API instead, set the metrics_mode option to the value googlecloudmonitoring.
  This choice affects how your metrics are ingested and how they are measured for billing. For more information about metrics formats, see Ingestion formats for OTLP metrics.
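The options and defaults listed above can be modeled as a small data structure. The following is a sketch for illustration only, not the agent's own validation logic: the class name and the problems method are hypothetical, and it accepts only the two metrics_mode values named in this document.

```python
from dataclasses import dataclass
from typing import List

# The two metrics_mode values named in this document.
VALID_METRICS_MODES = ("googlemanagedprometheus", "googlecloudmonitoring")


@dataclass
class OtlpReceiverOptions:
    type: str                                      # required, must be "otlp"
    grpc_endpoint: str = "0.0.0.0:4317"            # documented default
    metrics_mode: str = "googlemanagedprometheus"  # documented default

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means none found."""
        found = []
        if self.type != "otlp":
            found.append("type must be 'otlp'")
        if self.metrics_mode not in VALID_METRICS_MODES:
            found.append("metrics_mode must be one of: " + ", ".join(VALID_METRICS_MODES))
        return found
```

Constructing OtlpReceiverOptions(type="otlp") yields the default endpoint and the default Prometheus metrics mode, which mirrors the minimal configuration.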
Add OTLP pipelines to your services
The OTLP receiver can collect metrics and traces, so you must define a service for metrics and for traces. If you aren't going to collect either metrics or traces, you can create empty services. If you already have services with other pipelines, you can add the OTLP pipeline to them.
The following shows the metrics and traces services with the OTLP
receiver included in the pipelines:
combined:
  receivers:
    otlp:
      type: otlp
metrics:
  service:
    pipelines:
      otlp:
        receivers: [otlp]
traces:
  service:
    pipelines:
      otlp:
        receivers: [otlp]
If you don't want to use either the metrics or traces service for OTLP
collection, then leave the OTLP receiver out of the pipeline for the service.
The service must exist, even if it has no pipelines. If your application sends
data of a given type and there is no corresponding pipeline that includes the
receiver, then the Ops Agent discards the data.
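The rules for the combined section and the services can be expressed as checks over a configuration that has already been parsed into nested dictionaries. The following is a rough sketch under several assumptions: the function names are hypothetical, an empty service is assumed to parse as a service key with no value, and the name-clash check is assumed to apply to the logging, metrics, and traces sections. It is not how the Ops Agent itself validates configuration.

```python
def check_otlp_config(config):
    """Return a list of rule violations for a parsed Ops Agent config dict."""
    problems = []
    combined = config.get("combined") or {}

    # No processors or services are permitted in the combined section.
    extra_keys = sorted(set(combined) - {"receivers"})
    if extra_keys:
        problems.append("combined: only 'receivers' is allowed, found " + ", ".join(extra_keys))

    # No other receiver may reuse the name of a combined receiver.
    combined_names = set(combined.get("receivers") or {})
    for section in ("logging", "metrics", "traces"):
        body = config.get(section) or {}
        for name in sorted(combined_names & set(body.get("receivers") or {})):
            problems.append(f"{section}: receiver '{name}' duplicates a combined receiver")

    # Both services must exist, even if they have no pipelines.
    for section in ("metrics", "traces"):
        body = config.get(section)
        if not isinstance(body, dict) or "service" not in body:
            problems.append(f"{section}: a service is required, even with no pipelines")
    return problems


def collected_types(config):
    """List the data types whose pipelines include a combined receiver."""
    combined_names = set((config.get("combined") or {}).get("receivers") or {})
    result = []
    for section in ("metrics", "traces"):
        service = (config.get(section) or {}).get("service") or {}
        for pipeline in (service.get("pipelines") or {}).values():
            if combined_names & set((pipeline or {}).get("receivers") or []):
                result.append(section)
                break
    return result
```

Any data type missing from the result of collected_types corresponds to data that the agent would discard, as described above.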
Restart the Ops Agent
To apply your configuration changes, you must restart the Ops Agent.
Linux
- To restart the agent, run the following command on your instance:
sudo systemctl restart google-cloud-ops-agent
- To confirm that the agent restarted, run the following command and
verify that the components "Metrics Agent" and "Logging Agent" started:
sudo systemctl status "google-cloud-ops-agent*"
Windows
- Connect to your instance by using RDP or a similar tool, and log in to Windows.
- Open a PowerShell terminal with administrator privileges by right-clicking the PowerShell icon and selecting Run as Administrator.
- To restart the agent, run the following PowerShell command:
Restart-Service google-cloud-ops-agent -Force
- To confirm that the agent restarted, run the following command and
verify that the components "Metrics Agent" and "Logging Agent" started:
Get-Service google-cloud-ops-agent*
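After the restart, one additional sanity check is to see whether something accepts TCP connections on the receiver's gRPC endpoint. The following is a minimal sketch, assuming the default port 4317 and a check run on the VM itself; the function name is hypothetical. A successful connection shows only that a process is listening on the port, not that the listener is the OTLP receiver.

```python
import socket


def endpoint_accepts_connections(host="127.0.0.1", port=4317, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

If you set grpc_endpoint to a different address or port, pass those values instead of the defaults.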
Collect OTLP metrics
When you use the OTLP receiver to collect metrics from your OpenTelemetry applications, the primary configuration choice for the receiver is the API that you want to use to ingest the metrics.
You make this choice by changing the metrics_mode option in the
configuration of the otlp receiver or using the default value.
The choice affects how your OTLP metrics are ingested into
Cloud Monitoring and how that data is measured for billing purposes.
The metrics_mode choice doesn't affect your ability to create charts,
dashboards, and alerting policies in Monitoring.
- For information about creating charts and dashboards, see Dashboards and charts overview.
- For information about alerting policies, see Alerting overview.
The following sections describe differences in the formats used by the metric modes and how to query the ingested data for use in Monitoring.
Ingestion formats for OTLP metrics
The OTLP receiver provides the metrics_mode option, which specifies
the API that is used to ingest your metric data. By default, the receiver
uses the Prometheus API; the default value for the metrics_mode option
is googlemanagedprometheus. The metrics are ingested using the same
API that is used by Managed Service for Prometheus.
You can configure the receiver to send your metric data to the
Cloud Monitoring API instead. To send data to the Monitoring API,
set the value of the metrics_mode option to googlecloudmonitoring, as
shown in the following example:
combined:
  receivers:
    otlp:
      type: otlp
      metrics_mode: googlecloudmonitoring
The ingestion format you use determines how the OTLP metrics are mapped into Cloud Monitoring. You can create charts, dashboards, and alerting policies in Monitoring for metrics of either metric format, but you refer to the metrics differently in queries.
The ingestion format also determines the pricing model used for data ingestion.
The following sections describe pricing, the structural differences between a metric ingested by the Prometheus API and the same metric ingested by the Monitoring API, and how to refer to the metrics in queries.
Pricing and quota
The ingestion format you use determines how the OTLP metrics are charged:
Prometheus API: When you use the Prometheus API to ingest your application's metrics, the data is subject to sample-based pricing, as if the metrics had come in by using Managed Service for Prometheus.
Monitoring API: When you use the Monitoring API to ingest your application's metrics, the data is subject to volume-based pricing, like data from other integrations with the Ops Agent.
Metrics ingested by using the OTLP receiver are considered types of "custom" metrics when ingested into Cloud Monitoring and are subject to the quotas and limits for custom metrics.
For current pricing, see Google Cloud Observability pricing.
Metric structure
Cloud Monitoring describes the format of metric data by using a schema called a metric descriptor. The metric descriptor includes the name of the metric, the data type of metric values, how each value is related to prior values, and any labels associated with the values. If you configure the OTLP receiver to ingest metrics by using the Prometheus API, then the metric descriptor that is created differs from the metric descriptor created when you use the Monitoring API.
Prometheus API: When you use the Prometheus API to ingest your application's metrics, each metric is transformed by using the standard OpenTelemetry-to-Prometheus transformation and mapped to a Cloud Monitoring monitored-resource type.
- The transformation includes the following changes:
  - The OTLP metric name is prefixed with the string prometheus.googleapis.com/.
  - Any non-alphanumeric characters, such as periods (.), in the OTLP metric name are replaced by underscores (_).
  - The OTLP metric name is postfixed with a string that indicates the metric kind, like /gauge or /counter.
- The following labels, populated with values from the OTLP resource, are added to the metric:
  - instance_name: The value of the host.name resource attribute.
  - machine_type: The value of the host.type resource attribute.
- The monitored resource recorded with the metric measurements is the generic prometheus_target type. The generated Prometheus time series includes the following labels from the prometheus_target resource, populated with values from the OTLP resource:
  - location: The value of the cloud.availability_zone resource attribute.
  - namespace: The value of the host.id resource attribute.
  The prometheus_target resource type also includes these labels:
  - project_id: The identifier of the Google Cloud project, like my-project, in which the Ops Agent is running.
  - cluster: The value is always __gce__ when metrics are collected by the Ops Agent.
If the incoming OTLP data is missing the resource attributes used for label values, then the values are taken from information about the VM running the Ops Agent. This behavior means that OTLP data without these resource attributes appears with the same labels as data collected by the Ops Agent Prometheus receiver.
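The label mapping and its fallback can be sketched as a lookup that prefers the OTLP resource attribute and otherwise uses a value describing the VM. In the following illustration, the function name and the vm_info keys (zone, instance_id, name, machine_type) are hypothetical, and the pairing of each VM field with a label is an assumption; the attribute-to-label pairs themselves come from the list above.

```python
def resolve_labels(resource_attrs, vm_info, project_id):
    """Return (resource_labels, metric_labels) for one OTLP resource."""

    def pick(attribute, fallback_key):
        # Prefer the OTLP resource attribute; fall back to VM information.
        value = resource_attrs.get(attribute)
        return value if value else vm_info[fallback_key]

    resource_labels = {
        "project_id": project_id,
        "location": pick("cloud.availability_zone", "zone"),
        "cluster": "__gce__",  # always this value for the Ops Agent
        "namespace": pick("host.id", "instance_id"),
    }
    metric_labels = {
        "instance_name": pick("host.name", "name"),
        "machine_type": pick("host.type", "machine_type"),
    }
    return resource_labels, metric_labels
```

With an empty resource_attrs dictionary, every label is filled from vm_info, which corresponds to the case where OTLP data carries none of these resource attributes.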
Monitoring API: When you use the Monitoring API to ingest your application's metrics, each metric is handled as follows:
- The OTLP metric name is prefixed with the string workload.googleapis.com/, unless the OTLP metric name already contains this string or another valid metric domain, like custom.googleapis.com. We recommend using the "workload" domain.
- The monitored resource recorded with the metric measurements is the Compute Engine virtual-machine type gce_instance.
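The naming rules for both modes can be illustrated with two small functions. This sketch covers only the name changes listed above and is not the full OpenTelemetry-to-Prometheus transformation. The function names are hypothetical, the caller supplies the metric kind suffix, and the domain check is an assumption: it recognizes only the two domains named in this document and treats a domain as a prefix of the name.

```python
import re

# Domains named in this document; other valid domains are not covered here.
KNOWN_DOMAINS = ("workload.googleapis.com", "custom.googleapis.com")


def prometheus_metric_type(otlp_name, kind):
    """Metric type when metrics_mode is googlemanagedprometheus."""
    # Replace every non-alphanumeric character, such as a period, with "_".
    sanitized = re.sub(r"[^A-Za-z0-9]", "_", otlp_name)
    # Add the domain prefix and the metric-kind suffix, like gauge or counter.
    return f"prometheus.googleapis.com/{sanitized}/{kind}"


def monitoring_metric_type(otlp_name):
    """Metric type when metrics_mode is googlecloudmonitoring."""
    if any(otlp_name.startswith(domain + "/") for domain in KNOWN_DOMAINS):
        return otlp_name  # the name already carries a metric domain
    return f"workload.googleapis.com/{otlp_name}"
```

For an OTLP gauge named otlp.test.gauge, the first function returns prometheus.googleapis.com/otlp_test_gauge/gauge and the second returns workload.googleapis.com/otlp.test.gauge.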
The following examples show the metric descriptors for a pair of OpenTelemetry
metrics. The metrics are created by an application that uses the
Go OpenTelemetry metrics library.
The Prometheus API tab shows the
metric descriptor created when the OTLP receiver uses the default
Prometheus metrics mode. The Monitoring API tab shows the
metric descriptor created when the OTLP receiver uses the
googlecloudmonitoring metric mode.
Nothing changes in the application that creates the metric; the only change is the metric mode used by the OTLP receiver.
The application creates an OTLP gauge metric, otlp.test.gauge, that
records 64-bit floating-point values.
The following tabs show the metric descriptor that each ingestion API creates:
Prometheus API
{
  "name": "projects/PROJECT_ID/metricDescriptors/prometheus.googleapis.com/otlp_test_gauge/gauge",
  "labels": [
    {
      "key": "instance_name"
    },
    {
      "key": "machine_type"
    }
  ],
  "metricKind": "GAUGE",
  "valueType": "DOUBLE",
  "type": "prometheus.googleapis.com/otlp_test_gauge/gauge",
  "monitoredResourceTypes": [
    "prometheus_target"
  ]
}
Monitoring API
{
  "name": "projects/PROJECT_ID/metricDescriptors/workload.googleapis.com/otlp.test.gauge",
  "labels": [
    {
      "key": "instrumentation_source"
    }
  ],
  "metricKind": "GAUGE",
  "valueType": "DOUBLE",
  "type": "workload.googleapis.com/otlp.test.gauge",
  "monitoredResourceTypes": [
    "gce_instance",
    ...many other types deleted...
  ]
}
The application creates an OTLP counter metric, otlp.test.cumulative,
that records increasing 64-bit floating-point values.
The following tabs show the metric descriptor that each ingestion API creates:
Prometheus API
{
  "name": "projects/PROJECT_ID/metricDescriptors/prometheus.googleapis.com/otlp_test_cumulative/counter",
  "labels": [
    {
      "key": "instance_name"
    },
    {
      "key": "machine_type"
    }
  ],
  "metricKind": "CUMULATIVE",
  "valueType": "DOUBLE",
  "type": "prometheus.googleapis.com/otlp_test_cumulative/counter",
  "monitoredResourceTypes": [
    "prometheus_target"
  ]
}
Monitoring API
{
  "name": "projects/PROJECT_ID/metricDescriptors/workload.googleapis.com/otlp.test.cumulative",
  "labels": [
    {
      "key": "instrumentation_source"
    }
  ],
  "metricKind": "CUMULATIVE",
  "valueType": "DOUBLE",
  "type": "workload.googleapis.com/otlp.test.cumulative",
  "monitoredResourceTypes": [
    "gce_instance",
    ...many other types deleted...
  ]
}
The following table summarizes some of the format differences imposed by the APIs used to ingest OTLP metrics:
| | Prometheus API | Monitoring API |
|---|---|---|
| Metric domain | prometheus.googleapis.com | workload.googleapis.com |
| OTLP metric name | Modified during ingestion | Used as provided |
| Monitored resource | prometheus_target | gce_instance |
Ingestion formats and queries
The metrics mode used in the OTLP receiver affects the way you query the resulting metrics in Cloud Monitoring when you build charts, dashboards, and alerting policies.
When you configure a chart, dashboard, or alerting policy in Cloud Monitoring, the configuration includes a query for the data on which the chart, dashboard, or alerting policy operates.
Cloud Monitoring supports the following tools for querying metric data:
- A query-builder based interface built into tools like Metrics Explorer, the dashboard-builder interface, and the alert-policy configuration interface.
- Prometheus Query Language (PromQL): The text-based query language used by open source Prometheus.
For information about querying OTLP metrics by using these tools, see the following:
- Query OTLP metrics ingested by using the Prometheus API
- Query OTLP metrics ingested by using the Monitoring API
Query OTLP metrics ingested by using the Prometheus API
This section illustrates how you query OTLP metrics ingested by using the Prometheus API, which is the default metric mode for the OTLP receiver.
The queries are based on the OTLP metrics described in Metric structure:
- otlp.test.gauge: An OTLP gauge metric that records 64-bit floating-point values.
- otlp.test.cumulative: An OTLP counter metric that records increasing 64-bit floating-point values.
These metrics are ingested into Cloud Monitoring with the following metric types, which function as names:
- prometheus.googleapis.com/otlp_test_gauge/gauge
- prometheus.googleapis.com/otlp_test_cumulative/counter
Metrics ingested by using the Prometheus API are written against the
monitored-resource type prometheus_target.
The tabs show what basic queries look like when you query the metrics by using the Google Cloud console. These examples use Metrics Explorer, but the principles are the same for dashboards and alerting policies.