kafka: add a reserved Redpanda-specific Kafka API-key range#30731
Open
nguyen-andrew wants to merge 4 commits into
Contributor
Pull request overview
Introduces a reserved Redpanda-specific Kafka API key range (base key 15000) and the supporting plumbing so these keys can be dispatched, parsed (flex/tagged fields), and measured (per-key metrics/probes) without inflating the standard “dense” key-indexed tables. Adds DescribeRedpandaRoles (15000) as the first custom API, with wire schema + round-trip tests and a stub handler that enforces cluster DESCRIBE authorization and returns an empty role list.
Changes:
- Add DescribeRedpandaRoles protocol schemata, generated message types, and encode/decode round-trip tests.
- Extend handler dispatch, flex-version lookup, and handler probe/metrics plumbing to support a rebased custom-key table for the reserved range.
- Fix throughput-controlled API-key bitmap indexing so out-of-range keys don’t throw and disconnect clients.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/v/kafka/server/tests/handler_probe_test.cc | Adds a regression test ensuring reserved-range keys map to custom probe storage without OOB access. |
| src/v/kafka/server/tests/handler_interface_test.cc | Adds tests for reserved-range dispatch and flex-version/schema recognition. |
| src/v/kafka/server/tests/BUILD | Registers new server-side unit tests and adds needed deps. |
| src/v/kafka/server/handlers/handlers.h | Adds a separate redpanda_request_types handler type-list for reserved-range APIs. |
| src/v/kafka/server/handlers/handler_probe.h | Adds _custom_probes storage for reserved-range per-handler probes. |
| src/v/kafka/server/handlers/handler_probe.cc | Initializes and routes reserved-range probes via rebased offsets. |
| src/v/kafka/server/handlers/handler_interface.cc | Adds reserved-range dispatch LUT (make_custom_lut) and lookup path in handler_for_key. |
| src/v/kafka/server/handlers/describe_redpanda_roles.h | Declares the new DescribeRedpandaRoles handler type. |
| src/v/kafka/server/handlers/describe_redpanda_roles.cc | Implements the handler with authorization + audit failure handling and empty response. |
| src/v/kafka/server/connection_context.cc | Bounds-checks throughput-controlled API-key bitmap accesses to avoid std::out_of_range. |
| src/v/kafka/server/BUILD | Adds the new handler sources/headers and protocol deps to the server library. |
| src/v/kafka/protocol/types.h | Introduces redpanda_api_key_base constant (15000). |
| src/v/kafka/protocol/tests/describe_redpanda_roles_test.cc | Adds round-trip encode/decode tests for new request/response data types. |
| src/v/kafka/protocol/tests/BUILD | Registers the new protocol unit test. |
| src/v/kafka/protocol/schemata/generator.py | Adds new struct type names to the schema generator’s struct-type list. |
| src/v/kafka/protocol/schemata/generator.bzl | Adds describe_redpanda_roles to the list of generated messages. |
| src/v/kafka/protocol/schemata/describe_redpanda_roles_response.json | Defines the wire schema for the new response (flexible v0+). |
| src/v/kafka/protocol/schemata/describe_redpanda_roles_request.json | Defines the wire schema for the new request (nullable filter list, flexible v0+). |
| src/v/kafka/protocol/messages.h | Adds new request schema include and defines protocol-side redpanda_request_types. |
| src/v/kafka/protocol/flex_versions.cc | Adds a rebased flex-version table for reserved-range API keys and routes lookups accordingly. |
| src/v/kafka/protocol/describe_redpanda_roles.h | Adds protocol request/response wrapper types for DescribeRedpandaRoles. |
Force-pushed from 5f702f7 to fc9aa50
The throughput-controlled API key bitmap is sized to max_api_key()+1 (~69 entries, the standard Kafka range), but the three sites that consult it indexed with .at(request_key). A request carrying an API key beyond the standard range therefore threw std::out_of_range in the throttle path, before the request reached the handler router. The error path treats that as a short-read disconnect, so the connection is dropped instead of the key being rejected as an unsupported API.

Guard the three sites with an explicit size check so out-of-range keys fall through as not throughput-controlled, mirroring the bounds check handler_for_key already applies to its dispatch table. This also makes the reserved Redpanda API-key range (15000+) safe to dispatch.
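A minimal sketch of the guard described above, for illustration only. The `api_key_bitmap` alias, `make_bitmap`, and `is_throughput_controlled` are hypothetical names, and the bitmap is assumed to be a `std::vector<bool>`; the real type and call sites in connection_context.cc may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the throughput-controlled API-key bitmap.
// It covers only the standard key range: indices [0, max_key].
using api_key_bitmap = std::vector<bool>;

// Builds a bitmap sized to max_key + 1 with every bit cleared.
inline api_key_bitmap make_bitmap(std::int16_t max_key) {
    assert(max_key >= 0);
    return api_key_bitmap(static_cast<std::size_t>(max_key) + 1, false);
}

// Returns true only if `key` lies inside the bitmap and its bit is set.
// Negative or out-of-range keys are reported as "not controlled" rather
// than throwing std::out_of_range, which is what .at(key) would do.
inline bool is_throughput_controlled(
  const api_key_bitmap& bitmap, std::int16_t key) {
    if (key < 0) {
        return false;
    }
    const auto idx = static_cast<std::size_t>(key);
    return idx < bitmap.size() && bitmap[idx];
}
```

With this shape, a key such as 15000 simply reads as not throughput-controlled and continues on to handler lookup, where it can be accepted or rejected on its own merits.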
Kafka API keys are assigned sequentially by Apache and are still under 100 today. To leave room for future standard assignments and avoid colliding with keys other Kafka implementations may already use, custom Redpanda APIs start at 15000, well above the standard range; DescribeRedpandaRoles is the first and takes that base key.

A key that high can't simply be added to the standard dispatch structures. Kafka request routing uses three tables indexed directly by API key and sized to the largest key present: the handler lookup table (handler_interface), the flex-version map (flex_versions), and the per-shard handler_probe vector. Registering key 15000 in them would grow each to ~15000 mostly-empty entries, and because the probe vector is per-shard, that waste is multiplied across cores.

So the Redpanda reserved range is kept separate. max_api_key() still derives only from the standard request_types, leaving those three structures sized to the standard range, and each site gains a second table for the reserved range. That table is rebased, indexed by key - redpanda_api_key_base, so its size tracks the span of the custom range (one entry today). Standard keys take the existing path unchanged; only keys at or above the base fall through to this secondary lookup.
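The two-table lookup can be sketched roughly as follows. Only the base value 15000 and the name redpanda_api_key_base come from this PR; `max_standard_key`, `custom_span`, the string-based `handler_name`, and the table contents are assumptions chosen to keep the example small, and the real handler_for_key returns handler objects rather than names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

constexpr std::int16_t redpanda_api_key_base = 15000;
constexpr std::int16_t max_standard_key = 3; // hypothetical, not the real max_api_key()
constexpr std::size_t custom_span = 1;       // one custom API today

using handler_name = std::string_view;       // stand-in for a real handler pointer

// Dense table for standard keys, indexed directly by API key.
constexpr std::array<handler_name, max_standard_key + 1> standard_lut{
  "produce", "fetch", "list_offsets", "metadata"};

// Rebased table for the reserved range, indexed by key - base.
constexpr std::array<handler_name, custom_span> custom_lut{
  "describe_redpanda_roles"};

inline std::optional<handler_name> handler_for_key(std::int16_t key) {
    // Standard keys take the direct-indexed path.
    if (key >= 0 && key <= max_standard_key) {
        return standard_lut[static_cast<std::size_t>(key)];
    }
    // Keys at or above the base fall through to the rebased table.
    if (key >= redpanda_api_key_base) {
        const auto idx = static_cast<std::size_t>(key - redpanda_api_key_base);
        if (idx < custom_lut.size()) {
            return custom_lut[idx];
        }
    }
    // Anything else is unknown; the caller rejects it as unsupported.
    return std::nullopt;
}
```

In this sketch the two tables hold five entries in total, whereas a single dense table reaching key 15000 would need 15001.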
Force-pushed from fc9aa50 to 335a5bd
Member
Author
Force pushes to address copilot review comments and pull auth out of |
Member
Author
/ci-repeat 1
Collaborator
CI test results
test results on build#85479
test results on build#85564
Shadow linking needs to read the roles on its source cluster. Those roles are
normally read over the Admin API, but some deployment modes do not expose it,
leaving the Kafka API as the only available path to the source cluster. Reading
roles over that path requires a Redpanda-specific Kafka API, which did not exist
before.
Apache assigns Kafka API keys sequentially, and the highest assigned key is still
under 100 today, so placing Redpanda's range well above that keeps it clear of
future standard assignments and of keys other Kafka implementations may already
use. This PR
reserves a Redpanda range starting at key 15000 and makes request dispatch,
flex-version (tagged-field) parsing, and per-key metrics aware of it, without
bloating the standard-range data structures. The plumbing is kept separate from
the standard tables so max_api_key() continues to derive only from standard
Apache keys.
DescribeRedpandaRoles (key 15000) is the first API in the range and the one
shadow linking needs; here it serves as the proving example and simply returns
an empty role list. The API is recognized and dispatched internally but is
intentionally not advertised in ApiVersions yet, so it is not externally
discoverable until it returns real data. That, along with the role_store wiring
and enumeration, lands in the stacked follow-up PR.
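One way to picture "recognized and dispatched but not advertised" is a registry where dispatch sees every key while the ApiVersions listing filters on a flag. This is purely illustrative: `api_entry`, its `advertised` field, `known_apis`, `is_dispatchable`, and `advertised_keys` are hypothetical names, and the PR does not say that the handler tables carry such a flag.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical registry entry; the `advertised` flag is an assumption made
// for this sketch, not a field taken from the PR.
struct api_entry {
    std::int16_t key;
    bool advertised;
};

// A tiny illustrative registry: Produce (0), Fetch (1), ApiVersions (18),
// plus DescribeRedpandaRoles at the reserved base key 15000.
inline const std::vector<api_entry>& known_apis() {
    static const std::vector<api_entry> apis{
      {0, true}, {1, true}, {18, true}, {15000, false}};
    return apis;
}

// Dispatch consults every registered key, advertised or not.
inline bool is_dispatchable(std::int16_t key) {
    for (const auto& api : known_apis()) {
        if (api.key == key) {
            return true;
        }
    }
    return false;
}

// The ApiVersions listing includes only the advertised keys.
inline std::vector<std::int16_t> advertised_keys() {
    std::vector<std::int16_t> keys;
    for (const auto& api : known_apis()) {
        if (api.advertised) {
            keys.push_back(api.key);
        }
    }
    return keys;
}
```

Under this model, flipping the flag for key 15000 is the only change needed to make the API discoverable, which matches the follow-up PR advertising it as its final step.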
This is the base of a two-PR stack; the follow-up (describe-redpanda-roles-api)
wires the cluster role_store into the handler, returns real role data with
name filters, and advertises the API in ApiVersions as its final step.
Backports Required
Release Notes