KEP-5304: Promote DRA Discoverable Device Metadata API to Beta #6160

Open
alaypatel07 wants to merge 2 commits into kubernetes:master from alaypatel07:kep-5304-beta

@alaypatel07 alaypatel07 commented Jun 4, 2026

Contributor

Assisted-by: Cursor <no-reply@cursor.com>

  • One-line PR description: KEP-5304: adding beta graduation criteria
  • Other comments:

@k8s-ci-robot added the kind/kep label Jun 4, 2026
@k8s-ci-robot added the sig/node, size/L, and cncf-cla: yes labels Jun 4, 2026
@alaypatel07 marked this pull request as draft June 4, 2026 22:39
@k8s-ci-robot added the do-not-merge/work-in-progress label Jun 4, 2026
@alaypatel07 force-pushed the kep-5304-beta branch 2 times, most recently from 8dfd39a to 1237604 June 5, 2026 22:05
@alaypatel07 marked this pull request as ready for review June 5, 2026 22:07
@k8s-ci-robot removed the do-not-merge/work-in-progress label Jun 5, 2026
- Metric name: `dra_metadata_feature_enabled`
- Description: Gauge (0/1) indicating if the feature is enabled on this driver instance
- Aggregation method: current value per node/driver
- Components exposing the metric: DRA driver plugin framework

Contributor

... which may or may not be exposed by a DRA driver using the framework. Admins need to check the documentation of a DRA driver to determine how to collect these metrics.

It is not clear from the KEP that all of these metrics are exposed by the DRA driver. The usual expectation is that these are metrics in a Kubernetes component, which isn't true here.

Contributor Author

I agree. I have added an explicit section a few lines above to reflect this.
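
For illustration only, the 0/1 gauge discussed in this thread could look like the following on a driver's metrics endpoint. This is a minimal sketch, assuming the driver emits samples in the Prometheus text exposition format. The `driver` label, the HELP text, the `gpu.example.com` driver name, and the `render_feature_gauge` helper are hypothetical and are not taken from the KEP or from the DRA driver plugin framework.

```python
def render_feature_gauge(driver: str, enabled: bool) -> str:
    """Render one 0/1 gauge sample in the Prometheus text exposition format."""
    # Metric name as quoted in the KEP excerpt.
    name = "dra_metadata_feature_enabled"
    # A gauge used as a boolean flag: 1 when the feature is on, 0 otherwise.
    value = 1 if enabled else 0
    lines = [
        # HELP text is a paraphrase of the KEP description (assumption).
        f"# HELP {name} Whether the feature is enabled on this driver instance (0/1).",
        f"# TYPE {name} gauge",
        # The "driver" label is a hypothetical choice for illustration.
        f'{name}{{driver="{driver}"}} {value}',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical driver name; each driver decides whether and how to expose this.
    print(render_feature_gauge("gpu.example.com", True), end="")
```

Because the sample comes from the driver rather than from a Kubernetes component, whether it exists at all and where it is scraped from depend on each driver's documentation, as noted in the review comment above.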

These goals will help you determine what you need to measure (SLIs) in the next
question.
-->
This feature adds minimal overhead to pod startup. SLOs expressed as metrics queries:

Contributor

Devil's advocate: what the SLI is covering is only the overhead added to a DRA driver. It does not cover the overhead of doing more work for mounting files (container runtime, kernel).

That additional overhead is harder to measure. My expectation is that it is higher than merely writing some files.

It would be nice if the KEP at least acknowledged that this other overhead exists.

Contributor Author

+1 good point, acknowledged it in the KEP

Assisted-by: Cursor <no-reply@cursor.com>
Signed-off-by: Alay Patel <alayp@nvidia.com>
extending the production code to implement this enhancement.
-->

- `<package>`: `<date>` - `<test coverage>`

Member

Can you fill this in as part of the Beta requirements?

@sohankunkerkar sohankunkerkar left a comment

Member

I have two comments, but the PRR questionnaire is thoroughly filled in for Beta.
LGTM from the PRR shadow side.

Comment thread keps/sig-node/5304-dra-attributes-downward-api/README.md
Comment thread keps/sig-node/5304-dra-attributes-downward-api/kep.yaml Outdated
Co-authored-by: Wendy Ha <139814343+wendy-ha18@users.noreply.github.com>
@k8s-ci-robot

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: alaypatel07, mrunalp
Once this PR has been reviewed and has the lgtm label, please assign soltysh for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Labels

  • cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
  • kind/kep (Categorizes KEP tracking issues and PRs modifying the KEP directory)
  • sig/node (Categorizes an issue or PR as relevant to SIG Node.)
  • size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.)

6 participants