fix(coderd/x/chatd): enable Anthropic prompt caching for Bedrock-served models #26358

Open
rodmk wants to merge 1 commit into coder:main from rodmk:fix/bedrock-prompt-caching

rodmk commented Jun 12, 2026

Problem

chatd gates Anthropic-specific handling — prompt-cache breakpoints, the provider-executed tool guard, and tool-history sanitization — on model.Provider() == fantasyanthropic.Name ("anthropic"). fantasy's bedrock provider is the Anthropic provider wrapped with WithName("bedrock"), so Bedrock-served Claude models report Provider() == "bedrock" and fail that check: addAnthropicPromptCaching never runs, no cache_control reaches the wire, and usage reports cache_read_input_tokens == 0 on every request. The tools+system prefix re-bills at full input price on every step — the cache-read discount is never realized on Bedrock — and the provider-tool guard and sanitizer are skipped for Bedrock chats too.
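The failing check can be reproduced in isolation. The sketch below is a minimal illustration, not chatd's code: `model`, `anthropicName`, and `strictGate` are hypothetical stand-ins, and the only behavior taken from the description is that a Bedrock-served model reports `"bedrock"` from `Provider()`.

```go
package main

import "fmt"

// model is a hypothetical stand-in for a language model handle.
// Only the Provider method matters for this illustration.
type model struct{ provider string }

func (m model) Provider() string { return m.provider }

// anthropicName mirrors the value the description gives for
// fantasyanthropic.Name. The constant name itself is an assumption.
const anthropicName = "anthropic"

// strictGate models the pre-patch check: exact equality on the provider id.
func strictGate(m model) bool {
	return m.Provider() == anthropicName
}

func main() {
	direct := model{provider: "anthropic"}
	viaBedrock := model{provider: "bedrock"} // same client, different reported name

	fmt.Println("direct Anthropic passes gate:", strictGate(direct))
	fmt.Println("Bedrock-served passes gate:  ", strictGate(viaBedrock))
}
```

Under these assumptions the second call returns false, which is the condition that keeps the caching and sanitizing paths from running for Bedrock chats.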

Fix

Add chatsanitize.IsAnthropicFamily(provider) (true for both anthropic and bedrock) and use it at the caching gate and the four guard/sanitizer checks. Matching on the provider id is sufficient: fantasy's bedrock provider can only serve Anthropic models — a non-Anthropic Bedrock model would arrive through a different provider with a different Name.
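A predicate of this shape could look roughly like the following. This is a sketch under stated assumptions rather than the patch itself: the constant names are illustrative, the real implementation may reference fantasy's own provider constants, and `"openai"` is used only as an example of a provider id outside the family.

```go
package main

import "fmt"

// Provider ids as given in the description. The constant names are
// illustrative assumptions, not identifiers from the patch.
const (
	providerAnthropic = "anthropic"
	providerBedrock   = "bedrock"
)

// IsAnthropicFamily reports whether a provider id should receive
// Anthropic-specific handling (caching, tool guard, sanitizer).
func IsAnthropicFamily(provider string) bool {
	switch provider {
	case providerAnthropic, providerBedrock:
		return true
	default:
		return false
	}
}

func main() {
	// "openai" is a hypothetical non-Anthropic provider id.
	for _, p := range []string{"anthropic", "bedrock", "openai"} {
		fmt.Printf("%-9s -> %v\n", p, IsAnthropicFamily(p))
	}
}
```

Keeping the membership list in one function means each call site asks the same question, so adding or removing a provider id later touches a single place.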

Verification

Built a server with this patch and exercised it end-to-end: on the second turn, cache reads go from 0 to the full 9773-token prefix, and coder's linters and the affected-package tests are clean.

End-to-end — real API + real Bedrock

Drove a 2-turn chat through the real POST /api/experimental/chats + /{id}/messages endpoints on a build with this patch (fantasy bedrock provider, Claude Opus), reading usage from each assistant message. Before this change both turns report cache_creation_input_tokens = 0 and cache_read_input_tokens = 0. After:

| turn | input_tokens | cache_creation_input_tokens | cache_read_input_tokens |
|---|---|---|---|
| 1 | 2 | 9773 | 0 |
| 2 | 2 | 27 | 9773 |

Turn 1 writes the tools+system prefix; turn 2 reads it back at the cache-read rate.
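To put the table in proportion, the share of each turn's prompt that was read from the cache can be computed from the three counters. The sketch below assumes the counters are disjoint and sum to the full prompt size; `turnUsage` and `cachedFraction` are hypothetical names, and no pricing is modeled.

```go
package main

import "fmt"

// turnUsage holds the three usage columns from the table above.
type turnUsage struct {
	Input         int
	CacheCreation int
	CacheRead     int
}

// cachedFraction returns the share of prompt tokens served from the cache.
// Assumption: the three counters are disjoint and sum to the full prompt.
func cachedFraction(u turnUsage) float64 {
	total := u.Input + u.CacheCreation + u.CacheRead
	if total == 0 {
		return 0
	}
	return float64(u.CacheRead) / float64(total)
}

func main() {
	turns := []turnUsage{
		{Input: 2, CacheCreation: 9773, CacheRead: 0},
		{Input: 2, CacheCreation: 27, CacheRead: 9773},
	}
	for i, u := range turns {
		fmt.Printf("turn %d: %.2f%% of prompt tokens read from cache\n",
			i+1, 100*cachedFraction(u))
	}
}
```

With the reported numbers, turn 2 reads 9773 of 9802 prompt tokens from the cache, or about 99.7%.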

Lint + tests — coder's pinned toolchain (go 1.26.4, golangci-lint v1.64.8)
$ golangci-lint run ./coderd/x/chatd/chatloop/... ./coderd/x/chatd/chatsanitize/...      # clean
$ paralleltestctx -custom-funcs="testutil.Context,chatdTestContext" ./coderd/x/chatd/... # clean
$ go run ./scripts/intxcheck ./coderd/x/chatd/...                                        # clean
$ typos <changed files>                                                                  # clean
$ go test ./coderd/x/chatd/chatsanitize/... ./coderd/x/chatd/chatloop/...
ok  github.com/coder/coder/v2/coderd/x/chatd/chatsanitize
ok  github.com/coder/coder/v2/coderd/x/chatd/chatloop

Tests added: the IsAnthropicFamily predicate, the caching gate for a bedrock-provider model, and a chatsanitize table case proving the sanitizer now runs for bedrock (the case fails on main). The existing "leaves other providers unchanged" case still passes, confirming non-Anthropic providers stay excluded.

make gen / frontend / proto are no-ops for this change — no generated, TypeScript, or proto files are touched.

AI disclosure

Authored primarily with AI assistance (Claude), per coder's AI contribution guidelines. I reviewed the diff, ran the checks above, and verified the runtime behavior myself; I am accountable for the contents.

…ed models

chatd applies Anthropic-specific request handling (prompt-cache breakpoints,
the provider-executed tool guard, tool-history sanitization) only when
model.Provider() == fantasyanthropic.Name. fantasy's bedrock provider wraps the
Anthropic client but reports Provider() == "bedrock", so Bedrock-served Claude
models are silently excluded: addAnthropicPromptCaching never runs, no
cache_control is emitted, and usage reports cache_creation/cache_read = 0 on
every request — the prefix re-bills at full price each step.

Add chatsanitize.IsAnthropicFamily(provider), true for both providers, and use
it at the caching gate and the four guard/sanitizer checks. Adds unit tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
github-actions Bot added the community label Jun 12, 2026
github-actions Bot commented Jun 12, 2026


All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

rodmk commented Jun 12, 2026


I have read the CLA Document and I hereby sign the CLA

cdrci2 added a commit to coder/cla that referenced this pull request Jun 12, 2026
rodmk marked this pull request as ready for review June 12, 2026 21:27
rodmk commented Jun 12, 2026


Happy to tweak the approach. We'd love to get this issue fixed to unblock wider agents dogfooding without breaking the bank 💸
