fix(coderd/x/chatd): enable Anthropic prompt caching for Bedrock-served models#26358
Open
rodmk wants to merge 1 commit into
…ed models
chatd applies Anthropic-specific request handling (prompt-cache breakpoints, the provider-executed tool guard, tool-history sanitization) only when model.Provider() == fantasyanthropic.Name. fantasy's bedrock provider wraps the Anthropic client but reports Provider() == "bedrock", so Bedrock-served Claude models are silently excluded: addAnthropicPromptCaching never runs, no cache_control is emitted, and usage reports cache_creation/cache_read = 0 on every request, so the prefix re-bills at full price each step.
Add chatsanitize.IsAnthropicFamily(provider), true for both providers, and use it at the caching gate and the four guard/sanitizer checks. Adds unit tests.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
All contributors have signed the CLA ✍️ ✅
Author
I have read the CLA Document and I hereby sign the CLA
cdrci2 added a commit to coder/cla that referenced this pull request Jun 12, 2026
Author
Happy to tweak the approach. We'd love to get this issue fixed to unblock wider agents dogfooding without breaking the bank 💸
dnguy078 approved these changes Jun 12, 2026
Problem
chatd gates Anthropic-specific handling (prompt-cache breakpoints, the provider-executed tool guard, and tool-history sanitization) on `model.Provider() == fantasyanthropic.Name` (`"anthropic"`). fantasy's `bedrock` provider is the Anthropic provider wrapped with `WithName("bedrock")`, so Bedrock-served Claude models report `Provider() == "bedrock"` and fail that check: `addAnthropicPromptCaching` never runs, no `cache_control` reaches the wire, and `usage` reports `cache_read_input_tokens == 0` on every request. The tools+system prefix re-bills at full input price on every step, so the cache-read discount is never realized on Bedrock, and the provider-tool guard and sanitizer are skipped for Bedrock chats too.
Fix
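A minimal sketch of the gate before and after this change, assuming plain string provider ids. The constants and `oldGate` are stand-ins for illustration (the real code compares against `fantasyanthropic.Name`), `"openai"` is only an example of a non-Anthropic id, and the actual `chatsanitize.IsAnthropicFamily` may be written differently:

```go
package main

import "fmt"

// Stand-in provider ids taken from the PR text. In the real code these come
// from the fantasy provider packages; they are hard-coded here for illustration.
const (
	providerAnthropic = "anthropic"
	providerBedrock   = "bedrock"
)

// oldGate mirrors the previous check: only the literal Anthropic provider id
// passes, so a model reporting "bedrock" is excluded.
func oldGate(provider string) bool {
	return provider == providerAnthropic
}

// IsAnthropicFamily reports whether the provider id is backed by the
// Anthropic client, either directly or through the Bedrock wrapper.
func IsAnthropicFamily(provider string) bool {
	switch provider {
	case providerAnthropic, providerBedrock:
		return true
	default:
		return false
	}
}

func main() {
	// "openai" is a hypothetical non-Anthropic provider id for contrast.
	for _, p := range []string{"anthropic", "bedrock", "openai"} {
		fmt.Printf("%-9s old=%-5t new=%t\n", p, oldGate(p), IsAnthropicFamily(p))
	}
}
```

Matching on an explicit list of ids keeps the predicate closed by default: any provider not named in it keeps the old, non-Anthropic behavior.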
Add `chatsanitize.IsAnthropicFamily(provider)` (true for both `anthropic` and `bedrock`) and use it at the caching gate and the four guard/sanitizer checks. Matching on the provider id is sufficient: fantasy's bedrock provider can only serve Anthropic models, so a non-Anthropic Bedrock model would arrive through a different provider with a different `Name`.
Verification
Built a server with this patch and exercised it end-to-end: cache reads go from `0` to the full prefix on the second turn, and coder's linters + the affected-package tests are clean.
End-to-end: real API + real Bedrock
Drove a 2-turn chat through the real `POST /api/experimental/chats` + `/{id}/messages` endpoints on a build with this patch (fantasy bedrock provider, Claude Opus), reading `usage` from each assistant message. Before this change both turns report `cache_creation_input_tokens = 0` and `cache_read_input_tokens = 0`. After: turn 1 writes the tools+system prefix; turn 2 reads it back at the cache-read rate.
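The two-turn check above can be expressed as a small predicate over the `usage` counters. This is a sketch only: the `Usage` struct, `parseUsage`, and `cacheEngaged` are hypothetical helpers, only the two JSON field names come from the description, and the token counts in `main` are placeholders rather than measured values:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Usage holds the two cache counters named in the PR description.
// The struct is hypothetical; only the JSON field names come from the text.
type Usage struct {
	CacheCreationInputTokens int64 `json:"cache_creation_input_tokens"`
	CacheReadInputTokens     int64 `json:"cache_read_input_tokens"`
}

// parseUsage decodes a usage object; fields absent from the input stay zero.
func parseUsage(raw string) (Usage, error) {
	var u Usage
	err := json.Unmarshal([]byte(raw), &u)
	return u, err
}

// cacheEngaged reports whether a two-turn exchange shows the expected pattern:
// turn 1 writes a cache entry and turn 2 reads from it.
func cacheEngaged(turn1, turn2 Usage) bool {
	return turn1.CacheCreationInputTokens > 0 && turn2.CacheReadInputTokens > 0
}

func main() {
	// Placeholder payloads for illustration, not real measurements.
	t1, err := parseUsage(`{"cache_creation_input_tokens": 100, "cache_read_input_tokens": 0}`)
	if err != nil {
		panic(err)
	}
	t2, err := parseUsage(`{"cache_creation_input_tokens": 0, "cache_read_input_tokens": 100}`)
	if err != nil {
		panic(err)
	}
	fmt.Println("cache engaged:", cacheEngaged(t1, t2))
}
```

With both counters at zero on both turns, as reported before the change, `cacheEngaged` returns false.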
Lint + tests — coder's pinned toolchain (go 1.26.4, golangci-lint v1.64.8)
Tests added: the `IsAnthropicFamily` predicate, the caching gate for a `bedrock`-provider model, and a `chatsanitize` table case proving the sanitizer now runs for `bedrock` (the case fails on `main`). The existing "leaves other providers unchanged" case still passes, confirming non-Anthropic providers stay excluded.
`make gen` / frontend / proto are no-ops for this change: no generated, TypeScript, or proto files are touched.
AI disclosure
Authored primarily with AI assistance (Claude), per coder's AI contribution guidelines. I reviewed the diff, ran the checks above, and verified the runtime behavior myself; I am accountable for the contents.