Validate Anthropic extended-thinking request invariants (#253) by jpr5 · Pull Request #256 · CopilotKit/aimock

jpr5 · 2026-06-08T22:45:55Z

Summary

Fixes #253. When Anthropic extended thinking is enabled, the real API 400s if a tool-loop continuation request omits the prior assistant turn's leading thinking block (and its signature). aimock previously accepted such requests and replayed the next fixture — a false green that masked real claude-sdk-python bugs (dropped thinking blocks, last-only retention, un-replayed redacted_thinking).

What changed

validateThinkingInvariants (messages.ts): when thinking is enabled, an in-scope tool-loop continuation turn (a tool_use answered by the next turn's tool_result) must (a) lead with a thinking/redacted_thinking block, (b) carry a non-empty signature on a leading thinking block, (c) carry non-empty data on a leading redacted_thinking block. Text-only/end_turn turns are exempt (no false 400s). Strict mode → real Anthropic 400 error shape; non-strict → warn + replay (existing suites stay green). Untrusted-JSON hardening: null/non-object message and content-block entries are guarded in both the validator and claudeToCompletionRequest (no 500 on malformed input — the real API returns 400).
Round-trip-safe emission: emitted thinking blocks carry a non-empty placeholder signature (signature_delta + non-streaming assembled block) so aimock's own record→replay stays green under strict mode, while the streaming content_block_start signature stays "" per the real Anthropic wire shape. Applied consistently across all three response shapes — text, content+tool, and tool-only (ToolCallResponse gains an optional reasoning field).
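The three invariants above can be sketched as a small pure function. This is a minimal illustration, not the code in messages.ts: the names `findThinkingViolations`, `blocksOf`, and the `Violation` labels are hypothetical, and the sketch assumes the standard Anthropic request shape (`thinking: { type: "enabled", ... }`, `tool_use.id` matched by `tool_result.tool_use_id`).

```typescript
// Illustrative sketch only: names and shapes are assumptions, not aimock's messages.ts.
type Json = Record<string, unknown>;
type Violation = "missing_leading_thinking" | "empty_signature" | "empty_redacted_data";

function isObject(value: unknown): value is Json {
  return typeof value === "object" && value !== null;
}

function isNonEmptyString(value: unknown): boolean {
  return typeof value === "string" && value.length > 0;
}

// Content blocks of a message. Null / non-object entries are dropped, and
// string or missing content yields an empty list instead of throwing.
function blocksOf(message: unknown): Json[] {
  if (!isObject(message) || !Array.isArray(message.content)) return [];
  return message.content.filter(isObject);
}

function findThinkingViolations(body: unknown): { index: number; violation: Violation }[] {
  const found: { index: number; violation: Violation }[] = [];
  // Nothing to enforce unless extended thinking is enabled on the request.
  if (!isObject(body) || !isObject(body.thinking) || body.thinking.type !== "enabled") return found;
  const messages: unknown[] = Array.isArray(body.messages) ? body.messages : [];

  for (let i = 0; i + 1 < messages.length; i++) {
    const turn = messages[i];
    if (!isObject(turn) || turn.role !== "assistant") continue;
    const blocks = blocksOf(turn);

    // In scope only when a tool_use here is answered by a tool_result in the next turn.
    const toolUseIds = blocks.filter((b) => b.type === "tool_use").map((b) => b.id);
    const answered = blocksOf(messages[i + 1]).some(
      (b) => b.type === "tool_result" && toolUseIds.includes(b.tool_use_id),
    );
    if (!answered) continue; // text-only / end_turn turns are exempt

    const lead = blocks[0];
    if (lead?.type === "thinking") {
      if (!isNonEmptyString(lead.signature)) found.push({ index: i, violation: "empty_signature" });
    } else if (lead?.type === "redacted_thinking") {
      if (!isNonEmptyString(lead.data)) found.push({ index: i, violation: "empty_redacted_data" });
    } else {
      found.push({ index: i, violation: "missing_leading_thinking" });
    }
  }
  return found;
}
```

Returning a list of violations instead of throwing keeps the check independent of strict mode, so the caller can decide whether to reject the request or only warn.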

Test plan

30+ unit cases for validateThinkingInvariants (scope, three violation kinds, multi-turn ordering, malformed-input guards), strict-on/off + X-AIMock-Strict override integration, and round-trip tests proving aimock's emitted thinking turns (all shapes) replay cleanly under strict.
pnpm test 3377 passed; pnpm test:drift 80 passed (Anthropic drift contract aligned); prettier + eslint + tsc --noEmit + build all clean.
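The strict-on/off and header-override behaviour exercised by these tests can be pictured roughly as follows. This is a hedged sketch, not aimock's dispatch code: `resolveStrict`, `decide`, and the `Outcome` type are hypothetical names, and the assumption that `X-AIMock-Strict` carries the strings "true" or "false" is mine. The error body follows the documented Anthropic error envelope; the message text is a placeholder.

```typescript
// Illustrative sketch only: hypothetical names, not aimock's actual dispatch code.
type AnthropicError = {
  type: "error";
  error: { type: "invalid_request_error"; message: string };
};

type Outcome =
  | { kind: "reject"; status: 400; body: AnthropicError }
  | { kind: "replay"; warnings: string[] };

// Assumption: the X-AIMock-Strict header carries "true" or "false" and, when
// present and recognised, overrides the configured strict flag.
function resolveStrict(configStrict: boolean, header: string | undefined): boolean {
  const value = header?.trim().toLowerCase();
  if (value === "true") return true;
  if (value === "false") return false;
  return configStrict;
}

// Strict mode rejects with a 400 in the Anthropic error envelope;
// non-strict mode keeps replaying and surfaces the violations as warnings.
function decide(violations: string[], strict: boolean): Outcome {
  if (violations.length === 0) return { kind: "replay", warnings: [] };
  if (strict) {
    return {
      kind: "reject",
      status: 400,
      body: {
        type: "error",
        error: { type: "invalid_request_error", message: violations.join("; ") },
      },
    };
  }
  return { kind: "replay", warnings: violations };
}
```

For example, `decide(["messages.1: missing leading thinking block"], resolveStrict(false, "true"))` would reject even though the configured default is non-strict.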

Follow-ups (out of scope for this PR)

Recorder (stream-collapse.ts) captures only thinking_delta text, not the upstream signature_delta value or redacted_thinking data — recording a real-Anthropic thinking turn loses signature/redacted data.
Validator scopes invariants to the leading content block by design; multi-thinking-block-per-turn signature enforcement is not covered.
Pre-existing fidelity nits: message_start emits full output_tokens; claudeStopReason passes unmapped finish reasons through; content+tool builder emits an empty text block when content is "".
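To make the first follow-up concrete, here is a rough sketch of what a collapse step would need to retain for a thinking block. It is not the code in stream-collapse.ts and not part of this PR: `collapseThinking` and the event types are hypothetical, and the sketch assumes the events belong to a single thinking block and that the last `signature_delta` value wins.

```typescript
// Illustrative sketch only: hypothetical types and names, not stream-collapse.ts.
type ThinkingBlock = { type: "thinking"; thinking: string; signature: string };

type Delta =
  | { type: "thinking_delta"; thinking: string }
  | { type: "signature_delta"; signature: string };

type StreamEvent =
  | { type: "content_block_start"; index: number; content_block: ThinkingBlock }
  | { type: "content_block_delta"; index: number; delta: Delta }
  | { type: "content_block_stop"; index: number };

// Collapse the events of ONE thinking block into an assembled block,
// keeping the signature as well as the thinking text.
function collapseThinking(events: StreamEvent[]): ThinkingBlock {
  let thinking = "";
  let signature = "";
  for (const event of events) {
    if (event.type !== "content_block_delta") continue;
    if (event.delta.type === "thinking_delta") {
      thinking += event.delta.thinking; // text arrives in chunks
    } else {
      signature = event.delta.signature; // assumption: last value wins
    }
  }
  return { type: "thinking", thinking, signature };
}
```

A collapse that only handles `thinking_delta` would return an empty `signature` here, which is exactly the kind of recorded turn that strict validation rejects on replay.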

pkg-pr-new · 2026-06-08T22:46:24Z

npm i https://pkg.pr.new/@copilotkit/aimock@256

commit: ec8424f

Add validateThinkingInvariants: when extended thinking is enabled, a tool-loop continuation assistant turn (a tool_use answered by the next turn's tool_result) must lead with a thinking/redacted_thinking block carrying a non-empty signature (non-empty data for redacted_thinking). Strict mode emits the real Anthropic 400 error shape; non-strict warns and replays. Guards untrusted JSON against null / non-object message and content-block entries. Emits a non-empty placeholder signature on every emitted thinking block (text, content+tool, tool-only shapes) so aimock's own record->replay round-trips stay green under strict mode; the streaming content_block_start signature stays empty per the real wire shape. Tool-only reasoning emission is capability-gated via resolveReasoningForModel, consistent with the other dispatch branches.

Unit tests for validateThinkingInvariants (scope, three violation kinds, multi-turn ordering, malformed-input guards), strict-on/off + header-override integration, round-trip tests across all response shapes, and capability-gating tests for the Claude tool-only reasoning path. Aligns the Anthropic drift contract: non-empty placeholder signature on the assembled block, empty signature on the streaming content_block_start.
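The drift contract described above (empty signature on the streaming `content_block_start`, non-empty placeholder on `signature_delta` and on the assembled block) can be sketched like this. The function names and the placeholder value are hypothetical; only the event and block shapes follow the Anthropic Messages wire format as the PR describes it.

```typescript
// Illustrative sketch only: hypothetical names and placeholder value.
const PLACEHOLDER_SIGNATURE = "placeholder-signature"; // assumption: any non-empty string

type ThinkingBlock = { type: "thinking"; thinking: string; signature: string };

type StreamEvent =
  | { type: "content_block_start"; index: number; content_block: ThinkingBlock }
  | {
      type: "content_block_delta";
      index: number;
      delta:
        | { type: "thinking_delta"; thinking: string }
        | { type: "signature_delta"; signature: string };
    }
  | { type: "content_block_stop"; index: number };

// Streaming shape: the block opens with an empty signature, and the
// non-empty value only arrives later in a signature_delta event.
function streamingThinkingEvents(index: number, reasoning: string): StreamEvent[] {
  return [
    { type: "content_block_start", index, content_block: { type: "thinking", thinking: "", signature: "" } },
    { type: "content_block_delta", index, delta: { type: "thinking_delta", thinking: reasoning } },
    { type: "content_block_delta", index, delta: { type: "signature_delta", signature: PLACEHOLDER_SIGNATURE } },
    { type: "content_block_stop", index },
  ];
}

// Non-streaming shape: the assembled block carries the placeholder directly.
function assembledThinkingBlock(reasoning: string): ThinkingBlock {
  return { type: "thinking", thinking: reasoning, signature: PLACEHOLDER_SIGNATURE };
}
```

Because both shapes end up with a non-empty signature once assembled, a client that echoes the block back in a tool-loop continuation satisfies the strict-mode signature check.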

jpr5 added 3 commits June 8, 2026 16:40

docs: add changelog entry for extended-thinking request-invariant validation

ec8424f

jpr5 force-pushed the blitz/aimock-253-thinking/integration branch from 589340e to ec8424f on June 8, 2026 23:41

jpr5 merged commit 3c07705 into main Jun 8, 2026
23 checks passed

jpr5 deleted the blitz/aimock-253-thinking/integration branch June 8, 2026 23:43

jpr5 mentioned this pull request Jun 9, 2026

Gate reasoning emission on the tool-call-only path across providers #257

Merged

2 tasks
