agentHost: narrow Copilot resume fallback to true empty-session errors#319456
Merged
roblourens merged 5 commits intoJun 5, 2026
Merged
Conversation
The catch block in CopilotAgent._doResumeSession converted *any* -32603
JSON-RPC error from client.resumeSession into a fresh session created with
the same id via client.createSession({ sessionId }). When the underlying
session file fails schema validation (e.g. a session.compaction_complete
event with batchSize: 0, rejected by @github/copilot's schema), this masked
the corruption: the user saw an empty chat instead of an error, and the
original session contents were not surfaced.
Narrow the fallback so we only recreate an empty session when the error
message clearly indicates 'no messages / empty session', and never when the
message looks like corruption / validation / parse failure. All other
-32603s now propagate so the UI and logs reflect the real failure.
Add regression tests that exercise the real _doResumeSession path via a
ResumePathCopilotAgent subclass: one confirming the empty-session fallback
still works, one confirming a corrupted-session error is no longer swallowed.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
There was a problem hiding this comment.
Pull request overview
Narrows the Copilot resume-session fallback so an empty session is only recreated when the SDK error message clearly indicates "no messages"/"empty session", and never when it indicates corruption/validation/parse failures. This prevents users with a corrupted session file (e.g. batchSize: 0 schema validation rejection) from silently being shown an empty new session under the same ID instead of the real error.
Changes:
- Add
getCopilotSdkErrorCode,getErrorMessage, andshouldCreateEmptySessionAfterResumeErrorhelpers and use them in_doResumeSession's catch path. - Replace the broad
errCode !== -32603guard with the narrower message-based check. - Add a
_resumeSession fallbacktest suite exercising the real resume path via a newResumePathCopilotAgenttest subclass and auseRealResumePathoption.
Show a summary per file
| File | Description |
|---|---|
| src/vs/platform/agentHost/node/copilot/copilotAgent.ts | Introduces the SDK-error inspection helpers and narrows the resume fallback condition. |
| src/vs/platform/agentHost/test/node/copilotAgent.test.ts | Adds regression tests covering both the empty-session fallback and the corrupted-session propagation, with a ResumePathCopilotAgent to drive the real _doResumeSession. |
Copilot's findings
- Files reviewed: 2/2 changed files
- Comments generated: 0
The previous narrowing was too aggressive: it required the SDK error
message to match a small whitelist ('no messages', 'empty session', ...)
to trigger the fallback. But the legitimate empty-session case this
fallback exists for is post-'Start Over' / post-truncateSession, where
the SDK currently surfaces a generic 'no events' message that doesn't
match any of those phrases. The previous version would have regressed
Start Over (user truncates all turns, reopens session, gets a hard
error instead of an empty chat).
Invert the heuristic: treat any -32603 from resumeSession as the
empty-session case (preserving Start Over and any future SDK rewording)
UNLESS the message clearly indicates corruption / schema validation /
parse failure / malformed those should propagate so the userinput
sees the real error rather than silently getting an empty session.
Refresh tests: add a Start Over / truncated-session test using a
realistic 'returned no events' message, plus an 'unknown -32603'
defensive test; keep the corruption test.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…t-please-investigate-why-this-sess-56e5215e
PR #312679 (folding markers pattern+flags syntax) removed a doc line from FoldingMarkers in languageConfiguration.ts but didn't regenerate src/vs/monaco.d.ts. Picking up the regeneration so the monaco-d.ts check passes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…t-please-investigate-why-this-sess-56e5215e # Conflicts: # src/vs/platform/agentHost/node/copilot/copilotAgent.ts
Yoyokrazy
approved these changes
Jun 5, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Symptom
Opening an existing Agent Host session sometimes shows an empty chat view, even when the same session opens fine in
copilotCLI. The session contents appear lost.Root cause
CopilotAgent._doResumeSessionhad a too-broad catch block: any-32603JSON-RPC error fromclient.resumeSessionwas treated as "this session has no messages", and the agent fell back to creating a new session with the same ID viaclient.createSession({ sessionId }). The user then saw an empty chat instead of the real error.In the case that prompted this investigation, the real failure was upstream schema validation: a
session.compaction_completeevent in the session file hadtokenDetails[].batchSize: 0, and the installed@github/copilot@1.0.55-3schema declaresexclusiveMinimum: 0for that field, so the SDK rejected the entire file with-32603 Session file is corrupted (...batchSize: Number must be greater than 0). (Upstream relaxed this tominimum: 0in@github/copilot@1.0.56, but the underlying that the writer can emit0while the reader rejects is independent of our fallback.)it issueWhy the fallback exists (don't remove it)
The fallback was introduced (
54caaaf3943, "finally works v.v") for the case where the on-disk session legitimately has zero most commonly after the user invokes "Start Over", which callstruncateSessionand removes every turn. The SDK then refuses to resume an empty session, and the only reasonable recovery is to create a fresh session with the same ID so the user can keep typing into the same chat. This PR preserves that behavior.eventsFix
Invert the heuristic. Treat any
-32603fromresumeSessionas the empty-session case (preserving Start Over and any future SDK rewording of the empty-session message) unless the message clearly indicates corruption / schema validation / parse failure / malformed those propagate so the user sees the real error rather than silently getting an empty session.inputOut of scope
1.0.56
bump fixes the specificbatchSize: 0` schema issue but is a 70-file delta (CLI JS, schemas, native prebuilds, ripgrep, MCP behavior changes) that we don't want to ship the day before a release without smoke testing. If a runtime upgrade is desired, it should be its own PR.Tests
Three regression tests in a new
_resumeSession fallbacksuite insrc/vs/platform/agentHost/test/node/copilotAgent.test.ts:fallback fires, new session is created with the same id.
fallback fires (defensive default).
fallback does NOT fire, the error propagates.
The tests exercise the real
_doResumeSessionpath via a smallResumePathCopilotAgentsubclass (the existingTestableCopilotAgentbypasses it).(Written by Copilot)