Skip to content

protocol: regenerate root onto LSP 3.18 with union-aware jsonrpc2 transport#59

Merged
zchee merged 68 commits into
mainfrom
3.18
Jun 20, 2026
Merged

protocol: regenerate root onto LSP 3.18 with union-aware jsonrpc2 transport#59
zchee merged 68 commits into
mainfrom
3.18

Conversation

@zchee

@zchee zchee commented Jun 10, 2026

Copy link
Copy Markdown
Member

Summary

Converge the repository root onto a single, spec-faithful, generated LSP 3.18 type set, replacing the hand-maintained ~3.15-era package that had drifted from the spec and carried bare interface{} in domain types. The public go.lsp.dev/protocol import path stays stable while the shapes now track the meta-model.

This is 19 commits; they cluster into four areas.

1. Generated LSP 3.18 root (headline — 7eac687)

  • Hand-written types replaced by the genlsp-generated 3.18 type set.
  • Union (or) types are sealed Go interfaces, decoded through a union-aware jsonrpc2.Codec installed via WithCodec.
  • Transport rewired onto the go.lsp.dev/jsonrpc2 fastest branch (pinned via a local replace while that branch settles before its main merge).
  • Server (75) and Client (21) dispatch interfaces hand-ported from the generated method registry; $/cancelRequest handled by CancelHandler. CancelParams.ID and ProgressToken are now Integer|String unions.
  • Dependencies trimmed to library essentials: log/slog replaces go.uber.org/zap; segmentio/encoding/json and go.lsp.dev/pkg/xcontext removed; go.lsp.dev/uri folded into a parity port (File/New/Parse/From, Filename; DocumentURI aliases URI).

2. The generator (internal/genlsp/*)

The metaModel.json → Go type generator that produces the root package: sealed-interface unions, go-json-experiment codecs, jsontext.Value for LSPAny. make generate is idempotent (byte-identical re-emit).

3. JSON hot-path optimization + correctness fixes

  • Zero-alloc union dispatch (e4181b7): replaced per-predicate map[string]jsontext.Value reparsing with a hand-rolled top-level byte scanner over the resident jsontext.Value, plus a fused objectHasAndKnownGuard so each arm guard scans the object once. Measured on linux/amd64 (Xeon 8481C, GOMAXPROCS=8, count=10, p=0.000): decode geomean −17.9% ns/op, −31.6% allocs/op, −33.8% B/op; best case workspace_symbol −69.5% ns / −88.0% allocs. Encode unchanged.
  • Escaped-key decode fix (00c0212): the scanner compared raw member-key bytes against Go literals without decoding JSON escapes, so e.g. {"\u006bind":"create"} misdispatched and silently dropped fields. Escape-free keys keep the byte-for-byte zero-alloc fast path; backslash-bearing keys decode via jsontext.AppendUnquote.
  • CancelHandler clobber fix (dddd2ed): the fastest-transport CancelHandler forwarded an already-cancelled reply-time context, so every successful response through NewServer/NewClient was clobbered into ErrRequestCancelled. Reply-time substitution dropped; genuine cancellations still surfaced by the pre-dispatch ctx.Err() gate.
  • Optional-enum pointer policy (eeb4cc7): per the omitzero policy, optional enums whose zero value is never a valid member drop their *T (22 fields, e.g. Diagnostic.Severity); enums that declare a zero member (CodeActionKind.Empty, TextDocumentSyncKind.None) keep the pointer.

4. Tooling / CI

  • CircleCI → GitHub Actions (6e4feac): test job with Codecov coverage + gotestsum artifacts; lint job fails on any uncommitted diff.
  • go.mod tool directive replaces the separate tools/ module (b9720cb, feedd86); golangci-lint config ported to the v2 schema (90230c4).

Verification

  • make generate is idempotent (empty git diff on re-run).
  • golangci-lint clean; full suite passes under -race.
  • Production-path integration test exercises NewServer/NewClient end to end.
  • Registry-exhaustiveness check + FuzzUnionDispatchOracle / FuzzRoundTrip (10M+ execs on amd64); the dispatch scanner is gated by a differential oracle retaining the prior map-based predicates.
  • Benchmark spine retained under .bench/ (amd64 baseline/after, benchstat, RESULTS.md, provenance).

Notes

  • The jsonrpc2 dependency is pinned via a local replace to its fastest branch; this should be repointed once that branch merges to main.
  • This is a large, intentional rewrite of the root package; the public import path is unchanged but type shapes track the 3.18 meta-model, so downstream callers using removed hand-written types will need updates.

zchee added 12 commits June 4, 2026 09:42
Consolidate CI on GitHub Actions to drop the external CircleCI
dependency. The new workflow runs a test job that produces coverage
uploaded to Codecov plus gotestsum artifacts, and a lint job that runs
make lint and fails on any uncommitted diff.
Now that go.mod tracks tool dependencies via the tool directive, drop
the tools/bin bootstrap and the per-tool install rules. Targets invoke
binaries with `go tool` (gotestsum, gofumpt, golangci-lint,
goimports-rereviser) and `make tools` installs them via `go install
tool`, removing the separate tools module wiring.
golangci-lint is now provided through the go tool directive at v2, whose
configuration schema differs from v1. Port the linter set and settings
to the version: "2" layout so `go tool golangci-lint run` parses the
config instead of erroring on the legacy keys.
Satisfy gofumpt -extra, which requires the leading ctx argument on its
own line when the call spans multiple lines. Pure formatting; no
behavior change.
Converge the repository root onto a single, spec-faithful, generated
source of truth: the genlsp-generated LSP 3.18 type set plus a transport
layer rewired onto the go.lsp.dev/jsonrpc2 fastest branch. The prior
root was a hand-maintained ~3.15-era package that drifted from the spec
and carried bare interface{} in domain types; regenerating it keeps the
public go.lsp.dev/protocol import stable while the shapes track the
meta-model.

Union ("or") types are sealed Go interfaces decoded through a union-aware
jsonrpc2.Codec installed via WithCodec. The Server (75) and Client (21)
dispatch interfaces are hand-ported from the generated method registry,
with every method routed and $/cancelRequest handled by CancelHandler.
CancelParams.ID and ProgressToken are now unions (Integer|String).

Dependencies are trimmed to the library essentials: log/slog replaces
go.uber.org/zap; segmentio/encoding/json and go.lsp.dev/pkg/xcontext are
removed; go.lsp.dev/uri is folded into a parity port (File/New/Parse/From
and Filename, with DocumentURI an alias of URI). jsonrpc2 is pinned via a
local replace while the fastest branch settles before its main merge.

Generation is idempotent (make generate), golangci-lint is clean, and the
suite passes under -race, including a production-path integration test and
a registry-exhaustiveness check.
The vendored jsonrpc2 fastest CancelHandler cancels each request's
per-request context at reply time as cleanup, then forwards that
already-cancelled context to the reply. protocol.CancelHandler's wrapper
substituted ErrRequestCancelled whenever the reply-time context was
cancelled, so on the fastest transport every successful response was
clobbered into an error -- breaking every request issued through the
public NewServer/NewClient constructors.

Drop the reply-time substitution and detach only; genuine cancellations
are surfaced by the dispatchers' pre-dispatch ctx.Err() gate and by
jsonrpc2.CancelHandler. Re-enable TestIntegrationRoundTrip, the
production-path test that documented and was blocked by this defect, so
CI exercises NewServer/NewClient end to end.

Also guard NewServer/NewClient against a nil logger via the existing nop
logger, and correct the ErrContentModified message (it was a copy of the
cancellation text).
Per the omitzero policy (json-omitempty-omitzero.md), an optional enum
whose zero value is never a valid member needs no pointer: ",omitzero" on
a plain value omits exactly the unset case. Generalize the generator's
optionalStructDropsPointer into optionalDropsPointer, extending the
zero-meaningful check from structs to enums (via enumHasZeroMember) and
named scalar aliases.

22 optional enum/alias fields convert from *T to T (e.g. Diagnostic.
Severity). Enums that declare a zero member keep their pointer so a
present zero stays distinguishable from absent: CodeActionKind defines
Empty == "" and TextDocumentSyncKind defines None == 0, so CodeAction.Kind
and friends remain *T. Raw bool/int/string primitives also keep their
pointers (absent vs explicit 0/false/"").

Regeneration stays idempotent and round-trip is preserved. Adds
TestEnumPointerPolicy locking both halves of the policy.
aliasToken's reference branch bound s only to discard it with
"_ = s"; the comma-ok check needs no value, so use the blank form.

fieldDocText and indent each built a string with "+" inside
WriteString, allocating a throwaway per line; write the parts
directly with successive WriteString calls instead.

Both edits are generator-internal and output-neutral: make generate
regenerates a byte-identical tree (empty git diff), so the emitted
protocol package is unchanged. Verified with go build, the -race
suite, and golangci-lint (0 issues).
@zchee zchee force-pushed the 3.18 branch 3 times, most recently from 50e369c to 18e35c9 Compare June 10, 2026 06:34
zchee added 5 commits June 10, 2026 22:06
Union arm discrimination decoded each candidate object into a
map[string]jsontext.Value per predicate call, so a single arm guard
(objectHasKeys && objectKeysKnown) reparsed the same object twice and a
multi-arm union many times. Replace the map helpers in union_runtime.go
with a hand-rolled zero-alloc top-level byte scanner over the already
resident jsontext.Value, and emit a fused objectHasAndKnownGuard from
the generator so each arm guard scans the object once, not twice.

Backward API compatibility is intentionally not preserved; the runtime
helpers are unexported and the change is internal to dispatch.

Measured on linux/amd64 (Xeon 8481C, taskset -c 0-7, GOMAXPROCS=8,
count=10, p=0.000): decode geomean -17.9% ns/op, -31.6% allocs/op,
-33.8% B/op. Best case workspace_symbol -69.5% ns / -88.0% allocs
(1078->129); didchange -57.3% / -82.7% (387->67). Encode is unchanged.

Correctness is gated by a differential oracle that retains the prior
map-based predicates and asserts the scanner and the fused guards agree
on every well-formed input, plus FuzzUnionDispatchOracle and
FuzzRoundTrip (10M+ execs on amd64). A fuzz-found panic on the malformed
input {" is fixed and pinned by a seed and the malformed-input test.

Adds protocol_bench_test.go and testdata/corpus/ as the measurement
spine; .bench/ records the amd64 baseline and after results.
The zero-alloc union-dispatch scanner compared the raw object-member key
bytes (and the "kind" discriminator value) against Go string literals
without decoding JSON escapes. A valid payload that spells a member name
or discriminator with a \uXXXX escape — e.g. {"\u006bind":"create",...},
where \u006b is 'k' so the key is "kind" — failed every literal compare,
so the arm was misdispatched and fields were silently dropped.

Add keyEquals and decode escapes in unquoteJSONString: an escape-free
key keeps the byte-for-byte fast path (the common case stays
zero-alloc), and a key containing a backslash is decoded via
jsontext.AppendUnquote before comparison, matching what the prior
map-based decode produced. Route all five key-compare sites and the
kind-value path through it.

The differential oracle missed this because its inputs had no escaped
keys; the map oracle (json.Unmarshal) decodes them, so feeding escaped
keys makes the oracle diverge from a raw-compare scanner. Add escaped
key/value inputs to the oracle's static set and the fuzz seeds; verified
by breaking keyEquals and watching the oracle fail on exactly those
inputs. Fast-path allocs/op unchanged; amd64 suite + race + fuzz green.

Found by the final-gate code review of the byte-scanner change.
The FuzzUnionDispatchOracle seed block was labeled "JSON-escaped
key/value seeds" but held plain keys, so the fuzzer did not start from
genuinely escaped inputs. Replace them with real \uXXXX-escaped keys so
the seed corpus matches its intent and exercises the unescape path from
the first iteration.

Non-blocking LOW finding from the final-gate review of 5379c4c.
Stabilize benchmark ordering, add dispatch-focused benchmark cases,
and lock backend semantics that optimization probes must preserve.
Generate token-level codecs for measured hot structs and replace
selected optional pointer fields with presence-preserving values.
Keep third-party JSON backends as semantic probes instead of defaults.

Rejected: Promote Segmentio, Goccy, or Sonic without semantic parity.
Rejected: Keep objectKindCase and SymbolInformationSlice encoder probes after regressions.
zchee added 5 commits June 10, 2026 22:08
The LSP meta-model embeds JSDoc-style annotations inside documentation
prose: @SInCE version stamps, {@link}/{@linkcode} cross-references, and
@deprecated tags that duplicate the structured deprecated field. These
render as non-idiomatic godoc that adds no value to Go consumers.

Sanitize the doc text at emit time in genlsp's fieldDocText, the single
choke point every generated comment flows through:

  - @SInCE lines drop entirely; trailing changelog prose is preserved.
    An inline @SInCE stops at the version digits so a sentence-ending
    period stays attached.
  - {@link Target text} unwraps to its display text, or to the bare
    target when no display text is present; {@linkcode} likewise.
  - @deprecated prose drops in favour of the structured Deprecated:
    directive. Every prose tag was already backed by that field with
    identical text, so drop is equivalent to convert with no loss and
    no duplicate directive.

The structured Since:/Deprecated: lines are intentionally retained; the
request targeted the @-prefixed prose tags. @sample is left untouched as
out of scope. The hand-written errors.go is edited directly since the
generator does not emit it.

Most of the .go churn is regenerated output (+334/-827); the durable
change is in internal/genlsp (emit.go sanitizer + TestSanitizeDoc).
The earlier sweep unwrapped {@link Target text} to its plain display
text. Bracket the target symbol instead so godoc and pkg.go.dev render
a real doc link: {@link CodeLensRequest} becomes [CodeLensRequest] and
{@link Uri.scheme scheme} becomes [Uri.scheme].

The target is the first token inside the braces; trailing display text
is discarded since Go doc links cannot carry alternate link text.
Targets that match an exported identifier in the package resolve to a
clickable link; the rest render as harmless literal bracketed text.

Change is in genlsp's unwrapLink (now bracket-based) plus the .go churn
from regeneration; go vet stays clean (no malformed-link diagnostics).
Anti-slop pass over the branch found no dead code (staticcheck U1000
clean, go vet clean); the genlsp doc-sanitizer added this session
carried three fixable linter nits:

  - inlineSinceRe: use d shorthand (regexpSimplify) and note that it
    diverges from sinceRe on purpose -- stopping at the last version
    digit is what keeps a sentence-terminating period attached.
  - fix "favour" -> "favor" misspelling in the sanitizeDoc comment.
  - generator stale-cleanup test: exec.CommandContext(t.Context())
    instead of exec.Command (noctx).

Remaining golangci findings are pre-existing branch code (hugeParam in
the build-time generator, ptrToRefParam on decoder output params,
cyclop, a generated-coupled rename) and are left untouched: applying
them is churn or ripples into generated output without a behavior or
perf justification.
NewNull and NewNullable had zero references in generated code,
hand-written code, and tests. Nullable values are obtained by
decoding the wire (UnmarshalJSON) or as the absent zero value;
nothing constructs the null or value states programmatically.

Removing them narrows the public API to the decode-oriented use
the package actually exercises. Consumers that later need to build
a Nullable to send can reintroduce a constructor deliberately. The
unexported set/null/value fields and the remaining IsZero/IsNull/
Get/MarshalJSON/UnmarshalJSON surface stay; staticcheck and go vet
remain clean and the generated files are unaffected.
Drop jsontext duplicate-name tracking and UTF-8 validation on the wire
path (LSP peers are trusted: duplicates decode last-wins, invalid UTF-8
mangles to U+FFFD) and convert Optional, Nullable, and DiagnosticTags
to MarshalJSONTo/UnmarshalJSONFrom so the fresh-decoder-per-field cost
disappears. Settle the pre-existing lint debt that surfaced once the
Makefile lint rule pointed at the renamed .golangci.yaml again.

amd64 (n=10): geomean -4.8% ns/op; token-codec encode categories -25%
(completion_list -26.0%, publish_diagnostics -27.3%, workspace_symbol
-25.1%); decode hot categories -5..-8%; allocs byte-identical across
all 39 benchmarks. semantic_tokens and initialize_request regress ~3%
(alignment-class, allocs unchanged); both paths are replaced by the
M2/M3 byte-level milestones, so the wireOptions half is kept.

Rejected: reverting the wireOptions half over those ~3% regressions.
@zchee zchee force-pushed the 3.18 branch 2 times, most recently from e653075 to 5a26dbd Compare June 10, 2026 13:49
zchee added 2 commits June 10, 2026 22:50
Generate byte-level decoders and direct append encoders for the LSP
benchmark spine so hot payloads avoid reflection/jsontext overhead while
retaining generated ownership and raw-value validation contracts.

Benchmark current:
  same host, CPU set, Go version, benchmark names, and count after this
  generated-codec implementation plus default.pgo.

Benchstat excerpt:
  sec/op geomean: 17.85 us -> 7.208 us (-59.63%)
  B/s geomean: 126.3 MiB/s -> 312.8 MiB/s (+147.70%)
  B/op geomean: 4.298 KiB -> 4.875 KiB (+13.42%)
  allocs/op geomean: 14.50 -> 5.726 (-60.51%)

Validation:
  go test -count=1 ./...
  git diff --cached --check

Tradeoff:
  Decode/initialize_request remains +14.26% vs baseline.
  G010 raw LSPAny validation costs +2.20% vs M6/default.pgo,
  but preserves Marshal's invalid JSON error contract.
Emit protocol generator outputs with .gen.go names so generated
sources are easy to identify. Stale cleanup continues to remove old
marker-bearing files safely.

Generated protocol contents are byte-identical across the rename. Only
the generator naming policy and filename-sensitive tests change.
@zchee zchee mentioned this pull request Jun 20, 2026
zchee added 8 commits June 20, 2026 14:46
The generated SemanticTokens/SemanticTokensPartialResult appendLSP
methods build their object framing inline and call only the live
appendUint32JSONArray primitive, so the hand-written
appendSemanticTokensJSON, appendSemanticTokensDataObject, and
semanticTokensObjectLenHint wrappers have no production caller; they
were a prototype the generator superseded. staticcheck/deadcode miss
them because their unit tests still reference them, so remove the
wrappers and those three sub-tests while keeping the live
appendUint32* primitive coverage.

spike_transport_test.go was the Phase-0 migration gate and drove a
test-only spikeCodec; the real codec.go now wires lspCodec, and the
spike's end-to-end behavior is already covered by integration_test.go
(real lspCodec dispatch) and backend_semantics_test.go (RawMessage
passthrough and lifetime), so the scaffold is redundant.
unionMember.IsObject was written at seven construction sites but never
read; object dispatch keys off Token, KindConst, Required, and AllKeys
instead. renderedEnum.CustomVals was set from SupportsCustomValues but
never consulted by renderEnumerations. legacyURIRef duplicated
unionURIWrapperType (both "URI") since the URI-semantics unification,
so every "== unionURIWrapperType || == legacyURIRef" check and the
duplicate aliasType assignment compared one value twice. Collapse the
constant and drop both dead fields.

Factor the byte-for-byte identical UnmarshalJSONFrom shim emitted by
renderByteValueEntry and renderByteSliceValueEntry into one
renderUnmarshalJSONFromShim helper. Fix the misspelled, now-stale
"rquire JSON path" error and Load's doc, since the input may be a
path or an http(s)/file URL. Clarify the union-decoder "case 'n'"
branch as the duplicate-case guard it is, not dead code.

Verified output-neutral: make generate reproduces byte-identical
*.gen.go (git diff --exit-code clean) and go test ./internal/genlsp
passes.
Convert TestObjectGuardEmission and TestHotOptionalField (genlsp) and
TestIsZeroOmitValue (protocol) from []struct{name ...} to the project
standard map[string]struct{} keyed by the case name, dropping the now
redundant name field. The benchmark tables in union_runtime_bench_test
are intentionally left as slices: map iteration would randomize b.Run
sub-benchmark order and add noise to benchstat comparisons.
Pin the requested jsonrpc2 revision and migrate protocol handlers to the
new direct-return contract without adding an adapter layer.

Add focused regression coverage for cancellation, malformed cancel
notification semantics, custom fallback retention, and reentrant behavior.
Document the borrowed request contract on ClientHandler fallback and add
focused regression coverage for direct-return dispatch, fallback cloning,
parse errors, cancellation, and connection cleanup under the jsonrpc2 API.
Reorder newClientHandlerConnPair to take ctx before *testing.T so it
follows the Go context-as-first-argument convention and satisfies the
context-as-argument linter. Pure signature reshuffle; no behavior change.
The jsonrpc2 fallback paths legitimately return (nil, nil) as the "no
result, no error" sentinel, and NewServer's returned context mirrors
NewClient for a symmetric public API. Add targeted nolint directives so
nilnil and unparam stop flagging these correct-by-contract shapes,
documenting the intent inline instead of refactoring around non-bugs.
@zchee zchee force-pushed the 3.18 branch 2 times, most recently from 38fcd20 to 4e736a0 Compare June 20, 2026 15:31
zchee added 6 commits June 21, 2026 00:46
The file now exercises both ClientHandler and the shared jsonrpc2
conn-pair helpers reused by the new client and server dispatch tests,
so the broader name reflects its scope.
Add coverage for the previously untested client.go and server.go
dispatch paths, lifting every sending method and dispatcher to ~100%.

Two seams drive the tests. A recording jsonrpc2.Conn fake exercises
each *client and *server sending method synchronously, asserting the
method constant, the Notify-vs-Call shape, the result decode, and the
error-return branch. Dispatch routing runs over an in-memory pipe
through the production ClientHandler and ServerHandler, since
*jsonrpc2.Request has no exported constructor and must be built by the
transport scanner.

Notification-arm parse branches and the unreachable non-standard
Request parse branch are left uncovered: the transport swallows
notification errors and only delivers framer-validated JSON, so neither
is reachable through the wire.
Bring context.go and log.go to full coverage.

For context.go, round-trip WithLogger/LoggerFromContext and
WithClient/ClientFromContext, including the nop-logger fallback, the
type-assertion-failure branch, and the absent-client case.

For log.go, a recording jsonrpc2.Stream drives loggingStream Read,
Write, and Close (success and error) plus the full logCommon matrix:
both directions across Call, Notification, and Response (success and
error), the nil-message and nil-log guards, and both request-timing
pairings. A request and its response travel in opposite directions, so
the Call/Response timing is paired counter-directionally to exercise
both the setClient/client and setServer/server bookkeeping; pairing on
one direction consults the wrong map and recovers a zero req.
Three client dispatch tests read only the recorded method via
"method, _, _, _ := conn.snapshot()", tripping dogsled with three
blank identifiers. Add a focused fakeConn.lastMethodName accessor and
use it; these tests assert routing only, so the narrower call reads
clearer than discarding params and counts.

newServerHandlerConnPair took a handler parameter but ignored it,
hardcoding MethodNotFoundHandler while its doc comment claims it
mirrors newClientHandlerConnPair, which honors its fallback. Thread
the renamed fallback param into ServerHandler so the helper matches
its contract; behavior is unchanged as all call sites pass
MethodNotFoundHandler. Clears unparam and revive.
The dispatchRecordingServer and fakeConn test doubles return
(nil, nil) from their routing stubs by design: the dispatch tests
assert only which method was reached, never the result, so a
fabricated zero value would add noise the assertions never read.
maintidx fires the same way on the large, intentionally exhaustive
method tables that drive every route. Scope both linters off
_test.go rather than littering 30 stubs with directives.

This subsumes the two inline //nolint:nilnil directives added in
a0acafa; nolintlint then flags them as unused, so drop them and
keep their contract rationale as plain comments.
@zchee zchee merged commit af6ba92 into main Jun 20, 2026
9 checks passed
@zchee zchee deleted the 3.18 branch June 20, 2026 15:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/client area/protocol area/server kind/feature New feature or request kind/specification Language Server Protocol specification

Development

Successfully merging this pull request may close these issues.

2 participants