Harden codegen contracts and stabilize examples#3943
Merged
Conversation
Pins every file the transport and example generators produce for a design covering plain JSON-RPC methods, required/optional request IDs, custom errors with JSON-RPC code mappings, WebSocket streaming, SSE streaming, mixed HTTP+JSON-RPC transports, and a plain HTTP service. A manifest golden catches files appearing or disappearing.
Code generation runs only after design validation, so any internal error past that point is a bug in Goa. Convert the silent failure paths into panics matching the established 'panic(err) // bug' convention: - The five transform-error sites that printed to stdout and emitted files with empty constructor bodies now panic. - rpcTag panics on a missing rpc:tag instead of emitting field number 0, which produced invalid .proto output rejected by protoc far from the cause. - The gRPC message collector panics on a lookup miss instead of leaving a nil message that NPEs at template render time. - extractHeaders/extractCookies assert the primitive-payload mapping invariant instead of silently substituting the whole payload. - collectAttributes panics on a missing parent attribute instead of leaving nil slots, and unknown default-value slice types panic instead of emitting possibly non-compiling literals. - The unreachable string-concat type-name fallback in HTTP request builders is gone. The generated goa binary now recovers panics at its single generation boundary and prints a 'this is a bug in Goa, please report it' message with the stack trace; design errors keep their [file:line] eval path. Close the validation gap the rpcTag panic exposed: custom object error types are rendered as protobuf messages but their fields were never tag-validated, so untagged fields silently generated broken .proto files. gRPC endpoint validation now checks them, and validateMessage no longer stops tag-checking after the first matched attribute. Test DSLs that relied on the lax validation now declare Field tags.
- Delete ConvertFile, a ~190-line verbatim dead copy of ConvertFiles, and factor ConvertFiles' duplicated collection/grouping loops into relevantTypeMaps/groupByConvertPath. A type used as both payload and result no longer generates duplicate numbered conversion functions (ConvertToFoo2); the convert test pinned that bug and now expects a single function. - Remove the dead union-interface machinery: AttributeContext.IsInterface was never set, so its branches in the shared and protobuf transforms were unreachable, as was UnionValTypeName. generateHelper now works on a Dup'd context instead of mutating the caller's context with deferred restores. - Fold transformAttributeHelpers into collectHelpers behind a topLevel flag; drop generateHelper's redundant seen check and transformPrimitive's redundant compatibility check. - Collapse the 12 identical typed-default cases into one reflection loop and remove transformUnion's guards for states the preceding length and compatibility checks already exclude. - Table-ize the four identical credential-scheme cases in BuildSchemeData. - Collapse the no-op package-prefix branch in goTypeDefWithPkgOverride, inline the goTypeDef pass-through, and drop GoTypeDefWithTargetPkg in favor of the package-private implementation. - Drop AttributeTags' ignored first parameter (and attributeTags' unused parent param in http typedef).
- collectUnionTypes/buildUnionTypeData absorb their view-mode twins behind a view flag; buildTypeInits/buildProjections collapse into buildViewConversions with a direction parameter. - MethodData.IsJSONRPCWebSocket is now the single owner of the 'JSON-RPC method streams over WebSocket' fact; endpoint, service, and example generators consume it instead of re-deriving it three different ways, and isJSONRPCSSE is gone. - Unique is implemented via PeekUnique so the probing logic exists once. Generated output is byte-identical; net -116 lines.
HTTP (net -375 LOC): - One element walker replaces the extractPathParams/extractQueryParams/ extractHeaders/extractCookies quartet; the four extractors are thin adapters over it. - One side-parameterized types file builder replaces the mirrored server_types.go/client_types.go, with a single local dedup mechanism. This fixes the bug where a request body seen on an earlier endpoint skipped the later endpoint's WebSocket payload types (no golden design exercises the combination so output is unchanged). - InitArgData construction reuses the embedded AttributeData via small per-kind clone helpers instead of six field-by-field copy loops. - Sweep: always-true guards, finalize-guaranteed nil guards, dead Element.AttributeName, RunHTTPDSL wrapper (26 callers now call expr.RunDSL), per-endpoint template re-parsing (requestInitTemplate's FuncMap was entirely dead), triplicated client-endpoint-init section literals, hand-rolled hasSSE loops and duplicated dict helper. gRPC: - One side-parameterized types file builder replaces the mirrored server_types.go/client_types.go; shared usesAnyType and initArgsFromMetadata helpers replace four copies each of the structpb detection and metadata-arg loops. - protobuf_transform: dead targetPtr/sourcePtr params and unreachable template branches removed; dead TargetWrapperRefs and package-guess heuristics deleted; the live protoc oneof wrapper naming contract is one documented helper (protocOneofWrapperRef); checkNil's subsumed isRef clause removed; unAlias lives next to getPrimitive. - buildInitData drops its ignored parameter; duplicate nested meta condition removed. transformPrimitive no longer returns an always-nil error. Generated output is byte-identical throughout. CLIArgs construction stays in analyze: building it in client_cli would reorder Example() calls and churn example values everywhere.
- codegen/cli owns the shared flag construction (MakeFlags/FlagArgData), endpoint-parser file assembly (EndpointParserFile), and payload-builder file assembly (PayloadBuildersFile); http and grpc client_cli map their transport data and delegate. The HTTP parse-endpoint section keeps its name and template source (jsonrpc still post-processes it until the Phase 3 de-surgery). - codegen/example owns RootPath and APIPkg, replacing five copies of the rootPath/apiPkg boilerplate. The two sites that intentionally use a different Unique suffix pattern keep their behavior. - Location.PackageNameOrDefault replaces the pkgWithDefault duplicate in both transports (14 call sites). Generated output is byte-identical.
…ints JSON-RPC stops editing HTTP's un-rendered template text with strings.Replace/regexp. Each former surgery site is now a structural mechanism in the HTTP templates/data, keyed on a new endpoint-level EndpointData.IsJSONRPC fact (the method-level flag is wrong for methods exposed over both HTTP and JSON-RPC — caught by the kitchen-sink Mixed service): - request_decoder.go.tpl grows IsJSONRPC branches for the RawRequest param and body rewind; the rotted 'return nil,' replacement is gone. - request encoders are always named for JSON-RPC endpoints and a new jsonrpc_request_envelope partial renders the envelope (ID handling from Payload.IDAttribute, no-payload case included), replacing the regex injection, newJSONRPCBody, the minimal-encoder template, and the mid-generation RequestEncoder mutation. The remaining structural response-decoder swap is drift-guarded with a count assertion. - parse_endpoint.go.tpl branches on commandData.JSONRPC for the ConnConfigurer/ConnConfigureFunc variation. - Example CLI templates take FuncSuffix/VarPrefix data; example server templates absorb the JSON-RPC blocks behind JSONRPCServices (always set, typed nil when absent) and the jsonrpc-owned template copies and mount-splice consts are deleted. Only golden change: EncodePingRequest relocated to endpoint order in calc client encode_decode (identical body).
http/codegen.ServicesData carries an explicit jsonrpc flag set by the new NewJSONRPCServicesData constructor; the file constructors JSON-RPC reuses (types, paths, CLI, example CLI, encode/decode) derive output paths, header titles, and CLI import paths from it. The jsonrpc pass-through wrappers (server_types, client_types, paths, client_cli, example_cli, updateHeader) are deleted and the generators call the http constructors directly with the JSON-RPC data. Two latent bugs fixed by construction, visible in the kitchen-sink goldens: generated JSON-RPC client/types.go headers no longer claim to be HTTP, and the JSON-RPC CLI no longer imports nonexistent gen/http client packages for JSON-RPC-only services (updateHeader only rewrote paths containing 'gen/http'). JSON-RPC endpoints no longer generate dead response/error encoders: responses are encoded inline via jsonrpc.MakeSuccessResponse, so the forced emission in http server.go (and jsonrpc's compensating error-encoder section filter) is gone; the dead Encode*Response functions disappear from generated jsonrpc servers. Misc: RunJSONRPCDSL deleted, hasJSONRPCSSE drops its unused params and guards, the two byte-identical server templates are read from http's template FS via the new ReadTemplate accessor, and SSE path derivation uses Service.PathName.
JSON-RPC SSE services generated two parallel stream implementations: a per-endpoint stream in server/stream.go and a service-level stream in server/sse.go used only to send protocol-level errors, each with its own copy of the event writer, header setup, and error sending. One unexported generated base (sseServerStream, new sse_server_stream_base.go.tpl) now owns the event writer and send/sendError; per-endpoint streams embed it keeping their public Send/SendAndClose/SendError signatures; protocol-level errors instantiate the base directly. The dead service-level Send/SendError, its dedupe machinery, and the unused decoder field are gone, and the SSE endpoint-error path in the handler routes through the per-endpoint SendError's design-driven error mapping instead of a hardcoded switch plus fallback. Everything renders into gen/jsonrpc/<svc>/server/sse.go via ServerFiles; client/stream.go renders only from ClientFiles (it was generated twice, once from the misnamed SSEServerFiles which is now deleted along with its generator wiring).
wrapAttr marks every synthesized wrapper message with grpc:wrapped meta at creation. Transforms and helper collection unwrap deterministically via isWrappedAttr/unwrapAttr on the proto side of each pair instead of calling IsCompatible, treating failure as 'must be wrapped', unwrapping an attribute literally named field, and retrying — eight retry sites plus two unconditional unwraps converted. IsCompatible remains at each site as a pure assertion whose failure now means a real bug, and unwrapAttr panics when asked to unwrap a non-wrapper, closing the latent hole where a user type with a genuine 'field' attribute could be silently mis-unwrapped. The marker is invisible to type hashing and .proto rendering; generated output is byte-identical. New table-driven tests pin the marker at each nesting level and the unwrap panic.
The gRPC transform was a ~1,040-line drifted fork of the shared Go transform engine. The engine now exposes explicit extension points (TransformAttrs.Hooks): UnwrapPair/WrapDirective for synthetic wrapper messages, FieldPairAttrs for per-field normalization, ConvertPrimitive for cast/Any conversions, full TransformArray/Map/Union renderer overrides (union parent-message context threaded by parameter, killing the fork's mutable message state, dupTransformAttrs and isUnionMessage), HelperNameAttrs, GuardCondition, ZeroTypeName, ObjectDeref, and an InlineCompositeElems policy flag. Nil hooks are bit-identical to the previous engine behavior for the HTTP, JSON-RPC, and service generators. grpc/codegen now implements those hooks in proto_hooks.go and deletes its copies of transformAttribute, transformObject, transformAttributeHelpers, collectHelpers, walkMatches, transformHelperName and the transformAttrs state struct — 1,040 lines of drift-prone duplication down to 460 lines of genuinely proto-specific code. Two fork bugs die with the drift, both producing non-compiling Go: bytes fields with defaults compared a slice against a zero value (slices only compare to nil — the single golden hunk), and aliased array defaults rendered untyped element literals. Verified end to end with a repro design (nested map-of-array, primitive unions both directions, Any, aliased defaults, stream envelopes) whose generated code now compiles.
Validation codegen no longer mutates the design tree. The new expr.EffectiveValidation computes the validation that effectively applies to an attribute — its own validation merged with the user type alias chain's, in the exact order the old in-place flattening produced — and returns a fully detached expression that shares no memory with the design. This deletes flattenValidations and both write-backs in validationCode, which could leave att.Validation aliased to a user type's own ValidationExpr so that later merges contaminated the shared validation seen by every other consumer. hasValidations is now a side-effect-free predicate. New chained-alias golden fixtures pin the merge order and purity tests assert the design tree is deep-equal before and after generation, with zero churn in existing goldens.
HTTP: ServerTypeNames/ClientTypeNames are now write-once analysis
state (map[string]struct{}) — no file generator mutates them, and a new
regression test renders the full file set twice from one ServicesData
and asserts byte-identical output. PathInit ClientArgs get fresh
InitArgData/AttributeData copies so the client request-builder's
renames can no longer leak into ServerArgs (the latent
signature-vs-body mismatch when Unique() renamed). The redundant
DupAtt before makeHTTPType in buildResponses is gone.
gRPC: analyze no longer writes the protobuf-shaped request, streaming
request, response, and error messages back onto the design expressions;
the shaped attributes are locals threaded explicitly through the
convert/stream/error builders. The .proto generator and CLI read only
ServiceData, so nothing outside analyze depended on the rewrites. A
purity test snapshots the endpoint expressions and asserts analyze
leaves them untouched.
Generated output is byte-identical.
The protobuf-equivalent fix for HTTP: analyze no longer writes alias-flattened/tagged bodies back onto e.Body, response bodies, or error response bodies. Shaped bodies live in an unexported per-service cache keyed by expression pointer, so every consumer (payload/response/ error builders, the body-attribute collection passes, websocket) shares one shaped instance and the call-order-dependent example randomizer stays byte-stable. The discarded makeHTTPType(e.StreamingBody) no-op is gone; streaming bodies keep their never-flattened shipped behavior (flattening them is a separate open decision). openapi/v3 stops projecting static views into the design's response body (the dup only copied the type, then wrote the projection into the original attribute); projection now happens on a detached copy. This fixes a real production divergence: OpenAPI specs generated after the HTTP transport pass lost named alias schemas (v2 dropped definitions and inlined enums, v3 dropped alias descriptions) relative to what the OpenAPI golden tests — which run on pristine roots — pin. A new order-independence test generates specs before and after building transport data and asserts identity (4 of 7 cases failed pre-fix), and a purity test snapshots body pointers, types, and struct:tag meta across nine DSL shapes. Two remaining design-tree mutations are documented for the final normalization phase: service wrapObject (to be sanctioned in NormalizeRoot) and json_schema's in-place projected-type prefixing.
codegen.NormalizeRoot is now the only sanctioned design mutation after eval finalization: it owns the wrapping of anonymous object method payloads/results in named user types (moved out of service analyze, which now panics if it meets an unwrapped raw object), replaying the exact per-service scope seeding so generated names are identical. generator.Generate runs it once after plugin preparation; test harnesses apply it in their shared helpers. Two more mutations die: openapi's ResultTypeRefWithPrefix renamed projected result types in place (now renames a copy, with an instance-keyed registry preserving the sticky first-prefix-wins refs), and makeHTTPTypeRecursive could rewrite the attribute type of the process-global expr.Empty — found by the new purity test, not by inspection. The lasting guarantee is codegen/generator/purity_test.go: for designs covering alias chains, views, WebSocket, SSE, JSON-RPC mixed transport, and gRPC unions/streaming, it deep-snapshots every expression reachable from the root, runs every generator (Service, Transport, OpenAPI, Example), and asserts the design is bit-identical. expr's GeneratedResultTypes append and the example randomizer cache remain documented process-global exclusions. End-to-end: goa gen + goa example output is byte-identical to HEAD~1 across HTTP/gRPC/streaming/viewed-collections including the name-collision case.
Example values were drawn from a single sequential PRNG seeded by the API name, so every value depended on how many draws happened earlier in the generation run. Any change to generator internals — or to an unrelated part of the design — shifted all downstream examples, which is why regenerating with a new goa version churned examples even though the designs were untouched. Examples are now anchored to design identities. ExampleGenerator gains Derived (path-relative streams), Rebased (absolute identity streams), and Field (the identity of one element extracted from a payload or result). User types draw from a stream seeded by their type ID no matter where they are reached from, object fields derive from their field name, array elements and map entries from their index, and union members from their name. The transport walkers anchor param, header, cookie, metadata, and security element examples to the payload field they project, so a standalone element example now equals the value of the same field in the composite payload example. Generators built around caller-supplied Randomizers cannot re-seed and keep the sequential behavior. The one-time golden re-keying this causes is the last one: new unit tests pin order independence, draw-count independence, field locality, and element/composite example agreement.
The JSON schema walk drew every node's example from the shared root stream, so schema examples — including the tool schemas goa-ai builds through this package — depended on generation order. The walk now threads an example generator anchored at each design boundary: type definitions rebase on the user type ID, object properties derive from the property name, array items from the element index, map entries and union members likewise. A schema's composite example now equals the expr-level example of the same type, and exported signatures are unchanged.
Path, query, header, cookie, and SSE Last-Event-ID parameter examples drew from the shared root stream. They now anchor to the payload field they project via ExampleGenerator.Field, matching the transport walkers, so parameter examples agree with the composite payload example and survive generator reorderings.
The v3 schemafier carried one example stream for the whole walk so inline schema examples (notably the OpenAPI 3.2 event content schemas) depended on generation order. The schemafier now forks a derived stream per object property, array item, map entry, and union member, and rebases on the user type identity at type boundaries — same anchoring contract as the shared JSON schema package and the expr layer.
Anonymous body types (inline arrays, maps, primitives) have no type identity, so their schema and content examples drew from the shared sequential stream and changed whenever generation order did. Both the schema walk and the media-type example initialization now rebase on a service.endpoint.role identity, which also makes the schema-level and content-level examples of the same body agree. With this, a consumer-level perturbation test (consuming PRNG draws before generation) leaves an entire real-world gen tree — transport code, OpenAPI 3.0/3.2 documents, and goa-ai tool schemas — byte identical: every generated example is now a pure function of the design.
Keep normalized synthetic method result wrappers out of projected view output, and preserve primitive alias schema descriptions while retaining anchored examples.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR hardens Goa code generation around three contracts that were previously implicit:
The implementation also removes several duplicate or fragile generator paths. JSON-RPC is now modeled as a real HTTP transport variant instead of relying on template-source surgery, gRPC protobuf wrapping uses explicit transform hooks, and HTTP/gRPC type file emission is consolidated behind shared constructors.
Why
Several generator behaviors were correct only because mutation order happened to line up across generators. That made output stability hard to reason about and made downstream changes risky: one generator could mutate shared expressions and affect another generator's examples, validations, or type names.
This PR makes those ownership boundaries explicit:
codegen.NormalizeRootis the only sanctioned post-finalize design mutation.expr.EffectiveValidationcomputes validation without mutating attributes or alias chains.expr.ExampleGeneratorsupports derived, rebased, and field-scoped streams so examples are a pure function of the design element they document.Notable Fixes
Review Notes
The diff is test-heavy. Non-test Go source is smaller overall, while the branch adds coverage and golden fixtures for the new contracts.
Expected generated output changes are limited to:
Test Plan
go test ./...make lint