feat(network): add structured failure traces to fail fulfillment #231

Open
fakedev9999 wants to merge 1 commit into main from fakedev9999/error-trace-fulfiller
@fakedev9999

Member

Summary

Adds an optional structured failure trace field to FailFulfillmentRequestBody so open-source prover clusters can send useful failure context when reporting failed fulfillments.

This is the producer-side wire contract for network-services PR #1191. The field number intentionally matches the receive-side proto there:

  • FailFulfillmentRequestBody.error_trace = 4

The field carries bounded, sanitized JSON encoded as UTF-8 bytes. Senders may omit it when no useful context is available.
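To illustrate why the field number has to match on both sides, here is a minimal sketch of how an optional bytes field numbered 4 is laid out under the standard protobuf wire format. The helper names encode_varint and encode_bytes_field are hypothetical and exist only for this example; the real encoding is done by the generated types, not by hand.

```rust
/// Encode an unsigned integer as a protobuf base-128 varint.
fn encode_varint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        // Low 7 bits with the continuation bit set.
        out.push((value as u8 & 0x7f) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Encode a length-delimited field (wire type 2): key, length, payload.
/// Hypothetical helper for illustration only.
fn encode_bytes_field(field_number: u32, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(payload.len() + 6);
    // The key is (field_number << 3) | wire_type.
    let key = (u64::from(field_number) << 3) | 2;
    encode_varint(key, &mut out);
    encode_varint(payload.len() as u64, &mut out);
    out.extend_from_slice(payload);
    out
}

fn main() {
    let trace = br#"{"proving_failure":{"reason":"GPU OOM","task_type":"ProveShard"}}"#;
    let encoded = encode_bytes_field(4, trace);
    // Field 4 with wire type 2 yields the key byte (4 << 3) | 2 = 0x22.
    assert_eq!(encoded[0], 0x22);
    // The payload is shorter than 128 bytes, so its length fits in one byte.
    assert_eq!(encoded[1] as usize, trace.len());
    assert_eq!(&encoded[2..], &trace[..]);
    println!("key byte: {:#04x}, total bytes: {}", encoded[0], encoded.len());
}
```

A receiver whose proto used a different field number would see an unknown field and skip it, which is why the two repos need to agree on 4.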

Why

External prover operators run the open-source sp1-cluster fulfiller, which depends on this repo's spn-network-types. Today that fulfiller can reduce cluster failure details to a coarse ProofRequestError, but it has no wire field for the original structured context. As a result, mainnet SPN requests can end up with opaque failures like Unknown Error.

This PR gives the fulfiller a compatible field and helper for constructing the trace from cluster extra_data.

What Changed

  • Adds optional bytes error_trace = 4 to FailFulfillmentRequestBody.
  • Regenerates the Rust network types.
  • Adds spn_network_types::error_trace, including:
    • ErrorTrace::from_cluster_extra_data(...)
    • sanitization/redaction
    • per-field and total size bounds
    • wire serialization helpers
  • Handles the real cluster proving-failure shape where task_type is serialized as a string enum name, for example:
    • {"proving_failure":{"reason":"GPU OOM","task_type":"ProveShard"}}
  • Documents the manual sync requirement with network-services PR #1191.
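As a rough illustration of the per-field and total size bounds listed above, the sketch below truncates each value on a UTF-8 character boundary and then enforces an overall budget. The limits, the function names, and the choice to drop fields once the budget is exhausted are all assumptions for this example; the PR description does not state the actual values or policy used by spn_network_types::error_trace.

```rust
/// Hypothetical limits; the real bounds are not stated in this PR.
const MAX_FIELD_BYTES: usize = 256;
const MAX_TOTAL_BYTES: usize = 1024;

/// Truncate `s` to at most `max_bytes` without splitting a UTF-8 character.
fn truncate_utf8(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}

/// Apply the per-field bound, then stop adding fields once the total
/// budget (names plus values) would be exceeded.
fn bound_fields<'a>(fields: &[(&'a str, &'a str)]) -> Vec<(&'a str, &'a str)> {
    let mut total = 0usize;
    let mut kept = Vec::new();
    for &(name, value) in fields {
        let value = truncate_utf8(value, MAX_FIELD_BYTES);
        let cost = name.len() + value.len();
        if total + cost > MAX_TOTAL_BYTES {
            break;
        }
        total += cost;
        kept.push((name, value));
    }
    kept
}

fn main() {
    let long_reason = "x".repeat(2000);
    let fields = [("reason", long_reason.as_str()), ("task_type", "ProveShard")];
    let bounded = bound_fields(&fields);
    assert_eq!(bounded.len(), 2);
    assert_eq!(bounded[0].1.len(), MAX_FIELD_BYTES);
    println!("kept {} fields", bounded.len());
}
```

Truncating on a character boundary keeps the result valid UTF-8, which matters because the field is documented as UTF-8 JSON bytes.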

Scope

This PR only adds the wire contract and producer-side helper in network.

The follow-up sp1-cluster PR will:

  • pin to this network rev
  • wire bin/fulfiller to convert proof_requests.extra_data into error_trace
  • send it on fail_fulfillment

The receive/store side is handled separately in network-services PR #1191.

Validation

  • cargo build --features network
  • cargo test -p spn-network-types error_trace --lib → 44 passed
  • cargo +nightly fmt --check → clean
  • cargo clippy --features network -D warnings → clean
  • git diff --check → clean

…Trace helper

Producer-side support so the open-source sp1-cluster fulfiller can ship a
structured, sanitized failure trace to SPN, wire-compatible with network-services
PR #1191 (succinctlabs/network-services#1191).

- proto: FailFulfillmentRequestBody.error_trace = 4 (additive, optional bytes;
  same field number/shape as network-services)
- port the canonical ErrorTrace authoring/sanitization module from #1191:
  from_cluster_extra_data (real string task_type + integer), finalize (redact
  secrets, bound sizes, drop-if-useless), to_wire_bytes; regex-free sanitizer
- regenerate committed proto types (types.rs)

The error_trace module is intentionally byte-identical (modulo this repo's
rustfmt) to network-services so the two non-synced proto repos stay consistent.
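The redaction and drop-if-useless steps named in the commit message could look roughly like the sketch below. It is not the ported module: the marker list, the key=value heuristic, and the names redact, is_sensitive, and finalize_sketch are assumptions chosen to show a regex-free approach using only the standard library.

```rust
/// Hypothetical list of key fragments treated as sensitive.
const SENSITIVE_MARKERS: [&str; 4] = ["secret", "token", "password", "api_key"];

/// Return true if a key name looks like it holds a secret.
fn is_sensitive(key: &str) -> bool {
    let key = key.to_ascii_lowercase();
    SENSITIVE_MARKERS.iter().any(|marker| key.contains(marker))
}

/// Replace the value of any `key=value` word whose key looks sensitive.
/// Note that runs of whitespace are collapsed to single spaces.
fn redact(input: &str) -> String {
    input
        .split_whitespace()
        .map(|word| match word.split_once('=') {
            Some((key, _)) if is_sensitive(key) => format!("{key}=[REDACTED]"),
            _ => word.to_string(),
        })
        .collect::<Vec<_>>()
        .join(" ")
}

/// Redact, then return None when nothing useful is left to send.
fn finalize_sketch(reason: &str) -> Option<String> {
    let cleaned = redact(reason);
    if cleaned.is_empty() {
        None
    } else {
        Some(cleaned)
    }
}

fn main() {
    let cleaned = finalize_sketch("auth failed API_KEY=abc123 on shard 3");
    assert_eq!(
        cleaned.as_deref(),
        Some("auth failed API_KEY=[REDACTED] on shard 3")
    );
    // A whitespace-only reason carries no context, so no trace is produced.
    assert_eq!(finalize_sketch("   "), None);
    println!("{cleaned:?}");
}
```

Returning None for an empty result maps naturally onto leaving the optional error_trace field unset.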

@0xernesto left a comment

Contributor


Looks like you need to add error_trace: None to make CI pass
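For context on that CI failure: adding a field to a Rust struct breaks every existing struct literal that lists its fields explicitly, with compiler error E0063 (missing field). The sketch below uses a simplified stand-in for the generated type; every field other than error_trace, and the helper body_without_trace, are assumptions for illustration and may not match the real generated code.

```rust
/// Simplified stand-in for the generated message type.
/// Field names other than `error_trace` are assumptions.
#[derive(Debug, Clone, PartialEq, Default)]
struct FailFulfillmentRequestBody {
    nonce: u64,
    request_id: Vec<u8>,
    error: Option<i32>,
    /// New optional field: None means no trace is sent.
    error_trace: Option<Vec<u8>>,
}

/// Hypothetical call site that has no trace to report.
fn body_without_trace(nonce: u64, request_id: Vec<u8>) -> FailFulfillmentRequestBody {
    FailFulfillmentRequestBody {
        nonce,
        request_id,
        error: None,
        // Omitting this line fails to compile with E0063.
        error_trace: None,
    }
}

fn main() {
    let body = body_without_trace(1, vec![0xab; 32]);
    assert!(body.error_trace.is_none());

    // Alternative: fill the remaining fields from Default, so later
    // additions do not break this literal.
    let via_default = FailFulfillmentRequestBody {
        nonce: 1,
        request_id: vec![0xab; 32],
        ..Default::default()
    };
    assert_eq!(body, via_default);
    println!("{body:?}");
}
```

Whether to list error_trace: None explicitly or rely on Default at each call site is a style choice; the explicit form makes it obvious where a trace could be attached later.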
