Skip to content

fix(canonical-family): migrate-uppercase tool + preflight guard for stale lowercase rows#131

Closed
sroussey wants to merge 2 commits into
mainfrom
claude/wonderful-hypatia-b0do4n
Closed

fix(canonical-family): migrate-uppercase tool + preflight guard for stale lowercase rows#131
sroussey wants to merge 2 commits into
mainfrom
claude/wonderful-hypatia-b0do4n

Conversation

@sroussey

@sroussey sroussey commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Summary

The June 8 revert to UPPER normalization (commit 330523d) leaves any operator DB with rows minted during the lowercase window (June 4-8 for sponsor-family, entire history for underwriter-family) unreachable by the resolver. Next ingest silently double-mints canonical ids and orphans identity links, memberships, and operator-installed aliases.

This PR adds:

  1. One-shot migration CLIsec canonical {sponsor,underwriter}-family migrate-uppercase [--apply]. Dry-run by default. SQLite BEGIN EXCLUSIVE / Postgres SERIALIZABLE for concurrent-ingest safety. Refuses (exit 2) when collisions would force a merge; prints dependent-row counts so the operator can run sec canonical … alias first.
  2. Preflight guard — runs on every command. Warns on console.warn for most subcommands; throws SecCliConfigurationError on safety-critical paths (resolve, canonical … alias, … by-family) unless --allow-stale is passed.

Test plan

  • bun test src/commands/canonicalFamilyMigrate.test.ts — seed lowercase, --apply, assert UPPER; seed collision, assert exit 2 + no writes; dry-run; --resolver-version scope.
  • bun test src/cli/StaleCanonicalGuard.test.ts — warn path, throw path, --allow-stale bypass.
  • bun scripts/test.ts — full suite green.

Out of scope

  • Resolver code unchanged.
  • Fact tables / identity links not migrated — only canonical_*_family.normalized_name.
  • No full re-resolve pass.

Generated by Claude Code

claude added 2 commits June 9, 2026 08:35
…amily canonical rows

The June 4-8 lowercase-fold window in normalizeFamilyName left rows whose
normalized_name is lowercase. The subsequent revert to UPPER means the
resolver's exact-match lookup (findByResolverAndName) misses them on every
ingest, silently double-minting canonical family ids and orphaning identity
links, memberships, and operator-installed aliases on the lowercase row.

`sec canonical sponsor-family migrate-uppercase` and the matching
`underwriter-family` command UPPER-fold every offender in a single
transaction. Dry-run by default; --apply writes. Scope to one
resolver_version with --resolver-version.

Concurrency: SQLite uses BEGIN EXCLUSIVE so the writer lock is held from
the planning SELECT through the UPDATE and the residual-row recount.
Postgres uses SERIALIZABLE for the same guarantee. The repository fallback
(InMemoryTabularStorage under TestingDI) mirrors the algorithm shape so
tests exercise the same find / collision / apply logic.

Collisions: when a lowercase row already has a same-resolver-version UPPER
sibling, the UPPER-fold would violate the (resolver_version, normalized_name)
natural key. We refuse to write, dump the lower_id, upper_id, normalized
name, plus membership / link / alias counts as TSV on stderr, and exit 2
so an operator can merge them via `canonical <kind> alias` before retrying.
…ical-family rows

Adds a cheap two-COUNT preflight that runs at the tail of the program's
preAction hook, after DI is fully initialised. The check classifies the
target subcommand into one of three buckets:

* Safety-critical (`resolve`, `canonical … alias`, `spac by-family`,
  `underwriter by-family`) — paths where family resolution round-trip
  correctness is load-bearing. If any lowercase rows remain in
  canonical_sponsor_family / canonical_underwriter_family, the guard
  throws SecCliConfigurationError so the command never runs against a
  silently-double-minting DB. SecCliConfigurationError already has a
  quiet exit path in src/sec.ts (no stack trace).

* Exempt (`init`, the `db` subcommand group, and `migrate-uppercase`
  itself) — chicken-and-egg: these are the commands needed to set up
  or repair the DB.

* Everything else — console.warn (yellow when --color is on) with the
  same body, no exit. The operator sees the warning every invocation
  until the migration is applied.

`--allow-stale` is added to the global options as an explicit override
for the safety-critical throw. A read failure (e.g. tables not yet
created) is treated as "no stale rows" so a first-run `db setup` is
not blocked by a self-check that has nothing to read.
@sroussey sroussey closed this Jun 9, 2026
@sroussey sroussey deleted the claude/wonderful-hypatia-b0do4n branch June 9, 2026 15:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants