Summary
testmd run reports overall_status: "passed" and exit code 0 while a step actually failed (steps.failed >= 1, steps.passed = 0). A genuinely failing test is reported as green. For a CI gate this is the worst failure mode: a real regression passes silently. Present in 0.4.1; the readonly_fallback path is still present in 0.4.4.
Mechanism
After a checkpoint analyzer fails on replay (see #85), kane hits a readonly lock conflict and falls back to a readonly_fallback, and in that path the failed run is mislabeled passed:
"reason":"analyzer_failed: @ step 7" # a checkpoint flakes on replay (#85)
[lock] concurrent session — running in readonly mode (no commit)
replay_decisions:2 author_decisions:0 # PURE REPLAY — no re-author, no commit attempted
"reason":"readonly_fallback"
=> {"type":"test_md_summary","overall_status":"passed","steps":{"passed":0,"failed":1,"skipped":1}}, exit 0
Important (corrected from an earlier version of this report): this is not caused by concurrent CI jobs. Serializing the matrix to max-parallel: 1 reproduces it identically, and the trace shows author_decisions:0 — a pure single-threaded replay with no commit attempted still hits the "concurrent session" readonly lock. So the lock detection appears spurious/stale (no real second session), and a pure replay should not be touching a write/session lock at all.
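Repro
A *_test.md with a visual checkpoint that flakes on replay (#85, "Multi-checkpoint step (multiple ANALYZE + combining ASSERT) authors green but fails deterministically on replay with null assert values").
kane-cli testmd run <file> --agent --headless (single run, no concurrency).
When the checkpoint fails on a replay, the run logs readonly_fallback and emits overall_status:"passed" with steps.failed:1, steps.passed:0, exit 0.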
Expected
If any non-optional step fails (steps.failed > 0) or none passed (steps.passed < 1), overall_status must be "failed" and the exit code non-zero. A readonly_fallback must never upgrade a failed run to passed. (Separately: a pure replay shouldn't acquire/conflict on a write lock, and the "concurrent session" detection shouldn't fire with no second session active.)
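The rule above can be sketched as a small pure function. This is an illustration only, not kane-cli's actual implementation: the name expected_outcome and the fallback_reason parameter are hypothetical, and the steps mapping mirrors the counts in the test_md_summary object from the trace. Because the summary does not say which steps are optional, the sketch treats every failed step as non-optional.

```python
from typing import Mapping, Optional, Tuple


def expected_outcome(steps: Mapping[str, int],
                     fallback_reason: Optional[str] = None) -> Tuple[str, int]:
    """Return (overall_status, exit_code) derived only from step counts.

    Hypothetical sketch: fallback_reason (e.g. "readonly_fallback") is
    accepted but deliberately ignored, so it can never upgrade a failed run.
    """
    failed = steps.get("failed", 0)
    passed = steps.get("passed", 0)
    # Any failed step, or no passed step at all, makes the run a failure.
    if failed > 0 or passed < 1:
        return "failed", 1
    return "passed", 0
```

With the counts from the trace, expected_outcome({"passed": 0, "failed": 1, "skipped": 1}, "readonly_fallback") returns ("failed", 1).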
Impact
High — overall_status + exit code are what CI keys on. We've had to parse test_md_summary and fail the job whenever steps.failed > 0 || steps.passed < 1, ignoring kane's own status.
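A minimal sketch of that workaround in Python. It assumes the summary shows up in the captured output as a JSON object with "type":"test_md_summary" somewhere on a single line, as in the trace above; how the output is captured is left to the CI job. The helper names find_summary and gate are illustrative and not part of kane-cli.

```python
import json
from typing import Optional


def find_summary(output: str) -> Optional[dict]:
    """Return the last test_md_summary object found in the captured output."""
    decoder = json.JSONDecoder()
    summary = None
    for line in output.splitlines():
        start = line.find("{")
        while start != -1:
            try:
                # Parse one JSON value starting at this brace, ignoring any
                # trailing text on the line (such as ", exit 0").
                obj, _ = decoder.raw_decode(line[start:])
            except json.JSONDecodeError:
                obj = None
            if isinstance(obj, dict) and obj.get("type") == "test_md_summary":
                summary = obj
                break
            start = line.find("{", start + 1)
    return summary


def gate(output: str) -> int:
    """Exit code for CI: 0 only if no step failed and at least one passed."""
    summary = find_summary(output)
    if summary is None:
        return 1  # assumption: a missing summary is treated as a failure
    steps = summary.get("steps", {})
    if steps.get("failed", 0) > 0 or steps.get("passed", 0) < 1:
        return 1
    return 0
```

A CI wrapper could pass the captured run output to gate and call sys.exit with the result, so the job status follows the step counts rather than overall_status.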
Env
kane-cli 0.4.1, testmd run --agent --headless [--retry], single run and CI matrix (both). Related: #85 (the analyzer flake that triggers this).