archscan

Curated architecture pitfall scanners for AI agents and humans.

What is archscan?

archscan is a library of architecture pitfall documents. Each pitfall is a self-contained detection check with profile references, grep hints, severity, and a concrete fix. The corpus is plain Markdown; the automated archscan CLI is a Python orchestrator that calls the Claude Agent SDK or the OpenAI Codex Python SDK. You can feed the documents to any AI coding tool that can read Markdown, or run the checks yourself.

Installation

Prerequisites

Python ≥ 3.10
uv — install with curl -LsSf https://astral.sh/uv/install.sh | sh

One-line install (Claude Code)

bash <(curl -s https://raw.githubusercontent.com/ebarti/archscan/main/install.sh)

Or clone and install manually:

git clone https://github.com/ebarti/archscan.git ~/.archscan
bash ~/.archscan/install.sh

This:

Clones the scanner library to ~/.archscan/
Runs uv sync --frozen in that directory to materialize the locked dependency set into ~/.archscan/.venv/ (claude-agent-sdk, codex-app-server-sdk, transitive deps)
Adds the /archscan slash command to Claude Code globally
Symlinks the archscan CLI (a Python entry point) to ~/.local/bin/. The launcher detects ~/.archscan/.venv/ and re-execs under it automatically, so the symlink works from any shell

Authentication

archscan needs API access to whichever backend you choose:

--cli claude (default): set ANTHROPIC_API_KEY (or one of CLAUDE_CODE_USE_BEDROCK=1, CLAUDE_CODE_USE_VERTEX=1, CLAUDE_CODE_USE_FOUNDRY=1 with the matching cloud credentials).
--cli codex: run CODEX_HOME=~/.codex-archscan codex login once. The pipeline isolates Codex state under ~/.codex-archscan so it does not touch your normal Codex configuration.

The slash command is Claude Code-specific, but the terminal CLI can run the pipeline with either Claude Code or Codex.

Usage

Claude Code slash command -- open any repo and run:

/archscan python

Terminal CLI -- run the default Claude backend, or select Codex explicitly:

archscan python
archscan python --cli codex

archscan extracts your architecture profile, runs all pitfall checks in parallel, and writes a prioritized report outside the scanned repo by default. Relative --report paths resolve under ARCHSCAN_OUTPUT_DIR; by default that directory is $XDG_STATE_HOME/archscan/reports/, or $HOME/.local/state/archscan/reports/ when XDG_STATE_HOME is unset.
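The report path rules above can be sketched in a few lines of Python. This is an illustration only, not archscan's actual code: the function names are hypothetical, and it assumes that ARCHSCAN_OUTPUT_DIR, when set, simply replaces the XDG-based default directory.

```python
from pathlib import Path


def default_report_dir(env: dict[str, str]) -> Path:
    """Default report directory: XDG state dir if set, else ~/.local/state."""
    xdg_state = env.get("XDG_STATE_HOME")
    if xdg_state:
        return Path(xdg_state) / "archscan" / "reports"
    home = Path(env.get("HOME") or Path.home())
    return home / ".local" / "state" / "archscan" / "reports"


def resolve_report_path(report: str, env: dict[str, str]) -> Path:
    """Absolute paths are used as-is; relative ones land under the output dir."""
    path = Path(report)
    if path.is_absolute():
        return path
    # Assumption: an explicit ARCHSCAN_OUTPUT_DIR overrides the default directory.
    override = env.get("ARCHSCAN_OUTPUT_DIR")
    root = Path(override) if override else default_report_dir(env)
    return root / path
```

Under these assumptions, resolve_report_path("audit.md", dict(os.environ)) would point inside the state directory rather than the scanned repo.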

Options

--model <model>          Model for scan agents (default: claude-opus-4-7)
--effort <level>         Reasoning effort: low|medium|high|max (default: high)
--profile-model <model>  Model for profile extraction (default: value of --model)
--profile-effort <level> Effort for profile extraction (default: max)
--merge-model <model>    Model for report merging (default: value of --model)
--merge-effort <level>   Effort for report merging (default: high)
--output <format>        markdown-backlog|github-issues|json|linear
--parallel <n>           Max parallel agents (default: 10)
--report <file>          Output path (absolute path or relative path under ARCHSCAN_OUTPUT_DIR)
--cli <claude|codex>     Backend CLI for the full pipeline (default: claude)
--stability-check <sev>  Re-check fail findings at/above critical|high|medium|low (default: none)
--dev                    Preserve the temp work dir after the run
--max-capability         Shorthand: run every stage on claude-opus-4-7 at effort=max
                         (per-stage flags still win if set alongside it)
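How the per-stage defaults and --max-capability interact can be sketched as below. This is a hypothetical model of the flag table, not the CLI's real argument handling: resolve_stages and StageSettings are invented names, None stands for "flag not passed", and it assumes an explicit --model also counts as a per-stage flag that wins over --max-capability.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_MODEL = "claude-opus-4-7"
# Documented per-stage effort defaults when --max-capability is not used.
FALLBACK_EFFORT = {"scan": "high", "profile": "max", "merge": "high"}


@dataclass(frozen=True)
class StageSettings:
    model: str
    effort: str


def resolve_stages(
    model: Optional[str] = None,
    effort: Optional[str] = None,
    profile_model: Optional[str] = None,
    profile_effort: Optional[str] = None,
    merge_model: Optional[str] = None,
    merge_effort: Optional[str] = None,
    max_capability: bool = False,
) -> dict[str, StageSettings]:
    """Resolve model and effort per stage; explicit flags always win."""
    scan_model = model or DEFAULT_MODEL

    def effort_for(stage: str, explicit: Optional[str]) -> str:
        if explicit is not None:
            return explicit
        return "max" if max_capability else FALLBACK_EFFORT[stage]

    return {
        "scan": StageSettings(scan_model, effort_for("scan", effort)),
        # --profile-model and --merge-model default to the value of --model.
        "profile": StageSettings(profile_model or scan_model, effort_for("profile", profile_effort)),
        "merge": StageSettings(merge_model or scan_model, effort_for("merge", merge_effort)),
    }
```

For example, resolve_stages(max_capability=True, merge_effort="low") would keep the merge stage at low effort while the other stages run at max.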

Examples:

archscan python --model haiku --effort low --parallel 20
archscan python --output github-issues --report audit.md
archscan python --stability-check high
archscan python --cli codex

When --cli codex is selected, archscan runs Codex with CODEX_HOME=~/.codex-archscan so the scan pipeline does not write into the user's regular Codex state directory. Bootstrap that isolated home once with CODEX_HOME=~/.codex-archscan codex login; if auth.json is missing there, archscan fails fast with that exact command.
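A minimal sketch of the fail-fast check described above, assuming only that the isolated home must contain auth.json. The function name and the error wording are hypothetical; the real CLI may phrase its message differently.

```python
from pathlib import Path

LOGIN_COMMAND = "CODEX_HOME=~/.codex-archscan codex login"


def check_codex_home(codex_home: Path) -> None:
    """Raise with the bootstrap command when the isolated home has no auth.json."""
    if not (codex_home / "auth.json").is_file():
        raise RuntimeError(
            f"No Codex credentials found in {codex_home}. Run: {LOGIN_COMMAND}"
        )
```

Calling check_codex_home(Path.home() / ".codex-archscan") before dispatching any agent keeps a missing login from surfacing as a failure deep inside the pipeline.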

--cli selects the backend SDK:

claude: uses claude-agent-sdk (Anthropic's agent runtime library).
codex: uses codex-app-server (the official OpenAI Codex Python SDK).

Both SDKs are installed automatically by install.sh. The legacy claude and codex CLI binaries are no longer required.

Per-repo configuration (optional)

Add .archscan.yml to your repo root to set defaults for your team:

scanner: python
model: opus
effort: high
output: github-issues
stability-check: high
exclude:
  - some-pitfall-name

With this file, contributors just run /archscan or archscan with no arguments. CLI flags always override .archscan.yml values.
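The precedence rule can be illustrated with plain dictionaries. This sketch assumes the YAML file has already been parsed into a dict and that an unset CLI flag is represented as None; merge_settings is a hypothetical helper, not part of archscan.

```python
from typing import Any, Optional


def merge_settings(
    file_config: dict[str, Any],
    cli_flags: dict[str, Optional[Any]],
) -> dict[str, Any]:
    """Start from .archscan.yml values, then let explicit CLI flags override."""
    merged = dict(file_config)  # copy so the parsed file config is not mutated
    for key, value in cli_flags.items():
        if value is not None:  # None means the flag was not passed
            merged[key] = value
    return merged
```

With the example file above, passing only --model haiku on the command line would change the model and leave effort, output, and stability-check at their file values.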

Updating

bash ~/.archscan/install.sh

The installer pulls the latest version, runs uv sync --frozen to reconcile the lockfile, and re-copies the slash command and CLI shim.

For development:

cd ~/Github/archscan
uv sync          # match pyproject.toml + uv.lock
uv run python -m unittest discover tests
uv run archscan python   # invoke the entry point in the project venv

Pipeline

archscan implements a single 7-stage pipeline. Architecture decision record: docs/adr/2026-04-17-single-pipeline-architecture.md. Runtime contract: spec/PIPELINE.md.

                   [1/7] repo knowledge      -> canonical-index.json + profile.md
                         (cold: full canonical indexer,
                          warm: reconcile prior knowledge + repo diff)
                                 |
                                 v
                   [2/7] overlay resolver    -> overlay-<pitfall>.json
                         (Python, deterministic, one per pitfall, parallel)
                                 |
                        +--------+--------+
                        v                 v
             needs_expansion?         no missing concepts
                        |                 |
                        v                 v
         [3/7] targeted expansion     (skip)
               (LLM, one per missing,
                parallel subset)
                        |                 |
                        +--------+--------+
                                 v
                   [4/7] evidence scan      -> scan-<pitfall>.md
                         (LLM, one per pitfall, parallel)
                                 |
                        +--------+
                        v
              primary verdict = fail?
                        |
                        v
                   [5/7] challenge pass     -> challenge-<pitfall>.md
                         (LLM, adversarial falsification,
                          parallel subset — label hidden)
                                 |
                                 v
                   [6/7] report merger      -> final report
                         (LLM, serial, tools disabled)
                                 |
                                 v
                   [7/7] memory consolidation
                         (Python, deterministic, after report)

archscan requires every scanner to ship concepts.json, and every pitfall to declare concept_ids:. There is no fallback pipeline.
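The two conditional branches in the diagram (expansion only when an overlay reports needs_expansion, challenge only after a fail verdict) can be sketched as a small planning function. This is a reading of the diagram, not the orchestrator's code; the function name is hypothetical, and any verdict string other than "fail" (such as "pass") is an assumption.

```python
def per_pitfall_llm_stages(needs_expansion: bool, primary_verdict: str) -> list[str]:
    """List the LLM stages that run for one pitfall, in pipeline order."""
    stages = []
    if needs_expansion:
        stages.append("3/7 targeted expansion")
    # The evidence scan always runs, once per pitfall.
    stages.append("4/7 evidence scan")
    if primary_verdict == "fail":
        # Only failing primary verdicts are sent to adversarial falsification.
        stages.append("5/7 challenge pass")
    return stages
```

Stages 1, 2, 6, and 7 are left out because they run once per invocation or are deterministic, so they do not branch per pitfall.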

Prompts and tools used by the pipeline:

prompts/canonical-indexer.md — cold-start knowledge extraction, produces markdown profile plus canonical-index.json
prompts/knowledge-reconciler.md — warm-start update of persisted knowledge using the previous knowledge base plus repo diff
bin/archscan-overlay — deterministic Python resolver mapping pitfall concept_ids to canonical-index symbols, ranked evidence, token budgets, and relevant typed memory
prompts/overlay-expansion.md — only when an overlay's needs_expansion is true
prompts/scan-runner.md — one instance per pitfall, consumes overlay + optional expansion and may emit a reusable knowledge contribution block
prompts/challenge-pass.md — one instance per fail primary verdict, adversarial
prompts/report-merger.md — combines primary + challenge into one report, handles upheld | weakened | overturned outcomes
bin/archscan-knowledge-store consolidate-run — deduplicates durable memory, marks changed-file facts for revalidation, and writes memory ROI metrics

Knowledge evolution: archscan persists repo knowledge under $ARCHSCAN_CACHE_DIR/knowledge/. The first run does a cold canonical index; later runs diff the repo against the previous manifest and either reuse the stored knowledge snapshot or warm-reconcile it via prompts/knowledge-reconciler.md. Pitfall scans can append new reusable facts to the same knowledge base, and those facts are recorded in a per-run audit ledger. Accepted facts are also promoted into typed memory.json records. Challenge outcomes create durable false_positive_pattern or episodic_finding records, and future overlays inject only records relevant to the same pitfall or overlapping concepts. Because stage 4 dispatches scans in parallel, only scans that have not started yet can benefit from a contribution merged earlier in the same invocation; already-running scans keep the overlay they started with.
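The cold / reuse / warm-reconcile decision described above might look like the following. The manifest shape (a mapping from file path to content hash) and both function names are assumptions made for illustration; the real manifest.json layout is not documented here.

```python
from typing import Optional


def choose_knowledge_mode(
    previous: Optional[dict[str, str]],
    current: dict[str, str],
) -> str:
    """Pick how stage 1 obtains repo knowledge for this run."""
    if previous is None:
        return "cold"   # no stored manifest: run the full canonical indexer
    if previous == current:
        return "reuse"  # nothing changed: reuse the stored knowledge snapshot
    return "warm"       # repo changed: reconcile prior knowledge with the diff


def changed_files(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """Paths that were added, removed, or modified between two manifests."""
    paths = previous.keys() | current.keys()
    return sorted(p for p in paths if previous.get(p) != current.get(p))
```

The changed-file list is the kind of input a warm reconcile would need in order to revalidate only the facts tied to files that moved.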

Persisted knowledge storage contains snapshot.json, profile.md, manifest.json, memory.json, metrics.json, and runs/<run-id>.json. memory.json stores typed records such as semantic_fact, negative_finding, false_positive_pattern, procedural_hint, repo_invariant, and episodic_finding.
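As a rough sketch of how typed memory could be filtered before injection into an overlay (same pitfall, or overlapping concepts), consider the following. The record fields are hypothetical; only the kind names come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryRecord:
    kind: str                    # e.g. "semantic_fact", "false_positive_pattern"
    pitfall: str                 # pitfall that produced the record (assumed field)
    concept_ids: frozenset[str]  # concepts the record is about (assumed field)
    text: str


def relevant_records(
    records: list[MemoryRecord],
    pitfall: str,
    concept_ids: set[str],
) -> list[MemoryRecord]:
    """Keep records from the same pitfall or sharing at least one concept."""
    wanted = set(concept_ids)
    return [r for r in records if r.pitfall == pitfall or r.concept_ids & wanted]
```

Filtering this way keeps unrelated facts out of each overlay, which matters because overlays carry a token budget.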

Cost notes: The pipeline adds a challenge pass (~1 call per fail), an occasional expansion pass, and a deterministic memory consolidation pass. Counterbalances: the deterministic overlay cuts LLM work in stage 2, the persisted knowledge store avoids re-indexing unchanged repos, typed memory reduces repeated false positives, and content-addressed cache keys (bin/archscan-cache-key) eliminate duplicate work across reruns. The cache covers the canonical-index, evidence-scan, and challenge-pass stages under $ARCHSCAN_CACHE_DIR (default $XDG_CACHE_HOME/archscan); set ARCHSCAN_NO_CACHE=1 to bypass both cache replay and persisted repo knowledge for that run.
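Content-addressed caching in general works by hashing everything that determines a stage's output. The sketch below shows the idea with SHA-256 over a canonical JSON payload; it is not the algorithm used by bin/archscan-cache-key, whose actual inputs are not described here, and both function names are hypothetical.

```python
import hashlib
import json


def cache_key(stage: str, inputs: dict[str, str]) -> str:
    """Derive a stable key from the stage name and its input digests."""
    # sort_keys plus fixed separators give a canonical serialization, so the
    # same inputs always hash to the same key regardless of dict ordering.
    payload = json.dumps(
        {"stage": stage, "inputs": inputs},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cache_enabled(env: dict[str, str]) -> bool:
    """ARCHSCAN_NO_CACHE=1 disables cache replay for the run."""
    return env.get("ARCHSCAN_NO_CACHE") != "1"
```

Because the key depends only on content, an unchanged pitfall scanned against an unchanged repo maps to the same key on a rerun and can be replayed instead of recomputed.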

Manual Usage

If you are not using the automated CLI pipeline, or prefer to run archscan step-by-step, follow the manual workflow below.

1. Pick a scanner

Browse scanners/ and choose the scanner matching your architecture:

| Scanner | Description | Pitfalls | Categories |
| --- | --- | --- | --- |
| `python` | Generic Python systems: async services, workers, SQLite persistence, subprocesses, credentials, logging, and schema evolution | 34 | 10 |
| `python-vulnerabilities` | Python web/API and backend vulnerabilities requiring cross-file trust-boundary reasoning | 7 | 7 |
| `python-agentic-runtime` | Python LLM/orchestrator runtimes: routing, MCP/tooling, autonomous queues, evals, and self-improvement loops | 51 | 13 |
| `distributed-architecture` | Distributed / event-driven / service-based Python architectures: pattern failure modes from Ford et al.'s Software Architecture Patterns, Antipatterns, and Pitfalls (orchestration, compensation, event contracts, supervisor, stamp coupling, sinkhole, stovepipe, externalized state) | 35 | 9 |

2. Extract your architecture profile

Give your AI agent these two documents:

prompts/profile-extractor.md -- tells the agent what to extract
scanners/<scanner>/profile.md -- tells it what to look for in your specific architecture

The agent reads your codebase and produces a compact architecture profile (~1,000 tokens). This profile is shared context for all subsequent scans.

3. Run the stages

Run prompts/canonical-indexer.md with the scanner's profile.md and concepts.json to produce profile.md plus canonical-index.json.
Run bin/archscan-overlay for each pitfall to produce overlay-<pitfall>.json.
For overlays with needs_expansion: true, run prompts/overlay-expansion.md.
Run prompts/scan-runner.md per pitfall using the profile, overlay, and optional expansion.
Run prompts/challenge-pass.md for every primary fail.
Run prompts/report-merger.md across the scan and challenge results.
Run bin/archscan-knowledge-store consolidate-run after the report if persisted memory is enabled.

For an example of what the final report looks like, see examples/mestre-audit-report.md.

Available Scanners

| Scanner | Description | Pitfalls | Categories |
| --- | --- | --- | --- |
| `python` | Generic Python systems with async concurrency, SQLite persistence, subprocesses, credentials, logging, and durable queues | 34 | async-python, sqlite, credential-management, security-boundaries, observability, pydantic-evolution, session-management, queue-autonomous-operation, cross-cutting |
| `python-vulnerabilities` | Python web/API and backend vulnerability scanner for tenant isolation, auth propagation, cache scope, webhook replay, SSRF, transaction side effects, and background-job privilege drift | 7 | tenant-isolation, authorization-propagation, cache-scope, outbound-http, webhook-intake, transaction-integrity, background-jobs |
| `python-agentic-runtime` | Python agentic runtimes with LLM calls, routing, MCP/tooling, autonomous queues, evals, and self-improvement workflows | 51 | async-python, multi-llm-orchestration, pattern-selection-routing, mcp-subprocess, security-boundaries, self-improvement-evals, pydantic-evolution, session-management, queue-autonomous-operation, cross-cutting |
| `distributed-architecture` | Distributed / event-driven / service-based Python architectures with pattern failure modes drawn from Ford et al.'s Software Architecture Patterns, Antipatterns, and Pitfalls | 35 | architecture-by-implication, contract-coupling, cross-cutting, cross-cutting-duplication, event-contract, layering-architecture, mcp-subprocess, queue-autonomous-operation, service-granularity |

Each scanner directory contains:

profile.md — extraction guide tailored to that architecture class
_quick-scan.md — one-liner grep commands for rapid triage (see Manual triage below)
concepts.json — machine-readable concept registry required by the runtime pipeline. 63 concepts for python, 23 for python-vulnerabilities, 99 for python-agentic-runtime.
Individual pitfall .md files — one per pitfall, following spec/FORMAT.md

Manual triage

Each scanner also ships _quick-scan.md, a human-facing cheatsheet of one-liner grep -rEn commands grouped by category. It is not part of the archscan pipeline (the CLI explicitly excludes it from parallel scanning). It is meant for a fast applicability check against a candidate codebase before committing to a full scan, or for onboarding to a new repo.

scanners/python/_quick-scan.md
scanners/python-vulnerabilities/_quick-scan.md
scanners/python-agentic-runtime/_quick-scan.md
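Where grep is not available, a rough Python stand-in for a single grep -rEn one-liner could look like this. It is a sketch under stated assumptions: scan_tree is a hypothetical helper, Python's re syntax is close to but not identical with POSIX extended regular expressions, and any pattern you pass in is your own rather than one taken from a _quick-scan.md file.

```python
import re
from pathlib import Path
from typing import Iterator


def scan_tree(pattern: str, root: Path) -> Iterator[tuple[str, int, str]]:
    """Yield (relative path, line number, line) for every matching line."""
    regex = re.compile(pattern)
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                yield str(path.relative_to(root)), number, line
```

Each yielded tuple carries the same information as one line of grep -rEn output: file, line number, and the matching text.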

Creating Your Own Scanner

Run prompts/profile-extractor.md against your target architecture to understand the tech stack
Draft pitfall candidates for each category relevant to the profile, following spec/FORMAT.md exactly. Start from an existing scanner's pitfall as a template and adapt.
Curate the output: verify detection checks are mechanical, fixes are concrete, severities match spec/SEVERITY.md
Place the profile and pitfall files in scanners/<your-scanner-name>/
Open a PR -- see CONTRIBUTING.md for the quality bar

Output Formats

The report merger (prompts/report-merger.md) supports four output formats:

markdown-backlog (default) -- severity-sorted table with file references, ready for sprint planning
github-issues -- gh issue create commands, one per finding, with labels
json -- structured array for programmatic consumption
linear -- import-ready table with Linear priority mapping
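To make the markdown-backlog shape concrete, here is a minimal sketch of a severity-sorted table renderer. In archscan the report is produced by an LLM following prompts/report-merger.md, not by code like this; the finding fields (severity, pitfall, file) and the column layout are assumptions for illustration.

```python
# Severity levels, most urgent first, as listed for --stability-check.
SEVERITY_ORDER = ["critical", "high", "medium", "low"]


def to_markdown_backlog(findings: list[dict[str, str]]) -> str:
    """Render findings as a Markdown table sorted from critical to low."""
    rows = sorted(findings, key=lambda f: SEVERITY_ORDER.index(f["severity"]))
    lines = ["| Severity | Pitfall | File |", "| --- | --- | --- |"]
    for finding in rows:
        lines.append(
            f"| {finding['severity']} | {finding['pitfall']} | {finding['file']} |"
        )
    return "\n".join(lines)
```

The same sorted list could be passed to json.dumps to approximate the json format's structured array.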

Specification

The format and quality rules are defined in spec/:

spec/FORMAT.md -- pitfall file structure, frontmatter fields, detection check format
spec/SEVERITY.md -- severity level definitions with blast radius / frequency / reversibility matrix
spec/DETECTION_METHODS.md -- LSP, grep, and read detection tiers with examples

Contributing

See CONTRIBUTING.md.

License

MIT
