Patchwarden
Self-hosted quality contracts for AI-written code.
Patchwarden runs on your box, watches your AI coding agents, and keeps generated changes behind local deterministic quality contracts. Cloud reviewers may inspect configured PR classes, but policy stays local and deterministic. See the original source framing in docs/product/original-vision-2026-05.md and the current product framing in docs/product/june-2026-vision.md.
Local operator status artifacts: docs/status.html and docs/status.json. June 2026 PR passage contract: docs/operations/pr-passage-contracts.md. Feature/story QA tracker: docs/qa/patchwarden-feature-status.xlsx.
Note for cousins working on this repo (claude / codex / glm / iskra / hermes / dziadek / gemini / pyfallow / pan-herbatka): if this is a fresh session inside Piotr Durlej's cousin family, read pdurlej/agent-souls/skills/cousin-onboarding/SKILL.md first. It covers identity verification, Infisical bootstrap, cardinal agent-souls reads, and the hard rules (no secrets in chat, per-cousin commits, ADR-0017 + ADR-0018). Patchwarden-specific: this repo is the system that will watch your output, so be especially careful about identity isolation here; commits attributed to the wrong cousin defeat the purpose. External contributors NOT in the cousin family can ignore this note. Discipline source: pdurlej/agent-souls/practices/repo-onboarding-pointer.md.
Status: 🟢 Product direction defined; v0 implementation path selected. Research complete (2026-05-15). On 2026-05-25 the operator selected Patchwarden-as-owner with pdurlej/platform as the first dogfood client. See docs/decisions.md.
What works today vs what's planned
| Capability | Today | Planned |
|---|---|---|
| Path-prefix classifier (pr-check) | Today ✅: Filename and path-prefix classification used for PR automerge lane routing. Production on pdurlej/platform W6d lane (see docs/operations/platform-dogfood.md). Tested in tests/test_pr_check.py. | - |
| Operator status map (status / status-serve) | Today ✅: Local June 2026 status surface showing actor model, protection layers, PR passage contract layers, hard boundaries, dashboard read-model counts, Core module registry details, working/partial surfaces, artifact drilldowns for stale/missing evidence, external wiring backlog items, a local artifact-index over concrete JSON run outputs, and the 18 tracked vision gaps for the autonomy loop. It renders text, checked-in JSON (patchwarden status --format json --output docs/status.json), deterministic static HTML (patchwarden status --format html --output docs/status.html), and a read-only local HTTP dashboard (patchwarden status-serve) with fresh HTML, /status.json, /artifact-index.json, and /healthz; artifact-index and status-serve can scan explicit roots or controller-published patchwarden.artifact_root_manifest.v1 files. docs/status.html and docs/status.json are guarded against drift by comparing them to operator_status. The passage model is documented in docs/operations/pr-passage-contracts.md; the QA tracker is docs/qa/patchwarden-feature-status.xlsx. Tested in tests/test_cli_status.py, tests/test_status_html.py, tests/test_status_server.py, and tests/test_artifact_index.py. | Have external controllers/CI publish artifact-root manifests for every live loop. |
| Head-bound contract run (evaluate) | Today ✅: Emits patchwarden.contract_run.v1 for built-in Core modules with producer/source revision metadata, a structured operator action, blocking reason, and blocking Core module, and exits nonzero on blocked policy. It can consume current-head evidence-check, sense-check, review-quorum, nullsec-s1-check, structural-code-check, inventory-scan-check, and simplicity-review-check verdict artifacts through opt-in Core modules without running CI, reviewers, scanners, sandboxes, package managers, or models itself. This is the executable merge-safety judge path without a plugin registry. Tested in tests/test_contract_run.py, tests/test_cli_evaluate.py, and spec/schemas/contract-run.schema.json. | Dogfood the artifact inputs in pdurlej/platform, then add more deterministic Core modules. |
| Contract status/comment adapter (publish-contract) | Today ✅: Renders patchwarden.contract_publication.v1 with producer/source revision metadata, policy identity, operator summary, verdict, operator action, blocking Core module/reason, resolved Core module stack, per-module results, publication guard metadata, and optional formal review metadata. Dry-run by default; --execute first re-fetches the live PR and refuses stale-head or non-open PR publication before setting Forgejo commit status and upserting one marker-backed PR audit comment. With explicit --execute --approve-on-pass, a passing deterministic Contract Run also creates an idempotent marker-backed Forgejo APPROVED review after status/comment publication succeeds. The stable markers keep one durable comment and one durable approval per target, while hidden marker metadata records head SHA, registry hash, and run id for audit. Tested in tests/test_contract_publish.py, tests/test_cli_publish_contract.py, and spec/schemas/contract-publication.schema.json. | Wire into pdurlej/platform dogfood workflow as a required status/check. |
| Contract PR runner (contract-pr) | Today ✅: CI-friendly wrapper that fetches or loads one Forgejo PR, emits durable Contract Run / publication artifacts, and writes a top-level patchwarden.contract_pr_result.v1 summary with producer/source revision metadata, policy identity, verdict, operator action, blocking reason, blocking Core module, resolved Core module stack, per-module results, and publication/review result. It accepts explicit --required-check plus evidence/sense/review-quorum/nullsec-s1/structural-code/inventory-scan/simplicity-review artifact inputs in artifact mode and dry-runs or executes the status/comment publication adapter; --execute --approve-on-pass additionally publishes the deterministic Patchwarden approval for passing current-head contracts. Tested in tests/test_contract_pipeline.py, tests/test_cli_contract_pr.py, tests/test_artifact_schema_contract.py, and spec/schemas/contract-pr-result.schema.json. | Dogfood the artifact inputs in pdurlej/platform, then mark patchwarden/contract as the required status. |
| Contract PR artifact inspector (inspect-contract-pr) | Today ✅: Reads a patchwarden.contract_pr_result.v1 file, emits a concise smoke diagnosis, fails old/partial results missing producer metadata, first-class Core fields, or durable comment marker metadata, verifies that marker metadata matches target/head/registry/run, and optionally checks extracted contract-run / contract-publication artifacts for consistency under --artifact-search-root. This is the quick runner-smoke diagnostic for proving a workflow used a current Patchwarden build. Tested in tests/test_contract_pr_inspect.py and tests/test_cli_contract_pr.py. | Wire into client workflows after contract-pr once the platform dogfood branch is ready. |
| Patchwarden Core module registry (core-modules, core-module-stack) | Today ✅: Read-only, first-class module catalog plus target-aware stack resolver for choosing which deterministic Core modules a run uses. The registry is static (spec/modules/core.v0.json), schema-validated, declares target applicability, verdict surface, artifact input/output contract, and operator-facing safety metadata (proposal_only, requires_secrets, external_writes), and is embedded into Contract Run artifacts as a self-describing module_stack with registry hash plus per-module contract declarations. Current modules include path classification, required checks, opt-in Vistula PR shape with optional [vistula] value/effort/lead-time vocabularies, generated-artifact/security-path/dependency-risk/content-slop sensors, and opt-in evidence/sense/review-quorum/nullsec-s1/structural-code/inventory-scan/simplicity-review artifact gates. Tested in tests/test_core_module_registry.py, tests/test_contract_run.py, and tests/test_cli_evaluate.py. | Add richer deterministic Core modules after platform dogfood evidence; no dynamic plugin loading in v0. |
| PR-class evidence gate (evidence-check) | Today ✅: Read-only evaluator for required test/sandbox evidence by PR class. Evidence counts only when status=success and sha matches the evaluated target_sha; docs-only can require no proof, logic classes can require unit/integration evidence, and sandboxed classes can require sandbox-smoke evidence. evaluate and contract-pr can consume a satisfied verdict artifact as patchwarden.pr.evidence_gate. Tested in tests/test_evidence_gate.py, tests/test_cli_evidence_gate.py, and tests/test_artifact_schema_contract.py. | Wire evidence bundle production from CI/Iskra. |
| Sense and trust gate (sense-check) | Today ✅/partial: Read-only evaluator for product-sense routing and trust tiers. It requires decision refs for non-trivial classes, routes hard-manual classes to needs_human, blocks unknown classes/tiers, extracts configured [product_policy] source_docs into the verdict, and can propose trust-tier widening from evidence without applying it automatically. trust-state-check turns eligible sense proposals plus optional matching post-merge trust_signal artifacts and explicit operator decisions into write-free ready/noop/blocked state-update plans. policies/sense.platform.v0.toml dogfoods the source-doc manifest for Patchwarden's strategy and decision docs. evaluate and contract-pr can consume an aligned verdict artifact as patchwarden.pr.sense_gate. Tested in tests/test_sense_trust.py, tests/test_cli_sense_trust.py, tests/test_trust_state.py, tests/test_cli_trust_state.py, and tests/test_artifact_schema_contract.py. | Add richer repo-specific product-policy semantics and wire an external trust-state writer. |
| Reviewer-lane Ollama call | Today ✅: Call verified with bearer auth from OLLAMA_API_KEY (PR #60). Confirmed on platform#527 (see docs/operations/dogfood-actual-vs-mental-model.md#L76-L84). Tested in tests/test_ollama_client.py. | - |
| Dedicated bot PR comment (post-findings --execute) | Today ✅: Posts findings comments under dedicated patchwarden bot identity (Forgejo user id 9) via CLI flag --execute (PR #50). Confirmed on platform#527 (see docs/operations/dogfood-actual-vs-mental-model.md#L86-L92). Tested in tests/test_cli_post_findings.py. | - |
| Soft-fail detection | Today ✅: Emits soft_fail_review_unreliable if the LLM/Ollama call fails or times out. Confirmed in PR #54. Tested in tests/test_resolve_findings.py. | - |
| Reviewer quorum (review-quorum) | Today ✅: Read-only quorum resolver over reviewer artifacts. Emits clean, blocked, or inconclusive, treats missing/soft-failed required lanes as hold-worthy, includes stable artifact/quorum keys for later redrive design, and never mutates Forgejo. evaluate and contract-pr can consume a clean quorum artifact as patchwarden.pr.review_quorum. Tested in tests/test_review_quorum.py, tests/test_cli_review_quorum.py, and tests/test_artifact_schema_contract.py. | Add richer timeout/non-blocking policy and a safe redrive command/job surface before controller automation. |
| D20 authority boundary | Today ✅: Build-time enforced rule that reviewer lanes remain sensors, Patchwarden never merges, and the only formal approval surface is the deterministic contract-publication adapter with explicit opt-in. Tested in tests/test_d20_architectural_boundary.py (originated in PR #55). | - |
| OpenClaw runtime-maintenance PR gate | Today ✅: Gating rules requiring preflight evidence for runtime maintenance. Described in docs/operations/openclaw-runtime-maintenance-gate.md (PR #66). Tested in tests/test_pr_check.py. | - |
| OpenClaw runtime-repair evidence evaluator | Today ✅: Read-only, fail-closed runtime-repair-check command evaluating auto-heal evidence. Confirmed in PR #70 (first slice of #68, refs #65), described in docs/operations/openclaw-runtime-repair-evaluator.md. Tested in tests/test_runtime_repair.py. | - |
| Auto-PR proposal gate | Today ✅: ClawSweeper-port, read-only auto-pr-proposal-check command for deciding whether a generated maintenance proposal is narrow enough for a deterministic executor to consider opening a PR. The verdict now includes durable work identity (identity, dedupe_key, job_path, branch, marker, allowed/blocked actions) so the future executor has a stable contract instead of a loose suggestion. Described in docs/operations/auto-pr-proposal-gate.md. Tested in tests/test_auto_pr_proposal.py. | Add command/intake parser and deterministic job writer before target-repo branch/PR mutation. |
| Controller command intake (controller-intake-check) | Today ✅: ClawSweeper/OpenClaw controller-intake slice that parses maintainer commands such as review, fix, retry, stop, approve, automerge, and Polish approval aliases into a read-only handoff artifact with dedupe_key, marker, allowed/blocked actions, and optional positive-review intent bound to a passing exact-head Contract Run. It never creates jobs, reviews, branches, PRs, or merges. Tested in tests/test_controller_intake.py, tests/test_cli_controller_intake.py, and tests/test_artifact_schema_contract.py. | Wire the artifact into the external ClawSweeper job writer/controller; executor mutation stays outside Patchwarden. |
| Controller approval preflight (controller-approval-preflight) | Today ✅: Read-only visible-approval preflight for ClawSweeper/OpenClaw controllers. It consumes an accepted positive-review handoff, a passing Contract Run, and an optional live PR state snapshot, then emits ready, needs_live_state, or blocked with explicit merge_allowed=false and external_write_allowed=false. A ready verdict requires a known PR author, a known visible approver, and author != approver, so generated output cannot approve itself. It never posts reviews, statuses, comments, jobs, branches, PRs, or merges. Tested in tests/test_controller_approval.py, tests/test_cli_controller_approval.py, tests/test_artifact_schema_contract.py, and spec/schemas/controller-approval-preflight.schema.json. | Wire ClawSweeper/OpenClaw to consume ready preflight artifacts before publishing visible approval. |
| Deterministic job plan (job-plan-check) | Today ✅: Read-only materialization plan over job/work handoff artifacts. It normalizes identity, dedupe_key, job_path, marker, allowed/blocked actions, detects duplicates from an optional state file, and emits ready, duplicate, or blocked while forbidding job writes and external writes. Tested in tests/test_job_plan.py, tests/test_cli_job_plan.py, and tests/test_artifact_schema_contract.py. | Have ClawSweeper consume the plan and perform actual job materialization outside Patchwarden. |
| Feedback/gap intake (feedback-intake-check) | Today ✅: Read-only intake for post-merge value, incident, friction, and vision-gap signals. It emits recorded, follow_up_needed, or blocked plus a dedupe key, marker, and issue/job candidate intent while explicitly forbidding external writes. Tested in tests/test_feedback_intake.py, tests/test_cli_feedback_intake.py, and tests/test_artifact_schema_contract.py. | Wire follow-up candidates into an external issue/job writer and future live dashboard. |
| Forgejo pull_request event parser (forgejo-event) | Today ✅: Offline parser for Forgejo pull_request event fixtures (opened, synchronize, reopened, labeled, unlabeled) that emits the normalized PR metadata consumed by pipeline --metadata-file and contract-pr --metadata-file. It never starts a listener, verifies signatures, or fetches files/statuses from Forgejo. Tested in tests/test_forgejo_event.py and tests/test_cli_forgejo.py. | Build HTTP listener + signature verification as a separate service-mode slice. |
| jsonschema artifact-contract validation | Today ✅: Validator testing contract shapes of CLI output artifacts. Confirmed in PR #71 (closes #62). Tested in tests/test_artifact_schema_contract.py. | - |
| pyfallow / fallow-ts deterministic structural checks | Today ✅/partial: structural-code-check normalizes external pyfallow/fallow-ts-style report JSON into patchwarden.structural_code_verdict.v1, and evaluate / contract-pr can consume the clean verdict through patchwarden.pr.structural_code_gate. Patchwarden still does not run the checkers or import client code. Tested in tests/test_structural_code.py, tests/test_contract_run.py, and tests/test_artifact_schema_contract.py. | Wire report production in client CI and refine repo-specific content/slop rules. |
| Inventory scan verdict gate | Today ✅/partial: inventory-scan-check normalizes external Bumblebee/Patchwarden-native package, extension, and developer-tool inventory JSON/NDJSON into patchwarden.inventory_scan.v1, and evaluate / contract-pr can consume clean or advisory verdicts through patchwarden.pr.inventory_scan_gate while blocking blocked. Patchwarden still does not run scanners or package managers. Tested in tests/test_contract_run.py, tests/test_cli_evaluate.py, and tests/test_artifact_schema_contract.py. | Wire report production in client CI where supply-chain inventory matters. |
| Simplicity review verdict gate | Today ✅/partial: simplicity-review-check normalizes external Ponytail-inspired delete/reuse/simplify findings into patchwarden.simplicity_review.v1, and evaluate / contract-pr can consume clean, advisory, or needs_simplification verdicts through patchwarden.pr.simplicity_review_gate while blocking only blocked_overbuild. Patchwarden still does not install Ponytail, run hooks, or treat code golf as a goal. Tested in tests/test_contract_run.py, tests/test_cli_evaluate.py, and tests/test_artifact_schema_contract.py. | Dogfood false-positive rate before any stricter policy. |
| Forgejo webhook / service mode | - | Planned 🟡: HTTP listener + signature verification remain future service-mode work after the offline forgejo-event parser. Documented in docs/operations/platform-dogfood.md#L45-L47. |
| Cloud control plane (L3) | - | Planned 🟡: Paid control plane deferred to Year 2. Documented in docs/roadmap.md#L42 and docs/roadmap.md#L167 (Bet 2 wrongness #3). |
| Auto-heal apply/execute path | - | Planned 🟡: Beyond the read-only evaluator, the execute/apply path is deferred. Documented in docs/operations/openclaw-runtime-repair-evaluator.md#L12-L19 and tracked under issues #68 / #65. |
| Iskra handoff automation | - | Planned 🟡: Handoff notifications automated to Iskra. Parked, requires new feature scoping, documented in docs/operations/dogfood-actual-vs-mental-model.md#L150. |
| Security-LLM (nullsec-s1) integration | Today ✅/partial: nullsec-s1-check normalizes external scanner JSON into patchwarden.nullsec_s1_verdict.v1, and evaluate / contract-pr can consume the clean verdict through patchwarden.pr.nullsec_s1_gate. Patchwarden still does not run the scanner or contact the hosted backend. Tested in tests/test_nullsec_s1.py, tests/test_contract_run.py, and tests/test_artifact_schema_contract.py. | Wire scanner execution in client CI where appropriate. |
| Style/linter (Ruff/Biome/ESLint) | - | Won't fix ❌: Externalized to client CI responsibility, out of scope for Patchwarden. Documented in docs/operations/dogfood-actual-vs-mental-model.md#L64 and docs/operations/dogfood-actual-vs-mental-model.md#L151. |
For the full coverage breakdown (~60% of the operator's mental model evidence-confirmed), see docs/operations/dogfood-actual-vs-mental-model.md.
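To make the evidence-check rule from the table concrete, here is a minimal sketch of how evidence might be counted per PR class. It is not Patchwarden's code: the `Evidence` dataclass, the `REQUIRED_EVIDENCE` mapping, the `evidence_gate` function, and the choice to block unknown classes are all assumptions for illustration. Only the counting rule (status=success and a sha equal to target_sha) and the `satisfied` verdict name come from the description above.

```python
from dataclasses import dataclass

# Hypothetical per-class requirements; the real policy format is not shown here.
REQUIRED_EVIDENCE = {
    "docs-only": set(),                  # docs-only can require no proof
    "logic": {"unit", "integration"},    # logic classes can require tests
    "sandboxed": {"sandbox-smoke"},      # sandboxed classes can require a smoke run
}


@dataclass(frozen=True)
class Evidence:
    kind: str    # e.g. "unit", "integration", "sandbox-smoke"
    status: str  # only "success" counts
    sha: str     # commit the evidence was produced for


def evidence_gate(pr_class, target_sha, evidence):
    """Return a verdict dict; read-only, no CI or sandbox is run here."""
    required = REQUIRED_EVIDENCE.get(pr_class)
    if required is None:
        # Assumption: unknown classes fail closed.
        return {"verdict": "blocked", "missing": [], "reason": "unknown_pr_class"}
    # Evidence counts only when it succeeded AND was produced for the evaluated head.
    counted = {
        item.kind
        for item in evidence
        if item.status == "success" and item.sha == target_sha
    }
    missing = sorted(required - counted)
    verdict = "satisfied" if not missing else "blocked"
    return {"verdict": verdict, "missing": missing, "reason": None}


if __name__ == "__main__":
    bundle = [
        Evidence("unit", "success", "abc123"),
        Evidence("integration", "success", "stale00"),  # wrong sha, ignored
    ]
    print(evidence_gate("logic", "abc123", bundle))
```

In this sketch stale evidence is simply ignored rather than reported separately, which keeps the rule easy to audit: a passing run for an older commit never satisfies the gate for the current head.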
Who is this for?
- Self-hosters running Forgejo/Gitea who use AI coding agents (Codex, Claude Code, Cursor, etc.) on private projects
- Technical non-developers who can specify desired behavior and risk tolerance but don't want to become full-time CI/security/review engineers
- Founders, operators, researchers, AI-systems builders running personal infrastructure
Who is NOT this for?
- Professional developer teams already using GitHub + CodeRabbit/Copilot/Bugbot/Cursor (different wedge)
- GitHub-first workflows (Patchwarden is Forgejo/Gitea-first by design)
- "Cheap LLM tokens for code review" buyers (Patchwarden sells reduced cognitive load + safer autonomy, not token pass-through)
Core principle
LLMs produce findings. Policy produces decisions.
A model may suspect or recommend. A PR is blocked only when a finding has:
- clear evidence,
- affected file/path/behavior,
- mapped policy rule,
- configured blocker severity,
- non-stale commit SHA.
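The five conditions above can be sketched as a single conjunction. This is an illustrative sketch under stated assumptions, not the project's implementation: the `Finding` fields, the `BLOCKER_SEVERITIES` default, and the `blocks_pr` name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: which severities are configured as blockers.
BLOCKER_SEVERITIES = frozenset({"high", "blocker"})


@dataclass(frozen=True)
class Finding:
    evidence: Optional[str]       # concrete evidence backing the finding
    affected_path: Optional[str]  # affected file/path/behavior
    policy_rule: Optional[str]    # policy rule the finding maps to
    severity: str
    commit_sha: str               # commit the finding was produced for


def blocks_pr(finding, head_sha, blocker_severities=BLOCKER_SEVERITIES):
    """A finding blocks only when every condition holds; otherwise it stays advisory."""
    return all((
        bool(finding.evidence),                   # clear evidence
        bool(finding.affected_path),              # affected file/path/behavior
        bool(finding.policy_rule),                # mapped policy rule
        finding.severity in blocker_severities,   # configured blocker severity
        finding.commit_sha == head_sha,           # non-stale commit SHA
    ))


if __name__ == "__main__":
    suspicion = Finding(None, "src/app.py", None, "high", "abc123")
    print(blocks_pr(suspicion, "abc123"))  # a bare model suspicion does not block
```

Keeping the decision as a pure function of the finding and the current head is one way to keep the model's output a sensor while the policy stays deterministic.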
CLI v0 behavior
patchwarden resolve-findings is fail-closed by default. If no --state-file
is provided, every known finding starts as open; high/blocker findings hold
the verdict until a trusted state file marks them resolved_by_push,
defense_accepted, non_blocking, or superseded. Unknown finding types are
always manual_only and block.
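The fail-closed behavior described above might look roughly like the following sketch. The state names (`open`, `resolved_by_push`, `defense_accepted`, `non_blocking`, `superseded`, `manual_only`) come from the text. The `KNOWN_TYPES` set, the dict shapes, the `hold`/`pass` verdict names, and the treatment of unrecognized state values as `open` are assumptions made for illustration.

```python
RESOLVED_STATES = frozenset(
    {"resolved_by_push", "defense_accepted", "non_blocking", "superseded"}
)
HOLDING_SEVERITIES = frozenset({"high", "blocker"})
# Hypothetical finding types; the real catalog is not listed in this README.
KNOWN_TYPES = frozenset({"security", "correctness"})


def resolve_findings(findings, state=None):
    """Resolve findings against an optional trusted state mapping (id -> state)."""
    state = state or {}  # no state file: every known finding starts as open
    results = []
    for finding in findings:
        if finding["type"] not in KNOWN_TYPES:
            # Unknown finding types are always manual_only and block.
            results.append({"id": finding["id"], "state": "manual_only", "holds": True})
            continue
        current = state.get(finding["id"], "open")
        if current not in RESOLVED_STATES:
            current = "open"  # assumption: unrecognized states fail closed
        holds = current == "open" and finding["severity"] in HOLDING_SEVERITIES
        results.append({"id": finding["id"], "state": current, "holds": holds})
    verdict = "hold" if any(item["holds"] for item in results) else "pass"
    return {"verdict": verdict, "findings": results}


if __name__ == "__main__":
    findings = [{"id": "F1", "type": "security", "severity": "high"}]
    print(resolve_findings(findings))
    print(resolve_findings(findings, {"F1": "resolved_by_push"}))
```

The default argument is the point of the design: omitting the state never relaxes the verdict, so a missing or misconfigured state file cannot silently release a blocking finding.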
Product line (6 layers)
| Layer | Status | What |
|---|---|---|
| L1 — pyfallow (separate repo) | ✅ Phase A merged | Deterministic guardrail engine for Python; stdlib-only, JSON/SARIF output, MCP server |
| L1' — fallow-ts | ⏳ planned | TypeScript twin of pyfallow |
| L2 — Patchwarden OSS (this repo) | 🟡 spec phase | Self-hosted Forgejo/Gitea governance bot — webhook listener, policy decision engine, reviewer orchestrator, audit trail |
| L3 — Patchwarden Cloud | ⏳ Y2 | Paid control plane (~$7–9/mo): curated policy packs, model routing presets, BYOK, weekly reports |
| L4 — OpenClaw auto-heal | ⏳ 2–3 months post-v0 | Self-diagnostic module for OpenClaw runtime (first real high-stakes consumer of Patchwarden) |
| L5 — Coding agents as executors | ✅ exists | Codex, Claude Code, Cursor — NOT policy authority, executors under Patchwarden gates |
| L6 — Architecture/coordination layer | ✅ exists | Iskra, Prof Kong, future advisor agents — issue shaping, architecture, NOT merge-safety authority |
Status
This repository is the v0 dogfood bootstrap (platform-first client, 2026-05-25). It contains:
- ✅ Vision, positioning, architecture documents
- ✅ Decisions updated for v0 implementation ownership and platform-first dogfood
- ✅ Research artifacts from 2026-05-15 (4 dimensions: wedge, architecture overlap, decisions, business model + GTM)
- ✅ Discovery interview plan
- ✅ Product management pitch artifact (PM-SHOW.md)
- ✅ Python CLI with issue readiness, PR eligibility, reviewer artifacts, finding resolution, and dry-run finding comment rendering
- ✅ JSON schemas and unit tests for the current v0 artifacts
It does not yet contain:
- ❌ Forgejo webhook/service mode
- ❌ webhook-triggered automatic PR processing
- ❌ Docker compose
Next steps
See docs/roadmap.md. Short version:
- v0 foundation PRs: ownership docs, verdict schemas, issue gate, PR classifier, reviewer artifacts, finding resolver.
- Platform dogfood: use pdurlej/platform as the first client for a narrow docs/status automerge lane.
- Discovery interviews + trademark filing continue in parallel; they do not block the private dogfood loop.
- Forgejo runner/service mode comes after the CLI proves stable inside Actions.
License
Apache-2.0 (see LICENSE). Patent grant is decisive for policy-enforcement infrastructure (CNCF/OSI standard).
Related repositories
- pdurlej/pyfallow: deterministic guardrail engine (separate, public)
- pdurlej/iskra-openclaw: dogfooding consumer + #190/#191/#192 architecture foundations
- pdurlej/agent-souls: canonical agent practices/personas (NOT Patchwarden config)
Skeleton commit 2026-05-15 by Prof Kong (claude Opus 4.7) in cousin-family collaboration with Iskra, Codex, and operator (pdurlej).