docs(v0): add platform dogfood playbook (closes #32) #51

Merged

pdurlej merged 1 commit from claude/patchwarden-platform-dogfood-playbook into main

2026-05-27 12:12:51 +02:00

claude commented

2026-05-27 12:08:15 +02:00

Collaborator

What

One new file: docs/operations/platform-dogfood.md (+121 lines). Pure docs — zero src changes, zero test count change (still 133/133 green).

Captures the proven vs not-proven boundary for Patchwarden's first dogfood client (pdurlej/platform), so any cousin (claude / codex / glm / iskra / hermes / gemini) can pick up a W6d calibration PR cold without reconstructing context from chat.

Structure

What IS proven — table of every CLI subcommand currently used in the platform dry-run loop. Each row anchored to a recent PR (#49 for live Ollama call, #50 for post-findings --execute).
What is NOT proven — explicit table (webhook mode, end-to-end LLM reviewer wiring, non-docs scope, auto issue selection, anything that Patchwarden would merge or APPROVE). Each row grounded in D20.
How to run ONE boring calibration PR — 12-step checklist from picking a candidate through operator merge. Each step gated by an explicit human or policy action.
When to break the loop — escalation triggers (false eligible_clean on blocked path, three consecutive red signals = ADR-0018 trigger, Iskra disagrees = D20 operator arbitration).
Reference materials — anchors to README, decisions.md, roadmap.md, architecture.md, policies/platform.v0.toml + canonical platform workflow + cycle notes.
Anti-playbook — explicit "this document is NOT" section so the next cousin doesn't over-extend scope.
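For readers who prefer code to prose, the escalation triggers listed under "When to break the loop" could be sketched roughly as below. This is only an illustration and is not part of Patchwarden: `RunSignal`, `Escalation`, `escalation_for`, every field name, and the order in which the triggers are checked are assumptions made for the example. Only the three triggers themselves come from the playbook summary above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Sequence


class Escalation(Enum):
    # Hypothetical outcome labels; the playbook describes these only in prose.
    CONTINUE = "continue"
    STOP_FALSE_CLEAN = "stop_false_eligible_clean"
    ADR_0018_TRIGGER = "adr_0018_trigger"
    OPERATOR_ARBITRATION = "d20_operator_arbitration"


@dataclass(frozen=True)
class RunSignal:
    # Hypothetical record of one calibration run, oldest first in a history.
    verdict: str                  # e.g. "eligible_clean"
    touches_blocked_path: bool    # the PR touched a path the policy blocks
    red: bool                     # the run counted as a red signal
    iskra_disagrees: bool = False


def escalation_for(history: Sequence[RunSignal]) -> Escalation:
    """Return the escalation implied by the most recent runs."""
    if not history:
        return Escalation.CONTINUE
    latest = history[-1]
    # Trigger 1: a false eligible_clean on a blocked path.
    if latest.verdict == "eligible_clean" and latest.touches_blocked_path:
        return Escalation.STOP_FALSE_CLEAN
    # Trigger 2: three consecutive red signals (ADR-0018).
    if len(history) >= 3 and all(s.red for s in history[-3:]):
        return Escalation.ADR_0018_TRIGGER
    # Trigger 3: Iskra disagrees, so the operator arbitrates (D20).
    if latest.iskra_disagrees:
        return Escalation.OPERATOR_ARBITRATION
    return Escalation.CONTINUE
```

In every non-`CONTINUE` case the sketch only reports that the loop should stop. It deliberately takes no action itself, since the summary above leaves merge and arbitration with the operator.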

Why this scope (and not more)

Per codex's issue #32 spec ("concise enough for a fresh cousin to use"):

❌ No architecture essay — docs/architecture.md already covers L1-L6.
❌ No product-strategy expansion — docs/product/strategy.md owns that.
❌ No webhook-mode design — issue #31 owns that.
❌ No CLI reference dump — README + --help own that.

Calibration references

Platform PRs #496-#499 (W6d-automerge-calibration-2026-05-25), per codex spec in #32. The cycle notes file pdurlej/platform:state/cycle/W6d-automerge-calibration-2026-05-25.md is the source of truth for what actually happened in those PRs.

D20 + ADR alignment

| Constraint | Where restated |
|---|---|
| Merge actor is always the operator (D20) | Step 11 of the 12-step checklist |
| Patchwarden never submits APPROVED reviews (D20) | NOT-proven table, second-last row |
| Fail-closed default of `resolve-findings` (D20) | "When to break the loop" section |
| No workaround outcomes after 3 red signals (ADR-0018) | "When to break the loop" section |
| Atomic, single-file PR (ADR-0017) | This PR itself |

Test impact

None. Docs-only addition. 133/133 tests still green locally on the branch.

ADR-0017 atomic

1 file, 1 new directory (docs/operations/), 0 src changes.
base=main, no stacking.

Token-accounting (agent-kanban)

Opus directly. Pure docs at Opus is slightly wasteful in theory, but the boundary content (D20 references, calibration anchors, escalation triggers) needed deep project context I had already loaded. A Sonnet sub-agent would have needed a 2-3K-token brief just to get to parity. Net: marginal, ~2% weekly Opus.

Closes #32.

