feat(ci): wire Ollama review-run + post-findings --execute into patchwarden-client-dry-run workflow #523

Closed
opened 2026-05-27 15:20:31 +02:00 by claude · 0 comments
Collaborator

Goal

Wire Patchwarden CLI capabilities that already exist on main of pdurlej/patchwarden (since PR #49 + #50) into the existing platform dry-run workflow. Today the workflow uses --model-id deterministic (stub, never calls Ollama) and never posts to PR. After this issue, it calls real Ollama for review and posts the findings as a PR comment.

This is wiring of existing capability, not new feature work — explicitly legalized under pdurlej/patchwarden D21 (M2 gate amendment, 2026-05-27).

Context — why now

Operator-side mental-model discovery during the pdurlej/patchwarden 2026-05-27 session revealed that today's dogfood loop covers only ~30% of the operator's mental model. Three gaps:

  1. Ollama review NOT wired in workflow: review-run uses --model-id deterministic, never calls Ollama (this issue).
  2. post-findings --execute NOT wired in workflow: workflow only uploads artifacts, never speaks to the PR (this issue).
  3. pyfallow / fallow-ts NOT wired in Patchwarden CLI: separate code change in pdurlej/patchwarden, parked under M2 per D21.

This issue closes gaps 1 + 2. Gap 3 stays parked.

Scope

Edit .forgejo/workflows/patchwarden-client-dry-run.yml:

Change 1 — review-run uses real Ollama lane config

Replace stub flags with lane-driven flags so Patchwarden's existing live-model path (PR #49) actually runs.

Today (stub):

patchwarden review-run \
  --target-kind pull_request \
  --target-id "forgejo://git.pdurlej.com/pdurlej/platform/pulls/$PR_NUMBER" \
  --target-sha "$HEAD_SHA" \
  --reviewer-id platform-client-dry-run \
  --model-id deterministic \
  --model-version v0 \
  --prompt-version platform-client-v0 \
  --pass-index 1 \
  --output patchwarden-client-review-artifact.json

After: generate pr-metadata.json from event payload (using existing pr-files.txt), pass --lane-config + --pr-metadata-file + --diff-file. Lane-derived reviewer-id / model-id / model-version / prompt-version are taken from policies/platform.v0.toml:

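# Build PR metadata JSON from event payload + existing diff/files artifacts.
# jq --arg JSON-escapes each value, so a PR title containing quotes or
# backslashes cannot break the document the way raw heredoc interpolation can.
jq -n \
  --argjson pr_number "$PR_NUMBER" \
  --arg title "$PR_TITLE" \
  --arg author "$PR_AUTHOR" \
  --arg head_sha "$HEAD_SHA" \
  --arg base_sha "$BASE_SHA" \
  --argjson changed_files "$(jq -Rs 'split("\n") | map(select(length > 0))' < /tmp/patchwarden-client/pr-files.txt)" \
  '{
    repo: {host: "git.pdurlej.com", owner: "pdurlej", name: "platform"},
    pr_number: $pr_number,
    title: $title,
    author: $author,
    head_sha: $head_sha,
    base_sha: $base_sha,
    labels: [],
    changed_files: $changed_files
  }' > /tmp/patchwarden-client/pr-metadata.json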

patchwarden review-run \
  --target-kind pull_request \
  --target-id "forgejo://git.pdurlej.com/pdurlej/platform/pulls/$PR_NUMBER" \
  --target-sha "$HEAD_SHA" \
  --lane-config "${{ steps.patchwarden.outputs.source_dir }}/policies/platform.v0.toml" \
  --pr-metadata-file /tmp/patchwarden-client/pr-metadata.json \
  --diff-file /tmp/patchwarden-client/pr.diff \
  --output patchwarden-client-review-artifact.json

Change 2 — add post-findings --execute step (gated on Patchwarden source available)

After resolve-findings, add:

- name: Post Patchwarden comment to PR
  if: steps.patchwarden.outputs.status != 'not_configured'
  env:
    FORGEJO_TOKEN: ${{ secrets.FORGEJO_TOKEN }}
  run: |
    set -euo pipefail
    patchwarden post-findings \
      --artifact-file patchwarden-client-review-artifact.json \
      --resolution-file patchwarden-client-finding-resolution.json \
      --repo-host git.pdurlej.com \
      --repo-owner pdurlej \
      --repo-name platform \
      --pr-number "$PR_NUMBER" \
      --execute

The FORGEJO_TOKEN is read by patchwarden.forgejo_client from the env (see src/patchwarden/forgejo_client.py:_config). No code change in Patchwarden needed.
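
Because the token reaches Patchwarden only through the environment, a missing or empty secret may surface late, as a failed Forgejo call. A minimal sketch of an early guard that could run at the top of the step, assuming a hypothetical helper named require_env (it is not part of Patchwarden or of the existing workflow):

```shell
# Hypothetical helper, not part of Patchwarden: fail early with a clear
# message when a required environment variable is unset or empty.
require_env() {
  _name="$1"
  # Indirect lookup via eval keeps this POSIX-sh compatible; the ":-"
  # default makes it safe under "set -u". Only pass trusted names.
  eval "_value=\${$_name:-}"
  if [ -z "$_value" ]; then
    echo "missing required env var: $_name" >&2
    return 1
  fi
}
```

Calling `require_env FORGEJO_TOKEN` before `patchwarden post-findings` would stop the step under `set -e` with a readable message instead of an HTTP error.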

Change 3 — update permissions

The workflow's top-level permissions: block (or the dry-run job's own permissions: block) needs pull-requests: write so that the POST /issues/{pr_number}/comments call succeeds:

permissions:
  contents: read
  pull-requests: write  # NEW: required for post-findings --execute

Acceptance criteria

  • --model-id flag removed; --lane-config present pointing at policies/platform.v0.toml from the cloned Patchwarden source dir.
  • --pr-metadata-file + --diff-file flags both present, pointing at files generated in the collect-diff job (already exists).
  • post-findings --execute step exists, gated on steps.patchwarden.outputs.status != 'not_configured'.
  • Workflow permissions includes pull-requests: write.
  • Smoke test: open a trivial test PR (e.g. one-line docs/ change with W6d-automerge-calibration label), workflow runs end-to-end, real Ollama gets called (visible in workflow log: model_used: kimi-k2.6:cloud or fallback gemma4:31b-cloud in the review-artifact JSON), PR receives a Patchwarden comment with the rendered findings (or "No Patchwarden findings to render." if model returned []).
  • Fail-closed validation: if Ollama is unreachable AND lane has fail_on_missing: true (default per platform.v0.toml), review-run exits 2, workflow fails, no comment posted. Verify in log that OllamaClientError propagated.
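
The smoke-test and fail-closed criteria above can be checked mechanically rather than by reading logs. A rough sketch, assuming the review artifact serializes model_used as a plain JSON string field (its exact nesting is not stated in this issue); both helper names are hypothetical:

```shell
# Hypothetical smoke-test helpers; not part of Patchwarden.

# Succeeds when the artifact file names one of the two lane models.
# Assumes "model_used" appears as a plain JSON string field.
artifact_used_live_model() {
  grep -Eq '"model_used"[[:space:]]*:[[:space:]]*"(kimi-k2\.6:cloud|gemma4:31b-cloud)"' "$1"
}

# Runs a command and succeeds only if it exits with the expected status.
expect_exit() {
  _expected="$1"
  shift
  _actual=0
  "$@" || _actual=$?
  [ "$_actual" -eq "$_expected" ]
}
```

For example, `artifact_used_live_model patchwarden-client-review-artifact.json` after a successful run, and `expect_exit 2` in front of the review-run command with Ollama deliberately unreachable.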

No-go (per D21 in pdurlej/patchwarden)

  • DO NOT add pyfallow / fallow-ts invocation here. Runtime dep binding stays parked under M2 per D21. A separate planning doc (docs/operations/pyfallow-integration-plan.md in patchwarden) covers the eventual integration shape.
  • DO NOT change policies/platform.v0.toml. Lane config already has the right repo_pattern_sanity lane with kimi-k2.6:cloud primary + gemma4:31b-cloud fallback + fail_on_missing: true.
  • DO NOT extend the dogfood lane beyond safe_docs_status classification — that's a policy schema expansion, parked under M2 per D21.
  • DO NOT add new Patchwarden CLI subcommands. This is pure workflow wiring of existing capability in main.
  • DO NOT change permissions to contents: write. contents: read stays as is; only pull-requests: write is added.

Pre-flight check (operator)

Before Codex starts, verify:

  • FORGEJO_TOKEN repo secret exists in platform repo and has scope to post comments on issues/PRs (write:issue or equivalent).
  • If not, add it before the workflow can post.
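
One hedged way to confirm the scope by hand is to post a throwaway comment through the same kind of endpoint that post-findings targets. The sketch below assumes Forgejo's API v1 path layout and the `Authorization: token` header scheme; forgejo_comment_url is a hypothetical helper, not a Patchwarden function, and the probe itself is left commented out because it needs network access and a real token:

```shell
# Hypothetical helper; Patchwarden's own client may build the URL differently.
# Prints the Forgejo API v1 endpoint for comments on an issue or PR.
forgejo_comment_url() {
  _host="$1" _owner="$2" _repo="$3" _number="$4"
  printf 'https://%s/api/v1/repos/%s/%s/issues/%s/comments\n' \
    "$_host" "$_owner" "$_repo" "$_number"
}

# Manual probe (posts a real comment, so use a scratch PR):
# curl -sS -X POST \
#   -H "Authorization: token $FORGEJO_TOKEN" \
#   -H "Content-Type: application/json" \
#   -d '{"body": "token scope probe"}' \
#   "$(forgejo_comment_url git.pdurlej.com pdurlej platform "$PR_NUMBER")"
```

A 201 response would indicate the token can create comments; a 401 or 403 would suggest the secret or its scope still needs attention.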

Rollout discipline

  1. Land workflow edit in PR.
  2. Operator opens a small smoke-test PR (one-line docs change with W6d-automerge-calibration label).
  3. Verify workflow runs end-to-end as described in Acceptance.
  4. Verify PR comment renders correctly and contains expected fields.
  5. Only after smoke test passes, treat this as the new baseline for W6d calibration cycles.
  6. If Ollama findings are too noisy in practice, consider lowering temperature or tightening lane prompt — but in a separate issue, not here.

Spec sources

  • pdurlej/patchwarden#49: review-run --lane-config / --pr-metadata-file / --diff-file end-to-end path, including soft/hard-fail behavior per lane.fail_on_* flags.
  • pdurlej/patchwarden#50: post-findings --execute behavior, --execute flag semantics, exit codes on Forgejo errors.
  • pdurlej/patchwarden#54: resolve-findings now consumes runtime_metadata.error_kind from review-artifact (Swiss Cheese Layer 8 fix). If Ollama soft-fails, resolver surfaces a soft_fail_review_unreliable blocker.
  • pdurlej/patchwarden#55: D20 architectural lint enforces that Patchwarden never calls Forgejo merge or APPROVED. post-findings uses only the /issues/{pr}/comments endpoint.
  • pdurlej/patchwarden#57: D21 M2 gate amendment that explicitly permits this workflow wiring.
  • pdurlej/patchwarden:docs/operations/platform-dogfood.md: current loop documentation (will need an update PR in patchwarden once this lands).
  • pdurlej/patchwarden:docs/operations/code-vs-vision-snapshot-2026-05-27.md: Bet 2 ratio reasoning + gap analysis.
  • pdurlej/patchwarden:policies/platform.v0.toml: lane config the workflow will point at.

Created 2026-05-27 by claude (pdurlej/patchwarden dedicated thread) as Wizji-Wartości Wave step D3. Operator (pdurlej) hands off to codex for implementation per cousin-family lane discipline.
