feat(openclaw): supervised runtime repair gate for Iskra auto-heal #68

Closed
opened 2026-06-07 17:33:11 +02:00 by codex · 2 comments
Collaborator

Purpose

Migrate the broad intent of pdurlej/iskra-openclaw#27 into Patchwarden instead of growing a second local auto-repair authority inside Iskra/OpenClaw.

Operator decision (2026-06-07): Iskra/OpenClaw should keep runtime evidence and repair substrate, while Patchwarden becomes the policy/gate layer for OpenClaw maintenance and supervised repair decisions.

Boundary / DDD split

  • Iskra/OpenClaw owns runtime evidence: probes, runtime truth reports, incident packets, autoheal dashboards, upgrade simulator outputs, safe-update dry-runs, and repair proposals.
  • Patchwarden owns the gate: policy verdicts, hard-manual classes, required evidence, missing-preflight blockers, allowed repair classes, and whether a repair is eligible for operator consideration.
  • Operator owns approval: runtime maintenance, service restarts, deploy/apply, auth/network changes, and destructive actions remain explicit operator decisions.

This follows existing Patchwarden direction:

  • docs/operations/openclaw-runtime-maintenance-gate.md
  • policies/iskra-openclaw.v0.toml
  • PR pdurlej/patchwarden#66 / runtime maintenance gate
  • child slice pdurlej/patchwarden#65 for deploy drift probe extraction

First useful slice

Add a Patchwarden-side evaluator for Iskra runtime repair evidence:

  1. Accept an input fixture shaped from Iskra/OpenClaw runtime evidence, e.g. runtime-truth report, upgrade simulator result, autoheal dashboard, remediation proposal, or supervised repair plan.
  2. Emit deterministic verdicts such as eligible_repair_dry_run, needs_human, or blocked.
  3. Fail closed when required evidence is missing, stale, or implies runtime mutation without explicit operator approval.
  4. Treat deploy_drift_probe from #65 as one repair class, not the whole system.
  5. Keep the default mode read-only/proposal-only.
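The five steps above can be sketched as a small pure function. This is only an illustration under stated assumptions, not the Patchwarden evaluator: the `Fixture` shape, the `mutates_runtime` flag, the `service_restart` class name, and the 24-hour freshness window are all hypothetical. Only the three verdict strings, the evidence names, and `deploy_drift_probe` come from this issue.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

# Evidence names come from the Acceptance section of this issue.
REQUIRED_EVIDENCE = (
    "runtime-truth-report",
    "openclaw-upgrade-simulator",
    "autoheal-dashboard",
)
MAX_EVIDENCE_AGE = timedelta(hours=24)  # assumption: the issue does not define "stale"


@dataclass
class Fixture:
    """Hypothetical input shape derived from Iskra/OpenClaw runtime evidence."""
    repair_class: str
    mutates_runtime: bool
    # evidence name -> time it was collected (timezone-aware)
    evidence: dict[str, datetime] = field(default_factory=dict)


def evaluate(fixture: Fixture, now: datetime) -> str:
    """Return a verdict; read-only, and deterministic because `now` is passed in."""
    for name in REQUIRED_EVIDENCE:
        collected_at = fixture.evidence.get(name)
        if collected_at is None:
            return "blocked"  # fail closed: required evidence is missing
        if now - collected_at > MAX_EVIDENCE_AGE:
            return "blocked"  # fail closed: required evidence is stale
    if fixture.mutates_runtime:
        return "needs_human"  # green evidence never auto-approves a mutation
    return "eligible_repair_dry_run"


now = datetime(2026, 6, 9, tzinfo=timezone.utc)
fresh = {name: now - timedelta(hours=1) for name in REQUIRED_EVIDENCE}
stale = {name: now - timedelta(days=2) for name in REQUIRED_EVIDENCE}

# Assumes deploy_drift_probe is a read-only class; service_restart is a made-up mutating one.
print(evaluate(Fixture("deploy_drift_probe", False, fresh), now))  # eligible_repair_dry_run
print(evaluate(Fixture("service_restart", True, fresh), now))      # needs_human
print(evaluate(Fixture("deploy_drift_probe", False, stale), now))  # blocked
print(evaluate(Fixture("deploy_drift_probe", False, {}), now))     # blocked
```

Checking evidence before the mutation flag means a mutating candidate with missing evidence is reported as `blocked` rather than `needs_human`, so the operator is never asked to approve something that lacks its preflight data.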

Non-goals

  • No automatic OpenClaw upgrade.
  • No automatic production repair.
  • No service restart.
  • No deploy/apply/sync.
  • No second deployer replacing iskra-openclaw deploy scripts.
  • No hidden Signal/Matrix/Forgejo writes from the evaluator.

Acceptance

  • Patchwarden docs state the Iskra/Patchwarden repair boundary clearly.
  • A fixture derived from Iskra autoheal/runtime evidence can be evaluated without touching production.
  • Missing runtime-truth-report, openclaw-upgrade-simulator, or autoheal-dashboard evidence blocks runtime-maintenance candidates where applicable.
  • Runtime mutation remains needs_human even when evidence is green.
  • #65 is linked as a child/slice for deploy drift, not treated as the entire #27 migration.

Source

Supersedes the broad runtime self-healing direction from pdurlej/iskra-openclaw#27. The Iskra repo can retain probes/dashboards/scripts as runtime substrate, but Patchwarden is the policy authority for repair/maintenance gates.

Collaborator

{
  "confidence": 5,
  "effort_hint": "large",
  "escalation": {
    "kind": "operator",
    "reason": "Runtime repair policy defines which maintenance actions can be considered and which remain hard manual gates."
  },
  "evidence_refs": [
    {
      "note": "Issue migrates supervised OpenClaw runtime repair gating into Patchwarden rather than duplicating authority inside Iskra.",
      "type": "forgejo",
      "value": "issue-title-body-labels-and-target-snapshot"
    },
    {
      "note": "Body splits ownership between OpenClaw runtime evidence, Patchwarden policy gates, and operator approval.",
      "type": "forgejo",
      "value": "issue-body-boundary-ddd-split"
    },
    {
      "note": "Scope frames Patchwarden as the policy layer for maintenance and supervised repair decisions.",
      "type": "forgejo",
      "value": "issue-body-purpose"
    }
  ],
  "impact": 5,
  "judge_actor": {
    "name": "iskra",
    "runtime": "openclaw"
  },
  "judged_at": "2026-06-09T01:04:00Z",
  "labels_to_apply": [
    "judge/p1",
    "judge/operator-needed"
  ],
  "piotr_fit": "high",
  "priority": "p1",
  "rationale_summary": "This is P1 operator-shaped architecture because supervised auto-heal needs a clear gate between runtime evidence, policy verdicts, and explicit approval.",
  "reach": 5,
  "recommended_next_action": "operator_needed",
  "rerun_reason": "no_prior_judgment",
  "schema": "openclaw.judge.v0",
  "target": {
    "kind": "issue",
    "number": 68,
    "repo": "pdurlej/patchwarden"
  },
  "target_snapshot": {
    "body_hash": "sha256:f425c06c629e822ea5a0087f8159ba26db21f94c24a582654e09b32add71b0ab",
    "commit_count": null,
    "evidence_hash": "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "head_sha": null,
    "labels": [],
    "labels_hash": "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "state": "open",
    "title_hash": "sha256:0cf9659e61fd3afc8764de861dc835bfcd929ada6b67d6704d408b99174fc78a",
    "updated_at": "2026-06-08T19:59:48+02:00"
  },
  "top_caveat": "Keep repair execution out of Patchwarden and require explicit operator approval for restarts, deploys, auth, network, or destructive actions."
}
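One detail in the snapshot above can be checked locally: the labels_hash and evidence_hash values are identical, and both equal the SHA-256 digest of empty input, which is consistent with the empty labels list. How openclaw.judge.v0 actually derives these hashes is not stated in this thread, so the snippet below only verifies that property of the recorded values and assumes nothing about the hashing scheme.

```python
import hashlib

# Values copied from the target_snapshot in the judge comment.
snapshot = {
    "labels": [],
    "labels_hash": "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "evidence_hash": "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
}

# SHA-256 of zero bytes, prefixed the same way as the snapshot fields.
empty_digest = "sha256:" + hashlib.sha256(b"").hexdigest()

print(snapshot["labels_hash"] == empty_digest)    # True
print(snapshot["evidence_hash"] == empty_digest)  # True
```

A consumer comparing snapshots across judge reruns could therefore treat this digest as "nothing was hashed", though that reading is an inference from the values, not documented behavior.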

Collaborator

Recommend close — acceptance met (operator decision: judge/operator-needed)

The supervised runtime-repair gate this issue scoped is built, shipped, and live in production. Mapping #68's acceptance criteria to delivered work:

| #68 acceptance | Status |
|---|---|
| Patchwarden docs state the Iskra/Patchwarden repair boundary clearly | ✅ docs/operations/openclaw-runtime-repair-evaluator.md, autoheal-roadmap.md, iskra-openclaw-evidence-bundle-spec.md |
| A fixture from Iskra evidence can be evaluated without touching production | ✅ patchwarden runtime-repair-check (evaluator #70): read-only, no network |
| Missing runtime-truth-report / openclaw-upgrade-simulator / autoheal-dashboard blocks runtime-maintenance candidates | ✅ fail-closed blocked on missing/stale evidence |
| Runtime mutation stays needs_human even when evidence is green | ✅ mutation classes cap at needs_human (verified e2e in prod) |
| #65 linked as a child slice for deploy drift | ✅ deploy_drift_probe is one repair class, slice A #77 |

Beyond the original "first slice", the whole arc shipped: evaluator #70 + slices A #77 (plan-object) / B #78 (TOML repair classes) / D #76 (schema contract), and Slice C wired it live in iskra-openclaw#452 (closed 2026-06-10): iskra-runtime-repair-gate.service + hourly timer, e2e-verified (exit 0 / eligible_repair_dry_run, D20 boundaries held in production).

Remaining work is tracked separately, not part of #68's acceptance:

  • richer real-evidence model → #102 (Slice E, now unblocked)
  • apply-path question → parked (M3+, per autoheal-roadmap.md)

Since this is judge/operator-needed, leaving the actual close to the operator. Recommend: close as done, pointing at #102 for the next slice.

(Summary from the 2026-06-16 docs-maturity wave, item #11.)
