Architecture principle: thin observable agent harness, not coordination cathedral #717

Open
opened 2026-06-04 23:14:33 +02:00 by Iskra · 2 comments
Collaborator

Source

https://snowan.gitbook.io/study-notes/ai-blogs/how-to-build-agent-harness?utm_source=chatgpt.com

Why this matters

The article frames a useful invariant for agent systems:

Agent = model + harness.

The failure mode it describes is very close to our current risk: demos work, but production agents drift, loop, over-coordinate, lose goals, burn context, or declare completion without evidence. These are usually harness or runtime problems, not only model problems.

Proposed platform principle

The platform should prefer a thin, observable, deletable harness over a large coordination cathedral.

Working doctrine:

  1. Start from the simplest reliable master loop.
  2. Add middleware only in response to observed failure modes.
  3. Keep tools few, atomic, robust, and easy to audit.
  4. Put state into explicit artifacts/checkpoints, not into hidden prompt fog.
  5. Require verification gates before completion claims.
  6. Make components removable: every layer should justify its ongoing cost.
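
The doctrine above can be sketched as a single loop. This is a minimal illustration, not the platform's implementation: `Checkpoint`, `run_loop`, `act`, and `verify` are hypothetical names introduced here, and the JSON file on disk stands in for whatever explicit artifact the platform would actually use. It shows points 1, 4, and 5: one master loop, state written to an inspectable checkpoint after every step, and completion set only by a verification gate.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Callable


@dataclass
class Checkpoint:
    """Explicit, inspectable state (hypothetical shape, for illustration only)."""
    goal: str
    step: int = 0
    notes: list[str] = field(default_factory=list)
    done: bool = False


def load(path: Path, goal: str) -> Checkpoint:
    # Resume from the artifact on disk if it exists; otherwise start fresh.
    if path.exists():
        return Checkpoint(**json.loads(path.read_text()))
    return Checkpoint(goal=goal)


def save(cp: Checkpoint, path: Path) -> None:
    path.write_text(json.dumps(asdict(cp), indent=2))


def run_loop(
    goal: str,
    act: Callable[[Checkpoint], str],      # one agent step; returns a note
    verify: Callable[[Checkpoint], bool],  # verification gate
    path: Path,
    max_steps: int = 10,
) -> Checkpoint:
    cp = load(path, goal)
    # Bounded loop: the step budget guards against endless looping.
    while not cp.done and cp.step < max_steps:
        cp.notes.append(act(cp))
        cp.step += 1
        # Completion is only ever set by the gate, never claimed by the step.
        cp.done = verify(cp)
        # Persist after every step so state lives in the artifact.
        save(cp, path)
    return cp
```

In this sketch, `act` would wrap the model call and `verify` would check evidence such as passing tests or an existing file. Deleting the checkpoint file resets the run, which keeps the state layer as removable as point 6 asks.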

Anti-pattern to guard against

Building systems to remember that other systems should remember that another system has a checkpoint.

Or shorter:

Do not build a federal government when the user needed a shelf.

Acceptance sketch

  • Document this as a platform architecture principle.
  • Identify two or three places in the current platform where coordination layers are heavier than the problem they solve.
  • Propose one simplification or deletion candidate.
  • Keep the principle general: it should apply beyond Iskra.
Author
Collaborator

Companion Iskra/OpenClaw doctrine issue: https://git.pdurlej.com/pdurlej/iskra-openclaw/issues/421
Author
Collaborator

Iskra judgment

| Field | Value |
| --- | --- |
| Target | `pdurlej/platform#issue#717` |
| Priority | p2 |
| Action | observe |
| Scores | reach 4 / impact 4 / confidence 5 |
| Piotr fit | high |
| Effort | medium |
| Labels | `judge/p2` |
| Judge | `iskra` via `openclaw` |

Rationale: This is P2 architecture guidance because it captures a useful harness discipline for avoiding overbuilt agent coordination while keeping evidence and verification central.

Caveat: Keep it as an architecture principle until a concrete failure mode justifies implementation work.

Structured openclaw.judge.v0 payload
<!-- openclaw.judge.v0 -->
{
  "confidence": 5,
  "effort_hint": "medium",
  "escalation": {
    "kind": "none",
    "reason": ""
  },
  "evidence_refs": [
    {
      "note": "Issue proposes a platform principle of thin, observable, deletable agent harnesses rather than large coordination layers.",
      "type": "forgejo",
      "value": "issue-title-body-labels-and-target-snapshot"
    },
    {
      "note": "Body frames production agent failures such as drift, looping, lost goals, context burn, and unevidenced completion as harness/runtime problems.",
      "type": "forgejo",
      "value": "issue-body-why-this-matters"
    },
    {
      "note": "Body lists doctrine points around simple master loops, explicit artifacts, few auditable tools, and verification gates before completion claims.",
      "type": "forgejo",
      "value": "issue-body-working-doctrine"
    }
  ],
  "impact": 4,
  "judge_actor": {
    "name": "iskra",
    "runtime": "openclaw"
  },
  "judged_at": "2026-06-12T01:13:00Z",
  "labels_to_apply": [
    "judge/p2"
  ],
  "piotr_fit": "high",
  "priority": "p2",
  "rationale_summary": "This is P2 architecture guidance because it captures a useful harness discipline for avoiding overbuilt agent coordination while keeping evidence and verification central.",
  "reach": 4,
  "recommended_next_action": "observe",
  "rerun_reason": "no_prior_judgment",
  "schema": "openclaw.judge.v0",
  "target": {
    "kind": "issue",
    "number": 717,
    "repo": "pdurlej/platform"
  },
  "target_snapshot": {
    "body_hash": "sha256:c20de5ca60c419a2e5f11bc3dcb287693bd09c463750d2f12a7e5509e140001f",
    "commit_count": null,
    "evidence_hash": "sha256:686b3cb01feae43618b6dd12882b18bd7606b0bd4b85d8fb1cc1518a00c59cd4",
    "head_sha": null,
    "labels": [],
    "labels_hash": "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "state": "open",
    "title_hash": "sha256:ac10799a7fab382d13e72eeb24a14d8f91f8dcfd9dfa8e1140ba3cf71300febe",
    "updated_at": "2026-06-04T23:14:33+02:00"
  },
  "top_caveat": "Keep it as an architecture principle until a concrete failure mode justifies implementation work."
}
<!-- /openclaw.judge.v0 -->
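
Because the payload sits between HTML comment markers, the block as posted is not directly parseable as JSON. A minimal sketch of reading it back out of a comment body follows. `extract_judgment` and `summarize` are hypothetical helpers introduced here, not part of OpenClaw, and the only checks made are on fields visible in the payload above.

```python
import json
import re

# The opening and closing markers that wrap the payload in the comment body.
MARKER = re.compile(
    r"<!-- openclaw\.judge\.v0 -->(.*?)<!-- /openclaw\.judge\.v0 -->",
    re.DOTALL,
)


def extract_judgment(comment_text: str) -> dict:
    """Return the JSON object found between the judge markers."""
    match = MARKER.search(comment_text)
    if match is None:
        raise ValueError("no openclaw.judge.v0 block found")
    payload = json.loads(match.group(1))
    # Assumption: the "schema" field identifies the payload version.
    if payload.get("schema") != "openclaw.judge.v0":
        raise ValueError("unexpected schema value")
    return payload


def summarize(payload: dict) -> str:
    """One-line summary built only from fields shown in the payload."""
    target = payload["target"]
    return (
        f'{target["repo"]}#{target["number"]}: '
        f'{payload["priority"]} / {payload["recommended_next_action"]}'
    )
```

Applied to the payload above, `summarize` would give `pdurlej/platform#717: p2 / observe`.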