Architecture principle: thin observable agent harness, not coordination cathedral #717

Open

opened 2026-06-04 23:14:33 +02:00 by Iskra · 2 comments

Iskra commented

2026-06-04 23:14:33 +02:00

Collaborator

Source

https://snowan.gitbook.io/study-notes/ai-blogs/how-to-build-agent-harness?utm_source=chatgpt.com

Why this matters

The article frames a useful invariant for agent systems:

Agent = model + harness.

The failure mode it describes is very close to our current risk: demos work, but production agents drift, loop, over-coordinate, lose goals, burn context, or declare completion without evidence. That is usually a harness/runtime problem, not only a model problem.

Proposed platform principle

The platform should prefer a thin, observable, deletable harness over a large coordination cathedral.

Working doctrine:

1. Start from the simplest reliable master loop.
2. Add middleware only in response to observed failure modes.
3. Keep tools few, atomic, robust, and easy to audit.
4. Put state into explicit artifacts/checkpoints, not into hidden prompt fog.
5. Require verification gates before completion claims.
6. Make components removable: every layer should justify its ongoing cost.
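
To make the doctrine concrete, here is a minimal sketch of a master loop with an explicit checkpoint artifact and a verification gate. Everything in it is hypothetical and for illustration only: `Checkpoint`, `run`, `stub_model`, the `write` tool, and the JSON checkpoint file are not existing platform code, and the stub stands in for a real model call.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class Checkpoint:
    """Explicit state artifact: everything the loop knows lives here."""
    goal: str
    step: int = 0
    history: list = field(default_factory=list)
    done: bool = False


def save(cp, path):
    # State goes to a file that can be inspected, diffed, or deleted.
    path.write_text(json.dumps(asdict(cp), indent=2))


def load(path):
    return Checkpoint(**json.loads(path.read_text()))


def run(goal, model, tools, verify, path, max_steps=8):
    """Simplest master loop: ask the model, run one tool, checkpoint, repeat."""
    cp = load(path) if path.exists() else Checkpoint(goal=goal)
    while not cp.done and cp.step < max_steps:
        action, arg = model(cp)
        cp.step += 1
        if action == "finish":
            # Verification gate: a completion claim only counts with evidence.
            cp.done = verify(cp)
            cp.history.append("finish:accepted" if cp.done else "finish:rejected")
        else:
            result = tools[action](arg)
            cp.history.append(f"{action}({arg}) -> {result}")
        save(cp, path)
    return cp


def stub_model(cp):
    # Hypothetical stand-in for a model: claims completion too early once.
    if cp.step == 0:
        return ("finish", "")
    if not any(h.startswith("write(") for h in cp.history):
        return ("write", "shelf")
    return ("finish", "")


def has_evidence(cp):
    # Assumed gate for this demo: at least one tool result must be recorded.
    return any(h.startswith("write(") for h in cp.history)


def demo():
    tools = {"write": lambda arg: f"built {arg}"}
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "checkpoint.json"
        return run("build a shelf", stub_model, tools, has_evidence, path)


if __name__ == "__main__":
    for line in demo().history:
        print(line)
```

In this sketch the first completion claim is rejected because no evidence exists yet; the loop only stops once the gate passes or `max_steps` is reached. There is deliberately no middleware: under the doctrine, a loop detector or context trimmer would be added only after the corresponding failure is actually observed.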

Anti-pattern to guard against

Building systems to remember that other systems should remember that another system has a checkpoint.

Or shorter:

Do not build a federal government when the user needed a shelf.

Acceptance sketch

- Document this as a platform architecture principle.
- Identify 2-3 places in the current platform where coordination layers are heavier than the problem they solve.
- Propose one simplification or deletion candidate.
- Keep the principle general: it should apply beyond Iskra.


Iskra commented

2026-06-04 23:14:33 +02:00

Author

Collaborator

Companion Iskra/OpenClaw doctrine issue: pdurlej/iskra-openclaw#421 (https://git.pdurlej.com/pdurlej/iskra-openclaw/issues/421)

Iskra commented

2026-06-12 03:14:25 +02:00

Author

Collaborator

Iskra judgment

| Field | Value |
| --- | --- |
| Target | `pdurlej/platform#issue#717` |
| Priority | p2 |
| Action | observe |
| Scores | reach 4 / impact 4 / confidence 5 |
| Piotr fit | high |
| Effort | medium |
| Labels | `judge/p2` |
| Judge | `iskra` via `openclaw` |

Rationale: This is P2 architecture guidance because it captures a useful harness discipline for avoiding overbuilt agent coordination while keeping evidence and verification central.

Caveat: Keep it as an architecture principle until a concrete failure mode justifies implementation work.

Structured openclaw.judge.v0 payload

<!-- openclaw.judge.v0 -->
{
  "confidence": 5,
  "effort_hint": "medium",
  "escalation": {
    "kind": "none",
    "reason": ""
  },
  "evidence_refs": [
    {
      "note": "Issue proposes a platform principle of thin, observable, deletable agent harnesses rather than large coordination layers.",
      "type": "forgejo",
      "value": "issue-title-body-labels-and-target-snapshot"
    },
    {
      "note": "Body frames production agent failures such as drift, looping, lost goals, context burn, and unevidenced completion as harness/runtime problems.",
      "type": "forgejo",
      "value": "issue-body-why-this-matters"
    },
    {
      "note": "Body lists doctrine points around simple master loops, explicit artifacts, few auditable tools, and verification gates before completion claims.",
      "type": "forgejo",
      "value": "issue-body-working-doctrine"
    }
  ],
  "impact": 4,
  "judge_actor": {
    "name": "iskra",
    "runtime": "openclaw"
  },
  "judged_at": "2026-06-12T01:13:00Z",
  "labels_to_apply": [
    "judge/p2"
  ],
  "piotr_fit": "high",
  "priority": "p2",
  "rationale_summary": "This is P2 architecture guidance because it captures a useful harness discipline for avoiding overbuilt agent coordination while keeping evidence and verification central.",
  "reach": 4,
  "recommended_next_action": "observe",
  "rerun_reason": "no_prior_judgment",
  "schema": "openclaw.judge.v0",
  "target": {
    "kind": "issue",
    "number": 717,
    "repo": "pdurlej/platform"
  },
  "target_snapshot": {
    "body_hash": "sha256:c20de5ca60c419a2e5f11bc3dcb287693bd09c463750d2f12a7e5509e140001f",
    "commit_count": null,
    "evidence_hash": "sha256:686b3cb01feae43618b6dd12882b18bd7606b0bd4b85d8fb1cc1518a00c59cd4",
    "head_sha": null,
    "labels": [],
    "labels_hash": "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "state": "open",
    "title_hash": "sha256:ac10799a7fab382d13e72eeb24a14d8f91f8dcfd9dfa8e1140ba3cf71300febe",
    "updated_at": "2026-06-04T23:14:33+02:00"
  },
  "top_caveat": "Keep it as an architecture principle until a concrete failure mode justifies implementation work."
}
<!-- /openclaw.judge.v0 -->
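
One detail in the snapshot can be checked independently: the recorded `labels_hash` equals the SHA-256 digest of empty input, which is consistent with the empty `labels` list at judgment time. The sketch below shows how such a hash could be used to detect a stale judgment. How openclaw actually serializes labels before hashing is not stated in this thread; the sorted, newline-joined encoding and the names `labels_hash` and `snapshot_is_stale` are assumptions for illustration.

```python
import hashlib

# Value recorded under target_snapshot.labels_hash in the payload above.
RECORDED_LABELS_HASH = (
    "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
)


def labels_hash(labels):
    # Assumption: labels are sorted, newline-joined, and UTF-8 encoded.
    # For an empty list this hashes the empty byte string.
    joined = "\n".join(sorted(labels))
    return "sha256:" + hashlib.sha256(joined.encode("utf-8")).hexdigest()


def snapshot_is_stale(recorded, current_labels):
    """True when the labels changed since the judgment was recorded."""
    return labels_hash(current_labels) != recorded


if __name__ == "__main__":
    print(snapshot_is_stale(RECORDED_LABELS_HASH, []))
    print(snapshot_is_stale(RECORDED_LABELS_HASH, ["judge/p2"]))
```

Under this assumed encoding, the snapshot matches an unlabeled issue and becomes stale once `judge/p2` is applied, which is the kind of explicit, checkable evidence the issue argues for.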


Iskra added the judge/p2 label 2026-06-12 03:14:25 +02:00