test(v0): jsonschema validation for CLI artifact outputs #62

Closed
opened 2026-05-28 13:37:39 +02:00 by claude · 1 comment
Collaborator

Goal

Close a real gap surfaced in the 2026-05-28 Cloud Opus 4.6 repo review: the four JSON Schemas under spec/schemas/ are rigorously defined but no test actually runs jsonschema.validate() against real CLI outputs. Schemas drift silently from code today.

Why this matters

  • Today: tests check individual fields (assert result.verdict["schema_version"] == "patchwarden.pr_verdict.v1") but a missing field in an artifact won't fail unless the test specifically reads that field.
  • Risk: schemas become documentation rather than contract.
  • Defense hardening per D21 — no new core capability, just enforces what the schemas already promise.

Scope

1) Add jsonschema to test extras only (NOT runtime dep)

In pyproject.toml:

[project.optional-dependencies]
test = [
  "pytest>=8.0",
  "jsonschema>=4.0",
]
dev = [
  "pytest>=8.0",
  "jsonschema>=4.0",
]

Critically: src/patchwarden/ MUST remain stdlib-only. jsonschema is opt-in for the test suite, never imported from runtime code. Verify by grepping src/patchwarden/ for import jsonschema post-PR — must return zero hits.
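
The grep check can also live in the test suite. Below is a minimal sketch, assuming the runtime package sits under src/patchwarden/; the helper name find_jsonschema_imports is hypothetical and not part of the repository. It walks the AST rather than matching text, so a `from jsonschema import ...` line is caught as well as a plain `import jsonschema`.

```python
import ast
from pathlib import Path


def find_jsonschema_imports(package_dir):
    """Return 'path:line' strings for every jsonschema import under package_dir."""
    hits = []
    for path in sorted(Path(package_dir).rglob("*.py")):
        tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                modules = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom):
                # node.module is None for "from . import x"
                modules = [node.module or ""]
            else:
                continue
            # Compare the top-level package so jsonschema.validators counts too.
            if any(name.split(".")[0] == "jsonschema" for name in modules):
                hits.append(f"{path}:{node.lineno}")
    return hits
```

A test could then assert that find_jsonschema_imports("src/patchwarden") returns an empty list, which keeps the stdlib-only rule enforced on every run instead of only at review time.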

2) New file tests/test_artifact_schema_contract.py

Loads each schema from spec/schemas/*.json once at module level and validates representative CLI outputs against it:

| Schema | Validate against |
|---|---|
| pr-verdict.schema.json | evaluate_pull_request() output (use tests/test_pr_check.py fixtures) |
| issue-verdict.schema.json | evaluate_issue() output |
| review-artifact.schema.json | run_review() output (both empty-findings path and findings path; also soft-fail path with runtime_metadata.error_kind) |
| finding-resolution.schema.json | resolve_findings() output (PASS path + HOLD path + soft-fail path) |

Each schema gets at minimum: one valid happy-path test, one valid edge case (e.g. empty findings array, soft-fail), and one deliberately invalid payload that must raise jsonschema.ValidationError (catch-the-catcher meta-test so we know validation actually runs).
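
The three-test pattern described above might look roughly like this. It is a sketch only: TOY_SCHEMA and its fields are hypothetical stand-ins for the real files under spec/schemas/, and load_schemas is an assumed helper name. The real test would feed evaluate_pull_request(), run_review() and the other outputs into the same calls.

```python
import json
import unittest
from pathlib import Path

try:
    import jsonschema  # test-only dependency, installed via the test extras
except ImportError:  # keep the module importable when the extras are absent
    jsonschema = None


def load_schemas(schema_dir):
    """Parse every *.json file in schema_dir once, keyed by file name."""
    return {
        path.name: json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(Path(schema_dir).glob("*.json"))
    }


# Hypothetical stand-in; the real test would use load_schemas("spec/schemas").
TOY_SCHEMA = {
    "type": "object",
    "required": ["schema_version", "findings"],
    "properties": {
        "schema_version": {"const": "toy.artifact.v1"},
        "findings": {"type": "array"},
    },
}


@unittest.skipIf(jsonschema is None, "jsonschema is not installed")
class ToyArtifactContractTest(unittest.TestCase):
    def test_happy_path(self):
        artifact = {"schema_version": "toy.artifact.v1", "findings": [{"id": "F1"}]}
        jsonschema.validate(instance=artifact, schema=TOY_SCHEMA)

    def test_empty_findings_edge_case(self):
        artifact = {"schema_version": "toy.artifact.v1", "findings": []}
        jsonschema.validate(instance=artifact, schema=TOY_SCHEMA)

    def test_missing_field_is_rejected(self):
        # Meta-test: proves the validator actually runs and can fail.
        broken = {"schema_version": "toy.artifact.v1"}
        with self.assertRaises(jsonschema.ValidationError):
            jsonschema.validate(instance=broken, schema=TOY_SCHEMA)
```

The skipIf guard is a design choice of this sketch, not something the issue requires. It keeps `unittest discover` green on machines without the test extras, at the cost of silently skipping the contract check there.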

3) Strengthen tests/test_boring_pr_lifecycle.py

Add a single jsonschema.validate() call after each CLI emit so the existing lifecycle test doubles as a schema contract check.
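
In a long lifecycle test it can help to see every violation at once rather than only the first. The helper below is one possible variation on the plain jsonschema.validate() call, not a replacement for it. The name schema_violations is hypothetical, and the sketch assumes jsonschema>=4.0 from the test extras.

```python
try:
    from jsonschema.validators import validator_for  # test-only dependency
except ImportError:
    validator_for = None


def schema_violations(payload, schema):
    """Return one readable message per violation; an empty list means valid."""
    if validator_for is None:
        raise RuntimeError("jsonschema is required; install the test extras")
    # Pick the validator class from "$schema", or the library default if absent.
    validator_cls = validator_for(schema)
    # Fail loudly if the schema itself is malformed.
    validator_cls.check_schema(schema)
    messages = []
    for error in validator_cls(schema).iter_errors(payload):
        location = "/".join(str(part) for part in error.absolute_path) or "<root>"
        messages.append(f"{location}: {error.message}")
    return sorted(messages)
```

Inside the lifecycle test this could be used as `self.assertEqual(schema_violations(artifact, schema), [])` after each CLI emit, so a failure lists every offending field in the assertion diff.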

4) Bonus (close at the same time): unit test for _cloud_review_mode_from_lane()

src/patchwarden/cli.py has the helper:

def _cloud_review_mode_from_lane(lane):
    if lane.cloud_review == "allowed": return "cloud"
    if lane.cloud_review == "required-with-redaction": return "redacted"
    return "local"

No unit test exists; this value lands in the review artifact. Add tests/test_cli_cloud_review_mode.py with three cases (allowed, required-with-redaction, anything else → local).

Acceptance criteria

  • PYTHONPATH=src python3 -m unittest discover tests → all green (156 baseline + new tests; target ≥165).
  • pip install -e ".[test]" installs jsonschema from the test extras (quoted so that shells such as zsh do not treat the brackets as a glob pattern).
  • grep -r "import jsonschema" src/patchwarden/ returns zero hits (stdlib-only rule preserved).
  • Each of the four schemas has both a happy-path validate call and a deliberately-invalid case proving the validator works.

No-go

  • Do NOT add jsonschema to runtime deps. Stdlib-only in src/patchwarden/ per cousin discipline.
  • Do NOT change schema version strings — schemas are versioned contracts. If a real validation failure surfaces, file a separate issue, don't loosen the schema to make it pass.
  • Do NOT bump schema_version on any schema (parked per D21).
  • Do NOT switch test framework — keep unittest (pytest is opt-in only per #22).

Spec sources

  • spec/schemas/*.json — the four schemas to validate against
  • tests/test_boring_pr_lifecycle.py — lifecycle test where the strengthening lands
  • tests/test_resolve_findings.py + tests/test_review_run.py — fixtures for soft-fail validation cases
  • src/patchwarden/cli.py: the _cloud_review_mode_from_lane() helper for the bonus test
  • 2026-05-28 Cloud Opus 4.6 repo analysis, §II.2 "Brak walidacji schematów w testach" ("No schema validation in tests")

Status flow

ready-for-agent (now) → agent claims → PR → operator review → merge


Created 2026-05-28 by claude (Patchwarden dedicated thread) after Cloud Opus 4.6 repository review surfaced the gap. Atomic enough for any cousin (e.g. Cloud Opus 4.6 via Antigravity) to pick up.

Collaborator

{
  "confidence": 5,
  "effort_hint": "small",
  "escalation": {
    "kind": "patchwarden_review",
    "reason": "Schema-contract hardening should stay aligned with Patchwarden artifact guarantees."
  },
  "evidence_refs": [
    {
      "note": "Issue adds jsonschema validation tests for Patchwarden CLI artifact outputs.",
      "type": "forgejo",
      "value": "issue-title-body-labels-and-target-snapshot"
    },
    {
      "note": "Body states schemas currently risk drifting into documentation rather than executable contracts.",
      "type": "forgejo",
      "value": "issue-body-risk"
    },
    {
      "note": "Scope keeps jsonschema in test extras only while preserving stdlib-only runtime code.",
      "type": "forgejo",
      "value": "issue-body-scope"
    }
  ],
  "impact": 4,
  "judge_actor": {
    "name": "iskra",
    "runtime": "openclaw"
  },
  "judged_at": "2026-06-08T01:03:00Z",
  "labels_to_apply": [
    "judge/p2",
    "judge/patchwarden-candidate"
  ],
  "piotr_fit": "high",
  "priority": "p2",
  "rationale_summary": "This is P2 Patchwarden hardening because executable schema tests turn artifact formats into real contracts instead of passive docs.",
  "reach": 3,
  "recommended_next_action": "patchwarden_candidate",
  "rerun_reason": "no_prior_judgment",
  "schema": "openclaw.judge.v0",
  "target": {
    "kind": "issue",
    "number": 62,
    "repo": "pdurlej/patchwarden"
  },
  "target_snapshot": {
    "body_hash": "sha256:b0283b9529b276f9ba2ee810169eb3a9141f42b45d5ffa84ca0f8f82f0354c05",
    "commit_count": null,
    "evidence_hash": "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "head_sha": null,
    "labels": [
      "agent/gemini",
      "area:v0-core",
      "flow/ready",
      "kind:implementation",
      "ready-for-agent",
      "size/small"
    ],
    "labels_hash": "sha256:f3fa2375e9f4d4580b4900454ec76e039cbe581cac183e63b96935b54256c649",
    "state": "open",
    "title_hash": "sha256:7e6219fb12c3b159eba6c47c209dc8279b1aa06e540805ac8d72b37cabcd9b1b",
    "updated_at": "2026-05-28T14:04:52+02:00"
  },
  "top_caveat": "Keep jsonschema as a test-only dependency and do not add runtime dependencies to core code."
}
