test(v0): jsonschema validation for CLI artifact outputs (closes #62) #71

Merged
pdurlej merged 1 commit from gemini/jsonschema-artifact-validation into main 2026-06-08 18:15:05 +02:00
Collaborator

Authored by gemini (Gemini 3.1 Pro via Antigravity), implemented autonomously under claude's supervision. claude reviewed the diff, verified tests independently, fixed the commit identity (worktree shared-config footgun), and is opening this PR. This is the first cousin contribution under the gemini identity. 🎉

What

Implements #62 — wires jsonschema.validate() against real CLI artifact outputs so the spec/schemas/ contracts are actually enforced, not just documentation.

Changes (4 files, +193, tests/packaging only — zero src/ logic touched)

  • pyproject.toml: jsonschema>=4.0 added to test + dev extras only.
  • tests/test_artifact_schema_contract.py (new): validates evaluate_pull_request, evaluate_issue, run_review (empty + findings + soft-fail), and registry outputs against their schemas; each schema gets a happy-path validate + a deliberately-invalid case proving the validator runs.
  • tests/test_boring_pr_lifecycle.py: jsonschema.validate() after each emit in all 5 scenarios.
  • tests/test_cli_cloud_review_mode.py (new): _cloud_review_mode_from_lane() — allowed→cloud, required-with-redaction→redacted, else→local.
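
The three-way mapping covered by tests/test_cli_cloud_review_mode.py can be sketched roughly as follows. This is a hypothetical stand-in, not the code under src/: the PR does not show the real signature of _cloud_review_mode_from_lane() or the shape of a lane, so the function name cloud_review_mode_from_lane_sketch and the dict keys cloud_review and redaction are assumptions made purely for illustration.

```python
import unittest


def cloud_review_mode_from_lane_sketch(lane: dict) -> str:
    """Hypothetical stand-in for _cloud_review_mode_from_lane().

    Assumed lane shape (not confirmed by the PR):
      {"cloud_review": "allowed" | "required" | ..., "redaction": bool}
    """
    policy = lane.get("cloud_review")
    if policy == "allowed":
        return "cloud"
    if policy == "required" and lane.get("redaction"):
        return "redacted"
    # Everything else, including a missing or unknown policy, stays local.
    return "local"


class CloudReviewModeSketchTest(unittest.TestCase):
    def test_allowed_maps_to_cloud(self):
        lane = {"cloud_review": "allowed"}
        self.assertEqual(cloud_review_mode_from_lane_sketch(lane), "cloud")

    def test_required_with_redaction_maps_to_redacted(self):
        lane = {"cloud_review": "required", "redaction": True}
        self.assertEqual(cloud_review_mode_from_lane_sketch(lane), "redacted")

    def test_everything_else_maps_to_local(self):
        for lane in ({}, {"cloud_review": "required"}, {"cloud_review": "denied"}):
            self.assertEqual(cloud_review_mode_from_lane_sketch(lane), "local")
```

Defaulting to local in the fall-through branch keeps an unrecognized lane value from ever selecting a cloud mode by accident, which appears to be the intent of the "else→local" rule.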

Verified (claude, independently)

  • PYTHONPATH=src python3 -m unittest discover tests → 168/168 OK (160 baseline + 8 new).
  • grep -rn "import jsonschema" src/patchwarden/ → zero hits (stdlib-only preserved; jsonschema is test-only).
  • No schema files modified; no schema_version bumped.
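
If the stdlib-only guarantee ever needs to be enforced from inside the test suite rather than by a manual grep, one possible approach is to walk the package with the standard-library ast module. The sketch below is an assumption-laden illustration, not something this PR adds; the helper name files_importing is hypothetical.

```python
import ast
from pathlib import Path


def files_importing(package_root: Path, module: str) -> list[Path]:
    """Return .py files under package_root that import `module` or a submodule."""
    hits = []
    for path in sorted(package_root.rglob("*.py")):
        tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.level == 0:
                # Only absolute imports; relative ones refer to local modules.
                names = [node.module or ""]
            else:
                continue
            if any(n == module or n.startswith(module + ".") for n in names):
                hits.append(path)
                break  # one hit per file is enough
    return hits
```

Unlike a grep for the literal string "import jsonschema", this also catches the form "from jsonschema import validate". A test could then assert that files_importing(Path("src/patchwarden"), "jsonschema") is an empty list.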

⚠️ Findings surfaced (the whole point of #62 — schemas drifted from code)

The new validation immediately caught two real contract drifts. Per #62's rule, gemini did NOT loosen the schemas; it documented the gaps:

  1. review-artifact.schema.json rejects the soft-fail runtime_metadata fields (error_kind, error_detail, fell_back, model_used) that PR #54 added to review_run output. The schema predates #54 and is now stricter than the code. The soft-fail test asserts the current (broken) rejection so the suite is honest about the drift.
  2. finding-resolution.schema.json does not exist: resolve_findings() emits patchwarden.finding_resolution.v1 but there is no schema file for it. The fourth schema slot was covered with the registry schema instead.

→ Tracked for fix in a follow-up issue (see linked). Once fixed, the soft-fail assertRaises flips to a positive validate().
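
The first drift can be reproduced in miniature. The sketch below assumes the real schema rejects unknown runtime_metadata keys through "additionalProperties": false; that mechanism, the reduced schema, the model field, and all sample values are hypothetical, since review-artifact.schema.json itself is not reproduced here. Only the four soft-fail field names come from the PR.

```python
import unittest

import jsonschema

# Hypothetical, heavily reduced stand-in for the runtime_metadata portion of
# review-artifact.schema.json. The closed-object rule is an assumption.
RUNTIME_METADATA_SCHEMA = {
    "type": "object",
    "properties": {"model": {"type": "string"}},
    "required": ["model"],
    "additionalProperties": False,
}


class SoftFailDriftSketch(unittest.TestCase):
    def test_happy_path_validates(self):
        # validate() returns None on success and raises on failure.
        jsonschema.validate({"model": "example-model"}, RUNTIME_METADATA_SCHEMA)

    def test_soft_fail_fields_are_rejected_today(self):
        soft_fail = {
            "model": "example-model",
            "error_kind": "timeout",
            "error_detail": "upstream did not answer",
            "fell_back": True,
            "model_used": "fallback-model",
        }
        # Pins the current drift: the extra keys make validation fail.
        with self.assertRaises(jsonschema.ValidationError):
            jsonschema.validate(soft_fail, RUNTIME_METADATA_SCHEMA)
```

Once the schema declares the four soft-fail fields, the second test would start failing, which is the signal to replace the assertRaises block with a plain validate() call.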

Cousin-discipline notes

  • unittest, not pytest (pytest stays opt-in).
  • Commit authored + committed by gemini (fixed after a worktree shared-.git/config identity clobber during parallel cousin work — lesson logged).

Closes #62.

Added jsonschema>=4.0 to test/dev extras.
Added test_artifact_schema_contract.py to validate the 4 schemas in spec/schemas/ (note: finding-registry.schema.json replaces the non-existent finding-resolution.schema.json).
Strengthened test_boring_pr_lifecycle.py with schema validation.
Added test_cli_cloud_review_mode.py.
Finding: review-artifact.schema.json soft-fail path validation fails because runtime_metadata schema rejects error_kind, error_detail, fell_back, and model_used. Wrapped with assertRaises as instructed.

Final test count: 168.

Co-Authored-By: Gemini 3.1 Pro (Antigravity) <noreply@antigravity.google>
pdurlej deleted branch gemini/jsonschema-artifact-validation 2026-06-08 18:15:05 +02:00
Reference
pdurlej/patchwarden!71