feat(evidence): add web smoke contract #163

Merged
pdurlej merged 2 commits from codex/web-smoke-evidence into main 2026-06-24 00:18:53 +02:00
Owner

Summary

  • add read-only web-smoke-check for Steel Browser or compatible browser-smoke producer reports
  • define patchwarden.web_smoke.v1 schema/example with URL scenarios, screenshot/trace refs, console/network summaries, failure reason, redaction metadata, and top-level evidence[]
  • extend evidence-check/evidence-bundle-plan with a web evidence class and web-smoke PR-class requirement
  • update status/docs/vision ledger so #151 is represented as a delivered contract while live browser producer wiring remains external

Boundary

Patchwarden still does not create browser sessions, visit URLs, run Playwright/Steel/Selenium, upload artifacts, post statuses, approve, or merge. It only normalizes and evaluates the producer artifact for the exact head.
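
To make the boundary concrete, here is a minimal sketch of what "evaluates the producer artifact for the exact head" could look like. It is not the code in this PR: the function name, the field names (`head_sha`, `status`, `scenarios`), and the reason strings are assumptions chosen for illustration, and treating an empty scenario list as a failure is likewise an assumed fail-closed choice.

```python
from typing import Any, Dict, List, Mapping


def evaluate_web_smoke(artifact: Mapping[str, Any], expected_head: str) -> Dict[str, Any]:
    """Evaluate a producer-written web-smoke report without touching a browser.

    Hypothetical sketch: field names and reason strings are illustrative,
    not the patchwarden.web_smoke.v1 contract itself.
    """
    reasons: List[str] = []

    # Head binding: evidence only counts for the exact commit under review.
    head = artifact.get("head_sha")
    if not head:
        reasons.append("head_missing")
    elif head != expected_head:
        reasons.append("head_mismatch")

    # Fail closed: anything other than an explicit success is a blocker.
    if artifact.get("status") != "success":
        reasons.append("not_successful")

    # Assumed rule: at least one scenario, and every scenario must succeed.
    scenarios = artifact.get("scenarios") or []
    if not scenarios or any(s.get("status") != "success" for s in scenarios):
        reasons.append("scenario_not_successful")

    return {"ok": not reasons, "reasons": reasons}
```

The function only reads a mapping that some external producer already wrote, so it opens no sessions, visits no URLs, and posts no statuses.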

Tests

  • PYTHONPATH=src:. python3 -m unittest \
      tests.test_web_smoke \
      tests.test_cli_web_smoke \
      tests.test_evidence_gate \
      tests.test_evidence_bundle_plan \
      tests.test_cli_status \
      tests.test_status_html \
      tests.test_docs_module_inventory \
      tests.test_artifact_schema_contract.ArtifactSchemaContractTests.test_web_smoke_schema_contract \
      tests.test_artifact_schema_contract.ArtifactSchemaContractTests.test_sandbox_smoke_schema_contract \
      tests.test_artifact_schema_contract.ArtifactSchemaContractTests.test_evidence_verdict_schema_contract \
      tests.test_artifact_schema_contract.ArtifactSchemaContractTests.test_evidence_bundle_plan_schema_contract \
      tests.test_artifact_schema_contract.ArtifactSchemaContractTests.test_schema_readme_lists_every_public_schema \
      tests.test_artifact_schema_contract.ArtifactSchemaContractTests.test_schema_readme_lists_every_public_example \
      tests.test_artifact_schema_contract.ArtifactSchemaContractTests.test_schema_readme_example_schema_columns_match_validation_contract \
      tests.test_artifact_schema_contract.ArtifactSchemaContractTests.test_public_schema_examples_validate_against_declared_schema \
    -> 45 tests OK
  • PYTHONPATH=src:. python3 -m unittest discover -s tests -> 684 tests OK
  • git diff --check -> OK

Closes #151

feat(evidence): add web smoke contract
All checks were successful
fallow-py / fallow-py-advisory (pull_request) Successful in 16s
0701cc7cbc
fix(evidence): keep web smoke slice within policy
All checks were successful
fallow-py / fallow-py-advisory (pull_request) Successful in 14s
c46bc8cb90
Author
Owner

Patchwarden dogfood before merge:

  • Head: c46bc8cb90d504f5cc5da1a3b114efbcbe046e0d
  • Changed files after repair: 20 (large-change blocker cleared)
  • Forgejo commit status: success (fallow-py / fallow-py-advisory)
  • Targeted suite after repair: 44 tests OK
  • Full suite after repair: PYTHONPATH=src:. python3 -m unittest discover -s tests -> 684 tests OK
  • contract-pr dry run: blocked only on patchwarden.pr.path_classification / governance_tier_path, expected for schema/status/module-policy surfaces
  • repair-instruction-check: ready, next action operator_required

The earlier large_change_set blocker was repaired by reducing the net PR diff to 20 files. This PR keeps Patchwarden read-only: it consumes web-smoke evidence but does not create browser sessions, visit URLs, upload artifacts, post statuses, approve, or merge.

pdurlej deleted branch codex/web-smoke-evidence 2026-06-24 00:18:53 +02:00
Collaborator

🏛️ Architect (loop) — resolves #151, safe to merge. Verified consumer-not-executor (the L1 boundary): docstring "consumes browser-smoke … does not create browser"; stdlib-only imports (no playwright/selenium/webdriver/subprocess/requests — urlsplit parses the artifact's URL, ALLOWED_PROVIDER_TYPES validates a field, neither executes); fail-closed + head-bound (web_smoke_head_missing / head_mismatch / not_successful / scenario_not_successful); external_write_allowed=false; sanitized. The evidence-check/evidence-bundle-plan web-class extension is additive and clean. 684 green. Same proven pattern as the #162 sandbox twin.
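
As a rough illustration of the "parses and validates, neither executes" point, the sketch below uses only the standard library. `urlsplit` is the real `urllib.parse` function, but the contents of `ALLOWED_PROVIDER_TYPES`, the helper name, and the problem strings are assumptions made for this example, not values taken from the PR.

```python
from typing import List
from urllib.parse import urlsplit

# Hypothetical allow-list; the real ALLOWED_PROVIDER_TYPES values are not shown in this PR page.
ALLOWED_PROVIDER_TYPES = frozenset({"steel-browser", "compatible-browser-smoke"})


def check_scenario_fields(provider_type: str, url: str) -> List[str]:
    """Return a list of problems found in two artifact fields (empty means clean)."""
    problems: List[str] = []

    # Membership test only: the provider is never imported, spawned, or contacted.
    if provider_type not in ALLOWED_PROVIDER_TYPES:
        problems.append("provider_type_not_allowed")

    # urlsplit is pure string parsing; it performs no DNS lookup and no request.
    try:
        parts = urlsplit(url)
    except ValueError:
        problems.append("url_unparseable")
        return problems

    if parts.scheme not in ("http", "https") or not parts.hostname:
        problems.append("url_not_http")
    if parts.username or parts.password:
        # Credentials embedded in a URL should not survive into stored evidence.
        problems.append("url_has_credentials")
    return problems
```

Because every step is a lookup or a string operation, a check of this shape stays on the consumer side of the boundary even when the artifact names a live site.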

With this, all three D28 closing loops are shipped (L1 sandbox #162 + web #163 · L2 #156 · L3 #161). Per self-usable-milestone.md the only remaining piece is the merge-actuation frontier (merge actuator + external-controller identity + loop runner) — still build-time-banned, and the one I'll review hardest when it lands. No changes requested. — claude (architect loop)

Reference
pdurlej/patchwarden!163