feat(evidence): add web smoke contract #163
pdurlej/patchwarden!163
Summary
- `web-smoke-check` for Steel Browser or compatible browser-smoke producer reports
- `patchwarden.web_smoke.v1` schema/example with URL scenarios, screenshot/trace refs, console/network summaries, failure reason, redaction metadata, and top-level `evidence[]`
- `evidence-check` / `evidence-bundle-plan` with a `web` evidence class and `web-smoke` PR-class requirement
Boundary
Patchwarden still does not create browser sessions, visit URLs, run Playwright/Steel/Selenium, upload artifacts, post statuses, approve, or merge. It only normalizes and evaluates the producer artifact for the exact head.
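To make the boundary concrete, here is a minimal sketch of what "evaluates the producer artifact for the exact head" could look like as a pure function. It is not the code in this PR: the field names `head_sha`, `status`, and `scenarios` are assumptions, and the fully expanded reason codes are inferred from the abbreviated list in the review. Treating an empty scenario list as a failure is also an assumption of the sketch.

```python
from typing import Any, Mapping


def evaluate_web_smoke(artifact: Mapping[str, Any], expected_head: str) -> dict:
    """Fail-closed, head-bound check over an already-produced artifact.

    Pure function: no browser session, no network access, no writes.
    Field names are hypothetical, not the real patchwarden.web_smoke.v1 schema.
    """
    reasons = []

    # Head binding: the evidence must name the exact commit under review.
    head = artifact.get("head_sha")
    if not head:
        reasons.append("web_smoke_head_missing")
    elif head != expected_head:
        reasons.append("web_smoke_head_mismatch")

    # Fail closed: anything other than an explicit success is a blocker.
    if artifact.get("status") != "success":
        reasons.append("web_smoke_not_successful")

    # Assumption: no scenarios at all counts as "not successful".
    scenarios = artifact.get("scenarios") or []
    if not scenarios or any(s.get("status") != "success" for s in scenarios):
        reasons.append("web_smoke_scenario_not_successful")

    return {
        "ok": not reasons,
        "reasons": reasons,
        "external_write_allowed": False,  # the evaluator never writes anywhere
    }
```

An empty or malformed artifact collects every applicable reason rather than passing by default, which is the property the fail-closed wording is after.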
Tests
- `PYTHONPATH=src:. python3 -m unittest tests.test_web_smoke tests.test_cli_web_smoke tests.test_evidence_gate tests.test_evidence_bundle_plan tests.test_cli_status tests.test_status_html tests.test_docs_module_inventory tests.test_artifact_schema_contract.ArtifactSchemaContractTests.test_web_smoke_schema_contract tests.test_artifact_schema_contract.ArtifactSchemaContractTests.test_sandbox_smoke_schema_contract tests.test_artifact_schema_contract.ArtifactSchemaContractTests.test_evidence_verdict_schema_contract tests.test_artifact_schema_contract.ArtifactSchemaContractTests.test_evidence_bundle_plan_schema_contract tests.test_artifact_schema_contract.ArtifactSchemaContractTests.test_schema_readme_lists_every_public_schema tests.test_artifact_schema_contract.ArtifactSchemaContractTests.test_schema_readme_lists_every_public_example tests.test_artifact_schema_contract.ArtifactSchemaContractTests.test_schema_readme_example_schema_columns_match_validation_contract tests.test_artifact_schema_contract.ArtifactSchemaContractTests.test_public_schema_examples_validate_against_declared_schema` -> 45 tests OK
- `PYTHONPATH=src:. python3 -m unittest discover -s tests` -> 684 tests OK
- `git diff --check` -> OK

Closes #151
Patchwarden dogfood before merge:
- `c46bc8cb90d504f5cc5da1a3b114efbcbe046e0d`
- `20` (large-change blocker cleared)
- `success` (fallow-py / fallow-py-advisory)
- `PYTHONPATH=src:. python3 -m unittest discover -s tests` -> 684 tests OK
- `contract-pr` dry run: blocked only on `patchwarden.pr.path_classification` / `governance_tier_path`, expected for schema/status/module-policy surfaces
- `repair-instruction-check`: `ready`, next action `operator_required`

The earlier `large_change_set` blocker was repaired by reducing the net PR diff to 20 files. This PR keeps Patchwarden read-only: it consumes web-smoke evidence but does not create browser sessions, visit URLs, upload artifacts, post statuses, approve, or merge.

🏛️ Architect ✅ (loop): resolves #151, safe to merge. Verified consumer-not-executor (the L1 boundary): docstring "consumes browser-smoke … does not create browser"; stdlib-only imports (no playwright/selenium/webdriver/subprocess/requests; `urlsplit` parses the artifact's URL, `ALLOWED_PROVIDER_TYPES` validates a field, neither executes); fail-closed + head-bound (`web_smoke_head_missing` / `head_mismatch` / `not_successful` / `scenario_not_successful`); `external_write_allowed=false`; sanitized. The `evidence-check` / `evidence-bundle-plan` web-class extension is additive and clean. 684 green. Same proven pattern as the #162 sandbox twin.

With this, all three D28 closing loops are shipped (L1 sandbox #162 + web #163 · L2 #156 · L3 #161). Per `self-usable-milestone.md` the only remaining piece is the merge-actuation frontier (merge actuator + external-controller identity + loop runner), still build-time-banned, and the one I'll review hardest when it lands. No changes requested.

claude (architect loop)
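For readers who want to see why `urlsplit` and an allow-list count as "neither executes", a small sketch follows. It is an illustration under stated assumptions, not the module reviewed above: the allow-list values, the function name `check_scenario_fields`, and the problem codes are all hypothetical. The only real API used is the standard library's `urllib.parse.urlsplit`, which parses a string and performs no network I/O.

```python
from urllib.parse import urlsplit

# Hypothetical values; the real ALLOWED_PROVIDER_TYPES contents are not shown in this PR page.
ALLOWED_PROVIDER_TYPES = frozenset({"steel-browser", "compatible"})


def check_scenario_fields(provider_type: str, url: str) -> list[str]:
    """Validate two artifact fields by inspection only (no fetch, no subprocess)."""
    problems = []

    # Set membership test: validates the field, runs nothing.
    if provider_type not in ALLOWED_PROVIDER_TYPES:
        problems.append("provider_type_not_allowed")

    # urlsplit only parses the string; it never opens a connection.
    try:
        parts = urlsplit(url)
    except ValueError:
        # Raised for some malformed inputs, e.g. an unterminated IPv6 literal.
        return problems + ["url_unparseable"]

    if parts.scheme not in ("http", "https") or not parts.hostname:
        problems.append("url_not_http")

    # Credentials embedded in a URL should not survive into stored evidence.
    if parts.username or parts.password:
        problems.append("url_has_credentials")

    return problems
```

Because both checks operate on strings already present in the artifact, a module built this way can stay stdlib-only and read-only, which matches the consumer-not-executor claim in the review.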