test(verify): add l4 prompt waivers #139
Reference
pdurlej/platform!139
Canary status: missing — tests/prompts PR; fire canary 3+3 before merge
Canary Context Pack
Product story
L4 verify should be runnable on main without silently hiding prompt hygiene debt. Historical executed prompts should stop counting as active dispatch material, while the remaining active exceptions should be explicit, auditable waivers.
What changed
- Added tests/l4-verify-waivers.yaml for explicit token-budget and cross-link waivers.
- Updated tests/test_l4_verify.py to load waiver metadata, validate that referenced files exist, reject duplicate or malformed entries, and skip only waived cases with a visible reason.
- Moved historical prompts to prompts/archive/<date>/ and added prompts/archive/README.md.
Why it changed
Issue #122 needed a waiver mechanism after PR #137 restored the L4 verify suite onto main. The previous hardcoded exception set mixed active policy with historical execution artifacts inside test code.
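To illustrate what moving exceptions out of test code and into validated data can look like, here is a minimal sketch. It is not the PR's _load_waivers(); the section names (token_budget, cross_link), the entry fields (path, reason, waived_by, waived_at), and the names validate_waivers and WaiverError are assumptions made for illustration. The function takes the already-parsed mapping (for example the result of yaml.safe_load) so the checks stay independent of the YAML library.

```python
from pathlib import Path

# Assumed schema: two top-level sections, each a list of waiver entries.
SECTIONS = ("token_budget", "cross_link")
REQUIRED = ("path", "reason", "waived_by", "waived_at")


class WaiverError(ValueError):
    """Raised when the waiver data is malformed."""


def validate_waivers(data, repo_root):
    """Return {section: {path: reason}} or raise WaiverError."""
    if not isinstance(data, dict):
        raise WaiverError("waiver file must be a mapping of sections")
    result = {}
    for section in SECTIONS:
        entries = data.get(section) or []
        if not isinstance(entries, list):
            raise WaiverError(f"{section}: expected a list of entries")
        seen = {}
        for entry in entries:
            if not isinstance(entry, dict):
                raise WaiverError(f"{section}: entry must be a mapping")
            # Every entry must carry all required, non-empty fields.
            missing = [f for f in REQUIRED if not entry.get(f)]
            if missing:
                raise WaiverError(f"{section}: missing fields {missing}")
            path = str(entry["path"])
            # A file may be waived at most once per section.
            if path in seen:
                raise WaiverError(f"{section}: duplicate waiver for {path}")
            # A waiver for a file that no longer exists is stale.
            if not (Path(repo_root) / path).is_file():
                raise WaiverError(f"{section}: waived file not found: {path}")
            seen[path] = entry["reason"]
        result[section] = seen
    return result
```

Failing loudly on stale or duplicate entries keeps the waiver file from silently accumulating exceptions that no longer apply.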
Files touched
- tests/test_l4_verify.py
- tests/l4-verify-waivers.yaml
- prompts/archive/README.md
- prompts/archive/2026-04-26/, prompts/archive/2026-04-28/, prompts/archive/2026-05-03/, and prompts/archive/2026-05-04/
Relevant context
- tests/test_l4_verify.py and tests/run-verify.sh on main
Runtime evidence
N/A. This is repository verification hygiene, not runtime mutation.
Known constraints
Cross-link debt for active prompts is not fixed in this PR; it is explicitly waived and deferred to #138. Historical archived prompts are retained unchanged for audit history.
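A sketch of how a date-keyed archive destination might be derived while leaving the file content untouched. The helper names, the use of the committer date (git log format %cs, available in Git 2.21 and later), and the example file name foo.md are assumptions; the PR does not state which date field or tooling it used.

```python
import subprocess
from pathlib import PurePosixPath


def last_commit_date(path, repo_root="."):
    """Return the committer date (YYYY-MM-DD) of the last commit touching path."""
    out = subprocess.run(
        ["git", "log", "-1", "--format=%cs", "--", str(path)],
        cwd=repo_root, check=True, capture_output=True, text=True,
    )
    return out.stdout.strip()


def archive_destination(prompt_path, date, archive_root="prompts/archive"):
    """Map a prompt path and a date to <archive_root>/<date>/<file name>."""
    return PurePosixPath(archive_root) / date / PurePosixPath(prompt_path).name


# Hypothetical example: prompts/foo.md last committed on 2026-05-03.
print(archive_destination("prompts/foo.md", "2026-05-03"))
```

Performing the move itself with git mv (rather than delete plus add) records it as a rename, so the file's history remains traceable for audit.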
Explicit out-of-scope
Requested decision
Review whether this is a safe Packet M implementation: explicit waivers, archived historical prompts, and L4 verify green on main.
Merge blockers
Verification
- python3 -c "import yaml; yaml.safe_load(open('tests/l4-verify-waivers.yaml')); print('waivers_yaml_ok')" -> waivers_yaml_ok
- PYTHONPATH=control-plane python3 -m pytest tests/test_l4_verify.py -q -> 312 passed, 15 skipped in 136.82s (0:02:16)
- tests/run-verify.sh -> 312 passed, 15 skipped in 151.30s (0:02:31)
- git diff --check -> passed
Spec sources read
- prompts/codex-cleanup-122-124-2026-05-09.md - Packet M instructions
- tests/test_l4_verify.py - target implementation
- tests/run-verify.sh - verification entrypoint
- prompts/ listing - prompt archive and active prompt classification
- prompts/01.5-schema-v2-adhd-counters.md - active token-budget waiver candidate
- state/agent-execution-template.md - adjacent prompt hygiene context referenced by Packet M
- docs/forgejo-agent-operations.md - Forgejo identity/API operation contract
Closes #122
Orchestrator review (claude / Pan Herbata)
Verdict: MERGE_READY
Diff matches Packet M scope from prompts/codex-cleanup-122-124-2026-05-09.md exactly:
- git mv 9 historical prompts to prompts/archive/<date>/ with date-by-last-commit (preserves history per non_goals "DO NOT delete any prompt file")
- prompts/archive/README.md clearly documents the directory's purpose
- tests/l4-verify-waivers.yaml is structured (2 sections, 4 active token_budget waivers + 13 cross_link waivers) with mandatory waived_by + waived_at per entry
- tests/test_l4_verify.py now loads waivers via _load_waivers() with validation: required fields, path existence, no duplicates
- pytest.xfail → pytest.skip with reason: better semantics (xfail implies bug; skip implies intentional waiver)
Bonus: the structured YAML mechanism is cleaner than what my prompt suggested (hardcoded set replacement). Good design call.
Identity OK (codex authored). Ready for operator merge.
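For reference, a minimal sketch of the skip-with-reason pattern the review describes. The helper waiver_reason, the {section: {path: reason}} layout, and the names WAIVERS and has_cross_links in the usage comment are hypothetical and not taken from the PR; pytest is only referenced in the comment so the block runs without it.

```python
def waiver_reason(waivers, section, path):
    """Return a visible skip reason if path is waived in section, else None."""
    reason = waivers.get(section, {}).get(path)
    if reason is None:
        return None
    return f"waived ({section}): {reason}"


# Intended use inside a pytest test (hypothetical names):
#
#     reason = waiver_reason(WAIVERS, "cross_link", prompt_path)
#     if reason:
#         pytest.skip(reason)  # reported as skipped, with the reason attached
#     assert has_cross_links(prompt_path)

example = {"cross_link": {"prompts/foo.md": "deferred to #138"}}
print(waiver_reason(example, "cross_link", "prompts/foo.md"))
```

Unwaived paths return None and fall through to the real assertion, so only explicitly listed cases are skipped.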