feat(autonomy): write decision receipts #696

Merged
pdurlej merged 1 commit from codex/687-decision-receipts into main 2026-06-02 15:52:39 +02:00

Canary status: missing — fire canary 3+3 manually before merge

Summary

Adds the final #687 autonomy slice: ADR-0025-aligned decision receipts for platformctl autonomy ask.

The receipt is emitted in command output and may be written append-only with --receipt-dir. It stores observable decision metadata only: action, tier, decision, reason, confidence, policy version, task id, repo, agent, and timestamp. It deliberately excludes prompt text, classifier context, raw policy-source bodies, and secrets.
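
A minimal sketch of what such a receipt payload could look like, assuming a flat JSON-serialisable dict. The helper name `build_receipt_sketch`, the exact key spellings, and the sample values are illustrative assumptions. Only the list of fields comes from the summary above, and the real `build_decision_receipt()` signature may differ.

```python
import json
from datetime import datetime, timezone

# Field list taken from the summary; the key spellings are assumptions.
RECEIPT_FIELDS = (
    "action", "tier", "decision", "reason", "confidence",
    "policy_version", "task_id", "repo", "agent", "timestamp",
)


def build_receipt_sketch(**observed):
    """Return a receipt holding only allow-listed, observable fields."""
    unexpected = set(observed) - set(RECEIPT_FIELDS)
    if unexpected:
        # Reject prompt text, classifier context, etc. instead of dropping them silently.
        raise ValueError(f"non-receipt fields rejected: {sorted(unexpected)}")
    receipt = {name: observed.get(name) for name in RECEIPT_FIELDS}
    if receipt["timestamp"] is None:
        # Second-precision UTC timestamp in ISO 8601 form.
        receipt["timestamp"] = (
            datetime.now(timezone.utc).replace(microsecond=0).isoformat()
        )
    return receipt


# Hypothetical sample values, for illustration only.
receipt = build_receipt_sketch(
    action="apply", tier=2, decision="escalate", reason="policy match",
    confidence=0.5, policy_version="v1", task_id="task-1",
    repo="pdurlej/platform", agent="codex",
)
print(json.dumps(receipt, indent=2, sort_keys=True))
```

An allow-list keeps the exclusion rule structural: a field such as `prompt` cannot reach the receipt unless someone adds it to `RECEIPT_FIELDS`.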

Canary Context Pack

Product story

The tiered autonomy gate should be auditable after it decides. The operator and future cousins need to reconstruct why an action was allowed, sandboxed, retried, or escalated without replaying the whole transcript or storing raw prompt/context.

What changed

  • Added build_decision_receipt() and write_decision_receipt() in platformctl.autonomy.
  • Extended platformctl autonomy ask with --task-id and --receipt-dir.
  • Included a safe receipt payload in CLI JSON output.
  • Added append-only receipt tests and CLI write tests.

Why it changed

PR #689 added the cascade router, PR #694 added apply sandbox receipts, and PR #695 added the fail-closed classifier interface. This final slice connects Tier-2 decisions to the ADR-0025 memory plane without live DB wiring.

Files touched

  • control-plane/platformctl/autonomy.py
  • control-plane/platformctl/cli.py
  • control-plane/platformctl/tests/test_autonomy_router.py

Relevant context

  • state/strategy/autonomy-tiered-execution-design-2026-06-02.md
  • state/memory/task-run-schema.md
  • decisions/0025-memory-control-plane.md
  • Issue #687

Runtime evidence

No runtime action, no model call, no live Postgres write. Local validation:

  • PYTHONPATH=control-plane python3 -m pytest control-plane/platformctl/tests/test_autonomy_router.py → 21 passed
  • PYTHONPATH=control-plane python3 -m pytest control-plane/platformctl/tests/test_apply_phase3.py control-plane/platformctl/tests/test_apply_env_file.py control-plane/platformctl/tests/test_autonomy_router.py → 140 passed
  • PYTHONPATH=control-plane python3 -m platformctl.cli validate all --json → exitCode 0

Known constraints

This writes local JSON receipts only when --receipt-dir is explicitly provided. Live ADR-0025 Postgres ingestion remains out of scope.
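
The opt-in behaviour can be sketched as follows, assuming an `argparse`-style parser. The PR does not say which CLI framework `platformctl` uses, and `parse_ask_args` and `should_write_receipt` are hypothetical helper names. Only the `--task-id` and `--receipt-dir` options come from the description.

```python
import argparse
from pathlib import Path


def parse_ask_args(argv):
    """Parse only the two options named in this PR; the rest of the real CLI is omitted."""
    parser = argparse.ArgumentParser(prog="platformctl autonomy ask")
    parser.add_argument("--task-id", default=None)
    parser.add_argument("--receipt-dir", type=Path, default=None)
    return parser.parse_args(argv)


def should_write_receipt(args):
    """A receipt file is written only when --receipt-dir was given explicitly."""
    return args.receipt_dir is not None


print(should_write_receipt(parse_ask_args([])))
print(should_write_receipt(parse_ask_args(["--receipt-dir", "receipts"])))
```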

Explicit out-of-scope

  • No live DB write.
  • No model invocation.
  • No runtime apply.
  • No issue/comment writer.
  • No secret values in receipt payloads.

Requested decision

Approve if the receipt shape is sufficient for the #687 final slice and safe for repo/state memory-plane storage.

Merge blockers

  • Receipt contains raw prompt/context or secret-like data.
  • Receipt write overwrites an existing file.
  • CLI silently allows fail-closed decisions without emitting a receipt.

Spec sources read

  • state/strategy/autonomy-tiered-execution-design-2026-06-02.md — decision receipt requirement.
  • state/memory/task-run-schema.md — ADR-0025 task_run/task_checkpoint/event naming contract.
  • decisions/0025-memory-control-plane.md — memory plane invariants: no secrets, observable state only.
  • control-plane/platformctl/autonomy.py — classifier result and router implementation.
  • control-plane/platformctl/cli.py — CLI conventions.

Closes #687

feat(autonomy): write decision receipts
All checks were successful
python-ci / Python 3.11 (pull_request) Successful in 39s
python-ci / Python 3.12 (pull_request) Successful in 41s
canary-required / collect-diff (pull_request) Successful in 4s
platformctl plan / auto-apply scope (pull_request) Successful in 17s
pyfallow / Pyfallow gate (control-plane) (pull_request) Successful in 16s
python-ci / Python 3.13 (pull_request) Successful in 39s
base-is-main / guard (pull_request) Successful in 1s
patchwarden-pr-sanity / collect-diff (pull_request) Successful in 3s
canary-required / canary (pull_request) Successful in 13s
patchwarden-client-dry-run / dry-run (pull_request) Successful in 17s
patchwarden-client-dry-run / collect-diff (pull_request) Successful in 3s
patchwarden-pr-sanity / sanity (pull_request) Successful in 4m58s
c0c109b4b1

Patchwarden PR sanity

  • Status: advisory_findings
  • PR: 696
  • Commit: bd045c30670d2044dec6d103c653439158dfb6eb
  • Security-sensitive label: present
  • Authority: advisory model review plus deterministic blockers only
  • 3+3 canary: still alive; this does not replace it

Deterministic findings

No deterministic findings.

Model reviewers

global-glm / glm-5.1:cloud

  • Status: ok

  • Verdict: OK

  • medium Receipt dict mutated after write, path field diverges from stored schema

    • Evidence: control-plane/platformctl/cli.py lines 363-365: receipt dict is modified in-place with receipt["path"] = str(receipt_path) after write_decision_receipt writes it to disk. The stored JSON file lacks the 'path' field while CLI output includes
    • Next: Consider building a separate output dict that extends the receipt with path metadata, rather than mutating the receipt dict post-write. This keeps the receipt schema consistent across storage and output.
  • low Receipt ID collision within same second for identical decisions

    • Evidence: control-plane/platformctl/autonomy.py lines 440-442: receipt_id is derived from timestamp (second-precision via _utc_now().replace(microsecond=0)) and SHA256 digest of payload. Two identical decisions in the same second produce identical re
    • Next: Document this as intentional idempotency behavior, or add a sequence counter / random component to receipt_id for sub-second uniqueness if concurrent writes are expected.
  • low Hardcoded DEFAULT_REPO may produce incorrect receipts in other contexts

    • Evidence: control-plane/platformctl/autonomy.py line 177: DEFAULT_REPO = "pdurlej/platform" is hardcoded. In cli.py line 358, build_decision_receipt is called without passing repo parameter, so it always defaults to pdurlej/platform regardless of act
    • Next: Add a --repo CLI option or derive repo from git remote, similar to how --task-id was added for ADR-0025 traceability.
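
The first two GLM findings above can be illustrated with a short sketch. The ID format (timestamp string plus a truncated digest) and the helper names `receipt_id_sketch` and `cli_output_sketch` are assumptions for illustration. The finding only states that the ID is derived from a second-precision timestamp and a SHA256 digest of the payload.

```python
import hashlib
import json
from datetime import datetime, timezone


def receipt_id_sketch(payload, now):
    """Second-precision timestamp plus a SHA256 digest of the payload."""
    stamp = now.replace(microsecond=0).strftime("%Y%m%dT%H%M%SZ")
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(body).hexdigest()
    return f"{stamp}-{digest[:12]}"


def cli_output_sketch(receipt, receipt_path):
    """Return a new dict for CLI output; the stored receipt stays untouched."""
    return {**receipt, "path": str(receipt_path)}


# Hypothetical payload; two decisions made within the same second.
payload = {"action": "apply", "decision": "allow"}
t1 = datetime(2026, 6, 2, 13, 0, 0, 100000, tzinfo=timezone.utc)
t2 = datetime(2026, 6, 2, 13, 0, 0, 900000, tzinfo=timezone.utc)

same_id = receipt_id_sketch(payload, t1) == receipt_id_sketch(payload, t2)
output = cli_output_sketch(payload, "receipts/example.json")
print(same_id, "path" in payload, "path" in output)
```

Under these assumptions, identical decisions within one second share an ID, which is either useful idempotency or a collision depending on intent. Building the output as a copy keeps the on-disk schema and the in-memory receipt identical.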

global-deepseek / deepseek-v4-pro:cloud

  • Status: ok
  • Verdict: OK
  • Findings: none

redteam / kimi-k2.6:cloud

  • Status: ok

  • Verdict: NOT_OK

  • blocker TOCTOU race allows receipt overwrite

    • Evidence: control-plane/platformctl/autonomy.py in write_decision_receipt: destination = receipt_dir / f"{receipt_id}.json" followed by if destination.exists(): raise FileExistsError(destination) and then destination.write_text(...). Two conc
    • Next: Replace the check-then-write pattern with Python's atomic exclusive creation: open(destination, 'x', encoding='utf-8') to guarantee append-only semantics under concurrency.
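
A minimal sketch of the suggested fix, assuming the receipt is a JSON-serialisable dict. `write_receipt_exclusive` is a hypothetical stand-in, not the actual `write_decision_receipt()` from this PR.

```python
import json
import tempfile
from pathlib import Path


def write_receipt_exclusive(receipt_dir, receipt_id, receipt):
    """Create <receipt_id>.json; raise FileExistsError if it already exists."""
    receipt_dir = Path(receipt_dir)
    receipt_dir.mkdir(parents=True, exist_ok=True)
    destination = receipt_dir / f"{receipt_id}.json"
    # Mode "x" is exclusive creation: the existence check and the create
    # happen in one step, so there is no check-then-write window.
    with open(destination, "x", encoding="utf-8") as handle:
        json.dump(receipt, handle, sort_keys=True)
        handle.write("\n")
    return destination


with tempfile.TemporaryDirectory() as tmp:
    first = write_receipt_exclusive(tmp, "example", {"decision": "allow"})
    try:
        write_receipt_exclusive(tmp, "example", {"decision": "deny"})
        overwritten = True
    except FileExistsError:
        overwritten = False
    stored = json.loads(first.read_text(encoding="utf-8"))

print(overwritten, stored)
```

The second write fails and the first receipt is left intact. One caveat of this sketch: if the process dies mid-write, a partial file remains and blocks later writes under the same ID.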

Policy notes

  • GLM 5.1 + DeepSeek V4 Pro are the operator-required model mix for this bot.
  • Optional red-team model is enabled only when PLATFORMCTL_PR_SANITY_REDTEAM_MODEL is configured.
  • Auto-merge is not enabled here.
pdurlej approved these changes 2026-06-02 15:41:32 +02:00
pdurlej left a comment

Codex autonomous approval: PR #696 green in Forgejo Actions/Patchwarden and matches the operator-approved #687 final receipt slice. Uses temporary admin PAT only for narrow approve/merge; no issue/runtime mutation.

codex force-pushed codex/687-decision-receipts from c0c109b4b1
to bd045c3067
All checks were successful
base-is-main / guard (pull_request) Successful in 1s
patchwarden-pr-sanity / collect-diff (pull_request) Successful in 4s
platformctl plan / auto-apply scope (pull_request) Successful in 19s
pyfallow / Pyfallow gate (control-plane) (pull_request) Successful in 17s
python-ci / Python 3.11 (pull_request) Successful in 41s
python-ci / Python 3.12 (pull_request) Successful in 43s
python-ci / Python 3.13 (pull_request) Successful in 45s
canary-required / collect-diff (pull_request) Successful in 4s
patchwarden-client-dry-run / collect-diff (pull_request) Successful in 4s
patchwarden-client-dry-run / dry-run (pull_request) Successful in 19s
canary-required / canary (pull_request) Successful in 14s
patchwarden-pr-sanity / sanity (pull_request) Successful in 2m52s
2026-06-02 15:45:17 +02:00
pdurlej referenced this pull request from a commit 2026-06-02 15:52:41 +02:00
Reference
pdurlej/platform!696