feat(identity-wiring): Order B — glm comment hook + charter §3/§4 + identity doctor stub #8

Merged
pdurlej merged 11 commits from codex/identity-wiring/order-b-glm-comment-hook into main 2026-05-01 20:09:30 +02:00
Owner

Summary

First wiring step toward multi-actor attribution canary (Oracle's neutral label for "flame war"). Implements Oracle's Order B from session infisical-update-flame-war-wiring-2026-05-01: smallest blast radius, no Claude restart, most-visible canary effect (review comments visibly authored by glm).

Closes none — first PR in identity-wiring sequence (B → A → C per Oracle).

What's in this PR

Charter rules (Oracle's exact wording, validated):

  • §3 sub-rule: runtime state lives under ~/.platformctl-runtime/, never committed
  • §4 (new): tool-originated Forgejo writes use named non-admin service identities, operator/admin = break-glass only

Infisical Universal Auth + token cache (secrets/infisical.py):

  • Trust expiresIn from auth response (NOT hardcoded TTL — Oracle's caveat)
  • 60s safety margin
  • Atomic file writes (mode 0600 file, 0700 parent)
  • force_refresh override

Forgejo PR comment hook (run_review.py extended):

  • --post-forgejo-comment (opt-in)
  • --actor (default glm; choice limited to known service identities)
  • --dry-run-comment default ON (charter: dry-run until canary)
  • Idempotency marker: <!-- platform-review:{actor}:{repo}:PR-{id}:{sha[:8]} --> (Oracle's exact format)
  • Pre-write actor verification: GET /api/v1/user, reject if token owner ≠ actor
  • UPDATE-or-CREATE via marker substring match (no spam)

platformctl identity doctor stub (tools/identity_doctor.py):

  • 5 checks: forgejo, git, infisical, review, mcp
  • Catches "looks wired, but still acting as operator" — Oracle's specific failure mode
  • MCP heuristic flags literal PAT in ~/.claude.json as fail (Oracle: avoid "MCP config secret leakage")
  • Skip-able for partial flows during rollout

Runtime layout README (state/runtime-layout.md): canonical tree per Oracle spec with permissions documented.

Tests: 101/101 green

pytest platformctl/tests/ → 101 passed in 4.18s

Distribution:

  • 14 consolidator (existing)
  • 4 negative-control (existing)
  • 5 smoke (existing)
  • 37 run_review (existing)
  • 15 infisical-cache (NEW — safety margin, file modes, atomic writes, malformed fallback, force_refresh)
  • 13 glm-hook (NEW — marker format actor+sha distinguishing, body composition, CLI defaults, dry-run-no-HTTP, actor mismatch abort)
  • 13 identity-doctor (NEW — render, individual checks incl. literal-token detection)

Out of scope (deferred per Oracle Order B → A → C sequencing)

  • Order A — Forgejo MCP wrapper (replace operator's admin PAT in ~/.claude.json with wrapper that resolves claude PAT at process start; requires Claude restart)
  • Order C — Codex git config (per-repo user.name/user.email for codex commits + askpass-codex.sh)
  • Per-tool Infisical identities (Oracle: "later, not now" — first stabilize one identity)
  • Live HTTP integration test = first real canary run with --post-forgejo-comment --no-dry-run-comment on a small Phase 02 PR

Test plan

  • Unit tests 101/101 green
  • python -m platformctl.tools.identity_doctor --help shows args
  • Manual: dry-run canary on this very PR with --post-forgejo-comment --actor glm (no-real-post)
  • Manual after merge: real canary on small target PR with --no-dry-run-comment, observe glm-authored comment

Linkage

  • Built on PR #7 (charter §3 Learning patterns)
  • First exercise of the new charter §3+§4 service-identity rules
  • Open_loops filed (separate commit on state/AUDIT_LOG.jsonl):
    • infisical-identity-rename (rename claudeplatform-orchestrator)
    • oracle-call-metadata-out-of-band (Oracle's discipline feedback)

🤖 Generated with Claude Code

## Summary First wiring step toward **multi-actor attribution canary** (Oracle's neutral label for "flame war"). Implements Oracle's **Order B** from session `infisical-update-flame-war-wiring-2026-05-01`: smallest blast radius, no Claude restart, most-visible canary effect (review comments visibly authored by `glm`). Closes none — first PR in identity-wiring sequence (B → A → C per Oracle). ## What's in this PR **Charter rules** (Oracle's exact wording, validated): - §3 sub-rule: runtime state lives under `~/.platformctl-runtime/`, never committed - §4 (new): tool-originated Forgejo writes use named non-admin service identities, operator/admin = break-glass only **Infisical Universal Auth + token cache** (`secrets/infisical.py`): - Trust `expiresIn` from auth response (NOT hardcoded TTL — Oracle's caveat) - 60s safety margin - Atomic file writes (mode 0600 file, 0700 parent) - `force_refresh` override **Forgejo PR comment hook** (`run_review.py` extended): - `--post-forgejo-comment` (opt-in) - `--actor` (default `glm`; choice limited to known service identities) - `--dry-run-comment` default **ON** (charter: dry-run until canary) - Idempotency marker: `<!-- platform-review:{actor}:{repo}:PR-{id}:{sha[:8]} -->` (Oracle's exact format) - Pre-write actor verification: GET /api/v1/user, reject if token owner ≠ actor - UPDATE-or-CREATE via marker substring match (no spam) **`platformctl identity doctor` stub** (`tools/identity_doctor.py`): - 5 checks: forgejo, git, infisical, review, mcp - Catches "looks wired, but still acting as operator" — Oracle's specific failure mode - MCP heuristic flags **literal PAT in `~/.claude.json`** as fail (Oracle: avoid "MCP config secret leakage") - Skip-able for partial flows during rollout **Runtime layout README** (`state/runtime-layout.md`): canonical tree per Oracle spec with permissions documented. ## Tests: 101/101 green ``` pytest platformctl/tests/ → 101 passed in 4.18s ``` Distribution: - 14 consolidator (existing) - 4 negative-control (existing) - 5 smoke (existing) - 37 run_review (existing) - 15 infisical-cache (NEW — safety margin, file modes, atomic writes, malformed fallback, force_refresh) - 13 glm-hook (NEW — marker format actor+sha distinguishing, body composition, CLI defaults, dry-run-no-HTTP, actor mismatch abort) - 13 identity-doctor (NEW — render, individual checks incl. literal-token detection) ## Out of scope (deferred per Oracle Order B → A → C sequencing) - **Order A — Forgejo MCP wrapper** (replace operator's admin PAT in `~/.claude.json` with wrapper that resolves `claude` PAT at process start; requires Claude restart) - **Order C — Codex git config** (per-repo `user.name`/`user.email` for codex commits + askpass-codex.sh) - **Per-tool Infisical identities** (Oracle: "later, not now" — first stabilize one identity) - **Live HTTP integration test** = first real canary run with `--post-forgejo-comment --no-dry-run-comment` on a small Phase 02 PR ## Test plan - [x] Unit tests 101/101 green - [x] `python -m platformctl.tools.identity_doctor --help` shows args - [ ] Manual: dry-run canary on this very PR with `--post-forgejo-comment --actor glm` (no-real-post) - [ ] Manual after merge: real canary on small target PR with `--no-dry-run-comment`, observe glm-authored comment ## Linkage - Built on PR #7 (charter §3 Learning patterns) - First exercise of the new charter §3+§4 service-identity rules - Open_loops filed (separate commit on `state/AUDIT_LOG.jsonl`): - `infisical-identity-rename` (rename `claude` → `platform-orchestrator`) - `oracle-call-metadata-out-of-band` (Oracle's discipline feedback) 🤖 Generated with [Claude Code](https://claude.com/claude-code)
First wiring step toward multi-actor attribution canary (Oracle's neutral
label for what operator called "flame war"). Implements Order B from Oracle
session `infisical-update-flame-war-wiring-2026-05-01` — smallest blast
radius, no Claude restart, most-visible canary effect (review comments
visibly authored by `glm` instead of operator/admin).

## Charter §3 sub-rules + new §4 (PLATFORM_CHARTER.md)

Per Oracle's exact wording:

§3 (sub-rule, Runtime state location):
> Runtime credentials, token caches, generated env wrappers, and mutable
> machine-local state live under `~/.platformctl-runtime/` and must never
> be committed. Repository code may define schemas and templates, but not
> live secrets, bearer tokens, or host-specific credential material.

§4 (new, Service identities):
> All tool-originated Forgejo writes must use named non-admin service
> identities with the minimum token scope required for the action.
> Operator/admin credentials are break-glass only and must not be embedded
> in agent, MCP, reviewer, or git automation configs.

Plus Oracle's "identity channels are separate concerns" distinction:
push auth actor / PR-comment actor / commit author / committer all need
matching configuration; PAT alone fixes only auth actor.

## New: Infisical Universal Auth + token cache

`platformctl/secrets/infisical.py` (155 lines):
- `CachedToken.is_valid()` with 60s safety margin
- `load_cached_token` / `save_cached_token` — atomic via tmp+rename, mode 0600
- `_login_universal_auth` — trusts `expiresIn` from auth response (NOT
  hardcoded 30d per Oracle's caveat)
- `get_access_token` — cache-then-login pattern; force_refresh override
- `read_secret` — read single secret value via /api/v3/secrets/raw

15 tests in `test_infisical_cache.py` cover safety margin, file modes,
atomic writes, malformed-cache fallback, force_refresh, cache-hit-no-network.

## New: Forgejo PR comment hook in run_review.py

CLI flags:
- `--post-forgejo-comment` (opt-in, default OFF)
- `--actor {claude,codex,glm,platform-orchestrator}` (default `glm`)
- `--dry-run-comment` (default ON; charter rule "dry-run default until canary")
- `--no-dry-run-comment` (disable dry-run for real posting)
- `--forgejo-host`, `--forgejo-repo`

Logic:
- `_idempotency_marker` — Oracle's exact format
  `<!-- platform-review:{actor}:{repo}:PR-{pr_id}:{head_sha[:8]} -->`
- `_verify_actor` — pre-write GET /api/v1/user; reject if token owner != actor
- `_find_existing_comment` — substring match on marker; UPDATE not append
- `maybe_post_forgejo_comment` — dry-run-by-default; PATCH or POST chosen
  by marker presence

13 tests in `test_glm_comment_hook.py` cover marker format (actor + sha
distinguish), body composition, CLI defaults, no-token guard, dry-run
no-HTTP, actor verification mismatch.

## New: platformctl identity doctor stub

`platformctl/tools/identity_doctor.py` (190 lines):
Catches "looks wired, but still acting as operator" failure mode.
Five checks (skip-able): forgejo, git, infisical, review, mcp.
- forgejo: GET /api/v1/user with PAT, verify login matches actor
- git: `git config user.name`/`user.email` match actor
- infisical: token cache present + not expired (no value print)
- review: `glm-comment-state.json` parseable (or absent = first run)
- mcp: heuristic — `~/.claude.json` Forgejo MCP env literal vs wrapper-resolved

CLI: `python -m platformctl.tools.identity_doctor <actor> [--skip ...]`.
Exit codes: 0=pass, 1=fail, 2=usage error.

13 tests in `test_identity_doctor.py` cover render, individual checks
including the literal-token-detection (Oracle: "MCP config secret leakage").

## New: state/runtime-layout.md

Canonical structure documentation for `~/.platformctl-runtime/` per
Oracle's tree spec. Documents permissions (0700/0600), token cache
contract (`expiresIn`-driven), credential-rotation guidance.

## Tests: 101/101 green

```
pytest platformctl/tests/ → 101 passed in 4.18s
```

Distribution: 14 consolidator + 4 negative-control + 5 smoke + 37 run_review
+ 15 infisical-cache + 13 glm-hook + 13 identity-doctor.

## Out of scope (deferred to follow-up PRs)

- **Order A — Forgejo MCP wrapper**: `~/.claude.json` swap to wrapper
  script that resolves `claude` PAT at process start. Requires Claude restart.
- **Order C — Codex git config**: per-repo `user.name`/`user.email` for
  codex commits + askpass-codex.sh. Requires Codex CLI integration work.
- **Per-tool Infisical identities**: Oracle says "later, not now" — first
  stabilize one identity, then split.
- **Live HTTP integration tests**: this PR's hook tests are unit-level with
  monkeypatched httpx. Real test = first canary run with `--post-forgejo-comment
  --no-dry-run-comment` on a small Phase 02 PR.

## Open loops filed (state/AUDIT_LOG.jsonl, separate commit)

- `infisical-identity-rename` — `claude` → `platform-orchestrator` since
  current identity reads multiple PATs (Oracle Q1 caveat #1)
- `oracle-call-metadata-out-of-band` — Oracle's discipline feedback:
  include model/provider/thinking-budget metadata so Oracle can self-validate

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per canary #4 PR #8 (2026-05-01): tech-gpt HIGH + product-gpt HIGH convergent —
"Token TTL rule violated in implementation". Charter §3 token-cache rule says
"trust expiresIn from auth response, NOT hardcoded TTL", but previous code had
`int(body.get("expiresIn") or body.get("tokenDuration") or 86400)` which
silently defaulted to 24h when the response was malformed.

Fix: fail closed — raise InfisicalAuthError if expiresIn is missing, None,
non-numeric, zero, or negative. No tokenDuration fallback (Infisical's
documented field is expiresIn).

Tests: 3 new (missing / zero / negative) all raise. Suite 18/18 in
test_infisical_cache.py.

Reviewer references:
- tech-gpt: "expiresIn is not actually required" (high)
- product-gpt: "Token TTL rule is violated in the implementation" (high)
Two related token-cache write fixes from canary #4 PR #8 (2026-05-01):

1. **chmod race** (tech-gpt high): previous code did `write_text` (default
   umask, often 0644) then `chmod 0600` — bearer token bytes on disk with
   loose perms for the window. Replaced with `os.open(O_CREAT|O_EXCL|O_WRONLY,
   0o600)` so the kernel sets perms before any byte is written. O_EXCL also
   prevents symlink-following attacks.

2. **PID-collision race** (tech-glm high): tmp filename used only `os.getpid()`,
   which can collide across processes (PID wraparound, container PID
   re-assignment). Added `uuid.uuid4().hex` suffix.

3 new tests: file mode 0600 immediately, UUID in tmp filename.
Suite 21/21 in test_infisical_cache.py.

Reviewer references:
- tech-gpt: "Token cache writes expose bearer token before chmod" (high)
- tech-glm: "Race condition in token cache save" (high)
Follow-up to previous commit. The first version of
test_save_tmp_filename_includes_uuid_not_just_pid used a fragile monkeypatch
on a non-package attribute (`platformctl.secrets.infisical.uuid`). Replaced
with a clean `os.replace` spy that asserts the actual tmp filename matches
the documented pattern `.tmp-<pid>-<32 hex chars>`.

Suite 21/21 green.
Two related comment-hook fixes from canary #4 PR #8 (2026-05-01):

1. **Pagination + fail-closed lookup** (tech-gpt high): previous
   `_find_existing_comment` returned None on non-200 (treated as "no
   existing comment" → CREATE NEW). On a flaky API call this duplicates
   comments. Plus no pagination — older comment beyond page 1 was also
   missed.

   Fix: introduce `CommentLookupError`, paginate (page/limit, max_pages
   safety), raise on any non-200 or non-list response. Caller MUST treat
   exception as "lookup not complete" and skip create-new-comment path.

2. **Hard-fail on --no-dry-run-comment errors** (tech-gpt high): main()
   wrapped the comment hook in catch-all → warning, but still exited 0/1
   based on review packet. CI/canary couldn't tell that the real-mode
   post failed.

   Fix: dry-run errors → warning (continue), real-mode errors → exit 4
   (new dedicated code, distinct from approve=0 / defer=1 / wait=2 /
   usage=3). Review packet still saved to disk regardless.

Suite 102/102 green.

Reviewer references:
- tech-gpt: "Comment idempotency fails open on list errors and pagination" (high)
- tech-gpt: "Explicit real comment posting can silently fail while review exits OK" (high)
Per canary #4 PR #8 (tech-gpt high "platformctl identity doctor is not
registered in the CLI"): the implementation existed at
`platformctl.tools.identity_doctor:run_doctor` but was reachable only as
`python -m platformctl.tools.identity_doctor`, not via the documented
`platformctl identity doctor <actor>` invocation that charter §3
service-identity rule references.

Wires `identity` Click group + `doctor` subcommand to existing
`run_doctor()` and `render_report()`. Honors all argparse-mode flags:
--forgejo-token, --forgejo-host, --repo-root, --expected-email,
--skip (multiple). Exits 1 on report.passed=False so CI can gate on it.

Smoke: `platformctl identity doctor --help` shows full help.
Suite 106/106 green (no test changes — Click registration is structural).

Reviewer reference:
- tech-gpt: "platformctl identity doctor is not registered in the CLI" (high)
Per canary #4 PR #8 (2026-05-01) two convergent BLOCKER fixes:

1. **product-gpt blocker "Do not ship a safety check that overclaims":**
   charter previously claimed identity_doctor "verifies all four channels."
   Stub only checks git config + heuristic MCP env-var-vs-literal. Downgraded
   wording to "partial stub (not approval-grade)", enumerated current vs
   missing checks, and stated explicitly that doctor-pass ≠ rollout-safe.
   Files open_loop `identity-doctor-full-coverage` for: commit author,
   committer-vs-auth-actor, MCP wrapper resolved-actor, review-comment
   end-to-end smoke.

2. **tech-glm blocker "Charter violation: state file not tracked":**
   charter listed `review/glm-comment-state.json` in runtime layout but
   `run_review.py` hook never reads/writes it (achieves idempotency via
   Forgejo-API marker substring lookup, no local state file needed).
   Removed the false reference rather than adding unused state-file plumbing.
   `state/runtime-layout.md` updated correspondingly.

Net effect: charter and runtime-layout doc now match shipped behavior;
no new code (only docs).

Reviewer references:
- product-gpt: "Do not ship a safety check that overclaims" (blocker)
- tech-glm: "Charter violation: state file not tracked" (blocker)
Per canary #4 amended PR #8 (2026-05-01) product-claude blocker
"Doctor prints 'PASS' without partial-stub warning — operator will get
false confidence":

Previous render_report printed bare "=> PASS" when all (skipped + present)
checks passed. Operator could read this as "rollout safe" — but charter §3
explicitly downgraded the doctor to "partial stub, not approval-grade"
(per earlier canary #4 fix). Output now matches the charter wording.

After: "=> PASS (partial stub; not approval-grade — see charter §3)"

FAIL output unchanged (failure is already unambiguous; no caveat needed).

Tests: +2 (caveat present on PASS, absent on FAIL). Suite 108/108 green.

Reviewer reference:
- product-claude: "Doctor prints 'PASS' without partial-stub warning" (blocker)
- product-gpt:    "Doctor label overstates rollout safety" (high — same root)
Per canary #4 amended PR #8 (tech-gpt blocker "Real Forgejo comment posting
still fails open" + product-gpt high "Real comment failure can still look
successful"):

Previous fix added hard-fail (exit 4) in main() when comment hook raises an
exception. But guard branches inside `maybe_post_forgejo_comment` (no-token,
actor verification fail) used early-return without raising — so main()'s
try/except saw nothing and exited 0/1 based on review packet alone.
The "real comment posting failed" status was lost.

Fix: in real-post mode (dry_run_comment=False):
- no token → raise RuntimeError (was: log + return)
- actor verification fail → raise RuntimeError (was: log + return)

Dry-run mode unchanged — failures still log + return so review can proceed
to packet generation without side effect.

Tests: 2 new (real-mode raises on no-token, on actor mismatch) + 1 existing
test updated (was checking actor-mismatch behavior; now explicitly
dry-run-mode). Suite 110/110 green.

Reviewer references:
- tech-gpt:    "Real Forgejo comment posting still fails open" (blocker)
- product-gpt: "Real comment failure can still look successful" (high — same root)
Per canary #4 amended PR #8 (tech-gpt high "Token cache reads ignore unsafe
permissions"): save_cached_token writes 0600, but if the file is later
tampered with (chmod 0644 by attacker, umask weirdness, manual edit) the
load path blindly trusts whatever is on disk.

Fix: load_cached_token checks mode & 0o777 before parsing. If not 0600,
emit warning and return None — caller then re-authenticates (writes a fresh
cache with strict perms via O_CREAT|O_EXCL+0o600 from the earlier fix).

Tests: 2 new (loose perms ignored, strict 0600 accepted). Suite 22/22 in
test_infisical_cache.py.

Reviewer reference:
- tech-gpt: "Token cache reads ignore unsafe permissions" (high)
Follow-up to commit bf8dd51 (load_cached_token rejects 0644 caches).
test_infisical_check_expired_cache_fails wrote the cache via Path.write_text
which uses default umask (typically 0644) — after the mode-check fix, that
file is now treated as "no cache" (rejected with WARNING) instead of
"expired cache". Test was checking the latter behavior.

Fix: use save_cached_token() which writes 0600 atomically, so the cache
file is loadable + parseable + then evaluated as expired.

Suite 112/112 green.
First-time contributor

Decision packet — PR 8

Risk: BLOCKER
Default action: defer

Tech votes

  • tech-gpt → NOT_OK
  • tech-glm → OK
  • tech-claude → NOT_OK

Product votes

  • product-glm → OK
  • product-gpt → NOT_OK
  • product-claude → OK

Weak-signal risks (1 reviewer)

  • blocker Real Forgejo comment write failures still exit successfully (raised by tech-gpt)
  • blocker Real comment failure can still look successful (raised by product-gpt)
  • blocker Silent failure on Forgejo PATCH/POST non-2xx in real-post mode (recreates the exact bug PR claims to fix) (raised by tech-claude)
  • high Doctor preserves removed local-state concept (raised by product-gpt)
  • high Silent failure on non-200/201 comment POST/PATCH response (raised by tech-glm)
  • high Pagination in _find_existing_comment is order-non-deterministic; concurrent activity causes marker miss → duplicate (raised by tech-claude)
  • high Identity-doctor 'review' check is dead code that emits a misleading green (raised by tech-claude)
  • medium Identity doctor still checks removed review state path (raised by tech-gpt)
  • medium Doctor can pass vacuously when every check is skipped (raised by tech-gpt)
  • medium Infisical cache save can leak temp files on os.replace failure (raised by tech-gpt)
  • medium Charter absorbs implementation caveats from an unfinished rollout (raised by product-gpt)
  • medium Dry-run is not a low-friction rehearsal (raised by product-gpt)
  • medium TOCTOU race in comment idempotency check (raised by tech-glm)
  • medium Orphaned temp files if os.replace fails (raised by tech-glm)
  • medium Parent directory permission hardening is best-effort only (raised by tech-glm)
  • medium Operator surface expansion is large for one PR — many new things to remember (raised by product-claude)
  • medium identity doctor PASS still risks misleading a tired operator despite the caveat (raised by product-claude)
  • medium Charter rule and its implementation land in the same PR (raised by product-claude)
  • medium expiresIn: true would be accepted as 1-second TTL (bool is int) (raised by tech-claude)
  • medium MCP literal-token heuristic accepts $VAR (unbraced) and any $-prefixed string (raised by tech-claude)
  • low Pagination logic assumes Forgejo API behavior (raised by tech-glm)
  • low glm is opaque naming for a PM operator (raised by product-claude)
  • low No operator runbook for the new failure modes (exit 4, expired token cache, doctor FAIL) (raised by product-claude)
  • low OverflowError from int(float('inf')) on expiresIn not caught (raised by tech-claude)
  • low --dry-run-comment argparse flag is a no-op; only --no-dry-run-comment mutates state (raised by tech-claude)
  • low Marker docstring vs implementation inconsistency invites later drift (raised by tech-claude)
  • low _ensure_runtime_dir best-effort chmod walks parents without checking symlinks (raised by tech-claude)

Opportunities

  • Surface dry-run status in CI logs — The review hook defaults to dry-run, which is safe, but CI logs might be noisy. Consider adding a concise '[DRY-RUN]' prefix to the decision packet render_markdown output when --dry-run-comment is active, so operators don't mistakenly think a live post occurred. (product-glm)
  • Extend identity doctor to cover commit author/committer — Per charter §3 open loop, the doctor currently verifies forgejo token and git config but not the actual commit authorship (which can differ via --author). Adding this check sooner would prevent 'right config, wrong commit' rollouts. (product-glm)
  • Pin real-post behavior with end-to-end tests — Current tests cover missing token and actor mismatch, but not lookup success followed by PATCH/POST failure. Add tests around main() exit code 4 for real write HTTP failures. (tech-gpt)
  • Add one operator-facing canary checklist — The PR spans charter, token cache, Forgejo posting, and identity doctor. A short checklist with expected command, expected exit code, expected visible actor, and rollback action would reduce approval burden for a non-technical risk owner. (product-gpt)
  • Name the feature neutrally everywhere — The PR title uses 'FLAME-WAR CANARY' while code/docs use service identity and multi-actor attribution. Converging on the neutral name would make future maintenance and search less emotionally loaded. (product-gpt)
  • MCP heuristic false positives/negatives — control-plane/platformctl/tools/identity_doctor.py:157 - The literal token check (token.startswith('$')) has false negatives for (cmd) syntax and false positives for env var names without . Consider a more robust heuristic or explicit config flag. (tech-glm)
  • Duplicate actor validation lists — The valid actors list is duplicated in three files (cli.py:368, run_review.py:352, identity_doctor.py:23). Could centralize in a shared constants module to prevent drift. (tech-glm)
  • Review state file check references removed charter component — control-plane/platformctl/tools/identity_doctor.py:133 - check_review_idempotency_state checks for review/glm-comment-state.json, which charter/runtime-layout.md says was removed. The check correctly handles absence (passes), but creates cognitive load. (tech-glm)
  • Add operator cheat-sheet to runtime-layout.md — state/runtime-layout.md documents the directory tree but not 'how do I, the operator, use this'. One paragraph with the canary one-liner + the 'doctor before canary' invocation would convert this from spec to usable reference. While the diff is open this is a 5-line addition; later it becomes a separate PR. (product-claude)
  • Surface the dry-run-default behavior in run_review --help epilog — The 'dry-run default ON, opt-in posting' is the single most operator-protective decision in this PR. It's only visible in the test names and PR description. An epilog/footer on the argparse parser ('SAFETY: --post-forgejo-comment will not actually post unless --no-dry-run-comment is also set') would make this self-documenting at the point of use. (product-claude)
  • Consider whether four actor names is one too many — PR introduces four choices (claude/codex/glm/platform-orchestrator), but only glm is actually wired this PR. codex is deferred to Order C, platform-orchestrator is described as a future rename. Allowing all four in --actor choices today means an operator can pick a not-yet-wired actor and get a confusing failure. Consider restricting --actor choices to currently-wired actors and expanding as Orders A/C land. (product-claude)
  • Add HTTP-failure regression tests for real-post mode — test_glm_comment_hook covers no-token and verify-fail real-mode raises but not the PATCH/POST 4xx/5xx path. Mocking httpx.post → 403/500 and asserting RuntimeError would have caught the silent-failure blocker above. (tech-claude)
  • Hoist httpx imports to module level — httpx is imported inside _verify_actor, _find_existing_comment, maybe_post_forgejo_comment, _login_universal_auth, read_secret, and check_forgejo_actor. Module-level import (with one ImportError fallback) reduces per-call cost and makes the dependency explicit. (tech-claude)
  • Replace marker substring match with PR-scoped GET by-id where possible — Forgejo supports filtering comments; storing the comment id on first create (e.g., in an audit-log line keyed on actor+sha) would make idempotency O(1) and immune to pagination races without needing a runtime state file. (tech-claude)

Dissents

  • tech-gpt voted NOT_OK (confidence 0.90): {
    "verdict": "NOT_OK",
    "confidence": 0.9,
    "risks": [
    {
    "title": "Real Forgejo comment write failures still exit successfully",
    "severity": "blocker",
    "evidence": "control-plane/platformctl/tools/run_review.py:686: non-200 PATCH/POST only logs [comment] {action} failed: HTTP ... and does not raise, so main can still return approve_merge/defer status after --no-dry-run-comment failed.",
    "recommendation": "In real-post mode, raise RuntimeError on any non-200/20
  • product-gpt voted NOT_OK (confidence 0.86): {
    "verdict": "NOT_OK",
    "confidence": 0.86,
    "risks": [
    {
    "title": "Real comment failure can still look successful",
    "severity": "blocker",
    "evidence": "control-plane/platformctl/tools/run_review.py:650 finding: missing httpx returns instead of raising in real-post mode; lines near final POST/PATCH only log non-200 responses and do not raise",
    "recommendation": "In --no-dry-run-comment mode, raise on missing httpx and any non-200/non-201 write response so the run
  • tech-claude voted NOT_OK (confidence 0.85): {
    "verdict": "NOT_OK",
    "confidence": 0.85,
    "risks": [
    {
    "title": "Silent failure on Forgejo PATCH/POST non-2xx in real-post mode (recreates the exact bug PR claims to fix)",
    "severity": "blocker",
    "evidence": "control-plane/platformctl/tools/run_review.py — end of maybe_post_forgejo_comment(): if resp.status_code in (200, 201): ...write OK\\nelse: ...write 'failed: HTTP {code}'. The else branch only logs; no raise. The function returns normally, so main()'s try/exce

Operator decisions (yes/no)

  1. Risk 'Real Forgejo comment write failures still exit successfully' raised by 1/6: do you accept this risk, or should this PR be deferred until it's mitigated?
  2. Risk 'Real comment failure can still look successful' raised by 1/6: do you accept this risk, or should this PR be deferred until it's mitigated?
  3. Risk 'Silent failure on Forgejo PATCH/POST non-2xx in real-post mode (recreates the exact bug PR claims to fix)' raised by 1/6: do you accept this risk, or should this PR be deferred until it's mitigated?

Evidence

  • Total review wallclock: ~435s
  • Tech consensus: dissent
  • Product consensus: dissent
<!-- platform-review:glm:pdurlej/platform:PR-8:c98936e0 --> # Decision packet — PR 8 **Risk:** `BLOCKER` **Default action:** `defer` ## Tech votes - ❌ `tech-gpt` → NOT_OK - ✅ `tech-glm` → OK - ❌ `tech-claude` → NOT_OK ## Product votes - ✅ `product-glm` → OK - ❌ `product-gpt` → NOT_OK - ✅ `product-claude` → OK ## Weak-signal risks (1 reviewer) - `blocker` **Real Forgejo comment write failures still exit successfully** (raised by tech-gpt) - `blocker` **Real comment failure can still look successful** (raised by product-gpt) - `blocker` **Silent failure on Forgejo PATCH/POST non-2xx in real-post mode (recreates the exact bug PR claims to fix)** (raised by tech-claude) - `high` **Doctor preserves removed local-state concept** (raised by product-gpt) - `high` **Silent failure on non-200/201 comment POST/PATCH response** (raised by tech-glm) - `high` **Pagination in _find_existing_comment is order-non-deterministic; concurrent activity causes marker miss → duplicate** (raised by tech-claude) - `high` **Identity-doctor 'review' check is dead code that emits a misleading green** (raised by tech-claude) - `medium` **Identity doctor still checks removed review state path** (raised by tech-gpt) - `medium` **Doctor can pass vacuously when every check is skipped** (raised by tech-gpt) - `medium` **Infisical cache save can leak temp files on os.replace failure** (raised by tech-gpt) - `medium` **Charter absorbs implementation caveats from an unfinished rollout** (raised by product-gpt) - `medium` **Dry-run is not a low-friction rehearsal** (raised by product-gpt) - `medium` **TOCTOU race in comment idempotency check** (raised by tech-glm) - `medium` **Orphaned temp files if os.replace fails** (raised by tech-glm) - `medium` **Parent directory permission hardening is best-effort only** (raised by tech-glm) - `medium` **Operator surface expansion is large for one PR — many new things to remember** (raised by product-claude) - `medium` **`identity doctor` PASS still risks misleading a tired operator despite the caveat** (raised by product-claude) - `medium` **Charter rule and its implementation land in the same PR** (raised by product-claude) - `medium` **`expiresIn: true` would be accepted as 1-second TTL (bool is int)** (raised by tech-claude) - `medium` **MCP literal-token heuristic accepts `$VAR` (unbraced) and any `$`-prefixed string** (raised by tech-claude) - `low` **Pagination logic assumes Forgejo API behavior** (raised by tech-glm) - `low` **`glm` is opaque naming for a PM operator** (raised by product-claude) - `low` **No operator runbook for the new failure modes (exit 4, expired token cache, doctor FAIL)** (raised by product-claude) - `low` **OverflowError from `int(float('inf'))` on expiresIn not caught** (raised by tech-claude) - `low` **`--dry-run-comment` argparse flag is a no-op; only `--no-dry-run-comment` mutates state** (raised by tech-claude) - `low` **Marker docstring vs implementation inconsistency invites later drift** (raised by tech-claude) - `low` **_ensure_runtime_dir best-effort chmod walks parents without checking symlinks** (raised by tech-claude) ## Opportunities - **Surface dry-run status in CI logs** — The review hook defaults to dry-run, which is safe, but CI logs might be noisy. Consider adding a concise '[DRY-RUN]' prefix to the decision packet render_markdown output when --dry-run-comment is active, so operators don't mistakenly think a live post occurred. (product-glm) - **Extend identity doctor to cover commit author/committer** — Per charter §3 open loop, the doctor currently verifies forgejo token and git config but not the actual commit authorship (which can differ via --author). Adding this check sooner would prevent 'right config, wrong commit' rollouts. (product-glm) - **Pin real-post behavior with end-to-end tests** — Current tests cover missing token and actor mismatch, but not lookup success followed by PATCH/POST failure. Add tests around `main()` exit code 4 for real write HTTP failures. (tech-gpt) - **Add one operator-facing canary checklist** — The PR spans charter, token cache, Forgejo posting, and identity doctor. A short checklist with expected command, expected exit code, expected visible actor, and rollback action would reduce approval burden for a non-technical risk owner. (product-gpt) - **Name the feature neutrally everywhere** — The PR title uses 'FLAME-WAR CANARY' while code/docs use service identity and multi-actor attribution. Converging on the neutral name would make future maintenance and search less emotionally loaded. (product-gpt) - **MCP heuristic false positives/negatives** — control-plane/platformctl/tools/identity_doctor.py:157 - The literal token check (token.startswith('$')) has false negatives for $(cmd) syntax and false positives for env var names without $. Consider a more robust heuristic or explicit config flag. (tech-glm) - **Duplicate actor validation lists** — The valid actors list is duplicated in three files (cli.py:368, run_review.py:352, identity_doctor.py:23). Could centralize in a shared constants module to prevent drift. (tech-glm) - **Review state file check references removed charter component** — control-plane/platformctl/tools/identity_doctor.py:133 - check_review_idempotency_state checks for review/glm-comment-state.json, which charter/runtime-layout.md says was removed. The check correctly handles absence (passes), but creates cognitive load. (tech-glm) - **Add operator cheat-sheet to runtime-layout.md** — state/runtime-layout.md documents the directory tree but not 'how do I, the operator, use this'. One paragraph with the canary one-liner + the 'doctor before canary' invocation would convert this from spec to usable reference. While the diff is open this is a 5-line addition; later it becomes a separate PR. (product-claude) - **Surface the dry-run-default behavior in run_review --help epilog** — The 'dry-run default ON, opt-in posting' is the single most operator-protective decision in this PR. It's only visible in the test names and PR description. An epilog/footer on the argparse parser ('SAFETY: --post-forgejo-comment will not actually post unless --no-dry-run-comment is also set') would make this self-documenting at the point of use. (product-claude) - **Consider whether four actor names is one too many** — PR introduces four choices (claude/codex/glm/platform-orchestrator), but only `glm` is actually wired this PR. `codex` is deferred to Order C, `platform-orchestrator` is described as a future rename. Allowing all four in --actor choices today means an operator can pick a not-yet-wired actor and get a confusing failure. Consider restricting --actor choices to currently-wired actors and expanding as Orders A/C land. (product-claude) - **Add HTTP-failure regression tests for real-post mode** — test_glm_comment_hook covers no-token and verify-fail real-mode raises but not the PATCH/POST 4xx/5xx path. Mocking httpx.post → 403/500 and asserting RuntimeError would have caught the silent-failure blocker above. (tech-claude) - **Hoist httpx imports to module level** — httpx is imported inside _verify_actor, _find_existing_comment, maybe_post_forgejo_comment, _login_universal_auth, read_secret, and check_forgejo_actor. Module-level import (with one ImportError fallback) reduces per-call cost and makes the dependency explicit. (tech-claude) - **Replace marker substring match with PR-scoped GET by-id where possible** — Forgejo supports filtering comments; storing the comment id on first create (e.g., in an audit-log line keyed on actor+sha) would make idempotency O(1) and immune to pagination races without needing a runtime state file. (tech-claude) ## Dissents - `tech-gpt` voted **NOT_OK** (confidence 0.90): { "verdict": "NOT_OK", "confidence": 0.9, "risks": [ { "title": "Real Forgejo comment write failures still exit successfully", "severity": "blocker", "evidence": "control-plane/platformctl/tools/run_review.py:686: non-200 PATCH/POST only logs `[comment] {action} failed: HTTP ...` and does not raise, so main can still return approve_merge/defer status after `--no-dry-run-comment` failed.", "recommendation": "In real-post mode, raise RuntimeError on any non-200/20 - `product-gpt` voted **NOT_OK** (confidence 0.86): { "verdict": "NOT_OK", "confidence": 0.86, "risks": [ { "title": "Real comment failure can still look successful", "severity": "blocker", "evidence": "control-plane/platformctl/tools/run_review.py:650 finding: missing httpx returns instead of raising in real-post mode; lines near final POST/PATCH only log non-200 responses and do not raise", "recommendation": "In --no-dry-run-comment mode, raise on missing httpx and any non-200/non-201 write response so the run - `tech-claude` voted **NOT_OK** (confidence 0.85): { "verdict": "NOT_OK", "confidence": 0.85, "risks": [ { "title": "Silent failure on Forgejo PATCH/POST non-2xx in real-post mode (recreates the exact bug PR claims to fix)", "severity": "blocker", "evidence": "control-plane/platformctl/tools/run_review.py — end of maybe_post_forgejo_comment(): `if resp.status_code in (200, 201): ...write OK\\nelse: ...write 'failed: HTTP {code}'`. The else branch only logs; no raise. The function returns normally, so main()'s try/exce ## Operator decisions (yes/no) 1. Risk 'Real Forgejo comment write failures still exit successfully' raised by 1/6: do you accept this risk, or should this PR be deferred until it's mitigated? 2. Risk 'Real comment failure can still look successful' raised by 1/6: do you accept this risk, or should this PR be deferred until it's mitigated? 3. Risk 'Silent failure on Forgejo PATCH/POST non-2xx in real-post mode (recreates the exact bug PR claims to fix)' raised by 1/6: do you accept this risk, or should this PR be deferred until it's mitigated? ## Evidence - Total review wallclock: ~435s - Tech consensus: dissent - Product consensus: dissent
Sign in to join this conversation.
No reviewers
No labels
W6d-automerge-calibration
agent/claude-code
agent/codex
agent/hermes
agent/iskra
agent/ollama
agent/patchwarden
automerge-candidate
class/security-sensitive
cutover-gate
dependency/blocked
dependency/blocks-others
dependency/cross-repo
dependency/needs-confirmation
domain:agents
domain:ci
domain:docs
domain:forgejo
domain:infra
domain:memory
domain:runtime
domain:signal
domain:ux
flow/architecture
flow/blocked
flow/deployed
flow/done
flow/implementation
flow/intake
flow/maintained
flow/observed
flow/ready
flow/refining
flow/retired
flow/review
iterating
judge/codex-candidate
judge/hermes-candidate
judge/low-confidence
judge/needs-refinement
judge/operator-needed
judge/p0
judge/p1
judge/p2
judge/p3
judge/park
judge/patchwarden-candidate
judge/stale-priority
kind/adr
kind/bug
kind/chore
kind/feature
kind/infra
kind/ops
kind/refactor
kind/research
large-impact
merge/auto
merge/manual
merge/manual-dependency-conflict
merge/manual-failing-tests
merge/manual-merge-conflict
merge/manual-missing-review
merge/manual-operator-preference
merge/manual-red-zone
merge/manual-security-sensitive
merge/manual-unclear-scope
merge/manual-unknown
meta
mode:operator-only
mode:patchwarden-iskra-approved
mode:safe-auto
needs-operator-decision
needs-triage
not-ready
observed/erroring
observed/needs-followup
observed/pending
observed/retire-candidate
observed/unused
observed/used
operator-emotional
owner-attention
phase/02
phase/03
priority:p0
priority:p1
priority:p2
priority:p3
proposed
ready-for-agent
ready-for-operator
recovery
review:claude-reviewed
review:codex-reviewed
review:dziadek-reviewed
review:needs-human
risk/exposure
risk/process
risk/product
risk/runtime
safety:external-write
safety:no-prod-mutation
safety:prod-impact
safety:secret-touch
size/large
size/medium
size/small
size/tiny
size/unknown
source/adr
source/agent-generated
source/manual
source/operator-chat
source/voice-note
status:blocked
status:codex-ready
status:merged:pending-evidence
status:needs-evidence
status:operator-needed
status:parked
tier/full
tier/lite
tier/stacked
tier:0-platform-substrate
tier:1-iskra-value-layer
tier:2-tools-products-modules
type:bug
type:chore
type:docs
type:feat
type:policy
type:research
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
pdurlej/platform!8
No description provided.