feat(identity-wiring): Order B — glm comment hook + charter §3/§4 + identity doctor stub #8
No reviewers
Labels
No labels
W6d-automerge-calibration
agent/claude-code
agent/codex
agent/hermes
agent/iskra
agent/ollama
agent/patchwarden
automerge-candidate
class/security-sensitive
cutover-gate
dependency/blocked
dependency/blocks-others
dependency/cross-repo
dependency/needs-confirmation
domain:agents
domain:ci
domain:docs
domain:forgejo
domain:infra
domain:memory
domain:runtime
domain:signal
domain:ux
flow/architecture
flow/blocked
flow/deployed
flow/done
flow/implementation
flow/intake
flow/maintained
flow/observed
flow/ready
flow/refining
flow/retired
flow/review
iterating
judge/codex-candidate
judge/hermes-candidate
judge/low-confidence
judge/needs-refinement
judge/operator-needed
judge/p0
judge/p1
judge/p2
judge/p3
judge/park
judge/patchwarden-candidate
judge/stale-priority
kind/adr
kind/bug
kind/chore
kind/feature
kind/infra
kind/ops
kind/refactor
kind/research
large-impact
merge/auto
merge/manual
merge/manual-dependency-conflict
merge/manual-failing-tests
merge/manual-merge-conflict
merge/manual-missing-review
merge/manual-operator-preference
merge/manual-red-zone
merge/manual-security-sensitive
merge/manual-unclear-scope
merge/manual-unknown
meta
mode:operator-only
mode:patchwarden-iskra-approved
mode:safe-auto
needs-operator-decision
needs-triage
not-ready
observed/erroring
observed/needs-followup
observed/pending
observed/retire-candidate
observed/unused
observed/used
operator-emotional
owner-attention
phase/02
phase/03
priority:p0
priority:p1
priority:p2
priority:p3
proposed
ready-for-agent
ready-for-operator
recovery
review:claude-reviewed
review:codex-reviewed
review:dziadek-reviewed
review:needs-human
risk/exposure
risk/process
risk/product
risk/runtime
safety:external-write
safety:no-prod-mutation
safety:prod-impact
safety:secret-touch
size/large
size/medium
size/small
size/tiny
size/unknown
source/adr
source/agent-generated
source/manual
source/operator-chat
source/voice-note
status:blocked
status:codex-ready
status:merged:pending-evidence
status:needs-evidence
status:operator-needed
status:parked
tier/full
tier/lite
tier/stacked
tier:0-platform-substrate
tier:1-iskra-value-layer
tier:2-tools-products-modules
type:bug
type:chore
type:docs
type:feat
type:policy
type:research
No milestone
No project
No assignees
2 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
pdurlej/platform!8
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "codex/identity-wiring/order-b-glm-comment-hook"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
First wiring step toward multi-actor attribution canary (Oracle's neutral label for "flame war"). Implements Oracle's Order B from session
infisical-update-flame-war-wiring-2026-05-01: smallest blast radius, no Claude restart, most-visible canary effect (review comments visibly authored byglm).Closes none — first PR in identity-wiring sequence (B → A → C per Oracle).
What's in this PR
Charter rules (Oracle's exact wording, validated):
~/.platformctl-runtime/, never committedInfisical Universal Auth + token cache (
secrets/infisical.py):expiresInfrom auth response (NOT hardcoded TTL — Oracle's caveat)force_refreshoverrideForgejo PR comment hook (
run_review.pyextended):--post-forgejo-comment(opt-in)--actor(defaultglm; choice limited to known service identities)--dry-run-commentdefault ON (charter: dry-run until canary)<!-- platform-review:{actor}:{repo}:PR-{id}:{sha[:8]} -->(Oracle's exact format)platformctl identity doctorstub (tools/identity_doctor.py):~/.claude.jsonas fail (Oracle: avoid "MCP config secret leakage")Runtime layout README (
state/runtime-layout.md): canonical tree per Oracle spec with permissions documented.Tests: 101/101 green
Distribution:
Out of scope (deferred per Oracle Order B → A → C sequencing)
~/.claude.jsonwith wrapper that resolvesclaudePAT at process start; requires Claude restart)user.name/user.emailfor codex commits + askpass-codex.sh)--post-forgejo-comment --no-dry-run-commenton a small Phase 02 PRTest plan
python -m platformctl.tools.identity_doctor --helpshows args--post-forgejo-comment --actor glm(no-real-post)--no-dry-run-comment, observe glm-authored commentLinkage
state/AUDIT_LOG.jsonl):infisical-identity-rename(renameclaude→platform-orchestrator)oracle-call-metadata-out-of-band(Oracle's discipline feedback)🤖 Generated with Claude Code
First wiring step toward multi-actor attribution canary (Oracle's neutral label for what operator called "flame war"). Implements Order B from Oracle session `infisical-update-flame-war-wiring-2026-05-01` — smallest blast radius, no Claude restart, most-visible canary effect (review comments visibly authored by `glm` instead of operator/admin). ## Charter §3 sub-rules + new §4 (PLATFORM_CHARTER.md) Per Oracle's exact wording: §3 (sub-rule, Runtime state location): > Runtime credentials, token caches, generated env wrappers, and mutable > machine-local state live under `~/.platformctl-runtime/` and must never > be committed. Repository code may define schemas and templates, but not > live secrets, bearer tokens, or host-specific credential material. §4 (new, Service identities): > All tool-originated Forgejo writes must use named non-admin service > identities with the minimum token scope required for the action. > Operator/admin credentials are break-glass only and must not be embedded > in agent, MCP, reviewer, or git automation configs. Plus Oracle's "identity channels are separate concerns" distinction: push auth actor / PR-comment actor / commit author / committer all need matching configuration; PAT alone fixes only auth actor. ## New: Infisical Universal Auth + token cache `platformctl/secrets/infisical.py` (155 lines): - `CachedToken.is_valid()` with 60s safety margin - `load_cached_token` / `save_cached_token` — atomic via tmp+rename, mode 0600 - `_login_universal_auth` — trusts `expiresIn` from auth response (NOT hardcoded 30d per Oracle's caveat) - `get_access_token` — cache-then-login pattern; force_refresh override - `read_secret` — read single secret value via /api/v3/secrets/raw 15 tests in `test_infisical_cache.py` cover safety margin, file modes, atomic writes, malformed-cache fallback, force_refresh, cache-hit-no-network. ## New: Forgejo PR comment hook in run_review.py CLI flags: - `--post-forgejo-comment` (opt-in, default OFF) - `--actor {claude,codex,glm,platform-orchestrator}` (default `glm`) - `--dry-run-comment` (default ON; charter rule "dry-run default until canary") - `--no-dry-run-comment` (disable dry-run for real posting) - `--forgejo-host`, `--forgejo-repo` Logic: - `_idempotency_marker` — Oracle's exact format `<!-- platform-review:{actor}:{repo}:PR-{pr_id}:{head_sha[:8]} -->` - `_verify_actor` — pre-write GET /api/v1/user; reject if token owner != actor - `_find_existing_comment` — substring match on marker; UPDATE not append - `maybe_post_forgejo_comment` — dry-run-by-default; PATCH or POST chosen by marker presence 13 tests in `test_glm_comment_hook.py` cover marker format (actor + sha distinguish), body composition, CLI defaults, no-token guard, dry-run no-HTTP, actor verification mismatch. ## New: platformctl identity doctor stub `platformctl/tools/identity_doctor.py` (190 lines): Catches "looks wired, but still acting as operator" failure mode. Five checks (skip-able): forgejo, git, infisical, review, mcp. - forgejo: GET /api/v1/user with PAT, verify login matches actor - git: `git config user.name`/`user.email` match actor - infisical: token cache present + not expired (no value print) - review: `glm-comment-state.json` parseable (or absent = first run) - mcp: heuristic — `~/.claude.json` Forgejo MCP env literal vs wrapper-resolved CLI: `python -m platformctl.tools.identity_doctor <actor> [--skip ...]`. Exit codes: 0=pass, 1=fail, 2=usage error. 13 tests in `test_identity_doctor.py` cover render, individual checks including the literal-token-detection (Oracle: "MCP config secret leakage"). ## New: state/runtime-layout.md Canonical structure documentation for `~/.platformctl-runtime/` per Oracle's tree spec. Documents permissions (0700/0600), token cache contract (`expiresIn`-driven), credential-rotation guidance. ## Tests: 101/101 green ``` pytest platformctl/tests/ → 101 passed in 4.18s ``` Distribution: 14 consolidator + 4 negative-control + 5 smoke + 37 run_review + 15 infisical-cache + 13 glm-hook + 13 identity-doctor. ## Out of scope (deferred to follow-up PRs) - **Order A — Forgejo MCP wrapper**: `~/.claude.json` swap to wrapper script that resolves `claude` PAT at process start. Requires Claude restart. - **Order C — Codex git config**: per-repo `user.name`/`user.email` for codex commits + askpass-codex.sh. Requires Codex CLI integration work. - **Per-tool Infisical identities**: Oracle says "later, not now" — first stabilize one identity, then split. - **Live HTTP integration tests**: this PR's hook tests are unit-level with monkeypatched httpx. Real test = first canary run with `--post-forgejo-comment --no-dry-run-comment` on a small Phase 02 PR. ## Open loops filed (state/AUDIT_LOG.jsonl, separate commit) - `infisical-identity-rename` — `claude` → `platform-orchestrator` since current identity reads multiple PATs (Oracle Q1 caveat #1) - `oracle-call-metadata-out-of-band` — Oracle's discipline feedback: include model/provider/thinking-budget metadata so Oracle can self-validate Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>Decision packet — PR 8
Risk:
BLOCKERDefault action:
deferTech votes
tech-gpt→ NOT_OKtech-glm→ OKtech-claude→ NOT_OKProduct votes
product-glm→ OKproduct-gpt→ NOT_OKproduct-claude→ OKWeak-signal risks (1 reviewer)
blockerReal Forgejo comment write failures still exit successfully (raised by tech-gpt)blockerReal comment failure can still look successful (raised by product-gpt)blockerSilent failure on Forgejo PATCH/POST non-2xx in real-post mode (recreates the exact bug PR claims to fix) (raised by tech-claude)highDoctor preserves removed local-state concept (raised by product-gpt)highSilent failure on non-200/201 comment POST/PATCH response (raised by tech-glm)highPagination in _find_existing_comment is order-non-deterministic; concurrent activity causes marker miss → duplicate (raised by tech-claude)highIdentity-doctor 'review' check is dead code that emits a misleading green (raised by tech-claude)mediumIdentity doctor still checks removed review state path (raised by tech-gpt)mediumDoctor can pass vacuously when every check is skipped (raised by tech-gpt)mediumInfisical cache save can leak temp files on os.replace failure (raised by tech-gpt)mediumCharter absorbs implementation caveats from an unfinished rollout (raised by product-gpt)mediumDry-run is not a low-friction rehearsal (raised by product-gpt)mediumTOCTOU race in comment idempotency check (raised by tech-glm)mediumOrphaned temp files if os.replace fails (raised by tech-glm)mediumParent directory permission hardening is best-effort only (raised by tech-glm)mediumOperator surface expansion is large for one PR — many new things to remember (raised by product-claude)mediumidentity doctorPASS still risks misleading a tired operator despite the caveat (raised by product-claude)mediumCharter rule and its implementation land in the same PR (raised by product-claude)mediumexpiresIn: truewould be accepted as 1-second TTL (bool is int) (raised by tech-claude)mediumMCP literal-token heuristic accepts$VAR(unbraced) and any$-prefixed string (raised by tech-claude)lowPagination logic assumes Forgejo API behavior (raised by tech-glm)lowglmis opaque naming for a PM operator (raised by product-claude)lowNo operator runbook for the new failure modes (exit 4, expired token cache, doctor FAIL) (raised by product-claude)lowOverflowError fromint(float('inf'))on expiresIn not caught (raised by tech-claude)low--dry-run-commentargparse flag is a no-op; only--no-dry-run-commentmutates state (raised by tech-claude)lowMarker docstring vs implementation inconsistency invites later drift (raised by tech-claude)low_ensure_runtime_dir best-effort chmod walks parents without checking symlinks (raised by tech-claude)Opportunities
main()exit code 4 for real write HTTP failures. (tech-gpt)(cmd) syntax and false positives for env var names without. Consider a more robust heuristic or explicit config flag. (tech-glm)glmis actually wired this PR.codexis deferred to Order C,platform-orchestratoris described as a future rename. Allowing all four in --actor choices today means an operator can pick a not-yet-wired actor and get a confusing failure. Consider restricting --actor choices to currently-wired actors and expanding as Orders A/C land. (product-claude)Dissents
tech-gptvoted NOT_OK (confidence 0.90): {"verdict": "NOT_OK",
"confidence": 0.9,
"risks": [
{
"title": "Real Forgejo comment write failures still exit successfully",
"severity": "blocker",
"evidence": "control-plane/platformctl/tools/run_review.py:686: non-200 PATCH/POST only logs
[comment] {action} failed: HTTP ...and does not raise, so main can still return approve_merge/defer status after--no-dry-run-commentfailed.","recommendation": "In real-post mode, raise RuntimeError on any non-200/20
product-gptvoted NOT_OK (confidence 0.86): {"verdict": "NOT_OK",
"confidence": 0.86,
"risks": [
{
"title": "Real comment failure can still look successful",
"severity": "blocker",
"evidence": "control-plane/platformctl/tools/run_review.py:650 finding: missing httpx returns instead of raising in real-post mode; lines near final POST/PATCH only log non-200 responses and do not raise",
"recommendation": "In --no-dry-run-comment mode, raise on missing httpx and any non-200/non-201 write response so the run
tech-claudevoted NOT_OK (confidence 0.85): {"verdict": "NOT_OK",
"confidence": 0.85,
"risks": [
{
"title": "Silent failure on Forgejo PATCH/POST non-2xx in real-post mode (recreates the exact bug PR claims to fix)",
"severity": "blocker",
"evidence": "control-plane/platformctl/tools/run_review.py — end of maybe_post_forgejo_comment():
if resp.status_code in (200, 201): ...write OK\\nelse: ...write 'failed: HTTP {code}'. The else branch only logs; no raise. The function returns normally, so main()'s try/exceOperator decisions (yes/no)
Evidence