fix(platformctl): prefer Infisical token auth before direct PAT #273

Merged
pdurlej merged 1 commit from codex/issues/272-infisical-first-token-resolution into main 2026-05-14 09:30:26 +02:00
Collaborator

Canary status: missing - class/security-sensitive; operator merge only after review.

Canary Context Pack

Product story

The deploy runner needs a 7-day soak where Infisical Token Auth is exercised while the old direct Forgejo PAT remains available as a fallback. The current resolver makes that impossible because it silently chooses the direct PAT first.

What changed

  • platformctl apply now prefers PLATFORMCTL_INFISICAL_TOKEN_AUTH_FILE over direct PAT env vars when both are configured.
  • Added non-secret log markers for token source:
    • forgejo_token_source=infisical-token-auth
    • forgejo_token_source=direct-env-fallback
    • forgejo_token_source=explicit-argument
  • Updated the runner install script so it preserves direct PAT during the soak instead of removing it immediately.
  • Added tests for Infisical-over-direct precedence and direct fallback logging.
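The precedence described above can be sketched as follows. This is a minimal illustration, not the code in apply.py: the function name `resolve_forgejo_token`, the `fetch_from_infisical` callable, and the `FORGEJO_TOKEN` variable name are hypothetical, and placing the explicit argument first is an assumption (the PR only states that Infisical wins over the direct PAT). The sketch also does not fall back to the direct PAT when the Infisical fetch fails; the PR does not say what apply.py does in that case.

```python
import logging
from pathlib import Path
from typing import Callable, Mapping, Optional

log = logging.getLogger("platformctl.apply")

INFISICAL_FILE_ENV = "PLATFORMCTL_INFISICAL_TOKEN_AUTH_FILE"
# Hypothetical variable name; the PR only says "direct PAT env vars".
DIRECT_PAT_ENV = "FORGEJO_TOKEN"


def resolve_forgejo_token(
    explicit: Optional[str],
    env: Mapping[str, str],
    fetch_from_infisical: Callable[[Path], str],
) -> str:
    """Return a Forgejo token and log which source supplied it (never the value)."""
    # Assumption: an explicitly passed token outranks every configured source.
    if explicit:
        log.warning("forgejo_token_source=explicit-argument")
        return explicit

    # Infisical is tried before the direct PAT, even when both are configured.
    auth_file = env.get(INFISICAL_FILE_ENV)
    if auth_file:
        log.warning("forgejo_token_source=infisical-token-auth")
        return fetch_from_infisical(Path(auth_file))

    # The direct PAT is used only when no Infisical token-auth file is configured.
    direct = env.get(DIRECT_PAT_ENV)
    if direct:
        log.warning("forgejo_token_source=direct-env-fallback")
        return direct

    raise RuntimeError("no Forgejo token source configured")
```

Passing the environment and the Infisical fetcher as parameters keeps the ordering testable without a network call, which is the property the new precedence tests need to prove.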

Why it changed

During the Infisical Token Auth close-out dispatch on 2026-05-14, an audit found that a green matrix-well-known smoke would have been misleading: the direct PAT was still present, so apply.py would never have called Infisical.

Files touched

  • control-plane/platformctl/apply.py
  • control-plane/platformctl/tests/test_apply_phase3.py
  • scripts/forgejo/deploy-runner-install-infisical-token-auth

Relevant context

  • #265 — Infisical Token Auth migration thread
  • #272 — blocker issue for silent direct PAT bypass
  • #142 — RS2000 cutover lane

Runtime evidence

No smoke and no production apply were run from this PR.

Runner-local setup evidence from #142 comment 5225:

  • token file installed on RS2000, mode 0600, owner forgejo-deploy:forgejo-deploy, size 333 bytes
  • runner active
  • direct PAT preserved for soak
  • public https://infisical.pdurlej.com returns 403 from RS2000, but local Infisical container endpoint returns HTTP 200 for the same token
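The token-file properties listed above (mode 0600, a specific owner, non-empty) could be checked with a small helper like the one below. This is a sketch for illustration only; `token_file_problems` is a hypothetical name, and nothing in the PR says the install script performs this check in Python.

```python
import pwd
import stat
from pathlib import Path
from typing import List, Optional


def token_file_problems(path: Path, expected_owner: Optional[str] = None) -> List[str]:
    """Return human-readable problems with a token file; an empty list means OK."""
    problems: List[str] = []
    st = path.stat()

    if not stat.S_ISREG(st.st_mode):
        problems.append("not a regular file")

    # Only the permission bits matter here, so strip the file-type bits first.
    mode = stat.S_IMODE(st.st_mode)
    if mode != 0o600:
        problems.append(f"mode is {mode:04o}, expected 0600")

    if expected_owner is not None:
        try:
            owner = pwd.getpwuid(st.st_uid).pw_name
        except KeyError:
            owner = str(st.st_uid)  # uid has no passwd entry
        if owner != expected_owner:
            problems.append(f"owner is {owner}, expected {expected_owner}")

    if st.st_size == 0:
        problems.append("file is empty")

    return problems
```

The helper reads only metadata, so it never loads or prints the token itself.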

Test output

uv run pytest platformctl/tests/test_apply_phase3.py -q
# 46 passed in 1.43s

uv run pytest platformctl/tests/test_forgejo_ci_scripts_contract.py -q
# 25 passed in 1.24s

bash -n scripts/forgejo/deploy-runner-install-infisical-token-auth
git diff --check

Known constraints

  • This PR does not remove the direct PAT. That is deliberate; the direct PAT is removed only after the 7-day soak.
  • forgejo_token_source=* markers are intentionally logged as warnings so they appear in workflow stderr without extra logging configuration. They contain no secrets.
  • The runner currently uses an internal Infisical endpoint because the public edge returns 403 from RS2000.
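The choice of warning level for the forgejo_token_source=* markers matches standard Python logging behaviour: when no handler is configured, the standard library's last-resort handler writes records at WARNING and above to stderr, while INFO records are dropped. The sketch below demonstrates that behaviour in isolation. The logger name and the `capture_stderr` helper are hypothetical, and whether apply.py relies on exactly this mechanism is an assumption.

```python
import contextlib
import io
import logging

# Hypothetical logger name; no handlers or levels are configured anywhere.
log = logging.getLogger("platformctl.apply")


def capture_stderr(level: int, message: str) -> str:
    """Log one record and return whatever reached stderr."""
    buf = io.StringIO()
    # The last-resort handler looks up sys.stderr at emit time,
    # so redirect_stderr is enough to observe its output.
    with contextlib.redirect_stderr(buf):
        log.log(level, message)
    return buf.getvalue()


if __name__ == "__main__":
    # WARNING reaches stderr without any logging configuration.
    print(repr(capture_stderr(logging.WARNING, "forgejo_token_source=infisical-token-auth")))
    # INFO is filtered out by the root logger's default WARNING level.
    print(repr(capture_stderr(logging.INFO, "forgejo_token_source=infisical-token-auth")))
```

If the surrounding program later installs its own handlers or changes the root level, this default no longer applies and the markers follow that configuration instead.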

Explicit out-of-scope

  • No direct PAT removal.
  • No F3 smoke.
  • No real-change apply.
  • No Infisical ACL/UI changes.
  • No /etc/hosts or Traefik changes.

Requested decision

Merge this before retrying the Infisical close-out matrix-well-known smoke. Without this PR, the smoke cannot prove Infisical resolution.

Merge blockers

  • Any token value in logs.
  • Any code path that removes direct PAT before soak ends.
  • Missing proof that Infisical wins over direct PAT when both are configured.

Spec sources read

  • control-plane/platformctl/apply.py — token resolution implementation.
  • control-plane/platformctl/tests/test_apply_phase3.py — approval/token tests.
  • scripts/forgejo/deploy-runner-install-infisical-token-auth — runner install behavior.
  • .forgejo/workflows/platformctl-auto-apply.yml — workflow logging/artifact context.

Closes #272

fix(platformctl): prefer Infisical token auth before direct PAT
All checks were successful
canary-required / collect-diff (pull_request) Successful in 4s
platformctl plan / auto-apply scope (pull_request) Successful in 21s
pyfallow / Pyfallow gate (control-plane) (pull_request) Successful in 19s
python-ci / Python 3.11 (pull_request) Successful in 36s
python-ci / Python 3.12 (pull_request) Successful in 36s
python-ci / Python 3.13 (pull_request) Successful in 36s
canary-required / canary (pull_request) Successful in 12s
base-is-main / guard (pull_request) Successful in 1s
4772388813
Reference
pdurlej/platform!273