docs(vault): add safe-session CA dual-trust bootstrap #618

Merged
pdurlej merged 1 commit from codex/m04-safe-session-ca-dual-trust-bootstrap into main 2026-05-29 22:47:55 +02:00
Collaborator

Canary status: local preflight green; Forgejo checks pending

Summary

This PR adds the missing safe-session CA bootstrap/trust handoff gate before any live Vault sunset cutover.

DeepSeek V4 Pro redteam returned HOLD on an immediate live CA handoff, because replacing /etc/ssh/trusted-user-ca-keys.pem would invalidate outstanding Vault-signed certificates. This PR implements the recommended safe path: a temporary dual-trust window.

What changed

  • Added scripts/cutover/safe-session-ca-dual-trust-bootstrap.sh.
  • Updated runbooks/safe-session-local-ca-cutover.md with the CA bootstrap and dual-trust handoff procedure.
  • Updated runbooks/vault-quarantine-and-sunset.md preconditions so Vault quarantine requires dual-trust handoff and local signer cutover evidence first.
  • Updated scripts/cutover/README.md.
  • Added tests for the bootstrap script contract.

Safety model

Default mode is read-only:

scripts/cutover/safe-session-ca-dual-trust-bootstrap.sh --check

The write path requires an explicit separate gate:

scripts/cutover/safe-session-ca-dual-trust-bootstrap.sh \
  --execute \
  --confirm m04-safe-session-ca-bootstrap-approved

The script appends the new public CA to TrustedUserCAKeys; it does not replace the old Vault CA in one step. Vault-signed certificates remain valid during the dual-trust window.
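
The two-step gate described above can be sketched as follows. This is a minimal illustration and not the merged script: the function name resolve_mode and its exit codes are assumptions, and only the flag names and the confirm phrase come from this PR.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a read-only default plus an explicit write gate.
# Not the merged script; resolve_mode and its exit codes are illustrative.

CONFIRM_PHRASE="m04-safe-session-ca-bootstrap-approved"

# Prints "check" or "execute". Returns non-zero when the write gate is
# incomplete, so a caller can never fall through into the write path.
resolve_mode() {
  local execute=0 confirm=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --check) ;;                         # read-only is already the default
      --execute) execute=1 ;;
      --confirm)
        if [ "$#" -ge 2 ]; then
          confirm=$2
          shift
        fi
        ;;
      *)
        echo "unknown argument: $1" >&2
        return 2
        ;;
    esac
    shift
  done

  if [ "$execute" -eq 0 ]; then
    echo check
  elif [ "$confirm" = "$CONFIRM_PHRASE" ]; then
    echo execute
  else
    echo "refusing --execute without the exact --confirm phrase" >&2
    return 3
  fi
}
```

Requiring both flags means a stray --execute in shell history or a copied command line is not enough on its own to reach the write path.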

Validation

  • bash -n scripts/cutover/safe-session-ca-dual-trust-bootstrap.sh
  • bash -n scripts/cutover/safe-session-local-ca-preflight.sh
  • bash -n scripts/cutover/vault-sunset-readiness.sh
  • UV_CACHE_DIR=/private/tmp/codex-uv-cache PYTHONPATH=control-plane uv run --project control-plane pytest tests/test_safe_session_ca_bootstrap.py -q — 5 passed.
  • UV_CACHE_DIR=/private/tmp/codex-uv-cache PYTHONPATH=control-plane uv run --project control-plane python -m platformctl.cli validate all --json — exitCode 0.
  • RS2000 read-only check: current trusted CA present, one active line; local CA key/pub missing; execute still gated.

DeepSeek V4 Pro redteam

Verdict before this PR: HOLD.

Blockers were:

  • replacing trusted CA would invalidate outstanding Vault-signed certs;
  • runbook did not document SSH trust handoff/bootstrap;
  • runtime cutover gate phrase was not provided;
  • local CA material was not staged.

This PR addresses the first two as repo/runbook/script gates. It does not execute bootstrap or runtime cutover.

Non-goals

  • No live Infisical write in this PR.
  • No SSH trust mutation in this PR.
  • No safe-session runtime cutover in this PR.
  • No Vault stop/quarantine/delete.
docs(vault): add safe-session CA dual-trust bootstrap
All checks were successful
canary-required / collect-diff (pull_request) Successful in 5s
patchwarden-client-dry-run / collect-diff (pull_request) Successful in 5s
python-ci / Python 3.11 (pull_request) Successful in 44s
python-ci / Python 3.12 (pull_request) Successful in 44s
python-ci / Python 3.13 (pull_request) Successful in 43s
canary-required / canary (pull_request) Successful in 14s
patchwarden-client-dry-run / dry-run (pull_request) Successful in 22s
base-is-main / guard (pull_request) Successful in 1s
patchwarden-pr-sanity / collect-diff (pull_request) Successful in 5s
patchwarden-pr-sanity / sanity (pull_request) Successful in 3m22s
6284ee0638
Author
Collaborator

Patchwarden PR sanity

  • Status: advisory_findings
  • PR: 618
  • Commit: 6284ee06388e67924a3e8a1173e87ef02e1527f7
  • Security-sensitive label: present
  • Authority: advisory model review plus deterministic blockers only
  • 3+3 canary: still alive; this does not replace it

Deterministic findings

No deterministic findings.

Model reviewers

global-glm / glm-5.1:cloud

  • Status: ok
  • Verdict: OK
  • Findings: none

global-deepseek / deepseek-v4-pro:cloud

  • Status: ok

  • Verdict: NOT_OK

  • high Dual-trust append deduplication check is broken due to newline in pattern

    • Evidence: In scripts/cutover/safe-session-ca-dual-trust-bootstrap.sh, the line if grep -Fqx "$(cat "$ca_pub_file")" "$trusted_ca_file"; then uses cat which includes a trailing newline, causing grep -x to never match a line without a newline. Th
    • Next: Strip the trailing newline from the public key content before using it in grep, e.g., pub_key_content=$(cat "$ca_pub_file"); pub_key_content=${pub_key_content%$'\n'} and then use that variable.
  • medium No rollback of TrustedUserCAKeys if sshd -t fails after append

    • Evidence: The script appends to the trusted CA file, then runs sshd -t. If sshd -t fails, the script exits due to set -e, but the modified file remains, potentially breaking SSH. The backup is created but not automatically restored.
    • Next: Add a trap or conditional to restore the backup if sshd -t fails, or at least print a warning to manually restore.
  • low Infisical write failure may be silent

    • Evidence: The infisical secrets set command uses --silent and >/dev/null, so if it fails, the error message is suppressed. The script relies on set -e to catch non-zero exit, but the operator won't see the error details.
    • Next: Capture stderr or check exit code explicitly and emit a clear failure message.
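
The deduplication and silent-failure findings above could be addressed along these lines. This is a minimal sketch, not the merged script: ca_already_trusted and run_quiet are hypothetical names, and it assumes every line of the trusted CA file has the form "type base64 [comment]". For context, bash command substitution strips trailing newlines, so "$(cat "$ca_pub_file")" does not itself carry one; the sketch instead compares only the key type and base64 fields, which also tolerates a differing comment.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; names and structure are assumptions, not the merged script.

# Succeeds if the key in $1 (a single-line OpenSSH public key file) is already
# listed in the trusted CA file $2. Only "type base64" is compared, so a
# different trailing comment does not defeat the check.
ca_already_trusted() {
  local ca_pub_file=$1 trusted_ca_file=$2 key
  key=$(awk 'NF >= 2 && $1 !~ /^#/ { print $1 " " $2; exit }' "$ca_pub_file")
  [ -n "$key" ] || return 2
  awk -v want="$key" '
    $1 !~ /^#/ && ($1 " " $2) == want { found = 1; exit }
    END { exit found ? 0 : 1 }
  ' "$trusted_ca_file"
}

# Runs a command with stdout discarded and stderr captured, then replays the
# captured stderr only when the command fails. Only the command name is
# echoed, never its arguments, since those may contain secret values.
run_quiet() {
  local err rc=0
  err=$(mktemp) || return 1
  "$@" >/dev/null 2>"$err" || rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "command failed (exit $rc): $1" >&2
    cat "$err" >&2
  fi
  rm -f "$err"
  return "$rc"
}
```

A caller could guard the append with ca_already_trusted and wrap the secret write in run_quiet, so a failure surfaces the tool's own error text instead of a bare non-zero exit.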

redteam / kimi-k2.6:cloud

  • Status: ok

  • Verdict: NOT_OK

  • high SSH trust mutated before validation; no rollback on sshd -t failure

    • Evidence: scripts/cutover/safe-session-ca-dual-trust-bootstrap.sh: cat "$ca_pub_file" >> "$trusted_ca_file" precedes sshd -t; with set -e, a validation failure aborts without restoring the $backup copy, leaving a potentially broken CA file on
    • Next: Stage the appended CA file to a temporary path, validate with sshd -t (or via a temporary sshd_config pointing at the staging file), and only atomically replace the live file after validation passes; add an ERR trap to restore $backup on any failure after mutation.
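
The stage-validate-swap order suggested above might look like the sketch below. It is illustrative only: append_ca_staged is a hypothetical name, and the validator is left to the caller as an assumption (the reviewer suggests sshd -t against a temporary sshd_config that points at the staged file). Because the live file is touched only by the final rename, a failed validation needs no rollback.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: stage, validate, then atomically swap.
# Not the merged script.

# Usage: append_ca_staged CA_PUB TRUSTED_CA VALIDATOR [ARGS...]
# VALIDATOR is invoked with the staged file path as its last argument.
append_ca_staged() {
  local ca_pub_file=$1 trusted_ca_file=$2
  shift 2
  local staged
  # Stage next to the live file so the final mv is a same-filesystem rename.
  staged=$(mktemp "${trusted_ca_file}.staged.XXXXXX") || return 1

  # cp -p carries the live file's mode and ownership onto the staged copy.
  # The tail check adds a newline if the live file does not end with one,
  # so the new key never gets glued onto the previous line.
  if cp -p "$trusted_ca_file" "$staged" \
     && { [ -z "$(tail -c1 "$staged")" ] || echo >> "$staged"; } \
     && cat "$ca_pub_file" >> "$staged" \
     && "$@" "$staged" \
     && mv "$staged" "$trusted_ca_file"; then
    return 0
  fi

  echo "staged CA update rejected; $trusted_ca_file left unchanged" >&2
  rm -f "$staged"
  return 1
}
```

An ERR trap that restores a backup would still be useful for any step that runs after the rename, such as reloading sshd; that part is outside this sketch.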

Policy notes

  • GLM 5.1 + DeepSeek V4 Pro are the operator-required model mix for this bot.
  • Optional red-team model is enabled only when PLATFORMCTL_PR_SANITY_REDTEAM_MODEL is configured.
  • Auto-merge is not enabled here.
Reference
pdurlej/platform!618