ops(security): Codex SSH key delivery via ssh-agent with TTL — keep key off disk (Phase 1 Lite) #73

Closed
opened 2026-05-05 02:03:47 +02:00 by pdurlej · 0 comments
Owner

Migrated from pdurlej/iskra-openclaw#47

After Piotr's review tonight (2026-05-05 ~01:45 CEST): this issue was originally opened in pdurlej/iskra-openclaw#47 but the right home is pdurlej/platform, because:

  • Infisical infrastructure already lives here (runbooks/infisical-*.md, env/infisical.*, scripts/infisical-*.sh, scripts/openclaw-infisical-render.sh, policy/security.rego, docs/dependency-migrations/infisical.md).
  • Issue #64 (docs(migrations): vault-to-infisical Phase 1-5 execution) is the parent topic — secrets management strategy.
  • Codex SSH access is cross-cutting agent governance, not Iskra-runtime-specific.
  • The wrapper code on vps1000 (pdurlej/iskra-openclaw PR #45) stays in iskra-openclaw because the allowlist is iskra-specific (iskra-canary, iskra-cockpit, continuity-pack). Delivery flow ≠ wrapper allowlist; only the wrapper allowlist is iskra-runtime-bound.

This issue specifies delivery flow (cross-cutting credentials infrastructure). Closing the original pdurlej/iskra-openclaw#47 with a redirect comment.

— Claude (post-hoc placement fix, 2026-05-05 ~01:45 CEST)


Origin

After Piotr's overnight conversation with Oracle GPT 5.5 pro about the three-layer secret-management architecture (pdurlej/iskra-openclaw#43, will likely also migrate here) and the overnight-deployed Path A SSH wrapper (pdurlej/iskra-openclaw#44 + PR #45), we return to the delivery question: how should Codex's runtime receive the private key so that:

  1. The key never lands on the Codex runtime's disk
  2. Codex can use it for ssh, but does not see the raw bytes after the fetch window
  3. The Path A → Path B migration path stays cleanly preserved
  4. Quick win: no waiting for the full Phase 1 resolver from pdurlej/iskra-openclaw#43

The approach recommended in pdurlej/iskra-openclaw#44 (Bitwarden field + bw get item at startup) works, but it has several paper cuts that the ssh-agent + TTL pattern resolves structurally. This issue specifies the delivery flow as Phase 1 Lite: a bridge between the current Path A wrapper deployment and the full Phase 1 resolver from #64.

Mission name from Piotr: "SSH agent locked dla Codexa".

Problem (current state per pdurlej/iskra-openclaw#44 thread)

  • Public key + wrapper deployed on vps1000 (PR #45 in pdurlej/iskra-openclaw, smoke 5/5 pass)
  • Private key sits at /tmp/codex-key-bundle/codex_iskra_openclaw_ed25519 on Piotr's Mac, mode 0600
  • Recommended delivery: Bitwarden item vps1000 (codex) with ssh_key field, fetched by Codex runtime via bw get item 'vps1000 (codex)' --session "$BW_SESSION" | jq -r '...'

Limitations of recommended Bitwarden-direct-fetch

  1. Plaintext-on-disk risk: most natural implementations of "fetch and use SSH key" write it to a tempfile (echo "$key" > /tmp/key && ssh -i /tmp/key). Even with cleanup, the key briefly exists on disk and possibly in shell history.
  2. Process-substitution doesn't fully escape: ssh -i <(bw get...) keeps key in named pipe, but key is in Codex's process memory and the named pipe is readable by anyone with FD access during the fetch window.
  3. No TTL on Codex-side cache: once key is fetched and stored anywhere on Codex runtime, it stays until Codex code explicitly clears it. Forgetting that = persistent secret.
  4. No automatic rotation enforcement: if we rotate the key in Bitwarden, Codex won't pick up the change until next runtime restart.
  5. Codex sees raw bytes for the entire session: even if not on disk, the raw key passes through Codex's process memory and stays there.

None of these is a disaster on its own, but together the paper cuts justify one structural fix.

Proposal: ssh-agent as locked vault, with TTL

Use OpenSSH's built-in ssh-agent as the credential broker for Codex's audit access. Key lives only in ssh-agent's process memory, with a forced TTL via ssh-add -t. Codex calls ssh ... normally; SSH_AUTH_SOCK routes to ssh-agent which signs the SSH handshake without exposing the key bytes.

This is the "agent uses but never sees the secret" pattern from Oracle's Phase 3 brief, applied to the SSH-key-only special case using a mechanism that dates back to the original SSH release in 1995 and has shipped with OpenSSH since its first release in 1999, not a research-preview product.

Architecture

Bitwarden / Infisical             Codex runtime
(key source)                      ┌──────────────────────────────┐
       │                          │                              │
       │  1. fetch (one-time      │   ssh-agent (process)        │
       │     per TTL window,      │   ┌────────────────────┐     │
       │     stdin to ssh-add)    │   │  key in RAM        │     │
       └─────────────────────────►│   │  TTL = 3600s       │     │
                                  │   └─────────┬──────────┘     │
                                  │             │ unix socket    │
                                  │             │ SSH_AUTH_SOCK  │
                                  │   ┌─────────▼──────────┐     │
                                  │   │  Codex agent       │     │
                                  │   │  ssh openclaw@...  │     │
                                  │   │  ("sign this", not │     │
                                  │   │   "give me bytes") │     │
                                  │   └─────────┬──────────┘     │
                                  │             │                │
                                  └─────────────┼────────────────┘
                                                │
                                                ▼
                                  vps1000 (Path A wrapper unchanged,
                                  pdurlej/iskra-openclaw PR #45)

After 3600s, ssh-agent auto-evicts the key. Codex must re-fetch on
next operation. No raw key bytes persist anywhere on Codex runtime.
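
The eviction behaviour described above can be checked locally with a throwaway key and a short TTL. This is a minimal sketch, assuming the OpenSSH client tools (ssh-agent, ssh-add, ssh-keygen) are installed; the function name `ttl_smoke` is hypothetical, and the demo key touches a temp directory only because it is disposable (in the real flow the key arrives on stdin from the key source).

```bash
# ttl_smoke TTL_SECONDS
# Hypothetical helper: loads a disposable ed25519 key into a dedicated
# ssh-agent with the given TTL, then reports `ssh-add -l` exit codes
# before and after the TTL has elapsed.
# Runs in a subshell (parenthesised body) so the caller's SSH_AUTH_SOCK
# is never overwritten.
ttl_smoke() (
  ttl=$1
  tmp=$(mktemp -d) || exit 1
  # Dedicated agent for the demo; exports SSH_AUTH_SOCK and SSH_AGENT_PID.
  eval "$(ssh-agent -s)" >/dev/null
  # Throwaway key with an empty passphrase.
  ssh-keygen -q -t ed25519 -N '' -f "$tmp/demo_key"
  # Same shape as the real flow: key material is read from stdin.
  ssh-add -t "$ttl" - < "$tmp/demo_key" 2>/dev/null
  rm -rf "$tmp"
  # ssh-add -l: 0 = identities present, 1 = none, 2 = agent unreachable.
  before=0; ssh-add -l >/dev/null 2>&1 || before=$?
  sleep $((ttl + 2))
  after=0; ssh-add -l >/dev/null 2>&1 || after=$?
  ssh-agent -k >/dev/null   # stop the demo agent
  echo "before=$before after=$after"
)
```

Calling `ttl_smoke 2` should report the key as present right after loading and absent once the TTL has passed. The production value of 3600 seconds behaves the same way, only slower to verify.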

Concrete delivery flow (Bitwarden-backed Phase 1 Lite)

# Codex runtime startup (or per-session preflight):

# 1. Ensure ssh-agent is running, capture SSH_AUTH_SOCK
eval "$(ssh-agent -s)"

# 2. Fetch key from Bitwarden, pipe directly to ssh-add stdin, TTL 1h
bw get item 'vps1000 (codex)' --session "$BW_SESSION" \
  | jq -r '.fields[] | select(.name=="ssh_key") | .value' \
  | ssh-add -t 3600 -

# 3. Use SSH normally
#    Key never on disk, never in env, never in Codex's argv
ssh -F /dev/null \
    -o IdentitiesOnly=no \
    openclaw@vps1000 'iskra-canary --json --timeout-seconds 30'

What this achieves:

  • Key never written to any disk file (no tempfile, no ~/.ssh/codex_key, no env var with key body).
  • Key in ssh-agent process memory only — separate process from Codex agent's main loop.
  • After 3600 seconds, ssh-agent evicts the key. Codex must re-fetch from Bitwarden on next operation.
  • Codex code signs SSH handshakes via the unix socket — operations are "sign this challenge", not "give me the bytes".
  • Path A wrapper on vps1000 is unchanged (pdurlej/iskra-openclaw PR #45 stays as-is).
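
One gap in the flow above: if `bw` or `jq` returns nothing useful (locked session, renamed field), `ssh-add` receives empty or garbage input. A small guard can check the payload's framing before it reaches the agent. This is a sketch only; `looks_like_private_key` is a hypothetical helper, and it checks the PEM-style header and footer lines, not cryptographic validity.

```bash
# looks_like_private_key
# Reads candidate key material on stdin and succeeds only if the first line
# is a "-----BEGIN ... PRIVATE KEY-----" marker and the last non-empty line
# is the matching END marker. The input is never echoed back.
looks_like_private_key() {
  awk '
    NR == 1 && /^-----BEGIN [A-Z ]*PRIVATE KEY-----$/ { has_begin = 1 }
    /^-----END [A-Z ]*PRIVATE KEY-----$/             { end_line = NR }
    NF                                               { last_line = NR }
    END { exit !(has_begin && end_line == last_line && last_line > 2) }
  '
}

# Possible use in the startup script (shell variable, not exported, and
# printf is a builtin, so the key does not appear in env or argv):
#   key=$(bw get item 'vps1000 (codex)' --session "$BW_SESSION" \
#           | jq -r '.fields[] | select(.name=="ssh_key") | .value')
#   printf '%s\n' "$key" | looks_like_private_key \
#     && printf '%s\n' "$key" | ssh-add -t 3600 -
#   unset key
```

The trade-off is that the key is held briefly in a shell variable instead of flowing straight through the pipe. That does not change the "Codex sees raw bytes once per TTL window" property, but it is a deliberate choice rather than a free improvement.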

Phase 1 Lite vs. Phase 1 Full vs. Phase 3

| Aspect | Phase 1 Lite (this issue) | Phase 1 Full (#64) | Phase 3 (Oracle Agent Vault) |
|---|---|---|---|
| Key source | Bitwarden item | Infisical via openclaw-infisical-resolver | Infisical via Agent Vault broker |
| Codex sees raw bytes | Once per TTL window | Once per TTL window | Never |
| Key on disk | Never | Never | Never |
| Egress isolation | None (Codex calls SSH directly) | None | HTTPS_PROXY enforced |
| TTL enforcement | ssh-add -t flag | ssh-add -t flag + Infisical token TTL | Broker session TTL |
| Codex effort | ~0.5 day | ~2 days (depends on Phase 1 Full) | Weeks (research preview) |
| Status | Ships now | Blocks on #64 / Phase 1 Full | Blocks on Agent Vault readiness |

The "Codex sees raw bytes once per TTL window" gap remains in Phase 1 Full and only closes in Phase 3. That's an honest limitation — Phase 1 Lite is not Phase 3 in disguise. But the gap is much smaller than current "key on disk indefinitely" state.

Migration path

When Phase 1 Full lands (#64 / Oracle 3-phase architecture), the Codex runtime startup script changes from:

bw get item 'vps1000 (codex)' --session "$BW_SESSION" \
  | jq -r '.fields[] | select(.name=="ssh_key") | .value' \
  | ssh-add -t 3600 -

to:

echo '{"protocolVersion":1,"provider":"infisical","ids":["codex-runtime/ssh/iskra-openclaw-key"]}' \
  | /usr/local/bin/openclaw-infisical-resolver \
  | jq -r '.values["codex-runtime/ssh/iskra-openclaw-key"]' \
  | ssh-add -t 3600 -

ssh-agent + TTL pattern is unchanged. Only the key source rotates from Bitwarden to Infisical.
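
To keep that swap to a single setting, the two fetch pipelines can sit behind one function. This is a sketch, not the resolver's documented interface: `fetch_codex_key` and the `KEY_SOURCE` variable are hypothetical names, and the two pipelines are copied from the snippets above.

```bash
# fetch_codex_key
# Prints the private key on stdout from the configured source.
# KEY_SOURCE is a hypothetical switch: "bitwarden" (default) or "infisical".
fetch_codex_key() {
  case "${KEY_SOURCE:-bitwarden}" in
    bitwarden)
      bw get item 'vps1000 (codex)' --session "$BW_SESSION" \
        | jq -r '.fields[] | select(.name=="ssh_key") | .value'
      ;;
    infisical)
      echo '{"protocolVersion":1,"provider":"infisical","ids":["codex-runtime/ssh/iskra-openclaw-key"]}' \
        | /usr/local/bin/openclaw-infisical-resolver \
        | jq -r '.values["codex-runtime/ssh/iskra-openclaw-key"]'
      ;;
    *)
      # Fail loudly instead of handing ssh-add an empty stream.
      echo "unknown KEY_SOURCE: ${KEY_SOURCE}" >&2
      return 2
      ;;
  esac
}

# The startup script then stays identical across phases:
#   fetch_codex_key | ssh-add -t 3600 -
```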

When Phase 3 Agent Vault lands, the SSH access flow itself changes — Codex no longer holds a key at all, the broker proxies the SSH session.

Acceptance criteria

  • Codex runtime startup script uses ssh-add -t 3600 - to load key from stdin (no tempfile)
  • Codex runtime never writes the SSH private key to any persistent file
  • Codex's ssh openclaw@vps1000 ... succeeds within 1h of startup; fails after 1h until re-fetch (smoke: wait 3700s, retry)
  • Codex log/transcripts contain no raw key material (test: grep -E 'BEGIN OPENSSH PRIVATE KEY' codex.log returns nothing)
  • bw get item 'vps1000 (codex)' returns the key only when BW_SESSION is unlocked (relies on Bitwarden's own ACL)
  • After Phase 1 Full lands (#64), one-line swap from bw get to resolver-based fetch verified
  • Threat model section added to pdurlej/iskra-openclaw docs/security/codex-ssh-wrapper.md covering the in-memory-only invariant
  • One smoke test demonstrating closure of one pdurlej/iskra-openclaw#10/#12/#13/#22/#33 audit using the new delivery flow
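
The log-hygiene criterion can be automated with a slightly broader check than the single grep above. This is a sketch under assumptions: `scan_for_key_material` is a hypothetical helper, and the pattern also covers other PEM private-key headers (RSA, EC, plain PKCS#8) in case the key format changes later.

```bash
# scan_for_key_material FILE...
# Returns 0 if no private-key header is found, 1 if one is found,
# 2 if the scan itself failed (for example, an unreadable file).
# Reports only file names and line numbers, never the matching content.
scan_for_key_material() {
  pattern='-----BEGIN ([A-Z0-9]+ )?PRIVATE KEY-----'
  # -e is needed because the pattern starts with "-";
  # /dev/null forces the file-name prefix even for a single file.
  hits=$(grep -nE -e "$pattern" -- "$@" /dev/null) && status=0 || status=$?
  case $status in
    0)
      printf '%s\n' "$hits" | cut -d: -f1,2 | sed 's/^/key marker at /' >&2
      return 1
      ;;
    1)
      return 0
      ;;
    *)
      echo "scan failed (grep exit $status)" >&2
      return 2
      ;;
  esac
}
```

A CI step could run `scan_for_key_material codex.log` and fail the job on any non-zero status, treating an unreadable log as a failure rather than a pass.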

Threat model: diff vs. recommended Bitwarden-direct-fetch

| Threat | Bitwarden tempfile (recommended in iskra-openclaw#44) | Phase 1 Lite (this issue) |
|---|---|---|
| Disk forensics post-breach | Key recoverable from filesystem journal/snapshots/SSD residue | Not present on any disk at any time |
| TTL/rotation discipline | Manual cleanup required, easy to forget | Auto-evict after 1h |
| Process-restart leakage | Possible if cleanup races process death | No on-disk residue to clean up; the key exists only in ssh-agent RAM. The agent should be stopped with the runtime (e.g. ssh-agent -k), otherwise the key lingers in RAM until the TTL expires |
| Codex-side argv leak | ssh -i /tmp/key leaks path to ps/audit logs | ssh openclaw@vps1000 has no key path in argv |
| Forgotten key rotation | Until Codex restart | Within 1h of Bitwarden rotation |
| In-process compromise | Same | Same (no improvement) |
| Bitwarden compromise | Same | Same (no improvement) |

What this does NOT protect

  • In-process compromise on Codex runtime: if an attacker gets local code execution with the same UID as ssh-agent, they can try to ptrace ssh-agent or read its memory via /proc/<pid>/mem (/proc/<pid>/maps only lists the mappings, not their contents). Same as any in-process secret. HSM is the structural fix; out of scope here.
  • Socket file permission: ssh-agent socket lives in /tmp/ssh-XXX/ with mode 0600 owned by Codex's UID. Anyone with that UID (or root) can use the socket. Acceptable trade-off for read-only audit access — wrapper allowlist on vps1000 is the second wall.
  • Bitwarden compromise: if Bitwarden is breached, the key is exfiltrated. Same threat surface as the Path A "key in Bitwarden" recommendation.
  • vps1000 compromise: the wrapper allowlist (pdurlej/iskra-openclaw scripts/security/codex-ssh-wrapper.py) is the protection there, not this delivery mechanism.

Open questions

  1. TTL value: 1h chosen as compromise. Codex's audit operations are typically short bursts; 1h covers a session comfortably without burning Bitwarden API quota. Open to 30min if Codex's operations support it.
  2. Codex runtime boundaries: does Codex have a clean "session start" hook where this fetch+ssh-add can be wired? Or does Codex spawn fresh per-task processes? If per-task, ssh-agent needs to be a long-running sibling process.
  3. Failure mode on ssh-add expiry mid-session: surface error and let job-level retry handle re-fetch (avoid tight loops on Bitwarden during outages).
  4. Bitwarden API quota: how aggressive is bw get item rate limit? Worth measuring.
  5. ssh-agent -a socket location: default /tmp/ssh-XXX/ is fine for now. Future hardening could use /run/user/<uid>/ssh-agent.sock with stricter unix permissions and tmpfs backing.
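
For question 3, the job-level retry needs two pieces: a way to tell "key expired" from "agent gone", and a bounded delay between re-fetch attempts. This is a sketch with hypothetical names (`agent_state`, `backoff_delay`, and the `SSH_ADD` override used for stubbing); it relies on the documented `ssh-add -l` exit statuses (0 on success, 1 when the agent holds no identities, 2 when the agent cannot be contacted).

```bash
# agent_state
# Prints "loaded", "empty" (no key, e.g. TTL expired) or "unreachable".
# SSH_ADD can point at a stub for testing; it defaults to the real ssh-add.
agent_state() {
  local rc=0
  "${SSH_ADD:-ssh-add}" -l >/dev/null 2>&1 || rc=$?
  case $rc in
    0) echo loaded ;;
    1) echo empty ;;
    *) echo unreachable ;;
  esac
}

# backoff_delay ATTEMPT [BASE_SECONDS] [CAP_SECONDS]
# Prints the wait before retry number ATTEMPT (1-based): BASE doubled on
# each attempt, capped at CAP, so an outage never produces a tight loop.
backoff_delay() {
  local attempt=$1 base=${2:-5} cap=${3:-300} delay
  if [ "$attempt" -gt 10 ]; then
    # Avoid large shifts; anything this late is already at the cap.
    echo "$cap"
    return 0
  fi
  delay=$(( base << (attempt - 1) ))
  if [ "$delay" -gt "$cap" ]; then
    delay=$cap
  fi
  echo "$delay"
}

# Possible job-level use (the re-fetch step itself is the pipeline above):
#   if [ "$(agent_state)" = empty ]; then
#     sleep "$(backoff_delay "$attempt")"   # then re-fetch and ssh-add
#   fi
```

With the defaults the delays run 5s, 10s, 20s and so on up to 300s. Both numbers are placeholders to tune once the Bitwarden rate limit from question 4 has been measured.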

Decision asks for Piotr

  1. Phase 1 Lite vs. plain Bitwarden-fetch: do we want the ssh-agent + TTL layer now, or keep recommended iskra-openclaw#44 approach until Phase 1 Full lands? My read: Phase 1 Lite is ~0.5 day Codex work, removes 4 of 5 paper cuts permanently. Worth shipping now.
  2. TTL window: 1h, 30min, or shorter? My read: 1h, revisit after first week.
  3. BW vs. skip-do-Infisical: should we use Bitwarden as Phase 1 Lite source, or skip directly to Infisical? My read: BW first (key already lands there per iskra-openclaw#44), Infisical machine-identity SSH provisioning is its own design under #64.

Related

  • #64: docs(migrations): vault-to-infisical Phase 1-5 execution (Phase 1 Full target; this is Phase 1 Lite of that)
  • pdurlej/iskra-openclaw#43 — 3-phase secrets architecture (Oracle GPT 5.5 pro consult; also a candidate for migration to platform per the same audit)
  • pdurlej/iskra-openclaw#44 — Codex SSH access provisioning (also a candidate for migration to platform); this issue specifies delivery for that wrapper
  • pdurlej/iskra-openclaw PR #45 — wrapper deployment (stays in iskra-openclaw — allowlist is iskra-specific)
  • pdurlej/iskra-openclaw#10, #12, #13, #22, #33 — five audits unblocked by Path A
  • pdurlej/iskra-openclaw#46 — defense-in-depth for Agent Souls/ (sibling memory-sidecar follow-up; stays in iskra-openclaw)
  • #56 (Iskra) — fix(identity): split Forgejo MCP identity per agent and disable admin MCP by default — adjacent agent-governance topic

Why "small, but ship now"

Each layer is ~5–15 lines of bash in Codex's startup script. The pattern is production-grade (OpenSSH for ~30 years). The Path A→B migration path is preserved. The quick win removes 4 paper cuts vs. plain Bitwarden fetch. Codex effort: ~0.5 day.

-- Claude (overnight pass on Piotr's "SSH agent locked dla Codexa" mission, night of 2026-05-05, post-Oracle-GPT-5.5-pro-consult)
