pdurlej/platform

design(autonomy): tiered agent-execution gate — 4-tier cascade (#673) #686

Merged

pdurlej merged 1 commit from claude/673-autonomy-design into main

2026-06-02 12:02:24 +02:00

claude commented

2026-06-02 11:32:07 +02:00

Collaborator

Design for #673 — tiered agent-execution gate (Cursor-Auto-review-inspired, red-team-checked).

Key insight: a 4-tier cascade (hard-stop → allowlist → sandbox → classifier) where the order structurally guarantees the hard rule — the classifier is reached last and never sees a hard boundary, so it cannot gate one by construction.

Sandbox = reuse 2 (git+CI, container-isolation), build 1 (platformctl apply --sandbox). Classifier = depend, not build (cheap-model + policy-as-text steering). Hard rails stay deterministic; the soft layer only ever touches reversible ground.

Becomes a codex-ready impl spec once the operator accepts the 3 open-question calls. Ties #76, #634, #566, ADR-0025.
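
The cascade order in this description can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the implementation proposed by the design: the action names, the three sets, and the `route` function are hypothetical. Only the stage order (hard-stop, allowlist, sandbox, classifier) is taken from the description.

```python
from enum import Enum
from typing import List


class Route(Enum):
    HARD_STOP = "hard-stop"    # deterministic rail, checked first
    ALLOWLIST = "allowlist"    # known-safe actions
    SANDBOX = "sandbox"        # run contained
    CLASSIFIER = "classifier"  # soft layer, reached last


# Hypothetical example entries; the real lists live in the design doc.
HARD_STOPS = frozenset({"deploy_prod", "rotate_secret"})
ALLOWLIST = frozenset({"read_file", "git_status"})
SANDBOXABLE = frozenset({"platformctl_apply"})


def route(action: str, classifier_seen: List[str]) -> Route:
    """Return the stage that decides `action`.

    `classifier_seen` records every action handed to the classifier,
    so a caller can check that no hard-stop action ever reaches it.
    """
    if action in HARD_STOPS:
        return Route.HARD_STOP
    if action in ALLOWLIST:
        return Route.ALLOWLIST
    if action in SANDBOXABLE:
        return Route.SANDBOX
    # Only actions that fell through every deterministic check get here.
    classifier_seen.append(action)
    return Route.CLASSIFIER
```

Because the hard-stop check returns before any other branch runs, an action on that list never reaches the classifier, which is the structural property the description relies on. The property holds only for actions that are actually on the list.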

claude added 1 commit

2026-06-02 11:32:07 +02:00

design(autonomy): tiered agent-execution gate — 4-tier cascade (#673)

base-is-main / guard (pull_request) Successful in 1s

canary-required / collect-diff (pull_request) Successful in 3s

patchwarden-client-dry-run / collect-diff (pull_request) Successful in 3s

patchwarden-pr-sanity / collect-diff (pull_request) Successful in 3s

canary-required / canary (pull_request) Has been skipped

patchwarden-client-dry-run / dry-run (pull_request) Failing after 1m24s

patchwarden-pr-sanity / sanity (pull_request) Successful in 1m16s

2fd23516b4

claude referenced this pull request

2026-06-02 11:32:07 +02:00

explore(autonomy): tiered agent-execution gate — sandbox + soft-classifier tiers (Cursor Auto-review-inspired) #673

claude referenced this pull request

2026-06-02 11:33:44 +02:00

docs(strategy): platform security & maturity roadmap v1 (audit-grounded) #674

codex commented

2026-06-02 11:35:02 +02:00

Collaborator

Patchwarden PR sanity

Status: eligible_sanity_clean
PR: 686
Commit: 2fd23516b409b26f6e6f2168581532fe8d32dc05
Security-sensitive label: missing
Authority: advisory model review plus deterministic blockers only
3+3 canary: still alive; this does not replace it

Deterministic findings

No deterministic findings.

Model reviewers

`global-glm` / `glm-5.1:cloud`

Status: ok
Verdict: OK
high Hard-stop list completeness is a security-critical assumption
- Evidence: Lines 24-30: cascade guarantees classifier 'never sees a hard boundary' by construction, but this depends on hard-stop list being exhaustive. Tier 3 actions listed (lines 55-59) but no mechanism specified to verify completeness or catch mis
- Next: Add a section specifying how hard-stop list completeness is validated (e.g., audit trail, red-team review cadence, or a 'default-deny for unmapped high-impact actions' rule)
medium Tier numbering contradicts cascade execution order
- Evidence: Lines 20-30: cascade checks hard-stop first (step 1) → routes to Tier 3, allowlist second (step 2) → routes to Tier 0. Tier numbers (0=immediate, 3=operator-gate) invert execution priority. Implementation could confuse 'check Tier 3 first'
- Next: Rename tiers to match cascade order (Tier 1=hard-stop, Tier 2=allowlist, Tier 3=sandbox, Tier 4=classifier) or add explicit clarification note that tier numbers are severity labels, not execution order
medium Sandbox isolation boundaries unspecified for MCP/shell calls
- Evidence: Line 79: 'container / namespace isolation' for high-capability MCP calls tagged exec_command/secret_resolve/deploy/write_file/memory_write. No specification of filesystem limits, network egress rules, or what constitutes 'isolation prevents
- Next: Before Codex implementation, specify isolation boundary contract: what capabilities are blocked in sandbox containers, what network egress is permitted, and how memory_write isolation prevents cross-contamination
medium Tier 1 sandbox side-effects before discard not addressed
- Evidence: Line 44: 'run contained → commit-or-discard' and line 72: 'applies to disposable namespace / preview-slot, runs module healthcheck, then discards'. If a sandbox action triggers external webhooks, sends network requests, or modifies external
- Next: Add explicit side-effect containment requirement: Tier 1 sandbox actions must be reversible-by-construction (no external side-effects) or document which side-effect classes are acceptable in sandbox mode
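
One of the suggested remedies above, a "default-deny for unmapped high-impact actions" rule, might look like the following Python sketch. The capability tags are the ones quoted in the sandbox finding; the `EXPLICIT_ROUTES` table, the route names, and the `resolve` function are hypothetical and only illustrate the idea.

```python
from typing import Dict, Iterable

# Capability tags quoted in the review finding above.
HIGH_IMPACT_TAGS = frozenset(
    {"exec_command", "secret_resolve", "deploy", "write_file", "memory_write"}
)

# Hypothetical explicit mapping from action name to route.
EXPLICIT_ROUTES: Dict[str, str] = {
    "git_commit": "sandbox",
    "read_docs": "allowlist",
}


def resolve(action: str, tags: Iterable[str]) -> str:
    """Route an action, denying by default when it is high-impact but unmapped."""
    mapped = EXPLICIT_ROUTES.get(action)
    if mapped is not None:
        return mapped
    if HIGH_IMPACT_TAGS & set(tags):
        # Unmapped and high-impact: never fall through to the soft layer.
        return "operator-gate"
    return "classifier"
```

With a rule of this shape, an incomplete hard-stop list degrades to extra operator prompts rather than to a classifier decision on a high-impact action.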

`global-deepseek` / `deepseek-v4-pro:cloud`

Status: ok
Verdict: OK
Findings: none

`redteam` / `kimi-k2.6:cloud`

Status: error
Verdict: -
Note: Ollama response had no message.content.
Findings: none

Policy notes

GLM 5.1 + DeepSeek V4 Pro are the operator-required model mix for this bot.
Optional red-team model is enabled only when PLATFORMCTL_PR_SANITY_REDTEAM_MODEL is configured.
Auto-merge is not enabled here.
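
The reviewer selection described in these policy notes could be expressed roughly as below. This is an assumed sketch, not the bot's code: the `reviewer_models` function and `REQUIRED_MODELS` constant are hypothetical, while the model identifiers and the `PLATFORMCTL_PR_SANITY_REDTEAM_MODEL` variable name come from the comment above.

```python
import os
from typing import List, Mapping, Optional

# Operator-required model mix, as named in the review comment.
REQUIRED_MODELS = ("glm-5.1:cloud", "deepseek-v4-pro:cloud")


def reviewer_models(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the models to run; the red-team model is opt-in via the environment."""
    env = os.environ if env is None else env
    models = list(REQUIRED_MODELS)
    redteam = env.get("PLATFORMCTL_PR_SANITY_REDTEAM_MODEL", "").strip()
    if redteam:
        # Only added when the variable is set to a non-empty value.
        models.append(redteam)
    return models
```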

claude referenced this pull request

2026-06-02 11:40:43 +02:00

impl(autonomy): tiered execution gate — cascade router, sandbox, classifier (per #673 / PR #686) #687

claude referenced this pull request

2026-06-02 11:40:43 +02:00

explore(autonomy): tiered agent-execution gate — sandbox + soft-classifier tiers (Cursor Auto-review-inspired) #673