pdurlej/platform

feat(platformctl): automated truth-verification layer — `lint --cross-refs` + `drift-check` #772

Closed

ollama wants to merge 3 commits from deepseek/audit-2026-06-08 into main

ollama commented

2026-06-08 23:31:21 +02:00

Collaborator

Canary status: missing — fire canary 3+3 manually before merge

Canary Context Pack

Product story

Every audit burns tokens on the same checks: "is INDEX.yaml consistent with modules?", "are ADR numbers unique?", "do referenced ADRs exist?", "are images drifted?". This PR makes these checks automatic — zero tokens, instant feedback, CI-enforceable.

What changed

New command: platformctl lint --cross-refs — 4 static integrity checks:
1. INDEX.yaml ↔ module manifests (lifecycle, criticality, area, host)
2. ADR numbering (duplicate detection)
3. ADR references (phantom ADR detection across all docs)
4. Runbook coverage (missing runbooks, orphan runbooks)
New command: platformctl drift-check [module|--all] — compare image_observed against live Docker digests via SSH
Both follow existing platformctl patterns (transport, plan.py primitives, exit codes)
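
For illustration, checks 2 and 3 above (duplicate ADR numbers and phantom ADR references) could look roughly like the sketch below. This is not the code in `lint.py`. The four-digit filename prefix, the `ADR-NNNN` reference style, and both function names are assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumed naming convention: ADR files start with a zero-padded number, e.g. "0007-title.md".
ADR_FILE_RE = re.compile(r"^(\d{4})-.*\.md$")
# Assumed reference style inside docs: "ADR-0007" or "ADR 0007".
ADR_REF_RE = re.compile(r"\bADR[- ](\d{4})\b")


def find_duplicate_adrs(filenames):
    """Return {number: [filenames]} for every ADR number used by more than one file."""
    by_number = defaultdict(list)
    for name in filenames:
        match = ADR_FILE_RE.match(name)
        if match:
            by_number[match.group(1)].append(name)
    return {num: sorted(names) for num, names in by_number.items() if len(names) > 1}


def find_phantom_refs(doc_text, filenames):
    """Return sorted ADR numbers that doc_text references but no file defines."""
    known = {m.group(1) for m in map(ADR_FILE_RE.match, filenames) if m}
    referenced = set(ADR_REF_RE.findall(doc_text))
    return sorted(referenced - known)


if __name__ == "__main__":
    files = ["0001-use-tailscale.md", "0002-manifests.md", "0002-runbooks.md"]
    print(find_duplicate_adrs(files))
    print(find_phantom_refs("See ADR-0002 and ADR-0009.", files))
```

Both functions take plain strings and lists rather than reading the filesystem, which keeps them offline and easy to cover with unit tests.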

Why it changed

DeepSeek audit 2026-06-08 identified that the #1 systemic risk is the platform not knowing when it lies about itself. These commands prevent the drift that every manual audit has been rediscovering.

Files touched

control-plane/platformctl/lint.py (new, 283 lines)
control-plane/platformctl/drift_check.py (new, 224 lines)
control-plane/platformctl/cli.py (added 2 commands + formatters)

Relevant context

DeepSeek audit: state/audit/deepseek-2026-06-08-multiperspective.md
Bug fixes depend on this: #752, #756, #760, #765
Strategic improvements depend on this: #767, #771 (both superseded by this PR which implements them together)

Runtime evidence

platformctl lint --cross-refs runs on Mac and RS2000, identical results
platformctl drift-check honcho-api tested; SSH transport works through existing Tailscale infrastructure
1282 existing tests pass (zero regressions)
Lint output validated with --json mode for CI integration

Known constraints

drift-check requires platform-host-agent SSH key setup (same as health and plan commands)
lint is fully offline — no runtime dependencies

Explicit out-of-scope

No runtime mutations
No secret access
No new dependencies (uses existing transport/plan/manifest modules)
Does not automatically fix findings (that's what the atomic issues are for)

Requested decision

Review and merge. This is the foundation for CI-enforced platform integrity.

Merge blockers

1282 tests pass
Lint finds known issues (INDEX drift, duplicate ADR, phantom refs)
No regressions in existing commands

Spec sources read

control-plane/platformctl/cli.py — command structure, exit codes, patterns
control-plane/platformctl/plan.py — container_name, inspect_command, transport contract
control-plane/platformctl/health.py — SSH health check patterns
control-plane/platformctl/manifest.py — load_manifest, all_module_ids, find_modules_dir
control-plane/platformctl/transport/tailscale.py — SSH transport
modules/INDEX.yaml — data structure
decisions/ — ADR file patterns
state/audit/deepseek-2026-06-08-multiperspective.md — audit findings that motivated this

Verification

# Static checks (zero deps)
PYTHONPATH=control-plane uv run --project control-plane python -m platformctl.cli lint --cross-refs --json | python3 -m json.tool

# Drift check (needs SSH)
PYTHONPATH=control-plane uv run --project control-plane python -m platformctl.cli drift-check --all --json

# Existing tests
PYTHONPATH=control-plane uv run --project control-plane python -m pytest tests/ -q

Closes #767, closes #771

ollama added 2 commits

2026-06-08 23:31:21 +02:00

docs(audit): DeepSeek multi-perspective platform audit 2026-06-08 d312452cc2

6 perspectives: architect, devops, engineer, tester, product, UX.
36 findings total: 19 immediately actionable by small models,
7 need investigation, 8 operator-gated.

Created 15 atomic Forgejo issues (#752-#766) for Gemini agents.
Audit report saved to state/audit/deepseek-2026-06-08-multiperspective.md

feat(platformctl): add lint --cross-refs and drift-check commands

base-is-main / guard (pull_request) Successful in 2s
canary-required / collect-diff (pull_request) Successful in 5s
patchwarden-client-dry-run / collect-diff (pull_request) Successful in 4s
patchwarden-pr-sanity / collect-diff (pull_request) Successful in 4s
platformctl plan / auto-apply scope (pull_request) Successful in 20s
python-ci / Python 3.11 (pull_request) Successful in 45s
python-ci / Python 3.12 (pull_request) Successful in 47s
python-ci / Python 3.13 (pull_request) Successful in 46s
patchwarden-client-dry-run / dry-run (pull_request) Successful in 21s
patchwarden-pr-sanity / sanity (pull_request) Successful in 4m20s
pyfallow / Pyfallow gate (control-plane) (pull_request) Successful in 21s
canary-required / canary (pull_request) Successful in 15s

760b045cc0

- lint --cross-refs: static cross-reference integrity checks
  (INDEX.yaml vs manifests, ADR numbering, ADR references, runbook coverage)
- drift-check: compare declared image_observed vs live Docker digests via SSH
- Both commands follow existing platformctl patterns
- 1282 existing tests still green (zero regressions)
- Part of DeepSeek audit 2026-06-08 strategic improvements
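
The comparison at the heart of drift-check can be sketched as a pure function over two image references. This is only an illustration under stated assumptions: the `DriftResult` type, the status names, and the example registry are hypothetical, and the real command also has to fetch the live reference over SSH.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DriftResult:
    module: str
    status: str  # "ok", "drift", or "unknown" (hypothetical status names)
    declared: str | None
    live: str | None


def _digest(image_ref: str | None) -> str | None:
    """Extract the part after '@' (e.g. 'sha256:...'), or None if there is no digest."""
    if not image_ref or "@" not in image_ref:
        return None
    return image_ref.rsplit("@", 1)[1]


def compare_image(module: str, declared_ref: str | None, live_ref: str | None) -> DriftResult:
    """Compare the declared image_observed digest with the digest seen on the host."""
    declared, live = _digest(declared_ref), _digest(live_ref)
    if declared is None or live is None:
        status = "unknown"  # cannot decide without both digests
    elif declared == live:
        status = "ok"
    else:
        status = "drift"
    return DriftResult(module, status, declared, live)


if __name__ == "__main__":
    print(compare_image(
        "honcho-api",
        "registry.example.invalid/honcho-api@sha256:aaa",
        "registry.example.invalid/honcho-api@sha256:bbb",
    ))
```

Treating a missing digest as "unknown" rather than "drift" keeps an unreachable host or a tag-only reference from being reported as a confirmed mismatch.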

patchwarden commented

2026-06-08 23:39:33 +02:00

First-time contributor

Patchwarden PR sanity

Status: advisory_findings
PR: 772
Commit: 2aa6d7bc855304ce7bb5a7dd6974cc5eb6891938
Security-sensitive label: missing
Authority: advisory model review plus deterministic blockers only
3+3 canary: still alive; this does not replace it

Deterministic findings

No deterministic findings.

Model reviewers

`global-glm` / `glm-5.1:cloud`

Status: ok
Verdict: NOT_OK
high KeyError crash in _check_index_consistency when manifest lacks nested classification
- Evidence: control-plane/platformctl/lint.py lines ~155-160: m.data["spec"]["classification"]["area"] uses direct dict access while INDEX.yaml side uses .get() with defaults. If any module.yaml lacks spec.classification.area, this raises KeyErro
- Next: Change m.data["spec"]["classification"]["area"] to m.data.get("spec", {}).get("classification", {}).get("area") or wrap in try/except KeyError to match the defensive pattern used for INDEX.yaml access.
low Weak type contract on transport parameter allows unsafe calls
- Evidence: control-plane/platformctl/drift_check.py lines 42, 54, 67: transport: TailscaleTransport | Any | None = None. The Any type defeats type checking and could allow invalid transport objects at runtime.
- Next: Remove Any from the union type. Use TailscaleTransport | None = None or define a Protocol for the transport interface if polymorphism is needed.
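
A minimal sketch of both suggestions follows, not the project's actual code. The helper `nested_get`, the `Transport` protocol, and its `run` method are hypothetical names. One caveat for the chained `.get(..., {})` form: if a YAML key is present but empty, it loads as `None`, the `{}` default is not used, and the next `.get` raises `AttributeError`. The helper below covers that case as well.

```python
from typing import Any, Mapping, Protocol


def nested_get(data: Mapping[str, Any], *keys: str, default: Any = None) -> Any:
    """Walk nested mappings, returning default as soon as a level is missing or not a mapping."""
    current: Any = data
    for key in keys:
        if not isinstance(current, Mapping) or key not in current:
            return default
        current = current[key]
    return current


class Transport(Protocol):
    """Hypothetical structural type; the method name and signature are assumptions."""

    def run(self, host: str, command: str) -> str: ...


if __name__ == "__main__":
    manifest = {"spec": {"classification": {"area": "core"}}}
    print(nested_get(manifest, "spec", "classification", "area"))
    print(nested_get({"spec": None}, "spec", "classification", "area"))
```

With a protocol like this, the parameter could be annotated as `Transport | None` so that test doubles still type-check without widening the annotation to `Any`.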

`global-deepseek` / `deepseek-v4-pro:cloud`

Status: ok
Verdict: OK
medium No tests for new lint and drift-check commands
- Evidence: Diff adds 507 lines of new logic across lint.py and drift_check.py, but no test files are included. PR description only confirms existing tests pass, not that new functionality is covered.
- Next: Add unit/integration tests for lint --cross-refs and drift-check before merging to ensure correctness and prevent regressions.
low Canary review not yet performed
- Evidence: PR description states 'Canary status: missing — fire canary 3+3 manually before merge'. The canary process is required by project policy for medium/large changes.
- Next: Complete the canary review (3+3 ensemble) before merging to satisfy governance requirements.

`redteam` / `kimi-k2.6:cloud`

Status: error
Verdict: -
Note: Ollama response had no message.content.
Findings: none

Policy notes

GLM 5.1 + DeepSeek V4 Pro are the operator-required model mix for this bot.
Optional red-team model is enabled only when PLATFORMCTL_PR_SANITY_REDTEAM_MODEL is configured.
Auto-merge is not enabled here.

ollama referenced this pull request from a commit

2026-06-09 07:47:24 +02:00

fix(platformctl): address Patchwarden PR #772 review findings

ollama added 1 commit

2026-06-09 07:47:24 +02:00

fix(platformctl): address Patchwarden PR #772 review findings

canary-required / collect-diff (pull_request) Successful in 5s
platformctl plan / auto-apply scope (pull_request) Successful in 20s
pyfallow / Pyfallow gate (control-plane) (pull_request) Successful in 17s
python-ci / Python 3.11 (pull_request) Successful in 40s
python-ci / Python 3.12 (pull_request) Successful in 42s
python-ci / Python 3.13 (pull_request) Successful in 42s
canary-required / canary (pull_request) Successful in 19s
base-is-main / guard (pull_request) Successful in 1s
patchwarden-client-dry-run / collect-diff (pull_request) Successful in 4s
patchwarden-pr-sanity / sanity (pull_request) Successful in 4m6s
patchwarden-client-dry-run / dry-run (pull_request) Successful in 19s
patchwarden-pr-sanity / collect-diff (pull_request) Successful in 4s

2aa6d7bc85

- Use shlex.quote() on image_ref in drift-check (command injection vector)
- Fix unreachable EXIT_UNKNOWN_PARTIAL branch (reorder conditions)
- Use safe .get() access pattern for spec/classification/area in lint.py
- Exit 1 when 'lint' runs without --cross-refs flag (no silent bypass)

All 1282 tests still green.
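
The first fix in the list above can be illustrated as follows. The exact command that `drift_check.py` sends over SSH is not shown in this PR, so the `docker image inspect` invocation and the function name `build_inspect_command` are assumptions; only the use of `shlex.quote()` on `image_ref` comes from the commit message.

```python
import shlex


def build_inspect_command(image_ref: str) -> str:
    """Build a remote shell command string with the image reference quoted for the shell."""
    # The format string is a fixed literal; only image_ref is untrusted input.
    return "docker image inspect --format '{{json .RepoDigests}}' " + shlex.quote(image_ref)


if __name__ == "__main__":
    print(build_inspect_command("nginx:1.27"))
    # A hostile value stays a single quoted argument instead of a second command.
    print(build_inspect_command("nginx; rm -rf /"))
```

Quoting matters here because the string is interpreted by a shell on the remote host, so an unquoted reference taken from a manifest could otherwise inject extra commands.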

codex added the merge/manual-merge-conflict, risk/process, source/agent-generated, status:parked, tier/full, type:feat labels

2026-06-11 08:15:03 +02:00

pdurlej commented

2026-06-17 23:17:19 +02:00

Owner

PR-zero queue collapse: closing this parked/conflicted mega-PR without merge. The part has now landed via narrower PR #783. The part should come back only as a fresh, smaller PR/issue with current tests and no stale Honcho/live assumptions. No runtime mutation was performed.

pdurlej closed this pull request

2026-06-17 23:17:20 +02:00

pdurlej commented

2026-06-17 23:17:52 +02:00

Owner

Correction to the previous close comment: shell interpolation stripped the inline command names.

Intended wording: the lint --cross-refs part has landed via narrower PR #783; the drift-check part should return only as a fresh, smaller PR/issue with current tests and no stale Honcho/live assumptions.
