chore(governance): ADR-0016 — DeepSeek-v4-Pro as 8th cousin (deep-reviewer lane) #185

Merged
pdurlej merged 1 commit from claude/orders/adr-0016-deepseek-8th-cousin into main 2026-05-11 22:03:17 +02:00

Canary status: missing — fire canary 3+3 manually OR operator merge directly (this is Lite tier per ADR-0007: single new ADR + new directory README, <300 LoC, no sensitive paths).

Why now

Promotes Prof Kong's draft handoff #1 to Accepted status. Convergent evidence from 4 independent strands made this no-longer-a-draft:

  1. Prof Kong's identification (iskra-openclaw thread, pre-this-PR) — pattern noticed after DeepSeek's external review caught architectural gaps Prof Kong had missed
  2. Pan Herbatka's ralph batch experience (2026-05-10) — deepseek-v4-pro:cloud consistently produced the hardest findings as iter-3 red-team + iter-5 arbiter in 5-iter chains. 8 out of 10 iteration-credit notes on PR #157 mention deepseek-pro finding the issue.
  3. Operator's direct DeepSeek session 2026-05-11 — full iskra-openclaw analysis in 5% context window + full pdurlej/platform analysis in 7% context window. Found HIGH-severity gaps Pan Herbatka had missed (status file proliferation, 65/80 missing recovery sections, 35-day stale DR test). Addressed by PR #184.
  4. Bidirectional cross-validation — Pan Herbatka caught DeepSeek's docs/ci/ false positive (files exist; check_docs_drift.py passes 0 findings) within 5 min of reading the review. Validates modelizm-guard pattern works in both directions.

All four strands arose in different contexts (different threads, repos, and focus areas) and reach the same conclusion: DeepSeek-v4-Pro should be a formal cousin, not an ad-hoc tool.

What ships (2 files, +287 LoC)

  • decisions/0016-deepseek-v4-pro-deep-reviewer-cousin.md (197 LoC) — full ADR with:

    • Context (4 evidence strands)
    • Decision (deep-reviewer lane assignment)
    • Invocation criteria (periodic + pre/post-Phase + cross-cutting + suspected-bubble-bias)
    • Cost discipline (≤2 reviews/week, <10% of the weekly Ollama budget per major review, operator approval for >50k tokens)
    • Cross-validation rule (HIGH findings require ls/grep/cat evidence before action)
    • Cousin family v2 table (now 8 cousins: operator, claude, codex, glm, iskra, hermes, antigravity, deepseek-v4-pro)
    • 4 open questions for operator decision
  • state/deep-reviews/README.md (90 LoC) — landing zone for periodic deep review reports:

    • Filename convention (<YYYY-MM-DD>-<topic>-deep-review.md)
    • Report structure (findings, severity, cross-validation status, action items)
    • Current reviews table (2026-05-11 sessions noted but not yet pasted as files)
    • Backfill protocol for past reviews
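
The filename convention above could be checked mechanically. A minimal sketch, assuming topics are lowercase kebab-case slugs (an assumption; this description only gives the `<YYYY-MM-DD>-<topic>-deep-review.md` shape). `parse_review_filename` is a hypothetical helper, not something this PR ships:

```python
import re
from datetime import date
from typing import Optional, Tuple

# Assumed pattern: ISO date, lowercase kebab-case topic, fixed suffix.
_REVIEW_NAME = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})-([a-z0-9]+(?:-[a-z0-9]+)*)-deep-review\.md$"
)


def parse_review_filename(name: str) -> Optional[Tuple[date, str]]:
    """Return (review_date, topic) if `name` follows the convention, else None."""
    match = _REVIEW_NAME.match(name)
    if match is None:
        return None
    year, month, day, topic = match.groups()
    try:
        review_date = date(int(year), int(month), int(day))
    except ValueError:
        # Digits in the right places but not a real calendar date.
        return None
    return review_date, topic
```

A check like this could run over `state/deep-reviews/` to flag misnamed reports; whether that is worth wiring into CI is left to the operator.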

What this PR does NOT do

  • Does NOT change any other ADR (ADR-0005 cousin lanes will be updated in a follow-up — that's where the v2 cousin table will live as the source of truth)
  • Does NOT change REVIEW.md (already mentions DeepSeek as part of the cousin family from PR #184)
  • Does NOT auto-invoke DeepSeek anywhere (all invocations remain operator-initiated)
  • Does NOT backfill 2026-05-11 deep review reports (operator can paste raw output if desired; not blocking)
  • Does NOT touch sacred paths, schemas, runtime

Out of scope (filed as open questions in the ADR)

  • Cadence formalization (monthly vs quarterly)
  • Whether pdurlej/iskra-openclaw adopts same ADR
  • Auto-opening Forgejo issues from deep-review findings
  • Pi-CLI vs API direct dispatch authority

Tier classification (per ADR-0007)

Lite — single new ADR + single new directory README; ~287 LoC; no sensitive paths; no schema/runtime; no breaking change. Add label tier/lite if operator agrees.
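
For illustration, the Lite criteria named in this description (under 300 LoC, no sensitive paths) can be written as a small predicate. This is a sketch only: ADR-0007's full rules are not reproduced here, `looks_lite` is a hypothetical name, and the sensitive path prefixes in the usage example are placeholders rather than the repo's real list.

```python
from typing import Dict, Iterable

LITE_MAX_LOC = 300  # "<300 LoC", as stated in this description


def looks_lite(added_loc_by_path: Dict[str, int],
               sensitive_prefixes: Iterable[str]) -> bool:
    """True if the change is under the LoC cap and touches no sensitive path."""
    prefixes = tuple(sensitive_prefixes)
    total_loc = sum(added_loc_by_path.values())
    touches_sensitive = any(
        path.startswith(prefixes) for path in added_loc_by_path
    )
    return total_loc < LITE_MAX_LOC and not touches_sensitive


# The two files in this PR, with the LoC counts given above.
this_pr = {
    "decisions/0016-deepseek-v4-pro-deep-reviewer-cousin.md": 197,
    "state/deep-reviews/README.md": 90,
}
# Placeholder prefixes, for illustration only.
print(looks_lite(this_pr, ["schemas/", "runtime/"]))
```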

Under ADR-0007 trial period: review by claude + codex + DeepSeek (deep-reviewer itself being the subject = recursive validation — DeepSeek can review its own elevation ADR if operator chooses, which would be a poetic ouroboros).

Test plan

  • Operator readback: does the cousin family v2 table reflect the actual setup?
  • Operator readback: are the invocation criteria correct (when SHOULD we invoke DeepSeek)?
  • Operator readback: are the open questions worth answering now, or defer?
  • Optional: invoke DeepSeek-v4-Pro to review this very ADR (recursive validation; would be cute)
  • Operator merge or operator-override per ADR-0001 Rule 2
  • Post-merge: ADR-0005 follow-up to update cousin lanes table with v2

Spec sources read

  • ADR-0005 (base cousin coordination lanes)
  • ADR-0006 + ADR-0007 (sibling ADRs in same work-stream)
  • REVIEW.md § Cousin family rules (already cites DeepSeek)
  • state/STATUS_NOW.md § Cousin coordination snapshot (already mentions 8th cousin)
  • Prof Kong handoff #1 draft (in PR #183 of iskra-openclaw)
  • Pan Herbatka ralph batch evidence (~/Iskra-i-Piotr/05 System/Swarmheart Backups/ralph-phase3-apply/)
  • Operator quote 2026-05-11 (translated from Polish): "We've been using so much of that DeepSeek again. Do you know what percentage of my weekly Ollama we've used up? Less than 3%."

Operator's North Star check

Does this ADR reduce or increase operator-attention-cost?

  • Reduces: operator no longer needs to invent reasons to invoke DeepSeek; criteria are codified. Plus deep-reviewer surfaces architectural patterns that would otherwise require hours of operator review.
  • Slight increase: operator approval gate for >50k tokens (deliberate — prevents agent-runaway). Net positive given current budget evidence (<3% weekly).

Verdict: operator-friendly. Ship.

🍵 — Filed end-of-day 2026-05-11 by claude (Pan Herbatka). Operator-merge welcome anytime.

Refs: PR #184 (DeepSeek-findings remediation), Prof Kong handoff #1 origin, ralph batch 2026-05-10 evidence

chore(governance): ADR-0016 — DeepSeek-v4-Pro as 8th cousin (deep-reviewer lane)
All checks were successful
canary-required / collect-diff (pull_request) Successful in 4s
canary-required / canary (pull_request) Successful in 19s
fe12b22a6a
Promotes Prof Kong's draft handoff #1 (from iskra-openclaw thread) to
Accepted status, based on convergent evidence from 4 independent strands:

1. Prof Kong's identification (iskra-openclaw thread, pre-this-PR)
2. Pan Herbatka's ralph batch experience (deepseek-pro consistently
   produced the hardest findings as iter-3 red-team + iter-5 arbiter)
3. Operator's direct DeepSeek session 2026-05-11 (5% context for full
   OpenClaw, 7% context for full Platform; found HIGH-severity gaps
   addressed by PR #184)
4. Bidirectional cross-validation: Pan Herbatka caught DeepSeek's
   docs/ci/ false positive within 5 min, validating the
   modelizm-guard pattern works in both directions

## Lane: deep-reviewer

- Cross-cutting architectural review; cold-reading whole repo
- Distinct from canary ensemble reviewers (per-PR per ADR-0007)
- Distinct from ralph chain arbiter (per-PR, tactical)
- Operator-invoked (periodic + pre/post-Phase + cross-cutting questions)

## Invocation criteria + cost discipline

- Quarterly full-repo OR pre/post-Phase OR cross-cutting question
- Frequency cap: ≤2 reviews/week
- Operator-approval for >50k tokens
- Current usage: <3% weekly Ollama budget (operator confirmed)
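
The caps above can be read as a simple pre-invocation gate. A rough sketch under stated assumptions: only the two thresholds (at most 2 reviews per week, operator approval above 50k tokens) come from this text; the name `may_invoke_deep_review`, its parameters, and the rolling 7-day window are illustrative.

```python
from datetime import datetime, timedelta
from typing import Sequence, Tuple

MAX_REVIEWS_PER_WEEK = 2           # frequency cap from the list above
APPROVAL_TOKEN_THRESHOLD = 50_000  # above this, operator approval is required


def may_invoke_deep_review(
    past_reviews: Sequence[datetime],
    now: datetime,
    estimated_tokens: int,
    operator_approved: bool = False,
) -> Tuple[bool, str]:
    """Return (allowed, reason) for a proposed deep-review invocation."""
    week = timedelta(days=7)
    # Assumption: "per week" means a rolling 7-day window ending at `now`.
    recent = [t for t in past_reviews if timedelta(0) <= now - t < week]
    if len(recent) >= MAX_REVIEWS_PER_WEEK:
        return False, "frequency cap reached"
    if estimated_tokens > APPROVAL_TOKEN_THRESHOLD and not operator_approved:
        return False, "operator approval required"
    return True, "ok"
```

Since all invocations remain operator-initiated, a gate like this would serve as a reminder for the operator, not as an automatic trigger.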

## Cross-validation rule

Every HIGH-severity DeepSeek finding MUST be cross-verified by
Pan Herbatka (ls/grep/cat evidence) before action. Bidirectional:
claude findings also sanity-checked. No reviewer is infallible.
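
One way to make this rule hard to skip is to encode it in the report data. The sketch below is illustrative only: `Finding`, its fields, and `actionable` are hypothetical names, and treating non-HIGH findings as actionable without a cross-check is an assumption, since the rule above only covers HIGH severity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Finding:
    title: str
    severity: str  # e.g. "HIGH", "MEDIUM", "LOW" (assumed scale)
    # Recorded ls/grep/cat commands and their output, as plain strings.
    evidence: List[str] = field(default_factory=list)
    # None = not yet cross-checked, True = confirmed, False = false positive.
    confirmed: Optional[bool] = None


def actionable(finding: Finding) -> bool:
    """HIGH findings may be acted on only once confirmed with evidence."""
    if finding.severity.upper() != "HIGH":
        return True  # assumption: the rule only gates HIGH severity
    return finding.confirmed is True and len(finding.evidence) > 0
```

Under this model, a finding that cross-validation refutes is recorded with `confirmed=False` and never becomes actionable, while its evidence stays in the report.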

## What ships

- decisions/0016-deepseek-v4-pro-deep-reviewer-cousin.md (full ADR)
- state/deep-reviews/README.md (landing zone + report structure)

## Tier (per ADR-0007)

**Lite**: single new ADR + single new directory README; <300 LoC
total (+287); no sensitive paths; no schema/runtime change. Eligible
for auto-merge in future (per ADR-0019 once it lands); today it is
operator-merge.

Refs: PR #184 (DeepSeek-findings remediation), ADR-0005 (base
cousin lanes), ADR-0006/0007 (sibling ADRs), Prof Kong handoff #1

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Reference
pdurlej/platform!185