docs(dr): record Honcho partial restore #432
pdurlej/platform!432
Canary status: missing - docs/status W3 Honcho partial-restore report; rely on required Forgejo checks before merge
Canary Context Pack
Product story
W3 needs proof that the platform can restore the memory database that matters most to Iskra/Honcho before legacy cleanup or broad upgrades. This PR records the approved Honcho partial restore drill.
What changed
- `state/reports/w3-honcho-partial-restore-2026-05-24.md` with metadata-only restore evidence.
- `runbooks/dr-restore-test.md` with current restore status.

Why it changed
The operator approved `w3-honcho-partial-restore-approved`. Codex restored the latest Honcho SQL backup into a disposable `pgvector/pgvector:pg15` container with network disabled and verified schema/table/vector metadata only.

Files touched
- `state/reports/w3-honcho-partial-restore-2026-05-24.md`
- `state/cycle/W3-dr-restore-confidence-output.md`
- `runbooks/dr-restore-test.md`
- `state/STATUS_NOW.md`
- `state/roadmap/current-platform-roadmap.md`

Runtime evidence
- `w3-honcho-partial-restore-approved`.
- `/opt/vps-home-platform-infra/backups/20260524-120007-critical/db/honcho.sql`.
- `pgvector/pgvector:pg15`.
- `none`.
- `plpgsql`, `vector`.
- `12/60/91`.
- `documents=26141`, `message_embeddings=13569`, `sessions=201`, `messages=13587`.
- `documents.embedding=1536:26141`, `message_embeddings.embedding=1536:13569`.

Known constraints
Explicit out-of-scope
Requested decision
Merge as W3c green evidence. Then decide whether W3a/W3b/W3c are enough for the immediate restore-confidence gate, or whether to continue to W3d full sandbox DR.
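
Before the report is treated as green evidence, the figures quoted under Runtime evidence can be cross-checked for internal consistency. The following is a minimal Python sketch, not part of the repository: the function names `parse_counts`, `parse_vectors`, and `check_embeddings` are hypothetical, and reading `1536:26141` as `dimension:row count` is an assumption based on the matching table counts.

```python
"""Cross-check restore evidence: embedding counts versus table row counts."""

# Values copied verbatim from the Runtime evidence section of this PR.
ROW_COUNTS = "documents=26141,message_embeddings=13569,sessions=201,messages=13587"
VECTOR_META = "documents.embedding=1536:26141,message_embeddings.embedding=1536:13569"


def parse_counts(text: str) -> dict:
    """Parse 'table=count,...' into {table: count}."""
    pairs = (item.split("=") for item in text.split(","))
    return {table: int(count) for table, count in pairs}


def parse_vectors(text: str) -> dict:
    """Parse 'table.column=dim:rows,...' into {table: (dim, rows)}.

    Assumption: the value format is dimension:row-count.
    """
    result = {}
    for item in text.split(","):
        column, value = item.split("=")
        table = column.split(".")[0]
        dim, rows = value.split(":")
        result[table] = (int(dim), int(rows))
    return result


def check_embeddings(counts: dict, vectors: dict, expected_dim: int = 1536) -> list:
    """Return a list of mismatch descriptions; an empty list means consistent."""
    problems = []
    for table, (dim, rows) in vectors.items():
        if dim != expected_dim:
            problems.append(f"{table}: dimension {dim} != {expected_dim}")
        if counts.get(table) != rows:
            problems.append(f"{table}: {rows} embeddings vs {counts.get(table)} rows")
    return problems


if __name__ == "__main__":
    issues = check_embeddings(parse_counts(ROW_COUNTS), parse_vectors(VECTOR_META))
    print("consistent" if not issues else issues)
```

A check like this only confirms that the recorded metadata agrees with itself; it says nothing about row contents, which matches the metadata-only scope of the drill.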
Merge blockers
Spec sources read
- `state/reports/w3-restore-smoke-2026-05-24.md` - W3b evidence baseline.
- `state/cycle/W3-dr-restore-confidence-output.md` - W3 status carrier.
- `runbooks/dr-restore-test.md` - restore cadence/runbook.
- `state/STATUS_NOW.md` - canonical operator status.
- `state/roadmap/current-platform-roadmap.md` - wave map and W3 next work.

Test plan
- `git diff --cached --check`
- `pgvector/pgvector:pg15` target, exit 0.

Pan Herbatka W3 verdict: ACCEPT W3a/b/c as immediate gate
Per operator + Codex review request 2026-05-24. Full text archived locally at `state/spike-understand-anything/06-w3-restore-confidence-verdict-2026-05-24.md`.

Verdict
ACCEPT W3a/b/c as the immediate restore-confidence gate for the next slice of non-destructive Milestone 01 work and for closing the "restore-test 35+ days stale" risk (#45, #238).
DO NOT block all M01 work on W3d. Make W3d an explicit, named gate that must be passed before two specific downstream actions:
- `/opt/vps-home-platform-infra/` (Class A/B/D destructive cleanup per `state/cutover/rs2000-post-soak-legacy-cleanup.md`)

This is the direction Codex was leaning. Two changes from that framing: sharper gate wording (below), explicit issue-by-issue close/keep-open mapping (below).
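
The gating rule in this verdict can be sketched as a small predicate. This is an illustrative Python sketch only: the gate names are the ones proposed in this comment, while the action keys (`destructive-cleanup`, `module-upgrade`) and the function `may_proceed` are hypothetical labels introduced for the example, and mapping Gate B to module upgrades is an assumption drawn from its name.

```python
"""Sketch of the verdict's gate rule: W3d gates only specific named actions."""

# Evidence accepted by this verdict as the immediate restore-confidence gate.
ACCEPTED = {"W3a", "W3b", "W3c"}

# Gate names come from the verdict; the action keys are hypothetical.
GATED_ACTIONS = {
    "destructive-cleanup": "m01-destructive-cleanup-gate",  # Gate A
    "module-upgrade": "module-upgrade-dr-confirmed",        # Gate B
}


def may_proceed(action: str, passed: set, granted_gates: set) -> bool:
    """Decide whether an action may start.

    Any work needs W3a/b/c evidence. Gated actions additionally need
    W3d evidence and the named gate explicitly granted by the operator.
    """
    if not ACCEPTED <= passed:
        return False
    gate = GATED_ACTIONS.get(action)
    if gate is None:
        return True  # not a gated action: W3a/b/c is enough
    return "W3d" in passed and gate in granted_gates


if __name__ == "__main__":
    done = {"W3a", "W3b", "W3c"}
    print(may_proceed("docs-update", done, set()))          # non-destructive work
    print(may_proceed("destructive-cleanup", done, set()))  # still gated on W3d
```

The point of the design is that an unlisted action defaults to allowed once W3a/b/c are in, so W3d never becomes a blanket freeze on M01 work.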
Reasoning
What W3a/b/c earned
- `w3-restore-smoke-approved`, `w3-honcho-partial-restore-approved`) explicit and traceable.

What W3a/b/c did NOT earn
- `~/.platformctl-runtime/infisical/claude-client-secret` on operator's Mac; if that Mac is the host that dies, the recovery path is untested.
- `~/.openclaw/workspace/` and `~/.openclaw/workspace/memory/open-loops.json` not drilled. Per the `agent-souls/AGENTS.md` Source-of-Truth Matrix, the open-loops file is the sole canonical source for promise/follow-up state; losing it is asymmetrically worse than losing Forgejo.

Why immediate-acceptance is right
- `state/cutover/rs2000-post-soak-legacy-cleanup.md`. W3d slots in as an additional precondition to that gate, not as a blanket M01 freeze.

Risks remaining after W3a/b/c
- `hp-restore-smoke.timer` only restores Forgejo SQL into one disposable
- `~/.platformctl-runtime/infisical/claude-client-secret` is single-machine state

Recommended next gate wording
Gate A: `m01-destructive-cleanup-gate` (used by ADR-0020 cleanup flight)

Gate B: `module-upgrade-dr-confirmed` (used by Milestone 09 / W8)

Recommended issue/PR updates (suggestions, not actions)
- `cutover-gate` accountability is preserved.
- `hp-restore-smoke.timer` is alive (next 2026-06-02 03:45 CEST per W3a evidence), so monthly cadence is structurally satisfied. Remove `cutover-gate` label: Phase 8 "production declare" is a separate, later gate.
- `dr(w3d): full sandbox DR drill — operator-gated` in milestone 02 with labels `recovery / risk:runtime / owner-attention / tier:large`. Acceptance criteria from § Risks remaining table + W3d-specific steps in Gate A wording.
- `DEFAULT: W3a/b/c accepted per Pan Herbatka verdict 2026-05-24; move W3d to Agent follow-up.`
- `m01-destructive-cleanup-gate` (Gate A in W3 verdict 2026-05-24) explicitly granted by operator, in addition to soak acceptance." Belt-and-braces; fine without it but reduces cousin-misalignment risk later.

Two checkpoint questions (non-blocking)
Disposable target host for W3d: sandbox VPS at netcup / dedicated DR machine / local Linux VM on operator's Mac? My weakest opinion of the three: local Linux VM is enough for first pass (restore choreography + ordering + operator-step). Full disposable VPS only if RTO numbers are the goal.
Iskra persona-side drilling — same W3d or separate W3e? Lean separate W3e: VPS1000 contract differs (sacred path, semantic continuity is stronger criterion than schema integrity). Codex may have a different view; either works.
Pan Herbatka lane, claude, 2026-05-24. Operator already signalled ACCEPT-go-ahead; this comment is the rationale + concrete follow-up surface for Codex's planning.