dr(w3): record vps1000 sandbox restore pass #522

Merged
pdurlej merged 1 commit from codex/w3d-vps1000-sandbox-pass into main 2026-05-27 15:26:11 +02:00

Canary status: missing - not fired; touched scripts/dr and state evidence only, not mandatory canary paths.

Canary Context Pack

Product story

W3d needs evidence that restore mechanics work beyond one local Docker sandbox, without buying or relying on a third standing VPS. This PR records a second restore pass using vps1000 as an isolated Docker sandbox host.

What changed

  • Allows the W3d disposable drill wrapper to target vps1000, but only when W3D_ALLOW_LIVE_HOST_SANDBOX=1 is set explicitly.
  • Keeps rs2000 blocked as a restore target for this drill.
  • Generalizes W3d report wording so local and remote sandbox reports use the same evidence shape.
  • Adds the vps1000 restore report and updates W3/status notes.
  • Records the platform policy correction: no third standing VPS; use RS2000, VPS1000, Mac/local Docker, or temporary/serverless-per-minute only when truly needed.
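
The target gating described in the list above could look roughly like the sketch below. This is not the code in scripts/dr/w3d-disposable-vps-drill.sh; the function name `w3d_check_target` and the pass-through for other targets are assumptions made for illustration. Only the host names and the `W3D_ALLOW_LIVE_HOST_SANDBOX=1` opt-in come from this PR.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the target guard; w3d_check_target is an
# illustrative name, not a function known to exist in the real wrapper.
w3d_check_target() {
  local target="$1"
  case "$target" in
    rs2000)
      # rs2000 stays blocked as a restore target for this drill.
      echo "refusing: rs2000 is blocked as a restore target" >&2
      return 1
      ;;
    vps1000)
      # vps1000 is a live host, so it needs the explicit opt-in.
      if [ "${W3D_ALLOW_LIVE_HOST_SANDBOX:-0}" != "1" ]; then
        echo "refusing: vps1000 requires W3D_ALLOW_LIVE_HOST_SANDBOX=1" >&2
        return 1
      fi
      ;;
  esac
  # Assumption: any other target is treated as disposable and allowed.
  return 0
}
```

With this shape, `W3D_ALLOW_LIVE_HOST_SANDBOX=1 w3d_check_target vps1000` succeeds, while the same call without the variable, or any call naming rs2000, returns non-zero before the drill starts.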

Why it changed

The previous W3d pass proved local mechanics. The operator clarified that vps1000 is the right second sandbox host and that buying another server is not acceptable default infrastructure.

Files touched

  • scripts/dr/w3d-disposable-vps-drill.sh
  • scripts/dr/w3d-local-sandbox-drill.sh
  • state/reports/w3d-vps1000-sandbox-drill-2026-05-27.md
  • state/cycle/W3-dr-restore-confidence-output.md
  • state/STATUS_NOW.md

Runtime evidence

  • Target: vps1000 / v2202603338414444051, Docker-only isolated Compose project w3d-20260527t101906z.
  • Backup root: /opt/vps-home-platform-infra/backups/20260527-120006-critical.
  • Routed smoke: Traefik ping 200, Forgejo /api/healthz 200, Honcho /openapi.json 200.
  • Restored counts: Forgejo repositories 19, Forgejo users 7, Honcho documents 26141, message embeddings 13572, sessions 202, messages 13590.
  • RTO from backup-on-disk on vps1000 to routed smoke: 97s; target drill elapsed after staging: 130s.
  • Cleanup verified: no w3d* containers, volumes, or staging dirs remained on vps1000.
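
The routed smoke and cleanup checks listed above can be sketched as two small helpers. Both function names are hypothetical, and the URLs passed to `expect_200` would be whatever the sandbox routes resolve to; none of them are taken from the real drill scripts. The cleanup helper covers containers and volumes only, since the staging directory path is not stated in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the URL answers with HTTP 200.
expect_200() {
  local name="$1" url="$2" code
  # curl prints only the status code; a transport failure is reported as 000.
  code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url")" || code="000"
  if [ "$code" = "200" ]; then
    echo "ok $name $code"
  else
    echo "FAIL $name $code" >&2
    return 1
  fi
}

# Hypothetical helper: succeeds only if no containers or volumes whose
# name contains "w3d" remain (the Docker name filter matches substrings).
w3d_no_leftovers() {
  local containers volumes
  containers="$(docker ps -a --filter 'name=w3d' --format '{{.Names}}')"
  volumes="$(docker volume ls -q --filter 'name=w3d')"
  if [ -n "$containers$volumes" ]; then
    echo "leftover w3d resources: $containers $volumes" >&2
    return 1
  fi
}
```

A drill run would call `expect_200` once per routed endpoint (Traefik ping, Forgejo /api/healthz, Honcho /openapi.json) and `w3d_no_leftovers` after teardown, treating any non-zero exit as missing evidence.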

Known constraints

  • No RS2000 production restore, restart, apply, promote, or service mutation was performed.
  • No live OpenClaw/Iskra service on vps1000 was recreated or touched.
  • The sandbox did not receive direct production secret access; artifacts were staged through the operator machine.
  • Semantic OpenClaw/Iskra persona continuity is not proven here; split to W3e if needed.

Explicit out-of-scope

  • Buying or provisioning a third standing VPS.
  • Forgejo upgrade.
  • Runtime cleanup/destructive deletion.
  • Persona/memory semantic restore validation.

Requested decision

Merge to accept W3d restore mechanics as green across local Docker and vps1000 remote sandbox evidence.

Merge blockers

  • Any overclaim that this proves persona continuity.
  • Missing cleanup evidence.
  • Secret or private-data leakage in report text.

Spec sources read

  • AGENTS.md - Forgejo/identity and PR contract.
  • docs/forgejo-agent-operations.md - Forgejo API write contract.
  • state/cycle/W3-dr-restore-confidence-output.md - W3d acceptance state.
  • state/reports/w3d-local-sandbox-drill-2026-05-27.md - local W3d evidence shape.
  • Forgejo issue #433 - W3 DR/restore confidence coordination.
  • Recent W3d PR context #520/#521 - first local-pass closeout and operator correction.

Refs #433

dr(w3): record vps1000 sandbox restore pass
All checks were successful
base-is-main / guard (pull_request) Successful in 1s
canary-required / collect-diff (pull_request) Successful in 4s
patchwarden-client-dry-run / collect-diff (pull_request) Successful in 4s
patchwarden-pr-sanity / collect-diff (pull_request) Successful in 5s
canary-required / canary (pull_request) Has been skipped
patchwarden-client-dry-run / dry-run (pull_request) Successful in 25s
patchwarden-pr-sanity / sanity (pull_request) Successful in 24s
14c9ec2c7c
Reference
pdurlej/platform!522