feat(backup): add unique-knowledge offsite backup runner #701

Merged
pdurlej merged 1 commit from codex/698-unique-knowledge-backup into main 2026-06-04 02:05:09 +02:00
Collaborator

Canary status: missing — fire canary 3+3 manually before merge

Summary

Adds the ADR-0013 / #698 unique-knowledge backup runner for local + rclone-crypt pCloud backups, plus restore verification and a live evidence report.

Closes #698.

Canary Context Pack

Product story

The platform now has an operator-run path to preserve irreplaceable knowledge outside RS2000: local archive plus encrypted pCloud offsite, with restore verification instead of trusting backup creation alone.

What changed

  • Added config/backup/unique-knowledge.sources.json for the default unique-knowledge source set.
  • Added scripts/backup/unique_knowledge_backup.py.
  • Added scripts/backup/restore_check.py.
  • Added a Mac LaunchDaemon/LaunchAgent example and runbook.
  • Added tests for backup receipts, archive shape, restore-check failure behavior, and unsafe source names.
  • Added sanitized live evidence under state/reports/unique-knowledge-backup-2026-06-04.md.

Why it changed

ADR-0013 chose rclone-crypt pCloud plus a pluggable local copy. #698 required actual restore verification, not only scripts.

Files touched

  • config/backup/unique-knowledge.sources.json
  • scripts/backup/
  • runbooks/unique-knowledge-backup.md
  • state/reports/unique-knowledge-backup-2026-06-04.md
  • tests/test_unique_knowledge_backup.py

Relevant context

  • ADR-0013: offsite + local backup of unique knowledge.
  • #698: implementation issue.
  • #674: backup-continuity phase link.

Runtime evidence

A full live run completed:

  • run id 20260603T231146Z;
  • archive size 3133883380 bytes;
  • archive sha256 c84a6f7455d096bbcaf049d7993641fccf58e582017d2ee64d410f3e234fee37;
  • rclone status copied to pcloud_crypt;
  • restore-check from the pCloud crypt copy verified 12/12 captured sources and 387973 archive entries.
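
The size and sha256 figures above can be re-checked independently against a restored archive. The following is a minimal sketch, not the code in scripts/backup/restore_check.py; the function names `sha256_of_file` and `archive_matches_receipt` are hypothetical, and it assumes the receipt exposes the expected size and digest as plain values.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Stream the file in chunks so a multi-gigabyte archive never sits in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_matches_receipt(path: Path, expected_size: int, expected_sha256: str) -> bool:
    """Compare the cheap size check first, then the full digest."""
    if path.stat().st_size != expected_size:
        return False
    return sha256_of_file(path) == expected_sha256.lower()
```

Checking the size before hashing lets a truncated download fail fast without reading roughly 3 GB from disk.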

Known constraints

  • Backup archives contain private/secret-bearing state and are intentionally not committed.
  • Receipts and reports record metadata only.
  • Remote hosts did not require rsync; the script used tar-over-ssh.
  • The LaunchDaemon plist is an example; loading it on the target old MacBook is an operator host setup step.

Explicit out-of-scope

  • No pCloud credential creation in repo.
  • No rclone crypt secret in repo.
  • No production service restart.
  • No deletion or pruning.
  • No broad M10 server upgrade.

Requested decision

Approve this as the first completed ADR-0013 backup-continuity implementation and live restore proof.

Merge blockers

  • Secret leakage in code/docs/reports.
  • Restore-check unable to validate the offsite archive.
  • Source set judged too broad or too narrow for #698.

Spec sources read

  • decisions/0013-offsite-backup-rclone-crypt-pcloud.md — ADR contract.
  • state/strategy/platform-maturity-roadmap-2026-06-01.md — Phase-1 backup continuity context.
  • scripts/cutover/backup-before-apply.sh — existing backup receipt/style reference.
  • scripts/cutover/README.md — existing backup helper runbook style.
  • Forgejo issue #698 body — acceptance criteria.

Validation

  • uv run pytest tests/test_unique_knowledge_backup.py
  • uv run pytest tests/test_unique_knowledge_backup.py tests/test_honcho_log_privacy.py
  • PYTHONPATH=control-plane uv run --project control-plane python -m platformctl.cli validate all --json
  • git diff --check
  • live backup + pCloud crypt copy + restore-check, recorded in state/reports/unique-knowledge-backup-2026-06-04.md
feat(backup): add unique-knowledge offsite backup runner
Some checks failed
base-is-main / guard (pull_request) Successful in 1s
canary-required / canary (pull_request) Waiting to run
patchwarden-pr-sanity / collect-diff (pull_request) Successful in 4s
patchwarden-pr-sanity / sanity (pull_request) Waiting to run
python-ci / Python 3.11 (pull_request) Successful in 42s
canary-required / collect-diff (pull_request) Successful in 4s
patchwarden-client-dry-run / collect-diff (pull_request) Successful in 4s
patchwarden-client-dry-run / dry-run (pull_request) Waiting to run
python-ci / Python 3.12 (pull_request) Successful in 42s
python-ci / Python 3.13 (pull_request) Has been cancelled
54071bd046
codex force-pushed codex/698-unique-knowledge-backup from 54071bd046
to dc7765c613
All checks were successful
base-is-main / guard (pull_request) Successful in 2s
canary-required / collect-diff (pull_request) Successful in 4s
python-ci / Python 3.11 (pull_request) Successful in 40s
python-ci / Python 3.12 (pull_request) Successful in 42s
canary-required / canary (pull_request) Successful in 15s
patchwarden-pr-sanity / sanity (pull_request) Successful in 5m41s
patchwarden-client-dry-run / collect-diff (pull_request) Successful in 3s
patchwarden-pr-sanity / collect-diff (pull_request) Successful in 4s
python-ci / Python 3.13 (pull_request) Successful in 42s
patchwarden-client-dry-run / dry-run (pull_request) Successful in 18s
2026-06-04 01:55:51 +02:00
Author
Collaborator

Patchwarden PR sanity

  • Status: advisory_findings
  • PR: 701
  • Commit: dc7765c61363ee04babe695590bd731c65652075
  • Security-sensitive label: missing
  • Authority: advisory model review plus deterministic blockers only
  • 3+3 canary: still alive; this does not replace it

Deterministic findings

No deterministic findings.

Model reviewers

global-glm / glm-5.1:cloud

  • Status: ok

  • Verdict: NOT_OK

  • high Command injection via unvalidated remote_host in SSH commands

    • Evidence: In scripts/backup/unique_knowledge_backup.py, remote_host from the manifest JSON is passed directly to SSH without validation. Lines ~219-236 (copy_remote_path_tar), ~170 (remote_path_exists), ~179 (remote_path_is_dir), and ~195 (r`
    • Next: Add a require_safe_hostname function that validates remote_host against a pattern like ^[A-Za-z0-9][A-Za-z0-9.-]{0,253}$ (no spaces, no dashes at start, no SSH option characters). Call it in copy_remote_path, copy_remote_path_tar, capture_postgres_dumpall, remote_path_exists, `remote_p
  • medium Unquoted remote_path in SSH test commands enables argument injection

    • Evidence: In scripts/backup/unique_knowledge_backup.py lines ~170-180, remote_path_exists and remote_path_is_dir execute subprocess.run(["ssh", remote_host, "test", "-e", remote_path], ...). The remote_path string is not shell-quoted before b
    • Next: Use shlex.quote(remote_path) in the SSH remote command strings within remote_path_exists and remote_path_is_dir, consistent with how copy_remote_path_tar quotes paths.
  • low Overly broad hex-string redaction may mask legitimate error context

    • Evidence: In scripts/backup/unique_knowledge_backup.py line ~30, SECRETISH_RE includes ([A-Fa-f0-9]{40}) which matches any 40+ character hex string. This could redact non-secret hex values like tar archive checksums or size counters that appear i
    • Next: Either narrow the hex pattern to only match known secret formats (e.g., require sk- prefix for OpenAI keys, which is already covered), or add a comment acknowledging the trade-off. Consider adding equivalent redaction to restore_check.py for consistency.
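
The first two findings above could be addressed along these lines. This is a sketch under assumptions, not the merged code: `require_safe_hostname` follows the reviewer's suggested name and pattern, while `build_remote_test_command` and the host alias `rs2000` in the usage note are hypothetical. It also assumes manifest hosts are bare hostnames or SSH config aliases, since the suggested pattern rejects `user@host` forms and underscores.

```python
import re
import shlex

# Pattern proposed in the review: must start with an alphanumeric character,
# so a value such as "-oProxyCommand=..." can never be parsed by ssh as an option.
SAFE_HOSTNAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9.-]{0,253}$")


def require_safe_hostname(remote_host: str) -> str:
    """Return the host unchanged, or raise if it could be read as an ssh option."""
    if not SAFE_HOSTNAME_RE.fullmatch(remote_host):
        raise ValueError(f"unsafe remote host: {remote_host!r}")
    return remote_host


def build_remote_test_command(remote_host: str, remote_path: str, flag: str = "-e") -> list[str]:
    """Build an ssh argv that runs `test <flag> <path>` on the remote host.

    ssh joins its command arguments into one string that the remote shell
    parses again, so the path is quoted for that remote shell.
    """
    if flag not in {"-e", "-d"}:
        raise ValueError(f"unsupported test flag: {flag!r}")
    remote_command = f"test {flag} {shlex.quote(remote_path)}"
    return ["ssh", require_safe_hostname(remote_host), remote_command]
```

For example, `build_remote_test_command("rs2000", "/srv/my data")` yields `["ssh", "rs2000", "test -e '/srv/my data'"]`, which can be passed to `subprocess.run` without `shell=True`.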

global-deepseek / deepseek-v4-pro:cloud

  • Status: ok

  • Verdict: OK

  • medium Restore check does not validate symlink targets

    • Evidence: scripts/backup/restore_check.py: list_archive_members() only checks for absolute paths and '..' in member names, but does not inspect symlink targets. A malicious archive could contain a symlink with an absolute target (e.g., 'foo -> /etc/p
    • Next: Add symlink target validation after extraction (e.g., check that all symlinks point within the restore root) or use tar with --no-same-owner and --delay-directory-restore, and emit a warning for any absolute symlink targets.
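
One way to act on this finding is to inspect link targets while listing the archive, before anything is extracted. The sketch below is illustrative only: `find_unsafe_members` is a hypothetical helper, not the `list_archive_members()` function in scripts/backup/restore_check.py, and the check is purely lexical, so it does not follow chains of symlinks.

```python
import posixpath
import tarfile


def _escapes_root(normalized: str) -> bool:
    """True if a normalized relative path climbs above the restore root."""
    return normalized == ".." or normalized.startswith("../")


def find_unsafe_members(archive_path: str) -> list[str]:
    """Return a description of every member whose name or link target leaves the root."""
    problems = []
    with tarfile.open(archive_path, "r:*") as archive:
        for member in archive:
            name = posixpath.normpath(member.name)
            if posixpath.isabs(member.name) or _escapes_root(name):
                problems.append(f"{member.name}: member path escapes restore root")
                continue
            if member.issym() or member.islnk():
                target = member.linkname
                # Symlink targets resolve relative to the link's own directory;
                # hard-link targets are relative to the archive root.
                base = posixpath.dirname(name) if member.issym() else ""
                resolved = posixpath.normpath(posixpath.join(base, target))
                if posixpath.isabs(target) or _escapes_root(resolved):
                    problems.append(f"{member.name}: link target {target!r} escapes restore root")
    return problems
```

A restore check could fail, or at least warn, when the returned list is non-empty. On Python 3.12 and later, extracting with `tarfile`'s `filter="data"` option provides a comparable built-in safeguard.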

redteam / kimi-k2.6:cloud

  • Status: error
  • Verdict: -
  • Note: ReadTimeout: The read operation timed out
  • Findings: none

Policy notes

  • GLM 5.1 + DeepSeek V4 Pro are the operator-required model mix for this bot.
  • Optional red-team model is enabled only when PLATFORMCTL_PR_SANITY_REDTEAM_MODEL is configured.
  • Auto-merge is not enabled here.
pdurlej deleted branch codex/698-unique-knowledge-backup 2026-06-04 02:05:09 +02:00
Reference
pdurlej/platform!701