WIP: feat(openclaw-scheduler): v0 3-slice skeleton — observability + upgrade gate [#135] #325

Closed
claude wants to merge 1 commit from claude/feat-openclaw-scheduler-observability into main
Collaborator

Summary

Pre-impl skeleton + Spec Kit for issue #135 — OpenClaw scheduler observability, cron store reconciliation, upgrade go/no-go gate (Iskra-authored, 2026-05-09 evidence).

Status: draft — for review, NOT for merge.

Author: claude. Implementer: codex (anticipated). Reviewers: claude + glm.

Siblings: PR #323 Agent Access Plane v0, PR #324 Agent Wake Bus v0. Same pattern.


TL;DR

OpenClaw scheduler state on VPS1000 has fragmented across 4 stores (active cron + legacy cron + systemd timers + promise ledger). 2026-05-09 evidence: forgejo-self-audit-20260504 checkpoint existed without a backing scheduled job. A blind OpenClaw upgrade in this state could silently break Iskra reminders + delivery receipts + recovery paths.

v0 introduces:

  • Inventory — read all 4 stores, structured output
  • Drift detection — cross-reference, severity-tagged report
  • Snapshot procedure — pre-upgrade backup with SHA-256 manifest
  • Preflight checklist — 8 checks, go/no-go gate
  • Operator-gated repair — migration / removal / ownership annotation, ALL require --confirm

Operator decides whether to upgrade OpenClaw with confidence, not vibes.

v0 ships in 3 slices over ~3 weeks. No silent mutations. Snapshot before any state change.
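
The drift check described above can be sketched as a set comparison between promise-ledger checkpoints and scheduled jobs. This is a minimal illustration, not the planned implementation: the `Drift` and `Severity` names, the severity assignments, and every job ID except `forgejo-self-audit-20260504` are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    # Hypothetical severity levels; the real DriftSeverity values are not given in this PR.
    WARN = "warn"
    CRITICAL = "critical"


@dataclass(frozen=True)
class Drift:
    job_id: str
    severity: Severity
    reason: str


def detect_drift(ledger_ids: set[str], scheduled_ids: set[str]) -> list[Drift]:
    """Cross-reference promise-ledger checkpoints against scheduled jobs."""
    drifts = []
    # A checkpoint with no backing job: the promised work will never fire.
    for job_id in sorted(ledger_ids - scheduled_ids):
        drifts.append(Drift(job_id, Severity.CRITICAL, "ledger checkpoint without scheduled job"))
    # A job with no ledger entry: it runs, but nothing records the promise.
    for job_id in sorted(scheduled_ids - ledger_ids):
        drifts.append(Drift(job_id, Severity.WARN, "scheduled job without ledger checkpoint"))
    return drifts


# Example input modelled on the 2026-05-09 evidence; the other IDs are made up.
report = detect_drift(
    ledger_ids={"forgejo-self-audit-20260504", "daily-reminder"},
    scheduled_ids={"daily-reminder", "orphan-timer"},
)
for entry in report:
    print(entry.severity.value, entry.job_id, entry.reason)
```

The function only reads its inputs and returns a report, which matches the read-only-first principle: nothing here repairs the drift it finds.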


What's in here

9 files, ~900 LOC.

Spec Kit (6 markdowns, ~800 lines)

docs/specs/openclaw-scheduler-observability-v0/:

  • 00-constitution.md — 7 principles (scheduler ownership explicit; ledger ↔ jobs consistency; checklist-gated upgrade; snapshot before mutation; read-only first; openclaw cron list MUST NOT fail silently; cross-store reconciliation operator-only), 8 anti-patterns, boundary.
  • 01-specify.md — 8 FRs, 6 NFRs, 5 open questions, 6 falsifiable success criteria.
  • 02-plan.md — 3 slices (a/b/c), M1-M4 milestones, 6 risks + mitigations, rollback per slice, composability.
  • 03-tasks.md — atomic tasks, ~1600 LOC estimate for all of v0, codex execution notes.
  • 04-implement-notes.md — 10 pre-impl decisions (read jobs.json directly NOT via broken openclaw cron list; tar+SHA-256 snapshot; ownership registry sacred-adjacent; JSONL audit; run as openclaw user; dry-run default; retention 5/30d; schema drift = warn; systemd ownership canonical; cross-store conflict = operator decision).
  • README.md — index.
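
The "tar+SHA-256 snapshot" decision could look roughly like the sketch below: archive a directory, write a manifest of per-file digests, and later compare the live files against that manifest. The function names, the manifest layout (JSON mapping relative path to digest), and the file names are assumptions for illustration; the spec may choose differently.

```python
import hashlib
import json
import tarfile
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 so large files are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(source_dir: Path, dest_dir: Path, name: str) -> Path:
    """Archive source_dir into dest_dir and write a per-file SHA-256 manifest."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    files = sorted(p for p in source_dir.rglob("*") if p.is_file())
    manifest = {str(p.relative_to(source_dir)): sha256_of(p) for p in files}
    with tarfile.open(dest_dir / f"{name}.tar", "w") as tar:
        tar.add(source_dir, arcname=name)
    manifest_path = dest_dir / f"{name}.manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return manifest_path


def verify(source_dir: Path, manifest_path: Path) -> list[str]:
    """Return relative paths that are missing or whose digest changed."""
    manifest = json.loads(manifest_path.read_text())
    return [
        rel
        for rel, digest in manifest.items()
        if not (source_dir / rel).is_file() or sha256_of(source_dir / rel) != digest
    ]


# Demonstration against a throwaway directory, never a real scheduler store.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "cron"
    src.mkdir()
    (src / "jobs.json").write_text('{"jobs": []}')
    manifest_file = snapshot(src, Path(tmp) / "snapshots", "pre-upgrade")
    clean = verify(src, manifest_file)
    (src / "jobs.json").write_text('{"jobs": [1]}')
    changed = verify(src, manifest_file)
```

Keeping the manifest next to the archive lets a later preflight step confirm that nothing changed between snapshot and upgrade without unpacking the tar.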

Code skeleton (py_compile clean)

scripts/openclaw_scheduler/:

  • _types.py — JobRecord / DriftEntry / DriftReport / SchedulerStore (4-value enum: active_cron, legacy_cron, systemd_timers, promise_ledger) / DriftSeverity / JobStatus
  • inventory.py — read_active_cron, read_legacy_cron, read_systemd_timers, read_promise_ledger, full_inventory, main stubs with TODO(codex Slice a) markers + canonical paths
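
For orientation, the type layer might be shaped like the sketch below. Only the four `SchedulerStore` values and the names `JobRecord` and `full_inventory` come from this PR; the record fields, the reader-injection signature, and the stub readers are assumptions, not the contents of `_types.py` or `inventory.py`.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class SchedulerStore(Enum):
    ACTIVE_CRON = "active_cron"
    LEGACY_CRON = "legacy_cron"
    SYSTEMD_TIMERS = "systemd_timers"
    PROMISE_LEDGER = "promise_ledger"


@dataclass(frozen=True)
class JobRecord:
    job_id: str                   # hypothetical field
    store: SchedulerStore
    owner: Optional[str] = None   # hypothetical field; ownership is meant to be explicit


Reader = Callable[[], list[JobRecord]]


def full_inventory(readers: dict[SchedulerStore, Reader]) -> dict[SchedulerStore, list[JobRecord]]:
    """Call one read-only reader per store; refuse to run with a store missing."""
    missing = set(SchedulerStore) - set(readers)
    if missing:
        names = ", ".join(sorted(store.value for store in missing))
        raise ValueError(f"no reader registered for: {names}")
    return {store: readers[store]() for store in SchedulerStore}


# Stub readers standing in for the real read_* functions.
stub_readers = {store: (lambda s=store: [JobRecord("example-job", s)]) for store in SchedulerStore}
inventory = full_inventory(stub_readers)
```

Failing loudly when a store has no reader mirrors the constitution's rule that inventory must not fail silently.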

Cross-references

  • Issue #135 (parent, Iskra-authored 2026-05-09 evidence)
  • pdurlej/iskra-openclaw runtime — sacred path boundary; v0 expects no ~/.openclaw/cron/ schema changes during platform v0 development
  • PR #323 Agent Access Plane — same audit JSONL pattern
  • PR #324 Agent Wake Bus — could emit alerts on detected drift
  • ADR-0004 Iskra telemetry boundary
  • ADR-0018 — no silent repair, operator-gated mutations
  • Pan Herbatka day-1 checklist § 5 — sacred paths apply
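
The combination of dry-run default, `--confirm` gating, and a JSONL audit trail could be wired together as in this sketch. The program name, action names, and audit field names are assumptions; only the `--confirm` flag and the three repair kinds (migration, removal, ownership annotation) are taken from the description above.

```python
import argparse
import io
import json
from datetime import datetime, timezone


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="repair-sketch")  # hypothetical program name
    parser.add_argument("action", choices=["migrate", "remove", "annotate"])
    parser.add_argument("job_id")
    parser.add_argument(
        "--confirm",
        action="store_true",
        help="apply the change; without this flag the command is a dry run",
    )
    return parser


def run(argv: list[str], audit_stream) -> str:
    """Record the request as one JSON line, then mutate only if --confirm was given."""
    args = build_parser().parse_args(argv)
    mode = "applied" if args.confirm else "dry-run"
    event = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "action": args.action,
        "job_id": args.job_id,
        "mode": mode,
    }
    audit_stream.write(json.dumps(event, sort_keys=True) + "\n")
    if args.confirm:
        pass  # a real repair would snapshot first, then mutate; omitted in this sketch
    return mode


# An in-memory stream stands in for the audit log file.
audit = io.StringIO()
first = run(["remove", "orphan-timer"], audit)
second = run(["remove", "orphan-timer", "--confirm"], audit)
audit_lines = audit.getvalue().splitlines()
```

Writing the audit line before the mutation means even a dry run or a failed repair leaves a record of what was requested.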

Tier

This PR: Trivial (draft scaffold, py_compile clean stubs, no runtime/sacred-path mutation).

Future slice PRs: Lite (Slices a/b — read-only inventory + snapshot + preflight), Full (Slice c — repair scripts are class/security-sensitive).


Operator action

Review scope. If accept → close + signal codex to start Slice (a). If amend → comment on draft.

Refs #135

WIP: feat(openclaw-scheduler): v0 3-slice skeleton — scheduler observability + upgrade gate [#135]
All checks were successful
base-is-main / guard (pull_request) Successful in 1s
canary-required / collect-diff (pull_request) Successful in 3s
patchwarden-pr-sanity / collect-diff (pull_request) Successful in 3s
canary-required / canary (pull_request) Has been skipped
patchwarden-pr-sanity / sanity (pull_request) Successful in 20s
9bd2e6e011
Pre-impl skeleton + Spec Kit for issue #135 OpenClaw Scheduler Observability.

Status: draft — for review, NOT for merge. Pattern: hybrid GitHub Spec Kit +
code skeleton (siblings: PR #323 Agent Access Plane, PR #324 Agent Wake Bus;
pdurlej/iskra-openclaw PR #276/#278/#279/#281).

Source: issue #135 (Iskra-authored, 2026-05-09 evidence:
forgejo-self-audit-20260504 checkpoint exists without backing scheduled job).

What's in here (~900 LOC):

Spec Kit (6 markdowns, ~800 lines):
- 00-constitution.md — 7 principles (scheduler ownership explicit, ledger ↔
  jobs consistency, checklist-gated upgrade, snapshot before mutation,
  read-only first, openclaw cron list MUST NOT fail silently, cross-store
  reconciliation is operator-decision), 8 anti-patterns, boundary.
- 01-specify.md — 8 FRs (inventory, drift, repair, snapshot, preflight, gate,
  audit, ownership), 6 NFRs, 5 open questions, 6 falsifiable success criteria.
- 02-plan.md — 3 slices (a/b/c), M1-M4, 6 risks, rollback per slice,
  composability with OpenClaw upgrade pipeline + Wake Bus drift alerts.
- 03-tasks.md — atomic tasks, ~1600 LOC v0 estimate, codex execution notes.
- 04-implement-notes.md — 10 pre-impl decisions (read jobs.json directly NOT
  via `openclaw cron list`; tar+SHA-256 snapshot; ownership registry sacred-
  adjacent; JSONL audit; run as openclaw user; dry-run default; 5/30d
  retention; schema drift = warn; systemd ownership canonical for system jobs;
  cross-store conflict = operator-decision).
- README.md — index.

Code skeleton (py_compile clean):
- scripts/openclaw_scheduler/_types.py — JobRecord, DriftEntry, DriftReport,
  SchedulerStore (4-value enum: active_cron, legacy_cron, systemd_timers,
  promise_ledger), DriftSeverity, JobStatus
- scripts/openclaw_scheduler/inventory.py — read_active_cron, read_legacy_cron,
  read_systemd_timers, read_promise_ledger, full_inventory, main stubs with
  TODO(codex Slice a) markers + canonical paths

Cross-references:
- Issue #135 (Iskra-authored parent + 2026-05-09 evidence)
- pdurlej/iskra-openclaw runtime (sacred path boundary; v0 expects no schema
  changes during platform v0 development)
- PR #323 (Agent Access Plane) — same audit JSONL pattern
- PR #324 (Wake Bus) — could emit alerts on detected drift
- ADR-0004 Iskra telemetry boundary
- ADR-0018 — no silent repair, operator-gated mutations
- Pan Herbatka day-1 checklist § 5 sacred paths

Tier: this PR Trivial (draft scaffold). Future slice PRs: Lite (a/b), Full (c)
per ADR-0007 — Slice (c) repair scripts are class/security-sensitive.

Refs #135
Collaborator

Wave 0 Fork B triage — close/rewrite

Role: executor
Actor: codex
Decision: close this PR; keep #135 as the live work item.

Reasoning:

  • This PR is explicitly marked draft / NOT for merge.
  • It is conceptually useful, but it is a stale prebuild from before the current roadmap/milestone train.
  • OpenClaw scheduler observability belongs to Milestone 08. It should not ride along with Agent Access / Wake Bus Milestone 06 work.

Recommended rewrite path:

  1. New #135-focused PR with docs/spec + read-only inventory contract only.
  2. Keep any VPS1000/OpenClaw runtime checks read-only until an explicit operator gate.
  3. Add implementation slices only after the inventory contract is accepted.

Spec sources read: docs/forgejo-agent-operations.md, state/roadmap/current-platform-roadmap.md, decisions/0021-ddd-bounded-contexts-for-platform-monorepo.md, decisions/0022-module-source-and-release-boundaries.md.

codex closed this pull request 2026-05-24 07:58:56 +02:00

Reference
pdurlej/platform!325