WIP: docs(specs): OpenClaw Scheduler Observability — Spec Kit + inventory code #568

Closed
ollama wants to merge 2 commits from ollama/dziadek-openclaw-scheduler-spec into main
Collaborator

WIP — OpenClaw Scheduler Observability Spec Kit

Cherry-pick of claude's original pre-impl skeleton (2026-05-17, from branch claude/feat-openclaw-scheduler-observability, never merged).

Original author: claude
Refreshed by: dziadek (DeepSeek-v4-Pro / ollama)
Source issue: #135 (Iskra-authored, OPEN, 5 comments, milestone 08)

The problem

Iskra discovered (2026-05-09) that OpenClaw's scheduler on VPS1000 has split brain:

  • Active cron store: missing/empty
  • Legacy cron store: contains old jobs
  • systemd timers: own some automations
  • openclaw cron list: crashes
  • A checkpoint exists in the promise ledger with no backing scheduled job

Operator won't upgrade OpenClaw until this is understood.
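
The last symptom above (a ledger checkpoint with no backing job) can be expressed as a small set comparison. This is a minimal sketch, not the code shipped in this PR: the function name `find_unbacked_checkpoints`, the dict-of-sets input shape, and the assumption that checkpoints and jobs share string ids are all hypothetical. The job ids `old-backup` and `nightly-sync` are made up for illustration; only `forgejo-self-audit-20260504` comes from the issue evidence.

```python
from typing import Dict, Iterable, List, Set


def find_unbacked_checkpoints(
    ledger_checkpoints: Iterable[str],
    jobs_by_store: Dict[str, Set[str]],
) -> List[str]:
    """Return ledger checkpoint ids that no scheduler store has a job for."""
    backed: Set[str] = set()
    for job_ids in jobs_by_store.values():
        backed |= job_ids
    # Sorted so the report is stable between runs.
    return sorted(cp for cp in set(ledger_checkpoints) if cp not in backed)


if __name__ == "__main__":
    # Illustrative data only (assumed shapes, invented job ids).
    stores = {
        "active_cron": set(),                # missing/empty
        "legacy_cron": {"old-backup"},       # contains old jobs
        "systemd_timers": {"nightly-sync"},  # owns some automations
    }
    ledger = ["forgejo-self-audit-20260504", "nightly-sync"]
    print(find_unbacked_checkpoints(ledger, stores))
```

The check is read-only: it reports the mismatch and leaves the decision about repair to the operator.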

What this ships (9 files, ~767 LOC)

Spec Kit (docs/specs/openclaw-scheduler-observability-v0/, 6 files):

  • 00-constitution — 7 principles (ledger↔jobs consistency, checklist-gated upgrade, snapshot-before-mutation, read-only first)
  • 01-specify — 8 FRs (inventory, drift, repair, snapshot, preflight, gate, audit, ownership)
  • 02-plan — 3 slices (a/b/c), M1-M4
  • 03-tasks — ~1600 LOC v0 estimate
  • 04-implement-notes — 10 pre-impl decisions
  • README

Code skeleton (scripts/openclaw_scheduler/):

  • _types.py — JobRecord, DriftEntry, DriftReport, SchedulerStore enum
  • inventory.py — read_active_cron, read_legacy_cron, read_systemd_timers, full_inventory stubs

Key design decisions

  • Read cron stores directly (NOT via openclaw cron list which crashes)
  • tar+SHA-256 snapshot before any mutation
  • Ownership registry: sacred-adjacent
  • JSONL audit trail
  • Dry-run as default; operator-gated mutations
  • Cross-store conflict → operator decision, not silent repair
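
The "tar+SHA-256 snapshot before any mutation" decision could look roughly like the sketch below. It is an assumption-laden illustration, not the planned implementation: the function name `snapshot_paths`, the gzip compression, and the archive naming are hypothetical, and the real store paths are not given in this PR text.

```python
import hashlib
import tarfile
from pathlib import Path
from typing import Iterable, Tuple


def snapshot_paths(paths: Iterable[Path], dest: Path) -> Tuple[Path, str]:
    """Write a gzip tar of the existing paths to dest; return (dest, sha256 hex)."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(dest), "w:gz") as tar:
        for p in paths:
            if p.exists():
                # Store without the leading slash so extraction stays relative.
                tar.add(str(p), arcname=p.as_posix().lstrip("/"))
    digest = hashlib.sha256()
    with dest.open("rb") as fh:
        # Hash in 1 MiB chunks to keep memory use flat for large archives.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return dest, digest.hexdigest()
```

A caller would record the returned digest next to the archive (for example in the audit trail) so a later restore can verify the snapshot before using it.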

What this PR does NOT do

  • Does NOT implement actual inventory/repair scripts (stubs only)
  • Does NOT perform any OpenClaw upgrade
  • Does NOT modify cron stores or systemd timers

Role: deep-reviewer
Refs: #135, Iskra evidence 2026-05-09

Pre-impl skeleton + Spec Kit for issue #135 OpenClaw Scheduler Observability.

Status: draft — for review, NOT for merge. Pattern: hybrid GitHub Spec Kit +
code skeleton (siblings: PR #323 Agent Access Plane, PR #324 Agent Wake Bus;
pdurlej/iskra-openclaw PR #276/#278/#279/#281).

Source: issue #135 (Iskra-authored, 2026-05-09 evidence:
forgejo-self-audit-20260504 checkpoint exists without backing scheduled job).

What's in here (~900 LOC):

Spec Kit (6 markdowns, ~800 lines):
- 00-constitution.md — 7 principles (scheduler ownership explicit, ledger ↔
  jobs consistency, checklist-gated upgrade, snapshot before mutation,
  read-only first, openclaw cron list MUST NOT fail silently, cross-store
  reconciliation is operator-decision), 8 anti-patterns, boundary.
- 01-specify.md — 8 FRs (inventory, drift, repair, snapshot, preflight, gate,
  audit, ownership), 6 NFRs, 5 open questions, 6 falsifiable success criteria.
- 02-plan.md — 3 slices (a/b/c), M1-M4, 6 risks, rollback per slice,
  composability with OpenClaw upgrade pipeline + Wake Bus drift alerts.
- 03-tasks.md — atomic tasks, ~1600 LOC v0 estimate, codex execution notes.
- 04-implement-notes.md — 10 pre-impl decisions (read jobs.json directly NOT
  via `openclaw cron list`; tar+SHA-256 snapshot; ownership registry sacred-
  adjacent; JSONL audit; run as openclaw user; dry-run default; 5/30d
  retention; schema drift = warn; systemd ownership canonical for system jobs;
  cross-store conflict = operator-decision).
- README.md — index.
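
Two of the pre-impl decisions listed above, "JSONL audit" and "dry-run default", combine naturally. The following is a hedged sketch only: `audited_mutation`, its parameters, and the record fields (`ts`, `action`, `dry_run`, `outcome`) are hypothetical names, not the schema defined in the Spec Kit.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable


def audited_mutation(
    action: str,
    apply: Callable[[], None],
    audit_log: Path,
    dry_run: bool = True,
) -> dict:
    """Append one JSONL audit record; call apply() only when dry_run is False."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "dry_run": dry_run,
    }
    outcome = "skipped"
    try:
        if not dry_run:
            apply()
            outcome = "applied"
    except Exception as exc:
        outcome = f"failed: {exc}"
        raise
    finally:
        # The record is written even when apply() raises.
        record["outcome"] = outcome
        with audit_log.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record, sort_keys=True) + "\n")
    return record
```

Because `dry_run` defaults to `True`, a caller has to opt in explicitly before anything is mutated, which matches the operator-gated stance of the spec.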

Code skeleton (py_compile clean):
- scripts/openclaw_scheduler/_types.py — JobRecord, DriftEntry, DriftReport,
  SchedulerStore (4-value enum: active_cron, legacy_cron, systemd_timers,
  promise_ledger), DriftSeverity, JobStatus
- scripts/openclaw_scheduler/inventory.py — read_active_cron, read_legacy_cron,
  read_systemd_timers, read_promise_ledger, full_inventory, main stubs with
  TODO(codex Slice a) markers + canonical paths
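
For orientation, the type names above might be shaped roughly as follows. Only the class names and the four `SchedulerStore` values come from this commit message; every field name and the `clean` property are assumptions made for illustration. `DriftSeverity` and `JobStatus` are omitted because their values are not listed here.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class SchedulerStore(str, Enum):
    """The four stores named in the commit message."""
    ACTIVE_CRON = "active_cron"
    LEGACY_CRON = "legacy_cron"
    SYSTEMD_TIMERS = "systemd_timers"
    PROMISE_LEDGER = "promise_ledger"


@dataclass(frozen=True)
class JobRecord:
    job_id: str                     # hypothetical field
    store: SchedulerStore           # which store the job was read from
    schedule: Optional[str] = None  # hypothetical: cron expression or timer spec


@dataclass
class DriftEntry:
    job_id: str                         # hypothetical field
    present_in: List[SchedulerStore]    # hypothetical field
    missing_from: List[SchedulerStore]  # hypothetical field


@dataclass
class DriftReport:
    entries: List[DriftEntry] = field(default_factory=list)

    @property
    def clean(self) -> bool:
        """True when no drift entries were recorded (hypothetical helper)."""
        return not self.entries
```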

Cross-references:
- Issue #135 (Iskra-authored parent + 2026-05-09 evidence)
- pdurlej/iskra-openclaw runtime (sacred path boundary; v0 expects no schema
  changes during platform v0 development)
- PR #323 (Agent Access Plane) — same audit JSONL pattern
- PR #324 (Wake Bus) — could emit alerts on detected drift
- ADR-0004 Iskra telemetry boundary
- ADR-0018 — no silent repair, operator-gated mutations
- Pan Herbatka day-1 checklist § 5 sacred paths

Tier: this PR Trivial (draft scaffold). Future slice PRs: Lite (a/b), Full (c)
per ADR-0007 — Slice (c) repair scripts are class/security-sensitive.

Refs #135
docs(specs): add refresh note to openclaw-scheduler README (dziadek)
Some checks failed
canary-required / collect-diff (pull_request) Successful in 4s
patchwarden-client-dry-run / collect-diff (pull_request) Successful in 4s
canary-required / canary (pull_request) Has been skipped
patchwarden-client-dry-run / dry-run (pull_request) Successful in 22s
base-is-main / guard (pull_request) Successful in 1s
patchwarden-pr-sanity / collect-diff (pull_request) Successful in 5s
patchwarden-pr-sanity / sanity (pull_request) Failing after 6m0s
fbc231fb0d
Verified 2026-05-28: issue #135 still OPEN, milestone 08.
Original: claude (2026-05-17)
codex changed title from docs(specs): OpenClaw Scheduler Observability — Spec Kit + inventory code (WIP) to WIP: docs(specs): OpenClaw Scheduler Observability — Spec Kit + inventory code 2026-05-28 14:56:45 +02:00
Collaborator

Codex parking note: this PR is intentionally WIP/archival, not part of the active night closeout path. Treat it as self-contained recovered knowledge from old branches. Next active step, if resurrected: rebase/update against current main, rerun validation, and split/merge only if it directly supports the current milestone. No OpenClaw runtime mutation should happen from this PR.

Collaborator

Codex note: this WIP PR is valuable but structurally stale against current main.

I inspected the diff and it now carries large accidental reverse-diff churn unrelated to OpenClaw scheduler observability. I opened a clean replacement PR that preserves only the intended 9 files from this branch:

  • replacement: #631
  • parent issue: #135

Recommendation: keep this PR open only until #631 merges, then close this as superseded so the archival content is preserved without merging stale branch drift.

Secret output: none.
pdurlej referenced this pull request from a commit 2026-05-30 09:13:51 +02:00
Collaborator

Superseded by #631, which rescued the intended OpenClaw scheduler observability Spec Kit from a fresh main branch and merged it without the stale broad diff from this WIP branch.

Leaving this PR closed as archival context only. No runtime changes were taken from this branch.

codex closed this pull request 2026-05-30 09:14:58 +02:00

Reference
pdurlej/platform!568