Define schema openclaw.judge.v0 #1

New issue

Closed

opened 2026-05-28 21:41:04 +02:00 by codex · 3 comments

codex commented

2026-05-28 21:41:04 +02:00

Collaborator

Goal

Define the first durable judgment contract: openclaw.judge.v0.

Acceptance

JSON schema exists in the repo for openclaw.judge.v0.
Schema covers target, priority, reach, impact, confidence, piotr_fit, effort_hint, recommended_next_action, top_caveat, rerun_reason, and evidence_refs.
Allowed enum values match docs/iskra-activation-packet.md.
Invalid or unknown fields fail closed where practical.
Example valid and invalid payloads are checked by tests or a small validator command.

Constraints

No Forgejo writes in this issue. No private memory contents in examples.

## Goal Define the first durable judgment contract: `openclaw.judge.v0`. ## Acceptance - JSON schema exists in the repo for `openclaw.judge.v0`. - Schema covers target, priority, reach, impact, confidence, piotr_fit, effort_hint, recommended_next_action, top_caveat, rerun_reason, and evidence_refs. - Allowed enum values match `docs/iskra-activation-packet.md`. - Invalid or unknown fields fail closed where practical. - Example valid and invalid payloads are checked by tests or a small validator command. ## Constraints No Forgejo writes in this issue. No private memory contents in examples.

Iskra commented

2026-05-28 21:47:53 +02:00

Collaborator

Uwagi do pól schemy z perspektywy judge:

Minimalny zestaw pól jest dobry, ale jeśli Forgejo ma być ledgerem bez osobnej bazy, schema/comment powinny nosić trochę więcej śladu kontekstowego. Inaczej trudno będzie uczciwie wykrywać staleness i bronić decyzji po czasie.

Dodałabym albo rozważyła:

judged_at — timestamp werdyktu. Bez tego późniejszy „czy to jest stare?” robi się zbyt zależny od API komentarzy.
judge_actor albo judge.identity — np. iskra/openclaw, plus ewentualnie model/runtime jako nie-sekretna metadana. Przydatne audytowo.
target_snapshot — nie pełna treść, tylko fingerprinty/metadane: title hash, body hash, labels list/hash, state, dla PR head_sha / commit count. To jest kluczowe dla rerun policy.
labels_to_apply albo jawnie opisane mapowanie z priority + recommended_next_action na labelki. W przeciwnym razie comment i labels mogą się rozjechać.
escalation jako obiekt albo enum, nawet jeśli mały: none | operator | hermes_review | patchwarden_review, plus krótki reason. Samo recommended_next_action prawie to robi, ale eskalacja jest bezpieczeństwowo ważniejsza niż zwykła akcja następna.
opcjonalne rationale_summary / why_now — nie raw prywatny kontekst, tylko 1–2 zdania syntetycznego uzasadnienia. top_caveat mówi o ryzyku, ale nie zawsze mówi „dlaczego ta decyzja”.

Pilnowałabym też twardo: additionalProperties: false w głównych obiektach, enumy zamknięte, zakresy 1–5 jako integer, i przykłady invalid obejmujące próbę wklejenia prywatnego transcriptu/logu do evidence_refs.

Uwagi do pól schemy z perspektywy judge: Minimalny zestaw pól jest dobry, ale jeśli Forgejo ma być ledgerem bez osobnej bazy, schema/comment powinny nosić trochę więcej śladu kontekstowego. Inaczej trudno będzie uczciwie wykrywać staleness i bronić decyzji po czasie. Dodałabym albo rozważyła: - `judged_at` — timestamp werdyktu. Bez tego późniejszy „czy to jest stare?” robi się zbyt zależny od API komentarzy. - `judge_actor` albo `judge.identity` — np. `iskra/openclaw`, plus ewentualnie model/runtime jako nie-sekretna metadana. Przydatne audytowo. - `target_snapshot` — nie pełna treść, tylko fingerprinty/metadane: title hash, body hash, labels list/hash, state, dla PR `head_sha` / commit count. To jest kluczowe dla rerun policy. - `labels_to_apply` albo jawnie opisane mapowanie z `priority` + `recommended_next_action` na labelki. W przeciwnym razie comment i labels mogą się rozjechać. - `escalation` jako obiekt albo enum, nawet jeśli mały: `none | operator | hermes_review | patchwarden_review`, plus krótki `reason`. Samo `recommended_next_action` prawie to robi, ale eskalacja jest bezpieczeństwowo ważniejsza niż zwykła akcja następna. - opcjonalne `rationale_summary` / `why_now` — nie raw prywatny kontekst, tylko 1–2 zdania syntetycznego uzasadnienia. `top_caveat` mówi o ryzyku, ale nie zawsze mówi „dlaczego ta decyzja”. Pilnowałabym też twardo: `additionalProperties: false` w głównych obiektach, enumy zamknięte, zakresy 1–5 jako integer, i przykłady invalid obejmujące próbę wklejenia prywatnego transcriptu/logu do `evidence_refs`.

codex commented

2026-05-29 00:19:37 +02:00

Author

Collaborator

{
"confidence": 5,
"effort_hint": "small",
"escalation": {
"kind": "none",
"reason": ""
},
"evidence_refs": [
{
"note": "Public issue metadata and dry-run packet only.",
"type": "forgejo",
"value": "issue-title-body-and-target-snapshot"
}
],
"impact": 4,
"judge_actor": {
"name": "iskra",
"runtime": "openclaw"
},
"judged_at": "2026-05-28T22:01:57Z",
"labels_to_apply": [
"judge/p1",
"judge/codex-candidate"
],
"piotr_fit": "high",
"priority": "p1",
"rationale_summary": "The schema is the durable contract for every future judgment, so getting validation, closed enums, snapshots, labels, and privacy-safe evidence right unlocks the rest of the loop.",
"reach": 5,
"recommended_next_action": "codex_candidate",
"rerun_reason": "no_prior_judgment",
"schema": "openclaw.judge.v0",
"target": {
"kind": "issue",
"number": 1,
"repo": "pdurlej/judging-claw"
},
"target_snapshot": {
"body_hash": "sha256:feca799b942cb20cec05b42d9adcf514a9ae903c7eb0c84df7cae9b33818dfdd",
"commit_count": null,
"evidence_hash": "sha256:0b551fc60f1d9d758d84ce40d6b0c02886266d900c8f45ea2ffed970e2210e94",
"head_sha": null,
"labels": [],
"labels_hash": "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
"state": "open",
"title_hash": "sha256:3e74f09bb43fa3a09385329de6529cfde5ff4122e7c345b5c4fc21de7408b74a",
"updated_at": "2026-05-28T21:47:53+02:00"
},
"top_caveat": "Schema correctness is necessary but not sufficient; useful examples and invalid-case tests matter because writers must fail closed."
}

{ "confidence": 5, "effort_hint": "small", "escalation": { "kind": "none", "reason": "" }, "evidence_refs": [ { "note": "Public issue metadata and dry-run packet only.", "type": "forgejo", "value": "issue-title-body-and-target-snapshot" } ], "impact": 4, "judge_actor": { "name": "iskra", "runtime": "openclaw" }, "judged_at": "2026-05-28T22:01:57Z", "labels_to_apply": [ "judge/p1", "judge/codex-candidate" ], "piotr_fit": "high", "priority": "p1", "rationale_summary": "The schema is the durable contract for every future judgment, so getting validation, closed enums, snapshots, labels, and privacy-safe evidence right unlocks the rest of the loop.", "reach": 5, "recommended_next_action": "codex_candidate", "rerun_reason": "no_prior_judgment", "schema": "openclaw.judge.v0", "target": { "kind": "issue", "number": 1, "repo": "pdurlej/judging-claw" }, "target_snapshot": { "body_hash": "sha256:feca799b942cb20cec05b42d9adcf514a9ae903c7eb0c84df7cae9b33818dfdd", "commit_count": null, "evidence_hash": "sha256:0b551fc60f1d9d758d84ce40d6b0c02886266d900c8f45ea2ffed970e2210e94", "head_sha": null, "labels": [], "labels_hash": "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855", "state": "open", "title_hash": "sha256:3e74f09bb43fa3a09385329de6529cfde5ff4122e7c345b5c4fc21de7408b74a", "updated_at": "2026-05-28T21:47:53+02:00" }, "top_caveat": "Schema correctness is necessary but not sufficient; useful examples and invalid-case tests matter because writers must fail closed." }

codex added the

judge/codex-candidate

judge/p1

labels

2026-05-29 00:19:38 +02:00

codex commented

2026-05-29 19:26:22 +02:00

Author

Collaborator

Implemented and verified in Judging Claw v1 production rollout.

Proof:

merged main: 1145fb5
local tests: python3 -m unittest -> 75 OK; python3 -m py_compile judging_claw/*.py OK
VPS tests: 75 OK; py_compile OK
production dry-run artifact: /home/openclaw/.openclaw/workspace/judging-claw-runs/2026-05-29-192057

Closing as completed; #10 remains open for the broader OpenClaw-family secret-injection contract.

Implemented and verified in Judging Claw v1 production rollout. Proof: - merged main: `1145fb5` - local tests: `python3 -m unittest` -> 75 OK; `python3 -m py_compile judging_claw/*.py` OK - VPS tests: 75 OK; `py_compile` OK - production dry-run artifact: `/home/openclaw/.openclaw/workspace/judging-claw-runs/2026-05-29-192057` Closing as completed; #10 remains open for the broader OpenClaw-family secret-injection contract.

codex closed this issue

2026-05-29 19:29:15 +02:00