Bootstrap judge label taxonomy #3

Closed
opened 2026-05-28 21:41:04 +02:00 by codex · 3 comments
Collaborator

Goal

Prepare the judge label taxonomy for target repos.

Acceptance

  • Allowed label list matches docs/iskra-activation-packet.md.
  • Bootstrap command can dry-run missing labels per repo.
  • Write mode creates only missing labels; it does not delete, rename, or recolor existing labels unless explicitly requested later.
  • Unknown labels in a judgment fail closed or are reported as validation errors.
  • Label colors/descriptions are documented.

Labels

judge/p0, judge/p1, judge/p2, judge/p3, judge/park, judge/needs-refinement, judge/hermes-candidate, judge/codex-candidate, judge/patchwarden-candidate, judge/operator-needed, judge/low-confidence, judge/stale-priority.
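The dry-run and create-only criteria above can be sketched as a small planning step. This is only an illustration: the label names come from this issue, while `plan_bootstrap`, `apply_plan`, `BootstrapPlan`, and the `create_label` callback are hypothetical names, not the actual Judging Claw API. Colors and descriptions are left out because the issue does not list them.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable

# Label names copied from the issue; everything else is a hypothetical sketch.
JUDGE_LABELS = (
    "judge/p0", "judge/p1", "judge/p2", "judge/p3", "judge/park",
    "judge/needs-refinement", "judge/hermes-candidate",
    "judge/codex-candidate", "judge/patchwarden-candidate",
    "judge/operator-needed", "judge/low-confidence", "judge/stale-priority",
)


@dataclass(frozen=True)
class BootstrapPlan:
    to_create: tuple[str, ...]   # judge labels missing from the repo
    untouched: tuple[str, ...]   # labels already present, never modified


def plan_bootstrap(existing: Iterable[str]) -> BootstrapPlan:
    """Work out which judge labels are missing in one repo."""
    present = set(existing)
    to_create = tuple(name for name in JUDGE_LABELS if name not in present)
    return BootstrapPlan(to_create=to_create, untouched=tuple(sorted(present)))


def apply_plan(plan: BootstrapPlan,
               create_label: Callable[[str], None],
               write: bool = False) -> list[str]:
    """Dry-run by default: return the names that would be created.

    In write mode, call create_label once per missing label. Existing
    labels are never deleted, renamed, or recolored.
    """
    if write:
        for name in plan.to_create:
            create_label(name)
    return list(plan.to_create)
```

Keeping the plan separate from its application means the dry-run output and the write-mode behavior come from the same computed list, so they cannot drift apart.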

Collaborator

The label taxonomy makes sense as a v0, but I see one contract mismatch: recommended_next_action has the values schedule, reject, and observe, and the label list has no corresponding labels.

That does not have to be a bug (not every action needs a label), but the contract should say so explicitly. Otherwise the writer will have to guess.

Proposal:

  • Keep the priority labels mandatory: exactly one of judge/p0, judge/p1, judge/p2, judge/p3, judge/park.
  • Treat the action labels as optional/orthogonal: judge/needs-refinement, judge/hermes-candidate, judge/codex-candidate, judge/patchwarden-candidate, judge/operator-needed.
  • judge/low-confidence is orthogonal too, applied when confidence is low regardless of priority.
  • Clarify whether judge/stale-priority means "the previous verdict is old and needs a rejudge" or "the current verdict says the work is stale-ish". I would use the first meaning and remove it (or not add it) after a fresh rejudge, so that input state is not mixed with the result.

If you want full symmetry, you can add judge/schedule, judge/reject, judge/observe. If not, add a rule: those values stay only in the structured comment, without a label.
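A minimal sketch of how the proposed contract could be checked, assuming the proposal is adopted as written (it is a suggestion in this thread, not a confirmed rule). The function name `validate_labels` and the error strings are hypothetical. The check fails closed on unknown labels, requires exactly one priority label, and rejects judge/stale-priority on a fresh judgment, following the first reading suggested above.

```python
PRIORITY_LABELS = frozenset({
    "judge/p0", "judge/p1", "judge/p2", "judge/p3", "judge/park",
})
ORTHOGONAL_LABELS = frozenset({
    "judge/needs-refinement", "judge/hermes-candidate",
    "judge/codex-candidate", "judge/patchwarden-candidate",
    "judge/operator-needed", "judge/low-confidence",
})
# Assumption: stale-priority describes input state, so a fresh judgment
# should not apply it.
INPUT_STATE_LABELS = frozenset({"judge/stale-priority"})
ALLOWED_LABELS = PRIORITY_LABELS | ORTHOGONAL_LABELS | INPUT_STATE_LABELS


def validate_labels(labels):
    """Return a list of validation errors; an empty list means valid."""
    errors = []
    chosen = set(labels)

    unknown = sorted(chosen - ALLOWED_LABELS)
    if unknown:
        errors.append("unknown labels: " + ", ".join(unknown))

    priorities = chosen & PRIORITY_LABELS
    if len(priorities) != 1:
        errors.append(
            f"expected exactly one priority label, got {len(priorities)}"
        )

    if chosen & INPUT_STATE_LABELS:
        errors.append(
            "judge/stale-priority must not be applied by a fresh judgment"
        )
    return errors
```

Under this sketch, judge/schedule, judge/reject, and judge/observe are reported as unknown, which matches the "structured comment only" option. Adding them to the taxonomy would mean extending `ORTHOGONAL_LABELS`.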

Collaborator

{
  "confidence": 5,
  "effort_hint": "small",
  "escalation": {
    "kind": "none",
    "reason": ""
  },
  "evidence_refs": [
    {
      "note": "Issue requests judge label taxonomy matching the activation packet.",
      "type": "forgejo",
      "value": "issue-title-body-labels-and-target-snapshot"
    },
    {
      "note": "Acceptance requires dry-run bootstrap, create-only write mode, documented colors, and fail-closed unknown labels.",
      "type": "forgejo",
      "value": "issue-body-acceptance"
    },
    {
      "note": "Requested label set covers priority, action, low-confidence, and stale-priority labels.",
      "type": "forgejo",
      "value": "issue-body-label-list"
    }
  ],
  "impact": 4,
  "judge_actor": {
    "name": "iskra",
    "runtime": "openclaw"
  },
  "judged_at": "2026-05-29T06:56:00Z",
  "labels_to_apply": [
    "judge/p1",
    "judge/codex-candidate"
  ],
  "piotr_fit": "high",
  "priority": "p1",
  "rationale_summary": "A stable judge label taxonomy is a small enabling task that makes later judgment routing and validation safer across repos.",
  "reach": 4,
  "recommended_next_action": "codex_candidate",
  "rerun_reason": "no_prior_judgment",
  "schema": "openclaw.judge.v0",
  "target": {
    "kind": "issue",
    "number": 3,
    "repo": "pdurlej/judging-claw"
  },
  "target_snapshot": {
    "body_hash": "sha256:566cf3cf1086dcc1c95f8f7a0fcf6098c5c6b2651ee7541e324729c770bf0fd4",
    "commit_count": null,
    "evidence_hash": "sha256:a5c36139d76750f0d46c33892acec731976599879fdf753b6fe51ee958b65c25",
    "head_sha": null,
    "labels": [],
    "labels_hash": "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "state": "open",
    "title_hash": "sha256:9bd7f3ff222cd5dae34ab2eb9f9e1d1ce1d5467de6a37ae3ac2980f5c7c51058",
    "updated_at": "2026-05-28T21:47:53+02:00"
  },
  "top_caveat": "Write mode must stay create-only and never mutate existing labels unless separately approved."
}
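In the judgment above, priority "p1" lines up with judge/p1 and recommended_next_action "codex_candidate" lines up with judge/codex-candidate. A hedged sketch of a consistency check built on that observation follows. The underscore-to-hyphen mapping is inferred from this single example, and treating schedule, reject, and observe as label-free is an assumption taken from the review comment in this thread. `expected_labels` and `missing_labels` are hypothetical helpers.

```python
# Assumption: these actions live only in the structured comment.
NO_LABEL_ACTIONS = frozenset({"schedule", "reject", "observe"})


def expected_labels(judgment: dict) -> set:
    """Derive the labels implied by priority and recommended_next_action."""
    labels = {"judge/" + judgment["priority"]}
    action = judgment["recommended_next_action"]
    if action not in NO_LABEL_ACTIONS:
        # Assumed mapping: codex_candidate -> judge/codex-candidate
        labels.add("judge/" + action.replace("_", "-"))
    return labels


def missing_labels(judgment: dict) -> list:
    """Return implied labels that labels_to_apply does not contain."""
    return sorted(expected_labels(judgment) - set(judgment["labels_to_apply"]))


# Fields copied from the judgment in this thread.
sample = {
    "priority": "p1",
    "recommended_next_action": "codex_candidate",
    "labels_to_apply": ["judge/p1", "judge/codex-candidate"],
}
print(missing_labels(sample))
```

For the sample payload the check finds nothing missing. A payload with action "observe" and only a priority label would also pass under the stated assumption.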

Author
Collaborator

Implemented and verified in Judging Claw v1 production rollout.

Proof:

  • merged main: 1145fb5
  • local tests: python3 -m unittest -> 75 OK; python3 -m py_compile judging_claw/*.py OK
  • VPS tests: 75 OK; py_compile OK
  • production dry-run artifact: /home/openclaw/.openclaw/workspace/judging-claw-runs/2026-05-29-192057

Closing as completed; #10 remains open for the broader OpenClaw-family secret-injection contract.

codex closed this issue 2026-05-29 19:29:22 +02:00