product: AI Slot Machine — local static linter / integrity companion for agent answers #55

Closed
opened 2026-05-04 13:32:49 +02:00 by Iskra · 1 comment
Collaborator

Context

Moved from pdurlej/iskra-openclaw#31 because this is a platform/operator microproduct, not primarily an Iskra-specific feature.

Piotr developed the "AI Slot Machine / LLM Slot Machine" idea in Signal voice notes on 2026-05-04.

Initial shape:

  • a small OSS git repo / microproduct;
  • MCP, sidecar, plugin, or companion service;
  • not a "truth machine", but a risk/integrity indicator;
  • green/yellow/red/slot-machine style badge for AI answers.

Follow-up refinement:

Treat AI Slot Machine as a static gate, something like static code analysis. Running automatically alongside as an MCP, it says whether the AI's answer looks good given the model and context. We focus on Codex models, ChatGPT, and Claude models. It should be deterministic, like code review or linters: it holds the bar, gives feedback automatically, and can write the message itself: "hey, you are not meeting this, this, and this, please fix it". We give vibe coders an automatic assistant that guardrails the collaboration. Because it is static and sends nothing anywhere, it works privately.

Core idea

Build a lightweight, local-first AI answer integrity linter.

It should inspect an agent/LLM answer and produce a deterministic risk/integrity verdict, similar to how linters and static analyzers inspect code.

The point is not to prove factual truth. The point is to flag answer-production smells before the user trusts the output.

Positioning

Not:

  • hallucination oracle;
  • factual truth machine;
  • another heavyweight eval framework;
  • cloud compliance dashboard.

Instead:

  • local/static answer linter;
  • operator companion;
  • "slot machine" risk badge;
  • guardrail for vibe-coding / agentic coding workflows;
  • private-by-default, no external calls required in baseline mode.

Suggested product names

Working names:

  • AI Slot Machine
  • LLM Slot Machine
  • Answer Integrity Companion
  • SlotCheck
  • Truthiness Companion
  • Jednoręki Bandyta AI

Deterministic checks / heuristics

Candidate static checks:

  • unsupported strong factual claims;
  • claims about files/system state without tool evidence;
  • claims about past actions without receipts/logs;
  • "confirmed/done/working" language without evidence;
  • contradiction between tool errors and final answer;
  • missing source tags for important state/history claims;
  • overconfident language under uncertainty;
  • no tests/lint/build despite code-change claims;
  • privacy-sensitive claim without explicit source;
  • stale model/context warnings;
  • mismatch between requested output format and answer;
  • apology/hedging loops that hide the actual blocker;
  • excessive smoothness / sycophancy markers if configured.
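
Two of the checks above ("confirmed/done/working" language without evidence, and contradiction between tool errors and the final answer) could be sketched as plain functions over the answer text and a tool trace. This is only an illustration: the trace shape (a list of dicts with `tool` and `exit_code` keys), the `Finding` type, the word list, and the rule IDs are all assumptions, since the issue does not define them.

```python
import re
from dataclasses import dataclass

# ASSUMPTION: a trace is a list of dicts such as
# {"tool": "shell", "exit_code": 0}. The issue defines no trace schema.


@dataclass(frozen=True)
class Finding:
    rule_id: str   # hypothetical rule identifier
    message: str   # human-readable reason shown to the operator


# Hypothetical word list for completion/success language.
COMPLETION_WORDS = re.compile(
    r"\b(confirmed|done|working|verified|fixed)\b", re.IGNORECASE
)


def check_completion_without_evidence(answer: str, trace: list[dict]) -> list[Finding]:
    """Flag completion language when no tool call in the trace succeeded."""
    match = COMPLETION_WORDS.search(answer)
    if match is None:
        return []
    if any(step.get("exit_code") == 0 for step in trace):
        return []
    word = match.group(0).lower()
    return [Finding(
        "completion-without-evidence",
        f'says "{word}" but no successful tool call supports it',
    )]


def check_tool_error_contradiction(answer: str, trace: list[dict]) -> list[Finding]:
    """Flag completion language when the last recorded tool call failed."""
    if not trace or COMPLETION_WORDS.search(answer) is None:
        return []
    last = trace[-1]
    if last.get("exit_code", 0) != 0:
        return [Finding(
            "tool-error-contradiction",
            f"answer claims success but the last tool call "
            f"({last.get('tool', 'unknown')}) exited with {last['exit_code']}",
        )]
    return []
```

Both functions are pure and depend only on their inputs, so the same answer and trace always produce the same findings, which is the property the linter analogy relies on.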

Output shape

Simple badge + actionable feedback:

  • 🟢 low risk / grounded enough;
  • 🟡 some uncertainty or missing evidence;
  • 🔴 high risk / unsupported claims or tool contradiction;
  • 🎰 slot-machine mode: answer sounds fluent but has poor grounding.

Example:

🎰 SLOT MACHINE
Reasons:
- says "confirmed" but no command/log/source supports it
- references a file path that returned ENOENT
- no verification step after code change
Suggested fix:
- rerun path check
- include exact evidence or mark as hypothesis
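
The report above could be produced from a small data structure. The sketch below is an assumption about how that might look: the `Verdict` and `Report` names and the labels for green, yellow, and red are invented for illustration; only the "SLOT MACHINE" header and the section titles come from the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    # (badge, label). Only the SLOT label appears in the issue;
    # the other three labels are assumed.
    GREEN = ("🟢", "LOW RISK")
    YELLOW = ("🟡", "UNCERTAIN")
    RED = ("🔴", "HIGH RISK")
    SLOT = ("🎰", "SLOT MACHINE")


@dataclass
class Report:
    verdict: Verdict
    reasons: list[str] = field(default_factory=list)
    fixes: list[str] = field(default_factory=list)


def render(report: Report) -> str:
    """Render a report as badge line, reasons, and suggested fixes."""
    badge, label = report.verdict.value
    lines = [f"{badge} {label}"]
    if report.reasons:
        lines.append("Reasons:")
        lines += [f"- {reason}" for reason in report.reasons]
    if report.fixes:
        lines.append("Suggested fix:")
        lines += [f"- {fix}" for fix in report.fixes]
    return "\n".join(lines)


example = Report(
    verdict=Verdict.SLOT,
    reasons=[
        'says "confirmed" but no command/log/source supports it',
        "references a file path that returned ENOENT",
        "no verification step after code change",
    ],
    fixes=[
        "rerun path check",
        "include exact evidence or mark as hypothesis",
    ],
)
print(render(example))
```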

Integration targets

Possible MVP targets:

  1. CLI:
    • aislot check answer.md --context trace.json
  2. MCP server:
    • expose check_answer, check_trace, explain_verdict;
  3. OpenClaw plugin/sidecar:
    • inspect agent final replies before delivery;
  4. GitHub/Forgejo bot:
    • comment on PR/issue agent outputs;
  5. Codex/Claude Code companion:
    • guardrail coding-agent responses.
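
For the CLI target, the `aislot check answer.md --context trace.json` invocation could be parsed with the standard library alone. This is a minimal sketch, not a committed interface: the function names are hypothetical, and treating the context file as a JSON list of tool calls is an assumption.

```python
import argparse
import json
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Parser for `aislot check <answer> [--context <trace.json>]`."""
    parser = argparse.ArgumentParser(prog="aislot")
    commands = parser.add_subparsers(dest="command", required=True)
    check = commands.add_parser("check", help="lint a saved answer")
    check.add_argument("answer", type=Path, help="file holding the answer text")
    check.add_argument(
        "--context", type=Path, default=None,
        help="optional tool trace as JSON (schema assumed, not specified)",
    )
    return parser


def load_inputs(args: argparse.Namespace) -> tuple[str, list]:
    """Read the answer and, if given, the trace; nothing leaves the machine."""
    answer = args.answer.read_text(encoding="utf-8")
    trace = []
    if args.context is not None:
        trace = json.loads(args.context.read_text(encoding="utf-8"))
    return answer, trace
```

Keeping input loading separate from rule evaluation would let the MCP server and plugin targets reuse the same rules without going through the CLI.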

Privacy model

Baseline should be fully local/static:

  • no external API calls;
  • no telemetry;
  • no uploading prompts/answers;
  • optional config file for local rules;
  • optional external verifier mode only when explicitly enabled.

This privacy angle is important: a small tool is easier to trust when it never needs to send the answer or its context anywhere.
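
One way to back the "no external API calls" promise in tests is to make socket creation fail while the baseline rules run. The sketch below assumes a hypothetical `forbid_network` helper; it only catches code that goes through Python's `socket.socket`, so it would not notice a subprocess or a native extension opening its own connection.

```python
import socket
from contextlib import contextmanager


class NetworkAccessError(RuntimeError):
    """Raised when code tries to open a socket inside forbid_network()."""


@contextmanager
def forbid_network():
    """Temporarily replace socket.socket so any attempt to create one fails."""
    original = socket.socket

    def blocked(*args, **kwargs):
        raise NetworkAccessError("baseline mode must not open sockets")

    socket.socket = blocked
    try:
        yield
    finally:
        # Always restore the real class, even if the guarded code raised.
        socket.socket = original
```

A test suite could wrap every baseline rule run in `with forbid_network():` so that an accidental network dependency fails loudly rather than silently.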

Prior art / nearby OSS

Quick search found related but not identical tools:

  • cisco-open/polygraphLLM — hallucination/factuality library;
  • KRLabsOrg/LettuceDetect — RAG hallucination detection;
  • exa-labs/exa-hallucination-detector — claim extraction + web evidence;
  • HalluciGuard — claim extraction + trust score;
  • VATBox/llm-confidence — confidence from logprobs.

These are closer to detectors/eval frameworks. The proposed niche is a lighter operator-facing static linter/badge for agent answers.

MVP scope

A tiny first version could be:

  • input: answer text + optional tool trace JSON;
  • deterministic rules engine;
  • verdict: green/yellow/red/slot;
  • reasons + suggested fix;
  • no network;
  • configurable severity thresholds;
  • examples from Iskra/OpenClaw incidents.
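
The rules engine and configurable severity thresholds could be as small as the sketch below. Everything named here is an assumption for illustration: the integer severity scale, the `Thresholds` defaults, and in particular the choice to treat "slot" as the tier above "red" by total score, which the issue does not specify.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Finding:
    rule_id: str
    severity: int  # ASSUMED scale: 1 = minor, 3 = serious
    message: str


@dataclass(frozen=True)
class Thresholds:
    # ASSUMED defaults; meant to be overridden from a local config file.
    yellow: int = 1
    red: int = 3
    slot: int = 5


# A rule is any pure function from (answer, trace) to a list of findings.
Rule = Callable[[str, list[dict]], list[Finding]]


def evaluate(answer: str, trace: list[dict], rules: list[Rule],
             thresholds: Thresholds = Thresholds()) -> tuple[str, list[Finding]]:
    """Run rules in the given order and map the summed severity to a verdict."""
    findings = [f for rule in rules for f in rule(answer, trace)]
    score = sum(f.severity for f in findings)
    if score >= thresholds.slot:
        verdict = "slot"
    elif score >= thresholds.red:
        verdict = "red"
    elif score >= thresholds.yellow:
        verdict = "yellow"
    else:
        verdict = "green"
    return verdict, findings
```

Because rules run in a fixed order and the score is a plain sum, the same inputs and thresholds always yield the same verdict and the same ordered list of reasons.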

Acceptance criteria

  • Repo skeleton exists with README explaining the concept.
  • CLI can evaluate a saved answer against static rules.
  • At least 10 deterministic rules implemented.
  • Output includes verdict, reasons, and suggested fix.
  • Baseline mode performs no network calls.
  • Example fixtures include at least one answer for each verdict: green, yellow, red, and slot-machine.
  • Future MCP/server integration path is documented.

Why this matters

This directly addresses the recurring trust problem in agent systems: fluent answers can hide weak grounding.

A cheap deterministic linter will not solve truth, but it can catch the obvious operational failures before they reach a human as polished bullshit.

That is very on-brand for OpenClaw/Iskra: evidence discipline over vibes.

Related

  • #27 — OpenClaw upgrade blocked by missing auto-repair/self-healing guardrails
  • #29 — WordPress/blog integration
  • #30 — Rybbit analytics visible to Iskra
Collaborator

Operator decision: SPIN OFF (2026-05-06)

Operator-confirmed via chat 2026-05-06: AI Slot Machine moves to a separate repo.

Rationale

This is an OSS microproduct, not platform-internal infrastructure. Mixing it with pdurlej/platform would:

  • Bloat platform repo with unrelated product code
  • Mix licensing concerns (OSS-public vs platform-private)
  • Create confusion for future contributors
  • Violate platform↔microproject contract (Issue #74) — different scope, different audience

Action

Closing this issue from pdurlej/platform.

Operator-side TODO (not for agent automation):

  • Create new repo pdurlej/aislot (or chosen name) on Forgejo
  • Initial commit: README with the concept (copy from this issue body) + LICENSE choice (MIT or Apache 2.0 — operator picks)
  • Apply governance pattern from #75 (mandatory non-author reviewer + branch protection) to the new repo
  • Open the same issue in the new repo if active development starts; reference this closed one as origin

What stays in platform

Nothing from #55 carries forward to the platform repo. The aislot design itself (deterministic linter for AI answers, slot-machine risk badge, local-first privacy) is the operator's IP and lives in the spin-off repo.

Reference
pdurlej/platform#55