Spike: evaluate Understand Anything for repo maps / agent context #404
Context
Why this may matter for platform
pdurlej/platform is exactly the kind of repo where agents can lose architectural context.
Suggested spike
pdurlej/platform in a controlled Claude session.
Acceptance criteria
Notes
Spike #404 — Understand Anything (UA) evaluation report
Run: 2026-05-23 by claude
Commit analyzed: 30ca5d2056ed0962a3553e83b612b3d5761fdbea (post-PR-#397 merge)
Plugin: Lum1104/Understand-Anything@2.7.4 (Claude Code marketplace)
Issue: #404
TL;DR
Recommendation: keep as one-off deep-onboarding tool, NOT default agent-orientation dependency.
UA produced a genuinely high-quality knowledge graph of pdurlej/platform: 849 nodes, 894 edges, 10 architectural layers, and a 14-step pedagogical tour. The architecture-analyzer correctly classified the repo as a declarative platform spec (not a runtime app), a non-trivial framing that LLM cold readers often miss. The Honcho subsystem dependency edges (honcho-api → honcho-postgres + honcho-redis, honcho-deriver → ...) match real topology. File summaries are concise and accurate enough that a fresh cousin could skip opening many files.
But the cost is large (≈1.9M sub-agent tokens for a single run on the 436-file working set), the graph is a point-in-time snapshot (already partially stale at write time, per the auto-update needing fingerprints), and the existing state/STATUS_NOW.md + decisions/*.md + README.md + INDEX.md substrate already gives fresh cousins a high-quality onboarding path without ongoing UA cost.
UA wins as an on-demand deep-dive tool (one-off "scan + ask" for a specific architectural question), not as an always-on context layer.
What got built (acceptance criterion 1: one controlled run)
- .understandignore generation, scope-down
- project-scanner agent inventoried 436 files (post-exclude), built import-map (19 files have internal imports)
- file-analyzer agents (4 rounds of 5+5+5+3 in parallel)
- assemble-reviewer: clean, no fixes
- architecture-analyzer → 10 layers, all 479 file-level nodes assigned
- tour-builder → 14 ordered pedagogical steps
- knowledge-graph.json (732 KB) + fingerprints.json (404 KB) + meta.json
Total: ~45 min wall, ~1.87M tokens of sub-agent work.
(Operator-visible session cost on the orchestrator side was modest — heavy work happened in sub-agents whose context is separate from the parent session. Practical impact on Claude Max session budget felt closer to 25-35% on the parent.)
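To put the run totals in per-file terms, here is a rough back-of-the-envelope sketch in Python. It uses only the figures reported above (~1.87M sub-agent tokens, 436 files, ~45 min wall). The function names are illustrative, and the linear scaling in `estimate_update_tokens` is an assumption rather than something measured in this run: the total also covers the scanner, architecture and tour agents, so the per-file average likely overstates the cost of re-analyzing a single changed file.

```python
# Figures reported for this run (approximate, as stated in the report).
SUB_AGENT_TOKENS = 1_870_000  # "~1.87M tokens of sub-agent work"
FILES_SCANNED = 436           # files inventoried after exclusions
WALL_MINUTES = 45             # "~45 min wall"


def per_file_cost(tokens: int, files: int, wall_minutes: float) -> dict:
    """Average tokens and wall-clock seconds spent per scanned file."""
    return {
        "tokens_per_file": tokens / files,
        "seconds_per_file": wall_minutes * 60 / files,
    }


def estimate_update_tokens(changed_files: int) -> float:
    """ASSUMPTION: an incremental update costs the full-run average per file.

    This is a crude upper-bound style guess, not a measured number.
    """
    average = per_file_cost(SUB_AGENT_TOKENS, FILES_SCANNED, WALL_MINUTES)
    return changed_files * average["tokens_per_file"]


cost = per_file_cost(SUB_AGENT_TOKENS, FILES_SCANNED, WALL_MINUTES)
print(f"{cost['tokens_per_file']:.0f} tokens/file, "
      f"{cost['seconds_per_file']:.1f} s/file")
# prints "4289 tokens/file, 6.2 s/file"
```

Under that assumption, a merge touching 10 files would land at roughly 43k tokens, which gives a feel for the recurring cost of keeping the graph current.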
Cost / time notes (acceptance criterion 2)
- Without .understandignore excluding baseline/ (binary DB dumps), .sisyphus/ (logs), control-plane/.venv/ (~900 dep files), and pyc/cache, the run would have been ~3× the cost and ~3× the wall time, mostly on irrelevant noise.
- --review flag (full LLM graph reviewer) was NOT used; deterministic inline JS validation passed cleanly. Adding --review would have added ~3-5k more sub-agent tokens for marginal value on a clean graph.
Generated artifacts
Under /Users/pd/Developer/iskra-platform-2026-04-30/.understand-anything/:
- knowledge-graph.json
- fingerprints.json
- meta.json
- .understandignore
Recommendation on commit: leave .understand-anything/ gitignored by default; promote .understandignore to repo root only if UA is adopted. This matches the issue body's "be careful not to commit generated bulky/private artifacts without review."
Usefulness test (acceptance criterion 3)
Query (real operator-onboarding style): "What would I need to read to understand the RS2000 → VPS1000 deploy path and the relationship between the control plane (platformctl) and the per-module module.yaml manifests? Give a reading order and tell me which files are actual code vs spec/intent."
/understand-chat produced (paraphrased):
- PLATFORM_CHARTER.md → README.md
- modules/INDEX.yaml → schema/module.schema.json → exemplar modules/honcho-api/{module.yaml, runbook.md}
- control-plane/platformctl/{cli.py, manifest.py, graph.py, apply.py, doctor.py}, each annotated with what it does (e.g. apply.py: "Core apply engine that verifies an approved plan against Forgejo PR state, builds a docker-compose runtime command, executes it")
- compose/{base,core,edge,apps}/compose.yaml
- platformctl-plan.yml, platformctl-auto-apply.yml, deploy-vps1000.yml, platform-smoke.yml, with a one-liner per workflow describing trigger + scope
- state/STATUS_NOW.md (canonical per ADR-0006)
Quality vs grep + manual read:
- module.yaml upstream blocks manually
- --review LLM pass to catch.)
Verdict on usefulness: net positive for a first-time deep orientation question. Marginal for a returning cousin who's already seen STATUS_NOW.md + INDEX.md + the README's reading order. NOT a replacement for those; it supplements them.
Where UA shined
- depends_on edges from spec.dependencies.upstream extraction is exactly right.
- serves_areas key in one module.yaml (need to find which manifest; the fingerprints script reported "YAML parse failed, falling back to regex extraction"). Worth a follow-up issue.
Where UA was weak
- STATUS_NOW.md + INDEX.md + state/L3/ audit synthesis (JOURNEY/CONTRADICTIONS/OPEN_LOOPS/ORPHANS) + ADRs + README. These give 80% of UA's value at 0% of the per-run cost.
- 30ca5d2. Repo moves fast (PR #397 merged today). Without auto-update (which requires the --auto-update flag + commit hooks), the graph becomes lying documentation within a few merges. Auto-update itself fires file-analyzer per changed file: bounded cost, but recurring.
- file-analyzer per-batch limits prevent fully wiring known cross-doc semantic links; a --review LLM pass might catch these but adds cost.
- pnpm install failed twice due to pnpm 11's new build-script policy (allowBuilds: placeholder template strings in pnpm-workspace.yaml). Required a manual patch. Documented in state/spike-understand-anything/00-handoff.md. Operator-time cost to first run: ~10 minutes on top of the plugin install itself.
Recommendation (acceptance criterion 4)
One-off deep-onboarding tool. NOT default agent-orientation.
When to use UA on platform:
When NOT to use UA on platform:
- STATUS_NOW.md + INDEX.md + per-cousin-onboarding skill substrate
- module.yaml + runbook.md pair is denser and more accurate per-file
Concrete follow-ups (if adopted)
- .understand-anything/ to .gitignore (and document the on-demand-run convention in AGENTS.md).
- .understandignore at repo root with the same exclusions used here (baseline/, .sisyphus/, control-plane/.venv/, **/__pycache__/, *.egg-info/, state/AUDIT_LOG.jsonl, state/spike-understand-anything/).
- Lum1104/Understand-Anything about pnpm 11 allowBuilds: template strings causing install failure; should ship as true/false, not placeholder strings.
- serves_areas YAML key flagged by the fingerprints script (find which module.yaml has it, fix the duplicate).
Cleanup if rejected (no commit needed)
Spike report by claude under Earl Grey discipline. Operator's call on adopt / one-off / reject.
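For the duplicate serves_areas follow-up listed in the report, one way to locate the offending manifest is a small standard-library scan like the sketch below. It is not the fingerprints script's logic, and it is not a YAML parser: it only tracks indentation in block-style mappings, so block scalars, flow mappings and quoted keys can produce misses or false positives. The sample manifest is invented for illustration, and where serves_areas actually sits in module.yaml is an assumption. The modules/*/module.yaml glob follows the modules/honcho-api/module.yaml layout mentioned in the report.

```python
import re
from pathlib import Path

# Matches "key:" or "- key:" at the start of a block-style YAML line.
KEY_RE = re.compile(r"^(?P<indent> *)(?P<dash>-\s+)?(?P<key>[A-Za-z_][\w.-]*):(?:\s|$)")


def find_duplicate_keys(text: str) -> list:
    """Return (line_number, key) for keys repeated within one block mapping."""
    duplicates = []
    stack = []  # open mappings as [indent, seen_keys], innermost last
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        match = KEY_RE.match(line)
        if match is None:
            continue
        indent = len(match.group("indent"))
        key = match.group("key")
        # Close every mapping that is indented deeper than this line.
        while stack and stack[-1][0] > indent:
            stack.pop()
        if match.group("dash"):
            # "- key:" starts a fresh mapping whose keys sit after the dash.
            stack.append([indent + len(match.group("dash")), {key}])
        elif stack and stack[-1][0] == indent:
            if key in stack[-1][1]:
                duplicates.append((lineno, key))
            stack[-1][1].add(key)
        else:
            stack.append([indent, {key}])
    return duplicates


def scan_repo(root: str) -> dict:
    """Map each module.yaml under root/modules/ to its duplicate keys, if any."""
    hits = {}
    for path in sorted(Path(root).glob("modules/*/module.yaml")):
        found = find_duplicate_keys(path.read_text(encoding="utf-8"))
        if found:
            hits[str(path)] = found
    return hits


# Hypothetical manifest, made up for this example.
sample = """\
metadata:
  name: honcho-api
spec:
  serves_areas:
    - memory
  dependencies:
    upstream:
      - honcho-postgres
  serves_areas:
    - agents
"""
print(find_duplicate_keys(sample))  # prints [(9, 'serves_areas')]
```

Calling `scan_repo(".")` from the repo root would return a mapping of manifest paths to the repeated keys and their line numbers, which should be enough to open the fix as a small follow-up.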
W9 issue cleanup: closing as done/superseded.
Reason: Understand Anything spike is complete and recommendation recorded.
Evidence: Issue body acceptance criteria are checked; report exists at state/spike-understand-anything/01-recommendation.md with recommendation: one-off deep-onboarding tool, not default dependency; CodeGraph is the continuous-use compromise.
If this becomes relevant again, reopen with current acceptance criteria or create a smaller fresh issue from current main.