External review — claude opus 4.7 sharp pass on bootstrap state (2026-05-16) #17
Labels
No labels
domain:agents
domain:ci
domain:docs
domain:forgejo
domain:infra
domain:memory
domain:runtime
domain:signal
domain:ux
mode:operator-only
mode:patchwarden-iskra-approved
mode:safe-auto
priority:p0
priority:p1
priority:p2
priority:p3
review:claude-reviewed
review:codex-reviewed
review:dziadek-reviewed
review:needs-human
safety:external-write
safety:no-prod-mutation
safety:prod-impact
safety:secret-touch
status:blocked
status:codex-ready
status:merged:pending-evidence
status:needs-evidence
status:operator-needed
status:parked
tier:0-platform-substrate
tier:1-iskra-value-layer
tier:2-tools-products-modules
type:bug
type:chore
type:docs
type:feat
type:policy
type:research
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
pdurlej/fallow-ts#17
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
External, opinionated read of the fallow-ts bootstrap by
claude(Opus 4.7), at the operator's direction. Sister-project perspective — I'm the mandatory-non-author reviewer over onfallow-py. The point of this issue is not approval theatre. The point is to find the things that will hurt in three weeks if not addressed in the next three days.State reviewed: main @
7beeeca(Codex's three-wave roadmap commit) — full source + tests + docs + CI.What's good (anchor the rest in this)
fallow-rs/fallowconfusion before anyone asks. Most projects need a year + a bug report to land that paragraph..forgejo/, GitHub-native in.github/. Pattern lifted fromfallow-pyADR 0011 without re-debating it. Right move.@types/node+typescriptas devDeps. Pure stdlib analyzer. Strong starting posture.Now the spice.
Tier 1 — Real bugs in the v0 parser (verified by repro)
I ran the actual regex patterns from
src/analyze.tsagainst representative input. Three concrete bugs surfaced.B1. Single-quoted side-effect imports misclassified
src/analyze.tsline inparseImports:This checks only double-quote. Single-quoted side-effect imports stay classified as
static:Same semantic input, different output based on quote style → determinism claim violated at the kind-of-finding level. Existing test (
tests/analyze.test.mjs) only exercises double-quote, so the bug is silent.Fix: detect side-effect by absence of
from/{/identifier before the string, not by literal". Or simpler: rewrite asmatch[0].replace(/^\s*import\s+/, '').trim().match(/^['"]/). Add a single-quote fixture case to the test.B2. Multi-line imports completely invisible
The import pattern uses
[^\n]*?which cannot cross newlines:Real-world frequency: high. Most lint-formatted TS codebases auto-wrap
import { a, b, c, d } from "..."once the names cross the print-width:Actual
parseImportsoutput:The multi-line import from
bazis gone. The Wave 1 "contract credibility" goal cannot survive this — golden tests (Wave 1 #5) will lock in wrong output as canonical.Fix: either drop
[^\n]constraint and use a smarter pattern, or move to a tiny tokenizer that walks line-by-line and joins continuation lines until the closing quote. AST-based parsing (proper TypeScript compiler API or@typescript-eslint/typescript-estree) would solve this category entirely, at the cost of one runtime dep — possibly worth it before Wave 2, but a regex fix is fine for v0 if the limitation is documented.B3. Template-literal false positive
The
import foo from "./bar"lives inside a template literal (string data, not code). But the regex^\s*import\s+...withmflag matches it anyway — because at line 2 the literal text starts withimportafter just whitespace.parseImportsoutput:For a project whose pitch is "deterministic agent-readable evidence," emitting imports that don't exist in the program is the worst class of bug: agents will trust it, then make decisions on phantom edges.
Fix: strip template literals, line comments, and block comments before matching. Either pre-process the source (regex-based stripping has its own edge cases), or AST.
B4. Block-comment false positive (variant of B3)
Line 2 starts at column 0 with
import. Line-anchored regex matches. False positive. (Verified separately.)Line comments (
// import ...) and right-indented comment-internal lines (* import ...) do not trigger — the\s*only matches whitespace, not//*. Lucky pattern coincidence, not a deliberate design.Tier 2 — Structural concerns (determinism + governance + identity)
S1. Absolute root path in JSON output breaks reproducibility
src/analyze.ts:The
rootfield in the JSON is/Users/pd/...on operator's Mac,/home/runner/...in CI,/tmp/fallow-ts-...in tests. Same input → different output across machines.Wave 1 #5 (
golden-output-contract) will be brittle: every contributor's machine needs path-stripping in the golden assertion, or every golden file gets a per-machine diff that masks real drift.Fix options (pick one before Wave 1 #5 lands):
rootfrom output by default; expose it via--include-root-absoluteflag for debugging."."instead of the absolute path.JSON.parse(output.replace(absoluteRoot, "<root>"))).fallow-pyhas the same quiet drift; doesn't make it right forfallow-tsto inherit it.S2. No symlink-loop protection
src/analyze.tswalk()is unconditional recursion. A symlink cycle (a/ -> ../a/or evennode_modules/ -> ../) loops forever.fallow-pyADR 0002 documented exactly this class of hostile-input safety in Phase B (#6).fallow-tsis being born with the bugfallow-pyalready paid to learn.Fix: track visited real paths via
fs.realpath+ aSet<string>. Refuse to descend if the resolved real-path is already in the set. Worth a Wave 1 issue ("hostile-input-safety-v0") — it's table-stakes for a tool that any agent might point at an unknown checkout.S3. No graceful per-file error handling
Promise.all(files.map(indexFile))rejects on the first IO error. One unreadable file (permission denied, broken symlink, encoding issue) takes down the whole analysis.For an agent calling
fallow-tsfrom CI on an unfamiliar repo, full failure is worse than partial output. Prefer per-filePromise.allSettled+ accumulate failures inlimitations: [...]or aerrors: [...]field. Fail the run only if 0 files were indexed.S4. AGENTS.md is a stub vs fallow-py's runbook
fallow-ts/AGENTS.md= 1,133 bytesfallow-py/AGENTS.md= 17,685 bytesDelta isn't just length —
fallow-py/AGENTS.mddocuments:fallow-ts/AGENTS.mdhas none of these. As long as the repo is solo Codex pushing to main with no PRs, that's survivable. The moment a second agent (thisclaude, orglm, or a future cousin) wants to contribute, the governance vacuum becomes the bottleneck.Fix: port
fallow-py/AGENTS.mdsubstantially intact, withfallow-pyreferences swapped forfallow-tsequivalents. ~1 hour mechanical work. Not Wave-numbered but should land before the first non-author PR.S5. No
decisions/directory, no ADRsfallow-pyhas 12 numbered ADRs (Nygard format, immutable history). Includes the umbrella branding decision (ADR 0012) that birthedfallow-ts. None of that decision-record discipline transferred.docs/roadmap.mdanddocs/next-steps.mdare good but they are plans, not decisions. Plans change; decisions stay.Suggested first ADRs to land before Wave 1 work starts:
fallow-pyADR 0007 (bassist metaphor + harness vs agent). The metaphor transfers cleanly under the umbrella; the ADR makes it findable.fallow-pyADR 0009. Decides now (when it's cheap) that fallow-ts WILL adoptauto_safe/decision_needed/blockingrather than reinventing severity.fallow-pyADR 0010.ADRs in
decisions/NNNN-*.mdsurvive conversation compaction. Roadmap.md doesn't, structurally.S6. Schema naming drifts from fallow-py — umbrella becomes a mirage
schemas/pyfallow-fix-plan.schema.json(kebab)fallow_ts_report.v0(snake)"schema_version": "1.0"+"tool": "fallow""schema": "fallow_ts_report.v0"+"tool": "fallow-ts"1.0)v0literalThe umbrella brand is supposed to make these tools recognizably-related to downstream consumers. Today a single-repo CI viewer can't write one dispatcher for both reports without special-casing the schema field shape.
Fix: before Wave 1 #5 (golden output) locks the contract, decide:
schema_version(separate field) vsschema(combined identifier). Pick one umbrella convention.v0is fine for early alpha but commits the schema to that literal forever. Semver-shaped (1.0,1.1) gives room to grow."tool": "fallow"(umbrella) at the report level +"implementation": "fallow-ts"(specific) gives cross-language consumers a fighting chance.Write this as ADR 0003 (per S5) and update both
fallow-pyandfallow-tsto match. The umbrella branding ADR (fallow-py0012) implies this kind of cross-impl consistency without prescribing it; here's the prescription.Tier 3 — Real but non-blocking
T1. CLI exit-code semantics are a single-value smear
Current: 0 = success, 2 = anything-bad.
fallow-pyuses [0, 1, 2, 3] with distinct meanings (clean / findings-above-threshold / config-error / IO-error). When--fail-onlands in Wave 3 #15, the exit-code contract needs widening. Cheaper to write the convention now (in ADR 0001 or 0004) than to migrate users later.T2.
main()runs at module importsrc/cli.tsends withmain().catch(...)at top level. Meansimport { main } from "./cli.js"triggers execution — non-importable for testing without spawning subprocesses. Tests reflect this:cli.test.mjsusesspawnSync. Slower, and means any cli-internals test has to fork node.Fix: move bootstrap to a guard:
or split into
cli.ts(exportsmain) +bin.ts(just the bootstrap call). Lets future tests importmain()directly and call it with argv.T3. Hardcoded
SKIP_DIRECTORIESwill miss real ecosystem dirsCurrent:
[".git", "node_modules", "dist", "build", "coverage", ".next", ".turbo"].Missing real-world:
.svelte-kit,.astro,.cache,out,_site,target,.parcel-cache,.vinxi,.nuxt,.output,.vercel,.netlify,tmp. Hardcoded list also can't be overridden — what if a user legitimately wants to analyze a checked-indist/?Fix: make configurable via
.fallow-ts.toml(or JSON) with the current list as default. Documented as a config file path now means consumers can opt-out per project, and the same config file becomes the natural home for later Wave 3 settings (baseline path, fail-on thresholds).T4.
SOURCE_EXTENSIONSwon't see framework files.vue,.svelte,.astrocontain TS/JS code. README says "discover JS/TS files," which is technically honest about extensions but will surprise users with Svelte/Vue/Astro projects. Either:T5. No determinism canary test
For a project whose central claim is "same input → same output," there's no test that asserts byte-equality across two invocations. Trivial to write:
Lands the contract into CI. Catches future race conditions, Map iteration order regressions, etc.
T6. Unbounded
Promise.allconcurrency on huge reposPromise.all(files.map(indexFile))opens one read per file in parallel. On a 50k-file monorepo, that's 50k concurrent fd handles. Linux ulimit (1024 default) will EMFILE before it finishes. Bound with a worker pool (8-32 concurrent reads). Not urgent now; predictable failure mode at scale.T7. README's JSON example will lie
README shows a hand-written specimen of the JSON contract with specific module shape. Today the code emits that, but Wave 1 adds
package.jsonmetadata (#3), workspace detection (#4) — README example needs updating each time, or it drifts. Wave 1 #5 (golden-output-contract) + #6 (docs-contract-v0) covers it, but should be prioritized first in Wave 1, before any new fields land. Otherwise the README is the next thing to lie.T8. Missing
package.jsonmetadata for npm publishingNo
repository,bugs,homepagefields. Whennpm publishhappens, the npm page will be naked. Half a Wave 1 issue, not blocker.T9.
Math.random().toString(16).slice(2)in test fixturesReplace with
crypto.randomUUID(). Cosmetic.What I recommend the next 3-5 PRs look like
Not a roadmap rewrite — a re-ordering / addition before Wave 1 starts:
governance/agents-md + decisions-bootstrap— portfallow-py/AGENTS.mdsubstantially; adddecisions/0001-brand-identity.md,decisions/0002-three-bucket-commitment.md,decisions/0003-schema-naming-convention.md,decisions/0004-mandatory-non-author-reviewer.md. Pure governance. Zero source change. Survives compaction.fix/parser-bugs-v0— address B1 + B2 + B3 + B4 above with concrete fixture additions totests/analyze.test.mjsfirst (RED), then fix patterns (GREEN). Document remaining regex-parser limitations inREADME.md§ limitations and in the analyzer'slimitations: [...]field.harden/symlink-loop-and-per-file-errors— S2 + S3 from above. Addsfs.realpathvisited-set + per-file Promise.allSettled. Hostile-input safety floor.stabilize/deterministic-root-and-canary-test— S1 + T5. Normalize therootfield in JSON output; add the byte-equality canary test.Then Wave 1 #2-6 as Codex planned. The point of (1)-(4) is to make Wave 1's golden snapshot tests (#5) lock in correct behavior, not present-day-Codex behavior.
Closing read
The bootstrap is solid in posture and weak in surface area. Roadmap is honest, scope is honest, brand is honest. The parser is regex-first and has the bugs regex-first parsers always have — that's expected and documented as a known limitation. The structural concerns (S1-S6) matter more than the parser bugs, because the parser will get rewritten when Wave 2 needs a resolver; the absent governance won't fix itself.
The single most important thing to do this week is port AGENTS.md + create the first four ADRs. Without those, the next conversation's claude/codex/glm won't have the same anchor my
fallow-pyreview has — they'll re-derive everything from chat scrollback. That's exactly the slop posture the ADR convention exists to prevent.The second most important thing is B2 (multi-line imports) before any golden test locks in the contract.
Everything else can wait until Wave 1 starts, as long as it's tracked.
—
claude(Opus 4.7), Hermes-mode, fingers slightly burnt 🗡️