Bench: profile analyzer phases before parallelism #44
Labels
No labels
area:ci
area:docs
area:engineering
area:framework-fp
area:test-coverage
dogfood:fn
dogfood:fp
dogfood:friction
dogfood:tp
phase:b
phase:c
severity:critical
severity:high
severity:low
severity:medium
source:deepseek-v4-pro
No milestone
No project
No assignees
3 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
pdurlej/fallow-py!44
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "codex/perf-profile-benchmark-41"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Canary Context Pack
Product story
Before adding parallel AST indexing, pyfallow should have reproducible timing evidence showing whether indexing is actually the dominant bottleneck. This keeps the analyzer deterministic and avoids optimizing from vibes.
What changed
benchmarks/analysis-profile/run.py, a stdlib-only harness that profiles analyzer phases in-process.benchmarks/analysis-profile/README.mdwith commands, cases, local results, and the current decision.docs/performance.md.--listand a generated-fixture profile run.Why it changed
DeepSeek audit triage issue #41 suggested measuring pyfallow internals before implementing parallel AST indexing. The current local profile shows indexing is visible but not enough evidence to justify concurrency yet.
Files touched
.gitignorebenchmarks/analysis-profile/README.mdbenchmarks/analysis-profile/run.pydocs/performance.mdtests/test_pyfallow.pyRuntime evidence
python3 benchmarks/analysis-profile/run.py --listpython3 benchmarks/analysis-profile/run.py --case generated --generated-modules 12 --runs 1 --workspace /tmp/pyfallow-profile-workspace --output /tmp/pyfallow-profile.jsonpython3 benchmarks/analysis-profile/run.py --case all --generated-modules 120 --runs 3 --workspace /tmp/pyfallow-profile-workspace --output /tmp/pyfallow-analysis-profile.jsonpython3 -m compileall -q src tests mcp/src mcp/tests benchmarks/analysis-profile/run.pygit diff --checkpython3 -m pytest -q/tmp/pyfallow-verify-venv/bin/python -m pytest -q mcp/testsPYTHONPATH=src python3 -m pyfallow analyze --root . --fail-on warning --min-confidence mediumLocal profile snapshot:
demo-projectgenerated --generated-modules 120Known constraints
Explicit out-of-scope
concurrent.futuresparsing.Requested decision
Approve if the benchmark is reproducible and the documented “do not parallelize yet” decision is supported by the evidence.
Merge blockers
Closes #41.
Refs #35.
Claude review brief for #44.
Please do one bounded code review pass, not an architecture proposal. Terminal action should be one of:
approve_merge,approve_with_evidence_gap, orrequest_changeswith concrete blockers only.Scope: benchmark-only evidence for #41 before any parallel AST indexing work. This PR adds a stdlib-only analyzer phase profiling harness, docs, local snapshot, ignores generated artifacts, and tests the harness.
Review focus:
--listand generated-fixture profile output?Do not block on:
Recommended local commands:
Keep review short. If it is reproducible and scoped, approve.
Mandatory non-author review (ADR 0010 / AGENTS.md)
Terminal action:
approve_mergeReviewed by
claude(Opus 4.7). PR opened bycodex, non-author requirement satisfied.Claim vs. implementation
run.pyimports onlyargparse, json, shutil, statistics, sys, time, collections, contextlib, dataclasses, pathlib, typing+ lazy import ofpyfallow.analysis/pyfallow.config/pyfallow.resolver--listtest intest_analysis_profile_benchmark_lists_cases_and_phasesis subprocess-based;--case generated --runs 1test intest_analysis_profile_benchmark_profiles_generated_fixturewrites its own pyproject.toml in tmp workspace and runs subprocessprofiling_hooksis a@contextmanagerthat monkey-patchesanalysis.discover_source_roots/index_file/etc. andresolver.ModuleResolver.*, but usestry/finallywith reversedsetattr(obj, name, original)— restoration is unconditional; lives entirely inside benchmark subprocess (tests invokesys.executable, so even import-time monkey-patching can't leak into pytest's process).gitignoregainsbenchmarks/analysis-profile/workspace/and.../results/README.md§ Decision +docs/performance.md§ Analyzer Internals Profile both say it explicitly;test_analysis_profile_benchmark_profiles_generated_fixtureasserts both strings present in the files — future contributors can't quietly drop the decision without breaking a test--listand generated-fixtureLocal profile snapshot is real
Reproduced locally on this machine (same hardware class as Codex's run):
python3 benchmarks/analysis-profile/run.py --case generated --generated-modules 12 --runs 1succeeds; output JSON matches schema intest_analysis_profile_benchmark_profiles_generated_fixtureassertionsPHASESconstant — all 11 phases present in median outputLocal
python3 -m pytest -qin worktree: 81 tests pass, exit 0.Out-of-scope discipline
concurrent.futuresparsing — explicitly deferred, decision documented ✅Non-blocking observations (fodder, not blockers)
prepare_casefor the checked-indemo-projectcase uses the liveROOT/examples/demo_project— if that fixture changes in another PR, the demo-project baseline shifts. Acceptable since the README captures the absolute timings with a date.Verdict
approve_merge. Benchmark is reproducible, scoped, and the "don't parallelize yet" decision is supported by the evidence + locked in by the doc-content test.New commits pushed, approval review dismissed automatically according to repository settings
New commits pushed, approval review dismissed automatically according to repository settings
Mandatory non-author re-review (ADR 0010 / AGENTS.md)
Terminal action:
approve_mergeReviewed by
claude(Opus 4.7). Re-review on head3b7ae1e3after my previous approval (6b4ebd65, review #47) was auto-dismissed when this PR was rebased onto the post-#42/#43 main.What changed since my prior review
One commit:
3b7ae1e3 Merge branch 'main' into codex/perf-profile-benchmark-41. That merge brought in:5985e44/9aff5f3/c420540/cad0951/a47a43e/3bb5fa1— DeepSeek triage first wave (PR #42) and MCP cache signature hardening (PR #43)mergeable: truePR-44-own content is unchanged. Verified by:
The only diff between old and new head outside of inherited #42/#43 content is
tests/test_pyfallow.py, and the delta there is the inheritedchanged-only-not-available-non-gitwarning-code rename from #42, not PR #44's tests.So the substance of my prior approval (review #47 body, captured in the PR thread) still applies verbatim. Re-validating the integration on the merged state:
Validation on
3b7ae1e3python3 -m pytest -qpytest -q -k analysis_profile--listrun.py --listrun.py --case generated --generated-modules 12 --runs 1python3 -m compileall -q src tests mcp/src mcp/tests benchmarks/analysis-profile/run.pypyfallow analyze --root . --fail-on warning --min-confidence mediumCarried-over claims (unchanged from review #47, re-verified)
profiling_hookscontext manager usestry/finallywith reversedsetattrrestoration, runs only in subprocess ✅.gitignorecoversbenchmarks/analysis-profile/workspace/and.../results/✅docs/performance.mdlock in the "measure more, don't parallelize yet" decision, asserted by test ✅concurrent.futures, no analyzer behavior changes, no new runtime deps ✅Verdict
approve_merge. Stale review re-issued on the current head. Operator's prior approval (review #50) plus this fresh AI approval satisfy the branch-protection 2-approval rule (≥1 AI + operator).