Evaluate Ponytail as a possible Patchwarden simplicity-review dependency #123


Closed

opened 2026-06-23 18:30:55 +02:00 by codex · 0 comments

codex commented

2026-06-23 18:30:55 +02:00

Collaborator

Context

Ponytail is an MIT-licensed “lazy senior dev” ruleset/tooling repo:
https://github.com/DietrichGebert/ponytail

It provides cross-agent instructions, hooks, skills, and MCP support around a simple code-minimization ladder:

1. Do not build what is not needed.
2. Reuse existing codebase patterns/helpers.
3. Prefer stdlib.
4. Prefer native platform features.
5. Prefer already-installed dependencies.
6. Prefer one-line/minimal implementation.
7. Only then write the minimum new code that works.

Relevant Ponytail sources:

Core repo/README: https://github.com/DietrichGebert/ponytail
Root AGENTS.md ruleset: https://raw.githubusercontent.com/DietrichGebert/ponytail/main/AGENTS.md
ponytail skill: https://raw.githubusercontent.com/DietrichGebert/ponytail/main/skills/ponytail/SKILL.md
ponytail-review skill: https://raw.githubusercontent.com/DietrichGebert/ponytail/main/skills/ponytail-review/SKILL.md
MCP adapter: https://github.com/DietrichGebert/ponytail/tree/main/ponytail-mcp

Why this may matter for Patchwarden

Patchwarden already has a clean architectural slot for this kind of signal:

reviewer lanes are sensors only;
deterministic policy remains the merge-safety authority;
external tools can produce artifacts that Patchwarden normalizes or consumes;
runtime dependency binding is sensitive and should be evaluated separately from “useful inspiration”.

This is adjacent to existing deterministic quality work: generated-artifact, security-path, dependency-risk, content/slop sensors, structural-code artifacts, and review-quorum.

Ponytail’s most interesting contribution is not its hooks/plugin runtime. It is the complexity taxonomy:

reinvented stdlib;
unnecessary dependency;
native platform feature ignored;
speculative abstraction;
one-implementation interface/factory/config;
dead flexibility;
shrinkable logic.

That maps naturally to a possible Patchwarden “simplicity / complexity-budget” review lane or later deterministic module.

Dependency question

Do not assume Ponytail should become a runtime dependency.

Evaluate these options explicitly:

Option A — Inspiration only

Lift the taxonomy into Patchwarden reviewer prompts/docs. No runtime dependency, no Node requirement, no hooks.

Likely best first step.

Option B — External workflow/tool input

Run Ponytail or a Ponytail-style review outside Patchwarden and map output into patchwarden.review_artifact.v1.

Useful only if Ponytail exposes a stable machine-readable contract.
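To make the "stable machine-readable contract" requirement concrete, here is a minimal sketch of what a normalizer for Option B might look like. Only the schema name `patchwarden.review_artifact.v1` comes from this issue; the field names (`lane`, `advisory`, `findings`, `kind`, `path`, `message`) and the shape of the external input are hypothetical, since neither Ponytail's output format nor Patchwarden's artifact schema is specified here.

```python
import json
from dataclasses import asdict, dataclass

# Only the schema name is taken from the issue. Every field below is an
# illustrative assumption, not Patchwarden's real artifact shape.
SCHEMA = "patchwarden.review_artifact.v1"


@dataclass(frozen=True)
class Finding:
    kind: str
    path: str
    message: str


def to_review_artifact(lane: str, raw_findings: list) -> dict:
    """Normalize externally produced findings into a single artifact dict.

    Raises KeyError if an input finding lacks 'kind' or 'path', which is the
    point: without a stable contract the normalizer cannot be trusted.
    """
    findings = [
        Finding(kind=f["kind"], path=f["path"], message=f.get("message", ""))
        for f in raw_findings
    ]
    return {
        "schema": SCHEMA,
        "lane": lane,
        "advisory": True,  # sensors never carry merge authority
        "findings": [asdict(f) for f in findings],
    }


# Hypothetical output from a Ponytail-style review run outside Patchwarden.
external_output = [
    {
        "kind": "unnecessary_dependency",
        "path": "package.json",
        "message": "padding helper duplicates a built-in string method",
    }
]
artifact = to_review_artifact("simplicity_review", external_output)
print(json.dumps(artifact, indent=2))
```

If the external tool cannot guarantee at least the required keys across versions, the strict lookup fails loudly, which is one way to test whether Option B is viable at all.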

Option C — MCP/plugin dependency

Probably not for Patchwarden core. Ponytail’s MCP/plugin story is useful for operator/agent context, but Patchwarden should not depend on always-on agent adapters.

Option D — Runtime dependency

Reject by default. It would add Node/tooling surface and collide with Patchwarden’s current dependency discipline, so reconsider it only if the value is clearly proven.

Proposed first slice

Create a default-off reviewer lane, tentatively:

[review_lanes.simplicity_review]
enabled = false
required_for = []
cloud_review = "allowed"
provider = "ollama"

Expected finding types:

unnecessary_dependency
stdlib_reimplementation
native_feature_available
speculative_abstraction
dead_flexibility
shrinkable_logic
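The seven taxonomy classes listed earlier do not map one-to-one onto these six finding types. The sketch below shows one possible mapping as a closed vocabulary. Folding "one-implementation interface/factory/config" into `speculative_abstraction` is an assumption made here for illustration; the issue does not say where that class should land.

```python
from enum import Enum


class FindingType(str, Enum):
    UNNECESSARY_DEPENDENCY = "unnecessary_dependency"
    STDLIB_REIMPLEMENTATION = "stdlib_reimplementation"
    NATIVE_FEATURE_AVAILABLE = "native_feature_available"
    SPECULATIVE_ABSTRACTION = "speculative_abstraction"
    DEAD_FLEXIBILITY = "dead_flexibility"
    SHRINKABLE_LOGIC = "shrinkable_logic"


# Ponytail taxonomy class -> proposed Patchwarden finding type.
TAXONOMY_TO_FINDING = {
    "reinvented stdlib": FindingType.STDLIB_REIMPLEMENTATION,
    "unnecessary dependency": FindingType.UNNECESSARY_DEPENDENCY,
    "native platform feature ignored": FindingType.NATIVE_FEATURE_AVAILABLE,
    "speculative abstraction": FindingType.SPECULATIVE_ABSTRACTION,
    # Assumption: no dedicated finding type exists for this class yet.
    "one-implementation interface/factory/config": FindingType.SPECULATIVE_ABSTRACTION,
    "dead flexibility": FindingType.DEAD_FLEXIBILITY,
    "shrinkable logic": FindingType.SHRINKABLE_LOGIC,
}


def parse_finding_type(raw: str) -> FindingType:
    """Reject anything outside the closed vocabulary (raises ValueError)."""
    return FindingType(raw)


print(parse_finding_type("dead_flexibility").name)
```

Keeping the vocabulary closed means an LLM reviewer that invents a new label produces a parse error rather than an unclassified finding.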

Initial behavior:

advisory only;
no automerge hard-blocking;
no approval or merge authority;
no new runtime dependency;
no Ponytail hooks installed;
use current review artifact/quorum shape if possible.

Promote a finding to a blocker only after dogfood calibration.
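The advisory-only behavior can be sketched as a merge decision in which lane findings are recorded but cannot change the outcome until they are explicitly promoted. The names `LaneFinding`, `PROMOTED_BLOCKERS`, and `merge_decision` are hypothetical and do not describe Patchwarden's actual policy engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LaneFinding:
    lane: str
    kind: str
    message: str


# Empty on purpose: nothing blocks until dogfood calibration promotes it.
# Entries would be (lane, kind) pairs.
PROMOTED_BLOCKERS = frozenset()


def merge_decision(deterministic_policy_ok: bool, findings: list) -> dict:
    """Deterministic policy decides; lane findings only annotate."""
    blocking = [f for f in findings if (f.lane, f.kind) in PROMOTED_BLOCKERS]
    notes = [
        f"{f.lane}/{f.kind}: {f.message}"
        for f in findings
        if (f.lane, f.kind) not in PROMOTED_BLOCKERS
    ]
    return {
        "mergeable": deterministic_policy_ok and not blocking,
        "advisory_notes": notes,
    }


findings = [
    LaneFinding("simplicity_review", "shrinkable_logic", "loop could be a comprehension"),
]
decision = merge_decision(True, findings)
print(decision)
```

With an empty promotion set, the lane has no merge authority in either direction: it cannot block a change that policy allows, and it cannot approve one that policy rejects.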

Questions to answer

Is Ponytail’s taxonomy useful enough to become a Patchwarden lane/policy vocabulary?
Should this stay LLM-suspicion only, or can some classes become deterministic checks?
Which findings are safe as blockers, if any?
Can this reuse existing review-run prompt plumbing, or does Patchwarden need lane-specific prompt schemas?
Does this overlap with current content_slop_sensor, dependency_risk_sensor, or structural-code gate?
Is there any license/patent concern in borrowing vocabulary from an MIT repo into Apache-2.0 Patchwarden docs/prompts?
What dogfood threshold proves value? Candidate: 10-20 PRs where the lane catches real complexity without causing false blockers.
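One way to make the dogfood threshold checkable is sketched below. It reads the candidate as "at least 10 PRs with a real catch and zero false blockers across the sample". Both the lower bound of 10 and the zero-tolerance rule are interpretations of the candidate above, not settled criteria, and `PrOutcome` is a hypothetical record type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrOutcome:
    pr: int
    caught_real_complexity: bool
    false_blocker: bool


def dogfood_passes(outcomes: list, min_useful_prs: int = 10) -> bool:
    """True when enough PRs had a real catch and none had a false blocker."""
    useful = sum(1 for o in outcomes if o.caught_real_complexity)
    false_blockers = sum(1 for o in outcomes if o.false_blocker)
    return useful >= min_useful_prs and false_blockers == 0


# Illustrative sample data only, not real dogfood results.
sample = [PrOutcome(pr=n, caught_real_complexity=True, false_blocker=False) for n in range(1, 13)]
print(dogfood_passes(sample))
```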

Acceptance criteria

Produce a short decision note: Ponytail as inspiration, external tool, MCP/plugin, runtime dependency, or reject.
If useful, open a follow-up implementation issue for a default-off simplicity_review lane.
If rejected, document why and close this as analyzed.
Do not add Ponytail as a runtime dependency in this issue.


codex added the

labels

2026-06-23 18:42:47 +02:00

codex referenced this issue

2026-06-23 18:44:49 +02:00

docs(policy): decide Ponytail simplicity posture #125