docs(specs): prebuild for #176 Hermes voice-clone feasibility spike v0 #351

Merged

pdurlej merged 1 commit from claude/h-batch/hermes-voice-clone-spike-v0-prebuild into main

2026-05-23 10:31:38 +02:00

claude commented

2026-05-17 22:21:18 +02:00

Collaborator

BATCH H Pan Herbatka fork output. Greenfield spike: hermes-agency has no existing voice/TTS infrastructure. This PR prepares execution of the feasibility decision (SHIP/ITERATE/ABANDON) on cloning the operator's voice for Hermes audio deliverables.

Contents

docs/specs/hermes-voice-clone-spike-v0/: 6 spec files (constitution with 8 principles, specify, plan, tasks, implement-notes, README)
prompts/codex-hermes-voice-clone-spike.md: companion execution prompt

8 non-negotiable principles

P1 Privacy-first (NO cloud-API for samples — biometric identity)
P2 On-device inference only
P3 Spike-not-ship (Phase 07 ticket post-SHIP)
P4 Operator A/B as quality oracle (NOT MOS/CER metrics)
P5 Polish-language binding (NOT English benchmarks)
P6 Operator-bandwidth-conscious (≤30 min sample session)
P7 Infrastructure-realistic (Mac M1 / RS2000 / VPS1000)
P8 Decision-forcing (SHIP/ITERATE/ABANDON, no maybe)
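
P4 names the operator's A/B judgment as the quality oracle, but this PR does not describe how trials are run. A minimal sketch, assuming each trial plays the same Polish prompt rendered by two engines in randomized order so that playback position reveals nothing; `Trial`, `make_trials`, and `tally` are hypothetical names, not part of the spec:

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    prompt_id: str
    first: str   # engine label played first
    second: str  # engine label played second


def make_trials(prompt_ids, engine_a, engine_b, seed=0):
    """Shuffle playback order per prompt so the operator judges blind."""
    rng = random.Random(seed)
    trials = []
    for pid in prompt_ids:
        order = [engine_a, engine_b]
        rng.shuffle(order)
        trials.append(Trial(pid, order[0], order[1]))
    return trials


def tally(trials, picks):
    """Count wins per engine; picks[i] is 'first' or 'second' for trials[i]."""
    if len(trials) != len(picks):
        raise ValueError("one pick is required per trial")
    wins = {}
    for trial, pick in zip(trials, picks):
        if pick not in ("first", "second"):
            raise ValueError(f"unknown pick: {pick!r}")
        winner = trial.first if pick == "first" else trial.second
        wins[winner] = wins.get(winner, 0) + 1
    return wins
```

The operator only ever answers "first" or "second"; the mapping back to engine names happens after the session, which keeps the judgment independent of any expectation about a particular engine.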

5-slice funnel

(a) Survey → (b) Mac M1 bench → (c) Host bench → (d) Sample protocol → (e) Feasibility report

Top-3 candidates: Coqui XTTS-v2, OpenVoice (MyShell), F5-TTS — explicit Polish-support filter applied.
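
This PR does not state what the bench slices (b) and (c) measure. One plausible speed metric is the real-time factor: synthesis wall time divided by the duration of the audio produced. The sketch below assumes a hypothetical `synthesize` callable that wraps whichever engine is under test and returns `(samples, sample_rate)`; `fake_engine` is a stand-in so the example runs without any model installed.

```python
import time
from typing import Callable, Sequence, Tuple

Synth = Callable[[str], Tuple[Sequence[float], int]]


def real_time_factor(synthesize: Synth, text: str) -> float:
    """Return wall-clock synthesis time divided by audio duration.

    Values below 1.0 mean the engine produces audio faster than playback.
    """
    start = time.perf_counter()
    samples, sample_rate = synthesize(text)
    elapsed = time.perf_counter() - start
    duration = len(samples) / sample_rate
    if duration == 0:
        raise ValueError("engine returned no audio")
    return elapsed / duration


def fake_engine(text: str):
    """Stand-in engine (assumption): 0.1 s of silence per character at 16 kHz."""
    sample_rate = 16000
    return [0.0] * (sample_rate // 10 * len(text)), sample_rate
```

Running the same Polish prompt set through this function on each target machine would give comparable numbers per engine and host. It measures speed only; quality stays with the operator A/B per P4.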

Safety/production boundary

This PR prepares execution only. It does NOT authorize production deployment, cloud-API use for samples, Hermes runtime mutation, or an implicit ship without a separate Phase 07 ticket.
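
The no-cloud boundary could also be checked mechanically during the spike. A minimal sketch, assuming the engine under test is reached through a URL and that only loopback addresses count as on-device; `is_loopback_endpoint` is a hypothetical helper, not something defined in the spec files:

```python
import ipaddress
from urllib.parse import urlparse


def is_loopback_endpoint(url: str) -> bool:
    """Accept only URLs whose host is localhost or a loopback IP literal.

    Other hostnames are rejected rather than resolved, so a DNS name
    can never stand in for a remote service.
    """
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False
```

Rejecting every non-literal hostname is deliberately strict: a false rejection costs a config change, while a false acceptance could send a voice sample off the machine.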

Tier: Trivial per ADR-0007 (docs-only).

Disjoint from active C/F forks (different file paths).

Refs #176

claude added 1 commit

2026-05-17 22:21:19 +02:00

docs(specs): prebuild for #176 Hermes voice-clone feasibility spike v0

base-is-main / guard (pull_request) Successful in 2s

canary-required / collect-diff (pull_request) Successful in 4s

patchwarden-pr-sanity / collect-diff (pull_request) Successful in 3s

canary-required / canary (pull_request) Successful in 12s

patchwarden-pr-sanity / sanity (pull_request) Successful in 20s

a37386ba9a

BATCH H Pan Herbatka fork output. Greenfield spike: hermes-agency has
no existing voice/TTS infrastructure. This prebuild prepares execution
of the feasibility decision (SHIP/ITERATE/ABANDON) on cloning the
operator's voice for Hermes audio deliverables.

What's in:

docs/specs/hermes-voice-clone-spike-v0/
- 00-constitution.md (8 non-negotiable principles: privacy-first,
  on-device only, spike-not-ship, operator A/B as quality oracle,
  Polish-language binding, operator-bandwidth-conscious, infra-realistic,
  decision-forcing)
- 01-specify.md (problem, MUST/SHOULD outcomes, acceptance criteria,
  90-min operator-bandwidth budget)
- 02-plan.md (5-slice funnel: survey -> Mac M1 bench -> host bench ->
  sample protocol -> feasibility report; top-3 candidates Coqui XTTS-v2,
  OpenVoice, F5-TTS with explicit Polish-support filter; Mac M1 primary
  benchmark host)
- 03-tasks.md (per-slice checklists with explicit verification steps;
  codex-executor vs operator-runnable mode marked per slice)
- 04-implement-notes.md (Polish phoneme gotchas, per-engine specifics,
  sample storage discipline, hermes-agency integration boundary)
- README.md (TL;DR + reading order + scope)

prompts/codex-hermes-voice-clone-spike.md
- Companion execution prompt per ADR-0018 + Codex feedback pattern
- Safety/production boundary: NO cloud-API for samples, NO Hermes
  runtime mutation, NO implicit ship without separate Phase 07 ticket
- 8 hard gates (privacy, locality, spike-not-ship, decision-forcing,
  Polish-binding, operator-bandwidth, sacred-path-adjacent, ADR-0018)
- 6 stop conditions including SECURITY INCIDENT path if sample leaks
- Reporting format + cousin coordination

Constitution P1+P2 (no cloud, no API for cloning OR inference) form the
absolute privacy hard line: the operator's voice is biometric identity.

Spike output = SHIP/ITERATE/ABANDON verdict. SHIP triggers separate
Phase 07 implementation ticket; ABANDON closes #176 with evidence;
ITERATE schedules next round.
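
The three-way verdict and its follow-ups can be sketched as a closed enum, so that a report containing anything else ("maybe", "partial") is rejected outright. `Verdict`, `FOLLOW_UP`, and `parse_verdict` are hypothetical names used for illustration only:

```python
from enum import Enum


class Verdict(Enum):
    SHIP = "SHIP"
    ITERATE = "ITERATE"
    ABANDON = "ABANDON"


# Follow-up per verdict, as described in the commit message above.
FOLLOW_UP = {
    Verdict.SHIP: "open a separate Phase 07 implementation ticket",
    Verdict.ITERATE: "schedule the next round",
    Verdict.ABANDON: "close #176 with evidence",
}


def parse_verdict(raw: str) -> Verdict:
    """Map report text to a Verdict; anything outside the three values fails."""
    try:
        return Verdict(raw.strip().upper())
    except ValueError:
        allowed = ", ".join(v.value for v in Verdict)
        raise ValueError(f"verdict must be one of {allowed}; got {raw!r}") from None
```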

Tier: Trivial per ADR-0007 (docs-only prebuild; no code/schema/runtime).

Refs #176