fix(honcho): route memory LLM defaults to qwen #699

Merged
pdurlej merged 3 commits from codex/690-honcho-qwen-fix into main 2026-06-03 00:13:07 +02:00
Collaborator

Canary status: missing - fire canary 3+3 manually before merge

Summary

This PR takes over the repo-source slice from draft PR #690 and switches Honcho's default Ollama/OpenAI-compatible text model contract from gemma4:31b-cloud to qwen3.5.

It updates the source defaults, DR sandbox defaults, synthetic compatibility smoke, and active runbook/closeout docs. It does not restart production Honcho, does not edit RS2000 live override files, and does not write to Infisical.

Refs #293, #357, #363, #690.

Canary Context Pack

Product story

Iskra/OpenClaw memory derivation must stop depending on a model that has been observed returning Markdown prose where Honcho needs JSON-compatible structured output. The repo source of truth should point future deploys and DR restores at the model already proven in the local classifier lane.

What changed

  • Changed Honcho LLM role defaults in compose/apps/compose.yaml from gemma4:31b-cloud to qwen3.5.
  • Aligned W3D local sandbox restore defaults with the same model.
  • Updated the Honcho Ollama compatibility smoke default/schema/prompt for Qwen while retaining the legacy script filename.
  • Updated tests for the new default model contract.
  • Updated active runbook, closeout plan, context MAP descriptions, and the incident packet with repo-source findings.

Why it changed

Draft PR #690 documents a live P1 symptom: Honcho derivation has been failing because the current Gemma route returns non-JSON prose. The source-of-truth repo needs to stop recreating that route before any runtime reconcile/restart happens.

Files touched

  • compose/apps/compose.yaml
  • scripts/dr/w3d-local-sandbox-drill.sh
  • scripts/honcho/ollama-gemma-compat-smoke.py
  • control-plane/platformctl/tests/test_honcho_ollama_contract.py
  • tests/test_honcho_log_privacy.py
  • runbooks/honcho-ollama-gemma-switch.md
  • state/cutover/honcho-closeout-plan.md
  • contexts/persona-bridge/MAP.md
  • contexts/observability/MAP.md
  • docs/incidents/2026-06-02-honcho-qwen-fix.md

Relevant context

  • Draft PR #690 incident packet.
  • Platform issue #357 for embedding-space migration boundaries.
  • Platform issue #363 / Memory OS spine context.
  • Existing Honcho Ollama provider switch runbook and closeout plan.

Runtime evidence

Repo-only evidence in this PR:

  • PYTHONPATH=control-plane python3 -m pytest control-plane/platformctl/tests/test_honcho_ollama_contract.py tests/test_honcho_log_privacy.py passed.
  • git diff --check passed.
  • bash -n scripts/dr/w3d-local-sandbox-drill.sh passed.
  • PYTHONPATH=control-plane python3 -m platformctl.cli validate all --json passed with exitCode: 0.

This PR performed no production restart, no live Infisical write, and no live Honcho derivation proof.

Known constraints

The live acceptance criteria from #690 still require a gated runtime step: reconcile Infisical and RS2000 live overrides, restart only home-platform-honcho-deriver-1 and home-platform-honcho-api-1, then prove valid JSON derivation and backlog progress.

Explicit out-of-scope

  • No production Honcho restart.
  • No live override edit on RS2000.
  • No Infisical write.
  • No live Postgres query or migration.
  • No embedding-space migration.
  • No attempt to close #357.

Requested decision

Approve this as the source-of-truth repo fix needed before the runtime cutover/reconcile step.

Merge blockers

  • If reviewers find another active Honcho source-of-truth still defaulting to gemma4:31b-cloud.
  • If the Qwen tag is disputed; current local evidence points to exact model tag qwen3.5 from the available classifier integration checkout.
  • If canary finds this should be split from the DR/runbook alignment.
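
The first blocker above can be checked mechanically before review. The following is a minimal sketch, assuming a plain-text scan of the checkout is enough; the function name `find_legacy_model_refs` and the suffix filter are hypothetical and not part of this PR. Hits in incident packets or historical docs may be legitimate mentions rather than active defaults, so the output still needs a human read.

```python
from pathlib import Path

LEGACY_TAG = "gemma4:31b-cloud"  # old default named in this PR


def find_legacy_model_refs(root, suffixes=(".yaml", ".yml", ".sh", ".py", ".md")):
    """Return (path, line_number, line) for every line mentioning LEGACY_TAG.

    Hypothetical helper: scans text files under `root` whose suffix is in
    `suffixes`. It does not distinguish active defaults from historical notes.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if LEGACY_TAG in line:
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling `find_legacy_model_refs(".")` from the repository root returns a list that a reviewer can triage file by file.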

Spec sources read

  • docs/incidents/2026-06-02-honcho-qwen-fix.md - incident packet and acceptance constraints.
  • compose/apps/compose.yaml - Honcho LLM defaults source of truth.
  • scripts/dr/w3d-local-sandbox-drill.sh - DR/sandbox restore default contract.
  • scripts/honcho/ollama-gemma-compat-smoke.py - synthetic provider compatibility smoke.
  • control-plane/platformctl/tests/test_honcho_ollama_contract.py - Honcho model/default tests.
  • tests/test_honcho_log_privacy.py - model metadata in log privacy fixture.
  • runbooks/honcho-ollama-gemma-switch.md - operator runbook.
  • state/cutover/honcho-closeout-plan.md - active closeout plan.
  • contexts/persona-bridge/MAP.md and contexts/observability/MAP.md - context navigation labels.
  • /Users/pd/Developer/books-for-iskra/iskra-openclaw-integration/scripts/iskra-fastmail-dmz-classifier.py - exact existing Qwen model tag evidence.
fix(honcho): route memory LLM defaults to qwen
Some checks failed
canary-required / collect-diff (pull_request) Successful in 5s
platformctl plan / auto-apply scope (pull_request) Successful in 20s
patchwarden-client-dry-run / dry-run (pull_request) Successful in 21s
patchwarden-client-dry-run / collect-diff (pull_request) Successful in 4s
patchwarden-pr-sanity / sanity (pull_request) Has been cancelled
base-is-main / guard (pull_request) Waiting to run
patchwarden-pr-sanity / collect-diff (pull_request) Waiting to run
pyfallow / Pyfallow gate (control-plane) (pull_request) Successful in 19s
python-ci / Python 3.11 (pull_request) Successful in 41s
python-ci / Python 3.12 (pull_request) Successful in 46s
python-ci / Python 3.13 (pull_request) Successful in 46s
canary-required / canary (pull_request) Successful in 14s
c094cd8175
chore(ci): retrigger honcho qwen checks
All checks were successful
base-is-main / guard (pull_request) Successful in 1s
canary-required / collect-diff (pull_request) Successful in 4s
pyfallow / Pyfallow gate (control-plane) (pull_request) Successful in 18s
python-ci / Python 3.11 (pull_request) Successful in 41s
python-ci / Python 3.12 (pull_request) Successful in 42s
patchwarden-client-dry-run / collect-diff (pull_request) Successful in 3s
patchwarden-pr-sanity / collect-diff (pull_request) Successful in 3s
platformctl plan / auto-apply scope (pull_request) Successful in 19s
python-ci / Python 3.13 (pull_request) Successful in 41s
canary-required / canary (pull_request) Successful in 13s
patchwarden-client-dry-run / dry-run (pull_request) Successful in 17s
patchwarden-pr-sanity / sanity (pull_request) Successful in 2m51s
deb5081804
Author
Collaborator

Patchwarden PR sanity

  • Status: advisory_findings
  • PR: 699
  • Commit: deb5081804ccee2c3dbbbc04379d37ae4eba3fd2
  • Security-sensitive label: missing
  • Authority: advisory model review plus deterministic blockers only
  • 3+3 canary: still alive; this does not replace it

Deterministic findings

No deterministic findings.

Model reviewers

global-glm / glm-5.1:cloud

  • Status: ok

  • Verdict: OK

  • medium Legacy filenames may confuse future operators searching for Qwen references

    • Evidence: scripts/honcho/ollama-gemma-compat-smoke.py retains 'gemma' in filename while testing qwen3.5; runbooks/honcho-ollama-gemma-switch.md retains 'gemma' in filename while documenting Qwen switch. Diff shows schema changed to 'honcho_ollama_qwe
    • Next: Consider adding a prominent comment at the top of ollama-gemma-compat-smoke.py explaining the legacy filename retention, or create a symlink ollama-qwen-compat-smoke.py -> ollama-gemma-compat-smoke.py for discoverability. The runbook filename is lower risk since it's documentation, but a similar not

global-deepseek / deepseek-v4-pro:cloud

  • Status: ok

  • Verdict: OK

  • low Model tag qwen3.5 not verified against Ollama Cloud catalog

    • Evidence: compose/apps/compose.yaml: all Honcho model defaults changed to qwen3.5; incident doc references local classifier checkout but no live Ollama Cloud tag confirmation.
    • Next: Before runtime cutover, confirm that qwen3.5 is the exact tag available on Ollama Cloud and that it supports the required structured output for all Honcho roles.
  • low Smoke script filename still references Gemma

    • Evidence: scripts/honcho/ollama-gemma-compat-smoke.py: filename unchanged while content now defaults to Qwen; runbooks still reference this filename.
    • Next: Plan a follow-up rename of the script to ollama-qwen-compat-smoke.py after verifying no remote automation hardcodes the old name.
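
The tag-verification finding above could be turned into a small pre-cutover check. This is a sketch under a stated assumption: that the catalog can be fetched in the same shape as the local Ollama `GET /api/tags` response (a `models` list whose entries carry a `name`). Whether Ollama Cloud exposes an equivalent listing is not confirmed in this PR, and `tag_in_catalog` is a hypothetical helper. It only confirms that the tag exists, not that the model supports structured output.

```python
def tag_in_catalog(payload, tag):
    """Return True if `tag` names a model in a /api/tags-style payload.

    Assumption: `payload` looks like {"models": [{"name": "..."}, ...]}.
    A tag without an explicit ":suffix" is also matched against ":latest".
    """
    wanted = {tag} if ":" in tag else {tag, f"{tag}:latest"}
    names = {entry.get("name") for entry in payload.get("models", [])}
    return bool(wanted & names)
```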

redteam / kimi-k2.6:cloud

  • Status: ok

  • Verdict: NOT_OK

  • high Compatibility smoke falls back to plain prompt, masking JSON structured-output failures

    • Evidence: scripts/honcho/ollama-gemma-compat-smoke.py chat_test only validates a plain string echo (HONCHO_QWEN_OK) and does not enforce JSON schema; control-plane/platformctl/tests/test_honcho_ollama_contract.py test_ollama_smoke_json_test_falls_bac
    • Next: Remove the plain-prompt fallback from smoke.json_test (or add a separate structured-output smoke that hard-fails on non-JSON) and update the runbook acceptance criteria to require 100% valid JSON before DR/closeout.
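
One way to read the "hard-fails on non-JSON" suggestion above is a strict parser with no fallback path. The sketch below is illustrative only and is not the code in `scripts/honcho/ollama-gemma-compat-smoke.py`; `require_json_object`, `NonJsonOutputError`, and the `required_keys` parameter are hypothetical names.

```python
import json


class NonJsonOutputError(ValueError):
    """Raised when model output is not a single JSON object."""


def require_json_object(raw, required_keys=()):
    """Parse raw model output strictly; raise instead of falling back.

    Markdown prose, fenced JSON, JSON arrays, and objects missing any of
    `required_keys` are all rejected.
    """
    try:
        parsed = json.loads(raw)
    except (TypeError, json.JSONDecodeError) as exc:
        raise NonJsonOutputError(f"output is not valid JSON: {exc}") from exc
    if not isinstance(parsed, dict):
        raise NonJsonOutputError("output is JSON but not an object")
    missing = [key for key in required_keys if key not in parsed]
    if missing:
        raise NonJsonOutputError(f"missing required keys: {missing}")
    return parsed
```

A smoke built on a check like this would exit non-zero on the first `NonJsonOutputError`, so a plain-string echo could not pass as a structured-output success.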

Policy notes

  • GLM 5.1 + DeepSeek V4 Pro are the operator-required model mix for this bot.
  • Optional red-team model is enabled only when PLATFORMCTL_PR_SANITY_REDTEAM_MODEL is configured.
  • Auto-merge is not enabled here.
pdurlej approved these changes 2026-06-03 00:12:39 +02:00
pdurlej left a comment

Operator-delegated approval via temporary admin lane: PR #699 is repo-source only, all required checks are green, no runtime restart, no Infisical write, no live override edit.

pdurlej deleted branch codex/690-honcho-qwen-fix 2026-06-03 00:13:08 +02:00
Reference
pdurlej/platform!699