gemini(w4): long-history flow metrics window fixture #122

Open
opened 2026-05-28 01:20:42 +02:00 by codex · 2 comments
Collaborator

Parent: #2
Agent lane: Gemini 3.5 Flash
Wave: 4 / analytics foundation
Risk class: low

Goal

Prevent transitionLimit from hiding recent completions in long histories.

Context refs

  • Open issue #57
  • docs/agent-mcp-contract.md flow metrics

Scope

  • Build fixture with long old transition history plus recent completions.
  • Assert since/until is applied at DB/query layer before safety cap.
  • Keep fixture cheap.

Acceptance

  • Recent window metrics are correct despite old transitions.
  • transitionLimit remains a cap, not a semantic filter.
  • Test would fail on the historical bug.
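The ordering requirement can be sketched in isolation from the database. The block below is a minimal in-memory illustration, not the repository's code: `TransitionRow`, `FlowWindow`, `windowThenCap`, `capThenWindow`, and `buildFixture` are hypothetical names, and the cap-then-window ordering is only an assumption about what the historical bug looked like.

```typescript
// Hypothetical row shape; the real schema in the repo may differ.
interface TransitionRow {
  cardId: string;
  toStatus: string;
  at: Date;
}

// Hypothetical window type standing in for the since/until parameters.
interface FlowWindow {
  since: Date;
  until: Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;

function inWindow(row: TransitionRow, w: FlowWindow): boolean {
  const t = row.at.getTime();
  return t >= w.since.getTime() && t <= w.until.getTime();
}

// Intended order: the window narrows the rows first, then the cap bounds the result.
function windowThenCap(rows: TransitionRow[], w: FlowWindow, transitionLimit: number): TransitionRow[] {
  return rows.filter((r) => inWindow(r, w)).slice(0, transitionLimit);
}

// Assumed shape of the historical bug: the cap runs first, so old rows can
// use up the whole limit before the window is ever applied.
function capThenWindow(rows: TransitionRow[], w: FlowWindow, transitionLimit: number): TransitionRow[] {
  return rows.slice(0, transitionLimit).filter((r) => inWindow(r, w));
}

// Cheap fixture: many old completions plus a few recent ones, sorted oldest first.
function buildFixture(oldCount: number, recentCount: number, now: Date): TransitionRow[] {
  const rows: TransitionRow[] = [];
  for (let i = 0; i < oldCount; i++) {
    rows.push({ cardId: `old-${i}`, toStatus: "done", at: new Date(now.getTime() - (365 + i) * DAY_MS) });
  }
  for (let i = 0; i < recentCount; i++) {
    rows.push({ cardId: `recent-${i}`, toStatus: "done", at: new Date(now.getTime() - (i + 1) * DAY_MS) });
  }
  return rows.sort((a, b) => a.at.getTime() - b.at.getTime());
}
```

With more old rows than the limit and a 30-day window, `windowThenCap` keeps every recent completion while `capThenWindow` returns none, which is the kind of difference a regression test for this issue would need to detect.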

Suggested checks

  • Flow metrics targeted tests.

Non-goals / fences

  • Do not deploy, restart production, rotate secrets, or run production migrations.
  • Do not widen MCP write authority or public exposure.
  • Keep the change small enough for one focused PR or one scouting report.

Expected output

A short PR or issue comment with findings, touched files, tests run, and remaining risks.

Author
Collaborator

Codex verification on current origin/main (966de964...): #122 is already covered by current code and integration tests.

Evidence:

  • packages/api/src/routers/agent.ts#buildFlowMetrics queries transition history with since/until before applying transitionLimit.
  • It seeds createdAt from current cards and has a second targeted getTransitionHistory pass for completed cards missing cycleStartAt, so completions whose cycle start is before the requested window still get cycle-time samples.
  • packages/api/integration-tests/agent.integration.test.ts includes the test "computes flow metrics from the requested window before applying transition limits", which:
    • creates 130 old completed cards plus 5 recent completions,
    • calls getFlowMetrics with transitionLimit: 100,
    • asserts throughput returns the 5 recent cards,
    • asserts leadTime.sampleSize === 5, cycleTime.sampleSize === 5, and transitionsTruncated === false.
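The second targeted pass described in the evidence above can be illustrated roughly as follows. This is a sketch under stated assumptions, not the code in `buildFlowMetrics`: `HistoryRow`, `CompletedCard`, `HistoryLookup`, `earliestStarts`, `backfillCycleStart`, and the `"in_progress"` start status are all hypothetical, and the lookup callback merely stands in for a per-card history query.

```typescript
// Hypothetical shapes for illustration; the real types in the repo may differ.
interface HistoryRow {
  cardId: string;
  toStatus: string;
  at: Date;
}

interface CompletedCard {
  cardId: string;
  completedAt: Date;
  cycleStartAt?: Date;
}

// Hypothetical stand-in for a history lookup restricted to specific card ids.
type HistoryLookup = (cardIds: string[]) => HistoryRow[];

// Earliest transition into the start status, per card.
function earliestStarts(rows: HistoryRow[], startStatus: string): Map<string, Date> {
  const out = new Map<string, Date>();
  for (const row of rows) {
    if (row.toStatus !== startStatus) continue;
    const prev = out.get(row.cardId);
    if (prev === undefined || row.at.getTime() < prev.getTime()) {
      out.set(row.cardId, row.at);
    }
  }
  return out;
}

function backfillCycleStart(
  cards: CompletedCard[],
  windowedRows: HistoryRow[],
  lookup: HistoryLookup,
  startStatus: string = "in_progress", // assumed status name
): CompletedCard[] {
  // Pass 1: cycle starts that are visible inside the requested window.
  const starts = earliestStarts(windowedRows, startStatus);

  // Pass 2: targeted lookup only for completed cards still missing a start,
  // so work that began before the window still yields a cycle-time sample.
  const missing = cards.filter((c) => !starts.has(c.cardId)).map((c) => c.cardId);
  if (missing.length > 0) {
    earliestStarts(lookup(missing), startStatus).forEach((at, cardId) => {
      starts.set(cardId, at);
    });
  }

  return cards.map((c) => ({ ...c, cycleStartAt: starts.get(c.cardId) }));
}
```

Restricting the second pass to the missing card ids keeps the extra query proportional to the number of completions in the window rather than to the length of the full history.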

Verification run:

pnpm --filter @kan/api exec vitest integration-tests/agent.integration.test.ts --run -t "computes flow metrics from the requested window"

Passed. Recommendation: close #122 as already satisfied by current main.

Collaborator

Iskra judgment

| Field | Value |
| --- | --- |
| Target | pdurlej/kan-ductor#issue#122 |
| Priority | p2 |
| Action | observe |
| Scores | reach 4 / impact 4 / confidence 5 |
| Piotr fit | high |
| Effort | small |
| Labels | judge/p2 |
| Judge | iskra via openclaw |

Rationale: This is P2 observe-first analytics testing because transitionLimit must remain a safety cap rather than silently filtering out recent flow metrics.

Caveat: Keep the fixture cheap and scoped; do not change metric semantics beyond proving window filtering precedes the safety cap.

Structured openclaw.judge.v0 payload
<!-- openclaw.judge.v0 -->
{
  "confidence": 5,
  "effort_hint": "small",
  "escalation": {
    "kind": "none",
    "reason": ""
  },
  "evidence_refs": [
    {
      "note": "Issue scopes a flow-metrics fixture to ensure recent completions are not hidden by long old transition histories.",
      "type": "forgejo",
      "value": "issue-title-body-labels-and-target-snapshot"
    },
    {
      "note": "Body requires since/until filtering to be applied at query layer before transitionLimit safety cap.",
      "type": "forgejo",
      "value": "issue-body-scope-and-acceptance"
    },
    {
      "note": "Body fences the work away from production deploys, migrations, authority widening, and exposure changes.",
      "type": "forgejo",
      "value": "issue-body-non-goals"
    }
  ],
  "impact": 4,
  "judge_actor": {
    "name": "iskra",
    "runtime": "openclaw"
  },
  "judged_at": "2026-06-18T01:08:00Z",
  "labels_to_apply": [
    "judge/p2"
  ],
  "piotr_fit": "high",
  "priority": "p2",
  "rationale_summary": "This is P2 observe-first analytics testing because transitionLimit must remain a safety cap rather than silently filtering out recent flow metrics.",
  "reach": 4,
  "recommended_next_action": "observe",
  "rerun_reason": "no_prior_judgment",
  "schema": "openclaw.judge.v0",
  "target": {
    "kind": "issue",
    "number": 122,
    "repo": "pdurlej/kan-ductor"
  },
  "target_snapshot": {
    "body_hash": "sha256:406919a6f0953a1046c1d318ebd7ef946f2ff0637f0bc0bc394437a7eeb580c5",
    "commit_count": null,
    "evidence_hash": "sha256:54be38007f7bb0632996d14b02b6e9364ebff2a590a559cf52a26ff91fc52341",
    "head_sha": null,
    "labels": [
      "analytics",
      "gemini-flash",
      "priority:p2",
      "small-task",
      "tests"
    ],
    "labels_hash": "sha256:fb5a1d5bb7bb666df34dd3de5a47748013df5c087589f0d29c6dc5ad08f3de4e",
    "state": "open",
    "title_hash": "sha256:de6a257514b540cfb947af0313363c2cbb5746046a6fc8d94a56301db9e7f7f0",
    "updated_at": "2026-06-03T10:20:03+02:00"
  },
  "top_caveat": "Keep the fixture cheap and scoped; do not change metric semantics beyond proving window filtering precedes the safety cap."
}
<!-- /openclaw.judge.v0 -->
Reference
pdurlej/kan-ductor#122